Hello, I work on an application that has to index the OCR text of scanned books. Naturally, many words in that text are hyphenated across line breaks.
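To illustrate the kind of merging I mean, here is a minimal sketch in plain Java that works on the raw text before it reaches any analyzer. The class name `Dehyphenator` and its `merge` method are hypothetical and not part of Lucene. It assumes a hyphen directly before a line break, followed by a lowercase letter, marks a split word, so it would also wrongly join genuine compounds such as "well-known" when they happen to break at the hyphen.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not a Lucene class: joins words split by a line-end hyphen.
class Dehyphenator {
    // Matches: hyphen preceded by a letter, optional trailing blanks, a line break,
    // optional leading blanks, and then (not consumed) a lowercase letter.
    private static final Pattern LINE_END_HYPHEN =
        Pattern.compile("(?<=\\p{L})-[ \\t]*\\r?\\n[ \\t]*(?=\\p{Ll})");

    // Removes the hyphen and the line break so the two fragments form one word.
    static String merge(String ocrText) {
        return LINE_END_HYPHEN.matcher(ocrText).replaceAll("");
    }

    public static void main(String[] args) {
        // prints "The hyphenated word"
        System.out.println(merge("The hyphen-\nated word"));
    }
}
```

Hyphens inside a line are left alone, and so is a line-end hyphen followed by an uppercase letter (for example a double name broken across lines).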
I wonder whether there is already an Analyzer, or perhaps a TokenFilter, that can merge those fragments back into whole words. It looks like Erik Hatcher uses something like that at http://www.lucenebook.com/.

Thanks in advance,
Markus