Hi all, I'm using the StandardAnalyzer in an application based on Lucene 1.4.2.
Currently, and by default, the StandardAnalyzer appears to throw semicolons away at index and store time. For example, a document like "ee3e städer" looks like "ee3e str" when retrieved from the index (that is, the ";" character is missing). The document is stored in the index as a Field.Text.

What I would like to do is index and store words like "städer" and retrieve them in exactly the same form, i.e. as "städer". I imagine this could be achieved with some modifications to StandardTokenizer.jj (or somewhere else). Can someone please help by showing me where and how such a change can be made?

(Note: it is not necessary to be able to search for text that includes the semicolon, only to retrieve the text in its original format.)

cheers
Clas / Frisim.com
