Hi all,

I'm using the StandardAnalyzer in an application based on Lucene 1.4.2. 

Currently, and by default, the StandardAnalyzer "throws
semicolon-signs away" at index and store time. For example, a document
like "ee3e st&#x00e4;der" looks like "ee3e st&#x00e4der" when
retrieved from the index (that is, the ;-sign is missing). The
document is stored as a Field.Text in the index.

What I would like to do is to index, and store, words like
"st&#x00e4;der" and retrieve them in exactly the same form, i.e. as
"st&#x00e4;der".

I can imagine that the result I would like to achieve can be produced
by some modifications to the StandardTokenizer.jj (or somewhere else).
Can someone please help me by showing me where/how such a change can be
made?

(Note: It is not necessary to be able to search for text that includes
the semicolon-sign, just to retrieve the text in its original
format.)

cheers
Clas / Frisim.com
