Well, it is done now. In the end, I surrendered to "double-storing". This way, I have indexed the original text with the COMPRESSED option to save some space.
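For anyone reading this thread later, here is a minimal sketch of the idea in plain Java. It is not the code actually used and it does not touch Lucene or its highlighter package. The class `AccentHighlight`, the methods `fold` and `highlight`, and the `<b>` markers are all hypothetical names chosen for illustration. It assumes that accents can be removed by Unicode decomposition (`java.text.Normalizer`), which covers letters such as "ç" and "ã" but not ones like "ø" that have no decomposed form. The folded form is what would be searched; the original text is what gets displayed, and the regular expression walks the original words to find the ones whose folded form equals the folded query term.

```java
import java.text.Normalizer;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AccentHighlight {

    // Searchable form: decompose to NFD, drop combining marks, lower-case.
    static String fold(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "").toLowerCase(Locale.ROOT);
    }

    // Display form: wrap each word of the ORIGINAL text whose folded form
    // equals the folded query term. The original accents are preserved.
    static String highlight(String original, String term) {
        String wanted = fold(term);
        // A "word" here is a run of letters, combining marks and digits (an assumption).
        Matcher words = Pattern.compile("[\\p{L}\\p{M}\\p{N}]+").matcher(original);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (words.find()) {
            if (fold(words.group()).equals(wanted)) {
                out.append(original, last, words.start())
                   .append("<b>").append(words.group()).append("</b>");
                last = words.end();
            }
        }
        out.append(original.substring(last));
        return out.toString();
    }

    public static void main(String[] args) {
        // "O coração bate", written with escapes so the source file encoding does not matter.
        String stored = "O cora\u00e7\u00e3o bate";
        System.out.println(fold(stored));                 // what would be searched
        System.out.println(highlight(stored, "CORACAO")); // what would be shown
    }
}
```

Matching word by word against the original avoids having to translate character offsets between the folded and the original text, at the cost of folding every word of the displayed text again.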
And to highlight the results correctly, I matched the unaccented words against the original words using regular expressions, and the result is satisfactory.

Thanks all for the brainstorming ^^

Cesar


Erick Erickson wrote:
>
> See below...
>
> On Feb 11, 2008 12:17 PM, Cesar Ronchese <[EMAIL PROTECTED]> wrote:
>
>> Hey, Erick. You inferred right.
>>
>> I analyzed your code and it looks like a common Indexing and Searching
>> code.
>> Are you sure you pasted the correct code? :P
>
> Did you try to run it? It's just a self-contained example showing that
> searching and displaying are distinct.
>
> The indexer part indexes a mixed-case string. The search is then
> performed on a lower-case string, and the println shows that a
> document was found. The next println echoes back the stored text
> showing that the original was stored. Just substitute your preferred
> filter to see how this would work for you.
>
>> Anyways, is the concept about doubling storing data, one content with
>> accents and other without? If yes, I did it earlier, but once I search in
>> the non-accent content and show accent content, the HitHighlighter will
>> now work properly.
>> --
>
> Is this a typo or is your problem solved? I confess that I haven't had the
> necessity to use the highlighter package yet, so I may be missing
> something...
>
> But you're not really "double storing". You'll find that indexed code
> takes MUCH less space than you would think, nowhere near the amount
> required to store the data too. So there's good reason to separate the
> two.
>
> You have no choice except to store the data if you want the user to see
> something pretty.....
>
> Erick
>
>> View this message in context:
>> http://www.nabble.com/Indexing-accented-characters%2C-then-searching-by-any-form-tp15412778p15415770.html
>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
--
View this message in context: http://www.nabble.com/Indexing-accented-characters%2C-then-searching-by-any-form-tp15412778p15423851.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.