Yes, I have, but it is too memory intensive. I tried the Highlighter first, but it was not a good solution because I have to send the entire text to the Highlighter.
What I did instead is similar to your suggestion:

1. Use the analyzer to return a token stream.
2. Search the token stream for the keyword I'm looking for (the keyword needs to be analyzed as well!).
3. Extract the matching token's offset.
4. Use the offsets in the index and Java's RandomAccessFile to "seek" to the byte (character) position, then extract a "fragment" of about 500 chars around that position.

This solution uses little memory and, I hope, will hold up as I expect under sustained load. How does this sound to you?

What I would LOVE is to be able to do this in a standard Lucene search, like I mentioned earlier. Something like this...

Hit.doc[0].getHitTokenList() :confused:

~Dustin

Erik Hatcher wrote:
>
> Have you looked at the contrib Highlighter? Or using an Analyzer
> directly to give you the offsets?
>
> Erik
>
> On Feb 26, 2009, at 9:32 AM, HPDrifter wrote:
>
>> When I get a search result based on my index, I need the exact
>> tokens which were identified in the index as part of the result.
>> Why? I need the character offsets.
>>
>> I have a solution right now...almost, but it bugs the hell out of me
>> that I can say something like...
>> documentHit[0].getIdentifiedTokens();
>>
>> Do I need to make a contribution in order to make this happen? :ninja:
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Getting-tokens-from-search-results.--Simple-concept-tp22225364p22225364.html
>> Sent from the Lucene - Java Developer mailing list archive at
>> Nabble.com.

--
View this message in context: http://www.nabble.com/Getting-tokens-from-search-results.--Simple-concept-tp22225364p22247863.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
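
For anyone reading this thread later, the four-step workaround described in the reply could look roughly like the sketch below. It is an illustration only, not the code Dustin actually ran. It uses plain Java with no Lucene calls: `normalize` and the letter/digit tokenizer are hypothetical stand-ins for whatever Analyzer built the index, and the class and method names (`FragmentExtractor`, `findOffsets`, `readFragment`) are made up for this example. It also assumes the file uses a single-byte encoding (ISO-8859-1 here), so that a character offset can be used directly as a byte offset for the seek.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class FragmentExtractor {

    // Hypothetical stand-in for the Analyzer: only lowercases the term.
    // A real analyzer might also stem or drop stop words.
    static String normalize(String term) {
        return term.toLowerCase(Locale.ROOT);
    }

    // Steps 1-3: split the text into letter/digit tokens, compare each
    // normalized token with the normalized keyword, and collect the
    // {startOffset, endOffset} pair of every match (end is exclusive).
    static List<int[]> findOffsets(String text, String keyword) {
        String wanted = normalize(keyword);
        List<int[]> hits = new ArrayList<>();
        int i = 0;
        int n = text.length();
        while (i < n) {
            while (i < n && !Character.isLetterOrDigit(text.charAt(i))) {
                i++; // skip separators
            }
            int start = i;
            while (i < n && Character.isLetterOrDigit(text.charAt(i))) {
                i++; // consume one token
            }
            if (i > start && normalize(text.substring(start, i)).equals(wanted)) {
                hits.add(new int[] {start, i});
            }
        }
        return hits;
    }

    // Step 4: seek into the file and read only the bytes around the hit,
    // so the whole document never has to be held in memory.
    // Assumption: single-byte encoding, so char offset == byte offset.
    static String readFragment(String path, int start, int end, int context)
            throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            long from = Math.max(0L, (long) start - context);
            long to = Math.min(file.length(), (long) end + context);
            if (to <= from) {
                return ""; // offsets lie outside the file
            }
            byte[] buffer = new byte[(int) (to - from)];
            file.seek(from);
            file.readFully(buffer);
            return new String(buffer, StandardCharsets.ISO_8859_1);
        }
    }
}
```

Calling `readFragment(path, hit[0], hit[1], 250)` for each hit would give a fragment of about 500 characters around the match, as in step 4. With a multi-byte encoding such as UTF-8 the character and byte offsets diverge, so the seek position would have to be computed differently.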