Performance of hit highlighting and finding term positions for a specific document

Kevin A. Burton Tue, 30 Mar 2004 16:56:56 -0800

I'm playing with this package:

http://home.clara.net/markharwood/lucene/highlight.htm

I'm trying to do hit highlighting. This implementation runs the document text through an Analyzer again to find the positions of the result terms.

This seems very inefficient, since Lucene already knows the frequency and positions of the given terms in the index.

My question is whether it's hard to find the TermPositions for a given term in a given document, rather than across the whole index.

IndexReader.termPositions( Term term ) is term-specific, not term- and document-specific.
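
To make the lookup in question concrete, here is a minimal sketch of a toy in-memory positional index that answers "which positions does this term occupy in this one document?". It is not Lucene's API: the class PositionalIndexSketch and its methods addDocument and positions are hypothetical names made up for illustration, and the whitespace tokenizer is an assumption standing in for a real Analyzer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy positional index (hypothetical, NOT Lucene's API):
// term -> (docId -> token positions of that term in that document).
public class PositionalIndexSketch {
    private final Map<String, TreeMap<Integer, List<Integer>>> postings = new HashMap<>();

    // Index a document by lowercasing, splitting on whitespace,
    // and recording the position of every token.
    public void addDocument(int docId, String text) {
        String[] tokens = text.toLowerCase().trim().split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            postings.computeIfAbsent(tokens[pos], t -> new TreeMap<>())
                    .computeIfAbsent(docId, d -> new ArrayList<>())
                    .add(pos);
        }
    }

    // Positions of one term in one document; empty if the term does not occur there.
    public List<Integer> positions(String term, int docId) {
        TreeMap<Integer, List<Integer>> docs = postings.get(term.toLowerCase());
        if (docs == null) {
            return new ArrayList<>();
        }
        List<Integer> found = docs.get(docId);
        return found == null ? new ArrayList<>() : new ArrayList<>(found);
    }

    public static void main(String[] args) {
        PositionalIndexSketch index = new PositionalIndexSketch();
        index.addDocument(0, "lucene stores term positions");
        index.addDocument(1, "highlight the term where the term occurs");
        System.out.println(index.positions("term", 1)); // prints [2, 5]
    }
}
```

The point of the sketch is only that, once positions are stored per term and per document, a highlighter could read them directly instead of re-analyzing the text. Mapping token positions back to character offsets in the original text is a separate step that this toy does not cover.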

Also, after all this time it seems that Lucene should have efficient hit highlighting as a standard package. Is there any interest in a sandbox contribution for this, provided it uses the index positions?

--

Please reply using PGP.

http://peerfear.org/pubkey.asc NewsMonster - http://www.newsmonster.org/
Kevin A. Burton, Location - San Francisco, CA, Cell - 415.595.9965
AIM/YIM - sfburtonator, Web - http://peerfear.org/
GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412
IRC - freenode.net #infoanarchy | #p2p-hackers | #newsmonster
