Re: RE : Performance of hit highlighting and finding term positions for a specific document

Kevin A. Burton Wed, 31 Mar 2004 18:46:45 -0800

Rasik Pandey wrote:

Hello,

I've been meaning to look into good ways to store token offset information to allow for very efficient highlighting, and I believe Mark may also be looking into improving the highlighter via other means such as temporary RAM indexes. Search the archives to get a background on some of the ideas we've tossed around ('Dmitry's Term Vector stuff, plus some' and 'Demoting results' come to mind as threads that touch this topic).


It would be nice if CachingRewrittenQueryWrapper.java, which I sent to lucene-dev last week (see below), became part of these highlighting efforts, if appropriate. We use it to collect terms for a query that searches multiple indices.

Actually, I had to write one for my tests with the highlighter. I'm using a MultiSearcher and a WildcardQuery, which the highlighter didn't support.

My impl was fairly basic so I wouldn't suggest a contribution... I'm sure yours is better. The suggested changes to the highlighter for providing tokens would make these work well together.
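
For anyone following the thread, here is a rough sketch of the general idea: expand a wildcard pattern against the term lists of several indices, merge the matches, and cache the result so the highlighter can reuse it. This is not the CachingRewrittenQueryWrapper posted to lucene-dev and it does not use the Lucene API at all; the class WildcardTermCollector, its methods, and the plain sets of strings standing in for each index's terms are all hypothetical, chosen only to illustrate the approach.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: collects the concrete terms a
// wildcard pattern matches across several indices and caches the result.
class WildcardTermCollector {
    // One set of terms per index (an assumed stand-in for a real term dictionary).
    private final List<Set<String>> indexTerms;
    // Cache: wildcard pattern -> merged matching terms.
    private final Map<String, Set<String>> cache = new HashMap<>();

    WildcardTermCollector(List<Set<String>> indexTerms) {
        this.indexTerms = indexTerms;
    }

    // Translate '*' (any run of characters) and '?' (any single character)
    // into a regular expression; every other character is matched literally.
    static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                sb.append(".*");
            } else if (c == '?') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString());
    }

    // Return the sorted union of matching terms from all indices,
    // computing it only on the first request for a given pattern.
    Set<String> collectTerms(String wildcard) {
        return cache.computeIfAbsent(wildcard, w -> {
            Pattern p = toRegex(w);
            Set<String> merged = new TreeSet<>();
            for (Set<String> terms : indexTerms) {
                for (String term : terms) {
                    if (p.matcher(term).matches()) {
                        merged.add(term);
                    }
                }
            }
            return merged;
        });
    }
}
```

The merged term set is what a highlighter would need in order to mark hits for a wildcard query, since the original pattern by itself names no concrete terms.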

Kevin

--

Please reply using PGP.


    http://peerfear.org/pubkey.asc    
    
    NewsMonster - http://www.newsmonster.org/
    
Kevin A. Burton, Location - San Francisco, CA, Cell - 415.595.9965
       AIM/YIM - sfburtonator,  Web - http://peerfear.org/
GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412
  IRC - freenode.net #infoanarchy | #p2p-hackers | #newsmonster
