Rasik Pandey wrote:
Hello,

  
I've been meaning to look into good ways to store token offset
information to allow for very efficient highlighting, and I believe
Mark may also be looking into improving the highlighter via other
means such as temporary RAM indexes. Search the archives to get
background on some of the ideas we've tossed around ('Dmitry's Term
Vector stuff, plus some' and 'Demoting results' come to mind as
threads that touch this topic).
    

It would be nice if CachingRewrittenQueryWrapper.java, which I sent to lucene-dev last week (see below), became part of these highlighting efforts, if appropriate. We use it to collect terms for a query that searches across multiple indices.
  
Actually, I had to write one for my tests with the highlighter. I'm using a MultiSearcher and a WildcardQuery, which the highlighter didn't support.

My impl was fairly basic, so I wouldn't suggest contributing it... I'm sure yours is better. The suggested changes to the highlighter for providing tokens would make the two work well together.
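
To make the idea concrete, here is a rough sketch of collecting terms for a wildcard across several indices. It is only an illustration under stated assumptions: it does not use the Lucene API, each "index" is modelled as a plain set of terms, and MultiIndexTermCollector, toRegex and collectTerms are hypothetical names, not anything from CachingRewrittenQueryWrapper.java. The point is that the wildcard has to be expanded against every index, and the union of the matches is what a highlighter would need.

```java
import java.util.List;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: each "index" is just a set of terms.
final class MultiIndexTermCollector {

    // Convert a simple wildcard ('*' = any run of chars, '?' = exactly one char)
    // into a regular expression, quoting every other character literally.
    static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                sb.append(".*");
            } else if (c == '?') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString());
    }

    // Expand the wildcard against every index and return the sorted union
    // of all matching terms.
    static SortedSet<String> collectTerms(String wildcard, List<Set<String>> indexes) {
        Pattern pattern = toRegex(wildcard);
        SortedSet<String> terms = new TreeSet<>();
        for (Set<String> indexTerms : indexes) {
            for (String term : indexTerms) {
                if (pattern.matcher(term).matches()) {
                    terms.add(term);
                }
            }
        }
        return terms;
    }
}
```

Calling collectTerms once with the full list of indices yields a single term set, so the highlighter does not need to know how many indices sit behind the searcher.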

Kevin

--
Please reply using PGP.

    http://peerfear.org/pubkey.asc    
    
    NewsMonster - http://www.newsmonster.org/
    
Kevin A. Burton, Location - San Francisco, CA, Cell - 415.595.9965
       AIM/YIM - sfburtonator,  Web - http://peerfear.org/
GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412
  IRC - freenode.net #infoanarchy | #p2p-hackers | #newsmonster
