Re: Post-sorted inverted index?

2004-07-20 Thread Doug Cutting
You can define a subclass of FilterIndexReader that re-sorts documents 
in TermPositions(Term) and document(int), then use 
IndexWriter.addIndexes() to write this in Lucene's standard format.  I 
have done this in Nutch, with the (as yet unused) IndexOptimizer.

http://cvs.sourceforge.net/viewcvs.py/nutch/nutch/src/java/net/nutch/indexer/IndexOptimizer.java?view=markup
Doug
Aphinyanaphongs, Yindalon wrote:
I gather from reading the documentation that the scores for each document hit are computed at query time.  I have an application that, due to the complexity of the function, cannot compute scores at query time.  Would it be possible for me to store the documents in pre-sorted order in the inverted index? (i.e. after the initial index is created, to have a post processing step to sort and reindex the final documents).
 
For example:
Document A - score 0.2
Document B - score 0.4
Document C - score 0.6
 
Thus for the word 'the', the stored order in the index would be C,B,A.
 
Thanks!

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Post-sorted inverted index?

2004-07-19 Thread Aphinyanaphongs, Yindalon
I gather from reading the documentation that the scores for each document hit are 
computed at query time.  I have an application that, due to the complexity of the 
function, cannot compute scores at query time.  Would it be possible for me to store 
the documents in pre-sorted order in the inverted index? (i.e. after the initial index 
is created, to have a post processing step to sort and reindex the final documents).
 
For example:
Document A - score 0.2
Document B - score 0.4
Document C - score 0.6
 
Thus for the word 'the', the stored order in the index would be C,B,A.
 
Thanks!