RE: Payloads

Uwe Schindler Sun, 20 Dec 2009 07:07:02 -0800

The problem was already solved in the #lucene IRC channel. The behaviour of
PayloadTermQuery is correct if you compare the scores of a document with an
even match and a document with a non-even match in the *same* query.


In general: you cannot compare scores across different queries or
different indexes.
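
As a rough illustration of comparing documents within the same query, here
is a plain-Java sketch with no Lucene dependency. The class and method names
are hypothetical. It assumes the payload factor for a document is the average
payload over the positions where the query term occurs, and it ignores the
tf/idf/norm parts of the score entirely. Positions are assigned as in the
quoted TermPositionPayloadTokenFilter (even positions get 2.0).

```java
// Plain-Java sketch, not Lucene code. All names here are hypothetical.
class PayloadFactorSketch {

    // Mirrors the quoted filter: tokens at even positions (0, 2, 4, ...)
    // carry a payload of 2.0; tokens without a payload count as 1.0.
    static float payloadAt(int position) {
        return (position % 2 == 0) ? 2.0f : 1.0f;
    }

    // Assumption: the payload factor is the average payload over all
    // positions where the query term occurs (1.0 if it never occurs).
    static float payloadFactor(String contents, String term) {
        // Whitespace splitting only, no lowercasing, like WhitespaceTokenizer.
        String[] tokens = contents.trim().split("\\s+");
        float sum = 0f;
        int seen = 0;
        for (int pos = 0; pos < tokens.length; pos++) {
            if (tokens[pos].equals(term)) {
                sum += payloadAt(pos);
                seen++;
            }
        }
        return seen > 0 ? sum / seen : 1.0f;
    }

    public static void main(String[] args) {
        String[] docs = {
            "A hurricane warning was issued at 6 AM for the outer great banks",
            "The warning label maker is a delightful toy for your precocious"
                + " six year old's warning needs",
            "There is a tornado warning for Worcester county until 6 PM today"
        };
        // Same query term for every document, so the factors are comparable.
        for (String doc : docs) {
            System.out.println(payloadFactor(doc, "warning") + "  " + doc);
        }
    }
}
```

Under these assumptions, the factors for "warning" differ between the three
test documents because the term sits at different positions in each one. That
is the kind of comparison that is meaningful: one query, several documents.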

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [email protected]

> -----Original Message-----
> From: Elias Khsheibun [mailto:[email protected]]
> Sent: Sunday, December 20, 2009 2:51 PM
> To: [email protected]
> Subject: RE: Payloads
> 
> 
> I'm trying to run queries now. The problem is that the scoring of the
> BoostingTermQuery always gives double weight to even terms, whether or
> not the query itself contains the term. Here is the code that I'm using:
> 
> 
> public class DocumentAnalyzer extends Analyzer {
> 
>       @Override
>       public TokenStream tokenStream(String fieldName, Reader reader) {
>               TokenStream result = new WhitespaceTokenizer(reader);
>               result = new TermPositionPayloadTokenFilter(result);
> 
>               return result;
>       }
> 
> }
> 
> 
> public class TermPositionPayloadTokenFilter extends TokenFilter {
> 
>     protected PayloadAttribute payAtt;
>     protected PositionIncrementAttribute posIncrAtt;
> 
>     private static final Payload evenPayload =
>         new Payload(PayloadHelper.encodeFloat(2.0f));
> 
>     private int termPosition = 0;
> 
>     public TermPositionPayloadTokenFilter(TokenStream input) {
>         super(input);
>         payAtt = (PayloadAttribute) addAttribute(PayloadAttribute.class);
>         posIncrAtt = (PositionIncrementAttribute)
>             addAttribute(PositionIncrementAttribute.class);
>     }
> 
>     @Override
>     public final boolean incrementToken() throws IOException {
>         if (input.incrementToken()) {
>             if ((termPosition % 2) == 0)
>                 payAtt.setPayload(evenPayload);
>             termPosition += posIncrAtt.getPositionIncrement();
>             return true;
>         } else {
>             return false;
>         }
>     }
> 
> }
> 
> 
> 
> public class BoostingSimilarity extends DefaultSimilarity {
>       public float scorePayload(String fieldName, byte[] payload,
>                                 int offset, int length) {
>             if (payload != null)
>                   return PayloadHelper.decodeFloat(payload, offset);
>             else
>                   return 1.0F;
>       }
> }
> 
> And this is a test I've written. If you look at the scores, you will
> notice that the BoostingTermQuery always gives double weight to even
> terms, no matter whether they appear in the query or not (this is my
> current problem):
> 
> public class PayloadsTest extends TestCase {
>       Directory dir;
>       IndexWriter writer;
>       DocumentAnalyzer analyzer;
>       protected void setUp() throws Exception {
>       super.setUp();
>       dir = new RAMDirectory();
>       analyzer = new DocumentAnalyzer();
>       writer = new IndexWriter(dir, analyzer,
>                   IndexWriter.MaxFieldLength.UNLIMITED);
>       }
>       protected void tearDown() throws Exception {
>       super.tearDown();
>       writer.close();
>       }
>       void addDoc(String title, String contents) throws IOException {
>       Document doc = new Document();
>       doc.add(new Field("title",
>       title,
>       Field.Store.YES,
>       Field.Index.NO));
> 
>       doc.add(new Field("contents",
>                       contents,
>                       Field.Store.NO,
>                       Field.Index.ANALYZED));
> 
>       writer.addDocument(doc);
>       }
> 
>       public void testBoostingTermQuery() throws Throwable {
>       addDoc("Hurricane warning",
>             "A hurricane warning was issued at 6 AM for the outer great banks");
>       addDoc("Warning label maker",
>             "The warning label maker is a delightful toy for your"
>             + " precocious six year old's warning needs");
>       addDoc("Tornado warning",
>             "There is a tornado warning for Worcester county until 6 PM today");
>       writer.commit();
>       IndexSearcher searcher = new IndexSearcher(dir);
>       searcher.setSimilarity(new BoostingSimilarity());
>       Term warning = new Term("contents", "tornado");
>       Query query1 = new TermQuery(warning);
>       System.out.println("\nTermQuery results:");
> 
>       ScoreDoc [] hits = searcher.search(query1, 10).scoreDocs;
>        for (int i = 0; i < hits.length; i++) {
>             Document hitDoc = searcher.doc(hits[i].doc);
>             System.out.println(hitDoc.get("title"));
>        }
>       Query query2 = new BoostingTermQuery(warning);
>       System.out.println("\nBoostingTermQuery results:");
> 
>       ScoreDoc [] hits2 = searcher.search(query2, 10).scoreDocs;
>       for (int i = 0; i < hits2.length; i++) {
>             Document hitDoc = searcher.doc(hits2[i].doc);
>             System.out.println(hitDoc.get("title"));
>        }
>       }
>       }
> 
> 
> -----Original Message-----
> From: AHMET ARSLAN [mailto:[email protected]]
> Sent: Saturday, December 19, 2009 11:19 PM
> To: [email protected]
> Subject: RE: Payloads
> 
> 
> > If I need to override the QueryParser to return PayloadTermQuery,
> > what function for PayloadFunction should I use in the constructor
> > (if you can show me an example)?
> 
> I am not sure about that. Maybe a custom one.
> 
> > In your code I didn't see an indexer. Will this work with the regular
> > IndexWriter but with the new Analyzer that you overloaded?
> 
> No, at index time [IndexWriter] you are going to use a new analyzer that
> uses WhitespaceTokenizer  + TermPositionPayloadTokenFilter.
> 
> PayloadAnalyzer will be used at query time. [QueryParser]
> 
> You need to setSimilarity(new CustomSimilarity) of both indexer and
> searcher.
> 
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
> 



