The problem was already solved in the #lucene IRC channel. The behaviour of PayloadTermQuery was correct if you compare the scores of a document with an even match and one with a non-even match in the *same* query.
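
As a rough illustration of the same-query comparison, here is a minimal plain-Java sketch that does not use Lucene at all. It only models the payload factor that the filter and similarity quoted in this thread would produce: 2.0 for a term at an even position, 1.0 otherwise. The class name PayloadScoreSketch and all of its methods are hypothetical, the 4-byte big-endian float encoding is an assumption made for the sketch, and tf, idf and norms are deliberately left out.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: models only the payload factor, not Lucene's full scoring.
public class PayloadScoreSketch {

    // Assumption: a float payload is stored as 4 big-endian bytes.
    static byte[] encodeFloat(float value) {
        return ByteBuffer.allocate(4).putFloat(value).array();
    }

    static float decodeFloat(byte[] bytes, int offset) {
        return ByteBuffer.wrap(bytes, offset, 4).getFloat();
    }

    // Mirrors the quoted token filter: only even positions get a payload (2.0).
    static byte[] payloadFor(int position) {
        return (position % 2 == 0) ? encodeFloat(2.0f) : null;
    }

    // Mirrors the quoted similarity: no payload means a neutral factor of 1.0.
    static float scorePayload(byte[] payload) {
        return (payload != null) ? decodeFloat(payload, 0) : 1.0f;
    }

    // Payload factor of the first occurrence of term in whitespace-split text,
    // or 0 if the term does not occur (the document would not match at all).
    static float payloadFactor(String text, String term) {
        String[] tokens = text.trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].equals(term)) {
                return scorePayload(payloadFor(i));
            }
        }
        return 0.0f;
    }

    public static void main(String[] args) {
        // Same query term, two documents: positions 2 (even) and 1 (odd).
        System.out.println(payloadFactor("severe storm warning issued", "warning")); // 2.0
        System.out.println(payloadFactor("storm warning issued", "warning"));        // 1.0
    }
}
```

Within one query the even-position match gets twice the payload factor of the odd-position match, which is the only comparison that is meaningful here.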
In general: You cannot compare scores on different queries or different indexes.

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [email protected]

> -----Original Message-----
> From: Elias Khsheibun [mailto:[email protected]]
> Sent: Sunday, December 20, 2009 2:51 PM
> To: [email protected]
> Subject: RE: Payloads
>
> I'm trying to run queries now, the problem is - the scoring of the
> BoostingTermQuery is always giving a double weight to even terms, and not if
> the query itself contains the term, here is the code that I'm using:
>
> public class DocumentAnalyzer extends Analyzer {
>
>     @Override
>     public TokenStream tokenStream(String fieldName, Reader reader) {
>         TokenStream result = new WhitespaceTokenizer(reader);
>         result = new TermPositionPayloadTokenFilter(result);
>
>         return result;
>     }
>
> }
>
> public class TermPositionPayloadTokenFilter extends TokenFilter {
>
>     protected PayloadAttribute payAtt;
>     protected PositionIncrementAttribute posIncrAtt;
>
>     private static final Payload evenPayload = new Payload(PayloadHelper.encodeFloat(2.0f));
>
>     private int termPosition = 0;
>
>     public TermPositionPayloadTokenFilter(TokenStream input) {
>         super(input);
>         payAtt = (PayloadAttribute) addAttribute(PayloadAttribute.class);
>         posIncrAtt = (PositionIncrementAttribute) addAttribute(PositionIncrementAttribute.class);
>     }
>
>     @Override
>     public final boolean incrementToken() throws IOException {
>         if (input.incrementToken()) {
>             if ((termPosition % 2) == 0)
>                 payAtt.setPayload(evenPayload);
>             termPosition += posIncrAtt.getPositionIncrement();
>             return true;
>         } else {
>             return false;
>         }
>     }
>
> }
>
> public class BoostingSimilarity extends DefaultSimilarity {
>     public float scorePayload(String fieldName, byte[] payload, int offset, int length) {
>         if (payload != null)
>             return PayloadHelper.decodeFloat(payload, offset);
>         else
>             return 1.0F;
>     }
> }
>
> And this is a test I've written, if you look at the scores, then you will
> notice that the BoostingTermQuery is always giving a double weight to even
> terms no matter if they appear in the query or no (this is my current
> problem now):
>
> public class PayloadsTest extends TestCase {
>     Directory dir;
>     IndexWriter writer;
>     DocumentAnalyzer analyzer;
>
>     protected void setUp() throws Exception {
>         super.setUp();
>         dir = new RAMDirectory();
>         analyzer = new DocumentAnalyzer();
>         writer = new IndexWriter(dir, analyzer, IndexWriter.MaxFieldLength.UNLIMITED);
>     }
>
>     protected void tearDown() throws Exception {
>         super.tearDown();
>         writer.close();
>     }
>
>     void addDoc(String title, String contents) throws IOException {
>         Document doc = new Document();
>         doc.add(new Field("title",
>                           title,
>                           Field.Store.YES,
>                           Field.Index.NO));
>
>         doc.add(new Field("contents",
>                           contents,
>                           Field.Store.NO,
>                           Field.Index.ANALYZED));
>
>         writer.addDocument(doc);
>     }
>
>     public void testBoostingTermQuery() throws Throwable {
>         addDoc("Hurricane warning", "A hurricane warning was issued at 6 AM for the outer great banks");
>         addDoc("Warning label maker", "The warning label maker is a delightful toy for your precocious six year old's warning needs");
>         addDoc("Tornado warning", "There is a tornado warning for Worcester county until 6 PM today");
>         writer.commit();
>         IndexSearcher searcher = new IndexSearcher(dir);
>         searcher.setSimilarity(new BoostingSimilarity());
>         Term warning = new Term("contents", "tornado");
>         Query query1 = new TermQuery(warning);
>         System.out.println("\nTermQuery results:");
>
>         ScoreDoc [] hits = searcher.search(query1, 10).scoreDocs;
>         for (int i = 0; i < hits.length; i++) {
>             Document hitDoc = searcher.doc(hits[i].doc);
>             System.out.println(hitDoc.get("title"));
>         }
>         Query query2 = new BoostingTermQuery(warning);
>         System.out.println("\nBoostingTermQuery results:");
>
>         ScoreDoc [] hits2 = searcher.search(query2, 10).scoreDocs;
>         for (int i = 0; i < hits2.length; i++) {
>             Document hitDoc = searcher.doc(hits2[i].doc);
>             System.out.println(hitDoc.get("title"));
>         }
>     }
> }
>
> -----Original Message-----
> From: AHMET ARSLAN [mailto:[email protected]]
> Sent: Saturday, December 19, 2009 11:19 PM
> To: [email protected]
> Subject: RE: Payloads
>
> > If I need to override the QueryParser
> > to return PayloadTermQuery, what
> > function for PayloadFunction should I use in the
> > constructor (If you can
> > show me an example).
>
> I am not sure about that. Maybe custom one.
>
> > In your code I didn't see an indexer, will this work with
> > the regular
> > IndexWriter but with the new Analyzer that you overloaded
>
> No, at index time [IndexWriter] you are going to use a new analyzer that
> uses WhitespaceTokenizer + TermPositionPayloadTokenFilter.
>
> PayloadAnalyzer will be used at query time. [QueryParser]
>
> You need to setSimilarity(new CustomSimilarity) of both indexer and
> searcher.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
