lucene score formula
--------------------

                 Key: LUCENENET-111
                 URL: https://issues.apache.org/jira/browse/LUCENENET-111
             Project: Lucene.Net
          Issue Type: Bug
         Environment: ASP.NET C#
            Reporter: immanouel

I'm working with Lucene.Net in C# and, being quite confused, I have a question.

First I wrote IndexWriter code to add text to the Lucene index, then IndexSearcher code to query it.

Here's the index writer code:

//....................................................................
    using Lucene.Net.Analysis.Standard;
    using Lucene.Net.Documents;
    using Lucene.Net.Index;
    using Lucene.Net.QueryParsers;
    using Lucene.Net.Search;

    // Open the existing index (create = false) and add one document.
    IndexWriter writer = new IndexWriter(sIndexPath, new StandardAnalyzer(), false);
    Document doc = new Document();
    doc.Add(new Field("title", sHeader, Field.Store.YES, Field.Index.UN_TOKENIZED));
    doc.Add(new Field("link", sType, Field.Store.YES, Field.Index.UN_TOKENIZED));
    doc.Add(new Field("content", sContent, Field.Store.YES, Field.Index.TOKENIZED));
    writer.AddDocument(doc);
    writer.Optimize();
    writer.Close();
//....................................................................

Here's the IndexSearcher code (as you can see, it outputs an explanation for each hit):

//....................................................................
    IndexSearcher searcher = new IndexSearcher(IndexLocation());
    QueryParser oParser = new QueryParser("content", new StandardAnalyzer());
    string sSearchQuery = TextBox1.Text;
    // Parse once so the same Query can be passed to both Search and Explain.
    Query query = oParser.Parse(sSearchQuery);
    Hits oHitColl = searcher.Search(query);
    for (int i = 0; i < oHitColl.Length(); i++)
    {
        // generate explanation of a single document for the query
        Explanation explanation = searcher.Explain(query, oHitColl.Id(i));
        Document oDoc = oHitColl.Doc(i);
        string conteudo = oDoc.Get("content");
        if (conteudo != null)
        {
            Label1.Text = Label1.Text + explanation.ToString() + "<br>"; // output explanation
            Label2.Text = Label2.Text + "</p>-------------<br>";
        }
    }
//....................................................................

As you can see, the code is nothing special. Everything went well, except that I don't understand something in the output...

1,356585 = fieldWeight(content:açores in 448), product of:
  1 = tf(termFreq(content:açores)=1)
  2,713169 = idf(docFreq=85)
  0,5 = fieldNorm(field=content, doc=448)
----------
0,4239327 = fieldWeight(content:açores in 253), product of:
  2 = tf(termFreq(content:açores)=4)
  2,713169 = idf(docFreq=85)
  0,078125 = fieldNorm(field=content, doc=253)
----------
0,4153675 = fieldWeight(content:açores in 125), product of:
  2,44949 = tf(termFreq(content:açores)=6)
  2,713169 = idf(docFreq=85)
  0,0625 = fieldNorm(field=content, doc=125)
----------
0,3791769 = fieldWeight(content:açores in 210), product of:
  2,236068 = tf(termFreq(content:açores)=5)
  2,713169 = idf(docFreq=85)
  0,0625 = fieldNorm(field=content, doc=210)
----------
0,3671364 = fieldWeight(content:açores in 259), product of:
  1,732051 = tf(termFreq(content:açores)=3)
  2,713169 = idf(docFreq=85)
  0,078125 = fieldNorm(field=content, doc=259)
----------
0,3634466 = fieldWeight(content:açores in 95), product of:
  2,44949 = tf(termFreq(content:açores)=6)
  2,713169 = idf(docFreq=85)
  0,0546875 = fieldNorm(field=content, doc=95)

etc.

So here is my question: shouldn't the output order be based on termFreq? How can a document with termFreq = 1 be at the top of the list? Shouldn't the hit with termFreq(content:açores)=1 be last in the list and the hits with termFreq(content:açores)=6 be at the top?

Am I doing something wrong? Is there something about the Lucene score formula that I should know?

Thanks!

PS - Please respond!

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
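
The arithmetic in the explanation output above can be checked by hand. The following is only a minimal sketch, not Lucene.Net code: it assumes the default similarity, where tf is the square root of termFreq (consistent with tf = 2 for termFreq = 4 in the output), and it multiplies the three factors printed for each hit. The class and method names are hypothetical; the idf and fieldNorm values are copied from the output.

```csharp
using System;

// Hypothetical helper, not part of Lucene.Net: recomputes the
// fieldWeight products shown in the Explanation output.
public static class FieldWeightCheck
{
    // Assumption: default similarity, tf = sqrt(termFreq).
    public static double Tf(int termFreq)
    {
        return Math.Sqrt(termFreq);
    }

    // fieldWeight = tf * idf * fieldNorm, mirroring "product of:" in the output.
    public static double FieldWeight(int termFreq, double idf, double fieldNorm)
    {
        return Tf(termFreq) * idf * fieldNorm;
    }

    public static void Main()
    {
        const double idf = 2.713169; // idf(docFreq=85), copied from the output

        // Doc ids, term frequencies and field norms copied from the first three hits.
        int[] docs = { 448, 253, 125 };
        int[] freqs = { 1, 4, 6 };
        double[] norms = { 0.5, 0.078125, 0.0625 };

        for (int i = 0; i < docs.Length; i++)
        {
            Console.WriteLine("doc {0}: termFreq={1}, fieldNorm={2}, fieldWeight={3:F6}",
                docs[i], freqs[i], norms[i], FieldWeight(freqs[i], idf, norms[i]));
        }
    }
}
```

Under these assumptions the products come out at approximately 1.3566, 0.4239 and 0.4154, matching the scores listed (which the output writes with a decimal comma). Going from termFreq = 1 to termFreq = 6 raises tf only from 1 to about 2.45, while fieldNorm drops from 0.5 to 0.0625, so in this result set the fieldNorm factor, which in the default similarity shrinks as the field gets longer, outweighs the term frequency.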