[ 
https://issues.apache.org/jira/browse/OAK-3352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14733299#comment-14733299
 ] 

Chetan Mehrotra edited comment on OAK-3352 at 9/7/15 6:57 AM:
--------------------------------------------------------------

*Proposal*

* A user requests the score explanation by selecting a dedicated column, e.g. 
{{oak:explainScore}}. 
{code}select oak:explainScore, * from [nt:base] where contains(., 'foo'){code}
* The Oak query engine recognizes that column and enables a flag in {{Filter}}. 
* The index implementation can then provide a value for this column in whatever 
form suits it. {{LucenePropertyIndex}} can map it to Lucene's {{Explanation}} 
support.
* Users and tooling can then read the value of this column to understand how 
the score is being calculated.
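
To make the flow above concrete, here is a rough sketch. It is not Oak code: {{SketchFilter}}, {{createFilter}} and {{buildRow}} are hypothetical stand-ins for the proposed flag in {{Filter}}, the query engine step and the index step, and the row is modelled as a plain map.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ExplainScoreSketch {
    // Column name taken from the proposal.
    static final String EXPLAIN_COLUMN = "oak:explainScore";

    // Hypothetical stand-in for the flag the proposal would add to Filter.
    static final class SketchFilter {
        final boolean explainRequested;

        SketchFilter(boolean explainRequested) {
            this.explainRequested = explainRequested;
        }
    }

    // Query engine step: enable the flag when the column is selected.
    static SketchFilter createFilter(List<String> selectColumns) {
        return new SketchFilter(selectColumns.contains(EXPLAIN_COLUMN));
    }

    // Index step: supply the explanation column only when it was requested.
    static Map<String, String> buildRow(String path, double score,
                                        String explanation, SketchFilter filter) {
        Map<String, String> row = new LinkedHashMap<String, String>();
        row.put("jcr:path", path);
        row.put("jcr:score", Double.toString(score));
        if (filter.explainRequested) {
            row.put(EXPLAIN_COLUMN, explanation);
        }
        return row;
    }

    public static void main(String[] args) {
        SketchFilter filter = createFilter(Arrays.asList(EXPLAIN_COLUMN, "*"));
        Map<String, String> row = buildRow("/content/example", 3.62,
                "3.62 = fieldWeight ...", filter);
        // Tooling step: read the column from the result row.
        System.out.println(row.get(EXPLAIN_COLUMN));
    }
}
```

Gating on the flag keeps the cost of computing an explanation away from queries that never ask for it.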

The information provided by the explanation looks like this:
{noformat}
2015-09-07 10:54:09,899 INFO  NA [qtp2077476379-73] 
o.a.j.o.p.i.l.LucenePropertyIndex - 
/content/geometrixx-outdoors/en/search/test_in_title -> 3.6227639 = (MATCH) 
weight(:fulltext:keyword in 509) [DefaultSimilarity], result of:
  3.6227639 = fieldWeight in 509, product of:
    1.4142135 = tf(freq=2.0), with freq of:
      2.0 = termFreq=2.0
    6.831149 = idf(docFreq=3, maxDocs=1363)
    0.375 = fieldNorm(doc=509)
 
2015-09-07 10:54:09,899 INFO  NA [qtp2077476379-73] 
o.a.j.o.p.i.l.LucenePropertyIndex - 
/content/geometrixx-outdoors/en/search/test_in_description -> 3.01897 = (MATCH) 
weight(:fulltext:keyword in 508) [DefaultSimilarity], result of:
  3.01897 = fieldWeight in 508, product of:
    1.4142135 = tf(freq=2.0), with freq of:
      2.0 = termFreq=2.0
    6.831149 = idf(docFreq=3, maxDocs=1363)
    0.3125 = fieldNorm(doc=508)
 
2015-09-07 10:54:09,899 INFO  NA [qtp2077476379-73] 
o.a.j.o.p.i.l.LucenePropertyIndex - 
/content/geometrixx-outdoors/en/search/test_in_text -> 2.5616808 = (MATCH) 
weight(:fulltext:keyword in 507) [DefaultSimilarity], result of:
  2.5616808 = fieldWeight in 507, product of:
    1.0 = tf(freq=1.0), with freq of:
      1.0 = termFreq=1.0
    6.831149 = idf(docFreq=3, maxDocs=1363)
    0.375 = fieldNorm(doc=507)
{noformat}
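
To show how the numbers in such an explanation fit together, the sketch below recomputes the three {{fieldWeight}} values from their parts. It assumes the classic TF-IDF formulas of Lucene 4.x {{DefaultSimilarity}}, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the class and method names are made up for illustration.

```java
import java.util.Locale;

public class FieldWeightCheck {
    // Term frequency factor: square root of the raw term frequency.
    static double tf(double freq) {
        return Math.sqrt(freq);
    }

    // Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1)).
    static double idf(long docFreq, long maxDocs) {
        return Math.log((double) maxDocs / (docFreq + 1)) + 1.0;
    }

    // fieldWeight is the product of tf, idf and the stored field norm.
    static double fieldWeight(double freq, long docFreq, long maxDocs, double fieldNorm) {
        return tf(freq) * idf(docFreq, maxDocs) * fieldNorm;
    }

    public static void main(String[] args) {
        // Inputs copied from the three explanations in the log output.
        System.out.printf(Locale.ROOT, "test_in_title       %.4f%n",
                fieldWeight(2.0, 3, 1363, 0.375));
        System.out.printf(Locale.ROOT, "test_in_description %.4f%n",
                fieldWeight(2.0, 3, 1363, 0.3125));
        System.out.printf(Locale.ROOT, "test_in_text        %.4f%n",
                fieldWeight(1.0, 3, 1363, 0.375));
    }
}
```

The results agree with the logged scores to about four decimal places; Lucene computes in single precision, so the last digits can differ. The title and description hits have the same tf and idf and differ only in {{fieldNorm}}, which is why the title ranks higher.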

[~tmueller] [~alexparvulescu] [~teofili] [~edivad] [~catholicon] Thoughts?


> Expose Lucene search score explanation 
> ---------------------------------------
>
>                 Key: OAK-3352
>                 URL: https://issues.apache.org/jira/browse/OAK-3352
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: lucene, query
>            Reporter: Chetan Mehrotra
>             Fix For: 1.4
>
>
> Lucene provides {{Explanation}} [1] support to understand how the score for a 
> particular search result is determined. To let users reason about 
> fulltext search results in Oak, we should expose this information as part of 
> the query result.
> [1] 
> https://lucene.apache.org/core/4_0_0/core/org/apache/lucene/search/Explanation.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
