[jira] [Commented] (LUCENE-8841) Explore Relevance Based Performance Benchmarks

Doug Turnbull (JIRA) Sat, 08 Jun 2019 02:40:17 -0700


    [ 
https://issues.apache.org/jira/browse/LUCENE-8841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16859165#comment-16859165
 ]


Doug Turnbull commented on LUCENE-8841:
---------------------------------------

Big +1, though I suspect it would be very hard! This could be an Apache project 
in and of itself...

One challenge is that the range of use cases Lucene serves is tremendously 
diverse, from job search, to e-commerce, to legal search, to enterprise search, 
to news search, to Web search, to everything in between and outside the box. 
You wouldn't want a situation, for example, where you only have an e-commerce 
test set, so enterprise search users end up harmed by decisions made while 
optimizing for that e-commerce set. 

Another challenge is getting reliable relevance judgments. Teams go deep into 
developing their methodology for creating a golden set of judgments. This of 
course can be a very domain-specific and challenging problem. There's no 
obvious one-size-fits-all approach. Some teams use human judges, others 
crowdsource, others are very analytics-based. Some have access to conversion 
data, others don't. You have all sorts of biases to contend with in every 
situation. And the judgments evolve over time (today's most relevant iPhone 
isn't the same as 2 years ago). So getting it right takes a lot of energy and 
time from mature search orgs. And it isn't clear which judgments/data to choose 
if you want to cover a broad range of use cases.
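
To make the concern about one domain's test set hiding harm to another a bit 
more concrete, here is a minimal sketch, not anything that exists in Lucene. It 
assumes graded judgments (doc id -> grade) per query, a hypothetical domain 
label per query, and nDCG@k with linear gain as the metric; all function names, 
the tolerance, and the toy data are made up for illustration. The idea is 
simply to report the metric per domain rather than as one global average, so a 
gain on e-commerce queries can't mask a loss on enterprise queries.

```python
import math
from collections import defaultdict


def dcg(gains):
    # Discounted cumulative gain: each grade is discounted by log2(rank + 1),
    # with ranks starting at 1 (hence the +2 on a 0-based index).
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg_at_k(ranked_doc_ids, judgments, k=10):
    # judgments: doc id -> graded relevance. Unjudged docs are treated as 0,
    # which is itself an assumption worth revisiting for sparse judgment sets.
    gains = [judgments.get(doc_id, 0) for doc_id in ranked_doc_ids[:k]]
    ideal_dcg = dcg(sorted(judgments.values(), reverse=True)[:k])
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def mean_ndcg_by_domain(runs, judgments_by_query, domain_of_query, k=10):
    # runs: query id -> ranked list of doc ids returned by one configuration.
    scores = defaultdict(list)
    for qid, ranked in runs.items():
        score = ndcg_at_k(ranked, judgments_by_query[qid], k)
        scores[domain_of_query[qid]].append(score)
    return {domain: sum(v) / len(v) for domain, v in scores.items()}


def regressed_domains(baseline, candidate, tolerance=0.01):
    # Domains whose mean nDCG dropped by more than `tolerance`.
    return sorted(d for d in baseline
                  if candidate.get(d, 0.0) < baseline[d] - tolerance)


# Hypothetical toy data: one query per domain.
judgments_by_query = {"q1": {"a": 3, "b": 1}, "q2": {"x": 2, "y": 0}}
domain_of_query = {"q1": "e-commerce", "q2": "enterprise"}
baseline_runs = {"q1": ["b", "a"], "q2": ["x", "y"]}
candidate_runs = {"q1": ["a", "b"], "q2": ["y", "x"]}

baseline = mean_ndcg_by_domain(baseline_runs, judgments_by_query, domain_of_query)
candidate = mean_ndcg_by_domain(candidate_runs, judgments_by_query, domain_of_query)
print(regressed_domains(baseline, candidate))  # -> ['enterprise']
```

In this toy run the candidate ranking improves the e-commerce query and hurts 
the enterprise one, and only the per-domain view flags that. None of this 
addresses the harder problem of where trustworthy judgments come from.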

I think the best case is to partner with some organizations that are willing to 
open up this data alongside their corpus, where we could validate and feel good 
about the methodology they use in generating judgments. You'd need to update 
the relevance judgments and corpus over time. There's of course TREC and other 
academic datasets; that's one data point. Some folks I know at Wikipedia have 
talked about this. But you'd want some more commercial datasets (corpus + 
judgments).

But partnering with orgs would also have limits, as this stuff has very 
high value to companies... But perhaps they'd be incentivized to open up their 
data if Lucene was going to make decisions with it that helped them?!?

 

> Explore Relevance Based Performance Benchmarks
> ----------------------------------------------
>
>                 Key: LUCENE-8841
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8841
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Atri Sharma
>            Priority: Major
>
> While discussing improvements in relevance of fuzzy queries with [~jimczi], 
> the topic of how to measure impact of changes to relevance of common queries 
> came up. While a non trivial effort, having such a benchmark will allow us to 
> measure the impact of potential changes and also catch regressions well in 
> time.
>  
> This Jira tracks ideas and efforts in that direction



