[jira] [Commented] (LUCENE-3320) Explore Proximity Scoring

2021-03-13 Thread Tomoko Uchida (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17300785#comment-17300785
 ] 

Tomoko Uchida commented on LUCENE-3320:
---

Just a note: I found a variation of Büttcher's original algorithm that 
combines proximity scoring with faster top-k retrieval (dynamic early 
termination):
https://domino.mpi-inf.mpg.de/intranet/ag5/ag5publ.nsf/AuthorEditorIndividualView/f778c4b7609bee48c1257301002f4b0c/$FILE/schenkelBHTW-SPIRE07.pdf
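
For readers unfamiliar with accumulator-style proximity scoring, here is a
minimal sketch in Python. It is not Lucene code and not taken verbatim from
either paper: the function names, the input layout (a dict of term to sorted
positions for one document), and the default constants `k1` and `K` are
assumptions made for illustration. The idea shown is that each pair of
adjacent, distinct query terms in a document adds weight / distance^2 to both
terms' accumulators, and the accumulators are then passed through a saturating
BM25-like function.

```python
from collections import defaultdict


def proximity_accumulators(positions, weights):
    """Accumulate a proximity value per query term for one document.

    positions: dict mapping query term -> sorted list of token positions
    weights:   dict mapping query term -> term weight (for example an IDF)
    """
    # Merge all postings into one stream ordered by position.
    stream = sorted((p, t) for t, ps in positions.items() for p in ps)
    acc = defaultdict(float)
    prev = None
    for pos, term in stream:
        if prev is not None:
            prev_pos, prev_term = prev
            d = pos - prev_pos
            # Only adjacent occurrences of *different* query terms contribute.
            # d > 0 guards against terms indexed at the same position.
            if prev_term != term and d > 0:
                acc[term] += weights[prev_term] / (d * d)
                acc[prev_term] += weights[term] / (d * d)
        prev = (pos, term)
    return dict(acc)


def proximity_score(acc, weights, k1=1.2, K=1.2):
    """Saturate each accumulator BM25-style and sum over the query terms."""
    return sum(
        min(1.0, weights[t]) * a * (k1 + 1.0) / (a + K)
        for t, a in acc.items()
    )
```

With equal weights, a document where the two query terms are adjacent scores
higher than one where they are ten positions apart, which is the behaviour a
proximity component is meant to add on top of the plain BM25 score.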

To me, the main challenge here will be storing an upper bound of the proximity 
score (which can be calculated at index time) within the term postings.
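
To illustrate why such an upper bound matters, here is a hedged Python sketch
of one way it could be used at query time. Everything in it is hypothetical:
the `(doc_id, base_score, prox_upper_bound)` candidate layout, the
`exact_proximity` callback, and the function name are assumptions for the
example, not an existing Lucene API or the algorithm from the linked paper. It
only shows the pruning idea: if a document's cheap base score plus its stored
proximity bound cannot beat the current k-th best score, the expensive
position-reading step can be skipped.

```python
import heapq


def top_k_with_bounds(candidates, k, exact_proximity):
    """Return (top-k hits, number of documents fully evaluated).

    candidates:      iterable of (doc_id, base_score, prox_upper_bound);
                     the bound must be >= the true proximity score
    exact_proximity: callable doc_id -> float, assumed expensive because
                     it has to read positions
    """
    heap = []  # min-heap of (score, doc_id); heap[0] is the k-th best so far
    evaluated = 0
    for doc_id, base, bound in candidates:
        # Skip documents that cannot enter the top k even in the best case.
        if len(heap) == k and base + bound <= heap[0][0]:
            continue
        score = base + exact_proximity(doc_id)
        evaluated += 1
        if len(heap) < k:
            heapq.heappush(heap, (score, doc_id))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, doc_id))
    return sorted(heap, reverse=True), evaluated
```

The tighter the bound stored with the postings, the more documents can be
skipped without changing the top-k result, which is why where and how to store
it is the interesting design question.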

> Explore Proximity Scoring
> -------------------------
>
>                 Key: LUCENE-3320
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3320
>             Project: Lucene - Core
>          Issue Type: Sub-task
>          Components: core/search
>    Affects Versions: Positions Branch
>            Reporter: Simon Willnauer
>            Priority: Major
>             Fix For: Positions Branch
>
>
> Positions will be first-class citizens sooner rather than later. We should 
> explore proximity scoring possibilities as well as collection / scoring 
> algorithms like those proposed in LUCENE-2878 (two-phase collection).
> This paper might provide some basis for actual scoring implementation: 
> http://plg.uwaterloo.ca/~claclark/sigir2006_term_proximity.pdf



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3320) Explore Proximity Scoring

2021-03-04 Thread Tomoko Uchida (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295393#comment-17295393
 ] 

Tomoko Uchida commented on LUCENE-3320:
---

Thanks [~mikemccand] for the pointer!

I think this would be a great improvement, especially for long queries or 
natural language queries. I need proximity scoring for a project I'm 
currently working on, so I will give it a try.




[jira] [Commented] (LUCENE-3320) Explore Proximity Scoring

2021-03-03 Thread Michael McCandless (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17294647#comment-17294647
 ] 

Michael McCandless commented on LUCENE-3320:


[~tomoko] yeah +1 to start exploring this, now that "first class positions 
iterators" (LUCENE-2878) has landed, long ago!

Here is another paper that explores how one might fold positions into BM25 
through "virtual regions" that are field-like: 
[https://www.dc.fi.udc.es/~roi/publications/sigir2012a.pdf]

Not sure how it compares to the above paper!

 




[jira] [Commented] (LUCENE-3320) Explore Proximity Scoring

2021-02-25 Thread Tomoko Uchida (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17291335#comment-17291335
 ] 

Tomoko Uchida commented on LUCENE-3320:
---

I am very new to this issue. LUCENE-2878 was resolved a few years ago; is 
there any possibility of moving this issue forward, or is anyone already 
working on it? Thanks.
