[jira] [Commented] (LUCENE-8791) Add CollectorRescorer

Elbek Kamoliddinov (JIRA) Fri, 17 May 2019 01:44:39 -0700


    [ 
https://issues.apache.org/jira/browse/LUCENE-8791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16842005#comment-16842005
 ]


Elbek Kamoliddinov commented on LUCENE-8791:
--------------------------------------------

Sorry, I was bit busy and could not work on feedback. I created another 
iteration with some changes. 
{quote}
1) In buildSegmentScorers, could we rename maxDoc as endDoc? Since maxDoc is 
what is used to typically denote the number of documents in a segment, the code 
can be conflicting to read.
{quote}
I left the name as maxDoc, this is denoting max doc and used to check if we 
went beyond it so we advance. Is it convincing? 
{quote}
When mapping LeafSlices to SegmentScorers, we do another loop post the 
construction of SegmentScorers. Is there a way where we could do the mapping in 
buildSegmentScorers itself at the time of scorer task allocation?
{quote}
The thing is slice may contain out of order segments, for example slice 0 may 
contain segment 0 and segment 3. So we have to do full iteration here first 
then map segments to the slices. 
{quote}
3) I am a bit unsure about how the rescorer method works. If IndexSearcher has 
non empty slices, then that implies that a non null ExecutorService was passed 
in to IndexSearcher. Is there any correlation to that and CollectorRescorer 
using an ExecutorService?
{quote}
I think it depends on use case, one may want to run index searcher with 
executor but wants this run sequential? I made some changes there, so now slice 
works only with executor, if executor is null then slice is ignored. Also if 
slice is null but executor is not null (and segment count is >1) then resocres 
each segment in a separate thread.  
{quote}
4) Could you please break up rescorer method into more modular objects? 
Ideally, a method each for sequential and parallel cases. Also, there is some 
code duplication at the tail where you do the sequential execution, that should 
get eliminated.
{quote}
Done, but I did it based on slice or segment based. Because in both new methods 
it maybe possible that execution happens in parallel. 

{quote}
Thinking more about this, using LeafReaderContext.ord directly as positional 
parameter might be risky since LeafReaderContext has a constructor where it 
sets the ord to 0. Ideally that should not happen given that these segments are 
already scored once, but that might be one case worth thinking about?
{quote}
I don't think I understood this concern. You are saying there might be 
duplicate leaf with 0 ord or there might gap in the leaf ord? I don't think 
either one is possible. A lot of internal classes uses ord in such way, for 
example {{OrdinalMap}}.   

Let me know what you think.




> Add CollectorRescorer
> ---------------------
>
>                 Key: LUCENE-8791
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8791
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Elbek Kamoliddinov
>            Priority: Major
>         Attachments: LUCENE-8791.patch, LUCENE-8791.patch, LUCENE-8791.patch
>
>
> This is another implementation of query rescorer api (LUCENE-5489). It adds 
> rescoring functionality based on provided CollectorManager. 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (LUCENE-8791) Add CollectorRescorer

Reply via email to