[
https://issues.apache.org/jira/browse/LUCENE-8791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16842005#comment-16842005
]
Elbek Kamoliddinov commented on LUCENE-8791:
--------------------------------------------
Sorry, I was bit busy and could not work on feedback. I created another
iteration with some changes.
{quote}
1) In buildSegmentScorers, could we rename maxDoc as endDoc? Since maxDoc is
what is used to typically denote the number of documents in a segment, the code
can be conflicting to read.
{quote}
I left the name as maxDoc, this is denoting max doc and used to check if we
went beyond it so we advance. Is it convincing?
{quote}
When mapping LeafSlices to SegmentScorers, we do another loop post the
construction of SegmentScorers. Is there a way where we could do the mapping in
buildSegmentScorers itself at the time of scorer task allocation?
{quote}
The thing is slice may contain out of order segments, for example slice 0 may
contain segment 0 and segment 3. So we have to do full iteration here first
then map segments to the slices.
{quote}
3) I am a bit unsure about how the rescorer method works. If IndexSearcher has
non empty slices, then that implies that a non null ExecutorService was passed
in to IndexSearcher. Is there any correlation to that and CollectorRescorer
using an ExecutorService?
{quote}
I think it depends on use case, one may want to run index searcher with
executor but wants this run sequential? I made some changes there, so now slice
works only with executor, if executor is null then slice is ignored. Also if
slice is null but executor is not null (and segment count is >1) then resocres
each segment in a separate thread.
{quote}
4) Could you please break up rescorer method into more modular objects?
Ideally, a method each for sequential and parallel cases. Also, there is some
code duplication at the tail where you do the sequential execution, that should
get eliminated.
{quote}
Done, but I did it based on slice or segment based. Because in both new methods
it maybe possible that execution happens in parallel.
{quote}
Thinking more about this, using LeafReaderContext.ord directly as positional
parameter might be risky since LeafReaderContext has a constructor where it
sets the ord to 0. Ideally that should not happen given that these segments are
already scored once, but that might be one case worth thinking about?
{quote}
I don't think I understood this concern. You are saying there might be
duplicate leaf with 0 ord or there might gap in the leaf ord? I don't think
either one is possible. A lot of internal classes uses ord in such way, for
example {{OrdinalMap}}.
Let me know what you think.
> Add CollectorRescorer
> ---------------------
>
> Key: LUCENE-8791
> URL: https://issues.apache.org/jira/browse/LUCENE-8791
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Elbek Kamoliddinov
> Priority: Major
> Attachments: LUCENE-8791.patch, LUCENE-8791.patch, LUCENE-8791.patch
>
>
> This is another implementation of query rescorer api (LUCENE-5489). It adds
> rescoring functionality based on provided CollectorManager.
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]