[
https://issues.apache.org/jira/browse/LUCENE-9418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17190420#comment-17190420
]
Brian Coverstone edited comment on LUCENE-9418 at 9/3/20, 10:44 PM:
--------------------------------------------------------------------
I believe this may still be an issue in 8.6.0: the last slot of the results
often holds an incorrect record.
A workaround I found is to always request one more record than needed.
Here is some pseudocode to demonstrate:
{quote}ComplexPhraseQueryParser cpqp = new
ComplexPhraseQueryParser("somefield", analyzer);
Query query = cpqp.parse("somevalue");
int pageSize = 10;
int pageNum = 1;
int requestedRecords = pageSize * pageNum + 1; // +1 workaround
int startOffset = (pageNum - 1) * pageSize;
FieldComparatorSource fsc = new FieldComparatorSource() {
@Override
public FieldComparator<String> newComparator(String fieldname, int
numhits, int sortPos, boolean reversed)
{
return new StringValComparatorIgnoreCase(numhits, fieldname);
}
};
Sort sort = new Sort(new SortField("firstname", fsc, false));
IndexSearcher searcher = new IndexSearcher(reader);
TopFieldCollector tfcollector = TopFieldCollector.create(sort,
requestedRecords, Integer.MAX_VALUE);
searcher.search(query, tfcollector);
ScoreDoc[] hits = tfcollector.topDocs(startOffset, pageSize).scoreDocs;
{quote}
At this point "hits" is correct. However, if I remove the "+1" from the
requestedRecords above, the last item in "hits" is often incorrect.
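The paging arithmetic in the pseudocode can be isolated into a small helper, which makes it easier to see what changes when the "+1" is dropped. This is only a minimal sketch, not Lucene code: the class name PageWindow, its method names, and the "extra" parameter (1 for the workaround, 0 without it) are hypothetical.

```java
public class PageWindow {
    /**
     * Number of hits to request from the collector.
     * extra = 1 applies the "+1" workaround; extra = 0 requests exactly
     * pageSize * pageNum hits.
     */
    public static int requestedRecords(int pageSize, int pageNum, int extra) {
        return pageSize * pageNum + extra;
    }

    /** Zero-based offset of the first hit on the given 1-based page. */
    public static int startOffset(int pageSize, int pageNum) {
        return (pageNum - 1) * pageSize;
    }

    public static void main(String[] args) {
        int pageSize = 10;
        for (int pageNum = 1; pageNum <= 3; pageNum++) {
            System.out.println("page " + pageNum
                    + ": request " + requestedRecords(pageSize, pageNum, 1)
                    + " hits, read " + pageSize
                    + " starting at offset " + startOffset(pageSize, pageNum));
        }
    }
}
```

With the workaround, the collector is sized one larger than the window being read, so the last hit on the page is never the last slot of the collector.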
was (Author: brain2000):
I believe this may still be an issue in 8.6.0, as I'm finding the last slot can
often have an incorrect record.
I found a workaround, and that is to always select 1 more than needed.
Here is some pseudo code to demonstrate:
{quote}ComplexPhraseQueryParser cpqp = new
ComplexPhraseQueryParser("somefield", analyzer);
Query query = cpqp.parse("somevalue");
pageSize = 10;
pageNum = 1;
requestedRecords = pageSize * pageNum + 1; //+1 workaround
startOffset = (pageNum - 1) * pageSize;
FieldComparatorSource fsc = new FieldComparatorSource() {
@Override
public FieldComparator<String> newComparator(String fieldname, int
numhits, int sortPos, boolean reversed)
{
return new StringValComparatorIgnoreCase(numhits, fieldname);
}
};
Sort sort = new Sort(new SortField("firstname", fsc, false));
IndexSearcher searcher = new IndexSearcher(reader);
TopFieldCollector tfcollector = TopFieldCollector.create(sort,
requestedRecords + 1, Integer.MAX_VALUE);
searcher.search(query, tfcollector);
ScoreDoc[] hits = tfcollector.topDocs(startOffset, pageSize).scoreDocs;
{quote}
At this point "hits" is correct. However, if I remove the "+1" from the
requestedRecords above, the last item in "hits" is often incorrect.
> Ordered intervals can give inaccurate hits on interleaved terms
> ---------------------------------------------------------------
>
> Key: LUCENE-9418
> URL: https://issues.apache.org/jira/browse/LUCENE-9418
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Alan Woodward
> Assignee: Alan Woodward
> Priority: Major
> Fix For: 8.6
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Given the text 'A B A C', an ordered interval over 'A B C' will return the
> inaccurate interval [2, 3], due to the way minimization is handled after
> matches are found.
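To make the example in the issue description concrete, here is a brute-force sketch that lists the ordered matches of 'A B C' in 'A B A C'. It is not Lucene's interval algorithm and performs no minimization; the class and method names are hypothetical, positions are assumed to be zero-based (which is consistent with the [2, 3] quoted in the issue), and "terms" is assumed to be non-empty.

```java
import java.util.ArrayList;
import java.util.List;

public class OrderedIntervalSketch {
    /**
     * For every occurrence of the first term, walks forward to the earliest
     * following occurrence of each remaining term. Returns one {start, end}
     * pair of token positions for each start that completes the sequence.
     */
    public static List<int[]> orderedIntervals(String[] tokens, String[] terms) {
        List<int[]> result = new ArrayList<>();
        for (int start = 0; start < tokens.length; start++) {
            if (!tokens[start].equals(terms[0])) {
                continue;
            }
            int pos = start;
            boolean matched = true;
            for (int t = 1; t < terms.length; t++) {
                int next = -1;
                // Earliest occurrence of terms[t] strictly after pos.
                for (int i = pos + 1; i < tokens.length; i++) {
                    if (tokens[i].equals(terms[t])) {
                        next = i;
                        break;
                    }
                }
                if (next < 0) {
                    matched = false;
                    break;
                }
                pos = next;
            }
            if (matched) {
                result.add(new int[] {start, pos});
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] tokens = {"A", "B", "A", "C"};
        String[] terms = {"A", "B", "C"};
        for (int[] interval : orderedIntervals(tokens, terms)) {
            System.out.println("[" + interval[0] + ", " + interval[1] + "]");
        }
    }
}
```

For the issue's input the only match found is [0, 3]: the second 'A' at position 2 has no 'B' after it, so an interval starting at position 2 cannot cover 'A B C' in order.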
--
This message was sent by Atlassian Jira
(v8.3.4#803005)