[
https://issues.apache.org/jira/browse/LUCENE-7255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15257952#comment-15257952
]
Christine Poerschke commented on LUCENE-7255:
---------------------------------------------
bq. Do you think it would be possible to directly pass the page size to the
{{EarlyTerminatingSortingCollector}} instead of the number of documents to
collect?
Do you mean something like this, i.e. an additional convenience
constructor?
{code}
- public EarlyTerminatingSortingCollector(Collector in, Sort sort, int numDocsToCollect, Sort mergePolicySort) {
-   ...
- }
+ public EarlyTerminatingSortingCollector(Collector in, Sort sort, int numDocsToCollect, Sort mergePolicySort) {
+   this(in, sort, 0, numDocsToCollect, mergePolicySort);
+ }
+
+ public EarlyTerminatingSortingCollector(Collector in, Sort sort, int numToSkip, int numWanted, Sort mergePolicySort) {
+   ...
+ }
{code}
That sounds good to me. [~shaie], [~rcmuir], [~jpountz] - what do you think?
bq. I will be happy adding the paging test to
{{TestEarlyTerminatingSortingCollector}}, if you want.
+1 for a paging test.
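To illustrate why a separate {{numToSkip}} could matter, here is a toy model in plain Java. It is not Lucene code and makes a simplifying assumption: the early-terminating wrapper stops after visiting a fixed number of documents in a sorted segment, regardless of whether the wrapped paging collector kept them. The names {{PagingSketch}}, {{collectPage}} and {{maxVisited}} are hypothetical and exist only for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of early termination over one index-sorted segment; not Lucene code. */
final class PagingSketch {

    /**
     * Visits values in index-sort order and stops after maxVisited documents.
     * Keeps up to pageSize values strictly greater than 'after'
     * (after == null means "first page").
     */
    static List<Integer> collectPage(int[] sortedSegment, Integer after, int pageSize, int maxVisited) {
        List<Integer> hits = new ArrayList<>();
        int visited = 0;
        for (int value : sortedSegment) {
            if (visited >= maxVisited) {
                break; // early termination: the visit budget is exhausted
            }
            visited++;
            if (after != null && value <= after) {
                continue; // the paging filter rejects documents at or before 'after'
            }
            if (hits.size() < pageSize) {
                hits.add(value);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        int[] segment = new int[1000];
        for (int i = 0; i < segment.length; i++) {
            segment[i] = i + 1; // values 1..1000, already in sort order
        }
        int pageSize = 10;
        // First page: a budget of pageSize is enough.
        System.out.println(collectPage(segment, null, pageSize, pageSize));
        // Second page with the same budget: all visited docs are rejected, so the page is empty.
        System.out.println(collectPage(segment, 10, pageSize, pageSize));
        // Second page with budget = numToSkip + numWanted: values 11..20 are found.
        System.out.println(collectPage(segment, 10, pageSize, 10 + pageSize));
    }
}
```

In this model the second page comes back empty when the budget equals the page size, and is filled once the budget also covers the documents skipped by paging. Whether the real collector behaves exactly like this should be confirmed by the proposed paging test.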
> Paging with SortingMergePolicy and EarlyTerminatingSortingCollector
> -------------------------------------------------------------------
>
> Key: LUCENE-7255
> URL: https://issues.apache.org/jira/browse/LUCENE-7255
> Project: Lucene - Core
> Issue Type: Bug
> Affects Versions: 5.3, 5.4, 5.5, 6.0
> Reporter: Andrés de la Peña
> Labels: EarlyTerminatingSortingCollector, pagination, paging,
> searchafter, sortingmergepolicy
>
> {{EarlyTerminatingSortingCollector}} does not seem to work when used with a
> {{TopDocsCollector}} searching for documents after a certain {{FieldDoc}}.
> That is, it can't be used for paging. The following code reproduces
> the problem:
> {code}
> // Sort to be used both with merge policy and queries
> Sort sort = new Sort(new SortedNumericSortField(FIELD_NAME, SortField.Type.INT));
> // Create directory
> RAMDirectory directory = new RAMDirectory();
> // Setup merge policy
> TieredMergePolicy tieredMergePolicy = new TieredMergePolicy();
> SortingMergePolicy sortingMergePolicy = new SortingMergePolicy(tieredMergePolicy, sort);
> // Setup index writer
> IndexWriterConfig indexWriterConfig = new IndexWriterConfig(new SimpleAnalyzer());
> indexWriterConfig.setOpenMode(IndexWriterConfig.OpenMode.CREATE_OR_APPEND);
> indexWriterConfig.setMergePolicy(sortingMergePolicy);
> IndexWriter indexWriter = new IndexWriter(directory, indexWriterConfig);
> // Index values
> for (int i = 1; i <= 1000; i++) {
>   Document document = new Document();
>   document.add(new NumericDocValuesField(FIELD_NAME, i));
>   indexWriter.addDocument(document);
> }
> // Force index merge to ensure early termination
> indexWriter.forceMerge(1, true);
> indexWriter.commit();
> // Create index searcher
> IndexReader reader = DirectoryReader.open(directory);
> IndexSearcher searcher = new IndexSearcher(reader);
> // Paginated read
> int pageSize = 10;
> FieldDoc pageStart = null;
> while (true) {
>   logger.info("Collecting page starting at: {}", pageStart);
>   Query query = new MatchAllDocsQuery();
>   TopDocsCollector tfc = TopFieldCollector.create(sort, pageSize, pageStart, true, false, false);
>   EarlyTerminatingSortingCollector collector = new EarlyTerminatingSortingCollector(tfc, sort, pageSize, sort);
>   searcher.search(query, collector);
>   ScoreDoc[] scoreDocs = tfc.topDocs().scoreDocs;
>   for (ScoreDoc scoreDoc : scoreDocs) {
>     pageStart = (FieldDoc) scoreDoc;
>     logger.info("FOUND {}", scoreDoc);
>   }
>   logger.info("Terminated early: {}", collector.terminatedEarly());
>   if (scoreDocs.length < pageSize) break;
> }
> // Close
> reader.close();
> indexWriter.close();
> directory.close();
> {code}
> The query for the second page doesn't return any results. However, it returns
> the expected results if we don't wrap the {{TopFieldCollector}} with the
> {{EarlyTerminatingSortingCollector}}.