s1monw commented on code in PR #12829: URL: https://github.com/apache/lucene/pull/12829#discussion_r1423637103
##########
lucene/core/src/java/org/apache/lucene/index/IndexingChain.java:
##########
@@ -219,15 +222,33 @@ private Sorter.DocMap maybeSortSegment(SegmentWriteState state) throws IOExcepti
     }
     LeafReader docValuesReader = getDocValuesLeafReader();
-
+    Function<IndexSorter.DocComparator, IndexSorter.DocComparator> comparatorWrapper = in -> in;
+
+    if (state.segmentInfo.getHasBlocks() && indexSort.getParentField() != null) {
+      final DocIdSetIterator readerValues =
+          docValuesReader.getNumericDocValues(indexSort.getParentField());
+      BitSet parents = BitSet.of(readerValues, state.segmentInfo.maxDoc());
+      comparatorWrapper =
+          in ->
+              (docID1, docID2) ->
+                  in.compare(parents.nextSetBit(docID1), parents.nextSetBit(docID2));
+    }
+    assert state.segmentInfo.getHasBlocks() == false
+            || indexSort.getParentField() != null
+            || indexCreatedVersionMajor < Version.LUCENE_10_0_0.major
+        : "parent field is not set but the index has blocks. indexCreatedVersionMajor: "
+            + indexCreatedVersionMajor;
     List<IndexSorter.DocComparator> comparators = new ArrayList<>();
     for (int i = 0; i < indexSort.getSort().length; i++) {
       SortField sortField = indexSort.getSort()[i];
       IndexSorter sorter = sortField.getIndexSorter();
       if (sorter == null) {
         throw new UnsupportedOperationException("Cannot sort index using sort field " + sortField);
       }
-      comparators.add(sorter.getDocComparator(docValuesReader, state.segmentInfo.maxDoc()));
+
+      IndexSorter.DocComparator docComparator =

Review Comment:
   I also thought a bit about other uses of this field that we should evaluate. One of the main things that makes me worried is the fact that our delete API doesn't give the guarantees that it should, IMO. Today you can delete just the parent without its children, which will then in turn magically or erroneously merge adjacent blocks together, and searches will return broken results. With this field we can fix applying deletes so that deleting a parent also deletes all of its children, which is the right thing to do in this case.
   There might be more use cases for this down the road, mainly for index consistency.
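
   To make the delete concern concrete, here is a minimal sketch of cascading a parent delete to its children. It is an illustration only: it uses `java.util.BitSet` instead of Lucene's own `BitSet`/`Bits`, it assumes the usual block layout where the parent is the last document of its block, and `BlockDeleteSketch` and `cascadeParentDeletes` are hypothetical names, not existing Lucene API.

```java
import java.util.BitSet;

/** Hypothetical helper, not part of Lucene: cascades parent deletes to their children. */
class BlockDeleteSketch {

  /**
   * For every parent doc that is no longer live, clears the live bit of all
   * children in the same block. Assumes each block is laid out as
   * [child, child, ..., parent], i.e. the parent is the last doc of its block.
   *
   * @param parents  bit set with one bit per parent doc (assumed to come from the parent field)
   * @param liveDocs bit set of live docs; modified in place
   * @param maxDoc   number of docs in the segment
   */
  static void cascadeParentDeletes(BitSet parents, BitSet liveDocs, int maxDoc) {
    int blockStart = 0;
    for (int parent = parents.nextSetBit(0);
        parent >= 0 && parent < maxDoc;
        parent = parents.nextSetBit(parent + 1)) {
      if (!liveDocs.get(parent)) {
        // Docs in [blockStart, parent) are the children of this deleted parent.
        liveDocs.clear(blockStart, parent);
      }
      blockStart = parent + 1;
    }
  }
}
```

   With two blocks `[0, 1, 2]` and `[3, 4, 5]` (parents at 2 and 5), deleting doc 2 and running the helper also marks docs 0 and 1 as deleted, so they can no longer be picked up as children of the parent at doc 5.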