Following up on my questions since they didn't get much love the first time. Any input is greatly appreciated!

Thanks,
Rahul

On Wed, Sep 14, 2022 at 3:58 PM Rahul Goswami <rahul196...@gmail.com> wrote:

> Hello,
>
> I was going through some parts of the Lucene source and had some questions:
>
> 1) Can lucene have 0 document segments? Or will they always be purged
> (either by TMP or otherwise) on a commit?
> Eg: A segment has 4 docs, and I make a /update call to overwrite all 4
> docs (so deleted docs == max docs) and call commit. Will/Can this segment
> still exist after commit?
>
> 2) Starting Lucene 7.0, each segment also stores a "minVersion" which
> tracks the min version of the segment that contributed docs to this
> segment.
>
> https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.11.1/lucene/core/src/java/org/apache/lucene/index/SegmentInfo.java#L83
>
> Reading through LUCENE-7756 I see that one reason to have minVersion was
> to have the entire version of the original index stored somewhere since a
> change was made to store only the major version at the index level (in
> SegmentInfos)
>
> https://issues.apache.org/jira/browse/LUCENE-7756?focusedCommentId=15945863&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-15945863
>
> Checking the code, I found it's being consulted for any signs of index
> corruption but that was pretty much it. Curious if there is any other
> intended/planned use for minVersion? Eg: some choice of codec at read time
> based on this field or anything else?
>
> Thanks,
> Rahul