[GitHub] lucene-solr pull request #527: LUCENE-8609: Allow getting consistent docstat...
Github user asfgit closed the pull request at: https://github.com/apache/lucene-solr/pull/527 --- - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[GitHub] lucene-solr pull request #527: LUCENE-8609: Allow getting consistent docstat...
Github user dnhatn commented on a diff in the pull request:

    https://github.com/apache/lucene-solr/pull/527#discussion_r241832076

    --- Diff: lucene/core/src/test/org/apache/lucene/index/TestIndexWriter.java ---
    @@ -3300,7 +3315,7 @@ public int numDeletesToMerge(SegmentCommitInfo info, int delCount, IOSupplier
[GitHub] lucene-solr pull request #527: LUCENE-8609: Allow getting consistent docstat...
Github user dnhatn commented on a diff in the pull request:

    https://github.com/apache/lucene-solr/pull/527#discussion_r241832034

    --- Diff: lucene/core/src/test/org/apache/lucene/index/TestIndexWriter.java ---
    @@ -3147,7 +3162,7 @@ public void testSoftUpdateDocuments() throws IOException {
         for (SegmentCommitInfo info : writer.cloneSegmentInfos()) {
           numSoftDeleted += info.getSoftDelCount();
         }
    -    assertEquals(writer.maxDoc() - writer.numDocs(), numSoftDeleted);
    +    assertEquals(writer.getDocStats().maxDoc - writer.getDocStats().numDocs, numSoftDeleted);
    --- End diff --

    maybe use a single docStats?
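The suggestion amounts to fetching one snapshot and reading both fields from it, so the subtraction uses two numbers taken at the same point in time. Below is a minimal sketch of that pattern. `SingleSnapshot`, `Stats`, and `deletedCount` are hypothetical stand-ins for illustration and are not Lucene classes or methods.

```java
// Toy model only: Stats is a hypothetical stand-in for a stats snapshot object.
final class SingleSnapshot {
    static final class Stats {
        final int maxDoc;   // all documents, deleted ones included
        final int numDocs;  // live documents only

        Stats(int maxDoc, int numDocs) {
            this.maxDoc = maxDoc;
            this.numDocs = numDocs;
        }
    }

    /** Derives the deleted-document count from one snapshot, never from two. */
    static int deletedCount(Stats stats) {
        return stats.maxDoc - stats.numDocs;
    }

    public static void main(String[] args) {
        Stats stats = new Stats(12, 9); // fetched once, then reused for both fields
        System.out.println("deleted=" + deletedCount(stats));
    }
}
```

Holding the snapshot in a local variable also makes the assertion easier to read than two chained calls.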
[GitHub] lucene-solr pull request #527: LUCENE-8609: Allow getting consistent docstat...
Github user dnhatn commented on a diff in the pull request:

    https://github.com/apache/lucene-solr/pull/527#discussion_r241504268

    --- Diff: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java ---
    @@ -5289,4 +5289,48 @@ final synchronized boolean segmentCommitInfoExist(SegmentCommitInfo sci) {
       final synchronized SegmentInfos cloneSegmentInfos() {
         return segmentInfos.clone();
       }
    +
    +  /**
    +   * Returns accurate {@link DocStats} form this writer. This is equivalent to calling {@link #numDocs()} and {@link #maxDoc()}
    +   * but is not subject to race-conditions. The numDoc for instance can change after maxDoc is fetched that causes numDocs to be
    +   * greater than maxDoc which makes it hard to get accurate document stats from IndexWriter.
    +   */
    +  public synchronized DocStats getDocStats() {
    +    ensureOpen();
    +    int numDocs = docWriter.getNumDocs();
    +    int maxDoc = numDocs;
    +    for (final SegmentCommitInfo info : segmentInfos) {
    +      maxDoc = info.info.maxDoc();
    --- End diff --

    `=` -> `+=`.
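The effect of the suggested change can be checked on a toy model. The sketch below is not the Lucene implementation: `SegmentTotals`, `Segment`, `bufferedDocs`, and `computeStats` are hypothetical names, and the live-document arithmetic is an assumption added only for illustration. With `=`, the total would reflect only the last segment visited. With `+=`, it accumulates the buffered documents and every segment.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: Segment and SegmentTotals are hypothetical, not Lucene classes.
final class SegmentTotals {
    static final class Segment {
        final int maxDoc;    // documents in the segment, deleted ones included
        final int delCount;  // deleted documents in the segment

        Segment(int maxDoc, int delCount) {
            this.maxDoc = maxDoc;
            this.delCount = delCount;
        }
    }

    /** Returns {maxDoc, numDocs} for the buffered documents plus all segments. */
    static int[] computeStats(int bufferedDocs, List<Segment> segments) {
        int numDocs = bufferedDocs;
        int maxDoc = bufferedDocs;
        for (Segment segment : segments) {
            // "+=" accumulates; a plain "=" would keep only the last segment's value.
            maxDoc += segment.maxDoc;
            numDocs += segment.maxDoc - segment.delCount;
        }
        return new int[] {maxDoc, numDocs};
    }

    public static void main(String[] args) {
        List<Segment> segments = Arrays.asList(new Segment(10, 2), new Segment(5, 0));
        int[] stats = computeStats(3, segments);
        System.out.println("maxDoc=" + stats[0] + " numDocs=" + stats[1]);
    }
}
```

For 3 buffered documents and segments of 10 (2 deleted) and 5 documents, the sketch yields maxDoc=18 and numDocs=16. With `=` in the loop, maxDoc would end up as 5, which is smaller than numDocs.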
[GitHub] lucene-solr pull request #527: LUCENE-8609: Allow getting consistent docstat...
GitHub user s1monw opened a pull request:

    https://github.com/apache/lucene-solr/pull/527

    LUCENE-8609: Allow getting consistent docstats from IndexWriter

    Today we have #numDocs() and #maxDoc() on IndexWriter. These are enough to get all stats for the current index, but the two calls are subject to concurrent updates and might return numbers that are not consistent with each other, i.e. in some cases maxDoc < numDocs, which is undesirable. This change adds a getDocStats() method to IndexWriter to allow fetching consistent numbers for these stats.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/s1monw/lucene-solr docstats

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/lucene-solr/pull/527.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #527

commit 6721c40e16c485038fd092dbaa672204e6fdb3c6
Author: Simon Willnauer
Date:   2018-12-13T15:05:47Z

    LUCENE-8609: Allow getting consistent docstats from IndexWriter

    Today we have #numDocs() and #maxDoc() on IndexWriter. This is enough to get all stats for the current index but it's subject to concurrency and might return numbers that are not consistent ie. some cases can return maxDoc < numDocs which is undesirable. This change adds a getDocStats() method to index writer to allow fetching consistent numbers for these stats.
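The inconsistency described in the pull request can be reproduced deterministically in a small model. This is a minimal sketch, not Lucene code: `ToyWriter`, `addDocument`, and the toy `DocStats` class are hypothetical stand-ins, and the interleaving is simulated by calling the writer between the two reads instead of using real threads.

```java
// Toy model only: ToyWriter and its DocStats are hypothetical stand-ins, not Lucene classes.
final class ToyWriter {
    /** Immutable pair captured under a single lock acquisition. */
    static final class DocStats {
        final int maxDoc;
        final int numDocs;

        DocStats(int maxDoc, int numDocs) {
            this.maxDoc = maxDoc;
            this.numDocs = numDocs;
        }
    }

    private int maxDoc;   // all documents, deleted ones included
    private int numDocs;  // live documents only

    synchronized void addDocument() {
        maxDoc++;
        numDocs++;
    }

    synchronized int maxDoc() {
        return maxDoc;
    }

    synchronized int numDocs() {
        return numDocs;
    }

    /** Reads both counters while holding the lock, so they describe the same state. */
    synchronized DocStats getDocStats() {
        return new DocStats(maxDoc, numDocs);
    }

    public static void main(String[] args) {
        ToyWriter writer = new ToyWriter();
        writer.addDocument();

        // Two separate reads: another thread may index between them.
        int maxDoc = writer.maxDoc();
        writer.addDocument(); // simulated concurrent update
        int numDocs = writer.numDocs();
        System.out.println("separate reads: maxDoc=" + maxDoc + " numDocs=" + numDocs);

        // One snapshot: later updates cannot change values already captured.
        DocStats stats = writer.getDocStats();
        writer.addDocument();
        System.out.println("snapshot: maxDoc=" + stats.maxDoc + " numDocs=" + stats.numDocs);
    }
}
```

In this interleaving the separate reads observe maxDoc=1 and numDocs=2, while the snapshot stays at maxDoc=2 and numDocs=2 even after another document is added. Each getter is synchronized on its own, but that does not help because the lock is released between the two calls.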