magibney commented on a change in pull request #592:
URL: https://github.com/apache/solr/pull/592#discussion_r838851191



##########
File path: solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java
##########
@@ -935,22 +956,96 @@ private DocSet getAndCacheDocSet(Query query) throws 
IOException {
     return filterCache.computeIfAbsent(query, q -> getDocSetNC(q, null));
   }
 
-  private static Query matchAllDocsQuery = new MatchAllDocsQuery();
-  private volatile BitDocSet liveDocs;
+  private static final MatchAllDocsQuery MATCH_ALL_DOCS_QUERY = new 
MatchAllDocsQuery();
+
+  /**
+   * A naively cached canonical `liveDocs` DocSet. This does not need to be 
volatile. It may be set
+   * multiple times, but should always be set to the same value, as all set 
values should pass
+   * through `liveDocsCache.computeIfAbsent`
+   */
+  private BitDocSet liveDocs;
+
+  private static final BitDocSet EMPTY = new BitDocSet(new FixedBitSet(0), 0);
+
+  private BitDocSet computeLiveDocs() {
+    switch (leafContexts.size()) {
+      case 0:
+        assert numDocs() == 0;
+        return EMPTY;
+      case 1:
+        final Bits onlySegLiveDocs = 
leafContexts.get(0).reader().getLiveDocs();
+        final FixedBitSet fbs;
+        if (onlySegLiveDocs == null) {
+          // `LeafReader.getLiveDocs()` returns null if no deleted docs -- 
accordingly, set all bits
+          final int onlySegMaxDoc = maxDoc();
+          fbs = new FixedBitSet(onlySegMaxDoc);
+          fbs.set(0, onlySegMaxDoc);
+        } else {
+          fbs = FixedBitSet.copyOf(onlySegLiveDocs);
+        }
+        assert fbs.cardinality() == numDocs();
+        return new BitDocSet(fbs, numDocs());
+      default:
+        final FixedBitSet bs = new FixedBitSet(maxDoc());
+        for (LeafReaderContext ctx : leafContexts) {
+          final LeafReader r = ctx.reader();
+          final Bits segLiveDocs = r.getLiveDocs();
+          final int segDocBase = ctx.docBase;
+          if (segLiveDocs == null) {
+            // `LeafReader.getLiveDocs()` returns null if no deleted docs -- 
accordingly, set all
+            // bits in seg range
+            bs.set(segDocBase, segDocBase + r.maxDoc());
+          } else {
+            DocSetUtil.copyTo(segLiveDocs, 0, r.maxDoc(), bs, segDocBase);
+          }
+        }
+        assert bs.cardinality() == numDocs();
+        return new BitDocSet(bs, numDocs());
+    }
+  }
+
+  private BitDocSet populateLiveDocs(Supplier<BitDocSet> liveDocsSupplier) {
+    final boolean computeInline;
+    final CompletableFuture<BitDocSet> liveDocsCacheInstance;
+    synchronized (liveDocsCache) {
+      if (liveDocsCache[0] != null) {
+        computeInline = false;
+        liveDocsCacheInstance = liveDocsCache[0];
+      } else {
+        computeInline = true;
+        liveDocsCacheInstance = new CompletableFuture<>();
+        liveDocsCache[0] = liveDocsCacheInstance;
+      }
+    }
+    final BitDocSet docs;
+    if (computeInline) {
+      docs = liveDocsSupplier.get();
+      liveDocsCacheInstance.complete(docs);
+      liveDocs = docs;
+      liveDocsInsertsCount.increment();
+    } else {

Review comment:
       Well I guess the only downside of running the computation within the 
synchronized block is that you lose the metrics distinction between 
`liveDocsHitCount` and `liveDocsAsyncHitCount`. That's probably mainly a 
curiosity now ("during the initialization of liveDocs, approximately how many 
redundant livedocs computations were avoided?"); the initial hope was to use 
that metrics distinction in tests to verify that requests which _would_ have 
resulted in redundant livedocs computations were in fact waiting on a single 
computation. But I kind of dropped that -- see 
[here](https://github.com/apache/solr/blob/91222454725122a76399348eb3143e373d2b222c/solr/core/src/test/org/apache/solr/search/TestMainQueryCaching.java#L445-L460)
 and [here](#discussion_r806102113).
   
   If deciding to move computation inside the synchronized block, I think it'd 
make sense to remove the metrics distinction, since iiuc it would no longer be 
measuring anything meaningful.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to