dsmiley commented on a change in pull request #592:
URL: https://github.com/apache/solr/pull/592#discussion_r800100152
##########
File path: solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java
##########
@@ -854,8 +868,58 @@ private DocSet getAndCacheDocSet(Query query) throws
IOException {
return filterCache.computeIfAbsent(query, q -> getDocSetNC(q, null));
}
- private static Query matchAllDocsQuery = new MatchAllDocsQuery();
+ private static final Query MATCH_ALL_DOCS_QUERY = new MatchAllDocsQuery();
private volatile BitDocSet liveDocs;
+ private final FutureTask<BitDocSet> liveDocsFuture = new
FutureTask<>(this::computeLiveDocs);
+
+ private BitDocSet computeLiveDocs() {
Review comment:
So you wrote all this low level code to avoid a simple MatchAllDocsQuery
with a DocSetCollector? See
org.apache.solr.search.SolrIndexSearcher.getDocSetNC which calls DocSetUtil.
Even if we choose to keep your low level code, I'd propose it go to a special
case inside of DocSetUtil when the query is MatchAllDocsQuery.
##########
File path: solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java
##########
@@ -1299,6 +1385,47 @@ public DocList getDocList(Query query, List<Query>
filterList, Sort lsort, int o
public static final int GET_DOCLIST = 0x02; // get the documents actually
returned in a response
public static final int GET_SCORES = 0x01;
+ private static boolean isConstantScoreQuery(Query q) {
Review comment:
I've seen a need for this elsewhere. Let's make it handle other cases
too (e.g. MatchAllDocsQuery, MatchNoDocsQuery, "Filter" (soon to be
DocSetQuery)) and be recursive to the cases below (hey why not). This could
move to QueryUtils rather than having the massive SolrIndexSearcher grow
##########
File path: solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java
##########
@@ -865,9 +929,31 @@ public BitDocSet getLiveDocSet() throws IOException {
// Going through the filter cache will provide thread safety here if we
only had getLiveDocs,
// but the addition of setLiveDocs means we needed to add volatile to
"liveDocs".
BitDocSet docs = liveDocs;
- if (docs == null) {
- //note: maybe should instead calc manually by segment, using
FixedBitSet.copyOf(segLiveDocs); avoid filter cache?
- liveDocs = docs = getDocSetBits(matchAllDocsQuery);
+ if (docs != null) {
+ matchAllDocsCacheHitCount.incrementAndGet();
+ } else {
+ if (matchAllDocsCacheComputationTracker.compareAndSet(Long.MIN_VALUE,
0)) {
+ // run the initial/only/final future inline
+ // This thread will block execution here and `liveDocsFuture.get()`
(below) should then return immediately
+ liveDocsFuture.run();
+ } else {
+ // another thread has already called `computeLiveDocs.run()`; this
thread will block on
+ // `liveDocsFuture.get()` (below)
+ if (matchAllDocsCacheComputationTracker.getAndIncrement() < 0) {
Review comment:
The Future + AtomicLong thing looks overly complicated; maybe I'm not
getting why it needs to be so complicated. Can we have a simple double-check
lock with synchronized to ensure only one creation happens? We've all seen
this pattern many times, I'm sure. We don't even need the liveDocs to be
volatile since it's an immutable object, not unlike a String, except for the
lazy populated size but that doesn't matter as it's idempotent.
##########
File path: solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java
##########
@@ -2281,6 +2455,11 @@ public void initializeMetrics(SolrMetricsContext
parentContext, String scope) {
parentContext.gauge(() -> openTime, true, "openedAt",
Category.SEARCHER.toString(), scope);
parentContext.gauge(() -> warmupTime, true, "warmupTime",
Category.SEARCHER.toString(), scope);
parentContext.gauge(() -> registerTime, true, "registeredAt",
Category.SEARCHER.toString(), scope);
+ parentContext.gauge(fullSortCount::get, true, "fullSortCount",
Category.SEARCHER.toString(), scope);
+ parentContext.gauge(skipSortCount::get, true, "skipSortCount",
Category.SEARCHER.toString(), scope);
+ parentContext.gauge(matchAllDocsCacheConsultationCount::get, true,
"matchAllDocsCacheConsultationCount", Category.SEARCHER.toString(), scope);
+ parentContext.gauge(matchAllDocsCacheHitCount::get, true,
"matchAllDocsCacheHitCount", Category.SEARCHER.toString(), scope);
+ parentContext.gauge(matchAllDocsCacheComputationTracker::get, true,
"matchAllDocsCacheComputationTracker", Category.SEARCHER.toString(), scope);
Review comment:
Is there value in publishing that "Tracker"? It doesn't even sound like
a metric.
If we want these metrics, perhaps we should instead have a one-element
Cache, which already publishes metrics and in a standard way across other
Caches. This is just an idea... if it's not easy then nevermind.
##########
File path: solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java
##########
@@ -854,8 +868,58 @@ private DocSet getAndCacheDocSet(Query query) throws
IOException {
return filterCache.computeIfAbsent(query, q -> getDocSetNC(q, null));
}
- private static Query matchAllDocsQuery = new MatchAllDocsQuery();
+ private static final Query MATCH_ALL_DOCS_QUERY = new MatchAllDocsQuery();
private volatile BitDocSet liveDocs;
+ private final FutureTask<BitDocSet> liveDocsFuture = new
FutureTask<>(this::computeLiveDocs);
+
+ private BitDocSet computeLiveDocs() {
+ switch (leafContexts.size()) {
+ case 0:
+ assert numDocs() == 0;
+ return new BitDocSet(DocSet.empty().getFixedBitSet(), 0);
Review comment:
This looks round-about. Just return `new FixedBitSet(0);`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]