[GitHub] [lucene-solr] jimczi commented on a change in pull request #823: LUCENE-8939: Introduce Shared Count Early Termination In Parallel Search

2019-08-29 Thread GitBox
jimczi commented on a change in pull request #823: LUCENE-8939: Introduce 
Shared Count Early Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#discussion_r319066704
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/HitCountThresholdCheckerFactory.java
 ##
 @@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.search;
+
+import java.util.concurrent.atomic.AtomicInteger;
+
+/**
+ * Factory which generates instances of GlobalHitsThresholdChecker and 
LocalHitsThresholdChecker
+ * with the threshold as the passed in limit
+ */
+public final class HitCountThresholdCheckerFactory {
 
 Review comment:
   I agree for the interface but the static functions to help build simple 
checker could be directly implemented in `HitsThresholdChecker` ?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] jimczi commented on a change in pull request #823: LUCENE-8939: Introduce Shared Count Early Termination In Parallel Search

2019-08-29 Thread GitBox
jimczi commented on a change in pull request #823: LUCENE-8939: Introduce 
Shared Count Early Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#discussion_r319063786
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/HitCountThresholdCheckerFactory.java
 ##
 @@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.search;
+
+import java.util.concurrent.atomic.AtomicInteger;
+
+/**
+ * Factory which generates instances of GlobalHitsThresholdChecker and 
LocalHitsThresholdChecker
+ * with the threshold as the passed in limit
+ */
+public final class HitCountThresholdCheckerFactory {
 
 Review comment:
   Since we return a `HitsThresholdChecker` we could also make it public and 
add the private impl and the static functions creation there. We don't really 
need this factory class ?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] jimczi commented on a change in pull request #823: LUCENE-8939: Introduce Shared Count Early Termination In Parallel Search

2019-08-29 Thread GitBox
jimczi commented on a change in pull request #823: LUCENE-8939: Introduce 
Shared Count Early Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#discussion_r319030989
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java
 ##
 @@ -469,7 +469,7 @@ public TopDocs searchAfter(ScoreDoc after, Query query, 
int numHits) throws IOEx
 
   @Override
   public TopScoreDocCollector newCollector() throws IOException {
-return TopScoreDocCollector.create(cappedNumHits, after, 
TOTAL_HITS_THRESHOLD);
+return TopScoreDocCollector.create(cappedNumHits, after, new 
GlobalHitsThresholdChecker(TOTAL_HITS_THRESHOLD));
 
 Review comment:
   The `HitsThresholdChecker` should be created once and shared within the 
collectors ? We also don't need to use the `GlobalHitsThresholdChecker` if the 
executor is null or if there is a single slice.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] jimczi commented on a change in pull request #823: LUCENE-8939: Introduce Shared Count Early Termination In Parallel Search

2019-08-29 Thread GitBox
jimczi commented on a change in pull request #823: LUCENE-8939: Introduce 
Shared Count Early Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#discussion_r319031039
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java
 ##
 @@ -598,7 +598,7 @@ private TopFieldDocs searchAfter(FieldDoc after, Query 
query, int numHits, Sort
   @Override
   public TopFieldCollector newCollector() throws IOException {
 // TODO: don't pay the price for accurate hit counts by default
-return TopFieldCollector.create(rewrittenSort, cappedNumHits, after, 
TOTAL_HITS_THRESHOLD);
+return TopFieldCollector.create(rewrittenSort, cappedNumHits, after, 
new GlobalHitsThresholdChecker(TOTAL_HITS_THRESHOLD));
 
 Review comment:
   same here


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] jimczi commented on a change in pull request #823: LUCENE-8939: Introduce Shared Count Early Termination In Parallel Search

2019-08-29 Thread GitBox
jimczi commented on a change in pull request #823: LUCENE-8939: Introduce 
Shared Count Early Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#discussion_r319030989
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java
 ##
 @@ -469,7 +469,7 @@ public TopDocs searchAfter(ScoreDoc after, Query query, 
int numHits) throws IOEx
 
   @Override
   public TopScoreDocCollector newCollector() throws IOException {
-return TopScoreDocCollector.create(cappedNumHits, after, 
TOTAL_HITS_THRESHOLD);
+return TopScoreDocCollector.create(cappedNumHits, after, new 
GlobalHitsThresholdChecker(TOTAL_HITS_THRESHOLD));
 
 Review comment:
   It should be created once and shared within the collectors ? We also don't 
need to use the `GlobalHitsThresholdChecker` if the executor is null or if 
there is a single slice.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] jimczi commented on a change in pull request #823: LUCENE-8939: Introduce Shared Count Early Termination In Parallel Search

2019-08-28 Thread GitBox
jimczi commented on a change in pull request #823: LUCENE-8939: Introduce 
Shared Count Early Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#discussion_r318559088
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/TopFieldCollector.java
 ##
 @@ -282,6 +293,7 @@ public void collect(int doc) throws IOException {
   private static final ScoreDoc[] EMPTY_SCOREDOCS = new ScoreDoc[0];
 
   final int numHits;
+  final HitsThresholdChecker hitsThresholdChecker;
   final int totalHitsThreshold;
 
 Review comment:
   This is unused so we can remove it ?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] jimczi commented on a change in pull request #823: LUCENE-8939: Introduce Shared Count Early Termination In Parallel Search

2019-08-28 Thread GitBox
jimczi commented on a change in pull request #823: LUCENE-8939: Introduce 
Shared Count Early Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#discussion_r318562876
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/TopScoreDocCollector.java
 ##
 @@ -207,16 +210,23 @@ public static TopScoreDocCollector create(int numHits, 
ScoreDoc after, int total
 }
   }
 
-  final int totalHitsThreshold;
   ScoreDoc pqTop;
+  final HitsThresholdChecker hitsThresholdChecker;
 
   // prevents instantiation
-  TopScoreDocCollector(int numHits, int totalHitsThreshold) {
+  TopScoreDocCollector(int numHits, int totalHitsThreshold, 
HitsThresholdChecker hitsThresholdChecker) {
 
 Review comment:
   `totalHitsThreshold` is not needed


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] jimczi commented on a change in pull request #823: LUCENE-8939: Introduce Shared Count Early Termination In Parallel Search

2019-08-28 Thread GitBox
jimczi commented on a change in pull request #823: LUCENE-8939: Introduce 
Shared Count Early Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#discussion_r318563308
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/TopScoreDocCollector.java
 ##
 @@ -207,16 +210,23 @@ public static TopScoreDocCollector create(int numHits, 
ScoreDoc after, int total
 }
   }
 
-  final int totalHitsThreshold;
   ScoreDoc pqTop;
+  final HitsThresholdChecker hitsThresholdChecker;
 
   // prevents instantiation
-  TopScoreDocCollector(int numHits, int totalHitsThreshold) {
+  TopScoreDocCollector(int numHits, int totalHitsThreshold, 
HitsThresholdChecker hitsThresholdChecker) {
 super(new HitQueue(numHits, true));
-this.totalHitsThreshold = totalHitsThreshold;
+assert hitsThresholdChecker != null;
+
 // HitQueue implements getSentinelObject to return a ScoreDoc, so we know
 // that at this point top() is already initialized.
 pqTop = pq.top();
+this.hitsThresholdChecker = hitsThresholdChecker;
+  }
+
+  // Same as above but uses a local hits checker for hits threshold checks
+  TopScoreDocCollector(int numHits, int totalHitsThreshold) {
 
 Review comment:
   Same as `TopFieldCollector`, we should have a single constructor that takes 
a `HitsThresholdChecker` and build the `totalHitsThreshold` logic in the static 
function.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] jimczi commented on a change in pull request #823: LUCENE-8939: Introduce Shared Count Early Termination In Parallel Search

2019-08-28 Thread GitBox
jimczi commented on a change in pull request #823: LUCENE-8939: Introduce 
Shared Count Early Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#discussion_r318562576
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/TopFieldCollector.java
 ##
 @@ -414,6 +428,45 @@ public static TopFieldCollector create(Sort sort, int 
numHits, FieldDoc after,
 }
   }
 
+  /**
+   * Same as above with an additional parameter to allow passing in the 
threshold checker
+   */
+  public static TopFieldCollector create(Sort sort, int numHits, FieldDoc 
after,
+ int totalHitsThreshold, 
HitsThresholdChecker hitsThresholdChecker) {
 
 Review comment:
   We don't need the `totalHitsThreshold` here, is it possible to move the 
argument validation (totalHitsThreshold < 0) in the HitsThresholdChecker ctr ?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] jimczi commented on a change in pull request #823: LUCENE-8939: Introduce Shared Count Early Termination In Parallel Search

2019-08-28 Thread GitBox
jimczi commented on a change in pull request #823: LUCENE-8939: Introduce 
Shared Count Early Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#discussion_r318559370
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/TopFieldCollector.java
 ##
 @@ -297,10 +309,12 @@ public void collect(int doc) throws IOException {
   // internal versions. If someone will define a constructor with any other
   // visibility, then anyone will be able to extend the class, which is not 
what
   // we want.
-  private TopFieldCollector(FieldValueHitQueue pq, int numHits, int 
totalHitsThreshold, boolean needsScores) {
+  private TopFieldCollector(FieldValueHitQueue pq, int numHits, int 
totalHitsThreshold,
 
 Review comment:
   We don't need the `totalHitsThreshold` if we have the `hitsThresholdChecker` 
below


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] jimczi commented on a change in pull request #823: LUCENE-8939: Introduce Shared Count Early Termination In Parallel Search

2019-08-28 Thread GitBox
jimczi commented on a change in pull request #823: LUCENE-8939: Introduce 
Shared Count Early Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#discussion_r318562103
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/TopFieldCollector.java
 ##
 @@ -200,6 +204,12 @@ public PagingFieldCollector(Sort sort, 
FieldValueHitQueue queue, FieldDoc
   }
 }
 
+// Default case -- single threaded execution
+public PagingFieldCollector(Sort sort, FieldValueHitQueue queue, 
FieldDoc after, int numHits,
+int totalHitsThreshold) {
 
 Review comment:
   We should have a single ctr


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] jimczi commented on a change in pull request #823: LUCENE-8939: Introduce Shared Count Early Termination In Parallel Search

2019-08-28 Thread GitBox
jimczi commented on a change in pull request #823: LUCENE-8939: Introduce 
Shared Count Early Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#discussion_r318561896
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/TopFieldCollector.java
 ##
 @@ -96,16 +96,21 @@ private static boolean canEarlyTerminateOnPrefix(Sort 
searchSort, Sort indexSort
* document scores and maxScore.
*/
   private static class SimpleFieldCollector extends TopFieldCollector {
-
 final Sort sort;
 final FieldValueHitQueue queue;
 
-public SimpleFieldCollector(Sort sort, FieldValueHitQueue queue, 
int numHits, int totalHitsThreshold) {
-  super(queue, numHits, totalHitsThreshold, sort.needsScores());
+public SimpleFieldCollector(Sort sort, FieldValueHitQueue queue, 
int numHits, int totalHitsThreshold,
+HitsThresholdChecker hitsThresholdChecker) {
+  super(queue, numHits, totalHitsThreshold, hitsThresholdChecker, 
sort.needsScores());
   this.sort = sort;
   this.queue = queue;
 }
 
+/* Default case -- Single threaded execution */
+public SimpleFieldCollector(Sort sort, FieldValueHitQueue queue, 
int numHits, int totalHitsThreshold) {
 
 Review comment:
   Could we replace this ctr with a static function that calls 
`TopFieldCollector#create` with the appropriate `LocalHitsThresholdChecker`? 
This way we don't need to reimplement the logic to create 
`SimpleFieldCollector` or `PagingCollector` twice, e.g.:
   
   public static TopFieldCollector create(Sort sort, int numHits, FieldDoc 
after, int totalHitsThreshold) {
   if (totalHitsThreshold < 0) {
 throw new IllegalArgumentException("totalHitsThreshold must be >= 0, 
got " + totalHitsThreshold);
   }
   return create(sort, numHits, new 
LocalHitsThresholdChecker(totalHitsThreshold));
   } 
   `
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org