Re: [PR] Optimize counts on two clause term disjunctions [lucene]

2024-02-01 Thread via GitHub
jpountz merged PR #13036: URL: https://github.com/apache/lucene/pull/13036 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

Re: [PR] Optimize counts on two clause term disjunctions [lucene]

2024-02-01 Thread via GitHub
jfreden commented on PR #13036: URL: https://github.com/apache/lucene/pull/13036#issuecomment-1920986116 Thank you @jpountz ! I've pushed changes to the tests, added the comment and also added an entry to `CHANGES.txt`.

Re: [PR] Optimize counts on two clause term disjunctions [lucene]

2024-02-01 Thread via GitHub
jpountz commented on code in PR #13036: URL: https://github.com/apache/lucene/pull/13036#discussion_r1474129857 ## lucene/core/src/test/org/apache/lucene/search/TestBooleanQuery.java: ## @@ -962,6 +962,118 @@ public void testDisjunctionMatchesCount() throws IOException {

Re: [PR] Optimize counts on two clause term disjunctions [lucene]

2024-02-01 Thread via GitHub
jfreden commented on code in PR #13036: URL: https://github.com/apache/lucene/pull/13036#discussion_r1474087852 ## lucene/core/src/test/org/apache/lucene/search/TestBooleanQuery.java: ## @@ -962,6 +962,46 @@ public void testDisjunctionMatchesCount() throws IOException {

Re: [PR] Optimize counts on two clause term disjunctions [lucene]

2024-01-31 Thread via GitHub
jpountz commented on code in PR #13036: URL: https://github.com/apache/lucene/pull/13036#discussion_r1473219731 ## lucene/core/src/test/org/apache/lucene/search/TestBooleanQuery.java: ## @@ -962,6 +962,46 @@ public void testDisjunctionMatchesCount() throws IOException {

Re: [PR] Optimize counts on two clause term disjunctions [lucene]

2024-01-30 Thread via GitHub
jpountz commented on PR #13036: URL: https://github.com/apache/lucene/pull/13036#issuecomment-1917595110 Thanks @jfreden, the heuristic looks sensible.

Re: [PR] Optimize counts on two clause term disjunctions [lucene]

2024-01-30 Thread via GitHub
jfreden commented on PR #13036: URL: https://github.com/apache/lucene/pull/13036#issuecomment-1916440041 I added code to apply the optimization only `if count(term-with-less-docs)/count(term-with-more-docs) < 0.1`, and it yielded much better results. Will investigate the term cache idea too
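
To make the ratio heuristic in the comment above concrete, here is a minimal, self-contained sketch. It is not the code from the PR and does not use Lucene's API. It assumes the optimization counts a two-term disjunction by inclusion-exclusion, `count(A OR B) = count(A) + count(B) - count(A AND B)`, and that postings can be modeled as sorted arrays of distinct doc IDs. The class name, method names, and the `MAX_RATIO` constant are hypothetical; only the `< 0.1` threshold comes from the comment.

```java
import java.util.Arrays;

// Hypothetical illustration only: plain sorted int arrays stand in for postings lists.
final class TwoClauseCountSketch {
  // Threshold quoted in the comment: rarer term count / more frequent term count < 0.1.
  static final double MAX_RATIO = 0.1;

  // Returns true when the rarer term is under a tenth as frequent as the other term.
  static boolean shouldUseInclusionExclusion(long docFreqA, long docFreqB) {
    long less = Math.min(docFreqA, docFreqB);
    long more = Math.max(docFreqA, docFreqB);
    return more > 0 && (double) less / more < MAX_RATIO;
  }

  // Counts docs present in both lists by probing the longer list for each doc of the shorter one.
  static int intersectionCount(int[] a, int[] b) {
    int[] shorter = a.length <= b.length ? a : b;
    int[] longer = a.length <= b.length ? b : a;
    int count = 0;
    for (int doc : shorter) {
      if (Arrays.binarySearch(longer, doc) >= 0) {
        count++;
      }
    }
    return count;
  }

  // Baseline: counts the union by walking both sorted lists in lockstep.
  static int unionCountByMerge(int[] a, int[] b) {
    int i = 0, j = 0, count = 0;
    while (i < a.length && j < b.length) {
      if (a[i] == b[j]) {
        i++;
        j++;
      } else if (a[i] < b[j]) {
        i++;
      } else {
        j++;
      }
      count++;
    }
    return count + (a.length - i) + (b.length - j);
  }

  // Uses inclusion-exclusion only when the heuristic says the conjunction is cheap enough.
  static int countDisjunction(int[] a, int[] b) {
    if (shouldUseInclusionExclusion(a.length, b.length)) {
      return a.length + b.length - intersectionCount(a, b);
    }
    return unionCountByMerge(a, b);
  }
}
```

Both paths return the same count; the heuristic only chooses which one runs. The intuition in this sketch is that when one term is much rarer, the intersection costs roughly one probe per doc of the rare term, whereas the merge has to visit every doc of both terms.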

Re: [PR] Optimize counts on two clause term disjunctions [lucene]

2024-01-29 Thread via GitHub
jfreden commented on PR #13036: URL: https://github.com/apache/lucene/pull/13036#issuecomment-1914370159 Thanks for the review! Great ideas! I will work on adding a simple heuristic and caching the `TermState`.

Re: [PR] Optimize counts on two clause term disjunctions [lucene]

2024-01-29 Thread via GitHub
jpountz commented on PR #13036: URL: https://github.com/apache/lucene/pull/13036#issuecomment-1914327764 This is a great speedup on `CountOrHighMed`! Too bad it's not faster all the time, though I'm not too surprised as conjunctions have more overhead than disjunctions when all clauses

Re: [PR] Optimize counts on two clause term disjunctions [lucene]

2024-01-29 Thread via GitHub
jfreden commented on PR #13036: URL: https://github.com/apache/lucene/pull/13036#issuecomment-1914220139 Output from luceneutil. **The count tasks:** ``` TaskQPS baseline StdDevQPS my_modified_version StdDevPct diff p-value

Re: [PR] Optimize counts on two clause term disjunctions [lucene]

2024-01-28 Thread via GitHub
jfreden commented on code in PR #13036: URL: https://github.com/apache/lucene/pull/13036#discussion_r1468906825 ## lucene/core/src/java/org/apache/lucene/search/BooleanWeight.java: ## @@ -249,10 +249,74 @@ BulkScorer optionalBulkScorer(LeafReaderContext context) throws

Re: [PR] Optimize counts on two clause term disjunctions [lucene]

2024-01-27 Thread via GitHub
jpountz commented on code in PR #13036: URL: https://github.com/apache/lucene/pull/13036#discussion_r1468428276 ## lucene/core/src/java/org/apache/lucene/search/BooleanWeight.java: ## @@ -249,10 +249,74 @@ BulkScorer optionalBulkScorer(LeafReaderContext context) throws