[PR] Remove concatenation in String.format() calls [lucene]
sabi0 opened a new pull request, #13038: URL: https://github.com/apache/lucene/pull/13038 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[PR] Make use of null-checked variable [lucene]
sabi0 opened a new pull request, #13040: URL: https://github.com/apache/lucene/pull/13040 (no comment)
Re: [PR] Cleanup TokenizedPhraseQueryNode code [lucene]
sabi0 commented on code in PR #13041: URL: https://github.com/apache/lucene/pull/13041#discussion_r1468923263

## lucene/queryparser/src/java/org/apache/lucene/queryparser/flexible/core/nodes/TokenizedPhraseQueryNode.java: ##
@@ -70,10 +70,8 @@ public QueryNode cloneTree() throws CloneNotSupportedException {
   @Override
   public CharSequence getField() {
     List children = getChildren();
-
-    if (children == null || children.size() == 0) {
+    if (children == null || children.isEmpty()) {
       return null;
-
     } else {
       return ((FieldableNode) children.get(0)).getField();

Review Comment:
According to the `setField(CharSequence)` method below, children are not necessarily `FieldableNode` instances, i.e. this line might throw a `ClassCastException`?
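The concern in the review comment above can be illustrated with a guarded cast. This is only a sketch under stated assumptions: `Node` and `FieldNode` are hypothetical stand-ins defined here (not Lucene's `QueryNode` and `FieldableNode`), and returning `null` for a child without a field is an assumed policy, not something the PR specifies.

```java
import java.util.List;

public class FieldLookupSketch {
  /** Hypothetical stand-in for a query node; not Lucene's actual interface. */
  public interface Node {}

  /** Hypothetical stand-in for a node that carries a field name. */
  public interface FieldNode extends Node {
    CharSequence getField();
  }

  /** Returns the first child's field, or null if there is no child or it has no field. */
  public static CharSequence firstChildField(List<? extends Node> children) {
    if (children == null || children.isEmpty()) {
      return null;
    }
    Node first = children.get(0);
    // Guard the cast so a child without a field yields null instead of a ClassCastException.
    return (first instanceof FieldNode) ? ((FieldNode) first).getField() : null;
  }

  public static void main(String[] args) {
    FieldNode titled = () -> "title";
    Node plain = new Node() {};
    System.out.println(firstChildField(List.of(titled))); // child that has a field
    System.out.println(firstChildField(List.of(plain))); // child without a field
  }
}
```

Whether `null` is the right answer for such a child, or whether the method should look for the first child that does carry a field, is a design question for the PR itself.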
Re: [PR] Optimize counts on two clause term disjunctions [lucene]
jfreden commented on code in PR #13036: URL: https://github.com/apache/lucene/pull/13036#discussion_r1468906825

## lucene/core/src/java/org/apache/lucene/search/BooleanWeight.java: ##
@@ -249,10 +249,74 @@ BulkScorer optionalBulkScorer(LeafReaderContext context) throws IOException {
       return optional.get(0);
     }
 
+    // Calculate count(clause1 OR clause2) as count(clause1) + count(clause2) - count(clause1 AND
+    // clause2)
+    if (scoreMode == ScoreMode.COMPLETE_NO_SCORES
+        && context.reader().hasDeletions() == false
+        && query.isTwoClauseDisjunctionWithTerms()) {
+      return twoClauseTermDisjunctionOptimizedScorer(context);
+    }
+
     return new BooleanScorer(
         this, optional, Math.max(1, query.getMinimumNumberShouldMatch()), scoreMode.needsScores());
   }
 
+  private BulkScorer twoClauseTermDisjunctionOptimizedScorer(LeafReaderContext context)
+      throws IOException {
+    List optionalScorers = new ArrayList<>();
+    final int[] clauseDocFreqSum = new int[1];
+    for (WeightedBooleanClause wc : weightedClauses) {
+      clauseDocFreqSum[0] += wc.weight.count(context);
+      ScorerSupplier scorerSupplier = wc.weight.scorerSupplier(context);
+      if (scorerSupplier != null) {
+        optionalScorers.add(scorerSupplier.get(Long.MAX_VALUE));
+      }
+    }
+
+    final ConjunctionBulkScorer conjunctionBulkScorer =
+        optionalScorers.size() == 2 ? new ConjunctionBulkScorer(List.of(), optionalScorers) : null;
+    return new BulkScorer() {
+      @Override
+      public int score(LeafCollector collector, Bits acceptDocs, int min, int max)
+          throws IOException {
+        final int[] intersectionScore = new int[1];
+        LeafCollector intersectionCollector =
+            new LeafCollector() {
+              @Override
+              public void setScorer(Scorable scorer) {}
+
+              @Override
+              public void collect(int doc) {
+                intersectionScore[0]++;
+              }
+
+              @Override
+              public void collect(DocIdStream stream) throws IOException {
+                intersectionScore[0] += stream.count();
+              }
+            };
+
+        int leadDocId = 0;
+        if (conjunctionBulkScorer != null) {
+          leadDocId = conjunctionBulkScorer.score(intersectionCollector, acceptDocs, min, max);
+        }
+
+        for (int i = 1; i <= clauseDocFreqSum[0] - intersectionScore[0]; i++) {
+          collector.collect(i);

Review Comment:
Thanks for looking at this! That helps a lot. I wasn't sure how to proceed, since I couldn't come up with a nice way to do this without either modifying `IndexSearcher#count` (I felt unsure about this since I couldn't find any similar optimizations in `IndexSearcher`) or breaking the contract with `LeafCollector` (which is what I ended up doing, but it only works for the count case, where the doc ids are discarded). I've pushed a change to do this in `IndexSearcher#count` instead.
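The counting identity named in the diff's comment, count(clause1 OR clause2) = count(clause1) + count(clause2) - count(clause1 AND clause2), can be checked in isolation. The sketch below is a toy illustration using `java.util.BitSet` as a stand-in for per-clause matching documents; it is not the code from the PR and does not use any Lucene API. Like the diff's `hasDeletions() == false` condition, it assumes there are no deleted documents to subtract.

```java
import java.util.BitSet;

public class DisjunctionCountSketch {
  /** Counts docs matching a OR b by inclusion-exclusion, without enumerating the union. */
  public static int countUnion(BitSet a, BitSet b) {
    BitSet both = (BitSet) a.clone();
    both.and(b); // docs matching both clauses
    return a.cardinality() + b.cardinality() - both.cardinality();
  }

  public static void main(String[] args) {
    BitSet clause1 = new BitSet();
    clause1.set(1);
    clause1.set(3);
    clause1.set(5);
    BitSet clause2 = new BitSet();
    clause2.set(3);
    clause2.set(4);
    // 3 + 2 - 1: doc 3 matches both clauses and must be counted once.
    System.out.println(countUnion(clause1, clause2)); // prints 4
  }
}
```

The appeal of this formulation is that only the intersection has to be iterated, while the per-clause counts can come from precomputed statistics.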
[PR] Cleanup TokenizedPhraseQueryNode code [lucene]
sabi0 opened a new pull request, #13041: URL: https://github.com/apache/lucene/pull/13041 (no comment)
[PR] Support getMaxScore of ConjunctionScorer for non top level scoring clause [lucene]
mrkm4ntr opened a new pull request, #13043: URL: https://github.com/apache/lucene/pull/13043

### Description

After the introduction of topLevelScoringClause, a ConjunctionScorer with multiple scorers can be used instead of BlockMaxConjunctionScorer for conjunctions that are not the top-level scoring clause, even if requiredScorers is empty. In that case, ConjunctionScorer returns Infinity as maxScore, which ruins some optimizations, such as those of a parent WANDScorer.

https://github.com/apache/lucene/blob/7d35ae485807147460f63ea58ae495124e972e13/lucene/core/src/java/org/apache/lucene/search/Boolean2ScorerSupplier.java#L218C77-L218C98
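Why an infinite maxScore hurts can be shown with a small sketch. It assumes, for illustration only, that a conjunction's score is the sum of its clauses' scores and that a parent scorer skips a block when the block's score upper bound is below the current competitive threshold. The class and method names are hypothetical and this is not the implementation in the PR.

```java
public class MaxScoreBoundSketch {
  /** Upper bound for a conjunction, assuming its score is the sum of its clauses' scores. */
  public static float sumOfMaxScores(float[] clauseMaxScores) {
    double sum = 0;
    for (float m : clauseMaxScores) {
      sum += m;
    }
    float narrowed = (float) sum;
    // Round up when narrowing to float so the narrowing itself does not lower the bound.
    return narrowed < sum ? Math.nextUp(narrowed) : narrowed;
  }

  /** Hypothetical pruning check: a block is skippable only if its bound is not competitive. */
  public static boolean canSkip(float maxScore, float minCompetitiveScore) {
    return maxScore < minCompetitiveScore;
  }

  public static void main(String[] args) {
    float threshold = 5f;
    // An infinite bound is never below the threshold, so nothing can be skipped.
    System.out.println(canSkip(Float.POSITIVE_INFINITY, threshold)); // prints false
    // A finite bound of 1.5 + 2.0 = 3.5 is below the threshold, so the block can be skipped.
    System.out.println(canSkip(sumOfMaxScores(new float[] {1.5f, 2.0f}), threshold)); // prints true
  }
}
```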
Re: [PR] LUCENE-4056: Japanese Tokenizer (Kuromoji) cannot build UniDic dictionary [lucene]
github-actions[bot] commented on PR #12517: URL: https://github.com/apache/lucene/pull/12517#issuecomment-1913774063 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contribution!
[PR] [LUCENE-13044][replicator] NRT add configurable commitData for Custom… [lucene]
dianjifzm opened a new pull request, #13045: URL: https://github.com/apache/lucene/pull/13045 … security verification

### Description

Allow commitData to be modified, so that a custom security mechanism for primary/replica synchronization can be implemented.
Re: [PR] [Fix] Binary search the entries when all suffixes have the same length in a leaf block. [lucene]
vsop-479 commented on PR #11888: URL: https://github.com/apache/lucene/pull/11888#issuecomment-1914149013 @jpountz @mikemccand I resolved the conflicts, and moved the test case for target greater than the last entry of matched block from `TestLucene90PostingsFormat` to `TestLucene99PostingsFormat`. Please take a look when you get a chance!
Re: [PR] Propagate topLevelScoringClause from QueryProfiler [lucene]
mrkm4ntr commented on PR #13031: URL: https://github.com/apache/lucene/pull/13031#issuecomment-1914149588 @jpountz Sure. Added.
Re: [PR] Propagate topLevelScoringClause from QueryProfiler [lucene]
jpountz commented on PR #13031: URL: https://github.com/apache/lucene/pull/13031#issuecomment-1914131295 @mrkm4ntr 9.10 please