[GitHub] [lucene] zacharymorn commented on pull request #101: LUCENE-9335: [Discussion Only] Add BMM scorer and use it for pure disjunction term query

2021-05-10 Thread GitBox
zacharymorn commented on pull request #101: URL: https://github.com/apache/lucene/pull/101#issuecomment-837880326 > Thanks @mikemccand and @jpountz for the uploads! > > > The nightly benchmarks use the binary form of wikibigall, to reduce thread bottleneck when reading/parsing

[GitHub] [lucene] zacharymorn commented on pull request #101: LUCENE-9335: [Discussion Only] Add BMM scorer and use it for pure disjunction term query

2021-05-10 Thread GitBox
zacharymorn commented on pull request #101: URL: https://github.com/apache/lucene/pull/101#issuecomment-837864536 Thanks @mikemccand and @jpountz for the uploads! > The nightly benchmarks use the binary form of wikibigall, to reduce thread bottleneck when reading/parsing documents

[GitHub] [lucene] zacharymorn commented on pull request #113: LUCENE-9335: [Discussion Only] Implement BMM with BulkScorer interface

2021-05-10 Thread GitBox
zacharymorn commented on pull request #113: URL: https://github.com/apache/lucene/pull/113#issuecomment-837753531 I've also tried out smaller window sizes in the latest 2 commits (benchmark results are in the git commit messages), and it appears that a window size of 1024 might have better

[GitHub] [lucene] zacharymorn commented on a change in pull request #113: LUCENE-9335: [Discussion Only] Implement BMM with BulkScorer interface

2021-05-10 Thread GitBox
zacharymorn commented on a change in pull request #113: URL: https://github.com/apache/lucene/pull/113#discussion_r629825600 ## File path: lucene/core/src/java/org/apache/lucene/search/BMMBulkScorer.java ## @@ -0,0 +1,317 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [lucene] zacharymorn commented on a change in pull request #113: LUCENE-9335: [Discussion Only] Implement BMM with BulkScorer interface

2021-05-10 Thread GitBox
zacharymorn commented on a change in pull request #113: URL: https://github.com/apache/lucene/pull/113#discussion_r629794667 ## File path: lucene/core/src/java/org/apache/lucene/search/BMMBulkScorer.java ## @@ -0,0 +1,317 @@ +/* + * Licensed to the Apache Software Foundation

[jira] [Updated] (LUCENE-9952) FacetResult#value can be inaccurate in SortedSetDocValueFacetCounts

2021-05-10 Thread Greg Miller (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Miller updated LUCENE-9952: Affects Version/s: (was: 8.9) main (9.0) > FacetResult#value can be

[jira] [Updated] (LUCENE-9952) FacetResult#value can be inaccurate in SortedSetDocValueFacetCounts

2021-05-10 Thread Greg Miller (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Miller updated LUCENE-9952: Description: As described in a dev@ list

[jira] [Commented] (LUCENE-9953) FacetResult#value is inaccurate in LongValueFacetCounts for multi-value docs

2021-05-10 Thread Greg Miller (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17342119#comment-17342119 ] Greg Miller commented on LUCENE-9953: - Created two PRs: one against {{branch_8x}} (in the older

[jira] [Updated] (LUCENE-9952) FacetResult#value can be inaccurate in SortedSetDocValueFacetCounts

2021-05-10 Thread Greg Miller (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Miller updated LUCENE-9952: Summary: FacetResult#value can be inaccurate in SortedSetDocValueFacetCounts (was:

[GitHub] [lucene-solr] gsmiller commented on pull request #2491: LUCENE-9953: Make FacetResult#value accurate for LongValueFacetCounts

2021-05-10 Thread GitBox
gsmiller commented on pull request #2491: URL: https://github.com/apache/lucene-solr/pull/2491#issuecomment-837256853 I've also created a pull request against the new repo here: https://github.com/apache/lucene/pull/131 -- This is an automated message from the Apache Git Service. To

[GitHub] [lucene] gsmiller opened a new pull request #131: LUCENE-9953: Make FacetResult#value accurate for LongValueFacetCounts

2021-05-10 Thread GitBox
gsmiller opened a new pull request #131: URL: https://github.com/apache/lucene/pull/131 # Description `LongValueFacetCounts` may produce inaccurate counts for `FacetResult#value` in cases where docs are multi-valued. This fixes the bug. Note that I'm proposing this change be

[GitHub] [lucene-solr] gsmiller opened a new pull request #2491: LUCENE-9953: Make FacetResult#value accurate for LongValueFacetCounts

2021-05-10 Thread GitBox
gsmiller opened a new pull request #2491: URL: https://github.com/apache/lucene-solr/pull/2491 LongValueFacetCounts is not populating FacetResult#value correctly for cases where a doc is multi-valued. This addresses the bug.

[jira] [Commented] (LUCENE-9952) FacetResult#value should consistently report doc count, not field count

2021-05-10 Thread Greg Miller (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17342079#comment-17342079 ] Greg Miller commented on LUCENE-9952: - I've spun off LUCENE-9953 to track just the fix for

[jira] [Created] (LUCENE-9953) FacetResult#value is inaccurate in LongValueFacetCounts for multi-value docs

2021-05-10 Thread Greg Miller (Jira)
Greg Miller created LUCENE-9953: --- Summary: FacetResult#value is inaccurate in LongValueFacetCounts for multi-value docs Key: LUCENE-9953 URL: https://issues.apache.org/jira/browse/LUCENE-9953 Project:

[jira] [Commented] (LUCENE-9952) FacetResult#value should consistently report doc count, not field count

2021-05-10 Thread Greg Miller (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17342068#comment-17342068 ] Greg Miller commented on LUCENE-9952: - Hmm, this is actually a bit trickier for

[GitHub] [lucene] jpountz commented on a change in pull request #91: LUCENE-9932: Performance improvement for BKD index building

2021-05-10 Thread GitBox
jpountz commented on a change in pull request #91: URL: https://github.com/apache/lucene/pull/91#discussion_r629535798 ## File path: lucene/core/src/java/org/apache/lucene/util/StableMSBRadixSorter.java ## @@ -0,0 +1,60 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [lucene] jpountz commented on a change in pull request #91: LUCENE-9932: Performance improvement for BKD index building

2021-05-10 Thread GitBox
jpountz commented on a change in pull request #91: URL: https://github.com/apache/lucene/pull/91#discussion_r629526267 ## File path: lucene/core/src/java/org/apache/lucene/util/StableMSBRadixSorter.java ## @@ -0,0 +1,60 @@ +/* + * Licensed to the Apache Software Foundation

[jira] [Commented] (LUCENE-9952) FacetResult#value should consistently report doc count, not field count

2021-05-10 Thread Greg Miller (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341997#comment-17341997 ] Greg Miller commented on LUCENE-9952: - I've made this change locally and just need to update unit

[jira] [Created] (LUCENE-9952) FacetResult#value should consistently report doc count, not field count

2021-05-10 Thread Greg Miller (Jira)
Greg Miller created LUCENE-9952: --- Summary: FacetResult#value should consistently report doc count, not field count Key: LUCENE-9952 URL: https://issues.apache.org/jira/browse/LUCENE-9952 Project:

[GitHub] [lucene] neoremind commented on a change in pull request #91: LUCENE-9932: Performance improvement for BKD index building

2021-05-10 Thread GitBox
neoremind commented on a change in pull request #91: URL: https://github.com/apache/lucene/pull/91#discussion_r629484180 ## File path: lucene/core/src/java/org/apache/lucene/util/bkd/MutablePointsReaderUtils.java ## @@ -35,63 +37,60 @@ MutablePointsReaderUtils() {} -

[GitHub] [lucene] neoremind commented on a change in pull request #91: LUCENE-9932: Performance improvement for BKD index building

2021-05-10 Thread GitBox
neoremind commented on a change in pull request #91: URL: https://github.com/apache/lucene/pull/91#discussion_r629482757 ## File path: lucene/core/src/java/org/apache/lucene/codecs/MutablePointValues.java ## @@ -41,4 +41,10 @@ protected MutablePointValues() {} /** Swap

[jira] [Commented] (LUCENE-9836) Fix 8.x Maven Validation and publication to work with Maven Central and HTTPS again; remove pure Maven build (did not work anymore)

2021-05-10 Thread David Smiley (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341985#comment-17341985 ] David Smiley commented on LUCENE-9836: -- Is there any chance this is related to the smokerelease

[GitHub] [lucene] jpountz commented on pull request #101: LUCENE-9335: [Discussion Only] Add BMM scorer and use it for pure disjunction term query

2021-05-10 Thread GitBox
jpountz commented on pull request #101: URL: https://github.com/apache/lucene/pull/101#issuecomment-836880288 Wonderful, thanks @mikemccand !

[GitHub] [lucene] mikemccand commented on pull request #101: LUCENE-9335: [Discussion Only] Add BMM scorer and use it for pure disjunction term query

2021-05-10 Thread GitBox
mikemccand commented on pull request #101: URL: https://github.com/apache/lucene/pull/101#issuecomment-836879100 > FWIW the file I have locally is enwiki-20130102-lines.txt, not the enwiki-20100302-pages-articles-lines.txt file that luceneutil refers to. Aha! I have that one

[GitHub] [lucene] neoremind commented on a change in pull request #91: LUCENE-9932: Performance improvement for BKD index building

2021-05-10 Thread GitBox
neoremind commented on a change in pull request #91: URL: https://github.com/apache/lucene/pull/91#discussion_r629480227 ## File path: lucene/core/src/java/org/apache/lucene/util/StableMSBRadixSorter.java ## @@ -0,0 +1,60 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [lucene] jpountz commented on pull request #101: LUCENE-9335: [Discussion Only] Add BMM scorer and use it for pure disjunction term query

2021-05-10 Thread GitBox
jpountz commented on pull request #101: URL: https://github.com/apache/lucene/pull/101#issuecomment-836811596 @mikemccand I'll try to do it overnight as I have a terrible uplink. FWIW the file I have locally is `enwiki-20130102-lines.txt`, not the

[GitHub] [lucene] mikemccand commented on pull request #101: LUCENE-9335: [Discussion Only] Add BMM scorer and use it for pure disjunction term query

2021-05-10 Thread GitBox
mikemccand commented on pull request #101: URL: https://github.com/apache/lucene/pull/101#issuecomment-836788286 > > I also tried to run wikibigall as well, which seems to require enwiki-20100302-pages-articles-lines.txt but it's not downloaded by the util. It appears the archive should

[GitHub] [lucene] jpountz commented on a change in pull request #91: LUCENE-9932: Performance improvement for BKD index building

2021-05-10 Thread GitBox
jpountz commented on a change in pull request #91: URL: https://github.com/apache/lucene/pull/91#discussion_r629124298 ## File path: lucene/core/src/java/org/apache/lucene/codecs/MutablePointValues.java ## @@ -41,4 +41,10 @@ protected MutablePointValues() {} /** Swap the

[GitHub] [lucene] jpountz commented on a change in pull request #91: LUCENE-9932: Performance improvement for BKD index building

2021-05-10 Thread GitBox
jpountz commented on a change in pull request #91: URL: https://github.com/apache/lucene/pull/91#discussion_r629121558 ## File path: lucene/core/src/java/org/apache/lucene/util/StableMSBRadixSorter.java ## @@ -0,0 +1,60 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [lucene] jpountz commented on a change in pull request #113: LUCENE-9335: [Discussion Only] Implement BMM with BulkScorer interface

2021-05-10 Thread GitBox
jpountz commented on a change in pull request #113: URL: https://github.com/apache/lucene/pull/113#discussion_r629112286 ## File path: lucene/core/src/java/org/apache/lucene/search/BMMBulkScorer.java ## @@ -0,0 +1,317 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] [lucene] jpountz commented on pull request #101: LUCENE-9335: [Discussion Only] Add BMM scorer and use it for pure disjunction term query

2021-05-10 Thread GitBox
jpountz commented on pull request #101: URL: https://github.com/apache/lucene/pull/101#issuecomment-836296845 > I cherry-picked your commit and pushed to this branch / PR to further explore the changes and their effect, hope that's ok. Of course! > I also tried to run