[GitHub] [lucene] LuXugang commented on pull request #865: LUCENE-10502: Use IndexedDISI to store docIds and DirectMonotonicWriter/Reader to handle ordToDoc

2022-05-04 Thread GitBox
LuXugang commented on PR #865: URL: https://github.com/apache/lucene/pull/865#issuecomment-1118189707 > I was thinking we can adopt the following workflow for this work: OK @mayya-sharipova, I will make the new format changes later, after `apache:vectors-disi-direct` is updated. -- This

[GitHub] [lucene] mocobeta commented on a diff in pull request #833: LUCENE-10411: Add NN vectors support to ExitableDirectoryReader

2022-05-04 Thread GitBox
mocobeta commented on code in PR #833: URL: https://github.com/apache/lucene/pull/833#discussion_r865566084 ## lucene/CHANGES.txt: ## @@ -128,6 +128,9 @@ Optimizations * LUCENE-8836: Speed up calls to TermsEnum#lookupOrd on doc values terms enums and sequences of increasing

[GitHub] [lucene] LuXugang closed pull request #865: LUCENE-10502: Use IndexedDISI to store docIds and DirectMonotonicWriter/Reader to handle ordToDoc

2022-05-04 Thread GitBox
LuXugang closed pull request #865: LUCENE-10502: Use IndexedDISI to store docIds and DirectMonotonicWriter/Reader to handle ordToDoc URL: https://github.com/apache/lucene/pull/865 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [lucene] LuXugang commented on a diff in pull request #792: LUCENE-10502: Use IndexedDISI to store docIds and DirectMonotonicWriter/Reader to handle ordToDoc

2022-05-04 Thread GitBox
LuXugang commented on code in PR #792: URL: https://github.com/apache/lucene/pull/792#discussion_r865561221 ## lucene/core/src/java/org/apache/lucene/codecs/lucene91/Lucene91HnswVectorsReader.java: ## @@ -320,13 +323,19 @@ private static class FieldEntry { final int

[jira] [Comment Edited] (LUCENE-10558) Expose IOSupplier constructors in Kuromoji (and Nori?)

2022-05-04 Thread Tomoko Uchida (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531987#comment-17531987 ] Tomoko Uchida edited comment on LUCENE-10558 at 5/5/22 5:10 AM: Or, I'm

[jira] [Comment Edited] (LUCENE-10558) Expose IOSupplier constructors in Kuromoji (and Nori?)

2022-05-04 Thread Tomoko Uchida (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531987#comment-17531987 ] Tomoko Uchida edited comment on LUCENE-10558 at 5/5/22 4:18 AM: Or, I'm

[GitHub] [lucene] LuXugang commented on a diff in pull request #792: LUCENE-10502: Use IndexedDISI to store docIds and DirectMonotonicWriter/Reader to handle ordToDoc

2022-05-04 Thread GitBox
LuXugang commented on code in PR #792: URL: https://github.com/apache/lucene/pull/792#discussion_r865516124 ## lucene/core/src/java/org/apache/lucene/codecs/lucene91/Lucene91HnswVectorsReader.java: ## @@ -388,115 +400,239 @@ private static class FieldEntry { int size() {

[jira] [Comment Edited] (LUCENE-10397) KnnVectorQuery doesn't tie break by doc ID

2022-05-04 Thread Lu Xugang (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531997#comment-17531997 ] Lu Xugang edited comment on LUCENE-10397 at 5/5/22 1:54 AM: Hi

[jira] [Commented] (LUCENE-10397) KnnVectorQuery doesn't tie break by doc ID

2022-05-04 Thread Lu Xugang (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531997#comment-17531997 ] Lu Xugang commented on LUCENE-10397: Hi [~msoko...@gmail.com] Could we go back to using PriorityQueue,

[GitHub] [lucene] LuXugang commented on a diff in pull request #792: LUCENE-10502: Use IndexedDISI to store docIds and DirectMonotonicWriter/Reader to handle ordToDoc

2022-05-04 Thread GitBox
LuXugang commented on code in PR #792: URL: https://github.com/apache/lucene/pull/792#discussion_r865507439 ## lucene/core/src/java/org/apache/lucene/codecs/lucene91/Lucene91HnswVectorsReader.java: ## @@ -258,14 +257,20 @@ public TopDocs search(String field, float[] target, int

[GitHub] [lucene] mocobeta commented on pull request #867: LUCENE-10558: expose stream-based Kuromoji resource constructors

2022-05-04 Thread GitBox
mocobeta commented on PR #867: URL: https://github.com/apache/lucene/pull/867#issuecomment-1118087968 I think the same change would be needed for Nori; I don't know the use cases, but it would be good for completeness.

[GitHub] [lucene] mocobeta merged pull request #866: Make CONTRIBUTING.md a bit more succinct

2022-05-04 Thread GitBox
mocobeta merged PR #866: URL: https://github.com/apache/lucene/pull/866

[GitHub] [lucene] mocobeta commented on pull request #866: Make CONTRIBUTING.md a bit more succinct

2022-05-04 Thread GitBox
mocobeta commented on PR #866: URL: https://github.com/apache/lucene/pull/866#issuecomment-1118084362 @mikemccand thanks for taking a look!

[GitHub] [lucene] jtibshirani commented on a diff in pull request #792: LUCENE-10502: Use IndexedDISI to store docIds and DirectMonotonicWriter/Reader to handle ordToDoc

2022-05-04 Thread GitBox
jtibshirani commented on code in PR #792: URL: https://github.com/apache/lucene/pull/792#discussion_r865503009 ## lucene/core/src/java/org/apache/lucene/codecs/lucene91/Lucene91HnswVectorsReader.java: ## @@ -320,13 +323,19 @@ private static class FieldEntry { final int

[GitHub] [lucene] mikemccand commented on pull request #633: LUCENE-10216: Use MergeScheduler and MergePolicy to run addIndexes(CodecReader[]) merges.

2022-05-04 Thread GitBox
mikemccand commented on PR #633: URL: https://github.com/apache/lucene/pull/633#issuecomment-1118081084 Thanks @vigyasharma -- I'll review again soon.

[GitHub] [lucene] LuXugang commented on a diff in pull request #792: LUCENE-10502: Use IndexedDISI to store docIds and DirectMonotonicWriter/Reader to handle ordToDoc

2022-05-04 Thread GitBox
LuXugang commented on code in PR #792: URL: https://github.com/apache/lucene/pull/792#discussion_r865502821 ## lucene/core/src/java/org/apache/lucene/codecs/lucene91/Lucene91HnswVectorsReader.java: ## @@ -258,14 +257,20 @@ public TopDocs search(String field, float[] target, int

[GitHub] [lucene] LuXugang commented on a diff in pull request #792: LUCENE-10502: Use IndexedDISI to store docIds and DirectMonotonicWriter/Reader to handle ordToDoc

2022-05-04 Thread GitBox
LuXugang commented on code in PR #792: URL: https://github.com/apache/lucene/pull/792#discussion_r865502567 ## lucene/core/src/java/org/apache/lucene/codecs/lucene91/Lucene91HnswVectorsReader.java: ## @@ -258,14 +257,20 @@ public TopDocs search(String field, float[] target, int

[GitHub] [lucene] LuXugang commented on a diff in pull request #792: LUCENE-10502: Use IndexedDISI to store docIds and DirectMonotonicWriter/Reader to handle ordToDoc

2022-05-04 Thread GitBox
LuXugang commented on code in PR #792: URL: https://github.com/apache/lucene/pull/792#discussion_r865502085 ## lucene/core/src/java/org/apache/lucene/codecs/lucene91/Lucene91HnswVectorsReader.java: ## @@ -388,115 +400,239 @@ private static class FieldEntry { int size() {

[GitHub] [lucene] LuXugang commented on a diff in pull request #792: LUCENE-10502: Use IndexedDISI to store docIds and DirectMonotonicWriter/Reader to handle ordToDoc

2022-05-04 Thread GitBox
LuXugang commented on code in PR #792: URL: https://github.com/apache/lucene/pull/792#discussion_r865501863 ## lucene/core/src/java/org/apache/lucene/codecs/lucene91/Lucene91HnswVectorsReader.java: ## @@ -388,115 +400,239 @@ private static class FieldEntry { int size() {

[GitHub] [lucene] mocobeta commented on a diff in pull request #867: LUCENE-10558: expose stream-based Kuromoji resource constructors

2022-05-04 Thread GitBox
mocobeta commented on code in PR #867: URL: https://github.com/apache/lucene/pull/867#discussion_r865500952 ## lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/dict/UnknownDictionary.java: ## @@ -52,18 +52,26 @@ private UnknownDictionary() throws IOException {

[jira] [Commented] (LUCENE-10558) Expose IOSupplier constructors in Kuromoji (and Nori?)

2022-05-04 Thread Tomoko Uchida (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531987#comment-17531987 ] Tomoko Uchida commented on LUCENE-10558: Or, I'm also fine with keeping backward compatibility

[jira] [Commented] (LUCENE-10558) Expose IOSupplier constructors in Kuromoji (and Nori?)

2022-05-04 Thread Tomoko Uchida (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531985#comment-17531985 ] Tomoko Uchida commented on LUCENE-10558: Yes, the refactored constructor ignores the {{path}}

[jira] [Commented] (LUCENE-10509) Performance degraded after upgrade from 8.8.2 to 8.9.0

2022-05-04 Thread Rishabh Kumar Maurya (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531981#comment-17531981 ] Rishabh Kumar Maurya commented on LUCENE-10509: --- {quote}I suspect that this is due to 

[GitHub] [lucene] jtibshirani commented on pull request #796: LUCENE-10504: KnnGraphTester to use KnnVectorQuery

2022-05-04 Thread GitBox
jtibshirani commented on PR #796: URL: https://github.com/apache/lucene/pull/796#issuecomment-1118021852 Super interesting, looking forward to hearing more! I do hope we can stick with a prefiltering-like approach (and just improve its performance), since it feels easier to work with for

[GitHub] [lucene] msokolov commented on pull request #796: LUCENE-10504: KnnGraphTester to use KnnVectorQuery

2022-05-04 Thread GitBox
msokolov commented on PR #796: URL: https://github.com/apache/lucene/pull/796#issuecomment-1118014031 I'm trying not to steal the thunder of the folks who are actually working on this, but at a high level: we were seeing prefiltering being more expensive than postfiltering (over collecting

[GitHub] [lucene] jtibshirani commented on pull request #796: LUCENE-10504: KnnGraphTester to use KnnVectorQuery

2022-05-04 Thread GitBox
jtibshirani commented on PR #796: URL: https://github.com/apache/lucene/pull/796#issuecomment-1118002380 Thanks @msokolov ! So I'm not waiting in suspense too much, what sorts of interesting results have you found related to prefiltering?

[jira] [Commented] (LUCENE-10558) Expose IOSupplier constructors in Kuromoji (and Nori?)

2022-05-04 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531971#comment-17531971 ] Michael Sokolov commented on LUCENE-10558: -- > As workaround I would suggest to put the files

[jira] [Commented] (LUCENE-10559) Add preFilter/postFilter options to KnnGraphTester

2022-05-04 Thread Julie Tibshirani (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531972#comment-17531972 ] Julie Tibshirani commented on LUCENE-10559: --- Big +1 for this addition to KnnGraphTester. I

[jira] [Commented] (LUCENE-10504) KnnGraphTester should use KnnVectorQuery

2022-05-04 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531970#comment-17531970 ] ASF subversion and git services commented on LUCENE-10504: -- Commit

[GitHub] [lucene] msokolov merged pull request #796: LUCENE-10504: KnnGraphTester to use KnnVectorQuery

2022-05-04 Thread GitBox
msokolov merged PR #796: URL: https://github.com/apache/lucene/pull/796

[jira] [Commented] (LUCENE-10558) Expose IOSupplier constructors in Kuromoji (and Nori?)

2022-05-04 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531968#comment-17531968 ] Uwe Schindler commented on LUCENE-10558: The test does not fail as it uses the default file on

[jira] [Commented] (LUCENE-10558) Expose IOSupplier constructors in Kuromoji (and Nori?)

2022-05-04 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531966#comment-17531966 ] Uwe Schindler commented on LUCENE-10558: Same for TokenInfoDictionary. > Expose IOSupplier

[jira] [Commented] (LUCENE-10558) Expose IOSupplier constructors in Kuromoji (and Nori?)

2022-05-04 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531963#comment-17531963 ] Uwe Schindler commented on LUCENE-10558: Ah I see the bug:

[jira] [Commented] (LUCENE-10558) Expose IOSupplier constructors in Kuromoji (and Nori?)

2022-05-04 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531962#comment-17531962 ] Uwe Schindler commented on LUCENE-10558: What does DICTIONARY_PATH look like? Does it start with

[GitHub] [lucene] msokolov opened a new pull request, #867: LUCENE-10558: expose stream-based Kuromoji resource constructors

2022-05-04 Thread GitBox
msokolov opened a new pull request, #867: URL: https://github.com/apache/lucene/pull/867 Just makes some existing constructors public, and adds javadocs

[jira] [Comment Edited] (LUCENE-10558) Expose IOSupplier constructors in Kuromoji (and Nori?)

2022-05-04 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531960#comment-17531960 ] Michael Sokolov edited comment on LUCENE-10558 at 5/4/22 9:53 PM: -- Hi

[jira] [Comment Edited] (LUCENE-10558) Expose IOSupplier constructors in Kuromoji (and Nori?)

2022-05-04 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531960#comment-17531960 ] Michael Sokolov edited comment on LUCENE-10558 at 5/4/22 9:52 PM: -- Hi

[jira] [Commented] (LUCENE-10558) Expose IOSupplier constructors in Kuromoji (and Nori?)

2022-05-04 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531960#comment-17531960 ] Michael Sokolov commented on LUCENE-10558: -- Hi [~uschindler] – we have code like this:   {{

[jira] [Commented] (LUCENE-10558) Expose IOSupplier constructors in Kuromoji (and Nori?)

2022-05-04 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531956#comment-17531956 ] Uwe Schindler commented on LUCENE-10558: Which APIs did you use and how: - what classes do you

[jira] [Created] (LUCENE-10559) Add preFilter/postFilter options to KnnGraphTester

2022-05-04 Thread Michael Sokolov (Jira)
Michael Sokolov created LUCENE-10559: Summary: Add preFilter/postFilter options to KnnGraphTester Key: LUCENE-10559 URL: https://issues.apache.org/jira/browse/LUCENE-10559 Project: Lucene - Core

[GitHub] [lucene] msokolov commented on pull request #796: LUCENE-10504: KnnGraphTester to use KnnVectorQuery

2022-05-04 Thread GitBox
msokolov commented on PR #796: URL: https://github.com/apache/lucene/pull/796#issuecomment-1117937725 > @msokolov it'd be great to get this in when you have the time! whoa sorry I lost track. I will address the comments and push soon. I have other changes I want to make, relating to

[GitHub] [lucene] jtibshirani commented on pull request #796: LUCENE-10504: KnnGraphTester to use KnnVectorQuery

2022-05-04 Thread GitBox
jtibshirani commented on PR #796: URL: https://github.com/apache/lucene/pull/796#issuecomment-1117856474 @msokolov it'd be great to get this in when you have the time!

[jira] [Commented] (LUCENE-10335) IOUtils.getDecodingReader(Class, String) is broken with modules

2022-05-04 Thread Mike Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531926#comment-17531926 ] Mike Sokolov commented on LUCENE-10335: --- sure - I opened

[jira] [Created] (LUCENE-10558) Expose IOSupplier constructors in Kuromoji (and Nori?)

2022-05-04 Thread Michael Sokolov (Jira)
Michael Sokolov created LUCENE-10558: Summary: Expose IOSupplier constructors in Kuromoji (and Nori?) Key: LUCENE-10558 URL: https://issues.apache.org/jira/browse/LUCENE-10558 Project: Lucene -

[GitHub] [lucene] jtibshirani commented on a diff in pull request #792: LUCENE-10502: Use IndexedDISI to store docIds and DirectMonotonicWriter/Reader to handle ordToDoc

2022-05-04 Thread GitBox
jtibshirani commented on code in PR #792: URL: https://github.com/apache/lucene/pull/792#discussion_r865291279 ## lucene/core/src/java/org/apache/lucene/codecs/lucene91/Lucene91HnswVectorsReader.java: ## @@ -388,115 +400,239 @@ private static class FieldEntry { int size()

[GitHub] [lucene] vigyasharma commented on pull request #633: LUCENE-10216: Use MergeScheduler and MergePolicy to run addIndexes(CodecReader[]) merges.

2022-05-04 Thread GitBox
vigyasharma commented on PR #633: URL: https://github.com/apache/lucene/pull/633#issuecomment-1117847194 Removed the randomly returned high concurrency mergeSpec from `findMerges(CodecReaders...)` in `MockRandomMergePolicy`. Added a check to gracefully catch pending merges that are aborted

[jira] [Closed] (LUCENE-9848) Correctly sort HNSW graph neighbors when applying diversity criterion

2022-05-04 Thread Mayya Sharipova (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayya Sharipova closed LUCENE-9848. --- > Correctly sort HNSW graph neighbors when applying diversity criterion >

[jira] [Resolved] (LUCENE-9848) Correctly sort HNSW graph neighbors when applying diversity criterion

2022-05-04 Thread Mayya Sharipova (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayya Sharipova resolved LUCENE-9848. - Fix Version/s: 9.2 Resolution: Fixed > Correctly sort HNSW graph neighbors when

[jira] [Commented] (LUCENE-9848) Correctly sort HNSW graph neighbors when applying diversity criterion

2022-05-04 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531886#comment-17531886 ] ASF subversion and git services commented on LUCENE-9848: - Commit

[jira] [Commented] (LUCENE-9848) Correctly sort HNSW graph neighbors when applying diversity criterion

2022-05-04 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531882#comment-17531882 ] ASF subversion and git services commented on LUCENE-9848: - Commit

[GitHub] [lucene] jtibshirani commented on a diff in pull request #833: LUCENE-10411: Add NN vectors support to ExitableDirectoryReader

2022-05-04 Thread GitBox
jtibshirani commented on code in PR #833: URL: https://github.com/apache/lucene/pull/833#discussion_r865068242 ## lucene/CHANGES.txt: ## @@ -128,6 +128,9 @@ Optimizations * LUCENE-8836: Speed up calls to TermsEnum#lookupOrd on doc values terms enums and sequences of

[jira] [Commented] (LUCENE-9848) Correctly sort HNSW graph neighbors when applying diversity criterion

2022-05-04 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531879#comment-17531879 ] ASF subversion and git services commented on LUCENE-9848: - Commit

[GitHub] [lucene] mayya-sharipova commented on a diff in pull request #862: LUCENE-9848 Sort HNSW graph neighbors for construction

2022-05-04 Thread GitBox
mayya-sharipova commented on code in PR #862: URL: https://github.com/apache/lucene/pull/862#discussion_r865134562 ## lucene/core/src/java/org/apache/lucene/util/hnsw/NeighborArray.java: ## @@ -72,8 +103,39 @@ public void removeLast() { size--; } + public void

[GitHub] [lucene] mayya-sharipova merged pull request #862: LUCENE-9848 Sort HNSW graph neighbors for construction

2022-05-04 Thread GitBox
mayya-sharipova merged PR #862: URL: https://github.com/apache/lucene/pull/862

[jira] [Commented] (LUCENE-10335) IOUtils.getDecodingReader(Class, String) is broken with modules

2022-05-04 Thread Tomoko Uchida (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531876#comment-17531876 ] Tomoko Uchida commented on LUCENE-10335: Hi [~sokolov] , can you please open an issue? >

[jira] [Commented] (LUCENE-10335) IOUtils.getDecodingReader(Class, String) is broken with modules

2022-05-04 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531873#comment-17531873 ] Uwe Schindler commented on LUCENE-10335: What I wanted to say: if it breaks in classpath mode,

[jira] [Commented] (LUCENE-10335) IOUtils.getDecodingReader(Class, String) is broken with modules

2022-05-04 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531870#comment-17531870 ] Uwe Schindler commented on LUCENE-10335: Migrate.txt or .md was more about module system not

[jira] [Commented] (LUCENE-10335) IOUtils.getDecodingReader(Class, String) is broken with modules

2022-05-04 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531851#comment-17531851 ] Michael Sokolov commented on LUCENE-10335: -- Thanks, Uwe, you're right I confused the issues.

[jira] [Commented] (LUCENE-10335) IOUtils.getDecodingReader(Class, String) is broken with modules

2022-05-04 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531847#comment-17531847 ] Uwe Schindler commented on LUCENE-10335: bq. would load the resource from the given path, but

[jira] [Commented] (LUCENE-10335) IOUtils.getDecodingReader(Class, String) is broken with modules

2022-05-04 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531841#comment-17531841 ] Uwe Schindler commented on LUCENE-10335: If you use CustomAnalyzer, you can pass

[jira] [Commented] (LUCENE-10335) IOUtils.getDecodingReader(Class, String) is broken with modules

2022-05-04 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531839#comment-17531839 ] Uwe Schindler commented on LUCENE-10335: We did not change the method here, it was just

[GitHub] [lucene] msokolov commented on pull request #833: LUCENE-10411: Add NN vectors support to ExitableDirectoryReader

2022-05-04 Thread GitBox
msokolov commented on PR #833: URL: https://github.com/apache/lucene/pull/833#issuecomment-1117570637 Thanks, it will be great if we can make ExitableDirectoryReader cover all the formats so it can reliably be used for timing out. I agree with others here: basically, this is instrumenting

[GitHub] [lucene] msokolov commented on a diff in pull request #862: LUCENE-9848 Sort HNSW graph neighbors for construction

2022-05-04 Thread GitBox
msokolov commented on code in PR #862: URL: https://github.com/apache/lucene/pull/862#discussion_r865039829 ## lucene/core/src/java/org/apache/lucene/util/hnsw/NeighborArray.java: ## @@ -72,8 +103,39 @@ public void removeLast() { size--; } + public void

[jira] [Commented] (LUCENE-10335) IOUtils.getDecodingReader(Class, String) is broken with modules

2022-05-04 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531809#comment-17531809 ] Michael Sokolov commented on LUCENE-10335: -- This issue broke backwards compatibility. I think

[GitHub] [lucene] LuXugang commented on pull request #865: LUCENE-10502: Use IndexedDISI to store docIds and DirectMonotonicWriter/Reader to handle ordToDoc

2022-05-04 Thread GitBox
LuXugang commented on PR #865: URL: https://github.com/apache/lucene/pull/865#issuecomment-1117455420 Thanks @mayya-sharipova, your suggestion can reduce a lot of work, so this issue could be closed.

[GitHub] [lucene] mayya-sharipova commented on pull request #865: LUCENE-10502: Use IndexedDISI to store docIds and DirectMonotonicWriter/Reader to handle ordToDoc

2022-05-04 Thread GitBox
mayya-sharipova commented on PR #865: URL: https://github.com/apache/lucene/pull/865#issuecomment-1117440907 @LuXugang Thanks for this work. I was thinking we can adopt the following workflow for this work: - Once https://github.com/apache/lucene/pull/792 is fully approved, we can

[jira] [Assigned] (LUCENE-10527) Use bigger maxConn for last layer in HNSW

2022-05-04 Thread Mayya Sharipova (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayya Sharipova reassigned LUCENE-10527: Assignee: Mayya Sharipova > Use bigger maxConn for last layer in HNSW >

[GitHub] [lucene] jpountz commented on a diff in pull request #860: LUCENE-10553: Fix WANDScorer's handling of 0 and +Infty.

2022-05-04 Thread GitBox
jpountz commented on code in PR #860: URL: https://github.com/apache/lucene/pull/860#discussion_r864859715 ## lucene/core/src/java/org/apache/lucene/search/WANDScorer.java: ## @@ -86,7 +86,6 @@ static int scalingFactor(float f) { * sure we do not miss any matches. */

[GitHub] [lucene] LuXugang closed pull request #839: LUCENE-10537: DisjunctionMaxWeight could be rewrite to BooleanWeight if score is disable

2022-05-04 Thread GitBox
LuXugang closed pull request #839: LUCENE-10537: DisjunctionMaxWeight could be rewrite to BooleanWeight if score is disable URL: https://github.com/apache/lucene/pull/839

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira?

2022-05-04 Thread Tomoko Uchida (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531690#comment-17531690 ] Tomoko Uchida commented on LUCENE-10557: INFRA team says Vote or PMC agreement are needed for

[jira] [Commented] (LUCENE-10543) Achieve contribution workflow perfection (with progress)

2022-05-04 Thread Tomoko Uchida (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531686#comment-17531686 ] Tomoko Uchida commented on LUCENE-10543: Ok, I don't know how difficult it is though, opened an

[jira] [Updated] (LUCENE-10557) Migrate to GitHub issue from Jira?

2022-05-04 Thread Tomoko Uchida (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomoko Uchida updated LUCENE-10557: --- Description: A few (not the majority) Apache projects already use the GitHub issue instead

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira?

2022-05-04 Thread Tomoko Uchida (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531684#comment-17531684 ] Tomoko Uchida commented on LUCENE-10557: Thank you [~mikemccand] for joining and giving the

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira?

2022-05-04 Thread Tomoko Uchida (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531683#comment-17531683 ] Tomoko Uchida commented on LUCENE-10557: Among the tasks I mentioned in the issue description,

[jira] [Comment Edited] (LUCENE-10557) Migrate to GitHub issue from Jira?

2022-05-04 Thread Michael McCandless (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531680#comment-17531680 ] Michael McCandless edited comment on LUCENE-10557 at 5/4/22 11:29 AM:

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira?

2022-05-04 Thread Michael McCandless (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531681#comment-17531681 ] Michael McCandless commented on LUCENE-10557: - INFRA-16128 was the issue when RocketMQ

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira?

2022-05-04 Thread Michael McCandless (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531680#comment-17531680 ] Michael McCandless commented on LUCENE-10557: - Hmm at least ~4 years ago, migrating was

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira?

2022-05-04 Thread Michael McCandless (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531677#comment-17531677 ] Michael McCandless commented on LUCENE-10557: - +1 Do we know whether we could (relatively

[jira] [Commented] (LUCENE-10556) Relax the maximum dirtiness for stored fields and term vectors?

2022-05-04 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531676#comment-17531676 ] Robert Muir commented on LUCENE-10556: -- {quote} Currently the logic is to recompress if more than

[jira] [Created] (LUCENE-10557) Migrate to GitHub issue from Jira?

2022-05-04 Thread Tomoko Uchida (Jira)
Tomoko Uchida created LUCENE-10557: -- Summary: Migrate to GitHub issue from Jira? Key: LUCENE-10557 URL: https://issues.apache.org/jira/browse/LUCENE-10557 Project: Lucene - Core Issue Type:

[jira] [Commented] (LUCENE-10543) Achieve contribution workflow perfection (with progress)

2022-05-04 Thread Tomoko Uchida (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531655#comment-17531655 ] Tomoko Uchida commented on LUCENE-10543: I found GitHub issue is enabled on Apache Airflow and

[GitHub] [lucene] romseygeek commented on a diff in pull request #860: LUCENE-10553: Fix WANDScorer's handling of 0 and +Infty.

2022-05-04 Thread GitBox
romseygeek commented on code in PR #860: URL: https://github.com/apache/lucene/pull/860#discussion_r864686490 ## lucene/core/src/java/org/apache/lucene/search/WANDScorer.java: ## @@ -86,7 +86,6 @@ static int scalingFactor(float f) { * sure we do not miss any matches. */

[GitHub] [lucene] mocobeta commented on pull request #833: LUCENE-10411: Add NN vectors support to ExitableDirectoryReader

2022-05-04 Thread GitBox
mocobeta commented on PR #833: URL: https://github.com/apache/lucene/pull/833#issuecomment-1117073813 Thanks for updating. I'm new to the main code (ExitableDirectoryReader itself) so I'd defer to Adrien and Julie on whether this is ready to be merged, but the tests look good to me.

[GitHub] [lucene] zacharymorn commented on pull request #833: LUCENE-10411: Add NN vectors support to ExitableDirectoryReader

2022-05-04 Thread GitBox
zacharymorn commented on PR #833: URL: https://github.com/apache/lucene/pull/833#issuecomment-1117003662 > > the baseline could run on the raw reader and the contender would wrap the reader with ExitableDirectoryReader and a very large timeout that's almost certainly not going to be hit,

[GitHub] [lucene] jpountz commented on a diff in pull request #860: LUCENE-10553: Fix WANDScorer's handling of 0 and +Infty.

2022-05-04 Thread GitBox
jpountz commented on code in PR #860: URL: https://github.com/apache/lucene/pull/860#discussion_r864517912 ## lucene/core/src/java/org/apache/lucene/search/WANDScorer.java: ## @@ -86,7 +86,6 @@ static int scalingFactor(float f) { * sure we do not miss any matches. */

[jira] [Created] (LUCENE-10556) Relax the maximum dirtiness for stored fields and term vectors?

2022-05-04 Thread Adrien Grand (Jira)
Adrien Grand created LUCENE-10556: - Summary: Relax the maximum dirtiness for stored fields and term vectors? Key: LUCENE-10556 URL: https://issues.apache.org/jira/browse/LUCENE-10556 Project: Lucene

[GitHub] [lucene] LuXugang commented on pull request #792: LUCENE-10502: Use IndexedDISI to store docIds and DirectMonotonicWriter/Reader to handle ordToDoc

2022-05-04 Thread GitBox
LuXugang commented on PR #792: URL: https://github.com/apache/lucene/pull/792#issuecomment-1116957549 > Maybe you could give some information about your machine and benchmark set-up (was there a warmup?) Here is my benchmark test demo :