[GitHub] [lucene] dungba88 opened a new pull request #254: LUCENE-10059: Fix AssertionError in JapaneseTokenizer backtrace

2021-08-19 Thread GitBox
dungba88 opened a new pull request #254: URL: https://github.com/apache/lucene/pull/254 # Description There is an issue which causes an `AssertionError` in the backtrace step of `JapaneseTokenizer`. If there is a text span of length 1024 (determined by `MAX_BACKTRACE_GAP`) where

[jira] [Created] (LUCENE-10059) Assertion error in JapaneseTokenizer backtrace

2021-08-19 Thread Anh Dung Bui (Jira)
Anh Dung Bui created LUCENE-10059: - Summary: Assertion error in JapaneseTokenizer backtrace Key: LUCENE-10059 URL: https://issues.apache.org/jira/browse/LUCENE-10059 Project: Lucene - Core

[GitHub] [lucene] madrob commented on pull request #200: LUCENE-10017 Less verbose exception on IndexFormatTooOld

2021-08-19 Thread GitBox
madrob commented on pull request #200: URL: https://github.com/apache/lucene/pull/200#issuecomment-902273699 @rmuir - do you have thoughts on this, since it looks like you were the last person in this code way back in 2014 -- This is an automated message from the Apache Git Service. To

[jira] [Commented] (LUCENE-10057) Replace direct mmaped buffer with Lucene abstractions in KnnVectorDict

2021-08-19 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401848#comment-17401848 ] ASF subversion and git services commented on LUCENE-10057: -- Commit

[GitHub] [lucene] msokolov merged pull request #252: LUCENE-10057: Use Lucene abstractions to store KnnVectorDict

2021-08-19 Thread GitBox
msokolov merged pull request #252: URL: https://github.com/apache/lucene/pull/252

[GitHub] [lucene] msokolov closed pull request #252: LUCENE-10057: Use Lucene abstractions to store KnnVectorDict

2021-08-19 Thread GitBox
msokolov closed pull request #252: URL: https://github.com/apache/lucene/pull/252

[GitHub] [lucene] msokolov commented on a change in pull request #252: LUCENE-10057: Use Lucene abstractions to store KnnVectorDict

2021-08-19 Thread GitBox
msokolov commented on a change in pull request #252: URL: https://github.com/apache/lucene/pull/252#discussion_r692447161 ## File path: lucene/demo/src/java/org/apache/lucene/demo/IndexFiles.java ## @@ -153,6 +163,10 @@ public static void main(String[] args) throws Exception {

[GitHub] [lucene] dweiss commented on a change in pull request #252: LUCENE-10057: Use Lucene abstractions to store KnnVectorDict

2021-08-19 Thread GitBox
dweiss commented on a change in pull request #252: URL: https://github.com/apache/lucene/pull/252#discussion_r692388454 ## File path: lucene/demo/src/java/org/apache/lucene/demo/IndexFiles.java ## @@ -153,6 +163,10 @@ public static void main(String[] args) throws Exception {

[jira] [Commented] (LUCENE-10051) lucene branch_8x run ant run-task error

2021-08-19 Thread xiaoshi (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401799#comment-17401799 ] xiaoshi commented on LUCENE-10051: -- This error is the same as LUCENE-10058; I will fix it there. >

[GitHub] [lucene] xiaoshi2013 opened a new pull request #253: LUCENE-10058: fix gradle lucene:benchmark:run error

2021-08-19 Thread GitBox
xiaoshi2013 opened a new pull request #253: URL: https://github.com/apache/lucene/pull/253 When running ./gradlew lucene:benchmark:run, the default thread name is main, not ParallelTaskThread, so a StringIndexOutOfBoundsException is thrown. I set the threadIndex default value to 0 to

[GitHub] [lucene] uschindler commented on pull request #252: LUCENE-10057: Use Lucene abstractions to store KnnVectorDict

2021-08-19 Thread GitBox
uschindler commented on pull request #252: URL: https://github.com/apache/lucene/pull/252#issuecomment-902087301 This looks very good: the dict file is now created using IndexOutput and read in using IndexInput, and it is also stored in the index directory. I am not sure if storing in index

[jira] [Commented] (LUCENE-10057) Replace direct mmaped buffer with Lucene abstractions in KnnVectorDict

2021-08-19 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401732#comment-17401732 ] Michael Sokolov commented on LUCENE-10057: -- I tweaked Dawid's patch here

[jira] [Updated] (LUCENE-10058) lucene main run ./gradlew lucene:benchmark:run error

2021-08-19 Thread xiaoshi (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xiaoshi updated LUCENE-10058: - Labels: newdev (was: ) > lucene main run ./gradlew lucene:benchmark:run error >

[GitHub] [lucene-solr] xiaoshi2013 closed pull request #2556: LUCENE-10051 lucene branch_8x run ant run-task error

2021-08-19 Thread GitBox
xiaoshi2013 closed pull request #2556: URL: https://github.com/apache/lucene-solr/pull/2556

[jira] [Created] (LUCENE-10058) lucene main run ./gradlew lucene:benchmark:run error

2021-08-19 Thread xiaoshi (Jira)
xiaoshi created LUCENE-10058: Summary: lucene main run ./gradlew lucene:benchmark:run error Key: LUCENE-10058 URL: https://issues.apache.org/jira/browse/LUCENE-10058 Project: Lucene - Core

[jira] [Commented] (LUCENE-10052) Add LuceneTestCase.newBytesRef methods

2021-08-19 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401717#comment-17401717 ] ASF subversion and git services commented on LUCENE-10052: -- Commit

[jira] [Commented] (LUCENE-10052) Add LuceneTestCase.newBytesRef methods

2021-08-19 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401715#comment-17401715 ] ASF subversion and git services commented on LUCENE-10052: -- Commit

[jira] [Commented] (LUCENE-10052) Add LuceneTestCase.newBytesRef methods

2021-08-19 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401716#comment-17401716 ] ASF subversion and git services commented on LUCENE-10052: -- Commit

[GitHub] [lucene] mikemccand commented on a change in pull request #251: LUCENE-10040: Temporarily disable test assertion

2021-08-19 Thread GitBox
mikemccand commented on a change in pull request #251: URL: https://github.com/apache/lucene/pull/251#discussion_r692192328 ## File path: lucene/core/src/test/org/apache/lucene/search/TestKnnVectorQuery.java ## @@ -350,7 +350,9 @@ public void testDeletes() throws IOException {

[GitHub] [lucene] jtibshirani commented on pull request #251: LUCENE-10040: Temporarily disable test assertion

2021-08-19 Thread GitBox
jtibshirani commented on pull request #251: URL: https://github.com/apache/lucene/pull/251#issuecomment-901987142 I think that if k <= n, the number of documents, then ideally we'd always retrieve k neighbors. I opted to disable the assertion for now instead of making a fix, since we're

[GitHub] [lucene] jtibshirani opened a new pull request #251: LUCENE-10040: Temporarily disable test assertion

2021-08-19 Thread GitBox
jtibshirani opened a new pull request #251: URL: https://github.com/apache/lucene/pull/251 TestKnnVectorQuery#testDeletes assumes that if there are n total documents, we can perform a kNN search with k=n and retrieve all documents. This isn't true with our implementation -- due to

[GitHub] [lucene] gsmiller commented on a change in pull request #240: LUCENE-10002: Deprecate IndexSearch#search(Query, Collector) in favor of IndexSearcher#search(Query, CollectorManager)

2021-08-19 Thread GitBox
gsmiller commented on a change in pull request #240: URL: https://github.com/apache/lucene/pull/240#discussion_r692105614 ## File path: lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java ## @@ -659,9 +614,12 @@ public TopFieldDocs reduce(Collection collectors)

[jira] [Commented] (LUCENE-10057) Replace direct mmaped buffer with Lucene abstractions in KnnVectorDict

2021-08-19 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401680#comment-17401680 ] Uwe Schindler commented on LUCENE-10057: Hi, the fix to use directory looks fine. I don't

[jira] [Commented] (LUCENE-10057) Replace direct mmaped buffer with Lucene abstractions in KnnVectorDict

2021-08-19 Thread Dawid Weiss (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401673#comment-17401673 ] Dawid Weiss commented on LUCENE-10057: -- I think it works because those tests prepare the

[jira] [Commented] (LUCENE-10057) Replace direct mmaped buffer with Lucene abstractions in KnnVectorDict

2021-08-19 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401672#comment-17401672 ] Michael Sokolov commented on LUCENE-10057: -- Oh! You are indeed correct, this logic is flawed.

[jira] [Commented] (LUCENE-10057) Replace direct mmaped buffer with Lucene abstractions in KnnVectorDict

2021-08-19 Thread Dawid Weiss (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401666#comment-17401666 ] Dawid Weiss commented on LUCENE-10057: -- If you leave an unclosed mapped buffer then no - closing

[jira] [Comment Edited] (LUCENE-10057) Replace direct mmaped buffer with Lucene abstractions in KnnVectorDict

2021-08-19 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401661#comment-17401661 ] Michael Sokolov edited comment on LUCENE-10057 at 8/19/21, 12:24 PM:

[jira] [Commented] (LUCENE-10057) Replace direct mmaped buffer with Lucene abstractions in KnnVectorDict

2021-08-19 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401661#comment-17401661 ] Michael Sokolov commented on LUCENE-10057: -- Oh, did my stab at this not work? I was unable to

[jira] [Comment Edited] (LUCENE-10054) Handle hierarchy in HNSW graph

2021-08-19 Thread Mayya Sharipova (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401172#comment-17401172 ] Mayya Sharipova edited comment on LUCENE-10054 at 8/19/21, 11:44 AM:

[jira] [Comment Edited] (LUCENE-10054) Handle hierarchy in HNSW graph

2021-08-19 Thread Mayya Sharipova (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401172#comment-17401172 ] Mayya Sharipova edited comment on LUCENE-10054 at 8/19/21, 11:42 AM:

[jira] [Comment Edited] (LUCENE-10054) Handle hierarchy in HNSW graph

2021-08-19 Thread Mayya Sharipova (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401172#comment-17401172 ] Mayya Sharipova edited comment on LUCENE-10054 at 8/19/21, 11:22 AM:

[jira] [Comment Edited] (LUCENE-10054) Handle hierarchy in HNSW graph

2021-08-19 Thread Mayya Sharipova (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401172#comment-17401172 ] Mayya Sharipova edited comment on LUCENE-10054 at 8/19/21, 11:16 AM:

[jira] [Commented] (LUCENE-10057) Replace direct mmaped buffer with Lucene abstractions in KnnVectorDict

2021-08-19 Thread Dawid Weiss (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401615#comment-17401615 ] Dawid Weiss commented on LUCENE-10057: -- Well, this works but I scratch my head over whether it's

[jira] [Updated] (LUCENE-10057) Replace direct mmaped buffer with Lucene abstractions in KnnVectorDict

2021-08-19 Thread Dawid Weiss (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated LUCENE-10057: - Attachment: LUCENE-10057.patch > Replace direct mmaped buffer with Lucene abstractions in

[jira] [Commented] (LUCENE-10057) Replace direct mmaped buffer with Lucene abstractions in KnnVectorDict

2021-08-19 Thread Dawid Weiss (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401576#comment-17401576 ] Dawid Weiss commented on LUCENE-10057: -- Uwe: {code} To fix the demo issue: In KNNVectorDict demo

[jira] [Created] (LUCENE-10057) Replace direct mmaped buffer with Lucene abstractions in KnnVectorDict

2021-08-19 Thread Dawid Weiss (Jira)
Dawid Weiss created LUCENE-10057: Summary: Replace direct mmaped buffer with Lucene abstractions in KnnVectorDict Key: LUCENE-10057 URL: https://issues.apache.org/jira/browse/LUCENE-10057 Project: