[jira] [Commented] (LUCENE-10147) KnnVectorQuery can produce negative scores

2021-10-20 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17431170#comment-17431170 ] Michael Sokolov commented on LUCENE-10147: -- [~julietibs] I don't have a link to the thread, but

[jira] [Commented] (LUCENE-10180) Remove usage of lambdas in SegmentMerger?

2021-10-18 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17430276#comment-17430276 ] Michael Sokolov commented on LUCENE-10180: -- It looks as if [~jpountz] posted a PR:

[jira] [Commented] (LUCENE-10180) Remove usage of lambdas in SegmentMerger?

2021-10-17 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17429694#comment-17429694 ] Michael Sokolov commented on LUCENE-10180: -- Thanks for digging - does the proposed solution

[jira] [Commented] (LUCENE-10170) Regression in stored fields compression in 9.0

2021-10-17 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17429690#comment-17429690 ] Michael Sokolov commented on LUCENE-10170: -- IIRC luceneutil does have some support for

[jira] [Comment Edited] (LUCENE-10069) HNSW can miss results with very large k

2021-10-13 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404690#comment-17404690 ] Michael Sokolov edited comment on LUCENE-10069 at 10/13/21, 2:25 PM:

[jira] [Commented] (LUCENE-10168) drop support for 7.0 indexes in 9.0 (master)

2021-10-12 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427701#comment-17427701 ] Michael Sokolov commented on LUCENE-10168: -- This is what I see:   {{> Task

[jira] [Comment Edited] (LUCENE-10168) drop support for 7.0 indexes in 9.0 (master)

2021-10-12 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427664#comment-17427664 ] Michael Sokolov edited comment on LUCENE-10168 at 10/12/21, 12:56 PM:

[jira] [Commented] (LUCENE-10168) drop support for 7.0 indexes in 9.0 (master)

2021-10-12 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427664#comment-17427664 ] Michael Sokolov commented on LUCENE-10168: -- I haven't messed around with these tests, so I was

[jira] [Resolved] (LUCENE-10147) KnnVectorQuery can produce negative scores

2021-10-07 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Sokolov resolved LUCENE-10147. -- Resolution: Fixed > KnnVectorQuery can produce negative scores >

[jira] [Commented] (LUCENE-10147) KnnVectorQuery can produce negative scores

2021-10-06 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425221#comment-17425221 ] Michael Sokolov commented on LUCENE-10147: -- OK, I went with convertToScore. I haven't done

[jira] [Commented] (LUCENE-10147) KnnVectorQuery can produce negative scores

2021-10-05 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424680#comment-17424680 ] Michael Sokolov commented on LUCENE-10147: -- Also, I'll just note that we do have this javadoc

[jira] [Commented] (LUCENE-10147) KnnVectorQuery can produce negative scores

2021-10-05 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424675#comment-17424675 ] Michael Sokolov commented on LUCENE-10147: -- I'm working up a small CR that should address this

[jira] [Commented] (LUCENE-10147) KnnVectorQuery can produce negative scores

2021-10-05 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424414#comment-17424414 ] Michael Sokolov commented on LUCENE-10147: -- Oh, nice catch! I agree let's shift (and maybe

[jira] [Commented] (LUCENE-8638) Remove deprecated code in main

2021-09-30 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17422761#comment-17422761 ] Michael Sokolov commented on LUCENE-8638: - > Shall we remove the 'blocker' tag from this issue

[jira] [Updated] (LUCENE-8638) Remove deprecated code in main

2021-09-30 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Sokolov updated LUCENE-8638: Priority: Major (was: Blocker) > Remove deprecated code in main >

[jira] [Resolved] (LUCENE-10082) Improve error messages relating to schema consistency enforcement

2021-09-03 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Sokolov resolved LUCENE-10082. -- Resolution: Fixed > Improve error messages relating to schema consistency

[jira] [Updated] (LUCENE-10082) Improve error messages relating to schema consistency enforcement

2021-09-01 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Sokolov updated LUCENE-10082: - Description: I recently went through the process of upgrading our service to use

[jira] [Created] (LUCENE-10082) Improve error messages relating to schema consistency enforcement

2021-09-01 Thread Michael Sokolov (Jira)
Michael Sokolov created LUCENE-10082: Summary: Improve error messages relating to schema consistency enforcement Key: LUCENE-10082 URL: https://issues.apache.org/jira/browse/LUCENE-10082 Project:

[jira] [Commented] (LUCENE-10063) SimpleTextKnnVectorsReader.search needs an implementation

2021-08-31 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407740#comment-17407740 ] Michael Sokolov commented on LUCENE-10063: -- Ooh, thank you for pointing that out. Sorry for

[jira] [Commented] (LUCENE-8723) Bad interaction between WordDelimiterGraphFilter, StopFilter and FlattenGraphFilter

2021-08-31 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407572#comment-17407572 ] Michael Sokolov commented on LUCENE-8723: - I wonder if WDGF and SynonymGraphFilter can also be

[jira] [Comment Edited] (LUCENE-10069) HNSW can miss results with very large k

2021-08-25 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404690#comment-17404690 ] Michael Sokolov edited comment on LUCENE-10069 at 8/25/21, 8:01 PM:

[jira] [Commented] (LUCENE-10069) HNSW can miss results with very large k

2021-08-25 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404690#comment-17404690 ] Michael Sokolov commented on LUCENE-10069: -- also, I found it is easy to reproduce this several

[jira] [Commented] (LUCENE-10069) HNSW can miss results with very large k

2021-08-25 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404687#comment-17404687 ] Michael Sokolov commented on LUCENE-10069: -- I think that because we prune neighbors to

[jira] [Commented] (LUCENE-10063) SimpleTextKnnVectorsReader.search needs an implementation

2021-08-25 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404636#comment-17404636 ] Michael Sokolov commented on LUCENE-10063: -- Thanks for opening Adrien; I agree we can do a

[jira] [Commented] (LUCENE-10059) Assertion error in JapaneseTokenizer backtrace

2021-08-25 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404476#comment-17404476 ] Michael Sokolov commented on LUCENE-10059: -- > Should we try to have a base class (maybe in

[jira] [Commented] (LUCENE-10067) investigate 6/23/2021 -> 6/24/2021 drop in facets perf

2021-08-25 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404425#comment-17404425 ] Michael Sokolov commented on LUCENE-10067: -- > and thank you nightly benchmarks for catching

[jira] [Commented] (LUCENE-10064) Give findFullFlushMerges a default implementation

2021-08-24 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17403812#comment-17403812 ] Michael Sokolov commented on LUCENE-10064: -- +1 to provide some default impl in the standard

[jira] [Commented] (LUCENE-10054) Handle hierarchy in HNSW graph

2021-08-20 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17402289#comment-17402289 ] Michael Sokolov commented on LUCENE-10054: -- Thanks for looking into this! could you expand

[jira] [Commented] (LUCENE-10057) Replace direct mmaped buffer with Lucene abstractions in KnnVectorDict

2021-08-19 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401732#comment-17401732 ] Michael Sokolov commented on LUCENE-10057: -- I tweaked Dawid's patch here

[jira] [Commented] (LUCENE-10057) Replace direct mmaped buffer with Lucene abstractions in KnnVectorDict

2021-08-19 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401672#comment-17401672 ] Michael Sokolov commented on LUCENE-10057: -- Oh! You are indeed correct, this logic is flawed.

[jira] [Comment Edited] (LUCENE-10057) Replace direct mmaped buffer with Lucene abstractions in KnnVectorDict

2021-08-19 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401661#comment-17401661 ] Michael Sokolov edited comment on LUCENE-10057 at 8/19/21, 12:24 PM:

[jira] [Commented] (LUCENE-10057) Replace direct mmaped buffer with Lucene abstractions in KnnVectorDict

2021-08-19 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401661#comment-17401661 ] Michael Sokolov commented on LUCENE-10057: -- Oh, did my stab at this not work? I was unable to

[jira] [Commented] (LUCENE-8638) Remove deprecated code in main

2021-08-18 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401330#comment-17401330 ] Michael Sokolov commented on LUCENE-8638: - I see, thanks [~mdrob]. I only noticed that because I

[jira] [Commented] (LUCENE-8638) Remove deprecated code in main

2021-08-18 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401026#comment-17401026 ] Michael Sokolov commented on LUCENE-8638: - I think this is close to "done enough" now. I plan to

[jira] [Resolved] (LUCENE-10016) VectorReader.search needs rethought, o.a.l.search integration?

2021-08-18 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Sokolov resolved LUCENE-10016. -- Resolution: Fixed [~rcmuir] thanks for opening this; I think we can resolve now. >

[jira] [Comment Edited] (LUCENE-8638) Remove deprecated code in main

2021-08-18 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17400540#comment-17400540 ] Michael Sokolov edited comment on LUCENE-8638 at 8/18/21, 12:33 PM:

[jira] [Comment Edited] (LUCENE-8638) Remove deprecated code in main

2021-08-17 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17400540#comment-17400540 ] Michael Sokolov edited comment on LUCENE-8638 at 8/17/21, 9:51 PM: ---

[jira] [Comment Edited] (LUCENE-8638) Remove deprecated code in main

2021-08-17 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17400540#comment-17400540 ] Michael Sokolov edited comment on LUCENE-8638 at 8/17/21, 9:47 PM: ---

[jira] [Comment Edited] (LUCENE-8638) Remove deprecated code in main

2021-08-17 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17400540#comment-17400540 ] Michael Sokolov edited comment on LUCENE-8638 at 8/17/21, 6:59 PM: --- I

[jira] [Comment Edited] (LUCENE-8638) Remove deprecated code in main

2021-08-17 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17400540#comment-17400540 ] Michael Sokolov edited comment on LUCENE-8638 at 8/17/21, 6:12 PM: --- I

[jira] [Commented] (LUCENE-8638) Remove deprecated code in main

2021-08-17 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17400540#comment-17400540 ] Michael Sokolov commented on LUCENE-8638: - I merged https://github.com/apache/lucene/pull/243

[jira] [Comment Edited] (LUCENE-8638) Remove deprecated code in main

2021-08-16 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17399740#comment-17399740 ] Michael Sokolov edited comment on LUCENE-8638 at 8/16/21, 12:58 PM:

[jira] [Comment Edited] (LUCENE-8638) Remove deprecated code in main

2021-08-16 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17399740#comment-17399740 ] Michael Sokolov edited comment on LUCENE-8638 at 8/16/21, 12:57 PM:

[jira] [Comment Edited] (LUCENE-8638) Remove deprecated code in main

2021-08-16 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17399740#comment-17399740 ] Michael Sokolov edited comment on LUCENE-8638 at 8/16/21, 12:56 PM:

[jira] [Comment Edited] (LUCENE-8638) Remove deprecated code in main

2021-08-16 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17399740#comment-17399740 ] Michael Sokolov edited comment on LUCENE-8638 at 8/16/21, 12:55 PM:

[jira] [Commented] (LUCENE-8638) Remove deprecated code in main

2021-08-16 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17399740#comment-17399740 ] Michael Sokolov commented on LUCENE-8638: - I opened a PR with the easy changes from the

[jira] [Commented] (LUCENE-8638) Remove deprecated code in main

2021-08-14 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17399265#comment-17399265 ] Michael Sokolov commented on LUCENE-8638: - I'm a little unclear why we opted to use a separate

[jira] [Updated] (LUCENE-8638) Remove deprecated code in main

2021-08-14 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Sokolov updated LUCENE-8638: Summary: Remove deprecated code in main (was: Remove deprecated code in master) >

[jira] [Commented] (LUCENE-10016) VectorReader.search needs rethought, o.a.l.search integration?

2021-08-13 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17398774#comment-17398774 ] Michael Sokolov commented on LUCENE-10016: -- https://github.com/apache/lucene/pull/241 adds a

[jira] [Resolved] (LUCENE-9614) Implement KNN Query

2021-08-13 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Sokolov resolved LUCENE-9614. - Resolution: Fixed > Implement KNN Query > --- > > Key:

[jira] [Commented] (LUCENE-10048) Bypass total frequency check if field uses custom term frequency

2021-08-13 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17398614#comment-17398614 ] Michael Sokolov commented on LUCENE-10048: -- > Ankur I wonder if encoding the scores more

[jira] [Commented] (LUCENE-9937) ann-benchmarks results for HNSW search

2021-08-11 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17397664#comment-17397664 ] Michael Sokolov commented on LUCENE-9937: - I'm glad you found the batch setup was confounding

[jira] [Commented] (LUCENE-9614) Implement KNN Query

2021-08-11 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17397613#comment-17397613 ] Michael Sokolov commented on LUCENE-9614: - UPDATE: we went with {{score(q,d) = 1 / (1 + |q -

[jira] [Comment Edited] (LUCENE-9614) Implement KNN Query

2021-08-09 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17396283#comment-17396283 ] Michael Sokolov edited comment on LUCENE-9614 at 8/9/21, 9:07 PM: --

[jira] [Commented] (LUCENE-9614) Implement KNN Query

2021-08-09 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17396283#comment-17396283 ] Michael Sokolov commented on LUCENE-9614: - Thinking about how to make the scores be commensurate

[jira] [Commented] (LUCENE-10040) Handle deletions in nearest vector search

2021-08-06 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17394763#comment-17394763 ] Michael Sokolov commented on LUCENE-10040: -- This approach (filter out deletions while

[jira] [Commented] (LUCENE-9004) Approximate nearest vector search

2021-08-04 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17393372#comment-17393372 ] Michael Sokolov commented on LUCENE-9004: - > One other advantage is that each search would be

[jira] [Commented] (LUCENE-9004) Approximate nearest vector search

2021-08-04 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17393216#comment-17393216 ] Michael Sokolov commented on LUCENE-9004: - > It looks like the current HnswGraph doesn't

[jira] [Commented] (LUCENE-10016) VectorReader.search needs rethought, o.a.l.search integration?

2021-07-28 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17388899#comment-17388899 ] Michael Sokolov commented on LUCENE-10016: -- as for the demo, there is a start on something we

[jira] [Commented] (LUCENE-10034) Vectors NeighborQueue MIN/MAX heap reversed?

2021-07-24 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386765#comment-17386765 ] Michael Sokolov commented on LUCENE-10034: -- I have trouble with any "distance" that has d(x,

[jira] [Commented] (LUCENE-10034) Vectors NeighborQueue MIN/MAX heap reversed?

2021-07-23 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386525#comment-17386525 ] Michael Sokolov commented on LUCENE-10034: -- This is so confusing! I made the same error

[jira] [Commented] (LUCENE-10015) Remove VectorValues.SimilarityFunction, remove NONE

2021-07-22 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17385546#comment-17385546 ] Michael Sokolov commented on LUCENE-10015: -- Yes, let's keep the similarity function! It's an

[jira] [Comment Edited] (LUCENE-9614) Implement KNN Query

2021-07-10 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17378497#comment-17378497 ] Michael Sokolov edited comment on LUCENE-9614 at 7/10/21, 5:03 PM: ---

[jira] [Commented] (LUCENE-9614) Implement KNN Query

2021-07-10 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17378497#comment-17378497 ] Michael Sokolov commented on LUCENE-9614: - Doing nn vector search during rewrite has one

[jira] [Commented] (LUCENE-10016) VectorReader.search needs rethought, o.a.l.search integration?

2021-07-10 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17378489#comment-17378489 ] Michael Sokolov commented on LUCENE-10016: -- I posted a PR removing fanout. For the question of

[jira] [Commented] (LUCENE-10019) Align file starts in CFS files to have proper alignment (8 bytes)

2021-07-03 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17374100#comment-17374100 ] Michael Sokolov commented on LUCENE-10019: -- > I'd like to get this in for 9.0 so we can at

[jira] [Updated] (LUCENE-10019) Align file starts in CFS files to have proper alignment (8 bytes)

2021-07-03 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Sokolov updated LUCENE-10019: - Priority: Blocker (was: Major) > Align file starts in CFS files to have proper

[jira] [Commented] (LUCENE-8638) Remove deprecated code in master

2021-07-03 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17374089#comment-17374089 ] Michael Sokolov commented on LUCENE-8638: - re: {{FixBrokenOffsetsFilter}} I don't think we have

[jira] [Commented] (LUCENE-10016) VectorReader.search needs rethought, o.a.l.search integration?

2021-07-01 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17373038#comment-17373038 ] Michael Sokolov commented on LUCENE-10016: -- > We can move it to a codec parameter? Probably

[jira] [Commented] (LUCENE-9855) Reconsider names for ANN related format and APIs

2021-07-01 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17373011#comment-17373011 ] Michael Sokolov commented on LUCENE-9855: - # I agree this is a blocker - we don't want to be

[jira] [Commented] (LUCENE-10016) VectorReader.search needs rethought, o.a.l.search integration?

2021-06-30 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17372056#comment-17372056 ] Michael Sokolov commented on LUCENE-10016: -- Hmm I think we expect that any approximate

[jira] [Commented] (LUCENE-10015) Remove VectorValues.SimilarityFunction, remove NONE

2021-06-30 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17372053#comment-17372053 ] Michael Sokolov commented on LUCENE-10015: -- I'm fine with removing NONE but I don't think that

[jira] [Created] (LUCENE-9992) During merging, write empty (headers-only) numeric vectors when vector-valued FieldInfos is present

2021-06-06 Thread Michael Sokolov (Jira)
Michael Sokolov created LUCENE-9992: --- Summary: During merging, write empty (headers-only) numeric vectors when vector-valued FieldInfos is present Key: LUCENE-9992 URL:

[jira] [Commented] (LUCENE-9986) Create a simple "real world" regexp benchmark

2021-06-02 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17355684#comment-17355684 ] Michael Sokolov commented on LUCENE-9986: - [This SO

[jira] [Commented] (LUCENE-9937) ann-benchmarks results for HNSW search

2021-06-01 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17355161#comment-17355161 ] Michael Sokolov commented on LUCENE-9937: - I ran some further tests using hnswlib and

[jira] [Commented] (LUCENE-9625) Benchmark KNN search with ann-benchmarks

2021-05-27 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17352621#comment-17352621 ] Michael Sokolov commented on LUCENE-9625: - Yes, once we have a publicly-available release I

[jira] [Commented] (LUCENE-9905) Revise approach to specifying NN algorithm

2021-04-25 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17331624#comment-17331624 ] Michael Sokolov commented on LUCENE-9905: - [GitHub Pull Request

[jira] [Commented] (LUCENE-9937) ann-benchmarks results for HNSW search

2021-04-25 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17331562#comment-17331562 ] Michael Sokolov commented on LUCENE-9937: - I think an important statistic that helps when

[jira] [Commented] (LUCENE-9905) Revise approach to specifying NN algorithm

2021-04-23 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330867#comment-17330867 ] Michael Sokolov commented on LUCENE-9905: - [~sejalpawar] as this issue is somewhat

[jira] [Commented] (LUCENE-9905) Revise approach to specifying NN algorithm

2021-04-19 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17325327#comment-17325327 ] Michael Sokolov commented on LUCENE-9905: - I think it could be difficult to break it up, but

[jira] [Comment Edited] (LUCENE-9905) Revise approach to specifying NN algorithm

2021-04-18 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324622#comment-17324622 ] Michael Sokolov edited comment on LUCENE-9905 at 4/18/21, 11:35 PM:

[jira] [Commented] (LUCENE-9905) Revise approach to specifying NN algorithm

2021-04-18 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324622#comment-17324622 ] Michael Sokolov commented on LUCENE-9905: - Sejal, please go ahead! > Revise approach to

[jira] [Resolved] (LUCENE-9844) Document the disk layout of Lucene90VectorFormat

2021-04-14 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Sokolov resolved LUCENE-9844. - Resolution: Fixed > Document the disk layout of Lucene90VectorFormat >

[jira] [Commented] (LUCENE-9798) Fix looping bug when calculating full KNN results in KnnGraphTester

2021-04-12 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17319710#comment-17319710 ] Michael Sokolov commented on LUCENE-9798: - [~nitiraj] I pushed your change, but had to revert.

[jira] [Reopened] (LUCENE-9855) Reconsider names for ANN related format and APIs

2021-04-12 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Sokolov reopened LUCENE-9855: - > Reconsider names for ANN related format and APIs >

[jira] [Commented] (LUCENE-9855) Reconsider names for ANN related format and APIs

2021-04-12 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17319432#comment-17319432 ] Michael Sokolov commented on LUCENE-9855: - [~tomoko] thanks for working on this. I think your

[jira] [Commented] (LUCENE-9855) Reconsider codec name VectorFormat

2021-04-05 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315067#comment-17315067 ] Michael Sokolov commented on LUCENE-9855: - OK, naming is hard! I think it will help to break

[jira] [Commented] (LUCENE-9855) Reconsider codec name VectorFormat

2021-04-02 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17313963#comment-17313963 ] Michael Sokolov commented on LUCENE-9855: - > If we'd like to have another ANN algorithm, say

[jira] [Commented] (LUCENE-9855) Reconsider codec name VectorFormat

2021-04-02 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17313940#comment-17313940 ] Michael Sokolov commented on LUCENE-9855: - I think it will be helpful to consider how we would

[jira] [Comment Edited] (LUCENE-9877) Explore increasing the allowable exceptions in PForUtil

2021-03-30 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17311684#comment-17311684 ] Michael Sokolov edited comment on LUCENE-9877 at 3/30/21, 5:38 PM: --- > 

[jira] [Commented] (LUCENE-9877) Explore increasing the allowable exceptions in PForUtil

2021-03-30 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17311684#comment-17311684 ] Michael Sokolov commented on LUCENE-9877: - > I was able to setup ant/JDK8 and made sure "ant

[jira] [Commented] (LUCENE-9855) Reconsider codec name VectorFormat

2021-03-30 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17311671#comment-17311671 ] Michael Sokolov commented on LUCENE-9855: - I'm making this a Blocker for 9.0, because we either

[jira] [Updated] (LUCENE-9855) Reconsider codec name VectorFormat

2021-03-30 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Sokolov updated LUCENE-9855: Priority: Blocker (was: Minor) > Reconsider codec name VectorFormat >

[jira] [Commented] (LUCENE-9855) Reconsider codec name VectorFormat

2021-03-22 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17306324#comment-17306324 ] Michael Sokolov commented on LUCENE-9855: - Thanks for picking this up, [~tomoko] - I'll add the

[jira] [Commented] (LUCENE-9615) Expose HnswGraphBuilder index-time hyperparameters

2021-03-16 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17302669#comment-17302669 ] Michael Sokolov commented on LUCENE-9615: - Thanks [~shubhambeniwal]! > Expose HnswGraphBuilder

[jira] [Resolved] (LUCENE-9639) Add unit tests for SimpleTextVector format

2021-03-16 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Sokolov resolved LUCENE-9639. - Resolution: Fixed Thanks, [~zacharymorn]! > Add unit tests for SimpleTextVector format

[jira] [Resolved] (LUCENE-9615) Expose HnswGraphBuilder index-time hyperparameters

2021-03-16 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Sokolov resolved LUCENE-9615. - Resolution: Fixed > Expose HnswGraphBuilder index-time hyperparameters >

[jira] [Resolved] (LUCENE-9679) Try using Math.fma to speed up vector computations

2021-03-16 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Sokolov resolved LUCENE-9679. - Resolution: Won't Fix this was never a great idea, and now we have other patches

[jira] [Commented] (LUCENE-9845) Improve encoding of HNSW graph offsets

2021-03-16 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17302631#comment-17302631 ] Michael Sokolov commented on LUCENE-9845: - It does sound tempting to attack the "dense" case,
