[ https://issues.apache.org/jira/browse/LUCENE-10057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17401661#comment-17401661 ]
Michael Sokolov edited comment on LUCENE-10057 at 8/19/21, 12:24 PM:
---------------------------------------------------------------------

Oh, did my stab at this not work? I was unable to reproduce so I wasn't sure ... Thank you for hacking at it, [~dweiss]. Your patches LGTM. I don't think I understand where the issue was coming from and why this fixed it though.

UPDATE - I read the thread on the mailing list that explains we have a fix for how to unmap mmapped files in Directory/IndexInput, and using those classes lets the demo take advantage of it. Thanks for looking into it [~uschindler]

Re: source data for the vectors ... I'm not sure what you mean there; these are a small sample of the (from our perspective precomputed) embeddings downloaded from https://nlp.stanford.edu/projects/glove/ (there is something about it in the package-info.java). Originally they were arrived at by training on a large corpus of text (I think these are from a collection of 6B twitter and other texts).
> Replace direct mmaped buffer with Lucene abstractions in KnnVectorDict
> ----------------------------------------------------------------------
>
>                 Key: LUCENE-10057
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10057
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Dawid Weiss
>            Priority: Major
>         Attachments: LUCENE-10057.patch