[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16804310#comment-16804310
]
Andy Hind commented on LUCENE-6968:
---
[~mayyas], in answer to your questions:
1) Depends on your view
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783921#comment-16783921
]
Mayya Sharipova commented on LUCENE-6968:
-
[~andyhind] Thanks very much for your answer, it made
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737192#comment-16737192
]
Andy Hind commented on LUCENE-6968:
---
[~mayyas] Hi Mayya, there is a good review paper here
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16706571#comment-16706571
]
Mayya Sharipova commented on LUCENE-6968:
-
[~andyhind] Hello Andy! I have several questions
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15601387#comment-15601387
]
Tommaso Teofili commented on LUCENE-6968:
-
[~yo...@apache.org] the _MinHash_ filter can be
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15599783#comment-15599783
]
Yonik Seeley commented on LUCENE-6968:
--
Is there a JIRA issue yet to expose (and test) this and
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15414895#comment-15414895
]
Varun Thacker commented on LUCENE-6968:
---
Hi Tommaso,
I think we need to fix the CHANGES entry to
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15348025#comment-15348025
]
Tommaso Teofili commented on LUCENE-6968:
-
I've backported this to branch 6.x for inclusion in
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15348023#comment-15348023
]
ASF subversion and git services commented on LUCENE-6968:
-
Commit
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15333646#comment-15333646
]
ASF subversion and git services commented on LUCENE-6968:
-
Commit
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15331794#comment-15331794
]
Tommaso Teofili commented on LUCENE-6968:
-
yes, I plan to merge it to 6.x, I wanted to have a few
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330032#comment-15330032
]
Andy Hind commented on LUCENE-6968:
---
Hi Tommaso - are you planning to merge this to 6.x?
> LSH Filter
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15329206#comment-15329206
]
Tommaso Teofili commented on LUCENE-6968:
-
I've committed this, thanks to [~andyhind] and
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15329205#comment-15329205
]
ASF subversion and git services commented on LUCENE-6968:
-
Commit
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328054#comment-15328054
]
Andy Hind commented on LUCENE-6968:
---
Hi Tommaso, the MinHashFilterTest was running fine. It was
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15275123#comment-15275123
]
Tommaso Teofili commented on LUCENE-6968:
-
thanks a lot Andy, it generally looks much better, I
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15274697#comment-15274697
]
Andy Hind commented on LUCENE-6968:
---
I have attached an updated patch.
This addresses the following
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263867#comment-15263867
]
Andy Hind commented on LUCENE-6968:
---
After a bit more digging, the single hash and keeping the minimum
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262918#comment-15262918
]
Andy Hind commented on LUCENE-6968:
---
[~yo...@apache.org] has murmurhash3_x64_128 here
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262169#comment-15262169
]
Andy Hind commented on LUCENE-6968:
---
I agree a pure token stream test makes sense. The only concern I
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15261997#comment-15261997
]
Robert Muir commented on LUCENE-6968:
-
We don't need any "integration tests" or "end to end tests".
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15261990#comment-15261990
]
Tommaso Teofili commented on LUCENE-6968:
-
thanks Robert, I agree on all of your suggestions.
The
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15261961#comment-15261961
]
Robert Muir commented on LUCENE-6968:
-
also, these analyzers should not be tested with queries.
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15261955#comment-15261955
]
Robert Muir commented on LUCENE-6968:
-
analysis-common library cannot have any external dependencies.
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15261480#comment-15261480
]
Cao Manh Dat commented on LUCENE-6968:
--
Thanks, that makes sense!
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15259924#comment-15259924
]
Andy Hind commented on LUCENE-6968:
---
This comes down to "what is a good estimate of |A U B|" and do we
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15257732#comment-15257732
]
Cao Manh Dat commented on LUCENE-6968:
--
Thanks for the link. I totally agree that keeping some
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15256114#comment-15256114
]
Andy Hind commented on LUCENE-6968:
---
The argument here says it is pretty much the same.
{code}
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15255435#comment-15255435
]
Cao Manh Dat commented on LUCENE-6968:
--
What a wonderful patch. The code is optimized, sure that
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15255434#comment-15255434
]
Cao Manh Dat commented on LUCENE-6968:
--
What a wonderful patch. The code is optimized, sure that
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15252743#comment-15252743
]
Andy Hind commented on LUCENE-6968:
---
Hi
It would be quite common to use min hashing after shingling.
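As a rough illustration of min hashing applied after shingling, here is a minimal Python sketch. It is not the code from the patch: the function names (`shingles`, `minhash_signature`, `estimate_jaccard`), the word-level shingle size, the number of hash functions, and the use of seeded BLAKE2b as the hash family are all assumptions made for the example.

```python
import hashlib


def shingles(text, size=2):
    """Return the set of word shingles (runs of `size` consecutive words)."""
    words = text.lower().split()
    if len(words) < size:
        # Too short to form a full shingle: fall back to the whole text.
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def _hash(shingle, seed):
    """Hypothetical seeded 64-bit hash; each seed acts as one hash function."""
    data = f"{seed}:{shingle}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=64):
    """Keep, for every hash function, the minimum hash over all shingles."""
    if not items:
        raise ValueError("cannot build a signature for an empty set")
    return [min(_hash(s, seed) for s in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of positions where the two signatures agree."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


if __name__ == "__main__":
    a = minhash_signature(shingles("the quick brown fox jumps over the lazy dog"))
    b = minhash_signature(shingles("the quick brown fox jumped over the lazy dog"))
    print(estimate_jaccard(a, b))
```

The fraction of matching signature positions approximates the Jaccard similarity of the two shingle sets, so near-duplicate texts score close to 1 and unrelated texts close to 0.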
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15252122#comment-15252122
]
Joel Bernstein commented on LUCENE-6968:
Hi, I believe there will be some work forthcoming from
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15252065#comment-15252065
]
Tommaso Teofili commented on LUCENE-6968:
-
[~caomanhdat], I'd like to commit this patch shortly,
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195272#comment-15195272
]
Tommaso Teofili commented on LUCENE-6968:
-
I was having a second look at this patch, why is
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15147204#comment-15147204
]
Tommaso Teofili commented on LUCENE-6968:
-
since this has also been mentioned in SOLR-7739 maybe
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15093091#comment-15093091
]
Cao Manh Dat commented on LUCENE-6968:
--
Yes, it is kind of like finding the K nearest neighbors. But there a
[
https://issues.apache.org/jira/browse/LUCENE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15091901#comment-15091901
]
Joel Bernstein commented on LUCENE-6968:
[~caomanhdat], interesting ticket!
Can you describe
37 matches