[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-18 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17538742#comment-17538742 ] Robert Muir commented on LUCENE-10572: I don't think we should recommend the user that. Where is

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-18 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17538598#comment-17538598 ] Uwe Schindler commented on LUCENE-10572: bq. The stopwords are going to skew everything. If

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-17 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17538211#comment-17538211 ] Robert Muir commented on LUCENE-10572: These measurements are also going to be strange because of

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-17 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17538198#comment-17538198 ] Uwe Schindler commented on LUCENE-10572: If we have 2 hash tables, we could have one for short

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-17 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17538131#comment-17538131 ] Adrien Grand commented on LUCENE-10572: If this is memory-bound, I wonder if we could get

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-16 Thread Michael McCandless (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17537577#comment-17537577 ] Michael McCandless commented on LUCENE-10572: OK I ran a simple {{luceneutil}} benchmark,

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-16 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17537477#comment-17537477 ] Uwe Schindler commented on LUCENE-10572: BytesRefHash has this field already: {{int[]

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-16 Thread Dawid Weiss (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17537410#comment-17537410 ] Dawid Weiss commented on LUCENE-10572: > Nevertheless, the main limiting factor of the

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-16 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17537405#comment-17537405 ] Uwe Schindler commented on LUCENE-10572: Hi Dawid, that's a nice issue. When looking at the

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-16 Thread Dawid Weiss (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17537354#comment-17537354 ] Dawid Weiss commented on LUCENE-10572: This is the issue I filed it under, actually - note it's

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-16 Thread Dawid Weiss (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17537353#comment-17537353 ] Dawid Weiss commented on LUCENE-10572: As much as I love BE (long live M68k), I think it's

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-14 Thread Michael McCandless (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17537023#comment-17537023 ] Michael McCandless commented on LUCENE-10572: {quote}Mike, could you make a test on how

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-14 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17537018#comment-17537018 ] Robert Muir commented on LUCENE-10572: {quote} Most terms are shorter than 128 bytes in normal

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-14 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17537013#comment-17537013 ] Uwe Schindler commented on LUCENE-10572: Mike, could you make a test on how much memory

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-14 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17537009#comment-17537009 ] Uwe Schindler commented on LUCENE-10572: Hi, actually the reason why BE encoding was used in

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536942#comment-17536942 ] Robert Muir commented on LUCENE-10572: {quote} I know that OpenJDK is heavily tested on big

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-13 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536937#comment-17536937 ] Uwe Schindler commented on LUCENE-10572: I removed the vInt-like encoding in ByteBlockPool and

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-13 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536933#comment-17536933 ] Uwe Schindler commented on LUCENE-10572: bq. Playing with BytesRefHash and ByteBlockPool

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-13 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536931#comment-17536931 ] Uwe Schindler commented on LUCENE-10572: bq. Today it's swapping bytes on the Intel and ARM

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-13 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536927#comment-17536927 ] Uwe Schindler commented on LUCENE-10572: Here is a draft PR about the idea. I just changed LZ4

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-13 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536921#comment-17536921 ] Uwe Schindler commented on LUCENE-10572: Yeah exactly, sometimes BE is used to allow to sort

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536918#comment-17536918 ] Robert Muir commented on LUCENE-10572: {quote} In BytesRefHash we have one big endian variant,

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536917#comment-17536917 ] Robert Muir commented on LUCENE-10572: And by the way, I'm fine also with just using

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-13 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536916#comment-17536916 ] Uwe Schindler commented on LUCENE-10572: Hi, I have a PR almost ready. In my comment above I

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536915#comment-17536915 ] Robert Muir commented on LUCENE-10572: Well, banning it doesn't matter so much to me, if we

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-13 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536907#comment-17536907 ] Uwe Schindler commented on LUCENE-10572: bq. We just use our own constant, but then we can

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-13 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536905#comment-17536905 ] Uwe Schindler commented on LUCENE-10572: bq. ok, i like your suggestion actually. it solves my

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536885#comment-17536885 ] Robert Muir commented on LUCENE-10572: ok, i like your suggestion actually. it solves my issue,

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-13 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536883#comment-17536883 ] Uwe Schindler commented on LUCENE-10572: Hey, I agree with the length encoding. This is indeed

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536877#comment-17536877 ] Robert Muir commented on LUCENE-10572: {quote} With the swapping of bytes I remember the other

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-13 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536855#comment-17536855 ] Uwe Schindler commented on LUCENE-10572: With the swapping of bytes I remember the other issue