[jira] [Commented] (LUCENE-10471) Increase the number of dims for KNN vectors to 2048

2022-08-15 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17579875#comment-17579875 ] Robert Muir commented on LUCENE-10471: -- It is also terrible that this issue says 2048 but somehow

[jira] [Commented] (LUCENE-10471) Increase the number of dims for KNN vectors to 2048

2022-08-15 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17579874#comment-17579874 ] Robert Muir commented on LUCENE-10471: -- My main concern is that it can't be undone, as i

[jira] [Assigned] (LUCENE-10423) Remove uses of wall-clock time in codebase

2022-08-15 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir reassigned LUCENE-10423: Assignee: Marios Trivyzas > Remove uses of wall-clock time in codebase >

[jira] [Commented] (LUCENE-10423) Remove uses of wall-clock time in codebase

2022-08-15 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17579678#comment-17579678 ] Robert Muir commented on LUCENE-10423: -- The comment is slightly incorrect. nanoTime should still

[jira] [Commented] (LUCENE-10677) Duplicate strings in FieldInfo#attributes contribute significantly to heap usage at scale

2022-08-09 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577309#comment-17577309 ] Robert Muir commented on LUCENE-10677: -- I'm opposed to the use of string.intern by the lucene

[jira] [Commented] (LUCENE-10676) FieldInfo#name contributes significantly to heap usage at scale

2022-08-09 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577304#comment-17577304 ] Robert Muir commented on LUCENE-10676: -- I'm opposed to use of String.intern here. The problem

[jira] [Commented] (LUCENE-10423) Remove uses of wall-clock time in codebase

2022-07-30 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17573272#comment-17573272 ] Robert Muir commented on LUCENE-10423: -- Certainly not. No reason to drag any third party library

[jira] [Commented] (LUCENE-10662) Make LuceneTestCase to not extend from org.junit.Assert

2022-07-29 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17572929#comment-17572929 ] Robert Muir commented on LUCENE-10662: -- I think, when we first exported the test-framework, there

[jira] [Commented] (LUCENE-10662) Make LuceneTestCase to not extend from org.junit.Assert

2022-07-29 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17572927#comment-17572927 ] Robert Muir commented on LUCENE-10662: -- I don't really understand the benefit of this change to

[jira] [Commented] (LUCENE-10471) Increase the number of dims for KNN vectors to 2048

2022-07-14 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17566874#comment-17566874 ] Robert Muir commented on LUCENE-10471: -- The problem is that nobody will ever want to *reduce* the

[jira] [Commented] (LUCENE-10577) Quantize vector values

2022-07-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17566223#comment-17566223 ] Robert Muir commented on LUCENE-10577: -- By the way, if the right answer is that really different

[jira] [Commented] (LUCENE-10577) Quantize vector values

2022-07-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17566220#comment-17566220 ] Robert Muir commented on LUCENE-10577: -- {quote} I tried looking at how DocValues are handling this

[jira] [Commented] (LUCENE-10471) Increase the number of dims for KNN vectors to 2048

2022-07-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17566212#comment-17566212 ] Robert Muir commented on LUCENE-10471: -- My questions are still unanswered. Please don't merge the

[jira] [Commented] (LUCENE-10627) Using CompositeByteBuf to Reduce Memory Copy

2022-07-07 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17563805#comment-17563805 ] Robert Muir commented on LUCENE-10627: -- Yes we have to stop another PagedBytes/ByteBlockPool from

[jira] [Commented] (LUCENE-10396) Automatically create sparse indexes for sort fields

2022-06-28 Thread Robert Muir (Jira)
Title: Message Title Robert Muir

[jira] [Commented] (LUCENE-10577) Quantize vector values

2022-06-21 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17556848#comment-17556848 ] Robert Muir commented on LUCENE-10577: -- Seems like the codec API needs to be fixed so that ppl can

[jira] [Commented] (LUCENE-10577) Quantize vector values

2022-06-15 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17554900#comment-17554900 ] Robert Muir commented on LUCENE-10577: -- I don't understand why we are doing this as yet another

[jira] [Commented] (LUCENE-10615) Add license information for SmartChineseAnalyzer to NOTICE.txt

2022-06-14 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17554070#comment-17554070 ] Robert Muir commented on LUCENE-10615: -- > Please don't use jira for questions like this. > Add

[jira] [Resolved] (LUCENE-10615) Add license information for SmartChineseAnalyzer to NOTICE.txt

2022-06-14 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-10615. -- Resolution: Invalid Please don't use jira for questions like this. We won't be adding

[jira] [Commented] (LUCENE-10610) RunAutomaton#hashCode() can easily cause hash collision for different Automatons

2022-06-10 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17552947#comment-17552947 ] Robert Muir commented on LUCENE-10610: -- and for the same reason, again, we can do something else

[jira] [Commented] (LUCENE-10610) RunAutomaton#hashCode() can easily cause hash collision for different Automatons

2022-06-10 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17552943#comment-17552943 ] Robert Muir commented on LUCENE-10610: -- it is much more complicated. I really don't think we

[jira] [Commented] (LUCENE-10610) RunAutomaton#hashCode() can easily cause hash collision for different Automatons

2022-06-10 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17552861#comment-17552861 ] Robert Muir commented on LUCENE-10610: -- Also i honestly think the current hashcode based on

[jira] [Commented] (LUCENE-10610) RunAutomaton#hashCode() can easily cause hash collision for different Automatons

2022-06-10 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17552856#comment-17552856 ] Robert Muir commented on LUCENE-10610: -- A simple/fast improvement might be to incorporate

[jira] [Commented] (LUCENE-10610) RunAutomaton#hashCode() can easily cause hash collision for different Automatons

2022-06-10 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17552827#comment-17552827 ] Robert Muir commented on LUCENE-10610: -- what uses this hashcode (anything?). Let's please not go

[jira] [Commented] (LUCENE-10602) Dynamic Index Cache Sizing

2022-06-09 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17552394#comment-17552394 ] Robert Muir commented on LUCENE-10602: -- thanks for giving the context. I was confused whether

[jira] [Commented] (LUCENE-10602) Dynamic Index Cache Sizing

2022-06-07 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17551355#comment-17551355 ] Robert Muir commented on LUCENE-10602: -- well the default impl is a singleton, so even if you have

[jira] [Commented] (LUCENE-10602) Dynamic Index Cache Sizing

2022-06-07 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17551337#comment-17551337 ] Robert Muir commented on LUCENE-10602: -- I'm confused about "give GBs of heap back" when the

[jira] [Commented] (LUCENE-10598) SortedSetDocValues#docValueCount() should be always greater than zero

2022-05-30 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17544079#comment-17544079 ] Robert Muir commented on LUCENE-10598: -- I think we should validate this count in CheckIndex,

[jira] [Commented] (LUCENE-10574) Remove O(n^2) from TieredMergePolicy or change defaults to one that doesn't do this

2022-05-20 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17540283#comment-17540283 ] Robert Muir commented on LUCENE-10574: -- The benchmark was originally some code written to

[jira] [Resolved] (LUCENE-10579) fix smoketester backwards-check to not parse stdout

2022-05-20 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-10579. -- Resolution: Fixed > fix smoketester backwards-check to not parse stdout >

[jira] [Comment Edited] (LUCENE-7161) TestMoreLikeThis.testMultiFieldShouldReturnPerFieldBooleanQuery assertion error

2022-05-19 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-7161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17539601#comment-17539601 ] Robert Muir edited comment on LUCENE-7161 at 5/19/22 3:03 PM: -- I beasted it

[jira] [Commented] (LUCENE-7161) TestMoreLikeThis.testMultiFieldShouldReturnPerFieldBooleanQuery assertion error

2022-05-19 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-7161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17539601#comment-17539601 ] Robert Muir commented on LUCENE-7161: - I beasted it and can reproduce it like this: {noformat} $ git

[jira] [Commented] (LUCENE-10312) Add PersianStemmer

2022-05-19 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17539517#comment-17539517 ] Robert Muir commented on LUCENE-10312: -- There need not be backwards compatibility issues: Just add

[jira] [Commented] (LUCENE-10569) Think again about the floor segment size?

2022-05-18 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17539116#comment-17539116 ] Robert Muir commented on LUCENE-10569: -- I agree. same with the stored fields stuff too. I'd love

[jira] [Updated] (LUCENE-10579) fix smoketester backwards-check to not parse stdout

2022-05-18 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-10579: - Fix Version/s: 9.3 > fix smoketester backwards-check to not parse stdout >

[jira] [Commented] (LUCENE-10579) fix smoketester backwards-check to not parse stdout

2022-05-18 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17539050#comment-17539050 ] Robert Muir commented on LUCENE-10579: -- or even maybe a gradle status update with its escape

[jira] [Commented] (LUCENE-10579) fix smoketester backwards-check to not parse stdout

2022-05-18 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17539042#comment-17539042 ] Robert Muir commented on LUCENE-10579: -- There's all kinds of stuff being printed, but this gives

[jira] [Commented] (LUCENE-10579) fix smoketester backwards-check to not parse stdout

2022-05-18 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17539040#comment-17539040 ] Robert Muir commented on LUCENE-10579: -- I attached compressed file of what the smoketester is

[jira] [Updated] (LUCENE-10579) fix smoketester backwards-check to not parse stdout

2022-05-18 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-10579: - Attachment: backwards.log.gz > fix smoketester backwards-check to not parse stdout >

[jira] [Created] (LUCENE-10579) fix smoketester backwards-check to not parse stdout

2022-05-18 Thread Robert Muir (Jira)
Robert Muir created LUCENE-10579: Summary: fix smoketester backwards-check to not parse stdout Key: LUCENE-10579 URL: https://issues.apache.org/jira/browse/LUCENE-10579 Project: Lucene - Core

[jira] [Commented] (LUCENE-10574) Remove O(n^2) from TieredMergePolicy or change defaults to one that doesn't do this

2022-05-18 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17538780#comment-17538780 ] Robert Muir commented on LUCENE-10574: -- Yes, that's awesome. I think if we go with this PR, let's

[jira] [Commented] (LUCENE-10574) Remove O(n^2) from TieredMergePolicy or change defaults to one that doesn't do this

2022-05-18 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17538767#comment-17538767 ] Robert Muir commented on LUCENE-10574: -- what is "flushed 3 by 3". flushing 3 docs at a time with a

[jira] [Commented] (LUCENE-10578) Make minimum required Java version for build more specific

2022-05-18 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17538749#comment-17538749 ] Robert Muir commented on LUCENE-10578: -- 1. fail, there is only fail. warnings are useless. 2. see

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-18 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17538742#comment-17538742 ] Robert Muir commented on LUCENE-10572: -- I don't think we should recommend the user that. Where is

[jira] [Commented] (LUCENE-10577) Quantize vector values

2022-05-17 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17538499#comment-17538499 ] Robert Muir commented on LUCENE-10577: -- Well, but comparing latency to the current dog-slow

[jira] [Commented] (LUCENE-10577) Quantize vector values

2022-05-17 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17538474#comment-17538474 ] Robert Muir commented on LUCENE-10577: -- My main concern with some custom encoding would be if it

[jira] [Commented] (LUCENE-10577) Quantize vector values

2022-05-17 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17538459#comment-17538459 ] Robert Muir commented on LUCENE-10577: -- the actual operations you want to do need to be supported.

[jira] [Commented] (LUCENE-10577) Quantize vector values

2022-05-17 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17538456#comment-17538456 ] Robert Muir commented on LUCENE-10577: -- at least for fp16 we see some movement on openjdk (open

[jira] [Commented] (LUCENE-10577) Quantize vector values

2022-05-17 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17538451#comment-17538451 ] Robert Muir commented on LUCENE-10577: -- I think a 2-byte float would be a better design than

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-17 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17538211#comment-17538211 ] Robert Muir commented on LUCENE-10572: -- These measurements are also going to be strange because of

[jira] [Commented] (LUCENE-10574) Remove O(n^2) from TieredMergePolicy or change defaults to one that doesn't do this

2022-05-17 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17538194#comment-17538194 ] Robert Muir commented on LUCENE-10574: -- I think another approach is to actually remove the

[jira] [Commented] (LUCENE-10574) Remove O(n^2) from TieredMergePolicy or change defaults to one that doesn't do this

2022-05-16 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17537453#comment-17537453 ] Robert Muir commented on LUCENE-10574: -- Seems like this issue definitely isn't fixed as long as

[jira] [Resolved] (LUCENE-10569) Think again about the floor segment size?

2022-05-16 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-10569. -- Resolution: Won't Fix I'm closing this as won't fix because it isn't enough to change a

[jira] [Updated] (LUCENE-10574) Remove O(n^2) from TieredMergePolicy or change defaults to one that doesn't do this

2022-05-16 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-10574: - Description: Remove {{floorSegmentBytes}} parameter, or change lucene's default to a merge

[jira] [Created] (LUCENE-10574) Remove O(n^2) from TieredMergePolicy or change defaults to one that doesn't do this

2022-05-16 Thread Robert Muir (Jira)
Robert Muir created LUCENE-10574: Summary: Remove O(n^2) from TieredMergePolicy or change defaults to one that doesn't do this Key: LUCENE-10574 URL: https://issues.apache.org/jira/browse/LUCENE-10574

[jira] [Commented] (LUCENE-10573) Improve stored fields bulk merge for degenerate O(n^2) merges

2022-05-16 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17537437#comment-17537437 ] Robert Muir commented on LUCENE-10573: -- Fix TieredMergePolicy, don't hack around its shortcomings

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-14 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17537018#comment-17537018 ] Robert Muir commented on LUCENE-10572: -- {quote} Most terms are shorter than 128 bytes in normal

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536942#comment-17536942 ] Robert Muir commented on LUCENE-10572: -- {quote} I know that OpenJDK is heavily tested on big

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536918#comment-17536918 ] Robert Muir commented on LUCENE-10572: -- {quote} In BytesRefHash we have one big endian variant,

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536917#comment-17536917 ] Robert Muir commented on LUCENE-10572: -- And by the way, I'm fine also with just using

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536915#comment-17536915 ] Robert Muir commented on LUCENE-10572: -- Well, banning it doesn't matter so much to me, if we

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536885#comment-17536885 ] Robert Muir commented on LUCENE-10572: -- ok, i like your suggestion actually. it solves my issue,

[jira] [Commented] (LUCENE-10572) Can we optimize BytesRefHash?

2022-05-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536877#comment-17536877 ] Robert Muir commented on LUCENE-10572: -- {quote} With the swapping of bytes I remember the other

[jira] [Commented] (LUCENE-10556) Relax the maximum dirtiness for stored fields and term vectors?

2022-05-12 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536233#comment-17536233 ] Robert Muir commented on LUCENE-10556: -- Yes, the original code i wrote for this thing was really

[jira] [Commented] (LUCENE-10556) Relax the maximum dirtiness for stored fields and term vectors?

2022-05-12 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536219#comment-17536219 ] Robert Muir commented on LUCENE-10556: -- maybe we could try playing instead with the second

[jira] [Commented] (LUCENE-10561) Reduce class/member visibility of ArabicStemmer, ArabicNormalizer, and PersianNormalizer

2022-05-11 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17535862#comment-17535862 ] Robert Muir commented on LUCENE-10561: -- I think i didn't explain my suggestion well enough. When i

[jira] [Commented] (LUCENE-10561) Reduce class/member visibility of ArabicStemmer, ArabicNormalizer, and PersianNormalizer

2022-05-11 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17535830#comment-17535830 ] Robert Muir commented on LUCENE-10561: -- Some of the tokenizers are auto-generated. For example

[jira] [Commented] (LUCENE-9409) TestAllFilesDetectTruncation failures

2022-05-11 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17534878#comment-17534878 ] Robert Muir commented on LUCENE-9409: - the test also doesn't account for the case that you might

[jira] [Commented] (LUCENE-9356) Add tests for corruptions caused by byte flips

2022-05-09 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17534126#comment-17534126 ] Robert Muir commented on LUCENE-9356: - I think the "problem" in current test is the inherent

[jira] [Commented] (LUCENE-9356) Add tests for corruptions caused by byte flips

2022-05-09 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17534124#comment-17534124 ] Robert Muir commented on LUCENE-9356: - i'd disable compound file as a first pass on any test too.

[jira] [Commented] (LUCENE-9356) Add tests for corruptions caused by byte flips

2022-05-09 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17534118#comment-17534118 ] Robert Muir commented on LUCENE-9356: - mulling on it more, this to me seems like the way to go.

[jira] [Commented] (LUCENE-9356) Add tests for corruptions caused by byte flips

2022-05-09 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17534117#comment-17534117 ] Robert Muir commented on LUCENE-9356: - by the way, if we just want to improve the exception path,

[jira] [Commented] (LUCENE-9356) Add tests for corruptions caused by byte flips

2022-05-09 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17534116#comment-17534116 ] Robert Muir commented on LUCENE-9356: - The test seems wrong to me, for example it does not consider

[jira] [Resolved] (LUCENE-10532) Remove @Slow annotation

2022-05-09 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-10532. -- Fix Version/s: 9.2 Resolution: Fixed > Remove @Slow annotation >

[jira] [Commented] (LUCENE-10471) Increase the number of dims for KNN vectors to 2048

2022-05-09 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17534086#comment-17534086 ] Robert Muir commented on LUCENE-10471: -- I think the major problem is still no Vector API in the

[jira] [Commented] (LUCENE-10551) LowercaseAsciiCompression should return false when it's unable to compress

2022-05-09 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17534074#comment-17534074 ] Robert Muir commented on LUCENE-10551: -- Thanks [~irislpx] for solving the mystery. Maybe there

[jira] [Commented] (LUCENE-10561) Reduce member visibility of ArabicStemmer, ArabicNormalizer, and PersianNormalizer

2022-05-07 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17533306#comment-17533306 ] Robert Muir commented on LUCENE-10561: -- you can just make the entire classes package private.

[jira] [Commented] (LUCENE-10556) Relax the maximum dirtiness for stored fields and term vectors?

2022-05-04 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531676#comment-17531676 ] Robert Muir commented on LUCENE-10556: -- {quote} Currently the logic is to recompress if more than

[jira] [Commented] (LUCENE-10216) Add concurrency to addIndexes(CodecReader…) API

2022-05-01 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17530504#comment-17530504 ] Robert Muir commented on LUCENE-10216: -- {quote} I've also modified the MockRandomMergePolicy to

[jira] [Commented] (LUCENE-10543) Achieve contribution workflow perfection (with progress)

2022-04-29 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17530339#comment-17530339 ] Robert Muir commented on LUCENE-10543: -- IMO it just creates friction. having to sign-up for an

[jira] [Updated] (LUCENE-10543) Achieve contribution workflow perfection (with progress)

2022-04-29 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-10543: - Attachment: Screen_Shot_2022-04-30_at_01.15.00.png > Achieve contribution workflow perfection

[jira] [Commented] (LUCENE-10542) FieldSource exists implementation can avoid value retrieval

2022-04-28 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529580#comment-17529580 ] Robert Muir commented on LUCENE-10542: -- Hi [~krisden], I think there might be more optimizations

[jira] [Commented] (LUCENE-10543) Achieve contribution workflow perfection (with progress)

2022-04-28 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529270#comment-17529270 ] Robert Muir commented on LUCENE-10543: -- It was entirely too difficult to find the issue! I knew it

[jira] [Commented] (LUCENE-9871) Achieve build system perfection (with progress)

2022-04-28 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529263#comment-17529263 ] Robert Muir commented on LUCENE-9871: - just a reminder, haven't forgot about java version

[jira] [Commented] (LUCENE-10543) Achieve contribution workflow perfection (with progress)

2022-04-28 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529245#comment-17529245 ] Robert Muir commented on LUCENE-10543: -- Believe it or not, there's actually no "link" to the

[jira] [Commented] (LUCENE-10543) Achieve contribution workflow perfection (with progress)

2022-04-28 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529240#comment-17529240 ] Robert Muir commented on LUCENE-10543: -- another idea, add a simple "fork me on github" to the

[jira] [Commented] (LUCENE-10543) Achieve contribution workflow perfection (with progress)

2022-04-28 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529236#comment-17529236 ] Robert Muir commented on LUCENE-10543: -- another idea is to use github wiki functionality vs the

[jira] [Created] (LUCENE-10543) Achieve contribution workflow perfection (with progress)

2022-04-28 Thread Robert Muir (Jira)
Robert Muir created LUCENE-10543: Summary: Achieve contribution workflow perfection (with progress) Key: LUCENE-10543 URL: https://issues.apache.org/jira/browse/LUCENE-10543 Project: Lucene - Core

[jira] [Commented] (LUCENE-10541) What to do about massive terms in our Wikipedia EN LineFileDocs?

2022-04-27 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528912#comment-17528912 ] Robert Muir commented on LUCENE-10541: -- I think the problem is this line in MockTokenizer:

[jira] [Resolved] (LUCENE-10525) Improve WindowsFS emulation to catch directory names with : in them (which is not allowed)

2022-04-27 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-10525. -- Fix Version/s: 9.2 Resolution: Fixed Thank you [~gworah] > Improve WindowsFS

[jira] [Commented] (LUCENE-10528) TestScripts.testLukeCanBeLaunched creates X Window when running the tests

2022-04-23 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526846#comment-17526846 ] Robert Muir commented on LUCENE-10528: -- I opened followup: LUCENE-10532 >

[jira] [Created] (LUCENE-10532) Remove @Slow annotation

2022-04-23 Thread Robert Muir (Jira)
Robert Muir created LUCENE-10532: Summary: Remove @Slow annotation Key: LUCENE-10532 URL: https://issues.apache.org/jira/browse/LUCENE-10532 Project: Lucene - Core Issue Type: Task

[jira] [Resolved] (LUCENE-10528) TestScripts.testLukeCanBeLaunched creates X Window when running the tests

2022-04-23 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-10528. -- Fix Version/s: 9.2 Resolution: Fixed > TestScripts.testLukeCanBeLaunched creates X

[jira] [Commented] (LUCENE-10528) TestScripts.testLukeCanBeLaunched creates X Window when running the tests

2022-04-23 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526844#comment-17526844 ] Robert Muir commented on LUCENE-10528: -- I merged this PR so that i can cleanly -1 any @Slow usage
