[jira] [Created] (LUCENE-10067) investigate 6/23/2001 -> 6/24/2001 drop in facets perf

2021-08-24 Thread Robert Muir (Jira)
Robert Muir created LUCENE-10067: Summary: investigate 6/23/2001 -> 6/24/2001 drop in facets perf Key: LUCENE-10067 URL: https://issues.apache.org/jira/browse/LUCENE-10067 Project: Lucene - Core

[jira] [Commented] (LUCENE-10062) Explore using SORTED_NUMERIC doc values to encode taxonomy ordinals for faceting

2021-08-24 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17403885#comment-17403885 ] Robert Muir commented on LUCENE-10062: -- +1 for this experiment. I think you are correct: some

[jira] [Commented] (LUCENE-9917) Reduce block size for BEST_SPEED

2021-08-24 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17403836#comment-17403836 ] Robert Muir commented on LUCENE-9917: - I prefer a balanced approach (like this PR) for the default

[jira] [Resolved] (LUCENE-10048) Bypass total frequency check if field uses custom term frequency

2021-08-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-10048. -- Resolution: Won't Fix > Bypass total frequency check if field uses custom term frequency >

[jira] [Commented] (LUCENE-10048) Bypass total frequency check if field uses custom term frequency

2021-08-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17398896#comment-17398896 ] Robert Muir commented on LUCENE-10048: -- no, you still don't get it. please read the PR that mike

[jira] [Commented] (LUCENE-10048) Bypass total frequency check if field uses custom term frequency

2021-08-12 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17398358#comment-17398358 ] Robert Muir commented on LUCENE-10048: -- this doesn't explain any use case to me. it sounds like

[jira] [Commented] (LUCENE-10048) Bypass total frequency check if field uses custom term frequency

2021-08-11 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17397782#comment-17397782 ] Robert Muir commented on LUCENE-10048: -- btw, there are a lot of alternatives you can look into to

[jira] [Commented] (LUCENE-10048) Bypass total frequency check if field uses custom term frequency

2021-08-11 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17397774#comment-17397774 ] Robert Muir commented on LUCENE-10048: -- Whether or not you use the similarity or scorer is

[jira] [Commented] (LUCENE-9937) ann-benchmarks results for HNSW search

2021-08-11 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17397555#comment-17397555 ] Robert Muir commented on LUCENE-9937: - we can help some of those hotspots with vectorization etc

[jira] [Commented] (LUCENE-10048) Bypass total frequency check if field uses custom term frequency

2021-08-10 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17396981#comment-17396981 ] Robert Muir commented on LUCENE-10048: -- It has been brought up before, I don't think we should

[jira] [Commented] (LUCENE-10033) Encode doc values in smaller blocks of values, like postings

2021-07-29 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389841#comment-17389841 ] Robert Muir commented on LUCENE-10033: -- {quote} Queries that consume most values like the Browse*

[jira] [Commented] (LUCENE-10016) VectorReader.search needs rethought, o.a.l.search integration?

2021-07-28 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17388868#comment-17388868 ] Robert Muir commented on LUCENE-10016: -- Even if it isn't in the o.a.l.demo module, a simple test

[jira] [Resolved] (LUCENE-9959) Can we remove threadlocals of stored fields and term vectors

2021-07-19 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-9959. - Resolution: Won't Fix was a terrible idea that just made APIs worse: I tried having some

[jira] [Commented] (LUCENE-9959) Can we remove threadlocals of stored fields and term vectors

2021-07-19 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383126#comment-17383126 ] Robert Muir commented on LUCENE-9959: - I don't think we should do this anymore. Somehow this has

[jira] [Resolved] (LUCENE-5595) TestICUNormalizer2CharFilter test failure

2021-07-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-5595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-5595. - Fix Version/s: main (9.0) Resolution: Fixed Marking as fixed. Actually all we did is

[jira] [Commented] (LUCENE-5595) TestICUNormalizer2CharFilter test failure

2021-07-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-5595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380259#comment-17380259 ] Robert Muir commented on LUCENE-5595: - sorry, the previous NFKD test is fine. I thought i read it as

[jira] [Commented] (LUCENE-5595) TestICUNormalizer2CharFilter test failure

2021-07-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-5595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380256#comment-17380256 ] Robert Muir commented on LUCENE-5595: - One thing bogus about the existing test is that it tries to

[jira] [Resolved] (LUCENE-9177) ICUNormalizer2CharFilter worst case is very slow

2021-07-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-9177. - Resolution: Fixed > ICUNormalizer2CharFilter worst case is very slow >

[jira] [Commented] (LUCENE-9177) ICUNormalizer2CharFilter worst case is very slow

2021-07-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380243#comment-17380243 ] Robert Muir commented on LUCENE-9177: - Thanks [~mgibney] a lot for taking care of this! I'm

[jira] [Updated] (LUCENE-9177) ICUNormalizer2CharFilter worst case is very slow

2021-07-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-9177: Fix Version/s: 8.10 main (9.0) > ICUNormalizer2CharFilter worst case is very

[jira] [Commented] (LUCENE-5595) TestICUNormalizer2CharFilter test failure

2021-07-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-5595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380239#comment-17380239 ] Robert Muir commented on LUCENE-5595: - I'd like to re-enable this test. I will open a PR. If jenkins

[jira] [Commented] (LUCENE-9177) ICUNormalizer2CharFilter worst case is very slow

2021-07-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380238#comment-17380238 ] Robert Muir commented on LUCENE-9177: - I re-enabled this test on top of your branch. I am beasting

[jira] [Commented] (LUCENE-9177) ICUNormalizer2CharFilter worst case is very slow

2021-07-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380178#comment-17380178 ] Robert Muir commented on LUCENE-9177: - it may be the case. it shouldn't hold up your change really,

[jira] [Commented] (LUCENE-9177) ICUNormalizer2CharFilter worst case is very slow

2021-07-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380111#comment-17380111 ] Robert Muir commented on LUCENE-9177: - i haven't had a chance to test it out, I didn't have any plan

[jira] [Commented] (LUCENE-10023) Multi-token post-analysis DocValues

2021-07-09 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17378142#comment-17378142 ] Robert Muir commented on LUCENE-10023: -- A big part of my concern is adding this code directly to

[jira] [Commented] (LUCENE-10023) Multi-token post-analysis DocValues

2021-07-08 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377769#comment-17377769 ] Robert Muir commented on LUCENE-10023: -- {quote} I guess the essence of the question is whether the

[jira] [Commented] (LUCENE-10023) Multi-token post-analysis DocValues

2021-07-08 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377639#comment-17377639 ] Robert Muir commented on LUCENE-10023: -- {quote} In contrast, the current PR only runs analysis

[jira] [Commented] (LUCENE-10023) Multi-token post-analysis DocValues

2021-07-08 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377636#comment-17377636 ] Robert Muir commented on LUCENE-10023: -- {quote} In contrast, the current PR only runs analysis

[jira] [Commented] (LUCENE-10023) Multi-token post-analysis DocValues

2021-07-08 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17377541#comment-17377541 ] Robert Muir commented on LUCENE-10023: -- I actually think it is useful for the user to do this

[jira] [Commented] (LUCENE-8638) Remove deprecated code in master

2021-06-30 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17372278#comment-17372278 ] Robert Muir commented on LUCENE-8638: - What are these {{lucene-deprecations.txt}} files? Please,

[jira] [Commented] (LUCENE-10016) VectorReader.search needs rethought, o.a.l.search integration?

2021-06-30 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17372057#comment-17372057 ] Robert Muir commented on LUCENE-10016: -- [~sokolov] let's not make these assumptions in such an

[jira] [Updated] (LUCENE-10016) VectorReader.search needs rethought, o.a.l.search integration?

2021-06-30 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-10016: - Fix Version/s: 9.0 > VectorReader.search needs rethought, o.a.l.search integration? >

[jira] [Created] (LUCENE-10016) VectorReader.search needs rethought, o.a.l.search integration?

2021-06-30 Thread Robert Muir (Jira)
Robert Muir created LUCENE-10016: Summary: VectorReader.search needs rethought, o.a.l.search integration? Key: LUCENE-10016 URL: https://issues.apache.org/jira/browse/LUCENE-10016 Project: Lucene -

[jira] [Updated] (LUCENE-10015) Remove VectorValues.SimilarityFunction, remove NONE

2021-06-30 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-10015: - Fix Version/s: 9.0 > Remove VectorValues.SimilarityFunction, remove NONE >

[jira] [Created] (LUCENE-10015) Remove VectorValues.SimilarityFunction, remove NONE

2021-06-30 Thread Robert Muir (Jira)
Robert Muir created LUCENE-10015: Summary: Remove VectorValues.SimilarityFunction, remove NONE Key: LUCENE-10015 URL: https://issues.apache.org/jira/browse/LUCENE-10015 Project: Lucene - Core

[jira] [Commented] (LUCENE-9855) Reconsider names for ANN related format and APIs

2021-06-30 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17372001#comment-17372001 ] Robert Muir commented on LUCENE-9855: - I really do think this is correctly a blocker issue. Renaming

[jira] [Commented] (LUCENE-10010) Should we have a NFA Query?

2021-06-22 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17367524#comment-17367524 ] Robert Muir commented on LUCENE-10010: -- I think the main difference between the article and what

[jira] [Resolved] (LUCENE-10004) Delete unnecessary flush in Lucene90CompressingStoredFieldsWriter.copyChunks() to reduce dirty chunks

2021-06-18 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-10004. -- Resolution: Not A Problem We absolutely need to flush pending docs before we start copying

[jira] [Commented] (LUCENE-10003) Disallow C-style array declarations

2021-06-15 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363644#comment-17363644 ] Robert Muir commented on LUCENE-10003: -- My concern is really beyond when the tool gets run (again

[jira] [Commented] (LUCENE-10003) Disallow C-style array declarations

2021-06-14 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363221#comment-17363221 ] Robert Muir commented on LUCENE-10003: -- In this particular case, i'm specifically against the

[jira] [Commented] (LUCENE-10003) Disallow C-style array declarations

2021-06-14 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363218#comment-17363218 ] Robert Muir commented on LUCENE-10003: -- What is the bug caused by this? It is 100% formatting.

[jira] [Commented] (LUCENE-9935) Bulk merges for stored fields when index sorting is enabled

2021-06-14 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363019#comment-17363019 ] Robert Muir commented on LUCENE-9935: - ah ok, sorry, i read the linked issues wrong. just mainly

[jira] [Reopened] (LUCENE-9935) Bulk merges for stored fields when index sorting is enabled

2021-06-14 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir reopened LUCENE-9935: - Reopening as the term vectors support was reverted from main (but not 8.x?). cc: [~dnhatn] > Bulk

[jira] [Created] (LUCENE-9997) Revisit smoketester for 9.0 build

2021-06-10 Thread Robert Muir (Jira)
Robert Muir created LUCENE-9997: --- Summary: Revisit smoketester for 9.0 build Key: LUCENE-9997 URL: https://issues.apache.org/jira/browse/LUCENE-9997 Project: Lucene - Core Issue Type: Task

[jira] [Commented] (LUCENE-9981) CompiledAutomaton.getCommonSuffix can be extraordinarily slow, even with default maxDeterminizedStates limit

2021-06-04 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17357410#comment-17357410 ] Robert Muir commented on LUCENE-9981: - +1. glad this test is happy! >

[jira] [Commented] (LUCENE-9987) JVM 11.0.6 crash while trying to read term vectors in CheckIndex?

2021-06-02 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17355694#comment-17355694 ] Robert Muir commented on LUCENE-9987: - ZAC though. Workaround: don't use ZGC > JVM 11.0.6 crash

[jira] [Commented] (LUCENE-9981) CompiledAutomaton.getCommonSuffix can be extraordinarily slow, even with default maxDeterminizedStates limit

2021-05-30 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17353994#comment-17353994 ] Robert Muir commented on LUCENE-9981: - Even with no API break, I don't want these changes rushed in

[jira] [Commented] (LUCENE-9981) CompiledAutomaton.getCommonSuffix can be extraordinarily slow, even with default maxDeterminizedStates limit

2021-05-30 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17353972#comment-17353972 ] Robert Muir commented on LUCENE-9981: - It is probably enough bits? I originally liked the type

[jira] [Commented] (LUCENE-9981) CompiledAutomaton.getCommonSuffix can be extraordinarily slow, even with default maxDeterminizedStates limit

2021-05-30 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17353956#comment-17353956 ] Robert Muir commented on LUCENE-9981: - There's no reason to rush fixes/backports for these

[jira] [Commented] (LUCENE-9981) CompiledAutomaton.getCommonSuffix can be extraordinarily slow, even with default maxDeterminizedStates limit

2021-05-29 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17353910#comment-17353910 ] Robert Muir commented on LUCENE-9981: - Another tweak: it's great that we give a compile-time break

[jira] [Comment Edited] (LUCENE-9981) CompiledAutomaton.getCommonSuffix can be extraordinarily slow, even with default maxDeterminizedStates limit

2021-05-29 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17353908#comment-17353908 ] Robert Muir edited comment on LUCENE-9981 at 5/30/21, 12:10 AM: cc:

[jira] [Comment Edited] (LUCENE-9981) CompiledAutomaton.getCommonSuffix can be extraordinarily slow, even with default maxDeterminizedStates limit

2021-05-29 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17353908#comment-17353908 ] Robert Muir edited comment on LUCENE-9981 at 5/30/21, 12:01 AM: cc:

[jira] [Commented] (LUCENE-9981) CompiledAutomaton.getCommonSuffix can be extraordinarily slow, even with default maxDeterminizedStates limit

2021-05-29 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17353908#comment-17353908 ] Robert Muir commented on LUCENE-9981: - cc: [~dweiss] in case you have thoughts. Quick Summary if you

[jira] [Commented] (LUCENE-9687) Hunspell support improvements

2021-05-29 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17353907#comment-17353907 ] Robert Muir commented on LUCENE-9687: - I think we should be good now. The only other change we

[jira] [Commented] (LUCENE-9867) CorruptIndexException after failed segment merge caused by No space left on device

2021-05-29 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17353906#comment-17353906 ] Robert Muir commented on LUCENE-9867: - Sorry for the dyslexic commit > CorruptIndexException after

[jira] [Commented] (LUCENE-9981) CompiledAutomaton.getCommonSuffix can be extraordinarily slow, even with default maxDeterminizedStates limit

2021-05-29 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17353861#comment-17353861 ] Robert Muir commented on LUCENE-9981: - I updated the patch, changes: * Add simple unit tests for the

[jira] [Updated] (LUCENE-9981) CompiledAutomaton.getCommonSuffix can be extraordinarily slow, even with default maxDeterminizedStates limit

2021-05-29 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-9981: Attachment: LUCENE-9981.patch > CompiledAutomaton.getCommonSuffix can be extraordinarily slow,

[jira] [Commented] (LUCENE-9981) CompiledAutomaton.getCommonSuffix can be extraordinarily slow, even with default maxDeterminizedStates limit

2021-05-29 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17353852#comment-17353852 ] Robert Muir commented on LUCENE-9981: - Patch looks great! I really like how it is clear that the

[jira] [Commented] (LUCENE-9981) CompiledAutomaton.getCommonSuffix can be extraordinarily slow, even with default maxDeterminizedStates limit

2021-05-29 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17353830#comment-17353830 ] Robert Muir commented on LUCENE-9981: - I attached LUCENE-9981_nfaprefix.patch, which uses an NFA

[jira] [Updated] (LUCENE-9981) CompiledAutomaton.getCommonSuffix can be extraordinarily slow, even with default maxDeterminizedStates limit

2021-05-29 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-9981: Attachment: LUCENE-9981_nfaprefix.patch > CompiledAutomaton.getCommonSuffix can be

[jira] [Commented] (LUCENE-9981) CompiledAutomaton.getCommonSuffix can be extraordinarily slow, even with default maxDeterminizedStates limit

2021-05-29 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17353762#comment-17353762 ] Robert Muir commented on LUCENE-9981: - Nope, but exposing very slow queries (via DSL) to

[jira] [Commented] (LUCENE-9981) CompiledAutomaton.getCommonSuffix can be extraordinarily slow, even with default maxDeterminizedStates limit

2021-05-29 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17353742#comment-17353742 ] Robert Muir commented on LUCENE-9981: - Hi [~oridool]. This issue isn't a security issue. I won't add

[jira] [Commented] (LUCENE-9379) Directory based approach for index encryption

2021-05-29 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17353696#comment-17353696 ] Robert Muir commented on LUCENE-9379: - Your argument is even more uneducated, the "i can do better

[jira] [Commented] (LUCENE-9379) Directory based approach for index encryption

2021-05-28 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17353645#comment-17353645 ] Robert Muir commented on LUCENE-9379: - As always, you can count on arch to have some good user-level

[jira] [Commented] (LUCENE-9379) Directory based approach for index encryption

2021-05-28 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17353644#comment-17353644 ] Robert Muir commented on LUCENE-9379: - Sorry, the above comment is really wrong. Please see my

[jira] [Commented] (LUCENE-9981) CompiledAutomaton.getCommonSuffix can be extraordinarily slow, even with default maxDeterminizedStates limit

2021-05-28 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17353637#comment-17353637 ] Robert Muir commented on LUCENE-9981: - Before we {{det(reverse())}} this automaton to compute the

[jira] [Updated] (LUCENE-9981) CompiledAutomaton.getCommonSuffix can be extraordinarily slow, even with default maxDeterminizedStates limit

2021-05-28 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-9981: Attachment: LUCENE-9981_test.patch > CompiledAutomaton.getCommonSuffix can be extraordinarily

[jira] [Created] (LUCENE-9981) CompiledAutomaton.getCommonSuffix can be extraordinarily slow, even with default maxDeterminizedStates limit

2021-05-28 Thread Robert Muir (Jira)
Robert Muir created LUCENE-9981: --- Summary: CompiledAutomaton.getCommonSuffix can be extraordinarily slow, even with default maxDeterminizedStates limit Key: LUCENE-9981 URL:

[jira] [Updated] (LUCENE-9827) Small segments are slower to merge due to stored fields since 8.7

2021-05-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-9827: Fix Version/s: 8.9 > Small segments are slower to merge due to stored fields since 8.7 >

[jira] [Resolved] (LUCENE-9843) Remove compression option on doc values

2021-05-06 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-9843. - Fix Version/s: main (9.0) Resolution: Fixed Thanks [~jdconradson]! > Remove compression

[jira] [Commented] (LUCENE-9950) Support both single- and multi-value string fields in facet counting (non-taxonomy based approaches)

2021-05-06 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340242#comment-17340242 ] Robert Muir commented on LUCENE-9950: - +1 to add a "normal" facet method here, and deprecate this

[jira] [Commented] (LUCENE-9843) Remove compression option on doc values

2021-05-05 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339863#comment-17339863 ] Robert Muir commented on LUCENE-9843: - +1 to your latest patch [~jdconradson]. I ran all tests,

[jira] [Commented] (LUCENE-9843) Remove compression option on doc values

2021-05-04 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339374#comment-17339374 ] Robert Muir commented on LUCENE-9843: - looks like we can do the same trick for binary case. remove

[jira] [Updated] (LUCENE-9843) Remove compression option on doc values

2021-05-04 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-9843: Attachment: LUCENE-9843.patch mods.patch Status: Open (was: Open)

[jira] [Commented] (LUCENE-9843) Remove compression option on doc values

2021-05-04 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339084#comment-17339084 ] Robert Muir commented on LUCENE-9843: - +1 let's simplify and have better test coverage. it does not

[jira] [Commented] (LUCENE-9948) Automatically detect multi- vs. single-valued cases in LongValueFacetCounts

2021-05-03 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338443#comment-17338443 ] Robert Muir commented on LUCENE-9948: - Thanks [~gsmiller] for taking care of this! > Automatically

[jira] [Resolved] (LUCENE-9948) Automatically detect multi- vs. single-valued cases in LongValueFacetCounts

2021-05-03 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-9948. - Fix Version/s: main (9.0) Resolution: Fixed > Automatically detect multi- vs.

[jira] [Commented] (LUCENE-9948) Automatically detect multi- vs. single-valued cases in LongValueFacetCounts

2021-05-02 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338091#comment-17338091 ] Robert Muir commented on LUCENE-9948: - For 9.0, I wouldn't add back compat at all or deprecations. I

[jira] [Resolved] (LUCENE-9188) Add jacoco code coverage support to gradle build

2021-05-02 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-9188. - Fix Version/s: main (9.0) Resolution: Fixed > Add jacoco code coverage support to gradle

[jira] [Commented] (LUCENE-9946) Support multi-value fields in range facet counting

2021-05-01 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337741#comment-17337741 ] Robert Muir commented on LUCENE-9946: - {quote} I was aiming for consistency with the approach taken

[jira] [Commented] (LUCENE-9946) Support multi-value fields in range facet counting

2021-04-30 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337498#comment-17337498 ] Robert Muir commented on LUCENE-9946: - I'm confused, why make the user provide a {{multivalued}}

[jira] [Commented] (LUCENE-9939) Proper ASCII folding of Danish/Norwegian characters Ø, Å

2021-04-27 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17333128#comment-17333128 ] Robert Muir commented on LUCENE-9939: - This isn't the way to go: these aren't the only languages

[jira] [Updated] (LUCENE-9878) enable redundantNullCheck in ecjLint

2021-04-26 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-9878: Fix Version/s: main (9.0) > enable redundantNullCheck in ecjLint >

[jira] [Resolved] (LUCENE-9878) enable redundantNullCheck in ecjLint

2021-04-26 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-9878. - Resolution: Fixed yes, oops! thanks [~mikemccand] > enable redundantNullCheck in ecjLint >

[jira] [Resolved] (LUCENE-9928) speed up analysis/icu regeneration

2021-04-22 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-9928. - Fix Version/s: main (9.0) Resolution: Fixed > speed up analysis/icu regeneration >

[jira] [Commented] (LUCENE-9843) Remove compression option on doc values

2021-04-14 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17321328#comment-17321328 ] Robert Muir commented on LUCENE-9843: - Great example of why its important: for a while lucene's

[jira] [Commented] (LUCENE-9843) Remove compression option on doc values

2021-04-14 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17321327#comment-17321327 ] Robert Muir commented on LUCENE-9843: - I moved this issue to a blocker for 9.0 because i've already

[jira] [Updated] (LUCENE-9843) Remove compression option on doc values

2021-04-14 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-9843: Priority: Blocker (was: Minor) > Remove compression option on doc values >

[jira] [Commented] (LUCENE-9929) Make ScandinavianNormalizationFilter configurable wrt foldings

2021-04-14 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17321157#comment-17321157 ] Robert Muir commented on LUCENE-9929: - Adding new tokenfilters isn't a breaking change for anyone.

[jira] [Commented] (LUCENE-9929) Make ScandinavianNormalizationFilter configurable wrt foldings

2021-04-14 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17321068#comment-17321068 ] Robert Muir commented on LUCENE-9929: - Can we just add filters for each of the 3 languages instead?

[jira] [Created] (LUCENE-9928) speed up analysis/icu regeneration

2021-04-13 Thread Robert Muir (Jira)
Robert Muir created LUCENE-9928: --- Summary: speed up analysis/icu regeneration Key: LUCENE-9928 URL: https://issues.apache.org/jira/browse/LUCENE-9928 Project: Lucene - Core Issue Type: Task

[jira] [Commented] (LUCENE-9927) Configurable BKDWriter maximum heap size

2021-04-13 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17320144#comment-17320144 ] Robert Muir commented on LUCENE-9927: - I think it would be better to use the bugfixed version in

[jira] [Commented] (LUCENE-9914) Modernize Emoji regeneration scripts

2021-04-12 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17319681#comment-17319681 ] Robert Muir commented on LUCENE-9914: - Thanks [~dweiss]! Let's keep an eye on jflex releases, I

[jira] [Created] (LUCENE-9926) remove last-modified timestamp from ASCIITLD.jflex

2021-04-12 Thread Robert Muir (Jira)
Robert Muir created LUCENE-9926: --- Summary: remove last-modified timestamp from ASCIITLD.jflex Key: LUCENE-9926 URL: https://issues.apache.org/jira/browse/LUCENE-9926 Project: Lucene - Core

[jira] [Commented] (LUCENE-9827) Small segments are slower to merge due to stored fields since 8.7

2021-04-12 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17319486#comment-17319486 ] Robert Muir commented on LUCENE-9827: - [~jpountz] those cross-checks look good. one additional idea

[jira] [Commented] (LUCENE-9914) Modernize Emoji regeneration scripts

2021-04-11 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17318939#comment-17318939 ] Robert Muir commented on LUCENE-9914: - [~dweiss] The emoji regeneration needs to be using ICU 62.x

[jira] [Resolved] (LUCENE-9924) regenerate TLD list from IANA TLD db, rather than root zone db

2021-04-11 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-9924. - Fix Version/s: main (9.0) Resolution: Fixed > regenerate TLD list from IANA TLD db,

[jira] [Commented] (LUCENE-9924) regenerate TLD list from IANA TLD db, rather than root zone db

2021-04-10 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17318602#comment-17318602 ] Robert Muir commented on LUCENE-9924: - Matching puny-decoded form is different, I'm not even sure we

[jira] [Commented] (LUCENE-9924) regenerate TLD list from IANA TLD db, rather than root zone db

2021-04-10 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17318598#comment-17318598 ] Robert Muir commented on LUCENE-9924: - In other words, this change creates the same output as what

[jira] [Commented] (LUCENE-9924) regenerate TLD list from IANA TLD db, rather than root zone db

2021-04-10 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17318597#comment-17318597 ] Robert Muir commented on LUCENE-9924: - No, this generator script explicitly wants only ascii and
