[jira] [Commented] (LUCENE-8972) CharFilter version of ICUTransformFilter, to better support dictionary-based tokenization

2019-09-10 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16926589#comment-16926589 ] Robert Muir commented on LUCENE-8972: - I agree its a good idea, a couple thoughts about the impl you

[jira] [Commented] (LUCENE-8403) Support 'filtered' term vectors - don't require all terms to be present

2019-08-29 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918910#comment-16918910 ] Robert Muir commented on LUCENE-8403: - My opinion remains the same, it makes no sense to disable

[jira] [Commented] (LUCENE-8884) Add Directory wrapper to track per-query IO counters

2019-08-14 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907422#comment-16907422 ] Robert Muir commented on LUCENE-8884: - are threadlocals really needed? must you call get on every op

[jira] [Commented] (LUCENE-8369) Remove the spatial module as it is obsolete

2019-07-25 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892735#comment-16892735 ] Robert Muir commented on LUCENE-8369: - I don't think adding esoteric functionality (like non-WGS

[jira] [Commented] (LUCENE-8928) BKDWriter could make splitting decisions based on the actual range of values

2019-07-22 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16890224#comment-16890224 ] Robert Muir commented on LUCENE-8928: - I guess I'm saying, I don't see any evidence of speedups for

[jira] [Commented] (LUCENE-8928) BKDWriter could make splitting decisions based on the actual range of values

2019-07-22 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16890207#comment-16890207 ] Robert Muir commented on LUCENE-8928: - Do you mean 1-dimensional ranges (actually 2D) or

[jira] [Commented] (LUCENE-8928) BKDWriter could make splitting decisions based on the actual range of values

2019-07-22 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16890186#comment-16890186 ] Robert Muir commented on LUCENE-8928: - Seems like it should only happen when number of dimensions is

[jira] [Commented] (LUCENE-8926) Test2BDocs.test2BDocs error java.lang.ArrayIndexOutOfBoundsException

2019-07-18 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16888244#comment-16888244 ] Robert Muir commented on LUCENE-8926: - {noformat} NOTE: Linux 3.10.0-957.21.3.el7.ppc64le

[jira] [Commented] (LUCENE-8366) upgrade to icu 62.1

2019-07-10 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882575#comment-16882575 ] Robert Muir commented on LUCENE-8366: - good catch! do you want to submit a fix? > upgrade to icu

[jira] [Commented] (LUCENE-4312) Index format to store position length per position

2019-07-08 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-4312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16880828#comment-16880828 ] Robert Muir commented on LUCENE-4312: - I don't think chicken and the egg description works well as

[jira] [Commented] (LUCENE-8878) Provide alternative sorting utility from SortField other than FieldComparator

2019-06-27 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16874425#comment-16874425 ] Robert Muir commented on LUCENE-8878: - Yes, please don't let me discourage you from attempting to

[jira] [Commented] (LUCENE-8878) Provide alternative sorting utility from SortField other than FieldComparator

2019-06-26 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16873284#comment-16873284 ] Robert Muir commented on LUCENE-8878: - Please don't forget about the distance sort comparator, it

[jira] [Commented] (LUCENE-8871) Move Kuromoji DictionaryBuilder tool from src/tools to src/

2019-06-25 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872781#comment-16872781 ] Robert Muir commented on LUCENE-8871: - +1 from me. I looked thru the PR, it is just as described.

[jira] [Updated] (LUCENE-8866) Remove ICU dependency of kuromoji tools/test-tools

2019-06-20 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-8866: Fix Version/s: 8.2 > Remove ICU dependency of kuromoji tools/test-tools >

[jira] [Resolved] (LUCENE-8866) Remove ICU dependency of kuromoji tools/test-tools

2019-06-20 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-8866. - Resolution: Fixed > Remove ICU dependency of kuromoji tools/test-tools >

[jira] [Commented] (LUCENE-8866) Remove ICU dependency of kuromoji tools/test-tools

2019-06-20 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868440#comment-16868440 ] Robert Muir commented on LUCENE-8866: - If there are no objections I will wait until LUCENE-8863 is

[jira] [Updated] (LUCENE-8866) Remove ICU dependency of kuromoji tools/test-tools

2019-06-17 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-8866: Attachment: LUCENE-8866.patch > Remove ICU dependency of kuromoji tools/test-tools >

[jira] [Commented] (LUCENE-8866) Remove ICU dependency of kuromoji tools/test-tools

2019-06-17 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16866133#comment-16866133 ] Robert Muir commented on LUCENE-8866: - Simple patch, I didn't move any code around, just removed the

[jira] [Created] (LUCENE-8866) Remove ICU dependency of kuromoji tools/test-tools

2019-06-17 Thread Robert Muir (JIRA)
Robert Muir created LUCENE-8866: --- Summary: Remove ICU dependency of kuromoji tools/test-tools Key: LUCENE-8866 URL: https://issues.apache.org/jira/browse/LUCENE-8866 Project: Lucene - Core

[jira] [Commented] (LUCENE-8863) Improve handling of edge cases in Kuromoji's DictionaryBuilder

2019-06-15 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864856#comment-16864856 ] Robert Muir commented on LUCENE-8863: - {quote}but what about Unidic or dictionaries people might get

[jira] [Commented] (LUCENE-8863) Improve handling of edge cases in Kuromoji's DictionaryBuilder

2019-06-15 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864746#comment-16864746 ] Robert Muir commented on LUCENE-8863: - Can we just throw an exception on empty base form? It sounds

[jira] [Commented] (LUCENE-8816) Decouple Kuromoji's morphological analyser and its dictionary

2019-06-11 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16861618#comment-16861618 ] Robert Muir commented on LUCENE-8816: - Mike: yes, I agree with you. The use of assert was laziness

[jira] [Commented] (LUCENE-8833) Allow subclasses of MMapDirectory to preload individual IndexInputs

2019-06-06 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858126#comment-16858126 ] Robert Muir commented on LUCENE-8833: - Another idea is to expose the option completely differently

[jira] [Commented] (LUCENE-8833) Allow subclasses of MMapDirectory to preload individual IndexInputs

2019-06-06 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858120#comment-16858120 ] Robert Muir commented on LUCENE-8833: - I'm just curious about more details. For the merge use-case,

[jira] [Commented] (LUCENE-8833) Allow subclasses of MMapDirectory to preload individual IndexInputs

2019-06-05 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16856997#comment-16856997 ] Robert Muir commented on LUCENE-8833: - what would the iocontext provide to base the preload decision

[jira] [Commented] (LUCENE-8833) Allow subclasses of MMapDirectory to preload individual IndexInputs

2019-06-05 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16856751#comment-16856751 ] Robert Muir commented on LUCENE-8833: - Can't the user do this with FileSwitchDirectory today? I am

[jira] [Commented] (LUCENE-8753) New PostingFormat - UniformSplit

2019-06-02 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16853986#comment-16853986 ] Robert Muir commented on LUCENE-8753: - It seems by your question that you already know that what you

[jira] [Commented] (LUCENE-8816) Decouple Kuromoji's morphological analyser and its dictionary

2019-05-30 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16852581#comment-16852581 ] Robert Muir commented on LUCENE-8816: - Yeah there is some optimizations specific to Japanese writing

[jira] [Commented] (LUCENE-8816) Decouple Kuromoji's morphological analyser and its dictionary

2019-05-29 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16850954#comment-16850954 ] Robert Muir commented on LUCENE-8816: - As a suggestion for the first concrete step: the various

[jira] [Commented] (LUCENE-8816) Decouple Kuromoji's morphological analyser and its dictionary

2019-05-28 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16849734#comment-16849734 ] Robert Muir commented on LUCENE-8816: - From the perspective of this code, the dictionaries are

[jira] [Commented] (LUCENE-8816) Decouple Kuromoji's morphological analyser and its dictionary

2019-05-28 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16849725#comment-16849725 ] Robert Muir commented on LUCENE-8816: - if the licensing challenge in LUCENE-4056 is overcome, then

[jira] [Commented] (LUCENE-8805) Parameter changes for stringField() in StoredFieldVisitor

2019-05-21 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16845121#comment-16845121 ] Robert Muir commented on LUCENE-8805: - +1 > Parameter changes for stringField() in

[jira] [Commented] (LUCENE-8805) Parameter changes for binaryField() and stringField() in StoredFieldVisitor

2019-05-20 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844072#comment-16844072 ] Robert Muir commented on LUCENE-8805: - We could do the String case here and leave the BytesRef case

[jira] [Commented] (LUCENE-8805) Parameter changes for binaryField() and stringField() in StoredFieldVisitor

2019-05-18 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16843218#comment-16843218 ] Robert Muir commented on LUCENE-8805: - Another idea: we may want to add some tests where we pass

[jira] [Commented] (LUCENE-8805) Parameter changes for binaryField() and stringField() in StoredFieldVisitor

2019-05-18 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16843217#comment-16843217 ] Robert Muir commented on LUCENE-8805: - Should we add any safety checks? With the previous code,

[jira] [Commented] (LUCENE-8776) Start offset going backwards has a legitimate purpose

2019-05-03 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16832449#comment-16832449 ] Robert Muir commented on LUCENE-8776: - I don't think we should change the exception message at all.

[jira] [Commented] (LUCENE-8789) Use JGoodies' look for Luke on Windows platform

2019-05-02 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16832159#comment-16832159 ] Robert Muir commented on LUCENE-8789: - OK, thanks for the explanation. If the 3rd party library

[jira] [Commented] (LUCENE-8789) Use JGoodies' look for Luke on Windows platform

2019-05-02 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16832118#comment-16832118 ] Robert Muir commented on LUCENE-8789: - I'm confused a bit: Is the proposed 3rd party code a

[jira] [Commented] (LUCENE-8780) Improve ByteBufferGuard in Java 11

2019-04-29 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16829444#comment-16829444 ] Robert Muir commented on LUCENE-8780: - i imagine any slowdown only impacts stuff doing lots of tiny

[jira] [Commented] (LUCENE-8753) New PostingFormat - UniformSplit

2019-04-24 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16825028#comment-16825028 ] Robert Muir commented on LUCENE-8753: - Why are we looking at committing this when the most recent

[jira] [Commented] (LUCENE-8776) Start offset going backwards has a legitimate purpose

2019-04-24 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16825025#comment-16825025 ] Robert Muir commented on LUCENE-8776: - {quote} If performance gets worse for large documents, isn't

[jira] [Commented] (LUCENE-8776) Start offset going backwards has a legitimate purpose

2019-04-23 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16824774#comment-16824774 ] Robert Muir commented on LUCENE-8776: - The check is important for several reasons: * with offsets

[jira] [Commented] (LUCENE-8736) LatLonShapePolygonQuery returning incorrect WITHIN results with shared boundaries

2019-04-19 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16821867#comment-16821867 ] Robert Muir commented on LUCENE-8736: - Points are the common use case _by far_ as well. It is the

[jira] [Commented] (LUCENE-8736) LatLonShapePolygonQuery returning incorrect WITHIN results with shared boundaries

2019-04-17 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820574#comment-16820574 ] Robert Muir commented on LUCENE-8736: - {quote} The concern I have is the behavior of excluding

[jira] [Commented] (LUCENE-8736) LatLonShapePolygonQuery returning incorrect WITHIN results with shared boundaries

2019-04-15 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16817895#comment-16817895 ] Robert Muir commented on LUCENE-8736: - Yeah, thats what I am suggesting: this patch really really

[jira] [Commented] (LUCENE-8736) LatLonShapePolygonQuery returning incorrect WITHIN results with shared boundaries

2019-04-13 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16817013#comment-16817013 ] Robert Muir commented on LUCENE-8736: - I guess I don't see them as boundary failures. Instead just a

[jira] [Commented] (LUCENE-8736) LatLonShapePolygonQuery returning incorrect WITHIN results with shared boundaries

2019-04-12 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16816190#comment-16816190 ] Robert Muir commented on LUCENE-8736: - Can we reopen this and think about rolling back the points

[jira] [Commented] (LUCENE-8752) Apply a patch to kuromoji dictionary to properly handle Japanese new era '令和' (REIWA)

2019-04-03 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808716#comment-16808716 ] Robert Muir commented on LUCENE-8752: - +1 > Apply a patch to kuromoji dictionary to properly handle

[jira] [Commented] (LUCENE-8746) Make EdgeTree (aka ComponentTree) support different type of components

2019-03-29 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16805235#comment-16805235 ] Robert Muir commented on LUCENE-8746: - I agree, this thing kinda evolved: starting from a single

[jira] [Commented] (LUCENE-8681) Prorated early termination

2019-02-15 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769792#comment-16769792 ] Robert Muir commented on LUCENE-8681: - I agree with the first suggestion, too. Let's put it beside

[jira] [Commented] (LUCENE-8292) Fix FilterLeafReader.FilterTermsEnum to delegate all seekExact methods

2019-02-15 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769236#comment-16769236 ] Robert Muir commented on LUCENE-8292: - the solution is, don't use delegators except over interfaces

[jira] [Commented] (LUCENE-8681) Prorated early termination

2019-02-12 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766692#comment-16766692 ] Robert Muir commented on LUCENE-8681: - As far as tests, I agree, yes lets not swap in any inexact

[jira] [Commented] (LUCENE-8681) Prorated early termination

2019-02-11 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16765654#comment-16765654 ] Robert Muir commented on LUCENE-8681: - also unrelated to any user concern, we have to address

[jira] [Commented] (LUCENE-8681) Prorated early termination

2019-02-11 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16765645#comment-16765645 ] Robert Muir commented on LUCENE-8681: - {quote} The point of the math above is that we can bound the

[jira] [Commented] (LUCENE-8681) Prorated early termination

2019-02-07 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16762679#comment-16762679 ] Robert Muir commented on LUCENE-8681: - Please don't read this the right way, but this is

[jira] [Updated] (LUCENE-8673) Use radix partitioning when merging dimensional points

2019-01-31 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-8673: Description: Following the advice of [~jpountz] in LUCENE-8623 I have investigated using radix

[jira] [Commented] (LUCENE-8525) throw more specific exception on data corruption

2019-01-08 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737017#comment-16737017 ] Robert Muir commented on LUCENE-8525: - Sorry, this isn't the job of the library to classify

[jira] [Commented] (LUCENE-8525) throw more specific exception on data corruption

2019-01-07 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736498#comment-16736498 ] Robert Muir commented on LUCENE-8525: - Lucene doesn't know if its temporary or permanent either. All

[jira] [Commented] (LUCENE-8618) MMapDirectory's read ahead on random-access files might trash the OS cache

2018-12-21 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16726761#comment-16726761 ] Robert Muir commented on LUCENE-8618: - {quote} we first look up a document based on its id, fetch

[jira] [Commented] (LUCENE-8617) FSDirectory tries to create MMapDirectory on non default file system

2018-12-20 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725940#comment-16725940 ] Robert Muir commented on LUCENE-8617: - I'm not sure I agree with this statement: {quote}as only the

[jira] [Commented] (LUCENE-8527) Upgrade JFlex to 1.7.0

2018-12-07 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16713510#comment-16713510 ] Robert Muir commented on LUCENE-8527: - It would be really nice. I don't think the tricky part is

[jira] [Commented] (LUCENE-8584) Japanese UserDictionary should remove duplicate entries

2018-12-03 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707076#comment-16707076 ] Robert Muir commented on LUCENE-8584: - I don't think so. App can pay the cost of going thru the

[jira] [Commented] (LUCENE-8584) Japanese UserDictionary should remove duplicate entries

2018-12-03 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707014#comment-16707014 ] Robert Muir commented on LUCENE-8584: - I don't think i agree with the issue description at all. Its

[jira] [Commented] (LUCENE-8584) Japanese UserDictionary should remove duplicate entries

2018-12-03 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707004#comment-16707004 ] Robert Muir commented on LUCENE-8584: - I don't think being lenient helps the user here. Now they can

[jira] [Commented] (LUCENE-8583) Make GeoUtils#orientation method more stable

2018-12-02 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16706270#comment-16706270 ] Robert Muir commented on LUCENE-8583: - And maybe it should be moved back to a private method in

[jira] [Commented] (LUCENE-8583) Make GeoUtils#orientation method more stable

2018-12-02 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16706265#comment-16706265 ] Robert Muir commented on LUCENE-8583: - Is the new formula really more precise? (substraction etc).

[jira] [Commented] (LUCENE-8563) Remove k1+1 from the numerator of BM25Similarity

2018-11-29 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16703055#comment-16703055 ] Robert Muir commented on LUCENE-8563: - Please deprecate the crazy legacy one too, so it can be

[jira] [Commented] (LUCENE-8574) ExpressionFunctionValues should cache per-hit value

2018-11-25 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698188#comment-16698188 ] Robert Muir commented on LUCENE-8574: - Hopefully this is enough? If we have to handle the case where

[jira] [Updated] (LUCENE-8574) ExpressionFunctionValues should cache per-hit value

2018-11-25 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-8574: Attachment: LUCENE-8574.patch > ExpressionFunctionValues should cache per-hit value >

[jira] [Assigned] (LUCENE-8574) ExpressionFunctionValues should cache per-hit value

2018-11-25 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir reassigned LUCENE-8574: --- Assignee: Robert Muir > ExpressionFunctionValues should cache per-hit value >

[jira] [Commented] (LUCENE-8574) ExpressionFunctionValues should cache per-hit value

2018-11-25 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698172#comment-16698172 ] Robert Muir commented on LUCENE-8574: - I don't think we need any caching DoubleValues

[jira] [Commented] (LUCENE-8574) ExpressionFunctionValues should cache per-hit value

2018-11-24 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698038#comment-16698038 ] Robert Muir commented on LUCENE-8574: - +1, I remember jack added this explicitly to prevent the

[jira] [Commented] (LUCENE-8548) Reevaluate scripts boundary break in Nori's tokenizer

2018-11-22 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16696085#comment-16696085 ] Robert Muir commented on LUCENE-8548: - Yeah, if you want to just change that logic as jim suggests,

[jira] [Commented] (LUCENE-8548) Reevaluate scripts boundary break in Nori's tokenizer

2018-11-22 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16696038#comment-16696038 ] Robert Muir commented on LUCENE-8548: - {quote} Try to make Ant {{nori}} module depend on {{icu}}

[jira] [Commented] (LUCENE-8563) Remove k1+1 from the numerator of BM25Similarity

2018-11-15 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687922#comment-16687922 ] Robert Muir commented on LUCENE-8563: - No, we shouldn't clutter up BM25Similarity because some users

[jira] [Commented] (LUCENE-8563) Remove k1+1 from the numerator of BM25Similarity

2018-11-12 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683868#comment-16683868 ] Robert Muir commented on LUCENE-8563: - +1 to nuke it. Currently the explain() goes out of its way to

[jira] [Commented] (LUCENE-8553) New KoreanDecomposeFilter for KoreanAnalyzer(Nori)

2018-11-01 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16671939#comment-16671939 ] Robert Muir commented on LUCENE-8553: - Can't we just use unicode normalization for this? NFD/NFKD

[jira] [Commented] (LUCENE-8548) Reevaluate scripts boundary break in Nori's tokenizer

2018-10-26 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665014#comment-16665014 ] Robert Muir commented on LUCENE-8548: - As far as the suggested fix, why reinvent the wheel? In

[jira] [Assigned] (LUCENE-8525) throw more specific exception on data corruption

2018-10-05 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir reassigned LUCENE-8525: --- Assignee: (was: Robert Muir) > throw more specific exception on data corruption >

[jira] [Commented] (LUCENE-8525) throw more specific exception on data corruption

2018-10-05 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639673#comment-16639673 ] Robert Muir commented on LUCENE-8525: - i would close this one, as its an elasticsearch bug. it

[jira] [Commented] (LUCENE-8525) throw more specific exception on data corruption

2018-10-05 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639664#comment-16639664 ] Robert Muir commented on LUCENE-8525: - i don't think it should be corruptindexexception, there is no

[jira] [Assigned] (LUCENE-8525) throw more specific exception on data corruption

2018-10-05 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir reassigned LUCENE-8525: --- Assignee: Robert Muir > throw more specific exception on data corruption >

[jira] [Commented] (LUCENE-8516) Make WordDelimiterGraphFilter a Tokenizer

2018-09-28 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631756#comment-16631756 ] Robert Muir commented on LUCENE-8516: - It just doesnt seem like it will really improve anything if

[jira] [Commented] (LUCENE-8516) Make WordDelimiterGraphFilter a Tokenizer

2018-09-28 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631648#comment-16631648 ] Robert Muir commented on LUCENE-8516: - {quote} WordDelimiterTokenizer takes a root tokenizer (so you

[jira] [Commented] (LUCENE-8498) Deprecate/Remove LowerCaseTokenizer

2018-09-19 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620655#comment-16620655 ] Robert Muir commented on LUCENE-8498: - +1. I think it was added mainly to support LowerCaseTokenizer

[jira] [Commented] (LUCENE-8462) New Arabic snowball stemmer

2018-09-11 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610910#comment-16610910 ] Robert Muir commented on LUCENE-8462: - +1 for the BSD-licensed test data, thank you. > New Arabic

[jira] [Commented] (LUCENE-8494) CFS leaks a file on exception opening it

2018-09-10 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609020#comment-16609020 ] Robert Muir commented on LUCENE-8494: - If only there was just {{openFiles}}. You got {{openFiles}},

[jira] [Commented] (LUCENE-8494) CFS leaks a file on exception opening it

2018-09-09 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16608478#comment-16608478 ] Robert Muir commented on LUCENE-8494: - duh, this can only be a bug in the directory... having a slow

[jira] [Created] (LUCENE-8494) CFS leaks a file on exception opening it

2018-09-09 Thread Robert Muir (JIRA)
Robert Muir created LUCENE-8494: --- Summary: CFS leaks a file on exception opening it Key: LUCENE-8494 URL: https://issues.apache.org/jira/browse/LUCENE-8494 Project: Lucene - Core Issue Type:

[jira] [Commented] (LUCENE-7882) Maybe expression compiler should cache recently compiled expressions?

2018-08-31 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-7882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16599043#comment-16599043 ] Robert Muir commented on LUCENE-7882: - Well I think it needs to be tested on a modern JVM (e.g.

[jira] [Commented] (LUCENE-7882) Maybe expression compiler should cache recently compiled expressions?

2018-08-30 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-7882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597774#comment-16597774 ] Robert Muir commented on LUCENE-7882: - I really still think we shouldn't add any java-level caches

[jira] [Commented] (LUCENE-8468) A ByteBuffer based Directory implementation (and associated classes)

2018-08-28 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16594947#comment-16594947 ] Robert Muir commented on LUCENE-8468: - I think we had a race condition in our comments, thank you :)

[jira] [Commented] (LUCENE-8468) A ByteBuffer based Directory implementation (and associated classes)

2018-08-28 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16594943#comment-16594943 ] Robert Muir commented on LUCENE-8468: - Yes but it seems really wrong for some methods to throw

[jira] [Commented] (LUCENE-8468) A ByteBuffer based Directory implementation (and associated classes)

2018-08-28 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16594940#comment-16594940 ] Robert Muir commented on LUCENE-8468: - Can we avoid throwing FileNotFoundException from any new code

[jira] [Commented] (LUCENE-8462) New Arabic snowball stemmer

2018-08-22 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588901#comment-16588901 ] Robert Muir commented on LUCENE-8462: - This change looks great, however i'm wondering if we can

[jira] [Commented] (LUCENE-8450) Enable TokenFilters to assign offsets when splitting tokens

2018-08-20 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586454#comment-16586454 ] Robert Muir commented on LUCENE-8450: - Look at worddelimiterfilter, it wants to use case information

[jira] [Commented] (LUCENE-8450) Enable TokenFilters to assign offsets when splitting tokens

2018-08-10 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577055#comment-16577055 ] Robert Muir commented on LUCENE-8450: - {quote} Actually that the real solution for the decompounding

[jira] [Commented] (LUCENE-8450) Enable TokenFilters to assign offsets when splitting tokens

2018-08-10 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576805#comment-16576805 ] Robert Muir commented on LUCENE-8450: - {quote} I'm also reluctant to place WDF early in the chain

[jira] [Commented] (LUCENE-8450) Enable TokenFilters to assign offsets when splitting tokens

2018-08-10 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576633#comment-16576633 ] Robert Muir commented on LUCENE-8450: - {quote} To get the benefit, it's only really necessary to

[jira] [Commented] (LUCENE-8450) Enable TokenFilters to assign offsets when splitting tokens

2018-08-09 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574825#comment-16574825 ] Robert Muir commented on LUCENE-8450: - I feel pretty strongly that we shouldn't go this route. The
