[jira] [Comment Edited] (LUCENE-8863) Improve handling of edge cases in Kuromoji's DIctionaryBuilder

2019-06-15 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864841#comment-16864841 ] Mike Sokolov edited comment on LUCENE-8863 at 6/15/19 7:56 PM: --- {quote}Can

[jira] [Commented] (LUCENE-8863) Improve handling of edge cases in Kuromoji's DIctionaryBuilder

2019-06-15 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864841#comment-16864841 ] Mike Sokolov commented on LUCENE-8863: -- {quote}Can we just throw an exception on empty base form?

[jira] [Commented] (LUCENE-8863) Improve handling of edge cases in Kuromoji's DIctionaryBuilder

2019-06-15 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864701#comment-16864701 ] Mike Sokolov commented on LUCENE-8863: -- I'll submit a patch soon. My initial idea was to maintain

[jira] [Commented] (LUCENE-8816) Decouple Kuromoji's morphological analyser and its dictionary

2019-06-15 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864672#comment-16864672 ] Mike Sokolov commented on LUCENE-8816: -- I opened LUCENE-8863 to cover some small, but blocking,

[jira] [Created] (LUCENE-8863) Improve handling of edge cases in Kuromoji's DIctionaryBuilder

2019-06-15 Thread Mike Sokolov (JIRA)
Mike Sokolov created LUCENE-8863: Summary: Improve handling of edge cases in Kuromoji's DIctionaryBuilder Key: LUCENE-8863 URL: https://issues.apache.org/jira/browse/LUCENE-8863 Project: Lucene -

[jira] [Commented] (LUCENE-8781) Explore FST direct array arc encoding

2019-06-15 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864668#comment-16864668 ] Mike Sokolov commented on LUCENE-8781: -- Thanks for testing, [~dsmiley], you definitely found a bug.

[jira] [Commented] (LUCENE-8816) Decouple Kuromoji's morphological analyser and its dictionary

2019-06-11 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16861622#comment-16861622 ] Mike Sokolov commented on LUCENE-8816: -- Thanks Robert, yeah I understand this was built for a

[jira] [Commented] (LUCENE-8816) Decouple Kuromoji's morphological analyser and its dictionary

2019-06-11 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16861609#comment-16861609 ] Mike Sokolov commented on LUCENE-8816: -- I see that in {{BinaryDictionaryWriter}} we restrict

[jira] [Commented] (LUCENE-8791) Add CollectorRescorer

2019-06-10 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860382#comment-16860382 ] Mike Sokolov commented on LUCENE-8791: -- bq. We distribute total number of results we are looking

[jira] [Commented] (LUCENE-8781) Explore FST direct array arc encoding

2019-06-08 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16859227#comment-16859227 ] Mike Sokolov commented on LUCENE-8781: -- Got it, thanks. Yeah this was a tiny change, doesn't seem

[jira] [Updated] (LUCENE-8844) Bump FST Version (to 7)

2019-06-08 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Sokolov updated LUCENE-8844: - Summary: Bump FST Version (to 7) (was: Bump FST Version) > Bump FST Version (to 7) >

[jira] [Updated] (LUCENE-8844) Bump FST Version

2019-06-08 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Sokolov updated LUCENE-8844: - Description: In LUCENE-8781, we changed the FST encoding but did not bump the version number

[jira] [Assigned] (LUCENE-8844) Bump FST Version

2019-06-08 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Sokolov reassigned LUCENE-8844: Assignee: Mike Sokolov > Bump FST Version > > > Key:

[jira] [Created] (LUCENE-8844) Bump FST Version

2019-06-08 Thread Mike Sokolov (JIRA)
Mike Sokolov created LUCENE-8844: Summary: Bump FST Version Key: LUCENE-8844 URL: https://issues.apache.org/jira/browse/LUCENE-8844 Project: Lucene - Core Issue Type: Bug

[jira] [Commented] (LUCENE-8781) Explore FST direct array arc encoding

2019-06-08 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16859220#comment-16859220 ] Mike Sokolov commented on LUCENE-8781: -- OK, I see we write a version header and then check it for

[jira] [Commented] (LUCENE-8781) Explore FST direct array arc encoding

2019-06-06 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858079#comment-16858079 ] Mike Sokolov commented on LUCENE-8781: -- I think I -- did not understand how to edit CHANGES.txt

[jira] [Comment Edited] (LUCENE-8816) Decouple Kuromoji's morphological analyser and its dictionary

2019-05-28 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16849731#comment-16849731 ] Mike Sokolov edited comment on LUCENE-8816 at 5/28/19 1:41 PM: --- What if we

[jira] [Commented] (LUCENE-8816) Decouple Kuromoji's morphological analyser and its dictionary

2019-05-28 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16849731#comment-16849731 ] Mike Sokolov commented on LUCENE-8816: -- What if we changed the various dictionary classes to

[jira] [Resolved] (LUCENE-8781) Explore FST direct array arc encoding

2019-05-27 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Sokolov resolved LUCENE-8781. -- Resolution: Fixed Pushed to 8.x (and 7.x, although it seems there will be no future 7.x

[jira] [Updated] (LUCENE-8781) Explore FST direct array arc encoding

2019-05-26 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Sokolov updated LUCENE-8781: - Fix Version/s: (was: 8.x) 8.2 > Explore FST direct array arc encoding >

[jira] [Updated] (LUCENE-8781) Explore FST direct array arc encoding

2019-05-26 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Sokolov updated LUCENE-8781: - Fix Version/s: 8.x > Explore FST direct array arc encoding >

[jira] [Reopened] (LUCENE-8781) Explore FST direct array arc encoding

2019-05-26 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Sokolov reopened LUCENE-8781: -- reopening to track backporting this improvement to 8.x and 7.x > Explore FST direct array arc

[jira] [Commented] (LUCENE-4012) Make all query classes serializable, and provide a query parser to consume them

2019-05-19 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16843445#comment-16843445 ] Mike Sokolov commented on LUCENE-4012: -- I want to hijack this issue to be about making Query

[jira] [Updated] (LUCENE-4012) Make all query classes serializable, and provide a query parser to consume them

2019-05-19 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Sokolov updated LUCENE-4012: - Summary: Make all query classes serializable, and provide a query parser to consume them (was:

[jira] [Commented] (LUCENE-8798) Autogenerated ID for LeafReaderContexts Within An IndexSearcher

2019-05-13 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838538#comment-16838538 ] Mike Sokolov commented on LUCENE-8798: -- I think what confused me was the link to the other JIRA

[jira] [Commented] (LUCENE-8798) Autogenerated ID for LeafReaderContexts Within An IndexSearcher

2019-05-13 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838497#comment-16838497 ] Mike Sokolov commented on LUCENE-8798: -- [~atris] I glanced at the issue you referenced, but I don't

[jira] [Commented] (LUCENE-8780) Improve ByteBufferGuard in Java 11

2019-04-28 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16828826#comment-16828826 ] Mike Sokolov commented on LUCENE-8780: -- I don't have a good theory, but I was curious so I ran a

[jira] [Updated] (LUCENE-8781) Explore FST direct array arc encoding

2019-04-27 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Sokolov updated LUCENE-8781: - Description: This issue is for exploring an alternate FST encoding of Arcs as full-sized

[jira] [Updated] (LUCENE-8781) Explore FST direct array arc encoding

2019-04-27 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Sokolov updated LUCENE-8781: - Description: This issue is for exploring an alternate FST encoding of Arcs as full-sized

[jira] [Created] (LUCENE-8781) Explore FST direct array arc encoding

2019-04-27 Thread Mike Sokolov (JIRA)
Mike Sokolov created LUCENE-8781: Summary: Explore FST direct array arc encoding Key: LUCENE-8781 URL: https://issues.apache.org/jira/browse/LUCENE-8781 Project: Lucene - Core Issue Type:

[jira] [Commented] (LUCENE-8681) Prorated early termination

2019-04-15 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16818266#comment-16818266 ] Mike Sokolov commented on LUCENE-8681: -- I updated the PR with a new patch that changes the API for

[jira] [Commented] (LUCENE-8753) New PostingFormat - UniformSplit

2019-04-03 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809124#comment-16809124 ] Mike Sokolov commented on LUCENE-8753: -- The behavior I'm referring to isn't a problem with the

[jira] [Commented] (LUCENE-8753) New PostingFormat - UniformSplit

2019-04-03 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808860#comment-16808860 ] Mike Sokolov commented on LUCENE-8753: -- I've been working on some other FST-related changes, and

[jira] [Commented] (LUCENE-8750) Implement setMissingValue for numeric ValueSource sortFields

2019-04-02 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807816#comment-16807816 ] Mike Sokolov commented on LUCENE-8750: -- Here's a PR: https://github.com/apache/lucene-solr/pull/631

[jira] [Created] (LUCENE-8750) Implement setMissingValue for numeric ValueSource sortFields

2019-04-02 Thread Mike Sokolov (JIRA)
Mike Sokolov created LUCENE-8750: Summary: Implement setMissingValue for numeric ValueSource sortFields Key: LUCENE-8750 URL: https://issues.apache.org/jira/browse/LUCENE-8750 Project: Lucene - Core

[jira] [Commented] (LUCENE-8700) Enable concurrent flushing when no indexing is in progress

2019-02-19 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16772305#comment-16772305 ] Mike Sokolov commented on LUCENE-8700: -- Pull request for this issue:

[jira] [Created] (LUCENE-8700) Enable concurrent flushing when no indexing is in progress

2019-02-19 Thread Mike Sokolov (JIRA)
Mike Sokolov created LUCENE-8700: Summary: Enable concurrent flushing when no indexing is in progress Key: LUCENE-8700 URL: https://issues.apache.org/jira/browse/LUCENE-8700 Project: Lucene - Core

[jira] [Commented] (LUCENE-8681) Prorated early termination

2019-02-19 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16771986#comment-16771986 ] Mike Sokolov commented on LUCENE-8681: -- I posted [a new

[jira] [Commented] (LUCENE-8681) Prorated early termination

2019-02-15 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769854#comment-16769854 ] Mike Sokolov commented on LUCENE-8681: -- bq. ... doMaxScore and trackTotalHits (did you mean

[jira] [Commented] (LUCENE-8681) Prorated early termination

2019-02-15 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769698#comment-16769698 ] Mike Sokolov commented on LUCENE-8681: -- There are a bunch of different ways to provide for opt-in

[jira] [Commented] (LUCENE-8681) Prorated early termination

2019-02-13 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767360#comment-16767360 ] Mike Sokolov commented on LUCENE-8681: -- Yes, I guess it would be necessary to pass a

[jira] [Commented] (LUCENE-8681) Prorated early termination

2019-02-12 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766112#comment-16766112 ] Mike Sokolov commented on LUCENE-8681: -- bq. so from my perspective, api change is not really crazy

[jira] [Commented] (SOLR-13233) SpellCheckCollator ignores stacked tokens

2019-02-10 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-13233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16764410#comment-16764410 ] Mike Sokolov commented on SOLR-13233: - I wonder if SpellCheckCollator should just ignore all stacked

[jira] [Commented] (LUCENE-8681) Prorated early termination

2019-02-09 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16764163#comment-16764163 ] Mike Sokolov commented on LUCENE-8681: -- I hope I'm not reading this the right way (?!? :), but I do

[jira] [Commented] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap

2019-02-07 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16762692#comment-16762692 ] Mike Sokolov commented on LUCENE-8635: -- [~akjain] that's strange yeah -- this patch was supposed to

[jira] [Comment Edited] (LUCENE-8681) Prorated early termination

2019-02-07 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16762656#comment-16762656 ] Mike Sokolov edited comment on LUCENE-8681 at 2/7/19 1:28 PM: -- bq. However

[jira] [Commented] (LUCENE-8681) Prorated early termination

2019-02-07 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16762656#comment-16762656 ] Mike Sokolov commented on LUCENE-8681: -- bq. However I wonder if this could be implemented directly

[jira] [Created] (LUCENE-8681) Prorated early termination

2019-02-05 Thread Mike Sokolov (JIRA)
Mike Sokolov created LUCENE-8681: Summary: Prorated early termination Key: LUCENE-8681 URL: https://issues.apache.org/jira/browse/LUCENE-8681 Project: Lucene - Core Issue Type: Improvement

[jira] [Commented] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap

2019-02-01 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16758457#comment-16758457 ] Mike Sokolov commented on LUCENE-8635: -- Yes, [~akjain] that approach sounds good to me; we should

[jira] [Comment Edited] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap

2019-01-30 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756098#comment-16756098 ] Mike Sokolov edited comment on LUCENE-8635 at 1/30/19 1:24 PM: --- I agree

[jira] [Commented] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap

2019-01-30 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756098#comment-16756098 ] Mike Sokolov commented on LUCENE-8635: -- I agree that would be a good start. Perhaps as a separate

[jira] [Commented] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap

2019-01-29 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755391#comment-16755391 ] Mike Sokolov commented on LUCENE-8635: -- I posted my latest patch including off-heap change + FST

[jira] [Updated] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap

2019-01-29 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Sokolov updated LUCENE-8635: - Attachment: fst-offheap-rev.patch > Lazy loading Lucene FST offheap using mmap >

[jira] [Comment Edited] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap

2019-01-27 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16753468#comment-16753468 ] Mike Sokolov edited comment on LUCENE-8635 at 1/27/19 5:51 PM: --- I tried

[jira] [Commented] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap

2019-01-27 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16753468#comment-16753468 ] Mike Sokolov commented on LUCENE-8635: -- I tried that [~akjain] and strangely got a big drop in

[jira] [Commented] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap

2019-01-23 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750135#comment-16750135 ] Mike Sokolov commented on LUCENE-8635: -- {quote}we can simply change readBytes to below: {quote}

[jira] [Commented] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap

2019-01-22 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748996#comment-16748996 ] Mike Sokolov commented on LUCENE-8635: -- I uploaded a patch that combines these three things:

[jira] [Updated] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap

2019-01-22 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Sokolov updated LUCENE-8635: - Attachment: fst-offheap-ra-rev.patch > Lazy loading Lucene FST offheap using mmap >

[jira] [Updated] (LUCENE-8653) Reverse FST storage so it can be read forward

2019-01-22 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Sokolov updated LUCENE-8653: - Attachment: fst-reverse.patch > Reverse FST storage so it can be read forward >

[jira] [Commented] (LUCENE-8653) Reverse FST storage so it can be read forward

2019-01-22 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748756#comment-16748756 ] Mike Sokolov commented on LUCENE-8653: -- The reverse reading is required because the FST serializes

[jira] [Commented] (LUCENE-8653) Reverse FST storage so it can be read forward

2019-01-21 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748227#comment-16748227 ] Mike Sokolov commented on LUCENE-8653: -- Yeah, some initial measurements using luceneutil are

[jira] [Commented] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap

2019-01-21 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748082#comment-16748082 ] Mike Sokolov commented on LUCENE-8635: -- I opened LUCENE-8653 to explore reversing FSTs; if we can

[jira] [Created] (LUCENE-8653) Reverse FST storage so it can be read forward

2019-01-21 Thread Mike Sokolov (JIRA)
Mike Sokolov created LUCENE-8653: Summary: Reverse FST storage so it can be read forward Key: LUCENE-8653 URL: https://issues.apache.org/jira/browse/LUCENE-8653 Project: Lucene - Core Issue

[jira] [Commented] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap

2019-01-18 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16746726#comment-16746726 ] Mike Sokolov commented on LUCENE-8635: -- {quote}you can still end up with a cold FS cache eg. when

[jira] [Commented] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap

2019-01-18 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16746730#comment-16746730 ] Mike Sokolov commented on LUCENE-8635: -- For the cold host case, we already have to take measures to

[jira] [Commented] (LUCENE-8642) RamUsageTester.sizeOf ignores arrays and collections if --illegal-access=deny

2019-01-16 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16744639#comment-16744639 ] Mike Sokolov commented on LUCENE-8642: -- I feel like there is a value to a partial solution here,

[jira] [Commented] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap

2019-01-16 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16744621#comment-16744621 ] Mike Sokolov commented on LUCENE-8635: -- I used the wikimedia2m data set for the second set of tests

[jira] [Commented] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap

2019-01-16 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16744519#comment-16744519 ] Mike Sokolov commented on LUCENE-8635: -- Right, it seems crazy that makes a difference. I guess

[jira] [Commented] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap

2019-01-16 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16744344#comment-16744344 ] Mike Sokolov commented on LUCENE-8635: -- Following a suggestion from ~mikemccand I tried a slightly

[jira] [Comment Edited] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap

2019-01-16 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16744344#comment-16744344 ] Mike Sokolov edited comment on LUCENE-8635 at 1/16/19 6:54 PM: --- Following

[jira] [Updated] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap

2019-01-16 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Sokolov updated LUCENE-8635: - Attachment: ra.patch > Lazy loading Lucene FST offheap using mmap >

[jira] [Commented] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap

2019-01-15 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743067#comment-16743067 ] Mike Sokolov commented on LUCENE-8635: -- This looked interesting to me, too, so I did run the

[jira] [Commented] (LUCENE-8609) Allow getting consistent docstats from IndexWriter

2018-12-15 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16722302#comment-16722302 ] Mike Sokolov commented on LUCENE-8609: -- Thanks, [~simonw], I had already made the changes locally,

[jira] [Commented] (LUCENE-8609) Allow getting consistent docstats from IndexWriter

2018-12-14 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16721848#comment-16721848 ] Mike Sokolov commented on LUCENE-8609: -- I think this will break nightly benchmarks? Anyway I'm

[jira] [Commented] (LUCENE-8517) TestRandomChains.testRandomChainsWithLargeStrings failure

2018-11-19 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16691682#comment-16691682 ] Mike Sokolov commented on LUCENE-8517: -- Hmm sadly it does still repro for me even with LUCENE-8564

[jira] [Commented] (LUCENE-6336) AnalyzingInfixSuggester needs duplicate handling

2018-11-17 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-6336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16690571#comment-16690571 ] Mike Sokolov commented on LUCENE-6336: -- I dug into this a bit - it seems that we already do provide

[jira] [Commented] (LUCENE-8517) TestRandomChains.testRandomChainsWithLargeStrings failure

2018-11-17 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16690562#comment-16690562 ] Mike Sokolov commented on LUCENE-8517: -- With a bigger heap, I was able to get that one to run

[jira] [Commented] (LUCENE-8517) TestRandomChains.testRandomChainsWithLargeStrings failure

2018-11-16 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16690145#comment-16690145 ] Mike Sokolov commented on LUCENE-8517: -- {quote}Another reproducing seed, though it only fails for

[jira] [Commented] (LUCENE-8517) TestRandomChains.testRandomChainsWithLargeStrings failure

2018-11-16 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16690125#comment-16690125 ] Mike Sokolov commented on LUCENE-8517: -- Well I can at least add a bit more color. I added some

[jira] [Commented] (LUCENE-8517) TestRandomChains.testRandomChainsWithLargeStrings failure

2018-11-16 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16689599#comment-16689599 ] Mike Sokolov commented on LUCENE-8517: -- [~steve_rowe] I can take a look at this one >

[jira] [Commented] (LUCENE-3922) Add Japanese Kanji number normalization to Kuromoji

2018-11-16 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-3922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16689577#comment-16689577 ] Mike Sokolov commented on LUCENE-3922: -- +1 - this was merged ages ago (2015); would be nice to

[jira] [Commented] (SOLR-12964) Use advanceExact instead of advance in a few remaining json facet use cases

2018-11-08 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-12964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16680359#comment-16680359 ] Mike Sokolov commented on SOLR-12964: - {quote}what do you think about making {{DocValuesIterator}}

[jira] [Commented] (LUCENE-8509) NGramTokenizer, TrimFilter and WordDelimiterGraphFilter in combination can produce backwards offsets

2018-10-26 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665429#comment-16665429 ] Mike Sokolov commented on LUCENE-8509: -- [ from mailing list – sorry for the duplication ] The

[jira] [Commented] (LUCENE-8516) Make WordDelimiterGraphFilter a Tokenizer

2018-10-04 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638156#comment-16638156 ] Mike Sokolov commented on LUCENE-8516: -- {quote}Can you elaborate? This rings a bell but I forget. 

[jira] [Comment Edited] (LUCENE-8516) Make WordDelimiterGraphFilter a Tokenizer

2018-10-01 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16634218#comment-16634218 ] Mike Sokolov edited comment on LUCENE-8516 at 10/1/18 3:51 PM: --- Thanks for

[jira] [Commented] (LUCENE-8516) Make WordDelimiterGraphFilter a Tokenizer

2018-10-01 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16634218#comment-16634218 ] Mike Sokolov commented on LUCENE-8516: -- Thanks for copy/paste, [~romseygeek], I meant to reply-all,

[jira] [Commented] (SOLR-1394) HTML stripper is splitting tokens

2018-09-22 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624552#comment-16624552 ] Mike Sokolov commented on SOLR-1394: I'm pretty sure this issue is no longer valid. I don't use this

[jira] [Resolved] (LUCENE-5074) Support open-ended NumericRangeQuery in XmlQueryParser

2018-08-31 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Sokolov resolved LUCENE-5074. -- Resolution: Implemented These classes no longer exist, and their replacements handle nulls as

[jira] [Resolved] (SOLR-3513) specifying 2147483647 for rows parameter causes AIOOBE

2018-08-31 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Sokolov resolved SOLR-3513. Resolution: Not A Problem > specifying 2147483647 for rows parameter causes AIOOBE >

[jira] [Resolved] (LUCENE-8019) Add a root failure cause to Explanation

2018-08-31 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Sokolov resolved LUCENE-8019. -- Resolution: Won't Fix > Add a root failure cause to Explanation >

[jira] [Commented] (LUCENE-765) Index package level javadocs needs content

2018-08-30 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597820#comment-16597820 ] Mike Sokolov commented on LUCENE-765: - Thanks for pushing [~jpountz]! I suppose we can always use

[jira] [Commented] (LUCENE-3318) Sketch out highlighting based on term positions / position iterators

2018-08-29 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596890#comment-16596890 ] Mike Sokolov commented on LUCENE-3318: -- [~arafalov] please feel free to resolve, as discussed on

[jira] [Commented] (LUCENE-765) Index package level javadocs needs content

2018-08-29 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596792#comment-16596792 ] Mike Sokolov commented on LUCENE-765: - OK, this patch supplies fully-qualified paths for all the

[jira] [Updated] (LUCENE-765) Index package level javadocs needs content

2018-08-29 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Sokolov updated LUCENE-765: Attachment: LUCENE-765.patch > Index package level javadocs needs content >

[jira] [Updated] (LUCENE-765) Index package level javadocs needs content

2018-08-29 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Sokolov updated LUCENE-765: Attachment: (was: LUCENE-765.patch.2) > Index package level javadocs needs content >

[jira] [Updated] (LUCENE-765) Index package level javadocs needs content

2018-08-29 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Sokolov updated LUCENE-765: Attachment: LUCENE-765.patch.2 > Index package level javadocs needs content >

[jira] [Assigned] (LUCENE-3318) Sketch out highlighting based on term positions / position iterators

2018-08-29 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Sokolov reassigned LUCENE-3318: Assignee: (was: Mike Sokolov) > Sketch out highlighting based on term positions /

[jira] [Comment Edited] (LUCENE-765) Index package level javadocs needs content

2018-08-29 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596529#comment-16596529 ] Mike Sokolov edited comment on LUCENE-765 at 8/29/18 4:39 PM: -- Ah, OK sorry

[jira] [Comment Edited] (LUCENE-765) Index package level javadocs needs content

2018-08-29 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596529#comment-16596529 ] Mike Sokolov edited comment on LUCENE-765 at 8/29/18 4:28 PM: -- Ah, OK sorry

[jira] [Comment Edited] (LUCENE-765) Index package level javadocs needs content

2018-08-29 Thread Mike Sokolov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596529#comment-16596529 ] Mike Sokolov edited comment on LUCENE-765 at 8/29/18 4:27 PM: -- Ah, OK sorry
