Re: Vamana greedy search variant

2023-08-05 Thread jim ferenczi
Hi Jonathan, Could you provide further clarification on your goal? The current description is unclear. Why construct an HNSW graph only to 'optimize' it into a Vamana graph? Why not directly build a Vamana graph? This paper provides guidance for streaming

Re: Connecting Lucene with ChatGPT Retrieval Plugin

2023-05-09 Thread jim ferenczi
Lucene is a library. I don’t see how it would be exposed in this plugin which is about services. On Tue, 9 May 2023 at 18:00, Jun Luo wrote: > The pr mentioned a Elasticsearch pr > that increased the > dim to 2048 in ElasticSearch. > >

Re: [Proposal] Remove max number of dimensions for KNN vectors

2023-04-07 Thread jim ferenczi
low. Hence if you look at storedfields writer, > there is "dirtiness" logic etc so that recompression is amortized over > time and doesn't happen on every merge. > > On Fri, Apr 7, 2023 at 5:38 PM jim ferenczi > wrote: > > > > I am also not sure that diskann

Re: [Proposal] Remove max number of dimensions for KNN vectors

2023-04-07 Thread jim ferenczi
and the overall recall. On Fri, 7 Apr 2023 at 22:36, jim ferenczi wrote: > I am also not sure that diskann would solve the merging issue. The idea > described in the paper is to run kmeans first to create multiple graphs, one > per cluster. In our case the vectors in each segment could belong t

Re: [Proposal] Remove max number of dimensions for KNN vectors

2023-04-07 Thread jim ferenczi
Apr 2023 at 22:28, jim ferenczi wrote: > The inference time (and cost) to generate these big vectors must be quite > large too ;). > Regarding the ram buffer, we could drastically reduce the size by writing > the vectors on disk instead of keeping them in the heap. With 1k dimensio

Re: [Proposal] Remove max number of dimensions for KNN vectors

2023-04-07 Thread jim ferenczi
The inference time (and cost) to generate these big vectors must be quite large too ;). Regarding the ram buffer, we could drastically reduce the size by writing the vectors on disk instead of keeping them in the heap. With 1k dimensions the ram buffer is filled with these vectors quite rapidly.
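A rough back-of-the-envelope sketch of why the RAM buffer fills quickly. It assumes float32 vectors, reads "1k dimensions" as 1024, and takes 16 MB as the buffer size; the class and method names are hypothetical, and graph and per-document overhead are ignored.

```java
public class VectorRamSketch {

    // Heap bytes needed for the raw values of one float32 vector.
    static long bytesPerVector(int dims) {
        return (long) dims * Float.BYTES;
    }

    // How many raw vectors fit in a buffer of the given size (in MB).
    // Assumption: nothing else competes for the buffer.
    static long vectorsThatFit(double bufferMb, int dims) {
        long bufferBytes = (long) (bufferMb * 1024 * 1024);
        return bufferBytes / bytesPerVector(dims);
    }

    public static void main(String[] args) {
        int dims = 1024;        // "1k dimensions" (assumed to mean 1024)
        double bufferMb = 16.0; // assumed buffer size
        System.out.println("bytes per vector: " + bytesPerVector(dims));
        System.out.println("vectors per flush: " + vectorsThatFit(bufferMb, dims));
    }
}
```

Under these assumptions each vector takes 4 KB, so only a few thousand documents fit before a flush, which is the motivation for writing vectors to disk instead of holding them on the heap.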

Re: Welcome Julie Tibshirani to the Lucene PMC

2021-11-30 Thread jim ferenczi
Congrats and welcome Julie! On Tue, 30 Nov 2021 at 22:49, Adrien Grand wrote: > I'm pleased to announce that Julie Tibshirani has accepted an invitation > to join the Lucene PMC! > > Congratulations Julie, and welcome aboard! > > -- > Adrien >

Re: Accessibility of CollectedSearchGroup's state

2021-10-14 Thread jim ferenczi
I agree, we should have a SinglePassGroupingCollector in Elasticsearch and reduce the visibility of these expert classes in Lucene. As it stands today, the FirstPassGroupingCollector could be a final class imo. On Thu, 14 Oct 2021 at 18:42, Adrien Grand wrote: > I feel sorry for increasing

Re: Welcome Peter Gromov as Lucene committer

2021-04-07 Thread jim ferenczi
Welcome Peter! On Wed, 7 Apr 2021 at 15:14, Ignacio Vera wrote: > Welcome Peter! > > On Wed, Apr 7, 2021 at 3:10 PM Peter Gromov > wrote: > >> Thanks, that helped! >> >> On Wed, Apr 7, 2021 at 11:23 AM Dawid Weiss >> wrote: >> >>> See here - >>> https://git.apache.org/setup/ >>> >>> On

Re: [VOTE] Release Lucene/Solr 8.8.0 RC1

2021-01-20 Thread jim ferenczi
+1 SUCCESS! [1:06:13.147855] On Wed, 20 Jan 2021 at 08:39, Atri Sharma wrote: > +1 (binding) > > SUCCESS! [1:04:15:20393] > > On Wed, Jan 20, 2021 at 1:03 PM Ignacio Vera wrote: > > > > +1 (binding) > > > > SUCCESS! [1:05:30.358141] > > > > > > On Tue, Jan 19, 2021 at 8:25 PM Timothy

Re: 8.8 Release

2021-01-19 Thread jim ferenczi
rk done, so I took over > from Noble tonight. > > On Mon, Jan 18, 2021 at 7:24 PM jim ferenczi > wrote: > >> Sorry, I forgot to add the link to the issue: >> https://issues.apache.org/jira/browse/LUCENE-9675 >> >> >> On Mon, 18 Jan 2021 at 14:33,

Re: 8.8 Release

2021-01-18 Thread jim ferenczi
Sorry, I forgot to add the link to the issue: https://issues.apache.org/jira/browse/LUCENE-9675 On Mon, 18 Jan 2021 at 14:33, jim ferenczi wrote: > Hi Noble, > I opened an issue to expose the compression mode that is used in binary > doc values. The configurable compression

Re: 8.8 Release

2021-01-18 Thread jim ferenczi
Hi Noble, I opened an issue to expose the compression mode that is used in binary doc values. The configurable compression is a new feature in 8.8 so we'd like to expose the compression mode that was used to write the segment in the attributes of the field. I'd like to backport to the 8.8 branch

Re: RFC: N-2 compatibility for file formats

2021-01-07 Thread jim ferenczi
The proposal is only about keeping the ability to read file-format up to N-2. Everything that is done on top of the file format is not guaranteed and should be supported on a best-effort basis. That's an important aspect if we don't want to block innovation. So in practice that means that queries

Re: Suggested query parser syntax change fuzzy and boost operators (term^3~2)

2020-09-18 Thread jim ferenczi
+1 to be more strict about the order of operators. That's a bug fix imo. On Thu, 17 Sep 2020 at 08:58, Dawid Weiss wrote: > Just so that it's not overlooked. I suggest a cleanup of the > (flexible?) query parser syntax in LUCENE-9528. > > In short, the current javacc code is a tangled mess

Re: Avoiding false-positives in multivalued field search with intervals?

2020-09-10 Thread jim ferenczi
ces (and then in turn into IntervalQuery). > > Dawid > > On Thu, Sep 10, 2020 at 4:28 PM jim ferenczi > wrote: > > > > Right, I misunderstood Alan's answer. The boundary option is not > "impure" in my opinion. It solves this issue nicely but maybe it needs > someth

Re: Avoiding false-positives in multivalued field search with intervals?

2020-09-10 Thread jim ferenczi
to what Alan suggested. I'd have to rewrite the (general > text-to-query) query parser to only use intervals though. Still > thinking about possible approaches to this. > > D. > > On Thu, Sep 10, 2020 at 3:58 PM jim ferenczi > wrote: > > > > You could set a very high

Re: Avoiding false-positives in multivalued field search with intervals?

2020-09-10 Thread jim ferenczi
You could set a very high position increment gap for multi-valued fields (Analyzer#getPositionIncrementGap) and perform something like Intervals.maxWidth(Intervals.unordered(...), pos_gap-1)? On Thu, 10 Sep 2020 at 12:32, Dawid Weiss wrote: > Yeah... I was thinking about adding synthetic
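The arithmetic behind this suggestion can be sketched without Lucene itself. The class below is a hypothetical plain-Java model, not Lucene code: it assumes that the first token of each later value lands gap + 1 positions after the last token of the previous value, and that a width filter accepts an interval when (end - start) + 1 is at most the limit. Under those assumptions, any interval that crosses a value boundary is wider than pos_gap-1 and is rejected.

```java
import java.util.ArrayList;
import java.util.List;

public class PositionGapSketch {

    // Assign token positions for a multi-valued field.
    // Assumption: each later value starts gap + 1 positions after the
    // last token of the previous value.
    static List<int[]> assignPositions(int[] tokensPerValue, int gap) {
        List<int[]> out = new ArrayList<>();
        int pos = -1;
        for (int v = 0; v < tokensPerValue.length; v++) {
            if (v > 0) {
                pos += gap; // the position increment gap between values
            }
            int[] positions = new int[tokensPerValue[v]];
            for (int t = 0; t < positions.length; t++) {
                positions[t] = ++pos;
            }
            out.add(positions);
        }
        return out;
    }

    // Width of the smallest interval covering both positions, endpoints included.
    static int width(int a, int b) {
        return Math.abs(a - b) + 1;
    }

    // Models a max-width filter: keep the interval only if it is narrow enough.
    static boolean accepted(int a, int b, int maxWidth) {
        return width(a, b) <= maxWidth;
    }

    public static void main(String[] args) {
        int gap = 1000;
        List<int[]> pos = assignPositions(new int[] {3, 4}, gap);
        int[] first = pos.get(0);
        int[] second = pos.get(1);
        System.out.println("same value: " + accepted(first[0], first[2], gap - 1));
        System.out.println("across values: " + accepted(first[2], second[0], gap - 1));
    }
}
```

The trade-off in this model is that a single value longer than pos_gap-1 tokens could also lose legitimate matches, so the gap has to be larger than the longest value expected.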

Re: [VOTE] Lucene logo contest, third time's a charm

2020-09-03 Thread jim ferenczi
A1 (binding) On Thu, 3 Sep 2020 at 07:09, Noble Paul wrote: > A1, A2, D binding > > On Thu, Sep 3, 2020 at 7:22 AM Jason Gerlowski > wrote: > > > > A1, A2, D (binding) > > > > On Wed, Sep 2, 2020 at 10:47 AM Michael McCandless > > wrote: > > > > > > A2, A1, C5, D (binding) > > > > > >

Re: Welcome Atri Sharma to the PMC

2020-08-20 Thread jim ferenczi
Welcome Atri! On Thu, 20 Aug 2020 at 22:00, Jan Høydahl wrote: > Welcome Atri! > > Jan > > On 20 Aug 2020 at 20:16, Ishan Chattopadhyaya < > ichattopadhy...@gmail.com> wrote: > > > I am pleased to announce that Atri Sharma has accepted the PMC's > invitation to join. > > Congratulations and

Re: Welcome Namgyu Kim to the PMC

2020-08-03 Thread jim ferenczi
Congratulations Namgyu! On Mon, 3 Aug 2020 at 18:27, Steve Rowe wrote: > Congrats and welcome, Namgyu! > > -- > Steve > > > On Aug 2, 2020, at 7:18 PM, Ishan Chattopadhyaya < > ichattopadhy...@gmail.com> wrote: > > > > I am pleased to announce that Namgyu Kim has accepted the PMC's >

Welcome Mayya Sharipova as Lucene/Solr committer

2020-06-08 Thread jim ferenczi
Hi all, Please join me in welcoming Mayya Sharipova as the latest Lucene/Solr committer. Mayya, it's tradition for you to introduce yourself with a brief bio. Congratulations and Welcome! Jim

Re: [VOTE] Solr to become a top-level Apache project (TLP)

2020-05-12 Thread jim ferenczi
+1 (binding) On Tue, 12 May 2020 at 14:00, Simon Willnauer wrote: > +1 binding > > Sent from a mobile device > > > On 12. May 2020, at 13:33, Jason Gerlowski > wrote: > > > > -1 (binding) > > > >> On Tue, May 12, 2020 at 7:31 AM Alan Woodward > wrote: > >> > >> +1 (binding) > >> > >> Alan

Re: 7.7.3 bugfix release

2020-04-16 Thread jim ferenczi
wrote: > Hi, Please merge all the required changes to the branch branch_7_7 > > I shall cut the branch in a day or two > > On Mon, Apr 6, 2020 at 6:14 PM jim ferenczi > wrote: > > > > Hi Paul, > > Ignacio started the release process for a bug fix release

Re: [VOTE] Release Lucene/Solr 8.5.1 RC1

2020-04-09 Thread jim ferenczi
+1 SUCCESS! [2:10:08.094546] On Thu, 9 Apr 2020 at 10:19, Alan Woodward wrote: > +1 > > SUCCESS! [1:18:54.574272] > > On 8 Apr 2020, at 21:21, Nhat Nguyen > wrote: > > +1 > > SUCCESS! [0:52:20.920081] > > > On Wed, Apr 8, 2020 at 6:31 AM Ignacio Vera wrote: > >> >> Please vote for release

Re: 7.7.3 bugfix release

2020-04-06 Thread jim ferenczi
Hi Paul, Ignacio started the release process for a bug fix release of 8.5.1 last week. We cannot have two releases at the same time so would you agree to start 7.7.3 after 8.5.1 is out? I'd also like to backport LUCENE-9300 to 7.7 (the reason why we started an 8.5.1 release) so don't

Re: Lucene/Solr 8.5.1 bugfix release

2020-04-03 Thread jim ferenczi
+1, thanks Ignacio. I merged the fix for LUCENE-9300 and backported to the 8.5 branch. On Thu, 2 Apr 2020 at 21:48, Adrien Grand wrote: > My general take on this is that it's ok to upgrade a dependency in a patch > release if the dependency

Re: [VOTE] Release Lucene/Solr 8.5.0 RC1

2020-03-17 Thread jim ferenczi
+1 SUCCESS! [1:18:55.683704] On Tue, 17 Mar 2020 at 01:35, Mike Drob wrote: > +1 (non-binding) > > All testing was with Java 11.0.5 > > Smoke tester didn't work (expected) > > Manually ran lucene and solr tests, had a few solr failures but they > passed when rerunning individually. > Went

Re: Lucene/Solr 8.4

2019-11-22 Thread jim ferenczi
+1 On Fri, 22 Nov 2019 at 10:08, Ishan Chattopadhyaya < ichattopadhy...@gmail.com> wrote: > +1 > > On Fri, Nov 22, 2019 at 2:16 PM Atri Sharma wrote: > > > > +1 > > > > On Fri, Nov 22, 2019 at 2:08 PM Adrien Grand wrote: > > > > > > Hello all, > > > > > > With Thanksgiving and then

Re: [VOTE] Release Lucene/Solr 8.3.0 RC1

2019-10-22 Thread jim ferenczi
, 2019, 6:36 PM Uwe Schindler, wrote: > >> Yes, I would suggest to add this change. It's a bug and as the other >> query was already fixed, that's needed for consistency, otherwise you would >> see strange bugs. >> >> Uwe >> >> On October 22, 2019 12:3

Re: [VOTE] Release Lucene/Solr 8.3.0 RC1

2019-10-22 Thread jim ferenczi
If we respin I'd like to include https://issues.apache.org/jira/browse/LUCENE-9022 that makes all join queries non-eligible for the query cache. The fix is ready and approved so I can backport any time if you are ok with it. On Tue, 22 Oct 2019 at 00:04, Ishan Chattopadhyaya <

Re: Welcome Atri Sharma as Lucene/Solr committer

2019-09-18 Thread jim ferenczi
Congratulations Atri! On Wed, 18 Sep 2019 at 09:28, Ignacio Vera wrote: > Welcome Atri! > > On Wed, Sep 18, 2019 at 9:12 AM Adrien Grand wrote: > >> Hi all, >> >> Please join me in welcoming Atri Sharma as Lucene/Solr committer! >> >> If you are following activity on Lucene, this name

[jira] [Updated] (LUCENE-8966) KoreanTokenizer should split unknown words on digits

2019-09-13 Thread Jim Ferenczi (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Ferenczi updated LUCENE-8966: - Fix Version/s: 8.3 master (9.0) Resolution: Fixed Status

[jira] [Commented] (LUCENE-8977) Handle punctuation characters in KoreanTokenizer

2019-09-13 Thread Jim Ferenczi (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16929027#comment-16929027 ] Jim Ferenczi commented on LUCENE-8977: -- I wonder why you think that this is an issue. Punctuations

[jira] [Commented] (LUCENE-8966) KoreanTokenizer should split unknown words on digits

2019-09-09 Thread Jim Ferenczi (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16925588#comment-16925588 ] Jim Ferenczi commented on LUCENE-8966: -- I don't think it's a bug [~danmuzi] or at least that it's

[jira] [Commented] (LUCENE-8966) KoreanTokenizer should split unknown words on digits

2019-09-05 Thread Jim Ferenczi (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923394#comment-16923394 ] Jim Ferenczi commented on LUCENE-8966: -- {quote} Would you consider grouping numbers and (at least

[jira] [Updated] (LUCENE-8966) KoreanTokenizer should split unknown words on digits

2019-09-05 Thread Jim Ferenczi (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Ferenczi updated LUCENE-8966: - Attachment: LUCENE-8966.patch Status: Patch Available (was: Patch Available) New patch

[jira] [Commented] (LUCENE-8966) KoreanTokenizer should split unknown words on digits

2019-09-05 Thread Jim Ferenczi (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923357#comment-16923357 ] Jim Ferenczi commented on LUCENE-8966: -- Thanks for looking [~thetaphi]. These two private static

[jira] [Commented] (LUCENE-8966) KoreanTokenizer should split unknown words on digits

2019-09-05 Thread Jim Ferenczi (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923222#comment-16923222 ] Jim Ferenczi commented on LUCENE-8966: -- Here is a patch that breaks unknown words on digits instead

[jira] [Updated] (LUCENE-8966) KoreanTokenizer should split unknown words on digits

2019-09-05 Thread Jim Ferenczi (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Ferenczi updated LUCENE-8966: - Status: Patch Available (was: Open) > KoreanTokenizer should split unknown words on dig

[jira] [Updated] (LUCENE-8966) KoreanTokenizer should split unknown words on digits

2019-09-05 Thread Jim Ferenczi (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Ferenczi updated LUCENE-8966: - Attachment: LUCENE-8966.patch > KoreanTokenizer should split unknown words on dig

[jira] [Updated] (LUCENE-8966) KoreanTokenizer should split unknown words on digits

2019-09-05 Thread Jim Ferenczi (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Ferenczi updated LUCENE-8966: - Description: Since https://issues.apache.org/jira/browse/LUCENE-8548 the Korean tokenizer

[jira] [Created] (LUCENE-8966) KoreanTokenizer should split unknown words on digits

2019-09-05 Thread Jim Ferenczi (Jira)
Jim Ferenczi created LUCENE-8966: Summary: KoreanTokenizer should split unknown words on digits Key: LUCENE-8966 URL: https://issues.apache.org/jira/browse/LUCENE-8966 Project: Lucene - Core

[jira] [Updated] (LUCENE-8959) JapaneseNumberFilter does not take whitespaces into account when concatenating numbers

2019-08-29 Thread Jim Ferenczi (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Ferenczi updated LUCENE-8959: - Description: Today the JapaneseNumberFilter tries to concatenate numbers even

[jira] [Commented] (LUCENE-8959) JapaneseNumberFilter does not take whitespaces into account when concatenating numbers

2019-08-29 Thread Jim Ferenczi (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918532#comment-16918532 ] Jim Ferenczi commented on LUCENE-8959: -- *Update:* Whitespaces were removed in my tests because I

[jira] [Created] (LUCENE-8959) JapaneseNumberFilter does not take whitespaces into account when concatenating numbers

2019-08-29 Thread Jim Ferenczi (Jira)
Jim Ferenczi created LUCENE-8959: Summary: JapaneseNumberFilter does not take whitespaces into account when concatenating numbers Key: LUCENE-8959 URL: https://issues.apache.org/jira/browse/LUCENE-8959

[jira] [Commented] (LUCENE-8943) Incorrect IDF in MultiPhraseQuery and SpanOrQuery

2019-08-12 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905332#comment-16905332 ] Jim Ferenczi commented on LUCENE-8943: -- {quote} Your post made me think of the problem in another

[jira] [Commented] (LUCENE-8943) Incorrect IDF in MultiPhraseQuery and SpanOrQuery

2019-08-09 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16903968#comment-16903968 ] Jim Ferenczi commented on LUCENE-8943: -- I don't think we can realistically approximate the doc freq

[jira] [Commented] (LUCENE-8747) Allow access to submatches from Matches instances

2019-08-06 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901091#comment-16901091 ] Jim Ferenczi commented on LUCENE-8747: -- Can we return a list of Matches in findNamedMatches

[jira] [Commented] (LUCENE-8941) Build wildcard matches more lazily

2019-08-01 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16898316#comment-16898316 ] Jim Ferenczi commented on LUCENE-8941: -- +1 the patch looks good. Can you add an assert

[jira] [Resolved] (LUCENE-8935) BooleanQuery with no scoring clauses cannot skip documents when running TOP_SCORES mode

2019-07-29 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Ferenczi resolved LUCENE-8935. -- Resolution: Fixed Fix Version/s: 8.3 master (9.0) > BooleanQu

[jira] [Commented] (LUCENE-8935) BooleanQuery with no scoring clauses cannot skip documents when running TOP_SCORES mode

2019-07-26 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893713#comment-16893713 ] Jim Ferenczi commented on LUCENE-8935: -- Sorry I misunderstood the logic but the number of scoring

[jira] [Commented] (LUCENE-8935) BooleanQuery with no scoring clauses cannot skip documents when running TOP_SCORES mode

2019-07-26 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893708#comment-16893708 ] Jim Ferenczi commented on LUCENE-8935: -- The logic is already at the bottom

[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-07-26 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893677#comment-16893677 ] Jim Ferenczi commented on LUCENE-8933: -- {quote} Should we go further and check

[jira] [Updated] (LUCENE-8935) BooleanQuery with no scoring clauses cannot skip documents when running TOP_SCORES mode

2019-07-26 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Ferenczi updated LUCENE-8935: - Attachment: LUCENE-8935.patch Status: Open (was: Open) Here is a patch that wraps

[jira] [Created] (LUCENE-8935) BooleanQuery with no scoring clauses cannot skip documents when running TOP_SCORES mode

2019-07-26 Thread Jim Ferenczi (JIRA)
Jim Ferenczi created LUCENE-8935: Summary: BooleanQuery with no scoring clauses cannot skip documents when running TOP_SCORES mode Key: LUCENE-8935 URL: https://issues.apache.org/jira/browse/LUCENE-8935

[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-07-26 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893478#comment-16893478 ] Jim Ferenczi commented on LUCENE-8933: -- {quote} If there are no other opinions or objections, I'd

[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-07-25 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892671#comment-16892671 ] Jim Ferenczi commented on LUCENE-8933: -- The first argument of the dictionary rule is the original

[jira] [Commented] (LUCENE-8889) Remove Dead Code From PointRangeQuery

2019-06-27 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16873992#comment-16873992 ] Jim Ferenczi commented on LUCENE-8889: -- Why is it an issue ? We have some use cases

[jira] [Resolved] (LUCENE-8859) Add an option to load the completion suggester's FST off-heap

2019-06-26 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Ferenczi resolved LUCENE-8859. -- Resolution: Fixed Fix Version/s: 8.2 master (9.0) > Add an opt

[jira] [Resolved] (LUCENE-7714) Optimize range queries for the sorted case

2019-06-26 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-7714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Ferenczi resolved LUCENE-7714. -- Resolution: Fixed Fix Version/s: 8.2 master (9.0) Thanks

[jira] [Commented] (LUCENE-8806) WANDScorer should support two-phase iterator

2019-06-25 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872213#comment-16872213 ] Jim Ferenczi commented on LUCENE-8806: -- I am testing with wikimediumall > WANDScorer sho

[jira] [Commented] (LUCENE-8806) WANDScorer should support two-phase iterator

2019-06-25 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872169#comment-16872169 ] Jim Ferenczi commented on LUCENE-8806: -- {quote} FYI we have an issue for phrases already LUCENE

[jira] [Commented] (LUCENE-8848) UnifiedHighlighter should highlight all Query types that implement Weight.matches

2019-06-25 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872150#comment-16872150 ] Jim Ferenczi commented on LUCENE-8848: -- The RandomIndexWriter is created but not closed

[jira] [Commented] (LUCENE-8806) WANDScorer should support two-phase iterator

2019-06-25 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872142#comment-16872142 ] Jim Ferenczi commented on LUCENE-8806: -- I ran luceneutil with some disjunctions of phrase and term

[jira] [Commented] (LUCENE-8859) Add an option to load the completion suggester's FST off-heap

2019-06-18 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16866367#comment-16866367 ] Jim Ferenczi commented on LUCENE-8859: -- Thanks for looking Adrien. Currently users can add the file

[jira] [Updated] (LUCENE-8859) Add an option to load the completion suggester's FST off-heap

2019-06-14 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Ferenczi updated LUCENE-8859: - Attachment: LUCENE-8859.patch > Add an option to load the completion suggester's FST off-h

[jira] [Commented] (LUCENE-8859) Add an option to load the completion suggester's FST off-heap

2019-06-14 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16863810#comment-16863810 ] Jim Ferenczi commented on LUCENE-8859: -- Here is a patch that exposes an option to force the load

[jira] [Updated] (LUCENE-8859) Add an option to load the completion suggester's FST off-heap

2019-06-14 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Ferenczi updated LUCENE-8859: - Priority: Minor (was: Major) > Add an option to load the completion suggester's FST off-h

[jira] [Updated] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap

2019-06-14 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Ferenczi updated LUCENE-8635: - Priority: Major (was: Minor) > Lazy loading Lucene FST offheap using m

[jira] [Created] (LUCENE-8859) Add an option to load the completion suggester's FST off-heap

2019-06-14 Thread Jim Ferenczi (JIRA)
Jim Ferenczi created LUCENE-8859: Summary: Add an option to load the completion suggester's FST off-heap Key: LUCENE-8859 URL: https://issues.apache.org/jira/browse/LUCENE-8859 Project: Lucene - Core

[jira] [Updated] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap

2019-06-14 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Ferenczi updated LUCENE-8635: - Priority: Minor (was: Major) > Lazy loading Lucene FST offheap using m

[jira] [Commented] (LUCENE-8845) Allow maxExpansions to be set on multi-term Intervals

2019-06-11 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860693#comment-16860693 ] Jim Ferenczi commented on LUCENE-8845: -- +1 > Allow maxExpansions to be set on multi-term Interv

[jira] [Commented] (LUCENE-8845) Allow maxExpansions to be set on multi-term Intervals

2019-06-10 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860252#comment-16860252 ] Jim Ferenczi commented on LUCENE-8845: -- {quote} 2) I think this is covered by the javadocs

Re: No email notifications from JIRA when attaching a patch

2019-06-10 Thread jim ferenczi
> Jim's update (the first link I shared) includes both an attachment and a comment and we got neither of them. I use the comment box when attaching a patch so I guess this is why the notification was not triggered. On Mon, 10 Jun 2019 at 19:20, Adrien Grand wrote: > This might explain why

[jira] [Commented] (LUCENE-8812) add KoreanNumberFilter to Nori(Korean) Analyzer

2019-06-10 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16859778#comment-16859778 ] Jim Ferenczi commented on LUCENE-8812: -- Thanks [~danmuzi]! > add KoreanNumberFilter to Nori(Kor

[jira] [Commented] (LUCENE-8840) TopTermsBlendedFreqScoringRewrite should use SynonymQuery

2019-06-07 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858600#comment-16858600 ] Jim Ferenczi commented on LUCENE-8840: -- {quote} I am curious to understand how including doc

[jira] [Created] (LUCENE-8840) TopTermsBlendedFreqScoringRewrite should use SynonymQuery

2019-06-07 Thread Jim Ferenczi (JIRA)
Jim Ferenczi created LUCENE-8840: Summary: TopTermsBlendedFreqScoringRewrite should use SynonymQuery Key: LUCENE-8840 URL: https://issues.apache.org/jira/browse/LUCENE-8840 Project: Lucene - Core

[jira] [Commented] (LUCENE-8812) add KoreanNumberFilter to Nori(Korean) Analyzer

2019-06-06 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16857911#comment-16857911 ] Jim Ferenczi commented on LUCENE-8812: -- Sorry I didn't see your reply. I agree with you

Re: Welcome Namgyu Kim as Lucene/Solr committer

2019-06-05 Thread jim ferenczi
Welcome Namgyu! On Wed, 5 Jun 2019 at 13:54, Ignacio Vera wrote: > Welcome! > > On Wed, Jun 5, 2019 at 1:53 PM Michael Sokolov wrote: > >> Namgyu! Welcome >> >> Mike >> >> On Mon, Jun 3, 2019 at 1:52 PM Adrien Grand wrote: >> > >> > Hi all, >> > >> > Please join me in welcoming Namgyu Kim

[jira] [Commented] (LUCENE-8812) add KoreanNumberFilter to Nori(Korean) Analyzer

2019-05-30 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16851778#comment-16851778 ] Jim Ferenczi commented on LUCENE-8812: -- The patch looks good [~danmuzi], I wonder if it would

[jira] [Commented] (LUCENE-8816) Decouple Kuromoji's morphological analyser and its dictionary

2019-05-30 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16851574#comment-16851574 ] Jim Ferenczi commented on LUCENE-8816: -- This sounds like a great plan [~tomoko]. Decoupling

[jira] [Commented] (LUCENE-8816) Decouple Kuromoji's morphological analyser and its dictionary

2019-05-28 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16849778#comment-16849778 ] Jim Ferenczi commented on LUCENE-8816: -- We discussed this when we added the Korean module and said

[jira] [Resolved] (LUCENE-8784) Nori(Korean) tokenizer removes the decimal point.

2019-05-27 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Ferenczi resolved LUCENE-8784. -- Resolution: Fixed Fix Version/s: 8.2 master (9.0) Thanks [~danmuzi

[jira] [Commented] (LUCENE-8784) Nori(Korean) tokenizer removes the decimal point.

2019-05-24 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847695#comment-16847695 ] Jim Ferenczi commented on LUCENE-8784: -- The last patch for this issue looks good to me. I'll test

[jira] [Commented] (LUCENE-8788) Order LeafReaderContexts by Estimated Number Of Hits

2019-05-24 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847455#comment-16847455 ] Jim Ferenczi commented on LUCENE-8788: -- {quote} I like the idea [~jim.ferenczi] proposed. I can

[jira] [Commented] (LUCENE-8784) Nori(Korean) tokenizer removes the decimal point.

2019-05-24 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847381#comment-16847381 ] Jim Ferenczi commented on LUCENE-8784: -- {quote} By the way, would not it be better to leave

[jira] [Commented] (LUCENE-8784) Nori(Korean) tokenizer removes the decimal point.

2019-05-22 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16845856#comment-16845856 ] Jim Ferenczi commented on LUCENE-8784: -- Hi [~danmuzi], I don't think we should have one option

[jira] [Resolved] (LUCENE-8770) BlockMaxConjunctionScorer should support two-phase scorers

2019-05-21 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Ferenczi resolved LUCENE-8770. -- Resolution: Fixed Fix Version/s: 8.2 master (9.0) Thanks [~jpountz

[jira] [Created] (LUCENE-8806) WANDScorer should support two-phase iterator

2019-05-21 Thread Jim Ferenczi (JIRA)
Jim Ferenczi created LUCENE-8806: Summary: WANDScorer should support two-phase iterator Key: LUCENE-8806 URL: https://issues.apache.org/jira/browse/LUCENE-8806 Project: Lucene - Core Issue

[jira] [Commented] (LUCENE-8770) BlockMaxConjunctionScorer should support two-phase scorers

2019-05-21 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844612#comment-16844612 ] Jim Ferenczi commented on LUCENE-8770: -- {quote} I wonder how useful computing the score in the two

[jira] [Updated] (LUCENE-8770) BlockMaxConjunctionScorer should support two-phase scorers

2019-05-20 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Ferenczi updated LUCENE-8770: - Attachment: (was: LUCENE-8770.patch) > BlockMaxConjunctionScorer should support two-ph

Re: [VOTE] Release Lucene/Solr 8.1.0 RC2

2019-05-09 Thread jim ferenczi
+1 SUCCESS! [1:14:41.737009] On Thu, May 9, 2019 at 18:56, Kevin Risden wrote: > +1 > SUCCESS! [1:17:45.727492] > > Kevin Risden > > > On Thu, May 9, 2019 at 11:37 AM Ishan Chattopadhyaya < > ichattopadhy...@gmail.com> wrote: > >> Please vote for release candidate 2 for Lucene/Solr 8.1.0 >> >>

[jira] [Resolved] (LUCENE-7840) BooleanQuery.rewriteNoScoring - optimize away any SHOULD clauses if at least 1 MUST/FILTER clause and 0==minShouldMatch

2019-05-09 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Ferenczi resolved LUCENE-7840. -- Resolution: Fixed Fix Version/s: 8.2 master (9.0) Thanks [~atris

[jira] [Commented] (LUCENE-7840) BooleanQuery.rewriteNoScoring - optimize away any SHOULD clauses if at least 1 MUST/FILTER clause and 0==minShouldMatch

2019-05-07 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16834756#comment-16834756 ] Jim Ferenczi commented on LUCENE-7840: -- Thanks [~atris], it looks good to me too. I'll commit

[jira] [Commented] (LUCENE-7840) BooleanQuery.rewriteNoScoring - optimize away any SHOULD clauses if at least 1 MUST/FILTER clause and 0==minShouldMatch

2019-05-07 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16834417#comment-16834417 ] Jim Ferenczi commented on LUCENE-7840: -- Can you build the new query in a single pass? You could

[jira] [Commented] (LUCENE-7840) BooleanQuery.rewriteNoScoring - optimize away any SHOULD clauses if at least 1 MUST/FILTER clause and 0==minShouldMatch

2019-05-06 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16833727#comment-16833727 ] Jim Ferenczi commented on LUCENE-7840: -- I think so yes, we don't need to build the scorer supplier

[jira] [Commented] (LUCENE-7840) BooleanQuery.rewriteNoScoring - optimize away any SHOULD clauses if at least 1 MUST/FILTER clause and 0==minShouldMatch

2019-05-06 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16833621#comment-16833621 ] Jim Ferenczi commented on LUCENE-7840: -- Note that the logic to remove SHOULD clauses is already

[jira] [Commented] (LUCENE-8772) [nori] A word that is registered in advance, but the words are not separated and recognized as 'UNKNOWN'

2019-04-19 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16821810#comment-16821810 ] Jim Ferenczi commented on LUCENE-8772: -- That's expected since the unknown word heuristic

[jira] [Created] (LUCENE-8770) BlockMaxConjunctionScorer should support two-phase scorers

2019-04-18 Thread Jim Ferenczi (JIRA)
Jim Ferenczi created LUCENE-8770: Summary: BlockMaxConjunctionScorer should support two-phase scorers Key: LUCENE-8770 URL: https://issues.apache.org/jira/browse/LUCENE-8770 Project: Lucene - Core
