[ANNOUNCE] Apache Lucene 9.4.0 released

2022-09-30 Thread Michael Sokolov
The Lucene PMC is pleased to announce the release of Apache Lucene 9.4.0. Apache Lucene is a high-performance, full-featured search engine library written entirely in Java. It is a technology suitable for nearly any application that requires structured search, full-text search, faceting,

[RESULT] [VOTE] Release Lucene 9.4.0 RC3

2022-09-30 Thread Michael Sokolov
It's been >72h since the vote was initiated and the result is: +1 8 (7 binding) 0 0 -1 0 This vote has PASSED - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail:

Re: [VOTE] Release Lucene 9.4.0 RC3

2022-09-28 Thread Michael Sokolov
s >>>> >>>> http://blog.mikemccandless.com >>>> >>>> >>>> On Tue, Sep 27, 2022 at 3:45 PM Anshum Gupta >>>> wrote: >>>>> >>>>> +1 (binding) >>>>> >>>>> Smoketester is hap

[VOTE] Release Lucene 9.4.0 RC3

2022-09-27 Thread Michael Sokolov
Please vote for release candidate 3 for Lucene 9.4.0 The artifacts can be downloaded from: https://dist.apache.org/repos/dist/dev/lucene/lucene-9.4.0-RC3-rev-d2e22e18c6c92b6a6ba0bbc26d78b5e82832f956 You can run the smoke tester directly with this command: python3 -u

Re: [VOTE] Release Lucene 9.4.0 RC2

2022-09-27 Thread Michael Sokolov
>> LatLonPoint field, see https://github.com/apache/lucene/issues/11824. >>> >>> It feels like an important regression so it might be worth a respinning. >>> Sorry about that. >>> >>> >>> On Mon, Sep 26, 2022 at 10:30 PM Anshum Gupta >

[VOTE] Release Lucene 9.4.0 RC2

2022-09-26 Thread Michael Sokolov
Please vote for release candidate 2 for Lucene 9.4.0 The artifacts can be downloaded from: https://dist.apache.org/repos/dist/dev/lucene/lucene-9.4.0-RC2-rev-0384b4fcad7856ddc574c8b994c814a568ce6d0a You can run the smoke tester directly with this command: python3 -u

Re: [VOTE] Release Lucene 9.4.0 RC1

2022-09-26 Thread Michael Sokolov
m 26.09.2022 um 15:51 schrieb Michael Sokolov: > > Hm the build failed with this: > > > > FAILURE: Build failed with an exception. > > > > * What went wrong: > > Execution failed for task ':lucene:core:compileMain19Java'. > >> Error while evaluating proper

Re: [VOTE] Release Lucene 9.4.0 RC1

2022-09-26 Thread Michael Sokolov
in our build scripts? If I install will it autodetect?? On Mon, Sep 26, 2022 at 9:36 AM Michael Sokolov wrote: > > Nice! Thanks everyone, I will refresh and start building the artifacts > > On Mon, Sep 26, 2022 at 9:33 AM Uwe Schindler wrote: > > > > OK, > > > >

Re: [VOTE] Release Lucene 9.4.0 RC1

2022-09-26 Thread Michael Sokolov
gt;>>> >>> >>>>> (no vote) >>> >>>>> >>> >>>>> SUCCESS! [1:12:31.588303] >>> >>>>> >>> >>>>> >>> >>>>> On Thu, Sep 22, 2022 at 2:27 AM Ignacio Ve

Re: [VOTE] Release Lucene 9.4.0 RC1

2022-09-26 Thread Michael Sokolov
Michael McCandless >>>>>> wrote: >>>>>>> >>>>>>> +1 >>>>>>> >>>>>>> >>>>>>> SUCCESS! [0:27:16.514391] >>>>>>> >>>>>>> >>>

Re: Subject: New branch and feature freeze for Lucene 9.4.0

2022-09-21 Thread Michael Sokolov
ase. The vote is still ongoing, so we > > have all options. > > > > Uwe > > > > On 21.09.2022 at 14:05, Michael Sokolov wrote: > >> I see; I would kind of like to get the release out before ApacheCon > >> NA, which starts Oct 3. Do you think it's lik

Re: Subject: New branch and feature freeze for Lucene 9.4.0

2022-09-21 Thread Michael Sokolov
ith JDK 19. No risk, it only activates when you enable it. > > Thoughts? > > Uwe > > On 02.09.2022 at 21:42, Michael Sokolov wrote: > > NOTICE: > > Branch branch_9_4 has been cut and versions updated to 9.5 on stable branch. > > Please observe the normal r

[VOTE] Release Lucene 9.4.0 RC1

2022-09-20 Thread Michael Sokolov
Please vote for release candidate 1 for Lucene 9.4.0 The artifacts can be downloaded from: https://dist.apache.org/repos/dist/dev/lucene/lucene-9.4.0-RC1-rev-f5d0646daa5651f2192282ac85551bca667e34f9 You can run the smoke tester directly with this command: python3 -u

Re: Subject: New branch and feature freeze for Lucene 9.4.0

2022-09-20 Thread Michael Sokolov
ublish my local ann-benchmarks set-up so that >> it's not so fragile! >> >> In summary, with your latest fix the recall and QPS look good to me -- I >> don't detect any regression between 9.3 and 9.4. >> >> Julie >> >> On Mon, Sep 19, 2022 a

Re: Subject: New branch and feature freeze for Lucene 9.4.0

2022-09-19 Thread Michael Sokolov
orrow to > double-check there's no drop. It would also be nice to formalize the > ann-benchmarks set-up and run it regularly (like we've discussed in > https://github.com/apache/lucene/issues/10665). > > Julie > > On Mon, Sep 19, 2022 at 10:33 AM Michael Sokolov > wro

Re: Subject: New branch and feature freeze for Lucene 9.4.0

2022-09-19 Thread Michael Sokolov
95.236 > n_cands=120 0.843 948.908 0.843 525.914 > n_cands=200 0.878 671.781 0.878 351.529 > n_cands=400 0.918 392.265 0.918 207.854 > n_cands=600 0.937 282.403 0.937 144.311 > n_cands=800 0.949 214.620 0.949 116.875 > > On Sun, Sep 18, 2022 at 6:25 PM Michael Sokolov > wrote: >
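A minimal sketch of how the rows quoted above could be compared, assuming each row lists a (recall, QPS) pair for each of two runs. The column headers are cut off in the snippet, so the names run_a and run_b are placeholders, not labels from the thread.

```python
# Rows copied from the snippet above. Column meaning is an assumption:
# n_cands, recall (run_a), QPS (run_a), recall (run_b), QPS (run_b).
ROWS = """\
n_cands=120 0.843 948.908 0.843 525.914
n_cands=200 0.878 671.781 0.878 351.529
n_cands=400 0.918 392.265 0.918 207.854
n_cands=600 0.937 282.403 0.937 144.311
n_cands=800 0.949 214.620 0.949 116.875
"""


def parse_row(line):
    """Split one benchmark row into (n_cands, recall_a, qps_a, recall_b, qps_b)."""
    label, *numbers = line.split()
    recall_a, qps_a, recall_b, qps_b = map(float, numbers)
    return int(label.split("=")[1]), recall_a, qps_a, recall_b, qps_b


parsed = [parse_row(line) for line in ROWS.splitlines()]

# Recall is identical in both runs, so the QPS ratio is a fair comparison.
ratios = {n: qps_a / qps_b for n, _, qps_a, _, qps_b in parsed}

for n_cands, ratio in ratios.items():
    print(f"n_cands={n_cands}: run_a/run_b QPS ratio = {ratio:.2f}")
```

Under that reading of the columns, recall matches row by row and only throughput differs between the two runs.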

Re: Subject: New branch and feature freeze for Lucene 9.4.0

2022-09-18 Thread Michael Sokolov
operations? It would be a little surprising if that were the case given the small number of branchings compared to the number of multiplies in dot-product though. On Sun, Sep 18, 2022 at 3:25 PM Michael Sokolov wrote: > > Thanks for the deep-dive Julie. I was able to reproduce the ch

Re: Subject: New branch and feature freeze for Lucene 9.4.0

2022-09-18 Thread Michael Sokolov
ported the change. > > On Thu, Sep 15, 2022 at 6:32 PM Michael Sokolov wrote: >> >> it looks like a small bug fix, we have had on main (and 9.x?) for a >> while now and no test failures showed up, I guess. Should be OK to >> port. I plan to cut artifacts this weekend, or

Re: Subject: New branch and feature freeze for Lucene 9.4.0

2022-09-15 Thread Michael Sokolov
or running more tests, Michael. >> It is encouraging that you saw a similar performance between 9.3 and 9.4. I >> will also run more tests with different parameters. >> >> On Tue, Sep 13, 2022 at 9:30 AM Michael Sokolov wrote: >>> >>> As a follow-up, I ran

Re: Subject: New branch and feature freeze for Lucene 9.4.0

2022-09-13 Thread Michael Sokolov
--r-- 1 sokolovm amazon 516M Sep 13 13:26 _0.cfs -rw-r--r-- 1 sokolovm amazon 340 Sep 13 13:26 _0.si On Tue, Sep 13, 2022 at 8:50 AM Michael Sokolov wrote: > > I ran another test. I thought I had increased the RAM buffer size to > 8G and heap to 16G. However I still see two segments in

Re: Subject: New branch and feature freeze for Lucene 9.4.0

2022-09-13 Thread Michael Sokolov
must be less than 2GB (2048MB) * * @see #DEFAULT_RAM_PER_THREAD_HARD_LIMIT_MB */ On Mon, Sep 12, 2022 at 6:28 PM Michael Sokolov wrote: > > Hi Mayya, thanks for persisting - I think we need to wrestle this to > the ground for sure. In the test I ran, RAM buffer was the default

Re: Subject: New branch and feature freeze for Lucene 9.4.0

2022-09-12 Thread Michael Sokolov
PS in 9.4. > > Thank you. > > > > > > On Fri, Sep 9, 2022 at 12:21 PM Alan Woodward wrote: >> >> Done. Thanks! >> >> > On 9 Sep 2022, at 16:32, Michael Sokolov wrote: >> > >> > Hi Alan - I checked out the interval queries patch;

Lucene 9.4 release notes draft

2022-09-09 Thread Michael Sokolov
Hi all I published a draft of the release notes here: https://cwiki.apache.org/confluence/display/LUCENE/Release+Notes+9.4 Please review and feel free to make corrections/additions directly in confluence. I didn't include everything in CHANGES, so I may have missed something that deserves a

Re: Subject: New branch and feature freeze for Lucene 9.4.0

2022-09-09 Thread Michael Sokolov
> for a problem with interval queries. Am I OK to port this to the 9.4 branch? > > Thanks, Alan > > On 2 Sep 2022, at 20:42, Michael Sokolov wrote: > > NOTICE: > > Branch branch_9_4 has been cut and versions updated to 9.5 on stable branch. > > Please observe the normal

Re: Subject: New branch and feature freeze for Lucene 9.4.0

2022-09-08 Thread Michael Sokolov
itHub Milestone for 9.5 also needs to be created. >> >> This time, I created Milestone 9.5.0. We should include it in the release >> process. >> https://github.com/apache/lucene/milestone/4 >> >> >> On Sat, Sep 3, 2022 at 4:42, Michael Sokolov wrote: >>> >>

Re: release notes question

2022-09-03 Thread Michael Sokolov
. > > > On Fri, Sep 2, 2022 at 3:46 PM Michael Sokolov wrote: > > > > Hi Lucene devs, I'm going through the release manager script, and > > coming to the point where it talks about writing release notes. It > > suggests starting from a previous release note on

release notes question

2022-09-02 Thread Michael Sokolov
Hi Lucene devs, I'm going through the release manager script, and coming to the point where it talks about writing release notes. It suggests starting from a previous release note on the confluence wiki, but it seems we haven't been using that for 9.x releases. Can previous release managers give

Subject: New branch and feature freeze for Lucene 9.4.0

2022-09-02 Thread Michael Sokolov
NOTICE: Branch branch_9_4 has been cut and versions updated to 9.5 on stable branch. Please observe the normal rules: * No new features may be committed to the branch. * Documentation patches, build patches and serious bug fixes may be committed to the branch. However, you should submit all

Re: Lucene 9.4.0 release

2022-09-01 Thread Michael Sokolov
> > On Thu, Sep 1, 2022 at 11:04 AM Michael Sokolov wrote: >> >> Thanks Tomoko - I appreciate the offer to review the changes needed. I >> will take care of updating the release script/template. >> >> I think I managed to get a GPG key registered and signed

Re: Lucene 9.4.0 release

2022-09-01 Thread Michael Sokolov
>> > true for GitHub ... >> >> You do not need any special permissions to make new Milestones on GitHub. >> Every committer already has permission to create/close/delete Milestones, >> you can test it here. >> https://github.com/apache/lucene/milestones >>

Re: [JENKINS] Lucene » Lucene-NightlyTests-9.x - Build # 302 - Unstable!

2022-09-01 Thread Michael Sokolov
This was a bug in the test; I fixed on 9.x here: https://github.com/apache/lucene/pull/11732, will also cherry-pick to main On Wed, Aug 31, 2022 at 11:09 PM Apache Jenkins Server wrote: > > Build: > https://ci-builds.apache.org/job/Lucene/job/Lucene-NightlyTests-9.x/302/ > > 1 tests failed. >

Re: Lucene 9.4.0 release

2022-08-31 Thread Michael Sokolov
y be > built on Jira. > I hope other people help to interpret Jira-related things on the way into > the language of GitHub issues. > > Tomoko > > On Thu, Sep 1, 2022 at 3:40, Michael Sokolov wrote: > >> Thanks for the links, Tomoko. I thought it would be helpful to ask on >> the

Re: [lucene] branch main updated: SimpleText knn vectors; fix searchExhaustively and suppress a byte format test case (#11725)

2022-08-31 Thread Michael Sokolov
commit(s) were added to refs/heads/main by this push: > > new 61ef031f7fa SimpleText knn vectors; fix searchExhaustively and > > suppress a byte format test case (#11725) > > 61ef031f7fa is described below > > > > commit 61ef031f7fa3abdd7c8c2f36db71ad2289b66131 > >

Re: Lucene 9.4.0 release

2022-08-31 Thread Michael Sokolov
github.com/apache/lucene/blob/main/dev-docs/github-issues-howto.md >> >> Is this unclear to you? >> >> >> On Wed, Aug 31, 2022 at 23:13, Michael Sokolov wrote: >>> >>> Hi, I'd like to start the ball rolling for a 9.4.0 release. We don't >>> have a large num

Lucene 9.4.0 release

2022-08-31 Thread Michael Sokolov
+ to find Major issues, for example, and it seems to only find one Minor one? Does anyone have better github-search-fu? API Changes - * LUCENE-10577: Add VectorEncoding to enable byte-encoded HNSW vectors (Michael Sokolov, Julie Tibshirani) New Features - * LUCENE

Re: [JENKINS] Lucene-main-Linux (64bit/jdk-18) - Build # 36650 - Unstable!

2022-08-28 Thread Michael Sokolov
wson > -Dtests.asserts=true -Dtests.file.encoding=UTF-8 -p lucene/core > > This is both on a mac and on linux. I think the multiplier or some > other option may be affecting the reproducibility? > > Dawid > > On Sun, Aug 28, 2022 at 12:08 AM Michael Sokolov wrote: > > > > T

Re: [JENKINS] Lucene-main-Linux (64bit/jdk-18) - Build # 36650 - Unstable!

2022-08-27 Thread Michael Sokolov
This did not reproduce for me (on JDK17) even with -Ptests.iters=1000. Tried beasting 100 times too, who knows. Since there are 20 bytes in the actual value, but we expected 5, the 4x multiplier sure looks like confusion of floats and bytes. It's scary if some other test is somehow impacting this.
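The 4x arithmetic in the message above can be illustrated with a short sketch. It assumes a five-dimensional vector and uses Python's struct module to stand in for the two encodings; the function name and encoding labels are illustrative and are not Lucene identifiers.

```python
import struct


def encoded_size(values, encoding):
    """Return how many bytes `values` occupy under the given encoding."""
    if encoding == "byte":
        # One signed byte per dimension.
        return len(struct.pack(f"<{len(values)}b", *values))
    if encoding == "float32":
        # Four bytes per dimension (IEEE 754 single precision).
        return len(struct.pack(f"<{len(values)}f", *values))
    raise ValueError(f"unknown encoding: {encoding}")


vector = [1, 2, 3, 4, 5]
byte_size = encoded_size(vector, "byte")
float_size = encoded_size(vector, "float32")

print(byte_size, float_size, float_size // byte_size)
```

A reader that expects one byte per dimension but is handed float-encoded data would see exactly this kind of fourfold size mismatch.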

Re: Label vs. Milestone for version management?

2022-08-25 Thread Michael Sokolov
Tomoko - sorry to re-raise this when we thought it had been settled. Having never really used github issues, I don't think I fully understood the arguments there. On Thu, Aug 25, 2022 at 3:50 AM Tomoko Uchida wrote: > > Hi all. > > I once proposed using Milestone for version management in GitHub

Re: Label vs. Milestone for version management?

2022-08-25 Thread Michael Sokolov
about these, but I think it's better if we can look them up in the issue db. On Thu, Aug 25, 2022 at 9:40 AM Robert Muir wrote: > > On Thu, Aug 25, 2022 at 6:11 AM Michael Sokolov wrote: > > > > The milestone looks appealing since it is prominent and relatively easy to > >

Re: Label vs. Milestone for version management?

2022-08-25 Thread Michael Sokolov
The milestone looks appealing since it is prominent and relatively easy to use. The only drawback I have heard is that it is single valued. It still seems we could use it to document the first version in which something is released, although it wouldn't be possible to record other releases into

Re: [ANNOUNCE] Issue migration Jira to GitHub starts on Monday, August 22

2022-08-24 Thread Michael Sokolov
Thanks! It seems to be working nicely. Question about the fix-version: tagging. I wonder if going forward we want to maintain that for new issues? I happened to notice there is also this "milestone" feature in github -- does that seem like a place to put version information? On Wed, Aug 24, 2022 at

Re: [JENKINS] Lucene » Lucene-Coverage-main - Build # 500 - Unstable!

2022-08-19 Thread Michael Sokolov
This test asserts that we return the same documents in the same order when the index is sorted and when it's not, but failed because the scores for two documents were equal, and they end up sorting differently due to docid tiebreaking, which is *not* the same under a sorted index. Not sure what
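A toy sketch of the failure mode described above, assuming hits are ranked by descending score with ascending docid as the tiebreaker. The document names, scores, and sort keys are made up for illustration; this does not use Lucene itself.

```python
def assign_docids(names_in_index_order):
    """Docids are simply positions in index order."""
    return {name: docid for docid, name in enumerate(names_in_index_order)}


def rank(scores, docids):
    """Order hits by descending score, breaking ties by ascending docid."""
    return sorted(scores, key=lambda name: (-scores[name], docids[name]))


# Hypothetical documents: "a" and "b" have equal scores.
scores = {"a": 1.0, "b": 1.0, "c": 2.0}
# Hypothetical index sort key (for example, a timestamp field).
sort_key = {"a": 30, "b": 10, "c": 20}

# Unsorted index: docids follow insertion order a, b, c.
unsorted_ids = assign_docids(["a", "b", "c"])
# Sorted index: docids follow the sort key, so the order becomes b, c, a.
sorted_ids = assign_docids(sorted(sort_key, key=sort_key.get))

unsorted_ranking = rank(scores, unsorted_ids)
sorted_ranking = rank(scores, sorted_ids)

print(unsorted_ranking)
print(sorted_ranking)
```

The same set of documents comes back in both cases, but the two tied hits swap places because index sorting reassigns their docids.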

Re: [JENKINS] Lucene » Lucene-NightlyTests-main - Build # 728 - Still Unstable!

2022-08-13 Thread Michael Sokolov
This didn't reproduce for me, but I can see that the error message is different in SimpleTextKnnVectorsReader, so I'll update that On Sat, Aug 13, 2022 at 1:51 AM Apache Jenkins Server wrote: > > Build: > https://ci-builds.apache.org/job/Lucene/job/Lucene-NightlyTests-main/728/ > > 3 tests

Re: [HELP] Please spot-check the migrated Lucene GitHub issues!

2022-08-09 Thread Michael Sokolov
Yes, looks amazing! All I could find was: This one https://github.com/mocobeta/forks-migration-test-2/issues/8964 seems to be missing its attachment - not sure if this was expected with this round? EG

Re: [HELP] Please spot-check the migrated Lucene GitHub issues!

2022-07-30 Thread Michael Sokolov
issues like that (except this SPAM!), so I'd be happy if we don't change anything here :) On Sat, Jul 30, 2022 at 6:12 PM Michael Sokolov wrote: > > I did some spot-checking. ooh, it looks so nice! > > I have one suggestion, totally optional/cosmetic, but I wonder if we > could ma

Re: [HELP] Please spot-check the migrated Lucene GitHub issues!

2022-07-30 Thread Michael Sokolov
I did some spot-checking. ooh, it looks so nice! I have one suggestion, totally optional/cosmetic, but I wonder if we could make the original comment authors' names more prominent by moving the [Legacy Jira: ${Name} (@${user}) on ${date}] to the top of each comment rather than the bottom? That

Re: [jira] [Commented] (LUCENE-10054) Handle hierarchy in HNSW graph

2022-07-28 Thread Michael Sokolov
Thanks David On Wed, Jul 27, 2022 at 5:13 PM David Smiley wrote: > > FYI I had filed https://issues.apache.org/jira/browse/INFRA-23503 > > ~ David Smiley > Apache Lucene/Solr Search Developer > http://www.linkedin.com/in/davidwsmiley > > > On Tue, Jul 26, 2022 at 3:5

Re: Welcome Vigya Sharma as Lucene committer

2022-07-28 Thread Michael Sokolov
Welcome Vigya! On Thu, Jul 28, 2022, 6:48 AM Michael McCandless wrote: > Welcome Vigya!! > > Mike > > On Thu, Jul 28, 2022 at 5:28 AM Lu Xugang > wrote: > >> Congratulations, and welcome Vigya! >> >> Xugang >> >> www.amazingkoala.com.cn >> >> >> >> >> On Jul 28, 2022, at 17:21, Ignacio Vera

Re: [jira] [Commented] (LUCENE-10054) Handle hierarchy in HNSW graph

2022-07-26 Thread Michael Sokolov
searching JIRA for "slkjfdf" I found a few issues in other projects, but none seems to be getting the same degree of spam love On Tue, Jul 26, 2022 at 3:50 PM Mike Sokolov (Jira) wrote: > > > [ >

Re: Lucene 9.3.0 release

2022-07-21 Thread Michael Sokolov
big >>>> change, maybe we should not even try to get it in before cutting the >>>> branch? >>>> >>>> On Tue, Jul 19, 2022 at 4:09 PM Mayya Sharipova >>>> wrote: >>>>> >>>>> Thanks for the reminder about the release

Re: Lucene 9.3.0 release

2022-07-21 Thread Michael Sokolov
4:09 PM Mayya Sharipova >>>> wrote: >>>>> >>>>> Thanks for the reminder about the release, Ignacio! >>>>> About LUCENE-10592 I will see what progress we can make today, and will >>>>> let you know before Wednesday

Re: Lucene 9.3.0 release

2022-07-21 Thread Michael Sokolov
PM Mayya Sharipova >>>> wrote: >>>> >>>>> Thanks for the reminder about the release, Ignacio! >>>>> About LUCENE-10592 >>>>> <https://issues.apache.org/jira/browse/LUCENE-10592> I will see what >>>>> progress we

Re: Lucene 9.3.0 release

2022-07-19 Thread Michael Sokolov
> On Tue, Jul 12, 2022 at 2:50 PM Ignacio Vera wrote: >> >>> Thanks for the heads up, I am planning to cut the branch in the middle of next >>> week, Wednesday July 20th. >>> Let me know at the beginning of next week if there is any issue from >>> your side. >>>

Re: [DISCUSS] Read-only Jira after the GitHub issues migration?

2022-07-17 Thread Michael Sokolov
I think we'd still have the mailing lists open for discussion. So anyone not willing or able to use GitHub would still be able to participate in a meaningful way. Having two parallel bug trackers seems much less useful to me. I'd rather have people emailing to a list that is active rather than

Build failures

2022-07-16 Thread Michael Sokolov
Sorry for all the noise. I think it may be a botched backport of the timeout support I did yesterday. Will look at it today

Re: Lucene 9.3.0 release

2022-07-11 Thread Michael Sokolov
I would like to see if we can get https://issues.apache.org/jira/browse/LUCENE-10577 in. It is working and gives nice gains, but there is some controversy about the API. If we can't get it sorted out this week(?) it can certainly slip to the next revision. I know that

Re: How to avoid double-emails on all git issue/PR updates?

2022-07-11 Thread Michael Sokolov
Oh! thank you - this will be a big help. I just went to https://github.com/apache/lucene and then under "Watch" selected "participating and mentions" instead of "all activity" (which I had before). On Mon, Jul 11, 2022 at 5:46 AM Uwe Schindler wrote: > > Hi, > > I fully agree with Adrien,

Re: A prototype migration tool Jira to GitHub

2022-06-26 Thread Michael Sokolov
as for this access control/script monitoring problem, I wonder whether we could import all the issues into a new github repo owned by whomever is running the script, and then transfer from there to the lucene repo? It would be an extra step involving another script (or something), but maybe(?)

Re: A prototype migration tool Jira to GitHub

2022-06-23 Thread Michael Sokolov
s a duplicate: >>>> >>>> Did you check >>>> https://spring.io/blog/2021/01/07/spring-data-s-migration-from-jira-to-github-issues >>>> >>>> They especially write there is an api that doesn't trigger notifications. >>>> >>>> It is docu

Re: A prototype migration tool Jira to GitHub

2022-06-23 Thread Michael Sokolov
Yes thank you! You say this is not difficult, but it looks like a big job to me! Here are a bunch of things I noticed that we would ideally address (from looking at one long and complex issue, LUCENE-9004). I wouldn't be so bold as to say these should block us from proceeding if they're not

Re: [RESULT] [VOTE] Migration to GitHub issue from Jira

2022-06-20 Thread Michael Sokolov
I think the user mapping must be inferred based on membership in the Apache "organization" https://github.com/settings/organizations On Sun, Jun 19, 2022 at 2:45 AM Dawid Weiss wrote: > > >> User id mapping is an important consideration for me. > > > Some mapping has to be present somewhere

Re: [RESULT] [VOTE] Migration to GitHub issue from Jira

2022-06-15 Thread Michael Sokolov
Agree with everyone here. Also consider that if we duplicate there will be two copies of the same issue, and they will inevitably diverge... On Wed, Jun 15, 2022 at 9:28 AM Jan Høydahl wrote: > > +1 for a manual approach > > Over time the volume will gravitate to mostly GitHub issues. And JIRA

Re: exposing per-field storage usage

2022-06-14 Thread Michael Sokolov
11:15 AM Robert Muir wrote: >> >> On Tue, Jun 14, 2022 at 10:37 AM Michael Sokolov wrote: >> > >> > Oh, yes that's a clever idea. It seems it would take quite a while >> > (tens of minutes?) for a larger index though? Much faster than the >> > for

Re: exposing per-field storage usage

2022-06-14 Thread Michael Sokolov
Oh, yes that's a clever idea. It seems it would take quite a while (tens of minutes?) for a larger index though? Much faster than the force-merge solution for sure. I guess to get faster we would have to instrument each format. I mean they generally do know how much space each field is occupying,

exposing per-field storage usage

2022-06-13 Thread Michael Sokolov
At Amazon, we have a need to produce regular metrics on how much disk storage is consumed by each field. We manage an index with data contributed by many teams and business units and we are often asked to produce reports attributing index storage usage to these customers. The best tool we have for

Re: Welcome Lu Xugang as Lucene committer

2022-06-07 Thread Michael Sokolov
Welcome and thanks for spreading the word; your amazingkoala blog looks very active (although I can't read it :() On Thu, Jun 2, 2022 at 4:09 PM Mikhail Khludnev wrote: > > Welcome, Lu. > > On Wed, Jun 1, 2022 at 12:59 PM 陆徐刚 wrote: >> >> Thanks Adrien for the announcement and all for the

Re: [VOTE] Migration to GitHub issue from Jira (LUCENE-10557)

2022-06-07 Thread Michael Sokolov
Sorry I missed the first vote I think; also +1(pmc) from me. I'd be OK with some issues (esp. closed ones) being orphaned in the old system too. On Tue, Jun 7, 2022 at 9:20 AM Dawid Weiss wrote: > > > I'm fine with either system (or both used concurrently). There is significant > research

Re: Welcome Greg Miller to the Lucene PMC

2022-06-07 Thread Michael Sokolov
Welcome Greg [copying from other thread, oops!] On Tue, Jun 7, 2022 at 11:41 AM Houston Putman wrote: > > Welcome Greg! > > On Tue, Jun 7, 2022 at 11:35 AM Gautam Worah wrote: >> >> Congratulations Greg! >> >> On Tue, Jun 7, 2022 at 8:04 AM Patrick Zhai wrote: >>> >>> Congrats Greg! >>> >>>

Re: 30% query performance degradation for documents with small stored fields

2022-06-07 Thread Michael Sokolov
I wonder whether it would be worth trying switching from stored fields to doc values. The access patterns are different, so the change would not be trivial, but you might be able to achieve gains this way - I really am not sure whether or not you would, the storage model is completely different,

Re: module not found error in intellij

2022-06-03 Thread Michael Sokolov
> It's hacky but I've done it in the past. >>> >>> When I switch to (my preferred) intellij compilation, things break. This >>> is definitely a regression in IntelliJ somewhere because it used to work >>> very recently - until the last update, I think. Eve

Re: module not found error in intellij

2022-06-02 Thread Michael Sokolov
e tracker), they are just not our bugs... > > > On Fri, Jun 3, 2022 at 0:17, Michael Sokolov wrote: > > > > glad to know I'm not the only one! I think it's not OK though. Running > > tests in IDE is super useful, especially for debugging, but also for > > visualizing coverage. I think the

Re: module not found error in intellij

2022-06-02 Thread Michael Sokolov
> ./gradlew -p lucene/core.tests/ test > > I'm not sure the exact cause of that though IDEs' java module support > looks far from perfect for now, I would recommend not to use IDE when > running modular tests... > > Tomoko > > On Thu, Jun 2, 2022 at 23:44, Michael Sokolov wrote: > >

module not found error in intellij

2022-06-02 Thread Michael Sokolov
In IntelliJ building Lucene main branch I see this: .../workspace/lucene/lucene/core.tests/src/test/module-info.java:23: error: module not found: org.apache.lucene.core.tests.main requires org.apache.lucene.core.tests.main; ^ Am I doing it wrong? Does

Re: Welcome Lu Xugang as Lucene committer

2022-06-01 Thread Michael Sokolov
Welcome! I like finally too, but it seems strange that it has nothing to do with its apparent relative, final. On Wed, Jun 1, 2022 at 4:51 PM Gus Heck wrote: > Welcome and congratulations :) > > On Wed, Jun 1, 2022 at 3:32 PM Alessandro Benedetti > wrote: > >> Welcome on board Xugang! >>

Re: Welcome Chris Hegarty as Lucene committer

2022-06-01 Thread Michael Sokolov
Welcome Chris! I remember being part of a skeptical bunch of students in 1990 hearing about this new Java thing that was supposedly going to take over the world. Apparently it is still thriving :) -Mike On Wed, Jun 1, 2022 at 12:59 PM David Smiley wrote: > > Welcome Chris!

Re: Adding a new PointDocValuesField

2022-05-25 Thread Michael Sokolov
Also, there should be examples from other fields. Suppose you are indexing map data and want to support a UI that shows "hot spots" on the map where there is a lot of let's say ... activity of some sort. You'd like to facet on 2-d areas. Or for log analytics -- you want to do anomaly detection

Re: [VOTE] Release Lucene 9.2.0 RC2

2022-05-20 Thread Michael Sokolov
+1 SUCCESS! [0:49:44.832567] JDK11 only On Fri, May 20, 2022 at 4:46 PM Houston Putman wrote: > > +1 > > SUCCESS! [2:17:07.370407] (java 11 & 17) > > - Houston > > On Fri, May 20, 2022 at 8:04 AM Jan Høydahl wrote: >> >> +1 >> >> SUCCESS! [1:13:38.226868] >> >> Jan >> >> > 19. mai 2022 kl.

Re: [VOTE] Release Lucene 9.2.0 RC1

2022-05-18 Thread Michael Sokolov
+1 SUCCESS! [0:43:09.481661] I'm not going to get hung up on an issue with the smokeTester if Robert's not :) BTW thank you for running on slow machine that takes many hours! On Wed, May 18, 2022 at 3:48 PM Robert Muir wrote: > > I opened issue about this. It shouldn't block the release, but it

Re: [GitHub] [lucene] msokolov commented on pull request #870: LUCENE-10502: Refactor hnswVectors format

2022-05-13 Thread Michael Sokolov
Okay sorry I was confused about these override methods - they are different because of the different access patterns in the sparse/dense cases. Maybe the loss of history was unavoidable since we moved/renamed the file, but I wish we could maintain it. On Fri, May 13, 2022 at 1:45 PM GitBox

Re: [GitHub] [lucene] jpountz commented on pull request #859: LUCENE-10552: KnnVectorQuery has incorrect equals/ hashCode

2022-05-13 Thread Michael Sokolov
+1 to back port. It will make things more consistent at least On Thu, May 12, 2022, 11:36 AM GitBox wrote: > > jpountz commented on PR #859: > URL: https://github.com/apache/lucene/pull/859#issuecomment-1125144256 > >FWIW I found about this PR because it is in the 9.2 changelog on `main` >

Re: XML retrieval with Intervals

2022-05-06 Thread Michael Sokolov
> > Disclaimer: I worked there for a couple of years ten years ago. But I’ve been > inside that product and it is non-muggle technology. > > wunder > Walter Underwood > wun...@wunderwood.org > http://observer.wunderwood.org/ (my blog) > > On May 6, 2022, at 5:35 AM,

Re: XML retrieval with Intervals

2022-05-06 Thread Michael Sokolov
Many years ago I had started this Lux project that was designed to build an XML-aware index using Solr; see https://github.com/msokolov/lux/tree/master/src/main/java/lux/index/analysis for the analysis chain I used. Maybe you'll find something useful in this project? It's dormant for years, and

Re: [DISCUSS] A proposal for migration to GitHub issue (LUCENE-10557)

2022-05-05 Thread Michael Sokolov
> Is the original Jira -> GitHub move just a change of defaults or are we, > once moved to GitHub, not letting people use Jira at all anymore ? Nothing has been decided - it's all open for debate. I just want to re-state the idea (at least as I heard it) behind this proposed move is to make

Re: [DISCUSS] A proposal for migration to GitHub issue (LUCENE-10557)

2022-05-05 Thread Michael Sokolov
I'd like to see some discussion of pros/cons. Personally I don't have a lot of experience working with github's issue system, while I have grown comfortable with JIRA over the years, in spite of its warts. Here are a few things I like and *don't* like about both systems (mostly JIRA), but I don't

Re: FST codec for *infix* queries. No luck so far.

2022-04-26 Thread Michael Sokolov
I'm not sure under which scenario ngrams (edgengrams) would not be an option? Another to try maybe would be something like BPE (byte pair encoding). In this encoding, you train a set of tokens from a vocabulary based on frequency of occurrence, and agglomerate them iteratively until you have the
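The byte pair encoding idea mentioned above can be sketched as follows. This is a minimal character-level trainer written for illustration, with a made-up four-word corpus; real tokenizers add end-of-word markers, byte-level fallback, and other details that are left out here.

```python
from collections import Counter


def most_frequent_pair(words):
    """Return the adjacent symbol pair with the highest total frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return max(pairs, key=pairs.get) if pairs else None


def merge_pair(words, pair):
    """Rewrite every word so that each occurrence of `pair` becomes one symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def train_bpe(corpus, num_merges):
    """Learn up to `num_merges` merges, most frequent pair first."""
    words = Counter(tuple(word) for word in corpus)
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:
            break
        words = merge_pair(words, pair)
        merges.append(pair)
    return merges


# Hypothetical toy corpus.
merges = train_bpe(["low", "low", "lower", "lowest"], 2)
print(merges)
```

Each learned merge becomes a token, so frequent substrings end up indexed as single terms while rare ones stay split into smaller pieces.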

Re: spotless targets

2022-04-08 Thread Michael Sokolov
it" to "deny". In Jdk 17 the setting was > completely removed (https://openjdk.java.net/jeps/403). > > So you have to tell java to export the affected packages (each one > separately listed) also to classpath applications (unnamed module). > > Uwe > > Am 8. April 2022 15:07:

Re: spotless targets

2022-04-08 Thread Michael Sokolov
I guess this is related to the use of Java modules that now hide symbols? On Fri, Apr 8, 2022 at 3:05 AM Dawid Weiss wrote: > > > Maybe a check like this? > https://github.com/apache/lucene/pull/802 > > On Thu, Apr 7, 2022 at 9:26 PM Dawid Weiss wrote: >>> >>> Does spotless have an option to

Re: spotless targets

2022-04-06 Thread Michael Sokolov
that were not there before > > On Wed, Apr 6, 2022 at 3:46 PM Michael Sokolov wrote: >> >> OK, this also happens with Oracle's JDK17. Now I'm confused >> >> On Wed, Apr 6, 2022 at 4:28 PM Michael Sokolov wrote: >> > >> > Hi, locally I failed to run

Re: spotless targets

2022-04-06 Thread Michael Sokolov
OK, this also happens with Oracle's JDK17. Now I'm confused On Wed, Apr 6, 2022 at 4:28 PM Michael Sokolov wrote: > > Hi, locally I failed to run spotlessCheck/spotlessApply on main (10.x) > branch. I assume it's because of a JVM difference; here's the error: > > > Step '

spotless targets

2022-04-06 Thread Michael Sokolov
Hi, locally I failed to run spotlessCheck/spotlessApply on main (10.x) branch. I assume it's because of a JVM difference; here's the error: Step 'google-java-format' found problem in 'lucene/core/src/java/module-info.java': null java.lang.reflect.InvocationTargetException at

Re: Can Lucene9.0.0 be used on Android devices?

2022-04-05 Thread Michael Sokolov
I don't know, it probably comes down to how compatible Android's JVM is with JDK 11. Certainly it isn't a platform that gets a lot of attention from devs here, and I suspect Dalvik is not up to JDK11? Not sure though ... let us know what happens! On Tue, Mar 29, 2022 at 10:53 AM Baiyang Liu

Lucene PMC Chair Bruno Roustant

2022-03-23 Thread Michael Sokolov
Hello, Lucene developers. The Lucene Project Management Committee has elected a new chair, Bruno Roustant, and the Board has approved. Bruno, thank you for stepping up, and congratulations! -Mike

Re: [ANNOUNCE] Apache Lucene 9.1.0 released

2022-03-22 Thread Michael Sokolov
Thank you for another release Adrien! On Tue, Mar 22, 2022 at 10:32 AM Adrien Grand wrote: > > The Lucene PMC is pleased to announce the release of Apache Lucene 9.1.0. > > Apache Lucene is a high-performance, full-featured search engine > library written entirely in Java. It is a technology

Re: [VOTE] Release Lucene 9.1.0 RC2

2022-03-18 Thread Michael Sokolov
s, so issues with not >> synchronized cache lines can happen easily)." Is this a different problem >> from #1 where we just have slow tests? I'm not sure if this is something we >> want to investigate as part of the release or if we think it can wait. >> >> Than

Re: [VOTE] Release Lucene 9.1.0 RC2

2022-03-18 Thread Michael Sokolov
We had to do a workaround in our internal test suites by setting a system property to trick this thing into not running; Maybe we can apply that also here... On Fri, Mar 18, 2022 at 3:12 PM Dawid Weiss wrote: > > I think this is Amazon trying to cope with log4shell - they've added > external

Re: [VOTE] Release Lucene 9.1.0 RC2

2022-03-18 Thread Michael Sokolov
Yeah this is endemic in our world now. I am having the same issue On Fri, Mar 18, 2022 at 2:51 PM Robert Muir wrote: > > >> 2>at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521) > >> 2>at Log4jHotPatch.asmVersion(Log4jHotPatch.java:71) > >> 2>at

Re: [JENKINS] Lucene-9.x-Linux (64bit/jdk-17.0.2) - Build # 1798 - Unstable!

2022-03-09 Thread Michael Sokolov
This did not reproduce for me On Wed, Mar 9, 2022 at 3:41 PM Policeman Jenkins Server wrote: > > Build: https://jenkins.thetaphi.de/job/Lucene-9.x-Linux/1798/ > Java: 64bit/jdk-17.0.2 -XX:-UseCompressedOops -XX:+UseG1GC > > 3 tests failed. > FAILED:

Re: [jira] [Updated] (LUCENE-10454) UnifiedHighlighter can miss terms because of query rewrites

2022-03-03 Thread Michael Sokolov
Isn't this kind of like if a tree falls in the woods and nobody is there does it make a sound? I mean -- if the index is empty, how can UH fail? No documents will ever match, ergo no highlights will be returned, so it seems fine that it is unable to extract terms from the query. On Thu, Mar 3,

Re: Lucene 9.1 release soon?

2022-02-25 Thread Michael Sokolov
+1 thanks for volunteering On Thu, Feb 24, 2022, 5:41 AM Mayya Sharipova wrote: > + 1 > > On Thu, Feb 24, 2022 at 11:28 AM Ignacio Vera wrote: > >> +1 >> >> On Thu, Feb 24, 2022 at 9:05 AM Adrien Grand wrote: >> >>> +1 >>> >>> On Thu, Feb 24, 2022 at 8:43 AM Michael Wechner >>> wrote: >>> >

Re: How to Increase max vector size?

2022-02-16 Thread Michael Sokolov
. Examples could > be a service of OpenAI or vector search databases like for example Weaviate > or Pinecone. > > Thanks > > Michael > > > > > On 15.02.22 at 23:34, Michael Sokolov wrote: > > I don't think it makes sense to have a static variable maximum that you >
