Re: Lucene 9.11

2024-05-28 Thread Chris Hegarty



> On 28 May 2024, at 14:57, Benjamin Trent  wrote:
> 
> ...
> 
> Did we figure out the hppc concerns? I saw some PR activity, wanted to make 
> sure we are all still good with starting the release process this week.

Hppc is no longer a concern. The issue has been addressed by 
https://github.com/apache/lucene/pull/13422

-Chris.
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Q: 9.x upgrade to hppc 0.9.1

2024-05-27 Thread Chris Hegarty


> On 27 May 2024, at 09:08, Chris Hegarty  
> wrote:
> 
>> ...
> 
> That sounds like quite a lot of classes. How much is actually necessary to 
> allow removing the dependency? And/or is there a natural place where it 
> makes logical sense to subset?

Please ignore this comment. I see that this work is already in progress.

-Chris.






Re: Q: 9.x upgrade to hppc 0.9.1

2024-05-27 Thread Chris Hegarty
Hi,

> +1 to moving the hppc fork to oal.internal.

+1

> On 26 May 2024, at 13:33, Bruno Roustant  wrote:
> 
> Currently the hppc fork in Lucene is composed of 15 classes and 8 test 
> classes.
> Forking everything in hppc would mean 525 classes and 193 test classes. I'm 
> not sure we want to fork all hppc?

That sounds like quite a lot of classes. How much is actually necessary to 
allow removing the dependency? And/or is there a natural place where it makes 
logical sense to subset?

-Chris.



Re: Q: 9.x upgrade to hppc 0.9.1

2024-05-26 Thread Chris Hegarty
Hi Dawid,

> On 25 May 2024, at 21:08, Dawid Weiss  wrote:
> 
> ...
> 
> I understand it's a pain if the order changes from run to run but I don't see 
> a way this can be avoided ([1] is the issue you mentioned on gh). Tests (and 
> code) shouldn't rely on map/set ordering, although I realize it may be 
> difficult to weed out in such a large codebase.

To be clear, I agree, the bug is in the Elasticsearch code - it should not 
depend upon iteration order of these collection types. And yes, it’s difficult 
to weed out and fix, which we’ll continue to work on.

> For what it's worth, the next version of HPPC will be a proper module (with 
> com.carrotsearch.hppc id). Would it change anything/ make it easier if I 
> renamed it to just 'hppc'?

Moving to an explicit module with a module-info sounds good. The name, 
com.carrotsearch.hppc, is a fine name for this. No need to revert to the 
automatic module name.

-Chris.



Q: 9.x upgrade to hppc 0.9.1

2024-05-25 Thread Chris Hegarty
Hi,

For awareness, I would like to raise a potential issue that we’ve run into 
when testing Elasticsearch with the latest 9.x branch.

A recent change in 9.x [1] has introduced a dependency on hppc 0.9.1. Hppc has 
added an explicit automatic module name in its manifest, which effectively 
changes the auto module name from the plain hppc (derived from the jar file 
name) to com.carrotsearch.hppc. So one must use 0.9.1 (or 0.9.0) if deployed 
as a module - otherwise the resolution of the `org.apache.lucene.join` module 
will fail.
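
To illustrate the renaming, here is a minimal sketch (not from the PRs referenced here) that asks the JDK's own ModuleFinder which module name it assigns to a jar. It assumes two empty stand-in jars with illustrative file names: hppc-0.8.1.jar, with no manifest entry, and hppc-0.9.1.jar, with an Automatic-Module-Name entry. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.lang.module.ModuleFinder;
import java.lang.module.ModuleReference;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

final class AutoModuleNameDemo {

    /** Creates a fresh temporary directory for the stand-in jars. */
    static Path tempDir() {
        try {
            return Files.createTempDirectory("automodule-demo");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Writes a jar containing only a manifest. If automaticModuleName is
     * non-null it is stored as the Automatic-Module-Name manifest attribute.
     */
    static Path writeJar(Path dir, String fileName, String automaticModuleName) {
        Manifest manifest = new Manifest();
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        if (automaticModuleName != null) {
            manifest.getMainAttributes().putValue("Automatic-Module-Name", automaticModuleName);
        }
        Path jar = dir.resolve(fileName);
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out, manifest)) {
            jos.finish(); // no entries needed: only the manifest matters here
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return jar;
    }

    /** Returns the module name the module system assigns to the given jar. */
    static String moduleNameOf(Path jar) {
        ModuleReference ref = ModuleFinder.of(jar).findAll().iterator().next();
        return ref.descriptor().name();
    }

    public static void main(String[] args) {
        Path dir = tempDir();
        // Name derived from the file name (version suffix stripped).
        System.out.println(moduleNameOf(writeJar(dir, "hppc-0.8.1.jar", null)));
        // Name taken from the manifest attribute.
        System.out.println(moduleNameOf(writeJar(dir, "hppc-0.9.1.jar", "com.carrotsearch.hppc")));
    }
}
```

A module that declares `requires com.carrotsearch.hppc;` can only resolve against a jar of the second kind, which is consistent with the resolution failure described above.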

Since Elasticsearch is deployed as a module, we need to update to hppc 
0.9.1 [2], but unfortunately this is not straightforward. In fact, Ryan has had 
a PR open [3] for the past 2 years without completion! The iteration order of 
some collection types in hppc 0.9.x [*] is tickling some inadvertent order 
dependencies in Elasticsearch. It may take some time to track these down and 
fix them.

I wonder if others may run into either or both of these issues, as we have in 
Elasticsearch, if we release 9.11 with this change?

-Chris.

[1] https://github.com/apache/lucene/pull/13392
[2] https://github.com/elastic/elasticsearch/pull/109006
[3] https://github.com/elastic/elasticsearch/pull/84168

[*] HPPC-186: A different strategy has been implemented for collision avalanche 
avoidance. This results in removal of Scatter* maps and sets and their 
unification with their Hash* counterparts. This change should not affect any 
existing code unless it relied on static, specific ordering of keys. A side 
effect of this change is that key/value enumerators will return a different 
ordering of their container's values on each invocation. If your code relies on 
the order of values in associative arrays, it must order them after they are 
retrieved. (Bruno Roustant).
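
As a minimal sketch of the advice in the note above (order the values after they are retrieved), the snippet below copies a map's keys and sorts them before use. It uses java.util.HashMap purely as a stand-in for the hppc containers, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class StableOrderDemo {

    /**
     * Returns the keys in natural (sorted) order, so callers never observe
     * the map's internal iteration order.
     */
    static List<String> sortedKeys(Map<String, Integer> map) {
        List<String> keys = new ArrayList<>(map.keySet());
        Collections.sort(keys);
        return keys;
    }

    public static void main(String[] args) {
        Map<String, Integer> fieldCounts = new HashMap<>();
        fieldCounts.put("title", 3);
        fieldCounts.put("body", 1);
        fieldCounts.put("author", 2);
        // Iterate over the sorted copy rather than over the map directly.
        for (String key : sortedKeys(fieldCounts)) {
            System.out.println(key + "=" + fieldCounts.get(key));
        }
    }
}
```

Code and tests that compare against output built this way no longer depend on how the container happens to enumerate its entries.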



Re: Lucene 9.11

2024-05-15 Thread Chris Hegarty
+1

-Chris.

> On 14 May 2024, at 16:10, Adrien Grand  wrote:
> 
> +1 the 9.11 changelog looks great!
> 
> On Tue, May 14, 2024 at 4:50 PM Benjamin Trent  wrote:
> Hey y'all,
> 
> Looking at changes for 9.11, we are building a significant list. I propose we 
> do a release in the next couple of weeks.
> 
> While this email is a little early (I am about to go on vacation for a bit), 
> I volunteer myself as release manager. 
> 
> Unless there are objections, I plan on kicking off the release process May 
> 28th. 
> 
> Thanks!
> 
> Ben
> 
> 
> -- 
> Adrien





Re: Query about the GitHub statistics for Lucene

2024-03-06 Thread Chris Hegarty
Hi Mike,

> On 6 Mar 2024, at 10:47, Michael McCandless  wrote:
> 
> On Wed, Mar 6, 2024 at 4:41 AM Chris Hegarty  
> wrote:
> 
> Seems that I’ve fallen into the newbie PMC Chair rabbit hole! ;-) - the 
> reporting tool has long-standing issues. Maybe they’re fixable, maybe not, 
> but it’s possible we don’t necessarily need it now.
> 
> Sorry :)  Seems to be a rite-of-passage at this point! 

Ha! Just happy that I’m not alone on this.

> It should be mentioned in the handover instructions... or, we should simply 
> merge Daniel Gruno's one-line fix to the regexp that Kibble/Whimsy/reporter 
> tool uses: 
> https://issues.apache.org/jira/browse/COMDEV-425?focusedCommentId=17823767&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17823767

That would be great, but I’m not sure why it’s not been done before at this 
point. I’ll add a note to future handover instructions if it cannot be resolved.

> @Mike is it possible to add “created since” filter?
> 
> Ahh good idea, done!  
> https://githubsearch.mikemccandless.com/search.py?sort=recentlyUpdated=created%3APast+3+months=issue_or_pr%3APR
>   (this is PRs created in the Past 3 months ... it shows 36 open and 162 
> closed right now, close to the GitHub counts you found).

This looks right, thanks. I think we can use Githubsearch going forward. :-) 

> Here's the luceneserver commit that adds it: 
> https://github.com/mikemccand/luceneserver/commit/397942573bed3e2c4fd00ab0a324a19fd014bfd4

Thank you,
-Chris.



Re: Query about the GitHub statistics for Lucene

2024-03-06 Thread Chris Hegarty
Hi,

Seems that I’ve fallen into the newbie PMC Chair rabbit hole! ;-) - the 
reporting tool has long-standing issues. Maybe they’re fixable, maybe not, but 
it’s possible we don’t necessarily need it now.

> On 5 Mar 2024, at 18:22, Michael McCandless  wrote:
> 
> ...
> @Mike. Would it be possible to add a “Past 3 months” to 
> https://githubsearch.mikemccandless.com/search.py ? Which would be helpful 
> when reporting.
> 
> Good idea!  Done!  
> https://githubsearch.mikemccandless.com/search.py?sort=recentlyUpdated=status%3AOpen=updated%3APast+3+months

Cool. Thanks.

The stats I’m trying to retrieve are for PRs created in the past 3 months. 
GitHub allows me to get that with:
   https://github.com/apache/lucene/pulls?q=is%3Apr+created%3A%3E2023-12-05

, which (when run today) shows:  PRs - 36 Open   163 Closed
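
For anyone who wants to regenerate that link for a later report, here is a small sketch that builds the same "created in the past N months" query URL from a given date. The class and method names are hypothetical; the query syntax is simply copied from the link above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.time.LocalDate;

final class PrStatsUrl {

    /**
     * Builds a GitHub pull-request search URL for PRs created after
     * (today minus monthsBack months) in the given "owner/name" repository.
     */
    static String createdSinceUrl(String repo, LocalDate today, int monthsBack) {
        LocalDate since = today.minusMonths(monthsBack);
        // LocalDate.toString() yields the ISO-8601 form, e.g. 2023-12-05.
        String query = "is:pr created:>" + since;
        return "https://github.com/" + repo + "/pulls?q="
                + URLEncoder.encode(query, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Use the current date when preparing a new report.
        System.out.println(createdSinceUrl("apache/lucene", LocalDate.now(), 3));
    }
}
```

Called with 2024-03-05 and 3 months, this reproduces the URL quoted above.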

Another interesting stat is PRs UPDATED in the past 3 months, e.g.
  https://github.com/apache/lucene/pulls?q=is%3Apr+updated%3A%3E2023-12-05+
   ~355 PRs updated.
   ( which we can also see from Mike’s githubsearch [1])

@Mike is it possible to add “created since” filter?

Another very rough approximation of activity / health is commits, e.g.

  $ git log --pretty='format:%cd' --since='3 months ago' | wc -l
  244
  $ git log --all --pretty='format:%cd' --since='3 months ago' | wc -l
  472

So 472 commits on all branches in the past 3 months.

-Chris

[1] 
https://githubsearch.mikemccandless.com/search.py?chg=du==status=undefined=0=29577=recentlyUpdated=list=uzz5ht9buk98=status%3AOpen=updated%3APast+3+months=issue_or_pr%3APR=





Re: Query about the GitHub statistics for Lucene

2024-03-05 Thread Chris Hegarty


> On 5 Mar 2024, at 13:26, Robert Muir  wrote:
> 
> On Tue, Mar 5, 2024 at 4:50 AM Chris Hegarty
>  wrote:
>> It appears that there is no GH activity for 2024! Clearly this is incorrect. 
>> I’ve yet to track down what’s going on with this. Familiar to anyone here?
>> 
> 
> Last time I looked at this, it appeared it is looking at the incorrect
> github repositories, for example https://github.com/apache/lucene-solr
> and not https://github.com/apache/lucene

Ah, that could explain it!!

I’ll try to take a look at what repo those report stats are generated from, and 
how we might be able to get that updated. Mostly for convenience, and also 
having a single source of truth.

Anyway, thankfully git and GH are good enough to get the kind of basic stats we 
typically want - just that it’s not as clear when comparing to previously 
gathered stats. Well… commits are commits, and counting PRs should not result 
in different numbers, but you know ... ;-) 

Thanks,
-Chris.



Query about the GitHub statistics for Lucene

2024-03-05 Thread Chris Hegarty
Hi,

In preparation for the project’s upcoming ASF board report, I came across and 
reported [1] an issue with the GH statistics, available at: 
https://reporter.apache.org/wizard/statistics?lucene

It appears that there is no GH activity for 2024! Clearly this is incorrect. 
I’ve yet to track down what’s going on with this. Familiar to anyone here? 

@Mike. Would it be possible to add a “Past 3 months” to 
https://githubsearch.mikemccandless.com/search.py ? Which would be helpful when 
reporting.

-Chris

[1] https://lists.apache.org/thread/78fh8hb68zybbkz63odb0hzohgrddzkq



Re: [Vote] Bump the Lucene main branch to Java 21

2024-02-29 Thread Chris Hegarty
Hi, 

> On 29 Feb 2024, at 11:38, Uwe Schindler  wrote:
> 
> Hi,
> 
> this vote has passed.

I was about to send a note about this, but you beat me to it! ;-)  The 
substantive point is that the vote passed - Awesome!

> 
> I wanted to wait for Chris to merge the PR, but due to heavy work on main 
> removing ByteBufferIndexInput and updating Java versions, I accidentally 
> pushed the wrong branch to main, so it is already merged. The PR was closed 
> manually.
> 
> Lucene "main" (10.0) is now on Java 21.
> 
> Sorry, Chris - my fault!

Apology not needed. Thank you, and the other contributors on that PR, so much 
for getting this done. I’m super happy with the outcome.

-Chris.

> Uwe
> 
> Am 23.02.2024 um 12:24 schrieb Chris Hegarty:
>> Hi,
>> 
>> Since the discussion on bumping the Lucene main branch to Java 21 is winding 
>> down, let's hold a vote on this important change.
>> 
>> Once bumped, the next major release of Lucene (whenever that will be) will 
>> require a version of Java greater than or equal to Java 21.
>> 
>> The vote will be open for at least 72 hours (and allow some additional time 
>> for the weekend) i.e. until 2024-02-28 12:00 UTC.
>> 
>> [ ] +1  approve
>> [ ] +0  no opinion
>> [ ] -1  disapprove (and reason why)
>> 
>> Here is my +1
>> 
>> -Chris.
>> 
> -- 
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de
> eMail: u...@thetaphi.de
> 
> 
> 





[Vote] Bump the Lucene main branch to Java 21

2024-02-23 Thread Chris Hegarty
Hi,

Since the discussion on bumping the Lucene main branch to Java 21 is winding 
down, let's hold a vote on this important change.

Once bumped, the next major release of Lucene (whenever that will be) will 
require a version of Java greater than or equal to Java 21.

The vote will be open for at least 72 hours (and allow some additional time for 
the weekend) i.e. until 2024-02-28 12:00 UTC.

[ ] +1  approve
[ ] +0  no opinion
[ ] -1  disapprove (and reason why)

Here is my +1

-Chris.



Re: Bump the Lucene main branch to Java 21

2024-02-21 Thread Chris Hegarty
Hi Mike,

> On 21 Feb 2024, at 12:34, Michael McCandless  
> wrote:
> 
> Thank you for the heads up Chris.
> 
> So I think this means we are now free to use all the newfangled language 
> features since Java 11 (min required for Lucene 9.x) -> Java 21?

For the _main_ branch, yes.

The _branch_9x_ remains unchanged - it stays on Java 11.

So, if you’re planning to backport a change from main to 9x, then you may want 
to consider which Java language features and/or JDK APIs you use - to make the 
backport more straightforward. But this is nothing new: _main_ is already on 
Java 17, while 9x is on Java 11, so the scenario already exists; only the 
range is changing with this proposal. Hope this helps.
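
As a rough illustration of the backport point (a hypothetical example, not code from Lucene): the method below is written in a form that compiles on Java 11, with the equivalent Java 21 pattern-matching switch shown only in a comment. A change written in the newer form would need rewriting along these lines before it could land on 9x.

```java
final class BackportFriendly {

    /** Java 11-compatible form: classic instanceof checks with explicit casts. */
    static String describe(Object value) {
        if (value instanceof Integer) {
            Integer i = (Integer) value;
            return "int:" + i;
        } else if (value instanceof String) {
            String s = (String) value;
            return "string:" + s.length();
        }
        return "other";
    }

    // The same logic using pattern matching for switch (final in Java 21).
    // Kept as a comment so that this file still compiles on Java 11:
    //
    //   return switch (value) {
    //       case Integer i -> "int:" + i;
    //       case String s  -> "string:" + s.length();
    //       default        -> "other";
    //   };
    //
    // Note one behavioural difference: for a null argument the if-chain
    // returns "other", while the switch throws NullPointerException unless
    // a `case null` label is added.

    public static void main(String[] args) {
        System.out.println(describe(42));
        System.out.println(describe("abc"));
        System.out.println(describe(1.5));
    }
}
```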

-Chris.

> 
> Mike McCandless
> 
> http://blog.mikemccandless.com
> 
> 
> On Wed, Feb 21, 2024 at 3:58 AM Chris Hegarty 
>  wrote:
> Hi,
> 
> A number of us have been iterating on a PR to bump the Lucene main branch to 
> a minimum of Java 21 [1]. The work is in a good state and is almost ready to 
> commit.
> 
> While the changes themselves are not large, the impact is arguably larger. So 
> I’m raising awareness here with the wider group.
> 
> Clearly one could conflate the bump to Java 21 with the question of when will 
> Lucene have a next major release, but those issues, while somewhat related, 
> are orthogonal. My position is that the next Lucene major should be on Java 
> 21, regardless of when that will happen.
> 
> Comments, feedback, suggestions welcome.
> 
> Thanks,
> -Chris.
> 
> [1] https://github.com/apache/lucene/pull/12753
> 
> 
> 





Bump the Lucene main branch to Java 21

2024-02-21 Thread Chris Hegarty
Hi,

A number of us have been iterating on a PR to bump the Lucene main branch to a 
minimum of Java 21 [1]. The work is in a good state and is almost ready to 
commit.

While the changes themselves are not large, the impact is arguably larger. So 
I’m raising awareness here with the wider group.

Clearly one could conflate the bump to Java 21 with the question of when will 
Lucene have a next major release, but those issues, while somewhat related, are 
orthogonal. My position is that the next Lucene major should be on Java 21, 
regardless of when that will happen.

Comments, feedback, suggestions welcome.

Thanks,
-Chris.

[1] https://github.com/apache/lucene/pull/12753





Re: Welcome Zhang Chao as Lucene committer

2024-02-20 Thread Chris Hegarty
Congratulations and welcome!! 

-Chris.

> On 20 Feb 2024, at 17:28, Adrien Grand  wrote:
> 
> I'm pleased to announce that Zhang Chao has accepted the PMC's
> invitation to become a committer.
> 
> Chao, the tradition is that new committers introduce themselves with a
> brief bio.
> 
> Congratulations and welcome!
> 
> -- 
> Adrien





Re: Announcing githubsearch!

2024-02-20 Thread Chris Hegarty
Awesome! I love it. Very useful.

-Chris.

> On 20 Feb 2024, at 11:40, Michael McCandless  
> wrote:
> 
> Thank you for all the warm feedback everyone, and all the exciting issues 
> already uncovered / ideas for improvements.  Now I have some more fun work to 
> do!
> 
> Mike McCandless
> 
> http://blog.mikemccandless.com
> 
> 
> On Mon, Feb 19, 2024 at 12:58 PM Julie Tibshirani  wrote:
> This is so cool! Thank you Mike for developing and hosting these services!
> 
> Julie
> 
> On Mon, Feb 19, 2024 at 9:40 AM Michael Wechner  
> wrote:
> thank you very much!
> 
> Am 19.02.24 um 17:39 schrieb Michael McCandless:
>> Hi Team,
>> 
>> ~1.5 years ago (August 2022) we migrated our Lucene issue tracking from Jira 
>> to GitHub. Thank you Tomoko for all the hard work doing such a complex, 
>> multi-phased, high-fidelity migration!
>> 
>> I finally finished also migrating jirasearch to GitHub: 
>> githubsearch.mikemccandless.com. It was tricky because GitHub issues/PRs are 
>> fundamentally more complex than Jira's data model, and the GitHub REST API 
>> is also quite rich / heavily normalized. All of the source code for 
>> githubsearch lives here. The UI remains its barebones self ;)
>> 
>> Githubsearch is dog food for us: it showcases Lucene (currently 9.8.0), and 
>> many of its fun features like infix autosuggest, block join queries (each 
>> comment is a sub-document on the issue/PR), DrillSideways faceting, 
>> near-real-time indexing/searching, synonyms (try “oome”), expressions, 
>> non-relevance and blended-relevance sort, etc.  (This old blog post goes 
>> into detail.)  Plus, it’s meta-fun to use Lucene to search its own issues, 
>> to help us be more productive in improving Lucene!  Nicely recursive.
>> 
>> In addition to good ol’ searching by text, githubsearch has some new/fun 
>> features:
>> • Drill down to just PRs or issues
>> • Filter by “review requested” for a given user: poor Adrien has 8 
>> (open) now (sorry)! Or see your mentions (Robert is mentioned in 27 open 
>> issues/PRs). Or PRs that you reviewed (Uwe has reviewed 9 still-open PRs). 
>> Or issues and PRs where a user has had any involvement at all (Dawid has 
>> interacted on 197 issues/PRs).
>> • Find still-open PRs that were created by a New Contributor (an author 
>> who has no changes merged into our repository) or Contributor (non-committer 
>> who has had some changes merged into our repository) or Member
>> • Here are the uber-stale (last touched more than a month ago) open PRs 
>> by outside contributors. We should ideally keep this at 0, but it’s 83 now!
>> • “Link to this search” to get a short-er, more permanent URL (it is NOT 
>> a URL shortener, though!)
>> • Save named searches you frequently run (they just save to local cookie 
>> state on that one browser)
>> I’m sure there are exciting bugs, feedback/patches welcome!  If you see 
>> problems, please reply to this email or file an issue here. 
>> 
>> Note that jirasearch remains running, to search Solr, Tika and Infra issues.
>> 
>> Happy Searching,
>> 
>> Mike McCandless
>> 
>> http://blog.mikemccandless.com
> 





Re: [VOTE] Release Lucene 9.10.0 RC1

2024-02-19 Thread Chris Hegarty


+1   SUCCESS! [1:14:49.683559]

-Chris.

> On 15 Feb 2024, at 21:08, Uwe Schindler  wrote:
> 
> Hi,
> I used Stefan Vodita's Hack to make the Smoketester run on a large list of 
> JDKs: https://github.com/apache/lucene/pull/13108
> See the console of running Java 11, Java 17, Java 19, Java 20, Java 21. Due 
> to limitations of Gradle I wasn't able to do the smoker checks on Java 22 
> release candidate, but as there are no changes to 9.x branch I assume that 
> everything also works in Java 22. If anybody else has time to run a test 
> project with Java 22 using mmap and vectors it would be great!
> Log file: https://jenkins.thetaphi.de/job/Lucene-Release-Tester-v2/3/console
> Result was:
> SUCCESS! [2:42:55.968473]
> 
> Here is my +1 (binding).
> Uwe
> 
> Am 15.02.2024 um 12:50 schrieb Uwe Schindler:
>> Hi,
>> I ran the default smoke tester with Java 11 and Java 17 on Policeman 
>> Jenkins; all looks fine: 
>> https://jenkins.thetaphi.de/job/Lucene-Release-Tester/32/console
>> SUCCESS! [1:04:45.740708]
>> I only have one problem. Now that Java 21 LTS is out and more and more people 
>> use it, it would be good to also run the smoke tester with Java 21. I tried 
>> that locally by just passing the home dir of Java 21 instead of Java 17, but 
>> that failed due to some check in smoker.
>> I will work this evening on patching the smoke tester to also allow passing 
>> Java 21. Maybe the best would be to pass multiple Java versions as a 
>> comma-separated list, just the default one must be Java 11 (the baseline). 
>> This would allow me to spin Policeman Jenkins with Java 11, Java 17, Java 
>> 19, Java 20, Java 21 and Java 22-rc1. Takes a while but would make sure all 
>> works in the officially MR-JAR supported releases + LTS.
>> What do you think?
>> I will give my +1 later when I checked the options and also looked into the 
>> downloaded artifacts.
>> Uwe
>> Am 14.02.2024 um 20:28 schrieb Adrien Grand:
>>> Please vote for release candidate 1 for Lucene 9.10.0
>>> 
>>> The artifacts can be downloaded from:
>>> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.10.0-RC1-rev-695c0ac84508438302cd346a812cfa2fdc5a10df
>>> 
>>> You can run the smoke tester directly with this command:
>>> 
>>> python3 -u dev-tools/scripts/smokeTestRelease.py \
>>> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.10.0-RC1-rev-695c0ac84508438302cd346a812cfa2fdc5a10df
>>> 
>>> The vote will be open for at least 72 hours i.e. until 2024-02-17 20:00 UTC.
>>> 
>>> [ ] +1  approve
>>> [ ] +0  no opinion
>>> [ ] -1  disapprove (and reason why)
>>> 
>>> Here is my +1
>>> 
>>> -- 
>>> Adrien
>> -- 
>> Uwe Schindler
>> Achterdiek 19, D-28357 Bremen
>> https://www.thetaphi.de
>> eMail: u...@thetaphi.de
> -- 
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de
> eMail: u...@thetaphi.de





[ANNOUNCE] Apache Lucene 9.9.2 released

2024-01-29 Thread Chris Hegarty
The Lucene PMC is pleased to announce the release of Apache Lucene 9.9.2.

Apache Lucene is a high-performance, full-featured text search engine library 
written entirely in Java. It is a technology suitable for nearly any 
application that requires full-text search, especially cross-platform.

This patch release contains bug fixes that are highlighted below. The release 
is available for immediate download at: 

https://lucene.apache.org/core/downloads.html

Lucene 9.9.2 Release Highlights

Bug fixes
 * GITHUB#13027: Fix NPE when sampling for quantization in 
Lucene99HnswScalarQuantizedVectorsFormat (Ben Trent)
 * GITHUB#13014: Rollback the tmp storage of BytesRefHash to -1 after sort (Guo 
Feng)

Further details of changes are available in the change log available at:

http://lucene.apache.org/core/9_9_2/changes/Changes.html.

-Chris



[RESULT] [VOTE] Release Lucene 9.9.2 RC1

2024-01-29 Thread Chris Hegarty
It's been >72h since the vote was initiated and the result is:

+1  11  (9 binding)
 0  0
-1  0

This vote has PASSED 

-Chris.




[VOTE] Release Lucene 9.9.2 RC1

2024-01-25 Thread Chris Hegarty
Please vote for release candidate 1 for Lucene 9.9.2

The artifacts can be downloaded from:
https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.2-RC1-rev-a2939784c4ca60bc28bf488b5479c02fc2e5e22c

You can run the smoke tester directly with this command:

python3 -u dev-tools/scripts/smokeTestRelease.py \
https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.2-RC1-rev-a2939784c4ca60bc28bf488b5479c02fc2e5e22c

The vote will be open for 96 hours (allowing some additional time for the 
weekend), i.e. until 2024-01-29 12:00 UTC.

[ ] +1  approve
[ ] +0  no opinion
[ ] -1  disapprove (and reason why)

Here is my +1

Draft release notes can be found at 
https://cwiki.apache.org/confluence/display/LUCENE/ReleaseNote9_9_2

-Chris.





Re: The need for a Lucene 9.9.2 release

2024-01-24 Thread Chris Hegarty
Hi Marios,

If you have a link to the CrateDB issue, I’ll add it to JDK-8323659.

-Chris. 

> On 24 Jan 2024, at 14:26, Chris Hegarty  
> wrote:
> 
> Hi Marios,
> 
> Thanks for raising awareness of this JDK bug.
> 
> Just to be clear, and for other readers of this list, the JDK bug is 
> orthogonal to whether or not we include support for Java 22 in a future 
> Lucene release.
> 
>> On 24 Jan 2024, at 14:15, Marios Trivyzas  wrote:
>> 
>> Hi,
>> 
>> Just an FYI regarding Java22.
>> In CrateDB we experienced issues with a bug: 
>> https://bugs.openjdk.org/browse/JDK-8323659
>> From a quick search, haven't seen usage of `LinkedTransferQueue` in Lucene, 
>> so just wanted to share the issue.
> 
> JDK-8323659 was discovered and fixed in a JDK 22 Early Access build - it will 
> not be in any GA release of 22.
> 
> Unfortunately, JDK-8323659 found its way into a bugfix release of the JDK, 
> 21.0.2.  Since it was found late in the release cycle, 21.0.2 shipped with 
> it. It was later fixed in the yet-to-be-released 21.0.3.
> 
> I’ve seen no issues in Lucene because of JDK-8323659. However, we have seen 
> issues in Elasticsearch, see 
> https://github.com/elastic/elasticsearch/pull/104347, where we needed to 
> work around JDK-8323659 in order to adopt JDK 21.0.2.
> 
> JDK-8323659 is a bit of a sad story. I really wish we could have had a respin 
> of JDK 21.0.2, but that was not possible at the time :-( 
> 
> -Chris.
> 
>> 
>> Cheers
>> -- Marios
>> 
>> On Wed, Jan 24, 2024 at 4:08 PM Chris Hegarty 
>>  wrote:
>> Hi Uwe,
>> 
>>> On 24 Jan 2024, at 13:29, Uwe Schindler  wrote:
>>> 
>>> Hi,
>>> 
>>> Now I understand why you asked yesterday in the Java 22 PR.
>> 
>> :-) I was just gathering my thoughts and considering what options we have 
>> relating to releases (bugfix or minor).
>> 
>>> Do you think we should add Java 22 support for MMAP and Vectors?
>> 
>> No. Let’s do a 9.9.2 with just the two aforementioned bug fixes. We can 
>> later do a 9.10, some time in late February to early March, in order to pick 
>> up the Java 22 and other changes.
>> 
>>> It is a bit risky, because API may still change, but the worst that could 
>>> happen is that people need to pass a sysprop in Java 22 to disable broken 
>>> MMAP (if everything goes wrong).
>>> 
>>> So what do you think? Should we merge in Java 22 support or not? It's a 
>>> bugfix release, so I am not super happy to take any risks.
>> 
>> Agree. Let’s lower the risk. We’ll stay the course for Java 22 support in 
>> 9.10, as would be the case if 9.9.2 was not a thing.
>> 
>> -Chris.
>> 
>>> Uwe
>>> 
>>> Am 23.01.2024 um 18:36 schrieb Chris Hegarty:
>>>> Hi,
>>>> 
>>>> We’ve encountered a serious issue with the recent Lucene 9.9.1 release, 
>>>> which warrants a 9.9.2.
>>>> 
>>>> The issue is an NPE when sampling for quantization in 
>>>> Lucene99HnswScalarQuantizedVectorsFormat [1]. Thankfully Ben has already 
>>>> resolved the issue, and backported it to the appropriate branches.
>>>> 
>>>> I don’t see any other potential issues that would warrant being pulled 
>>>> into this release.
>>>> 
>>>> I’m happy to be Release Manager for 9.9.2 (given my recent experience on 
>>>> 9.9.1). I’ll start the release process tomorrow and notify this list when 
>>>> artifacts are ready.
>>>> 
>>>> Thanks,
>>>> -Chris.
>>>> 
>>>> [1] https://github.com/apache/lucene/pull/13027
>>>> 
>>> -- 
>>> Uwe Schindler
>>> Achterdiek 19, D-28357 Bremen
>>> https://www.thetaphi.de
>>> eMail: u...@thetaphi.de
>>> 
>>> 
>>> 
>> 
>> 
>> 
>> 
>> 
>> -- 
>> Marios






Re: The need for a Lucene 9.9.2 release

2024-01-24 Thread Chris Hegarty
Hi Marios,

Thanks for raising awareness of this JDK bug.

Just to be clear, and for other readers of this list, the JDK bug is orthogonal 
to whether or not we include support for Java 22 in a future Lucene release.

> On 24 Jan 2024, at 14:15, Marios Trivyzas  wrote:
> 
> Hi,
> 
> Just an FYI regarding Java22.
> In CrateDB we experienced issues with a bug: 
> https://bugs.openjdk.org/browse/JDK-8323659
> From a quick search, haven't seen usage of `LinkedTransferQueue` in Lucene, 
> so just wanted to share the issue.

JDK-8323659 was discovered and fixed in a JDK 22 Early Access build - it will 
not be in any GA release of 22.

Unfortunately, JDK-8323659 found its way into a bugfix release of the JDK, 
21.0.2.  Since it was found late in the release cycle, 21.0.2 shipped with it. 
It was later fixed in the yet-to-be-released 21.0.3.

I’ve seen no issues in Lucene because of JDK-8323659. However, we have seen 
issues in Elasticsearch, see 
https://github.com/elastic/elasticsearch/pull/104347, where we needed to 
work around JDK-8323659 in order to adopt JDK 21.0.2.
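
For anyone who wants to guard against this at startup, here is a minimal sketch of a version check. It assumes that matching exactly 21.0.2 is an adequate test, based only on the release numbers mentioned above; vendor builds may differ. The class and method names are hypothetical, and this is not the workaround used in the Elasticsearch PR.

```java
final class AffectedJdkCheck {

    /**
     * Returns true only for version 21.0.2, the GA release that shipped
     * with JDK-8323659 according to the discussion above.
     */
    static boolean isJdk21_0_2(Runtime.Version v) {
        return v.feature() == 21 && v.interim() == 0 && v.update() == 2;
    }

    public static void main(String[] args) {
        Runtime.Version current = Runtime.version();
        if (isJdk21_0_2(current)) {
            System.out.println("Running on " + current
                    + ": JDK-8323659 (LinkedTransferQueue) may apply.");
        } else {
            System.out.println("Running on " + current + ": not 21.0.2.");
        }
    }
}
```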

JDK-8323659 is a bit of a sad story. I really wish we could have had a respin 
of JDK 21.0.2, but that was not possible at the time :-( 

-Chris.

> 
> Cheers
> -- Marios
> 
> On Wed, Jan 24, 2024 at 4:08 PM Chris Hegarty 
>  wrote:
> Hi Uwe,
> 
> > On 24 Jan 2024, at 13:29, Uwe Schindler  wrote:
> > 
> > Hi,
> > 
> > Now I understand why you asked yesterday in the Java 22 PR.
> 
> :-) I was just gathering my thoughts and considering what options we have 
> relating to releases (bugfix or minor).
> 
> > Do you think we should add Java 22 support for MMAP and Vectors?
> 
> No. Let’s do a 9.9.2 with just the two aforementioned bug fixes. We can later 
> do a 9.10, some time in late February to early March, in order to pick up the 
> Java 22 and other changes.
> 
> > It is a bit risky, because API may still change, but the worst that could 
> > happen is that people need to pass a sysprop in Java 22 to disable broken 
> > MMAP (if everything goes wrong).
> > 
> > So what do you think? Should we merge in Java 22 support or not? It's a 
> > bugfix release, so I am not super happy to take any risks.
> 
> Agree. Let’s lower the risk. We’ll stay the course for Java 22 support in 
> 9.10, as would be the case if 9.9.2 was not a thing.
> 
> -Chris.
> 
> > Uwe
> > 
> > Am 23.01.2024 um 18:36 schrieb Chris Hegarty:
> >> Hi,
> >> 
> >> We’ve encountered a serious issue with the recent Lucene 9.9.1 release, 
> >> which warrants a 9.9.2.
> >> 
> >> The issue is a NPE when sampling for quantization in 
> >> Lucene99HnswScalarQuantizedVectorsFormat [1]. Thankfully Ben has already 
> >> resolved the issue, and backported it to the appropriate branches.
> >> 
> >> I don’t see any other potential issues that would warrant being pulled 
> >> into this release.
> >> 
> >> I’m happy to be Release Manager for 9.9.2 (given my recent experience on 
> >> 9.9.1). I’ll start the release process tomorrow and notify this list when 
> >> artifacts are ready.
> >> 
> >> Thanks,
> >> -Chris.
> >> 
> >> [1] https://github.com/apache/lucene/pull/13027
> >> 
> > -- 
> > Uwe Schindler
> > Achterdiek 19, D-28357 Bremen
> > https://www.thetaphi.de
> > eMail: u...@thetaphi.de
> > 
> > 
> > 
> 
> 
> 
> 
> 
> -- 
> Marios





Re: The need for a Lucene 9.9.2 release

2024-01-24 Thread Chris Hegarty
Hi Uwe,

> On 24 Jan 2024, at 13:29, Uwe Schindler  wrote:
> 
> Hi,
> 
> Now I understand why you asked yesterday in the Java 22 PR.

:-) I was just gathering my thoughts and considering what options we have 
relating to releases (bugfix or minor).

> Do you think we should add Java 22 support for MMAP and Vectors?

No. Let’s do a 9.9.2 with just the two aforementioned bug fixes. We can later 
do a 9.10, some time in late February to early March, in order to pick up the 
Java 22 and other changes.

> It is a bit risky, because API may still change, but the worst that could 
> happen is that people need to pass a sysprop in Java 22 to disable broken 
> MMAP (if everything goes wrong).
> 
> So what do you think? Should we merge in Java 22 support or not? It's a 
> bugfix release, so I am not super happy to take any risks.

Agree. Let’s lower the risk. We’ll stay the course for Java 22 support in 9.10, 
as would be the case if 9.9.2 was not a thing.

-Chris.

> Uwe
> 
> Am 23.01.2024 um 18:36 schrieb Chris Hegarty:
>> Hi,
>> 
>> We’ve encountered a serious issue with the recent Lucene 9.9.1 release, which 
>> warrants a 9.9.2.
>> 
>> The issue is a NPE when sampling for quantization in 
>> Lucene99HnswScalarQuantizedVectorsFormat [1]. Thankfully Ben has already 
>> resolved the issue, and backported it to the appropriate branches.
>> 
>> I don’t see any other potential issues that would warrant being pulled into 
>> this release.
>> 
>> I’m happy to be Release Manager for 9.9.2 (given my recent experience on 
>> 9.9.1). I’ll start the release process tomorrow and notify this list when 
>> artifacts are ready.
>> 
>> Thanks,
>> -Chris.
>> 
>> [1] https://github.com/apache/lucene/pull/13027
>> 
> -- 
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de
> eMail: u...@thetaphi.de
> 
> 
> 





Re: The need for a Lucene 9.9.2 release

2024-01-23 Thread Chris Hegarty
Hi Christine,

Including 13014 seems reasonable, and the issue appears quite severe.

Let’s see if we can get 13014 reviewed and merged in the next day or two. If 
so, then it seems reasonable to include.

-Chris.

> On 23 Jan 2024, at 18:30, Christine Poerschke (BLOOMBERG/ LONDON) 
>  wrote:
> 
> Thanks Chris for volunteering!
> 
> I wonder if https://github.com/apache/lucene/pull/13014 might be a candidate 
> for pulling into the release too?
> 
> From: dev@lucene.apache.org At: 01/23/24 17:37:21 UTC
> To: dev@lucene.apache.org
> Subject: The need for a Lucene 9.9.2 release 
> 
> Hi,
> 
> We’ve encountered a serious issue with the recent Lucene 9.9.1 release, which 
> warrants a 9.9.2.
> 
> The issue is a NPE when sampling for quantization in 
> Lucene99HnswScalarQuantizedVectorsFormat [1]. Thankfully Ben has already 
> resolved the issue, and backported it to the appropriate branches. 
> 
> I don’t see any other potential issues that would warrant being pulled into 
> this release.
> 
> I’m happy to be Release Manager for 9.9.2 (given my recent experience on 
> 9.9.1). I’ll start the release process tomorrow and notify this list when 
> artifacts are ready.
> 
> Thanks,
> -Chris.
> 
> [1] https://github.com/apache/lucene/pull/13027
> 
> 





The need for a Lucene 9.9.2 release

2024-01-23 Thread Chris Hegarty
Hi,

We’ve encountered a serious issue with the recent Lucene 9.9.1 release, which 
warrants a 9.9.2.

The issue is a NPE when sampling for quantization in 
Lucene99HnswScalarQuantizedVectorsFormat [1]. Thankfully Ben has already 
resolved the issue, and backported it to the appropriate branches. 

I don’t see any other potential issues that would warrant being pulled into 
this release.

I’m happy to be Release Manager for 9.9.2 (given my recent experience on 
9.9.1). I’ll start the release process tomorrow and notify this list when 
artifacts are ready.

Thanks,
-Chris.

[1] https://github.com/apache/lucene/pull/13027



Re: Welcome Stefan Vodita as Lucene committer

2024-01-18 Thread Chris Hegarty
Welcome Stefan.

-Chris.

> On 18 Jan 2024, at 15:53, Michael McCandless  
> wrote:
> 
> Hi Team,
> 
> I'm pleased to announce that Stefan Vodita has accepted the Lucene PMC's 
> invitation to become a committer!
> 
> Stefan, the tradition is that new committers introduce themselves with a 
> brief bio.
> 
> Congratulations, welcome, and thank you for all your improvements to Lucene 
> and our community,
> 
> Mike McCandless
> 
> http://blog.mikemccandless.com





Re: Lucene v9.9.1: org.apache.lucene.search.ScoreMode

2024-01-06 Thread Chris Hegarty
Hi,

I see no issue. ScoreMode is present in lucene-core-9.9.1.jar

$ curl https://dlcdn.apache.org/lucene/java/9.9.1/lucene-9.9.1.tgz > lucene-9.9.1.tgz
   ...
$ tar -xzf lucene-9.9.1.tgz
$ jar -tvf lucene-9.9.1/modules/lucene-core-9.9.1.jar | grep ScoreMode
  1618 Wed Dec 13 11:06:00 GMT 2023 org/apache/lucene/search/ScoreMode.class

Or from maven

$ curl https://repo1.maven.org/maven2/org/apache/lucene/lucene-core/9.9.1/lucene-core-9.9.1.jar > lucene-core-9.9.1.jar
   ...
$ jar -tvf lucene-core-9.9.1.jar | grep ScoreMode
-rw-r--r--  0 0  0  1618 13 Dec 11:06 org/apache/lucene/search/ScoreMode.class
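
The same check can be scripted. Below is a minimal sketch in Python, relying only on the fact that a jar is a zip archive; jar_contains_class is a hypothetical helper name, and the jar path is whatever you downloaded above.

```python
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the jar has an entry for the fully qualified class.

    jar_path may be a filesystem path or a file-like object.
    """
    # A class such as org.apache.lucene.search.ScoreMode is stored in the
    # jar as org/apache/lucene/search/ScoreMode.class
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

For example, jar_contains_class("lucene-core-9.9.1.jar", "org.apache.lucene.search.ScoreMode") should agree with the grep above. If the class is in the jar but still cannot be resolved, the classpath of the build is the more likely place to look.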

-Chris.

> On 6 Jan 2024, at 12:42, Nazerke S  wrote:
> 
> Hi, 
> 
> While I was trying to upgrade Solr to use Lucene v9.9.1, I encountered 
> 'org.apache.lucene.search.ScoreMode' not found, getting resolve class issue. 
> Quickly took a look into the ScoreMode class in lucene codebase,  there is no 
> change. 
> Maybe it is related to lucene-core-9.9.1.jar issue where ScoreMode class is  
> ? 
>  Anyone could help with this ?   
> 
> Thanksss, 
> 
> --Nazerke





Re: [JENKINS] Lucene » Lucene-NightlyTests-9.9 - Build # 18 - Unstable!

2023-12-17 Thread Chris Hegarty
Hi Dawid

> On 16 Dec 2023, at 07:36, Dawid Weiss  wrote:
> 
> 
> This one has been fixed on branch_9x and on main - it's a broken test. Should 
> we apply it on 9_9 to quiet down CI or is the branch frozen?

I completed the RM tasks for the 9.9.1 release, and just merged the fix for 
this test into branch_9_9. Should be stable now. Thanks,

-Chris.

> Dawid
> 
> On Sat, Dec 16, 2023 at 3:24 AM Apache Jenkins Server 
>  wrote:
> Build: https://ci-builds.apache.org/job/Lucene/job/Lucene-NightlyTests-9.9/18/
> 
> 1 tests failed.
> FAILED:  
> org.apache.lucene.queries.function.TestSortedSetFieldSource.testSimple
> 
> Error Message:
> org.junit.ComparisonFailure: expected: but was:
> 
> Stack Trace:
> org.junit.ComparisonFailure: expected: but was:
> at __randomizedtesting.SeedInfo.seed([7908DAB4CF9E2D92:41BBFE4AE86DF943]:0)
> at junit@4.13.1/org.junit.Assert.assertEquals(Assert.java:117)
> at junit@4.13.1/org.junit.Assert.assertEquals(Assert.java:146)
> at org.apache.lucene.queries.function.TestSortedSetFieldSource.testSimple(TestSortedSetFieldSource.java:59)
> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
> at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
> at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
> at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
> at org.apache.lucene.test_framework@9.9.1-SNAPSHOT/org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
> at org.apache.lucene.test_framework@9.9.1-SNAPSHOT/org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at org.apache.lucene.test_framework@9.9.1-SNAPSHOT/org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
> at org.apache.lucene.test_framework@9.9.1-SNAPSHOT/org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
> at org.apache.lucene.test_framework@9.9.1-SNAPSHOT/org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
> at junit@4.13.1/org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
> at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
> at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
> at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
> at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
> at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
> at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
> at org.apache.lucene.test_framework@9.9.1-SNAPSHOT/org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
> at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at org.apache.lucene.test_framework@9.9.1-SNAPSHOT/org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
> at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> at 
> 

Re: [VOTE] Release Lucene 9.9.1 RC1

2023-12-17 Thread Chris Hegarty
Hi Dawid,

> On 15 Dec 2023, at 14:18, Dawid Weiss  wrote:
> 
> 
> I hit TestSortedSetFieldSource assertion fixed in 12851 - this wasn't applied 
> to branch_9_9, I think. Not a big problem (test fix).

I just merged the fix for this test into branch_9_9. It should be stable now.

Thanks,
-Chris.

> 
> The second run succeeded on a slow vm.
> SUCCESS! [3:34:11.606597]
> 
> +1.
> 
> Dawid
> 
> On Wed, Dec 13, 2023 at 12:55 PM Chris Hegarty 
>  wrote:
> Hi,
> 
> Please vote for release candidate 1 for Lucene 9.9.1
> 
> The artifacts can be downloaded from:
> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.1-RC1-rev-eee32cbf5e072a8c9d459c349549094230038308
> 
> You can run the smoke tester directly with this command:
> 
> python3 -u dev-tools/scripts/smokeTestRelease.py \
> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.1-RC1-rev-eee32cbf5e072a8c9d459c349549094230038308
> 
> The vote will be open for at least 72 hours i.e. until 2023-12-16 12:00 UTC.
> 
> [ ] +1  approve
> [ ] +0  no opinion
> [ ] -1  disapprove (and reason why)
> 
> Here is my +1
> 
> -Chris.
> 
> 
> 





[ANNOUNCE] Apache Lucene 9.9.1 released

2023-12-16 Thread Chris Hegarty
The Lucene PMC is pleased to announce the release of Apache Lucene 9.9.1.

Apache Lucene is a high-performance, full-featured text search engine library 
written entirely in Java. It is a technology suitable for nearly any 
application that requires full-text search, especially cross-platform.

This patch release contains bug fixes that are highlighted below. The release 
is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 9.9.1 Release Highlights

Bug fixes

• JVM SIGSEGV crash when compiling 
computeCommonPrefixLengthAndBuildHistogram (Chris Hegarty)

• Push and pop OutputAccumulator as IntersectTermsEnumFrames are pushed and 
popped (Guo Feng, Mike McCandless)

Further details of changes are available in the change log available at: 

http://lucene.apache.org/core/9_9_1/changes/Changes.html. 

-Chris



[RESULT] [VOTE] Release Lucene 9.9.1 RC1

2023-12-16 Thread Chris Hegarty
Hi,

It's been >72h since the vote was initiated and the result is:

+1  9  (8 binding)
 0  0
-1  0

This vote has PASSED 
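
For readers unfamiliar with how such a tally is judged, here is a small sketch in Python of the usual ASF release rule as I understand it: at least three binding +1 votes, and more binding +1 than binding -1. The function name is hypothetical, and the numbers plugged in below are the ones from this result.

```python
def release_vote_passes(binding_plus_one, binding_minus_one):
    """Majority approval: at least 3 binding +1 and more +1 than -1."""
    return binding_plus_one >= 3 and binding_plus_one > binding_minus_one


# Tally reported above: 8 binding +1, no -1 votes at all.
VOTE_9_9_1_RC1_PASSED = release_vote_passes(8, 0)
```

Non-binding votes are welcome input for the release manager but do not enter the check.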

-Chris.




Re: [VOTE] Release Lucene 9.9.1 RC1

2023-12-15 Thread Chris Hegarty
Dawid,

> On 15 Dec 2023, at 09:56, Chris Hegarty  
> wrote:
> 
>> ...
>> File: 
>> /tmp/smoke_lucene_9.9.1_eee32cbf5e072a8c9d459c349549094230038308/lucene.lucene-9.9.1-src.tgz.gpg.verify.log
>>verify trust
>>  GPG: gpg: WARNING: This key is not certified with a trusted signature!
> 
> I believe the warning means that your local trusted database does not contain 
> my key, or a chain of keys from which mine can be certified.  Maybe look into 
> whether or not you need to run `gpg --update-trustdb`.

You could try running the following to check the validity of the key:

$ gpg --list-keys --list-options show-uid-validity 0E6E898E

pub   rsa4096 2023-11-23 [SC]
  9A56AE5DDB7C01163B39DE72A14898310E6E898E
uid   [ultimate] Chris Hegarty (CODE SIGNING KEY) 
sub   rsa4096 2023-11-23 [E]
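
As an aside on why 0E6E898E selects this key: for current (v4) OpenPGP keys the short key ID is the last 8 hex digits of the fingerprint. A minimal Python sketch, with a hypothetical helper name, makes the relationship explicit:

```python
def short_key_id(fingerprint):
    """Return the 8-hex-digit short key ID for a v4 OpenPGP fingerprint."""
    # Fingerprints are sometimes printed in space-separated groups.
    hex_digits = fingerprint.replace(" ", "").upper()
    return hex_digits[-8:]
```

Applied to the fingerprint printed above, this gives the ID passed to --list-keys.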

-Chris.



Re: [VOTE] Release Lucene 9.9.1 RC1

2023-12-15 Thread Chris Hegarty
Hi Dawid,

> On 14 Dec 2023, at 17:47, Dawid Weiss  wrote:
> 
>  SUCCESS! [0:14:52.296147]
> 
> What?!... How is that possible, Mike?
> 
> I get gpg warnings - is this normal?
> 
> File: 
> /tmp/smoke_lucene_9.9.1_eee32cbf5e072a8c9d459c349549094230038308/lucene.lucene-9.9.1-src.tgz.gpg.verify.log
> verify trust
>   GPG: gpg: WARNING: This key is not certified with a trusted signature!

I believe the warning means that your local trusted database does not contain 
my key, or a chain of keys from which mine can be certified.  Maybe look into 
whether or not you need to run `gpg --update-trustdb`.

-Chris

> Dawid





Re: [VOTE] Release Lucene 9.9.1 RC1

2023-12-13 Thread Chris Hegarty
And (short) release note:

  https://cwiki.apache.org/confluence/display/LUCENE/ReleaseNote9_9_1

-Chris.

> On 13 Dec 2023, at 11:55, Chris Hegarty  
> wrote:
> 
> Hi,
> 
> Please vote for release candidate 1 for Lucene 9.9.1
> 
> The artifacts can be downloaded from:
> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.1-RC1-rev-eee32cbf5e072a8c9d459c349549094230038308
> 
> You can run the smoke tester directly with this command:
> 
> python3 -u dev-tools/scripts/smokeTestRelease.py \
> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.1-RC1-rev-eee32cbf5e072a8c9d459c349549094230038308
> 
> The vote will be open for at least 72 hours i.e. until 2023-12-16 12:00 UTC.
> 
> [ ] +1  approve
> [ ] +0  no opinion
> [ ] -1  disapprove (and reason why)
> 
> Here is my +1
> 
> -Chris.
> 





[VOTE] Release Lucene 9.9.1 RC1

2023-12-13 Thread Chris Hegarty
Hi,

Please vote for release candidate 1 for Lucene 9.9.1

The artifacts can be downloaded from:
https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.1-RC1-rev-eee32cbf5e072a8c9d459c349549094230038308

You can run the smoke tester directly with this command:

python3 -u dev-tools/scripts/smokeTestRelease.py \
https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.1-RC1-rev-eee32cbf5e072a8c9d459c349549094230038308
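
If you script around release candidates, the URL itself carries the version, the RC number and the git revision. The following Python sketch pulls them apart; the pattern is inferred from this one URL and the function name is hypothetical, so treat it as an assumption rather than a documented naming scheme.

```python
import re

# Pattern inferred from the RC URL in this message, not from a specification.
RC_URL_PATTERN = re.compile(
    r"lucene-(?P<version>\d+\.\d+\.\d+)-RC(?P<rc>\d+)-rev-(?P<rev>[0-9a-f]+)$"
)


def parse_rc_url(url):
    """Return (version, rc_number, revision) from a release candidate URL."""
    match = RC_URL_PATTERN.search(url.rstrip("/"))
    if match is None:
        raise ValueError("not a recognised release candidate URL: " + url)
    return match.group("version"), int(match.group("rc")), match.group("rev")
```

The revision can then be compared against the commit you expect the RC to be built from.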

The vote will be open for at least 72 hours i.e. until 2023-12-16 12:00 UTC.
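
The deadline arithmetic can be checked with a few lines of Python. This sketch assumes the vote was opened at 11:55 UTC on 2023-12-13, which is what the reply timestamps elsewhere in this thread suggest; the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def earliest_vote_close(opened_at, minimum_hours=72):
    """Return the earliest moment a vote opened at opened_at may close."""
    return opened_at + timedelta(hours=minimum_hours)


# Assumed opening time of this vote (UTC).
opened = datetime(2023, 12, 13, 11, 55, tzinfo=timezone.utc)
earliest = earliest_vote_close(opened)
```

Under that assumption the earliest close is 2023-12-16 11:55 UTC, so the stated 12:00 UTC is simply the next round time after the 72 hour minimum.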

[ ] +1  approve
[ ] +0  no opinion
[ ] -1  disapprove (and reason why)

Here is my +1

-Chris.





Re: The need for a Lucene 9.9.1 release

2023-12-13 Thread Chris Hegarty
Thanks,

I added a couple of 9.9.1 changelog entries, and will start the RC1 process.

-Chris.

> On 12 Dec 2023, at 18:42, Michael McCandless  
> wrote:
> 
> OK this is merged now.  Are there any other 9.9.1 blockers?  I am trying to 
> pass all Monster tests but that can probably just run concurrently with 
> voting (optimistic concurrency!)?
> 
> Mike McCandless
> 
> http://blog.mikemccandless.com
> 
> 
> On Tue, Dec 12, 2023 at 9:18 AM Chris Hegarty 
>  wrote:
> Hi Mike,
> 
> > On 12 Dec 2023, at 12:56, Michael McCandless  
> > wrote:
> > 
> > Hi Chris,
> > 
> > I think we should also regenerate the FSTs for 9.9.1?
> 
> Seems reasonable.
> 
> > https://github.com/apache/lucene/pull/12924
> 
> I added my comments and review on the PR.
> 
> -Chris.
> 
> > Thanks,
> > 
> > Mike
> 
> 





Re: The need for a Lucene 9.9.1 release

2023-12-12 Thread Chris Hegarty
Hi Mike,

> On 12 Dec 2023, at 12:56, Michael McCandless  
> wrote:
> 
> Hi Chris,
> 
> I think we should also regenerate the FSTs for 9.9.1?

Seems reasonable.

> https://github.com/apache/lucene/pull/12924

I added my comments and review on the PR.

-Chris.

> Thanks,
> 
> Mike




Re: The need for a Lucene 9.9.1 release

2023-12-11 Thread Chris Hegarty
Just a quick update on this...

> On 9 Dec 2023, at 09:09, Chris Hegarty  wrote:
> 
> Hi,
> 
> We’ve encountered two very serious issues with the recent Lucene 9.9.0 release, 
> both of which (even if taken by themselves) would warrant a 9.9.1. The issues 
> are:
> 
> 1. https://github.com/apache/lucene/issues/12895 - Corruption read on term 
> dictionaries in Lucene 9.9

Great work has been done re-adding tests, creating a new test to reproduce, and 
also working on an underlying fix. It feels like we’re getting close! :-) 

> 2. https://github.com/apache/lucene/issues/12898 - JVM SIGSEGV crash when 
> compiling computeCommonPrefixLengthAndBuildHistogram Lucene 9.9.0

Merged to branch_9_9.

Once no.1 is merged, I’ll build a 9.9.1 RC1 and start a vote.

-Chris







Re: The need for a Lucene 9.9.1 release

2023-12-10 Thread Chris Hegarty


> On 9 Dec 2023, at 09:09, Chris Hegarty  wrote:
> 
> Hi,
> 
> We’ve encountered two very serious issues with the recent Lucene 9.9.0 release, 
> both of which (even if taken by themselves) would warrant a 9.9.1. The issues 
> are:
> 
> 1. https://github.com/apache/lucene/issues/12895 - Corruption read on term 
> dictionaries in Lucene 9.9
> 
> 2. https://github.com/apache/lucene/issues/12898 - JVM SIGSEGV crash when 
> compiling computeCommonPrefixLengthAndBuildHistogram Lucene 9.9.0

I opened a small PR which reflows the code in 
computeCommonPrefixLengthAndBuildHistogram, which has the effect of working 
around the JIT crash.

https://github.com/apache/lucene/pull/12903

-Chris.


> There is still a little investigation and work left to bring these issues to 
> a point where we’re comfortable with proposing a solution. I would be hopeful 
> that we’ll get there by early next week. If so, then a Lucene 9.9.1 release 
> can be proposed.
> 
> Thanks,
> -Chris.






Re: The need for a Lucene 9.9.1 release

2023-12-09 Thread Chris Hegarty
FYI - I added the next bugfix version 9.9.1 to `branch_9_9`, in preparation for 
the upcoming bug fix release.

https://github.com/apache/lucene/commit/1617c0b3a5624adba6e7b380dfeb7fb89b8a2feb

-Chris.

> On 9 Dec 2023, at 09:09, Chris Hegarty  wrote:
> 
> Hi,
> 
> We’ve encountered two very serious issues with the recent Lucene 9.9.0 release, 
> both of which (even if taken by themselves) would warrant a 9.9.1. The issues 
> are:
> 
> 1. https://github.com/apache/lucene/issues/12895 - Corruption read on term 
> dictionaries in Lucene 9.9
> 
> 2. https://github.com/apache/lucene/issues/12898 - JVM SIGSEGV crash when 
> compiling computeCommonPrefixLengthAndBuildHistogram Lucene 9.9.0
> 
> There is still a little investigation and work left to bring these issues to 
> a point where we’re comfortable with proposing a solution. I would be hopeful 
> that we’ll get there by early next week. If so, then a Lucene 9.9.1 release 
> can be proposed.
> 
> Thanks,
> -Chris.






Re: The need for a Lucene 9.9.1 release

2023-12-09 Thread Chris Hegarty
Oh, and I’m happy to be Release Manager for 9.9.1 (given my recent experience 
on 9.9.0)

-Chris.

> On 9 Dec 2023, at 09:09, Chris Hegarty  wrote:
> 
> Hi,
> 
> We’ve encountered two very serious issues with the recent Lucene 9.9.0 release, 
> both of which (even if taken by themselves) would warrant a 9.9.1. The issues 
> are:
> 
> 1. https://github.com/apache/lucene/issues/12895 - Corruption read on term 
> dictionaries in Lucene 9.9
> 
> 2. https://github.com/apache/lucene/issues/12898 - JVM SIGSEGV crash when 
> compiling computeCommonPrefixLengthAndBuildHistogram Lucene 9.9.0
> 
> There is still a little investigation and work left to bring these issues to 
> a point where we’re comfortable with proposing a solution. I would be hopeful 
> that we’ll get there by early next week. If so, then a Lucene 9.9.1 release 
> can be proposed.
> 
> Thanks,
> -Chris.






The need for a Lucene 9.9.1 release

2023-12-09 Thread Chris Hegarty
Hi,

We’ve encountered two very serious issues with the recent Lucene 9.9.0 release, 
both of which (even if taken by themselves) would warrant a 9.9.1. The issues 
are:

1. https://github.com/apache/lucene/issues/12895 - Corruption read on term 
dictionaries in Lucene 9.9

2. https://github.com/apache/lucene/issues/12898 - JVM SIGSEGV crash when 
compiling computeCommonPrefixLengthAndBuildHistogram Lucene 9.9.0

There is still a little investigation and work left to bring these issues to a 
point where we’re comfortable with proposing a solution. I would be hopeful 
that we’ll get there by early next week. If so, then a Lucene 9.9.1 release can 
be proposed.

Thanks,
-Chris.



SIGSEGV with Lucene 9.9.0

2023-12-06 Thread Chris Hegarty
Hi,

I want to raise awareness of a JVM crash that has been tickled by the Lucene 
9.9.0 release. I don’t have all the answers (yet), but I just want to ensure 
that folk here are aware. Details in this Elasticsearch issue [1]

# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x7f57de17ef0e, pid=9600, tid=48393
#
# JRE version: OpenJDK Runtime Environment (21.0.1+12) (build 21.0.1+12-29)
# Java VM: OpenJDK 64-Bit Server VM (21.0.1+12-29, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V [libjvm.so+0xb93f0e] PhaseIdealLoop::build_loop_late_post_work(Node*, bool)+0xce
#

I dunno why we’ve not seen this before, but it reproduces quite frequently with 
a particular Elasticsearch test.

Offending methods:
-XX:CompileCommand=exclude,org.apache.lucene.util.MSBRadixSorter::computeCommonPrefixLengthAndBuildHistogram \
-XX:CompileCommand=exclude,org.apache.lucene.util.RadixSelector::computeCommonPrefixLengthAndBuildHistogram
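
If the flags need to be generated for a test harness, a sketch along these lines would do. It only reproduces the flag syntax shown above; the helper name and the constant are hypothetical.

```python
def exclude_from_jit(class_name, method_name):
    """Build a HotSpot flag that excludes one method from JIT compilation."""
    return "-XX:CompileCommand=exclude,{}::{}".format(class_name, method_name)


# The two methods named in this message.
WORKAROUND_FLAGS = [
    exclude_from_jit(
        "org.apache.lucene.util.MSBRadixSorter",
        "computeCommonPrefixLengthAndBuildHistogram",
    ),
    exclude_from_jit(
        "org.apache.lucene.util.RadixSelector",
        "computeCommonPrefixLengthAndBuildHistogram",
    ),
]
```

Excluded methods run in the interpreter, so this trades some sorting performance for not hitting the compiler crash.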

The issue is reproducible with JDK 19.0.2, JDK 20.0.2, JDK 21.0.1.  There is no 
crash in JDK 21.0.2 or JDK 22 (local builds of yet unreleased JDKs), but my 
understanding is that the method is just not compiled.  Investigation is 
ongoing in the JDK issue  https://bugs.openjdk.org/browse/JDK-8321370 

For reference, JDK 21.0.2 is scheduled to release on 2024-01-16.
 
-Chris.

[1] https://github.com/elastic/elasticsearch/issues/103004



[ANNOUNCE] Apache Lucene 9.9.0 released

2023-12-04 Thread Chris Hegarty
The Lucene PMC is pleased to announce the release of Apache Lucene 9.9.0.

Apache Lucene is a high-performance, full-featured search engine library 
written entirely in Java. It is a technology suitable for nearly any 
application that requires structured search, full-text search, faceting, 
nearest-neighbor search across high-dimensionality vectors, spell correction or 
query suggestions.

This release contains numerous bug fixes, optimizations, and improvements, some 
of which are highlighted below. The release is available for immediate download 
at:

https://lucene.apache.org/core/downloads.html

Lucene 9.9.0 Release Highlights:

New Features
• Add int8 scalar quantization to the HNSW vector format. This optionally 
allows for more compact lossy storage for the vectors, requiring approximately 
4x less memory for fast HNSW search.
• HNSW graph now can be merged with multiple threads, leveraging the same 
infrastructure that inter-segment concurrency utilizes.
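
To illustrate where the roughly 4x figure comes from: a float32 component takes 4 bytes and an int8 component takes 1. The Python sketch below uses a generic min/max linear quantizer to show the idea. It is not the code in Lucene, and the function names and bounds are assumptions made for the example.

```python
import array


def quantize_int8(values, lo, hi):
    """Linearly map floats in [lo, hi] onto signed bytes in -128..127."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    scale = 255.0 / (hi - lo)
    codes = array.array("b")
    for value in values:
        clamped = min(max(value, lo), hi)  # out-of-range values saturate
        codes.append(int(round((clamped - lo) * scale)) - 128)
    return codes


def dequantize_int8(codes, lo, hi):
    """Approximate inverse of quantize_int8; the rounding error is lost."""
    step = (hi - lo) / 255.0
    return [(code + 128) * step + lo for code in codes]
```

Each component is recovered to within one quantization step, (hi - lo) / 255, which is the "lossy" part of the trade.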

Improvements
• Speed up Panama vector support, use FMA, and test improvements. 
• FSTCompiler can now approximately limit how much RAM it uses to share 
suffixes during FST construction using the suffixRAMLimitMB method. 

Optimizations
• Faster top-level conjunctions on term queries when sorting by descending 
score.
• Change Postings back to using FOR in Lucene99PostingsFormat. Freqs, 
positions and offset keep using PFOR.

... plus a multitude of helpful bug fixes!

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/9_9_0/changes/Changes.html

-Chris.



[RESULT] [VOTE] Release Lucene 9.9.0 RC2

2023-12-04 Thread Chris Hegarty
Hi,

It's been >72h since the vote was initiated and the result is:

+1  12  (9 binding)
 0  0
-1  0

This vote has PASSED 

-Chris.

Re: [JENKINS] Lucene-9.x-Linux (64bit/hotspot/jdk-11.0.21) - Build # 14204 - Unstable!

2023-12-02 Thread Chris Hegarty
Sorry, PR link: https://github.com/apache/lucene/pull/12865

-Chris

On Saturday, December 2, 2023, Uwe Schindler  wrote:

> Hi Chris,
>
> I can't find the PR.
>
>
>
> I am interested, because I wrote the original ParallelReader tests.
>
> IMHO the parallel readers are so sensitive to random changes, the test
> setup should not use any indexwriter randomization at all.
>
> ParallelReader is also seldomly used, maybe we should remove support at
> some point. I don't know anybody using it, because it is very complicated
> to maintain consistent indexes. It only works with stable merge policies.
>
> Uwe
>

>
> Am 2. Dezember 2023 09:34:46 MEZ schrieb Chris Hegarty
> :
>
>> Hi,
>>
>> I noticed this failure locally, and opened a PR for it yesterday. It is a
>> test issue, and indeed related to the recent merge policy test
>> randomization change.
>>
>> -Chris
>>
>> On Saturday, December 2, 2023, Patrick Zhai  wrote:
>>
>>> Seems it's because this MockRandomMergePolicy change
>>> <https://github.com/apache/lucene/blob/main/lucene/test-framework/src/java/org/apache/lucene/tests/index/MockRandomMergePolicy.java#L242>
>>>  recently
>>> makes ParallelLeafReader unhappy - it's reading two parallel segments from
>>> 2 dir and this MP makes one of the segments' documents order reversed.
>>>
>>> But should be just test util issue and not affecting release.
>>>
>>> Adrien do you want to take a look? I'm not sure what's the best way to
>>> fix it, adding an index sort for that test seems a bit overkill?
>>>
>>> Patrick
>>>
>>> On Fri, Dec 1, 2023 at 2:06 PM Michael McCandless <
>>> luc...@mikemccandless.com> wrote:
>>>
>>>> Hmm this reproduces for me, and looks new/unique.  Could it be related
>>>> to recent 9.9.0 changes / release blocker?
>>>>
>>>> Mike
>>>>
>>>> On Fri, Dec 1, 2023 at 3:33 PM Policeman Jenkins Server <
>>>> jenk...@thetaphi.de> wrote:
>>>>
>>>>> Build: https://jenkins.thetaphi.de/job/Lucene-9.x-Linux/14204/
>>>>> Java: 64bit/hotspot/jdk-11.0.21 -XX:+UseCompressedOops
>>>>> -XX:+UseParallelGC
>>>>>
>>>>> 1 tests failed.
>>>>> FAILED:  org.apache.lucene.index.TestParallelLeafReader.testQueries
>>>>>
>>>>> Error Message:
>>>>> org.junit.ComparisonFailure: expected: but was:
>>>>>
>>>>> Stack Trace:
>>>>> org.junit.ComparisonFailure: expected: but was:
>>>>> at __randomizedtesting.SeedInfo.seed([6CA57EA3FB50CA0D:302BB278E1397FA3]:0)
>>>>> at org.junit.Assert.assertEquals(Assert.java:117)
>>>>> at org.junit.Assert.assertEquals(Assert.java:146)
>>>>> at org.apache.lucene.index.TestParallelLeafReader.queryTest(TestParallelLeafReader.java:263)
>>>>> at org.apache.lucene.index.TestParallelLeafReader.testQueries(TestParallelLeafReader.java:48)
>>>>> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>>> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>> at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>>>>> at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
>>>>> at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
>>>>> at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
>>>>> at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
>>>>> at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
>>>>> at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>>>>> at org.apache.lucene.

Re: [JENKINS] Lucene-9.x-Linux (64bit/hotspot/jdk-11.0.21) - Build # 14204 - Unstable!

2023-12-02 Thread Chris Hegarty
Hi,

I noticed this failure locally, and opened a PR for it yesterday. It is a
test issue, and indeed related to the recent merge policy test
randomization change.
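
To make the failure mode concrete, here is a minimal sketch of why reversing document order in only one of two parallel segments breaks the doc-ID alignment that a parallel reader relies on. It uses plain Java lists rather than Lucene classes; the class and method names are hypothetical and are not taken from the test or from the PR.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical model: each list stands in for one segment, indexed by doc ID. */
final class ParallelAlignmentSketch {

  /** Joins entry i of both segments into one logical document, keyed purely by position. */
  static List<String> zip(List<String> ids, List<String> bodies) {
    if (ids.size() != bodies.size()) {
      throw new IllegalArgumentException("parallel segments must have the same doc count");
    }
    List<String> docs = new ArrayList<>();
    for (int docId = 0; docId < ids.size(); docId++) {
      docs.add(ids.get(docId) + ":" + bodies.get(docId));
    }
    return docs;
  }

  /** Simulates a merge that rewrites a segment with its documents in reverse order. */
  static List<String> reversed(List<String> segment) {
    List<String> copy = new ArrayList<>(segment);
    Collections.reverse(copy);
    return copy;
  }

  public static void main(String[] args) {
    List<String> ids = List.of("id0", "id1", "id2");
    List<String> bodies = List.of("a", "b", "c");
    System.out.println("aligned:    " + zip(ids, bodies));
    System.out.println("misaligned: " + zip(ids, reversed(bodies)));
  }
}
```

In this model, as soon as only one side is reordered, a query that matches on one field returns the other field's value from a different document, which is consistent with a `ComparisonFailure` in a query test.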

-Chris

On Saturday, December 2, 2023, Patrick Zhai  wrote:

> Seems it's because this MockRandomMergePolicy change
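> <https://github.com/apache/lucene/blob/main/lucene/test-framework/src/java/org/apache/lucene/tests/index/MockRandomMergePolicy.java#L242>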
>  recently
> makes ParallelLeafReader unhappy - it's reading two parallel segments from
> 2 dir and this MP makes one of the segments' documents order reversed.
>
> But should be just test util issue and not affecting release.
>
> Adrien do you want to take a look? I'm not sure what's the best way to fix
> it, adding an index sort for that test seems a bit overkill?
>
> Patrick
>
> On Fri, Dec 1, 2023 at 2:06 PM Michael McCandless <
> luc...@mikemccandless.com> wrote:
>
>> Hmm this reproduces for me, and looks new/unique.  Could it be related to
>> recent 9.9.0 changes / release blocker?
>>
>> Mike
>>
>> On Fri, Dec 1, 2023 at 3:33 PM Policeman Jenkins Server <
>> jenk...@thetaphi.de> wrote:
>>
>>> Build: https://jenkins.thetaphi.de/job/Lucene-9.x-Linux/14204/
>>> Java: 64bit/hotspot/jdk-11.0.21 -XX:+UseCompressedOops -XX:+UseParallelGC
>>>
>>> 1 tests failed.
>>> FAILED:  org.apache.lucene.index.TestParallelLeafReader.testQueries
>>>
>>> Error Message:
>>> org.junit.ComparisonFailure: expected: but was:
>>>
>>> Stack Trace:
>>> org.junit.ComparisonFailure: expected: but was:
>>> at __randomizedtesting.SeedInfo.seed([6CA57EA3FB50CA0D:302BB278E1397FA3]:0)
>>> at org.junit.Assert.assertEquals(Assert.java:117)
>>> at org.junit.Assert.assertEquals(Assert.java:146)
>>> at org.apache.lucene.index.TestParallelLeafReader.queryTest(TestParallelLeafReader.java:263)
>>> at org.apache.lucene.index.TestParallelLeafReader.testQueries(TestParallelLeafReader.java:48)
>>> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>>> at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
>>> at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
>>> at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
>>> at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
>>> at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
>>> at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>>> at org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
>>> at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
>>> at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
>>> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>>> at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>>> at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
>>> at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
>>> at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
>>> at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
>>> at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
>>> at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
>>> at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
>>> at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>>> at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>>> at org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
>>> at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
>>> at com.carrotsearch.randomizedtesting.rules.

Re: [VOTE] Release Lucene 9.9.0 RC1

2023-11-30 Thread Chris Hegarty
For clarity, consider this vote cancelled. A new vote has been started on an 
RC2 build.

> On 30 Nov 2023, at 16:22, Greg Miller  wrote:
> 
> If we're spinning a new RC, I'd like to ask this group if it would make sense 
> to pull this very small method deprecation in: 
> https://github.com/apache/lucene/pull/12854
> 
> If there's a chance we don't release a 9.10 and go directly to 10.0, this 
> would be our last opportunity to mark it deprecated on a 9.x version so we 
> can actually remove it in 10.0. It's really minor though, so I don't want to 
> create churn, but if we can get it into 9.9 without much issue, it would be 
> nice. If folks agree, I can get it merged onto 9.9.

Thanks for raising the issue. I don’t have a strong opinion on whether or not 
to do the deprecation in this release, and since you say that it is minor, then 
I don’t see that it necessitates another respin.

Since I had already started an RC2 build, I just continued with it (also 
because the above PR is not yet reviewed). If others feel that the 
deprecation should absolutely be in, then we can do an RC3.

-Chris. 

> Cheers,
> -Greg
> 
> On Thu, Nov 30, 2023 at 7:58 AM Michael Sokolov <msoko...@gmail.com> wrote:
>> for the sake of posterity, I did get a successful smoketest:
>> 
>> SUCCESS! [1:00:06.512261]
>> 
>> but +0 to release I guess since it's moot...
>> 
>> On Thu, Nov 30, 2023 at 10:38 AM Michael McCandless 
>> <luc...@mikemccandless.com> wrote:
>>> On Thu, Nov 30, 2023 at 9:56 AM Chris Hegarty 
>>>  wrote:
>>> 
>>>> P.S. I’m less sure about this, but the RC 2 starts a 72hr voting time 
>>>> again? (Just so I know what TTL to put on that)
>>> 
>>> Yeah a new 72 hour clock starts with each new RC :)
>>> 
>>> Mike McCandless
>>> 
>>> http://blog.mikemccandless.com <http://blog.mikemccandless.com/>


[VOTE] Release Lucene 9.9.0 RC2

2023-11-30 Thread Chris Hegarty
Please vote for release candidate 2 for Lucene 9.9.0

The artifacts can be downloaded from:
https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.0-RC2-rev-06070c0dceba07f0d33104192d9ac98ca16fc500

You can run the smoke tester directly with this command:

python3 -u dev-tools/scripts/smokeTestRelease.py \
https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.0-RC2-rev-06070c0dceba07f0d33104192d9ac98ca16fc500

The vote will be open for at least 72 hours, and given the weekend in between, 
let’s keep it open until 2023-12-04 12:00 UTC.
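
For anyone double-checking the arithmetic, here is a small sketch of the 72-hour rule. The vote start time used in `main` is an assumption for illustration only (the actual send time of this mail is not stated here), and the class name is hypothetical.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper for checking a vote deadline against the 72-hour minimum. */
final class VoteWindowSketch {

  /** Minimum time a release vote stays open, as stated in the vote mail. */
  static final Duration MINIMUM_WINDOW = Duration.ofHours(72);

  /** Earliest instant at which a vote that started at voteStart may close. */
  static Instant earliestClose(Instant voteStart) {
    return voteStart.plus(MINIMUM_WINDOW);
  }

  /** True if the announced close time leaves at least the minimum window. */
  static boolean satisfiesMinimum(Instant voteStart, Instant announcedClose) {
    return !announcedClose.isBefore(earliestClose(voteStart));
  }

  public static void main(String[] args) {
    // Assumed start time, for illustration only.
    Instant assumedStart = Instant.parse("2023-11-30T17:00:00Z");
    Instant announcedClose = Instant.parse("2023-12-04T12:00:00Z");
    System.out.println("earliest close: " + earliestClose(assumedStart));
    System.out.println("announced close ok: " + satisfiesMinimum(assumedStart, announcedClose));
  }
}
```

Since each new RC restarts the clock, the start instant passed in would be that of the RC currently being voted on.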

[ ] +1  approve
[ ] +0  no opinion
[ ] -1  disapprove (and reason why)

Here is my +1

-Chris.



Re: [VOTE] Release Lucene 9.9.0 RC1

2023-11-30 Thread Chris Hegarty
Adrien,

> On 30 Nov 2023, at 14:51, Adrien Grand  wrote:
> 
> Yet another bug due to ghost fields. :( Thanks for fixing! For reference, I 
> checked how postings work on SlowCompositeCodecReaderWrapper, since they are 
> prone to ghost fields as well, and they seem to be ok.

Thanks for checking this Adrien.

> I worry that it could actually occur in practice when enabling recursive 
> graph bisection, so I would prefer to respin.

Since the change has already been merged to branch_9_9 (thanks Mike), I’ll 
start an RC2 build right away, and post a notice when it is done.

-Chris.

P.S. I’m less sure about this, but the RC 2 starts a 72hr voting time again? 
(Just so I know what TTL to put on that)

> On Thu, Nov 30, 2023 at 6:01 AM Luca Cavanna  wrote:
>> SUCCESS! [0:33:10.432870]
>> 
>> +1
>> 
>> On Thu, Nov 30, 2023 at 2:59 PM Chris Hegarty 
>>  wrote:
>>> Hi Mike,
>>> 
>>>> On 30 Nov 2023, at 11:41, Michael McCandless <luc...@mikemccandless.com> wrote:
>>>> 
>>>> +1 to release.
>>>> 
>>>> I hit a corner-case test failure and opened a PR to fix it: 
>>>> https://github.com/apache/lucene/pull/12859
>>> 
>>> Good find!  It looks like the fix for this issue is well in hand - great.
>>> 
>>>> I don't think this should block the release? -- it looks exotic.
>>> 
>>> I’m not sure how likely this bug is to show in real (non-test) scenarios, 
>>> but it does look kinda “exotic” to me too. So unless there are counter 
>>> arguments, I do not see it as critical, and therefore not needing a respin.
>>> 
>>> -Chris.
>>> 
>>>> 
>>>> Thanks Chris!
>>>> 
>>>> Mike McCandless
>>>> 
>>>> http://blog.mikemccandless.com <http://blog.mikemccandless.com/>
>>>> 
>>>> On Thu, Nov 30, 2023 at 1:16 AM Patrick Zhai <zhai7...@gmail.com> wrote:
>>>>> SUCCESS! [1:03:54.880200]
>>>>> 
>>>>> +1. Thank you Chris!
>>>>> 
>>>>> On Wed, Nov 29, 2023 at 8:45 PM Nhat Nguyen 
>>>>>  wrote:
>>>>>> SUCCESS! [1:11:30.037919]
>>>>>> 
>>>>>> +1. Thanks, Chris!
>>>>>> 
>>>>>> On Wed, Nov 29, 2023 at 8:53 AM Chris Hegarty 
>>>>>>  wrote:
>>>>>>> Hi,
>>>>>>> 
>>>>>>> Please vote for release candidate 1 for Lucene 9.9.0
>>>>>>> 
>>>>>>> The artifacts can be downloaded from:
>>>>>>> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.0-RC1-rev-92a5e5b02e0e083126c4122f2b7a02426c21a037
>>>>>>> 
>>>>>>> You can run the smoke tester directly with this command:
>>>>>>> 
>>>>>>> python3 -u dev-tools/scripts/smokeTestRelease.py \
>>>>>>> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.0-RC1-rev-92a5e5b02e0e083126c4122f2b7a02426c21a037
>>>>>>> 
>>>>>>> The vote will be open for at least 72 hours, and given the weekend in 
>>>>>>> between, let’s keep it open until 2023-12-04 12:00 UTC.
>>>>>>> 
>>>>>>> [ ] +1  approve
>>>>>>> [ ] +0  no opinion
>>>>>>> [ ] -1  disapprove (and reason why)
>>>>>>> 
>>>>>>> Here is my +1
>>>>>>> 
>>>>>>> Draft release highlights can be viewed here (comments and feedback 
>>>>>>> welcome):
>>>>>>> https://cwiki.apache.org/confluence/display/LUCENE/ReleaseNote9_9_0
>>>>>>> 
>>>>>>> -Chris.
>>> 
> 
> 
> -- 
> Adrien



Re: [VOTE] Release Lucene 9.9.0 RC1

2023-11-30 Thread Chris Hegarty
Hi Mike,

> On 30 Nov 2023, at 11:41, Michael McCandless  
> wrote:
> 
> +1 to release.
> 
> I hit a corner-case test failure and opened a PR to fix it: 
> https://github.com/apache/lucene/pull/12859

Good find!  It looks like the fix for this issue is well in hand - great.

> I don't think this should block the release? -- it looks exotic.

I’m not sure how likely this bug is to show in real (non-test) scenarios, but 
it does look kinda “exotic” to me too. So unless there are counter arguments, I 
do not see it as critical, and therefore not needing a respin.

-Chris.

> 
> Thanks Chris!
> 
> Mike McCandless
> 
> http://blog.mikemccandless.com <http://blog.mikemccandless.com/>
> 
> On Thu, Nov 30, 2023 at 1:16 AM Patrick Zhai <zhai7...@gmail.com> wrote:
>> SUCCESS! [1:03:54.880200]
>> 
>> +1. Thank you Chris!
>> 
>> On Wed, Nov 29, 2023 at 8:45 PM Nhat Nguyen  
>> wrote:
>>> SUCCESS! [1:11:30.037919]
>>> 
>>> +1. Thanks, Chris!
>>> 
>>> On Wed, Nov 29, 2023 at 8:53 AM Chris Hegarty 
>>>  wrote:
>>>> Hi,
>>>> 
>>>> Please vote for release candidate 1 for Lucene 9.9.0
>>>> 
>>>> The artifacts can be downloaded from:
>>>> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.0-RC1-rev-92a5e5b02e0e083126c4122f2b7a02426c21a037
>>>> 
>>>> You can run the smoke tester directly with this command:
>>>> 
>>>> python3 -u dev-tools/scripts/smokeTestRelease.py \
>>>> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.0-RC1-rev-92a5e5b02e0e083126c4122f2b7a02426c21a037
>>>> 
>>>> The vote will be open for at least 72 hours, and given the weekend in 
>>>> between, let’s keep it open until 2023-12-04 12:00 UTC.
>>>> 
>>>> [ ] +1  approve
>>>> [ ] +0  no opinion
>>>> [ ] -1  disapprove (and reason why)
>>>> 
>>>> Here is my +1
>>>> 
>>>> Draft release highlights can be viewed here (comments and feedback 
>>>> welcome):
>>>> https://cwiki.apache.org/confluence/display/LUCENE/ReleaseNote9_9_0
>>>> 
>>>> -Chris.



[VOTE] Release Lucene 9.9.0 RC1

2023-11-29 Thread Chris Hegarty
Hi,

Please vote for release candidate 1 for Lucene 9.9.0

The artifacts can be downloaded from:
https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.0-RC1-rev-92a5e5b02e0e083126c4122f2b7a02426c21a037

You can run the smoke tester directly with this command:

python3 -u dev-tools/scripts/smokeTestRelease.py \
https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.0-RC1-rev-92a5e5b02e0e083126c4122f2b7a02426c21a037

The vote will be open for at least 72 hours, and given the weekend in between, 
let’s keep it open until 2023-12-04 12:00 UTC.

[ ] +1  approve
[ ] +0  no opinion
[ ] -1  disapprove (and reason why)

Here is my +1

Draft release highlights can be viewed here (comments and feedback welcome):
https://cwiki.apache.org/confluence/display/LUCENE/ReleaseNote9_9_0

-Chris.


New branch and feature freeze for Lucene 9.9.0

2023-11-29 Thread Chris Hegarty

Hi Lucene Devs, 

Branch branch_9_9 has been cut and versions updated to 9.10 on stable branch.

Please observe the normal rules:

* No new features may be committed to the branch.
* Documentation patches, build patches and serious bug fixes may be
  committed to the branch. However, you should submit all patches you
  want to commit as pull requests first to give others the chance to review
  and possibly vote against them. Keep in mind that it is our
  main intention to keep the branch as stable as possible.
* All patches that are intended for the branch should first be committed
  to the unstable branch, merged into the stable branch, and then into
  the current release branch.
* Normal unstable and stable branch development may continue as usual.
  However, if you plan to commit a big change to the unstable branch
  while the branch feature freeze is in effect, think twice: can't the
  addition wait a couple more days? Merges of bug fixes into the branch
  may become more difficult.
* Only Github issues with Milestone 9.9
  and priority "Blocker" will delay a release candidate build.

Thanks,
-Chris.



Re: Lucene 9.9.0 Release

2023-11-29 Thread Chris Hegarty
Hi Feng,

> On 29 Nov 2023, at 11:25, Guo Feng  wrote:
> 
> Hi Chris,
> 
> Nightly benchmark shows that #12699 gets back some speed. I've backported it to 
> 9.9.0. I think it is ready now.

Awesome!

> Sorry for delaying the release!

No apology needed. I appreciate your speedy work here.

I’ll start the branch cut process soon, and post an update when it is done.

-Chris. 

> Feng
> 
> On 2023/11/28 08:32:59 Chris Hegarty wrote:
>> Hi Guo,
>> 
>> Thanks for the update.
>> 
>> Let’s push the 9.9.0 branch cut until tomorrow (rather than today as 
>> previously suggested), which should allow time to determine the outstanding 
>> issues you mentioned below. That should be more straightforward all round.
>> 
>> New 9.9.0 branch cut 12:00 29th Nov 2023 UTC.
>> 
>> We have flexibility here, and I hope that this helps.
>> 
>> -Chris.
>> 
>>> On 28 Nov 2023, at 05:31, Guo Feng  wrote:
>>> 
>>> +1, thanks for volunteering Chris!
>>> 
>>> #12699 is merged to main. I plan to backport it to 9.9 if it fixes the 
>>> performance drop, otherwise  revert #12699 and #12631 (the PR introduced 
>>> regression) and push them to the next version.
>>> 
>>> On 2023/11/21 09:51:43 Chris Hegarty wrote:
>>>> Hi,
>>>> 
>>>> It's been a while since the 9.8.0 release and we’ve accumulated quite a 
>>>> few changes. I’d like to propose that we release 9.9.0.
>>>> 
>>>> If there's no objections, I volunteer to be the release manager and will 
>>>> cut the feature branch a week from now, 12:00 28th Nov UTC.
>>>> 
>>>> -Chris.
>>>> -
>>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>> 
>>>> 



Re: Lucene 9.9.0 Release

2023-11-28 Thread Chris Hegarty
Hi Guo,

Thanks for the update.

Let’s push the 9.9.0 branch cut until tomorrow (rather than today as previously 
suggested), which should allow time to determine the outstanding issues you 
mentioned below. That should be more straightforward all round.

New 9.9.0 branch cut 12:00 29th Nov 2023 UTC.

We have flexibility here, and I hope that this helps.

-Chris.

> On 28 Nov 2023, at 05:31, Guo Feng  wrote:
> 
> +1, thanks for volunteering Chris!
> 
> #12699 is merged to main. I plan to backport it to 9.9 if it fixes the 
> performance drop, otherwise  revert #12699 and #12631 (the PR introduced 
> regression) and push them to the next version.
> 
> On 2023/11/21 09:51:43 Chris Hegarty wrote:
>> Hi,
>> 
>> It's been a while since the 9.8.0 release and we’ve accumulated quite a few 
>> changes. I’d like to propose that we release 9.9.0.
>> 
>> If there's no objections, I volunteer to be the release manager and will cut 
>> the feature branch a week from now, 12:00 28th Nov UTC.
>> 
>> -Chris.



Re: Lucene 9.9.0 Release

2023-11-27 Thread Chris Hegarty
Hi Adrien,

Comments inline.

> On 21 Nov 2023, at 12:31, Adrien Grand  wrote:
> 
> +1 9.9 has plenty of great changes indeed! Thanks for volunteering as a RM, 
> Chris.
> 
> It would be good to try and fix the PKLookup regression that was introduced 
> since 9.8: http://people.apache.org/~mikemccand/lucenebench/PKLookup.html. Is 
> it just about getting #12699 <https://github.com/apache/lucene/pull/12699> 
> merged?

I see that this is not yet merged. It looks like it is waiting final review.

> 
> Separately, I have a PR that does a small change to the file format of 
> postings and skip lists. It's certainly not a blocker for 9.9, but it would 
> be convenient to get it into 9.9 since we already changed file formats for 
> the switch from PFOR to FOR. Does someone have time to take a look? #12810 
> <https://github.com/apache/lucene/pull/12810>
I see that this has been merged, and later reverted because of some test 
instability. The new issue tracking this work is #12810 
<https://github.com/apache/lucene/pull/12810> [1]. Are we still expecting this 
to be resolved in 9.9.0 ?

-Chris.

[1] https://github.com/apache/lucene/pull/12810

> 
> On Tue, Nov 21, 2023 at 11:16 AM Michael McCandless 
> <luc...@mikemccandless.com> wrote:
>> +1
>> 
>> Thank you for volunteering as RM, Chris!
>> 
>> Mike McCandless
>> 
>> http://blog.mikemccandless.com <http://blog.mikemccandless.com/>
>> 
>> On Tue, Nov 21, 2023 at 4:52 AM Chris Hegarty 
>>  wrote:
>>> Hi,
>>> 
>>> It's been a while since the 9.8.0 release and we’ve accumulated quite a few 
>>> changes. I’d like to propose that we release 9.9.0.
>>> 
>>> If there's no objections, I volunteer to be the release manager and will 
>>> cut the feature branch a week from now, 12:00 28th Nov UTC.
>>> 
>>> -Chris.
>>> 
> 
> 
> -- 
> Adrien



Lucene 9.9.0 Release

2023-11-21 Thread Chris Hegarty
Hi,

It's been a while since the 9.8.0 release and we’ve accumulated quite a few 
changes. I’d like to propose that we release 9.9.0.

If there's no objections, I volunteer to be the release manager and will cut 
the feature branch a week from now, 12:00 28th Nov UTC.

-Chris.



Re: Welcome Patrick Zhai to the Lucene PMC

2023-11-13 Thread Chris Hegarty
Congratulations! Welcome Patrick.

-Chris


> On 13 Nov 2023, at 10:25, Luca Cavanna  wrote:
> 
> Congrats Patrick!
> 
> On Sun, Nov 12, 2023 at 7:14 PM Patrick Zhai wrote:
>> Thank you everyone! 
>> 
>> On Sun, Nov 12, 2023, 09:34 Dawid Weiss wrote:
>>> 
>>> 
>>> Congratulations and welcome, Patrick!
>>> 
>>> Dawid
>>> 
>>> On Fri, Nov 10, 2023 at 9:05 PM Michael McCandless 
>>> <luc...@mikemccandless.com> wrote:
>>>> I'm happy to announce that Patrick Zhai has accepted an invitation to join 
>>>> the Lucene Project Management Committee (PMC)!
>>>> 
>>>> Congratulations Patrick, thank you for all your hard work improving 
>>>> Lucene's community and source code, and welcome aboard!
>>>> 
>>>> Mike McCandless
>>>> 
>>>> http://blog.mikemccandless.com 


Re: Bump minimum Java version requirement to 21

2023-11-06 Thread Chris Hegarty
Hi Robert,

> On 6 Nov 2023, at 12:24, Robert Muir  wrote:
> 
>> …
>> The only concern I have with no.2 is that it could be considered an 
>> “aggressive” adoption of Java 21 - adoption sooner than the ecosystem can 
>> handle, e.g. are environments in which Lucene is deployed, and their 
>> transitive dependencies, ready to run on Java 21? By the time we’re ready to 
>> release 10.0.0, say March 2024, then I expect no issue with this.
> 
> The problem is worse, historically jdk version X isn't adopted as a
> minimum until it is already EOL. And the lucene major versions take an
> eternity to get out there, code just sits in "main" branch for years
> unreleased to nobody. It is really discouraging as a contributor to
> contribute code that literally sits on the shelf for years, for no
> good reason at all.

Agreed. I also feel discouraged by this approach too, and also wanna
avoid the “backport the world”, since it’s counterproductive.

> So why delay?
> 
> The argument of "moving sooner than ecosystem can handle" is also
> bogus in the same way. You mean versus the code sitting on the shelf
> and being released to nobody?

Yes - sitting on the shelf is no good to anyone.

Ok, what I’m hearing are good arguments for releasing 10.0.0 *now*, with
a Java 17 minimum - this is what is in _main_ today.

If we do that, then we can follow up with _main_ later (after the 10.x
branch is created). That is, 1) bump _main_ to Java 21, and 2) decide
when a Lucene 11 is to be released (I would like to see Lucene 11 ~1yr after
Lucene 10).

This is Uwe’s proposal, earlier in this thread.

-Chris.






Re: Bump minimum Java version requirement to 21

2023-11-06 Thread Chris Hegarty
 possibly 
> a MR JAR also with Java 21:
> 
> In Java 21, panama-foreign is still preview. So when compiling we need the 
> APIJAR.
> In the MR-JAR compilation we patch the APIJAR into the java.base module 
> (which we also need for incubating). The problem is: you cannot patch the 
> "java.base" module and at the same time pass "--release 21". So in that code part 
> we need to compile against the actual class library (I have no idea why patching 
> is disallowed with --release). It prints a cryptic error message that makes 
> no sense to me.
> Because of the inability to use "--release" we still need to compile the 
> Panama classes in a separate gradle sourceSet. But we can copy the separate 
> sourceSet output for 21 directly into the main JAR part (but we can also let 
> it live in versions/21).
> 
> This should not stop us from moving to 21; the details of how to build the 
> JAR/MR-JAR can be solved separately. Your PR looks fine, I would keep away 
> from the MR-JAR sourceSets for now. We can clean them up later.
> 
> Keeping parts of the MR-JAR logic as suggested before helps with backporting.
> 
> Uwe
> 
>>> Am 03.11.2023 um 13:20 schrieb Chris Hegarty:
>>>> Hi,
>>>> 
>>>> I would like to start the discussion and gather feedback on bumping the
>>>> minimum Java version requirement to 21.
>>>> 
>>>> I have no particular timeline in mind, but these kinda bumps often
>>>> require dependency updates [*], small code refactorings, etc, and can
>>>> take some time to plan and execute. It's best to at least have a plan
>>>> for when, rather than if!  Any bump would of course be limited to the
>>>> _main_ branch, and therefore targeting a major Lucene release (no
>>>> changes to branches targeting minor patch releases).
>>>> 
>>>> I'm sure subscribers to this list are already familiar with the various
>>>> goodies that have been added between Java 17 and 21, so I'll not
>>>> enumerate them here, but rather callout just two particular benefits
>>>> that I think are significant to the Lucene project.
>>>> 
>>>> 1) Put a lower bound on the number of memory segment mmap and Panama
>>>> Vector similarity implementations that we need to carry. This not only
>>>> reduces maintenance cost, but avoids additional consideration and
>>>> experimentation for performance improvements.
>>>> 
>>>> 2) Support for half float, Float::float16ToFloat and Float::floatToFloat16,
>>>> which will likely be beneficial in several places.
>>>> 
>>>> More concretely, and somewhat orthogonal to the discussion of when, I
>>>> would like to create a meta-issue capturing the prerequisites to a
>>>> version bump.
>>>> 
>>>> Your thoughts, comments, and feedback are very much welcome.
>>>> 
>>>> -Chris.
>>>> 
>>>> [*] we need at least an ECJ JDT dependency update, that supports
>>>> Java 21, https://www.eclipse.org/lists/eclipse-dev/msg12203.html
>>>> 
>>>> 
>>> -- 
>>> Uwe Schindler
>>> Achterdiek 19, D-28357 Bremen
>>> https://www.thetaphi.de <https://www.thetaphi.de/>
>>> eMail: u...@thetaphi.de <mailto:u...@thetaphi.de>
>>> 
>>> 
> -- 
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de <https://www.thetaphi.de/>
> eMail: u...@thetaphi.de <mailto:u...@thetaphi.de>


Re: Bump minimum Java version requirement to 21

2023-11-04 Thread Chris Hegarty
Hi Uwe,

Thanks for your reply, comments inline.

> On 3 Nov 2023, at 13:11, Uwe Schindler  wrote:
> 
> Hi,
> 
> I had another idea: Why not release main as 10.0.0 *NOW* and create 
> branch_10x (with Java 17) minimum, stop working on 9.x, and move main branch 
> to 21?

I see now that 9.x has a minimum Java version of 11, and that _main_ has a 
minimum version of 17. I previously overlooked this (I thought that 9.x was on 
17, but it is not). Ok, so your idea is actually quite in line with how things 
have happened in the past.

For ease of reference, here are the dates of the last 4 major releases. 
  9.0.0   Dec 2021
  8.0.0   Mar 2019
  7.0.0   Sep 2017
  6.0.0   Apr 2016 
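
As a quick check on the cadence, the gaps between those four dates can be computed as below. This is only a sketch over the month-level dates listed above; the class and method names are made up for illustration.

```java
import java.time.YearMonth;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: month gaps between consecutive major releases. */
final class ReleaseCadenceSketch {

  /** Returns the number of whole months between each consecutive pair, oldest first. */
  static List<Long> gapsInMonths(List<YearMonth> releasesOldestFirst) {
    List<Long> gaps = new ArrayList<>();
    for (int i = 1; i < releasesOldestFirst.size(); i++) {
      gaps.add(ChronoUnit.MONTHS.between(releasesOldestFirst.get(i - 1), releasesOldestFirst.get(i)));
    }
    return gaps;
  }

  /** Average of the gaps, in months; 0 if there are fewer than two releases. */
  static double averageGapInMonths(List<YearMonth> releasesOldestFirst) {
    return gapsInMonths(releasesOldestFirst).stream().mapToLong(Long::longValue).average().orElse(0.0);
  }

  public static void main(String[] args) {
    List<YearMonth> majors = List.of(
        YearMonth.of(2016, 4),   // 6.0.0
        YearMonth.of(2017, 9),   // 7.0.0
        YearMonth.of(2019, 3),   // 8.0.0
        YearMonth.of(2021, 12)); // 9.0.0
    System.out.println("gaps (months): " + gapsInMonths(majors));
    System.out.println("average (months): " + averageGapInMonths(majors));
  }
}
```

The gaps come out at 17, 18 and 33 months, so the interval has been lengthening, with the most recent one closer to three years than two.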

If we release 10.0.0 now (with a minimum of 17) that drops the need to support 
Java 11 (since work in 9.x will mostly stop). I’m ok with this, and we get the 
benefits of dropping < Java 17.  But can we be more ambitious in our approach 
here?

I’ll defer to others about what is in _main_ to justify a major release or not 
- the driver for a release should be more than just the minimum Java version.

Alternatively, what if we were to not release 10.0.0 for another while, say 3 - 
6 months, and at the same time bump it to Java 21. In the meantime we can keep 
the 9.x updates coming.  My motivation for suggesting this is that it appears 
that major Lucene versions seem to be around every 2 years or so, and if we 
release 10 with Java 17, then we’ll still be reluctant to use Java APIs and 
features between 17 and 21 for the next, likely, 2 years. An alternative to 
that is to release Lucene 11.0.0 sometime before the 2 year mark.
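
As one concrete example of an API in that 17-to-21 gap, the half-float conversions mentioned at the start of this thread (`Float.floatToFloat16` and `Float.float16ToFloat`, added in Java 20) can be exercised as below. This is only a sketch and needs a JDK of at least 20 to compile; the wrapper class is hypothetical and says nothing about how Lucene would use these methods.

```java
/** Hypothetical wrapper showing the half-float conversion round trip (requires Java 20+). */
final class HalfFloatSketch {

  /** Encodes a float as IEEE 754 binary16 bits and decodes it again. */
  static float roundTrip(float value) {
    short halfBits = Float.floatToFloat16(value);
    return Float.float16ToFloat(halfBits);
  }

  public static void main(String[] args) {
    // 1.0 is exactly representable in half precision, so it survives unchanged.
    System.out.println(roundTrip(1.0f));
    // 0.1 is not exactly representable, so the round trip loses some precision.
    System.out.println(roundTrip(0.1f));
  }
}
```

With a Java 17 baseline this would have to live behind a multi-release JAR or a hand-written conversion, which is the kind of cost the paragraph above is about.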

> I would be happy to remove the MmapByteBuffer directory in Java 18.

We can only do this when we move to a minimum Java > 17, so in your proposal 
that would be in _main_ some time post the fork for branch_10x. That seems ok.

> Unfortunately in Java 21 we still need a hack to compile the MemorySegment 
> classes because of the preview flag. And for the incubator we also need the 
> APIJAR files. But we can do this then without MR-JAR unless we need a new 
> version for Java 22, 23 of vectors. My idea would be to patch in the api JAR 
> during compile of "main" sourceset classes.

Yeah, regardless of the minimum version bump some work is needed here :-( Where 
possible we should try to minimise it, but I agree we’ll likely need updates 
for the vector stuff in 22+.
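
To illustrate what "updates for 22+" means in practice, here is a rough sketch of picking an implementation by the running JVM's feature version. It is an assumption-laden illustration only: the implementation labels and the class are hypothetical, and this is not how Lucene's MR-JAR or provider lookup is actually written.

```java
/** Hypothetical selector: maps a Java feature version to an implementation label. */
final class FeatureVersionSketch {

  /** Chooses a (made-up) implementation name for the given feature version. */
  static String chooseImplementation(int featureVersion) {
    if (featureVersion >= 22) {
      return "vector-22";        // hypothetical: code compiled against newer incubator/preview APIs
    }
    if (featureVersion == 21) {
      return "vector-21";        // hypothetical: code compiled against the Java 21 API shape
    }
    return "scalar-fallback";    // hypothetical: plain Java implementation for older runtimes
  }

  public static void main(String[] args) {
    // Runtime.version().feature() reports the major version of the running JVM.
    System.out.println(chooseImplementation(Runtime.version().feature()));
  }
}
```

Each additional supported version adds another branch (and another set of compiled classes), which is why raising the minimum version reduces what needs to be carried.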

-Chris.

> Uwe
> 
> Am 03.11.2023 um 13:20 schrieb Chris Hegarty:
>> Hi,
>> 
>> I would like to start the discussion and gather feedback on bumping the
>> minimum Java version requirement to 21.
>> 
>> I have no particular timeline in mind, but these kinda bumps often
>> require dependency updates [*], small code refactorings, etc, and can
>> take some time to plan and execute. It's best to at least have a plan
>> for when, rather than if!  Any bump would of course be limited to the
>> _main_ branch, and therefore targeting a major Lucene release (no
>> changes to branches targeting minor patch releases).
>> 
>> I'm sure subscribers to this list are already familiar with the various
>> goodies that have been added between Java 17 and 21, so I'll not
>> enumerate them here, but rather call out just two particular benefits
>> that I think are significant to the Lucene project.
>> 
>> 1) Put a lower bound on the number of memory segment mmap and Panama
>> Vector similarity implementations that we need to carry. This not only
>> reduces maintenance cost, but avoids additional consideration and
>> experimentation for performance improvements.
>> 
>> 2) Support for half float, Float::float16ToFloat and Float::floatToFloat16,
>> which will likely be beneficial in several places.
>> 
>> More concretely, and somewhat orthogonal to the discussion of when, I
>> would like to create a meta-issue capturing the prerequisites to a
>> version bump.
>> 
>> Your thoughts, comments, and feedback are very much welcome.
>> 
>> -Chris.
>> 
>> [*] we need at least an ECJ JDT dependency update, that supports
>> Java 21, https://www.eclipse.org/lists/eclipse-dev/msg12203.html
>> 
>> 
> -- 
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de
> eMail: u...@thetaphi.de
> 
> 
> 





Bump minimum Java version requirement to 21

2023-11-03 Thread Chris Hegarty
Hi,

I would like to start the discussion and gather feedback on bumping the
minimum Java version requirement to 21.

I have no particular timeline in mind, but these kinda bumps often
require dependency updates [*], small code refactorings, etc, and can
take some time to plan and execute. It's best to at least have a plan
for when, rather than if!  Any bump would of course be limited to the
_main_ branch, and therefore targeting a major Lucene release (no
changes to branches targeting minor patch releases).

I'm sure subscribers to this list are already familiar with the various
goodies that have been added between Java 17 and 21, so I'll not
enumerate them here, but rather call out just two particular benefits
that I think are significant to the Lucene project.

1) Put a lower bound on the number of memory segment mmap and Panama
Vector similarity implementations that we need to carry. This not only
reduces maintenance cost, but avoids additional consideration and
experimentation for performance improvements.

2) Support for half float, Float::float16ToFloat and Float::floatToFloat16,
which will likely be beneficial in several places.
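
To make point 2 concrete, here is a minimal sketch of the half-float round trip. It assumes a JDK where Float::floatToFloat16 and Float::float16ToFloat are available (they were added in Java 20); the class and method names are hypothetical and only for illustration, not anything in Lucene.

```java
final class HalfFloatDemo {

  /** Narrows a float to IEEE 754 binary16 (stored in a short) and widens it back. */
  static float roundTrip(float value) {
    short half = Float.floatToFloat16(value); // 16-bit encoding, rounds to nearest
    return Float.float16ToFloat(half);        // exact widening back to float
  }

  /** Encodes a float vector into 2 bytes per component. */
  static short[] encode(float[] values) {
    short[] halves = new short[values.length];
    for (int i = 0; i < values.length; i++) {
      halves[i] = Float.floatToFloat16(values[i]);
    }
    return halves;
  }

  /** Decodes a half-float vector back into floats. */
  static float[] decode(short[] halves) {
    float[] values = new float[halves.length];
    for (int i = 0; i < halves.length; i++) {
      values[i] = Float.float16ToFloat(halves[i]);
    }
    return values;
  }

  public static void main(String[] args) {
    // Values that are not exactly representable in binary16 come back slightly changed.
    System.out.println(roundTrip(0.1f));
    // Values beyond the binary16 range overflow to infinity.
    System.out.println(roundTrip(1_000_000f));
  }
}
```

The trade-off is the usual one: half the storage per component, in exchange for reduced precision and a much smaller range.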

More concretely, and somewhat orthogonal to the discussion of when, I
would like to create a meta-issue capturing the prerequisites to a
version bump.

Your thoughts, comments, and feedback are very much welcome.

-Chris.

[*] we need at least an ECJ JDT dependency update, that supports
Java 21, https://www.eclipse.org/lists/eclipse-dev/msg12203.html




Re: [VOTE] Release Lucene 9.8.0 RC1

2023-09-25 Thread Chris Hegarty
Hi,

>>   2>at Log4jHotPatch.asmVersion(Log4jHotPatch.java:71)

This is coming from Amazon’s Log4Shell hot patch [1], which I believe was deployed 
by default on many (all?) JVMs running on Amazon instances. Well… that was 
almost 2yrs ago, not sure why it’s still showing up in some places now - it 
should not be needed.

In fact, I do remember seeing and reporting this issue back in late 2021. The 
hot patcher initially used the JDK’s internal ASM library, which is the root 
cause of the security exception. The hot patcher was subsequently fixed to not 
do this - it bundles/shades ASM itself. This fix was made in late 2021.

I have no idea why the system in question is running an old version of the hot 
patcher. @Michael, you should probably take a look at that system, maybe it 
needs some updates or something?

-Chris.

[1] https://github.com/corretto/hotpatch-for-apache-log4j2/tree/main

> On 25 Sep 2023, at 09:22, Uwe Schindler  wrote:
> 
> Hi,
> 
> as Lucene does not use Log4j, it is unclear why it wants to patch anything. 
> The problem is indeed caused by SecurityManager which is enabled for running 
> Lucene tests. Actually it detects that something tries to access some 
> internals of ASM, not sure what it exactly does. The "injected" Agent code 
> must possibly use AccessController#doPrivileged and the security context must 
> allow patching of classes.
> 
> In short: SecurityManager has done everything it should do: It detected an 
> illegal access. Mission achieved! You have to report this issue and patch 
> your tool so it works correctly with SecurityManager.
> 
> Uwe
> 
> On 24.09.2023 at 23:52, Michael Sokolov wrote:
>> I ran the smoketester and had a failure. It seems related to some
>> log4j hot patch script we are required to run at work which is somehow
>> conflicting with the security manager? I'm killing that and trying
>> again, but I wonder if this is going to cause problems at runtime as
>> well? How do we enable the security manager -is it only when running
>> tests?
>> 
>> org.apache.lucene.codecs.simpletext.TestSimpleTextPostingsFormat >
>> classMethod FAILED
>> java.lang.AssertionError: The test or suite printed 15378 bytes to
>> stdout and stderr, even though the limit was set to 8192 bytes.
>> Increase the limit with @Limit, ignore it
>>  completely with @SuppressSysoutChecks or run with
>> -Dtests.verbose=true
>> at __randomizedtesting.SeedInfo.seed([3E554FE0FEE122B9]:0)
>> at 
>> org.apache.lucene.tests.util.TestRuleLimitSysouts.afterIfSuccessful(TestRuleLimitSysouts.java:283)
>> at 
>> com.carrotsearch.randomizedtesting.rules.TestRuleAdapter$1.afterIfSuccessful(TestRuleAdapter.java:36)
>> at 
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:37)
>> at 
>> org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
>> at 
>> org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>> at 
>> org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
>> at 
>> org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
>> at 
>> org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
>> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>> at 
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at 
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
>> at 
>> com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
>> at java.base/java.lang.Thread.run(Thread.java:829)
>> 
>> org.apache.lucene.codecs.simpletext.TestSimpleTextPostingsFormat >
>> test suite's output saved to
>> /tmp/smoke_lucene_9.8.0_d914b3722bd5b8ef31ccf7e8ddc638a87fd648db/unpack/lucene-9.8.0/lucene/codecs/build/test-results/test/outputs/OUTPUT-org.apache.lucene.codecs.simpletext.TestSimpleTextPostingsFormat.txt,
>> copied below:
>>   2> java.lang.reflect.InvocationTargetException
>>   2>at 
>> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>> Method)
>>   2>at 
>> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>   2>at 
>> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>   2>at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>>   2>at 
>> java.instrument/sun.instrument.InstrumentationImpl.loadClassAndStartAgent(InstrumentationImpl.java:513)
>>   2>at 
>> java.instrument/sun.instrument.InstrumentationImpl.loadClassAndCallAgentmain(InstrumentationImpl.java:535)
>>   2> Caused by: java.security.AccessControlException: access 

Re: [VOTE] Release Lucene 9.8.0 RC1

2023-09-22 Thread Chris Hegarty
Hi,

The Elasticsearch CI is green with the Lucene RC1 build [1].

+1 to release.

-Chris.

[1] https://github.com/elastic/elasticsearch/pull/99800


> On 22 Sep 2023, at 15:43, Adrien Grand  wrote:
> 
> +1 SUCCESS! [0:54:58.932481]
> 
> On Fri, Sep 22, 2023 at 4:18 PM Uwe Schindler  wrote:
>> 
>> Hi,
>> 
>> I verified the release with the usual tools and my workflow:
>> 
>> Policeman Jenkins ran smoketester for me with Java 11 and Java 17:
>> https://jenkins.thetaphi.de/job/Lucene-Release-Tester/28/console
>> 
>> SUCCESS! [1:10:15.704228]
>> 
>> In addition I checked the changes entries and ran Luke with Java 21 GA
>> (released two days ago). All fine!
>> 
>> +1 to release!
>> 
>> On 22.09.2023 at 07:48, Patrick Zhai wrote:
>>> Please vote for release candidate 1 for Lucene 9.8.0
>>> 
>>> The artifacts can be downloaded from:
>>> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.8.0-RC1-rev-d914b3722bd5b8ef31ccf7e8ddc638a87fd648db
>>> 
>>> You can run the smoke tester directly with this command:
>>> 
>>> python3 -u dev-tools/scripts/smokeTestRelease.py \
>>> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.8.0-RC1-rev-d914b3722bd5b8ef31ccf7e8ddc638a87fd648db
>>> 
>>> The vote will be open for at least 72 hours, as there's a weekend, the
>>> vote will last until 2023-09-27 06:00 UTC.
>>> 
>>> [ ] +1  approve
>>> [ ] +0  no opinion
>>> [ ] -1  disapprove (and reason why)
>>> 
>>> Here is my +1 (non-binding)
>> 
>> --
>> Uwe Schindler
>> Achterdiek 19, D-28357 Bremen
>> https://www.thetaphi.de
>> eMail: u...@thetaphi.de
>> 
>> 
>> 
> 
> 
> -- 
> Adrien
> 
> 



Re: Welcome Chris Hegarty to the Lucene PMC

2023-06-21 Thread Chris Hegarty
Thank you all for the warm welcome. Happy to be included in this very talented 
group of individuals :-)

-Chris.

> On 21 Jun 2023, at 09:31, Uwe Schindler  wrote:
> 
> Welcome Chris. 
> 
> Uwe
> 
> 
> On 19 June 2023 11:52:50 CEST, Adrien Grand wrote:
>> I'm pleased to announce that Chris Hegarty has accepted an invitation to 
>> join the Lucene PMC!
>> 
>> Congratulations Chris, and welcome aboard!
>> 
>> -- 
>> Adrien
> 
> --
> Uwe Schindler
> Achterdiek 19, 28357 Bremen
> https://www.thetaphi.de


Re: New branch and feature freeze for Lucene 9.7.0

2023-06-16 Thread Chris Hegarty
Hi Uwe,

> On 16 Jun 2023, at 13:52, Uwe Schindler  wrote:
> 
> Hi,
> 
> I also downloaded latest JDK 21 EA build (21-b27) and regenerated the apijar 
> files: No changes. So all fine! I did this because there were some late 
> changes to javadocs and API definition, but it all looks fine. Also the bug 
> with SecurityManager that hit Elasticsearch was also fixed (but we have a 
> workaround).
> 
> I will now also update Policeman Jenkins to latest EA build.
> 
Thanks for verifying the apijar, and adding the new EA build to the Jenkins.

It would be great if we could automate the apijar check, by periodically 
running a job that sucks down the latest EA build and runs the generation, 
comparing against the checked in version. But the way we do this already kinda 
minimises the risk - as you have said before (and I agree), it is very unlikely 
that the API will change during rampdown.
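
As a rough sketch of what the comparison step of such a job could look like: the helper below only checks whether a freshly regenerated file is byte-for-byte identical to the checked-in one. It assumes the generation is reproducible at the byte level (no embedded timestamps), which I have not verified here; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class ApiJarCheck {

  /** Returns true if both files have identical content (Files.mismatch needs Java 12+). */
  static boolean sameBytes(Path regenerated, Path checkedIn) {
    try {
      // Files.mismatch returns -1 when there is no differing byte.
      return Files.mismatch(regenerated, checkedIn) == -1L;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Demo helper only: writes the given bytes to a fresh temporary file. */
  static Path tempFileWith(byte[] content) {
    try {
      Path file = Files.createTempFile("apijar-check", ".bin");
      Files.write(file, content);
      return file;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Path regenerated = tempFileWith(new byte[] {1, 2, 3});
    Path checkedIn = tempFileWith(new byte[] {1, 2, 3});
    System.out.println(sameBytes(regenerated, checkedIn) ? "apijar unchanged" : "apijar differs");
  }
}
```

If the output turns out not to be byte-reproducible, comparing the extracted API signatures instead would be the more robust variant.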

-Chris.
> Uwe
> 
> On 16.06.2023 at 13:50, Adrien Grand wrote:
>> NOTICE:
>> 
>> Branch branch_9_7 has been cut and versions updated to 9.8 on stable branch.
>> 
>> Please observe the normal rules:
>> 
>> * No new features may be committed to the branch.
>> * Documentation patches, build patches and serious bug fixes may be
>>   committed to the branch. However, you should submit all patches you
>>   want to commit as pull requests first to give others the chance to review
>>   and possibly vote against them. Keep in mind that it is our
>>   main intention to keep the branch as stable as possible.
>> * All patches that are intended for the branch should first be committed
>>   to the unstable branch, merged into the stable branch, and then into
>>   the current release branch.
>> * Normal unstable and stable branch development may continue as usual.
>>   However, if you plan to commit a big change to the unstable branch
>>   while the branch feature freeze is in effect, think twice: can't the
>>   addition wait a couple more days? Merges of bug fixes into the branch
>>   may become more difficult.
>> * Only Github issues with Milestone 9.7
>>   and priority "Blocker" will delay a release candidate build.
>> 
>> -- 
>> Adrien
> -- 
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de 
> eMail: u...@thetaphi.de 


Re: Lucene 9.7 release

2023-06-09 Thread Chris Hegarty
Hi,

> On 9 Jun 2023, at 17:19, Uwe Schindler  wrote:
> 
> Hi,
> 
> if possible I would like to get the Java 21 changes (MemorySegments and 
> Vector) into the release. I'd like to ask Chris who has better knowledge how 
> to proceed. If he suggests to wait maybe a week or 2, I'd suggest to wait 
> that time.
> 
> Chris Hegarty: Do you know whether the API of JDK 21 is finalized or not? From my 
> understanding the final phases have started, so API changes are unlikely. If 
> there are bug fixes they won't affect public APIs or the incubator module, 
> right?
> 
Your understanding is correct. I do not expect any API changes at this point.
> The MMapDir changes are already tested all the time, vector API needs the 
> forward port to 21.
> 
We are also doing some early testing with JDK 21 EA, and it would be great to 
get the 21-version of Panama VectorUtils in. I can help get this done.

Uwe, what has been done so far? If nothing, and that is still the case tomorrow, 
I can start on it.

-Chris.

> Uwe
> 
> On 09.06.2023 at 18:07, Adrien Grand wrote:
>> Hello all,
>> 
>> There is some good stuff that is scheduled for 9.7 already, I found the 
>> following changes in the changelog that look especially interesting:
>>  - Concurrent query rewrites for vector queries.
>>  - Speedups to vector indexing/search via integration of the Panama vector 
>> API.
>>  - Reduced overhead of soft deletes.
>>  - Support for update by query.
>> 
>> I propose we start the process for a 9.7 release, and I volunteer to be the 
>> release manager. I suggest the following schedule:
>>  - Feature freeze on June 16th, one week from now. This is when the 9.7 
>> branch will be cut.
>>  - Open a vote on June 21st, which we'll possibly delay if blockers get 
>> identified.
>> 
>> -- 
>> Adrien
> -- 
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de 
> eMail: u...@thetaphi.de 


Re: Welcome Chris Hegarty as Lucene committer

2022-06-01 Thread Chris Hegarty
Hi,

I am both honoured and humbled to have been invited to become a committer. 
Thank you.

I've been working on the development of the Java Platform and the JDK for a 
little more than 20 years. First in the Javasoft group at Sun Microsystems, and 
later in the Java Platform Group at Oracle. After spending much of my working 
life as a "producer of Java", I'm now with Elastic and looking forward to 
seeing what it is like as a "user of Java". There is so much exciting and 
interesting work happening in this space, I hope to be able to make some 
positive contributions, even in a small way. 

-Chris.

> On 1 Jun 2022, at 08:04, Adrien Grand  wrote:
> 
> I'm pleased to announce that Chris Hegarty has accepted the PMC's
> invitation to become a committer.
> 
> Chris, the tradition is that new committers introduce themselves with a
> brief bio.
> 
> Congratulations and welcome!
> 
> -- 
> Adrien





Re: [Heads up] Test framework package rename

2021-12-20 Thread Chris Hegarty
Hi Dawid,

> On 20 Dec 2021, at 17:12, Dawid Weiss  wrote:
> 
> [snip]
> 
>> I think that the approach is reasonable (to inject ’test’ into the package 
>> name between the reverse DNS prefix and the specific logical technology 
>> suffix).
> 
> It's actually '.tests'... Robert (Muir) suggested .test as well but I
> must have added that extra 's' somewhere along the line and I didn't
> have the heart to redo all the refactorings...

Ah yes, '.tests' - which is also fine.

> [ snip ]
>> TestSecrets also seems reasonable. It’s a pity that it intrudes somewhat on 
>> the actual product code, but not to any extent that would be concerning.
> 
> This is quite funny - I actually reengineered the internal-package
> trick, although my version looked slightly different (it had public
> methods returning the "secret" accessors, although they couldn't be
> instantiated by anything else than the internal package - you can see
> it in the commit history... then I recalled that the JDK had a similar
> solution and peeked at how that was done. I thought it was nicer as it
> didn't pollute the API.

Right, if you need access to package-privates from an “API” package, 
then secrets are a fine tool for the job.

> So now you know whom I borrowed the idea from
> - you. :)
> 
> The only difference is the use of unsafe to make sure classes have
> their static blocks invoked. I didn't want to bring unsafe into the
> mix and used Class.forName(apiClazz.getName()). I don't think this can
> have any side-effects whatsoever and the contract on this variant of
> forName has the initialization flag on, so it seems to be safe -
> correct me if I'm wrong though.

You are correct (you’re safe). Lucene has bumped the minimum JDK
to 17, right? If so, you could use MethodHandles.Lookup::ensureInitialized.
Or this could be done separately, if there is interest. (hmm.. maybe you
don’t wanna go down this rabbit hole now!, lookup access could be an
issue )
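
For anyone following along, here is a minimal single-file sketch of the secrets idea and of the two ways of forcing initialization discussed above. Everything in it is a stand-in: Widget plays the role of an API class with hidden state, Secrets plays the role of the internal holder, and none of the names are Lucene's. In a real setup the two would live in different packages; here they are nested classes only to keep the example self-contained. Lookup::ensureInitialized needs Java 15 or later.

```java
import java.lang.invoke.MethodHandles;
import java.util.function.ToIntFunction;

final class SecretsDemo {

  /** Stand-in for an API class whose state is not publicly readable. */
  static final class Widget {
    private final int hidden;

    Widget(int hidden) {
      this.hidden = hidden;
    }

    static {
      // When the class is initialized, it hands an accessor to the secrets holder.
      Secrets.widgetAccess = w -> w.hidden;
    }
  }

  /** Stand-in for the internal holder that the test framework would consult. */
  static final class Secrets {
    static volatile ToIntFunction<Widget> widgetAccess;

    /** Forces initialization via Class.forName, whose one-arg form initializes the class. */
    static ToIntFunction<Widget> viaForName() {
      try {
        Class.forName(Widget.class.getName());
      } catch (ClassNotFoundException e) {
        throw new IllegalStateException(e);
      }
      return widgetAccess;
    }

    /** Forces initialization via MethodHandles.Lookup::ensureInitialized. */
    static ToIntFunction<Widget> viaLookup() {
      try {
        MethodHandles.lookup().ensureInitialized(Widget.class);
      } catch (IllegalAccessException e) {
        // Thrown when the lookup has no access to the class: the "lookup access" caveat.
        throw new IllegalStateException(e);
      }
      return widgetAccess;
    }
  }

  public static void main(String[] args) {
    ToIntFunction<Widget> access = Secrets.viaLookup();
    System.out.println(access.applyAsInt(new Widget(7)));
  }
}
```

Both variants are idempotent: the static block runs at most once, so calling either method repeatedly returns the same accessor.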

>> It’s great that you can now have a test_framework module. And, from what I 
>> can see, the moduleXXX configurations that you introduced recently work just 
>> fine for the Gradle dependency configuration.
> 
> I have to admit this works rather nicely. There are some rough corners
> we discovered already but I think they're all fixable with relative
> ease -
> https://issues.apache.org/jira/browse/LUCENE-10328
> 
> I'm sure in the end this could be extracted into a more reusable
> gradle plugin, probably replacing the built-in support for modules,
> but for the time being it's just easier to work with those separate
> gradle scripts.

Yeah, it seems so.

>> Once this is in, then it will be possible to patch area specific unit tests 
>> into the actual product module and `require` the framework, right?
> 
> Yes, I think this is doable. It's what Uwe has been asking for - his
> panama branch currently requires tricks that shouldn't be there.

Ok, great. I think we’re on the same track.

>> And if we had that, it’s not a big leap to maybe refactor some of the 
>> secrets to be injected too ( but I accept that that is not really necessary 
>> either, and I’m not sure how the IDEs or Gradle would like this )
> 
> What secrets do you have in mind here?

Nothing specific, just that if we’re injecting code into the module
(patching the module for unit testing purposes), then even the secrets
(currently in the product code), could be effectively injected too.
(The code that sets things up in the API packages). Almost as if each
module had its own mini-test-framework, which could be (but does not
strictly need to be) part of its unit test. 

> As for IDEs: IntelliJ should
> work fine as it converts each source set into a logical dependency
> unit. I don't know how Eclipse can be made to digest all this - for
> now, we create a big bag of sources without submodule demarcation. So
> it compiles, although is far from ideal.

I was mostly thinking of how the IDE would react to --patch-module, and
whether or not it could handle things like auto-complete, etc, from
consuming code whose view point just sees the patched module as one
complete unit. 

Anyway, that is a distraction from the good work here, and something for
another day.

-Chris.






Re: [Heads up] Test framework package rename

2021-12-20 Thread Chris Hegarty
Hi Dawid,

[ not a review ]

> On 20 Dec 2021, at 14:54, Dawid Weiss  wrote:
> 
> 
> Hello everyone,
> 
> I've completed the task of getting the test framework to not share any 
> packages with Lucene core - this is here:
> 
> https://issues.apache.org/jira/browse/LUCENE-10301 
> 
> https://github.com/apache/lucene/pull/551 
> 
> 
> Basically everything remains the same, except for the changed package prefix. 
> The patch is rather large because it cuts across all of the code (imports, 
> mostly). There are also some minor tweaks to expose package-private internals 
> in the core to the test framework, now residing in a different package.


I think that the approach is reasonable (to inject ’test’ into the package name 
between the reverse DNS prefix and the specific logical technology suffix). 
I’ve played a little with similar in the Elasticsearch codebase - but not yet 
advanced it to a point that eliminates all split packages.

TestSecrets also seems reasonable. It’s a pity that it intrudes somewhat on the 
actual product code, but not to any extent that would be concerning.

It’s great that you can now have a test_framework module. And, from what I can 
see, the moduleXXX configurations that you introduced recently work just fine 
for the Gradle dependency configuration. ( I wish that we were at a similar 
point in ES, but recent distractions have slowed progress :-( )

Once this is in, then it will be possible to patch area specific unit tests 
into the actual product module and `require` the framework, right? And if we 
had that, it’s not a big leap to maybe refactor some of the secrets to be 
injected too ( but I accept that that is not really necessary either, and I’m 
not sure how the IDEs or Gradle would like this )

-Chris. 

Re: HEADS UP: Java 9 Module System (JMS) module names for Lucene artifacts

2021-11-30 Thread Chris Hegarty
Hi Uwe,

> On 30 Nov 2021, at 13:01, Uwe Schindler  wrote:
> 
> Hi Chris,
>  
> yes they are not declared “stable” for 9.0 as those are just “automatic 
> module names” assigned through manifest based on the gradle module name only. 
> But I would say they are still more “stable” than anything we had in 8.x 
> where the module system guesses just something from JAR file name. 

Right - “more stable”, which is fine.

> With that statement I wanted to say: we will try to keep the names as noted in my 
> previous mail when we switch to the full-featured module system with 
> module-info.java, all SPIs declared, exports of packages, ... 
> (https://issues.apache.org/jira/browse/LUCENE-10255). But there may be a 
> requirement to refactor and rename a module.

That sounds reasonable.

> As you were also involved with the JMS @ Oracle before you started at 
> Elasticsearch, you may also give some comments about the naming issue. To my 
> understanding the new and almost stable module names are safe to be released 
> as auto-modules in Lucene 9.0.

Yes, that is my understanding also.

> Uwe
>  
> P.S.: Not sure with which version you are testing Elasticsearch. The 
> automodules were added 2 weeks ago and yesterday we renamed them to have 
> fully-qualified domain names. We will merge the PR later and backport to 9.0 
> and 9.x branch.

I’m experimenting with a recent 9.0 snapshot. I’ll update once the latest name 
changes find their way into 9.0.

> -
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de 
> eMail: u...@thetaphi.de 
>  
> From: Dawid Weiss  
> Sent: Tuesday, November 30, 2021 1:19 PM
> To: Lucene Dev 
> Subject: Re: HEADS UP: Java 9 Module System (JMS) module names for Lucene 
> artifacts
>  
>> Getting this fixed in 9.0.0 will allow us to leverage these stable module 
>> names in requires
>> directives (without needing to change them any time soon).
>  
> For the record - they are not considered stable in 9.0.0 -- see Uwe's note 
> below (although this was my original 
> intention for introducing automatic module names, even if the actual compact 
> naming has been so 
> fiercely criticized...).
>  
> LUCENE-10234: Added Automatic-Module-Name to all JARs. This is a first step 
> to enable full Java
> module system (JMS) support in later Lucene versions. At the moment, the 
> automatic names should
> not be considered stable. (Dawid Weiss, Uwe Schindler)
>  
> D. 



Re: HEADS UP: Java 9 Module System (JMS) module names for Lucene artifacts

2021-11-30 Thread Chris Hegarty
Hi Uwe,

Thank you very much for noticing and fixing this. I’m relatively new
here and certainly don’t get a vote or anything, but it is a big +1
from me for this change.

I'm actively investigating the modularization of the Elasticsearch
platform and I noticed, when prototyping our module declarations, that
the lucene module names were less than ideal. Getting this fixed in
9.0.0 will allow us to leverage these stable module names in requires
directives (without needing to change them any time soon).
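
To illustrate why the naming matters for downstream requires directives, here is a small sketch that uses the JDK's own ModuleDescriptor builder to validate names. The consumer module name "org.example.app" and the class and method names are made up for the example. The new-style names are legal module names, while a JAR-derived name with a hyphen is not.

```java
import java.lang.module.ModuleDescriptor;

final class ModuleNameDemo {

  /** Returns true if the builder accepts the name in a requires directive. */
  static boolean isLegalModuleName(String name) {
    try {
      // "org.example.app" is a hypothetical consumer module.
      ModuleDescriptor.newModule("org.example.app").requires(name);
      return true;
    } catch (IllegalArgumentException e) {
      return false; // not a dot-separated sequence of Java identifiers
    }
  }

  /** Builds the descriptor a hypothetical consumer of Lucene might end up with. */
  static ModuleDescriptor describeConsumer() {
    return ModuleDescriptor.newModule("org.example.app")
        .requires("org.apache.lucene.core")
        .requires("org.apache.lucene.backward_codecs")
        .version("9.0.0") // must be parseable by ModuleDescriptor.Version
        .build();
  }

  public static void main(String[] args) {
    System.out.println(isLegalModuleName("org.apache.lucene.core"));
    System.out.println(isLegalModuleName("lucene-core"));
    System.out.println(describeConsumer());
  }
}
```

The same names would appear verbatim in a consumer's module-info.java, which is why renaming them after a release is painful for everyone downstream.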

Thanks again,
-Chris.

> On 29 Nov 2021, at 23:04, Uwe Schindler  wrote:
> 
> Hi,
> 
> I stopped the Lucene 9.0 release because of some inconsistencies with Java
> Module System (JMS) module names. We will respin, but in preparation to full
> module system support (in later Lucene 9.x versions), I changed the so
> called "automatic module name" of all JAR artifacts so they are consistent
> with naming conventions by the ASF and suggested by module developers at
> OpenJDK and Maven people.
> 
> Long story:
> 
> There were already lengthy discussions on the Maven and OpenJDK mailing lists on
> "how to name a module". If you define a module name through
> "Automatic-Module-Name" in the JAR manifest or by an explicit
> module-info.java (see https://issues.apache.org/jira/browse/LUCENE-10255,
> which is a draft), the module name must be well thought out. Christian Stein
> (member of the OpenJDK group and also a JUnit committer, also heavily involved in
> the development of Apache Maven) wrote a blog post about what a module name
> should look like, so that any code downstream can import it into their own
> modules. The names must be valid Java identifiers and should be formatted
> like package names:
> https://sormuras.github.io/blog/2019-08-04-maven-coordinates-and-java-module-names.html
> 
> It concludes this very well:
> - The Java module name should have the Maven Group ID as prefix, followed by
> "." and then a local module descriptor. E.g., "org.apache.lucene.core"
> - The prefix of exported package names inside each module *should* be
> prefixed by the module name (we can't do this for Lucene, but we should at
> least share the same prefix: "org.apache.lucene").
> - The version name inside the module should follow module system syntax (so
> at least "9.0.0", no prefix/suffix => parseable by ModuleDescriptor.Version)
> 
> Here is a statistic of module names used on Maven by different artifacts,
> have a look at examples like Log4J, Apache TIKA and others:
> https://github.com/sormuras/modules/blob/main/doc/Top1000-2020.txt.md
> 
> For my detailed arguments see the discussion here (comments following this
> one):
>  7=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comm
> ent-17450327>
> 
> My proposal is to do the following before release, now implemented in
> https://github.com/apache/lucene/pull/487:
> 
>> Task :showModuleNames
> lucene-benchmark-10.0.0-SNAPSHOT.jar       -> org.apache.lucene.benchmark
> lucene-backward-codecs-10.0.0-SNAPSHOT.jar -> org.apache.lucene.backward_codecs
> lucene-classification-10.0.0-SNAPSHOT.jar  -> org.apache.lucene.classification
> lucene-codecs-10.0.0-SNAPSHOT.jar          -> org.apache.lucene.codecs
> lucene-core-10.0.0-SNAPSHOT.jar            -> org.apache.lucene.core
> lucene-demo-10.0.0-SNAPSHOT.jar            -> org.apache.lucene.demo
> lucene-expressions-10.0.0-SNAPSHOT.jar     -> org.apache.lucene.expressions
> lucene-facet-10.0.0-SNAPSHOT.jar           -> org.apache.lucene.facet
> lucene-grouping-10.0.0-SNAPSHOT.jar        -> org.apache.lucene.grouping
> lucene-highlighter-10.0.0-SNAPSHOT.jar     -> org.apache.lucene.highlighter
> lucene-join-10.0.0-SNAPSHOT.jar            -> org.apache.lucene.join
> lucene-luke-10.0.0-SNAPSHOT.jar            -> org.apache.lucene.luke
> lucene-memory-10.0.0-SNAPSHOT.jar          -> org.apache.lucene.memory
> lucene-misc-10.0.0-SNAPSHOT.jar            -> org.apache.lucene.misc
> lucene-monitor-10.0.0-SNAPSHOT.jar         -> org.apache.lucene.monitor
> lucene-queries-10.0.0-SNAPSHOT.jar         -> org.apache.lucene.queries
> lucene-queryparser-10.0.0-SNAPSHOT.jar     -> org.apache.lucene.queryparser
> lucene-replicator-10.0.0-SNAPSHOT.jar      -> org.apache.lucene.replicator
> lucene-sandbox-10.0.0-SNAPSHOT.jar         -> org.apache.lucene.sandbox
> lucene-spatial-extras-10.0.0-SNAPSHOT.jar  -> org.apache.lucene.spatial_extras
> lucene-spatial3d-10.0.0-SNAPSHOT.jar       -> org.apache.lucene.spatial3d
> lucene-suggest-10.0.0-SNAPSHOT.jar         -> org.apache.lucene.suggest
> lucene-test-framework-10.0.0-SNAPSHOT.jar  -> org.apache.lucene.test_framework
> lucene-analysis-common-10.0.0-SNAPSHOT.jar -> org.apache.lucene.analysis.common
> 

Re: Accessibility of CollectedSearchGroup's state

2021-10-15 Thread Chris Hegarty
Hi Adrien, Jim,

First, thank you for your time and insightful comments. They are very much 
appreciated, and will no doubt lead to the best solution.

I will reimplement the elasticsearch collector as suggested, and then circle 
back here with a proposal for the reduction in accessibility most appropriate 
for Lucene.

-Chris.

P.S. Eliminating split packages shines a spotlight on tech debt. Each case is 
different, and each requires and deserves its own investigation and analysis. I 
try to do as much of this upfront as I can, but really it needs the valuable input 
from folk on this list. Thank you for your time.

> On 14 Oct 2021, at 19:05, jim ferenczi  wrote:
> 
> I agree, we should have a SinglePassGroupingCollector in Elasticsearch and 
> reduce the visibility of these expert classes in Lucene.
> As it stands today, the FirstPassGroupingCollector could be a final class imo.
> 
> 
> On Thu, 14 Oct 2021 at 18:42, Adrien Grand wrote:
> I feel sorry for increasing the scope of all these requests for changes that 
> you make, but the way Elasticsearch overrides this collector feels wrong to 
> me as any change in the implementation details of this collector would 
> probably break Elasticsearch's collector too. In my opinion, 
> CollectedSearchGroup should not even be public. My preference would be to 
> copy this collector to the Elasticsearch code base and fold the changes from 
> Elasticsearch's CollapsingTopDocsCollector into it. I'm not super familiar 
> with this code, so I might be missing something. Maybe Jim or Alan have an 
> opinion.



Accessibility of CollectedSearchGroup's state

2021-10-14 Thread Chris Hegarty
In an effort to prepare Elasticsearch for modularization, we are
investigating and eliminating split packages. The situation has improved
through recent refactoring in Lucene 9.0 [1], but a number of split
packages still remain. This message identifies one such case so that it can
be discussed in isolation, with a view to a potential solution either in
Lucene or possibly within Elasticsearch itself.

Elasticsearch has a collapsing search collector that groups documents
based on field values and collapses based on the top sorted documents,
`CollapsingTopDocsCollector` [2]. The CTDC is a subclass of lucene's
`FirstPassGroupingCollector` [3], and extends its functionality to
get the top docs in just a single pass. As a subclass, the CTDC
leverages the sorted top N unique groups by means of the protected
`FPGC.orderedGroups` field (of type `TreeSet<CollectedSearchGroup<T>>`),
when performing the collapsing. Specifically, the
`CollectedSearchGroup.topDoc` field is of interest in order to retrieve
the number of the top document. The `topDoc` field is package-private
and therefore not normally accessible to the CTDC (without resorting to
nasty tricks!).

Given that lucene's publicly extensible FPGC exposes `orderedGroups` as
a set of `CollectedSearchGroup`, and that CSG is a public class, the
lack of public access to its state looks like an oversight rather than
a deliberate design decision. Otherwise, from outside the package, CSG
adds no apparent value and behaves like a marker interface, which is of
no use to subclasses.

Minimally, the elasticsearch collector requires read access to the
`CollectedSearchGroup.topDoc` field. This could be achieved by adding
an accessor method to CSG, that returns the primitive int doc value.  But
this whole API seems fairly low-level and powerful (you should know what
you're doing - experts only!). Also, CSG's superclass, `SearchGroup`,
makes its state available through public fields, so maybe we could just
make CSG's state public too?

lucene/grouping/src/java/org/apache/lucene/search/grouping/CollectedSearchGroup.java

 public class CollectedSearchGroup extends SearchGroup {
-  int topDoc;
-  int comparatorSlot;
+
+  /** The number of the top document. */
+  public int topDoc;
+
+  /** The field comparator slot. */
+  public int comparatorSlot;
 }
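
For comparison, the accessor alternative mentioned above could look roughly
like the sketch below. It is self-contained and uses stand-in classes
(`SearchGroupSketch`, `CollectedSearchGroupSketch`) rather than the real
Lucene types; the accessor name `getTopDoc` is hypothetical, and the groups
are ordered by comparator slot purely for illustration.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Stand-in for Lucene's SearchGroup, whose state is exposed via public fields.
class SearchGroupSketch<T> {
  public T groupValue;
}

// Stand-in for CollectedSearchGroup: the fields stay package-private and a
// (hypothetical) accessor grants read-only access to the top document.
class CollectedSearchGroupSketch<T> extends SearchGroupSketch<T> {
  int topDoc;
  int comparatorSlot;

  /** Returns the number of the top document. */
  public int getTopDoc() {
    return topDoc;
  }
}

class AccessorDemo {
  // Builds two groups, standing in for the collector's orderedGroups.
  // Ordering by comparator slot is an assumption made for this sketch only.
  static TreeSet<CollectedSearchGroupSketch<String>> sampleGroups() {
    TreeSet<CollectedSearchGroupSketch<String>> groups =
        new TreeSet<>(
            Comparator.comparingInt((CollectedSearchGroupSketch<String> g) -> g.comparatorSlot));
    groups.add(group("b", 42, 1));
    groups.add(group("a", 7, 0));
    return groups;
  }

  static CollectedSearchGroupSketch<String> group(String value, int topDoc, int slot) {
    CollectedSearchGroupSketch<String> g = new CollectedSearchGroupSketch<>();
    g.groupValue = value;
    g.topDoc = topDoc;
    g.comparatorSlot = slot;
    return g;
  }

  // What a single-pass collector would do: read each group's top document
  // through the accessor rather than through the field.
  static int[] topDocs(TreeSet<CollectedSearchGroupSketch<String>> orderedGroups) {
    return orderedGroups.stream().mapToInt(g -> g.getTopDoc()).toArray();
  }

  public static void main(String[] args) {
    for (int doc : topDocs(sampleGroups())) {
      System.out.println(doc);
    }
  }
}
```

The accessor keeps the field writable only from within the package, at the
cost of being inconsistent with `SearchGroup`'s public fields.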

-Chris.

[1] https://issues.apache.org/jira/browse/LUCENE-9319
[2] 
https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/apache/lucene/search/grouping/CollapseTopFieldDocs.java
[3] 
https://github.com/apache/lucene/blob/main/lucene/grouping/src/java/org/apache/lucene/search/grouping/FirstPassGroupingCollector.java
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Accessibility of SegmentInfo::setDiagnostics

2021-09-27 Thread Chris Hegarty
Hi Adrien,

Thanks for your reply.

A further thought... the use-case here is really for custom subclasses
to “add additional” diagnostics, rather than “replace”. How about we
expose just that functionality in SegmentInfo, e.g. through a new method
similar to:

  /**
   * Adds the given {@code diagnostics} to this segment's diagnostics.
   * @param diagnostics the diagnostics to add
   */
  public void addDiagnostics(Map<String, String> diagnostics) { ... }

This would satisfy the ES use-case, while disallowing a full replace.

If we think that there are other use-cases that require "replace" (or it
is preferred to defer API discussion for a later time), then I'm happy
to proceed with simply making the existing `setDiagnostics` method
public.
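
To make the "add, not replace" semantics concrete, here is a minimal sketch.
`SegmentInfoSketch` is a stand-in reduced to a diagnostics map, not the real
`SegmentInfo`; the key names are made up, and letting a new value win on a
key collision is an assumption that the proposal above leaves open.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Stand-in for SegmentInfo, reduced to its diagnostics map.
class SegmentInfoSketch {
  private Map<String, String> diagnostics;

  SegmentInfoSketch(Map<String, String> diagnostics) {
    this.diagnostics = Map.copyOf(diagnostics);
  }

  /**
   * Adds the given diagnostics to the existing ones. Keys already present
   * and not named in the argument are kept; on a collision the new value
   * wins (an assumption of this sketch).
   */
  public void addDiagnostics(Map<String, String> extra) {
    Objects.requireNonNull(extra);
    Map<String, String> merged = new HashMap<>(diagnostics);
    merged.putAll(extra);
    diagnostics = Map.copyOf(merged); // callers only ever see an immutable view
  }

  public Map<String, String> getDiagnostics() {
    return diagnostics;
  }
}

class AddDiagnosticsDemo {
  public static void main(String[] args) {
    SegmentInfoSketch si = new SegmentInfoSketch(Map.of("source", "merge"));
    si.addDiagnostics(Map.of("es_custom_key", "value")); // hypothetical key
    System.out.println(si.getDiagnostics());
  }
}
```

A caller can extend the map but has no way to drop entries that Lucene itself
recorded, which is the property a full `setDiagnostics` cannot offer.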

-Chris.

> On 24 Sep 2021, at 17:30, Adrien Grand  wrote:
> 
> I'd +1 a change that makes setDiagnostics public. Longer term I wonder if we 
> should have a more locked down API that _only_ allows setting diagnostics. 
> There are lots of things in SegmentCommitInfo that merges should never 
> override like the segment ID, and I can't think of anything else than 
> diagnostics that a merge policy should ever need to set.
> 





Accessibility of SegmentInfo::setDiagnostics

2021-09-24 Thread Chris Hegarty
In an effort to prepare Elasticsearch for modularization, we are
investigating and eliminating split packages. The situation has improved
through recent refactoring in Lucene 9.0 [1], but a number of split
packages still remain. This message identifies one such so that it can
be discussed in isolation, with a view to a potential solution either in
Lucene or possibly within Elasticsearch itself.

Elasticsearch has a custom merge policy, ShuffleForcedMergePolicy, that
interleaves oldest and newest segments. SFMP has but a single
package-private dependency on lucene, namely `SegmentInfo::setDiagnostics`.
The `setDiagnostics` method is used by SFMP to add a single additional
ES-specific diagnostic key.

It is possible to recreate the whole SegmentCommitInfo (and the SI
within, along with the additional ES-specific diagnostic key), by using
just the accessible parts of the lucene API. This has been prototyped[2],
but shows that something is not quite right [3]. Resolving this issue is
likely better done in lucene.

The comment in `MergePolicy::setMergeInfo` - "Sets the SegmentCommitInfo
of the merged segment. Allows sub-classes to e.g. set diagnostics
properties"[4] - is telling. To better support the use-cases referred to
in this comment and to allow for other-package subclasses (which seems
completely reasonable given other parts of the API), it would be better
for lucene's SI::setDiagnostics method to be public (rather than
package-private). Such a change is a general improvement in the lucene
API, which can be leveraged by any custom merge policy.

-Chris.

[1] https://issues.apache.org/jira/browse/LUCENE-9319
[2] https://github.com/elastic/elasticsearch/pull/78253
[3] https://github.com/elastic/elasticsearch/pull/78253#issuecomment-925861893
[4] 
https://github.com/apache/lucene/blob/d5d6dc079395c47cd6d12dcce3bcfdd2c7d9dc63/lucene/core/src/java/org/apache/lucene/index/MergePolicy.java#L271



Re: Accessibility of MergeThread.rateLimiter

2021-09-22 Thread Chris Hegarty
Hi Adrien,

> On 22 Sep 2021, at 12:56, Adrien Grand  wrote:
> 
> You interpreted my suggestion correctly.

Great. Thanks for the confirmation.

I filed the following issue to track this:
  https://issues.apache.org/jira/browse/LUCENE-10118 


> Elasticsearch can indeed leverage information that is sent to the InfoStream: 
> look up the class called LoggerInfoStream, which forwards all the InfoStream 
> logging to Log4J.

Got it. Thanks.

Cheers,
-Chris.

Re: Accessibility of MergeThread.rateLimiter

2021-09-22 Thread Chris Hegarty
Hi Adrien,

Great suggestion. If I understand correctly, you are suggesting something 
along the lines of the following (subject to exact message details, which we 
can thrash out in a PR):

diff --git 
a/lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java 
b/lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java
index 4500d5cf7ce..76f7ea2a9f2 100644
--- a/lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java
+++ b/lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java
@@ -691,13 +691,23 @@ public class ConcurrentMergeScheduler extends 
MergeScheduler {
 public void run() {
   try {
 if (verbose()) {
-  message("  merge thread: start");
+  message(String.format(Locale.ROOT, "merge thread %s start", 
this.getName()));
 }
 
 doMerge(mergeSource, merge);
 
 if (verbose()) {
-  message("  merge thread: done");
+  message(
+  String.format(
+  Locale.ROOT,
+  "merge thread %s done estSize=%.1f MB (written=%.1f MB) 
runTime=%.1fs (stopped=%.1fs, paused=%.1fs) rate=%s",
+  this.getName(),
+  bytesToMB(merge.estimatedMergeBytes),
+  bytesToMB(rateLimiter.getTotalBytesWritten()),
+  nsToSec(System.nanoTime() - merge.mergeStartNS),
+  nsToSec(rateLimiter.getTotalStoppedNS()),
+  nsToSec(rateLimiter.getTotalPausedNS()),
+  rateToString(rateLimiter.getMBPerSec())));
 }
 runOnMergeFinished(mergeSource);
   } catch (Throwable exc) {


Then ES can leverage this from the infoStream, right? (Thus avoiding the need 
for ES to extract the inaccessible information directly itself, while also 
being more generally useful in Lucene logs.)

Or have I misinterpreted your comment?

-Chris.


> On 22 Sep 2021, at 10:12, Adrien Grand  wrote:
> 
> Hi Chris,
> 
> I looked into this and Elasticsearch seems to only need access to the rate 
> limiter for logging purposes, without adding any information that Lucene 
> doesn't have.
> Maybe another option would consist of moving the logging to Lucene? Having 
> information in the IndexWriter's InfoStream about rate limiting for each 
> completed merge sounds like something that would generally be useful.
> 



Accessibility of MergeThread.rateLimiter

2021-09-22 Thread Chris Hegarty
In an effort to prepare Elasticsearch for modularization, we are
investigating and eliminating split packages. The situation has improved
through recent refactoring in Lucene 9.0, but a number of split
packages still remain. This message identifies one such, so that it can
be discussed in isolation, with a view to a potential solution either in
Lucene or possibly within Elasticsearch itself.

Elasticsearch defines an `ElasticsearchConcurrentMergeScheduler` [1]
(subclass of lucene's `ConcurrentMergeScheduler`), that provides
tracking of merge times for all and current merges. It does so by
extracting the relevant information from the `MergeThread.rateLimiter`
to report, a) the total-bytes-written, via
`MergeRateLimiter::getTotalBytesWritten`, and b) the MB-per-second
throttle, via `MergeRateLimiter::getMBPerSec`. Currently access to the
package-private `rateLimiter` is gained by effectively (but not
literally) injecting into a lucene package [2].

The elasticsearch ECMS overrides the factory for creating merge threads,
so is in control of the thread creation, but still cannot gain access to
the merge tracking information.

There are several ways that we could resolve this, e.g.

1) In Lucene, declare `MergeThread.rateLimiter` as a public field.
   Elasticsearch can access the rate limiter directly.

2) Same as #1, but instead declare `MergeThread.rateLimiter` as
   protected field. Elasticsearch can then subclass MergeThread and
   access the rate limiter.

3) In Lucene, provide an overloaded MergeThread constructor that accepts
   a `MergeRateLimiter`. Elasticsearch can then create the merge rate
   limiter itself and pass it to the merge thread on construction. This
   would work, but would require ES to retain a map from thread to rate
   limiter - not ideal.

4) Add public accessors directly to MergeThread to expose the necessary
   information, e.g.
     class MergeThread {
       ...
       /** Returns the current MB per second rate limit. */
       public double getMBPerSec() {
         return rateLimiter.getMBPerSec();
       }

       /** Returns the total bytes written by this merge. */
       public long getTotalBytesWritten() {
         return rateLimiter.getTotalBytesWritten();
       }
     }

I'm sure there are probably a few other ideas that are not listed above,
but so far no.2 above seems the least intrusive. No.4 is also reasonable
if we consider this functionality more broadly desirable.
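
As a rough illustration of no.2, the sketch below uses stand-in classes
(`RateLimiterSketch`, `MergeThreadSketch`) in place of the real
`MergeRateLimiter` and `MergeThread`; `TrackingMergeThread` and its
`describe` method are hypothetical names for what an Elasticsearch-side
subclass might look like.

```java
import java.util.Locale;

// Stand-in for MergeRateLimiter, reduced to the two values ES reports.
class RateLimiterSketch {
  private final double mbPerSec;
  private final long totalBytesWritten;

  RateLimiterSketch(double mbPerSec, long totalBytesWritten) {
    this.mbPerSec = mbPerSec;
    this.totalBytesWritten = totalBytesWritten;
  }

  public double getMBPerSec() {
    return mbPerSec;
  }

  public long getTotalBytesWritten() {
    return totalBytesWritten;
  }
}

// Stand-in for MergeThread with option 2 applied: the field is protected
// rather than package-private.
class MergeThreadSketch extends Thread {
  protected final RateLimiterSketch rateLimiter;

  MergeThreadSketch(RateLimiterSketch rateLimiter) {
    this.rateLimiter = rateLimiter;
  }
}

// A subclass can read the protected field even when it lives in a different
// package. (In this single-file sketch everything shares one package.)
class TrackingMergeThread extends MergeThreadSketch {
  TrackingMergeThread(RateLimiterSketch rateLimiter) {
    super(rateLimiter);
  }

  String describe() {
    return String.format(
        Locale.ROOT,
        "written=%d bytes, throttle=%.1f MB/sec",
        rateLimiter.getTotalBytesWritten(),
        rateLimiter.getMBPerSec());
  }
}

class ProtectedFieldDemo {
  public static void main(String[] args) {
    System.out.println(new TrackingMergeThread(new RateLimiterSketch(20.0, 1024L)).describe());
  }
}
```

Since the scheduler subclass already controls merge thread creation, it could
return such a subclass from its factory method without any helper class in a
lucene package.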

-Chris.

[1] 
https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/index/engine/ElasticsearchConcurrentMergeScheduler.java#L39
[2] 
https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/apache/lucene/index/OneMergeHelper.java#L16





Re: Accessibility of QueryParserBase::handleBareFuzzy

2021-09-21 Thread Chris Hegarty
Thanks Alan, great suggestion.

I filed the following issue to track this:
   https://issues.apache.org/jira/browse/LUCENE-10115

-Chris.

> On 20 Sep 2021, at 16:14, Alan Woodward  wrote:
> 
> Hi Chris,
> 
> The difference between the elasticsearch query parser and the built-in lucene 
> one appears to be based around how they parse fuzziness, so I think the best 
> solution here is to add another protected method, something like this:
> 
> protected float parseFuzzyDistance(String input, float defaultValue) {
>   try {
>     return Float.parseFloat(input.substring(1));
>   } catch (@SuppressWarnings("unused") Exception ignored) {
>     return defaultValue;
>   }
> }
> 
> Then handleBareFuzzy() can call out to this, and the ES version can override 
> it and do its own parsing.
> 
> - A
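
The extension point suggested in the quoted message can be sketched as
follows. The classes are stand-ins (`BaseParserSketch`, `CustomParserSketch`),
not the real query parsers: here `handleBareFuzzy` returns the parsed distance
instead of building a query, and the default of 2.0 and the `~AUTO` handling
are illustrative assumptions only.

```java
// Stand-in for the base parser: handleBareFuzzy delegates the numeric
// parsing to a protected hook that subclasses may override.
class BaseParserSketch {
  protected float parseFuzzyDistance(String input, float defaultValue) {
    try {
      // Drop the leading '~' and parse the remainder.
      return Float.parseFloat(input.substring(1));
    } catch (RuntimeException ignored) {
      // Covers both a non-numeric remainder and an empty input.
      return defaultValue;
    }
  }

  // Simplified: returns the distance rather than a Query.
  float handleBareFuzzy(String fuzzyToken) {
    return parseFuzzyDistance(fuzzyToken, 2.0f);
  }
}

// Stand-in for a parser in another package that needs its own parsing rules.
class CustomParserSketch extends BaseParserSketch {
  @Override
  protected float parseFuzzyDistance(String input, float defaultValue) {
    if ("~AUTO".equalsIgnoreCase(input)) {
      return 1.0f; // hypothetical custom handling
    }
    return super.parseFuzzyDistance(input, defaultValue);
  }
}

class FuzzyHookDemo {
  public static void main(String[] args) {
    System.out.println(new BaseParserSketch().handleBareFuzzy("~0.8"));
    System.out.println(new CustomParserSketch().handleBareFuzzy("~AUTO"));
  }
}
```

With a hook like this, only the parsing step is open to subclasses, and
`handleBareFuzzy` itself can remain package-private.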




Accessibility of QueryParserBase::handleBareFuzzy

2021-09-20 Thread Chris Hegarty
Hi, 

In an effort to prepare Elasticsearch for modularization, we are
investigating and eliminating split packages. The situation has improved
through recent refactoring in Lucene 9.0 [1], but a number of split
packages still remain. This message identifies one such so that it can
be discussed in isolation, with a view to a potential solution either in
Lucene or possibly within Elasticsearch itself.

Elasticsearch has a query parser, `QueryStringQueryParser`[2], that
builds queries based on mapping information. This parser has a need to
override its superclass's 
`org.apache.lucene.queryparser.classic.QueryParserBase::handleBareFuzzy` [3]
method, in order to provide custom handling of fuzzy queries. This is
clearly not "best practice", since doing so requires effectively (but
not literally) injecting a class into a lucene package, which is done
through `XQueryParser` [4]. We want to eliminate the need for
`XQueryParser`, and hence the split package at run time.

The obvious, but likely not right, option is to simply make
`handleBareFuzzy` a protected method in Lucene's `QueryParser` or
`QueryParserBase` - this would satisfy the need of the Elasticsearch
`QueryStringQueryParser`. If
not this, I don't see an alternative that could be coded in
Elasticsearch's `QueryStringQueryParser`, but maybe there is a different
API extension point that could be used, or a new one provided?

-Chris.

[1] https://issues.apache.org/jira/browse/LUCENE-9319
[2] 
https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/index/search/QueryStringQueryParser.java#L436
[3] 
https://github.com/apache/lucene/blob/8ac26737913d0c1555019e93bc6bf7db1ab9047e/lucene/queryparser/src/java/org/apache/lucene/queryparser/classic/QueryParserBase.java#L813
[4] 
https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/apache/lucene/queryparser/classic/XQueryParser.java

