Re: [VOTE] Release Lucene 9.11.1 RC1

2024-06-24 Thread Michael Sokolov
SUCCESS! [0:55:48.190137]

(tested w/Corretto JDK)

+1

On Mon, Jun 24, 2024 at 8:01 AM Benjamin Trent  wrote:
>
> SUCCESS! [0:40:46.898514]
>
> +1
>
> On Mon, Jun 24, 2024 at 1:29 AM Ignacio Vera  wrote:
> >
> > Please vote for release candidate 1 for Lucene 9.11.1
> >
> >
> > The artifacts can be downloaded from:
> >
> > https://dist.apache.org/repos/dist/dev/lucene/lucene-9.11.1-RC1-rev-0c087dfdd10e0f6f3f6faecc6af4415e671a9e69
> >
> >
> > You can run the smoke tester directly with this command:
> >
> >
> > python3 -u dev-tools/scripts/smokeTestRelease.py \
> > https://dist.apache.org/repos/dist/dev/lucene/lucene-9.11.1-RC1-rev-0c087dfdd10e0f6f3f6faecc6af4415e671a9e69
> >
> >
> > The vote will be open for at least 72 hours i.e. until 2024-06-27 07:00 UTC.
> >
> >
> > [ ] +1  approve
> >
> > [ ] +0  no opinion
> >
> > [ ] -1  disapprove (and reason why)
> >
> >
> > Here is my +1
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>




Re: [VOTE] Release Lucene 9.11.1 RC1

2024-06-24 Thread Benjamin Trent
SUCCESS! [0:40:46.898514]

+1

On Mon, Jun 24, 2024 at 1:29 AM Ignacio Vera  wrote:
>
> Please vote for release candidate 1 for Lucene 9.11.1
>
>
> The artifacts can be downloaded from:
>
> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.11.1-RC1-rev-0c087dfdd10e0f6f3f6faecc6af4415e671a9e69
>
>
> You can run the smoke tester directly with this command:
>
>
> python3 -u dev-tools/scripts/smokeTestRelease.py \
> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.11.1-RC1-rev-0c087dfdd10e0f6f3f6faecc6af4415e671a9e69
>
>
> The vote will be open for at least 72 hours i.e. until 2024-06-27 07:00 UTC.
>
>
> [ ] +1  approve
>
> [ ] +0  no opinion
>
> [ ] -1  disapprove (and reason why)
>
>
> Here is my +1




Re: Bugfix release 9.11.1

2024-06-23 Thread Ignacio Vera
Here are the release notes for the release:

https://cwiki.apache.org/confluence/pages/resumedraft.action?draftId=311626871=460c13d9-7e8e-4e24-912b-2933f64bb746;

Please feel free to edit them.

On Fri, Jun 21, 2024 at 11:55 AM Stefan Vodita 
wrote:

> The fix is now in main, branch_9x, and branch_9_11.
>
> On Thu, 20 Jun 2024 at 14:17, Stefan Vodita 
> wrote:
>
>> Thank you Ignacio for handling the release!
>>
>> I've just updated the PR with the fix [1].
>> I can push it tomorrow.
>>
>>
>> Stefan
>>
>> [1] https://github.com/apache/lucene/pull/13494
>>
>> On Thu, 20 Jun 2024 at 08:36, Ignacio Vera  wrote:
>>
>>> I am now preparing for a bugfix release from branch branch_9_11. I am
>>> planning to build the RC next Monday.
>>>
>>> Please observe the normal rules for committing to this branch:
>>>
>>> * Before committing to the branch, reply to this thread and argue
>>>   why the fix needs backporting and how long it will take.
>>> * All issues accepted for backporting should be marked with 9.11.1
>>>   in GitHub, and issues that should delay the release must be marked as
>>> Blocker
>>> * All patches that are intended for the branch should first be committed
>>>   to the unstable branch, merged into the stable branch, and then into
>>>   the current release branch.
>>> * Only issues with Milestone version 9.11.1 and priority "Blocker" will
>>> delay
>>>   a release candidate build.
>>>
>>> There are already three fixes backported to branch 9.11:
>>>
>>> [1] https://github.com/apache/lucene/pull/13498
>>> [2] https://github.com/apache/lucene/pull/13504
>>> [3] https://github.com/apache/lucene/pull/13501
>>>
>>> @Stefan you mention in another thread an issue with
>>> StringValueFacetCounts but I see there is no fix for it yet, what's the
>>> status there?
>>>
>>> Thanks,
>>>
>>> Ignacio
>>>
>>


[VOTE] Release Lucene 9.11.1 RC1

2024-06-23 Thread Ignacio Vera
Please vote for release candidate 1 for Lucene 9.11.1


The artifacts can be downloaded from:

https://dist.apache.org/repos/dist/dev/lucene/lucene-9.11.1-RC1-rev-0c087dfdd10e0f6f3f6faecc6af4415e671a9e69


You can run the smoke tester directly with this command:


python3 -u dev-tools/scripts/smokeTestRelease.py \
https://dist.apache.org/repos/dist/dev/lucene/lucene-9.11.1-RC1-rev-0c087dfdd10e0f6f3f6faecc6af4415e671a9e69


The vote will be open for at least 72 hours i.e. until 2024-06-27 07:00 UTC.


[ ] +1  approve

[ ] +0  no opinion

[ ] -1  disapprove (and reason why)


Here is my +1


Re: Any recommended issues to work on for a newcomer?

2024-06-22 Thread Michael Wechner

Hi Hank

Sorry, I still have not found the time to try your code, but today I
learned about


https://rockset.com/Rockset_for_Hybrid_Search.pdf
https://rockset.com/whitepapers/hybrid-search-architecture/

which might be interesting to compare with.

Thanks

Michael



On 20.05.24 at 08:16, Michael Wechner wrote:

Hi Hank

Very cool, thank you, will try to do this asap!

All the best

Michael


On 19.05.24 at 01:42, Chang Hank wrote:

Hey Michael,

I wrote a first version of my idea for implementing RRF in
Lucene; here is the link to the code:
https://gist.github.com/hack4chang/ee2b37eab80bd82e574ff4f94ed204e9
Right now I have some questions: one is about the shardIndex to be
returned, another is about the TotalHits value. Please take a look at
the code and kindly leave some comments below.


Thanks,
Hank


On May 18, 2024, at 2:01 PM, Chang Hank  wrote:

Or maybe we can first create an issue and PR based on the issue number?
WDYT?

Best,

Hank

On May 18, 2024, at 11:29 AM, Chang Hank  
wrote:


Hey Michael,

Sorry I was a bit busy this week, but I’ve looked into the 
resources you provided and also some useful advice from Alessandro 
and Adrien.


I have a brief understanding of how RRF works, but I'm not quite
sure how we should implement it. Based on the advice from
Alessandro and Adrien, it seems we need to consider that the search
results are located on different shards. According to Alessandro,
we should aggregate the ranked lists from all distributed nodes and
then apply RRF.
Are we going to implement this aggregation logic inside our RRF
method?


Also, could you please create a PR so we can discuss the details
further?


All the best,

Hank

On May 13, 2024, at 10:09 AM, Michael Wechner 
 wrote:


Great, sounds like we have a plan :-)

Hank and I can get started trying to understand the internals 
better ...


Thanks

Michael

On 13.05.24 at 18:21, Alessandro Benedetti wrote:
Sure, we can make it work, but in a distributed environment you first
have to run each query distributed (aggregating across all nodes)
and then apply RRF on top of the aggregated ranked lists.
Doing RRF per node first and then aggregating per shard won't
return the same results, I suspect.

When I go back to working on the task I'll be able to elaborate more!

Cheers
--
*Alessandro Benedetti*
Director @ Sease Ltd.
/Apache Lucene/Solr Committer/
/Apache Solr PMC Member/

e-mail: a.benede...@sease.io

*Sease* - Information Retrieval Applied
Consulting | Training | Open Source

Website: Sease.io
LinkedIn | Twitter | Youtube | Github


On Mon, 13 May 2024 at 14:12, Adrien Grand  wrote:

> Maybe Adrien Grand and others might also have some feedback :-)

I'd suggest the signature to look something like `TopDocs
TopDocs#rrf(int topN, int k, TopDocs[] hits)` to be
consistent with `TopDocs#merge`. Internally, it should look
at `ScoreDoc#shardId` and `ScoreDoc#doc` to figure out which
hits map to the same document.

> Back in the day, I was reasoning on this and I didn't think
> Lucene was the right place for an interleaving algorithm,
> given that Reciprocal Rank Fusion is affected by distribution
> and it's not supposed to work per node.

To me this is like `TopDocs#merge`. There are changes needed
on the application side to hook this call into the logic that
combines hits that come from multiple shards (multiple
queries in the case of RRF), but Lucene can still provide the
merging logic.
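
A minimal sketch of the merging logic discussed above, under stated assumptions: it is not the proposed Lucene API and it does not use Lucene classes. `Hit` and `Fused` are hypothetical stand-ins for `ScoreDoc`-like types, and `rrf` only mirrors the suggested `(topN, k, hits)` parameter order. Each document is identified by its (shard, doc) pair, and its fused score is the sum of 1 / (k + rank) over every ranked list that contains it, with ranks starting at 1.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RrfSketch {
  /** Hypothetical stand-in for a ScoreDoc: (shardIndex, doc) identifies one document. */
  record Hit(int shardIndex, int doc) {}

  /** A document together with its fused RRF score. */
  record Fused(Hit hit, double score) {}

  /**
   * Fuses several ranked lists (one per query) with Reciprocal Rank Fusion.
   * Assumes each input list is already sorted best-first and already
   * aggregated across shards.
   */
  static List<Fused> rrf(int topN, int k, List<List<Hit>> rankedLists) {
    Map<Hit, Double> scores = new HashMap<>();
    for (List<Hit> ranked : rankedLists) {
      for (int i = 0; i < ranked.size(); i++) {
        int rank = i + 1; // ranks are 1-based
        scores.merge(ranked.get(i), 1.0 / (k + rank), Double::sum);
      }
    }

    List<Fused> fused = new ArrayList<>();
    for (Map.Entry<Hit, Double> e : scores.entrySet()) {
      fused.add(new Fused(e.getKey(), e.getValue()));
    }

    // Highest fused score first; ties broken by shard, then doc, for determinism.
    fused.sort((a, b) -> {
      int byScore = Double.compare(b.score(), a.score());
      if (byScore != 0) return byScore;
      int byShard = Integer.compare(a.hit().shardIndex(), b.hit().shardIndex());
      return byShard != 0 ? byShard : Integer.compare(a.hit().doc(), b.hit().doc());
    });

    return new ArrayList<>(fused.subList(0, Math.min(topN, fused.size())));
  }

  public static void main(String[] args) {
    List<Hit> queryA = List.of(new Hit(0, 1), new Hit(0, 2), new Hit(0, 3));
    List<Hit> queryB = List.of(new Hit(0, 2), new Hit(0, 4));
    for (Fused f : rrf(3, 60, List.of(queryA, queryB))) {
      System.out.println(f);
    }
  }
}
```

In this sketch a document that appears in both lists outranks documents that appear in only one, which is the behaviour RRF is meant to provide. A real implementation inside Lucene would presumably also need to decide what to report for total hits and the shard index, the open questions raised earlier in the thread.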

On Mon, May 13, 2024 at 1:41 PM Michael Wechner
 wrote:

Thanks for your feedback Alessandro!

I am using Lucene independently of Solr, OpenSearch, or
Elasticsearch, but would like to combine different result
sets using RRF, and therefore think that Lucene itself could
actually be a good place for it.

Looking forward to your additional elaboration!

Thanks

Michael





On 13.05.2024 at 12:34, Alessandro Benedetti wrote:

This is not strictly related to Lucene, but I'll give a
talk at Berlin Buzzwords on how I am implementing
Reciprocal Rank Fusion in Apache Solr.
I'll resume my work on the contribution next week and
have more to share later.

Back in the day, I was reasoning on this and I didn't
think Lucene was the right place for an interleaving
algorithm, given that Reciprocal Rank Fusion is affected
by distribution and it's not supposed to work per node.
I think I evaluated the possibility of doing it as a
Lucene query or a Lucene component but then ended up
with a different approach.
I'll elaborate more when I go back to the task!

Cheers
--
*Alessandro 

Re: [JENKINS] Lucene-main-MacOSX (64bit/hotspot/jdk-21.0.1) - Build # 11500 - Still Failing!

2024-06-22 Thread Dawid Weiss
Thank you for digging, Uwe!

On Fri, Jun 21, 2024 at 10:24 PM Uwe Schindler  wrote:

> Hi,
>
> It looks like I was able to work around it by putting:
>
> org.gradle.vfs.watch=false
>
> into the config file ~/.gradle/gradle.properties on the MacOS's Jenkins
> Node.
>
> This build seems to work again:
> https://jenkins.thetaphi.de/job/Lucene-main-MacOSX/11501/console
>
> Uwe
> On 21.06.2024 at 18:48, Uwe Schindler wrote:
>
> Hi,
>
> the issue is here: https://github.com/gradle/gradle/issues/29476
>
> Looks like they solved the issue. We may need to update to a later Gradle
> 8.8 bugfix release. It looks like it's not yet available.
>
> https://github.com/gradle/gradle/pull/29514
>
> The Jenkins node runs a macOS version older than 11:
>
> serv1-vm2:~ jenkins$ sw_vers
> ProductName:    Mac OS X
> ProductVersion: 10.14.6
> BuildVersion:   18G9323
>
> Uwe
>
> On 21.06.2024 at 18:01, Uwe Schindler wrote:
>
> Hi,
>
> it looks like since we changed Gradle to the latest version (not sure which PR
> it was), all builds on MacOS X fail on Policeman Jenkins (it is x86-64, not
> ARM, and runs an older version of MacOSX, as it's hard to update due to VM
> issues with VirtualBox). The whole JVM crashes when Gradle starts. The
> hs_err_pid log shows that Gradle loads a native library which causes an
> issue. The dylib file is provided by Gradle, so the issue is clearly the
> dylib file shipped with Gradle.
>
> For now I disabled the builds. I have the feeling that Gradle has some
> null pointer error in their own dynamic library
> "libnative-platform-file-events.dylib":
>
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x00012c06c000, pid=26586, tid=3587
> #
> # JRE version: OpenJDK Runtime Environment Temurin-11.0.21+9 (11.0.21+9)
> (build 11.0.21+9)
> # Java VM: OpenJDK 64-Bit Server VM Temurin-11.0.21+9 (11.0.21+9, mixed
> mode, tiered, compressed oops, parallel gc, bsd-amd64)
> # Problematic frame:
> # C  [libnative-platform-file-events.dylib+0x0] __dso_handle+0x0
> #
> # No core dump will be written. Core dumps have been disabled. To enable
> core dumping, try "ulimit -c unlimited" before starting Java again
>
> See the attached file for full details; it also shows at the end that the
> mentioned libnative-platform-file-events.dylib file is shipped with Gradle:
>
> 0x00012c06
> /Users/jenkins/.gradle/native/c067742578af261105cb4f569cf0c3c89f3d7b1fecec35dd04571415982c5e48/osx-amd64/libnative-platform.dylib
> 0x00012c06c000
> /Users/jenkins/.gradle/native/100fb08df4bc3b14c8652ba06237920a3bd2aa13389f12d3474272988ae205f9/osx-amd64/libnative-platform-file-events.dylib
>
> Solr builds have not yet updated Gradle; they build fine.
>
> Uwe
>
> On 21.06.2024 at 15:58, Policeman Jenkins Server wrote:
>
> Build: https://jenkins.thetaphi.de/job/Lucene-main-MacOSX/11500/
> Java: 64bit/hotspot/jdk-21.0.1 -XX:-UseCompressedOops -XX:+UseParallelGC
>
> No tests ran.
>
> -
> To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org
> For additional commands, e-mail: builds-h...@lucene.apache.org
>
>
>
>
>
> --
> Uwe Schindler
> Achterdiek 19, D-28357 Bremenhttps://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>


Re: [JENKINS] Lucene-main-MacOSX (64bit/hotspot/jdk-21.0.1) - Build # 11500 - Still Failing!

2024-06-21 Thread Uwe Schindler

Hi,

It looks like I was able to work around it by putting:

   org.gradle.vfs.watch=false

into the config file ~/.gradle/gradle.properties on the MacOS's Jenkins 
Node.


This build seems to work again: 
https://jenkins.thetaphi.de/job/Lucene-main-MacOSX/11501/console


Uwe

On 21.06.2024 at 18:48, Uwe Schindler wrote:

Hi,

the issue is here: https://github.com/gradle/gradle/issues/29476

Looks like they solved the issue. We may need to update to a later
Gradle 8.8 bugfix release. It looks like it's not yet available.


https://github.com/gradle/gradle/pull/29514

The Jenkins node runs a macOS version older than 11:

serv1-vm2:~ jenkins$ sw_vers
ProductName:    Mac OS X
ProductVersion: 10.14.6
BuildVersion:   18G9323

Uwe

On 21.06.2024 at 18:01, Uwe Schindler wrote:

Hi,

it looks like since we changed Gradle to the latest version (not sure
which PR it was), all builds on MacOS X fail on Policeman Jenkins
(it is x86-64, not ARM, and runs an older version of MacOSX, as it's
hard to update due to VM issues with VirtualBox). The whole JVM
crashes when Gradle starts. The hs_err_pid log shows that Gradle loads
a native library which causes an issue. The dylib file is provided by
Gradle, so the issue is clearly the dylib file shipped with Gradle.


For now I disabled the builds. I have the feeling that Gradle has
some null pointer error in their own dynamic library
"libnative-platform-file-events.dylib":


# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00012c06c000, pid=26586, tid=3587
#
# JRE version: OpenJDK Runtime Environment Temurin-11.0.21+9 
(11.0.21+9) (build 11.0.21+9)
# Java VM: OpenJDK 64-Bit Server VM Temurin-11.0.21+9 (11.0.21+9, 
mixed mode, tiered, compressed oops, parallel gc, bsd-amd64)

# Problematic frame:
# C  [libnative-platform-file-events.dylib+0x0] __dso_handle+0x0
#
# No core dump will be written. Core dumps have been disabled. To 
enable core dumping, try "ulimit -c unlimited" before starting Java 
again


See the attached file for full details; it also shows at the end that
the mentioned libnative-platform-file-events.dylib file is shipped with
Gradle:


0x00012c06 
/Users/jenkins/.gradle/native/c067742578af261105cb4f569cf0c3c89f3d7b1fecec35dd04571415982c5e48/osx-amd64/libnative-platform.dylib
0x00012c06c000 
/Users/jenkins/.gradle/native/100fb08df4bc3b14c8652ba06237920a3bd2aa13389f12d3474272988ae205f9/osx-amd64/libnative-platform-file-events.dylib


Solr builds have not yet updated Gradle; they build fine.

Uwe

On 21.06.2024 at 15:58, Policeman Jenkins Server wrote:

Build: https://jenkins.thetaphi.de/job/Lucene-main-MacOSX/11500/
Java: 64bit/hotspot/jdk-21.0.1 -XX:-UseCompressedOops 
-XX:+UseParallelGC


No tests ran.




--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail:u...@thetaphi.de


Re: [JENKINS] Lucene-main-MacOSX (64bit/hotspot/jdk-21.0.1) - Build # 11500 - Still Failing!

2024-06-21 Thread Uwe Schindler

Hi,

the issue is here: https://github.com/gradle/gradle/issues/29476

Looks like they solved the issue. We may need to update to a later Gradle
8.8 bugfix release. It looks like it's not yet available.


https://github.com/gradle/gradle/pull/29514

The Jenkins node runs a macOS version older than 11:

serv1-vm2:~ jenkins$ sw_vers
ProductName:    Mac OS X
ProductVersion: 10.14.6
BuildVersion:   18G9323

Uwe

On 21.06.2024 at 18:01, Uwe Schindler wrote:

Hi,

it looks like since we changed Gradle to the latest version (not sure
which PR it was), all builds on MacOS X fail on Policeman Jenkins (it
is x86-64, not ARM, and runs an older version of MacOSX, as it's hard
to update due to VM issues with VirtualBox). The whole JVM crashes when
Gradle starts. The hs_err_pid log shows that Gradle loads a native
library which causes an issue. The dylib file is provided by Gradle, so
the issue is clearly the dylib file shipped with Gradle.


For now I disabled the builds. I have the feeling that Gradle has some
null pointer error in their own dynamic library
"libnative-platform-file-events.dylib":


# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00012c06c000, pid=26586, tid=3587
#
# JRE version: OpenJDK Runtime Environment Temurin-11.0.21+9 
(11.0.21+9) (build 11.0.21+9)
# Java VM: OpenJDK 64-Bit Server VM Temurin-11.0.21+9 (11.0.21+9, 
mixed mode, tiered, compressed oops, parallel gc, bsd-amd64)

# Problematic frame:
# C  [libnative-platform-file-events.dylib+0x0] __dso_handle+0x0
#
# No core dump will be written. Core dumps have been disabled. To 
enable core dumping, try "ulimit -c unlimited" before starting Java again


See the attached file for full details; it also shows at the end that
the mentioned libnative-platform-file-events.dylib file is shipped with
Gradle:


0x00012c06 
/Users/jenkins/.gradle/native/c067742578af261105cb4f569cf0c3c89f3d7b1fecec35dd04571415982c5e48/osx-amd64/libnative-platform.dylib
0x00012c06c000 
/Users/jenkins/.gradle/native/100fb08df4bc3b14c8652ba06237920a3bd2aa13389f12d3474272988ae205f9/osx-amd64/libnative-platform-file-events.dylib


Solr builds have not yet updated Gradle; they build fine.

Uwe

On 21.06.2024 at 15:58, Policeman Jenkins Server wrote:

Build: https://jenkins.thetaphi.de/job/Lucene-main-MacOSX/11500/
Java: 64bit/hotspot/jdk-21.0.1 -XX:-UseCompressedOops -XX:+UseParallelGC

No tests ran.



--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail:u...@thetaphi.de





Re: [JENKINS] Lucene-main-MacOSX (64bit/hotspot/jdk-21.0.1) - Build # 11500 - Still Failing!

2024-06-21 Thread Uwe Schindler

Hi,

It looks like since we changed Gradle to the latest version (not sure which
PR it was), all builds on MacOS X fail on Policeman Jenkins (it is
x86-64, not ARM, and runs an older version of MacOSX, as it's hard to
update due to VM issues with VirtualBox). The whole JVM crashes when
Gradle starts. The hs_err_pid log shows that Gradle loads a native
library which causes an issue. The dylib file is provided by Gradle, so
the issue is clearly the dylib file shipped with Gradle.


For now I disabled the builds. I have the feeling that Gradle has some
null pointer error in their own dynamic library
"libnative-platform-file-events.dylib":


# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00012c06c000, pid=26586, tid=3587
#
# JRE version: OpenJDK Runtime Environment Temurin-11.0.21+9 (11.0.21+9) 
(build 11.0.21+9)
# Java VM: OpenJDK 64-Bit Server VM Temurin-11.0.21+9 (11.0.21+9, mixed 
mode, tiered, compressed oops, parallel gc, bsd-amd64)

# Problematic frame:
# C  [libnative-platform-file-events.dylib+0x0] __dso_handle+0x0
#
# No core dump will be written. Core dumps have been disabled. To enable 
core dumping, try "ulimit -c unlimited" before starting Java again


See the attached file for full details; it also shows at the end that the
mentioned libnative-platform-file-events.dylib file is shipped with Gradle:


0x00012c06 
/Users/jenkins/.gradle/native/c067742578af261105cb4f569cf0c3c89f3d7b1fecec35dd04571415982c5e48/osx-amd64/libnative-platform.dylib
0x00012c06c000 
/Users/jenkins/.gradle/native/100fb08df4bc3b14c8652ba06237920a3bd2aa13389f12d3474272988ae205f9/osx-amd64/libnative-platform-file-events.dylib


Solr builds have not yet updated Gradle; they build fine.

Uwe

On 21.06.2024 at 15:58, Policeman Jenkins Server wrote:

Build: https://jenkins.thetaphi.de/job/Lucene-main-MacOSX/11500/
Java: 64bit/hotspot/jdk-21.0.1 -XX:-UseCompressedOops -XX:+UseParallelGC

No tests ran.



--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: u...@thetaphi.de
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00012c06c000, pid=26586, tid=3587
#
# JRE version: OpenJDK Runtime Environment Temurin-11.0.21+9 (11.0.21+9) (build 
11.0.21+9)
# Java VM: OpenJDK 64-Bit Server VM Temurin-11.0.21+9 (11.0.21+9, mixed mode, 
tiered, compressed oops, parallel gc, bsd-amd64)
# Problematic frame:
# C  [libnative-platform-file-events.dylib+0x0]  __dso_handle+0x0
#
# No core dump will be written. Core dumps have been disabled. To enable core 
dumping, try "ulimit -c unlimited" before starting Java again
#
# If you would like to submit a bug report, please visit:
#   https://github.com/adoptium/adoptium-support/issues
#

---  S U M M A R Y 

Command Line: -XX:TieredStopAtLevel=1 -XX:+UseParallelGC 
-XX:ActiveProcessorCount=1 
--add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED 
--add-exports=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED 
--add-exports=jdk.compiler/com.sun.tools.javac.parser=ALL-UNNAMED 
--add-exports=jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED 
--add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED 
--add-opens=java.base/java.util=ALL-UNNAMED 
--add-opens=java.base/java.lang=ALL-UNNAMED 
--add-opens=java.base/java.lang.invoke=ALL-UNNAMED 
--add-opens=java.prefs/java.util.prefs=ALL-UNNAMED 
--add-opens=java.base/java.nio.charset=ALL-UNNAMED 
--add-opens=java.base/java.net=ALL-UNNAMED 
--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED -Xmx1g 
-Dfile.encoding=UTF-8 
-Djava.io.tmpdir=/Users/jenkins/workspace/Lucene-9.x-MacOSX/.gradle/tmp 
-Duser.country=US -Duser.language=en -Duser.variant 
-javaagent:/Users/jenkins/.gradle/wrapper/dists/gradle-8.8-bin/dl7vupf4psengwqhwktix4v1/gradle-8.8/lib/agents/gradle-instrumentation-agent-8.8.jar
 org.gradle.launcher.daemon.bootstrap.GradleDaemon 8.8

Host: MacBookPro11,3 x86_64 3620 MHz, 6 cores, 12G, Darwin 18.7.0
Time: Fri Jun 21 12:11:27 2024 UTC elapsed time: 5.787127 seconds (0d 0h 0m 5s)

---  T H R E A D  ---

Current thread is native thread

Stack: [0x7ba23000,0x7baa3000],  sp=0x7baa0298,  free 
space=500k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, 
Vv=VM code, C=native code)
C  [libnative-platform-file-events.dylib+0x0]  __dso_handle+0x0
C  [libnative-platform-file-events.dylib+0x15c27]  void 
std::__1::__variant_detail::__ctor 
>::__generic_construct[abi:v160006], (std::__1::__variant_detail::_Trait)1> 
const&>(std::__1::__variant_detail::__ctor >&, 
std::__1::__variant_detail::__copy_constructor, (std::__1::__variant_detail::_Trait)1> const&&&)+0x47
C  [libnative-platform-file-events.dylib+0x15b8a]  

Re: Bugfix release 9.11.1

2024-06-21 Thread Stefan Vodita
The fix is now in main, branch_9x, and branch_9_11.

On Thu, 20 Jun 2024 at 14:17, Stefan Vodita  wrote:

> Thank you Ignacio for handling the release!
>
> I've just updated the PR with the fix [1].
> I can push it tomorrow.
>
>
> Stefan
>
> [1] https://github.com/apache/lucene/pull/13494
>
> On Thu, 20 Jun 2024 at 08:36, Ignacio Vera  wrote:
>
>> I am now preparing for a bugfix release from branch branch_9_11. I am
>> planning to build the RC next Monday.
>>
>> Please observe the normal rules for committing to this branch:
>>
>> * Before committing to the branch, reply to this thread and argue
>>   why the fix needs backporting and how long it will take.
>> * All issues accepted for backporting should be marked with 9.11.1
>>   in GitHub, and issues that should delay the release must be marked as
>> Blocker
>> * All patches that are intended for the branch should first be committed
>>   to the unstable branch, merged into the stable branch, and then into
>>   the current release branch.
>> * Only issues with Milestone version 9.11.1 and priority "Blocker" will
>> delay
>>   a release candidate build.
>>
>> There are already three fixes backported to branch 9.11:
>>
>> [1] https://github.com/apache/lucene/pull/13498
>> [2] https://github.com/apache/lucene/pull/13504
>> [3] https://github.com/apache/lucene/pull/13501
>>
>> @Stefan you mention in another thread an issue with
>> StringValueFacetCounts but I see there is no fix for it yet, what's the
>> status there?
>>
>> Thanks,
>>
>> Ignacio
>>
>


Re: Bugfix release 9.11.1

2024-06-20 Thread Ignacio Vera
I expect the entry to be added under 9.12, except in the 9_11 branch,
where it should be under 9.11.1.
I will synchronize the CHANGES.txt once I am done with the release.

On Thu, Jun 20, 2024 at 4:49 PM Dawid Weiss  wrote:

>
> Thank you. I've applied 13315 to main, branch_9x and branch_9_11. I can't
> remember what the CHANGES.txt policy is: do we copy the entire section for
> 9.11 back to main/9x once it's cut?
>
> Dawid
>
> On Thu, Jun 20, 2024 at 4:41 PM Ignacio Vera  wrote:
>
>> Stefan sounds good.
>>
>> Sure Dawid. Thanks for moving it forward.
>>
>> Cheers,
>>
>> Ignacio
>>
>> On Thu, Jun 20, 2024 at 4:38 PM Dawid Weiss 
>> wrote:
>>
>>>
>>> I would like to backport this change to unified highlighter to 9.11.1,
>>> if there are no objections.
>>> https://github.com/apache/lucene/pull/13315
>>>
>>>
>>>
>>> On Thu, Jun 20, 2024 at 9:37 AM Ignacio Vera  wrote:
>>>
>>>> I am now preparing for a bugfix release from branch branch_9_11. I am
>>>> planning to build the RC next Monday.
>>>>
>>>> Please observe the normal rules for committing to this branch:
>>>>
>>>> * Before committing to the branch, reply to this thread and argue
>>>>   why the fix needs backporting and how long it will take.
>>>> * All issues accepted for backporting should be marked with 9.11.1
>>>>   in GitHub, and issues that should delay the release must be marked as
>>>>   Blocker
>>>> * All patches that are intended for the branch should first be committed
>>>>   to the unstable branch, merged into the stable branch, and then into
>>>>   the current release branch.
>>>> * Only issues with Milestone version 9.11.1 and priority "Blocker" will
>>>>   delay a release candidate build.
>>>>
>>>> There are already three fixes backported to branch 9.11:
>>>>
>>>> [1] https://github.com/apache/lucene/pull/13498
>>>> [2] https://github.com/apache/lucene/pull/13504
>>>> [3] https://github.com/apache/lucene/pull/13501
>>>>
>>>> @Stefan you mention in another thread an issue with
>>>> StringValueFacetCounts but I see there is no fix for it yet, what's the
>>>> status there?
>>>>
>>>> Thanks,
>>>>
>>>> Ignacio
>>>>
>>>


Re: Bugfix release 9.11.1

2024-06-20 Thread Dawid Weiss
Thank you. I've applied 13315 to main, branch_9x and branch_9_11. I can't
remember what the CHANGES.txt policy is: do we copy the entire section for
9.11 back to main/9x once it's cut?

Dawid

On Thu, Jun 20, 2024 at 4:41 PM Ignacio Vera  wrote:

> Stefan sounds good.
>
> Sure Dawid. Thanks for moving it forward.
>
> Cheers,
>
> Ignacio
>
> On Thu, Jun 20, 2024 at 4:38 PM Dawid Weiss  wrote:
>
>>
>> I would like to backport this change to unified highlighter to 9.11.1, if
>> there are no objections.
>> https://github.com/apache/lucene/pull/13315
>>
>>
>>
>> On Thu, Jun 20, 2024 at 9:37 AM Ignacio Vera  wrote:
>>
>>> I am now preparing for a bugfix release from branch branch_9_11. I am
>>> planning to build the RC next Monday.
>>>
>>> Please observe the normal rules for committing to this branch:
>>>
>>> * Before committing to the branch, reply to this thread and argue
>>>   why the fix needs backporting and how long it will take.
>>> * All issues accepted for backporting should be marked with 9.11.1
>>>   in GitHub, and issues that should delay the release must be marked as
>>> Blocker
>>> * All patches that are intended for the branch should first be committed
>>>   to the unstable branch, merged into the stable branch, and then into
>>>   the current release branch.
>>> * Only issues with Milestone version 9.11.1 and priority "Blocker" will
>>> delay
>>>   a release candidate build.
>>>
>>> There are already three fixes backported to branch 9.11:
>>>
>>> [1] https://github.com/apache/lucene/pull/13498
>>> [2] https://github.com/apache/lucene/pull/13504
>>> [3] https://github.com/apache/lucene/pull/13501
>>>
>>> @Stefan you mention in another thread an issue with
>>> StringValueFacetCounts but I see there is no fix for it yet, what's the
>>> status there?
>>>
>>> Thanks,
>>>
>>> Ignacio
>>>
>>


Re: Bugfix release 9.11.1

2024-06-20 Thread Ignacio Vera
Stefan sounds good.

Sure Dawid. Thanks for moving it forward.

Cheers,

Ignacio

On Thu, Jun 20, 2024 at 4:38 PM Dawid Weiss  wrote:

>
> I would like to backport this change to unified highlighter to 9.11.1, if
> there are no objections.
> https://github.com/apache/lucene/pull/13315
>
>
>
> On Thu, Jun 20, 2024 at 9:37 AM Ignacio Vera  wrote:
>
>> I am now preparing for a bugfix release from branch branch_9_11. I am
>> planning to build the RC next Monday.
>>
>> Please observe the normal rules for committing to this branch:
>>
>> * Before committing to the branch, reply to this thread and argue
>>   why the fix needs backporting and how long it will take.
>> * All issues accepted for backporting should be marked with 9.11.1
>>   in GitHub, and issues that should delay the release must be marked as
>> Blocker
>> * All patches that are intended for the branch should first be committed
>>   to the unstable branch, merged into the stable branch, and then into
>>   the current release branch.
>> * Only issues with Milestone version 9.11.1 and priority "Blocker" will
>> delay
>>   a release candidate build.
>>
>> There are already three fixes backported to branch 9.11:
>>
>> [1] https://github.com/apache/lucene/pull/13498
>> [2] https://github.com/apache/lucene/pull/13504
>> [3] https://github.com/apache/lucene/pull/13501
>>
>> @Stefan you mention in another thread an issue with
>> StringValueFacetCounts but I see there is no fix for it yet, what's the
>> status there?
>>
>> Thanks,
>>
>> Ignacio
>>
>


Re: Bugfix release 9.11.1

2024-06-20 Thread Dawid Weiss
I would like to backport this change to unified highlighter to 9.11.1, if
there are no objections.
https://github.com/apache/lucene/pull/13315



On Thu, Jun 20, 2024 at 9:37 AM Ignacio Vera  wrote:

> I am now preparing for a bugfix release from branch branch_9_11. I am
> planning to build the RC next Monday.
>
> Please observe the normal rules for committing to this branch:
>
> * Before committing to the branch, reply to this thread and argue
>   why the fix needs backporting and how long it will take.
> * All issues accepted for backporting should be marked with 9.11.1
>   in GitHub, and issues that should delay the release must be marked as
> Blocker
> * All patches that are intended for the branch should first be committed
>   to the unstable branch, merged into the stable branch, and then into
>   the current release branch.
> * Only issues with Milestone version 9.11.1 and priority "Blocker" will
> delay
>   a release candidate build.
>
> There are already three fixes backported to branch 9.11:
>
> [1] https://github.com/apache/lucene/pull/13498
> [2] https://github.com/apache/lucene/pull/13504
> [3] https://github.com/apache/lucene/pull/13501
>
> @Stefan you mention in another thread an issue with StringValueFacetCounts
> but I see there is no fix for it yet, what's the status there?
>
> Thanks,
>
> Ignacio
>


GH-13315: Fix IndexOutOfBoundsException thrown in DefaultPassageFormatter by unordered matches

2024-06-20 Thread Stéphane Campinas
Hello,

I have made the requested changes to the PR
https://github.com/apache/lucene/pull/13315.
Please tell me if I need to add something else to it!

Thanks,
-- 
Stephane Campinas




Re: Bugfix release 9.11.1

2024-06-20 Thread Stefan Vodita
Thank you Ignacio for handling the release!

I've just updated the PR with the fix [1].
I can push it tomorrow.


Stefan

[1] https://github.com/apache/lucene/pull/13494

On Thu, 20 Jun 2024 at 08:36, Ignacio Vera  wrote:

> I am now preparing for a bugfix release from branch branch_9_11. I am
> planning to build the RC next Monday.
>
> Please observe the normal rules for committing to this branch:
>
> * Before committing to the branch, reply to this thread and argue
>   why the fix needs backporting and how long it will take.
> * All issues accepted for backporting should be marked with 9.11.1
>   in GitHub, and issues that should delay the release must be marked as
> Blocker
> * All patches that are intended for the branch should first be committed
>   to the unstable branch, merged into the stable branch, and then into
>   the current release branch.
> * Only issues with Milestone version 9.11.1 and priority "Blocker" will
> delay
>   a release candidate build.
>
> There are already three fixes backported to branch 9.11:
>
> [1] https://github.com/apache/lucene/pull/13498
> [2] https://github.com/apache/lucene/pull/13504
> [3] https://github.com/apache/lucene/pull/13501
>
> @Stefan you mention in another thread an issue with StringValueFacetCounts
> but I see there is no fix for it yet, what's the status there?
>
> Thanks,
>
> Ignacio
>


Bugfix release 9.11.1

2024-06-20 Thread Ignacio Vera
I am now preparing for a bugfix release from branch branch_9_11. I am
planning to build the RC next Monday.

Please observe the normal rules for committing to this branch:

* Before committing to the branch, reply to this thread and argue
  why the fix needs backporting and how long it will take.
* All issues accepted for backporting should be marked with 9.11.1
  in GitHub, and issues that should delay the release must be marked as
Blocker
* All patches that are intended for the branch should first be committed
  to the unstable branch, merged into the stable branch, and then into
  the current release branch.
* Only issues with Milestone version 9.11.1 and priority "Blocker" will
delay
  a release candidate build.

There are already three fixes backported to branch 9.11:

[1] https://github.com/apache/lucene/pull/13498
[2] https://github.com/apache/lucene/pull/13504
[3] https://github.com/apache/lucene/pull/13501

@Stefan you mention in another thread an issue with StringValueFacetCounts
but I see there is no fix for it yet, what's the status there?

Thanks,

Ignacio


Re: Do we need a 9.11.1 release?

2024-06-18 Thread Ignacio Vera
Hi Stefan,

Yes, I think we need a 9.11.1 release.

There are a couple more bugs that we might want to fix. The first one is a
performance regression due to an expensive operation
added to NumericComparator [1]. In addition, there is another one I would
like to address, where an IndexWriter loses track of the parent field due
to an empty index [2].

[1] https://github.com/apache/lucene/pull/13498
[2] https://github.com/apache/lucene/issues/13340

If we agree that the release is necessary, I can offer to be the release
manager. I am planning to start the release by the end of this week or
early next week.

On Tue, Jun 18, 2024 at 12:43 PM Stefan Vodita 
wrote:

> Hi all,
>
> I wanted to bring to everyone's attention that we released a bug [1] in
> StringValueFacetCounts with 9.11. On an empty match-set, instead of
> returning
> empty facet results, we throw an NPE. Users can work around this, but
> obviously
> it's not ideal.
>
> I noticed there is one other issue around ConcurrentMergeScheduler where
> we're
> considering a bugfix release [2].
>
> Are there any other bug fixes we're working on that I've missed?
>
>
> Stefan
>
>
> [1] https://github.com/apache/lucene/issues/13493
> [2] https://github.com/apache/lucene/issues/13478#issuecomment-2160815181
>


Do we need a 9.11.1 release?

2024-06-18 Thread Stefan Vodita
Hi all,

I wanted to bring to everyone's attention that we released a bug [1] in
StringValueFacetCounts with 9.11. On an empty match-set, instead of
returning
empty facet results, we throw an NPE. Users can work around this, but
obviously
it's not ideal.

I noticed there is one other issue around ConcurrentMergeScheduler where
we're
considering a bugfix release [2].

Are there any other bug fixes we're working on that I've missed?


Stefan


[1] https://github.com/apache/lucene/issues/13493
[2] https://github.com/apache/lucene/issues/13478#issuecomment-2160815181


Looking for PR Review - Fix to DefaultPassageFormatter

2024-06-17 Thread Zack Kendall
Hey folks, the github-actions bot told me to email this list, since my PR
has been out for weeks now.

I introduced a PR that identifies a small bug in highlighter behavior and
fixes it. This includes updating existing unit tests and introducing new
tests.

https://github.com/apache/lucene/pull/13384

Thanks!


Re: Can we import an HNSW graph into lucene index ?

2024-06-14 Thread Michael Froh
Hi Anand,

Interesting that you should bring this up!

There was a talk just this week at Berlin Buzzwords about using
cuVS with Lucene: https://www.youtube.com/watch?v=qiW7iIDFJC0

From that talk, it sounds like the folks at SearchScale have managed to
integrate cuVS as a custom codec under Lucene. There was also mention in
the talk that a CAGRA graph can be built using GPUs (to get some impressive
speedups), then they can be searched using CPU-based logic (so you don't
need to provision GPU hosts for your search fleet).

I wasn't able to find the code for the codec at
https://github.com/SearchScale/lucene-cuvs/tree/main, but Ishan and Noble
should be on this list and might be able to find it.

- Froh

On Fri, Jun 14, 2024 at 7:57 AM Benjamin Trent 
wrote:

> Anand,
>
> In short, I think it's feasible, but I don't think it's simple. I also
> don't think Lucene should directly provide an interface to the format
> that says "Give me the graph". You could have a custom writer that
> does this however.
>
> All formats are nominally based, so if your GPU merge format writes
> out the appropriate name and format, it should be readable.
>
> > One issue we have been running into is long build times with higher
> dimensional vectors.
>
> Are you building the graph with a single thread?
>
> What vector dimensions are you using?
>
> As an aside, building the graph via quantized vectors can help speed
> things up. Though I understand the desire to do graph building with a
> GPU.
>
> Very interesting ideas indeed Anand.
>
> Ben
>
> On Fri, Jun 14, 2024 at 4:49 AM Anand Kotriwal 
> wrote:
> >
> > Hi all,
> >
> > We extensively use Lucene and HNSW graph search capability for ANN
> searches.
> > One issue we have been running into is long build times with higher
> dimensional vectors. To address this, we are exploring ways where we can
> build the hnsw index on the GPU and merge it into an existing Lucene index
> to serve queries. For example, Nvidia's cuvs library supports building a
> CAGRA index and  transforming it into a hnswlib graph.
> >
> > My idea is - once the hnswgraph is built on the GPUs, we can import the
> graph. We need the graph vertices and their connections. We can then write
> it to a lucene compatible segment file format. We also map the docids to
> embeddings and update the fieldinfos.
> >
> > I would like feedback from the community on whether this sounds feasible
> and any implementation pointers you might have.
> >
> >
> > Thanks,
> > Anand Kotriwal
>
>
>


Re: Can we import an HNSW graph into lucene index ?

2024-06-14 Thread Atri Sharma
There is a module that integrates Lucene with CAGRA.

Ishan should be able to point you to the link.

On Fri, 14 Jun 2024 at 8:27 PM, Benjamin Trent 
wrote:

> Anand,
>
> In short, I think it's feasible, but I don't think it's simple. I also
> don't think Lucene should directly provide an interface to the format
> that says "Give me the graph". You could have a custom writer that
> does this however.
>
> All formats are nominally based, so if your GPU merge format writes
> out the appropriate name and format, it should be readable.
>
> > One issue we have been running into is long build times with higher
> dimensional vectors.
>
> Are you building the graph with a single thread?
>
> What vector dimensions are you using?
>
> As an aside, building the graph via quantized vectors can help speed
> things up. Though I understand the desire to do graph building with a
> GPU.
>
> Very interesting ideas indeed Anand.
>
> Ben
>
> On Fri, Jun 14, 2024 at 4:49 AM Anand Kotriwal 
> wrote:
> >
> > Hi all,
> >
> > We extensively use Lucene and HNSW graph search capability for ANN
> searches.
> > One issue we have been running into is long build times with higher
> dimensional vectors. To address this, we are exploring ways where we can
> build the hnsw index on the GPU and merge it into an existing Lucene index
> to serve queries. For example, Nvidia's cuvs library supports building a
> CAGRA index and  transforming it into a hnswlib graph.
> >
> > My idea is - once the hnswgraph is built on the GPUs, we can import the
> graph. We need the graph vertices and their connections. We can then write
> it to a lucene compatible segment file format. We also map the docids to
> embeddings and update the fieldinfos.
> >
> > I would like feedback from the community on whether this sounds feasible
> and any implementation pointers you might have.
> >
> >
> > Thanks,
> > Anand Kotriwal
>
>
>


Re: Can we import an HNSW graph into lucene index ?

2024-06-14 Thread Benjamin Trent
Anand,

In short, I think it's feasible, but I don't think it's simple. I also
don't think Lucene should directly provide an interface to the format
that says "Give me the graph". You could have a custom writer that
does this however.

All formats are nominally based, so if your GPU merge format writes
out the appropriate name and format, it should be readable.

> One issue we have been running into is long build times with higher 
> dimensional vectors.

Are you building the graph with a single thread?

What vector dimensions are you using?

As an aside, building the graph via quantized vectors can help speed
things up. Though I understand the desire to do graph building with a
GPU.

Very interesting ideas indeed Anand.

Ben

On Fri, Jun 14, 2024 at 4:49 AM Anand Kotriwal  wrote:
>
> Hi all,
>
> We extensively use Lucene and HNSW graph search capability for ANN searches.
> One issue we have been running into is long build times with higher 
> dimensional vectors. To address this, we are exploring ways where we can 
> build the hnsw index on the GPU and merge it into an existing Lucene index to 
> serve queries. For example, Nvidia's cuvs library supports building a CAGRA 
> index and  transforming it into a hnswlib graph.
>
> My idea is - once the hnswgraph is built on the GPUs, we can import the 
> graph. We need the graph vertices and their connections. We can then write it 
> to a lucene compatible segment file format. We also map the docids to 
> embeddings and update the fieldinfos.
>
> I would like feedback from the community on whether this sounds feasible and 
> any implementation pointers you might have.
>
>
> Thanks,
> Anand Kotriwal




Can we import an HNSW graph into lucene index ?

2024-06-14 Thread Anand Kotriwal
Hi all,

We extensively use Lucene and HNSW graph search capability for ANN
searches.
One issue we have been running into is long build times with higher
dimensional vectors. To address this, we are exploring ways where we can
build the hnsw index on the GPU and merge it into an existing Lucene index
to serve queries. For example, Nvidia's cuvs library supports building a
CAGRA index and transforming it into an
hnswlib graph.

My idea is - once the hnswgraph is built on the GPUs, we can import the
graph. We need the graph vertices and their connections. We can then write
it to a lucene compatible segment file format. We also map the docids to
embeddings and update the fieldinfos.
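
To make the import idea concrete, here is a minimal sketch of the sanity checks an externally built graph might need before anything is written to a segment. It is a toy model, not Lucene code: `validate_graph`, `neighbors`, `ord_to_doc` and `max_conn` are hypothetical names, the graph is treated as a single layer, and the requirement that doc ids increase with vector ordinal is an assumption about how ordinals would be assigned.

```python
def validate_graph(neighbors, ord_to_doc, max_conn):
    """Check a single-layer adjacency list before attempting an import.

    neighbors:  dict mapping node ordinal -> list of neighbor ordinals
    ord_to_doc: list where the index is the vector ordinal and the value
                is the doc id that owns that vector (assumed layout)
    max_conn:   maximum number of connections allowed per node
    """
    n = len(ord_to_doc)
    # Every vector ordinal 0..n-1 must appear as a node in the graph.
    if sorted(neighbors) != list(range(n)):
        raise ValueError("every vector ordinal needs exactly one adjacency entry")
    # Assumption: ordinals are assigned in doc id order.
    if any(b <= a for a, b in zip(ord_to_doc, ord_to_doc[1:])):
        raise ValueError("doc ids must increase with vector ordinal")
    for node, nbrs in neighbors.items():
        if len(nbrs) > max_conn:
            raise ValueError(f"node {node} exceeds max_conn={max_conn}")
        if len(set(nbrs)) != len(nbrs):
            raise ValueError(f"node {node} has duplicate neighbors")
        if any(x == node or not 0 <= x < n for x in nbrs):
            raise ValueError(f"node {node} has a self-loop or dangling edge")
    return True


# Three vectors owned by docs 3, 7 and 12, with at most two links each.
graph = {0: [1, 2], 1: [0], 2: [0, 1]}
doc_ids = [3, 7, 12]
validate_graph(graph, doc_ids, max_conn=2)
```

A real import would also have to deal with multiple graph levels and the on-disk layout expected by the vectors format, which this sketch does not attempt.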

I would like feedback from the community on whether this sounds feasible
and any implementation pointers you might have.


Thanks,
Anand Kotriwal


Re: Intellij build/test times

2024-06-13 Thread Dawid Weiss
> I don't see why in the gradle.properties template we've messed with
> the default daemon; this should purely be a personal choice and not
> one where we the project need to make a strong preference for you.
>

I think the choice is still there - we generate this file with some sane
defaults so that people can (and should!) modify it to their liking.

The reason the daemon timeout is set to 15 mins is that if you're
switching branches or playing with jvm settings (different jvms), you'll
quickly run out of memory (multiple gradle daemons running in the
background).

Dawid


Re: Intellij build/test times

2024-06-13 Thread David Smiley
I'm coming to this late but want to chime in to concur with most of
Dawid's responses.

I don't see why in the gradle.properties template we've messed with
the default daemon; this should purely be a personal choice and not
one where we the project need to make a strong preference for you.

On Thu, Jun 13, 2024 at 9:43 PM Dawid Weiss  wrote:
>
>
> I would love to do that but it's almost like the three body problem - IDEs, 
> gradle and code changes.
>
> Dawid
>
> On Thu, Jun 13, 2024 at 8:49 PM Michael Sokolov  wrote:
>>
>> Thanks for digging into this Dawid - I think it's important to keep an
>> IDE dev path pretty clear of underbrush in order to encourage new
>> joiners, even if it is not the primary or best means of building and
>> testing
>>
>> On Thu, Jun 13, 2024 at 2:01 PM Dawid Weiss  wrote:
>> >
>> >
>> > Hi Mike,
>> >
>> > Just FYI - I confirm something is odd with the configuration evaluation. 
>> > The times vary wildly on my machine. I don't know why it's the case and I 
>> > couldn't pinpoint a clear cause. Once the daemon is running, things are 
>> > faster - perhaps you should increase the default daemon timeout (it also 
>> > applies to the IDE, I think):
>> >
>> > # timeout after 15 mins of inactivity.
>> > org.gradle.daemon.idletimeout=900000
>> >
>> > I'll try to improve things by refreshing some of the build scripts. I 
>> > really liked gradle when it started - mostly for its simplicity. I don't 
>> > like how it turned from a build system to a distributed cache of prebuilt 
>> > artefacts... eh.
>> >
>> > Dawid
>>



Re: Intellij build/test times

2024-06-13 Thread Dawid Weiss
I would love to do that but it's almost like the three body problem - IDEs,
gradle and code changes.

Dawid

On Thu, Jun 13, 2024 at 8:49 PM Michael Sokolov  wrote:

> Thanks for digging into this Dawid - I think it's important to keep an
> IDE dev path pretty clear of underbrush in order to encourage new
> joiners, even if it is not the primary or best means of building and
> testing
>
> On Thu, Jun 13, 2024 at 2:01 PM Dawid Weiss  wrote:
> >
> >
> > Hi Mike,
> >
> > Just FYI - I confirm something is odd with the configuration evaluation.
> The times vary wildly on my machine. I don't know why it's the case and I
> couldn't pinpoint a clear cause. Once the daemon is running, things are
> faster - perhaps you should increase the default daemon timeout (it also
> applies to the IDE, I think):
> >
> > # timeout after 15 mins of inactivity.
> > org.gradle.daemon.idletimeout=900000
> >
> > I'll try to improve things by refreshing some of the build scripts. I
> really liked gradle when it started - mostly for its simplicity. I don't
> like how it turned from a build system to a distributed cache of prebuilt
> artefacts... eh.
> >
> > Dawid
>
>
>


Re: Intellij build/test times

2024-06-13 Thread Michael Sokolov
Thanks for digging into this Dawid - I think it's important to keep an
IDE dev path pretty clear of underbrush in order to encourage new
joiners, even if it is not the primary or best means of building and
testing

On Thu, Jun 13, 2024 at 2:01 PM Dawid Weiss  wrote:
>
>
> Hi Mike,
>
> Just FYI - I confirm something is odd with the configuration evaluation. The 
> times vary wildly on my machine. I don't know why it's the case and I 
> couldn't pinpoint a clear cause. Once the daemon is running, things are 
> faster - perhaps you should increase the default daemon timeout (it also 
> applies to the IDE, I think):
>
> # timeout after 15 mins of inactivity.
> org.gradle.daemon.idletimeout=900000
>
> I'll try to improve things by refreshing some of the build scripts. I really 
> liked gradle when it started - mostly for its simplicity. I don't like how it 
> turned from a build system to a distributed cache of prebuilt artefacts... eh.
>
> Dawid




Re: Intellij build/test times

2024-06-13 Thread Dawid Weiss
Hi Mike,

Just FYI - I confirm something is odd with the configuration evaluation.
The times vary wildly on my machine. I don't know why it's the case and I
couldn't pinpoint a clear cause. Once the daemon is running, things are
faster - perhaps you should increase the default daemon timeout (it also
applies to the IDE, I think):

# timeout after 15 mins of inactivity.
org.gradle.daemon.idletimeout=900000

I'll try to improve things by refreshing some of the build scripts. I
really liked gradle when it started - mostly for its simplicity. I don't
like how it turned from a build system to a distributed cache of prebuilt
artefacts... eh.

Dawid


Re: scalar quantization heap usage during merge

2024-06-12 Thread Benjamin Trent
Michael,

Empirically, I am not surprised there is an increase in heap usage. We
do have extra overhead with the scalar quantization on flush. There
may also be some additional heap usage on merge.

I just don't think it is via: Lucene99FlatVectorsWriter

On Wed, Jun 12, 2024 at 11:55 AM Michael Sokolov  wrote:
>
>  Empirically I thought I saw the need to increase JVM heap with this,
> but let me do some more testing to narrow down what is going on. It's
> possible the same heap requirements exist for the non-quantized case
> and I am just seeing some random vagary of the merge process happening
> to tip over a limit. It's also possible I messed something up in
> https://github.com/apache/lucene/pull/13469 which I am trying to use
> in order to index quantized vectors without building an HNSW graph.
>
> On Wed, Jun 12, 2024 at 10:24 AM Benjamin Trent  wrote:
> >
> > Heya Michael,
> >
> > > the first one I traced was referenced by vector writers involved in a 
> > > merge (Lucene99FlatVectorsWriter.FieldsWriter.vectors). Is this expected?
> >
> > Yes, that is holding the raw floats before flush. You should see
> > nearly the exact same overhead there as you would indexing raw
> > vectors. I would be surprised if there is a significant memory usage
> > difference due to Lucene99FlatVectorsWriter when using quantized vs.
> > not.
> >
> > The flow is this:
> >
> >  - Lucene99FlatVectorsWriter gets the float[] vector and makes a copy
> > of it (does this no matter what) and passes on to the next part of the
> > chain
> >  - If quantizing, the next part of the chain is
> > Lucene99ScalarQuantizedVectorsWriter.FieldsWriter, which only keeps a
> > REFERENCE to the array, it doesn't copy it. The float vector array is
> > then passed to the HNSW indexer (if it's being used), which also does
> > NOT copy, but keeps a reference.
> >  - If not quantizing but indexing, Lucene99FlatVectorsWriter will pass
> > it directly to the hnsw indexer, which does not copy it, but does add
> > it to the HNSW graph
> >
> > > I wonder if there is an opportunity to move some of this off-heap?
> >
> > I think we could do some things off-heap in the ScalarQuantizer. Maybe
> > even during "flush", but we would have to adjust the interfaces some
> > so that the scalarquantizer can know where the vectors are being
> > stored after the initial flush. Right now there is no way to know the
> > file nor file handle.
> >
> > > I can imagine that when we requantize we need to scan all the vectors to 
> > > determine the new quantization settings?
> >
> > We shouldn't be scanning every vector. We do take a sampling, though
> > that sampling can be large. There is here an opportunity for off-heap
> > action if possible. Though I don't know how we could do that before
> > flush. I could see the off-heap idea helping on merge.
> >
> > > Maybe we could do two passes - merge the float vectors while 
> > > recalculating, and then re-scan to do the actual quantization?
> >
> > I am not sure what you mean here by "merge the float vectors". If you
> > mean simply reading the individual float vector files and combining
> > them into a single file, we already do that separately from
> > quantizing.
> >
> > Thank you for digging into this. Glad others are experimenting!
> >
> > Ben
> >
> > On Wed, Jun 12, 2024 at 8:57 AM Michael Sokolov  wrote:
> > >
> > > Hi folks. I've been experimenting with our new scalar quantization
> > > support - yay, thanks for adding it! I'm finding that when I index a
> > > large number of large vectors, enabling quantization (vs simply
> > > indexing the full-width floats) requires more heap - I keep getting
> > > OOMs and have to increase heap size. I took a heap dump, and not
> > > surprisingly I found some big arrays of floats and bytes, and the
> > > first one I traced was referenced by vector writers involved in a
> > > merge (Lucene99FlatVectorsWriter.FieldsWriter.vectors). Is this
> > > expected? I wonder if there is an opportunity to move some of this
> > > off-heap?  I can imagine that when we requantize we need to scan all
> > > the vectors to determine the new quantization settings?  Maybe we
> > > could do two passes - merge the float vectors while recalculating, and
> > > then re-scan to do the actual quantization?
> > >

Re: scalar quantization heap usage during merge

2024-06-12 Thread Michael Sokolov
 Empirically I thought I saw the need to increase JVM heap with this,
but let me do some more testing to narrow down what is going on. It's
possible the same heap requirements exist for the non-quantized case
and I am just seeing some random vagary of the merge process happening
to tip over a limit. It's also possible I messed something up in
https://github.com/apache/lucene/pull/13469 which I am trying to use
in order to index quantized vectors without building an HNSW graph.

On Wed, Jun 12, 2024 at 10:24 AM Benjamin Trent  wrote:
>
> Heya Michael,
>
> > the first one I traced was referenced by vector writers involved in a merge 
> > (Lucene99FlatVectorsWriter.FieldsWriter.vectors). Is this expected?
>
> Yes, that is holding the raw floats before flush. You should see
> nearly the exact same overhead there as you would indexing raw
> vectors. I would be surprised if there is a significant memory usage
> difference due to Lucene99FlatVectorsWriter when using quantized vs.
> not.
>
> The flow is this:
>
>  - Lucene99FlatVectorsWriter gets the float[] vector and makes a copy
> of it (does this no matter what) and passes on to the next part of the
> chain
>  - If quantizing, the next part of the chain is
> Lucene99ScalarQuantizedVectorsWriter.FieldsWriter, which only keeps a
> REFERENCE to the array, it doesn't copy it. The float vector array is
> then passed to the HNSW indexer (if it's being used), which also does
> NOT copy, but keeps a reference.
>  - If not quantizing but indexing, Lucene99FlatVectorsWriter will pass
> it directly to the hnsw indexer, which does not copy it, but does add
> it to the HNSW graph
>
> > I wonder if there is an opportunity to move some of this off-heap?
>
> I think we could do some things off-heap in the ScalarQuantizer. Maybe
> even during "flush", but we would have to adjust the interfaces some
> so that the scalarquantizer can know where the vectors are being
> stored after the initial flush. Right now there is no way to know the
> file nor file handle.
>
> > I can imagine that when we requantize we need to scan all the vectors to 
> > determine the new quantization settings?
>
> We shouldn't be scanning every vector. We do take a sampling, though
> that sampling can be large. There is here an opportunity for off-heap
> action if possible. Though I don't know how we could do that before
> flush. I could see the off-heap idea helping on merge.
>
> > Maybe we could do two passes - merge the float vectors while recalculating, 
> > and then re-scan to do the actual quantization?
>
> I am not sure what you mean here by "merge the float vectors". If you
> mean simply reading the individual float vector files and combining
> them into a single file, we already do that separately from
> quantizing.
>
> Thank you for digging into this. Glad others are experimenting!
>
> Ben
>
> On Wed, Jun 12, 2024 at 8:57 AM Michael Sokolov  wrote:
> >
> > Hi folks. I've been experimenting with our new scalar quantization
> > support - yay, thanks for adding it! I'm finding that when I index a
> > large number of large vectors, enabling quantization (vs simply
> > indexing the full-width floats) requires more heap - I keep getting
> > OOMs and have to increase heap size. I took a heap dump, and not
> > surprisingly I found some big arrays of floats and bytes, and the
> > first one I traced was referenced by vector writers involved in a
> > merge (Lucene99FlatVectorsWriter.FieldsWriter.vectors). Is this
> > expected? I wonder if there is an opportunity to move some of this
> > off-heap?  I can imagine that when we requantize we need to scan all
> > the vectors to determine the new quantization settings?  Maybe we
> > could do two passes - merge the float vectors while recalculating, and
> > then re-scan to do the actual quantization?
> >



Re: scalar quantization heap usage during merge

2024-06-12 Thread Benjamin Trent
Heya Michael,

> the first one I traced was referenced by vector writers involved in a merge 
> (Lucene99FlatVectorsWriter.FieldsWriter.vectors). Is this expected?

Yes, that is holding the raw floats before flush. You should see
nearly the exact same overhead there as you would indexing raw
vectors. I would be surprised if there is a significant memory usage
difference due to Lucene99FlatVectorsWriter when using quantized vs.
not.

The flow is this:

 - Lucene99FlatVectorsWriter gets the float[] vector and makes a copy
of it (does this no matter what) and passes on to the next part of the
chain
 - If quantizing, the next part of the chain is
Lucene99ScalarQuantizedVectorsWriter.FieldsWriter, which only keeps a
REFERENCE to the array, it doesn't copy it. The float vector array is
then passed to the HNSW indexer (if it's being used), which also does
NOT copy, but keeps a reference.
 - If not quantizing but indexing, Lucene99FlatVectorsWriter will pass
it directly to the hnsw indexer, which does not copy it, but does add
it to the HNSW graph
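
The flow above can be sketched with toy classes to show why the quantized path should not add another full copy of the raw floats. This is an illustration only, in Python rather than Java: `FlatWriter` and `ReferenceHolder` are hypothetical stand-ins and do not mirror the real Lucene classes or their fields.

```python
class FlatWriter:
    """Toy stand-in for the flat (raw float) writer: copies each vector once."""

    def __init__(self, downstream=None):
        self.vectors = []
        self.downstream = downstream

    def add(self, vector):
        copy = list(vector)              # the one copy, made unconditionally
        self.vectors.append(copy)
        if self.downstream is not None:
            self.downstream.add(copy)    # hand the same object down the chain


class ReferenceHolder:
    """Toy stand-in for a later stage (quantizing writer or graph builder)."""

    def __init__(self, downstream=None):
        self.vectors = []
        self.downstream = downstream

    def add(self, vector):
        self.vectors.append(vector)      # keeps a reference, never a copy
        if self.downstream is not None:
            self.downstream.add(vector)


# Quantizing chain: flat writer -> quantizing writer -> graph builder.
hnsw = ReferenceHolder()
quantizer = ReferenceHolder(downstream=hnsw)
flat = FlatWriter(downstream=quantizer)

caller_vector = [0.1, 0.2, 0.3]
flat.add(caller_vector)

# All three stages share one list object, distinct from the caller's.
print(flat.vectors[0] is quantizer.vectors[0] is hnsw.vectors[0])
```

In this model the heap cost of the raw floats is the same with or without the quantizing stage, which matches the expectation that any extra memory comes from elsewhere.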

> I wonder if there is an opportunity to move some of this off-heap?

I think we could do some things off-heap in the ScalarQuantizer. Maybe
even during "flush", but we would have to adjust the interfaces some
so that the scalarquantizer can know where the vectors are being
stored after the initial flush. Right now there is no way to know the
file nor file handle.

> I can imagine that when we requantize we need to scan all the vectors to 
> determine the new quantization settings?

We shouldn't be scanning every vector. We do take a sampling, though
that sampling can be large. There is an opportunity here for off-heap
action, if possible, though I don't know how we could do that before
flush. I could see the off-heap idea helping on merge.
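
As a rough illustration of sampling-based quantization settings, the sketch below estimates clipping bounds from a random sample of vectors and then maps floats to small integers. It is a simplified model under stated assumptions, not Lucene's ScalarQuantizer: `sample_bounds`, `quantize`, the 0.99 confidence interval and the 7-bit range are all choices made for this example.

```python
import random


def sample_bounds(vectors, sample_size, confidence=0.99, seed=0):
    """Estimate (lower, upper) clipping bounds from a sample of the vectors."""
    rng = random.Random(seed)
    if len(vectors) <= sample_size:
        sample = vectors
    else:
        sample = rng.sample(vectors, sample_size)
    # Only the sampled values are held in memory and sorted.
    values = sorted(v for vec in sample for v in vec)
    # Drop an equal share of outliers from each tail.
    cut = int(len(values) * (1.0 - confidence) / 2.0)
    return values[cut], values[len(values) - 1 - cut]


def quantize(vector, lower, upper, bits=7):
    """Clip each component to [lower, upper] and scale to 0..2^bits - 1."""
    levels = (1 << bits) - 1
    scale = levels / (upper - lower)
    return [round((min(max(v, lower), upper) - lower) * scale) for v in vector]


# 1000 two-dimensional vectors; bounds come from a sample of 100 of them.
vectors = [[i / 100.0, -i / 100.0] for i in range(1000)]
lower, upper = sample_bounds(vectors, sample_size=100)
print(quantize(vectors[500], lower, upper))
```

The point of the sketch is that the bounds depend only on the sample, so the memory needed to compute them grows with the sample size rather than with the total number of vectors.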

> Maybe we could do two passes - merge the float vectors while recalculating, 
> and then re-scan to do the actual quantization?

I am not sure what you mean here by "merge the float vectors". If you
mean simply reading the individual float vector files and combining
them into a single file, we already do that separately from
quantizing.

Thank you for digging into this. Glad others are experimenting!

Ben

On Wed, Jun 12, 2024 at 8:57 AM Michael Sokolov  wrote:
>
> Hi folks. I've been experimenting with our new scalar quantization
> support - yay, thanks for adding it! I'm finding that when I index a
> large number of large vectors, enabling quantization (vs simply
> indexing the full-width floats) requires more heap - I keep getting
> OOMs and have to increase heap size. I took a heap dump, and not
> surprisingly I found some big arrays of floats and bytes, and the
> first one I traced was referenced by vector writers involved in a
> merge (Lucene99FlatVectorsWriter.FieldsWriter.vectors). Is this
> expected? I wonder if there is an opportunity to move some of this
> off-heap?  I can imagine that when we requantize we need to scan all
> the vectors to determine the new quantization settings?  Maybe we
> could do two passes - merge the float vectors while recalculating, and
> then re-scan to do the actual quantization?
>



scalar quantization heap usage during merge

2024-06-12 Thread Michael Sokolov
Hi folks. I've been experimenting with our new scalar quantization
support - yay, thanks for adding it! I'm finding that when I index a
large number of large vectors, enabling quantization (vs simply
indexing the full-width floats) requires more heap - I keep getting
OOMs and have to increase heap size. I took a heap dump, and not
surprisingly I found some big arrays of floats and bytes, and the
first one I traced was referenced by vector writers involved in a
merge (Lucene99FlatVectorsWriter.FieldsWriter.vectors). Is this
expected? I wonder if there is an opportunity to move some of this
off-heap?  I can imagine that when we requantize we need to scan all
the vectors to determine the new quantization settings?  Maybe we
could do two passes - merge the float vectors while recalculating, and
then re-scan to do the actual quantization?
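
To make the two-pass idea concrete, here is a minimal sketch in plain Java. It is not Lucene's implementation: the class and method names are hypothetical, it uses simple min/max bounds as a stand-in for whatever statistics the real quantizer gathers, and it assumes the vectors can be iterated twice (for example, re-read from disk) so that neither pass needs all of them on heap.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical two-pass scalar quantizer; not Lucene's actual implementation. */
public class TwoPassQuantizer {

    /** Pass 1: stream over the vectors once, keeping only the running min and max. */
    public static float[] scanBounds(Iterable<float[]> vectors) {
        float min = Float.POSITIVE_INFINITY;
        float max = Float.NEGATIVE_INFINITY;
        for (float[] v : vectors) {
            for (float x : v) {
                min = Math.min(min, x);
                max = Math.max(max, x);
            }
        }
        return new float[] {min, max};
    }

    /** Pass 2: map one vector to 7-bit values (0..127) using the bounds from pass 1. */
    public static byte[] quantize(float[] v, float min, float max) {
        float range = max - min;
        // Assumption: a degenerate range maps every component to 0.
        float scale = range == 0f ? 0f : 127f / range;
        byte[] out = new byte[v.length];
        for (int i = 0; i < v.length; i++) {
            float clamped = Math.max(min, Math.min(max, v[i]));
            out[i] = (byte) Math.round((clamped - min) * scale);
        }
        return out;
    }

    public static void main(String[] args) {
        // In a real merge the second loop would re-read the vectors rather than hold them.
        List<float[]> vectors = List.of(new float[] {-1f, 0f}, new float[] {0.5f, 1f});
        float[] bounds = scanBounds(vectors);
        for (float[] v : vectors) {
            System.out.println(Arrays.toString(quantize(v, bounds[0], bounds[1])));
        }
    }
}
```

Only one vector at a time is live in either pass, which is the property the question above is after; whether the real merge path can be restructured this way is a separate matter.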




Trading high latency for better throughput

2024-06-12 Thread Gautam Worah
Hi folks,

I was wondering if people had thought about this problem before.
If a search system has clients that can tolerate high latency, is
there a way to increase their latency in exchange for better overall system
throughput, given that other clients still demand reasonable latency
and high throughput?

In the Lucene practical sense, one possible way to do this would be to
limit the concurrency of a query by limiting the number of Slices (by
changing the segment->Slice geometry) and seeing the effect on throughput.
Such a setup should ideally save on thread synchronization costs and the
remaining threads can serve other queries.
However, taking a TopScoreDocCollector as an example, what could end up
happening is that the minScore in the PQ would not go up fast enough, and
the Collector may end up over collecting low-quality hits.
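
As a rough illustration of bounding the slice count, here is a minimal sketch in plain Java. It is a model only: it does not touch any Lucene API, the class and method names are hypothetical, and segments are represented just by their doc counts. It also does nothing about the minScore concern mentioned above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical model of capping per-query concurrency by bounding the slice count. */
public class SliceLimiter {

    /**
     * Distributes segments (given by doc count) over at most maxSlices slices.
     * Greedy: largest segment first, always into the currently lightest slice.
     */
    public static List<List<Integer>> group(List<Integer> segmentDocCounts, int maxSlices) {
        if (maxSlices < 1) {
            throw new IllegalArgumentException("maxSlices must be >= 1");
        }
        int n = Math.min(maxSlices, segmentDocCounts.size());
        List<List<Integer>> slices = new ArrayList<>();
        long[] load = new long[n];
        for (int i = 0; i < n; i++) {
            slices.add(new ArrayList<>());
        }

        List<Integer> sorted = new ArrayList<>(segmentDocCounts);
        sorted.sort(Comparator.reverseOrder());
        for (int docs : sorted) {
            // Find the slice with the smallest total doc count so far.
            int lightest = 0;
            for (int i = 1; i < n; i++) {
                if (load[i] < load[lightest]) {
                    lightest = i;
                }
            }
            slices.get(lightest).add(docs);
            load[lightest] += docs;
        }
        return slices;
    }

    public static void main(String[] args) {
        // Five segments searched by at most two threads.
        System.out.println(group(List.of(500, 300, 200, 100, 50), 2));
    }
}
```

With one thread per slice, a latency-tolerant client could be given a small maxSlices so that its queries occupy fewer threads, leaving the rest for latency-sensitive traffic.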

Are there any other solutions in this space?
It's like having too much of a different currency and wondering if there is
a way to convert it to a currency that you want :)

Appreciate any insights or thoughts. Thanks!

-
Gautam Worah.


Delete stale Jenkins CI builds

2024-06-11 Thread Dawid Weiss
Something has happened to Jenkins and it triggered old builds to come back
to life. This generates noise and doesn't add any value, I think.

Similar to what David Smiley suggested on the Solr mailing list, I suggest we
remove old builds from Jenkins [1], leaving only the latest release and 9x,
main branches.

If anybody knows of a reason to keep those builds, please shout out.

Dawid

[1] https://ci-builds.apache.org/job/Lucene/


Re: [JENKINS] Lucene » Lucene-NightlyTests-9.0 - Build # 139 - Still Failing!

2024-06-11 Thread Dawid Weiss
I disabled this build.

On Mon, Jun 10, 2024 at 6:08 PM Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> Build:
> https://ci-builds.apache.org/job/Lucene/job/Lucene-NightlyTests-9.0/139/
>
> No tests ran.
>
> Build Log:
> [...truncated 10 lines...]
> FATAL: The Gradle wrapper has not been found in these directories:
> /home/jenkins/jenkins-slave/workspace/Lucene/Lucene-NightlyTests-9.0/checkout
> Build step 'Invoke Gradle script' marked build as failure
> Archiving artifacts
> Recording test results
> ERROR: Step ‘Publish JUnit test result report’ failed: No test report
> files were found. Configuration error?
> Email was triggered for: Failure - Any
> Sending email for trigger: Failure - Any
>


Re: [JENKINS] Lucene-main-MacOSX (64bit/hotspot/jdk-21.0.1) - Build # 11448 - Still Failing!

2024-06-10 Thread Dawid Weiss
Uwe - something has been failing fairly regularly on the Mac VM (and on
openj9). I again wonder how much sense these builds make. We can run a
MacOS build on github and openj9 has been... unreliable to say the least.

I'm afraid people will stop looking at errors if we have too many false
positives.

Dawid

On Tue, Jun 11, 2024 at 2:13 AM Policeman Jenkins Server <
jenk...@thetaphi.de> wrote:

> Build: https://jenkins.thetaphi.de/job/Lucene-main-MacOSX/11448/
> Java: 64bit/hotspot/jdk-21.0.1 -XX:+UseCompressedOops -XX:+UseParallelGC
>
> No tests ran.
>


Re: Intellij build/test times

2024-06-10 Thread Dawid Weiss
I've taken a look at intellij source code and I don't think the default
annotation preprocessor location can be easily redirected in "intellij
compilation" mode. This personally doesn't bother me much but maybe you're
right that the official developer docs shouldn't make people think too hard
and assume gradle is used both from command line and the IDE - I honestly
don't know... IDEs are a moving target, it's really difficult to keep up
with the changes, eh.

D.

On Mon, Jun 10, 2024 at 5:07 PM Dawid Weiss  wrote:

>
>
>> If I set IJ build/test to "gradle" and then right click on "core" in
>> the Project tab -- it gives an option like "run tests in
>> lucene-root.lucene.core" which works.
>
>
> It works because you're running all tests in a module.
>
>
>> At the very top (lucene
>> [lucene-root]) of the hierarchy you can right-click and select "run
>> all tests", but this fails with "Error running 'All in lucene-root':
>> No junit.jar". I thought this had once worked, but maybe I was only
>> running tests in core?
>>
>
> It can't work in the current setup - the topmost project isn't a Java
> module. Maybe it worked at some point when we had ant-generated intellij
> files (which aggregated everything on the same classpath)? I honestly can't
> remember.
>
> Honestly, if you want to run everything, add a gradle configuration and
> run it there - it'll be faster than a sequential run from the IDE.
>
> Dawid
>


Re: Intellij build/test times

2024-06-10 Thread Dawid Weiss
> If I set IJ build/test to "gradle" and then right click on "core" in
> the Project tab -- it gives an option like "run tests in
> lucene-root.lucene.core" which works.


It works because you're running all tests in a module.


> At the very top (lucene
> [lucene-root]) of the hierarchy you can right-click and select "run
> all tests", but this fails with "Error running 'All in lucene-root':
> No junit.jar". I thought this had once worked, but maybe I was only
> running tests in core?
>

It can't work in the current setup - the topmost project isn't a Java
module. Maybe it worked at some point when we had ant-generated intellij
files (which aggregated everything on the same classpath)? I honestly can't
remember.

Honestly, if you want to run everything, add a gradle configuration and run
it there - it'll be faster than a sequential run from the IDE.

Dawid


Re: Intellij build/test times

2024-06-10 Thread Michael Sokolov
If I set IJ build/test to "gradle" and then right click on "core" in
the Project tab -- it gives an option like "run tests in
lucene-root.lucene.core" which works. At the very top (lucene
[lucene-root]) of the hierarchy you can right-click and select "run
all tests", but this fails with "Error running 'All in lucene-root':
No junit.jar". I thought this had once worked, but maybe I was only
running tests in core?

On Mon, Jun 10, 2024 at 9:37 AM Dawid Weiss  wrote:
>>
>> When I say "run in IJ" I mean I right clicked a button somewhere and said 
>> "run all tests" :) I expect it was with the gradle runner selected.
>
>
> When you find that button, let me know. It's probably right next to the Holy 
> Grail. ;)
>
> Dawid




Re: Intellij build/test times

2024-06-10 Thread Dawid Weiss
>
> When I say "run in IJ" I mean I right clicked a button somewhere and said
> "run all tests" :) I expect it was with the gradle runner selected.
>

When you find that button, let me know. It's probably right next to the
Holy Grail. ;)

Dawid



Re: Intellij build/test times

2024-06-10 Thread Michael Sokolov
>> Yet I feel certain I have been able to run all tests in IJ before.
>
> I don't think this was ever the case with intellij. Or maybe you ran those
> tests via gradle?


When I say "run in IJ" I mean I right clicked a button somewhere and said
"run all tests" :) I expect it was with the gradle runner selected.


On Mon, Jun 10, 2024 at 6:38 AM Dawid Weiss  wrote:

>
>> Yet I feel certain I have been able to run all tests in IJ before.
>>
>
> I don't think this was ever the case with intellij. Or maybe you ran those
> tests via gradle?
>
>
>> There are a few oddities that happen in intellij that require you to
>> fiddle with the build in odd ways, but I wonder if these will be
>> reproducible or if they maybe happen because there is some bad state:
>>
>
> Intellij changes from version to version so there is no "one" version,
> unfortunately. Also, sometimes
> some of the settings intellij sets up on the initial import persist and
> 'reloading' the gradle plugin does not
> help to update them. An occasional import from scratch is a good way to
> check if something like this happens.
>
>> 1. Building branch_9x with intellij builder selected in the gradle
>> settings failed to build the benchmark module due to some modules not
>> being visible to it (e.g. icu). So I "unload module benchmark"
>> effectively skipping building that, and then I am able to build the
>> rest of lucene. YMMV
>>
>
> I can't reproduce this on main, haven't tried 9x.
>
>
>> 2. After switching back to main branch, I got a build failure  "error:
>> Annotation generator had thrown the exception.
>> javax.annotation.processing.FilerException: Attempt to recreate a file
>> for type
>> org.apache.lucene.benchmark.jmh.jmh_generated.ExpressionsBenchmark_expression_jmhTest".
>> I see there are some generated classes in
>> lucene/benchmark_jmh/src/java/generated, that show up in git status,
>> so I remove that folder and then everything is fine - some cruft left
>> from a previous build?
>>
>
> This is intellij's compiler emitting annotation processor output (jmh) to
> an incorrect location.
> It's javac's '-s' option. Not sure how to configure this option so that
> intellij picks it up from the gradle build model.
>
>
>> Side note: when running all tests in "intellij" mode you cannot do it
>> by selecting the "core" module - you have to navigate down to the
>> "tests" folder.
>
>
> Correct. This is the classpath-container unit which contains tests.
>
>
>> Also I observed that when running tests in "gradle"
>> mode I no longer observed the slow startup times? Really unsure what
>> that means. Maybe some networking thing?
>>
>
> Or the daemon starting for the first time - this is relatively expensive.
> Once the daemon is up, launch times should
> be faster.
>
>
>> But the main thing I learned is that while running tests using
>> intellij builder mostly works, MemorySegmentIndexInputProvider fails
>> to get loaded and any test using MMapDirectory will fail, regardless
>> of whether I run a single test or a whole suite. This is true on both
>> 9x and main branches and causes 1/3-1/2 of tests to fail in core.
>>
>
> This class is in src/java21 - it's not picked up as a source folder by
> intellij. And if you add it manually, you'll get errors related to
> compilation because of the way the gradle build "cheats" javac and bypasses
> explicitly importing
> jdk.incubator.vector and not declaring --enable-preview...
>
> In short: yes, it'll be difficult to work around that, especially with an
> automatic project import in intellij (perhaps you could hand-craft
> configuration files so that it works, I'm not sure).
>
>
>> At this point I'm reluctant to recommend using the intellij build
>> mode. Maybe it will become viable again if we can figure out how to
>> get MMapDirectory tests to work with it?
>>
>
> I use it because it's much faster. Whenever I need something  more
> complex, I set up a dedicated gradle launch configuration for that - like
> so:
>
> [image: image.png]
>
> I'm neither gradle nor intellij expert though, it's mostly a
> trial-and-error of what works and what doesn't...
>
> D.
>


Re: Intellij build/test times

2024-06-10 Thread Dawid Weiss
> Yet I feel certain I have been able to run all tests in IJ before.
>

I don't think this was ever the case with intellij. Or maybe you ran those
tests via gradle?


> There are a few oddities that happen in intellij that require you to
> fiddle with the build in odd ways, but I wonder if these will be
> reproducible or if they maybe happen because there is some bad state:
>

Intellij changes from version to version so there is no "one" version,
unfortunately. Also, sometimes
some of the settings intellij sets up on the initial import persist and
'reloading' the gradle plugin does not
help to update them. An occasional import from scratch is a good way to
check if something like this happens.

> 1. Building branch_9x with intellij builder selected in the gradle
> settings failed to build the benchmark module due to some modules not
> being visible to it (e.g. icu). So I "unload module benchmark"
> effectively skipping building that, and then I am able to build the
> rest of lucene. YMMV
>

I can't reproduce this on main, haven't tried 9x.


> 2. After switching back to main branch, I got a build failure  "error:
> Annotation generator had thrown the exception.
> javax.annotation.processing.FilerException: Attempt to recreate a file
> for type
> org.apache.lucene.benchmark.jmh.jmh_generated.ExpressionsBenchmark_expression_jmhTest".
> I see there are some generated classes in
> lucene/benchmark_jmh/src/java/generated, that show up in git status,
> so I remove that folder and then everything is fine - some cruft left
> from a previous build?
>

This is intellij's compiler emitting annotation processor output (jmh) to
an incorrect location.
It's javac's '-s' option. Not sure how to configure this option so that
intellij picks it up from the gradle build model.


> Side note: when running all tests in "intellij" mode you cannot do it
> by selecting the "core" module - you have to navigate down to the
> "tests" folder.


Correct. This is the classpath-container unit which contains tests.


> Also I observed that when running tests in "gradle"
> mode I no longer observed the slow startup times? Really unsure what
> that means. Maybe some networking thing?
>

Or the daemon starting for the first time - this is relatively expensive.
Once the daemon is up, launch times should
be faster.


> But the main thing I learned is that while running tests using
> intellij builder mostly works, MemorySegmentIndexInputProvider fails
> to get loaded and any test using MMapDirectory will fail, regardless
> of whether I run a single test or a whole suite. This is true on both
> 9x and main branches and causes 1/3-1/2 of tests to fail in core.
>

This class is in src/java21 - it's not picked up as a source folder by
intellij. And if you add it manually, you'll get errors related to
compilation because of the way the gradle build "cheats" javac and bypasses
explicitly importing
jdk.incubator.vector and not declaring --enable-preview...

In short: yes, it'll be difficult to work around that, especially with an
automatic project import in intellij (perhaps you could hand-craft
configuration files so that it works, I'm not sure).


> At this point I'm reluctant to recommend using the intellij build
> mode. Maybe it will become viable again if we can figure out how to
> get MMapDirectory tests to work with it?
>

I use it because it's much faster. Whenever I need something  more complex,
I set up a dedicated gradle launch configuration for that - like so:

[image: image.png]

I'm neither gradle nor intellij expert though, it's mostly a
trial-and-error of what works and what doesn't...

D.


JDK 23 Feature Freeze / New Loom EA builds

2024-06-10 Thread David Delabassee
Welcome to the latest OpenJDK Quality Outreach update!

JDK 23, scheduled for General Availability on September 17, 2024, is now in 
Rampdown Phase One (RDP1) [1]. At this point, the overall JDK 23 feature set is 
frozen (see the final list of JEPs integrated into JDK 23 below) and only 
low-risk enhancements might still be considered. The coming weeks should be 
leveraged to identify and resolve as many issues as possible, i.e. before JDK 
23 enters the Release Candidates phase in early August [2]. We count on you to 
test your projects and help us make JDK 23 another solid release!

This time, we are covering two heads-ups related to JDK 23: the deprecation of 
the memory-access methods in sun.misc.Unsafe for removal, and the change to the 
default annotation processing policy. Also, make sure to check the new Loom 
early-access builds, which have an improved Java monitors implementation that 
works better with virtual threads.

[1] https://mail.openjdk.org/pipermail/jdk-dev/2024-June/009053.html
[2] https://openjdk.org/projects/jdk/23/


## Heads-Up - JDK 23: Deprecate the Memory-Access Methods in sun.misc.Unsafe 
for Removal

As mentioned in a previous communication [3], there’s a plan to ultimately 
remove the sun.misc.Unsafe memory-access methods as the platform offers safer 
alternatives. JEP 471 (Deprecate the Memory-Access Methods in sun.misc.Unsafe 
for Removal) [4] outlines in more detail this plan including the initial step 
which is happening in JDK 23, i.e., all of the sun.misc unsafe memory-access 
methods are now marked as deprecated for removal. This will cause, in JDK 23, 
compile-time deprecation warnings for code that refers to these methods, 
alerting library developers to their forthcoming removal. A new command-line 
option also enables application developers and users to receive runtime 
warnings when those methods are used.

Developers relying on those sun.misc.Unsafe APIs to access memory are strongly 
encouraged to start, if they haven't done so yet, the migration from the 
sun.misc.Unsafe APIs to supported replacements. For more details, make sure to 
read JEP 471 (Deprecate the Memory-Access Methods in sun.misc.Unsafe for 
Removal).

[3] https://mail.openjdk.org/pipermail/quality-discuss/2024-January/001132.html
[4] https://openjdk.org/jeps/471


## Heads-Up - JDK 23: Changes Default Annotation Processing Policy

Annotation processing is a compile-time feature, where javac scans the 
to-be-compiled source files for annotations and then the class path for 
matching annotation processors, so they can generate source code. Up to JDK 22, 
this feature is enabled by default, which may have been reasonable when it was 
introduced in JDK 6 circa 2006, but from a current perspective, in the interest 
of making build output more robust against annotation processors being placed 
on the class path unintentionally, this is much less reasonable. Hence, 
starting with JDK 23, javac requires an additional command-line option to 
enable annotation processing.

### New `-proc` Value
To that end, the pre-existing option `-proc:$policy` was extended, where 
`$policy` can now have the following values:
- `none`: compilation _without_ annotation processing; this policy has existed 
since JDK 6
- `only`: annotation processing _without_ compilation; this policy has existed 
since JDK 6
- `full`: annotation processing followed by compilation, this policy is the 
default in JDK ≤22 but the value itself is new (see next section for versions 
that support it)

Up to and including JDK 22, code bases that require annotation processing 
before compilation could rely on javac's default behavior to process 
annotations but that is no longer the case. Starting with JDK 23, at least one 
annotation-processing command line option needs to be present. If neither 
`-processor`, `--processor-path`, nor `--processor-module-path` is used, 
`-proc:only` or `-proc:full` has to be provided. In other words, absent other 
command line options, `-proc:none` is the default on JDK 23.
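
The rule above can be sketched as a small decision function. This is only a model of the paragraph, not javac's own logic: it does not invoke the compiler, the class and method names are hypothetical, it only recognizes the three processor options named above, and it assumes that an explicit `-proc:` policy takes precedence over them.

```java
import java.util.List;

/** Hypothetical model of the JDK 23 rule described above; it does not invoke javac. */
public class ProcPolicy {

    /** Returns true if annotation processing would run for these javac arguments on JDK 23. */
    public static boolean runsAnnotationProcessing(List<String> javacArgs) {
        boolean processorOption = false;
        String policy = null;
        for (String arg : javacArgs) {
            if (arg.equals("-processor")
                    || arg.equals("--processor-path")
                    || arg.equals("--processor-module-path")) {
                processorOption = true;
            } else if (arg.startsWith("-proc:")) {
                policy = arg.substring("-proc:".length());
            }
        }
        if (policy != null) {
            // Assumption: an explicit policy wins; "only" and "full" enable processing.
            return policy.equals("only") || policy.equals("full");
        }
        // Without -proc, processing runs only if a processor option is present.
        return processorOption;
    }

    public static void main(String[] args) {
        System.out.println(runsAnnotationProcessing(List.of("-d", "out")));
        System.out.println(runsAnnotationProcessing(List.of("-proc:full", "-d", "out")));
    }
}
```

A build that relied on the old default corresponds to the first call in `main`, which is the case that changes behavior on JDK 23.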

### Migration to `-proc:full`

Several measures were undertaken to help projects prepare for the switch to 
`-proc:full`:
- As of the April 2024 JDK security updates, support for `-proc:full` has been 
backported to 17u (17.0.11) and 11u (11.0.23) for both Oracle JDK and OpenJDK 
distributions. Additionally, Oracle's 8u release (8u411) also supports 
`-proc:full`.
- Starting in JDK 21, javac prints an informative message if implicit usage of 
annotation processing under the default policy is detected.

With `-proc:full` backported, it is possible to configure a build that will 
work the same before and after the change in javac's default policy.

Additional details can be found in the original proposal [5].

[5] https://mail.openjdk.org/pipermail/jdk-dev/2024-May/009028.html


## Heads-up - Loom: New EA builds with improved Java monitors implementation to 
work better with virtual threads

Project Loom published new early-access builds [6]. These builds have an 
improved 

Re: Intellij build/test times

2024-06-09 Thread Michael Sokolov
OK, I can see how the directory structure might be at odds
w/intellij's view of the world. Yet I feel certain I have been able to
run all tests in IJ before.

Just to disconfirm my insanity I tried again building and running all
tests in core on branch_9x/main using both intellij and gradle
build/test options (both from within intellij).

There are a few oddities that happen in intellij that require you to
fiddle with the build in odd ways, but I wonder if these will be
reproducible or if they maybe happen because there is some bad state:

1. Building branch_9x with intellij builder selected in the gradle
settings failed to build the benchmark module due to some modules not
being visible to it (e.g. icu). So I "unload module benchmark"
effectively skipping building that, and then I am able to build the
rest of lucene. YMMV
2. After switching back to main branch, I got a build failure  "error:
Annotation generator had thrown the exception.
javax.annotation.processing.FilerException: Attempt to recreate a file
for type 
org.apache.lucene.benchmark.jmh.jmh_generated.ExpressionsBenchmark_expression_jmhTest".
I see there are some generated classes in
lucene/benchmark_jmh/src/java/generated, that show up in git status,
so I remove that folder and then everything is fine - some cruft left
from a previous build?

Side note: when running all tests in "intellij" mode you cannot do it
by selecting the "core" module - you have to navigate down to the
"tests" folder. Also I observed that when running tests in "gradle"
mode I no longer observed the slow startup times? Really unsure what
that means. Maybe some networking thing?

But the main thing I learned is that while running tests using
intellij builder mostly works, MemorySegmentIndexInputProvider fails
to get loaded and any test using MMapDirectory will fail, regardless
of whether I run a single test or a whole suite. This is true on both
9x and main branches and causes 1/3-1/2 of tests to fail in core.

At this point I'm reluctant to recommend using the intellij build
mode. Maybe it will become viable again if we can figure out how to
get MMapDirectory tests to work with it?

On Sat, Jun 8, 2024 at 4:06 PM Dawid Weiss  wrote:
>
>
>> By the way, the
>> classpath problems seem to occur with either method (gradle or
>> intellij) when running entire suite - I just got confused while switching
>> back and forth. This is on main, haven't tried 9x recently
>
>
> Some of these headaches are caused by Lucene's folder structure and have been 
> there forever - resources mixed with source classes. I don't know if you can 
> make intellij use a folder as a resource and as source directory at the same 
> time - I don't think it's possible. If so, tests that rely on these resources 
> will fail. It's been this way since I remember - nothing has changed here.
>
> There is also a lot of trickery involving modular paths etc. I don't think 
> it'll be easy to simulate this in Intellij. Then, 99% of test cases will run 
> just fine from intellij without any special hacks (I think)...
>
> I'd say - run individual tests from intellij, add a test launching config 
> redirecting to gradle for the whole suite - it should also be faster this way 
> since tests will run in parallel between modules.
>
> D.




Re: Intellij build/test times

2024-06-08 Thread Dawid Weiss
> By the way, the
> classpath problems seem to occur with either method (gradle or
> intellij) when running entire suite - I just got confused while switching
> back and forth. This is on main, haven't tried 9x recently
>

Some of these headaches are caused by Lucene's folder structure and have
been there forever - resources mixed with source classes. I don't know if
you can make intellij use a folder as a resource and as source directory at
the same time - I don't think it's possible. If so, tests that rely on
these resources will fail. It's been this way since I remember - nothing
has changed here.

There is also a lot of trickery involving modular paths etc. I don't think
it'll be easy to simulate this in Intellij. Then, 99% of test cases will
run just fine from intellij without any special hacks (I think)...

I'd say - run individual tests from intellij, add a test launching config
redirecting to gradle for the whole suite - it should also be faster this
way since tests will run in parallel between modules.

D.


Re: Intellij build/test times

2024-06-08 Thread Michael Sokolov
Indeed, it's when I run multiple tests that I see the problems.
Running single test classes seems to work OK. In the past I have been
able to run the entire test suite, but I agree this is less critical
than being able to debug single tests. Cursory internet search
indicates the problem is widespread and others propose using the same
plan - don't use gradle test runner in intellij. By the way, the
classpath problems seem to occur with either method (gradle or
intellij) when running entire suite - I just got confused while switching
back and forth. This is on main, haven't tried 9x recently

On Fri, Jun 7, 2024 at 4:05 PM Dawid Weiss  wrote:
>
>
> Hi Mike,
>
> Are you trying to run all the tests from Lucene from IntelliJ? I admit I 
> haven't tried that... :) I usually use intellij for running/ debugging 
> isolated classes, then rerun the full suite from command line (increased 
> parallelism). I don't think everything will work - if something needs a 
> specific setup done by gradle tasks or has resources under src, where they're 
> not seen as resources by intellij and thus not copied - tough luck. But most 
> stuff should work.
>
> Running via gradle is slow for me not just with Lucene but also with other 
> projects... I can take a look but I'm pessimistic I can do any wonders here.
>
> Dawid
>
> On Fri, Jun 7, 2024 at 6:06 PM Michael Sokolov  wrote:
>>
>> I'm also getting errors like:
>>
>> Caused by: java.lang.ExceptionInInitializerError: Exception
>> java.lang.LinkageError: MemorySegmentIndexInputProvider is missing in
>> Lucene JAR file [in thread
>> "TEST-TestDemo.testDemo-seed#[872544629C2881C6]"]
>>
>> I wonder if this is due to some kind of module permissions thing
>> controlling the visibility of these symbols?
>>
>> On Fri, Jun 7, 2024 at 11:53 AM Michael Sokolov  wrote:
>> >
>> > hm I found FakeCharFilterFactory in src/test/META-INF.services -- it's
>> > in a "test sources root" folder and won't allow itself to be set as a
>> > resources folder? hm even after fiddling with this - I finally get to
>> > mark it as "test resources root" my test is still not passing. This
>> > can't be this hard!
>> >
>> > On Fri, Jun 7, 2024 at 11:44 AM Michael Sokolov  wrote:
>> > >
>> > > hmm so after playing around with this Intellij build for a bit I ran
>> > > into some trouble -- all the tests relying on SPI seemed to start
>> > > failing. So then I switched back to build with Gradle and rebuild the
>> > > project and these tests passed. Just to double check there wasn't some
>> > > strange stale build problem, I then switched back again to IntelliJ
>> > > builder and I still see the same failures; example is like:
>> > >
>> > > NOTE: reproduce with: gradlew test --tests
>> > > TestAnalysisSPILoader.testLookupCharFilter
>> > > -Dtests.seed=88A2DA17C6510A33 -Dtests.locale=en-PR
>> > > -Dtests.timezone=Etc/GMT-9 -Dtests.asserts=true
>> > > -Dtests.file.encoding=UTF-8
>> > >
>> > > java.lang.IllegalArgumentException: A SPI class of type
>> > > org.apache.lucene.analysis.CharFilterFactory with name 'Fake' does not
>> > > exist. You need to add the corresponding JAR file supporting this SPI
>> > > to your classpath. The current classpath supports the following names:
>> > > []
>> > >
>> > > I guess there must be some setup required in order to expose the SPI
>> > > resource files to the build? So I checked some of the resources
>> > > folders like lucene/analysis/common/src/resources and sure enough it
>> > > is labeled as a resources folder in intellij UI. So ... what am I
>> > > missing?
>> > >
>> > > On Fri, Jun 7, 2024 at 10:40 AM Michael Sokolov  
>> > > wrote:
>> > > >
>> > > > ok, life must be scary for developers on windows!
>> > > >
>> > > > On Fri, Jun 7, 2024 at 10:33 AM Dawid Weiss  
>> > > > wrote:
>> > > > >
>> > > > >
>> > > > > Certain regenerate tasks do require perl and python indeed.
>> > > > >
>> > > > > On Fri, Jun 7, 2024 at 2:23 PM Michael Sokolov  
>> > > > > wrote:
>> > > > >>
>> > > > >> While editing this CONTRIBUTING.md I found the following statement:
>> > > > >>
>> > > > >> Some build tasks (in particular `./gradlew check`) require Perl
>> > > > >> and Python 3.
>> > > > >>
>> > > > >> Is it actually true that we require Perl?
>> > > > >>
>> > > > >> On Fri, Jun 7, 2024 at 8:11 AM Michael Sokolov  
>> > > > >> wrote:
>> > > > >> >
>> > > > >> > So I'm glad we have a fix for this, but it's making me realize 
>> > > > >> > that
>> > > > >> > any new joiner that uses intellij (probably most of them?) will 
>> > > > >> > have
>> > > > >> > this problem and have no idea what to do about it. They will just
>> > > > >> > conclude - running Lucene tests in intellij sucks. If we revived 
>> > > > >> > that
>> > > > >> > intellij target maybe that would help - but .. you would have to 
>> > > > >> > know
>> > > > >> > to run it! So then I went to look at our project web page to see 
>> > > > >> > what
>> > > > >> > kind of developer docs we have that a new contributor might find.
>> > > > 

Re: Intellij build/test times

2024-06-07 Thread Dawid Weiss
Hi Mike,

Are you trying to run *all* the tests from Lucene from IntelliJ? I admit I
haven't tried that... :) I usually use intellij for running/ debugging
isolated classes, then rerun the full suite from command line (increased
parallelism). I don't think everything will work - if something needs a
specific setup done by gradle tasks or has resources under src, where
they're not seen as resources by intellij and thus not copied - tough luck.
But most stuff should work.

Running via gradle is slow for me not just with Lucene but also with other
projects... I can take a look but I'm pessimistic I can do any wonders here.

Dawid

On Fri, Jun 7, 2024 at 6:06 PM Michael Sokolov  wrote:

> I'm also getting errors like:
>
> Caused by: java.lang.ExceptionInInitializerError: Exception
> java.lang.LinkageError: MemorySegmentIndexInputProvider is missing in
> Lucene JAR file [in thread
> "TEST-TestDemo.testDemo-seed#[872544629C2881C6]"]
>
> I wonder if this is due to some kind of module permissions thing
> controlling the visibility of these symbols?
>
> On Fri, Jun 7, 2024 at 11:53 AM Michael Sokolov 
> wrote:
> >
> > hm I found FakeCharFilterFactory in src/test/META-INF.services -- it's
> > in a "test sources root" folder and won't allow itself to be set as a
> > resources folder? hm even after fiddling with this - I finally get to
> > mark it as "test resources root" my test is still not passing. This
> > can't be this hard!
> >
> > On Fri, Jun 7, 2024 at 11:44 AM Michael Sokolov 
> wrote:
> > >
> > > hmm so after playing around with this Intellij build for a bit I ran
> > > into some trouble -- all the tests relying on SPI seemed to start
> > > failing. So then I switched back to build with Gradle and rebuild the
> > > project and these tests passed. Just to double check there wasn't some
> > > strange stale build problem, I then switched back again to IntelliJ
> > > builder and I still see the same failures; example is like:
> > >
> > > NOTE: reproduce with: gradlew test --tests
> > > TestAnalysisSPILoader.testLookupCharFilter
> > > -Dtests.seed=88A2DA17C6510A33 -Dtests.locale=en-PR
> > > -Dtests.timezone=Etc/GMT-9 -Dtests.asserts=true
> > > -Dtests.file.encoding=UTF-8
> > >
> > > java.lang.IllegalArgumentException: A SPI class of type
> > > org.apache.lucene.analysis.CharFilterFactory with name 'Fake' does not
> > > exist. You need to add the corresponding JAR file supporting this SPI
> > > to your classpath. The current classpath supports the following names:
> > > []
> > >
> > > I guess there must be some setup required in order to expose the SPI
> > > resource files to the build? So I checked some of the resources
> > > folders like lucene/analysis/common/src/resources and sure enough it
> > > is labeled as a resources folder in intellij UI. So ... what am I
> > > missing?
> > >
> > > On Fri, Jun 7, 2024 at 10:40 AM Michael Sokolov 
> wrote:
> > > >
> > > > ok, life must be scary for developers on windows!
> > > >
> > > > On Fri, Jun 7, 2024 at 10:33 AM Dawid Weiss 
> wrote:
> > > > >
> > > > >
> > > > > Certain regenerate tasks do require perl and python indeed.
> > > > >
> > > > > On Fri, Jun 7, 2024 at 2:23 PM Michael Sokolov 
> wrote:
> > > > >>
> > > > >> While editing this CONTRIBUTING.md I found the following
> statement:
> > > > >>
> > > > >> Some build tasks (in particular `./gradlew check`) require
> Perl
> > > > >> and Python 3.
> > > > >>
> > > > >> Is it actually true that we require Perl?
> > > > >>
> > > > >> On Fri, Jun 7, 2024 at 8:11 AM Michael Sokolov <
> msoko...@gmail.com> wrote:
> > > > >> >
> > > > >> > So I'm glad we have a fix for this, but it's making me realize
> that
> > > > >> > any new joiner that uses intellij (probably most of them?) will
> have
> > > > >> > this problem and have no idea what to do about it. They will
> just
> > > > >> > conclude - running Lucene tests in intellij sucks. If we
> revived that
> > > > >> > intellij target maybe that would help - but .. you would have
> to know
> > > > >> > to run it! So then I went to look at our project web page to
> see what
> > > > >> > kind of developer docs we have that a new contributor might
> find.
> > > > >> >
> > > > >> > The first place Google sent me was to our github page
> > > > >> > https://github.com/apache/lucene/?tab=readme-ov-file-- that
> one has
> > > > >> > some very brief description about how to build, but nothing
> about
> > > > >> > intellij. It does have a prominent link to "Developer
> documentation"
> > > > >> > which is here:
> https://github.com/apache/lucene/tree/main/dev-docs but
> > > > >> > that folder is mostly empty; it has a few somewhat esoteric
> bits of
> > > > >> > info, but again nothing basic about building and testing; no
> > > > >> > discussion of all the myriad gradle tasks and deep help info
> that
> > > > >> > exists there.
> > > > >> >
> > > > >> > Next I tried looking on apache.org, but actually it is quite
> hard to
> > > > >> > find any info about Lucene there - Apache 

Experience with a round robin or dynamic collection of docs from segments within a Slice

2024-06-07 Thread Gautam Worah
Hi folks,

I was wondering if people had experimented with something other than the
default Lucene search logic of completing one segment within a Slice at a
time, and then going on to the next segment in sequential order.

Here is the current logic in IndexSearcher#search(List<LeafReaderContext>
leaves, Weight weight, Collector collector):

```
for (LeafReaderContext ctx : leaves) {
  ...
  scorer.score(leafCollector, ctx.reader().getLiveDocs());
  ...
}
```

For sorted indexes (e.g. time sorted data), for a query that asks for the
top-k results, it may be better to round robin among leaves (or expand more
into the leaf that has better values) within a Slice that uses just a
single thread?
In Amazon's Product Search context, our index is sorted in descending order by
a custom Score.
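
To make the intuition concrete, here is a toy model in plain Java. It is only a sketch under stated assumptions: segments are modelled as int arrays of scores already sorted in descending order, "collecting" is a bounded min-heap, and a segment is abandoned as soon as its next score cannot enter the top-k. None of the names below (RoundRobinTopK, offer, sequential, roundRobin, window) are Lucene APIs; they are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

/** Toy model only: segments are plain int arrays, not Lucene leaves. */
class RoundRobinTopK {

  /** The k best scores (descending) plus the number of docs that were scored. */
  static final class Result {
    final List<Integer> topK;
    final int visited;

    Result(List<Integer> topK, int visited) {
      this.topK = topK;
      this.visited = visited;
    }
  }

  /** Offers a score to a min-heap capped at k entries (assumes k >= 1); false means "not competitive". */
  static boolean offer(PriorityQueue<Integer> heap, int k, int score) {
    if (heap.size() < k) {
      heap.add(score);
      return true;
    }
    if (score <= heap.peek()) {
      return false;
    }
    heap.poll();
    heap.add(score);
    return true;
  }

  static Result toResult(PriorityQueue<Integer> heap, int visited) {
    List<Integer> top = new ArrayList<>(heap);
    top.sort(Collections.reverseOrder());
    return new Result(top, visited);
  }

  /** Finishes each segment (up to its early-termination point) before starting the next. */
  static Result sequential(int[][] segments, int k) {
    PriorityQueue<Integer> heap = new PriorityQueue<>();
    int visited = 0;
    for (int[] segment : segments) {
      for (int score : segment) {
        visited++;
        if (!offer(heap, k, score)) {
          break; // segment is sorted descending, so no later doc can compete
        }
      }
    }
    return toResult(heap, visited);
  }

  /** Scores at most `window` docs from each still-live segment per round. */
  static Result roundRobin(int[][] segments, int k, int window) {
    if (window < 1) {
      throw new IllegalArgumentException("window must be >= 1");
    }
    PriorityQueue<Integer> heap = new PriorityQueue<>();
    int[] next = new int[segments.length];        // next unread position per segment
    boolean[] done = new boolean[segments.length]; // terminated or exhausted
    int remaining = segments.length;
    int visited = 0;
    while (remaining > 0) {
      for (int s = 0; s < segments.length; s++) {
        if (done[s]) {
          continue;
        }
        int end = Math.min(next[s] + window, segments[s].length);
        while (next[s] < end && !done[s]) {
          visited++;
          if (!offer(heap, k, segments[s][next[s]++])) {
            done[s] = true; // early termination for this segment
          }
        }
        if (next[s] == segments[s].length) {
          done[s] = true; // segment exhausted
        }
        if (done[s]) {
          remaining--;
        }
      }
    }
    return toResult(heap, visited);
  }

  public static void main(String[] args) {
    int[][] segments = {
      {100, 90, 80, 70, 60, 50},
      {95, 85, 75, 65, 55, 45},
      {99, 89, 79, 69, 59, 49}
    };
    Result seq = sequential(segments, 3);
    Result rr = roundRobin(segments, 3, 1);
    System.out.println("sequential:  top=" + seq.topK + " visited=" + seq.visited);
    System.out.println("round robin: top=" + rr.topK + " visited=" + rr.visited);
  }
}
```

On this sample data both strategies return [100, 99, 95], but the sequential pass scores 8 docs while the round-robin pass with a window of 1 scores 6, because the heap's threshold rises sooner when the best docs of every segment are seen early. Whether that carries over to a real index depends on the data and on per-segment overheads that this model ignores.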

I wanted to experiment with something like:

```
while (isAnyCollectorNotTerminated) { // do a round robin
  for (LeafReaderContext ctx : leaves) {
    ...
    scorer.score(leafCollector, ctx.reader().getLiveDocs(), start, end);
    ...
  }
}
```

However, I have come across two issues so far:
1. DrillSidewaysScorer has a docId limitation of start=0 and
end=Integer.MAX_VALUE
2. FacetsCollector is designed in a way that it needs a leaf to be finished
before it can go on to the next leaf. This does not work with my
round-robin logic.

I am working through them right now.
I was wondering if people had in general tried this approach, and whether
they knew of other problems that might arise, potential re-routes,
performance numbers or any other experiences in general?
Are there any other Collectors that have such design decisions?

I see the current code has a comment like:

```
// TODO: should we make this
// threaded...? the Collector could be sync'd?
```

so I guess there were some ideas around making this logic smarter?

Thanks for the help!

-
Gautam Worah.


Re: Intellij build/test times

2024-06-07 Thread Michael Sokolov
I'm also getting errors like:

Caused by: java.lang.ExceptionInInitializerError: Exception
java.lang.LinkageError: MemorySegmentIndexInputProvider is missing in
Lucene JAR file [in thread
"TEST-TestDemo.testDemo-seed#[872544629C2881C6]"]

I wonder if this is due to some kind of module permissions thing
controlling the visibility of these symbols?
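
One way to narrow this down is to check whether the class is simply absent from the classpath that the IDE builds, as opposed to being hidden by module rules. The sketch below is a generic diagnostic, not part of Lucene: ClassPresenceCheck and isLoadable are made-up names, and the fully qualified class name to probe has to be passed in by whoever runs it. Running it from the same IntelliJ run configuration as the failing test would show what that classpath can actually see.

```java
/** Hypothetical diagnostic: reports whether named classes can be loaded from this classpath. */
class ClassPresenceCheck {

  /** Returns true if the class can be located without initializing it. */
  static boolean isLoadable(String className, ClassLoader loader) {
    try {
      Class.forName(className, false, loader);
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }

  public static void main(String[] args) {
    ClassLoader loader = ClassPresenceCheck.class.getClassLoader();
    // The JDK feature version matters when classes are compiled per Java release.
    System.out.println("JDK feature version: " + Runtime.version().feature());
    for (String name : args) {
      System.out.println(name + " -> " + (isLoadable(name, loader) ? "found" : "missing"));
    }
  }
}
```

If the class shows up as missing under the IntelliJ builder but as found when the same check runs against the Gradle-built JAR, the difference is in what gets compiled and copied rather than in visibility rules.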

On Fri, Jun 7, 2024 at 11:53 AM Michael Sokolov  wrote:
>
> hm I found FakeCharFilterFactory in src/test/META-INF.services -- it's
> in a "test sources root" folder and won't allow itself to be set as a
> resources folder? hm even after fiddling with this - I finally get to
> mark it as "test resources root" my test is still not passing. This
> can't be this hard!
>
> On Fri, Jun 7, 2024 at 11:44 AM Michael Sokolov  wrote:
> >
> > hmm so after playing around with this Intellij build for a bit I ran
> > into some trouble -- all the tests relying on SPI seemed to start
> > failing. So then I switched back to build with Gradle and rebuild the
> > project and these tests passed. Just to double check there wasn't some
> > strange stale build problem, I think switched back again to IntelliJ
> > builder and I still see the same failures; example is like:
> >
> > NOTE: reproduce with: gradlew test --tests
> > TestAnalysisSPILoader.testLookupCharFilter
> > -Dtests.seed=88A2DA17C6510A33 -Dtests.locale=en-PR
> > -Dtests.timezone=Etc/GMT-9 -Dtests.asserts=true
> > -Dtests.file.encoding=UTF-8
> >
> > java.lang.IllegalArgumentException: A SPI class of type
> > org.apache.lucene.analysis.CharFilterFactory with name 'Fake' does not
> > exist. You need to add the corresponding JAR file supporting this SPI
> > to your classpath. The current classpath supports the following names:
> > []
> >
> > I guess there must be some setup required in order to expose the SPI
> > resource files to the build? So I checked some of the resources
> > folders like lucene/analysis/common/src/resources and sure enough it
> > is labeled as a resources folder in intellij UI. So ... what am I
> > missing?
> >
> > On Fri, Jun 7, 2024 at 10:40 AM Michael Sokolov  wrote:
> > >
> > > ok, life must be scary for developers on windows!
> > >
> > > On Fri, Jun 7, 2024 at 10:33 AM Dawid Weiss  wrote:
> > > >
> > > >
> > > > Certain regenerate tasks do require perl and python indeed.
> > > >
> > > > On Fri, Jun 7, 2024 at 2:23 PM Michael Sokolov  
> > > > wrote:
> > > >>
> > > >> While editing this CONTRIBUTING.md I found the following statement:
> > > >>
> > > >> Some build tasks (in particular `./gradlew check`) require Perl
> > > >> and Python 3.
> > > >>
> > > >> Is it actually true that we require Perl?
> > > >>
> > > >> On Fri, Jun 7, 2024 at 8:11 AM Michael Sokolov  
> > > >> wrote:
> > > >> >
> > > >> > So I'm glad we have a fix for this, but it's making me realize that
> > > >> > any new joiner that uses intellij (probably most of them?) will have
> > > >> > this problem and have no idea what to do about it. They will just
> > > >> > conclude - running Lucene tests in intellij sucks. If we revived that
> > > >> > intellij target maybe that would help - but .. you would have to know
> > > >> > to run it! So then I went to look at our project web page to see what
> > > >> > kind of developer docs we have that a new contributor might find.
> > > >> >
> > > >> > The first place Google sent me was to our github page
> > > >> > https://github.com/apache/lucene/?tab=readme-ov-file-- that one has
> > > >> > some very brief description about how to build, but nothing about
> > > >> > intellij. It does have a prominent link to "Developer documentation"
> > > >> > which is here: https://github.com/apache/lucene/tree/main/dev-docs 
> > > >> > but
> > > >> > that folder is mostly empty; it has a few somewhat esoteric bits of
> > > >> > info, but again nothing basic about building and testing; no
> > > >> > discussion of all the myriad gradle tasks and deep help info that
> > > >> > exists there.
> > > >> >
> > > >> > Next I tried looking on apache.org, but actually it is quite hard to
> > > >> > find any info about Lucene there - Apache just has too many projects.
> > > >> > I did finally find this page though
> > > >> > https://projects.apache.org/project.html?lucene-core and it links to
> > > >> > https://lucene.apache.org/core/. From there, I see a "Developer" 
> > > >> > link,
> > > >> > again this page has a paucity of info; basically it links you to
> > > >> > github, jenkins, and to the wiki. The "wiki" link actually just takes
> > > >> > you to a different github page -- and *this* one actually has some
> > > >> > useful info on how to build -- I think it's our best "intro" page for
> > > >> > a new developer. However all it says about IntelliJ is: "IntelliJ -
> > > >> > IntelliJ idea can import and build gradle-based projects out of the
> > > >> > box." true, sort of.
> > > >> >
> > > >> > So I think I will (1) add a note about this IJ build setting to that
> > > >> > page, and (2) consolidate some of 

Re: Intellij build/test times

2024-06-07 Thread Michael Sokolov
hm I found FakeCharFilterFactory in src/test/META-INF.services -- it's
in a "test sources root" folder and won't allow itself to be set as a
resources folder? hm even after fiddling with this - I finally get to
mark it as "test resources root" my test is still not passing. This
can't be this hard!

On Fri, Jun 7, 2024 at 11:44 AM Michael Sokolov  wrote:
>
> hmm so after playing around with this Intellij build for a bit I ran
> into some trouble -- all the tests relying on SPI seemed to start
> failing. So then I switched back to build with Gradle and rebuild the
> project and these tests passed. Just to double check there wasn't some
> strange stale build problem, I think switched back again to IntelliJ
> builder and I still see the same failures; example is like:
>
> NOTE: reproduce with: gradlew test --tests
> TestAnalysisSPILoader.testLookupCharFilter
> -Dtests.seed=88A2DA17C6510A33 -Dtests.locale=en-PR
> -Dtests.timezone=Etc/GMT-9 -Dtests.asserts=true
> -Dtests.file.encoding=UTF-8
>
> java.lang.IllegalArgumentException: A SPI class of type
> org.apache.lucene.analysis.CharFilterFactory with name 'Fake' does not
> exist. You need to add the corresponding JAR file supporting this SPI
> to your classpath. The current classpath supports the following names:
> []
>
> I guess there must be some setup required in order to expose the SPI
> resource files to the build? So I checked some of the resources
> folders like lucene/analysis/common/src/resources and sure enough it
> is labeled as a resources folder in intellij UI. So ... what am I
> missing?
>
> On Fri, Jun 7, 2024 at 10:40 AM Michael Sokolov  wrote:
> >
> > ok, life must be scary for developers on windows!
> >
> > On Fri, Jun 7, 2024 at 10:33 AM Dawid Weiss  wrote:
> > >
> > >
> > > Certain regenerate tasks do require perl and python indeed.
> > >
> > > On Fri, Jun 7, 2024 at 2:23 PM Michael Sokolov  wrote:
> > >>
> > >> While editing this CONTRIBUTING.md I found the following statement:
> > >>
> > >> Some build tasks (in particular `./gradlew check`) require Perl
> > >> and Python 3.
> > >>
> > >> Is it actually true that we require Perl?
> > >>
> > >> On Fri, Jun 7, 2024 at 8:11 AM Michael Sokolov  
> > >> wrote:
> > >> >
> > >> > So I'm glad we have a fix for this, but it's making me realize that
> > >> > any new joiner that uses intellij (probably most of them?) will have
> > >> > this problem and have no idea what to do about it. They will just
> > >> > conclude - running Lucene tests in intellij sucks. If we revived that
> > >> > intellij target maybe that would help - but .. you would have to know
> > >> > to run it! So then I went to look at our project web page to see what
> > >> > kind of developer docs we have that a new contributor might find.
> > >> >
> > >> > The first place Google sent me was to our github page
> > >> > https://github.com/apache/lucene/?tab=readme-ov-file-- that one has
> > >> > some very brief description about how to build, but nothing about
> > >> > intellij. It does have a prominent link to "Developer documentation"
> > >> > which is here: https://github.com/apache/lucene/tree/main/dev-docs but
> > >> > that folder is mostly empty; it has a few somewhat esoteric bits of
> > >> > info, but again nothing basic about building and testing; no
> > >> > discussion of all the myriad gradle tasks and deep help info that
> > >> > exists there.
> > >> >
> > >> > Next I tried looking on apache.org, but actually it is quite hard to
> > >> > find any info about Lucene there - Apache just has too many projects.
> > >> > I did finally find this page though
> > >> > https://projects.apache.org/project.html?lucene-core and it links to
> > >> > https://lucene.apache.org/core/. From there, I see a "Developer" link,
> > >> > again this page has a paucity of info; basically it links you to
> > >> > github, jenkins, and to the wiki. The "wiki" link actually just takes
> > >> > you to a different github page -- and *this* one actually has some
> > >> > useful info on how to build -- I think it's our best "intro" page for
> > >> > a new developer. However all it says about IntelliJ is: "IntelliJ -
> > >> > IntelliJ idea can import and build gradle-based projects out of the
> > >> > box." true, sort of.
> > >> >
> > >> > So I think I will (1) add a note about this IJ build setting to that
> > >> > page, and (2) consolidate some of the other links to go here instead
> > >> > of routing folks through a twisty maze of web pages
> > >> >
> > >> > On Fri, Jun 7, 2024 at 7:45 AM Stefan Vodita  
> > >> > wrote:
> > >> > >
> > >> > > +1, I had the same problem and it seems better now. Thank you, Dawid!
> > >> > >
> > >> > > On Thu, 6 Jun 2024 at 12:20, Michael Sokolov  
> > >> > > wrote:
> > >> > >>
> > >> > >> Oh! TIL! so much better, thanks. And now I have the "Repeat" option
> > >> > >> back in the test runner
> > >> > >>
> > >> > >> On Thu, Jun 6, 2024 at 6:18 AM Dawid Weiss  
> > >> > >> wrote:
> > >> > >> >
> > >> > >> >
> > >> > >> > 

Re: Intellij build/test times

2024-06-07 Thread Michael Sokolov
hmm so after playing around with this IntelliJ build for a bit I ran
into some trouble -- all the tests relying on SPI seemed to start
failing. So then I switched back to building with Gradle, rebuilt the
project, and these tests passed. Just to double check there wasn't some
strange stale build problem, I then switched back again to the IntelliJ
builder and I still see the same failures; an example:

NOTE: reproduce with: gradlew test --tests
TestAnalysisSPILoader.testLookupCharFilter
-Dtests.seed=88A2DA17C6510A33 -Dtests.locale=en-PR
-Dtests.timezone=Etc/GMT-9 -Dtests.asserts=true
-Dtests.file.encoding=UTF-8

java.lang.IllegalArgumentException: A SPI class of type
org.apache.lucene.analysis.CharFilterFactory with name 'Fake' does not
exist. You need to add the corresponding JAR file supporting this SPI
to your classpath. The current classpath supports the following names:
[]

I guess there must be some setup required in order to expose the SPI
resource files to the build? So I checked some of the resources
folders like lucene/analysis/common/src/resources and sure enough it
is labeled as a resources folder in intellij UI. So ... what am I
missing?
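
A quick way to see whether the SPI registration files reach the runtime classpath is to ask the class loader for them directly. This is a minimal sketch that assumes the factories are registered through the standard java.util.ServiceLoader convention, i.e. a resource named META-INF/services/ followed by the service interface name, on the classpath rather than the module path. SpiResourceCheck and serviceFiles are made-up names; the default service name is taken from the error message above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

/** Hypothetical diagnostic: lists the META-INF/services files visible for a service type. */
class SpiResourceCheck {

  /** Returns every registration file the loader can see for the given service interface. */
  static List<URL> serviceFiles(String serviceInterface, ClassLoader loader) {
    try {
      return Collections.list(loader.getResources("META-INF/services/" + serviceInterface));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    String service =
        args.length > 0 ? args[0] : "org.apache.lucene.analysis.CharFilterFactory";
    List<URL> found = serviceFiles(service, SpiResourceCheck.class.getClassLoader());
    if (found.isEmpty()) {
      System.out.println("No registration files visible for " + service);
    } else {
      found.forEach(url -> System.out.println("Found: " + url));
    }
  }
}
```

An empty result from inside the IDE's test run configuration would suggest that the resource files are not being copied to the output directory, which matches the empty list of names in the exception.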

On Fri, Jun 7, 2024 at 10:40 AM Michael Sokolov  wrote:
>
> ok, life must be scary for developers on windows!
>
> On Fri, Jun 7, 2024 at 10:33 AM Dawid Weiss  wrote:
> >
> >
> > Certain regenerate tasks do require perl and python indeed.
> >
> > On Fri, Jun 7, 2024 at 2:23 PM Michael Sokolov  wrote:
> >>
> >> While editing this CONTRIBUTING.md I found the following statement:
> >>
> >> Some build tasks (in particular `./gradlew check`) require Perl
> >> and Python 3.
> >>
> >> Is it actually true that we require Perl?
> >>
> >> On Fri, Jun 7, 2024 at 8:11 AM Michael Sokolov  wrote:
> >> >
> >> > So I'm glad we have a fix for this, but it's making me realize that
> >> > any new joiner that uses intellij (probably most of them?) will have
> >> > this problem and have no idea what to do about it. They will just
> >> > conclude - running Lucene tests in intellij sucks. If we revived that
> >> > intellij target maybe that would help - but .. you would have to know
> >> > to run it! So then I went to look at our project web page to see what
> >> > kind of developer docs we have that a new contributor might find.
> >> >
> >> > The first place Google sent me was to our github page
> >> > https://github.com/apache/lucene/?tab=readme-ov-file-- that one has
> >> > some very brief description about how to build, but nothing about
> >> > intellij. It does have a prominent link to "Developer documentation"
> >> > which is here: https://github.com/apache/lucene/tree/main/dev-docs but
> >> > that folder is mostly empty; it has a few somewhat esoteric bits of
> >> > info, but again nothing basic about building and testing; no
> >> > discussion of all the myriad gradle tasks and deep help info that
> >> > exists there.
> >> >
> >> > Next I tried looking on apache.org, but actually it is quite hard to
> >> > find any info about Lucene there - Apache just has too many projects.
> >> > I did finally find this page though
> >> > https://projects.apache.org/project.html?lucene-core and it links to
> >> > https://lucene.apache.org/core/. From there, I see a "Developer" link,
> >> > again this page has a paucity of info; basically it links you to
> >> > github, jenkins, and to the wiki. The "wiki" link actually just takes
> >> > you to a different github page -- and *this* one actually has some
> >> > useful info on how to build -- I think it's our best "intro" page for
> >> > a new developer. However all it says about IntelliJ is: "IntelliJ -
> >> > IntelliJ idea can import and build gradle-based projects out of the
> >> > box." true, sort of.
> >> >
> >> > So I think I will (1) add a note about this IJ build setting to that
> >> > page, and (2) consolidate some of the other links to go here instead
> >> > of routing folks through a twisty maze of web pages
> >> >
> >> > On Fri, Jun 7, 2024 at 7:45 AM Stefan Vodita  
> >> > wrote:
> >> > >
> >> > > +1, I had the same problem and it seems better now. Thank you, Dawid!
> >> > >
> >> > > On Thu, 6 Jun 2024 at 12:20, Michael Sokolov  
> >> > > wrote:
> >> > >>
> >> > >> Oh! TIL! so much better, thanks. And now I have the "Repeat" option
> >> > >> back in the test runner
> >> > >>
> >> > >> On Thu, Jun 6, 2024 at 6:18 AM Dawid Weiss  
> >> > >> wrote:
> >> > >> >
> >> > >> >
> >> > >> > Don't know what's causing this... but I never run IntelliJ builds 
> >> > >> > or tests through its gradle launcher, actually. Switch it to 
> >> > >> > compile and run using its own built-in method - much faster.
> >> > >> >
> >> > >> >
> >> > >> >
> >> > >> > Dawid
> >> > >> >
> >> > >> > On Thu, Jun 6, 2024 at 12:10 PM Michael Sokolov 
> >> > >> >  wrote:
> >> > >> >>
> >> > >> >> Hi, I wonder how many of us are using intellij to run Lucene 
> >> > >> >> tests, and if you are, have you noticed it having gotten really 
> >> > >> >> quite slow? It 

Re: Intellij build/test times

2024-06-07 Thread Michael Sokolov
ok, life must be scary for developers on Windows!

On Fri, Jun 7, 2024 at 10:33 AM Dawid Weiss  wrote:
>
>
> Certain regenerate tasks do require perl and python indeed.
>
> On Fri, Jun 7, 2024 at 2:23 PM Michael Sokolov  wrote:
>>
>> While editing this CONTRIBUTING.md I found the following statement:
>>
>> Some build tasks (in particular `./gradlew check`) require Perl
>> and Python 3.
>>
>> Is it actually true that we require Perl?
>>
>> On Fri, Jun 7, 2024 at 8:11 AM Michael Sokolov  wrote:
>> >
>> > So I'm glad we have a fix for this, but it's making me realize that
>> > any new joiner that uses intellij (probably most of them?) will have
>> > this problem and have no idea what to do about it. They will just
>> > conclude - running Lucene tests in intellij sucks. If we revived that
>> > intellij target maybe that would help - but .. you would have to know
>> > to run it! So then I went to look at our project web page to see what
>> > kind of developer docs we have that a new contributor might find.
>> >
>> > The first place Google sent me was to our github page
>> > https://github.com/apache/lucene/?tab=readme-ov-file-- that one has
>> > some very brief description about how to build, but nothing about
>> > intellij. It does have a prominent link to "Developer documentation"
>> > which is here: https://github.com/apache/lucene/tree/main/dev-docs but
>> > that folder is mostly empty; it has a few somewhat esoteric bits of
>> > info, but again nothing basic about building and testing; no
>> > discussion of all the myriad gradle tasks and deep help info that
>> > exists there.
>> >
>> > Next I tried looking on apache.org, but actually it is quite hard to
>> > find any info about Lucene there - Apache just has too many projects.
>> > I did finally find this page though
>> > https://projects.apache.org/project.html?lucene-core and it links to
>> > https://lucene.apache.org/core/. From there, I see a "Developer" link,
>> > again this page has a paucity of info; basically it links you to
>> > github, jenkins, and to the wiki. The "wiki" link actually just takes
>> > you to a different github page -- and *this* one actually has some
>> > useful info on how to build -- I think it's our best "intro" page for
>> > a new developer. However all it says about IntelliJ is: "IntelliJ -
>> > IntelliJ idea can import and build gradle-based projects out of the
>> > box." true, sort of.
>> >
>> > So I think I will (1) add a note about this IJ build setting to that
>> > page, and (2) consolidate some of the other links to go here instead
>> > of routing folks through a twisty maze of web pages
>> >
>> > On Fri, Jun 7, 2024 at 7:45 AM Stefan Vodita  
>> > wrote:
>> > >
>> > > +1, I had the same problem and it seems better now. Thank you, Dawid!
>> > >
>> > > On Thu, 6 Jun 2024 at 12:20, Michael Sokolov  wrote:
>> > >>
>> > >> Oh! TIL! so much better, thanks. And now I have the "Repeat" option
>> > >> back in the test runner
>> > >>
>> > >> On Thu, Jun 6, 2024 at 6:18 AM Dawid Weiss  
>> > >> wrote:
>> > >> >
>> > >> >
>> > >> > Don't know what's causing this... but I never run IntelliJ builds or 
>> > >> > tests through its gradle launcher, actually. Switch it to compile and 
>> > >> > run using its own built-in method - much faster.
>> > >> >
>> > >> >
>> > >> >
>> > >> > Dawid
>> > >> >
>> > >> > On Thu, Jun 6, 2024 at 12:10 PM Michael Sokolov  
>> > >> > wrote:
>> > >> >>
>> > >> >> Hi, I wonder how many of us are using intellij to run Lucene tests, 
>> > >> >> and if you are, have you noticed it having gotten really quite slow? 
>> > >> >> It seems to take a long time doing... Something... Before the test 
>> > >> >> starts running. I have a suspicion that we are using gradle in a way 
>> > >> >> that forces it to rebuild its cache every time or something like 
>> > >> >> that. Once upon a time we had an intellij build setup target that 
>> > >> >> set things up in a more intellij friendly way, according gradle, 
>> > >> >> didn't we? Does that still exist?
>> > >>




Re: Intellij build/test times

2024-06-07 Thread Dawid Weiss
Certain regenerate tasks do require perl and python indeed.

On Fri, Jun 7, 2024 at 2:23 PM Michael Sokolov  wrote:

> While editing this CONTRIBUTING.md I found the following statement:
>
> Some build tasks (in particular `./gradlew check`) require Perl
> and Python 3.
>
> Is it actually true that we require Perl?
>
> On Fri, Jun 7, 2024 at 8:11 AM Michael Sokolov  wrote:
> >
> > So I'm glad we have a fix for this, but it's making me realize that
> > any new joiner that uses intellij (probably most of them?) will have
> > this problem and have no idea what to do about it. They will just
> > conclude - running Lucene tests in intellij sucks. If we revived that
> > intellij target maybe that would help - but .. you would have to know
> > to run it! So then I went to look at our project web page to see what
> > kind of developer docs we have that a new contributor might find.
> >
> > The first place Google sent me was to our github page
> > https://github.com/apache/lucene/?tab=readme-ov-file-- that one has
> > some very brief description about how to build, but nothing about
> > intellij. It does have a prominent link to "Developer documentation"
> > which is here: https://github.com/apache/lucene/tree/main/dev-docs but
> > that folder is mostly empty; it has a few somewhat esoteric bits of
> > info, but again nothing basic about building and testing; no
> > discussion of all the myriad gradle tasks and deep help info that
> > exists there.
> >
> > Next I tried looking on apache.org, but actually it is quite hard to
> > find any info about Lucene there - Apache just has too many projects.
> > I did finally find this page though
> > https://projects.apache.org/project.html?lucene-core and it links to
> > https://lucene.apache.org/core/. From there, I see a "Developer" link,
> > again this page has a paucity of info; basically it links you to
> > github, jenkins, and to the wiki. The "wiki" link actually just takes
> > you to a different github page -- and *this* one actually has some
> > useful info on how to build -- I think it's our best "intro" page for
> > a new developer. However all it says about IntelliJ is: "IntelliJ -
> > IntelliJ idea can import and build gradle-based projects out of the
> > box." true, sort of.
> >
> > So I think I will (1) add a note about this IJ build setting to that
> > page, and (2) consolidate some of the other links to go here instead
> > of routing folks through a twisty maze of web pages
> >
> > On Fri, Jun 7, 2024 at 7:45 AM Stefan Vodita 
> wrote:
> > >
> > > +1, I had the same problem and it seems better now. Thank you, Dawid!
> > >
> > > On Thu, 6 Jun 2024 at 12:20, Michael Sokolov 
> wrote:
> > >>
> > >> Oh! TIL! so much better, thanks. And now I have the "Repeat" option
> > >> back in the test runner
> > >>
> > >> On Thu, Jun 6, 2024 at 6:18 AM Dawid Weiss 
> wrote:
> > >> >
> > >> >
> > >> > Don't know what's causing this... but I never run IntelliJ builds
> or tests through its gradle launcher, actually. Switch it to compile and
> run using its own built-in method - much faster.
> > >> >
> > >> >
> > >> >
> > >> > Dawid
> > >> >
> > >> > On Thu, Jun 6, 2024 at 12:10 PM Michael Sokolov 
> wrote:
> > >> >>
> > >> >> Hi, I wonder how many of us are using intellij to run Lucene
> tests, and if you are, have you noticed it having gotten really quite slow?
> It seems to take a long time doing... Something... Before the test starts
> running. I have a suspicion that we are using gradle in a way that forces
> it to rebuild its cache every time or something like that. Once upon a time
> we had an intellij build setup target that set things up in a more intellij
> friendly way, according gradle, didn't we? Does that still exist?
> > >>
>


Re: Intellij build/test times

2024-06-07 Thread Michael Sokolov
While editing this CONTRIBUTING.md I found the following statement:

Some build tasks (in particular `./gradlew check`) require Perl
and Python 3.

Is it actually true that we require Perl?

On Fri, Jun 7, 2024 at 8:11 AM Michael Sokolov  wrote:
>
> So I'm glad we have a fix for this, but it's making me realize that
> any new joiner that uses intellij (probably most of them?) will have
> this problem and have no idea what to do about it. They will just
> conclude - running Lucene tests in intellij sucks. If we revived that
> intellij target maybe that would help - but .. you would have to know
> to run it! So then I went to look at our project web page to see what
> kind of developer docs we have that a new contributor might find.
>
> The first place Google sent me was to our github page
> https://github.com/apache/lucene/?tab=readme-ov-file-- that one has
> some very brief description about how to build, but nothing about
> intellij. It does have a prominent link to "Developer documentation"
> which is here: https://github.com/apache/lucene/tree/main/dev-docs but
> that folder is mostly empty; it has a few somewhat esoteric bits of
> info, but again nothing basic about building and testing; no
> discussion of all the myriad gradle tasks and deep help info that
> exists there.
>
> Next I tried looking on apache.org, but actually it is quite hard to
> find any info about Lucene there - Apache just has too many projects.
> I did finally find this page though
> https://projects.apache.org/project.html?lucene-core and it links to
> https://lucene.apache.org/core/. From there, I see a "Developer" link,
> again this page has a paucity of info; basically it links you to
> github, jenkins, and to the wiki. The "wiki" link actually just takes
> you to a different github page -- and *this* one actually has some
> useful info on how to build -- I think it's our best "intro" page for
> a new developer. However all it says about IntelliJ is: "IntelliJ -
> IntelliJ idea can import and build gradle-based projects out of the
> box." true, sort of.
>
> So I think I will (1) add a note about this IJ build setting to that
> page, and (2) consolidate some of the other links to go here instead
> of routing folks through a twisty maze of web pages
>
> On Fri, Jun 7, 2024 at 7:45 AM Stefan Vodita  wrote:
> >
> > +1, I had the same problem and it seems better now. Thank you, Dawid!
> >
> > On Thu, 6 Jun 2024 at 12:20, Michael Sokolov  wrote:
> >>
> >> Oh! TIL! so much better, thanks. And now I have the "Repeat" option
> >> back in the test runner
> >>
> >> On Thu, Jun 6, 2024 at 6:18 AM Dawid Weiss  wrote:
> >> >
> >> >
> >> > Don't know what's causing this... but I never run IntelliJ builds or 
> >> > tests through its gradle launcher, actually. Switch it to compile and 
> >> > run using its own built-in method - much faster.
> >> >
> >> >
> >> >
> >> > Dawid
> >> >
> >> > On Thu, Jun 6, 2024 at 12:10 PM Michael Sokolov  
> >> > wrote:
> >> >>
> >> >> Hi, I wonder how many of us are using intellij to run Lucene tests, and 
> >> >> if you are, have you noticed it having gotten really quite slow? It 
> >> >> seems to take a long time doing... Something... Before the test starts 
> >> >> running. I have a suspicion that we are using gradle in a way that 
> >> >> forces it to rebuild its cache every time or something like that. Once 
> >> >> upon a time we had an intellij build setup target that set things up in 
> >> >> a more intellij friendly way, according gradle, didn't we? Does that 
> >> >> still exist?
> >>
> >> -
> >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> >> For additional commands, e-mail: dev-h...@lucene.apache.org
> >>




Re: Intellij build/test times

2024-06-07 Thread Michael Sokolov
So I'm glad we have a fix for this, but it's making me realize that
any new joiner that uses intellij (probably most of them?) will have
this problem and have no idea what to do about it. They will just
conclude - running Lucene tests in intellij sucks. If we revived that
intellij target maybe that would help - but .. you would have to know
to run it! So then I went to look at our project web page to see what
kind of developer docs we have that a new contributor might find.

The first place Google sent me was to our github page
https://github.com/apache/lucene/?tab=readme-ov-file -- that one has
a very brief description of how to build, but nothing about
intellij. It does have a prominent link to "Developer documentation"
which is here: https://github.com/apache/lucene/tree/main/dev-docs but
that folder is mostly empty; it has a few somewhat esoteric bits of
info, but again nothing basic about building and testing; no
discussion of all the myriad gradle tasks and deep help info that
exists there.

Next I tried looking on apache.org, but actually it is quite hard to
find any info about Lucene there - Apache just has too many projects.
I did finally find this page though
https://projects.apache.org/project.html?lucene-core and it links to
https://lucene.apache.org/core/. From there, I see a "Developer" link,
again this page has a paucity of info; basically it links you to
github, jenkins, and to the wiki. The "wiki" link actually just takes
you to a different github page -- and *this* one actually has some
useful info on how to build -- I think it's our best "intro" page for
a new developer. However, all it says about IntelliJ is: "IntelliJ -
IntelliJ idea can import and build gradle-based projects out of the
box." True, sort of.

So I think I will (1) add a note about this IJ build setting to that
page, and (2) consolidate some of the other links to go here instead
of routing folks through a twisty maze of web pages.

On Fri, Jun 7, 2024 at 7:45 AM Stefan Vodita  wrote:
>
> +1, I had the same problem and it seems better now. Thank you, Dawid!
>
> On Thu, 6 Jun 2024 at 12:20, Michael Sokolov  wrote:
>>
>> Oh! TIL! so much better, thanks. And now I have the "Repeat" option
>> back in the test runner
>>
>> On Thu, Jun 6, 2024 at 6:18 AM Dawid Weiss  wrote:
>> >
>> >
>> > Don't know what's causing this... but I never run IntelliJ builds or tests 
>> > through its gradle launcher, actually. Switch it to compile and run using 
>> > its own built-in method - much faster.
>> >
>> >
>> >
>> > Dawid
>> >
>> > On Thu, Jun 6, 2024 at 12:10 PM Michael Sokolov  wrote:
>> >>
>> >> Hi, I wonder how many of us are using intellij to run Lucene tests, and 
>> >> if you are, have you noticed it having gotten really quite slow? It seems 
>> >> to take a long time doing... Something... Before the test starts running. 
>> >> I have a suspicion that we are using gradle in a way that forces it to 
>> >> rebuild its cache every time or something like that. Once upon a time we 
>> >> had an intellij build setup target that set things up in a more intellij 
>> >> friendly way, according gradle, didn't we? Does that still exist?
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>




Re: Intellij build/test times

2024-06-07 Thread Stefan Vodita
+1, I had the same problem and it seems better now. Thank you, Dawid!

On Thu, 6 Jun 2024 at 12:20, Michael Sokolov  wrote:

> Oh! TIL! so much better, thanks. And now I have the "Repeat" option
> back in the test runner
>
> On Thu, Jun 6, 2024 at 6:18 AM Dawid Weiss  wrote:
> >
> >
> > Don't know what's causing this... but I never run IntelliJ builds or
> tests through its gradle launcher, actually. Switch it to compile and run
> using its own built-in method - much faster.
> >
> >
> >
> > Dawid
> >
> > On Thu, Jun 6, 2024 at 12:10 PM Michael Sokolov 
> wrote:
> >>
> >> Hi, I wonder how many of us are using intellij to run Lucene tests, and
> if you are, have you noticed it having gotten really quite slow? It seems
> to take a long time doing... Something... Before the test starts running. I
> have a suspicion that we are using gradle in a way that forces it to
> rebuild its cache every time or something like that. Once upon a time we
> had an intellij build setup target that set things up in a more intellij
> friendly way, according gradle, didn't we? Does that still exist?
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


[ANNOUNCE] Apache Lucene 9.11.0 released

2024-06-06 Thread Benjamin Trent
The Lucene PMC is pleased to announce the release of Apache Lucene 9.11.0.

Apache Lucene is a high-performance, full-featured search engine library
written entirely in Java. It is a technology suitable for nearly any
application that requires structured search, full-text search, faceting,
nearest-neighbor search across high-dimensionality vectors, spell
correction or query suggestions.

This release contains numerous bug fixes, optimizations, and improvements,
some of which are highlighted below. The release is available for immediate
download at:

https://lucene.apache.org/core/downloads.html

Lucene 9.11.0 Release Highlights:

New features:

 * Add support for posix_madvise to MMapDirectory: If running on
Linux/macOS and Java 21 or later, MMapDirectory uses IOContext to pass
suitable MADV flags to the kernel of the operating system. This may improve
paging logic especially when working with large indexes under memory
pressure.
 * Expand support for new scalar bit levels for HNSW vectors. This includes
4-bit vectors and an option to compress them to gain a 50% reduction in
memory usage.
 * Recursive graph bisection is now supported on indexes that have blocks

Improvements:

 * MergeScheduler can now provide an executor for intra-merge parallelism.
The first implementation is the ConcurrentMergeScheduler.
 * Upgrade icu4j to version 74.2.

Optimizations:

 * Use RWLock to access LRUQueryCache to reduce contention.
 * Speedup multi-segment HNSW graph search for diversifying child kNN
queries.
 * Add a MemorySegment Vector scorer - for scoring without copying on-heap.
This can improve search latency by almost 2x for byte vectors.
 * Switch to using optimized, primitive collections where possible to
improve performance and heap utilization.

...And many more optimizations and bugfixes.

Please read CHANGES.txt for a full list of new features and changes:
https://lucene.apache.org/core/9_11_0/changes/Changes.html

Please report any feedback to the mailing lists (
http://lucene.apache.org/core/discussion.html)

Note: The Apache Software Foundation uses an extensive mirroring network
for distributing releases. It is possible that the mirror you are using may
not have replicated the release yet. If that is the case, please try
another mirror. This also applies to Maven access.


Re: [VOTE] Release Lucene 9.11.0 RC1

2024-06-06 Thread Benjamin Trent
It's been >72h since the vote was initiated and the result is:

+1  12  (11 binding)
 0  0
-1  0

This vote has PASSED


Thanks!

Ben Trent

On Thu, Jun 6, 2024 at 12:27 AM Patrick Zhai  wrote:

> +1
>
> SUCCESS! [1:01:30.064666]
>
> On Wed, Jun 5, 2024 at 11:08 AM Houston Putman  wrote:
>
>> +1
>>
>> SUCCESS! [1:49:36.192513]
>>
>> - Houston Putman
>>
>> On Wed, Jun 5, 2024 at 12:58 PM Michael McCandless <
>> luc...@mikemccandless.com> wrote:
>>
>>> +1 SUCCESS! [0:24:55.332837]
>>>
>>> Mike McCandless
>>>
>>> http://blog.mikemccandless.com
>>>
>>>
>>> On Wed, Jun 5, 2024 at 11:21 AM Adrien Grand  wrote:
>>>
>>>> +1 SUCCESS! [1:09:30.262027]
>>>>
>>>> On Wed, Jun 5, 2024 at 4:15 PM Tomás Fernández Löbbe <
>>>> tomasflo...@gmail.com> wrote:
>>>>
>>>>> +1
>>>>>
>>>>> SUCCESS! [1:12:30.029470]
>>>>>
>>>>> On Wed, Jun 5, 2024 at 9:22 AM Bruno Roustant <
>>>>> bruno.roust...@gmail.com> wrote:
>>>>>
>>>>>> +1
>>>>>>
>>>>>> SUCCESS! [0:41:14.593265]
>>>>>>
>>>>>> Bruno
>>>>>>
>>>>>>>
>>>>
>>>> --
>>>> Adrien
>>>>
>>>


Re: Intellij build/test times

2024-06-06 Thread Michael Sokolov
Oh! TIL! so much better, thanks. And now I have the "Repeat" option
back in the test runner

On Thu, Jun 6, 2024 at 6:18 AM Dawid Weiss  wrote:
>
>
> Don't know what's causing this... but I never run IntelliJ builds or tests 
> through its gradle launcher, actually. Switch it to compile and run using its 
> own built-in method - much faster.
>
>
>
> Dawid
>
> On Thu, Jun 6, 2024 at 12:10 PM Michael Sokolov  wrote:
>>
>> Hi, I wonder how many of us are using intellij to run Lucene tests, and if 
>> you are, have you noticed it having gotten really quite slow? It seems to 
>> take a long time doing... Something... Before the test starts running. I 
>> have a suspicion that we are using gradle in a way that forces it to rebuild 
>> its cache every time or something like that. Once upon a time we had an 
>> intellij build setup target that set things up in a more intellij friendly 
>> way, according gradle, didn't we? Does that still exist?




Intellij build/test times

2024-06-06 Thread Michael Sokolov
Hi, I wonder how many of us are using IntelliJ to run Lucene tests, and if
you are, have you noticed that it has gotten really quite slow? It seems to
take a long time doing... something... before the test starts running. I
have a suspicion that we are using Gradle in a way that forces it to
rebuild its cache every time or something like that. Once upon a time we
had an IntelliJ build setup target that set things up in a more
IntelliJ-friendly way, avoiding Gradle, didn't we? Does that still exist?


Re: [VOTE] Release Lucene 9.11.0 RC1

2024-06-05 Thread Patrick Zhai
+1

SUCCESS! [1:01:30.064666]

On Wed, Jun 5, 2024 at 11:08 AM Houston Putman  wrote:

> +1
>
> SUCCESS! [1:49:36.192513]
>
> - Houston Putman
>
> On Wed, Jun 5, 2024 at 12:58 PM Michael McCandless <
> luc...@mikemccandless.com> wrote:
>
>> +1 SUCCESS! [0:24:55.332837]
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Wed, Jun 5, 2024 at 11:21 AM Adrien Grand  wrote:
>>
>>> +1 SUCCESS! [1:09:30.262027]
>>>
>>> On Wed, Jun 5, 2024 at 4:15 PM Tomás Fernández Löbbe <
>>> tomasflo...@gmail.com> wrote:
>>>
>>>> +1
>>>>
>>>> SUCCESS! [1:12:30.029470]
>>>>
>>>> On Wed, Jun 5, 2024 at 9:22 AM Bruno Roustant
>>>> wrote:
>>>>
>>>>> +1
>>>>>
>>>>> SUCCESS! [0:41:14.593265]
>>>>>
>>>>> Bruno
>>>>>
>>>>>>
>>>
>>> --
>>> Adrien
>>>
>>


Re: [VOTE] Release Lucene 9.11.0 RC1

2024-06-05 Thread Zhang Chao
+1

SUCCESS! [1:14:38.618061]

--
Zhang Chao

> On Jun 6, 2024, at 02:08, Houston Putman wrote:
> 
> +1
> 
> SUCCESS! [1:49:36.192513]
> 
> - Houston Putman
> 
> On Wed, Jun 5, 2024 at 12:58 PM Michael McCandless wrote:
>> +1 SUCCESS! [0:24:55.332837]
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>> On Wed, Jun 5, 2024 at 11:21 AM Adrien Grand wrote:
>>> +1 SUCCESS! [1:09:30.262027]
>>>
>>> On Wed, Jun 5, 2024 at 4:15 PM Tomás Fernández Löbbe wrote:
>>>> +1
>>>>
>>>> SUCCESS! [1:12:30.029470]
>>>>
>>>> On Wed, Jun 5, 2024 at 9:22 AM Bruno Roustant wrote:
>>>>> +1
>>>>>
>>>>> SUCCESS! [0:41:14.593265]
>>>>>
>>>>> Bruno
>>> 
>>> 
>>> -- 
>>> Adrien



Re: [VOTE] Release Lucene 9.11.0 RC1

2024-06-05 Thread Houston Putman
+1

SUCCESS! [1:49:36.192513]

- Houston Putman

On Wed, Jun 5, 2024 at 12:58 PM Michael McCandless <
luc...@mikemccandless.com> wrote:

> +1 SUCCESS! [0:24:55.332837]
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Wed, Jun 5, 2024 at 11:21 AM Adrien Grand  wrote:
>
>> +1 SUCCESS! [1:09:30.262027]
>>
>> On Wed, Jun 5, 2024 at 4:15 PM Tomás Fernández Löbbe <
>> tomasflo...@gmail.com> wrote:
>>
>>> +1
>>>
>>> SUCCESS! [1:12:30.029470]
>>>
>>> On Wed, Jun 5, 2024 at 9:22 AM Bruno Roustant 
>>> wrote:
>>>
>>>> +1
>>>>
>>>> SUCCESS! [0:41:14.593265]
>>>>
>>>> Bruno
>>>>
>>>>>
>>
>> --
>> Adrien
>>
>


Re: [VOTE] Release Lucene 9.11.0 RC1

2024-06-05 Thread Michael McCandless
+1 SUCCESS! [0:24:55.332837]

Mike McCandless

http://blog.mikemccandless.com


On Wed, Jun 5, 2024 at 11:21 AM Adrien Grand  wrote:

> +1 SUCCESS! [1:09:30.262027]
>
> On Wed, Jun 5, 2024 at 4:15 PM Tomás Fernández Löbbe <
> tomasflo...@gmail.com> wrote:
>
>> +1
>>
>> SUCCESS! [1:12:30.029470]
>>
>> On Wed, Jun 5, 2024 at 9:22 AM Bruno Roustant 
>> wrote:
>>
>>> +1
>>>
>>> SUCCESS! [0:41:14.593265]
>>>
>>> Bruno
>>>

>
> --
> Adrien
>


Re: [VOTE] Release Lucene 9.11.0 RC1

2024-06-05 Thread Adrien Grand
+1 SUCCESS! [1:09:30.262027]

On Wed, Jun 5, 2024 at 4:15 PM Tomás Fernández Löbbe 
wrote:

> +1
>
> SUCCESS! [1:12:30.029470]
>
> On Wed, Jun 5, 2024 at 9:22 AM Bruno Roustant 
> wrote:
>
>> +1
>>
>> SUCCESS! [0:41:14.593265]
>>
>> Bruno
>>
>>>

-- 
Adrien


Re: [VOTE] Release Lucene 9.11.0 RC1

2024-06-05 Thread Tomás Fernández Löbbe
+1

SUCCESS! [1:12:30.029470]

On Wed, Jun 5, 2024 at 9:22 AM Bruno Roustant 
wrote:

> +1
>
> SUCCESS! [0:41:14.593265]
>
> Bruno
>
>>


Re: [VOTE] Release Lucene 9.11.0 RC1

2024-06-05 Thread Bruno Roustant
+1

SUCCESS! [0:41:14.593265]

Bruno

>


Re: [JENKINS] Lucene » Lucene-Solr-NightlyTests-8.10 - Build # 57 - Still Failing!

2024-06-04 Thread Dawid Weiss
Not sure why this ancient build ran at all - I turned it off.

D.

On Tue, Jun 4, 2024 at 3:49 PM Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> Build:
> https://ci-builds.apache.org/job/Lucene/job/Lucene-Solr-NightlyTests-8.10/57/
>
> No tests ran.
>
> Build Log:
> [...truncated 6 lines...]
> ERROR: Unable to find build script at
> /home/jenkins/jenkins-slave/workspace/Lucene/Lucene-Solr-NightlyTests-8.10/checkout/build.xml
> Archiving artifacts
> Recording test results
> ERROR: Step ‘Publish JUnit test result report’ failed: No test report
> files were found. Configuration error?
> Email was triggered for: Failure - Any
> Sending email for trigger: Failure - Any
>
> -
> To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org
> For additional commands, e-mail: builds-h...@lucene.apache.org


Re: [VOTE] Release Lucene 9.11.0 RC1

2024-06-04 Thread Dawid Weiss
I've run the tests on a Linux cloud machine and looked at the artifacts;
everything's ok.

SUCCESS! [3:26:28.575148]

+1 to release.

D.

On Mon, Jun 3, 2024 at 1:30 PM Benjamin Trent  wrote:

> Please vote for release candidate 1 for Lucene 9.11.0
>
> The artifacts can be downloaded from:
>
> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.11.0-RC1-rev-d433394b292e3562e0bb34222f7dd4f307e2b8ca
>
> You can run the smoke tester directly with this command:
>
> python3 -u dev-tools/scripts/smokeTestRelease.py \
>
> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.11.0-RC1-rev-d433394b292e3562e0bb34222f7dd4f307e2b8ca
>
> The vote will be open for at least 72 hours i.e. until 2024-06-06 12:00
> UTC.
>
> [ ] +1  approve
> [ ] +0  no opinion
> [ ] -1  disapprove (and reason why)
>
> Here is my +1
>
> Thanks!
>
> Ben Trent
>


Re: [VOTE] Release Lucene 9.11.0 RC1

2024-06-04 Thread Uwe Schindler

Hi,

I let Policeman Jenkins run the smoketester with all important Java 
versions 11, 17, 19, 20, 21 (and specifically with those covering the 
MR-JAR and MemorySegments/vectors):


https://jenkins.thetaphi.de/job/Lucene-Release-Tester/33/console

SUCCESS! [2:49:30.617905]

Policeman Jenkins said yes - I also say: Release it! +1

Uwe

Am 03.06.2024 um 13:29 schrieb Benjamin Trent:

Please vote for release candidate 1 for Lucene 9.11.0

The artifacts can be downloaded from:
https://dist.apache.org/repos/dist/dev/lucene/lucene-9.11.0-RC1-rev-d433394b292e3562e0bb34222f7dd4f307e2b8ca

You can run the smoke tester directly with this command:

python3 -u dev-tools/scripts/smokeTestRelease.py \
https://dist.apache.org/repos/dist/dev/lucene/lucene-9.11.0-RC1-rev-d433394b292e3562e0bb34222f7dd4f307e2b8ca

The vote will be open for at least 72 hours i.e. until 2024-06-06 
12:00 UTC.


[ ] +1  approve
[ ] +0  no opinion
[ ] -1  disapprove (and reason why)

Here is my +1

Thanks!

Ben Trent


--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail:u...@thetaphi.de


Re: [VOTE] Release Lucene 9.11.0 RC1

2024-06-04 Thread Chris Hegarty
+1

SUCCESS! [1:09:31.902484]

-Chris

> On 3 Jun 2024, at 12:29, Benjamin Trent  wrote:
> 
> Please vote for release candidate 1 for Lucene 9.11.0
> 
> The artifacts can be downloaded from:
> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.11.0-RC1-rev-d433394b292e3562e0bb34222f7dd4f307e2b8ca
> 
> You can run the smoke tester directly with this command:
> 
> python3 -u dev-tools/scripts/smokeTestRelease.py \
> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.11.0-RC1-rev-d433394b292e3562e0bb34222f7dd4f307e2b8ca
> 
> The vote will be open for at least 72 hours i.e. until 2024-06-06 12:00 UTC.
> 
> [ ] +1  approve
> [ ] +0  no opinion
> [ ] -1  disapprove (and reason why)
> 
> Here is my +1
> 
> Thanks!
> 
> Ben Trent





Re: [VOTE] Release Lucene 9.11.0 RC1

2024-06-03 Thread Michael Sokolov
+1

(tested w/Amazon Corretto JVM)
SUCCESS! [0:46:40.066524]

On Mon, Jun 3, 2024 at 7:30 AM Benjamin Trent  wrote:
>
> Please vote for release candidate 1 for Lucene 9.11.0
>
> The artifacts can be downloaded from:
> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.11.0-RC1-rev-d433394b292e3562e0bb34222f7dd4f307e2b8ca
>
> You can run the smoke tester directly with this command:
>
> python3 -u dev-tools/scripts/smokeTestRelease.py \
> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.11.0-RC1-rev-d433394b292e3562e0bb34222f7dd4f307e2b8ca
>
> The vote will be open for at least 72 hours i.e. until 2024-06-06 12:00 UTC.
>
> [ ] +1  approve
> [ ] +0  no opinion
> [ ] -1  disapprove (and reason why)
>
> Here is my +1
>
> Thanks!
>
> Ben Trent




[VOTE] Release Lucene 9.11.0 RC1

2024-06-03 Thread Benjamin Trent
Please vote for release candidate 1 for Lucene 9.11.0

The artifacts can be downloaded from:
https://dist.apache.org/repos/dist/dev/lucene/lucene-9.11.0-RC1-rev-d433394b292e3562e0bb34222f7dd4f307e2b8ca

You can run the smoke tester directly with this command:

python3 -u dev-tools/scripts/smokeTestRelease.py \
https://dist.apache.org/repos/dist/dev/lucene/lucene-9.11.0-RC1-rev-d433394b292e3562e0bb34222f7dd4f307e2b8ca

The vote will be open for at least 72 hours i.e. until 2024-06-06 12:00 UTC.

[ ] +1  approve
[ ] +0  no opinion
[ ] -1  disapprove (and reason why)

Here is my +1

Thanks!

Ben Trent


Re: Any recommended issues to work on for a newcomer?

2024-05-31 Thread Michael Wechner

Thank you very much for sharing!

Unfortunately I have not yet found time to review Hank's work, but
maybe Hank can already proceed based on your code.


Thanks

Michael

Am 31.05.24 um 18:50 schrieb Alessandro Benedetti:
Just for your curiosity, my Reciprocal Rank Fusion contribution to
Solr is in decent shape now:

https://github.com/apache/solr/pull/2489
Everything is on the Solr side, but maybe it can serve as
inspiration if you want to do similar work in Lucene.


Cheers
--
*Alessandro Benedetti*
Director @ Sease Ltd.
*Apache Lucene/Solr Committer*
*Apache Solr PMC Member*

e-mail: a.benede...@sease.io

*Sease* - Information Retrieval Applied
Consulting | Training | Open Source

Website: Sease.io
LinkedIn | Twitter | Youtube | Github



On Mon, 20 May 2024 at 08:16, Michael Wechner 
 wrote:


Hi Hank

Very cool, thank you, will try to do this asap!

All the best

Michael


Am 19.05.24 um 01:42 schrieb Chang Hank:

Hey Michael,

I wrote a first version of my idea for implementing RRF in
Lucene; here is the link to the code:
https://gist.github.com/hack4chang/ee2b37eab80bd82e574ff4f94ed204e9
Right now I have some questions: one is about the shardIndex to
be returned, another is about the TotalHits value. Please take a
look at the code and kindly leave some comments below.

Thanks,
Hank


On May 18, 2024, at 2:01 PM, Chang Hank
  wrote:

Or maybe we can first create an issue and PR based on the issue
number?
WDYT?

Best,

Hank


On May 18, 2024, at 11:29 AM, Chang Hank
  wrote:

Hey Michael,

Sorry, I was a bit busy this week, but I’ve looked into the
resources you provided and also some useful advice from
Alessandro and Adrien.

I have a brief understanding of how RRF works, but I’m not
quite sure how we should implement it. Based on the advice from
Alessandro and Adrien, it seems we need to consider that the
search results are located on different shards. According to
Alessandro, we should aggregate the ranked lists from all
distributed nodes and then apply RRF.
Are we going to implement this aggregation logic inside our RRF
method?

Also could you please create a PR so we can discuss more
details further?

All the best,

Hank


On May 13, 2024, at 10:09 AM, Michael Wechner
 
wrote:

Great, sounds like we have a plan :-)

Hank and I can get started trying to understand the internals
better ...

Thanks

Michael

Am 13.05.24 um 18:21 schrieb Alessandro Benedetti:

Sure, we can make it work, but in a distributed environment
you first have to run each query distributed (aggregating across all
nodes) and then apply RRF on top of the aggregated ranked lists.
Doing RRF per node first and then aggregating per shard won't
return the same results, I suspect.
When I go back to working on the task I'll be able to
elaborate more!
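
To make the ordering concern above concrete, here is a minimal, self-contained Java sketch (not Lucene or Solr code). It assumes the usual RRF formula, where a document at 1-based rank r in a list contributes 1 / (k + r), with k = 60. The class name `RrfOrderSketch`, the document ids, and the raw scores are all made up for illustration. It compares (1) aggregating each query's hits across shards and then fusing with (2) fusing on each shard first and then merging by fused score.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Lucene or Solr.
final class RrfOrderSketch {
  static final int K = 60;

  /** RRF scores for ranked lists of doc ids (best first): sum of 1 / (K + rank). */
  static Map<String, Double> rrf(List<List<String>> rankedLists) {
    Map<String, Double> fused = new HashMap<>();
    for (List<String> ranked : rankedLists) {
      for (int i = 0; i < ranked.size(); i++) {
        fused.merge(ranked.get(i), 1.0 / (K + i + 1), Double::sum);
      }
    }
    return fused;
  }

  /** Doc ids sorted by descending score; ties broken by id so output is deterministic. */
  static List<String> ranking(Map<String, Double> scores) {
    List<String> ids = new ArrayList<>(scores.keySet());
    ids.sort((a, b) -> {
      int c = Double.compare(scores.get(b), scores.get(a));
      return c != 0 ? c : a.compareTo(b);
    });
    return ids;
  }

  /** Option 1: build one global ranked list per query, then apply RRF once. */
  static List<String> aggregateThenFuse(List<List<Map<String, Double>>> hits) {
    List<List<String>> globalLists = new ArrayList<>();
    for (List<Map<String, Double>> shardsForQuery : hits) {
      Map<String, Double> all = new HashMap<>();
      shardsForQuery.forEach(all::putAll);
      globalLists.add(ranking(all));
    }
    return ranking(rrf(globalLists));
  }

  /** Option 2: apply RRF on each shard using local ranks, then merge by fused score. */
  static List<String> fusePerNodeThenMerge(List<List<Map<String, Double>>> hits) {
    int numShards = hits.get(0).size();
    Map<String, Double> merged = new HashMap<>();
    for (int shard = 0; shard < numShards; shard++) {
      List<List<String>> localLists = new ArrayList<>();
      for (List<Map<String, Double>> shardsForQuery : hits) {
        localLists.add(ranking(shardsForQuery.get(shard)));
      }
      merged.putAll(rrf(localLists));
    }
    return ranking(merged);
  }

  /** Toy data: outer list = queries, inner list = shards, map = doc id to raw score. */
  static List<List<Map<String, Double>>> sampleHits() {
    return List.of(
        List.of(Map.of("a", 0.9, "b", 0.8), Map.of("z", 0.1)),
        List.of(Map.of("b", 0.9, "a", 0.8), Map.of("z", 0.1)));
  }

  public static void main(String[] args) {
    System.out.println("aggregate, then RRF: " + aggregateThenFuse(sampleHits()));
    System.out.println("RRF per node, merge: " + fusePerNodeThenMerge(sampleHits()));
  }
}
```

With this toy data the first approach ranks z last, while the second ranks it first: z is the only hit on its shard, so it gets local rank 1 for both queries even though its raw scores are the lowest overall.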

Cheers
--
*Alessandro Benedetti*
Director @ Sease Ltd.
*Apache Lucene/Solr Committer*
*Apache Solr PMC Member*

e-mail: a.benede...@sease.io

*Sease* - Information Retrieval Applied
Consulting | Training | Open Source

Website: Sease.io
LinkedIn | Twitter | Youtube | Github


On Mon, 13 May 2024 at 14:12, Adrien Grand
 wrote:

> Maybe Adrien Grand and others might also have some
> feedback :-)

I'd suggest the signature to look something like `TopDocs
TopDocs#rrf(int topN, int k, TopDocs[] hits)` to be
consistent with `TopDocs#merge`. Internally, it should
look at `ScoreDoc#shardIndex` and `ScoreDoc#doc` to figure
out which hits map to the same document.
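
A minimal sketch of the fusion step suggested above, assuming the standard RRF formula in which a hit at 1-based rank r in a list contributes 1 / (k + r). This is not the proposed `TopDocs#rrf` and it does not use Lucene at all: `RrfSketch`, `key`, `shardOf` and `docOf` are hypothetical names, and each hit is packed into a `long` to keep the example dependency-free. In Lucene itself the corresponding fields on `ScoreDoc` are `shardIndex` and `doc`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of Lucene.
final class RrfSketch {

  /** Packs (shardIndex, doc) into one long so it can identify a document across lists. */
  static long key(int shardIndex, int doc) {
    return ((long) shardIndex << 32) | (doc & 0xFFFFFFFFL);
  }

  static int shardOf(long key) {
    return (int) (key >> 32);
  }

  static int docOf(long key) {
    return (int) key;
  }

  /**
   * Fuses ranked lists (best hit first). Hits with the same (shardIndex, doc) key are
   * treated as the same document and their 1 / (k + rank) contributions are summed.
   * Returns at most topN entries, highest fused score first.
   */
  static List<Map.Entry<Long, Double>> rrf(int topN, int k, List<long[]> rankedLists) {
    Map<Long, Double> scores = new HashMap<>();
    for (long[] ranked : rankedLists) {
      for (int i = 0; i < ranked.length; i++) {
        scores.merge(ranked[i], 1.0 / (k + i + 1), Double::sum);
      }
    }
    List<Map.Entry<Long, Double>> fused = new ArrayList<>(scores.entrySet());
    // Descending by fused score; ties broken by key so the order is deterministic.
    fused.sort((a, b) -> {
      int c = Double.compare(b.getValue(), a.getValue());
      return c != 0 ? c : Long.compare(a.getKey(), b.getKey());
    });
    return new ArrayList<>(fused.subList(0, Math.min(topN, fused.size())));
  }

  public static void main(String[] args) {
    // Two result lists for the same index, e.g. one keyword query and one vector query.
    long[] keyword = {key(0, 7), key(1, 3), key(0, 2)};
    long[] vector = {key(1, 3), key(0, 9), key(0, 7)};
    for (Map.Entry<Long, Double> e : rrf(3, 60, List.of(keyword, vector))) {
      System.out.printf(
          "shard=%d doc=%d score=%.6f%n", shardOf(e.getKey()), docOf(e.getKey()), e.getValue());
    }
  }
}
```

In this example the hit (shard 1, doc 3) comes out on top because it appears near the head of both lists, followed by (shard 0, doc 7) and then (shard 0, doc 9).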

> Back in the day, I was reasoning on this and I didn't
> think Lucene was the right place for an interleaving
> algorithm, given that Reciprocal Rank Fusion is affected
> by distribution and it's not supposed to work per node.

To me this is like `TopDocs#merge`. There are changes
needed on the application side to hook this call into the
logic that combines hits that come from multiple shards
(multiple queries in the case of RRF), but Lucene can
still provide the merging logic.

On Mon, May 13, 2024 at 1:41 PM Michael Wechner
 wrote:

Thanks for your feedback Alessandro!

I am using Lucene 

Re: Any recommended issues to work on for a newcomer?

2024-05-31 Thread Alessandro Benedetti
Just for your curiosity, my Reciprocal Rank Fusion contribution to Solr is
in decent shape now:
https://github.com/apache/solr/pull/2489
Everything is on the Solr side, but maybe it can serve as
inspiration if you want to do similar work in Lucene.

Cheers
--
*Alessandro Benedetti*
Director @ Sease Ltd.
*Apache Lucene/Solr Committer*
*Apache Solr PMC Member*

e-mail: a.benede...@sease.io


*Sease* - Information Retrieval Applied
Consulting | Training | Open Source

Website: Sease.io
LinkedIn | Twitter | Youtube | Github



On Mon, 20 May 2024 at 08:16, Michael Wechner 
wrote:

> Hi Hank
>
> Very cool, thank you, will try to do this asap!
>
> All the best
>
> Michael
>
>
> Am 19.05.24 um 01:42 schrieb Chang Hank:
>
> Hey Michael,
>
> I wrote a first version of my idea for implementing RRF in Lucene;
> here is the link to the code:
> https://gist.github.com/hack4chang/ee2b37eab80bd82e574ff4f94ed204e9
> Right now I have some questions: one is about the shardIndex to be
> returned, another is about the TotalHits value. Please take a look at the
> code and kindly leave some comments below.
>
> Thanks,
> Hank
>
> On May 18, 2024, at 2:01 PM, Chang Hank 
>  wrote:
>
> Or maybe we can first create an issue and PR based on the issue number?
> WDYT?
>
> Best,
>
> Hank
>
> On May 18, 2024, at 11:29 AM, Chang Hank 
>  wrote:
>
> Hey Michael,
>
> Sorry, I was a bit busy this week, but I’ve looked into the resources you
> provided and also some useful advice from Alessandro and Adrien.
>
> I have a brief understanding of how RRF works, but I’m not quite sure
> how we should implement it. Based on the advice from Alessandro and Adrien,
> it seems we need to consider that the search results are located on
> different shards. According to Alessandro, we should aggregate the ranked
> lists from all distributed nodes and then apply RRF.
> Are we going to implement this aggregation logic inside our RRF method?
>
> Also could you please create a PR so we can discuss more details further?
>
> All the best,
>
> Hank
>
> On May 13, 2024, at 10:09 AM, Michael Wechner 
>  wrote:
>
> Great, sounds like we have a plan :-)
>
> Hank and I can get started trying to understand the internals better ...
>
> Thanks
>
> Michael
>
> Am 13.05.24 um 18:21 schrieb Alessandro Benedetti:
>
> Sure, we can make it work, but in a distributed environment you first have
> to run each query distributed (aggregating across all nodes) and then apply
> RRF on top of the aggregated ranked lists.
> Doing RRF per node first and then aggregating per shard won't return the
> same results, I suspect.
> When I go back to working on the task I'll be able to elaborate more!
>
> Cheers
> --
> *Alessandro Benedetti*
> Director @ Sease Ltd.
> *Apache Lucene/Solr Committer*
> *Apache Solr PMC Member*
>
> e-mail: a.benede...@sease.io
>
>
> *Sease* - Information Retrieval Applied
> Consulting | Training | Open Source
>
> Website: Sease.io
> LinkedIn | Twitter | Youtube | Github
>
>
>
> On Mon, 13 May 2024 at 14:12, Adrien Grand  wrote:
>
>> > Maybe Adrien Grand and others might also have some feedback :-)
>>
>> I'd suggest the signature to look something like `TopDocs TopDocs#rrf(int
>> topN, int k, TopDocs[] hits)` to be consistent with `TopDocs#merge`.
>> Internally, it should look at `ScoreDoc#shardIndex` and `ScoreDoc#doc` to
>> figure out which hits map to the same document.
>>
>> > Back in the day, I was reasoning on this and I didn't think Lucene was
>> the right place for an interleaving algorithm, given that Reciprocal Rank
>> Fusion is affected by distribution and it's not supposed to work per node.
>>
>> To me this is like `TopDocs#merge`. There are changes needed on the
>> application side to hook this call into the logic that combines hits that
>> come from multiple shards (multiple queries in the case of RRF), but Lucene
>> can still provide the merging logic.
>>
>> On Mon, May 13, 2024 at 1:41 PM Michael Wechner <
>> michael.wech...@wyona.com> wrote:
>>
>>> Thanks for your feedback Alessandro!
>>>
>>> I am using Lucene independent of Solr or OpenSearch, Elasticsearch, but
>>> would like to combine different result sets using RRF, therefore think that
>>> Lucene itself could be a good place actually.
>>>
>>> Looking forward to your additional elaboration!
>>>
>>> Thanks
>>>
>>> Michael
>>>
>>>
>>>
>>>
>>> Am 13.05.2024 um 12:34 schrieb Alessandro Benedetti <
>>> a.benede...@sease.io>:
>>>
>>> This is not strictly related to Lucene, but I'll give a talk at Berlin
>>> Buzzwords on how I am implementing Reciprocal Rank Fusion in Apache Solr.
>>> I'll resume my work 

Re: Seeking Insights on New Features in Lucene

2024-05-30 Thread Robert Muir
No problem, I remember the thread was a bit old.

I edited the 10.0 milestone on github with the link to that thread.
The milestone page might be helpful also as you can see issue links:

https://github.com/apache/lucene/milestone/2

On Thu, May 30, 2024 at 11:10 PM Chang Hank  wrote:
>
> Thanks for this reference. I'm sorry, I just joined this month and hadn’t 
> read this announcement before.
> I really appreciate your help!
>
> Thanks,
> Hank
>
>
> > On May 30, 2024, at 8:04 PM, Robert Muir  wrote:
> >
> > Check out this thread which lists some:
> > https://lists.apache.org/thread/4bhnkkvvodxxgrpj4yqm5yrgj0ppc59r
> >
> > On Thu, May 30, 2024 at 10:49 PM Chang Hank  wrote:
> >>
> >> Hi all,
> >>
> >> I’m curious about the future development of Lucene and would like to know 
> >> if there are any planned new features.
> >> Could you share some insights into the main focus areas for upcoming 
> >> releases? Are there specific features or improvements the community is 
> >> currently working on and maybe I can help with?
> >>
> >> Best regards,
> >> Hank
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: dev-h...@lucene.apache.org
> >
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>




Re: Seeking Insights on New Features in Lucene

2024-05-30 Thread Chang Hank
Thanks for this reference. I'm sorry, I just joined this month and hadn’t read 
this announcement before. 
I really appreciate your help!

Thanks,
Hank


> On May 30, 2024, at 8:04 PM, Robert Muir  wrote:
> 
> Check out this thread which lists some:
> https://lists.apache.org/thread/4bhnkkvvodxxgrpj4yqm5yrgj0ppc59r
> 
> On Thu, May 30, 2024 at 10:49 PM Chang Hank  wrote:
>> 
>> Hi all,
>> 
>> I’m curious about the future development of Lucene and would like to know if 
>> there are any planned new features.
>> Could you share some insights into the main focus areas for upcoming 
>> releases? Are there specific features or improvements the community is 
>> currently working on and maybe I can help with?
>> 
>> Best regards,
>> Hank
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
> 





Re: Seeking Insights on New Features in Lucene

2024-05-30 Thread Robert Muir
Check out this thread which lists some:
https://lists.apache.org/thread/4bhnkkvvodxxgrpj4yqm5yrgj0ppc59r

On Thu, May 30, 2024 at 10:49 PM Chang Hank  wrote:
>
> Hi all,
>
> I’m curious about the future development of Lucene and would like to know if 
> there are any planned new features.
> Could you share some insights into the main focus areas for upcoming 
> releases? Are there specific features or improvements the community is 
> currently working on and maybe I can help with?
>
> Best regards,
> Hank




Seeking Insights on New Features in Lucene

2024-05-30 Thread Chang Hank
Hi all, 

I’m curious about the future development of Lucene and would like to know if 
there are any planned new features.
Could you share some insights into the main focus areas for upcoming releases? 
Are there specific features or improvements the community is currently working 
on and maybe I can help with?

Best regards,
Hank

Re: Lucene 9.11

2024-05-29 Thread Benjamin Trent
Hey y'all,

As part of the release process, I have cut the 9.11 branch & bumped
versions. So, be aware when backporting bug fixes. I am still fighting with
Jenkins on getting the periodic build jobs (I may not have the correct
permissions...).

I will be continuing the release process over the next day or so. It's my
first time, so I am swamped with reading :).

Thanks!

Ben


On Wed, May 29, 2024 at 9:04 AM Michael McCandless <
luc...@mikemccandless.com> wrote:

> Thanks Ben!
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Wed, May 29, 2024 at 12:45 AM Stefan Vodita 
> wrote:
>
>> Ben, I just merged #13414 ,
>> so it's not a blocker for the release.
>> Thanks again for volunteering to be release manager!
>>
>> Stefan
>>
>> On Tue, 28 May 2024 at 14:58, Benjamin Trent 
>> wrote:
>>
>>> Hey y'all,
>>>
>>> I am planning on starting the release process tomorrow (May 29).
>>>
>>> I am in the Eastern USA time zone, so I will start the process around
>>> noon UTC.
>>>
>>> I noticed one PR from Stefan. I can wait for that one if I need to.
>>>
>>> Did we figure out the hppc concerns? I saw some PR activity, wanted to
>>> make sure we are all still good with starting the release process this week.
>>>
>>> Anything else I should be aware of or wait for?
>>>
>>> Thanks!
>>>
>>> Ben Trent
>>>
>>> On Wed, May 15, 2024, 3:58 AM Chris Hegarty
>>>  wrote:
>>>
>>>> +1
>>>>
>>>> -Chris.
>>>>
>>>> > On 14 May 2024, at 16:10, Adrien Grand  wrote:
>>>> >
>>>> > +1 the 9.11 changelog looks great!
>>>> >
>>>> > On Tue, May 14, 2024 at 4:50 PM Benjamin Trent 
>>>> wrote:
>>>> > Hey y'all,
>>>> >
>>>> > Looking at changes for 9.11, we are building a significant list. I
>>>> propose we do a release in the next couple of weeks.
>>>> >
>>>> > While this email is a little early (I am about to go on vacation for
>>>> a bit), I volunteer myself as release manager.
>>>> >
>>>> > Unless there are objections, I plan on kicking off the release
>>>> process May 28th.
>>>> >
>>>> > Thanks!
>>>> >
>>>> > Ben
>>>> >
>>>> >
>>>> > --
>>>> > Adrien






Re: Lucene 9.11

2024-05-29 Thread Michael McCandless
Thanks Ben!

Mike McCandless

http://blog.mikemccandless.com


On Wed, May 29, 2024 at 12:45 AM Stefan Vodita 
wrote:

> Ben, I just merged #13414 ,
> so it's not a blocker for the release.
> Thanks again for volunteering to be release manager!
>
> Stefan
>
> On Tue, 28 May 2024 at 14:58, Benjamin Trent 
> wrote:
>
>> Hey y'all,
>>
>> I am planning on starting the release process tomorrow (May 29).
>>
>> I am in the Eastern USA time zone, so I will start the process around
>> noon UTC.
>>
>> I noticed one PR from Stefan. I can wait for that one if I need to.
>>
>> Did we figure out the hppc concerns? I saw some PR activity, wanted to
>> make sure we are all still good with starting the release process this week.
>>
>> Anything else I should be aware of or wait for?
>>
>> Thanks!
>>
>> Ben Trent
>>
>> On Wed, May 15, 2024, 3:58 AM Chris Hegarty
>>  wrote:
>>
>>> +1
>>>
>>> -Chris.
>>>
>>> > On 14 May 2024, at 16:10, Adrien Grand  wrote:
>>> >
>>> > +1 the 9.11 changelog looks great!
>>> >
>>> > On Tue, May 14, 2024 at 4:50 PM Benjamin Trent 
>>> wrote:
>>> > Hey y'all,
>>> >
>>> > Looking at changes for 9.11, we are building a significant list. I
>>> propose we do a release in the next couple of weeks.
>>> >
>>> > While this email is a little early (I am about to go on vacation for a
>>> bit), I volunteer myself as release manager.
>>> >
>>> > Unless there are objections, I plan on kicking off the release process
>>> May 28th.
>>> >
>>> > Thanks!
>>> >
>>> > Ben
>>> >
>>> >
>>> > --
>>> > Adrien
>>>
>>>
>>>
>>>


Re: Lucene 9.11

2024-05-29 Thread Stefan Vodita
Ben, I just merged #13414 , so
it's not a blocker for the release.
Thanks again for volunteering to be release manager!

Stefan

On Tue, 28 May 2024 at 14:58, Benjamin Trent  wrote:

> Hey y'all,
>
> I am planning on starting the release process tomorrow (May 29).
>
> I am in the Eastern USA time zone, so I will start the process around noon
> UTC.
>
> I noticed one PR from Stefan. I can wait for that one if I need to.
>
> Did we figure out the hppc concerns? I saw some PR activity, wanted to
> make sure we are all still good with starting the release process this week.
>
> Anything else I should be aware of or wait for?
>
> Thanks!
>
> Ben Trent
>
> On Wed, May 15, 2024, 3:58 AM Chris Hegarty
>  wrote:
>
>> +1
>>
>> -Chris.
>>
>> > On 14 May 2024, at 16:10, Adrien Grand  wrote:
>> >
>> > +1 the 9.11 changelog looks great!
>> >
>> > On Tue, May 14, 2024 at 4:50 PM Benjamin Trent 
>> wrote:
>> > Hey y'all,
>> >
>> > Looking at changes for 9.11, we are building a significant list. I
>> propose we do a release in the next couple of weeks.
>> >
>> > While this email is a little early (I am about to go on vacation for a
>> bit), I volunteer myself as release manager.
>> >
>> > Unless there are objections, I plan on kicking off the release process
>> May 28th.
>> >
>> > Thanks!
>> >
>> > Ben
>> >
>> >
>> > --
>> > Adrien
>>
>>
>>
>>


Re: Lucene 9.11

2024-05-28 Thread Michael Sokolov
I misread this as "Lucene 911" as in "Lucene Emergency!!!" -- might
not land for everyone - someday we will have Lucene 11.2? But ... no
concerns from me aside from the things you mentioned - thanks for
pushing, Ben

On Tue, May 28, 2024 at 9:58 AM Benjamin Trent  wrote:
>
> Hey y'all,
>
> I am planning on starting the release process tomorrow (May 29).
>
> I am in the Eastern USA time zone, so I will start the process around noon 
> UTC.
>
> I noticed one PR from Stefan. I can wait for that one if I need to.
>
> Did we figure out the hppc concerns? I saw some PR activity, wanted to make 
> sure we are all still good with starting the release process this week.
>
> Anything else I should be aware of or wait for?
>
> Thanks!
>
> Ben Trent
>
> On Wed, May 15, 2024, 3:58 AM Chris Hegarty 
>  wrote:
>>
>> +1
>>
>> -Chris.
>>
>> > On 14 May 2024, at 16:10, Adrien Grand  wrote:
>> >
>> > +1 the 9.11 changelog looks great!
>> >
>> > On Tue, May 14, 2024 at 4:50 PM Benjamin Trent  
>> > wrote:
>> > Hey y'all,
>> >
>> > Looking at changes for 9.11, we are building a significant list. I propose 
>> > we do a release in the next couple of weeks.
>> >
>> > While this email is a little early (I am about to go on vacation for a 
>> > bit), I volunteer myself as release manager.
>> >
>> > Unless there are objections, I plan on kicking off the release process May 
>> > 28th.
>> >
>> > Thanks!
>> >
>> > Ben
>> >
>> >
>> > --
>> > Adrien
>>
>>
>>




Re: Lucene 9.11

2024-05-28 Thread Chris Hegarty



> On 28 May 2024, at 14:57, Benjamin Trent  wrote:
> 
> ...
> 
> Did we figure out the hppc concerns? I saw some PR activity, wanted to make 
> sure we are all still good with starting the release process this week.

Hppc is no longer a concern. The issue has been addressed by 
https://github.com/apache/lucene/pull/13422

-Chris.



Re: Lucene 9.11

2024-05-28 Thread Benjamin Trent
Hey y'all,

I am planning on starting the release process tomorrow (May 29).

I am in the Eastern USA time zone, so I will start the process around noon
UTC.

I noticed one PR from Stefan. I can wait for that one if I need to.

Did we figure out the hppc concerns? I saw some PR activity, wanted to make
sure we are all still good with starting the release process this week.

Anything else I should be aware of or wait for?

Thanks!

Ben Trent

On Wed, May 15, 2024, 3:58 AM Chris Hegarty
 wrote:

> +1
>
> -Chris.
>
> > On 14 May 2024, at 16:10, Adrien Grand  wrote:
> >
> > +1 the 9.11 changelog looks great!
> >
> > On Tue, May 14, 2024 at 4:50 PM Benjamin Trent 
> wrote:
> > Hey y'all,
> >
> > Looking at changes for 9.11, we are building a significant list. I
> propose we do a release in the next couple of weeks.
> >
> > While this email is a little early (I am about to go on vacation for a
> bit), I volunteer myself as release manager.
> >
> > Unless there are objections, I plan on kicking off the release process
> May 28th.
> >
> > Thanks!
> >
> > Ben
> >
> >
> > --
> > Adrien
>
>
>
>


Re: Q: 9.x upgrade to hppc 0.9.1

2024-05-27 Thread Uwe Schindler

See the PR, a subset is enough.

The large number of classes comes from the fact that HPPC uses a
template mechanism (like the NIO Buffer classes in the JDK) to generate the
source code for all those combinations. We only need to fork the classes
we use. Strictly speaking, we do not fork classes that are part of the HPPC
repository; we fork the output of the code generator.


If we want to fork everything, I think it would be better to fork the
template engine and the templates :-)
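
To illustrate why forking every generated class multiplies so quickly, here
is a minimal sketch of template expansion in Java. It is not HPPC's actual
template engine or syntax: the ${Name}/${type} placeholders, the class name
TemplateExpansionSketch, and the list of primitive types are all assumptions
made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TemplateExpansionSketch {
    // Hypothetical template; the ${Name} and ${type} placeholders are invented for this sketch.
    static final String TEMPLATE =
        "public class ${Name}ArrayList {\n"
      + "  private ${type}[] buffer = new ${type}[8];\n"
      + "  private int size;\n"
      + "  public void add(${type} v) { /* grow and store */ }\n"
      + "}\n";

    /** Substitutes one concrete primitive type into the template. */
    public static String expand(String template, String name, String type) {
        return template.replace("${Name}", name).replace("${type}", type);
    }

    /** Generates one source file per primitive type from a single template. */
    public static Map<String, String> generateAll() {
        Map<String, String> primitives = new LinkedHashMap<>();
        primitives.put("Int", "int");
        primitives.put("Long", "long");
        primitives.put("Float", "float");
        primitives.put("Double", "double");

        Map<String, String> sources = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : primitives.entrySet()) {
            sources.put(e.getKey() + "ArrayList.java",
                        expand(TEMPLATE, e.getKey(), e.getValue()));
        }
        return sources;
    }

    public static void main(String[] args) {
        generateAll().forEach((file, src) -> System.out.println("// " + file + "\n" + src));
    }
}
```

One template yields one class per primitive type here; a map template with
independent key and value types would yield one class per key/value pair,
which is how the generated output grows into hundreds of files.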


Uwe

On 27.05.2024 at 10:08, Chris Hegarty wrote:

> Hi,
>
>> +1 to moving the hppc fork to oal.internal.
>
> +1
>
>> On 26 May 2024, at 13:33, Bruno Roustant  wrote:
>>
>> Currently the hppc fork in Lucene is composed of 15 classes and 8 test classes.
>> Forking everything in hppc would mean 525 classes and 193 test classes. I'm not
>> sure we want to fork all hppc?
>
> That sounds like quite a lot of classes. How much is actually necessary to
> remove the dependency? And/or is there a natural place where
> it makes logical sense to subset?
>
> -Chris.


--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: u...@thetaphi.de





Re: Q: 9.x upgrade to hppc 0.9.1

2024-05-27 Thread Chris Hegarty


> On 27 May 2024, at 09:08, Chris Hegarty  
> wrote:
> 
>> ...
> 
> That sounds like quite a lot of classes. How much is actually necessary to
> remove the dependency? And/or is there a natural place where it makes
> logical sense to subset?

Please ignore this comment. I see that such is already progressing.

-Chris.






Re: Q: 9.x upgrade to hppc 0.9.1

2024-05-27 Thread Chris Hegarty
Hi,

> +1 to moving the hppc fork to oal.internal.

+1

> On 26 May 2024, at 13:33, Bruno Roustant  wrote:
> 
> Currently the hppc fork in Lucene is composed of 15 classes and 8 test 
> classes.
> Forking everything in hppc would mean 525 classes and 193 test classes. I'm 
> not sure we want to fork all hppc?

That sounds like quite a lot of classes. How much is actually necessary to
remove the dependency? And/or is there a natural place where
it makes logical sense to subset?

-Chris.



Re: Q: 9.x upgrade to hppc 0.9.1

2024-05-27 Thread Dawid Weiss
Thanks Bruno - didn't want to step in the way but I can work on the PR now
(is this fine)?

D.

On Mon, May 27, 2024 at 9:03 AM Bruno Roustant 
wrote:

> Ah, I started to work on this. So I just sent an incomplete PR[1] to share
> with you Dawid, so you don't do the work twice. Feel free to modify the PR
> if needed.
>
> [1] https://github.com/apache/lucene/pull/13422
>
> On Mon, May 27, 2024 at 08:40, Dawid Weiss wrote:
>
>>
>> Hi Mike,
>>
>> These changes are rare, really. Bruno made most of them recently and the
>> core has been unchanged for quite some time now. The same goes for
>> fastutil, koloboke and other libraries. I don't think it's a problem and I
>> think it's more than fine to poach what's needed if it makes
>> people's lives easier downstream. I'll try to do this today and provide a
>> patch.
>>
>> Dawid
>>
>> On Mon, May 27, 2024 at 1:53 AM Mike Drob  wrote:
>>
>>> What is the cost of maintaining the fork? I don’t feel it’s fair to you
>>> Dawid, if we were to expect you to port over any changes made to hppc
>>> upstream.
>>>
>>> Mike
>>>
>>> On Sun, May 26, 2024 at 3:59 PM Dawid Weiss 
>>> wrote:
>>>
 If we increase the hppc fork to 23 classes and 14 test classes, then we
> can remove the hppc dependency from all modules.
> Do we agree that we should
> - Increase the fork size
> - Move it to oal.internal
> - Remove the hppc dependency from everywhere
>

 Yes, I think it's the safest way to go and it's also the cleanest -
 keeps the implementation details private and doesn't clash with anything
 out there. Dropping an existing dependency shouldn't be a problem, I think.


> Dawid, for the size of hppc, I counted the number of files with
> find . -type f | wc -l
> in hppc/build/generated/main
>

 Oh, ok. Many of these are a bit esoteric (even though we don't generate
 all combinations). Taking what's needed sounds reasonable to me - and it
 shouldn't be that much, really.

 D.


>
> On Sun, May 26, 2024 at 21:52, Dawid Weiss wrote:
>
>>
>> Hi Bruno,
>>
>> Currently the hppc fork in Lucene is composed of 15 classes and 8
>>> test classes.
>>> Forking everything in hppc would mean 525 classes and 193 test
>>> classes. I'm not sure we want to fork all hppc?
>>>
>>
>> My superficial analysis hinted at far fewer classes but I'll take a
>> look tomorrow, had a busy day today.
>>
>>
>>> +1 to moving the hppc fork to oal.internal.
>>>
>>
>> Yes, I think it's a good idea to move it and hide it, at least for
>> the module system.
>>
>> D.
>>
>>


Re: Q: 9.x upgrade to hppc 0.9.1

2024-05-27 Thread Bruno Roustant
Ah, I started to work on this. So I just sent an incomplete PR[1] to share
with you Dawid, so you don't do the work twice. Feel free to modify the PR
if needed.

[1] https://github.com/apache/lucene/pull/13422

On Mon, May 27, 2024 at 08:40, Dawid Weiss wrote:

>
> Hi Mike,
>
> These changes are rare, really. Bruno made most of them recently and the
> core has been unchanged for quite some time now. The same goes for
> fastutil, koloboke and other libraries. I don't think it's a problem and I
> think it's more than fine to poach what's needed if it makes
> people's lives easier downstream. I'll try to do this today and provide a
> patch.
>
> Dawid
>
> On Mon, May 27, 2024 at 1:53 AM Mike Drob  wrote:
>
>> What is the cost of maintaining the fork? I don’t feel it’s fair to you
>> Dawid, if we were to expect you to port over any changes made to hppc
>> upstream.
>>
>> Mike
>>
>> On Sun, May 26, 2024 at 3:59 PM Dawid Weiss 
>> wrote:
>>
>>> If we increase the hppc fork to 23 classes and 14 test classes, then we
 can remove the hppc dependency from all modules.
 Do we agree that we should
 - Increase the fork size
 - Move it to oal.internal
 - Remove the hppc dependency from everywhere

>>>
>>> Yes, I think it's the safest way to go and it's also the cleanest -
>>> keeps the implementation details private and doesn't clash with anything
>>> out there. Dropping an existing dependency shouldn't be a problem, I think.
>>>
>>>
 Dawid, for the size of hppc, I counted the number of files with
 find . -type f | wc -l
 in hppc/build/generated/main

>>>
>>> Oh, ok. Many of these are a bit esoteric (even though we don't generate
>>> all combinations). Taking what's needed sounds reasonable to me - and it
>>> shouldn't be that much, really.
>>>
>>> D.
>>>
>>>

 On Sun, May 26, 2024 at 21:52, Dawid Weiss wrote:

>
> Hi Bruno,
>
> Currently the hppc fork in Lucene is composed of 15 classes and 8 test
>> classes.
>> Forking everything in hppc would mean 525 classes and 193 test
>> classes. I'm not sure we want to fork all hppc?
>>
>
> My superficial analysis hinted at far fewer classes but I'll take a
> look tomorrow, had a busy day today.
>
>
>> +1 to moving the hppc fork to oal.internal.
>>
>
> Yes, I think it's a good idea to move it and hide it, at least for the
> module system.
>
> D.
>
>


Re: Q: 9.x upgrade to hppc 0.9.1

2024-05-27 Thread Dawid Weiss
Hi Mike,

These changes are rare, really. Bruno made most of them recently and the
core has been unchanged for quite some time now. The same goes for
fastutil, koloboke and other libraries. I don't think it's a problem and I
think it's more than fine to poach what's needed if it makes
people's lives easier downstream. I'll try to do this today and provide a
patch.

Dawid

On Mon, May 27, 2024 at 1:53 AM Mike Drob  wrote:

> What is the cost of maintaining the fork? I don’t feel it’s fair to you
> Dawid, if we were to expect you to port over any changes made to hppc
> upstream.
>
> Mike
>
> On Sun, May 26, 2024 at 3:59 PM Dawid Weiss  wrote:
>
>> If we increase the hppc fork to 23 classes and 14 test classes, then we
>>> can remove the hppc dependency from all modules.
>>> Do we agree that we should
>>> - Increase the fork size
>>> - Move it to oal.internal
>>> - Remove the hppc dependency from everywhere
>>>
>>
>> Yes, I think it's the safest way to go and it's also the cleanest - keeps
>> the implementation details private and doesn't clash with anything out
>> there. Dropping an existing dependency shouldn't be a problem, I think.
>>
>>
>>> Dawid, for the size of hppc, I counted the number of files with
>>> find . -type f | wc -l
>>> in hppc/build/generated/main
>>>
>>
>> Oh, ok. Many of these are a bit esoteric (even though we don't generate
>> all combinations). Taking what's needed sounds reasonable to me - and it
>> shouldn't be that much, really.
>>
>> D.
>>
>>
>>>
>>> On Sun, May 26, 2024 at 21:52, Dawid Weiss wrote:
>>>

 Hi Bruno,

 Currently the hppc fork in Lucene is composed of 15 classes and 8 test
> classes.
> Forking everything in hppc would mean 525 classes and 193 test
> classes. I'm not sure we want to fork all hppc?
>

 My superficial analysis hinted at far fewer classes but I'll take a
 look tomorrow, had a busy day today.


> +1 to moving the hppc fork to oal.internal.
>

 Yes, I think it's a good idea to move it and hide it, at least for the
 module system.

 D.




Re: Q: 9.x upgrade to hppc 0.9.1

2024-05-26 Thread Mike Drob
What is the cost of maintaining the fork? I don’t feel it’s fair to you
Dawid, if we were to expect you to port over any changes made to hppc
upstream.

Mike

On Sun, May 26, 2024 at 3:59 PM Dawid Weiss  wrote:

> If we increase the hppc fork to 23 classes and 14 test classes, then we
>> can remove the hppc dependency from all modules.
>> Do we agree that we should
>> - Increase the fork size
>> - Move it to oal.internal
>> - Remove the hppc dependency from everywhere
>>
>
> Yes, I think it's the safest way to go and it's also the cleanest - keeps
> the implementation details private and doesn't clash with anything out
> there. Dropping an existing dependency shouldn't be a problem, I think.
>
>
>> Dawid, for the size of hppc, I counted the number of files with
>> find . -type f | wc -l
>> in hppc/build/generated/main
>>
>
> Oh, ok. Many of these are a bit esoteric (even though we don't generate
> all combinations). Taking what's needed sounds reasonable to me - and it
> shouldn't be that much, really.
>
> D.
>
>
>>
>> On Sun, May 26, 2024 at 21:52, Dawid Weiss wrote:
>>
>>>
>>> Hi Bruno,
>>>
>>> Currently the hppc fork in Lucene is composed of 15 classes and 8 test
 classes.
 Forking everything in hppc would mean 525 classes and 193 test classes.
 I'm not sure we want to fork all hppc?

>>>
>>> My superficial analysis hinted at far fewer classes but I'll take a look
>>> tomorrow, had a busy day today.
>>>
>>>
 +1 to moving the hppc fork to oal.internal.

>>>
>>> Yes, I think it's a good idea to move it and hide it, at least for the
>>> module system.
>>>
>>> D.
>>>
>>>


Re: Q: 9.x upgrade to hppc 0.9.1

2024-05-26 Thread Dawid Weiss
>
> If we increase the hppc fork to 23 classes and 14 test classes, then we
> can remove the hppc dependency from all modules.
> Do we agree that we should
> - Increase the fork size
> - Move it to oal.internal
> - Remove the hppc dependency from everywhere
>

Yes, I think it's the safest way to go and it's also the cleanest - keeps
the implementation details private and doesn't clash with anything out
there. Dropping an existing dependency shouldn't be a problem, I think.


> Dawid, for the size of hppc, I counted the number of files with
> find . -type f | wc -l
> in hppc/build/generated/main
>

Oh, ok. Many of these are a bit esoteric (even though we don't generate all
combinations). Taking what's needed sounds reasonable to me - and it
shouldn't be that much, really.

D.


>
> On Sun, May 26, 2024 at 21:52, Dawid Weiss wrote:
>
>>
>> Hi Bruno,
>>
>> Currently the hppc fork in Lucene is composed of 15 classes and 8 test
>>> classes.
>>> Forking everything in hppc would mean 525 classes and 193 test classes.
>>> I'm not sure we want to fork all hppc?
>>>
>>
>> My superficial analysis hinted at far fewer classes but I'll take a look
>> tomorrow, had a busy day today.
>>
>>
>>> +1 to moving the hppc fork to oal.internal.
>>>
>>
>> Yes, I think it's a good idea to move it and hide it, at least for the
>> module system.
>>
>> D.
>>
>>


Re: Q: 9.x upgrade to hppc 0.9.1

2024-05-26 Thread Bruno Roustant
If we increase the hppc fork to 23 classes and 14 test classes, then we can
remove the hppc dependency from all modules.
Do we agree that we should
- Increase the fork size
- Move it to oal.internal
- Remove the hppc dependency from everywhere

I can send a PR for this soon.

Dawid, for the size of hppc, I counted the number of files with
find . -type f | wc -l
in hppc/build/generated/main

On Sun, May 26, 2024 at 21:52, Dawid Weiss wrote:

>
> Hi Bruno,
>
> Currently the hppc fork in Lucene is composed of 15 classes and 8 test
>> classes.
>> Forking everything in hppc would mean 525 classes and 193 test classes.
>> I'm not sure we want to fork all hppc?
>>
>
> My superficial analysis hinted at far fewer classes but I'll take a look
> tomorrow, had a busy day today.
>
>
>> +1 to moving the hppc fork to oal.internal.
>>
>
> Yes, I think it's a good idea to move it and hide it, at least for the
> module system.
>
> D.
>
>


Re: Q: 9.x upgrade to hppc 0.9.1

2024-05-26 Thread Dawid Weiss
Hi Bruno,

Currently the hppc fork in Lucene is composed of 15 classes and 8 test
> classes.
> Forking everything in hppc would mean 525 classes and 193 test classes.
> I'm not sure we want to fork all hppc?
>

My superficial analysis hinted at far fewer classes but I'll take a look
tomorrow, had a busy day today.


> +1 to moving the hppc fork to oal.internal.
>

Yes, I think it's a good idea to move it and hide it, at least for the
module system.

D.


Re: Q: 9.x upgrade to hppc 0.9.1

2024-05-26 Thread Chris Hegarty
Hi Dawid,

> On 25 May 2024, at 21:08, Dawid Weiss  wrote:
> 
> ...
> 
> I understand it's a pain if the order changes from run to run but I don't see 
> a way this can be avoided ([1] is the issue you mentioned on gh). Tests (and 
> code) shouldn't rely on map/set ordering, although I realize it may be 
> difficult to weed out in such a large codebase.

To be clear, I agree, the bug is in the Elasticsearch code - it should not 
depend upon iteration order of these collection types. And yes, it’s difficult 
to weed out and fix, which we’ll continue to work on.
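
A minimal sketch of the kind of fix involved, using only java.util
collections rather than HPPC (the class and method names are hypothetical):
instead of consuming a hash map in whatever order it happens to iterate,
sort the keys first so the result does not depend on the map implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OrderIndependentIteration {
    /** Returns the map's keys in sorted order, independent of hash iteration order. */
    public static List<String> sortedKeys(Map<String, Integer> counts) {
        List<String> keys = new ArrayList<>(counts.keySet());
        Collections.sort(keys);
        return keys;
    }

    /** Renders the entries deterministically by walking the sorted keys. */
    public static String render(Map<String, Integer> counts) {
        StringBuilder sb = new StringBuilder();
        for (String key : sortedKeys(counts)) {
            sb.append(key).append('=').append(counts.get(key)).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("title", 3);
        counts.put("body", 7);
        counts.put("author", 1);
        // Prints author=1;body=7;title=3; regardless of insertion or hash order.
        System.out.println(render(counts));
    }
}
```

Tests written against render() keep passing if the underlying map changes
its iteration order between runs or library versions.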

> For what it's worth, the next version of HPPC will be a proper module (with 
> com.carrotsearch.hppc id). Would it change anything/ make it easier if I 
> renamed it to just 'hppc'?

Moving to an explicit module with a module-info sounds good. The name, 
com.carrotsearch.hppc, is a fine name for this. No need to revert to the 
automatic module name.

-Chris.



Re: Q: 9.x upgrade to hppc 0.9.1

2024-05-26 Thread Bruno Roustant
Currently the hppc fork in Lucene is composed of 15 classes and 8 test
classes.
Forking everything in hppc would mean 525 classes and 193 test classes. I'm
not sure we want to fork all hppc?

+1 to moving the hppc fork to oal.internal.

On Sun, May 26, 2024 at 12:22, Uwe Schindler wrote:

> Hi,
>
> I was also wondering why parts of hppc were forked/copied to Lucene Core,
> others not. IMHO it should be consistent.
>
> I also agree that we should remove the classes completely from the util
> package (public part of API) and move them to the non-exported packages
> under oal.internal. Of course this does not prevent classpath users from
> using those classes (P.S.: for the SharedSecrets and Vectorization there's
> stack inspection to prevent invalid callers from using them, but that's not
> needed for packages here as they cannot bring any risk for code when
> keeping public).
>
> +1 to move the classes and fork everything of HPPC to oal.internal package
> and only export it to specific modules in the module-info by a specific
> export (like for test.framework).
>
> Uwe
> On 26.05.2024 at 10:31, Dawid Weiss wrote:
>
>
> I will not have the time for this today but took a quick look and I think
> these external dependencies on hppc can be removed after the work Bruno has
> done to port some of these utility classes to the core. I'd also move the
> entire Lucene hppc fork under internal and only expose it to other Lucene
> modules that need it - would have to verify that no class is part of the
> public API but I don't think it is (in spatial3d and spatial-extras).
>
> Dawid
>
> On Sat, May 25, 2024 at 10:08 PM Dawid Weiss 
> wrote:
>
>>
>> Hi Chris,
>>
>> Since Elasticsearch is deployed as a module, then we need to update to
>>> hppc 0.9.1 [2], but unfortunately this is not straightforward. In fact,
>>> Ryan has a PR open [3] for the past 2 years without completion! The
>>> iteration order of some collection types in hppc 0.9.x [*] is tickling some
>>> inadvertent order dependencies in Elasticsearch. It may take some time to
>>> track these down and fix them.
>>>
>>
>> I understand it's a pain if the order changes from run to run but I don't
>> see a way this can be avoided ([1] is the issue you mentioned on gh). Tests
>> (and code) shouldn't rely on map/set ordering, although I realize it may be
>> difficult to weed out in such a large codebase.
>>
>> For what it's worth, the next version of HPPC will be a proper module
>> (with com.carrotsearch.hppc id). Would it change anything/ make it easier
>> if I renamed it to just 'hppc'?
>>
>> I wonder if others may run into either or both of these issues, as we
>>> have in Elasticsearch, if we release 9.11 with this change?
>>>
>>
>> That's why I wasn't entirely sold on having HPPC as the dependency from
>> Lucene when Bruno mentioned it recently - the jar/module hell will surface
>> sooner than later... Maybe it'd be a better idea to just copy what's needed
>> to the core jar and expose those packages to other Lucene modules (so that
>> there is no explicit dependency on HPPC at all)? Bruno copied a lot of
>> those classes already anyway - don't know how much of it is left to copy to
>> drop the dependency.
>>
>> Dawid
>>
>> [1] https://github.com/carrotsearch/hppc/issues/228
>> [2]
>> https://github.com/carrotsearch/hppc/commit/d569a8944091844c62349646f8eeaf35ebfb5ba6
>>
>>
>>>
>>> -Chris.
>>>
>>> [1] https://github.com/apache/lucene/pull/13392
>>> [2] https://github.com/elastic/elasticsearch/pull/109006
>>> [3] https://github.com/elastic/elasticsearch/pull/84168
>>>
>>> [*] HPPC-186: A different strategy has been implemented for collision
>>> avalanche avoidance. This results in removal of Scatter* maps and sets and
>>> their unification with their Hash* counterparts. This change should not
>>> affect any existing code unless it relied on static, specific ordering of
>>> keys. A side effect of this change is that key/value enumerators will
>>> return a different ordering of their container's values on each invocation.
>>> If your code relies on the order of values in associative arrays, it must
>>> order them after they are retrieved. (Bruno Roustant).
>>>
>>> --
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>


Re: Q: 9.x upgrade to hppc 0.9.1

2024-05-26 Thread Uwe Schindler

Hi,

I was also wondering why parts of hppc were forked/copied to Lucene 
Core, others not. IMHO it should be consistent.


I also agree that we should remove the classes completely from the util
package (public part of API) and move them to the non-exported packages
under oal.internal. Of course this does not prevent classpath users from
using those classes (P.S.: for the SharedSecrets and Vectorization
there's stack inspection to prevent invalid callers from using them, but
that's not needed for packages here as they cannot bring any risk for
code when keeping public).


+1 to move the classes and fork everything of HPPC to oal.internal 
package and only export it to specific modules in the module-info by a 
specific export (like for test.framework).
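
A sketch of what such a qualified export could look like in the core module
descriptor. This is only a fragment, and the package name
org.apache.lucene.internal.hppc and the list of target modules are
assumptions for illustration, not taken from an actual PR.

```java
// module-info.java for the core module (fragment, illustrative only)
module org.apache.lucene.core {
  // Regular export: visible to every module that reads the core module.
  exports org.apache.lucene.util;

  // Qualified export: the package is visible only to the modules listed after "to".
  // The package name and target modules here are hypothetical.
  exports org.apache.lucene.internal.hppc to
      org.apache.lucene.facet,
      org.apache.lucene.join;
}
```

As noted above, this only restricts users on the module path; on the
classpath the classes remain reachable.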


Uwe

On 26.05.2024 at 10:31, Dawid Weiss wrote:


I will not have the time for this today but took a quick look and I 
think these external dependencies on hppc can be removed after the 
work Bruno has done to port some of these utility classes to the core. 
I'd also move the entire Lucene hppc fork under internal and only 
expose it to other Lucene modules that need it - would have to verify 
that no class is part of the public API but I don't think it is (in 
spatial3d and spatial-extras).


Dawid

On Sat, May 25, 2024 at 10:08 PM Dawid Weiss  
wrote:



Hi Chris,

Since Elasticsearch is deployed as a module, then we need to
update to hppc 0.9.1 [2], but unfortunately this is not
straightforward. In fact, Ryan has a PR open [3] for the past
2 years without completion! The iteration order of some
collection types in hppc 0.9.x [*] is tickling some
inadvertent order dependencies in Elasticsearch. It may take
some time to track these down and fix them.


I understand it's a pain if the order changes from run to run but
I don't see a way this can be avoided ([1] is the issue you
mentioned on gh). Tests (and code) shouldn't rely on map/set
ordering, although I realize it may be difficult to weed out in
such a large codebase.

For what it's worth, the next version of HPPC will be a proper
module (with com.carrotsearch.hppc id). Would it change anything/
make it easier if I renamed it to just 'hppc'?

I wonder if others may run into either or both of these
issues, as we have in Elasticsearch, if we release 9.11 with
this change?


That's why I wasn't entirely sold on having HPPC as the dependency
from Lucene when Bruno mentioned it recently - the jar/module hell
will surface sooner than later... Maybe it'd be a better idea to
just copy what's needed to the core jar and expose those packages
to other Lucene modules (so that there is no explicit dependency
on HPPC at all)? Bruno copied a lot of those classes already
anyway - don't know how much of it is left to copy to drop the
dependency.

Dawid

[1] https://github.com/carrotsearch/hppc/issues/228
[2]

https://github.com/carrotsearch/hppc/commit/d569a8944091844c62349646f8eeaf35ebfb5ba6


-Chris.

[1] https://github.com/apache/lucene/pull/13392
[2] https://github.com/elastic/elasticsearch/pull/109006
[3] https://github.com/elastic/elasticsearch/pull/84168

[*] HPPC-186: A different strategy has been implemented for
collision avalanche avoidance. This results in removal of
Scatter* maps and sets and their unification with their Hash*
counterparts. This change should not affect any existing code
unless it relied on static, specific ordering of keys. A side
effect of this change is that key/value enumerators will
return a different ordering of their container's values on
each invocation. If your code relies on the order of values in
associative arrays, it must order them after they are
retrieved. (Bruno Roustant).


--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail:u...@thetaphi.de


Re: Q: 9.x upgrade to hppc 0.9.1

2024-05-26 Thread Bruno Roustant
I didn't copy all of hppc; the Lucene hppc fork is limited.
I know the facet module uses some hppc classes that are not in the fork,
and it has had the hppc jar dependency for a while. So maybe we can keep
this dependency?
For the new dependencies that I added to the join and spatial modules,
maybe we can remove them. But that probably requires adapting the code in
some way to use only the fork.

Bruno

On Sun, May 26, 2024 at 10:32, Dawid Weiss wrote:

>
> I will not have the time for this today but took a quick look and I think
> these external dependencies on hppc can be removed after the work Bruno has
> done to port some of these utility classes to the core. I'd also move the
> entire Lucene hppc fork under internal and only expose it to other Lucene
> modules that need it - would have to verify that no class is part of the
> public API but I don't think it is (in spatial3d and spatial-extras).
>
> Dawid
>
> On Sat, May 25, 2024 at 10:08 PM Dawid Weiss 
> wrote:
>
>>
>> Hi Chris,
>>
>> Since Elasticsearch is deployed as a module, we need to update to
>>> hppc 0.9.1 [2], but unfortunately this is not straightforward. In fact,
>>> Ryan has had a PR open [3] for the past 2 years without completion! The
>>> iteration order of some collection types in hppc 0.9.x [*] is tickling some
>>> inadvertent order dependencies in Elasticsearch. It may take some time to
>>> track these down and fix them.
>>>
>>
>> I understand it's a pain if the order changes from run to run but I don't
>> see a way this can be avoided ([1] is the issue you mentioned on gh). Tests
>> (and code) shouldn't rely on map/set ordering, although I realize it may be
>> difficult to weed out in such a large codebase.
>>
>> For what it's worth, the next version of HPPC will be a proper module
>> (with com.carrotsearch.hppc id). Would it change anything/ make it easier
>> if I renamed it to just 'hppc'?
>>
>> I wonder if others may run into either or both of these issues, as we
>>> have in Elasticsearch, if we release 9.11 with this change?
>>>
>>
>> That's why I wasn't entirely sold on having HPPC as the dependency from
>> Lucene when Bruno mentioned it recently - the jar/module hell will surface
>> sooner rather than later... Maybe it'd be a better idea to just copy what's needed
>> to the core jar and expose those packages to other Lucene modules (so that
>> there is no explicit dependency on HPPC at all)? Bruno copied a lot of
>> those classes already anyway - don't know how much of it is left to copy to
>> drop the dependency.
>>
>> Dawid
>>
>> [1] https://github.com/carrotsearch/hppc/issues/228
>> [2] https://github.com/carrotsearch/hppc/commit/d569a8944091844c62349646f8eeaf35ebfb5ba6
>>
>>
>>>
>>> -Chris.
>>>
>>> [1] https://github.com/apache/lucene/pull/13392
>>> [2] https://github.com/elastic/elasticsearch/pull/109006
>>> [3] https://github.com/elastic/elasticsearch/pull/84168
>>>
>>> [*] HPPC-186: A different strategy has been implemented for collision
>>> avalanche avoidance. This results in removal of Scatter* maps and sets and
>>> their unification with their Hash* counterparts. This change should not
>>> affect any existing code unless it relied on static, specific ordering of
>>> keys. A side effect of this change is that key/value enumerators will
>>> return a different ordering of their container's values on each invocation.
>>> If your code relies on the order of values in associative arrays, it must
>>> order them after they are retrieved. (Bruno Roustant).
>>>
>>>

