Re: (lucene) branch main updated: Use jdk11 primitives in test to allow backport to branch_9x (#13311)
Hi,

why did you not first backport and apply this change only to 9.x? If we have better methods available in Java 21, why not use them? We also change large parts of code to "record" classes, also not available in Java 11.

Uwe

On 17.04.2024 at 08:17, vigyasha...@apache.org wrote:

This is an automated email from the ASF dual-hosted git repository.

vigyasharma pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/lucene.git

The following commit(s) were added to refs/heads/main by this push:
     new bc678ac67e3 Use jdk11 primitives in test to allow backport to branch_9x (#13311)
bc678ac67e3 is described below

commit bc678ac67e32c55a27a4e8950c25144cc89cef66
Author: Vigya Sharma
AuthorDate: Tue Apr 16 23:17:43 2024 -0700

    Use jdk11 primitives in test to allow backport to branch_9x (#13311)
---
 .../src/test/org/apache/lucene/search/BaseKnnVectorQueryTestCase.java | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lucene/core/src/test/org/apache/lucene/search/BaseKnnVectorQueryTestCase.java b/lucene/core/src/test/org/apache/lucene/search/BaseKnnVectorQueryTestCase.java
index 2ae0ae14a29..2546115ff4f 100644
--- a/lucene/core/src/test/org/apache/lucene/search/BaseKnnVectorQueryTestCase.java
+++ b/lucene/core/src/test/org/apache/lucene/search/BaseKnnVectorQueryTestCase.java
@@ -781,7 +781,7 @@ abstract class BaseKnnVectorQueryTestCase extends LuceneTestCase {
         TimeLimitingKnnCollectorManager noTimeoutManager =
             new TimeLimitingKnnCollectorManager(delegate, null);
         KnnCollector noTimeoutCollector =
-            noTimeoutManager.newCollector(Integer.MAX_VALUE, searcher.leafContexts.getFirst());
+            noTimeoutManager.newCollector(Integer.MAX_VALUE, searcher.leafContexts.get(0));

         // Check that a normal collector is created without timeout
         assertTrue(noTimeoutCollector instanceof TopKnnCollector);
@@ -797,7 +797,7 @@ abstract class BaseKnnVectorQueryTestCase extends LuceneTestCase {
         TimeLimitingKnnCollectorManager timeoutManager =
             new TimeLimitingKnnCollectorManager(delegate, () -> true);
         KnnCollector timeoutCollector =
-            timeoutManager.newCollector(Integer.MAX_VALUE, searcher.leafContexts.getFirst());
+            timeoutManager.newCollector(Integer.MAX_VALUE, searcher.leafContexts.get(0));

         // Check that a time limiting collector is created, which returns partial results
         assertFalse(timeoutCollector instanceof TopKnnCollector);

--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: u...@thetaphi.de

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
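The commit quoted above swaps `List.getFirst()`, which arrived in Java 21 with `SequencedCollection`, for `List.get(0)`, which also compiles on Java 11. A minimal sketch of the difference follows; the helper class `FirstElement` is hypothetical and not part of Lucene.

```java
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical helper, not part of Lucene: a Java 11 compatible stand-in for List.getFirst(). */
final class FirstElement {
  private FirstElement() {}

  /**
   * Returns the first element of the list. Like List.getFirst() on Java 21+, it throws
   * NoSuchElementException for an empty list; a plain list.get(0) would throw
   * IndexOutOfBoundsException instead.
   */
  static <T> T first(List<T> list) {
    if (list.isEmpty()) {
      throw new NoSuchElementException("list is empty");
    }
    return list.get(0);
  }

  public static void main(String[] args) {
    // Stand-in for searcher.leafContexts in the test from the commit.
    List<String> leaves = List.of("leaf0", "leaf1");
    System.out.println(first(leaves));
  }
}
```

For a list that is known to be non-empty, as in the test, `get(0)` and `getFirst()` return the same element, so the backport-friendly form loses nothing but readability.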
Re: (lucene) branch main updated: fix s/Long/Fixed in FixedBitSet javadocs (#13290)
Please run "gradlew tidy", this fails builds.

Uwe

On 11.04.2024 at 12:21, cpoersc...@apache.org wrote:

This is an automated email from the ASF dual-hosted git repository.

cpoerschke pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/lucene.git

The following commit(s) were added to refs/heads/main by this push:
     new f44ded0c95d fix s/Long/Fixed in FixedBitSet javadocs (#13290)
f44ded0c95d is described below

commit f44ded0c95dfb57d5aac4beb5b869af6e2b0598a
Author: Christine Poerschke
AuthorDate: Thu Apr 11 11:21:49 2024 +0100

    fix s/Long/Fixed in FixedBitSet javadocs (#13290)
---
 lucene/core/src/java/org/apache/lucene/util/FixedBitSet.java | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lucene/core/src/java/org/apache/lucene/util/FixedBitSet.java b/lucene/core/src/java/org/apache/lucene/util/FixedBitSet.java
index ebf626a777d..a7941ec9f0f 100644
--- a/lucene/core/src/java/org/apache/lucene/util/FixedBitSet.java
+++ b/lucene/core/src/java/org/apache/lucene/util/FixedBitSet.java
@@ -116,7 +116,7 @@ public final class FixedBitSet extends BitSet {
   }

   /**
-   * Creates a new LongBitSet. The internally allocated long array will be exactly the size needed
+   * Creates a new FixedBitSet. The internally allocated long array will be exactly the size needed
    * to accommodate the numBits specified.
    *
    * @param numBits the number of bits needed
@@ -128,7 +128,7 @@ public final class FixedBitSet extends BitSet {
   }

   /**
-   * Creates a new LongBitSet using the provided long[] array as backing store. The storedBits array
+   * Creates a new FixedBitSet using the provided long[] array as backing store. The storedBits array
    * must be large enough to accommodate the numBits specified, but may be larger. In that case the
    * 'extra' or 'ghost' bits must be clear (or they may provoke spurious side-effects)
    *

--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: u...@thetaphi.de
Re: [JENKINS] Lucene-9.x-Windows (64bit/openj9/jdk-11.0.20) - Build # 3781 - Failure!
We have build failures:

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':lucene:core:spotlessJavaCheck'.
> The following files had format violations:
      src\java\org\apache\lucene\util\FixedBitSet.java
          @@ -128,9 +128,9 @@
           ··}

           ··/**
          -···*·Creates·a·new·FixedBitSet·using·the·provided·long[]·array·as·backing·store.·The·storedBits·array
          -···*·must·be·large·enough·to·accommodate·the·numBits·specified,·but·may·be·larger.·In·that·case·the
          -···*·'extra'·or·'ghost'·bits·must·be·clear·(or·they·may·provoke·spurious·side-effects)
          +···*·Creates·a·new·FixedBitSet·using·the·provided·long[]·array·as·backing·store.·The·storedBits
          +···*·array·must·be·large·enough·to·accommodate·the·numBits·specified,·but·may·be·larger.·In·that
          +···*·case·the·'extra'·or·'ghost'·bits·must·be·clear·(or·they·may·provoke·spurious·side-effects)
           ···*
           ···*·@param·storedBits·the·array·to·use·as·backing·store
           ···*·@param·numBits·the·number·of·bits·actually·needed
  Run 'gradlew.bat :lucene:core:spotlessApply' to fix these violations.

On 11.04.2024 at 13:50, Policeman Jenkins Server wrote:

Build: https://jenkins.thetaphi.de/job/Lucene-9.x-Windows/3781/
Java: 64bit/openj9/jdk-11.0.20 -XX:+UseCompressedOops -Xgcpolicy:gencon

All tests passed

-
To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org
For additional commands, e-mail: builds-h...@lucene.apache.org

--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: u...@thetaphi.de
Re: [JENKINS] Lucene » Lucene-Check-main (s390x big endian) - Build # 460 - Still Failing!
See this issue: https://github.com/apache/lucene/issues/13161 The s390x server (big endian) has no Java 21 yet. I'll keep the job enabled, should work soon. Uwe Am 06.03.2024 um 23:09 schrieb Apache Jenkins Server: Build: https://ci-builds.apache.org/job/Lucene/job/Lucene-Check-main%20(s390x%20big%20endian)/460/ No tests ran. Build Log: [...truncated 29 lines...] ERROR: JAVA_HOME is set to an invalid directory: /home/jenkins/tools/java/latest21 Please set the JAVA_HOME variable in your environment to match the location of your Java installation. Build step 'Invoke Gradle script' changed build result to FAILURE Build step 'Invoke Gradle script' marked build as failure Archiving artifacts Recording test results ERROR: Step ‘Publish JUnit test result report’ failed: No test report files were found. Configuration error? Email was triggered for: Failure - Any Sending email for trigger: Failure - Any - To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org For additional commands, e-mail: builds-h...@lucene.apache.org -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail: u...@thetaphi.de - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Query about the GitHub statistics for Lucene
Hi,

Yes, we should contact INFRA so they bring all the repository links up to date. Maybe they should send us a list of tracked repos/issue trackers for us to review. There were also some crazy things, like the temporary repository that we used to migrate our issues from JIRA to GitHub being used for statistics, but NOT the apache/lucene one.

The statistics for JIRA are clearly wrong, too. The last change in JIRA was Aug 19, 2022.

Uwe

On 05.03.2024 at 14:26, Robert Muir wrote:

On Tue, Mar 5, 2024 at 4:50 AM Chris Hegarty wrote:

It appears that there is no GH activity for 2024! Clearly this is incorrect. I’ve yet to track down what’s going on with this. Familiar to anyone here?

Last time I looked at this, it appeared it is looking at the incorrect github repositories, for example https://github.com/apache/lucene-solr and not https://github.com/apache/lucene

--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: u...@thetaphi.de
Re: [Vote] Bump the Lucene main branch to Java 21
Hi,

this vote has passed.

I wanted to wait for Chris to merge the PR, but due to heavy work on main (removing ByteBufferIndexInput and updating Java versions), I accidentally pushed the wrong branch to main, so it is already merged. The PR was closed manually.

Lucene "main" (10.0) is now on Java 21.

Sorry, Chris - my fault!

Uwe

On 23.02.2024 at 12:24, Chris Hegarty wrote:

Hi,

Since the discussion on bumping the Lucene main branch to Java 21 is winding down, let's hold a vote on this important change.

Once bumped, the next major release of Lucene (whenever that will be) will require a version of Java greater than or equal to Java 21.

The vote will be open for at least 72 hours (and allow some additional time for the weekend) i.e. until 2024-02-28 12:00 UTC.

[ ] +1 approve
[ ] +0 no opinion
[ ] -1 disapprove (and reason why)

Here is my +1

-Chris.

--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: u...@thetaphi.de
Re: [Vote] Bump the Lucene main branch to Java 21
Here is my +1 Uwe Am 23.02.2024 um 12:24 schrieb Chris Hegarty: Hi, Since the discussion on bumping the Lucene main branch to Java 21 is winding down, let's hold a vote on this important change. Once bumped, the next major release of Lucene (whenever that will be) will require a version of Java greater than or equal to Java 21. The vote will be open for at least 72 hours (and allow some additional time for the weekend) i.e. until 2024-02-28 12:00 UTC. [ ] +1 approve [ ] +0 no opinion [ ] -1 disapprove (and reason why) Here is my +1 -Chris. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail: u...@thetaphi.de - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [VOTE] Release Lucene 9.10.0 RC1
Hi,

I used Stefan Vodita's hack to make the smoke tester run on a large list of JDKs: https://github.com/apache/lucene/pull/13108

See the console output of the run with Java 11, Java 17, Java 19, Java 20, Java 21. Due to limitations of Gradle I wasn't able to do the smoker checks on the Java 22 release candidate, but as there are no changes to the 9.x branch I assume that everything also works in Java 22. If anybody else has time to run a test project with Java 22 using mmap and vectors, it would be great!

Log file: https://jenkins.thetaphi.de/job/Lucene-Release-Tester-v2/3/console

Result was: SUCCESS! [2:42:55.968473]

Here is my +1 (binding).

Uwe

On 15.02.2024 at 12:50, Uwe Schindler wrote:

Hi,

I ran the default smoke tester with Java 11 and Java 17 on Policeman Jenkins; all looks fine: https://jenkins.thetaphi.de/job/Lucene-Release-Tester/32/console

SUCCESS! [1:04:45.740708]

I only have one problem. Now that Java 21 LTS is out and more and more people use it, it would be good to also run the smoke tester with Java 21. I tried that locally by just passing the home dir of Java 21 instead of Java 17, but that failed due to some check in the smoker.

I will work this evening on patching the smoke tester to also allow it to pass Java 21. Maybe the best would be to pass multiple Java versions as a comma-separated list; just the default one must be Java 11 (the baseline). This would allow me to spin Policeman Jenkins with Java 11, Java 17, Java 19, Java 20, Java 21 and Java 22-rc1. It takes a while, but would make sure all works in the officially MR-JAR supported releases + LTS.

What do you think?

I will give my +1 later when I have checked the options and also looked into the downloaded artifacts.

Uwe

On 14.02.2024 at 20:28, Adrien Grand wrote:

Please vote for release candidate 1 for Lucene 9.10.0

The artifacts can be downloaded from:
https://dist.apache.org/repos/dist/dev/lucene/lucene-9.10.0-RC1-rev-695c0ac84508438302cd346a812cfa2fdc5a10df

You can run the smoke tester directly with this command:

python3 -u dev-tools/scripts/smokeTestRelease.py \
https://dist.apache.org/repos/dist/dev/lucene/lucene-9.10.0-RC1-rev-695c0ac84508438302cd346a812cfa2fdc5a10df

The vote will be open for at least 72 hours i.e. until 2024-02-17 20:00 UTC.

[ ] +1 approve
[ ] +0 no opinion
[ ] -1 disapprove (and reason why)

Here is my +1

--
Adrien

--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: u...@thetaphi.de
Re: Lucene 9.10
I clarified the MMap stuff to say that it works now with Java 22 and later versions. Vector incubator was also added. Uwe Am 13.02.2024 um 14:37 schrieb Adrien Grand: I started a draft for release notes, feel free to modify or add more release highlights. https://cwiki.apache.org/confluence/display/LUCENE/Release+notes+9.10 On Thu, Feb 8, 2024 at 11:49 AM Uwe Schindler wrote: Hi Adrien, as discussed in the PR, I will merge the MMapDir and Panama Vector for JDK 22 later today or at latest tomorrow. I need to first download the RC version of JDK that is going to be released today and do the usual API consistency checks (checking no late API changes appeared). So next Wednesday is perfectly fine. Uwe Am 07.02.2024 um 15:57 schrieb Adrien Grand: Hello all, It's been 2 months since we released 9.9 and we accumulated a good number of changes, so I'd like to propose that we release 9.10.0. If there are no objections, I volunteer to be the release manager and suggest cutting the branch next Monday (February 12th) and starting the release process on Wednesday, one week from now (February 14th). +Uwe Schindler <mailto:u...@thetaphi.de> I remember that there are JDK22-related changes that you'd like to get into 9.10, feel free to let me know if this timeline doesn't work for you. -- Adrien -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de -- Adrien -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: [VOTE] Release Lucene 9.10.0 RC1
Hi,

My Python knowledge is too limited to fix the build script so that the smoker can be tested with arbitrary JAVA_HOME directories next to the baseline (Java 11). With lots of copy-paste I can make it run on Java 21 in addition to 17, but that looks too inflexible.

Mike McCandless: If you could help me make it more flexible, that would be good. I can open an issue, but maybe you have an easy solution. I am thinking of the following:

* JAVA_HOME must be Java 11 (in 9.x)
* At the moment you can pass "--test-java17 ", but this one is also checked to really be Java 17 (by parsing strings from its version output). I'd like to pass "--test-alternative-java " multiple times, and it would just run all of those as part of smoking; maybe the version number can be extracted to be printed out.

To me this is a hopeless task with Python.

Uwe

On 15.02.2024 at 12:50, Uwe Schindler wrote:

Hi,

I ran the default smoke tester with Java 11 and Java 17 on Policeman Jenkins; all looks fine: https://jenkins.thetaphi.de/job/Lucene-Release-Tester/32/console

SUCCESS! [1:04:45.740708]

I only have one problem. Now that Java 21 LTS is out and more and more people use it, it would be good to also run the smoke tester with Java 21. I tried that locally by just passing the home dir of Java 21 instead of Java 17, but that failed due to some check in the smoker.

I will work this evening on patching the smoke tester to also allow it to pass Java 21. Maybe the best would be to pass multiple Java versions as a comma-separated list; just the default one must be Java 11 (the baseline). This would allow me to spin Policeman Jenkins with Java 11, Java 17, Java 19, Java 20, Java 21 and Java 22-rc1. It takes a while, but would make sure all works in the officially MR-JAR supported releases + LTS.

What do you think?

I will give my +1 later when I have checked the options and also looked into the downloaded artifacts.

Uwe

On 14.02.2024 at 20:28, Adrien Grand wrote:

Please vote for release candidate 1 for Lucene 9.10.0

The artifacts can be downloaded from:
https://dist.apache.org/repos/dist/dev/lucene/lucene-9.10.0-RC1-rev-695c0ac84508438302cd346a812cfa2fdc5a10df

You can run the smoke tester directly with this command:

python3 -u dev-tools/scripts/smokeTestRelease.py \
https://dist.apache.org/repos/dist/dev/lucene/lucene-9.10.0-RC1-rev-695c0ac84508438302cd346a812cfa2fdc5a10df

The vote will be open for at least 72 hours i.e. until 2024-02-17 20:00 UTC.

[ ] +1 approve
[ ] +0 no opinion
[ ] -1 disapprove (and reason why)

Here is my +1

--
Adrien

--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: u...@thetaphi.de
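The proposal in the message above amounts to a repeatable option plus a version check that reports, rather than enforces, the major version of each extra JDK. The smoke tester itself is a Python script, so the Java snippet below is only a sketch of the idea under assumptions: the class `AltJavaArgs`, its methods, and the way `--test-alternative-java` is parsed are hypothetical. `Runtime.Version.parse` is the standard JDK API for version strings.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not the smoke tester's actual code. */
final class AltJavaArgs {
  private AltJavaArgs() {}

  /** Collects the value that follows every occurrence of the repeatable flag. */
  static List<String> alternativeJavaHomes(String[] args) {
    List<String> homes = new ArrayList<>();
    for (int i = 0; i < args.length; i++) {
      if (args[i].equals("--test-alternative-java")) {
        if (i + 1 >= args.length) {
          throw new IllegalArgumentException("--test-alternative-java requires a path");
        }
        homes.add(args[++i]); // consume the path so it is not re-read as a flag
      }
    }
    return homes;
  }

  /** Extracts the major ("feature") version from a version string such as "11.0.20". */
  static int majorVersion(String versionString) {
    return Runtime.Version.parse(versionString).feature();
  }

  public static void main(String[] args) {
    for (String home : alternativeJavaHomes(args)) {
      System.out.println("would also smoke test with the JDK at " + home);
    }
    System.out.println("major version of 21.0.2: " + majorVersion("21.0.2"));
  }
}
```

Accepting any number of paths and only printing the detected major version avoids a hard-coded check per release, which is what blocked the Java 21 run described in the quoted message.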
Re: [VOTE] Release Lucene 9.10.0 RC1
Hi,

I ran the default smoke tester with Java 11 and Java 17 on Policeman Jenkins; all looks fine: https://jenkins.thetaphi.de/job/Lucene-Release-Tester/32/console

SUCCESS! [1:04:45.740708]

I only have one problem. Now that Java 21 LTS is out and more and more people use it, it would be good to also run the smoke tester with Java 21. I tried that locally by just passing the home dir of Java 21 instead of Java 17, but that failed due to some check in the smoker.

I will work this evening on patching the smoke tester to also allow it to pass Java 21. Maybe the best would be to pass multiple Java versions as a comma-separated list; just the default one must be Java 11 (the baseline). This would allow me to spin Policeman Jenkins with Java 11, Java 17, Java 19, Java 20, Java 21 and Java 22-rc1. It takes a while, but would make sure all works in the officially MR-JAR supported releases + LTS.

What do you think?

I will give my +1 later when I have checked the options and also looked into the downloaded artifacts.

Uwe

On 14.02.2024 at 20:28, Adrien Grand wrote:

Please vote for release candidate 1 for Lucene 9.10.0

The artifacts can be downloaded from:
https://dist.apache.org/repos/dist/dev/lucene/lucene-9.10.0-RC1-rev-695c0ac84508438302cd346a812cfa2fdc5a10df

You can run the smoke tester directly with this command:

python3 -u dev-tools/scripts/smokeTestRelease.py \
https://dist.apache.org/repos/dist/dev/lucene/lucene-9.10.0-RC1-rev-695c0ac84508438302cd346a812cfa2fdc5a10df

The vote will be open for at least 72 hours i.e. until 2024-02-17 20:00 UTC.

[ ] +1 approve
[ ] +0 no opinion
[ ] -1 disapprove (and reason why)

Here is my +1

--
Adrien

--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: u...@thetaphi.de
Re: (lucene) branch branch_9_10 created (now 695c0ac8450)
Hi Adrien, Thanks for creating the branch. I activated Policeman Jenkins tests for it. Uwe Am 12.02.2024 um 14:30 schrieb jpou...@apache.org: This is an automated email from the ASF dual-hosted git repository. jpountz pushed a change to branch branch_9_10 in repository https://gitbox.apache.org/repos/asf/lucene.git at 695c0ac8450 Add the missing Version field for 8.11.3. (#13093) No new revisions were added by this update. -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail: u...@thetaphi.de - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Lucene 9.10
Hi Adrien, as discussed in the PR, I will merge the MMapDir and Panama Vector for JDK 22 later today or at latest tomorrow. I need to first download the RC version of JDK that is going to be released today and do the usual API consistency checks (checking no late API changes appeared). So next Wednesday is perfectly fine. Uwe Am 07.02.2024 um 15:57 schrieb Adrien Grand: Hello all, It's been 2 months since we released 9.9 and we accumulated a good number of changes, so I'd like to propose that we release 9.10.0. If there are no objections, I volunteer to be the release manager and suggest cutting the branch next Monday (February 12th) and starting the release process on Wednesday, one week from now (February 14th). +Uwe Schindler <mailto:u...@thetaphi.de> I remember that there are JDK22-related changes that you'd like to get into 9.10, feel free to let me know if this timeline doesn't work for you. -- Adrien -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: Computing weight.count() cheaply in the face of deletes?
Hi, my response was a bit unclear. Before Lucene 4.0 we saved *deletions* in a bitset (1 = doc deleted), so you were able to use the DocIdSetIterator provided directly. At this point there was no sparse implementation. My idea was more about this: "Because we marked *deleted* docs (not live docs) in the bitset, the cardinality of the Bitset was small and a sparse one would work well". Of course we can just invert on set/get to make use of a SparseFixedBitSet. Uwe Am 06.02.2024 um 21:05 schrieb Adrien Grand: Good point, I opened an issue to discuss this: https://github.com/apache/lucene/issues/13084. Did we actually use a sparse bit set to encode deleted docs before? I don't recall that. On Tue, Feb 6, 2024 at 2:42 PM Uwe Schindler wrote: Hi, A SparseBitset impl for DELETES would be fine if the model in Lucene would encode deleted docs (it did that in earlier times). As deletes are sparse (deletes are in most cases <40%), this would help to make the iterator cheaper. Uwe Am 06.02.2024 um 09:01 schrieb Adrien Grand: Hey Michael, You are right, iterating all deletes with nextClearBit() would run in O(maxDoc). I am coming from the other direction, where I'm expecting the number of deletes to be more in the order of 1%-5% of the doc ID space, so a separate int[] would use lots of heap and probably not help that much compared with nextClearBit(). My mental model is that the two most common use-cases are append-only workloads, where there are no deletes at all, and update workloads, which would commonly have several percents of deleted docs. It's not clear to me how common it is to have very few deletes. On Tue, Feb 6, 2024 at 7:03 AM Michael Froh wrote: Thanks Adrien! My thinking with a separate iterator was that nextClearBit() is relatively expensive (O(maxDoc) to traverse everything, I think). The solution I was imagining would involve an index-time change to output, say, an int[] of deleted docIDs if the number is sufficiently small (like maybe less than 1000). 
Then the livedocs interface could optionally return a cheap deleted docs iterator (i.e. only if the number of deleted docs is less than the threshold). Technically, the cost would be O(1), since we set a constant bound on the effort and fail otherwise. :) I think 1000 doc value lookups would be cheap, but I don't know if the guarantee is cheap enough to make it into Weight#count. That said, I'm going to see if iterating with nextClearBit() is sufficiently cheap. Hmm... precomputing that int[] for deleted docIDs on refresh could be an option too. Thanks again, Froh On Fri, Feb 2, 2024 at 11:38 PM Adrien Grand wrote: Hi Michael, Indeed, only MatchAllDocsQuery knows how to produce a count when there are deletes. Your idea sounds good to me, do you actually need a side car iterator for deletes, or could you use a nextClearBit() operation on the bit set? I don't think we can fold it into Weight#count since there is an expectation that it is negligible compared with the cost of a naive count, but we may be able to do it in IndexSearcher#count or on the OpenSearch side. Le ven. 2 févr. 2024, 23:50, Michael Froh a écrit : Hi, On OpenSearch, we've been taking advantage of the various O(1) Weight#count() implementations to quickly compute various aggregations without needing to iterate over all the matching documents (at least when the top-level query is functionally a match-all at the segment level). Of course, from what I've seen, every clever Weight#count() implementation falls apart (returns -1) in the face of deletes. I was thinking that we could still handle small numbers of deletes efficiently if only we could get a DocIdSetIterator for deleted docs. Like suppose you're doing a date histogram aggregation, you could get the counts for each bucket from the points tree (ignoring deletes), then iterate through the deleted docs and decrement their contribution from the relevant bucket (determined based on a docvalues lookup). 
Assuming the number of deleted docs is small, it should be cheap, right? The current LiveDocs implementation is just a FixedBitSet, so AFAIK it's not great for iteration. I'm imagining adding a supplementary "deleted docs
Re: Computing weight.count() cheaply in the face of deletes?
Hi, A SparseBitset impl for DELETES would be fine if the model in Lucene would encode deleted docs (it did that in earlier times). As deletes are sparse (deletes are in most cases <40%), this would help to make the iterator cheaper. Uwe Am 06.02.2024 um 09:01 schrieb Adrien Grand: Hey Michael, You are right, iterating all deletes with nextClearBit() would run in O(maxDoc). I am coming from the other direction, where I'm expecting the number of deletes to be more in the order of 1%-5% of the doc ID space, so a separate int[] would use lots of heap and probably not help that much compared with nextClearBit(). My mental model is that the two most common use-cases are append-only workloads, where there are no deletes at all, and update workloads, which would commonly have several percents of deleted docs. It's not clear to me how common it is to have very few deletes. On Tue, Feb 6, 2024 at 7:03 AM Michael Froh wrote: Thanks Adrien! My thinking with a separate iterator was that nextClearBit() is relatively expensive (O(maxDoc) to traverse everything, I think). The solution I was imagining would involve an index-time change to output, say, an int[] of deleted docIDs if the number is sufficiently small (like maybe less than 1000). Then the livedocs interface could optionally return a cheap deleted docs iterator (i.e. only if the number of deleted docs is less than the threshold). Technically, the cost would be O(1), since we set a constant bound on the effort and fail otherwise. :) I think 1000 doc value lookups would be cheap, but I don't know if the guarantee is cheap enough to make it into Weight#count. That said, I'm going to see if iterating with nextClearBit() is sufficiently cheap. Hmm... precomputing that int[] for deleted docIDs on refresh could be an option too. Thanks again, Froh On Fri, Feb 2, 2024 at 11:38 PM Adrien Grand wrote: Hi Michael, Indeed, only MatchAllDocsQuery knows how to produce a count when there are deletes. 
Your idea sounds good to me, do you actually need a side car iterator for deletes, or could you use a nextClearBit() operation on the bit set? I don't think we can fold it into Weight#count since there is an expectation that it is negligible compared with the cost of a naive count, but we may be able to do it in IndexSearcher#count or on the OpenSearch side. Le ven. 2 févr. 2024, 23:50, Michael Froh a écrit : Hi, On OpenSearch, we've been taking advantage of the various O(1) Weight#count() implementations to quickly compute various aggregations without needing to iterate over all the matching documents (at least when the top-level query is functionally a match-all at the segment level). Of course, from what I've seen, every clever Weight#count() implementation falls apart (returns -1) in the face of deletes. I was thinking that we could still handle small numbers of deletes efficiently if only we could get a DocIdSetIterator for deleted docs. Like suppose you're doing a date histogram aggregation, you could get the counts for each bucket from the points tree (ignoring deletes), then iterate through the deleted docs and decrement their contribution from the relevant bucket (determined based on a docvalues lookup). Assuming the number of deleted docs is small, it should be cheap, right? The current LiveDocs implementation is just a FixedBitSet, so AFAIK it's not great for iteration. I'm imagining adding a supplementary "deleted docs iterator" that could sit next to the FixedBitSet if and only if the number of deletes is "small". Is there a better way that I should be thinking about this? Thanks, Froh -- Adrien -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
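The approach discussed in this thread (take per-bucket counts that ignore deletes, then walk the deleted docs and decrement each one's bucket) can be sketched as follows. This is an illustration under assumptions, not Lucene code: `java.util.BitSet` stands in for the live-docs bit set, and the `bucketOfDoc` array is a hypothetical stand-in for the doc-values lookup.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical sketch, not Lucene code. */
final class DeleteAwareCounts {
  private DeleteAwareCounts() {}

  /**
   * Corrects per-bucket counts that were computed while ignoring deletes.
   *
   * @param countsIgnoringDeletes bucket counts that still include deleted docs
   * @param liveDocs bit i is set if doc i is live, clear if it is deleted
   * @param maxDoc number of docs in the segment
   * @param bucketOfDoc bucket index of each doc (stand-in for a doc-values lookup)
   */
  static long[] correct(
      long[] countsIgnoringDeletes, BitSet liveDocs, int maxDoc, int[] bucketOfDoc) {
    long[] counts = countsIgnoringDeletes.clone();
    // Each clear bit below maxDoc is a deleted doc; the scan as a whole is O(maxDoc),
    // which is the cost concern raised in the thread.
    for (int doc = liveDocs.nextClearBit(0); doc < maxDoc; doc = liveDocs.nextClearBit(doc + 1)) {
      counts[bucketOfDoc[doc]]--;
    }
    return counts;
  }

  public static void main(String[] args) {
    int maxDoc = 6;
    BitSet liveDocs = new BitSet(maxDoc);
    liveDocs.set(0, maxDoc); // all docs live ...
    liveDocs.clear(1);       // ... except doc 1
    liveDocs.clear(4);       // ... and doc 4
    int[] bucketOfDoc = {0, 0, 1, 1, 2, 2};
    long[] corrected = correct(new long[] {2, 2, 2}, liveDocs, maxDoc, bucketOfDoc);
    System.out.println(Arrays.toString(corrected));
  }
}
```

If deletions rather than live docs were stored as set bits, as Uwe describes for older Lucene versions, the same loop could use `nextSetBit` on a sparse structure and would touch only the deleted docs.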
Re: [VOTE] Release Lucene 9.9.2 RC1
Hi, +1 to release. Tested smoketester with Java 11 and 17; results: https://jenkins.thetaphi.de/job/Lucene-Release-Tester/31/console Uwe Am 25.01.2024 um 12:57 schrieb Chris Hegarty: Please vote for release candidate 1 for Lucene 9.9.2 The artifacts can be downloaded from: https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.2-RC1-rev-a2939784c4ca60bc28bf488b5479c02fc2e5e22c You can run the smoke tester directly with this command: python3 -u dev-tools/scripts/smokeTestRelease.py \ https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.2-RC1-rev-a2939784c4ca60bc28bf488b5479c02fc2e5e22c The vote will be open for 96 hours ( allowing some additional time for weekend span) i.e. until 2024-01-29 12:00 UTC. [ ] +1 approve [ ] +0 no opinion [ ] -1 disapprove (and reason why) Here is my +1 Draft release notes can be found at https://cwiki.apache.org/confluence/display/LUCENE/ReleaseNote9_9_2 -Chris. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail: u...@thetaphi.de - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: The need for a Lucene 9.9.2 release
Hi,

Now I understand why you asked yesterday in the Java 22 PR. Do you think we should add Java 22 support for MMAP and Vectors? It is a bit risky, because the API may still change, but the worst that could happen is that people need to pass a sysprop in Java 22 to disable broken MMAP (if everything goes wrong).

So what do you think? Should we merge in Java 22 support or not? It's a bugfix release, so I am not super happy to take any risks.

Uwe

On 23.01.2024 at 18:36, Chris Hegarty wrote:

Hi,

We’ve encountered a serious issue with the recent Lucene 9.9.1 release, which warrants a 9.9.2. The issue is an NPE when sampling for quantization in Lucene99HnswScalarQuantizedVectorsFormat [1]. Thankfully Ben has already resolved the issue, and backported it to the appropriate branches. I don’t see any other potential issues that would warrant being pulled into this release.

I’m happy to be Release Manager for 9.9.2 (given my recent experience on 9.9.1). I’ll start the release process tomorrow and notify this list when artifacts are ready.

Thanks,
-Chris.

[1] https://github.com/apache/lucene/pull/13027

--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: u...@thetaphi.de
Re: Heads up: upcoming GitHub action to mark stale Lucene PRs
Hi, thanks Stefan. I will check some of the PRs I was involved in and close/merge them if needed. Uwe On 08.01.2024 at 13:34, Stefan Vodita wrote: Hi all, We merged the PR for adding stale labels and ran the bot a few minutes ago. The run [1] went through 133 PRs, starting from the most recently created and found 91 stale [2]. The workflow will continue running daily. By tomorrow, it should get through all the PRs. [1] https://github.com/apache/lucene/actions/runs/7447339092/job/20259452054 [2] https://github.com/apache/lucene/pulls?q=is%3Apr+label%3AStale+is%3Aclosed On Thu, 4 Jan 2024 at 15:47, Stefan Vodita wrote: There's a flag for excluding draft PRs [1]. I'll add it to the workflow. If we need more flexibility in the future, we can also exclude PRs with certain labels [2]. Stefan [1] https://github.com/actions/stale?tab=readme-ov-file#exempt-draft-pr [2] https://github.com/actions/stale?tab=readme-ov-file#exempt-pr-labels On Thu, 4 Jan 2024 at 15:30, Uwe Schindler wrote: Hi, would it be possible to exclude DRAFT pull requests from this check? I can't send a weekly reminder to my own WIP PRs, like the MMapDirectory one (because it waits for Java 22 to be in RC phase). Uwe On 04.01.2024 at 14:04, Michael McCandless wrote: Hi Team, Stefan Vodita made an awesome simple PR adding a GitHub action to remind / nag us about stale PRs: https://github.com/apache/lucene/pull/12813 This happened after an in-person discussion at the last Community Over Code NA in Halifax where Stefan learned about the nice automation Apache Beam uses to nudge PRs forward. This change is just a baby step to try to get our stale PRs into a healthier state / workflow. In the ultimate irony, that PR itself had become stale recently (2 weeks of no activity) -- a "meta-stale PR"! 
I would like to merge this PR soon, but: * It will generate a bunch of one-time noise because we have ~163 open PRs many of which are stale: https://githubsearch.mikemccandless.com/search.py?dd=status%3AOpen=issue_or_pr%3APR <https://githubsearch.mikemccandless.com/search.py?dd=status%3AOpen=issue_or_pr%3APR> * I know nothing about GitHub actions YAML format, but worst comes to worst we push it, it fails in some exotic way, and we revert. I assume lazy consensus soon ;) Mike McCandless http://blog.mikemccandless.com -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: Heads up: upcoming GitHub action to mark stale Lucene PRs
Hi, would it be possible to exclude DRAFT pull requests from this check? I can't send a weekly reminder to my own WIP PRs, like the MMapDirectory one (because it waits for Java 22 to be in RC phase). Uwe Am 04.01.2024 um 14:04 schrieb Michael McCandless: Hi Team, Stefan Vodita made an awesome simple PR adding a GitHub action to remind / nag us about stale PRs: https://github.com/apache/lucene/pull/12813 This happened after an in-person discussion at the last Community Over Code NA in Halifax where Stefan learned about the nice automation Apache Beam uses to nudge PRs forward. This change is just a baby step to try to get our stale PRs into a healthier state / workflow. In the ultimate irony, that PR itself had become stale recently (2 weeks of no activity) -- a "meta-stale PR"! I would like to merge this PR soon, but: * It will generate a bunch of one-time noise because we have ~163 open PRs many of which are stale: https://githubsearch.mikemccandless.com/search.py?dd=status%3AOpen=issue_or_pr%3APR <https://githubsearch.mikemccandless.com/search.py?dd=status%3AOpen=issue_or_pr%3APR> * I know nothing about GitHub actions YAML format, but worst comes to worst we push it, it fails in some exotic way, and we revert. I assume lazy consensus soon ;) Mike McCandless http://blog.mikemccandless.com -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: [JENKINS] Lucene-9.x-Linux (64bit/hotspot/jdk-11.0.21) - Build # 14204 - Unstable!
Ha. Cool! Nice to meet you. I suggested this reader to some customers, but they were using Solr or Elasticsearch and it's not easy to implement it there. And they didn't want to pay the expensive Uwe. How do you handle deletes? The main issue with those readers is that you can't update documents without also updating the main reader (although it's a fake update). If this is used, have you thought of a SynchronizedMergePolicy that just applies the same merges in the secondary index? Uwe On 2 December 2023 at 10:27:20 CET, Dawid Weiss wrote: >> ParallelReader is also seldom used, maybe we should remove support at >> some point. I don't know anybody using it, because it is very complicated >> to maintain consistent indexes. It only works with stable merge policies. >> > >You do know somebody - you know me. We're using it extensively - the >scenario is for storing data derived from the main document in a separate >index, merging this data dynamically. The data can then be reindexed/ >modified independently. Yes, we do use stable merge policies. > >Dawid -- Uwe Schindler Achterdiek 19, 28357 Bremen https://www.thetaphi.de
Re: [JENKINS] Lucene-9.x-Linux (64bit/hotspot/jdk-11.0.21) - Build # 14204 - Unstable!
Found the PR. Somehow the mailinglist didn't get it. Am 2. Dezember 2023 09:58:45 MEZ schrieb Uwe Schindler : >Hi Chris, > >I can't find the PR. > >I am interested, because I wrote the original ParallelReader tests. > >IMHO the parallel readers are so sensitive to random changes, the test setup >should not use any indexwriter randomization at all. > >ParallelReader is also seldomly used, maybe we should remove support at some >point. I don't know anybody using it, because it is very complicated to >maintain consistent indexes. It only works with stable merge policies. > >Uwe > >Am 2. Dezember 2023 09:34:46 MEZ schrieb Chris Hegarty >: >>Hi, >> >>I noticed this failure locally, and opened a PR for it yesterday. It is a >>test issues, and indeed related to the recent merge policy test >>randomization change. >> >>-Chris >> >>On Saturday, December 2, 2023, Patrick Zhai wrote: >> >>> Seems it's because this MockRandomMergePolicy change >>> <https://github.com/apache/lucene/blob/main/lucene/test-framework/src/java/org/apache/lucene/tests/index/MockRandomMergePolicy.java#L242> >>> recently >>> makes ParallelLeafReader unhappy - it's reading two parallel segments from >>> 2 dir and this MP makes one of the segments' documents order reversed. >>> >>> But should be just test util issue and not affecting release. >>> >>> Adrien do you want to take a look? I'm not sure what's the best way to fix >>> it, adding an index sort for that test seems a bit overkill? >>> >>> Patrick >>> >>> On Fri, Dec 1, 2023 at 2:06 PM Michael McCandless < >>> luc...@mikemccandless.com> wrote: >>> >>>> Hmm this reproduces for me, and looks new/unique. Could it be related to >>>> recent 9.9.0 changes / release blocker? 
>>>> >>>> Mike >>>> >>>> On Fri, Dec 1, 2023 at 3:33 PM Policeman Jenkins Server < >>>> jenk...@thetaphi.de> wrote: >>>> >>>>> Build: https://jenkins.thetaphi.de/job/Lucene-9.x-Linux/14204/ >>>>> Java: 64bit/hotspot/jdk-11.0.21 -XX:+UseCompressedOops -XX:+UseParallelGC >>>>> >>>>> 1 tests failed. >>>>> FAILED: org.apache.lucene.index.TestParallelLeafReader.testQueries >>>>> >>>>> Error Message: >>>>> org.junit.ComparisonFailure: expected: but was: >>>>> >>>>> Stack Trace: >>>>> org.junit.ComparisonFailure: expected: but was: >>>>> at __randomizedtesting.SeedInfo.seed([6CA57EA3FB50CA0D: >>>>> 302BB278E1397FA3]:0) >>>>> at org.junit.Assert.assertEquals(Assert.java:117) >>>>> at org.junit.Assert.assertEquals(Assert.java:146) >>>>> at org.apache.lucene.index.TestParallelLeafReader.queryTest( >>>>> TestParallelLeafReader.java:263) >>>>> at org.apache.lucene.index.TestParallelLeafReader.testQueries( >>>>> TestParallelLeafReader.java:48) >>>>> at >>>>> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native >>>>> Method) >>>>> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl. >>>>> invoke(NativeMethodAccessorImpl.java:62) >>>>> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl. >>>>> invoke(DelegatingMethodAccessorImpl.java:43) >>>>> at java.base/java.lang.reflect.Method.invoke(Method.java:566) >>>>> at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke( >>>>> RandomizedRunner.java:1758) >>>>> at com.carrotsearch.randomizedtesting. >>>>> RandomizedRunner$8.evaluate(RandomizedRunner.java:946) >>>>> at com.carrotsearch.randomizedtesting. >>>>> RandomizedRunner$9.evaluate(RandomizedRunner.java:982) >>>>> at com.carrotsearch.randomizedtesting. >>>>> RandomizedRunner$10.evaluate(RandomizedRunner.java:996) >>>>> at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$ >>>>> 1.evaluate(TestRuleSetupTeardownChained.java:48) >>>>> at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1. 
>>>>> evaluate(AbstractBeforeAfterRule.java:43) >>>>> at org.apache.lucene.tests.util.TestRuleThreadAndTestName$1. >>>>> evaluate(TestRuleThreadAndTestName.java:45) >>>>> at org.apac
Re: [JENKINS] Lucene-9.x-Linux (64bit/hotspot/jdk-11.0.21) - Build # 14204 - Unstable!
.java:36) >>>> at com.carrotsearch.randomizedtesting.ThreadLeakControl$ >>>> StatementRunner.run(ThreadLeakControl.java:390) >>>> at com.carrotsearch.randomizedtesting.ThreadLeakControl. >>>> forkTimeoutingTask(ThreadLeakControl.java:843) >>>> at com.carrotsearch.randomizedtesting. >>>> ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490) >>>> at com.carrotsearch.randomizedtesting.RandomizedRunner. >>>> runSingleTest(RandomizedRunner.java:955) >>>> at com.carrotsearch.randomizedtesting. >>>> RandomizedRunner$5.evaluate(RandomizedRunner.java:840) >>>> at com.carrotsearch.randomizedtesting. >>>> RandomizedRunner$6.evaluate(RandomizedRunner.java:891) >>>> at com.carrotsearch.randomizedtesting. >>>> RandomizedRunner$7.evaluate(RandomizedRunner.java:902) >>>> at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1. >>>> evaluate(AbstractBeforeAfterRule.java:43) >>>> at com.carrotsearch.randomizedtesting.rules. >>>> StatementAdapter.evaluate(StatementAdapter.java:36) >>>> at org.apache.lucene.tests.util.TestRuleStoreClassName$1. >>>> evaluate(TestRuleStoreClassName.java:38) >>>> at com.carrotsearch.randomizedtesting.rules. >>>> NoShadowingOrOverridesOnMethodsRule$1.evaluate( >>>> NoShadowingOrOverridesOnMethodsRule.java:40) >>>> at com.carrotsearch.randomizedtesting.rules. >>>> NoShadowingOrOverridesOnMethodsRule$1.evaluate( >>>> NoShadowingOrOverridesOnMethodsRule.java:40) >>>> at com.carrotsearch.randomizedtesting.rules. >>>> StatementAdapter.evaluate(StatementAdapter.java:36) >>>> at com.carrotsearch.randomizedtesting.rules. >>>> StatementAdapter.evaluate(StatementAdapter.java:36) >>>> at org.apache.lucene.tests.util.TestRuleAssertionsRequired$1. >>>> evaluate(TestRuleAssertionsRequired.java:53) >>>> at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1. >>>> evaluate(AbstractBeforeAfterRule.java:43) >>>> at org.apache.lucene.tests.util.TestRuleMarkFailure$1. 
>>>> evaluate(TestRuleMarkFailure.java:44) >>>> at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures >>>> $1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60) >>>> at org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1. >>>> evaluate(TestRuleIgnoreTestSuites.java:47) >>>> at org.junit.rules.RunRules.evaluate(RunRules.java:20) >>>> at com.carrotsearch.randomizedtesting.rules. >>>> StatementAdapter.evaluate(StatementAdapter.java:36) >>>> at com.carrotsearch.randomizedtesting.ThreadLeakControl$ >>>> StatementRunner.run(ThreadLeakControl.java:390) >>>> at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$ >>>> forkTimeoutingTask$0(ThreadLeakControl.java:850) >>>> at java.base/java.lang.Thread.run(Thread.java:829) >>>> >>>> - >>>> To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org >>>> For additional commands, e-mail: builds-h...@lucene.apache.org >>> >>> -- Uwe Schindler Achterdiek 19, 28357 Bremen https://www.thetaphi.de
Re: [VOTE] Release Lucene 9.9.0 RC2
Hi, I let Policeman Jenkins run the smoke tester with Java 11 and Java 17 (unfortunately we have no support for 21 yet, so the new MMap and Vectors code was not tested). But this was tested long enough, so I trust everything. I just did some cross-checking and validated that the MR-JAR contains all classes and that the Javadocs are up to date. Looks fine after the manual review. Here is Policeman's work and opinion: SUCCESS! [1:02:37.749085] (https://jenkins.thetaphi.de/job/Lucene-Release-Tester/30/console) Here is my personal opinion: +1 to release Uwe On 30.11.2023 at 18:31, Chris Hegarty wrote: Please vote for release candidate 2 for Lucene 9.9.0 The artifacts can be downloaded from: https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.0-RC2-rev-06070c0dceba07f0d33104192d9ac98ca16fc500 You can run the smoke tester directly with this command: python3 -u dev-tools/scripts/smokeTestRelease.py \ https://dist.apache.org/repos/dist/dev/lucene/lucene-9.9.0-RC2-rev-06070c0dceba07f0d33104192d9ac98ca16fc500 The vote will be open for at least 72 hours, and given the weekend in between, let’s keep it open until 2023-12-04 12:00 UTC. [ ] +1 approve [ ] +0 no opinion [ ] -1 disapprove (and reason why) Here is my +1 -Chris. -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: [VOTE] Release Lucene 9.9.0 RC1
OK, great. I wanted to post a +1 already. Will wait for 2nd RC. Uwe Am 30.11.2023 um 16:38 schrieb Michael McCandless: On Thu, Nov 30, 2023 at 9:56 AM Chris Hegarty wrote: P.S. I’m less sure about this, but the RC 2 starts a 72hr voting time again? (Just so I know what TTL to put on that) Yeah a new 72 hour clock starts with each new RC :) Mike McCandless http://blog.mikemccandless.com -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: [JENKINS] Lucene-9.x-Linux (64bit/openj9/jdk-11.0.20) - Build # 14015 - Still Failing!
Hi, Here is the PR to disable errorprone when the compiler is forked: https://github.com/apache/lucene/pull/12808. It makes no difference in main, but on 9.x it will fix the issue. I will also merge the commit to Solr, as Jenkins fails there the same way. Uwe On 14.11.2023 at 19:35, Dawid Weiss wrote: Thanks Uwe! On Tue, Nov 14, 2023 at 7:27 PM Uwe Schindler wrote: Hi, For now the simplest fix is to always disable it if an alternate JVM is used: just remove the second part of the first IF statement. In main it is no longer relevant, as the runtime JDK is always >= 17, so it would always trigger the first if. I would not spend too much time on this until errorprone gets its issues fixed. Actually they have some hacks, but they only work with toolkits. It looks like the combination of fork=true and not using toolkits breaks their construction of the parameters to pass. Same for Solr. On 14.11.2023 at 19:17, Uwe Schindler wrote: Hi Dawid, The problem does not happen on Java 17, because errorprone is not enabled when the forked JDK is > Java 15. We did this because earlier versions worked correctly. But new versions of errorprone always fail when the JDK is forked while compiling. if (rootProject.usesAltJvm && rootProject.runtimeJavaVersion > JavaVersion.VERSION_15) { skipReason = "won't work with JDK ${rootProject.runtimeJavaVersion} if used as alternative java toolchain" } if (!propertyOrDefault("validation.errorprone", isCIBuild).asBoolean()) { skipReason = "skipped on builds not running inside CI environments, pass -Pvalidation.errorprone=true to enable" } So it looks like the errorprone plugin got broken by a recent upgrade. It now always fails when a forked JDK is used. So we should disable it in this case. We just did not notice, as previously it was only disabled when the runtime java version was > 17. Nowadays we no longer run alternate JVMs with Java 12, 13, 14, 15. We run with Java 11, 17, 19, 20, 21. So it is always disabled except for Java 11. 
With RUNTIME_JAVA_HOME==JAVA_HOME we never fork, but as we use OpenJ9, we fork an BOM. I will post a PR soon. Uwe Am 14.11.2023 um 19:06 schrieb Uwe Schindler: Hi Dawid, Hah, the issue happens only if you pass CI=true (this is set by CI systems), so errorprone is enabled. so do "export CI=true" and then build with that config. So it looks like a combination of errorprone enabled with Java 11 OpenJ9. Uwe Am 13.11.2023 um 09:09 schrieb Dawid Weiss: Sure, thanks. What's strange is that we don't use add-opens anywhere, I think (there is a mention of it I left in one of the comments, but nothing else across the codebase uses this directive). > Task :lucene:distribution.tests:compileTestJava warning: [options] --add-opens has no effect at compile time On Sun, Nov 12, 2023 at 10:56 PM Uwe Schindler wrote: Will check tomorrow, it's too late now. On Jenkins there were no windows builds with IBM and Java 11 yet: https://jenkins.thetaphi.de/job/Lucene-9.x-Windows/ Am 12.11.2023 um 22:00 schrieb Dawid Weiss: Hi Uwe, Can you reproduce this on Windows with the same JVM versions though? Seems like I have exactly the same setup and yet this works for me just fine. Strange. Dawid On Sun, Nov 12, 2023 at 9:52 PM Uwe Schindler wrote: This one was my first idea, too. It fails only with IBM Semeru in combination with Gradle using Temurin. I will dig tomorrow on Jenkins server and print all debug info. Uwe Am 12. November 2023 21:48:54 MEZ schrieb Dawid Weiss : I can't reproduce this though - used exactly the same JVMs (on Windows): > gradlew :lucene:distribution.tests:compileTestJava --rerun-tasks --console=plain Generating gradle.properties ... > Task :altJvmWarning NOTE: Alternative java toolchain will be used for compilation and tests: Project will use 11 (IBM JDK 11.0.20.1+1, home at: c:\_tmp\jdk-11.0.20.1+1) Gradle runs with 11 (Eclipse Temurin JDK 11.0.21+9, home at: C:\_tmp\jdk-11.0.21+9) ... 
> Task :lucene:distribution.tests:compileJava NO-SOURCE > Task :lucene:distribution.tests:classes UP-TO-DATE > Task :lucene:distribution.tests:compileTestJava BUILD SUCCESSFUL in 23s 5 actionab
Re: [JENKINS] Lucene-9.x-Linux (64bit/openj9/jdk-11.0.20) - Build # 14015 - Still Failing!
Hi, For now the simplest fix is to always disable it if an alternate JVM is used: just remove the second part of the first IF statement. In main it is no longer relevant, as the runtime JDK is always >= 17, so it would always trigger the first if. I would not spend too much time on this until errorprone gets its issues fixed. Actually they have some hacks, but they only work with toolkits. It looks like the combination of fork=true and not using toolkits breaks their construction of the parameters to pass. Same for Solr. On 14.11.2023 at 19:17, Uwe Schindler wrote: Hi Dawid, The problem does not happen on Java 17, because errorprone is not enabled when the forked JDK is > Java 15. We did this because earlier versions worked correctly. But new versions of errorprone always fail when the JDK is forked while compiling. if (rootProject.usesAltJvm && rootProject.runtimeJavaVersion > JavaVersion.VERSION_15) { skipReason = "won't work with JDK ${rootProject.runtimeJavaVersion} if used as alternative java toolchain" } if (!propertyOrDefault("validation.errorprone", isCIBuild).asBoolean()) { skipReason = "skipped on builds not running inside CI environments, pass -Pvalidation.errorprone=true to enable" } So it looks like the errorprone plugin got broken by a recent upgrade. It now always fails when a forked JDK is used. So we should disable it in this case. We just did not notice, as previously it was only disabled when the runtime java version was > 17. Nowadays we no longer run alternate JVMs with Java 12, 13, 14, 15. We run with Java 11, 17, 19, 20, 21. So it is always disabled except for Java 11. With RUNTIME_JAVA_HOME==JAVA_HOME we never fork, but as we use OpenJ9, we fork and BOOM. I will post a PR soon. Uwe On 14.11.2023 at 19:06, Uwe Schindler wrote: Hi Dawid, Hah, the issue happens only if you pass CI=true (this is set by CI systems), so errorprone is enabled. So do "export CI=true" and then build with that config. So it looks like a combination of errorprone enabled with Java 11 OpenJ9. 
Uwe Am 13.11.2023 um 09:09 schrieb Dawid Weiss: Sure, thanks. What's strange is that we don't use add-opens anywhere, I think (there is a mention of it I left in one of the comments, but nothing else across the codebase uses this directive). > Task :lucene:distribution.tests:compileTestJava warning: [options] --add-opens has no effect at compile time On Sun, Nov 12, 2023 at 10:56 PM Uwe Schindler wrote: Will check tomorrow, it's too late now. On Jenkins there were no windows builds with IBM and Java 11 yet: https://jenkins.thetaphi.de/job/Lucene-9.x-Windows/ Am 12.11.2023 um 22:00 schrieb Dawid Weiss: Hi Uwe, Can you reproduce this on Windows with the same JVM versions though? Seems like I have exactly the same setup and yet this works for me just fine. Strange. Dawid On Sun, Nov 12, 2023 at 9:52 PM Uwe Schindler wrote: This one was my first idea, too. It fails only with IBM Semeru in combination with Gradle using Temurin. I will dig tomorrow on Jenkins server and print all debug info. Uwe Am 12. November 2023 21:48:54 MEZ schrieb Dawid Weiss : I can't reproduce this though - used exactly the same JVMs (on Windows): > gradlew :lucene:distribution.tests:compileTestJava --rerun-tasks --console=plain Generating gradle.properties ... > Task :altJvmWarning NOTE: Alternative java toolchain will be used for compilation and tests: Project will use 11 (IBM JDK 11.0.20.1+1, home at: c:\_tmp\jdk-11.0.20.1+1) Gradle runs with 11 (Eclipse Temurin JDK 11.0.21+9, home at: C:\_tmp\jdk-11.0.21+9) ... > Task :lucene:distribution.tests:compileJava NO-SOURCE > Task :lucene:distribution.tests:classes UP-TO-DATE > Task :lucene:distribution.tests:compileTestJava BUILD SUCCESSFUL in 23s 5 actionable tasks: 5 executed On main branch it works, no idea why: O thought it's because of this: https://github.com/apache/lucene/commit/2e12a35c876a but I don't think so... seems to work for me on Windows on branch_9x just fine? D. 
-- Uwe Schindler Achterdiek 19, 28357 Bremen https://www.thetaphi.de -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
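The change proposed in this message ("just remove the second part of the first IF statement") can be modelled in plain Java as a rough sketch. This is not the Gradle script and not the content of the PR; the class name, method names, and the boolean/int parameters are hypothetical stand-ins for rootProject.usesAltJvm, rootProject.runtimeJavaVersion and the validation.errorprone property quoted in the thread.

```java
public class ErrorproneSkipSketch {

  /**
   * Model of the condition quoted in the thread: errorprone is skipped for an
   * alternate JVM only when that JVM is newer than Java 15.
   * Returns a skip reason, or null when errorprone would run.
   */
  static String skipReasonBefore(boolean usesAltJvm, int runtimeJavaVersion, boolean errorproneEnabled) {
    String skipReason = null;
    if (usesAltJvm && runtimeJavaVersion > 15) {
      skipReason = "won't work with JDK " + runtimeJavaVersion + " if used as alternative java toolchain";
    }
    if (!errorproneEnabled) {
      skipReason = "skipped on builds not running inside CI environments";
    }
    return skipReason;
  }

  /**
   * Model of the proposed simplification: the version check is dropped, so
   * errorprone is skipped whenever an alternate JVM is used.
   */
  static String skipReasonAfter(boolean usesAltJvm, int runtimeJavaVersion, boolean errorproneEnabled) {
    String skipReason = null;
    if (usesAltJvm) {
      skipReason = "won't work with JDK " + runtimeJavaVersion + " if used as alternative java toolchain";
    }
    if (!errorproneEnabled) {
      skipReason = "skipped on builds not running inside CI environments";
    }
    return skipReason;
  }

  public static void main(String[] args) {
    // The failing Jenkins setup from the thread: alternate Java 11 JVM, CI=true.
    System.out.println("before: " + skipReasonBefore(true, 11, true));
    System.out.println("after:  " + skipReasonAfter(true, 11, true));
  }
}
```

With the old condition the Java 11 alternate-JVM case gets no skip reason, which matches the observation that errorprone only ran (and failed) there; with the simplified condition that case is skipped as well.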
Re: [JENKINS] Lucene-9.x-Linux (64bit/openj9/jdk-11.0.20) - Build # 14015 - Still Failing!
Hi Dawid, The problem does not happen on Java 17, because errorprone is not enabled when the forked JDK is > Java 15. We did this because earlier versions worked correctly. But new versions of errorprone always fail when the JDK is forked while compiling. if (rootProject.usesAltJvm && rootProject.runtimeJavaVersion > JavaVersion.VERSION_15) { skipReason = "won't work with JDK ${rootProject.runtimeJavaVersion} if used as alternative java toolchain" } if (!propertyOrDefault("validation.errorprone", isCIBuild).asBoolean()) { skipReason = "skipped on builds not running inside CI environments, pass -Pvalidation.errorprone=true to enable" } So it looks like the errorprone plugin got broken by a recent upgrade. It now always fails when a forked JDK is used. So we should disable it in this case. We just did not notice, as previously it was only disabled when the runtime java version was > 17. Nowadays we no longer run alternate JVMs with Java 12, 13, 14, 15. We run with Java 11, 17, 19, 20, 21. So it is always disabled except for Java 11. With RUNTIME_JAVA_HOME==JAVA_HOME we never fork, but as we use OpenJ9, we fork and BOOM. I will post a PR soon. Uwe On 14.11.2023 at 19:06, Uwe Schindler wrote: Hi Dawid, Hah, the issue happens only if you pass CI=true (this is set by CI systems), so errorprone is enabled. So do "export CI=true" and then build with that config. So it looks like a combination of errorprone enabled with Java 11 OpenJ9. Uwe On 13.11.2023 at 09:09, Dawid Weiss wrote: Sure, thanks. What's strange is that we don't use add-opens anywhere, I think (there is a mention of it I left in one of the comments, but nothing else across the codebase uses this directive). > Task :lucene:distribution.tests:compileTestJava warning: [options] --add-opens has no effect at compile time On Sun, Nov 12, 2023 at 10:56 PM Uwe Schindler wrote: Will check tomorrow, it's too late now. 
On Jenkins there were no windows builds with IBM and Java 11 yet: https://jenkins.thetaphi.de/job/Lucene-9.x-Windows/ Am 12.11.2023 um 22:00 schrieb Dawid Weiss: Hi Uwe, Can you reproduce this on Windows with the same JVM versions though? Seems like I have exactly the same setup and yet this works for me just fine. Strange. Dawid On Sun, Nov 12, 2023 at 9:52 PM Uwe Schindler wrote: This one was my first idea, too. It fails only with IBM Semeru in combination with Gradle using Temurin. I will dig tomorrow on Jenkins server and print all debug info. Uwe Am 12. November 2023 21:48:54 MEZ schrieb Dawid Weiss : I can't reproduce this though - used exactly the same JVMs (on Windows): > gradlew :lucene:distribution.tests:compileTestJava --rerun-tasks --console=plain Generating gradle.properties ... > Task :altJvmWarning NOTE: Alternative java toolchain will be used for compilation and tests: Project will use 11 (IBM JDK 11.0.20.1+1, home at: c:\_tmp\jdk-11.0.20.1+1) Gradle runs with 11 (Eclipse Temurin JDK 11.0.21+9, home at: C:\_tmp\jdk-11.0.21+9) ... > Task :lucene:distribution.tests:compileJava NO-SOURCE > Task :lucene:distribution.tests:classes UP-TO-DATE > Task :lucene:distribution.tests:compileTestJava BUILD SUCCESSFUL in 23s 5 actionable tasks: 5 executed On main branch it works, no idea why: O thought it's because of this: https://github.com/apache/lucene/commit/2e12a35c876a but I don't think so... seems to work for me on Windows on branch_9x just fine? D. -- Uwe Schindler Achterdiek 19, 28357 Bremen https://www.thetaphi.de -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: [JENKINS] Lucene-9.x-Linux (64bit/openj9/jdk-11.0.20) - Build # 14015 - Still Failing!
Hi Dawid, Hah, the issue happens only if you pass CI=true (this is set by CI systems), so errorprone is enabled. so do "export CI=true" and then build with that config. So it looks like a combination of errorprone enabled with Java 11 OpenJ9. Uwe Am 13.11.2023 um 09:09 schrieb Dawid Weiss: Sure, thanks. What's strange is that we don't use add-opens anywhere, I think (there is a mention of it I left in one of the comments, but nothing else across the codebase uses this directive). > Task :lucene:distribution.tests:compileTestJava warning: [options] --add-opens has no effect at compile time On Sun, Nov 12, 2023 at 10:56 PM Uwe Schindler wrote: Will check tomorrow, it's too late now. On Jenkins there were no windows builds with IBM and Java 11 yet: https://jenkins.thetaphi.de/job/Lucene-9.x-Windows/ Am 12.11.2023 um 22:00 schrieb Dawid Weiss: Hi Uwe, Can you reproduce this on Windows with the same JVM versions though? Seems like I have exactly the same setup and yet this works for me just fine. Strange. Dawid On Sun, Nov 12, 2023 at 9:52 PM Uwe Schindler wrote: This one was my first idea, too. It fails only with IBM Semeru in combination with Gradle using Temurin. I will dig tomorrow on Jenkins server and print all debug info. Uwe Am 12. November 2023 21:48:54 MEZ schrieb Dawid Weiss : I can't reproduce this though - used exactly the same JVMs (on Windows): > gradlew :lucene:distribution.tests:compileTestJava --rerun-tasks --console=plain Generating gradle.properties ... > Task :altJvmWarning NOTE: Alternative java toolchain will be used for compilation and tests: Project will use 11 (IBM JDK 11.0.20.1+1, home at: c:\_tmp\jdk-11.0.20.1+1) Gradle runs with 11 (Eclipse Temurin JDK 11.0.21+9, home at: C:\_tmp\jdk-11.0.21+9) ... 
> Task :lucene:distribution.tests:compileJava NO-SOURCE > Task :lucene:distribution.tests:classes UP-TO-DATE > Task :lucene:distribution.tests:compileTestJava BUILD SUCCESSFUL in 23s 5 actionable tasks: 5 executed On main branch it works, no idea why: O thought it's because of this: https://github.com/apache/lucene/commit/2e12a35c876a but I don't think so... seems to work for me on Windows on branch_9x just fine? D. -- Uwe Schindler Achterdiek 19, 28357 Bremen https://www.thetaphi.de -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: (lucene) branch main updated: remove another errant lurking angry semicolon -- why do I keep finding these?
Hi Mike, this comes indirectly from the code formatter. As you see here, the previous line ends with "}", so it looks like somebody added the semicolon after the brace. This often happens when the code was previously an anonymous inner class or a lambda that was moved around, but the obsolete semicolon was left intact at the end of the line. When the code gets formatted, the semicolon (now out of its original context) gets its own line, because it looks like a statement. Trust me, the semicolon is not angry, it is just sad and alone. Uwe On 14.11.2023 at 13:08, mikemcc...@apache.org wrote: This is an automated email from the ASF dual-hosted git repository. mikemccand pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/lucene.git The following commit(s) were added to refs/heads/main by this push: new cc35e903551 remove another errant lurking angry semicolon -- why do I keep finding these? cc35e903551 is described below commit cc35e903551ddedbfe6765be5a2f332fc7871da2 Author: Mike McCandless AuthorDate: Tue Nov 14 07:07:52 2023 -0500 remove another errant lurking angry semicolon -- why do I keep finding these? --- .../src/java/org/apache/lucene/analysis/DelegatingAnalyzerWrapper.java | 1 - 1 file changed, 1 deletion(-) diff --git a/lucene/core/src/java/org/apache/lucene/analysis/DelegatingAnalyzerWrapper.java b/lucene/core/src/java/org/apache/lucene/analysis/DelegatingAnalyzerWrapper.java index 9fc24af29a0..6a77078a5cc 100644 --- a/lucene/core/src/java/org/apache/lucene/analysis/DelegatingAnalyzerWrapper.java +++ b/lucene/core/src/java/org/apache/lucene/analysis/DelegatingAnalyzerWrapper.java @@ -100,5 +100,4 @@ public abstract class DelegatingAnalyzerWrapper extends AnalyzerWrapper { } } } - ; } -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail: u...@thetaphi.de
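To illustrate how such a semicolon can end up alone, here is a small self-contained sketch. It is not the Lucene code from the commit; the class StraySemicolonDemo and its members are made up for the example. The first semicolon is required because it terminates a field initializer that happens to end in an anonymous class body. The second one follows a nested class declaration, where Java accepts it as an empty declaration, so the compiler does not complain and only the formatter makes it visible by moving it to its own line.

```java
import java.util.function.Supplier;

public class StraySemicolonDemo {

  // Anonymous inner class used as a field initializer: the trailing semicolon
  // is required here because it ends the assignment.
  static final Supplier<String> GREETING =
      new Supplier<String>() {
        @Override
        public String get() {
          return "hello";
        }
      };

  // The same logic after being refactored into a named nested class. If the old
  // semicolon is carried along, it no longer terminates anything.
  static class Greeting implements Supplier<String> {
    @Override
    public String get() {
      return "hello";
    }
  }
  ; // legal but useless: an empty declaration in the class body

  public static void main(String[] args) {
    System.out.println(GREETING.get());
    System.out.println(new Greeting().get());
  }
}
```

Both variants compile and behave identically, which is why leftovers like this tend to survive until someone deletes them by hand.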
Re: [JENKINS] Lucene-MMAPv2-Linux (64bit/hotspot/jdk-17.0.9) - Build # 1442 - Failure!
It always happens in this config: Java: 64bit/hotspot/jdk-17.0.9 -XX:+UseCompressedOops -XX:+UseShenandoahGC Uwe Am 14.11.2023 um 15:10 schrieb Uwe Schindler: Hi, This looks like a JVM bug. It ONLY happens with JDK 17.0.9 (no other JVM) since last weekend when it was updated. See my other message. Uwe Am 14.11.2023 um 14:17 schrieb Michael McCandless: Hmm again timeout. Something seems amiss. Do our super slow tests still print out HEARTBEAT periodically? Or did we lose that in the gradle migration maybe? Build timed out (after 126 minutes). Marking the build as aborted. Build timed out (after 126 minutes). Marking the build as failed. Mike McCandless http://blog.mikemccandless.com On Tue, Nov 14, 2023 at 7:59 AM Policeman Jenkins Server wrote: Build: https://jenkins.thetaphi.de/job/Lucene-MMAPv2-Linux/1442/ Java: 64bit/hotspot/jdk-17.0.9 -XX:+UseCompressedOops -XX:+UseShenandoahGC All tests passed - To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org For additional commands, e-mail: builds-h...@lucene.apache.org -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: [JENKINS] Lucene-MMAPv2-Linux (64bit/hotspot/jdk-17.0.9) - Build # 1442 - Failure!
Hi, This looks like a JVM bug. It ONLY happens with JDK 17.0.9 (no other JVM) since last weekend when it was updated. See my other message. Uwe Am 14.11.2023 um 14:17 schrieb Michael McCandless: Hmm again timeout. Something seems amiss. Do our super slow tests still print out HEARTBEAT periodically? Or did we lose that in the gradle migration maybe? Build timed out (after 126 minutes). Marking the build as aborted. Build timed out (after 126 minutes). Marking the build as failed. Mike McCandless http://blog.mikemccandless.com On Tue, Nov 14, 2023 at 7:59 AM Policeman Jenkins Server wrote: Build: https://jenkins.thetaphi.de/job/Lucene-MMAPv2-Linux/1442/ Java: 64bit/hotspot/jdk-17.0.9 -XX:+UseCompressedOops -XX:+UseShenandoahGC All tests passed - To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org For additional commands, e-mail: builds-h...@lucene.apache.org -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: [JENKINS] Lucene-main-Linux (64bit/hotspot/jdk-17.0.9) - Build # 45536 - Failure!
It is absolutely NOT wanted to have absolute timeouts on Policeman Jenkins. This was hanging really, so it was killed for correct reasons. Uwe Am 14.11.2023 um 12:19 schrieb Dawid Weiss: P.S.: Jenkins kills jobs, if they take longer than usual it kills it (it has no hard limit, it takes the average time of previous runs and if one takes much longer it kills). From what I see here - https://plugins.jenkins.io/build-timeout/ it should be possible to tweak jenkins to use an absolute timeout... and even if not, provide a shell script to perhaps jps all java processes so that there's more diagnostics? D. -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
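The relative timeout described above (average of previous runs, kill if a run takes much longer) could be sketched roughly as follows. This is only an illustration under assumptions: the class, the method names, the percentage factor and the fallback value are all hypothetical and are not taken from the Jenkins plugin or from the Policeman Jenkins configuration.

```java
import java.util.List;

// Hypothetical sketch of a relative ("elastic") build timeout; not the Jenkins plugin's code.
class ElasticTimeout {

  /** Timeout in minutes: the mean of previous run durations scaled by a percentage. */
  static long timeoutMinutes(List<Long> previousDurationsMinutes, int percent, long fallbackMinutes) {
    if (previousDurationsMinutes.isEmpty()) {
      return fallbackMinutes; // no history yet, so use a fixed fallback
    }
    double mean =
        previousDurationsMinutes.stream().mapToLong(Long::longValue).average().orElse(fallbackMinutes);
    return Math.round(mean * percent / 100.0);
  }

  /** A run is killed once it has been running longer than the computed timeout. */
  static boolean shouldKill(long elapsedMinutes, long timeoutMinutes) {
    return elapsedMinutes > timeoutMinutes;
  }

  public static void main(String[] args) {
    // Made-up durations, for illustration only.
    long timeout = timeoutMinutes(List.of(30L, 40L, 50L), 200, 120);
    System.out.println("timeout=" + timeout + " kill=" + shouldKill(81, timeout));
  }
}
```

The point of the design is that the limit follows the job's own history, so a slow but healthy job is not killed while a genuinely hanging one still is.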
Re: [JENKINS] Lucene-main-Linux (64bit/hotspot/jdk-17.0.9) - Build # 45536 - Failure!
Hi, Sorry, we do not know where it hangs. Phonetic was already finished. As tasks may run in parallel, we have no idea where it hung. Uwe Am 14.11.2023 um 11:56 schrieb Uwe Schindler: Hi, It could also be a JVM bug! This happened for the second time with Temurin Hotspot JDK 17.0.9: https://jenkins.thetaphi.de/job/Lucene-main-Linux/45532/ and https://jenkins.thetaphi.de/job/Lucene-main-Linux/45536/ Both times it hung in phonetic tests... Uwe Am 14.11.2023 um 11:46 schrieb Michael McCandless: Thanks Uwe. OK so this might just be a high-sigma outlier-ish event due to unluckily slow seed selection? I wonder whether the distribution of total run time of each full "./gradlew test" on each JVM configuration is roughly Gaussian-ish? Mike McCandless http://blog.mikemccandless.com On Tue, Nov 14, 2023 at 5:40 AM Uwe Schindler wrote: Hi, Actually this is the default JVM, so it's not OpenJ9 or an EA release. It could be one of the tests hanging, but we can't figure that out. P.S.: Jenkins kills jobs that take much longer than usual (it has no hard limit; it takes the average time of previous runs and kills a run that takes much longer). Uwe Am 14.11.2023 um 11:06 schrieb Michael McCandless: Hmm build timed out -- not sure why it's taking so long to run tests: Build timed out (after 137 minutes). Marking the build as aborted. Build timed out (after 137 minutes). Marking the build as failed. 
Mike McCandless http://blog.mikemccandless.com On Tue, Nov 14, 2023 at 1:39 AM Policeman Jenkins Server wrote: Build: https://jenkins.thetaphi.de/job/Lucene-main-Linux/45536/ Java: 64bit/hotspot/jdk-17.0.9 -XX:+UseCompressedOops -XX:+UseShenandoahGC All tests passed - To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org For additional commands, e-mail: builds-h...@lucene.apache.org -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: [JENKINS] Lucene-main-Linux (64bit/hotspot/jdk-17.0.9) - Build # 45536 - Failure!
Hi, It could also be a JVM bug! This happened for the second time with Temurin Hotspot JDK 17.0.9: https://jenkins.thetaphi.de/job/Lucene-main-Linux/45532/ and https://jenkins.thetaphi.de/job/Lucene-main-Linux/45536/ Both times it hung in phonetic tests... Uwe Am 14.11.2023 um 11:46 schrieb Michael McCandless: Thanks Uwe. OK so this might just be a high-sigma outlier-ish event due to unluckily slow seed selection? I wonder whether the distribution of total run time of each full "./gradlew test" on each JVM configuration is roughly Gaussian-ish? Mike McCandless http://blog.mikemccandless.com On Tue, Nov 14, 2023 at 5:40 AM Uwe Schindler wrote: Hi, Actually this is the default JVM, so it's not OpenJ9 or an EA release. It could be one of the tests hanging, but we can't figure that out. P.S.: Jenkins kills jobs that take much longer than usual (it has no hard limit; it takes the average time of previous runs and kills a run that takes much longer). Uwe Am 14.11.2023 um 11:06 schrieb Michael McCandless: Hmm build timed out -- not sure why it's taking so long to run tests: Build timed out (after 137 minutes). Marking the build as aborted. Build timed out (after 137 minutes). Marking the build as failed. Mike McCandless http://blog.mikemccandless.com On Tue, Nov 14, 2023 at 1:39 AM Policeman Jenkins Server wrote: Build: https://jenkins.thetaphi.de/job/Lucene-main-Linux/45536/ Java: 64bit/hotspot/jdk-17.0.9 -XX:+UseCompressedOops -XX:+UseShenandoahGC All tests passed - To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org For additional commands, e-mail: builds-h...@lucene.apache.org -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: [JENKINS] Lucene-main-Linux (64bit/hotspot/jdk-17.0.9) - Build # 45536 - Failure!
Hi, Actually this is the default JVM, so it's not OpenJ9 or an EA release. It could be one of the tests hanging, but we can't figure that out. P.S.: Jenkins kills jobs that take much longer than usual (it has no hard limit; it takes the average time of previous runs and kills a run that takes much longer). Uwe Am 14.11.2023 um 11:06 schrieb Michael McCandless: Hmm build timed out -- not sure why it's taking so long to run tests: Build timed out (after 137 minutes). Marking the build as aborted. Build timed out (after 137 minutes). Marking the build as failed. Mike McCandless http://blog.mikemccandless.com On Tue, Nov 14, 2023 at 1:39 AM Policeman Jenkins Server wrote: Build: https://jenkins.thetaphi.de/job/Lucene-main-Linux/45536/ Java: 64bit/hotspot/jdk-17.0.9 -XX:+UseCompressedOops -XX:+UseShenandoahGC All tests passed - To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org For additional commands, e-mail: builds-h...@lucene.apache.org -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: [JENKINS] Lucene-9.x-Linux (64bit/openj9/jdk-11.0.20) - Build # 14015 - Still Failing!
Will check tomorrow, it's too late now. On Jenkins there were no windows builds with IBM and Java 11 yet: https://jenkins.thetaphi.de/job/Lucene-9.x-Windows/ Am 12.11.2023 um 22:00 schrieb Dawid Weiss: Hi Uwe, Can you reproduce this on Windows with the same JVM versions though? Seems like I have exactly the same setup and yet this works for me just fine. Strange. Dawid On Sun, Nov 12, 2023 at 9:52 PM Uwe Schindler wrote: This one was my first idea, too. It fails only with IBM Semeru in combination with Gradle using Temurin. I will dig tomorrow on Jenkins server and print all debug info. Uwe Am 12. November 2023 21:48:54 MEZ schrieb Dawid Weiss : I can't reproduce this though - used exactly the same JVMs (on Windows): > gradlew :lucene:distribution.tests:compileTestJava --rerun-tasks --console=plain Generating gradle.properties ... > Task :altJvmWarning NOTE: Alternative java toolchain will be used for compilation and tests: Project will use 11 (IBM JDK 11.0.20.1+1, home at: c:\_tmp\jdk-11.0.20.1+1) Gradle runs with 11 (Eclipse Temurin JDK 11.0.21+9, home at: C:\_tmp\jdk-11.0.21+9) ... > Task :lucene:distribution.tests:compileJava NO-SOURCE > Task :lucene:distribution.tests:classes UP-TO-DATE > Task :lucene:distribution.tests:compileTestJava BUILD SUCCESSFUL in 23s 5 actionable tasks: 5 executed On main branch it works, no idea why: O thought it's because of this: https://github.com/apache/lucene/commit/2e12a35c876a but I don't think so... seems to work for me on Windows on branch_9x just fine? D. -- Uwe Schindler Achterdiek 19, 28357 Bremen https://www.thetaphi.de -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: [JENKINS] Lucene-9.x-Linux (64bit/openj9/jdk-11.0.20) - Build # 14015 - Still Failing!
This one was my first idea, too. It fails only with IBM Semeru in combination with Gradle using Temurin. I will dig tomorrow on Jenkins server and print all debug info. Uwe Am 12. November 2023 21:48:54 MEZ schrieb Dawid Weiss : >I can't reproduce this though - used exactly the same JVMs (on Windows): > >> gradlew :lucene:distribution.tests:compileTestJava --rerun-tasks >--console=plain >Generating gradle.properties >... >> Task :altJvmWarning >NOTE: Alternative java toolchain will be used for compilation and tests: > Project will use 11 (IBM JDK 11.0.20.1+1, home at: >c:\_tmp\jdk-11.0.20.1+1) > Gradle runs with 11 (Eclipse Temurin JDK 11.0.21+9, home at: >C:\_tmp\jdk-11.0.21+9) >... >> Task :lucene:distribution.tests:compileJava NO-SOURCE >> Task :lucene:distribution.tests:classes UP-TO-DATE >> Task :lucene:distribution.tests:compileTestJava > >BUILD SUCCESSFUL in 23s >5 actionable tasks: 5 executed > >On main branch it works, no idea why: >> > >O thought it's because of this: > >https://github.com/apache/lucene/commit/2e12a35c876a > >but I don't think so... seems to work for me on Windows on branch_9x just >fine? > >D. > >> -- Uwe Schindler Achterdiek 19, 28357 Bremen https://www.thetaphi.de
Re: [JENKINS] Lucene-9.x-Linux (64bit/openj9/jdk-11.0.20) - Build # 14015 - Still Failing!
One addition: Solr is affected in the same way. Am 12. November 2023 21:39:07 MEZ schrieb Uwe Schindler : >Thanks. > >I have the feeling it is because of same major version. If Gradle JDK is >Eclipse Temurin and runtime JDK is same version but OpenJ9 it fails. > >Interestingly only in 11 (branch 9x). On main it worked for long time. Jdk17 >as Gradle runtime by Temurin and openj9 as runtime. > >If runtime and Gradle VM is identical home dir it does not fork. Here is a >slight difference and that matters. The alternate VM info is printed, but it >looks like it f*cks up when providing options and something thinks it does not >fork, but it does. > >> Task :altJvmWarning >NOTE: Alternative java toolchain will be used for compilation and tests: > Project will use 11 (IBM JDK 11.0.20.1+1, home at: > /home/jenkins/tools/java/64bit/openj9/jdk-11.0.20) > Gradle runs with 11 (Eclipse Temurin JDK 11.0.21+9, home at: > /home/jenkins/tools/java/64bit/hotspot/jdk-11.0.21) > >On main branch it works, no idea why: > >> Task :altJvmWarning >NOTE: Alternative java toolchain will be used for compilation and tests: > Project will use 17 (IBM JDK 17.0.8.1+1, home at: > /home/jenkins/tools/java/64bit/openj9/jdk-17.0.8) > Gradle runs with 17 (Eclipse Temurin JDK 17.0.9+9, home at: > /home/jenkins/tools/java/64bit/hotspot/jdk-17.0.9) > >In case you ask, due to randomized jvms the Jenkins job always runs with >default temurin minimum jdk for Gradle and just exchanges runtime for testing >(because Gradle itself can't run on all JDKs out there). > >One other thing: if you set OpenJ9 as default jdk and run Gradle with it, it >fails while building buildSrc: it can't compile the profiling stuff as flight >recorder module does not ship with OpenJ9. We should possibly fix this, maybe >by rewriting the code with pure Groovy class and not compile it at all. > >Uwe > >Am 12. November 2023 21:08:41 MEZ schrieb Dawid Weiss : >>Hi Uwe, >> >>Will dig tomorrow. Maybe Dawid has an idea? 
It looks like the alternate >>> runtime is correctly detected, but why is Gradle passing compiler runtime >>> options without -J in just this case? In Main the same works where Gradle >>> runs with Java17-Temurin and J9 is used as runtime. >>> >> >>I think I know what this is - please let me verify and provide a PR, if >>it's indeed that. >> >>Dawid > >-- >Uwe Schindler >Achterdiek 19, 28357 Bremen >https://www.thetaphi.de -- Uwe Schindler Achterdiek 19, 28357 Bremen https://www.thetaphi.de
Re: [JENKINS] Lucene-9.x-Linux (64bit/openj9/jdk-11.0.20) - Build # 14015 - Still Failing!
Thanks. I have the feeling it is because of same major version. If Gradle JDK is Eclipse Temurin and runtime JDK is same version but OpenJ9 it fails. Interestingly only in 11 (branch 9x). On main it worked for long time. Jdk17 as Gradle runtime by Temurin and openj9 as runtime. If runtime and Gradle VM is identical home dir it does not fork. Here is a slight difference and that matters. The alternate VM info is printed, but it looks like it f*cks up when providing options and something thinks it does not fork, but it does. > Task :altJvmWarning NOTE: Alternative java toolchain will be used for compilation and tests: Project will use 11 (IBM JDK 11.0.20.1+1, home at: /home/jenkins/tools/java/64bit/openj9/jdk-11.0.20) Gradle runs with 11 (Eclipse Temurin JDK 11.0.21+9, home at: /home/jenkins/tools/java/64bit/hotspot/jdk-11.0.21) On main branch it works, no idea why: > Task :altJvmWarning NOTE: Alternative java toolchain will be used for compilation and tests: Project will use 17 (IBM JDK 17.0.8.1+1, home at: /home/jenkins/tools/java/64bit/openj9/jdk-17.0.8) Gradle runs with 17 (Eclipse Temurin JDK 17.0.9+9, home at: /home/jenkins/tools/java/64bit/hotspot/jdk-17.0.9) In case you ask, due to randomized jvms the Jenkins job always runs with default temurin minimum jdk for Gradle and just exchanges runtime for testing (because Gradle itself can't run on all JDKs out there). One other thing: if you set OpenJ9 as default jdk and run Gradle with it, it fails while building buildSrc: it can't compile the profiling stuff as flight recorder module does not ship with OpenJ9. We should possibly fix this, maybe by rewriting the code with pure Groovy class and not compile it at all. Uwe Am 12. November 2023 21:08:41 MEZ schrieb Dawid Weiss : >Hi Uwe, > >Will dig tomorrow. Maybe Dawid has an idea? It looks like the alternate >> runtime is correctly detected, but why is Gradle passing compiler runtime >> options without -J in just this case? 
In Main the same works where Gradle >> runs with Java17-Temurin and J9 is used as runtime. >> > >I think I know what this is - please let me verify and provide a PR, if >it's indeed that. > >Dawid -- Uwe Schindler Achterdiek 19, 28357 Bremen https://www.thetaphi.de
Re: [JENKINS] Lucene-9.x-Linux (64bit/openj9/jdk-11.0.20) - Build # 14015 - Still Failing!
It is unclear why this happens with the Java 11 version of OpenJ9 only. Java 17 works. The compiler is forked as Gradle runs with Temurin, J9 is used via RUNTIME_JAVA_HOME. Will dig tomorrow. Maybe Dawid has an idea? It looks like the alternate runtime is correctly detected, but why is Gradle passing compiler runtime options without -J in just this case? In Main the same works where Gradle runs with Java17-Temurin and J9 is used as runtime. Uwe Am 12. November 2023 18:37:04 MEZ schrieb Policeman Jenkins Server : >Build: https://jenkins.thetaphi.de/job/Lucene-9.x-Linux/14015/ >Java: 64bit/openj9/jdk-11.0.20 -XX:+UseCompressedOops -Xgcpolicy:balanced > >No tests ran. -- Uwe Schindler Achterdiek 19, 28357 Bremen https://www.thetaphi.de
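As background on the -J remark: when javac is started as a separate process, options meant for the JVM that runs the compiler have to be prefixed with -J, otherwise javac treats them as (unknown) compiler options. A minimal sketch of assembling such a command line follows; the helper class and its method are hypothetical and are not Gradle or Lucene build code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not Gradle or Lucene build code.
class ForkedJavacArgs {

  /** Builds a command line for a forked javac, prefixing JVM options with -J. */
  static List<String> commandLine(String javacPath, List<String> jvmOptions, List<String> compilerArgs) {
    List<String> cmd = new ArrayList<>();
    cmd.add(javacPath);
    for (String opt : jvmOptions) {
      cmd.add("-J" + opt); // javac forwards -J<option> to the JVM that runs the compiler
    }
    cmd.addAll(compilerArgs);
    return cmd;
  }

  public static void main(String[] args) {
    // Paths and options are placeholders for illustration.
    System.out.println(commandLine("javac", List.of("-Xmx512m"), List.of("-d", "out", "Foo.java")));
  }
}
```

If a build tool wrongly believes it is compiling in-process, it would skip this prefixing, which matches the symptom described above.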
Re: [JENKINS] Lucene-main-Linux (64bit/openj9/jdk-17.0.5) - Build # 45394 - Unstable!
Hi, I had some time today to do upgrades of JDK versions on Policeman Jenkins: * jdk 8, 11, 17, 21 were updated to the latest Temurin Hotspot versions (Linux, Windows, Mac x64): jdk1.8.0_392, jdk-11.0.21, jdk-17.0.9, jdk-21.0.1 * updated to jdk-17.0.8 of IBM Semeru OpenJ9 * added jdk-11.0.20 and jdk-20.0.2 of IBM Semeru OpenJ9 into the game Uwe Am 06.11.2023 um 14:02 schrieb Michael McCandless: On Sun, Nov 5, 2023 at 5:01 AM Uwe Schindler wrote: I will update the J9 runtime later today. But this was a real bug, so it's good it caught this :-) So - no - I won't remove OpenJ9 support at all. I see, that's great that J9 build is indeed catching real Lucene bugs! +1 to keep running it in CI builds. The errors that sometimes happen are bugs; they might get better with the latest versions. I see there's now also a Java 20 version. I will give it a try, too - especially regarding Panama (+ Vector). Want to see how it behaves. +1 Thanks Uwe. Mike McCandless http://blog.mikemccandless.com -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: Welcome Patrick Zhai to the Lucene PMC
Welcome Patrick! Uwe Am 10. November 2023 21:04:32 MEZ schrieb Michael McCandless : >I'm happy to announce that Patrick Zhai has accepted an invitation to join >the Lucene Project Management Committee (PMC)! > >Congratulations Patrick, thank you for all your hard work improving >Lucene's community and source code, and welcome aboard! > >Mike McCandless > >http://blog.mikemccandless.com -- Uwe Schindler Achterdiek 19, 28357 Bremen https://www.thetaphi.de
Re: Ascii folding
Hi Dawid, the ASCII folding filter is meant to remove accents. You would like to search for visually similar characters. These are two different things. Actually, Robert also has some config options. What I generally use for Western European searches, where some documents may contain names of people (author names, titles in Cyrillic or other languages), is to convert the tokens using ICU transliteration (use one of the ICU folding filters with the config below): Transliterator.getInstance("Any-Latin; NFD; [:Nonspacing Mark:] Remove; NFKC; CaseFold", Transliterator.FORWARD); This converts everything to Latin characters in a language-neutral way and then removes all accents by the trick "decompose, remove non-spacing marks, compose again, and case-fold the result". Uwe Am 10.11.2023 um 19:03 schrieb Dawid Weiss: Hi Steve, Chris, Ok, makes sense. Thanks for the pointers. I agree the justification for the use of character-level normalization filters is highly context-dependent (for example, unsuitable when mixed languages are present on input). Dawid On Fri, Nov 10, 2023 at 6:58 PM Chris Hostetter wrote: : Here's the unicode letter after "th": : https://www.fileformat.info/info/unicode/char/0435/index.htm : : To my surprise, I couldn't find it in the ascii folding filter: : : https://github.com/apache/lucene/blob/main/lucene/analysis/common/src/java/org/apache/lucene/analysis/miscellaneous/ASCIIFoldingFilter.java : : Does anybody remember whether the omission of Cyrillic characters was : intentional (there are quite a few of them that are nearly identical in : appearance to Latin letters). From the javadocs, i'm going to guess it's because the filter focuses on "Latin_characters_in_Unicode" ... and your "CYRILLIC SMALL LETTER IE" isn't described as being a "(adjective) LATIN noun (WITH noun)" like all of the other characters that are considered to have a direct mapping to the "ASCII" / latin characters. If you look back at when it was added... 
https://issues.apache.org/jira/browse/LUCENE-1390 ...the original focus was on deprecating "ISOLatin1AccentFilter" and replacing it with "a more comprehensive version of this code that included not just ISO-Latin-1 (ISO-8859-1) but the entire Latin 1 and Latin Extended A unicode blocks." (The originally proposed name was 'ISOLatinAccentFilter') ... subsequent discussion focused on adding more Latin blocks. There was a related issue at the time which initially aimed to add a more general "UnicodeNormalizationFilter" that ultimately resulted in adding the "ICU" analysis classes... https://issues.apache.org/jira/browse/LUCENE-1343 ..which IIUC may better handle "CYRILLIC SMALL LETTER IE" (but i haven't tested that) -Hoss http://www.lucidworks.com/ - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
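The "decompose, remove non-spacing marks, compose again, case-fold" part of the trick can be tried with the JDK alone. The sketch below is an approximation under assumptions: it uses java.text.Normalizer instead of ICU, it does not perform the Any-Latin transliteration step (so Cyrillic letters stay Cyrillic), and toLowerCase is only a rough stand-in for ICU's CaseFold. The class and method names are made up for illustration.

```java
import java.text.Normalizer;
import java.util.Locale;

// JDK-only sketch of "NFD; remove non-spacing marks; NFKC; case-fold".
// It does NOT transliterate to Latin; that step needs ICU4J.
class AccentFoldingSketch {

  static String fold(String input) {
    // 1. Decompose: an accented letter becomes base letter + combining mark.
    String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
    // 2. Remove non-spacing marks (Unicode general category Mn).
    String stripped = decomposed.replaceAll("\\p{Mn}+", "");
    // 3. Compose again, also applying compatibility mappings.
    String recomposed = Normalizer.normalize(stripped, Normalizer.Form.NFKC);
    // 4. Lowercasing as a rough approximation of case folding.
    return recomposed.toLowerCase(Locale.ROOT);
  }

  public static void main(String[] args) {
    System.out.println(fold("Caf\u00e9 Z\u00fcrich"));
  }
}
```

Note that CYRILLIC SMALL LETTER IE (U+0435) passes through unchanged here, which is exactly the gap the Any-Latin step of the ICU config is meant to cover.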
Re: [JENKINS] Lucene-9.x-Linux (64bit/hotspot/jdk-20) - Build # 13968 - Unstable!
See this PR: https://github.com/apache/lucene/pull/12785 The OpenJDK issue is: https://bugs.openjdk.org/browse/JDK-8319756 Uwe Am 08.11.2023 um 20:31 schrieb Uwe Schindler: I have seen this error multiple times, so we should fix it before 9.9. Uwe Am 08.11.2023 um 19:48 schrieb Uwe Schindler: Hi, this is caused by the change to better rethrow exceptions: https://github.com/apache/lucene/pull/12707 The internals of MemorySegment's ScopedMemoryAccess throw IllegalStateException("Already closed") when the memory segment was closed by another thread. Since the above PR, we check the exception type and rethrow it if the extra check does not confirm that the Arena/MemorySession's scope is closed. Unfortunately, it looks like isAlive() sometimes still returns true when another thread has closed the Arena. The issue is that the check is not volatile, so isAlive() is just an informational method to check whether an Arena/MemorySession is alive. It may still return true. We should maybe add another check to the alreadyClosed method to also rethrow as AlreadyClosedException when the exception message equals "Already closed". This is also not 100% safe, but works. I will open a PR for Lucene 9.9 to improve the detection in multithreaded code. Uwe Am 08.11.2023 um 17:43 schrieb Policeman Jenkins Server: Build: https://jenkins.thetaphi.de/job/Lucene-9.x-Linux/13968/ Java: 64bit/hotspot/jdk-20 -XX:-UseCompressedOops -XX:+UseG1GC 1 tests failed. 
FAILED: org.apache.lucene.store.TestMmapDirectory.testAceWithThreads Error Message: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=3961, name=Thread-3701, state=RUNNABLE, group=TGRP-TestMmapDirectory] Stack Trace: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=3961, name=Thread-3701, state=RUNNABLE, group=TGRP-TestMmapDirectory] at __randomizedtesting.SeedInfo.seed([58DC34EC2F093F06:866507D737434A23]:0) Caused by: java.lang.IllegalStateException: Already closed at __randomizedtesting.SeedInfo.seed([58DC34EC2F093F06]:0) at java.base/jdk.internal.foreign.MemorySessionImpl.alreadyClosed(MemorySessionImpl.java:312) at java.base/jdk.internal.misc.ScopedMemoryAccess$ScopedAccessError.newRuntimeException(ScopedMemoryAccess.java:113) at java.base/jdk.internal.misc.ScopedMemoryAccess.copyMemory(ScopedMemoryAccess.java:131) at java.base/jdk.internal.foreign.AbstractMemorySegmentImpl.copy(AbstractMemorySegmentImpl.java:589) at java.base/java.lang.foreign.MemorySegment.copy(MemorySegment.java:2152) at org.apache.lucene.store.MemorySegmentIndexInput.readBytes(MemorySegmentIndexInput.java:146) at org.apache.lucene.store.TestMmapDirectory.lambda$testAceWithThreads$1(TestMmapDirectory.java:83) at java.base/java.lang.Thread.run(Thread.java:1623) - To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org For additional commands, e-mail: builds-h...@lucene.apache.org -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail: u...@thetaphi.de - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [JENKINS] Lucene-9.x-Linux (64bit/hotspot/jdk-20) - Build # 13968 - Unstable!
I have seen this error multiple times, so we should fix it before 9.9. Uwe Am 08.11.2023 um 19:48 schrieb Uwe Schindler: Hi, this is caused by the change to better rethrow exceptions: https://github.com/apache/lucene/pull/12707 The internals of MemorySegment's ScopedMemoryAccess throw IllegalStateException("Already closed") when the memory segment was closed by another thread. Since the above PR, we check the exception type and rethrow it if the extra check does not confirm that the Arena/MemorySession's scope is closed. Unfortunately, it looks like isAlive() sometimes still returns true when another thread has closed the Arena. The issue is that the check is not volatile, so isAlive() is just an informational method to check whether an Arena/MemorySession is alive. It may still return true. We should maybe add another check to the alreadyClosed method to also rethrow as AlreadyClosedException when the exception message equals "Already closed". This is also not 100% safe, but works. I will open a PR for Lucene 9.9 to improve the detection in multithreaded code. Uwe Am 08.11.2023 um 17:43 schrieb Policeman Jenkins Server: Build: https://jenkins.thetaphi.de/job/Lucene-9.x-Linux/13968/ Java: 64bit/hotspot/jdk-20 -XX:-UseCompressedOops -XX:+UseG1GC 1 tests failed. 
FAILED: org.apache.lucene.store.TestMmapDirectory.testAceWithThreads Error Message: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=3961, name=Thread-3701, state=RUNNABLE, group=TGRP-TestMmapDirectory] Stack Trace: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=3961, name=Thread-3701, state=RUNNABLE, group=TGRP-TestMmapDirectory] at __randomizedtesting.SeedInfo.seed([58DC34EC2F093F06:866507D737434A23]:0) Caused by: java.lang.IllegalStateException: Already closed at __randomizedtesting.SeedInfo.seed([58DC34EC2F093F06]:0) at java.base/jdk.internal.foreign.MemorySessionImpl.alreadyClosed(MemorySessionImpl.java:312) at java.base/jdk.internal.misc.ScopedMemoryAccess$ScopedAccessError.newRuntimeException(ScopedMemoryAccess.java:113) at java.base/jdk.internal.misc.ScopedMemoryAccess.copyMemory(ScopedMemoryAccess.java:131) at java.base/jdk.internal.foreign.AbstractMemorySegmentImpl.copy(AbstractMemorySegmentImpl.java:589) at java.base/java.lang.foreign.MemorySegment.copy(MemorySegment.java:2152) at org.apache.lucene.store.MemorySegmentIndexInput.readBytes(MemorySegmentIndexInput.java:146) at org.apache.lucene.store.TestMmapDirectory.lambda$testAceWithThreads$1(TestMmapDirectory.java:83) at java.base/java.lang.Thread.run(Thread.java:1623) - To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org For additional commands, e-mail: builds-h...@lucene.apache.org -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail: u...@thetaphi.de - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [JENKINS] Lucene-9.x-Linux (64bit/hotspot/jdk-20) - Build # 13968 - Unstable!
Hi, this is caused by the change to better rethrow exceptions: https://github.com/apache/lucene/pull/12707 The internals of MemorySegment's ScopedMemoryAccess throw IllegalStateException("Already closed") when the memory segment was closed by another thread. Since the above PR, we check the exception type and rethrow it if the extra check does not confirm that the Arena/MemorySession's scope is closed. Unfortunately, it looks like isAlive() sometimes still returns true when another thread has closed the Arena. The issue is that the check is not volatile, so isAlive() is just an informational method to check whether an Arena/MemorySession is alive. It may still return true. We should maybe add another check to the alreadyClosed method to also rethrow as AlreadyClosedException when the exception message equals "Already closed". This is also not 100% safe, but works. I will open a PR for Lucene 9.9 to improve the detection in multithreaded code. Uwe Am 08.11.2023 um 17:43 schrieb Policeman Jenkins Server: Build: https://jenkins.thetaphi.de/job/Lucene-9.x-Linux/13968/ Java: 64bit/hotspot/jdk-20 -XX:-UseCompressedOops -XX:+UseG1GC 1 tests failed. 
FAILED: org.apache.lucene.store.TestMmapDirectory.testAceWithThreads Error Message: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=3961, name=Thread-3701, state=RUNNABLE, group=TGRP-TestMmapDirectory] Stack Trace: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=3961, name=Thread-3701, state=RUNNABLE, group=TGRP-TestMmapDirectory] at __randomizedtesting.SeedInfo.seed([58DC34EC2F093F06:866507D737434A23]:0) Caused by: java.lang.IllegalStateException: Already closed at __randomizedtesting.SeedInfo.seed([58DC34EC2F093F06]:0) at java.base/jdk.internal.foreign.MemorySessionImpl.alreadyClosed(MemorySessionImpl.java:312) at java.base/jdk.internal.misc.ScopedMemoryAccess$ScopedAccessError.newRuntimeException(ScopedMemoryAccess.java:113) at java.base/jdk.internal.misc.ScopedMemoryAccess.copyMemory(ScopedMemoryAccess.java:131) at java.base/jdk.internal.foreign.AbstractMemorySegmentImpl.copy(AbstractMemorySegmentImpl.java:589) at java.base/java.lang.foreign.MemorySegment.copy(MemorySegment.java:2152) at org.apache.lucene.store.MemorySegmentIndexInput.readBytes(MemorySegmentIndexInput.java:146) at org.apache.lucene.store.TestMmapDirectory.lambda$testAceWithThreads$1(TestMmapDirectory.java:83) at java.base/java.lang.Thread.run(Thread.java:1623) - To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org For additional commands, e-mail: builds-h...@lucene.apache.org -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail: u...@thetaphi.de - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
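The proposed extra check could look roughly like the sketch below. It is an illustration under assumptions, not the code from the Lucene PR: the class and method names are hypothetical, the nested AlreadyClosedException is only a stand-in so the example compiles without Lucene, and the liveness check is passed in as a BooleanSupplier instead of calling a real Arena/MemorySession.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch; not Lucene's actual implementation.
class AlreadyClosedSketch {

  /** Stand-in for Lucene's AlreadyClosedException so the sketch is self-contained. */
  static class AlreadyClosedException extends IllegalStateException {
    AlreadyClosedException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  /**
   * Translates an IllegalStateException into AlreadyClosedException if either the scope
   * reports it is no longer alive, or (because that flag may be stale when another thread
   * closed it) the exception message is exactly "Already closed".
   */
  static RuntimeException translate(RuntimeException e, BooleanSupplier scopeIsAlive) {
    if (e instanceof IllegalStateException) {
      boolean closedByFlag = !scopeIsAlive.getAsBoolean();
      boolean closedByMessage = "Already closed".equals(e.getMessage());
      if (closedByFlag || closedByMessage) {
        return new AlreadyClosedException("Already closed", e);
      }
    }
    return e; // anything else is passed through unchanged
  }

  public static void main(String[] args) {
    // Simulates the racy case: the exception says "Already closed" but the flag still says alive.
    RuntimeException racy = new IllegalStateException("Already closed");
    System.out.println(translate(racy, () -> true).getClass().getSimpleName());
  }
}
```

As noted above, matching on the message is not 100% safe, since it depends on JDK-internal wording, but it covers the case where the liveness flag has not yet become visible to the reading thread.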
Re: Squash vs merge of PRs
http://blog.mikemccandless.com On Sat, Nov 4, 2023 at 11:03 AM Gus Heck wrote: Also, since (as noted) this is a previously decided issue, not sure why this is a list email instead of a simple direct query to Robert seeking to understand the specific case? No need to make a public discussion unless it's a long term pattern, actually breaking something, or we want to change something? On Sat, Nov 4, 2023 at 9:37 AM Benjamin Trent wrote: TL;DR, forcing non-committers to squash things is a good idea. Enforcing through some measure for committers is a bad idea. Since this thread is now in Robert's spam, I am guessing it won't have any impact :). I do not think Robert is actively trying to hurt the project in any way. It seems to me that he doesn't think a clean git history is worth the effort. Having a clean git history makes things easier for everyone. Comparing histories between branches with git-bisect to find bugs is just one example. Another is simply reading commits to see when features/bug fixes/etc. were added. I do NOT think we should add procedures or branch protections to actively enforce this. Small personal sacrifices (like dealing with commit conflicts) are necessary for a community. Being part of a community is about buying into what the community is about and working towards a common goal. Many times we do things we don't agree with, or make things slightly more difficult for us, for the community as a whole. This thing being OSS shows that we all buy into its importance and are willing to put work into the project. Having a cultural default of "make things nice for others" is good. Enforcing this ideology on others is antithetical to its definition. On Sat, Nov 4, 2023 at 9:02 AM Robert Muir wrote: This isn't a community issue, it is me avoiding useless unnecessary merge conflicts. Word "community" is invoked here to try to make it out, like you can hold a vote about what git commands i should type on my computer? You know that isn't gonna work. 
have some humility. thread moved to spam. On Sat, Nov 4, 2023 at 8:36 AM Mike Drob wrote: > > We all agree on using Java though, and using a specific version, and even the style output from gradle tidy. Is that nanny state or community consensus? > > On Sat, Nov 4, 2023 at 7:29 AM Robert Muir wrote: >> >> example of a nanny state IMO, trying to dictate what git commands to >> use, or what editor to use. Maybe this works for you in your corporate >> hellholes, but I think some folks have a bit of a power issue, are >> accustomed to dictating this stuff to their employees and so on, but >> this is open-source. I don't report to you, i dont use the editor you >> tell me, or the git commands you tell me. >> >> On Sat, Nov 4, 2023 at 8:21 AM Uwe Schindler wrote: >> > >> > Hi, >> > >> > I just wanted to draw your attention to the following discussion: >> > https://github.com/apache/lucene/pull/12737#issuecomment-1793426911 >> > >> > From my knowledge the Lucene (and Solr) community decided a while back >> > to disable merging and only allow squashing of PRs. Robert always did >> > this, but because of a one-time problem with two branches he was working >> > on in parallel, he suddenly changed his mind and did merges o
Re: [JENKINS] Lucene-main-Linux (64bit/openj9/jdk-17.0.5) - Build # 45394 - Unstable!
Hi Mike,
I will update the J9 runtime later today. But this was a real bug, so it's good that it caught this :-) So - no - I won't remove OpenJ9 support at all. The errors that sometimes happen are bugs; they might get better with the latest versions.
I see there's now also a Java 20 version. I will give it a try, too - especially regarding Panama (+ Vector). Want to see how it behaves.
Uwe

On 05.11.2023 at 10:34, Uwe Schindler wrote:
Hi Mike,
No, it was a bug introduced by me. Will be fixed in a moment. See reply on other thread. Was outside yesterday. (Without J9, the bug I introduced by refactoring would not have been detected.)
Uwe

On 04.11.2023 at 17:40, Michael McCandless wrote:
OK I opened https://github.com/eclipse-openj9/openj9/issues/18400 -- let's see where that goes. Uwe, should we upgrade to the latest OpenJ9 again maybe?
Mike McCandless http://blog.mikemccandless.com

On Sat, Nov 4, 2023 at 12:25 PM Michael McCandless wrote:
Should we maybe stop testing J9? Reduce its frequency? So much noise ... I know I can filter these out from my gmail box. I will try opening an issue in the OpenJ9 GitHub repo: https://github.com/eclipse-openj9/openj9/issues
Mike McCandless http://blog.mikemccandless.com

On Fri, Nov 3, 2023 at 7:43 PM Policeman Jenkins Server wrote:
Build: https://jenkins.thetaphi.de/job/Lucene-main-Linux/45394/
Java: 64bit/openj9/jdk-17.0.5 -XX:-UseCompressedOops -Xgcpolicy:metronome
2 tests failed.
FAILED: org.apache.lucene.util.TestRamUsageEstimator.testReferenceSize Error Message: java.lang.AssertionError: For 64 bit JVMs, reference size must be 8, unless compressed references are enabled expected:<8> but was:<4> Stack Trace: java.lang.AssertionError: For 64 bit JVMs, reference size must be 8, unless compressed references are enabled expected:<8> but was:<4> at __randomizedtesting.SeedInfo.seed([91923EC152043BB:15B168BF99C02E62]:0) at app//org.junit.Assert.fail(Assert.java:89) at app//org.junit.Assert.failNotEquals(Assert.java:835) at app//org.junit.Assert.assertEquals(Assert.java:647) at app//org.apache.lucene.util.TestRamUsageEstimator.testReferenceSize(TestRamUsageEstimator.java:195) at java.base@17.0.5/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base@17.0.5/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base@17.0.5/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base@17.0.5/java.lang.reflect.Method.invoke(Method.java:568) at app//com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758) at app//com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946) at app//com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982) at app//com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996) at app//org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48) at app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43) at app//org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45) at app//org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60) at 
app//org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44) at app//org.junit.rules.RunRules.evaluate(RunRules.java:20) at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at app//com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390) at app//com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843) at app//com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490) at app//com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955) at app//com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
Re: [JENKINS] Lucene-main-Windows (64bit/openj9/jdk-17.0.5) - Build # 13400 - Unstable!
Hi,
the problem introduced by https://github.com/apache/lucene/pull/12754 was fixed. Sorry for this. On OpenJ9 the RamUsageEstimator works correctly again:
2> Nov. 05, 2023 10:52:15 AM org.apache.lucene.util.HotspotVMOptions
2> *WARNUNG: Lucene cannot optimize algorithms or calculate object sizes for JVMs that are not based on Hotspot or a compatible implementation.*
1> JVM_IS_HOTSPOT_64BIT = false
1> COMPRESSED_REFS_ENABLED = false
1> NUM_BYTES_OBJECT_ALIGNMENT = 8
1> NUM_BYTES_OBJECT_REF = 8
1> NUM_BYTES_OBJECT_HEADER = 16
1> NUM_BYTES_ARRAY_HEADER = 24
1> LONG_SIZE = 24
2> NOTE: Windows 10 10.0 amd64/IBM Corporation 17.0.8.1 (64-bit)/cpus=1,threads=1,free=207903304,total=268435456
2> NOTE: All tests run in this JVM: [TestRamUsageEstimator]
:lucene:core:test (SUCCESS): 10 test(s), 1 skipped
The slowest suites (exceeding 1s) during this run: 2.70s TestRamUsageEstimator (:lucene:core)
Uwe

On 05.11.2023 at 10:30, Uwe Schindler wrote:
Hi,
this was my fault, as I had no J9 VM ready. The issue was moving the Hotspot condition to the first if clause: https://github.com/apache/lucene/pull/12754/files#diff-d66ace802ca787e308d675106db2413c4a77d36b51c5a2997bb7efd49e8bR115
The test caught this - cool. I will change the logic a bit to separate 64 bit detection from Hotspot logic. The HotspotVMOptions.IS_HOTSPOT check needs to go one line down: it is not needed for the logic, it just needs to be assigned to the constant JVM_IS_HOTSPOT_64BIT.
Sorry, a fix is going out. Need to reactivate my J9 here to make sure all is sane.
Uwe

On 05.11.2023 at 10:16, Uwe Schindler wrote:
I will look into this. Could be related to the previous commit.
Uwe

On 4 November 2023 16:17:45 CET, Michael McCandless wrote:
Maybe J9 specific?
Mike McCandless http://blog.mikemccandless.com

On Sat, Nov 4, 2023 at 11:01 AM Policeman Jenkins Server wrote:
Build: https://jenkins.thetaphi.de/job/Lucene-main-Windows/13400/
Java: 64bit/openj9/jdk-17.0.5 -XX:-UseCompressedOops -Xgcpolicy:gencon
2 tests failed.
FAILED: org.apache.lucene.util.TestRamUsageEstimator.testReferenceSize Error Message: java.lang.AssertionError: For 64 bit JVMs, reference size must be 8, unless compressed references are enabled expected:<8> but was:<4> Stack Trace: java.lang.AssertionError: For 64 bit JVMs, reference size must be 8, unless compressed references are enabled expected:<8> but was:<4> at __randomizedtesting.SeedInfo.seed([41AB595A28A8656B:5D031209A44808B2]:0) at app//org.junit.Assert.fail(Assert.java:89) at app//org.junit.Assert.failNotEquals(Assert.java:835) at app//org.junit.Assert.assertEquals(Assert.java:647) at app//org.apache.lucene.util.TestRamUsageEstimator.testReferenceSize(TestRamUsageEstimator.java:195) at java.base@17.0.5/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base@17.0.5/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base@17.0.5/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base@17.0.5/java.lang.reflect.Method.invoke(Method.java:568) at app//com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758) at app//com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946) at app//com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982) at app//com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996) at app//org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48) at app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43) at app//org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45) at app//org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60) at 
app//org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44) at app//org.junit.rules.RunRules.evaluate(RunRules.java:20) at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at app//com.carrotsearch.randomizedtesting.ThreadL
Re: Bump minimum Java version requirement to 21
Hi,
thanks Chris. This is why I suggested the idea, to have the discussion here. We are already close to Lucene 9.9. Do we want 9.10? We had that long series of minor releases only in the 4.x branch (which ended in 4.10). I have some comments inline:

On 3 Nov 2023, at 13:11, Uwe Schindler wrote:
Hi, I had another idea: Why not release main as 10.0.0 *NOW* and create branch_10x (with Java 17 minimum), stop working on 9.x, and move main branch to 21?

I see now that 9.x has a minimum Java version of 11, and that _main_ has a minimum version of 17. I previously overlooked this (I thought that 9.x was on 17, but it is not). Ok, so your idea is actually quite in line with how things have happened in the past. For ease of reference, here are the dates of the last 4 major releases.
9.0.0 Dec 2021
8.0.0 Mar 2019
7.0.0 Sep 2017
6.0.0 Apr 2016
If we release 10.0.0 now (with a minimum of 17) that drops the need to support Java 11 (since work in 9.x will mostly stop). I’m ok with this, and we get the benefits of dropping < Java 17. But can we be more ambitious in our approach here? I’ll defer to others about what is in _main_ to justify a major release or not - the driver for a release should be more than just the minimum Java version.
Alternatively, what if we were to not release 10.0.0 for another while, say 3 - 6 months, and at the same time bump it to Java 21. In the meantime we can keep the 9.x updates coming. My motivation for suggesting this is that major Lucene versions seem to come around every 2 years or so, and if we release 10 with Java 17, then we’ll still be reluctant to use Java APIs and features between 17 and 21 for the next, likely, 2 years. An alternative to that is to release Lucene 11.0.0 sometime before the 2 year mark.

I would be happy to remove the MmapByteBuffer directory in Java 18.

We can only do this when we move to a minimum Java > 17, so in your proposal that would be in _main_ some time post the fork for branch_10x. That seems ok.
Sorry, this was a typo in the version number. I meant Java 21 would no longer require (Mapped-)ByteBufferIndexInput.

Unfortunately in Java 21 we still need a hack to compile the MemorySegment classes because of the preview flag. And for the incubator we also need the APIJAR files. But we can do this then without MR-JAR unless we need a new version for Java 22, 23 of vectors. My idea would be to patch in the api JAR during compile of "main" sourceset classes.

Yeah, regardless of the minimum version bump some work is needed here :-( Where possible we should try to minimise it, but I agree we’ll likely need updates for the vector stuff in 22+.

I figured out it is not so easy, we need additional maintenance and possibly an MR-JAR also with Java 21:
* In Java 21, panama-foreign is still preview. So when compiling we need the APIJAR.
* In the MR-JAR compilation we patch the APIJAR into the java.base module (which we also need for incubating). The problem is: You cannot patch the "java.base" module and at the same time pass "--release 21". So in that code part we need to compile against the actual class library (I have no idea why patching is disallowed with --release). It prints a cryptic error message that makes no sense to me.
* Because of the inability to use "--release" we still need to compile the Panama classes in a separate gradle sourceSet. But we can copy the separate sourceSet output for 21 directly into the main JAR part (but we can also let it live in versions/21).
This should not stop us from moving to 21, the details of how to build the JAR/MR-JAR can be solved separately. Your PR looks fine, I would keep away from the MR-JAR sourceSets for now. We can clean them up later. Keeping parts of the MR-JAR logic as suggested before helps with backporting.
Uwe

On 03.11.2023 at 13:20, Chris Hegarty wrote:
Hi, I would like to start the discussion and gather feedback on bumping the minimum Java version requirement to 21.
I have no particular timeline in mind, but these kinds of bumps often require dependency updates [*], small code refactorings, etc, and can take some time to plan and execute. It's best to at least have a plan for when, rather than if! Any bump would of course be limited to the _main_ branch, and therefore targeting a major Lucene release (no changes to branches targeting minor patch releases). I'm sure subscribers to this list are already familiar with the various goodies that have been added between Java 17 and 21, so I'll not enumerate them here, but rather call out just two particular benefits that I think are significant to the Lucene project.
1) Put a lower bound on the number of memory segment mmap and Panama Vector similarity implementations that we need to carry. This not only reduces maintenance cost, but avoids additional consideration and experimentation for performance i
Re: [JENKINS] Lucene-main-Linux (64bit/openj9/jdk-17.0.5) - Build # 45394 - Unstable!
ontrol.java:490) at app//com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955) at app//com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840) at app//com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891) at app//com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902) at app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43) at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at app//org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at app//com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at app//com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at app//org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) at app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43) at app//org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44) at app//org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60) at app//org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47) at app//org.junit.rules.RunRules.evaluate(RunRules.java:20) at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at 
app//com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390) at app//com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850) at java.base@17.0.5/java.lang.Thread.run(Thread.java:857) - To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org For additional commands, e-mail: builds-h...@lucene.apache.org -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: [JENKINS] Lucene-main-Windows (64bit/openj9/jdk-17.0.5) - Build # 13400 - Unstable!
Hi,
this was my fault, as I had no J9 VM ready. The issue was moving the Hotspot condition to the first if clause: https://github.com/apache/lucene/pull/12754/files#diff-d66ace802ca787e308d675106db2413c4a77d36b51c5a2997bb7efd49e8bR115
The test caught this - cool. I will change the logic a bit to separate 64 bit detection from Hotspot logic. The HotspotVMOptions.IS_HOTSPOT check needs to go one line down: it is not needed for the logic, it just needs to be assigned to the constant JVM_IS_HOTSPOT_64BIT.
Sorry, a fix is going out. Need to reactivate my J9 here to make sure all is sane.
Uwe

On 05.11.2023 at 10:16, Uwe Schindler wrote:
I will look into this. Could be related to the previous commit.
Uwe

On 4 November 2023 16:17:45 CET, Michael McCandless wrote:
Maybe J9 specific?
Mike McCandless http://blog.mikemccandless.com

On Sat, Nov 4, 2023 at 11:01 AM Policeman Jenkins Server wrote:
Build: https://jenkins.thetaphi.de/job/Lucene-main-Windows/13400/
Java: 64bit/openj9/jdk-17.0.5 -XX:-UseCompressedOops -Xgcpolicy:gencon
2 tests failed.
FAILED: org.apache.lucene.util.TestRamUsageEstimator.testReferenceSize Error Message: java.lang.AssertionError: For 64 bit JVMs, reference size must be 8, unless compressed references are enabled expected:<8> but was:<4> Stack Trace: java.lang.AssertionError: For 64 bit JVMs, reference size must be 8, unless compressed references are enabled expected:<8> but was:<4> at __randomizedtesting.SeedInfo.seed([41AB595A28A8656B:5D031209A44808B2]:0) at app//org.junit.Assert.fail(Assert.java:89) at app//org.junit.Assert.failNotEquals(Assert.java:835) at app//org.junit.Assert.assertEquals(Assert.java:647) at app//org.apache.lucene.util.TestRamUsageEstimator.testReferenceSize(TestRamUsageEstimator.java:195) at java.base@17.0.5/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base@17.0.5/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base@17.0.5/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base@17.0.5/java.lang.reflect.Method.invoke(Method.java:568) at app//com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758) at app//com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946) at app//com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982) at app//com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996) at app//org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48) at app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43) at app//org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45) at app//org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60) at 
app//org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44) at app//org.junit.rules.RunRules.evaluate(RunRules.java:20) at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at app//com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390) at app//com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843) at app//com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490) at app//com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955) at app//com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840) at app//com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891) at app//com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902) at app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43) at app//com.carrotsearch.randomizedtesting.rules.S
Re: [JENKINS] Lucene-main-Windows (64bit/openj9/jdk-17.0.5) - Build # 13400 - Unstable!
r.java:902) >> at >> app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43) >> at >> app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) >> at >> app//org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) >> at >> app//com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) >> at >> app//com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) >> at >> app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) >> at >> app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) >> at >> app//org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) >> at >> app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43) >> at >> app//org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44) >> at >> app//org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60) >> at >> app//org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47) >> at app//org.junit.rules.RunRules.evaluate(RunRules.java:20) >> at >> app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) >> at >> app//com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390) >> at >> app//com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850) >> at java.base@17.0.5/java.lang.Thread.run(Thread.java:857) >> >> - >> To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org >> For additional commands, e-mail: 
builds-h...@lucene.apache.org -- Uwe Schindler Achterdiek 19, 28357 Bremen https://www.thetaphi.de
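Editorial sketch of the failure mode described in this thread, under stated assumptions: the class and method names below are hypothetical and are not Lucene's actual RamUsageEstimator code. It only illustrates why putting the Hotspot check into the same condition as the 64-bit check can make a 64-bit non-Hotspot JVM (such as OpenJ9 with compressed references disabled) report a 4-byte reference size, and how separating the two checks avoids that.

```java
/** Hypothetical illustration only; names are not taken from Lucene. */
final class ReferenceSizeSketch {

  /** Assumed shape of the bug: the Hotspot check gates the 64-bit branch. */
  static int buggyReferenceSize(boolean is64Bit, boolean isHotspot, boolean compressedRefs) {
    if (is64Bit && isHotspot) {
      return compressedRefs ? 4 : 8;
    }
    // A 64-bit non-Hotspot JVM falls through to the 32-bit value.
    return 4;
  }

  /** Assumed shape of the fix: 64-bit detection no longer depends on Hotspot. */
  static int fixedReferenceSize(boolean is64Bit, boolean compressedRefs) {
    if (is64Bit) {
      return compressedRefs ? 4 : 8;
    }
    return 4;
  }

  /** The Hotspot flag only feeds a reporting constant, not the size logic. */
  static boolean isHotspot64Bit(boolean is64Bit, boolean isHotspot) {
    return is64Bit && isHotspot;
  }

  public static void main(String[] args) {
    // 64-bit OpenJ9 with compressed references disabled, as in the failing builds.
    System.out.println("buggy: " + buggyReferenceSize(true, false, false));
    System.out.println("fixed: " + fixedReferenceSize(true, false));
    System.out.println("hotspot64: " + isHotspot64Bit(true, false));
  }
}
```

With the inputs from the failing builds, the first variant yields 4 and the second yields 8, which matches the "expected:<8> but was:<4>" assertion in the reports.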
Squash vs merge of PRs
Hi,
I just wanted to draw your attention to the following discussion: https://github.com/apache/lucene/pull/12737#issuecomment-1793426911
To my knowledge, the Lucene (and Solr) community decided a while back to disable merging and only allow squashing of PRs. Robert always did this, but because of a one-time problem with two branches he was working on in parallel, he suddenly changed his mind and did merges on his own, not squashing the branch and pushing to ASF Git.
I am also not a fan of removing all history, but especially for heavily committed branches like the given PR, I think we should invite our committers to also adhere to the community standards everyone else practices. I would agree with merging those branches if all commit messages in the branch were well-formed with issue ID or PR number, but in the above case you get a history of random commits which is no longer linear and not easily readable.
What do others think?
Uwe
-- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail: u...@thetaphi.de
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Bump minimum Java version requirement to 21
Hi,
I had another idea: Why not release main as 10.0.0 *NOW* and create branch_10x (with Java 17 minimum), stop working on 9.x, and move main branch to 21?
I would be happy to remove the MmapByteBuffer directory in Java 18. Unfortunately in Java 21 we still need a hack to compile the MemorySegment classes because of the preview flag. And for the incubator we also need the APIJAR files. But we can do this then without MR-JAR unless we need a new version for Java 22, 23 of vectors. My idea would be to patch in the api JAR during compile of "main" sourceset classes.
Uwe

On 03.11.2023 at 13:20, Chris Hegarty wrote:
Hi,
I would like to start the discussion and gather feedback on bumping the minimum Java version requirement to 21.
I have no particular timeline in mind, but these kinds of bumps often require dependency updates [*], small code refactorings, etc, and can take some time to plan and execute. It's best to at least have a plan for when, rather than if! Any bump would of course be limited to the _main_ branch, and therefore targeting a major Lucene release (no changes to branches targeting minor patch releases).
I'm sure subscribers to this list are already familiar with the various goodies that have been added between Java 17 and 21, so I'll not enumerate them here, but rather call out just two particular benefits that I think are significant to the Lucene project.
1) Put a lower bound on the number of memory segment mmap and Panama Vector similarity implementations that we need to carry. This not only reduces maintenance cost, but avoids additional consideration and experimentation for performance improvements.
2) Support for half float, Float::float16ToFloat and Float::floatToFloat16, which will likely be beneficial in several places.
More concretely, and somewhat orthogonal to the discussion of when, I would like to create a meta-issue capturing the prerequisites to a version bump.
Your thoughts, comments, and feedback are very much welcome.
-Chris.
[*] we need at least an ECJ JDT dependency update, that supports Java 21, https://www.eclipse.org/lists/eclipse-dev/msg12203.html
-- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail: u...@thetaphi.de
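Editorial sketch for the half-float point raised in this thread: a minimal round trip through the JDK conversion methods Float.floatToFloat16 and Float.float16ToFloat. It assumes a JDK of version 20 or newer, where these methods are available; the class name Float16Sketch and the helper roundTrip are hypothetical and only serve the illustration.

```java
/** Hypothetical illustration; requires a JDK that provides the float16 conversions (20+). */
final class Float16Sketch {

  /** Narrows a float to IEEE 754 binary16 bits and widens it back again. */
  static float roundTrip(float value) {
    short half = Float.floatToFloat16(value);
    return Float.float16ToFloat(half);
  }

  public static void main(String[] args) {
    // 1.0f is exactly representable as a half float: sign 0, exponent 01111, mantissa 0.
    short one = Float.floatToFloat16(1.0f);
    System.out.printf("1.0f as float16 bits: 0x%04X%n", one & 0xFFFF); // 0x3C00

    // 0.1f is not exactly representable, so the round trip loses precision.
    System.out.println("0.1f after round trip: " + roundTrip(0.1f));
  }
}
```

Storing values as 16-bit shorts halves the space of a float array at the cost of precision, which is the trade-off a caller would have to weigh.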
Re: [JENKINS-EA] Lucene-main-Linux (64bit/hotspot/jdk-22-ea+20) - Build # 45159 - Unstable!
OpenJDK bug: https://bugs.openjdk.org/browse/JDK-8318646 Am 23.10.2023 um 09:48 schrieb Uwe Schindler: Hi, this test fails only in newest JDK 22, because the error message of NumberFormatException on Integer#parseInt() changed. I opened issue https://github.com/apache/lucene/issues/12708 and will report this to openjdk, because it might affect other projects, too. We should at least fix the code here to not be so picky (why does it check the error message at all)? Uwe Am 23.10.2023 um 07:33 schrieb Policeman Jenkins Server: org.junit.ComparisonFailure: expected:<...berFormatException: [For input string: ""]> but was:<...berFormatException: []> at __randomizedtesting.SeedInfo.seed([50F854467969E2C3:7F8CC4BC3195F1BF]:0) at junit@4.13.1/org.junit.Assert.assertEquals(Assert.java:117) at junit@4.13.1/org.junit.Assert.assertEquals(Assert.java:146) at org.apache.lucene.queryparser.xml.TestCoreParser.testSpanNearQueryWithoutSlopXML(TestCoreParser.java:165) at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103) at java.base/java.lang.reflect.Method.invoke(Method.java:580) at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758) at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946) at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982) at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996) at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48) at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43) at 
org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45) at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60) at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44) at junit@4.13.1/org.junit.rules.RunRules.evaluate(RunRules.java:20) at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390) at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843) at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490) at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955) at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840) at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891) at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902) at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43) at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at 
randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43) at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRule
Re: Welcome Luca Cavanna to the Lucene PMC
Welcome Luca,
Uwe

On 20.10.2023 at 07:50, Adrien Grand wrote:
I'm pleased to announce that Luca Cavanna has accepted an invitation to join the Lucene PMC! Congratulations Luca, and welcome aboard!
-- Adrien
-- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: [JENKINS-EA] Lucene-main-Linux (64bit/hotspot/jdk-22-ea+20) - Build # 45159 - Unstable!
mework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60) at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47) at junit@4.13.1/org.junit.rules.RunRules.evaluate(RunRules.java:20) at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390) at randomizedtesting.runner@2.8.1/com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850) at java.base/java.lang.Thread.run(Thread.java:1570) -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail: u...@thetaphi.de - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
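Editorial sketch for the JDK 22 failure discussed in this thread: the test broke because it compared the text of a NumberFormatException, which changed between JDK builds. A less brittle check, shown below with hypothetical names (ParseFailureSketch, failsWithNumberFormat), asserts only on the exception type. This is not the actual TestCoreParser code, just an illustration of the idea of not being "so picky" about the message.

```java
/** Hypothetical illustration; not taken from Lucene's test sources. */
final class ParseFailureSketch {

  /** Returns true if parsing the input as an int fails with NumberFormatException. */
  static boolean failsWithNumberFormat(String input) {
    try {
      Integer.parseInt(input);
      return false;
    } catch (NumberFormatException e) {
      // Deliberately ignore e.getMessage(): its wording is a JDK implementation detail.
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("empty string rejected: " + failsWithNumberFormat(""));
    System.out.println("\"42\" rejected: " + failsWithNumberFormat("42"));
  }
}
```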
Re: ByteBufferIndexInput.alreadyClosed creates an exception that doesn't track its cause
If the buffers are nonnull and the guard state is sane, it would have thrown the exception, like on incomplete views. If buffers or guard is invalidated, the indexinput was closed. The state of curFloatBufferViews would then not matter. Anyways: Indexinput is not allowed to be used by multiple threads, so it's a bug in your code. The code has state (position to read) and it is documented to be not thread safe. Uwe Am 22. Oktober 2023 14:37:06 MESZ schrieb Michael Sokolov : >Thanks, Uwe. The underlying exception in my situation was caused by >curFloatBufferViews being allocated and used before it was fully >populated. So I think it was an NPE, yes. I'll check your PR to see if >it would have hidden this? > >On Sun, Oct 22, 2023 at 4:57 AM Uwe Schindler wrote: >> >> Please read my other comments and the PR. The PR filters the cause of >> the NPE, if the NPE is caused by inernals of MMapDirectory it won't be >> exposed to anybody. >> >> If you use it in multiple threads and acidentally close one of the >> indexinputs, AlreadyClosedException is the only correct exception. Any >> cause like an internal signalling NPE is not useful and helps nothing. >> The PR explains this, so we won't add the NPE as cause. If the NPE is >> coming from outside MMapDircetory, it will be rethrown so you see it. >> >> I will merge the PR in a moment. >> >> Uwe >> >> Am 22.10.2023 um 01:37 schrieb Michael Sokolov: >> > Thanks for digging into this. I do think it will be helpful for >> > developers that blithely access the IndexInput from multiple threads >> > :) >> > >> > On Sat, Oct 21, 2023 at 3:53 PM Chris Hostetter >> > wrote: >> >> >> >> Uwe: In your PR, you should add these details to the javadocs of >> >> ByteBufferIndexInput.alreadyClosed(), so future code spelunkers understand >> >> the choice being made here is intentional :) >> >> >> >> : please don't add the NPE here as cause (except for debugging). 
The NPE >> >> is only >> >> : catched to NOT add extra checks in the highly performance sensitive >> >> code. >> >> : Actually the NPE is catched to detect the case where the bytebuffer was >> >> : already unset to trigger the already closed. The code uses setting the >> >> buffers >> >> : to NULL to signal cause, but it does NOT add a NULL check everywhere. >> >> This >> >> : allows Hotspot to compile this code without any bounds checks and >> >> signal the >> >> : AlreadyClosedException only when a NPE happens. Adding the NPE as cause >> >> would >> >> >> >> >> >> >> >> -Hoss >> >> http://www.lucidworks.com/ >> >> >> >> - >> >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> >> For additional commands, e-mail: dev-h...@lucene.apache.org >> >> >> > - >> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> > For additional commands, e-mail: dev-h...@lucene.apache.org >> > >> -- >> Uwe Schindler >> Achterdiek 19, D-28357 Bremen >> https://www.thetaphi.de >> eMail: u...@thetaphi.de >> >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> For additional commands, e-mail: dev-h...@lucene.apache.org >> > >- >To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >For additional commands, e-mail: dev-h...@lucene.apache.org > -- Uwe Schindler Achterdiek 19, 28357 Bremen https://www.thetaphi.de
Re: ByteBufferIndexInput.alreadyClosed creates an exception that doesn't track its cause
Please read my other comments and the PR. The PR filters the cause of the NPE: if the NPE is caused by internals of MMapDirectory, it won't be exposed to anybody. If you use it in multiple threads and accidentally close one of the IndexInputs, AlreadyClosedException is the only correct exception. Any cause like an internal signalling NPE is not useful and helps nothing. The PR explains this, so we won't add the NPE as cause. If the NPE is coming from outside MMapDirectory, it will be rethrown so you see it. I will merge the PR in a moment. Uwe On 22.10.2023 at 01:37, Michael Sokolov wrote: Thanks for digging into this. I do think it will be helpful for developers that blithely access the IndexInput from multiple threads :) On Sat, Oct 21, 2023 at 3:53 PM Chris Hostetter wrote: Uwe: In your PR, you should add these details to the javadocs of ByteBufferIndexInput.alreadyClosed(), so future code spelunkers understand the choice being made here is intentional :) : please don't add the NPE here as cause (except for debugging). The NPE is only : caught to NOT add extra checks in the highly performance sensitive code. : Actually the NPE is caught to detect the case where the bytebuffer was : already unset to trigger the already closed. The code uses setting the buffers : to NULL to signal cause, but it does NOT add a NULL check everywhere. This : allows Hotspot to compile this code without any bounds checks and signal the : AlreadyClosedException only when a NPE happens. 
Adding the NPE as cause would -Hoss http://www.lucidworks.com/ -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail: u...@thetaphi.de
Re: ByteBufferIndexInput.alreadyClosed creates an exception that doesn't track its cause
Hi, I added a long comment in the PR for all variants of that code (there are actually 4 variants because we have 4 implementations). Uwe On 21.10.2023 at 21:53, Chris Hostetter wrote: Uwe: In your PR, you should add these details to the javadocs of ByteBufferIndexInput.alreadyClosed(), so future code spelunkers understand the choice being made here is intentional :) : please don't add the NPE here as cause (except for debugging). The NPE is only : caught to NOT add extra checks in the highly performance sensitive code. : Actually the NPE is caught to detect the case where the bytebuffer was : already unset to trigger the already closed. The code uses setting the buffers : to NULL to signal cause, but it does NOT add a NULL check everywhere. This : allows Hotspot to compile this code without any bounds checks and signal the : AlreadyClosedException only when a NPE happens. Adding the NPE as cause would -Hoss http://www.lucidworks.com/ -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail: u...@thetaphi.de
Re: ByteBufferIndexInput.alreadyClosed creates an exception that doesn't track its cause
Hi Mike, here is the PR improving the situation while not making the signalling NPE visible to code: https://github.com/apache/lucene/pull/12705 In MemorySegmentIndexInput there's also IllegalStateException, which cannot be thrown on wrong parameters. This one is also internal and always mapped to AlreadyClosedException (also without cause). Uwe On 21.10.2023 at 10:23, Uwe Schindler wrote: Hi, I have a good idea: We should wrap as AlreadyClosedException if and only if the bytebuffers/memorysegments are null (see . In all other cases rethrow the NPE:

    AlreadyClosedException alreadyClosed(RuntimeException npe) {
      if (npe == null || this.buffers == null) { // buffers == null if input closed!
        return new AlreadyClosedException("Already closed: " + this);
      }
      throw npe;
    }

(this would need the same change in all MemorySegmentIndexInput in a similar way). This would keep the NPE on wrong usage, but in the case of a closed ByteBufferIndexInput / MemorySegmentIndexInput it would throw plain AlreadyClosedEx. I can provide a PR, but give me a week, I am very busy. Uwe On 21.10.2023 at 10:01, Uwe Schindler wrote: Hi, please don't add the NPE here as cause (except for debugging). The NPE is only caught to NOT add extra checks in the highly performance sensitive code. Actually the NPE is caught to detect the case where the bytebuffer was already unset to trigger the already closed. The code uses setting the buffers to NULL to signal cause, but it does NOT add a NULL check everywhere. This allows Hotspot to compile this code without any bounds checks and signal the AlreadyClosedException only when a NPE happens. Adding the NPE as cause would bring confusion to the end user, as we only want to tell that the IndexInput was closed, but the NPE should be kept behind the scenes as it would be a support nightmare ("your code has no good null checks, it is broken"). The NPE is a signal here, not the cause. 
I think the issue you have seen may be caused by passing a NULL parameter to one of the methods, like a float array to readFloats(). This is not detected (P.S.: this also affects MemorySegmentIndexInput). I can look at the code to figure out how to better detect the NPE when parameters of methods are NULL, but there is no way to add the cause here. I would say: If you have to debug, do it temporarily or use another directory impl. Uwe On 20.10.2023 at 21:20, Chris Hostetter wrote: FWIW: The choice to ignore the original exception goes back to here... https://issues.apache.org/jira/browse/LUCENE-3588 ...circa 2011, where it was focused on catching NPE and throwing AlreadyClosedException instead, w/o any particular discussion as to why to throw away the original NPE. If i had to guess it's simply because at that time AlreadyClosedException didn't support wrapping any other Throwable. That wasn't added until LUCENE-5958 (circa 2014) which was focused on making sure "tragic" errors kept a record of what caused the tragedy, and then include that as the 'cause' of the AlreadyClosedExceptions thrown by 'ensureOpen()' There didn't seem to be any discussion at that time about reviewing other code that might be throwing AlreadyClosedException from a 'catch' block that could also be updated to include the cause. I'd say open a PR to review & update all code that results in AlreadyClosedException originating from a catch block? : Date: Tue, 17 Oct 2023 11:24:03 -0400 : From: Michael Sokolov : Reply-To: dev@lucene.apache.org : To: Lucene Dev : Subject: ByteBufferIndexInput.alreadyClosed creates an exception that doesn't : track its cause : : I was messing around with something that was resulting in : AlreadyClosedException being thrown and I noticed that we weren't : tracking the exception that caused it. 
I found this in : ByteBufferIndexInput: : : // the unused parameter is just to silence javac about unused variables : AlreadyClosedException alreadyClosed(RuntimeException unused) { : - return new AlreadyClosedException("Already closed: " + this); : + return new AlreadyClosedException("Already closed: " + this, unused); : } : : and added the cause there, which helped me find and fix my wicked : ways. Is there a reason we decided not to wrap the "unused" : RuntimeException there? : : - : To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org : For additional commands, e-mail: dev-h...@lucene.apache.org : : -Hoss http://www.lucidworks.com/ - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
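The "NPE as signal" pattern discussed in this thread can be sketched in plain Java. This is only an illustration under stated assumptions, not Lucene's code: SignalingInput and its methods are hypothetical names, a single ByteBuffer stands in for the real buffer array, and IllegalStateException stands in for AlreadyClosedException. The hot path has no explicit null check; close() nulls the buffer, and the catch block converts the resulting NPE into a "closed" exception only when the buffer really is null. Any other NPE (for example from a null argument) is rethrown unchanged.

```java
import java.nio.ByteBuffer;

/** Hypothetical stand-in for an index input; not Lucene's actual class. */
final class SignalingInput {
  // Set to null by close(); the null itself is the "closed" signal.
  private ByteBuffer buffer;

  SignalingInput(byte[] data) {
    this.buffer = ByteBuffer.wrap(data);
  }

  /** Hot path: no explicit null check, the NPE is caught instead. */
  byte readByte() {
    try {
      return buffer.get();
    } catch (NullPointerException e) {
      throw alreadyClosed(e);
    }
  }

  /** Copies bytes into dst; a null dst is a caller bug, not a closed input. */
  void readBytes(byte[] dst) {
    try {
      buffer.get(dst);
    } catch (NullPointerException e) {
      throw alreadyClosed(e);
    }
  }

  void close() {
    buffer = null;
  }

  /**
   * Maps the NPE to a "closed" exception (deliberately without a cause) only
   * if the buffer was unset; otherwise the original NPE is rethrown.
   * IllegalStateException is an assumption standing in for AlreadyClosedException.
   */
  private RuntimeException alreadyClosed(NullPointerException npe) {
    if (buffer == null) {
      return new IllegalStateException("Already closed: " + this);
    }
    throw npe;
  }
}
```

With this shape, reading after close() reports a plain "already closed" error, while passing null to readBytes() on an open input still surfaces the NPE to the caller.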
Re: ByteBufferIndexInput.alreadyClosed creates an exception that doesn't track its cause
Hi, I have a good idea: We should wrap as AlreadyClosedException if and only if the bytebuffers/memorysegments are null (see . In all other cases rethrow the NPE:

    AlreadyClosedException alreadyClosed(RuntimeException npe) {
      if (npe == null || this.buffers == null) { // buffers == null if input closed!
        return new AlreadyClosedException("Already closed: " + this);
      }
      throw npe;
    }

(this would need the same change in all MemorySegmentIndexInput in a similar way). This would keep the NPE on wrong usage, but in the case of a closed ByteBufferIndexInput / MemorySegmentIndexInput it would throw plain AlreadyClosedEx. I can provide a PR, but give me a week, I am very busy. Uwe On 21.10.2023 at 10:01, Uwe Schindler wrote: Hi, please don't add the NPE here as cause (except for debugging). The NPE is only caught to NOT add extra checks in the highly performance sensitive code. Actually the NPE is caught to detect the case where the bytebuffer was already unset to trigger the already closed. The code uses setting the buffers to NULL to signal cause, but it does NOT add a NULL check everywhere. This allows Hotspot to compile this code without any bounds checks and signal the AlreadyClosedException only when a NPE happens. Adding the NPE as cause would bring confusion to the end user, as we only want to tell that the IndexInput was closed, but the NPE should be kept behind the scenes as it would be a support nightmare ("your code has no good null checks, it is broken"). The NPE is a signal here, not the cause. I think the issue you have seen may be caused by passing a NULL parameter to one of the methods, like a float array to readFloats(). This is not detected (P.S.: this also affects MemorySegmentIndexInput). I can look at the code to figure out how to better detect the NPE when parameters of methods are NULL, but there is no way to add the cause here. I would say: If you have to debug, do it temporarily or use another directory impl. 
Uwe Am 20.10.2023 um 21:20 schrieb Chris Hostetter: FWIW: The choice to ignore the original exception goes back to here... https://issues.apache.org/jira/browse/LUCENE-3588 ...circa 2011, where it was focused on catching NPE and throwing AlreadyClosedException instead, w/o any particular discussion as to why to throw away the original NPE. If i had to guess it's simply because at that time AlreadyClosedException didn't support wrapping any other Throwable. That wasn't added until LUCENE-5958 (circa 2014) which was focused on making sure "tragic" errors kept a record of what caused the tragedy, and then include that as the 'cause' of the AlreadyClosedExceptions throw by 'ensureOpen()' There didn't seem to be any discussion at that time about reviewing other code that might be throwing AlreadyClosedException from a 'catch' block that could also be updated to include the cause. I'd say open a PR to review & update all code that results in AlreadyClosedException originating from a catch block? : Date: Tue, 17 Oct 2023 11:24:03 -0400 : From: Michael Sokolov : Reply-To: dev@lucene.apache.org : To: Lucene Dev : Subject: ByteBufferIndexInput.alreadyClosed creates an exception that doesn't : track its cause : : I was messing around with something that was resulting in : AlreadyClosedException being thrown and I noticed that we weren't : tracking the exception that caused it. I found this in : ByteBufferIndexInput: : : // the unused parameter is just to silence javac about unused variables : AlreadyClosedException alreadyClosed(RuntimeException unused) { : - return new AlreadyClosedException("Already closed: " + this); : + return new AlreadyClosedException("Already closed: " + this, unused); : } : : and added the cause there, which helped me find and fix my wicked : ways. Is there a reason we decided not to wrap the "unused" : RuntimeException there? 
: : - : To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org : For additional commands, e-mail: dev-h...@lucene.apache.org : : -Hoss http://www.lucidworks.com/ - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: ByteBufferIndexInput.alreadyClosed creates an exception that doesn't track its cause
Hi Hoss, see my other response, this is not the reason why it isn't wrapped: If i had to guess it's simply because at that time AlreadyClosedException didn't support wrapping any other Throwable. That wasn't added until LUCENE-5958 (circa 2014) which was focused on making sure "tragic" errors kept a record of what caused the tragedy, and then include that as the 'cause' of the AlreadyClosedExceptions thrown by 'ensureOpen()' See my other mail for an explanation, in short: The NPE is only used as a signal and not as an error condition and MUST be hidden from the end user (except for debugging). Uwe -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: Could we allow an IndexInput to read from a still writing IndexOutput?
Hi, the biggest problem is with some IndexInputs that work on the FS cache (mmapdir). The file size changes while you are writing, therefore it could cause strange issues. In particular, the mmap mapping may not see the changes you have already written, as there is no happens-before relationship. Basically the IO model of Lucene is WORM (write once, read many). So something that's visible to readers must never change anymore. So, as said by the others, if you need stuff already written, keep it in memory (like nodes). We should really not change our IO model for this single use case. A 1% slowdown while writing due to some caching or buffering does not matter, and it is not worth the risk of corrupting indexes or running into errors while reading. Uwe On 19.10.2023 at 15:47, Michael McCandless wrote: Hi Team, Today, Lucene's Directory abstraction does not allow opening an IndexInput on a file until the file is fully written and closed via IndexOutput. We enforce this in tests, and some of our core Directory implementations demand this (e.g. caching the file's length on opening an IndexInput). Yet, most filesystems will easily allow simultaneous read/append of a single file. We just don't expose these IO semantics to Lucene, but could we allow random-access reads with append-only writes on one file? Is there a strong reason that we don't allow this? Quick TL;DR context: we are trying to enable FST compilation to write off-heap (directly to disk), enabling creating arbitrarily large FSTs with bounded heap, matching how FSTs can now be read off-heap, and it would be much much more RAM efficient if we could read/append the same file at once. Full gory details context: inspired by how Tantivy <https://github.com/quickwit-oss/tantivy> (awesome and fast Rust search engine!) writes its FSTs <https://blog.burntsushi.net/transducers/>, over in this issue <https://github.com/apache/lucene/issues/12543> and PR <https://github.com/dungba88/lucene/commit/882f5a5b1f60d4321d2e09986335063368c08e9b>, we (thank you Dzung Bui / @dungba88!) 
are trying to fix Lucene's FST building to immediately stream the FST to disk, instead of buffering the whole thing in RAM and then writing to disk. This would allow building arbitrarily large FSTs without using up heap, and symmetrically matches how we can now read FSTs off-heap, plus FST building is already (mostly) append-only. This would also allow removing some of the crazy abstractions we have for writing FST bytes into RAM (FSTStore, BytesStore). It would enable interesting things like a Codec whose term dictionary is stored entirely in an FST <https://github.com/apache/lucene/pull/12688> (also inspired by Tantivy). The wrinkle is that, while the FST is building, it sometimes looks back and reads previously written bytes, to share suffixes and create a minimal (or near minimal) FST. So if IndexInput could read those bytes, even as the FST is still appending to IndexOutput, it would "just work". Failing that, our plan B is to wastefully duplicate the byte[] slices from the already written bytes into our own private (heap resident, boo) copy, which would use quite a bit more RAM while building the FST, and make less minimal FSTs for a given RAM budget. I haven't measured the added wasted RAM if we have to go this route but I fear it is sizable in practice, i.e. it strongly negates the whole idea of writing an FST off-heap since its effectively storing a possibly large portion of the FST in many duplicated byte[] fragments (in the NodeHash). So ... could we somehow relax Lucene's Directory semantics to allow opening an IndexInput on a still appending IndexOutput, since most filesystems are fine with this? Mike McCandless http://blog.mikemccandless.com -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
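The "keep it in memory" approach from the reply above (the "plan B" of the original question) can be sketched as follows. This is a rough illustration only, with hypothetical names (AppendOnlyNodeWriter, appendNode, lookBack) and a plain java.io.OutputStream standing in for an IndexOutput; it says nothing about how the FST compiler actually does it. Bytes are streamed to the append-only output exactly once, and anything the writer may need to read again is served from a private heap copy, so the output is never read while it is still being written.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical append-only writer that never reads back from its output. */
final class AppendOnlyNodeWriter {
  private final OutputStream out;
  // Heap copies of already written nodes, keyed by their start offset.
  private final Map<Long, byte[]> nodesByOffset = new HashMap<>();
  private long position;

  AppendOnlyNodeWriter(OutputStream out) {
    this.out = out;
  }

  /** Appends a node to the output and returns the offset it was written at. */
  long appendNode(byte[] node) {
    long start = position;
    try {
      out.write(node);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    position += node.length;
    nodesByOffset.put(start, node.clone()); // private copy for later look-back
    return start;
  }

  /** Returns a copy of the node written at the given offset, or null if unknown. */
  byte[] lookBack(long offset) {
    byte[] node = nodesByOffset.get(offset);
    return node == null ? null : node.clone();
  }
}
```

The cost is the one named in the original mail: every retained node is duplicated on the heap, so a real implementation would need some bound or eviction policy for the retained copies.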
Re: ByteBufferIndexInput.alreadyClosed creates an exception that doesn't track its cause
Hi, please don't add the NPE here as cause (except for debugging). The NPE is only caught to NOT add extra checks in the highly performance sensitive code. Actually the NPE is caught to detect the case where the bytebuffer was already unset to trigger the already closed. The code uses setting the buffers to NULL to signal cause, but it does NOT add a NULL check everywhere. This allows Hotspot to compile this code without any bounds checks and signal the AlreadyClosedException only when a NPE happens. Adding the NPE as cause would bring confusion to the end user, as we only want to tell that the IndexInput was closed, but the NPE should be kept behind the scenes as it would be a support nightmare ("your code has no good null checks, it is broken"). The NPE is a signal here, not the cause. I think the issue you have seen may be caused by passing a NULL parameter to one of the methods, like a float array to readFloats(). This is not detected (P.S.: this also affects MemorySegmentIndexInput). I can look at the code to figure out how to better detect the NPE when parameters of methods are NULL, but there is no way to add the cause here. I would say: If you have to debug, do it temporarily or use another directory impl. Uwe On 20.10.2023 at 21:20, Chris Hostetter wrote: FWIW: The choice to ignore the original exception goes back to here... https://issues.apache.org/jira/browse/LUCENE-3588 ...circa 2011, where it was focused on catching NPE and throwing AlreadyClosedException instead, w/o any particular discussion as to why to throw away the original NPE. If i had to guess it's simply because at that time AlreadyClosedException didn't support wrapping any other Throwable. 
That wasn't added until LUCENE-5958 (circa 2014) which was focused on making sure "tragic" errors kept a record of what caused the tragedy, and then include that as the 'cause' of the AlreadyClosedExceptions throw by 'ensureOpen()' There didn't seem to be any discussion at that time about reviewing other code that might be throwing AlreadyClosedException from a 'catch' block that could also be updated to include the cause. I'd say open a PR to review & update all code that results in AlreadyClosedException originating from a catch block? : Date: Tue, 17 Oct 2023 11:24:03 -0400 : From: Michael Sokolov : Reply-To: dev@lucene.apache.org : To: Lucene Dev : Subject: ByteBufferIndexInput.alreadyClosed creates an exception that doesn't : track its cause : : I was messing around with something that was resulting in : AlreadyClosedException being thrown and I noticed that we weren't : tracking the exception that caused it. I found this in : ByteBufferIndexInput: : :// the unused parameter is just to silence javac about unused variables :AlreadyClosedException alreadyClosed(RuntimeException unused) { : -return new AlreadyClosedException("Already closed: " + this); : +return new AlreadyClosedException("Already closed: " + this, unused); :} : : and added the cause there, which helped me find and fix my wicked : ways. Is there a reason we decided not to wrap the "unused" : RuntimeException there? : : - : To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org : For additional commands, e-mail: dev-h...@lucene.apache.org : : -Hoss http://www.lucidworks.com/ - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail: u...@thetaphi.de - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Update TermInSetQuery Example?
Hi Michael, Go ahead. Maybe scan through the remaining source files with a grep/regex:

    $ fgrep -R 'new BooleanQuery(' *
    lucene/core/src/java/org/apache/lucene/search/BooleanQuery.java: return new BooleanQuery(minimumNumberShouldMatch, clauses.toArray(new BooleanClause[0]));
    lucene/core/src/java/org/apache/lucene/search/TermInSetQuery.java: * BooleanQuery bq = new BooleanQuery();
    lucene/queries/src/java/org/apache/lucene/queries/spans/package-info.java: * Query query = new BooleanQuery();
    lucene/spatial-extras/src/java/org/apache/lucene/spatial/bbox/BBoxStrategy.java: // BooleanQuery qNotDisjoint = new BooleanQuery();

The first one is a false positive (the builder calls the BQ ctor), but all others should possibly be fixed. There may be other combinations not detected because of source code formatting. Uwe On 20.10.2023 at 23:46, Michael Wechner wrote: Hi I recently found the TermInSetQuery example at https://lucene.apache.org/core/9_7_0/core/org/apache/lucene/search/TermInSetQuery.html but if I understand correctly one should now use BooleanQuery.Builder instead of BooleanQuery itself, right?

    BooleanQuery.Builder bqb = new BooleanQuery.Builder();
    bqb.add(new TermQuery(new Term("field", "foo")), BooleanClause.Occur.SHOULD);
    bqb.add(new TermQuery(new Term("field", "bar")), BooleanClause.Occur.SHOULD);
    Query q2 = new ConstantScoreQuery(bqb.build());

If so, I would be happy to do a minor pull request or feel free to update it directly. Thanks Michael -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail: u...@thetaphi.de
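The fgrep above only matches the literal string, which is why differently formatted call sites can be missed. A whitespace-tolerant check could look like the sketch below. It is a hypothetical helper (BooleanQueryCtorScanner is not part of Lucene or its build), written against a list of source lines so it stays self-contained. It still works line by line, so a constructor call split across two lines would not be found.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper: finds direct BooleanQuery constructor calls in source lines. */
final class BooleanQueryCtorScanner {
  // "new", at least one whitespace, "BooleanQuery", optional whitespace, "(".
  // "new BooleanQuery.Builder()" does not match because '.' follows the class name.
  private static final Pattern CTOR = Pattern.compile("\\bnew\\s+BooleanQuery\\s*\\(");

  /** Returns the 1-based numbers of all lines containing a direct constructor call. */
  static List<Integer> matchingLines(List<String> lines) {
    List<Integer> hits = new ArrayList<>();
    for (int i = 0; i < lines.size(); i++) {
      if (CTOR.matcher(lines.get(i)).find()) {
        hits.add(i + 1);
      }
    }
    return hits;
  }
}
```

Hits inside BooleanQuery.java itself (the builder calling the constructor) would still need to be filtered out by hand, as noted above for the first fgrep result.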
Re: Can we get rid of "Approve & Run" on GitHub PRs by new contributors (non-committers)?
Hi, this seems to be a safety feature and is also enabled in general for Github. I found no options in asf.yaml to enable/disable it: https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features#Git.asf.yamlfeatures-GitHubsettings You can only add some users to a whitelist of "collaborators" through asf.yaml. Nevertheless, I see no problem for pressing the button. When I quickly review a PR, I generally press the button. For safety reasons this is required in most projects I was contributing, too (not only ASF). What's the problem in pressing the button? Of course you take responsibility when the crypto miner starts, but if there is a huge PR by an external contributor, I would first ask if they could split it into smaller pieces. At some point we have to review it, and most external people creating huge PRs did bad stuff like pressing the format button in their IDE. I think running "./gradlew precommit" is a must for new contributors. The online checks on Github are more for me as reviewer/committer, to make sure all is fine before I press the merge button (for many PRs I don't even checkout the code after review). So it is fine to not trigger it by end-users. -1 to ask INFRA to enable this. Uwe Am 16.10.2023 um 15:57 schrieb Michael McCandless: When a non-committer (I think?) opens a PR, one of the committers must notice it and click Approve & Run so the contributor can find out if something broke in our automated tests/precommit/linting. This seems like a waste, and a friction in the worst possible place for our community: new contributor onboarding experience. I think we have it to prevent e.g. a crypto mining bot of a PR sneaking in and taking tons of resources to mine dogecoin or so? But 1) that doesn't seem to be happening so far, 2) when I hit "Approve & Run" I never look closely to see if there is in fact a hidden crypto miner in there, and 3) can't we just put some reasonable timeout on the GitHub actions to block such abuse? 
Is this some sort of requirement by GitHub, or did we choose to turn on this silly step? Mike McCandless http://blog.mikemccandless.com -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: [VOTE] Release Lucene 9.8.0 RC1
be downloaded from: https://dist.apache.org/repos/dist/dev/lucene/lucene-9.8.0-RC1-rev-d914b3722bd5b8ef31ccf7e8ddc638a87fd648db You can run the smoke tester directly with this command: python3 -u dev-tools/scripts/smokeTestRelease.py \ https://dist.apache.org/repos/dist/dev/lucene/lucene-9.8.0-RC1-rev-d914b3722bd5b8ef31ccf7e8ddc638a87fd648db The vote will be open for at least 72 hours, as there's a weekend, the vote will last until 2023-09-27 06:00 UTC. [ ] +1 approve [ ] +0 no opinion [ ] -1 disapprove (and reason why) Here is my +1 (non-binding) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail: u...@thetaphi.de - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [VOTE] Release Lucene 9.8.0 RC1
Hi, I verified the release with the usual tools and my workflow: Policeman Jenkins ran smoketester for me with Java 11 and Java 17: https://jenkins.thetaphi.de/job/Lucene-Release-Tester/28/console SUCCESS! [1:10:15.704228] In addition I checked the changes entries and ran Luke with Java 21 GA (released two days ago). All fine! +1 to release! Am 22.09.2023 um 07:48 schrieb Patrick Zhai: Please vote for release candidate 1 for Lucene 9.8.0 The artifacts can be downloaded from: https://dist.apache.org/repos/dist/dev/lucene/lucene-9.8.0-RC1-rev-d914b3722bd5b8ef31ccf7e8ddc638a87fd648db You can run the smoke tester directly with this command: python3 -u dev-tools/scripts/smokeTestRelease.py \ https://dist.apache.org/repos/dist/dev/lucene/lucene-9.8.0-RC1-rev-d914b3722bd5b8ef31ccf7e8ddc638a87fd648db The vote will be open for at least 72 hours, as there's a weekend, the vote will last until 2023-09-27 06:00 UTC. [ ] +1 approve [ ] +0 no opinion [ ] -1 disapprove (and reason why) Here is my +1 (non-binding) -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail: u...@thetaphi.de - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Lucene 9.8 Release
Hi, I also enabled Jenkins jobs for the 9.8 branch today (a bit late, sorry). See https://jenkins.thetaphi.de for the randomized jobs. Uwe Am 21.09.2023 um 19:05 schrieb Patrick Zhai: Thanks Adrien, I plan to start creating the RC tonight, I *think* I have finished all the PGP key set up so I hope it won't be too hard :) On Thu, Sep 21, 2023, 04:10 Adrien Grand wrote: Thanks Patrick. I expanded a bit on the optimization section to highlight the sort of speedup that nightly benchmarks reported, and moved this section first as I suspect that users would be especially interested in these speedups. Out of curiosity, do you know when you plan on creating a release candidate? On Thu, Sep 21, 2023 at 7:40 AM Patrick Zhai wrote: > > Hi all, > Here's the draft release note: https://cwiki.apache.org/confluence/display/LUCENE/Draft+Release+Notes+9.8 > > Please feel free to edit if you feel like to add anything > > Best > Patrick > > On Tue, Sep 19, 2023 at 12:05 AM Adrien Grand wrote: >> >> Thanks Patrick, this PR is now merged. >> >> On Tue, Sep 19, 2023 at 6:22 AM Patrick Zhai wrote: >> > >> > Update: >> > Will wait https://github.com/apache/lucene/pull/12568 to be merged to cut the branch >> > >> > >> > On Mon, Sep 18, 2023 at 11:00 AM Michael Sokolov wrote: >> >> >> >> +1 for a release soon, and thanks for volunteering, Patrick! >> >> >> >> On Tue, Sep 12, 2023 at 2:08 AM Patrick Zhai wrote: >> >> > >> >> > Hi all, >> >> > It's been a while since the last release and we have quite a few good changes including new APIs, improvements and bug fixes. Should we release the 9.8? >> >> > >> >> > If there's no objections I volunteer to be the release manager and will cut the feature branch a week from now, which is Sep. 18th PST. 
>> >> > >> >> > Best >> >> > Patrick >> >> >> >> - >> >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> >> For additional commands, e-mail: dev-h...@lucene.apache.org >> >> >> >> >> -- >> Adrien >> >> ----- >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> For additional commands, e-mail: dev-h...@lucene.apache.org >> -- Adrien - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: [lucene] branch branch_9x updated: Fix issues with BP tests and the security manager. (#12568)
I know where it comes from. The javadoc comment has a ">" sign. I would also fix this in main. On 19.09.2023 at 09:48, Uwe Schindler wrote: Looks like Java 11 can't compile this, see https://github.com/apache/lucene/actions/runs/6232257025/job/16915121779#step:5:452 /home/runner/work/lucene/lucene/lucene/misc/src/java/org/apache/lucene/misc/index/BPIndexReorderer.java:78: error: bad use of '>' * p -> new ForkJoinWorkerThread(p) {}, null, random().nextBoolean()); > Task :lucene:misc:compileJava FAILED ^ Note: /home/runner/work/lucene/lucene/lucene/misc/src/java/org/apache/lucene/misc/util/fst/UpToTwoPositiveIntOutputs.java uses or overrides a deprecated API. Note: Recompile with -Xlint:deprecation for details. 1 error Note: Some input files use or override a deprecated API. Not sure what's wrong, I think the problem is with the anonymous subclassing. Maybe brackets around the whole "new ForkJoin() {}" help? Uwe On 19.09.2023 at 09:04, jpou...@apache.org wrote: This is an automated email from the ASF dual-hosted git repository. jpountz pushed a commit to branch branch_9x in repository https://gitbox.apache.org/repos/asf/lucene.git The following commit(s) were added to refs/heads/branch_9x by this push: new c241ab006c4 Fix issues with BP tests and the security manager. (#12568) c241ab006c4 is described below commit c241ab006c4be918207adc69bb34fa72a48286f3 Author: Adrien Grand AuthorDate: Tue Sep 19 08:55:48 2023 +0200 Fix issues with BP tests and the security manager. (#12568) The default ForkJoinPool implementation uses a thread factory that removes all permissions on threads, so we need to create our own to avoid tests failing with FS-based directories. 
--- .../src/java/org/apache/lucene/misc/index/BPIndexReorderer.java | 4 +++- .../test/org/apache/lucene/misc/index/TestBPIndexReorderer.java | 7 ++- 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/lucene/misc/src/java/org/apache/lucene/misc/index/BPIndexReorderer.java b/lucene/misc/src/java/org/apache/lucene/misc/index/BPIndexReorderer.java index 7482e7a06ed..b8dadc3f6a0 100644 --- a/lucene/misc/src/java/org/apache/lucene/misc/index/BPIndexReorderer.java +++ b/lucene/misc/src/java/org/apache/lucene/misc/index/BPIndexReorderer.java @@ -74,7 +74,9 @@ import org.apache.lucene.util.OfflineSorter.BufferSize; * * Directory targetDir = FSDirectory.open(targetPath); * BPIndexReorderer reorderer = new BPIndexReorderer(); - * reorderer.setForkJoinPool(ForkJoinPool.commonPool()); + * ForkJoinPool pool = new ForkJoinPool(Runtime.getRuntime().availableProcessors(), + * p -> new ForkJoinWorkerThread(p) {}, null, random().nextBoolean()); + * reorderer.setForkJoinPool(pool); * reorderer.setFields(Collections.singleton("body")); * CodecReader reorderedReaderView = reorderer.reorder(SlowCodecReaderWrapper.wrap(reader), targetDir); * try (IndexWriter w = new IndexWriter(targetDir, new IndexWriterConfig().setOpenMode(OpenMode.CREATE))) { diff --git a/lucene/misc/src/test/org/apache/lucene/misc/index/TestBPIndexReorderer.java b/lucene/misc/src/test/org/apache/lucene/misc/index/TestBPIndexReorderer.java index 4b6a9a85037..13d6989ff74 100644 --- a/lucene/misc/src/test/org/apache/lucene/misc/index/TestBPIndexReorderer.java +++ b/lucene/misc/src/test/org/apache/lucene/misc/index/TestBPIndexReorderer.java @@ -21,6 +21,7 @@ import static org.apache.lucene.misc.index.BPIndexReorderer.fastLog2; import java.io.IOException; import java.util.Arrays; import java.util.concurrent.ForkJoinPool; +import java.util.concurrent.ForkJoinWorkerThread; import org.apache.lucene.document.Document; import org.apache.lucene.document.Field.Store; import org.apache.lucene.document.StoredField; @@ 
-47,7 +48,11 @@ public class TestBPIndexReorderer extends LuceneTestCase { public void testSingleTermWithForkJoinPool() throws IOException { int concurrency = TestUtil.nextInt(random(), 1, 8); - ForkJoinPool pool = new ForkJoinPool(concurrency); + // The default ForkJoinPool implementation uses a thread factory that removes all permissions on + // threads, so we need to create our own to avoid tests failing with FS-based directories. + ForkJoinPool pool = + new ForkJoinPool( + concurrency, p -> new ForkJoinWorkerThread(p) {}, null, random().nextBoolean()); try { doTestSingleTerm(pool); } finally { -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail: u...@thetaphi.de - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
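For readers following the "bad use of '>'" error quoted above: one common way to keep a lambda arrow inside a Javadoc sample without tripping the stricter doclint of older JDKs is to write the arrow as `-&gt;`. The sketch below only illustrates that escape. The class name `JavadocArrowSketch` is hypothetical, the `false` async-mode argument is an arbitrary choice for the sketch, and nothing in this thread confirms that the actual fix in Lucene used this exact form.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinWorkerThread;

/**
 * Hypothetical holder class, used only to carry the Javadoc sample below.
 *
 * <p>The arrow of the lambda is written as an HTML entity so that doclint does not see a bare
 * greater-than sign inside the comment:
 *
 * <pre>
 * ForkJoinPool pool = new ForkJoinPool(Runtime.getRuntime().availableProcessors(),
 *     p -&gt; new ForkJoinWorkerThread(p) {}, null, false);
 * </pre>
 */
final class JavadocArrowSketch {

  private JavadocArrowSketch() {}

  /** Builds the same pool as the Javadoc sample, so the sample and the code stay in sync. */
  static ForkJoinPool newPool() {
    return new ForkJoinPool(
        Runtime.getRuntime().availableProcessors(),
        p -> new ForkJoinWorkerThread(p) {}, // real code keeps the plain arrow
        null,
        false);
  }

  public static void main(String[] args) {
    ForkJoinPool pool = newPool();
    System.out.println("parallelism = " + pool.getParallelism());
    pool.shutdown();
  }
}
```

Wrapping the sample line in an inline `{@code ...}` tag is another option that is often used for the same purpose.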
Re: [lucene] branch branch_9x updated: Fix issues with BP tests and the security manager. (#12568)
Looks like Java 11 can't compile this, see https://github.com/apache/lucene/actions/runs/6232257025/job/16915121779#step:5:452 /home/runner/work/lucene/lucene/lucene/misc/src/java/org/apache/lucene/misc/index/BPIndexReorderer.java:78: error: bad use of '>' * p -> new ForkJoinWorkerThread(p) {}, null, random().nextBoolean()); > Task :lucene:misc:compileJava FAILED ^ Note: /home/runner/work/lucene/lucene/lucene/misc/src/java/org/apache/lucene/misc/util/fst/UpToTwoPositiveIntOutputs.java uses or overrides a deprecated API. Note: Recompile with -Xlint:deprecation for details. 1 error Note: Some input files use or override a deprecated API. Not sure what's wrong, I think the problem is with the anonymous subclassing Maybe brackets around the whole "new ForkJoin() {}" helps? Uwe Am 19.09.2023 um 09:04 schrieb jpou...@apache.org: This is an automated email from the ASF dual-hosted git repository. jpountz pushed a commit to branch branch_9x in repository https://gitbox.apache.org/repos/asf/lucene.git The following commit(s) were added to refs/heads/branch_9x by this push: new c241ab006c4 Fix issues with BP tests and the security manager. (#12568) c241ab006c4 is described below commit c241ab006c4be918207adc69bb34fa72a48286f3 Author: Adrien Grand AuthorDate: Tue Sep 19 08:55:48 2023 +0200 Fix issues with BP tests and the security manager. (#12568) The default ForkJoinPool implementation uses a thread factory that removes all permissions on threads, so we need to create our own to avoid tests failing with FS-based directories. 
--- .../src/java/org/apache/lucene/misc/index/BPIndexReorderer.java| 4 +++- .../test/org/apache/lucene/misc/index/TestBPIndexReorderer.java| 7 ++- 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/lucene/misc/src/java/org/apache/lucene/misc/index/BPIndexReorderer.java b/lucene/misc/src/java/org/apache/lucene/misc/index/BPIndexReorderer.java index 7482e7a06ed..b8dadc3f6a0 100644 --- a/lucene/misc/src/java/org/apache/lucene/misc/index/BPIndexReorderer.java +++ b/lucene/misc/src/java/org/apache/lucene/misc/index/BPIndexReorderer.java @@ -74,7 +74,9 @@ import org.apache.lucene.util.OfflineSorter.BufferSize; * * Directory targetDir = FSDirectory.open(targetPath); * BPIndexReorderer reorderer = new BPIndexReorderer(); - * reorderer.setForkJoinPool(ForkJoinPool.commonPool()); + * ForkJoinPool pool = new ForkJoinPool(Runtime.getRuntime().availableProcessors(), + * p -> new ForkJoinWorkerThread(p) {}, null, random().nextBoolean()); + * reorderer.setForkJoinPool(pool); * reorderer.setFields(Collections.singleton("body")); * CodecReader reorderedReaderView = reorderer.reorder(SlowCodecReaderWrapper.wrap(reader), targetDir); * try (IndexWriter w = new IndexWriter(targetDir, new IndexWriterConfig().setOpenMode(OpenMode.CREATE))) { diff --git a/lucene/misc/src/test/org/apache/lucene/misc/index/TestBPIndexReorderer.java b/lucene/misc/src/test/org/apache/lucene/misc/index/TestBPIndexReorderer.java index 4b6a9a85037..13d6989ff74 100644 --- a/lucene/misc/src/test/org/apache/lucene/misc/index/TestBPIndexReorderer.java +++ b/lucene/misc/src/test/org/apache/lucene/misc/index/TestBPIndexReorderer.java @@ -21,6 +21,7 @@ import static org.apache.lucene.misc.index.BPIndexReorderer.fastLog2; import java.io.IOException; import java.util.Arrays; import java.util.concurrent.ForkJoinPool; +import java.util.concurrent.ForkJoinWorkerThread; import org.apache.lucene.document.Document; import org.apache.lucene.document.Field.Store; import org.apache.lucene.document.StoredField; @@ -47,7 
+48,11 @@ public class TestBPIndexReorderer extends LuceneTestCase { public void testSingleTermWithForkJoinPool() throws IOException { int concurrency = TestUtil.nextInt(random(), 1, 8); -ForkJoinPool pool = new ForkJoinPool(concurrency); +// The default ForkJoinPool implementation uses a thread factory that removes all permissions on +// threads, so we need to create our own to avoid tests failing with FS-based directories. +ForkJoinPool pool = +new ForkJoinPool( +concurrency, p -> new ForkJoinWorkerThread(p) {}, null, random().nextBoolean()); try { doTestSingleTerm(pool); } finally { -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail: u...@thetaphi.de - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [JENKINS] Lucene-MMAPv2-Windows (64bit/hotspot/jdk-21-rc) - Build # 801 - Still Unstable!
It may still be a good idea to show an example of how to pass a ForkJoinPool to the sorter that does not limit permissions (just examples for the educated reader). Uwe On 18.09.2023 at 18:18, Adrien Grand wrote: Thanks Uwe for digging. The fork-join pool is optional, I will change the test to use a ByteBuffersDirectory. On Mon, Sep 18, 2023 at 6:15 PM Uwe Schindler wrote: Hi, this issue is a real one. The problem is: The default ForkJoin thread pool runs all tasks with zero permissions if a security manager is present. As the MMap Jenkins enforces usage of MMapDirectory for all tests (it passes -Dtests.directory=MMapDirectory), all disk IO fails. This will be a big issue for Elasticsearch/Opensearch/Solr if we use the default thread pool. If this is a test only issue, we should fix it: * use non-FS-based directory * use our own thread pool If this issue is in 9.8 branch we have to fix it! Uwe On 18.09.2023 at 17:59, Policeman Jenkins Server wrote: Build: https://jenkins.thetaphi.de/job/Lucene-MMAPv2-Windows/801/ Java: 64bit/hotspot/jdk-21-rc -XX:-UseCompressedOops -XX:+UseG1GC 1 tests failed. 
FAILED: org.apache.lucene.misc.index.TestBPIndexReorderer.testSingleTermWithForkJoinPool Error Message: java.security.AccessControlException: access denied ("java.io.FilePermission" "C:\Users\jenkins\workspace\Lucene-MMAPv2-Windows\lucene\misc\build\tmp\tests-tmp\lucene.misc.index.TestBPIndexReorderer_4B02FABB1F62D832-001\index-MMapDirectory-003\forward-index_sort_5.tmp" "write") Stack Trace: java.security.AccessControlException: access denied ("java.io.FilePermission" "C:\Users\jenkins\workspace\Lucene-MMAPv2-Windows\lucene\misc\build\tmp\tests-tmp\lucene.misc.index.TestBPIndexReorderer_4B02FABB1F62D832-001\index-MMapDirectory-003\forward-index_sort_5.tmp" "write") at __randomizedtesting.SeedInfo.seed([4B02FABB1F62D832:77694EDC9D6E8956]:0) at java.base/java.security.AccessControlContext.checkPermission(AccessControlContext.java:488) at java.base/java.security.AccessController.checkPermission(AccessController.java:1071) at java.base/java.lang.SecurityManager.checkPermission(SecurityManager.java:411) at java.base/java.lang.SecurityManager.checkWrite(SecurityManager.java:833) at java.base/sun.nio.fs.WindowsChannelFactory.open(WindowsChannelFactory.java:302) at java.base/sun.nio.fs.WindowsChannelFactory.newFileChannel(WindowsChannelFactory.java:168) at java.base/sun.nio.fs.WindowsFileSystemProvider.newByteChannel(WindowsFileSystemProvider.java:229) at java.base/java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:482) at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.mockfile.FilterFileSystemProvider.newOutputStream(FilterFileSystemProvider.java:198) at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.mockfile.FilterFileSystemProvider.newOutputStream(FilterFileSystemProvider.java:198) at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.mockfile.HandleTrackingFS.newOutputStream(HandleTrackingFS.java:132) at 
org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.mockfile.HandleTrackingFS.newOutputStream(HandleTrackingFS.java:132) at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.mockfile.FilterFileSystemProvider.newOutputStream(FilterFileSystemProvider.java:198) at java.base/java.nio.file.Files.newOutputStream(Files.java:227) at org.apache.lucene.core@10.0.0-SNAPSHOT/org.apache.lucene.store.FSDirectory$FSIndexOutput.(FSDirectory.java:394) at org.apache.lucene.core@10.0.0-SNAPSHOT/org.apache.lucene.store.FSDirectory.createTempOutput(FSDirectory.java:234) at org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.store.MockDirectoryWrapper.createTempOutput(MockDirectoryWrapper.java:752) at org.apache.lucene.core@10.0.0-SNAPSHOT/org.apache.lucene.store.TrackingDirectoryWrapper.createTempOutput(TrackingDirectoryWrapper.java:49) at org.apache.lucene.core@10.0.0-SNAPSHOT/org.apache.lucene.store.TrackingDirectoryWrapper.createTempOutput(TrackingDirectoryWrapper.java:49) at org.apache.lucene.core@10.0.0-SNAPSHOT/org.apache.lucene.util.OfflineSorter$SortPartitionTask.call(OfflineSorter.java:623) at org.apache.lucene.core@10.0.0-SNAPSHOT/org.apache.lucene.util.OfflineSorter$SortPartitionTask.call(OfflineSorter.java:610) at java.base/java.util.concurrent.ForkJoinTask$AdaptedCallable.exec(ForkJoinTask.java:1456) at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387) at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1312) at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1843) at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1808) at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188) -
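As a rough illustration of the "use our own thread pool" option discussed in this thread, the sketch below builds a ForkJoinPool whose workers come from a caller-supplied factory instead of the default one, and then checks which thread class actually runs a task. The class name `OwnFactoryPoolSketch` and both helper methods are hypothetical, and the async-mode flag is fixed to `false` here (the test in the commit randomizes it). The sketch does not install a security manager, so it only shows the wiring, not the permission behavior described above.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinWorkerThread;

final class OwnFactoryPoolSketch {

  private OwnFactoryPoolSketch() {}

  /** Creates a pool whose worker threads are plain (anonymous) ForkJoinWorkerThread subclasses. */
  static ForkJoinPool newPool(int parallelism) {
    return new ForkJoinPool(parallelism, p -> new ForkJoinWorkerThread(p) {}, null, false);
  }

  /** Runs one task on a fresh pool and reports whether it ran on a subclassed worker thread. */
  static boolean runsOnOwnWorker(int parallelism) {
    ForkJoinPool pool = newPool(parallelism);
    try {
      Callable<Boolean> task =
          () -> {
            Thread t = Thread.currentThread();
            // Our factory creates an anonymous subclass, so the runtime class differs
            // from ForkJoinWorkerThread itself.
            return t instanceof ForkJoinWorkerThread
                && t.getClass() != ForkJoinWorkerThread.class;
          };
      return pool.submit(task).join();
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println("task ran on own worker: " + runsOnOwnWorker(2));
  }
}
```

A pool built this way would then be handed to the component that needs it, for example through the `setForkJoinPool` setter shown in the BPIndexReorderer Javadoc diff in this thread.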
Re: [JENKINS] Lucene-MMAPv2-Windows (64bit/hotspot/jdk-21-rc) - Build # 801 - Still Unstable!
Hi, this issue is a real one. The problem is: The default ForkJoin thread pool runs all tasks with zero permissions if a security manager is present. As the MMap Jenkins enforces usage of MMapDirectory for all tests (it passes -Dtests.directory=MMapDirectory), all disk IO fails. This will be a big issue for Elasticsearch/Opensearch/Solr if we use the default thread pool. If this is a test only issue, we should fix it: * use non-FS-based directory * use our own thread pool If this issue is in 9.8 branch we have to fix it! Uwe Am 18.09.2023 um 17:59 schrieb Policeman Jenkins Server: Build:https://jenkins.thetaphi.de/job/Lucene-MMAPv2-Windows/801/ Java: 64bit/hotspot/jdk-21-rc -XX:-UseCompressedOops -XX:+UseG1GC 1 tests failed. FAILED: org.apache.lucene.misc.index.TestBPIndexReorderer.testSingleTermWithForkJoinPool Error Message: java.security.AccessControlException: access denied ("java.io.FilePermission" "C:\Users\jenkins\workspace\Lucene-MMAPv2-Windows\lucene\misc\build\tmp\tests-tmp\lucene.misc.index.TestBPIndexReorderer_4B02FABB1F62D832-001\index-MMapDirectory-003\forward-index_sort_5.tmp" "write") Stack Trace: java.security.AccessControlException: access denied ("java.io.FilePermission" "C:\Users\jenkins\workspace\Lucene-MMAPv2-Windows\lucene\misc\build\tmp\tests-tmp\lucene.misc.index.TestBPIndexReorderer_4B02FABB1F62D832-001\index-MMapDirectory-003\forward-index_sort_5.tmp" "write") at __randomizedtesting.SeedInfo.seed([4B02FABB1F62D832:77694EDC9D6E8956]:0) at java.base/java.security.AccessControlContext.checkPermission(AccessControlContext.java:488) at java.base/java.security.AccessController.checkPermission(AccessController.java:1071) at java.base/java.lang.SecurityManager.checkPermission(SecurityManager.java:411) at java.base/java.lang.SecurityManager.checkWrite(SecurityManager.java:833) at java.base/sun.nio.fs.WindowsChannelFactory.open(WindowsChannelFactory.java:302) at 
java.base/sun.nio.fs.WindowsChannelFactory.newFileChannel(WindowsChannelFactory.java:168) at java.base/sun.nio.fs.WindowsFileSystemProvider.newByteChannel(WindowsFileSystemProvider.java:229) at java.base/java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:482) atorg.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.mockfile.FilterFileSystemProvider.newOutputStream(FilterFileSystemProvider.java:198) atorg.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.mockfile.FilterFileSystemProvider.newOutputStream(FilterFileSystemProvider.java:198) atorg.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.mockfile.HandleTrackingFS.newOutputStream(HandleTrackingFS.java:132) atorg.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.mockfile.HandleTrackingFS.newOutputStream(HandleTrackingFS.java:132) atorg.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.mockfile.FilterFileSystemProvider.newOutputStream(FilterFileSystemProvider.java:198) at java.base/java.nio.file.Files.newOutputStream(Files.java:227) atorg.apache.lucene.core@10.0.0-SNAPSHOT/org.apache.lucene.store.FSDirectory$FSIndexOutput.(FSDirectory.java:394) atorg.apache.lucene.core@10.0.0-SNAPSHOT/org.apache.lucene.store.FSDirectory.createTempOutput(FSDirectory.java:234) atorg.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.store.MockDirectoryWrapper.createTempOutput(MockDirectoryWrapper.java:752) atorg.apache.lucene.core@10.0.0-SNAPSHOT/org.apache.lucene.store.TrackingDirectoryWrapper.createTempOutput(TrackingDirectoryWrapper.java:49) atorg.apache.lucene.core@10.0.0-SNAPSHOT/org.apache.lucene.store.TrackingDirectoryWrapper.createTempOutput(TrackingDirectoryWrapper.java:49) atorg.apache.lucene.core@10.0.0-SNAPSHOT/org.apache.lucene.util.OfflineSorter$SortPartitionTask.call(OfflineSorter.java:623) 
atorg.apache.lucene.core@10.0.0-SNAPSHOT/org.apache.lucene.util.OfflineSorter$SortPartitionTask.call(OfflineSorter.java:610) at java.base/java.util.concurrent.ForkJoinTask$AdaptedCallable.exec(ForkJoinTask.java:1456) at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387) at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1312) at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1843) at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1808) at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188) - To unsubscribe, e-mail:builds-unsubscr...@lucene.apache.org For additional commands, e-mail:builds-h...@lucene.apache.org -- Uwe Sch
Re: Lucene 9.8 Release
Hi, I also rebuilt the Java 21 Panama APIJAR files with latest RC of Java 21, no changes, so all perfect. In case of any changes I would have recommended to rebuild the apijar files. P.S.: For Java 22 the API of MemorySegments will be finalized and officially released, unfortunately vector API is still in never-ending incubation. Uwe Am 12.09.2023 um 08:07 schrieb Patrick Zhai: Hi all, It's been a while since the last release and we have quite a few good changes including new APIs, improvements and bug fixes. Should we release the 9.8? If there's no objections I volunteer to be the release manager and will cut the feature branch a week from now, which is Sep. 18th PST. Best Patrick -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail: u...@thetaphi.de - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [JENKINS] Lucene » Lucene-NightlyTests-9.x - Build # 665 - Unstable!
Hi, I commented on the thread. We need to keep the timeout, otherwise a failing client (e.g. a process that does not start up) will cause the server to hang forever. Let's discuss in the issue. Instead we need more information if one of the clients did not start in time (the test starting server and clients should print stderr/stdout of all started clients for debugging purposes on failure). Uwe On 02.09.2023 at 13:47, Michael McCandless wrote: OK I opened https://github.com/apache/lucene/pull/12535 Mike McCandless http://blog.mikemccandless.com On Sat, Sep 2, 2023 at 7:17 AM Michael McCandless wrote: > The code is just good old socket accept loop as we have all learned it in school when we were fighting to write a small echo server with C. LOL this is all my fault from long ago, showing my poor understanding of sockets/networking/C echo servers!! So it sounds like the client was just super slow in starting up and didn't connect to the server within the timeout. So maybe we just remove the timeout entirely (client will eventually start up?), and remove the pointless SO_REUSEADDR? I'll try to whip up a PR. Mike McCandless http://blog.mikemccandless.com On Sat, Sep 2, 2023 at 6:53 AM Uwe Schindler wrote: Let's fix this issue with bogus socket reuse. I am not sure why it is there. We touched the code last time around 2012. Why does it have a timeout in the server at all? Normally the accept() call should have no timeout. If the client does not start fast enough, of course it runs into the timeout. The code is just good old socket accept loop as we have all learned it in school when we were fighting to write a small echo server with C. The bug here is the timeout. A timeout should only be in the client and not in the waiting call. Uwe On 31 August 2023 14:53:44 CEST, Robert Muir wrote: I looked at this lockverifyserver and would say it's probably just the craziness of this code. it sets 30 second socket timeout and intentionally calls accept() when there is nothing yet to accept... 
well no wonder we see this issue. p.s. why does it set SO_REUSEADDR? no reason to do this leniency when binding to port 0. nuke it. On Thu, Aug 31, 2023 at 8:46 AM Robert Muir wrote: probably a bug in some jvm sockets code that called accept() in its default blocking mode, when there wasn't any connection to accept? in that case accept() call will just block and wait for someone to make a new connection. On Thu, Aug 31, 2023 at 8:16 AM Dawid Weiss wrote: https://ge.apache.org/s/orksynljk2yp6/tests/task/:lucene:core:test/details/org.apache.lucene.store.TestStressLockFactories/testSimpleFSLockFactory?top-execution=1 This test took 31 seconds... An extremely slow vm, perhaps? I don't know what the default connection timeouts are... it does look weird though. Dawid On Thu, Aug 31, 2023 at 1:08 PM Michael McCandless wrote: Good grief -- why are we getting SocketTimeoutException in our LockVerifyServer's attempt to accept an incoming connection!? These are all processes running on the same host ... Mike McCandless http://blog.mikemccandless.com On Tue, Aug 29, 2023 at 11:17 PM Apache Jenkins Server wrote: Build: https://ci-builds.apache.org/job/Lucene/job/Lucene-NightlyTests-9.x/665/ 2 tests failed. FAILED: org.apache.lucene.store.TestStressLockFactories.testSimpleFSLockFactory Error Message: java.net.SocketTimeoutException: Accept timed out Stack Trace: java.net.SocketTimeoutException: Accept timed out at __randomizedtesting.SeedInfo.seed([E1AD0D2AD68BA993:F325FE2A6E367AC7]:0) at java.base/java.net.PlainSocketImpl.socketAccept(Native Method) at java.base/java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:474) at java.base/java.net.ServerSocket.implAccept(ServerSocket.java:565) at java.base/java.net.ServerSocket.accept(ServerSocket.j
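To make the accept() timeout behavior discussed in this thread concrete, here is a minimal sketch that is not the LockVerifyServer code: a server socket bound to port 0 on the loopback interface, with a timeout set via `setSoTimeout`, which either accepts a client that has already connected or gives up with a `SocketTimeoutException`. The class name `AcceptTimeoutSketch`, the method `acceptOnce`, and the timeout values are all made up for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

final class AcceptTimeoutSketch {

  private AcceptTimeoutSketch() {}

  /**
   * Opens a server socket on an ephemeral loopback port and waits for one connection.
   *
   * @return true if a client was accepted, false if accept() timed out
   */
  static boolean acceptOnce(int timeoutMillis, boolean connectClient) {
    InetAddress loopback = InetAddress.getLoopbackAddress();
    try (ServerSocket server = new ServerSocket(0, 50, loopback)) {
      // Without this call accept() would block until a client shows up.
      server.setSoTimeout(timeoutMillis);
      // The connection is queued in the listen backlog, so connecting before accept() is fine.
      Socket client = connectClient ? new Socket(loopback, server.getLocalPort()) : null;
      try (Socket accepted = server.accept()) {
        return true;
      } catch (SocketTimeoutException e) {
        // This is the place where a test harness could dump client stdout/stderr.
        System.err.println("No client connected within " + timeoutMillis + " ms");
        return false;
      } finally {
        if (client != null) {
          client.close();
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("with client:    " + acceptOnce(2000, true));
    System.out.println("without client: " + acceptOnce(200, false));
  }
}
```

The timeout branch is where the extra diagnostics asked for above (output of the client processes that failed to connect) would naturally be printed before failing the test.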
Re: [JENKINS] Lucene » Lucene-NightlyTests-9.x - Build # 665 - Unstable!
rencePipeline.java:528) >> >>> at >> >>> org.apache.tools.ant.DirectoryScanner.isIncluded(DirectoryScanner.java:1374) >> >>> at >> >>> org.apache.tools.ant.DirectoryScanner.scandir(DirectoryScanner.java:1260) >> >>> at >> >>> org.apache.tools.ant.DirectoryScanner.scandir(DirectoryScanner.java:1267) >> >>> at >> >>> org.apache.tools.ant.DirectoryScanner.scandir(DirectoryScanner.java:1267) >> >>> at >> >>> org.apache.tools.ant.DirectoryScanner.scandir(DirectoryScanner.java:1267) >> >>> at >> >>> org.apache.tools.ant.DirectoryScanner.scandir(DirectoryScanner.java:1267) >> >>> at >> >>> org.apache.tools.ant.DirectoryScanner.scandir(DirectoryScanner.java:1267) >> >>> at >> >>> org.apache.tools.ant.DirectoryScanner.scandir(DirectoryScanner.java:1267) >> >>> at >> >>> org.apache.tools.ant.DirectoryScanner.scandir(DirectoryScanner.java:1194) >> >>> at >> >>> org.apache.tools.ant.DirectoryScanner.scandir(DirectoryScanner.java:1156) >> >>> at >> >>> org.apache.tools.ant.DirectoryScanner.checkIncludePatterns(DirectoryScanner.java:954) >> >>> at >> >>> org.apache.tools.ant.DirectoryScanner.scan(DirectoryScanner.java:912) >> >>> at >> >>> hudson.FilePath$ValidateAntFileMask.hasMatch(FilePath.java:3313) >> >>> Caused: hudson.FilePath$FileMaskNoMatchesFoundException: no matches >> >>> found within 1 >> >>> at >> >>> hudson.FilePath$ValidateAntFileMask.hasMatch(FilePath.java:3318) >> >>> at hudson.FilePath$ValidateAntFileMask.invoke(FilePath.java:3196) >> >>> at hudson.FilePath$ValidateAntFileMask.invoke(FilePath.java:3174) >> >>> at hudson.FilePath$FileCallableWrapper.call(FilePath.java:3578) >> >>> Also: hudson.remoting.Channel$CallSiteStackTrace: Remote call to >> >>> lucene-solr-2 >> >>> at >> >>> hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1784) >> >>> at >> >>> hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:356) >> >>> at hudson.remoting.Channel.call(Channel.java:1000) >> >>> at hudson.FilePath.act(FilePath.java:1192) >> >>> at 
hudson.FilePath.act(FilePath.java:1181) >> >>> at >> >>> hudson.FilePath.validateAntFileMask(FilePath.java:3171) >> >>> at >> >>> hudson.tasks.ArtifactArchiver.perform(ArtifactArchiver.java:271) >> >>> at >> >>> hudson.tasks.BuildStepCompatibilityLayer.perform(BuildStepCompatibilityLayer.java:80) >> >>> at >> >>> hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20) >> >>> at >> >>> hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:818) >> >>> at >> >>> hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:767) >> >>> at >> >>> hudson.model.Build$BuildExecution.post2(Build.java:179) >> >>> at >> >>> hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:711) >> >>> at hudson.model.Run.execute(Run.java:1925) >> >>> at >> >>> hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:44) >> >>> at >> >>> hudson.model.ResourceController.execute(ResourceController.java:101) >> >>> at hudson.model.Executor.run(Executor.java:442) >> >>> Caused: hudson.FilePath$TunneledInterruptedException >> >>> at hudson.FilePath$FileCallableWrapper.call(FilePath.java:3580) >> >>> at hudson.remoting.UserRequest.perform(UserRequest.java:211) >> >>> at hudson.remoting.UserRequest.perform(UserRequest.java:54) >> >>> at hudson.remoting.Request$2.run(Request.java:377) >> >>> at >> >>> hudson.remoting.InterceptingExecutorService.lambda$wrap$0(InterceptingExecutorService.java:78) >> >>> at >> >>> java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) >> >>> at >> >>> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) >> >>> at >> >>> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) >> >>> at java.base/java.lang.Thread.run(Thread.java:829) >> >>> Caused: java.lang.InterruptedException: >> >>> hudson.FilePath$FileMaskNoMatchesFoundException: no matches found within >> >>> 1 >> >>> at 
hudson.FilePath.act(FilePath.java:1194) >> >>> at hudson.FilePath.act(FilePath.java:1181) >> >>> at hudson.FilePath.validateAntFileMask(FilePath.java:3171) >> >>> at >> >>> hudson.tasks.ArtifactArchiver.perform(ArtifactArchiver.java:271) >> >>> at >> >>> hudson.tasks.BuildStepCompatibilityLayer.perform(BuildStepCompatibilityLayer.java:80) >> >>> at >> >>> hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20) >> >>> at >> >>> hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:818) >> >>> at >> >>> hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:767) >> >>> at hudson.model.Build$BuildExecution.post2(Build.java:179) >> >>> at >> >>> hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:711) >> >>> at hudson.model.Run.execute(Run.java:1925) >> >>> at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:44) >> >>> at >> >>> hudson.model.ResourceController.execute(ResourceController.java:101) >> >>> at hudson.model.Executor.run(Executor.java:442) >> >>> No artifacts found that match the file pattern >> >>> "**/*.events,heapdumps/**,**/hs_err_pid*". Configuration error? >> >>> Recording test results >> >>> [Checks API] No suitable checks publisher found. >> >>> Build step 'Publish JUnit test result report' changed build result to >> >>> UNSTABLE >> >>> Email was triggered for: Unstable (Test Failures) >> >>> Sending email for trigger: Unstable (Test Failures) >> >>> >> >>> - >> >>> To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org >> >>> For additional commands, e-mail: builds-h...@lucene.apache.org > >- >To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org >For additional commands, e-mail: builds-h...@lucene.apache.org > -- Uwe Schindler Achterdiek 19, 28357 Bremen https://www.thetaphi.de
Re: Vector Search with OpenAI Embeddings: Lucene Is All You Need
Hey, Very nice article! Looks like lots of manual work to look at search results in those examples. Great work! Do you have a DOI name for the article? Uwe On 1 September 2023 07:22:09 CEST, Kent Fitch wrote: >My testing shows Lucene's HNSW in a very positive light. The ability to >perform blended searches (vector/semantic and text) is valuable, even with >high quality embeddings, and helps when the searcher's intent is to search >for specific words or phrases (such as a name, or exact concepts) which get >blurred-out by semantics. I discussed blended searching using Lucene in >this Code4Lib article: https://journal.code4lib.org/articles/17443 > >And regarding performance, I have benchmarked Lucene's HNSW (circa Jan 2023 >snapshot) on a test index of 192 million vectors of 1536 dimensions, >reduced by PQ coding to 512 bytes and stored in HNSW. Building this index >was slow (lots of time merging...) but once it was built, it did fit >entirely in memory (core i7-9800x (8 cores) with 128gb DDR4 memory running >at 2400 MT/s) so no IO was required at search time. (I modified the lucene >similarity code to support expansion of each of the 512 PQ byte codes back >to 3 floats for the distance calculation.) I haven't updated this to take >advantage of the latest SIMD capability, but even so, once the HNSW >structure is in memory, a single-threaded topK=10 search thread achieves >2.4 queries/second. Two threads: 4.9 q/s, 4 threads: 7.2 q/s, maxing out at >8 threads: 9.4 q/s. I guess the non-linear scaling with threads is due to >competition for memory bandwidth and cache. Curiously, I'm not getting >nearly as good performance out of the box using Milvus 2.3's diskANN, but I >need to find out why before condemning it. > >Kent Fitch > >On Thu, Aug 31, 2023 at 7:53 PM Michael McCandless < >luc...@mikemccandless.com> wrote: > >> Thanks Michael, very interesting! 
I of course agree that Lucene is all >> you need, heh ;) >> >> Jimmy Lin also tweeted about the strength of Lucene's HNSW: >> https://twitter.com/lintool/status/1681333664431460353?s=20 >> >> Mike McCandless >> >> http://blog.mikemccandless.com >> >> >> On Thu, Aug 31, 2023 at 3:31 AM Michael Wechner >> wrote: >> >>> Hi Together >>> >>> You might be interested in this paper / article >>> >>> https://arxiv.org/abs/2308.14963 >>> >>> Thanks >>> >>> Michael >>> >>> - >>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >>> For additional commands, e-mail: dev-h...@lucene.apache.org >>> >>> -- Uwe Schindler Achterdiek 19, 28357 Bremen https://www.thetaphi.de
Re: [JENKINS] Lucene-9.x-Linux (64bit/openj9/jdk-17.0.5) - Build # 12560 - Unstable!
Hi,

Let me update the version of OpenJ9 and let's see. We should open a bug report if it persists.

Uwe

On 28 August 2023 at 09:28:11 CEST, Dawid Weiss wrote:
>The real reason for this is buried in other stack traces from
>barrier-broken threads, it's this assertion:
>
>Caused by:
>java.lang.AssertionError
>at __randomizedtesting.SeedInfo.seed([F7B4CD7A5624D5EC]:0)
>at app//org.junit.Assert.fail(Assert.java:87)
>at app//org.junit.Assert.assertTrue(Assert.java:42)
>at app//org.junit.Assert.assertTrue(Assert.java:53)
>at app//org.apache.lucene.index.TestIndexWriterThreadsToSegments$CheckSegmentCount.run(TestIndexWriterThreadsToSegments.java:150)
>at java.base@17.0.5/java.util.concurrent.CyclicBarrier.dowait(CyclicBarrier.java:222)
>at java.base@17.0.5/java.util.concurrent.CyclicBarrier.await(CyclicBarrier.java:364)
>at app//org.apache.lucene.index.TestIndexWriterThreadsToSegments$2.run(TestIndexWriterThreadsToSegments.java:236)
>
>On Mon, Aug 28, 2023 at 2:04 AM Policeman Jenkins Server <jenk...@thetaphi.de> wrote:
>
>> Build: https://jenkins.thetaphi.de/job/Lucene-9.x-Linux/12560/
>> Java: 64bit/openj9/jdk-17.0.5 -XX:+UseCompressedOops -Xgcpolicy:balanced
>>
>> 2 tests failed.
>> FAILED: >> org.apache.lucene.index.TestIndexWriterThreadsToSegments.testSegmentCountOnFlushRandom >> >> Error Message: >> com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an >> uncaught exception in thread: Thread[id=1089, name=Thread-814, >> state=RUNNABLE, group=TGRP-TestIndexWriterThreadsToSegments] >> >> Stack Trace: >> com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an >> uncaught exception in thread: Thread[id=1089, name=Thread-814, >> state=RUNNABLE, group=TGRP-TestIndexWriterThreadsToSegments] >> Caused by: java.lang.RuntimeException: >> java.util.concurrent.BrokenBarrierException >> at __randomizedtesting.SeedInfo.seed([F7B4CD7A5624D5EC]:0) >> at >> app//org.apache.lucene.index.TestIndexWriterThreadsToSegments$2.run(TestIndexWriterThreadsToSegments.java:239) >> Caused by: java.util.concurrent.BrokenBarrierException >> at java.base@17.0.5 >> /java.util.concurrent.CyclicBarrier.dowait(CyclicBarrier.java:252) >> at java.base@17.0.5 >> /java.util.concurrent.CyclicBarrier.await(CyclicBarrier.java:364) >> at >> app//org.apache.lucene.index.TestIndexWriterThreadsToSegments$2.run(TestIndexWriterThreadsToSegments.java:236) >> >> >> FAILED: >> org.apache.lucene.index.TestIndexWriterThreadsToSegments.classMethod >> >> Error Message: >> java.lang.AssertionError: The test or suite printed 8227 bytes to stdout >> and stderr, even though the limit was set to 8192 bytes. Increase the limit >> with @Limit, ignore it completely with @SuppressSysoutChecks or run with >> -Dtests.verbose=true >> >> Stack Trace: >> java.lang.AssertionError: The test or suite printed 8227 bytes to stdout >> and stderr, even though the limit was set to 8192 bytes. 
Increase the limit >> with @Limit, ignore it completely with @SuppressSysoutChecks or run with >> -Dtests.verbose=true >> at __randomizedtesting.SeedInfo.seed([F7B4CD7A5624D5EC]:0) >> at >> app//org.apache.lucene.tests.util.TestRuleLimitSysouts.afterIfSuccessful(TestRuleLimitSysouts.java:283) >> at >> app//com.carrotsearch.randomizedtesting.rules.TestRuleAdapter$1.afterIfSuccessful(TestRuleAdapter.java:36) >> at >> app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:37) >> at >> app//org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) >> at >> app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43) >> at >> app//org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44) >> at >> app//org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60) >> at >> app//org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47) >> at app//org.junit.rules.RunRules.evaluate(RunRules.java:20) >> at >> app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) >> at >> app//com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakCont
Re: WrongThreadException using the new Panama MMap on Java 19
Hi,

this error indeed cannot happen, as all our segments are shared. It could still be some bug in the Java 19 version; did you try Java 21 or Java 20? It may also be a Corretto problem; maybe contact their team, maybe they have applied some changes. ScopedMemoryAccess is using an extension to the original Java memory model internally (I think they changed something in the specs), so it changed quite a lot internally. Maybe Corretto has some patches for HotSpot that make the memory model changes hit us?

I don't think the bug is in Lucene's code, because if a segment is shared, it is shared. Another possible problem: have you maybe accidentally closed the IndexInput too early? Normally this should cause an IllegalStateException (we have a test for this), but I am not fully sure what happens if the shared scope was already closed. I remember there were some bugs in 19, but it is already too long ago. So please try with plain OpenJDK Java 21 (or 20).

I would like to know more about the speed improvements! In our benchmarking they were not so visible (only a slight change), so happy to see more.

Uwe

On 17.08.2023 at 12:43, Michael McCandless wrote:

Hi Team,

We hit an interesting and exciting intermittent exception in our customer-facing product search instance (all Lucene!) at Amazon:

java.lang.WrongThreadException: Attempted access outside owning thread
    at java.base/jdk.internal.foreign.MemorySessionImpl.wrongThread(MemorySessionImpl.java:460)
    at java.base/jdk.internal.misc.ScopedMemoryAccess$ScopedAccessError.newRuntimeException(ScopedMemoryAccess.java:113)
    at java.base/jdk.internal.misc.ScopedMemoryAccess.getByte(ScopedMemoryAccess.java:518)
    at java.base/java.lang.invoke.VarHandleSegmentAsBytes.get(VarHandleSegmentAsBytes.java:109)
    at java.base/java.lang.foreign.MemorySegment.get(MemorySegment.java:1103)
    at org.apache.lucene.store.MemorySegmentIndexInput$SingleSegmentImpl.readByte(MemorySegmentIndexInput.java:485)
    at org.apache.lucene.util.fst.ReverseRandomAccessReader.readByte(ReverseRandomAccessReader.java:33)
    at org.apache.lucene.util.fst.FST.findTargetArc(FST.java:1444)
    at org.apache.lucene.codecs.lucene90.blocktree.SegmentTermsEnum.seekExact(SegmentTermsEnum.java:511)
    at org.apache.lucene.index.TermStates.loadTermsEnum(TermStates.java:111)
    at org.apache.lucene.index.TermStates.build(TermStates.java:96)

We are using Corretto Java full version: openjdk full version "19.0.2+9"

Looking at how Uwe's magic mrjar code works, it doesn't look like we ever make a thread private MemorySegment? If so, I don't see how this exception could be occurring :) We seem to do this:

final MemorySession session = MemorySession.openShared();

Or, maybe we do sometimes make thread private memory segments, and maybe we (Amazon's sources) have a silly thread over-sharing bug, but so far I think that's unlikely -- we are calling TermStates.build from a single thread, which under the hood clones/slices the MMap IndexInputs to seek the terms dictionary on each segment and only that one thread ever interacts with those. It's all just one thread under TermStates.build.

This only happened on a few hosts and only for a short period of time, making me suspect some sort of intermittent JVM bug (e.g. HotSpot miscompilation or so).
It is clearly very rare, so we are still using the new MMap (which btw seems to be a big performance gain for our service, which we are still trying to fully understand, more on that later!). Has anyone else seen such errant exceptions with the new Panama based MMap? Are there any known Java issues that smell like this? (A quick search on bugs.openjdk.org <http://bugs.openjdk.org> (https://bugs.openjdk.org/browse/JDK-8287809?jql=issuetype%20%3D%20Bug%20AND%20text%20~%20WrongThreadException) did not seem to turn up any obvious candidates). Thanks, Mike McCandless http://blog.mikemccandless.com -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
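For readers unfamiliar with the distinction discussed in this thread, here is a minimal sketch of when a WrongThreadException is expected. It assumes the finalized java.lang.foreign API of Java 22 or later (`Arena.ofConfined()` / `Arena.ofShared()`); the Java 19 preview API quoted above used `MemorySession.openShared()` instead. The class `ArenaThreadDemo` is hypothetical and is not Lucene's MMapDirectory code.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical demo: confined segments reject foreign threads, shared ones do not. */
final class ArenaThreadDemo {

  /** Reads one byte on a fresh thread; returns what was thrown there, or null. */
  static Throwable readOnOtherThread(MemorySegment segment) {
    AtomicReference<Throwable> failure = new AtomicReference<>();
    Thread reader = new Thread(() -> {
      try {
        segment.get(ValueLayout.JAVA_BYTE, 0L);
      } catch (Throwable t) {
        failure.set(t);
      }
    });
    reader.start();
    try {
      reader.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return failure.get();
  }

  /** A confined arena is owned by the creating thread only. */
  public static boolean confinedRejectsOtherThread() {
    try (Arena arena = Arena.ofConfined()) {
      return readOnOtherThread(arena.allocate(8)) instanceof WrongThreadException;
    }
  }

  /** A shared arena may be accessed from any thread while it is open. */
  public static boolean sharedAllowsOtherThread() {
    try (Arena arena = Arena.ofShared()) {
      return readOnOtherThread(arena.allocate(8)) == null;
    }
  }
}
```

With a shared arena or session the second case should always hold, which is why the exception reported above is surprising and why a JVM-level cause is suspected in the thread.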
Re: Branchless binary search in Java?
Actually this is exactly the same for Java: You can try whatever you want, the outcome of the dynamic optimization applied by various dynamic building blocks (Java bytecode, Java/Hotspot version, command line parameters, hardware CPU, virtualization) is not predictable and any change anywhere may produce different results.

So we should stop arguing about changing *our* code to improve assembly code. If we have some code on our side and it is not correctly converted to CMOV, we should open a bug report on OpenJDK and ask for improvement (Chris H. and I can do this easily).

As you have seen in my other answer to this thread: Hotspot applies CMOV depending on analysis of branches. So in general our code *should* make use of CMOV. You can only get certainty by using hsdis and printing the assembly for some of our methods which you think should use CMOV. But there's no guarantee that it is applied. And as always: It may take a very long time until Hotspot replaces the standard branched code by conditional moves (as they have significant overhead if used in cases where the result is

With Hotspot you can try to add -XX:ConditionalMoveLimit ("Limit of ops to make speculative when using CMOVE") and try with different values (0 disables, default is 3 on x86 and aarch64, 4 on arm). But as always: Wait long enough.

To enforce usage of CMOV (maybe that's the first thing to try, to look at the type of assembly created; but this may slow down other code, as CMOV is then always used, without analysis): -XX:+UseCMoveUnconditionally ("Generates CMove (scalar and vector) instructions regardless of profitability analysis.")

Uwe

P.S.: Hotspot also has cmov for vectorized code

On 28.07.2023 at 09:08, Dawid Weiss wrote:

Specifically, one of the fascinating Tantivy optimizations is the branchless binary search: https://quickwit.io/blog/search-a-sorted-block.

This is an interesting post, thanks for sharing, Mike.
I remember when people did such low-level tricks frequently (but on much simpler processors and fairly consistent hardware) and it always makes me wonder whether all the moving blocks involved here (rust, llvm, actual hardware) make it sane - any change in any of these layers may affect the outcome (and debugging what actually happened will be a nightmare...). I like it though - nice intellectual exercise and some assembly dumps for a change. ;) D. -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: Branchless binary search in Java?
See Shipilev's blog: https://shipilev.net/jvm/anatomy-quarks/30-conditional-moves/

He also has some examples, and there's a command line option to tell hotspot when to use cmov: -XX:ConditionalMoveLimit

Uwe

On 27 July 2023 at 14:43:19 CEST, Uwe Schindler wrote:
>Hi Mike,
>
>actually Hotspot is using CMOV. Some nodes after bytecode analysis are
>converted to CMoveNode and it has implementations to generate machine code (as
>far as I see) for x86, s390 and arm.
>
>The generic code is here:
>https://github.com/openjdk/jdk/blob/486c7844f902728ce580c3994f58e3e497834952/src/hotspot/share/opto/movenode.cpp
>
>Actually it is used in some cases, but I did not find out when it uses it to
>generate instructions from bytecode. It also has some code to optimize the
>cmov away if the result is known before (or not, e.g. for floats it does not
>do this).
>
>I think it would be best to ask on the hotspot compiler list for suggestions
>on how to write Java code to trigger the JVM to insert the CMOV.
>
>Uwe
>
>On 27.07.2023 at 13:40, Michael McCandless wrote:
>> Hi Team,
>>
>> At Amazon (customer facing product search team) we've been playing with /
>> benchmarking Tantivy (exciting Rust search engine loosely inspired by
>> Lucene: https://github.com/quickwit-oss/tantivy, created by Paul Masurel and
>> developed now by Quickwit and the Tantivy open source dev community) vs
>> Lucene, by building a benchmark that blends Tantivy's search-benchmark-game
>> (https://github.com/quickwit-oss/search-benchmark-game) and Lucene's nightly
>> benchmarks (luceneutil: https://github.com/mikemccand/luceneutil and
>> https://home.apache.org/~mikemccand/lucenebench).
>>
>> It's great fun, and we would love more eyeballs to spot any remaining unfair
>> aspects of the comparison (thank you Uwe and Adrien for catching things
>> already!).
We are trying hard for an apples to apples comparison: same >> (enwiki) corpus, same queries, confirming we get precisely the same top N >> hits and total hit counts, latest versions of both engines. >> >> Indeed, Tantivy is substantially (2-3X?) faster than Lucene for many >> queries, and we've been trying to understand why. >> >> Sometimes it is due to a more efficient algorithms, e.g. the count() API for >> pure disjunctive queries, which Adrien is now porting to Lucene (thank >> you!), showing sizable (~80% faster in one query) speedups: >> https://github.com/apache/lucene/issues/12358. Other times may be due to >> Rust's more efficient/immediate/Python-like GC, or direct access to SIMD >> (Lucene now has this for aKNN search -- big speedup -- and maybe soon for >> postings too?), and unsafe code, different skip data / block postings >> encoding, or ... >> >> Specifically, one of the fascinating Tantivy optimizations is the branchless >> binary search: https://quickwit.io/blog/search-a-sorted-block. Here's >> another blog post about it (implemented in C++): >> https://probablydance.com/2023/04/27/beautiful-branchless-binary-search/ >> >> The idea is to get rustc/gcc to compile down to x86-64's CMOVcc >> ("conditional move") instruction (I'm not sure if ARM has an equivalent? >> Maybe "conditional execution"?). The idea is to avoid a "true" branch of >> the instruction stream (costly pipeline flush on mis-prediction, which is >> likely in a binary search or priority queue context) by instead >> conditionally moving a value from one location to another (register or >> memory). Tantivy uses this for skipping through postings, in a single layer >> in-memory skip list structure (versus Lucene's on-disk (memory-mapped, by >> default) multi-layer skip list) -- see the above blog post. >> >> This made me wonder: does javac's hotspot compiler use CMOVcc? 
I see javac >> bug fixes like https://github.com/openjdk/mobile/commit/a03e9220 which seems >> to imply C2 does in fact compile to CMOVcc sometimes. So then I wondered >> whether a branchless binary search in Java is a possibility? Has anyone >> played with this? >> >> Before Robert gets too upset :) Even if we could build such a thing, the >> right place for such optimizations is likely the JDK itself (see the similar >> discussion about SIMD-optimized sorting: >> https://github.com/apache/lucene/issues/12399). Still, I'm curious if anyone >> has explored this, and maybe saw some performance gains from way up in >> javaland where we can barely see the bare metal shining beneath us :) >> >> Sorry for the long post! >> >> Mike McCandless >> >> http://blog.mikemccandless.com > >-- >Uwe Schindler >Achterdiek 19, D-28357 Bremen >https://www.thetaphi.de >eMail:u...@thetaphi.de -- Uwe Schindler Achterdiek 19, 28357 Bremen https://www.thetaphi.de
Re: Branchless binary search in Java?
Hi Mike,

actually Hotspot is using CMOV. Some nodes after bytecode analysis are converted to CMoveNode and it has implementations to generate machine code (as far as I see) for x86, s390 and arm.

The generic code is here: https://github.com/openjdk/jdk/blob/486c7844f902728ce580c3994f58e3e497834952/src/hotspot/share/opto/movenode.cpp

Actually it is used in some cases, but I did not find out when it uses it to generate instructions from bytecode. It also has some code to optimize the cmov away if the result is known before (or not, e.g. for floats it does not do this).

I think it would be best to ask on the hotspot compiler list for suggestions on how to write Java code to trigger the JVM to insert the CMOV.

Uwe

On 27.07.2023 at 13:40, Michael McCandless wrote:

Hi Team,

At Amazon (customer facing product search team) we've been playing with / benchmarking Tantivy (exciting Rust search engine loosely inspired by Lucene: https://github.com/quickwit-oss/tantivy, created by Paul Masurel and developed now by Quickwit and the Tantivy open source dev community) vs Lucene, by building a benchmark that blends Tantivy's search-benchmark-game (https://github.com/quickwit-oss/search-benchmark-game) and Lucene's nightly benchmarks (luceneutil: https://github.com/mikemccand/luceneutil and https://home.apache.org/~mikemccand/lucenebench).

It's great fun, and we would love more eyeballs to spot any remaining unfair aspects of the comparison (thank you Uwe and Adrien for catching things already!). We are trying hard for an apples to apples comparison: same (enwiki) corpus, same queries, confirming we get precisely the same top N hits and total hit counts, latest versions of both engines.

Indeed, Tantivy is substantially (2-3X?) faster than Lucene for many queries, and we've been trying to understand why.

Sometimes it is due to more efficient algorithms, e.g.
the count() API for pure disjunctive queries, which Adrien is now porting to Lucene (thank you!), showing sizable (~80% faster in one query) speedups: https://github.com/apache/lucene/issues/12358. Other times may be due to Rust's more efficient/immediate/Python-like GC, or direct access to SIMD (Lucene now has this for aKNN search -- big speedup -- and maybe soon for postings too?), and unsafe code, different skip data / block postings encoding, or ... Specifically, one of the fascinating Tantivy optimizations is the branchless binary search: https://quickwit.io/blog/search-a-sorted-block. Here's another blog post about it (implemented in C++): https://probablydance.com/2023/04/27/beautiful-branchless-binary-search/ The idea is to get rustc/gcc to compile down to x86-64's CMOVcc ("conditional move") instruction (I'm not sure if ARM has an equivalent? Maybe "conditional execution"?). The idea is to avoid a "true" branch of the instruction stream (costly pipeline flush on mis-prediction, which is likely in a binary search or priority queue context) by instead conditionally moving a value from one location to another (register or memory). Tantivy uses this for skipping through postings, in a single layer in-memory skip list structure (versus Lucene's on-disk (memory-mapped, by default) multi-layer skip list) -- see the above blog post. This made me wonder: does javac's hotspot compiler use CMOVcc? I see javac bug fixes like https://github.com/openjdk/mobile/commit/a03e9220 which seems to imply C2 does in fact compile to CMOVcc sometimes. So then I wondered whether a branchless binary search in Java is a possibility? Has anyone played with this? Before Robert gets too upset :) Even if we could build such a thing, the right place for such optimizations is likely the JDK itself (see the similar discussion about SIMD-optimized sorting: https://github.com/apache/lucene/issues/12399). 
Still, I'm curious if anyone has explored this, and maybe saw some performance gains from way up in javaland where we can barely see the bare metal shining beneath us :) Sorry for the long post! Mike McCandless http://blog.mikemccandless.com -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
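As a rough illustration of what the "branchless" formulation from the linked blog posts could look like in Java, here is a minimal sketch. The class and method names are hypothetical, and whether HotSpot's C2 actually turns the ternary into a conditional move is not guaranteed; as discussed in this thread it depends on profiling, JVM version and flags, so the generated assembly would have to be inspected.

```java
/** Hypothetical sketch of a lower-bound search written in a CMOV-friendly shape. */
final class BranchlessSearch {

  /**
   * Returns the index of the first element >= target in an ascending array,
   * or sorted.length if every element is smaller.
   */
  public static int lowerBound(int[] sorted, int target) {
    int base = 0;
    int len = sorted.length;
    while (len > 1) {
      int half = len >>> 1;
      // The loop trip count depends only on the array length; the data-dependent
      // decision is a select between two values, which C2 *may* compile to CMOV.
      base = sorted[base + half - 1] < target ? base + half : base;
      len -= half;
    }
    // At most one candidate is left; step past it if it is still too small.
    if (len == 1 && sorted[base] < target) {
      base++;
    }
    return base;
  }
}
```

For example, `BranchlessSearch.lowerBound(new int[] {1, 3, 5, 7, 9}, 6)` returns 3. Any speedup over `java.util.Arrays.binarySearch` would need to be measured (for instance with JMH) rather than assumed.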
Re: JavaDoc generated with -noindex
Isn't this about Lucene? So 9.8 is the right version.

On 07.07.2023 at 23:43, Houston Putman wrote:

Agreed, should be an easy change to include for 9.3. - Houston

On Fri, Jul 7, 2023 at 5:42 PM Ishan Chattopadhyaya wrote:

+1 to include this in release. Thanks for noticing!

On Sat, 8 Jul, 2023, 12:33 am Mike Drob, wrote:

Why is our javadoc currently generated with -noindex? I did some digging and found that we set that back in LUCENE-3977 to save 10MB, and then added a property to re-enable it in LUCENE-4237, but I think that got lost in the gradle migration.

While the index might have been useless at the time, it now powers the javadoc search box, see a demo at https://youtu.be/VrI6rJNO2x4?t=925 -- The full spec is described at https://docs.oracle.com/en/java/javase/17/docs/specs/javadoc/javadoc-search-spec.html

I think this would be a useful thing to include, at least for releases. WDYT?

Mike

-- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: Feature request: make Uwe and Robert redundant
to move your house with a sports car: sure, it's faster, but you're gonna hit roadblocks. When working with legacy software, it's important to understand its limitations. And let's be clear, the JVM option is just a patch, not a fix. The real fix is to upgrade your Lucene. Just like a rusty old car, it's easier and safer to get a new one than trying to hold the old one together with duct tape. Uwe: Haha, yes, Robert. That's a great analogy. Upgrading might indeed be a heavy task, especially considering the potential need for rewriting code and reindexing data. But in the end, it would provide much better stability and performance. And remember, we are here to guide you through this process if needed. Robert: Absolutely! Like a storm, change can be scary but it's necessary for growth. With Lucene's newer versions, we've squashed bugs you probably didn't know existed in Lucene 2. Take the plunge, upgrade, and trust us - it's for the best! -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: [VOTE] Release Lucene 9.7.0 RC1
Hi,

SUCCESS! [1:04:57.975885]

https://jenkins.thetaphi.de/job/Lucene-Release-Tester/27/console

Smoke tester ran with Java 11 and Java 17. Unfortunately there's still no support in the smoke tester for running it with a set of arbitrary JDKs (only some limited conformance tests with Gradle should be executed, so it does not take forever). We should open an issue for that; I would have created a PR already, but my Python knowledge is minimal and my brain only supports copy-paste!

In addition, I verified the following:

* Changes for completeness; I also updated the release notes (function query support for vectors was missing)
* I regenerated the JDK 21 API signatures with the latest JDK 21 EA build 28, no changes - all fine.
* I started Luke with Java 21, MMapDirectory was using memory segments.
* I did not specifically test Java 20/21 vector support (see smoke tester issue above).

+1 to release!

Uwe

On 21.06.2023 at 16:36, Adrien Grand wrote:

Please vote for release candidate 1 for Lucene 9.7.0

The artifacts can be downloaded from: https://dist.apache.org/repos/dist/dev/lucene/lucene-9.7.0-RC1-rev-ccf4b198ec328095d45d2746189dc8ca633e8bcf

You can run the smoke tester directly with this command:

python3 -u dev-tools/scripts/smokeTestRelease.py \
https://dist.apache.org/repos/dist/dev/lucene/lucene-9.7.0-RC1-rev-ccf4b198ec328095d45d2746189dc8ca633e8bcf

The vote will be open for at least 72 hours i.e. until 2023-06-24 15:00 UTC.

[ ] +1 approve
[ ] +0 no opinion
[ ] -1 disapprove (and reason why)

Here is my +1

-- Adrien

-- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: Welcome Chris Hegarty to the Lucene PMC
Welcome Chris. Uwe Am 19. Juni 2023 11:52:50 MESZ schrieb Adrien Grand : >I'm pleased to announce that Chris Hegarty has accepted an invitation to >join the Lucene PMC! > >Congratulations Chris, and welcome aboard! > >-- >Adrien -- Uwe Schindler Achterdiek 19, 28357 Bremen https://www.thetaphi.de
Re: New branch and feature freeze for Lucene 9.7.0
Hi,

I also downloaded the latest JDK 21 EA build (21-b27) and regenerated the apijar files: no changes, so all fine! I did this because there were some late changes to javadocs and API definitions, but it all looks fine. The bug with SecurityManager that hit Elasticsearch was also fixed (but we have a workaround). I will now also update Policeman Jenkins to the latest EA build.

Uwe

On 16.06.2023 at 13:50, Adrien Grand wrote:

NOTICE: Branch branch_9_7 has been cut and versions updated to 9.8 on stable branch.

Please observe the normal rules:

* No new features may be committed to the branch.
* Documentation patches, build patches and serious bug fixes may be committed to the branch. However, you should submit all patches you want to commit as pull requests first to give others the chance to review and possibly vote against them. Keep in mind that it is our main intention to keep the branch as stable as possible.
* All patches that are intended for the branch should first be committed to the unstable branch, merged into the stable branch, and then into the current release branch.
* Normal unstable and stable branch development may continue as usual. However, if you plan to commit a big change to the unstable branch while the branch feature freeze is in effect, think twice: can't the addition wait a couple more days? Merges of bug fixes into the branch may become more difficult.
* Only Github issues with Milestone 9.7 and priority "Blocker" will delay a release candidate build.

-- Adrien

-- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: JDK 21 is in Rampdown / The importance of testing with Early-Access Builds
view) It is worth mentioning that JEP 404 (Generational Shenandoah - Experimental) has been proposed to drop from JDK 21 [8]. ### Changes in recent JDK 21 builds (b23-b26) that may be of interest: Note that this is only a curated list of changes, make sure to check [9] for additional changes. - JDK-8298127: HSS/LMS Signature Verification - JDK-8305972: Update XML Security for Java to 3.0.2 - JDK-8308244: Installation of jdk rpm corrupts alternatives - JDK-8307990: jspawnhelper must close its writing side of a pipe before reading from it - JDK-8303465: KeyStore of type KeychainStore, provider Apple does not show all trusted certificates - JDK-8303530: Redefine JAXP Configuration File - JDK-8307478: Implementation of Prepare to Restrict The Dynamic Loading of Agents - JDK-8301553: Support Password-Based Cryptography in SunPKCS11 - JDK-8308341: JNI_GetCreatedJavaVMs returns a partially initialized JVM - JDK-8308108: Support Unicode extension for collation settings - JDK-8305972: Update XML Security for Java to 3.0.2 - JDK-8305091: Change ChaCha20 cipher init behavior to match AES-GCM - JDK-8179502: Enhance OCSP, CRL and Certificate Fetch Timeouts - JDK-8307547: Support variant collations - JDK-8308876: JFR: Deserialization of EventTypeInfo uses incorrect attribute names - JDK-8297878: KEM: Implementation - JDK-8308819: add JDWP and JDI virtual thread support for ThreadReference.ForceEarlyReturn - JDK-8307779: Relax the java.awt.Robot specification - JDK-8306703: JFR: Summary views - JDK-8309146: extend JDI StackFrame.setValue() and JDWP StackFrame.setValues minimal support for virtual threads - JDK-8307840: SequencedMap view method specification and implementation adjustments - JDK-8304438: jcmd JVMTI.agent_load should obey EnableDynamicAgentLoading - JDK-8306431: File.listRoots method description should be re-examined [5] https://jdk.java.net/21/ [6] https://jdk.java.net/21/release-notes [7] https://download.java.net/java/early_access/jdk21/docs/api/ [8] 
https://mail.openjdk.org/pipermail/jdk-dev/2023-June/007910.html [9] https://github.com/openjdk/jdk/compare/jdk-21+23...jdk-21+26 ## JDK 22 Early-Access Builds Given that JDK 21 is now in Rampdown Phase, the initial JDK 22 Early-Access builds are now also available [10]. Those EA builds are provided under the GNU General Public License v2, with the Classpath Exception. [10] https://jdk.java.net/22/ ## JavaFX 21 Early-Access Builds These are early access builds of the JavaFX 21 Runtime, built from openjdk/jfx [11]. They are intended to allow JavaFX application developers to build and test their applications with JavaFX 21 on JDK 21. The latest builds 21 (2023/6/8) are now available [12]. These early-access builds are provided under the GNU General Public License, version 2, with the Classpath Exception. Feedback should be reported to the openjfx-dev mailing list [13]. [11] https://github.com/openjdk/jfx [12] https://jdk.java.net/javafx21/ [13] http://mail.openjdk.org/mailman/listinfo/openjfx-dev ## Topics of Interest All That is in Java 21?! https://inside.java/2023/06/08/newscast-50/ Script Java Easily in 21 and Beyond https://inside.java/2023/05/25/newscast-49/ New JFR `view` Command https://egahlin.github.io/2023/05/30/views.html Patterns: Exhaustiveness, Unconditionality, and Remainder https://openjdk.org/projects/amber/design-notes/patterns/exhaustiveness Design Document on Nullability and Value Types https://mail.openjdk.org/pipermail/valhalla-spec-observers/2023-May/002243.html JFR: Java's Observability & Monitoring Framework - Stack Walker #2 https://inside.java/2023/05/14/stackwalker-02/ ## JDK Crypto Roadmap Update Oracle updated the JDK Cryptographic Roadmap to announce a change, with the Oct CPU (2023-10-17), of the priority order used by JDK 8 and JDK 11 when negotiating cipher suites to use on TLS connections. Please check the JDK Cryptographic Roadmap page [14] for more details. 
[14] https://www.java.com/en/jre-jdk-cryptoroadmap.html ~ Please, make sure to test your projects using the JDK 21 EA builds as we still have time to fix potential issues. And thanks for participating in the OpenJDK Quality Outreach program! --David -- Uwe Schindler uschind...@apache.org ASF Member, Member of PMC and Committer of Apache Lucene and Apache Solr Bremen, Germany https://lucene.apache.org/ https://solr.apache.org/
Re: Lucene 9.7 release
Hi,

we merged Java 21 support for both MMapDirectory and VectorUtil to main and branch_9x. As the JDK is in Rampdown Phase 1, it is very unlikely that there will be hard API changes until the release. In fact, the Java 21 version of vector support was working without code changes; we just enabled the support without compiling an explicit Java 21 version.

To be sure: as part of the release testing I will regenerate the API JARs and do explicit testing.

Uwe

P.S.: We should update the smoke tester to accept an arbitrary number of alternative JDKs to run tests. Currently it is fixed to Java 11 and 17 (I think).

On 10.06.2023 at 00:53, Uwe Schindler wrote:

Hi,

BTW, there was a slight change in APIJARs caused by this API change: https://github.com/openjdk/jdk/commit/5fc9b5787dc4d7f00d2c59288bc8d840fdf5b495 (this does not affect our code, but it was done 3 weeks ago). I hope something like this won't happen. I updated the PR, no code changes needed as those methods were not used by Lucene. I'd like to update the APIJARS again shortly before the feature branch is created.

Uwe

On 09.06.2023 at 23:10, Uwe Schindler wrote:

Let me merge and backport the Java 21 map PR first. It has all new source directories and APIJAR files. For safety I will regenerate the 21 APIJAR with the newest JDK build. FYI, to regenerate you need to have an environment variable with jdk21, as autoprovisioning doesn't work. After that we can copy-paste the vector impl to the main/java21 folder and add vector classes to it.

Uwe

On 9 June 2023 at 22:30:09 CEST, Chris Hegarty wrote:

Hi,

On 9 Jun 2023, at 17:19, Uwe Schindler wrote:

Hi, if possible I would like to get the Java 21 changes (MemorySegments and Vector) into the release. I'd like to ask Chris, who has better knowledge, how to proceed. If he suggests waiting maybe a week or 2, I'd suggest waiting that time.

Chris Hegarty: Do you know if the API of JDK 21 is finalized or not? From my understanding the final phases have started, so API changes are unlikely. If there are bug fixes they won't affect public APIs or the incubator module, right?

Your understanding is correct. I do not expect any API changes at this point.

The MMapDir changes are already tested all the time, vector API needs the forward port to 21.

We are also doing some early testing with JDK 21 EA, and it would be great to get the 21-version of Panama VectorUtils in. I can help get this done. Uwe, what has been done so far? If nothing, and that is still the case tomorrow, I can start on it.

-Chris.

Uwe

On 09.06.2023 at 18:07, Adrien Grand wrote:

Hello all,

There is some good stuff that is scheduled for 9.7 already, I found the following changes in the changelog that look especially interesting:

- Concurrent query rewrites for vector queries.
- Speedups to vector indexing/search via integration of the Panama vector API.
- Reduced overhead of soft deletes.
- Support for update by query.

I propose we start the process for a 9.7 release, and I volunteer to be the release manager. I suggest the following schedule:

- Feature freeze on June 16th, one week from now. This is when the 9.7 branch will be cut.
- Open a vote on June 21st, which we'll possibly delay if blockers get identified.

-- Adrien

-- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: Lucene 9.7 release
Hi, BTW, there was a slight change in the APIJARs caused by this API change: https://github.com/openjdk/jdk/commit/5fc9b5787dc4d7f00d2c59288bc8d840fdf5b495 (this does not affect our code, but it was done 3 weeks ago). I hope something like this won't happen again. I updated the PR; no code changes were needed as those methods were not used by Lucene. I'd like to update the APIJARs again shortly before the feature branch is created.

Uwe

On 09.06.2023 at 23:10, Uwe Schindler wrote:

Let me merge and backport the Java 21 map PR first. It has all the new source directories and APIJAR files. For safety I will regenerate the 21 APIJAR with the newest JDK build. FYI, to regenerate you need to have an environment variable pointing to JDK 21, as autoprovisioning doesn't work. After that we can copy-paste the vector impl to the main/java21 folder and add the vector classes to it.

Uwe

On 9 June 2023 at 22:30:09 CEST, Chris Hegarty wrote:

Hi,

On 9 Jun 2023, at 17:19, Uwe Schindler wrote:

Hi, if possible I would like to get the Java 21 changes (MemorySegments and Vector) into the release. I'd like to ask Chris, who has better knowledge, how to proceed. If he suggests waiting a week or two, I'd suggest we wait that long.

Chris Hegarty: Do you know whether the API of JDK 21 is finalized or not? From my understanding the final phases have started, so API changes are unlikely. If there are bug fixes they won't affect public APIs or the incubator module, right?

Your understanding is correct. I do not expect any API changes at this point.

The MMapDir changes are already tested all the time; the vector API needs the forward port to 21.

We are also doing some early testing with JDK 21 EA, and it would be great to get the 21 version of the Panama VectorUtils in. I can help get this done.

Uwe, what has been done so far? If nothing, and that is still the case tomorrow, I can start on it.

-Chris.
Uwe

On 09.06.2023 at 18:07, Adrien Grand wrote:

Hello all,

There is some good stuff scheduled for 9.7 already. I found the following changes in the changelog that look especially interesting:
- Concurrent query rewrites for vector queries.
- Speedups to vector indexing/search via integration of the Panama vector API.
- Reduced overhead of soft deletes.
- Support for update by query.

I propose we start the process for a 9.7 release, and I volunteer to be the release manager. I suggest the following schedule:
- Feature freeze on June 16th, one week from now. This is when the 9.7 branch will be cut.
- Open a vote on June 21st, which we'll possibly delay if blockers get identified.

-- Adrien

-- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: Lucene 9.7 release
Let me merge and backport the Java 21 map PR first. It has all the new source directories and APIJAR files. For safety I will regenerate the 21 APIJAR with the newest JDK build. FYI, to regenerate you need to have an environment variable pointing to JDK 21, as autoprovisioning doesn't work.

After that we can copy-paste the vector impl to the main/java21 folder and add the vector classes to it.

Uwe

On 9 June 2023 at 22:30:09 CEST, Chris Hegarty wrote:
>Hi,
>
>> On 9 Jun 2023, at 17:19, Uwe Schindler wrote:
>>
>> Hi,
>>
>> if possible I would like to get the Java 21 changes (MemorySegments and
>> Vector) into the release. I'd like to ask Chris, who has better knowledge, how
>> to proceed. If he suggests waiting a week or two, I'd suggest we wait
>> that long.
>>
>> Chris Hegarty: Do you know whether the API of JDK 21 is finalized or not? From
>> my understanding the final phases have started, so API changes are unlikely.
>> If there are bug fixes they won't affect public APIs or the incubator
>> module, right?
>>
>Your understanding is correct. I do not expect any API changes at this point.
>
>> The MMapDir changes are already tested all the time; the vector API needs the
>> forward port to 21.
>>
>We are also doing some early testing with JDK 21 EA, and it would be great to
>get the 21 version of the Panama VectorUtils in. I can help get this done.
>
>Uwe, what has been done so far? If nothing, and that is still the case
>tomorrow, I can start on it.
>
>-Chris.
>
>> Uwe
>>
>> On 09.06.2023 at 18:07, Adrien Grand wrote:
>>> Hello all,
>>>
>>> There is some good stuff scheduled for 9.7 already. I found the
>>> following changes in the changelog that look especially interesting:
>>> - Concurrent query rewrites for vector queries.
>>> - Speedups to vector indexing/search via integration of the Panama vector
>>> API.
>>> - Reduced overhead of soft deletes.
>>> - Support for update by query.
>>>
>>> I propose we start the process for a 9.7 release, and I volunteer to be the
>>> release manager. I suggest the following schedule:
>>> - Feature freeze on June 16th, one week from now. This is when the 9.7
>>> branch will be cut.
>>> - Open a vote on June 21st, which we'll possibly delay if blockers get
>>> identified.
>>>
>>> --
>>> Adrien
>> --
>> Uwe Schindler
>> Achterdiek 19, D-28357 Bremen
>> https://www.thetaphi.de
>> eMail: u...@thetaphi.de

-- Uwe Schindler Achterdiek 19, 28357 Bremen https://www.thetaphi.de
Re: Lucene 9.7 release
Hi,

if possible I would like to get the Java 21 changes (MemorySegments and Vector) into the release. I'd like to ask Chris, who has better knowledge, how to proceed. If he suggests waiting a week or two, I'd suggest we wait that long.

Chris Hegarty: Do you know whether the API of JDK 21 is finalized or not? From my understanding the final phases have started, so API changes are unlikely. If there are bug fixes they won't affect public APIs or the incubator module, right?

The MMapDir changes are already tested all the time; the vector API needs the forward port to 21.

Uwe

On 09.06.2023 at 18:07, Adrien Grand wrote:

Hello all,

There is some good stuff scheduled for 9.7 already. I found the following changes in the changelog that look especially interesting:
- Concurrent query rewrites for vector queries.
- Speedups to vector indexing/search via integration of the Panama vector API.
- Reduced overhead of soft deletes.
- Support for update by query.

I propose we start the process for a 9.7 release, and I volunteer to be the release manager. I suggest the following schedule:
- Feature freeze on June 16th, one week from now. This is when the 9.7 branch will be cut.
- Open a vote on June 21st, which we'll possibly delay if blockers get identified.

-- Adrien

-- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: [JENKINS] Lucene-9.x-Linux (64bit/openj9/jdk-17.0.5) - Build # 10811 - Unstable!
me.java:45) at app//org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60) at app//org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44) at app//org.junit.rules.RunRules.evaluate(RunRules.java:20) at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at app//com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390) at app//com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843) at app//com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490) at app//com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955) at app//com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840) at app//com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891) at app//com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902) at app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43) at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at app//org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at app//com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at app//com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at 
app//org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) at app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43) at app//org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44) at app//org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60) at app//org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47) at app//org.junit.rules.RunRules.evaluate(RunRules.java:20) at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at app//com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390) at app//com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850) at java.base@17.0.5/java.lang.Thread.run(Thread.java:857) - To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org For additional commands, e-mail: builds-h...@lucene.apache.org -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
Re: Updates documents using queries
You are talking about "updateDocuments(Term delTerm, Iterable docs)"? We could add another method that takes a Query, like [https://lucene.apache.org/core/9_6_0/core/org/apache/lucene/index/IndexWriter.html#deleteDocuments(org.apache.lucene.search.Query...)]. The implementation behind it could be the same: it would do the same thing, but delete via the DocIdSetIterator of delQuery and use the Iterable for the new block.

Uwe

On 30.05.2023 at 23:52, Patrick Zhai wrote:

Hi folks,

Currently the only way to update a block of documents is to identify them with a term and update those documents. However, we have a case where the child documents do not share the same identifier as the parent documents, and to identify the whole block of documents we need to use at least a disjunction query like: (parentId: xx OR id: xx). I wonder whether we could add a new API to IndexWriter supporting that? It seems to me we just need to create a new DeleteQueue node with queries instead of terms and pass it into the internal update method? Or am I missing something that makes update by query less obvious?

Thanks
Patrick

-- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail: u...@thetaphi.de

- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Build Lucene9 with JDK8
Hi,

In short: You can't use Lucene 9 without Java 11, sorry.

On 19.05.2023 at 19:55, MyCoy Z wrote:

So we're thinking of building L9 with JDK8. I've briefly tried to just change the JDK versions in build.gradle and gradlew, but it just doesn't work out. So my questions are:

1. What are the features in L9 which strictly require JDK11?

Lucene 9 uses the Java Module System that was introduced with Java 9. All code was changed to work with the Java 9 module system, so many parts of the code won't compile because of this. The ServiceProvider API used to load codecs was also changed to be compatible with the module system. In addition, for performance reasons Lucene Core already uses new Arrays methods like mismatch(), which are intrinsified by the JVM (FYI, Lucene 8.x has a Multi-Release JAR that automatically makes use of Java 9+ features; this backwards layer was removed in Lucene 9, so the code won't compile without the new method signatures). Finally, you would need to patch huge amounts of code, because it uses the "var" keyword. There may be more requirements; the above are the first ones I remember.

2. Is it possible to build L9 with JDK8? If possible (maybe just for some modules), -- how to configure it to make the build work?

No. There's no way to do this purely by configuring the build. The Lucene Core module heavily relies on the above features (modules, Arrays, new mmap implementation, var keyword).

-- will there be any negative impact, e.g. performance reduction?

Yes, for sure. MMapDirectory can't be used anymore, as the Java 8 compatibility layer was removed. This will slow down everything.

Uwe

-- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail: u...@thetaphi.de
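For illustration only, here is a minimal sketch of two of the language/library features named in the reply above: `Arrays.mismatch()` (added in Java 9) and the `var` keyword (added in Java 10). The class and method names (`Java9Features`, `firstDifference`) are hypothetical and are not taken from the Lucene code base; the snippet only shows why such code cannot compile on JDK 8.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of Lucene.
final class Java9Features {

    // Arrays.mismatch (Java 9+) returns the index of the first differing
    // element, or -1 if both arrays have the same length and content.
    // If one array is a proper prefix of the other, it returns the
    // length of the shorter array.
    static int firstDifference(byte[] a, byte[] b) {
        return Arrays.mismatch(a, b);
    }

    public static void main(String[] args) {
        // "var" (local variable type inference) needs Java 10 or newer.
        var left = new byte[] {1, 2, 3, 4};
        var right = new byte[] {1, 2, 9, 4};

        System.out.println(firstDifference(left, right));        // prints 2
        System.out.println(firstDifference(left, left.clone())); // prints -1
    }
}
```

Compiling this with a JDK 8 compiler fails on both the `var` declarations and the `Arrays.mismatch` call, which is the kind of breakage described above, just at a much smaller scale.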
Re: [VOTE] Dimension Limit for KNN Vectors
I agree with Dawid, I am +1 for those two options in combination:
* option 3 (make the limit an HNSW-specific thing). New formats may use other limits (lower or higher).
* option 4 (make it a system property with an HNSW prefix).

Adding the system property must be done in the same way as the new properties for MMapDirectory (including the access controller), so that a system admin can deny setting it in code (see https://github.com/apache/lucene/blob/f53eb28af053d7612f7e4d1b2de05d33dc410645/lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java#L327-L346 for an example). Care has to be taken that the static initializers won't fail if system properties cannot be read/set (the system administrator enforces the default -> see the mmap code).

It also has to be made sure that an index written with a raised limit can still be read without the limit, so the limit should not be glued into the file format. Otherwise I disagree with option 4.

In short: I am fine with making it configurable only for HNSW if the limit is not glued into the index format. The default should only be there to prevent people from doing wrong things by default, but changing the default should not break reading/modifying those indexes.

Uwe

On 16.05.2023 at 15:37, Dawid Weiss wrote:

I'm for option 3 (limit at algorithm level), with the default there tunable via property (option 4). I understand Robert's concerns and I'd love to contribute a faster implementation but the reality is - I can't do it at the moment. I feel like experiments are good though and we shouldn't just ban people from trying - if somebody changes the (sane) default and gets burned by performance, perhaps it'll be an itch to work on speeding things up (much like it's already happening with Jonathan's patch).

Dawid

On Tue, May 16, 2023 at 10:50 AM Alessandro Benedetti wrote:

Hi all, we have finalized all the options proposed by the community and we are ready to vote for the preferred one and then proceed with the implementation.
*Option 1* Keep it as it is (dimension limit hardcoded to 1024)
*Motivation*: We are close to improving on many fronts. Given the criticality of Lucene in computing infrastructure and the concerns raised by one of the most active stewards of the project, I think we should keep working toward improving the feature as is and move to up the limit after we can demonstrate improvement unambiguously.

*Option 2* Make the limit configurable, for example through a system property
*Motivation*: The system administrator can enforce a limit that its users need to respect and that is in line with whatever the admin decided to be acceptable for them. The default can stay the current one. This should open the doors for Apache Solr, Elasticsearch, OpenSearch, and any sort of plugin development.

*Option 3* Move the max dimension limit to a lower level, into an HNSW-specific implementation. Once there, this limit would not bind any other potential vector engine alternative/evolution.
*Motivation*: There seem to be contradictory performance interpretations about the current HNSW implementation. Some consider its performance ok, some not, and it depends on the target data set and use case. Increasing the max dimension limit where it is currently (in top level FloatVectorValues) would not allow potential alternatives (e.g. for other use-cases) to be based on a lower limit.

*Option 4* Make it configurable and move it to an appropriate place. In particular, a simple Integer.getInteger("lucene.hnsw.maxDimensions", 1024) should be enough.
*Motivation*: Both are good and not mutually exclusive and could happen in any order. Someone suggested to perfect what the _default_ limit should be, but I've not seen an argument _against_ configurability. Especially in this way -- a toggle that doesn't bind Lucene's APIs in any way.

I'll keep this [VOTE] open for a week and then proceed to the implementation.

-- *Alessandro Benedetti* Director @ Sease Ltd.
/Apache Lucene/Solr Committer/ /Apache Solr PMC Member/ e-mail: a.benede...@sease.io/ / *Sease* - Information Retrieval Applied Consulting | Training | Open Source Website: Sease.io <http://sease.io/> LinkedIn <https://linkedin.com/company/sease-ltd> | Twitter <https://twitter.com/seaseltd> | Youtube <https://www.youtube.com/channel/UCDx86ZKLYNpI3gzMercM7BQ> | Github <https://github.com/seaseltd> -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:u...@thetaphi.de
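As a rough sketch of the system-property idea discussed in this vote thread (option 4, plus the requirement that static initialization must not fail when properties cannot be read), something along these lines could work. This is not Lucene's implementation: the class name `HnswLimit` is hypothetical, the property name and the default of 1024 are taken from the proposal text, the rejection of non-positive values is an extra assumption, and the sketch simply catches `SecurityException` instead of reproducing the access-controller handling in the linked MMapDirectory code.

```java
// Hypothetical helper, not part of Lucene.
final class HnswLimit {

    // Property name and default as quoted in option 4 of the vote.
    static final String PROPERTY = "lucene.hnsw.maxDimensions";
    static final int DEFAULT_MAX_DIMENSIONS = 1024;

    // Reads the limit without ever throwing: a denied property read or an
    // unparsable value falls back to the default.
    static int readMaxDimensions() {
        try {
            // Integer.getInteger returns the default when the property is
            // missing or is not a valid integer.
            Integer value = Integer.getInteger(PROPERTY, DEFAULT_MAX_DIMENSIONS);
            // Assumption of this sketch: non-positive limits are ignored.
            return value > 0 ? value : DEFAULT_MAX_DIMENSIONS;
        } catch (SecurityException e) {
            // The administrator does not allow reading the property,
            // so the default is enforced.
            return DEFAULT_MAX_DIMENSIONS;
        }
    }

    public static void main(String[] args) {
        System.out.println("max dimensions: " + readMaxDimensions());
    }
}
```

Running with `-Dlucene.hnsw.maxDimensions=2048` would raise the limit for that JVM only. Because the value lives purely in runtime configuration, nothing about it is written into the index, which matches the requirement above that the limit must not be glued into the file format.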
Re: JDK 21 EA builds 22 & Sequenced Collections Heads-up
Hi David,

about Sequenced Collections: no compile failures with Apache Lucene on Java 21 (but we use --release 17, so the API changes should not matter at all).

Uwe

On 15.05.2023 at 18:15, Uwe Schindler wrote:

Hi David,

I will update our Jenkins server to use the latest JDK 21 build for testing this. I will also open a new PR on Lucene to add JDK 21 support for the Panama Foreign 3rd Preview (it is too bad that it is still in preview; is there any chance of having it officially in the LTS 21 release?). I am really angry that (my personal opinion: shitty) virtual threads are finally released but the foreign memory APIs are still not final in JDK 21. Any way to change your mind?

Uwe

On 15.05.2023 at 17:01, David Delabassee wrote:

Welcome to the latest OpenJDK Quality Outreach update!

The schedule for JDK 21 is now known [1], with Rampdown Phase One (RDP1) set for June 8th and General Availability (GA) set for September 19th. As we are getting closer to RDP1, we are gradually getting a better view on the JDK 21 content. At the time of writing, 5 JEPs are already integrated in the JDK 21 mainline - Virtual Threads, Generational ZGC, etc. – see below for more details. This newsletter heads-up is focused on one of those JEPs; i.e., JEP 431 Sequenced Collections, as it might induce some incompatibilities on existing codebases.

Please do tell us if your project works or fails on the latest JDK 21 Early-Access builds. We still have some time to fix issues before JDK 21 reaches General Availability.

[1] https://openjdk.org/projects/jdk/21/

## Heads-Up - JDK 21: Potential Sequenced Collections Incompatibilities

The Sequenced Collection JEP [2] has been integrated into JDK 21, build 20. This JEP introduces several new interfaces into the collections framework’s interface hierarchy, and these interfaces introduce new default methods. When such changes are made, they can cause conflicts that result in source or binary incompatibilities.
Any conflicts that occur will be in code that implements new collections or that subclasses existing collection classes. Code that simply uses collections implementations will be largely unaffected. There are several kinds of conflicts that might arise. The first is a simple method naming conflict, if a method already exists with the same name but with a different return type or access modifier. Another is a clash between different inherited default method implementations arising from covariant overrides. A class might inherit multiple default methods if it implements multiple interfaces from different parts of the collections framework. A third example occurs with type inference. With type inference (e.g., the use of `var`) the compiler will infer a type for that local variable. It’s possible for other code to use explicitly declared types that must match the inferred type. The change to the interface hierarchy might result in a different inferred type, causing an incompatibility. Make sure to check the following article [3] that provides additional details and strategies to mitigate potential incompatibilities. [2] https://openjdk.org/jeps/431 [3] https://inside.java/2023/05/12/quality-heads-up/ Additional Sequenced Collections resources are also listed in the 'Topics of Interest' section below. ## JDK 21 Early-Access builds The latest Early-Access builds 22 are available [4], and are provided under the GNU General Public License v2, with the Classpath Exception. The Release Notes [5] and the Javadocs [6] are also available. 
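To make the Sequenced Collections heads-up a bit more concrete for code that only uses collections: the new default methods such as `List.getFirst()` exist only on JDK 21 and later, so code that still has to compile against older releases (for example with `--release 11` or `--release 17`) can keep using the index-based accessor. The following is a minimal sketch; the helper class `FirstElement` and its `first` method are hypothetical names, and throwing `NoSuchElementException` for an empty list is a design choice made here to mirror what `getFirst()` does.

```java
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical helper, written against Java 11 APIs only.
final class FirstElement {

    // Portable replacement for List.getFirst(): list.get(0) is available
    // on every Java version, while getFirst() requires JDK 21+.
    static <T> T first(List<T> list) {
        if (list.isEmpty()) {
            throw new NoSuchElementException("list is empty");
        }
        return list.get(0);
    }

    public static void main(String[] args) {
        System.out.println(first(List.of("a", "b", "c"))); // prints a
    }
}
```

Classes that implement or subclass collections are the ones more likely to hit the naming and default-method conflicts described above; for those, the mitigation article referenced in the newsletter is the better guide.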
[4] https://jdk.java.net/21/ [5] https://jdk.java.net/21/release-notes [6] https://download.java.net/java/early_access/jdk21/docs/api/ ### JEPs integrated to JDK 21, so far: - 430: String Templates (Preview) - 431: Sequenced Collections - 439: Generational ZGC - 442: Foreign Function & Memory API (3rd Preview) - 444: Virtual Threads ### JEPs targeted to JDK 21, so far: - 440: Record Patterns - 441: Pattern Matching for switch - 448: Vector API (6th Incubator) JEPs proposed to target JDK 21: - 404: Generational Shenandoah (Experimental) - 443: Unnamed Patterns and Variables (Preview) - 445: Unnamed Classes and Instance Main Methods (Preview) - 449: Deprecate the Windows 32-bit x86 Port for Removal ### Changes in recent builds that may be of interest: Note that this is only a curated list of changes, make sure to check https://github.com/openjdk/jdk/compare/jdk-21+0...jdk-21+22 for additional changes. JDK 21 Build 22: - JDK-8307466: java.time.Instant calculation bug in until and between methods - JDK-8307399: get rid of compatibility ThreadStart/ThreadEnd events for virtual threads - JDK-8306461: ObjectInputStream::readObject() should handle negative array sizes without throwing NegativeArraySizeExceptions - JDK-8280031: Deprecate GTK2 for removal - JDK-8307629: FunctionDescriptor::toMethodType should allow sequence layouts (mainline) - JDK-8302845: Rep
Re: JDK 21 EA builds 22 & Sequenced Collections Heads-up
latform https://inside.java/2023/04/11/levelup-security/ Java Language Futures, Spring 2023 Edition https://inside.java/2023/04/06/levelup-amber/ Java 21's New (Sequenced) Collections https://inside.java/2023/03/30/newscast/ JFR: Java's Observability & Monitoring Framework https://inside.java/2023/05/14/stackwalker-02/ Additionals Level Up - Java Developer Day videos https://www.youtube.com/playlist?list=PLX8CzqL3ArzX_RZNjtyETshl876jfE2bo ## April 2023 Critical Patch Update Released As part of the April 2023 CPU, Oracle released OpenJDK 20.0.1, JDK 20.0.1, JDK 17.0.7 LTS, JDK 11.0.19 LTS, JDK 8u371, as well as JDK 8u371-perf. ~ Thanks for participating in the OpenJDK Quality Outreach program. If you find any issue on JDK 21 EA builds, please send it my way! --David -- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail:uwe.h.schind...@gmail.com