[PR] Make `static final Map` constants immutable [lucene]

2024-02-09 Thread via GitHub


sabi0 opened a new pull request, #13092:
URL: https://github.com/apache/lucene/pull/13092

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Enable MemorySegment in MMapDirectory for Java 22+ and Vectorization (incubation) for exact Java 22 [lucene]

2024-02-09 Thread via GitHub


uschindler merged PR #12706:
URL: https://github.com/apache/lucene/pull/12706





Re: [PR] Enable MemorySegment in MMapDirectory for Java 22+ and Vectorization (incubation) for exact Java 22 [lucene]

2024-02-09 Thread via GitHub


uschindler commented on PR #12706:
URL: https://github.com/apache/lucene/pull/12706#issuecomment-1936659871

   JDK release candidate was announced (it is build 35). I downloaded and 
tested it with vector and foreign API, no API breaks detected. Tests work.
   
   I will merge this now.





Re: [PR] Bump release to Java 21 [lucene]

2024-02-09 Thread via GitHub


rmuir commented on PR #12753:
URL: https://github.com/apache/lucene/pull/12753#issuecomment-1936554932

   I did `git grep 17` and sifted through the noise; there's just no 
substitute. Quickly fixed a bunch of stuff. Sorry for any build breakage if it 
happens :)
   
   Still need to update `releaseWizard.py` and `smokeTestRelease.py`, but I 
think that's really all that's left. Just didn't feel like wrestling any Python 
right now.





Re: [PR] Add necessary assertion in CheckHits#doCheckMaxScores [lucene]

2024-02-09 Thread via GitHub


jpountz merged PR #13088:
URL: https://github.com/apache/lucene/pull/13088





Re: [PR] Modernize BWC testing with parameterized tests [lucene]

2024-02-09 Thread via GitHub


HoustonPutman commented on PR #13046:
URL: https://github.com/apache/lucene/pull/13046#issuecomment-1936231183

   So I just finished the 8.11.3 release, and had some problems with the 
back-compat testing code. Obviously the ant/gradle switch and jdk versions 
meant that I had to do a lot of it manually, but I'm pretty sure the regex 
matching in the `update_backcompat_tests()` function in 
`addBackcompatIndexes.py` was broken by this commit.
   
   I updated the files myself manually for 8.11.3 (on main and 9x), and I think 
they are correct (and the tests pass), but I would recommend checking them and 
re-checking that the python script works with these changes...





Re: [PR] Move `brToString(BytesRef)` to `ToStringUtils` [lucene]

2024-02-09 Thread via GitHub


dweiss commented on code in PR #13068:
URL: https://github.com/apache/lucene/pull/13068#discussion_r1484540199


##
lucene/core/src/java/org/apache/lucene/util/ToStringUtils.java:
##
@@ -32,11 +32,37 @@ public static void byteArray(StringBuilder buffer, byte[] bytes) {
 
   private static final char[] HEX = "0123456789abcdef".toCharArray();
 
+  /**
+   * Unlike {@link Long#toHexString(long)} returns a String with a "0x" prefix and all the leading
+   * zeros.
+   */
   public static String longHex(long x) {
     char[] asHex = new char[16];
     for (int i = 16; --i >= 0; x >>>= 4) {
       asHex[i] = HEX[(int) x & 0x0F];
     }
     return "0x" + new String(asHex);
   }
+
+  @SuppressWarnings("unused")
+  public static String brToString(BytesRef b) {

Review Comment:
   Yes, please do. Better a bit more verbose than cryptic.
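
For readers following the diff, here is a minimal standalone sketch of what the new Javadoc on `longHex` describes: unlike `Long.toHexString(long)`, the result carries a "0x" prefix and keeps all leading zeros. The method body is copied from the quoted hunk; the wrapper class `LongHexSketch` and its `main` method are hypothetical and exist only for illustration.

```java
// Hypothetical wrapper class; only the longHex body comes from the quoted diff.
final class LongHexSketch {
  private static final char[] HEX = "0123456789abcdef".toCharArray();

  // Fills a fixed 16-character buffer from the lowest nibble upwards,
  // so leading zeros are always preserved.
  static String longHex(long x) {
    char[] asHex = new char[16];
    for (int i = 16; --i >= 0; x >>>= 4) {
      asHex[i] = HEX[(int) x & 0x0F];
    }
    return "0x" + new String(asHex);
  }

  public static void main(String[] args) {
    System.out.println(Long.toHexString(255L)); // ff
    System.out.println(longHex(255L));          // 0x00000000000000ff
    System.out.println(Long.toHexString(-1L));  // ffffffffffffffff
    System.out.println(longHex(-1L));           // 0xffffffffffffffff
  }
}
```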






Re: [PR] Move `brToString(BytesRef)` to `ToStringUtils` [lucene]

2024-02-09 Thread via GitHub


sabi0 commented on code in PR #13068:
URL: https://github.com/apache/lucene/pull/13068#discussion_r1484475836


##
lucene/core/src/java/org/apache/lucene/util/ToStringUtils.java:
##
@@ -32,11 +32,37 @@ public static void byteArray(StringBuilder buffer, byte[] bytes) {
 
   private static final char[] HEX = "0123456789abcdef".toCharArray();
 
+  /**
+   * Unlike {@link Long#toHexString(long)} returns a String with a "0x" prefix and all the leading
+   * zeros.
+   */
   public static String longHex(long x) {
     char[] asHex = new char[16];
     for (int i = 16; --i >= 0; x >>>= 4) {
       asHex[i] = HEX[(int) x & 0x0F];
     }
     return "0x" + new String(asHex);
   }
+
+  @SuppressWarnings("unused")
+  public static String brToString(BytesRef b) {

Review Comment:
   Shall I rename it to `bytesRefToString()` ?






Re: [PR] Enable MemorySegment in MMapDirectory for Java 22+ and Vectorization (incubation) for exact Java 22 [lucene]

2024-02-09 Thread via GitHub


uschindler commented on PR #12706:
URL: https://github.com/apache/lucene/pull/12706#issuecomment-1935816905

   The release candidate is still missing and wasn't announced by Mark 
Reinhold: https://jdk.java.net/22/





Re: [PR] upgrade to OpenNLP 2.3.2 [lucene]

2024-02-09 Thread via GitHub


cpoerschke commented on PR #12674:
URL: https://github.com/apache/lucene/pull/12674#issuecomment-1935802081

   > > ... perhaps fire an email to legal at apache, asking ...
   > 
   > Good idea, will do. And thanks for reviewing also with this in mind!
   
   https://issues.apache.org/jira/browse/LEGAL-667 opened.





Re: [PR] upgrade to OpenNLP 2.3.2 [lucene]

2024-02-09 Thread via GitHub


cpoerschke merged PR #12674:
URL: https://github.com/apache/lucene/pull/12674





Re: [PR] upgrade to OpenNLP 2.3.2 [lucene]

2024-02-09 Thread via GitHub


cpoerschke commented on PR #12674:
URL: https://github.com/apache/lucene/pull/12674#issuecomment-1935747334

   > ... perhaps fire an email to legal at apache, asking ...
   
   Good idea, will do. And thanks for reviewing also with this in mind!
   
   





Re: [PR] Bump release to Java 21 [lucene]

2024-02-09 Thread via GitHub


ChrisHegarty commented on PR #12753:
URL: https://github.com/apache/lucene/pull/12753#issuecomment-1935727748

   At the moment we're just using this PR to ensure that the build and tests of 
_main_ are OK with Java 21, so that we're ready to bump when the time comes.





Re: [PR] Move synonym map off-heap for SynonymGraphFilter [lucene]

2024-02-09 Thread via GitHub


msfroh commented on PR #13054:
URL: https://github.com/apache/lucene/pull/13054#issuecomment-1935491971

   I decided to try experimenting with moving the output words back onto the 
heap, since I didn't like the fact that every word lookup was triggering a seek.
   
   Running now, I got way less variance on the on-heap runs. I also added some 
GCs between iterations, since I wanted to measure the heap usage of each. That 
likely removed some GC pauses from the on-heap version.
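
The benchmark code itself is not shown in this comment, so the following is only a rough sketch of the general technique of requesting a GC between iterations and reading heap usage afterwards. The class `HeapUsageSketch`, the helper `usedHeapBytesAfterGc`, and the byte-array payload are all hypothetical stand-ins, not the PR's harness. Note that `System.gc()` is only a request to the JVM, so the numbers are approximate.

```java
// Hypothetical sketch; not the benchmark used in the PR.
final class HeapUsageSketch {

  // Requests a GC (the JVM may ignore it) and returns the heap currently in use.
  static long usedHeapBytesAfterGc() {
    Runtime rt = Runtime.getRuntime();
    System.gc();
    return rt.totalMemory() - rt.freeMemory();
  }

  public static void main(String[] args) {
    for (int i = 1; i <= 3; i++) {
      long before = usedHeapBytesAfterGc();
      // Stand-in for the structure being measured (assumption: 8 MB array).
      byte[] payload = new byte[8 * 1024 * 1024];
      long after = usedHeapBytesAfterGc();
      System.out.println(
          "iteration " + i + ": ~" + (after - before) / 1024 + " kB retained");
      // Touch the payload after measuring so it stays reachable until here.
      if (payload.length == 0) {
        throw new AssertionError("unreachable");
      }
    }
  }
}
```

Collecting between iterations also keeps garbage left over from one run from being charged to the next, which may be part of why the timed on-heap runs showed less variance.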
   
   I then switched back to the off-heap words to confirm the results that I saw 
last time (and compare against the implementation with on-heap words).
   
   The conclusion seems to be roughly:
   * Existing on-heap FST averages about 444ms to process a lot of synonyms.
   * Off-heap FST with on-heap words averages 515 or 516ms. (About 16% slower 
than existing on-heap.)
   * Off-heap FST with off-heap words averages 620ms. (About 40% slower than 
existing on-heap.)
   
   The on-heap FST seems to occupy about 36MB of heap. The off-heap FST with 
on-heap words occupies about 560kB. The off-heap FST with off-heap words 
occupies about 150kB.
   
   With these trade-offs, I think off-heap FST with on-heap words may be a good 
choice for folks with large sets of synonyms. I don't think I would recommend 
off-heap FST with off-heap words.
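
As a quick check of the percentages quoted above, the relative slowdown can be recomputed from the stated averages. This is a small sketch: the class and method names are hypothetical, and 515.5 is an assumed midpoint for the "515 or 516ms" figure.

```java
import java.util.Locale;

// Hypothetical helper for re-deriving the slowdown figures from the averages above.
final class SlowdownSketch {

  // Relative slowdown of candidateMs versus baselineMs, in percent.
  static double slowdownPercent(double baselineMs, double candidateMs) {
    return (candidateMs - baselineMs) / baselineMs * 100.0;
  }

  public static void main(String[] args) {
    double onHeapFst = 444.0;               // existing on-heap FST
    double offHeapFstOnHeapWords = 515.5;   // assumed midpoint of "515 or 516ms"
    double offHeapFstOffHeapWords = 620.0;  // off-heap FST with off-heap words

    System.out.println(String.format(Locale.ROOT, "on-heap words:  %.1f%% slower",
        slowdownPercent(onHeapFst, offHeapFstOnHeapWords)));
    System.out.println(String.format(Locale.ROOT, "off-heap words: %.1f%% slower",
        slowdownPercent(onHeapFst, offHeapFstOffHeapWords)));
  }
}
```

Rounded to whole percentages, these come out at about 16% and 40%, matching the figures in the list.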
   
   | Attempt | OnHeap FST load time | OffHeap FST (OnHeap words) load time | OffHeap FST (OnHeap words) reload time | OnHeap FST processing time | OffHeap FST (OnHeap words) processing time | OffHeap FST (OnHeap words) reloaded processing time | OffHeap FST (OffHeap words) processing time | OffHeap FST (OffHeap words) reloaded processing time
   |--|--|--|--|--|--|--|--|--
   | 1 | 1191.339685 | 1072.285824 | 9.669646 | 436.391631 | 520.550704 | 516.11297 | 623.451546 | 620.531215
   | 2 | 1030.432454 | 1033.619768 | 8.874105 | 448.848403 | 516.784387 | 517.230739 | 621.522464 | 622.793343
   | 3 | 984.83645 | 1037.807342 | 8.912252 | 443.789813 | 512.066535 | 517.716981 | 622.455444 | 620.468985
   | 4 | 1049.63589 | 1048.60113 | 8.894401 | 449.237547 | 518.946226 | 516.868933 | 617.837364 | 616.810236
   | 5 | 990.22176 | 1049.618665 | 8.861166 | 448.923912 | 512.559801 | 511.114898 | 616.555422 | 617.122551
   | 6 | 978.41877 | 1063.824595 | 8.930418 | 440.251675 | 517.632376 | 518.175232 | 621.969759 | 622.828416
   | 7 | 985.434177 | 1049.113913 | 8.872906 | 443.209607 | 511.210536 | 518.802292 | 624.151468 | 622.097039
   | 8 | 985.376238 | 1046.102696 | 8.823786 | 440.815454 | 517.491411 | 517.905752 | 623.390319 | 625.387487
   | 9 | 983.341325 | 1065.892279 | 8.871586 | 449.145252 | 516.029267 | 516.916524 | 622.811992 | 622.798858
   | 10 | 985.438642 | 1046.71167 | 8.8518 | 445.970679 | 512.045037 | 518.934149 | 622.592098 | 614.661805
   | 11 | 990.592624 | 1050.377106 | 8.832753 | 443.844237 | 515.758106 | 510.808005 | 611.62254 | 622.956946
   | 12 | 986.747374 | 1066.052969 | 8.884928 | 444.398327 | 517.259451 | 524.770132 | 622.085785 | 619.311172
   | 13 | 984.328191 | 1052.189621 | 8.88281 | 439.612497 | 517.861131 | 515.796013 | 617.86 | 615.101452
   | 14 | 984.405339 | 1049.06783 | 8.835775 | 438.871305 | 517.885493 | 515.853446 | 615.254987 | 623.464483
   | 15 | 997.323593 | 1064.473985 | 8.90682 | 443.640208 | 515.329143 | 518.807239 | 623.020916 | 623.013801
   | 16 | 997.253932 | 1066.558928 | 8.900308 | 442.534843 | 511.930766 | 516.365803 | 624.316916 | 615.037306
   | 17 | 999.464751 | 1046.464149 | 8.895899 | 443.48306 | 514.841946 | 517.082166 | 617.615908 | 618.661376
   | 18 | 1001.896073 | 1045.304622 | 8.877555 | 444.875225 | 515.029862 | 510.365428 | 618.540866 | 624.355309
   | 19 | 986.055833 | 1045.208347 | 8.863339 | 441.647553 | 511.489699 | 517.213428 | 623.61503 | 621.198543
   | 20 | 984.112667 | 1047.317164 | 8.940865 | 451.304206 | 514.762544 | 510.45981 | 621.057397 | 621.483146
   | 21 | 988.310511 | 1046.154648 | 8.865301 | 447.25874 | 514.859414 | 517.24163 | 623.916511 | 614.185296
   | 22 | 982.874582 | 1062.113889 | 8.867098 | 439.785463 | 510.387721 | 516.885653 | 623.494968 | 622.527091
   | 23 | 980.96967 | 1048.050631 | 8.867966 | 439.05464 | 511.423329 | 516.984465 | 621.567988 | 621.204435
   | 24 | 983.189843 | 1046.083632 | 8.81578 | 440.574651 | 518.390122 | 520.392926 | 622.34785 | 614.923018
   | 25 | 987.033178 | 1074.553767 | 8.812579 | 446.687106 | 513.914686 | 521.952744 | 615.870183 | 621.089011
   | 26 | 985.771758 | 1076.245942 | 8.845264 | 444.718264 | 516.274395 | 513.5547 | 615.927497 | 615.53522
   | 27 | 981.748774 | 1046.85677 | 8.818164 | 443.252924 | 513.632714 | 
515.919924 |