dungba88 commented on code in PR #12831:
URL: https://github.com/apache/lucene/pull/12831#discussion_r1420866991
##
lucene/core/src/java/org/apache/lucene/util/fst/FST.java:
##
@@ -503,9 +518,7 @@ public FSTMetadata getMetadata() {
}
/**
- * Save the FST to DataOutput.
mikemccand commented on issue #12895:
URL: https://github.com/apache/lucene/issues/12895#issuecomment-1848198927
I am travelling this weekend and unlikely to make much progress on this
until early next week.
Maybe we just revert and release 9.9.1 now?
--
This is an automated messag
msfroh commented on PR #12897:
URL: https://github.com/apache/lucene/pull/12897#issuecomment-1848071583
If needed, I'm happy to add versions of `testFileIsUTF8()` for the other
SimpleTextCodec format unit tests.
msfroh opened a new pull request, #12897:
URL: https://github.com/apache/lucene/pull/12897
### Description
The SimpleTextSegmentInfoFormat was writing the random byte array used as a
segment's ID directly -- not converting to a simple text representation of the
byte array. As a resul
benwtrent commented on issue #12895:
URL: https://github.com/apache/lucene/issues/12895#issuecomment-1848037016
I think if a fix for this isn't found early next week, we should consider
reverting it.
No user should upgrade to Lucene 9.9.0 with this bug.
kuramitsu commented on code in PR #12885:
URL: https://github.com/apache/lucene/pull/12885#discussion_r1421180794
##
lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/JapaneseReadingFormFilter.java:
##
@@ -43,10 +43,30 @@ public JapaneseReadingFormFilter(TokenStrea
kuramitsu commented on code in PR #12885:
URL: https://github.com/apache/lucene/pull/12885#discussion_r1421172943
##
lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/JapaneseReadingFormFilter.java:
##
@@ -43,10 +43,30 @@ public JapaneseReadingFormFilter(TokenStrea
gsmiller commented on issue #12418:
URL: https://github.com/apache/lucene/issues/12418#issuecomment-1847976644
OK, merged #12853, which I think fixes the root cause of these randomized test
failures. I'm going to resolve this issue and will keep an eye on nightlies
for any new failures.
gsmiller commented on issue #12558:
URL: https://github.com/apache/lucene/issues/12558#issuecomment-1847976273
Fixed the root cause of this in #12853
gsmiller commented on PR #12853:
URL: https://github.com/apache/lucene/pull/12853#issuecomment-1847976009
Thanks @gautamworah96 for taking a look!
gsmiller commented on code in PR #12853:
URL: https://github.com/apache/lucene/pull/12853#discussion_r1421103992
##
lucene/facet/src/java/org/apache/lucene/facet/DrillSidewaysQuery.java:
##
@@ -193,42 +204,29 @@ public BulkScorer bulkScorer(LeafReaderContext context)
throws IOE
gsmiller merged PR #12853:
URL: https://github.com/apache/lucene/pull/12853
gautamworah96 commented on code in PR #12853:
URL: https://github.com/apache/lucene/pull/12853#discussion_r1421069686
##
lucene/facet/src/java/org/apache/lucene/facet/DrillSidewaysQuery.java:
##
@@ -193,42 +204,29 @@ public BulkScorer bulkScorer(LeafReaderContext context)
throw
gautamworah96 commented on code in PR #12853:
URL: https://github.com/apache/lucene/pull/12853#discussion_r1421077089
##
lucene/facet/src/java/org/apache/lucene/facet/DrillSidewaysQuery.java:
##
@@ -193,42 +204,29 @@ public BulkScorer bulkScorer(LeafReaderContext context)
throw
benwtrent commented on issue #12895:
URL: https://github.com/apache/lucene/issues/12895#issuecomment-1847858897
@mikemccand I have to use, at a minimum, `wikibig1m` for it to replicate.
A couple of weird things I noticed in that optimization PR:
-
https://github.com/apache/lucene
zhaih commented on issue #12896:
URL: https://github.com/apache/lucene/issues/12896#issuecomment-1847830605
Oh probably not, the test is just using the default merge policy (TMP)
zhaih commented on issue #12896:
URL: https://github.com/apache/lucene/issues/12896#issuecomment-1847828807
I think it might be due to the same problem as:
https://github.com/apache/lucene/pull/12889
e.g. a doc reorder merge policy reordered the parent child block
I haven't checked it myse
zhaih commented on code in PR #12885:
URL: https://github.com/apache/lucene/pull/12885#discussion_r1421011978
##
lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/JapaneseReadingFormFilter.java:
##
@@ -43,10 +43,30 @@ public JapaneseReadingFormFilter(TokenStream in
mikemccand commented on issue #12895:
URL: https://github.com/apache/lucene/issues/12895#issuecomment-1847786318
It's also curious that it's not happening w/ 9.9 created indices. #12699 is
about optimizing how we accumulate the long output while traversing (reading)
the FST block tree term
mikemccand commented on issue #12895:
URL: https://github.com/apache/lucene/issues/12895#issuecomment-1847782193
Ugh -- I'll try to look at this later today. Disappointing that our back
compat test specifically for reading 9.8 indices failed to catch this.
gsmiller opened a new issue, #12896:
URL: https://github.com/apache/lucene/issues/12896
### Description
Saw this test fail a couple times in automated builds (e.g.,
[here](https://jenkins.thetaphi.de/job/Lucene-main-Windows/13501/testReport/junit/org.apache.lucene.search.join/TestPare
jpountz commented on issue #12895:
URL: https://github.com/apache/lucene/issues/12895#issuecomment-1847604281
I have a 9.8 index that reproduces the bug and ran a `git bisect` to figure
out the first commit that fails, it pointed to #12699.
benwtrent commented on issue #12895:
URL: https://github.com/apache/lucene/issues/12895#issuecomment-1847604189
Git bisect has confirmed the read corruption occurs with:
https://github.com/apache/lucene/pull/12699
gsmiller merged PR #12894:
URL: https://github.com/apache/lucene/pull/12894
gsmiller commented on code in PR #12894:
URL: https://github.com/apache/lucene/pull/12894#discussion_r1420835887
##
lucene/test-framework/src/java/org/apache/lucene/tests/util/fst/FSTTester.java:
##
@@ -283,14 +283,17 @@ public FST doTest() throws IOException {
}
}
easyice commented on code in PR #12841:
URL: https://github.com/apache/lucene/pull/12841#discussion_r1420843242
##
lucene/core/src/test/org/apache/lucene/store/TestMMapDirectory.java:
##
@@ -114,4 +115,31 @@ public void testNullParamsIndexInput() throws Exception {
}
benwtrent commented on issue #12895:
URL: https://github.com/apache/lucene/issues/12895#issuecomment-1847590622
Possibly related: https://github.com/apache/lucene/pull/12631
NOTE: the read corruption doesn't occur when reading from an index created
in 9.9.
gsmiller merged PR #12812:
URL: https://github.com/apache/lucene/pull/12812
gsmiller commented on code in PR #12812:
URL: https://github.com/apache/lucene/pull/12812#discussion_r1420830252
##
lucene/memory/src/java/org/apache/lucene/index/memory/MemoryIndex.java:
##
@@ -179,6 +179,19 @@ static class SlicedIntBlockPool extends IntBlockPool {
super
gsmiller merged PR #12854:
URL: https://github.com/apache/lucene/pull/12854
gsmiller merged PR #12855:
URL: https://github.com/apache/lucene/pull/12855
benwtrent commented on issue #12895:
URL: https://github.com/apache/lucene/issues/12895#issuecomment-1847568572
Here are some exceptions I ran into when trying to do multi-term queries with
Lucene 9.9 against an index created in 9.8 or before:
```
Caused by: java.lang.ArrayIndexOutOf
benwtrent commented on issue #12895:
URL: https://github.com/apache/lucene/issues/12895#issuecomment-1847562169
//cc @gf2121 && @mikemccand
benwtrent opened a new issue, #12895:
URL: https://github.com/apache/lucene/issues/12895
### Description
It seems that https://github.com/apache/lucene/pull/12699/ has inadvertently
broken reading term dictionaries created in Lucene 9.8 or earlier.
To replicate the bug, one can index wiki
easyice commented on code in PR #12841:
URL: https://github.com/apache/lucene/pull/12841#discussion_r1420812950
##
lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java:
##
@@ -324,24 +324,9 @@ private void readGroupVInt(long[] dst, int offset) throws
IOEx
dungba88 opened a new pull request, #12894:
URL: https://github.com/apache/lucene/pull/12894
### Description
The test can throw an NPE when it's using off-heap mode and no nodes are
accepted
epotyom commented on PR #12862:
URL: https://github.com/apache/lucene/pull/12862#issuecomment-1847540870
Thank you for reviewing, @mikemccand! Resolved your comments in the 2nd commit.
uschindler commented on code in PR #12841:
URL: https://github.com/apache/lucene/pull/12841#discussion_r1420760926
##
lucene/core/src/test/org/apache/lucene/store/TestMMapDirectory.java:
##
@@ -114,4 +115,31 @@ public void testNullParamsIndexInput() throws Exception {
}
uschindler commented on code in PR #12841:
URL: https://github.com/apache/lucene/pull/12841#discussion_r1420793976
##
lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java:
##
@@ -324,24 +324,9 @@ private void readGroupVInt(long[] dst, int offset) throws
I
epotyom commented on code in PR #12862:
URL: https://github.com/apache/lucene/pull/12862#discussion_r1420792687
##
lucene/facet/src/java/org/apache/lucene/facet/MultiFacets.java:
##
@@ -77,6 +80,39 @@ public Number getSpecificValue(String dim, String... path)
throws IOException
stefanvodita commented on code in PR #12812:
URL: https://github.com/apache/lucene/pull/12812#discussion_r1420778257
##
lucene/memory/src/java/org/apache/lucene/index/memory/MemoryIndex.java:
##
@@ -179,6 +179,19 @@ static class SlicedIntBlockPool extends IntBlockPool {
s
epotyom commented on code in PR #12862:
URL: https://github.com/apache/lucene/pull/12862#discussion_r1420764333
##
lucene/facet/src/java/org/apache/lucene/facet/MultiFacets.java:
##
@@ -77,6 +80,39 @@ public Number getSpecificValue(String dim, String... path)
throws IOException
epotyom commented on code in PR #12862:
URL: https://github.com/apache/lucene/pull/12862#discussion_r1420764008
##
lucene/facet/src/java/org/apache/lucene/facet/MultiFacets.java:
##
@@ -77,6 +80,39 @@ public Number getSpecificValue(String dim, String... path)
throws IOException
epotyom commented on code in PR #12862:
URL: https://github.com/apache/lucene/pull/12862#discussion_r1420763643
##
lucene/facet/src/java/org/apache/lucene/facet/LongValueFacetCounts.java:
##
@@ -568,6 +568,12 @@ public Number getSpecificValue(String dim, String... path)
{
epotyom commented on code in PR #12862:
URL: https://github.com/apache/lucene/pull/12862#discussion_r1420763238
##
lucene/CHANGES.txt:
##
@@ -67,6 +67,8 @@ API Changes
* GITHUB#11023: Adding -level param to CheckIndex, making the old -fast param
the default behaviour. (Jakub
gsmiller commented on code in PR #12812:
URL: https://github.com/apache/lucene/pull/12812#discussion_r1420739649
##
lucene/memory/src/java/org/apache/lucene/index/memory/MemoryIndex.java:
##
@@ -179,6 +179,19 @@ static class SlicedIntBlockPool extends IntBlockPool {
super
easyice commented on code in PR #12841:
URL: https://github.com/apache/lucene/pull/12841#discussion_r1420565417
##
lucene/test-framework/src/java/org/apache/lucene/tests/store/BaseDirectoryTestCase.java:
##
@@ -1438,4 +1440,68 @@ public void testListAllIsSorted() throws IOExcept
benwtrent commented on code in PR #12699:
URL: https://github.com/apache/lucene/pull/12699#discussion_r1420733099
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java:
##
@@ -1190,4 +1176,63 @@ public void seekExact(long ord) {
public long
gsmiller commented on PR #12890:
URL: https://github.com/apache/lucene/pull/12890#issuecomment-1847461557
IMO we should deprecate these without replacement. I agree that users should
be able to implement this logic pretty easily in their application layer, and
would probably be better suite
stefanvodita commented on code in PR #12812:
URL: https://github.com/apache/lucene/pull/12812#discussion_r1420723711
##
lucene/memory/src/java/org/apache/lucene/index/memory/MemoryIndex.java:
##
@@ -179,6 +179,19 @@ static class SlicedIntBlockPool extends IntBlockPool {
s
gsmiller commented on code in PR #12812:
URL: https://github.com/apache/lucene/pull/12812#discussion_r1420707913
##
lucene/memory/src/java/org/apache/lucene/index/memory/MemoryIndex.java:
##
@@ -179,6 +179,19 @@ static class SlicedIntBlockPool extends IntBlockPool {
super
stefanvodita commented on code in PR #12812:
URL: https://github.com/apache/lucene/pull/12812#discussion_r1420699163
##
lucene/memory/src/java/org/apache/lucene/index/memory/MemoryIndex.java:
##
@@ -179,6 +179,19 @@ static class SlicedIntBlockPool extends IntBlockPool {
s
gsmiller commented on code in PR #12812:
URL: https://github.com/apache/lucene/pull/12812#discussion_r1420685472
##
lucene/memory/src/java/org/apache/lucene/index/memory/MemoryIndex.java:
##
@@ -179,6 +179,19 @@ static class SlicedIntBlockPool extends IntBlockPool {
super
lukas-vlcek commented on PR #12875:
URL: https://github.com/apache/lucene/pull/12875#issuecomment-1847241112
@mikemccand Could you give me a hint about the following?
> (e.g. `UnifiedHighlighter`, in certain modes)
I am looking at `TestUnifiedHighlighter*` tests. Does it mean th
uschindler commented on code in PR #12841:
URL: https://github.com/apache/lucene/pull/12841#discussion_r1420450756
##
lucene/core/src/java/org/apache/lucene/util/GroupVIntUtil.java:
##
@@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+
uschindler commented on code in PR #12841:
URL: https://github.com/apache/lucene/pull/12841#discussion_r1420451580
##
lucene/core/src/java/org/apache/lucene/util/GroupVIntUtil.java:
##
@@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+
uschindler commented on code in PR #12841:
URL: https://github.com/apache/lucene/pull/12841#discussion_r1420447123
##
lucene/core/src/java/org/apache/lucene/util/GroupVIntUtil.java:
##
@@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+
easyice commented on code in PR #12841:
URL: https://github.com/apache/lucene/pull/12841#discussion_r1420418454
##
lucene/core/src/java/org/apache/lucene/util/GroupVIntUtil.java:
##
@@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ *
uschindler commented on code in PR #12841:
URL: https://github.com/apache/lucene/pull/12841#discussion_r1420387856
##
lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java:
##
@@ -303,6 +304,34 @@ public byte readByte(long pos) throws IOException {
}
slow-J opened a new pull request, #12893:
URL: https://github.com/apache/lucene/pull/12893
Following up on @mikemccand's comment in the previous CheckIndex
PR: https://github.com/apache/lucene/pull/12876.
> I do think some of these tags in CheckIndex.java could be removed, e.g.
on each o
easyice commented on PR #12841:
URL: https://github.com/apache/lucene/pull/12841#issuecomment-1847102122
It looks good on `byteBuffers` and `MMapDirectory`; the benchmark result is
pretty close to the previous commit,
but there is a bit of a slowdown on `NIOFSDirectory`. I will dig into it.
* `*ReadGroupV
uschindler commented on code in PR #12841:
URL: https://github.com/apache/lucene/pull/12841#discussion_r1420382794
##
lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java:
##
@@ -303,6 +304,48 @@ public byte readByte(long pos) throws IOException {
}
uschindler commented on code in PR #12841:
URL: https://github.com/apache/lucene/pull/12841#discussion_r1420379683
##
lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java:
##
@@ -303,6 +304,34 @@ public byte readByte(long pos) throws IOException {
}
uschindler commented on code in PR #12841:
URL: https://github.com/apache/lucene/pull/12841#discussion_r1420377148
##
lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java:
##
@@ -324,24 +324,9 @@ private void readGroupVInt(long[] dst, int offset) throws
I
uschindler commented on code in PR #12841:
URL: https://github.com/apache/lucene/pull/12841#discussion_r1420375351
##
lucene/core/src/java/org/apache/lucene/util/GroupVIntUtil.java:
##
@@ -62,4 +62,42 @@ private static long readLongInGroup(DataInput in, int
numBytesMinus1) thro
uschindler commented on code in PR #12841:
URL: https://github.com/apache/lucene/pull/12841#discussion_r1420373977
##
lucene/core/src/java/org/apache/lucene/util/GroupVIntUtil.java:
##
@@ -62,4 +62,42 @@ private static long readLongInGroup(DataInput in, int
numBytesMinus1) thro
uschindler commented on code in PR #12841:
URL: https://github.com/apache/lucene/pull/12841#discussion_r1420364348
##
lucene/test-framework/src/java/org/apache/lucene/tests/store/BaseDirectoryTestCase.java:
##
@@ -1438,4 +1440,68 @@ public void testListAllIsSorted() throws IOExc
easyice commented on PR #12841:
URL: https://github.com/apache/lucene/pull/12841#issuecomment-1847060669
I'm measuring the performance difference against the previous commit; it will
take a moment.
gokaai commented on code in PR #12872:
URL: https://github.com/apache/lucene/pull/12872#discussion_r1420277534
##
lucene/core/src/java/org/apache/lucene/index/CheckIndex.java:
##
@@ -957,6 +974,9 @@ private Status.SegmentInfoStatus testSegment(
SegmentReader reader = null;
easyice commented on PR #12841:
URL: https://github.com/apache/lucene/pull/12841#issuecomment-1846907147
> Can we do the same for all other inputs?
I think so; I will do this if @jpountz doesn't mind.
> I will nag Maurizio again about the problem with slice().
Thank you s
shaikhu commented on PR #12519:
URL: https://github.com/apache/lucene/pull/12519#issuecomment-1846884628
Oops, I completely forgot about this. Restored the forked repo and reopening.
uschindler commented on PR #12841:
URL: https://github.com/apache/lucene/pull/12841#issuecomment-1846883907
I will nag Maurizio again about the problem with slice(). The reason for
this was some strange problem with Hotspot. I thought they fixed it.
uschindler commented on PR #12841:
URL: https://github.com/apache/lucene/pull/12841#issuecomment-1846881254
I would still be safe and initialize the IntReader on construction of the
IndexInput. It can strongly bind to the current segment.
Can we do the same for all other inputs?
easyice commented on PR #12841:
URL: https://github.com/apache/lucene/pull/12841#issuecomment-1846784038
+1 for GC overhead; here is the GC output (`-prof gc`):
```
Benchmark
(size) Mode Cnt Score Er
jpountz commented on PR #12841:
URL: https://github.com/apache/lucene/pull/12841#issuecomment-1846779641
I confirmed there's GC activity happening with the slice approach by using
`-prof gc`:
```
Benchmark
(si
jpountz commented on PR #12841:
URL: https://github.com/apache/lucene/pull/12841#issuecomment-1846774383
I'll check if there is GC activity during the benchmark. In the meantime, I
looked into using lambdas instead, and it seems like it would work well:
https://github.com/apache/lucene/comm
kuramitsu commented on PR #12885:
URL: https://github.com/apache/lucene/pull/12885#issuecomment-1846729364
The modification within the getRomanization function has been dropped.
Instead, in the incrementToken function, I added a process to treat the
hiragana OOV term converted to kataka a