Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2024-01-11 Thread via GitHub
jpountz commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1887173459 FYI I pushed an annotation to nightly benchmarks, it should show up tomorrow. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-29 Thread via GitHub
easyice commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1872023239 Thank you for the backport and all great suggestions! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-29 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1872008672 Hi @easyice, I backported the PR. There was only a change in the test because in Java 11 does not have random() with two parameters. We have TestUtil for that. Uwe -- This is

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-28 Thread via GitHub
easyice commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1871224023 Thanks for explaining! i agree that the micro benchmarks results are for reference only, so let we backport it :) -- This is an automated message from the Apache Git Service. To

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-28 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1871211606 You see a relic. There is no slowdown. The baseline code is exactly the same like the one The difference may come from different decisions by hotspot regarding inlining

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-28 Thread via GitHub
easyice commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1871206021 > If you have backported this PR and see a difference, it is a relic. Just raise number of repetitions. Ohhh... let me try again.. -- This is an automated message from the

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-28 Thread via GitHub
easyice commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1871203007 Yeah, i backport this PR to branch_9x on my local machine, and run the benchmark with java11, i got performance regression for `MMapDirectoryInputs`, even if we not change the code for

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-28 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1871196645 If you have backported this PR and see a difference, it is a relic. Just raise number of repetitions. -- This is an automated message from the Apache Git Service. To respond to the

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-28 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1871194418 This PR only changes the Java 19-21 MMapIndput as we had a regression of ByteBufferIndexInput in Java 17 with ByteBufferIndexInput, too. So it was reverted. Just look at the code.

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-28 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1871193467 > Thank you Uwe, in my understanding we shouldn't backport this PR, because it has performance regression on java11 for `MMapDirectoryInputs` , this is also the main way of reading

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-28 Thread via GitHub
easyice commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1871184321 Thank you Uwe, in my understanding we shouldn't backport this PR, because it has performance regression on java11 for `MMapDirectoryInputs` , this is also the main way of reading data.

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-28 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1871151967 Hi @easyice, is my understanding correct. We can backport this PR as is? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-26 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1869642731 Hi, Thanks for the measurements, @easyice. So basically, we can backport the commit without any modifications. I can do this and move change entries afterwards. No PR

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-26 Thread via GitHub
easyice commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1869575788 There seems to be a speedup on [prefix queries](http://people.apache.org/~mikemccand/lucenebench/Prefix3.html) in nightly benchmarks. For reference, here is the benchmark in

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-23 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1868283615 I also merged the main barnch into the Java 22+ MMapDirectory implementation to make sure the MMAP tester finds any issues during its runs:

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-23 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1868283208 I appied this for Lucene 10 only (main branch). This also fits changes.txt. Let's figure out if backporting breaks anything. If this is fine in Lucene 9.x, we can move the

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-23 Thread via GitHub
uschindler merged PR #12841: URL: https://github.com/apache/lucene/pull/12841 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-22 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1867867281 I think, @mikemccand should have nightly benchmarks for both branches, the stable one and the main one. -- This is an automated message from the Apache Git Service. To respond to

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-22 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1867866005 > Change looks good to me, thanks for running the micro benchmarks again. > > I was thinking we'd backport this to 9.x, since this doesn't look like it would break anything.

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-22 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1867386417 Hi, Yes we can. I was waiting for some comment from Adrien. Let me invite him for a review. As this is main branch only, we can find-rube and change APIs later. Uwe --

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-21 Thread via GitHub
easyice commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1867190473 Hi @uschindler , can we merge this PR now?(without optimizing ByteBufferIndexInput) or anything else that needs to be changed? -- This is an automated message from the Apache Git

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-13 Thread via GitHub
easyice commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1855355717 After copy paste code to `MemorysegmentIndexInput`, they work well: java19: ``` GroupVIntBenchmark.benchMMapDirectoryInputs_readGroupVInt64 thrpt5

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-13 Thread via GitHub
easyice commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1855104523 > Let's keep ByteBufferIndexInput (without s) as is. Maybe work on that later. We would need to figure out what is causing slowness here. Yes, it is very strange. I have spent

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-13 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1854877951 Let's keep ByteBufferIndexInput (without s) as is. Maybe work on that later. We would need to figure out what is causing slowness here. So revert the change and copy paste code

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-13 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1854874240 I checked: niofs indexinput still uses an on heap buffer. So I have no idea why it is slower for that case. -- This is an automated message from the Apache Git Service. To respond

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-13 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1854866407 > I agree with you, the getInt() seems more expensive than `MemorySegment`, i prefer to revert the change on `ByteBufferIndexInput`, then do the similar improve on java20, java19 (to

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-12 Thread via GitHub
easyice commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1853232196 I agree with you, the getInt() seems more expensive than `MemorySegment`, i prefer to revert the change on `ByteBufferIndexInput`, then the similar improve on java20, java19 (to keep

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-12 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1852261778 If you have an on-heap ByteBuffer (like for ByteBuffersIndexInput), it executes completely different code when reading from the underlying data structure.. -- This is an automated

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-12 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1852257293 I have the feeling that for direct buffers (this is what MMap and NIO use, the getInt() seems more expensive than the sequential reads. -- This is an automated message from the

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-12 Thread via GitHub
easyice commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1852253188 Even if i used the copy code approach(avoid to using lambda, for test purpose), it was only 15%-20% faster. like this: ``` @Override public void

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-12 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1852165482 There is no lambda capturing problem. I have no idea why it complains. It really looks like fully inlined. It seems that it is not happy about those ByteBuffers at all. `ix()`

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-12 Thread via GitHub
easyice commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1852013823 Hi, I added the implementation for `ByteBufferIndexInput`, Unfortunately, the benchmark shows a bit regression: java17 ``` Benchmark

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-12 Thread via GitHub
easyice commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1423744250 ## lucene/core/src/java/org/apache/lucene/store/ByteBuffersIndexInput.java: ## @@ -205,6 +205,12 @@ public void readLongs(long[] dst, int offset, int length) throws

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-12 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1423722848 ## lucene/core/src/java/org/apache/lucene/store/ByteBuffersIndexInput.java: ## @@ -205,6 +205,12 @@ public void readLongs(long[] dst, int offset, int length)

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-12 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1851570021 > Could we consider not changing `MemorySegmentIndexInput` for java 19 and java20? As a preview feature , it seems reasonable that we only do optimizations in higher versions, and

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-11 Thread via GitHub
easyice commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1851433406 Could we consider not changing `MemorySegmentIndexInput` for java 19 and java20? As a preview feature , it seems reasonable that we only do optimizations in higher versions, and they

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-11 Thread via GitHub
easyice commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1851319229 > I have a better idea. Lets keep the 2 different method, but do another trick: With this great idea, the performance comes back! java21 ``` Benchmark

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-11 Thread via GitHub
easyice commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1851115440 Thanks for the detailed description, got it! looks better :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-11 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1851098019 Hi @easyice, I have a better idea. Lets keep the 2 different method, but do another trick: - The base class DataInput implements the public outer loop as a final implementation,

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-11 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1850587232 Otherwise I like the new code very much. In fact, this is a micro-benchmark. So the differences in speed won't be visible in query benchmarks. -- This is an automated

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-11 Thread via GitHub
easyice commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1850511554 Thank you @uschindler , thinking about that too, i will try to a second lambda tomorrow. -- This is an automated message from the Apache Git Service. To respond to the message, please

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-11 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1850500493 Hi, the problem of MMapDir is that the seek method has to update also the current block number. Maybe we pass a second lambda to update the position? Let's just try this out!

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-11 Thread via GitHub
easyice commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1850424855 The performance of the new approach seems regressed a bit more on java21_benchMMapDirectoryInputs_readGroupVInt, here is the difference of speed up relative to the baseline.

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-11 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1422507453 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -303,6 +304,30 @@ public byte readByte(long pos) throws IOException { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-11 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1422507453 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -303,6 +304,30 @@ public byte readByte(long pos) throws IOException { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-11 Thread via GitHub
easyice commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1422494949 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -303,6 +304,30 @@ public byte readByte(long pos) throws IOException { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-11 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1422466176 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -303,6 +304,30 @@ public byte readByte(long pos) throws IOException { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-11 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1422441239 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -303,6 +304,30 @@ public byte readByte(long pos) throws IOException { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-11 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1422441239 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -303,6 +304,30 @@ public byte readByte(long pos) throws IOException { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-11 Thread via GitHub
easyice commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1422436257 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -303,6 +304,30 @@ public byte readByte(long pos) throws IOException { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-11 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1422417582 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -303,6 +304,30 @@ public byte readByte(long pos) throws IOException { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-11 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1422413707 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -303,6 +304,30 @@ public byte readByte(long pos) throws IOException { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-11 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1422413707 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -303,6 +304,30 @@ public byte readByte(long pos) throws IOException { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-11 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1422396582 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -303,6 +304,30 @@ public byte readByte(long pos) throws IOException { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-11 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1422395622 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -303,6 +304,30 @@ public byte readByte(long pos) throws IOException { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-11 Thread via GitHub
easyice commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1422390737 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -303,6 +304,30 @@ public byte readByte(long pos) throws IOException { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-11 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1422381211 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -303,6 +304,30 @@ public byte readByte(long pos) throws IOException { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-11 Thread via GitHub
easyice commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1422358990 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -303,6 +304,30 @@ public byte readByte(long pos) throws IOException { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-11 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1849889528 Thanks. Why can't we move the whole loop to the util class? Of course you could benchmark this, but this may also the reson for the slowdown in the NIOFSDir case, -- This is an

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-11 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1422300969 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -303,6 +304,30 @@ public byte readByte(long pos) throws IOException { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-10 Thread via GitHub
easyice commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1849275229 > 3rd party implementations and our own MMap/NIO/... would just call this static method if they support random access or call super(), if they can't. Thanks @uschindler , it's

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-10 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1849057360 Hi, I was busy this Sunday, sorry for delay. Will check tomorrow. In general: My idea is to have a single static utility method (like the default one) in the util class that

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-10 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1421801910 ## lucene/benchmark-jmh/src/java/org/apache/lucene/benchmark/jmh/GroupVIntBenchmark.java: ## @@ -140,10 +155,12 @@ public void init() throws Exception { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-09 Thread via GitHub
easyice commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1421459009 ## lucene/benchmark-jmh/src/java/org/apache/lucene/benchmark/jmh/GroupVIntBenchmark.java: ## @@ -140,10 +155,12 @@ public void init() throws Exception { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-09 Thread via GitHub
easyice commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1421441005 ## lucene/benchmark-jmh/src/java/org/apache/lucene/benchmark/jmh/GroupVIntBenchmark.java: ## @@ -140,10 +155,12 @@ public void init() throws Exception { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-09 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1421431357 ## lucene/benchmark-jmh/src/java/org/apache/lucene/benchmark/jmh/GroupVIntBenchmark.java: ## @@ -140,10 +155,12 @@ public void init() throws Exception { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-09 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1421431248 ## lucene/benchmark-jmh/src/java/org/apache/lucene/benchmark/jmh/GroupVIntBenchmark.java: ## @@ -140,10 +155,12 @@ public void init() throws Exception { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-09 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1421430301 ## lucene/core/src/java/org/apache/lucene/store/BufferedIndexInput.java: ## @@ -149,6 +150,29 @@ public final int readInt() throws IOException { } } +

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-09 Thread via GitHub
easyice commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1421421593 ## lucene/core/src/java/org/apache/lucene/store/BufferedIndexInput.java: ## @@ -149,6 +150,29 @@ public final int readInt() throws IOException { } } +

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-09 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1421414875 ## lucene/benchmark-jmh/src/java/org/apache/lucene/benchmark/jmh/GroupVIntBenchmark.java: ## @@ -140,10 +155,12 @@ public void init() throws Exception { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
easyice commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1420843242 ## lucene/core/src/test/org/apache/lucene/store/TestMMapDirectory.java: ## @@ -114,4 +115,31 @@ public void testNullParamsIndexInput() throws Exception { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
easyice commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1420812950 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -324,24 +324,9 @@ private void readGroupVInt(long[] dst, int offset) throws

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1420760926 ## lucene/core/src/test/org/apache/lucene/store/TestMMapDirectory.java: ## @@ -114,4 +115,31 @@ public void testNullParamsIndexInput() throws Exception { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1420793976 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -324,24 +324,9 @@ private void readGroupVInt(long[] dst, int offset) throws

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1420793976 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -324,24 +324,9 @@ private void readGroupVInt(long[] dst, int offset) throws

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1420760926 ## lucene/core/src/test/org/apache/lucene/store/TestMMapDirectory.java: ## @@ -114,4 +115,31 @@ public void testNullParamsIndexInput() throws Exception { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
easyice commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1420565417 ## lucene/test-framework/src/java/org/apache/lucene/tests/store/BaseDirectoryTestCase.java: ## @@ -1438,4 +1440,68 @@ public void testListAllIsSorted() throws

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1420450756 ## lucene/core/src/java/org/apache/lucene/util/GroupVIntUtil.java: ## @@ -0,0 +1,104 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1420451580 ## lucene/core/src/java/org/apache/lucene/util/GroupVIntUtil.java: ## @@ -0,0 +1,104 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1420447123 ## lucene/core/src/java/org/apache/lucene/util/GroupVIntUtil.java: ## @@ -0,0 +1,104 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
easyice commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1420418454 ## lucene/core/src/java/org/apache/lucene/util/GroupVIntUtil.java: ## @@ -0,0 +1,104 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + *

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1420387856 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -303,6 +304,34 @@ public byte readByte(long pos) throws IOException { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
easyice commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1847102122 It looks good on `byteBuffers` and `MMapDirectory`, the benchmark result is pretty close to previous commit, but a bit slowdon on `NIOFSDirectory`, i will dig it. *

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1420382794 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -303,6 +304,48 @@ public byte readByte(long pos) throws IOException { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1420379683 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -303,6 +304,34 @@ public byte readByte(long pos) throws IOException { }

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1420377148 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -324,24 +324,9 @@ private void readGroupVInt(long[] dst, int offset) throws

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1420375351 ## lucene/core/src/java/org/apache/lucene/util/GroupVIntUtil.java: ## @@ -62,4 +62,42 @@ private static long readLongInGroup(DataInput in, int numBytesMinus1)

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1420373977 ## lucene/core/src/java/org/apache/lucene/util/GroupVIntUtil.java: ## @@ -62,4 +62,42 @@ private static long readLongInGroup(DataInput in, int numBytesMinus1)

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1420364348 ## lucene/test-framework/src/java/org/apache/lucene/tests/store/BaseDirectoryTestCase.java: ## @@ -1438,4 +1440,68 @@ public void testListAllIsSorted() throws

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
uschindler commented on code in PR #12841: URL: https://github.com/apache/lucene/pull/12841#discussion_r1420364348 ## lucene/test-framework/src/java/org/apache/lucene/tests/store/BaseDirectoryTestCase.java: ## @@ -1438,4 +1440,68 @@ public void testListAllIsSorted() throws

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
easyice commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1847060669 I'm running the performance differences between previous commit, it will take a moment. -- This is an automated message from the Apache Git Service. To respond to the message, please

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
easyice commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1846907147 > Can we do the same for all other inputs? I think so, i will do this if @jpountz doesn't mind. > I will nag Maurizio again about the problem with slice(). Thank you

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1846883907 I will nag Maurizio again about the problem with slice(). The reason for this was some strange problem with Hotspot. I thought they fixed it. -- This is an automated message from

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1846881254 I would still be safe and initialize the IntReader on construction of the IndexInput. It can strongly bind to the current segment. Can we do the same for all other inputs? --

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
easyice commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1846784038 +1 for gc overhead, here is the gc output (`-prof gc` ): ``` Benchmark (size) Mode Cnt Score

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
jpountz commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1846779641 I confirmed there's GC activity happening with the slice approach by using `-prof gc`: ``` Benchmark

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-08 Thread via GitHub
jpountz commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1846774383 I'll check if there is GC activity during the benchmark. In the meantime, I looked into using lambdas instead, and it seems like it would work well:

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-07 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1846710600 > Thank you for quick impl Adrien, for reference, i tried this approach [code

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-07 Thread via GitHub
easyice commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1846510407 > Thank you for sharing the code, it seems very clear, another way, could we pass the current block(ByteBuffer) to the decode function like below? this will keep the remaining bytes

Re: [PR] Move group-varint encoding/decoding logic to DataOutput/DataInput [lucene]

2023-12-07 Thread via GitHub
uschindler commented on PR #12841: URL: https://github.com/apache/lucene/pull/12841#issuecomment-1845904374 Lol. That's strange! I would not have added a readNBytes method and just do the ByteBuffer wrapping in the readVInt method that calls the static method to decode. If

  1   2   >