[
https://issues.apache.org/jira/browse/LUCENE-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17242418#comment-17242418
]
Feng Guo edited comment on LUCENE-9629 at 12/2/20, 3:05 PM:
------------------------------------------------------------
Thanks for your reply! I fully agree that the write path is less
performance-sensitive than the read path, and to be honest, I didn't expect
this change to bring a big improvement in write speed. All I'm trying
to do is remove duplicate computation wherever it appears, and this
happens to be a fairly hot path during indexing. So you may consider it a
"fix" rather than an "enhancement".
Here is a simple benchmark, run with a CPU profiler, if you are interested:
{code:java}
// Repeat the whole run 100 times so the profiler collects enough samples.
for (int time = 0; time < 100; time++) {
  Random random = new Random(System.currentTimeMillis());
  // One block of 128 values in [1, 4], each of which fits in 3 bits.
  long[] nums = new long[128];
  for (int i = 0; i < 128; i++) {
    nums[i] = random.nextInt(4) + 1;
  }
  ForUtil forUtil = new ForUtil();
  Directory directory = new ByteBuffersDirectory();
  DataOutput dataOutput = directory.createOutput("test", IOContext.DEFAULT);
  // Encode the same block repeatedly with bitsPerValue = 3.
  for (int i = 0; i < 100000000; i++) {
    forUtil.encode(nums, 3, dataOutput);
  }
  directory.close();
}
{code}
*result:*
|| ||before||after||
|org.apache.lucene.store.ByteBuffersIndexOutput.writeLong|40.4%|41.2%|
|org.apache.lucene.store.ForUtil.collapse8|15.3%|14.8%|
|org.apache.lucene.store.ForUtil.mask(ed)8|8.8%|3.8%|
From my point of view, the number of lines of code is less important than write
speed, but if you feel that the precomputation makes no sense, just tell me and I
will revert this part of the change :)
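
To make the "duplicate compute" point concrete, here is a minimal sketch of the general idea. It is not the actual ForUtil code: the class name MaskSketch, the MASKS table and both methods are hypothetical, and the masks are plain low-bit masks rather than whatever layout ForUtil uses. It only shows a mask being rebuilt on each call versus read from a static final table that is filled once.

```java
final class MaskSketch {
    // Hypothetical precomputed table: MASKS[b] has the low b bits set, for b in 0..63.
    private static final long[] MASKS = new long[64];

    static {
        for (int b = 0; b < 64; b++) {
            MASKS[b] = (1L << b) - 1;
        }
    }

    // Rebuilds the mask on every call (the "duplicate compute" case).
    static long maskRecomputed(long value, int bits) {
        return value & ((1L << bits) - 1);
    }

    // Looks the mask up in the table that was filled once at class-load time.
    static long maskPrecomputed(long value, int bits) {
        return value & MASKS[bits];
    }
}
```

Both methods return the same result for bits in 0..63; the second trades a shift and a subtraction per call for an array read, which is the kind of saving the profile above is measuring.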
> Use computed mask values in ForUtil
> -----------------------------------
>
> Key: LUCENE-9629
> URL: https://issues.apache.org/jira/browse/LUCENE-9629
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/codecs
> Reporter: Feng Guo
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> In the class ForUtil, mask values have already been computed and stored in static
> final variables, but they are recomputed on every encode call, which seems
> unnecessary.
> Another small fix is to change
> {code:java}
> remainingBitsPerValue > remainingBitsPerLong{code}
> to
> {code:java}
> remainingBitsPerValue >= remainingBitsPerLong{code}
> otherwise
>
> {code:java}
> if (remainingBitsPerValue == 0) {
> idx++;
> remainingBitsPerValue = bitsPerValue;
> }
> {code}
>
> this code will never be executed.
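
The reachability claim in the description can be checked with a small simplified loop. This is a sketch under assumptions, not the ForUtil source: it assumes the `== 0` check sits inside the branch guarded by the comparison, right after `remainingBitsPerLong` is subtracted, which the description implies but does not show. The class BranchSketch, the method countResets and the simplified else branch are hypothetical.

```java
final class BranchSketch {
    // Counts how often the "== 0" reset fires in a simplified packing loop.
    // inclusive = true uses ">=", inclusive = false uses ">".
    static int countResets(int bitsPerValue, int remainingBitsPerLong,
                           int numValues, boolean inclusive) {
        int resets = 0;
        int idx = 0;
        int remainingBitsPerValue = bitsPerValue;
        while (idx < numValues) {
            boolean enough = inclusive
                ? remainingBitsPerValue >= remainingBitsPerLong
                : remainingBitsPerValue > remainingBitsPerLong;
            if (enough) {
                remainingBitsPerValue -= remainingBitsPerLong;
                // With ">", the value here is always positive, so this never fires.
                if (remainingBitsPerValue == 0) {
                    resets++;
                    idx++;
                    remainingBitsPerValue = bitsPerValue;
                }
            } else {
                // Simplified stand-in for the other branch: move on to the next value.
                idx++;
                remainingBitsPerValue = bitsPerValue;
            }
        }
        return resets;
    }
}
```

With `>`, the branch is only entered when the value strictly exceeds `remainingBitsPerLong`, so the difference can never be zero and the reset is dead code; with `>=` it fires whenever the two are equal.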
--
This message was sent by Atlassian Jira
(v8.3.4#803005)