[
https://issues.apache.org/jira/browse/IGNITE-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16030867#comment-16030867
]
Vyacheslav Daradur commented on IGNITE-5097:
--------------------------------------------
I've prepared a separate PR where only array lengths are affected:
https://github.com/apache/ignite/pull/2043
Locally, the tests in IgniteBinaryObjectsTestSuite pass.
Sent to [ci.tests|http://ci.ignite.apache.org/viewQueued.html?itemId=638454]
For me, there are 2 controversial issues:
1). Extend the interface with 'varint' methods OR just add new write-read methods.
I chose the second approach, because it is used in other places in the project. In
addition, we use 'varint' only for special cases, so there is no sense in extending
the interface with it.
2). Implementation of the sizeInVarint method: there are several ways to implement
it.
I've prepared a benchmark; it looks like the current implementation is a bit faster.
{code:title=Benchmark}
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
public class VarintSizeBenchmark {
    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
            .include(VarintSizeBenchmark.class.getSimpleName())
            .mode(Mode.AverageTime)
            .timeUnit(TimeUnit.NANOSECONDS)
            .warmupIterations(10)
            .measurementIterations(30)
            .forks(1)
            .jvmArgs("-ea", "-Xms4g", "-Xmx4g")
            .shouldFailOnError(true)
            .build();

        new Runner(opt).run();
    }

    @State(Scope.Thread)
    public static class ValueState {
        @Param({"1", "10", "100", "1000", "10200", "102000", "1203000",
            "10203000", "200304000", "300000000"})
        int i;
    }

    @Benchmark
    public int sizeInUnsignedVarint(ValueState state) {
        int val = state.i;
        int size = 1;

        while ((val & 0xFFFFFF80) != 0L) {
            val >>>= 7;
            size++;
        }

        return size;
    }

    @Benchmark
    public int sizeInUnsignedVarint2(ValueState state) {
        int val = state.i;

        if (val < 0)
            return 5;
        if (val <= Byte.MAX_VALUE)
            return 1;
        if (val <= 16383)
            return 2;
        if (val <= 2097151)
            return 3;
        if (val <= 268435455)
            return 4;

        return 5;
    }
}
{code}
{code:title=Result}
Benchmark                                        (i)  Mode  Cnt  Score   Error  Units
VarintSizeBenchmark.sizeInUnsignedVarint           1  avgt   30  2,326 ± 0,042  ns/op
VarintSizeBenchmark.sizeInUnsignedVarint          10  avgt   30  2,301 ± 0,013  ns/op
VarintSizeBenchmark.sizeInUnsignedVarint         100  avgt   30  2,296 ± 0,008  ns/op
VarintSizeBenchmark.sizeInUnsignedVarint        1000  avgt   30  2,556 ± 0,017  ns/op
VarintSizeBenchmark.sizeInUnsignedVarint       10200  avgt   30  2,570 ± 0,044  ns/op
VarintSizeBenchmark.sizeInUnsignedVarint      102000  avgt   30  3,534 ± 0,021  ns/op
VarintSizeBenchmark.sizeInUnsignedVarint     1203000  avgt   30  3,553 ± 0,075  ns/op
VarintSizeBenchmark.sizeInUnsignedVarint    10203000  avgt   30  3,569 ± 0,015  ns/op
VarintSizeBenchmark.sizeInUnsignedVarint   200304000  avgt   30  3,564 ± 0,014  ns/op
VarintSizeBenchmark.sizeInUnsignedVarint   300000000  avgt   30  3,843 ± 0,005  ns/op
VarintSizeBenchmark.sizeInUnsignedVarint2          1  avgt   30  2,075 ± 0,068  ns/op
VarintSizeBenchmark.sizeInUnsignedVarint2         10  avgt   30  2,032 ± 0,013  ns/op
VarintSizeBenchmark.sizeInUnsignedVarint2        100  avgt   30  2,032 ± 0,013  ns/op
VarintSizeBenchmark.sizeInUnsignedVarint2       1000  avgt   30  2,292 ± 0,017  ns/op
VarintSizeBenchmark.sizeInUnsignedVarint2      10200  avgt   30  2,287 ± 0,011  ns/op
VarintSizeBenchmark.sizeInUnsignedVarint2     102000  avgt   30  2,285 ± 0,007  ns/op
VarintSizeBenchmark.sizeInUnsignedVarint2    1203000  avgt   30  2,285 ± 0,007  ns/op
VarintSizeBenchmark.sizeInUnsignedVarint2   10203000  avgt   30  2,286 ± 0,010  ns/op
VarintSizeBenchmark.sizeInUnsignedVarint2  200304000  avgt   30  2,346 ± 0,087  ns/op
VarintSizeBenchmark.sizeInUnsignedVarint2  300000000  avgt   30  2,285 ± 0,009  ns/op
{code}
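For readers unfamiliar with the format, here is a minimal sketch of what a standalone write/read pair for an unsigned varint could look like, together with the loop-based size calculation from the benchmark. The class name VarintSketch and the method signatures are hypothetical and chosen only for illustration; this is not the code from the PR, and the reader does no validation of malformed input.

```java
import java.io.ByteArrayOutputStream;

// Illustrative sketch only: names and signatures are assumptions, not the PR's API.
public class VarintSketch {
    /** Writes val as an unsigned varint: 7 payload bits per byte, high bit set on every byte except the last. */
    public static void writeUnsignedVarint(ByteArrayOutputStream out, int val) {
        while ((val & 0xFFFFFF80) != 0) {
            out.write((val & 0x7F) | 0x80);
            val >>>= 7;
        }

        out.write(val & 0x7F);
    }

    /** Reads an unsigned varint starting at pos. Assumes well-formed input of at most 5 bytes. */
    public static int readUnsignedVarint(byte[] buf, int pos) {
        int res = 0;
        int shift = 0;
        byte b;

        do {
            b = buf[pos++];
            res |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);

        return res;
    }

    /** Number of bytes the varint form of val occupies (same loop as in the benchmark). */
    public static int sizeInUnsignedVarint(int val) {
        int size = 1;

        while ((val & 0xFFFFFF80) != 0) {
            val >>>= 7;
            size++;
        }

        return size;
    }

    public static void main(String[] args) {
        int[] lengths = {1, 127, 128, 16383, 16384, 300000000, -1};

        for (int len : lengths) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            writeUnsignedVarint(out, len);
            byte[] bytes = out.toByteArray();

            // Round-trip and size must agree with the written bytes.
            if (bytes.length != sizeInUnsignedVarint(len) || readUnsignedVarint(bytes, 0) != len)
                throw new AssertionError("Mismatch for " + len);

            System.out.println(len + " -> " + bytes.length + " byte(s)");
        }
    }
}
```

With this scheme an array length below 128 takes 1 byte instead of the fixed 4, while a negative value or anything above 268435455 takes 5 bytes, which is why the format only pays off where values are usually small.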
> BinaryMarshaller should write ints in "varint" encoding where it makes sense
> ----------------------------------------------------------------------------
>
> Key: IGNITE-5097
> URL: https://issues.apache.org/jira/browse/IGNITE-5097
> Project: Ignite
> Issue Type: Task
> Components: general
> Affects Versions: 2.0
> Reporter: Vladimir Ozerov
> Assignee: Vyacheslav Daradur
> Labels: important, performance
> Fix For: 2.1
>
>
> There are a lot of places in the code where we write integers for some
> special purposes. Quite often their value will be very small, so that
> applying "varint" format could save a lot of space at the cost of very low
> additional CPU overhead.
> Specifically:
> 1) Array/collection/map lengths
> 2) BigDecimal's (usually will save ~6 bytes)
> 3) Strings
> 4) Enum ordinals