Re: [xz-devel] java crc64 implementation

2021-02-05 Thread Brett Okken
This had /way/ more impact than I expected on overall decompression performance. Here are the baseline numbers for 1.8 (jdk 11 64bit):
Benchmark                            (file)           Mode  Cnt  Score    Error  Units
XZDecompressionBenchmark.decompress  ihe_ovly_pr.dcm  avgt    3  0.731 ± 0.010

Re: [xz-devel] java crc64 implementation

2021-02-05 Thread Lasse Collin
On 2021-02-05 Brett Okken wrote:
> On Fri, Feb 5, 2021 at 11:07 AM Lasse Collin wrote:
> > Also, does it really help to unroll the loop? With 8191-byte
> > buffers I see no significant difference (in a quick
> > not-very-accurate test) if the switch-statement is replaced with a
> > while-loop.

Re: [xz-devel] java crc64 implementation

2021-02-05 Thread Brett Okken
On Fri, Feb 5, 2021 at 11:07 AM Lasse Collin wrote:
>
> On 2021-02-02 Brett Okken wrote:
> > Thus far I have only tested on jdk 11 64bit windows, but the fairly
> > clear winner is:
> >
> > public void update(byte[] buf, int off, int len) {
> >     final int end = off + len;
> >

Re: [xz-devel] java crc64 implementation

2021-02-05 Thread Lasse Collin
On 2021-02-02 Brett Okken wrote:
> Thus far I have only tested on jdk 11 64bit windows, but the fairly
> clear winner is:
>
> public void update(byte[] buf, int off, int len) {
>     final int end = off + len;
>     int i=off;
>     if (len > 3) {
>         switch (i & 3) {
>

Re: [xz-devel] java crc64 implementation

2021-02-02 Thread Brett Okken
I tested jdk 15 64bit and jdk 11 32bit, client and server, and the above implementation is consistently quite good. The alternative still in the running does not do the leading alignment. This version is really close in 64 bit testing and slightly faster for 32 bit. The differences are pretty small, and both

Re: [xz-devel] java crc64 implementation

2021-02-02 Thread Brett Okken
I accidentally hit reply instead of reply all.
> > Shouldn't that be (i & 3) != 0?
> > An offset of 0 should not enter this loop, but 0 & 3 does not equal 1.
>
> The idea really is that offset of 1 doesn't enter the loop, thus the
> main slicing-by-4 loop is misaligned. I don't know why it makes

Re: [xz-devel] java crc64 implementation

2021-02-02 Thread Lasse Collin
I assume you accidentally didn't post to the list so I'm quoting your email in full.
On 2021-02-02 Brett Okken wrote:
> > while ((i & 3) != 1 && i < end)
>
> Shouldn't that be (i & 3) != 0?
> An offset of 0 should not enter this loop, but 0 & 3 does not equal 1.
The idea really is that offset
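The two loop conditions debated in this message can be compared with a small sketch. It only counts how many bytes a prologue loop of the quoted shape would consume one at a time before a 4-byte main loop could start. The class and method names (`AlignmentPrologueSketch`, `leadingBytes`) and the `target` parameter are hypothetical and are not taken from the XZ for Java sources.

```java
final class AlignmentPrologueSketch {
    // Counts the bytes consumed by a prologue of the form
    //   while ((i & 3) != target && i < end) ++i;
    // target = 1 is the condition quoted in the thread,
    // target = 0 is the alternative that was asked about.
    static int leadingBytes(int off, int end, int target) {
        int i = off;
        while ((i & 3) != target && i < end) {
            ++i;
        }
        return i - off;
    }

    public static void main(String[] args) {
        for (int off = 0; off < 4; ++off) {
            System.out.println("off=" + off
                    + " target0=" + leadingBytes(off, 64, 0)
                    + " target1=" + leadingBytes(off, 64, 1));
        }
    }
}
```

With `target` set to 1, an offset of 1 skips the prologue entirely while an offset of 0 processes one byte first, which matches the behaviour described in the message. With `target` set to 0 the roles are swapped.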

Re: [xz-devel] java crc64 implementation

2021-02-02 Thread Lasse Collin
Hello! I need to make a new release in the near future so that a minor problem can be fixed in .7z support in Apache Commons Compress. I thought I could include simpler and safer changes from your long list of patches and the CRC64 improvement might be such. On 2021-01-21 Brett Okken wrote: >

Re: [xz-devel] java crc64 implementation

2021-01-21 Thread Brett Okken
Here is a slice-by-4 implementation. It goes byte by byte so that it is easily compatible with older jdks. Performance-wise, it is pretty comparable to the java port of Adler's stackoverflow implementation:
Benchmark              Mode  Cnt  Score  Error  Units
Hash64Benchmark.adler
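For readers without the attached patch, the general slicing-by-4 technique for the CRC64 used by .xz (reflected ECMA-182 polynomial) can be sketched as follows. This is a minimal illustration that reads the input byte by byte, not the code posted in this message. The class name `Crc64SliceBy4Sketch` is hypothetical, and the sketch leaves out the leading-alignment and loop-unrolling variants discussed elsewhere in the thread.

```java
final class Crc64SliceBy4Sketch {
    // Reflected form of the ECMA-182 polynomial 0x42F0E1EBA9EA3693.
    private static final long POLY = 0xC96C5795D7870F42L;
    private static final long[][] TABLE = new long[4][256];

    static {
        // TABLE[0] is the ordinary byte-at-a-time lookup table.
        for (int b = 0; b < 256; ++b) {
            long r = b;
            for (int k = 0; k < 8; ++k) {
                r = (r & 1) != 0 ? (r >>> 1) ^ POLY : r >>> 1;
            }
            TABLE[0][b] = r;
        }
        // TABLE[s][b] is TABLE[s - 1][b] advanced by one further zero byte.
        for (int s = 1; s < 4; ++s) {
            for (int b = 0; b < 256; ++b) {
                long prev = TABLE[s - 1][b];
                TABLE[s][b] = (prev >>> 8) ^ TABLE[0][(int) (prev & 0xFF)];
            }
        }
    }

    private long crc = -1;

    public void update(byte[] buf, int off, int len) {
        final int end = off + len;
        int i = off;

        // Main loop: fold four input bytes into the CRC per iteration.
        for (final int end4 = end - 3; i < end4; i += 4) {
            final int tmp = (int) crc;
            crc = TABLE[3][(tmp & 0xFF) ^ (buf[i] & 0xFF)]
                    ^ TABLE[2][((tmp >>> 8) & 0xFF) ^ (buf[i + 1] & 0xFF)]
                    ^ (crc >>> 32)
                    ^ TABLE[1][((tmp >>> 16) & 0xFF) ^ (buf[i + 2] & 0xFF)]
                    ^ TABLE[0][(tmp >>> 24) ^ (buf[i + 3] & 0xFF)];
        }

        // Tail: the remaining 0 to 3 bytes, one at a time.
        while (i < end) {
            crc = TABLE[0][(buf[i++] & 0xFF) ^ ((int) crc & 0xFF)] ^ (crc >>> 8);
        }
    }

    public long getValue() {
        return ~crc;
    }
}
```

Feeding the ASCII bytes of `123456789` through `update` should yield the published CRC-64/XZ check value `0x995DC9BBDF1939FA`, which is a convenient sanity check when trying out alternative loop structures.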

Re: [xz-devel] java crc64 implementation

2021-01-19 Thread Lasse Collin
On 2021-01-13 Brett Okken wrote:
> Mark Adler has posted an optimized crc64 implementation on
> stackoverflow[1]. This can be reasonably easily ported to java (that
> post has a link to java impl on github[2] which warrants a little
> clean up, but gives a decent idea).
>
> I did a quick