Re: [xz-devel] xz-java and newer java

2024-03-20 Thread Brett Okken
being more modular. 2. Allow specifying the implementation to use with a system property. This would be unlikely to be used outside of benchmarking, but would provide options for users on unusual hardware. Brett On Tue, Mar 12, 2024 at 12:55 PM Lasse Collin wrote: > On 2024-03-12 Bre
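A rough sketch of what selecting an implementation through a system property could look like. The property name `example.arraycomparison`, the `Impl` enum, and the fallback to a portable default are all assumptions made for illustration; none of them are taken from the xz-java patch.

```java
import java.util.Locale;

final class ImplSelectorSketch {
    // Hypothetical set of implementations; the real patch may offer others.
    enum Impl { PORTABLE, VECTORIZED }

    /** Maps a property value to an implementation, falling back to PORTABLE. */
    static Impl choose(String propertyValue) {
        if (propertyValue == null) {
            return Impl.PORTABLE;
        }
        try {
            return Impl.valueOf(propertyValue.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            // Unknown value: keep the portable default rather than failing.
            return Impl.PORTABLE;
        }
    }

    public static void main(String[] args) {
        // Hypothetical property name, e.g. -Dexample.arraycomparison=vectorized
        System.out.println(choose(System.getProperty("example.arraycomparison")));
    }
}
```

Falling back silently keeps the override harmless for normal users, which fits the stated expectation that it would mostly be used for benchmarking.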

Re: [xz-devel] xz-java and newer java

2024-03-12 Thread Brett Okken
will get opportunity to test out arm64. That could be awhile yet. I do have some things still on jdk 8, but only decompression. Surveys seem to indicate quite a bit of jdk 8 still in use, but I have no personal need. Brett On Sun, Mar 10, 2024 at 2:49 PM Lasse Collin wrote: > On 2024-03-09 Br

Re: [xz-devel] xz-java and newer java

2024-03-09 Thread Brett Okken
When I tested graviton2 (arm64) previously, Arrays.mismatch was better than comparing longs using a VarHandle. The benefits are definitely with content that compresses more - because there are more long matches. I do like Unsafe as an option for jdk 8 users on x86 or arm64. On Sat, Mar 9, 2024

Re: [xz-devel] xz-java and newer java

2024-03-05 Thread Brett Okken
I have added a comment to the PR with updated benchmark results: https://github.com/tukaani-project/xz-java/pull/13#issuecomment-1977705691 On Fri, Mar 1, 2024 at 6:23 AM Brett Okken wrote: > > I found and resolved the difference: > https://github.com/tukaani-project/xz-java/pull/1

Re: [xz-devel] xz-java and newer java

2024-03-01 Thread Brett Okken
On Thu, Feb 29, 2024 at 8:47 PM Brett Okken wrote: > > > Thanks! Ideally there would be one commit to add the minimal portable > > version, then separate commits for each optimized variant. > > Would you like me to remove the Unsafe based impl from > https://github.com/tukaani

Re: [xz-devel] xz-java and newer java

2024-02-29 Thread Brett Okken
or in one or the other. Brett On Thu, Feb 29, 2024 at 11:35 AM Lasse Collin wrote: > > On 2024-02-25 Brett Okken wrote: > > I created https://github.com/tukaani-project/xz-java/pull/13 with the > > bare bones changes to utilize a utility for array comparisons and an > > U

Re: [xz-devel] xz-java and newer java

2024-02-25 Thread Brett Okken
> > On 2024-02-19 Brett Okken wrote: > > I have created a pr to the GitHub project. > > > > https://github.com/tukaani-project/xz-java/pull/12 > > Thanks! It could be good to split into smaller commits to make reviewing > easier. > > > It is not clear to me if

Re: [xz-devel] Re: improve java delta performance

2024-02-19 Thread Brett Okken
I have created a pr to the GitHub project with these changes. https://github.com/tukaani-project/xz-java/pull/11/files Thanks, Brett On Thu, Mar 31, 2022 at 4:33 PM Lasse Collin wrote: > > On Thu, May 6, 2021 at 4:18 PM Brett Okken > > wrote: > > > > > Th

Re: [xz-devel] xz-java and newer java

2024-02-19 Thread Brett Okken
I have created a pr to the GitHub project. https://github.com/tukaani-project/xz-java/pull/12 It is not clear to me if that is actually seeing active dev on the Java project yet. Thanks, Brett On Sat, Feb 12, 2022 at 11:45 AM Brett Okken wrote: > Can this be taken up again? > > On

Re: [xz-devel] Question about using Java API for geospatial data

2022-07-10 Thread Brett Okken
> I'm not sure that this is authoritative. The Java API documentation > says that it "aims" to provide "Full support for the .xz file format > specification version 1.0.4" I am not certain which statement you believe is not authoritative. There are existing constructors (such as[1]) which allow

Re: [xz-devel] Question about using Java API for geospatial data

2022-07-09 Thread Brett Okken
What version of xz are you using? The differences between xz and lzma are a bit more involved. One such example is that xz is a framed format which includes checksums on each “frame”. I would not expect checksum verification to account for all of that difference, but it can be disabled to

Re: [xz-devel] XZ for Java

2022-05-19 Thread Brett Okken
like multithreaded encoding / decoding and a > > few updates that Brett Okken had submitted (but are still waiting for > > merge). Should I add these things to only my local version, or is > > there a plan for these things in the future? > > Brett Okken's patches I haven't r

[xz-devel] Re: improve java delta performance

2022-02-12 Thread Brett Okken
Can this be reviewed? On Thu, May 6, 2021 at 4:18 PM Brett Okken wrote: > These changes reduce the time of DeltaEncoder by ~65% and DeltaDecoder > by ~40%, assuming using arrays that are several KB in size. >

Re: [xz-devel] xz-java and newer java

2022-02-12 Thread Brett Okken
Can this be taken up again? On Wed, Mar 24, 2021 at 6:20 AM Brett Okken wrote: > I grabbed an older version in the last mail. This is the updated > version for aarch64. >

[xz-devel] improve java delta performance

2021-05-06 Thread Brett Okken
These changes reduce the time of DeltaEncoder by ~65% and DeltaDecoder by ~40%, assuming using arrays that are several KB in size. diff --git a/src/org/tukaani/xz/delta/DeltaCoder.java b/src/org/tukaani/xz/delta/DeltaCoder.java index d94eb66..ccb702d 100644 --- a/src/org/tukaani/xz/delta/DeltaCoder.java

Re: [xz-devel] xz-java and newer java

2021-03-24 Thread Brett Okken
I grabbed an older version in the last mail. This is the updated version for aarch64. [Attachment: ArrayUtil.java]

Re: [xz-devel] xz-java and newer java

2021-03-23 Thread Brett Okken
I was able to test on AWS graviton2 instances (aarch64), but only with jdk 15. The results show that the vectorized approach appears the best option, though long comparisons are also an improvement over baseline. Based on this, I made a small change to ArrayUtil to, by default, use unsafe long

Re: [xz-devel] Re: java LZDecoder small improvement

2021-03-01 Thread Brett Okken
> With a quick try I got a feeling that my worry about short repeats was > wrong. It doesn't matter because decoding each LZMA symbol is much more > expensive. What matters is avoiding multiple tiny arraycopy calls > within a single run of the repeat method, and that problem was already > solved.

Re: [xz-devel] java array cache fill

2021-02-19 Thread Brett Okken
I learned the wrong lesson from LZDecoder. The pattern of System.arraycopy calls with doubling sizes was better than byte-by-byte copies in a loop, but there was not really a direct comparison to Arrays.fill. The single-byte repeating case was close. Hotspot must be doing something interesting with Arrays.fill,

Re: [xz-devel] xz-java and newer java

2021-02-19 Thread Brett Okken
I have attached updated patches and ArrayUtil.java. HC4 needed changes/optimizations in both locations. I also found a better way to handle BT4 occasionally sending -1 as the length. diff --git a/src/org/tukaani/xz/lz/BT4.java b/src/org/tukaani/xz/lz/BT4.java index 6c46feb..7d78aef 100644 ---

Re: [xz-devel] xz-java and newer java

2021-02-16 Thread Brett Okken
On Tue, Feb 16, 2021 at 12:48 PM Lasse Collin wrote: > > I quickly tried these with "XZEncDemo 2". I used the preset 2 because > that uses LZMAEncoderFast instead of LZMAEncoderNormal where the > negative lengths result in a crash. I updated the mismatch method to check for negative lengths

[xz-devel] java array cache fill

2021-02-16 Thread Brett Okken
We found in LZDecoder that using System.arraycopy with doubling size is faster than Arrays.fill (especially for larger arrays). We can apply that knowledge in the BasicArrayCache, where there are some use cases which require clearing out the array prior to returning it. diff --git

Re: [xz-devel] jdk9+ CRC64

2021-02-14 Thread Brett Okken
On Sun, Feb 14, 2021 at 9:30 AM Lasse Collin wrote: > On 2021-02-13 Brett Okken wrote: > > We can make it look even more like liblzma :) > > It can be done but I'm not sure yet if it should be done. Your > implementation looks very neat though. :-) > > > In my benc

Re: [xz-devel] jdk9+ CRC64

2021-02-13 Thread Brett Okken
We can make it look even more like liblzma :) In my benchmark I observe no negative impact of using the functions. Which is to say that this is still 5-7% faster than the byte-by-byte approach. public class CRC64 extends Check { private static final VarHandle INT_HANDLE =

Re: [xz-devel] Re: java LZDecoder small improvement

2021-02-13 Thread Brett Okken
On Thu, Feb 11, 2021 at 12:51 PM Lasse Collin wrote: > > On 2021-02-05 Brett Okken wrote: > > I worked this out last night. We need to double how much we copy each > > time by not advancing "back". This actually works even better than > > Arrays.

[xz-devel] jdk9+ CRC64

2021-02-06 Thread Brett Okken
the decompression of the repeating single byte by ~1%. /* * CRC64 * * Authors: Brett Okken * Lasse Collin * * This file has been put into the public domain. * You can do whatever you want with this file. */ package org.tukaani.xz.check; import java.lang.invoke.MethodHandles; import

Re: [xz-devel] Re: java LZDecoder small improvement

2021-02-06 Thread Brett Okken
Here is a patch for changes. The benchmark results follow. diff --git a/src/org/tukaani/xz/lz/LZDecoder.java b/src/org/tukaani/xz/lz/LZDecoder.java index 85b2ca1..565209a 100644 --- a/src/org/tukaani/xz/lz/LZDecoder.java +++ b/src/org/tukaani/xz/lz/LZDecoder.java @@ -12,6 +12,7 @@ package

Re: [xz-devel] java crc64 implementation

2021-02-05 Thread Brett Okken
This had /way/ more impact than I expected on overall decompression performance. Here are the baseline numbers for 1.8 (jdk 11 64bit): Benchmark (file) Mode Cnt Score Error Units XZDecompressionBenchmark.decompress ihe_ovly_pr.dcm avgt 3 0.731 ± 0.010

Re: [xz-devel] java LZMA2OutputStream changes

2021-02-05 Thread Brett Okken
> > Now that there is a 6 byte chunkHeader, could the 1 byte tempBuf be > > removed? > > It's better to keep it. It would be confusing to use the same buffer in > write(int) and writeChunk(). At glance it would look like that > writeChunk() could be overwriting the input. I assumed that

Re: [xz-devel] java crc64 implementation

2021-02-05 Thread Brett Okken
On Fri, Feb 5, 2021 at 11:07 AM Lasse Collin wrote: > > On 2021-02-02 Brett Okken wrote: > > Thus far I have only tested on jdk 11 64bit windows, but the fairly > > clear winner is: > > > > public void update(byte[] buf, int off, int len) { > >

[xz-devel] java LZMA2OutputStream changes

2021-02-05 Thread Brett Okken
After recent changes, the LZMA2OutputStream class no longer uses DataOutputStream, but the import statement is still present. Now that there is a 6 byte chunkHeader, could the 1 byte tempBuf be removed?

Re: [xz-devel] Re: java LZDecoder small improvement

2021-02-05 Thread Brett Okken
> With a file with two-byte repeat ("ababababababab"...) it's 50 % slower > than the baseline. Calling arraycopy in a loop, copying two bytes at a > time, is not efficient. I didn't try look how big the copy needs to be > to make the overhead of arraycopy smaller than the benefit but clearly > it

Re: [xz-devel] Re: java LZDecoder small improvement

2021-02-03 Thread Brett Okken
I still need to do more testing across jdk 8 and 15, but initial returns on this are pretty positive. The repeating byte file is meaningfully faster than baseline. One of my test files (image1.dcm) does not improve much from baseline, but the other 2 files do. diff --git

Re: [xz-devel] Re: java LZDecoder small improvement

2021-02-03 Thread Brett Okken
On Wed, Feb 3, 2021 at 2:56 PM Lasse Collin wrote: > > On 2021-02-01 Brett Okken wrote: > > I have played with this quite a bit and have come up with a slightly > > modified change which does not regress for the smallest of the sample > > objects and shows a nice impr

Re: [xz-devel] xz-java minor read improvements

2021-02-03 Thread Brett Okken
I have not done any testing of xz specifically, but was motivated by https://github.com/openjdk/jdk/pull/542, which showed a pretty noticeable slowdown when biased locking is removed. The specific example there was writing 1 byte at a time being transitioned to writing the 2-8 bytes to a byte[]

Re: [xz-devel] java crc64 implementation

2021-02-02 Thread Brett Okken
I tested jdk 15 64bit and jdk 11 32bit, client and server and the above implementation is consistently quite good. The alternate in running does not do the leading alignment. This version is really close in 64 bit testing and slightly faster for 32 bit. The differences are pretty small, and both

Re: [xz-devel] java crc64 implementation

2021-02-02 Thread Brett Okken
I accidentally hit reply instead of reply all. > > Shouldn't that be (i & 3) != 0? > > An offset of 0 should not enter this loop, but 0 & 3 does not equal 1. > > The idea really is that offset of 1 doesn't enter the loop, thus the > main slicing-by-4 loop is misaligned. I don't know why it makes

[xz-devel] Re: java LZDecoder small improvement

2021-02-01 Thread Brett Okken
I have played with this quite a bit and have come up with a slightly modified change which does not regress for the smallest of the sample objects and shows a nice improvement for the 2 larger files. Here is baseline benchmark on 1.8: jdk 11 64 bit 1.8 BASELINE Benchmark

Re: [xz-devel] xz-java and newer java

2021-01-31 Thread Brett Okken
Comparison}. * * * @author Brett Okken */ public final class ArrayUtil { /** * Enumerated options for controlling implementation of how to compare arrays. */ public static enum ArrayComparison { /** * Uses {@code VarHandle} for {@code int

[xz-devel] xz-java minor read improvements

2021-01-29 Thread Brett Okken
Here are some small improvements when creating new BlockInputStream instances. This reduces the size of the byte[] for the block header to the actual size and replaces use of ByteArrayInputStream, which has synchronized methods, with a ByteBuffer, which provides the same functionality without

[xz-devel] java buffer writes

2021-01-29 Thread Brett Okken
There are several places where single byte writes are being done during compression. Often this is going to an OutputStream with synchronized write methods. Historically that has not mattered much because of biased locking. However, biased locking is being removed[1]. These changes will batch
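To illustrate the idea of batching, here is a minimal sketch that stages four bytes in a scratch array and issues one `write(byte[], int, int)` call instead of four `write(int)` calls. The class name, the helper `writeIntLE`, and the little-endian layout are assumptions chosen for the example, not code from the patch.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;

final class BatchedWriteSketch {
    /**
     * Hypothetical helper: writes value as 4 little-endian bytes with a
     * single call on the stream, so a synchronized write method is entered
     * once rather than four times.
     */
    static void writeIntLE(OutputStream out, byte[] scratch, int value)
            throws IOException {
        scratch[0] = (byte) value;
        scratch[1] = (byte) (value >>> 8);
        scratch[2] = (byte) (value >>> 16);
        scratch[3] = (byte) (value >>> 24);
        out.write(scratch, 0, 4);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeIntLE(out, new byte[4], 0x04030201);
        System.out.println(Arrays.toString(out.toByteArray())); // [1, 2, 3, 4]
    }
}
```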

Re: [xz-devel] xz-java and newer java

2021-01-24 Thread Brett Okken
Based on some playing around with unrolling loops as part of the crc64 implementation, I tried unrolling the "legacy" implementation and found it provided some nice improvements. The improvements were most pronounced on 32 bit jdk 11: 32 jdk 11 - LEGACY Benchmark

Re: [xz-devel] xz-java and newer java

2021-01-22 Thread Brett Okken
org.tukaani.xz.ArrayComparison} to a value from {@link ArrayComparison}. * * * @author Brett Okken */ public final class ArrayUtil { /** * Enumerated options for controlling implementation of how to compare arrays. */ public static enum ArrayComparison

Re: [xz-devel] xz-java and newer java

2021-01-22 Thread Brett Okken
diff --git a/src/org/tukaani/xz/lz/BT4.java b/src/org/tukaani/xz/lz/BT4.java index 6c46feb..c96c766 100644 --- a/src/org/tukaani/xz/lz/BT4.java +++ b/src/org/tukaani/xz/lz/BT4.java @@ -11,6 +11,7 @@ package org.tukaani.xz.lz; import org.tukaani.xz.ArrayCache; +import

Re: [xz-devel] java crc64 implementation

2021-01-21 Thread Brett Okken
Here is a slice by 4 implementation. It goes byte by byte to easily be compatible with older jdks. Performance wise, it is pretty comparable to the java port of Adler's stackoverflow implementation: Benchmark Mode Cnt Score Error Units Hash64Benchmark.adler
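For readers unfamiliar with the technique, the following is a sketch of a slice-by-4 CRC64 that reads its input byte by byte, next to a plain table-driven version for comparison. It is not the code from the mail; the class and method names are made up, and the polynomial is the reflected ECMA-182 one used by the .xz format.

```java
final class Crc64SliceBy4Sketch {
    private static final long POLY = 0xC96C5795D7870F42L; // reflected ECMA-182
    private static final long[][] TABLE = new long[4][256];

    static {
        // TABLE[0] is the classic one-byte-at-a-time table.
        for (int b = 0; b < 256; b++) {
            long r = b;
            for (int i = 0; i < 8; i++) {
                r = (r & 1) == 1 ? (r >>> 1) ^ POLY : r >>> 1;
            }
            TABLE[0][b] = r;
        }
        // TABLE[s][b] is TABLE[s-1][b] advanced by one more zero byte.
        for (int s = 1; s < 4; s++) {
            for (int b = 0; b < 256; b++) {
                long prev = TABLE[s - 1][b];
                TABLE[s][b] = (prev >>> 8) ^ TABLE[0][(int) (prev & 0xFF)];
            }
        }
    }

    /** Reference version: one table lookup per input byte. */
    static long bytewise(byte[] buf, int off, int len) {
        long crc = -1L;
        for (int i = off, end = off + len; i < end; i++) {
            crc = TABLE[0][(int) ((crc ^ buf[i]) & 0xFF)] ^ (crc >>> 8);
        }
        return ~crc;
    }

    /** Slice-by-4: consumes four bytes per iteration using four tables. */
    static long sliceBy4(byte[] buf, int off, int len) {
        long crc = -1L;
        int i = off;
        int end = off + len;
        for (int fastEnd = end - 3; i < fastEnd; i += 4) {
            int tmp = (int) crc; // low 32 bits are mixed with the input
            crc = TABLE[3][(tmp ^ buf[i]) & 0xFF]
                ^ TABLE[2][((tmp >>> 8) ^ buf[i + 1]) & 0xFF]
                ^ (crc >>> 32)
                ^ TABLE[1][((tmp >>> 16) ^ buf[i + 2]) & 0xFF]
                ^ TABLE[0][((tmp >>> 24) ^ buf[i + 3]) & 0xFF];
        }
        // Handle the 0-3 trailing bytes the usual way.
        for (; i < end; i++) {
            crc = TABLE[0][(int) ((crc ^ buf[i]) & 0xFF)] ^ (crc >>> 8);
        }
        return ~crc;
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(java.nio.charset.StandardCharsets.US_ASCII);
        System.out.println(sliceBy4(data, 0, data.length) == bytewise(data, 0, data.length));
    }
}
```

Reading individual bytes avoids `VarHandle` or `Unsafe`, which is what keeps this variant usable on older JDKs.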

Re: [xz-devel] xz-java and newer java

2021-01-21 Thread Brett Okken
> Have you tested with 32-bit Java too? It's quite possible that it's > better to use ints than longs on 32-bit system. If so, that should be > detected at runtime too, I guess. I have now run benchmarks using the 32bit jre on 64bit windows system. That actually introduces additional interesting

Re: [xz-devel] xz-java and newer java

2021-01-16 Thread Brett Okken
java.lang.reflect.Method; import java.nio.ByteOrder; import java.util.logging.Level; import java.util.logging.Logger; /** * Utilities for optimized array interactions. * * @author Brett Okken */ public final class ArrayUtil { /** * MethodHandle to the actual mismatch method to use at runtime

[xz-devel] java crc64 implementation

2021-01-13 Thread Brett Okken
Mark Adler has posted an optimized crc64 implementation on stackoverflow[1]. This can be reasonably easily ported to java (that post has a link to java impl on github[2] which warrants a little clean up, but gives a decent idea). I did a quick benchmark calculating the crc64 over 8KB and the

Re: [xz-devel] xz-java and newer java

2021-01-13 Thread Brett Okken
public int getMatchLen(int forward, int dist, int lenLimit) { final int curPos = readPos + forward; final int backPos = curPos - dist - 1; return ArrayUtil.mismatch(buf, curPos, buf, backPos, lenLimit); } On Tue, Jan 12, 2021 at 10:17 AM Brett Okken wrote: > >

Re: [xz-devel] xz-java and newer java

2021-01-12 Thread Brett Okken
lower > than comparing ints if the mismatch occurs in the first 4 bytes. > > I wrote this test using jdk 9 VarHandle to read the ints and longs > from the byte[], but the same thing can be achieved using > sun.misc.Unsafe. I will add that as a case in the benchmark, but it is > e

Re: [xz-devel] xz-java and newer java

2021-01-11 Thread Brett Okken
, but it is expected to be similar to VarHandle (maybe slightly faster). Brett On Mon, Jan 11, 2021 at 10:04 AM Lasse Collin wrote: > > On 2021-01-09 Brett Okken wrote: > > This would seem to be a potential candidate for a multi-release > > jar[1], if you can figure out

Re: [xz-devel] xz-java and newer java

2021-01-09 Thread Brett Okken
hole class could be handled for the MR jar. [1] - https://openjdk.java.net/jeps/238 Thanks, Brett On Fri, Jan 8, 2021 at 1:36 PM Lasse Collin wrote: > > On 2021-01-08 Brett Okken wrote: > > Are there any plans to update xz-java to take advantage of newer > > features in jdk 9+?

[xz-devel] java LZDecoder small improvement

2021-01-08 Thread Brett Okken
The repeat method in LZDecoder[1] currently copies individual bytes in a loop. This could be changed to do batch copies: do { //it is possible for the "repeat" to include content which is going to be generated here //so we have to limit ourselves to how much data is
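As an illustration only, the batch-copy idea can be sketched on a plain (non-circular) buffer. The real LZDecoder uses a circular dictionary and has extra bounds handling, so the method below is a simplified stand-in with made-up names, not the proposed patch.

```java
final class RepeatSketch {
    /**
     * Appends len bytes at pos by repeating the data that starts dist + 1
     * bytes before pos. Returns the new write position.
     */
    static int repeat(byte[] buf, int pos, int dist, int len) {
        final int back = pos - dist - 1;
        while (len > 0) {
            // Only bytes in [back, pos) exist so far, so never copy more
            // than that. Because back is not advanced, the available run
            // grows with each pass and the copies get larger.
            int n = Math.min(len, pos - back);
            System.arraycopy(buf, back, buf, pos, n);
            pos += n;
            len -= n;
        }
        return pos;
    }

    public static void main(String[] args) {
        byte[] buf = new byte[8];
        buf[0] = 'a';
        buf[1] = 'b';
        int end = repeat(buf, 2, 1, 6);
        System.out.println(new String(buf, 0, end, java.nio.charset.StandardCharsets.US_ASCII));
    }
}
```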

[xz-devel] xz-java and newer java

2021-01-08 Thread Brett Okken
Are there any plans to update xz-java to take advantage of newer features in jdk 9+? For example, Arrays.mismatch[1] leverages vectorized comparisons of 2 byte[]. This could be leveraged in the getMatches methods of BT4 and HC4 as well as the 2 getMatchLen methods in LZEncoder. Another example
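A minimal sketch of how `Arrays.mismatch` (Java 9+) could compute a match length within a single buffer. The helper name `matchLen` and its parameters are hypothetical and only loosely modeled on the methods mentioned above; the caller is assumed to guarantee that both ranges lie inside the array.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class MismatchSketch {
    /**
     * Hypothetical helper: length of the common run starting at buf[curPos]
     * and buf[backPos], capped at lenLimit.
     */
    static int matchLen(byte[] buf, int curPos, int backPos, int lenLimit) {
        // Arrays.mismatch returns the relative index of the first differing
        // byte, or -1 when the two ranges are identical.
        int i = Arrays.mismatch(buf, curPos, curPos + lenLimit,
                                buf, backPos, backPos + lenLimit);
        return i < 0 ? lenLimit : i;
    }

    public static void main(String[] args) {
        byte[] data = "abcabcabx".getBytes(StandardCharsets.US_ASCII);
        System.out.println(matchLen(data, 3, 0, 6)); // 5
    }
}
```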