On 04-13-2011 3:00 PM, Mike Duigou wrote:

Mike, can you share the results of performance testing at various compression
levels? Is there much difference between the levels or an apparent "sweet spot"?

For low hanging fruit for jdk 7 it might be worth considering raising the
default compression level from 5 to 6 (the zlib default). Raising the level
from 5 to 6 entails (by today's standards) very modest increases in the amount
of memory and effort used (16Kb additional buffer space for compression). In
general zlib reflects size choices that are almost 20 years old and it may be
of no measurable benefit to be using the lower compression levels.

Mike (also)

Hi Mike,

zlib 1.2.3's zlib.h states that the default is level 6. I've not checked or
run any test to verify that its implementation matches its docs; is there any
reason you think zlib is actually using level 5 as the default? I can take a
look into it further...

...
ZEXTERN int ZEXPORT deflateInit OF((z_streamp strm, int level));

     Initializes the internal stream state for compression. The fields
   zalloc, zfree and opaque must be initialized before by the caller.
   If zalloc and zfree are set to Z_NULL, deflateInit updates them to
   use default allocation functions.

   The compression level must be Z_DEFAULT_COMPRESSION, or between 0 and 9:
   1 gives best speed, 9 gives best compression, 0 gives no compression at
   all (the input data is simply copied a block at a time).
   Z_DEFAULT_COMPRESSION requests a default compromise between speed and
   compression (currently equivalent to level 6).
...
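If it helps, one quick way to check empirically is to deflate the same sample
data at the default level and at each explicit level, and see which level
matches byte for byte. A minimal sketch (the class name and sample data are
mine, and it assumes the compressed output fits in a single deflate() call):

    import java.util.Arrays;
    import java.util.zip.Deflater;

    public class DefaultLevelCheck {

        // Deflate the whole input in one call; the output buffer is sized
        // generously enough for the sample data used below.
        static byte[] deflate(byte[] input, int level) {
            Deflater def = new Deflater(level);
            def.setInput(input);
            def.finish();
            byte[] out = new byte[input.length * 2 + 64];
            int n = def.deflate(out);
            def.end();
            return Arrays.copyOf(out, n);
        }

        public static void main(String[] args) {
            byte[] sample = new byte[1 << 20];
            for (int i = 0; i < sample.length; i++)
                sample[i] = (byte) (i % 251);      // compressible, non-trivial
            byte[] dflt = deflate(sample, Deflater.DEFAULT_COMPRESSION);
            for (int level = 1; level <= 9; level++)
                if (Arrays.equals(dflt, deflate(sample, level)))
                    System.out.println("default output matches level " + level);
        }
    }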


-Sherman



On Apr 12 2011, at 17:48, [email protected] wrote:

Hi Sherman,
I have had a quick look at the current code to see what 'low hanging fruit'
there is. I appreciate that parallelizing the command in its entirety may not
be feasible for the first release.

The tests that I have run jar the content of the 1.7 rt.jar with varying
compression levels. Each jar is run as a child process 6 times and the average
of the last 5 is taken. The tests are pretty much CPU bound on a single core.

1. The performance of the cf0 (create file with no compression) path can be
improved for the general case if the file is buffered. In my test env
(Windows 7 64-bit) this equates to a 17% time improvement in my tests. In the
existing code each file is read twice: once to calculate the CRC and once to
write the file to the Jar file. This change would buffer a single small file
at a time (size < 100K), as in the sketch below.
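A minimal sketch of what I mean for the STORED path (the helper name is mine):
a STORED entry needs its size and CRC before the data is written, so reading a
small file into memory once lets the same bytes serve both passes.

    import java.io.File;
    import java.io.IOException;
    import java.nio.file.Files;
    import java.util.zip.CRC32;
    import java.util.zip.ZipEntry;
    import java.util.zip.ZipOutputStream;

    // Hypothetical helper: read a small file once, reuse the bytes for both
    // the CRC calculation and the write, instead of reading the file twice.
    static void writeStoredEntry(ZipOutputStream zos, File f, String name)
            throws IOException {
        byte[] content = Files.readAllBytes(f.toPath());   // the single read
        CRC32 crc = new CRC32();
        crc.update(content);
        ZipEntry e = new ZipEntry(name);
        e.setMethod(ZipEntry.STORED);        // STORED needs size+CRC up front
        e.setSize(content.length);
        e.setCrc(crc.getValue());
        zos.putNextEntry(e);
        zos.write(content);                  // reuse the same buffer
        zos.closeEntry();
    }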

2. It is also a trivial fix to allow different compression levels, rather than
just stored and the default; see the sketch below.
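Something along these lines; ZipOutputStream already accepts any zlib level,
so the jar tool only needs to pass a value through (the flag handling is
hypothetical):

    import java.io.FileOutputStream;
    import java.util.zip.Deflater;
    import java.util.zip.ZipOutputStream;

    // Hypothetical wiring: the level would come from a new command-line flag.
    ZipOutputStream zos = new ZipOutputStream(new FileOutputStream("out.jar"));
    int level = Deflater.BEST_SPEED;   // 1; any value 0-9 is accepted
    zos.setLevel(level);               // applies to subsequent DEFLATED entries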

After that it is harder to gain performance improvements without structural
change or more discussion

3. The largest saving after that is in the management of the 'entries' Set, as
the hash code of a File is expensive (this may not apply on other filesystems);
the management of this set seems to account for more CPU than the Deflater!
I cannot see the reason for this data being collected. I am probably missing
the obvious ...
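To illustrate the cost (the names here are mine, and the detail is
Windows-specific): File.hashCode() on Windows lower-cases the whole path on
every call, whereas a String key hashes once and caches the result.

    import java.io.File;
    import java.util.HashSet;
    import java.util.Set;

    // Illustrative only: keying the set on the path String avoids re-hashing
    // (and, on Windows, re-lower-casing) the File on every add/contains.
    Set<String> entryNames = new HashSet<String>();
    File f = new File("C:\\work\\project\\Foo.class");
    entryNames.add(f.getPath());              // String caches its hash code
    boolean duplicate = !entryNames.add(f.getPath());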

4. After that there is just the parallelisation of the jar utility and the
underlying stream.

What is the best way to proceed?

regards

Mike



________________________________
From: Xueming Shen <[email protected]>
To: [email protected]
Sent: Wednesday, 6 April, 2011 19:04:25
Subject: Re: proposal to optimise the performance of the Jar utility

Hi Mike,

We are in the final stage of the JDK7 development; work like this is unlikely
to get in at the last minute. I have filed a CR/RFE to track this issue, and
we can use this CR as the starting point for the discussion, targeting some
jdk7 update releases or JDK8.

7034403: proposal to optimise the performance of the Jar utility

Thanks,
Sherman


On 04/05/2011 04:42 PM, [email protected] wrote:
Hi,
Not sure if this is too late for Java 7, but I have made some optimisations
for a client to improve the performance of the jar utility in their
environment, and would like to promote them into the main code base.

The optimisations that I have performed are

1. Allowing the Jar utility to use other compression levels (currently it
allows the default (5) only)
2. Multi-threading and pipelining the file information and access
3. Multi-threading the compression and file writing

A little background:
As part of the development process where I work, we regularly jar the content
of the working projects as part of the distribution to remote systems. This is
a large and complex code base of 6 million LOC and growing. The Jar file ends
up compressed to approx 100MB; uncompressed the jar size is approx 245MB,
about 4-5 times the size of rt.jar.

I was looking at ways to improve the performance, as this activity occurs
several times a day for dozens of developers.

In essence, when compressing a new jar file the jar utility is single threaded
and staged. Forgive me if this is an oversimplification:

First it works out all of the files that are specified, buffering the file
names (IO bound). Then it iterates through the files, and for each file it
loads the file information and then the file content, sending it to a
JarOutputStream (CPU bound or IO bound depending on the IO speed).

The JarOutputStream has a compression level of 0 (just store) or 5 (the
default), and the jar writing is single threaded by the design of the
JarOutputStream.

The process of creating a Jar took about 20 seconds on Windows with the help
of an SSD, and considerably longer without one, and was CPU bound to one CPU
core.

----
The changes that I made were:
1. Allow different compression levels (for us a compression level of 1
increases the file size of the Jar to 110MB but reduces the CPU load in
compression to approx 30% of what it was (rough estimate)).
2. Pipelining the file access (a sketch of the shape follows this list):
2.1  One thread is started for each file root (-C on the Jar command line),
which scans for files and places the file information into a blocking queue
(Q1), which I set to an arbitrary size of 200 items.
2.2  One thread pool of 10 threads reads the file information from the queue
(Q1), buffers the file content up to a specified size (again I specified an
arbitrary limit of 25K for a file), and places the buffered content into a
queue (Q2) (again an arbitrary size of 10 items).
2.3  One thread takes the file content from Q2 and compresses it or checksums
it and adds it to the JarOutputStream. This process is single threaded due to
the design of the JarOutputStream.
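A minimal sketch of that three-stage shape; all names are mine, and error
handling, multiple -C roots, and the small-file threshold are omitted:

    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.Iterator;
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;
    import java.util.jar.JarOutputStream;
    import java.util.stream.Stream;
    import java.util.zip.ZipEntry;

    public class JarPipelineSketch {

        static final class Item {            // a buffered file, ready to write
            final String name;
            final byte[] content;
            Item(String name, byte[] content) { this.name = name; this.content = content; }
        }

        public static void main(String[] args) throws Exception {
            final Path root = Paths.get(args[0]);
            final BlockingQueue<Path> q1 = new ArrayBlockingQueue<>(200); // Q1: scanned paths
            final BlockingQueue<Item> q2 = new ArrayBlockingQueue<>(10);  // Q2: buffered content
            final Path scanDone = Paths.get("");                 // sentinel for readers
            final Item writeDone = new Item("", null);           // sentinel for the writer
            final int nReaders = 10;

            // Stage 1: one scanner thread per file root feeds paths into Q1.
            Thread scanner = new Thread(() -> {
                try (Stream<Path> files = Files.walk(root)) {
                    Iterator<Path> it = files.filter(Files::isRegularFile).iterator();
                    while (it.hasNext()) q1.put(it.next());
                    for (int i = 0; i < nReaders; i++) q1.put(scanDone);
                } catch (IOException | InterruptedException e) {
                    throw new RuntimeException(e);
                }
            });

            // Stage 2: a pool of reader threads buffers file content into Q2.
            Thread[] readers = new Thread[nReaders];
            for (int i = 0; i < nReaders; i++) {
                readers[i] = new Thread(() -> {
                    try {
                        for (Path p = q1.take(); p != scanDone; p = q1.take()) {
                            String name = root.relativize(p).toString()
                                              .replace('\\', '/'); // zip wants '/'
                            q2.put(new Item(name, Files.readAllBytes(p)));
                        }
                    } catch (IOException | InterruptedException e) {
                        throw new RuntimeException(e);
                    }
                });
            }

            // Stage 3: a single writer drains Q2, since JarOutputStream only
            // supports one writer by design.
            Thread writer = new Thread(() -> {
                try (JarOutputStream jos =
                         new JarOutputStream(new FileOutputStream("out.jar"))) {
                    for (Item item = q2.take(); item != writeDone; item = q2.take()) {
                        jos.putNextEntry(new ZipEntry(item.name));
                        jos.write(item.content);
                        jos.closeEntry();
                    }
                } catch (IOException | InterruptedException e) {
                    throw new RuntimeException(e);
                }
            });

            scanner.start();
            for (Thread t : readers) t.start();
            writer.start();
            for (Thread t : readers) t.join();   // all content buffered and queued
            q2.put(writeDone);                   // then tell the writer to finish
            writer.join();
        }
    }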

Some other minor performance gain came from increasing the buffer on the
output stream to reduce the IO load.
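For example (the 256K figure is just an illustrative value, not the one I
used):

    import java.io.BufferedOutputStream;
    import java.io.FileOutputStream;
    import java.util.jar.JarOutputStream;

    // A larger buffer between the deflater and the file reduces the number
    // of small write() calls reaching the OS.
    JarOutputStream jos = new JarOutputStream(
            new BufferedOutputStream(new FileOutputStream("out.jar"), 256 * 1024));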

The end result is that the process takes approx 5 seconds in the same
configuration.

The above has been in use in a production configuration for a few months now.

As a home project I have completed some enhancements to the JarOutputStream
and produced a JarWriter that allows multiple threads to work concurrently,
deflating or calculating checksums. It seems to test OK for the test cases
that I have generated, and successfully loads my quad core home dev machine
on all cores. Each thread allocates a buffer, and the thread compresses a
file into the buffer, only blocking other threads when the buffer is written
to the output (which is after the compression is complete, unless the file is
too large to compress into the buffer). A sketch of the idea follows below.
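Not the actual JarWriter, just a minimal sketch of the concurrency idea (the
names are mine, and the real zip bookkeeping of local headers and the central
directory is omitted):

    import java.io.IOException;
    import java.io.OutputStream;
    import java.util.zip.CRC32;
    import java.util.zip.Deflater;

    // Sketch only: each calling thread deflates into a private buffer; the
    // shared output is locked only for the final, already-compressed write.
    public class ConcurrentDeflateSketch {
        private final OutputStream out;

        public ConcurrentDeflateSketch(OutputStream out) { this.out = out; }

        public void writeRecord(String name, byte[] content) throws IOException {
            CRC32 crc = new CRC32();
            crc.update(content);             // checksum on the caller's thread

            Deflater def = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
            def.setInput(content);
            def.finish();
            // worst-case raw-deflate output is slightly larger than the input
            byte[] buf = new byte[content.length + (content.length >> 9) + 64];
            int n = 0;
            while (!def.finished())
                n += def.deflate(buf, n, buf.length - n);
            def.end();

            synchronized (out) {             // the only serialised section
                // a real writer emits the zip local file header (name, sizes,
                // CRC) here before the compressed data
                out.write(buf, 0, n);
            }
        }
    }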

This JarWriter is not API compatible with the JarOutputStream; it is not a
stream. It allows the programmer to write a record based on the file
information and an input stream, and is threadsafe. It is not a drop-in
replacement for JarOutputStream.
I am not an expert in the zip file format, but much of the code from
ZipOutputStream is unchanged, just restructured.
---
I did think that there is some scope for improvement that I have not looked
at (for a and b, see the sketch after this list):
a. thresholding of file size for compression (very small files don't compress
well)
b. some file types don't compress well (e.g. png, jpeg) as they have been
compressed already
c. using NIO to parallelise the loading of the file information or content
d. some pre-charging of the deflater dictionary (e.g. a class file contains
the strings of the class name and packages), but this would make the format
incompatible with zip, and require changes to the JVM to be useful, and is a
long way from my comfort zone, or skill set. This would reduce the file size.
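For (a) and (b), the decision could look something like this (the threshold,
the extension list and all names are mine):

    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.Set;
    import java.util.zip.ZipEntry;

    // Hypothetical heuristic: store tiny files and already-compressed
    // formats verbatim, and deflate everything else.
    final class CompressionHeuristic {
        private static final Set<String> PRECOMPRESSED = new HashSet<>(
                Arrays.asList("png", "jpg", "jpeg", "gif", "zip", "jar", "gz"));
        private static final int MIN_DEFLATE_SIZE = 64;   // bytes; a guess

        static int methodFor(String name, long size) {
            int dot = name.lastIndexOf('.');
            String ext = dot < 0 ? "" : name.substring(dot + 1).toLowerCase();
            if (size < MIN_DEFLATE_SIZE || PRECOMPRESSED.contains(ext))
                return ZipEntry.STORED;   // caller must then set size and CRC
            return ZipEntry.DEFLATED;
        }
    }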

--
What is the view of the readers? Is this something, or at least some parts of
this, that could be incorporated into Java 7, or is it too late in the dev
cycle?

regards

Mike
