Hi Mike,

It appears that the patch you provided throws an unexpected exception (attached at the end of this email) when I tried it out on the latest JDK8 repository. Since I only did a quick scan of your patch, I'm not sure what went wrong here.

This patch includes lots of stuff that you are obviously still trying out and testing, as you "warned" us in your email. I can see that it at least tries to

(1) support different compression levels 0-9
(2) write the Zip file in parallel
(3) offer various multi-thread strategies via -Z
(4) use Files.walkFileTree instead of File.list
(5) add the -D option :-) which I would really not recommend
(6) make small optimizations in various places

which makes the code a little hard to read and the resulting data hard to compare. I would suggest dividing this proposal into separate pieces and working on them one by one. For example, maybe we can try to solve the main puzzle (2) + (3) first, and then look at the other optimization opportunities.

To collect some data, I followed your lead and wrote a simple MT support implementation
in the Jar Main class, as shown at

http://cr.openjdk.java.net/~sherman/mtjar2/webrev2/

which I guess is similar to what you are doing. It uses a "simple" strategy

(1) A dedicated thread (from the ExecutorService thread pool) iterates the file system tree to "collect" the target files, submits a "compression job" for each of these files to the ExecutorService, and keeps the returned "Future" (from the submission) in a queue "elist".
(2) Threads from the ExecutorService use temporary buffer memory to read and compress the file in memory.
(3) The main thread polls the queue "elist", waits for each "compression job" to complete, and writes the result into the target ZipOutputStream.
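
The shape of this three-step pipeline can be sketched roughly as follows. This is not the code in the webrev: the class name MtJarSketch, the Loaded holder and the DONE sentinel are made up for illustration. Also, the public ZipOutputStream API cannot accept pre-compressed data, so in this sketch the worker jobs only read each file into memory and the deflation still happens in the writing thread; moving the compression into the workers needs changes inside the zip writer itself.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class MtJarSketch {
    /** Result of one worker job: the entry name plus the file content held in memory. */
    static final class Loaded {
        final String name;
        final byte[] data;
        Loaded(String name, byte[] data) { this.name = name; this.data = data; }
    }

    // Hypothetical sentinel: tells the writer that the tree walk has finished.
    private static final Future<Loaded> DONE = CompletableFuture.completedFuture(null);

    /** Writes every regular file under root into a zip on out; returns the entry count. */
    public static int jar(Path root, OutputStream out, int nThreads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        BlockingQueue<Future<Loaded>> elist = new LinkedBlockingQueue<>();
        try {
            // (1) One pool thread walks the tree and submits one job per file,
            //     keeping the Futures in submission order.
            Future<?> walker = pool.submit(() -> {
                try {
                    Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
                        @Override
                        public FileVisitResult visitFile(Path f, BasicFileAttributes a) {
                            String name = root.relativize(f).toString().replace('\\', '/');
                            // (2) The job runs on a pool thread and buffers the file in memory.
                            elist.add(pool.submit(() -> new Loaded(name, Files.readAllBytes(f))));
                            return FileVisitResult.CONTINUE;
                        }
                    });
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                } finally {
                    elist.add(DONE);
                }
            });

            // (3) The calling thread drains the queue and is the only writer.
            int count = 0;
            try (ZipOutputStream zos = new ZipOutputStream(out)) {
                for (Future<Loaded> f = elist.take(); f != DONE; f = elist.take()) {
                    Loaded e = f.get();           // blocks until that job is complete
                    zos.putNextEntry(new ZipEntry(e.name));
                    zos.write(e.data);
                    zos.closeEntry();
                    count++;
                }
            }
            walker.get();                         // surfaces any failure from the tree walk
            return count;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Because the Futures are consumed in submission order, the entries come out in the same order as the tree walk regardless of which job finishes first.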

The resulting data looks promising. I'm seeing the jar-ing speed doubled when jar-ing the rt.jar and a jdk7 binary tree on a "slow" but 4-core linux vm machine (I have a similar result on a 2-core linux as well)

java Jar cf jdk.jar jdk1.7.0        Jar TotalTime:17278
java Jar cfT1 jdk.jar jdk1.7.0   Jar TotalTime:12345
java Jar cfT2 jdk.jar jdk1.7.0   Jar TotalTime:7559
java Jar cfT3 jdk.jar jdk1.7.0   Jar TotalTime:7572
java Jar cfT4 jdk.jar jdk1.7.0   Jar TotalTime:7801
java Jar cfT5 jdk.jar jdk1.7.0   Jar TotalTime:8112

The new "T" option stands for "n-thread"; the digit that follows specifies the fixed number of threads in the executor service's thread pool. It appears that we can achieve the "best" result with only 3 threads in this configuration: one thread for scanning the file system, one thread for the compression, and the main thread for writing out. My guess is that the fact that we have to "write out" to a single file (the resulting jar) limits the potential benefit of having more "compressing" threads.

I also tried to measure the "file scanning" speed in my mini-benchmark FIter

 http://cr.openjdk.java.net/~sherman/mtjar2/FIter.java

Here are the "surprising" results.

 "nio" is the walkFileTree,
 "io" is the File.list()
 "io2" is the File.listFiles().

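The nio's Files.walkFileTree is about 3 times (Linux) to 8 times (Windows) faster than the "traditional" recursion+File.list(), and up to about 14 times faster than File.listFiles() on Windows.
wow!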
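
For readers without access to FIter.java, the kind of comparison being made looks roughly like the sketch below. It is not the FIter source: the class name ScanSketch and both method names are made up, and it only counts files and sums sizes (no checksum). One visible difference is that walkFileTree hands each file's attributes to the visitor, while the File-based recursion asks the file system again for isDirectory() and length(); whether that accounts for the gap measured here is an assumption, not something this sketch proves.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

public class ScanSketch {
    /** Returns {fileCount, totalSize} using Files.walkFileTree. */
    public static long[] scanNio(Path root) throws IOException {
        final long[] r = new long[2];
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path f, BasicFileAttributes attrs) {
                r[0]++;
                r[1] += attrs.size();   // attributes are passed in by the walker
                return FileVisitResult.CONTINUE;
            }
        });
        return r;
    }

    /** Returns {fileCount, totalSize} using recursion over File.list(). */
    public static long[] scanIo(File dir) {
        long[] r = new long[2];
        scanIo(dir, r);
        return r;
    }

    private static void scanIo(File dir, long[] r) {
        String[] names = dir.list();
        if (names == null) {
            return;                     // not a directory, or an I/O error
        }
        for (String n : names) {
            File f = new File(dir, n);
            if (f.isDirectory()) {      // separate file system query per entry
                scanIo(f, r);
            } else {
                r[0]++;
                r[1] += f.length();     // and another one for the size
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        long t0 = System.currentTimeMillis();
        long[] nio = scanNio(root);
        long t1 = System.currentTimeMillis();
        long[] io = scanIo(root.toFile());
        long t2 = System.currentTimeMillis();
        System.out.println("nio: files=" + nio[0] + " size=" + nio[1] + " time=" + (t1 - t0));
        System.out.println(" io: files=" + io[0] + " size=" + io[1] + " time=" + (t2 - t1));
    }
}
```

Running the second scan right after the first lets it benefit from a warm file system cache, so a fair measurement would alternate the order or use separate runs.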

Linux--------------------------------------------------------------------------
sherman@sherman-linux:~/Workspace/test$ java FIter ../jdk8_mtJar/src
java.io.File iteration
------------------
  nio.totalSize:137149279
        fileNum:12222
       checkSum:16122691809689000
           Time:85
------------------
  io.totalSize:137149279
        fileNum:12222
       checkSum:16122691809689000
          Time:269
------------------
 io2.totalSize:137149279
        fileNum:12222
       checkSum:16122691809689000
          Time:450

Windows7---------------------------------------------------------------------------------

$ /cygdrive/c/Program\ Files\ \(x86\)/Java/jdk1.7.0/bin/java FIter ../sqa/jdk8/src
java.io.File iteration
------------------
  nio.totalSize:136695871
        fileNum:12199
       checkSum:15997350823839479
           Time:323
------------------
  io.totalSize:136695871
        fileNum:12199
       checkSum:15997350823839479
          Time:2633
------------------
 io2.totalSize:136695871
        fileNum:12199
       checkSum:15997350823839479
          Time:4592


----------------------------------------------------------------------

sherman@sherman-linux:~/Workspace/test$ ../jdk8_mtJar/build/linux-i586/bin/jar cf6DZ3 rt0.jar rtjar
duplicate path
java.util.zip.ZipException: duplicate entry: ../
    at java.util.zip.AbstractZipWriter.writeHeader(AbstractZipWriter.java:647)
    at java.util.zip.AbstractZipWriter.startWritingStored(AbstractZipWriter.java:384)
    at java.util.zip.AbstractZipWriter.writeWithResource(AbstractZipWriter.java:350)
    at java.util.zip.AbstractZipWriter.writeAll(AbstractZipWriter.java:273)
    at sun.tools.jar.Main$ZipOutputLoader2File.call(Main.java:410)
    at sun.tools.jar.Main$ZipOutputLoader2File.call(Main.java:350)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
java.util.concurrent.ExecutionException: java.util.zip.ZipException: duplicate entry: ../
    at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252)
    at java.util.concurrent.FutureTask.get(FutureTask.java:111)
    at sun.tools.jar.Main.waitFor(Main.java:810)
    at sun.tools.jar.Main.run(Main.java:679)
    at sun.tools.jar.Main.main(Main.java:1842)
Caused by: java.util.zip.ZipException: duplicate entry: ../

-Sherman

On 10/20/2011 3:55 PM, Mike Skells wrote:
Hi All,
I have some performance updates for the jar tool and for the Zip/Jar writing components, including some code to allow parallel writing of Jar and ZIP files (in java.util).
This work is not finished yet, but I am looking to see if anyone has any views as to the direction this should move in.
Currently it is a testbed for comparing different techniques, but largely based on the Jar utility.

The changes allow the work to be spread across multiple CPUs and optimise some of the code and I/O paths.

These comparative figures do not include the effect of the nio changes that I proposed in earlier emails.

Command line changes
0-9 - I have added support for specifying different compression levels (the existing jar command just allows default compression or '0' for no compression)
D - This allows the files to all be written with the date of now, rather than the file date (the conversion of the date to zip format is a CPU hog, and not needed in some use-cases)
Z0-5 - these are the different mechanisms to allow different parallel execution models - I would not expect this to be a production qualifier
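
As a rough illustration of what a 0-9 level means at the API level, the sketch below uses the existing java.util.zip.ZipOutputStream.setLevel call to compress the same data at each level and report the archive size. It is not code from the patch; the class name LevelSketch, the method name zippedSize and the entry name are made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class LevelSketch {
    /** Zips data as a single entry at the given deflate level (0-9); returns the archive size. */
    public static int zippedSize(byte[] data, int level) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bos)) {
            zos.setLevel(level);                        // applies to entries written after this call
            zos.putNextEntry(new ZipEntry("data.bin")); // hypothetical entry name
            zos.write(data);
            zos.closeEntry();
        }
        return bos.size();
    }

    public static void main(String[] args) throws IOException {
        // Highly repetitive sample data so that the levels have something to compress.
        byte[] data = new byte[1 << 20];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i % 64);
        }
        for (int level = 0; level <= 9; level++) {
            System.out.println("level " + level + ": " + zippedSize(data, level) + " bytes");
        }
    }
}
```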

The test environment is a 4-core Intel Core2 PC running Windows Vista 64; the test case is jar-ing up the content of rt.jar to a jar file. Each test is repeated 6 times and the last 5 are averaged to produce the answers. Each test is run in a fresh VM.
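
The averaging rule (run 6 times, drop the first, average the remaining 5) can be written down as the small sketch below. The class and method names are made up, and it times a task inside one VM rather than launching a fresh VM per test, so it only illustrates the arithmetic and not the actual harness.

```java
public class TimingSketch {
    /** Averages all durations except the first one, which is treated as warm-up. */
    public static double averageDroppingFirst(long[] durationsMs) {
        if (durationsMs.length < 2) {
            throw new IllegalArgumentException("need at least two runs");
        }
        long sum = 0;
        for (int i = 1; i < durationsMs.length; i++) {
            sum += durationsMs[i];
        }
        return (double) sum / (durationsMs.length - 1);
    }

    /** Runs the task the given number of times and returns each duration in ms. */
    public static long[] timeRuns(Runnable task, int runs) {
        long[] result = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            result[i] = (System.nanoTime() - start) / 1_000_000;
        }
        return result;
    }

    public static void main(String[] args) {
        // Placeholder workload; a real test would invoke the jar tool here.
        long[] runs = timeRuns(() -> {
            long x = 0;
            for (int i = 0; i < 50_000_000; i++) {
                x += i;
            }
            if (x == 42) {
                System.out.println(x);  // keeps the loop from being optimised away
            }
        }, 6);
        System.out.println("average of last 5 runs (ms): " + averageDroppingFirst(runs));
    }
}
```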

The performance figures are below as a CSV. The last column is the duration of 
the task in ms.

In summary the existing jar utility takes (for uncompressed, compressed) 8.4, 9.4 seconds to complete and this can be reduced to 1.6, 2.3 seconds.

The different parallel algorithms are
0 - none, all in one thread as before
1 - file scanning in one core, 10 threads loading and buffering files, zip writing in a single thread using the existing ZipOutputStream
2 - file scanning in one core, 10 threads loading and buffering files, zip writing mostly multithreaded (e.g. parallel compression, single write to the output stream)
3 - as 2 but writes to a file rather than a stream
4 - as 2 but uses channels to write with direct buffers
5 - as 4 but using heap buffers

3-5 have the zip capability in the code to seek and update headers that are incomplete, but this is not much tested


C:\Program Files\Java\jdk1.6.0_24\bin\java.exe, C:\Program Files\Java\jdk1.6.0_24\lib\tools.jar, -cf0, java 1.6 rt -cf0, 8482
C:\Program Files\Java\jdk1.6.0_24\bin\java.exe, C:\Program Files\Java\jdk1.6.0_24\lib\tools.jar, -cf, java 1.6 rt -cf, 9318
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\Program Files\Java\jdk1.7.0\lib\tools.jar, -cf0, java 1.7 rt -cf0, 8497
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\Program Files\Java\jdk1.7.0\lib\tools.jar, -cf, java 1.7 rt -cf, 9518
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\Test\Archive\baseline.jar, -cf0, orig 1.7 rt -cf0, 8448
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\Test\Archive\baseline.jar, -cf, orig 1.7 rt -cf, 9484
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf0, project 1.7 rt -cf0, 3133
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf0D, project 1.7 rt -cf0D, 2824
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf0Z0, project 1.7 rt -cf0 parallel 0, 3026
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf0DZ0, project 1.7 rt -cf0D parallel 0, 2961
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf0DZ1, project 1.7 rt -cf0D parallel 1, 2022
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf0DZ2, project 1.7 rt -cf0D parallel 2, 1757
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf0DZ3, project 1.7 rt -cf0D parallel 3, 1632
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf0DZ4, project 1.7 rt -cf0D parallel 4, 1994
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf0DZ5, project 1.7 rt -cf0D parallel 5, 1978
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf1, project 1.7 rt -cf1, 5237
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf1D, project 1.7 rt -cf1D, 5073
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf1Z0, project 1.7 rt -cf1 parallel 0, 5367
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf1DZ0, project 1.7 rt -cf1D parallel 0, 5002
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf1DZ1, project 1.7 rt -cf1D parallel 1, 5125
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf1DZ2, project 1.7 rt -cf1D parallel 2, 2257
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf1DZ3, project 1.7 rt -cf1D parallel 3, 2145
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf1DZ4, project 1.7 rt -cf1D parallel 4, 2505
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf1DZ5, project 1.7 rt -cf1D parallel 5, 2549
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf2, project 1.7 rt -cf2, 5371
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf3, project 1.7 rt -cf3, 5409
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf4, project 1.7 rt -cf4, 5778
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf5, project 1.7 rt -cf5, 5906
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf6, project 1.7 rt -cf6, 6082
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf7, project 1.7 rt -cf7, 6070
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf8, project 1.7 rt -cf8, 6251
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf9, project 1.7 rt -cf9, 6191
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf6D, project 1.7 rt -cf6D, 5843
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf6Z0, project 1.7 rt -cf6 parallel 0, 6095
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf6DZ0, project 1.7 rt -cf6D parallel 0, 5907
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf6DZ1, project 1.7 rt -cf6D parallel 1, 5957
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf6DZ2, project 1.7 rt -cf6D parallel 2, 2388
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf6DZ3, project 1.7 rt -cf6D parallel 3, 2351
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf6DZ4, project 1.7 rt -cf6D parallel 4, 2694
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf6DZ5, project 1.7 rt -cf6D parallel 5, 2830
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf9D, project 1.7 rt -cf9D, 6134
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf9Z0, project 1.7 rt -cf9 parallel 0, 6258
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf9DZ0, project 1.7 rt -cf9D parallel 0, 6066
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf9DZ1, project 1.7 rt -cf9D parallel 1, 6203
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf9DZ2, project 1.7 rt -cf9D parallel 2, 2490
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf9DZ3, project 1.7 rt -cf9D parallel 3, 2361
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf9DZ4, project 1.7 rt -cf9D parallel 4, 2788
C:\Program Files\Java\jdk1.7.0\bin\java.exe, C:\NetBeansProjects\JavaProject1\dist\javaproject1.jar, -cf9DZ5, project 1.7 rt -cf9D parallel 5, 2847

regards
Mike
