Hi Sian, Andrew,

I have decoupled the I/O and processing in the unpacking scenario [1]. The bottom line is to eliminate essentially serial I/O operations as much as possible, thus decreasing the amount of serial code in pack200 and opening the way for parallelism.
The stage measurements for the first prototype are (msecs): read=6737, process=26724, write=2537. That is, 6.7 secs is spent on reading, 2.5 secs on writing, and 26.7 secs on processing. Keeping in mind that each segment traverses all three stages exactly once, we can see that processing an average segment is 4x slower than reading/writing it. That means we could spawn 1 reader thread, 1 writer thread, and 4 processing threads, and have an equilibrium in a producer-consumer scheme. In the case of ideal scaling, this would decrease the scenario timing to (6.7 + 2.5 + 26.7/4) = 15.8 secs, giving a +70% boost.

The exact mechanism for such parallelism is not yet clear to me. Can we take j.u.concurrent as a dependency?

Another issue: there is still some processing in the read stage, because of mind-boggling dependencies I can't eliminate in this version. If we manage to cut the reading time at least in half, the unpacking scenario timing will drop to (6.7/2 + 2.5 + [26.7 + 6.7/2]/4) = 13.3 secs, giving a +100% boost. Amdahl's Law, eh :)

Thanks,
Aleksey.

[1] https://issues.apache.org/jira/browse/HARMONY-5916
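P.S. For illustration only, here is a minimal sketch of the 1-reader / 4-processor / 1-writer producer-consumer scheme described above, using j.u.concurrent. The byte[] "segments" and the stage bodies are placeholders I made up for the sketch, not the actual pack200 segment type or unpacking code:

```java
import java.util.*;
import java.util.concurrent.*;

// Sketch: reader -> bounded queue -> N processing threads -> bounded queue -> writer.
// The real read/process/write stages would go where the placeholders are.
public class SegmentPipeline {
    private static final byte[] POISON = new byte[0]; // end-of-stream marker

    public static List<byte[]> run(List<byte[]> input, int workers) throws Exception {
        BlockingQueue<byte[]> toProcess = new LinkedBlockingQueue<>(16);
        BlockingQueue<byte[]> toWrite = new LinkedBlockingQueue<>(16);
        List<byte[]> output = Collections.synchronizedList(new ArrayList<>());

        // Single reader thread: feeds raw segments, then one poison pill per worker.
        Thread reader = new Thread(() -> {
            try {
                for (byte[] seg : input) toProcess.put(seg);
                for (int i = 0; i < workers; i++) toProcess.put(POISON);
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });

        // Processing pool: the CPU-heavy stage, parallelised `workers`-way.
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        CountDownLatch done = new CountDownLatch(workers);
        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                try {
                    for (byte[] seg; (seg = toProcess.take()) != POISON; ) {
                        toWrite.put(seg.clone()); // placeholder for real segment processing
                    }
                } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
                finally { done.countDown(); }
            });
        }

        // Single writer thread: drains processed segments until poisoned.
        Thread writer = new Thread(() -> {
            try {
                for (byte[] seg; (seg = toWrite.take()) != POISON; ) output.add(seg);
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });

        reader.start();
        writer.start();
        done.await();        // wait for all processing threads to finish,
        toWrite.put(POISON); // then stop the writer
        reader.join();
        writer.join();
        pool.shutdown();
        return output;
    }
}
```

The bounded queues throttle the reader when the processors fall behind, so memory stays proportional to the queue capacity rather than the whole input.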
