Right, the pack200 structure is in no way fixed-size, so you have to read through it to determine what the sizes are. And given that everything uses a "current pointer" mentality, it would be difficult to parallelise in general.
The best way to parallelise a pack200 process is to generate multiple packed Jars and then extract each one in an independent thread. Most large systems that would benefit from this would probably do that in any case.
Alex
Sent from my (new) iPhone
On 18 Jul 2008, at 10:16, "Sian January" <[EMAIL PROTECTED]> wrote:
Hi Aleksey,
That's a really interesting idea. It doesn't sound like a very complicated threading scenario, so I would think you could just use java.lang.Thread rather than adding a dependency on the concurrent module.
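As a rough sketch of how the hand-off could work on plain java.lang.Thread alone, here is a bounded queue built only on wait/notify; the element type and the capacity are placeholders, not names from the Harmony code:

    import java.util.LinkedList;

    // A bounded producer-consumer queue using only wait/notify, so no
    // dependency on the concurrent module is needed. T would be whatever
    // unit of work the reader hands to the processing threads.
    public class SegmentQueue<T> {
        private final LinkedList<T> items = new LinkedList<T>();
        private final int capacity;

        public SegmentQueue(int capacity) {
            this.capacity = capacity;
        }

        public synchronized void put(T item) throws InterruptedException {
            while (items.size() == capacity) {
                wait();               // producer blocks while consumers lag
            }
            items.addLast(item);
            notifyAll();
        }

        public synchronized T take() throws InterruptedException {
            while (items.isEmpty()) {
                wait();               // consumer blocks while producer lags
            }
            T item = items.removeFirst();
            notifyAll();
            return item;
        }
    }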
I think there needs to be some processing on the read stage, because the length of some bands depends on the contents of previous bands, so you won't know how much to read unless you do some processing. But some things can be done afterwards, like sorting the constant pool, so there's definitely an opportunity for parallelism there.
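A toy, self-contained illustration of that read-stage dependency, using an invented band layout rather than the real pack200 format: the reader cannot size the second band until the first one has been decoded.

    // Invented band layout: the first band holds three per-class method
    // counts; the second holds one flag byte per method. The second
    // band's length (2+1+3 = 6) is only known after decoding the first.
    public class BandDependency {
        public static void main(String[] args) {
            byte[] segment = {2, 1, 3, 10, 11, 20, 30, 31, 32};
            int pos = 0;

            int[] methodCounts = new int[3];
            int totalMethods = 0;
            for (int i = 0; i < methodCounts.length; i++) {
                methodCounts[i] = segment[pos++];
                totalMethods += methodCounts[i];
            }

            byte[] methodFlags = new byte[totalMethods]; // sized by band 1
            for (int i = 0; i < totalMethods; i++) {
                methodFlags[i] = segment[pos++];
            }
            System.out.println("decoded " + totalMethods + " method flags");
        }
    }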
I haven't looked at your patch yet, but I will try to review it soon.
Thanks,
Sian
On 17/07/2008, Aleksey Shipilev <[EMAIL PROTECTED]> wrote:
Hi, Sian, Andrew,
I have decoupled the I/O and the processing in the unpacking scenario [1]. The bottom line here is to get rid of essentially serial I/O operations as much as possible, thus decreasing the amount of serial code in pack200 and opening the way for parallelism.
The stage measurements for the first prototype are (msecs):
read=6737 process=26724 write=2537
That is, 6.7 secs is spent on reading, 2.5 secs on writing, and 26.7 secs on processing. Keeping in mind that each segment traverses all three stages exactly once, we can see that processing an average segment is roughly 4x slower than reading it. That means you could spawn 1 reader thread, 1 writer thread, and 4 processing threads and reach an equilibrium in a producer-consumer scheme. In the case of ideal scaling, that would decrease the scenario timing down to (6.7 + 2.5 + 26.7/4) = 15.8 secs, giving a +70% boost.
The exact mechanism for such parallelisation is not clear to me yet. Can we take j.u.concurrent as a dependency?
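For what it's worth, a minimal sketch of the 1 reader / 4 processors / 1 writer scheme on top of j.u.concurrent's BlockingQueue could look like this; Segment and the segment count are placeholders for the real unpacker stages:

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // 1 reader, 4 processors, 1 writer: balanced because processing a
    // segment costs ~4x reading it. Segment and SEGMENTS are placeholders.
    public class UnpackPipeline {
        static class Segment { boolean poison; }
        static final int PROCESSORS = 4;
        static final int SEGMENTS = 100;

        public static void main(String[] args) throws InterruptedException {
            final BlockingQueue<Segment> readQ =
                    new ArrayBlockingQueue<Segment>(8);
            final BlockingQueue<Segment> writeQ =
                    new ArrayBlockingQueue<Segment>(8);

            Thread reader = new Thread(new Runnable() {
                public void run() {
                    try {
                        for (int i = 0; i < SEGMENTS; i++) {
                            readQ.put(new Segment()); // readSegment() goes here
                        }
                        for (int i = 0; i < PROCESSORS; i++) {
                            Segment stop = new Segment();
                            stop.poison = true;
                            readQ.put(stop);          // one poison pill per processor
                        }
                    } catch (InterruptedException ignored) { }
                }
            });

            Thread[] processors = new Thread[PROCESSORS];
            for (int i = 0; i < PROCESSORS; i++) {
                processors[i] = new Thread(new Runnable() {
                    public void run() {
                        try {
                            while (true) {
                                Segment s = readQ.take();
                                if (s.poison) break;
                                // processSegment(s) goes here
                                writeQ.put(s);
                            }
                        } catch (InterruptedException ignored) { }
                    }
                });
            }

            Thread writer = new Thread(new Runnable() {
                public void run() {
                    try {
                        for (int i = 0; i < SEGMENTS; i++) {
                            writeQ.take();            // writeSegment() goes here
                        }
                    } catch (InterruptedException ignored) { }
                }
            });

            reader.start();
            for (Thread p : processors) p.start();
            writer.start();
            reader.join();
            for (Thread p : processors) p.join();
            writer.join();
        }
    }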
Another issue: there is still processing on the read stage, because of mind-boggling dependencies I can't eliminate in this version. If we manage to decrease the reading time at least twofold, the unpacking scenario timing will drop to (6.7/2 + 2.5 + [26.7 + 6.7/2]/4) = 13.3 secs, giving a +100% boost. Amdahl's Law, eh :)
Thanks,
Aleksey.
[1] https://issues.apache.org/jira/browse/HARMONY-5916