Re: [classlib][pack200] Decoupling I/O and processing for unpacking scenario

Aleksey Shipilev Fri, 18 Jul 2008 10:18:02 -0700

Ah, ok.

I have updated the patch with a pre-reading implementation [1].
Combining it with the multi-threading prototype [2], I have run pack200
with ~100% CPU utilization on an 8-core Xeon (Clovertown). That agrees
with the stage measurements, which show that [1,2] combined should
scale linearly up to about 10 processor cores.
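
A minimal sketch of the pre-reading step (not the actual patch from
[1]). It assumes the caller knows how many bytes after #archive_size_lo
the header parser has already consumed, as per Sian's note quoted below.
SegmentPreReader, preRead and the alreadyRead parameter are hypothetical
names for illustration; only the java.io calls are real.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

class SegmentPreReader {
    /**
     * Returns a stream over the rest of the current segment.
     *
     * @param in          stream positioned just after the segment header
     * @param archiveSize value of #archive_size; 0 means "not declared"
     * @param alreadyRead assumed count of bytes consumed after
     *                    #archive_size_lo while parsing the header
     */
    static InputStream preRead(InputStream in, long archiveSize, int alreadyRead)
            throws IOException {
        if (archiveSize == 0) {
            // Size unknown: fall back to reading from the shared stream.
            return in;
        }
        long remaining = archiveSize - alreadyRead;
        if (remaining < 0 || remaining > Integer.MAX_VALUE) {
            throw new IOException("Unsupported segment size: " + remaining);
        }
        byte[] data = new byte[(int) remaining];
        // readFully blocks until the whole array is filled or throws EOFException.
        new DataInputStream(in).readFully(data);
        return new ByteArrayInputStream(data);
    }
}
```

readFully is used here because a plain InputStream.read(byte[]) may
return fewer bytes than requested.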


The stage measurements (msecs):
read=265 process=32358 write=2673

Theoretically, with ideal scaling on 8 cores, we should have:
 - on single-threaded version: (0.2 + 2.7 + 32.3) = 35.2 secs
 - on multi-threaded version: (0.2 + 2.7 + 32.3/8) = 7 secs
...giving a theoretical 5x boost.
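
The same arithmetic as a small sketch, assuming the stage measurements
are in msecs and that only the processing stage is parallelized while
reading and writing stay serial. StageModel and its methods are made-up
names for illustration.

```java
class StageModel {
    /** Ideal wall-clock time in seconds when only processing scales with cores. */
    static double idealSeconds(double readMs, double processMs, double writeMs, int cores) {
        return (readMs + writeMs + processMs / cores) / 1000.0;
    }

    /** Ideal speedup of the multi-threaded version over the single-threaded one. */
    static double idealSpeedup(double readMs, double processMs, double writeMs, int cores) {
        return idealSeconds(readMs, processMs, writeMs, 1)
                / idealSeconds(readMs, processMs, writeMs, cores);
    }

    public static void main(String[] args) {
        // Stage measurements from this message.
        double read = 265, process = 32358, write = 2673;
        System.out.println("1 core:  " + idealSeconds(read, process, write, 1) + " secs");
        System.out.println("8 cores: " + idealSeconds(read, process, write, 8) + " secs");
        System.out.println("speedup: " + idealSpeedup(read, process, write, 8));
    }
}
```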

In real life, unpacking a 50 MB .pack file into a 150 MB .jar took:
 - on single-threaded version: 24 secs
 - on multi-threaded version: 14 secs
...giving a practical 1.7x boost.

That's awesome, but a lot of time is spent on GC: ~7 secs. Excluding
GC time, we could have a "practical" 3.5x boost. Two points:
 1. Frequent GC is the price of buffering. Cleanly decoupling the
reading and processing stages will make the multi-threaded version
generate less garbage.
 2. Memory management could be the next "big deal" in the pack200
implementation.
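
A rough sketch of what decoupled stages could look like, not the
prototype from [2]: segments are assumed to be pre-read into byte
arrays, processed in parallel, and collected in input order so that
writing can stay single-threaded. DecoupledPipeline, run and the
process function are hypothetical names; the sketch uses
java.util.concurrent and lambdas from a newer Java than the one
Harmony targets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class DecoupledPipeline {
    /** Processes pre-read segments in parallel and returns results in input order. */
    static List<byte[]> run(List<byte[]> segments,
                            Function<byte[], byte[]> process,
                            int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<byte[]>> pending = new ArrayList<>();
            for (byte[] segment : segments) {
                // Each segment is independent, so it can be processed on any worker.
                Callable<byte[]> task = () -> process.apply(segment);
                pending.add(pool.submit(task));
            }
            List<byte[]> results = new ArrayList<>();
            for (Future<byte[]> future : pending) {
                // Waiting in submission order keeps the output order stable.
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

Note that this version still buffers every segment and every result in
memory, so it does nothing by itself about the GC pressure from point 1.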

Thanks,
Aleksey.


[1] https://issues.apache.org/jira/browse/HARMONY-5916
[2] https://issues.apache.org/jira/browse/HARMONY-5918

On Fri, Jul 18, 2008 at 7:46 PM, Sian January
<[EMAIL PROTECTED]> wrote:
> According to the spec, "The value #archive_size is either zero or declares
> the number of bytes in the archive segment, starting immediately after
> #archive_size_lo and before #archive_next_count and ending with the last
> band, the *file_bits band. (That is, a non-zero size includes the size of
> #archive_next_count, *file_bits, and everything in between.) "
>
> So you'll need to subtract a few bytes for the values you've already read
> from the second half of the header.
>
>
> On 18/07/2008, Aleksey Shipilev <[EMAIL PROTECTED]> wrote:
>>
>> Sian,
>>
>> On Fri, Jul 18, 2008 at 5:29 PM, Sian January
>> <[EMAIL PROTECTED]> wrote:
>> >> Awesome! Am I understanding correctly that this value determines the
>> >> size of the segment? If so, can you point me to how to access this
>> >> value? Is there an API for it in the current implementation?
>> > Yes - use SegmentHeader.getArchiveSize()
>>
>> Does the spec cover any alignment/padding constraints for segments?
>> What exactly does the archive size specify?
>>
>> This is what I'm doing [1]:
>> 1. Read the header of the segment (moved from readSegment).
>> 2. Check the field value, then either
>> 3a. Read the segment into a byte array, wrap it with BAIS, then
>> read from BAIS, or
>> 3b. Read the segment from the global input stream.
>>
>> I can only read the first segment; the second fails to read with the
>> "bad header" exception.
>>
>> Thanks,
>> Aleksey.
>>
>> [1]
>>    void unpackRead(InputStream in) throws IOException, Pack200Exception {
>>        if (!in.markSupported())
>>            in = new BufferedInputStream(in);
>>
>>        header = new SegmentHeader(this);
>>        header.read(in);
>>
>>        int size = (int)header.getArchiveSize();
>>
>>        if (size != 0) {
>>            byte[] data = new byte[size];
>>            in.read(data);
>>            bin = new ByteArrayInputStream(data);
>>
>>            readSegment(bin);
>>        } else {
>>            readSegment(in);
>>        }
>>    }
>>
>
>
>
> --
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>
