Ah, ok. I have updated the patch with the pre-reading implementation [1]. Combining it with the multi-threading prototype [2], I ran pack200 with ~100% CPU utilization on an 8-core Xeon (Clovertown). That agrees with the stage measurements, which show that [1,2] combined should scale linearly up to 10 processor cores.
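A back-of-the-envelope sketch of the estimate, for anyone who wants to plug in other core counts. It uses the stage timings quoted in this mail (read=265, process=32358, write=2673), assumes they are milliseconds, and assumes only the process stage is parallelized while read and write stay serial. The class and method names are made up for illustration and are not part of either patch.

```java
class ScalingEstimate {
    // Stage timings from this mail; the millisecond unit is an assumption.
    static final double READ_MS = 265;
    static final double PROCESS_MS = 32358;
    static final double WRITE_MS = 2673;

    /** Estimated wall time when only the process stage is split across the given cores. */
    static double estimatedMillis(int cores) {
        return READ_MS + WRITE_MS + PROCESS_MS / cores;
    }

    /** Estimated speedup relative to the single-threaded estimate. */
    static double speedup(int cores) {
        return estimatedMillis(1) / estimatedMillis(cores);
    }

    public static void main(String[] args) {
        for (int cores : new int[] {1, 2, 4, 8}) {
            System.out.printf("%d cores: %.1f secs, %.1fx%n",
                    cores, estimatedMillis(cores) / 1000.0, speedup(cores));
        }
    }
}
```

For 8 cores this gives roughly 7 secs and a 5x speedup, the same figures as in the estimate in this mail.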
The stage measurements (in msec):
  read=265 process=32358 write=2673

Theoretically, with ideal scaling on 8 cores we should have:
 - on single-threaded version: (0.2 + 2.7 + 32.3) = 35.2 secs
 - on multi-threaded version: (0.2 + 2.7 + 32.3/8) = 7 secs

...giving a theoretical 5x boost.

In real life, the scenario of unpacking a 50 MB .pack file into a 150 MB .jar took:
 - on single-threaded version: 24 secs
 - on multi-threaded version: 14 secs

...giving a practical 1.7x boost. That's awesome, but a lot of time is spent on GC: ~7 secs. Throwing away GC time, we can have a "practical" 3.5x boost.

There are two points:
 1. Frequent GC is the price for buffering. Clean decoupling of the reading/processing stages will make the multi-threaded version produce less garbage.
 2. Memory management could be the next "big deal" in the pack200 implementation.

Thanks,
Aleksey.

[1] https://issues.apache.org/jira/browse/HARMONY-5916
[2] https://issues.apache.org/jira/browse/HARMONY-5918

On Fri, Jul 18, 2008 at 7:46 PM, Sian January <[EMAIL PROTECTED]> wrote:
> According to the spec, "The value #archive_size is either zero or declares
> the number of bytes in the archive segment, starting immediately after
> #archive_size_lo and before #archive_next_count and ending with the last
> band, the *file_bits band. (That is, a non-zero size includes the size of
> #archive_next_count, *file_bits, and everything in between.) "
>
> So you'll need to minus a few bytes for the values you've already read from
> the second half of the header.
>
>
> On 18/07/2008, Aleksey Shipilev <[EMAIL PROTECTED]> wrote:
>>
>> Sian,
>>
>> On Fri, Jul 18, 2008 at 5:29 PM, Sian January
>> <[EMAIL PROTECTED]> wrote:
>> >> Awesome! Am I understanding correctly: this value determines the size
>> >> of segment? If yes, can you point me how to access this value? Is
>> >> there API in current implementation?
>> > Yes - use SegmentHeader.getArchiveSize()
>>
>> Does spec cover any alignment/padding constraints for segments?
>> What exactly archive size specify?
>>
>> I'm doing this one [1]:
>> 1. Reading the header of segment (moved from readSegment).
>> 2. Check the field value, then either
>> 3a. Read the segment into byte array and wrap it with BAIS, then
>> read from BAIS
>> 3b. Read the segment from global input stream
>>
>> I can only read first segment, second fails to read with the "bad
>> header" exception.
>>
>> Thanks,
>> Aleksey.
>>
>> [1]
>> void unpackRead(InputStream in) throws IOException, Pack200Exception {
>>     if (!in.markSupported())
>>         in = new BufferedInputStream(in);
>>
>>     header = new SegmentHeader(this);
>>     header.read(in);
>>
>>     int size = (int)header.getArchiveSize();
>>
>>     if (size != 0) {
>>         byte[] data = new byte[size];
>>         in.read(data);
>>         bin = new ByteArrayInputStream(data);
>>
>>         readSegment(bin);
>>     } else {
>>         readSegment(in);
>>     }
>> }
>>
>
>
>
> --
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>
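Following Sian's note, here is one possible shape for the buffering branch of the quoted unpackRead snippet. It is only a sketch under two assumptions. First, the number of header bytes already consumed after #archive_size_lo is passed in as a hypothetical parameter (alreadyRead), since the exact count is not given in this thread. Second, the helper and class names are made up for illustration. It also swaps in.read(data), which is allowed to return fewer bytes than requested, for DataInputStream.readFully, which either fills the array or throws EOFException.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

class SegmentBuffering {
    /**
     * Buffers the not-yet-read part of a segment.
     *
     * @param in          stream positioned right after the header fields read so far
     * @param archiveSize non-zero value of #archive_size from the header
     * @param alreadyRead HYPOTHETICAL count of bytes already consumed that
     *                    #archive_size covers (the rest of the header)
     */
    static InputStream bufferRemainder(InputStream in, long archiveSize, int alreadyRead)
            throws IOException {
        long remaining = archiveSize - alreadyRead;
        if (remaining < 0 || remaining > Integer.MAX_VALUE) {
            throw new IOException("Cannot buffer segment remainder of " + remaining + " bytes");
        }
        byte[] data = new byte[(int) remaining];
        // readFully loops until the array is full; a plain read() may stop short.
        // DataInputStream does no buffering of its own, so it does not read past the segment.
        new DataInputStream(in).readFully(data);
        return new ByteArrayInputStream(data);
    }
}
```

With this, the global stream should be left positioned at the first byte after the segment, which is where the next segment header is expected to start.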
