Andrew, I believe that's a side effect of HARMONY-5906, see separate thread :)
Thanks, Aleksey. On Wed, Aug 6, 2008 at 3:22 AM, Andrew Cornwall <[EMAIL PROTECTED]> wrote: > I don't see any differences in results with your patch [5]=5931 when running > against my big batch of tests. That surprises me, honestly - I think when I > did my test I commented out the "nestedEntries.addAll(byteCodes)" in > CodeAttribute.getNestedClassFileEntries(), and doing that certainly causes > problems. > > I guess once the nested entries of the ByteCodes have been figured out and > added to the class pool, the bytecodes themselves are no longer needed... > which is why your patch works and mine didn't. > > Cool stuff! > > On Tue, Aug 5, 2008 at 3:44 PM, Andrew Cornwall <[EMAIL PROTECTED]>wrote: > >> I'm not sure about [5]=5931 - if I remember right, I'd taken out ByteCode >> processing in add(), and that led to problems resolving bytecodes which >> referred to things in the classpool. But that was a while ago; I'm >> interested in trying it and seeing if there are any differences in my wad of >> 40M or so testcases. >> >> The only other patch I've looked at so far has been [2]=5928. That's a >> nifty change. >> >> All in all, it looks like you've done a lot of good work! >> >> >> >> On Tue, Aug 5, 2008 at 3:16 PM, Aleksey Shipilev < >> [EMAIL PROTECTED]> wrote: >> >>> Sian, Andrew, >>> >>> An update here. I had updated the profiler [1] and run it over on JDT >>> unpacking scenario several times. That's what we have today (times are >>> msecs, indentation resembles call hierarchy): >>> >>> Unpack: 220529 >>> >>> segment read: 28853 >>> cpBands: 13791 >>> adBands: 214 >>> icBands: 343 >>> cbBands: 9158 >>> bcBands: 4694 >>> fbBands: 118 >>> fBits: 407 >>> >>> segment parse: 176929 >>> header: 0 >>> cpBands: 0 >>> adBands: 0 >>> icBands: 1 >>> cbBands: 0 >>> >>> bcBands: 77245 >>> exceptn: 656 >>> newCA: 71145 >>> getBC: 16372 >>> extOpnd: 26776 >>> fixup: 1885 >>> methAttr: 591 >>> curAttr: 1808 >>> >>> fbBands: 0 >>> >>> buildCF: 82057 >>> ccp.addN: 36182 >>> ccp.addNW: 433 >>> ccp.resv: 27452 >>> ic.getIC: 11717 >>> >>> cfWrite: 17379 >>> >>> segment write: 14349 >>> >>> >>> As you can see in commits, I had filed a couple of JIRAs with the >>> bunch of pack200 optimizations [2,3,4,5], here what I got with all >>> them applied: >>> >>> Unpack: 193165 (-14% in total) >>> >>> segment read: 28334 >>> cpBands: 13034 >>> adBands: 249 >>> icBands: 341 >>> cbBands: 9299 >>> bcBands: 4645 >>> fbBands: 169 >>> fBits: 459 >>> >>> segment parse: 150032 >>> header: 1 >>> cpBands: 0 >>> adBands: 0 >>> icBands: 82 >>> cbBands: 0 >>> >>> bcBands: 73936 >>> exceptn: 633 >>> newCA: 67871 >>> getBC: 13874 <--- (-18% due to [2]) >>> extOpnd: 26583 >>> fixup: 1808 >>> methAttr: 615 >>> curAttr: 1663 >>> >>> fbBands: 0 >>> >>> buildCF: 58319 >>> ccp.addN: 26199 <---- (-38% due to [5]) >>> ccp.addNW: 424 >>> ccp.resv: 23642 <---- (-16% due to [5]) >>> ic.getIC: 2245 <--- (-80% due to [3,4]) >>> >>> cfWrite: 17463 >>> >>> segment write: 14413 >>> >>> >>> Of course, the boosts are diminished with the performance overheads of >>> profiling. But still, this profile gives pretty good insight on what's >>> going on. CodeAttribute ["newCA" is the "new CodeAttribute(...)"] is >>> the next candidate for optimization, I guess. >>> >>> Sian, Andrew, can you please review the patches? I'm particularly >>> interested in [5], because it's proof-of-concept and kind of >>> controversial. >>> >>> Thanks, >>> Aleksey. >>> >>> [1] "classlib][pack200] Internal profiler for pack200" >>> https://issues.apache.org/jira/browse/HARMONY-5905 >>> >>> [2] [classlib][pack200][performance] java.util.HashMap usage optimization >>> https://issues.apache.org/jira/browse/HARMONY-5928 >>> >>> [3] [classlib][pack200][performance] Segment.computeIcStored rewrite >>> https://issues.apache.org/jira/browse/HARMONY-5929 >>> >>> [4] [classlib][pack200][performance] IcBands.getRelevantIcTuples rewrite >>> https://issues.apache.org/jira/browse/HARMONY-5930 >>> >>> [5] [classlib][pack200][performance] Some ClassConstantPool content >>> may not be needed >>> https://issues.apache.org/jira/browse/HARMONY-5931 >>> >>> >>> On Thu, Jul 10, 2008 at 7:59 PM, Aleksey Shipilev >>> <[EMAIL PROTECTED]> wrote: >>> > I had quickly drafted the internal Java profiler for pack200 at [1]. >>> > Here are the results of profiling for 50Mb Eclipse JDT jar, times are >>> > microsecs, identation emulates the call tree. Some of the label are >>> > not distinguishable, but you may look up probe positions in the patch. >>> > >>> > Unpack: 38311 >>> > segment unpack: 38217 >>> > parse segment: 11575 >>> > parse header: 0 >>> > parse ADB: 78 >>> > parse bcbands: 5342 >>> > parse1: 453 >>> > parse2: 93 >>> > select: 252 >>> > attrlayout: 0 >>> > methods: 3997 >>> > parse cbands: 3358 >>> > classattr: 1002 >>> > code: 1636 >>> > fields: 173 >>> > methods: 515 >>> > parse cpbands: 2483 >>> > parse fbands: 63 >>> > parse icbands: 16 >>> > write jar: 26642 >>> > build classf: 21111 >>> > sfattrs: 47 >>> > cfattrs: 0 >>> > fields: 218 >>> > interfaces: 0 >>> > methods: 362 >>> > addNested: 8146 >>> > inner: 3051 >>> > final: 8774 >>> > write classf: 1934 >>> > constpool: 1015 >>> > interfaces: 0 >>> > attributes: 31 >>> > methods: 827 >>> > fields: 31 >>> > write primit: 486 >>> > >>> > That's the point where one can take the method and optimize it locally >>> :) >>> > >>> > Thanks, >>> > Aleksey. >>> > >>> > [1] https://issues.apache.org/jira/browse/HARMONY-5905 >>> > >>> > On Wed, Jul 9, 2008 at 9:20 PM, Aleksey Shipilev >>> > <[EMAIL PROTECTED]> wrote: >>> >> I had disabled the compression in my test to throw away ZIP overhead >>> >> and focus on pack200 performance only. Thus the performance data is >>> >> not relevant to previous measurements. The data are assumed with >>> >> HARMONY-5900 incorporated. >>> >> >>> >> Harmony's pack200: 43 secs (3.5 Mb/secs) >>> >> Sun's pack200: 9 secs (16.6 Mb/secs) >>> >> >>> >> Profile: >>> >> >>> >> 22.0% java.util.HashMap.* >>> >> 11.4% java.io.FileInputStream.readBytes() >>> >> 7.5% o.a.h.unpack200.bytecode.ClassConstantPool.addNested() >>> >> 6.9% java.util.zip.* >>> >> 5.6% o.a.h.pack200.BHSDCodec.decode() >>> >> 4.8% java.lang.* >>> >> 4.4% o.a.h.unpack200.IcBands.getRelevantIcTuples() >>> >> 3.9% >>> o.a.h.unpack200.bytecode.forms.NoArgumentForm.setByteCodeOperands() >>> >> 3.2% o.a.h.unpack200.bytecode.ClassConstantPool.* (other) >>> >> 3.0% o.a.h.unpack200.bytecode.CodeAttribute.* >>> >> 2.8% java.io.FileOutputStream.writeBytes() >>> >> 2.8% o.a.h.unpack200.bytecode.ByteCode.* >>> >> 2.75% java.util.TreeMap.* >>> >> >>> >> Note ArrayList is gone! >>> >> It seems like BHSDCodec.decode(), IcBands.getRelevanticTuples() and >>> >> NoArgumentForm.setByteCodeOperands() are next candidates for tuning. >>> >> After that, the performance improvement is not possible without deep >>> >> changes, like overall algorithmic improvements. Anyway, that should be >>> >> first, but I'm not familiar with the code yet. This can't stop us >>> >> though ;) >>> >> >>> >> Thanks, >>> >> Aleksey. >>> >> >>> >> On Wed, Jul 9, 2008 at 7:27 PM, Sian January < >>> [EMAIL PROTECTED]> wrote: >>> >>> Thanks for doing that Aleksey. In fact I think Sun's was 20 or 30 >>> times >>> >>> faster before we started doing any performance optimizations, but it >>> looks >>> >>> like there's still some ground that we could make up! >>> >>> >>> >>> >>> >>> >>> >>> On 08/07/2008, Aleksey Shipilev <[EMAIL PROTECTED]> wrote: >>> >>>> >>> >>>> Hi, >>> >>>> >>> >>>> I took the liberty of profiling of pack200 implementation on >>> unpacking >>> >>>> scenario. Source data was obtained from Eclipse JDT jars, repacked in >>> >>>> single 60 Mb jar file, then packed with pack200 from Sun's JDK (-E9 >>> >>>> used), resulting in 20 Mb pack200-compressed file. Then Sun JDK >>> >>>> 1.6.0_05 (Windows, -server) was used together with hprof (cpu=time) >>> to >>> >>>> obtain the profile. My patch from HARMONY-5900 is onboard. The head >>> of >>> >>>> the profile looks like this: >>> >>>> >>> >>>> 4.76% >>> org.apache.harmony.unpack200.bytecode.ClassConstantPool.addNested >>> >>>> 4.22% java.util.HashMap.getEntry >>> >>>> 2.99% java.util.AbstractList$Itr.next >>> >>>> 2.92% java.util.AbstractList$Itr.hasNext >>> >>>> 2.84% java.util.ArrayList.get >>> >>>> 2.43% java.util.AbstractList$Itr.next >>> >>>> 2.41% java.util.HashMap.containsKey >>> >>>> 2.15% org.apache.harmony.unpack200.IcBands.getRelevantIcTuples >>> >>>> 2.00% java.util.HashSet.contains >>> >>>> 1.57% java.io.DataOutputStream.writeUTF >>> >>>> >>> >>>> Composite occupancy: >>> >>>> >>> >>>> 18.4% java.util.AbstractList >>> >>>> 18.0% java.util.HashMap >>> >>>> 15.8% java.util.ArrayList >>> >>>> 10.5% o.a.h.unpack200.bytecode.ClassConstantPool.* >>> >>>> 5.3% o.a.h.unpack200.bytecode.CPUTF8.* (hashcode mostly) >>> >>>> 4.5% java.io.* >>> >>>> 4.5% java.lang.String.* >>> >>>> 4.4% o.a.h.unpack200.bytecode.ByteCode.* >>> >>>> 3.9% o.a.h.unpack200.bytecode.Ic{Tuple|Bands}.* >>> >>>> 14.7% other >>> >>>> >>> >>>> So the main concern is Collections usage. ClassConstantPool uses >>> Lists >>> >>>> excessively, so I suspect the significant amount of time is spent >>> >>>> there. >>> >>>> >>> >>>> NB: >>> >>>> Timings for the scenario (the less the better): >>> >>>> Harmony's pack200: 67 secs >>> >>>> Sun's pack200: 6 secs >>> >>>> >>> >>>> Yep, 10 times faster. >>> >>>> >>> >>>> Thanks, >>> >>>> Aleksey. >>> >>>> >>> >>> >>> >>> >>> >>> >>> >>> -- >>> >>> Unless stated otherwise above: >>> >>> IBM United Kingdom Limited - Registered in England and Wales with >>> number >>> >>> 741598. >>> >>> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 >>> 3AU >>> >>> >>> >> >>> > >>> >> >> >
