I'm not sure about [5]=5931 - if I remember right, I'd taken out ByteCode processing in add(), and that led to problems resolving bytecodes which referred to things in the classpool. But that was a while ago; I'm interested in trying it and seeing if there are any differences in my wad of 40M or so testcases.
The only other patch I've looked at so far has been [2]=5928. That's a nifty change. All in all, it looks like you've done a lot of good work! On Tue, Aug 5, 2008 at 3:16 PM, Aleksey Shipilev <[EMAIL PROTECTED] > wrote: > Sian, Andrew, > > An update here. I had updated the profiler [1] and run it over on JDT > unpacking scenario several times. That's what we have today (times are > msecs, indentation resembles call hierarchy): > > Unpack: 220529 > > segment read: 28853 > cpBands: 13791 > adBands: 214 > icBands: 343 > cbBands: 9158 > bcBands: 4694 > fbBands: 118 > fBits: 407 > > segment parse: 176929 > header: 0 > cpBands: 0 > adBands: 0 > icBands: 1 > cbBands: 0 > > bcBands: 77245 > exceptn: 656 > newCA: 71145 > getBC: 16372 > extOpnd: 26776 > fixup: 1885 > methAttr: 591 > curAttr: 1808 > > fbBands: 0 > > buildCF: 82057 > ccp.addN: 36182 > ccp.addNW: 433 > ccp.resv: 27452 > ic.getIC: 11717 > > cfWrite: 17379 > > segment write: 14349 > > > As you can see in commits, I had filed a couple of JIRAs with the > bunch of pack200 optimizations [2,3,4,5], here what I got with all > them applied: > > Unpack: 193165 (-14% in total) > > segment read: 28334 > cpBands: 13034 > adBands: 249 > icBands: 341 > cbBands: 9299 > bcBands: 4645 > fbBands: 169 > fBits: 459 > > segment parse: 150032 > header: 1 > cpBands: 0 > adBands: 0 > icBands: 82 > cbBands: 0 > > bcBands: 73936 > exceptn: 633 > newCA: 67871 > getBC: 13874 <--- (-18% due to [2]) > extOpnd: 26583 > fixup: 1808 > methAttr: 615 > curAttr: 1663 > > fbBands: 0 > > buildCF: 58319 > ccp.addN: 26199 <---- (-38% due to [5]) > ccp.addNW: 424 > ccp.resv: 23642 <---- (-16% due to [5]) > ic.getIC: 2245 <--- (-80% due to [3,4]) > > cfWrite: 17463 > > segment write: 14413 > > > Of course, the boosts are diminished with the performance overheads of > profiling. But still, this profile gives pretty good insight on what's > going on. CodeAttribute ["newCA" is the "new CodeAttribute(...)"] is > the next candidate for optimization, I guess. > > Sian, Andrew, can you please review the patches? I'm particularly > interested in [5], because it's proof-of-concept and kind of > controversial. > > Thanks, > Aleksey. > > [1] "classlib][pack200] Internal profiler for pack200" > https://issues.apache.org/jira/browse/HARMONY-5905 > > [2] [classlib][pack200][performance] java.util.HashMap usage optimization > https://issues.apache.org/jira/browse/HARMONY-5928 > > [3] [classlib][pack200][performance] Segment.computeIcStored rewrite > https://issues.apache.org/jira/browse/HARMONY-5929 > > [4] [classlib][pack200][performance] IcBands.getRelevantIcTuples rewrite > https://issues.apache.org/jira/browse/HARMONY-5930 > > [5] [classlib][pack200][performance] Some ClassConstantPool content > may not be needed > https://issues.apache.org/jira/browse/HARMONY-5931 > > > On Thu, Jul 10, 2008 at 7:59 PM, Aleksey Shipilev > <[EMAIL PROTECTED]> wrote: > > I had quickly drafted the internal Java profiler for pack200 at [1]. > > Here are the results of profiling for 50Mb Eclipse JDT jar, times are > > microsecs, identation emulates the call tree. Some of the label are > > not distinguishable, but you may look up probe positions in the patch. > > > > Unpack: 38311 > > segment unpack: 38217 > > parse segment: 11575 > > parse header: 0 > > parse ADB: 78 > > parse bcbands: 5342 > > parse1: 453 > > parse2: 93 > > select: 252 > > attrlayout: 0 > > methods: 3997 > > parse cbands: 3358 > > classattr: 1002 > > code: 1636 > > fields: 173 > > methods: 515 > > parse cpbands: 2483 > > parse fbands: 63 > > parse icbands: 16 > > write jar: 26642 > > build classf: 21111 > > sfattrs: 47 > > cfattrs: 0 > > fields: 218 > > interfaces: 0 > > methods: 362 > > addNested: 8146 > > inner: 3051 > > final: 8774 > > write classf: 1934 > > constpool: 1015 > > interfaces: 0 > > attributes: 31 > > methods: 827 > > fields: 31 > > write primit: 486 > > > > That's the point where one can take the method and optimize it locally :) > > > > Thanks, > > Aleksey. > > > > [1] https://issues.apache.org/jira/browse/HARMONY-5905 > > > > On Wed, Jul 9, 2008 at 9:20 PM, Aleksey Shipilev > > <[EMAIL PROTECTED]> wrote: > >> I had disabled the compression in my test to throw away ZIP overhead > >> and focus on pack200 performance only. Thus the performance data is > >> not relevant to previous measurements. The data are assumed with > >> HARMONY-5900 incorporated. > >> > >> Harmony's pack200: 43 secs (3.5 Mb/secs) > >> Sun's pack200: 9 secs (16.6 Mb/secs) > >> > >> Profile: > >> > >> 22.0% java.util.HashMap.* > >> 11.4% java.io.FileInputStream.readBytes() > >> 7.5% o.a.h.unpack200.bytecode.ClassConstantPool.addNested() > >> 6.9% java.util.zip.* > >> 5.6% o.a.h.pack200.BHSDCodec.decode() > >> 4.8% java.lang.* > >> 4.4% o.a.h.unpack200.IcBands.getRelevantIcTuples() > >> 3.9% > o.a.h.unpack200.bytecode.forms.NoArgumentForm.setByteCodeOperands() > >> 3.2% o.a.h.unpack200.bytecode.ClassConstantPool.* (other) > >> 3.0% o.a.h.unpack200.bytecode.CodeAttribute.* > >> 2.8% java.io.FileOutputStream.writeBytes() > >> 2.8% o.a.h.unpack200.bytecode.ByteCode.* > >> 2.75% java.util.TreeMap.* > >> > >> Note ArrayList is gone! > >> It seems like BHSDCodec.decode(), IcBands.getRelevanticTuples() and > >> NoArgumentForm.setByteCodeOperands() are next candidates for tuning. > >> After that, the performance improvement is not possible without deep > >> changes, like overall algorithmic improvements. Anyway, that should be > >> first, but I'm not familiar with the code yet. This can't stop us > >> though ;) > >> > >> Thanks, > >> Aleksey. > >> > >> On Wed, Jul 9, 2008 at 7:27 PM, Sian January < > [EMAIL PROTECTED]> wrote: > >>> Thanks for doing that Aleksey. In fact I think Sun's was 20 or 30 > times > >>> faster before we started doing any performance optimizations, but it > looks > >>> like there's still some ground that we could make up! > >>> > >>> > >>> > >>> On 08/07/2008, Aleksey Shipilev <[EMAIL PROTECTED]> wrote: > >>>> > >>>> Hi, > >>>> > >>>> I took the liberty of profiling of pack200 implementation on unpacking > >>>> scenario. Source data was obtained from Eclipse JDT jars, repacked in > >>>> single 60 Mb jar file, then packed with pack200 from Sun's JDK (-E9 > >>>> used), resulting in 20 Mb pack200-compressed file. Then Sun JDK > >>>> 1.6.0_05 (Windows, -server) was used together with hprof (cpu=time) to > >>>> obtain the profile. My patch from HARMONY-5900 is onboard. The head of > >>>> the profile looks like this: > >>>> > >>>> 4.76% > org.apache.harmony.unpack200.bytecode.ClassConstantPool.addNested > >>>> 4.22% java.util.HashMap.getEntry > >>>> 2.99% java.util.AbstractList$Itr.next > >>>> 2.92% java.util.AbstractList$Itr.hasNext > >>>> 2.84% java.util.ArrayList.get > >>>> 2.43% java.util.AbstractList$Itr.next > >>>> 2.41% java.util.HashMap.containsKey > >>>> 2.15% org.apache.harmony.unpack200.IcBands.getRelevantIcTuples > >>>> 2.00% java.util.HashSet.contains > >>>> 1.57% java.io.DataOutputStream.writeUTF > >>>> > >>>> Composite occupancy: > >>>> > >>>> 18.4% java.util.AbstractList > >>>> 18.0% java.util.HashMap > >>>> 15.8% java.util.ArrayList > >>>> 10.5% o.a.h.unpack200.bytecode.ClassConstantPool.* > >>>> 5.3% o.a.h.unpack200.bytecode.CPUTF8.* (hashcode mostly) > >>>> 4.5% java.io.* > >>>> 4.5% java.lang.String.* > >>>> 4.4% o.a.h.unpack200.bytecode.ByteCode.* > >>>> 3.9% o.a.h.unpack200.bytecode.Ic{Tuple|Bands}.* > >>>> 14.7% other > >>>> > >>>> So the main concern is Collections usage. ClassConstantPool uses Lists > >>>> excessively, so I suspect the significant amount of time is spent > >>>> there. > >>>> > >>>> NB: > >>>> Timings for the scenario (the less the better): > >>>> Harmony's pack200: 67 secs > >>>> Sun's pack200: 6 secs > >>>> > >>>> Yep, 10 times faster. > >>>> > >>>> Thanks, > >>>> Aleksey. > >>>> > >>> > >>> > >>> > >>> -- > >>> Unless stated otherwise above: > >>> IBM United Kingdom Limited - Registered in England and Wales with > number > >>> 741598. > >>> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 > 3AU > >>> > >> > > >
