I think that's bad for performance. Using Object.hashCode() leads to
System.identityHashCode(), which is handled by VM. The exact mechanism
is VM-dependent, but at least on Harmony it's pretty slow. If you want
to mark each instance as not-identical to another (warning here, you
may break something), then I suggest to use static increment, e.g.:
private static int hashcodeBase = 1;
private int cachedHashCode = hashcodeBase++;
This may end up with cache collisions if objects are created from
different threads, but that's not the case for now.
Anyway, Andrew, your code lacks caching again ;)
Thanks,
Aleksey.
On Mon, Jul 14, 2008 at 9:51 PM, Andrew Cornwall
<[EMAIL PROTECTED]> wrote:
> And in fact, I just tried the following (which makes even more sense):
>
> - add objectHashCode() to ClassFileEntry:
> protected int objectHashCode() {
> return super.hashCode();
> }
>
> - change generateHashCode in ByteCode to return objectHashCode()
> private int generateHashCode() {
> hashcodeComputed = true;
> return objectHashCode();
> }
>
> Since ByteCodes are equal if and only if they are identical, this seems to
> be the right thing to do. What do you think?
>
>
> On Mon, Jul 14, 2008 at 10:44 AM, Andrew Cornwall <[EMAIL PROTECTED]>
> wrote:
>
>> I applied Aleksey's changes, and they look pretty good. I disagree with
>> Sian to some degree about ByteCode. On my VM (which isn't Harmony), a new
>> empty array's hashCode() is dependent on the array's location in memory, and
>> not the array's contents. In other words:
>>
>> int[] x = new int[3];
>> System.out.println(x.hashCode());
>> x[1] = 5;
>> System.out.println(x.hashCode());
>>
>> prints the same value for in both cases. rewrite.hashCode() is a handy (if
>> lazy) way to distinguish among different instances.
>>
>> If we take rewrite out of hashCode(), HashMap and HashSet get really slow -
>> essentially a linear search among all the ByteCodes of the same form. This
>> brings my test case from 39 seconds up to 1:02.
>>
>> Perhaps the right thing to do is to give each unique instance of ByteCode
>> an integer ID which is used in creating the hashCode rather than relying on
>> rewrite to give us uniqueness?
>>
>>
>> On Mon, Jul 14, 2008 at 7:19 AM, Sian January <[EMAIL PROTECTED]>
>> wrote:
>>
>>> Ok - we'll wait and see what Andrew says. The only one that I'm not happy
>>> with is Bytecode.hashCode, because rewrite always seems to be an empty
>>> array
>>> at the point when generateHashCode is called so it's a bit misleading
>>> using
>>> it. I think it should be ok to just remove that line, and still cache the
>>> hashCode.
>>>
>>>
>>> On 14/07/2008, Aleksey Shipilev <[EMAIL PROTECTED]> wrote:
>>> >
>>> > Sian,
>>> >
>>> > Actually I had tried to extend Andrew's approach to these classes
>>> > first, but somehow I caught the degradation, that leaved me no choice
>>> > except the lazy initialization. My concern is, the
>>> > constructor-initialized hashcode is really depend on usage pattern for
>>> > each specific class, while lazy initialization has more guarantees to
>>> > be performance-stable. Moreover, I suspect the lazy initialization can
>>> > degrade performance much less because the only overhead it causes is
>>> > checking the value of boolean field. On the other hand, the
>>> > constructor initialization may degrade performance a lot since the
>>> > generation of hashCode is expensive.
>>> >
>>> > I can recheck which classes favor lazy initialization and which are
>>> > not, but I think it's not valuable in terms of efficiency. I mean here
>>> > that the boost connected with changing lazy initialization to
>>> > constructor one is much lower than boost from caching hashcode anyway.
>>> >
>>> > Can we accept the patch in this form and revisit this difference later?
>>> > It would be better to focus on more profitable areas for improvements
>>> for
>>> > now.
>>> >
>>> > P.S. I had asked Andrew to recheck whether my patch works as fast as
>>> > his, also to check lazy initialization approach. On my tests the boost
>>> > is stable and good.
>>> >
>>> > Thanks,
>>> > Aleksey.
>>> >
>>> > On Mon, Jul 14, 2008 at 5:25 PM, Sian January
>>> > <[EMAIL PROTECTED]> wrote:
>>> > > Hi Aleksey,
>>> > >
>>> > > It's a really good idea to extend this patch to cover some more
>>> classes,
>>> > but
>>> > > I think Andrew's method is faster (a final cachedHashCode field that
>>> is
>>> > > initialized in the constructor). The only reason I see to do it later
>>> > would
>>> > > be if we thought some of these objects never had hashCode called on
>>> them,
>>> > > but I don't think that's the case. Would you be able to try that
>>> method
>>> > > instead in your patch and see if I'm right about it being faster?
>>> > >
>>> > > Thanks,
>>> > >
>>> > > Sian
>>> > >
>>> > >
>>> > >
>>> > > On 12/07/2008, Aleksey Shipilev <[EMAIL PROTECTED]> wrote:
>>> > >>
>>> > >> Andrew,
>>> > >>
>>> > >> I had attached the patch to HARMONY-5907, covering several first
>>> > >> methods. Can you confirm this patch helps for your scenario?
>>> > >>
>>> > >> Thanks,
>>> > >> Aleksey.
>>> > >>
>>> > >> On Sat, Jul 12, 2008 at 1:58 PM, Aleksey Shipilev
>>> > >> <[EMAIL PROTECTED]> wrote:
>>> > >> > And the sorted list:
>>> > >> >
>>> > >> > 95462388 bc.cputf8
>>> > >> > 18646908 bc.bytecode
>>> > >> > 15118425 bc.cpclass
>>> > >> > 14928914 bc.cpnametype
>>> > >> > 12103799 bc.cpmethref
>>> > >> > 5159994 bc.cpfieldref
>>> > >> > 3420605 bc.methref
>>> > >> > 1840965 bc.cpstring
>>> > >> > 839916 bc.codeattr
>>> > >> > 839916 bc.locvarattr
>>> > >> > 839916 bc.linenumattr
>>> > >> > 430234 bc.cpmethod
>>> > >> > 277144 bc.cpfield
>>> > >> > 263753 bc.attr
>>> > >> > 153811 bc.cpinteger
>>> > >> > 121856 bc.newattr
>>> > >> > 93471 bc.cvalattr
>>> > >> > 72492 bc.excpattr
>>> > >> > 57428 bc.srcfileattr
>>> > >> > 57428 bc.srcfileattr
>>> > >> > 48104 bc.cplong
>>> > >> > 40362 bc.innerclass
>>> > >> > 5593 bc.depattr
>>> > >> > 3255 bc.cpfloat
>>> > >> > 1638 bc.cpdouble
>>> > >> > 532 attrlayout
>>> > >> > 0 archive
>>> > >> > 0 attrdef
>>> > >> > 0 newattrband
>>> > >> > 0 bc.anndefarg
>>> > >> > 0 bc.rtannattr
>>> > >> > 0 classbands
>>> > >> > 0 filebands
>>> > >> > 0 metabandgr
>>> > >> > 0 segheader
>>> > >> > 0 bc.remattr
>>> > >> > 0 bc.annattr
>>> > >> > 0 bc.cpconst
>>> > >> > 0 bc.cpmember
>>> > >> > 0 bc.signattr
>>> > >> > 0 bandset
>>> > >> > 0 bcbands
>>> > >> > 0 cpbands
>>> > >> > 0 icbands
>>> > >> > 0 ictuple
>>> > >> > 0 segment
>>> > >> > 0 segopts
>>> > >> > 0 bc.classf
>>> > >> > 0 bc.cpref
>>> > >> > 0 bc.opmgr
>>> > >> > 0 bc.rtattr
>>> > >> > 0 segcp
>>> > >> > 0 bc.ccp
>>> > >> > 0 attrlayoutmap
>>> > >> > 0 bc.encmethattr
>>> > >> > 0 bc.exptableent
>>> > >> > 0 bc.locvartable
>>> > >> > 0 bc.signattr
>>> > >> >
>>> > >> > Thanks,
>>> > >> > Aleksey.
>>> > >> >
>>> > >> > On Sat, Jul 12, 2008 at 1:50 PM, Aleksey Shipilev
>>> > >> > <[EMAIL PROTECTED]> wrote:
>>> > >> >> Hi, Andrew!
>>> > >> >>
>>> > >> >> I had updated the internal profiler to support hashCode() probes
>>> [1],
>>> > >> >> to extend your effort in hashcode optimization. There are bunch of
>>> > >> >> heavily used hashcodes, most of them are going to
>>> Object.hashCode()
>>> > >> >> and then to System.identityHashCode(). We can cache/implement
>>> > hashcode
>>> > >> >> for these classes. Here's the profile:
>>> > >> >>
>>> > >> >> Hashcodes:
>>> > >> >> archive: 0
>>> > >> >> attrdef: 0
>>> > >> >> attrlayout: 532
>>> > >> >> attrlayoutmap: 0
>>> > >> >> bandset: 0
>>> > >> >> bcbands: 0
>>> > >> >> classbands: 0
>>> > >> >> cpbands: 0
>>> > >> >> filebands: 0
>>> > >> >> icbands: 0
>>> > >> >> ictuple: 0
>>> > >> >> metabandgr: 0
>>> > >> >> newattrband: 0
>>> > >> >> segcp: 0
>>> > >> >> segheader: 0
>>> > >> >> segment: 0
>>> > >> >> segopts: 0
>>> > >> >> bc.attr: 263753
>>> > >> >> bc.remattr: 0
>>> > >> >> bc.anndefarg: 0
>>> > >> >> bc.annattr: 0
>>> > >> >> bc.bytecode: 18646908
>>> > >> >> bc.ccp: 0
>>> > >> >> bc.classf: 0
>>> > >> >> bc.codeattr: 839916
>>> > >> >> bc.cvalattr: 93471
>>> > >> >> bc.cpclass: 15118425
>>> > >> >> bc.cpconst: 0
>>> > >> >> bc.cpdouble: 1638
>>> > >> >> bc.cpfield: 277144
>>> > >> >> bc.cpfieldref: 5159994
>>> > >> >> bc.cpfloat: 3255
>>> > >> >> bc.cpinteger: 153811
>>> > >> >> bc.methref: 3420605
>>> > >> >> bc.cplong: 48104
>>> > >> >> bc.cpmember: 0
>>> > >> >> bc.cpmethod: 430234
>>> > >> >> bc.cpmethref: 12103799
>>> > >> >> bc.cpnametype: 14928914
>>> > >> >> bc.cpref: 0
>>> > >> >> bc.cpstring: 1840965
>>> > >> >> bc.cputf8: 95462388
>>> > >> >> bc.depattr: 5593
>>> > >> >> bc.encmethattr: 0
>>> > >> >> bc.excpattr: 72492
>>> > >> >> bc.exptableent: 0
>>> > >> >> bc.innerclass: 40362
>>> > >> >> bc.linenumattr: 839916
>>> > >> >> bc.locvarattr: 839916
>>> > >> >> bc.locvartable: 0
>>> > >> >> bc.newattr: 121856
>>> > >> >> bc.opmgr: 0
>>> > >> >> bc.rtattr: 0
>>> > >> >> bc.rtannattr: 0
>>> > >> >> bc.signattr: 0
>>> > >> >> bc.srcfileattr: 57428
>>> > >> >>
>>> > >> >> Would you like to produce the patch?
>>> > >> >> I think it would be funny :)
>>> > >> >>
>>> > >> >> Thanks,
>>> > >> >> Aleksey.
>>> > >> >>
>>> > >> >> [1] https://issues.apache.org/jira/browse/HARMONY-5905
>>> > >> >>
>>> > >> >> On Sat, Jul 12, 2008 at 12:48 AM, Andrew Cornwall (JIRA)
>>> > >> >> <[EMAIL PROTECTED]> wrote:
>>> > >> >>>
>>> > >> >>> [
>>> > >>
>>> >
>>> https://issues.apache.org/jira/browse/HARMONY-5907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
>>> > ]
>>> > >> >>>
>>> > >> >>> Andrew Cornwall updated HARMONY-5907:
>>> > >> >>> -------------------------------------
>>> > >> >>>
>>> > >> >>> Attachment: main.patch
>>> > >> >>>
>>> > >> >>> main.patch includes change to CPUTF8.java
>>> > >> >>>
>>> > >> >>>
>>> > >> >>>> [classlib][pack200]CPUTF8.hashCode() is slow
>>> > >> >>>> --------------------------------------------
>>> > >> >>>>
>>> > >> >>>> Key: HARMONY-5907
>>> > >> >>>> URL:
>>> > >> https://issues.apache.org/jira/browse/HARMONY-5907
>>> > >> >>>> Project: Harmony
>>> > >> >>>> Issue Type: Improvement
>>> > >> >>>> Affects Versions: 5.0M6
>>> > >> >>>> Environment: Latest pack200
>>> > >> >>>> Reporter: Andrew Cornwall
>>> > >> >>>> Attachments: main.patch
>>> > >> >>>>
>>> > >> >>>>
>>> > >> >>>> The unpack process spends a lot of time doing CPUTF8.hashCode()
>>> -
>>> > >> which does String.hashCode(). We can save about 1.5 seconds of my 39
>>> > second
>>> > >> test case (about 4%) by caching the hashCode. (I thought we did this
>>> > before
>>> > >> - or maybe I dreamt it?)
>>> > >> >>>
>>> > >> >>> --
>>> > >> >>> This message is automatically generated by JIRA.
>>> > >> >>> -
>>> > >> >>> You can reply to this email to add a comment to the issue online.
>>> > >> >>>
>>> > >> >>>
>>> > >> >>
>>> > >> >
>>> > >>
>>> > >
>>> > >
>>> > >
>>> > > --
>>> > > Unless stated otherwise above:
>>> > > IBM United Kingdom Limited - Registered in England and Wales with
>>> number
>>> > > 741598.
>>> > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
>>> > 3AU
>>> > >
>>> >
>>>
>>>
>>>
>>> --
>>> Unless stated otherwise above:
>>> IBM United Kingdom Limited - Registered in England and Wales with number
>>> 741598.
>>> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>>>
>>
>>
>