Nevertheless, Object.hashCode() performance is unclear and as such it
should be specifically avoided if not required explicitly. Will static
increment suit better?

Thanks,
Aleksey.

On Mon, Jul 14, 2008 at 10:38 PM, Andrew Cornwall
<[EMAIL PROTECTED]> wrote:
> Sorry - I didn't explain fully. I intended that my code would cache the
> Object.hashCode() rather than recomputing the hashCode each time.
>
> On Mon, Jul 14, 2008 at 11:25 AM, Aleksey Shipilev <
> [EMAIL PROTECTED]> wrote:
>
>> I think that's bad for performance. Using Object.hashCode() leads to
>> System.identityHashCode(), which is handled by VM. The exact mechanism
>> is VM-dependent, but at least on Harmony it's pretty slow. If you want
>> to mark each instance as not-identical to another (warning here, you
>> may break something), then I suggest to use static increment, e.g.:
>>
>> private static int hashcodeBase = 1;
>> private int cachedHashCode = hashcodeBase++;
>>
>> This may end up with cache collisions if objects are created from
>> different threads, but that's not the case for now.
>>
>> Anyway, Andrew, your code lacks caching again ;)
>>
>> Thanks,
>> Aleksey.
>>
>> On Mon, Jul 14, 2008 at 9:51 PM, Andrew Cornwall
>> <[EMAIL PROTECTED]> wrote:
>> > And in fact, I just tried the following (which makes even more sense):
>> >
>> >  - add objectHashCode() to ClassFileEntry:
>> >    protected int objectHashCode() {
>> >        return super.hashCode();
>> >    }
>> >
>> > - change generateHashCode in ByteCode to return objectHashCode()
>> >   private int generateHashCode() {
>> >        hashcodeComputed = true;
>> >        return objectHashCode();
>> >   }
>> >
>> > Since ByteCodes are equal if and only if they are identical, this seems
>> to
>> > be the right thing to do. What do you think?
>> >
>> >
>> > On Mon, Jul 14, 2008 at 10:44 AM, Andrew Cornwall <
>> [EMAIL PROTECTED]>
>> > wrote:
>> >
>> >> I applied Aleksey's changes, and they look pretty good. I disagree with
>> >> Sian to some degree about ByteCode. On my VM (which isn't Harmony), a
>> new
>> >> empty array's hashCode() is dependent on the array's location in memory,
>> and
>> >> not the array's contents. In other words:
>> >>
>> >>         int[] x = new int[3];
>> >>         System.out.println(x.hashCode());
>> >>         x[1] = 5;
>> >>         System.out.println(x.hashCode());
>> >>
>> >> prints the same value for in both cases. rewrite.hashCode() is a handy
>> (if
>> >> lazy) way to distinguish among different instances.
>> >>
>> >> If we take rewrite out of hashCode(), HashMap and HashSet get really
>> slow -
>> >> essentially a linear search among all the ByteCodes of the same form.
>> This
>> >> brings my test case from 39 seconds up to 1:02.
>> >>
>> >> Perhaps the right thing to do is to give each unique instance of
>> ByteCode
>> >> an integer ID which is used in creating the hashCode rather than relying
>> on
>> >> rewrite to give us uniqueness?
>> >>
>> >>
>> >> On Mon, Jul 14, 2008 at 7:19 AM, Sian January <
>> [EMAIL PROTECTED]>
>> >> wrote:
>> >>
>> >>> Ok - we'll wait and see what Andrew says.  The only one that I'm not
>> happy
>> >>> with is Bytecode.hashCode, because rewrite always seems to be an empty
>> >>> array
>> >>> at the point when generateHashCode is called so it's a bit misleading
>> >>> using
>> >>> it.  I think it should be ok to just remove that line, and still cache
>> the
>> >>> hashCode.
>> >>>
>> >>>
>> >>> On 14/07/2008, Aleksey Shipilev <[EMAIL PROTECTED]> wrote:
>> >>> >
>> >>> > Sian,
>> >>> >
>> >>> > Actually I had tried to extend Andrew's approach to these classes
>> >>> > first, but somehow I caught the degradation, that leaved me no choice
>> >>> > except the lazy initialization. My concern is, the
>> >>> > constructor-initialized hashcode is really depend on usage pattern
>> for
>> >>> > each specific class, while lazy initialization has more guarantees to
>> >>> > be performance-stable. Moreover, I suspect the lazy initialization
>> can
>> >>> > degrade performance much less because the only overhead it causes is
>> >>> > checking the value of boolean field. On the other hand, the
>> >>> > constructor initialization may degrade performance a lot since the
>> >>> > generation of hashCode is expensive.
>> >>> >
>> >>> > I can recheck which classes favor lazy initialization and which are
>> >>> > not, but I think it's not valuable in terms of efficiency. I mean
>> here
>> >>> > that the boost connected with changing lazy initialization to
>> >>> > constructor one is much lower than boost from caching hashcode
>> anyway.
>> >>> >
>> >>> > Can we accept the patch in this form and revisit this difference
>> later?
>> >>> > It would be better to focus on more profitable areas for improvements
>> >>> for
>> >>> > now.
>> >>> >
>> >>> > P.S. I had asked Andrew to recheck whether my patch works as fast as
>> >>> > his, also to check lazy initialization approach. On my tests the
>> boost
>> >>> > is stable and good.
>> >>> >
>> >>> > Thanks,
>> >>> > Aleksey.
>> >>> >
>> >>> > On Mon, Jul 14, 2008 at 5:25 PM, Sian January
>> >>> > <[EMAIL PROTECTED]> wrote:
>> >>> > > Hi Aleksey,
>> >>> > >
>> >>> > > It's a really good idea to extend this patch to cover some more
>> >>> classes,
>> >>> > but
>> >>> > > I think Andrew's method is faster (a final cachedHashCode field
>> that
>> >>> is
>> >>> > > initialized in the constructor).  The only reason I see to do it
>> later
>> >>> > would
>> >>> > > be if we thought some of these objects never had hashCode called on
>> >>> them,
>> >>> > > but I don't think that's the case.  Would you be able to try that
>> >>> method
>> >>> > > instead in your patch and see if I'm right about it being faster?
>> >>> > >
>> >>> > > Thanks,
>> >>> > >
>> >>> > > Sian
>> >>> > >
>> >>> > >
>> >>> > >
>> >>> > > On 12/07/2008, Aleksey Shipilev <[EMAIL PROTECTED]>
>> wrote:
>> >>> > >>
>> >>> > >> Andrew,
>> >>> > >>
>> >>> > >> I had attached the patch to HARMONY-5907, covering several first
>> >>> > >> methods. Can you confirm this patch helps for your scenario?
>> >>> > >>
>> >>> > >> Thanks,
>> >>> > >> Aleksey.
>> >>> > >>
>> >>> > >> On Sat, Jul 12, 2008 at 1:58 PM, Aleksey Shipilev
>> >>> > >> <[EMAIL PROTECTED]> wrote:
>> >>> > >> > And the sorted list:
>> >>> > >> >
>> >>> > >> > 95462388         bc.cputf8
>> >>> > >> > 18646908         bc.bytecode
>> >>> > >> > 15118425         bc.cpclass
>> >>> > >> > 14928914         bc.cpnametype
>> >>> > >> > 12103799         bc.cpmethref
>> >>> > >> > 5159994  bc.cpfieldref
>> >>> > >> > 3420605  bc.methref
>> >>> > >> > 1840965  bc.cpstring
>> >>> > >> > 839916   bc.codeattr
>> >>> > >> > 839916   bc.locvarattr
>> >>> > >> > 839916   bc.linenumattr
>> >>> > >> > 430234   bc.cpmethod
>> >>> > >> > 277144   bc.cpfield
>> >>> > >> > 263753   bc.attr
>> >>> > >> > 153811   bc.cpinteger
>> >>> > >> > 121856   bc.newattr
>> >>> > >> > 93471    bc.cvalattr
>> >>> > >> > 72492    bc.excpattr
>> >>> > >> > 57428    bc.srcfileattr
>> >>> > >> > 57428    bc.srcfileattr
>> >>> > >> > 48104    bc.cplong
>> >>> > >> > 40362    bc.innerclass
>> >>> > >> > 5593     bc.depattr
>> >>> > >> > 3255     bc.cpfloat
>> >>> > >> > 1638     bc.cpdouble
>> >>> > >> > 532      attrlayout
>> >>> > >> > 0       archive
>> >>> > >> > 0        attrdef
>> >>> > >> > 0        newattrband
>> >>> > >> > 0        bc.anndefarg
>> >>> > >> > 0        bc.rtannattr
>> >>> > >> > 0        classbands
>> >>> > >> > 0        filebands
>> >>> > >> > 0        metabandgr
>> >>> > >> > 0        segheader
>> >>> > >> > 0        bc.remattr
>> >>> > >> > 0        bc.annattr
>> >>> > >> > 0        bc.cpconst
>> >>> > >> > 0        bc.cpmember
>> >>> > >> > 0        bc.signattr
>> >>> > >> > 0        bandset
>> >>> > >> > 0        bcbands
>> >>> > >> > 0        cpbands
>> >>> > >> > 0        icbands
>> >>> > >> > 0        ictuple
>> >>> > >> > 0        segment
>> >>> > >> > 0        segopts
>> >>> > >> > 0        bc.classf
>> >>> > >> > 0        bc.cpref
>> >>> > >> > 0        bc.opmgr
>> >>> > >> > 0        bc.rtattr
>> >>> > >> > 0        segcp
>> >>> > >> > 0        bc.ccp
>> >>> > >> > 0        attrlayoutmap
>> >>> > >> > 0        bc.encmethattr
>> >>> > >> > 0        bc.exptableent
>> >>> > >> > 0        bc.locvartable
>> >>> > >> > 0        bc.signattr
>> >>> > >> >
>> >>> > >> > Thanks,
>> >>> > >> > Aleksey.
>> >>> > >> >
>> >>> > >> > On Sat, Jul 12, 2008 at 1:50 PM, Aleksey Shipilev
>> >>> > >> > <[EMAIL PROTECTED]> wrote:
>> >>> > >> >> Hi, Andrew!
>> >>> > >> >>
>> >>> > >> >> I had updated the internal profiler to support hashCode()
>> probes
>> >>> [1],
>> >>> > >> >> to extend your effort in hashcode optimization. There are bunch
>> of
>> >>> > >> >> heavily used hashcodes, most of them are going to
>> >>> Object.hashCode()
>> >>> > >> >> and then to System.identityHashCode(). We can cache/implement
>> >>> > hashcode
>> >>> > >> >> for these classes. Here's the profile:
>> >>> > >> >>
>> >>> > >> >> Hashcodes:
>> >>> > >> >>  archive:       0
>> >>> > >> >>  attrdef:       0
>> >>> > >> >>  attrlayout:    532
>> >>> > >> >>  attrlayoutmap: 0
>> >>> > >> >>  bandset:       0
>> >>> > >> >>  bcbands:       0
>> >>> > >> >>  classbands:    0
>> >>> > >> >>  cpbands:       0
>> >>> > >> >>  filebands:     0
>> >>> > >> >>  icbands:       0
>> >>> > >> >>  ictuple:       0
>> >>> > >> >>  metabandgr:    0
>> >>> > >> >>  newattrband:   0
>> >>> > >> >>  segcp:         0
>> >>> > >> >>  segheader:     0
>> >>> > >> >>  segment:       0
>> >>> > >> >>  segopts:       0
>> >>> > >> >>  bc.attr:        263753
>> >>> > >> >>  bc.remattr:     0
>> >>> > >> >>  bc.anndefarg:   0
>> >>> > >> >>  bc.annattr:     0
>> >>> > >> >>  bc.bytecode:    18646908
>> >>> > >> >>  bc.ccp:         0
>> >>> > >> >>  bc.classf:      0
>> >>> > >> >>  bc.codeattr:    839916
>> >>> > >> >>  bc.cvalattr:    93471
>> >>> > >> >>  bc.cpclass:     15118425
>> >>> > >> >>  bc.cpconst:     0
>> >>> > >> >>  bc.cpdouble:    1638
>> >>> > >> >>  bc.cpfield:     277144
>> >>> > >> >>  bc.cpfieldref:  5159994
>> >>> > >> >>  bc.cpfloat:     3255
>> >>> > >> >>  bc.cpinteger:   153811
>> >>> > >> >>  bc.methref:     3420605
>> >>> > >> >>  bc.cplong:      48104
>> >>> > >> >>  bc.cpmember:    0
>> >>> > >> >>  bc.cpmethod:    430234
>> >>> > >> >>  bc.cpmethref:   12103799
>> >>> > >> >>  bc.cpnametype:  14928914
>> >>> > >> >>  bc.cpref:       0
>> >>> > >> >>  bc.cpstring:    1840965
>> >>> > >> >>  bc.cputf8:      95462388
>> >>> > >> >>  bc.depattr:     5593
>> >>> > >> >>  bc.encmethattr: 0
>> >>> > >> >>  bc.excpattr:    72492
>> >>> > >> >>  bc.exptableent: 0
>> >>> > >> >>  bc.innerclass:  40362
>> >>> > >> >>  bc.linenumattr: 839916
>> >>> > >> >>  bc.locvarattr:  839916
>> >>> > >> >>  bc.locvartable: 0
>> >>> > >> >>  bc.newattr:     121856
>> >>> > >> >>  bc.opmgr:       0
>> >>> > >> >>  bc.rtattr:      0
>> >>> > >> >>  bc.rtannattr:   0
>> >>> > >> >>  bc.signattr:    0
>> >>> > >> >>  bc.srcfileattr: 57428
>> >>> > >> >>
>> >>> > >> >> Would you like to produce the patch?
>> >>> > >> >> I think it would be funny :)
>> >>> > >> >>
>> >>> > >> >> Thanks,
>> >>> > >> >> Aleksey.
>> >>> > >> >>
>> >>> > >> >> [1] https://issues.apache.org/jira/browse/HARMONY-5905
>> >>> > >> >>
>> >>> > >> >> On Sat, Jul 12, 2008 at 12:48 AM, Andrew Cornwall (JIRA)
>> >>> > >> >> <[EMAIL PROTECTED]> wrote:
>> >>> > >> >>>
>> >>> > >> >>>     [
>> >>> > >>
>> >>> >
>> >>>
>> https://issues.apache.org/jira/browse/HARMONY-5907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
>> >>> > ]
>> >>> > >> >>>
>> >>> > >> >>> Andrew Cornwall updated HARMONY-5907:
>> >>> > >> >>> -------------------------------------
>> >>> > >> >>>
>> >>> > >> >>>    Attachment: main.patch
>> >>> > >> >>>
>> >>> > >> >>> main.patch includes change to CPUTF8.java
>> >>> > >> >>>
>> >>> > >> >>>
>> >>> > >> >>>> [classlib][pack200]CPUTF8.hashCode() is slow
>> >>> > >> >>>> --------------------------------------------
>> >>> > >> >>>>
>> >>> > >> >>>>                 Key: HARMONY-5907
>> >>> > >> >>>>                 URL:
>> >>> > >> https://issues.apache.org/jira/browse/HARMONY-5907
>> >>> > >> >>>>             Project: Harmony
>> >>> > >> >>>>          Issue Type: Improvement
>> >>> > >> >>>>    Affects Versions: 5.0M6
>> >>> > >> >>>>         Environment: Latest pack200
>> >>> > >> >>>>            Reporter: Andrew Cornwall
>> >>> > >> >>>>         Attachments: main.patch
>> >>> > >> >>>>
>> >>> > >> >>>>
>> >>> > >> >>>> The unpack process spends a lot of time doing
>> CPUTF8.hashCode()
>> >>> -
>> >>> > >> which does String.hashCode(). We can save about 1.5 seconds of my
>> 39
>> >>> > second
>> >>> > >> test case (about 4%) by caching the hashCode. (I thought we did
>> this
>> >>> > before
>> >>> > >> - or maybe I dreamt it?)
>> >>> > >> >>>
>> >>> > >> >>> --
>> >>> > >> >>> This message is automatically generated by JIRA.
>> >>> > >> >>> -
>> >>> > >> >>> You can reply to this email to add a comment to the issue
>> online.
>> >>> > >> >>>
>> >>> > >> >>>
>> >>> > >> >>
>> >>> > >> >
>> >>> > >>
>> >>> > >
>> >>> > >
>> >>> > >
>> >>> > > --
>> >>> > > Unless stated otherwise above:
>> >>> > > IBM United Kingdom Limited - Registered in England and Wales with
>> >>> number
>> >>> > > 741598.
>> >>> > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire
>> PO6
>> >>> > 3AU
>> >>> > >
>> >>> >
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> Unless stated otherwise above:
>> >>> IBM United Kingdom Limited - Registered in England and Wales with
>> number
>> >>> 741598.
>> >>> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
>> 3AU
>> >>>
>> >>
>> >>
>> >
>>
>

Reply via email to