That's the question worth experimenting.
But taking into consideration that hashcode will not be generated
frequently, Object.hashCode() seem to be well too.
On Tue, Jul 15, 2008 at 12:37 AM, Andrew Cornwall
<[EMAIL PROTECTED]> wrote:
> I think Object.hashCode() will be better distributed than a static increment
> - wouldn't that have better performance in Hash* objects?
>
>
> On Mon, Jul 14, 2008 at 11:44 AM, Aleksey Shipilev <
> [EMAIL PROTECTED]> wrote:
>
>> Nevertheless, Object.hashCode() performance is unclear and as such it
>> should be specifically avoided if not required explicitly. Will static
>> increment suit better?
>>
>> Thanks,
>> Aleksey.
>>
>> On Mon, Jul 14, 2008 at 10:38 PM, Andrew Cornwall
>> <[EMAIL PROTECTED]> wrote:
>> > Sorry - I didn't explain fully. I intended that my code would cache the
>> > Object.hashCode() rather than recomputing the hashCode each time.
>> >
>> > On Mon, Jul 14, 2008 at 11:25 AM, Aleksey Shipilev <
>> > [EMAIL PROTECTED]> wrote:
>> >
>> >> I think that's bad for performance. Using Object.hashCode() leads to
>> >> System.identityHashCode(), which is handled by VM. The exact mechanism
>> >> is VM-dependent, but at least on Harmony it's pretty slow. If you want
>> >> to mark each instance as not-identical to another (warning here, you
>> >> may break something), then I suggest to use static increment, e.g.:
>> >>
>> >> private static int hashcodeBase = 1;
>> >> private int cachedHashCode = hashcodeBase++;
>> >>
>> >> This may end up with cache collisions if objects are created from
>> >> different threads, but that's not the case for now.
>> >>
>> >> Anyway, Andrew, your code lacks caching again ;)
>> >>
>> >> Thanks,
>> >> Aleksey.
>> >>
>> >> On Mon, Jul 14, 2008 at 9:51 PM, Andrew Cornwall
>> >> <[EMAIL PROTECTED]> wrote:
>> >> > And in fact, I just tried the following (which makes even more sense):
>> >> >
>> >> > - add objectHashCode() to ClassFileEntry:
>> >> > protected int objectHashCode() {
>> >> > return super.hashCode();
>> >> > }
>> >> >
>> >> > - change generateHashCode in ByteCode to return objectHashCode()
>> >> > private int generateHashCode() {
>> >> > hashcodeComputed = true;
>> >> > return objectHashCode();
>> >> > }
>> >> >
>> >> > Since ByteCodes are equal if and only if they are identical, this
>> seems
>> >> to
>> >> > be the right thing to do. What do you think?
>> >> >
>> >> >
>> >> > On Mon, Jul 14, 2008 at 10:44 AM, Andrew Cornwall <
>> >> [EMAIL PROTECTED]>
>> >> > wrote:
>> >> >
>> >> >> I applied Aleksey's changes, and they look pretty good. I disagree
>> with
>> >> >> Sian to some degree about ByteCode. On my VM (which isn't Harmony), a
>> >> new
>> >> >> empty array's hashCode() is dependent on the array's location in
>> memory,
>> >> and
>> >> >> not the array's contents. In other words:
>> >> >>
>> >> >> int[] x = new int[3];
>> >> >> System.out.println(x.hashCode());
>> >> >> x[1] = 5;
>> >> >> System.out.println(x.hashCode());
>> >> >>
>> >> >> prints the same value for in both cases. rewrite.hashCode() is a
>> handy
>> >> (if
>> >> >> lazy) way to distinguish among different instances.
>> >> >>
>> >> >> If we take rewrite out of hashCode(), HashMap and HashSet get really
>> >> slow -
>> >> >> essentially a linear search among all the ByteCodes of the same form.
>> >> This
>> >> >> brings my test case from 39 seconds up to 1:02.
>> >> >>
>> >> >> Perhaps the right thing to do is to give each unique instance of
>> >> ByteCode
>> >> >> an integer ID which is used in creating the hashCode rather than
>> relying
>> >> on
>> >> >> rewrite to give us uniqueness?
>> >> >>
>> >> >>
>> >> >> On Mon, Jul 14, 2008 at 7:19 AM, Sian January <
>> >> [EMAIL PROTECTED]>
>> >> >> wrote:
>> >> >>
>> >> >>> Ok - we'll wait and see what Andrew says. The only one that I'm not
>> >> happy
>> >> >>> with is Bytecode.hashCode, because rewrite always seems to be an
>> empty
>> >> >>> array
>> >> >>> at the point when generateHashCode is called so it's a bit
>> misleading
>> >> >>> using
>> >> >>> it. I think it should be ok to just remove that line, and still
>> cache
>> >> the
>> >> >>> hashCode.
>> >> >>>
>> >> >>>
>> >> >>> On 14/07/2008, Aleksey Shipilev <[EMAIL PROTECTED]> wrote:
>> >> >>> >
>> >> >>> > Sian,
>> >> >>> >
>> >> >>> > Actually I had tried to extend Andrew's approach to these classes
>> >> >>> > first, but somehow I caught the degradation, that leaved me no
>> choice
>> >> >>> > except the lazy initialization. My concern is, the
>> >> >>> > constructor-initialized hashcode is really depend on usage pattern
>> >> for
>> >> >>> > each specific class, while lazy initialization has more guarantees
>> to
>> >> >>> > be performance-stable. Moreover, I suspect the lazy initialization
>> >> can
>> >> >>> > degrade performance much less because the only overhead it causes
>> is
>> >> >>> > checking the value of boolean field. On the other hand, the
>> >> >>> > constructor initialization may degrade performance a lot since the
>> >> >>> > generation of hashCode is expensive.
>> >> >>> >
>> >> >>> > I can recheck which classes favor lazy initialization and which
>> are
>> >> >>> > not, but I think it's not valuable in terms of efficiency. I mean
>> >> here
>> >> >>> > that the boost connected with changing lazy initialization to
>> >> >>> > constructor one is much lower than boost from caching hashcode
>> >> anyway.
>> >> >>> >
>> >> >>> > Can we accept the patch in this form and revisit this difference
>> >> later?
>> >> >>> > It would be better to focus on more profitable areas for
>> improvements
>> >> >>> for
>> >> >>> > now.
>> >> >>> >
>> >> >>> > P.S. I had asked Andrew to recheck whether my patch works as fast
>> as
>> >> >>> > his, also to check lazy initialization approach. On my tests the
>> >> boost
>> >> >>> > is stable and good.
>> >> >>> >
>> >> >>> > Thanks,
>> >> >>> > Aleksey.
>> >> >>> >
>> >> >>> > On Mon, Jul 14, 2008 at 5:25 PM, Sian January
>> >> >>> > <[EMAIL PROTECTED]> wrote:
>> >> >>> > > Hi Aleksey,
>> >> >>> > >
>> >> >>> > > It's a really good idea to extend this patch to cover some more
>> >> >>> classes,
>> >> >>> > but
>> >> >>> > > I think Andrew's method is faster (a final cachedHashCode field
>> >> that
>> >> >>> is
>> >> >>> > > initialized in the constructor). The only reason I see to do it
>> >> later
>> >> >>> > would
>> >> >>> > > be if we thought some of these objects never had hashCode called
>> on
>> >> >>> them,
>> >> >>> > > but I don't think that's the case. Would you be able to try
>> that
>> >> >>> method
>> >> >>> > > instead in your patch and see if I'm right about it being
>> faster?
>> >> >>> > >
>> >> >>> > > Thanks,
>> >> >>> > >
>> >> >>> > > Sian
>> >> >>> > >
>> >> >>> > >
>> >> >>> > >
>> >> >>> > > On 12/07/2008, Aleksey Shipilev <[EMAIL PROTECTED]>
>> >> wrote:
>> >> >>> > >>
>> >> >>> > >> Andrew,
>> >> >>> > >>
>> >> >>> > >> I had attached the patch to HARMONY-5907, covering several
>> first
>> >> >>> > >> methods. Can you confirm this patch helps for your scenario?
>> >> >>> > >>
>> >> >>> > >> Thanks,
>> >> >>> > >> Aleksey.
>> >> >>> > >>
>> >> >>> > >> On Sat, Jul 12, 2008 at 1:58 PM, Aleksey Shipilev
>> >> >>> > >> <[EMAIL PROTECTED]> wrote:
>> >> >>> > >> > And the sorted list:
>> >> >>> > >> >
>> >> >>> > >> > 95462388 bc.cputf8
>> >> >>> > >> > 18646908 bc.bytecode
>> >> >>> > >> > 15118425 bc.cpclass
>> >> >>> > >> > 14928914 bc.cpnametype
>> >> >>> > >> > 12103799 bc.cpmethref
>> >> >>> > >> > 5159994 bc.cpfieldref
>> >> >>> > >> > 3420605 bc.methref
>> >> >>> > >> > 1840965 bc.cpstring
>> >> >>> > >> > 839916 bc.codeattr
>> >> >>> > >> > 839916 bc.locvarattr
>> >> >>> > >> > 839916 bc.linenumattr
>> >> >>> > >> > 430234 bc.cpmethod
>> >> >>> > >> > 277144 bc.cpfield
>> >> >>> > >> > 263753 bc.attr
>> >> >>> > >> > 153811 bc.cpinteger
>> >> >>> > >> > 121856 bc.newattr
>> >> >>> > >> > 93471 bc.cvalattr
>> >> >>> > >> > 72492 bc.excpattr
>> >> >>> > >> > 57428 bc.srcfileattr
>> >> >>> > >> > 57428 bc.srcfileattr
>> >> >>> > >> > 48104 bc.cplong
>> >> >>> > >> > 40362 bc.innerclass
>> >> >>> > >> > 5593 bc.depattr
>> >> >>> > >> > 3255 bc.cpfloat
>> >> >>> > >> > 1638 bc.cpdouble
>> >> >>> > >> > 532 attrlayout
>> >> >>> > >> > 0 archive
>> >> >>> > >> > 0 attrdef
>> >> >>> > >> > 0 newattrband
>> >> >>> > >> > 0 bc.anndefarg
>> >> >>> > >> > 0 bc.rtannattr
>> >> >>> > >> > 0 classbands
>> >> >>> > >> > 0 filebands
>> >> >>> > >> > 0 metabandgr
>> >> >>> > >> > 0 segheader
>> >> >>> > >> > 0 bc.remattr
>> >> >>> > >> > 0 bc.annattr
>> >> >>> > >> > 0 bc.cpconst
>> >> >>> > >> > 0 bc.cpmember
>> >> >>> > >> > 0 bc.signattr
>> >> >>> > >> > 0 bandset
>> >> >>> > >> > 0 bcbands
>> >> >>> > >> > 0 cpbands
>> >> >>> > >> > 0 icbands
>> >> >>> > >> > 0 ictuple
>> >> >>> > >> > 0 segment
>> >> >>> > >> > 0 segopts
>> >> >>> > >> > 0 bc.classf
>> >> >>> > >> > 0 bc.cpref
>> >> >>> > >> > 0 bc.opmgr
>> >> >>> > >> > 0 bc.rtattr
>> >> >>> > >> > 0 segcp
>> >> >>> > >> > 0 bc.ccp
>> >> >>> > >> > 0 attrlayoutmap
>> >> >>> > >> > 0 bc.encmethattr
>> >> >>> > >> > 0 bc.exptableent
>> >> >>> > >> > 0 bc.locvartable
>> >> >>> > >> > 0 bc.signattr
>> >> >>> > >> >
>> >> >>> > >> > Thanks,
>> >> >>> > >> > Aleksey.
>> >> >>> > >> >
>> >> >>> > >> > On Sat, Jul 12, 2008 at 1:50 PM, Aleksey Shipilev
>> >> >>> > >> > <[EMAIL PROTECTED]> wrote:
>> >> >>> > >> >> Hi, Andrew!
>> >> >>> > >> >>
>> >> >>> > >> >> I had updated the internal profiler to support hashCode()
>> >> probes
>> >> >>> [1],
>> >> >>> > >> >> to extend your effort in hashcode optimization. There are
>> bunch
>> >> of
>> >> >>> > >> >> heavily used hashcodes, most of them are going to
>> >> >>> Object.hashCode()
>> >> >>> > >> >> and then to System.identityHashCode(). We can
>> cache/implement
>> >> >>> > hashcode
>> >> >>> > >> >> for these classes. Here's the profile:
>> >> >>> > >> >>
>> >> >>> > >> >> Hashcodes:
>> >> >>> > >> >> archive: 0
>> >> >>> > >> >> attrdef: 0
>> >> >>> > >> >> attrlayout: 532
>> >> >>> > >> >> attrlayoutmap: 0
>> >> >>> > >> >> bandset: 0
>> >> >>> > >> >> bcbands: 0
>> >> >>> > >> >> classbands: 0
>> >> >>> > >> >> cpbands: 0
>> >> >>> > >> >> filebands: 0
>> >> >>> > >> >> icbands: 0
>> >> >>> > >> >> ictuple: 0
>> >> >>> > >> >> metabandgr: 0
>> >> >>> > >> >> newattrband: 0
>> >> >>> > >> >> segcp: 0
>> >> >>> > >> >> segheader: 0
>> >> >>> > >> >> segment: 0
>> >> >>> > >> >> segopts: 0
>> >> >>> > >> >> bc.attr: 263753
>> >> >>> > >> >> bc.remattr: 0
>> >> >>> > >> >> bc.anndefarg: 0
>> >> >>> > >> >> bc.annattr: 0
>> >> >>> > >> >> bc.bytecode: 18646908
>> >> >>> > >> >> bc.ccp: 0
>> >> >>> > >> >> bc.classf: 0
>> >> >>> > >> >> bc.codeattr: 839916
>> >> >>> > >> >> bc.cvalattr: 93471
>> >> >>> > >> >> bc.cpclass: 15118425
>> >> >>> > >> >> bc.cpconst: 0
>> >> >>> > >> >> bc.cpdouble: 1638
>> >> >>> > >> >> bc.cpfield: 277144
>> >> >>> > >> >> bc.cpfieldref: 5159994
>> >> >>> > >> >> bc.cpfloat: 3255
>> >> >>> > >> >> bc.cpinteger: 153811
>> >> >>> > >> >> bc.methref: 3420605
>> >> >>> > >> >> bc.cplong: 48104
>> >> >>> > >> >> bc.cpmember: 0
>> >> >>> > >> >> bc.cpmethod: 430234
>> >> >>> > >> >> bc.cpmethref: 12103799
>> >> >>> > >> >> bc.cpnametype: 14928914
>> >> >>> > >> >> bc.cpref: 0
>> >> >>> > >> >> bc.cpstring: 1840965
>> >> >>> > >> >> bc.cputf8: 95462388
>> >> >>> > >> >> bc.depattr: 5593
>> >> >>> > >> >> bc.encmethattr: 0
>> >> >>> > >> >> bc.excpattr: 72492
>> >> >>> > >> >> bc.exptableent: 0
>> >> >>> > >> >> bc.innerclass: 40362
>> >> >>> > >> >> bc.linenumattr: 839916
>> >> >>> > >> >> bc.locvarattr: 839916
>> >> >>> > >> >> bc.locvartable: 0
>> >> >>> > >> >> bc.newattr: 121856
>> >> >>> > >> >> bc.opmgr: 0
>> >> >>> > >> >> bc.rtattr: 0
>> >> >>> > >> >> bc.rtannattr: 0
>> >> >>> > >> >> bc.signattr: 0
>> >> >>> > >> >> bc.srcfileattr: 57428
>> >> >>> > >> >>
>> >> >>> > >> >> Would you like to produce the patch?
>> >> >>> > >> >> I think it would be funny :)
>> >> >>> > >> >>
>> >> >>> > >> >> Thanks,
>> >> >>> > >> >> Aleksey.
>> >> >>> > >> >>
>> >> >>> > >> >> [1] https://issues.apache.org/jira/browse/HARMONY-5905
>> >> >>> > >> >>
>> >> >>> > >> >> On Sat, Jul 12, 2008 at 12:48 AM, Andrew Cornwall (JIRA)
>> >> >>> > >> >> <[EMAIL PROTECTED]> wrote:
>> >> >>> > >> >>>
>> >> >>> > >> >>> [
>> >> >>> > >>
>> >> >>> >
>> >> >>>
>> >>
>> https://issues.apache.org/jira/browse/HARMONY-5907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
>> >> >>> > ]
>> >> >>> > >> >>>
>> >> >>> > >> >>> Andrew Cornwall updated HARMONY-5907:
>> >> >>> > >> >>> -------------------------------------
>> >> >>> > >> >>>
>> >> >>> > >> >>> Attachment: main.patch
>> >> >>> > >> >>>
>> >> >>> > >> >>> main.patch includes change to CPUTF8.java
>> >> >>> > >> >>>
>> >> >>> > >> >>>
>> >> >>> > >> >>>> [classlib][pack200]CPUTF8.hashCode() is slow
>> >> >>> > >> >>>> --------------------------------------------
>> >> >>> > >> >>>>
>> >> >>> > >> >>>> Key: HARMONY-5907
>> >> >>> > >> >>>> URL:
>> >> >>> > >> https://issues.apache.org/jira/browse/HARMONY-5907
>> >> >>> > >> >>>> Project: Harmony
>> >> >>> > >> >>>> Issue Type: Improvement
>> >> >>> > >> >>>> Affects Versions: 5.0M6
>> >> >>> > >> >>>> Environment: Latest pack200
>> >> >>> > >> >>>> Reporter: Andrew Cornwall
>> >> >>> > >> >>>> Attachments: main.patch
>> >> >>> > >> >>>>
>> >> >>> > >> >>>>
>> >> >>> > >> >>>> The unpack process spends a lot of time doing
>> >> CPUTF8.hashCode()
>> >> >>> -
>> >> >>> > >> which does String.hashCode(). We can save about 1.5 seconds of
>> my
>> >> 39
>> >> >>> > second
>> >> >>> > >> test case (about 4%) by caching the hashCode. (I thought we did
>> >> this
>> >> >>> > before
>> >> >>> > >> - or maybe I dreamt it?)
>> >> >>> > >> >>>
>> >> >>> > >> >>> --
>> >> >>> > >> >>> This message is automatically generated by JIRA.
>> >> >>> > >> >>> -
>> >> >>> > >> >>> You can reply to this email to add a comment to the issue
>> >> online.
>> >> >>> > >> >>>
>> >> >>> > >> >>>
>> >> >>> > >> >>
>> >> >>> > >> >
>> >> >>> > >>
>> >> >>> > >
>> >> >>> > >
>> >> >>> > >
>> >> >>> > > --
>> >> >>> > > Unless stated otherwise above:
>> >> >>> > > IBM United Kingdom Limited - Registered in England and Wales
>> with
>> >> >>> number
>> >> >>> > > 741598.
>> >> >>> > > Registered office: PO Box 41, North Harbour, Portsmouth,
>> Hampshire
>> >> PO6
>> >> >>> > 3AU
>> >> >>> > >
>> >> >>> >
>> >> >>>
>> >> >>>
>> >> >>>
>> >> >>> --
>> >> >>> Unless stated otherwise above:
>> >> >>> IBM United Kingdom Limited - Registered in England and Wales with
>> >> number
>> >> >>> 741598.
>> >> >>> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire
>> PO6
>> >> 3AU
>> >> >>>
>> >> >>
>> >> >>
>> >> >
>> >>
>> >
>>
>