Re: [classlib][luni][performance] IdentityHashMap implementation

Nathan Beyer Tue, 22 Apr 2008 23:19:51 -0700

On Tue, Apr 22, 2008 at 4:03 PM, Mark Hindess <[EMAIL PROTECTED]>
wrote:


>
> On 22 April 2008 at 23:02, "Aleksey Shipilev" <[EMAIL 
> PROTECTED]<https://mail.google.com/mail?view=cm&tf=0&[EMAIL PROTECTED]>
> >
> wrote:
> >
> > Good comments, Mark, thanks! But I have to chill your fury a little :)
>
> I'm might be a little bothered but I'm certainly not furious.  I just
> wish to point out issues so that we can make things better.  I'd not
> want to see the patch applied in its current form but I've already
> moved on from the "clean" issue - by proving to myself that it could be
> cleaned up - and I'm looking at the "non-breaking" part now.
>
> > The reason behind all that changes is that entire IdentityHashMap
> > implementation was thrown away and replaced by HashMap, then the
> > relevant changes that transform it back to IdentityHashMap were done
> > (changing hashCode() to System.identityHashCode(), equals to ==, etc).
>
> sigh... this should really have been made clear in the JIRA - see below
> for an example of how this could be done.
>
> > So if you diff the HashMap.java and IdentityHashMap.java after the
> > patch applied, you should notice very small difference. That's a
> > good sign because thus we can easily identify and generalize the
> > common code in both implementations, as we suggested to do. That's the
> > generalization we talked about with Nathan and Tim, right?
>
> Well, I've not looked at HashMap.java but I suspect that it might have
> methods ordered alphabetically too which is horrible and should be
> fixed.
>
> Also, when I reverted the spurious javadoc changes back to the
> original form, I replaced a number of instances of the specific word
> IdentityHashMap with the less specific word Map.  Given the context
> doesn't detract from the meaning and does make it easier to read.
> Rather than make the IdentityHashMap javadoc more like the HashMap
> javadoc I suggest we change the HashMap javadoc in this way too.  (This
> is a personal preference but it would make the significant differences
> between HashMap.java and IdentityHashMap.java much more obvious.)
>
> In short, making them similar is great but lets make them both better
> rather than make IdentityHashMap worse.


I agree with this approach; clean HashMap, copy to IdentityHashMap, make
Identity-specific changes.


>
>
> > Keeping this in mind, you can see that the assumption of "minimal
> > change" can't be preserved
>
> Nonsense.  I have a 33% smaller cleaner patch created in ~10 minutes
> using emacs and ediff mode to prove it.  And armed with the extra
> information that was missing from the JIRA I can suggest an even better
> way see below.
>
> > and reading out the patch itself is pretty useless.
>
> I disagree; I read it and I don't think it was a useless exercise.  If
> my reading of the patch resulted in what you referred to above as "good
> comments" then clearly it wasn't useless.
>
> Of course, it is not the whole story but then I was never under the
> illusion that it was and I am moving on to look at the result of the
> patch next.  Personally, I think people underestimate the value of
> reading raw patches.  You do get to see just the changes - in a clean
> patch anyway - and "bad smells"[0] are often easier to see this way.
>

Agreed. This is something that I didn't quite get at first, but after
working with several different projects and being committer here, the power
of patches and diffs has certainly won me over. I have a much greater
respect for SCM tools now.

I love the "code smells" reference -- this is a turn-of-phrase that I
frequently use.

-Nathan


>
> > Rather, it makes sense to apply the patch to local copy and then
> > see whether the differences between IHM and HM are enough, whether
> > resulting IHM implementation contradicts with spec, whether IHM passes
> > the tests and runs the workloads well and stuff.
>
> I'm getting to that part.  You stated that the patch was "clean,
> non-breaking, consistent implementation, ready to be committed" and I
> was testing those assertions in that order.  Although I've probably made
> my mind up about the last one too.
>
> > Of course, to avoid these confusions I can rewrite entire Collections
> > at once and then contribute a mega-patch, but I don't think this is
> > the proper way of doing things.
>
> That's an interesting goal but I don't believe mega-patches are required
> to achieve it.  Before I became a committer, I did a huge amount of
> work unifying the native code and I did it in *lots* of small steps.
> Each step was clear.  I sometimes submitted a series of patches (and/or
> scripts) for a single JIRA to make the steps as transparent as possible.
> For example, for HARMONY-5771, you might do something like:
>
>  1) Attached a script (or documented) doing:
>
>    svn copy modules/luni/src/main/java/java/util/HashMap.java \
>    modules/luni/src/main/java/java/util/IdentityHashMap.java
>
>  2) Attach a patch to be applied after the copy with the changes.
>
> I think what you had done would be much clearer and I might not have
> started this discussion.  Certainly reading the patch would have been
> much easier and the *important* changes that really need reviewing would
> have been obvious not cluttered like the current patch.[1]
>
> Having said that, I think that would still be the wrong thing to do in
> this case and you'd need:
>
>  0) Attach a patch to fix some ugly things about HashMap
>
> to avoid duplicating the IMHO less wonderful aspects of HashMap in
> IdentityHashMap.
>
> > I prefer gradual updates instead.
>
> Well, at least, we can agree about that. ;-)
>
> -Mark.
>
> [0]
> http://sis36.berkeley.edu/projects/streek/agile/bad-smells-in-code.html
>
> [1] The svn diff on the commits list would still be a mess but hopefully
> the committer would use a log message that made it clear there was a
> more readable version in the JIRA.
>
> > On Tue, Apr 22, 2008 at 9:02 PM, Mark Hindess
> > <[EMAIL PROTECTED]<https://mail.google.com/mail?view=cm&tf=0&[EMAIL 
> > PROTECTED]>>
> wrote:
> > >
> > >  On 22 April 2008 at 15:23, "Aleksey Shipilev" <
> [EMAIL PROTECTED]<https://mail.google.com/mail?view=cm&tf=0&[EMAIL PROTECTED]>
> >
> > >  wrote:
> > >
> > > > Tim,
> > >  >
> > >  > I support your proposal and try to make distinction between
> > >  > proof-of-concept and clean patches.
> > >  >
> > >  > Performance improvements could be notorious work, requiring a lot
> of
> > >  > resources. The bad thing is, they could give nothing after the work
> is
> > >  > done, that's why we are doing thought experiments and
> proof-of-concept
> > >  > patches before actually implementing something. Proof-of-concept
> may
> > >  > also give a hint on what maximum improvement we can get and set the
> > >  > expectations for clean patch.
> > >  >
> > >  > So we are not looking for hiding the problems, rather we're
> exploring
> > >  > what could it worth.
> > >  >
> > >  > F.ex. the experiments on IdentityHashMap performance finally led to
> > >  > clean, non-breaking, consistent implementation, ready to be
> committed
> > >  > patch [1]. Can we focus on reviewing it?
> > >
> > >  "clean [snip] patch"?  Okay, I'll take the bait...
> > >
> > >  1) It is good to fix the assert in the test case, but if you are
> going to
> > >  change it at all you might as well replace:
> > >
> > >   assertTrue("TestB. removed entries incorrectly", map.size() == 11 &&
> > >  !result);
> > >
> > >  with:
> > >
> > >   assertEquals("TestB. removed entries incorrectly, size is not
> correct",
> > >                11, map.size());
> > >   assertFalse("TestB. removed entries incorrectly, result is not
> correct",
> > >               result);
> > >
> > >
> > >  2) I think the IdentityHashMap javadoc comment change removes too
> much
> > >  important information.
> > >
> > >
> > >  3) I don't understand why you make arbitrary whitespace changes -
> such
> > >  as rewriting:
> > >
> > >  public class IdentityHashMap<K, V> extends AbstractMap<K, V>
> implements
> > >         Map<K, V>, Serializable, Cloneable {
> > >
> > >  to be longer than 80-characters.
> > >
> > >
> > >  4) You rename DEFAULT_MAX_SIZE to DEFAULT_SIZE but I don't really see
> > >  the relevance of this to the changes.
> > >
> > >
> > >  5) You seem to remove a number of correct comments from the
> > >  declarations of threshold, DEFAULT_MAX_SIZE, loadFactor, and
> modCount.
> > >
> > >  I almost I gave up reading the patch at this point out of frustration
> > >  at the number of irrelevant changes.  Such as:
> > >
> > >  a) the re-ordering of often unchanged public methods - previously the
> > >  public methods were grouped by similar functionality which seems
> logical
> > >  to me but now they are in alphabetical order.  This makes the patch
> much
> > >  bigger and harder to read.  If there is a good reason to re-order the
> > >  methods, do it in a separate patch.
> > >
> > >
> > >  b) The arbitrary changes to javadoc like:
> > >
> > >      /**
> > >  -     * Searches this Map for the specified value.
> > >  -     *
> > >  -     *
> > >  +     * Searches this IdentityHashMap for the specified value.
> > >  +     *
> > >       * @param value
> > >       *            the object to search for
> > >  -     * @return true if <code>value</code> is a value of this Map,
> false
> > >  +     * @return true if <code>value</code> is a value of this
> IdentityHash
> > Map,
> > >  false
> > >       *         otherwise
> > >       */
> > >
> > >  c) The arbitrary renaming of IdentityHashMapEntry to Entry.
> > >
> > >  Is it unrealistic to expect clean patches with justifications for
> > >  changes?  Can we optimise patches as well as code?
> > >
> > >  If I apply the patch and then revert the whitespace changes,
> > >  javadoc reformatting, renaming of DEFAULT_MAX_SIZE, renaming of
> > >  IdentityHashMapEntry, etc. the the resulting svn diff contains at
> least
> > >  33% fewer +/- lines than the one on the JIRA and is still
> functionally
> > >  the same.
> > >
> > >  As someone who tries to read svn commit messages (and all committers
> or
> > >  would-be-committers should be doing this to some extent!) patches
> like
> > >  this are a nightmare.
> > >
> > >  Regards,
> > >  -Mark.
> > >
> > >
> > >
> > >  >
> > >  > Thanks,
> > >  > Aleksey.
> > >  >
> > >  > [1] https://issues.apache.org/jira/browse/HARMONY-5771
> > >  >
> > >  > On Tue, Apr 22, 2008 at 12:09 PM, Tim Ellison <
> [EMAIL PROTECTED]<https://mail.google.com/mail?view=cm&tf=0&[EMAIL 
> PROTECTED]>>
> wr
> > ote:
> > >  > > Aleksey Shipilev wrote:
> > >  > >
> > >  > > > Right, Tim. It was just the proof-of-concept "Does
> IdentityHashMap
> > >  > > > tuning worth it?".
> > >  > > >
> > >  > >
> > >  > >  I think it is unhelpful to mix actual, concrete suggestions for
> perfo
> > rmanc
> > >  > e
> > >  > > improvements, with proof-of-concept ideas that break the
> specification
> >  or
> > >  > > security.
> > >  > >
> > >  > >  Please can people mark clearly in their mail whether they are
> suggest
> > ing
> > >  > > the idea only as "a thought experiment", otherwise somebody may
> apply
> > the
> > >  > > change without properly understanding the consequences.
> > >  > >
> > >  > >  I realize the onus is on the committer to ensure it is safe but
> hidin
> > g suc
> > >  > h
> > >  > > problems is, as I said, unhelpful.
> > >  > >
> > >  > >  Thanks,
> > >  > >  Tim
> > >  > >
> > >  > >
> > >  > >
> > >  > >
> > >  > >
> > >  > > > On Mon, Apr 21, 2008 at 4:50 PM, Tim Ellison <
> [EMAIL PROTECTED]<https://mail.google.com/mail?view=cm&tf=0&[EMAIL PROTECTED]>
> >
> > >  > > wrote:
> > >  > > >
> > >  > > > > Sergey Salishev wrote:
> > >  > > > >
> > >  > > > >
> > >  > > > > > It also should be noticed that replacing the
> IdentityHashMap wit
> > h
> > >  > > HashMap
> > >  > > > > >
> > >  > > > > in
> > >  > > > >
> > >  > > > > > Thread impl gives 4.5x boost on ThreadLocalBench.
> > >  > > > > >
> > >  > > > > >
> > >  > > > >  But it would be wrong since it opens a security
> vulnerability.
> > >  > > > >
> > >  > > > >  See:
> > >  > > > >  http://markmail.org/message/rwq3kif5n33wp65m
> > >  > > > >  http://svn.apache.org/viewvc?view=rev&revision=477355
> > >  > > > >
> > >  > > > >
> > >  > > > >  Regards,
> > >  > > > >  Tim
> > >  > > > >
> > >  > > > >
> > >  > > >
> > >  > > >
> > >  > >
> > >  >
> > >
> > >
> > >
> >
>
>
>

Re: [classlib][luni][performance] IdentityHashMap implementation

Reply via email to