On 4/8/19 1:40 PM, Aleksey Shipilev wrote:
On 4/8/19 1:28 PM, Peter Levart wrote:
The reasoning is very similar to the case with just one field. With one field (hash) the thread sees either the default value (0) or a non-zero value calculated either by this thread sometime before or by a concurrent thread that has already stored it. Regardless of ordering, the thread either uses the non-zero value or (re)calculates it (again). The value calculation is deterministic and uses immutable published state (the array), so it always calculates the same value for the same object. Idempotence is guaranteed.

The same reasoning can be extended to the general case where many fields are used for caching a state calculated from some immutable published state. The constraint is that the calculation must be deterministic and must also deterministically choose which of the many caching fields is to be modified. Only one field may be modified, never more than one. The thread therefore sees either the default values of all fields, or the default values of all fields but one, which has been set either by this thread sometime before or by a concurrent thread. Regardless of ordering, the thread either uses the state combined from the default values of all fields but one plus the non-default value of that single field, or (re)calculates the non-default value of the single field. The value calculation is deterministic, uses immutable published state and deterministically chooses the field to modify, so it always calculates the same "next" state for the object. Idempotence is guaranteed.
Thank you, the mere existence of this wall of text solidifies my argument: the need to invoke an argument like that is exactly the cognitive complexity I've been talking about, and it speaks to maintainability/risk cost, while the benefits are still around the machine epsilon.

I tried to write the two descriptions side by side to show that the 2nd is not more complex than the 1st. It's just using longer "nouns". The sentences are otherwise equivalent and there's additional text that describes the "nouns". I could have done a better job though...

So here's a 2nd try:

The String hash code caching (as it is written today) is an example of a benign data race that can be described as caching of lazily calculated state derived from immutable published state, both modeled in the same object. A data race is benign if:

- the published state used as input to the calculation is immutable
- the calculation is deterministic
- threads observe the cached calculated state of the object to be updated just once, atomically. This means there are only two observable states of the object: the "initial" state, where the cached calculated data is not set, and the "updated" state, where the cached calculated data is set.

Java fields up to 32 bits wide (plus reference fields, regardless of pointer width) are updated atomically.

So if the update of the object state (the transition from the "initial" to the "updated" state) is performed by a write of a deterministically calculated value to a single, deterministically chosen field no more than 32 bits wide (or a reference field), the whole object state is observed to change atomically and the data race is benign.
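
For illustration, here is a minimal sketch of the current single-field pattern (the names are illustrative, not the actual java.lang.String source): the cache lives in one 32-bit field, so the racy publication is a single atomic store and the race is benign.

final class CachedHash {
    private final byte[] value;   // immutable published state, input of the calculation
    private int hash;             // cache field; 0 means "not yet calculated"

    CachedHash(byte[] value) {
        this.value = value.clone();
    }

    @Override
    public int hashCode() {
        int h = hash;                   // racy read: either 0 or a previously stored result
        if (h == 0 && value.length > 0) {
            for (byte b : value) {
                h = 31 * h + b;         // deterministic function of the immutable state
            }
            hash = h;                   // single 32-bit write: object state changes atomically
        }
        return h;
    }
}

Note that, as with today's String, a value whose hash happens to be 0 is recalculated on every call, which is exactly what a second caching field can avoid.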

Current and proposed caching differ only in the number of fields used for caching the calculated state, but both adhere to the above rules.

So the reasoning stays the same as with the current code. It only takes a moment to realize that it's all about a single field being updated, while the presence of other fields (zero or more) doesn't change the picture, since they stay constant for the whole lifetime of the object.
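
To make that concrete, here is a hedged sketch of a two-field variant (the field names are mine, not necessarily those of the proposed patch): for any given object exactly one of the two cache fields is ever written, each write is a single atomic store, and the other field stays at its default for the whole lifetime of the object.

final class CachedHash2 {
    private final byte[] value;     // immutable published state
    private int hash;               // written only when the calculated hash is non-zero
    private boolean hashIsZero;     // written only when the calculated hash is zero

    CachedHash2(byte[] value) {
        this.value = value.clone();
    }

    @Override
    public int hashCode() {
        int h = hash;                       // racy read of one cache field
        if (h == 0 && !hashIsZero) {        // racy read of the other cache field
            for (byte b : value) {
                h = 31 * h + b;             // same deterministic calculation
            }
            if (h == 0) {
                hashIsZero = true;          // the single field modified for this object
            } else {
                hash = h;                   // the single field modified for this object
            }
        }
        return h;
    }
}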

If you're afraid that a future maintainer of that code would not realize that, then a simple comment put into the String.hashCode method and the java_lang_String::set_hash C++ method, saying something like the following:

// only a single field may be modified so that the object state is updated atomically

...is surely going to help him/her keep the String free from bugs...

Regards, Peter

