What I can tell you is that I ***love*** this discussion. It illustrates the spirit of Pharo that we want to push. Let's make a better and cooler Smalltalk :) And people are even learning useful things along the way. Thanks
On Oct 29, 2009, at 3:14 AM, Andres Valloud wrote:

> Martin,
>
> One of the constituencies I thought of when I decided to leave
> identityHash alone was folks like you. Now, as a representative, if you
> are ok with dealing with broken identityHash senders (which I hope will
> be few), then most of my motivation for leaving identityHash unchanged
> is gone. Thus, I would not mind changing identityHash and implementing
> primIdentityHash.
>
> What about others? Would anybody mind if identityHash was changed?
>
> Some comments below...
>
>> I took a survey of the senders of #identityHash in the latest web
>> image. There aren't that many. The largest category is those that
>> want the printString of the identityHash.
>>
>
> These would probably need to be changed to get the printString of the
> primIdentityHash.
>
>> Of those that care about the value of the identityHash, there are
>> several that use it in #hash methods. The most common is this
>> definition:
>>
>> hash
>>     ^self identityHash
>>
>> These are presumably overriding superclass behavior to restore Object
>> behavior.
>
> I'd like to take a look at these, I suspect there may be low hanging
> fruit waiting to be fixed.
>
>> If the authors knew about the limited range of #identityHash, that is
>> entirely possible. I tend to think it more likely that in most cases
>> these implementations are just the simplest way to follow the dictate
>> that 'a=b -> a hash = b hash', and that they didn't really think about
>> the impact on collection performance.
>>
>
> Or maybe they chose identityHash because they can assume uniqueness (=
> effectively being ==)...
>
>> 5 improved, 2 harmed. And one of the listed harmed is MethodDictionary,
>> whose performance would not be harmed, but I assume the VM would not be
>> happy if their hashing was changed (anybody know for sure whether
>> that's true?)
>>
>
> The VM probably knows a lot about identityHash values, and most likely
> uses the primIdentityHash values because then it doesn't have to shift
> on access.
>
>> They could, and I admit to having written this kind of code in the
>> past, but I doubt that I'm typical in doing so. Do you know of any
>> Pharo code that actually *does* this sort of thing? There isn't any in
>> the distributed web image, but I didn't look at every package that is
>> meant to be loadable in Pharo.
>>
>
> I might suspect that Magma does this kind of stuff... but that's just a
> guess. I didn't immediately see any code doing so. As long as package
> maintainers are fine with two quite different versions of Pharo with
> very different identityHash method behaviors, then I do not have a
> problem.
>
>>> Clever hacks such as
>>>
>>> SomeObject>>hash
>>>
>>>     ^(self variableA identityHash bitShift: 12) + self variableB identityHash
>>>
>>>
>>> would also remain undisturbed.
>>>
>>
>> Yes, if #identityHash is changed it's the clever hacks that will have
>> to change. This could be a disadvantage of this approach, but often,
>> as in the case of IdentityDictionary, IdentitySet, and
>> WeakIdentityKeyDictionary, the necessary change is simply to remove
>> the clever hack, get simpler code, and enjoy better performance than
>> you got with the clever hack, so making the change is IMO an
>> improvement.
>>
>
> We agree, mod I wouldn't want to impose version maintenance homework on
> maintainers of large packages. For the sake of illustration only, and
> using Magma without knowing if it would be affected, I wouldn't want
> whoever is maintaining Magma to maintain two branches... one for Pharo
> 1.xyz, and one for Pharo 1.xyz++.
>
>>> Finally, I do not know of any Smalltalk
>>> in which identityHash does not answer the actual object header bits
>>> for the identity hash.
>>> If we change identityHash, then AFAIK Pharo would
>>> become the only Smalltalk in which identityHash does not answer
>>> consecutive values between 0 and (2^k)-1 (k=12 for Squeak/Pharo, k=14
>>> for VisualWorks 32 bits, k=20 for VisualWorks 64 bits, IIRC k=15 for
>>> VA and VisualSmalltalk).
>>>
>>
>> GemStone is a Smalltalk that does not answer consecutive values for
>> identityHash.
>
> Haha, I was thinking of "regular" image based Smalltalks...
>
>> In GemStone the identityHash is computed from the object's
>> OOP, and OOPs are not consecutive.
>
> Not necessarily, although I suspect identityHash values map to an
> integer interval along the lines of [0, 2^40-1]. So, if you look at
> hash(x) as a function, the image of hash(x) is a set of consecutive
> intervals. Using bitShift: to scale identityHash values would make the
> image of hash(x) sparse (with the exception of small integers,
> characters and, to some extent in VW 64 bit, small doubles).
>
>> And Smalltalk-80 basically used the
>> same scheme, though you could only have 32K objects, every one had a
>> different identityHash based on OOP.
>>
>
> These are also consecutive values... [0, 2^15-1], basically.
>
>> Also, most (all?) Smalltalks with limited ranges for identityHash do
>> have a larger range of identityHash for SmallIntegers (usually ^self),
>> so you can't use the clever hacks if you might have any SmallIntegers
>> in your collection. So any general-purpose collection must already
>> deal with the full SmallInteger range of identity hashes as keys,
>> cannot use the clever hacks, and so is likely to only be improved by
>> changing #identityHash. This is a key point that I forgot to bring up
>> last night.
>>
>
> Well, more or less, because with scaledIdentityHash you'd need to
> implement it in SmallInteger as ^self...
> but yes, I think hashed
> collections shouldn't be put into a position where they judge what's a
> good hash value and what isn't (and spend CPU time doing so at
> runtime!!!). Java does this, and as far as I could see back when I
> studied Java's hashing implementation, IMO it's not a good idea.
>
> Andres.
>
> _______________________________________________
> Pharo-project mailing list
> [email protected]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
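
For readers following the thread, here is a minimal workspace sketch of the point about limited-range identityHash values and scaling with bitShift:. It is only an illustration: the table capacity (100003) and the shift amount (18) are assumptions chosen for the example, not values taken from the Pharo image or the VM, and the snippet models hash values as plain integers rather than calling identityHash itself.

```smalltalk
"Workspace sketch. The table size and the shift amount are illustrative
 assumptions, not values taken from the Pharo image or the VM."
| capacity shift rawHashes unscaledSlots scaledSlots |
capacity := 100003.        "hypothetical hashed-collection capacity"
shift := 18.               "hypothetical scaling by 2^18"
rawHashes := 0 to: 4095.   "all values a 12-bit identity hash can take"

"Slot each hash value maps to when used as-is."
unscaledSlots := rawHashes collect: [:h | h \\ capacity].

"Slot each hash value maps to after scaling with bitShift:."
scaledSlots := rawHashes collect: [:h | (h bitShift: shift) \\ capacity].

"The unscaled slots never go past 4095, however large the table is."
Transcript
    show: 'unscaled slots span 0 to ', unscaledSlots max printString; cr;
    show: 'scaled slots span ', scaledSlots min printString,
          ' to ', scaledSlots max printString; cr.
```

With unscaled 12-bit values, a table holding many more than 4096 objects has every starting probe position inside its first 4096 slots, so collision chains pile up there. Scaled values start their probes at positions spread over the whole table, which is the sparseness of hash(x) mentioned above.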
