On Mar 4, 2008, at 4:26 AM, Denis S. Otkidach wrote:
>
> On Mon, Mar 3, 2008 at 8:23 PM, Michael Bayer <[EMAIL PROTECTED]> wrote:
>> We define __eq__() all over the place so that would be a lot of
>> __hash__() methods to add, all of which return id(self). I wonder if
>> we shouldn't just make a util.Mixin called "Hashable" so that we can
>> centralize the idea.
>
> Are you sure this is a correct way? Below is an example demonstrating
> the problem with it:
>
> >>> class C(object):
> ...     def __init__(self, value):
> ...         self._value = value
> ...     def __eq__(self, other):
> ...         return self._value == other._value
> ...     def __hash__(self):
> ...         return id(self)
> ...
> >>> c1 = C(1)
> >>> c2 = C(1)
> >>> c1 == c2
> True
> >>> d = {c1: None}
> >>> c1 in d
> True
> >>> c2 in d
> False
>
> I.e. although c2 is equal to c1 and thus should be found in the
> dictionary, it is not. The defined __hash__ method must return equal
> numbers for equal objects.
>
Well actually, in our particular case that's the behavior that we *do*
want; pretty much everywhere we've defined __eq__(), we've done it not
to redefine what it means for a==b, but to produce SQL expressions -
so in that sense __eq__() is entirely broken for its normal usage in
SQLAlchemy (as well as in all the other SQL tools out there using this
approach). For this reason, internally we can't do things like "d in
[a,b,c]" if those are SQL expressions, since __eq__() evaluates to
true in all cases - we use sets when we need a collection of SQL
expressions where we can test for presence, so that their hash value
is used.
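
To make the list-versus-set difference concrete, here is a minimal sketch. Expr and BinaryExpr are hypothetical stand-ins invented for this illustration, not SQLAlchemy classes; the only assumption is that __eq__() returns an expression object and __hash__() returns id(self), as discussed above.

```python
class BinaryExpr:
    """Hypothetical result of comparing two expressions (e.g. "a = b")."""
    def __init__(self, left, op, right):
        self.left, self.op, self.right = left, op, right


class Expr:
    """Hypothetical stand-in for a SQL expression construct."""
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Builds a new expression instead of returning a bool.
        # The returned object is truthy like any plain Python object.
        return BinaryExpr(self, "=", other)

    def __hash__(self):
        # Identity-based hash, as proposed in this thread.
        return id(self)


a, b, c, d = Expr("a"), Expr("b"), Expr("c"), Expr("d")

# List membership falls back to ==, which yields a truthy BinaryExpr,
# so d appears to be "in" a list that does not contain it.
in_list = d in [a, b, c]

# Set membership compares hash values first; the ids differ, so
# __eq__() is never consulted and d is correctly reported as absent.
in_set = d in {a, b, c}
```

Here in_list comes out true while in_set comes out false, which is the reason for preferring sets when testing for the presence of an expression.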

However, while I'm not familiar with the internals of Python
dictionaries, depending on how they are implemented we may still need
to use IdentitySet and IdentityDict, two classes (well, we have the
first one at least) which ignore the __hash__() and __eq__() methods
entirely and hash their contents strictly based on id(obj). This is
because a "hashtable" usually stores items in buckets based on a
modulus of the __hash__() value; if two items land in the same bucket,
an equality comparison is used to locate the correct object. If
Python's dict uses __eq__() for that equality comparison, we'd be in
trouble. I have a strong suspicion that it does not (since I think we
would have noticed by now), and that it compares the __hash__() values
instead, but I'm not sure; I'm also not sure whether this is slated to
change in py2.6.
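
The question of what dict does on a collision can be checked directly. The sketch below uses a hypothetical Key class whose instances all share one hash value. As far as CPython's behavior goes, a lookup compares the stored hash first, then checks identity, and only then calls __eq__(); so __eq__() is consulted when full hash values match, which distinct live objects hashed by id() would not normally do.

```python
eq_calls = []  # records every __eq__() invocation


class Key:
    """Hypothetical key type whose instances all collide on purpose."""
    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 42  # identical hash for every instance

    def __eq__(self, other):
        eq_calls.append((self.value, other.value))
        return self.value == other.value


k1, k2, k3 = Key(1), Key(1), Key(2)
d = {k1: "first"}

# Same object: the identity check succeeds before __eq__() is needed.
found_same = k1 in d
calls_after_identity = len(eq_calls)

# Different objects with the same hash: __eq__() decides the outcome.
found_equal = k2 in d   # equal value
found_other = k3 in d   # different value
```

Running this shows found_equal true and found_other false, with eq_calls non-empty only after the last two lookups. With id()-based hashes the __eq__() path is rarely reached, which would explain why no problem has been noticed so far.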

I think I might want to look into defining in util ExpressionSet /
ExpressionKeyDict set symbols (subject to the new names jek is sure to
propose... ;) ) which would be used throughout the source code to
store SQL expression constructs as keys. That way at least we can
change the underlying implementation based on observed quirks of the
version in use.
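
As a rough sketch of what an identity-keyed container could look like: the IdentitySet below is a hypothetical minimal version written for this message, not the class that actually lives in SQLAlchemy's util module, and AlwaysEqual is an invented worst-case element type. It stores members in a dict keyed on id(obj), so neither __hash__() nor __eq__() of the members is ever called.

```python
class IdentitySet:
    """Hypothetical minimal set that keys members strictly on id(obj)."""

    def __init__(self, iterable=()):
        # Maps id(obj) -> obj. Holding the object keeps it alive, so its
        # id cannot be reused by another object while it is a member.
        self._members = {}
        for obj in iterable:
            self.add(obj)

    def add(self, obj):
        self._members[id(obj)] = obj

    def discard(self, obj):
        self._members.pop(id(obj), None)

    def __contains__(self, obj):
        return id(obj) in self._members

    def __iter__(self):
        return iter(self._members.values())

    def __len__(self):
        return len(self._members)


class AlwaysEqual:
    """Worst case: every instance hashes alike and compares as equal."""
    def __eq__(self, other):
        return True

    def __hash__(self):
        return 0


x, y = AlwaysEqual(), AlwaysEqual()

y_in_plain = y in {x}                 # builtin set is fooled by __eq__()
y_in_ident = y in IdentitySet([x])    # identity set is not
x_in_ident = x in IdentitySet([x])
```

An ExpressionSet name in util could then simply point at either the builtin set or a class like this one, depending on what the Python version in use turns out to need.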