Re: [Python-Dev] Dataclasses and correct hashability

Steven D'Aprano Tue, 06 Feb 2018 09:28:56 -0800

On Mon, Feb 05, 2018 at 10:50:21AM -0800, David Mertz wrote:

> Absolutely I agree. 'unsafe_hash' as a name is clear warning to users.


(I don't mean to pick on David specifically; I had to reply to some 
message in this thread and I just picked his.)

I'm rather gobsmacked at the attitudes of many people here about hashing 
data classes. I thought *I* was the cynical pessimist who didn't have a 
high opinion of the quality of the average programmer, but according to 
this thread apparently I'm positively Pollyanna-esque for believing that 
most people will realise that if an API offers separate switches for 
hashable and frozen, you need to set *both* if you want both.

Greg Smith even says that writing dunders apart from __init__ is a code 
smell, and warns people not to write dunders. Seriously? I get that 
__hash__ is hard to write correctly, which is why we have a hash=True to 
do the hard work for us, but I can't help feeling that at the point 
we're saying "don't write dunders, any dunder, you'll only do it wrong" 
we have crossed over to the wrong side of the pessimist/optimist line.

But here we are: talking about naming a perfectly reasonable argument 
"unsafe_hash". Why are we trying to frighten people?

There is nothing unsafe about a dataclass with hash=True, frozen=True, 
but this scheme means that even people who know what they're doing will 
write unsafe_hash=True, frozen=True, as if hashability was some sort of 
hand grenade waiting to go off.
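
A minimal sketch of the safe combination, assuming the dataclasses 
module as released in Python 3.7, where the parameter is spelled 
unsafe_hash. The Point class and the dictionary are hypothetical 
illustrations, not anything from this thread. Because the instance is 
frozen, its fields (and so its hash) cannot change once it is used as a 
dictionary key:

```python
from dataclasses import dataclass, FrozenInstanceError


# Hypothetical example class: hashable *and* frozen.
@dataclass(unsafe_hash=True, frozen=True)
class Point:
    x: int
    y: int


p = Point(1, 2)
lookup = {p: "first point"}

# An equal instance hashes the same, so it finds the same entry.
value = lookup[Point(1, 2)]

# Assigning to a field of a frozen instance raises, so the hash stays valid.
try:
    p.x = 10
except FrozenInstanceError:
    mutated = False
else:
    mutated = True
```

Here value is the stored string and mutated stays False, which is the 
whole point: nothing about this usage needs a warning label.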

Perhaps we ought to deprecate __hash__ and start calling it 
__danger_danger_hash__ too? No, I don't think so.

In the past, we've (rightly!) rejected proposals to call things like 
eval "unsafe_eval", and that really is dangerously unsafe when used 
naively with untrusted, unsanitised data. Hashing mutable objects by 
accident might be annoyingly difficult and frustrating to debug, but 
code injection attacks can lead to identity theft and worse, serious 
consequences for real people.
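
For comparison, this is roughly what the "annoying to debug" failure 
looks like. It is a sketch under the same assumption as before (the 
Python 3.7 spelling unsafe_hash), and the Counter class is a 
hypothetical example. An int field is used so the hashes do not depend 
on string hash randomisation:

```python
from dataclasses import dataclass


# Hypothetical example class: hashable but NOT frozen.
@dataclass(unsafe_hash=True)
class Counter:
    count: int


c = Counter(1)
seen = {c}

found_before = c in seen   # the object is in the set

# Mutating a field changes the generated hash, because __hash__ is
# computed from the current field values.
c.count = 2

found_after = c in seen    # lookup now uses the new hash
still_stored = len(seen)   # the object is still physically in the set
```

After the mutation the set still holds the object, yet membership tests 
for it fail. That is a real bug, but it is a confusing lookup failure, 
not a security hole.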

I'm 100% in favour of programmer education, but I think this label is 
*miseducation*. We're suggesting that hashability is unsafe, regardless 
of whether the object is frozen or not.

I'd far prefer to get a runtime warning:

"Are you sure you want hash=True without frozen=True?"

(or words to that effect) rather than burden all uses of the hash 
parameter, good and bad, with the unsafe label.
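
A rough sketch of what such a warning could look like, written as a 
wrapper around the released decorator. Everything named here is an 
assumption for illustration: checked_dataclass is a hypothetical helper, 
its hash parameter mirrors the spelling used in this message, and it 
forwards to the real unsafe_hash parameter. It is not a proposal for the 
actual implementation.

```python
import warnings
from dataclasses import dataclass


def checked_dataclass(cls=None, *, hash=False, frozen=False, **kwargs):
    """Hypothetical wrapper: like dataclass(), but warns on hash without frozen."""

    def wrap(c):
        if hash and not frozen:
            # Warn only for the risky combination; frozen classes stay quiet.
            warnings.warn(
                "Are you sure you want hash=True without frozen=True?",
                RuntimeWarning,
                stacklevel=2,
            )
        return dataclass(c, unsafe_hash=hash, frozen=frozen, **kwargs)

    # Support both @checked_dataclass and @checked_dataclass(...).
    return wrap if cls is None else wrap(cls)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")

    @checked_dataclass(hash=True, frozen=True)
    class Safe:
        x: int

    quiet_count = len(caught)

    @checked_dataclass(hash=True)
    class Risky:
        x: int

    noisy_count = len(caught)
```

The frozen class is created silently, while the mutable one triggers 
exactly one RuntimeWarning, so only the questionable usage pays for the 
message.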


-- 
Steve