On May 29, 2006, at 3:14 AM, Thomas Wouters wrote:



On 5/29/06, Bob Ippolito <[EMAIL PROTECTED]> wrote:
Well, the behavior change is in response to a bug <http://python.org/sf/1229380>. If nothing else, we should at least fix the standard library such that it doesn't depend on struct bugs. This is the only way to find them :)

Feel free to comment on how the zlib.crc32/gzip co-operation should be fixed. I don't see an obviously correct fix. The trunk is currently failing tests it shouldn't fail. Also note that the error isn't with feeding signed values to unsigned formats (which is what the bug is about) but the other way 'round, although I do believe both should be accepted for the time being, while generating a warning.

Well, first I'm going to just correct the modules that are broken (zlib, gzip, tarfile, binhex and probably one or two others).

Basically the struct module previously only checked for errors if you didn't specify an endianness. That's really strange and leads to very confusing results. The only code that really should be broken by this additional check is code that existed before Python had a long type and only signed values were available.

Alas, reality is different. The fundamental difference between types in Python and in C causes this, and code using struct is usually meant specifically to bridge those two worlds. Furthermore, struct is often used to *fix* that issue, by flipping sign bits if necessary:

Well, in C you get a compiler warning for stuff like this.

>>> struct.unpack("<l", struct.pack("<l", 3221225472))
(-1073741824,)
>>> struct.unpack("<l", struct.pack("<L", 3221225472))
(-1073741824,)
>>> struct.unpack("<l", struct.pack("<l", -1073741824))
(-1073741824,)
>>> struct.unpack("<l", struct.pack("<L", -1073741824))
(-1073741824,)

Before this change, you didn't have to check whether the value is negative before the struct.unpack/pack dance, regardless of which format character you used. This misfeature is used (and many would consider it convenient, even Pythonic, for struct to DWIM); breaking it suddenly is bad.
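For illustration, the sign check that callers would need under strict range checking can be done with plain integer arithmetic before calling struct. This is only a minimal sketch: the helper names to_unsigned32 and to_signed32 are hypothetical and are not part of the struct module.

```python
import struct

def to_unsigned32(value):
    """Reinterpret an integer as an unsigned 32-bit value (two's complement)."""
    return value & 0xFFFFFFFF

def to_signed32(value):
    """Reinterpret an integer as a signed 32-bit value (two's complement)."""
    value &= 0xFFFFFFFF
    # Values with the top bit set represent negative numbers.
    return value - 0x100000000 if value >= 0x80000000 else value

# Once normalized, the value is in range for the unsigned format,
# and unpacking as signed gives back the original negative number.
packed = struct.pack("<L", to_unsigned32(-1073741824))
assert struct.unpack("<l", packed) == (to_signed32(3221225472),)
```

With helpers like these, the caller states the intended interpretation explicitly instead of relying on struct to ignore the sign.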

struct doesn't really DWIM anyway, since integers are up-converted to longs and will overflow past what the (old or new) struct module will accept. Before there was a long type or automatic up-converting, the sign agnosticism worked, but it doesn't really work correctly these days.

We have two choices: either fix it to behave consistently broken everywhere for numbers of every size (modulo every number that comes in so that it fits), or have it do proper range checking. A compromise is to do proper range checking as a warning, and do the modulo math anyway... but is that what we really want?
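The compromise described above could look roughly like the following wrapper. It is a sketch under stated assumptions, not the struct module's actual behavior: pack_lenient and the _RANGES table are hypothetical names, only a few integer codes are covered, and the ranges assume standard sizes (so an explicit byte-order prefix such as "<" or ">", not native "@").

```python
import struct
import warnings

# Hypothetical table of value ranges for a few standard-size integer codes.
_RANGES = {
    "b": (-2**7, 2**7 - 1),   "B": (0, 2**8 - 1),
    "h": (-2**15, 2**15 - 1), "H": (0, 2**16 - 1),
    "l": (-2**31, 2**31 - 1), "L": (0, 2**32 - 1),
}

def pack_lenient(code, value, byteorder="<"):
    """Pack one integer, warning and wrapping modulo 2**bits if out of range."""
    lo, hi = _RANGES[code]
    if not lo <= value <= hi:
        warnings.warn(
            "%r is out of range for format %r; wrapping" % (value, code),
            DeprecationWarning,
            stacklevel=2,
        )
        span = hi - lo + 1
        # Wrap into [lo, hi] the way two's complement truncation would.
        value = (value - lo) % span + lo
    return struct.pack(byteorder + code, value)

# Out-of-range values still pack, but each call emits a warning.
assert pack_lenient("l", 3221225472) == struct.pack("<l", -1073741824)
assert pack_lenient("L", -1073741824) == struct.pack("<L", 3221225472)
```

Using a warning category such as DeprecationWarning would let callers find the offending call sites before the check became a hard error; which category fits best is a design choice left open here.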

-bob
