Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
I'm afraid a sabbatical year isn't long enough to understand what the struct module did or intends to do by way of range checking <0.7 wink>.

Is this intended? This is on a 32-bit Windows box with current trunk:

    >>> from struct import pack as p
    >>> p("<I", 2**32 + 2343)
    C:\Code\python\lib\struct.py:63: DeprecationWarning: 'I' format requires 0 <= number <= 4294967295
      return o.pack(*args)
    '\x00\x00\x00\x00'

The warning makes sense, but the result doesn't make sense to me. In Python 2.4.3, that example raised OverflowError, which seems better than throwing away all the bits without an exception.

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
On May 31, 2006, at 8:31 AM, Tim Peters wrote:

> I'm afraid a sabbatical year isn't long enough to understand what the struct module did or intends to do by way of range checking <0.7 wink>.
>
> Is this intended? This is on a 32-bit Windows box with current trunk:
>
> >>> from struct import pack as p
> >>> p("<I", 2**32 + 2343)
> C:\Code\python\lib\struct.py:63: DeprecationWarning: 'I' format requires 0 <= number <= 4294967295
>   return o.pack(*args)
> '\x00\x00\x00\x00'
>
> The warning makes sense, but the result doesn't make sense to me. In Python 2.4.3, that example raised OverflowError, which seems better than throwing away all the bits without an exception.

Throwing away all the bits is a bug, it's supposed to mask with 0xffffffffL

-bob
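For reference, the masking Bob describes can be sketched in modern Python. This is an illustration, not the struct module's own code: the helper name `pack_uint32_masked` is hypothetical, the little-endian format is chosen for the example, and it assumes Python 3, where `struct.pack` raises `struct.error` for out-of-range values instead of warning.

```python
import struct

UINT32_MASK = 0xFFFFFFFF  # keeps the low 32 bits of an integer


def pack_uint32_masked(value):
    """Keep only the low 32 bits of value, then pack them as "<I"."""
    return struct.pack("<I", value & UINT32_MASK)


packed = pack_uint32_masked(2**32 + 2343)
# The low 32 bits of 2**32 + 2343 are 2343, so the packed bytes are not all zero.
assert struct.unpack("<I", packed)[0] == 2343
```

With the mask applied, the example from Tim's report would keep the value 2343 rather than produce four zero bytes.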
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
On May 29, 2006, at 8:00 PM, Tim Peters wrote:

> [Bob Ippolito]
> ...
>> Actually, should this be a FutureWarning or a DeprecationWarning?
>
> Since it was never documented, UndocumentedBugGoingAwayError ;-) Short of that, yes, DeprecationWarning. FutureWarning is for changes in non-exceptional behavior (e.g., if we swapped the meanings of "<" and ">" in struct format codes, that would rate a FutureWarning subclass, like InsaneFutureWarning).
>
>> OK, this behavior is implemented in revision 46537:
>>
>> (this is from ./python.exe -Wall)
>>
>> >>> import struct
>> ...
>> >>> struct.pack('B', -1)
>> /Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: struct integer wrapping is deprecated
>>   return o.pack(*args)
>> /Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B' format requires 0 <= number <= 255
>>   return o.pack(*args)
>> '\xff'
>
> We certainly don't want to see two deprecation warnings for a single deprecated behavior. I suggest eliminating the "struct integer wrapping" warning, mostly because I had no idea what it _meant_ before reading the comments in _struct.c ("wrapping" is used most often in a proxy or delegation context in Python these days). "'B' format requires 0 <= number <= 255" is perfectly clear all by itself.

What should it be called instead of wrapping? When it says it's wrapping, it means that it's doing x &= (2 ^ (8 * n)) - 1 to force a number into meeting the expected range.

Reducing it to one warning instead of two is kinda difficult. Is it worth the trouble?

-bob
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
Bob Ippolito wrote:
> On May 29, 2006, at 8:00 PM, Tim Peters wrote:
>> We certainly don't want to see two deprecation warnings for a single deprecated behavior. I suggest eliminating the "struct integer wrapping" warning, mostly because I had no idea what it _meant_ before reading the comments in _struct.c ("wrapping" is used most often in a proxy or delegation context in Python these days). "'B' format requires 0 <= number <= 255" is perfectly clear all by itself.
>
> What should it be called instead of wrapping? When it says it's wrapping, it means that it's doing x &= (2 ^ (8 * n)) - 1 to force a number into meeting the expected range.

"integer overflow masking" perhaps?

> Reducing it to one warning instead of two is kinda difficult. Is it worth the trouble?

If there are cases where only one warning or the other triggers, it doesn't seem worth the effort to try and suppress one of them when they both trigger.

Cheers,
Nick.

--
Nick Coghlan | Brisbane, Australia
http://www.boredomandlaziness.org
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
On May 30, 2006, at 2:41 AM, Nick Coghlan wrote:

> Bob Ippolito wrote:
>> On May 29, 2006, at 8:00 PM, Tim Peters wrote:
>>> We certainly don't want to see two deprecation warnings for a single deprecated behavior. I suggest eliminating the "struct integer wrapping" warning, mostly because I had no idea what it _meant_ before reading the comments in _struct.c ("wrapping" is used most often in a proxy or delegation context in Python these days). "'B' format requires 0 <= number <= 255" is perfectly clear all by itself.
>>
>> What should it be called instead of wrapping? When it says it's wrapping, it means that it's doing x &= (2 ^ (8 * n)) - 1 to force a number into meeting the expected range.
>
> "integer overflow masking" perhaps?

Sounds good enough, I'll go ahead and change the wording to that.

>> Reducing it to one warning instead of two is kinda difficult. Is it worth the trouble?
>
> If there are cases where only one warning or the other triggers, it doesn't seem worth the effort to try and suppress one of them when they both trigger.

It works kinda like this:

    def get_ulong(x):
        ulong_mask = (sys.maxint << 1L) | 1
        if is_unsigned and ((unsigned)x) > ulong_mask:
            x &= ulong_mask
            warning('integer overflow masking is deprecated')
        return x

    def pack_ubyte(x):
        x = get_ulong(x)
        if not (0 <= x <= 255):
            warning("'B' format requires 0 <= number <= 255")
            x &= 0xff
        return chr(x)

Given the implementation, it will warn twice if sizeof(format) < sizeof(long) AND one of the following:

1. Negative numbers are given for an unsigned format
2. Input value is greater than ((sys.maxint << 1) | 1) for an unsigned format
3. Input value is not ((-sys.maxint - 1) <= x <= sys.maxint) for a signed format

-bob
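The two-stage check in Bob's pseudocode can be approximated as runnable Python 3. This is a sketch under stated assumptions, not the C code in _struct.c: a C unsigned long is assumed to be 32 bits wide, the first stage treats any value outside that range (including negatives) as needing masking, and `count_warnings` is a hypothetical helper added only to make the warning count observable.

```python
import warnings

ULONG_BITS = 32  # assumption: 32-bit C unsigned long
ULONG_MASK = (1 << ULONG_BITS) - 1


def get_ulong(x):
    """Stage 1: force x into the range of an unsigned long."""
    if not 0 <= x <= ULONG_MASK:
        warnings.warn("integer overflow masking is deprecated",
                      DeprecationWarning, stacklevel=2)
        x &= ULONG_MASK
    return x


def pack_ubyte(x):
    """Stage 2: format-specific range check for the 'B' code."""
    x = get_ulong(x)
    if not 0 <= x <= 255:
        warnings.warn("'B' format requires 0 <= number <= 255",
                      DeprecationWarning, stacklevel=2)
        x &= 0xFF
    return bytes([x])


def count_warnings(value):
    """Return the packed byte and how many warnings were raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        packed = pack_ubyte(value)
    return packed, len(caught)
```

Under these assumptions, -1 trips both stages and produces two warnings, while 256 passes the first stage and produces only the format-specific warning.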
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
[Bob Ippolito]
> What should it be called instead of wrapping?

I don't know -- I don't know what it's trying to _say_ that isn't already said by saying that the input is out of bounds for the format code.

> When it says it's wrapping, it means that it's doing x &= (2 ^ (8 * n)) - 1 to force a number into meeting the expected range.

How is that different from what it does in this case?:

    >>> struct.pack('B', 256L)
    /Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B' format requires 0 <= number <= 255
      return o.pack(*args)
    '\x00'

That looks like wrapping to me too (256 & (2**(8*1)-1) == 0x00), but in this case there is no deprecation warning about wrapping. Because of that, I'm afraid you're drawing distinctions that can't make sense to users.

> Reducing it to one warning instead of two is kinda difficult. Is it worth the trouble?

I don't understand. Every example you gave that showed a wrapping warning also showed a "format requires i <= number <= j" warning. Are there cases in which a wrapping warning is given but not a "format requires i <= number <= j" warning? If so, I simply haven't seen one (but I haven't tried all possible inputs ;-)).

Since the implementation appears (to judge from the examples) to "wrap" in every case in which any warning is given (or are there cases in which it doesn't?), I don't understand the point of distinguishing between wrapping warnings and "format requires i <= number <= j" warnings either. The latter are crystal clear.
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
On May 30, 2006, at 10:47 AM, Tim Peters wrote:

> [Bob Ippolito]
>> What should it be called instead of wrapping?
>
> I don't know -- I don't know what it's trying to _say_ that isn't already said by saying that the input is out of bounds for the format code.

The wrapping (now overflow masking) warning happens during conversion of PyObject* to long or unsigned long. It has no idea what the destination packing format is beyond whether it's signed or unsigned. If the packing format happens to be the same size as a long, it can't possibly trigger a range warning (unless range checks are moved up the stack and all of the function signatures and code get changed to accommodate that).

>> When it says it's wrapping, it means that it's doing x &= (2 ^ (8 * n)) - 1 to force a number into meeting the expected range.
>
> How is that different from what it does in this case?:
>
> >>> struct.pack('B', 256L)
> /Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B' format requires 0 <= number <= 255
>   return o.pack(*args)
> '\x00'
>
> That looks like wrapping to me too (256 & (2**(8*1)-1) == 0x00), but in this case there is no deprecation warning about wrapping. Because of that, I'm afraid you're drawing distinctions that can't make sense to users.

When it says "integer wrapping" it means that it's wrapping to fit in a long or unsigned long. n in this case is always 4 or 8 depending on the platform. The format-specific range check is separate. My description wasn't very good in the last email.

>> Reducing it to one warning instead of two is kinda difficult. Is it worth the trouble?
>
> I don't understand. Every example you gave that showed a wrapping warning also showed a "format requires i <= number <= j" warning. Are there cases in which a wrapping warning is given but not a "format requires i <= number <= j" warning? If so, I simply haven't seen one (but I haven't tried all possible inputs ;-)).
>
> Since the implementation appears (to judge from the examples) to "wrap" in every case in which any warning is given (or are there cases in which it doesn't?), I don't understand the point of distinguishing between wrapping warnings and "format requires i <= number <= j" warnings either. The latter are crystal clear.

A later email in this thread enumerates exactly which circumstances should cause two warnings with the current implementation.

-bob
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
On May 28, 2006, at 5:34 PM, Thomas Wouters wrote:

> On 5/29/06, Bob Ippolito wrote:
>> On May 28, 2006, at 4:31 AM, Thomas Wouters wrote:
>>> I'm seeing a dubious failure of test_gzip and test_tarfile on my AMD64 machine. It's triggered by the recent struct changes, but I'd say it's probably caused by a bug/misfeature in zlibmodule: zlib.crc32 is the result of a zlib 'crc32' function call, which returns an unsigned long. zlib.crc32 turns that unsigned long into a (signed) Python int, which means a number beyond 1<<31 goes negative on 32-bit systems and other systems with 32-bit longs, but stays positive on systems with 64-bit longs:
>>>
>>> (32-bit)
>>> >>> zlib.crc32("foobabazr")
>>> -271938108
>>>
>>> (64-bit)
>>> >>> zlib.crc32("foobabazr")
>>> 4023029188
>>>
>>> The old structmodule coped with that:
>>>
>>> >>> struct.pack("<l", -271938108)
>>> '\xc4\x8d\xca\xef'
>>> >>> struct.pack("<l", 4023029188)
>>> '\xc4\x8d\xca\xef'
>>>
>>> The new one does not:
>>>
>>> >>> struct.pack("<l", -271938108)
>>> '\xc4\x8d\xca\xef'
>>> >>> struct.pack("<l", 4023029188)
>>> Traceback (most recent call last):
>>>   File "<stdin>", line 1, in <module>
>>>   File "Lib/struct.py", line 63, in pack
>>>     return o.pack(*args)
>>> struct.error: 'l' format requires -2147483647 <= number <= 2147483647
>>>
>>> The structmodule should be fixed (and a test added ;) but I'm also wondering if zlib shouldn't be fixed. Now, I'm AMD64-centric, so my suggested fix would be to change the PyInt_FromLong() call to PyLong_FromUnsignedLong(), making zlib always return positive numbers -- it might break some code on 32-bit platforms, but that code is already broken on 64-bit platforms. But I guess I'm okay with the long being changed into an actual 32-bit signed number on 64-bit platforms, too.
>>
>> The struct module isn't what's broken here. All of the struct types have always had well defined bit sizes and alignment if you explicitly specify an endian, I and L are 32-bits everywhere, and Q is supported on platforms that don't have long long. The only thing that's changed is that it actually checks for errors consistently now.
>
> Yes. And that breaks things. I'm certain the behaviour is used in real-world code (and I don't mean just the gzip module.) It has always, as far as I can remember, accepted 'unsigned' values for the signed versions of ints, longs and long-longs (but not chars or shorts.) I agree that that's wrong, but I don't think changing struct to do the right thing should be done in 2.5. I don't even think it should be done in 2.6 -- although 3.0 is fine.

Well, the behavior change is in response to a bug http://python.org/sf/1229380. If nothing else, we should at least fix the standard library such that it doesn't depend on struct bugs. This is the only way to find them :)

Basically the struct module previously only checked for errors if you don't specify an endian. That's really strange and leads to very confusing results. The only code that really should be broken by this additional check is code that existed before Python had a long type and only signed values were available.

-bob
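The two CRC values Thomas reports are the same 32 bits read as signed and as unsigned. A minimal Python 3 sketch, using only the numbers quoted in the thread, shows one way a caller could normalise the value before packing; the helper name `as_uint32` is hypothetical and is not part of zlib or struct.

```python
import struct

SIGNED_CRC = -271938108     # value reported on the 32-bit build
UNSIGNED_CRC = 4023029188   # value reported on the 64-bit build


def as_uint32(value):
    """Reinterpret an integer as an unsigned 32-bit quantity."""
    return value & 0xFFFFFFFF


# Both spellings describe the same bit pattern.
assert as_uint32(SIGNED_CRC) == UNSIGNED_CRC

# After normalising, the unsigned format code packs either input identically.
packed = struct.pack("<L", as_uint32(SIGNED_CRC))
assert packed == struct.pack("<L", as_uint32(UNSIGNED_CRC))
```

Masking before packing, and using the unsigned code "<L", avoids relying on struct to accept an out-of-range value for "<l".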
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
On 5/29/06, Bob Ippolito wrote:
> Well, the behavior change is in response to a bug http://python.org/sf/1229380. If nothing else, we should at least fix the standard library such that it doesn't depend on struct bugs. This is the only way to find them :)

Feel free to comment how the zlib.crc32/gzip co-operation should be fixed. I don't see an obviously correct fix. The trunk is currently failing tests it shouldn't fail. Also note that the error isn't with feeding signed values to unsigned formats (which is what the bug is about) but the other way 'round, although I do believe both should be accepted for the time being, while generating a warning.

> Basically the struct module previously only checked for errors if you don't specify an endian. That's really strange and leads to very confusing results. The only code that really should be broken by this additional check is code that existed before Python had a long type and only signed values were available.

Alas, reality is different. The fundamental difference between types in Python and in C causes this, and code using struct is usually meant specifically to bridge those two worlds. Furthermore, struct is often used to *fix* that issue, by flipping sign bits if necessary:

    >>> struct.unpack("<l", struct.pack("<l", 3221225472))
    (-1073741824,)
    >>> struct.unpack("<l", struct.pack("<L", 3221225472))
    (-1073741824,)
    >>> struct.unpack("<l", struct.pack("<l", -1073741824))
    (-1073741824,)
    >>> struct.unpack("<l", struct.pack("<L", -1073741824))
    (-1073741824,)

Before this change, you didn't have to check whether the value is negative before the struct.unpack/pack dance, regardless of which format character you used. This misfeature is used (and many would consider it convenient, even Pythonic, for struct to DWIM), breaking it suddenly is bad.

--
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
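The sign-flipping "dance" Thomas describes can also be written with plain integer arithmetic, which does not depend on how lenient struct is. This is a Python 3 sketch; `to_signed32` is a hypothetical helper name, not an existing library function.

```python
import struct


def to_signed32(value):
    """Reinterpret the low 32 bits of value as a signed two's-complement int."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:      # sign bit set: the value is negative
        value -= 0x100000000
    return value


# Matches the pack/unpack round trip that stays within each format's range.
round_trip = struct.unpack("<l", struct.pack("<L", 3221225472))[0]
assert to_signed32(3221225472) == round_trip
```

The helper accepts either the signed or the unsigned spelling of a value and always returns the signed one, which is the convenience the old struct behaviour provided implicitly.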
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
On May 29, 2006, at 3:14 AM, Thomas Wouters wrote:

> On 5/29/06, Bob Ippolito wrote:
>> Well, the behavior change is in response to a bug http://python.org/sf/1229380. If nothing else, we should at least fix the standard library such that it doesn't depend on struct bugs. This is the only way to find them :)
>
> Feel free to comment how the zlib.crc32/gzip co-operation should be fixed. I don't see an obviously correct fix. The trunk is currently failing tests it shouldn't fail. Also note that the error isn't with feeding signed values to unsigned formats (which is what the bug is about) but the other way 'round, although I do believe both should be accepted for the time being, while generating a warning.

Well, first I'm going to just correct the modules that are broken (zlib, gzip, tarfile, binhex and probably one or two others).

>> Basically the struct module previously only checked for errors if you don't specify an endian. That's really strange and leads to very confusing results. The only code that really should be broken by this additional check is code that existed before Python had a long type and only signed values were available.
>
> Alas, reality is different. The fundamental difference between types in Python and in C causes this, and code using struct is usually meant specifically to bridge those two worlds. Furthermore, struct is often used to *fix* that issue, by flipping sign bits if necessary:

Well, in C you get a compiler warning for stuff like this.

> >>> struct.unpack("<l", struct.pack("<l", 3221225472))
> (-1073741824,)
> >>> struct.unpack("<l", struct.pack("<L", 3221225472))
> (-1073741824,)
> >>> struct.unpack("<l", struct.pack("<l", -1073741824))
> (-1073741824,)
> >>> struct.unpack("<l", struct.pack("<L", -1073741824))
> (-1073741824,)
>
> Before this change, you didn't have to check whether the value is negative before the struct.unpack/pack dance, regardless of which format character you used. This misfeature is used (and many would consider it convenient, even Pythonic, for struct to DWIM), breaking it suddenly is bad.

struct doesn't really DWIM anyway, since integers are up-converted to longs and will overflow past what the (old or new) struct module will accept. Before there was a long type or automatic up-converting, the sign agnosticism worked.. but it doesn't really work correctly these days.

We have two choices, either fix it to behave consistently broken everywhere for numbers of every size (modulo every number that comes in so that it fits), or have it do proper range checking. A compromise is to do proper range checking as a warning, and do the modulo math anyway... but is that what we really want?

-bob
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
On 5/29/06, Bob Ippolito wrote:
> A compromise is to do proper range checking as a warning, and do the modulo math anyway... but is that what we really want?

I don't know about the rest of 'us', but that's what I want, yes: backward compatibility, and a warning to tell people to fix their code 'or else'. The prevalence of the warnings (outside of the stdlib) should give us a clue whether to make it an exception in 2.6 or wait for 2.7/3.0.

Perhaps more people could chime in? Am I being too anal about backward compatibility here?

--
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
[Thomas Wouters]
> ...
> Perhaps more people could chime in? Am I being too anal about backward compatibility here?

Yes and no ;-) Backward compatibility _is_ important, but there seems no way to know in this case whether struct's range-checking sloppiness was accidental or deliberate. Having fixed bugs in the old code several times, and been reduced to writing crap like this in the old test_struct.py:

    # XXX Most std integer modes fail to test for out-of-range.
    # The "i" and "l" codes appear to range-check OK on 32-bit boxes, but
    # fail to check correctly on some 64-bit ones (Tru64 Unix + Compaq C
    # reported by Mark Favas).
    BUGGY_RANGE_CHECK = "bBhHiIlL"

I can't help but note several things:

- If it _was_ intended that range-checking be sloppy, nobody bothered to document it.

- Or even to write a comment in the code explaining that obscure intent.

- When I implemented the Q (8-byte int) format code, I added correct range-checking in all cases, and nobody ever complained about that.

- As noted in the comment above, we have gotten complaints about failures of struct range-checking at other integer widths.

OTOH, BUGGY_RANGE_CHECK existed because I was too timid to risk making broken user code visibly broken.

So, in all, I'm 95% sure 2.4's behavior is buggy, but 50% unsure that we need to warn about it before repairing it. Since you (Thomas) want warnings, and in theory it only affects the lightly-used "standard" modes, I do lean in favor of leaving the standard modes that _are_ broken (as above, not all are) broken in 2.5 but warning that this will change in 2.6.
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
Thomas Wouters wrote:
> On 5/29/06, Bob Ippolito wrote:
>> A compromise is to do proper range checking as a warning, and do the modulo math anyway... but is that what we really want?
>
> I don't know about the rest of 'us', but that's what I want, yes: backward compatibility, and a warning to tell people to fix their code 'or else'. The prevalence of the warnings (outside of the stdlib) should give us a clue whether to make it an exception in 2.6 or wait for 2.7/3.0.
>
> Perhaps more people could chime in? Am I being too anal about backward compatibility here?

As a fairly heavy user of struct, I personally don't use struct to do modulos and/or sign manipulation (I mask before I pass), but a change in behavior seems foolish if people use that behavior. So far, I'm not aware of anyone complaining about Python 2.4's use, so it would seem to suggest that the current behavior is not incorrect.

- Josiah
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
On 5/29/06, Tim Peters wrote:
> [Thomas Wouters]
> ...
>> Perhaps more people could chime in? Am I being too anal about backward compatibility here?
>
> Yes and no ;-) Backward compatibility _is_ important, but there seems no way to know in this case whether struct's range-checking sloppiness was accidental or deliberate.

I'm pretty sure it was deliberate. I'm more than likely the original author of this code (since the struct module was originally mine), and I know I put in things like that in a few places to cope with hex/oct constants pasted from C headers, and the general messiness that ensued because of Python's lack of an unsigned int type.

This is probably a case that was missed by PEP 237.

I think we should do as Thomas proposes: plan to make it an error in 2.6 (or 2.7 if there's a big outcry, which I don't expect) and accept it with a warning in 2.5.

> - If it _was_ intended that range-checking be sloppy, nobody bothered to document it.

Mea culpa. In those days we didn't document a lot of things.

> - Or even to write a comment in the code explaining that obscure intent.

It never occurred to me that it was obscure; we did this all over the place (in PyArg_Parse too).

> - When I implemented the Q (8-byte int) format code, I added correct range-checking in all cases, and nobody ever complained about that.

It's really only a practical concern for 32-bit values on 32-bit machines, where reasonable people can disagree over whether 0xffffffff is -1 or 4294967295.

> - As noted in the comment above, we have gotten complaints about failures of struct range-checking at other integer widths.
>
> OTOH, BUGGY_RANGE_CHECK existed because I was too timid to risk making broken user code visibly broken.
>
> So, in all, I'm 95% sure 2.4's behavior is buggy, but 50% unsure that we need to warn about it before repairing it. Since you (Thomas) want warnings, and in theory it only affects the lightly-used "standard" modes, I do lean in favor of leaving the standard modes that _are_ broken (as above, not all are) broken in 2.5 but warning that this will change in 2.6.

I'm not sure what we gain by leaving other std modules depending on struct's brokenness broken. But I may be misinterpreting which modules you're referring to.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
[Guido]
> ...
> It's really only a practical concern for 32-bit values on 32-bit machines, where reasonable people can disagree over whether 0xffffffff is -1 or 4294967295.

Then maybe we should only let that one slide <0.5 wink>.

...

[Tim]
>> So, in all, I'm 95% sure 2.4's behavior is buggy, but 50% unsure that we need to warn about it before repairing it. Since you (Thomas) want warnings, and in theory it only affects the lightly-used "standard" modes, I do lean in favor of leaving the standard modes that _are_ broken (as above, not all are) broken in 2.5 but warning that this will change in 2.6.

> I'm not sure what we gain by leaving other std modules depending on struct's brokenness broken. But I may be misinterpreting which modules you're referring to.

I think you're just reading "module" where I wrote "mode". "Standard mode" is struct-module terminology, as in "<b", "!b" and ">b" are standard modes but "b" is not a standard mode (it's "native" mode). But I got it backwards -- or maybe not ;-) It's confusing because it's so inconsistent (this under 2.4.3 on 32-bit Windows):

    >>> struct.pack("<B", -32)  # std mode doesn't complain
    '\xe0'
    >>> struct.pack("B", -32)   # native mode does
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    struct.error: ubyte format requires 0<=number<=255

    >>> struct.pack("<b", 255)  # std mode doesn't complain
    '\xff'
    >>> struct.pack("b", 255)   # native mode does
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    struct.error: byte format requires -128<=number<=127

On the other hand, as I noted last time, some standard modes _do_ range-check -- but not correctly on some 64-bit boxes -- and not consistently across positive and negative out-of-range values, or across input types. Like:

    >>> struct.pack("<i", 2**32-1)  # std and native modes complain
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    OverflowError: long int too large to convert to int
    >>> struct.pack("i", 2**32-1)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    OverflowError: long int too large to convert to int

    >>> struct.pack("<I", -1)  # neither std nor native modes complain
    '\xff\xff\xff\xff'
    >>> struct.pack("I", -1)
    '\xff\xff\xff\xff'

    >>> struct.pack("<I", -1L)  # but both complain if the input is long
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    OverflowError: can't convert negative value to unsigned long
    >>> struct.pack("I", -1L)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    OverflowError: can't convert negative value to unsigned long

In short, there's no way to explain what struct checks for in 2.4.3 short of drawing up an exhaustive table of standard-vs-native mode, format code, which direction a value may be out of range, and whether the value is given as a Python int or a long.

At the sprint, I encouraged Bob to do complete range-checking. That's explainable. If we have to back off from that, then since the new code is consistent, I'm sure any warts he puts back in will clearly look like warts ;-)

> I think we should do as Thomas proposes: plan to make it an error in 2.6 (or 2.7 if there's a big outcry, which I don't expect) and accept it with a warning in 2.5.

That's what I arrived at, although 2.4.3's checking behavior is actually so inconsistent that "it" needs some defining (what exactly are we trying to still accept? e.g., that -1 doesn't trigger "I" complaints but that -1L does above? that one's surely a bug). To be clear, Thomas proposed "accepting it" (whatever that means) until 3.0.
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
On 5/29/06, Tim Peters wrote:
>> I think we should do as Thomas proposes: plan to make it an error in 2.6 (or 2.7 if there's a big outcry, which I don't expect) and accept it with a warning in 2.5.
>
> That's what I arrived at, although 2.4.3's checking behavior is actually so inconsistent that "it" needs some defining (what exactly are we trying to still accept? e.g., that -1 doesn't trigger "I" complaints but that -1L does above? that one's surely a bug).

No, it reflects that (up to 2.3 I believe) 0xffffffff was -1 but 0xffffffffL was 4294967295L.

> To be clear, Thomas proposed "accepting it" (whatever that means) until 3.0.

Fine with me.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
> Perhaps more people could chime in? Am I being too anal about backward compatibility here?

As a sometimes bug report reviewer, I would like the reported discrepancy between the public docs and visible code behavior fixed one way or the other (by changing the docs or code) since that is my working definition for whether something is a bug or not.

Terry Jan Reedy
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
On May 29, 2006, at 12:44 PM, Guido van Rossum wrote:

> On 5/29/06, Tim Peters wrote:
>>> I think we should do as Thomas proposes: plan to make it an error in 2.6 (or 2.7 if there's a big outcry, which I don't expect) and accept it with a warning in 2.5.
>>
>> That's what I arrived at, although 2.4.3's checking behavior is actually so inconsistent that "it" needs some defining (what exactly are we trying to still accept? e.g., that -1 doesn't trigger "I" complaints but that -1L does above? that one's surely a bug).
>
> No, it reflects that (up to 2.3 I believe) 0xffffffff was -1 but 0xffffffffL was 4294967295L.

Python 2.3 did a FutureWarning on 0xffffffff but its value was -1.

Anyway, my plan is to make it such that all non-native format codes will behave exactly like C casting, but will do a DeprecationWarning for input numbers that were initially out of bounds. This behavior will be consistent across (python) int and long, and will be easy enough to explain in the docs (but still more complicated than "values not representable by this data type will raise struct.error").

This means that I'm also changing it so that struct.pack will not raise OverflowError for some longs, it will always raise struct.error or do a warning (as long as the input is int or long).

Pseudocode looks kinda like this:

    def wrap_unsigned(x, CTYPE):
        if not (0 <= x <= CTYPE_MAX):
            DeprecationWarning()
            x &= CTYPE_MAX
        return x

Actually, should this be a FutureWarning or a DeprecationWarning?

-bob
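Bob's plan ("behave exactly like C casting", with a warning for inputs that start out of bounds) can be illustrated for both signed and unsigned widths. This is a Python 3 sketch, not the struct implementation: `c_cast` and its parameters are hypothetical names, and the warning text is made up for the example.

```python
import warnings


def c_cast(value, nbytes, signed):
    """Truncate value to nbytes the way a C cast would, warning if it did not fit."""
    bits = 8 * nbytes
    if signed:
        low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        low, high = 0, (1 << bits) - 1
    if not low <= value <= high:
        warnings.warn(
            "value %d out of range [%d, %d]; masking" % (value, low, high),
            DeprecationWarning, stacklevel=2)
    value &= (1 << bits) - 1            # keep the low bits, as a C cast does
    if signed and value >> (bits - 1):  # top bit set: reinterpret as negative
        value -= 1 << bits
    return value
```

For example, `c_cast(-1, 1, signed=False)` gives 255 and `c_cast(255, 1, signed=True)` gives -1, each with one DeprecationWarning, while in-range inputs pass through silently.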
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
[Guido]
> I think we should do as Thomas proposes: plan to make it an error in 2.6 (or 2.7 if there's a big outcry, which I don't expect) and accept it with a warning in 2.5.

[Tim]
> That's what I arrived at, although 2.4.3's checking behavior is actually so inconsistent that it needs some defining (what exactly are we trying to still accept? e.g., that -1 doesn't trigger I complaints but that -1L does above? that one's surely a bug).

[Guido]
> No, it reflects that (up to 2.3 I believe) 0xffffffff was -1 but 0xffffffffL was 4294967295L.

But is it a bug _now_? That's what I mean by a bug. To me, this is simply inexplicable in 2.4 (and 2.5+):

>>> struct.pack(I, -1) # neither std nor native modes complain
'\xff\xff\xff\xff'
>>> struct.pack(I, -1)
'\xff\xff\xff\xff'

>>> struct.pack(I, -1L) # but both complain if the input is long
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: can't convert negative value to unsigned long
>>> struct.pack(I, -1L)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: can't convert negative value to unsigned long

Particularly for the standard modes, the behavior also varies across platforms (i.e., it wasn't true in any version of Python that 0xffffffff == -1 on most 64-bit boxes, and to have standard mode behavior vary according to platform isn't particularly standard :-)).

[Tim]
> To be clear, Thomas proposed accepting it (whatever that means) until 3.0.

[Guido]
> Fine with me.

So who has a definition for what it means? I don't. Does every glitch have to be reproduced across the cross product of

    platform X format-code X input-type X native-vs-standard X direction-of-out-of-range

? Or would people be happier if struct simply never checked for out-of-bounds? At least the latter is doable with finite effort, and is also explainable (for an integral code of N bytes, pack() stores the least-significant N bytes of the input viewed as a 2's-complement integer).

I'd be happier with that than with something that can't be explained short of exhaustive listing of cases across 5 dimensions. Ditto with saying that for an integral type using N bytes, pack() accepts any integer in -(2**(8*N-1)) through 2**(8*N)-1, complains outside that range, and makes no complaining distinctions based on platform, input type, standard-vs-native mode, signed-or-unsigned format code, or direction of out-of-range. That's also explainable.
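The two "explainable" rules from the message above can be sketched in a few lines of current Python 3. This is only an illustration: pack_unchecked and pack_checked are made-up names, little-endian output is an arbitrary choice, and raising OverflowError stands in for whatever form the "complaining" would take.

```python
def pack_unchecked(value, nbytes):
    """Rule 1: never range-check.

    Store the least-significant nbytes bytes of value, viewed as a
    two's-complement integer (little-endian here, by assumption).
    """
    mask = (1 << (8 * nbytes)) - 1
    return (value & mask).to_bytes(nbytes, "little")


def pack_checked(value, nbytes):
    """Rule 2: accept -(2**(8*N-1)) through 2**(8*N)-1, complain otherwise.

    The same range applies regardless of platform, input type, mode,
    or signedness of the format code.
    """
    lo = -(1 << (8 * nbytes - 1))
    hi = (1 << (8 * nbytes)) - 1
    if not (lo <= value <= hi):
        raise OverflowError("%d is outside %d..%d" % (value, lo, hi))
    return pack_unchecked(value, nbytes)
```

Either rule fits in one sentence of documentation, which is the point being argued; they differ only in whether values outside the combined signed-plus-unsigned range are rejected or silently truncated.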
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
On 5/29/06, Tim Peters [EMAIL PROTECTED] wrote:

> [Tim]
> > To be clear, Thomas proposed accepting it (whatever that means) until 3.0.
>
> [Guido]
> > Fine with me.
>
> So who has a definition for what it means?

I know which 'it' I meant: the same 'it' as struct already accepts in Python 2.4 and before. Yes, it's inconsistent between formatcodes and valuetypes -- fixing that is the whole point of the change -- but that's how you define 'compatibility'; struct, by default, should do what it did for Python 2.4, for all operating modes. It doesn't have to be more liberal than 2.4 (and preferably shouldn't, as that could break backward compatibility of some code -- much less common, though.)

Making a list of which formatcodes accept what values (for what valuetypes) for 2.4 is easy enough (and should be added to the test suite, too ;-) -- I can do that in a few days if no one gets to it.

--
Thomas Wouters [EMAIL PROTECTED]

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
Thomas Wouters wrote:

> I know which 'it' I meant: the same 'it' as struct already accepts in Python 2.4 and before. Yes, it's inconsistent between formatcodes and valuetypes -- fixing that is the whole point of the change -- but that's how you define 'compatibility'; struct, by default, should do what it did for Python 2.4, for all operating modes.

that's how you define compatibility for bug fix releases in Python. 2.5 is not 2.4.4.

/F
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
On 5/29/06, Fredrik Lundh [EMAIL PROTECTED] wrote:

> Thomas Wouters wrote:
> > I know which 'it' I meant: the same 'it' as struct already accepts in Python 2.4 and before. Yes, it's inconsistent between formatcodes and valuetypes -- fixing that is the whole point of the change -- but that's how you define 'compatibility'; struct, by default, should do what it did for Python 2.4, for all operating modes.
>
> that's how you define compatibility for bug fix releases in Python. 2.5 is not 2.4.4.

Correct, and it is also not 2.6. Breaking perfectly working (and more to the point, non-complaining) code, even if it is for a good reason, is bad. It means, for instance, that I cannot upgrade Python on any of the servers I manage, where Python gets used by clients. This is not a hypothetical problem; I've had it happen too often (although fortunately not often with Python, because of the sane policy on backward compatibility.) If 2.5 warns and does the old thing, the upgrade path is easy and defensible. This is also why there are future statements -- I distinctly recall making the same argument back then :-) The cost of continuing the misfeatures in struct for one release does not weigh up to the cost of breaking compatibility unwarned.

--
Thomas Wouters [EMAIL PROTECTED]

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
On May 29, 2006, at 1:18 PM, Bob Ippolito wrote:

> On May 29, 2006, at 12:44 PM, Guido van Rossum wrote:
> > On 5/29/06, Tim Peters [EMAIL PROTECTED] wrote:
> > > > I think we should do as Thomas proposes: plan to make it an error in 2.6 (or 2.7 if there's a big outcry, which I don't expect) and accept it with a warning in 2.5.
> > >
> > > That's what I arrived at, although 2.4.3's checking behavior is actually so inconsistent that it needs some defining (what exactly are we trying to still accept? e.g., that -1 doesn't trigger I complaints but that -1L does above? that one's surely a bug).
> >
> > No, it reflects that (up to 2.3 I believe) 0xffffffff was -1 but 0xffffffffL was 4294967295L.
>
> Python 2.3 did a FutureWarning on 0xffffffff but its value was -1.
>
> Anyway, my plan is to make it such that all non-native format codes will behave exactly like C casting, but will do a DeprecationWarning for input numbers that were initially out of bounds. This behavior will be consistent across (python) int and long, and will be easy enough to explain in the docs (but still more complicated than "values not representable by this data type will raise struct.error").
>
> This means that I'm also changing it so that struct.pack will not raise OverflowError for some longs; it will always raise struct.error or do a warning (as long as the input is int or long).
>
> Pseudocode looks kinda like this:
>
> def wrap_unsigned(x, CTYPE):
>     if not (0 <= x <= CTYPE_MAX):
>         DeprecationWarning()
>         x &= CTYPE_MAX
>     return x
>
> Actually, should this be a FutureWarning or a DeprecationWarning?

OK, this behavior is implemented in revision 46537:

(this is from ./python.exe -Wall)

>>> import struct
>>> struct.pack('B', 256)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/bob/src/python/Lib/struct.py", line 63, in pack
    return o.pack(*args)
struct.error: ubyte format requires 0 <= number <= 255
>>> struct.pack('B', -1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/bob/src/python/Lib/struct.py", line 63, in pack
    return o.pack(*args)
struct.error: ubyte format requires 0 <= number <= 255
>>> struct.pack('B', 256)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B' format requires 0 <= number <= 255
  return o.pack(*args)
'\x00'
>>> struct.pack('B', -1)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: struct integer wrapping is deprecated
  return o.pack(*args)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B' format requires 0 <= number <= 255
  return o.pack(*args)
'\xff'
>>> struct.pack('B', 256L)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B' format requires 0 <= number <= 255
  return o.pack(*args)
'\x00'
>>> struct.pack('B', -1L)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: struct integer wrapping is deprecated
  return o.pack(*args)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B' format requires 0 <= number <= 255
  return o.pack(*args)
'\xff'

In _struct.c, getting rid of the #define PY_STRUCT_WRAPPING 1 will turn off this warning+wrapping nonsense and just raise errors for out of range values. It'll also enable some additional performance hacks (swapping out the host-endian table's pack and unpack functions with the faster native versions).

-bob
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
Thomas Wouters wrote:

> If 2.5 warns and does the old thing, the upgrade path is easy and defendable. This is also why there are future statements -- I distinctly recall making the same argument back then :-) The cost of continuing the misfeatures in struct for one release does not weigh up to the cost of breaking compatibility unwarned.

This really sounds to me like a __future__ import would be useful to get the fixed behaviour.

Tim Delaney
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
[Bob Ippolito]
> ...
> Actually, should this be a FutureWarning or a DeprecationWarning?

Since it was never documented, UndocumentedBugGoingAwayError ;-) Short of that, yes, DeprecationWarning. FutureWarning is for changes in non-exceptional behavior (e.g., if we swapped the meanings of "<" and ">" in struct format codes, that would rate a FutureWarning subclass, like InsaneFutureWarning).

> OK, this behavior is implemented in revision 46537:
>
> (this is from ./python.exe -Wall)
>
> >>> import struct
> ...
> >>> struct.pack('B', -1)
> /Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: struct integer wrapping is deprecated
>   return o.pack(*args)
> /Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B' format requires 0 <= number <= 255
>   return o.pack(*args)
> '\xff'

We certainly don't want to see two deprecation warnings for a single deprecated behavior. I suggest eliminating the "struct integer wrapping" warning, mostly because I had no idea what it _meant_ before reading the comments in _struct.c ("wrapping" is used most often in a proxy or delegation context in Python these days). "'B' format requires 0 <= number <= 255" is perfectly clear all by itself.

...
[Python-Dev] test_gzip/test_tarfile failure om AMD64
I'm seeing a dubious failure of test_gzip and test_tarfile on my AMD64 machine. It's triggered by the recent struct changes, but I'd say it's probably caused by a bug/misfeature in zlibmodule: zlib.crc32 is the result of a zlib 'crc32' function call, which returns an unsigned long. zlib.crc32 turns that unsigned long into a (signed) Python int, which means a number beyond 1<<31 goes negative on 32-bit systems and other systems with 32-bit longs, but stays positive on systems with 64-bit longs:

(32-bit)
>>> zlib.crc32("foobabazr")
-271938108

(64-bit)
>>> zlib.crc32("foobabazr")
4023029188

The old structmodule coped with that:

>>> struct.pack("<l", -271938108)
'\xc4\x8d\xca\xef'
>>> struct.pack("<l", 4023029188)
'\xc4\x8d\xca\xef'

The new one does not:

>>> struct.pack("<l", -271938108)
'\xc4\x8d\xca\xef'
>>> struct.pack("<l", 4023029188)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "Lib/struct.py", line 63, in pack
    return o.pack(*args)
struct.error: 'l' format requires -2147483647 <= number <= 2147483647

The structmodule should be fixed (and a test added ;) but I'm also wondering if zlib shouldn't be fixed. Now, I'm AMD64-centric, so my suggested fix would be to change the PyInt_FromLong() call to PyLong_FromUnsignedLong(), making zlib always return positive numbers -- it might break some code on 32-bit platforms, but that code is already broken on 64-bit platforms. But I guess I'm okay with the long being changed into an actual 32-bit signed number on 64-bit platforms, too.

--
Thomas Wouters [EMAIL PROTECTED]

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
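The two CRC values in the message above are the same 32 bits read as signed and as unsigned. A small sketch in current Python 3 shows the conversion in both directions and a packing that works for either form; the helper names are invented for illustration and this is not the change that was made to zlib or gzip.

```python
import struct


def as_unsigned32(n):
    """Reinterpret n as an unsigned 32-bit value (0 .. 2**32-1)."""
    return n & 0xFFFFFFFF


def as_signed32(n):
    """Reinterpret n as a signed 32-bit value (-2**31 .. 2**31-1)."""
    n &= 0xFFFFFFFF
    return n - (1 << 32) if n & (1 << 31) else n


def pack_crc(crc):
    """Pack a CRC as 4 little-endian bytes, whichever sign convention it used."""
    return struct.pack("<I", as_unsigned32(crc))
```

With the numbers quoted in the message, as_signed32(4023029188) is -271938108 and both values pack to the same four bytes, so normalising before packing sidesteps the platform difference.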
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
On May 28, 2006, at 4:31 AM, Thomas Wouters wrote:

> I'm seeing a dubious failure of test_gzip and test_tarfile on my AMD64 machine. It's triggered by the recent struct changes, but I'd say it's probably caused by a bug/misfeature in zlibmodule: zlib.crc32 is the result of a zlib 'crc32' function call, which returns an unsigned long. zlib.crc32 turns that unsigned long into a (signed) Python int, which means a number beyond 1<<31 goes negative on 32-bit systems and other systems with 32-bit longs, but stays positive on systems with 64-bit longs:
>
> (32-bit)
> >>> zlib.crc32("foobabazr")
> -271938108
>
> (64-bit)
> >>> zlib.crc32("foobabazr")
> 4023029188
>
> The old structmodule coped with that:
>
> >>> struct.pack("<l", -271938108)
> '\xc4\x8d\xca\xef'
> >>> struct.pack("<l", 4023029188)
> '\xc4\x8d\xca\xef'
>
> The new one does not:
>
> >>> struct.pack("<l", -271938108)
> '\xc4\x8d\xca\xef'
> >>> struct.pack("<l", 4023029188)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "Lib/struct.py", line 63, in pack
>     return o.pack(*args)
> struct.error: 'l' format requires -2147483647 <= number <= 2147483647
>
> The structmodule should be fixed (and a test added ;) but I'm also wondering if zlib shouldn't be fixed. Now, I'm AMD64-centric, so my suggested fix would be to change the PyInt_FromLong() call to PyLong_FromUnsignedLong(), making zlib always return positive numbers -- it might break some code on 32-bit platforms, but that code is already broken on 64-bit platforms. But I guess I'm okay with the long being changed into an actual 32-bit signed number on 64-bit platforms, too.

The struct module isn't what's broken here. All of the struct types have always had well defined bit sizes and alignment if you explicitly specify an endian: I and L are 32 bits everywhere, and Q is supported on platforms that don't have long long. The only thing that's changed is that it actually checks for errors consistently now.

-bob
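The point about fixed sizes under an explicit byte order can be checked with struct.calcsize. The sketch below targets current Python 3; the function name integer_sizes is made up for the example, and the native result depends on the platform's C compiler.

```python
import struct


def integer_sizes():
    """Return (standard, native) byte sizes for the I, L and Q codes.

    Standard sizes apply when the format starts with '<', '>', '=' or '!'.
    Native sizes (no prefix, or '@') follow the platform's C types.
    """
    standard = {code: struct.calcsize("<" + code) for code in "ILQ"}
    native = {code: struct.calcsize("@" + code) for code in "ILQ"}
    return standard, native


standard, native = integer_sizes()
# standard is {'I': 4, 'L': 4, 'Q': 8} on every platform;
# native['L'] is typically 8 on 64-bit Linux and 4 on Windows.
```

This is why the same packing code can behave differently on AMD64 only when it relies on native mode or on values that do not fit the standard 32-bit range.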
Re: [Python-Dev] test_gzip/test_tarfile failure om AMD64
On 5/29/06, Bob Ippolito [EMAIL PROTECTED] wrote:

> On May 28, 2006, at 4:31 AM, Thomas Wouters wrote:
> > I'm seeing a dubious failure of test_gzip and test_tarfile on my AMD64 machine. It's triggered by the recent struct changes, but I'd say it's probably caused by a bug/misfeature in zlibmodule: zlib.crc32 is the result of a zlib 'crc32' function call, which returns an unsigned long. zlib.crc32 turns that unsigned long into a (signed) Python int, which means a number beyond 1<<31 goes negative on 32-bit systems and other systems with 32-bit longs, but stays positive on systems with 64-bit longs:
> >
> > (32-bit)
> > >>> zlib.crc32("foobabazr")
> > -271938108
> >
> > (64-bit)
> > >>> zlib.crc32("foobabazr")
> > 4023029188
> >
> > The old structmodule coped with that:
> >
> > >>> struct.pack("<l", -271938108)
> > '\xc4\x8d\xca\xef'
> > >>> struct.pack("<l", 4023029188)
> > '\xc4\x8d\xca\xef'
> >
> > The new one does not:
> >
> > >>> struct.pack("<l", -271938108)
> > '\xc4\x8d\xca\xef'
> > >>> struct.pack("<l", 4023029188)
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> >   File "Lib/struct.py", line 63, in pack
> >     return o.pack(*args)
> > struct.error: 'l' format requires -2147483647 <= number <= 2147483647
> >
> > The structmodule should be fixed (and a test added ;) but I'm also wondering if zlib shouldn't be fixed. Now, I'm AMD64-centric, so my suggested fix would be to change the PyInt_FromLong() call to PyLong_FromUnsignedLong(), making zlib always return positive numbers -- it might break some code on 32-bit platforms, but that code is already broken on 64-bit platforms. But I guess I'm okay with the long being changed into an actual 32-bit signed number on 64-bit platforms, too.
>
> The struct module isn't what's broken here. All of the struct types have always had well defined bit sizes and alignment if you explicitly specify an endian, I and L are 32-bits everywhere, and Q is supported on platforms that don't have long long. The only thing that's changed is that it actually checks for errors consistently now.

Yes. And that breaks things. I'm certain the behaviour is used in real-world code (and I don't mean just the gzip module.) It has always, as far as I can remember, accepted 'unsigned' values for the signed versions of ints, longs and long-longs (but not chars or shorts.) I agree that that's wrong, but I don't think changing struct to do the right thing should be done in 2.5. I don't even think it should be done in 2.6 -- although 3.0 is fine.

--
Thomas Wouters [EMAIL PROTECTED]

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!