Re: [Python-Dev] ImportWarning flood
--- Martin v. Löwis [EMAIL PROTECTED] wrote:
> The specific question was "Is there a way to set the warning options
> via an environment variable?" This has nothing to do with beta1; the
> warnings module was introduced many releases ago, along with all the
> mechanics to disable warnings.

Due to the new ImportWarning first introduced in 2.5b1, the question of
disabling warnings is becoming much more pressing (I am assuming that I
am not again the only person on the planet to have this problem).

> I guess you misunderstood.

Yes.

> I propose you put warnings.simplefilter() into your code. The warnings
> module was introduced before 2.2.1 IIRC, so this should work on all
> releases you want to support (but have no effect on installations
> where the warning isn't generated).

Where would I put the warnings.simplefilter()? I have hundreds of
scripts and __init__.py files. I just came across this situation
(simplified):

    % cd boost/boost
    % python2.5
    >>> import math
    __main__:1: ImportWarning: Not importing directory 'math': missing __init__.py

This is because there is a subdirectory math in boost/boost, something
that I cannot change. PYTHONPATH is not set at all in this case. I.e. I
get the ImportWarning just because my current working directory happens
to contain a subdirectory which matches one of the Python modules in
the standard library. Isn't this going to cause widespread problems?

__
Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection
around http://mail.yahoo.com

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
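Martin's warnings.simplefilter() suggestion can be sketched as follows. This is a minimal illustration with today's warnings API; the helper name is invented here, not part of the thread:

```python
import warnings

def silence_import_warnings():
    # Ignore ImportWarning (new in Python 2.5) category-wide, leaving
    # all other warnings visible.  On interpreters that predate
    # ImportWarning the name lookup fails and nothing needs silencing.
    try:
        warnings.simplefilter("ignore", ImportWarning)
    except NameError:
        pass
```

Calling this once at startup would suppress every "Not importing directory ..." message, at the cost of also hiding any other ImportWarning the interpreter might emit.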
Re: [Python-Dev] PyRange_New() alternative?
Scott David Daniels wrote:
> I am not sure about your compiler, but if I remember the standard
> correctly, the following code shouldn't complain:
>
>     PyObject_CallFunction((PyObject *)(void *)&PyRange_Type,
>                           "lll", start, start + len*step, step)

You remember the standard incorrectly. Python's usage of casts has
undefined behaviour, and adding casts only makes the warning go away,
but does not make the problem go away.

Regards,
Martin
Re: [Python-Dev] PyRange_New() alternative?
Ralf W. Grosse-Kunstleve wrote:
> I am not an expert of the C/C++ language details, but the intermediate
> cast seems to be a great local alternative to the global
> -fno-strict-aliasing flag.

Depends on what you want to achieve. If you just want to make the
warning go away, the cast works fine. If you want to avoid bad code
being generated, you had better use the flag (alternatively, you could
fix Python to not rely on undefined behaviour (and no, it's not easy to
fix in Python, or else we would have fixed it long ago)).

Regards,
Martin
Re: [Python-Dev] ImportWarning flood
Ralf W. Grosse-Kunstleve wrote:
> > This has nothing to do with beta1; the warnings module was
> > introduced many releases ago, along with all the mechanics to
> > disable warnings.
> Due to the new ImportWarning first introduced in 2.5b1 the question
> of disabling warnings is becoming much more pressing (I am assuming
> that I am not again the only person on the planet to have this
> problem).

Sure. However, many people on comp.lang.python could have told you how
to silence warnings in Python.

> Where would I put the warnings.simplefilter()? I have hundreds of
> scripts and __init__.py files.

I would have to study your application to answer that question. Putting
it into sitecustomize.py should always work.

> I.e. I get the ImportWarning just because my current working
> directory happens to contain a subdirectory which matches one of the
> Python modules in the standard library. Isn't this going to cause
> widespread problems?

I don't know. Whether a warning is a problem is a matter of attitude,
also.

Regards,
Martin
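The sitecustomize.py approach Martin suggests might look like this. This is a sketch: the message pattern is taken from the warning text quoted earlier in the thread, and placing the file on the import path is an assumption about one's site setup:

```python
# Contents of a hypothetical sitecustomize.py, executed automatically
# at interpreter startup, so none of the individual scripts or
# __init__.py files need changing.
import warnings

# Ignore only the "missing __init__.py" ImportWarning; all other
# warnings still show.  The message argument is a regular expression
# matched against the start of the warning text.
warnings.filterwarnings(
    "ignore",
    message=r"Not importing directory .*",
    category=ImportWarning,
)
```

Compared with a blanket simplefilter("ignore", ImportWarning), this targets only the specific message, so other import-related diagnostics remain visible.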
Re: [Python-Dev] ImportWarning flood
--- Martin v. Löwis [EMAIL PROTECTED] wrote:
> I don't know. Whether a warning is a problem is a matter of attitude,
> also.

Our users will think our applications are broken if they see warnings
like that. It is not professional.
Re: [Python-Dev] PyRange_New() alternative?
--- Martin v. Löwis [EMAIL PROTECTED] wrote:
> Scott David Daniels wrote:
> > I am not sure about your compiler, but if I remember the standard
> > correctly, the following code shouldn't complain:
> >     PyObject_CallFunction((PyObject *)(void *)&PyRange_Type,
> >                           "lll", start, start + len*step, step)
> You remember the standard incorrectly.

There are three standards to consider: C89/90, C99, C++97/98. Here you
can find the opinion of one of the authors of the C++ standard in this
matter:

http://mail.python.org/pipermail/c++-sig/2005-December/009869.html
Re: [Python-Dev] PyRange_New() alternative?
Ralf W. Grosse-Kunstleve wrote:
> > You remember the standard incorrectly.
> There are three standards to consider: C89/90, C99, C++97/98. Here
> you can find the opinion of one of the authors of the C++ standard in
> this matter:
> http://mail.python.org/pipermail/c++-sig/2005-December/009869.html

This might be out of context, but Dave Abrahams' comment "C++ doesn't
support the C99 restrict feature." seems irrelevant: C++ certainly does
not have the restrict keyword, but it has the same aliasing rules as
C89 and C99. The specific problem exists in all three languages.

Regards,
Martin
Re: [Python-Dev] doc for new restricted execution design for Python
Brett Cannon wrote:
> >> Yep. That API will be used directly in the changes to pymalloc and
> >> PyMem_*() macros (or at least the basic idea). It is not *only*
> >> for extension modules but for the core as well.
> > Existing extension modules and existing C code in the Python
> > interpreter have no idea of any PyXXX_ calls, so I don't understand
> > how new API functions help here.
> The calls get added to pymalloc and PyMem_*() under the hood, so that
> existing extension modules use the memory check automatically without
> a change. The calls are just there in case someone has some random
> need to do their own malloc but still wants to participate in the
> cap. Plus it helped me think everything through by giving everything
> I would need to change internally an API.

This confused me a bit, too. It might help if you annotated each of the
new APIs with who the expected callers were:

- trusted interpreter
- untrusted interpreter
- embedding application
- extension module

Cheers,
Nick.

--
Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
http://www.boredomandlaziness.org
Re: [Python-Dev] ImportWarning flood
Ralf W. Grosse-Kunstleve wrote:
> --- Martin v. Löwis [EMAIL PROTECTED] wrote:
> > I don't know. Whether a warning is a problem is a matter of
> > attitude, also.
> Our users will think our applications are broken if they see warnings
> like that. It is not professional.

Actually, your application *was* pretty close to being broken a few
weeks ago, when Guido wanted to drop the requirement that a package
must contain an __init__ file. In that case, import math would have
imported the directory, and given you an empty package.

Regards,
Martin
Re: [Python-Dev] doc for new restricted execution design for Python
On Jun 24, 2006, at 2:46 AM, Nick Coghlan wrote:
> Brett Cannon wrote:
> > The calls get added to pymalloc and PyMem_*() under the hood, so
> > that existing extension modules use the memory check automatically
> > without a change. The calls are just there in case someone has some
> > random need to do their own malloc but still wants to participate
> > in the cap. Plus it helped me think everything through by giving
> > everything I would need to change internally an API.
> This confused me a bit, too. It might help if you annotated each of
> the new APIs with who the expected callers were:
> - trusted interpreter
> - untrusted interpreter
> - embedding application
> - extension module

Threading is definitely going to be an issue with multiple interpreters
(restricted or otherwise)... for example, the PyGILState API probably
wouldn't work anymore.

-bob
Re: [Python-Dev] Switch statement
The current train of thought seems to be to handle a switch statement
as follows:

1. Define switch explicitly as a hash table lookup, with the hash table
   built at function definition time
2. Allow expressions to be flagged as 'static' to request evaluation at
   def-time
3. Have the expressions in a case clause be implicitly flagged as
   static
4. Allow 'case in' to be used to indicate that a case argument is to be
   iterated and all its values added to the current case
5. Static names are not needed - static expressions must refer solely
   to literals and non-local names

An issue with Point 4 is a syntactic nit that Eric Sumner pointed out.
Since it involves iteration over x to populate the jump table rather
than doing a containment test on x, using 'case in x' is misleading. It
would be better written as 'case *x'. Then:

    'case 1:'      == a switch value of 1 will jump to this case
    'case 1, 2:'   == a switch value of 1 or 2 will jump to this case
    'case *x'      == any switch value in x will jump to this case
    'case *x, *y'  == any switch value in x or y will jump to this case

For the remaining points, I share Jim Jewett's concern that 'function
definition time' is well defined for function scopes only - a better
definition of the evaluation time is needed so that it works for other
code as well. (Unlike Jim, I have no problems with restricting switch
statements to hashable objects and building the entire jump table at
once - if what you want is an arbitrary if-elif chain, then write one!)

I'd also like to avoid confusing the code execution order too much.
People objected to the out-of-order evaluation in statement local
namespaces - what's being proposed for static expressions is
significantly worse. So here's a fleshed out proposal for 'once
expressions' that are evaluated the first time they are encountered and
cached thereafter.

Once expressions:

An expression of the form 'once EXPR' is evaluated exactly once for a
given scope. Precedence rules are as for yield expressions.
Evaluation occurs the first time the expression is executed. On all
subsequent executions, the expression will return the same result as
was returned the first time. Referencing a function local variable name
from a once expression is a syntax error. References to module globals,
to closure variables and to names not bound in the module at all are
fine.

Justifying evaluation at first execution time:

With evaluation at first execution time, the semantics are essentially
the same in all kinds of scope (module, function, class, exec). When
the evaluation time is defined in terms of function definition time, it
is very unclear what happens when there is no function definition
involved. With the once-per-scope definition above, the potentially
confusing cases that concerned Guido would have the behaviour he
desired:

    >>> def foo(c):
    ...     print once c
    ...
    SyntaxError: Cannot use local variable 'c' in once expression

The rationale for disallowing function local variables in a once
expression is that the next time the function is executed, the local
variables are expected to contain different values, so it is unlikely
that any expression depending on them would give the same answer.
Builtins, module globals and closure variables, on the other hand, will
typically remain the same across invocations of a once expression. So
the rationale for the syntactic restriction against using local
variables is still there, even though the local variables may actually
contain valid data at the time the once expression is executed. This
syntactic restriction only applies to function locals so that a module
level once expression is still useful.

    >>> def foo(c):
    ...     def bar():
    ...         print once c
    ...     return bar
    ...
    >>> b1 = foo(1)
    >>> b2 = foo(2)
    >>> b1()
    1
    >>> b2()
    2

For this case, the important point is that execution of the once
expression is once per scope, not once per program.
Since running the function definition again creates a different
function object, the once expression gets executed again the first time
that function is called.

An advantage of first time execution for functions is that it can be
used to defer calculation of expensive default values to the first time
they're needed:

    def foo(c=None):
        if c is None:
            c = once calculate_expensive_default()
        # etc ...

With function definition time evaluation, the expensive default would
always be calculated even if the specific application always provided
an argument to the function and hence never actually needed the
default.

The one downside to this first time execution approach is that it means
'once' is NOT a solution to the early-binding vs late-binding problem
for closure variables. Forcing early binding would still require abuse
of function defaults, or a compiler directive along the lines of the
current 'global'. I
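For comparison, the dispatch semantics being proposed (a jump table built once, hash lookup at dispatch time, 'case 1, 2' and 'case *x' both populating the table) can already be approximated with a plain dict. All names below are invented for illustration; this sketches the semantics, not the proposed syntax:

```python
def make_switch(cases, default):
    """Build a jump table once, like the proposed def-time construction."""
    table = {}
    for keys, handler in cases:
        for key in keys:            # 'case *x' style: iterate to populate
            table[key] = handler
    def switch(value):
        # Dispatch is a single hash lookup, not an if/elif scan.
        return table.get(value, default)(value)
    return switch

describe = make_switch(
    [((1, 2), lambda v: "one or two"),        # like 'case 1, 2:'
     (range(10, 20), lambda v: "teens")],     # like 'case *x:'
    default=lambda v: "something else",
)
```

The table is constructed a single time when make_switch runs, which mirrors the 'static'/'once' evaluation question above: rebuilding the dispatcher (like redefining the function) rebuilds the table.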
Re: [Python-Dev] PyRange_New() alternative?
Martin v. Löwis wrote:
> Scott David Daniels wrote:
> > ... if I remember the standard correctly, the following code
> > shouldn't complain:
> >     PyObject_CallFunction((PyObject *)(void *)&PyRange_Type,
> >                           "lll", start, start + len*step, step)
> You remember the standard incorrectly. Python's usage of casts has
> undefined behaviour, and adding casts only makes the warning go away,
> but does not make the problem go away.

    ... (PyObject *)&PyRange_Type, ...

should only work in the presence of subtypes (which C89 and C99 don't
define). If there were a way to declare PyTypeObject as a subtype of
PyObject, then this cast should work.

    ... (PyObject *)(void *)&PyRange_Type, ...

says a pointer to PyRange_Type should have the structure of a pointer
to PyObject. Since the prelude to PyTypeObject matches that of
PyObject, this should be an effective cast. In addition, casting
pointers to and from void * should be silent -- _that_ is what I
thought I was remembering of the standard.

Do not mistake this for advocacy of changing Python's macros; I was
telling the OP how he could shut up the complaint he was getting. In C
extensions I'd be likely to do the convert-through-void-* trick myself.

--
Scott David Daniels
[EMAIL PROTECTED]
Re: [Python-Dev] PyObject* aliasing (Was: PyRange_New() alternative?)
Scott David Daniels wrote:
> > ... (PyObject *)(void *)&PyRange_Type, ...
> says a pointer to PyRange_Type should have the structure of a pointer
> to PyObject. Since the prelude to PyTypeObject matches that of
> PyObject, this should be an effective cast.

The C standard says it's undefined behaviour. The compiler is free to
lay out PyObject entirely differently from PyTypeObject, and
dereferencing a PyTypeObject through a PyObject* is undefined
behaviour. Python does this all the time, and it did not cause many
problems in the past, but that all still does not make it defined
behaviour.

In particular, gcc now starts to assume (rightfully) that a
PyTypeObject* and a PyObject* cannot possibly refer to the same memory
when being dereferenced (hence the warning about aliasing). That means
that the compiler does not need to re-read the contents of one of them
(e.g. the reference count) even if the memory gets changed through the
other pointer. That may cause bad code to be generated (if the pointers
actually do alias).

The only well-defined way to alias between types in this context is
that a pointer to a struct may alias with a pointer to its first
member. So if PyTypeObject was defined as

    struct PyTypeObject {
        PyObject _ob;
        Py_ssize_t ob_size;
        const char *tp_name;
        ...
    };

then Python's behaviour would be well-defined (i.e. one could
dereference the _ob member through a PyObject*).

> Do not mistake this for advocacy of changing Python's macros; I was
> telling the OP how he could shut up the complaint he was getting. In
> C extensions I'd be likely to do the convert-through-void-* trick
> myself.

And risk that the compiler generates bad code. In most cases, the
compiler cannot detect that a program breaks the standard C aliasing
rules. However, in some cases, it can, and in these cases, it issues a
warning to make the programmer aware that the program might be full of
errors (such as Python). It's unfortunate that people silence the
warnings before understanding them.

Regards,
Martin

P.S.
As for an example where the compiler really does generate bad code,
consider:

    #include <stdio.h>

    long f(int *a, long *d)
    {
        (*d)++;
        *a = 5;
        return *d;
    }

    int main()
    {
        long d = 0;
        printf("%ld\n", f((int *)&d, &d));
        return 0;
    }

Here, d starts out as 0, then gets incremented (to 1) in f. Then, a
value of 5 is assigned through a (which also points to d), and then the
value of d (through the pointer) is printed. Without optimization, gcc
4.1 generates code that prints 5. With optimization, the compiler
recognizes that d and a cannot alias, so that it does not need to
refetch *d at the end of f (it still has the value in a register). So
it does not reread the value, and instead returns the old value (1),
and prints that. In calling f, the compiler notices that undefined
behavior is invoked, and generates the warning. Casting through void*
silences the warning; the generated code is still incorrect (of course,
it's undefined, so anything is "correct").
Re: [Python-Dev] ImportWarning flood
On Sat, 24 Jun 2006 11:47:19 +0200, "Martin v. Löwis"
[EMAIL PROTECTED] wrote:
> Ralf W. Grosse-Kunstleve wrote:
> > --- Martin v. Löwis [EMAIL PROTECTED] wrote:
> > > I don't know. Whether a warning is a problem is a matter of
> > > attitude, also.
> > Our users will think our applications are broken if they see
> > warnings like that. It is not professional.
> Actually, your application *was* pretty close to being broken a few
> weeks ago, when Guido wanted to drop the requirement that a package
> must contain an __init__ file. In that case, import math would have
> imported the directory, and given you an empty package.

But this change was *not* made, and afaict it is not going to be made.
So the application is not broken, and the warning is entirely spurious.

I am very unhappy that the burden of understanding Python's package
structure is being pushed onto end users in this way. Several of my
projects now emit three or four warnings on import. The Twisted plugin
system relies on the fact that directories without __init__ are not
Python packages (since they _aren't_, have never been, and it has
always been extremely clear that Python will ignore them).

Of course, Twisted is a pretty marginal Python user so I'm sure no one
cares.
Re: [Python-Dev] ImportWarning flood
Jean-Paul Calderone wrote:
> I am very unhappy that the burden of understanding Python's package
> structure is being pushed onto end users in this way. Several of my
> projects now emit three or four warnings on import.

So are you requesting that the change is reverted?

Regards,
Martin
Re: [Python-Dev] ImportWarning flood
On Sat, Jun 24, 2006, Jean-Paul Calderone wrote:
> I am very unhappy that the burden of understanding Python's package
> structure is being pushed onto end users in this way. Several of my
> projects now emit three or four warnings on import. The Twisted
> plugin system relies on the fact that directories without __init__
> are not Python packages (since they _aren't_, have never been, and it
> has always been extremely clear that Python will ignore them). Of
> course, Twisted is a pretty marginal Python user so I'm sure no one
> cares.

Then again, bringing this back to the original source of this change,
Google is a pretty marginal Python user, too. ;-)

I was a pretty strong -1 on the original proposed change of allowing
import on empty directories, but my take is that if a project
deliberately includes empty directories, they can add a new warning
filter on program startup. Your users will have to upgrade to a new
version of the application or do a similar fix in their own
sitecustomize. I don't consider that a huge burden.

--
Aahz ([EMAIL PROTECTED]) * http://www.pythoncraft.com/

"I saw `cout' being shifted Hello world times to the left and stopped
right there." --Steve Gonedes
Re: [Python-Dev] ImportWarning flood
On Sat, 24 Jun 2006 07:27:15 -0700, Aahz [EMAIL PROTECTED] wrote:
> On Sat, Jun 24, 2006, Jean-Paul Calderone wrote:
> > I am very unhappy that the burden of understanding Python's package
> > structure is being pushed onto end users in this way. Several of my
> > projects now emit three or four warnings on import. The Twisted
> > plugin system relies on the fact that directories without __init__
> > are not Python packages (since they _aren't_, have never been, and
> > it has always been extremely clear that Python will ignore them).
> > Of course, Twisted is a pretty marginal Python user so I'm sure no
> > one cares.
> Then again, bringing this back to the original source of this change,
> Google is a pretty marginal Python user, too. ;-)

I think it is safe to say that Twisted is more widely used than
anything Google has yet released. Twisted also has a reasonably
plausible technical reason to dislike this change. Google has a bunch
of engineers who, apparently, cannot remember to create an empty
__init__.py file in some directories sometimes.

> I was a pretty strong -1 on the original proposed change of allowing
> import on empty directories, but my take is that if a project
> deliberately includes empty directories, they can add a new warning
> filter on program startup. Your users will have to upgrade to a new
> version of the application or do a similar fix in their own
> sitecustomize. I don't consider that a huge burden.

The usage here precludes fixing it in Twisted. Importing twisted itself
prints a warning: there's no way to get code run soon enough to
suppress this. I do think requiring each user to modify sitecustomize
is overly burdensome. Of course this is highly subjective and I don't
expect anyone to come to an agreement over it, but it seems clear that
it is at least a burden of some sort.

Jean-Paul
Re: [Python-Dev] Switch statement
Josiah Carlson wrote:
> This is a good thing, because if switch/case ends up functionally
> identical to if/elif/else, then it has no purpose as a construct.

there's no shortage of Python constructs that are functionally
identical to existing constructs. as with all syntactic sugar, the
emphasis should be on what the programmer wants to express, not how you
can artificially constrain the implementation to make the new thing
slightly different from what's already in there. and the point of
switch/case is to be able to say "I'm going to dispatch on a single
value" in a concise way; the rest is optimizations.

/F
Re: [Python-Dev] ImportWarning flood
On Sat, 24 Jun 2006 16:24:11 +0200, "Martin v. Löwis"
[EMAIL PROTECTED] wrote:
> Jean-Paul Calderone wrote:
> > I am very unhappy that the burden of understanding Python's package
> > structure is being pushed onto end users in this way. Several of my
> > projects now emit three or four warnings on import.
> So are you requesting that the change is reverted?
>
> Regards,
> Martin

Yes, please.
Re: [Python-Dev] Numerical robustness, IEEE etc.
Neal Norwitz [EMAIL PROTECTED] wrote:
> Seriously, there seems to be a fair amount of miscommunication in
> this thread. ... Actually, this isn't really a reply to you, but you
> have described the issue pretty well. The best design doc that I know
> of is code. :-) It would be much easier to communicate using code
> snippets. I'd suggest pointing out places in the Python code that are
> lacking and how you would correct them. That will make it easier for
> everyone to understand each other.

Yes. That is easy. What, however, I have part of (already) and was
proposing to do BEFORE going into details was to generate a testing
version that shows how I think that it should be done. Then people
could experiment with both the existing code and mine, to see the
differences. But, in order to do that, I needed to find out the best
way of going about it.

It wouldn't help with the red herrings, such as the reasons why it is
no longer possible to rely on hardware interrupts as a mechanism. But
they are only very indirectly relevant. The REASON that I wanted to do
that was precisely because I knew that very few people would be deeply
into arithmetic models, the details of C89 and C99 (ESPECIALLY as the
standard is incomplete :-( ), and so having a sandbox before starting
the debate would be a GREAT help. It's much easier to believe things
when you can try them yourself.

Facundo Batista [EMAIL PROTECTED] wrote:
> Well, so I'm completely lost... because, if all you want is to be
> able to choose a returned value or an exception raised, you actually
> can control that in Decimal.

Yes, but I have so far failed to get hold of a copy of the Decimal
code! I will have another go at subverting Subversion. I should VERY
much like to get hold of those documents AND build a testing version of
the code - then I can go away, experiment, and come back with some more
directed comments (not mere generalities).
Aahz [EMAIL PROTECTED] wrote:
> You can't expect us to do your legwork for you, and you can't expect
> that Tim Peters is the only person on the dev team who understands
> what you're getting at.

Well, see above for the former - I did post my intents in my first
message. And, as for the latter, I have tried asking what I can assume
that people know - it is offensive and time-consuming and hence
counter-productive to start off assuming that your audience does not
have a basic background. To repeat, it is precisely to address THAT
issue that I wanted to build a sandbox BEFORE going into details. If
people don't know the theory in depth but are interested, they could
experiment with the sandbox and see what happens in practice.

> Incidentally, your posts will go directly to python-dev without
> moderation if you subscribe to the list, which is a Good Idea if you
> want to participate in discussion.

Er, you don't receive a mailing list at all if you don't subscribe! If
that is the intent, I will see if I can find how to subscribe in the
unmoderated fashion. I didn't spot two methods on the Web pages when I
subscribed.

Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email: [EMAIL PROTECTED]
Tel.: +44 1223 334761    Fax: +44 1223 334679
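The Decimal control Facundo mentions (choosing between a returned value and a raised exception) looks roughly like this in the decimal module as it later shipped. This is a sketch for illustration; `localcontext` is the modern spelling of context switching, not something quoted from the thread:

```python
from decimal import Decimal, DivisionByZero, localcontext

# With the DivisionByZero trap disabled, the operation returns a
# result (Infinity) instead of raising.
with localcontext() as ctx:
    ctx.traps[DivisionByZero] = False
    quiet = Decimal(1) / Decimal(0)

# With the trap enabled, the same operation raises an exception.
with localcontext() as ctx:
    ctx.traps[DivisionByZero] = True
    try:
        Decimal(1) / Decimal(0)
        raised = False
    except DivisionByZero:
        raised = True
```

Each IEEE 854 condition (DivisionByZero, Overflow, Inexact, ...) has its own trap flag in the context, so the raise-vs-return choice is made per condition rather than globally.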
Re: [Python-Dev] Numerical robustness, IEEE etc.
To the moderator: this is getting ridiculous.

[EMAIL PROTECTED] wrote:
> > Unfortunately, that doesn't help, because it is not where the
> > issues are. What I don't know is how much you know about numerical
> > models, IEEE 754 in particular, and C99. You weren't active on the
> > SC22WG14 reflector, but there were some lurkers.
> Hand wave, hand wave, hand wave. Many of us here aren't stupid and
> have more than passing experience with numerical issues, even if we
> haven't been active on SC22WG14. Let's stop with the high-level
> pissing contest and lay out a clear technical description of exactly
> what has your knickers in a twist, how it hurts Python, and how we
> can all work together to make the pain go away.

SC22WG14 is the ISO committee that handles C standardisation. One of
the reasons that the UK voted "no" was because the C99 standard was
seriously incomprehensible in many areas to anyone who had not been
active on the reflector. If you think that I can summarise a blazing
row that went on for over 5 years and produced over a million lines of
technical argument alone into a clear technical description, you have
an exaggerated idea of my abilities.

I have a good many documents that I could post, but they would not
help. Some of them could be said to be clear technical descriptions,
but most of them were written for other audiences, and assume those
audiences' backgrounds. I recommend starting by reading the comments in
floatobject.c and mathmodule.c and then looking up the sections of the
C89 and C99 standards that are referenced by them.

> A good place to start: You mentioned earlier that there were some
> nonsensical things in floatobject.c. Can you list some of the most
> serious of these?

Well, try the following for a start:

    Python 2.4.2 (#1, May 2 2006, 08:28:01)
    [GCC 4.1.0 (SUSE Linux)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> a = "NaN"
    >>> b = float(a)
    >>> c = int(b)
    >>> d = (b == b)
    >>> print a, b, c, d
    NaN nan 0 False

    Python 2.3.3 (#1, Feb 18 2004, 11:58:04)
    [GCC 2.8.1] on sunos5
    Type "help", "copyright", "credits" or "license" for more information.
    >>> a = "NaN"
    >>> b = float(a)
    >>> c = int(b)
    >>> d = (b == b)
    >>> print a, b, c, d
    NaN NaN 0 True

That demonstrates that the error state is lost by converting to int,
and that NaN testing isn't reliable.

Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email: [EMAIL PROTECTED]
Tel.: +44 1223 334761    Fax: +44 1223 334679
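One way to test for a NaN without depending on platform string formatting (which the sessions above show differs between builds) is the IEEE 754 rule that a NaN never compares equal to itself. This is a sketch for illustration; `math.isnan` did not exist until Python 2.6, which is why such workarounds were common at the time:

```python
def is_nan(x):
    # IEEE 754: NaN is the only floating-point value unequal to itself.
    return x != x

# Produce a NaN arithmetically rather than via float("NaN"), which the
# sessions above show is platform-dependent on old interpreters.
nan = float("inf") - float("inf")
```

Note this only addresses detection; the loss of the error state when converting to int, which the sessions also show, is a separate problem.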
Re: [Python-Dev] Numerical robustness, IEEE etc.
Michael Hudson [EMAIL PROTECTED] wrote: But, a floating point exception isn't a machine check interrupt, it's a program interrupt... For reasons that I could go into, but are irrelevant, almost all modern CPU architectures have only ONE interrupt mechanism, and use it for both of those. It is the job of the interrupt handler (i.e. FLIH, first-level interrupt handler, usually in Assembler) to classify those, get into the appropriate state and call the interrupt handling code. Now, this is a Bad Idea, but separating floating-point exceptions from machine checks at the hardware level died with mainframes, as far as I know. The problem with the current approach is that it makes it very hard for the operating system to allow the application to handle the former. And the problem with most modern operating systems is that they don't even do what they could do at all well, because THAT died with the mainframes, too :-( The impact of all this mess on things like Python is that exception handling is a nightmare area, especially when you bring in threading (i.e. hardware threading with multiple cores, or genuinely parallel threading on a single core). Yes, I have brought a system down by causing too many floating-point exceptions in all threads of a highly parallel program on a large SMP. See, that wasn't so hard! We'd have saved a lot of heat and light if you'd said that at the start (and if you think you'd made it clear already: you hadn't). I thought I had. I accept your statement that I hadn't. Sorry. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: [EMAIL PROTECTED] Tel.: +44 1223 334761 Fax: +44 1223 334679 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Numerical robustness, IEEE etc.
Tim Peters [EMAIL PROTECTED] wrote: I suspect Nick spends way too much time reading standards ;-) God help me, YES! And in trying to get them improved. Both of which are very bad for my blood pressure :-( My real interest is in portable, robust programming - I DON'T abuse the terms to mean bitwise identical, but that is by the way - and I delved in here trying to write a jiffy little bit of just such code as part of a course example. BANG!!! It failed in both respects on the first two systems I tried on, and it wasn't my code that was wrong. The killer is that standards are the nearest to a roadmap for portability, especially portability and robustness. If you have non-conforming code, and it goes bananas, the compiler vendor will refuse to do anything, no matter how clearly it is a bug in the compiler or library. What is worse is that there is an incentive for the leading vendors (see below) to implement down to the standard, even when it is easier to do better. And this is happening in this area. What he said is: If you look at floatobject.c, you will find it solid with constructions that make limited sense in C99 but next to no sense in C89. And, in fact, C89 truly defines almost nothing about floating-point semantics or pragmatics. Nevertheless, if a thing works under gcc and under MS C, then it works for something like 99.9% of Python's users, and competitive pressures are huge for other compiler vendors to play along with those two. Yup, though you mean gcc on an x86/AMD64/EM64T system, and 99.9% is a rhetorical exaggeration - but one of the failures WAS on one of those! I don't know what specifically Nick had in mind, and join the chorus asking for specifics. That is why I wanted to: a) Read the decimal stuff and play around with the module and: b) Write a sandbox and sort out my obvious errors and: c) Write a PEP describing the issue and proposals BEFORE going into details. 
The devil is in the details, and I wanted to leave him sleeping until I had lined up my howitzers. I _expect_ he's got a keen eye for genuine coding bugs here, but also expect I'd consider many technically dubious bits of FP code to be fine under the de facto standard dodge. Actually, I tried to explain that I don't have many objections to the coding of the relevant files - whoever wrote them and I have a LOT of common attitudes :-) And I have been strongly into de facto standards for over 30 years, so am happy with them. Yes, I have found a couple of bugs, but not ones worth fixing (e.g. there is a use of x != x where PyISNAN should be used, and a redundant test for an already excluded case, but what the hell?) My main objection is that they invoke C behaviour in many places, and that is (a) mostly unspecified in C, (b) numerically insane in C99 and (c) broken in practice. So, sure, everything we do is undefined, but, no, we don't really care :-) If a non-trivial 100%-guaranteed-by-the-standard-to-work C program exists, I don't think I've seen it. I can prove that none exists, though I would have to trawl over SC22WG14 messages to prove it. I spent a LONG time trying to get undefined defined and used consistently (let alone sanely) in C, and failed dismally. BTW, Nick, are you aware of Python's fpectl module? That's user-contributed code that attempts to catch overflow, div-by-0, and invalid operation on 754 boxes and transform them into raising a Python-level FloatingPointError exception. Changes were made all over the place to try to support this at the time. Every time you see a PyFPE_START_PROTECT or PyFPE_END_PROTECT macro in Python's C code, that's the system it's trying to play nicely with. Normally, those macros have empty expansions. Aware of, yes. Have looked at, no. I have already beaten my head against that area and already knew the issues. I have even implemented run-time systems that got it right, and that is NOT pretty.
fpectl is no longer built by default, because repeated attempts failed to locate any but "ya, I played with it once, I think" users, and the masses of platform-specific #ifdef'ery in fpectlmodule.c were suffering fatal bit-rot. No users + no maintainers means I expect it's likely that module will go away in the foreseeable future. You'd probably hate its _approach_ to this anyway ;-) Oh, yes, I know that problem. You would be AMAZED at how many 'working' programs blow up when I turn it on on systems that I manage - not excluding Python itself (integer overflow) :-) And, no, I don't hate that approach, because it is one of the plausible ones; not good, but what can you do? Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: [EMAIL PROTECTED] Tel.: +44 1223 334761 Fax: +44 1223 334679 ___ Python-Dev mailing list Python-Dev@python.org
Re: [Python-Dev] Numerical robustness, IEEE etc.
Tim Peters [EMAIL PROTECTED] wrote: SC22WG14? is that some marketing academy? not a very good one, obviously. That's because it's European ;-) Er, please don't post ironic satire of that nature - many people will believe it! ISO is NOT European. It is the International Standards Organisation, of which ANSI is a member. And, for reasons and with consequences that are only peripherally relevant, SC22WG14 has always been dominated by ANSI. In fact, C89 was standardised by ANSI (sic), acting as an agent for ISO. C99 was standardised by ISO directly, but for various reasons only some of which I know, was even more ANSI-dominated than C89. Note that I am NOT saying big bad ANSI, as a large proportion of that was and is the ghastly support provided by many countries to their national standards bodies. The UK not excepted. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: [EMAIL PROTECTED] Tel.: +44 1223 334761 Fax: +44 1223 334679 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Numerical robustness, IEEE etc.
Terry Reedy [EMAIL PROTECTED] wrote: Of interest among their C-EPs is one for adding the equivalent of our decimal module http://www.open-std.org/jtc1/sc22/wg14/www/projects#24732 IBM is mounting a major campaign to get its general decimal arithmetic standardised as THE standard form of arithmetic. There is a similar (more advanced) move in C++, and they are working on Fortran. I assume that Cobol is already on board, and there may be others. There is nothing underhand about this - IBM is quite open about it, I believe that they are making all critical technologies freely available, and the design has been thought out and is at least half-sane - which makes it among the best 1-2% of IT technologies :-( Personally, I think that it is overkill, because it is a MASSIVELY complex solution, and will make sense only where at least two of implementation cost, performance, power usage and CPU/memory size are not constraints. E.g. mainframes, heavyweight commercial codes etc. but definitely NOT massive parallelism, very low power computing, micro-miniaturisation and so on. IEEE 754 was bad (which is why it is so often implemented only in part), but this is MUCH worse. God alone knows whether IBM will manage to move the whole of IT - they have done it before, and have failed before (often after having got further than this). Now, whether that makes it a good match for Python is something that is clearly fruitful grounds for debate :-) Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: [EMAIL PROTECTED] Tel.: +44 1223 334761 Fax: +44 1223 334679 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Numerical robustness, IEEE etc.
Jim Jewett [EMAIL PROTECTED] wrote: The conventional interpretation was that any operation that was not mathematically continuous in an open region including its argument values (in the relevant domain) was an error, and that all such errors should be flagged. That is what I am talking about. Not a bad goal, but not worth sweating over, since it isn't sufficient. It still allows functions whose continuity does not extend to the next possible floating point approximation, or functions whose value, while continuous, changes too much in that region. Oh, yes, quite. But I wasn't describing something that needed effort; I was merely describing the criterion that was traditionally used (and still is, see below). There is also the Principle of Least Surprise: the behaviour of a language should be at least explicable to mere mortals (a.k.a. ordinary programmers) - one that says whatever the designing committee thought good at the time is a software engineering disaster. For some uses, it is more important to be consistent with established practice than to be as correct as possible. If the issues are still problems, and can't be solved in languages like java, then ... the people who want correct behavior will be a tiny minority, and it makes sense to have them use a 3rd-party extension. I don't think that you understand the situation. I was and am describing established practice, as used by the numeric programmers who care about getting reliable answers - most of those still use Fortran, for good and sufficient reasons. There are two other established practices: "Floating-point is a figment of your imagination - don't support it." "Yeah. Right. Whatever. It's only approximate, so who gives a damn what it does?" Mine is the approach taken by the Fortran, C and C++ standards and many Fortran implementations, but the established practice in highly optimised Fortran and most C is the last.
Now, Java (to some extent) and C99 introduced something that attempts to eliminate errors by defining what they do (more-or-less arbitrarily); much as if Python said that, if a list or dictionary entry wasn't found, it would create one and return None. But that is definitely NOT established practice, despite the fact that its proponents claim it is. Even IEEE 754 (as specified) has never reached established practice at the language level. The very first C99 Annex F implementation that I came across appeared in 2005 (Sun One Studio 9 under Solaris 10 - BOTH are needed); I have heard rumours that HP-UX may have one, but neither AIX nor Linux does (even now). I have heard rumours that the latest Intel compiler may be C99 Annex F, but don't use it myself, and I haven't heard anything reliable either way for Microsoft. What is more, many of the tender documents for systems bought for numeric programming in 2005 said explicitly that they wanted C89, not C99 - none asked for C99 Annex F that I saw. No, C99 Annex F is NOT established practice and, God willing, never will be. For example, consider conversion between float and long - which class should control the semantics? The current python approach with binary fp is to inherit from C (consistency with established practice). The current python approach for Decimal (or custom classes) is to refuse to guess in the first place; people need to make an explicit conversion. How is this a problem? See above re C established practice. The above is not my point. I am talking about the generic problem where class A says that overflow should raise an exception, class B says that it should return infinity and class C says nothing. What should C = A*B do on overflow? [ Threading and interrupts ] No, that is a functionality issue, but the details are too horrible to go into here. Python can do next to nothing about them, except to distrust them - just as it already does.
Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: [EMAIL PROTECTED] Tel.: +44 1223 334761Fax: +44 1223 334679 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Numerical robustness, IEEE etc.
Nick Maclaren [EMAIL PROTECTED] wrote: Facundo Batista [EMAIL PROTECTED] wrote: Well, so I'm completely lost... because, if all you want is to be able to choose a returned value or an exception raised, you actually can control that in Decimal. Yes, but I have so far failed to get hold of a copy of the Decimal code! I will have another go at subverting Subversion. I should VERY much like to get hold of those documents AND build a testing version of the code - then I can go away, experiment, and come back with some more directed comments (not mere generalities). Download any Python 2.4 or 2.5 distribution. It will include the decimal.py module. Heck, you may even have it already, if you are running Python 2.4 or later. To see the latest version: http://svn.python.org/view/python/trunk/Lib/decimal.py If you want to see the C version of the decimal module, it is available: http://svn.python.org/view/sandbox/trunk/decimal-c/ The general magic incantation to see the SVN repository is: http://svn.python.org/view/ - Josiah ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
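[As a footnote to Facundo's point: the choice between a returned value and a raised exception is made through the trap flags on a decimal context. A minimal sketch, where the deliberately tiny Emax exists only to force an overflow:]

```python
from decimal import Context, Decimal, Overflow

# A deliberately tiny context so that 99 * 99 overflows (adjusted
# exponent 3 exceeds Emax = 2).
ctx = Context(prec=4, Emax=2)

# With the Overflow trap disabled, the result is returned as Infinity...
ctx.traps[Overflow] = False
result = ctx.multiply(Decimal(99), Decimal(99))
print(result)  # Infinity

# ...and with the trap enabled, the same operation raises instead.
ctx.traps[Overflow] = True
try:
    ctx.multiply(Decimal(99), Decimal(99))
except Overflow:
    print("Overflow raised")
```

The same traps mapping controls DivisionByZero, InvalidOperation and the other signals, so each program can pick its own policy per context.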
Re: [Python-Dev] Numerical robustness, IEEE etc.
On 6/23/06, Nick Maclaren [EMAIL PROTECTED] wrote: [EMAIL PROTECTED] wrote: Unfortunately, that doesn't help, because it is not where the issues are. What I don't know is how much you know about numerical models, IEEE 754 in particular, and C99. You weren't active on the SC22WG14 reflector, but there were some lurkers. Hand wave, hand wave, hand wave. [...] SC22WG14 is the ISO committee that handles C standardisation. [...] I'm not asking you to describe SC22WG14 or post detailed technical summaries of the long and painful road. I'd like you to post things directly relevant to Python with footnotes to necessary references. It is then incumbent on those that wish to respond to your post to familiarize themselves with the relevant background material. However, it is really darn hard to do that when we don't know what you're trying to fix in Python. The examples you show below are a good start in that direction. A good place to start: You mentioned earlier that there were some nonsensical things in floatobject.c. Can you list some of the most serious of these? Well, try the following for a start:

Python 2.4.2 (#1, May 2 2006, 08:28:01)
[GCC 4.1.0 (SUSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> a = "NaN"
>>> b = float(a)
>>> c = int(b)
>>> d = (b == b)
>>> print a, b, c, d
NaN nan 0 False

Python 2.3.3 (#1, Feb 18 2004, 11:58:04)
[GCC 2.8.1] on sunos5
Type "help", "copyright", "credits" or "license" for more information.
>>> a = "NaN"
>>> b = float(a)
>>> c = int(b)
>>> d = (b == b)
>>> print a, b, c, d
NaN NaN 0 True

That demonstrates that the error state is lost by converting to int, and that NaN testing isn't reliable. Now we're getting to business. There are actually at least 3 issues that I see:

1) The string representation of NaN is not standardized across platforms
2) on a sane platform, int(float('NaN')) should raise a ValueError exception for the int() portion
3) float('NaN') == float('NaN') should be false, assuming NaN is not a signaling NaN, by default

If we include Windows:

Python 2.5b1 (r25b1:47027, Jun 20 2006, 09:31:33) [MSC v.1310 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> a = "NaN"
>>> b = float(a)
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    b = float(a)
ValueError: invalid literal for float(): NaN

So:

4) in addition to #1, the platform atof sometimes doesn't accept any conventional spelling of NaN
5) All of the above likely applies to infinities and +-0

So the open question is how to both define the semantics of Python floating point operations and to implement them in a way that verifiably works on the vast majority of platforms without turning the code into a maze of platform-specific defines, kludges, or maintenance problems waiting to happen. Thanks, -Kevin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
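[For what it's worth, the behaviour points 2 and 3 ask for can be checked directly on a platform where float('NaN') parses; is_nan below is just the classic self-inequality test, not a library function:]

```python
nan = float('NaN')

# Point 3: a quiet NaN never compares equal, even to itself, so the
# portable test for NaN-ness is "x != x".
def is_nan(x):
    return x != x

# Point 2: int() of a NaN should raise rather than silently produce 0.
try:
    int(nan)
    silently_converted = True
except ValueError:
    silently_converted = False
```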
Re: [Python-Dev] Switch statement
Guido van Rossum wrote: just map

switch EXPR:
case E1: ...
case in E2: ...
else: ...

to

VAR = EXPR
if VAR == E1: ...
elif VAR in E2: ...
else: ...

where VAR is a temporary variable, and case and case-in clauses can be freely mixed, and leave the rest to the code generator. (we could even allow switch EXPR [as VAR] to match a certain other sugar construct). This used to be my position. I switched after considering the alternatives for what should happen if either the switch expression or one or more of the case expressions is unhashable. I don't see this as much of a problem, really: we can simply restrict the optimization to well-known data types (homogenous switches using integers or strings should cover 99.9% of all practical cases), and then add an opcode that uses a separate dispatch object to check if fast dispatch is possible, and place that before an ordinary if/elif sequence. the dispatch object is created when the function object is created, along with default values and statics. if fast dispatch cannot be used for a function instance, the dispatch object is set to None, and the dispatch opcode turns into a NOP. (each switch statement should of course have its own dispatch object). /F ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
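[A rough sketch of the two paths Fredrik describes, written in today's Python with a plain dict standing in for the per-function dispatch object; all names are illustrative, not the proposed implementation:]

```python
# Fallback path: the if/elif chain that every switch is equivalent to.
def classify(value):
    if value == 1:
        return 'one'
    elif value in (2, 3):
        return 'few'
    else:
        return 'many'

# Fast path: a dispatch table precomputed once, as the per-function
# "dispatch object" would be.  An unhashable value raises TypeError
# on lookup, so it falls back to the slow path.
_dispatch = {1: 'one', 2: 'few', 3: 'few'}

def classify_fast(value):
    try:
        return _dispatch[value]
    except (KeyError, TypeError):   # no case matched, or unhashable
        return classify(value)
```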
Re: [Python-Dev] Switch statement
At 07:04 PM 6/24/2006 +0200, Fredrik Lundh wrote: I don't see this as much of a problem, really: we can simply restrict the optimization to well-known data types (homogenous switches using integers or strings should cover 99.9% of all practical cases), and then add an opcode that uses a separate dispatch object to check if fast dispatch is possible, and place that before an ordinary if/elif sequence. What about switches on types? Things like XML-RPC and JSON want to be able to have a fast switch on an object's type and fall back to slower tests only for non-common cases. For that matter, you can build an effective multiway isinstance() check using something like:

for t in obtype.__mro__:
    switch t:
    case int: ...; break
    case str: ...; break
    else:
        continue
else:
    # not a recognized type

This is essentially what RuleDispatch does in generic functions' dispatch trees now, albeit without the benefit of a switch statement or opcode. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
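[The multiway isinstance() check can be sketched with an ordinary dict in current Python; the handler table and return values here are purely illustrative:]

```python
# Dispatch on an object's type: walk the MRO and take the first
# registered type, so subclasses reach their base class's case.
handlers = {
    int: lambda v: 'int:%d' % v,
    str: lambda v: 'str:%s' % v,
}

def dispatch(obj):
    for t in type(obj).__mro__:
        if t in handlers:
            return handlers[t](obj)
    return 'not a recognized type'

# A subclass is still dispatched via its base class's entry.
class TaggedInt(int):
    pass
```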
Re: [Python-Dev] ImportWarning flood
--- Jean-Paul Calderone [EMAIL PROTECTED] wrote: I think it is safe to say that Twisted is more widely used than anything Google has yet released. Twisted also has a reasonably plausible technical reason to dislike this change. Google has a bunch of engineers who, apparently, cannot remember to create an empty __init__.py file in some directories sometimes. Simply adding a note to the ImportError message would solve this problem just in time:

>>> import mypackage.foo
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ImportError: No module named mypackage.foo
Note that subdirectories are searched for imports only if they
contain an __init__.py file: http://www.python.org/doc/essays/packages.html

__ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Switch statement
On Fri, 23 Jun 2006, Josiah Carlson wrote: This is a good thing, because if switch/case ends up functionally identical to if/elif/else, then it has no purpose as a construct. This doesn't make sense as a rule. Consider: If x.y ends up functionally identical to getattr(x, 'y'), then it has no purpose as a construct. If print x ends up functionally identical to import sys; sys.stdout.write(str(x) + '\n'), then it has no purpose as a construct. What matters is not whether it's *functionally* identical. What matters is whether it makes more sense to the reader and has a meaning that is likely to be what the writer wanted. Evaluate the switch expression just once is a semantic win. Evaluate the switch expression just once, but throw an exception if the result is not hashable is a weaker semantic win. (How often is that what the writer is thinking about?) Throw an exception at compile time if the cases overlap is also a weaker semantic win. (How often is this an actual mistake that the writer wants to be caught at compile time?) Use the case values computed at compile time, not at runtime doesn't seem like much of a win. (How often will this be what the writer intended, as opposed to a surprise hiding in the bushes?) -- ?!ng ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Switch statement
Ka-Ping Yee [EMAIL PROTECTED] wrote: On Fri, 23 Jun 2006, Josiah Carlson wrote: This is a good thing, because if switch/case ends up functionally identical to if/elif/else, then it has no purpose as a construct. This doesn't make sense as a rule. Consider: If x.y ends up functionally identical to getattr(x, 'y'), then it has no purpose as a construct. If print x ends up functionally identical to import sys; sys.stdout.write(str(x) + '\n'), then it has no purpose as a construct. I agree with you completely, it doesn't make sense as a rule, but that was not its intent. Note that I chose specific values of X and Y in if X is functionally identical to Y, then it has no purpose as a construct such that it did make sense. What matters is not whether it's *functionally* identical. What matters is whether it makes more sense to the reader and has a meaning that is likely to be what the writer wanted. Evaluate the switch expression just once is a semantic win. Evaluate the switch expression just once, but throw an exception if the result is not hashable is a weaker semantic win. (How often is that what the writer is thinking about?) Throw an exception at compile time if the cases overlap is also a weaker semantic win. (How often is this an actual mistake that the writer wants to be caught at compile time?) Use the case values computed at compile time, not at runtime doesn't seem like much of a win. (How often will this be what the writer intended, as opposed to a surprise hiding in the bushes?) The reasons by themselves don't seem to make sense, until you look at them in the scope from which the decisions were made. Just like the word Excelsior makes no sense until you hear Minnesota. Those final three rules, when seen in the context of the rest of the conversation, and with the understanding that one of the motivating purposes is to improve execution time, do offer methods and mechanisms to answer those motivations. 
- Josiah ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Switch statement
Fredrik Lundh wrote: Guido van Rossum wrote: just map

switch EXPR:
case E1: ...
case in E2: ...
else: ...

to

VAR = EXPR
if VAR == E1: ...
elif VAR in E2: ...
else: ...

where VAR is a temporary variable, and case and case-in clauses can be freely mixed, and leave the rest to the code generator. (we could even allow switch EXPR [as VAR] to match a certain other sugar construct). This used to be my position. I switched after considering the alternatives for what should happen if either the switch expression or one or more of the case expressions is unhashable. I don't see this as much of a problem, really: we can simply restrict the optimization to well-known data types (homogenous switches using integers or strings should cover 99.9% of all practical cases) +1 This would keep it simple to use. A possibility that hasn't been mentioned yet is to supply a precomputed jump table to a switch explicitly.

table = {expr1: 1, expr2: 2, ...}

for value in data:
    switch table[value]:
    case 1: ...
    case 2: ...
    ...
    else: ...

(I prefer indented case's, but it's not the point here. I can get used to them not being indented.) It is an easy matter to lift evaluation of the switch table expressions out of inner loops or even out of functions. (if it's needed of course) Or an alternate form may allow a pre-evaluated jump table to be explicitly substituted directly at the time of use. (would this be possible?)

def switcher(value, table):
    switch value, table:
    case 1: ...
    case 2: ...
    ...
    else: ...

Cheers, Ron ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
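[Ron's precomputed-table idea can already be approximated with a dict built once outside the loop; a small sketch, with illustrative values and tags:]

```python
# Built once, possibly far from the loop that uses it; the table maps
# raw values to small case tags, and the loop branches on the tag.
table = {'a': 1, 'b': 2, 'c': 2}

def process(data):
    counts = {1: 0, 2: 0, 'else': 0}
    for value in data:
        tag = table.get(value, 'else')   # 'else' plays the default role
        counts[tag] += 1
    return counts
```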
[Python-Dev] Simple Switch statement
From what I can see, almost everyone wants a switch statement, though perhaps for different reasons. The main points of contention are 1) a non-ambiguous syntax for assigning multiple cases to a single block of code, 2) how to compile variables as constants in a case statement, and 3) handling overlapping cases. Here's a simple approach that will provide most of the benefit without trying to overdo it:

switch f(x):          # any expression is allowable here but raises an exception if the result is not hashable
case 1: g()           # matches when f(x)==1
case 2, 3: h()        # matches when f(x) in (2,3)
case 1: i()           # won't ever match because the first "case 1" wins
case (4,5), 6: j()    # matches when f(x) in ((4,5), 6)
case "bingo": k()     # matches when f(x) in ("bingo",)
default: l()          # matches if nothing else does

Though implemented as a hash table, this would execute as if written:

fx = f(x)
hash(fx)
if fx in (1,): g()
elif fx in (2, 3): h()
elif fx in (1,): i()
elif fx in ((4,5), 6): j()
elif fx in ("bingo",): k()
else: l()

The result of f(x) should be hashable or an exception is raised. Case values must be ints, strings, or tuples of ints or strings. No expressions are allowed in cases. Since a hash table is used, the fx value must support __hash__ and __eq__, but not expect multiple __eq__ tests as in the elif version. I've bypassed the constantification issue. This comes up throughout Python and is not unique to the switch statement. If someone wants a static or const declaration, it should be evaluated separately on its own merits.
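[The dispatch table these rules imply can be sketched with a hypothetical helper: each case contributes a tuple of values, and setdefault() gives the first-case-wins behaviour of the elif expansion:]

```python
def build_table(cases, default):
    # cases: a sequence of (values, action) pairs, in source order.
    table = {}
    for values, action in cases:
        for v in values:
            table.setdefault(v, action)   # the first "case 1" wins
    # Lookup hashes fx exactly once; unhashable fx raises TypeError.
    return lambda fx: table.get(fx, default)

# The cases from the example above, with action names as stand-ins
# for the suites g() ... l().
dispatch = build_table(
    [((1,), 'g'), ((2, 3), 'h'), ((1,), 'i'),
     (((4, 5), 6), 'j'), (('bingo',), 'k')],
    default='l')
```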
At first, I was bothered by not supporting sre style use cases with imported codes; however, I noticed that sre's imported constants already have values that correspond to their variable names and that that commonplace approach makes it easy to write fast switch-case suites:

def _compile(code, pattern, flags):
    # internal: compile a (sub)pattern
    for op, av in pattern:
        switch op:
        case 'literal', 'not_literal':
            if flags & SRE_FLAG_IGNORECASE:
                emit(OPCODES[OP_IGNORE[op]])
                emit(_sre.getlower(av, flags))
            else:
                emit(OPCODES[op])
                emit(av)
        case 'in':
            if flags & SRE_FLAG_IGNORECASE:
                emit(OPCODES[OP_IGNORE[op]])
                def fixup(literal, flags=flags):
                    return _sre.getlower(literal, flags)
            else:
                emit(OPCODES[op])
                fixup = _identityfunction
            skip = _len(code); emit(0)
            _compile_charset(av, flags, code, fixup)
            code[skip] = _len(code) - skip
        case 'any':
            if flags & SRE_FLAG_DOTALL:
                emit(OPCODES[ANY_ALL])
            else:
                emit(OPCODES[ANY])
        case 'repeat', 'min_repeat', 'max_repeat':
            . . .

When the constants are mapped to integers instead of strings, it is no burden to supply a reverse mapping like we already do in opcode.py. This commonplace setup also makes it easy to write fast switch-case suites:

from opcode import opname

def calc_jump_statistics(f):
    reljumps = absjumps = 0
    for opcode, oparg in gencodes(f.func_code.co_code):
        switch opname[opcode]:
        case 'JUMP_FORWARD', 'JUMP_IF_FALSE', 'JUMP_IF_TRUE':
            reljumps += 1
        case 'JUMP_ABSOLUTE', 'CONTINUE_LOOP':
            absjumps += 1
        . . .

So, that is it, my proposal for simple switch statements with a straight-forward implementation, fast execution, simply explained behavior, and applicability to the most important use cases. Raymond P.S. For the sre case, we get a great benefit from using strings. Since they are all interned at compile time and have their hash values computed no more than once, the dispatch table will never have to actually calculate a hash and the full string comparison will be bypassed because identity implies equality. That's nice.
The code will execute clean and fast. AND we get readability improvements too. Not bad. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Simple Switch statement
At 03:49 PM 6/24/2006 -0700, Raymond Hettinger wrote: Cases values must be ints, strings, or tuples of ints or strings. -1. There is no reason to restrict the types in this fashion. Even if you were trying to ensure marshallability, you could still include unicode and longs. However, there isn't any need for marshallability here, and I would like to be able to use switches on types, enumerations, and the like. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] PyObject_CallFunction and 'N' format char
Sorry this is slightly offtopic, but I think it's important... According to recent experiments tracking down a memory leak, it seems that PyObject_CallFunction(func, "N", object) and PyObject_CallFunction(func, "O", object) are doing exactly the same thing. However, the documentation says "The C arguments are described using a Py_BuildValue() style format string." And of course Py_BuildValue consumes one object reference, according to the documentation and practice. However, PyObject_CallFunction does _not_ consume such an object reference, contrary to what I believed for years. God knows how many leaks I may have introduced in my bindings... :| Any comments?
Re: [Python-Dev] Simple Switch statement
[Phillip Eby] I would like to be able to use switches on types, enumerations, and the like. Be careful about wanting everything and getting nothing. My proposal is the simplest thing that gets the job done for key use cases found in real code. Also, it is defined tightly enough to allow room for growth and elaboration over time. Good luck proposing some alternative that is explainable, has no hidden surprises, has an easy implementation, and allows fast hash-table style dispatch. Besides, if you want to switch on other types, it is trivial to include a reverse mapping (like that in the opcode.py example). Reverse mappings are easy to build and easy to read:

    # enumeration example
    colormap = {}
    for code, name in enumerate('RED ORANGE YELLOW GREEN BLUE INDIGO MAGENTA'.split()):
        globals()[name] = code
        colormap[code] = name

    def colormixer(color):
        switch colormap[color]:
        case 'RED', 'YELLOW', 'BLUE':
            handle_primary()
        case 'MAGENTA':
            get_another_color()
        default:
            handle_rest()

    colormixer(RED)
    colormixer(ORANGE)

Raymond
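The colormixer example above uses the proposed switch syntax and so cannot run today. For comparison, here is a runnable sketch of the same reverse-mapping idea expressed with a plain if/elif chain; the handlers are replaced by return values (an invention of this sketch) so the behavior is observable:

```python
# Runnable rendering of the reverse-mapping enumeration idea: integer
# codes map back to readable names, and dispatch happens on the names.
colormap = {}
for code, name in enumerate('RED ORANGE YELLOW GREEN BLUE INDIGO MAGENTA'.split()):
    globals()[name] = code   # RED = 0, ORANGE = 1, ...
    colormap[code] = name

def colormixer(color):
    name = colormap[color]
    if name in ('RED', 'YELLOW', 'BLUE'):
        return 'primary'
    elif name == 'MAGENTA':
        return 'another color'
    else:
        return 'rest'
```

The dict lookup plus string comparison is what the proposed switch would replace with a single hash-table jump.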
Re: [Python-Dev] Simple Switch statement
On Sat, Jun 24, 2006, Raymond Hettinger wrote: So, that is it, my proposal for simple switch statements with a straight-forward implementation, fast execution, simply explained behavior, and applicability to the most important use cases. +1 I've been trying to write a response to these threads. I don't particularly like what looks like an attempt to shove together too many different features into a single package. Raymond's proposal gives Python the switch statement people have been demanding while leaving room for the improvements that have been suggested over a plain switch. Phillip's point about longs and Unicode is valid, but easily addressed by limiting cases to hashable literals (though we might want to explicitly exclude floats). -- Aahz ([EMAIL PROTECTED]) * http://www.pythoncraft.com/ I saw `cout' being shifted "Hello world" times to the left and stopped right there. --Steve Gonedes
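Aahz's aside about explicitly excluding floats is easy to motivate with a short example: floats are hashable, but values that print alike need not compare equal, so a hash-table case lookup can silently miss:

```python
# Why floats are dubious as case labels: a computed value that "looks
# like" a literal may not equal it under IEEE-754 arithmetic.
table = {0.3: 'matched'}
key = 0.1 + 0.2                      # actually 0.30000000000000004
result = table.get(key, 'no match')  # hash lookup misses
```

An int or string case label has no such failure mode, which is part of why the simple proposal restricts labels to those types.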
Re: [Python-Dev] Simple Switch statement
At 05:30 PM 6/24/2006 -0700, Raymond Hettinger wrote: [Phillip Eby] I would like to be able to use switches on types, enumerations, and the like. Be careful about wanting everything and getting nothing. My proposal is the simplest thing that gets the job done for key use cases found in real code. It's ignoring at least symbolic constants and types -- which are certainly key use cases found in real code. Besides which, this is Python. We don't select a bunch of built-in types and say these are the only types that work. Instead, we have protocols (like __hash__ and __eq__) that any object may implement. If you don't want expressions to be implicitly lifted to function definition time, you'd probably be better off arguing to require the use of explicit 'static' for non-literal case expressions. (Your reverse mapping, by the way, is a non-starter -- it makes the code considerably more verbose and less obvious than a switch statement, even if every 'case' has to be decorated with 'static'.)
Re: [Python-Dev] Simple Switch statement
Raymond Hettinger wrote: [Phillip Eby] I would like to be able to use switches on types, enumerations, and the like. Be careful about wanting everything and getting nothing. My proposal is the simplest thing that gets the job done for key use cases found in real code. Also, it is defined tightly enough to allow room for growth and elaboration over time. Good luck proposing some alternative that is explainable, has no hidden surprises, has an easy implementation, and allows fast hash-table style dispatch. I like it! You could actually make it even simpler by having the initial implementation only permit strings for the cases. Then the concept is:

1. Each case in the switch is given one or more string names
2. The same name cannot appear more than once in a single switch statement
3. A case is executed when the switch value matches one of its names
4. The else clause is executed if the switch value does not match any case
5. Case names use string-literal syntax to permit later expansion
6. Switching on non-strings requires an auxiliary lookup

The advantage over the status quo is that instead of having to identify code directly (as in a function dispatch table), the auxiliary lookup only has to identify the name of the appropriate case. And it still leaves the door open for all the other features being considered:

- literals other than strings in the cases (integers, tuples)
- arbitrary expressions in the cases (needs 'static' expressions first)
- sequence unpacking using 'in' or '*'

Besides, if you want to switch on other types, it is trivial to include a reverse mapping (like that in the opcode.py example).
Reverse mappings are easy to build and easy to read: You can even build the jump table after the fact if everything you want to switch on is an existing global or builtin variable:

    def switch_table(*args):
        # Build a string switch table for a set of arguments
        # All arguments must exist in the current global namespace
        # All arguments must be hashable
        all_items = globals().items()
        all_items.extend(__builtins__.__dict__.items())
        table = {}
        for obj in args:
            for name, value in all_items:
                if obj is value:
                    table[obj] = name
        return table

    typemap = switch_table(float, complex, int, long, str, unicode, Decimal)
    pprint(typemap)
    {<class 'decimal.Decimal'>: 'Decimal',
     <type 'complex'>: 'complex',
     <type 'float'>: 'float',
     <type 'int'>: 'int',
     <type 'long'>: 'long',
     <type 'str'>: 'str',
     <type 'unicode'>: 'unicode'}

Armed with that switch table, you can then do:

    def fast_dispatch(self, other):
        switch typemap[other.__class__]:
        case 'Decimal':
            self.handle_decimal(other)
        case 'int', 'long':
            self.handle_integer(other)
        case 'float':
            self.handle_float(other)
        case 'complex':
            self.handle_complex(other)
        case 'str', 'unicode':
            self.handle_string(other)
        else:
            self.handle_any(other)

-- Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia --- http://www.boredomandlaziness.org
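Nick's typemap routes type dispatch through strings so a string-only switch can handle it. Since type objects are themselves hashable, the status-quo alternative is to key the dispatch dict on the types directly; a minimal runnable sketch (handler names hypothetical, return values invented for illustration):

```python
# Status-quo alternative to the string detour: key the dispatch dict on
# the type objects themselves.  Handler names are hypothetical.
def handle_integer(x):
    return 'integer'

def handle_float(x):
    return 'float'

def handle_string(x):
    return 'string'

def handle_any(x):
    return 'other'

HANDLERS = {int: handle_integer, float: handle_float, str: handle_string}

def fast_dispatch(value):
    # type() gives the exact class; subclasses fall through to handle_any.
    return HANDLERS.get(type(value), handle_any)(value)
```

This is exactly the "hashable object to callable object" table the thread contrasts with the proposed switch: it works today, at the cost of a function call per branch.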
Re: [Python-Dev] Simple Switch statement
Raymond Hettinger wrote: From what I can see, almost everyone wants a switch statement, though perhaps for different reasons. The main points of contention are 1) a non-ambiguous syntax for assigning multiple cases to a single block of code, 2) how to compile variables as constants in a case statement, and 3) handling overlapping cases. Here's a simple approach that will provide most of the benefit without trying to overdo it: Looks good to me.

    switch f(x):              # any expression is allowable here but raises
                              # an exception if the result is not hashable
    case 1: g()               # matches when f(x) == 1
    case 2, 3: h()            # matches when f(x) in (2, 3)
    case 1: i()               # won't ever match because the first case 1 wins
    case (4, 5), 6: j()       # matches when f(x) in ((4, 5), 6)
    case 'bingo': k()         # matches when f(x) in ('bingo',)
    default: l()              # matches if nothing else does

Though implemented as a hash table, this would execute as if written:

    fx = f(x)
    hash(fx)
    if fx in (1,): g()
    elif fx in (2, 3): h()
    elif fx in (1,): i()
    elif fx in ((4, 5), 6): j()
    elif fx in ('bingo',): k()
    else: l()

The result of f(x) should be hashable or an exception is raised. Case values must be ints, strings, or tuples of ints or strings. No expressions are allowed in cases. Since a hash table is used, the fx value must support __hash__ and __eq__, but not expect multiple __eq__ tests as in the elif version. I've bypassed the constantification issue. This comes up throughout Python and is not unique to the switch statement. If someone wants a static or const declaration, it should be evaluated separately on its own merits. Yes, I agree. When the constants are mapped to integers instead of strings, it is no burden to supply a reverse mapping like we already do in opcode.py.
This commonplace setup also makes it easy to write fast switch-case suites:

    from opcode import opname

    def calc_jump_statistics(f):
        reljumps = absjumps = 0
        for opcode, oparg in gencodes(f.func_code.co_code):
            switch opname[opcode]:
            case 'JUMP_FORWARD', 'JUMP_IF_FALSE', 'JUMP_IF_TRUE':
                reljumps += 1
            case 'JUMP_ABSOLUTE', 'CONTINUE_LOOP':
                absjumps += 1
            . . .

So, that is it, my proposal for simple switch statements with a straight-forward implementation, fast execution, simply explained behavior, and applicability to the most important use cases. Just what I was looking for! +1 I happen to like simple modular code that when combined is more than either alone, which I believe is the case here when using mappings with switches. This type of synergy is common in Python and I have no problem using a separate lookup map to do early and/or more complex evaluations for cases. Cheers, Ron

Raymond P.S. For the sre case, we get a great benefit from using strings. Since they are all interned at compile time and have their hash values computed no more than once, the dispatch table will never have to actually calculate a hash and the full string comparison will be bypassed because identity implies equality. That's nice. The code will execute clean and fast. AND we get readability improvements too. Not bad.
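The semantics Raymond spells out (hash-table dispatch, first duplicate case wins, unhashable values rejected) can be sketched today as a dict built once from the case list; everything here is an illustrative model, not the proposed implementation:

```python
# Model of the proposed semantics: case labels flattened into a dict
# built once, with the first occurrence of a duplicate label winning.
def build_jump_table(cases):
    """cases: list of (labels, handler) pairs in source order."""
    table = {}
    for labels, handler in cases:
        for label in labels:
            table.setdefault(label, handler)   # first case wins
    return table

CASES = [
    ((1,), lambda: 'g'),
    ((2, 3), lambda: 'h'),
    ((1,), lambda: 'i'),          # dead code: the first case 1 wins
    (((4, 5), 6), lambda: 'j'),
]
TABLE = build_jump_table(CASES)

def switch_on(fx, default=lambda: 'l'):
    hash(fx)                      # unhashable switch values raise TypeError
    return TABLE.get(fx, default)()
```

Note how a single hash lookup replaces the chain of `in` tests in the elif expansion, which is the whole performance argument for the statement.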
Re: [Python-Dev] Simple Switch statement
Phillip J. Eby wrote: At 05:30 PM 6/24/2006 -0700, Raymond Hettinger wrote: [Phillip Eby] I would like to be able to use switches on types, enumerations, and the like. Be careful about wanting everything and getting nothing. My proposal is the simplest thing that gets the job done for key use cases found in real code. It's ignoring at least symbolic constants and types -- which are certainly key use cases found in real code. Raymond's idea is a step on the road, not necessarily the endpoint. Its cleverness lies in the fact that it removes the dependency between getting a switch statement that will help with the standard library's current use cases and getting static expressions. Being able to build a dispatch table as hashable object -> case name instead of having to build it as hashable object -> callable object is a significant improvement over the status quo, even if it doesn't solve everything. It doesn't get rid of the need for the separate dispatch table, but it does eliminate the need to turn everything into a separate function, and it also allows the cases to modify the function's local namespace. Note that, *if* static expressions or an equivalent are added later, there is nothing preventing them being integrated (implicitly or otherwise) into Raymond's simplified switch statement. The simplified proposal breaks the current discussion into two separately PEP-able concepts:

a. add a literals-only switch statement for fast local dispatch (PEP 275)
b. add the ability to designate code for once-only evaluation

Removing the limitations on the initial version of the switch statement would then be one of the motivating use cases for the second PEP (which would be a new PEP to thrash out whether such evaluation should be at first execution time so it works everywhere, or at function definition time, so it only works at all in functions and behaves very surprisingly if inside a loop or conditional statement). Cheers, Nick.
-- Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia --- http://www.boredomandlaziness.org
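Nick's contrast between "hashable -> case name" and "hashable -> callable" tables can be seen in miniature: with today's callable tables, each branch is a separate function and so cannot rebind the caller's locals, only communicate through return values. A hedged sketch with invented names:

```python
# The status-quo "hashable -> callable" table: each branch lives in its
# own function, so it cannot rebind the caller's local variables and
# must pass results back through return values instead.
def process(tag):
    count = 0

    def on_a():
        return 1    # cannot simply do "count += 1" on the caller's local

    def on_b():
        return 2

    table = {'a': on_a, 'b': on_b}
    count += table.get(tag, lambda: 0)()
    return count
```

With a real switch statement the branch bodies would be inline suites sharing `process`'s namespace, which is exactly the improvement Nick identifies.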
[Python-Dev] Alternatives to switch?
At first I was pretty excited about the switch proposals, but having read the various discussions I have to say that my enthusiasm has cooled quite a bit. The set of proposals that are currently being put forward have enough caveats and restrictions as to make the statement far less useful than I had originally hoped. In fact, I'd like to point out something that hasn't been brought up, which is that in many cases having a closure rebind the switch cases defeats the purpose of the thing. For example:

    def outer():
        def inner(x):
            switch(x):
            case 1: ...
            case 2: ...
            case 3: ...
        return inner

If the switch cases are bound at the time that 'inner' is defined, it means that the hash table will be rebuilt each time 'outer' is called. But what if 'inner' is only intended to be used once? It means that the performance advantage of switch is completely negated. On the other hand, if 'inner' is intended to be used many times, then 'switch' is a win. But the compiler has no idea which of these two cases is true. I want to try and think out of the box here, and ask the question what exactly we wanted a switch statement for, and if there are any other ways to approach the problem. Switch statements are used to efficiently dispatch based on a single value to a large number of alternative code paths. The primary uses that have been put forward are in interpreting parse trees (the 'sre' example is a special case of this) and pickling. In fact, I would say that these two use cases alone justify the need for some improvement to the language, since both are popular design patterns, and both are somewhat ill-served by the current limits of the language. Parse trees and pickling can, in fact, be considered as two examples of a broader category of external interpretation of an object graph.
By external interpretation I mean that the inspection and transformation of the data is not done via method calls on the individual objects, but rather by code outside of the object graph that recognizes individual object types or values and takes action based on that. This type of architectural pattern manifests in nearly every sub-branch of software engineering, and usually appears when you have a complex graph of objects where it is inadvisable, for one reason or another, to put the behavior in the objects themselves. There are several possible reasons why this might be so. Perhaps the operations involve the relationships between the objects rather than the objects themselves (this is the parse tree case.) Or perhaps for reasons of modularity, it is desired that the objects not have built-in knowledge of the type of operation being performed - so that, for example, you can write several different kinds of serializers for an object graph without having to have the individual objects have special understanding of each different serializer type. This is the pickle case. I'd like to propose that we consider this class of problems (external interpretation of an object graph) as the 'reference' use case for discussions of the merits of the switch statement, and that evaluation of the merits of language changes be compared against this reference. Here are my reasons for suggesting this:

-- The class is broad and encompasses a large set of practical, real-world applications.
-- The class is not well-served by 'if-elif-else' dispatching styles.
-- There have been few, if any, use cases in support of a switch statement that don't fall within this class.

So how does a switch statement help with this problem? Currently in Python, there are a limited number of ways to do N-way dispatching:

-- if-elif-else chains
-- method overloading
-- dicts/arrays of function or method pointers
-- exotic and weird solutions such as using try/except as a dispatching mechanism.
(I can't think of any others, but I am sure I missed something.) We've already discussed the issues with if-elif-else chains, in particular the fact that they have O(N) performance instead of O(1). The next two options both have in common the fact that they require the dispatch to go through a function call. This means that you are paying for the (allegedly expensive) Python function dispatch overhead, plus you no longer have access to any local variables which happened to be in scope when the dispatch occurred. It seems to me that the desire for a switch statement is a desire to get around those two limitations - in other words, if function calls were cheap, and there was an easy way to do dynamic scoping so that called functions could access their caller's variables, then there wouldn't be nearly as much of a desire for a switch statement. For example, one might do a pickling function along these lines:

    dispatch_table = None

    def marshall( data ):
        type_code = None
        object_data = None

        def int_case():
            type_code =
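The post breaks off mid-example, so the intended code is lost. As a hypothetical completion of the pattern it appears to sketch (a per-call dispatch table of local handler functions), with all names and type codes invented for illustration:

```python
# Hypothetical completion of the truncated pattern above: a per-call
# table of closures, each handler encoding one supported type.
def marshall(data):
    def int_case():
        return ('i', data)

    def str_case():
        return ('s', data)

    def default_case():
        raise TypeError('cannot marshall %r' % (type(data),))

    dispatch_table = {int: int_case, str: str_case}
    return dispatch_table.get(type(data), default_case)()
```

This also illustrates the cost the post is complaining about: the closures and the dict are rebuilt on every call to marshall, which is the same rebuilding problem raised for switches bound at function definition time.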
Re: [Python-Dev] Switch statement
Phillip J. Eby wrote: 1. case (literal|NAME) is the syntax for equality testing -- you can't use an arbitrary expression, not even a dotted name. That's too restrictive. I want to be able to write things like

    class Foods:
        Spam = 1
        Eggs = 2
        Ham = 3
    ...
    switch f:
    case Foods.Spam:
        ...
    case Foods.Eggs:
        ...

-- Greg
Re: [Python-Dev] PyObject_CallFunction and 'N' format char
Gustavo Carneiro wrote: However, PyObject_CallFunction does _not_ consume such an object reference, contrary to what I believed for years. Why do you say that? It certainly does. Regards, Martin