Re: [Python-Dev] 2.5 issues need resolving in a few days
Fred L. Drake, Jr. wrote:

On Saturday 10 June 2006 12:34, Fredrik Lundh wrote:

if all undocumented modules had as much documentation and articles as ET, the world would be a lot better documented ;-) I've posted a text version of the xml.etree.ElementTree PythonDoc here:

Here's a question that we should answer before the beta: With the introduction of the xmlcore package in Python 2.5, should we document xml.etree or xmlcore.etree? If someone installs PyXML with Python 2.5, I don't think they're going to get xml.etree, which will be really confusing. We can be sure that xmlcore.etree will be there. I'd rather not propagate the pain caused by the xml package insanity any further.

+1 for 'xmlcore.etree'. I don't use XML very much, and it was thoroughly confusing to find that published XML-related code didn't work on my machine, even though the stdlib claimed to provide an 'xml' module (naturally, the published code needed the full version of PyXML, but I didn't know that at the time).

Cheers, Nick.

-- 
Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
---
http://www.boredomandlaziness.org
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Segmentation fault in collections.defaultdict
Kevin Jacobs [EMAIL PROTECTED] wrote:

Try this at home:

    import collections
    d=collections.defaultdict(int)
    d.iterkeys().next()    # Seg fault
    d.iteritems().next()   # Seg fault
    d.itervalues().next()  # Fine and dandy

This all worked fine for me in rev 46739 and 46849 (Kubuntu 6.06, gcc 4.0.3).

Python version:

    Python 2.5a2 (trunk:46822M, Jun 10 2006, 13:14:15)
    [GCC 4.0.2 20050901 (prerelease) (SUSE Linux)] on linux2

Either something got broken and then fixed again between the two revs I tried, there's a problem specific to GCC 4.0.2, or there's a problem with whatever local modifications you have in your working copy :)

Cheers, Nick.

-- 
Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
---
http://www.boredomandlaziness.org
Re: [Python-Dev] Segmentation fault in collections.defaultdict
Nick Coghlan wrote:

Kevin Jacobs [EMAIL PROTECTED] wrote:

Try this at home:

    import collections
    d=collections.defaultdict(int)
    d.iterkeys().next()    # Seg fault
    d.iteritems().next()   # Seg fault
    d.itervalues().next()  # Fine and dandy

This all worked fine for me in rev 46739 and 46849 (Kubuntu 6.06, gcc 4.0.3).

Python version:

    Python 2.5a2 (trunk:46822M, Jun 10 2006, 13:14:15)
    [GCC 4.0.2 20050901 (prerelease) (SUSE Linux)] on linux2

Either something got broken and then fixed again between the two revs I tried, there's a problem specific to GCC 4.0.2, or there's a problem with whatever local modifications you have in your working copy :)

Same here. I tried with the same revision as Kevin, and got no segfault at all (using GCC 4.1.1 on Linux). Note that GCC 4.0.2 20050901 (prerelease) sounds like something that's not really been thoroughly tested ;)

Georg
[Python-Dev] crash in dict on gc collect
I wonder if this is similar to Kevin's problem? I couldn't reproduce his problem though. This happens with both debug and release builds. Not sure how to reduce the test case. pychecker was just iterating through the byte codes. It wasn't doing anything particularly interesting.

    ./python pychecker/pychecker/checker.py Lib/encodings/cp1140.py

    0x004cfa18 in visit_decref (op=0x661180, data=0x0) at gcmodule.c:270
    270         if (PyObject_IS_GC(op)) {
    (gdb) bt
    #0  0x004cfa18 in visit_decref (op=0x661180, data=0x0) at gcmodule.c:270
    #1  0x004474ab in dict_traverse (op=0x7cdd90, visit=0x4cf9e0 <visit_decref>, arg=0x0) at dictobject.c:1819
    #2  0x004cfaf0 in subtract_refs (containers=0x670240) at gcmodule.c:295
    #3  0x004d07fd in collect (generation=0) at gcmodule.c:790
    #4  0x004d0ad1 in collect_generations () at gcmodule.c:897
    #5  0x004d1505 in _PyObject_GC_Malloc (basicsize=56) at gcmodule.c:1332
    #6  0x004d1542 in _PyObject_GC_New (tp=0x64f4a0) at gcmodule.c:1342
    #7  0x0041d992 in PyInstance_NewRaw (klass=0x2a95dffcc0, dict=0x800e80) at classobject.c:505
    #8  0x0041dab8 in PyInstance_New (klass=0x2a95dffcc0, arg=0x2a95f5f9e0, kw=0x0) at classobject.c:525
    #9  0x0041aa4e in PyObject_Call (func=0x2a95dffcc0, arg=0x2a95f5f9e0, kw=0x0) at abstract.c:1802
    #10 0x0049ecd2 in do_call (func=0x2a95dffcc0, pp_stack=0x7fbfffb5b0, na=3, nk=0) at ceval.c:3785
    #11 0x0049e46f in call_function (pp_stack=0x7fbfffb5b0, oparg=3) at ceval.c:3597
Re: [Python-Dev] 2.5 issues need resolving in a few days
Fred L. Drake, Jr. wrote:

With the introduction of the xmlcore package in Python 2.5, should we document xml.etree or xmlcore.etree? If someone installs PyXML with Python 2.5, I don't think they're going to get xml.etree, which will be really confusing. We can be sure that xmlcore.etree will be there.

I think it would be unfortunate if an external, mostly unmaintained package could claim absolute ownership of the xml package root.

how about tweaking the xml loader to map xml.foo to _xmlplus.foo only if that subpackage really exists ?

/F
Re: [Python-Dev] 2.5 issues need resolving in a few days
On 11 jun 2006, at 12.09, Fredrik Lundh wrote:

Fred L. Drake, Jr. wrote:

With the introduction of the xmlcore package in Python 2.5, should we document xml.etree or xmlcore.etree? If someone installs PyXML with Python 2.5, I don't think they're going to get xml.etree, which will be really confusing. We can be sure that xmlcore.etree will be there.

I think it would be unfortunate if an external, mostly unmaintained package could claim absolute ownership of the xml package root.

how about tweaking the xml loader to map xml.foo to _xmlplus.foo only if that subpackage really exists ?

I'm a bit confused by what the problem is. I thought this was all handled like it should be now.

    >>> import xml.etree
    >>> xml.etree
    <module 'xml.etree' from '.../lib/python2.5/xmlcore/etree/__init__.pyc'>
    >>> import xml.sax
    >>> xml.sax
    <module 'xml.sax' from '.../lib/python2.5/site-packages/_xmlplus/sax/__init__.pyc'>

It picks up modules from both places.

//Simon
Re: [Python-Dev] UUID module
Ka-Ping Yee [EMAIL PROTECTED] wrote:

Quite a few people have expressed interest in having UUID functionality in the standard library, and previously on this list some suggested possibly using the uuid.py module i wrote: http://zesty.ca/python/uuid.py

Some comments on the code:

    for dir in ['', r'c:\windows\system32', r'c:\winnt\system32']:

Can we get rid of these absolute paths? Something like this should suffice:

    >>> from ctypes import *
    >>> buf = create_string_buffer(4096)
    >>> windll.kernel32.GetSystemDirectoryA(buf, 4096)
    17
    >>> buf.value.decode("mbcs")
    u'C:\\WINNT\\system32'

    for function in functions:
        try:
            _node = function()
        except:
            continue

This also hides typos and whatnot. I guess it's better if each function catches its own exceptions, and either returns None or raises a common exception (like a class _GetNodeError(RuntimeError)) which is then caught.

Giovanni Bajo
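
The suggestion above (each getter handles its own failures and signals them through one common exception, so the loop never needs a bare except) can be sketched roughly as follows. This is only an illustration in present-day Python 3: the names GetNodeError, node_from_broken_source, node_from_constant and first_node are hypothetical and are not taken from uuid.py.

```python
class GetNodeError(RuntimeError):
    """Raised by a node getter that cannot determine a hardware address."""


def node_from_broken_source():
    # Hypothetical getter: simulates a platform call that is unavailable.
    try:
        raise OSError("no such interface")
    except OSError as exc:
        # Translate the expected failure into the common exception.
        raise GetNodeError(str(exc)) from exc


def node_from_constant():
    # Hypothetical getter: returns a fixed 48-bit value for illustration.
    return 0x001122334455


def first_node(getters):
    """Return the first node any getter yields, or None if all of them fail."""
    for getter in getters:
        try:
            return getter()
        except GetNodeError:
            # Only the expected failure is swallowed; a NameError caused by
            # a typo inside a getter would still propagate to the caller.
            continue
    return None


node = first_node([node_from_broken_source, node_from_constant])
```

Because only GetNodeError is caught, programming errors stay visible instead of being silently skipped.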
Re: [Python-Dev] 2.5 issues need resolving in a few days
Simon Percivall wrote:

how about tweaking the xml loader to map xml.foo to _xmlplus.foo only if that subpackage really exists ?

I'm a bit confused by what the problem is. I thought this was all handled like it should be now.

that's how I thought things were done, but then I read Fred's post, and looked at the source code, and didn't see this line:

    _xmlplus.__path__.extend(xmlcore.__path__)

or-maybe-someone's-been-using-the-time-machine-ly yrs /F
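
For readers unfamiliar with the __path__ trick quoted above, here is a small self-contained sketch of the mechanism in current Python 3. It builds two throwaway packages in a temporary directory; the names demo_plus and demo_core are hypothetical stand-ins for _xmlplus and xmlcore, and nothing here touches the real xml package.

```python
import importlib
import os
import sys
import tempfile


def write(path, text=""):
    """Create a file (and its parent directories) with the given text."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(text)


root = tempfile.mkdtemp()
# Hypothetical stand-in for the add-on package: it provides 'sax' only.
write(os.path.join(root, "demo_plus", "__init__.py"))
write(os.path.join(root, "demo_plus", "sax.py"), "ORIGIN = 'demo_plus'\n")
# Hypothetical stand-in for the core package: it provides 'etree' only.
write(os.path.join(root, "demo_core", "__init__.py"))
write(os.path.join(root, "demo_core", "etree.py"), "ORIGIN = 'demo_core'\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

import demo_core
import demo_plus

# The line under discussion: let the add-on package also search the
# core package's directories for submodules it does not provide itself.
demo_plus.__path__.extend(demo_core.__path__)

sax = importlib.import_module("demo_plus.sax")      # found in demo_plus
etree = importlib.import_module("demo_plus.etree")  # found via demo_core's path
```

After the extend() call, submodules are picked up from both places under one package name, which matches the behaviour Simon reported.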
[Python-Dev] sgmllib Comments
Planet is a feed aggregator written in Python. It depends heavily on SGMLLib. A recent bug report turned out to be a deficiency in sgmllib, and I've submitted a test case and a patch[1] (use or discard the patch, it is the test that I care about).

While looking around, a few things surfaced. For starters, it would seem that the version of sgmllib in SVN HEAD will selectively unescape certain character references that might appear in an attribute. I say selectively, as:

* it will unescape &amp;
* it won't unescape &copy;
* it will unescape &#38;
* it won't unescape &#x26;
* it will unescape &#146;
* it won't unescape &#8217;

There are a number of issues here. While not unescaping anything is suboptimal, at least the recipient is aware of exactly which characters have been unescaped (i.e., none of them). The proposed solution makes it impossible for the recipient to know which characters are unescaped, and which are original. (Note: feeds often contain such abominations as &amp;copy; which the new code will treat indistinguishably from &copy;)

Additionally, there is a unicode issue here - one that is shared by handle_charref, but at least that method is overrideable. If unescaping remains, do it for hex character references and for values greater than 8 bits, i.e., use unichr instead of chr if the value is greater than 127.

- Sam Ruby

[1] http://tinyurl.com/j4a6n
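
As a point of comparison for the list above, here is a rough sketch of what uniform, single-pass unescaping could look like in current Python 3, where chr() already covers the full Unicode range. The function name unescape_once is hypothetical; this is not the sgmllib patch, and it deliberately does no Windows-1252 remapping of values such as 146 (the standard library's html.unescape does that kind of remapping).

```python
import re
from html.entities import name2codepoint

# Matches decimal (&#38;), hexadecimal (&#x26;) and named (&amp;) references.
_REF = re.compile(r"&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);")


def unescape_once(text):
    """Replace every recognised reference exactly once, left to right."""
    def replace(match):
        ref = match.group(1)
        try:
            if ref[:2].lower() == "#x":
                return chr(int(ref[2:], 16))
            if ref.startswith("#"):
                return chr(int(ref[1:]))
        except (ValueError, OverflowError):
            # Out-of-range code point: leave the reference untouched.
            return match.group(0)
        codepoint = name2codepoint.get(ref)
        # Unknown names are left untouched as well.
        return chr(codepoint) if codepoint is not None else match.group(0)

    # re.sub does not rescan replaced text, so "&amp;copy;" becomes the
    # literal text "&copy;" rather than the copyright sign.
    return _REF.sub(replace, text)
```

With this approach all six forms in the list are treated the same way, and a doubly escaped reference stays distinguishable from a singly escaped one.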
Re: [Python-Dev] Switch statement
Greg Ewing wrote:

[EMAIL PROTECTED] wrote:

    switch raw_input("enter a, b or c: "):
        case 'a':
            print 'yay! an a!'
        case 'b':
            print 'yay! a b!'
        case 'c':
            print 'yay! a c!'
        else:
            print 'hey dummy! I said a, b or c!'

Before accepting this, we could do with some debate about the syntax. It's not a priori clear that C-style switch/case is the best thing to adopt.

Since you don't have the 'fall-through' behavior of C, I would also assume that you could associate more than one value with a case, i.e.:

    case 'a', 'b', 'c':
        ...

It seems to me that the value of a 'switch' statement is that it is a computed jump - that is, instead of having to iteratively test a bunch of alternatives, you can directly jump to the code for a specific value. I can see this being very useful for parser generators and state machine code. At the moment, similar things can be done with hash tables of functions, but those have a number of limitations, such as the fact that they can't access local variables.

I don't have any specific syntax proposals, but I notice that the suite that follows the switch statement is not a normal suite, but a restricted one, and I am wondering if we could come up with a syntax that avoids having a special suite. Here's an (ugly) example, not meant as a serious proposal:

    select (x) when 'a':
        ...
    when 'b', 'c':
        ...
    else:
        ...

The only real difference between this and an if-else chain is that the compiler knows that all of the test expressions are constants and can be hashed at compile time.

-- Talin
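
The "hash table of functions" approach mentioned above can be sketched in current Python 3 roughly as follows, including one handler shared by several values. The function name classify and its messages are hypothetical. Note that nested functions can read enclosing locals through closures (and rebind them with nonlocal), and that this sketch rebuilds the table on every call, so it illustrates the lookup rather than any speed-up.

```python
def classify(ch):
    prefix = "yay! "  # a local that the handlers below read via closure

    def on_a():
        return prefix + "an a!"

    def on_b_or_c():
        return prefix + "a b or c!"

    def default():
        return "hey! I said a, b or c!"

    # One entry per value; several values may share a handler, which plays
    # the role of "case 'b', 'c':" in the proposed syntax.
    table = {}
    for keys, handler in ((("a",), on_a), (("b", "c"), on_b_or_c)):
        for key in keys:
            table[key] = handler

    # A single hashed lookup replaces the chain of comparisons.
    return table.get(ch, default)()
```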
Re: [Python-Dev] sgmllib Comments
On Sun, Jun 11, 2006, Sam Ruby wrote:

Planet is a feed aggregator written in Python. It depends heavily on SGMLLib. A recent bug report turned out to be a deficiency in sgmllib, and I've submitted a test case and a patch[1] (use or discard the patch, it is the test that I care about).

[1] http://tinyurl.com/j4a6n

When providing links to SF, please use the python.org tinyurl equivalent to ensure that people can easily see the bug/patch number: http://www.python.org/sf?id=1504333

-- 
Aahz ([EMAIL PROTECTED]) * http://www.pythoncraft.com/

"I saw `cout' being shifted "Hello world" times to the left and stopped right there." --Steve Gonedes
[Python-Dev] Import semantics
Python and Jython import semantics differ on how sub-packages should be accessed after importing some module:

    Jython 2.1 on java1.5.0 (JIT: null)
    Type "copyright", "credits" or "license" for more information.
    >>> import xml
    >>> xml.dom
    <module xml.dom at 10340434>

    Python 2.4.2 (#67, Sep 28 2005, 12:41:11) [MSC v.1310 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import xml
    >>> xml.dom
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    AttributeError: 'module' object has no attribute 'dom'
    >>> from xml.dom import pulldom
    >>> xml.dom
    <module 'xml.dom' from 'C:\bin\Python24\lib\xml\dom\__init__.pyc'>

Note that in Jython importing a module makes all subpackages beneath it available, whereas in python, only the tokens available in __init__.py are accessible, but if you do load the module later even if not getting it directly into the namespace, it gets accessible too -- this seems more like something unexpected to me -- I would expect it to be available only if I did some import xml.dom at some point.

My problem is that in Pydev, in static analysis, I would only get the tokens available for actually imported modules, but that's not true for Jython, and I'm not sure if the current behaviour in Python was expected. So... which would be the right semantics for this?

Thanks,
Fabio
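
The CPython behaviour described above (a submodule becomes an attribute of its parent package as a side effect of being imported anywhere) can be reproduced in current Python 3 with a throwaway package. This is a sketch only; the package name demo_semantics is hypothetical and is used instead of xml so that nothing already imported by the interpreter affects the result.

```python
import importlib
import os
import sys
import tempfile

# Build a tiny package on disk: demo_semantics/__init__.py and sub.py.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_semantics")
os.makedirs(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("")  # empty: the package itself defines no names
with open(os.path.join(pkg_dir, "sub.py"), "w") as f:
    f.write("VALUE = 1\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

import demo_semantics

# Importing the package alone does not load or expose the submodule.
before = hasattr(demo_semantics, "sub")

# Importing a name from the submodule loads it, and the import system
# binds the submodule as an attribute on the parent package as well.
from demo_semantics.sub import VALUE

after = hasattr(demo_semantics, "sub")
```

Here before is False and after is True, which is the same sequence as the Python 2.4.2 session in the post.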
Re: [Python-Dev] Switch statement
    talin> Since you don't have the 'fall-through' behavior of C, I would
    talin> also assume that you could associate more than one value with a
    talin> case, i.e.:

    talin> case 'a', 'b', 'c':
    talin> ...

As Andrew Koenig pointed out, that's not discussed in the PEP. Given the various examples though, I would have to assume the above is equivalent to

    case ('a', 'b', 'c'):
        ...

since in all cases the PEP implies a single expression.

    talin> It seems to me that the value of a 'switch' statement is that it
    talin> is a computed jump - that is, instead of having to iteratively
    talin> test a bunch of alternatives, you can directly jump to the code
    talin> for a specific value.

I agree, but that of course limits the expressions to constants which can be evaluated at compile-time as I indicated in my previous mail. Also, as someone else pointed out, that probably prevents something like

    START_TOKEN = ''
    END_TOKEN = ''
    ...
    switch expr:
        case START_TOKEN:
            ...
        case END_TOKEN:
            ...

The PEP states that the case clauses must accept constants, but the sample implementation allows arbitrary expressions. If we assume that the case expressions need not be constants, does that force the compiler to evaluate the case expressions in the order given in the file? To make my dumb example from yesterday even dumber:

    def f():
        switch raw_input("enter b, d or f:"):
            case incr('a'):
                print 'yay! a b!'
            case incr('b'):
                print 'yay! a d!'
            case incr('c'):
                print 'yay! an f!'
            else:
                print 'hey dummy! I said b, d or f!'

    _n = 0
    def incr(c):
        global _n
        try:
            return chr(ord(c)+1+_n)
        finally:
            _n += 1
            print _n

The cases must be evaluated in the order they are written for the example to work properly.

The tension between efficient run-time and Python's highly dynamic nature would seem to prevent the creation of a switch statement that will satisfy all demands.

Skip
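
The order dependence in the incr() example above can be checked directly in current Python 3 without any switch syntax. The sketch below keeps the same arithmetic but replaces the module-level counter with a closure (make_incr is a hypothetical helper) and simply evaluates the three case expressions in two different orders.

```python
def make_incr():
    calls = 0  # plays the role of the global _n in the original example

    def incr(c):
        nonlocal calls
        result = chr(ord(c) + 1 + calls)
        calls += 1
        return result

    return incr


# Evaluating the case expressions top to bottom, as written.
incr = make_incr()
in_order = [incr(c) for c in "abc"]

# Evaluating the same expressions bottom to top with a fresh counter.
incr = make_incr()
reversed_eval = {c: incr(c) for c in "cba"}
```

In written order the labels come out as 'b', 'd', 'f'; in reverse order all three evaluate to 'd', so a compiler that reordered or pre-hashed such expressions would change the program's meaning.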
Re: [Python-Dev] Switch statement
Talin wrote:

I don't have any specific syntax proposals, but I notice that the suite that follows the switch statement is not a normal suite, but a restricted one, and I am wondering if we could come up with a syntax that avoids having a special suite.

don't have K&R handy, but I'm pretty sure they put switch and case at the same level (just like if/else), thus eliminating the need for silly special suites.

The only real difference between this and an if-else chain is that the compiler knows that all of the test expressions are constants and can be hashed at compile time.

the compiler can of course figure that out also for if/elif/else statements, by inspecting the AST. the only advantage for switch/case is user syntax...

/F
Re: [Python-Dev] Switch statement
[EMAIL PROTECTED] wrote:

    talin> Since you don't have the 'fall-through' behavior of C, I would
    talin> also assume that you could associate more than one value with a
    talin> case, i.e.:

    talin> case 'a', 'b', 'c':
    talin> ...

As Andrew Koenig pointed out, that's not discussed in the PEP. Given the various examples though, I would have to assume the above is equivalent to

    case ('a', 'b', 'c'):
        ...

I had recognized that ambiguity as well, but chose not to mention it :)

since in all cases the PEP implies a single expression.

    talin> It seems to me that the value of a 'switch' statement is that it
    talin> is a computed jump - that is, instead of having to iteratively
    talin> test a bunch of alternatives, you can directly jump to the code
    talin> for a specific value.

I agree, but that of course limits the expressions to constants which can be evaluated at compile-time as I indicated in my previous mail. Also, as someone else pointed out, that probably prevents something like

    START_TOKEN = ''
    END_TOKEN = ''
    ...
    switch expr:
        case START_TOKEN:
            ...
        case END_TOKEN:
            ...

Here's another ugly thought experiment, not meant as a serious proposal; its intent is to stimulate ideas by breaking preconceptions. Suppose we take the notion of a computed jump literally:

    def myfunc( x ):
        goto dispatcher[ x ]
        section s1:
            ...
        section s2:
            ...

    dispatcher = {'a': myfunc.s1, 'b': myfunc.s2}

No, I am *not* proposing that Python add a goto statement. What I am really talking about is the idea that you could (somehow) use a dictionary as the input to a control construct.

In the above example, rather than allowing arbitrary constant expressions as cases, we would require the compiler to generate a set of opaque tokens representing various code fragments. These fragments would be exactly like inner functions, except that they don't have their own scope (and therefore have no parameters either). Since the jump labels are symbols generated by the compiler, there's no ambiguity about when they get evaluated.

The above example also allows these labels to be accessed externally from the function by defining attributes on the function object itself which correspond to the code fragments. So in the example, the dictionary which associates specific values with executable sections is created once, at runtime, but before the first time that myfunc is called.

Of course, this is quite a bit clumsier than a switch statement, which is why I say it's not a serious proposal.

-- Talin
[Python-Dev] subprocess.Popen(.... stdout=IGNORE, ...)
In the subprocess module, by default the file handles in the child are inherited from the parent. To ignore a child's output, I can use the stdout or stderr options to send the output to a pipe::

    p = Popen(command, stdout=PIPE, stderr=PIPE)

However, this is sensitive to the buffer deadlock problem, where for example the buffer for stderr might become full and a deadlock occurs because the child is blocked on writing to stderr and the parent is blocked on reading from stdout or waiting for the child to finish. For example, using this command will cause deadlock::

    call('cat /boot/vmlinuz'.split(), stdout=PIPE, stderr=PIPE)

Popen.communicate() implements a solution using either select() or multiple threads (under Windows) to read from the pipes, and returns the strings as a result. It works out like this::

    p = Popen(command, stdout=PIPE, stderr=PIPE)
    output, errors = p.communicate()
    if p.returncode != 0:
        …

Now, as a user of the subprocess module, sometimes I just want to call some child process and simply ignore its output, and to do so I am forced to use communicate() as above and wastefully capture and ignore the strings. This is actually quite a common use case. Just run something, and check the return code. Right now, in order to do this without polluting the parent's output, you cannot use the call() convenience (or is there another way?).

A workaround that works under UNIX is to do this::

    FNULL = open('/dev/null', 'w')
    returncode = call(command, stdout=FNULL, stderr=FNULL)

Some feedback requested, I'd like to know what you think:

1. Would it not be nice to add an IGNORE constant to subprocess.py that would do this automatically? i.e. ::

       returncode = call(command, stdout=IGNORE, stderr=IGNORE)

   Rather than capture and accumulate the output, it would find an appropriate OS-specific way to ignore the output (the /dev/null file above works well under UNIX, how would you do this under Windows? I'm sure we can find something.)

2. call() should be modified to not be sensitive to the deadlock problem, since its interface provides no way to return the contents of the output. The IGNORE value provides a possible solution for this.

3. With the /dev/null file solution, the following code actually works without deadlock, because stderr is never blocked on writing to /dev/null::

       p = Popen(command, stdout=PIPE, stderr=IGNORE)
       text = p.stdout.read()
       retcode = p.wait()

   Any idea how this idiom could be supported using a more portable solution (i.e. how would I make this idiom under Windows, is there some equivalent to /dev/null)?
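
For comparison with the IGNORE idea above: current Python 3 (3.3 and later) ships a portable constant for exactly this purpose, subprocess.DEVNULL. It did not exist when this message was written, so the following is only a sketch of how the two idioms from the post look today. The child command is a hypothetical stand-in that writes to both streams and exits with status 3.

```python
import subprocess
import sys

# Hypothetical noisy child: prints to stdout and stderr, then exits with 3.
child = [
    sys.executable,
    "-c",
    "import sys; print('out'); print('err', file=sys.stderr); sys.exit(3)",
]

# Run something and only check the return code; nothing is captured and
# nothing leaks into the parent's own output.
returncode = subprocess.call(
    child, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
)

# Read stdout while discarding stderr. The child can never block on a
# full stderr pipe, because stderr is not a pipe at all.
with subprocess.Popen(
    child, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL
) as proc:
    text = proc.stdout.read()
    retcode = proc.wait()
```

subprocess.DEVNULL maps to the platform's null device, so the same code is expected to work on both UNIX and Windows.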
Re: [Python-Dev] UUID module
Thomas Heller wrote:

I don't know if this is the uuidgen you're talking about, but on linux there is libuuid:

Thanks! Okay, that's in there now. Have a look at http://zesty.ca/python/uuid.py .

Phillip J. Eby wrote:

By the way, I'd love to see a uuid.uuid() constructor that simply calls the platform-specific default UUID constructor (CoCreateGuid or uuidgen(2)),

I've added code to make uuid1() use uuid_generate_time() if available and uuid4() use uuid_generate_random() if available. These functions are provided on Mac OS X (in libc) and on Linux (in libuuid). Does that work for you?

I'm using the Windows UUID generation calls (UuidCreate and UuidCreateSequential in rpcrt4) only to get the hardware address, not to make UUIDs, because they yield results that aren't compliant with RFC 4122. Even worse, they actually have the variant bits set to say that they are RFC 4122, but they can have an illegal version number.

If there are better alternatives on Windows, i'm happy to use them.

-- ?!ng
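
The version and variant fields discussed above can be inspected with the uuid module as it exists in today's standard library. This is a minimal sketch against the current Python 3 API, not against the draft uuid.py linked in the message.

```python
import uuid

time_based = uuid.uuid1()    # version 1: timestamp plus node (hardware address)
random_based = uuid.uuid4()  # version 4: random

# Every UUID exposes the two fields that RFC 4122 compliance hinges on.
versions = (time_based.version, random_based.version)
variants = (time_based.variant, random_based.variant)

# The hardware address used for version 1 UUIDs is a 48-bit integer.
node = uuid.getnode()
```

A compliant generator yields the RFC 4122 variant together with a version between 1 and 5; the complaint in the post is about values that claim the variant but carry a version outside that range.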
[Python-Dev] Should hex() yield 'L' suffix for long numbers?
I did this earlier:

    >>> hex(9999999999999)
    '0x9184e729fffL'

and found it a little jarring, because i feel there's been a general trend toward getting rid of the 'L' suffix in Python. Literal long integers don't need an L anymore; they're automatically made into longs if the number is too big. And while the repr() of a long retains the L on the end, the str() of a long does not, and i rather like that.

So i kind of expected that hex() would not include the L either. I see its main job as just giving me the hex digits (in fact, for Python 3000 i'd prefer even to drop the '0x' as well), and the L seems superfluous and distracting.

What do you think? Is Python 2.5 a reasonable time to drop this L?

-- ?!ng
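
For reference, this is how the same value behaves in current Python 3, where int and long are a single type. The digits in the post correspond to 10**13 - 1. The sketch below is about today's interpreter, not Python 2.5.

```python
value = 10 ** 13 - 1

# hex() carries no type suffix in Python 3, only the '0x' prefix.
text = hex(value)

# format() with the 'x' presentation type gives the bare hex digits,
# without the '0x' prefix either.
digits = format(value, "x")
```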
Re: [Python-Dev] a note in random.shuffle.__doc__ ...
Terry Jones wrote:

Suppose you have a RNG with a cycle length of 5. There's nothing to stop an algorithm from taking multiple already returned values and combining them in some (deterministic) way to generate > 5 outcomes.

No, it's not. As long as the RNG output is the only input to the algorithm, and the algorithm is deterministic, it is not possible to get more than N different outcomes. It doesn't matter what the algorithm does with the input.

If you expanded what you meant by internal states to include the state of the algorithm (as well as the state of the RNG), then I'd be more inclined to agree.

If the algorithm can start out with more than one initial state, then the RNG is not the only input.

Worse, if you have multiple threads / processes using the same RNG, the individual threads could exhibit _much_ more random behavior

Then you haven't got a deterministic algorithm.

-- Greg
Re: [Python-Dev] Pre-PEP: Allow Empty Subscript List Without Parentheses
BJörn Lindqvist wrote:

I don't know how difficult it is to get rid of the implicit return None or even if it is doable, but if it is, it should, IMHO, be done.

It's been proposed before, and the conclusion was that it would cause more problems than it would solve. (Essentially it would require returning some object that raised an exception when anything at all was done to it, but such an object would cause debuggers and other introspective code to choke.)

-- Greg
Re: [Python-Dev] Switch statement
Talin wrote:

Since you don't have the 'fall-through' behavior of C, I would also assume that you could associate more than one value with a case, i.e.:

    case 'a', 'b', 'c':
        ...

Multiple values could be written

    case 'a':
    case 'b':
    case 'c':
        ...

without conflicting with the no-fallthrough semantics, since a do-nothing case can be written as

    case 'd':
        pass

I don't have any specific syntax proposals, but I notice that the suite that follows the switch statement is not a normal suite, but a restricted one,

I don't see that as a problem. And all the proposed syntaxes I've ever seen for putting the cases at the same level as the switch look ugly to me.

-- Greg
Re: [Python-Dev] Switch statement
[EMAIL PROTECTED] wrote:

I agree, but that of course limits the expressions to constants which can be evaluated at compile-time as I indicated in my previous mail.

A way out of this would be to define the semantics so that the expression values are allowed to be cached, and the order of evaluation and testing is undefined. So the first time through, the values could all be put in a dict, to be looked up thereafter.

-- Greg
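
The caching semantics suggested above can be emulated in current Python 3 to see what they would mean in practice. In this sketch the class CachedSwitch and the helper key are hypothetical names; each case key is supplied as a zero-argument callable so that its evaluation can be deferred and counted.

```python
class CachedSwitch:
    """Evaluate all case keys once, on first use, then dispatch by dict lookup."""

    def __init__(self, *cases, default):
        self._cases = cases      # pairs of (callable producing the key, handler)
        self._default = default
        self._table = None       # built lazily on the first call

    def __call__(self, value):
        if self._table is None:
            # First time through: evaluate every case expression and cache it.
            self._table = {make_key(): handler for make_key, handler in self._cases}
        return self._table.get(value, self._default)()


evaluations = []  # records each time a case expression is actually evaluated


def key(k):
    def make():
        evaluations.append(k)
        return k
    return make


switch = CachedSwitch(
    (key("a"), lambda: "got a"),
    (key("b"), lambda: "got b"),
    default=lambda: "no match",
)
results = [switch("b"), switch("a"), switch("z")]
```

After three dispatches each case expression has been evaluated exactly once, which is the trade-off being proposed: later changes to the values behind the case expressions would go unnoticed.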
Re: [Python-Dev] a note in random.shuffle.__doc__ ...
"Greg" == Greg Ewing [EMAIL PROTECTED] writes:

    Greg> Terry Jones wrote:

Suppose you have a RNG with a cycle length of 5. There's nothing to stop an algorithm from taking multiple already returned values and combining them in some (deterministic) way to generate > 5 outcomes.

    Greg> No, it's not. As long as the RNG output is the only input to
    Greg> the algorithm, and the algorithm is deterministic, it is
    Greg> not possible get more than N different outcomes. It doesn't
    Greg> matter what the algorithm does with the input.

    Greg> If the algorithm can start out with more than one initial
    Greg> state, then the RNG is not the only input.

The code below uses a RNG with period 5, is deterministic, and has one initial state. It produces 20 different outcomes.

It's just doing a simplistic version of what a lagged RNG generator does, but the lagged part is in the algorithm not in the rng. That's why I said if you included the state of the algorithm in what you meant by state I'd be more inclined to agree.

Terry

    n = map(float, range(1, 17, 3))
    i = 0

    def rng():
        global i
        i += 1
        if i == 5:
            i = 0
        return n[i]

    if __name__ == '__main__':
        seen = {}
        history = [rng()]
        o = 0
        for lag in range(1, 5):
            for x in range(5):
                o += 1
                new = rng()
                outcome = new / history[-lag]
                if outcome in seen:
                    print "DUP!"
                seen[outcome] = True
                print "outcome %d = %f" % (o, outcome)
                history.append(new)

    # Outputs
    outcome 1 = 1.750000
    outcome 2 = 1.428571
    outcome 3 = 1.300000
    outcome 4 = 0.076923
    outcome 5 = 4.000000
    outcome 6 = 7.000000
    outcome 7 = 2.500000
    outcome 8 = 1.857143
    outcome 9 = 0.100000
    outcome 10 = 0.307692
    outcome 11 = 0.538462
    outcome 12 = 10.000000
    outcome 13 = 3.250000
    outcome 14 = 0.142857
    outcome 15 = 0.400000
    outcome 16 = 0.700000
    outcome 17 = 0.769231
    outcome 18 = 13.000000
    outcome 19 = 0.250000
    outcome 20 = 0.571429
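
For anyone who wants to rerun the experiment above on a current interpreter, here is a Python 3 port. It keeps the same values and the same lag scheme, but replaces the module-level counter with a closure and collects the outcomes in a list instead of printing them; make_rng and lagged_outcomes are names introduced for this sketch.

```python
def make_rng():
    values = [float(v) for v in range(1, 17, 3)]  # 1, 4, 7, 10, 13, 16
    i = 0

    def rng():
        # Cycles through indices 1, 2, 3, 4, 0: a generator with period 5.
        nonlocal i
        i += 1
        if i == 5:
            i = 0
        return values[i]

    return rng


def lagged_outcomes():
    rng = make_rng()
    history = [rng()]
    outcomes = []
    for lag in range(1, 5):
        for _ in range(5):
            new = rng()
            # Combine the fresh value with one returned `lag` steps earlier.
            outcomes.append(new / history[-lag])
            history.append(new)
    return outcomes


outcomes = lagged_outcomes()
distinct = len(set(outcomes))
```

The run yields 20 outcomes, all distinct, in line with the listing above. Whether that contradicts Greg's point depends on whether the history list counts as part of the algorithm's state, which is exactly what the two posters are debating.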
Re: [Python-Dev] sgmllib Comments
Fred L. Drake, Jr. [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]

On Sunday 11 June 2006 16:26, Sam Ruby wrote:

Planet is a feed aggregator written in Python. It depends heavily on SGMLLib. A recent bug report turned out to be a deficiency in sgmllib, and I've submitted a test case and a patch[1] (use or discard the patch, it is the test that I care about).

... and which are original. (Note: feeds often contain such abominations as &amp;copy; which the new code will treat indistinguishably from &copy;)

It really sounds like sgmllib is the wrong foundation for this. ... Have you looked at HTMLParser as an alternate to sgmllib? It has better support for XHTML constructs.

Have you (the OP) checked how related Python projects, such as Mark Pilgrim's feed parser, http://www.feedparser.org/ , handle the same sort of input? (I have only looked at docs and tests, not code.)

tjr
Re: [Python-Dev] Import semantics
Fabio Zadrozny [EMAIL PROTECTED] wrote in message

    Jython 2.1 on java1.5.0 (JIT: null)
    Python 2.4.2 (#67, Sep 28 2005, 12:41:11) [MSC v.1310 32 bit (Intel)] on win32

Jython 2.1 intends to match Python 2.1, I believe. Python 2.2, which I still have loaded, matches Python 2.4 in the behavior reported.
Re: [Python-Dev] subprocess.Popen(.... stdout=IGNORE, ...)
Martin Blais [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
> Any idea how this idiom could be supported using a more portable solution (i.e. how would I make this idiom under Windows, is there some equivalent to /dev/null)?

On a DOS/Windows command line, 'NUL:' or 'nul:'
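A minimal sketch of one portable way to discard a child's output, written for Python 3 rather than the Python 2.5 under discussion. It relies on os.devnull, which names the platform's null device ('/dev/null' on POSIX, 'nul' on Windows). The helper name run_quietly is an assumption made for illustration; it is not the IGNORE constant proposed in the thread.

```python
import os
import subprocess
import sys


def run_quietly(argv):
    """Run argv with its stdout discarded and return the exit status."""
    # os.devnull is the path of the null device on the current platform.
    with open(os.devnull, "w") as sink:
        return subprocess.call(argv, stdout=sink)


# The child prints a line, but nothing reaches our own stdout.
status = run_quietly([sys.executable, "-c", "print('this is discarded')"])
print("exit status:", status)
```

On Python 3.3 and later, passing stdout=subprocess.DEVNULL achieves the same thing without opening the file by hand.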
Re: [Python-Dev] Should hex() yield 'L' suffix for long numbers?
[Ka-Ping Yee]
> I did this earlier:
>
>     >>> hex(9999999999999)
>     '0x9184e729fffL'
>
> and found it a little jarring, because i feel there's been a general trend toward getting rid of the 'L' suffix in Python.
>
> Literal long integers don't need an L anymore; they're automatically made into longs if the number is too big. And while the repr() of a long retains the L on the end, the str() of a long does not, and i rather like that.
>
> So i kind of expected that hex() would not include the L either. I see its main job as just giving me the hex digits (in fact, for Python 3000 i'd prefer even to drop the '0x' as well), and the L seems superfluous and distracting.
>
> What do you think? Is Python 2.5 a reasonable time to drop this L?

As I read PEP 237, that should have happened in Python 2.3 or 2.4. This specific case is kinda muddy there. Regardless, the only part that was left for Python 3 was phase C, and this is phase C in its entirety:

    C. The trailing 'L' is dropped from repr(), and made illegal on
       input. (If possible, the 'long' type completely disappears.)

It's possible, though, that hex() and oct() were implicitly considered to be variants of repr() for purposes of phase C.

How much are we willing to pay Guido to Pronounce?
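For readers who only want the digits, a small sketch in Python 3, where the long type and its 'L' suffix are gone. The helper name hex_digits is hypothetical and used only for illustration.

```python
def hex_digits(n):
    """Return the hex digits of n with no '0x' prefix and no 'L' suffix."""
    return format(n, "x")


n = 9999999999999
print(hex(n))         # 0x9184e729fff
print(hex_digits(n))  # 9184e729fff
```

The same "x" format code is also available through the % operator ("%x" % n), which likewise produces bare digits.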
Re: [Python-Dev] a note in random.shuffle.__doc__ ...
[Terry Jones]
> The code below uses a RNG with period 5, is deterministic, and has one initial state. It produces 20 different outcomes.

Well, I'd call the sequence of 20 numbers it produces one outcome. From that view, there are at most 5 outcomes it can produce (at most 5 distinct 20-number sequences). In much the same way, there are at most P distinct infinite sequences this can produce, if the PRNG used by random.random() has period P:

    def belch():
        import random, math
        start = random.random()
        i = 0
        while True:
            i += 1
            yield math.fmod(i * start, 1.0)

The trick is to define "outcome" in such a way that the original claim is true :-)
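Both views can be checked side by side. The following is a Python 3 adaptation of the period-5 example, offered as a sketch only: make_rng and run are names introduced here, and the "phase" argument stands in for the RNG's starting state. One run contains 20 distinct numbers, yet trying every possible starting state yields only 5 distinct runs.

```python
VALUES = [1.0, 4.0, 7.0, 10.0, 13.0]  # one full cycle of the period-5 RNG


def make_rng(phase):
    """Return a period-5 generator function starting from the given phase."""
    state = {"i": phase}

    def rng():
        state["i"] = (state["i"] + 1) % 5
        return VALUES[state["i"]]

    return rng


def run(phase):
    """Run the lagged-ratio algorithm once and return its 20 outcomes."""
    rng = make_rng(phase)
    history = [rng()]
    outcomes = []
    for lag in range(1, 5):
        for _ in range(5):
            new = rng()
            outcomes.append(new / history[-lag])
            history.append(new)
    return tuple(outcomes)


sequences = {run(phase) for phase in range(5)}
print(len(set(run(0))))  # 20 distinct numbers within a single run
print(len(sequences))    # 5 distinct runs across all RNG states
```

Which of the two counts is "the number of outcomes" depends entirely on whether an outcome is a single number or a whole run.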
Re: [Python-Dev] sgmllib Comments
Fred L. Drake, Jr. wrote:
> On Sunday 11 June 2006 16:26, Sam Ruby wrote:
>> Planet is a feed aggregator written in Python. It depends heavily on SGMLLib. A recent bug report turned out to be a deficiency in sgmllib, and I've submitted a test case and a patch[1] (use or discard the patch, it is the test that I care about).
>
> And it's a nice aggregator to use, indeed!
>
>> While looking around, a few things surfaced. For starters, it would seem that the version of sgmllib in SVN HEAD will selectively unescape certain character references that might appear in an attribute. I say selectively, as:
>>
>> * it will unescape &amp;
>> * it won't unescape &copy;
>> * it will unescape &#38;
>> * it won't unescape &#x26;
>> * it will unescape &#146;
>> * it won't unescape &#8217;
>
> And just why would you use sgmllib to handle RSS or ATOM feeds? Neither is defined in terms of SGML. The sgmllib documentation also notes that it isn't really a fully general SGML parser (it isn't), but that it exists primarily as a foundation for htmllib.

The feed itself is read first with SAX (then with a fallback using sgmllib if the feed is not well formed, but that's beside the point). The embedded HTML portions are then processed with subclasses of sgmllib.

>> There are a number of issues here. While not unescaping anything is suboptimal, at least the recipient is aware of exactly which characters have been unescaped (i.e., none of them). The proposed solution makes it impossible for the recipient to know which characters are unescaped, and which are original. (Note: feeds often contain such abominations as &amp;copy; which the new code will treat indistinguishably from &copy;)
>
> My suspicion is that the right thing to do at the sgmllib level is to categorize the markup and call a method depending on what the entity reference is, and let that handle whatever it is. For SGML, that means we have things like &name; (entity references), &#123; (character references), and that's it. &#x123; isn't legal SGML under any circumstance; the &#xnumber; syntax was introduced with XML.

... but it effectively is valid HTML. And as you point out below sgmllib's raison d’être is to support htmllib.

>> Additionally, there is a unicode issue here - one that is shared by handle_charref, but at least that method is overridable. If unescaping remains, do it for hex character references and for values greater than 8-bits, i.e., use unichr instead of chr if the value is greater than 127.
>
> For SGML, it's worse than that, since the document character set is defined in the SGML declaration, which is a far hairier beast than an XML declaration. :-)

understood

> It really sounds like sgmllib is the wrong foundation for this. While the module has some questionable behaviors, none of them are significant in its intended context (support for htmllib). Now, I understand that RSS has historical issues, with HTML-as-practiced getting embedded as payload data with various flavors of escaping applied, and I'm not an expert in the details of that.
>
> Have you looked at HTMLParser as an alternate to sgmllib? It has better support for XHTML constructs.

HTMLParser is less forgiving, and generally less suitable for consuming HTML as practiced.

- Sam Ruby
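The ambiguity behind the &amp;copy; remark can be reproduced in a few lines. This is a Python 3 sketch under stated assumptions: unescape_known is a hypothetical helper, not sgmllib code, and it replaces only the named references present in the entitydefs mapping it is given.

```python
import re


def unescape_known(text, entitydefs):
    """Replace, in one pass, only the named references found in entitydefs."""
    def repl(match):
        # Unknown names are left exactly as they appeared in the input.
        return entitydefs.get(match.group(1), match.group(0))

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", repl, text)


known = {"amp": "&"}
escaped = unescape_known("&amp;copy;", known)  # author escaped the ampersand
literal = unescape_known("&copy;", known)      # author wrote a real reference
print(escaped == literal)  # True: the recipient cannot tell them apart
```

Once the two inputs have collapsed to the same string, a second unescaping pass by the recipient would treat both as the copyright sign, which is wrong for the first input.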
Re: [Python-Dev] sgmllib Comments
Terry Reedy wrote:
> Fred L. Drake, Jr. [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
>> On Sunday 11 June 2006 16:26, Sam Ruby wrote:
>>> Planet is a feed aggregator written in Python. It depends heavily on SGMLLib. A recent bug report turned out to be a deficiency in sgmllib, and I've submitted a test case and a patch[1] (use or discard the patch, it is the test that I care about).
>>> ...
>>> and which are original. (Note: feeds often contain such abominations as &amp;copy; which the new code will treat indistinguishably from &copy;)
>>
>> It really sounds like sgmllib is the wrong foundation for this.
>> ...
>> Have you looked at HTMLParser as an alternate to sgmllib? It has better support for XHTML constructs.
>
> Have you (the OP), checked how related Python projects, such as Mark Pilgrim's feed parser, http://www.feedparser.org/ handle the same sort of input (I have only looked at docs and tests, not code).

Just to be clear: Planet uses Mark's feed parser, which uses SGMLlib. I'm a committer on that project: http://sourceforge.net/project/memberlist.php?group_id=112328

I was investigating a bug in sgmllib which affected the feed parser (and therefore Planet), and noticed that there were changes in the SVN head of Python which broke three feed parser unit tests.

It is my belief that these changes will break other existing users of sgmllib.

- Sam Ruby
Re: [Python-Dev] sgmllib Comments
On Monday 12 June 2006 00:05, Sam Ruby wrote:
> Just to be clear: Planet uses Mark's feed parser, which uses SGMLlib.

Cool.

> I was investigating a bug in sgmllib which affected the feed parser (and therefore Planet), and noticed that there were changes in the SVN head of Python which broke three feed parser unit tests.
>
> It is my belief that these changes will break other existing users of sgmllib.

This is good to know; thanks for pointing it out. If you can summarize the specific changes to sgmllib that cause problems for the feed parser, and identify the tests there that rely on the old behavior, I'll be glad to look at the problems. I expect to have some time in the next few evenings, so I should be able to look at these soon.

Is the SourceForge CVS the definitive development source for the feed parser?

-Fred

--
Fred L. Drake, Jr. fdrake at acm.org
Re: [Python-Dev] sgmllib Comments
Sam Ruby wrote:
> Planet is a feed aggregator written in Python. It depends heavily on SGMLLib. A recent bug report turned out to be a deficiency in sgmllib, and I've submitted a test case and a patch[1] (use or discard the patch, it is the test that I care about).

I think (but am not sure) you are referring to patch #1462498 here, which fixes bugs 1452246 and 1087808.

> * it will unescape &amp;
> * it won't unescape &copy;

That must be because you have amp in your entitydefs, but not copy.

> * it will unescape &#38;
> * it won't unescape &#x26;

That's because it doesn't recognize hex character references. That's systematic, though: it doesn't just ignore them in attribute values, but also in content.

> * it will unescape &#146;
> * it won't unescape &#8217;

That's because the value is larger than 256, so chr() fails.

> There are a number of issues here. While not unescaping anything is suboptimal, at least the recipient is aware of exactly which characters have been unescaped (i.e., none of them). The proposed solution makes it impossible for the recipient to know which characters are unescaped, and which are original. (Note: feeds often contain such abominations as &amp;copy; which the new code will treat indistinguishably from &copy;)

The recipient should then add &copy; to entitydefs; sgmllib will unescape copy, so the recipient can know not to unescape that. Alternatively, the recipient could provide an empty entitydefs.

> Additionally, there is a unicode issue here - one that is shared by handle_charref, but at least that method is overrideable. If unescaping remains, do it for hex character references and for values greather than 8-bits, i.e., use unichr instead of chr if the value is greater than 127.

Alternatively, a callback function could be provided for character references. Unfortunately, the existing callback is unsuitable, as it is supposed to do the full processing; this callback should return the replacement text. Generally assuming Unicode would be wrong, though.

Would you like to contribute a patch?

Regards,
Martin
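The three explanations above can be imitated in a short Python 3 sketch. This is not sgmllib's code: unescape_attr and the REF pattern are assumptions made for illustration, and the explicit range check stands in for Python 2's chr(), which rejects values outside range(256).

```python
import re

# Decimal character references and named references only; a hex reference
# such as &#x26; does not match this pattern and so is never touched.
REF = re.compile(r"&(?:#([0-9]+)|([A-Za-z][A-Za-z0-9]*));")


def unescape_attr(value, entitydefs):
    """Unescape an attribute value the selective way described above."""
    def repl(match):
        number, name = match.groups()
        if number is not None:
            code = int(number)
            # Imitates Python 2 chr(): only 8-bit values can be converted.
            return chr(code) if code < 256 else match.group(0)
        # Named references resolve only if the caller listed them.
        return entitydefs.get(name, match.group(0))

    return REF.sub(repl, value)


defs = {"amp": "&"}
for sample in ["&amp;", "&copy;", "&#38;", "&#x26;", "&#146;", "&#8217;"]:
    print(sample, "->", repr(unescape_attr(sample, defs)))
```

Passing an empty mapping for entitydefs switches off the named-reference branch entirely, which corresponds to the alternative suggested for recipients who want to do their own unescaping.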
Re: [Python-Dev] 2.5 issues need resolving in a few days
Neal Norwitz wrote:
> The most important outstanding issue is the xmlplus/xmlcore issue. It's not going to get fixed unless someone works on it. There's only a few days left before beta 1. Can someone please address this?

From my point of view, I shall consider them resolved/irrelevant: I'm going to step down as a PyXML maintainer, so I don't have to worry anymore about how to maintain PyXML. If PyXML then gets unmaintained, the problem goes away, otherwise, the new maintainer will have to find a solution.

Regards,
Martin
Re: [Python-Dev] sgmllib Comments
Martin v. Löwis wrote:
> Alternatively, a callback function could be provided for character references. Unfortunately, the existing callback is unsuitable, as it is supposed to do the full processing; this callback should return the replacement text. Generally assuming Unicode would be wrong, though.
>
> Would you like to contribute a patch?

If we can agree on the behavior, I would be glad to write up a patch.

It seems to me that the simplest way to proceed would be for the code that attempts to resolve character references (both named and numeric) in attributes to be isolated in a single method. Subclasses that desire different behavior (including the existing Python 2.4 and prior behaviour) could simply override this method.

- Sam Ruby
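One possible shape for such a hook, sketched in Python 3. Everything here is hypothetical: the class names, the method name resolve_reference and the use of html.entities.name2codepoint are illustrative choices, not the API that sgmllib has or was given.

```python
import re
from html.entities import name2codepoint


class AttrParser:
    """Hypothetical stand-in for a parser class; not sgmllib's real class."""

    _ref = re.compile(r"&(#[0-9]+|#[xX][0-9A-Fa-f]+|[A-Za-z][A-Za-z0-9]*);")

    def resolve_reference(self, ref):
        """Return replacement text for one reference, or None to keep it."""
        try:
            if ref[:2] in ("#x", "#X"):
                return chr(int(ref[2:], 16))
            if ref.startswith("#"):
                return chr(int(ref[1:]))
        except (ValueError, OverflowError):
            return None  # code point out of range: leave the text alone
        if ref in name2codepoint:
            return chr(name2codepoint[ref])
        return None

    def unescape_attr(self, value):
        """Resolve every reference in value through the single hook."""
        def repl(match):
            text = self.resolve_reference(match.group(1))
            return match.group(0) if text is None else text

        return self._ref.sub(repl, value)


class LegacyAttrParser(AttrParser):
    """Keeps attribute values untouched by overriding the one method."""

    def resolve_reference(self, ref):
        return None


print(repr(AttrParser().unescape_attr("&copy; &#x26; &#8217;")))
print(repr(LegacyAttrParser().unescape_attr("&copy; &#x26; &#8217;")))
```

Because the hook returns the replacement text instead of doing the full processing itself, it also matches the kind of callback described in the quoted message.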