Re: How to pop the interpreter's stack?
On Dec 25, 2:49 pm, Steven D'Aprano steve +comp.lang.pyt...@pearwood.info wrote: But that's a separate issue from what is being discussed here. What we're discussing here is the idea that a function should be able to delegate work to private subroutines without the caller being aware of that fact. I can't fathom any possible reason why this could be considered a good thing, especially in Python where the culture is, We're all adults here. By the same logic, it would be a good idea to prevent the user from accessing private members of a class, internal variables of a module, or importing an internal module directly. There is a convention in Python: internal objects are preceded by underscore. The fact that your internal function is marked this way in the traceback is more than enough information to the user that this is an internal function. Python is not, and never has been, about hiding internal details. It's about openness, and there's no reason a traceback should hide internal details any more than a class should--in fact hiding information in the traceback is far worse, because you're suppressing information that could be crucial for debugging. Carl Banks -- http://mail.python.org/mailman/listinfo/python-list
Re: How to pop the interpreter's stack?
On Dec 25, 6:21 am, Robert Kern robert.k...@gmail.com wrote: On 12/24/10 4:24 AM, Steven D'Aprano wrote: On Thu, 23 Dec 2010 22:38:05 -0800, Carl Banks wrote: OTOH, going the extra mile to hide useful information from a user is asinine. As a user, I will decide for myself how I want to use implementation-defined information, and I don't want the implementor to decide this for me. It's bad enough if an implementor fails to provide information out of laziness, but when they deliberately do extra work to hide information, that's self-importance and arrogance. But that of course is nonsense, because as the user you don't decide anything of the sort. The developer responsible for writing the function decides what information he provides you, starting with whether you get an exception at all, where it comes from, the type of exception, and the error message (if any). Carl isn't arguing that the user is or should be responsible for this sort of thing. He is arguing that developers should be responsible for doing this in such a way that is beneficial for the developer/user down the road. I'm not even arguing that; I think I would be content if the developer merely doesn't actively work to harm the user. Carl Banks -- http://mail.python.org/mailman/listinfo/python-list
Re: lxml etree question
On Dec 24, 10:17 am, Adam Tauno Williams awill...@whitemice.org wrote: On Fri, 2010-12-24 at 20:48 +0530, Nitin Pawar wrote: On Fri, Dec 24, 2010 at 8:40 PM, Jim jim.heffe...@gmail.com wrote:

Hello, I wonder if someone knows about lxml.etree and namespaces?

Yes, and don't.

He's using lxml.etree (which is a third-party library that mimics ElementTree's interface), not ElementTree. Were you aware of this?

I want to build an ElementTree where some of the sub-elements have attributes that serialize this way.

    <comment xml:lang='de'>..</comment>

I've tried just comment_elet.set('xml:lang','de') and it didn't like that at all (although it takes comment_elet.set('auth:id','jones') just fine). I've also spelunked the docs and googled but have not hit on the right invocation. If someone knows, I'd be grateful.

I'd *strongly* recommend using ElementFlow for building XML documents (over ElementTree), especially if namespaces are involved. ElementFlow is far more intuitive. http://pypi.python.org/pypi/elementflow

I'd have to disagree with the use of strong recommendation here. The library you recommended isn't a general replacement for lxml (or ElementTree), and you didn't qualify the conditions for when it is a suitable alternative.

A. What if he needed to keep the tree in memory?

B. This library builds the tags with "with" statements, which could be convenient for xml files with rigid structure, but I would think it'd be inconvenient if the format were relatively loose.

If you're going to recommend a more specialized solution, you should also give the conditions for which it is suitable.

Carl Banks
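For reference, a minimal sketch of one way to get the xml:lang attribute Jim asked about. It uses the standard-library xml.etree.ElementTree rather than lxml, and writes the attribute name in the {namespace-uri}localname form, which lxml.etree's set() also accepts. The element name and text are placeholders, not anything from Jim's actual document.

```python
import xml.etree.ElementTree as ET

# The reserved namespace URI that the "xml" prefix is always bound to.
XML_NS = "http://www.w3.org/XML/1998/namespace"

comment = ET.Element("comment")
comment.text = ".."
# Qualified attribute names are written as "{namespace-uri}localname".
comment.set("{%s}lang" % XML_NS, "de")

serialized = ET.tostring(comment, encoding="unicode")
print(serialized)
```

Passing the literal string 'xml:lang' fails because the colon is read as part of the name, not as a prefix separator; the braces form names the namespace explicitly and leaves the prefix to the serializer.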
Re: Trying to parse a HUGE(1gb) xml file
Tim Harig, 26.12.2010 02:05: On 2010-12-25, Nobody nob...@nowhere.com wrote: On Sat, 25 Dec 2010 14:41:29 -0500, Roy Smith wrote:

Of course, one advantage of XML is that with so much redundant text, it compresses well. We typically see gzip compression ratios of 20:1. But, that just means you can archive them efficiently; you can't do anything useful until you unzip them.

XML is typically processed sequentially, so you don't need to create a decompressed copy of the file before you start processing it.

Sometimes XML is processed sequentially. When the markup footprint is large enough it must be. Quite often, as in the case of the OP, you only want to extract a small piece out of the total data. In those cases, being forced to read all of the data sequentially is both inconvenient and a performance penalty unless there is some way to address the data you want directly.

So what? If you only have to do that once, it doesn't matter if you have to read the whole file or just a part of it. Should make a difference of a couple of minutes. If you do it a lot, you will have to find a way to make the access efficient for your specific use case. So the file format doesn't matter either, because the data will most likely end up in a fast database after reading it in sequentially *once*, just as in the case above. I really don't think there are many important use cases where you need fast random access to large data sets and cannot afford to adapt the storage layout beforehand.

Stefan
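To illustrate the point that a compressed XML file can be processed sequentially without first writing a decompressed copy, here is a minimal sketch using only the standard library (gzip plus ElementTree's iterparse). The function name iter_matching, the tag "entry" and the sample data are made up for the example; a real 1 GB file would be opened from disk instead of an in-memory buffer.

```python
import gzip
import io
import xml.etree.ElementTree as ET


def iter_matching(fileobj, tag):
    """Yield (attributes, text) for each <tag> element in a gzipped XML stream."""
    with gzip.GzipFile(fileobj=fileobj) as stream:
        for _event, elem in ET.iterparse(stream, events=("end",)):
            if elem.tag == tag:
                yield dict(elem.attrib), elem.text
            # Release the element's content once it has been seen.
            elem.clear()


# Tiny in-memory stand-in for a large compressed file.
raw = b"<log><entry id='1'>a</entry><other/><entry id='2'>b</entry></log>"
compressed = io.BytesIO(gzip.compress(raw))
found = list(iter_matching(compressed, "entry"))
print(found)
```

The parser still has to read every byte up to the last element of interest, which is exactly the cost Tim objects to; the sketch only shows that decompression and parsing can be pipelined.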
Re: Trying to parse a HUGE(1gb) xml file
On 2010-12-26, Nobody nob...@nowhere.com wrote: On Sun, 26 Dec 2010 01:05:53 +, Tim Harig wrote:

XML is typically processed sequentially, so you don't need to create a decompressed copy of the file before you start processing it.

Sometimes XML is processed sequentially. When the markup footprint is large enough it must be. Quite often, as in the case of the OP, you only want to extract a small piece out of the total data. In those cases, being forced to read all of the data sequentially is both inconvenient and a performance penalty unless there is some way to address the data you want directly.

OTOH, formats designed for random access tend to be more limited in their utility. You can only perform random access based upon criteria which match the format's indexing. Once you step outside that, you often have to walk the entire file anyhow.

That may be true and it may not. Even assuming that you have to walk through a large number of top level elements, there may be an advantage to being able to directly access the next element as opposed to having to parse through the entire current element once you have determined it isn't one which you are looking for. To be fair, this may be invalid preoptimization without taking into account how the hard drive buffers; but I would suspect that there is a threshold where the amount of data skipped starts to outweigh the penalty of overreaching the hard drive's buffers.
Re: Trying to parse a HUGE(1gb) xml file
On 2010-12-26, Stefan Behnel stefan...@behnel.de wrote: Tim Harig, 26.12.2010 02:05: On 2010-12-25, Nobody nob...@nowhere.com wrote: On Sat, 25 Dec 2010 14:41:29 -0500, Roy Smith wrote:

Of course, one advantage of XML is that with so much redundant text, it compresses well. We typically see gzip compression ratios of 20:1. But, that just means you can archive them efficiently; you can't do anything useful until you unzip them.

XML is typically processed sequentially, so you don't need to create a decompressed copy of the file before you start processing it.

Sometimes XML is processed sequentially. When the markup footprint is large enough it must be. Quite often, as in the case of the OP, you only want to extract a small piece out of the total data. In those cases, being forced to read all of the data sequentially is both inconvenient and a performance penalty unless there is some way to address the data you want directly.

So what? If you only have to do that once, it doesn't matter if you have to read the whole file or just a part of it. Should make a difference of a couple of minutes.

Much agreed. I assume that the process needs to be repeated or it probably would be simpler just to rip out what I wanted using regular expressions with shell utilities.

If you do it a lot, you will have to find a way to make the access efficient for your specific use case. So the file format doesn't matter either, because the data will most likely end up in a fast database after reading it in sequentially *once*, just as in the case above.

If the data is just going to end up in a database anyway, then why not send it as a database to begin with and save the trouble of having to convert it?
Re: Design Ideals Goals Python 3 - Forest for the trees
So do the new changes (to the GIL) nullify concerns raised by David Beazley here http://dabeaz.com/python/UnderstandingGIL.pdf

Ah, good catch. I had watched the recorded presentation some time ago. Yes, the rewritten GIL should alleviate those problems. Instead of simply releasing and re-acquiring the GIL, it releases, then determines thread scheduling using the brainfuck algorithm instead of leaving it up to the kernel in the brief instant the GIL is unassigned (which often doesn't context switch to another thread, thus the performance penalty). (I believe that algorithm, whose name -is- accurate, was the winner of the long, long discussion on the Python ticket.)

Some projects have been using and requiring psyco to gain speed improvements in python http://psyco.sourceforge.net/introduction.html however it seems that the developer is not active on this project anymore and is more active on PyPy http://codespeak.net/pypy/dist/pypy/doc/

I've never really attempted to use JIT optimizers due to the fact that all of my development and production environments are 64-bit, and I haven't found one yet that supports 64-bit properly. Relying on dead projects, however, is an issue of larger scope than mere portability. ;)

A program such as AVSP http://avisynth.org/qwerpoi/ which relies on psyco what would be a good proposition to use when taking the project to python 3.0 if psyco will remain unavailable?

I'd take the same approach Python 3 itself did; rewrite it for Python 3 and take the opportunity to remove excessive backwards compatibility cruft, streamline algorithms, etc. With a suite of existing unit/functional tests, that possibility is the ultimate in test-driven development. ;) It would also follow the "write clean code, then profile and optimize where actually needed" philosophy. Obviously that recommendation won't be the best solution for every project.

With all of the FOSS projects I really, really care about I'm writing from near-scratch (the code, if not the algorithms) with 2.6+ and 3.1+ polyglot (no conversion tools like 2to3, and no split packaging) compatibility as a primary design goal. So far it's working out quite well.

- Alice
Re: Trying to parse a HUGE(1gb) xml file
Tim Harig, 26.12.2010 10:22: On 2010-12-26, Stefan Behnel wrote: Tim Harig, 26.12.2010 02:05: On 2010-12-25, Nobody wrote: On Sat, 25 Dec 2010 14:41:29 -0500, Roy Smith wrote:

Of course, one advantage of XML is that with so much redundant text, it compresses well. We typically see gzip compression ratios of 20:1. But, that just means you can archive them efficiently; you can't do anything useful until you unzip them.

XML is typically processed sequentially, so you don't need to create a decompressed copy of the file before you start processing it.

Sometimes XML is processed sequentially. When the markup footprint is large enough it must be. Quite often, as in the case of the OP, you only want to extract a small piece out of the total data. In those cases, being forced to read all of the data sequentially is both inconvenient and a performance penalty unless there is some way to address the data you want directly.

[...] If you do it a lot, you will have to find a way to make the access efficient for your specific use case. So the file format doesn't matter either, because the data will most likely end up in a fast database after reading it in sequentially *once*, just as in the case above.

If the data is just going to end up in a database anyway, then why not send it as a database to begin with and save the trouble of having to convert it?

I don't think anyone would object to using a native format when copying data from one database 1:1 to another one. But if the database formats are different on both sides, it's a lot easier to map XML formatted data to a given schema than to map a SQL dump, for example. Matter of use cases, not of data size.

Stefan
Re: GUI Tools for Python 3.1
On Friday 24 December 2010, 03:58:15 Randy Given wrote: Lots of stuff for 2.6 and 2.7 -- what GUI tools are there for 3.1? PyQt4 of course. http://www.riverbankcomputing.com Pete
Re: Keeping track of the N largest values
Roy Smith wrote: In article xns9e59a44d7cc49duncanbo...@127.0.0.1, Duncan Booth duncan.bo...@invalid.invalid wrote: Roy Smith r...@panix.com wrote:

I'm processing a stream of N numbers and want to keep track of the K largest. There's too many numbers in the stream (i.e. N is too large) to keep in memory at once. K is small (100 would be typical). ... Is there a better way to do this, either from a theoretical running time point of view, or just a nicer way to code this in Python?

How about:

    from heapq import nlargest
    top = nlargest(K, input())

It uses a heap so avoids completely resorting each time.

Hmmm, that looks like it would solve the problem as stated, but there's another twist. In addition to finding the K largest values, I *also* need to do some other processing on all the values (which I didn't show in the original example, incorrectly thinking it not germane to my question). The problem with nlargest() is that it doesn't give me a hook to do that.

If Paul's solution doesn't suffice -- the heapq module has the building blocks for a custom solution, e. g.:

    import heapq
    from functools import partial

    class NLargest(object):
        def __init__(self, n):
            self.n = n
            self.heap = []

        def tally(self, item):
            heap = self.heap
            if len(heap) >= self.n:
                # Heap is full: push the new item and drop the smallest.
                heapq.heappushpop(heap, item)
                # From now on skip the length check entirely.
                self.tally = partial(heapq.heappushpop, heap)
            else:
                heapq.heappush(heap, item)

        def __str__(self):
            return str(sorted(self.heap, reverse=True))

    if __name__ == "__main__":
        import random
        items = range(100)
        random.shuffle(items)
        accu = NLargest(10)
        for item in items:
            accu.tally(item)
            print item, accu

PS: I'm assuming heapq.nlargest(n, iterable) operates with memory proportional to n, and not to the iterator length. That's the only reasonable conclusion, but the docs don't actually come out and say it.

I would hope so.
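The same idea as a single function, written for Python 3, may make the "hook" Roy asked for easier to see. This is only a sketch: the name k_largest_with_hook and the hook parameter are invented here, and it uses heapq.heapreplace after an explicit comparison rather than the heappushpop call in the class above.

```python
import heapq


def k_largest_with_hook(stream, k, hook):
    """Call hook(x) for every item and return the k largest, biggest first."""
    heap = []  # min-heap holding the k largest items seen so far
    for x in stream:
        hook(x)  # the "other processing" happens on every value
        if len(heap) < k:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            # x beats the smallest of the current top k, so swap it in.
            heapq.heapreplace(heap, x)
    return sorted(heap, reverse=True)


seen = []
top = k_largest_with_hook(iter([5, 1, 9, 3, 7, 9, 2]), 3, seen.append)
print(top)
```

Memory stays proportional to k, and each item costs at most one O(log k) heap operation, which gives the N log K bound mentioned in the original question.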
Re: Keeping track of the N largest values
    from bisect import insort_left

    K = 5
    top = []
    while 1:
        x = input()
        if len(top) < K:
            insort_left(top, x)
        elif x > top[0]:
            del top[0]
            insort_left(top, x)
        print top

will be enough
Re: Design Ideals Goals Python 3 - Forest for the trees
On Dec 26, 5:34 am, Alice Bevan–McGregor al...@gothcandy.com wrote: I've never really attempted to use JIT optimizers due to the fact that all of my development and production environments are 64-bit, and I haven't found one yet that supports 64-bit properly. Relying on dead projects, however, is an issue of larger scope than mere portability. ;) The PyPy JIT supports x86_64. It's still being improved, but it does provide real speedups in some cases already. Jean-Paul
Re: Keeping track of the N largest values
In article e05e480b-8956-4984-b4cc-9a1666380...@l32g2000yqc.googlegroups.com, n00m n...@narod.ru wrote:

    from bisect import insort_left

    K = 5
    top = []
    while 1:
        x = input()
        if len(top) < K:
            insort_left(top, x)
        elif x > top[0]:
            del top[0]
            insort_left(top, x)
        print top

will be enough

Hmmm, that's an interesting idea. Thanks.
Re: How to pop the interpreter's stack?
Carl Banks wrote: Python is not, and never has been, about hiding internal details. It's about openness, and there's no reason a traceback should hide internal details any more than a class should--in fact hiding information in the traceback is far worse, because you're suppressing information that could be crucial for debugging. +100 ~Ethan~
Re: How to pop the interpreter's stack?
Steven D'Aprano wrote: On Sat, 25 Dec 2010 09:17:27 -0500, Robert Kern wrote: On 12/24/10 5:14 PM, Ethan Furman wrote:

There are also times when I change the exception being raised to match what python expects from that type of object -- for example, from WhatEverException to KeyError for a dict-like object. So in this regard I agree with Steven.

Steven isn't arguing that particular point here, nor is anyone arguing against it.

Emphasis on *here*. You will note that in Python 3, if you raise an exception inside an except block, both the *original* and the new exception are printed. This is great for fixing bugs inside except blocks, but terribly disruptive for catching one error and raising another error in its place...

Yes, this is where I was agreeing with Steven. While I love python3, the current nested exception behavior is horrible.

~Ethan~
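A note for later readers: Python 3.3 and newer (so not the versions available when this thread was written) accept "raise ... from None", which suppresses the display of the original exception when one error is deliberately replaced by another. A minimal sketch, with a made-up Registry class standing in for the dict-like object Ethan describes:

```python
class Registry:
    """A dict-like wrapper whose failed lookups should raise KeyError."""

    def __init__(self):
        self._items = ["zero", "one"]

    def __getitem__(self, key):
        try:
            return self._items[key]
        except (IndexError, TypeError):
            # "from None" hides the "During handling of the above exception,
            # another exception occurred" section of the traceback.
            raise KeyError(key) from None


try:
    Registry()[5]
except KeyError as exc:
    caught = exc
    print(repr(exc), exc.__suppress_context__)
```

The original IndexError is still reachable through the __context__ attribute for anyone who wants it; it is just no longer printed by default.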
Re: How to pop the interpreter's stack?
Steven D'Aprano wrote:

Right. But I have thought of a clever trick to get the result KJ was asking for, with the minimum of boilerplate code. Instead of this:

    def _pre_spam(args):
        if condition(args):
            raise SomeException("message")
        if another_condition(args):
            raise AnotherException("message")
        if third_condition(args):
            raise ThirdException("message")

    def spam(args):
        _pre_spam(args)
        do_useful_work()

you can return the exceptions instead of raising them (exceptions are just objects, like everything else!), and then add one small piece of boilerplate to the spam() function:

    def _pre_spam(args):
        if condition(args):
            return SomeException("message")
        if another_condition(args):
            return AnotherException("message")
        if third_condition(args):
            return ThirdException("message")

    def spam(args):
        exc = _pre_spam(args)
        if exc: raise exc
        do_useful_work()

-1

You failed to mention that cleverness is not a prime requisite of the python programmer -- in fact, it's usually frowned upon. The big problem with the above code is you are back to passing errors in-band, pretty much completely defeating the point of having an out-of-band channel.

~Ethan~
__delitem__ feature
When I execute this file:

    #--
    def nodelfactory(klass):
        class nodel(klass):
            def _delitem(self, _):
                raise TypeError("can't delete")

            # __delitem__ = _delitem

            def __init__(self, *a, **k):
                klass.__init__(self, *a, **k)
                self.__delitem__ = self._delitem

        nodel.__name__ = 'nodel%s' % klass.__name__
        return nodel

    if __name__ == '__main__':
        import traceback as tb

        d = nodelfactory(dict)([('k1', 'v1'), ('k2', 'v2')])

        try: d.__delitem__('k1')
        except TypeError: tb.print_exc()
        print d

        try: del d['k1']
        except TypeError: tb.print_exc()
        print d

        l = nodelfactory(list)([1, 2, 3, 4])

        try: l.__delitem__(0)
        except TypeError: tb.print_exc()
        print l

        try: del l[0]
        except TypeError: tb.print_exc()
        print l
    #--

...the output I get is:

    Traceback (most recent call last):
      File "/tmp/delbug.py", line 20, in <module>
        try: d.__delitem__('k1')
      File "/tmp/delbug.py", line 4, in _delitem
        raise TypeError("can't delete")
    TypeError: can't delete
    {'k2': 'v2', 'k1': 'v1'}
    {'k2': 'v2'}
    Traceback (most recent call last):
      File "/tmp/delbug.py", line 30, in <module>
        try: l.__delitem__(0)
      File "/tmp/delbug.py", line 4, in _delitem
        raise TypeError("can't delete")
    TypeError: can't delete
    [1, 2, 3, 4]
    [2, 3, 4]

It means that, for both subclasses, del fails to trigger the dynamically installed instance method __delitem__.

If I replace dict with UserDict, *both* deletion attempts lead to a call to the dynamic __delitem__ method, and are thus blocked. This is the behavior I expected of dict (and will help me hold on to my belief that I'm not going insane when inevitably I'm told that there's no bug in dict or list).

Interestingly enough, if I replace list with UserList, I see no change in behavior. So maybe I am going insane after all.

~kj

P.S. If you uncomment the commented-out line, and comment out the last line of the __init__ method (which installs self._delitem as self.__delitem__) then *all* the deletion attempts invoke the __delitem__ method, and are therefore blocked. FWIW.
Re: __delitem__ feature
On 12/26/2010 10:53 AM, kj wrote: P.S. If you uncomment the commented-out line, and comment out the last line of the __init__ method (which installs self._delitem as self.__delitem__) then *all* the deletion attempts invoke the __delitem__ method, and are therefore blocked. FWIW. Because subclasses of builtins only check the class __dict__ for special method overrides, not the instance __dict__.
inspect.getsource bug?
Try this:

1) define a function 'foo' in a script
2) runfile the script from a shell
3) do 'inspect.getsource(foo)'
4) change the source of 'foo'
5) runfile the script from the same shell
6) do 3 again

On my 2.6.6, getsource returns the same code both times. I couldn't find very much about this; is there any known workaround? Thanks
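One possible workaround, offered as a sketch rather than a confirmed fix: inspect reads source lines through the linecache module, so asking linecache to re-check the file before calling getsource should discard stale lines. The example below is Python 3 and simulates "runfile" with a hypothetical load_foo helper; whether the same call is sufficient on 2.6.6 with the poster's shell is an assumption that has not been verified here.

```python
import inspect
import linecache
import os
import tempfile


def load_foo(path):
    """Run the script at `path` and return the `foo` it defines."""
    namespace = {}
    with open(path) as handle:
        exec(compile(handle.read(), path, "exec"), namespace)
    return namespace["foo"]


def fresh_source(func):
    """Return func's source after discarding stale cached lines for its file."""
    linecache.checkcache(inspect.getsourcefile(func))
    return inspect.getsource(func)


with tempfile.TemporaryDirectory() as folder:
    script = os.path.join(folder, "script.py")
    with open(script, "w") as handle:
        handle.write("def foo():\n    return 1\n")
    first = fresh_source(load_foo(script))
    # Steps 4 to 6 from the report: edit the file, run it again, ask again.
    with open(script, "w") as handle:
        handle.write("def foo():\n    return 'changed'\n")
    second = fresh_source(load_foo(script))

print(first != second)
```

linecache.checkcache compares the cached entry with the file's current size and modification time, so it only helps when the file on disk has really changed.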
Re: __delitem__ feature
On 26.12.2010 18:53, kj wrote:

It means that, for both subclasses, del fails to trigger the dynamically installed instance method __delitem__.

Magic methods like __delitem__ are looked up on the type, not on the instance. You can't change the behaviour on instances.

If I replace dict with UserDict, *both* deletion attempts lead to a call to the dynamic __delitem__ method, and are thus blocked. This is the behavior I expected of dict (and will help me hold on to my belief that I'm not going insane when inevitably I'm told that there's no bug in dict or list).

UserDict is an old style class (not a subclass of object). Old style classes behave differently.

Christian
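The lookup rule Christian describes can be seen in a few lines. This sketch is written for Python 3, where every class is new-style; the names NoDel and refuse are invented for the example.

```python
class NoDel(dict):
    """dict subclass used to show where `del d[k]` looks for __delitem__."""


def refuse(key):
    raise TypeError("can't delete")


d = NoDel(k1="v1", k2="v2")
d.__delitem__ = refuse       # stored in the instance __dict__ only

del d["k1"]                  # implicit lookup uses type(d), so this succeeds
after_instance_patch = dict(d)

NoDel.__delitem__ = lambda self, key: refuse(key)   # now patch the class
try:
    del d["k2"]
    blocked = False
except TypeError:
    blocked = True

print(after_instance_patch, blocked)
```

An explicit call such as d.__delitem__('k2') is ordinary attribute access, so it does find the instance attribute; only the implicit form triggered by the del statement bypasses it. That matches the mixed results in kj's output.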
Re: type(d) != type(d.copy()) when type(d).issubclass(dict)
John O'Hagan resea...@johnohagan.com writes:

IMO one of the benefits of subclassing is that you can just bolt on additional behaviour without having to know all the inner workings of the superclass, a benefit that is somewhat defeated by this behaviour of builtins.

I agree. I've read the old posts/articles by GvR and others about how great it will be now that one can subclass Python builtin types like any other class (GvR even gives explicit examples of this luscious possibility in his paper on type/class unification). But now I'm discovering so many caveats, exceptions, and gotchas about subclassing builtins that I have to conclude that this much celebrated new capability is basically useless... Just like readability counts, it is also true that conceptual clarity counts, and treating builtins as classes in Python is the most obfuscated design I've ever seen. UserDict, come back, all is forgotten!

~kj
Re: type(d) != type(d.copy()) when type(d).issubclass(dict)
In xns9e59a27def178duncanbo...@127.0.0.1 Duncan Booth duncan.bo...@invalid.invalid writes: kj no.em...@please.post wrote:

Watch this:

    >>> class neodict(dict): pass
    ...
    >>> d = neodict()
    >>> type(d)
    <class '__main__.neodict'>
    >>> type(d.copy())
    <type 'dict'>

Bug? Feature? Genius beyond the grasp of schlubs like me?

Feature. In (almost?) all cases any objects constructed by a subclass of a builtin class will be of the original builtin class.

What I *really* would like to know is: how do *you* know this (and the same question goes for the other responders who see this behavior of dict as par for the course). Can you show me where it is in the documentation? I'd really appreciate it. TIA!

~kj
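For anyone who wants copies to keep the subclass type, the usual remedy is to override copy() in the subclass. A minimal Python 3 sketch; the class names NeoDict and PlainSub are made up for the comparison:

```python
class NeoDict(dict):
    def copy(self):
        # Rebuild through type(self) so subclasses keep their own type.
        return type(self)(self)


class PlainSub(dict):
    pass


plain_copy_type = type(PlainSub(a=1).copy())
neo_copy_type = type(NeoDict(a=1).copy())
print(plain_copy_type.__name__, neo_copy_type.__name__)
```

The override has to be repeated for every inherited method that builds a new object, which is the maintenance burden John O'Hagan and kj are complaining about.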
Re: __delitem__ feature
In mailman.302.1293387041.6505.python-l...@python.org Ian Kelly ian.g.ke...@gmail.com writes: On 12/26/2010 10:53 AM, kj wrote: P.S. If you uncomment the commented-out line, and comment out the last line of the __init__ method (which installs self._delitem as self.__delitem__) then *all* the deletion attempts invoke the __delitem__ method, and are therefore blocked. FWIW. Because subclasses of builtins only check the class __dict__ for special method overrides, not the instance __dict__. How do you know this? Is this documented? Or is this a case of Monday-night quarterbacking? ~kj
Re: Keeping track of the N largest values
On 25.12.2010 16:42, Roy Smith wrote:

I'm processing a stream of N numbers and want to keep track of the K largest. There's too many numbers in the stream (i.e. N is too large) to keep in memory at once. K is small (100 would be typical). From a theoretical point of view, I should be able to do this in N log K time. What I'm doing now is essentially:

    top = [-1]    # Assume all x are >= 0
    for x in input():
        if x <= top[0]:
            continue
        top.append(x)
        if len(top) > K:
            top.sort()
            top.pop(0)

I can see pathological cases (say, all input values the same) where running time would be N K log K, but on average (N >> K and random distribution of values), this should be pretty close to N. Is there a better way to do this, either from a theoretical running time point of view, or just a nicer way to code this in Python?

Here is my version:

    l = []
    K = 10
    while 1:
        a = input()
        if len(l) == K:
            l.remove(min(l))
        l = [x for x in l if x < a] + [a] + [x for x in l if x > a]
        print l
Re: How to pop the interpreter's stack?
In mailman.280.1293287106.6505.python-l...@python Robert Kern robert.k...@gmail.com writes:

Except that the *caller* never gets the traceback (unless it deliberately inspects the stack for some metaprogramming reason). It gets the exception, and that is the same no matter what you do. The developer/user gets the traceback, and those implementation details *are* often important to them.

Just look at what Python shows you if you pass the wrong number of arguments to a function:

    >>> def spam(x, y, z): pass
    ...
    >>> spam(1, 2)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: spam() takes exactly 3 arguments (2 given)

That's it. The traceback stops at the point of the error. Python doesn't show you all the underlying C-coded machinery that went into detecting the error and emitting the error message. *No one* needs this information at this point.

All I'm saying is that I want to do the same thing with my argument validation code as Python does with its argument validation code: keep it out of sight. When my argument validation code fires an exception ***there's no bug in **my** code***. It's doing exactly what it's supposed to do. Therefore, there's no need for me to debug anything, and certainly no need for me to inspect the traceback all the way to the exception. The bug is in the code that called my function with the wrong arguments. The developer of that code has no more use for seeing the traceback all the way to where my code raises the exception than I have for seeing the traceback of Python's underlying C code when I get an error like the one shown above.

~kj
Re: How to pop the interpreter's stack?
In mailman.301.1293383804.6505.python-l...@python.org Ethan Furman et...@stoneleaf.us writes: You failed to mention that cleverness is not a prime requisite of the python programmer -- in fact, it's usually frowned upon. That's the party line, anyway. I no longer believe it. I've been crashing against one bit of cleverness after another in Python's unification of types and classes...
Re: How to pop the interpreter's stack?
On 12/25/2010 2:50 PM, Steven D'Aprano wrote: On Fri, 24 Dec 2010 10:51:32 -0800, John Nagle wrote: On 12/24/2010 3:24 AM, Carl Banks wrote: On Dec 24, 1:24 am, Steven D'Aprano steve +comp.lang.pyt...@pearwood.info wrote:

All I'm suggesting is that there should be a way of reducing the boilerplate needed for this idiom:

    def _validate_arg(x):
        if x == 'bad input': return False
        return True

    def f(arg):
        if not _validate_arg(arg):
            raise ValueError
        process(arg)

to something more natural that doesn't needlessly expose implementation details that are completely irrelevant to the caller.

How about

    raise ValueError("Bad input %s to validate_arg" % (repr(arg),))

You can pass arguments to most exceptions, and the content of the exception is determined entirely by the code raising it.

I know that exceptions can take arguments (usually string error messages). I was writing in short-hand. My apologies, I thought that would have been obvious :( Perhaps you have missed the context of the discussion. The context is that the called function delegates the job of validating input to a private function, which should be hidden from the caller (it's private, not part of the public API, subject to change, hidden, etc.) but tracebacks expose that information, obscuring the cause of the fault. (The fault being bad input to the public function, not an accidental bug in the private function.)

If end users are seeing uncaught tracebacks, the program is broken.

Well, perhaps, but that's a separate issue. We're talking about the caller of the function seeing internal details, not the end user.

No, that is the issue, unless the program itself is examining the stack traceback data. Python exception-catching has no notion of what code raised the exception. Only the contents of the exception object are normally available. So the private function is always hidden, unless you're debugging, in which case it shouldn't be hidden. Traceback is purely a debugging feature.

In some Python implementations, such as Shed Skin, you don't get tracebacks unless you're running under a debugger.

John Nagle
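The distinction being argued over, between what the calling code receives and what a person reading the traceback sees, can be made concrete with a short Python 3 sketch. The function names follow the idiom quoted above; the error message text and the use of traceback.extract_tb are choices made for this illustration only.

```python
import traceback


def _validate_arg(x):
    if x == "bad input":
        raise ValueError("Bad input %r to f()" % (x,))


def f(arg):
    _validate_arg(arg)
    return arg.upper()


try:
    f("bad input")
except ValueError as exc:
    # What the calling code normally works with: the exception object.
    message = str(exc)
    # What a developer reading the traceback sees: the frames involved.
    frame_names = [frame.name for frame in traceback.extract_tb(exc.__traceback__)]

print(message)
print(frame_names)
```

The except clause matches on the exception type alone and works the same whether the raise happened in f or in _validate_arg; the private helper only shows up when the frames are inspected or printed.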
Re: Python Web App
@Katie Thank you. I considered this option until I realized it wouldn't let me do anything other than ping from the command line. The rest of you all make valid points. After doing a little more research on my own I found some really nice web-based text editors, but they didn't have any testing abilities, which meant learning in that environment wasn't feasible in my opinion. I am inclined to agree that Chrome OS will probably not do as well as they want it to, but with the kind of capital Google has they could easily flood the market. In the end I wound up giving the notebook to my mom, because all she really does is check her email and Facebook, so it was perfect for her. Thank you for all the responses; they were a great help with me testing the notebook.

On Dec 25, 9:02 pm, Katie T ka...@coderstack.co.uk wrote: On Wed, Dec 22, 2010 at 9:43 PM, Sean secr...@gmail.com wrote:

Anybody know where I can find a Python Development Environment in the form of a web app for use with Chrome OS? I have been looking for a few days and all I have been able to find is some old discussions with Python developers talking about how they will want one for the OS to be a success with them.

Your best bet is probably just to SSH to a *nix box and use something like vim or emacs. None of the web solutions are anywhere near acceptable.

Katie -- CoderStack http://www.coderstack.co.uk/python-jobs The Software Developer Job Board
Re: Trying to parse a HUGE(1gb) xml file
On 2010-12-26, Stefan Behnel stefan...@behnel.de wrote: Tim Harig, 26.12.2010 10:22: On 2010-12-26, Stefan Behnel wrote: Tim Harig, 26.12.2010 02:05: On 2010-12-25, Nobody wrote: On Sat, 25 Dec 2010 14:41:29 -0500, Roy Smith wrote: Of course, one advantage of XML is that with so much redundant text, it compresses well. We typically see gzip compression ratios of 20:1. But, that just means you can archive them efficiently; you can't do anything useful until you unzip them. XML is typically processed sequentially, so you don't need to create a decompressed copy of the file before you start processing it. Sometimes XML is processed sequentially. When the markup footprint is large enough it must be. Quite often, as in the case of the OP, you only want to extract a small piece out of the total data. In those cases, being forced to read all of the data sequentially is both inconvenient and and a performance penalty unless there is some way to address the data you want directly. [...] If you do it a lot, you will have to find a way to make the access efficient for your specific use case. So the file format doesn't matter either, because the data will most likely end up in a fast data base after reading it in sequentially *once*, just as in the case above. If the data is just going to end up in a database anyway; then why not send it as a database to begin with and save the trouble of having to convert it? I don't think anyone would object to using a native format when copying data from one database 1:1 to another one. But if the database formats are different on both sides, it's a lot easier to map XML formatted data to a given schema than to map a SQL dump, for example. Matter of use cases, not of data size. Your assumption keeps hinging on the fact that I should want to dump the data into a database in the first place. Very often I don't. I just want to rip out the small portion of information that happens to be important to me. 
I may not even want to archive my little piece of the information once I have processed it. Even assuming that I want to dump all the data into a database, walking through a bunch of database records to translate them into the schema for another database is no more difficult than walking through a bunch of XML elements. In fact, it is even easier since I can use the relational model to reconstruct the information in an organization that better fits how the data is actually structured in my database instead of being constrained by how somebody else wanted to organize their XML. There is no need to map a[sic] SQL dump. XML is great when the data set is small enough that parsing the whole tree has negligible costs. I can choose whether I want to parse it sequentially or use XPath/DOM/Etree etc to make it appear as though I am making random accesses. When the data set grows so that parsing it is expensive I lose that choice even if my use case would otherwise prefer a random access paradigm. When that happens, there are better ways of communicating that data that don't force me into using a high overhead method of extracting my data. The problem is that XML has become such a de facto standard that it is used automatically, without thought, even when there are much better alternatives available. -- http://mail.python.org/mailman/listinfo/python-list
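The sequential approach discussed in this thread can be sketched with the standard library: the compressed stream is decompressed and parsed incrementally, so no decompressed copy and no full tree are needed. The element name record, the id attribute and the in-memory test document are assumptions made for illustration; the original poster's file layout is not known.

```python
import gzip
import io
import xml.etree.ElementTree as ET

def find_records(stream, wanted_id):
    """Yield the text of <record> elements whose id attribute matches."""
    for event, elem in ET.iterparse(stream, events=('end',)):
        if elem.tag == 'record':
            if elem.get('id') == wanted_id:
                yield elem.text
            # Drop the element's text and attributes once it has been seen,
            # so finished records do not pile up in memory.
            elem.clear()

# Build a small gzip-compressed document in memory for the demonstration.
xml_bytes = b'<data>' + b''.join(
    b'<record id="%d">value %d</record>' % (i, i) for i in range(1000)
) + b'</data>'
compressed = gzip.compress(xml_bytes)

# gzip.open() accepts a file object as well as a path.
with gzip.open(io.BytesIO(compressed), 'rb') as stream:
    matches = list(find_records(stream, '742'))
print(matches)
```

This still reads every byte up to the end of the file, which is exactly the cost Tim objects to; it only avoids holding the whole document at once.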
Re: __delitem__ feature
On 26.12.2010 19:49, kj wrote: How do you know this? Is this documented? Or is this a case of Monday-night quarterbacking? Please stop bitching around. You know that by carefully reading the documentation: http://docs.python.org/reference/datamodel.html#special-method-lookup-for-new-style-classes -- http://mail.python.org/mailman/listinfo/python-list
Re: lxml etree question
Jim, 26.12.2010 00:32: On Dec 25, 5:33 am, Stefan Behnel wrote: lxml knows about this special case, so you can write {http://www.w3.org/XML/1998/namespace}lang and lxml will take care of using the right prefix. Stefan, thank you for the software, which has helped me a great deal. I tried that exact thing, among a number of others, and it didn't work for me (I got ns0).

Works for me, at least with a recent SVN version:

    Python 2.7.1rc1+ (trunk:86636, Nov 21 2010, 09:18:37)
    [GCC 4.4.3] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import lxml.etree as ET
    >>> el = ET.Element('test',
    ...     {'{http://www.w3.org/XML/1998/namespace}lang': 'de'})
    >>> ET.tostring(el)
    '<test xml:lang="de"/>'

Anyway, I applied a patch that makes sure it will always use the 'xml' prefix for this namespace. Will be in 2.3 final. Stefan -- http://mail.python.org/mailman/listinfo/python-list
string u'hyv\xe4' to file as 'hyvä'
Could you please help me with special characters saving to file. I need to write the string u'hyv\xe4' to file. I would like to open file and to have line 'hyvä' import codecs word= u'hyv\xe4' F=codecs.open(/opt/finnish.txt, 'w+','Latin-1') F.writelines(item.encode('Latin-1')) F.writelines(item.encode('utf8')) F.writelines(item) F.close() All three writelines gives the same result in finnish.txt: hyv\xe4 i would like to find 'hyvä'. regards, gintare -- http://mail.python.org/mailman/listinfo/python-list
Re: inspect.getsource bug?
On 26 Dic, 19:24, Ciccio franap...@gmail.com wrote: Try this: 1) define a function 'foo' in a script 2) runfile the script from a shell 3) do 'inspect.getsource(foo)' 4) change the source of 'foo' 5) runfile the script from the same shell 6) do 3 again On my 2.6.6 getsource returns twice the same code. I couldn't find very much about this, is there any known workaround? thanks found this in the meantime: http://bugs.python.org/issue993580 -- http://mail.python.org/mailman/listinfo/python-list
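One possible workaround, offered as a sketch rather than a confirmed fix for the poster's setup: inspect.getsource reads source lines through the linecache module, so asking linecache to re-check the file before calling getsource should pick up the edited text. Here runpy.run_path stands in for the shell's "runfile" command, and the file name script.py is hypothetical.

```python
import inspect
import linecache
import os
import runpy
import tempfile

def load_foo(path):
    # Stand-in for "runfile": execute the script and fetch its foo function.
    return runpy.run_path(path)['foo']

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'script.py')

    with open(path, 'w') as fh:
        fh.write("def foo():\n    return 1\n")
    first = inspect.getsource(load_foo(path))

    # Change the source of foo, then run the script again.
    with open(path, 'w') as fh:
        fh.write("def foo():\n    return 'changed'\n")

    # Discard cached lines for this file if it changed on disk.
    linecache.checkcache(path)
    second = inspect.getsource(load_foo(path))

print(first)
print(second)
```

Recent Python versions make this check inside inspect itself, so the explicit call mainly matters on older interpreters such as the 2.6.6 mentioned above.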
Need urllib.urlretrieve and urllib2.OpenerDirector together
Hello everyone. I'm writing a script in Python 2.7 which uses a urllib2.OpenerDirector instance via urllib2.build_opener() to take advantage of the urllib2.HTTPCookieProcessor class, because I need to store and re-send the cookies I get: opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookielib.CookieJar())) However, after making several requests and moving the cookies around, eventually I need to retrieve a list of URLs. I wanted to use urllib.urlretrieve() because I read it downloads the file in chunks, but I cannot because I need to carry my cookies on the request and urllib.urlretrieve() uses a urllib.URLOpener, which doesn't have support for cookie handlers like OpenerDirector has. I want to know the reason of this strange way of splitting functionality, and how can I achieve my goal. Thank you in advance! -- http://mail.python.org/mailman/listinfo/python-list
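A sketch of one way to get chunked downloads without urlretrieve: the object returned by opener.open() is file-like, so it can be copied to disk in fixed-size pieces while the opener's cookie handler still applies. This is written with the Python 3 module names (urllib.request, http.cookiejar); in Python 2.7 the corresponding names are urllib2 and cookielib. The helper names make_opener and retrieve, and the chunk size, are assumptions made for illustration.

```python
import http.cookiejar
import shutil
import urllib.request

def make_opener():
    # Same construction as in the post: an opener that stores and
    # re-sends cookies through a CookieJar.
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

def retrieve(opener, url, filename, chunk_size=16 * 1024):
    """Download url to filename through opener, chunk_size bytes at a time."""
    response = opener.open(url)
    try:
        with open(filename, 'wb') as out:
            # copyfileobj reads and writes in chunks, so the whole body
            # is never held in memory at once.
            shutil.copyfileobj(response, out, chunk_size)
    finally:
        response.close()
    return filename
```

Reusing one opener for the earlier requests and for retrieve() keeps the cookies collected so far attached to each download.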
Re: string u'hyv\xe4' to file as 'hyvä'
On 26/12/2010 22:43, gintare wrote: Could you please help me with special characters saving to file. I need to write the string u'hyv\xe4' to file. I would like to open file and to have line 'hyvä' import codecs word= u'hyv\xe4' F=codecs.open(/opt/finnish.txt, 'w+','Latin-1') This opens the file using the Latin-1 encoding (although only if you put the filename in quotes). F.writelines(item.encode('Latin-1')) This encodes the Unicode item (did you mean 'word'?) to a bytestring using the Latin-1 encoding. You opened the file using Latin-1 encoding, so this is pointless. You should pass a Unicode string; it will encode it for you. You're also passing a bytestring to the .writelines method, which expects a list of strings. What you should be doing is this: F.write(word) F.writelines(item.encode('utf8')) This encodes the Unicode item to a bytestring using the UTF-8 encoding. This is also pointless. You shouldn't be encoding to UTF-8 and then trying to write it to a file which was opened using Latin-1 encoding! F.writelines(item) F.close() All three writelines gives the same result in finnish.txt: hyv\xe4 i would like to find 'hyvä'. -- http://mail.python.org/mailman/listinfo/python-list
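The advice above can be put together as a short sketch: open the file with an encoding, pass the Unicode string unencoded, and use .write() rather than .writelines(). A temporary path replaces /opt/finnish.txt here, which is an assumption made so the example can run anywhere.

```python
import codecs
import os
import tempfile

word = u'hyv\xe4'

# A temporary path stands in for /opt/finnish.txt from the original post.
path = os.path.join(tempfile.mkdtemp(), 'finnish.txt')

# codecs.open() returns a writer that encodes Unicode text on the way out,
# so the string is passed unencoded and written with .write().
f = codecs.open(path, 'w', 'latin-1')
try:
    f.write(word)
finally:
    f.close()

# Reading the raw bytes back shows a single 0xE4 byte for the a-umlaut.
with open(path, 'rb') as raw:
    data = raw.read()
print(repr(data))
```

Whether that byte is displayed as ä then depends on the viewer or terminal interpreting the file as Latin-1.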
IDLE GUI not working
Hello all, Newbie here so go easy on me. I've been trying to get the IDLE GUI to work on my machine, but have been unsuccessful so far. I have an IBM Thinkpad running Windows XP and it has an older version of Python running (2.2, I believe). When I try to use the shortcut to open the IDLE GUI nothing happens (not even a process running in task manager). When I use the cmd line to try to open the IDLE GUI, I get this error message:

    C:\>C:\python27\python C:\python27\Lib\idlelib\idle.py
    Traceback (most recent call last):
      File "C:\python27\Lib\idlelib\idle.py", line 11, in <module>
        idlelib.PyShell.main()
      File "C:\python27\Lib\idlelib\PyShell.py", line 1389, in main
        root = Tk(className="Idle")
      File "C:\python27\lib\lib-tk\Tkinter.py", line 1685, in __init__
        self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use)
    _tkinter.TclError: Can't find a usable init.tcl in the following directories:
        {C:\IBMTOOLS\Python22\tcl\tcl8.4} C:/IBMTOOLS/Python22/tcl/tcl8.5 C:/python27/lib/tcl8.5 C:/lib/tcl8.5 C:/lib/tcl8.5 C:/library C:/library C:/tcl8.5.2/library C:/tcl8.5.2/library

    C:/IBMTOOLS/Python22/tcl/tcl8.4/init.tcl: version conflict for package "Tcl": have 8.5.2, need exactly 8.4
    version conflict for package "Tcl": have 8.5.2, need exactly 8.4
        while executing
    "package require -exact Tcl 8.4"
        (file "C:/IBMTOOLS/Python22/tcl/tcl8.4/init.tcl" line 19)
        invoked from within
    "source C:/IBMTOOLS/Python22/tcl/tcl8.4/init.tcl"
        ("uplevel" body line 1)
        invoked from within
    "uplevel #0 [list source $tclfile]"

    This probably means that Tcl wasn't installed properly.

As I understand this, Python 2.7 is looking for Tcl in a directory or path specified by the older Python 2.2. I have tried to unset this path by entering this into the cmd line:

    C:\>set TCL_LIBRARY=
    C:\>set TK_LIBRARY=
    C:\>C:\Python27\python.exe C:\Python27\Lib\idlelib\idle.py

No dice. I tried to enter this information as a new pythonpath in the environment variables. No dice.

I've tried, in vain, to get this thing working, but most of the other explanations are way over my head (I'm pretty new to programming and digging around the programming guts of computers). It seems like a fairly common problem, but I haven't gotten a good answer from either the official python help boards or elsewhere. I was hoping that someone here could give me an easy to understand way to make the IDLE GUI work. Thanks in advance. -- http://mail.python.org/mailman/listinfo/python-list
Re: How to pop the interpreter's stack?
On 12/26/2010 2:14 PM, kj wrote: In mailman.301.1293383804.6505.python-l...@python.org Ethan Furman et...@stoneleaf.us writes: You failed to mention that cleverness is not a prime requisite of the python programmer -- in fact, it's usually frowned upon. That's the party line, anyway. I no longer believe it. I've been crashing against one bit of cleverness after another in Python's unification of types and classes... Well if you can find a way to implement a class system that doesn't use clever tricks *in its implementation* please let me know. regards Steve -- Steve Holden +1 571 484 6266 +1 800 494 3119 PyCon 2011 Atlanta March 9-17 http://us.pycon.org/ See Python Video! http://python.mirocommunity.org/ Holden Web LLC http://www.holdenweb.com/ -- http://mail.python.org/mailman/listinfo/python-list
Re: How to pop the interpreter's stack?
On Dec 23, 3:22 am, Steven D'Aprano steve +comp.lang.pyt...@pearwood.info wrote: You seem to have completely missed that there will be no bug report, because this isn't a bug. (Or if it is a bug, the bug is elsewhere, external to the function that raises the exception.) It is part of the promised API. The fact that the exception is generated deep down some chain of function calls is irrelevant. While the idea of being able to do this (create a general validation exception) sounds mildly appealing at first, what happens when the module implementing this documented API and documented error, has a bug? It seems that the user, or even module developer in the midst of writing, would now have no tools to easily tackle the problem, and no useful information to submit in the required bug report. -- http://mail.python.org/mailman/listinfo/python-list
Re: How to pop the interpreter's stack?
On Dec 26, 11:09 am, kj no.em...@please.post wrote: In mailman.280.1293287106.6505.python-l...@python Robert Kern robert.k...@gmail.com writes: Except that the *caller* never gets the traceback (unless if it deliberately inspects the stack for some metaprogramming reason). It gets the exception, and that is the same no matter what you do. The developer/user gets the traceback, and those implementation details *are* often important to them. Just look at what Python shows you if you pass the wrong number of arguments to a function: def spam(x, y, z): pass ... spam(1, 2) Traceback (most recent call last): File stdin, line 1, in module TypeError: spam() takes exactly 3 arguments (2 given) That's it. The traceback stops at the point of the error. Python doesn't show you all the underlying C-coded machinery that went into detecting the error and emitting the error message. *No one* needs this information at this point. All I'm saying is that I want to do the same thing with my argument validation code as Python does with its argument validation code: keep it out of sight. When my argument validation code fires an exception ***there's no bug in **my** code***. It's doing exactly what it's supposed to do. Therefore, there's no need for me to debug anything, and certainly no need for me to inspect the traceback all the way to the exception. The bug is in the code that called my function with the wrong arguments. The developer of that code has no more use for seeing the traceback all the way to where my code raises the exception than I have for seeing the traceback of Python's underlying C code when I get an error like the one shown above. Python makes no attempt to hide its machinery in tracebacks (that I'm aware of); in fact stack points from internal Python functions, classes, and modules appear in tracebacks all the time. The reason you don't see traceback lines for Python's argument validation is it's written in C. 
If it bothers you that much, you're welcome to write your own argument validation in C, too. Carl Banks -- http://mail.python.org/mailman/listinfo/python-list
Re: How to pop the interpreter's stack?
On Dec 25, 2:49 pm, Steven D'Aprano steve +comp.lang.pyt...@pearwood.info wrote: On Sat, 25 Dec 2010 09:17:27 -0500, Robert Kern wrote: On 12/24/10 5:14 PM, Ethan Furman wrote: There are also times when I change the exception being raised to match what python expects from that type of object -- for example, from WhatEverException to KeyError for a dict-like object. So in this regard I agree with Steven. Steven isn't arguing that particular point here, nor is anyone arguing against it. Emphasis on *here*. You will note that in Python 3, if you raise an exception inside an except block, both the *original* and the new exception are printed. This is great for fixing bugs inside except blocks, but terribly disruptive for catching one error and raising another error in its place, e.g.:

    try:
        x+0
    except (ValueError, TypeError) as e:
        # x is not a numeric value, e.g. a string or a NAN.
        raise MyError('x is not a number')

The explicit raise is assumed to indicate a bug in the except block, and the original exception is printed as well. But that's a separate issue from what is being discussed here. What we're discussing here is the idea that a function should be able to delegate work to private subroutines without the caller being aware of that fact. When you return a value, the caller doesn't see the internal details of how you calculated the value, but if you deliberately raise an exception, the caller does. Often this is the right thing to do, but sometimes it isn't. E.g. you can't delegate input validation to a subroutine and raise inside the subroutine without obfuscating the traceback.
    >>> import re
    >>> re.compile(r"(")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python2.6/re.py", line 190, in compile
        return _compile(pattern, flags)
      File "/usr/lib/python2.6/re.py", line 245, in _compile
        raise error, v # invalid expression
    sre_constants.error: unbalanced parenthesis

OHMYGOD HOW DARE the standard library allow the traceback to list an internal function that does input validation! Carl Banks -- http://mail.python.org/mailman/listinfo/python-list
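On the Python 3 behaviour quoted in this message, where both the original and the replacement exception are printed: later Python versions (3.3 onwards) added "raise ... from None" to mark a replacement as deliberate, which postdates this thread. A minimal sketch, where MyError comes from the quoted example and as_number is a hypothetical wrapper added for illustration:

```python
class MyError(Exception):
    pass

def as_number(x):
    try:
        return x + 0
    except (ValueError, TypeError):
        # "from None" marks the original exception as deliberately replaced,
        # so the default traceback display prints only MyError.
        raise MyError('x is not a number') from None

try:
    as_number('spam')
except MyError as exc:
    caught = exc

# The original TypeError is still recorded, but flagged as suppressed.
print(caught.__suppress_context__, type(caught.__context__).__name__)
```

This only affects the chained-exception display; it does not remove any frames from the traceback of MyError itself.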
Re: IDLE GUI not working
On Sun, Dec 26, 2010 at 4:33 PM, Python Programming pytho...@gmail.comwrote: Hello all, Newbie here so go easy on me. I've been trying to get the IDLE GUI to work on my machine, but have been unsuccessful so far. I have an IBM Thinkpad running Windows XP and it has an older version of Python running (2.2, I believe). When I try to use the shortcut to open the IDLE GUI nothing happens (not even a process running in task manager). When I use the cmd line to try to open the IDLE GUI, I get this error message: C:\C:\python27\python C:\python27\Lib\idlelib\idle.py Traceback (most recent call last): File C:\python27\Lib\idlelib\idle.py, line 11, in module idlelib.PyShell.main() File C:\python27\Lib\idlelib\PyShell.py, line 1389, in main root = Tk(className=Idle) File C:\python27\lib\lib-tk\Tkinter.py, line 1685, in __init__ self.tk = _tkinter.create(screenName, baseName, className, interactive, want objects, useTk, sync, use) _tkinter.TclError: Can't find a usable init.tcl in the following directories: {C:\IBMTOOLS\Python22\tcl\tcl8.4} C:/IBMTOOLS/Python22/tcl/tcl8.5 C:/python2 7/lib/tcl8.5 C:/lib/tcl8.5 C:/lib/tcl8.5 C:/library C:/library C:/ tcl8.5.2/libra ry C:/tcl8.5.2/library C:/IBMTOOLS/Python22/tcl/tcl8.4/init.tcl: version conflict for package Tcl: ha ve 8.5.2, need exactly 8.4 version conflict for package Tcl: have 8.5.2, need exactly 8.4 while executing package require -exact Tcl 8.4 (file C:/IBMTOOLS/Python22/tcl/tcl8.4/init.tcl line 19) invoked from within source C:/IBMTOOLS/Python22/tcl/tcl8.4/init.tcl (uplevel body line 1) invoked from within uplevel #0 [list source $tclfile] This probably means that Tcl wasn't installed properly. As I understand this, python 2.7 is looking in a directory or path specified by the older, python 2.2 for Tcl. I have tried to unset this path by entering this into the cmd line: --- C:\set TCL_LIBRARY= C:\set TK_LIBRARY= C:\C:\Python27\python.exe C:\Python27\Lib\idlelib\idle.py --- No dice. 
I tried to enter this information as a new pythonpath in the environment variables. No dice. I've tried, in vain, to get this thing working, but most of the other explanations are way over my head (I'm pretty new to programming and digging around the programming guts of computers). It seems like a fairly common problem, but I haven't gotten a good answer from either the official python help boards or elsewhere. I was hoping that someone here could give me an easy to understand way to make the IDLE GUI work. Thanks in advance.

Wow, I wrote a note to this list so closely resembling yours I had to read carefully to ensure it wasn't my own. You have a PATH problem - these machines ship with Tcl installed and that's creating this issue. In the environment variables dialog there is the standard Path variable, but if you'll notice, there is also a scrollbar to the right. Scroll down and find the additional variables that need fixing - PYTHONPATH, TCL_LIBRARY, and TK_LIBRARY all need to be redirected to your current install. If you fix all of these, your problem should be solved and you'll be on your way; at least that was my exact issue on a Thinkpad running XP. Best of luck! Grant -- http://mail.python.org/mailman/listinfo/python-list
User input masks - Access Style
Is there anyay to use input masks in python? Similar to the function found in access where a users input is limited to a type, length and format. So in my case I want to ensure that numbers are saved in a basic format. 1) Currency so input limited to 000.00 eg 1.00, 2.50, 13.80 etc For sports times that is time duration not a system or date times should I assume that I would need to calculate a user input to a decimal number and then recalculate it to present it to user? So an example, sorry. import time #not sure if this is any use minute = input(How many minutes: ) seconds = input(How many seconds: ) Hundredths = input(how many Hundredths: ) # convert user input MyTime = (minute/60)+(seconds)+(Hundredths/1800) #Display to user assuming i had written a name and user # had retrieved it print([User], your time was), (MyTime/60:MyTime(MyTime-((MyTime/ 60)*60).(MyTime-(MyTime0))) ) -- http://mail.python.org/mailman/listinfo/python-list
Re: __delitem__ feature
On Sun, 26 Dec 2010 18:49:55 +, kj wrote: In mailman.302.1293387041.6505.python-l...@python.org Ian Kelly ian.g.ke...@gmail.com writes: On 12/26/2010 10:53 AM, kj wrote: P.S. If you uncomment the commented-out line, and comment out the last line of the __init__ method (which installs self._delitem as self.__delitem__) then *all* the deletion attempts invoke the __delitem__ method, and are therefore blocked. FWIW. Because subclasses of builtins only check the class __dict__ for special method overrides, not the instance __dict__. How do you know this? Is this documented? Or is this a case of Monday-night quarterbacking? We know it because it explains the observable facts. It also happens to be documented, but documentation can be wrong or incomplete. The facts are never wrong, since by definition they are the facts. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
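The observable facts referred to here can be reproduced with a short sketch. kj's original class is not shown in this part of the thread, so the class name Guarded and the helper _refuse are hypothetical; the sketch only illustrates that implicit special-method lookup goes to the type, not the instance.

```python
class Guarded(dict):
    def __init__(self, *args, **kwargs):
        dict.__init__(self, *args, **kwargs)
        # Stored in the instance __dict__; the del statement never consults it.
        self.__delitem__ = self._refuse

    def _refuse(self, key):
        raise TypeError('deletion is not allowed')

d = Guarded(a=1, b=2)

del d['a']                    # uses dict.__delitem__ found on the type
deleted_implicitly = 'a' not in d

try:
    d.__delitem__('b')        # explicit attribute access finds the instance attribute
    blocked_explicitly = False
except TypeError:
    blocked_explicitly = True

print(deleted_implicitly, blocked_explicitly)
```

Defining __delitem__ in the class body instead of assigning it on the instance is what makes "del d[key]" go through the override.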
Re: How to pop the interpreter's stack?
On Sun, 26 Dec 2010 09:15:32 -0800, Ethan Furman wrote: Steven D'Aprano wrote: Right. But I have thought of a clever trick to get the result KJ was asking for, with the minimum of boilerplate code. Instead of this: def _pre_spam(args): if condition(args): raise SomeException(message) if another_condition(args): raise AnotherException(message) if third_condition(args): raise ThirdException(message) def spam(args): _pre_spam(args) do_useful_work() you can return the exceptions instead of raising them (exceptions are just objects, like everything else!), and then add one small piece of boilerplate to the spam() function: def _pre_spam(args): if condition(args): return SomeException(message) if another_condition(args): return AnotherException(message) if third_condition(args): return ThirdException(message) def spam(args): exc = _pre_spam(args) if exc: raise exc do_useful_work() -1 You failed to mention that cleverness is not a prime requisite of the python programmer -- in fact, it's usually frowned upon. The big problem with the above code is you are back to passing errors in-band, pretty much completely defeating the point of have an out-of-band channel. How is that any worse than making _pre_spam() a validation function that returns a bool? def spam(args): flag = _pre_spam(args) if flag: raise SomeException() do_useful_work() Is that also frowned upon for being too clever? -- Steven -- http://mail.python.org/mailman/listinfo/python-list
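A runnable sketch of the return-the-exception trick described in this message, with the traceback frames inspected to show its effect. The concrete conditions (an int check and a sign check) and the stand-in for do_useful_work() are assumptions made for illustration.

```python
import traceback

def _pre_spam(arg):
    # Return, rather than raise, so this frame never enters the traceback.
    if not isinstance(arg, int):
        return TypeError('arg must be an int')
    if arg < 0:
        return ValueError('arg must be non-negative')
    return None

def spam(arg):
    exc = _pre_spam(arg)
    if exc is not None:
        raise exc
    return arg * 2  # stand-in for do_useful_work()

try:
    spam(-1)
except ValueError as err:
    # Collect the function names of the frames recorded in the traceback.
    frames = [frame.name for frame in traceback.extract_tb(err.__traceback__)]
print(frames)
```

The recorded frames end at spam; _pre_spam does not appear because the exception was only constructed there, not raised. Ethan's objection still applies: the error travels through the return value before it is raised.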
Re: How to pop the interpreter's stack?
On Sun, 26 Dec 2010 17:12:50 -0800, misno...@gmail.com wrote: On Dec 23, 3:22 am, Steven D'Aprano steve +comp.lang.pyt...@pearwood.info wrote: You seem to have completely missed that there will be no bug report, because this isn't a bug. (Or if it is a bug, the bug is elsewhere, external to the function that raises the exception.) It is part of the promised API. The fact that the exception is generated deep down some chain of function calls is irrelevant. While the idea of being able to do this (create a general validation exception) sounds mildly appealing at first, what happens when the module implementing this documented API and documented error, has a bug? It seems that the user, or even module developer in the midst of writing, would now have no tools to easily tackle the problem, and no useful information to submit in the required bug report. That's very true, but the same applies to *any* use of encapsulation. Any time you hide information, you hide information (duh!). This doesn't stop us from doing this: def public(x): if condition: return _private(x) elif other_condition: return _more_private(x+1) else: return _yet_another_private(x-1) If you call public(42), and get the wrong answer, it's a bug, but the source of the bug is hidden from the caller. If you have access to the source code, you can work out where the bug lies (which of the three private functions is buggy?) given the argument, but the return result itself does not expose any information about where the bug lies. This is considered an unavoidable but acceptable side-effect of an otherwise desirable state of affairs: information hiding and encapsulation. The caller being unaware of where and how the result is calculated is considered a good thing, and the fact that it occasionally adds to the debugging effort is considered such a trivial cost that it normally isn't remarked upon, except by lunatics and die-hard fans of spaghetti code using GOTO. But I repeat myself. 
Why should exceptions *necessarily* be different? As I've repeatedly acknowledged, for an unexpected exception (a bug), the developer needs all the help he can get, and the current behaviour is the right way to do it. You won't hear me argue differently. But for a documented, explicit, expected, deliberate exception, Python breaks encapsulation by exposing the internal details of any internal subroutines used to generate that exception. This leads to messy tracebacks that obscure the source of bugs in the caller's code: import re re.compile(r() Traceback (most recent call last): File stdin, line 1, in module File /usr/lib/python2.6/re.py, line 190, in compile return _compile(pattern, flags) File /usr/lib/python2.6/re.py, line 245, in _compile raise error, v # invalid expression sre_constants.error: unbalanced parenthesis I think critics of my position have forgotten what it's like to learning the language. One of the most valuable skills to learn is to *ignore parts of the traceback* -- a skill that takes practice and familiarity with the library or function in use. To those who are less familiar with the function, it can be very difficult to determine which parts of the traceback are relevant and which aren't. In this case, the caller has nothing to do with _compile, and the traceback looks like it's an internal bug in a subroutine, when in fact it is actually due to bad input. The experienced developer learns (by trial and error, possibly) to ignore nearly half of the error message in this case. 
In principle, the traceback could be roughly half as big, which means the caller would find it half as difficult to read and understand: re.compile(r() Traceback (most recent call last): File stdin, line 1, in module File /usr/lib/python2.6/re.py, line 190, in compile raise error, v # invalid expression sre_constants.error: unbalanced parenthesis With a one-to-one correspondence between the function called, and the function reporting an error, it is easier to recognise that the error lies in the input rather than some internal error in some subroutine you have nothing to do with. Unfortunately there's no straightforward way to consistently get this in Python without giving up the advantages of delegating work to subroutines. It need not be that way. This could, in principle, be left up to the developer of the public function to specify (somehow!) that some specific exceptions are expected, and should be treated as coming from public() rather than from some internal subroutine. I don't have a concrete proposal for such, although I did suggest a work-around. I expected disinterest (I don't see the point). I didn't expect the level of hostility to the idea that exceptions should (if and when possible) point to the source of the error rather than some accidental implementation- specific subroutine. Go figure. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
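For what it is worth, Python 3 allows the public function to do something close to this by hand: catch the expected exception, clear its recorded traceback, and re-raise it so that reporting starts from the public function. This is a sketch under that assumption, not the work-around proposed in the thread; _validate and public are hypothetical names.

```python
import traceback

def _validate(x):
    if x == 'bad input':
        raise ValueError('bad input')

def public(x):
    try:
        _validate(x)
    except ValueError as exc:
        # Drop the frames recorded so far; re-raising adds this frame back.
        raise exc.with_traceback(None) from None
    return x

try:
    public('bad input')
except ValueError as err:
    names = [frame.name for frame in traceback.extract_tb(err.__traceback__)]
print(names)
```

The same caveat raised by others in the thread applies: if _validate itself has a bug that raises ValueError, the evidence of where it happened is discarded as well.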
Re: string u'hyv\xe4' to file as 'hyvä'
Hello, STILL do not work. WHAT to be done. import codecs item=u'hyv\xe4' F=codecs.open('/opt/finnish.txt', 'w+', 'utf8') F.writelines(item.encode('utf8')) F.close() In file i find 'hyv\xe4' instead of hyvä. (Sorry for mistyping in previous letter about 'latin-1'. I was making all possible combinations, when normal example syntax did not work, before writting to this forum.) regards, gintare On 27 Gruo, 00:43, gintare g.statk...@gmail.com wrote: Could you please help me with special characters saving to file. I need to write the string u'hyv\xe4' to file. I would like to open file and to have line 'hyvä' import codecs word= u'hyv\xe4' F=codecs.open(/opt/finnish.txt, 'w+','Latin-1') F.writelines(item.encode('Latin-1')) F.writelines(item.encode('utf8')) F.writelines(item) F.close() All three writelines gives the same result in finnish.txt: hyv\xe4 i would like to find 'hyvä'. regards, gintare -- http://mail.python.org/mailman/listinfo/python-list
Re: string u'hyv\xe4' to file as 'hyvä'
gintare g.statk...@gmail.com wrote in message news:83dc3076-9ddc-42bd-8c33-6af96b263...@l32g2000yqc.googlegroups.com... Hello, STILL do not work. WHAT to be done. import codecs item=u'hyv\xe4' F=codecs.open('/opt/finnish.txt', 'w+', 'utf8') F.writelines(item.encode('utf8')) F.close() In file i find 'hyv\xe4' instead of hyvä. When you open a file with codecs.open(), it expects Unicode strings to be written to the file. Don't encode them again. Also, .writelines() expects a list of strings. Use .write(): import codecs item=u'hyv\xe4' F=codecs.open('/opt/finnish.txt', 'w+', 'utf8') F.write(item) F.close() An additional comment, if you save the script in UTF8, you can inform Python of that fact with a special comment, and actually use the correct characters in your string constants (ä instead of \xe4). Make sure to use a text editor that can save in UTF8, or use the correct coding comment for whatever encoding in which you save the file. # coding: utf8 import codecs item=u'hyvä' F=codecs.open('finnish.txt', 'w+', 'utf8') F.write(item) F.close() -Mark -- http://mail.python.org/mailman/listinfo/python-list
Re: User input masks - Access Style
On 2010-12-27, flebber flebber.c...@gmail.com wrote:

Is there any way to use input masks in Python? Similar to the function found in Access, where a user's input is limited to a type, length and format. So in my case I want to ensure that numbers are saved in a basic format. 1) Currency, so input limited to 000.00, e.g. 1.00, 2.50, 13.80 etc.

Some GUIs provide this functionality or provide callbacks for validation functions that can determine the validity of the input. I don't know of any modules that provide formatted input in a terminal. Most terminal input functions just read from stdin (in this case a buffered line) and output that as a string. It is easy enough to validate whether terminal input is in the proper format. Your example time code might look like:

    import re
    import sys

    # get the input
    print("Please enter time in the format 'MM:SS:HH': ", end="")
    timeInput = input()

    # validate the input is in the correct format (usually this would be in
    # a loop that continues until the user enters acceptable data)
    if re.match(r'''^[0-9]{2}:[0-9]{2}:[0-9]{2}$''', timeInput) == None:
        print("I'm sorry, your input is improperly formatted.")
        sys.exit(1)

    # break the input into its components
    components = timeInput.split(":")
    minutes = int(components[0])
    seconds = int(components[1])
    microseconds = int(components[2])

    # output the time
    print("Your time is: " + "%02d" % minutes + ":" + "%02d" % seconds + ":" +
          "%02d" % microseconds)

Currency works the same way, validating it against: r'''[0-9]+\.[0-9]{2}'''

For sports times, that is, time durations rather than system or date times, should I assume that I would need to convert the user input to a decimal number and then convert it back to present it to the user?

I am not sure what you are trying to do or asking. Python provides time, date, datetime, and timedelta objects that can be used for date/time calculations, locale based formatting, etc. What you use, if any, will depend on what you are actually trying to accomplish. Your example doesn't really show you doing much with the time, so it is difficult to give you any concrete recommendations.
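As a rough illustration of the timedelta suggestion, the validated 'MM:SS:HH' string could be turned into a duration object. This is only a sketch: it assumes that HH means hundredths of a second, and the helper name parse_duration is made up for the example.

```python
import re
from datetime import timedelta

# Same pattern as in the message, with groups added to capture the fields.
TIME_RE = re.compile(r"^([0-9]{2}):([0-9]{2}):([0-9]{2})$")

def parse_duration(text):
    """Return a timedelta for 'MM:SS:HH' input, or None if it is malformed.

    Assumption: the last field is hundredths of a second.
    """
    m = TIME_RE.match(text)
    if m is None:
        return None
    minutes, seconds, hundredths = (int(g) for g in m.groups())
    return timedelta(minutes=minutes, seconds=seconds,
                     milliseconds=hundredths * 10)

duration = parse_duration("01:23:45")
print(duration, duration.total_seconds())
```

A timedelta supports arithmetic and comparison, so averaging or ranking times does not need a hand-rolled decimal conversion.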
Re: How to pop the interpreter's stack?
Steven D'Aprano wrote:

... I think critics of my position have forgotten what it's like to be learning the language. One of the most valuable skills to learn is to *ignore parts of the traceback* -- a skill that takes practice and familiarity with the library or function in use. To those who are less familiar with the function, it can be very difficult to determine which parts of the traceback are relevant and which aren't. In this case, the caller has nothing to do with _compile, and the traceback looks like it's an internal bug in a subroutine, when in fact it is actually due to bad input. The experienced developer learns (by trial and error, possibly) to ignore nearly half of the error message in this case.

And it can still be some work to figure out which parts of the traceback are relevant, even after a couple of years...

... It need not be that way. This could, in principle, be left up to the developer of the public function to specify (somehow!) that some specific exceptions are expected, and should be treated as coming from public() rather than from some internal subroutine. I don't have a concrete proposal for such, although I did suggest a work-around. I expected disinterest ("I don't see the point"). I didn't expect the level of hostility to the idea that exceptions should (if and when possible) point to the source of the error rather than some accidental implementation-specific subroutine. Go figure.

My objection is not to the idea, but to the ad-hoc methods that would currently be required. Resorting to passing exceptions in-band is a step backwards. If Python had a way to specify what level an exception should be reported from, I might be interested. At this point, if sparing the user one level of traceback was that high a priority to me, I would make the validation be either a decorator, or have the validation *be* the main routine, and the *real work* routine be the private one.

~Ethan~
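The decorator variant Ethan mentions might look roughly like the sketch below. All of the names (validated, public, _private) are hypothetical, and the string check merely stands in for whatever validation the real function needs.

```python
import functools
import traceback

def validated(func):
    """Hypothetical decorator: reject bad input before the real work starts."""
    @functools.wraps(func)
    def wrapper(pattern):
        if not isinstance(pattern, str):
            # Raised here, before any private helper has been entered.
            raise TypeError("pattern must be a string, not %s"
                            % type(pattern).__name__)
        return func(pattern)
    return wrapper

@validated
def public(pattern):
    return _private(pattern)

def _private(pattern):
    # Stand-in for the internal subroutine doing the real work.
    return pattern.upper()

try:
    public(42)
except TypeError as exc:
    # Collect the function names that appear in the traceback.
    frames = [frame.name for frame in traceback.extract_tb(exc.__traceback__)]
    print(frames)
```

The traceback for bad input then ends in the wrapper and never mentions _private. It still shows one extra frame, so this narrows the traceback rather than hiding it.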
Re: User input masks - Access Style
On 2010-12-27, Tim Harig user...@ilthio.net wrote: ... if re.match(r'''^[0-9]{2}:[0-9]{2}:[0-9]{2}$''', timeInput) == None: [SNIP] Currency works the same way using validating it against: r'''[0-9]+\.[0-9]{2}''' Sorry, you need to check to make sure that there are no trailing characters as in the example above. Checking the beginning is not actually necessary with match(). r'''^[0-9]+\.[0-9]{2}$''' -- http://mail.python.org/mailman/listinfo/python-list
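The point about trailing characters can be checked directly. The sketch below compares the unanchored pattern with re.fullmatch(), which requires the whole string to match and so needs neither ^ nor $; the helper name is_currency is made up for the example.

```python
import re

CURRENCY_RE = re.compile(r"[0-9]+\.[0-9]{2}")

def is_currency(text):
    """True only if the entire string looks like 13.80, 2.50, and so on."""
    return CURRENCY_RE.fullmatch(text) is not None

# match() only anchors at the start, so extra trailing digits slip through.
unanchored_accepts_trailing = CURRENCY_RE.match("13.805") is not None

for sample in ("13.80", "13.805", "x2.50"):
    print(sample, is_currency(sample))
print("unanchored match accepts 13.805:", unanchored_accepts_trailing)
```

One small difference from the `$` version: `$` also matches just before a trailing newline, while fullmatch() rejects input that still carries one.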
Re: User input masks - Access Style
On Dec 27, 6:01 pm, Tim Harig user...@ilthio.net wrote:

On 2010-12-27, flebber flebber.c...@gmail.com wrote:

Is there any way to use input masks in Python? Similar to the function found in Access, where a user's input is limited to a type, length and format. So in my case I want to ensure that numbers are saved in a basic format. 1) Currency, so input limited to 000.00, e.g. 1.00, 2.50, 13.80 etc.

Some GUIs provide this functionality or provide callbacks for validation functions that can determine the validity of the input. I don't know of any modules that provide formatted input in a terminal. Most terminal input functions just read from stdin (in this case a buffered line) and output that as a string. It is easy enough to validate whether terminal input is in the proper format. Your example time code might look like:

    import re
    import sys

    # get the input
    print("Please enter time in the format 'MM:SS:HH': ", end="")
    timeInput = input()

    # validate the input is in the correct format (usually this would be in
    # a loop that continues until the user enters acceptable data)
    if re.match(r'''^[0-9]{2}:[0-9]{2}:[0-9]{2}$''', timeInput) == None:
        print("I'm sorry, your input is improperly formatted.")
        sys.exit(1)

    # break the input into its components
    components = timeInput.split(":")
    minutes = int(components[0])
    seconds = int(components[1])
    microseconds = int(components[2])

    # output the time
    print("Your time is: " + "%02d" % minutes + ":" + "%02d" % seconds + ":" +
          "%02d" % microseconds)

Currency works the same way, validating it against: r'''[0-9]+\.[0-9]{2}'''

For sports times, that is, time durations rather than system or date times, should I assume that I would need to convert the user input to a decimal number and then convert it back to present it to the user?

I am not sure what you are trying to do or asking. Python provides time, date, datetime, and timedelta objects that can be used for date/time calculations, locale based formatting, etc. What you use, if any, will depend on what you are actually trying to accomplish. Your example doesn't really show you doing much with the time, so it is difficult to give you any concrete recommendations.

Yes, you are right, I should have clarified. The time is a duration over a distance, so it's a speed measure. Ultimately I will need to store the times, so I may need to use something like SQLAlchemy, but I am nowhere near that advanced. I do know that most DBs (MySQL, PostgreSQL, etc.) don't support time as a duration as such, so I will probably need to store it as a decimal and convert it back for the user.
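The store-as-decimal plan in the last paragraph could be sketched as below. The function names are invented for the example, and it again assumes that the third field is hundredths of a second.

```python
from decimal import Decimal

def to_seconds(minutes, seconds, hundredths):
    """Collapse a duration into one exact decimal number of seconds."""
    return Decimal(minutes * 60 + seconds) + Decimal(hundredths) / Decimal(100)

def to_display(total):
    """Turn stored decimal seconds back into 'MM:SS:HH' for the user."""
    whole = int(total)
    hundredths = int((total - whole) * 100)
    minutes, seconds = divmod(whole, 60)
    return "%02d:%02d:%02d" % (minutes, seconds, hundredths)

stored = to_seconds(1, 23, 45)
print(stored, to_display(stored))
```

Using Decimal rather than float keeps the hundredths exact, which matters if the values are later compared or written to a DECIMAL column.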
[issue7436] Define 'object with assignable attributes'
Changes by Daniel Urban urban.dani...@gmail.com: -- nosy: +durban ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue7436 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue9196] Improve docs for string interpolation %s re Unicode strings
Craig McQueen pyt...@craig.mcqueen.id.au added the comment: I should be able to attach my test code. But it is at my work, and I'm on holidays for 2 more weeks. Sorry 'bout that! I do assume that Python 3 greatly simplifies this. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue9196 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10769] ast: provide more useful range information
Sven Brauch svenbra...@googlemail.com added the comment: Hi, yeah Terry, that's exactly what most people whom I talked about this said (me too). Anyway, here's the patch which -- in my opinion -- fixes this behavior:

--- python-orig/Python/ast.c 2010-10-19 03:22:07.0 +0200
+++ python-ast-fix/Python/ast.c 2010-12-26 13:25:48.0 +0100
@@ -1742,8 +1742,6 @@
         tmp = ast_for_trailer(c, ch, e);
         if (!tmp)
             return NULL;
-        tmp->lineno = e->lineno;
-        tmp->col_offset = e->col_offset;
         e = tmp;
     }
     if (TYPE(CHILD(n, NCH(n) - 1)) == factor) {

The offsets for foo.bar.baz before the patch:

[1, 0, _ast.Attribute] [1, 0, _ast.Attribute, 'baz'] [1, 0, _ast.Name, 'bar'] [1, 0, 'foo']

... and after the patch:

[1, 0, _ast.Attribute] [1, 7, _ast.Attribute, 'baz'] [1, 3, _ast.Name, 'bar'] [1, 0, 'foo']

It would really be great if that could be applied. Best regards, Sven

-- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10769 ___
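To see which offsets a given interpreter reports for the expression in this report, the tree can be walked with the standard ast module. This is only an inspection sketch and makes no claim about what the numbers should be.

```python
import ast

tree = ast.parse("foo.bar.baz")

rows = []
for node in ast.walk(tree):
    if isinstance(node, ast.Attribute):
        rows.append(("Attribute", node.attr, node.lineno, node.col_offset))
    elif isinstance(node, ast.Name):
        rows.append(("Name", node.id, node.lineno, node.col_offset))

# One row per node: kind, identifier, line number, column offset.
for row in rows:
    print(row)
```

Comparing the printed columns with the two listings above shows whether the interpreter follows the behaviour before or after the proposed patch.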
[issue10576] Add a progress callback to gcmodule
Lukas Lueg lukas.l...@gmail.com added the comment: Collection may re-occur at any time; there is no promise to the callback code. However, the callback can disable the gc, preventing further collection. I don't think we need the other callbacks to be informed. As the callbacks are worked through in the order they were registered, whoever comes first is served first. Returning True from the callback is merely an "I don't mind if gc happens now"... -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10576 ___
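For comparison, the hook that later shipped in CPython 3.3 is the gc.callbacks list, whose entries are called with a phase and an info dict. It is not the patch discussed in this message; in particular, the return value of a callback there has no veto meaning. A minimal sketch of registering one:

```python
import gc

events = []

def on_gc(phase, info):
    # phase is "start" or "stop"; info carries details such as the generation.
    events.append((phase, info["generation"]))

gc.callbacks.append(on_gc)
try:
    gc.collect()  # a full collection, so generation 2 is reported
finally:
    gc.callbacks.remove(on_gc)

print(events)
```

Callbacks run in list order, which matches the "whoever comes first is served first" behaviour described above.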
[issue2504] Add gettext.pgettext() and variants support
Changes by Felix Schwarz felix.schw...@web.de: -- nosy: +Felix Schwarz ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue2504 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10773] Building C and C++ Extensions on Windows documentation shows 2.x way of initializing module
New submission from Thorsten Behrens sbehr...@gmx.li: The documentation titled "Building C and C++ Extensions on Windows" at http://docs.python.org/py3k/extending/windows.html shows a Python 2.x way of handling static type object initializers, to wit:

If your module creates a new type, you may have trouble with this line:

    PyVarObject_HEAD_INIT(&PyType_Type, 0)

Static type object initializers in extension modules may cause compiles to fail with an error message like “initializer not a constant”. This shows up when building DLL under MSVC. Change it to:

    PyVarObject_HEAD_INIT(NULL, 0)

and add the following to the module initialization function:

    MyObject_Type.ob_type = &PyType_Type;

That last line will not function in Python 3.x. However, PyType_Ready will fill in the ob_type field if it is empty, if I understand PyType_Ready correctly. Therefore, the last few lines of this documentation snippet can become:

and add the following to the module initialization function:

    if (PyType_Ready(&MyObject_Type) < 0)
        return NULL;

-- assignee: d...@python components: Documentation messages: 124667 nosy: d...@python, thorsten.behrens priority: normal severity: normal status: open title: Building C and C++ Extensions on Windows documentation shows 2.x way of initializing module versions: Python 3.1 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10773 ___
[issue6720] multiprocessing logging
Éric Araujo mer...@netwok.org added the comment: This is either out of date (2.5 doesn’t get bugfixes any more) or invalid (concerns a backport of multiprocessing outside of the stdlib). -- nosy: +eric.araujo resolution: -> rejected stage: -> committed/rejected status: open -> closed type: crash -> behavior ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue6720 ___
[issue10774] test_logging leaks temp files
New submission from Éric Araujo mer...@netwok.org: After a test_logging run in 3.2, I get stray files in $TMP. I think some test is not using the right mixin or addCleanup or tearDown. Less importantly, some recently added docstrings produce unwanted output on the console (the method name is clear enough, the redundant docstring is redundant); attached output turns them into comments. -- assignee: vinay.sajip components: Tests files: test_logging-nits.diff keywords: patch messages: 124669 nosy: eric.araujo, vinay.sajip priority: normal severity: normal stage: needs patch status: open title: test_logging leaks temp files type: behavior versions: Python 3.2 Added file: http://bugs.python.org/file20168/test_logging-nits.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10774 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10756] Error in atexit._run_exitfuncs [...] Exception expected for value, str found
Éric Araujo mer...@netwok.org added the comment: Looks good to me. I’d just move the raising function into the test method (no need to update the patch). -- nosy: +eric.araujo stage: -> patch review ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10756 ___
[issue10770] zipinfo - fix of a typo in the doc
Éric Araujo mer...@netwok.org added the comment: Fixed, thanks. Note that Georg and I follow the docs mailing list, so there is no need to open a report for each message. -- nosy: +eric.araujo resolution: -> fixed stage: patch review -> committed/rejected status: open -> closed ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10770 ___
[issue10774] test_logging leaves temp files
Changes by Georg Brandl ge...@python.org: -- title: test_logging leaks temp files -> test_logging leaves temp files ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10774 ___
[issue5258] addpackage in site.py fails hard on badly formed .pth files
R. David Murray rdmur...@bitdance.com added the comment: Here is a revised patch with tests. -- Added file: http://bugs.python.org/file20169/site_pth_exceptions.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5258 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10756] Error in atexit._run_exitfuncs [...] Exception expected for value, str found
Georg Brandl ge...@python.org added the comment: +1. -- nosy: +georg.brandl ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10756 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5258] addpackage in site.py fails hard on badly formed .pth files
Georg Brandl ge...@python.org added the comment: LGTM. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5258 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10775] assertRaises as a context manager should accept a 'msg' keyword argument.
New submission from R. David Murray rdmur...@bitdance.com: assertRaises used as a method can't take a msg keyword argument because all args and keywords are passed to the callable. But in context manager form it could, and this can be useful. See, for example, issue 3583. -- keywords: easy messages: 124675 nosy: michael.foord, r.david.murray, rhettinger priority: normal severity: normal stage: needs patch status: open title: assertRaises as a context manager should accept a 'msg' keyword argument. type: feature request versions: Python 3.3 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10775 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
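The requested behaviour is documented for unittest from Python 3.3 onward: in context-manager form, assertRaises accepts a msg keyword. The sketch below shows how the message surfaces when nothing is raised; the hint text is invented for the example.

```python
import unittest

# A bare TestCase instance is enough to call the assert methods directly.
case = unittest.TestCase()

try:
    with case.assertRaises(ValueError, msg="hint: check the DNS setup"):
        pass  # nothing is raised, so the assertion fails
except AssertionError as exc:
    message = str(exc)

print(message)
```

The custom text is appended to the standard "not raised" message, which is what makes it useful for explaining environment-dependent failures such as the one in issue 3583.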
[issue3583] test_urllibnet.test_bad_address() fails when using OpenDNS
R. David Murray rdmur...@bitdance.com added the comment: I think the best we can do here is add a message explaining that the error may be due to a broken DNS server (one with a wildcard DNS record for all non-existent top level domains). However, assertRaises, even in context manager form, doesn't take a msg argument (yet). I've opened an issue with a feature request to fix that and made it a dependency of this issue.

Note that the test uses a domain name ending in .d, and for a while before that used '.invalid', so the test should not fail if the ISP is only capturing valid top level domains with wildcards, something that seems to be far more common than catching invalid domains.

-- dependencies: +assertRaises as a context manager should accept a 'msg' keyword argument. nosy: +r.david.murray priority: normal -> low stage: unit test needed -> needs patch versions: +Python 3.3 -Python 2.6, Python 3.1 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue3583 ___
[issue6027] test_xmlrpc_net fails when the ISP returns 302 Found
R. David Murray rdmur...@bitdance.com added the comment: IMO there's no way to fix this. I suggest closing it as invalid, since the problem is a buggy ISP DNS server, and the problem only occurs when time.xmlrpc.com is down. The canonical fix to problems like this is to remove dependency on the external service, but that would presumably defeat the entire purpose of test_xmlrpc_net. So one fix would be to delete this test entirely. Which if we have mock-server xmlrpc tests, might not be out of the question as a solution. Note that the second test in that file tests against the xmlrpc interface of our buildbot master, and that support has been formally removed by upstream and only restored by us...presumably at some point we'll drop support for it too, when it breaks too badly to be easily forward ported. Since there was, if I remember correctly, an extended period when xmlrpc.com's time service was down, simply deleting this test file may in fact be the best move. -- nosy: +r.david.murray versions: +Python 2.7, Python 3.2 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue6027 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue7198] Extraneous newlines with csv.writer on Windows
John Machin sjmac...@users.sourceforge.net added the comment: Skip, I'm WRITING, not reading. Please read the 3.1 documentation for csv.writer. It does NOT mention newline='', and neither does the example. Please fix.

Other problems with the examples:
(1) They encourage a bad habit (open inside the call to reader/writer); good practice is to retain the reference to the file handle (preferably with a with statement) so that it can be closed properly.
(2) delimiter=' ' is very unrealistic.

The documentation for both 2.x and 3.x should be much more explicit about what is needed in open() for csv to work properly and portably:

    2.x read: use mode='rb' -- otherwise fail on Windows
    2.x write: use mode='wb' -- otherwise fail on Windows
    3.x read: use newline='' -- otherwise fail unconditionally(?)
    3.x write: use newline='' -- otherwise fail on Windows

The 2.7 documentation says "If csvfile is a file object, it must be opened with the 'b' flag on platforms where that makes a difference" ... in my experience, people are left asking "what platforms? what difference?"; Windows should be mentioned explicitly.

-- versions: +Python 2.7, Python 3.2, Python 3.3 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue7198 ___
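A Python 3 writer example along the lines John asks for might look like this: the file is opened in a with statement, newline='' is passed to open(), and the default comma delimiter is used. The temporary path and the row contents are assumptions for illustration.

```python
import csv
import os
import tempfile

# Hypothetical output location in a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), "example.csv")

# newline='' stops the file object from translating the '\r\n' that
# csv.writer emits, which is what produces blank lines on Windows.
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["name", "count"])
    writer.writerow(["spam", 1])

with open(path, "rb") as f:
    raw = f.read()

print(raw)
```

With newline='' the bytes on disk are the same on every platform, so the example is portable as well as correct on Windows.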
[issue10769] ast: provide more useful range information
Benjamin Peterson benja...@python.org added the comment: I suggest you mail python-dev or python-ideas. I find it more consistent as it stands now. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10769 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10769] ast: provide more useful range information
Sven Brauch svenbra...@googlemail.com added the comment: Okay, thank you, I'm going to do that. :) Bye, Sven -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10769 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5258] addpackage in site.py fails hard on badly formed .pth files
R. David Murray rdmur...@bitdance.com added the comment: Committed to py3k in r87497, 3.1 in r87499, and 2.7 in r87500. -- resolution: -> fixed stage: patch review -> committed/rejected status: open -> closed ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5258 ___
[issue7198] Extraneous newlines with csv.writer on Windows
R. David Murray rdmur...@bitdance.com added the comment: OK, I'm reopening this as a doc issue, since currently the Python 3 writer docs do not mention newline='', and it is indeed required on Windows. John, would you care to suggest a doc patch? I agree with Skip that "where it makes a difference" is more precise than specifically mentioning Windows, even if less useful in this context. That is how the 'b' mode is documented in the open documentation. To fix the problem with the CSV docs, the recommendation to use 'b' can simply be made unconditional, as it is for newline='' in Python 3. -- components: +Documentation nosy: +r.david.murray resolution: invalid -> stage: -> needs patch status: closed -> open versions: -Python 3.3 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue7198 ___
[issue10296] ctypes catches BreakPoint error on windows 32
Amaury Forgeot d'Arc amaur...@gmail.com added the comment: With a release build of python, I have a similar behavior on both 2.5 and 2.7; the messages are different though: 2.5: WindowsError: [Error -2147483645] One or more arguments are invalid. 2.7: WindowsError: exception: breakpoint encountered And with both versions, a debug build (python_d.exe) triggers the JIT debugger. What do you get exactly? -- nosy: +amaury.forgeotdarc ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10296 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10776] os.utime returns an error on NTFS-3G partition
New submission from Aaron Masover amaso...@gmail.com: I'm working with Anki (http://ankisrs.net/) on a Linux NTFS-3G partition. Anki requires access to modification times in order to handle its backup files. This works fine on my ext3 partition, but on an NTFS partition accessed with NTFS-3G an error is returned:

you...@yinghuochong:/storage/文件/anki/decks$ python -c 'import shutil,os; shutil.copyfile(u"\u6f22\u5b57.anki", "new.anki"); os.utime("new.anki", None)'
you...@yinghuochong:/storage/文件/anki/decks$ python -c 'import shutil,os; shutil.copyfile(u"\u6f22\u5b57.anki", "new.anki"); os.utime("new.anki", (1293402264,1293402264))'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
OSError: [Errno 1] Operation not permitted: 'new.anki'

Note that passing numbers into os.utime returns an error.

-- components: IO messages: 124684 nosy: Aaron.Masover priority: normal severity: normal status: open title: os.utime returns an error on NTFS-3G partition versions: Python 2.6 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10776 ___
[issue9709] test_distutils warning: initfunc exported twice on Windows
Martin v. Löwis mar...@v.loewis.de added the comment: Thorsten: my recommendation is to ignore this issue in your software. It's just a warning. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue9709 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10757] zipfile.write, arcname should be bytestring
Martin v. Löwis mar...@v.loewis.de added the comment: So, in reverse of issue 4871, it appears that in this case the API should reject bytes input with an appropriate error message. -1. It is quite common to produce ill-formed zipfiles, and other ziptools are interpreting them in violation of the format spec. Python needs to support creation of such broken zipfiles, even though it may not be able to read them back. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10757 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10776] os.utime returns an error on NTFS-3G partition
Martin v. Löwis mar...@v.loewis.de added the comment: Why do you think this is a bug in Python? -- nosy: +loewis ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10776 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10777] xml.etree.register_namespace dictionary changed size during iteration
New submission from Peter p.j.a.c...@googlemail.com: The following was found testing the Biopython unit tests (latest code from git) against Python 3.2 beta 2 (compiled from source on 64 bit Linux Ubuntu). Reduced test case:

$ python3.2
Python 3.2b2 (r32b2:87398, Dec 26 2010, 19:01:30) [GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from xml.etree import ElementTree
>>> ElementTree.register_namespace("xs", "http://www.w3.org/2001/XMLSchema")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/peterjc/lib/python3.2/xml/etree/ElementTree.py", line 1071, in register_namespace
    for k, v in _namespace_map.items():
RuntimeError: dictionary changed size during iteration

Suggested fix, replace this:

    def register_namespace(prefix, uri):
        if re.match("ns\d+$", prefix):
            raise ValueError("Prefix format reserved for internal use")
        for k, v in _namespace_map.items():
            if k == uri or v == prefix:
                del _namespace_map[k]
        _namespace_map[uri] = prefix

with something like this:

    def register_namespace(prefix, uri):
        if re.match("ns\d+$", prefix):
            raise ValueError("Prefix format reserved for internal use")
        for k, v in list(_namespace_map.items()):
            if k == uri or v == prefix:
                del _namespace_map[k]
        _namespace_map[uri] = prefix

Note that cElementTree seems to be OK. Note that Python 3.1 was not affected as it didn't even have register_namespace:

$ python3
Python 3.1.2 (r312:79147, Sep 27 2010, 09:57:50) [GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from xml.etree import ElementTree
>>> ElementTree.register_namespace("xs", "http://www.w3.org/2001/XMLSchema")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'register_namespace'

-- components: XML messages: 124688 nosy: maubp priority: normal severity: normal status: open title: xml.etree.register_namespace dictionary changed size during iteration type: crash versions: Python 3.2 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10777 ___
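The failure and the suggested fix can be reproduced without ElementTree. The sketch below uses a plain dict with made-up URIs as a stand-in for _namespace_map; the two helper functions are simplified copies of the loops quoted in the report.

```python
def register_buggy(mapping, prefix, uri):
    # Deleting while iterating over the live view breaks the iterator.
    for k, v in mapping.items():
        if k == uri or v == prefix:
            del mapping[k]
    mapping[uri] = prefix

def register_fixed(mapping, prefix, uri):
    # list() takes a snapshot, so the dict can be changed safely.
    for k, v in list(mapping.items()):
        if k == uri or v == prefix:
            del mapping[k]
    mapping[uri] = prefix

# Hypothetical starting contents: the prefix "xs" is already taken.
start = {"http://example.org/old": "xs", "http://example.org/other": "o"}

try:
    register_buggy(dict(start), "xs", "http://example.org/new")
    buggy_error = None
except RuntimeError as exc:
    buggy_error = str(exc)

fixed = dict(start)
register_fixed(fixed, "xs", "http://example.org/new")

print(buggy_error)
print(fixed)
```

The error only appears when an existing entry actually has to be deleted, which would explain why registering a brand-new prefix and URI does not trigger it.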
[issue10776] os.utime returns an error on NTFS-3G partition
Aaron Masover amaso...@gmail.com added the comment: The Anki author suggested that it was a Python bug. However, that example command works on a drive set with different permissions, so this looks more like an NTFS-3G bug. -- status: open -> closed ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10776 ___
[issue10757] zipfile.write, arcname should be allowed to be a byte string
R. David Murray rdmur...@bitdance.com added the comment: Well, this is the same treat-strings-and-byte-strings-equivalently-in-the-same-API problem that we've had elsewhere. It'll require a bit of refactoring to make it work.

On read, zipfile decodes filenames using cp437 if the utf-8 flag isn't set. Logically, then, a binary string should be encoded using cp437. Since cp437 has a character corresponding to each of the 256 bytes, it seems to me it should be enough to decode a binary filename using cp437 and set a flag that _encodeFilenameFlags would respect and re-encode to cp437 instead of utf-8.

That might produce unexpected results if someone passes in a binary filename encoded in some other character set, but it would be consistent with how zipfiles work and so should be at least as interoperable as zipfiles normally are.

-- title: zipfile.write, arcname should be bytestring -> zipfile.write, arcname should be allowed to be a byte string ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10757 ___
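The claim that cp437 covers all 256 byte values can be checked with Python's own codec. This sketch only demonstrates the lossless round trip the proposal relies on; it does not touch zipfile itself, and the sample file name is made up.

```python
# Every possible byte value, in order.
all_bytes = bytes(range(256))

# Decoding never fails because cp437 assigns a character to each byte...
as_text = all_bytes.decode("cp437")

# ...and encoding the result gives back exactly the original bytes.
round_trip = as_text.encode("cp437")

# A hypothetical binary file name containing a non-ASCII byte.
name = b"caf\x82.txt"
print(len(as_text), round_trip == all_bytes, name.decode("cp437"))
```

Because the mapping is one-to-one, a byte-string arcname decoded this way can always be re-encoded to the same bytes when the archive is written.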
[issue10764] sysconfig and alternative implementations
Amaury Forgeot d'Arc amaur...@gmail.com added the comment: PyPy has exactly the same issue. PyPy's workaround is to have a custom version of sysconfig.py, but this is really a hack. If these config_vars are really determined at compile time, IMO they should be built in the interpreter (in a _sysconfig module?). This would even work on non-posix platforms. -- nosy: +amaury.forgeotdarc ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10764 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue9738] Document the encoding of functions bytes arguments of the C API
STINNER Victor victor.stin...@haypocalc.com added the comment: r87504 documents encodings of error functions. r87505 documents encodings of unicode functions. r87506 documents encodings of AST, compiler, parser and PyRun functions. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue9738 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10778] decoding_fgets() (tokenizer.c) decodes the filename from the wrong encoding
New submission from STINNER Victor victor.stin...@haypocalc.com: decoding_fgets() decodes the input filename from UTF-8 whereas the filename is encoded to the filesystem encoding. PyUnicode_DecodeFSDefault() should be used. decoding_fgets() raises a SyntaxError(Non-UTF-8 code starting with '\xHH' in file xxx on line xxx, but no encoding declared; ...). indenterror() (inconsistent use of tabs and spaces in indentation) and -- components: Interpreter Core, Unicode messages: 124693 nosy: haypo priority: normal severity: normal status: open title: decoding_fgets() (tokenizer.c) decodes the filename from the wrong encoding versions: Python 3.2 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10778 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10779] Change filename encoding to FS encoding in PyErr_WarnExplicit()
New submission from STINNER Victor victor.stin...@haypocalc.com: PyErr_WarnExplicit() expects a filename encoded to UTF-8. This function is only called twice in the Python interpreter: compiler_assert() (with assertion is always true, perhaps remove parentheses?) and symtable_warn() (eg. with name 'xxx' is assigned to before global declaration), and both functions pass a filename encoded to the filesystem encoding. PyErr_WarnExplicit() should use the filesystem encoding instead of UTF-8 to decode the filename. I already did the same change in issue #9713 and #10114 (r85569). Attached patch fixes this issue. See also issue #10778 (decoding_fgets() (tokenizer.c) decodes the filename from the wrong encoding). -- components: Interpreter Core, Unicode files: warnexplicit_fsencoding.patch keywords: patch messages: 124694 nosy: haypo priority: normal severity: normal status: open title: Change filename encoding to FS encoding in PyErr_WarnExplicit() versions: Python 3.2 Added file: http://bugs.python.org/file20170/warnexplicit_fsencoding.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10779 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10778] decoding_fgets() (tokenizer.c) decodes the filename from the wrong encoding
STINNER Victor victor.stin...@haypocalc.com added the comment: See also issue #10779 (Change filename encoding to FS encoding in PyErr_WarnExplicit()). -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10778 ___
[issue9738] Document the encoding of functions bytes arguments of the C API
STINNER Victor victor.stin...@haypocalc.com added the comment: While documenting encodings, I found two issues: #10778 and #10779. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue9738 ___
[issue10780] Fix filename encoding in PyErr_SetFromWindowsErrWithFilename() (and PyErr_SetExcFromWindowsErrWithFilename())
New submission from STINNER Victor victor.stin...@haypocalc.com: PyErr_SetFromWindowsErrWithFilename() expects a filename encoded to UTF-8. It is called by the win32_error() function of the nt (posix) module, and win32_error() is called on an error in the bytes implementation of a function (if the argument is a byte string, not a Unicode string). But on Windows, bytes filenames are encoded to the ANSI code page, not to UTF-8. PyErr_SetExcFromWindowsErrWithFilename() also expects a filename encoded to UTF-8. It is not used in Python core, but I think that it should be fixed too. See also #10779 (and #9713 and #10114). -- components: Interpreter Core, Unicode, Windows messages: 124697 nosy: haypo priority: normal severity: normal status: open title: Fix filename encoding in PyErr_SetFromWindowsErrWithFilename() (and PyErr_SetExcFromWindowsErrWithFilename()) versions: Python 3.2 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10780 ___
[issue10576] Add a progress callback to gcmodule
Kristján Valur Jónsson krist...@ccpgames.com added the comment: 1) What I mean is that if a callback rejects GC, the GC algorithm may find its condition for invoking GC in the first place to be still valid immediately afterwards, so doing a GC will be immediately retried. I have to check, but it could mean that more changes would be required. 2) Of course callbacks have to know, e.g. those that intend to gather statistics or measure the time of GC. They have started a timer on the start opcode, and expect a stop code to follow. They have to get some canceled code for their bookkeeping to work. Then additionally we have the question: Should you be able to cancel a direct gc request (like calling gc.collect()) or just the automatic one? This then starts to be a much more complicated change, perhaps one that requires a PEP, so I don't think we should do all of that in one gulp. Once the callback mechanism is in, there is every opportunity to extend it. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10576 ___
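The timing use case mentioned in point 2) can be sketched with the gc.callbacks list found in later Python releases (3.3 and up). That interface may differ from the patch under discussion here; it only reports "start" and "stop" phases and offers no way to cancel a collection. The names timing_callback, events and _started are made up for the illustration.

```python
import gc
import time

events = []    # (generation, elapsed_seconds, collected) per finished collection
_started = {}  # generation -> timestamp taken at the "start" phase


def timing_callback(phase, info):
    """Start a timer on "start" and record the elapsed time on "stop"."""
    generation = info["generation"]
    if phase == "start":
        _started[generation] = time.perf_counter()
    elif phase == "stop":
        began = _started.pop(generation, None)
        if began is not None:  # ignore a "stop" whose "start" we never saw
            elapsed = time.perf_counter() - began
            events.append((generation, elapsed, info["collected"]))


gc.callbacks.append(timing_callback)
try:
    gc.collect()  # an explicit collection also triggers the callbacks
finally:
    gc.callbacks.remove(timing_callback)

print(len(events), "collection(s) timed")
```

Because every "start" is followed by a "stop", the bookkeeping stays simple; a cancellable collection would need a third phase so that timers like this one are not left dangling.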
[issue10780] Fix filename encoding in PyErr_SetFromWindowsErrWithFilename() (and PyErr_SetExcFromWindowsErrWithFilename())
STINNER Victor victor.stin...@haypocalc.com added the comment: issue10780.patch fixes this issue. -- keywords: +patch Added file: http://bugs.python.org/file20171/issue10780.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10780 ___
[issue10296] ctypes catches BreakPoint error on windows 32
Kristján Valur Jónsson krist...@ccpgames.com added the comment: I _think_ that in our old 2.5 Python (which had a backported ctypes from 2.6 to support 64 bits) we always got the JIT debugger, i.e. with both _ctypes.pyd and _ctypes_d.pyd. This API, DebugBreak, always invokes the JIT debugger, however the program was compiled (_NDEBUG, DEBUG, or not). This is done by raising the breakpoint exception, and apparently _ctypes.pyd is catching that exception and handling it, but only in release mode, which I think it shouldn't do. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10296 ___
[issue10780] Fix filename encoding in PyErr_SetFromWindowsErrWithFilename() (and PyErr_SetExcFromWindowsErrWithFilename())
STINNER Victor victor.stin...@haypocalc.com added the comment: issue10780_mbcs_ignore.patch is a safer but more complex fix: use the mbcs decoder with the ignore error handler. Even if issue10780.patch might raise a UnicodeDecodeError, I prefer it because it's shorter, simpler and so easier to maintain. -- Added file: http://bugs.python.org/file20172/issue10780_mbcs_ignore.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10780 ___
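The trade-off between the two patches (strict decoding that may raise versus decoding with the ignore error handler) can be shown at the Python level. This is a sketch under stated assumptions, not the C code of either patch: cp1252 stands in for the Windows ANSI code page because the mbcs codec exists only on Windows, the real mbcs decoder may behave differently on undecodable bytes, and the filename is invented.

```python
def decode_filename(raw: bytes, errors: str = "strict") -> str:
    """Decode bytes with cp1252, a stand-in for the ANSI code page (mbcs)."""
    return raw.decode("cp1252", errors=errors)


# Hypothetical filename containing 0x81, which Python's cp1252 table leaves
# unassigned, so a strict decoder cannot map it to a character.
bad_name = b"report\x81.txt"

try:
    decode_filename(bad_name)
    strict_raised = False
except UnicodeDecodeError:
    strict_raised = True

# The "ignore" handler never raises, but silently drops the offending byte.
lenient = decode_filename(bad_name, errors="ignore")

print(strict_raised, lenient)
```

The lenient variant cannot fail while an error is being reported, at the cost of showing a filename that differs from the real one.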
[issue5871] email.header.Header too lax with embeded newlines
R. David Murray rdmur...@bitdance.com added the comment: I've considered this a bit more deeply, and it turns out to be simpler to fix than I originally thought, assuming the fix is acceptable. When a message is parsed we obviously wind up with headers that don't have any embedding issues. So, if we check for embedded headers at message production time and reject them, we can't be breaking any round-trip properties of the package. The only way for a header malformed in this way to get produced is through it being added to a Message object via one of the header adding APIs. Further, for this specific issue we are only worried about things that look like header labels that follow a newline and don't have whitespace in front of them. We don't have to worry about the other RFC restrictions on headers in order to fix this. I tried a patch that checked at header add time, and while that is potentially workable it got fairly complicated and is a bit fragile (did I get *all* the places a header can be added?) Instead, the attached patch takes the approach of throwing an error for an embedded header at message serialization time. The advantage here is that all headers are run through Header.encode on serialization, so there's only one bit of code that needs to be modified to pretty much guarantee that header injection can't happen. There are code paths for producing individual headers that don't go through encode, but these don't produce complete messages, so it shouldn't be a problem (and may even be a feature). Barry, do you think this is indeed enough of a security issue that this fix (if acceptable) should be backported to 2.6? I should note that this patch produces a situation unusual[*] for the email package, where serialization may throw an error (but only on a Message object that has been modified). Also, I'm reusing HeaderParseError, which may not be optimal, although it does seem at least semi-logical. 
[*] Generator currently only throws an error itself in one place, if it encounters a bytes payload for a text Message. Again, something that can only happen in a modified Message, so this seems analogous. -- keywords: +patch stage: unit test needed -> patch review versions: +Python 2.7, Python 3.1 Added file: http://bugs.python.org/file20173/header_injection.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5871 ___
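The behaviour proposed above (reject an embedded header when the message is serialized, not when the header is added) can be tried out as follows. This is a small sketch that assumes a Python version whose email package includes a check of this kind; the addresses and the serialize() helper are made up for the illustration.

```python
from email.errors import HeaderParseError
from email.message import Message


def serialize(subject: str) -> str:
    """Build a small message and serialize it; header checks run here."""
    msg = Message()
    msg["To"] = "user@example.com"
    msg["Subject"] = subject  # adding the header itself does not raise
    msg.set_payload("body")
    return msg.as_string()


# A newline followed by something that looks like a header label, with no
# leading whitespace, is the injection pattern described above.
try:
    serialize("hello\nBcc: attacker@example.com")
    rejected = False
except HeaderParseError:
    rejected = True

# An ordinary header value serializes as usual.
ok_text = serialize("hello")

print(rejected)
```

Since parsed messages never contain such a header, only messages modified through the header-adding APIs can hit the error, which is why round-tripping is unaffected.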
[issue10778] decoding_fgets() (tokenizer.c) decodes the filename from the wrong encoding
STINNER Victor victor.stin...@haypocalc.com added the comment: Oh, ignore "indenterror() (inconsistent use of tabs and spaces in indentation) and"; I forgot to remove it. indenterror() is correct. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10778 ___
[issue10573] Consistency in unittest assert methods: order of actual, expected
Ron Adam ron_a...@users.sourceforge.net added the comment: The issue10573.diff file with the time stamp 20:03 has a lot of document changes that don't have corresponding code changes? -- nosy: +ron_adam ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10573 ___
[issue10576] Add a progress callback to gcmodule
Lukas Lueg lukas.l...@gmail.com added the comment: Agreed, let's have the simple callback first. To solve 2) later on, we could have the callback proposed here be the 'execution'-callback. It neither has nor will have the capability to prevent garbage collection. We can introduce another 'prepare'-callback later, which is called when the gc module decides that it is time for collection. Callbacks may react with a negative value so that execution does not happen and the execution-callbacks are never called either. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10576 ___
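The two-stage idea above can be modelled in plain Python. This is a purely hypothetical toy, not the gc module and not a proposed API: CollectorModel, prepare_callbacks, execution_callbacks and maybe_collect() are invented names, and the "collection" is just a counter.

```python
from typing import Callable, List


class CollectorModel:
    """Toy model of a collector with 'prepare' and 'execution' callbacks."""

    def __init__(self) -> None:
        self.prepare_callbacks: List[Callable[[int], int]] = []
        self.execution_callbacks: List[Callable[[str, int], None]] = []
        self.collections = 0

    def maybe_collect(self, generation: int) -> bool:
        # Any prepare-callback returning a negative value vetoes the run,
        # in which case the execution-callbacks are never called.
        if any(cb(generation) < 0 for cb in self.prepare_callbacks):
            return False
        for cb in self.execution_callbacks:
            cb("start", generation)
        self.collections += 1  # stands in for the actual collection work
        for cb in self.execution_callbacks:
            cb("stop", generation)
        return True


log = []
model = CollectorModel()
model.execution_callbacks.append(lambda phase, gen: log.append((phase, gen)))

ran_first = model.maybe_collect(0)              # nothing vetoes: runs
model.prepare_callbacks.append(lambda gen: -1)  # now always veto
ran_second = model.maybe_collect(0)             # vetoed: no start/stop pair

print(ran_first, ran_second, log)
```

Keeping the veto in a separate stage means execution-callbacks always see a matched start/stop pair and never need a "cancelled" code.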