[issue23373] curses.textpad crashes in insert mode on wide window
New submission from Alex Martelli:

See http://stackoverflow.com/questions/28264353/stack-overflow-in-pythons-curses-is-it-bug-in-the-module/28264823#28264823 for details. A curses.textpad on a wide-enough window, in insert mode, crashes with "recursion limit exceeded" (in _insert_printable_char) upon edit. The workaround is to keep the window underlying the textpad sufficiently narrow.

components: Library (Lib)
messages: 235182
nosy: Alex.Martelli
priority: normal
severity: normal
status: open
title: curses.textpad crashes in insert mode on wide window
type: crash
versions: Python 3.4

Python tracker <rep...@bugs.python.org>
http://bugs.python.org/issue23373
Re: C++ version of the C Python API?
Martin v. Löwis [EMAIL PROTECTED] wrote:
   ...
> The most popular ones are Boost.Python, CXX, and PySTL.

I think SIP is also pretty popular (see http://www.riverbankcomputing.co.uk/sip/).

Alex
--
http://mail.python.org/mailman/listinfo/python-list
Re: what is the difference between the two kinds of brackets?
James Stroud [EMAIL PROTECTED] wrote:
   ...
> > I wonder if it's the philosophical difference between: "Anything not
> > expressly allowed is forbidden" and "Anything not expressly forbidden
> > is allowed"? - Hendrik
>
> The latter is how I interpret any religious moral code--life is a lot
> more fun that way. Maybe that percolates to how I use python?

FYI, in Security the first approach is also known as "Default Deny", the second one as "Default Permit". http://www.ranum.com/security/computer_security/editorials/dumb/ explains why "default permit" is THE very dumbest one of the "six dumbest ideas in computer security" which the article is all about.

But then, the needs of Security are often antithetical to everything else we wish for -- security and convenience just don't mix:-(

Alex
--
http://mail.python.org/mailman/listinfo/python-list
Re: Best way to generate alternate toggling values in a loop?
Grant Edwards [EMAIL PROTECTED] wrote:
   ...
> I like the solution somebody sent me via PM:
>
> def toggle():
>     while 1:
>         yield "Even"
>         yield "Odd"

I think the itertools-based solution is more elegant:

    toggle = itertools.cycle(('Even', 'Odd'))

and use toggle rather than toggle() later; or, just use that itertools.cycle call inside the expression instead of toggle().

Alex
--
http://mail.python.org/mailman/listinfo/python-list
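For completeness, a quick sketch of the itertools-based version in action, written for today's Python (where next(toggle) replaces the older toggle.next() spelling):

```python
import itertools

# An endless alternator, as suggested above; call next() on it
# (rather than calling toggle()) to get each successive label.
toggle = itertools.cycle(('Even', 'Odd'))
labels = [next(toggle) for _ in range(4)]
print(labels)  # ['Even', 'Odd', 'Even', 'Odd']
```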
Re: Appending a list's elements to another list using a list comprehension
Debajit Adhikary [EMAIL PROTECTED] wrote:
   ...
> How does a.extend(b) compare with a += b when it comes to performance?
> Does a + b create a completely new list that it assigns back to a?
> If so, a.extend(b) would seem to be faster. How could I verify things
> like these?

That's what the timeit module is for, but make sure that the snippet you're timing has no side effects (since it's repeatedly executed). E.g.:

brain:~ alex$ python -mtimeit -s'z=[1,2,3];b=[4,5,6]' 'a=z[:];a.extend(b)'
1000000 loops, best of 3: 0.769 usec per loop
brain:~ alex$ python -mtimeit -s'z=[1,2,3];b=[4,5,6]' 'a=z[:];a+=b'
1000000 loops, best of 3: 0.664 usec per loop
brain:~ alex$ python -mtimeit -s'z=[1,2,3];b=[4,5,6]' 'a=z[:];a.extend(b)'
1000000 loops, best of 3: 0.769 usec per loop
brain:~ alex$ python -mtimeit -s'z=[1,2,3];b=[4,5,6]' 'a=z[:];a+=b'
1000000 loops, best of 3: 0.665 usec per loop
brain:~ alex$

The repetition of the measurements shows them to be very steady, so now you know that += is about 100 nanoseconds faster (on my laptop) than extend (the reason is: it saves the tiny cost of looking up 'extend' on a; to verify this, use much longer lists and you'll notice that while overall times for both approaches increase, the difference between the two approaches remains about the same for lists of any length).

But the key point to retain is: make sure that the snippet is free of side effects, so that each of the MANY repetitions that timeit does is repeating the SAME operation. If we initialized a in the -s and then just extended it in the snippet, we'd be extending a list that keeps growing at each repetition -- a very different operation than extending a list of a certain fixed starting length (here, serendipitously, we'd end up measuring the same difference -- but in the general case, where the timing difference between approaches DOES depend on the sizes of the objects involved, our measurements would instead become meaningless). Therefore, we initialize in -s an auxiliary list, and copy it in the snippet.
That's better than the more natural alternative:

brain:~ alex$ python -mtimeit 'a=[1,2,3];a+=[4,5,6]'
1000000 loops, best of 3: 1.01 usec per loop
brain:~ alex$ python -mtimeit 'a=[1,2,3];a.extend([4,5,6])'
1000000 loops, best of 3: 1.12 usec per loop
brain:~ alex$ python -mtimeit 'a=[1,2,3];a+=[4,5,6]'
1000000 loops, best of 3: 1.02 usec per loop
brain:~ alex$ python -mtimeit 'a=[1,2,3];a.extend([4,5,6])'
1000000 loops, best of 3: 1.12 usec per loop

as in this "more natural" alternative we're also paying, each time through the snippet, the cost of building the literal lists; this overhead (which is a lot larger than the difference we're trying to measure!) does not DISTORT the measurement but it sure OBSCURES it to some extent (losing us about one significant digit's worth of difference in this case). Remember, the WORST simple operation you can do in measurement is gauging a small number delta as the difference of two much larger numbers X and X+delta... so, make X as small as feasible to reduce the resulting loss of precision!-)

You can find more details on command-line use of timeit at http://docs.python.org/lib/node808.html (see adjacent nodes in the Python docs for examples and details on the more advanced use of timeit inside your own code) but I hope these indications may be of help anyway.

Alex
--
http://mail.python.org/mailman/listinfo/python-list
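The same side-effect-free measurement can also be done from inside a program with timeit's Python API -- a sketch (absolute numbers will of course vary by machine):

```python
import timeit

# Setup runs once per measurement; each snippet copies z so that the
# repeated statement never mutates a growing list (no side effects).
setup = "z = [1, 2, 3]; b = [4, 5, 6]"

t_extend = min(timeit.repeat("a = z[:]; a.extend(b)", setup=setup,
                             number=100000, repeat=3))
t_iadd = min(timeit.repeat("a = z[:]; a += b", setup=setup,
                           number=100000, repeat=3))
print(t_extend, t_iadd)  # seconds per 100000 runs; machine-dependent
```

Taking the min of the repeats, as the timeit docs suggest, gives the least-disturbed measurement.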
Re: Inheriting automatic attributes initializer considered harmful?
Andrew Durdin [EMAIL PROTECTED] wrote:
> On 10/17/07, Thomas Wittek [EMAIL PROTECTED] wrote:
> > Writing such constructors for all classes is very tedious. So I
> > subclass them from this base class to avoid writing these
> > constructors:
> >
> > class AutoInitAttributes(object):
> >     def __init__(self, **kwargs):
> >         for k, v in kwargs.items():
> >             getattr(self, k)  # assure that the attribute exists
> >             setattr(self, k, v)
> >
> > Is there already a standard lib class doing (something like) this?
> > Or is it even harmful to do this?
>
> It depends on your kwargs and where they're coming from. You could do
> something like this, for example:
>
> def fake_str(self):
>     return "not a User"
>
> u = User(__str__=fake_str)
> str(u)

...and, if you did, that would be totally harmless (in a new-style class as shown by the OP):

>>> class AutoInitAttributes(object):
...     def __init__(self, **kwargs):
...         for k, v in kwargs.items():
...             getattr(self, k)  # assure that the attribute exists
...             setattr(self, k, v)
...
>>> class User(AutoInitAttributes): pass
...
>>> def fake_str(self):
...     return "not a User"
...
>>> u = User(__str__=fake_str)
>>> str(u)
'<__main__.User object at 0x635f0>'

fake_str is not called, because special-method lookup occurs on the TYPE, *NOT* on the instance.

The OP's idea is handy for some generic containers (I published it as the Bunch class back in 2001 in http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52308, and I doubt it was original even then); it's not particularly recommended for classes that need to have some specific *NON*-special methods, because then the overwriting issue MIGHT possibly be a (minor) annoyance.

Alex
--
http://mail.python.org/mailman/listinfo/python-list
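The Bunch recipe mentioned above boils down to a few lines; the sketch below also demonstrates the point about special-method lookup happening on the type, not the instance:

```python
class Bunch(object):
    """Generic attribute container (roughly the 2001 cookbook recipe)."""
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

b = Bunch(x=1, y=2)
print(b.x, b.y)  # 1 2

# Assigning __str__ on the INSTANCE does not affect str(b): special
# methods are looked up on the type, so object.__str__ is still used.
b.__str__ = lambda: "not a Bunch"
print(str(b))  # still the default "<__main__.Bunch object at 0x...>"
```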
Re: int to str in list elements..
Abandoned [EMAIL PROTECTED] wrote:
> Hi.. I have a list as a=[1, 2, 3] (4 million elements) and
>     b = ",".join(a)
> then:
>     TypeError: sequence item 0: expected string, int found
> I want to change the list to a=['1','2','3'] but i don't want to use
> FOR because my list is very very big. I'm sorry for my bad english.
> Kind regards

Try b = ','.join(map(str, a)) -- it WILL take up some memory (temporarily) to build the huge resulting string, but there's no real way to avoid that. It does run a bit faster than a genexp with for...:

brain:~ alex$ python -mtimeit -s'a=range(4000*1000)' 'b=",".join(map(str,a))'
10 loops, best of 3: 3.37 sec per loop
brain:~ alex$ python -mtimeit -s'a=range(4000*1000)' 'b=",".join(str(x) for x in a)'
10 loops, best of 3: 4.36 sec per loop

Alex
--
http://mail.python.org/mailman/listinfo/python-list
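On a small list the suggested one-liner looks like this (the same idea applies unchanged at 4 million elements, just more slowly):

```python
a = [1, 2, 3]
# map(str, a) converts each int to its string form; join then
# concatenates them with ',' separators in a single C-level pass.
b = ','.join(map(str, a))
print(b)  # 1,2,3
```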
Re: Python on imac
James Stroud [EMAIL PROTECTED] wrote:
   ...
> For OS X 10.4, wx has come as part of the stock python install. You may

I use Mac OSX 10.4 and this assertion seems unfounded -- I can't see any wx as part of the stock Python (2.3.5). Maybe you mean something else?

Alex
--
http://mail.python.org/mailman/listinfo/python-list
Re: Python on imac
Raffaele Salmaso [EMAIL PROTECTED] wrote:
> Alex Martelli wrote:
> > I use Mac OSX 10.4 and this assertion seems unfounded -- I can't see
> > any wx as part of the stock Python (2.3.5). Maybe you mean something
> > else?
> Very old version, see
> /System/Library/Frameworks/Python.framework/Versions/2.3/Extras/lib/python/wx-2.5.3-mac-unicode

Ah, I see it now, thanks.

Alex
--
http://mail.python.org/mailman/listinfo/python-list
Re: The fundamental concept of continuations
Matthias Benkard [EMAIL PROTECTED] wrote:
> continuations. There used to be a project called Stackless Python that
> tried to add continuations to Python, but as far as I know, it has
> always been separate from the official Python interpreter. I don't
> know whether it's still alive. You may want to check http://stackless.com/

Alive and well, but it has removed continuations (which were indeed in early versions, as per the paper at http://www.stackless.com/spcpaper.htm).

Alex
--
http://mail.python.org/mailman/listinfo/python-list
Re: weakrefs and bound methods
Steven D'Aprano [EMAIL PROTECTED] wrote:
   ...
> Without __del__, what should I have done to test that my code was
> deleting objects and not leaking memory?

See module gc in the Python standard library.

> What should I do when my objects need to perform some special
> processing when they are freed, if I shouldn't use __del__?

The solid, reliable way is:

    from __future__ import with_statement

and use module contextlib from the Python standard library (or handcode an __exit__ method, but that's rarely needed), generating these special objects that require special processing only in 'with' statements.

This "resource acquisition is initialization" (RAII) pattern is the RIGHT way to ensure timely finalization (particularly but not exclusively in garbage-collected languages, and particularly but not exclusively to ease portability to different garbage collection strategies -- e.g., among CPython and future versions of IronPython and/or Jython that will support the with statement).

An alternative that will work in pre-2.5 Python (and, I believe but I'm not sure, in Jython and IronPython _today_) is to rely on the weakref module of the standard Python library. If your finalizer, in order to perform special processing, requires access to some values that depend on the just-freed object, you'll have to carefully stash those values elsewhere, because the finalizer gets called _after_ the object is freed (this crucial bit of sequencing semantics is what allows weak references to work, while strong finalizers [aka destructors] don't play well with garbage collection when reference loops are possible). E.g., weakref.ref instances are hashable, so you can keep a per-class dict keyed by them to hold the special values that are needed for special processing at finalization, and use accessors as needed to make those special values still look like attributes of instances of your class.
E.g., consider:

import weakref

class ClosingAtDel(object):
    _xs = {}

    def __init__(self, x):
        self._r = weakref.ref(self, self._closeit)
        self._xs[self._r] = x

    @property
    def x(self):
        return self._xs[self._r]

    @classmethod
    def _closeit(cls, theweakref):
        cls._xs[theweakref].close()
        del cls._xs[theweakref]

This will ensure that .close() is called on the object 'wrapped' in the instance of ClosingAtDel when the latter instance goes away -- even when the going away is due to a reference loop getting collected by gc. If ClosingAtDel had a __del__ method, that would interfere with the garbage collection. For example, consider adding to that class the following test/example code:

class Zap(object):
    def close(self):
        print 'closed', self

c = ClosingAtDel(Zap())
d = ClosingAtDel(Zap())
print c.x, d.x
# create a reference loop
c.xx = d; d.xx = c
# garbage-collect it anyway
import gc
del c; del d; gc.collect()
print 'done!'

you'll get a pleasant, expected output:

$ python wr.py
<__main__.Zap object at 0x6b430> <__main__.Zap object at 0x6b490>
closed <__main__.Zap object at 0x6b430>
closed <__main__.Zap object at 0x6b490>
done!

Suppose that ClosingAtDel was instead miscoded with a __del__, e.g.:

class ClosingAtDel(object):
    def __init__(self, x):
        self.x = x
    def __del__(self):
        self.x.close()

Now, the same test/example code would emit a desolating...:

$ python wr.py
<__main__.Zap object at 0x6b5b0> <__main__.Zap object at 0x6b610>
done!

I.e., the assumed-to-be-crucial calls to .close() have NOT been performed, because __del__ inhibits collection of reference-looping garbage. _Ensuring_ you always avoid reference loops (in intricate real-life cases) is basically unfeasible (that's why we HAVE gc in the first place -- for non-loopy cases, reference counting suffices;-), so the best strategy is to avoid coding __del__ methods, just as Marc recommends.

Alex
--
http://mail.python.org/mailman/listinfo/python-list
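The with-statement (RAII) approach recommended at the start of this post can be sketched with contextlib.closing from the standard library; Resource here is a made-up stand-in for any object exposing a close() method:

```python
from contextlib import closing

class Resource(object):
    """Hypothetical object needing timely finalization via close()."""
    def __init__(self):
        self.closed = False
    def close(self):
        self.closed = True

with closing(Resource()) as r:
    pass  # ... use r here ...

# close() was called deterministically on exiting the block, with no
# reliance on __del__ nor on when (or whether) gc happens to run.
print(r.closed)  # True
```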
Re: weakrefs and bound methods
Mathias Panzenboeck [EMAIL PROTECTED] wrote:
   ...
> I only inserted them so I can see if the objects are really freed.
> How can I see that without a __del__ method?

You can use weakref.ref instances with finalizer functions - see the long post I just made on this thread for a reasonably rich and complex example.

Alex
--
http://mail.python.org/mailman/listinfo/python-list
Re: weakrefs and bound methods
Mathias Panzenboeck [EMAIL PROTECTED] wrote:
> Marc 'BlackJack' Rintsch wrote:
> > ``del b`` just deletes the name `b`. It does not delete the object.
> > There's still the name `_` bound to it in the interactive
> > interpreter. `_` stays bound to the last non-`None` result in the
> > interpreter.
>
> Actually I have the opposite problem. The reference (to the bound
> method) gets lost but it shouldn't!

weakrefs to bound methods require some subtlety, see http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/81253 (or what I believe is the better treatment of this recipe in the printed edition of the Python Cookbook -- of course, being the latter's editor, I'm biased;-).

Alex
--
http://mail.python.org/mailman/listinfo/python-list
Re: Newbie packages Q
MarkyMarc [EMAIL PROTECTED] wrote:
   ...
> > > And sys.path is /python/Test/bpack
> >
> > sys.path must be a LIST. Are you saying you set yours to NOT be a
> > list, but, e.g., a STRING?! (It's hard to tell, as you show no quotes
> > there). The 'Test' package is *not* in your sys.path.
>
> I can say yes to the first: The atest.py is in the right dir/package.
> And the third. If it is not good enough that this /python/Test/bpack
> is in the path. Then I can not understand the package thing. I also
> tried to put /python/ and /python/Test in the sys.path same result.

If the only ITEM in the list that is sys.path is the string '/python', then any Python code you execute will be able to import Test.apack (as well as Test.bpack, or just Test).

Alex
--
http://mail.python.org/mailman/listinfo/python-list
Re: Problem of Readability of Python
Licheng Fang [EMAIL PROTECTED] wrote:
   ...
> Python Tutorial says an empty class can be used to do this. But if
> namespaces are implemented as dicts, wouldn't it incur much overhead
> if one defines empty classes as such for some very frequently used
> data structures of the program?

Just measure:

$ python -mtimeit -s'class A(object):pass' -s'a=A()' 'a.zop=23'
1000000 loops, best of 3: 0.241 usec per loop
$ python -mtimeit -s'a=[None]' 'a[0]=23'
10000000 loops, best of 3: 0.156 usec per loop

So, the difference, on my 18-months-old laptop, is about 85 nanoseconds per write-access; if you have a million such accesses in a typical run of your program, it will slow the program down by about 85 milliseconds. Is that much overhead? If your program does nothing else except those accesses, maybe, but then why are you writing that program AT ALL?-)

And yes, you CAN save about 1/3 of those 85 nanoseconds by having '__slots__=["zop"]' in your class A(object)... but that's the kind of thing one normally does only to tiny parts of one's program that have been identified by profiling as dramatic bottlenecks, to shave off the last few nanoseconds in the very last stages of micro-optimization of a program that's ALMOST, but not QUITE, fast enough... knowing about such extreme last-ditch optimization tricks is of very doubtful value (and I think I'm qualified to say that, since I _do_ know many of them...:-). There ARE important performance things to know about Python, but those worth a few nanoseconds don't matter much.

Alex
--
http://mail.python.org/mailman/listinfo/python-list
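For reference, the __slots__ variant mentioned above looks like this -- a sketch only; the speedup is just the small attribute-lookup saving discussed, but the behavioral differences are easy to see:

```python
class A(object):
    __slots__ = ('zop',)   # no per-instance __dict__ is allocated

a = A()
a.zop = 23                 # fine: 'zop' is a declared slot
print(a.zop)               # 23

# Any attribute name NOT declared in __slots__ is rejected:
try:
    a.zip = 45
except AttributeError:
    print('no such slot')
```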
Re: Newbie packages Q
MarkyMarc [EMAIL PROTECTED] wrote:
   ...
> > sys.path must be a LIST. Are you saying you set yours to NOT be a
> > list, but, e.g., a STRING?! (It's hard to tell, as you show no quotes
> > there).
>    ...
> > If the only ITEM in the list that is sys.path is the string
> > '/python', then any Python code you execute will be able to import
> > Test.apack (as well as Test.bpack, or just Test).
>
> Of course I have more than just the /python string in the sys.path. I
> have a list of paths, depending on which system the code run on.

As long as '/python' comes in the list before any other directory that might interfere (by dint of having a Test.py or Test/__init__.py), and in particular in the non-pathological case where there are no such possible interferences, my assertion here quoted still holds.

If you're having problems in this case, run with python -v to get information about all that's being imported, print sys.path and sys.modules just before the import statement that you think is failing, and copy and paste all the output here, including the traceback from said failing import.

Alex
--
http://mail.python.org/mailman/listinfo/python-list
Re: Newbie packages Q
MarkyMarc [EMAIL PROTECTED] wrote:
   ...
> > If you're having problems in this case, run with python -v to get
> > information about all that's being imported, print sys.path and
> > sys.modules just before the import statement that you think is
> > failing, and copy and paste all the output here, including the
> > traceback from said failing import.
>
> OK thank you, with some help from the -v option and debugging I found
> a test package in some package. I now renamed it and load it with
> sys.path.append. And now the btest.py works.

Good.

> BUT does this mean I have to set the path to the package in every
> __init__.py class? Or how do I tell a subpackage that it is part of a
> big package?

The package directory (the one containing __init__.py) must be on some directory in sys.path, just like a plain something.py module would have to be in order to be importable. How you arrange for this is up to you (I normally install all add-ons in the site-packages directory of my Python installation: that's what Python's distutils do by default).

As for conflicts in names (of modules and/or packages), they're of course best avoided rather than worked around; not naming any module test.py, nor any package (directory containing an __init__.py) Test, is a good start.

Alex
--
http://mail.python.org/mailman/listinfo/python-list
Re: Problem of Readability of Python
Steven D'Aprano [EMAIL PROTECTED] wrote:
> On Sun, 07 Oct 2007 13:24:14 -0700, Alex Martelli wrote:
> > And yes, you CAN save about 1/3 of those 85 nanoseconds by having
> > '__slots__=["zop"]' in your class A(object)... but that's the kind
> > of thing one normally does only to tiny parts of one's program that
> > have been identified by profiling as dramatic bottlenecks
>
> Seems to me that:
>
> class Record(object):
>     __slots__ = ["x", "y", "z"]
>
> has a couple of major advantages over:
>
> class Record(object):
>     pass
>
> aside from the micro-optimization that classes using __slots__ are
> faster and smaller than classes with __dict__.
>
> (1) The field names are explicit and self-documenting;
> (2) You can't accidentally assign to a mistyped field name without
> Python letting you know immediately.
>
> Maybe it's the old Pascal programmer in me coming out, but I think
> they're big advantages.

I'm also an old Pascal programmer (ask anybody who was at IBM in the '80s who was the most active poster on the TURBO FORUM about Turbo Pascal, and PASCALVS FORUM about Pascal/Vs...), and yet I consider these advantages to be trivial in most cases compared to the loss in flexibility, such as the inability to pickle (without bothering to code an explicit __getstate__) and the inability to monkey-patch instances on the fly -- not to mention the bother of defining a separate 'Record' class for each and every combination of attributes you might want to put together.

If you REALLY pine for Pascal's records, you might choose to inherit from ctypes.Structure, which has the additional advantages of specifying a C type for each field and (a real advantage;-) creating an appropriate __init__ method:

>>> import ctypes
>>> class Record(ctypes.Structure):
...     _fields_ = (('x', ctypes.c_float), ('y', ctypes.c_float),
...                 ('z', ctypes.c_float))
...
>>> r = Record()
>>> r.x
0.0
>>> r = Record(1, 2, 3)
>>> r.x
1.0
>>> r = Record('zip', 'zop', 'zap')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: float expected instead of str instance

See?
You get type-checking too -- Pascal looms closer and closer!-) And if you need an array of 1000 such Records, just use as the type Record*1000 -- think of the savings in memory (no indirectness, no overallocations as lists may have...). If I had any real need for such things, I'd probably use a metaclass (or class decorator) to also add a nice __repr__ function, etc... Alex -- http://mail.python.org/mailman/listinfo/python-list
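The Record*1000 array idea can be sketched like so (same hypothetical field names as in the session above):

```python
import ctypes

class Record(ctypes.Structure):
    _fields_ = [('x', ctypes.c_float),
                ('y', ctypes.c_float),
                ('z', ctypes.c_float)]

RecordArray = Record * 1000   # a contiguous-array type, as suggested
arr = RecordArray()           # 1000 records, zero-initialized
arr[0] = Record(1, 2, 3)
print(arr[0].x)               # 1.0

# No per-element object overhead: the whole array is one C block,
# so its size is exactly 1000 times one record's size.
print(ctypes.sizeof(arr) == 1000 * ctypes.sizeof(Record))  # True
```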
Re: Can you please give me some advice?
Byung-Hee HWANG [EMAIL PROTECTED] wrote:
> Hi there, What is different between Ruby and Python?

Not all that much; Python is more mature, Ruby more fashionable.

> I am wondering what language is really mine for work. Somebody tell me
> Ruby is clean or Python is really easy! Anyway I will really make
> decision today what I have to study from now on. What I make the
> decision is more difficult than to know why I have to learn English.
> Yeah I do not like to learn English because it is just very painful..

www.python.or.kr/
http://wiki.python.org/moin/KoreanPythonBooks

Alex
--
http://mail.python.org/mailman/listinfo/python-list
Re: Python 3.0 migration plans?
John Nagle [EMAIL PROTECTED] wrote:
> TheFlyingDutchman wrote:
> > It seems that Python 3 is more significant for what it removes than
> > what it adds. What are the additions that people find the most
> > compelling?
>
> I'd rather see Python 2.5 finished, so it "just works".

And I'd rather see peace on Earth and goodwill among men than _either_ Python 3 or your cherished "finished" 2.5 -- the comparison and implied tradeoff make about as much sense as yours.

> All the major third-party libraries working and available with working
> builds for all major platforms. That working set of components in all
> the major distros used on servers. The major hosting companies having
> that up and running on their servers. Windows installers that install
> a collection of components that all play well together. That's what I
> mean by "working".

I.e., you mean tasks appropriate for maintainers of all the major third-party libraries, distros, and hosting companies -- great, go convince them; or go convince all warmongers on Earth to make peace, if you want an even harder task with even better potential impact on the state of the world, then.

Alex
--
http://mail.python.org/mailman/listinfo/python-list
Re: Google and Python
Bryan Olson [EMAIL PROTECTED] wrote:
   ...
> > YouTube (one of Google's most valuable properties) is essentially
> > all-Python (except for open-source infrastructure components such as
> > lighttpd). Also, at Google I'm specifically "Uber Tech Lead,
> > Production Systems": while I can't discuss details, my main
> > responsibilities relate to various software projects that are part
> > of our "deep infrastructure", and our general philosophy there is
> > "Python where we can, C++ where we must".
>
> Good motto. So is most of Google's code base now in Python? About what
> is the ratio of Python code to C++ code? Of course lines of code is
> kind of a bogus measure. Of all those cycles Google executes, about
> what portion are executed by a Python interpreter?

I don't have those numbers at hand, and if I did they would be confidential: you know that Google doesn't release many numbers at all about its operations, most particularly not about our production infrastructure (not even, say, how many servers we have, in how many data centers, with what bandwidth, and so on). Still, I wouldn't say that most of our codebase is in Python: there's a lot of Java, a lot of C++, a lot of Python, a lot of Javascript (which may not correspond to all that many cycles Google executes, since the main point of coding in Javascript is having it execute in the user's browser, of course, but it's still code that gets developed, debugged, deployed, maintained), and a lot of other languages, including ones that Google developed in-house such as http://labs.google.com/papers/sawzall.html .

> > Python is definitely not just a tiny little piece nor (by a long
> > shot) used only for scripting tasks;
>
> Ah, sorry. I meant the choice of scripting language was a tiny little
> piece of Google's method of operation.

In the same sense in which other such technology choices (C++, Java, what operating systems, what relational databases, what http servers, and so on) are similarly "tiny" pieces, maybe.
Considering the number of technology choices that must be made, plus the number of other choices that aren't directly about technology but, say, about methodology (style guides for each language in use, mandatory code reviews before committing to the shared codebase, release-engineering practices, standards for unit-tests and other kinds of tests, and so on, and so forth), one could defensibly make a case that each and every such choice must of necessity be but a "tiny little piece" of the whole.

> "Scripting language" means languages such as Python, Perl, and Ruby.

A widespread terminology, but nevertheless a fundamentally bankrupt one: when a language is used to develop an application, it's very misleading to call it a "scripting language", as it implies that it's instead used only to script something else. When it comes time to decide which mix of languages to use to develop a new application, it's important to avoid being biased by having tagged some languages as "scripting" ones, some (say Java) as "application" ones, others yet (say C++) as "system" ones -- the natural subconscious process would be to say "well, I'm developing an X, I should use an X language, not a Y language or a Z language", which is most likely to lead to wrong choices.

> > if the mutant space-eating nanovirus should instantly stop the
> > execution of all Python code, the powerful infrastructure that has
> > been often described as Google's "secret weapon" would seize up.
>
> And the essence of the Google way is to employ a lot of smart
> programmers to build their own software to run on Google's
> infrastructure. Choice of language is trivia.

No, it's far from trivial, any more than choice of operating system, and so on. Google is a technology company: exactly which technologies to use and/or develop for the various necessary tasks, far from being trivial, is the very HEART of its operation.
Your ludicrous claim is similar to saying that the essence of a certain hedge fund is to employ smart traders to make a lot of money by sophisticated trades (so far so reasonable) and (here comes the idiocy) "choice of currencies and financial instruments is trivia" (?!?!?!) -- it's the HEART of such a fund to pick and choose which positions to build, unwind, or sell on, and which (e.g.) currencies should be involved in such positions is obviously *crucial*: one of the many important decisions those smart traders make every day, and far from the least important of the many. And similarly, OF COURSE, for choices of technologies (programming languages very important among those) for a technology company -- just like, say, what horticultural techniques and chemicals to employ would be for a company whose essence was cultivating artichokes for sale on the market, and so on.

> I think both Python & Google are great. What I find ludicrous is the
> idea that the bits one hears about how Google builds its software make
> a case for how others should build theirs.

To each his own, I guess: what I find ludicrous is your claim
Re: Google and Python
Bryan Olson [EMAIL PROTECTED] wrote:
   ...
> TheFlyingDutchman asked of someone:
> > Would you know what technique the custom web server uses to invoke a
> > C++ app
>
> No, I expect he would not know that. I can tell you that GWS is just
> for Google, and anyone else is almost certainly better off with Apache.

Or lighttpd, like YouTube (cfr http://trac.lighttpd.net/trac/wiki/PoweredByLighttpd).

> > How does Google use Python?
>
> As their scripting-language of choice. A fine choice, but just a tiny
> little piece. Maybe Alex will disagree with me. In my short time at
> Google, I was uber-nobody.

YouTube (one of Google's most valuable properties) is essentially all-Python (except for open-source infrastructure components such as lighttpd). Also, at Google I'm specifically "Uber Tech Lead, Production Systems": while I can't discuss details, my main responsibilities relate to various software projects that are part of our "deep infrastructure", and our general philosophy there is "Python where we can, C++ where we must".

Python is definitely not just a tiny little piece nor (by a long shot) used only for scripting tasks; if the mutant space-eating nanovirus should instantly stop the execution of all Python code, the powerful infrastructure that has often been described as Google's "secret weapon" would seize up. The internal web applications needed to restore things, btw, would seize up too; as I already said, I can't give details of the ones I'm responsible for (used by Google's network specialists, reliability engineers, hardware technicians, etc), but Guido did manage to get permission to talk about his work, Mondrian (http://www.niallkennedy.com/blog/archives/2006/11/google-mondrian.html) -- that's what we all use to review code, whatever language it's in, before it can be submitted to the Google codebase (code reviews are a mandatory step of development at Google). Internal web applications are the preferred way at Google to make any internal functionality available, of course.
Alex -- http://mail.python.org/mailman/listinfo/python-list
Re: Using pseudonyms
Aahz [EMAIL PROTECTED] wrote:
> For that matter, there are plenty of people who are better known by
> some nickname that is not their legal name.

Yep. For example, some people whose legal name is "Alessandro" (which no American is ever going to be able to spell right -- ONE L, TWO S's, NOT an X or a J instead, -DRO ending rather than -DER, etc) might choose to avoid the hassle and go by "Alex" (just to make up a case...).

Alex
--
http://mail.python.org/mailman/listinfo/python-list
Re: can Python be useful as functional?
Rustom Mody [EMAIL PROTECTED] wrote:
> Can someone help? Here's the non-working code:
>
> def si(l):
>     p = l.next()
>     yield p
>     (x for x in si(l) if x % p != 0)
>
> There should be a yield or return somewhere but I can't figure it out.

Change the last line to

    for x in (x for x in si(l) if x % p != 0):
        yield x

if you wish.

Alex
--
http://mail.python.org/mailman/listinfo/python-list
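With that fix applied, the recursive sieve runs; here it is in today's spelling (next(l) instead of the older l.next()):

```python
import itertools

def si(l):
    # l is a shared iterator of candidates (2, 3, 4, ...); each
    # recursion level consumes one value, yields it, then filters the
    # deeper levels' output by divisibility against it. Composites
    # yielded at inner levels get filtered out at outer levels, so
    # only primes survive to the top.
    p = next(l)
    yield p
    for x in (x for x in si(l) if x % p != 0):
        yield x

primes = list(itertools.islice(si(itertools.count(2)), 8))
print(primes)  # [2, 3, 5, 7, 11, 13, 17, 19]
```

Note the recursion depth grows with each candidate consumed, so this is a toy: fine for small prefixes of the primes, not for millions of them.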
Re: super() doesn't get superclass
Ben Finney [EMAIL PROTECTED] wrote: Am I mistaken in thinking that superclass of foo is equivalent to parent class of foo? If so, I'd lay heavy odds that I'm not alone in that thinking. That thinking (confusing parent with ancestor) makes sense only (if at all) in a single-inheritance world. Python's super() exists to support MULTIPLE inheritance. In general, a superclass of foo means a class X such that foo is a subclass of X, and thus applies to all parents, all parents of parents, and so on (issubclass does NOT mean is a DIRECT AND IMMEDIATE subclass, but is a subclass; check the Python builtin function of that name). Alex -- http://mail.python.org/mailman/listinfo/python-list
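A small sketch of the superclass-vs-parent distinction: in the diamond below, the method D actually uses comes from a superclass that is not the "parent" supplying it in single-inheritance intuition, and issubclass is true for any ancestor:

```python
class A:
    def greet(self):
        return "A"

class B(A):
    pass

class C(A):
    def greet(self):
        return "C"

class D(B, C):
    pass

# D's greet comes from C, found via the MRO, not from its first parent B
assert [k.__name__ for k in D.__mro__] == ["D", "B", "C", "A", "object"]
assert D().greet() == "C"
# issubclass means "is a subclass", direct or not
assert issubclass(D, A)
```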
Re: Just bought Python in a Nutshell
7stud [EMAIL PROTECTED] wrote: Used copies of computer books for out of date editions are always cheap. Python in a Nutshell (2nd ed) is a reference book with a frustratingly poor index--go figure. It also contains errors not posted in the errata. You can always enter errata at http://www.oreilly.com/cgi-bin/errata.form/pythonian2 and thus help all future readers of the book (if your errata are confirmed to be valid). Vague mentions of errors not posted in the errata are far less useful (and unconfirmed, too). Alex -- http://mail.python.org/mailman/listinfo/python-list
Re: how to join array of integers?
Paul Rudin [EMAIL PROTECTED] wrote: ... Isn't it odd that the generator isn't faster, since the comprehension presumably builds a list first and then iterates over it, whereas the generator doesn't need to make a list? The generator doesn't, but the implementation of join then does (almost). See Objects/stringobject.c line 1745:

seq = PySequence_Fast(orig, );

As per http://docs.python.org/api/sequence.html: PyObject* PySequence_Fast(PyObject *o, const char *m) Return value: New reference. Returns the sequence o as a tuple, unless it is already a tuple or list, in which case o is returned. Use PySequence_Fast_GET_ITEM() to access the members of the result. Returns NULL on failure. If the object is not a sequence, raises TypeError with m as the message text. If orig is neither a list nor a tuple, but for example a generator, PySequence_Fast builds a list from it (even though its docs which I just quoted say it builds a tuple -- building the list is clearly the right choice, so I'd say it's the docs that are wrong, not the code;-)... so in this particular case the usual advantage of the generator disappears. PySequence_Fast is called in 13 separate spots in 8 C files in the Python 2.5 sources, so there may be a few more surprises like this;-). Alex -- http://mail.python.org/mailman/listinfo/python-list
Re: once assigment in Python
Lorenzo Di Gregorio [EMAIL PROTECTED] wrote: When employing Python it's pretty straightforward to translate the instance to an object. instance = Component(input=wire1,output=wire2) Then you don't use instance *almost* anymore: it's an object which gets registered with the simulator kernel and gets called by reference and event-driven only by the simulator kernel. We might reuse the name for calling some administrative methods related to the instance (e.g. for reporting) but that's a pretty safe thing to do. Of course all this can be done during initialization, but there are some good reasons (see Verilog vs VHDL) why it's handy do be able to do it *anywhere*. The annoying problem was that every time the program flow goes over the assignment, the object gets recreated. If you originally set, e.g., instance = None then using in your later code: instance = instance or Component(...) will stop the multiple creations. Other possibilities include using a compound name (say an.instance where 'an' is an instance of a suitable container class) and overriding the __new__ method of class Component so that it will not return multiple distinct objects with identical attributes. Has this *plain* name ever been previously assigned to anything at all is simply not a particularly good condition to test for (you COULD probably write a decorator that ensures that all uninitialized local variables of a function are instead initialized to None, but I'd DEFINITELY advise against such black magic). Alex -- http://mail.python.org/mailman/listinfo/python-list
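The create-at-most-once idiom suggested above can be seen in action; the instrumentation list counting constructions is my addition for illustration:

```python
created = []

class Component:
    def __init__(self, **wires):
        created.append(self)   # record each construction, for demonstration
        self.wires = wires

instance = None
for _ in range(3):  # control flow passes over the "assignment" repeatedly
    # short-circuit: Component(...) only evaluates while instance is falsy
    instance = instance or Component(input="wire1", output="wire2")

assert len(created) == 1       # only the first pass created an object
assert instance is created[0]
```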
Re: Python 3K or Python 2.9?
TheFlyingDutchman [EMAIL PROTECTED] wrote: Foo.bar(foo, spam) foo.bar(spam) That looks like a case of There's more than one way to do it. ;) The first form is definitely consistent with the method declaration, so there's a lot to be said for using that style when teaching people to make classes - send self, receive self. On the other hand, the second form is not polymorphic: it doesn't allow for foo to be an instance of some OTHER class (possibly subclassing Foo and overriding bar) -- it will call the Foo version of bar anyway. type(foo).bar(foo, spam) *IS* almost semantically equivalent to the obviously simpler foo.bar(spam) -- but doesn't cover the possibility for foo to do a *per-instance* override of 'bar'. getattr(foo, 'bar', functools.partial(type(foo).bar, foo))(spam) is getting closer to full semantic equivalence. And if you think that's another OBVIOUS way of doing it wrt foo.bar(spam), I think your definition of obvious may need a reset (not to mention the fact that the equivalent version is way slower;-). Foo.bar(foo, spam)'s different semantics are important when any implementation of type(foo).bar (or some other method) wants to BYPASS polymorphism to redirect part of the functionality to a specific type's implementation of bar ('super' may help in some cases, but it keeps some polymorphic aspects and pretty often you just want to cut all polymorphism off and just redirect to ONE specific implementation). Alex -- http://mail.python.org/mailman/listinfo/python-list
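The polymorphism difference between the two call forms is easy to demonstrate (class names here are just illustrative):

```python
class Foo:
    def bar(self, spam):
        return "Foo.bar"

class Sub(Foo):
    def bar(self, spam):
        return "Sub.bar"

foo = Sub()
assert foo.bar("x") == "Sub.bar"           # normal polymorphic dispatch
assert type(foo).bar(foo, "x") == "Sub.bar"
assert Foo.bar(foo, "x") == "Foo.bar"      # explicit class: override bypassed
```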
Re: Python 3K or Python 2.9?
Duncan Booth [EMAIL PROTECTED] wrote: ... As for omitting 'self' from method definitions, at first sight you might think the compiler could just decide that any 'def' directly inside a class could silently insert 'self' as an additional argument. This doesn't work though because not everything defined in a class has to be an instance method: static methods don't have a self parameter at all, class methods traditionally use 'cls' instead of 'self' as the name of the first parameter, and it is also possible to define a function inside a class block and use it as a function. E.g.: Actually you could do the magic first-parameter insertion just when returning a bound or unbound method object in the function's __get__ special method, and that would cover all of the technical issues you raise. E.g.:

class Weird:
    def factory(arg):
        """Returns a function based on its argument"""
        ...
    foo = factory("foo")
    bar = factory("bar")
    del factory

When factory is called, it is a simple function, not a method. Sure, that's because the function object itself is called, not a bound or unbound method object -- indeed, factory.__get__ never gets called here.

class C:
    def method(self): pass

and

def foo(self): pass
class C: pass
C.method = foo

both of these result in effectively the same class (although the second one has a different name for the method in tracebacks). And exactly the same would occur if the self argument was omitted from the signature and magically inserted when __get__ does its job. That consistency really is important. Whenever I see a 'def' I know exactly what parameters the resulting function will take regardless of the context. And this non-strictly-technical issue is the only true one. Another area to consider is what happens when I do:

foo = FooClass()
foo.bar(x)
# versus
f = foo.bar
f(x)

Both of these work in exactly the same way in Python: the self parameter
My point here is that in Python the magic is clearly defined and overridable (so we can have static or class methods that act differently). And so it would be with that rule, since staticmethod etc. create different descriptor objects. Really, the one and only true issue is that the Python community doesn't like magic. It would be perfectly feasible; we just don't wanna:-). Alex -- http://mail.python.org/mailman/listinfo/python-list
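The __get__ mechanism the thread keeps referring to is observable directly: a plain function assigned onto a class becomes a method precisely because attribute access invokes the function's __get__ (names below are illustrative):

```python
def foo(self, x):
    return (self.tag, x)

class C:
    pass

C.method = foo   # a plain function, attached after the fact

c = C()
c.tag = "t"
# what c.method does under the hood: the function's __get__ builds
# a bound method with the instance pre-inserted as the first argument
bound = foo.__get__(c, C)
assert bound("x") == ("t", "x")
assert c.method("x") == ("t", "x")
```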
Re: Python 3K or Python 2.9?
Chris Mellon [EMAIL PROTECTED] wrote: ... Actually you could do the magic first-parameter insertion just when returning a bound or unbound method object in the function's __get__ special method, and that would cover all of the technical issues you ... This would mean that mixing functions and methods would have to be done like you do it in C++, with lots of careful knowledge and inspection of what you're working with. Not particularly -- it would not require anything special that's not required today. What would happen to stuff like inspect.getargspec? It would return the signature of the function, if asked to analyze a function, and the signature of the method, if asked to analyze a method. Not exactly rocket science, as it happens. Besides, if self isn't in the argument spec, you know that the very next thing people will complain about is that it's not implicitly used for locals, Whether 'self' needs to be explicit as a function's first argument, and whether it needs to be explicit (as a self. ``prefix'') to access instance variables (which is what I guess you mean here by locals, since reading it as written makes zero sense), are of course separate issues. and I'll punch a kitten before I accept having to read Python code guessing if something is a global, a local, or part of self like I do in C++. Exactly: the technical objections that are being raised are bogus, and the REAL objections from the Python community boil down to: we like it better the way it is now. Bringing technical objections that are easily debunked doesn't _strengthen_ our real case: in fact, it _weakens_ it. So, I'd rather see discussants focus on how things SHOULD be, rather than argue they must stay that way because of technical difficulties that do not really apply. The real advantage of making 'self' explicit is that it IS explicit, and we like it that way, just as much as its critics detest it. 
Just like, say, significant indentation, it's a deep part of Python's culture, tradition, preferences, and mindset, and neither is going to go away (I suspect, in fact, that, even if Guido somehow suddenly changed his mind, these are issues on which even he couldn't impose a change at this point without causing a fork in the community). Making up weak technical objections (ones that ignore the possibilities of __get__ or focus on something so absolutely central to everyday programming practice as inspect.getargspec [!!!], for example;-) is just not the right way to communicate this state of affairs. Alex -- http://mail.python.org/mailman/listinfo/python-list
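On the introspection point raised above: the behavior is unsurprising in practice, as modern inspect.signature (the successor to the getargspec mentioned in the thread) shows for a function versus the corresponding bound method:

```python
import inspect

class C:
    def bar(self, spam):
        pass

# on the plain function, self is part of the signature...
assert list(inspect.signature(C.bar).parameters) == ["self", "spam"]
# ...on a bound method, self has already been supplied
assert list(inspect.signature(C().bar).parameters) == ["spam"]
```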
Re: newbie: self.member syntax seems /really/ annoying
Chris Mellon [EMAIL PROTECTED] wrote: ... This is terrible and horrible, please don't use it. That said, presenting the magic implicit_self context manager! ...which doesn't work in functions -- just try changing your global code:

with implicit_self(t):
    print a
    print b
    a = 40
    b = a * 2

into a function and a call to it:

def f():
    with implicit_self(t):
        print a
        print b
        a = 40
        b = a * 2

f()

...even with different values for the argument to _getframe. You just can't dynamically add local variables to a function, beyond the set the compiler has determined are local (and those are exactly the ones that are *assigned to* in the function's body -- no less, no more -- where assigned to includes name-binding statements such as 'def' and 'class' in addition to plain assignment statements, of course). Making, say, 'a' hiddenly mean 'x.a' within a function requires a decorator that suitably rewrites the function's bytecode... (after which, it WOULD still be terrible and horrible and not to be used, just as you say, but it might at least _work_;-). The main problem is, the decorator needs to know the set of names to be faked out in this terrible and horrible way at the time the 'def' statement executes: it can't wait until runtime (to dynamically determine what's in vars(self)) before it rewrites the bytecode (well, I guess you _could_ arrange a complicated system to do that, but it _would_ be ridiculously slow). You could try defeating the fundamental optimization that stands in your way by adding, say, exec 'pass' inside the function-needing-fakeouts -- but we're getting deeper and deeper into the mire...;-) Alex -- http://mail.python.org/mailman/listinfo/python-list
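The "you can't dynamically add local variables" point can be demonstrated in modern Python 3, where even exec (no longer a statement) cannot create a true function local:

```python
def f():
    # the compiler fixed f's set of locals at compile time; this exec
    # writes only into a throwaway namespace snapshot (Python 3 semantics)
    exec("a = 40")
    try:
        return a   # compiled as a *global* lookup: 'a' is never bound here
    except NameError:
        return None

print(f())  # None
```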
Re: newbie: self.member syntax seems /really/ annoying
Carl Banks [EMAIL PROTECTED] wrote: ... How about this? The decorator could generate a bytecode wrapper that would have the following behavior, where __setlocal__ and __execute_function__ are special forms that are not possible in Python. (The loops would necessarily be unwrapped in the actual bytecode.) I'm not entirely sure how you think those special forms would work. Right now, say, if the compiler sees somewhere in your function z = 23 print z it thereby knows that z is a local name, so it adds a slot to the function's locals-array, suppose it's the 11th slot, and generates bytecode for LOAD_FAST 11 and STORE_FAST 11 to access and bind that 'z'. (The string 'z' is stored in f.func_code.co_varnames but is not used for the access or storing, just for debug/reporting purposes; the access and storing are very fast because they need no lookup). If instead it sees a print z with no assignment to name z anywhere in the function's body, it generates instead bytecode LOAD_GLOBAL `z` (where the string `z` is actually stored in f.func_code.co_names). The string (variable name) gets looked up in dict f.func_globals each and every time that variable is accessed or bound/rebound. If the compiler turns this key optimization off (because it sees an exec statement anywhere in the function, currently), then the bytecode it generates (for variables it can't be sure are local, but can't be sure otherwise either as they MIGHT be assigned in that exec...) is different again -- it's LOAD_NAME (which is like LOAD_GLOBAL in that it does need to look up the variable name string, but often even slower because it needs to look it up in the locals and then also in the globals if not currently found among the locals -- so it may often have to pay for two lookups, not just one). 
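The LOAD_FAST / LOAD_GLOBAL distinction described above is visible with the dis module (opname prefixes are checked rather than exact names, since recent CPython versions add fused variants):

```python
import dis

def f():
    z = 23
    print(z)    # z: assigned in f, so a compile-time-known local
    print(len)  # len: never assigned here, so a global/builtin lookup

ops = {ins.opname for ins in dis.get_instructions(f)}
assert any(op.startswith("LOAD_FAST") for op in ops)    # access to z
assert any(op.startswith("LOAD_GLOBAL") for op in ops)  # access to print/len
```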
So it would appear that to make __setlocal__ work, among other minor revolutions to Python's code objects (many things that are currently tuples, built once and for all by the compiler at def time, would have to become lists so that __setlocal__ can change them on the fly), all the LOAD_GLOBAL occurrences would have to become LOAD_NAME instead (so, all references to globals would slow down, just as they're slowed down today when the compiler sees an exec statement in the function body). Incidentally, Python 3.0 is moving the OTHER way, giving up the chore of dropping optimization to support 'exec' -- the latter will become a function instead of a statement and the compiler will NOT get out of its way to make it work right any more; if LOAD_NAME remains among Python bytecodes (e.g. it may remain in use for class-statement bodies) it won't be easy to ask the compiler to emit it instead of LOAD_GLOBAL (the trick of just adding exec 'pass' will not work any more;-). So, rewriting the bytecode on the fly (to use LOAD_NAME instead of LOAD_GLOBAL, despite the performance hit) seems to be necessary; if you're willing to take those two performance hits (at decoration time, and again each time the function is called) I think you could develop the necessary bytecode hacks even today. This wouldn't be that much slower than just assigning local variables to locals by hand, and it would allow assignments in the straightforward way as well. The big performance hit comes from the compiler having no clue about what you're doing (exactly the crucial hint that assigning local variables by hand DOES give the compiler;-) There'd be some gotchas, so extra care is required, but it seems like for the OP's particular use case of a complex math calculation script, it would be a decent solution. Making such complex calculations even slower doesn't look great to me. I understand where the OP is coming from. 
I've done flight simulations in Java where there are lot of complex calculations using symbols. This is a typical formula (drag force calculation) that I would NOT want to have to use self.xxx for: FX_wind = -0.5 * rho * Vsq * Sref * (C_D_0 + C_D_alphasq*alpha*alpha + C_D_esq*e*e) If ALL the names in every formula always refer to nothing but instance variables (no references to globals or builtins such as sin, pi, len, abs, and so on, by barenames) then there might be better tricks, ones that rely on that knowledge to actually make things *faster*, not slower. But they'd admittedly require a lot more work (basically a separate specialized compiler to generate bytecode for these cases). Alex -- http://mail.python.org/mailman/listinfo/python-list
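The hand-hoisting idiom implied above (assign instance attributes to locals once, then write the formula in plain math) can be sketched like this; all coefficient values are made up purely for illustration:

```python
class Aircraft:
    def __init__(self):
        # illustrative, physically arbitrary coefficients
        self.rho, self.Vsq, self.Sref = 1.225, 900.0, 16.0
        self.C_D_0, self.C_D_alphasq, self.C_D_esq = 0.02, 0.3, 0.05
        self.alpha, self.e = 0.05, 0.8

    def drag_force(self):
        # hoist attributes into locals by hand: the formula then reads
        # like the math, and local lookups are faster than self.xxx
        rho, Vsq, Sref = self.rho, self.Vsq, self.Sref
        C_D_0, C_D_alphasq, C_D_esq = self.C_D_0, self.C_D_alphasq, self.C_D_esq
        alpha, e = self.alpha, self.e
        return -0.5 * rho * Vsq * Sref * (
            C_D_0 + C_D_alphasq * alpha * alpha + C_D_esq * e * e)

print(Aircraft().drag_force())
```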
Re: unexpected behavior: did i create a pointer?
Arnaud Delobelle [EMAIL PROTECTED] wrote: ...

>>> def lower_list(L):
...     for i, x in enumerate(L):
...         L[i] = x.lower()
...
>>> s = ['STRING']
>>> lower_list(s)
>>> print s == ['string']
True
>>> def lower_string(s):
...     s = s.lower()
...
>>> s = "STRING"
>>> lower_string(s)

Let's see what happens here: when lower_string(s) is called, the 's' which is local to lower_string is made to point to the same object as the global s (i.e. the string object with value "STRING"). In the body of the function, the statement s = s.lower() makes the local 's' point to a new string object returned by s.lower(). Of course this has no effect on what object the global 's' points to. Yep, the analogy with C pointers would work fine here:

void lower_string(char* s) { s = whatever; }

would fail to have the intended effect in C just as its equivalent does in Python (in both Python and C, rebinding the local name s has no effect on the caller of lower_string). Add an indirection:

void lower_list(item* L) { ... L[i] = something; }

this indirection (via indexing) *does* modify the memory area (visible by the caller) to which L points. The difference between name=something and name[i]=something is so *HUGE* in C (and in Python) that anybody who doesn't grok that difference just doesn't know or understand any C (nor any Python). What I think is a more dangerous misconception is to think that the assignment operator (=) has the same meaning in C and python.
I've seen the prevalence of that particular misconception drop dramatically over the years, as a growing fraction of the people who come to Python after some previous programming experience become more and more likely to have been exposed to *Java*, where assignment semantics are very close to Python (despite Java's unfortunate complication with unboxed elementary scalar types, in practice a vast majority of occurrences of a=b in Java have just the same semantics as they do in Python); teaching Python semantics to people with Java exposure is trivially easy (moving from ALMOST every variable is an implicit reference -- excepting int and float ones to EVERY variable is an implicit reference...). Alex -- http://mail.python.org/mailman/listinfo/python-list
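The mutation-versus-rebinding contrast from the exchange above, as one self-contained Python 3 snippet:

```python
def lower_list(L):
    for i, x in enumerate(L):
        L[i] = x.lower()      # mutates the very list the caller passed in

def lower_string(s):
    s = s.lower()             # rebinds only the local name

words = ["STRING"]
lower_list(words)
assert words == ["string"]    # the caller sees the mutation

text = "STRING"
lower_string(text)
assert text == "STRING"       # the caller is unaffected by the rebinding
```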
Re: Using s.sort([cmp[, key[, reverse]]]) to sort a list of objects based on a attribute
Stefan Arentz [EMAIL PROTECTED] wrote: Miki [EMAIL PROTECTED] writes: steps.sort(key=lambda s: s.time) This is why attrgetter in the operator module was invented. from operator import attrgetter ... steps.sort(key=attrgetter('time')) Personally I prefer the anonymous function over attrgetter :) However, Python disagrees with you...: brain:~ alex$ python -mtimeit -s'from operator import attrgetter; L=map(complex,xrange(999))' 'sorted(L, key=lambda x:x.real)' 1000 loops, best of 3: 567 usec per loop brain:~ alex$ python -mtimeit -s'from operator import attrgetter; L=map(complex,xrange(999))' 'sorted(L, key=attrgetter("real"))' 1000 loops, best of 3: 367 usec per loop A speed-up of 35% is a pretty clear indicator of what _Python_ prefers in this situation:-). Alex -- http://mail.python.org/mailman/listinfo/python-list
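Both key functions produce identical orderings; attrgetter merely avoids a Python-level lambda call per comparison key, which is where the measured speedup comes from:

```python
from operator import attrgetter

L = [complex(i, -i) for i in (3, 1, 2)]
by_lambda = sorted(L, key=lambda x: x.real)
by_getter = sorted(L, key=attrgetter("real"))
# same result either way; attrgetter is just the faster spelling
assert by_lambda == by_getter == [1 - 1j, 2 - 2j, 3 - 3j]
```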
Re: concise code (beginner)
bambam [EMAIL PROTECTED] wrote: O(n) to find the element you wish to remove and move over everything after it, Is that how lists are stored in cPython? It seems unlikely? So-called lists in Python are stored contiguously in memory (more like vectors in some other languages), so e.g. L[n] is O(1) [independent from n] but removing an element is O(N) [as all following items need to be shifted 1 place down]. Alex -- http://mail.python.org/mailman/listinfo/python-list
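The complexity claims above can be made concrete; collections.deque is the standard alternative when O(1) removal from the front matters:

```python
from collections import deque

L = list(range(5))
del L[0]                   # O(N): the four remaining items all shift down
assert L == [1, 2, 3, 4]
assert L[2] == 3           # indexing stays O(1) in a contiguous list

d = deque(range(5))
d.popleft()                # O(1): deques remove at either end without shifting
assert list(d) == [1, 2, 3, 4]
```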
Re: Class design (information hiding)
Gregor Horvath [EMAIL PROTECTED] wrote: Alexander Eisenhuth schrieb: I'm wondering how the information hiding in Python is meant. As I understand it there doesn't exist a public / protected / private mechanism, but a '_' and '__' naming convention. As I figured out there is only public and private possible, speaking in C++ terms. Are you all happy with it? What does the zen of Python say to that design? (protected is useless?) My favourite thread to this FAQ: http://groups.google.at/group/comp.lang.python/browse_thread/thread/2c85d6412d9e99a4/b977ed1312e10b21#b977ed1312e10b21 Why, thanks for the pointer -- I'm particularly proud of having written The only really workable way to develop large software projects, just as the only really workable way to run a large business, is a state of controlled chaos. *before* I had read Brown and Eisenhardt's Competing on the Edge: Strategy as Structured Chaos (at that time I had no real-world interest in strategically managing a large business -- it was based on mere intellectual curiosity and extrapolation that I wrote controlled chaos where B and E have structured chaos so well and clearly explained;-). BTW, if you want to read my entire post on that Austrian server, the most direct URL is http://groups.google.at/group/comp.lang.python/msg/b977ed1312e10b21? ... Alex -- http://mail.python.org/mailman/listinfo/python-list
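The '_' and '__' conventions mentioned in the question amount to this (privacy by convention and name mangling, never enforcement):

```python
class C:
    def __init__(self):
        self._internal = 1   # single underscore: "private" by courtesy only
        self.__hidden = 2    # double underscore: name-mangled to _C__hidden

c = C()
assert c._internal == 1      # nothing stops access, it is just discouraged
assert c._C__hidden == 2     # mangling is a collision-avoidance device,
                             # not an access barrier
```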
Re: list index()
Neil Cerutti [EMAIL PROTECTED] wrote: It's probable that a simpler implementation using slice operations will be faster for shortish lengths of subseq. It was certainly easier to get it working correctly. ;)

def find(seq, subseq):
    for i, j in itertools.izip(xrange(len(seq) - len(subseq) + 1),
                               xrange(len(subseq), len(seq) + 1)):
        if subseq == seq[i:j]:
            return i
    return -1

Simpler yet (though maybe slower!-):

def find(seq, subseq):
    L = len(subseq)
    for i in xrange(0, len(seq) - L + 1):
        if subseq == seq[i:i+L]:
            return i
    return -1

also worth trying (may be faster in some cases, e.g. when the first item of the subsequence occurs rarely in the sequence):

def find(seq, subseq):
    L = len(subseq)
    firstitem = subseq[0]
    end = len(seq) - L + 1
    i = -1
    while 1:
        try:
            i = seq.index(firstitem, i + 1, end)
        except ValueError:
            return -1
        if subseq == seq[i:i+L]:
            return i

For particularly long sequences (with hashable items) it might even be worth trying variants of Boyer-Moore, Horspool, or Knuth-Morris-Pratt; while these search algorithms are mostly intended for text strings, since you need tables indexed by the item values, using dicts for such tables might yet be feasible (however, the program won't be quite that simple). Benchmarking of various possibilities on typical input data for your application is recommended, as performance may vary! Alex -- http://mail.python.org/mailman/listinfo/python-list
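A Python 3 rendition of the simple sliding-window variant; note the + 1 in the loop bound, which is needed so that a match sitting flush against the end of the sequence is found:

```python
def find(seq, subseq):
    # return the index of the first occurrence of subseq in seq, else -1
    L = len(subseq)
    for i in range(len(seq) - L + 1):
        if subseq == seq[i:i + L]:
            return i
    return -1

assert find([1, 2, 3, 4], [3, 4]) == 2   # match at the very end
assert find([1, 2, 3, 4], [2, 3]) == 1
assert find([1, 2, 3, 4], [4, 5]) == -1
```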
Re: Sort of an odd way to debug...
xkenneth [EMAIL PROTECTED] wrote: ... What I'd like to do, is define a base class. This base class would have a function, that gets called every time another function is called (regardless of whether in the base class or a derived class), and prints the doc string of each function whenever it's called. I'd like to be able to do this without explicitly specifying the function inside all of the other functions of a base class or derived class. So you need to write a metaclass that wraps every function attribute of the class into a wrapper performing such prints as you desire. The metaclass will be inherited by subclasses (unless metaclass conflicts intervene in multiple-inheritance situations). You don't appear to need the printing-wrapper to be a method, and it's simpler to have it be a freestanding function, such as:

import functools

def make_printing_wrapper(f):
    @functools.wraps(f)
    def wrapper(*a, **k):
        print f.__doc__
        return f(*a, **k)
    return wrapper

Now, the metaclass could be, say:

import inspect

class MetaWrapFunctions(type):
    def __init__(cls, name, bases, attrs):
        for k, f in attrs.iteritems():
            if inspect.isfunction(f):
                setattr(cls, k, make_printing_wrapper(f))
        type.__init__(cls, name, bases, attrs)

and the base class:

class Base:
    __metaclass__ = MetaWrapFunctions

Now, the code:

class Derived(Base):
    def printSomething(self, something):
        """This function prints something"""
        # the wrapper gets called here
        print something

Derived().printSomething('something')

Output would be:

This function prints something
something

Should behave as you described. I have not tested the code I'm suggesting (so there might be some errors of detail) but the general idea should work. Alex -- http://mail.python.org/mailman/listinfo/python-list
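A tested Python 3 rendition of the sketch above. Two details matter: the metaclass goes in as the metaclass= keyword rather than __metaclass__, and the wrapper must be attached with setattr on the class (mutating the attrs dict after the class object exists has no effect):

```python
import functools
import inspect

def make_printing_wrapper(f):
    @functools.wraps(f)
    def wrapper(*a, **k):
        print(f.__doc__)        # announce the docstring on every call
        return f(*a, **k)
    return wrapper

class MetaWrapFunctions(type):
    def __init__(cls, name, bases, attrs):
        for k, f in attrs.items():
            if inspect.isfunction(f):
                setattr(cls, k, make_printing_wrapper(f))
        super().__init__(name, bases, attrs)

class Base(metaclass=MetaWrapFunctions):
    pass

class Derived(Base):
    def print_something(self, something):
        """This function prints something"""
        print(something)

Derived().print_something("something")
# prints:
# This function prints something
# something
```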
Re: Can you use -getattr- to get a function in the current module?
Sergio Correia [EMAIL PROTECTED] wrote: This works: # Module spam.py import eggs print getattr(eggs, 'omelet')(100) That is, I just call the function omelet inside the module eggs and evaulate it with the argument 100. But what if the function 'omelet' is in the module where I do the getattr (that is, in spam.py). If I do any of this print getattr(spam, 'omelet')(100) print getattr('','omelet')(100) print getattr('omelet')(100) It wont work. Any ideas? globals() returns a dict of all globals defined so far, so, _after_ 'def omelet ...' has executed, globals()['omelet'](100) should be OK. Alex -- http://mail.python.org/mailman/listinfo/python-list
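The globals() suggestion in one runnable piece (omelet here is the hypothetical function from the question):

```python
def omelet(n):
    return n * 2

# once the def statement has executed, 'omelet' is a key in globals(),
# so the same lookup-by-name trick works within the current module
assert globals()["omelet"](100) == 200
```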
Re: Why is this loop heavy code so slow in Python? Possible Project Euler spoilers
Mark Dickinson [EMAIL PROTECTED] wrote: On Sep 2, 9:45 am, [EMAIL PROTECTED] wrote: [snip code] Thanks for that. I realise that improving the algorithm will speed things up. I wanted to know why my less than perfect algorithm was so much slower in python than exactly the same algorithm in C. Even when turning off gcc's optimiser with the -O0 flag, the C version is still 100 times quicker. Well, for one thing, you're creating half a million xrange objects in the course of the search. All the C code has to do is increment a few integers. I don't think the creation of xrange objects is a meaningful part of Python's execution time here. Consider:

M = 1000
solutions = [0] * M

def f2():
    """a*a + b*b precalculated"""
    for a in xrange(1, M):
        a2 = a*a
        for b in xrange(1, M - a):
            s = a2 + b*b
            for c in xrange(1, M - a - b):
                if s == c*c:
                    solutions[a+b+c] += 1

def f3(M=M, solutions=solutions):
    """pull out all the stops"""
    xrs = [xrange(1, k) for k in xrange(0, M+1)]
    for a in xrs[M]:
        a2 = a*a
        for b in xrs[M-a]:
            s = a2 + b*b
            for c in xrs[M-a-b]:
                if s == c*c:
                    solutions[a+b+c] += 1

import time
t = time.time()
f2()
e = time.time()
print e-t, max(xrange(M), key=solutions.__getitem__)

solutions = [0]*M
t = time.time()
f3(M, solutions)
e = time.time()
print e-t, max(xrange(M), key=solutions.__getitem__)

f2 is Arnaud's optimization of the OP's algorithm by simple hoisting; f3 further hoists the xrange creation -- it creates only 1000 such objects rather than half a million. And yet...:

brain:~/py25 alex$ python puz.py
34.6613101959 840
36.2000119686 840
brain:~/py25 alex$

...which suggests that creating an xrange object is _cheaper_ than indexing a list... Alex -- http://mail.python.org/mailman/listinfo/python-list
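For comparison, the algorithmic improvement everyone keeps alluding to (this looks like Project Euler problem 39: the perimeter up to 1000 with the most integer right triangles): derive c from a and b with an integer square root instead of searching for it, turning the O(M^3) triple loop into O(M^2). A Python 3 sketch:

```python
from math import isqrt

def best_perimeter(M):
    # count right triangles (a <= b < c) by perimeter, deriving c
    solutions = [0] * (M + 1)
    for a in range(1, M):
        a2 = a * a
        for b in range(a, M - a):
            s = a2 + b * b
            c = isqrt(s)
            if c * c == s and a + b + c <= M:
                solutions[a + b + c] += 1
    return max(range(M + 1), key=solutions.__getitem__)

print(best_perimeter(1000))  # 840, matching the runs quoted above
```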
Re: localizing a sort
Ricardo Aráoz [EMAIL PROTECTED] wrote: Peter Otten wrote: ... print ''.join(sorted(a, cmp=lambda x,y: locale.strcoll(x,y))) aeiouàáäèéëìíïòóöùúü The lambda is superfluous. Just write cmp=locale.strcoll instead. No it is not : print ''.join(sorted(a, cmp=locale.strcoll(x,y))) Traceback (most recent call last): File input, line 1, in module TypeError: strcoll expected 2 arguments, got 0 You need the lambda to assign both arguments. No, your mistake is that you're CALLING locale.strcoll, while as Peter suggested you should just PASS it as the cmp argument. I.e., ''.join(sorted('ciao', cmp=locale.strcoll)) Using key=locale.strxfrm should be faster (at least when you're sorting long-enough lists of strings), which is why strxfrm (and key=...:-) exist in the first place, but cmp=locale.strcoll, while usually slower, is entirely correct. That lambda _IS_ superfluous, as Peter said. Alex -- http://mail.python.org/mailman/listinfo/python-list
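In Python 3 the cmp= argument is gone, so the same two options look like this (wrap strcoll with functools.cmp_to_key, or use strxfrm as a key directly):

```python
import locale
from functools import cmp_to_key

words = ["b", "a", "c"]
# comparison-function style: wrap strcoll for the key= protocol
assert sorted(words, key=cmp_to_key(locale.strcoll)) == ["a", "b", "c"]
# key-function style: transform each item once, usually faster
assert sorted(words, key=locale.strxfrm) == ["a", "b", "c"]
```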
Re: code check for modifying sequence while iterating over it?
Neal Becker [EMAIL PROTECTED] wrote: After just getting bitten by this error, I wonder if any pylint, pychecker variant can detect this error? I know pychecker can't (and I doubt pylint can, but I can't download the latest version to check as logilab's website is temporarily down for maintenance right now). It's a very thorny problem to detect a reasonable subset of likely occurrences of this bug by static analysis only, i.e., without running the code:-( Alex -- http://mail.python.org/mailman/listinfo/python-list
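For readers who haven't hit this bug: here is the failure mode being discussed, and the usual safe idiom:

```python
# The bug: removing items from the list being iterated skips elements,
# because the iterator's index no longer lines up with the shifted list
L = [1, 2, 2, 3]
for x in L:
    if x == 2:
        L.remove(x)
assert L == [1, 2, 3]      # one 2 survived: the iterator jumped past it

# A safe idiom: iterate over a copy (or build a new list instead)
L = [1, 2, 2, 3]
for x in L[:]:
    if x == 2:
        L.remove(x)
assert L == [1, 3]
```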
Re: Google spreadsheets
iapain [EMAIL PROTECTED] wrote: On Aug 31, 5:40 pm, Michele Simionato [EMAIL PROTECTED] wrote: I would like to upload a tab-separated file to a Google spreadsheet from Python. Does anybody have a recipe handy? TIA, Michele Simionato Probably it's irrelevant to python. You should see the Google Spreadsheets API and use it in your python application. http://code.google.com/apis/spreadsheets/ For Python-specific use, you probably want to get the Python version of the GData client libraries, http://code.google.com/p/gdata-python-client/ ; an example of using it with a spreadsheet is at http://gdata-python-client.googlecode.com/svn/trunk/samples/spreadsheets/spreadsheetExample.py . Alex -- http://mail.python.org/mailman/listinfo/python-list
Re: Why is this loop heavy code so slow in Python? Possible Project Euler spoilers
Paul Rubin http://[EMAIL PROTECTED] wrote: [EMAIL PROTECTED] (Alex Martelli) writes: ...which suggests that creating an xrange object is _cheaper_ than indexing a list... Why not re-use the xrange instead of keeping a list around?

Python 2.4.4 (#1, Oct 23 2006, 13:58:00)
>>> a = xrange(3)
>>> print list(a)
[0, 1, 2]
>>> print list(a)
[0, 1, 2]

Reusing xranges is exactly what my code was doing -- at each for loop you need an xrange(1, k) for a different value of k, which is why you need some container to keep them around (and a list of xrange objects is the simplest applicable container). Your suggestion doesn't appear to make any sense in the context of the optimization problem at hand -- what list(...) calls are you thinking of?! Please indicate how your suggestion would apply in the context of:

def f3(M=M, solutions=solutions):
    """pull out all the stops"""
    xrs = [xrange(1, k) for k in xrange(0, M+1)]
    for a in xrs[M]:
        a2 = a*a
        for b in xrs[M-a]:
            s = a2 + b*b
            for c in xrs[M-a-b]:
                if s == c*c:
                    solutions[a+b+c] += 1

Alex -- http://mail.python.org/mailman/listinfo/python-list
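The reusability being debated holds for Python 3's range just as it did for xrange: a range object is not a one-shot iterator, so caching one per bound is legitimate:

```python
# one range object per upper bound, reused on every pass of the loops
xrs = [range(1, k) for k in range(6)]
r = xrs[4]                     # this is range(1, 4)
assert list(r) == [1, 2, 3]
assert list(r) == [1, 2, 3]    # iterating does not exhaust it
```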
Re: Why is this loop heavy code so slow in Python? Possible Project Euler spoilers
Mark Dickinson [EMAIL PROTECTED] wrote: On Sep 2, 12:55 pm, [EMAIL PROTECTED] (Alex Martelli) wrote: Mark Dickinson [EMAIL PROTECTED] wrote: Well, for one thing, you're creating half a million xrange objects in the course of the search. All the C code has to do is increment a few integers. I don't think the creation of xrange objects is a meaningful part of Python's execution time here. Consider: [...] Agreed---I just came to the same conclusion after doing some tests. So maybe it's the billion or so integer objects being created that dominate the running time? (Not sure how many integer objects actually are created here: doesn't Python cache *some* small integers?) Yep, some, say -5 to 100 or thereabouts; it also caches on a free-list all the empty integer-objects it ever has (rather than returning the memory for the system), so I don't think there's much optimization to be had on that score either. Alex -- http://mail.python.org/mailman/listinfo/python-list
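The small-int cache is observable from Python, though it is an implementation detail; in current CPython the cached range is about -5..256 rather than the -5..100 guessed above (int(...) is used so the compiler cannot fold the literals into shared constants):

```python
a = int("100")
b = int("100")
assert a is b          # both names refer to the one cached object
c = int("10000")
d = int("10000")
print(c is d)          # typically False in CPython: fresh objects
                       # outside the cache range
```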
Re: Adding attributes stored in a list to a class dynamically.
Nathan Harmston [EMAIL PROTECTED] wrote: Hi, Sorry if the subject line of post is wrong, but I think that is what this is called. I want to create objects with

class Coconuts(object):
    def __init__(self, a, b, *args, **kwargs):
        self.a = a
        self.b = b

def spam( l ):
    return Coconuts( l.a, l.b, l.attributes )

l is a parsed line of a file which is a tuple wrapped with attrcol..with attributes a, b and attributes (which is a list of strings in the format key=value ie... [ id=bar, test=1234, doh=qwerty ] ). I want to add attributes to Coconuts so that I can do print c.id, c.test, c.doh However I'm not sure how to do this: how can I assign args, kwargs within the constructor of Coconuts and how can I deconstruct the list to form the correct syntax to be able to be used for args, kwargs.

If you want to pass the attributes list it's simpler to do that directly, avoiding *a and **k constructs. E.g.:

def __init__(self, a, b, attrs):
    self.a = a
    self.b = b
    for attr in attrs:
        name, value = attr.split('=')
        setattr(self, name, value)

You may want to add some better error-handling (this code just raises exceptions if any item in attrs has !=1 occurrences of the '=' sign, etc, etc), but I hope this gives you the general idea. Note that you'll have trouble accessing attributes that just happen to be named like a Python keyword, e.g. if you have yield=23 as one of your attributes you will NOT be able to just say c.yield to get at that attribute. Also, I'm assuming it's OK for all of these attributes' values to be strings, etc, etc. Alex -- http://mail.python.org/mailman/listinfo/python-list
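A self-contained rendering of the constructor suggested above; the class name and the 'key=value' attribute format come from the thread, while the sample values below are illustrative:

```python
class Coconuts(object):
    """Carries a and b plus arbitrary attributes parsed from strings."""

    def __init__(self, a, b, attrs):
        self.a = a
        self.b = b
        for attr in attrs:
            # unpacking raises ValueError unless there is exactly one '='
            name, value = attr.split('=')
            setattr(self, name, value)

c = Coconuts(1, 2, ['id=bar', 'test=1234', 'doh=qwerty'])
```

After this, `print(c.id, c.test, c.doh)` works as the poster wanted; as noted in the reply, all parsed values stay strings.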
Re: list index()
Marc 'BlackJack' Rintsch [EMAIL PROTECTED] wrote: On Sat, 01 Sep 2007 13:44:28 -0600, Michael L Torrie wrote: Alex Martelli wrote: is the one obvious way to do it (the set(...) is just a simple and powerful optimization -- checking membership in a set is roughly O(1), while checking membership in a list of N items is O(N)...). Depending on a how a set is stored, I'd estimate any membership check in a set to be O(log N). Sets are stored as hash tables so membership check is O(1) just like Alex said. Roughly O(1), as I said, because of the usual issues with cost of hashing, potential hashing conflicts, re-hashing (which requires thinking in terms of *amortized* big-O, just like, say, list appends!), etc, just like for any hash table implementation (though Python's, long used and finely tuned in dicts then adopted for sets, is an exceedingly good implementation, it IS possible to artificially construct a worst case -- e.g., set(23+sys.maxint*i*2+i for i in xrange(24,199))...) Alex -- http://mail.python.org/mailman/listinfo/python-list
Re: status of Programming by Contract (PEP 316)?
Ricardo Aráoz [EMAIL PROTECTED] wrote: ... We should remember that the level of security of a 'System' is the same as the level of security of its weakest component, Not true (not even for security, much less for reliability which is what's being discussed here). It's easy to see how this assertion of yours is totally wrong in many ways... Example 1: a toy system made up of subsystem A (which has a probability of 90% of working right) whose output feeds into subsystem B (which has a probability of 80% of working right). A's failures and B's failures are statistically independent (no common-mode failures, &c.). The ``level of goodness'' (probability of working right) of the weakest component, B, is 80%; but the whole system has a ``level of goodness'' (probability of working right) of just 72%, since BOTH subsystems must work right for the whole system to do so. 72 != 80 and thus your assertion is false. More generally: subsystems in series with independent failures can produce a system that's weaker than its weakest component. Example 2: another toy system made up of subsystems A1, A2 and A3, each trying to transform the same input supplied to all of them into a 1 bit result; each of these systems works right 80% of the time, statistically independently (no common-mode failures, &c.). The three subsystems' results are reconciled by a simple majority-voting component M which emits as the system's result the bit value that's given by two out of three of the Ai subsystems (or, of course, the value given unanimously by all) and has extremely high reliability thanks to its utter simplicity (say 99.9%, high enough that we can ignore M's contribution to system failures in a first-order analysis). 
The whole system will fail when all Ai fail together (probability 0.2**3) or when 2 of them fail while the third one is working (probability 3*0.8*0.2**2): 0.2**3 + 3*0.2**2*0.8 = 0.104. So, the system as a whole has a level of goodness (probability of working right) of almost 90% -- again different from the weakest component (each of the three Ai's), in this case higher. More generally: subsystems in parallel (arranged so as to be able to survive the failure of some subset) with independent failures can produce a system that's stronger than its weakest component. Even in the field of security, which (changing the subject...) you specifically refer to, similar considerations apply. If your assertion was correct, then removing one component would never WEAKEN a system's security -- it might increase it if it was the weakest, otherwise it would leave it intact. And yet, a strong and sound tradition in security is to require MULTIPLE components to be all satisfied e.g. for access to secret information: e.g. the one wanting access must prove their identity (say by retinal scan), possess a physical token (say a key) AND know a certain secret (say a password). Do you really think that, e.g., removing the need for the retinal scan would make the system's security *STRONGER*...? It would clearly weaken it, as a would-be breaker would now need only to purloin the key and trick the secret password out of the individual knowing it, without the further problem of falsifying a retinal scan successfully. Again, such security systems exist and are traditional exactly because they're STRONGER than their weakest component! So, the implication accompanying your assertion, that strengthening a component that's not the weakest one is useless, is also false. It may indeed have extremely low returns on investment, depending on system's structure and exact circumstances, but then again, it may not; nothing can be inferred about this ROI issue from the consideration in question. 
Alex -- http://mail.python.org/mailman/listinfo/python-list
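The two toy-system numbers worked out in the post check out; this is just the post's arithmetic made executable, not new modelling:

```python
# Example 1: two subsystems in series -- BOTH must work right
p_a, p_b = 0.9, 0.8
series_ok = p_a * p_b                         # 0.72, below the weakest's 0.80

# Example 2: 2-out-of-3 majority voting over 80%-reliable subsystems
p = 0.8
fail = (1 - p) ** 3 + 3 * (1 - p) ** 2 * p    # all three fail, or exactly two
majority_ok = 1 - fail                        # almost 90%, above the weakest
```

Series composition drags the system below its weakest component; redundancy with voting lifts it above.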
Re: status of Programming by Contract (PEP 316)?
Russ [EMAIL PROTECTED] wrote: ... the inputs. To test the post-conditions, you just need a call at the bottom of the function, just before the return, ... there's nothing to stop you putting the calls before every return. Oops! I didn't think of that. The idea of putting one before every return certainly doesn't appeal to me. So much for that idea.

try:
    blah blah with as many return statements as you want
finally:
    something that gets executed unconditionally at the end

You'll need some convention such as all the return statements are of the same form ``return result'' (where the result may be computed differently each time), but that's no different from the conventions you need anyway to express such things as ``the value that foobar had at the time the function was called''. Alex -- http://mail.python.org/mailman/listinfo/python-list
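A minimal sketch of that convention: every return goes through a `result` name, and the finally clause plays the postcondition check. The function and its postcondition (a non-negative result) are invented for illustration, not taken from the thread:

```python
def floor_sqrt(n):
    """Integer square root, clamping negative inputs to 0."""
    result = None
    try:
        if n <= 0:
            result = 0
            return result
        result = int(n ** 0.5)
        return result
    finally:
        # postcondition check: runs no matter which return was taken
        assert result is not None and result >= 0, "postcondition violated"
```

Both early and normal returns pass through the same check, which is exactly what the ``one call before every return'' style was trying to achieve by hand.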
Re: status of Programming by Contract (PEP 316)?
Ricardo Aráoz [EMAIL PROTECTED] wrote: ... We should remember that the level of security of a 'System' is the same as the level of security of it's weakest component, ... You win the argument, and thanks you prove my point. You typically concerned yourself with the technical part of the matter, yet you completely ignored the point I was trying to make. That's because I don't particularly care about the point you were trying to make (either for or against -- as I said, it's a case of ROI for different investments [in either security, or, more germanely to this thread, reliability] rather than of useful/useless classification of the investments), while I care deeply about proper system thinking (which you keep failing badly on, even in this post). In the third part of your post, regarding security, I think you went off the road. The weakest component would not be one of the requisites of access, the weakest component I was referring to would be an actual APPLICATION, Again, F- at system thinking: a system's components are NOT just applications (what's the alternative to their being actual, btw?), nor is it necessarily likely that an application would be the weakest one of the system's components (these wrong assertions are in addition to your original error, which you keep repeating afterwards). For example, in a system where access is gained *just* by knowing a secret (e.g., a password), the weakest component is quite likely to be that handy but very weak architectural choice -- or, seen from another viewpoint, the human beings that are supposed to know that password, remember, and keep it secret. If you let them choose their password, it's too likely to be fred or other easily guessable short word; if you force them to make it at least 8 characters long, it's too likely to be fredfred; if you force them to use length, mixed case and digits, it's too likely to be Fred2Fred. 
If you therefore decide that passwords chosen by humans are too weak and generate one for them, obtaining, say, FmZACc2eZL, they'll write it down (perhaps on a post-it attached to their screen...) because they just can't commit to memory a lot of long really-random strings (and nowadays the poor users are all too likely to need to memorize far too many passwords). A clever attacker has many other ways to try to steal passwords, from social engineering (pose as a repair person and ask the user to reveal their password as a prerequisite of obtaining service), to keystroke sniffers of several sorts, fake applications that imitate real ones and steal the password before delegating to the real apps, etc, etc. Similarly, if all that's needed is a physical token (say, some sort of electronic key), that's relatively easy to purloin by traditional means, such as pickpocketing and breaking-and-entering; certain kind of electronic keys (such as the passive unencrypted RFID chips that are often used e.g. to control access to buildings) are, in addition, trivially easy to steal by other (technological) means. Refusing to admit that certain components of a system ARE actually part of the system is weak, blinkered thinking that just can't allow halfway decent system design -- be that for purposes of reliability, security, availability, or whatever else. Indeed, if certain part of the system's architecture are OUTSIDE your control (because you can't redesign the human mind, for example;-), all the more important then to make them the focus of the whole design (since you must design AROUND them, and any amelioration of their weaknesses is likely to have great ROI -- e.g., if you can make the users take a 30-minutes short course in password security, and accompany that with a password generator that makes reasonably memorable though random ones, you're going to get substantial returns on investment in any password-using system's security). e.g. an ftp server. 
In that case, if you have several applications running your security will be the security of the weakest of them. Again, false as usual, and for the same reason I already explained: if your system can be broken by breaking any one of several components, then it's generally WEAKER than the weakest of the components. Say that you're running on the system two servers, an FTP one that can be broken into by 800 hackers in the world, and a SSH one that can only be broken into by 300 hackers in the world; unless every single one of the hackers who are able to break into the SSH server is *also* able to break into the FTP one (a very special case indeed!), there are now *MORE* than 800 hackers in the world that can break into your system as a whole -- in other words, again and no matter how often you repeat falsities to the contraries without a shred of supporting argument, your assertion is *FALSE*, and in this case your security is *WEAKER* than the security of the weaker of the two components. I do not really much care what point(s) you are trying to make through your
Re: status of Programming by Contract (PEP 316)?
Paul Rubin http://[EMAIL PROTECTED] wrote: ... Hi Alex, I'm a little confused: does Production Systems mean stuff like the Google search engine, which (as you described further up in your message) achieves its reliability at least partly by massive redundancy and failover when something breaks? The infrastructure supporting that engine (and other things), yes. In that case why is it so important that the software be highly reliable? Is a software Think common-mode failures: if a program has a bug, so do all identical copies of that program. Redundancy works for cheap hardware because the latter's unreliability is essentially free of common-mode failures (when properly deployed): it wouldn't work against a design mistake in the hardware units. Think of the famous Pentium division bug: no matter how many redundant but identical such buggy CPUs you place in parallel to compute divisions, in the error cases they'll all produce the same wrong results. Software bugs generally work (or, rather, fail to work;-) similarly to hardware design bugs. There are (for both hw and sw) also classes of mistakes that don't quite behave that way -- occasional glitches that are not necessarily repeatable and are heavily state-dependent (race conditions in buggy multitasking SW, for example; and many more examples for HW, where flaky behavior may be triggered by, say, temperature situations). Here, from a systems viewpoint, you might have a system that _usually_ says that 10/2 is 5, but once in a while says it's 4 instead (as opposed to the Pentium division bug case where it would always say 4) -- this is much more likely to be caused by flaky HW, but might possibly be caused by the SW running on it (or the microcode in between -- not that it makes much of a difference one way or another from a systems viewpoint). Catching such issues can, again, benefit from redundancy (and monitoring, watchdog systems, health and sanity checks running in the background, c). 
Quis custodiet custodes is an interesting problem here, since bugs or flakiness in the monitoring/watchdog infrastructure have the potential to do substantial global harm; one approach is to invest in giving that infrastructure an order of magnitude more reliability than the systems it's overseeing (for example by using more massive and *simple* redundancy, and extremely straightforward architectures). There's ample literature in the matter, but it absolutely needs a *systems* approach: focusing just on the HW, just on the SW, or just on the microcode in-between;-), just can't help much. some good hits they should display) but the server is never actually down, can you still claim 100% uptime? I've claimed nothing (since all such measurements and methodologies would no doubt be considered confidential unless and until cleared for publication -- this has been done for a few whitepapers about some aspects of Google's systems, but never to the best of my knowledge for the metasystem as a whole), but rather pointed to http://uptime.pingdom.com/site/month_summary/site_name/www.google.com, a publically available site which does publish its methodology (at http://uptime.pingdom.com/general/methodology); summarizing, as they have no way to check that the results are right for the many sites they keep an eye on, they rely on the HTTP result codes (as well as validity of HTTP headers returned, and of course whether the site does return a response at all). problem. Of course then there's a second level system to manage the restarts that has to be very reliable, but it doesn't have to deal with much weird concocted input the way that a public-facing internet application has to. Indeed, Production Systems' software does *not* have to deal with input from the general public -- it's infrastructure, not user-facing applications (except in as much as the users are Google engineers or operators, say). 
IOW, it's *exactly* the code that has to be very reliable (nice to see that we agree on this;-), and therefore, if as you then said Russ's point stands, would NOT be in Python -- but it is. So, I disagree about the standing status of his so-called point. Therefore I think Russ's point stands, that we're talking about a different sort of reliability in these highly redundant systems, than in the systems Russ is describing. Russ specifically mentioned *mission-critical applications* as being outside of Python's possibilities; yet search IS mission critical to Google. Yes, reliability is obtained via a systems approach, considering HW, microcode, SW, and other issues yet such as power supplies, cooling units, network cables, etc, not as a single opaque big box but as an articulated, extremely complex and large system that needs testing, monitoring, watchdogging, etc, at many levels -- there is no other real way to make systems reliable (you can't do it by just looking at components in isolation). Note that this does have costs and therefore it needs to be
Re: status of Programming by Contract (PEP 316)?
Michele Simionato [EMAIL PROTECTED] wrote: ... I would not call that an attack. If you want to see an attack, wait for Alex replying to you observations about the low quality of code at Google! ;) I'm not going to deny that Google Groups has glitches, particularly in its user interface (that's why I'm using MacSOUP instead, even though Groups, were it perfect, would offer me a lot of convenience). We have a LOT of products (see http://www.google.com/intl/en/options/index.html, plus a few more at http://labs.google.com/; http://en.wikipedia.org/wiki/List_of_Google_products for an overview, http://searchengineland.com/070220-091136.php for a list of more lists...), arguably too many in the light of the It's best to do one thing really, really well ``thing we've found to be true''; given the 70-20-10 rule we use (we spend 70% of our resources on search and ads [and of course infrastructure supporting those;-)], 20% on adjacent businesses such as News, Desktop and Maps, 10% on all the rest combined), products in the other (10%) category may simply not receive sufficient time, resources and attention. We've recently officially raised Apps to the status of a third pillar for Google (after Search and Ads), but I don't know which of our many products are officially within these pillar-level Apps -- maybe a good starting hint is what's currently included in the Premier Edition of Google Apps, i.e.: Gmail (with 99.9% uptime guarantee), Google Talk, Google Calendar, Docs Spreadsheets, Page Creator and Start Page. I do notice that Google Groups is currently not in that elite (but then, neither are other products we also offer in for-pay editions, such as Google Earth and Sketchup) but I have no insider information as to what this means or portends for the future (of course not: if I _did_ have insider information, I could not talk about the subject!-). 
Notice, however, that none of these points depend on use of Python vs (or side by side with) other programming languages, DbC vs (or side by side with) other methodologies, and other such technical and technological issues: rather, these are strategical problems in the optimal allocation of resources that (no matter how abundant they may look on the outside) are always scarce compared to the bazillion ways in which they _could_ be employed -- engineers' time and attention, machines and networking infrastructure, and so forth. Alex -- http://mail.python.org/mailman/listinfo/python-list
Re: list index()
[EMAIL PROTECTED] [EMAIL PROTECTED] wrote: ... Why wouldn't the one obvious way be: def inAnotB(A, B): inA = set(os.listdir(A)) inBs = set(os.listdir(B)) return inA.difference(inBs) If you want a set as the result, that's one possibility (although possibly a bit wasteful as you're building one more set than necessary); I read the original request as implying a sorted list result is wanted, just like os.listdir returns (possibly sorted in case-independent order depending on the underlying filesystem). There's no real added value in destroying inA's ordering by making it a set, when the list comprehension just naturally keeps its ordering. Alex -- http://mail.python.org/mailman/listinfo/python-list
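The ordering point can be seen with plain lists standing in for the os.listdir results (file names invented for the sketch):

```python
inA = ['zoo.txt', 'alpha.txt', 'mid.txt']
inB = ['alpha.txt']

# set difference: correct membership, but no guaranteed ordering
as_set = set(inA).difference(inB)

# list comprehension: same membership, inA's ordering preserved
inBs = set(inB)
ordered = [f for f in inA if f not in inBs]
```

Both give the same files; only the comprehension keeps them in the order os.listdir delivered them.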
Re: status of Programming by Contract (PEP 316)?
Russ [EMAIL PROTECTED] wrote: ... programs. Any idea how much Python is used for flight control systems in commercial transport aircraft or jet fighters? Are there differences in reliability requirements between the parts of such control systems that run on aircraft themselves, and those that run in airports' control towers? Because Python *IS* used in the latter case, cfr http://www.python.org/about/success/frequentis/ ... if on-plane control SW requires hard-real-time response, that might be a more credible reason why Python would be inappropriate (any garbage collected language is NOT a candidate for hard-real-time SW!) than your implied aspersions against Python's reliability. According to http://uptime.pingdom.com/site/month_summary/site_name/www.google.com, Google's current uptime is around 99.99%, with many months at 100% and a few at 99.98% -- and that's on *cheap*, not-that-reliable commodity HW, and in real-world conditions where power can go away, network cables can accidentally get cut, etc. I'm Uber Tech Lead for Production Systems at Google -- i.e., the groups I uber-lead are responsible for some software which (partly by automating things as much as possible) empowers our wondrous Site Reliability Engineers and network specialists to achieve that uptime in face of all the Bad Stuff the world can and does throw at us. Guess what programming language I'm a well-known expert of...? The important question is this: why do I waste my time with bozos like you? Yeah, good question indeed, and I'm asking myself that -- somebody who posts to this group in order to attack the reliability of the language the group is about (and appears to be supremely ignorant about its use in air-traffic control and for high-reliability mission-critical applications such as Google's Production Systems software) might well be considered not worth responding to. 
OTOH, you _did_ irritate me enough that I feel happier for venting in response;-) Oh, FYI -- among the many tasks I undertook in my quarter-century long career was leading and coordinating pilot projects in Eiffel for one employer, many years ago. The result of the pilot was that Eiffel and its DbC features didn't really let us save any of the extensive testing we performed for C++-coded components, and the overall reliability of such extensively tested components was not different in a statistically significant way whether they were coded in C++ or Eiffel; Eiffel did let us catch a few bugs marginally earlier (but then, I'm now convinced that, at that distant time, we used by far too few unit-tests for early bug catching and relied too much on regression and acceptance tests), but that definitely was _not_ enough to pay for itself. DbC and allegedly rigorous compile-time typechecking (regularly too weak due to Eiffel's covariant vs countervariant approach, btw...), based on those empirical results, appear to be way overhyped. A small decorator library supporting DbC would probably be a nice addition to Python, but it should first prove itself in the field by being released and supported as an add-on and gaining wide acceptance: arguments such as yours are definitely NOT going to change that. Alex -- http://mail.python.org/mailman/listinfo/python-list
Re: list index()
[EMAIL PROTECTED] wrote: ... In my case of have done os.listdir() on two directories. I want to see what files are in directory A that are not in directory B. So why would you care about WHERE, in the listdir of B, are to be found the files that are in A but not B?! You should call .index only if you CARE about the position. def inAnotB(A, B): inA = os.listdir(A) inBs = set(os.listdir(B)) return [f for f in inA if f not in inBs] is the one obvious way to do it (the set(...) is just a simple and powerful optimization -- checking membership in a set is roughly O(1), while checking membership in a list of N items is O(N)...). Alex -- http://mail.python.org/mailman/listinfo/python-list
Re: list index()
Ricardo Aráoz [EMAIL PROTECTED] wrote: ... Alex Martelli wrote: [EMAIL PROTECTED] wrote: ... In my case of have done os.listdir() on two directories. I want to see what files are in directory A that are not in directory B. So why would you care about WHERE, in the listdir of B, are to be found the files that are in A but not B?! You should call .index only if you CARE about the position. def inAnotB(A, B): inA = os.listdir(A) inBs = set(os.listdir(B)) return [f for f in inA if f not in inBs] is the one obvious way to do it (the set(...) is just a simple and powerful optimization -- checking membership in a set is roughly O(1), while checking membership in a list of N items is O(N)...). And what is the order of passing a list into a set? O(N)+? Roughly O(N), yes (with the usual caveats about hashing costs, c;-). So, when A has M files and B has N, your total costs are roughly O(M+N) instead of O(M*N) -- a really juicy improvement for large M and N! Alex -- http://mail.python.org/mailman/listinfo/python-list
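The O(M*N) versus O(M+N) claim is easy to measure; the sizes and repetition count below are illustrative, with plain lists of invented names standing in for the os.listdir results:

```python
import timeit

SETUP = (
    "A = ['f%d' % i for i in range(2000)]\n"
    "B = ['f%d' % i for i in range(1000, 3000)]\n"
)

# naive version: 'f not in B' scans the list every time -> O(M*N) overall
list_scan = timeit.timeit("[f for f in A if f not in B]",
                          setup=SETUP, number=5)

# set version: build the set once, then O(1) probes -> O(M+N) overall
set_probe = timeit.timeit("inBs = set(B)\n[f for f in A if f not in inBs]",
                          setup=SETUP, number=5)
```

Even at these modest sizes the set version wins by a wide margin, and the gap grows with M*N.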
Re: What's the difference ?
Alex [EMAIL PROTECTED] wrote: Hye, I was just wondering what is the difference between if my_key in mydict: ... and if mydict.has_keys(my_key): Mis-spelled (no final s in the method name). ... I've search a bit in the python documentation, and the only things I found was that they are equivalent. Semantically they are, but `in' is faster, more concise, readable. But in this (quiet old) sample ( http://aspn.activestate.com/ASPN/ Cookbook/Python/Recipe/59875 ), there is difference between the two notation. What that example is pointing to as wrong way is a NON-equivalent approach that's extremely slow: if my_key in mydict.keys(): The call to keys() takes time and memory to build a list of all keys, after which the ``in'' operator, having a list as the RHS operand, is also quite slow (O(N), vs O(1)!). So, never use that useless and silly call to keys() in this context! Alex -- http://mail.python.org/mailman/listinfo/python-list
Re: What's the difference ?
[EMAIL PROTECTED] wrote: ... Weird. Hetland's book, Beginning Python states that it's a matter of taste. If your taste is for more verbose AND slower notation without any compensating advantage, sure. Martelli's Python Cookbook 2nd Ed. says to use the get() method instead as you never know if a key is in the dict. However, I can't seem to find any reference to has_key in his book. .get is not a direct alternative to ``in'' (it's an alternative to an idiom where you key into the dict if the key is present and otherwise supply a default value, and it's MUCH better in that case). has_key is probably not even mentioned in the Cookbook (2nd edition) since there is never a good need for it in the Python versions it covers (2.2 and up), but you can probably find traces in the 1st edition (which also covered Python back to 1.5.2, where has_key *was* needed); the Nutshell (2nd ed) mentions it briefly in a table on p. 60. According to Chun in Core Python Programming, has_key will be obsoleted in future versions of Python, so he recommends using in or not in. Yes, we're removing has_key in Python 3.0 (whose first alpha will be out reasonably soon, but is not going to be ready for production use for quite a bit longer), among other redundant things that exist in 2.* only for legacy and backwards compatibility reasons. This makes 3.0 simpler (a little closer to the only one obvious way ideal). But you should use ``in'' and ``not in'' anyway, even if you don't care about 3.* at all, because they only have advantages wrt has_key, without any compensating disadvantage. Alex -- http://mail.python.org/mailman/listinfo/python-list
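The three idioms discussed above, side by side in Python 3 terms (where has_key is gone entirely, as the post anticipated):

```python
d = {'spam': 1}

present = 'spam' in d        # the idiomatic membership test
absent = 'eggs' not in d

# .get replaces the "index if present, else default" idiom -- it is an
# alternative to that pattern, not to the 'in' test itself
value = d.get('eggs', 0)
```

`'eggs' in d` answers "is it there?"; `d.get('eggs', 0)` answers "give me its value, or a default" without risking KeyError.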
Re: How to free memory ( ie garbage collect) at run time with Python 2.5.1(windows)
rfv-370 [EMAIL PROTECTED] wrote: have made the following small test:

Before starting my test, my UsedPhysicalMemory (PF): 555Mb
tf = range(0,1000)     PF: 710Mb (so 155Mb for my List)
tf = [0,1,2,3,4,5]     PF: 672Mb (Why? Why is the remaining 117Mb not freed?)
del tf                 PF: 672Mb (unused memory not freed)

Integer objects that are once generated are kept around in a free list against the probability that they might be needed again in the future (a few other types of objects similarly keep a per-type free list, but I think int is the only one that keeps an unbounded amount of memory there). Like any other kind of cache, this free list (in normal cases) hoards a bit more memory than needed, but results in better runtime performance; anomalous cases like your example can however easily bust this too-simple heuristic. So how can I force Python to clean the memory and free the memory that is not used? On Windows, with Python 2.5, I don't know of a good approach (on Linux and other Unix-like systems I've used a strategy based on forking, doing the bit that needs a bazillion ints in the child process, ending the child process; but that wouldn't work on Win -- no fork). I suggest you enter a feature request to let gc grow a way to ask every type object to prune its cache, on explicit request from the Python program; this will not solve the problem in Python 2.5, but work on 3.0 is underway and this is just the right time for such requests. Alex -- http://mail.python.org/mailman/listinfo/python-list
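The fork-based strategy mentioned for Unix can be approximated portably by running the hungry work in a throwaway interpreter process (this also works on Windows, where there is no fork). A sketch with invented names; the child allocates the huge list, reports its answer, and exits, at which point the OS reclaims ALL of its memory, free lists included:

```python
import subprocess
import sys

CHILD_CODE = (
    "data = list(range(10**6))\n"   # allocation-heavy work, in the child only
    "print(sum(data))\n"
)

def sum_in_child():
    """Run the memory-hungry computation in a separate interpreter."""
    proc = subprocess.run([sys.executable, "-c", CHILD_CODE],
                          capture_output=True, text=True, check=True)
    return int(proc.stdout)
```

The parent process never grows: only the short-lived child pays the memory cost.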
Re: How to replace a method in an instance.
Bruno Desthuilliers [EMAIL PROTECTED] wrote: Of course, a function in a class is also know as a method. Less obvious but still wrong !-) I wish the authors of the Python books would get a clue then. I'd think that at least some authors of some Python books would explain all this much better than I did. But FWIW, all these rules are clearly documented in the Fine Manual. Speaking as one such author, I think I do a reasonable job of this in Python in a Nutshell (2nd ed): on p. 82 and 85 I have brief mentions that class attributes bound to functions are also known as methods of the class (p.82) and again that functions (called methods in this context) are important attributes for most class objects (p.85); on p.91-94, after explaining descriptors, instances, and the basics of attribute reference, I can finally cover the subject thoroughly in Bound and Unbound Methods. I realize that a beginner might be confused into believing that class attributes bound to functions means function in a class, if they stop reading before p.91;-), but I don't literally make that wrong assertion...;-) Alex -- http://mail.python.org/mailman/listinfo/python-list
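The "class attributes bound to functions are also known as methods" machinery can be shown in a few lines (Python 3 terms: access through an instance yields a bound method, access through the class yields the plain function; the class here is invented):

```python
class Greeter(object):
    def hello(self):
        return "hello from " + type(self).__name__

g = Greeter()
bound = g.hello               # bound method: g is already baked in as self
unbound = Greeter.hello       # in Python 3, simply the plain function
```

Calling `bound()` needs no argument, while `unbound` must be given an instance explicitly -- the descriptor protocol does the binding at attribute-access time.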
Re: Need a better understanding on how MRO works?
Steven W. Orr [EMAIL PROTECTED] wrote: ... =accepts whatever dictionary you give it (so you can, though shouldn't, =do strange things such as pass globals()...:-). In fact, I wanted to make a common routine that could be called from multiple modules. I have classes that need to be created from those multiple modules. I did run into trouble when I created a common routine even though I passed globals() as one of the args. The though shouldn't is prompting me to ask why, and where I might be able to read more. The dictionary you pass to new.classobj should be specifically constructed for the purpose -- globals() will contains all sort of odds and ends that have nothing much to do with the case. You appear to be trying to embody lot of black magic in your common routine, making it communicate with its callers by covert channels; the way you use globals() to give that routine subtle side effects (making the routine stick entries there) as well as pass it an opaque, amorphous, unknown blobs of input information, strongly suggests that the magic is running away with you (a good general reference about that is http://video.google.com/videoplay?docid=4611491525028588899). Explicit is better than implicit, simple is better than complex, etc, can be read by typing ``import this'' at an interactive Python prompt. The best book I know about the do's and don't's of large-scale software architecture is Lakos' Large-Scale C++ Software Design, http://www.amazon.com/Large-Scale-Software-Design-John-Lakos/dp/0201633 620 -- very C++ specific, but even though some of the issues only apply to C++ itself, many of its crucial lessons will help with large scale SW architecture in just about any language, Python included. What I had to say about the lures and pitfalls of black magic in Python specifically is spread through the Python Cookbook 2nd edition (and, to a lesser extent, Python in a Nutshell). Alex -- http://mail.python.org/mailman/listinfo/python-list
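The advice above -- pass a dictionary specifically constructed for the purpose, not globals() -- can be sketched with type(), the modern spelling of new.classobj; all names below are illustrative:

```python
# Build the class namespace explicitly: nothing from globals() leaks in,
# and nothing is covertly written back out.
namespace = {
    'greeting': 'hello',
    'greet': lambda self: self.greeting,
}
Dyn = type('Dyn', (object,), namespace)
obj = Dyn()
```

The caller sees exactly what went into the class, which is the "explicit is better than implicit" point being made.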
Re: beginner, idiomatic python
bambam [EMAIL PROTECTED] wrote: Is it safe to write A = [x for x in A if x in U] or is that undefined? I understand that the slice operation It's perfectly safe and well-defined, as the assignment rebinds the LHS name only AFTER the RHS list comprehension is done. Alex -- http://mail.python.org/mailman/listinfo/python-list
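A minimal demonstration of why this is safe, with hypothetical A and U:

```python
# Hypothetical data: keep only the elements of A that also appear in U.
A = [1, 2, 3, 4, 5]
U = {2, 4, 5, 7}

# The whole right-hand-side list is built first, reading the OLD A;
# only after it is complete is the name A rebound to the new list.
A = [x for x in A if x in U]
print(A)  # [2, 4, 5]
```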
Re: beginner, idiomatic python
bambam [EMAIL PROTECTED] wrote: ... Bags don't seem to be built in to my copy of Python, and A bag is a collections.defaultdict(int) [[you do have to import collections -- it's in the standard library, NOT built-in]]. Alex -- http://mail.python.org/mailman/listinfo/python-list
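A short sketch of the defaultdict-as-bag idea, with made-up data:

```python
import collections

# A "bag" (multiset): missing keys spring into existence with count 0.
bag = collections.defaultdict(int)
for word in ["spam", "eggs", "spam", "spam"]:
    bag[word] += 1
print(dict(bag))  # {'spam': 3, 'eggs': 1}
```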
Re: ANN: SCF released GPL
hg [EMAIL PROTECTED] wrote: ... I am looking for a free subversion server resource to put the code ... if you know of any. Check out code.google.com -- it has a hosting service for open source code, too, these days (and it IS subversion). Alex -- http://mail.python.org/mailman/listinfo/python-list
Re: Does shuffle() produce uniform result ?
tooru honda [EMAIL PROTECTED] wrote:

  At the end, I think it is worthwhile to implement my own shuffle and
  random methods based on os.urandom. Not only does the resulting code
  get rid of the minuscule bias, but the program also runs much faster.
  When using random.SystemRandom.shuffle, posix.open and posix.close
  from calling os.urandom account for almost half of the total execution
  time for my program. By implementing my own random and getting a much
  larger chunk of random bytes from os.urandom each time, I am able to
  reduce the total execution time by half.

If I were in your shoes, I would optimize by subclassing random.SystemRandom and overriding the random method to use os.urandom with some large block size and then parcel it out, instead of the _urandom(7) that it now uses. E.g., something like:

class SystemBlockRandom(random.SystemRandom):

    def __init__(self):
        random.SystemRandom.__init__(self)
        def rand7():
            while True:
                randata = os.urandom(7*1024)
                for i in xrange(0, 7*1024, 7):
                    yield long(binascii.hexlify(randata[i:i+7]), 16)
        self.rand7 = rand7().next

    def random(self):
        """Get the next random number in the range [0.0, 1.0)."""
        return (self.rand7() >> 3) * random.RECIP_BPF

(untested code). No need to reimplement anything else, it seems to me.

Alex
--
http://mail.python.org/mailman/listinfo/python-list
Re: Need a better understanding on how MRO works?
Steven W. Orr [EMAIL PROTECTED] wrote: ...

  name = 'C1'
  nclass = new.classobj(name, (D1,), globals())
  globals()[name] = nclass

Here, you're creating a VERY anomalous class C1 whose __dict__ is globals(), i.e. the dict of this module object;

  name = 'C2'
  nclass = new.classobj(name, (D1,), globals())
  globals()[name] = nclass

and here you're creating another class with the SAME __dict__;

  globals()[name].m1 = m1replace

So of course this assignment affects the 'm1' entries in the dict of both classes, since they have the SAME dict object (a la Borg) -- that is, IF they're old-style classes (i.e. if D1 is old-style), since in that case a class's __dict__ is in fact a dict object, plain and simple. However, if D1 is new-style, then C1.__dict__ and C2.__dict__ are in fact instances of dictproxy -- each with a copy of the entries that were in globals() when you called new.classobj, but DISTINCT from each other and from globals(), so that further changes in one (or globals) don't affect globals (nor the other).

I guess this might be a decent interview question if somebody claims to be a Python guru: if they can make head or tail of this mess, boy they *ARE* a Python guru indeed (in fact I'd accord minor guruhood even to somebody who can get a glimmer of understanding of this with ten minutes at a Python interactive prompt or the like, as opposed to needing to understand it on paper without the ability to explore:-).

Among the several don'ts to learn from this: don't use old-style classes, don't try to make two classes share the same dictionary, and don't ask about MRO in a question that has nothing to do with MRO (though I admit that was a decent attempt at misdirection, it wouldn't slow down even the minor-guru in any appreciable way:-).

Alex
--
http://mail.python.org/mailman/listinfo/python-list
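A sketch of the new-style half of this behavior, using the built-in type() (which works the same way in modern Pythons) instead of the legacy new.classobj; the class and method names here are made up for illustration:

```python
# Two classes built from the SAME input dict...
shared = {'m1': lambda self: 'original'}
C1 = type('C1', (object,), shared)
C2 = type('C2', (object,), shared)

# ...each gets its OWN copy of the entries (exposed via a read-only
# proxy), so rebinding m1 on one class leaves the other class -- and
# the original dict -- alone.
C2.m1 = lambda self: 'replaced'
print(C1().m1(), C2().m1())  # original replaced
```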
Re: Need a better understanding on how MRO works?
Steven W. Orr [EMAIL PROTECTED] wrote: ...

  Thanks Alex. I am humbled, though I was before I started. I really
  don't have a lot of understanding of what you're saying so I'll
  probably have to study this for about a year or so.

  * (I need to look up what dictproxy is.) I don't have any idea what
    the ramifications are of your use of the word DISTINCT. Are you
    somehow suggesting that new.classobj does a deep copy of the globals
    copy that's passed to it?

No, most definitely NOT deep!!!, but type.__new__ does a little of what you've said (a shallow copy, which is not quite a copy because it embeds [some of] the entries in slots). new.classobj determines the metaclass (from the bases, or a __metaclass__ entry in the dictionary) and calls it to generate the new class. For modern style classes, the metaclass is type; for old-style legacy classes, it's types.ClassType, and they're not exactly identical in behavior (of course not, or there would be no point in having both:-).

  * Also, I'd like to understand what the difference is between

      nclass = new.classobj(name, (D1,), globals())

    vs.

      def classfactory():
          class somename(object):
              def somestuff(): pass
          return somename

      G1 = classfactory()
      globals()[name] = G1

    Does new.classobj do anything special?

No, new.classobj does essentially the same thing that Python does after evaluating a class statement to prepare the class's name, bases and dictionary: finds the metaclass and calls it with these arguments. A key difference of course is that a class statement prepares the class dictionary as a new, ordinary, distinct dictionary, while new.classobj accepts whatever dictionary you give it (so you can, though shouldn't, do strange things such as pass globals()...:-).

Alex
--
http://mail.python.org/mailman/listinfo/python-list
Re: Does shuffle() produce uniform result ?
tooru honda [EMAIL PROTECTED] wrote: ...

  def rand2():
      while True:
          randata = urandom(2*1024)
          for i in xrange(0, 2*1024, 2):
              # integer in [0, 65535]
              yield int(hexlify(randata[i:i+2]), 16)

another equivalent possibility, which might probably be faster:

import array
...
def rand2():
    while True:
        x = array.array("H")
        x.fromstring(urandom(2*4000))
        for i in x:
            yield i

Alex
--
http://mail.python.org/mailman/listinfo/python-list
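A Python 3 flavor of the same trick (array.array's fromstring method was renamed frombytes there); the generator name and block size are illustrative:

```python
import array
import os

def rand16():
    # Parcel one large os.urandom block into unsigned 16-bit integers,
    # instead of making one urandom call (and hexlify) per number.
    while True:
        a = array.array("H")              # "H" = unsigned 16-bit items
        a.frombytes(os.urandom(2 * 4000))
        for n in a:
            yield n

gen = rand16()
sample = [next(gen) for _ in range(5)]
print(all(0 <= n <= 65535 for n in sample))  # True
```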
Re: yet another indentation proposal
Jakub Stolarski [EMAIL PROTECTED] wrote:

  Why not just use comments and some filter? Just write # _{ at the
  beginning and # _} at the end. Then filter, just before running,
  indenting with those control sequences. Then there's no need to
  change the interpreter.

As I pointed out in another post to this thread, that's essentially what Tools/scripts/pindent.py (part of the Python source distribution) does (no need to comment the beginning of a block, since it's always a colon followed by a newline; block-end comments in pindent.py are more informative). Just use and/or adapt that...

Alex
--
http://mail.python.org/mailman/listinfo/python-list
Re: yet another indentation proposal
Aaron [EMAIL PROTECTED] wrote: ...

  That's probably what I'll end up doing. The only drawback to that is
  that it solves the problem for me only. Perhaps I will open source the
  scripts and write up some documentation so that other folks in a
  similar situation don't have to reinvent the wheel.

As I pointed out in another post to this thread, Tools/scripts/pindent.py IS open-source, indeed it's part of the Python source distribution. Why not use and/or adapt that?

  The only unfortunate aspect to that is that blind newbies to the
  language will have to figure out setting up a shell script or batch
  file to pipe the output of the filter into Python on top of learning
  the language. I admit, it's probably not that much work, but it is one
  more stumbling block that blind newcomers will have to overcome.

pindent.py's approach ensures that the tool's output is also entirely valid Python (as it only adds comments to mark and explain block ends!), so no piping of the output into Python is needed at all; you only need (editor-dependent) to ensure pindent.py is run when you LOAD a Python source file into your editor. If anything, pindent.py and/or the screen reader of choice might be tweaked to read out Python sources more clearly (e.g., by recognizing block-end comments and reading them differently than other comments are read).

Alex
--
http://mail.python.org/mailman/listinfo/python-list
Re: yet another indentation proposal
Michael Tobis [EMAIL PROTECTED] wrote: On Aug 19, 11:51 pm, James Stroud [EMAIL PROTECTED] wrote: What's wrong with just saying the current indent level? I'd much rather hear indent 4 than tab tab tab tab. Alternatively, you might also consider writing a simple pre and postprocessor so that you could read and write python the way you would prefer As I pointed out in another post to this thread, that's essentially what Tools/scripts/pindent.py (part of the Python source distribution) does. Just use and/or adapt that... Alex -- http://mail.python.org/mailman/listinfo/python-list
Re: clarification
samwyse [EMAIL PROTECTED] wrote: ...

  brain:~ alex$ python -mtimeit -s'sos=[set(range(x,x+4)) for x in range(0, 100, 3)]' 'r=set()' 'for x in sos: r.update(x)'
  10 loops, best of 3: 18.8 usec per loop
  brain:~ alex$ python -mtimeit -s'sos=[set(range(x,x+4)) for x in range(0, 100, 3)]' 'r=reduce(set.union, sos, set())'
  1 loops, best of 3: 87.2 usec per loop

  Even in a case as tiny as this one, reduce is taking over 4 times
  longer than the loop with the in-place mutator -- and it only gets
  worse, as we're talking about O(N squared) vs O(N) performance!
  Indeed, this is part of what makes reduce an attractive
  nuisance...;-). [[And

The set-union case, just like the list-catenation case, is O(N squared) (when approached in a functional way) because the intermediate result often _grows_ [whenever a new set or list operand adds items], and thus a new temporary value must be allocated, and the K results-so-far copied over (as part of constructing the new temp value) from the previous temporary value; and sum(range(N)) grows quadratically in N. The in-place approach avoids that fate by a strategy of proportional over-allocation (used by both sets and lists) that makes in-place operations such as .update(X) and .extend(X) amortized O(K), where K is len(X).

In the set-intersection case, the intermediate result _shrinks_ rather than growing, so the amount of data copied over is a diminishing quantity at each step, and so the analysis showing quadratic behavior for the functional approach does not hold; behavior may be roughly linear, influenced in minor ways by accidental regularities in the sets' structure and order (especially likely for short sequences of small sets, as in your case).
Using a slightly longer sequence of slightly larger sets, with little structure to each, e.g.:

    random.seed(12345)  # set seed to ensure total repeatability
    los = [set(random.sample(range(1000), 990)) for x in range(200)]

at the end of the setup (the intersection of these 200 sets happens to contain 132 items), I measure (as usual on my 18-months-old Macbook Pro laptop):

    stmt = 'reduce(set.intersection,los)'
    best of 3: 1.66e+04 usec per loop
    stmt = 'intersect_all(los)'
    best of 3: 1.66e+04 usec per loop

and occasionally 1.65 or 1.67 instead of 1.66 for either or both, whether with 10,000 or 100,000 loops. (Not sure whether your observations about the reduce-based approach becoming faster with more loops may be due to anomalies in Windows' scheduler, or the accidental regularities mentioned above; my timings are probably more regular since I have two cores, one of which probably ends up dedicated to whatever task I'm benchmarking while the other one runs all background stuff.)

  turn indicates that both implementations actually work about the same,
  and your O(n squared) argument is irrelevant.

It's indeed irrelevant when the behavior _isn't_ quadratic (as in the case of intersections) -- but unfortunately it _is_ needlessly quadratic in most interesting cases involving containers (catenation of sequences, union of sets, merging of dictionaries, merging of priority-queues, ...), because in those cases the intermediate temporary values tend to grow, as I tried to explain in more detail above.

Alex
--
http://mail.python.org/mailman/listinfo/python-list
Re: yet another indentation proposal
Paddy [EMAIL PROTECTED] wrote: ...

  Can screen readers be customized?

Open-source ones surely can (e.g., NVDA is an open-source reader for Windows written in Python, http://www.nvda-project.org/ -- alas, if you search for NVDA, Google appears to be totally convinced you mean NVidia instead, making searches pretty useless, sigh).

  Maybe there is a way to get the screen reader to say indent and dedent
  at the appropriate places?

There definitely should be.

  Or maybe a filter to put those words into the source?

.../Tools/scripts/pindent.py (comes with the Python source distribution, and I hope that, like the whole Tools directory, it would also come with any sensible packaged Python distribution) should already be sufficient for this particular task. The indent always happens (in correct Python sources) on the next line after one ending with a colon; pindent.py can add or remove block-closing comments at each dedent (e.g., "# end for" if the dedent is terminating a for-statement), and can adjust the indentation to make it correct if given a Python source with such block-closing comments but messed-up indentation.

Alex
--
http://mail.python.org/mailman/listinfo/python-list
Re: Sorting a list of Unicode strings?
[EMAIL PROTECTED] [EMAIL PROTECTED] wrote: ...

  Maybe I'm missing something fundamental here, but if I have a list of
  Unicode strings, and I want to sort these alphabetically, then it
  places those that begin with unicode characters at the bottom. ...
  Anyway, I know _why_ it does this, but I really do need it to sort
  them correctly based on how humans would look at it.

Depending on the nationality of those humans, you may need very different sorting criteria; indeed, in some countries, different sorting criteria apply to different use cases (such as sorting surnames versus sorting book titles, etc.; sorry, I don't recall specific examples, but if you delve into sites about i18n issues you'll find some). In both Swedish and Danish, I believe, A-with-ring sorts AFTER the letter Z in the alphabet; so, having Aaland (where I'm using Aa for A-with-ring, since this newsreader has some problem in letting me enter non-ascii characters;-) sort right at the bottom, while it doesn't look right to YOU (maybe an English speaker?), may look right to the inhabitants of that locality (be they Danes or Swedes -- but I believe Norwegian may also work similarly in terms of sorting).

The Unicode consortium does define a standard collation algorithm (UCA) and table (DUCET) to use when you need a locale-independent ordering; at http://jtauber.com/blog/2006/01/27/python_unicode_collation_algorithm you'll be able to obtain James Tauber's Python implementation of UCA, to work with the DUCET found at http://jtauber.com/blog/2006/01/27/python_unicode_collation_algorithm. I suspect you won't like the collation order you obtain this way, but you might start from there, subsetting and tweaking the DUCET into an OUCET (Oliver Unicode Collation Element Table;-) that suits you better.
A simpler, rougher approach, if you think the right collation is obtained by ignoring accents, diacritics, etc. (even though the speakers of many languages that include diacritics, etc., disagree;-), is to use the key=coll argument in your sorting call, passing a function coll that maps any Unicode string to what you _think_ it should be like for sorting purposes. The .translate method of Unicode string objects may help there: it takes a dict mapping Unicode ordinals to ordinals or strings (or None for characters you want to delete as part of the translation).

For example, suppose that what we want is the following somewhat silly collation: we only care about ISO-8859-1 characters, and want to ignore for sorting purposes any accent (be it grave, acute or circumflex), umlauts, slashes through letters, tildes, cedillas. htmlentitydefs has a useful dict called codepoint2name that helps us identify those weirdly decorated foreign characters.

def make_transdict():
    import htmlentitydefs
    cp2n = htmlentitydefs.codepoint2name
    suffixes = 'acute grave circ uml slash tilde cedil'.split()
    td = {}
    for x in range(128, 256):
        if x not in cp2n:
            continue
        n = cp2n[x]
        for s in suffixes:
            if n.endswith(s):
                # e.g. 'aacute' -> u'a': strip the suffix, keep the base letter
                td[x] = unicode(n[:-len(s)])
                break
    return td

def coll(us, td=make_transdict()):
    return us.translate(td)

listofus.sort(key=coll)

I haven't tested this code, but it should be reasonably easy to fix any problems it might have, as well as making make_transdict richer to meet your goals. Just be aware that the resulting collation (e.g., sorting a-ring just as if it was a plain a) will be ABSOLUTELY WEIRD to anybody who knows something about Scandinavian languages...!!!-)

Alex
--
http://mail.python.org/mailman/listinfo/python-list
Re: Parser Generator?
Jack [EMAIL PROTECTED] wrote:

  Thanks for the suggestion. I understand that more work is needed for
  natural language understanding. What I want to do is actually very
  simple - I pre-screen the user-typed text. If it's a simple syntax my
  code understands, like, "Weather in London", I'll redirect it to a
  weather site. Or, if it's "What is ..." I'll probably redirect it to
  wikipedia. Otherwise, I'll throw it to a search engine. So, extremely
  simple stuff ...

http://nltk.sourceforge.net/index.php/Main_Page

NLTK -- the Natural Language Toolkit -- is a suite of open source Python modules, data sets and tutorials supporting research and development in natural language processing.

Alex
--
http://mail.python.org/mailman/listinfo/python-list
Re: clarification
samwyse [EMAIL PROTECTED] wrote: ...

  Finally, does anyone familiar with P3K know how best to do the
  reduction without using 'reduce'? Right now, sets don't support the
  'add' and 'multiply' operators, so 'sum' and (the currently
  fictitious) 'product' won't work at all; while 'any' and 'all' don't
  work as one might hope. Are there 'intersection' and 'union' built-ins
  anywhere?

For intersection and union of a sequence of sets, I'd use:

def union_all(sos):
    result = set()
    for s in sos:
        result.update(s)
    return result

def intersect_all(sos):
    it = iter(sos)
    result = set(it.next())
    for s in it:
        result.intersection_update(s)
    return result

The latter will raise an exception if sos is empty -- I don't think the intersection of no sets at all has a single natural interpretation (while the union of no sets at all appears to be naturally interpreted as an empty set)... if you disagree, just wrap a try/except around the initialization of result, and return whatever you wish in the except clause. Of course, hoisting the unbound method out of the loops can afford the usual small optimization. But my point is that, in Python, these operations (like, say, the concatenation of a sequence of lists, etc.) are best performed in place via loops calling mutator methods such as update and intersection_update (or a list's extend, etc.), rather than functionally (building and tossing away many intermediate results). E.g., consider:

brain:~ alex$ python -mtimeit -s'sos=[set(range(x,x+4)) for x in range(0, 100, 3)]' 'r=set()' 'for x in sos: r.update(x)'
10 loops, best of 3: 18.8 usec per loop
brain:~ alex$ python -mtimeit -s'sos=[set(range(x,x+4)) for x in range(0, 100, 3)]' 'r=reduce(set.union, sos, set())'
1 loops, best of 3: 87.2 usec per loop

Even in a case as tiny as this one, reduce is taking over 4 times longer than the loop with the in-place mutator -- and it only gets worse, as we're talking about O(N squared) vs O(N) performance!
Indeed, this is part of what makes reduce an attractive nuisance...;-). [[And so is sum, if used OTHERWISE than for the documented purpose, computing the sum of a sequence of numbers: a loop with r.extend is similarly faster, to concatenate a sequence of lists, when compared to sum(sol, [])...!!!]] Alex -- http://mail.python.org/mailman/listinfo/python-list
Re: Understanding closures
Ramashish Baranwal [EMAIL PROTECTED] wrote:

  Hi, I want to use variables passed to a function in an inner defined
  function. Something like-

  def fun1(method=None):
      def fun2():
          if not method:
              method = 'GET'
          print '%s: this is fun2' % method
      return fun2()

  fun1()

  However I get this error-
  UnboundLocalError: local variable 'method' referenced before assignment

  This however works fine.

  def fun1(method=None):
      if not method:
          method = 'GET'
      def fun2():
          print '%s: this is fun2' % method
      return fun2()

  fun1()

  Is there a simple way I can pass on the variables passed to the outer
  function to the inner one without having to use or refer them in the
  outer function?

Sure, just don't ASSIGN TO those names in the inner function. Any name ASSIGNED TO in a given function is local to that specific function (save for global statements, which bypass variables of containing functions anyway).

Alex
--
http://mail.python.org/mailman/listinfo/python-list
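A small sketch of the rule (the names are made up): the inner function may freely READ a closed-over name; it is assigning to it anywhere in the inner body that makes the name local there, so even a read before the assignment fails.

```python
def outer(method=None):
    def inner():
        # Only READS `method`, so it is looked up in the enclosing scope.
        return method or 'GET'
    return inner()

def broken(method=None):
    def inner():
        try:
            if not method:        # `method` is assigned below, so it is
                method = 'GET'    # LOCAL here -- the read above fails
            return method
        except UnboundLocalError:
            return 'UnboundLocalError'
    return inner()

print(outer(), outer('POST'), broken())  # GET POST UnboundLocalError
```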
Re: How to call module functions inside class instance functions?
beginner [EMAIL PROTECTED] wrote: ...

  # testmodule.py - Test Module
  def __module_level_func():
      print "Hello"

  class TestClass:
      def class_level_func(self):
          __module_level_func()

  # main.py
  import testmodule
  x = testmodule.TestClass()
  x.class_level_func()

  The error message I am encountering is:
  NameError: global name '_TestClass__module_level_func' is not defined

  I think it has something to do with the two underscores for
  __module_level_func. Maybe it has something to do with the python
  implementation of the private class level functions. By the way, the
  reason I am naming it __module_level_func() is because I'd like
  __module_level_func() to be private to the module, like the C static
  function. If the interpreter cannot really enforce it, at least it is
  some sort of naming convention for me.

The two underscores are exactly the cause of your problem: as you see in the error message, the compiler has inserted the CLASS name (not MODULE name) implicitly there. This name mangling is part of Python's rules. Use a SINGLE leading underscore (NOT double ones) as the sort of naming convention to indicate privacy, and Python will support you (mostly by social convention, but a little bit technically, too); use a different convention (particularly one that fights against the language rules;-) and you're fighting city hall to no good purpose and without much hope of achieving anything whatsoever thereby.

Alex
--
http://mail.python.org/mailman/listinfo/python-list
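A self-contained sketch of the mangling (the class and function names are illustrative): inside a class body, any name spelled with two leading underscores (and at most one trailing underscore) is rewritten at compile time to _ClassName__name.

```python
class TestClass:
    def class_level_func(self):
        try:
            # The call below is compiled as _TestClass__module_level_func(),
            # which doesn't exist at module level -- hence the NameError.
            __module_level_func()
        except NameError as exc:
            return str(exc)

message = TestClass().class_level_func()
print('_TestClass__module_level_func' in message)  # True
```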
Re: Can python threads take advantage of use dual core ?
Stefan Behnel [EMAIL PROTECTED] wrote: ... Which virtually all computation-intensive extensions do. Also, note the gmpy doesn't (release the GIL), even though it IS computationally intensive -- I tried, but it slows things down horribly even on an Intel Core Duo. I suspect that may partly be due to the locking strategy of the underlying GMP 4.2 library (which I haven't analyzed in depth). In practice, when I want to exploit both cores to the hilt with gmpy-based computations, I run multiple processes. Alex -- http://mail.python.org/mailman/listinfo/python-list
Re: using super() to call two parent classes __init__() method
7stud [EMAIL PROTECTED] wrote:

  When I run the following code and call super() in the Base class's
  __init__() method, only one Parent's __init__() method is called.

  class Parent1(object):
      def __init__(self):
          print "Parent1 init called."
          self.x = 10

  class Parent2(object):
      def __init__(self):
          print "Parent2 init called."
          self.y = 15

  class Base(Parent1, Parent2):
      def __init__(self):
          super(Base, self).__init__()
          self.z = 20

  b = Base()

  --output:--
  Parent1 init called.

Yep -- Parent1.__init__ doesn't call its own super's __init__, so it doesn't participate in cooperative superclass delegation and the buck stops there.

Alex
--
http://mail.python.org/mailman/listinfo/python-list
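A sketch of the cooperative fix Alex is pointing at, in Python 3 spelling: each __init__ passes the buck along the MRO with its own super() call, so both parents run.

```python
class Parent1:
    def __init__(self):
        super().__init__()   # continue along the MRO (to Parent2 here)
        self.x = 10

class Parent2:
    def __init__(self):
        super().__init__()   # continue along the MRO (to object here)
        self.y = 15

class Base(Parent1, Parent2):
    def __init__(self):
        super().__init__()   # starts the chain: Parent1, then Parent2
        self.z = 20

b = Base()
print(b.x, b.y, b.z)  # 10 15 20
```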
Re: closing StringIO objects
Neil Cerutti [EMAIL PROTECTED] wrote: The documentation says the following about StringIO.close: close( ) Free the memory buffer. Or else... what? Or else the memory buffer sticks around, so you can keep calling getvalue as needed. I believe the freeing will happen anyway, eventually, if and when the StringIO instance is garbage collected (just like, say, a file object's underlying fd gets closed when the file object is garbage collected), but relying on such behavior is often considered a dubious practice nowadays (given the existence of many Python implementations whose GC strategies differ). Alex -- http://mail.python.org/mailman/listinfo/python-list
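A quick illustration using the modern io.StringIO (the old StringIO module the post discusses is gone in Python 3, but the close/getvalue relationship is the same for this purpose):

```python
import io

buf = io.StringIO()
buf.write('hello')
print(buf.getvalue())   # hello -- the buffer sticks around until close()

buf.close()             # frees the memory buffer...
try:
    buf.getvalue()      # ...so further getvalue() calls now fail
    freed = False
except ValueError:
    freed = True
print(freed)  # True
```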
Re: Opinions about this new Python book?
Neil Cerutti [EMAIL PROTECTED] wrote: On 2007-08-15, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: For some reason, the author makes the claim that the term Predicate is bandied about quite a bit in the literature of Python. I have 17 or so Python books and I don't think I've ever seen this used in conjunction with Python...or in any of the docs I've skimmed. What the!? The document searching facility reveals that the term is bandied about in five places in the standard documentation. These uses seem approriate and uncontroversial to me. These document functions accepting predicates as aruments: 6.5.1 Itertools functions 6.5.3 Recipes 11.47 Creating a new Distutils command 26.10.1 Types and members The following provides a few predicate functions (weird! I'd have never thought to look there for, e.g., ismodule): 6.7 operator -- Standard operators as functions Module inspect also provides useful predicates (though I don't remember if its docs CALL them predicates;-). Alex -- http://mail.python.org/mailman/listinfo/python-list
Re: Something in the function tutorial confused me.
Neil Cerutti [EMAIL PROTECTED] wrote: ... Then we get into unpacking assignments and augmented assignments, but I don't really want to write two more pages worth of summary...;-). Thanks very much for taking the time to help clear up my erroneous model of assignment in Python. I'd taken a conceptual shortcut that's not justified. You're very welcome, it's always a pleasure to help! Alex -- http://mail.python.org/mailman/listinfo/python-list
Re: Help with optimisation
special_dragonfly [EMAIL PROTECTED] wrote: ...

  dom = xml.dom.minidom.parseString(text_buffer)

If you need to optimize code that parses XML, use ElementTree (some other parsers are also fast, but minidom ISN'T).

Alex
--
http://mail.python.org/mailman/listinfo/python-list
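A minimal ElementTree sketch (the text_buffer content is hypothetical), showing the fromstring equivalent of the minidom parseString call:

```python
import xml.etree.ElementTree as ET

# Hypothetical XML buffer; ET.fromstring parses it into an element tree.
text_buffer = '<root><item name="a"/><item name="b"/></root>'
root = ET.fromstring(text_buffer)
names = [item.get('name') for item in root.findall('item')]
print(names)  # ['a', 'b']
```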
Re: LRU cache?
Paul Rubin http://[EMAIL PROTECTED] wrote: Anyone got a favorite LRU cache implementation? I see a few in google but none look all that good. I just want a dictionary indexed by strings, that remembers the last few thousand entries I put in it. So what's wrong with Evan Prodromou's lrucache.py module that's in pypi? Haven't used it, but can't see anything wrong at a glance. Alex -- http://mail.python.org/mailman/listinfo/python-list
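For the record, a tiny LRU-cache sketch of the kind being asked for -- NOT Evan Prodromou's module, just an illustration built on collections.OrderedDict, whose move_to_end/popitem methods do the recency bookkeeping:

```python
from collections import OrderedDict

class LRUCache:
    """A string-keyed dict remembering only the last `maxsize` entries."""
    def __init__(self, maxsize=2000):
        self.maxsize = maxsize
        self._d = OrderedDict()

    def __setitem__(self, key, value):
        if key in self._d:
            self._d.move_to_end(key)     # refresh recency on overwrite
        self._d[key] = value
        if len(self._d) > self.maxsize:
            self._d.popitem(last=False)  # evict the least recently used

    def __getitem__(self, key):
        value = self._d[key]
        self._d.move_to_end(key)         # mark as recently used
        return value

    def __contains__(self, key):
        return key in self._d

c = LRUCache(maxsize=2)
c['a'] = 1
c['b'] = 2
_ = c['a']      # touch 'a', so 'b' is now least recently used
c['c'] = 3      # evicts 'b'
print('b' in c, 'a' in c)  # False True
```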
Re: Something in the function tutorial confused me.
Neil Cerutti [EMAIL PROTECTED] wrote: ...

  OK, I've thought about this some more and I think the source of my
  confusion was I thought assignment in Python meant binding a name to
  something, not mutating an object. But in the case of augmented
  assignment, assignment no longer means that?

Plain assignment *to a plain name* does mean binding a name (the LHS) to something (the RHS). Other assignments (ones that are not plain assignments to names) may have different meanings. For example:

>>> class act(object):
...     def __init__(self, c):
...         self._c = c
...     def getC(self):
...         return self._c
...     def setC(self, *ignore):
...         self._c += 1
...     c = property(getC, setC)
...
>>> x = act(0)
>>> x.c
0
>>> x.c = 23
>>> x.c
1

Here's an example where a plain assignment (to an attribute of x, not to a plain name) obviously DOESN'T mean binding a name to something: the something (the RHS) is completely ignored, so the plain assignment is mutating an object (x) and not binding any name to anything.

Plain assignments to items and slices can also often be best seen as mutating an object (the one being indexed or sliced on the LHS) rather than binding a name. For example:

>>> l = list('ciao')
>>> l[1:3] = 'app'
>>> l
['c', 'a', 'p', 'p', 'o']

If I was teaching Python and came upon this example, I would definitely not try to weasel-word the explanation of what's going on in terms of binding a name (or several ``names'', including ``rebinding a new ``name'' l[4] to the 'o' that was previously ``bound'' to l[3], etc.:-): it's just orders of magnitude simpler to explain this as mutating an object, namely the list l. I take almost 3 pages in Python in a Nutshell (47 to 49 in the second edition) to summarily explain every kind of assignment -- and that's in a work in which I've tried (successfully, I believe from reviews) to be very, *VERY* concise;-).
Summarizing that summary;-): a plain assignment to an identifier binds that name; a plain assignment to an attribute reference x.y asks object x (x can be any expression) to bind its attribute named 'y'; a plain assignment to an indexing x[y] (x and y are arbitrary expressions) asks object x to bind its item indicated by the value of y; a plain assignment to a slicing is equivalent to a plain assignment to the indexing with an index of slice(start, stop, stride) [[slice is a Python built-in type]]. Plain assignment to an identifier just happens; all other cases of plain assignment are requests to an object to bind one or more of its attributes or items (i.e., requests for specific mutations of an object) -- as with, say, any method call (which might also be a request for some kind of mutation), the object will do whatever it pleases with the request (including, perhaps, refusing it, by raising an exception). Then we get into unpacking assignments and augmented assignments, but I don't really want to write two more pages' worth of summary...;-).

Alex
--
http://mail.python.org/mailman/listinfo/python-list
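A made-up class illustrating that last point -- an item assignment is just a request that the target object is free to refuse:

```python
class ReadOnlyItems:
    def __getitem__(self, key):
        return key * 2               # reads are honored...

    def __setitem__(self, key, value):
        # ...but a request to bind an item is refused, as any object may do
        raise TypeError('item assignment refused')

r = ReadOnlyItems()
print(r[3])  # 6
try:
    r[3] = 99
    refused = False
except TypeError:
    refused = True
print(refused)  # True
```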
Re: Ipc mechanisms and designs.
king kikapu [EMAIL PROTECTED] wrote: Hi, inspired of the topic The Future of Python Threading, i started to realize that the only way to utilize the power of multiple cores using Python, is spawn processes and communicate with them. If we have the scenario: 1. Windows (mainly) development 2. Processes are running in the same machine 3. We just want to pass info from one process to another. Info may be simple data types or user defined Python objects. what is the best solution (besides sockets) that someone can implement so to have 2 actually processes that interchanged data between them ? I looked at Pyro and it looks really good but i wanted to experiment with a simpler solution. Check out http://www.lindaspaces.com/products/NWS_overview.html Alex -- http://mail.python.org/mailman/listinfo/python-list
Re: Help with Dictionaries and Classes requested please.
Sion Arrowsmith [EMAIL PROTECTED] wrote:

  special_dragonfly [EMAIL PROTECTED] wrote:

    if key in FieldsDictionary:
        FieldsDictionary[key].append(FieldClass(*line.split(",")))
    else:
        FieldsDictionary[key] = [FieldClass(*line.split(","))]

  These four lines can be replaced by:

    FieldsDictionary.setdefault(key, []).append(FieldClass(*line.split(",")))

Even better might be to let FieldsDictionary be an instance of collections.defaultdict(list) [[assuming Python 2.5 is in use]], in which case the simpler

    FieldsDictionary[key].append(FieldClass(*line.split(",")))

will Just Work. setdefault was a valiant attempt at fixing this problem, but defaultdict is better.

Alex
--
http://mail.python.org/mailman/listinfo/python-list
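A runnable sketch of the defaultdict version; FieldClass and the input lines are hypothetical stand-ins for the original poster's data:

```python
import collections

class FieldClass:
    def __init__(self, key, value):
        self.key, self.value = key, value

FieldsDictionary = collections.defaultdict(list)
for line in ["a,1", "b,2", "a,3"]:
    key = line.split(",")[0]
    # A missing key springs into existence as an empty list automatically.
    FieldsDictionary[key].append(FieldClass(*line.split(",")))

print(sorted((k, len(v)) for k, v in FieldsDictionary.items()))
# [('a', 2), ('b', 1)]
```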
Re: Something in the function tutorial confused me.
greg [EMAIL PROTECTED] wrote:

  Steve Holden wrote:
    For some reason your reply got right up my nose, I'm sorry about that.

  Sometimes it's hard to judge the level of experience with Python that
  a poster has.

Because of this, a Google search for "name surname python" may sometimes help; when you get 116,000 hits, as for "Steve Holden python", that may be a reasonable indication that the poster is one of the world's Python Gurus (in fact, the winner of the 2007 Frank Willison Award -- congratulations, Steve!!!).

Alex
--
http://mail.python.org/mailman/listinfo/python-list
Re: Something in the function tutorial confused me.
Neil Cerutti [EMAIL PROTECTED] wrote: ... The Python Language Reference seems a little confused about the terminology. 3.4.7 Emulating numeric types 6.3.1 Augmented assignment statements The former refers to augmented arithmetic operations, which I think is a nice terminology, since assignment is not necessarily taking place. Then the latter muddies the waters. Assignment *IS* necessarily taking place; if you try the augmented assignment on something that DOESN'T support assignment, you'll get an exception. Consider:

    >>> tup = ([],)
    >>> tup[0] += ['zap']
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'tuple' object does not support item assignment

Tuples don't support item ASSIGNMENT, and += is an ASSIGNMENT, so tuples don't allow a += on any of their items. If you thought that += wasn't an assignment, this behavior and error message would be very problematic; since the language reference ISN'T confused and has things quite right, this behavior and error message are perfectly consistent and clear. Alex -- http://mail.python.org/mailman/listinfo/python-list
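The corner case is even sharper than the traceback suggests: in CPython the in-place list extension runs before the (failing) tuple item assignment, so the exception is raised *and* the list inside the tuple is mutated. A minimal sketch:

```python
tup = ([],)
try:
    tup[0] += ['zap']   # list.__iadd__ extends the list in place...
except TypeError:
    pass                # ...then the tuple item assignment fails
# the augmented assignment raised, yet the list inside was extended
```

This is exactly the behavior the post describes: the augmented operation happened, but the assignment step is what the tuple refuses.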
Re: Destruction of generator objects
Stefan Bellon [EMAIL PROTECTED] wrote: On Thu, 09 Aug, Graham Dumpleton wrote:

    result = application(environ, start_response)
    try:
        for data in result:
            if data:    # don't send headers until body appears
                write(data)
        if not headers_sent:
            write('')   # send headers now if body was empty
    finally:
        if hasattr(result, 'close'):
            result.close()

Hm, not what I hoped for ... Isn't it possible to add some __del__ method to the generator object via some decorator or somehow else in a way that works even with Python 2.4 and can then be nicely written without cluttering up the logic between consumer and producer? No, you cannot do what you want in Python 2.4. If you can't upgrade to 2.5 or better, whatever the reason may be, you will have to live with 2.4's limitations (there ARE reasons we keep making new releases, after all...:-). Alex -- http://mail.python.org/mailman/listinfo/python-list
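For reference, the 2.5-and-later behaviour this exchange alludes to can be sketched as follows (modern Python; the names are illustrative): calling close() on a suspended generator raises GeneratorExit inside it, so a try/finally in the generator body runs its cleanup without any __del__ tricks.

```python
events = []

def producer():
    try:
        for i in range(3):
            yield i
    finally:
        events.append('cleaned up')   # runs when close() is called

g = producer()
next(g)     # advance to the first yield
g.close()   # raises GeneratorExit at the yield; the finally clause runs
```

This is why the consumer-side `if hasattr(result, 'close'): result.close()` pattern quoted above is enough: the producer's cleanup logic stays inside the producer.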
Re: boolean operations on sets
Michael J. Fromberger [EMAIL PROTECTED] wrote: ... Also, it is a common behaviour in many programming languages for logical connectives to both short-circuit and yield their values, so I'd argue that most programmers are probably accustomed to it. The && and || operators of C and its descendants also behave in this manner, as do the Untrue, alas...:

    brain:~ alex$ cat a.c
    #include <stdio.h>
    int main()
    {
        printf("%d\n", 23 && 45);
        return 0;
    }
    brain:~ alex$ gcc a.c
    brain:~ alex$ ./a.out
    1

In C, && and || _do_ short circuit, BUT they always return 0 or 1, *NOT* yield their values (interpreted as "return the false or true value of either operand", as in Python). Alex -- http://mail.python.org/mailman/listinfo/python-list
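For contrast, a quick sketch of the Python side of this comparison, where `and` and `or` really do yield one of their operands rather than 0 or 1:

```python
# 'and' returns the first falsy operand, or else the last operand;
# 'or' returns the first truthy operand, or else the last operand.
print(23 and 45)    # 45 -- not 1, as the C program above prints
print(0 or 'x')     # 'x'
print('' and 'y')   # '' (short-circuits on the falsy left operand)
```

So both languages short-circuit, but only Python propagates the operand's actual value, which is the distinction the post is drawing.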
Re: Formatting Results so that They Can be Nicely Imported into a Spreadsheet.
[EMAIL PROTECTED] [EMAIL PROTECTED] wrote: ... Even with the if i included, we end up with an empty list at the start. This is because the first blank line wasn't blank, it was a space, so it passes the if i test. ...and you can fix that by changing the test to [... if i.split()]. Alex -- http://mail.python.org/mailman/listinfo/python-list
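The difference between the two tests, sketched with made-up sample data:

```python
lines = ['', ' ', 'alpha', 'beta ']

# 'if i' keeps the single-space line: any non-empty string is truthy.
kept_if_truthy = [i for i in lines if i]

# 'if i.split()' also drops whitespace-only lines, since str.split()
# with no arguments returns an empty (falsy) list for them.
kept_if_nonblank = [i for i in lines if i.split()]
```

The second test is the one the post recommends for filtering out lines that merely look blank.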
Re: Efficient Rank Ordering of Nested Lists
Cousin Stanley [EMAIL PROTECTED] wrote: ...

    for i, item in reversed(enumerate(sorted(single_list))):
        ...
    TypeError: argument to reversed() must be a sequence

Oops, right. Well then,

    aux_seq = list(enumerate(sorted(single_list)))
    for i, item in reversed(aux_seq):
        ...

or the like. Alex -- http://mail.python.org/mailman/listinfo/python-list
Re: Efficient Rank Ordering of Nested Lists
[EMAIL PROTECTED] [EMAIL PROTECTED] wrote: A naive approach to rank ordering (handling ties as well) of nested lists may be accomplished via:

    def rankLists(nestedList):
        def rankList(singleList):
            sortedList = list(singleList)
            sortedList.sort()
            return map(sortedList.index, singleList)
        return map(rankList, nestedList)

    unranked = [[1, 2, 3, 4, 5], [3, 1, 5, 2, 4], [-1.1, 2.2, 0, -1.1, 13]]
    print rankLists(unranked)
    [[0, 1, 2, 3, 4], [2, 0, 4, 1, 3], [0, 3, 2, 0, 4]]

This works nicely when the dimensions of the nested list are small. It is slow when they are big. Can someone suggest a clever way to speed it up? Each use of sortedList.index is O(N) [where N is len(singleList)], and you have N such uses in the map in the inner function, so this approach is O(N squared). Neil's suggestion to use bisect replaces the O(N) .index with an O(log N) search, so the overall performance is O(N log N) [[and you can't do better than that, big-O wise, because the sort step is also O(N log N)]]. beginner's advice to use a dictionary is also good and may turn out to be faster, just because dicts are SO fast in Python -- but you need to try and measure both alternatives. One way to use a dict (warning, untested code):

    def rankList(singleList):
        d = {}
        for i, item in reversed(enumerate(sorted(singleList))):
            d[item] = i
        return [d[item] for item in singleList]

If you find the for-clause too rich in functionality, you can of course split it up a bit; but note that you do need the 'reversed' to deal with the corner case of duplicate items (without it, you'd end up with 1s instead of 0s for the third one of the sublists in your example). If speed is of the essence you might try to measure what happens if you replace the returned expression with map(d.__getitem__, singleList), but I suspect the list comprehension is faster as well as clearer.
Another potential small speedup is to replace the first 3 statements with just one:

    d = dict((item, i) for i, item in reversed(enumerate(sorted(singleList))))

but THIS density of functionality is a bit above my personal threshold of comfort (sparse is better than dense:-). Alex -- http://mail.python.org/mailman/listinfo/python-list
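A self-contained, runnable version of the dict-based ranking, adapted to modern Python (reversed() needs a concrete sequence and map() returns an iterator, hence the list() calls; the function names are my own):

```python
def rank_list(single_list):
    # Walking the sorted pairs in reverse means duplicates end up with
    # the *lowest* index, matching the sortedList.index() semantics.
    d = {}
    for i, item in reversed(list(enumerate(sorted(single_list)))):
        d[item] = i
    return [d[item] for item in single_list]

def rank_lists(nested_list):
    return [rank_list(s) for s in nested_list]

unranked = [[1, 2, 3, 4, 5], [3, 1, 5, 2, 4], [-1.1, 2.2, 0, -1.1, 13]]
# rank_lists(unranked) reproduces the output quoted in the thread:
# [[0, 1, 2, 3, 4], [2, 0, 4, 1, 3], [0, 3, 2, 0, 4]]
```

Building the dict is O(N log N) for the sort plus O(N) for the loop, so the quadratic repeated-.index() cost of the naive version is gone.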
Re: Pythonic way for missing dict keys
Bruno Desthuilliers [EMAIL PROTECTED] wrote: Alex Popescu wrote: Bruno Desthuilliers [EMAIL PROTECTED] wrote in news:[EMAIL PROTECTED]: (snip) if hasattr(obj, '__call__'): # it's a callable but I don't find it so Pythonic to have to check for a __magic__ method. It looks like Python devs have decided it is Pythonic, because it is already in the PEP. I do know, and I disagree with this decision. FWIW, repr(obj) is mostly syntactic sugar for obj.__repr__(), getattr(obj, name) for obj.__getattr__(name), type(obj) for obj.__class__ etc... IOW, we do have a lot of builtin functions that mostly encapsulate calls to __magic__ methods, and I definitely don't understand why this specific one (= callable(obj)) should disappear. Maybe because it DOESN'T encapsulate a call to a magic method, but rather the mere check for the presence of one? I usually have a lot of respect for Guido's talent as a language designer (obviously since Python is still MFL), but I have to say I find this particular decision just plain stupid. Sorry. The mere check of whether an object possesses some important special method is best accomplished through the abstract-base-classes machinery (new in Python 3.0: see http://www.python.org/dev/peps/pep-3119/). At this time there is no Callable ABC, but you're welcome to argue for it on the python-3000 mailing list (please do check the archives and/or check privately with the PEP owner first to avoid duplication). Alex -- http://mail.python.org/mailman/listinfo/python-list
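For what it's worth, a sketch of the two checks under discussion in modern Python (where a Callable ABC did eventually land in collections.abc, and the callable() builtin survived after all):

```python
from collections.abc import Callable

class Greeter:
    def __call__(self):
        return 'hi'

# The duck-typed check quoted in the thread:
for obj in (len, Greeter, Greeter()):
    assert hasattr(obj, '__call__')

# The ABC machinery Alex points to, via the later Callable class:
assert isinstance(len, Callable)
assert not isinstance(42, Callable)
```

Both spellings answer the same question; the ABC form is the one the abstract-base-classes PEP anticipated.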
Re: Floats as keys in dict
Brian Elmegaard [EMAIL PROTECTED] wrote: I am making a script to optimize by dynamic programming. I do not know the vertices and nodes before the calculation, so I have decided to store the nodes I have in play as keys in a dict. However, the dict keys are then floats and I have to round the values of new possible nodes in each step. When profiling I see that the most time consuming part of my script is rounding. Is there a faster way than round() or is there a better way to test than 'in' or should I store the keys in another way than a dict? You may want to consider keeping a sorted list and using standard library module bisect for searches and insertions -- its behavior is O(log N) for a search, O(N) for an insertion, but it might be that in your case saving the rounding could be worth it. Otherwise, you need to consider a different container, based either on comparisons (e.g. AVL trees, of which there are several 3rd party implementations as Python extensions) or on a hashing function that will give the same hash for two numbers that are close enough (e.g., hash ignoring the lowest N bits of the float's mantissa for some N). round() operates on decimals and that may not be as fast as working on binary representations, but, to be fast, a helper function giving the hash of a binary-rounded float would have to be coded in C (or maybe use psyco). Alex -- http://mail.python.org/mailman/listinfo/python-list
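A sketch of the sorted-list alternative Alex suggests (the tolerance value and function names here are made up for illustration):

```python
import bisect

def close_enough(keys, x, tol=1e-9):
    """Search a sorted list for a value within tol of x -- O(log N)."""
    i = bisect.bisect_left(keys, x - tol)
    return i < len(keys) and abs(keys[i] - x) <= tol

keys = []
for v in (0.1, 0.2, 0.3):
    if not close_enough(keys, v):
        bisect.insort(keys, v)   # O(N) insertion keeps the list sorted

# 0.1 + 0.2 is not exactly 0.3, but it matches within tolerance,
# with no rounding of the stored keys at all.
```

This trades the per-step round() calls for a logarithmic search, which is the O(log N) / O(N) trade-off described in the post.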
Re: Where do they tech Python officialy ?
NicolasG [EMAIL PROTECTED] wrote: Open source projects do not require previous professional experience to accept volunteers. So, one way out of your dilemma is to make a name for yourself as an open source contributor -- help out with Python itself and/or with any of the many open source projects that use Python, and you will both learn a lot _and_ acquire professional experience that any enlightened employer will recognize as such. That will take a while, but not as long as getting a college degree (and it will be far cheaper than the degree). Alex I think this is the best idea to escape the python amateur circle and go in to open source project that are considered to be professional projects. I don't know if it will be better to find a project to contribute or to start a new one .. Will have a look around and think about. Unless you have some specific new idea that you're keen to address and can't be met by existing projects, joining an existing project would normally be a better bet. One-person projects are rarely as important as larger ones, and it's quite hard to get other collaborators to a new project; working in a project with existing code and contributors will also be more instructive. As for which OS projects are considered to be professional, just about all large successful ones are so considered: after all, even games, say, are professional projects from the POV of firms that develop and sell them, such as EA!-) Alex -- http://mail.python.org/mailman/listinfo/python-list
Re: Where do they tech Python officialy ?
Alex Popescu [EMAIL PROTECTED] wrote: ... and you will both learn a lot _and_ acquire professional experience that any enlightened employer will recognize as such. It depends :-). In my experience I met employers being concerned by my implication in the oss world :-). Considering that even the King of Proprietary Software, Microsoft, now happily hires major Open Source figures such as Jim Hugunin (MS was also a top-tier sponsor at the recent OSCON, with both managerial and senior technical employees giving keynotes and tech talks), it boggles the mind to think about which kind of company would instead be concerned by a candidate's OS experience. That will take a while, but not as long as getting a college degree (and it will be far cheaper than the degree). I don't know much about the open community in Python world, but in Java world becoming a project member may be more difficult than getting a degree (or close to :-)) ). In a major project, you will of course have to supply useful contributions as well as proving to have a reasonable personality &c before being granted committer privileges; and a few projects (centered on a group of committers employed by a single firm or on an otherwise close-knit small clique) are not very open to the outside world at all. But (at least wrt projects using Python, C, C++ -- I have no experience of opensource projects focused on Java instead) that is the exception, not the rule. Alex -- http://mail.python.org/mailman/listinfo/python-list