Re: [Python-Dev] Accepting PEP 3154 for 3.4?
On Tue, Nov 19, 2013 at 2:09 PM, Antoine Pitrou solip...@pitrou.net wrote: Well, I don't think it's a big deal to add a FRAME opcode if it doesn't change the current framing logic. I'd like to defer to Alexandre on this one, anyway.

Looking at the different options available to us:

1A. Mandatory framing
  (+) Allows the internal buffering layer of the Unpickler to rely on the presence of framing to simplify its implementation.
  (-) Forces all implementations of pickle to include support for framing if they want to use the new protocol.
  (-) Cannot be removed from future versions of the Unpickler without breaking protocols that mandate framing.

1B. Optional framing
  (+) Could allow optimizations to disable framing when beneficial (e.g., when pickling to and unpickling from a string).

2A. With explicit FRAME opcode
  (+) Makes optional framing simpler to implement.
  (+) Makes variable-length encoding of the frame size simpler to implement.
  (+) Makes framing visible to pickletools.
  (-) Adds an extra byte of overhead to each frame.

2B. No opcode

3A. With fixed 8-byte headers
  (+) Is simple to implement.
  (-) Adds overhead to small pickles.

3B. With variable-length headers
  (-) Requires Pickler implementations to do extra data copies when pickling to strings.

4A. Framing baked into the pickle protocol
  (+) Enables faster implementations.

4B. Framing through a specialized I/O buffering layer
  (+) Could be reused by other modules.

I may change my mind as I work on the implementation, but at least for now, I think the combination of 1B, 2A, 3A, and 4A is a reasonable compromise here. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
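For concreteness, here is a minimal sketch of what the 1B + 2A + 3A combination could look like on the wire. The opcode byte and the little-endian length format are assumptions made for illustration, not the final protocol:

```python
import io
import struct

FRAME = b"\x95"  # hypothetical one-byte FRAME opcode (option 2A)

def write_frame(payload: bytes) -> bytes:
    # Fixed 8-byte little-endian length header (option 3A).
    return FRAME + struct.pack("<Q", len(payload)) + payload

def read_frame(stream) -> bytes:
    # Framing is optional (option 1B): a reader that does not see
    # the FRAME opcode can fall back to unframed parsing.
    if stream.read(1) != FRAME:
        raise ValueError("stream is not framed")
    (length,) = struct.unpack("<Q", stream.read(8))
    return stream.read(length)
```

A framed pickle would then be a sequence of such chunks, with the unpickler's buffering layer reading one frame at a time.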
Re: [Python-Dev] Sharing docstrings between the Python and C implementations of a module
On Mon, Apr 15, 2013 at 12:56 AM, David Lam david.k.l...@gmail.com wrote: I tried to find an example in the source which addressed this, but found the docstrings in similar cases to be largely duplicated.

I find this annoying too. It would be nice to have a common way to share docstrings between C and Python implementations of the same interface. One roadblock, though, is that functions in C modules often document their parameters in their docstring:

>>> import _json
>>> help(_json.scanstring)
scanstring(...)
    scanstring(basestring, end, encoding, strict=True) -> (str, end)

    Scan the string s for a JSON string. End is the index of the
    character in s after the quote that started the JSON string.
    [...]

Argument Clinic will hopefully lift this roadblock soon. Perhaps we could add to the Clinic DSL a way to fetch the docstring directly from the Python implementation. And as an extra, once we have this in place it would be easy to add a verification step that checks that both implementations provide similar interfaces.
Re: [Python-Dev] Usage of += on strings in loops in stdlib
On Tue, Feb 12, 2013 at 1:44 PM, Antoine Pitrou solip...@pitrou.net wrote: It's idiomatic because strings are immutable (by design, not because of an optimization detail) and therefore concatenation *has* to imply building a new string from scratch.

Not necessarily. It is entirely possible to implement strings such that they are immutable and concatenation takes O(1): ropes are the canonical example of this.
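For illustration, a minimal rope sketch (hypothetical, not how CPython stores str) showing why concatenation over an immutable string type can be O(1):

```python
class Rope:
    """A minimal immutable rope: concatenation builds a two-node
    tree in O(1) instead of copying both operands."""

    def __init__(self, left, right=None):
        self.left, self.right = left, right
        self.length = len(left) + (len(right) if right is not None else 0)

    def __add__(self, other):
        return Rope(self, other)   # O(1): no character copying

    def __len__(self):
        return self.length

    def __str__(self):
        # Flattening walks the tree once: O(n) overall.
        parts, stack = [], [self]
        while stack:
            node = stack.pop()
            if isinstance(node, Rope):
                if node.right is not None:
                    stack.append(node.right)
                stack.append(node.left)
            else:
                parts.append(node)
        return "".join(parts)
```

The trade-off Alexandre alludes to in the next message is visible here: every character access now pays for tree traversal, which is why monolithic arrays win for typical short strings.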
Re: [Python-Dev] Usage of += on strings in loops in stdlib
On Tue, Feb 12, 2013 at 5:25 PM, Christian Tismer tis...@stackless.com wrote: Would ropes be an answer (and a simple way to cope with string mutation patterns) as an alternative implementation, and therefore still justify the usage of that pattern?

I don't think so. Ropes are really useful when you work with gigabytes of data, but unfortunately they don't make good general-purpose strings. Monolithic arrays are much more efficient and simpler for the typical use cases we have in Python.
Re: [Python-Dev] cpython: Issue #16218: skip test if filesystem doesn't support required encoding
On Thu, Nov 8, 2012 at 9:45 AM, Serhiy Storchaka storch...@gmail.com wrote: My intention was testing with a filename which cannot be decoded as UTF-8 in strict mode. I agree that testing with a name which is encodable in the locale encoding can be useful too, but now the test has no effect on a UTF-8 locale.

So should we change the test back? Or just change the test name?
Re: [Python-Dev] cpython: Issue #16218: skip test if filesystem doesn't support required encoding
The Unicode code points in the U+DC00–U+DFFF range (http://www.unicode.org/charts/PDF/UDC00.pdf, the low surrogate area) can't be encoded in UTF-8. Quoting from RFC 3629 (http://tools.ietf.org/html/rfc3629): *The definition of UTF-8 prohibits encoding character numbers between U+D800 and U+DFFF, which are reserved for use with the UTF-16 encoding form (as surrogate pairs) and do not directly represent characters.* It looks like this test was doing something specific with regard to this. So, I am curious as well about this change.

On Sat, Nov 3, 2012 at 10:13 AM, Antoine Pitrou solip...@pitrou.net wrote: On Sat, 3 Nov 2012 13:37:48 +0100 (CET), andrew.svetlov python-check...@python.org wrote:

http://hg.python.org/cpython/rev/95d1adf144ee
changeset: 80187:95d1adf144ee
user: Andrew Svetlov andrew.svet...@gmail.com
date: Sat Nov 03 14:37:37 2012 +0200
summary: Issue #16218: skip test if filesystem doesn't support required encoding

files: Lib/test/test_cmd_line_script.py | 7 ++-
1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/Lib/test/test_cmd_line_script.py b/Lib/test/test_cmd_line_script.py
--- a/Lib/test/test_cmd_line_script.py
+++ b/Lib/test/test_cmd_line_script.py
@@ -366,7 +366,12 @@
     def test_non_utf8(self):
         # Issue #16218
         with temp_dir() as script_dir:
-            script_basename = '\udcf1\udcea\udcf0\udce8\udcef\udcf2'
+            script_basename = '\u0441\u043a\u0440\u0438\u043f\u0442'

Why exactly did you change the tested name here?

Regards

Antoine.
Re: [Python-Dev] Benchmarking Python 3.3 against Python 2.7 (wide build)
On Sun, Sep 30, 2012 at 4:50 PM, Brett Cannon br...@python.org wrote: I accidentally left out the telco benchmark, which is bad since cdecimal makes it just scream on Python 3.3 (and I verified with Python 3.2 that this is an actual speedup and not some silly screw-up like I initially had with spectral_norm):

You could also make the pickle benchmark use the C accelerator module by passing the --use_cpickle flag. The Python 3 version should be a lot faster.
Re: [Python-Dev] On a new version of pickle [PEP 3154]: self-referential frozensets
On Sat, Jun 23, 2012 at 3:19 AM, M Stefan mstefa...@gmail.com wrote: * UNION_FROZENSET: like UPDATE_SET, but create a new frozenset

stack before: ... pyfrozenset mark stackslice
stack after:  ... pyfrozenset.union(stackslice)

Since frozensets are immutable, could you explain how adding the UNION_FROZENSET opcode helps in pickling self-referential frozensets? Or are you only adding this one to follow the current style used for pickling dicts and lists in protocols 1 and onward?

While this design allows pickling of self-referential sets, self-referential frozensets are still problematic. For instance, trying to pickle `fs`: a = A(); fs = frozenset([a]); a.fs = fs (when unpickling, the object a has to be initialized before it is added to the frozenset). The only way I can think of to make this work is to postpone the initialization of all the objects inside the frozenset until after UNION_FROZENSET. I believe this is doable, but there might be memory penalties if the approach is to simply store all the initialization opcodes in memory until pickling the frozenset is finished.

I don't think that's the only way. You could also emit a POP opcode to discard the frozenset from the stack and then emit a GET to fetch it back from the memo. This is how we currently handle self-referential tuples. Check out the save_tuple method in pickle.py to see how it is done. Personally, I would prefer that approach because it is already well-tested and proven to work. That said, your approach sounds good too. The memory trade-off could lead to smaller pickles and more efficient decoding (though these self-referential objects are rare enough that I don't think any improvements there would matter much).

While self-referential frozensets are uncommon, a far more problematic situation is with self-referential objects created with REDUCE. While pickle uses the idea of creating empty collections and then filling them, reduce typically creates already-filled objects.
For instance: cnt = collections.Counter(); cnt[a] = 3; a.cnt = cnt

>>> cnt.__reduce__()
(<class 'collections.Counter'>, ({<__main__.A object at 0x0286E8F8>: 3},))

where the A object contains a reference to the counter. Unpickling an object pickled with this reduce function is not possible, because the reduce function, which explains how to create the object, is asking for the object to exist before being created.

Your example seems to work on Python 3. I am not sure I follow what you are trying to say. Can you provide a working example?

$ python3
Python 3.1.2 (r312:79147, Dec 9 2011, 20:47:34)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pickle, collections
>>> c = collections.Counter()
>>> class A: pass
...
>>> a = A()
>>> c[a] = 3
>>> a.cnt = c
>>> b = pickle.loads(pickle.dumps(a))
>>> b in b.cnt
True

Pickle could try to fix this by detecting when reduce returns a class type as the first tuple arg and moving the dict ctor parameter to the state, but this may not always be intended. It's also a bit strange that __getstate__ is never used anywhere in pickle directly.

I would advise against any such change. The reduce protocol is already fairly complex. Further, I don't think changing it this way would give us any extra flexibility. The documentation has a good explanation of how __getstate__ works under the hood: http://docs.python.org/py3k/library/pickle.html#pickling-class-instances And if you need more, PEP 307 (http://www.python.org/dev/peps/pep-0307/) provides some of the design rationales of the API.
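The POP-then-GET handling of self-referential tuples mentioned above can be observed directly with pickletools:

```python
import pickle
import pickletools

x = []
t = (x,)
x.append(t)   # t's only element is a list that refers back to t

data = pickle.dumps(t, protocol=2)
pickletools.dis(data)   # the disassembly shows the tuple being
                        # discarded (POP) and later fetched back
                        # from the memo (BINGET)

u = pickle.loads(data)
assert u[0][0] is u     # the cycle survives the round-trip
```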
[Python-Dev] What should we do with cProfile?
Hello,

As per PEP 3108, we were supposed to merge profile/cProfile into one unified module. I initially championed the change, but other things got in the way and I never got to the point of a useful patch. I posted some code and outlined an approach for how the merge could be done. However, there are still a lot of details to be worked out. So I am wondering whether we should abandon the change altogether or attempt it for the next release. Personally, I am slightly leaning toward the former option, since the two modules are actually fairly different underneath even though they are used similarly. And also because it is getting late to make such backward-incompatible changes. I am willing to volunteer to push the change through if it is still desired by the community.

Cheers!

http://bugs.python.org/issue2919
Re: [Python-Dev] Cython for cPickle?
On Thu, Apr 19, 2012 at 4:55 AM, Stefan Behnel stefan...@behnel.de wrote: That sounds like less than two weeks of work, maybe even if we add the marshal module to it. In less than a month of GSoC time, this could easily reach a point where it's close to the speed of what we have and fast enough, but a lot more accessible and maintainable, thus also making it easier to add the extensions described in the PEP. What do you think?

As others have pointed out, many users of pickle depend on its performance. The main reason why _pickle.c is so big is all the low-level optimizations we have in there. We have custom stack and dictionary implementations just for the sake of speed. We also have fast paths for I/O operations and function calls. These optimizations alone easily take 2000 lines of code, and they are not micro-optimizations: each of them was shown to give speedups of one to several orders of magnitude. So I disagree that we could easily reach the point where it's close to the speed of what we have. And if we were to attempt this, it would be a multiple-month undertaking. I would rather see that time spent on improving pickle than on yet another reimplementation. -- Alexandre
Re: [Python-Dev] Cython for cPickle?
On Sun, Apr 22, 2012 at 6:12 PM, mar...@v.loewis.de wrote: So I disagree that we could easily reach the point where it's close to the speed of what we have. And if we were to attempt this, it would be a multiple-month undertaking. I would rather see that time spent on improving pickle than on yet another reimplementation. Of course, this being free software, anybody can spend time on whatever they please, and this should not make anybody feel sad. You just don't get merits if you work on stuff that nobody cares about.

Yes, of course. I don't want to discourage anyone from investigating this option; in fact, I would very much like to see myself proven wrong. But, if I understood Stefan correctly, he is proposing to have a GSoC student do the work, which I would feel uneasy about since we have no idea how valuable this contribution would be. -- Alexandre
Re: [Python-Dev] PEP 3154 - pickle protocol 4
On Fri, Aug 12, 2011 at 3:58 AM, Antoine Pitrou solip...@pitrou.net wrote: Hello, This PEP is an attempt to foster a number of small incremental improvements in a future pickle protocol version. The PEP process is used in order to gather as many improvements as possible, because the introduction of a new protocol version should be a rare occurrence. Feel free to suggest any additions.

Your propositions all sound good to me. We will need to agree on the details, but I believe these improvements to the current protocol will be appreciated. Also, one thing that keeps coming back is the need for pickling functions and methods which are not part of the global namespace (e.g., issue 9276: http://bugs.python.org/issue9276). Support for this would likely help us fix another related namespace issue (i.e., issue 3657: http://bugs.python.org/issue3657). Finally, we are currently missing support for pickling classes with __new__ taking keyword-only arguments (i.e., issue 4727: http://bugs.python.org/issue4727). -- Alexandre
Re: [Python-Dev] Speeding up 2to3: Results from a GSOC Project
Love it! BTW, it's not a good idea to have an import statement under 3 levels of loops: https://code.google.com/p/2to3-speedup2/source/browse/trunk/lib2to3/refactor.py#427 -- Alexandre
Re: [Python-Dev] Readability of hex strings (Was: Use of coding cookie in 3.x stdlib)
[+Python-ideas -Python-Dev]

>>> import binascii
>>> def h(s):
...     return binascii.unhexlify("".join(s.split()))
...
>>> h("DE AD BE EF CA FE BA BE")

-- Alexandre

On Mon, Jul 26, 2010 at 11:29 AM, anatoly techtonik techto...@gmail.com wrote: I find the \xXX\xXX\xXX\xXX... notation for binary data totally unreadable. Everybody who uses and analyses binary data is more familiar with plain hex dumps in the form of XX XX XX XX. I wonder if it is possible to introduce an effective binary string type that will be represented as h'XX XX XX' in language syntax? It will be much easier to analyze printed binary data and copy/paste such data as-is from hex editors/views.

On Mon, Jul 19, 2010 at 9:45 AM, Guido van Rossum gu...@python.org wrote: Sounds like a good idea to try to remove redundant cookies *and* to remove most occasional use of non-ASCII characters outside comments (except for unittests specifically trying to test Unicode features). Personally I would use \xXX escapes instead of spelling out the characters in shlex.py, for example. With or without the coding cookies, many ways of displaying text files garble characters outside the ASCII range, so it's better to stick to ASCII as much as possible. --Guido

On Mon, Jul 19, 2010 at 1:21 AM, Alexander Belopolsky alexander.belopol...@gmail.com wrote: I was looking at the inspect module and noticed that its source starts with # -*- coding: iso-8859-1 -*-. I have checked and there are no non-ascii characters in the file.
There are several other modules that still use the cookie:

Lib/ast.py:# -*- coding: utf-8 -*-
Lib/getopt.py:# -*- coding: utf-8 -*-
Lib/inspect.py:# -*- coding: iso-8859-1 -*-
Lib/pydoc.py:# -*- coding: latin-1 -*-
Lib/shlex.py:# -*- coding: iso-8859-1 -*-
Lib/encodings/punycode.py:# -*- coding: utf-8 -*-
Lib/msilib/__init__.py:# -*- coding: utf-8 -*-
Lib/sqlite3/__init__.py:#-*- coding: ISO-8859-1 -*-
Lib/sqlite3/dbapi2.py:#-*- coding: ISO-8859-1 -*-
Lib/test/bad_coding.py:# -*- coding: uft-8 -*-
Lib/test/badsyntax_3131.py:# -*- coding: utf-8 -*-

I understand that coding: utf-8 is strictly redundant in 3.x. There are cases such as Lib/shlex.py where using an encoding other than utf-8 is justified. (See http://svn.python.org/view?view=rev&revision=82560.) What are the guidelines for other cases? Should redundant cookies be removed? Since not all editors respect the -*- cookie, I think the answer should be yes, particularly when the cookie is setting an encoding other than utf-8.
Re: [Python-Dev] Future of 2.x.
On Wed, Jun 9, 2010 at 1:23 PM, Martin v. Löwis mar...@v.loewis.de wrote: Closing the backport requests is fine. For the feature requests, I'd only close them *after* the 2.7 release (after determining that they won't apply to 3.x, of course). There aren't that many backport requests, anyway, are there?

There are only a few requests (about five). -- Alexandre
Re: [Python-Dev] Future of 2.x.
On Wed, Jun 9, 2010 at 5:55 AM, Facundo Batista facundobati...@gmail.com wrote: Yes, closing the tickets as won't fix and tagging them as will-never-happen-in-2.x or something is the best combination of both worlds: it will clean the tracker and ease further development, and will allow anybody to pick up those tickets later.

The issues I care about are already tagged as 26backport. So, I don't think another keyword is needed. -- Alexandre
[Python-Dev] Future of 2.x.
Is there any plan for a 2.8 release? If not, I will go through the tracker and close outstanding backport requests of 3.x features to 2.x. -- Alexandre
Re: [Python-Dev] Did I miss the decision to untabify all of the C code?
On Wed, May 5, 2010 at 8:52 PM, Joao S. O. Bueno jsbu...@python.org.br wrote: Python 2.7 is in beta, but not applying such a fix now would probably mean that Python 2.x would forever remain with the mixed tabs, since it would make much less sense for such a change in a minor revision (although I'd favor it even there).

Since 2.7 is likely the last release of the 2.x series, wouldn't it be more productive to spend time improving it instead of wasting time on minor details like indentation? -- Alexandre
Re: [Python-Dev] Running Clang 2.7's static analyzer over the code base
On Mon, May 3, 2010 at 7:34 PM, Barry Warsaw ba...@python.org wrote: Now would be a good time to convert the C files to 4 space indents. We've only been talking about it for a decade at least.

Will changing the indentation of source files to 4 space indents break patches on the bug tracker? -- Alexandre
Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution
On Fri, Apr 23, 2010 at 2:11 PM, Dan Gindikin dgindi...@gmail.com wrote: We were having performance problems unpickling a large pickle file, we were getting 170s running time (which was fine), but 1100mb memory usage. Memory usage ought to have been about 300mb; this was happening because of memory fragmentation, due to many unnecessary puts in the pickle stream. We made a pickletools.optimize-inspired tool that could run directly on a pickle file and used pickletools.genops. This solved the unpickling problem (84s, 382mb). However the tool itself was using too much memory and time (1100s, 470mb), so I recoded it to scan through the pickle stream directly, without going through pickletools.genops, giving (240s, 130mb).

Collin Winter wrote a simple optimization pass for cPickle in Unladen Swallow [1]. The code reads through the stream and removes all the unnecessary PUTs in-place.

[1]: http://code.google.com/p/unladen-swallow/source/browse/trunk/Modules/cPickle.c#735

Other people that deal with large pickle files are probably having similar problems, and since this comes up when dealing with large data, it is precisely in this situation that you probably can't use pickletools.optimize or pickletools.genops. It feels like functionality that ought to be added to pickletools; is there some way I can contribute this?

Just put your code on bugs.python.org and I will take a look. -- Alexandre
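For reference, the stdlib already exposes this transformation as pickletools.optimize(), which drops PUT opcodes that no GET ever references (it does, however, need the whole pickle in memory, which is exactly the constraint Dan describes):

```python
import pickle
import pickletools

obj = [[1, 2], [3, 4]]
data = pickle.dumps(obj, protocol=2)

# Strip the PUT opcodes that are never fetched back by a GET.
slim = pickletools.optimize(data)

assert pickle.loads(slim) == obj
assert len(slim) <= len(data)
```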
Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution
On Fri, Apr 23, 2010 at 2:38 PM, Alexandre Vassalotti alexan...@peadrop.com wrote: Collin Winter wrote a simple optimization pass for cPickle in Unladen Swallow [1]. The code reads through the stream and removes all the unnecessary PUTs in-place.

I just noticed the code removes *all* PUT opcodes, regardless of whether they are needed or not. So, this code can only be used if there's no GET in the stream (which is unlikely for a large stream). I believe Collin made this trade-off for performance reasons. However, it wouldn't be hard to make the current code work like pickletools.optimize(). -- Alexandre
Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution
On Fri, Apr 23, 2010 at 3:07 PM, Collin Winter collinwin...@google.com wrote: I should add that adding the necessary bookkeeping to remove only unused PUTs (instead of the current all-or-nothing scheme) should not be hard. I'd watch out for a further performance/memory hit; the pickling benchmarks in the benchmark suite should help assess this.

I was thinking about this too. A simple boolean table could be fast, while keeping the space requirement down. This scheme would be friendly to caches as well.

The current optimization penalizes pickling to speed up unpickling, which made sense when optimizing pickles that would go into memcache and be read out 13-15x more often than they were written.

This is my current impression of how pickle is most often used. Are you aware of a use case of pickle where you do more writes than reads? I can't think of any. -- Alexandre
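A sketch of the bookkeeping being discussed, using pickletools.genops: one pass records which memo slots are ever fetched by a GET, giving a table that tells a second pass which PUTs are safe to drop. (This is an illustration of the idea, not the Unladen Swallow code.)

```python
import pickle
import pickletools

def used_put_ids(data):
    # First pass: collect every memo id referenced by a
    # GET/BINGET/LONG_BINGET opcode. A PUT whose id is not in
    # this set could be removed by a second pass.
    used = set()
    for opcode, arg, pos in pickletools.genops(data):
        if "GET" in opcode.name:
            used.add(arg)
    return used

x = [1, 2]
shared = pickle.dumps([x, x], protocol=2)      # shared ref forces a GET
unshared = pickle.dumps([[1], [2]], protocol=2)  # no sharing, no GETs

assert len(used_put_ids(shared)) >= 1
assert used_put_ids(unshared) == set()
```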
Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution
On Fri, Apr 23, 2010 at 3:57 PM, Dan Gindikin dgindi...@gmail.com wrote: This wouldn't help our use case, your code needs the entire pickle stream to be in memory, which in our case would be about 475mb, this is on top of the 300mb+ data structures that generated the pickle stream.

In that case, the best we could do is a two-pass algorithm to remove the unused PUTs. That won't be efficient, but it will satisfy the memory constraint. Another solution is to not generate the PUTs at all by setting the 'fast' attribute on the Pickler. But that won't work if you have a recursive structure, or have code that requires the identity of objects to be preserved.

>>> import io, pickle, pickletools
>>> x = [1, 2]
>>> f = io.BytesIO()
>>> p = pickle.Pickler(f, protocol=-1)
>>> p.dump([x, x])
>>> pickletools.dis(f.getvalue())
    0: \x80 PROTO      2
    2: ]    EMPTY_LIST
    3: q    BINPUT     0
    5: (    MARK
    6: ]        EMPTY_LIST
    7: q        BINPUT     1
    9: (        MARK
   10: K            BININT1    1
   12: K            BININT1    2
   14: e            APPENDS    (MARK at 9)
   15: h        BINGET     1
   17: e        APPENDS    (MARK at 5)
   18: .    STOP
highest protocol among opcodes = 2
>>> [id(x) for x in pickle.loads(f.getvalue())]
[20966504, 20966504]

Now with the 'fast' mode enabled:

>>> f = io.BytesIO()
>>> p = pickle.Pickler(f, protocol=-1)
>>> p.fast = True
>>> p.dump([x, x])
>>> pickletools.dis(f.getvalue())
    0: \x80 PROTO      2
    2: ]    EMPTY_LIST
    3: (    MARK
    4: ]        EMPTY_LIST
    5: (        MARK
    6: K            BININT1    1
    8: K            BININT1    2
   10: e            APPENDS    (MARK at 5)
   11: ]        EMPTY_LIST
   12: (        MARK
   13: K            BININT1    1
   15: K            BININT1    2
   17: e            APPENDS    (MARK at 12)
   18: e        APPENDS    (MARK at 3)
   19: .    STOP
highest protocol among opcodes = 2
>>> [id(x) for x in pickle.loads(f.getvalue())]
[20966504, 21917992]

As you can observe, the pickle stream generated with the fast mode might actually be bigger. By the way, it is weird that the total memory usage of the data structure is smaller than the size of its respective pickle stream. What pickle protocol are you using?
-- Alexandre
Re: [Python-Dev] contributor to committer
On Wed, Feb 24, 2010 at 7:13 AM, Florent Xicluna florent.xicl...@gmail.com wrote: Hello, I am a semi-regular contributor to Python: I have contributed many patches since the end of last year, some of them reviewed by Antoine. Lately, he suggested that I should apply for commit rights.

+1 -- Alexandre
Re: [Python-Dev] Help wanted on a code generator project
On Tue, Jan 26, 2010 at 7:04 AM, Yingjie Lan lany...@yahoo.com wrote: note that this is quite off-topic for this list, which is about the development of the CPython interpreter and runtime environment. Sorry if this is bothering you. I thought there are a lot of people here who know how to write extensions and have a lot of experience. These are exactly the best people to perfect expy. On the other hand, expy, once perfected, would be a nice tool to expedite adding runtime modules to Python. I am not aware of other nice places to ask for help of such a sort. If you know, please let me know, thanks in advance.

It is the third time now that people have let you know that announcements about your project are not welcome on this mailing list.

http://mail.python.org/pipermail/python-dev/2009-July/090699.html
http://mail.python.org/pipermail/python-dev/2009-August/091023.html

So please stop playing the ignorance card and behave appropriately. -- Alexandre
Re: [Python-Dev] PEP 3003 - Python Language Moratorium
On Tue, Nov 3, 2009 at 12:35 PM, Guido van Rossum gu...@python.org wrote: I've checked draft (!) PEP 3003, Python Language Moratorium, into SVN. As authors I've listed Jesse, Brett and myself.

+1 from me. -- Alexandre
Re: [Python-Dev] [Fwd: [issue6397] Implementing Solaris poll in the select module]
On Wed, Jul 1, 2009 at 10:05 PM, Guido van Rossum gu...@python.org wrote: The select module already supports the poll() system call. Or is there a special variant that only Solaris has?

I think Jesus refers to /dev/poll, i.e., the interface for edge-triggered polling on Solaris. This is the Solaris equivalent of FreeBSD's kqueue and Linux's epoll. -- Alexandre
Re: [Python-Dev] Draft PEP 385: Migrating from svn to Mercurial
On Mon, Jun 8, 2009 at 3:57 PM, Martin v. Löwis mar...@v.loewis.de wrote: FWIW, I really think that PEP 385 should really grow a timeline pretty soon. Are we going to switch this year, next year, or in 2011?

+1 -- Alexandre
Re: [Python-Dev] Dropping bytes support in json
On Mon, Apr 13, 2009 at 5:25 PM, Daniel Stutzbach dan...@stutzbachenterprises.com wrote: On Mon, Apr 13, 2009 at 3:02 PM, Martin v. Löwis mar...@v.loewis.de wrote: True, I can always convert from bytes to str or vice versa. I think you are missing the point. It will not be necessary to convert. Sometimes I want bytes and sometimes I want str. I am going to be converting some of the time. ;-) Below is a basic CGI application that assumes the json module works with str, not bytes. How would you write it if the json module does not support returning a str?

print("Content-Type: application/json; charset=utf-8")
input_object = json.loads(sys.stdin.read())
output_object = do_some_work(input_object)
print(json.dumps(output_object))
print()

Like this?

print("Content-Type: application/json; charset=utf-8")
input_object = json.loads(sys.stdin.buffer.read())
output_object = do_some_work(input_object)
sys.stdout.buffer.write(json.dumps(output_object))

-- Alexandre ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
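If json stays str-only, the conversion lives at the I/O boundary instead. A minimal runnable sketch of that round-trip, where BytesIO objects stand in for sys.stdin.buffer and sys.stdout.buffer, and utf-8 is assumed as the RFC's default encoding:

```python
import io
import json

stdin_buffer = io.BytesIO(b'{"n": 1}')   # stands in for sys.stdin.buffer
stdout_buffer = io.BytesIO()             # stands in for sys.stdout.buffer

# decode on the way in, encode on the way out; json itself stays str-only
obj = json.loads(stdin_buffer.read().decode("utf-8"))
stdout_buffer.write(json.dumps(obj).encode("utf-8"))

assert json.loads(stdout_buffer.getvalue().decode("utf-8")) == {"n": 1}
```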
Re: [Python-Dev] Dropping bytes support in json
On Thu, Apr 9, 2009 at 1:15 AM, Antoine Pitrou solip...@pitrou.net wrote: As for reading/writing bytes over the wire, JSON is often used in the same context as HTML: you are supposed to know the charset and decode/encode the payload using that charset. However, the RFC specifies a default encoding of utf-8. (*) (*) http://www.ietf.org/rfc/rfc4627.txt That is one short and sweet RFC. :-) The RFC also specifies a discrimination algorithm for non-supersets of ASCII (“Since the first two characters of a JSON text will always be ASCII characters [RFC0020], it is possible to determine whether an octet stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking at the pattern of nulls in the first four octets.”), but it is not implemented in the json module. Given that the RFC specifies that the encoding used should be one of the encodings defined by Unicode, wouldn't it be a better idea to remove the unicode support instead? To me, it would make sense to use the detection algorithms for Unicode to sniff the encoding of the JSON stream and then use the detected encoding to decode the strings embedded in the JSON stream. Cheers, -- Alexandre ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
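The RFC's discrimination rule is small enough to sketch. This is an illustration of the heuristic quoted above, not code from the json module:

```python
def detect_json_encoding(data):
    """Guess the Unicode encoding of a BOM-less JSON byte stream.

    Applies the null-pattern rule from RFC 4627, section 3: the first
    two characters of a JSON text are always ASCII.
    """
    head = data[:4]
    if head.startswith(b"\x00\x00\x00"):
        return "utf-32-be"            # 00 00 00 xx
    if head[0:1] == b"\x00" and head[2:3] == b"\x00":
        return "utf-16-be"            # 00 xx 00 xx
    if head[1:4] == b"\x00\x00\x00":
        return "utf-32-le"            # xx 00 00 00
    if head[1:2] == b"\x00":
        return "utf-16-le"            # xx 00 xx 00
    return "utf-8"                    # xx xx xx xx

assert detect_json_encoding(b'{"a": 1}') == "utf-8"
assert detect_json_encoding('{"a": 1}'.encode("utf-16-le")) == "utf-16-le"
assert detect_json_encoding('{"a": 1}'.encode("utf-32-be")) == "utf-32-be"
```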
Re: [Python-Dev] Mercurial?
On Tue, Apr 7, 2009 at 2:03 AM, Stephen J. Turnbull step...@xemacs.org wrote: Alexandre Vassalotti writes: This reminds me that we will have to decide how we will reorganize our workflow. For this, we can either be conservative and keep the current CVS-style development workflow--i.e., a few main repositories to which all developers can commit. That was the original idea of PEP 374; that was a presumption under which I wrote my part of it, and I think we should stick with it. As people develop personal workflows, they can suggest them, and/or changes in the public workflow needed to support them. But there should be a working sample implementation before thinking about changes to the workflow. Aahz convinced me earlier that changing the current workflow would be stupid. So, I now think the best thing to do is to provide a CVS-style environment similar to what we have currently, and let the workflow evolve naturally as developers gain more confidence with Mercurial. Or we could drink the kool-aid and go with a kernel-style development workflow--i.e., each developer maintains his own branch and pulls changes from the others. Can you give examples of projects using Mercurial that do that? Mercurial itself is developed using that style, I believe. -- Alexandre ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] BufferedReader.peek() ignores its argument
On Sat, Apr 4, 2009 at 9:03 PM, Antoine Pitrou solip...@pitrou.net wrote: Hello, Currently, BufferedReader.peek() ignores its argument and can return more or less than the number of bytes requested by the user. This is how it was implemented in the Python version, and we've reflected this in the C version. It seems a bit strange and unhelpful though. Should we change the implementation so that the argument to peek() becomes the upper bound to the number of bytes returned? I am not sure if this is a good idea. Currently, the argument of peek() is documented as a lower bound that cannot exceed the size of the buffer: Returns buffered bytes without advancing the position. The argument indicates a desired minimal number of bytes; we do at most one raw read to satisfy it. We never return more than self.buffer_size. Changing the meaning of peek() now could introduce at least some confusion and maybe also bugs. And personally, I like the current behavior, since it guarantees that peek() won't return an empty string unless you reached the end-of-file. Plus, it is fairly easy to cap the number of bytes returned by doing f.peek()[:upper_bound]. -- Alexandre ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
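The lower-bound guarantee described above can be seen directly. A small sketch against BufferedReader, where the buffer size and sample bytes are arbitrary choices:

```python
import io

raw = io.BytesIO(b"hello world")
buf = io.BufferedReader(raw, buffer_size=8)

head = buf.peek(1)         # a lower bound: at least 1 byte unless at EOF,
                           # but possibly as much as buffer_size
assert 1 <= len(head) <= 8
assert buf.tell() == 0     # the stream position has not advanced

capped = buf.peek(1)[:4]   # capping the result manually, as suggested above
assert len(capped) <= 4
```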
[Python-Dev] Should I/O object wrappers close their underlying buffer when deleted?
Hello, I would like to call to your attention the following behavior of TextIOWrapper:

import io

def test(buf):
    textio = io.TextIOWrapper(buf)

buf = io.BytesIO()
test(buf)
print(buf.closed)  # This prints True currently

The problem here is TextIOWrapper closes its buffer when deleted. BufferedRWPair behaves similarly. The solution is simply to override the __del__ method of TextIOWrapper inherited from IOBase. -- Alexandre ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
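Until such a fix lands, user code can sidestep the close-on-delete behavior by detaching the wrapper before dropping it. A sketch of that workaround (not the proposed __del__ fix), assuming an io version that provides TextIOWrapper.detach():

```python
import io

buf = io.BytesIO()
text = io.TextIOWrapper(buf)
text.write("data")
text.flush()
text.detach()        # disconnect the wrapper from its buffer
del text             # deleting the wrapper no longer closes buf

assert not buf.closed
assert buf.getvalue() == b"data"
```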
Re: [Python-Dev] Mercurial?
On Sun, Apr 5, 2009 at 5:06 AM, Martin v. Löwis mar...@v.loewis.de wrote: Off the top of my head, the following is needed for a successful migration: - Verify that the repository at http://code.python.org/hg/ is properly converted. I see that this has four branches. What about all the other branches? Will they be converted, or not? What about the stuff outside /python? I am not sure if it would be useful to convert the old branches to Mercurial. The simplest thing to do would be to keep the current svn repository as a read-only archive. And if people need to commit to these branches, they could request the branch to be imported into a Mercurial branch (or a simple-to-use script could be provided that developers run directly on the server to create a user branch). In particular, the Stackless people have requested that they move along with what core Python does, so their code should also be converted. Noted. - Add Mercurial support to the issue tracker. Not sure what this means. There is currently svn support insofar as the tracker can format rNNN references into ViewCVS links; this should be updated if possible (removed if not). There would also be a possibility to auto-close issues from the commit messages. This is not done currently, so I would not make it a prerequisite for the switch. Yes, I was referring to the rNNN references. Actually, I am not sure how this could be implemented, since with Mercurial we lose atomic revision IDs. We could use something like h...@branch-name (e.g., bf94293b1...@py3k) to refer to a specific revision. An auto-close would be a nice feature, but, as you said, not necessary for the migration. The main stumbling block in implementing an auto-close feature is to define when an issue should be closed. Maybe we could add our own meta-data to the commit message. For example: Fix some nasty bug. 
Close-Issue: 4532 When such a commit arrives in one of the main branches, a commit hook would close the issue if all the affected releases have been fixed. - Setup temporary svn mirrors for the main Mercurial repositories. What is that? I think it would be a good idea to host temporary svn mirrors for developers who access their VCS via an IDE. Although, I am not sure anymore if supporting these developers (if there are any) would be worth the trouble. So, think of this as optional. - Augment code.python.org infrastructure to support the creation of developer accounts. One option would be to carry on with the current setup; migrating it to hg might work as well, of course. You mean the current setup for svn.python.org? Would you be comfortable letting this machine be accessed by core developers through SSH? Since with Mercurial, SSH access will be needed for server-side clone (or, a script similar to what the Mozilla folk have [1] could be added). [1]: https://developer.mozilla.org/en/Publishing_Mercurial_Clones - Update the release.py script. There are probably some other things that I missed Here are some: - integrate with the buildbot Good one. It seems buildbot has support for Mercurial. [2] So, this will be a matter of tweaking the right options. The batch scripts in Tools/buildbot will also need to be updated. [2]: http://djmitche.github.com/buildbot/docs/0.7.10/#How-Different-VC-Systems-Specify-Sources - come up with a strategy for /external (also relevant for the buildbot slaves) Since the directories in /external are considered read-only, we could simply create a new Mercurial repository and copy the content of /external into it. When a new release needs to be added, just create a new directory and commit. - decide what to do with the bzr mirrors I don't see much benefit in keeping them. So, I say, archive the branches there unless someone steps up to maintain them. 
-- Alexandre ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
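A hook implementing the Close-Issue convention discussed above would mostly be message parsing. A minimal sketch, where the Close-Issue field name comes from the example in the message and everything else is illustrative:

```python
import re

CLOSE_ISSUE_RE = re.compile(r"^Close-Issue:\s*(\d+)\s*$", re.MULTILINE)

def issues_to_close(commit_message):
    """Return the tracker issue ids named on Close-Issue: lines."""
    return [int(num) for num in CLOSE_ISSUE_RE.findall(commit_message)]

message = "Fix some nasty bug.\n\nClose-Issue: 4532\n"
assert issues_to_close(message) == [4532]
assert issues_to_close("Unrelated commit.") == []
```

A changegroup hook on the main branches could call a function like this on each incoming changeset and notify the tracker for every id returned.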
Re: [Python-Dev] Mercurial?
On Sun, Apr 5, 2009 at 6:27 AM, Antoine Pitrou solip...@pitrou.net wrote: Alexandre Vassalotti alexandre at peadrop.com writes: Off the top of my head, the following is needed for a successful migration: There's also the issue of how we adapt the current workflow of svnmerging between branches when we want to back- or forward-port stuff. In particular, tracking of already done or blocked backports. (the issue being that svnmerge is different from what DVCS'es call merging :-)) See the PEP about that. I have written a fair amount of detail on how this would work with Mercurial: http://www.python.org/dev/peps/pep-0374/#backport -- Alexandre ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Mercurial?
On Sun, Apr 5, 2009 at 1:37 PM, Martin v. Löwis mar...@v.loewis.de wrote: I think it should be stated in the PEP what branches get converted, in what form, and what the further usage of the svn repository should be. Noted. I think there is a long tradition of such annotations; we should try to repeat history here. IIUC, the Debian bugtracker understands Closes: #4532 and some other syntaxes. It must be easy to remember, else people won't use it. That sounds reasonable. Personally, I don't really care about the syntax we use as long as it is consistent and documented. Any decision to have or not have such a feature should be stated in the PEP. I personally don't use IDEs, so I don't care (although I do notice that the apparent absence of IDE support for Mercurial indicates maturity of the technology) I know NetBeans has Mercurial support built-in (which makes sense because Sun uses Mercurial for its open-source projects). However, I am not sure if Eclipse has good Mercurial support yet. There are 3rd-party plugins for Eclipse, but I don't know if they work well. Ok, I take that back. I assumed that Mercurial could work *exactly* as Subversion. Apparently, that's not the case (although I have no idea what a server-side clone is). So I wait for the PEP to explain how authentication and access control is to be implemented. Creating individual Unix accounts for committers should be avoided. With Subversion, we can do a server-side clone (or copy) using the copy command: svn copy SRC_URL DEST_URL Doing the copy directly on the server avoids wasting time and bandwidth. Without this feature, you would need to check out the remote repository you want to clone, then push it to a different location. Since upload bandwidth is often limited, creating a new branch in such a fashion would be time-consuming. With Mercurial, we will need to add support for server-side clones ourselves. There are a few ways to provide this feature. 
We could give Unix user accounts to all core developers and let them manage their private branches directly on the server; you made clear that this is not wanted. An alternative approach is to add an interface accessible via SSH. As I previously mentioned, this is the approach used by Mozilla. Yet another approach would be to add a web interface for managing the repositories. This is what the OpenSolaris admins opted for. Personally, I do not think this is a good idea because it would require us to roll our own authentication mechanism, which is clearly a bad thing (both security-wise and usability-wise). This reminds me that we will have to decide how we will reorganize our workflow. For this, we can either be conservative and keep the current CVS-style development workflow—i.e., a few main repositories to which all developers can commit. Or we could drink the kool-aid and go with a kernel-style development workflow—i.e., each developer maintains his own branch and pulls changes from the others. From what I have heard, the CVS-style workflow has a lower overhead than the kernel-style workflow. However, the kernel-style workflow is somewhat advantageous because changes get reviewed several times before they get into the main branches. Thus, it is less likely that someone manages to break the build. In addition, Mercurial is much better suited to supporting the kernel-style workflow. However, if we go kernel-style, I will need to designate someone (i.e., an integrator) to maintain the main branches, which will be tested by buildbot and used for the public releases. These are issues I would like to address in the PEP. I can give you access to the master setup. Ideally, this should be tested before the switchover (with a single branch). We also need instructions for the slaves (if any - perhaps installing a hg binary is sufficient). I am not too familiar with our buildbot setup. So, I will have to do some reading before actually making any changes. 
You can give me access to the buildbot master now. However, I would use this access only to study how the current setup works and to plan the changes we need accordingly. Since the directories in /external are considered read-only, we could simply create a new Mercurial repository and copy the content of /external into it. - decide what to do with the bzr mirrors I don't see much benefit in keeping them. Both should go into the PEP. Noted. Regards, -- Alexandre ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Mercurial?
On Sun, Apr 5, 2009 at 2:45 PM, Dirkjan Ochtman dirk...@ochtman.nl wrote: On 05/04/2009 20:36, Martin v. Löwis wrote: We do require full real names (i.e. no nicknames). Can Mercurial guarantee such a thing? We could pre-record the list of allowed names in a hook, then have the hook check that usernames include one of those names and an email address (so people can still start using another email address). But that won't work when people who are not core developers submit patch bundles for us to import. And maintaining such a white-list sounds more burdensome than necessary to me. -- Alexandre ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
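For concreteness, the kind of whitelist check being discussed (and argued against here) might look like the following. The names and the "Full Name <email>" username format are illustrative assumptions, not a real hook from the tree:

```python
import re

ALLOWED_NAMES = {"Alexandre Vassalotti", "Martin v. Loewis"}  # hypothetical

# Mercurial usernames are conventionally "Full Name <email@host>"
USERNAME_RE = re.compile(r"^(?P<name>[^<>]+?)\s*<[^<>@\s]+@[^<>\s]+>$")

def username_allowed(username):
    """True if the changeset username carries a whitelisted real name."""
    match = USERNAME_RE.match(username)
    return bool(match) and match.group("name") in ALLOWED_NAMES

assert username_allowed("Alexandre Vassalotti <alexandre@peadrop.com>")
assert not username_allowed("somenick <x@y.org>")
assert not username_allowed("No Email Given")
```

As the message notes, changesets imported from outside contributors would fail such a check, which is part of why the idea looks burdensome.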
Re: [Python-Dev] Mercurial?
On Sun, Apr 5, 2009 at 2:40 PM, Martin v. Löwis mar...@v.loewis.de wrote: Okay, sounds like that will be easy. Would be good to enable compression on the SSH, though, if that's not already done. Where is that configured? If I recall correctly, only ssh clients can request compression from the server—in other words, the server cannot force the clients to use compression, but merely allows them to use it. See the man pages for sshd_config and ssh_config for the specific details. -- Alexandre ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
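For reference, a client-side configuration sketch; the host name is illustrative, and per the man pages the server can permit compression but only the client requests it:

```
# ~/.ssh/config on the committer's machine
Host code.python.org
    Compression yes
```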
Re: [Python-Dev] Mercurial?
On Mon, Apr 6, 2009 at 12:20 AM, Aahz a...@pythoncraft.com wrote: How difficult would it be to change the decision later? That is, how about starting with a CVS-style system and maybe switch to kernel-style once people get comfortable with Hg? I believe it would be fairly easy. It would be a matter of designating a volunteer to maintain the main repositories and asking core developers to avoid committing directly to them. Cheers, -- Alexandre ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Mercurial?
On Sat, Apr 4, 2009 at 11:40 AM, Aahz a...@pythoncraft.com wrote: With Brett's (hopefully temporary!) absence, who is spearheading the Mercurial conversion? Whoever it is should probably take over PEP 374 and start updating it with the conversion plan, particularly WRT expectations for dates relative to 3.1 final and 2.7 final. I am willing to take over this. I was in charge of the Mercurial scenarios in the PEP, so it would be natural for me to continue with the transition. In addition, I volunteer to maintain the new Mercurial installation. Off the top of my head, the following is needed for a successful migration:

- Verify that the repository at http://code.python.org/hg/ is properly converted.
- Convert the current svn commit hooks to Mercurial.
- Add Mercurial support to the issue tracker.
- Update the developer FAQ.
- Setup temporary svn mirrors for the main Mercurial repositories.
- Augment code.python.org infrastructure to support the creation of developer accounts.
- Update the release.py script.

There are probably some other things that I missed, but I think this is a good overview of what needs to be done. And of course, I would welcome anyone who would be willing to help me with the transition. -- Alexandre ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] issue5578 - explanation
On Tue, Mar 31, 2009 at 11:25 PM, Guido van Rossum gu...@python.org wrote: Well hold on for a minute, I remember we used to have an exec statement in a class body in the standard library, to define some file methods in socket.py IIRC. FYI, collections.namedtuple is also implemented using exec. - Alexandre ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
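The technique referenced can be sketched as follows. This is a simplified stand-in for what namedtuple does with exec, not its actual source:

```python
# generate class source as a string, exec it, and pull the class out
template = '''\
class Point(tuple):
    x = property(lambda self: self[0])
    y = property(lambda self: self[1])
'''
namespace = {}
exec(template, namespace)
Point = namespace["Point"]

p = Point((1, 2))
assert (p.x, p.y) == (1, 2)
```

The real namedtuple builds the source with the field names requested by the caller, which is exactly what makes exec convenient here.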
Re: [Python-Dev] Should the io-c modules be put in their own directory?
On Fri, Apr 3, 2009 at 5:12 PM, Benjamin Peterson benja...@python.org wrote: I'm +.2. This is the layout I would suggest: Modules/ _io/ _io.c stringio.c textio.c etc That seems good to me. I opened an issue on the tracker and included a patch. http://bugs.python.org/issue5682 -- Alexandre ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Should the io-c modules be put in their own directory?
Hello, I just noticed that the new io-c modules were merged in the py3k branch (I know, I am kind of late on the news—blame school work). Anyway, I am just wondering if it would be a good idea to put the io-c modules in a sub-directory (like sqlite), instead of scattering them around in the Modules/ directory. Cheers, -- Alexandre ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] GPython?
On Thu, Mar 26, 2009 at 11:40 PM, Collin Winter coll...@gmail.com wrote: In fact, right now I'm adding a last few tests before putting our cPickle patches up on the tracker for further review. Put me on the nosy list when you do; and when I get some free time, I will give your patches a complete review. I've already taken a quick look at the cPickle changes you did in Unladen and I think some (i.e., the custom memo table) are definitely worth merging into the mainline. Cheers, -- Alexandre ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] IO implementation: in C and Python?
On Fri, Feb 20, 2009 at 12:35 AM, Steven D'Aprano st...@pearwood.info wrote: Currently, if I want to verify that (say) cFoo and Foo do the same thing, or compare their speed, it's easy because I can import the modules separately. Given the 3.0 approach, how would one access the Python versions without black magic or hacks? My preferred way to handle this is to keep the original Python implementations with a leading underscore (e.g., pickle._Pickler). I found this was the easiest way to test the C and Python implementations without resorting to import hacks. -- Alexandre ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
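A small sketch of this convention as it appears in the modern pickle module: the underscore name is always the pure-Python class, while the public name may be the C accelerator when it is available:

```python
import io
import pickle

# the underscore name always refers to the pure-Python implementation
py_pickler = pickle._Pickler(io.BytesIO(), protocol=2)
assert hasattr(py_pickler, "dump")

# the public name may be _pickle.Pickler (the C accelerator) instead
assert callable(pickle.Pickler)
```

Benchmarking or cross-testing the two is then a matter of instantiating both names against the same inputs.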
Re: [Python-Dev] undesireable unpickle behavior, proposed fix
On Tue, Jan 27, 2009 at 5:16 PM, Jake McGuire j...@youtube.com wrote: Another vaguely related change would be to store string and unicode objects in the pickler memo keyed as themselves rather than their object ids. That wouldn't be difficult to do--i.e., simply add a type check in Pickler.memoize and another in Pickler.save(). But I am not sure that would be a good idea, since you would end up hashing every string pickled. And that would probably be expensive if you are pickling long strings. -- Alexandre ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
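The trade-off can be illustrated against the id()-keyed memo that pickle uses today: a repeat of the same object pickles to a short back-reference, while an equal-but-distinct string is written out in full. A small sketch:

```python
import pickle

n = 1000
s1 = "spam" * n
s2 = "spam" * n              # equal to s1, but a distinct object
assert s1 == s2 and s1 is not s2

# id()-keyed memo: the second reference to the *same* object becomes a
# short back-reference, but the equal-but-distinct string does not
same_object = len(pickle.dumps((s1, s1)))
distinct_objects = len(pickle.dumps((s1, s2)))
assert distinct_objects > same_object
```

Keying the memo by value would shrink the second pickle too, at the cost of hashing every string that passes through the pickler.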
Re: [Python-Dev] test_subprocess and sparc buildbots
Here is what I found just by analyzing the logs. It seems the first failures appeared after this change: http://svn.python.org/view/python/branches/release30-maint/Objects/object.c?rev=67888&view=diff&r1=67888&r2=67887&p1=python/branches/release30-maint/Objects/object.c&p2=/python/branches/release30-maint/Objects/object.c The logs of failing test runs all show the same error message:

[31481 refs]
* ob
object : <refcnt 0 at 0x3a97728>
type : str
refcount: 0
address : 0x3a97728
* op->_ob_prev->_ob_next
object : <refcnt 0 at 0x3a97728>
type : str
refcount: 0
address : 0x3a97728
* op->_ob_next->_ob_prev
object :
[31776 refs]

This is the output of _Py_ForgetReference (which calls _PyObject_Dump), called either from _PyUnicode_New or unicode_subtype_new. In both cases, this implies PyObject_MALLOC returned NULL when allocating the internal array of a str object. However, I have no idea why malloc() is failing there. By counting the number of [reftotal] lines printed in the log, I found that the failing test could be one of the following: test_invalid_args, test_invalid_bufsize, test_list2cmdline, test_no_leaking. Looking at the tests, it seems only test_no_leaking could be problematic:

* test_list2cmdline checks that the subprocess.list2cmdline function works correctly; only Python code is involved here;
* test_invalid_args checks that using an option unsupported by a platform raises an exception; only Python code is involved here;
* test_invalid_bufsize only checks whether Popen rejects a non-integer bufsize; only Python code is involved here. 
And unsurprisingly, that is the failing test:

test test_subprocess failed -- Traceback (most recent call last):
  File "/home/pybot/buildarea-sid/3.0.klose-debian-sparc/build/Lib/test/test_subprocess.py", line 423, in test_no_leaking
    data = p.communicate(b"lime")[0]
  File "/home/pybot/buildarea-sid/3.0.klose-debian-sparc/build/Lib/subprocess.py", line 671, in communicate
    return self._communicate(input)
  File "/home/pybot/buildarea-sid/3.0.klose-debian-sparc/build/Lib/subprocess.py", line 1171, in _communicate
    bytes_written = os.write(self.stdin.fileno(), chunk)
OSError: [Errno 32] Broken pipe

It seems one of the spawned processes runs out of memory while allocating a new PyUnicode object. I believe we don't see the usual MemoryError because the parent process catches stderr and stdout of the children. Also, only klose-*-sparc buildbots are failing this way; loewis-sun is failing too but for a different reason. So, how much memory is available on this machine (or actually, on this virtual machine)? Now, I wonder why manipulating the GIL caused the bug to appear in 3.0, but not in 2.x. Maybe it is related to the new I/O library in Python 3.0. Regards, -- Alexandre On Tue, Dec 30, 2008 at 4:20 PM, Nick Coghlan ncogh...@gmail.com wrote: Does anyone have local access to a sparc machine to try to track down the ongoing buildbot failures in test_subprocess? (I think the problem is specific to 3.x builds on sparc machines, but I haven't checked the buildbots all that closely - that assessment is just based on what I recall of the buildbot failure emails). Cheers, Nick. 
-- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia --- ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] test_subprocess and sparc buildbots
On Tue, Dec 30, 2008 at 10:41 PM, Daniel (ajax) Diniz aja...@gmail.com wrote: A reliable way to get that in a --with-pydebug build seems to be:

~/py3k$ ./python -c "import locale; locale.format_string(1,1)"
* ob
object : <refcnt 0 at 0x825c76c>
type : tuple
refcount: 0
address : 0x825c76c
* op->_ob_prev->_ob_next
NULL
* op->_ob_next->_ob_prev
object : <refcnt 0 at 0x825c76c>
type : tuple
refcount: 0
address : 0x825c76c
Fatal Python error: UNREF invalid object
TypeError: expected string or buffer
Aborted

Nice catch! I reduced your example to: import _sre; _sre.compile(0, 0, []). And it doesn't seem to be an input validation problem with _sre. From what I saw, it's actually a bug in Py_TRACE_REFS's code. Now, it's getting interesting! It seems something is breaking the refchain. However, I don't know what is causing the problem exactly. Found using Fusil in a very quick run on top of: Python 3.1a0 (py3k:68055M, Dec 31 2008, 01:34:52) [GCC 4.2.4 (Ubuntu 4.2.4-1ubuntu3)] on linux2 So kudos to Victor again :) Could you share the details of how you used Fusil to find another crasher? It sounds like a useful tool. Thanks! -- Alexandre ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)
On Mon, Dec 22, 2008 at 7:34 PM, Antoine Pitrou solip...@pitrou.net wrote: Now, we should find a way to benchmark this without having to steal Mike's machine and wait 30 minutes every time. So, I seem to reproduce it. The following script takes about 15 seconds to run and allocates a 2 GB dict which it deletes at the end (gc disabled of course). With 2.4, deleting the dict takes ~1.2 seconds while with 2.5 and higher (including 3.0), deleting the dict takes ~3.5 seconds. Nothing spectacular but the difference is clear. I modified your script to delete the dictionary without actually deallocating the items in it. You can speed up a dictionary deallocation significantly if you keep a reference to its items and delete the dictionary before deleting its items. In Python 2.4, the same behavior exists, but is not as strongly marked as in Python 2.6 with pymalloc enabled. I can understand that deallocating the items in the order (or actually, the reverse order) they were allocated is faster than doing so in a rather haphazard manner (i.e., like dict). However, I am not sure why pymalloc accentuates this behavior. -- Alexandre

Python 2.6 with pymalloc, without pydebug

a...@helios:~$ python2.6 dict_dealloc_test.py
creating 397476 items... - 6.613 s.
building dict... - 0.230 s.
deleting items... - 0.059 s.
deleting dict... - 2.299 s.
total deallocation time: 2.358 seconds.

a...@helios:~$ python2.6 dict_dealloc_test.py
creating 397476 items... - 6.530 s.
building dict... - 0.228 s.
deleting dict... - 0.089 s.
deleting items... - 0.971 s.
total deallocation time: 1.060 seconds.

Python 2.6 without pymalloc, without pydebug

a...@helios:release26-maint$ ./python /home/alex/dict_dealloc_test.py
creating 397476 items... - 5.921 s.
building dict... - 0.244 s.
deleting items... - 0.073 s.
deleting dict... - 1.502 s.
total deallocation time: 1.586 seconds.

a...@helios:release26-maint$ ./python /home/alex/dict_dealloc_test.py
creating 397476 items... - 6.122 s.
building dict... 
- 0.237 s.
deleting dict... - 0.092 s.
deleting items... - 1.238 s.
total deallocation time: 1.330 seconds.

a...@helios:~$ python2.4 dict_dealloc_test.py
creating 397476 items... - 6.164 s.
building dict... - 0.218 s.
deleting items... - 0.057 s.
deleting dict... - 1.185 s.
total deallocation time: 1.243 seconds.

a...@helios:~$ python2.4 dict_dealloc_test.py
creating 397476 items... - 6.202 s.
building dict... - 0.218 s.
deleting dict... - 0.090 s.
deleting items... - 0.852 s.
total deallocation time: 0.943 seconds.

##
import random
import time
import gc

# Adjust this parameter according to your system RAM!
target_size = int(2.0 * 1024**3)  # 2.0 GB
pool_size = 4 * 1024
# This is a ballpark estimate: 60 bytes overhead for each
# { dict entry struct + float object + tuple object header },
# 1.3 overallocation factor for the dict.
target_length = int(target_size / (1.3 * (pool_size + 60)))

def make_items():
    print ("creating %d items..." % target_length)
    # 1. Initialize a set of pre-computed random keys.
    keys = [random.random() for i in range(target_length)]
    # 2. Build the values that will constitute the dict. Each value will, as
    #    far as possible, span a contiguous `pool_size` memory area.
    # Over 256 bytes per alloc, PyObject_Malloc defers to the system malloc().
    # We avoid that by allocating tuples of smaller longs.
    int_size = 200
    # 24 roughly accounts for the long object overhead (YMMV)
    int_start = 1 << ((int_size - 24) * 8 - 7)
    int_range = range(1, 1 + pool_size // int_size)
    values = [None] * target_length
    # Maximize allocation locality by pre-allocating the values
    for n in range(target_length):
        values[n] = tuple(int_start + j for j in int_range)
    return list(zip(keys, values))

if __name__ == "__main__":
    gc.disable()
    t1 = time.time()
    items = make_items()
    t2 = time.time()
    print ("- %.3f s." % (t2 - t1))
    print ("building dict...")
    t1 = time.time()
    testdict = dict(items)
    t2 = time.time()
    print ("- %.3f s." % (t2 - t1))

    def delete_testdict():
        global testdict
        print ("deleting dict...")
        t1 = time.time()
        del testdict
        t2 = time.time()
        print ("- %.3f s." % (t2 - t1))

    def delete_items():
        global items
        print ("deleting items...")
        t1 = time.time()
        del items
        t2 = time.time()
        print ("- %.3f s." % (t2 - t1))

    t1 = time.time()
    # Swap these, and look at the total time
    delete_items()
    delete_testdict()
    t2 = time.time()
    print ("total deallocation time: %.3f seconds." % (t2 - t1))

___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)
On Fri, Dec 19, 2008 at 6:29 PM, Mike Coleman tutu...@gmail.com wrote: I have a program that creates a huge (45GB) defaultdict. (The keys are short strings, the values are short lists of pairs (string, int).) Nothing but possibly the strings and ints is shared. That is, after executing the final statement (a print), it is apparently spending a huge amount of time cleaning up before exiting. I have done 'gc.disable()' for performance (which is hideous without it)--I have no reason to think there are any loops.
Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)
[Sorry for the previous garbage post.]

On Fri, Dec 19, 2008 at 6:29 PM, Mike Coleman tutu...@gmail.com wrote: I have a program that creates a huge (45GB) defaultdict. (The keys are short strings, the values are short lists of pairs (string, int).) Nothing but possibly the strings and ints is shared.

Could you give us more information about the dictionary? For example, how many objects does it contain? Is 45GB the actual size of the dictionary or of the Python process?

That is, after executing the final statement (a print), it is apparently spending a huge amount of time cleaning up before exiting.

Most of this time is probably spent on DECREF'ing objects in the dictionary. As others mentioned, it would be useful to have a self-contained example to examine the behavior more closely.

I have done 'gc.disable()' for performance (which is hideous without it)--I have no reason to think there are any loops.

Have you seen any significant difference in the exit time when the cyclic GC is disabled or enabled?

-- Alexandre
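The DECREF cost being discussed is easy to observe on a smaller scale. A minimal sketch (modern Python syntax; the sizes and the shape of the data are scaled-down assumptions based on Mike's description, not his actual workload):

```python
import gc
import time

def build_dict(n):
    # Short string keys mapped to short lists of (string, int) pairs,
    # loosely mimicking the 45 GB defaultdict described in the thread.
    return {"k%d" % i: [("v", i)] for i in range(n)}

def time_teardown(n, use_gc):
    d = build_dict(n)
    if not use_gc:
        gc.disable()
    try:
        start = time.perf_counter()
        del d  # DECREFs every key, value list, and tuple in the dict
        return time.perf_counter() - start
    finally:
        gc.enable()

print("teardown with gc: %.4f s" % time_teardown(200000, True))
print("teardown without gc: %.4f s" % time_teardown(200000, False))
```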
Re: [Python-Dev] Reindenting the C code base?
On Mon, Dec 15, 2008 at 3:59 PM, Guido van Rossum gu...@python.org wrote: Aha! A specific file. I'm supportive of fixing that specific file. Now if you can figure out how to do it and still allow merging between 2.6 and 3.0 that would be cool.

Here's the simplest solution I have thought of so far to allow smooth merging subsequently. First, fix the 2.6 version with a 4-space indent. Over a third of the file is already using spaces for indentation, so I don't think losing consistency is a big deal. Then, block the trunk commit with svnmerge to prevent it from being merged back to the py3k branch. Finally, fix the 3.0 version.

-- Alexandre
Re: [Python-Dev] Reindenting the C code base?
On Sat, Dec 13, 2008 at 5:11 PM, Antoine Pitrou solip...@pitrou.net wrote: Guido van Rossum guido at python.org writes: I think we should not do this. We should use 4 space indents for new files, but existing files should not be reindented. Well, right now many files are indented with a mix of spaces and tabs, depending on who did the edit and how their editor was configured at the time.

Personally, I think the indentation of, at least, Objects/unicodeobject.c should be fixed. This file has become so mixed up with tab and space indents that I have no idea what to use when I edit it. Just to give an idea of how messy it is, there are 5214 lines indented with tabs and 4272 indented with spaces (out of the file's 9733 lines).

-- Alexandre
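Counts like the 5214/4272 split above can be regenerated with a few lines. The classification rule here — looking only at the first character of each line — is my assumption, not necessarily how the original numbers were produced:

```python
def count_indents(path):
    # Classify each line by the first character of its indentation.
    tabs = spaces = 0
    with open(path) as f:
        for line in f:
            if line.startswith("\t"):
                tabs += 1
            elif line.startswith(" "):
                spaces += 1
    return tabs, spaces

# e.g. count_indents("Objects/unicodeobject.c")
```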
Re: [Python-Dev] Reindenting the C code base?
On Sun, Dec 14, 2008 at 12:43 PM, Jeffrey Yasskin jyass...@gmail.com wrote: I've never figured out how to configure emacs to deduce whether the current file uses spaces or tabs and has a 4 or 8 space indent. I always try to get it right anyway, but it'd be a lot more convenient if my editor did it for me. If there are such instructions, perhaps they should be added to PEPs 7 and 8? I know python-mode is able to detect indent configuration of python code automatically, but I don't know if c-mode is able to. Personally,
Re: [Python-Dev] Reindenting the C code base?
On Sun, Dec 14, 2008 at 12:57 PM, Alexandre Vassalotti alexan...@peadrop.com wrote: On Sun, Dec 14, 2008 at 12:43 PM, Jeffrey Yasskin jyass...@gmail.com wrote: I've never figured out how to configure emacs to deduce whether the current file uses spaces or tabs and has a 4 or 8 space indent. I always try to get it right anyway, but it'd be a lot more convenient if my editor did it for me. If there are such instructions, perhaps they should be added to PEPs 7 and 8? I know python-mode is able to detect indent configuration of python code automatically, but I don't know if c-mode is able to. Personally,

[sorry, tabspace in gmail made it send my unfinished email]

Personally, I use auto-mode-alist to make Emacs choose the indent configuration to use automatically. Here's how it looks for me:

(defmacro def-styled-c-mode (name style &rest body)
  "Define styled C modes."
  `(defun ,name ()
     (interactive)
     (c-mode)
     (c-set-style ,style)
     ,@body))

(def-styled-c-mode python-c-mode "python"
  (setq indent-tabs-mode t
        tab-width 8
        c-basic-offset 8))

(def-styled-c-mode py3k-c-mode "python"
  (setq indent-tabs-mode nil
        tab-width 4
        c-basic-offset 4))

(setq auto-mode-alist
      (append '(("/python.org/python/.*\\.[ch]\\'" . python-c-mode)
                ("/python.org/.*/.*\\.[ch]\\'" . py3k-c-mode))
              auto-mode-alist))
Re: [Python-Dev] 2to3 question about fix_imports.
On Fri, Dec 12, 2008 at 11:39 AM, Lennart Regebro rege...@gmail.com wrote: The fix_imports fix seems to fix only the first import per line that you have. So if you do for example import urllib2, cStringIO it will not fix cStringIO. Is this a bug or a feature? :-) If it's a feature it should warn at least, right?

Which revision of Python are you using? I tried the test case you gave and 2to3 translated it perfectly.

-- Alexandre

a...@helios:~$ cat test.py
import urllib2, cStringIO
s = cStringIO.StringIO(urllib2.randombytes(100))
a...@helios:~$ 2to3 test.py
RefactoringTool: Skipping implicit fixer: buffer
RefactoringTool: Skipping implicit fixer: idioms
RefactoringTool: Skipping implicit fixer: set_literal
RefactoringTool: Skipping implicit fixer: ws_comma
--- test.py (original)
+++ test.py (refactored)
@@ -1,3 +1,3 @@
-import urllib2, cStringIO
+import urllib.request, urllib.error, io
-s = cStringIO.StringIO(urllib2.randombytes(100))
+s = io.StringIO(urllib2.randombytes(100))
RefactoringTool: Files that need to be modified:
RefactoringTool: test.py
Re: [Python-Dev] 2to3 question about fix_imports.
On Sun, Dec 14, 2008 at 1:34 PM, Lennart Regebro rege...@gmail.com wrote: On Sun, Dec 14, 2008 at 19:19, Alexandre Vassalotti alexan...@peadrop.com wrote: Which revision of python are you using? I tried the test-case you gave and 2to3 translated it perfectly. 3.0, I haven't tried with trunk yet, and possibly it's a more complicated usecase.

Strange, fix_imports in Python 3.0 (final) looks fine. If you can come up with a reproducible example, please open a bug on bugs.python.org and set me as the assignee (my user id is alexandre.vassalotti). Thanks,

-- Alexandre
Re: [Python-Dev] Proper initialization of structs
On Thu, Oct 30, 2008 at 1:00 PM, Fred Drake [EMAIL PROTECTED] wrote: It's good to move work into __init__ where reasonable, so that it can be avoided if a subclass wants it done in a completely different way, but __new__ can't work that way.

And that is exactly the reason why the _pickle module doesn't use __new__ for initialization. Doing any kind of argument parsing in __new__ prevents subclasses from customizing the arguments for their __init__. Although I agree that __new__ should be used, whenever possible, to initialize struct members.

-- Alexandre
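The division of labor being argued for can be sketched in pure Python: __new__ only fills in safe defaults, while all argument parsing stays in __init__, so a subclass remains free to reshape the constructor signature. Class and attribute names below are illustrative, not taken from _pickle:

```python
class Base:
    def __new__(cls, *args, **kwargs):
        # No argument parsing here: just give every "struct member"
        # a sensible default, as the thread recommends.
        self = super().__new__(cls)
        self.buffer = None
        self.count = 0
        return self

    def __init__(self, data):
        self.buffer = data

class Subclass(Base):
    # Free to accept completely different constructor arguments,
    # because __new__ never tried to parse them.
    def __init__(self, a, b):
        super().__init__(a + b)

obj = Subclass("x", "y")
print(obj.buffer)
```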
Re: [Python-Dev] Proper initialization of structs
[oops, I forgot to cc the list]

On Thu, Oct 30, 2008 at 7:43 PM, Christian Heimes [EMAIL PROTECTED] wrote: Alexandre Vassalotti wrote: And that is exactly the reason why, the _pickle module doesn't use __new__ for initialization. Doing any kind of argument parsing in __new__ prevents subclasses from customizing the arguments for their __init__. Although, I agree that __new__ should be used, whenever it is possible, to initialize struct members. You are misunderstanding me. I want everybody to set the struct members to *A* sensible default value, not *THE* value. Argument parsing can still happen in tp_init. tp_new should (or must?) set all struct members to sensible defaults like NULL for pointers, -1 or 0 for numbers etc. Python uses malloc to allocate memory. Unless you are using debug builds the memory block is not initialized. In both cases the block of memory isn't zeroed. You all know the problems caused by uninitialized memory.

But what if PyType_GenericAlloc is used for tp_alloc? As far as I know, the memory block allocated with PyType_GenericAlloc is zeroed.

-- Alexandre
Re: [Python-Dev] C API for gc.enable() and gc.disable()
On Wed, Jun 25, 2008 at 4:55 PM, Martin v. Löwis [EMAIL PROTECTED] wrote: I think exactly the other way 'round. The timing of things should not matter at all, only the exact sequence of allocations and deallocations.

Would it be possible, if not a good idea, to only track object deallocations as the GC traversal trigger? As far as I know, dangling cyclic references cannot be formed when allocating objects. So, this could potentially mitigate the quadratic behavior during allocation bursts.

-- Alexandre
Re: [Python-Dev] C API for gc.enable() and gc.disable()
On Thu, Jun 26, 2008 at 12:01 AM, Martin v. Löwis [EMAIL PROTECTED] wrote: Would it be possible, if not a good idea, to only track object deallocations as the GC traversal trigger? As far as I know, dangling cyclic references cannot be formed when allocating objects. Not sure what you mean by that.

x = []
x.append(x)
del x

creates a cycle with no deallocation occurring.

Oh... never mind then.

-- Alexandre
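Martin's counterexample is easy to check: the cycle is created by pure allocation-side operations, and nothing is deallocated until the collector runs. A quick demonstration:

```python
import gc

gc.collect()   # start from a clean slate
gc.disable()   # make sure no collection sneaks in

x = []
x.append(x)    # self-referencing list
del x          # the refcount never drops to zero; no deallocation here

gc.enable()
reclaimed = gc.collect()   # only the cyclic collector can free the list
print("unreachable objects found:", reclaimed)
```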
Re: [Python-Dev] C API for gc.enable() and gc.disable()
On Sun, Jun 1, 2008 at 12:28 AM, Adam Olsen [EMAIL PROTECTED] wrote: On Sat, May 31, 2008 at 10:11 PM, Alexandre Vassalotti [EMAIL PROTECTED] wrote: Would anyone mind if I did add a public C API for gc.disable() and gc.enable()? I would like to use it as an optimization for the pickle module (I found out that I get a good 2x speedup just by disabling the GC while loading large pickles). Of course, I could simply import the gc module and call the functions there, but that seems overkill to me. I included the patch below for review. I'd rather see it fixed. It behaves quadratically if you load enough to trigger full collection a few times.

Do you have any idea how this behavior could be fixed? I am not a GC expert, but I could try to fix this.

-- Alexandre
Re: [Python-Dev] [Python-3000] Betas today - I hope
On Wed, Jun 11, 2008 at 7:35 AM, Barry Warsaw [EMAIL PROTECTED] wrote: My plan is to begin building the betas tonight, at around 9 or 10pm EDT (0100 to 0200 UTC Thursday). If a showstopper comes up before then, I'll email the list. If you think we really aren't ready for beta, then I would still like to get a release out today. In that case, we'll call it alpha and delay the betas.

I have two release blockers pending review:

http://bugs.python.org/issue2918
http://bugs.python.org/issue2917

I believe both patches are ready to be committed to the py3k branch. However, I would certainly like someone to review the patches (or at least test them).

Right now, I am looking at fixing issue 2919 (http://bugs.python.org/issue2919). The profile and cProfile modules differ much more than I originally expected. So, I won't be able to get these two for the beta.

I have also been looking at http://bugs.python.org/issue2874, in which Benjamin Peterson proposed a simple solution to fix it. Although I haven't tried his approach, I think I could get this one done for today.

Finally, I would like to commit the patch in http://bugs.python.org/issue2523, which fixes the quadratic behavior in BufferedReader.read(). It would also be nice to have someone else experienced with the io module review the patch.

Cheers,
-- Alexandre
Re: [Python-Dev] [Python-3000] How to specify keyword-only arguments from C?
On Thu, Jun 5, 2008 at 11:18 PM, Alexandre Vassalotti [EMAIL PROTECTED] wrote: On Thu, Jun 5, 2008 at 10:14 PM, Mark Hammond [EMAIL PROTECTED] wrote: Set an error if the 'arg' tuple doesn't have a length of zero? Oh, that isn't a bad idea at all. I will try this. Thanks!

Worked flawlessly! Just for the archives, here's how it looks:

static int
Unpickler_init(UnpicklerObject *self, PyObject *args, PyObject *kwds)
{
    static char *kwlist[] = {"file", "encoding", "errors", 0};
    PyObject *file;
    char *encoding = NULL;
    char *errors = NULL;

    if (Py_SIZE(args) != 1) {
        PyErr_Format(PyExc_TypeError,
                     "%s takes exactly 1 positional argument (%zd given)",
                     Py_TYPE(self)->tp_name, Py_SIZE(args));
        return -1;
    }

    if (!PyArg_ParseTupleAndKeywords(args, kwds, "O|ss:Unpickler", kwlist,
                                     &file, &encoding, &errors))
        return -1;
    ...

Thank you, Mark, for the tip!

-- Alexandre
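In Python-level code the same contract can be written directly. A hypothetical pure-Python mirror of the initializer above, with everything after file keyword-only (the defaults here are assumptions, not taken from _pickle):

```python
# Hypothetical Python-level mirror of the C initializer: one positional
# argument, the rest keyword-only.
def unpickler_init(file, *, encoding="ASCII", errors="strict"):
    return (file, encoding, errors)

print(unpickler_init("f"))
try:
    unpickler_init("f", "utf-8")  # a second positional argument is rejected
except TypeError as exc:
    print("rejected:", exc)
```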
[Python-Dev] C API for gc.enable() and gc.disable()
Would anyone mind if I did add a public C API for gc.disable() and gc.enable()? I would like to use it as an optimization for the pickle module (I found out that I get a good 2x speedup just by disabling the GC while loading large pickles). Of course, I could simply import the gc module and call the functions there, but that seems overkill to me. I included the patch below for review.

-- Alexandre

Index: Include/objimpl.h
===================================================================
--- Include/objimpl.h (revision 63766)
+++ Include/objimpl.h (working copy)
@@ -221,8 +221,10 @@
  * ==
  */

-/* C equivalent of gc.collect(). */
+/* C equivalent of gc.collect(), gc.enable() and gc.disable(). */
 PyAPI_FUNC(Py_ssize_t) PyGC_Collect(void);
+PyAPI_FUNC(void) PyGC_Enable(void);
+PyAPI_FUNC(void) PyGC_Disable(void);

 /* Test if a type has a GC head */
 #define PyType_IS_GC(t) PyType_HasFeature((t), Py_TPFLAGS_HAVE_GC)

Index: Modules/gcmodule.c
===================================================================
--- Modules/gcmodule.c (revision 63766)
+++ Modules/gcmodule.c (working copy)
@@ -1252,6 +1252,18 @@
 	return n;
 }

+void
+PyGC_Disable(void)
+{
+	enabled = 0;
+}
+
+void
+PyGC_Enable(void)
+{
+	enabled = 1;
+}
+
 /* for debugging */
 void
 _PyGC_Dump(PyGC_Head *g)
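From Python code, the effect the patch exposes at the C level can already be reproduced with the gc module directly. A rough sketch of the pickle-loading speedup (the 2x figure is workload-dependent; this payload is an arbitrary stand-in):

```python
import gc
import pickle
import time

# A large-ish pickle of many small container objects.
data = pickle.dumps([("key%d" % i, list(range(5))) for i in range(100000)])

def timed_loads(payload, disable_gc):
    # Python-level analogue of bracketing the load with
    # PyGC_Disable() / PyGC_Enable().
    if disable_gc:
        gc.disable()
    try:
        start = time.perf_counter()
        obj = pickle.loads(payload)
        return time.perf_counter() - start, obj
    finally:
        gc.enable()

t_on, a = timed_loads(data, disable_gc=False)
t_off, b = timed_loads(data, disable_gc=True)
print("gc enabled: %.3f s, gc disabled: %.3f s" % (t_on, t_off))
```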
Re: [Python-Dev] Module renaming and pickle mechanisms
On Sat, May 17, 2008 at 5:05 AM, M.-A. Lemburg [EMAIL PROTECTED] wrote: I'd like to bring a potential problem to attention that is caused by the recent module renaming approach: Object serialization protocols like e.g. pickle usually store the complete module path to the object class together with the object. Thanks for bringing this up. I was aware of the problem myself, but I hadn't yet worked out a good solution to it. It can also happen in storage setups where Python objects are stored using e.g. pickle, ZODB being a prominent example. As soon as a Python 2.6 application starts writing to such storages, Python 2.5 and lower versions will no longer be able to read back all the data. The opposite problem exists for Python 3.0, too. Pickle streams written by Python 2.x applications will not be readable by Python 3.0. And, one solution to this is to use Python 2.6 to regenerate pickle stream. Another solution would be to write a 2to3 pickle converter using the pickletools module. It is surely not the most elegant or robust solution, but I could work. Now, I think there's a way to solve this puzzle: Instead of renaming the modules (e.g. Queue - queue), we leave the code in the existing modules and packages and instead add the new module names and package structure with pointers and redirects to the existing 2.5 modules. This would certainly work for simple modules, but what about packages? For packages, you can't use the ``sys.modules[__name__] = Queue`` to preserve module identity. Therefore, pickle will use the new package name when writing its streams. So, we are back to the same problem again. A possible solution could be writing a compatibility layer for the Pickler class, which would map new module names to their old at runtime. Again, this is neither an elegant, nor robust, solution, but it should work in most cases. 
-- Alexandre
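On the unpickling side, the compatibility layer sketched in the message above could look roughly like this (the mapping and class name are illustrative; Python 3's own Unpickler later grew a built-in fix_imports mechanism that does something similar):

```python
import io
import pickle

# Illustrative mapping from old (Python 2.x) module names to new homes.
RENAMED_MODULES = {"Queue": "queue", "copy_reg": "copyreg"}

class CompatUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Redirect globals that were pickled under an old module name.
        module = RENAMED_MODULES.get(module, module)
        return super().find_class(module, name)

# A protocol-0 pickle of the Queue.Queue class, as Python 2 wrote it.
old_stream = b"cQueue\nQueue\n."
cls = CompatUnpickler(io.BytesIO(old_stream)).load()
print(cls)
```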
Re: [Python-Dev] Module renaming and pickle mechanisms
Errata: On Sat, May 17, 2008 at 10:59 AM, Alexandre Vassalotti [EMAIL PROTECTED] wrote: And, one solution to this is to use Python 2.6 to regenerate pickle stream. ... to regenerate *the* pickle *streams*. It is surely not the most elegant or robust solution, but I could work. ... but *it* could work. This would certainly work for simple modules, but what about packages? For packages, you can't use the ``sys.modules[__name__] = Queue`` to preserve module identity. ... you can't use the ``sys.modules[__name__] = Queue`` *trick* to preserve module identity. A possible solution could be writing a compatibility layer for the ... could be *to write* a compatibility layer... I guess I should start proofreading my emails before sending them, not after...

-- Alexandre
Re: [Python-Dev] Symbolic errno values in error messages
On Fri, May 16, 2008 at 10:52 AM, Yannick Gingras [EMAIL PROTECTED] wrote: Alexander Belopolsky [EMAIL PROTECTED] writes:

>>> try:
...     open('/')
... except Exception, e:
...     pass
...
>>> print e
[Errno 21] Is a directory

So now I am not sure what OP is proposing. Do you want to replace 21 with EISDIR in the above? Yes, that's what I had in mind.

Then, check out EnvironmentError_str in Objects/exceptions.c. You should be able to import the errno module and fetch its errorcode dictionary.

-- Alexandre
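The errorcode dictionary mentioned above maps numeric errno values back to their symbolic names, which is essentially all EnvironmentError_str would need:

```python
import errno
import os

def format_errno(num):
    # Replace the bare number with its symbolic name, as proposed;
    # fall back to the number for unknown values.
    sym = errno.errorcode.get(num, str(num))
    return "[Errno %s] %s" % (sym, os.strerror(num))

print(format_errno(errno.EISDIR))
```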
Re: [Python-Dev] Distutils configparser rename
On Thu, May 15, 2008 at 6:49 PM, Nick Coghlan [EMAIL PROTECTED] wrote: Since it would be nice for the standard library to not emit any warnings with the -3 flag, perhaps distutils should at least be trying the new name first, and only falling back to the old name on an ImportError (assuming we do decide we want to be able to run the 2.6 distutils on older versions of Python).

Well, that is a good idea. And that will silence the Windows buildbots while other developers find out how to add lib-old/ to the sys.path.

-- Alexandre
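Nick's suggestion is the familiar dual-name import idiom; for the ConfigParser/configparser case it would look something like this (shown with the lowercase name first, as an assumption about how the fallback would be spelled):

```python
# Try the new lowercase name first; fall back to the legacy name
# on interpreters that only provide the old spelling.
try:
    import configparser
except ImportError:
    import ConfigParser as configparser  # legacy Python 2 name

parser = configparser.ConfigParser()
```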
Re: [Python-Dev] heads up on svn.python.org ssh keys - debian/ubuntu users may need new ones
On Tue, May 13, 2008 at 7:12 PM, Martin v. Löwis [EMAIL PROTECTED] wrote: If you generated your python subversion ssh key during this time on a machine fitting the description above, please consider replacing your keys. apt-get update ; apt-get upgrade on debian will provide you with a ssh-vulnkey program that can be used to test if your ssh keys are valid or not. I'll ping all committers for which ssh-vulnkey reports COMPROMISED. I personally don't think the threat is severe - unless people also published their public SSH keys somewhere, there is little chance that somebody can break in by just guessing them remotely - you still need to try a lot of combinations for user names and passwords, plus with subversion, we'll easily recognize doubtful checkins (as we do even if the committer is legitimate :-).

Well, I had a break-in on my public server (peadrop.com) this week, which had a copy of my ssh pubkey. I don't know if the attacker took a look at my pubkeys, but I won't take any chances. So, I definitely have to change my key, ASAP.

-- Alexandre
Re: [Python-Dev] Buildbots have trouble checking out the repository due to recent changes.
On Mon, May 12, 2008 at 3:40 AM, Martin v. Löwis [EMAIL PROTECTED] wrote: When I rename a module I use svn copy, since svn remove doesn't pick up changes made to the deleted file. For example, here is what I did for PixMapWrapper: You want to make changes to the deleted file? Why?

The idea was to replace the original module file with its stub. However, the svn copy and edit process isn't the cause of the problems. It is the fact that 2 files existed in the same directory differing only by a case-change. Anyway, all the buildbots seem okay now.

-- Alexandre
Re: [Python-Dev] Buildbots have trouble checking out the repository due to recent changes.
On Mon, May 12, 2008 at 3:49 AM, Martin v. Löwis [EMAIL PROTECTED] wrote: Well, I guess I really messed up on that one. So, do you have any idea on how to revert the changes? If the changes were in a single revision N, do

svn merge -rN:N-1 .
svn commit -m "revert rN"

If they span several subsequent revisions, use N-k instead. If they span several revisions with intermediate revisions that you don't want to revert, try multiple merge commands before a single commit; if that fails, revert and commit each range of changes separately.

Yes. That is exactly what I did to revert the changes.

P.S. If you want to get the buildbots back in shape (in case they aren't), build a non-existing branch through the UI (which will cause a recursive removal of the entire checkout), then either wait for the next regular commit, or force a build of the respective branch (branches/py3k or trunk). On Windows, if there is still a python.exe process holding onto its binary, that fails, and we need support from the slave admin.

Thanks for the tip. Now, I just hope that I will never have to use it. ;-)

-- Alexandre
Re: [Python-Dev] Buildbots have trouble checking out the repository due to recent changes.
On Mon, May 12, 2008 at 7:18 AM, Paul Moore [EMAIL PROTECTED] wrote: Revision 63129 is not valid on case folding filesystems. In particular, this horribly breaks using hg-svn to make a local mirror of the Python repository:

\Apps\HGsvn\hgimportsvn.exe -r 63120 http://svn.python.org/projects/python/trunk foo
cd foo
\apps\hgsvn\hgpullsvn
hg log Lib\socketserver.py
changeset: 2:e8856fdf9300
branch: trunk
tag: svn.63129
user: alexandre.vassalotti
date: Mon May 12 02:37:10 2008 +0100
summary: [svn] Renamed SocketServer to 'socketserver'.
hg up -r2
abort: case-folding collision between Lib/socketserver.py and Lib/SocketServer.py
hg up -rtip
abort: case-folding collision between Lib/socketserver.py and Lib/SocketServer.py

The hg repository is now totally broken.

Which version of mercurial are you using? I know that versions prior to 1.0 had some bugs with handling case-changes on case-insensitive filesystems.

-- Alexandre
Re: [Python-Dev] Buildbots have trouble checking out the repository due to recent changes.
On Mon, May 12, 2008 at 9:24 AM, Martin v. Löwis [EMAIL PROTECTED] wrote: The idea was to replace the original module file with its stub. However, the svn copy and edit process isn't the cause of the problems. It is the fact that 2 files existed in the same directory differing only by a case-change. I still don't understand. You wanted to replace the file with a stub, and then delete it? Why not just delete it (or use svn mv in the first place)?

No. That is exactly what I wanted to avoid by using svn copy instead of svn move. svn move marks the original file for removal, which makes it impossible to modify the original file in the same commit. Anyway, Brett updated the PEP with a renaming procedure that avoids this problem completely.

-- Alexandre
Re: [Python-Dev] How best to handle the docs for a renamed module?
On Mon, May 12, 2008 at 6:10 AM, Georg Brandl [EMAIL PROTECTED] wrote: I've now updated docs for the Queue, SocketServer and copy_reg modules in the trunk.

Thank you, Georg, for updating the docs!

-- Alexandre
Re: [Python-Dev] Trickery with moving urllib
On Sat, May 10, 2008 at 11:43 PM, [EMAIL PROTECTED] wrote:

Brett> There is going to be an issue with the current proposal for
Brett> keeping around urllib. Since the package is to be named the same
Brett> thing as the module

Is this the only module morphing into a package of the same name?

No, it is not. The dbm package will have the same issue.

-- Alexandre
Re: [Python-Dev] Trickery with moving urllib
On Sat, May 10, 2008 at 11:38 PM, Brett Cannon [EMAIL PROTECTED] wrote: I see three solutions for dealing with this. 1. Have stubs for the entire urllib API in urllib.__init__ that raise a DeprecationWarning either specifying the new name or saying the function/class is deprecated. 2. Rename urllib to urllib.fetch or urllib.old_request to get people to move over to urllib.request (aka urllib2) at some point.

I am probably missing something, because I don't see how this solution would solve the problem. The warning in urllib.__init__ will still be issued when people import urllib.fetch (or urllib.old_request).

-- Alexandre
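Option 1's stubs would follow the standard warn-and-forward pattern; a generic sketch (function names are placeholders, not the actual urllib API):

```python
import warnings

def _deprecated(old_name, new_func):
    # Wrap new_func so calls through the old name emit a DeprecationWarning.
    def wrapper(*args, **kwargs):
        warnings.warn("%s() is deprecated; use %s() instead"
                      % (old_name, new_func.__name__),
                      DeprecationWarning, stacklevel=2)
        return new_func(*args, **kwargs)
    return wrapper

def fetch(url):
    # Stand-in for the real implementation living at the new location.
    return "fetched:" + url

urlopen = _deprecated("urlopen", fetch)  # old name kept as a warning stub
```

Calling urlopen(...) still works, but each call emits the deprecation warning pointing at the new name.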
[Python-Dev] Buildbots have trouble checking out the repository due to recent changes.
Hello,

I have been working on the module renaming for PEP-3108, and I just noticed that some buildbots are throwing errors while updating their checkouts. It seems the method I use for renaming modules hits a subversion bug on certain platforms. The error thrown looks like this:

...
svn: In directory 'build/Lib/plat-mac'
svn: Can't move source to dest
svn: Can't move 'build/Lib/plat-mac/.svn/tmp/prop-base/pixmapwrapper.py.svn-base' to 'build/Lib/plat-mac/.svn/prop-base/pixmapwrapper.py.svn-base': No such file or directory
program finished with exit code 1

(http://www.python.org/dev/buildbot/all/x86 osx.5 trunk/builds/201/step-svn/0)

When I rename a module I use svn copy, since svn remove doesn't pick up changes made to the deleted file. For example, here is what I did for PixMapWrapper:

svn copy ./Lib/plat-mac/PixMapWrapper.py ./Lib/plat-mac/pixmapwrapper.py
edit ./Lib/plat-mac/PixMapWrapper.py
svn commit

It seems that I could avoid this error by using cp instead of svn copy (which I did use for renaming copy_reg). However, I am not sure whether this method preserves the full history of the file. So, what should I do to fix the failing buildbots?

-- Alexandre
Re: [Python-Dev] Buildbots have trouble checking out the repository due to recent changes.
On Sun, May 11, 2008 at 5:44 PM, Paul Moore [EMAIL PROTECTED] wrote: 2008/5/11 Alexandre Vassalotti [EMAIL PROTECTED]: When I rename a module I use svn copy, since svn remove doesn't pick up changes made to the deleted file. For example, here is what I did for PixMapWrapper:

svn copy ./Lib/plat-mac/PixMapWrapper.py ./Lib/plat-mac/pixmapwrapper.py
edit ./Lib/plat-mac/PixMapWrapper.py
svn commit

That seems a very odd usage. You're renaming, not copying. Why aren't you using svn rename (svn move)? I can well imagine this causing serious confusion.

I wrote: When I rename a module I use svn copy, since svn remove doesn't pick up changes made to the deleted file. For example, here is what I did for PixMapWrapper: Oops, I meant svn rename when I said svn remove. As I said, if I use svn rename I cannot make changes to the file being renamed.

Please be very careful here - if you introduce revisions which contain multiple files with names that differ only in case, you're going to really mess up history (and probably the only clean way to fix this will be to actually go back and edit the history).

Oh, you are right. I totally forgot about case-insensitive filesystems. This is really going to make such case-change renamings nasty.

-- Alexandre
Re: [Python-Dev] Buildbots have trouble checking out the repository due to recent changes.
On Sun, May 11, 2008 at 6:31 PM, Brett Cannon [EMAIL PROTECTED] wrote:
> The PEP specifies the lib-old directory to hold the old case name so
> that the svn rename won't lead to two files in the same directory. I was
> hoping that creating the stub in lib-old would allow a simple ``svn
> rename`` for the original module on a case-sensitive file-system and the
> case-insensitive file-systems would just be able to deal with it. Is
> that just not going to work?
>
> Oh, and I am really sorry, Alexandre, but the PixMapWrapper rename
> should have been taken out of the PEP as the entire Mac directory is
> going away, so the rename is kind of pointless since the module is going
> to be deleted.

Well, I guess I really messed up on that one. So, do you have any idea how to revert the changes?

-- Alexandre
Re: [Python-Dev] Buildbots have trouble checking out the repository due to recent changes.
On Sun, May 11, 2008 at 5:29 PM, Alexandre Vassalotti [EMAIL PROTECTED] wrote:
> Hello, I have been working on the module renaming for PEP 3108, and I
> just noticed that some buildbots are throwing errors while updating
> their checkouts. It seems the method I use for renaming modules hits a
> Subversion bug on certain platforms. The error thrown looks like this:
> [SNIP]
> So, what should I do to fix the failing buildbots?

I reverted all the problematic changes and the buildbots are green again. Thank you all for your support!

-- Alexandre
Re: [Python-Dev] r62778 - in python/branches/py3k: Lib/io.py Lib/test/test_StringIO.py Lib/test/test_io.py Lib/test/test_largefile.py Lib/test/test_memoryio.py Lib/test/test_mimetools.py Modules/_byte
On Tue, May 6, 2008 at 6:52 PM, Christian Heimes [EMAIL PROTECTED] wrote:
> alexandre.vassalotti schrieb:
>> Author: alexandre.vassalotti
>> Date: Tue May 6 21:48:38 2008
>> New Revision: 62778
>>
>> Log:
>> Added fast alternate io.BytesIO implementation and its test suite.
>> Removed old test suite for StringIO.
>> Modified truncate() to imply a seek to given argument value.
>
> Thanks for your great work! But what about the trunk? :] Can you port
> your code to the trunk before the alpha gets out?

I have a backported version of my code for the trunk. Should I commit it, or should I post it to the issue tracker and wait for a proper review?

-- Alexandre
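[A side note for readers finding this thread later: the "truncate() implies a seek" behavior mentioned in that commit log did not stick. In the io module as it ships in current CPython, truncate() resizes the buffer without moving the stream position. A quick sketch against today's io.BytesIO; the variable names are illustrative:]

```python
import io

buf = io.BytesIO(b"hello world")
buf.seek(0, io.SEEK_END)   # move to the end: position 11
buf.truncate(5)            # shrink the buffer to b"hello"

# truncate() resized the data but left the position where it was:
print(buf.tell())          # 11
print(buf.getvalue())      # b'hello'
```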
Re: [Python-Dev] PEP Proposal: Revised slice objects lists use slice objects as indexes
On Sun, Mar 9, 2008 at 7:21 PM, Forrest Voight [EMAIL PROTECTED] wrote:
> This would simplify the handling of list slices. Slice objects that are
> produced in a list index area would be different, and optionally the
> syntax for slices in list indexes would be expanded to work everywhere.
> Instead of being containers for the start, end, and step numbers, they
> would be generators, similar to xranges.

I am not sure what you are trying to propose here. The slice object isn't special; it's just a regular built-in type.

>>> slice(1, 4)
slice(1, 4, None)
>>> [1, 2, 3, 4, 5, 6][slice(1, 4)]
[2, 3, 4]

I don't see how introducing new syntax would simplify indexing.

> Lists would accept these slice objects as indexes, and would also accept
> any other list or generator.

Why should lists accept a list or a generator as an index? What is the use case you have in mind?

> Optionally, the 1:2 syntax would create a slice object outside of list
> index areas.

Again, I don't see how this could be useful...

> list(1:5)
> [1, 2, 3, 4]
> list(1:5:2)
> [1, 3]

list(range(1, 5, 2))?

> range(30)[1:5 + 15:17]
> [1, 2, 3, 4, 15, 16]

This is confusing, IMHO, and doesn't provide any advantage over:

s = list(range(30))
s[1:5] + s[15:17]

If you really needed it, you could define a custom class with a fancy __getitem__:

>>> class A:
...     def __getitem__(self, x):
...         return x
...
>>> A()[1:3, 2:5]
(slice(1, 3, None), slice(2, 5, None))

P.S. You should consider using the python-ideas (http://mail.python.org/mailman/listinfo/python-ideas) mailing list, instead of python-dev, for posting suggestions.

Cheers,
-- Alexandre
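[For what it's worth, the lazy, generator-like slicing the proposal asks for already exists as a library function: itertools.islice slices any iterable without materializing it. A small illustration, not part of the original thread:]

```python
from itertools import islice

# A generator far too large to ever materialize as a list:
squares = (n * n for n in range(10**12))

# Lazy equivalent of squares[1:5] -- only a handful of items are produced:
print(list(islice(squares, 1, 5)))   # [1, 4, 9, 16]

# A step works too, mirroring the proposed 1:5:2 syntax:
print(list(islice((n * n for n in range(10)), 1, 5, 2)))   # [1, 9]
```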
Re: [Python-Dev] Any Emacs tips for core developers?
On Feb 4, 2008 7:47 PM, [EMAIL PROTECTED] wrote:
> I should have asked this before, but what's so special about core
> (Python?) development that the tools should be different than for
> non-core development?
>
> Brett> Usually the core has keywords, built-ins, etc. that have not been
> Brett> pushed to the release versions for various editors.
>
> Ah, okay. Barry mentioned something about adjusting the python-mode
> syntax tables to include Python 3.x stuff, though patches are always
> welcome. wink
>
> Brett> Plus coding guidelines might be different from PEPs 7 and 8
> Brett> compared to what an editor is set to do by default.
>
> That might be a bit more challenging. I was thinking today that it would
> be kind of nice to have a set of predefined settings for Python's new C
> style (someone mentioned producing that). Should that go in the C/C++
> mode or be delivered somehow else?

It's fairly trivial to adjust cc-mode to conform to the PEP 7 C coding conventions:

(defmacro def-styled-c-mode (name style &rest body)
  "Define styled C modes."
  `(defun ,name ()
     (interactive)
     (c-mode)
     (c-set-style ,style)
     ,@body))

(def-styled-c-mode python-c-mode "python"
  (setq indent-tabs-mode t
        tab-width 8
        c-basic-offset 8))

(def-styled-c-mode py3k-c-mode "python"
  (setq indent-tabs-mode nil
        tab-width 4
        c-basic-offset 4))

-- Alexandre
Re: [Python-Dev] [Python-3000] inst_persistent_id
Oh, you are right. I thought that save_inst() used inst_persistent_id, but that isn't the case. Now, I have checked more thoroughly and found the relevant piece of code:

if (!pers_save && self->inst_pers_func) {
    if ((tmp = save_pers(self, args, self->inst_pers_func)) != 0) {
        res = tmp;
        goto finally;
    }
}

which is indeed called only when the object is not supported by pickle. I guess my original argument doesn't hold anymore, so I don't have anything against supporting this feature officially. Thanks for correcting me!

-- Alexandre

On Jan 14, 2008 12:59 PM, Armin Rigo [EMAIL PROTECTED] wrote:
> Hi,
>
> On Sat, Jan 12, 2008 at 07:33:38PM -0500, Alexandre Vassalotti wrote:
>> Well, in Python 3K, inst_persistent_id() won't be usable, since
>> PyInstance_Type was removed.
>
> Looking at the code, inst_persistent_id() is just a confusing name. It
> has got nothing to do with PyInstance_Type; it's called for any object
> type that cPickle.c doesn't know how to handle. In fact, it seems that
> cPickle.c never calls inst_persistent_id() for objects of type
> PyInstance_Type...
>
> A bientot,
> Armin.
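[For readers hitting this thread later: the officially supported version of this hook lives on in the pickle module as persistent_id/persistent_load. A minimal sketch of the pattern; the DBRef class and the ("dbref", key) ID format are made up for illustration:]

```python
import io
import pickle

class DBRef:
    """A stand-in for an object that lives outside the pickle stream."""
    def __init__(self, key):
        self.key = key

class MyPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Return a persistent ID for objects that should be externalized;
        # returning None means "pickle this object as usual".
        if isinstance(obj, DBRef):
            return ("dbref", obj.key)
        return None

class MyUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        # Rebuild the externalized object from its persistent ID.
        tag, key = pid
        assert tag == "dbref"
        return DBRef(key)

buf = io.BytesIO()
MyPickler(buf).dump({"ref": DBRef(42)})
buf.seek(0)
obj = MyUnpickler(buf).load()
print(obj["ref"].key)   # 42
```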
Re: [Python-Dev] PEP: per user site-packages directory
I can't comment on the implementation details, but +1 for the idea. I think this feature will be very useful in a shared hosting environment.

-- Alexandre

On Jan 11, 2008 6:27 PM, Christian Heimes [EMAIL PROTECTED] wrote:
> PEP: XXX
> Title: Per user site-packages directory
> Version: $Revision$
> Last-Modified: $Date$
> Author: Christian Heimes christian(at)cheimes(dot)de
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 11-Jan-2008
> Python-Version: 2.6, 3.0
> Post-History:
[Python-Dev] Subversion forbidden error while committing to trunk
Hi,

I tried a few times to commit a patch (for issue #1530) to the trunk, but I always get this error:

alex:python% svn commit Lib/doctest.py --file svn-commit.tmp
svn: Commit failed (details follow):
svn: MKACTIVITY of '/projects/!svn/act/53683b5b-99d8-497e-bc98-6d07f9401f50': 403 Forbidden (http://svn.python.org)

I first thought that was related to the Py3k freeze. However, I tried again a few minutes ago and I still got this error. Is it possible that my commit rights are limited to the py3k branches? Or is this a genuine error?

Thanks,
-- Alexandre
Re: [Python-Dev] Subversion forbidden error while committing to trunk
Thanks Guido. I just found the problem: my checkout of the trunk was the read-only one (i.e., over http).

-- Alexandre

On Dec 7, 2007 11:40 PM, Guido van Rossum [EMAIL PROTECTED] wrote:
> On Dec 7, 2007 8:35 PM, Alexandre Vassalotti [EMAIL PROTECTED] wrote:
>> I tried a few times to commit a patch (for issue #1530) to the trunk,
>> but I always get this error:
>>
>> alex:python% svn commit Lib/doctest.py --file svn-commit.tmp
>> svn: Commit failed (details follow):
>> svn: MKACTIVITY of '/projects/!svn/act/53683b5b-99d8-497e-bc98-6d07f9401f50': 403 Forbidden (http://svn.python.org)
>>
>> I first thought that was related to the Py3k freeze. However, I tried
>> again a few minutes ago and I still got this error. Is it possible that
>> my commit rights are limited to the py3k branches? Or is this a genuine
>> error?
>
> I just successfully committed something to the trunk, so the server is
> not screwed. I'm not aware of an access control mechanism that would
> prevent anyone from checking in to the trunk while allowing them to
> check in to a branch. I suspect your workspace may be corrupt.
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
Re: [Python-Dev] [poll] New name for __builtins__
I just want to let you all know that the name issue was settled and committed to the py3k branch a few days ago. The decision was to simply rename the module __builtin__ to builtins.

-- Alexandre

On Nov 29, 2007 6:15 AM, Nick Coghlan [EMAIL PROTECTED] wrote:
> Given that the *effect* of __builtins__ is to make the contents of the
> __builtin__ module implicitly available in every module's global
> namespace, why not call it __implicit__? I really don't like all of
> these __root__ inspired names, because __builtin__ isn't the root of any
> Python hierarchy that I know of.
>
> >>> import sys
> >>> import __builtin__
> >>> __builtin__.sys
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> AttributeError: 'module' object has no attribute 'sys'
>
> The builtin namespace doesn't know anything about other modules, the
> current module's global namespace, the current function's local
> variables, or much of anything really. To me, the concept of root in a
> computing sense implies a node from which you can reach every other node
> - from the root of the filesystem you can get to every other directory,
> as the root user you can access any other account, etc. To those that
> like these names, what do you consider __root__ to be the root of?
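[Under the new name, the module is imported in Python 3 as builtins. One practical use is reaching a built-in that a local name has shadowed; the shadowing below is contrived, purely for illustration:]

```python
import builtins

len = lambda seq: "shadowed"        # shadow the built-in at module scope

print(len([1, 2, 3]))               # shadowed
print(builtins.len([1, 2, 3]))      # 3 -- the real built-in, via the module
```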
Re: [Python-Dev] [poll] New name for __builtins__
Oh, sorry for the noise. I thought people were still arguing about the name issue, but it was in fact 5-day-late emails that I am still receiving. (Gmail seems to have delivery issues lately...)

-- Alexandre

On Dec 4, 2007 12:49 PM, Alexandre Vassalotti [EMAIL PROTECTED] wrote:
> I just want to let you all know that the name issue was settled and
> committed to the py3k branch a few days ago. It was chosen to simply
> rename the module __builtin__ to builtins.
Re: [Python-Dev] Extending Python 3000
PyObject_HEAD was changed in Py3k to make it conform to C's strict aliasing rules (see PEP 3123 [1]). In your code, you need to change:

static PyTypeObject MPFType = {
    PyObject_HEAD_INIT(NULL)
    0,                     /*ob_size*/
    ...
};

to this:

static PyTypeObject MPFType = {
    PyVarObject_HEAD_INIT(NULL, 0)
    ...
};

Good luck,
-- Alexandre

[1]: http://www.python.org/dev/peps/pep-3123/
Re: [Python-Dev] Avoiding cascading test failures
On 8/28/07, Collin Winter [EMAIL PROTECTED] wrote:
> On 8/22/07, Alexandre Vassalotti [EMAIL PROTECTED] wrote:
>> When I was fixing tests failing in the py3k branch, I found the number
>> of duplicate failures annoying. Often, a single bug, in an important
>> method or function, caused a large number of test cases to fail. So, I
>> thought of a simple mechanism for avoiding such cascading failures. My
>> solution is to add a notion of dependency to test cases. A typical
>> usage would look like this:
>>
>> @depends('test_getvalue')
>> def test_writelines(self):
>>     ...
>>     memio.writelines([buf] * 100)
>>     self.assertEqual(memio.getvalue(), buf * 100)
>>     ...
>
> This definitely seems like a neat idea. Some thoughts:
>
> * How do you deal with dependencies that cross test modules? Say test A
> depends on test B, how do we know whether it's worthwhile to run A if B
> hasn't been run yet? It looks like you run the test anyway (I haven't
> studied the code closely), but that doesn't seem ideal.

I am not sure what you mean by test modules. Do you mean a module in the Python sense, or a test-case class?

> * This might be implemented in the wrong place. For example, the [x for
> x in dir(self) if x.startswith('test')] you do is most certainly
> better-placed in a custom TestLoader implementation.

That certainly is a good suggestion. I am not sure yet how I will implement my idea in the unittest module. However, I am pretty sure that it will be quite different from my prototype.

> But despite that, I think it's a cool idea and worth pursuing. Could you
> set up a branch (probably of py3k) so we can see how this plays out in
> the large?

Sure. I need to finish merging pickle and cPickle for Py3k before tackling this project, though.

-- Alexandre
Re: [Python-Dev] Order of operations
On 8/29/07, Martin v. Löwis [EMAIL PROTECTED] wrote:
> Scott Dial schrieb:
>> Martin v. Löwis wrote:
>>> Do you know why?
>> Thanks! I'm not sure why precedence was defined that way, though.
> Because it is consistent with C's precedence rules.
>
> Maybe I'm missing something - how exactly is the exponentiation operator
> spelled in C?

C doesn't have an exponentiation operator. You use the pow() function instead:

#include <math.h>

double pow(double x, double y);

-- Alexandre
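[Since the thread is about precedence: Python's own ** operator binds tighter than unary minus and is right-associative, which is the behavior under discussion. A quick check:]

```python
print(-2 ** 2)      # -4, parsed as -(2 ** 2): ** binds tighter than unary minus
print((-2) ** 2)    # 4
print(2 ** 3 ** 2)  # 512, parsed as 2 ** (3 ** 2): ** is right-associative
```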
Re: [Python-Dev] Avoiding cascading test failures
On 8/25/07, Gregory P. Smith [EMAIL PROTECTED] wrote:
> I like this idea.

Yay! Now, I ain't the only one. ;)

> Be sure to have an option to ignore dependancies and run all tests.

Yes, I planned to add such an option.

> Also when skipping tests because a dependancy failed have unittest print
> out an indication that a test was skipped due to a dependancy rather
> than silently running fewer tests. Otherwise it could be deceptive and
> appear that only one test was affected.

However, that was never planned. I added the ignore_dependencies option. Also, I fixed the sub-optimal dependency resolution algorithm that was in my original example implementation.

-- Alexandre

--- dep.py.old	2007-08-25 19:54:27.0 -0400
+++ dep.py	2007-08-25 20:02:55.0 -0400
@@ -2,8 +2,9 @@
 
 class CycleError(Exception):
     pass
 
+class TestGraph:
 
-class TestCase:
+    ignore_dependencies = False
 
     def __init__(self):
         self.graph = {}
@@ -19,16 +20,16 @@
         graph = self.graph
         toskip = set()
         msgs = []
-        while graph:
+        if self.ignore_dependencies:
+            for test in graph:
+                graph[test].clear()
         # find tests without any pending dependencies
-            source = [test for test, deps in graph.items() if not deps]
-            if not source:
-                raise CycleError
-            for testname in source:
+        queue = [test for test, deps in graph.items() if not deps]
+        while queue:
+            testname = queue.pop()
             if testname in toskip:
                 msgs.append("%s... skipped" % testname)
-                resolvedeps(graph, testname)
-                del graph[testname]
+                queue.extend(resolve(graph, testname))
                 continue
             test = getattr(self, testname)
             try:
@@ -42,8 +43,9 @@
             else:
                 msgs.append("%s... ok" % testname)
             finally:
-                resolvedeps(graph, testname)
-                del graph[testname]
+                queue.extend(resolve(graph, testname))
+        if graph:
+            raise CycleError
         for msg in sorted(msgs):
             print(msg)
@@ -60,10 +62,15 @@
             rdeps.update(getrevdeps(graph, x))
     return rdeps
 
-def resolvedeps(graph, testname):
+def resolve(graph, testname):
+    toqueue = []
     for test in graph:
         if testname in graph[test]:
             graph[test].remove(testname)
+            if not graph[test]:
+                toqueue.append(test)
+    del graph[testname]
+    return toqueue
 
 def depends(*args):
     def decorator(test):
@@ -75,7 +82,9 @@
     return decorator
 
-class MyTest(TestCase):
+class MyTest(TestGraph):
+
+    ignore_dependencies = True
 
     @depends('test_foo')
     def test_nah(self):
[Python-Dev] Avoiding cascading test failures
When I was fixing tests failing in the py3k branch, I found the number of duplicate failures annoying. Often, a single bug, in an important method or function, caused a large number of test cases to fail. So, I thought of a simple mechanism for avoiding such cascading failures. My solution is to add a notion of dependency to test cases. A typical usage would look like this:

@depends('test_getvalue')
def test_writelines(self):
    ...
    memio.writelines([buf] * 100)
    self.assertEqual(memio.getvalue(), buf * 100)
    ...

Here, running the test is pointless if test_getvalue fails. So by making test_writelines depend on the success of test_getvalue, we can ensure that the report won't be polluted with unnecessary failures. Also, I believe this feature will lead to more orthogonal tests, since it encourages the user to write smaller tests with fewer dependencies.

I wrote an example implementation (included below) as a proof of concept. If the idea gets enough support, I will implement it and add it to the unittest module.

-- Alexandre

class CycleError(Exception):
    pass


class TestCase:

    def __init__(self):
        self.graph = {}
        tests = [x for x in dir(self) if x.startswith('test')]
        for testname in tests:
            test = getattr(self, testname)
            if hasattr(test, 'deps'):
                self.graph[testname] = test.deps
            else:
                self.graph[testname] = set()

    def run(self):
        graph = self.graph
        toskip = set()
        msgs = []
        while graph:
            # find tests without any pending dependencies
            source = [test for test, deps in graph.items() if not deps]
            if not source:
                raise CycleError
            for testname in source:
                if testname in toskip:
                    msgs.append("%s... skipped" % testname)
                    resolvedeps(graph, testname)
                    del graph[testname]
                    continue
                test = getattr(self, testname)
                try:
                    test()
                except AssertionError:
                    toskip.update(getrevdeps(graph, testname))
                    msgs.append("%s... failed" % testname)
                except:
                    toskip.update(getrevdeps(graph, testname))
                    msgs.append("%s... error" % testname)
                else:
                    msgs.append("%s... ok" % testname)
                finally:
                    resolvedeps(graph, testname)
                    del graph[testname]
        for msg in sorted(msgs):
            print(msg)


def getrevdeps(graph, testname):
    """Return the reverse dependencies of a test"""
    rdeps = set()
    for x in graph:
        if testname in graph[x]:
            rdeps.add(x)
    if rdeps:
        # propagate dependencies recursively
        for x in rdeps.copy():
            rdeps.update(getrevdeps(graph, x))
    return rdeps


def resolvedeps(graph, testname):
    for test in graph:
        if testname in graph[test]:
            graph[test].remove(testname)


def depends(*args):
    def decorator(test):
        if hasattr(test, 'deps'):
            test.deps.update(args)
        else:
            test.deps = set(args)
        return test
    return decorator


class MyTest(TestCase):

    @depends('test_foo')
    def test_nah(self):
        pass

    @depends('test_bar', 'test_baz')
    def test_foo(self):
        pass

    @depends('test_tin')
    def test_bar(self):
        self.fail()

    def test_baz(self):
        self.error()

    def test_tin(self):
        pass

    def error(self):
        raise ValueError

    def fail(self):
        raise AssertionError


if __name__ == '__main__':
    t = MyTest()
    t.run()
Re: [Python-Dev] [Python-3000] Documentation switch imminent
On 8/17/07, Georg Brandl [EMAIL PROTECTED] wrote:
> Alexandre Vassalotti schrieb:
>> On 8/16/07, Neal Norwitz [EMAIL PROTECTED] wrote:
>>> On 8/15/07, Georg Brandl [EMAIL PROTECTED] wrote:
>>>> Okay, I made the switch. I tagged the state of both Python branches
>>>> before the switch as tags/py{26,3k}-before-rstdocs/.
>>>> http://docs.python.org/dev/
>>>> http://docs.python.org/dev/3.0/
>> Is it just me, or is the markup of the new docs quite heavy?
>
> Docutils markup tends to be a bit verbose, yes, but the index is not
> even generated by them.
>
>> alex% wget -q -O- http://docs.python.org/api/genindex.html | wc -c
>> 77868
>> alex% wget -q -O- http://docs.python.org/dev/3.0/genindex.html | wc -c
>> 918359
>
> The new index includes all documents (api, lib, ref, ...), so the ratio
> is more like 678000 : 95 (using 2.6 here), and the difference can be
> explained quite easily because (a) sphinx uses different anchor names
> (mailbox.Mailbox.__contains__ vs l2h-849) and the hrefs have to include
> subdirs like reference/.

Ah, I didn't notice that the index included all the documents. That explains the huge size increase. However, would it be possible to keep the indexes separated? I find what I want more quickly when the indexes are separated.

> I've now removed leading spaces in the index output, and the character
> count is down to 85.
>
>> Firefox, on my fairly recent machine, takes ~5 seconds rendering the
>> index of the new docs from disk, compared to a fraction of a second for
>> the old one.
>
> But you're right that rendering is slow there. It may be caused by the
> more complicated CSS... perhaps the index should be split up in several
> pages.

I disabled CSS support (with View->Page Style->No Style), but it didn't affect the initial rendering speed. However, scrolling was *much* faster without CSS.

-- Alexandre
Re: [Python-Dev] [Python-3000] Documentation switch imminent
On 8/16/07, Neal Norwitz [EMAIL PROTECTED] wrote:
> On 8/15/07, Georg Brandl [EMAIL PROTECTED] wrote:
>> Okay, I made the switch. I tagged the state of both Python branches
>> before the switch as tags/py{26,3k}-before-rstdocs/.
>> http://docs.python.org/dev/
>> http://docs.python.org/dev/3.0/

Is it just me, or is the markup of the new docs quite heavy?

alex% wget -q -O- http://docs.python.org/api/genindex.html | wc -c
77868
alex% wget -q -O- http://docs.python.org/dev/3.0/genindex.html | wc -c
918359

Firefox, on my fairly recent machine, takes ~5 seconds rendering the index of the new docs from disk, compared to a fraction of a second for the old one.

-- Alexandre
Re: [Python-Dev] cStringIO.StringIO() buffer behavior
On 8/6/07, Georg Brandl [EMAIL PROTECTED] wrote:
> Okay, I propose the following patch: [...]

I think your patch is complicated for nothing. It would be much more straightforward to use PyString_AsStringAndSize to encode the Unicode string with the default encoding. I think it would be necessary to port the fix to O_write and O_writelines.

-- Alexandre

Index: Modules/cStringIO.c
===================================================================
--- Modules/cStringIO.c (revision 56754)
+++ Modules/cStringIO.c (working copy)
@@ -665,8 +674,15 @@
 	char *buf;
 	Py_ssize_t size;
 
-	if (PyObject_AsCharBuffer(s, (const char **)&buf, &size) != 0)
-		return NULL;
+	/* Special case for unicode objects. */
+	if (PyUnicode_Check(s)) {
+		if (PyString_AsStringAndSize(s, &buf, &size) == -1)
+			return NULL;
+	}
+	else {
+		if (PyObject_AsReadBuffer(s, (const void **)&buf, &size) == -1)
+			return NULL;
+	}
 
 	self = PyObject_New(Iobject, &Itype);
 	if (!self)
 		return NULL;
Re: [Python-Dev] cStringIO.StringIO() buffer behavior
On 8/5/07, Georg Brandl [EMAIL PROTECTED] wrote:
> See bugs #1548891 and #1730114. In the former, it was reported that
> cStringIO works differently from StringIO when handling unicode strings;
> it used GetReadBuffer which returned the raw internal UCS-2 or UCS-4
> encoded string. I changed it to use GetCharBuffer, which converts to a
> string using the default encoding first. This fix was also in 2.5.1.
>
> The latter bug now complains that this excludes things like
> array.array()s from being used as an argument to cStringIO.StringIO(),
> which worked before with GetReadBuffer.
>
> What's the preferred solution here?

The best thing would be to add a special case for ascii-only unicode objects, and keep the old behavior. However, I believe this will be ugly, especially in O_write. So, it would perhaps be better to simply stop supporting unicode objects.

-- Alexandre
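[This str/unicode ambiguity is exactly what the Python 3 io module later removed: io.StringIO accepts only text and io.BytesIO only bytes, so no implicit encoding ever happens. A short illustration:]

```python
import io

text = io.StringIO()
text.write("héllo")                  # text buffers take str only
assert text.getvalue() == "héllo"

data = io.BytesIO()
data.write("héllo".encode("utf-8"))  # binary buffers take bytes only

try:
    data.write("héllo")              # mixing str into a bytes buffer...
except TypeError as exc:
    # ...raises TypeError instead of silently guessing an encoding
    print("rejected:", exc)
```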
Re: [Python-Dev] Py3k: 'range' fail
Yes, range() on the p3yk branch seems broken. However, this bug has been fixed in py3k-struni, the branch where most of the development for Python 3000 is taking place.

-- Alexandre

On 7/24/07, Lisandro Dalcin [EMAIL PROTECTED] wrote:
> I did a fresh checkout as below (is p3yk the right branch?)
>
> $ svn co http://svn.python.org/projects/python/branches/p3yk python-3k
>
> after building and installing, I get
>
> $ python3.0
> Python 3.0x (p3yk:56529, Jul 24 2007, 15:58:59)
> [GCC 4.1.1 20061011 (Red Hat 4.1.1-30)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> range(0,10,2)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> SystemError: NULL result without error in PyObject_Call