Re: [Python-Dev] Code highlighting in tracker
> Because tracker is ugly.

Is this an unbiased opinion? :)

Eugene
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] abstractmethod doesn't work in classes
Hello,

I've found that abstractmethod and similar decorators don't work in classes inherited from built-in types other than object. For example:

    >>> import abc
    >>> class MyBase(metaclass=abc.ABCMeta):
    ...     @abc.abstractmethod
    ...     def foo(): pass
    ...
    >>> MyBase()
    Traceback (most recent call last):
      File "<pyshell#8>", line 1, in <module>
        MyBase()
    TypeError: Can't instantiate abstract class MyBase with abstract methods foo

So far so good, but:

    >>> class MyList(list, MyBase): pass
    ...
    >>> MyList()
    []
    >>> MyList.__abstractmethods__
    frozenset({'foo'})

This is unexpected, since MyList still doesn't implement foo. Should this be considered a bug? I don't see this in the documentation.

The underlying reason is that __abstractmethods__ is checked in object_new, but built-in types typically call tp_alloc directly, thus skipping the check.

Eugene
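A minimal script version of the session above, for checking the behaviour on whatever interpreter is at hand. The helper try_instantiate is a name made up for this sketch; the script only reports what happens and does not assume that MyList() succeeds.

```python
import abc


class MyBase(metaclass=abc.ABCMeta):
    @abc.abstractmethod
    def foo(self):
        """Subclasses are expected to override this."""


class MyList(list, MyBase):
    # foo is deliberately left unimplemented.
    pass


def try_instantiate(cls):
    """Hypothetical helper: return (instance, None) or (None, TypeError)."""
    try:
        return cls(), None
    except TypeError as exc:
        return None, exc


base_obj, base_err = try_instantiate(MyBase)
list_obj, list_err = try_instantiate(MyList)

print("MyBase():", "rejected" if base_err else "created")
print("MyList():", "rejected" if list_err else "created")
print("MyList.__abstractmethods__ =", set(MyList.__abstractmethods__))
```

If the second line prints "created" while foo is still listed as abstract, the interpreter shows the behaviour reported in this message.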
Re: [Python-Dev] Policy for making changes to the AST
> Ok, so it sounds like ast is *not* limited to CPython? That makes it harder to justify changing it just so as to ease the compilation process in CPython (as opposed to adding new language features).

The changes above are not just for CPython, but to simplify processing of the AST in general, by reducing redundancy and separating syntax from semantics. It just happens that the current structure of the AST doesn't allow important cases of constant folding at all, so I had to make *some* changes.

However, if the goal is to preserve the current AST as much as possible, I can instead make a very simple, completely backward compatible change -- add one new node type that will never be present in an unoptimized AST. This is much less elegant and will add more cruft to CPython's code (rather than removing it like the current patch does), but it will work.

Eugene
Re: [Python-Dev] Policy for versions of system python
> To answer Eugene's question, there's no official policy but the comment at the top of Python/makeopcodetargets.py can indeed serve as a useful guideline. I wonder if we still have buildbots with 2.3 as the system Python, by the way.

Ok, I'll use 2.3 as my target. Thanks.

Eugene
[Python-Dev] Policy for versions of system python
Hello,

CPython source code currently contains a number of Python scripts (e.g. Python/makeopcodetargets.py, Objects/typeslots.py, Parser/asdl_c.py) that are used during the build of the Python interpreter itself. For this reason they are run with the system-installed Python. What is the policy regarding the range of Python versions that they should support?

I looked at some of the scripts and they seem to support both 2 and 3, going back to 2.4 or earlier. Python/makeopcodetargets.py says at the top:

    # This code should stay compatible with Python 2.3, at least while
    # some of the buildbots have Python 2.3 as their system Python.

Is this the official minimal version or do we have this spelled out more explicitly somewhere?

Eugene
[Python-Dev] Policy for making changes to the AST
Hello,

While working on a rewrite of the peephole optimizer that works on the AST (http://bugs.python.org/issue11549) I had to make a few changes to the structure of the AST. The changes are described in the issue. This raises the question -- what is the policy for making changes to the AST?

Documentation for the ast module does not warn about possible changes, but obviously changes can occur, for example, when new constructs are introduced. What about other changes? Is there a policy for what's acceptable and how this should be handled?

Assuming we do want to make changes (not just extensions for new language features), I see 3 options for handling them:

1. Do nothing. This will break code that currently uses the AST, but doesn't add any complexity to CPython.

2. Write a pure-Python compatibility layer. This will convert the AST between old and new formats, so that old code continues working. To do this:
   a) Introduce an ast.compile function (similar to ast.parse), which should be the recommended way of compiling to AST.
   b) Add an ast_version argument to ast.parse and ast.compile, defaulting to 1.
   c) Preserve old AST node classes and attributes in Python.
   d) If the ast_version specified is not the latest, ast.parse and ast.compile would convert from/to the latest version in Python using ast.NodeTransformer.
   This is not fully backward compatible, but allows all staging to be done in Python.

3. Full backward compatibility (with Python code). This means conversion is done in compile(). It can either call Python conversion code from the ast module, or actually implement conversion in C using AST visitors. Using my visitors generator this should not be very hard. Downsides here are a lot of C code and no clear separation of deprecated AST nodes (they will remain in Python.asdl). Otherwise it's similar to 2, with the ast_version argument added to compile() and ast.parse.

For 2 and 3 we can add a PendingDeprecationWarning when ast_version 1 is used.

In any case, C extensions that manipulate the AST will be broken, but 3 provides a simple way to update them -- insert calls to C conversion functions.

Eugene
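Option 2 can be sketched roughly as follows. Everything here is hypothetical: the ast_version argument, the version numbers, the wrapper names parse and compile_ast, and the "old" node class OldNum are invented for illustration and are not the actual changes from the issue. The sketch assumes Python 3.8 or later, where the parser produces ast.Constant nodes.

```python
import ast

LATEST_AST_VERSION = 2  # hypothetical numbering, for illustration only


class OldNum(ast.expr):
    """Hypothetical 'version 1' node class for numeric literals."""
    _fields = ("n",)


class _ToV1(ast.NodeTransformer):
    """Down-convert: latest tree -> pretend old format."""

    def visit_Constant(self, node):
        is_number = isinstance(node.value, (int, float))
        if is_number and not isinstance(node.value, bool):
            return ast.copy_location(OldNum(n=node.value), node)
        return node


class _FromV1(ast.NodeTransformer):
    """Up-convert: pretend old format -> latest tree."""

    def visit_OldNum(self, node):
        return ast.copy_location(ast.Constant(value=node.n), node)


def parse(source, filename="<unknown>", mode="exec", ast_version=1):
    """Parse, then convert to the requested (hypothetical) AST version."""
    tree = ast.parse(source, filename, mode)
    if ast_version != LATEST_AST_VERSION:
        tree = _ToV1().visit(tree)
    return tree


def compile_ast(tree, filename="<unknown>", mode="exec", ast_version=1):
    """Convert an old-format tree back to the latest format, then compile."""
    if ast_version != LATEST_AST_VERSION:
        tree = ast.fix_missing_locations(_FromV1().visit(tree))
    return compile(tree, filename, mode)


# Old-style client code sees OldNum nodes and can still compile the tree.
tree = parse("x = 1 + 2", ast_version=1)
old_nodes = [n for n in ast.walk(tree) if isinstance(n, OldNum)]
namespace = {}
exec(compile_ast(tree, ast_version=1), namespace)
print(len(old_nodes), namespace["x"])
```

Keeping both converters as ast.NodeTransformer subclasses means the old node classes live only in Python, which is the separation that option 2 aims for.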
Re: [Python-Dev] Policy for making changes to the AST
> However, I'm not sure we *can* do a general-purpose AST transformation that handles both new nodes and changes to existing nodes correctly for all applications.

As long as both versions contain the same information we can write a transformation that does a near-perfect job. E.g. for my changes I can write a converter that produces AST in almost the same form as the current one, the only change being the new 'docstring' attribute set to None. (That's for converting the AST before optimizations; after optimizations it can contain nodes that couldn't be represented before.) I believe it's similar for the Try change that Benjamin mentioned above.

Also, if written in Python, the conversion can at least serve as a template even if it doesn't work out of the box.

Eugene
Re: [Python-Dev] Policy for making changes to the AST
> If it's do-able, your option 2 is probably the way to go. Out of the box, it may just need to raise an exception if asked to down-convert code that uses new constructs that can't readily be expressed using the old AST (I'm specifically thinking of the challenge of converting PEP 380's yield-from).

I was talking only about changes in the AST for existing constructs. New language features are another dimension. For example, we can leave them even in old trees, so that they can be supported in existing code with minimal changes. Or we can throw, forcing everyone who wants to process them to catch up with all other AST changes.

I realized I overlooked one problem with supporting multiple versions of the AST. Functions from the ast module might need to know which AST version they've got. For example, ast.get_docstring will need to know whether docstring was populated or whether it needs to look in the body. This can be solved by attaching ast_version to affected nodes when converting.

Eugene
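The ast_version tagging idea could look something like the sketch below. The ast_version and docstring attributes, the version numbers, and the functions upgrade and get_docstring are all hypothetical; the real ast.get_docstring knows nothing about them.

```python
import ast


def upgrade(func_node):
    """Hypothetical converter: move the docstring out of the body into a
    `docstring` attribute and tag the node with ast_version = 2."""
    doc = ast.get_docstring(func_node, clean=False)
    if doc is not None:
        # Drop the docstring statement; keep the body non-empty.
        func_node.body = func_node.body[1:] or [ast.Pass()]
    func_node.docstring = doc
    func_node.ast_version = 2
    return func_node


def get_docstring(node):
    """Hypothetical version-aware variant of ast.get_docstring."""
    if getattr(node, "ast_version", 1) >= 2:
        return node.docstring
    # Untagged nodes are treated as version 1: look in the body.
    return ast.get_docstring(node, clean=False)


source = 'def f():\n    "doc"\n    return 1\n'
old_node = ast.parse(source).body[0]
new_node = upgrade(ast.parse(source).body[0])

print(get_docstring(old_node), get_docstring(new_node))
```

Defaulting untagged nodes to version 1 keeps trees built by existing code working without modification.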
Re: [Python-Dev] utf-8 encoding in checkins?
> I'm not disputing that, and I understand that my current choice of mail reader limits me. I was just asking if it would be possible (read: fairly easy) to only generate utf-8 when it was necessary.

Isn't UTF-8 itself the same as ASCII when no non-ASCII symbols are used?

Eugene
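A quick way to check the point about UTF-8 and ASCII; the sample strings are arbitrary.

```python
# Pure-ASCII text encodes to identical bytes under both codecs.
text = "Policy for versions of system python"
same_for_ascii = text.encode("ascii") == text.encode("utf-8")

# A non-ASCII character needs more than one byte in UTF-8 and cannot be
# encoded as ASCII at all.
non_ascii = "na\u00efve"
utf8_bytes = non_ascii.encode("utf-8")
try:
    non_ascii.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(same_for_ascii, len(utf8_bytes), ascii_ok)
```

So the two encodings differ only in the declared charset label when a message contains no non-ASCII characters.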
Re: [Python-Dev] Draft PEP and reference implementation of a Python launcher for Windows
> Out of curiosity, what is your objection to having the child process?

One of the problems is that the parent process will not be able to kill the launched script. Simply doing TerminateProcess will kill the launcher, leaving the interpreter running. This can be partially fixed with job objects, though.

Eugene
Re: [Python-Dev] The purpose of SETUP_LOOP, BREAK_LOOP, CONTINUE_LOOP
> I think you guys are forgetting about FOR_ITER, listcomps, and the like. That is, IIRC, the reason loops use the block stack is because they put things on the regular stack, that need to be cleared off the stack when the loop is exited (whether normally or via an exception).

Good point. However, for exit via exception you always unwind till the try block, so having a loop block in the way seems unneeded? While loops don't keep anything on the value stack, so it sounds like they can avoid SETUP_LOOP? Comprehensions don't use SETUP_LOOP already, as you can't break from them. It sounds like the only use case is a for loop with an explicit break. In this case break can be compiled into POP and JUMP, at the expense of bytecode size if there are multiple breaks.

Eugene
Re: [Python-Dev] Suggest reverting today's checkin (recursive constant folding in the peephole optimizer)
To give this a positive spin, here's a patch that implements constant folding on the AST (it does a few other optimizations on compiler data structures, so it replaces peephole over bytecode completely): http://bugs.python.org/issue11549

It passes make test, but of course more testing is needed. Comments are welcome.

Eugene
Re: [Python-Dev] The purpose of SETUP_LOOP, BREAK_LOOP, CONTINUE_LOOP
> There are also with blocks :-) (which use separate opcodes, although they are similar in principle to try/finally blocks)

IIUC they use a separate opcode, but the same block type (SETUP_FINALLY).

> There may be complications with nested try/finally blocks. You either need to generate separate bytecode for when the finally clause is entered following a given continue/break (so as to hardcode the jump offset at the end of the clause), or save the jump offsets somewhere on a stack for each finally clause to pop, IMO.

Right, I'm not suggesting to remove all blocks, only SETUP_LOOP blocks. Do you see the problem in that case?

Eugene
Re: [Python-Dev] Builtin open() too slow
Hi

What OS and what file system are you using? Many file systems (e.g. ext2/3fs) handle large directories very poorly. A quick way to check if this has anything to do with Python is to write a small C program that opens these files and time it.

Eugene

On Sat, Mar 12, 2011 at 10:13 AM, Lukas Lueg lukas.l...@googlemail.com wrote:
> Hi,
>
> i've a storage engine that stores a lot of files (e.g. 10.000) in one path. Running the code under cProfile, I found that with a total CPU-time of 1,118 seconds, 121 seconds are spent in 27.013 calls to open(). The number of calls is not the problem; however I find it *very* discomforting that Python spends about 2 minutes out of 18 minutes of cpu time just to get a file-handle after which it can spend some other time to read from them.
>
> May this be a problem with the way Python 2.7 gets filehandles from the OS or is it a problem with large directories itself?
>
> Best regards
> Lukas
Re: [Python-Dev] Python3 regret about deleting list.sort(cmp=...)
Can sort have an option (and/or try to figure it out itself) to calculate the key for every comparison instead of caching them? This will have the same memory requirements as with cmp, but doesn't require rewriting code if you decide to trade speed for memory. Will this be much slower than with cmp?

If going that route, sort can also cache a limited number of keys (instead of all or nothing), using, for example, an LRU cache with a fixed size.

Eugene
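The effect being asked for can be approximated today in pure Python, as in this rough sketch. The function names are invented for illustration, and this is not a proposal for how list.sort itself would do it: functools.cmp_to_key still allocates a small wrapper per element, and functools.lru_cache requires the items to be hashable.

```python
import functools


def sort_recomputing_keys(items, key):
    """Sort without storing one key per element: key() is called on both
    operands at every comparison."""
    def compare(a, b):
        ka, kb = key(a), key(b)
        return (ka > kb) - (ka < kb)
    return sorted(items, key=functools.cmp_to_key(compare))


def sort_with_bounded_key_cache(items, key, cache_size=128):
    """Same, but remember up to cache_size recently used keys (LRU)."""
    cached_key = functools.lru_cache(maxsize=cache_size)(key)
    return sort_recomputing_keys(items, cached_key)


calls = 0


def expensive_key(word):
    """Stand-in for a costly key function; counts how often it runs."""
    global calls
    calls += 1
    return word.lower()


words = ["pear", "Apple", "fig", "Banana", "cherry"]
result = sort_recomputing_keys(words, expensive_key)
calls_uncached = calls

calls = 0
result_cached = sort_with_bounded_key_cache(words, expensive_key, cache_size=2)
calls_cached = calls

print(result, calls_uncached, calls_cached)
```

The call counters give a rough feel for the speed side of the trade-off: without a cache the key function runs twice per comparison, and a small cache recovers part of that cost.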
[Python-Dev] The purpose of SETUP_LOOP, BREAK_LOOP, CONTINUE_LOOP
Hello,

What is the purpose of the SETUP_LOOP instruction? From a quick look it seems like it just pushes the size of the loop onto the block stack; that size is only used by the BREAK_LOOP instruction. BREAK_LOOP could just contain the target address directly, like CONTINUE_LOOP does. This would avoid SETUP_LOOP/POP_BLOCK overhead for all loops. Am I missing something? Does SETUP_LOOP serve any other purpose?

Similarly, it looks like BREAK_LOOP and CONTINUE_LOOP are just jumps that respect try/finally blocks (i.e. jumping out of try executes finally). Is there more semantics to them than this? If not, this can be simplified to:

1) If not in try/finally, simply generate a direct jump outside of the loop (break) or to the start of the loop (continue).
2) If in try/finally, generate a new instruction JUMP_FROM_TRY which replaces both BREAK_LOOP and CONTINUE_LOOP. It behaves the same way as CONTINUE_LOOP but without the restriction to only jump backwards (could reuse CONTINUE_LOOP, but the name would be misleading).

The continue statement is already handled this way, but break always uses BREAK_LOOP.

Any comments are appreciated.

Regards,
Eugene
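To see which of these instructions a given interpreter actually emits, the loop can be disassembled with the dis module, as in the sketch below. The function find_first_negative is just an example made up here. The opcode set is version-specific: CPython 3.8 removed SETUP_LOOP, BREAK_LOOP and CONTINUE_LOOP, so on recent versions the filtered list is expected to come out empty.

```python
import dis


def find_first_negative(values):
    """Example loop with an explicit break and an else clause."""
    for v in values:
        if v < 0:
            break
    else:
        v = None
    return v


# Collect opcode names and pick out the block-related ones discussed above.
opnames = [ins.opname for ins in dis.get_instructions(find_first_negative)]
block_ops = [name for name in opnames
             if name in ("SETUP_LOOP", "POP_BLOCK",
                         "BREAK_LOOP", "CONTINUE_LOOP")]

print("block-related opcodes:", block_ops)
dis.dis(find_first_negative)
```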
Re: [Python-Dev] Suggest reverting today's checkin (recursive constant folding in the peephole optimizer)
Experience shows that optimizations are always error prone, no matter what framework or internal representation you use. I don't think we should assume that simply rewriting all optimizations to work on the AST will make them bug free once and for all. On the contrary, I think such a rewrite will introduce some bugs of its own.

The remedy against this is well known. Instead of being afraid to touch the code we should add more tests and verifiers. If hand-written tests are not enough, a test generator that produces Python programs with different structures can be written. Such generators are used by many compiler writers. For verifiers, a function that checks that bytecode is sane (doesn't reference invalid names or consts, doesn't jump into the middle of an instruction, all joins have the same stack depth? etc.) and runs after the optimizer in debug builds can save a lot of time.

Eugene
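A very small test generator along these lines might look like the sketch below. It is only an illustration of the idea, restricted to integer constant expressions: it generates random expressions, evaluates them through the normal compile() path (where constant folding may apply), and compares against a reference evaluator that walks the unoptimized AST. All names are made up for this sketch, and it assumes Python 3.8 or later (ast.Constant nodes).

```python
import ast
import operator
import random

BINARY_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
              ast.Mult: operator.mul}
UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos,
             ast.Invert: operator.invert}


def random_expr(rng, depth):
    """Return source text for a random integer constant expression."""
    if depth == 0 or rng.random() < 0.3:
        return str(rng.randint(0, 9))
    if rng.random() < 0.3:
        return "(%s%s)" % (rng.choice("-+~"), random_expr(rng, depth - 1))
    return "(%s %s %s)" % (random_expr(rng, depth - 1),
                           rng.choice("+-*"),
                           random_expr(rng, depth - 1))


def reference_eval(node):
    """Evaluate the AST node by node, bypassing the compiler's optimizer."""
    if isinstance(node, ast.Expression):
        return reference_eval(node.body)
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.UnaryOp):
        return UNARY_OPS[type(node.op)](reference_eval(node.operand))
    if isinstance(node, ast.BinOp):
        return BINARY_OPS[type(node.op)](reference_eval(node.left),
                                         reference_eval(node.right))
    raise TypeError("unexpected node: %r" % node)


def check_folding(seed=0, count=200, depth=4):
    """Return a list of (source, expected, actual) for every mismatch."""
    rng = random.Random(seed)
    mismatches = []
    for _ in range(count):
        source = random_expr(rng, depth)
        expected = reference_eval(ast.parse(source, mode="eval"))
        actual = eval(compile(source, "<generated>", "eval"))
        if actual != expected:
            mismatches.append((source, expected, actual))
    return mismatches


print(check_folding())
```

A real generator would cover floats, strings, tuples and control flow as well; comparing floats would also need to distinguish 0.0 from -0.0, which plain == does not.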
Re: [Python-Dev] Suggest reverting today's checkin (recursive constant folding in the peephole optimizer)
>> Experience shows that optimizations are always error prone, no matter what framework or internal representation you use.
>
> On that basis, I believe that we ought to declare peephole.c as being somewhat off-limits for further development (except for small adaptations if the underlying opcodes change meaning).

With this attitude, it would be more logical to remove the whole thing completely. Even if you don't touch the optimizer, any changes in the compiler can start producing code that exposes a previously hidden bug.

> The bright ideas for other enhancements aren't actually new ideas. They were considered at the outset and not included for a reason.

It would help if these were documented.

> It would be more correct

How can you be more correct than a tautological truth? :)

> 1) much of the semantic information available in the AST has been lost by the time the bytecode is generated

Not really. Many systems use stack based bytecode as an intermediate form and reconstruct different IRs from it. Examples are JVM, CLR and PyPy. The real problem with bytecode is that it's a representation optimized for storage and interpretation. Transforming it is harder, less efficient and more error-prone than transforming a tree structure. But transformations of trees are in no way automatically bug free.

> 2) peephole.c is having to disassemble the bytecode, reconstruct semantic relationships as well as it can, recognize transformation patterns and then regenerate bytecode.

v8 does peephole directly on x86 machine code. No kittens were harmed.

Now, I do agree with you that moving optimizations to the AST is a step forward. Moving them to a specialized IR in SSA form would be even better. If there was a framework for AST-based optimizations in the current code, I'd write my code to use it. However, I couldn't find it.

What I don't understand is why do 90% of the work and refuse to do the last 10%. Looking at the actual patches, I do not see a significant increase in complexity or code size -- all the complex cases are already handled by existing code. In some cases the code became cleaner and simpler.

> Not really. It is darned difficult to design tests to catch all the corner cases.

You either do that or do not write optimizations. No one said it should be easy.

> You've got wishful thinking if you think a handful of tests can catch errors in code that sophisticated.

Why limit yourself to a handful of tests? Python is widespread, there's *a lot* of code in Python. Unlike with libraries, any code you run tests the optimizer, so just run a lot of code. And, as I've said, write a test generator.

Cheers,
Eugene
Re: [Python-Dev] Suggest reverting today's checkin (recursive constant folding in the peephole optimizer)
> One note on the patch: it allocates an extra stack which is dynamically grown; but there is no unittest to exercise the stack-growing code.

Isn't this doing it?

    +# Long tuples should be folded too.
    +asm = dis_single(repr(tuple(range(1
    +# One LOAD_CONST for the tuple, one for the None return value
    +self.assertEqual(asm.count('LOAD_CONST'), 2)
    +self.assertNotIn('BUILD_TUPLE', asm)

Eugene
Re: [Python-Dev] Bugs in thread_nt.h
> I guess all this advice doesn't really apply to this case, though. The Microsoft API declares the parameter as a volatile*, indicating that they consider it proper usage of the API to declare the storage volatile. So ISTM that we should comply regardless of whether volatile is considered morally wrong in the general case.

The Microsoft compiler implements Microsoft-specific behaviour for volatile variables, making them close to volatiles in Java: http://msdn.microsoft.com/en-us/library/12a04hfd(v=VS.100).aspx

That may be the reason they use it in Interlocked* functions. As has already been said, thread_nt doesn't need any changes.

Eugene
Re: [Python-Dev] constant folding of -0
I've posted a patch.

Eugene

On Thu, Mar 10, 2011 at 3:30 PM, Mark Dickinson dicki...@gmail.com wrote:
> On Thu, Mar 10, 2011 at 2:17 AM, Eugene Toder elto...@gmail.com wrote:
>>> Indeed, see http://bugs.python.org/issue11244
>>
>> Yes, I've noticed that too. However, if I'm not missing something, your patches do not address folding of -0.
>
> Hmm, it seems that way. Could you post a comment on the tracker issue about that, please?
>
> I'm not sure why the original changeset went in, but I agree it looks as though it's no longer necessary. Certainly there should be enough -0.0 versus 0.0 tests around to detect any issues, so if all the tests pass with the extra PyObject_IsTrue check disabled then there probably isn't a problem.
>
> Mark
Re: [Python-Dev] constant folding of -0
Well, that was just a thought. You're right that long runs of constants can appear, and it's better to avoid pathological behaviour in such cases. Your second patch looks good.

Eugene

On Thu, Mar 10, 2011 at 6:30 PM, Antoine Pitrou solip...@pitrou.net wrote:
> On Thu, 10 Mar 2011 02:17:34 + (UTC)
> Eugene Toder elto...@gmail.com wrote:
>>> Indeed, see http://bugs.python.org/issue11244
>>
>> Yes, I've noticed that too. However, if I'm not missing something, your patches do not address folding of -0.
>>
>> Btw, there's an alternative approach to allow recursive constant folding. Instead of keeping a stack of last constants, you can keep a pointer to the start of the last (LOAD_CONSTs + NOPs) run and the number of LOAD_CONSTs in that run (called lastlc in the current version). When you want Nth constant from the end, you start from that pointer and skip lastlc-N constants. You also make a function to get next constant from that point.
>>
>> This approach has worse time complexity for searching in a long run of LOAD_CONSTs,
>
> Yes, the stack basically acts as a cache to avoid computing all this again and again.
>
>> however, there are upsides:
>> - very long runs of constants are rare in real code
>
> True, but they might appear in generated code.
>
>> - it uses less memory and doesn't have arbitrary limits on the size of the run
>
> Neither does the latest patch.
>
>> - its book-keeping overhead is smaller, so when you don't have long runs of constants (common case, I believe), it should be faster
>
> The book-keeping overhead should be quite small really, it's a simple autogrowing array with O(1) access and amortized append time. What's left is the overhead of the initial malloc() (and final free()).
>
>> - I think it's simpler to implement
>
> Feel free to propose an alternate patch, but I'm not sure that it would be significantly simpler (and a stack is a well-known data structure). Also, please present some benchmarks if you do.
>
> Regards
>
> Antoine.
[Python-Dev] constant folding of -0
Hello,

I've noticed that since version 3.2 Python doesn't fold -0:

    Python 3.1.3 (r313:86834, Nov 28 2010, 10:01:07)
    >>> def foo(): return -0
    >>> dis(foo)
      1           0 LOAD_CONST               1 (0)
                  3 RETURN_VALUE

    Python 3.2 (r32:88445, Feb 20 2011, 21:30:00)
    >>> def foo(): return -0
    >>> dis(foo)
      1           0 LOAD_CONST               1 (0)
                  3 UNARY_NEGATIVE
                  4 RETURN_VALUE

(A version built from head behaves the same way.)

It looks like folding -0 has been disabled in peephole since this commit, which was made a long time ago: http://hg.python.org/cpython/diff/660419bdb4ae/Python/compile.c

Before 3.2, -0 was likely folded in the parser -- in a more complex case no folding happens in either version:

    >>> def foo(): return -(1-1)
    >>> dis(foo)
      1           0 LOAD_CONST               2 (0)
                  3 UNARY_NEGATIVE
                  4 RETURN_VALUE

In 3.2 the parser no longer folds -0. So I wanted to ask why folding of -0 was disabled in peephole? The commit message makes me think this was a work-around for a problem in marshal -- perhaps it couldn't save -0.0 properly and so not creating -0.0 in the code objects was a simple fix. (This would mean the change predates folding in the parser.) Was marshal fixed?

If I revert the change everything seems to work and all tests pass. Since tests are run with .pyc-s I assume they test marshal? Maybe this check is no longer needed and can be reverted? Or is it there for some different reason which still holds?

Regards,
Eugene
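The questions above can be checked on any given interpreter with a few lines like the following. This is just an inspection sketch: it reports whether UNARY_NEGATIVE survives in the bytecode, checks that the sign of -0.0 is preserved at run time, and round-trips -0.0 through marshal. The function names are made up, and what the first two output lines show depends on the Python version.

```python
import dis
import marshal
import math


def neg_int_zero():
    return -0


def neg_float_zero():
    return -0.0


def has_unary_negative(func):
    """True if the compiled function still negates at run time."""
    return any(ins.opname == "UNARY_NEGATIVE"
               for ins in dis.get_instructions(func))


# copysign exposes the sign bit, which == cannot see (0.0 == -0.0).
runtime_sign = math.copysign(1.0, neg_float_zero())
marshal_sign = math.copysign(1.0, marshal.loads(marshal.dumps(-0.0)))

print("-0   folded:", not has_unary_negative(neg_int_zero))
print("-0.0 folded:", not has_unary_negative(neg_float_zero))
print("sign at run time:", runtime_sign, "after marshal:", marshal_sign)
```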
Re: [Python-Dev] constant folding of -0
> Indeed, see http://bugs.python.org/issue11244

Yes, I've noticed that too. However, if I'm not missing something, your patches do not address folding of -0.

Btw, there's an alternative approach to allow recursive constant folding. Instead of keeping a stack of last constants, you can keep a pointer to the start of the last (LOAD_CONSTs + NOPs) run and the number of LOAD_CONSTs in that run (called lastlc in the current version). When you want the Nth constant from the end, you start from that pointer and skip lastlc-N constants. You also make a function to get the next constant from that point.

This approach has worse time complexity for searching in a long run of LOAD_CONSTs; however, there are upsides:
- very long runs of constants are rare in real code
- it uses less memory and doesn't have arbitrary limits on the size of the run
- its book-keeping overhead is smaller, so when you don't have long runs of constants (common case, I believe), it should be faster
- I think it's simpler to implement

(There's also an optimization -- if (current_position - start_of_run) / 3 == lastlc there are no NOPs in the run and you can get constants by simple indexing.)

Eugene
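A toy Python model of the run-scanning approach, to make the indexing concrete. It is a sketch under assumptions, not code from peephole.c: bytecode is modelled as a bytes object in which LOAD_CONST takes 3 bytes (opcode plus a 2-byte little-endian argument) and NOP takes 1 byte, the opcode numbers are illustrative, and the function names are invented.

```python
LOAD_CONST = 100  # illustrative opcode numbers for this model
NOP = 9


def iter_run_constants(code, start, end):
    """Yield the argument of each LOAD_CONST in code[start:end],
    skipping NOPs. The run is assumed to contain nothing else."""
    pos = start
    while pos < end:
        if code[pos] == NOP:
            pos += 1
        else:
            yield code[pos + 1] | (code[pos + 2] << 8)
            pos += 3


def nth_const_from_end(code, run_start, pos, lastlc, n):
    """Return the argument of the n-th LOAD_CONST counting back from pos
    (n=1 is the most recent), given lastlc LOAD_CONSTs in the run."""
    if not 1 <= n <= lastlc:
        raise IndexError("constant %d is outside the run" % n)
    skip = lastlc - n
    if pos - run_start == 3 * lastlc:
        # No NOPs in the run: plain indexing.
        at = run_start + 3 * skip
        return code[at + 1] | (code[at + 2] << 8)
    for index, arg in enumerate(iter_run_constants(code, run_start, pos)):
        if index == skip:
            return arg
    raise AssertionError("lastlc does not match the run")


# A run with NOPs mixed in: consts 0, 1, 2.
mixed = bytes([LOAD_CONST, 0, 0, NOP, LOAD_CONST, 1, 0,
               NOP, NOP, LOAD_CONST, 2, 0])
# A run without NOPs: consts 5, 6.
dense = bytes([LOAD_CONST, 5, 0, LOAD_CONST, 6, 0])

print([nth_const_from_end(mixed, 0, len(mixed), 3, n) for n in (1, 2, 3)])
print([nth_const_from_end(dense, 0, len(dense), 2, n) for n in (1, 2)])
```

The only state carried between instructions is run_start and lastlc, which is where the smaller book-keeping overhead comes from; the cost is the linear scan whenever NOPs are present.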