Re: [Python-Dev] Update xml.etree.ElementTree for Python 2.7 and 3.2
Martin v. Löwis, 20.02.2010 13:08: >> Actually this should not be a fork of the upstream library. >> The goal is to improve stability and predictability of the ElementTree >> implementations in the stdlib, and to fix some bugs. >> I thought that it is better to backport the fixes from upstream than to >> fix each bug separately in the stdlib. >> >> I try to get some clear assessment from Fredrik. >> If it is accepted, I will probably cut some parts which are in the upstream >> library, but which are not in the API 1.2. If it is not accepted, it is bad >> news for the "xml.etree" users... > > Not sure about the timing, but in case you have not got the message: we > should rather drop ElementTree from the standard library than integrate > unreleased changes from an experimental upstream repository. > >> It is qualified as a "best effort" to get something better for ET. Nothing >> else. > > Unfortunately, it hurts ET users if it ultimately leads to a fork, or to > a removal of ET from the standard library. > > Please be EXTREMELY careful. I urge you not to act on this until > mid-March (which is the earliest time at which Fredrik has said he may > have time to look into this). I would actually encourage Florent to do the opposite: act now and prepare a patch against the latest official ET 1.2 and cET releases (or their SVN version respectively) that integrates everything that is considered safe, i.e. everything that makes cET compatible with ET and everything that seems clearly stable in ET 1.3 and does not break compatibility for existing code that uses ET 1.2. If you send that to Fredrik, I expect little opposition to making that the base for a 1.2.8 release, which can then be folded back into the stdlib. Stefan ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] __file__
On Sun, Feb 28, 2010 at 02:51:16PM +1300, Greg Ewing wrote: > Floris Bruynooghe wrote: > >(But even then I'm not > >convinced that would double the stat calls for normal users, only for > >those who only ship .pyc files) > > It would increase the number of stat calls for normal > users by 50%. You would need to look for a .pyc in the > source directory, then .py in the source directory and > .pyc in the cache directory. That's compared to two > stat calls currently, for .py and .pyc. Can't it look for a .py file in the source directory first (1st stat)? When it's there check for the .pyc in the cache directory (2nd stat, magic number encoded in filename), if it's not check for .pyc in the source directory (2nd stat + read for magic number check). Or am I missing a subtlety? > A solution might be to look for the presence of the > cache directory, and only look for a .pyc in the source > directory if there is no cache directory. Testing for > the cache directory would only have to be done once > per package and the result remembered, so it would > add very little overhead. That would work too, but I don't understand yet why the .pyc check in the source directory can't be done last. Regards Floris -- Debian GNU/Linux -- The Power of Freedom www.debian.org | www.gnu.org | www.kernel.org ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
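The lookup order Floris proposes can be sketched as below. This is only an illustration of the probe count for a single directory, not the interpreter's import code. The cache directory name `__pycache__` is taken from elsewhere in the thread; the function name `find_bytecode` and the `tag` string standing in for "magic number encoded in filename" are assumptions made for the example.

```python
import os


def find_bytecode(source_dir, name, cache_dirname="__pycache__", tag="cpython-32"):
    """Return (path_or_None, number_of_probes) for one directory.

    Order: source file first; if it exists, look in the cache directory,
    otherwise fall back to a .pyc next to where the source would be.
    `tag` is a hypothetical stand-in for the magic-number filename part.
    """
    probes = 0

    source = os.path.join(source_dir, name + ".py")
    probes += 1  # 1st stat: the source file
    if os.path.exists(source):
        cached = os.path.join(source_dir, cache_dirname, "%s.%s.pyc" % (name, tag))
        probes += 1  # 2nd stat: bytecode in the cache directory
        return (cached if os.path.exists(cached) else source), probes

    legacy = os.path.join(source_dir, name + ".pyc")
    probes += 1  # 2nd stat: bytecode-only file in the source directory
    return (legacy if os.path.exists(legacy) else None), probes
```

In this sketch every outcome costs two probes per directory, including a complete miss; the remaining question is how often that pair is repeated when several directories have to be ruled out.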
Re: [Python-Dev] __file__
-- http://www.ironpythoninaction.com

On 28 Feb 2010, at 12:19, Floris Bruynooghe wrote:
> On Sun, Feb 28, 2010 at 02:51:16PM +1300, Greg Ewing wrote:
>> Floris Bruynooghe wrote:
>>> (But even then I'm not convinced that would double the stat calls for
>>> normal users, only for those who only ship .pyc files)
>>
>> It would increase the number of stat calls for normal users by 50%. You
>> would need to look for a .pyc in the source directory, then .py in the
>> source directory and .pyc in the cache directory. That's compared to two
>> stat calls currently, for .py and .pyc.
>
> Can't it look for a .py file in the source directory first (1st stat)?
> When it's there check for the .pyc in the cache directory (2nd stat,
> magic number encoded in filename), if it's not check for .pyc in the
> source directory (2nd stat + read for magic number check). Or am I
> missing a subtlety?

The problem is doing this little dance for every path on sys.path.

Michael

>> A solution might be to look for the presence of the cache directory, and
>> only look for a .pyc in the source directory if there is no cache
>> directory. Testing for the cache directory would only have to be done
>> once per package and the result remembered, so it would add very little
>> overhead.
>
> That would work too, but I don't understand yet why the .pyc check in
> the source directory can't be done last.
>
> Regards
> Floris
Re: [Python-Dev] __file__
Michael Foord wrote: >> Can't it look for a .py file in the source directory first (1st stat)? >> When it's there check for the .pyc in the cache directory (2nd stat, >> magic number encoded in filename), if it's not check for .pyc in the >> source directory (2nd stat + read for magic number check). Or am I >> missing a subtlety? > > The problem is doing this little dance for every path on sys.path. To unpack this a little bit for those not quite as familiar with the import system (and to make it clear for my own benefit!): for a top-level module/package, each path on sys.path needs to be eliminated as a possible location before the interpreter can move on to check the next path in the list. So the important number is the number of stat calls on a "miss" (i.e. when the requested module/package is not present in a directory). Currently, with builtin support for bytecode only files, there are 3 checks (package directory, py source file, pyc/pyo bytecode file) to be made for each path entry. The PEP proposes to reduce that to only two in the case of a miss, by checking for the cached pyc only if the source file is present (there would still be three checks for a "hit", but that only happens at most once per module lookup). While the PEP is right in saying that a bytecode-only import hook could be added, I believe it would actually be a little tricky to write one that didn't severely degrade the performance of either normal imports or bytecode-only imports. Keeping it in the core import, but turning it off by default seems much less likely to have unintended performance consequences when it is switched back on. Another option is to remove bytecode-only support from the default filesystem importer, but keep it for zipimport (since the stat call savings don't apply in the latter case). Cheers, Nick. 
-- Nick Coghlan | [email protected] | Brisbane, Australia --- ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] __file__
On Sun, Feb 28, 2010 at 11:07:27PM +1000, Nick Coghlan wrote: > Michael Foord wrote: > >> Can't it look for a .py file in the source directory first (1st stat)? > >> When it's there check for the .pyc in the cache directory (2nd stat, > >> magic number encoded in filename), if it's not check for .pyc in the > >> source directory (2nd stat + read for magic number check). Or am I > >> missing a subtlety? > > > > The problem is doing this little dance for every path on sys.path. > > To unpack this a little bit for those not quite as familiar with the > import system (and to make it clear for my own benefit!): for a > top-level module/package, each path on sys.path needs to be eliminated > as a possible location before the interpreter can move on to check the > next path in the list. Aha, that was the clue I was missing. Thanks! Floris -- Debian GNU/Linux -- The Power of Freedom www.debian.org | www.gnu.org | www.kernel.org ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Challenge: escape from the pysandbox
Hi,

pysandbox is a new Python sandbox project under development. By default, untrusted code executed in the sandbox cannot modify the environment (write a file, use print or import a module). But you can configure the sandbox to choose exactly which features are allowed or not, e.g. import the sys module and read the file /etc/issue.

Website: http://github.com/haypo/pysandbox/

Download the repository using git:
  git clone git://github.com/haypo/pysandbox.git
or
  git clone http://github.com/haypo/pysandbox.git
Or download the .zip or .tar.gz tarball using the "Download source" button on the website.

I think that the project has reached the "testable" stage. I am launching a new challenge: try to escape from the sandbox. I'm unable to write strict rules. The goal is to access objects outside the sandbox, e.g. write into a file, import a module which is not in the whitelist, modify an object outside the sandbox, etc.

To test the sandbox, you have 3 choices:
 - interpreter.py: interactive interpreter executed in the sandbox, use:
   --verbose to display the whole sandbox configuration,
   --features=help to enable the help() function,
   --features=regex to enable regex,
   --help to display the help.
 - execfile.py: execute your script in the sandbox. It also has the --features option: use --features=stdout to be able to use the print instruction :-)
 - use the Sandbox class directly: use the methods call(), execute() or createCallback()

Don't use "with sandbox: ..." because there is a known bug with local frame variables. I think that I will later drop this syntax because of this bug.

Except for debug_sandbox, I consider that all features are safe and so you can enable all features :-)

There is no prize, it's just for fun! But I will add the names of the hackers who find the best exploits.

pysandbox is not ready for production, it's under heavy development. Anyway I *hope* that you will quickly find bugs!

--

Use tests.py to find some examples of how you can escape a sandbox. pysandbox is protected against all methods described in tests.py ;-)

See the README file to get more information about how pysandbox is implemented and get a list of other Python sandboxes.

pysandbox is currently specific to CPython, and it uses some ugly hacks to patch CPython in memory. In the worst case it will crash the pysandbox Python process, that's all. I tested it under Linux with Python 2.5 and 2.6. The port to Python 3 is not done yet (is someone motivated to write a patch? :-)).

-- 
Victor Stinner
http://www.haypocalc.com/
Re: [Python-Dev] __file__
On Sun, Feb 28, 2010 at 05:07, Nick Coghlan wrote: > Michael Foord wrote: > >> Can't it look for a .py file in the source directory first (1st stat)? > >> When it's there check for the .pyc in the cache directory (2nd stat, > >> magic number encoded in filename), if it's not check for .pyc in the > >> source directory (2nd stat + read for magic number check). Or am I > >> missing a subtlety? > > > > The problem is doing this little dance for every path on sys.path. > > To unpack this a little bit for those not quite as familiar with the > import system (and to make it clear for my own benefit!): for a > top-level module/package, each path on sys.path needs to be eliminated > as a possible location before the interpreter can move on to check the > next path in the list. > > So the important number is the number of stat calls on a "miss" (i.e. > when the requested module/package is not present in a directory). > Currently, with builtin support for bytecode only files, there are 3 > checks (package directory, py source file, pyc/pyo bytecode file) to be > made for each path entry. > Actually it's four: name/__init__.py, name/__init__.pyc, name.py, and then name.pyc. And just so people have terminology to go with all of this, this search is what the finder does to say whether it can or cannot handle the requested module. > > The PEP proposes to reduce that to only two in the case of a miss, by > checking for the cached pyc only if the source file is present (there > would still be three checks for a "hit", but that only happens at most > once per module lookup). > Just to be explicit, Nick is talking about name/__init__.py and name.py (note the skipping of looking for any .pyc files). At that point only the loader needs to check for the bytecode in the __pycache__ directory. 
>
> While the PEP is right in saying that a bytecode-only import hook could
> be added, I believe it would actually be a little tricky to write one
> that didn't severely degrade the performance of either normal imports or
> bytecode-only imports. Keeping it in the core import, but turning it off
> by default seems much less likely to have unintended performance
> consequences when it is switched back on.
>
It all depends on how it is implemented. If the bytecode-only importer stats a directory to check for the existence of any source in order to decide not to handle it, that is an extra stat call, but that is only once per sys.path/__path__ location by the path hook, not every attempted import.

Now if I ever manage to find the time to break up the default importers and expose them then it should be no more than adding the bytecode-only importer to the chained finder that already exists (it essentially chains source and extension modules).

>
> Another option is to remove bytecode-only support from the default
> filesystem importer, but keep it for zipimport (since the stat call
> savings don't apply in the latter case).
>
That's a very nice option. That would isolate it into a single importer that doesn't impact general performance for everyone else.

-Brett

>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | [email protected] | Brisbane, Australia
> ---
Re: [Python-Dev] __file__
On Sun, 2010-02-28 at 12:21 -0800, Brett Cannon wrote: > > Actually it's four: name/__init__.py, name/__init__.pyc, name.py, and > then name.pyc. And just so people have terminology to go with all of > this, this search is what the finder does to say whether it can or > cannot handle the requested module. Aren't there also: name.so namemodule.so ? -Rob signature.asc Description: This is a digitally signed message part ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] __file__
Nick Coghlan a écrit : Another option is to remove bytecode-only support from the default filesystem importer, but keep it for zipimport (since the stat call savings don't apply in the latter case). bytecode-only in a zip is used by py2exe, cx_freeze and the like, for space reasons. Disabling it would probably hurt them. However, making a difference between zipimport and the filesystem importer means the application will stop working if I unzip the library zip file, which is surprising. Unzipping the zip file can be handy when debugging a bug caused by a forgotten module. Cheers, Baptiste ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] __file__
Brett Cannon wrote: > Actually it's four: name/__init__.py, name/__init__.pyc, name.py, and > then name.pyc. And just so people have terminology to go with all of > this, this search is what the finder does to say whether it can or > cannot handle the requested module. Huh, I thought we checked for the directory first and only then checked for the __init__ module within it (hence the generation of ImportWarning when we don't find __init__ after finding a correctly named directory). So a normal miss (i.e. no directory) only needs one stat call. (However, I'll grant that I haven't looked at this particular chunk of code in a fairly long time, so I could easily be wrong). Robert raises a good point about the checks for extension modules as well - we should get an accurate count here so Barry's PEP can pitch the proportional reduction in stat calls accurately. Cheers, Nick. -- Nick Coghlan | [email protected] | Brisbane, Australia --- ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] __file__
Le Sun, 28 Feb 2010 21:45:56 +0100, Baptiste Carvello a écrit : > bytecode-only in a zip is used by py2exe, cx_freeze and the like, for > space reasons. Disabling it would probably hurt them. Source code compresses quite well. I'm not sure it would make much of a difference. AFAIR, when you create a py2exe distribution, what takes most of the place is the interpreter itself as well as any big third-party C libraries such as wxWidgets. Regards Antoine. ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] __file__
Glenn Linderman wrote: if the command line/runpy can do it, the importer could do it. Just a matter of desire and coding. Whether it is worth pursuing further depends on people's perceptions of "kookiness" vs. functional and performance considerations. Having .py files around that aren't source text could lead to a lot of confusion, given that most platforms these days decide which application to open for a given file based solely on the filename extension. I wouldn't enjoy trying to open a .py file only to have my text editor blow up because it was actually a binary file. So on balance I think it's a bit too kooky for my taste. -- Greg ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] __file__
On Sun, Feb 28, 2010 at 09:45:56PM +0100, Baptiste Carvello wrote: > However, making a difference between zipimport and the filesystem > importer means the application will stop working if I unzip the > library zip file, which is surprising. Unzipping the zip file can be > handy when debugging a bug caused by a forgotten module. That difference exists already, the zipimporter will happily run .pyo files inside the zipfile even when you're not running with -O or PYTHONOPTIMIZE. Regards Floris -- Debian GNU/Linux -- The Power of Freedom www.debian.org | www.gnu.org | www.kernel.org ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] __file__
Floris Bruynooghe wrote: Can't it look for a .py file in the source directory first (1st stat)? When it's there check for the .pyc in the cache directory (2nd stat, magic number encoded in filename), if it's not check for .pyc in the source directory (2nd stat + read for magic number check). Yes, although that would then incur higher stat overheads for people distributing .pyc files. There doesn't seem to be a way of pleasing everyone. This is all assuming that the extra stat calls are actually a problem. Does anyone have any evidence that they would really take significant time compared to loading the module? Once you've looked for one file in a given directory, looking for another one in the same directory ought to be quite fast, since all the relevant directory blocks will be in the filesystem cache. -- Greg ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Update xml.etree.ElementTree for Python 2.7 and 3.2
2010/2/28 Stefan Behnel > I would actually encourage Florent to do the opposite: act now and prepare > a patch against the latest official ET 1.2 and cET releases (or their SVN > version respectively) that integrates everything that is considered safe, > i.e. everything that makes cET compatible with ET and everything that seems > clearly stable in ET 1.3 and does not break compatibility for existing code > that uses ET 1.2. If you send that to Fredrik, I expect little opposition > to making that the base for a 1.2.8 release, which can then be folded back > into the stdlib. > > I exchanged some e-mails with Fredrik last week. Not sure if it will be 1.2.8 or 1.3, but now he is positive on the goals of the patch. I've commited all the changes and external fixes to a branch of the Mercurial repo owned by Fredrik. I'm expecting an answer soon. Branch based on the official etree repository (Mercurial): http://bitbucket.org/flox/et-2009-provolone/ Patch based on this branch: http://codereview.appspot.com/207048 (patch set 7 almost identical to the tip of the Mercurial repo) -- Florent ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] __file__
On Mon, 2010-03-01 at 12:35 +1300, Greg Ewing wrote:
>
> Yes, although that would then incur higher stat overheads for
> people distributing .pyc files. There doesn't seem to be a
> way of pleasing everyone.
>
> This is all assuming that the extra stat calls are actually
> a problem. Does anyone have any evidence that they would
> really take significant time compared to loading the module?
> Once you've looked for one file in a given directory, looking
> for another one in the same directory ought to be quite fast,
> since all the relevant directory blocks will be in the
> filesystem cache.

We've done a bunch of testing in bzrlib. Basic things are:
 - statting /is/ expensive *if* you don't use the result.
 - loading code is the main cost *once* you have a hot disk cache

Specifically, stats for files that are *not present* incur page-in costs for the dentries needed to determine the file is absent. In the special case of probing for $name.$ext1, ...$ext2, ...$ext3, you generally hit the same pages and don't incur additional page-in costs (you'll hit the same page in most file systems when you look for the second and third entries).

In most file systems stats for files that *are present* also incur a page-in for the inode of the file. If you then do not read the file, this is I/O that doesn't really gain anything.

Being able to disable .py file usage completely, so that only foo.pyc and foo/__init__.pyc are probed for, could make a very noticeable change in the cold cache startup time.

# Startup time for bzr (cold cache):
$ drop-caches
$ time bzr --no-plugins revno
5061

real    0m8.875s
user    0m0.210s
sys     0m0.140s

# Hot cache
$ time bzr --no-plugins revno
5061

real    0m0.307s
user    0m0.250s
sys     0m0.040s

(revno is a small command that reads a small amount of data - just enough to trigger demand loading of the core repository layers and so on).

strace timings for those two operations:

cold cache:
$ strace -c bzr --no-plugins revno
5061
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 56.34    0.040000          76       527           read
 28.98    0.020573           9      2273      1905 open
 14.43    0.010248          14       734       625 stat
  0.15    0.000107           0       533           fstat
...

hot cache:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 45.10    0.000368          92         4           getdents
 19.49    0.000159           0       527           read
 16.91    0.000138           1       163           munmap
 10.05    0.000082           0       254           mprotect
  8.46    0.000069           0      2273      1905 open
  0.00    0.000000           0         8           write
  0.00    0.000000           0       367           close
  0.00    0.000000           0       734       625 stat
...

Cheers,
Rob
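For anyone wanting to repeat the "stat of an absent file" part of this measurement on their own filesystem, a minimal sketch follows. The function name `time_stat_misses` and the generated file names are made up for the example. It only times failed `os.stat` calls from Python, so it will mostly show the hot-cache case unless the OS caches are dropped first, as was done for the numbers above.

```python
import os
import tempfile
import time


def time_stat_misses(directory, count=1000):
    """Stat `count` names that do not exist; return (misses, elapsed_seconds)."""
    misses = 0
    start = time.perf_counter()
    for i in range(count):
        try:
            os.stat(os.path.join(directory, "no_such_module_%d.py" % i))
        except OSError:
            # The file is absent, which is the "miss" case discussed above.
            misses += 1
    return misses, time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        misses, seconds = time_stat_misses(d)
        print("%d failed stats in %.6f s" % (misses, seconds))
```

Pointing `directory` at an NFS mount or a large site-packages directory instead of an empty temporary directory would be the more interesting comparison.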
Re: [Python-Dev] __file__
Robert Collins wrote:
> In the special case of probing for $name.$ext1, ...$ext2, ...$ext3, you
> generally hit the same pages and don't incur additional page in costs.

So then looking for a .pyc alongside a .py or vice versa should be almost free, and we shouldn't be worrying about it.

> hot cache:
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  45.10    0.000368          92         4           getdents
>   0.00    0.000000           0       734       625 stat

Further supporting the idea that stat calls are negligible once the cache is warmed up.

-- 
Greg
[Python-Dev] Draft PEP on RSON configuration file format
All:

Finding .ini configuration files too limiting, JSON and XML too hard to manually edit, and YAML too complex to parse quickly, I have started work on a new configuration file parser.

I call the new format RSON (for "Readable Serial Object Notation"), and it is designed to be a superset of JSON.

I would love for it to be considered valuable enough to be a part of the standard library, but even if that does not come to pass, I would be very interested in feedback to help me polish the specification, and then possibly help for implementation and testing.

The documentation is in rst PEP form, at: http://rson.googlecode.com/svn/trunk/doc/draftpep.txt

Thanks and best regards,
Pat
Re: [Python-Dev] Draft PEP on RSON configuration file format
2010/2/28 Patrick Maupin:
> All:
>
> Finding .ini configuration files too limiting, JSON and XML too hard to
> manually edit, and YAML too complex to parse quickly, I have started
> work on a new configuration file parser.

In that case, it should live in user space for several years. If the community decides that it is an excellent format, then it should be considered for inclusion in the standard library.

-- 
Regards,
Benjamin
Re: [Python-Dev] Draft PEP on RSON configuration file format
On Sun, Feb 28, 2010 at 6:29 PM, Benjamin Peterson wrote: > In that case, it should live in the user space for several years. If > the community decides that it is an excellent format, then it should > be considered for inclusion in the stand library. Agreed. However, there are too many things which became de facto standards without community input this way. PEP 1 itself says: Reference Implementation -- The reference implementation must be completed before any PEP is given status "Final", but it need not be completed before the PEP is accepted. It is better to finish the specification and rationale first and reach consensus on it before writing code. So, I do not mind the code sitting outside the standard library, and the PEP not reaching "Final" for several years, but I do believe that the PEP process is itself a really good way to build a better mousetrap by consensus. If you do not care to participate in the building of this particular mousetrap, that is OK, too. Regards, Pat ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Draft PEP on RSON configuration file format
Le Sun, 28 Feb 2010 18:59:16 -0600, Patrick Maupin a écrit : > > So, I do not mind the code sitting outside the standard library, and > the PEP not reaching "Final" for several years, but I do believe that > the PEP process is itself a really good way to build a better > mousetrap by consensus. In this case it is *at best* python-ideas material, or even preferably comp.lang.python. Just for the record, my only reaction when giving the PEP a glance was "yet another configuration file format - yawn". Good luck though, Antoine. ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Draft PEP on RSON configuration file format
On Sun, Feb 28, 2010 at 7:39 PM, Antoine Pitrou wrote: > In this case it is *at best* python-ideas material, or even > preferably comp.lang.python. I was thinking about comp.lang.python at some point, but thought I would try here first. > Just for the record, my only reaction when giving the PEP a glance was > "yet another configuration file format - yawn". I suppose I have that sort of reaction about areas I am not interested in, as well, but currently I am deeply interested in configuration files due to my circumstances. In any case, the observation that there are already several preexisting file formats used for configuration is certainly covered in the PEP draft, but if you have anything constructive to add *about* configuration file formats, I would certainly welcome the input. Best regards, Pat ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Draft PEP on RSON configuration file format
Le Sun, 28 Feb 2010 19:46:30 -0600, Patrick Maupin a écrit : > > I suppose I have that sort of reaction about areas I am not interested > in, as well, but currently I am deeply interested in configuration > files due to my circumstances. In any case, the observation that > there are already several preexisting file formats used for > configuration is certainly covered in the PEP draft, but if you have > anything constructive to add *about* configuration file formats, I > would certainly welcome the input. Well, a constructive approach would involve approaching projects which have devised their own formats, so as to know what kind of unified format they would be likely to accept (or not). python-dev is probably not the place for such an approach, however. ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] __file__
On Sun, Feb 28, 2010 at 16:31, Greg Ewing wrote:
> Robert Collins wrote:
>
>> In the special case of probing for $name.$ext1, ...$ext2, ...$ext3, you
>> generally hit the same pages and don't incur additional page in costs.
>>
>
> So then looking for a .pyc alongside a .py or vice versa
> should be almost free, and we shouldn't be worrying about
> it.
>
But that is making the assumption that all filesystems operate this way (e.g., does NFS have the same performance characteristics?).

>
>> hot cache:
>> % time     seconds  usecs/call     calls    errors syscall
>> ------ ----------- ----------- --------- --------- ----------------
>>  45.10    0.000368          92         4           getdents
>>   0.00    0.000000           0       734       625 stat
>>
>
> Further supporting the idea that stat calls are negligible
> once the cache is warmed up.

But that's the point: once it's warmed up. This is not the case when executing a script once every once in a while, compared to something like bzr where you are most likely going to execute the command multiple times within a small timeframe.

-Brett

>
> --
> Greg
Re: [Python-Dev] __file__
On Sun, Feb 28, 2010 at 12:46, Nick Coghlan wrote:
> Brett Cannon wrote:
> > Actually it's four: name/__init__.py, name/__init__.pyc, name.py, and
> > then name.pyc. And just so people have terminology to go with all of
> > this, this search is what the finder does to say whether it can or
> > cannot handle the requested module.
>
> Huh, I thought we checked for the directory first and only then checked
> for the __init__ module within it (hence the generation of ImportWarning
> when we don't find __init__ after finding a correctly named directory).
> So a normal miss (i.e. no directory) only needs one stat call.
>
> (However, I'll grant that I haven't looked at this particular chunk of
> code in a fairly long time, so I could easily be wrong).
>
> Robert raises a good point about the checks for extension modules as
> well - we should get an accurate count here so Barry's PEP can pitch the
> proportional reduction in stat calls accurately.
>
Here are the details (from Python/import.c:find_module) assuming that
everything has failed to the point of trying for the implicit sys.path
importers:

  stat_info = stat(name)
  if stat_info.exists and stat_info.is_dir:
      if stat(name/__init__.py) || stat(name/__init__.pyc):
          load(name)
  else:
      # Windows has an extra check for .pyw files.
      for ext in ('.so', 'module.so', '.py', '.pyc'):
          if open(name + ext):
              load(name)

So there are a total of five to six depending on the OS (actually, VMS goes
up to eight!) before a search path is considered not to contain a module.
And thanks to doing this I realized importlib is not stat'ing the directory
first which should fail faster than checking for the __init__ files every
time.
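
The probe sequence in the pseudocode above can be sketched in Python to count
how many filesystem checks a lookup costs. This is only a minimal sketch that
follows the pseudocode as written, not Python/import.c itself: the name
count_probes is hypothetical, os.path checks stand in for the stat()/open()
calls, and the Windows .pyw and VMS cases are left out.

```python
import os

# Suffixes copied from the pseudocode above (non-Windows case).
SUFFIXES = ('.so', 'module.so', '.py', '.pyc')


def count_probes(directory, name):
    """Return (found, probes) for one search-path entry.

    `probes` counts every filesystem check made before an answer is known.
    Hypothetical helper for illustration; it mirrors the pseudocode only.
    """
    probes = 0
    base = os.path.join(directory, name)

    probes += 1  # stat(name): is there a directory with this name?
    if os.path.isdir(base):
        for init in ('__init__.py', '__init__.pyc'):
            probes += 1  # stat(name/__init__.py) or stat(name/__init__.pyc)
            if os.path.isfile(os.path.join(base, init)):
                return True, probes
        return False, probes

    for ext in SUFFIXES:
        probes += 1  # open(name + ext)
        if os.path.isfile(base + ext):
            return True, probes
    return False, probes
```

For a name that matches nothing in the directory, this reports five probes,
which is the low end of the "five to six" figure above.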
-Brett
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | [email protected] | Brisbane, Australia
> ---
>
Re: [Python-Dev] __file__
On Sun, Feb 28, 2010 at 12:45, Baptiste Carvello wrote:
> Nick Coghlan wrote:
>
>> Another option is to remove bytecode-only support from the default
>> filesystem importer, but keep it for zipimport (since the stat call
>> savings don't apply in the latter case).
>>
>
> bytecode-only in a zip is used by py2exe, cx_freeze and the like, for space
> reasons. Disabling it would probably hurt them.
>
> However, making a difference between zipimport and the filesystem importer
> means the application will stop working if I unzip the library zip file,
> which is surprising. Unzipping the zip file can be handy when debugging a
> bug caused by a forgotten module.
>

Is it really that hard to unzip a bunch of .pyc files, modify what you need
to, and then zip it back up? And if you are given a zip file of only .pyc
files you can't really debug anything anyway.

-Brett

> Cheers,
> Baptiste
Re: [Python-Dev] The fate of Distutils in Python 2.7
On Fri, Feb 26, 2010 at 14:15, Tarek Ziadé wrote:
> On Fri, Feb 26, 2010 at 11:13 PM, Brett Cannon wrote:
> [..]
> > I assume you want the Distutils2 component to auto-assign to you like
> > Distutils currently does? If so I can add the component for you if people
> > don't object to the new component.
>
> Sounds good -- Thanks
>

Done.

-Brett
Re: [Python-Dev] Draft PEP on RSON configuration file format
On Sun, Feb 28, 2010 at 7:51 PM, Antoine Pitrou wrote:
> Well, a constructive approach would involve approaching projects
> which have devised their own formats, so as to know what kind of
> unified format they would be likely to accept (or not).

Trying to poll "selected projects which have configuration files" may or
may not be a constructive approach. Most projects which have predefined
formats are unlikely to change, unless there is standardization on a new
format. It is very much a chicken and egg problem, although I agree with
(and have implemented) the suggestion that I discuss this on python-list.

Having said that, one of the reasons I wrote the PEP and am working on a
parser is because of a few projects I use and/or am personally involved
in. For example, rst2pdf stylesheets are in JSON, e.g.

http://rst2pdf.googlecode.com/svn/trunk/rst2pdf/styles/styles.json

Now, we're all programmers here, and we can read this, and can even modify
it, but it is easy to get wrong, and very verbose with lots of syntax
gotchas. For example, unlike Python, JSON won't even let you have a
trailing comma.

But JSON *is* a great format, and RSON (like YAML) is designed to parse
properly formatted JSON, so the goal is that any project which uses JSON
could use RSON as a drop-in replacement, and then update its configuration
data.

Of course, it is extremely easy (hence your yawn) to create a new
configuration format, even if it is specified that it is upwards
compatible with JSON. The trick is to create the *correct* new format,
that at least some people can agree on.

In order to do this, I have chosen to poll, not preexisting projects,
which have entrenched configuration data and a reluctance to change, but
brand new projects which haven't been invented yet. Many of the inventors
of those projects hang out on python-dev, so this seemed like a reasonable
place to do polling.
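
The trailing-comma gotcha mentioned above is easy to check with the standard
library. This is a small sketch only: the keys in the sample text are made up
for illustration and are not taken from the rst2pdf stylesheet.

```python
import ast
import json

# Hypothetical style entry; note the trailing comma before the closing brace.
text = '{"fontSize": 10, "fontName": "Helvetica",}'

# Python's own literal syntax tolerates the trailing comma.
as_python = ast.literal_eval(text)

# The standard json module rejects the same text with a ValueError subclass.
try:
    json.loads(text)
    json_accepts = True
except ValueError:
    json_accepts = False
```

Here as_python is an ordinary dict, while json_accepts ends up False.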

As I tried to make clear, I will not be too disappointed if I do not come
up with something worthy of the standard library for a long time (if
ever), but the PEP process is very valuable, and I would like to start off
on the right foot by soliciting feedback before I do too much coding.

Sorry if it feels like spam; this is my last message on the matter until
and unless somebody wants to constructively discuss the actual contents of
the PEP. Please feel free to email me privately if you don't want to
clutter up this list.

Thanks and best regards,
Pat
Re: [Python-Dev] __file__
On approximately 2/28/2010 3:22 PM, came the following characters from
the keyboard of Greg Ewing:
> Glenn Linderman wrote:
>
>> if the command line/runpy can do it, the importer could do it. Just a
>> matter of desire and coding. Whether it is worth pursuing further
>> depends on people's perceptions of "kookiness" vs. functional and
>> performance considerations.
>
> Having .py files around that aren't source text could lead
> to a lot of confusion, given that most platforms these days
> decide which application to open for a given file based
> solely on the filename extension. I wouldn't enjoy trying
> to open a .py file only to have my text editor blow up
> because it was actually a binary file.
>
> So on balance I think it's a bit too kooky for my taste.

I understand your thoughts, but have some rebuttal comments. Mind you, if
there is a better solution that can improve performance for both the
source+binary and the binary-only distributions, I'm all for it. But in
general, I'm all for performance improvements, even if there is some
kookiness :)

Thankful for Brett's posting of the actual search code fragment.

If your text editor blows up because it is binary, it is a sad text
editor.

If you have .py mapped to a text editor, that's sort of kooky too; I have
it mapped to Python.

The .py files that are binary would generally be part of an application
distribution in binary form, and therefore would be installed in some
place like /bin or C:\Program Files ... not the place you'd look for
source code, to confuse your text editor.

--
Glenn -- http://nevcal.com/
===
A protocol is complete when there is nothing left to remove.
-- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking
Re: [Python-Dev] __file__
On Sun, 28 Feb 2010 19:32:09 -0800, Glenn Linderman wrote:
>
> If your text editor blows up because it is binary, it is a sad text
> editor.
>
> If you have .py mapped to a text editor, that's sort of kooky too; I
> have it mapped to Python.

File extensions exist for a reason, even if you find that "kooky" and have
strong ideas about the psychology of text editors. Having some binary
files named "foobar.py" would certainly annoy a lot of people, including
me.

Antoine.
Re: [Python-Dev] __file__
Glenn Linderman wrote:

> If your text editor blows up because it is binary, it is a sad text
> editor.

Blow up is probably an exaggeration, but even just getting a screen full
of gibberish when I think I'm opening a text file is a jarring experience.

> If you have .py mapped to a text editor, that's sort of kooky too; I
> have it mapped to Python.

On Windows the action for double-clicking is usually mapped to running the
file, but there's typically another action such as "Open with IDLE" or
whatever available, and a bytecode file named with ".py" would allow you
to apply that action to it.

--
Greg
Re: [Python-Dev] __file__
On Feb 27, 2010, at 9:38 AM, Nick Coghlan wrote:

> I do like the idea of pulling .pyc only imports out into a separate
> importer, but would go so far as to suggest keeping them as a command
> line option rather than as a separately distributed module.

One advantage of doing this as a separately distributed module is that it
can have its own ecosystem and momentum. Most projects that want this sort
of bundling or packaging really want to be shipped with something like
py2exe, and I think the folks who want such facilities would be better
served by a nice project website for "python sealer" or "python bundler"
rather than obscure directions for triggering the behavior via options or
configuration.

Making bytecode loading a feature of interpreter startup, whether it's a
config file, a command-line option or an environment variable, is not a
great idea. For folks that want to ship a self-contained application, any
of these would require an additional customization step, where they need
to somehow tell their bundled interpreter to load bytecode. For people
trying to ship a self-contained and tamper-unfriendly (since even
"tamper-resistant" would be overstating things) library to relatively
non-technical programmers, it opens the door to a whole universe of
confusion and FAQs about why the code didn't load.

However bytecode-only code loading is facilitated, it should be possible
to bootstrap from a vanilla python interpreter running normally, as you
may not know you need to load a bytecode-only package at startup. In the
stand-alone case there are already plenty of options, and in the library
case, shipping a zip file should be fine, since the __init__.py of your
package should be plain-text and also able to trigger the activation of
the bytecode-only importer.
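
The idea of bootstrapping bytecode-only loading from ordinary Python code,
rather than from interpreter startup options, can be illustrated roughly as
follows. This is a sketch under stated assumptions, not the importer discussed
in this thread: it relies on importlib.machinery.SourcelessFileLoader from
current Python 3, which postdates this discussion, and the helper name
load_bytecode_only is hypothetical.

```python
import importlib.machinery
import importlib.util


def load_bytecode_only(name, pyc_path):
    """Load a module from a .pyc file that has no matching source file.

    Hypothetical helper: plain-text code (for example a package's
    __init__.py) could call it, so no interpreter option is required.
    """
    # SourcelessFileLoader reads and executes bytecode directly from the path.
    loader = importlib.machinery.SourcelessFileLoader(name, pyc_path)
    spec = importlib.util.spec_from_file_location(name, pyc_path, loader=loader)
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)
    return module
```

The .pyc file must have been produced by the same Python version that loads
it, since the loader checks the bytecode magic number.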

There are already so many ways to ship bytecode that it doesn't seem too
important to support it in this one particular configuration (files in a
directory, compiled by just importing them, in the same place as ".py"
files). The real problem is providing a seamless transition path for
*build* processes, not the Python code itself. Do any of the folks who are
currently using this feature have a good idea as to how your build and
distribute scripts might easily be updated, perhaps by a 2to3 fixer?
Re: [Python-Dev] __file__
Greg Ewing wrote:
> Having .py files around that aren't source text could lead
> to a lot of confusion, given that most platforms these days
> decide which application to open for a given file based
> solely on the filename extension. I wouldn't enjoy trying
> to open a .py file only to have my text editor blow up
> because it was actually a binary file.
>
> So on balance I think it's a bit too kooky for my taste.

+1

Add to that the inverse... I will clean up directories based on the
suffix, keeping the .py and deleting .pyc and .pyo. Overloading a source
file suffix is not good.

Larry
Re: [Python-Dev] __file__
Brett Cannon wrote:

>> However, making a difference between zipimport and the filesystem
>> importer means the application will stop working if I unzip the
>> library zip file, which is surprising. Unzipping the zip file can be
>> handy when debugging a bug caused by a forgotten module.
>
> Is it really that hard to unzip a bunch of .pyc files, modify what you
> need to, and then zip it back up? And if you are given a zip file of
> only .pyc files you can't really debug anything anyway.

Well, this is a micro-use-case, I admit, I only mention it because it's
something I've really done. It's only useful for debugging the building
process, not the application (so I do have the source at hand), and the
only reason for not rezipping is to test more quickly. I can definitely
live without it!

Cheers,
Baptiste
Re: [Python-Dev] Update xml.etree.ElementTree for Python 2.7 and 3.2
Florent XICLUNA, 01.03.2010 00:36:
> I exchanged some e-mails with Fredrik last week. Not sure if it will be
> 1.2.8 or 1.3, but now he is positive on the goals of the patch. I've
> committed all the changes and external fixes to a branch of the Mercurial
> repo owned by Fredrik. I'm expecting an answer soon.

Happy to hear that. Thanks for putting so much work into this!

> Branch based on the official etree repository (Mercurial):
> http://bitbucket.org/flox/et-2009-provolone/

Interesting, I didn't even know Fredrik had continued to work on this. It
even looks like lxml.etree has a bit to catch up API-wise before I release
2.3.

Stefan
