Re: [Python-Dev] Update xml.etree.ElementTree for Python 2.7 and 3.2

2010-02-28 Thread Stefan Behnel
Martin v. Löwis, 20.02.2010 13:08:
>> Actually this should not be a fork of the upstream library.
>> The goal is to improve stability and predictability of the ElementTree
>> implementations in the stdlib, and to fix some bugs.
>> I thought that it is better to backport the fixes from upstream than to
>> fix each bug separately in the stdlib.
>>
>> I am trying to get a clear assessment from Fredrik.
>> If it is accepted, I will probably cut some parts which are in the upstream
>> library, but which are not in the API 1.2. If it is not accepted, it is bad
>> news for the "xml.etree" users...
> 
> Not sure about the timing, but in case you have not got the message: we
> should rather drop ElementTree from the standard library than integrate
> unreleased changes from an experimental upstream repository.
> 
>> It is qualified as a "best effort" to get something better for ET. Nothing 
>> else.
> 
> Unfortunately, it hurts ET users if it ultimately leads to a fork, or to
> a removal of ET from the standard library.
> 
> Please be EXTREMELY careful. I urge you not to act on this until
> mid-March (which is the earliest time at which Fredrik has said he may
> have time to look into this).

I would actually encourage Florent to do the opposite: act now and prepare
a patch against the latest official ET 1.2 and cET releases (or their SVN
version respectively) that integrates everything that is considered safe,
i.e. everything that makes cET compatible with ET and everything that seems
clearly stable in ET 1.3 and does not break compatibility for existing code
that uses ET 1.2. If you send that to Fredrik, I expect little opposition
to making that the base for a 1.2.8 release, which can then be folded back
into the stdlib.

Stefan

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] __file__

2010-02-28 Thread Floris Bruynooghe
On Sun, Feb 28, 2010 at 02:51:16PM +1300, Greg Ewing wrote:
> Floris Bruynooghe wrote:
> >(But even then I'm not
> >convinced that would double the stat calls for normal users, only for
> >those who only ship .pyc files)
> 
> It would increase the number of stat calls for normal
> users by 50%. You would need to look for a .pyc in the
> source directory, then .py in the source directory and
> .pyc in the cache directory. That's compared to two
> stat calls currently, for .py and .pyc.

Can't it look for a .py file in the source directory first (1st stat)?
When it's there check for the .pyc in the cache directory (2nd stat,
magic number encoded in filename), if it's not check for .pyc in the
source directory (2nd stat + read for magic number check).  Or am I
missing a subtlety?
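
The lookup order proposed above can be sketched in Python. This is only an illustration: the function name `probe`, the cache directory name and the `tag` in the cached file name are placeholders, since the thread has not settled on a cache layout.

```python
import os


def probe(directory, name, cache_dirname="__pycache__", tag="cpython-27"):
    """Return (kind, path, stat_count) using the order described above.

    cache_dirname and tag are hypothetical placeholders, not a settled layout.
    """
    stats = 0

    # 1st stat: look for the source file in the source directory.
    source = os.path.join(directory, name + ".py")
    stats += 1
    if os.path.isfile(source):
        # 2nd stat: the version tag is encoded in the cached file name,
        # so the file does not need to be opened to check a magic number.
        cached = os.path.join(directory, cache_dirname,
                              "%s.%s.pyc" % (name, tag))
        stats += 1
        if os.path.isfile(cached):
            return "cached", cached, stats
        return "source", source, stats

    # No source: 2nd stat looks for a bytecode-only file next to where the
    # source would have been (its magic number must still be read later).
    legacy = os.path.join(directory, name + ".pyc")
    stats += 1
    if os.path.isfile(legacy):
        return "bytecode-only", legacy, stats
    return "miss", None, stats
```

With this order a directory that does not contain the module costs two checks, the same as a directory that does.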


> A solution might be to look for the presence of the
> cache directory, and only look for a .pyc in the source
> directory if there is no cache directory. Testing for
> the cache directory would only have to be done once
> per package and the result remembered, so it would
> add very little overhead.

That would work too, but I don't understand yet why the .pyc check in
the source directory can't be done last.

Regards
Floris

-- 
Debian GNU/Linux -- The Power of Freedom
www.debian.org | www.gnu.org | www.kernel.org


Re: [Python-Dev] __file__

2010-02-28 Thread Michael Foord




--
http://www.ironpythoninaction.com

On 28 Feb 2010, at 12:19, Floris Bruynooghe wrote:

> On Sun, Feb 28, 2010 at 02:51:16PM +1300, Greg Ewing wrote:
>> Floris Bruynooghe wrote:
>>> (But even then I'm not
>>> convinced that would double the stat calls for normal users, only for
>>> those who only ship .pyc files)
>>
>> It would increase the number of stat calls for normal
>> users by 50%. You would need to look for a .pyc in the
>> source directory, then .py in the source directory and
>> .pyc in the cache directory. That's compared to two
>> stat calls currently, for .py and .pyc.
>
> Can't it look for a .py file in the source directory first (1st stat)?
> When it's there check for the .pyc in the cache directory (2nd stat,
> magic number encoded in filename), if it's not check for .pyc in the
> source directory (2nd stat + read for magic number check).  Or am I
> missing a subtlety?

The problem is doing this little dance for every path on sys.path.

Michael

>> A solution might be to look for the presence of the
>> cache directory, and only look for a .pyc in the source
>> directory if there is no cache directory. Testing for
>> the cache directory would only have to be done once
>> per package and the result remembered, so it would
>> add very little overhead.
>
> That would work too, but I don't understand yet why the .pyc check in
> the source directory can't be done last.
>
> Regards
> Floris
>
> --
> Debian GNU/Linux -- The Power of Freedom
> www.debian.org | www.gnu.org | www.kernel.org


Re: [Python-Dev] __file__

2010-02-28 Thread Nick Coghlan
Michael Foord wrote:
>> Can't it look for a .py file in the source directory first (1st stat)?
>> When it's there check for the .pyc in the cache directory (2nd stat,
>> magic number encoded in filename), if it's not check for .pyc in the
>> source directory (2nd stat + read for magic number check).  Or am I
>> missing a subtlety?
> 
> The problem is doing this little dance for every path on sys.path.

To unpack this a little bit for those not quite as familiar with the
import system (and to make it clear for my own benefit!): for a
top-level module/package, each path on sys.path needs to be eliminated
as a possible location before the interpreter can move on to check the
next path in the list.

So the important number is the number of stat calls on a "miss" (i.e.
when the requested module/package is not present in a directory).
Currently, with builtin support for bytecode only files, there are 3
checks (package directory, py source file, pyc/pyo bytecode file) to be
made for each path entry.

The PEP proposes to reduce that to only two in the case of a miss, by
checking for the cached pyc only if the source file is present (there
would still be three checks for a "hit", but that only happens at most
once per module lookup).
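
To make the arithmetic concrete, here is a small sketch of how the per-entry cost adds up over sys.path. The per-entry counts (3 today, 2 on a miss under the PEP, 3 on a hit) are the ones given in this message, which later replies question; the ten-entry path is an invented example.

```python
def checks_for_lookup(misses_before_hit, checks_on_miss, checks_on_hit):
    """Total filesystem checks for one top-level import.

    Every sys.path entry before the one holding the module is a miss and
    must be fully eliminated; only the final entry pays the hit cost.
    """
    return misses_before_hit * checks_on_miss + checks_on_hit


# Hypothetical layout: ten sys.path entries, module found in the last one.
current = checks_for_lookup(9, checks_on_miss=3, checks_on_hit=3)
proposed = checks_for_lookup(9, checks_on_miss=2, checks_on_hit=3)
print(current, proposed)  # prints: 30 21
```

The saving grows with the number of entries searched before the hit, which is why the cost of a miss matters more than the cost of a hit.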

While the PEP is right in saying that a bytecode-only import hook could
be added, I believe it would actually be a little tricky to write one
that didn't severely degrade the performance of either normal imports or
bytecode-only imports. Keeping it in the core import, but turning it off
by default seems much less likely to have unintended performance
consequences when it is switched back on.

Another option is to remove bytecode-only support from the default
filesystem importer, but keep it for zipimport (since the stat call
savings don't apply in the latter case).

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia
---


Re: [Python-Dev] __file__

2010-02-28 Thread Floris Bruynooghe
On Sun, Feb 28, 2010 at 11:07:27PM +1000, Nick Coghlan wrote:
> Michael Foord wrote:
> >> Can't it look for a .py file in the source directory first (1st stat)?
> >> When it's there check for the .pyc in the cache directory (2nd stat,
> >> magic number encoded in filename), if it's not check for .pyc in the
> >> source directory (2nd stat + read for magic number check).  Or am I
> >> missing a subtlety?
> > 
> > The problem is doing this little dance for every path on sys.path.
> 
> To unpack this a little bit for those not quite as familiar with the
> import system (and to make it clear for my own benefit!): for a
> top-level module/package, each path on sys.path needs to be eliminated
> as a possible location before the interpreter can move on to check the
> next path in the list.

Aha, that was the clue I was missing.  Thanks!

Floris


-- 
Debian GNU/Linux -- The Power of Freedom
www.debian.org | www.gnu.org | www.kernel.org


[Python-Dev] Challenge: escape from the pysandbox

2010-02-28 Thread Victor Stinner
Hi,

pysandbox is a new Python sandbox project under development. By default, 
untrusted code executed in the sandbox cannot modify the environment (write a 
file, use print or import a module). But you can configure the sandbox to 
choose exactly which features are allowed, e.g. importing the sys module and 
reading the file /etc/issue.

Website: http://github.com/haypo/pysandbox/

Download the repository using git:
  git clone git://github.com/haypo/pysandbox.git 
or
  git clone http://github.com/haypo/pysandbox.git

Or download the .zip or .tar.gz tarball using the "Download source" button on 
the website.

I think that the project has reached the "testable" stage, so I am launching a 
new challenge: try to escape from the sandbox. I'm unable to write strict rules. 
The goal is to access objects outside the sandbox, e.g. write into a file, 
import a module which is not in the whitelist, modify an object outside the 
sandbox, etc.

To test the sandbox, you have 3 choices:
 - interpreter.py: interactive interpreter executed in the sandbox, use:
     --verbose to display the whole sandbox configuration,
     --features=help to enable the help() function,
     --features=regex to enable regex,
     --help to display the help.
 - execfile.py: execute your script in the sandbox.
   It also has a --features option: use --features=stdout to be able
   to use the print instruction :-)
 - use the Sandbox class directly: use the methods call(), execute()
   or createCallback()

Don't use "with sandbox: ..." because there is a known bug with local frame 
variables. I think that I will later drop this syntax because of this bug. 
Except for debug_sandbox, I consider all features to be safe, so you can 
enable all of them :-)

There is no prize, it's just for fun! But I will add the names of the hackers 
who find the best exploits.

pysandbox is not ready for production, it's under heavy development. Anyway I 
*hope* that you will quickly find bugs!

--

See tests.py for some examples of how you can escape a sandbox. pysandbox 
is protected against all methods described in tests.py ;-)

See the README file to get more information about how pysandbox is implemented 
and get a list of other Python sandboxes.

pysandbox is currently specific to CPython, and it uses some ugly hacks to 
patch CPython in memory. In the worst case it will crash the pysandbox Python 
process, that's all. I tested it under Linux with Python 2.5 and 2.6. The 
port to Python 3 is not done yet (is someone motivated to write a 
patch? :-)).

-- 
Victor Stinner
http://www.haypocalc.com/


Re: [Python-Dev] __file__

2010-02-28 Thread Brett Cannon
On Sun, Feb 28, 2010 at 05:07, Nick Coghlan  wrote:

> Michael Foord wrote:
> >> Can't it look for a .py file in the source directory first (1st stat)?
> >> When it's there check for the .pyc in the cache directory (2nd stat,
> >> magic number encoded in filename), if it's not check for .pyc in the
> >> source directory (2nd stat + read for magic number check).  Or am I
> >> missing a subtlety?
> >
> > The problem is doing this little dance for every path on sys.path.
>
> To unpack this a little bit for those not quite as familiar with the
> import system (and to make it clear for my own benefit!): for a
> top-level module/package, each path on sys.path needs to be eliminated
> as a possible location before the interpreter can move on to check the
> next path in the list.
>
> So the important number is the number of stat calls on a "miss" (i.e.
> when the requested module/package is not present in a directory).
> Currently, with builtin support for bytecode only files, there are 3
> checks (package directory, py source file, pyc/pyo bytecode file) to be
> made for each path entry.
>

Actually it's four: name/__init__.py, name/__init__.pyc, name.py, and then
name.pyc. And just so people have terminology to go with all of this, this
search is what the finder does to say whether it can or cannot handle the
requested module.


>
> The PEP proposes to reduce that to only two in the case of a miss, by
> checking for the cached pyc only if the source file is present (there
> would still be three checks for a "hit", but that only happens at most
> once per module lookup).
>

Just to be explicit, Nick is talking about name/__init__.py and name.py
(note the skipping of looking for any .pyc files). At that point only the
loader needs to check for the bytecode in the __pycache__ directory.
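
The finder/loader split described here can be sketched as follows. The helper names are invented for illustration. `importlib.util.cache_from_source` comes from later Python 3 releases (3.4 and up) and postdates this thread, so treat its use as an assumption about how the proposal ended up looking.

```python
import importlib.util
import os


def find_source(directory, name):
    """Finder step: probe only the two source locations, never a .pyc."""
    candidates = (
        os.path.join(directory, name, "__init__.py"),
        os.path.join(directory, name + ".py"),
    )
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None


def bytecode_location(source_path):
    """Loader step: compute where the cached bytecode for a source lives."""
    return importlib.util.cache_from_source(source_path)


# The interpreter tag in the result depends on the running Python.
print(bytecode_location(os.path.join("pkg", "mod.py")))
```

The point of the split is that the bytecode path is only computed and checked after a source file has been found, so a miss never touches the cache directory.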


>
> While the PEP is right in saying that a bytecode-only import hook could
> be added, I believe it would actually be a little tricky to write one
> that didn't severely degrade the performance of either normal imports or
> bytecode-only imports. Keeping it in the core import, but turning it off
> by default seems much less likely to have unintended performance
> consequences when it is switched back on.
>

It all depends on how it is implemented. If the bytecode-only importer stats
a directory to check for the existence of any source in order to decide not
to handle it, that is an extra stat call, but that is only once per
sys.path/__path__ location by the path hook, not every attempted import.
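
A rough sketch of that idea follows. The class and function names are hypothetical, and the directory is scanned with `os.listdir` rather than a single stat. A path hook is a callable that receives one path entry and either returns a finder or raises ImportError to decline it; the finder here is deliberately incomplete (it lacks the real finder protocol methods) and is not meant to be registered in sys.path_hooks as written.

```python
import os

SOURCE_SUFFIXES = (".py",)


class BytecodeOnlyFinder:
    """Hypothetical finder for directories that contain no source files."""

    def __init__(self, directory):
        self.directory = directory

    def candidate(self, name):
        """Return the path of name.pyc in this directory, or None."""
        path = os.path.join(self.directory, name + ".pyc")
        return path if os.path.isfile(path) else None


def bytecode_only_hook(directory):
    """Path hook: runs once per sys.path/__path__ entry, not per import.

    Declines (raises ImportError) any directory that holds source files, so
    ordinary source directories never pay a per-import cost for this finder.
    """
    if not os.path.isdir(directory):
        raise ImportError("not a directory: %r" % (directory,))
    for entry in os.listdir(directory):
        if entry.endswith(SOURCE_SUFFIXES):
            raise ImportError("source present, declining: %r" % (directory,))
    return BytecodeOnlyFinder(directory)
```

Because the import system caches the hook's answer per path entry, the directory scan is paid once rather than on every attempted import.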

Now if I ever manage to find the time to break up the default importers and
expose them then it should be no more than adding the bytecode-only importer
to the chained finder that already exists (it essentially chains source and
extension modules).


>
> Another option is to remove bytecode-only support from the default
> filesystem importer, but keep it for zipimport (since the stat call
> savings don't apply in the latter case).
>

That's a very nice option. That would isolate it into a single importer that
doesn't impact general performance for everyone else.

-Brett




>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   [email protected]   |   Brisbane, Australia
> ---


Re: [Python-Dev] __file__

2010-02-28 Thread Robert Collins
On Sun, 2010-02-28 at 12:21 -0800, Brett Cannon wrote:
> 
> Actually it's four: name/__init__.py, name/__init__.pyc, name.py, and
> then name.pyc. And just so people have terminology to go with all of
> this, this search is what the finder does to say whether it can or
> cannot handle the requested module.   

Aren't there also:
name.so
namemodule.so

 ?

-Rob




Re: [Python-Dev] __file__

2010-02-28 Thread Baptiste Carvello

Nick Coghlan wrote:
> Another option is to remove bytecode-only support from the default
> filesystem importer, but keep it for zipimport (since the stat call
> savings don't apply in the latter case).

bytecode-only in a zip is used by py2exe, cx_freeze and the like, for space 
reasons. Disabling it would probably hurt them.


However, making a difference between zipimport and the filesystem importer means 
the application will stop working if I unzip the library zip file, which is 
surprising. Unzipping the zip file can be handy when debugging a bug caused by a 
forgotten module.
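
For readers who have not seen the bytecode-only zip layout, here is a minimal sketch of it using today's Python 3 standard library (`py_compile`, `zipfile`, and the zipimport support built into the import system). The module name, archive name and contents are invented for the example, and this is not how py2exe or cx_freeze build their archives.

```python
import importlib
import os
import py_compile
import sys
import tempfile
import zipfile


def build_bytecode_only_zip(workdir, module_name, source_text):
    """Compile source_text and store only the resulting .pyc in a zip."""
    source_path = os.path.join(workdir, module_name + ".py")
    with open(source_path, "w") as handle:
        handle.write(source_text)
    # Write the bytecode next to the source under the plain name.pyc form,
    # which is the name a zip archive without sources is searched for.
    pyc_path = os.path.join(workdir, module_name + ".pyc")
    py_compile.compile(source_path, cfile=pyc_path, doraise=True)
    archive_path = os.path.join(workdir, "library.zip")
    with zipfile.ZipFile(archive_path, "w") as archive:
        archive.write(pyc_path, arcname=module_name + ".pyc")
    return archive_path


with tempfile.TemporaryDirectory() as workdir:
    archive = build_bytecode_only_zip(
        workdir, "demo_bytecode_only", "ANSWER = 42\n")
    sys.path.insert(0, archive)  # only the zip is on sys.path, not workdir
    try:
        module = importlib.import_module("demo_bytecode_only")
        print(module.ANSWER)
    finally:
        sys.path.remove(archive)
```

Unzipping such an archive into a plain directory gives exactly the bytecode-only layout whose filesystem support is being debated in this thread.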


Cheers,
Baptiste



Re: [Python-Dev] __file__

2010-02-28 Thread Nick Coghlan
Brett Cannon wrote:
> Actually it's four: name/__init__.py, name/__init__.pyc, name.py, and
> then name.pyc. And just so people have terminology to go with all of
> this, this search is what the finder does to say whether it can or
> cannot handle the requested module.

Huh, I thought we checked for the directory first and only then checked
for the __init__ module within it (hence the generation of ImportWarning
when we don't find __init__ after finding a correctly named directory).
So a normal miss (i.e. no directory) only needs one stat call.

(However, I'll grant that I haven't looked at this particular chunk of
code in a fairly long time, so I could easily be wrong).

Robert raises a good point about the checks for extension modules as
well - we should get an accurate count here so Barry's PEP can pitch the
proportional reduction in stat calls accurately.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia
---


Re: [Python-Dev] __file__

2010-02-28 Thread Antoine Pitrou
On Sun, 28 Feb 2010 21:45:56 +0100, Baptiste Carvello wrote:
> bytecode-only in a zip is used by py2exe, cx_freeze and the like, for
> space reasons. Disabling it would probably hurt them.

Source code compresses quite well. I'm not sure it would make much of a 
difference. AFAIR, when you create a py2exe distribution, what takes most 
of the place is the interpreter itself as well as any big third-party C 
libraries such as wxWidgets.

Regards

Antoine.



Re: [Python-Dev] __file__

2010-02-28 Thread Greg Ewing

Glenn Linderman wrote:
> if the command line/runpy can do it, the importer could do it.  Just a
> matter of desire and coding.  Whether it is worth pursuing further depends
> on people's perceptions of "kookiness" vs. functional and performance
> considerations.

Having .py files around that aren't source text could
lead to a lot of confusion, given that most platforms
these days decide which application to open for a given
file based solely on the filename extension. I wouldn't
enjoy trying to open a .py file only to have my text
editor blow up because it was actually a binary file.

So on balance I think it's a bit too kooky for my
taste.

--
Greg


Re: [Python-Dev] __file__

2010-02-28 Thread Floris Bruynooghe
On Sun, Feb 28, 2010 at 09:45:56PM +0100, Baptiste Carvello wrote:
> However, making a difference between zipimport and the filesystem
> importer means the application will stop working if I unzip the
> library zip file, which is surprising. Unzipping the zip file can be
> handy when debugging a bug caused by a forgotten module.

That difference exists already, the zipimporter will happily run .pyo
files inside the zipfile even when you're not running with -O or
PYTHONOPTIMIZE.

Regards
Floris

-- 
Debian GNU/Linux -- The Power of Freedom
www.debian.org | www.gnu.org | www.kernel.org


Re: [Python-Dev] __file__

2010-02-28 Thread Greg Ewing

Floris Bruynooghe wrote:
> Can't it look for a .py file in the source directory first (1st stat)?
> When it's there check for the .pyc in the cache directory (2nd stat,
> magic number encoded in filename), if it's not check for .pyc in the
> source directory (2nd stat + read for magic number check).

Yes, although that would then incur higher stat overheads for
people distributing .pyc files. There doesn't seem to be a
way of pleasing everyone.

This is all assuming that the extra stat calls are actually
a problem. Does anyone have any evidence that they would
really take significant time compared to loading the module?
Once you've looked for one file in a given directory, looking
for another one in the same directory ought to be quite fast,
since all the relevant directory blocks will be in the
filesystem cache.
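
One way to start gathering such evidence is to time failed stat calls directly. The sketch below is only a rough measurement under stated assumptions: the helper name is invented, it probes names that do not exist in a scratch directory, and after the first call it measures the warm-cache case only, so it says nothing about cold-cache behaviour.

```python
import os
import tempfile
import time


def time_missing_stats(directory, count=1000):
    """Return the average seconds per os.stat() call on absent names."""
    start = time.perf_counter()
    for index in range(count):
        try:
            os.stat(os.path.join(directory, "missing_%d.py" % index))
        except OSError:
            pass  # the miss is the case being measured
    return (time.perf_counter() - start) / count


with tempfile.TemporaryDirectory() as directory:
    average = time_missing_stats(directory)
    print("%.1f microseconds per failed stat" % (average * 1e6))
```

Comparing this figure with the time taken to import a typical module gives a first estimate of whether the extra probes matter once the directory is cached.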

--
Greg


Re: [Python-Dev] Update xml.etree.ElementTree for Python 2.7 and 3.2

2010-02-28 Thread Florent XICLUNA
2010/2/28 Stefan Behnel 

> I would actually encourage Florent to do the opposite: act now and prepare
> a patch against the latest official ET 1.2 and cET releases (or their SVN
> version respectively) that integrates everything that is considered safe,
> i.e. everything that makes cET compatible with ET and everything that seems
> clearly stable in ET 1.3 and does not break compatibility for existing code
> that uses ET 1.2. If you send that to Fredrik, I expect little opposition
> to making that the base for a 1.2.8 release, which can then be folded back
> into the stdlib.
>
>
I exchanged some e-mails with Fredrik last week. Not sure if it will be
1.2.8 or 1.3, but he is now positive about the goals of the patch. I've
committed all the changes and external fixes to a branch of the Mercurial
repo owned by Fredrik. I'm expecting an answer soon.

Branch based on the official etree repository (Mercurial):
http://bitbucket.org/flox/et-2009-provolone/

Patch based on this branch:
http://codereview.appspot.com/207048
(patch set 7 almost identical to the tip of the Mercurial repo)

-- 
Florent


Re: [Python-Dev] __file__

2010-02-28 Thread Robert Collins
On Mon, 2010-03-01 at 12:35 +1300, Greg Ewing wrote:
> 
> Yes, although that would then incur higher stat overheads for
> people distributing .pyc files. There doesn't seem to be a
> way of pleasing everyone.
> 
> This is all assuming that the extra stat calls are actually
> a problem. Does anyone have any evidence that they would
> really take significant time compared to loading the module?
> Once you've looked for one file in a given directory, looking
> for another one in the same directory ought to be quite fast,
> since all the relevant directory blocks will be in the
> filesystem cache. 

We've done a bunch of testing in bzrlib. Basic things are:
 - statting /is/ expensive *if* you don't use the result.
 - loading code is the main cost *once* you have a hot disk cache

Specifically, stats for files that are *not present* incur page-in costs
for the dentries needed to determine the file is absent. In the special
case of probing for $name.$ext1, ...$ext2, ...$ext3, you generally hit
the same pages and don't incur additional page in costs. (you'll hit the
same page in most file systems when you look for the second and third
entries).

In most file systems stats for files that *are present* also incur a
page-in for the inode of the file. If you then do not read the file,
this is I/O that doesn't really gain anything. 

Being able to disable .py file usage completely, so that only foo.pyc
and foo/__init__.pyc are probed for, could make a very noticeable
difference to the cold cache startup time.

# Startup time for bzr (cold cache):
$ drop-caches
$ time bzr --no-plugins revno
5061

real    0m8.875s
user    0m0.210s
sys     0m0.140s

# Hot cache
$ time bzr --no-plugins revno
5061

real    0m0.307s
user    0m0.250s
sys     0m0.040s


(revno is a small command that reads a small amount of data - just
enough to trigger demand loading of the core repository layers and so
on).

strace timings for those two operations:
cold cache:
$ strace -c bzr --no-plugins revno
5061
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 56.34    0.040000          76       527           read
 28.98    0.020573           9      2273      1905 open
 14.43    0.010248          14       734       625 stat
  0.15    0.000107           0       533           fstat
...

hot cache:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 45.10    0.000368          92         4           getdents
 19.49    0.000159           0       527           read
 16.91    0.000138           1       163           munmap
 10.05    0.000082           2        54           mprotect
  8.46    0.000069           0      2273      1905 open
  0.00    0.000000           0         8           write
  0.00    0.000000           0       367           close
  0.00    0.000000           0       734       625 stat
...

Cheers,
Rob




Re: [Python-Dev] __file__

2010-02-28 Thread Greg Ewing

Robert Collins wrote:
> In the special
> case of probing for $name.$ext1, ...$ext2, ...$ext3, you generally hit
> the same pages and don't incur additional page in costs.

So then looking for a .pyc alongside a .py or vice versa
should be almost free, and we shouldn't be worrying about
it.

> hot cache:
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  45.10    0.000368          92         4           getdents
>   0.00    0.000000           0       734       625 stat

Further supporting the idea that stat calls are negligible
once the cache is warmed up.

--
Greg


[Python-Dev] Draft PEP on RSON configuration file format

2010-02-28 Thread Patrick Maupin
All:

Finding .ini configuration files too limiting, JSON and XML too hard to
manually edit, and YAML too complex to parse quickly, I have started
work on a new configuration file parser.

I call the new format RSON (for "Readable Serial Object Notation"),
and it is designed to be a superset of JSON.

I would love for it to be considered valuable enough to be a part of
the standard library, but even if that does not come to pass, I would
be very interested in feedback to help me polish the specification,
and then possibly help for implementation and testing.

The documentation is in rst PEP form, at:

http://rson.googlecode.com/svn/trunk/doc/draftpep.txt

Thanks and best regards,
Pat


Re: [Python-Dev] Draft PEP on RSON configuration file format

2010-02-28 Thread Benjamin Peterson
2010/2/28 Patrick Maupin :
> All:
>
> Finding .ini configuration files too limiting, JSON and XML too hard to
> manually edit, and YAML too complex to parse quickly, I have started
> work on a new configuration file parser.

In that case, it should live in the user space for several years. If
the community decides that it is an excellent format, then it should
be considered for inclusion in the standard library.



-- 
Regards,
Benjamin


Re: [Python-Dev] Draft PEP on RSON configuration file format

2010-02-28 Thread Patrick Maupin
On Sun, Feb 28, 2010 at 6:29 PM, Benjamin Peterson  wrote:
> In that case, it should live in the user space for several years. If
> the community decides that it is an excellent format, then it should
> be considered for inclusion in the standard library.

Agreed.

However, there are too many things which became de facto standards
without community input this way.  PEP 1 itself says:

Reference Implementation -- The reference implementation must be
completed before any PEP is given status "Final", but it need not be
completed before the PEP is accepted. It is better to finish the
specification and rationale first and reach consensus on it before
writing code.

So, I do not mind the code sitting outside the standard library, and
the PEP not reaching "Final" for several years, but I do believe that
the PEP process is itself a really good way to build a better
mousetrap by consensus.

If you do not care to participate in the building of this particular
mousetrap, that is OK, too.

Regards,
Pat


Re: [Python-Dev] Draft PEP on RSON configuration file format

2010-02-28 Thread Antoine Pitrou
On Sun, 28 Feb 2010 18:59:16 -0600, Patrick Maupin wrote:
> 
> So, I do not mind the code sitting outside the standard library, and
> the PEP not reaching "Final" for several years, but I do believe that
> the PEP process is itself a really good way to build a better
> mousetrap by consensus.

In this case it is *at best* python-ideas material, or even
preferably comp.lang.python.

Just for the record, my only reaction when giving the PEP a glance was
"yet another configuration file format - yawn".

Good luck though,

Antoine.




Re: [Python-Dev] Draft PEP on RSON configuration file format

2010-02-28 Thread Patrick Maupin
On Sun, Feb 28, 2010 at 7:39 PM, Antoine Pitrou  wrote:

> In this case it is *at best* python-ideas material, or even
> preferably comp.lang.python.

I was thinking about comp.lang.python at some point, but thought I
would try here first.

> Just for the record, my only reaction when giving the PEP a glance was
> "yet another configuration file format - yawn".

I suppose I have that sort of reaction about areas I am not interested
in, as well, but currently I am deeply interested in configuration
files due to my circumstances.  In any case, the observation that
there are already several preexisting file formats used for
configuration is certainly covered in the PEP draft, but if you have
anything constructive to add *about* configuration file formats, I
would certainly welcome the input.

Best regards,
Pat


Re: [Python-Dev] Draft PEP on RSON configuration file format

2010-02-28 Thread Antoine Pitrou
On Sun, 28 Feb 2010 19:46:30 -0600, Patrick Maupin wrote:
> 
> I suppose I have that sort of reaction about areas I am not interested
> in, as well, but currently I am deeply interested in configuration
> files due to my circumstances.  In any case, the observation that
> there are already several preexisting file formats used for
> configuration is certainly covered in the PEP draft, but if you have
> anything constructive to add *about* configuration file formats, I
> would certainly welcome the input.

Well, a constructive approach would involve approaching projects
which have devised their own formats, so as to know what kind of
unified format they would be likely to accept (or not).

python-dev is probably not the place for such an approach, however.




Re: [Python-Dev] __file__

2010-02-28 Thread Brett Cannon
On Sun, Feb 28, 2010 at 16:31, Greg Ewing wrote:

> Robert Collins wrote:
>
>> In the special
>> case of probing for $name.$ext1, ...$ext2, ...$ext3, you generally hit
>> the same pages and don't incur additional page in costs.
>>
>
> So then looking for a .pyc alongside a .py or vice versa
> should be almost free, and we shouldn't be worrying about
> it.
>

But that is making the assumption that all filesystems operate this way
(e.g., does NFS have the same performance characteristics?).


>
>  hot cache:
>> % time     seconds  usecs/call     calls    errors syscall
>> ------ ----------- ----------- --------- --------- ----------------
>>  45.10    0.000368          92         4           getdents
>>   0.00    0.000000           0       734       625 stat
>>
>
> Further supporting the idea that stat calls are negligible
> once the cache is warmed up.


But that's the point: once it's warmed up. This is not the case when
executing a script once every once in a while compared to something like bzr,
where you are most likely going to execute the command multiple times within
a small timeframe.
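
To make the cold-versus-warm distinction concrete, here is a minimal sketch of
how one might time the per-directory probing on a given filesystem. The suffix
list, the function names and the repeat count are assumptions made for
illustration; they are not taken from import.c. Note that repeated runs in one
process only measure the warm case, so a cold-cache number would need a fresh
mount or a dropped cache before the first call.

```python
import os
import time

# Assumed candidate suffixes, loosely following the thread; real lists are
# platform dependent.
CANDIDATE_SUFFIXES = (".so", "module.so", ".py", ".pyc")


def probe(directory, name, suffixes=CANDIDATE_SUFFIXES):
    """Stat every candidate file name and return (hits, misses)."""
    hits = misses = 0
    for suffix in suffixes:
        try:
            os.stat(os.path.join(directory, name + suffix))
            hits += 1
        except OSError:
            misses += 1
    return hits, misses


def time_probe(directory, name, repeat=1000):
    """Average wall-clock seconds per probe() call over `repeat` runs."""
    start = time.perf_counter()
    for _ in range(repeat):
        probe(directory, name)
    return (time.perf_counter() - start) / repeat
```

Running time_probe() against a local directory and against an NFS mount would
give a rough answer to the question above for one particular setup.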

-Brett




>
>
> --
> Greg


Re: [Python-Dev] __file__

2010-02-28 Thread Brett Cannon
On Sun, Feb 28, 2010 at 12:46, Nick Coghlan wrote:

> Brett Cannon wrote:
> > Actually it's four: name/__init__.py, name/__init__.pyc, name.py, and
> > then name.pyc. And just so people have terminology to go with all of
> > this, this search is what the finder does to say whether it can or
> > cannot handle the requested module.
>
> Huh, I thought we checked for the directory first and only then checked
> for the __init__ module within it (hence the generation of ImportWarning
> when we don't find __init__ after finding a correctly named directory).
> So a normal miss (i.e. no directory) only needs one stat call.
>
> (However, I'll grant that I haven't looked at this particular chunk of
> code in a fairly long time, so I could easily be wrong).
>
> Robert raises a good point about the checks for extension modules as
> well - we should get an accurate count here so Barry's PEP can pitch the
> proportional reduction in stat calls accurately.
>

Here are the details (from Python/import.c:find_module) assuming that
everything has failed to the point of trying for the implicit sys.path
importers:

stat_info = stat(name)
if stat_info.exists and stat_info.is_dir:
    if stat(name/__init__.py) or stat(name/__init__.pyc):
        load(name)
else:
    # Windows has an extra check for .pyw files.
    for ext in ('.so', 'module.so', '.py', '.pyc'):
        if open(name + ext):
            load(name)

So there are a total of five to six filesystem checks depending on the OS (actually, VMS goes
up to eight!) before a search path is considered not to contain a module.
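
For anyone who wants to count the checks themselves, here is a rough runnable
sketch of the search above. It follows the pseudocode as written rather than
import.c itself; the function name, the returned check counter and the use of
os.path helpers in place of raw stat()/open() calls are assumptions made for
illustration.

```python
import os

# Suffixes from the pseudocode above; the real list is platform dependent.
SUFFIXES = (".so", "module.so", ".py", ".pyc")


def find_in_directory(directory, name, suffixes=SUFFIXES):
    """Search one sys.path entry for `name`.

    Returns (path_or_None, checks), where `checks` counts the filesystem
    probes that were made before the search finished.
    """
    base = os.path.join(directory, name)
    checks = 1  # the initial stat(name)
    if os.path.isdir(base):
        # Package case: look for an __init__ file inside the directory.
        for init in ("__init__.py", "__init__.pyc"):
            checks += 1
            candidate = os.path.join(base, init)
            if os.path.isfile(candidate):
                return candidate, checks
        return None, checks
    # Plain module case: try each suffix in turn.
    for suffix in suffixes:
        checks += 1
        candidate = base + suffix
        if os.path.isfile(candidate):
            return candidate, checks
    return None, checks
```

With the four suffixes listed, a miss in a directory that has no entry called
`name` costs one stat plus four probes, which matches the count of five.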

And thanks to doing this I realized importlib is not stat'ing the directory
first, which should fail faster than checking for the __init__ files every
time.

-Brett





>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   [email protected]   |   Brisbane, Australia
> ---
>


Re: [Python-Dev] __file__

2010-02-28 Thread Brett Cannon
On Sun, Feb 28, 2010 at 12:45, Baptiste Carvello wrote:

> Nick Coghlan wrote:
>
>
>> Another option is to remove bytecode-only support from the default
>> filesystem importer, but keep it for zipimport (since the stat call
>> savings don't apply in the latter case).
>>
>>
> bytecode-only in a zip is used by py2exe, cx_freeze and the like, for space
> reasons. Disabling it would probably hurt them.
>
> However, making a difference between zipimport and the filesystem importer
> means the application will stop working if I unzip the library zip file,
> which is surprising. Unzipping the zip file can be handy when debugging a
> bug caused by a forgotten module.
>
>
Is it really that hard to unzip a bunch of .pyc files, modify what you need
to, and then zip it back up? And if you are given a zip file of only .pyc
files you can't really debug anything anyway.

-Brett




> Cheers,
> Baptiste
>
>


Re: [Python-Dev] The fate of Distutils in Python 2.7

2010-02-28 Thread Brett Cannon
On Fri, Feb 26, 2010 at 14:15, Tarek Ziadé wrote:

> On Fri, Feb 26, 2010 at 11:13 PM, Brett Cannon wrote:
> [..]
> > I assume you want the Distutils2 component to auto-assign to you like
> > Distutils currently does? If so I can add the component for you if people
> > don't object to the new component.
>
> Sounds good -- Thanks
>

Done.

-Brett


Re: [Python-Dev] Draft PEP on RSON configuration file format

2010-02-28 Thread Patrick Maupin
On Sun, Feb 28, 2010 at 7:51 PM, Antoine Pitrou wrote:
> Well, a constructive approach would involve approaching projects
> which have devised their own formats, so as to know what kind of
> unified format they would be likely to accept (or not).

Trying to poll "selected projects which have configuration files" may
or may not be a constructive approach.  Most projects which have
predefined formats are unlikely to change, unless there is
standardization on a new format.  It is very much a chicken and egg
problem, although I agree with (and have implemented) the suggestion
that I discuss this on python-list.

Having said that, one of the reasons I wrote the PEP and am working on
a parser is because of a few projects I use and/or am personally
involved in.  For example, rst2pdf stylesheets are in JSON, e.g.

http://rst2pdf.googlecode.com/svn/trunk/rst2pdf/styles/styles.json

Now, we're all programmers here, and we can read this, and can even
modify it, but it is easy to get wrong, and very verbose with lots of
syntax gotchas.  For example, unlike Python, JSON won't even let you
have a trailing comma.
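
As a small illustration of the trailing-comma gotcha, the sketch below feeds
the same text to the standard json module and to ast.literal_eval. The
stylesheet-like keys and values are made up for the example and are not taken
from the rst2pdf file; the helper names are likewise only illustrative.

```python
import ast
import json

# Made-up stylesheet fragment; note the trailing comma before the brace.
TEXT = '{"fontName": "Helvetica", "fontSize": 10,}'


def parses_as_json(text):
    """True if the standard json module accepts `text`."""
    try:
        json.loads(text)
    except ValueError:  # json.JSONDecodeError is a ValueError subclass
        return False
    return True


def parses_as_python_literal(text):
    """True if `text` is a valid Python literal expression."""
    try:
        ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return False
    return True
```

Here parses_as_json(TEXT) is False while parses_as_python_literal(TEXT) is
True; removing the trailing comma makes both succeed.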

But JSON *is* a great format, and RSON (like YAML) is designed to
parse properly formatted JSON, so the goal is that any project which
uses JSON could use RSON as a drop-in replacement, and then update its
configuration data.

Of course, it is extremely easy (hence your yawn) to create a new
configuration format, even if it is specified that it is upwards
compatible with JSON.  The trick is to create the *correct* new
format, that at least some people can agree on.

In order to do this, I have chosen to poll, not preexisting projects,
which have entrenched configuration data and a reluctance to change,
but brand new projects which haven't been invented yet.  Many of the
inventors of those projects hang out on python-dev, so this seemed
like a reasonable place to do polling.

As I tried to make clear, I will not be too disappointed if I do not
come up with something worthy of the standard library for a long time
(if ever), but the PEP process is very valuable, and I would like to
start off on the right foot by soliciting feedback before I do too
much coding.

Sorry if it feels like spam; this is my last message on the matter
until and unless somebody wants to constructively discuss the actual
contents of the PEP.  Please feel free to email me privately if you
don't want to clutter up this list.

Thanks and best regards,
Pat


Re: [Python-Dev] __file__

2010-02-28 Thread Glenn Linderman
On approximately 2/28/2010 3:22 PM, came the following characters from
the keyboard of Greg Ewing:
> Glenn Linderman wrote:
>> if the command line/runpy can do it, the importer could do it.  Just
>> a matter of desire and coding.  Whether it is worth pursuing further
>> depends on people's perceptions of "kookiness" vs. functional and
>> performance considerations.
>
> Having .py files around that aren't source text could
> lead to a lot of confusion, given that most platforms
> these days decide which application to open for a given
> file based solely on the filename extension. I wouldn't
> enjoy trying to open a .py file only to have my text
> editor blow up because it was actually a binary file.
>
> So on balance I think it's a bit too kooky for my
> taste.


I understand your thoughts, but have some rebuttal comments.  Mind you, 
if there is a better solution that can improve performance for both the 
source+binary and the binary-only distributions, I'm all for it.  But in 
general, I'm all for performance improvements, even if there is some 
kookiness :)  I am thankful for Brett's posting of the actual search code
fragment.


If your text editor blows up because it is binary, it is a sad text editor.

If you have .py mapped to a text editor, that's sort of kooky too; I 
have it mapped to Python.


The .py files that are binary would generally be part of an application 
distribution in binary form, and therefore would be installed in some 
place like /bin or C:\Program Files ... not the place you'd look for 
source code, to confuse your text editor.


--
Glenn -- http://nevcal.com/
===
A protocol is complete when there is nothing left to remove.
-- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking



Re: [Python-Dev] __file__

2010-02-28 Thread Antoine Pitrou
On Sun, 28 Feb 2010 19:32:09 -0800,
Glenn Linderman wrote:
> 
> If your text editor blows up because it is binary, it is a sad text
> editor.
> 
> If you have .py mapped to a text editor, that's sort of kooky too; I 
> have it mapped to Python.

File extensions exist for a reason, even if you find that "kooky" and
have strong ideas about the psychology of text editors.

Having some binary files named "foobar.py" would certainly annoy a lot
of people, including me.
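
To show what a tool would have to do if it could no longer trust the
extension, here is a minimal sketch that checks whether a file starts with the
running interpreter's bytecode magic number. It assumes a Python version that
provides importlib.util.MAGIC_NUMBER, and the function name is made up for the
example. It only recognises bytecode written for the same interpreter version.

```python
import importlib.util


def looks_like_bytecode(path):
    """True if the file starts with this interpreter's bytecode magic number."""
    magic = importlib.util.MAGIC_NUMBER
    with open(path, "rb") as f:
        return f.read(len(magic)) == magic
```

Every editor, cleanup script and file manager that cares about the difference
would need a check of this kind, which is the cost of overloading the suffix.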


Antoine.




Re: [Python-Dev] __file__

2010-02-28 Thread Greg Ewing

Glenn Linderman wrote:

> If your text editor blows up because it is binary, it is a sad text editor.

Blow up is probably an exaggeration, but even just getting
a screen full of gibberish when I think I'm opening a
text file is a jarring experience.

> If you have .py mapped to a text editor, that's sort of kooky too; I
> have it mapped to Python.

On Windows the action for double-clicking is usually
mapped to running the file, but there's typically another
action such as "Open with IDLE" or whatever available,
and a bytecode file named with ".py" would allow you to
apply that action to it.

--
Greg


Re: [Python-Dev] __file__

2010-02-28 Thread Glyph Lefkowitz
On Feb 27, 2010, at 9:38 AM, Nick Coghlan wrote:

> I do like the idea of pulling .pyc only imports out into a separate
> importer, but would go so far as to suggest keeping them as a command
> line option rather than as a separately distributed module.

One advantage of doing this as a separately distributed module is that it can 
have its own ecosystem and momentum.  Most projects that want this sort of 
bundling or packaging really want to be shipped with something like py2exe, and 
I think the folks who want such facilities would be better served by a nice 
project website for "python sealer" or "python bundler" rather than obscure 
directions for triggering the behavior via options or configuration.

Making bytecode loading a feature of interpreter startup, whether it's a config 
file, a command-line option or an environment variable, is not a great idea.  
For folks that want to ship a self-contained application, any of these would 
require an additional customization step, where they need to somehow tell their 
bundled interpreter to load bytecode.  For people trying to ship a 
self-contained and tamper-unfriendly (since even "tamper-resistant" would be 
overstating things) library to relatively non-technical programmers, it opens 
the door to a whole universe of confusion and FAQs about why the code didn't 
load.

However bytecode-only code loading is facilitated, it should be possible to 
bootstrap from a vanilla python interpreter running normally, as you may not 
know you need to load a bytecode-only package at startup.  In the stand-alone 
case there are already plenty of options, and in the library case, shipping a 
zip file should be fine, since the __init__.py of your package should be 
plain-text and also able to trigger the activation of the bytecode-only 
importer.
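
As a rough sketch of what such a bootstrap could look like, the code below
loads a module from a directory while considering only ".pyc" files. It uses
the importlib.machinery classes found in current Python 3 releases, which is
an assumption relative to the interpreters discussed in this thread; the
function names and the "sealed" module are made up for the example.

```python
import importlib.util
import os
import py_compile
import tempfile
from importlib.machinery import FileFinder, SourcelessFileLoader


def load_bytecode_only(directory, name):
    """Import `name` from `directory`, considering only .pyc files."""
    finder = FileFinder(directory, (SourcelessFileLoader, [".pyc"]))
    spec = finder.find_spec(name)
    if spec is None:
        raise ImportError("no bytecode file for %r in %s" % (name, directory))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def demo():
    """Compile a throwaway module, delete its source, and load the .pyc."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "sealed.py")
        with open(src, "w") as f:
            f.write("ANSWER = 42\n")
        py_compile.compile(src, cfile=os.path.join(tmp, "sealed.pyc"),
                           doraise=True)
        os.remove(src)
        return load_bytecode_only(tmp, "sealed").ANSWER
```

A plain-text __init__.py could call a helper like load_bytecode_only() for its
submodules, so nothing has to be configured at interpreter startup.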

There are already so many ways to ship bytecode that it doesn't seem too
important to support it in this one particular configuration (files in a
directory, compiled by just importing them, in the same place as ".py" files).
The real problem is providing a seamless transition path for *build* processes, 
not the Python code itself.  Do any of the folks who are currently using this 
feature have a good idea as to how your build and distribute scripts might 
easily be updated, perhaps by a 2to3 fixer?


Re: [Python-Dev] __file__

2010-02-28 Thread Bugbee, Larry
Greg Ewing wrote:
> Having .py files around that aren't source text could lead 
> to a lot of confusion, given that most platforms these days 
> decide which application to open for a given file based 
> solely on the filename extension. I wouldn't enjoy trying 
> to open a .py file only to have my text editor blow up 
> because it was actually a binary file.
> 
> So on balance I think it's a bit too kooky for my taste.

+1

Add to that the inverse...  I clean up directories based on the suffix,
keeping the .py and deleting .pyc and .pyo.  Overloading a source file suffix
is not good.
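
A cleanup of that kind might look like the sketch below. The function name,
its parameters and the dry-run default are assumptions made for illustration,
not a description of any existing tool.

```python
import os


def remove_compiled(root, suffixes=(".pyc", ".pyo"), dry_run=True):
    """Return the files under `root` whose names end with one of `suffixes`.

    The files are deleted only when dry_run is False. `suffixes` must be a
    tuple because it is passed straight to str.endswith().
    """
    matched = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(suffixes):
                path = os.path.join(dirpath, filename)
                if not dry_run:
                    os.remove(path)
                matched.append(path)
    return sorted(matched)
```

A script like this trusts the suffix completely, so a bytecode file named
".py" would survive the cleanup and no source would be left to regenerate it.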

Larry


Re: [Python-Dev] __file__

2010-02-28 Thread Baptiste Carvello

Brett Cannon wrote:

>> However, making a difference between zipimport and the filesystem
>> importer means the application will stop working if I unzip the
>> library zip file, which is surprising. Unzipping the zip file can be
>> handy when debugging a bug caused by a forgotten module.
>
> Is it really that hard to unzip a bunch of .pyc files, modify what you
> need to, and then zip it back up? And if you are given a zip file of
> only .pyc files you can't really debug anything anyway.




Well, this is a micro-use-case, I admit; I only mention it because it's
something I've really done. It's only useful for debugging the build process,
not the application (so I do have the source at hand), and the only reason for
not rezipping is to test more quickly. I can definitely live without it!


Cheers, Baptiste



Re: [Python-Dev] Update xml.etree.ElementTree for Python 2.7 and 3.2

2010-02-28 Thread Stefan Behnel
Florent XICLUNA, 01.03.2010 00:36:
> I exchanged some e-mails with Fredrik last week. Not sure if it will be
> 1.2.8 or 1.3, but now he is positive on the goals of the patch. I've
> committed all the changes and external fixes to a branch of the Mercurial
> repo owned by Fredrik. I'm expecting an answer soon.

Happy to hear that. Thanks for putting so much work into this!


> Branch based on the official etree repository (Mercurial):
> http://bitbucket.org/flox/et-2009-provolone/

Interesting, I didn't even know Fredrik had continued to work on this. It
even looks like lxml.etree has a bit to catch up API-wise before I release 2.3.

Stefan
