Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-09 Thread Nick Coghlan
Terry Reedy wrote:
> Definitely. I have even wondered whether it would be possible to cache
> not just the bytecode for initializing a module, but also the
> initialized module itself (perhaps minus the name bindings for other
> imported modules).

Not easily, since running the module may have other side effects that
can't be cached.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-08 Thread Terry Reedy

On 2/8/2010 7:54 AM, Nick Coghlan wrote:
> Ron Adam wrote:
>> To tell the truth in most cases I hardly notice the extra time the first
>> run takes compared to later runs with the precompiled byte code.  Yes it
>> may be a few seconds at start up, but after that it's usually not a big
>> part of the execution time.  Hmmm, I wonder if there's a threshold in
>> file size where it really doesn't make a significant difference?
>
> It's relative to runtime for the application itself (long-running
> applications aren't going to notice as much of a percentage effect on
> runtime) as well as to how many Python files are actually imported at
> startup (only importing a limited number of modules, importing primarily
> extension modules or effective use of a lazy module loading mechanism
> will all drastically reduce the proportional impact of precompiled bytecode)
>
> We struggle enough with startup time that doing anything that makes it
> slower is rather undesirable though.


Definitely. I have even wondered whether it would be possible to cache 
not just the bytecode for initializing a module, but also the 
initialized module itself (perhaps minus the name bindings for other 
imported modules).


Terry Jan Reedy




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-08 Thread Nick Coghlan
Ron Adam wrote:
> To tell the truth in most cases I hardly notice the extra time the first
> run takes compared to later runs with the precompiled byte code.  Yes it
> may be a few seconds at start up, but after that it's usually not a big
> part of the execution time.  Hmmm, I wonder if there's a threshold in
> file size where it really doesn't make a significant difference?

It's relative to runtime for the application itself (long-running
applications aren't going to notice as much of a percentage effect on
runtime) as well as to how many Python files are actually imported at
startup (only importing a limited number of modules, importing primarily
extension modules or effective use of a lazy module loading mechanism
will all drastically reduce the proportional impact of precompiled bytecode)

We struggle enough with startup time that doing anything that makes it
slower is rather undesirable though.
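
As a rough way to see the effect being discussed, the sketch below times
compiling a synthetic module from source against loading the same code
object back from its marshalled form, which is approximately what reading
a .pyc involves. The generated source and its size are made up for
illustration, and real numbers depend heavily on the machine and on the
modules involved.

```python
import marshal
import time

# Synthetic "module": 2000 tiny functions (an arbitrary size chosen for the demo).
source = "\n".join(
    "def f{0}(x):\n    return x + {0}".format(i) for i in range(2000)
)

start = time.perf_counter()
code = compile(source, "<demo>", "exec")   # what happens without cached bytecode
compile_seconds = time.perf_counter() - start

blob = marshal.dumps(code)                 # roughly the payload of a .pyc file

start = time.perf_counter()
loaded = marshal.loads(blob)               # what happens with cached bytecode
load_seconds = time.perf_counter() - start

print("compile: {:.4f}s  load: {:.4f}s".format(compile_seconds, load_seconds))
```

Multiplying the difference by the number of modules imported at startup
gives a feel for when the cache matters and when it does not.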

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-07 Thread Ron Adam



Barry Warsaw wrote:
> On Jan 31, 2010, at 01:06 PM, Ron Adam wrote:
>
>> With a single cache directory, we could have an option to force writing
>> bytecode to a desired location.  That might be useful on its own for
>> creating runtime bytecode only installations for installers.
>
> One important reason for wanting to keep the bytecode cache files colocated
> with the source files is that I want to be able to continue to manipulate
> $PYTHONPATH to control how Python finds its modules.  With a single
> system-wide cache directory that won't be easy.  E.g. $PYTHONPATH might be
> hacked to find the source file you expect, but how would that interact with
> how Python finds its cache files?   I'm strongly in favor of keeping the cache
> files as close to the source they were generated from as possible.

Yes, I agree, after thinking about it, it does seem like it may be more
complex than I first thought.


I think the folder-per-folder option sounds like the best default option at
this time.  It reduces folder clutter for the Python developer and may
loosen the link between source files and byte code files just enough that
it will be easier to experiment with more flexible modes later.

It seems to me that in the long run (probably no time soon), it might be
nice to even do away with on-disk byte code altogether unless it's
explicitly asked for. As computers get faster, the time it takes to compile
byte code may become a smaller and smaller percentage of the total run time,
unless the size of Python programs increases at the same rate or faster.


To tell the truth in most cases I hardly notice the extra time the first 
run takes compared to later runs with the precompiled byte code.  Yes it 
may be a few seconds at start up, but after that it's usually not a big 
part of the execution time.  Hmmm, I wonder if there's a threshold in file 
size where it really doesn't make a significant difference?


Regards,
  Ron


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-07 Thread Guido van Rossum
On Sun, Feb 7, 2010 at 12:23 PM, Brett Cannon wrote:
> On Sun, Feb 7, 2010 at 10:44, Barry Warsaw wrote:
>> On Feb 06, 2010, at 04:39 PM, Guido van Rossum wrote:
>>
>>>The conflict is purely that PEP 3147 proposes the new behavior to be
>>>optional, and adds a flag (-R) and an environment variable
>>>($PYTHONPYR) to change it. I presume Barry is proposing this out of
>>>fear that the new behavior might upset somebody; personally I think it
>>>would be better if the behavior weren't optional. At least not in new
>>>Python releases
>>
> >> Good to know!  Yes, that's one reason why I made it optional, the other being
> >> that I suspect most people don't care about the original use case (making
> >> sure pyc files from different Python versions don't conflict).  However,
> >> with a folder-per-folder approach, the side benefit of reducing directory
> >> clutter by hiding all the pyc files becomes more compelling.
>>
>>> -- in backports such as a distribution that wants this
>>>feature might make, it may make sense to be more conservative, or at
>>>least to have a way to turn it off.
>>
> >> For backports I think the most conservative approach is to require a flag to
> >> enable this behavior.  If we make this the default for new versions of Python
> >> (something I'd support) then tools written for Python >= 3.2 will know this
> >> is just how it's done.  I worry about existing deployed tools for Python < 2.7
> >> and 3.1.
> >>
> >> How about this: enable it by default in 3.2 and 2.7.  No option to disable
> >> it.  Allow distro back ports to define a flag or environment variable to
> >> enable it.  The PEP can even be silent about how that's actually done, and a
> >> Debian implementation for Python 2.6 or 3.1 could even use the (now
> >> documented :) -X flag.
>
> Would you keep the old behavior around as well, or simply drop it? I
> personally vote for the latter for simplicity and performance reasons
> (by not having to look in so many places for bytecode), but I can see
> tool people who magically calculate the location of the bytecode not
> loving the idea (another reason why giving loaders a method to return
> all relevant paths is a good idea; no more guessing).

For 3.2 I think it's fine to simply drop the old behavior (as long as
a good loader API is added at the same time).

But for 2.7 I think we ought to be a lot more conservative and not
force tools to upgrade, so I think we should keep the old behavior in
2.7 as the default (though distros can change this if they want to,
and backport if they need to).

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-07 Thread Brett Cannon
On Sun, Feb 7, 2010 at 10:44, Barry Warsaw wrote:
> On Feb 06, 2010, at 04:39 PM, Guido van Rossum wrote:
>
>>The conflict is purely that PEP 3147 proposes the new behavior to be
>>optional, and adds a flag (-R) and an environment variable
>>($PYTHONPYR) to change it. I presume Barry is proposing this out of
>>fear that the new behavior might upset somebody; personally I think it
>>would be better if the behavior weren't optional. At least not in new
>>Python releases
>
> Good to know!  Yes, that's one reason why I made it optional, the other being
> that I suspect most people don't care about the original use case (making sure
> pyc files from different Python versions don't conflict).  However, with a
> folder-per-folder approach, the side benefit of reducing directory clutter by
> hiding all the pyc files becomes more compelling.
>
>> -- in backports such as a distribution that wants this
>>feature might make, it may make sense to be more conservative, or at
>>least to have a way to turn it off.
>
> For backports I think the most conservative approach is to require a flag to
> enable this behavior.  If we make this the default for new versions of Python
> (something I'd support) then tools written for Python >= 3.2 will know this is
> just how it's done.  I worry about existing deployed tools for Python < 2.7
> and 3.1.
>
> How about this: enable it by default in 3.2 and 2.7.  No option to disable it.
> Allow distro back ports to define a flag or environment variable to enable it.
> The PEP can even be silent about how that's actually done, and a Debian
> implementation for Python 2.6 or 3.1 could even use the (now documented :) -X
> flag.

Would you keep the old behavior around as well, or simply drop it? I
personally vote for the latter for simplicity and performance reasons
(by not having to look in so many places for bytecode), but I can see
tool people who magically calculate the location of the bytecode not
loving the idea (another reason why giving loaders a method to return
all relevant paths is a good idea; no more guessing).

-Brett


>
> -Barry
>


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-07 Thread Guido van Rossum
On Sun, Feb 7, 2010 at 10:17 AM, Barry Warsaw wrote:
> On Jan 31, 2010, at 11:34 PM, Nick Coghlan wrote:
>
>>I must admit I quite like the __pyr__ directory approach as well. Since
>>the interpreter knows the suffix it is looking for, names shouldn't
>>conflict. Using a single directory allows the name to be less cryptic,
>>too (e.g. __pycache__).
>
> Something else that occurs to me; the name of the directory (under
> folder-per-folder approach) probably ought to be the same as the name of the
> module attribute.  There's probably no good reason to make it different, and
> making it the same makes the association stronger.

I'm not sure I follow. The directory doesn't suddenly become an
attribute. Moreover, the directory contains many files (assuming
folder-per-folder) and the attribute would point to a single file
inside that directory.

> That still gives us plenty of opportunity to bikeshed, but __pycache__ seems
> reasonable to me (it's the cache of parsing and compiling the .py file).

While technically it is a cache, I don't think that emphasizing that
point is helpful. For 20 years people have thought of it as "compiled
bytecode".

Also while on the filesystem it makes sense for it to have "py" in the
directory name, that does not make sense for the attribute name. After
all we don't go around calling things __pyfile__, __pygetattr__,
__pysys__... ;-)

I'm still for __compiled__ as the attribute; I don't have a particular
preference for the directory name or the naming scheme used inside it,
as long as neither starts with '.' (and probably the directory should
be __something__).

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-07 Thread M.-A. Lemburg
Barry Warsaw wrote:
> On Feb 03, 2010, at 11:59 AM, M.-A. Lemburg wrote:
> 
>> How about using an optionally relative cache dir setting to let
>> the user decide ?
> 
> Why do we need that level of flexibility?

It's very easy to implement (see the code I posted) and gives
you a lot of control with a single env variable.

Some use cases:

1. PYTHONCACHE=. (store the cache files in the same dir as the
   .py file)

   This setting mimics what we've had in Python for decades. Users
   know about this Python behavior and expect it.

   It's also the only reasonable way of shipping byte-code only
   packages.

2. PYTHONCACHE=.pycache (store the cache files in a subdir of the
   dir where the .py file is stored)

   When using lots of cache files for multiple Python versions or
   variants, the .py source code directory can easily get cluttered
   with too many such files.

   Putting them into a subdir solves this problem. This would be
   useful for developers running and testing the code with different
   Python versions.

3. PYTHONCACHE=~/.python/cache (store the cache files in a user dir,
   outside the Python source file dir)

   This allows easy removal of all cache files and prevents
   cluttering up the sys.path dirs with cache files or directories
   altogether.

   It's also handy if the source code dirs are not writable by
   the user importing them. OTOH, every user would create a copy
   of the cache files (this is what currently happens with setuptools
   eggs and is very annoying).
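
The three use cases above could be resolved by something like the sketch
below. This is not the code referred to earlier in this message; the
function name cache_path is hypothetical, and mirroring the source
directory under an absolute cache dir (so that same-named modules do not
collide) is an assumption the message does not spell out.

```python
import os


def cache_path(source_path, pythoncache="."):
    """Return where the bytecode for source_path would be cached.

    pythoncache stands in for the value of the proposed PYTHONCACHE variable.
    """
    source_dir, source_name = os.path.split(os.path.abspath(source_path))
    cache_name = source_name + "c"                  # foo.py -> foo.pyc
    setting = os.path.expanduser(pythoncache)       # handles ~/.python/cache

    if os.path.isabs(setting):
        # Use case 3: a cache dir outside the source tree.
        # Assumption: mirror the source directory below the cache dir.
        relative = os.path.splitdrive(source_dir)[1].lstrip(os.sep)
        return os.path.join(setting, relative, cache_name)

    # Use cases 1 and 2: relative to the directory holding the .py file.
    return os.path.normpath(os.path.join(source_dir, setting, cache_name))


for setting in (".", ".pycache", "~/.python/cache"):
    print(setting, "->", cache_path("/src/pkg/foo.py", setting))
```

A relative setting keeps the cache next to the source, so manipulating
sys.path still moves source and cache together; an absolute setting trades
that for easy cleanup.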

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-07 Thread Barry Warsaw
On Feb 07, 2010, at 05:59 PM, Michael Foord wrote:

>On 07/02/2010 17:48, Barry Warsaw wrote:
>> [snip...]
>>> And I propose not to disturb this in 2.7, at least not by default. I'm
>>> fine though with a flag or distro-overridable config setting to change
>>> this behavior.
>>>  
>> Cool.  I'm not sure this is absolutely necessary for Debian/Ubuntu, so I'll
>> call YAGNI on it for 2.x (until and unless it isn't ;).

Sorry, I was calling YAGNI on any change in behavior of module.__file__.

>What are the chances of getting this into 2.x at all? For it to get into 
>the 2.7, likely to be the last major version in the 2.x series, the PEP 
>needs to be approved and the implementation needs to be feature complete 
>by April 3rd (first beta release according to the schedule [1]).

I'd like to consult with my Debian/Ubuntu Python maintainer colleagues to see
if it's worth getting into 2.7.  If it is, and we can get a BDFL pronouncement
on the PEP (after the next rounds of updates), then I think it will be
feasible to implement in the time remaining.  Heck, that's what Pycon sprints
are for, no? :)

-Barry




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-07 Thread Barry Warsaw
On Feb 06, 2010, at 04:39 PM, Guido van Rossum wrote:

>The conflict is purely that PEP 3147 proposes the new behavior to be
>optional, and adds a flag (-R) and an environment variable
>($PYTHONPYR) to change it. I presume Barry is proposing this out of
>fear that the new behavior might upset somebody; personally I think it
>would be better if the behavior weren't optional. At least not in new
>Python releases

Good to know!  Yes, that's one reason why I made it optional, the other being
that I suspect most people don't care about the original use case (making sure
pyc files from different Python versions don't conflict).  However, with a
folder-per-folder approach, the side benefit of reducing directory clutter by
hiding all the pyc files becomes more compelling.

> -- in backports such as a distribution that wants this
>feature might make, it may make sense to be more conservative, or at
>least to have a way to turn it off.

For backports I think the most conservative approach is to require a flag to
enable this behavior.  If we make this the default for new versions of Python
(something I'd support) then tools written for Python >= 3.2 will know this is
just how it's done.  I worry about existing deployed tools for Python < 2.7
and 3.1.

How about this: enable it by default in 3.2 and 2.7.  No option to disable it.
Allow distro back ports to define a flag or environment variable to enable it.
The PEP can even be silent about how that's actually done, and a Debian
implementation for Python 2.6 or 3.1 could even use the (now documented :) -X
flag.

-Barry




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-07 Thread Barry Warsaw
On Feb 06, 2010, at 04:02 PM, Guido van Rossum wrote:

>On Sat, Feb 6, 2010 at 3:28 PM, Barry Warsaw wrote:
>> On Feb 01, 2010, at 02:04 PM, Paul Du Bois wrote:
>>
>>>It's an interesting challenge to write the file in such a way that
>>>it's safe for a reader and writer to co-exist. Like Brett, I
>>>considered an append-only scheme, but one needs to handle the case
>>>where the bytecode for a particular magic number changes. At some
>>>point you'd need to sweep garbage from the file. All solutions seem
>>>unnecessarily complex, and unnecessary since in practice the case
>>>should not come up.
>>
>> I don't think that part's difficult.  The byte code's only going to change if
>> the source file has changed, and in that case, /all/ the byte code in the 
>> "fat
>> pyc" file will be invalidated, so the whole thing can be deleted by the first
>> writer.  I'd worked that out in the original fat pyc version of the PEP.
>
>I'm sorry, but I'm totally against fat bytecode files. They make
>things harder for all tools. The beauty of the existing bytecode
>format is that it's totally trivial: magic number, source mtime,
>unmarshalled code object. You can't beat the beauty of that.

Just for the record, I totally agree.  I was just explaining something I had
figured out in the original version of the PEP, which wasn't published but
which Martin had seen an early draft of.  When Martin made the suggestion of
sibling cache directories, I immediately realized that it was much cleaner,
better, and easier to implement than fat files (especially because I already
had some nasty complex code that implemented the fat files ;).  I'm beginning
to be convinced that a folder-per-folder approach is the best take on
this yet.

>For the traditional "skinny" bytecode files, I believe that the
>existing algorithm which writes zeros in the place of the magic number
>first, writes the rest of the file, and then goes back to write the
>correct magic number, is correct with a single writer and multiple
>readers (assuming the readers ignore the file if its magic number is
>invalid). The creat(O_EXCL) option ensures that there won't be
>multiple writers. No rename() is necessary; POSIX rename() may be
>atomic, but it's a directory modification which makes it potentially
>slow.

Agreed, and the current approach is time and battle tested.  I don't think we
need to be mucking around with it.
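
For readers unfamiliar with the scheme quoted above, here is a minimal
sketch of the idea in Python. It is not the interpreter's actual
implementation: the function names, the placeholder MAGIC value and the
4-byte little-endian mtime field are assumptions made for illustration.

```python
import os
import struct

MAGIC = b"\x03\xf3\r\n"   # placeholder 4-byte magic number, chosen for illustration


def write_bytecode(path, source_mtime, payload):
    """Write a bytecode file so that readers never trust a partial file."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL | getattr(os, "O_BINARY", 0)
    # O_EXCL makes the open fail if the file already exists: one writer only.
    fd = os.open(path, flags, 0o644)
    with os.fdopen(fd, "wb") as f:
        f.write(b"\0" * len(MAGIC))                  # invalid magic for now
        f.write(struct.pack("<I", source_mtime & 0xFFFFFFFF))
        f.write(payload)                             # the code object's bytes
        f.flush()
        f.seek(0)
        f.write(MAGIC)                               # file becomes valid here


def read_bytecode(path):
    """Return (source_mtime, payload), or None if missing or incomplete."""
    try:
        with open(path, "rb") as f:
            data = f.read()
    except FileNotFoundError:
        return None
    if len(data) < 8 or data[:4] != MAGIC:
        return None                                  # writer has not finished
    (source_mtime,) = struct.unpack("<I", data[4:8])
    return source_mtime, data[8:]
```

A reader that sees zeros in place of the magic number simply falls back to
compiling from source, which is why no rename() is needed.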

My current effort on this PEP will be spent on fleshing out the
folder-per-folder approach, understanding the implications of that, and
integrating all the other great comments in this thread.

-Barry




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-07 Thread Barry Warsaw
On Feb 01, 2010, at 08:26 AM, Tim Delaney wrote:

>The pyc/pyo files are just an optimisation detail, and are essentially
>temporary. Given that, if they were to live in a single directory, to me it
>seems obvious that the default location for that should be in the system
>temporary directory. I can immediately think of the following advantages:
>
>1. No one really complains too much about putting things in /tmp unless it
>starts taking up too much space. In which case they delete it and if it gets
>reused, it gets recreated.

IIUC the Filesystem Hierarchy Standard, then these files really
should go under /var/cache/python.  (Don't ask me where that would be on
non-FHS compliant systems such as Windows.)  I've explained in other
followups why I don't particularly like separating the source from the cache
files though, but if you wanted a sick approach:

Take the full absolute path to the .py file, plus the magic number, plus the
time stamp and hash that.  Cache the pyc file under /var/cache/python/.
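
A sketch of that hashing idea follows. The choice of SHA-256, the way the
three fields are joined, and the function name are all assumptions; the
message above only says to hash the path, magic number and time stamp.

```python
import hashlib
import os

CACHE_ROOT = "/var/cache/python"   # location suggested in the message above


def hashed_cache_path(source_path, magic, mtime):
    """Map (absolute source path, magic number, mtime) to one cache file name."""
    key = "\0".join([
        os.path.abspath(source_path),
        magic.hex(),                 # magic number as hex text
        str(int(mtime)),             # source time stamp in whole seconds
    ])
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return os.path.join(CACHE_ROOT, digest + ".pyc")


print(hashed_cache_path("/src/pkg/foo.py", b"\x03\xf3\r\n", 1265500000))
```

Because the time stamp is part of the key, editing the source yields a new
file name, so stale entries are never read but do accumulate until
something sweeps the directory.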

-Barry




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-07 Thread Barry Warsaw
On Jan 31, 2010, at 11:34 PM, Nick Coghlan wrote:

>I must admit I quite like the __pyr__ directory approach as well. Since
>the interpreter knows the suffix it is looking for, names shouldn't
>conflict. Using a single directory allows the name to be less cryptic,
>too (e.g. __pycache__).

Something else that occurs to me; the name of the directory (under
folder-per-folder approach) probably ought to be the same as the name of the
module attribute.  There's probably no good reason to make it different, and
making it the same makes the association stronger.

That still gives us plenty of opportunity to bikeshed, but __pycache__ seems
reasonable to me (it's the cache of parsing and compiling the .py file).

-Barry




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-07 Thread Barry Warsaw
On Jan 31, 2010, at 08:10 PM, Silke von Bargen wrote:

>Martin v. Löwis schrieb:
>> There is also the issue of race conditions with multiple simultaneous
>> accesses. The original format for the PEP had race conditions for
>> multiple simultaneous writers; ZIP will also have race conditions for
>> concurrent readers/writers (as any new writer will have to overwrite
>> the central directory, making the zip file temporarily unavailable -
>> unless they copy it, in which case we are back to writer/writer
>> races).
>>
>> Regards,
>> Martin
>>
>>   
>Good point. OTOH the probability for this to happen actually is very small.

And yet, when it does happen, it's probably a monster to debug and defend
against.   Unless we have a convincing cross-platform story for preventing
these race conditions, I think a single-file (e.g. zipfile) approach is
infeasible.

-Barry




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-07 Thread Michael Foord

On 07/02/2010 17:48, Barry Warsaw wrote:
> [snip...]
>> And I propose not to disturb this in 2.7, at least not by default. I'm
>> fine though with a flag or distro-overridable config setting to change
>> this behavior.
>
> Cool.  I'm not sure this is absolutely necessary for Debian/Ubuntu, so I'll
> call YAGNI on it for 2.x (until and unless it isn't ;).

What are the chances of getting this into 2.x at all? For it to get into 
the 2.7, likely to be the last major version in the 2.x series, the PEP 
needs to be approved and the implementation needs to be feature complete 
by April 3rd (first beta release according to the schedule [1]).


Michael Foord

[1] http://www.python.org/dev/peps/pep-0373/#release-schedule

--
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

READ CAREFULLY. By accepting and reading this email you agree, on behalf of 
your employer, to release me from all obligations and waivers arising from any 
and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, 
clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and 
acceptable use policies (“BOGUS AGREEMENTS”) that I have entered into with your 
employer, its partners, licensors, agents and assigns, in perpetuity, without 
prejudice to my ongoing rights and privileges. You further represent that you 
have the authority to release me from any BOGUS AGREEMENTS on behalf of your 
employer.




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-07 Thread Barry Warsaw
On Jan 31, 2010, at 01:06 PM, Ron Adam wrote:

>With a single cache directory, we could have an option to force writing 
>bytecode to a desired location.  That might be useful on its own for
>creating runtime bytecode only installations for installers.

One important reason for wanting to keep the bytecode cache files colocated
with the source files is that I want to be able to continue to manipulate
$PYTHONPATH to control how Python finds its modules.  With a single
system-wide cache directory that won't be easy.  E.g. $PYTHONPATH might be
hacked to find the source file you expect, but how would that interact with
how Python finds its cache files?   I'm strongly in favor of keeping the cache
files as close to the source they were generated from as possible.

-Barry





Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-07 Thread Barry Warsaw
On Feb 04, 2010, at 03:00 PM, Glenn Linderman wrote:

>When a PEP 3147 (if modified by my suggestion) version of Python runs, 
>and the directory doesn't exist, and it wants to create a .pyc, it would 
>create the directory, and put the .pyc there.  Sort of just like how it 
>creates .pyc files, now, but an extra step of creating the repository 
>directory if it doesn't exist.  After the first run, it would exist.  It 
>is described in the PEP, and I quoted that section... "Python will 
>create a 'foo.pyr' directory"... I'm just suggesting different semantics 
>for how many directories, and what is contained in them.

I've added __pyr_version__ as an open question in the PEP (not yet committed),
as is making this default behavior (no -R flag required).
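
The create-on-first-write behaviour described in the quoted paragraph
could be sketched as below. The helper name, the __pyr__ directory name
(taken from elsewhere in this thread) and the foo.<tag>.pyc file naming
are assumptions for illustration; none of them had been settled at this
point in the discussion.

```python
import os


def store_bytecode(source_path, data, repo_name="__pyr__", tag="tag"):
    """Write data into a per-folder repository next to source_path."""
    source_dir, source_name = os.path.split(source_path)
    repo = os.path.join(source_dir, repo_name)

    # First run: the directory does not exist yet, so create it.
    # Later runs: exist_ok=True makes this a no-op.
    os.makedirs(repo, exist_ok=True)

    stem = os.path.splitext(source_name)[0]
    target = os.path.join(repo, "{}.{}.pyc".format(stem, tag))
    with open(target, "wb") as f:
        f.write(data)
    return target
```

Including a per-interpreter tag in the file name is what would let several
Python versions share one directory without overwriting each other.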

-Barry






Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-07 Thread Barry Warsaw
On Feb 06, 2010, at 02:20 PM, Guido van Rossum wrote:

>> Upon further reflection, I agree.  __file__ also points to the source in
>> Python 2.7.
>
>Not in the 2.7 svn repo I have access to. It still points to the .pyc
>file if it was used.

Ah, I was fooled by a missing pyc file.  Run it a second time and you're
right, it points to the pyc.

>And I propose not to disturb this in 2.7, at least not by default. I'm
>fine though with a flag or distro-overridable config setting to change
>this behavior.

Cool.  I'm not sure this is absolutely necessary for Debian/Ubuntu, so I'll
call YAGNI on it for 2.x (until and unless it isn't ;).

>> Do we need an attribute to point to the compiled bytecode file?
>
>I think we do. Quite unrelated to this discussion I have a use case
>for knowing easily whether a module was actually loaded from bytecode
>or not -- but I also have a need for __file__ to point to the source.
>So having both __file__ and __compiled__ makes sense to me.

__compiled__ or __cached__?  I like the latter but don't have strong feelings
about it either way.
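
For readers on a current Python 3, the source/bytecode split discussed
here can be inspected with importlib.util.cache_from_source(), which maps
a source path to the bytecode path the import system would use. The path
/src/pkg/foo.py and the data.txt name below are made up, and the exact
result depends on the interpreter's cache tag and configuration, so treat
this as a sketch rather than a statement of what the PEP specified at this
point in the thread.

```python
import importlib.util
import os

source = os.path.join(os.sep, "src", "pkg", "foo.py")   # hypothetical source file

# Where this interpreter would look for the compiled form of that file.
cached = importlib.util.cache_from_source(source)

# Data files are normally located relative to the source, not the bytecode.
data_file = os.path.join(os.path.dirname(source), "data.txt")

print(source)
print(cached)
print(data_file)
```

Keeping the source location and the bytecode location as separate values
is what lets data-file lookups keep working wherever the bytecode lives.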

>When there is no source code but only bytecode I am fine with both
>pointing to the bytecode; in that case I presume that the bytecode is
>not in a __pyr__ subdirectory. For dynamically loaded extension
>modules I think both should be left unset, and some other __xxx__
>variable could point to the .so or .dll file. FWIW the most common use
>case for __file__ is probably to find data files relative to it. Since
>the data won't be in the __pyr__ directory we couldn't make __file__
>point to the __pyr__/pyc file without much code breakage.

The other main use case for having such an attribute on extension modules is
diagnostics.  I want to be able to find out where on the file system a .so
actually lives:

Python 2.7a3+ (trunk:78030, Feb  6 2010, 15:18:29) 
[GCC 4.4.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import _socket
>>> _socket.__file__
'/home/barry/projects/python/trunk/build/lib.linux-x86_64-2.7/_socket.so'

>(Yes, I am still in favor of the folder-per-folder model.)

Cool.
-Barry




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-06 Thread Nick Coghlan
exar...@twistedmatrix.com wrote:
> On 08:21 pm, ba...@python.org wrote:
>> On Feb 03, 2010, at 01:17 PM, Guido van Rossum wrote:
>>> Can you clarify? In Python 3, __file__ always points to the source.
>>> Clearly that is the way of the future. For 99.99% of uses of __file__,
>>> if it suddenly never pointed to a .pyc file any more (even if one
>>> existed) that would be just fine. So what's this talk of switching to
>>> __source__?
>>
>> Upon further reflection, I agree.  __file__ also points to the source in
>> Python 2.7.  Do we need an attribute to point to the compiled bytecode
>> file?
> 
> What if, instead of trying to annotate the module object with this
> assortment of metadata - metadata which depends on lots of things, and
> can vary from interpreter to interpreter, and even from module to module
> (depending on how it was loaded) - we just stuck with the __loader__
> annotation, and encouraged/allowed/facilitated the use of the loader
> object to learn all of this extra information?

Trickier than it sounds. In the case of answering the question "was this
module loaded from bytecode or not?", the loader will need somewhere to
store the answer for each file.

The easiest per-module store is the module's own global namespace - the
loader's own attribute namespace isn't appropriate, since one loader may
handle multiple modules.

The filesystem can't be used as a reference because even when the file
is loaded from source, the bytecode file will usually be created as a
side effect.
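
A rough sketch of that argument, assuming a toy in-memory loader (SketchLoader
and the __loaded_from_cache__ name are made up for illustration and are not
part of any real import API). One loader instance serves several modules, and
the cache entry gets created as a side effect of a source load, so the only
reliable place to record the answer is each module's own namespace:

```python
import types


class SketchLoader:
    """Hypothetical loader: one instance handles many modules."""

    def __init__(self, sources):
        self.sources = sources        # module name -> source text
        self.bytecode_cache = {}      # stands in for the .pyc files on disk

    def load_module(self, name):
        module = types.ModuleType(name)
        code = self.bytecode_cache.get(name)
        from_cache = code is not None
        if code is None:
            code = compile(self.sources[name], name + ".py", "exec")
            # Side effect of a source load: the cache entry now exists, so
            # checking the cache afterwards says nothing about this load.
            self.bytecode_cache[name] = code
        # Per-module answer stored in the module's global namespace.
        module.__dict__["__loaded_from_cache__"] = from_cache
        exec(code, module.__dict__)
        return module


loader = SketchLoader({"spam": "x = 1", "eggs": "y = 2"})
first = loader.load_module("spam")    # compiled from source
second = loader.load_module("spam")   # served from the cache
other = loader.load_module("eggs")    # different module, same loader
```

After the first load "spam" is already in the cache, which is why looking at
the cache (or the filesystem) after the fact cannot tell the two loads apart.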

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-06 Thread Ben Finney
Nick Coghlan  writes:

> The more decoupled they are, the harder it is to manually find the
> bytecode file.

Okay. So it's not so much about “predictable”, but rather about
“predictable by a human without too much cognitive effort”.

I can see value in that, though it's best to be explicit that this is a
goal (to be clear that “a program can tell you where they live” isn't a
solution).

> It's a fairly significant increase in mental overhead. It gets much
> worse if the location of the shadow hierarchy root is configurable in
> any way (e.g. based on sys.path contents or an environment variable).
>
> Restricting the caching mechanism to the folder containing the source
> file keeps things a lot simpler.

Simpler for the human working on the source code; not for the human
trying to fit this scheme in with an OS package management system.
(Again, I'm just clarifying and making the contrast explicit, not
judging relative values.)

This makes it clearer to me that there is a glaring incompatibility
between this desire for “keep the compiled bytecode files close to the
source files” versus “decouple the locations so the OS package manager
can do its job of managing installed files”.

I recognise after earlier discussion in this thread that's not an issue
being addressed by PEP 3147.

-- 
 \ “Those are my principles. If you don't like them I have |
  `\others.” —Groucho Marx |
_o__)  |
Ben Finney



Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-06 Thread Guido van Rossum
On Sat, Feb 6, 2010 at 5:10 PM, Nick Coghlan  wrote:
> Ben Finney wrote:
>> Right; I don't see who would disagree with that. I don't see any
>> conflict between “decouple compiled bytecode file locations from source
>> file locations” versus “predictable location for the compiled bytecode
>> files”.
>
> The more decoupled they are, the harder it is to manually find the
> bytecode file.
>
> With the current .pyc scheme, .pyr folders or an SVN style Python cache
> directory, finding the bytecode file is pretty easy, since the cached
> file is either in the same directory as the source file or in a
> subdirectory.
>
> With any form of shadow hierarchy though, it gets trickier because you
> have to:
> 1. Find the root of the shadow hierarchy
> 2. Navigate within the shadow hierarchy down to the point that matches
> where your source file was
>
> It's a fairly significant increase in mental overhead. It gets much
> worse if the location of the shadow hierarchy root is configurable in
> any way (e.g. based on sys.path contents or an environment variable).
>
> Restricting the caching mechanism to the folder containing the source
> file keeps things a lot simpler.

Great way of explaining why the basic folder-per-folder model wins
over the folder-per-sys.path-entry model! The basic folder-per-folder
model doesn't need to know what sys.path is. (And I hadn't followed
previous messages in the thread with enough care to understand the
subtle implications of Ben's point. Sorry!)

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-06 Thread Nick Coghlan
Ben Finney wrote:
> Right; I don't see who would disagree with that. I don't see any
> conflict between “decouple compiled bytecode file locations from source
> file locations” versus “predictable location for the compiled bytecode
> files”.

The more decoupled they are, the harder it is to manually find the
bytecode file.

With the current .pyc scheme, .pyr folders or an SVN style Python cache
directory, finding the bytecode file is pretty easy, since the cached
file is either in the same directory as the source file or in a
subdirectory.

With any form of shadow hierarchy though, it gets trickier because you
have to:
1. Find the root of the shadow hierarchy
2. Navigate within the shadow hierarchy down to the point that matches
where your source file was

It's a fairly significant increase in mental overhead. It gets much
worse if the location of the shadow hierarchy root is configurable in
any way (e.g. based on sys.path contents or an environment variable).

Restricting the caching mechanism to the folder containing the source
file keeps things a lot simpler.
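
To make the difference concrete, here is a small sketch. The function names,
the "__pyr__" directory name and the example paths are assumptions chosen for
illustration. The local lookup needs only the source path, while the shadow
lookup also needs to know which sys.path entry the file was found under and
where the shadow root lives:

```python
from pathlib import PurePosixPath


def local_cache_dir(source):
    """Cache directory next to the source file: needs only the source path."""
    return PurePosixPath(source).parent / "__pyr__"


def shadow_cache_dir(source, path_entry, shadow_root):
    """Cache directory inside a shadow hierarchy.

    Step 1 is knowing shadow_root; step 2 is re-deriving the position of
    the source directory relative to the sys.path entry it came from.
    Raises ValueError if source does not live under path_entry.
    """
    relative = PurePosixPath(source).parent.relative_to(path_entry)
    return PurePosixPath(shadow_root) / relative


src = "/usr/lib/python/site-packages/pkg/sub/mod.py"
local = local_cache_dir(src)
shadow = shadow_cache_dir(src, "/usr/lib/python/site-packages", "/var/cache/python")
```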

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-06 Thread Guido van Rossum
On Sat, Feb 6, 2010 at 4:27 PM, Ben Finney  wrote:
> Barry Warsaw  writes:
>> I agree. I'd prefer to have a predictable place for the cached files,
>> independent of having to run Python to tell you where that is.
>
> Right; I don't see who would disagree with that. I don't see any
> conflict between “decouple compiled bytecode file locations from source
> file locations” versus “predictable location for the compiled bytecode
> files”.

The conflict is purely that PEP 3147 proposes the new behavior to be
optional, and adds a flag (-R) and an environment variable
($PYTHONPYR) to change it. I presume Barry is proposing this out of
fear that the new behavior might upset somebody; personally I think it
would be better if the behavior weren't optional. At least not in new
Python releases -- in backports such as a distribution that wants this
feature might make, it may make sense to be more conservative, or at
least to have a way to turn it off.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-06 Thread Ben Finney
Barry Warsaw  writes:

> On Feb 03, 2010, at 11:07 PM, Nick Coghlan wrote:
>
> >It's also the case that having to run Python to manage my own
> >filesystem would be very annoying.
[…]

Files that are problematic wouldn't need Python to manage any more than
currently. The suggestion was just that, a suggestion for Python to
expose information to assist; it wouldn't be required.

> I agree. I'd prefer to have a predictable place for the cached files,
> independent of having to run Python to tell you where that is.

Right; I don't see who would disagree with that. I don't see any
conflict between “decouple compiled bytecode file locations from source
file locations” versus “predictable location for the compiled bytecode
files”.

-- 
 \ “All television is educational television. The question is: |
  `\   what is it teaching?” —Nicholas Johnson |
_o__)  |
Ben Finney



Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-06 Thread Guido van Rossum
On Sat, Feb 6, 2010 at 3:28 PM, Barry Warsaw  wrote:
> On Feb 01, 2010, at 02:04 PM, Paul Du Bois wrote:
>
>>It's an interesting challenge to write the file in such a way that
>>it's safe for a reader and writer to co-exist. Like Brett, I
>>considered an append-only scheme, but one needs to handle the case
>>where the bytecode for a particular magic number changes. At some
>>point you'd need to sweep garbage from the file. All solutions seem
>>unnecessarily complex, and unnecessary since in practice the case
>>should not come up.
>
> I don't think that part's difficult.  The byte code's only going to change if
> the source file has changed, and in that case, /all/ the byte code in the "fat
> pyc" file will be invalidated, so the whole thing can be deleted by the first
> writer.  I'd worked that out in the original fat pyc version of the PEP.

I'm sorry, but I'm totally against fat bytecode files. They make
things harder for all tools. The beauty of the existing bytecode
format is that it's totally trivial: magic number, source mtime,
marshalled code object. You can't beat the beauty of that.
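
As an illustration of how little there is to it, a minimal sketch that builds
and parses such a three-field blob in memory. The helper names are invented
here, and the layout follows the description in this message only (4-byte
magic, 4-byte little-endian mtime, marshalled code); later CPython releases
put additional fields in the header, so this is not a reader for current .pyc
files:

```python
import importlib.util
import marshal
import struct


def build_skinny_pyc(source_text, filename, mtime):
    """Return bytes laid out as: magic number, source mtime, marshalled code."""
    code = compile(source_text, filename, "exec")
    header = importlib.util.MAGIC_NUMBER + struct.pack("<I", mtime)
    return header + marshal.dumps(code)


def parse_skinny_pyc(data):
    """Split the blob back into its three parts."""
    magic = data[:4]
    (mtime,) = struct.unpack("<I", data[4:8])
    code = marshal.loads(data[8:])
    return magic, mtime, code


blob = build_skinny_pyc("answer = 6 * 7", "example.py", 1265500800)
magic, mtime, code = parse_skinny_pyc(blob)
namespace = {}
exec(code, namespace)   # running the unmarshalled code object
```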

For the traditional "skinny" bytecode files, I believe that the
existing algorithm which writes zeros in the place of the magic number
first, writes the rest of the file, and then goes back to write the
correct magic number, is correct with a single writer and multiple
readers (assuming the readers ignore the file if its magic number is
invalid). The creat(O_EXCL) option ensures that there won't be
multiple writers. No rename() is necessary; POSIX rename() may be
atomic, but it's a directory modification which makes it potentially
slow.
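
A sketch of that write protocol, under stated assumptions: the MAGIC value is
a made-up placeholder, the function names are invented, and the payload is
opaque bytes rather than a real mtime plus code object. The exclusive create
keeps a second writer out, and readers treat a file whose magic number is
still zeroed as absent:

```python
import os
import tempfile

MAGIC = b"\xab\xcd\r\n"   # placeholder 4-byte magic, for illustration only


def write_cache(path, payload, magic=MAGIC):
    """Write payload to path; return False if another writer created it first."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL | getattr(os, "O_BINARY", 0)
    try:
        fd = os.open(path, flags, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "wb") as f:
        f.write(b"\0" * len(magic))   # invalid magic: readers ignore the file
        f.write(payload)
        f.flush()
        f.seek(0)
        f.write(magic)                # last step: the file becomes valid
    return True


def read_cache(path, magic=MAGIC):
    """Return the payload, or None if the file is missing or not yet valid."""
    try:
        with open(path, "rb") as f:
            data = f.read()
    except FileNotFoundError:
        return None
    if data[:len(magic)] != magic:
        return None
    return data[len(magic):]


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "foo.pyc")
    first_write = write_cache(target, b"payload")
    second_write = write_cache(target, b"other")     # loses the race
    seen = read_cache(target)

    # Simulate a reader arriving while a writer is still mid-file.
    partial = os.path.join(tmp, "bar.pyc")
    with open(partial, "wb") as f:
        f.write(b"\0" * len(MAGIC) + b"half-written")
    seen_partial = read_cache(partial)
```

Note that a writer which dies before the final step leaves a file with a
zeroed magic number behind; the sketch does not try to clean that up.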

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-06 Thread Barry Warsaw
On Feb 01, 2010, at 11:28 PM, Martin v. Löwis wrote:

>So what would you do for concurrent writers, then? The current
>implementation relies on creat(O_EXCL) to be atomic, so a second
>writer would just fail. This is but the only IO operation that is
>guaranteed to be atomic (along with mkdir(2)), so reusing the current
>approach doesn't work.

I believe rename(2) is atomic also, at least on POSIX.  I'm not sure if that
helps us though.

-Barry




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-06 Thread Barry Warsaw
On Feb 01, 2010, at 02:04 PM, Paul Du Bois wrote:

>It's an interesting challenge to write the file in such a way that
>it's safe for a reader and writer to co-exist. Like Brett, I
>considered an append-only scheme, but one needs to handle the case
>where the bytecode for a particular magic number changes. At some
>point you'd need to sweep garbage from the file. All solutions seem
>unnecessarily complex, and unnecessary since in practice the case
>should not come up.

I don't think that part's difficult.  The byte code's only going to change if
the source file has changed, and in that case, /all/ the byte code in the "fat
pyc" file will be invalidated, so the whole thing can be deleted by the first
writer.  I'd worked that out in the original fat pyc version of the PEP.

-Barry




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-06 Thread Barry Warsaw
On Jan 31, 2010, at 11:04 AM, Raymond Hettinger wrote:

>>  It does this by
>> allowing many different byte compilation files (.pyc files) to be
>> co-located with the Python source file (.py file).  
>
>It would be nice if all the compilation files could be tucked
>into one single zipfile per directory to reduce directory clutter.
>
>It has several benefits besides tidiness. It hides the implementation
>details of when magic numbers get shifted.  And it may allow faster
>start-up times when the zipfile is in the disk cache.

This is closer in spirit to the original (uncirculated) PEP which called for
fat pyc files, but without the complicated implementation details.  It's still
an interesting approach to explore.

Writer concurrency can be handled with dot-lock files, but that does incur
some extra overhead, such as the remove() of the lock file.

-Barry





Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-06 Thread Guido van Rossum
On Sat, Feb 6, 2010 at 12:21 PM, Barry Warsaw  wrote:
> On Feb 03, 2010, at 01:17 PM, Guido van Rossum wrote:
>>Can you clarify? In Python 3, __file__ always points to the source.
>>Clearly that is the way of the future. For 99.99% of uses of __file__,
>>if it suddenly never pointed to a .pyc file any more (even if one
>>existed) that would be just fine. So what's this talk of switching to
>>__source__?
>
> Upon further reflection, I agree.  __file__ also points to the source in
> Python 2.7.

Not in the 2.7 svn repo I have access to. It still points to the .pyc
file if it was used.

And I propose not to disturb this in 2.7, at least not by default. I'm
fine though with a flag or distro-overridable config setting to change
this behavior.

> Do we need an attribute to point to the compiled bytecode file?

I think we do. Quite unrelated to this discussion I have a use case
for knowing easily whether a module was actually loaded from bytecode
or not -- but I also have a need for __file__ to point to the source.
So having both __file__ and __compiled__ makes sense to me.

When there is no source code but only bytecode I am fine with both
pointing to the bytecode; in that case I presume that the bytecode is
not in a __pyr__ subdirectory. For dynamically loaded extension
modules I think both should be left unset, and some other __xxx__
variable could point to the .so or .dll file. FWIW the most common use
case for __file__ is probably to find data files relative to it. Since
the data won't be in the __pyr__ directory we couldn't make __file__
point to the __pyr__/pyc file without much code breakage.

(Yes, I am still in favor of the folder-per-folder model.)

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-06 Thread exarkun

On 08:21 pm, ba...@python.org wrote:
>On Feb 03, 2010, at 01:17 PM, Guido van Rossum wrote:
>>Can you clarify? In Python 3, __file__ always points to the source.
>>Clearly that is the way of the future. For 99.99% of uses of __file__,
>>if it suddenly never pointed to a .pyc file any more (even if one
>>existed) that would be just fine. So what's this talk of switching to
>>__source__?
>
>Upon further reflection, I agree.  __file__ also points to the source in
>Python 2.7.  Do we need an attribute to point to the compiled bytecode
>file?

What if, instead of trying to annotate the module object with this 
assortment of metadata - metadata which depends on lots of things, and 
can vary from interpreter to interpreter, and even from module to module 
(depending on how it was loaded) - we just stuck with the __loader__ 
annotation, and encouraged/allowed/facilitated the use of the loader 
object to learn all of this extra information?


Jean-Paul


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-06 Thread Barry Warsaw
On Feb 03, 2010, at 09:26 AM, Floris Bruynooghe wrote:

>On Wed, Feb 03, 2010 at 06:14:44PM +1100, Ben Finney wrote:
>> I don't understand the distinction you're making between those two
>> options. Can you explain what you mean by each of “siblings” and
>> “folder-per-folder”?
>
>siblings: the original proposal, i.e.:
>
>foo.py
>foo.pyr/
>  MAGIC1.pyc
>  MAGIC1.pyo
>  ...
>bar.py
>bar.pyr/
>  MAGIC1.pyc
>  MAGIC1.pyo
>  ...
>
>folder-per-folder:
>
>foo.py
>bar.py
>__pyr__/
>  foo.MAGIC1.pyc
>  foo.MAGIC1.pyo
>  foo.MAGIC2.pyc
>  bar.MAGIC1.pyc
>  ...
>
>IIUC

Correct.  If necessary, I'll define those two terms in the PEP.
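
For reference, the two layouts expressed as path computations. This is only a
sketch: the function names are invented, and "MAGIC1" is the same stand-in for
the hexlified magic number used in the listing above.

```python
from pathlib import PurePosixPath


def sibling_path(source, magic_tag, suffix=".pyc"):
    """siblings: foo.py -> foo.pyr/MAGIC1.pyc (one directory per source file)."""
    src = PurePosixPath(source)
    return src.with_suffix(".pyr") / (magic_tag + suffix)


def folder_per_folder_path(source, magic_tag, suffix=".pyc"):
    """folder-per-folder: foo.py -> __pyr__/foo.MAGIC1.pyc (one per directory)."""
    src = PurePosixPath(source)
    return src.parent / "__pyr__" / f"{src.stem}.{magic_tag}{suffix}"


sib = sibling_path("pkg/foo.py", "MAGIC1")
fpf = folder_per_folder_path("pkg/foo.py", "MAGIC1")
fpf_opt = folder_per_folder_path("pkg/bar.py", "MAGIC1", suffix=".pyo")
```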

-Barry




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-06 Thread Barry Warsaw
On Feb 03, 2010, at 11:07 PM, Nick Coghlan wrote:

>It's also the case that having to run Python to manage my own filesystem
>would be very annoying. If a dev has a broken .pyc that prevents the
>affected Python build from even starting how are they meant to use the
>nonfunctioning interpreter to find and delete the offending file? How is
>someone meant to find and delete the .pyc files if they prefer to use a
>graphical file manager over (or in conjunction with) the command line?

I agree.  I'd prefer to have a predictable place for the cached files,
independent of having to run Python to tell you where that is.

-Barry




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-06 Thread Barry Warsaw
On Feb 03, 2010, at 11:59 AM, M.-A. Lemburg wrote:

>How about using an optionally relative cache dir setting to let
>the user decide ?

Why do we need that level of flexibility?

-Barry




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-06 Thread Barry Warsaw
On Feb 05, 2010, at 07:37 PM, Nick Coghlan wrote:

>Brett Cannon wrote:
>> Does code exist out there where people are constructing bytecode from
>>  multiple files for a single module?
>
>I'm quite prepared to call YAGNI on that idea and just return a 2-tuple
>of source filename and compiled filename.

Me too.  I think a 2-tuple of (source-path, compiled-path) is probably going
to be fine for all practical purposes.  I'd assign the former to a module's
__file__ (as is done today in Python >= 2.7) and the latter to a module's
__cached__.

-Barry




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-06 Thread Barry Warsaw
On Feb 03, 2010, at 01:17 PM, Guido van Rossum wrote:

>Can you clarify? In Python 3, __file__ always points to the source.
>Clearly that is the way of the future. For 99.99% of uses of __file__,
>if it suddenly never pointed to a .pyc file any more (even if one
>existed) that would be just fine. So what's this talk of switching to
>__source__?

Upon further reflection, I agree.  __file__ also points to the source in
Python 2.7.  Do we need an attribute to point to the compiled bytecode file?

-Barry




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-05 Thread Nick Coghlan
Brett Cannon wrote:
> If we add a new method like get_filenames(), I would suggest going
> with Antoine's suggestion of a tuple for __compiled__ (allowing
> loaders to indicate that they actually constructed the runtime
> bytecode from multiple cached files on-disk).
> 
> 
> Does code exist out there where people are constructing bytecode from
>  multiple files for a single module?

I'm quite prepared to call YAGNI on that idea and just return a 2-tuple
of source filename and compiled filename.

The theoretical use case was for a module that was partially compiled to
native code in advance, so its "compiled" version was a combination of
a shared library and a bytecode file. It isn't really all that
compelling an idea - it would be easy enough for a loader to pick one or
the other and stick that in __compiled__.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-04 Thread Glenn Linderman
On approximately 2/4/2010 2:28 PM, came the following characters from 
the keyboard of Eric Smith:

> Glenn Linderman wrote:
>> On approximately 1/30/2010 4:00 PM, came the following characters
>> from the keyboard of Barry Warsaw:
>>> When the Python executable is given a `-R` flag, or the environment
>>> variable `$PYTHONPYR` is set, then Python will create a `foo.pyr`
>>> directory and write a `pyc` file to that directory with the hexlified
>>> magic number as the base name.
>>
>> After the discussion so far, my opinion is that if the source
>> directory contains an appropriate python repository directory [1],
>> and the version of Python implements PEP 3147, that there should be
>> no need for -R or $PYTHONPYR to exist, but that such versions of
>> Python would simply, and always look in the python repository
>> directory for binaries.
>
> How would the python repository directory ever get created?


When a PEP 3147 (if modified by my suggestion) version of Python runs, 
and the directory doesn't exist, and it wants to create a .pyc, it would 
create the directory, and put the .pyc there.  Sort of just like how it 
creates .pyc files, now, but an extra step of creating the repository 
directory if it doesn't exist.  After the first run, it would exist.  It 
is described in the PEP, and I quoted that section... "Python will 
create a 'foo.pyr' directory"... I'm just suggesting different semantics 
for how many directories, and what is contained in them.


--
Glenn

“Everyone is entitled to their own opinion, but not their own facts. In 
turn, everyone is entitled to their own opinions of the facts, but not 
their own facts based on their opinions.” -- Guy Rocha, retiring NV 
state archivist




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-04 Thread Eric Smith

Glenn Linderman wrote:
> On approximately 1/30/2010 4:00 PM, came the following characters from
> the keyboard of Barry Warsaw:
>> When the Python executable is given a `-R` flag, or the environment
>> variable `$PYTHONPYR` is set, then Python will create a `foo.pyr`
>> directory and write a `pyc` file to that directory with the hexlified
>> magic number as the base name.
>
> After the discussion so far, my opinion is that if the source directory
> contains an appropriate python repository directory [1], and the
> version of Python implements PEP 3147, that there should be no need for
> -R or $PYTHONPYR to exist, but that such versions of Python would
> simply, and always look in the python repository directory for binaries.


How would the python repository directory ever get created?



Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-04 Thread Brett Cannon
On Thu, Feb 4, 2010 at 13:51, Nick Coghlan  wrote:

> Brett Cannon wrote:
> > My thinking is we deprecate get_filename() and introduce some new method
> > that returns a two-item tuple (get_paths?). First item is where the
> > source should be, and the second is where the bytecode is if it exists
> > (else it's None). Putting both calculations into a single method seems
> > better than a source_path()/bytecode_path() as the latter would quite
> > possibly need source_path() to call bytecode_path() on its own to
> > calculate where the source should be if it doesn't exist on top of the
> > direct call to get_bytecode() for setting __compiled__ itself.
>
> If we add a new method like get_filenames(), I would suggest going with
> Antoine's suggestion of a tuple for __compiled__ (allowing loaders to
> indicate that they actually constructed the runtime bytecode from
> multiple cached files on-disk).
>
>
Does code exist out there where people are constructing bytecode from
multiple files for a single module?


> The runpy logic would then be something like:
>
>  try:
>    method = loader.get_filenames
>  except AttributeError:
>    __compiled__ = ()
>    try:
>      method = loader.get_filename
>    except:
>      __file__ = None
>    else:
>      __file__ = method()
>  else:
>    __file__, *__compiled__ = method()
>
>
Should it really be a flat sequence that get_filenames returns? That first
value has a very special meaning compared to the rest which suggests to me
keeping the returned sequence to two items, just with the second item being
a sequence itself.


>
> For the import machinery itself, setting __compiled__ would be the
> responsibility of the loaders due to the way load_module is specified.


Yep.


> I
> still sometimes wonder if we would be better off splitting that method
> into separate "prepare_module" and "exec_module" methods to allow the
> interpreter a chance to fiddle with the module globals before the module
> code gets executed.
>

There's a reason why importlib has its ABCs abstracted the way it does;
there's a bunch of stuff that can be automated and is common to all loaders
that load_module has to cover. We could consider refactoring the API, but I
don't know if it is worth the hassle since importlib has decorators that
take care of low-level commonality and has ABCs for higher-level stuff.

But yes, given a do-over, I would abstract loaders to a finer grain to let
import handle more of the details.

-Brett



>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
> ---
>


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-04 Thread Nick Coghlan
Glenn Linderman wrote:
> Alt. C... source-file-dir/__pyr_version__, each Python version with
> different bytecode would have some sort of version string or magic
> number that identifies it, and would look only in that directory for its
> .pyc/.pyo files.  I prefer C for 4 reasons: 1) easier to blow away one
> version; 2) easier to see what that version has compiled; 3) most people
> use only one or two versions, so directory proliferation is limited; 4)
> even when there are 30 versions of Python, the subdirectories would
> contain the same order-of-magnitude count of files as the source
> directory for performance issues, if the file system has a knee in the
> performance curve as some do.

I don't think this suggestion had come up before, but I like it. It also
reduces the amount of filename adjustment needed in the individual cache
directories.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-04 Thread Nick Coghlan
Brett Cannon wrote:
> My thinking is we deprecate get_filename() and introduce some new method
> that returns a two-item tuple (get_paths?). First item is where the
> source should be, and the second is where the bytecode is if it exists
> (else it's None). Putting both calculations into a single method seems
> better than a source_path()/bytecode_path() as the latter would quite
> possibly need source_path() to call bytecode_path() on its own to
> calculate where the source should be if it doesn't exist on top of the
> direct call to get_bytecode() for setting __compiled__ itself.

If we add a new method like get_filenames(), I would suggest going with
Antoine's suggestion of a tuple for __compiled__ (allowing loaders to
indicate that they actually constructed the runtime bytecode from
multiple cached files on-disk).

The runpy logic would then be something like:

  try:
    method = loader.get_filenames
  except AttributeError:
    __compiled__ = ()
    try:
      method = loader.get_filename
    except:
      __file__ = None
    else:
      __file__ = method()
  else:
    __file__, *__compiled__ = method()
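
The same fallback logic as a self-contained function, purely as a sketch.
get_filenames() is only a proposal at this point, the three loader classes are
dummies invented for the example, and the methods are called without
arguments as in the snippet above (a real PEP 302 get_filename takes the
module name):

```python
def resolve_filenames(loader):
    """Return (source, compiled_tuple) using the proposed fallback order."""
    try:
        method = loader.get_filenames        # proposed new-style method
    except AttributeError:
        compiled = ()
        try:
            method = loader.get_filename     # existing single-name method
        except AttributeError:
            source = None
        else:
            source = method()
    else:
        source, *rest = method()             # first item is the source path
        compiled = tuple(rest)
    return source, compiled


class NewStyleLoader:
    def get_filenames(self):
        return ("foo.py", "__pyr__/foo.MAGIC1.pyc")


class OldStyleLoader:
    def get_filename(self):
        return "foo.py"


class BareLoader:
    pass


new_result = resolve_filenames(NewStyleLoader())
old_result = resolve_filenames(OldStyleLoader())
bare_result = resolve_filenames(BareLoader())
```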


For the import machinery itself, setting __compiled__ would be the
responsibility of the loaders due to the way load_module is specified. I
still sometimes wonder if we would be better off splitting that method
into separate "prepare_module" and "exec_module" methods to allow the
interpreter a chance to fiddle with the module globals before the module
code gets executed.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-04 Thread Glenn Linderman
On approximately 1/30/2010 4:00 PM, came the following characters from 
the keyboard of Barry Warsaw:

> When the Python executable is given a `-R` flag, or the environment
> variable `$PYTHONPYR` is set, then Python will create a `foo.pyr`
> directory and write a `pyc` file to that directory with the hexlified
> magic number as the base name.

After the discussion so far, my opinion is that if the source directory 
contains an appropriate python repository directory [1], and the 
version of Python implements PEP 3147, that there should be no need for 
-R or $PYTHONPYR to exist, but that such versions of Python would 
simply, and always look in the python repository directory for binaries.


I've reached this conclusion for several reasons/benefits:

1) it makes the rules simpler for people finding the binaries
2) there is no "double lookup" to find a binary at run time
3) if the PEP changes to implement alternative B or C in [1], I hear a 
large consensus of people who would like that behavior, since it cleans 
up the annoying clutter of .pyc files mixed with source.
4) There is no need to add or document the command line option or 
environment variable.




[1] Alternative A... source-file-root.pyr, as in the PEP, Alt. B... 
source-file-dir/__pyr__ all versions/files in same lookaside directory, 
Alt. C... source-file-dir/__pyr_version__, each Python version with 
different bytecode would have some sort of version string or magic 
number that identifies it, and would look only in that directory for its 
.pyc/.pyo files.  I prefer C for 4 reasons: 1) easier to blow away one 
version; 2) easier to see what that version has compiled; 3) most people 
use only one or two versions, so directory proliferation is limited; 4) 
even when there are 30 versions of Python, each subdirectory would 
contain the same order of magnitude of files as the source directory, 
which matters for performance if the file system has a knee in its 
performance curve, as some do.
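
The three layouts in [1] can be compared side by side with a small path helper. This is a sketch under assumptions: the function name cache_path and the magic_tag parameter are made up for illustration, and the file naming inside the shared __pyr__ directory of Alt. B is a guess, since the footnote only says that all versions share one directory.

```python
import os.path

def cache_path(source, magic_tag, layout):
    """Return a candidate cache file path for `source` under layout A, B or C."""
    directory, filename = os.path.split(source)
    root, _ = os.path.splitext(filename)
    if layout == "A":
        # foo.py -> foo.pyr/<magic>.pyc (one sibling directory per source file)
        return os.path.join(directory, root + ".pyr", magic_tag + ".pyc")
    if layout == "B":
        # foo.py -> __pyr__/foo.<magic>.pyc (naming inside __pyr__ is assumed)
        return os.path.join(directory, "__pyr__", "%s.%s.pyc" % (root, magic_tag))
    if layout == "C":
        # foo.py -> __pyr_<magic>__/foo.pyc (one directory per bytecode version)
        return os.path.join(directory, "__pyr_%s__" % magic_tag, root + ".pyc")
    raise ValueError("unknown layout: %r" % (layout,))
```

Under layout C, blowing away one version is a single directory removal per source directory, which is reason 1) above.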


--
Glenn -- http://nevcal.com/
===
A protocol is complete when there is nothing left to remove.
-- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking



Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-04 Thread Brett Cannon
On Wed, Feb 3, 2010 at 13:33, "Martin v. Löwis"  wrote:

> Guido van Rossum wrote:
> > On Wed, Feb 3, 2010 at 12:47 PM, Nick Coghlan wrote:
> >> On the issue of __file__, I'd suggest not being too hasty in
> >> deprecating that in favour of __source__. While I can see a lot of value
> >> in having it point to the source file more often with a different
> >> attribute that points to the cached file, I don't see a lot of gain to
> >> compensate for the pain of changing the name of __file__ itself.
> >
> > Can you clarify? In Python 3, __file__ always points to the source.
> > Clearly that is the way of the future. For 99.99% of uses of __file__,
> > if it suddenly never pointed to a .pyc file any more (even if one
> > existed) that would be just fine. So what's this talk of switching to
> > __source__?
>
> I originally proposed it, not knowing that Python 3 already changed the
> meaning of __file__ for byte code files.
>
> What I really wanted to suggest is that it should be possible to tell
> what gets really executed, plus what source file had been considered.
>
> So if __file__ is always the source file, a second attribute should tell
> whether a byte code file got read (so that you can delete that in case
> you doubt it's current, for example).
>
>
What should be done for loaders? Right now we have get_filename() which is
what __file__ is to be set to. For importlib there is source_path and
bytecode_path, but both of those are specified to return None when source
or bytecode, respectively, is not available.

The bare minimum, I think, is we need loaders to have method(s) that return
the path to the source -- whether it exists or not, to set __file__ to --
and the path to bytecode if it exists -- to set __compiled__ or whatever
attribute we come up with. That suggests to me either two new methods or one
that returns a two-item tuple. We could possibly keep get_filename() and say
that people need to compare its output to what source_path()-equivalent
method returns, but that seems bad if the source location needs to be based
on the bytecode location.

My thinking is we deprecate get_filename() and introduce some new method
that returns a two-item tuple (get_paths?). First item is where the source
should be, and the second is where the bytecode is if it exists (else it's
None). Putting both calculations into a single method seems better than a
source_path()/bytecode_path() as the latter would quite possibly need
source_path() to call bytecode_path() on its own to calculate where the
source should be if it doesn't exist on top of the direct call to
get_bytecode() for setting __compiled__ itself.
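
To make the two-item tuple idea concrete, here is a minimal sketch. Everything in it is hypothetical: get_paths() is only the name floated above, and SketchLoader and set_module_paths are illustration helpers, not importlib API.

```python
import os.path

class SketchLoader:
    """Hypothetical loader exposing the two-item get_paths() idea."""

    def __init__(self, source_path, bytecode_path):
        self._source = source_path
        self._bytecode = bytecode_path

    def get_paths(self):
        # First item: where the source should be, whether or not it exists.
        # Second item: the bytecode path only if that file exists, else None.
        bytecode = self._bytecode if os.path.exists(self._bytecode) else None
        return self._source, bytecode


def set_module_paths(namespace, loader):
    """Populate __file__ and __compiled__ from a single get_paths() call."""
    namespace["__file__"], namespace["__compiled__"] = loader.get_paths()
    return namespace
```

Doing both calculations in one call means the caller never has to reconcile two separate answers about where the source lives.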

-Brett




> Regards,
> Martin
>
>


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-03 Thread Ben Finney
Thanks for the explanation.

Nick Coghlan  writes:

> Being able to get rid of the existing .pyc/.pyo clutter at the same
> time is just a bonus.

Okay. I maintain (unsurprisingly) that replacing it with subdirectory
clutter is a poor bargain. But I have nothing new to add on that score
for now.

-- 
 \ “A man may be a fool and not know it — but not if he is |
  `\   married.” —Henry L. Mencken |
_o__)  |
Ben Finney



Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-03 Thread Nick Coghlan
Ben Finney wrote:
> Nick Coghlan  writes:
> 
>> P.S. Translation of the double negative: I don't find any of the
>> solutions, even the current .pyc/.pyo approach, to be particularly
>> elegant, so I can't really say I like any of them in an absolute
>> sense. However, having a single cache folder inside each Python source
>> folder seems to strike the best balance between keeping a tidy
>> filesystem and still being able to locate a cached file given only the
>> location of the source file (or vice-versa) without using any
>> Python-specific tools, so it is the approach I personally prefer.
> 
> Something I think is being lost here: AFAICT, the impetus behind this
> PEP is to allow OS distributions to decouple the location of the
> compiled bytecode files from the location of the source code files. (If
> I'm mistaken, then clearly I don't understand the PEP's purpose at all
> and I'd love to have this misconception corrected.)

No, the purpose is to allow the same source file to be shared between
multiple versions of the Python interpreter without their compiled files
conflicting as they do now. It's the support for multiple .pyc and .pyo
files per .py file that is the significant change, not the specific
location of those files.

Being able to get rid of the existing .pyc/.pyo clutter at the same time
is just a bonus.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-03 Thread Nick Coghlan
Guido van Rossum wrote:
> On Wed, Feb 3, 2010 at 12:47 PM, Nick Coghlan  wrote:
>> On the issue of __file__, I'd suggest not being too hasty in
>> deprecating that in favour of __source__. While I can see a lot of value
>> in having it point to the source file more often with a different
>> attribute that points to the cached file, I don't see a lot of gain to
>> compensate for the pain of changing the name of __file__ itself.
> 
> Can you clarify? In Python 3, __file__ always points to the source.
> Clearly that is the way of the future. For 99.99% of uses of __file__,
> if it suddenly never pointed to a .pyc file any more (even if one
> existed) that would be just fine. So what's this talk of switching to
> __source__?
> 

In Barry's rough notes that he added to the PEP he said he thought
__file__ had become too ambiguous and was going to suggest changing the
name to __source__. That struck me as an overreaction to a very mild
ambiguity (one that will only lessen with time if a new attribute is
added to point to the cached file that was actually executed).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-03 Thread Martin v. Löwis
Guido van Rossum wrote:
> On Wed, Feb 3, 2010 at 12:47 PM, Nick Coghlan  wrote:
>> On the issue of __file__, I'd suggest not being too hasty in
>> deprecating that in favour of __source__. While I can see a lot of value
>> in having it point to the source file more often with a different
>> attribute that points to the cached file, I don't see a lot of gain to
>> compensate for the pain of changing the name of __file__ itself.
> 
> Can you clarify? In Python 3, __file__ always points to the source.
> Clearly that is the way of the future. For 99.99% of uses of __file__,
> if it suddenly never pointed to a .pyc file any more (even if one
> existed) that would be just fine. So what's this talk of switching to
> __source__?

I originally proposed it, not knowing that Python 3 already changed the
meaning of __file__ for byte code files.

What I really wanted to suggest is that it should be possible to tell
what gets really executed, plus what source file had been considered.

So if __file__ is always the source file, a second attribute should tell
whether a byte code file got read (so that you can delete that in case
you doubt it's current, for example).

Regards,
Martin




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-03 Thread Guido van Rossum
On Wed, Feb 3, 2010 at 12:47 PM, Nick Coghlan  wrote:
> On the issue of __file__, I'd suggest not being too hasty in
> deprecating that in favour of __source__. While I can see a lot of value
> in having it point to the source file more often with a different
> attribute that points to the cached file, I don't see a lot of gain to
> compensate for the pain of changing the name of __file__ itself.

Can you clarify? In Python 3, __file__ always points to the source.
Clearly that is the way of the future. For 99.99% of uses of __file__,
if it suddenly never pointed to a .pyc file any more (even if one
existed) that would be just fine. So what's this talk of switching to
__source__?

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-03 Thread Ben Finney
Nick Coghlan  writes:

> P.S. Translation of the double negative: I don't find any of the
> solutions, even the current .pyc/.pyo approach, to be particularly
> elegant, so I can't really say I like any of them in an absolute
> sense. However, having a single cache folder inside each Python source
> folder seems to strike the best balance between keeping a tidy
> filesystem and still being able to locate a cached file given only the
> location of the source file (or vice-versa) without using any
> Python-specific tools, so it is the approach I personally prefer.

Something I think is being lost here: AFAICT, the impetus behind this
PEP is to allow OS distributions to decouple the location of the
compiled bytecode files from the location of the source code files. (If
I'm mistaken, then clearly I don't understand the PEP's purpose at all
and I'd love to have this misconception corrected.)

If that's so, then I don't see how what you suggest above is any
significant progress toward that goal. It still tightly couples the
locations of the source code files and the compiled bytecode files.
Having a distinct cache of compiled bytecode files addresses this
better.

-- 
 \  “I find the whole business of religion profoundly interesting. |
  `\ But it does mystify me that otherwise intelligent people take |
_o__)it seriously.” —Douglas Adams |
Ben Finney



Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-03 Thread Brett Cannon
On Wed, Feb 3, 2010 at 05:27, Nick Coghlan  wrote:
> Glenn Linderman wrote:
>> On approximately 2/2/2010 7:05 PM, came the following characters from
>>  the keyboard of Guido van Rossum:
>>> On Tue, Feb 2, 2010 at 5:41 PM, Glenn
>>> Linderman  wrote:
>>>> Agreed.  But in reading that, it somehow triggered a question:
>>>> does zipimport only work for zipfiles, or does it work for any
>>>> archive format that Python stdlib knows how to decode?  And if
>>>> only the former, why are they so special?
>>>>
>>> The former.
>>>
>>> They are special because (unlike e.g. tar files) you can read the
>>> table of contents of a zipfile without parsing the entire file.
>>
>> They are not unique in this... most archive formats except tar have a
>>  directory.  But that is likely a good reason not to support tar for
>> this purpose, especially since tar is usually found as .tar.Z or
>> .tar.gz or .tar.bz2 etc. and would require two passes before the data
>> could be found at all.
>
> It's also because nobody has done the work to hook up any additional
> archive formats (as zipimport needs to work for importing the standard
> library itself, it isn't quite as simple as just importing an extra
> module to do the manipulation. Extending the test suite to cover a new
> archive format would require some work as well).
>
> Given that zip files already work and are almost universal, I figure
> folks have just opted to use that and then found other things to do with
> their coding time :)
>

If people really need alternative archive formats they can use the
importers package: http://packages.python.org/importers/ . If someone
really wants to use another format they can use the ABCs in the
package to easily write their own importer. It also contains a sqlite3
importer and its own zip importer.

-Brett

> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
> ---


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-03 Thread Nick Coghlan
Barry Warsaw wrote:
> On Feb 03, 2010, at 11:31 PM, Nick Coghlan wrote:
> 
>> Having a lookup dictionary from Python version + C API magic numbers to
>> the magic strings used in cache filenames in the import engine shouldn't
>> be too tricky. I'll admit it wasn't until the thread had already been
>> going for a while that I realised that, though :)
> 
> I agree, and it's clear that would be much more user friendly.  I've added a
> note to my working copy of the PEP and leave that as a possible design
> change.  I'm still not certain what the right mapping would be though.  Python
> version numbers don't seem quite right, but maybe they are a "good enough"
> solution.

If we ditch the -U option for 2.7, then we'll only have one magic number
per CPython version. I've been using "cpython-27" in my examples.

On the issue of __file__, I'd suggest not being too hasty in
deprecating that in favour of __source__. While I can see a lot of value
in having it point to the source file more often with a different
attribute that points to the cached file, I don't see a lot of gain to
compensate for the pain of changing the name of __file__ itself.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-03 Thread Barry Warsaw
On Feb 03, 2010, at 12:57 PM, Antoine Pitrou wrote:

>How about doing measurements /with the current implementation/? Everyone
>seems to worry about stat() calls but there doesn't seem to be any figures to
>evaluate their significance.

Yes, very good idea, if for no other reason than to give us a baseline for
comparison.  Added to the PEP.

-Barry




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-03 Thread Barry Warsaw
On Feb 03, 2010, at 11:31 PM, Nick Coghlan wrote:

>Having a lookup dictionary from Python version + C API magic numbers to
>the magic strings used in cache filenames in the import engine shouldn't
>be too tricky. I'll admit it wasn't until the thread had already been
>going for a while that I realised that, though :)

I agree, and it's clear that would be much more user friendly.  I've added a
note to my working copy of the PEP and leave that as a possible design
change.  I'm still not certain what the right mapping would be though.  Python
version numbers don't seem quite right, but maybe they are a "good enough"
solution.

-Barry




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-03 Thread Nick Coghlan
Barry Warsaw wrote:
> Encoding the magic number in the file name under .pyr would I thought make the
> look up scheme more efficient since the import machinery can craft the file
> name directly.  I agree it's not very human friendly because nobody really
> knows which magic numbers are associated with which Python versions and flags.

Having a lookup dictionary from Python version + C API magic numbers to
the magic strings used in cache filenames in the import engine shouldn't
be too tricky. I'll admit it wasn't until the thread had already been
going for a while that I realised that, though :)
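
A lookup table of this kind could look roughly like the sketch below. It is only an illustration: MAGIC_TO_TAG, tag_for_magic and cache_filename are hypothetical names, and the single table entry is taken from the running interpreter via importlib.util.MAGIC_NUMBER and sys.implementation.cache_tag, both of which come from current Python 3 and postdate this thread.

```python
import importlib.util
import sys

# Hypothetical table: raw magic number (bytes) -> tag used in cache filenames.
# Only the running interpreter's entry is filled in; entries for other
# versions would have to come from each interpreter's own source.
MAGIC_TO_TAG = {
    importlib.util.MAGIC_NUMBER: sys.implementation.cache_tag,
}

def tag_for_magic(magic, table=MAGIC_TO_TAG):
    """Return the readable tag for a magic number, or its hex form if unknown."""
    return table.get(magic, magic.hex())

def cache_filename(module_name, magic):
    """Build a cache file name from a module name and a magic number."""
    return "%s.%s.pyc" % (module_name, tag_for_magic(magic))
```

Falling back to the hexlified magic number keeps unknown interpreters working while still giving known ones a human-readable name.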

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-03 Thread Nick Coghlan
Glenn Linderman wrote:
> On approximately 2/2/2010 7:05 PM, came the following characters from
>  the keyboard of Guido van Rossum:
>> On Tue, Feb 2, 2010 at 5:41 PM, Glenn 
>> Linderman  wrote:
>>> Agreed.  But in reading that, it somehow triggered a question:
>>> does zipimport only work for zipfiles, or does it work for any
>>> archive format that Python stdlib knows how to decode?  And if
>>> only the former, why are they so special?
>>> 
>> The former.
>> 
>> They are special because (unlike e.g. tar files) you can read the 
>> table of contents of a zipfile without parsing the entire file.
> 
> They are not unique in this... most archive formats except tar have a
>  directory.  But that is likely a good reason not to support tar for
> this purpose, especially since tar is usually found as .tar.Z or
> .tar.gz or .tar.bz2 etc. and would require two passes before the data
> could be found at all.

It's also because nobody has done the work to hook up any additional
archive formats (as zipimport needs to work for importing the standard
library itself, it isn't quite as simple as just importing an extra
module to do the manipulation. Extending the test suite to cover a new
archive format would require some work as well).

Given that zip files already work and are almost universal, I figure
folks have just opted to use that and then found other things to do with
their coding time :)
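
The table-of-contents property mentioned in the quoted text can be seen with the stdlib zipfile module. A small sketch, using an in-memory archive with made-up member names so that it is self-contained:

```python
import io
import zipfile

# Build a small archive in memory so the example needs no files on disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("pkg/__init__.py", "")
    archive.writestr("pkg/mod.py", "VALUE = 42\n")

# Listing members uses the central directory stored at the end of the file;
# member data is only read when a specific member is requested.
with zipfile.ZipFile(buffer) as archive:
    names = archive.namelist()
    source = archive.read("pkg/mod.py").decode("utf-8")
```

A compressed tar file offers no such index, so finding one member means decompressing and scanning the stream.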

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-03 Thread Nick Coghlan
Bob Ippolito wrote:
> I like this option as well, but why not just name the directory .pyc
> instead of __pyr__ or .pyr? That way people probably won't even have
> to reconfigure their tools to ignore it :)

This actually came up in another part of the thread. The conclusion was
that, since the cached Python files can significantly affect the way
Python executes, it would be better not to use dot-files or set the
hidden attribute in the folder's metadata (on filesystems that support
that).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-03 Thread Nick Coghlan
Floris Bruynooghe wrote:
> Personally I'm +1 on the folder-per-folder option.

Of all the proposed options, I also dislike the SVN/CVS style folder
structure the least ;)

Cheers,
Nick.

P.S. Translation of the double negative: I don't find any of the
solutions, even the current .pyc/.pyo approach, to be particularly
elegant, so I can't really say I like any of them in an absolute sense.
However, having a single cache folder inside each Python source folder
seems to strike the best balance between keeping a tidy filesystem and
still being able to locate a cached file given only the location of the
source file (or vice-versa) without using any Python-specific tools, so
it is the approach I personally prefer.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-03 Thread Nick Coghlan
Ben Finney wrote:
> I don't think keeping the cache files in a mass of intertwingled extra
> subdirectories is the way to solve that problem. That speaks, rather, to
> the need for Python to be able to find the file on behalf of the user
> and blow it away on request, so the user doesn't need to go searching.
> 
> Possible interface (with spelling of options chosen hastily)::
> 
> $ python foo.py# Use cached byte code if available.
> $ python --force-compile foo.py# Unconditionally compile.
> 
> If removing the byte code file, without running the module, is what's
> desired::
> 
> $ python --delete-cache foo.py # Delete cached byte code.
> $ rm $(python --show-cache-file foo.py)  # Same as above.
> 
> That should cover just about any common need for the user to know
> exactly which byte code file corresponds to a given source file. That,
> in turn, frees us to choose a less obtrusive location for the byte code
> files than mingled in with the source.

That's nice in theory, but tricky in practice given the intended
flexibility of the import system (i.e. we don't want to perpetrate new
import features that aren't part of the common importer interface, so
any such proposal would need to come complete with suggested extensions
to the PEP 302 importer protocol).

It's also the case that having to run Python to manage my own filesystem
would be very annoying. If a dev has a broken .pyc that prevents the
affected Python build from even starting, how are they meant to use the
nonfunctioning interpreter to find and delete the offending file? How is
someone meant to find and delete the .pyc files if they prefer to use a
graphical file manager over (or in conjunction with) the command line?

We can provide a utility script in the Python distribution to copy a
source tree without the Python cache directories easily enough, which
would be far simpler than providing the extra tools to cherry pick
compilation or deletion of individual cache files.
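
Such a utility could be little more than a wrapper around shutil.copytree. A minimal sketch, assuming the cache directory is called __pyr__ (one of the names floated in this thread; the final name is not settled) and using a made-up function name:

```python
import shutil

def copy_without_cache(src, dst, cache_dir_name="__pyr__"):
    """Copy the tree at `src` to `dst`, skipping cache dirs and .pyc/.pyo files."""
    # ignore_patterns matches both directory and file names at every level.
    ignore = shutil.ignore_patterns(cache_dir_name, "*.pyc", "*.pyo")
    return shutil.copytree(src, dst, ignore=ignore)
```

Note that copytree expects `dst` not to exist yet.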

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-03 Thread Antoine Pitrou
Barry Warsaw writes:
> 
> As to the question of sibling directories or folder-per-folder I think
> performance issues should be the deciding factor.  There are file system
> limitations to consider (but also a wide variety of file systems in use).  Do
> the number of stat calls predominate the performance costs?  Maybe it makes
> sense to implement the two different approaches and do some measurements.

How about doing measurements /with the current implementation/? Everyone seems
to worry about stat() calls but there doesn't seem to be any figures to evaluate
their significance.
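
A rough starting point for such a measurement might look like the sketch below. It only times raw os.stat() calls on one existing and one missing file in a temporary directory; it is not a benchmark of the import machinery, and the helper name time_stat is made up for illustration.

```python
import os
import tempfile
import timeit

def time_stat(path, number=10000):
    """Return seconds spent on `number` os.stat() calls, tolerating missing files."""
    def probe():
        try:
            os.stat(path)
        except OSError:
            pass  # a miss is part of what is being measured
    return timeit.timeit(probe, number=number)

with tempfile.TemporaryDirectory() as directory:
    existing = os.path.join(directory, "present.pyc")
    open(existing, "wb").close()
    hit = time_stat(existing)
    miss = time_stat(os.path.join(directory, "absent.pyc"))
```

Comparing `hit` and `miss` against total interpreter startup time would show whether extra lookups matter on a given file system.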

Thanks

Antoine.




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-03 Thread M.-A. Lemburg
> On 03/02/2010 06:50, Barry Warsaw wrote:
>> As to the question of sibling directories or folder-per-folder I think
>> performance issues should be the deciding factor.  There are file system
>> limitations to consider (but also a wide variety of file systems in use).  Do
>> the number of stat calls predominate the performance costs?  Maybe it makes
>> sense to implement the two different approaches and do some measurements.

How about using an optionally relative cache dir setting to let
the user decide?

import imp, os

# Get cache dir, default to module_dir
cache_dir = os.environ.get('PYTHONCACHEDIR', '.')

# Get names and versions
module_cache_type = 'pyc'
module_cache_version = imp.get_magic().encode('hex')
module_name = module.__name__
module_cache_file = '%s.%s.%s' % (module_name, module_cache_version,
                                  module_cache_type)
module_dir = os.path.split(module.__file__)[0]

# Determine cache dir and cache file pathname
module_cache_dir = os.path.abspath(os.path.join(module_dir, cache_dir))
module_cache_pathname = os.path.join(module_cache_dir, module_cache_file)

# Write PYC data to module_cache_pathname
...
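
For readers trying this on a current interpreter, the same idea can be written in Python 3 roughly as follows. This is a sketch, not part of the proposal: PYTHONCACHEDIR is the hypothetical variable from the snippet above, module_cache_pathname is a made-up helper name, and importlib.util.MAGIC_NUMBER replaces imp.get_magic().

```python
import importlib.util
import os

def module_cache_pathname(module_file, module_name, environ=os.environ):
    """Python 3 rendering of the sketch above; PYTHONCACHEDIR is hypothetical."""
    # Get cache dir, default to the module's own directory.
    cache_dir = environ.get("PYTHONCACHEDIR", ".")
    cache_version = importlib.util.MAGIC_NUMBER.hex()
    cache_file = "%s.%s.pyc" % (module_name, cache_version)
    module_dir = os.path.dirname(module_file)
    # A relative cache_dir is resolved against module_dir; an absolute one
    # replaces it entirely, because os.path.join discards earlier parts.
    cache_root = os.path.abspath(os.path.join(module_dir, cache_dir))
    return os.path.join(cache_root, cache_file)
```

With the variable unset the cache file lands next to the source; a relative value such as `cache` gives a per-directory cache folder, and an absolute value gives one shared cache tree.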

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-03 Thread Michael Foord

On 03/02/2010 06:50, Barry Warsaw wrote:

> I have to say up front that I'm somewhat shocked at how quickly this thread
> has exploded!  Since I'm sprinting this week, I haven't thoroughly read every
> message and won't have time tonight to answer every question, but I'll try to
> pick out some common ideas.  I really appreciate everyone's input and will try
> to clarify the PEP where I can.
>
> It is probably not clear enough from the PEP, but I actually don't expect that
> most individual Python developers will use this feature.  This is why the -R
> flag exists and the behavior is turned off by default.

The fact that it doesn't affect most developers makes it the *perfect* 
opportunity to bikeshed... :-)

Michael

> When I'm developing
> some Python code in my home directory, I usually only use one Python version
> and even if I'm going to test it with multiple Python versions, I won't need
> to do this *simultaneously*.  I will generally blow away all build artifacts
> (including, but not limited to .pyc files) and then rebuild with the different
> Python version.
>
> I think that this feature will be limited mostly to distros, which have
> different use cases than individual developers.  But these are important use
> cases for Python to support nonetheless.
>
> My rationale for choosing the file system layout in the PEP was to try to
> present something more familiar to today's Python and to avoid radical
> reorganization of the way Python caches its byte code.  Thus having a sibling
> directory that differs from the source just by extension seemed more natural
> to me.
>
> Encoding the magic number in the file name under .pyr would I thought make the
> look up scheme more efficient since the import machinery can craft the file
> name directly.  I agree it's not very human friendly because nobody really
> knows which magic numbers are associated with which Python versions and flags.
>
> As to the question of sibling directories or folder-per-folder I think
> performance issues should be the deciding factor.  There are file system
> limitations to consider (but also a wide variety of file systems in use).  Do
> the number of stat calls predominate the performance costs?  Maybe it makes
> sense to implement the two different approaches and do some measurements.
>
> -Barry
   



   



--
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog





Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-03 Thread Glenn Linderman
On approximately 2/2/2010 7:05 PM, came the following characters from 
the keyboard of Guido van Rossum:

> On Tue, Feb 2, 2010 at 5:41 PM, Glenn Linderman  wrote:
>> On approximately 2/2/2010 4:28 PM, came the following characters from the
>> keyboard of Guido van Rossum:
>>> Argh. zipfiles are way too complex to be writing.
>>
>> Agreed.  But in reading that, it somehow triggered a question: does
>> zipimport only work for zipfiles, or does it work for any archive format
>> that Python stdlib knows how to decode?  And if only the former, why are
>> they so special?
>
> The former.
>
> They are special because (unlike e.g. tar files) you can read the
> table of contents of a zipfile without parsing the entire file.

They are not unique in this... most archive formats except tar have a 
directory.  But that is likely a good reason not to support tar for this 
purpose, especially since tar usually comes as .tar.Z or .tar.gz 
or .tar.bz2 etc. and would require two passes before the data could be 
found at all.

> Also
> because they are universally supported which makes it unnecessary to
> support other formats. Again, contrast tar files which are virtually
> unheard of on Windows.


This may well be true, at least for some definitions of Universal.  
However, the definition of Universal that matters to the discussion 
is all the platforms on which Python is supported... and certainly all 
those platforms have support for all the archive formats in Python's 
stdlib, eh?  Oh!  Sorry, I had jumped to the conclusion that the stdlib 
(because of the batteries included philosophy) supported things like 7z 
and rar files, since they've been around for years, but I see there is a 
limited selection there.  OK, I found the ticket that suggests adding 7z 
and nosied myself.  Didn't bother to look for rar, because I'm a 7z fan, 
and it has better compression factors in most cases.


--
Glenn

“Everyone is entitled to their own opinion, but not their own facts. In 
turn, everyone is entitled to their own opinions of the facts, but not 
their own facts based on their opinions.” -- Guy Rocha, retiring NV 
state archivist




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-03 Thread Floris Bruynooghe
On Wed, Feb 03, 2010 at 06:14:44PM +1100, Ben Finney wrote:
> Barry Warsaw  writes:
> 
> > I suppose this is going to be very subjective, but in skimming the
> > thread it seems like most people like putting the byte code cache
> > artifacts in subdirectories (be they siblings or folder-per-folder).
> 
> I don't understand the distinction you're making between those two
> options. Can you explain what you mean by each of “siblings” and
> “folder-per-folder”?

siblings: the original proposal, i.e.:

foo.py
foo.pyr/
    MAGIC1.pyc
    MAGIC1.pyo
    ...
bar.py
bar.pyr/
    MAGIC1.pyc
    MAGIC1.pyo
    ...

folder-per-folder:

foo.py
bar.py
__pyr__/
    foo.MAGIC1.pyc
    foo.MAGIC1.pyo
    foo.MAGIC2.pyc
    bar.MAGIC1.pyc
    ...

IIUC

Personally I'm +1 on the folder-per-folder option.
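
A minimal sketch of how a cache path could be derived under each layout.
The helper names and the literal "MAGIC1" tag are hypothetical and only
mirror the listings in this message; nothing here is taken from the PEP's
implementation.

```python
import os.path


def sibling_cache_path(source, magic, ext=".pyc"):
    """Sibling layout: foo.py -> foo.pyr/MAGIC.pyc, next to the source."""
    base, _ = os.path.splitext(source)
    return os.path.join(base + ".pyr", magic + ext)


def per_folder_cache_path(source, magic, ext=".pyc"):
    """Folder-per-folder layout: foo.py -> __pyr__/foo.MAGIC.pyc."""
    directory, filename = os.path.split(source)
    name, _ = os.path.splitext(filename)
    return os.path.join(directory, "__pyr__", name + "." + magic + ext)


example = os.path.join("pkg", "foo.py")
print(sibling_cache_path(example, "MAGIC1"))
print(per_folder_cache_path(example, "MAGIC1"))
```

Either way the path is computed from the source name alone, so the choice
between the two is about directory counts and clutter rather than lookup
logic.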


Floris


-- 
Debian GNU/Linux -- The Power of Freedom
www.debian.org | www.gnu.org | www.kernel.org


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-02 Thread Ben Finney
Barry Warsaw  writes:

> If you have to manually blow away a particular pyc file,
> folder-per-folder makes it much easier to find exactly what you want
> to blow away without having to search up the file system, and then back
> down again to find the pyc file to delete. How many ..'s does it take
> until you're lost in the twisty maze of ls?

I don't think keeping the cache files in a mass of intertwingled extra
subdirectories is the way to solve that problem. That speaks, rather, to
the need for Python to be able to find the file on behalf of the user
and blow it away on request, so the user doesn't need to go searching.

Possible interface (with spelling of options chosen hastily)::

    $ python foo.py                    # Use cached byte code if available.
    $ python --force-compile foo.py    # Unconditionally compile.

If removing the byte code file, without running the module, is what's
desired::

    $ python --delete-cache foo.py             # Delete cached byte code.
    $ rm $(python --show-cache-file foo.py)    # Same as above.

That should cover just about any common need for the user to know
exactly which byte code file corresponds to a given source file. That,
in turn, frees us to choose a less obtrusive location for the byte code
files than mingled in with the source.
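
A rough sketch of what "show" and "delete" helpers might look like. The
command-line flags above are hastily chosen spellings, not real options.
The sketch assumes a later Python in which
importlib.util.cache_from_source() exists (it is not part of this
discussion), and the two function names are hypothetical.

```python
import importlib.util
import os


def show_cache_file(source_path):
    """Return the path where the interpreter would cache byte code."""
    return importlib.util.cache_from_source(source_path)


def delete_cache(source_path):
    """Remove the cached byte code if present; True if a file was removed."""
    cache_path = show_cache_file(source_path)
    try:
        os.remove(cache_path)
    except FileNotFoundError:
        return False
    return True


print(show_cache_file("foo.py"))
```

The point is only that the interpreter can answer "where is the cache for
this source file?" so that the user never has to search for it.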

-- 
 \ “Pinky, are you pondering what I'm pondering?” “I think so, but |
  `\  where will we find an open tattoo parlor at this time of |
_o__)   night?” —_Pinky and The Brain_ |
Ben Finney



Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-02 Thread Ben Finney
Barry Warsaw  writes:

> I suppose this is going to be very subjective, but in skimming the
> thread it seems like most people like putting the byte code cache
> artifacts in subdirectories (be they siblings or folder-per-folder).

I don't understand the distinction you're making between those two
options. Can you explain what you mean by each of “siblings” and
“folder-per-folder”?

-- 
 \ “Pinky, are you pondering what I'm pondering?” “I think so, |
  `\   Brain, but Tuesday Weld isn't a complete sentence.” —_Pinky and |
_o__)   The Brain_ |
Ben Finney



Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-02 Thread Barry Warsaw
On Jan 31, 2010, at 09:30 PM, Martin v. Löwis wrote:

>If a single pyc folder is used, I think an additional __source__
>attribute would be needed to indicate what source file time stamp had
>been checked (if any) to determine that the byte code file is current.

This is a good point.  __file__ is ambiguous so I think a reasonable thing to
add to the PEP is clear semantics for extracting the source file name and the
cached file name from the module object.

Python 3 uses the .py file for __file__ but I'd like to see a transition to
__source__ for that, with __cache__ for the location of the PVM, JVM, LLVM or
whatever compilation cache artifact file.

I've added a note to my working update of the PEP.
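
For comparison, a small sketch of how the two names can be read from an
import spec. It assumes a later Python that provides
importlib.util.spec_from_file_location(); the spec's origin and cached
attributes are used here only as stand-ins for the proposed __source__
and __cache__, which are not defined anywhere yet.

```python
import importlib.util
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "demo.py")
    with open(source, "w") as f:
        f.write("x = 1\n")
    spec = importlib.util.spec_from_file_location("demo", source)
    source_name = spec.origin   # stands in for the proposed __source__
    cache_name = spec.cached    # stands in for the proposed __cache__

print(source_name)
print(cache_name)
```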

-Barry




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-02 Thread Barry Warsaw
On Jan 31, 2010, at 12:36 PM, Georg Brandl wrote:

>Not really -- much of the code I've seen that tries to guess the source
>file name from a __file__ value just does something like this:
>
>   if fname.lower().endswith(('.pyc', '.pyo')): fname = fname[:-1]
>
>That's not compatible with using .pyr, either.

The rationale for the .pyr extension is because I've usually seen (and
written) this instead:

    base, ext = os.path.splitext(fname)
    py_file = base + '.py'
    # ...or...
    if ext != '.py':
        continue

I think I rarely care what the extension is if it's not '.py'.
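
A small side-by-side sketch of the two idioms. It assumes, purely for
illustration, that __file__ could end up naming the foo.pyr directory;
the thread has not settled that, and the function names are made up.

```python
import os.path


def source_by_suffix(fname):
    """Strip the trailing 'c' or 'o' from .pyc/.pyo names only."""
    if fname.lower().endswith(('.pyc', '.pyo')):
        fname = fname[:-1]
    return fname


def source_by_splitext(fname):
    """Replace whatever extension is present with .py."""
    base, ext = os.path.splitext(fname)
    return base + '.py'


for name in ('foo.pyc', 'foo.pyr'):
    print(name, source_by_suffix(name), source_by_splitext(name))
```

Under that assumption the suffix-stripping idiom leaves foo.pyr untouched,
while the splitext idiom still arrives at foo.py.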

-Barry




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-02 Thread Barry Warsaw
On Jan 31, 2010, at 03:07 PM, Ben Finney wrote:

>In other words, my understanding is that the current PEP would have the
>following tree for an example project::
>
>foo/
>    __init__.py
>    __init__.pyr/
>        deadbeef.pyc
>        decafbad.pyc
>    lorem.py
>    lorem.pyr/
>        deadbeef.pyc
>        decafbad.pyc

[...etc...]

>That's a nightmarish mess of compiled files swamping the source files,
>as has been pointed out several times.

Except that I think it will be quite uncommon for typical Python developers to
be confronted with this.

>Could we instead have a single subdirectory for each tree of module
>packages, keeping them tidily out of the way of the source files, while
>making them located just as deterministically::

If we do not choose the sibling folder approach, I feel pretty strongly that
it ought to be more like the Subversion-like folder-per-folder approach than the
Bazaar-like folder-at-top-of-hierarchy approach.  If you have to manually blow
away a particular pyc file, folder-per-folder makes it much easier to find
exactly what you want to blow away without having to search up the file system,
and then back down again to find the pyc file to delete.  How many ..'s does
it take until you're lost in the twisty maze of ls?

-Barry




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-02 Thread Barry Warsaw
On Jan 31, 2010, at 01:44 PM, Nick Coghlan wrote:

>We deliberately don't document -U because its typical effect is "break the
>world" - it makes all strings unicode in 2.x.

As an aside, I think this should be documented *somewhere* other than just in
import.c!  I'd totally forgotten about it until I read the source and almost
missed it.  Either it should be documented or it should be ripped out.

-Barry




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-02 Thread Barry Warsaw
On Jan 30, 2010, at 11:21 PM, Vitor Bosshard wrote:

>Why not:
>
>foo.py
>foo.pyc # < 2.7 or < 3.2
>foo.27.pyc
>foo.32.pyc
>etc.

Because this clutters the module's directory more than it does today, which I
considered to be a negative factor.  And as others have pointed out, there
isn't a one-to-one relationship between Python version numbers and byte code
compatibility.

>I'd rather have a folder cluttered with files I know I can ignore (and
>can easily run a selective rm over) than one that is cluttered with
>subfolders.

I suppose this is going to be very subjective, but in skimming the thread it
seems like most people like putting the byte code cache artifacts in
subdirectories (be they siblings or folder-per-folder).

-Barry





Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-02 Thread Barry Warsaw
I have to say up front that I'm somewhat shocked at how quickly this thread
has exploded!  Since I'm sprinting this week, I haven't thoroughly read every
message and won't have time tonight to answer every question, but I'll try to
pick out some common ideas.  I really appreciate everyone's input and will try
to clarify the PEP where I can.

It is probably not clear enough from the PEP, but I actually don't expect that
most individual Python developers will use this feature.  This is why the -R
flag exists and the behavior is turned off by default.  When I'm developing
some Python code in my home directory, I usually only use one Python version
and even if I'm going to test it with multiple Python versions, I won't need
to do this *simultaneously*.  I will generally blow away all build artifacts
(including, but not limited to .pyc files) and then rebuild with the different
Python version.

I think that this feature will be limited mostly to distros, which have
different use cases than individual developers.  But these are important use
cases for Python to support nonetheless.

My rationale for choosing the file system layout in the PEP was to try to
present something more familiar to today's Python and to avoid radical
reorganization of the way Python caches its byte code.  Thus having a sibling
directory that differs from the source just by extension seemed more natural
to me.

Encoding the magic number in the file name under .pyr would, I thought, make the
look up scheme more efficient since the import machinery can craft the file
name directly.  I agree it's not very human friendly because nobody really
knows which magic numbers are associated with which Python versions and flags.
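
A sketch of crafting the cache file name directly from the running
interpreter's magic number, with no directory listing needed. It assumes
a later Python that exposes importlib.util.MAGIC_NUMBER; the hex spelling
of the name and the helper names are illustrative choices, not something
the PEP specifies.

```python
import importlib.util
import os.path


def magic_hex():
    """Hex spelling of the running interpreter's 4-byte magic number."""
    return importlib.util.MAGIC_NUMBER.hex()


def pyr_cache_name(source, ext=".pyc"):
    """Build foo.pyr/<magic>.pyc for foo.py without scanning the directory."""
    base, _ = os.path.splitext(source)
    return os.path.join(base + ".pyr", magic_hex() + ext)


print(pyr_cache_name("foo.py"))
```

The printed name also shows the human-friendliness problem: the hex
string says nothing obvious about which Python version produced it.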

As to the question of sibling directories or folder-per-folder I think
performance issues should be the deciding factor.  There are file system
limitations to consider (but also a wide variety of file systems in use).  Does
the number of stat calls dominate the performance costs?  Maybe it makes
sense to implement the two different approaches and do some measurements.

-Barry




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-02 Thread Bob Ippolito
On Sun, Jan 31, 2010 at 11:16 AM, Guido van Rossum  wrote:
> Whoa. This thread already exploded. I'm picking this message to
> respond to because it reflects my own view after reading the PEP.
>
> On Sun, Jan 31, 2010 at 4:13 AM, Hanno Schlichting  wrote:
>> On Sun, Jan 31, 2010 at 1:03 PM, Simon Cross
>>  wrote:
>>> I don't know whether I'm in favour of using a single pyr folder or not
>>> but if a single folder is used I'd definitely prefer the folder to be
>>> called __pyr__ rather than .pyr.
>
> Exactly what I would prefer. I worry that having many small
> directories is a fairly poor use of the filesystem. A quick scan of
> /usr/local/lib/python3.2 on my Linux box reveals 1163 .py files but
> only 57 directories.

I like this option as well, but why not just name the directory .pyc
instead of __pyr__ or .pyr? That way people probably won't even have
to reconfigure their tools to ignore it :)

-bob


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-02 Thread Guido van Rossum
On Tue, Feb 2, 2010 at 5:41 PM, Glenn Linderman  wrote:
> On approximately 2/2/2010 4:28 PM, came the following characters from the
> keyboard of Guido van Rossum:
>>
>> Argh. zipfiles are way too complex to be writing.
>
> Agreed.  But in reading that, it somehow triggered a question: does
> zipimport only work for zipfiles, or does it work for any archive format
> that Python stdlib knows how to decode?  And if only the former, why are
> they so special?

The former.

They are special because (unlike e.g. tar files) you can read the
table of contents of a zipfile without parsing the entire file. Also
because they are universally supported which makes it unnecessary to
support other formats. Again, contrast tar files which are virtually
unheard of on Windows.
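
A short sketch of the table-of-contents point using the stdlib zipfile
module. The member names and contents are made up, and the archive lives
in memory only to keep the example self-contained.

```python
import io
import zipfile

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("foo.pyc", b"fake byte code for foo")
    archive.writestr("bar.pyc", b"fake byte code for bar")

# Opening for reading loads the central directory stored at the end of
# the archive; member data is only read when a member is requested.
with zipfile.ZipFile(buffer) as archive:
    names = archive.namelist()
    foo_data = archive.read("foo.pyc")

print(names)
```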

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-02 Thread Glenn Linderman
On approximately 2/2/2010 4:28 PM, came the following characters from 
the keyboard of Guido van Rossum:

> Argh. zipfiles are way too complex to be writing.


Agreed.  But in reading that, it somehow triggered a question: does 
zipimport only work for zipfiles, or does it work for any archive format 
that Python stdlib knows how to decode?  And if only the former, why are 
they so special?


--
Glenn -- http://nevcal.com/
===
A protocol is complete when there is nothing left to remove.
-- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking



Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-02 Thread Guido van Rossum
Argh. zipfiles are way too complex to be writing. If you want to use
zipfiles, compile your whole world ahead of time, stuff it into a
zipfile, and install / distribute that. But for the automatic writing
of bytecode files as a side effect of importing the source code,
please let the filesystem do its job.

--Guido

On Tue, Feb 2, 2010 at 4:24 PM, Neil Schemenauer  wrote:
> Nick Coghlan  wrote:
>> Henning von Bargen wrote:
>>> The solution is so obvious:
>>>
>>> Why not use a .pyr file that is internally a zip file?
>
> I think a Zip file might be the right approach too.  Either you
> could have directories in the zip file for each version, e.g.
>
>    2.7/foo.pyc
>    3.3/foo.pyc
>    2.7/bar.pyc
>    3.3/bar.pyc
>
> Or a Zip directory for each module:
>
>    foo/2.7.pyc
>    foo/3.3.pyc
>
> I think you could get away without funky names because dot would
> always be in the version number.
>
> This would be implemented simply as an extension to the zip import
> mechanism we already have.  Using the zip format would allow people
> to use existing zip utilities to manipulate them.
>
>> Agreed this should be discussed in the PEP, but one obvious problem is
>> the speed impact. Picking up a file from a subdirectory is going to
>> introduce less overhead than unpacking it from a zipfile.
>
> I'm pretty sure it would be better than using directories.  A
> directory for every module is not performance friendly.  Really, our
> current module per file is not performance friendly.
>
> Zip files could use "store" as the compression method if you are
> really worried about CPU time.
>
>  Neil
>
>



-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-02 Thread Neil Schemenauer
Nick Coghlan  wrote:
> Henning von Bargen wrote:
>> The solution is so obvious:
>> 
>> Why not use a .pyr file that is internally a zip file?

I think a Zip file might be the right approach too.  Either you
could have directories in the zip file for each version, e.g.

2.7/foo.pyc
3.3/foo.pyc
2.7/bar.pyc
3.3/bar.pyc

Or a Zip directory for each module:

foo/2.7.pyc
foo/3.3.pyc

I think you could get away without funky names because dot would
always be in the version number.

This would be implemented simply as an extension to the zip import
mechanism we already have.  Using the zip format would allow people
to use existing zip utilities to manipulate them.

> Agreed this should be discussed in the PEP, but one obvious problem is
> the speed impact. Picking up a file from a subdirectory is going to
> introduce less overhead than unpacking it from a zipfile.

I'm pretty sure it would be better than using directories.  A
directory for every module is not performance friendly.  Really, our
current module per file is not performance friendly.

Zip files could use "store" as the compression method if you are
really worried about CPU time.
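
A minimal sketch of the "store" idea with the stdlib zipfile module,
using the per-version directory naming from above. The payload is a
placeholder rather than real byte code.

```python
import io
import zipfile

payload = b"\x00" * 1024  # placeholder for byte code; highly compressible

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as archive:
    archive.writestr("2.7/foo.pyc", payload)
    archive.writestr("3.3/foo.pyc", payload)

with zipfile.ZipFile(buffer) as archive:
    info = archive.getinfo("2.7/foo.pyc")
    stored_size = info.compress_size   # bytes occupied inside the archive
    original_size = info.file_size     # bytes of the original member

print(stored_size, original_size)
```

With ZIP_STORED the two sizes are equal, so reading a member costs no
decompression work, only the lookup and the copy.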

  Neil



Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-02 Thread Sebastian Rittau
On Sun, Jan 31, 2010 at 12:44:33PM +0100, Georg Brandl wrote:

> At least to me, this does not explain why an "unwanted" (why unwanted? If
> it's unwanted, set PYTHONDONTWRITEBYTECODE=1) directory is worse than an
> "unwanted" file.

A directory "feels" different than a file. For example, typing "ls" in my
shell prints regular files in black, but directories in bold and blue. File
managers and IDEs also highlight directories differently. In tree views,
directories have expander buttons that also make them stand out.

As a concrete example, have a look at these two screenshots:

  http://tinyurl.com/yz2fr6c and http://tinyurl.com/yg38uqt

In the first one, the subpackages stand out, while in the second one they
are hard to make out among the *.pyr directories. A directory just adds
more clutter than a file.

But overall I like the idea of using just a single __pycache__ or
__pyr__ directory per path. This would also reduce the *.pyc clutter.

 - Sebastian


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-02 Thread Larry Hastings


On Sun, Jan 31, 2010 at 1:03 PM, Simon Cross wrote:
> I don't know whether I'm in favour of using a single pyr folder or not
> but if a single folder is used I'd definitely prefer the folder to be
> called __pyr__ rather than .pyr.

Guido van Rossum wrote:
> Exactly what I would prefer. I worry that having many small
> directories is a fairly poor use of the filesystem. A quick scan of
> /usr/local/lib/python3.2 on my Linux box reveals 1163 .py files but
> only 57 directories.


Just to be clear: what should go in the __pyr__ folder?  I can see two 
possibilities:


1) All files go directly into __pyr__, a flat directory tree.
   foo.py
   bar.py
   __pyr__/
       foo.py.c.3160
       bar.py.c.3160

2) Each source file gets its own subdirectory of __pyr__.
   foo.py
   bar.py
   __pyr__/
       foo.py/
           c.3160
       bar.py/
           c.3160

2 makes it easier to clear the cache for a particular source file--just 
delete its matching directory.  The downside is that we're back to lots 
of small directories.  And it's not that onerous to do a "rm 
__pyr__/foo.py.*".  So I suspect you prefer option 1.
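
For option 1, a sketch of the same "rm __pyr__/foo.py.*" done from
Python. The helper name is hypothetical, and 3170 is an invented second
magic number used only to show that several versions are swept at once.

```python
import glob
import os
import tempfile


def clear_cache_for(directory, source_name):
    """Delete every __pyr__/<source_name>.* entry; return the names removed."""
    prefix = os.path.join(directory, "__pyr__", source_name)
    removed = []
    for path in sorted(glob.glob(glob.escape(prefix) + ".*")):
        os.remove(path)
        removed.append(os.path.basename(path))
    return removed


with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "__pyr__"))
    for name in ("foo.py.c.3160", "foo.py.c.3170", "bar.py.c.3160"):
        open(os.path.join(tmp, "__pyr__", name), "w").close()
    removed = clear_cache_for(tmp, "foo.py")
    remaining = os.listdir(os.path.join(tmp, "__pyr__"))

print(removed, remaining)
```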



The proposal gets a +1 from me,


/larry/


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-01 Thread Paul Du Bois
>> The python I use (win32 2.6.2) does not complain if it cannot read
>> from or write to a .pyc; and thus it handles multiple python processes
>> trying to create .pyc files at the same time. Is the .zip case really
>> any different?

[ snip discussion of difficulty of writing a sharing-safe update ]

On Mon, Feb 1, 2010 at 2:28 PM, "Martin v. Löwis"  wrote:
> So what would you do for concurrent writers, then? The current
> implementation relies on creat(O_EXCL) to be atomic, so a second
> writer would just fail. This is about the only IO operation that is
> guaranteed to be atomic (along with mkdir(2)), so reusing the current
> approach doesn't work.

Sorry, I'm guilty of having assumed that the POSIX API has an
operation analogous to win32 CreateFile(GENERIC_WRITE, 0 /* ie,
"FILE_SHARE_NONE"*/).

If shared-reader/single-writer semantics are not available, the only
other possibility I can think of is to avoid opening the .pyc for
write. To write a .pyc one would read it, write and flush updates to a
temp file, and rename(). This isn't atomic, but given the invariant
that the .pyc always contains consistent data, the new file will also
only contain consistent data. Races manifest as updates getting lost.

One obvious drawback is that the .pyc inode would change on every update.
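
A sketch of the write-to-temp-then-rename step, assuming a later Python
where os.replace() is available (the rename is atomic on POSIX). The
function name is hypothetical, and the read-and-merge step that precedes
the write is left out, so lost updates remain possible exactly as
described above.

```python
import os
import tempfile


def replace_cache_file(path, data):
    """Write data to a temp file beside path, then rename it into place."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)  # readers see the old file or the new one
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        try:
            os.remove(tmp_path)
        except FileNotFoundError:
            pass
        raise


with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, "foo.pyc")
    replace_cache_file(target, b"first")
    replace_cache_file(target, b"second")
    with open(target, "rb") as f:
        final = f.read()
    leftovers = os.listdir(tmp_dir)

print(final, leftovers)
```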

paul


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-01 Thread Martin v. Löwis
> The python I use (win32 2.6.2) does not complain if it cannot read
> from or write to a .pyc; and thus it handles multiple python processes
> trying to create .pyc files at the same time. Is the .zip case really
> any different? Since .pyc files are an optimization, it seems natural
> and correct that .pyc IO errors pass silently (apologies to Tim).
> 
> It's an interesting challenge to write the file in such a way that
> it's safe for a reader and writer to co-exist. 

I grant you that this may actually work for concurrent readers
(although on Windows, you'll have to pick the file share mode
carefully). The reader would have to be fairly robust, as the central
directory may disappear or get garbled while it is reading.

So what would you do for concurrent writers, then? The current
implementation relies on creat(O_EXCL) to be atomic, so a second
writer would just fail. This is about the only IO operation that is
guaranteed to be atomic (along with mkdir(2)), so reusing the current
approach doesn't work.
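
A small sketch of the exclusive-create behaviour, written with os.open()
and the O_CREAT | O_EXCL flags. It only illustrates why a second writer
fails; it is not the code in import.c, and the function name is made up.

```python
import os
import tempfile


def try_create_exclusive(path, data):
    """Create path only if it does not exist yet; False if it already does."""
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return True


with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, "foo.pyc")
    first = try_create_exclusive(target, b"writer one")
    second = try_create_exclusive(target, b"writer two")
    with open(target, "rb") as f:
        contents = f.read()

print(first, second, contents)
```

This works for one file per writer, but gives no help when several
writers want to update a single shared archive.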

Regards,
Martin


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-01 Thread Paul Du Bois
> On Mon, Feb 1, 2010 at 13:19, "Martin v. Löwis"  wrote:
>> How do you write to a zipfile while others are reading it?

On Mon, Feb 1, 2010 at 1:23 PM, Brett Cannon  wrote:
> By hating concurrency (i.e. I don't have an answer which kills my idea).

The python I use (win32 2.6.2) does not complain if it cannot read
from or write to a .pyc; and thus it handles multiple python processes
trying to create .pyc files at the same time. Is the .zip case really
any different? Since .pyc files are an optimization, it seems natural
and correct that .pyc IO errors pass silently (apologies to Tim).

It's an interesting challenge to write the file in such a way that
it's safe for a reader and writer to co-exist. Like Brett, I
considered an append-only scheme, but one needs to handle the case
where the bytecode for a particular magic number changes. At some
point you'd need to sweep garbage from the file. All solutions seem
unnecessarily complex, and unnecessary since in practice the case
should not come up.

paul


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-01 Thread Brett Cannon
On Mon, Feb 1, 2010 at 13:19, "Martin v. Löwis"  wrote:
>> And I disagree this would be difficult as the PEP suggests given the
>> proper file format. For zip files zipimport already has the read code
>> in C; it just would require the code to write to a zip file. And as
>> for the format I mentioned above, that's dead-simple to implement.
>
> How do you write to a zipfile while others are reading it?
>

By hating concurrency (i.e. I don't have an answer which kills my idea).

-Brett


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-01 Thread Martin v. Löwis
> And I disagree this would be difficult as the PEP suggests given the
> proper file format. For zip files zipimport already has the read code
> in C; it just would require the code to write to a zip file. And as
> for the format I mentioned above, that's dead-simple to implement.

How do you write to a zipfile while others are reading it?

Regards,
Martin


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-01 Thread Antoine Pitrou
On Mon, 01 Feb 2010 11:35:19 -0800, Brett Cannon wrote:
> 
> As others have said, an uncompressed zip file could work here. Or even a
> file format where the first 4 bytes are the timestamp and then after that
> are chunks of length-of-bytecode|magic|bytecode. That allows for opening
> a file in append mode to add more bytecode instead of a zipfile's
> requirement of rewriting the TOC on the end of the file every time you
> mutate the file (if I remember the zip file format correctly).

Making the file append-only doesn't eliminate the problems with 
concurrent modification. You still have to specify and implement a robust 
cross-platform file locking system which will have to be shared by all 
implementations. This is really a great deal of complication to add to 
the interpreter(s).

And, besides, it might not even work on NFS which was the motivation for 
your proposal :)




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-01 Thread Brett Cannon
On Sun, Jan 31, 2010 at 11:04, Raymond Hettinger
 wrote:
>
> On Jan 30, 2010, at 4:00 PM, Barry Warsaw wrote:
>> Abstract
>> ========
>>
>> This PEP describes an extension to Python's import mechanism which
>> improves sharing of Python source code files among multiple installed
>> different versions of the Python interpreter.
>
> +1
>
>
>>  It does this by
>> allowing many different byte compilation files (.pyc files) to be
>> co-located with the Python source file (.py file).
>
> It would be nice if all the compilation files could be tucked
> into one single zipfile per directory to reduce directory clutter.
>
> It has several benefits besides tidiness. It hides the implementation
> details of when magic numbers get shifted.  And it may allow faster
> start-up times when the zipfile is in the disk cache.

It also eliminates stat calls. I have not seen anyone mention this,
but on filesystems where stat calls are expensive (e.g. NFS), this is
going to increase import cost (and thus startup time which some people
are already incredibly paranoid about). You are now going to shift
from a single stat call to check for a bytecode file to two just in
the search phase *per file check* (remember you need to search for
module.py and module/__init__.py). And then you get to repeat all of
this during the load process (potentially, depending on how aggressive
the loader is with caching).

As others have said, an uncompressed zip file could work here. Or even
a file format where the first 4 bytes are the timestamp and then after
that are chunks of length-of-bytecode|magic|bytecode. That allows for
opening a file in append mode to add more bytecode instead of a
zipfile's requirement of rewriting the TOC on the end of the file
every time you mutate the file (if I remember the zip file format
correctly). Biggest cost in this simple approach would be reading the
file in (unless you mmap the thing when possible) since once read the
code will be a bytes object which means constant time indexing until
you find the right magic number. And adding support to differentiate
between -O bytecode is simply adding a marker per chunk of bytecode.
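
A rough sketch of such a chunked format. The field widths (4-byte
little-endian length, 4-byte magic) and all helper names are assumptions
made for the example; only the overall shape, a timestamp followed by
length|magic|bytecode chunks, comes from the description above.

```python
import io
import struct

HEADER = struct.Struct("<I")    # 4-byte source timestamp
CHUNK = struct.Struct("<I4s")   # byte code length, then 4-byte magic


def write_header(stream, timestamp):
    stream.write(HEADER.pack(timestamp))


def append_chunk(stream, magic, bytecode):
    """Append one length|magic|bytecode chunk to the end of the stream."""
    stream.write(CHUNK.pack(len(bytecode), magic))
    stream.write(bytecode)


def find_bytecode(data, magic):
    """Return (timestamp, bytecode) for magic, or (timestamp, None)."""
    (timestamp,) = HEADER.unpack_from(data, 0)
    offset = HEADER.size
    while offset < len(data):
        length, chunk_magic = CHUNK.unpack_from(data, offset)
        offset += CHUNK.size
        if chunk_magic == magic:
            return timestamp, data[offset:offset + length]
        offset += length  # skip this chunk's payload
    return timestamp, None


stream = io.BytesIO()
write_header(stream, 1265000000)
append_chunk(stream, b"AAAA", b"code-a")
append_chunk(stream, b"BBBB", b"code-bb")
data = stream.getvalue()
print(find_bytecode(data, b"BBBB"))
```

The sketch returns the first chunk whose magic matches and does nothing
about concurrent appends or about replacing stale byte code for a magic
number that is already present.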

And I disagree with the PEP's suggestion that this would be difficult,
given the proper file format. For zip files zipimport already has the
read code in C; it would just require the code to write to a zip file.
And as for the format I mentioned above, that's dead-simple to implement.

-Brett


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-01 Thread Brett Cannon
On Sun, Jan 31, 2010 at 11:16, Guido van Rossum  wrote:
> Whoa. This thread already exploded. I'm picking this message to
> respond to because it reflects my own view after reading the PEP.
>
> On Sun, Jan 31, 2010 at 4:13 AM, Hanno Schlichting  wrote:
>> On Sun, Jan 31, 2010 at 1:03 PM, Simon Cross
>>  wrote:
>>> I don't know whether I'm in favour of using a single pyr folder or not
>>> but if a single folder is used I'd definitely prefer the folder to be
>>> called __pyr__ rather than .pyr.
>
> Exactly what I would prefer. I worry that having many small
> directories is a fairly poor use of the filesystem. A quick scan of
> /usr/local/lib/python3.2 on my Linux box reveals 1163 .py files but
> only 57 directories.
>
>> Do you have any specific reason for that?
>>
>> Using the leading dot notation is an established pattern to hide
>> non-essential information from directory views. What makes this
>> non-applicable in this situation and a custom Python notation better?
>
> Because we don't want to completely hide the pyc files. Also the dot
> naming convention is somewhat platform-specific.
>
> FWIW in Python 3, the __file__ variable always points to the .py
> source filename. I agreed with Georg that there ought to be an API for
> finding the pyc file for a module. This could be a small addition to
> the PEP.

Importlib somewhat does this already through a module's loader:
http://docs.python.org/py3k/library/importlib.html#importlib.abc.PyPycLoader.bytecode_path
. If you want to work off of module names this is enough; if importlib
did the import then you can do __loader__.bytecode_path(__name__). And
if it has not been loaded yet then that simply requires me exposing an
importlib.find_module() that returns a loader for the module.
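
For readers on a current Python 3, a hedged sketch of the same lookup: importlib.util.cache_from_source and importlib.util.source_from_cache exist in later releases and map between a source path and its bytecode path. The wrapper names below are made up for illustration, and the sketch assumes the default cache location (no cache prefix configured).

```python
import importlib.util
import sys


def bytecode_path_for(source_path):
    """Return where this interpreter would cache unoptimized bytecode."""
    # optimization="" asks for the plain .pyc name regardless of -O flags.
    return importlib.util.cache_from_source(source_path, optimization="")


def source_path_for(bytecode_path):
    """Return the source path a cached bytecode file belongs to."""
    return importlib.util.source_from_cache(bytecode_path)


if __name__ == "__main__":
    cached = bytecode_path_for("pkg/foo.py")
    print(cached)
    print(source_path_for(cached))
```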

Trick comes down to when you want it based on __file__ instead of the
module name. Oh, and me finally breaking up import so that it has
proper loaders or bootstrapping importlib; small snag. =) But at least
the code already exists for this stuff.

-Brett

>
> --
> --Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-01 Thread M.-A. Lemburg
Raymond Hettinger wrote:
> 
> On Jan 30, 2010, at 4:00 PM, Barry Warsaw wrote:
>> Abstract
>> 
>>
>> This PEP describes an extension to Python's import mechanism which
>> improves sharing of Python source code files among multiple installed
>> different versions of the Python interpreter.
> 
> +1 

+1 from here as well.

>>  It does this by
>> allowing many different byte compilation files (.pyc files) to be
>> co-located with the Python source file (.py file).  

+1 on the idea of having a standard for Python module cache
files.

+1 on having those files in the same directory as the associated
module file, just like we already do.

-1 on the idea of using directories for these. This only
complicates cleanup, management and distribution of such
files. Perhaps we could make this an option, though.

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-01 Thread Antoine Pitrou

> Would you still be a -1 on making the new scheme the default if it
> used a single cache directory instead? That would actually be cleaner
> than the current solution rather than messier.

Well, I guess no, although additional directories are always more
intrusive than additional files (visually, or with tools such as "du"
for example).





Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-01 Thread Gertjan Klein
Hanno Schlichting wrote:

>+1 for a single strategy that is used in all cases. The current
>solution could be phased out across multiple releases, but in the end
>there should be a single approach and no flag. Otherwise some code and
>tools will only support one of the approaches, especially if this is
>seen as something "only a minority of Linux distributions uses".

-1. As far as I can tell, this PEP proposes to solve a specific problem
that Linux distributions have. As they have decent package managers,
this PEP makes their maintainers' lives a lot easier. If implemented, I
believe it would eventually be used by all of them, not just "a
minority".

For just about anyone else, I believe the current situation works
perfectly fine, and should not be changed. Personally, I work mainly on
Windows, and things I install are placed in the site-packages directory
of the Python version I use. There is no need to place .pyc files in
subdirectories there, as there will only ever be one. Programs I write
myself are also rarely, if ever, run by multiple Python versions. They
get run by the default Python on my system; if I change the default, the
.pyc files get overwritten, which is exactly what I want; I no longer
need the old ones.

As to the single cache directory per directory versus per .py file
issue: a subdirectory per .py file is easier to manipulate manually;
listing the .py file and the subdirectory containing the compiled
versions belonging to it makes it somewhat easier to prevent errors due
to deleting the source but not the compiled version. However, as the
use-case for this PEP seems to be to make life easier for Linux
packagers, it seems that a single __pycache__ subdirectory (or whatever
the name would be) is preferable: less filesystem clutter, and no risks
of forgetting to delete .pyc files, as this is about system-managed
Python source.

Gertjan.





Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-01 Thread Martin v. Löwis
> 3. In each top level directory on sys.path, shadow file hierarchy
>   Major Pro: trivial to separate out all cached files
>   Major Con: ??? (I got nuthin')

The major con of this option (and option 2) is ambiguity about where to
look in the case of packages. In particular for namespace packages
(of the setuptools kind, or the PEP 382 kind), the directory where a
package is found on sys.path can change across Python runs.

So when you run Python several times, and install additional eggs
in-between, you get different directories all caching the same pyc
files. If you then uninstall some of the eggs, it may be difficult to
find out what pyc files to delete.

> Note that with option two, creating a bytecode only zipfile would be
> trivial: just add the __pycache__ directory as the top-level directory
> in the zipfile and leave out everything else (assume there were no data
> files in the package that were still needed).

I think any scheme that uses directories for pyc files will cause stale
pyc files to be located on disk. I then think it is important to never
automatically use these in imports - i.e. only ever consider a file in
a __pycache__ directory if you found a .py file earlier.

If that is the policy, then a __pycache__ directory in a zipfile would
have no effect (and rightly so). Instead, to run code from bytecode,
the byte code files should be on sys.path themselves (probably still
named the same way as they are named inside __pycache__).
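
A small sketch of that policy, assuming the per-directory __pycache__ layout and the cpython-27 tag used elsewhere in this thread. The function name find_cached and the demo helper are hypothetical; the point is only that the cache is consulted after the source has been found, never before.

```python
import os
import tempfile


def find_cached(directory, name, tag="cpython-27"):
    """Return the cached pyc path only if the matching .py source exists."""
    source = os.path.join(directory, name + ".py")
    if not os.path.isfile(source):
        return None  # no source found, so a leftover pyc is never used
    cached = os.path.join(directory, "__pycache__", "%s.%s.pyc" % (name, tag))
    return cached if os.path.isfile(cached) else None


def demo():
    """Show that a stale pyc is ignored until its source file appears."""
    with tempfile.TemporaryDirectory() as root:
        cache = os.path.join(root, "__pycache__")
        os.mkdir(cache)
        open(os.path.join(cache, "foo.cpython-27.pyc"), "wb").close()
        stale = find_cached(root, "foo")
        open(os.path.join(root, "foo.py"), "w").close()
        fresh = find_cached(root, "foo")
        return stale, os.path.basename(fresh)
```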

Regards,
Martin


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Ben Finney
Nick Coghlan  writes:

> Would you still be a -1 on making the new scheme the default if it
> used a single cache directory instead? That would actually be cleaner
> than the current solution rather than messier.

+0 on a default of “store compiled bytecode files in a single cache
directory”. It is indeed cleaner than the current default.

I'm only +0 because I don't know whether that actually addresses the use
case that raised the issue to begin with, so I'm postponing judgement
until those who want this change in the first place chime in.

-- 
 \ “Once consumers can no longer get free music, they will have to |
  `\buy the music in the formats we choose to put out.” —Steve |
_o__)  Heckler, VP of Sony Music, 2001 |
Ben Finney



Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Martin v. Löwis
> But I don't understand how this answers the question.  If the
> python26-zope.sendmail package doesn't run setup.py, then a
> python-zope.sendmail package where you specify at install time which
> directory to install the files to isn't going to run setup.py, either.
> If the only difference between a packaged python27-zope.sendmail and a
> packaged python26-zope.sendmail is the directory to which the files get
> written, why can't that be controlled at install time?

It certainly would be possible to copy the files into each Python's
site-packages. They have a system in place that does that, except that
it symlinks the files rather than copying them.

> Well, I certainly don't want the conversation to take a few more months.
> I'm not against the PEP, I'm making my comments and asking my questions
> in the spirit of making it a high quality PEP.  If the motivation is
> "the Debian devs have concluded, after years of experimentation...",
> then I suppose that's what should go in the motivation section.

I guess Barry will have to explain what the problem with the current
scheme is.

Regards,
Martin


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Nick Coghlan
Silke von Bargen wrote:
> 
>> That still leaves the question of what to do with __file__ (for which
>> even the solution in the PEP isn't particularly clean). Perhaps the
>> thing to do there is to have __file__ always point to the source file
>> and introduce a __file_cached__ that points to the bytecompiled file on
>> disk (set to None if it doesn't exist, as may be the case for __main__
>> or due to writing of bytecode files being disabled).
> And what if there isn't a source file, because I want to deploy the
> byte-code only?
> This is possible now, but would be impossible if there was this kind of
> distinction.

For a bytecode only deployment, __file__ would point to where the source
file *would* be if it was there while __file_cached__ would point to the
precompiled byte code.

Yes, this would be backwards incompatible for some uses of execfile in
conjunction with __file__ but those should be much rarer than uses of
__file__ to locate source code (which break with bytecode only
deployment anyway) and to find colocated resource files (which only care
about the path to the file and not the filename itself).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Nick Coghlan
Martin v. Löwis wrote:
>> Exactly. How would you define where the pyr folder goes? At the root
>> of a package? What if I delete the __init__.py file there? Will the
>> existing pyr folder be orphaned and a new one created in each
>> subfolder? Unlike VCS working copies, the package / module / script
>> hierarchy is not formally defined in python.
> 
> The module name could guide the location. If you are importing
> xml.dom.minidom, it could put the pyc file into a sibling of the pyc
> folder for xml (under the name xml.dom.minidom.).
> 
> If you then remove __init__, you are no longer able to import xml.dom,
> but you might import dom.minidom (assuming you put the xml folder into
> sys.path). Then, a new pyc file would be created in the pyc folder for
> the dom package.

I see three possible logical locations for the Python cache directories:

1. In each directory containing Python source files.
  Major Pro: easy to keep source files associated with their cached versions
  Major Con: proliferation of cache directories

2. In each top level directory on sys.path, flat file structure
  Major Pro: trivial to separate out all cached files
  Major Con: for path locations like the top of the standard lib, the
  cache directory would get a *lot* of entries

3. In each top level directory on sys.path, shadow file hierarchy
  Major Pro: trivial to separate out all cached files
  Major Con: ??? (I got nuthin')

I didn't list a single global cache directory as a viable option as it
would create some nasty naming conflicts due to runs with different
sys.path entries and would make it impossible to create zipfiles with
precached bytecode files.

Note that with option two, creating a bytecode only zipfile would be
trivial: just add the __pycache__ directory as the top-level directory
in the zipfile and leave out everything else (assume there were no data
files in the package that were still needed).

Packages would still be identifiable by the existence of the cached pyc
file for their __init__ modules.

Going back to my previous example (with one extra source file to show
how a top-level module would be handled), scheme 2 would give:

module.py
package/
  __init__.py
  foo.py
  subpackage/
    __init__.py
    bar.py
__pycache__/
  module.cpython-27.pyc
  module.cpython-27.pyo
  package.__init__.cpython-27.pyc
  package.__init__.cpython-27.pyo
  package.foo.cpython-27.pyc
  package.foo.cpython-27.pyo
  package.subpackage.__init__.cpython-27.pyc
  package.subpackage.__init__.cpython-27.pyo
  package.subpackage.bar.cpython-27.pyc
  package.subpackage.bar.cpython-27.pyo

While scheme 3 would look like:

module.py
package/
  __init__.py
  foo.py
  subpackage/
    __init__.py
    bar.py
__pycache__/
  module.cpython-27.pyc
  module.cpython-27.pyo
  package/
    __init__.cpython-27.pyc
    __init__.cpython-27.pyo
    foo.cpython-27.pyc
    foo.cpython-27.pyo
    subpackage/
      __init__.cpython-27.pyc
      __init__.cpython-27.pyo
      bar.cpython-27.pyc
      bar.cpython-27.pyo

For comparison, here is what it would look like under scheme 1:

module.py
package/
  __init__.py
  foo.py
  subpackage/
    __init__.py
    bar.py
    __pycache__/
      __init__.cpython-27.pyc
      __init__.cpython-27.pyo
      bar.cpython-27.pyc
      bar.cpython-27.pyo
  __pycache__/
    __init__.cpython-27.pyc
    __init__.cpython-27.pyo
    foo.cpython-27.pyc
    foo.cpython-27.pyo
__pycache__/
  module.cpython-27.pyc
  module.cpython-27.pyo

And the initial version proposed in the PEP:

module.py
module.pyr/
  cpython-27.pyc
  cpython-27.pyo
package/
  __init__.py
  __init__.pyr/
    cpython-27.pyc
    cpython-27.pyo
  foo.py
  foo.pyr/
    cpython-27.pyc
    cpython-27.pyo
  subpackage/
    __init__.py
    __init__.pyr/
      cpython-27.pyc
      cpython-27.pyo
    bar.py
    bar.pyr/
      cpython-27.pyc
      cpython-27.pyo

My major concern with scheme 2 is the possibility of directory size
limits affecting the caching of files, but scheme 3 looks pretty good to
me (with the higher level cache linked to the directory that is actually
on sys.path, the cache locations aren't as arbitrary as I originally
feared).
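
The three layouts can be summarised as a mapping from a dotted module name to a cache path. The sketch below is only an illustration of the listings in this message: the function name cache_path, its parameters, and the fixed cpython-27 tag are assumptions, and only .pyc names are produced.

```python
import posixpath

TAG = "cpython-27"  # tag used in the listings in this message


def cache_path(module, is_package, scheme, root=""):
    """Map a dotted module name to its .pyc path under scheme 1, 2 or 3.

    root is the sys.path entry the module was found on.
    """
    parts = module.split(".")
    if is_package:
        parts.append("__init__")
    *dirs, leaf = parts
    filename = "%s.%s.pyc" % (leaf, TAG)
    if scheme == 1:   # a __pycache__ next to each source file
        return posixpath.join(root, *dirs, "__pycache__", filename)
    if scheme == 2:   # one flat __pycache__ per sys.path entry
        return posixpath.join(root, "__pycache__", ".".join(dirs + [filename]))
    if scheme == 3:   # shadow hierarchy under one __pycache__
        return posixpath.join(root, "__pycache__", *dirs, filename)
    raise ValueError("unknown scheme: %r" % (scheme,))
```

For example, cache_path("package.foo", False, 2) gives the flat name package.foo.cpython-27.pyc inside the top-level __pycache__, while scheme 3 places foo.cpython-27.pyc under __pycache__/package/.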

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Nick Coghlan
Antoine Pitrou wrote:
> Le Sat, 30 Jan 2010 21:04:14 -0800, Jeffrey Yasskin a écrit :
>> I have a couple bikesheddy or "why didn't you do this" comments. I'll be
>> perfectly satisfied with an answer or a line in the pep.
>>
>> 1. Why the -R flag? It seems like this is a uniform improvement, so it
>> should be the default. Have faith in your design! ;-)
> 
> -1 for making it a default. It is definitely ugly and useless for most 
> cases. It is fine as long as it is optional and merely used by the Debian/
> Ubuntu installers.

Would you still be a -1 on making the new scheme the default if it
used a single cache directory instead? That would actually be cleaner
than the current solution rather than messier.

Regards,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread R. David Murray
On Sun, 31 Jan 2010 19:48:19 +0100, "Martin v. Löwis" wrote:
> > By the way, the part that caused me the most confusion in the language
> > in the PEP was the emphasized *and their dependencies*, as if a package
> > having dependencies somehow turned the problem into a factorial explosion.
> > But there seems to be nothing special, according to your explanation,
> > about dependencies in this scheme.
> 
> For regular (forward) dependencies, there is indeed nothing special to
> consider - they would have to exist in all versions. In practice, this
> can (and was) problematic: python-zope.sendmail depends on
> python-pkg-resources, python-transaction, python-zope, and 10 other
> > things. Before you could start to provide python27-zope.sendmail,
> all of these dependencies would have to become available in a 2.7
> version first, meaning that ten other Debian developers need to act
> before you can. With the failure rate of Debian developers (who go
> as often on holidays as any other volunteer), upgrading to a new Python
> release could often take many months.

OK, that makes it clearer.  It's an internal (and probably unavoidable)
Debian social problem, not a technical one, and I see why it is an
important issue.

> > It seems like it would be simple enough to enhance the os packaging
> > systems to allow the install path to be specified at install time, if
> > that really is the only difference between the package versions.  And a
> > script that runs through all the installed python packages and installs
> > them for a new Python version when a new version is installed should be
> > as easy for other distributions as it is for Gentoo.
> 
> However, it's also unacceptable. I can't cite the exact piece of Debian
> policy, but I'm fairly sure that "build" activities are not allowed at
> > installation time. So actually running setup.py files is out of the
> > question. Users who want such a thing would have to switch to Gentoo;
> Debian users just want it to work :-)

I'm less sympathetic to problems created by rigid policies, but that
doesn't mean I'm not sympathetic :)

But I don't understand how this answers the question.  If the
python26-zope.sendmail package doesn't run setup.py, then a
python-zope.sendmail package where you specify at install time which
directory to install the files to isn't going to run setup.py, either.
If the only difference between a packaged python27-zope.sendmail and a
packaged python26-zope.sendmail is the directory to which the files get
written, why can't that be controlled at install time?  Writing files
to a directory must be an install activity, not a build activity.  If the
issue is that *deciding* what directory to install to is a build time
activity...well, maybe I would be less sympathetic to a policy that is
*that* rigid.

> > (The os vendors are going to have
> > to change details of their packaging systems if the PEP is accepted,
> > so it's not as if the PEP saves the vendors work.)
> 
> Again, I'm a little bit unclear on the motivation, also. I think it
> mostly is "after years of experimentation, we have run out of ideas
> how to solve all related problems simultaneously without changing
> Python, so let's look for options that do involve changing Python".
> 
> If you *really* want a list of all the simultaneous problems that
> need to be solved, and an explanation of why each individual solution
> has flaws, prepare for this conversation to take a few more weeks.

Well, I certainly don't want the conversation to take a few more months.
I'm not against the PEP, I'm making my comments and asking my questions
in the spirit of making it a high quality PEP.  If the motivation is
"the Debian devs have concluded, after years of experimentation...",
then I suppose that's what should go in the motivation section.

--
R. David Murray  www.bitdance.com
Business Process Automation - Network/Server Management - Routers/Firewalls


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Terry Reedy

On 1/31/2010 4:26 PM, Tim Delaney wrote:

> The pyc/pyo files are just an optimisation detail, and are essentially
> temporary.

The .pycs for /Lib and similar are *not* temporary in the sense you are 
using. They are effectively permanent for as long as the version is 
installed. They should *not* be routinely trashed as they are not 
obsolete and nearly always will be reused.


Terry Jan Reedy



Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Curt Hagenlocher
On Sun, Jan 31, 2010 at 11:16 AM, Terry Reedy  wrote:

>
> 'pycache' would be pretty clear.
>
Heh -- without the underscores, I read this as "pyc ache". Seems
appropriate.

--
Curt Hagenlocher
c...@hagenlocher.org


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Scott Dial
On 1/31/2010 2:04 PM, Raymond Hettinger wrote:
> On Jan 30, 2010, at 4:00 PM, Barry Warsaw wrote:
>> It does this by
>> allowing many different byte compilation files (.pyc files) to be
>> co-located with the Python source file (.py file).  
> 
> It would be nice if all the compilation files could be tucked
> into one single zipfile per directory to reduce directory clutter.
> 
> It has several benefits besides tidiness. It hides the implementation
> details of when magic numbers get shifted.  And it may allow faster
> start-up times when the zipfile is in the disk cache.
> 

On a whim, I implemented a PEP 302 loader that cached any import that
it could find in sys.path into a zip file.

I used running bzr as a startup benchmark, and I did my best to ensure
an empty cache by running "sync; echo 3 > /proc/sys/vm/drop_caches; time
bzr". On my particular machine, the "real" time was at minimum 3.5
seconds without using my ZipFileCacheLoader. With the loader, I found
the same was true. The average performance was all over the place (due
to everything else in the operating system trying to fetch from the
disk), and I lack enough data points to reach statistical significance.

However, if the ".pyr" zip file is going to contain many versions of the
same module, then the performance impact could be more real, since you
would be forced to pull from disk *all* of the versions of a given module.

-- 
Scott Dial
sc...@scottdial.com
scod...@cs.indiana.edu


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Martin v. Löwis
>I can also see advantages to allowing out of tree compiled cache
> directories. For example, you could have a locked down .py tree with
> .pycs going into per-user trees. This prevents another user from
> spoofing a .pyc I use as well as allowing users to install arbitrary
> versions of Python without getting an admin to compile the .py tree
> with the new compiler.

This is PEP 304, which has been withdrawn by its author. While there
is some relationship with PEP 3147, the two address orthogonal issues.

Regards,
Martin

