Re: [Python-Dev] PEP 428: Pathlib

2013-09-16 Thread Antoine Pitrou
On Sun, 15 Sep 2013 06:46:08 -0700,
Ethan Furman wrote:
> I see PEP 428 is both targeted at 3.4 and still in draft status.
> 
> What remains to be done to ask for pronouncement?

I think I have a couple of items left to integrate in the PEP.
Mostly it needs me to take a bit of time and finalize the PEP, and
then have a PEP delegate (or Guido) pronounce on it.

That's unless someone else wants to add it, of course. I don't mind
suggestions, as long as they don't detract from the short-/middle-term
goal of getting the PEP approved :)

Note that the pathlib API has to be provisional. While "OO path
objects" have been a common desire for a long time, it is a much less
well-paved avenue than e.g. TransformDict :)

Regards

Antoine.


___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 428: Pathlib

2013-09-16 Thread Guido van Rossum
I hope there is a volunteer for delegate.

--Guido van Rossum (sent from Android phone)
On Sep 16, 2013 1:17 AM, "Antoine Pitrou"  wrote:

> On Sun, 15 Sep 2013 06:46:08 -0700,
> Ethan Furman wrote:
> > I see PEP 428 is both targeted at 3.4 and still in draft status.
> >
> > What remains to be done to ask for pronouncement?
>
> I think I have a couple of items left to integrate in the PEP.
> Mostly it needs me to take a bit of time and finalize the PEP, and
> then have a PEP delegate (or Guido) pronounce on it.
>
> That's unless someone else wants to add it, of course. I don't mind
> suggestions, as long as they don't detract from the short-/middle-term
> goal of getting the PEP approved :)
>
> Note that the pathlib API has to be provisional. While "OO path
> objects" have been a common desire for a long time, it is a much less
> well-paved avenue than e.g. TransformDict :)
>
> Regards
>
> Antoine.
>
>


Re: [Python-Dev] PEP 450 adding statistics module

2013-09-16 Thread Oscar Benjamin
On 16 September 2013 16:42, Guido van Rossum  wrote:
> I'm ready to accept this PEP. Because I haven't read this entire thread (and
> 60 messages about random diversions is really too much to try and catch up
> on) I'll give people 24 hours to remind me of outstanding rejections.
>
> I also haven't reviewed the code in any detail, but I believe the code
> review is going well, so I'm not concerned that the PEP would have to
> be revised based on that alone.

I think Steven has addressed all of the issues raised. Briefly from memory:

1) There was concern about having an additional sum function. Steven
has pointed out that neither of sum/fsum is accurate for all stdlib
numeric types as is the intention for the statistics module. It is not
possible to modify either of sum/fsum in a backward compatible way
that would make them suitable here.

2) The initial names for the median functions were median.low
median.high etc. This naming scheme was considered non-standard by
some and has been redesigned as median_low, median_high etc. (there
was also discussion about the method used to attach the names to the
median function but this became irrelevant after the rename).

3) The mode function also provided an algorithm for estimating the
mode of a continuous probability distribution from a sample. It was
suggested that there is no uniquely good way of doing this and that it
is not commonly needed. This was removed and the API for mode() was
simplified (it now returns a unique mode or raises an error).

4) Some of the functions (e.g. variance) used different algorithms
(and produced different results) when given an iterator instead of a
collection. These are now changed to always use the same algorithm and
build a collection internally if necessary.

5) It was suggested that it should also be possible to compute the
mean of e.g. timedelta objects but it was pointed out that they can be
converted to numbers with the timedelta.total_seconds() method.

6) I raised an issue about the way the sum function behaved for
decimals but this was changed in a subsequent patch presenting a new
sum function that isn't susceptible to accumulated rounding errors
with Decimals.
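
Several of these points (the renamed median functions, the simplified mode(), the timedelta conversion and exact handling of Fraction/Decimal) can be sketched in a few lines. This is only an illustration, assuming the statistics module as it later shipped in Python 3.4; the sample data is made up, and note that later Python versions relaxed the "unique mode or error" behaviour of mode().

```python
import statistics
from datetime import timedelta
from decimal import Decimal
from fractions import Fraction

samples = [1, 3, 5, 7]

# Point 2: the median variants are plain module-level functions.
low = statistics.median_low(samples)    # lower of the two middle values
high = statistics.median_high(samples)  # higher of the two middle values

# Point 3: mode() of discrete data with a single most common value.
most_common = statistics.mode([1, 2, 2, 3])

# Point 5: convert timedelta objects to numbers before averaging.
durations = [timedelta(seconds=30), timedelta(minutes=1), timedelta(seconds=90)]
mean_seconds = statistics.mean(d.total_seconds() for d in durations)

# Points 1 and 6: results stay exact for Fraction and Decimal inputs.
mean_fraction = statistics.mean([Fraction(1, 3), Fraction(2, 3)])
mean_decimal = statistics.mean([Decimal("0.1")] * 3)

print(low, high, most_common, mean_seconds, mean_fraction, mean_decimal)
```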


Oscar


Re: [Python-Dev] PEP 428: Pathlib

2013-09-16 Thread Charles-François Natali
2013/9/16 Antoine Pitrou :
> On Sun, 15 Sep 2013 06:46:08 -0700,
> Ethan Furman wrote:
>> I see PEP 428 is both targeted at 3.4 and still in draft status.
>>
>> What remains to be done to ask for pronouncement?
>
> I think I have a couple of items left to integrate in the PEP.
> Mostly it needs me to take a bit of time and finalize the PEP, and
> then have a PEP delegate (or Guido) pronounce on it.

IIRC, during the last discussion round, we were still debating between
implicit stat() result caching - which requires an explicit restat()
method - vs a mapping between the stat() method and a stat() syscall.

What was the conclusion?


Re: [Python-Dev] PEP 450 adding statistics module

2013-09-16 Thread Guido van Rossum
I'm ready to accept this PEP. Because I haven't read this entire thread
(and 60 messages about random diversions is really too much to try and
catch up on) I'll give people 24 hours to remind me of outstanding
rejections.

I also haven't reviewed the code in any detail, but I believe the code
review is going well, so I'm not concerned that the PEP would have to
be revised based on that alone.


On Fri, Sep 13, 2013 at 5:59 PM, Steven D'Aprano wrote:

> On Sun, Sep 08, 2013 at 10:51:57AM -0700, Guido van Rossum wrote:
> > Never mind, I found the patch and the issue. I really think that the
> > *PEP* is ready for inclusion after the open issues are changed into
> > something like Discussion or Future Work, and after adding a more
> > prominent link to the issue with the patch. Then the *patch* can be
> > reviewed some more until it is ready -- it looks very close already.
>
> I've updated the PEP as requested. Is there anything further that needs
> to be done to have it approved?
>
> http://www.python.org/dev/peps/pep-0450/
>
>
>
> --
> Steven



-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] TransformDict (PEP 455) Naming

2013-09-16 Thread Glenn Linderman

On 9/15/2013 11:28 PM, anatoly techtonik wrote:

> Does anybody know if http://vote.python.org is already operational?
>
> I decided to start a separate thread for TransformDict name, because I
> want to change it.
> Current implementation of PEP 455 only touches dictionary keys and it
> is more narrow than the name suggests. I'd reserve TransformDict name
> for something that is used to transform some other data. For my data
> transformation theory I have an idea of mapping with annotated fields
> that is used to change the names of some source data structure to
> target data structure, converting types and applying custom rules on
> the way. This is a different, but more intuitive application of such
> name.


The multitude of data transformations that are possible is certainly
broader than the scope of TransformDict. However, such transformations
have little to do with the operation of a dict ... the key 
characteristic of a dict is accessing data by key value, and the idea of 
transformation for a dict is easily understood to be a transformation of 
that access pattern, rather than a rich transformation of the data.


Rich data transformations may be useful, and if it is possible to abstract a
large number of useful data transformations into an API that would
become popular, it would seem that such transformations would want to be 
applied not only to dict, but also to list and other data structures. It 
would be more of an object-to-object mapping, independent of the 
container that might hold the object during part of its lifetime.


Hence, it seems unlikely to me that "dict" would be part of the name or 
requirements for such rich data transformations, leaving TransformDict 
available to be used for exactly what PEP 455 proposes.


Re: [Python-Dev] PEP 428: Pathlib -> stat caching

2013-09-16 Thread Antoine Pitrou
On Mon, 16 Sep 2013 19:06:37 +0200
Charles-François Natali  wrote:
> 2013/9/16 Antoine Pitrou :
> > On Sun, 15 Sep 2013 06:46:08 -0700,
> > Ethan Furman wrote:
> >> I see PEP 428 is both targeted at 3.4 and still in draft status.
> >>
> >> What remains to be done to ask for pronouncement?
> >
> > I think I have a couple of items left to integrate in the PEP.
> > Mostly it needs me to take a bit of time and finalize the PEP, and
> > then have a PEP delegate (or Guido) pronounce on it.
> 
> IIRC, during the last discussion round, we were still debating between
> implicit stat() result caching - which requires an explicit restat()
> method - vs a mapping between the stat() method and a stat() syscall.
> 
> What was the conclusion?

No definite conclusion. You and Nick liked the idea of a rich stat
object (returned by os.stat()) with is_dir() methods and the like:
https://mail.python.org/pipermail/python-dev/2013-May/125809.html

However, nothing was done about that since then ;-)

There was also the scandir() proposal to return rich objects with
optional stat-like fields, but similarly it didn't get a conclusion:
https://mail.python.org/pipermail/python-dev/2013-May/126119.html

So I would like to propose the following API change:

- Path.stat() (and stat-accessing methods such as get_mtime()...)
  returns an uncached stat object by default

- Path.cache_stat() can be called to return the stat() *and* cache it
  for future use, such that any future call to stat(), cache_stat() or
  a stat-accessing function reuses that cached stat

In other words, only if you use cache_stat() at least once is the
stat() value cached and reused by the Path object.
(also, it's a per-Path decision)
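
To make the proposed semantics concrete, here is a minimal sketch. The class name CachingPath and its internals are hypothetical stand-ins built on os.stat(); this is not the pathlib implementation, only one way the opt-in behaviour described above could look.

```python
import os
import tempfile


class CachingPath:
    """Hypothetical stand-in for the proposed Path; not the real pathlib API."""

    def __init__(self, path):
        self._path = os.fspath(path)
        self._cached_stat = None  # stays None until cache_stat() is called

    def stat(self):
        # Reuse the cached result only if the caller opted in earlier.
        if self._cached_stat is not None:
            return self._cached_stat
        return os.stat(self._path)

    def cache_stat(self):
        # Opt in: perform (or reuse) one stat() call and remember it.
        if self._cached_stat is None:
            self._cached_stat = os.stat(self._path)
        return self._cached_stat

    def get_mtime(self):
        # Stat-accessing helpers go through stat(), so they follow the cache.
        return self.stat().st_mtime


with tempfile.TemporaryDirectory() as tmp:
    name = os.path.join(tmp, "demo.txt")
    with open(name, "w") as f:
        f.write("abc")

    fresh = CachingPath(name)
    cached = CachingPath(name)
    cached.cache_stat()  # per-Path decision: only this object caches

    with open(name, "a") as f:
        f.write("defgh")

    fresh_size = fresh.stat().st_size    # a new stat() call sees the append
    cached_size = cached.stat().st_size  # still the remembered value

print(fresh_size, cached_size)
```

The trade-off is visible in the last two lines: the object that never called cache_stat() always reflects the file system, while the other keeps answering from its first result.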

Regards

Antoine.


Re: [Python-Dev] Compiler for the Mac OS X version of Python 3.4

2013-09-16 Thread Bill Janssen
Russell E. Owen  wrote:

> In article ,
>  Raymond Hettinger  wrote:
> 
> > On Sep 14, 2013, at 1:32 PM, Ned Deily  wrote:
> > >  The 
> > > most recent Developer Tools for 10.8 and 10.7 systems, Xcode 4.6.x, have 
> > > a mature clang but do not provide a 10.6 SDK.  Even with using an SDK, 
> > > it's still possible to end up inadvertently linking with the wrong 
> > > versions of system libraries.  We have been burned by that in the past.
> > 
> > I think we should offer a separate Mac build just for 10.6
> > (much like we do for the 32-bit PPC option for 10.5).
> 
> If Apple drops support for gcc in 10.9 I guess we have to go this route, 

Could go the Sage route -- Sage first checks for an up-to-date version
of gcc, and downloads it and builds it for its own use if necessary.

Bill

> but please be careful. Every time you add a new version of python for 
> MacOS X it means that folks providing binary installers (e.g. for numpy) 
> have to provide another binary, and folks using those installers have 
> another chance of picking the wrong one.
> 
> If you do make a 10.6-only installer, what is the minimum version of 
> MacOS X the modern compiler would support? 10.7 gives a more measured 
> upgrade path, but 10.8 gives a better compiler.
> 
> -- Russell
> 


Re: [Python-Dev] PEP 428: Pathlib -> stat caching

2013-09-16 Thread Antoine Pitrou
On Mon, 16 Sep 2013 15:48:54 -0400
Brett Cannon  wrote:
> >
> > So I would like to propose the following API change:
> >
> > - Path.stat() (and stat-accessing methods such as get_mtime()...)
> >   returns an uncached stat object by default
> >
> > - Path.cache_stat() can be called to return the stat() *and* cache it
> >   for future use, such that any future call to stat(), cache_stat() or
> >   a stat-accessing function reuses that cached stat
> >
> > In other words, only if you use cache_stat() at least once is the
> > stat() value cached and reused by the Path object.
> > (also, it's a per-Path decision)
> >
> 
> Any reason why stat() can't get a keyword-only cached=True argument
> instead? Or have stat() never cache() but stat_cache() always so that
> people can choose if they want fresh or cached based on API and not whether
> some library happened to make a decision for them?

1. Because you also want the helper functions (get_mtime(), etc.) to
cache the value too. It's not only about stat().

2. Because of the reverse use case where you want a library to reuse a
cached value despite the library not using an explicit caching call.

Basically, the rationale is:

1. Caching should be opt-in, which is what this new API achieves.

2. Once you have asked for caching, you almost always also want the
subsequent accesses to be cached.

I realize there should be a third method clear_cache(), though ;-)

Regards

Antoine.


Re: [Python-Dev] Compiler for the Mac OS X version of Python 3.4

2013-09-16 Thread Russell E. Owen
In article ,
 Raymond Hettinger  wrote:

> On Sep 14, 2013, at 1:32 PM, Ned Deily  wrote:
> >  The 
> > most recent Developer Tools for 10.8 and 10.7 systems, Xcode 4.6.x, have 
> > a mature clang but do not provide a 10.6 SDK.  Even with using an SDK, 
> > it's still possible to end up inadvertently linking with the wrong 
> > versions of system libraries.  We have been burned by that in the past.
> 
> I think we should offer a separate Mac build just for 10.6
> (much like we do for the 32-bit PPC option for 10.5).

If Apple drops support for gcc in 10.9 I guess we have to go this route, 
but please be careful. Every time you add a new version of python for 
MacOS X it means that folks providing binary installers (e.g. for numpy) 
have to provide another binary, and folks using those installers have 
another chance of picking the wrong one.

If you do make a 10.6-only installer, what is the minimum version of 
MacOS X the modern compiler would support? 10.7 gives a more measured 
upgrade path, but 10.8 gives a better compiler.

-- Russell



Re: [Python-Dev] PEP 428: Pathlib -> stat caching

2013-09-16 Thread Brett Cannon
On Mon, Sep 16, 2013 at 3:45 PM, Antoine Pitrou  wrote:

> On Mon, 16 Sep 2013 19:06:37 +0200
> Charles-François Natali  wrote:
> > 2013/9/16 Antoine Pitrou :
> > > On Sun, 15 Sep 2013 06:46:08 -0700,
> > > Ethan Furman wrote:
> > >> I see PEP 428 is both targeted at 3.4 and still in draft status.
> > >>
> > >> What remains to be done to ask for pronouncement?
> > >
> > > I think I have a couple of items left to integrate in the PEP.
> > > Mostly it needs me to take a bit of time and finalize the PEP, and
> > > then have a PEP delegate (or Guido) pronounce on it.
> >
> > IIRC, during the last discussion round, we were still debating between
> > implicit stat() result caching - which requires an explicit restat()
> > method - vs a mapping between the stat() method and a stat() syscall.
> >
> > What was the conclusion?
>
> No definite conclusion. You and Nick liked the idea of a rich stat
> object (returned by os.stat()) with is_dir() methods and the like:
> https://mail.python.org/pipermail/python-dev/2013-May/125809.html
>
> However, nothing was done about that since then ;-)
>
> There was also the scandir() proposal to return rich objects with
> optional stat-like fields, but similarly it didn't get a conclusion:
> https://mail.python.org/pipermail/python-dev/2013-May/126119.html
>
> So I would like to propose the following API change:
>
> - Path.stat() (and stat-accessing methods such as get_mtime()...)
>   returns an uncached stat object by default
>
> - Path.cache_stat() can be called to return the stat() *and* cache it
>   for future use, such that any future call to stat(), cache_stat() or
>   a stat-accessing function reuses that cached stat
>
> In other words, only if you use cache_stat() at least once is the
> stat() value cached and reused by the Path object.
> (also, it's a per-Path decision)
>

Any reason why stat() can't get a keyword-only cached=True argument
instead? Or have stat() never cache() but stat_cache() always so that
people can choose if they want fresh or cached based on API and not whether
some library happened to make a decision for them?


Re: [Python-Dev] PEP 428: Pathlib -> stat caching

2013-09-16 Thread Victor Stinner
2013/9/16 Brett Cannon :
> Any reason why stat() can't get a keyword-only cached=True argument instead?
> Or have stat() never cache() but stat_cache() always so that people can
> choose if they want fresh or cached based on API and not whether some
> library happened to make a decision for them?

I also prefer a single function, but only if the default is
cached=False. Caching by default can be surprising and unexpected.

Victor


Re: [Python-Dev] PEP 428: Pathlib -> stat caching

2013-09-16 Thread R. David Murray
On Mon, 16 Sep 2013 15:48:54 -0400, Brett Cannon  wrote:
> On Mon, Sep 16, 2013 at 3:45 PM, Antoine Pitrou  wrote:
> > So I would like to propose the following API change:
> >
> > - Path.stat() (and stat-accessing methods such as get_mtime()...)
> >   returns an uncached stat object by default
> >
> > - Path.cache_stat() can be called to return the stat() *and* cache it
> >   for future use, such that any future call to stat(), cache_stat() or
> >   a stat-accessing function reuses that cached stat
> >
> > In other words, only if you use cache_stat() at least once is the
> > stat() value cached and reused by the Path object.
> > (also, it's a per-Path decision)
> >
> 
> Any reason why stat() can't get a keyword-only cached=True argument
> instead? Or have stat() never cache() but stat_cache() always so that
> people can choose if they want fresh or cached based on API and not whether
> some library happened to make a decision for them?

Well, we tend to avoid single boolean arguments in favor of differently
named functions.

But here is an alternate API:  expose the state by having a 'cache_stat'
attribute of the Path that is 'False' by default but can be set 'True'.
It could also (or only?) be set via an optional constructor argument.
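
A rough sketch of that alternative, assuming a hypothetical FlaggedPath class wrapping os.stat() (the class and its private attribute are invented for illustration; only the 'cache_stat' attribute name comes from the suggestion above):

```python
import os


class FlaggedPath:
    """Hypothetical path object whose caching is controlled by an attribute."""

    def __init__(self, path, cache_stat=False):
        self._path = os.fspath(path)
        self.cache_stat = cache_stat  # public flag, False by default
        self._stat = None

    def stat(self):
        if not self.cache_stat:
            self._stat = None  # forget any stale value while caching is off
            return os.stat(self._path)
        if self._stat is None:
            self._stat = os.stat(self._path)
        return self._stat


p = FlaggedPath(os.curdir)
uncached_a = p.stat()
uncached_b = p.stat()  # a second, independent os.stat() call

p.cache_stat = True    # flip the state on an existing object
cached_a = p.stat()
cached_b = p.stat()    # same object as cached_a, no new system call

print(uncached_a is uncached_b, cached_a is cached_b)
```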

--David


Re: [Python-Dev] Compiler for the Mac OS X version of Python 3.4

2013-09-16 Thread Ryan Gonzalez
Meh...I hate it when tools download stuff without me noticing.

Honestly, a separate 10.6 build would work well. Plus, if a new Clang
version includes some awesome feature that could make Python builds
better, you'd be able to take advantage of it more easily.


On Mon, Sep 16, 2013 at 2:56 PM, Bill Janssen  wrote:

> Russell E. Owen  wrote:
>
> > In article ,
> >  Raymond Hettinger  wrote:
> >
> > > On Sep 14, 2013, at 1:32 PM, Ned Deily  wrote:
> > > >  The
> > > > most recent Developer Tools for 10.8 and 10.7 systems, Xcode 4.6.x,
> have
> > > > a mature clang but do not provide a 10.6 SDK.  Even with using an
> SDK,
> > > > it's still possible to end up inadvertently linking with the wrong
> > > > versions of system libraries.  We have been burned by that in the
> past.
> > >
> > > I think we should offer a separate Mac build just for 10.6
> > > (much like we do for the 32-bit PPC option for 10.5).
> >
> > If Apple drops support for gcc in 10.9 I guess we have to go this route,
>
> Could go the Sage route -- Sage first checks for an up-to-date version
> of gcc, and downloads it and builds it for its own use if necessary.
>
> Bill
>
> > but please be careful. Every time you add a new version of python for
> > MacOS X it means that folks providing binary installers (e.g. for numpy)
> > have to provide another binary, and folks using those installers have
> > another chance of picking the wrong one.
> >
> > If you do make a 10.6-only installer, what is the minimum version of
> > MacOS X the modern compiler would support? 10.7 gives a more measured
> > upgrade path, but 10.8 gives a better compiler.
> >
> > -- Russell
> >



-- 
Ryan


Re: [Python-Dev] PEP 428: Pathlib -> stat caching

2013-09-16 Thread Terry Reedy

On 9/16/2013 4:14 PM, R. David Murray wrote:


> Well, we tend to avoid single boolean arguments in favor of differently
> named functions.


The stdlib has lots of boolean arguments. My impression is that they are 
to be avoided when they would change the return type or otherwise do 
something disjointly different. I do not think this would apply here.


--
Terry Jan Reedy



Re: [Python-Dev] PEP 428: Pathlib -> stat caching

2013-09-16 Thread Antoine Pitrou
On Mon, 16 Sep 2013 16:14:43 -0400
"R. David Murray"  wrote:
> On Mon, 16 Sep 2013 15:48:54 -0400, Brett Cannon  wrote:
> > On Mon, Sep 16, 2013 at 3:45 PM, Antoine Pitrou  wrote:
> > > So I would like to propose the following API change:
> > >
> > > - Path.stat() (and stat-accessing methods such as get_mtime()...)
> > >   returns an uncached stat object by default
> > >
> > > - Path.cache_stat() can be called to return the stat() *and* cache it
> > >   for future use, such that any future call to stat(), cache_stat() or
> > >   a stat-accessing function reuses that cached stat
> > >
> > > In other words, only if you use cache_stat() at least once is the
> > > stat() value cached and reused by the Path object.
> > > (also, it's a per-Path decision)
> > >
> > 
> > Any reason why stat() can't get a keyword-only cached=True argument
> > instead? Or have stat() never cache() but stat_cache() always so that
> > people can choose if they want fresh or cached based on API and not whether
> > some library happened to make a decision for them?
> 
> Well, we tend to avoid single boolean arguments in favor of differently
> named functions.
> 
> But here is an alternate API:  expose the state by having a 'cache_stat'
> attribute of the Path that is 'False' by default but can be set 'True'.

Thanks for the suggestion, that's a possibility too.

> It could also (or only?) be set via an optional constructor argument.

That's impractical if you get the Path object from a library call.

Regards

Antoine.




Re: [Python-Dev] PEP 428: Pathlib -> stat caching

2013-09-16 Thread Nick Coghlan
On 17 Sep 2013 06:45, "Antoine Pitrou"  wrote:
>
> On Mon, 16 Sep 2013 16:14:43 -0400
> "R. David Murray"  wrote:
> > > On Mon, 16 Sep 2013 15:48:54 -0400, Brett Cannon wrote:
> > > > On Mon, Sep 16, 2013 at 3:45 PM, Antoine Pitrou wrote:
> > > > So I would like to propose the following API change:
> > > >
> > > > - Path.stat() (and stat-accessing methods such as get_mtime()...)
> > > >   returns an uncached stat object by default
> > > >
> > > > > - Path.cache_stat() can be called to return the stat() *and* cache it
> > > > >   for future use, such that any future call to stat(), cache_stat() or
> > > > >   a stat-accessing function reuses that cached stat
> > > >
> > > > In other words, only if you use cache_stat() at least once is the
> > > > stat() value cached and reused by the Path object.
> > > > (also, it's a per-Path decision)
> > > >
> > >
> > > Any reason why stat() can't get a keyword-only cached=True argument
> > > instead? Or have stat() never cache() but stat_cache() always so that
> > > > people can choose if they want fresh or cached based on API and not whether
> > > some library happened to make a decision for them?
> >
> > Well, we tend to avoid single boolean arguments in favor of differently
> > named functions.
> >
> > But here is an alternate API:  expose the state by having a 'cache_stat'
> > attribute of the Path that is 'False' by default but can be set 'True'.
>
> Thanks for the suggestion, that's a possibility too.
>
> > It could also (or only?) be set via an optional constructor argument.
>
> That's impractical if you get the Path object from a library call.

Given that this is a behavioural state change, I think asking for a
possibly *new* path with caching enabled in that case would be a good way
to go. If we treat path objects as effectively immutable (aside from the
optional internal stat cache), then checking in __new__ if a passed in path
object already has the appropriate caching status and returning it directly
if so, but otherwise creating a new path object with the cache setting
changed would avoid having libraries potentially alter the behaviour of
applications' path objects and vice-versa.

In effect, the unique "identity" of a path would be a triple representing
the type, the filesystem path and whether or not it cached stat results
internally. If you wanted to change any of those, you would have to create
a new object.
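
A minimal sketch of that idea, assuming a hypothetical StatPath class (the name, the cache_stat keyword and the private attributes are all invented here). The constructor hands back the same object when the requested caching status already matches, and builds a new one otherwise, so nobody's existing path object changes behaviour:

```python
import os


class StatPath:
    """Hypothetical, effectively immutable path with a fixed caching status."""

    def __new__(cls, path, cache_stat=False):
        if type(path) is cls and path._cache_stat == cache_stat:
            return path  # already has the requested caching status
        self = super().__new__(cls)
        self._path = path._path if isinstance(path, StatPath) else os.fspath(path)
        self._cache_stat = cache_stat
        self._stat = None  # optional internal cache, the only mutable part
        return self

    def _identity(self):
        # The "triple": type, file system path, caching status.
        return (type(self), self._path, self._cache_stat)

    def __eq__(self, other):
        if not isinstance(other, StatPath):
            return NotImplemented
        return self._identity() == other._identity()

    def __hash__(self):
        return hash(self._identity())

    def stat(self):
        if not self._cache_stat:
            return os.stat(self._path)
        if self._stat is None:
            self._stat = os.stat(self._path)
        return self._stat


from_library = StatPath(os.curdir)                   # e.g. returned by a library
same = StatPath(from_library)                        # status matches: same object
cached = StatPath(from_library, cache_stat=True)     # status differs: new object

print(same is from_library, cached is from_library, cached == from_library)
```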

Cheers,
Nick.

>
> Regards
>
> Antoine.
>
>


[Python-Dev] PEP 454: add a new tracemalloc module (second round)

2013-09-16 Thread Victor Stinner
Hi,

Thanks to the early remarks on the PEP 454, I redesigned and enhanced
the API of the new tracemalloc module. Changes:

* it is now possible to record more than one frame per memory allocation
* add filters on filename and line number
* new GroupedStats and StatsDiff classes to generate and compare statistics
* cumulative statistics
* display the traceback of a memory block
* almost all the code has unit tests
* better documentation

HTML version of the documentation:
http://www.haypocalc.com/tmp/tracemalloc/library/tracemalloc.html

The documentation contains output examples and a short tutorial
("Usage"), as well as the documentation of the command line interface,
none of which is included in the PEP.

HTML version of the PEP:
http://www.python.org/dev/peps/pep-0454/

Issue tracking the implementation:
http://bugs.python.org/issue18874


PEP: 454
Title: Add a new tracemalloc module to trace Python memory allocations
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner 
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 3-September-2013
Python-Version: 3.4


Abstract
========

Add a new ``tracemalloc`` module to trace memory blocks allocated by Python.



Rationale
=========

Common debug tools tracing memory allocations read the C filename and
line number.  Using such a tool to analyze Python memory allocations does
not help because most memory blocks are allocated in the same C function,
in ``PyMem_Malloc()`` for example.

There are debug tools dedicated to the Python language like ``Heapy``
and ``PySizer``. These projects analyze object types and/or contents.
These tools are useful when most memory leaks are instances of the
same type and this type is only instantiated in a few functions. The
problem arises when the object type is very common, like ``str`` or
``tuple``, and it is hard to identify where these objects are
instantiated.

Finding reference cycles is also a difficult problem. There are
different tools to draw a diagram of all references. These tools cannot
be used on large applications with thousands of objects because the
diagram is too large to be analyzed manually.


Proposal
========

With PEP 445, it becomes easy to set up a hook on Python memory
allocators. The hook can inspect the current Python frame to get the
Python filename and line number.

This PEP proposes to add a new ``tracemalloc`` module. It is a debug
tool to trace memory allocations made by Python. The module provides the
following information:

* Differences between two snapshots, to help detect memory leaks
* Statistics on allocated memory blocks per filename and per line number:
  total size, number and average size of allocated memory blocks
* For each allocated memory block: its size and the traceback where the block
  was allocated
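
A rough sketch of the first two items. It uses the ``tracemalloc`` API as it eventually shipped in Python 3.4 (``start()``, ``take_snapshot()``, ``Snapshot.compare_to()``), which is an assumption relative to the ``enable()``-style draft described in this PEP; the allocation workload is invented for illustration.

```python
import tracemalloc

tracemalloc.start(5)  # store up to 5 frames per allocated memory block
frame_limit = tracemalloc.get_traceback_limit()

before = tracemalloc.take_snapshot()
retained = [bytes(1000) for _ in range(1000)]  # allocate about 1 MB and keep it
after = tracemalloc.take_snapshot()
tracemalloc.stop()

# Differences between the two snapshots, grouped by filename and line number.
growth = after.compare_to(before, "lineno")
for diff in growth[:3]:
    print(diff)  # each entry reports size and block-count changes for one line
```

Grouping by ``"filename"`` instead of ``"lineno"`` gives the per-file view, and ``"traceback"`` groups by the full stored traceback.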

The API of the tracemalloc module is similar to the API of the
faulthandler module: ``enable()``, ``disable()`` and ``is_enabled()``
functions, an environment variable (``PYTHONFAULTHANDLER`` and
``PYTHONTRACEMALLOC``), a ``-X`` command line option (``-X
faulthandler`` and ``-X tracemalloc``). See the
`documentation of the faulthandler module
`_.

The tracemalloc module has been written for CPython. Other
implementations of Python may not provide it.


API
===

To trace most memory blocks allocated by Python, the module should be
enabled as early as possible by calling the ``tracemalloc.enable()``
function, by setting the ``PYTHONTRACEMALLOC`` environment variable to
``1``, or by using the ``-X tracemalloc`` command line option.
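
A small check of the two start-up switches, assuming a CPython 3.7+
interpreter where the module has been released. ``is_tracing()`` is the
released counterpart of the ``is_enabled()`` function named in this
draft; the ``probe`` string and variable names are illustrative only.

```python
import os
import subprocess
import sys

# One-liner run in a child interpreter: reports whether tracing is active.
probe = "import tracemalloc; print(tracemalloc.is_tracing())"

# Enable tracing from the command line with -X tracemalloc.
via_option = subprocess.run(
    [sys.executable, "-X", "tracemalloc", "-c", probe],
    capture_output=True, text=True, check=True,
).stdout.strip()

# Enable tracing through the PYTHONTRACEMALLOC environment variable.
env = dict(os.environ, PYTHONTRACEMALLOC="1")
via_env = subprocess.run(
    [sys.executable, "-c", probe],
    capture_output=True, text=True, check=True, env=env,
).stdout.strip()

print(via_option, via_env)
```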

By default, the ``Trace.traceback`` attribute only stores one ``Frame``
instance per allocated memory block. Use ``set_traceback_limit()`` to
store more frames.
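
In the module as released, the frame limit is passed to ``start()``
rather than set with the ``set_traceback_limit()`` function described
in this draft. The sketch below, with a hypothetical ``allocate()``
helper, stores up to five frames per block and reads them back.

```python
import tracemalloc

tracemalloc.start(5)  # keep up to 5 frames per traced memory block


def allocate():
    # The traceback of this object includes allocate() and its caller.
    return bytearray(4096)


block = allocate()
limit = tracemalloc.get_traceback_limit()
traceback = tracemalloc.get_object_traceback(block)
frames = [(frame.filename, frame.lineno) for frame in traceback]
print(limit, frames)

tracemalloc.stop()
```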


Functions
---------

``add_filter(filter)`` function:

Add a new filter on Python memory allocations; *filter* is a
``Filter`` instance.

All inclusive filters are applied at once: a memory allocation is
ignored only if no inclusive filter matches its trace. A memory
allocation is also ignored if at least one exclusive filter matches
its trace.

The new filter is not applied to already collected traces. Use
``clear_traces()`` to ensure that all traces match the new filter.
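
The combination rule can be sketched in pure Python. ``SketchFilter``
and ``is_traced()`` are hypothetical names standing in for the PEP's
``Filter`` class and the module's internal check; matching by filename
pattern only, and treating an empty set of inclusive filters as "keep
everything", are simplifying assumptions.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class SketchFilter:
    """Hypothetical stand-in for the PEP's Filter class."""
    include: bool
    pattern: str  # filename pattern, e.g. "/app/*"

    def matches(self, filename):
        return fnmatchcase(filename, self.pattern)


def is_traced(filename, filters):
    """Return True if an allocation made in *filename* is kept."""
    inclusive = [f for f in filters if f.include]
    exclusive = [f for f in filters if not f.include]
    # Ignored if at least one exclusive filter matches.
    if any(f.matches(filename) for f in exclusive):
        return False
    # With inclusive filters present, at least one of them must match.
    if inclusive and not any(f.matches(filename) for f in inclusive):
        return False
    return True


filters = [
    SketchFilter(True, "/app/*"),
    SketchFilter(True, "/lib/*"),
    SketchFilter(False, "/app/vendor/*"),
]
for name in ("/app/main.py", "/lib/json.py", "/app/vendor/x.py", "/tmp/x.py"):
    print(name, is_traced(name, filters))
```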


``add_include_filter(filename: str, lineno: int=None, traceback:
bool=False)`` function:

Add an inclusive filter: helper for ``add_filter()`` creating a
``Filter`` instance with ``include`` attribute set to ``True``.

Example: ``tracemalloc.add_include_filter(tracemalloc.__file__)``
only includes memory blocks allocated by the ``tracemalloc`` module.


``add_exclude_filter(filename: str, lineno: int=None, traceback:
bool=False)`` function:

Add an exclusive filter: helper for ``add_filter()`` creating a
``Filter`` instance with ``include`` attribute set to ``False``.

Example: ``tracemalloc.add_exclude_filter(tracemalloc.__file__)``
ignores memory blocks allocated by the ``tracemalloc`` module.
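
In the module as released in CPython 3.4, filters are applied to a
snapshot with ``Snapshot.filter_traces()`` instead of the module-level
``add_include_filter()`` and ``add_exclude_filter()`` helpers described
in this draft. A minimal sketch of the same two examples under that
API:

```python
import tracemalloc

tracemalloc.start()
data = [str(i) * 10 for i in range(1000)]  # some allocations to trace
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# Exclusive filter: drop blocks allocated by the tracemalloc module itself.
without_tracemalloc = snapshot.filter_traces(
    [tracemalloc.Filter(False, tracemalloc.__file__)]
)

# Inclusive filter: keep only blocks allocated by the tracemalloc module.
only_tracemalloc = snapshot.filter_traces(
    [tracemalloc.Filter(True, tracemalloc.__file__)]
)

print(len(snapshot.traces),
      len(without_tracemalloc.traces),
      len(only_tracemalloc.traces))
```

Because both filters use the same filename pattern, the two filtered
snapshots split the original traces between them.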


``clear_filters()`` fu

Re: [Python-Dev] PEP 450 adding statistics module

2013-09-16 Thread Steven D'Aprano
On Mon, Sep 16, 2013 at 08:42:12AM -0700, Guido van Rossum wrote:
> I'm ready to accept this PEP. Because I haven't read this entire thread
> (and 60 messages about random diversions is really too much to try and
> catch up on) I'll give people 24 hours to remind me of outstanding
> rejections.
> 
> I also haven't reviewed the code in any detail, but I believe the code
> review is going well, so I'm not concerned that the PEP would have to
> be revised based on that alone.

There are a couple of outstanding issues that I am aware of, but I don't 
believe that either of these affects acceptance/rejection of the PEP. 
Please correct me if I am wrong.

1) Implementation details of the statistics.sum function. Oscar is 
giving me a lot of very valuable assistance speeding up the 
implementation of sum.

2) The current implementation has extensive docstrings, but will also 
need a separate statistics.rst file.


I don't recall any other outstanding issues. If I have forgotten any, 
please remind me.




> On Fri, Sep 13, 2013 at 5:59 PM, Steven D'Aprano wrote:
> 
> > On Sun, Sep 08, 2013 at 10:51:57AM -0700, Guido van Rossum wrote:
> > > Never mind, I found the patch and the issue. I really think that the
> > > *PEP* is ready for inclusion after the open issues are changed into
> > > something like Discussion or Future Work, and after adding a more
> > > prominent link to the issue with the patch. Then the *patch* can be
> > > reviewed some more until it is ready -- it looks very close already.
> >
> > I've updated the PEP as requested. Is there anything further that needs
> > to be done to have it approved?
> >
> > http://www.python.org/dev/peps/pep-0450/
> >
> >
> >
> > --
> > Steven
> >
> 
> 
> 
> -- 
> --Guido van Rossum (python.org/~guido)



-- 
Steven


Re: [Python-Dev] PEP 450 adding statistics module

2013-09-16 Thread Guido van Rossum
On Mon, Sep 16, 2013 at 4:59 PM, Steven D'Aprano wrote:

> On Mon, Sep 16, 2013 at 08:42:12AM -0700, Guido van Rossum wrote:
> > I'm ready to accept this PEP. Because I haven't read this entire thread
> > (and 60 messages about random diversions is really too much to try and
> > catch up on) I'll give people 24 hours to remind me of outstanding
> > rejections.
> >
> > I also haven't reviewed the code in any detail, but I believe the code
> > review is going well, so I'm not concerned that the PEP would have to
> > be revised based on that alone.
>
> There are a couple of outstanding issues that I am aware of, but I don't
> believe that either of these affects acceptance/rejection of the PEP.
> Please correct me if I am wrong.
>
> 1) Implementation details of the statistics.sum function. Oscar is
> giving me a lot of very valuable assistance speeding up the
> implementation of sum.
>
> 2) The current implementation has extensive docstrings, but will also
> need a separate statistics.rst file.
>
>
> I don't recall any other outstanding issues. If I have forgotten any,
> please remind me.
>

Those certainly don't stand in the way of the PEP's acceptance (but they do
block the commit of the code :-).

The issues that Oscar listed also all seem resolved (though they would make
a nice addition to the "Discussion" section in the PEP).

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 428: Pathlib -> stat caching

2013-09-16 Thread Stephen J. Turnbull
Terry Reedy writes:
 > On 9/16/2013 4:14 PM, R. David Murray wrote:
 > 
 > > Well, we tend to avoid single boolean arguments in favor of differently
 > > named functions.
 > 
 > The stdlib has lots of boolean arguments. My impression is that they are 
 > to be avoided when they would change the return type or otherwise do 
 > something disjointly different. I do not think this would apply here.

I remember reading that the criterion is whether the argument is most
often given a literal value.  Then "stat_cache()" is preferable to
"stat(cache=True)".  OTOH, "stat(cache=want_cache)" is better than

    if want_cache:
        result = stat_cache()
    else:
        result = stat()

or "result = stat_cache() if want_cache else stat()".
