Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-08-02 Thread Raymond Hettinger

On Aug 1, 2012, at 1:46 AM, Mark Shannon wrote:

 
 '''
 Being able to pre-allocate lists based on the expected size, as estimated by 
 __length_hint__,
 can be a significant optimization.
 PyPy has been observed to run some code slower than CPython, purely because 
 this optimization is absent.
 '''
 
 Which is a PyPy bug report, not a rationale for a PEP ;)

Alex's rationale is correct and well expressed.

Your proposed revision reflects fuzzy thinking about why __length_hint__ is 
useful.
Regardless of resizing growth factors, it is *always* helpful to know how much
memory to allocate.  Calls to the allocators (especially for large blocks)
and possible the recopying of data should be avoided when possible.


Raymond___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-08-01 Thread Mark Shannon

While the idea behind PEP 424 is sound, the text of the PEP is rather vague and 
missing a lot of details.
There was extended discussion on the details, but none of that has appeared in 
the PEP yet.

So Alex, how about adding those details?

Also the rationale is rather poor.
Given that CPython is the reference implementation, PyPy should be compared to 
CPython, not vice-versa.
Reversing PyPy and CPython in the rationale gives:

'''
Being able to pre-allocate lists based on the expected size, as estimated by 
__length_hint__,
can be a significant optimization.
PyPy has been observed to run some code slower than CPython, purely because 
this optimization is absent.
'''

Which is a PyPy bug report, not a rationale for a PEP ;)

Perhaps a better rationale would something along the lines of:

'''
Adding a __length_hint__ method to the iterator protocol allows sequences, 
notably lists,
to be initialised from iterators with only a single resize operation.
This allows sequences to be intialised quickly, yet have a small growth factor, 
reducing memory use.
'''

Cheers,
Mark.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-08-01 Thread Maciej Fijalkowski
On Wed, Aug 1, 2012 at 10:46 AM, Mark Shannon m...@hotpy.org wrote:
 While the idea behind PEP 424 is sound, the text of the PEP is rather vague
 and missing a lot of details.
 There was extended discussion on the details, but none of that has appeared
 in the PEP yet.

 So Alex, how about adding those details?

 Also the rationale is rather poor.
 Given that CPython is the reference implementation, PyPy should be compared
 to CPython, not vice-versa.
 Reversing PyPy and CPython in the rationale gives:

 '''
 Being able to pre-allocate lists based on the expected size, as estimated by
 __length_hint__,

 can be a significant optimization.
 PyPy has been observed to run some code slower than CPython, purely because
 this optimization is absent.
 '''

 Which is a PyPy bug report, not a rationale for a PEP ;)

 Perhaps a better rationale would something along the lines of:

 '''
 Adding a __length_hint__ method to the iterator protocol allows sequences,
 notably lists,
 to be initialised from iterators with only a single resize operation.
 This allows sequences to be intialised quickly, yet have a small growth
 factor, reducing memory use.
 '''


Hi Mark.

It's not about saving memory. It really is about speed. Noone bothered
measuring cpython with length hint disabled to compare, however we did
that for pypy hence the rationale contains it. It's merely to state
this seems like an important optimization. Since the C-level code
involved is rather similar (it's mostly runtime anyway), it seems
reasonable to draw a conclusion that removing length hint from cpython
would cause slowdown.

Cheers,
fijal
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-08-01 Thread Mark Shannon

Maciej Fijalkowski wrote:

On Wed, Aug 1, 2012 at 10:46 AM, Mark Shannon m...@hotpy.org wrote:

While the idea behind PEP 424 is sound, the text of the PEP is rather vague
and missing a lot of details.
There was extended discussion on the details, but none of that has appeared
in the PEP yet.

So Alex, how about adding those details?

Also the rationale is rather poor.
Given that CPython is the reference implementation, PyPy should be compared
to CPython, not vice-versa.
Reversing PyPy and CPython in the rationale gives:

'''
Being able to pre-allocate lists based on the expected size, as estimated by
__length_hint__,

can be a significant optimization.
PyPy has been observed to run some code slower than CPython, purely because
this optimization is absent.
'''

Which is a PyPy bug report, not a rationale for a PEP ;)

Perhaps a better rationale would something along the lines of:

'''
Adding a __length_hint__ method to the iterator protocol allows sequences,
notably lists,
to be initialised from iterators with only a single resize operation.
This allows sequences to be intialised quickly, yet have a small growth
factor, reducing memory use.
'''



Hi Mark.

It's not about saving memory. It really is about speed. Noone bothered
measuring cpython with length hint disabled to compare, however we did
that for pypy hence the rationale contains it. It's merely to state
this seems like an important optimization. Since the C-level code
involved is rather similar (it's mostly runtime anyway), it seems
reasonable to draw a conclusion that removing length hint from cpython
would cause slowdown.


It is not about making it faster *or* saving memory, but *both*.
Without __length_hint__ there is a trade off between speed and memory use.
You can have speed at the cost of memory by increasing the resize factor.

With __length_hint__ you can get both speed and good memory use.

Cheers,
Mark

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-08-01 Thread Maciej Fijalkowski
On Wed, Aug 1, 2012 at 12:06 PM, Mark Shannon m...@hotpy.org wrote:
 Maciej Fijalkowski wrote:

 On Wed, Aug 1, 2012 at 10:46 AM, Mark Shannon m...@hotpy.org wrote:

 While the idea behind PEP 424 is sound, the text of the PEP is rather
 vague
 and missing a lot of details.
 There was extended discussion on the details, but none of that has
 appeared
 in the PEP yet.

 So Alex, how about adding those details?

 Also the rationale is rather poor.
 Given that CPython is the reference implementation, PyPy should be
 compared
 to CPython, not vice-versa.
 Reversing PyPy and CPython in the rationale gives:

 '''
 Being able to pre-allocate lists based on the expected size, as estimated
 by
 __length_hint__,

 can be a significant optimization.
 PyPy has been observed to run some code slower than CPython, purely
 because
 this optimization is absent.
 '''

 Which is a PyPy bug report, not a rationale for a PEP ;)

 Perhaps a better rationale would something along the lines of:

 '''
 Adding a __length_hint__ method to the iterator protocol allows
 sequences,
 notably lists,
 to be initialised from iterators with only a single resize operation.
 This allows sequences to be intialised quickly, yet have a small growth
 factor, reducing memory use.
 '''


 Hi Mark.

 It's not about saving memory. It really is about speed. Noone bothered
 measuring cpython with length hint disabled to compare, however we did
 that for pypy hence the rationale contains it. It's merely to state
 this seems like an important optimization. Since the C-level code
 involved is rather similar (it's mostly runtime anyway), it seems
 reasonable to draw a conclusion that removing length hint from cpython
 would cause slowdown.


 It is not about making it faster *or* saving memory, but *both*.
 Without __length_hint__ there is a trade off between speed and memory use.
 You can have speed at the cost of memory by increasing the resize factor.

No, you cannot.

if you allocate a huge region, you're not gonna make much of speed,
because at the end you need to copy stuff anyway. Besides large
allocations are slow. With length hint that is correct (sometimes you
can do that) you have a zero-copy scenario
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-17 Thread Mark Dickinson
On Sun, Jul 15, 2012 at 1:36 PM, Antoine Pitrou solip...@pitrou.net wrote:
 On Sun, 15 Jul 2012 18:47:38 +1000
 Nick Coghlan ncogh...@gmail.com wrote:
 I'm not seeing the value in returning None over 0 for the don't know
 case - it just makes the API harder to use.

 The point is that 0 is a legitimate value for a length hint. Simple
 implementations of __length_hint__ will start returning 0 as a
 legitimate value and you will wrongly interpret that as don't know,
 which kinds of defeat the purpose of __length-hint__ ;)

I agree with this: giving special meaning to what's already a valid
length value seems wrong.

Mark
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-17 Thread Antoine Pitrou
On Tue, 17 Jul 2012 13:19:55 +1000
Nick Coghlan ncogh...@gmail.com wrote:
 
  There are no provisions for infinite iterators, that is not within the 
  scope of
  this proposal.
 
 I'll repeat my observation that remaining silent on this point is
 effectively identical to blessing the practice of raising an exception
 in __length_hint__ to force fast failure of attempts to convert an
 infinite iterator to a concrete container.

And I'll repeat that it is false ;) Being silent is certainly not the
same thing as blessing a non-existent practice.

Regards

Antoine.


-- 
Software development and contracting: http://pro.pitrou.net


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-17 Thread Lennart Regebro
On Sun, Jul 15, 2012 at 1:28 AM, Alexandre Zani
alexandre.z...@gmail.com wrote:
 I'm +1 on not having a public API for this. Ultimately the contract
 for a length hint will depend heavily upon what you need it for. Some
 applications would require a length hint to be an at least others an
 at most and others something else entirely. Given that the contract
 here appears to be =0, I don't think the length hint is particularly
 useful to the public at large.

Other possible related uses could be to get an approximate number of
results for a query without having to actually go through the whole
query, useful for databases and search engines. But then you *do* want
__len__ as well, so that also doesn't fit with the current PEP.

But maybe that's a completely different usecase, even though it seems
related to me?

//Lennart
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-16 Thread Stefan Behnel
Mark Shannon, 15.07.2012 16:14:
 Alex Gaynor wrote:
 CPython currently defines an ``__length_hint__`` method on several types,
 such
 as various iterators. This method is then used by various other functions
 (such as ``map``) to presize lists based on the estimated returned by
 
 Don't use map as an example.
 map returns an iterator so it doesn't need __length_hint__

Right. It's a good example for something else, though. As I mentioned
before, iterators should be able to propagate the length hint of an
underlying iterator, e.g. in generator expressions or map(). I consider
that an important feature that the protocol must support.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-16 Thread Stefan Behnel
Antoine Pitrou, 15.07.2012 17:06:
 On Sun, 15 Jul 2012 16:33:23 +0200
 Christian Heimes wrote:
 Am 15.07.2012 16:22, schrieb Antoine Pitrou:
 On Mon, 16 Jul 2012 00:08:41 +1000
 Nick Coghlan wrote:
 Right, I agree on the value in being able to return something to say this
 cannot be converted to a concrete container.

 Who would be able to return that, apart from trivial cases like
 itertools.cycle()?

 For example most numerical sequence iterators like Fibonacci generator,
 prime number sequence generator and even trivial cases like even natural
 number generator.
 
 First, you can't implement __length_hint__ for a generator, which is the
 preferred (the most practical) way of writing iterators in pure Python.

It can be implemented for generator expressions without a conditional,
though, including the case of comprehensions. I wanted to do this in Cython
for a while, but the protocol wasn't very well adapted to that use case.
The don't know case was just too common and inefficient.

For the other points, I agree with the already presented counterarguments.
Being able to prevent some obvious traps is a good thing, even if you can't
prevent all of them.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-16 Thread Stefan Behnel
Alex Gaynor, 15.07.2012 00:11:
 CPython currently defines an ``__length_hint__`` method on several types, such
 as various iterators. This method is then used by various other functions 
 (such 
 as ``map``) to presize lists based on the estimated returned by
 ``__length_hint__``. Types can then define ``__length_hint__`` which are not
 sized, and thus should not define ``__len__``, but can estimate or compute a
 size (such as many iterators).
 
 Proposal
 
 
 This PEP proposes formally documenting ``__length_hint__`` for other
 interpreter and non-standard library Python to implement.
 
 ``__length_hint__`` must return an integer, and is not required to be 
 accurate.
 It may return a value that is either larger or smaller than the actual size of
 the container. It may raise a ``TypeError`` if a specific instance cannot have
 its length estimated. It may not return a negative value.

I'd like to more visibly repeat my suggestion to make this a slot method
tp_length_hint() in extension types that returns a Py_ssize_t.

That suggests that a negative return value would have a special meaning
instead of relying on return values like NotImplemented. The Python wrapper
of that slot method could still implement a mapping for this.

Return values could be -1 for don't know and -2 for infinite at the C
level, and NotImplemented for don't know at the Python level. Not sure
about a good Python value for infinite.

Maybe return -1 for infinite at both levels and -2/NotImplemented for
don't know in C/Python? That would suggest -3 to propagate exceptions at
the C level.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-16 Thread Nick Coghlan
Proposing anything substantially more complicated than what is currently
implemented in CPython will just get the idea rejected. The substantial
immediate gain for PyPy is in skipping the memory resizing when building
containers from itertools iterators, not anywhere else.

--
Sent from my phone, thus the relative brevity :)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-16 Thread Stefan Behnel
Nick Coghlan, 16.07.2012 10:26:
 Proposing anything substantially more complicated than what is currently
 implemented in CPython will just get the idea rejected. The substantial
 immediate gain for PyPy is in skipping the memory resizing when building
 containers from itertools iterators, not anywhere else.

The same applies to Cython, where the extension types that implement
generator expressions can benefit from propagating the length hint of the
underlying iterator. A type slot would help in making this more efficient
overall, also for CPython itself.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-16 Thread Raymond Hettinger

On Jul 15, 2012, at 7:22 AM, Antoine Pitrou wrote:

 On Mon, 16 Jul 2012 00:08:41 +1000
 Nick Coghlan ncogh...@gmail.com wrote:
 Right, I agree on the value in being able to return something to say this
 cannot be converted to a concrete container.
 
 Who would be able to return that, apart from trivial cases like
 itertools.cycle()?


FWIW, here are the notes from the docstring in Lib/test/test_iterlen.py:

 Test Iterator Length Transparency

Some functions or methods which accept general iterable arguments have
optional, more efficient code paths if they know how many items to expect.
For instance, map(func, iterable), will pre-allocate the exact amount of
space required whenever the iterable can report its length.

The desired invariant is:  len(it)==len(list(it)).

A complication is that an iterable and iterator can be the same object. To
maintain the invariant, an iterator needs to dynamically update its length.
For instance, an iterable such as xrange(10) always reports its length as ten,
but it=iter(xrange(10)) starts at ten, and then goes to nine after it.next().
Having this capability means that map() can ignore the distinction between
map(func, iterable) and map(func, iter(iterable)).

When the iterable is immutable, the implementation can straight-forwardly
report the original length minus the cumulative number of calls to next().
This is the case for tuples, xrange objects, and itertools.repeat().

Some containers become temporarily immutable during iteration.  This includes
dicts, sets, and collections.deque.  Their implementation is equally simple
though they need to permanently set their length to zero whenever there is
an attempt to iterate after a length mutation.

The situation slightly more involved whenever an object allows length mutation
during iteration.  Lists and sequence iterators are dynamically updatable.
So, if a list is extended during iteration, the iterator will continue through
the new items.  If it shrinks to a point before the most recent iteration,
then no further items are available and the length is reported at zero.

Reversed objects can also be wrapped around mutable objects; however, any
appends after the current position are ignored.  Any other approach leads
to confusion and possibly returning the same item more than once.

The iterators not listed above, such as enumerate and the other itertools,
are not length transparent because they have no way to distinguish between
iterables that report static length and iterators whose length changes with
each call (i.e. the difference between enumerate('abc') and
enumerate(iter('abc')).




Raymond___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-16 Thread Serhiy Storchaka

On 16.07.12 10:36, Stefan Behnel wrote:

Return values could be -1 for don't know and -2 for infinite at the C
level, and NotImplemented for don't know at the Python level.


PY_SSIZE_T_MAX is better value for infinite. In any case no difference 
for consumer between PY_SSIZE_T_MAX and a real infinity.


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-16 Thread Raymond Hettinger

On Jul 16, 2012, at 12:36 AM, Stefan Behnel wrote:

 I'd like to more visibly repeat my suggestion to make this a slot method
 tp_length_hint() in extension types that returns a Py_ssize_t.

That is merely an implementation detail, but it would be a nice improvement.


Raymond

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-16 Thread Brett Cannon
On Mon, Jul 16, 2012 at 3:36 AM, Stefan Behnel stefan...@behnel.de wrote:

 Alex Gaynor, 15.07.2012 00:11:
  CPython currently defines an ``__length_hint__`` method on several
 types, such
  as various iterators. This method is then used by various other
 functions (such
  as ``map``) to presize lists based on the estimated returned by
  ``__length_hint__``. Types can then define ``__length_hint__`` which are
 not
  sized, and thus should not define ``__len__``, but can estimate or
 compute a
  size (such as many iterators).
 
  Proposal
  
 
  This PEP proposes formally documenting ``__length_hint__`` for other
  interpreter and non-standard library Python to implement.
 
  ``__length_hint__`` must return an integer, and is not required to be
 accurate.
  It may return a value that is either larger or smaller than the actual
 size of
  the container. It may raise a ``TypeError`` if a specific instance
 cannot have
  its length estimated. It may not return a negative value.

 I'd like to more visibly repeat my suggestion to make this a slot method
 tp_length_hint() in extension types that returns a Py_ssize_t.

 That suggests that a negative return value would have a special meaning
 instead of relying on return values like NotImplemented. The Python wrapper
 of that slot method could still implement a mapping for this.

 Return values could be -1 for don't know and -2 for infinite at the C
 level, and NotImplemented for don't know at the Python level. Not sure
 about a good Python value for infinite.


Gods no. Making the return value different in C vs. Python code is just
asking for trouble in terms of having to remember that specific difference
while coding. Plus asking for people to check for an explicit negative
values instead of just = 0 would be problematic and prone to error.



 Maybe return -1 for infinite at both levels and -2/NotImplemented for
 don't know in C/Python? That would suggest -3 to propagate exceptions at
 the C level.


See above.

This is another reason why I don't think the infinite iterator concept is
worth expressin. It's just mucking things up for no good reason. infinite
== I don't know in the case of pre-allocation of a list. Just raise an
exception or return None and be done with it. Nice and simple. And my vote
is for an exception as EAFP.

-Brett
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-16 Thread M Stefan

On 7/16/2012 9:54 AM, Stefan Behnel wrote:

Mark Shannon, 15.07.2012 16:14:

Alex Gaynor wrote:

CPython currently defines an ``__length_hint__`` method on several types,
such
as various iterators. This method is then used by various other functions
(such as ``map``) to presize lists based on the estimated returned by

Don't use map as an example.
map returns an iterator so it doesn't need __length_hint__

Right. It's a good example for something else, though. As I mentioned
before, iterators should be able to propagate the length hint of an
underlying iterator, e.g. in generator expressions or map(). I consider
that an important feature that the protocol must support.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/mstefanro%40gmail.com


map() is quite problematic in this matter, and may actually
benefit from the existence of __length_hint__.
It is very easy to create an infinite loop currently by doing stuff like 
x=[1]; x+=map(str,x)

[61081 refs]
 x=[1]; x+=map(str,x)
Traceback (most recent call last):
  ...
MemoryError
[120959834 refs]
 len(x)
120898752

Obviously, this won't cause an infinite loop in Python2 where map is 
non-lazy.
Also, this won't work for all mutable containers, because not all of 
them permit

adding elements while iterating:
 s=set([1]); s.update(map(str,s))
Traceback (most recent call last):
  ...
RuntimeError: Set changed size during iteration
[61101 refs]
 s
{1, '1'}
[61101 refs]
 del s
[61099 refs]

If map objects were to disallow changing the size of the container
while iterating (I can't really think of an use-case in which such a
limitation would be harmful), it might as well be with __length_hint__.

Also, what would iter([1,2,3]).__length_hint__() return? 3 or unknown?
If 3, then the semantics of l=[1,2,3]; l += iter(l) will change 
(infinite loop

without __length_hint__ vs. list of 6 elements with __length_hint__).
If unknown, then it doesn't seem like there are very many places where
__length_hint__ can return anything but unknown.

Regards,
  Stefan M
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-16 Thread Ethan Furman

M Stefan wrote:

Also, what would iter([1,2,3]).__length_hint__() return? 3 or unknown?
If 3, then the semantics of l=[1,2,3]; l += iter(l) will change 
(infinite loop without __length_hint__ vs. list of 6 elements with __length_hint__).


What __length_hint__ returns is irrelevant -- it's only a hint.  Python 
will have to loop over all the items.  So you would still get an 
infinite loop with the above code.


~Ethan~
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-16 Thread Alex Gaynor
I've updated the PEP to reflect the discussion. There are two major changes:

1) NotImplemented may be used by __length_hint__ to indicate that there is no 
   finite length hint available.
2) callers of operator.length_hint() must provide their own default value, this
   is also required by the current C _PyObject_LengthHint implementation.

There are no provisions for infinite iterators, that is not within the scope of
this proposal.

Alex

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-16 Thread Nick Coghlan
On Tue, Jul 17, 2012 at 1:03 PM, Alex Gaynor alex.gay...@gmail.com wrote:
 I've updated the PEP to reflect the discussion. There are two major changes:

 1) NotImplemented may be used by __length_hint__ to indicate that there is no
finite length hint available.

I've been thinking about this a bit more, and I think it does provide
good scope for eventually adding __length_hint__ to more iterators
(including map, filter and enumerate).

 2) callers of operator.length_hint() must provide their own default value, 
 this
is also required by the current C _PyObject_LengthHint implementation.

And this makes it explicit that API users need to deal with the
AttributeError/NotImplemented case, whilst making it easy to do so.
Good call.

 There are no provisions for infinite iterators, that is not within the scope 
 of
 this proposal.

I'll repeat my observation that remaining silent on this point is
effectively identical to blessing the practice of raising an exception
in __length_hint__ to force fast failure of attempts to convert an
infinite iterator to a concrete container. Rather than leaving people
to figure this out on their own, we may as well make it official that
TypeError can be raised in __length_hint__ to block conversion to
concrete containers that use a preallocation strategy.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Steven D'Aprano

Nick Coghlan wrote:

On Sun, Jul 15, 2012 at 9:18 AM, Benjamin Peterson benja...@python.org wrote:

Open questions
==

There are two open questions for this PEP:

* Should ``list`` expose a kwarg in it's constructor for supplying a length
  hint.
* Should a function be added either to ``builtins`` or some other module which
  calls ``__length_hint__``, like ``builtins.len`` calls ``__len__``.

Let's try to keep this as limited as possible for a public API.


Length hints are very useful for *any* container implementation,
whether those containers are in the standard library or not. Just as
we exposed operator.index when __index__ was added, we should expose
an operator.length_hint function with the following semantics:

[...]

As given, length_hint gives no way of distinguishing between iterables and 
non-iterables:


py length_hint([])
0
py length_hint(42)
0

nor does it give iterable objects a way to indicate that either they don't 
know their length, or that they are infinite.


I suggest:

* object (and hence all other types that don't explicitly override it)
  should have a __length_hint__ that raises TypeError;

* __length_hint__ should be allowed to return None to indicate don't know
  or -1 to indicate infinite.

Presumably anything that wishes to create a list or other sequence from an 
object with a hint of -1 could then raise an exception immediately.





--
Steven
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Nick Coghlan
On Sun, Jul 15, 2012 at 6:21 PM, Steven D'Aprano st...@pearwood.info wrote:
 I suggest:

 * object (and hence all other types that don't explicitly override it)
   should have a __length_hint__ that raises TypeError;

We can keep it simpler than that just by changing the order of the checks.

 * __length_hint__ should be allowed to return None to indicate don't know
   or -1 to indicate infinite.

 Presumably anything that wishes to create a list or other sequence from an
 object with a hint of -1 could then raise an exception immediately.

I'm not seeing the value in returning None over 0 for the don't know
case - it just makes the API harder to use. Declaring negative results
as meaning I'm infinite sounds reasonable, though:

def length_hint(obj):
Return an estimate of the number of items in obj.

This is useful for presizing containers when building from an iterable.

If the object supports len(), the result will be exact. Otherwise,
it may over or underestimate by an arbitrary amount.

try:
get_hint = obj.__length_hint__
except AttributeError:
return len(obj)
hint = get_hint()
if not isinstance(hint, int):
msg = Length hint must be an integer, not %r
raise TypeError(msg % type(hint))
if hint  0:
raise ValueError(%r is an infinite iterator % (obj,))
return hint

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Stefan Behnel
Alex Gaynor, 15.07.2012 07:20:
 there's no way for the __lenght_hint__ to specify that
 that particular instance can't have a length hint computed.  e.g. imagine
 some sort of lazy stream that cached itself, and only wanted to offer a
 length hint if it had already been evaluated.  Without an exception to
 raise, it has to return whatever the magic value for length_hint is (in
 your impl it appears to be 0, the current _PyObject_LengthHint method in
 CPython has a required `default` parameter).  The PEP proposes using
 TypeError for that.

Yes, that's a major issue. I've been planning to add a length hint to
Cython's generator expressions for a while, but the problem is really that
in most cases it is only known at runtime if the underlying iterable has a
length hint, so propagating it needs a way to say sorry, I thought I might
know, but I don't. It would be even better if this way was efficient.
Since we're at a point of making this an official protocol, why not change
the current behaviour and return -1 (or even just 0) to explicitly state
that we don't know?

The problem with an exception here is that it might have been raised
accidentally inside of the __length_hint__() implementation that is being
asked. Swallowing it just because it happened to be a TypeError rather than
something else may end up covering bugs. We had a similar issue with
hasattr() in the past.

Also, it would be nice if this became a type slot rather than requiring a
dict lookup and Python function call.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Antoine Pitrou
On Sun, 15 Jul 2012 18:47:38 +1000
Nick Coghlan ncogh...@gmail.com wrote:
 
  * __length_hint__ should be allowed to return None to indicate don't know
or -1 to indicate infinite.
 
  Presumably anything that wishes to create a list or other sequence from an
  object with a hint of -1 could then raise an exception immediately.
 
 I'm not seeing the value in returning None over 0 for the don't know
 case - it just makes the API harder to use.

The point is that 0 is a legitimate value for a length hint. Simple
implementations of __length_hint__ will start returning 0 as a
legitimate value and you will wrongly interpret that as don't know,
which kinds of defeat the purpose of __length-hint__ ;)

That said, I don't think a special value for is infinite is useful.
Just make -1 mean I don't know.

Regards

Antoine.


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Steven D'Aprano

Antoine Pitrou wrote:


The point is that 0 is a legitimate value for a length hint. Simple
implementations of __length_hint__ will start returning 0 as a
legitimate value and you will wrongly interpret that as don't know,
which kinds of defeat the purpose of __length-hint__  ;) 



That said, I don't think a special value for is infinite is useful.
Just make -1 mean I don't know.


You've obviously never accidentally called list on an infinite iterator *wink*

It's not the (eventual) MemoryError that is the problem. On some systems, this 
can cause the PC  to become unresponsive as the OS tries to free an 
ever-increasing amount of memory. Been there, done that, on a production 
system. I had to do a hard reboot to fix it.


I think having a hint that says there's no way this can succeed, fail 
immediately is more useful than caring about the difference between a hint of 
0 and a hint of 1.




--
Steven
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Nick Coghlan
Right, I agree on the value in being able to return something to say this
cannot be converted to a concrete container.

I still haven't seen a use case where the appropriate response to I don't
know differs from the appropriate response to a hint of zero - that is,
you don't preallocate, you just start iterating.

Cheers,
Nick.

--
Sent from my phone, thus the relative brevity :)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Antoine Pitrou
On Mon, 16 Jul 2012 00:08:41 +1000
Nick Coghlan ncogh...@gmail.com wrote:
 Right, I agree on the value in being able to return something to say this
 cannot be converted to a concrete container.

Who would be able to return that, apart from trivial cases like
itertools.cycle()?

Regards

Antoine.


-- 
Software development and contracting: http://pro.pitrou.net


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Christian Heimes
Am 15.07.2012 16:22, schrieb Antoine Pitrou:
 On Mon, 16 Jul 2012 00:08:41 +1000
 Nick Coghlan ncogh...@gmail.com wrote:
 Right, I agree on the value in being able to return something to say this
 cannot be converted to a concrete container.
 
 Who would be able to return that, apart from trivial cases like
 itertools.cycle()?

For example most numerical sequence iterators like Fibonacci generator,
prime number sequence generator and even trivial cases like even natural
number generator. IMO it's a good idea to have a notation for infinitive
iterators that can't be materialized as finite containers.

+1

Christian

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Mark Shannon

Nick Coghlan wrote:
Right, I agree on the value in being able to return something to say 
this cannot be converted to a concrete container.


I still haven't seen a use case where the appropriate response to I 
don't know differs from the appropriate response to a hint of zero - 
that is, you don't preallocate, you just start iterating.




There seem to be 5 possible classes values of __length_hint__ that an
iterator object can provide:

1. Don't implement it at all.

2. Implement __length_hint__() but don't want to return any value.
   Either raise an exception (TypeError) -- As suggested in the PEP.
   or return NotImplemented -- my preferred option.

3. Return a don't know value:
   Returning 0 would be fine for this, but the VM might want to respond
   differently to don't know and 0.
__length_hint__() == 0 container should be minimum size.
__length_hint__() == unknown container starts at default size.

4. Infinite iterator:
   Could return float('inf'), but given this is a hint then
   returning sys.maxsize or sys.maxsize + 1 might be OK.
   Alternatively raise an OverflowError

5. A meaningful length. No problem :)

Also, what are the allowable return types?

1. int only
2. Any number (ie any type with a __int__() method)?
3. Or any integer-like object (ie a type with a __index__() method)?

My suggestion:

a) Don't want to return any value or don't know: return NotImplemented
b) For infinite iterators: raise an OverflowError
c) All other cases: return an int or a type with a __index__() method.

Cheers,
Mark.

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Mark Shannon

Alex Gaynor wrote:

Hi all,

I've just submitted a PEP proposing making __length_hint__ a public API for 
users to define and other VMs to implement:


These seems back-to-front.
__length_hint__ is *used* by the VM, not provided by it.
It should be part of the object model, rather than the API.



PEP: 424
Title: A method for exposing a length hint
Version: $Revision$
Last-Modified: $Date
Author: Alex Gaynor alex.gay...@gmail.com
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 14-July-2012
Python-Version: 3.4

Abstract


CPython currently defines an ``__length_hint__`` method on several types, such
as various iterators. This method is then used by various other functions (such 
as

``map``) to presize lists based on the estimated returned by


Don't use map as an example.
map returns an iterator so it doesn't need __length_hint__


``__length_hint__``. Types can then define ``__length_hint__`` which are not
sized, and thus should not define ``__len__``, but can estimate or compute a
size (such as many iterators).

Proposal


This PEP proposes formally documenting ``__length_hint__`` for other
interpreter and non-standard library Python to implement.

``__length_hint__`` must return an integer, and is not required to be accurate.
It may return a value that is either larger or smaller than the actual size of
the container. It may raise a ``TypeError`` if a specific instance cannot have
its length estimated. It may not return a negative value.


Rather than raising a TypeError, why not return NotImplemented?



Rationale
=

Being able to pre-allocate lists based on the expected size, as estimated by 
``__length_hint__``, can be a significant optimization. CPython has been

observed to run some code faster than PyPy, purely because of this optimization
being present.

Open questions
==

There are two open questions for this PEP:

* Should ``list`` expose a kwarg in it's constructor for supplying a length
  hint.
* Should a function be added either to ``builtins`` or some other module which
  calls ``__length_hint__``, like ``builtins.len`` calls ``__len__``.

Copyright
=

This document has been placed into the public domain.

..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8




Alex

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/mark%40hotpy.org


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Brett Cannon
On Sun, Jul 15, 2012 at 10:39 AM, Mark Shannon m...@hotpy.org wrote:

 Nick Coghlan wrote:

 Right, I agree on the value in being able to return something to say
 this cannot be converted to a concrete container.

 I still haven't seen a use case where the appropriate response to I
 don't know differs from the appropriate response to a hint of zero - that
 is, you don't preallocate, you just start iterating.


 There seem to be 5 possible classes values of __length_hint__ that an
 iterator object can provide:

 1. Don't implement it at all.

 2. Implement __length_hint__() but don't want to return any value.
Either raise an exception (TypeError) -- As suggested in the PEP.
or return NotImplemented -- my preferred option.

 3. Return a don't know value:
Returning 0 would be fine for this, but the VM might want to respond
differently to don't know and 0.
 __length_hint__() == 0 container should be minimum
 size.
 __length_hint__() == unknown container starts at default
 size.


 4. Infinite iterator:
Could return float('inf'), but given this is a hint then
returning sys.maxsize or sys.maxsize + 1 might be OK.
Alternatively raise an OverflowError


I am really having a hard time differentiating infinity with I don't know
since they are both accurate from the point of view of __length_hint__ and
its typical purpose of allocation. You have no clue how many values will be
grabbed from an infinite iterator, so it's the same as just not knowing
upfront how long the iterator will be, infinite or not, and thus not worth
distinguishing.



 5. A meaningful length. No problem :)

 Also, what are the allowable return types?

 1. int only
 2. Any number (ie any type with a __int__() method)?
 3. Or any integer-like object (ie a type with a __index__() method)?

 My suggestion:

 a) Don't want to return any value or don't know: return NotImplemented
 b) For infinite iterators: raise an OverflowError
 c) All other cases: return an int or a type with a __index__() method.


I'm fine with (a), drop (b), and for (c) use what we allow for __len__()
since, as Nick's operator.length_hint pseudo-code suggests, people will
call this as a fallback if __len__ isn't defined.

-Brett




 Cheers,
 Mark.


 __**_
 Python-Dev mailing list
 Python-Dev@python.org
 http://mail.python.org/**mailman/listinfo/python-devhttp://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe: http://mail.python.org/**mailman/options/python-dev/**
 brett%40python.orghttp://mail.python.org/mailman/options/python-dev/brett%40python.org

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Christian Heimes
Am 15.07.2012 16:39, schrieb Mark Shannon:
 1. Don't implement it at all.
 
 2. Implement __length_hint__() but don't want to return any value.
Either raise an exception (TypeError) -- As suggested in the PEP.
or return NotImplemented -- my preferred option.

How is this different from don't know? What's the use case for knowing
that the object doesn't want to say anything or doesn't know its
possible length.

 3. Return a don't know value:
Returning 0 would be fine for this, but the VM might want to respond
differently to don't know and 0.

How about None? It's the logical choice, simple and easy to test for in
Python and C code.

0 is a valid number for I know that's I'll return nothing.

 4. Infinite iterator:
Could return float('inf'), but given this is a hint then
returning sys.maxsize or sys.maxsize + 1 might be OK.
Alternatively raise an OverflowError

Too complex, hard to remember and even harder to check for. Since a
length is always positive or zero, -1 is a good return value for infinite.

 a) Don't want to return any value or don't know: return NotImplemented

+1

 b) For infinite iterators: raise an OverflowError

-1, I'm for -1. ;) I'm not a fan of using exception for valid and
correct return values.

 c) All other cases: return an int or a type with a __index__() method.

+1

Christian

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Mark Shannon

Brett Cannon wrote:



On Sun, Jul 15, 2012 at 10:39 AM, Mark Shannon m...@hotpy.org 
mailto:m...@hotpy.org wrote:


Nick Coghlan wrote:

Right, I agree on the value in being able to return something to
say this cannot be converted to a concrete container.

I still haven't seen a use case where the appropriate response
to I don't know differs from the appropriate response to a
hint of zero - that is, you don't preallocate, you just start
iterating.


There seem to be 5 possible classes values of __length_hint__ that an
iterator object can provide:

1. Don't implement it at all.

2. Implement __length_hint__() but don't want to return any value.
   Either raise an exception (TypeError) -- As suggested in the PEP.
   or return NotImplemented -- my preferred option.

3. Return a don't know value:
   Returning 0 would be fine for this, but the VM might want to respond
   differently to don't know and 0.
__length_hint__() == 0 container should be
minimum size.
__length_hint__() == unknown container starts at
default size.


4. Infinite iterator:
   Could return float('inf'), but given this is a hint then
   returning sys.maxsize or sys.maxsize + 1 might be OK.
   Alternatively raise an OverflowError


I am really having a hard time differentiating infinity with I don't 
know since they are both accurate from the point of view of 
__length_hint__ and its typical purpose of allocation. You have no clue 
how many values will be grabbed from an infinite iterator, so it's the 
same as just not knowing upfront how long the iterator will be, infinite 
or not, and thus not worth distinguishing.
 



5. A meaningful length. No problem :)

Also, what are the allowable return types?

1. int only
2. Any number (ie any type with a __int__() method)?
3. Or any integer-like object (ie a type with a __index__() method)?

My suggestion:

a) Don't want to return any value or don't know: return NotImplemented
b) For infinite iterators: raise an OverflowError
c) All other cases: return an int or a type with a __index__() method.


I'm fine with (a), drop (b), and for (c) use what we allow for __len__() 
since, as Nick's operator.length_hint pseudo-code suggests, people will 
call this as a fallback if __len__ isn't defined.


So how does an iterator express infinite length?

What should happen if I am silly enough to do this:
 list(itertools.count())

This will fail; it should fail quickly.

Cheers,
Mark.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Antoine Pitrou
On Sun, 15 Jul 2012 16:33:23 +0200
Christian Heimes li...@cheimes.de wrote:
 Am 15.07.2012 16:22, schrieb Antoine Pitrou:
  On Mon, 16 Jul 2012 00:08:41 +1000
  Nick Coghlan ncogh...@gmail.com wrote:
  Right, I agree on the value in being able to return something to say this
  cannot be converted to a concrete container.
  
  Who would be able to return that, apart from trivial cases like
  itertools.cycle()?
 
 For example most numerical sequence iterators like Fibonacci generator,
 prime number sequence generator and even trivial cases like even natural
 number generator.

First, you can't implement __length_hint__ for a generator, which is the
preferred (the most practical) way of writing iterators in pure Python.

Second, not all iterators will implement __length_hint__ (because it's
optional and, really, of rather little use). So, as a user, you cannot
hope that `list(some_iterator)` will always raise instead of filling
your memory with an infinite stream of values: you have to be careful
anyway.

Even if __length_hint__ is implemented, its result may be wrong.
That's the whole point: it's a *hint*; an iterator might tell you it's
finite while it's infinite, or the reverse.


My conclusion is that an infinite iterator is a documentation issue.
Just tell the user that it doesn't stop, and let them shoot themselves
in the foot in they want to.

Regards

Antoine.


-- 
Software development and contracting: http://pro.pitrou.net


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Antoine Pitrou
On Sun, 15 Jul 2012 16:08:00 +0100
Mark Shannon m...@hotpy.org wrote:
 
 What should happen if I am silly enough to do this:
   list(itertools.count())
 
 This will fail; it should fail quickly.

Why should it? AFAIK it's not a common complaint. You said it yourself:
it's a silly thing to do.

Regards

Antoine.


-- 
Software development and contracting: http://pro.pitrou.net


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Alexandre Zani
On Sun, Jul 15, 2012 at 8:08 AM, Mark Shannon m...@hotpy.org wrote:
 Brett Cannon wrote:



 On Sun, Jul 15, 2012 at 10:39 AM, Mark Shannon m...@hotpy.org
 mailto:m...@hotpy.org wrote:

 Nick Coghlan wrote:

 Right, I agree on the value in being able to return something to
 say this cannot be converted to a concrete container.

 I still haven't seen a use case where the appropriate response
 to I don't know differs from the appropriate response to a
 hint of zero - that is, you don't preallocate, you just start
 iterating.


 There seem to be 5 possible classes values of __length_hint__ that an
 iterator object can provide:

 1. Don't implement it at all.

 2. Implement __length_hint__() but don't want to return any value.
Either raise an exception (TypeError) -- As suggested in the PEP.
or return NotImplemented -- my preferred option.

 3. Return a don't know value:
Returning 0 would be fine for this, but the VM might want to
 respond
differently to don't know and 0.
 __length_hint__() == 0 container should be
 minimum size.
 __length_hint__() == unknown container starts at
 default size.


 4. Infinite iterator:
Could return float('inf'), but given this is a hint then
returning sys.maxsize or sys.maxsize + 1 might be OK.
Alternatively raise an OverflowError


 I am really having a hard time differentiating infinity with I don't
 know since they are both accurate from the point of view of __length_hint__
 and its typical purpose of allocation. You have no clue how many values will
 be grabbed from an infinite iterator, so it's the same as just not knowing
 upfront how long the iterator will be, infinite or not, and thus not worth
 distinguishing.


 5. A meaningful length. No problem :)

 Also, what are the allowable return types?

 1. int only
 2. Any number (ie any type with a __int__() method)?
 3. Or any integer-like object (ie a type with a __index__() method)?

 My suggestion:

 a) Don't want to return any value or don't know: return
 NotImplemented
 b) For infinite iterators: raise an OverflowError
 c) All other cases: return an int or a type with a __index__() method.


 I'm fine with (a), drop (b), and for (c) use what we allow for __len__()
 since, as Nick's operator.length_hint pseudo-code suggests, people will call
 this as a fallback if __len__ isn't defined.


 So how does an iterator express infinite length?

 What should happen if I am silly enough to do this:
 list(itertools.count())

 This will fail; it should fail quickly.


 Cheers,
 Mark.
 ___
 Python-Dev mailing list
 Python-Dev@python.org
 http://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe:
 http://mail.python.org/mailman/options/python-dev/alexandre.zani%40gmail.com

The PEP so far says: It may raise a ``TypeError`` if a specific
instance cannot have
its length estimated. In many ways, I don't know is the same as
this specific instance cannot have its length estimated. Why not
just raise a TypeError?

Also, regarding the code Nick posted above, I'm a little concerned
about calling len as the first thing to try. That means that if I
implement both __len__ and __len_hint__ (perhaps because __len__ is
very expensive) __len_hint__ will never be used. It's relatively easy
to say:

try:
  hint = len_hint(l)
except TypeError:
  hint = len(l)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Steven D'Aprano

Mark Shannon wrote:


So how does an iterator express infinite length?


The suggestion was it should return -1.



What should happen if I am silly enough to do this:
  list(itertools.count())

This will fail; it should fail quickly.


That depends on your OS. I've just tested it now on Linux Mint, and the Python 
process was terminated within seconds.


I've also inadvertently done it on a Fedora system, which became completely 
unresponsive to user-input (including ctrl-alt-delete) within a few minutes. I 
let it run overnight (16 hours) before literally pulling the plug.


(I expect the difference in behaviour is due to the default ulimit under 
Debian/Mint and RedHat/Fedora systems.)


Ignoring OS-specific features, the promise[1] of the language is that list 
will try to allocate enough space for every item yielded by the iterator, or 
fail with a MemoryError. No promise is made as to how long that will take: it 
could take hours, or days, depending on how badly memory allocation 
performance drops when faced with unreasonably large requests. You can't 
expect it to fail either quickly or with an exception.


With a length hint, we could strengthen that promise:

if __length_hint__ returns a negative number, list, tuple and set will fail 
immediately with MemoryError


which I think is a good safety feature for some things which cannot possibly 
succeed, but risk DOSing your system. Does it prevent every possible failure 
mode? No, of course not. But just because you can't prevent *every* problem 
doesn't mean you should prevent the ones which you can.



[1] I think. I'm sure I read this somewhere in the docs, but I can't find it 
now.


--
Steven
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Chris Angelico
On Mon, Jul 16, 2012 at 1:55 AM, Steven D'Aprano st...@pearwood.info wrote:
 (I expect the difference in behaviour is due to the default ulimit under
 Debian/Mint and RedHat/Fedora systems.)

Possibly also virtual memory settings. Allocating gobs of memory with
a huge page file slows everything down without raising an error.

And since it's possible to have non-infinite but ridiculous-sized
iterators, I'd not bother putting too much effort into protecting
infinite iterators - although the huge but not infinite case is,
admittedly, rather rarer than either reasonable-sized or actually
infinite.

ChrisA
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Steven D'Aprano

Antoine Pitrou wrote:


First, you can't implement __length_hint__ for a generator, which is the
preferred (the most practical) way of writing iterators in pure Python.


Limitations of generators are no reason for not improving iterators which are 
not generators. __length_hint__ already exists; this proposal simply proposes 
making it documented and officially supported.


py iter([]).__length_hint__
built-in method __length_hint__ of list_iterator object at 0xb7bcf98c



Even if __length_hint__ is implemented, its result may be wrong.
That's the whole point: it's a *hint*; an iterator might tell you it's
finite while it's infinite, or the reverse.


If it claims to be infinite, I see no reason to disbelieve it on the 
off-chance that it is actually both finite and small enough to fit into memory 
 without crashing my system. If it claims to be finite, but is actually 
infinite, well that's not much of a hint, is it? There's an implied promise 
that the hint will be close to the real value, not infinitely distant.




My conclusion is that an infinite iterator is a documentation issue.
Just tell the user that it doesn't stop, and let them shoot themselves
in the foot in they want to.


Buffer overflows are a documentation issue. Just tell the user not to 
overwrite memory they don't mean to, and let them shoot themselves in the foot 
if they want.


*wink*



--
Steven

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Antoine Pitrou
On Mon, 16 Jul 2012 02:00:58 +1000
Chris Angelico ros...@gmail.com wrote:
 On Mon, Jul 16, 2012 at 1:55 AM, Steven D'Aprano st...@pearwood.info wrote:
  (I expect the difference in behaviour is due to the default ulimit under
  Debian/Mint and RedHat/Fedora systems.)
 
 Possibly also virtual memory settings. Allocating gobs of memory with
 a huge page file slows everything down without raising an error.
 
 And since it's possible to have non-infinite but ridiculous-sized
 iterators, I'd not bother putting too much effort into protecting
 infinite iterators - although the huge but not infinite case is,
 admittedly, rather rarer than either reasonable-sized or actually
 infinite.

In the real world, I'm sure huge but not infinite is much more
frequent than actually infinite. Trying to list() an infinite
iterator is a programming error, so it shouldn't end up in production
code. However, data that grows bigger than expected (or that gets
disposed of too late) is quite a common thing.

hint
When hg.python.org died of OOM two weeks ago, it wasn't because of an
infinite iterator:
http://mail.python.org/pipermail/python-committers/2012-July/002084.html
/hint

Regards

Antoine.


-- 
Software development and contracting: http://pro.pitrou.net


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Antoine Pitrou
On Mon, 16 Jul 2012 02:21:20 +1000
Steven D'Aprano st...@pearwood.info wrote:
 
  My conclusion is that an infinite iterator is a documentation issue.
  Just tell the user that it doesn't stop, and let them shoot themselves
  in the foot in they want to.
 
 Buffer overflows are a documentation issue. Just tell the user not to 
 overwrite memory they don't mean to, and let them shoot themselves in the 
 foot 
 if they want.

No, buffer overflows are bugs and they get fixed.

Regards

Antoine.


-- 
Software development and contracting: http://pro.pitrou.net


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Steven D'Aprano

Steven D'Aprano wrote:


With a length hint, we could strengthen that promise:

if __length_hint__ returns a negative number, list, tuple and set will 
fail immediately with MemoryError


which I think is a good safety feature for some things which cannot 
possibly succeed, but risk DOSing your system. Does it prevent every 
possible failure mode? No, of course not. But just because you can't 
prevent *every* problem doesn't mean you should prevent the ones which 
you can.


Gah, I messed that last sentence up. It should read:

just because you can't prevent *every* problem doesn't mean you SHOULDN'T 
prevent the ones which you can.



--
Steven

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Stephen J. Turnbull
Nick Coghlan writes:

  Right, I agree on the value in being able to return something to say this
  cannot be converted to a concrete container.
  
  I still haven't seen a use case where the appropriate response to I don't
  know differs from the appropriate response to a hint of zero - that is,
  you don't preallocate, you just start iterating.

Why wouldn't one just believe the hint and jump past the iteration?

What about an alternative API such as length_hint(iter, bound)
returning 'cannot say' (if no hint is available), 'small' (if the
estimated length is less than bound), and 'large' (if it's greater
than the bound or infinite)?  (Or None, True, False which would give
the boolean interpretation do I know I'm small enough to be converted
to a concrete container?)

The point is that I don't really see the value in returning a precise
estimate that cannot be relied on to be accurate.  OK, Python is a
consenting adults language, but returning an integer here seems like
invitation to abuse.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Nick Coghlan
On Jul 16, 2012 1:52 PM, Stephen J. Turnbull step...@xemacs.org wrote:
 The point is that I don't really see the value in returning a precise
 estimate that cannot be relied on to be accurate.  OK, Python is a
 consenting adults language, but returning an integer here seems like
 invitation to abuse.

Because preallocating memory is ridiculously faster than doing multiple
resizes. That's all this API is for: how many objects should a container
constructor preallocate space for when building from an iterable. It's an
important optimisation in CPython when using itertools, and PyPy is
planning to adopt it as well. Alex is doing the right thing in attempting
to standardise it rather than risk the two implementations using subtly
incompatible definitions.

Skipping the iteration in the zero case is a pointless micro-optimisation
that just makes the API more complex for no good reason. Allowing a
negative hint to mean infinite, on the other hand, avoids certain
categories of errors without making the API any harder to use (since
negatives have to be rejected anyway).

--
Sent from my phone, thus the relative brevity :)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Steven D'Aprano

Stephen J. Turnbull wrote:


The point is that I don't really see the value in returning a precise
estimate that cannot be relied on to be accurate.  OK, Python is a
consenting adults language, but returning an integer here seems like
invitation to abuse.


Since __length_hint__ already exists and is already used, we should probably 
hear from somebody who knows how it is used and what problems and/or benefits 
it leads to.




--
Steven
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-14 Thread Alex Gaynor
Hi all,

I've just submitted a PEP proposing making __length_hint__ a public API for 
users to define and other VMs to implement:

PEP: 424
Title: A method for exposing a length hint
Version: $Revision$
Last-Modified: $Date
Author: Alex Gaynor alex.gay...@gmail.com
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 14-July-2012
Python-Version: 3.4

Abstract


CPython currently defines an ``__length_hint__`` method on several types, such
as various iterators. This method is then used by various other functions (such 
as
``map``) to presize lists based on the estimated returned by
``__length_hint__``. Types can then define ``__length_hint__`` which are not
sized, and thus should not define ``__len__``, but can estimate or compute a
size (such as many iterators).

Proposal


This PEP proposes formally documenting ``__length_hint__`` for other
interpreter and non-standard library Python to implement.

``__length_hint__`` must return an integer, and is not required to be accurate.
It may return a value that is either larger or smaller than the actual size of
the container. It may raise a ``TypeError`` if a specific instance cannot have
its length estimated. It may not return a negative value.

Rationale
=

Being able to pre-allocate lists based on the expected size, as estimated by 
``__length_hint__``, can be a significant optimization. CPython has been
observed to run some code faster than PyPy, purely because of this optimization
being present.

Open questions
==

There are two open questions for this PEP:

* Should ``list`` expose a kwarg in it's constructor for supplying a length
  hint.
* Should a function be added either to ``builtins`` or some other module which
  calls ``__length_hint__``, like ``builtins.len`` calls ``__len__``.

Copyright
=

This document has been placed into the public domain.

..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8




Alex

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-14 Thread Benjamin Peterson
2012/7/14 Alex Gaynor alex.gay...@gmail.com:

 Proposal
 

 This PEP proposes formally documenting ``__length_hint__`` for other
 interpreter and non-standard library Python to implement.

 ``__length_hint__`` must return an integer, and is not required to be 
 accurate.
 It may return a value that is either larger or smaller than the actual size of
 the container. It may raise a ``TypeError`` if a specific instance cannot have
 its length estimated. It may not return a negative value.

And what happens if you return a negative value?


 Rationale
 =

 Being able to pre-allocate lists based on the expected size, as estimated by
 ``__length_hint__``, can be a significant optimization. CPython has been
 observed to run some code faster than PyPy, purely because of this 
 optimization
 being present.

 Open questions
 ==

 There are two open questions for this PEP:

 * Should ``list`` expose a kwarg in it's constructor for supplying a length
   hint.
 * Should a function be added either to ``builtins`` or some other module which
   calls ``__length_hint__``, like ``builtins.len`` calls ``__len__``.

Let's try to keep this as limited as possible for a public API.


-- 
Regards,
Benjamin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-14 Thread Alex Gaynor
On Sat, Jul 14, 2012 at 4:18 PM, Benjamin Peterson benja...@python.orgwrote:

 2012/7/14 Alex Gaynor alex.gay...@gmail.com:
 
  Proposal
  
 
  This PEP proposes formally documenting ``__length_hint__`` for other
  interpreter and non-standard library Python to implement.
 
  ``__length_hint__`` must return an integer, and is not required to be
 accurate.
  It may return a value that is either larger or smaller than the actual
 size of
  the container. It may raise a ``TypeError`` if a specific instance
 cannot have
  its length estimated. It may not return a negative value.

 And what happens if you return a negative value?


ValueError, the same as with len.


 
  Rationale
  =
 
  Being able to pre-allocate lists based on the expected size, as
 estimated by
  ``__length_hint__``, can be a significant optimization. CPython has been
  observed to run some code faster than PyPy, purely because of this
 optimization
  being present.
 
  Open questions
  ==
 
  There are two open questions for this PEP:
 
  * Should ``list`` expose a kwarg in it's constructor for supplying a
 length
hint.
  * Should a function be added either to ``builtins`` or some other module
 which
calls ``__length_hint__``, like ``builtins.len`` calls ``__len__``.

 Let's try to keep this as limited as possible for a public API.


Sounds reasonable to me!  Should we just go ahead and strip those out now?



 --
 Regards,
 Benjamin


Alex

-- 
I disapprove of what you say, but I will defend to the death your right to
say it. -- Evelyn Beatrice Hall (summarizing Voltaire)
The people's good is the highest law. -- Cicero
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-14 Thread Alexandre Zani
On Sat, Jul 14, 2012 at 4:21 PM, Alex Gaynor alex.gay...@gmail.com wrote:


 On Sat, Jul 14, 2012 at 4:18 PM, Benjamin Peterson benja...@python.org
 wrote:

 2012/7/14 Alex Gaynor alex.gay...@gmail.com:
 
  Proposal
  
 
  This PEP proposes formally documenting ``__length_hint__`` for other
  interpreter and non-standard library Python to implement.
 
  ``__length_hint__`` must return an integer, and is not required to be
  accurate.
  It may return a value that is either larger or smaller than the actual
  size of
  the container. It may raise a ``TypeError`` if a specific instance
  cannot have
  its length estimated. It may not return a negative value.

 And what happens if you return a negative value?


 ValueError, the same as with len.


 
  Rationale
  =
 
  Being able to pre-allocate lists based on the expected size, as
  estimated by
  ``__length_hint__``, can be a significant optimization. CPython has been
  observed to run some code faster than PyPy, purely because of this
  optimization
  being present.
 
  Open questions
  ==
 
  There are two open questions for this PEP:
 
  * Should ``list`` expose a kwarg in it's constructor for supplying a
  length
hint.
  * Should a function be added either to ``builtins`` or some other module
  which
calls ``__length_hint__``, like ``builtins.len`` calls ``__len__``.

 Let's try to keep this as limited as possible for a public API.


 Sounds reasonable to me!  Should we just go ahead and strip those out now?

I'm +1 on not having a public API for this. Ultimately the contract
for a length hint will depend heavily upon what you need it for. Some
applications would require a length hint to be an at least others an
at most and others something else entirely. Given that the contract
here appears to be =0, I don't think the length hint is particularly
useful to the public at large.




 --
 Regards,
 Benjamin


 Alex

 --
 I disapprove of what you say, but I will defend to the death your right to
 say it. -- Evelyn Beatrice Hall (summarizing Voltaire)
 The people's good is the highest law. -- Cicero


 ___
 Python-Dev mailing list
 Python-Dev@python.org
 http://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe:
 http://mail.python.org/mailman/options/python-dev/alexandre.zani%40gmail.com

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-14 Thread Benjamin Peterson
2012/7/14 Alex Gaynor alex.gay...@gmail.com:


 On Sat, Jul 14, 2012 at 4:18 PM, Benjamin Peterson benja...@python.org
 wrote:

 2012/7/14 Alex Gaynor alex.gay...@gmail.com:
 
  Proposal
  
 
  This PEP proposes formally documenting ``__length_hint__`` for other
  interpreter and non-standard library Python to implement.
 
  ``__length_hint__`` must return an integer, and is not required to be
  accurate.
  It may return a value that is either larger or smaller than the actual
  size of
  the container. It may raise a ``TypeError`` if a specific instance
  cannot have
  its length estimated. It may not return a negative value.

 And what happens if you return a negative value?


 ValueError, the same as with len.

CPython will probably have to updated to not ignore it if you return melons.



 
  Rationale
  =
 
  Being able to pre-allocate lists based on the expected size, as
  estimated by
  ``__length_hint__``, can be a significant optimization. CPython has been
  observed to run some code faster than PyPy, purely because of this
  optimization
  being present.
 
  Open questions
  ==
 
  There are two open questions for this PEP:
 
  * Should ``list`` expose a kwarg in it's constructor for supplying a
  length
hint.
  * Should a function be added either to ``builtins`` or some other module
  which
calls ``__length_hint__``, like ``builtins.len`` calls ``__len__``.

 Let's try to keep this as limited as possible for a public API.


 Sounds reasonable to me!  Should we just go ahead and strip those out now?

Certainly.


-- 
Regards,
Benjamin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-14 Thread Terry Reedy

On 7/14/2012 6:11 PM, Alex Gaynor wrote:
...

Various thoughts:

This method is then used by various other functions (such +as ``map``) 
to presize lists
 -- map no longer produces lists. This only makes sense in 3.x if you 
mean that map can pass along the value of its inputs.


Types can then define ``__length_hint__`` which are not
+sized, and thus should not define ``__len__``,
is awkwardly phrased. I think you mean
Types that are not sized and should not define __len__ can then define 
__length_hint__.


What do 'sized' and 'should' mean? Some iterators know exactly how many 
items they have yet to yield. The main implication of having a __len__ 
versus __length_hint__ methods seems to be it bool() value when empty.


If lists were to get a new keyword arg, so should the other classes 
based on one internal array. I see this has been removed.


Generator functions are the nicest way to define iterators in Python. 
Generator instances returned from generator functions cannot be given a 
length hint. They are not directly helped. However ...


Not addressed in the PEP: do consumers of __length_hint look for it (and 
__length__ before or after calling iter(input), or both? If before, then 
the following should work.


class gwlh: # generator with length hint
def __init__(self, gen, len):
self.gen = gen
self.len = len
def __iter__(self):
return self.gen
def __length_hint__(self):
return len

Do transformation iterators pass through hints from inputs? Does map(f, 
iterable) look for len or hint on iterable? Ditto for some itertools, 
like chain (add lengths). Any guidelines in the PEP


--
Terry Jan Reedy



___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-14 Thread Nick Coghlan
On Sun, Jul 15, 2012 at 9:18 AM, Benjamin Peterson benja...@python.org wrote:
 Open questions
 ==

 There are two open questions for this PEP:

 * Should ``list`` expose a kwarg in it's constructor for supplying a length
   hint.
 * Should a function be added either to ``builtins`` or some other module 
 which
   calls ``__length_hint__``, like ``builtins.len`` calls ``__len__``.

 Let's try to keep this as limited as possible for a public API.

Length hints are very useful for *any* container implementation,
whether those containers are in the standard library or not. Just as
we exposed operator.index when __index__ was added, we should expose
an operator.length_hint function with the following semantics:

def length_hint(obj):
Return an estimate of the number of items in obj. This is
useful for presizing containers when building from an iterable.

If the object supports len(), the result will be exact.
Otherwise, it may over or underestimate by an arbitrary amount. The
result will be an integer = 0.

try:
return len(obj)
except TypeError:
try:
get_hint = obj.__length_hint__
except AttributeError:
return 0
hint = get_hint()
if not isinstance(hint, int):
raise TypeError(Length hint must be an integer, not
%r % type(hint))
if hint  0:
raise ValueError(Length hint (%r) must be = 0 % hint)
return hint

There's no reason to make pure Python container implementations
reimplement all that for themselves.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-14 Thread Alex Gaynor
On Sat, Jul 14, 2012 at 10:16 PM, Nick Coghlan ncogh...@gmail.com wrote:

 On Sun, Jul 15, 2012 at 9:18 AM, Benjamin Peterson benja...@python.org
 wrote:
  Open questions
  ==
 
  There are two open questions for this PEP:
 
  * Should ``list`` expose a kwarg in it's constructor for supplying a
 length
hint.
  * Should a function be added either to ``builtins`` or some other
 module which
calls ``__length_hint__``, like ``builtins.len`` calls ``__len__``.
 
  Let's try to keep this as limited as possible for a public API.

 Length hints are very useful for *any* container implementation,
 whether those containers are in the standard library or not. Just as
 we exposed operator.index when __index__ was added, we should expose
 an operator.length_hint function with the following semantics:

 def length_hint(obj):
 Return an estimate of the number of items in obj. This is
 useful for presizing containers when building from an iterable.

 If the object supports len(), the result will be exact.
 Otherwise, it may over or underestimate by an arbitrary amount. The
 result will be an integer = 0.
 
 try:
 return len(obj)
 except TypeError:
 try:
 get_hint = obj.__length_hint__
 except AttributeError:
 return 0
 hint = get_hint()
 if not isinstance(hint, int):
 raise TypeError(Length hint must be an integer, not
 %r % type(hint))
 if hint  0:
 raise ValueError(Length hint (%r) must be = 0 % hint)
 return hint

 There's no reason to make pure Python container implementations
 reimplement all that for themselves.

 Cheers,
 Nick.

 --
 Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Sounds reasonable to me, the only issue with your psuedocode (err... I mean
Python ;)), is that there's no way for the __lenght_hint__ to specify that
that particular instance can't have a length hint computed.  e.g. imagine
some sort of lazy stream that cached itself, and only wanted to offer a
length hint if it had already been evaluated.  Without an exception to
raise, it has to return whatever the magic value for length_hint is (in
your impl it appears to be 0, the current _PyObject_LengthHint method in
CPython has a required `default` parameter).  The PEP proposes using
TypeError for that.

Anyways that code looks good, do you want to add it to the PEP?

Alex

-- 
I disapprove of what you say, but I will defend to the death your right to
say it. -- Evelyn Beatrice Hall (summarizing Voltaire)
The people's good is the highest law. -- Cicero
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com