Re: Do not promote `None` as the first argument to `filter` in documentation.

2018-03-06 Thread Peter Otten
Chris Angelico wrote:

> On Wed, Mar 7, 2018 at 2:12 AM, Kirill Balunov 
> wrote:
>>
>>
>> 2018-03-06 17:55 GMT+03:00 Chris Angelico :
>>>
>>> On Wed, Mar 7, 2018 at 1:48 AM, Kirill Balunov 
>>> wrote:
>>> > Note: For some historical reasons as the first argument you can use
>>> > None instead of function, in this case the identity function is
>>> > assumed. That is, all elements of iterable that are false are removed
>>> > which is equivalent
>>> > to (item for item in iterable if item). Currently, for the same
>>> > purpose the
>>> > preferred form is `filter(bool, iterable)`.
>>> >
>>>
>>> I'd prefer to word it something like:
>>>
>>> If the first argument is None, the identity function is assumed. That
>>> is, all elements of the iterable that are false are removed; it is
>>> equivalent to (item for item in iterable if item). It is approximately
>>> equivalent to (but faster than) filter(bool, iterable).
>>>
>>> ChrisA
>>> --
>>> https://mail.python.org/mailman/listinfo/python-list
>>
>>
>> I do not want to seem rude and stubborn, but how much faster is it to
>> highlight or emphasize it:
>>
> 
> Timings mean little. Why do we write:
> 
> if lst:
> 
> instead of:
> 
> if bool(lst):
> 
> ? Because it's unnecessary and pointless to call bool() on something
> before using it in a boolean context. If that concept causes you
> problems, it's not the fault of the filter function; filter simply
> uses something in a boolean context.
> 
> So "the identity function" is more correct than "the bool() function".

Fun fact: CPython handles filter(bool) and filter(None) the same way. In both
cases it sets the checktrue flag and takes the fast path:

static PyObject *
filter_next(filterobject *lz)
{
    PyObject *item;
    PyObject *it = lz->it;
    long ok;
    PyObject *(*iternext)(PyObject *);
    int checktrue = lz->func == Py_None || lz->func == (PyObject *)&PyBool_Type;

    iternext = *Py_TYPE(it)->tp_iternext;
    for (;;) {
        item = iternext(it);
        if (item == NULL)
            return NULL;

        if (checktrue) {
            ok = PyObject_IsTrue(item);
        } else {
            PyObject *good;
            good = PyObject_CallFunctionObjArgs(lz->func, item, NULL);
            if (good == NULL) {
                Py_DECREF(item);
                return NULL;
            }
            ok = PyObject_IsTrue(good);
            Py_DECREF(good);
        }
        if (ok > 0)
            return item;
        Py_DECREF(item);
        if (ok < 0)
            return NULL;
    }
}

If there were a built-in identity() function it could be treated the same 
way.
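
For readers who do not read C, the control flow of filter_next can be modelled
roughly in Python. This is only an illustrative sketch: `filter_model` is a
made-up name, and bool() stands in for PyObject_IsTrue; the C code above
remains the actual implementation.

```python
def filter_model(func, iterable):
    """Rough Python model of CPython's filter_next (illustrative only)."""
    # Same test as the C code: None and the bool type both take the fast path.
    checktrue = func is None or func is bool
    for item in iterable:
        if checktrue:
            ok = bool(item)        # stands in for PyObject_IsTrue(item)
        else:
            ok = bool(func(item))  # call the predicate, then test its result
        if ok:
            yield item


data = [0, 1, "", "a", None, [], [0]]
print(list(filter_model(None, data)))  # [1, 'a', [0]]
print(list(filter_model(bool, data)))  # [1, 'a', [0]]
print(list(filter_model(len, ["", "ab"])))  # ['ab']
```

The model agrees with the built-in filter on these inputs, and the first two
calls never invoke a predicate at all.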


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Do not promote `None` as the first argument to `filter` in documentation.

2018-03-06 Thread Chris Angelico
On Wed, Mar 7, 2018 at 2:33 AM, Kirill Balunov  wrote:
>
>
> 2018-03-06 17:55 GMT+03:00 Chris Angelico :
>>
>> If the first argument is None, the identity function is assumed. That
>> is, all elements of the iterable that are false are removed; it is
>> equivalent to (item for item in iterable if item). It is approximately
>> equivalent to (but faster than) filter(bool, iterable).
>>
>> ChrisA
>> --
>> https://mail.python.org/mailman/listinfo/python-list
>
>
> If you look in C source for `filter_next`
> https://github.com/python/cpython/blob/5d92647102fac9e116b98ab8bbc632eeed501c34/Python/bltinmodule.c#L593,
> there is a line:
>
> int checktrue = lz->func == Py_None || lz->func == (PyObject *)&PyBool_Type;
>
> So the only difference between `filter(None, ls)` and `filter(bool, ls)` is
> LOAD_NAME vs LOAD_CONST, and that `None` is checked before `bool`.
>

Assuming that nobody's shadowed the name 'bool' anywhere, which has to
be checked for at run time. (Which is the job of LOAD_NAME.)
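
A small sketch of what shadowing can do, assuming a hypothetical local
rebinding of the name `bool` (the three helper functions are invented for
illustration):

```python
values = [0, 1, 2, "", "x"]


def with_builtin_bool():
    # No rebinding anywhere: the name resolves to the built-in bool type.
    return list(filter(bool, values))


def with_shadowed_bool():
    # Hypothetical shadowing: a local "bool" that inverts truthiness.
    bool = lambda x: not x
    return list(filter(bool, values))


def with_none():
    # None is a keyword and cannot be rebound, so this is unaffected.
    return list(filter(None, values))


print(with_builtin_bool())   # [1, 2, 'x']
print(with_shadowed_bool())  # [0, '']
print(with_none())           # [1, 2, 'x']
```

Only the `None` form is immune to whatever the surrounding code has done to
the name.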

ChrisA


Re: Do not promote `None` as the first argument to `filter` in documentation.

2018-03-06 Thread Kirill Balunov
2018-03-06 17:55 GMT+03:00 Chris Angelico :

> If the first argument is None, the identity function is assumed. That
> is, all elements of the iterable that are false are removed; it is
> equivalent to (item for item in iterable if item). It is approximately
> equivalent to (but faster than) filter(bool, iterable).
>
> ChrisA
> --
> https://mail.python.org/mailman/listinfo/python-list
>

If you look in C source for `filter_next`
https://github.com/python/cpython/blob/5d92647102fac9e116b98ab8bbc632eeed501c34/Python/bltinmodule.c#L593,
there is a line:

int checktrue = lz->func == Py_None || lz->func == (PyObject *)&PyBool_Type;

So the only difference between `filter(None, ls)` and `filter(bool, ls)` is
LOAD_NAME vs LOAD_CONST, and that `None` is checked before `bool`.


With kind regards,
-gdg


Re: Do not promote `None` as the first argument to `filter` in documentation.

2018-03-06 Thread Chris Angelico
On Wed, Mar 7, 2018 at 2:12 AM, Kirill Balunov  wrote:
>
>
> 2018-03-06 17:55 GMT+03:00 Chris Angelico :
>>
>> On Wed, Mar 7, 2018 at 1:48 AM, Kirill Balunov 
>> wrote:
>> > Note: For some historical reasons as the first argument you can use None
>> > instead of function, in this case the identity function is assumed. That
>> > is, all elements of iterable that are false are removed which is
>> > equivalent
>> > to (item for item in iterable if item). Currently, for the same purpose
>> > the
>> > preferred form is `filter(bool, iterable)`.
>> >
>>
>> I'd prefer to word it something like:
>>
>> If the first argument is None, the identity function is assumed. That
>> is, all elements of the iterable that are false are removed; it is
>> equivalent to (item for item in iterable if item). It is approximately
>> equivalent to (but faster than) filter(bool, iterable).
>>
>> ChrisA
>> --
>> https://mail.python.org/mailman/listinfo/python-list
>
>
> I do not want to seem rude and stubborn, but how much faster is it to
> highlight or emphasize it:
>

Timings mean little. Why do we write:

if lst:

instead of:

if bool(lst):

? Because it's unnecessary and pointless to call bool() on something
before using it in a boolean context. If that concept causes you
problems, it's not the fault of the filter function; filter simply
uses something in a boolean context.

So "the identity function" is more correct than "the bool() function".
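
To illustrate the point about boolean contexts, here is a minimal sketch
using a made-up `Basket` class (not from any library). It assumes only the
standard truth-testing rules: an object with `__len__` and no `__bool__` is
falsey when its length is zero.

```python
class Basket:
    """Hypothetical container, truthy only when it holds something."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        # With no __bool__ defined, truth testing falls back to __len__.
        return len(self.items)


baskets = [Basket([]), Basket(["apple"]), Basket([]), Basket(["pear", "plum"])]

# "if basket:" and "if bool(basket):" make the same decision ...
for basket in baskets:
    assert (True if basket else False) == bool(basket)

# ... and filter(None, ...) applies that same truth test to each element.
kept = list(filter(None, baskets))
print([b.items for b in kept])  # [['apple'], ['pear', 'plum']]
```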

ChrisA


Re: Do not promote `None` as the first argument to `filter` in documentation.

2018-03-06 Thread Kirill Balunov
2018-03-06 17:55 GMT+03:00 Chris Angelico :

> On Wed, Mar 7, 2018 at 1:48 AM, Kirill Balunov 
> wrote:
> > Note: For some historical reasons as the first argument you can use None
> > instead of function, in this case the identity function is assumed. That
> > is, all elements of iterable that are false are removed which is
> equivalent
> > to (item for item in iterable if item). Currently, for the same purpose
> the
> > preferred form is `filter(bool, iterable)`.
> >
>
> I'd prefer to word it something like:
>
> If the first argument is None, the identity function is assumed. That
> is, all elements of the iterable that are false are removed; it is
> equivalent to (item for item in iterable if item). It is approximately
> equivalent to (but faster than) filter(bool, iterable).
>
> ChrisA
> --
> https://mail.python.org/mailman/listinfo/python-list
>

I do not want to seem rude or stubborn, but how much faster is it, really?
Is the difference worth highlighting or emphasizing:

from random import randint
for i in [1, 10, 100, 1000, 10_000, 100_000]:
    ls = [randint(0,1) for _ in range(i)]
    %timeit [*filter(None, ls)]
    %timeit [*filter(bool, ls)]
    print()

272 ns ± 0.0346 ns per loop (mean ± std. dev. of 7 runs, 100 loops each)
282 ns ± 0.0714 ns per loop (mean ± std. dev. of 7 runs, 100 loops each)

283 ns ± 0.0645 ns per loop (mean ± std. dev. of 7 runs, 100 loops each)
296 ns ± 0.116 ns per loop (mean ± std. dev. of 7 runs, 100 loops each)

1.4 µs ± 1.32 ns per loop (mean ± std. dev. of 7 runs, 100 loops each)
1.41 µs ± 4.05 ns per loop (mean ± std. dev. of 7 runs, 100 loops each)

14.7 µs ± 40.1 ns per loop (mean ± std. dev. of 7 runs, 10 loops each)
14.7 µs ± 23.2 ns per loop (mean ± std. dev. of 7 runs, 10 loops each)

137 µs ± 186 ns per loop (mean ± std. dev. of 7 runs, 1 loops each)
137 µs ± 24.7 ns per loop (mean ± std. dev. of 7 runs, 1 loops each)

1.32 ms ± 285 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
1.32 ms ± 908 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

With kind regards,
-gdg


Re: Do not promote `None` as the first argument to `filter` in documentation.

2018-03-06 Thread Chris Angelico
On Wed, Mar 7, 2018 at 1:48 AM, Kirill Balunov  wrote:
> Note: For some historical reasons as the first argument you can use None
> instead of function, in this case the identity function is assumed. That
> is, all elements of iterable that are false are removed which is equivalent
> to (item for item in iterable if item). Currently, for the same purpose the
> preferred form is `filter(bool, iterable)`.
>

I'd prefer to word it something like:

If the first argument is None, the identity function is assumed. That
is, all elements of the iterable that are false are removed; it is
equivalent to (item for item in iterable if item). It is approximately
equivalent to (but faster than) filter(bool, iterable).

ChrisA


Re: Do not promote `None` as the first argument to `filter` in documentation.

2018-03-06 Thread Kirill Balunov
2018-03-06 16:58 GMT+03:00 Jason Friedman :

>
> as an ordinary Python user I'd be interested in improvements to the
> documentation, including suggestions on real-world usage.
>

I'm just an ordinary user, just like you :)


> Kirill, taking deprecation/removal off the table, what changes would you
> recommend to the documentation?
>

My English is about basic level, so you may need to fix the spelling. But I
would write as follows:


filter(function, iterable)
Construct an iterator from those elements of iterable for which function
returns truthy values. iterable may be either a sequence, a container which
supports iteration, or an iterator.

Note that filter(function, iterable) is equivalent to the generator
expression (item for item in iterable if function(item)). In cases when
function corresponds to a simple lambda function a generator expression
should be preferred. For example, when it is necessary to eliminate all
multiples of three (x for x in range(100) if x % 3) should be used, instead
of filter(lambda x: x % 3, range(100))

See itertools.filterfalse() for the complementary function that returns
elements of iterable for which function returns false.

Note: For some historical reasons as the first argument you can use None
instead of function, in this case the identity function is assumed. That
is, all elements of iterable that are false are removed which is equivalent
to (item for item in iterable if item). Currently, for the same purpose the
preferred form is `filter(bool, iterable)`.

p.s.:
maybe _function_ should be changed to _callable_.
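
The equivalences described in the proposed wording can be checked with a
short sketch. The helper name `not_multiple_of_three` is invented here for
illustration; `itertools.filterfalse` is the standard-library function the
text refers to.

```python
from itertools import filterfalse


def not_multiple_of_three(x):
    # Returns x % 3, which is 0 (falsey) exactly for multiples of three.
    return x % 3


numbers = range(10)

via_filter = list(filter(not_multiple_of_three, numbers))
via_genexp = list(x for x in numbers if x % 3)
complement = list(filterfalse(not_multiple_of_three, numbers))

print(via_filter)  # [1, 2, 4, 5, 7, 8]
print(via_genexp)  # [1, 2, 4, 5, 7, 8]
print(complement)  # [0, 3, 6, 9]
```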

With kind regards,
-gdg


Re: Do not promote `None` as the first argument to `filter` in documentation.

2018-03-06 Thread Kirill Balunov
2018-03-06 16:35 GMT+03:00 Chris Green :

> It's 'deprecation', depreciation is something quite different.  People
> replying have spelt it correctly so you might possibly have noticed I
> thought/hoped.
>
> ... and it does matter a bit because it's not just a mis-spelling, the
> word you are using has its own meaning and could thus cause confusion.
>
> ... and, yes, I know it's a very common and easily made mistake. :-)
>
>
I did not ;) Thank you!

With kind regards,
-gdg


Re: Do not promote `None` as the first argument to `filter` in documentation.

2018-03-06 Thread Kirill Balunov
2018-03-06 16:51 GMT+03:00 Chris Angelico :

> On Wed, Mar 7, 2018 at 12:23 AM, Kirill Balunov 
> wrote:
> > Filter is generally faster than list comprehension or generators.
> >
> > %timeit [*filter(lambda x: x % 3, range(1000))]
> > 100 µs ± 16.4 ns per loop (mean ± std. dev. of 7 runs, 1 loops each)
> >
> > f = lambda x: x % 3
> >
> > %timeit [*(f(i) for i in range(1000))]
> > 132 µs ± 73.5 ns per loop (mean ± std. dev. of 7 runs, 1 loops each)
> >
> > %timeit [f(i) for i in range(1000)]
> > 107 µs ± 179 ns per loop (mean ± std. dev. of 7 runs, 1 loops each)
> >
>
> These don't do the same thing, though. A more comparable comprehension is:
>
> [i for i in range(1000) if i % 3]
>
> rosuav@sikorsky:~$ python3 -m timeit '[i for i in range(1000) if i % 3]'
> 1 loops, best of 5: 34.5 usec per loop
> rosuav@sikorsky:~$ python3 -m timeit '[*filter(lambda x: x % 3,
> range(1000))]'
> 5000 loops, best of 5: 81.1 usec per loop
>
> And my point about comprehensions was that you do NOT use a pointless
> function for them - you just have inline code. If there is a
> pre-existing function, sure! Use it. But when you use filter or map
> with a lambda function, you should probably use a comprehension
> instead.
>
> ChrisA
> --
> https://mail.python.org/mailman/listinfo/python-list
>

Thank you, I did not understand you at first, now everything is clear. In
this sense of `x % 3`, I fully agree with you.

With kind regards,
-gdg


Re: Do not promote `None` as the first argument to `filter` in documentation.

2018-03-06 Thread Jason Friedman
On Tue, Mar 6, 2018 at 1:52 AM, Kirill Balunov 
wrote:

>
> I propose to delete all references in the `filter` documentation that the
> first argument can be `None`, with possible depreciation of `None` as the
> first argument - FutureWarning in Python 3.8+ and deleting this option
> in Python 4. Personally, regarding the last point - depreciation, I do not
> see this as a great necessity, but I do not find that the option with
> `None`
> should be offered and suggested through the documentation. Instead, it is
> better to show an example with using `filter(bool, iterable)` which is
> absolutely
> equivalent, more readable, but a little bit slower.
>
> Currently documentation for `None` case uses `identity function is
> assumed`, what is this `identity` and how it is consistent with
> truthfulness?
>
> In addition, this change makes the perception of `map` and `filter` more
> consistent, with the rule that the first argument must be `callable`.
>
> I see only one point in favor of `None`: since `None` is a keyword, the behavior
> of `filter(None, iterable)` is always guaranteed, but with `bool` it is
> not. Nevertheless, we are all adults here.
>

I won't pretend I am qualified to debate the technical aspects here, but
regarding this snippet:

... Instead, it is better to show an example with using `filter(bool,
iterable)` which is absolutely equivalent, more readable ...

as an ordinary Python user I'd be interested in improvements to the
documentation, including suggestions on real-world usage.  For example,
Chris Angelico below says in part:

... that said, though, any use of filter() that involves a lambda
function should
probably become list comps or genexps, so filter itself should only be used
when ...

Kirill, taking deprecation/removal off the table, what changes would you
recommend to the documentation?


Re: Do not promote `None` as the first argument to `filter` in documentation.

2018-03-06 Thread Chris Angelico
On Wed, Mar 7, 2018 at 12:23 AM, Kirill Balunov  wrote:
> Filter is generally faster than list comprehension or generators.
>
> %timeit [*filter(lambda x: x % 3, range(1000))]
> 100 µs ± 16.4 ns per loop (mean ± std. dev. of 7 runs, 1 loops each)
>
> f = lambda x: x % 3
>
> %timeit [*(f(i) for i in range(1000))]
> 132 µs ± 73.5 ns per loop (mean ± std. dev. of 7 runs, 1 loops each)
>
> %timeit [f(i) for i in range(1000)]
> 107 µs ± 179 ns per loop (mean ± std. dev. of 7 runs, 1 loops each)
>

These don't do the same thing, though. A more comparable comprehension is:

[i for i in range(1000) if i % 3]

rosuav@sikorsky:~$ python3 -m timeit '[i for i in range(1000) if i % 3]'
1 loops, best of 5: 34.5 usec per loop
rosuav@sikorsky:~$ python3 -m timeit '[*filter(lambda x: x % 3, range(1000))]'
5000 loops, best of 5: 81.1 usec per loop

And my point about comprehensions was that you do NOT use a pointless
function for them - you just have inline code. If there is a
pre-existing function, sure! Use it. But when you use filter or map
with a lambda function, you should probably use a comprehension
instead.
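
For anyone who wants to repeat the comparison outside the shell, a rough
sketch using the standard `timeit` module follows. The function
`is_not_multiple_of_three` is a hypothetical stand-in for a "pre-existing
function"; the numbers it prints depend entirely on the machine and Python
version, so none are shown here.

```python
import timeit


def is_not_multiple_of_three(x):
    # Stand-in for a function that already exists in real code.
    return x % 3


candidates = {
    "list comprehension": "[i for i in range(1000) if i % 3]",
    "filter + lambda": "[*filter(lambda x: x % 3, range(1000))]",
    "filter + named function": "[*filter(is_not_multiple_of_three, range(1000))]",
}

loops = 1000
for label, stmt in candidates.items():
    # globals=globals() lets the timed statement see the named function.
    seconds = timeit.timeit(stmt, number=loops, globals=globals())
    print(f"{label}: {seconds / loops * 1e6:.1f} usec per loop")
```

All three statements build the same list, so only the per-loop time differs.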

ChrisA


Re: Do not promote `None` as the first argument to `filter` in documentation.

2018-03-06 Thread Chris Green
Kirill Balunov  wrote:
> 
> As I wrote, __possible depreciation__, I also do not see the point of just

It's 'deprecation', depreciation is something quite different.  People
replying have spelt it correctly so you might possibly have noticed I
thought/hoped.

... and it does matter a bit because it's not just a mis-spelling, the
word you are using has its own meaning and could thus cause confusion.

... and, yes, I know it's a very common and easily made mistake. :-)

-- 
Chris Green


Re: Do not promote `None` as the first argument to `filter` in documentation.

2018-03-06 Thread Kirill Balunov
2018-03-06 13:18 GMT+03:00 Chris Angelico :

> The identity function is:
>
> filter(lambda x: x, range(10))
>
> How is it consistent with truthiness? Exactly the same way the
> underlying object is. There's no requirement for the predicate
> function to return True or False - it's perfectly acceptable, for
> instance, to do this:
>
> filter(lambda x: x % 3, range(10))
>
> to eliminate all multiples of three.
>

Yes, there is no need to return True or False, but in the case of `None`
and `bool` there will be no difference under the hood, and the form with
`bool` is much more readable.


>
> That said, though, any use of filter() that involves a lambda function
> should probably become list comps or genexps, so filter itself should
> only be used when there really IS a pre-existing function that does
> the job.


Filter is generally faster than list comprehension or generators.

%timeit [*filter(lambda x: x % 3, range(1000))]
100 µs ± 16.4 ns per loop (mean ± std. dev. of 7 runs, 1 loops each)

f = lambda x: x % 3

%timeit [*(f(i) for i in range(1000))]
132 µs ± 73.5 ns per loop (mean ± std. dev. of 7 runs, 1 loops each)

%timeit [f(i) for i in range(1000)]
107 µs ± 179 ns per loop (mean ± std. dev. of 7 runs, 1 loops each)


> So, for instance, you could strip out every occurrence of the
> string "0" with:
>
> filter(int, list_of_strings)
>
> And that still depends on the normal Python rules for boolification.
> If that's valid, then it should be just as viable to say
> "filter(identity-function, ...)", which is spelled "filter(None,
> ...)".
>
> ChrisA
> --
> https://mail.python.org/mailman/listinfo/python-list
>


Re: Do not promote `None` as the first argument to `filter` in documentation.

2018-03-06 Thread Kirill Balunov
2018-03-06 14:17 GMT+03:00 Steven D'Aprano <
steve+comp.lang.pyt...@pearwood.info>:

> On Tue, 06 Mar 2018 11:52:22 +0300, Kirill Balunov wrote:
>
> > I propose to delete all references in the `filter` documentation that
> > the first argument can be `None`, with possible depreciation of `None`
> > as the first argument - FutureWarning in Python 3.8+ and deleting
> > this option in Python 4.
>
> Even if we agreed that it is unfortunate that filter accepts None as an
> argument, since it does (and has done since Python 1.0) there is nothing
> to be gained by deprecating and removing it.
>
> Deprecating and removing it will break code that currently works, for no
> good reason; removing the documentation is unacceptable, as that makes it
> too difficult for people to find out what `filter(None, values)` does.
>

As I wrote, __possible depreciation__, I also do not see the point of just
breaking someone's code. But I didn't see any benefit to explicitly promote
`filter(None, iterable)` form in the documentation as a good style.


> > Instead, it is better to show an example with using
> > `filter(bool, iterable)` which is absolutely
> > equivalent, more readable, but a little bit slower.
>
> So long as `filter(None, ...)` is still documented, I don't mind what
> example is given.
>
> But the idiom `filter(None, ...)` is an old, common idiom, very familiar
> to many people who have a background in functional programming.
>

This form may be a familiar and common idiom for those who know Python from
versions < 2.3, before the `bool` type was introduced, but it looks odd to
newcomers and is not obvious at a glance. In functional programming
we use a predicate, and `None` does not match the definition of a predicate,
while `bool` does!


> It is unfortunate that filter takes the arguments in the order it does.
> Perhaps it would have been better to write it like this:
>
> def filter(iterable, predicate=None):
>     ...
>
>
> Then `filter(values, None)` would be a standard Python idiom, explicitly
> saying to use the default predicate function. There is no difference to
> `filter(None, values)` except the order is (sadly) reversed.
>

If such a form was in Python, I probably would agree with you. Although in
its present form I like it a lot more and find it more intuitive.

> Currently documentation for `None` case uses `identity function is
> > assumed`, what is this `identity` and how it is consistent with
> > truthfulness?
>
> The identity function is a mathematical term for a function that returns
> its argument unchanged:
>
> def identity(x):
>     return x
>
> So `filter(func, values)` filters according to func(x); using None
> instead filters according to x alone, without the expense of calling a do-
> nothing function:
>
> # slow because it has to call the lambda function each time;
> filter(lambda x: x, values)
>
> # fast because filter takes an optimized path
> filter(None, values)
>


> Since filter filters according to the truthy or falsey value of x, it
> isn't actually necessary to call bool(x). In Python, all values are
> automatically considered either truthy or falsey. The reason to call
> bool() is to ensure you have a canonical True/False value, and there's no
> need for that here.


I went a bit overboard with the question of what the identity function is :)
But I have a feeling that I perceive all of the above quite differently in the
context of the `filter` function. Since filter filters according to the
truthy or falsey value of x, `None` and `bool` should behave in a totally
equivalent way under the hood, and I'm 99% sure that they do.


> So the identity function should be preferred to bool,
> for those who understand two things:
>
> - the identity function (using None as the predicate function)
>   returns x unchanged;
>

Sorry, but how does the above relate to the `filter` discussion?


>
> - and that x, like all values, automatically has a truthy value in a
>   boolean context (which includes filter).
>
>
Yes, and that is why there is no point in `None`, since `None` and `bool` will
do the same thing in the context of the `filter` function.


> > In addition, this change makes the perception of `map` and `filter` more
> > consistent, with the rule that the first argument must be `callable`.
>
> I consider that a flaw in map. map should also accept None as the
> identity function, so that map(None, iterable) returns the values of
> iterable unchanged.
>
> def map(function=None, *iterables):
>     if len(iterables) == 0:
>         raise TypeError("map() must have at least two arguments.")
>     if function is None:
>         if len(iterables) > 1:
>             return zip(*iterables)
>         else:
>             assert len(iterables) == 1
>             return iter(iterables[0])
>     elif len(iterables) > 1:
>         return (function(*args) for args in zip(*iterables))
>     else:
>         assert len(iterables) == 1
>         return (function(arg) for arg in iterables[0])
>

And what will be the practical reason to have this? :)


Re: Do not promote `None` as the first argument to `filter` in documentation.

2018-03-06 Thread Steven D'Aprano
On Tue, 06 Mar 2018 11:52:22 +0300, Kirill Balunov wrote:

> I propose to delete all references in the `filter` documentation that
> the first argument can be `None`, with possible depreciation of `None`
> as the first argument - FutureWarning in Python 3.8+ and deleting
> this option in Python 4.

Even if we agreed that it is unfortunate that filter accepts None as an 
argument, since it does (and has done since Python 1.0) there is nothing 
to be gained by deprecating and removing it.

Deprecating and removing it will break code that currently works, for no 
good reason; removing the documentation is unacceptable, as that makes it 
too difficult for people to find out what `filter(None, values)` does.


> Instead, it is better to show an example with using
> `filter(bool, iterable)` which is absolutely
> equivalent, more readable, but a little bit slower.

So long as `filter(None, ...)` is still documented, I don't mind what 
example is given.

But the idiom `filter(None, ...)` is an old, common idiom, very familiar 
to many people who have a background in functional programming.

It is unfortunate that filter takes the arguments in the order it does. 
Perhaps it would have been better to write it like this:

def filter(iterable, predicate=None):
    ...


Then `filter(values, None)` would be a standard Python idiom, explicitly 
saying to use the default predicate function. There is no difference to 
`filter(None, values)` except the order is (sadly) reversed.


> Currently documentation for `None` case uses `identity function is
> assumed`, what is this `identity` and how it is consistent with
> truthfulness?

The identity function is a mathematical term for a function that returns 
its argument unchanged:

def identity(x):
    return x

So `filter(func, values)` filters according to func(x); using None
instead filters according to x alone, without the expense of calling a
do-nothing function:

# slow because it has to call the lambda function each time;
filter(lambda x: x, values)

# fast because filter takes an optimized path
filter(None, values)


Since filter filters according to the truthy or falsey value of x, it 
isn't actually necessary to call bool(x). In Python, all values are 
automatically considered either truthy or falsey. The reason to call 
bool() is to ensure you have a canonical True/False value, and there's no 
need for that here. So the identity function should be preferred to bool, 
for those who understand two things:

- the identity function (using None as the predicate function) 
  returns x unchanged;

- and that x, like all values, automatically has a truthy value in a
  boolean context (which includes filter).
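
The two points above can be checked directly. This is just a sketch with an
arbitrary sample list; `identity` is the function defined earlier in this
message, restated so the snippet runs on its own.

```python
def identity(x):
    # Returns its argument unchanged.
    return x


values = [0, 1, "", "a", [], [1], None]

# filter tests the truthiness of whatever the predicate returns, so an
# explicit identity function and None select the same elements.
assert list(filter(identity, values)) == list(filter(None, values))

# bool(x) only canonicalises to True/False; the selection is unchanged.
assert list(filter(bool, values)) == list(filter(None, values))

print(list(filter(None, values)))  # [1, 'a', [1]]
```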


> In addition, this change makes the perception of `map` and `filter` more
> consistent, with the rule that the first argument must be `callable`.

I consider that a flaw in map. map should also accept None as the 
identity function, so that map(None, iterable) returns the values of 
iterable unchanged.

def map(function=None, *iterables):
    if len(iterables) == 0:
        raise TypeError("map() must have at least two arguments.")
    if function is None:
        if len(iterables) > 1:
            return zip(*iterables)
        else:
            assert len(iterables) == 1
            return iter(iterables[0])
    elif len(iterables) > 1:
        return (function(*args) for args in zip(*iterables))
    else:
        assert len(iterables) == 1
        return (function(arg) for arg in iterables[0])


-- 
Steve



Re: Do not promote `None` as the first argument to `filter` in documentation.

2018-03-06 Thread Chris Angelico
On Tue, Mar 6, 2018 at 7:52 PM, Kirill Balunov  wrote:
> This thought occurred to me several times, but I could not decide to write.
> And since `filter` is a builtin, I think this change should be discussed
> here, before opening an issue on bug tracker.
>
> I propose to delete all references in the `filter` documentation that the
> first argument can be `None`, with possible depreciation of `None` as the
> first argument - FutureWarning in Python 3.8+ and deleting this option
> in Python 4. Personally, regarding the last point - depreciation, I do not
> see this as a great necessity, but I do not find that the option with `None`
> should be offered and suggested through the documentation. Instead, it is
> better to show an example with using `filter(bool, iterable)` which is
> absolutely
> equivalent, more readable, but a little bit slower.
>
> %timeit [*filter(None, range(10))]
> 503 ns ± 0.259 ns per loop (mean ± std. dev. of 7 runs, 100 loops each)
>
> %timeit [*filter(bool, range(10))]
> 512 ns ± 1.09 ns per loop (mean ± std. dev. of 7 runs, 100 loops each)
>
> Currently documentation for `None` case uses `identity function is
> assumed`, what is this `identity` and how it is consistent with
> truthfulness?

The identity function is:

filter(lambda x: x, range(10))

How is it consistent with truthiness? Exactly the same way the
underlying object is. There's no requirement for the predicate
function to return True or False - it's perfectly acceptable, for
instance, to do this:

filter(lambda x: x % 3, range(10))

to eliminate all multiples of three.

That said, though, any use of filter() that involves a lambda function
should probably become list comps or genexps, so filter itself should
only be used when there really IS a pre-existing function that does
the job. So, for instance, you could strip out every occurrence of the
string "0" with:

filter(int, list_of_strings)

And that still depends on the normal Python rules for boolification.
If that's valid, then it should be just as viable to say
"filter(identity-function, ...)", which is spelled "filter(None,
...)".
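
A quick sketch of the filter(int, ...) case, with sample strings made up for
illustration. Two things are worth noting: any string whose integer value is
zero is dropped (not only the literal "0"), and a non-numeric string raises
ValueError rather than being filtered out.

```python
list_of_strings = ["0", "1", "00", "42", "-0", "7"]

# int(s) is 0 (falsey) for "0", "00" and "-0", so those are dropped.
kept = list(filter(int, list_of_strings))
print(kept)  # ['1', '42', '7']

# A string that int() cannot parse makes the predicate raise.
try:
    list(filter(int, ["1", "abc"]))
except ValueError as exc:
    print("ValueError:", exc)
```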

ChrisA


Do not promote `None` as the first argument to `filter` in documentation.

2018-03-06 Thread Kirill Balunov
This thought occurred to me several times, but I could not decide to write.
And since `filter` is a builtin, I think this change should be discussed
here, before opening an issue on bug tracker.

I propose to delete all references in the `filter` documentation that the
first argument can be `None`, with possible depreciation of `None` as the
first argument - FutureWarning in Python 3.8+ and deleting this option
in Python 4. Personally, regarding the last point - depreciation, I do not
see this as a great necessity, but I do not find that the option with `None`
should be offered and suggested through the documentation. Instead, it is
better to show an example with using `filter(bool, iterable)` which is
absolutely
equivalent, more readable, but a little bit slower.

%timeit [*filter(None, range(10))]
503 ns ± 0.259 ns per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit [*filter(bool, range(10))]
512 ns ± 1.09 ns per loop (mean ± std. dev. of 7 runs, 100 loops each)

Currently documentation for `None` case uses `identity function is
assumed`, what is this `identity` and how it is consistent with
truthfulness?

In addition, this change makes the perception of `map` and `filter` more
consistent, with the rule that the first argument must be `callable`.

I see only one point in favor of `None`: since `None` is a keyword, the behavior
of `filter(None, iterable)` is always guaranteed, but with `bool` it is
not. Nevertheless, we are all adults here.

With kind regards,
-gdg