Re: [Python-ideas] Idea: Importing from arbitrary filenames

2018-04-15 Thread Nick Coghlan
On 14 April 2018 at 19:22, Steve Barnes  wrote:
> I generally love the current import system for "just working" regardless
> of platform, installation details, etc., but what I would like to see is
> a clear import local, (as opposed to import from wherever you can find
> something to satisfy mechanism). This is the one thing that I miss from
> C/C++ where #include <x> is system includes and #include "x" search
> differing include paths, (if used well).

For the latter purpose, we prefer that folks use either explicit
relative imports (if they want to search the current package
specifically), or else direct manipulation of package.__path__.

That is, if you do:

from . import custom_imports # Definitely from your own project
custom_imports.__path__[:] = (some_directory, some_other_directory)

then:

from .custom_imports import name

will search those directories for packages & modules to import, while
still cleanly mapping to a well-defined location in the module
namespace for the process as a whole (and hence being able to use all
the same caches as other imports, without causing name conflicts or
other problems).
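
A minimal sketch of the package.__path__ technique described above. Everything in it is illustrative rather than taken from the thread: the temporary directory, the file name.py and its VALUE attribute are made up, and the custom_imports package is synthesised with types.ModuleType only so the example is self-contained (in a real project it would be an ordinary package in your own source tree).

```python
import importlib
import os
import sys
import tempfile
import types

# Hypothetical directory containing a single module, name.py.
some_directory = tempfile.mkdtemp()
with open(os.path.join(some_directory, "name.py"), "w") as f:
    f.write("VALUE = 42\n")

# Stand-in for "from . import custom_imports": an empty package object
# registered under a well-defined name in sys.modules.
custom_imports = types.ModuleType("custom_imports")
custom_imports.__path__ = [some_directory]
sys.modules["custom_imports"] = custom_imports

importlib.invalidate_caches()  # name.py was created after interpreter startup
name = importlib.import_module(".name", package="custom_imports")

print(name.VALUE)     # attribute defined in name.py
print(name.__name__)  # where the module lives in the module namespace
```

Because the imported module is registered under the package's dotted name, a later import of the same name should reuse the cached module rather than loading the file again.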

If you want to do this dynamically relative to the current module,
then it's possible to do:

global __path__
__path__[:] = (some_directory, some_other_directory)
custom_mod = importlib.import_module(".name", package=__name__)

The discoverability of these kinds of techniques could definitely
stand to be improved, but the benefit of adopting them is that they
work on all currently supported versions of Python (even
importlib.import_module exists in Python 2.7 as a convenience wrapper
around __import__), rather than needing to wait for new language level
syntax for them.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Idea: Importing from arbitrary filenames

2018-04-15 Thread Nick Coghlan
On 15 April 2018 at 17:12, Nick Coghlan  wrote:
> If you want to do this dynamically relative to the current module,
> then it's possible to do:
>
> global __path__
> __path__[:] = (some_directory, some_other_directory)
> custom_mod = importlib.import_module(".name", package=__name__)

Copy and paste error there: to make this work in non-package modules,
drop the "[:]" from the __path__ assignment.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)

2018-04-15 Thread Nick Coghlan
On 15 April 2018 at 13:54, Chris Angelico  wrote:
> On Sun, Apr 15, 2018 at 1:08 PM, Nick Coghlan  wrote:
>> === Target first, 'from' keyword ===
>>
>> while (value from read_next_item()) is not None: # New
>> ...
>>
>> Pros:
>>
>>   * avoids the syntactic ambiguity of "as"
>>   * being target first provides an obvious distinction from the "as" keyword
>>   * typically reads nicely as pseudocode
>>   * "from" is already associated with a namebinding operation ("from
>> module import name")
>>
>> Cons:
>>
>>   * I'm sure we'll think of some more, but all I have so far is that
>> the association with name binding is relatively weak and would need to
>> be learned
>>
>
> Cons: Syntactic ambiguity with "raise exc from otherexc", probably not 
> serious.

Ah, I forgot about that usage. The keyword usage is at least somewhat
consistent, in that it's short for:

_exc = exc
_exc.__cause__ = otherexc
raise _exc

However, someone writing "raise (ExcType from otherexc)" could be
confusing, since it would end up re-raising "otherexc" instead of
wrapping it in a new ExcType instance. If "otherexc" was also an
ExcType instance, that would be a *really* subtle bug to try and
catch, so this would likely need the same kind of special casing as
was proposed for "as" (i.e. prohibiting the top level parentheses).
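
For reference, a small sketch of what the existing "raise ... from ..." statement does today. ExcType is a hypothetical exception class named after the example in the text, and the messages are made up.

```python
class ExcType(Exception):
    """Hypothetical wrapper exception, named after the example above."""


otherexc = ValueError("original problem")

try:
    raise ExcType("wrapped") from otherexc
except ExcType as caught:
    wrapped = caught  # keep a reference; `caught` is unbound after the block

# The new exception is the one raised; the old one is recorded as its cause.
print(type(wrapped).__name__)
print(wrapped.__cause__ is otherexc)
```

A parenthesised "from" expression would bind and re-raise otherexc itself instead, which is the subtle difference described above.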

I also agree with Nathan that if you hadn't encountered from
expressions before, it would be reasonable to assume they were
semantically comparable to "target = next(expr)" rather than just
"target = expr".

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


[Python-ideas] Rewriting file - pythonic way

2018-04-15 Thread Alexey Shrub

Hi all,

I am new to Python (I am moving from the Perl world), but I have always loved 
Python for its high-level, beautiful and clean syntax.

Now I have a question/idea about working with files.
In my opinion this is a very popular use case:
1. Open file (for read and write)
2. Read data from file
3. Modify data.
4. Rewrite the file with the modified data.

But currently it does not look very pythonic:

with open(filename, 'r+') as file:
   data = file.read()
   data = data.replace('old', 'new')
   file.seek(0)
   file.write(data)
   file.truncate()

or something like this:

with open(filename) as file:
   data = file.read()
data = data.replace('old', 'new')
with open(filename, 'w') as file:
   file.write(data)

I think the best way would be something like this:

with open(filename, 'r+') as file:
   data = file.read()
   data = data.replace('old', 'new')
   file.rewrite(data)

but for this, io.BufferedIOBase would need a rewrite method.

What do you think about this?





Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)

2018-04-15 Thread Kirill Balunov
2018-04-15 6:08 GMT+03:00 Nick Coghlan :

>
> It's not completely off topic, as it's due to the fact we use "," to
> separate both context managers and items in a tuple, so "with (cm1,
> cm2, cm3):" is currently legal syntax that means something quite
> different from "with cm1, cm2, cm3:". While using the parenthesised
> form is *pointless* (since it will blow up at runtime due to tuples
> not being context managers), the fact it's syntactically valid makes
> us more hesitant to add the special case around parentheses handling
> than we were for import statements. The relevance to PEP 572 is as a
> reminder that since we *really* don't like to add yet more different
> cases to "What do parentheses indicate in Python?"


Although "with (cm1, cm2, cm3):" is currently legal syntax, as you said, and
as was also noted earlier in this thread, it is "pointless" in 99% of cases
(in the context of the with statement) and will fail at runtime. Therefore,
regardless of this PEP, maybe it would be fair to make it at least a
`SyntaxWarning` (or `SyntaxError`)? Better to fail sooner than later.


we should probably
> show similar hesitation when it comes to giving ":" yet another
> meaning.
>
>
Yes, `:` is used (as a symbol) in a lot of places (in fact there is not
much in it), but in some places Python looks like a combination of words and
colons.


> P.S. The pros and cons of the current syntax proposals, as I see them:
>
> === Expression first, 'as' keyword ===
>
> while (read_next_item() as value) is not None:
> ...
>
> Pros:
>
>   * typically reads nicely as pseudocode
>   * "as" is already associated with namebinding operations
>
>
I understand that this list is subjective. But for me, it is a huge
PRO that the expression comes first.


> Cons:
>
>   * syntactic ambiguity in with statement headers (major concern)
>   * encourages a common misunderstanding of how with statements work
> (major concern)
>   * visual similarity between "as" and "and" makes name bindings easy to
> miss
>   * syntactic ambiguity in except clause headers theoretically exists,
> but is less of a concern due to the consistent type difference that
> makes the parenthesised form pointless
>
>
In reality, the first two points can be explained (if that is required
at all). Misunderstanding is a consequence of a lack of experience. I don't
understand the point about the visual similarity between "as" and "and";
can you elaborate on it a little more?



> === Expression first, '->' symbol ===
>
> while (read_next_item() -> value) is not None:
> ...
>
> Pros:
>
>   * avoids the syntactic ambiguity of "as"
>   * "->" is used for name bindings in at least some other languages
> (but this is irrelevant to users for whom Python is their first, and
> perhaps only, programming language)
>
>
The same as before: that the expression comes first is a huge PRO for me and,
I'm sure, for many others too. On the second point, I agree that it is
somewhat irrelevant.


> Cons:
>
>   * doesn't read like pseudocode (you need to interpret an arbitrary
> non-arithmetic symbol)
>

Here I disagree with you a bit. The most common form of assignment in
formal pseudo-code is `name <- expr`. The second most common form, to my
regret, is `:=`. The `<-` form is not possible in Python, and that is why
`expr -> name` was suggested.


>   * invites the question "Why doesn't this use the 'as' keyword?"
>

All forms invite this question :)))


>   * symbols are typically harder to look up than keywords
>   * symbols don't lend themselves to easy mnemonics
>   * somewhat arbitrary repurposing of "->" compared to its use in
> function annotations
>
The last one is a major concern. I think that is why Guido is so
skeptical about this form.


> === Target first, ':=' symbol ===
>
> while (value := read_next_item()) is not None:
> ...
>
> Pros:
>
>   * avoids the syntactic ambiguity of "as"
>   * being target first provides an obvious distinction from the "as"
> keyword
>

For me it is a CON. Originally the rationale of this PEP was to reduce the
number of unnecessary calculations and to provide a useful syntax for
name binding in appropriate places. It should not, in any way, replace the
existing `=` as the usual way to bind a name. Therefore, as I see it, one of
the design goals is to make the syntax of the `assignment statement` and the
`assignment expression` distinct, and `:=` does not help with this. This does
not mean that the new syntax should not be convenient, but it should be
different from the usual `=` form.


>   * ":=" is used for name bindings in at least some other languages
> (but this is irrelevant to users for whom Python is their first, and
> perhaps only, language)
>
> Cons:
>
>   * symbols are typically harder to look up than keywords
>   * symbols don't lend themselves to easy mnemonics
>   * subject to a visual "line noise" phenomenon when combined with
> other uses of ":" as 

Re: [Python-ideas] Rewriting file - pythonic way

2018-04-15 Thread Serhiy Storchaka

15.04.18 11:57, Alexey Shrub wrote:
> I am new to Python (I am moving from the Perl world), but I have always loved
> Python for its high-level, beautiful and clean syntax.
>
> Now I have a question/idea about working with files.
> In my opinion this is a very popular use case:
> 1. Open file (for read and write)
> 2. Read data from file
> 3. Modify data.
> 4. Rewrite the file with the modified data.
>
> But currently it does not look very pythonic:
>
> with open(filename, 'r+') as file:
>     data = file.read()
>     data = data.replace('old', 'new')
>     file.seek(0)
>     file.write(data)
>     file.truncate()


What do you mean by calling this not pythonic?


> I think the best way would be something like this:
>
> with open(filename, 'r+') as file:
>     data = file.read()
>     data = data.replace('old', 'new')
>     file.rewrite(data)
>
> but for this, io.BufferedIOBase would need a rewrite method.


If the problem is that you want to use a single line instead of three 
lines, you can add a function:


def file_rewrite(file, data):
    # Overwrite the whole file: rewind, write the new data, drop any old tail.
    file.seek(0)
    file.write(data)
    file.truncate()

and use it. This looks pretty pythonic to me.
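
A usage sketch of the helper above, assuming a throwaway text file created with tempfile; the file contents and the replacement strings are made up. The new text is deliberately shorter than the old one so that the truncate() call matters.

```python
import os
import tempfile


def file_rewrite(file, data):
    # Overwrite the whole file: rewind, write the new data, drop any old tail.
    file.seek(0)
    file.write(data)
    file.truncate()


# Create a small scratch file to work on.
fd, filename = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as file:
    file.write("very old data\n")

with open(filename, "r+") as file:
    data = file.read()
    file_rewrite(file, data.replace("very old", "new"))

with open(filename) as file:
    result = file.read()
os.remove(filename)

print(repr(result))
```

Without the truncate() call, the leftover tail of the longer original text would remain at the end of the file.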



Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)

2018-04-15 Thread Mikhail V
On Sun, Apr 15, 2018 at 12:19 PM, Kirill Balunov
 wrote:
>
>
> 2018-04-15 6:08 GMT+03:00 Nick Coghlan :
>>
>>

>
>>
>> P.S. The pros and cons of the current syntax proposals, as I see them:
>>
>> === Expression first, 'as' keyword ===
>>
>> while (read_next_item() as value) is not None:
>> ...
>>
>> Pros:
>>
>>   * typically reads nicely as pseudocode
>>   * "as" is already associated with namebinding operations
>>
>
> I understand that this list is subjective. But for me, it is a huge PRO
> that the expression comes first.
>
[...]
>>
>> === Expression first, '->' symbol ===
>>
>> while (read_next_item() -> value) is not None:
>> ...
>>
>> Pros:
>>
>>   * avoids the syntactic ambiguity of "as"
>>   * "->" is used for name bindings in at least some other languages
>> (but this is irrelevant to users for whom Python is their first, and
>> perhaps only, programming language)
[...]
>
>>
>>   * invites the question "Why doesn't this use the 'as' keyword?"
>
>
> All forms invite this question :)))


Exactly, all forms invite this and other questions.

First of all, coming back to original spelling choice arguments
[Sorry in advance if I've missed some points in this huge thread]

citation from PEP:
  "Differences from regular assignment statements" [...]
  "Otherwise, the semantics of assignment are unchanged by this proposal."

So basically it's the same Python assignment?
Then obvious solution seems just to propose "=".
But I see Chris has put this in the FAQ section:
"The syntactic similarity between ``if (x == y)`` and ``if (x = y)`` "

So IIUC, the *only* reason is to avoid the similarity between '==' and '='?
If so, then it does not sound convincing at all.
Of course Python does me a favor by showing an error
when I make a typo like this:
if (x = y)

But still, if this is the only real reason, it is not convincing.
Syntactically, I feel strongly that normal '=' would be the way to go.

Just look at this:
y = ((eggs := spam()), (cheese := eggs.method()))
y = ((eggs = spam()), (cheese = eggs.method()))

The latter is so much cleaner, and already so common to any
old or new Python user. And does not raise a
question what this ":=" should really mean.
(Or probably it should raise such question?)

Given the fact that the PEP gives quite edge-case
usage examples only, this should be really more convincing.
And as a side note: I personally find the look of ":=" a bit 'noisy'.


Another point:

*Target first  vs  Expression first*
===

Well, this is nice indeed. Don't you find that first of all it must be
decided what should be the *overall tendency for Python*?
Now we have common "x = a + b" everywhere. Then there
are comprehensions (somewhat mixed direction) and
"foo as bar" things.
But wait, is the tendency to "give the freedom"? Then you should
introduce something like "<--" in the first place so that we can
write normal assignment in both directions.
Or is the tendency to convert Python to the "expression first" generally?

So if this question can be answered first, then I think it will be
more constructive to discuss the choice of particular spellings.



Mikhail


Re: [Python-ideas] Rewriting file - pythonic way

2018-04-15 Thread Alexey Shrub
On Sunday, 15 Apr 2018 at 12:40, Serhiy Storchaka wrote:

> If the problem is that you want to use a single line instead of three
> lines, you can add a function


Yes, I think that a single line with the word 'rewrite' is much more readable 
than those three lines.
And yes, I can write my own function, but it is a typical task - maybe it 
should be in the standard library?




Re: [Python-ideas] A cute Python implementation of itertools.tee

2018-04-15 Thread Antoine Pitrou
On Sun, 15 Apr 2018 00:05:58 -0500
Tim Peters  wrote:

> Just for fun - no complaint, no suggestion, just sharing a bit of code
> that tickled me.
> 
> The docs for `itertools.tee()` contain a Python work-alike, which is
> easy to follow.  It gives each derived generator its own deque, and
> when a new value is obtained from the original iterator it pushes that
> value onto each of those deques.
> 
> Of course it's possible for them to share a single deque, but the code
> gets more complicated.  Is it possible to make it simpler instead?
> 
> What it "really" needs is a shared singly-linked list of values,
> pointing from oldest value to newest.  Then each derived generator can
> just follow the links, and yield its next result in time independent
> of the number of derived generators.  But how do you know when a new
> value needs to be obtained from the original iterator, and how do you
> know when space for an older value can be recycled (because all of the
> derived generators have yielded it)?
> 
> I ended up with almost a full page of code to do that, storing with
> each value and link a count of the number of derived generators that
> had yet to yield the value, effectively coding my own reference-count
> scheme by hand, along with "head" and "tail" pointers to the ends of
> the linked list that proved remarkably tricky to keep correct in all
> cases.
> 
> Then I thought "this is stupid!  Python already does reference
> counting."  Voila!  Vast swaths of tedious code vanished, giving this
> remarkably simple implementation:

This implementation doesn't work with Python 3.7 or 3.8.
I've tried it here:
https://gist.github.com/pitrou/b3991f638300edb6d06b5be23a4c66d6

and get:
Traceback (most recent call last):
  File "mytee.py", line 14, in gen
    mylast = last[1] = last = [next(it), None]
StopIteration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "mytee.py", line 47, in <module>
    run(mytee1)
  File "mytee.py", line 36, in run
    lists[i].append(next(iters[i]))
RuntimeError: generator raised StopIteration

(Yuck!)

In short, you want the following instead:

try:
    mylast = last[1] = last = [next(it), None]
except StopIteration:
    return

> def mytee(xs, n):
>     last = [None, None]
> 
>     def gen(it, mylast):
>         nonlocal last
>         while True:
>             mylast = mylast[1]
>             if not mylast:
>                 mylast = last[1] = last = [next(it), None]

That's smart and obscure :-o
The way it works is that the `last` assignment changes the `last` value
seen by all derived generators, while the `last[1]` assignment updates
the bindings made in the other generators' `mylast` lists...  It's
difficult to find the words to explain it.

The chained assignment makes it more difficult to parse as well (when I
read this I don't know if `last[1]` or `last` gets assigned first;
apparently the answer is `last[1]`, otherwise the recipe wouldn't work
correctly).  Perhaps like this:

while True:
    mylast = mylast[1]
    if not mylast:
        try:
            # Create new list link
            mylast = [next(it), None]
        except StopIteration:
            return
        else:
            # Append to other generators `mylast` linked lists
            last[1] = mylast
            # Update shared list link
            last = last[1]
    yield mylast[0]
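
Putting the pieces together, a complete version of the recipe with the StopIteration handling could look like the sketch below. The last two lines of mytee (creating the shared iterator and returning one generator per consumer) are not quoted in this message, so they are an assumption about how the recipe is wired up; the small driver at the end is also made up.

```python
def mytee(xs, n):
    # Sentinel link: [value, next_link]. Every generator starts from it.
    last = [None, None]

    def gen(it, mylast):
        nonlocal last
        while True:
            mylast = mylast[1]  # follow the shared linked list
            if not mylast:
                try:
                    # Create new list link
                    mylast = [next(it), None]
                except StopIteration:
                    return  # required since PEP 479 (Python 3.7+)
                # Append the link to the shared list, then advance `last`
                last[1] = mylast
                last = mylast
            yield mylast[0]

    it = iter(xs)  # assumed wiring, not shown in the quoted fragment
    return tuple(gen(it, last) for _ in range(n))


a, b = mytee(range(5), 2)
first_two = [next(a), next(a)]  # `a` runs ahead of `b`
rest_b = list(b)                # `b` replays the links, then pulls new ones
rest_a = list(a)                # `a` picks up the links that `b` created
print(first_two, rest_b, rest_a)
```

Links that every generator has moved past are no longer referenced and can be reclaimed by ordinary reference counting, which is the point of the original recipe.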


Regards

Antoine.




Re: [Python-ideas] Rewriting file - pythonic way

2018-04-15 Thread Serhiy Storchaka

15.04.18 12:49, Alexey Shrub wrote:
> On Sunday, 15 Apr 2018 at 12:40, Serhiy Storchaka wrote:
>> If the problem is that you want to use a single line instead of three
>> lines, you can add a function
>
> Yes, I think that a single line with the word 'rewrite' is much more readable
> than those three lines.
> And yes, I can write my own function, but it is a typical task - maybe it
> should be in the standard library?


Not every three lines of code must be a function in the standard library. 
And these three lines don't look common enough.


Actually, reliable code should write into a separate file and replace 
the original file with the new file only if writing is successful. Or 
back up the old file and restore it if writing fails. Or do both. And 
handle hard and soft links if necessary. And use file locks if needed to 
prevent race conditions when the file is read/written by different 
processes. Depending on the specifics of the application you may need 
different code. Your three lines are enough for a one-time script if the 
risk of a power outage or disk space exhaustion is insignificant or if the 
data is not critical.
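
A minimal sketch of the first option only (write to a separate file, then replace the original if writing succeeded). The function name rewrite_atomically, its parameters and the scratch file are all made up for illustration. It makes no attempt to preserve permissions, handle hard or soft links, keep a backup, or take file locks, so it is nowhere near the full robustness described above.

```python
import os
import tempfile


def rewrite_atomically(path, transform, encoding="utf-8"):
    """Replace the contents of `path` with transform(old_contents)."""
    with open(path, encoding=encoding) as src:
        new_data = transform(src.read())

    # Keep the temporary file next to the original so that os.replace()
    # never has to cross a filesystem boundary.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", encoding=encoding) as tmp:
            tmp.write(new_data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, path)  # only reached if writing succeeded
    except BaseException:
        # Writing or renaming failed: leave the original file untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


# Hypothetical scratch file to demonstrate the call.
workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "settings.txt")
with open(target, "w", encoding="utf-8") as f:
    f.write("old old\n")

rewrite_atomically(target, lambda text: text.replace("old", "new"))
```

If the write fails part-way (for example because the disk is full), the original file is still intact, which the in-place seek/write/truncate version cannot promise.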




Re: [Python-ideas] Rewriting file - pythonic way

2018-04-15 Thread Elazar
This pitfall sounds like a good reason to have such a function in the
standard library.

Elazar

On Sun, 15 Apr 2018 at 13:13, Serhiy Storchaka <storch...@gmail.com> wrote:

> 15.04.18 12:49, Alexey Shrub wrote:
> > On Sunday, 15 Apr 2018 at 12:40, Serhiy Storchaka wrote:
> >> If the problem is that you want to use a single line instead of three
> >> lines, you can add a function
> >
> > Yes, I think that a single line with the word 'rewrite' is much more readable
> > than those three lines.
> > And yes, I can write my own function, but it is a typical task - maybe it
> > should be in the standard library?
>
> Not every three lines of code must be a function in the standard library.
> And these three lines don't look common enough.
>
> Actually, reliable code should write into a separate file and replace
> the original file with the new file only if writing is successful. Or
> back up the old file and restore it if writing fails. Or do both. And
> handle hard and soft links if necessary. And use file locks if needed to
> prevent race conditions when the file is read/written by different
> processes. Depending on the specifics of the application you may need
> different code. Your three lines are enough for a one-time script if the
> risk of a power outage or disk space exhaustion is insignificant or if the
> data is not critical.


Re: [Python-ideas] Rewriting file - pythonic way

2018-04-15 Thread Paul Moore
On 15 April 2018 at 10:49, Alexey Shrub  wrote:
> On Sunday, 15 Apr 2018 at 12:40, Serhiy Storchaka wrote:
>>
>> If the problem is that you want to use a single line instead of three
>> line, you can add a function
>
>
> Yes, I think that single line with word 'rewrite' is much more readable than
> those three lines.
> And yes, I can make my own function, but it is typical task - maybe it must
> be in standard library?

I don't think it's *that* typical. I don't recall even having wanted
to do this in all the time I've been using Python...
Paul


Re: [Python-ideas] Rewriting file - pythonic way

2018-04-15 Thread Paul Moore
On 15 April 2018 at 11:22, Elazar  wrote:
> On Sun, 15 Apr 2018 at 13:13, Serhiy Storchaka wrote:
>> Actually, reliable code should write into a separate file and replace
>> the original file with the new file only if writing is successful. Or
>> back up the old file and restore it if writing fails. Or do both. And
>> handle hard and soft links if necessary. And use file locks if needed to
>> prevent race conditions when the file is read/written by different
>> processes. Depending on the specifics of the application you may need
>> different code. Your three lines are enough for a one-time script if the
>> risk of a power outage or disk space exhaustion is insignificant or if the
>> data is not critical.
>
> This pitfall sounds like a good reason to have such a function in the
> standard library.

It certainly sounds like a good reason for someone to write a "safe
file rewrite" library function. But I don't think that it's such a
common need that it needs to be a stdlib function. It may well even be
the case that there's such a function already available on PyPI - has
anyone actually checked? And if there isn't, then writing a module and
publishing it there would seem like a *very* good starting point - as
well as allowing the developer to thrash out the best API, it would
also provide for lots of testing in unusual scenarios that the
developer may not have thought about (Windows file locking is very
different from Unix, what is an atomic operation differs between
platforms, error handling and retries may be something to consider,
etc).

The result would be a useful package, and the download and activity
stats for it would be a great indication of whether it's a frequent
enough need to justify including in core Python.

IMO, it probably isn't. I suspect that most uses would be fine with
the quoted 3-liner, but very few people would need the sort of
robustness that Serhiy is describing (and that level of robustness
*would* be needed for a stdlib implementation). So PyPI is likely a
better home for the "bulletproof" version, and 3 lines of code is a
perfectly acceptable and Pythonic solution for people with simpler
needs.

Paul


Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)

2018-04-15 Thread Nick Coghlan
On 15 April 2018 at 19:41, Mikhail V  wrote:
> So IIUC, the *only* reason is to avoid the similarity between '==' and '='?
> If so, then it does not sound convincing at all.
> Of course Python does me a favor showing an error,
> when I make a typo like this:
> if (x = y)
>
> But still, if this is the only real reason, it is not convincing.

It's thoroughly convincing, because we're already familiar with the
consequences of folks confusing "=" and "==" when writing C & C++
code. It's an eternal bug magnet, so it's not a design we're ever
going to port over to Python. (And that's even before we get into the
parsing ambiguity problems that attempting to reuse "=" for this
purpose in Python would introduce, especially when it comes to keyword
arguments).

The only way Python will ever gain expression level name binding
support is with a spelling *other than* "=", as when that's the
proposed spelling, the answer will be an unequivocal "No, we're not
adding that".

Even if the current discussion does come up with a potentially
plausible spelling, the final determination on python-dev may *still*
be "No, we're not going to add that". That isn't a predetermined
answer though - it will depend on whether or not a proposal can be
developed that threads the small gap between "this adds too much new
cognitive overhead to reading and writing the language" and "while
this does add more complexity to the base language, it provides
sufficient compensation in allowing common ideas to be expressed more
simply".

> Syntactically, I feel strongly that normal '=' would be the way to go.
>
> Just look at this:
> y = ((eggs := spam()), (cheese := eggs.method()))
> y = ((eggs = spam()), (cheese = eggs.method()))
>
> The latter is so much cleaner, and already so common to any
> old or new Python user.

Consider how close the second syntax is to "y = f(eggs=spam(),
cheese=fromage())", though.
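
To make the keyword-argument comparison concrete, here is a small sketch using the ":=" spelling that PEP 572 eventually adopted (so it needs Python 3.8 or later). The functions f and spam are hypothetical stand-ins, not code from the thread.

```python
def f(*args, **kwargs):
    """Hypothetical function that just reports how it was called."""
    return args, kwargs


def spam():
    return 1


# Keyword argument: "eggs" is a parameter name; nothing is bound in this scope.
kw_call = f(eggs=spam())

# Assignment expression: the value is passed positionally AND bound to `eggs`.
walrus_call = f((eggs := spam()))

print(kw_call)
print(walrus_call, eggs)
```

Had "=" been reused for the expression form, the two calls would differ only by a pair of parentheses while meaning quite different things.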


> Given the fact that the PEP gives quite edge-case
> usage examples only, this should be really more convincing.

The examples in the PEP have been updated to better reflect some of
the key motivating use cases (embedded assignments in if and while
statement conditions, generator expressions, and container
comprehensions).

> And as a side note: I personally find the look of ":=" a bit 'noisy'.

You're not alone in that, which is one of the reasons finding a
keyword based option that's less syntactically ambiguous than "as"
could be an attractive alternative.

> Another point:
>
> *Target first  vs  Expression first*
> ===
>
> Well, this is nice indeed. Don't you find that first of all it must be
> decided what should be the *overall tendency for Python*?
> Now we have common "x = a + b" everywhere. Then there
> are comprehensions (somewhat mixed direction) and
> "foo as bar" things.
> But wait, is the tendency to "give the freedom"? Then you should
> introduce something like "<--" in the first place so that we can
> write normal assignment in both directions.
> Or is the tendency to convert Python to the "expression first" generally?

There's no general tendency towards expression first syntax, nor
towards offering flexibility in whether ordinary assignments are
target first.

All the current cases where we use the "something as target" form are
*not* direct equivalents to "target = something":

* "import dotted.modname as name": also prevents "dotted" getting
bound in the current scope the way it normally would
* "from dotted import modname as name": also prevents "modname"
getting bound in the current scope the way it normally would
* "except exc_filter as exc": binds the caught exception, not the
exception filter
* "with cm as name": binds the result of __enter__ (which may be
self), not the cm directly

Indeed, https://www.python.org/dev/peps/pep-0343/#motivation-and-summary
points out that it's this "not an ordinary assignment" aspect that
led to with statements using the "with cm as name:" structure in the
first place - the original proposal in PEP 310 was for "with name =
cm:" and ordinary assignment semantics.
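
Two of the cases listed above can be checked directly. In this sketch the Resource class and the exception filter tuple are made-up examples; the point is only that the name after "as" is not bound to the expression written before it.

```python
class Resource:
    """Hypothetical context manager whose __enter__ returns something else."""

    def __enter__(self):
        return "result of __enter__"

    def __exit__(self, exc_type, exc, tb):
        return False  # do not suppress exceptions


cm = Resource()
with cm as name:
    with_target = name  # bound to the __enter__ result, not to cm

exc_filter = (KeyError, ValueError)
try:
    raise ValueError("boom")
except exc_filter as exc:
    except_target = exc  # bound to the caught instance, not to the filter

print(with_target is cm)
print(type(except_target).__name__)
```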

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)

2018-04-15 Thread Kirill Balunov
2018-04-15 12:41 GMT+03:00 Mikhail V :

>
> Exactly, all forms invite this and other questions.
>
> First of all, coming back to original spelling choice arguments
> [Sorry in advance if I've missed some points in this huge thread]
>
> citation from PEP:
>   "Differences from regular assignment statements" [...]
>   "Otherwise, the semantics of assignment are unchanged by this proposal."
>
> So basically it's the same Python assignment?
> Then obvious solution seems just to propose "=".
> But I see Chris has put this in the FAQ section:
> "The syntactic similarity between ``if (x == y)`` and ``if (x = y)`` "
>

[OT] To be honest, I never liked the fact that `=` is used for assignment in
various programming languages. But it became so common that I got
used to it and even stopped taking a sedative :)

> So IIUC, the *only* reason is to avoid the similarity between '==' and '='?
> If so, then it does not sound convincing at all.
> Of course Python does me a favor by showing an error
> when I make a typo like this:
> if (x = y)
>
> But still, if this is the only real reason, it is not convincing.
> Syntactically, I feel strongly that normal '=' would be the way to go.
>
> Just look at this:
> y = ((eggs := spam()), (cheese := eggs.method()))
> y = ((eggs = spam()), (cheese = eggs.method()))
>
> The latter is so much cleaner, and already so common to any
> old or new Python user. And does not raise a
> question what this ":=" should really mean.
> (Or probably it should raise such question?)
>
> Given the fact that the PEP gives quite edge-case
> usage examples only, this should be really more convincing.
> And as a side note: I personally find the look of ":=" a bit 'noisy'.
>

You are not alone. On the other hand, it is one of the strengths of Python
that it does not allow such common and hard-to-find bugs. For me personally,
`:=` looks and feels just like a normal assignment statement that can be
used interchangeably, but in many more places in the code. And if the main
goal of the PEP was to offer this `assignment expression` as a future
replacement for the `assignment statement`, the `:=` syntax form would be a
very reasonable proposal (of course in this case there would be a lot more
other questions). But somehow this PEP does not mean it! And with the
current rationale of this PEP it's a huge CON for me that `=` and `:=` feel
and look the same.


>
> Another point:
>
> *Target first  vs  Expression first*
> ===
>
> Well, this is nice indeed. Don't you find that first of all it must be
> decided what should be the *overall tendency for Python*?
> Now we have common "x = a + b" everywhere. Then there
> are comprehensions (somewhat mixed direction) and
> "foo as bar" things.
> But wait, is the tendency to "give the freedom"? Then you should
> introduce something like "<--" in the first place so that we can
> write normal assignment in both directions.
>

As was noted previously, `<-` would not work because of the unary minus on
the right:

>>> x = 10
>>> x <- 5
False
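
To make the parse concrete, here is a small sketch using the standard `ast`
module. It only illustrates that today's Python already reads `x <- 5` as the
comparison `x < (-5)`; the variable names are just for illustration.

```python
import ast

x = 10
# "x <- 5" is tokenized as "x < -5": a comparison against a negated literal.
result = x <- 5

# Inspect the parse tree to confirm how the expression is read.
tree = ast.parse("x <- 5", mode="eval")
comparison = tree.body
is_less_than = isinstance(comparison.ops[0], ast.Lt)
right_is_negation = isinstance(comparison.comparators[0], ast.UnaryOp)

print(result, is_less_than, right_is_negation)
```

So a new `<-` operator would silently change the meaning of code that is
already valid.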



> Or is the tendency to convert Python to the "expression first" generally?
>
> So if this question can be answered first, then I think it will be
> more constructive to discuss the choice of particular spellings.
>

If the idea of the whole PEP was to replace `assignment statement` with
`assignment expression` I would choose name first. If the idea was to offer
an expression with the name-binding side effect, which can be used in the
appropriate places I would choose expression first.

With kind regards,
-gdg


Re: [Python-ideas] Rewriting file - pythonic way

2018-04-15 Thread Nick Coghlan
On 15 April 2018 at 20:47, Paul Moore  wrote:
> On 15 April 2018 at 11:22, Elazar  wrote:
>> On Sun, 15 Apr 2018 at 13:13, Serhiy Storchaka wrote:
>>> Actually the reliable code should write into a separate file and replace
>>> the original file by the new file only if writing is successful. Or
>>> backup the old file and restore it if writing fails. Or do both. And
>>> handle hard and soft links if necessary. And use file locks if needed to
>>> prevent race conditions when read/written by different processes. Depending
>>> on the specifics of the application you may need different code. Your
>>> three lines are enough for a one-time script if the risk of a power
>>> outage or disk space exhaustion is insignificant or if the data is not
>>> critical.
>>
>> This pitfall sounds like a good reason to have such a function in the
>> standard library.
>
> It certainly sounds like a good reason for someone to write a "safe
> file rewrite" library function. But I don't think that it's such a
> common need that it needs to be a stdlib function. It may well even be
> the case that there's such a function already available on PyPI - has
> anyone actually checked?

There wasn't last time I checked (which admittedly was several years ago now).

The issue is that it's painfully difficult to write a robust
cross-platform "atomic rewrite" operation that can cleanly handle a
wide range of arbitrary use cases - instead, folks are more likely to
write simpler alternatives that work well enough given whichever
simplifying assumptions are applicable to their use case (which may
even include "I don't care about atomicity, and am quite happy to let
a poorly timed Ctrl-C or unexpected system shutdown corrupt the file
I'm rewriting").

https://bugs.python.org/issue8604#msg174104 is the relevant tracker
discussion (deliberately linking into the middle of it, since the
early part is akin to this thread: reactions mostly along the lines of
"that's easy, and doesn't need to be in the standard library". It
definitely *isn't* easy, but it's also challenging to publish on PyPI,
since it's a quagmire of platform specific complexity and edge cases,
if you mess it up you can cause significant data loss, and anyone that
already knows they need atomic rewrites is likely to be able to come
up with their own purpose specific implementation in less time than it
would take them to assess the suitability of 3rd party alternatives).
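
As an illustration of the "simpler alternatives" mentioned above, here is a
minimal sketch of a write-to-temporary-file-then-rename rewrite. The helper
name `rewrite_file` and its `transform` parameter are hypothetical, and the
simplifying assumptions are significant: a single writer (no locking), text
files small enough to read into memory, and no attempt to preserve ownership,
permissions or links.

```python
import os
import tempfile


def rewrite_file(path, transform):
    """Hypothetical helper: replace the text in path with transform(old_text)."""
    with open(path, "r", encoding="utf-8") as src:
        new_text = transform(src.read())

    # Create the temporary file in the same directory so that the final
    # rename stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".rewrite-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(new_text)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace swaps the new file into place in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure, leave the original untouched and drop the temp file.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Usage would be something like `rewrite_file("notes.txt", str.upper)`. It does
not address the platform-specific edge cases discussed in the tracker issue.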

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)

2018-04-15 Thread Chris Angelico
On Sun, Apr 15, 2018 at 7:19 PM, Kirill Balunov  wrote:
>> === Expression first, 'as' keyword ===
>>
>>     while (read_next_item() as value) is not None:
>>         ...
>>
>> Pros:
>>
>>   * typically reads nicely as pseudocode
>>   * "as" is already associated with namebinding operations
>>
>
> I understand that this list is subjective. But as for me it will be huge PRO
> that the expression comes first.

I don't think we're ever going to unify everyone on an arbitrary
question of "expression first" or "name first". But to all the
"expression first" people, a question: what if the target is not just
a simple name?

    while (read_next_item() -> items[i + 1 -> i]) is not None:
        print("%d/%d..." % (i, len(items)), end="\r")

Does this make sense? With the target coming first, it perfectly
parallels the existing form of assignment:

>>> items = [None] * 10
>>> i = -1
>>> i, items[i] = i+1, input("> ")
> asdf
>>> i, items[i] = i+1, input("> ")
> qwer
>>> i, items[i] = i+1, input("> ")
> zxcv
>>> items
['asdf', 'qwer', 'zxcv', None, None, None, None, None, None, None]

The unpacking syntax is a bit messy, but with expression assignment,
we can do this:

>>> items = [None] * 10
>>> i = -1
>>> items[i := i + 1] = input("> ")
> asdf
>>> items[i := i + 1] = input("> ")
> qwer
>>> items[i := i + 1] = input("> ")
> zxcv
>>>
>>> items
['asdf', 'qwer', 'zxcv', None, None, None, None, None, None, None]

Okay, it's not quite as simple as C's "items[i++]" (since you have to
start i off at negative one so you can pre-increment), but it's still
logical and sane. Are you as happy with that sort of complex
expression coming after 'as' or '->'?
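
Both sessions above can be replayed non-interactively. This is only a sketch,
assuming an interpreter that implements the proposed `:=`; the canned
`replies` iterator stands in for `input("> ")`, and the index expression is
parenthesized to stay on the safe side of the grammar.

```python
# Stand-in for input("> "): a fixed sequence of replies (an assumption,
# used here so the example is repeatable).
replies = iter(["asdf", "qwer", "zxcv"])

# Statement form: targets are assigned left to right, so i is rebound
# before items[i] is evaluated as a target.
items = [None] * 10
i = -1
for _ in range(3):
    i, items[i] = i + 1, next(replies)
statement_result = list(items)

# Expression form: the index is pre-incremented inside the subscript.
replies = iter(["asdf", "qwer", "zxcv"])
items = [None] * 10
i = -1
for _ in range(3):
    items[(i := i + 1)] = next(replies)

print(statement_result == items, i)
```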

Not a rhetorical question. I'm genuinely curious as to whether people
are expecting "expression -> NAME" or "expression -> TARGET", where
TARGET can be any valid assignment target.

ChrisA


Re: [Python-ideas] Rewriting file - pythonic way

2018-04-15 Thread Alexey Shrub
On Sunday, 15 Apr 2018 at 2:40, Nick Coghlan wrote:

> https://bugs.python.org/issue8604#msg174104 is the relevant tracker
> discussion

Thanks all, I agree that a universal and absolutely safe solution is very
difficult, but as an experiment I made a draft:

https://github.com/worldmind/scripts/tree/master/filerewrite
The main code is here:
https://github.com/worldmind/scripts/blob/master/filerewrite/filerewrite.py#L46



Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)

2018-04-15 Thread Kirill Balunov
2018-04-15 15:21 GMT+03:00 Chris Angelico :

> On Sun, Apr 15, 2018 at 7:19 PM, Kirill Balunov 
> wrote:
> >> === Expression first, 'as' keyword ===
> >>
> >>     while (read_next_item() as value) is not None:
> >>         ...
> >>
> >> Pros:
> >>
> >>   * typically reads nicely as pseudocode
> >>   * "as" is already associated with namebinding operations
> >>
> >
> > I understand that this list is subjective. But as for me it will be huge
> > PRO that the expression comes first.
>
> I don't think we're ever going to unify everyone on an arbitrary
> question of "expression first" or "name first". But to all the
> "expression first" people, a question: what if the target is not just
> a simple name?
>
>     while (read_next_item() -> items[i + 1 -> i]) is not None:
>         print("%d/%d..." % (i, len(items)), end="\r")
>
> [...]
>
> Not a rhetorical question. I'm genuinely curious as to whether people
> are expecting "expression -> NAME" or "expression -> TARGET", where
> TARGET can be any valid assignment target.
>
>
I completely agree with you that it is impossible to unify everyone's opinion
- we all have different backgrounds. But this example is more likely to play
against this PEP. It is extra complexity within one line, and it can
fail hard in at least three obvious places :) And I am against this usage
whether it is `name first` or `expression first`. But I will ask again with
the following snippets. Which of these examples would you choose:

0.

while (items[i := i+1] := read_next_item()) is not None:
    print(r'%d/%d' % (i, len(items)), end='\r')

1.

while (read_next_item() -> items[(i+1) -> i]) is not None:
    print(r'%d/%d' % (i, len(items)), end='\r')

2.

while (item := read_next_item()) is not None:
    items[i := (i+1)] = item
    print(r'%d/%d' % (i, len(items)), end='\r')

3.

while (read_next_item() -> item) is not None:
    items[(i+1) -> i] = item
    print(r'%d/%d' % (i, len(items)), end='\r')

4.

while (item := read_next_item()) is not None:
    i = i+1
    items[i] = item
    print(r'%d/%d' % (i, len(items)), end='\r')

5.

while (read_next_item() -> item) is not None:
    i = i+1
    items[i] = item
    print(r'%d/%d' % (i, len(items)), end='\r')

I am definitely Ok with both 2 and 3 here. But as it was noted `:=`
produces additional noise in other places and I am also an `expression
first` guy :) So I still prefer variant 3 to 2. But to be completely
honest, I would write it in the following way:

for item in iter(read_next_item, None):
    items.append(item)
    print(r'%d/%d' % (i, len(items)), end='\r')


With kind regards,
-gdg


Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)

2018-04-15 Thread Kirill Balunov
2018-04-15 17:17 GMT+03:00 Kirill Balunov :

>
>
> for item in iter(read_next_item, None):
>     items.append(item)
>     print(r'%d/%d' % (i, len(items)), end='\r')
>
>
> With kind regards,
> -gdg
>

Oh, I forgot about `i`:

for item in iter(read_next_item, None):
    i += 1
    items.append(item)
    print(r'%d/%d' % (i, len(items)), end='\r')
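
For anyone unfamiliar with the two-argument form of `iter()`: it calls the
function repeatedly and stops as soon as the sentinel value is returned. A
self-contained sketch, where `read_next_item` is a hypothetical reader backed
by a small in-memory queue:

```python
from collections import deque

# Hypothetical data source for the example.
pending = deque(["a", "b", "c"])


def read_next_item():
    """Hypothetical reader: returns the next item, or None when exhausted."""
    return pending.popleft() if pending else None


items = []
# iter(callable, sentinel) keeps calling read_next_item() until it returns None.
for item in iter(read_next_item, None):
    items.append(item)

print(items)
```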

With kind regards,
-gdg


Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)

2018-04-15 Thread Thautwarm Zhao
> To me, "from" strongly suggests that an element is being obtained from a
> container/collection of elements. This is how I conceptualize
> "from module import name": "name" refers to an object INSIDE the module,
> not the module itself. If I saw
>
>     if (match from pattern.search(data)) is not None:
>         ...
>
> I would guess that it is equivalent to
>
>     m = next(pattern.search(data))
>     if m is not None:
>         ...

+1, although unpacking seems to be reasonable `[elem1, *elems] from
contains`.


Now we have

- "expr as name"
- "name := expr"
- "expr -> name"
- "name from expr"

Personally I prefer "as", but I think that without a big change to Python's
Grammar file, it's impossible to avoid parsing "with expr as name" as
"with (expr as name)", because "expr as name" is actually an "expr".
I have mentioned this in previous discussions, and it seems better to
warn you all again. I don't think the people of Python-Dev are willing to
implement a totally new Python compiler.


Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)

2018-04-15 Thread Thautwarm Zhao
>
>
> 0.
>
> while (items[i := i+1] := read_next_item()) is not None:
>     print(r'%d/%d' % (i, len(items)), end='\r')
>
> 1.
>
> while (read_next_item() -> items[(i+1) -> i]) is not None:
>     print(r'%d/%d' % (i, len(items)), end='\r')
>
> 2.
>
> while (item := read_next_item()) is not None:
>     items[i := (i+1)] = item
>     print(r'%d/%d' % (i, len(items)), end='\r')
>
> 3.
>
> while (read_next_item() -> item) is not None:
>     items[(i+1) -> i] = item
>     print(r'%d/%d' % (i, len(items)), end='\r')
>
> 4.
>
> while (item := read_next_item()) is not None:
>     i = i+1
>     items[i] = item
>     print(r'%d/%d' % (i, len(items)), end='\r')
>
> 5.
>
> while (read_next_item() -> item) is not None:
>     i = i+1
>     items[i] = item
>     print(r'%d/%d' % (i, len(items)), end='\r')
>
>
Also 2 or 3.
The 3rd one is in the order of natural language, just like:
while getting the next item and assigning it to `item`, if it's not None, do
some stuff.

However, just as we have pointed out, the semantics of '->' are quite
different from the cases where it's currently used, so it should be handled
much more carefully.

I think maybe we can use Unicode characters like ≜ (\triangleq) and add
support for Unicode completion to the Python REPL. Unicode completion in
editors and IDEs is already quite mature.


Re: [Python-ideas] Rewriting file - pythonic way

2018-04-15 Thread Oleg Broytman
On Sun, Apr 15, 2018 at 05:15:55PM +0300, Alexey Shrub  wrote:
> On Sunday, 15 Apr 2018 at 2:40, Nick Coghlan wrote:
> > https://bugs.python.org/issue8604#msg174104 is the relevant tracker
> > discussion
> 
> Thanks all, I agree that a universal and absolutely safe solution is very
> difficult, but as an experiment I made a draft:
> https://github.com/worldmind/scripts/tree/master/filerewrite

   Good!

> main code here
> https://github.com/worldmind/scripts/blob/master/filerewrite/filerewrite.py#L46

   May I recommend catching exceptions in `backuper.backup()`, then
cleaning up the backuper and unlocking the locker?

Oleg.
-- 
     Oleg Broytman    http://phdru.name/    p...@phdru.name
   Programmers don't die, they just GOSUB without RETURN.


Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)

2018-04-15 Thread Steven D'Aprano
On Sun, Apr 15, 2018 at 10:21:02PM +1000, Chris Angelico wrote:

> I don't think we're ever going to unify everyone on an arbitrary
> question of "expression first" or "name first". But to all the
> "expression first" people, a question: what if the target is not just
> a simple name?
> 
> while (read_next_item() -> items[i + 1 -> i]) is not None:
> print("%d/%d..." % (i, len(items)), end="\r")

I don't see why it would make a difference. It doesn't to me.


> Does this make sense? With the target coming first, it perfectly
> parallels the existing form of assignment:

Yes, except this isn't ordinary assignment-as-a-statement.

I've been mulling over the question why I think the expression needs to 
come first here, whereas I'm satisfied with the target coming first for 
assignment statements, and I think I've finally got the words to explain 
it. It is not just long familiarity with maths and languages that put 
the variable first (although that's also part of it). It has to do with 
what we're looking for when we read code, specifically what is the 
primary piece of information we're initially looking for.

In assignment STATEMENTS the primary piece of information is the target. 
Yes, of course the value assigned to the target is important, but often 
we don't care what the value is, at least not at first. We're hunting 
for a known target, and only when we find it do we care about the value 
it gets.

A typical scenario: I'm reading a function, and I scan down the block 
looking at the start of each line until I find the variable I want:

spam = don't care
eggs = don't care
self.method(don't care)
cheese = ... <<< HERE IT IS

so it actually helps to have the name up front. Copying standard maths 
notation for assignment (variable first, value second) is a good thing 
for statements.

With assignment-statements, if you're scanning the code for a variable 
name, you're necessarily interested in the name and it will be helpful 
to have it on the left.

But with assignment-expressions, there's an additional circumstance: 
sometimes you don't care about the name, you only care what the value 
is. (I expect this will be more common.) The name is just something 
to skip over when you're scanning the code looking for the value.

# what did I pass as the fifth argument to the function?
result = some_func(don't care, spam := don't care, eggs := don't care,
   self.method(don't care), cheese := HERE IT IS, 
   ...)

Of course it's hard counting commas so it's probably better to add a bit 
of structure to your function call:

result = some_func(don't care, 
   spam := don't care, 
   eggs := don't care,
   self.method(don't care), 
   cheese := HERE IT IS, 
   ...)


But this time we don't care about the name. It's the value we care about:

result = some_func(don't care, 
   don't care -> don't care
   don't care -> don't care
   don't care(don't care), 
   HERE IT IS  ,
   ...)


The target is just one more thing you have to ignore, and it is helpful 
to have expression first and the target second.

Some more examples:

# what am I adding to the total?
total += don't care := expression

# what key am I looking up?
print(mapping[don't care := key])

# how many items did I just skip?
self.skip(don't care := obj.start + extra)


versus

total += expression -> don't care
print(mapping[key -> don't care])
self.skip(obj.start + extra -> don't care)


It is appropriate for assignment statements and expressions to be 
written differently because they are used differently.



[...]
> >>> items = [None] * 10
> >>> i = -1
> >>> items[i := i + 1] = input("> ")
> > asdf
> >>> items[i := i + 1] = input("> ")
> > qwer
> >>> items[i := i + 1] = input("> ")
> > zxcv
> >>>
> >>> items
> ['asdf', 'qwer', 'zxcv', None, None, None, None, None, None, None]


I don't know why you would write that instead of:

items = [None]*10
for i in range(3):
    items[i] = input("> ")


or even for that matter:

items = [input("> ") for i in range(3)] + [None]*7


but whatever floats your boat. (Python isn't just not Java. It's also 
not C *wink*)


> Are you as happy with that sort of complex
> expression coming after 'as' or '->'?

Sure. Ignoring the output of the calls to input():

items = [None] * 10
i = -1
items[i + 1 -> i] = input("> ")
items[i + 1 -> i] = input("> ")
items[i + 1 -> i] = input("> ")


which isn't really such a complex target. How about this instead?


obj = SimpleNamespace(spam=None, eggs=None, 
  aardvark={'key': [None, None, -1]}
  )
items[obj.aardvark['key'][2] + 1 -> obj.aardvark['key'][2]] = input("> ")

versus:

items[obj.aardvark['key'][2] := obj.aardvark['key'][2] + 

Re: [Python-ideas] Rewriting file - pythonic way

2018-04-15 Thread Mahmoud Hashemi
Depending on how firm your requirements around locking are, you may find
this code useful:
https://github.com/mahmoud/boltons/blob/6b0721b6aeda6d3ec6f5d31be7c741bc7fcc4635/boltons/fileutils.py#L303

(docs here:
http://boltons.readthedocs.io/en/latest/fileutils.html#atomic-file-saving )

Basically every operating system has _some_ way of doing an atomic file
replacement, letting us guarantee that a file at a given location is always
valid. atomic_save provides a unified interface to that cross-platform
behavior.

The code does not do locking, as neither I nor its other users have wanted
it, but I'd be happy to extend it if there's a sensible default.

On Sun, Apr 15, 2018 at 8:19 AM, Oleg Broytman  wrote:

> On Sun, Apr 15, 2018 at 05:15:55PM +0300, Alexey Shrub wrote:
> > On Sunday, 15 Apr 2018 at 2:40, Nick Coghlan wrote:
> > > https://bugs.python.org/issue8604#msg174104 is the relevant tracker
> > > discussion
> >
> > Thanks all, I agree that a universal and absolutely safe solution is very
> > difficult, but as an experiment I made a draft:
> > https://github.com/worldmind/scripts/tree/master/filerewrite
>
>    Good!
>
> > main code here
> > https://github.com/worldmind/scripts/blob/master/filerewrite/filerewrite.py#L46
>
>    May I recommend catching exceptions in `backuper.backup()`, then
> cleaning up the backuper and unlocking the locker?
>
> Oleg.
> --
>      Oleg Broytman    http://phdru.name/    p...@phdru.name
>Programmers don't die, they just GOSUB without RETURN.


Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

2018-04-15 Thread Tim Peters
[Raymond Hettinger ]
> Q. Do other languages do it?
> A. Numpy, no. R, no. APL, no. Mathematica, no. Haskell, yes.
>
> * http://docs.scipy.org/doc/numpy/reference/generated/numpy.ufunc.accumulate.html
> * https://stat.ethz.ch/R-manual/R-devel/library/base/html/cumsum.html
> * http://microapl.com/apl/apl_concepts_chapter5.html
>   \+ 1 2 3 4 5
>   1 3 6 10 15
> * https://reference.wolfram.com/language/ref/Accumulate.html
> * https://www.haskell.org/hoogle/?hoogle=mapAccumL

There's also C++, which is pretty much "yes" to every variation
discussed so far:

* partial_sum() is like Python's current accumulate(), including
defaulting to doing addition.

http://en.cppreference.com/w/cpp/algorithm/partial_sum

* inclusive_scan() is also like accumulate(), but allows an optional
"init" argument (which is returned if specified), and there's no
guarantee of "left-to-right" evaluation (it's intended for associative
binary functions, and wants to allow parallelism in the
implementation).

http://en.cppreference.com/w/cpp/algorithm/inclusive_scan

* exclusive_scan() is like inclusive_scan(), but _requires_ an "init"
argument (which is not returned).

http://en.cppreference.com/w/cpp/algorithm/exclusive_scan

* accumulate() is like Python's functools.reduce(), but the operation
is optional and defaults to addition, and an "init" argument is
required.

http://en.cppreference.com/w/cpp/algorithm/accumulate
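
For comparison, the Python functions referred to in this list can be
exercised with a short standard-library sketch. The `start` value, and
emulating it by chaining it in front of the data, is only an illustration of
the idea under discussion in this thread, not a parameter of `accumulate()`.

```python
import operator
from functools import reduce
from itertools import accumulate, chain

data = [1, 2, 3, 4, 5]

# accumulate() defaults to addition, like C++ partial_sum().
running_sums = list(accumulate(data))

# Any binary function can be supplied instead.
running_products = list(accumulate(data, operator.mul))

# Emulating a start value by putting it in front of the input.
start = 100
with_start = list(accumulate(chain([start], data)))

# reduce() yields only the final value; its initial value is optional.
total = reduce(operator.add, data, start)

print(running_sums, running_products, with_start, total)
```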


Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)

2018-04-15 Thread Steven D'Aprano
On Sun, Apr 15, 2018 at 11:11:37PM +0800, Thautwarm Zhao wrote:

> I think maybe we can use Unicode characters like ≜ (\triangleq) and add
> support for Unicode completion to the Python REPL. Unicode completion in
> editors and IDEs is already quite mature.

What key combination do I need to type to get ≜ in the following editors 
please? I tried typing \triangleq but all I got was \triangleq.

Notepad (Windows)
Brackets (Mac)
BBEdit (Mac)
kwrite (Linux)
kate
nano
geany
gedit

as well as IDLE, my mail client (kmail, Thunderbird or mutt), my web 
browsers (Firefox, Opera and Chromium), the interactive interpreter in 
various different consoles, my Usenet client (Pan and KNode) and IRC 
(pidgin).

Oh, having it work in LibreOffice and GoogleApps too would be nice, 
although not essential since I don't often write code in them.

And what decent fonts do I need to install for ≜ to show up as something 
other than a square box ("missing glyph")?


-- 
Steve


Re: [Python-ideas] Rewriting file - pythonic way

2018-04-15 Thread Oleg Broytman
On Sun, Apr 15, 2018 at 09:10:57AM -0700, Mahmoud Hashemi wrote:
> Depending on how firm your requirements around locking are, you may find
> this code useful:
> https://github.com/mahmoud/boltons/blob/6b0721b6aeda6d3ec6f5d31be7c741bc7fcc4635/boltons/fileutils.py#L303
> 
> (docs here:
> http://boltons.readthedocs.io/en/latest/fileutils.html#atomic-file-saving )
> 
> Basically every operating system has _some_ way of doing an atomic file
> replacement, letting us guarantee that a file at a given location is always
> valid. atomic_save provides a unified interface to that cross-platform
> behavior.
> 
> The code does not do locking, as neither I nor its other users have wanted
> it, but I'd be happy to extend it if there's a sensible default.

   I don't like that it renames the file at the end. Renaming could lead to
changed file ownership and permissions; restoring permissions is not
always possible, restoring ownership is almost never possible. Renaming
is also not always possible due to restricted directory permissions.
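
   To illustrate the permissions half of that concern, here is a rough
sketch of a rename-based replacement that at least copies the permission
bits across before renaming. The helper names are hypothetical; ownership is
deliberately not handled, since restoring it generally needs privileges, and
the rename still fails if the directory is not writable.

```python
import os
import shutil
import stat
import tempfile


def mode_of(path):
    """Return just the permission bits of path."""
    return stat.S_IMODE(os.stat(path).st_mode)


def replace_keeping_mode(path, new_text):
    """Hypothetical helper: swap in new_text while keeping path's mode bits."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(new_text)
        # mkstemp creates the file with owner-only permissions, so copy the
        # original permission bits over before the rename. Ownership is NOT
        # restored here.
        shutil.copymode(path, tmp_path)
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```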

> On Sun, Apr 15, 2018 at 8:19 AM, Oleg Broytman  wrote:
> 
> > On Sun, Apr 15, 2018 at 05:15:55PM +0300, Alexey Shrub wrote:
> > > On Sunday, 15 Apr 2018 at 2:40, Nick Coghlan wrote:
> > > > https://bugs.python.org/issue8604#msg174104 is the relevant tracker
> > > > discussion
> > >
> > > Thanks all, I agree that a universal and absolutely safe solution is very
> > > difficult, but as an experiment I made a draft:
> > > https://github.com/worldmind/scripts/tree/master/filerewrite
> >
> >    Good!
> >
> > > main code here
> > > https://github.com/worldmind/scripts/blob/master/filerewrite/filerewrite.py#L46
> >
> >    May I recommend catching exceptions in `backuper.backup()`, then
> > cleaning up the backuper and unlocking the locker?


Oleg.
-- 
     Oleg Broytman    http://phdru.name/    p...@phdru.name
   Programmers don't die, they just GOSUB without RETURN.


Re: [Python-ideas] A cute Python implementation of itertools.tee

2018-04-15 Thread Tim Peters
[Antoine Pitrou ]
> This implementation doesn't work with Python 3.7 or 3.8.
> I've tried it here:
> https://gist.github.com/pitrou/b3991f638300edb6d06b5be23a4c66d6
>
> and get:
> Traceback (most recent call last):
>   File "mytee.py", line 14, in gen
>     mylast = last[1] = last = [next(it), None]
> StopIteration
>
> The above exception was the direct cause of the following exception:
>
> Traceback (most recent call last):
>   File "mytee.py", line 47, in <module>
>     run(mytee1)
>   File "mytee.py", line 36, in run
>     lists[i].append(next(iters[i]))
> RuntimeError: generator raised StopIteration
>
> (Yuck!)

Thanks for trying!  I wonder whether that will break other code.  I
wrote PEP 255, and this part was intentional at the time:

"""
If an unhandled exception-- including, but not limited to,
StopIteration --is raised by, OR PASSES THROUGH [emphasis added], a
generator function, then the exception is passed on to the caller in
the usual way, and subsequent attempts to resume the generator
function raise StopIteration.
"""

I've exploited that a number of times.


> In short, you want the following instead:
>
> try:
>     mylast = last[1] = last = [next(it), None]
> except StopIteration:
>     return

No, I don't ;-)  If I have to catch StopIteration myself now, then I
want the entire "while True:" loop in the "try" block.  Setting up
try/except machinery anew on each iteration would add significant
overhead; doing it just once per derived generator wouldn't.
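
The behaviour change in question (PEP 479, on by default since Python 3.7)
can be seen with two tiny generators. This is just an illustrative sketch;
`leaky` and `guarded` are made-up names.

```python
def leaky(it):
    # No try/except: when next(it) raises StopIteration, it escapes the body.
    while True:
        yield next(it)


def guarded(it):
    # One try block around the whole loop, set up once per generator.
    try:
        while True:
            yield next(it)
    except StopIteration:
        return


leaky_error = None
try:
    list(leaky(iter([1, 2])))
except RuntimeError as exc:
    # The escaping StopIteration is converted into a RuntimeError.
    leaky_error = exc

guarded_items = list(guarded(iter([1, 2])))
print(repr(leaky_error), guarded_items)
```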


>> def mytee(xs, n):
>>     last = [None, None]
>>
>>     def gen(it, mylast):
>>         nonlocal last
>>         while True:
>>             mylast = mylast[1]
>>             if not mylast:
>>                 mylast = last[1] = last = [next(it), None]

> That's smart and obscure :-o
> The way it works is that the `last` assignment changes the `last` value
> seen by all derived generators, while the `last[1]` assignment updates
> the bindings made in the other generators' `mylast` lists...  It's
> difficult to find the words to explain it.

Which is why I didn't even try - I did warn people that if they
thought it "was obvious", they hadn't yet thought hard enough ;-)
Good job!


> The chained assignment makes it more difficult to parse as well (when I
> read this I don't know if `last[1]` or `last` gets assigned first;
> apparently the answer is `last[1]`, otherwise the recipe wouldn't work
> correctly).

Ya, I had to look it up too :-)  Although, like almost everything else
in Python, chained assignments proceed "left to right".  I was just
trying to make it as short as possible, to increase the "huh - can
something that tiny really work?!" healthy skepticism factor :-)


>  Perhaps like this:
>
>     while True:
>         mylast = mylast[1]
>         if not mylast:
>             try:
>                 # Create new list link
>                 mylast = [next(it), None]
>             except StopIteration:
>                 return
>             else:
>                 # Append to other generators `mylast` linked lists
>                 last[1] = mylast
>                 # Update shared list link
>                 last = last[1]
>         yield mylast[0]

I certainly agree that's easier to follow.  But that wasn't really the point ;-)
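
For reference, here is a sketch of how the whole recipe might look with a
single try block wrapped around the loop, as discussed earlier in this
message. The last two lines of `mytee` (creating the shared iterator and
returning a tuple of generators) are not quoted above, so treat them as an
assumption.

```python
def mytee(xs, n):
    # Shared tail of a singly linked list of [value, next_link] cells.
    last = [None, None]

    def gen(it, mylast):
        nonlocal last
        try:
            while True:
                mylast = mylast[1]
                if not mylast:
                    # Left to right: bind mylast, link the old tail to the
                    # new cell, then move the shared tail forward.
                    mylast = last[1] = last = [next(it), None]
                yield mylast[0]
        except StopIteration:
            # The underlying iterator is exhausted: end this generator.
            return

    it = iter(xs)
    return tuple(gen(it, last) for _ in range(n))


a, b = mytee([1, 2, 3], 2)
first_two = [next(a), next(a)]
all_of_b = list(b)
rest_of_a = list(a)
print(first_two, all_of_b, rest_of_a)
```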


Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)

2018-04-15 Thread Kirill Balunov
2018-04-15 18:58 GMT+03:00 Steven D'Aprano :

>
> [...]
>
> But this time we don't care about the name. Its the value we care about:
>
> result = some_func(don't care,
>don't care -> don't care
>don't care -> don't care
>don't care(don't care),
>HERE IT IS  ,
>...)
>

This made my day! :) The programming style when you absolutely don't care
:))) I understand that this is a typo but it turned out to be very funny.

In general, I agree with everything you've said. And I think you found a
very correct way to explain why expression should go first in assignment
expression.

With kind regards,
-gdg


Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)

2018-04-15 Thread Guido van Rossum
On Sun, Apr 15, 2018 at 4:05 AM, Kirill Balunov 
wrote:

> [...] For me personally, `:=` looks and feels just like a normal assignment
> statement that can be used interchangeably, but in many more places in the
> code. And if the main goal of the PEP was to offer this `assignment
> expression` as a future replacement for the `assignment statement`, the `:=`
> syntax form would be a very reasonable proposal (of course in this case
> there would be a lot more other questions).
>

I haven't kept up with what's in the PEP (or much of this thread), but this
is the key reason I strongly prefer := as inline assignment operator.


> But somehow this PEP does not mean it! And with the current rationale of
> this PEP it's a huge CON for me that `=` and `:=` feel and look the same.
>

Then maybe the PEP needs to be updated.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-ideas] A cute Python implementation of itertools.tee

2018-04-15 Thread Serhiy Storchaka

15.04.18 19:52, Tim Peters wrote:

> No, I don't ;-)  If I have to catch StopIteration myself now, then I
> want the entire "while True:" loop in the "try" block.  Setting up
> try/except machinery anew on each iteration would add significant
> overhead; doing it just once per derived generator wouldn't.


This overhead is around 10% of the time for calling `next(it)`. It may 
be less than 1-2% of the whole step of mytee iteration.


I have ideas about implementing zero-overhead try/except, but I doubt
that it is worth the effort. The benefit seems too small.
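
For concreteness, the variant Tim describes, with the try/except set up
once per derived generator rather than on every iteration, might look like
the sketch below. It adapts the mytee() posted earlier in this thread; the
exact placement of the try block is an assumption about what was meant, not
code taken from either post.

```python
def mytee(xs, n):
    # Sentinel node of a shared singly linked list: [value, next_node].
    last = [None, None]

    def gen(it, mylast):
        nonlocal last
        # One try block around the whole loop: the try/except machinery
        # is set up once per derived generator, not once per item.
        try:
            while True:
                mylast = mylast[1]
                if not mylast:
                    # No buffered item left: pull one from the source and
                    # append it to the shared list for the other branches.
                    mylast = last[1] = last = [next(it), None]
                yield mylast[0]
        except StopIteration:
            # The source iterator is exhausted; end this branch cleanly
            # instead of letting StopIteration escape from the generator.
            return

    it = iter(xs)
    return tuple(gen(it, last) for _ in range(n))


a, b = mytee("abc", 2)
first_two = [next(a), next(a)]  # branch a runs ahead of branch b
rest_a = list(a)
all_b = list(b)                 # branch b still sees every item
print(first_two, rest_a, all_b)
```

Catching StopIteration explicitly matters on Python 3.7 and later, where a
StopIteration escaping from a generator body is turned into a RuntimeError.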




Re: [Python-ideas] Idea: Importing from arbitrary filenames

2018-04-15 Thread Steve Barnes


On 15/04/2018 08:12, Nick Coghlan wrote:
> On 14 April 2018 at 19:22, Steve Barnes  wrote:
>> I generally love the current import system for "just working" regardless
>> of platform, installation details, etc., but what I would like to see is
>> a clear import local, (as opposed to import from wherever you can find
>> something to satisfy mechanism). This is the one thing that I miss from
>> C/C++ where #include  is system includes and #include "x" search
>> differing include paths, (if used well).
> 
> For the latter purpose, we prefer that folks use either explicit
> relative imports (if they want to search the current package
> specifically), or else direct manipulation of package.__path__.
> 
> That is, if you do:
> 
>  from . import custom_imports # Definitely from your own project
>  custom_imports.__path__[:] = (some_directory, some_other_directory)
> 
> then:
> 
>  from .custom_imports import name
> 
> will search those directories for packages & modules to import, while
> still cleanly mapping to a well-defined location in the module
> namespace for the process as a whole (and hence being able to use all
> the same caches as other imports, without causing name conflicts or
> other problems).
> 
> If you want to do this dynamically relative to the current module,
> then it's possible to do:
> 
>  global __path__
>  __path__[:] = (some_directory, some_other_directory)
>  custom_mod = importlib.import_module(".name", package=__name__)
> 
> The discoverability of these kinds of techniques could definitely
> stand to be improved, but the benefit of adopting them is that they
> work on all currently supported versions of Python (even
> importlib.import_module exists in Python 2.7 as a convenience wrapper
> around __import__), rather than needing to wait for new language level
> syntax for them.
> 
> Cheers,
> Nick.
> 
Thanks Nick,

As you say not too discoverable at the moment - I have just reread 
PEP328 & https://docs.python.org/3/library/importlib.html but did not 
find any mention of these mechanisms or even that setting an external 
__path__ variable existed as a possibility.

Maybe a documentation enhancement proposal would be in order?
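
For anyone else trying to follow the technique, here is a minimal
self-contained sketch of the package.__path__ approach Nick describes. The
names demo_pkg, elsewhere and plugin are made up for illustration, and the
temporary directory only stands in for a real project layout.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: an empty package plus an unrelated directory
    # holding the module we actually want to import.
    pkg_dir = os.path.join(root, "demo_pkg")
    extra_dir = os.path.join(root, "elsewhere")
    os.makedirs(pkg_dir)
    os.makedirs(extra_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w"):
        pass
    with open(os.path.join(extra_dir, "plugin.py"), "w") as f:
        f.write("VALUE = 42\n")

    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        import demo_pkg

        # Redirect submodule lookups for demo_pkg to the other directory.
        demo_pkg.__path__[:] = [extra_dir]

        # The module is found in extra_dir but still gets a well-defined
        # name inside the package namespace.
        plugin = importlib.import_module(".plugin", package="demo_pkg")
        plugin_name = plugin.__name__
        plugin_value = plugin.VALUE
    finally:
        sys.path.remove(root)

print(plugin_name, plugin_value)
```

The imported module is registered in sys.modules under the package-qualified
name, which is what lets it share the normal import caches.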
--
Steve (Gadget) Barnes
Any opinions in this message are my personal opinions and do not reflect 
those of my employer.



Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)

2018-04-15 Thread Thautwarm Zhao
Dear Steve, I'm sorry to annoy you with my proposal, but I do think using
Unicode might be wise at the current stage.

\triangleq can be written as the Unicode character \u225c, and adding plugins to
support typing it in editors should be easy: simply map \xxx to the
specific Unicode character when Tab is pressed after typing it.
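
As a small illustration of the escape mentioned above, the character can be
inspected from Python itself. This sketch only looks the code point up in
the standard unicodedata module; it says nothing about editor or font
support.

```python
import unicodedata

ch = "\u225c"  # the character that LaTeX calls \triangleq
name = unicodedata.name(ch)

# Print only ASCII so the output works on any console encoding.
print("U+%04X %s" % (ord(ch), name))

# The character is a math symbol, so it cannot be part of an identifier.
usable_in_names = ch.isidentifier()
```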

People using the Julia language are proud of this, but I think it's just a
convenience that could be used in any other language.

There are other reasons to support Unicode, but they are outside this topic.

Although ':=' and '->' are not perfect, within the range of ASCII it seems to
be impossible to find a better one.

 wrote on Mon, Apr 16, 2018 at 12:53 AM:

> -- Forwarded message --
> From: Mahmoud Hashemi 
> To: python-ideas 
> Cc:
> Bcc:
> Date: Sun, 15 Apr 2018 09:10:57 -0700
> Subject: Re: [Python-ideas] Rewriting file - pythonic way
> Depending on how firm your requirements around locking are, you may find
> this code useful:
> https://github.com/mahmoud/boltons/blob/6b0721b6aeda6d3ec6f5d31be7c741bc7fcc4635/boltons/fileutils.py#L303
>
> (docs here:
> http://boltons.readthedocs.io/en/latest/fileutils.html#atomic-file-saving
> )
>
> Basically every operating system has _some_ way of doing an atomic file
> replacement, letting us guarantee that a file at a given location is always
> valid. atomic_save provides a unified interface to that cross-platform
> behavior.
>
> The code does not do locking, as neither I nor its other users have wanted
> it, but I'd be happy to extend it if there's a sensible default.
>
> On Sun, Apr 15, 2018 at 8:19 AM, Oleg Broytman  wrote:
>
>> On Sun, Apr 15, 2018 at 05:15:55PM +0300, Alexey Shrub 
>> wrote:
>> > On Sunday, 15 Apr. 2018 at 2:40, Nick Coghlan
>> > wrote:
>> > > https://bugs.python.org/issue8604#msg174104 is the relevant tracker
>> > > discussion
>> >
>> > Thanks all, I agree that a universal and absolutely safe solution is very
>> > difficult, but as an experiment I made a draft
>> > https://github.com/worldmind/scripts/tree/master/filerewrite
>>
>>Good!
>>
>> > main code here
>> >
>> https://github.com/worldmind/scripts/blob/master/filerewrite/filerewrite.py#L46
>>
>>Can I recommend catching exceptions in `backuper.backup()`,
>> cleaning up the backuper and unlocking the locker?
>>
>> Oleg.
>> --
>>  Oleg Broytman    http://phdru.name/
>> p...@phdru.name
>>Programmers don't die, they just GOSUB without RETURN.
>
>
>
>
> -- Forwarded message --
> From: Tim Peters 
> To: Raymond Hettinger 
> Cc: Python-Ideas 
> Bcc:
> Date: Sun, 15 Apr 2018 11:15:18 -0500
> Subject: Re: [Python-ideas] Start argument for itertools.accumulate()
> [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
> [Raymond Hettinger ]
> > Q. Do other languages do it?
> > A. Numpy, no. R, no. APL, no. Mathematica, no. Haskell, yes.
> >
> > *
> http://docs.scipy.org/doc/numpy/reference/generated/numpy.ufunc.accumulate.html
> > *
> https://stat.ethz.ch/R-manual/R-devel/library/base/html/cumsum.html
> > * http://microapl.com/apl/apl_concepts_chapter5.html
> >   \+ 1 2 3 4 5
> >   1 3 6 10 15
> > * https://reference.wolfram.com/language/ref/Accumulate.html
> > * https://www.haskell.org/hoogle/?hoogle=mapAccumL
>
> There's also C++, which is pretty much "yes" to every variation
> discussed so far:
>
> * partial_sum() is like Python's current accumulate(), including
> defaulting to doing addition.
>
> http://en.cppreference.com/w/cpp/algorithm/partial_sum
>
> * inclusive_scan() is also like accumulate(), but allows an optional
> "init" argument (which is returned if specified), and there's no
> guarantee of "left-to-right" evaluation (it's intended for associative
> binary functions, and wants to allow parallelism in the
> implementation).
>

Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)

2018-04-15 Thread Mikhail V
On Sun, Apr 15, 2018 at 7:15 PM, Steven D'Aprano  wrote:
> On Sun, Apr 15, 2018 at 11:11:37PM +0800, Thautwarm Zhao wrote:
>
>> I think maybe we can use unicode characters like ≜ (\triangleq) and add the
>> support of unicode completion to python repl. The unicode completion of
>> editors or ides has been quite mature.
>
> What key combination do I need to type to get ≜ in the following editors
> please? I tried typing \triangleq but all I got was \triangleq.
>
> Notepad (Windows)
> Brackets (Mac)
> BBEdit (Mac)
> kwrite (Linux)
> kate
> nano
> geany
> gedit
>
> as well as IDLE, my mail client (kmail, Thunderbird or mutt), my web
> browsers (Firefox, Opera and Chromium), the interactive interpreter in
> various different consoles, my Usenet client (Pan and KNode) and IRC
> (pidgin).
>
> Oh, having it work in LibreOffice and GoogleApps too would be nice,
> although not essential since I don't often write code in them.

Typing should not be a problem generally. There are a lot of third-party apps which
can bind a key to a specific character, system-wide. On Windows I use AutoHotkey.
But of course there is no 100% guarantee for every editor.

> And what decent fonts do I need to install for ≜ to show up as something
> other than a square box ("missing glyph")?

Well, here it is way less optimistic :)
The chance of seeing that "delta equal to" sign in some random font /
random app is not so high.
It only works if you have a font fallback system set up, and by default on
my Windows machine it seems to work only in the Firefox browser.


Mikhail


Re: [Python-ideas] Rewriting file - pythonic way

2018-04-15 Thread George Fischhof
Hi,

something similar already exists in the standard library:
https://docs.python.org/3/library/fileinput.html

fileinput.input(..., inplace=True, ...)
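
A minimal sketch of how that looks in practice, assuming a throwaway
temporary file in place of a real one. With inplace=True, anything printed
inside the loop is written back to the file being read.

```python
import fileinput
import os
import tempfile

# Create a small sample file to rewrite (stand-in for a real file).
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("old line one\nkeep this\nold line three\n")

# While the loop runs, standard output is redirected into the file.
with fileinput.input(files=[path], inplace=True) as lines:
    for line in lines:
        print(line.replace("old", "new"), end="")

with open(path) as f:
    result = f.read()
os.remove(path)
```

Note that this works line by line, so it suits per-line substitutions better
than edits that need the whole file contents at once.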

BR,
George

2018-04-15 10:57 GMT+02:00 Alexey Shrub :

> Hi all,
>
> I am new to Python (I am moving from the Perl world), but I have always loved
> Python for its high-level, beautiful and clean syntax.
> Now I have a question/idea about working with files.
> In my opinion this is a very popular use case:
> 1. Open file (for read and write)
> 2. Read data from file
> 3. Modify data.
> 4. Rewrite file with the modified data.
>
> But currently it does not look very pythonic:
>
> with open(filename, 'r+') as file:
>     data = file.read()
>     data = data.replace('old', 'new')
>     file.seek(0)
>     file.write(data)
>     file.truncate()
>
> or something like this
>
> with open(filename) as file:
>     data = file.read()
> data = data.replace('old', 'new')
> with open(filename, 'w') as file:
>     file.write(data)
>
> I think the best way is something like this
>
> with open(filename, 'r+') as file:
>     data = file.read()
>     data = data.replace('old', 'new')
>     file.rewrite(data)
>
> but for this, io.BufferedIOBase must contain a rewrite method
>
> What do you think about this?
>
>
>


Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)

2018-04-15 Thread Chris Angelico
On Mon, Apr 16, 2018 at 1:58 AM, Steven D'Aprano  wrote:
> On Sun, Apr 15, 2018 at 10:21:02PM +1000, Chris Angelico wrote:
>
>> I don't think we're ever going to unify everyone on an arbitrary
>> question of "expression first" or "name first". But to all the
>> "expression first" people, a question: what if the target is not just
>> a simple name?
>>
>> while (read_next_item() -> items[i + 1 -> i]) is not None:
>>     print("%d/%d..." % (i, len(items)), end="\r")
>
> I don't see why it would make a difference. It doesn't to me.

Okay, that's good. I just hear people saying "name" a lot, but that
would imply restricting the grammar to just a name, and I don't know
how comfortable people are with more complex targets.

>> Does this make sense? With the target coming first, it perfectly
>> parallels the existing form of assignment:
>
> Yes, except this isn't ordinary assignment-as-a-statement.
>
> I've been mulling over the question why I think the expression needs to
> come first here, whereas I'm satisfied with the target coming first for
> assignment statements, and I think I've finally got the words to explain
> it. It is not just long familiarity with maths and languages that put
> the variable first (although that's also part of it). It has to do with
> what we're looking for when we read code, specifically what is the
> primary piece of information we're initially looking for.
>
> In assignment STATEMENTS the primary piece of information is the target.
> Yes, of course the value assigned to the target is important, but often
> we don't care what the value is, at least not at first. We're hunting
> for a known target, and only when we find it do we care about the value
> it gets.
> [chomp details]

> It is appropriate for assignment statements and expressions to be
> written differently because they are used differently.

I don't know that assignment expressions are inherently going to be
used in ways where you ignore the assignment part and care only about
the expression part. And I disagree that assignment statements are
used primarily the way you say. Frequently I skim down a column of
assignments, caring primarily about the functions being called, and
looking at the part before the equals sign only when I come across a
parameter in another call; the important part of the line is what it's
doing, not where it's stashing its result.

> [...]
>> >>> items = [None] * 10
>> >>> i = -1
>> >>> items[i := i + 1] = input("> ")
>> > asdf
>> >>> items[i := i + 1] = input("> ")
>> > qwer
>> >>> items[i := i + 1] = input("> ")
>> > zxcv
>> >>>
>> >>> items
>> ['asdf', 'qwer', 'zxcv', None, None, None, None, None, None, None]
>
>
> I don't know why you would write that instead of:
>
> items = [None]*10
> for i in range(3):
>     items[i] = input("> ")
>
>
> or even for that matter:
>
> items = [input("> ") for i in range(3)] + [None]*7
>
>
> but whatever floats your boat. (Python isn't just not Java. It's also
> not C *wink*)

You and Kirill have both fallen into the trap of taking the example
too far. By completely rewriting it, you destroy its value as an
example. Write me a better example of a complex target if you like,
but the question is about how you feel about complex assignment
targets, NOT how you go about creating a particular list in memory.
That part is utterly irrelevant.

>> Are you as happy with that sort of complex
>> expression coming after 'as' or '->'?
>
> Sure. Ignoring the output of the calls to input():

The calls to input were in a while loop's header for a reason.
Ignoring them is ignoring the point of assignment expressions.

ChrisA


Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)

2018-04-15 Thread Chris Angelico
On Mon, Apr 16, 2018 at 12:17 AM, Kirill Balunov
 wrote:
>
>
> 2018-04-15 15:21 GMT+03:00 Chris Angelico :
>> I don't think we're ever going to unify everyone on an arbitrary
>> question of "expression first" or "name first". But to all the
>> "expression first" people, a question: what if the target is not just
>> a simple name?
>>
>> while (read_next_item() -> items[i + 1 -> i]) is not None:
>>     print("%d/%d..." % (i, len(items)), end="\r")
>>
>
> I completely agree with you that it is impossible to unify everyone's opinion
> - we all have different backgrounds. But this example is more likely to play
> against this PEP. This is extra complexity within one line and it can
> fail hard in at least three obvious places :) And I am against this usage no
> matter whether `name first` or `expression first`. But I will re-ask this with
> the following snippets. Which do you choose from these examples:
>
> 0.
>
> while (items[i := i+1] := read_next_item()) is not None:
>     print(r'%d/%d' % (i, len(items)), end='\r')
>
> 1.
>
> while (read_next_item() -> items[(i+1) -> i]) is not None:
>     print(r'%d/%d' % (i, len(items)), end='\r')

These two are matching what I wrote, and are thus the two forms under
consideration. I notice that you added parentheses to the second one;
is there a clarity problem here and you're unsure whether "i + 1 -> i"
would capture "i + 1" or "1"? If so, that's a downside to the
proposal.

> 2.
>
> while (item := read_next_item()) is not None:
>     items[i := (i+1)] = item
>     print(r'%d/%d' % (i, len(items)), end='\r')
>
> 3.
>
> while (read_next_item() -> item) is not None:
>     items[(i+1) -> i] = item
>     print(r'%d/%d' % (i, len(items)), end='\r')
>
> 4.
>
> while (item := read_next_item()) is not None:
>     i = i+1
>     items[i] = item
>     print(r'%d/%d' % (i, len(items)), end='\r')
>
> 5.
>
> while (read_next_item() -> item) is not None:
>     i = i+1
>     items[i] = item
>     print(r'%d/%d' % (i, len(items)), end='\r')

All of these are fundamentally different from what I'm asking: they
are NOT all expressions that can be used in the while header. So it
doesn't answer the question of "expression first" or "target first".
Once the expression gets broken out like this, you're right back to
using "expression -> NAME" or "NAME := expression", and it's the same
sort of simple example that people have been discussing all along.
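
To make the "broken out" form concrete, here is a runnable sketch of
variant 4, using the "NAME := expression" spelling as it exists in Python
3.8 and later. read_next_item() is not defined anywhere in this thread, so
the make_reader() helper below is a hypothetical stand-in that returns None
when the input runs out.

```python
def make_reader(values):
    """Hypothetical stand-in: yields values one call at a time, then None."""
    it = iter(values)

    def read_next_item():
        return next(it, None)

    return read_next_item


read_next_item = make_reader(["asdf", "qwer", "zxcv"])
items = [None] * 10
i = -1

# The assignment expression has a simple name as its target; the index
# bookkeeping happens in ordinary statements inside the loop body.
while (item := read_next_item()) is not None:
    i = i + 1
    items[i] = item

print(i, items[:4])
```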

> I am definitely OK with both 2 and 3 here. But as was noted, `:=` produces
> additional noise in other places and I am also an `expression first` guy :)
> So I still prefer variant 3 to 2. But to be completely honest, I would write
> it in the following way:
>
> for item in iter(read_next_item, None):
>     items.append(item)
>     print(r'%d/%d' % (i, len(items)), end='\r')

And that's semantically different in several ways. Not exactly a fair
comparison.
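
For readers who have not met the two-argument form of iter() used in the
quoted snippet, a small sketch follows. make_reader() is a hypothetical
helper, since read_next_item() is not defined in this thread. iter(callable,
sentinel) keeps calling the callable until it returns a value equal to the
sentinel.

```python
def make_reader(values):
    """Hypothetical stand-in: returns values one per call, then None."""
    it = iter(values)
    return lambda: next(it, None)


read_next_item = make_reader(["asdf", "qwer", "zxcv"])

items = []
# The loop ends as soon as read_next_item() returns None.
for item in iter(read_next_item, None):
    items.append(item)

print(items)
```

Unlike the original loop, this version grows a list with append() and keeps
no running index.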

I invite you to write up a better example with a complex target.

ChrisA


Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)

2018-04-15 Thread Chris Angelico
On Mon, Apr 16, 2018 at 3:19 AM, Guido van Rossum  wrote:
> On Sun, Apr 15, 2018 at 4:05 AM, Kirill Balunov 
> wrote:
>> But somehow this PEP does not mean it! And with the current rationale of
>> this PEP it's a huge CON for me that `=` and `:=` feel and look the same.
>
> Then maybe the PEP needs to be updated.

I can never be sure what people are reading when they say "current"
with PEPs like this. The text gets updated fairly frequently. As of
time of posting, here's the rationale:

-
Naming the result of an expression is an important part of programming,
allowing a descriptive name to be used in place of a longer expression,
and permitting reuse.  Currently, this feature is available only in
statement form, making it unavailable in list comprehensions and other
expression contexts.  Merely introducing a way to assign as an expression
would create bizarre edge cases around comprehensions, though, and to avoid
the worst of the confusions, we change the definition of comprehensions,
causing some edge cases to be interpreted differently, but maintaining the
existing behaviour in the majority of situations.
-

Kirill, is this what you read, and if so, how does that make ':=' a
negative? The rationale says "hey, see this really good thing you can
do as a statement? Let's do it as an expression too", so the parallel
should be a good thing.

ChrisA


Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)

2018-04-15 Thread Mikhail V
On Sun, Apr 15, 2018 at 2:01 PM, Nick Coghlan  wrote:
> On 15 April 2018 at 19:41, Mikhail V  wrote:
>> So IIUC, the *only* reason is to avoid '==' and '=' similarity?
>> If so, then it does not sound convincing at all.
>> Of course Python does me a favor showing an error,
>> when I make a typo like this:
>> if (x = y)
>>
>> But still, if this is the only real reason, it is not convincing.
>
> It's thoroughly convincing, because we're already familiar with the
> consequences of folks confusing "=" and "==" when writing C & C++
> code. It's an eternal bug magnet, so it's not a design we're ever
> going to port over to Python.
[...]
> The examples in the PEP have been updated to better reflect some of
> the key motivating use cases (embedded assignments in if and while
> statement conditions, generator expressions, and container
> comprehensions)

I'm personally "0" on the whole proposal. I was just curious
about that "demonisation" of the "=" and "==" visual similarity.
Granted, writing ":=" instead of "=" helps a little bit.
But if ":=" is accepted, then
we end up with two spellings :-)

>
>> And as a side note: I personally find the look of ":=" a bit 'noisy'.
>
> You're not alone in that, which is one of the reasons finding a
> keyword based option that's less syntactically ambiguous than "as"
> could be an attractive alternative.
>

Keyword variants look less appealing than ":=".
But if it had to be a keyword, then I'd definitely stick with
"TARGET keyword EXPR" just so as not to swap the traditional order.



Mikhail


Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)

2018-04-15 Thread Chris Angelico
On Mon, Apr 16, 2018 at 4:58 AM, Thautwarm Zhao  wrote:
> Dear Steve, I'm sorry to annoy you with my proposal, but I do think using
> Unicode might be wise at the current stage.
>
> \triangleq can be written as the Unicode character \u225c, and adding plugins to
> support typing it in editors should be easy: simply map \xxx to the
> specific Unicode character when Tab is pressed after typing it.
>
> People using the Julia language are proud of this, but I think it's just a
> convenience that could be used in any other language.
>
> There are other reasons to support Unicode, but they are outside this topic.
>
> Although ':=' and '->' are not perfect, within the range of ASCII it seems to be
> impossible to find a better one.
>

If you want to introduce non-ASCII tokens to Python, start by adding
them as _alternatives_ to the current syntax. See whether people adopt
them. I've seen one or two people using editors that redisplay
ASCII-only source code using other symbols (eg ≡ for JavaScript's
===), and you could make it so the source code can actually be saved
in that form. But making it so that the ONLY way to use a feature is
to use a non-ASCII character? That's going to break a lot of people's
workflows.

ChrisA


Re: [Python-ideas] A cute Python implementation of itertools.tee

2018-04-15 Thread Koos Zevenhoven
On Sun, Apr 15, 2018 at 8:05 AM, Tim Peters  wrote:
[...]


> Then I thought "this is stupid!  Python already does reference
> counting."  Voila!  Vast swaths of tedious code vanished, giving this
> remarkably simple implementation:
>
> def mytee(xs, n):
>     last = [None, None]
>
>     def gen(it, mylast):
>         nonlocal last
>         while True:
>             mylast = mylast[1]
>             if not mylast:
>                 mylast = last[1] = last = [next(it), None]
>             yield mylast[0]
>
>     it = iter(xs)
>     return tuple(gen(it, last) for _ in range(n))
>
> There's no need to keep a pointer to the start of the shared list at
> all - we only need a pointer to the end of the list ("last"), and each
> derived generator only needs a pointer to its own current position in
> the list ("mylast").
>
>
Things here remind me of my implementation design for PEP 555: the
"contexts" present in the process are represented by a singly-linked tree
of assignment objects. It's definitely possible to write the above in a
more readable way, and FWIW I don't think it involves "assignments as
expressions".


> What I find kind of hilarious is that it's no help at all as a
> prototype for a C implementation:  Python recycles stale `[next(it),
> None]` pairs all by itself, when their internal refcounts fall to 0.
> That's the hardest part.
>
>
Why can't the C implementation use Python refcounts? Are you talking about
standalone C code? Or perhaps you are thinking about overhead? (In PEP 555
that was not a concern, though). Surely it would make sense to reuse the
refcounting code that's already there. There are no cycles here, so it's
not particularly complicated -- just duplication.

Anyway, the whole linked list is unnecessary if the iterable can be
iterated over multiple times. But "tee" won't know when to do that. *That*
is what I call overhead (unless of course all the tee branches are consumed
in an interleaved manner).



> BTW, I certainly don't suggest adding this to the itertools docs
> either.  While it's short and elegant, it's too subtle to grasp easily
> - if you think "it's obvious", you haven't yet thought hard enough
> about the problem ;-)



-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +


Re: [Python-ideas] A cute Python implementation of itertools.tee

2018-04-15 Thread Chris Angelico
On Mon, Apr 16, 2018 at 6:46 AM, Koos Zevenhoven  wrote:
> Anyway, the whole linked list is unnecessary if the iterable can be iterated
> over multiple times. But "tee" won't know when to do that. *That* is what I
> call overhead (unless of course all the tee branches are consumed in an
> interleaved manner).

But if you have something you can iterate over multiple times, why
bother with tee at all? Just take N iterators from the underlying
iterable. The overhead is intrinsic to the value of the function.
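
A short sketch of the distinction, with made-up sample data: a list can
simply hand out independent iterators, whereas a one-shot generator needs
itertools.tee to be consumed by more than one reader.

```python
import itertools

data = [1, 2, 3]

# Re-iterable source: just ask for N independent iterators.
a, b = iter(data), iter(data)
pairs_from_list = (list(a), list(b))

# One-shot source: a generator can only be walked once, so tee has to
# buffer items until every branch has seen them.
squares = (x * x for x in data)
c, d = itertools.tee(squares, 2)
pairs_from_tee = (list(c), list(d))

print(pairs_from_list, pairs_from_tee)
```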

ChrisA


Re: [Python-ideas] A cute Python implementation of itertools.tee

2018-04-15 Thread Tim Peters
[Koos Zevenhoven ]
> It's definitely possible to write the above in a more
> readable way, and FWIW I don't think it involves "assignments as
> expressions".

Of course it is.  The point was brevity and speed, not readability.
It was presented partly as a puzzle :-)

>> What I find kind of hilarious is that it's no help at all as a
>> prototype for a C implementation:  Python recycles stale `[next(it),
>> None]` pairs all by itself, when their internal refcounts fall to 0.
>> That's the hardest part.

> Why can't the C implementation use Python refcounts? Are you talking about
> standalone C code?

Yes, expressing the algorithm in plain old C, not building on top of
(say) the Python C API.

> Or perhaps you are thinking about overhead?

Nope.


> (In PEP 555 that was not a concern, though). Surely it would make sense
> to reuse the refcounting code that's already there. There are no cycles
> here, so it's not particularly complicated -- just duplication.
>
> Anyway, the whole linked list is unnecessary if the iterable can be iterated
> over multiple times.

If the latter were how iterables always worked, there would be no need
for tee() at all.  It's tee's _purpose_ to make it possible for
multiple consumers to traverse an iterable's can't-restart-or-even
-go-back result sequence each at their own pace.


Re: [Python-ideas] Accepting multiple mappings as positional arguments to create dicts

2018-04-15 Thread Mike Miller

On 2018-04-12 18:03, Guido van Rossum wrote:
> It's a slippery slope indeed. While having to change update() alone wouldn't
> worry me, the subclass constructors do seem like they are going to want changing
> too, and that's indeed a bit much. So let's back off a bit. Not every three
> lines of code need a built-in shorthand.


This is disappointing since the dictionary is one of the most used but 
simultaneously limited of the builtin types.  It doesn't support a lot of 
operations that strings, lists, tuples, sets, etc do.  These are the little 
niceties that make Python fun to program in.  But, for some reason we're stingy 
when it comes to dictionaries, the foundation of the language.


Has anyone actually argued that the dict constructor shouldn't take multiple
arguments?

Also, it isn't always three lines of code; it grows with the number of
mappings that need to be merged.
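
For reference, a sketch of the spellings available today for merging
several mappings; the three sample dicts are made up. The loop is the
"three lines" form, and the unpacking form needs Python 3.5 or later.

```python
defaults = {"color": "red", "size": 1}
overrides = {"size": 2}
extra = {"shape": "round"}

# The "three lines of code" version: later mappings win on key clashes.
merged = {}
for mapping in (defaults, overrides, extra):
    merged.update(mapping)

# Dict unpacking gives the same result in a single expression, but every
# mapping has to be spelled out with its own ** prefix.
merged_unpacked = {**defaults, **overrides, **extra}

print(merged)
```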


My guess is that the dict is used an order of magnitude more than specialized 
subclasses, even more so now that the Ordered variant is unnecessary in newer 
versions.  It wouldn't bother me at all if it took a few years for the 
improvement to get rolled out to subclasses or never, it's quite a minor 
disappointment compared to getting the functionality ~90% of the time.


Also wouldn't mind helping out with the subclasses if there is some lifting that 
needed to be done.


-Mike


Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)

2018-04-15 Thread Brendan Barnwell

On 2018-04-15 08:58, Steven D'Aprano wrote:

> I've been mulling over the question why I think the expression needs to
> come first here, whereas I'm satisfied with the target coming first for
> assignment statements, and I think I've finally got the words to explain
> it. It is not just long familiarity with maths and languages that put
> the variable first (although that's also part of it). It has to do with
> what we're looking for when we read code, specifically what is the
> primary piece of information we're initially looking for.


Interesting.  I think your arguments are pretty reasonable overall.
But, for me, they just don't outweigh the fact that "->" is an ugly
assignment operator that looks nothing like the existing one, whereas
":=" is a less-ugly one that has the additional benefit of looking like
the existing one.  From your arguments I am convinced that putting the
expression first has some advantages, but they just don't seem as
important to me as they apparently do to you.


--
Brendan Barnwell
"Do not follow where the path may lead.  Go, instead, where there is no 
path, and leave a trail."

   --author unknown


[Python-ideas] collections.Counter should implement __mul__, __rmul__

2018-04-15 Thread Peter Norvig
For most types that implement __add__, `x + x` is equal to `2 * x`.

That is true for all numbers, list, tuple, str, timedelta, etc. -- but not
for collections.Counter. I can add two Counters, but I can't multiply one
by a scalar. That seems like an oversight.

It would be worthwhile to implement multiplication because, among other
reasons, Counters are a nice representation for discrete probability
distributions, for which multiplication is an even more fundamental
operation than addition.

Here's an implementation:

    def __mul__(self, scalar):
        "Multiply each entry by a scalar."
        result = Counter()
        for key in self:
            result[key] = self[key] * scalar
        return result

    def __rmul__(self, scalar):
        "Multiply each entry by a scalar."
        result = Counter()
        for key in self:
            result[key] = scalar * self[key]
        return result
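
A minimal sketch of how the two methods could be tried out today without changing the standard library is to attach them to a subclass. The class name `ScalableCounter` is hypothetical and not part of `collections`; the sample string is arbitrary.

```python
from collections import Counter


class ScalableCounter(Counter):
    """Hypothetical Counter subclass that supports multiplication by a scalar."""

    def __mul__(self, scalar):
        "Multiply each entry by a scalar."
        result = ScalableCounter()
        for key in self:
            result[key] = self[key] * scalar
        return result

    def __rmul__(self, scalar):
        "Multiply each entry by a scalar."
        result = ScalableCounter()
        for key in self:
            result[key] = scalar * self[key]
        return result


c = ScalableCounter("abracadabra")
doubled = 2 * c  # dispatched to __rmul__ because int does not know Counter
```

For a Counter with only positive counts, `doubled` compares equal to `c + c`, which is the identity the proposal wants to preserve.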


Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__

2018-04-15 Thread Wes Turner
Good call. Is it any faster to initialize Counter with a dict comprehension?

    return Counter({k: v*scalar for (k, v) in self.items()})



Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__

2018-04-15 Thread Peter Norvig
That's actually how I coded it myself the first time. But I worried it
would be wasteful to create an intermediate dict and discard it.

`timeit` results:

3.79 µs for the for-loop, 5.08 µs for the dict-comprehension with a 10-key
Counter
257 µs for the for-loop, 169 µs for the dict-comprehension with a 1000-key
Counter

So results are mixed, but you are probably right.
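
A rough sketch of how such a comparison could be reproduced with `timeit` follows. The helper names `scale_loop` and `scale_comprehension` and the test data are assumptions made for illustration; absolute timings will differ by machine and Python version.

```python
import timeit
from collections import Counter


def scale_loop(counter, scalar):
    """For-loop variant: fill an empty Counter key by key."""
    result = Counter()
    for key in counter:
        result[key] = counter[key] * scalar
    return result


def scale_comprehension(counter, scalar):
    """Dict-comprehension variant: build a dict, then wrap it in a Counter."""
    return Counter({k: v * scalar for k, v in counter.items()})


small = Counter({i: i + 1 for i in range(10)})
large = Counter({i: i + 1 for i in range(1000)})

runs = 1000
for name, counter in (("10-key", small), ("1000-key", large)):
    for func in (scale_loop, scale_comprehension):
        seconds = timeit.timeit(lambda: func(counter, 3), number=runs)
        print(f"{name} {func.__name__}: {seconds / runs * 1e6:.2f} us per call")
```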



Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__

2018-04-15 Thread Raymond Hettinger


> On Apr 15, 2018, at 2:05 PM, Peter Norvig  wrote:
> 
> For most types that implement __add__, `x + x` is equal to `2 * x`. 
> 
> ... 
> 
> 
> That is true for all numbers, list, tuple, str, timedelta, etc. -- but not 
> for collections.Counter. I can add two Counters, but I can't multiply one by 
> a scalar. That seems like an oversight. 

If you view the Counter as a sparse associative array of numeric values, it 
does seem like an oversight.  If you view the Counter as a Multiset or Bag, it 
doesn't make sense at all ;-)

From an implementation point of view, Counter is just a kind of dict that has 
a __missing__() method that returns zero.  That makes it trivially easy to 
subclass Counter to add new functionality or just use dictionary 
comprehensions for bulk updates.
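
As a small illustration of both points (variable names are arbitrary): a lookup of a missing key reads as zero without storing the key, and a dict comprehension handles a bulk rescale.

```python
from collections import Counter

c = Counter(a=3, b=1)

# A missing key reads as zero via __missing__, and the key is not stored.
missing = c["z"]
keys_after_lookup = sorted(c)

# Bulk update with a dict comprehension: scale every count by 2.
scaled = Counter({key: count * 2 for key, count in c.items()})
```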

>  
> 
> It would be worthwhile to implement multiplication because, among other 
> reasons, Counters are a nice representation for discrete probability 
> distributions, for which multiplication is an even more fundamental operation 
> than addition.

There is an open issue on this topic.  See:  https://bugs.python.org/issue25478

One stumbling point is that a number of commenters are fiercely opposed to 
non-integer uses of Counter. Also, some of the use cases (such as those found 
in Allen Downey's "Think Stats" and "Think Bayes" books) also need division and 
rescaling to a total (i.e. normalizing the total to 1.0) for a probability mass 
function.

If the idea were to go forward, it still isn't clear whether the correct API 
should be low level (__mul__ and __div__ and a "total" property) or higher 
level (such as a normalize() or rescale() method that produces a new Counter 
instance).  The low level approach has the advantage that it is simple to 
understand and that it feels like a logical extension of the __add__ and 
__sub__ methods.  The downside is that it doesn't really add any new capabilities 
(being just short-cuts for a simple dict comprehension or call to c.values()).  
And, it starts to feature creep the Counter class further away from its core 
mission of counting and ventures into the realm of generic sparse arrays with 
numeric values.  There is also a learnability/intelligibility issue in that 
__add__ and __sub__ correspond to "elementwise" operations while  __mul__ and 
__div__ would be "scalar broadcast" operations.

Peter, I'm really glad you chimed in.  My advocacy lacked sufficient weight to 
move this idea forward.


Raymond





Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__

2018-04-15 Thread Peter Norvig
If you think of a Counter as a multiset, then it should support __or__, not
__add__, right?

I do think it would have been fine if Counter did not support "+" at all
(and/or if Counter was limited to integer values). But  given where we are
now, it feels like we should preserve `c + c == 2 * c`.

As to the "doesn't really add any new capabilities" argument, that's true,
but it is also true for Counter as a whole: it doesn't add much over
defaultdict(int), but it is certainly convenient to have a standard way to
do what it does.

I agree with your intuition that low level is better. `total` would be
useful. If you have total and mul, then as you and others have pointed out,
normalize is just c *= 1/c.total.
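
Until something like `total` and `*` exists on Counter, the same normalization can be sketched with a dict comprehension. The helper name `normalize` is hypothetical, and `sum(counter.values())` stands in for the proposed `total`.

```python
from collections import Counter


def normalize(counter):
    """Return a new Counter whose values sum to 1.0 (hypothetical helper)."""
    total = sum(counter.values())
    return Counter({key: count / total for key, count in counter.items()})


pmf = normalize(Counter("aabbbc"))
```

Returning a new Counter rather than rescaling in place keeps the original counts available for further accumulation.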

I can also see the argument for a new FrequencyTable class in the
statistics module. (By the way, I refactored my
https://github.com/norvig/pytudes/blob/master/ipynb/Probability.ipynb a
bit, and now I no longer need a `normalize` function.)



Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__

2018-04-15 Thread Wes Turner
On Sunday, April 15, 2018, Peter Norvig  wrote:

> If you think of a Counter as a multiset, then it should support __or__,
> not __add__, right?
>
> I do think it would have been fine if Counter did not support "+" at all
> (and/or if Counter was limited to integer values). But  given where we are
> now, it feels like we should preserve `c + c == 2 * c`.
>
> As to the "doesn't really add any new capabilities" argument, that's
> true, but it is also true for Counter as a whole: it doesn't add much over
> defaultdict(int), but it is certainly convenient to have a standard way to
> do what it does.
>
> I agree with your intuition that low level is better. `total` would be
> useful. If you have total and mul, then as you and others have pointed out,
> normalize is just c *= 1/c.total.
>
> I can also see the argument for a new FrequencyTable class in the
> statistics module. (By the way, I refactored my https://github.com/norvig/
> pytudes/blob/master/ipynb/Probability.ipynb a bit, and now I no longer
> need a `normalize` function.)
>

nltk.probability.FreqDist(collections.Counter) doesn't have a __mul__ either
http://www.nltk.org/api/nltk.html#nltk.probability.FreqDist

numpy.unique(, return_counts=True).unique_counts returns an array sorted by
value with a __mul__.
https://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html

scipy.stats.itemfreq returns an array sorted by value with a __mul__ and
the items in the first column.
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.itemfreq.html

pandas.Series.value_counts(, normalize=False) returns a Series sorted by
descending frequency.
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.value_counts.html




Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__

2018-04-15 Thread Wes Turner
tf.bincount() returns a vector with integer counts.
https://www.tensorflow.org/api_docs/python/tf/bincount

Keras calls np.bincount in an mnist example.

np.bincount returns an array with a __mul__
https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.bincount.html

- sklearn.preprocessing.normalize

http://scikit-learn.org/stable/modules/preprocessing.html#preprocessing-normalization

http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.normalize.html


featuretools.primitives.NUnique has a normalize method.
https://docs.featuretools.com/generated/featuretools.primitives.NUnique.html#featuretools.primitives.NUnique

And I'm done sharing non-pure-python solutions for this problem, I promise


Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__

2018-04-15 Thread Raymond Hettinger


> On Apr 15, 2018, at 5:44 PM, Peter Norvig  wrote:
> 
> If you think of a Counter as a multiset, then it should support __or__, not 
> __add__, right?

FWIW, Counter is explicitly documented to support the four multiset-style 
mathematical operations discussed in Knuth TAOCP Volume II section 4.6.3 
exercise 19:

    >>> c = Counter(a=3, b=1)
    >>> d = Counter(a=1, b=2)
    >>> c + d   # add two counters together:  c[x] + d[x]
    Counter({'a': 4, 'b': 3})
    >>> c - d   # saturating subtraction (keeping only positive counts)
    Counter({'a': 2})
    >>> c & d   # intersection:  min(c[x], d[x])
    Counter({'a': 1, 'b': 1})
    >>> c | d   # union:  max(c[x], d[x])
    Counter({'a': 3, 'b': 2})

The wikipedia article on Multisets lists a further operation, inclusion, that 
is not currently supported:  
https://en.wikipedia.org/wiki/Multiset#Basic_properties_and_operations
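
For illustration only, multiset inclusion can be sketched as a standalone helper on top of the existing API. The function name `is_included` is hypothetical and nothing like it is being claimed for the Counter class discussed here.

```python
from collections import Counter


def is_included(small, big):
    """Hypothetical helper: True if every count in `small` is <= its count in `big`."""
    return all(count <= big[key] for key, count in small.items())


c = Counter(a=3, b=1)
d = Counter(a=1, b=2)
```

With these values, the intersection `c & d` is included in `c`, and `c` is included in the union `c | d`, but `c` is not included in `d`.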

> I do think it would have been fine if Counter did not support "+" at all 
> (and/or if Counter was limited to integer values). But  given where we are 
> now, it feels like we should preserve `c + c == 2 * c`. 

The + operation has legitimate use cases (it is perfectly reasonable to want to 
combine the results of two separate counts).  And, as you pointed out, it is what 
we already have and cannot change :-)

So, the API design issue that confronts us is that it would be a bit weird and 
disorienting for the arithmetic operators to have two different signatures:

    counter += counter
    counter -= counter
    counter *= scalar
    counter /= scalar

Also, we should respect the comments given by others on the tracker issue.  In 
particular, there is a preference to not have an in-place operation and only 
allow a new counter instance to be created.  That will help people avoid data 
structure modality problems:
    c[category] += 1   # Makes sense during the frequency counting or accumulation phase
    c /= c.total       # Convert to a probability mass function
    c[category] += 1   # This code looks correct but no longer makes any sense


> As to the "doesn't really add any new capabilities" argument, that's true, 
> but it is also true for Counter as a whole: it doesn't add much over 
> defaultdict(int), but it is certainly convenient to have a standard way to do 
> what it does.

IIRC, the defaultdict(int) in your first version triggered a bug because the 
model inadvertently changed during the analysis phase rather than being frozen 
after the training phase.  The Counter doesn't suffer from the same issue 
(modifying the dict on a failed lookup).  Also, the Counter class does have a 
few value added features:  Counter(iterable), c.most_common(), c.elements(), 
etc.   But yes, at its heart the counter is mostly just a specialized 
dictionary.  The thought I was trying to express is that suggestions to build 
out Counter API are a little less compelling when we already have a way to do 
it that is flexible, fast, clear, and standard (i.e. dict comprehensions).
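
The difference on a failed lookup can be seen in a short sketch (the key names are made up for the example):

```python
from collections import Counter, defaultdict

model_dd = defaultdict(int, {"spam": 2})
model_counter = Counter({"spam": 2})

# Looking up an unseen key in a defaultdict stores it as a side effect.
_ = model_dd["eggs"]
# The same lookup on a Counter returns 0 and leaves the Counter unchanged.
_ = model_counter["eggs"]

dd_keys = sorted(model_dd)
counter_keys = sorted(model_counter)
```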


> I agree with your intuition that low level is better. `total` would be 
> useful. If you have total and mul, then as you and others have pointed out, 
> normalize is just c *= 1/c.total.

I fully support adding some functionality for scaling to support probability 
distributions, bayesian update steps, chi-square tests, and whatnot.  The 
people who need convincing are the other respondents on the tracker.  They had 
a strong mental model for the Counter class that is somewhat at odds with this 
proposal.


Raymond





Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__

2018-04-15 Thread Raymond Hettinger

> On Apr 15, 2018, at 7:18 PM, Wes Turner  wrote:
> 
> And I'm done sharing non-pure-python solutions for this problem, I promise

Keep them coming :-)

Thanks for the research.  It helps to remind ourselves that almost none of our 
problems are new :-)


Raymond


Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__

2018-04-15 Thread Chris Angelico
On Mon, Apr 16, 2018 at 1:39 PM, Raymond Hettinger
 wrote:
>
> So, the API design issue that confronts us is that it would be a bit weird 
> and disorienting for the arithmetic operators to have two different 
> signatures:
>
>     counter += counter
>     counter -= counter
>     counter *= scalar
>     counter /= scalar
>

This needn't be a blocker. Strings can be added to strings, and
strings can be multiplied by integers. If it's of practical value to
multiply a Counter by a number, by all means do it.

ChrisA


Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__

2018-04-15 Thread Peter Norvig
On Sun, Apr 15, 2018 at 8:39 PM Raymond Hettinger <
raymond.hettin...@gmail.com> wrote:

> FWIW, Counter is explicitly documented to support the four multiset-style
> mathematical operations discussed in Knuth TAOCP Volume II section 4.6.3
> exercise 19:
>

Wow, I never noticed "&" and "|" -- I guess when I got to "Common patterns
for working with" in the documentation, I figured that there wouldn't be
any new methods introduced after that and I stopped reading.

>
> it would be a bit weird and disorienting for the arithmetic operators to
> have two different signatures:
>
>     counter += counter
>     counter -= counter
>     counter *= scalar
>     counter /= scalar
>

Is it weird and disorienting to have:

    str += str
    str *= int


Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__

2018-04-15 Thread Raymond Hettinger


> On Apr 15, 2018, at 9:04 PM, Peter Norvig  wrote:
> 
> it would be a bit weird and disorienting for the arithmetic operators to have 
> two different signatures:
> 
>     counter += counter
>     counter -= counter
>     counter *= scalar
>     counter /= scalar
> 
> Is it weird and disorienting to have:
> 
>     str += str
>     str *= int

Yes, there is a precedent that does seem to have worked out well in practice 
:-)  It isn't exactly parallel because strings aren't containers of numbers, 
they don't have & and |, and there isn't a reason to want a / operation, but it 
does suggest that signature variation might not be problematic.  

BTW, do you just want __mul__ and __rmul__?  If those went in, presumably there 
will be a request to support __imul__ because otherwise c*=3 would still work 
but would be inefficient (that was the rationale for adding inplace variants 
for all the current arithmetic operators). Likewise, presumably someone would 
legitimately want __div__ to support the normalization use case.  Perhaps less 
likely, there would also be a request for __floordiv__ to allow exactly 
scaled results to stay in the domain of integers.  Which if any of these makes 
sense to you?

Also, any thoughts on the cleanest way to express the computation of a 
chi-squared statistic (for example, to compare observed first digit frequencies 
to the frequencies predicted by Benford's Law)?  This isn't an arbitrary 
question (it came up when a professor first proposed a variant of this idea a 
few years ago).


Raymond


Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__

2018-04-15 Thread Peter Norvig
I don't have strong feelings, but I would say yes to __imul__, no to
__div__ and __floordiv__ (with str/list/tuple as the precedent).

For chisquare, I would be perfectly happy with:

digit_counts = Counter(...)
scipy.stats.chisquare(list(digit_counts.values()))
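
For the Benford's Law case specifically, the expected frequencies are not uniform, so a hedged pure-Python sketch of the statistic might look like the following. The observations are made-up sample data, and the variable names are illustrative; the same observed and expected values could presumably be passed to scipy.stats.chisquare as `f_obs` and `f_exp`.

```python
import math
from collections import Counter

# Hypothetical sample data: leading digits of some observed quantities.
observations = [1, 1, 1, 2, 1, 3, 2, 1, 4, 9, 1, 2, 5, 1, 7, 3, 1, 6, 2, 8]
digit_counts = Counter(observations)
n = sum(digit_counts.values())

# Benford's Law: P(d) = log10(1 + 1/d) for leading digit d in 1..9.
expected = {d: n * math.log10(1 + 1 / d) for d in range(1, 10)}

# Pearson's chi-squared statistic: sum of (observed - expected)^2 / expected.
chi_squared = sum(
    (digit_counts[d] - expected[d]) ** 2 / expected[d] for d in range(1, 10)
)
```

Indexing `digit_counts[d]` for every digit keeps the observed and expected values aligned and treats digits that never occurred as zero, which a bare `list(digit_counts.values())` would not do.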



Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__

2018-04-15 Thread Tim Peters
[Peter Norvig]
> For most types that implement __add__, `x + x` is equal to `2 * x`.
>
> That is true for all numbers, list, tuple, str, timedelta, etc. -- but not
> for collections.Counter. I can add two Counters, but I can't multiply one
> by a scalar. That seems like an oversight.
>
> ...
> Here's an implementation:
>
>     def __mul__(self, scalar):
>         "Multiply each entry by a scalar."
>         result = Counter()
>         for key in self:
>             result[key] = self[key] * scalar
>         return result
>
>     def __rmul__(self, scalar):
>         "Multiply each entry by a scalar."
>         result = Counter()
>         for key in self:
>             result[key] = scalar * self[key]
>         return result

Adding Counter * integer doesn't bother me a bit, but the definition
of what that should compute isn't obvious.  In particular, that
implementation doesn't preserve that `x+x == 2*x` if x has any
negative values:

>>> x = Counter(a=-1)
>>> x
Counter({'a': -1})
>>> x+x
Counter()

It would be strange if x+x != 2*x, and if x*-1 != -x:

>>> y = Counter(a=1)
>>> y
Counter({'a': 1})
>>> -y
Counter()

Etc.

Then again, it's already the case that, e.g., x-y isn't always the
same as x + -y:

>>> x = Counter(a=1)
>>> y = Counter(a=2)
>>> x - y
Counter()
>>> x + -y
Counter({'a': 1})

So screw obvious formal identities ;-)

I'm not clear on why "+" and "-" discard keys with values <= 0 to
begin with.  For "-" it's natural enough viewing "-" as being multiset
difference, but for "+"?  That's just made up ;-)

In any case, despite the oddities, I think your implementation would
be least surprising overall (ignore the sign of the resulting values).
At least for Counters that actually make sense as multisets (have no
values <= 0), and for a positive integer multiplier `n > 0`, it does
preserve that `x*n` = `x + x + ... + x` (with  `n` instances of `x`).
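
A minimal sketch of the behaviour described above, assuming the proposed
methods are placed on a subclass. ScalableCounter and repeated_sum are
names invented for this illustration; they are not part of collections.
It checks that `x*n` matches `n` repeated additions for a Counter with
only positive counts, and that the two disagree once a count is negative.

```python
from collections import Counter


class ScalableCounter(Counter):
    """Hypothetical Counter subclass with scalar multiplication."""

    def __mul__(self, scalar):
        # Scale every count; the sign of each result is kept as-is.
        result = ScalableCounter()
        for key in self:
            result[key] = self[key] * scalar
        return result

    # Assumes counts and scalar commute, which holds for ints and floats.
    __rmul__ = __mul__


def repeated_sum(counter, n):
    """Add `counter` to itself n times using Counter's own `+`."""
    total = Counter()
    for _ in range(n):
        total = total + counter
    return total


multiset = ScalableCounter(a=2, b=1)      # only positive counts
negative = ScalableCounter(a=-1)          # not a sensible multiset

# Multiplication and repeated addition agree for the multiset case.
assert multiset * 3 == repeated_sum(multiset, 3)

# With a negative count, `+` drops the key but `*` keeps it.
assert negative + negative == Counter()
assert negative * 2 == Counter(a=-2)
```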


Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__

2018-04-15 Thread Wes Turner
On Monday, April 16, 2018, Raymond Hettinger 
wrote:

>
>
> > On Apr 15, 2018, at 9:04 PM, Peter Norvig  wrote:
> >
> > it would be a bit weird and disorienting for the arithmetic operators to
> have two different signatures:
> >
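> >     <counter> += <counter>
> >     <counter> -= <counter>
> >     <counter> *= <scalar>
> >     <counter> /= <scalar>
> >
> > Is it weird and disorienting to have:
> >
> >     <str> += <str>
> >     <str> *= <scalar>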
>
> Yes, there is a precedent that does seem to have worked out well in
> practice :-)  It isn't exactly parallel because strings aren't containers
> of numbers, they don't have & and |, and there isn't a reason to want a /
> operation, but it does suggest that signature variation might not be
> problematic.
>
> BTW, do you just want __mul__ and __rmul__?  If those went in, presumably
> there will be a request to support __imul__ because otherwise c*=3 would
> still work but would be inefficient (that was the rationale for adding
> inplace variants for all the current arithmetic operators). Likewise,
> presumably someone would legitimately want __truediv__ to support the
> normalization use case.  Perhaps less likely, there would also be a
> request for __floordiv__ to allow exactly scaled results to stay in the
> domain of integers.  Which if any of these makes sense to you?
>
> Also, any thoughts on the cleanest way to express the computation of a
> chi-squared statistic (for example, to compare observed first digit
> frequencies to the frequencies predicted by Benford's Law)?  This isn't an
> arbitrary question (it came up when a professor first proposed a variant of
> this idea a few years ago).
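
One possible way to express that computation with a plain Counter, as a
sketch only. The helper names benford_expected and chi_squared are
invented here, and the sample values are made up purely to exercise the
code. The statistic is Pearson's sum of (observed - expected)**2 / expected
over the nine leading digits.

```python
import math
from collections import Counter


def benford_expected(total):
    """Expected first-digit counts under Benford's Law for `total` observations."""
    return {d: total * math.log10(1 + 1 / d) for d in range(1, 10)}


def chi_squared(observed, expected):
    """Pearson's statistic summed over the categories in `expected`.

    `observed` should be a Counter so that missing digits count as zero.
    """
    return sum((observed[k] - e) ** 2 / e for k, e in expected.items())


# Made-up sample values, used only to exercise the functions.
values = [1, 12, 130, 19, 2, 25, 3, 31, 4, 5, 6, 17, 8, 110, 9, 7, 21, 14]

# Tally the leading digit of each value.
observed = Counter(int(str(v)[0]) for v in values)
expected = benford_expected(sum(observed.values()))
statistic = chi_squared(observed, expected)
```

Comparing the statistic against a chi-squared distribution with 8 degrees
of freedom is left to a statistics library such as those linked below.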



https://en.wikipedia.org/wiki/Chi-squared_distribution
https://en.wikipedia.org/wiki/Chi-squared_test
https://en.wikipedia.org/wiki/Benford%27s_law
(How might one test this with e.g. *double* SHA256?)

proportions_chisquare(count, nobs, value=None)
https://www.statsmodels.org/dev/generated/statsmodels.stats.proportion.proportions_chisquare.html

https://www.statsmodels.org/dev/genindex.html?highlight=chi


scipy.stats.chisquare(f_obs, f_exp=None, ddof=0, axis=0)
https://docs.scipy.org/doc/scipy-0.18.1/reference/generated/scipy.stats.chisquare.html


sklearn.feature_selection.chi2(X, y)
http://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.chi2.html#sklearn.feature_selection.chi2

kernel_approximation.AdditiveChi2Sampler
kernel_approximation.SkewedChi2Sampler
http://scikit-learn.org/stable/modules/classes.html#module-sklearn.kernel_approximation
has

sklearn.metrics.pairwise.chi2_kernel(X, Y=None, gamma=1.0)
http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.chi2_kernel.html#sklearn.metrics.pairwise.chi2_kernel

sklearn.metrics.pairwise.additive_chi2_kernel(X, Y=None)
http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.additive_chi2_kernel.html#sklearn.metrics.pairwise.additive_chi2_kernel

...

FreqDist(collections.Counter(odict)) ... sparse-coding ... One-Hot /
Binarization
http://contrib.scikit-learn.org/categorical-encoding/


StandardScaler (for standardization) refuses to work with sparse matrices:
http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html#sklearn.preprocessing.StandardScaler



>
> Raymond


Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__

2018-04-15 Thread Raymond Hettinger


> On Apr 15, 2018, at 10:07 PM, Tim Peters  wrote:
> 
> Adding Counter * integer doesn't bother me a bit, but the definition
> of what that should compute isn't obvious.

Any thoughts on Counter * float?   A key use case for what is being proposed is:

c *= 1 / c.total
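
For comparison, a sketch of how that normalization can be written without
any new operators. The function name normalized is invented for this
example, and sum(c.values()) stands in for `c.total` so that the code does
not depend on a particular Python version. It returns a plain dict rather
than mutating the Counter.

```python
from collections import Counter


def normalized(counter):
    """Return a dict mapping each key to its share of the total count."""
    total = sum(counter.values())
    if total == 0:
        raise ValueError("cannot normalize a Counter whose counts sum to zero")
    return {key: count / total for key, count in counter.items()}


c = Counter("abracadabra")   # a=5, b=2, r=2, c=1, d=1
shares = normalized(c)
```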


Raymond



Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__

2018-04-15 Thread Tim Peters
[Tim]
>> Adding Counter * integer doesn't bother me a bit, but the definition
>> of what that should compute isn't obvious.


[Raymond]
> Any thoughts on Counter * float?   A key use case for what is being proposed 
> is:
>
> c *= 1 / c.total

Ah, I thought I had already addressed that, but looks like my fingers
forgot to type it ;-)

By all means, yes!  Indeed, that strengthens the "argument" for why
`Counter * int` should ignore the signs of the values - if we allow
multiplying by anything supporting __mul__, that clearly says we view
multiplication as being outside the "multiset" view, and so there's no
reason at all to suppress values <= 0.

I also have no problem with inplace operators.  Or with adding
`Counter /= scalar`, for that matter.

Perhaps whining could be reduced by rearranging the docs some:
clearly separate operations designed to support the multiset view from
the others.  Then "but that operation makes no sense for multisets!"
can be answered with "so don't use it on multisets - like the docs
told you" ;-)
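
To make the split concrete, here is a short sketch using only the existing
Counter API. The binary operators follow the multiset view and drop
results that are zero or negative, while the update() and subtract()
methods do plain arithmetic and keep whatever sign comes out.

```python
from collections import Counter

x = Counter(a=1)
y = Counter(a=2)

# Multiset-style operators: non-positive results are discarded.
difference = x - y        # multiset difference
union = x | y             # element-wise maximum
intersection = x & y      # element-wise minimum

# Plain arithmetic via methods: the negative count survives.
signed = Counter(x)
signed.subtract(y)

assert difference == Counter()
assert signed["a"] == -1
```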


Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__

2018-04-15 Thread Steve Barnes


On 16/04/2018 06:07, Tim Peters wrote:
> [Peter Norvig]
>> For most types that implement __add__, `x + x` is equal to `2 * x`.
>>
>> That is true for all numbers, list, tuple, str, timedelta, etc. -- but not
>> for collections.Counter. I can add two Counters, but I can't multiply one
>> by a scalar. That seems like an oversight.
>>
>> ...
>> Here's an implementation:
>>
>>     def __mul__(self, scalar):
>>         "Multiply each entry by a scalar."
>>         result = Counter()
>>         for key in self:
>>             result[key] = self[key] * scalar
>>         return result
>>
>>     def __rmul__(self, scalar):
>>         "Multiply each entry by a scalar."
>>         result = Counter()
>>         for key in self:
>>             result[key] = scalar * self[key]
>>         return result
> 
> Adding Counter * integer doesn't bother me a bit, but the definition
> of what that should compute isn't obvious.  In particular, that
> implementation doesn't preserve that `x+x == 2*x` if x has any
> negative values:
> 
> >>> x = Counter(a=-1)
> >>> x
> Counter({'a': -1})
> >>> x+x
> Counter()
> 
> It would be strange if x+x != 2*x, and if x*-1 != -x:
> 
> >>> y = Counter(a=1)
> >>> y
> Counter({'a': 1})
> >>> -y
> Counter()
> 
> Etc.
> 
> Then again, it's already the case that, e.g., x-y isn't always the
> same as x + -y:
> 
> >>> x = Counter(a=1)
> >>> y = Counter(a=2)
> >>> x - y
> Counter()
> >>> x + -y
> Counter({'a': 1})
> 
> So screw obvious formal identities ;-)
> 
> I'm not clear on why "+" and "-" discard keys with values <= 0 to
> begin with.  For "-" it's natural enough viewing "-" as being multiset
> difference, but for "+"?  That's just made up ;-)
> 
> In any case, despite the oddities, I think your implementation would
> be least surprising overall (ignore the sign of the resulting values).
> At least for Counters that actually make sense as multisets (have no
> values <= 0), and for a positive integer multiplier `n > 0`, it does
> preserve that `x*n` = `x + x + ... + x` (with  `n` instances of `x`).

Wouldn't it make sense to keep the current Counter behaviour (negative
counts not allowed) and also offer a counter that does allow negative
values (my bank doesn't seem to have a problem with my balance going
below zero), and possibly at the same time a counter class that
allows fractional counts?

Then:
>>> x = Counter(a=1)
>>> y = Counter(a=2)
>>> x - y
Counter()
>>> x + -y
Counter({'a': 1})
BUT:
>>> x = Counter(a=1, allow_negative=True)
>>> y = Counter(a=2, allow_negative=True)
>>> x - y
Counter({'a': -1})
>>> x + -y
Counter({'a': -1})
Likewise, for a Counter that is allowed to hold fractional counts,
some_counter / scalar would give (potentially) fractional results, while
one that is not would give floor-division results.
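
A rough sketch of what the negative-allowing variant might look like as a
separate class rather than a keyword argument. SignedCounter is a
hypothetical name, and this is only one way to get the behaviour shown in
the second example; it reuses the existing update() and subtract()
methods, which already keep negative counts.

```python
from collections import Counter


class SignedCounter(Counter):
    """Hypothetical Counter variant whose +, - and unary - keep any sign."""

    def __add__(self, other):
        if not isinstance(other, Counter):
            return NotImplemented
        result = SignedCounter(self)
        result.update(other)       # update() adds counts without dropping any
        return result

    def __sub__(self, other):
        if not isinstance(other, Counter):
            return NotImplemented
        result = SignedCounter(self)
        result.subtract(other)     # subtract() keeps negative results
        return result

    def __neg__(self):
        result = SignedCounter()
        for key, count in self.items():
            result[key] = -count
        return result


x = SignedCounter(a=1)
y = SignedCounter(a=2)

# With signs preserved, the usual identity holds again.
assert x - y == x + -y
```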

-- 
Steve (Gadget) Barnes
Any opinions in this message are my personal opinions and do not reflect 
those of my employer.




Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__

2018-04-15 Thread Raymond Hettinger


> On Apr 15, 2018, at 10:51 PM, Tim Peters  wrote:
> 
> I also have no problem with inplace operators.  Or with adding
> `Counter /= scalar`, for that matter.

But surely __rdiv__() would be over the top, harmonic means be damned ;-)


Raymond



