Re: [Python-ideas] Idea: Importing from arbitrary filenames
On 14 April 2018 at 19:22, Steve Barnes wrote:
> I generally love the current import system for "just working" regardless
> of platform, installation details, etc., but what I would like to see is
> a clear import local, (as opposed to import from wherever you can find
> something to satisfy mechanism). This is the one thing that I miss from
> C/C++ where #include <x> is system includes and #include "x" search
> differing include paths, (if used well).

For the latter purpose, we prefer that folks use either explicit relative imports (if they want to search the current package specifically), or else direct manipulation of package.__path__.

That is, if you do:

    from . import custom_imports # Definitely from your own project
    custom_imports.__path__[:] = (some_directory, some_other_directory)

then:

    from .custom_imports import name

will search those directories for packages & modules to import, while still cleanly mapping to a well-defined location in the module namespace for the process as a whole (and hence being able to use all the same caches as other imports, without causing name conflicts or other problems).

If you want to do this dynamically relative to the current module, then it's possible to do:

    global __path__
    __path__[:] = (some_directory, some_other_directory)
    custom_mod = importlib.import_module(".name", package=__name__)

The discoverability of these kinds of techniques could definitely stand to be improved, but the benefit of adopting them is that they work on all currently supported versions of Python (even importlib.import_module exists in Python 2.7 as a convenience wrapper around __import__), rather than needing to wait for new language level syntax for them.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Idea: Importing from arbitrary filenames
On 15 April 2018 at 17:12, Nick Coghlan wrote:
> If you want to do this dynamically relative to the current module,
> then it's possible to do:
>
>     global __path__
>     __path__[:] = (some_directory, some_other_directory)
>     custom_mod = importlib.import_module(".name", package=__name__)

Copy and paste error there: to make this work in non-package modules, drop the "[:]" from the __path__ assignment.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
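For readers who want to try the __path__ approach in isolation, here is a minimal sketch. It is an illustration only: the temporary plugin directory, the `name.py` file with its `VALUE` attribute, and the synthetic `custom_imports` package object are all assumptions made up for the demo (in a real project `custom_imports` would be an ordinary subpackage obtained via `from . import custom_imports`).

```python
import importlib
import sys
import tempfile
import types
from pathlib import Path

# Hypothetical directory containing a single module, name.py.
plugin_dir = Path(tempfile.mkdtemp())
(plugin_dir / "name.py").write_text("VALUE = 42\n")

# Stand-in for a real package: an empty module object with a __path__,
# registered under a well-defined name in sys.modules.
custom_imports = types.ModuleType("custom_imports")
custom_imports.__path__ = [str(plugin_dir)]  # directories searched for submodules
sys.modules["custom_imports"] = custom_imports

# The relative import resolves to "custom_imports.name" and is looked up
# only in the directories listed in custom_imports.__path__.
custom_mod = importlib.import_module(".name", package="custom_imports")
```

The imported module ends up cached in sys.modules as "custom_imports.name", which is the "well-defined location in the module namespace" the technique relies on.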
Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
On 15 April 2018 at 13:54, Chris Angelico wrote:
> On Sun, Apr 15, 2018 at 1:08 PM, Nick Coghlan wrote:
>> === Target first, 'from' keyword ===
>>
>> while (value from read_next_item()) is not None: # New
>>     ...
>>
>> Pros:
>>
>> * avoids the syntactic ambiguity of "as"
>> * being target first provides an obvious distinction from the "as" keyword
>> * typically reads nicely as pseudocode
>> * "from" is already associated with a namebinding operation ("from
>> module import name")
>>
>> Cons:
>>
>> * I'm sure we'll think of some more, but all I have so far is that
>> the association with name binding is relatively weak and would need to
>> be learned
>>
>
> Cons: Syntactic ambiguity with "raise exc from otherexc", probably not
> serious.

Ah, I forgot about that usage. The keyword usage is at least somewhat consistent, in that it's short for:

    _tmp = exc
    _tmp.__cause__ = otherexc
    raise _tmp

However, someone writing "raise (ExcType from otherexc)" could be confusing, since it would end up re-raising "otherexc" instead of wrapping it in a new ExcType instance. If "otherexc" was also an ExcType instance, that would be a *really* subtle bug to try and catch, so this would likely need the same kind of special casing as was proposed for "as" (i.e. prohibiting the top level parentheses).

I also agree with Nathan that if you hadn't encountered from expressions before, it would be reasonable to assume they were semantically comparable to "target = next(expr)" rather than just "target = expr".

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
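The existing "raise exc from otherexc" behaviour referred to above can be checked with a short sketch. The helper names `wrap` and `wrap_by_hand` are made up for illustration; the second function spells out the rough manual equivalent of the first.

```python
def wrap(otherexc):
    """Raise a new ValueError chained to otherexc and hand it back."""
    try:
        raise ValueError("wrapper") from otherexc
    except ValueError as exc:
        return exc


def wrap_by_hand(otherexc):
    """Rough manual equivalent of `raise ... from ...`."""
    _tmp = ValueError("wrapper")
    _tmp.__cause__ = otherexc  # also sets __suppress_context__ to True
    try:
        raise _tmp
    except ValueError as exc:
        return exc


original = KeyError("missing")
wrapped = wrap(original)
wrapped_by_hand = wrap_by_hand(original)
```

In both cases the new exception is the one raised, and the original is only attached as its __cause__.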
[Python-ideas] Rewriting file - pythonic way
Hi all,

I am new to Python (I am moving from the Perl world), but I have always loved Python for its high-level, beautiful and clean syntax.
Now I have a question/idea about working with files.
In my opinion this is a very popular use case:
1. Open file (for read and write)
2. Read data from file
3. Modify data.
4. Rewrite file with the modified data.

But right now it does not look so pythonic:

    with open(filename, 'r+') as file:
        data = file.read()
        data = data.replace('old', 'new')
        file.seek(0)
        file.write(data)
        file.truncate()

or something like this:

    with open(filename) as file:
        data = file.read()
    data = data.replace('old', 'new')
    with open(filename, 'w') as file:
        file.write(data)

I think the best way is something like this:

    with open(filename, 'r+') as file:
        data = file.read()
        data = data.replace('old', 'new')
        file.rewrite(data)

but for this io.BufferedIOBase must contain a rewrite method.

What do you think about this?
Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
2018-04-15 6:08 GMT+03:00 Nick Coghlan:
> It's not completely off topic, as it's due to the fact we use "," to
> separate both context managers and items in a tuple, so "with (cm1,
> cm2, cm3):" is currently legal syntax that means something quite
> different from "with cm1, cm2, cm3:". While using the parenthesised
> form is *pointless* (since it will blow up at runtime due to tuples
> not being context managers), the fact it's syntactically valid makes
> us more hesitant to add the special case around parentheses handling
> than we were for import statements. The relevance to PEP 572 is as a
> reminder that since we *really* don't like to add yet more different
> cases to "What do parentheses indicate in Python?"

Although "with (cm1, cm2, cm3):" is currently legal syntax, as you said (and as was also noted earlier in this thread) it is "pointless" in 99% of cases (in the context of the with statement) and will fail at runtime. Therefore, regardless of this PEP, maybe it would be fair to make it at least a `SyntaxWarning` (or `SyntaxError`)? Better to fail sooner than later.

> we should probably
> show similar hesitation when it comes to giving ":" yet another
> meaning.

Yes, `:` is used (as a symbol) in a lot of places (in fact there is not much in it), but in some places Python looks like a combination of words and colons.

> P.S. The pros and cons of the current syntax proposals, as I see them:
>
> === Expression first, 'as' keyword ===
>
> while (read_next_item() as value) is not None:
>     ...
>
> Pros:
>
> * typically reads nicely as pseudocode
> * "as" is already associated with namebinding operations

I understand that this list is subjective. But for me it is a huge PRO that the expression comes first.

> Cons:
>
> * syntactic ambiguity in with statement headers (major concern)
> * encourages a common misunderstanding of how with statements work
> (major concern)
> * visual similarity between "as" and "and" makes name bindings easy to
> miss
> * syntactic ambiguity in except clause headers theoretically exists,
> but is less of a concern due to the consistent type difference that
> makes the parenthesised form pointless

In reality, the first two points can be explained (if that is required at all). Misunderstanding is a consequence of a lack of experience. I don't understand the point about the visual similarity between "as" and "and"; can you elaborate on it a little bit more?

> === Expression first, '->' symbol ===
>
> while (read_next_item() -> value) is not None:
>     ...
>
> Pros:
>
> * avoids the syntactic ambiguity of "as"
> * "->" is used for name bindings in at least some other languages
> (but this is irrelevant to users for whom Python is their first, and
> perhaps only, programming language)

Same as before: the expression coming first is a huge PRO for me, and I'm sure for many others too. On the second point, I agree that it is somewhat irrelevant.

> Cons:
>
> * doesn't read like pseudocode (you need to interpret an arbitrary
> non-arithmetic symbol)

Here I disagree with you a bit. The most common form of assignment in formal pseudo-code is `name <- expr`. The second most common form, to my regret, is `:=`. The `<-` form is not possible in Python, and that is why `expr -> name` was suggested.

> * invites the question "Why doesn't this use the 'as' keyword?"

All forms invite this question :)))

> * symbols are typically harder to look up than keywords
> * symbols don't lend themselves to easy mnemonics
> * somewhat arbitrary repurposing of "->" compared to its use in
> function annotations

The last one is a major concern. I think that is why Guido is so skeptical about this form.

> === Target first, ':=' symbol ===
>
> while (value := read_next_item()) is not None:
>     ...
>
> Pros:
>
> * avoids the syntactic ambiguity of "as"
> * being target first provides an obvious distinction from the "as"
> keyword

For me it is a CON. Originally the rationale of this PEP was to reduce the number of unnecessary calculations and to provide a useful syntax to make a name binding in appropriate places. It should not, in any way, replace the usual existing `=` way to make a name binding. Therefore, as I see it, it is one of the design goals to make the syntax forms of `assignment statement` and `assignment expression` distinct, and `:=` does not help with this. This does not mean that this new syntax form should not be convenient, but it should be different from the usual `=` form.

> * ":=" is used for name bindings in at least some other languages
> (but this is irrelevant to users for whom Python is their first, and
> perhaps only, language)
>
> Cons:
>
> * symbols are typically harder to look up than keywords
> * symbols don't lend themselves to easy mnemonics
> * subject to a visual "line noise" phenomenon when combined with
> other uses of ":" as
Re: [Python-ideas] Rewriting file - pythonic way
15.04.18 11:57, Alexey Shrub wrote:
> I am new to Python (I am moving from the Perl world), but I have always
> loved Python for its high-level, beautiful and clean syntax.
> Now I have a question/idea about working with files.
> In my opinion this is a very popular use case:
> 1. Open file (for read and write)
> 2. Read data from file
> 3. Modify data.
> 4. Rewrite file with the modified data.
>
> But right now it does not look so pythonic:
>
>     with open(filename, 'r+') as file:
>         data = file.read()
>         data = data.replace('old', 'new')
>         file.seek(0)
>         file.write(data)
>         file.truncate()

What do you mean by calling this not pythonic?

> I think the best way is something like this:
>
>     with open(filename, 'r+') as file:
>         data = file.read()
>         data = data.replace('old', 'new')
>         file.rewrite(data)
>
> but for this io.BufferedIOBase must contain a rewrite method

If the problem is that you want to use a single line instead of three lines, you can add a function:

    def file_rewrite(file, data):
        file.seek(0)
        file.write(data)
        file.truncate()

and use it. This looks pretty pythonic to me.
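A small end-to-end sketch of the helper suggested above, run against a throwaway temporary file. The file contents and the replacement are invented for the demo; the new text is deliberately shorter than the old one, which is the case where the final truncate() call matters.

```python
import os
import tempfile


def file_rewrite(file, data):
    """Replace the whole content of an open 'r+' file with data."""
    file.seek(0)
    file.write(data)
    file.truncate()  # drop leftover bytes if the new data is shorter


# Throwaway file for the demonstration.
fd, filename = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("a very very long line")

with open(filename, "r+") as f:
    data = f.read()
    file_rewrite(f, data.replace("very ", ""))

with open(filename) as f:
    result = f.read()
os.remove(filename)
```

Without the truncate() call, the tail of the old, longer content would be left behind after the newly written text.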
Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
On Sun, Apr 15, 2018 at 12:19 PM, Kirill Balunov wrote:
> 2018-04-15 6:08 GMT+03:00 Nick Coghlan:
>>
>> P.S. The pros and cons of the current syntax proposals, as I see them:
>>
>> === Expression first, 'as' keyword ===
>>
>> while (read_next_item() as value) is not None:
>>     ...
>>
>> Pros:
>>
>> * typically reads nicely as pseudocode
>> * "as" is already associated with namebinding operations
>>
>
> I understand that this list is subjective. But as for me it will be huge PRO
> that the expression comes first.
>
[...]
>>
>> === Expression first, '->' symbol ===
>>
>> while (read_next_item() -> value) is not None:
>>     ...
>>
>> Pros:
>>
>> * avoids the syntactic ambiguity of "as"
>> * "->" is used for name bindings in at least some other languages
>> (but this is irrelevant to users for whom Python is their first, and
>> perhaps only, programming language)
[...]
>>
>> * invites the question "Why doesn't this use the 'as' keyword?"
>
> All forms invites this question :)))

Exactly, all forms invite this and other questions.

First of all, coming back to the original spelling choice arguments [sorry in advance if I've missed some points in this huge thread], a citation from the PEP:

"Differences from regular assignment statements" [...]
"Otherwise, the semantics of assignment are unchanged by this proposal."

So basically it's the same Python assignment? Then the obvious solution seems to be just to propose "=". But I see Chris has put this in the FAQ section:

"The syntactic similarity between ``if (x == y)`` and ``if (x = y)`` "

So IIUC, the *only* reason is to avoid the similarity of '==' and '='? If so, then it does not sound convincing at all. Of course Python does me a favor by showing an error when I make a typo like this:

    if (x = y)

But still, if this is the only real reason, it is not convincing. Syntactically seen, I feel strongly that normal '=' would be the way to go.

Just look at this:

    y = ((eggs := spam()), (cheese := eggs.method()))
    y = ((eggs = spam()), (cheese = eggs.method()))

The latter is so much cleaner, and already so common to any old or new Python user. And it does not raise the question of what this ":=" should really mean. (Or probably it should raise such a question?)

Given the fact that the PEP gives only quite edge-case usage examples, this should really be more convincing. And as a side note: I personally find the look of ":=" a bit 'noisy'.

Another point:

*Target first vs Expression first*
===

Well, this is nice indeed. Don't you find that first of all it must be decided what should be the *overall tendency for Python*? Now we have the common "x = a + b" everywhere. Then there are comprehensions (somewhat mixed direction) and "foo as bar" things. But wait, is the tendency to "give the freedom"? Then you should introduce something like "<--" in the first place so that we can write normal assignment in both directions. Or is the tendency to convert Python to "expression first" generally?

So if this question can be answered first, then I think it will be more constructive to discuss the choice of particular spellings.

Mikhail
Re: [Python-ideas] Rewriting file - pythonic way
On Sunday, 15 Apr 2018 at 12:40, Serhiy Storchaka wrote:
> If the problem is that you want to use a single line instead of three
> line, you can add a function

Yes, I think that a single line with the word 'rewrite' is much more readable than those three lines.
And yes, I can make my own function, but it is a typical task - maybe it should be in the standard library?
Re: [Python-ideas] A cute Python implementation of itertools.tee
On Sun, 15 Apr 2018 00:05:58 -0500 Tim Peters wrote:
> Just for fun - no complaint, no suggestion, just sharing a bit of code
> that tickled me.
>
> The docs for `itertools.tee()` contain a Python work-alike, which is
> easy to follow. It gives each derived generator its own deque, and
> when a new value is obtained from the original iterator it pushes that
> value onto each of those deques.
>
> Of course it's possible for them to share a single deque, but the code
> gets more complicated. Is it possible to make it simpler instead?
>
> What it "really" needs is a shared singly-linked list of values,
> pointing from oldest value to newest. Then each derived generator can
> just follow the links, and yield its next result in time independent
> of the number of derived generators. But how do you know when a new
> value needs to be obtained from the original iterator, and how do you
> know when space for an older value can be recycled (because all of the
> derived generators have yielded it)?
>
> I ended up with almost a full page of code to do that, storing with
> each value and link a count of the number of derived generators that
> had yet to yield the value, effectively coding my own reference-count
> scheme by hand, along with "head" and "tail" pointers to the ends of
> the linked list that proved remarkably tricky to keep correct in all
> cases.
>
> Then I thought "this is stupid! Python already does reference
> counting." Voila! Vast swaths of tedious code vanished, giving this
> remarkably simple implementation:

This implementation doesn't work with Python 3.7 or 3.8. I've tried it here:
https://gist.github.com/pitrou/b3991f638300edb6d06b5be23a4c66d6

and get:

Traceback (most recent call last):
  File "mytee.py", line 14, in gen
    mylast = last[1] = last = [next(it), None]
StopIteration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "mytee.py", line 47, in <module>
    run(mytee1)
  File "mytee.py", line 36, in run
    lists[i].append(next(iters[i]))
RuntimeError: generator raised StopIteration

(Yuck!)

In short, you want the following instead:

    try:
        mylast = last[1] = last = [next(it), None]
    except StopIteration:
        return

> def mytee(xs, n):
>     last = [None, None]
>
>     def gen(it, mylast):
>         nonlocal last
>         while True:
>             mylast = mylast[1]
>             if not mylast:
>                 mylast = last[1] = last = [next(it), None]

That's smart and obscure :-o

The way it works is that the `last` assignment changes the `last` value seen by all derived generators, while the `last[1]` assignment updates the bindings made in the other generators' `mylast` lists... It's difficult to find the words to explain it.

The chained assignment makes it more difficult to parse as well (when I read this I don't know if `last[1]` or `last` gets assigned first; apparently the answer is `last[1]`, otherwise the recipe wouldn't work correctly). Perhaps like this:

    while True:
        mylast = mylast[1]
        if not mylast:
            try:
                # Create new list link
                mylast = [next(it), None]
            except StopIteration:
                return
            else:
                # Append to other generators `mylast` linked lists
                last[1] = mylast
                # Update shared list link
                last = last[1]
        yield mylast[0]

Regards

Antoine.
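Putting the pieces of this message together, a complete version might look like the sketch below. Note the assumption: the quoted recipe is cut off in this message, so the last two lines of `mytee` (creating the shared iterator and returning the tuple of generators) are a guess at the missing tail rather than a quotation.

```python
def mytee(xs, n):
    # Shared tail of a singly-linked list of [value, next_link] cells.
    last = [None, None]

    def gen(it, mylast):
        nonlocal last
        while True:
            mylast = mylast[1]
            if not mylast:
                try:
                    # Create new list link
                    mylast = [next(it), None]
                except StopIteration:
                    # An explicit return avoids the RuntimeError shown above.
                    return
                # Append to the shared list, then advance the shared tail.
                last[1] = mylast
                last = mylast
            yield mylast[0]

    # Assumed ending (not present in the quoted excerpt).
    it = iter(xs)
    return tuple(gen(it, last) for _ in range(n))


a, b = mytee(range(5), 2)
first_two_of_a = [next(a), next(a)]  # a runs ahead of b
all_of_b = list(b)                   # b replays the links, then pulls new values
rest_of_a = list(a)
```

Cells that every derived generator has moved past are no longer referenced by anything, so ordinary reference counting reclaims them.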
Re: [Python-ideas] Rewriting file - pythonic way
15.04.18 12:49, Alexey Shrub wrote:
> On Sunday, 15 Apr 2018 at 12:40, Serhiy Storchaka wrote:
>> If the problem is that you want to use a single line instead of three
>> line, you can add a function
>
> Yes, I think that single line with word 'rewrite' is much more readable
> than those three lines.
> And yes, I can make my own function, but it is typical task - maybe it
> must be in standard library?

Not every three lines of code must be a function in the standard library. And these three lines don't look common enough.

Actually, reliable code should write into a separate file and replace the original file with the new file only if writing is successful. Or back up the old file and restore it if writing fails. Or do both. And handle hard and soft links if necessary. And use file locks if needed to prevent race conditions when the file is read/written by different processes. Depending on the specifics of the application you may need different code. Your three lines are enough for a one-time script if the risk of a power blackout or disk space exhaustion is insignificant or if the data is not critical.
Re: [Python-ideas] Rewriting file - pythonic way
This pitfall sounds like a good reason to have such a function in the standard library.

Elazar

On Sun, 15 Apr 2018 at 13:13, Serhiy Storchaka <storch...@gmail.com> wrote:
> 15.04.18 12:49, Alexey Shrub wrote:
> > On Sunday, 15 Apr 2018 at 12:40, Serhiy Storchaka wrote:
> >> If the problem is that you want to use a single line instead of three
> >> line, you can add a function
> >
> > Yes, I think that single line with word 'rewrite' is much more readable
> > than those three lines.
> > And yes, I can make my own function, but it is typical task - maybe it
> > must be in standard library?
>
> Not every three lines of code must be a function in the standard library.
> And these three lines don't look common enough.
>
> Actually, reliable code should write into a separate file and replace
> the original file with the new file only if writing is successful. Or
> back up the old file and restore it if writing fails. Or do both. And
> handle hard and soft links if necessary. And use file locks if needed to
> prevent race conditions when the file is read/written by different
> processes. Depending on the specifics of the application you may need
> different code. Your three lines are enough for a one-time script if the
> risk of a power blackout or disk space exhaustion is insignificant or if
> the data is not critical.
Re: [Python-ideas] Rewriting file - pythonic way
On 15 April 2018 at 10:49, Alexey Shrub wrote:
> On Sunday, 15 Apr 2018 at 12:40, Serhiy Storchaka wrote:
>>
>> If the problem is that you want to use a single line instead of three
>> line, you can add a function
>
> Yes, I think that single line with word 'rewrite' is much more readable than
> those three lines.
> And yes, I can make my own function, but it is typical task - maybe it must
> be in standard library?

I don't think it's *that* typical. I don't recall even having wanted to do this in all the time I've been using Python...

Paul
Re: [Python-ideas] Rewriting file - pythonic way
On 15 April 2018 at 11:22, Elazar wrote:
> On Sun, 15 Apr 2018 at 13:13, Serhiy Storchaka wrote:
>> Actually, reliable code should write into a separate file and replace
>> the original file with the new file only if writing is successful. Or
>> back up the old file and restore it if writing fails. Or do both. And
>> handle hard and soft links if necessary. And use file locks if needed to
>> prevent race conditions when the file is read/written by different
>> processes. Depending on the specifics of the application you may need
>> different code. Your three lines are enough for a one-time script if the
>> risk of a power blackout or disk space exhaustion is insignificant or if
>> the data is not critical.
>
> This pitfall sounds like a good reason to have such a function in the
> standard library.

It certainly sounds like a good reason for someone to write a "safe file rewrite" library function. But I don't think that it's such a common need that it needs to be a stdlib function. It may well even be the case that there's such a function already available on PyPI - has anyone actually checked?

And if there isn't, then writing a module and publishing it there would seem like a *very* good starting point - as well as allowing the developer to thrash out the best API, it would also provide for lots of testing in unusual scenarios that the developer may not have thought about (Windows file locking is very different from Unix, what is an atomic operation differs between platforms, error handling and retries may be something to consider, etc). The result would be a useful package, and the download and activity stats for it would be a great indication of whether it's a frequent enough need to justify including in core Python.

IMO, it probably isn't. I suspect that most uses would be fine with the quoted 3-liner, but very few people would need the sort of robustness that Serhiy is describing (and that level of robustness *would* be needed for a stdlib implementation). So PyPI is likely a better home for the "bulletproof" version, and 3 lines of code is a perfectly acceptable and Pythonic solution for people with simpler needs.

Paul
Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
On 15 April 2018 at 19:41, Mikhail V wrote:
> So IIUC, the *only* reason is to avoid '==' ad '=' similarity?
> If so, then it does not sound convincing at all.
> Of course Python does me a favor showing an error,
> when I make a typo like this:
> if (x = y)
>
> But still, if this is the only real reason, it is not convincing.

It's thoroughly convincing, because we're already familiar with the consequences of folks confusing "=" and "==" when writing C & C++ code. It's an eternal bug magnet, so it's not a design we're ever going to port over to Python.

(And that's even before we get into the parsing ambiguity problems that attempting to reuse "=" for this purpose in Python would introduce, especially when it comes to keyword arguments).

The only way Python will ever gain expression level name binding support is with a spelling *other than* "=", as when that's the proposed spelling, the answer will be an unequivocal "No, we're not adding that".

Even if the current discussion does come up with a potentially plausible spelling, the final determination on python-dev may *still* be "No, we're not going to add that". That isn't a predetermined answer though - it will depend on whether or not a proposal can be developed that threads the small gap between "this adds too much new cognitive overhead to reading and writing the language" and "while this does add more complexity to the base language, it provides sufficient compensation in allowing common ideas to be expressed more simply".

> Syntactically seen, I feel strong that normal '=' would be the way to go.
>
> Just look at this:
> y = ((eggs := spam()), (cheese := eggs.method()))
> y = ((eggs = spam()), (cheese = eggs.method()))
>
> The latter is so much cleaner, and already so common to any
> old or new Python user.

Consider how close the second syntax is to "y = f(eggs=spam(), cheese=fromage())", though.

> Given the fact that the PEP gives quite edge-case
> usage examples only, this should be really more convincing.

The examples in the PEP have been updated to better reflect some of the key motivating use cases (embedded assignments in if and while statement conditions, generator expressions, and container comprehensions).

> And as a side note: I personally find the look of ":=" a bit 'noisy'.

You're not alone in that, which is one of the reasons finding a keyword based option that's less syntactically ambiguous than "as" could be an attractive alternative.

> Another point:
>
> *Target first vs Expression first*
> ===
>
> Well, this is nice indeed. Don't you find that first of all it must be
> decided what should be the *overall tendency for Python*?
> Now we have common "x = a + b" everywhere. Then there
> are comprehensions (somewhat mixed direction) and
> "foo as bar" things.
> But wait, is the tendency to "give the freedom"? Then you should
> introduce something like "<--" in the first place so that we can
> write normal assignment in both directions.
> Or is the tendency to convert Python to the "expression first" generally?

There's no general tendency towards expression first syntax, nor towards offering flexibility in whether ordinary assignments are target first.

All the current cases where we use the "something as target" form are *not* direct equivalents to "target = something":

* "import dotted.modname as name": also prevents "dotted" getting bound in the current scope the way it normally would
* "from dotted import modname as name": also prevents "modname" getting bound in the current scope the way it normally would
* "except exc_filter as exc": binds the caught exception, not the exception filter
* "with cm as name": binds the result of __enter__ (which may be self), not the cm directly

Indeed, https://www.python.org/dev/peps/pep-0343/#motivation-and-summary points out that it's this "not an ordinary assignment" aspect that led to with statements using the "with cm as name:" structure in the first place - the original proposal in PEP 310 was for "with name = cm:" and ordinary assignment semantics.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
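Two of the "something as target" cases above are easy to check directly. In this sketch the `Managed` class and the variable names are invented for illustration; it only shows that the bound name is not the object written before "as".

```python
class Managed:
    """Hypothetical context manager whose __enter__ returns something else."""

    def __enter__(self):
        return "resource handle"

    def __exit__(self, exc_type, exc, tb):
        return False  # do not suppress exceptions


cm = Managed()
with cm as name:
    bound_by_with = name  # the result of cm.__enter__(), not cm itself

exc_filter = (KeyError, ValueError)
try:
    raise ValueError("boom")
except exc_filter as exc:
    bound_by_except = exc  # the caught instance, not the filter tuple
```

Neither binding is what a plain "name = cm" or "exc = exc_filter" assignment would have produced.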
Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
2018-04-15 12:41 GMT+03:00 Mikhail V:
> Exactly, all forms invites this and other questions.
>
> First of all, coming back to original spelling choice arguments
> [Sorry in advance if I've missed some points in this huge thread]
>
> citation from PEP:
> "Differences from regular assignment statements" [...]
> "Otherwise, the semantics of assignment are unchanged by this proposal."
>
> So basically it's the same Python assignment?
> Then obvious solution seems just to propose "=".
> But I see Chris have put this in FAQ section:
> "The syntactic similarity between ``if (x == y)`` and ``if (x = y)`` "

[OT] To be honest, I never liked the fact that `=` was used in various programming languages as assignment. But it became so common that I got used to it and even stopped taking a sedative :)

> So IIUC, the *only* reason is to avoid '==' ad '=' similarity?
> If so, then it does not sound convincing at all.
> Of course Python does me a favor showing an error,
> when I make a typo like this:
> if (x = y)
>
> But still, if this is the only real reason, it is not convincing.
> Syntactically seen, I feel strong that normal '=' would be the way to go.
>
> Just look at this:
> y = ((eggs := spam()), (cheese := eggs.method()))
> y = ((eggs = spam()), (cheese = eggs.method()))
>
> The latter is so much cleaner, and already so common to any
> old or new Python user. And does not raise a
> question what this ":=" should really mean.
> (Or probably it should raise such question?)
>
> Given the fact that the PEP gives quite edge-case
> usage examples only, this should be really more convincing.
> And as a side note: I personally find the look of ":=" a bit 'noisy'.

You are not alone. On the other hand, it is one of the strengths of Python that it does not allow such common and hard-to-find bugs. For me personally, `:=` looks and feels just like a normal assignment statement which can be used interchangeably, but in many more places in the code. And if the main goal of the PEP was to offer this `assignment expression` as a future replacement for the `assignment statement`, the `:=` syntax form would be a very reasonable proposal (of course in this case there would be a lot of other questions). But somehow this PEP does not mean that! And with the current rationale of this PEP it's a huge CON for me that `=` and `:=` feel and look the same.

> Another point:
>
> *Target first vs Expression first*
> ===
>
> Well, this is nice indeed. Don't you find that first of all it must be
> decided what should be the *overall tendency for Python*?
> Now we have common "x = a + b" everywhere. Then there
> are comprehensions (somewhat mixed direction) and
> "foo as bar" things.
> But wait, is the tendency to "give the freedom"? Then you should
> introduce something like "<--" in the first place so that we can
> write normal assignment in both directions.

As was noted previously, `<-` would not work because of the unary minus on the right:

    >>> x = 10
    >>> x <- 5
    False

> Or is the tendency to convert Python to the "expression first" generally?
>
> So if this question can be answered first, then I think it will be
> more constructive to discuss the choice of particular spellings.

If the idea of the whole PEP was to replace the `assignment statement` with an `assignment expression`, I would choose name first. If the idea was to offer an expression with a name-binding side effect, which can be used in the appropriate places, I would choose expression first.

With kind regards,
-gdg
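The `<-` point can be confirmed by asking the parser how it reads the expression. This is just a small sketch using the standard ast module; the variable names are arbitrary.

```python
import ast

x = 10
result = (x <- 5)  # already valid Python today: parsed as x < (-5)

# Inspect the parse tree of the same expression.
node = ast.parse("x <- 5", mode="eval").body
is_comparison = isinstance(node, ast.Compare)
operator_name = type(node.ops[0]).__name__
right_operand_name = type(node.comparators[0]).__name__
```

Because the existing grammar already gives `x <- 5` a meaning (a less-than comparison against a negated operand), reusing `<-` for binding would change the behaviour of valid code.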
Re: [Python-ideas] Rewriting file - pythonic way
On 15 April 2018 at 20:47, Paul Moore wrote:
> On 15 April 2018 at 11:22, Elazar wrote:
>> On Sun, 15 Apr 2018, 13:13, Serhiy Storchaka wrote:
>>> Actually the reliable code should write into a separate file and replace
>>> the original file by the new file only if writing is successful. Or
>>> backup the old file and restore it if writing is failed. Or do both. And
>>> handle hard and soft links if necessary. And use file locks if needed to
>>> prevent race condition when read/write by different processes. Depending
>>> on the specific of the application you may need different code. Your
>>> three lines are enough for a one-time script if the risk of a powerful
>>> blackout or disk space exhaustion is insignificant or if the data is not
>>> critical.
>>
>> This pitfall sounds like a good reason to have such a function in the
>> standard library.
>
> It certainly sounds like a good reason for someone to write a "safe
> file rewrite" library function. But I don't think that it's such a
> common need that it needs to be a stdlib function. It may well even be
> the case that there's such a function already available on PyPI - has
> anyone actually checked?

There wasn't last time I checked (which admittedly was several years ago now).

The issue is that it's painfully difficult to write a robust cross-platform "atomic rewrite" operation that can cleanly handle a wide range of arbitrary use cases - instead, folks are more likely to write simpler alternatives that work well enough given whichever simplifying assumptions are applicable to their use case (which may even include "I don't care about atomicity, and am quite happy to let a poorly timed Ctrl-C or unexpected system shutdown corrupt the file I'm rewriting").

https://bugs.python.org/issue8604#msg174104 is the relevant tracker discussion (deliberately linking into the middle of it, since the early part is akin to this thread: reactions mostly along the lines of "that's easy, and doesn't need to be in the standard library").

It definitely *isn't* easy, but it's also challenging to publish on PyPI: it's a quagmire of platform-specific complexity and edge cases, if you mess it up you can cause significant data loss, and anyone who already knows they need atomic rewrites is likely to be able to come up with their own purpose-specific implementation in less time than it would take them to assess the suitability of 3rd party alternatives.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
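For readers who want to see what one of the "simpler alternatives" mentioned above might look like, here is a minimal sketch of the write-to-a-separate-file-then-replace approach. The function name `rewrite_file` and its parameters are made up for illustration. It deliberately ignores the hard parts raised in the thread (ownership and permissions, hard and soft links, locking between processes), so treat it as an illustration of the idea rather than a robust implementation.

```python
import os
import tempfile


def rewrite_file(path, transform, encoding="utf-8"):
    """Replace the contents of `path` with transform(old_contents).

    Hypothetical helper: the new text is written to a temporary file in
    the same directory, and only swapped in once writing has succeeded.
    """
    with open(path, encoding=encoding) as src:
        new_text = transform(src.read())

    # Same directory, so the final rename stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding=encoding) as tmp:
            tmp.write(new_text)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Swap the new file into place (an atomic rename on POSIX).
        os.replace(tmp_path, path)
    except BaseException:
        # The original is still intact; drop the partial temp file.
        try:
            os.remove(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Because the file at `path` is replaced by a new one, its owner and mode may differ from the original's, which is one of the trade-offs discussed elsewhere in this thread.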
Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
On Sun, Apr 15, 2018 at 7:19 PM, Kirill Balunov wrote:
>> === Expression first, 'as' keyword ===
>>
>>     while (read_next_item() as value) is not None:
>>         ...
>>
>> Pros:
>>
>> * typically reads nicely as pseudocode
>> * "as" is already associated with namebinding operations
>>
>
> I understand that this list is subjective. But as for me it will be huge PRO
> that the expression comes first.

I don't think we're ever going to unify everyone on an arbitrary question of "expression first" or "name first". But to all the "expression first" people, a question: what if the target is not just a simple name?

    while (read_next_item() -> items[i + 1 -> i]) is not None:
        print("%d/%d..." % (i, len(items)), end="\r")

Does this make sense? With the target coming first, it perfectly parallels the existing form of assignment:

    >>> items = [None] * 10
    >>> i = -1
    >>> i, items[i] = i+1, input("> ")
    > asdf
    >>> i, items[i] = i+1, input("> ")
    > qwer
    >>> i, items[i] = i+1, input("> ")
    > zxcv
    >>> items
    ['asdf', 'qwer', 'zxcv', None, None, None, None, None, None, None]

The unpacking syntax is a bit messy, but with expression assignment, we can do this:

    >>> items = [None] * 10
    >>> i = -1
    >>> items[i := i + 1] = input("> ")
    > asdf
    >>> items[i := i + 1] = input("> ")
    > qwer
    >>> items[i := i + 1] = input("> ")
    > zxcv
    >>>
    >>> items
    ['asdf', 'qwer', 'zxcv', None, None, None, None, None, None, None]

Okay, it's not quite as simple as C's "items[i++]" (since you have to start i off at negative one so you can pre-increment), but it's still logical and sane. Are you as happy with that sort of complex expression coming after 'as' or '->'?

Not a rhetorical question. I'm genuinely curious as to whether people are expecting "expression -> NAME" or "expression -> TARGET", where TARGET can be any valid assignment target.

ChrisA
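The name-first session above can be tried as a script. The sketch below assumes Python 3.8 or later (the release in which `:=` was eventually adopted), and replaces `input("> ")` with a canned iterator, `answers`, which is a stand-in introduced here so the snippet runs unattended.

```python
items = [None] * 10
i = -1
answers = iter(["asdf", "qwer", "zxcv"])  # stand-in for input("> ")

for _ in range(3):
    # The walrus is parenthesised so that it parses on every release
    # that has the operator; it pre-increments i, then the slot is filled.
    items[(i := i + 1)] = next(answers)

print(items)
```

As in the session, `i` has to start at -1 because the increment happens before the subscript is used.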
Re: [Python-ideas] Rewriting file - pythonic way
On Sunday, 15 Apr 2018 at 2:40, Nick Coghlan wrote:
> https://bugs.python.org/issue8604#msg174104 is the relevant tracker
> discussion

Thanks all. I agree that a universal and absolutely safe solution is very difficult, but as an experiment I made a draft:
https://github.com/worldmind/scripts/tree/master/filerewrite

The main code is here:
https://github.com/worldmind/scripts/blob/master/filerewrite/filerewrite.py#L46
Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
2018-04-15 15:21 GMT+03:00 Chris Angelico:

> On Sun, Apr 15, 2018 at 7:19 PM, Kirill Balunov wrote:
> >> === Expression first, 'as' keyword ===
> >>
> >>     while (read_next_item() as value) is not None:
> >>         ...
> >>
> >> Pros:
> >>
> >> * typically reads nicely as pseudocode
> >> * "as" is already associated with namebinding operations
> >>
> >
> > I understand that this list is subjective. But as for me it will be huge PRO
> > that the expression comes first.
>
> I don't think we're ever going to unify everyone on an arbitrary
> question of "expression first" or "name first". But to all the
> "expression first" people, a question: what if the target is not just
> a simple name?
>
>     while (read_next_item() -> items[i + 1 -> i]) is not None:
>         print("%d/%d..." % (i, len(items)), end="\r")
>
> [...]
>
> Not a rhetorical question. I'm genuinely curious as to whether people
> are expecting "expression -> NAME" or "expression -> TARGET", where
> TARGET can be any valid assignment target.

I completely agree with you that it is impossible to unify everyone's opinion - we all have different backgrounds. But this example is more likely to play against this PEP. It packs extra complexity into one line, and it can fail hard in at least three obvious places :) I am against this usage no matter whether it is `name first` or `expression first`. But I will ask the question again with the following snippets. Which of these examples do you choose?

0.

    while (items[i := i+1] := read_next_item()) is not None:
        print(r'%d/%d' % (i, len(items)), end='\r')

1.

    while (read_next_item() -> items[(i+1) -> i]) is not None:
        print(r'%d/%d' % (i, len(items)), end='\r')

2.

    while (item := read_next_item()) is not None:
        items[i := (i+1)] = item
        print(r'%d/%d' % (i, len(items)), end='\r')

3.

    while (read_next_item() -> item) is not None:
        items[(i+1) -> i] = item
        print(r'%d/%d' % (i, len(items)), end='\r')

4.

    while (item := read_next_item()) is not None:
        i = i+1
        items[i] = item
        print(r'%d/%d' % (i, len(items)), end='\r')

5.

    while (read_next_item() -> item) is not None:
        i = i+1
        items[i] = item
        print(r'%d/%d' % (i, len(items)), end='\r')

I am definitely OK with both 2 and 3 here. But as was noted, `:=` produces additional noise in other places, and I am also an `expression first` guy :) So I still prefer variant 3 to 2. But to be completely honest, I would write it in the following way:

    for item in iter(read_next_item, None):
        items.append(item)
        print(r'%d/%d' % (i, len(items)), end='\r')

With kind regards,
-gdg
Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
2018-04-15 17:17 GMT+03:00 Kirill Balunov:

>     for item in iter(read_next_item, None):
>         items.append(item)
>         print(r'%d/%d' % (i, len(items)), end='\r')
>
> With kind regards,
> -gdg

Oh, I forgot about `i`:

    for item in iter(read_next_item, None):
        i += 1
        items.append(item)
        print(r'%d/%d' % (i, len(items)), end='\r')

With kind regards,
-gdg
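The two-argument form of `iter()` used above calls the function repeatedly until it returns the sentinel. A small runnable sketch follows; the `queue` data source and this particular `read_next_item()` are assumptions made up for the example, and `enumerate()` takes over the manual `i += 1`.

```python
from collections import deque

queue = deque(["a", "b", "c"])  # hypothetical data source


def read_next_item():
    """Return the next item, or None once the source is exhausted."""
    return queue.popleft() if queue else None


items = []
# iter(callable, sentinel) stops as soon as the callable returns None.
for i, item in enumerate(iter(read_next_item, None), start=1):
    items.append(item)
    print("%d/%d" % (i, len(items)), end="\r")
```

Note that this only works when the sentinel (`None` here) can never be a legitimate item.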
Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
> To me, "from" strongly suggests that an element is being obtained from a container/collection of
> elements. This is how I conceptualize "from module import name": "name" refers to an object
> INSIDE the module, not the module itself. If I saw
>
>     if (match from pattern.search(data)) is not None: ...
>
> I would guess that it is equivalent to
>
>     m = next(pattern.search(data))
>     if m is not None: ...

+1, although unpacking seems reasonable: `[elem1, *elems] from contains`.

Now we have

- "expr as name"
- "name := expr"
- "expr -> name"
- "name from expr"

Personally I prefer "as", but I think that without a big change to Python's Grammar file it's impossible to avoid parsing "with expr as name" as "with (expr as name)", because "expr as name" is itself an "expr". I have mentioned this in previous discussions, and it seems worth warning you all again. I don't think the people on Python-Dev are willing to implement a totally new Python compiler.
Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
> 0.
>
>     while (items[i := i+1] := read_next_item()) is not None:
>         print(r'%d/%d' % (i, len(items)), end='\r')
>
> 1.
>
>     while (read_next_item() -> items[(i+1) -> i]) is not None:
>         print(r'%d/%d' % (i, len(items)), end='\r')
>
> 2.
>
>     while (item := read_next_item()) is not None:
>         items[i := (i+1)] = item
>         print(r'%d/%d' % (i, len(items)), end='\r')
>
> 3.
>
>     while (read_next_item() -> item) is not None:
>         items[(i+1) -> i] = item
>         print(r'%d/%d' % (i, len(items)), end='\r')
>
> 4.
>
>     while (item := read_next_item()) is not None:
>         i = i+1
>         items[i] = item
>         print(r'%d/%d' % (i, len(items)), end='\r')
>
> 5.
>
>     while (read_next_item() -> item) is not None:
>         i = i+1
>         items[i] = item
>         print(r'%d/%d' % (i, len(items)), end='\r')

Also 2 or 3. The 3rd one follows the order of natural language: "while getting the next item and assigning it to `item`, if it's not None, do some stuff."

However, as we have pointed out, the semantics of '->' here are quite different from the cases where it's currently used, so it should be handled much more carefully.

I think maybe we can use Unicode characters like ≜ (\triangleq) and add support for Unicode completion to the Python REPL. Unicode completion in editors and IDEs is already quite mature.
Re: [Python-ideas] Rewriting file - pythonic way
On Sun, Apr 15, 2018 at 05:15:55PM +0300, Alexey Shrub wrote:
> On Sunday, 15 Apr 2018 at 2:40, Nick Coghlan wrote:
> > https://bugs.python.org/issue8604#msg174104 is the relevant tracker
> > discussion
>
> Thanks all, I agree that universal and absolutly safe solution is very
> difficult, but for experiment I made some draft
> https://github.com/worldmind/scripts/tree/master/filerewrite

   Good!

> main code here
> https://github.com/worldmind/scripts/blob/master/filerewrite/filerewrite.py#L46

   Can I recommend catching exceptions in `backuper.backup()`, cleaning up the backuper and unlocking the locker?

Oleg.
--
     Oleg Broytman            http://phdru.name/            p...@phdru.name
           Programmers don't die, they just GOSUB without RETURN.
Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
On Sun, Apr 15, 2018 at 10:21:02PM +1000, Chris Angelico wrote:

> I don't think we're ever going to unify everyone on an arbitrary
> question of "expression first" or "name first". But to all the
> "expression first" people, a question: what if the target is not just
> a simple name?
>
>     while (read_next_item() -> items[i + 1 -> i]) is not None:
>         print("%d/%d..." % (i, len(items)), end="\r")

I don't see why it would make a difference. It doesn't to me.

> Does this make sense? With the target coming first, it perfectly
> parallels the existing form of assignment:

Yes, except this isn't ordinary assignment-as-a-statement.

I've been mulling over the question why I think the expression needs to come first here, whereas I'm satisfied with the target coming first for assignment statements, and I think I've finally got the words to explain it. It is not just long familiarity with maths and languages that put the variable first (although that's also part of it). It has to do with what we're looking for when we read code, specifically what is the primary piece of information we're initially looking for.

In assignment STATEMENTS the primary piece of information is the target. Yes, of course the value assigned to the target is important, but often we don't care what the value is, at least not at first. We're hunting for a known target, and only when we find it do we care about the value it gets.

A typical scenario: I'm reading a function, and I scan down the block looking at the start of each line until I find the variable I want:

    spam = don't care
    eggs = don't care
    self.method(don't care)
    cheese = ...  <<< HERE IT IS

so it actually helps to have the name up front. Copying standard maths notation for assignment (variable first, value second) is a good thing for statements.

With assignment-statements, if you're scanning the code for a variable name, you're necessarily interested in the name and it will be helpful to have it on the left.

But with assignment-expressions, there's an additional circumstance: sometimes you don't care about the name, you only care what the value is. (I expect this will be more common.) The name is just something to skip over when you're scanning the code looking for the value.

    # what did I pass as the fifth argument to the function?
    result = some_func(don't care, spam := don't care, eggs := don't care, self.method(don't care), cheese := HERE IT IS, ...)

Of course it's hard counting commas so it's probably better to add a bit of structure to your function call:

    result = some_func(don't care,
                       spam := don't care,
                       eggs := don't care,
                       self.method(don't care),
                       cheese := HERE IT IS,
                       ...)

But this time we don't care about the name. It's the value we care about:

    result = some_func(don't care,
                       don't care -> don't care
                       don't care -> don't care
                       don't care(don't care),
                       HERE IT IS ,
                       ...)

The target is just one more thing you have to ignore, and it is helpful to have expression first and the target second.

Some more examples:

    # what am I adding to the total?
    total += don't care := expression

    # what key am I looking up?
    print(mapping[don't care := key])

    # how many items did I just skip?
    self.skip(don't care := obj.start + extra)

versus

    total += expression -> don't care
    print(mapping[key -> don't care])
    self.skip(obj.start + extra -> don't care)

It is appropriate for assignment statements and expressions to be written differently because they are used differently.

[...]

>     >>> items = [None] * 10
>     >>> i = -1
>     >>> items[i := i + 1] = input("> ")
>     > asdf
>     >>> items[i := i + 1] = input("> ")
>     > qwer
>     >>> items[i := i + 1] = input("> ")
>     > zxcv
>     >>>
>     >>> items
>     ['asdf', 'qwer', 'zxcv', None, None, None, None, None, None, None]

I don't know why you would write that instead of:

    items = [None]*10
    for i in range(3):
        items[i] = input("> ")

or even for that matter:

    items = [input("> ") for i in range(3)] + [None]*7

but whatever floats your boat. (Python isn't just not Java. It's also not C *wink*)

> Are you as happy with that sort of complex
> expression coming after 'as' or '->'?

Sure. Ignoring the output of the calls to input():

    items = [None] * 10
    i = -1
    items[i + 1 -> i] = input("> ")
    items[i + 1 -> i] = input("> ")
    items[i + 1 -> i] = input("> ")

which isn't really such a complex target. How about this instead?

    obj = SimpleNamespace(spam=None, eggs=None,
                          aardvark={'key': [None, None, -1]}
                          )
    items[obj.aardvark['key'][2] + 1 -> obj.aardvark['key'][2]] = input("> ")

versus:

    items[obj.aardvark['key'][2] := obj.aardvark['key'][2] +
Re: [Python-ideas] Rewriting file - pythonic way
Depending on how firm your requirements around locking are, you may find this code useful:
https://github.com/mahmoud/boltons/blob/6b0721b6aeda6d3ec6f5d31be7c741bc7fcc4635/boltons/fileutils.py#L303

(docs here: http://boltons.readthedocs.io/en/latest/fileutils.html#atomic-file-saving )

Basically every operating system has _some_ way of doing an atomic file replacement, letting us guarantee that a file at a given location is always valid. atomic_save provides a unified interface to that cross-platform behavior.

The code does not do locking, as neither I nor its other users have wanted it, but I'd be happy to extend it if there's a sensible default.

On Sun, Apr 15, 2018 at 8:19 AM, Oleg Broytman wrote:

> On Sun, Apr 15, 2018 at 05:15:55PM +0300, Alexey Shrub wrote:
> > On Sunday, 15 Apr 2018 at 2:40, Nick Coghlan wrote:
> > > https://bugs.python.org/issue8604#msg174104 is the relevant tracker
> > > discussion
> >
> > Thanks all, I agree that universal and absolutly safe solution is very
> > difficult, but for experiment I made some draft
> > https://github.com/worldmind/scripts/tree/master/filerewrite
>
>    Good!
>
> > main code here
> > https://github.com/worldmind/scripts/blob/master/filerewrite/filerewrite.py#L46
>
>    Can I recommend to catch exceptions in `backuper.backup()`,
> cleanup backuper and unlock locker?
>
> Oleg.
> --
>      Oleg Broytman            http://phdru.name/            p...@phdru.name
>            Programmers don't die, they just GOSUB without RETURN.
Re: [Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
[Raymond Hettinger]
> Q. Do other languages do it?
> A. Numpy, no. R, no. APL, no. Mathematica, no. Haskell, yes.
>
> * http://docs.scipy.org/doc/numpy/reference/generated/numpy.ufunc.accumulate.html
> * https://stat.ethz.ch/R-manual/R-devel/library/base/html/cumsum.html
> * http://microapl.com/apl/apl_concepts_chapter5.html
>       \+ 1 2 3 4 5
>       1 3 6 10 15
> * https://reference.wolfram.com/language/ref/Accumulate.html
> * https://www.haskell.org/hoogle/?hoogle=mapAccumL

There's also C++, which is pretty much "yes" to every variation discussed so far:

* partial_sum() is like Python's current accumulate(), including defaulting to doing addition.

  http://en.cppreference.com/w/cpp/algorithm/partial_sum

* inclusive_scan() is also like accumulate(), but allows an optional "init" argument (which is returned if specified), and there's no guarantee of "left-to-right" evaluation (it's intended for associative binary functions, and wants to allow parallelism in the implementation).

  http://en.cppreference.com/w/cpp/algorithm/inclusive_scan

* exclusive_scan() is like inclusive_scan(), but _requires_ an "init" argument (which is not returned).

  http://en.cppreference.com/w/cpp/algorithm/exclusive_scan

* accumulate() is like Python's functools.reduce(), but the operation is optional and defaults to addition, and an "init" argument is required.

  http://en.cppreference.com/w/cpp/algorithm/accumulate
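To make the "init is returned" variant concrete, here is a small sketch of how a start value can be emulated with the existing `itertools.accumulate()` by chaining it in front of the input. The helper name `accumulate_from` is made up for this example; it behaves roughly like the inclusive-scan-with-init case above, with strict left-to-right evaluation.

```python
import operator
from itertools import accumulate, chain


def accumulate_from(start, iterable, func=operator.add):
    """Running results that begin with `start`, which is itself yielded first."""
    return accumulate(chain([start], iterable), func)


print(list(accumulate([1, 2, 3, 4, 5])))      # [1, 3, 6, 10, 15]
print(list(accumulate_from(100, [1, 2, 3])))  # [100, 101, 103, 106]
```

Python 3.8 later added an `initial=` keyword argument to `accumulate()` that covers the same case without the `chain()` workaround.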
Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
On Sun, Apr 15, 2018 at 11:11:37PM +0800, Thautwarm Zhao wrote:

> I think maybe we can use unicode characters like ≜ (\triangleq) and add the
> support of unicode completion to python repl. The unicode completion of
> editors or ides has been quite mature.

What key combination do I need to type to get ≜ in the following editors please? I tried typing \triangleq but all I got was \triangleq.

    Notepad (Windows)
    Brackets (Mac)
    BBEdit (Mac)
    kwrite (Linux)
    kate
    nano
    geany
    gedit

as well as IDLE, my mail client (kmail, Thunderbird or mutt), my web browsers (Firefox, Opera and Chromium), the interactive interpreter in various different consoles, my Usenet client (Pan and KNode) and IRC (pidgin).

Oh, having it work in LibreOffice and GoogleApps too would be nice, although not essential since I don't often write code in them.

And what decent fonts do I need to install for ≜ to show up as something other than a square box ("missing glyph")?

--
Steve
Re: [Python-ideas] Rewriting file - pythonic way
On Sun, Apr 15, 2018 at 09:10:57AM -0700, Mahmoud Hashemi wrote:
> Depending on how firm your requirements around locking are, you may find
> this code useful:
> https://github.com/mahmoud/boltons/blob/6b0721b6aeda6d3ec6f5d31be7c741bc7fcc4635/boltons/fileutils.py#L303
>
> (docs here:
> http://boltons.readthedocs.io/en/latest/fileutils.html#atomic-file-saving )
>
> Basically every operating system has _some_ way of doing an atomic file
> replacement, letting us guarantee that a file at a given location is always
> valid. atomic_save provides a unified interface to that cross-platform
> behavior.
>
> The code does not do locking, as neither I nor its other users have wanted
> it, but I'd be happy to extend it if there's a sensible default.

   I don't like that it renames the file at the end. Renaming can change the file's ownership and permissions; restoring permissions is not always possible, and restoring ownership is almost never possible. Renaming is also not always possible, because of restricted directory permissions.

> On Sun, Apr 15, 2018 at 8:19 AM, Oleg Broytman wrote:
>
> > On Sun, Apr 15, 2018 at 05:15:55PM +0300, Alexey Shrub wrote:
> > > On Sunday, 15 Apr 2018 at 2:40, Nick Coghlan wrote:
> > > > https://bugs.python.org/issue8604#msg174104 is the relevant tracker
> > > > discussion
> > >
> > > Thanks all, I agree that universal and absolutly safe solution is very
> > > difficult, but for experiment I made some draft
> > > https://github.com/worldmind/scripts/tree/master/filerewrite
> >
> >    Good!
> >
> > > main code here
> > > https://github.com/worldmind/scripts/blob/master/filerewrite/filerewrite.py#L46
> >
> >    Can I recommend to catch exceptions in `backuper.backup()`,
> > cleanup backuper and unlock locker?

Oleg.
--
     Oleg Broytman            http://phdru.name/            p...@phdru.name
           Programmers don't die, they just GOSUB without RETURN.
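For comparison, the opposite trade-off (no rename, so ownership and permissions stay as they are) might be sketched like this: copy the old contents to a backup, overwrite the original file in place, and restore from the backup if writing fails. The name `rewrite_in_place` and the ".bak" suffix are assumptions for illustration only. This variant is not atomic: a crash in the middle of the write leaves a partially written file, with the backup alongside it.

```python
import os
import shutil


def rewrite_in_place(path, transform, encoding="utf-8"):
    """Overwrite `path` with transform(old_contents) without renaming it."""
    backup = path + ".bak"  # hypothetical backup naming scheme
    with open(path, encoding=encoding) as src:
        new_text = transform(src.read())

    shutil.copyfile(path, backup)  # copies the contents only
    try:
        # "r+" reuses the existing file, so its owner and mode are untouched.
        with open(path, "r+", encoding=encoding) as dst:
            dst.write(new_text)
            dst.truncate()  # cut off any leftover tail of the old contents
    except BaseException:
        shutil.copyfile(backup, path)  # put the old contents back in place
        os.remove(backup)
        raise
    else:
        os.remove(backup)
```

If the restore itself fails, the backup file is left behind so that the old contents are not lost.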
Re: [Python-ideas] A cute Python implementation of itertools.tee
[Antoine Pitrou]
> This implementation doesn't work with Python 3.7 or 3.8.
> I've tried it here:
> https://gist.github.com/pitrou/b3991f638300edb6d06b5be23a4c66d6
>
> and get:
> Traceback (most recent call last):
>   File "mytee.py", line 14, in gen
>     mylast = last[1] = last = [next(it), None]
> StopIteration
>
> The above exception was the direct cause of the following exception:
>
> Traceback (most recent call last):
>   File "mytee.py", line 47, in
>     run(mytee1)
>   File "mytee.py", line 36, in run
>     lists[i].append(next(iters[i]))
> RuntimeError: generator raised StopIteration
>
> (Yuck!)

Thanks for trying! I wonder whether that will break other code. I wrote PEP 255, and this part was intentional at the time:

"""
If an unhandled exception-- including, but not limited to, StopIteration --is raised by, OR PASSES THROUGH [emphasis added], a generator function, then the exception is passed on to the caller in the usual way, and subsequent attempts to resume the generator function raise StopIteration.
"""

I've exploited that a number of times.

> In short, you want the following instead:
>
>     try:
>         mylast = last[1] = last = [next(it), None]
>     except StopIteration:
>         return

No, I don't ;-) If I have to catch StopIteration myself now, then I want the entire "while True:" loop in the "try" block. Setting up try/except machinery anew on each iteration would add significant overhead; doing it just once per derived generator wouldn't.

>> def mytee(xs, n):
>>     last = [None, None]
>>
>>     def gen(it, mylast):
>>         nonlocal last
>>         while True:
>>             mylast = mylast[1]
>>             if not mylast:
>>                 mylast = last[1] = last = [next(it), None]

> That's smart and obscure :-o
> The way it works is that the `last` assignment changes the `last` value
> seen by all derived generators, while the `last[1]` assignment updates
> the bindings made in the other generators' `mylast` lists... It's
> difficult to find the words to explain it.

Which is why I didn't even try - I did warn people that if they thought it "was obvious", they hadn't yet thought hard enough ;-) Good job!

> The chained assignment makes it more difficult to parse as well (when I
> read this I don't know if `last[1]` or `last` gets assigned first;
> apparently the answer is `last[1]`, otherwise the recipe wouldn't work
> correctly).

Ya, I had to look it up too :-) Although, like almost everything else in Python, chained assignments proceed "left to right". I was just trying to make it as short as possible, to increase the "huh - can something that tiny really work?!" healthy skepticism factor :-)

> Perhaps like this:
>
>     while True:
>         mylast = mylast[1]
>         if not mylast:
>             try:
>                 # Create new list link
>                 mylast = [next(it), None]
>             except StopIteration:
>                 return
>             else:
>                 # Append to other generators `mylast` linked lists
>                 last[1] = mylast
>                 # Update shared list link
>                 last = last[1]
>         yield mylast[0]

I certainly agree that's easier to follow. But that wasn't really the point ;-)
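Putting the pieces of this exchange together, a runnable sketch of the recipe with the whole `while True:` loop inside a single `try` block (the arrangement preferred above, which also works once generators no longer let StopIteration pass through) could look as follows. The final `return tuple(...)` line is not shown in the quoted excerpt, so it is an assumption about how the generators are handed back.

```python
def mytee(xs, n):
    # Shared tail of a singly linked list of [value, next_link] cells.
    last = [None, None]

    def gen(it, mylast):
        nonlocal last
        try:
            while True:
                mylast = mylast[1]
                if not mylast:
                    # Chained assignment runs left to right: bind mylast,
                    # link the old tail to the new cell, then move the tail.
                    mylast = last[1] = last = [next(it), None]
                yield mylast[0]
        except StopIteration:
            # The source iterator is exhausted; end this generator cleanly.
            return

    it = iter(xs)
    # Assumed ending: hand back n generators that all start at the same cell.
    return tuple(gen(it, last) for _ in range(n))


a, b = mytee(range(5), 2)
print(next(a), next(a), list(b), list(a))
```

Each derived generator either follows links that another generator has already created or, when it reaches the tail, pulls one more value from the shared iterator and appends a new cell.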
Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
2018-04-15 18:58 GMT+03:00 Steven D'Aprano:

> [...]
>
> But this time we don't care about the name. Its the value we care about:
>
>     result = some_func(don't care,
>                        don't care -> don't care
>                        don't care -> don't care
>                        don't care(don't care),
>                        HERE IT IS ,
>                        ...)

This made my day! :) The programming style when you absolutely don't care :))) I understand that this is a typo, but it turned out to be very funny.

In general, I agree with everything you've said. And I think you have found a very good way to explain why the expression should go first in an assignment expression.

With kind regards,
-gdg
Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
On Sun, Apr 15, 2018 at 4:05 AM, Kirill Balunov wrote:

> [...] For me personally, `:=` looks and feels just like normal assignment
> statement which can be used interchangeable but in many more places in the
> code. And if the main goal of the PEP was to offer this `assignment
> expression` as a future replacement for `assignment statement` the `:=`
> syntax form would be the very reasonable proposal (of course in this case
> there will be a lot more other questions).

I haven't kept up with what's in the PEP (or much of this thread), but this is the key reason I strongly prefer := as inline assignment operator.

> But somehow this PEP does not mean it! And with the current rationale of
> this PEP it's a huge CON for me that `=` and `:=` feel and look the same.

Then maybe the PEP needs to be updated.

--
--Guido van Rossum (python.org/~guido)
Re: [Python-ideas] A cute Python implementation of itertools.tee
15.04.18 19:52, Tim Peters wrote:
> No, I don't ;-) If I have to catch StopIteration myself now, then I
> want the entire "while True:" loop in the "try" block. Setting up
> try/except machinery anew on each iteration would add significant
> overhead; doing it just once per derived generator wouldn't.

This overhead is around 10% of the time for calling `next(it)`. It may be less than 1-2% of the whole step of a mytee iteration.

I have ideas about implementing zero-overhead try/except, but I doubt that it is worth it. The benefit seems too small.
Re: [Python-ideas] Idea: Importing from arbitrary filenames
On 15/04/2018 08:12, Nick Coghlan wrote:
> On 14 April 2018 at 19:22, Steve Barnes wrote:
>> I generally love the current import system for "just working" regardless
>> of platform, installation details, etc., but what I would like to see is
>> a clear import local, (as opposed to import from wherever you can find
>> something to satisfy mechanism). This is the one thing that I miss from
>> C/C++ where #include <x> is system includes and #include "x" search
>> differing include paths, (if used well).
>
> For the latter purpose, we prefer that folks use either explicit
> relative imports (if they want to search the current package
> specifically), or else direct manipulation of package.__path__.
>
> That is, if you do:
>
>     from . import custom_imports # Definitely from your own project
>     custom_imports.__path__[:] = (some_directory, some_other_directory)
>
> then:
>
>     from .custom_imports import name
>
> will search those directories for packages & modules to import, while
> still cleanly mapping to a well-defined location in the module
> namespace for the process as a whole (and hence being able to use all
> the same caches as other imports, without causing name conflicts or
> other problems).
>
> If you want to do this dynamically relative to the current module,
> then it's possible to do:
>
>     global __path__
>     __path__[:] = (some_directory, some_other_directory)
>     custom_mod = importlib.import_module(".name", package=__name__)
>
> The discoverability of these kinds of techniques could definitely
> stand to be improved, but the benefit of adopting them is that they
> work on all currently supported versions of Python (even
> importlib.import_module exists in Python 2.7 as a convenience wrapper
> around __import__), rather than needing to wait for new language level
> syntax for them.
>
> Cheers,
> Nick.
>

Thanks Nick,

As you say, not too discoverable at the moment - I have just reread PEP 328 & https://docs.python.org/3/library/importlib.html but did not find any mention of these mechanisms, or even that setting an external __path__ variable existed as a possibility.

Maybe a documentation enhancement proposal would be in order?

--
Steve (Gadget) Barnes
Any opinions in this message are my personal opinions and do not reflect those of my employer.
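Since the technique quoted above is hard to discover, here is a self-contained sketch of it. It builds a throwaway package in a temporary directory, redirects the package's `__path__` to a second directory, and imports a module from there under the package's namespace. The directory name `elsewhere` and the module contents are invented for the demonstration; `custom_imports` and `name` follow the names used in the quoted message.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway layout (never cleaned up here, to keep the sketch short):
#   <root>/custom_imports/__init__.py   an empty package
#   <root>/elsewhere/name.py            a module living outside the package
root = Path(tempfile.mkdtemp())
pkg_dir = root / "custom_imports"
extra_dir = root / "elsewhere"  # hypothetical external directory
pkg_dir.mkdir()
extra_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(extra_dir / "name.py").write_text("VALUE = 42\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

custom_imports = importlib.import_module("custom_imports")
# Point the package's submodule search path at the external directory.
custom_imports.__path__[:] = [str(extra_dir)]

# The module is found in `elsewhere` but lives at custom_imports.name.
name = importlib.import_module(".name", package="custom_imports")
print(name.__name__, name.VALUE)
```

The imported module is cached in `sys.modules` under `custom_imports.name`, which is the "well-defined location in the module namespace" that the quoted message refers to.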
Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
Dear Steve, I'm sorry to annoy you with my proposal, but I do think using Unicode might be wise at the current stage.

\triangleq can be written as the Unicode character \u225c, and adding plugins to support typing it in editors could be easy: simply map \xxx to the specific Unicode character when Tab is pressed after typing it.

People using the Julia language are proud of this, but I think it is just a convenience that could be used in any other language.

There are other reasons to support Unicode, but they are outside this topic.

Although ':=' and '->' are not perfect, within the range of ASCII it seems impossible to find a better one.

On Mon, 16 Apr 2018 at 12:53 AM, wrote:
> Today's Topics:
>
>    1. Re: Rewriting file - pythonic way (Mahmoud Hashemi)
>    2. Re: Start argument for itertools.accumulate() [Was: Proposal:
>       A Reduce-Map Comprehension and a "last" builtin] (Tim Peters)
>    3. Re: Spelling of Assignment Expressions PEP 572 (was post #4)
>       (Steven D'Aprano)
>    4. Re: Rewriting file - pythonic way (Oleg Broytman)
>    5.
Re: A cute Python implementation of itertools.tee (Tim Peters) > > > > -- Forwarded message -- > From: Mahmoud Hashemi > To: python-ideas > Cc: > Bcc: > Date: Sun, 15 Apr 2018 09:10:57 -0700 > Subject: Re: [Python-ideas] Rewriting file - pythonic way > Depending on how firm your requirements around locking are, you may find > this code useful: > https://github.com/mahmoud/boltons/blob/6b0721b6aeda6d3ec6f5d31be7c741bc7fcc4635/boltons/fileutils.py#L303 > > (docs here: > http://boltons.readthedocs.io/en/latest/fileutils.html#atomic-file-saving > ) > > Basically every operating system has _some_ way of doing an atomic file > replacement, letting us guarantee that a file at a given location is always > valid. atomic_save provides a unified interface to that cross-platform > behavior. > > The code does not do locking, as neither I nor its other users have wanted > it, but I'd be happy to extend it if there's a sensible default. > > On Sun, Apr 15, 2018 at 8:19 AM, Oleg Broytman wrote: > >> On Sun, Apr 15, 2018 at 05:15:55PM +0300, Alexey Shrub >> wrote: >> > В Воскресенье, 15 апр. 2018 в 2:40 , Nick Coghlan >> > написал: >> > > https://bugs.python.org/issue8604#msg174104 is the relevant tracker >> > > discussion >> > >> > Thanks all, I agree that universal and absolutly safe solution is very >> > difficult, but for experiment I made some draft >> > https://github.com/worldmind/scripts/tree/master/filerewrite >> >>Good! >> >> > main code here >> > >> https://github.com/worldmind/scripts/blob/master/filerewrite/filerewrite.py#L46 >> >>Can I recommend to catch exceptions in `backuper.backup()`, >> cleanup backuper and unlock locker? >> >> Oleg. >> -- >> Oleg Broytmanhttp://phdru.name/ >> p...@phdru.name >>Programmers don't die, they just GOSUB without RETURN. 
>> ___ >> Python-ideas mailing list >> Python-ideas@python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > > > > -- Forwarded message -- > From: Tim Peters > To: Raymond Hettinger > Cc: Python-Ideas > Bcc: > Date: Sun, 15 Apr 2018 11:15:18 -0500 > Subject: Re: [Python-ideas] Start argument for itertools.accumulate() > [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin] > [Raymond Hettinger ] > > Q. Do other languages do it? > > A. Numpy, no. R, no. APL, no. Mathematica, no. Haskell, yes. > > > > * > http://docs.scipy.org/doc/numpy/reference/generated/numpy.ufunc.accumulate.html > > * > https://stat.ethz.ch/R-manual/R-devel/library/base/html/cumsum.html > > * http://microapl.com/apl/apl_concepts_chapter5.html > > \+ 1 2 3 4 5 > > 1 3 6 10 15 > > * https://reference.wolfram.com/language/ref/Accumulate.html > > * https://www.haskell.org/hoogle/?hoogle=mapAccumL > > There's also C++, which is pretty much "yes" to every variation > discussed so far: > > * partial_sum() is like Python's current accumulate(), including > defaulting to doing addition. > > http://en.cppreference.com/w/cpp/algorithm/partial_sum > > * inclusive_scan() is also like accumulate(), but allows an optional > "init" argument (which is returned if specified), and there's no > guarantee of "left-to-right" evaluation (it's intended for associative > binary functions, and wants to allow parallelism in the > implementation). >
Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
On Sun, Apr 15, 2018 at 7:15 PM, Steven D'Aprano wrote: > On Sun, Apr 15, 2018 at 11:11:37PM +0800, Thautwarm Zhao wrote: > >> I think maybe we can use unicode characters like ≜ (\triangleq) and add the >> support of unicode completion to python repl. The unicode completion of >> editors or ides has been quite mature. > > What key combination do I need to type to get ≜ in the following editors > please? I tried typing \triangleq but all I got was \triangleq. > > Notepad (Windows) > Brackets (Mac) > BBEdit (Mac) > kwrite (Linux) > kate > nano > geany > gedit > > as well as IDLE, my mail client (kmail, Thunderbird or mutt), my web > browsers (Firefox, Opera and Chromium), the interactive interpreter in > various different consoles, my Usenet client (Pan and KNode) and IRC > (pidgin). > > Oh, having it work in LibreOffice and GoogleApps too would be nice, > although not essential since I don't often write code in them. Typing should not be a problem generally. There are a lot of 3d-party apps which can bind a key to specific char input, system-wide. On windows I use Autohotkey. But no 100% guarantee of course for any editor. > And what decent fonts do I need to install for ≜ to show up as something > other than a square box ("missing glyph")? Well, here it is way less optimistic :) The chances to see that "delta equal to" sign in some random font / random app is not so big. It's only if you have fonts fallback system setup, and by default on my windows it seems to work only in Firefox browser. Mikhail ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Rewriting file - pythonic way
Hi,

Something similar already exists in the standard library: https://docs.python.org/3/library/fileinput.html

fileinput.input(..., inplace=True)

BR, George

2018-04-15 10:57 GMT+02:00 Alexey Shrub :
> Hi all,
>
> I am new to Python (I am moving from the Perl world), but I have always loved
> Python for its high-level, beautiful and clean syntax.
> Now I have a question/idea about working with files.
> In my opinion this is a very common use case:
> 1. Open file (for read and write)
> 2. Read data from file
> 3. Modify data.
> 4. Rewrite file with the modified data.
>
> But right now it does not look very pythonic:
>
> with open(filename, 'r+') as file:
>     data = file.read()
>     data = data.replace('old', 'new')
>     file.seek(0)
>     file.write(data)
>     file.truncate()
>
> or something like this
>
> with open(filename) as file:
>     data = file.read()
> data = data.replace('old', 'new')
> with open(filename, 'w') as file:
>     file.write(data)
>
> I think the best way would be something like this
>
> with open(filename, 'r+') as file:
>     data = file.read()
>     data = data.replace('old', 'new')
>     file.rewrite(data)
>
> but for this io.BufferedIOBase would need to have a rewrite method
>
> What do you think about this?
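For reference, a small sketch of the `fileinput` approach George points to. The file name and contents are made up for the example. Note that this rewrites the file line by line through a redirected stdout; it does not by itself give the locking or atomic-replace guarantees discussed elsewhere in the thread.

```python
import fileinput
import tempfile
from pathlib import Path

# Hypothetical file to rewrite.
path = Path(tempfile.mkdtemp()) / "example.txt"
path.write_text("old value\nanother old line\n")

# With inplace=True, fileinput moves the original aside and redirects
# sys.stdout into a new file of the same name, so print() writes the
# replacement content.
with fileinput.input(files=[str(path)], inplace=True) as lines:
    for line in lines:
        print(line.replace("old", "new"), end="")

result = path.read_text()
```

After the `with` block, stdout is restored and `result` holds the rewritten text.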
Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
On Mon, Apr 16, 2018 at 1:58 AM, Steven D'Aprano wrote: > On Sun, Apr 15, 2018 at 10:21:02PM +1000, Chris Angelico wrote: > >> I don't think we're ever going to unify everyone on an arbitrary >> question of "expression first" or "name first". But to all the >> "expression first" people, a question: what if the target is not just >> a simple name? >> >> while (read_next_item() -> items[i + 1 -> i]) is not None: >> print("%d/%d..." % (i, len(items)), end="\r") > > I don't see why it would make a difference. It doesn't to me. Okay, that's good. I just hear people saying "name" a lot, but that would imply restricting the grammar to just a name, and I don't know how comfortable people are with more complex targets. >> Does this make sense? With the target coming first, it perfectly >> parallels the existing form of assignment: > > Yes, except this isn't ordinary assignment-as-a-statement. > > I've been mulling over the question why I think the expression needs to > come first here, whereas I'm satisfied with the target coming first for > assignment statements, and I think I've finally got the words to explain > it. It is not just long familiarity with maths and languages that put > the variable first (although that's also part of it). It has to do with > what we're looking for when we read code, specifically what is the > primary piece of information we're initially looking for. > > In assignment STATEMENTS the primary piece of information is the target. > Yes, of course the value assigned to the target is important, but often > we don't care what the value is, at least not at first. We're hunting > for a known target, and only when we find it do we care about the value > it gets. > [chomp details] > It is appropriate for assignment statements and expressions to be > written differently because they are used differently. 
I don't know that assignment expressions are inherently going to be used in ways where you ignore the assignment part and care only about the expression part. And I disagree that assignment statements are used primarily the way you say. Frequently I skim down a column of assignments, caring primarily about the functions being called, and looking at the part before the equals sign only when I come across a parameter in another call; the important part of the line is what it's doing, not where it's stashing its result. > [...] >> >>> items = [None] * 10 >> >>> i = -1 >> >>> items[i := i + 1] = input("> ") >> > asdf >> >>> items[i := i + 1] = input("> ") >> > qwer >> >>> items[i := i + 1] = input("> ") >> > zxcv >> >>> >> >>> items >> ['asdf', 'qwer', 'zxcv', None, None, None, None, None, None, None] > > > I don't know why you would write that instead of: > > items = [None]*10 > for i in range(3): > items[i] = input("> ") > > > or even for that matter: > > items = [input("> ") for i in range(3)] + [None]*7 > > > but whatever floats your boat. (Python isn't just not Java. It's also > not C *wink*) You and Kirill have both fallen into the trap of taking the example too far. By completely rewriting it, you destroy its value as an example. Write me a better example of a complex target if you like, but the question is about how you feel about complex assignment targets, NOT how you go about creating a particular list in memory. That part is utterly irrelevant. >> Are you as happy with that sort of complex >> expression coming after 'as' or '->'? > > Sure. Ignoring the output of the calls to input(): The calls to input were in a while loop's header for a reason. Ignoring them is ignoring the point of assignment expressions. ChrisA ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
On Mon, Apr 16, 2018 at 12:17 AM, Kirill Balunov wrote:
> 2018-04-15 15:21 GMT+03:00 Chris Angelico :
>> I don't think we're ever going to unify everyone on an arbitrary
>> question of "expression first" or "name first". But to all the
>> "expression first" people, a question: what if the target is not just
>> a simple name?
>>
>> while (read_next_item() -> items[i + 1 -> i]) is not None:
>>     print("%d/%d..." % (i, len(items)), end="\r")
>
> I completely agree with you that it is impossible to unify everyone's opinion
> - we all have different backgrounds. But this example is more likely to play
> against this PEP. This is extra complexity within one line and it can
> fail hard in at least three obvious places :) And I am against this usage no
> matter `name first` or `expression first`. But I will re-ask this with the
> following snippets. Which of these examples would you choose:
>
> 0.
>
> while (items[i := i+1] := read_next_item()) is not None:
>     print(r'%d/%d' % (i, len(items)), end='\r')
>
> 1.
>
> while (read_next_item() -> items[(i+1) -> i]) is not None:
>     print(r'%d/%d' % (i, len(items)), end='\r')

These two are matching what I wrote, and are thus the two forms under consideration. I notice that you added parentheses to the second one; is there a clarity problem here and you're unsure whether "i + 1 -> i" would capture "i + 1" or "1"? If so, that's a downside to the proposal.

> 2.
>
> while (item := read_next_item()) is not None:
>     items[i := (i+1)] = item
>     print(r'%d/%d' % (i, len(items)), end='\r')
>
> 3.
>
> while (read_next_item() -> item) is not None:
>     items[(i+1) -> i] = item
>     print(r'%d/%d' % (i, len(items)), end='\r')
>
> 4.
>
> while (item := read_next_item()) is not None:
>     i = i+1
>     items[i] = item
>     print(r'%d/%d' % (i, len(items)), end='\r')
>
> 5.
>
> while (read_next_item() -> item) is not None:
>     i = i+1
>     items[i] = item
>     print(r'%d/%d' % (i, len(items)), end='\r')

All of these are fundamentally different from what I'm asking: they are NOT all expressions that can be used in the while header. So it doesn't answer the question of "expression first" or "target first". Once the expression gets broken out like this, you're right back to using "expression -> NAME" or "NAME := expression", and it's the same sort of simple example that people have been discussing all along.

> I am definitely Ok with both 2 and 3 here. But as it was noted `:=` produces
> additional noise in other places and I am also an `expression first` guy :)
> So I still prefer variant 3 to 2. But to be completely honest, I would write
> it in the following way:
>
> for item in iter(read_next_item, None):
>     items.append(item)
>     print(r'%d/%d' % (i, len(items)), end='\r')

And that's semantically different in several ways. Not exactly a fair comparison. I invite you to write up a better example with a complex target.

ChrisA
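For readers comparing the simple-name variants above, here is a runnable sketch of the `NAME := expression` loop next to the two-argument `iter()` form. It assumes Python 3.8 or later, where PEP 572 shipped with the `:=` spelling and a plain name as the only permitted target, so the complex-target forms numbered 0 and 1 remain hypothetical syntax. `make_reader` is an invented stand-in for `read_next_item`.

```python
def make_reader(values):
    """Return a hypothetical read_next_item() that yields None when exhausted."""
    iterator = iter(values)
    return lambda: next(iterator, None)


def drain_with_walrus(read_next_item):
    items = []
    # Assignment expression in the loop header: bind, then test.
    while (item := read_next_item()) is not None:
        items.append(item)
    return items


def drain_with_iter(read_next_item):
    # Two-argument iter(): call read_next_item() until it returns the sentinel.
    return list(iter(read_next_item, None))


print(drain_with_walrus(make_reader(["a", "b", "c"])))
print(drain_with_iter(make_reader(["a", "b", "c"])))
```

Both loops stop at the first `None`; they differ only in where the name binding is spelled.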
Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
On Mon, Apr 16, 2018 at 3:19 AM, Guido van Rossum wrote: > On Sun, Apr 15, 2018 at 4:05 AM, Kirill Balunov > wrote: >> But somehow this PEP does not mean it! And with the current rationale of >> this PEP it's a huge CON for me that `=` and `:=` feel and look the same. > > Then maybe the PEP needs to be updated. I can never be sure what people are reading when they say "current" with PEPs like this. The text gets updated fairly frequently. As of time of posting, here's the rationale: - Naming the result of an expression is an important part of programming, allowing a descriptive name to be used in place of a longer expression, and permitting reuse. Currently, this feature is available only in statement form, making it unavailable in list comprehensions and other expression contexts. Merely introducing a way to assign as an expression would create bizarre edge cases around comprehensions, though, and to avoid the worst of the confusions, we change the definition of comprehensions, causing some edge cases to be interpreted differently, but maintaining the existing behaviour in the majority of situations. - Kirill, is this what you read, and if so, how does that make ':=' a negative? The rationale says "hey, see this really good thing you can do as a statement? Let's do it as an expression too", so the parallel should be a good thing. ChrisA ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
On Sun, Apr 15, 2018 at 2:01 PM, Nick Coghlan wrote:
> On 15 April 2018 at 19:41, Mikhail V wrote:
>> So IIUC, the *only* reason is to avoid '==' and '=' similarity?
>> If so, then it does not sound convincing at all.
>> Of course Python does me a favor showing an error,
>> when I make a typo like this:
>> if (x = y)
>>
>> But still, if this is the only real reason, it is not convincing.
>
> It's thoroughly convincing, because we're already familiar with the
> consequences of folks confusing "=" and "==" when writing C & C++
> code. It's an eternal bug magnet, so it's not a design we're ever
> going to port over to Python.
[...]
> The examples in the PEP have been updated to better reflect some of
> the key motivating use cases (embedded assignments in if and while
> statement conditions, generator expressions, and container
> comprehensions)

I'm personally "0" on the whole proposal. I was just curious about that "demonisation" of the visual similarity of "=" and "==". Granted, writing ":=" instead of "=" helps a little bit. But if ":=" is accepted, then we end up with two spellings :-)

>> And as a side note: I personally find the look of ":=" a bit 'noisy'.
>
> You're not alone in that, which is one of the reasons finding a
> keyword based option that's less syntactically ambiguous than "as"
> could be an attractive alternative.

Keyword variants look less appealing than ":=", but if it had to be a keyword, then I'd definitely stay with "TARGET keyword EXPR", just so as not to swap the traditional order.

Mikhail
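The "Python does me a favor" point can be checked without running any suspect code, by asking the compiler whether a snippet parses. This is only an illustration; the helper name `parses` is invented here.

```python
def parses(source):
    """Return True if `source` compiles as Python, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


print(parses("if (x == y): pass"))  # True: comparison is an expression
print(parses("if (x = y): pass"))   # False: "=" is a statement, not an expression
```

Nothing is executed, so `x` and `y` do not need to exist.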
Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
On Mon, Apr 16, 2018 at 4:58 AM, Thautwarm Zhao wrote: > Dear Steve, I'm sorry to annoy you by my proposal, but I do think using > unicode might be wise in current stage. > > \triangleq could be print with unicode number \u225c, and adding plugins to > support typing this in editors could be easy, just simply map \xxx to the > specific unicode char when we press the tab after typing it. > > People using Julia language are proud of it but I think it's just something > convenient could be used in any other language. > > There are other reasons to support unicode but it's out of this topic. > > Although ':=' and '->' are not perfect, in the range of ASCII it seems to be > impossible to find a better one. > If you want to introduce non-ASCII tokens to Python, start by adding them as _alternatives_ to the current syntax. See whether people adopt them. I've seen one or two people using editors that redisplay ASCII-only source code using other symbols (eg ≡ for JavaScript's ===), and you could make it so the source code can actually be saved in that form. But making it so that the ONLY way to use a feature is to use a non-ASCII character? That's going to break a lot of people's workflows. ChrisA ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] A cute Python implementation of itertools.tee
On Sun, Apr 15, 2018 at 8:05 AM, Tim Peters wrote:
[...]
> Then I thought "this is stupid! Python already does reference
> counting." Voila! Vast swaths of tedious code vanished, giving this
> remarkably simple implementation:
>
>     def mytee(xs, n):
>         last = [None, None]
>
>         def gen(it, mylast):
>             nonlocal last
>             while True:
>                 mylast = mylast[1]
>                 if not mylast:
>                     mylast = last[1] = last = [next(it), None]
>                 yield mylast[0]
>
>         it = iter(xs)
>         return tuple(gen(it, last) for _ in range(n))
>
> There's no need to keep a pointer to the start of the shared list at
> all - we only need a pointer to the end of the list ("last"), and each
> derived generator only needs a pointer to its own current position in
> the list ("mylast").

Things here remind me of my implementation design for PEP 555: the "contexts" present in the process are represented by a singly-linked tree of assignment objects. It's definitely possible to write the above in a more readable way, and FWIW I don't think it involves "assignments as expressions".

> What I find kind of hilarious is that it's no help at all as a
> prototype for a C implementation: Python recycles stale `[next(it),
> None]` pairs all by itself, when their internal refcounts fall to 0.
> That's the hardest part.

Why can't the C implementation use Python refcounts? Are you talking about standalone C code? Or perhaps you are thinking about overhead? (In PEP 555 that was not a concern, though). Surely it would make sense to reuse the refcounting code that's already there. There are no cycles here, so it's not particularly complicated -- just duplication.

Anyway, the whole linked list is unnecessary if the iterable can be iterated over multiple times. But "tee" won't know when to do that. *That* is what I call overhead (unless of course all the tee branches are consumed in an interleaved manner).

> BTW, I certainly don't suggest adding this to the itertools docs
> either. While it's short and elegant, it's too subtle to grasp easily
> - if you think "it's obvious", you haven't yet thought hard enough
> about the problem ;-)

-- + Koos Zevenhoven + http://twitter.com/k7hoven +
Re: [Python-ideas] A cute Python implementation of itertools.tee
On Mon, Apr 16, 2018 at 6:46 AM, Koos Zevenhoven wrote: > Anyway, the whole linked list is unnecessary if the iterable can be iterated > over multiple times. But "tee" won't know when to do that. *That* is what I > call overhead (unless of course all the tee branches are consumed in an > interleaved manner). But if you have something you can iterate over multiple times, why bother with tee at all? Just take N iterators from the underlying iterable. The overhead is intrinsic to the value of the function. ChrisA ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] A cute Python implementation of itertools.tee
[Koos Zevenhoven ]
> It's definitely possible to write the above in a more
> readable way, and FWIW I don't think it involves "assignments as
> expressions".

Of course it is. The point was brevity and speed, not readability. It was presented partly as a puzzle :-)

>> What I find kind of hilarious is that it's no help at all as a
>> prototype for a C implementation: Python recycles stale `[next(it),
>> None]` pairs all by itself, when their internal refcounts fall to 0.
>> That's the hardest part.

> Why can't the C implementation use Python refcounts? Are you talking about
> standalone C code?

Yes, expressing the algorithm in plain old C, not building on top of (say) the Python C API.

> Or perhaps you are thinking about overhead?

Nope.

> (In PEP 555 that was not a concern, though). Surely it would make sense
> to reuse the refcounting code that's already there. There are no cycles
> here, so it's not particularly complicated -- just duplication.
>
> Anyway, the whole linked list is unnecessary if the iterable can be iterated
> over multiple times.

If the latter were how iterables always worked, there would be no need for tee() at all. It's tee's _purpose_ to make it possible for multiple consumers to traverse an iterable's can't-restart-or-even-go-back result sequence each at their own pace.
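A runnable variant of the `mytee` generator quoted in this thread, showing consumers advancing at their own pace over a one-shot iterator. One adjustment is not from the original post: since Python 3.7 (PEP 479), a `StopIteration` escaping a generator body is converted to `RuntimeError`, so this sketch catches it and returns instead.

```python
def mytee(xs, n):
    # Shared singly linked list of [value, next] cells; `last` is the tail.
    last = [None, None]

    def gen(it, mylast):
        nonlocal last
        while True:
            mylast = mylast[1]
            if not mylast:
                try:
                    value = next(it)
                except StopIteration:
                    return  # adjustment for PEP 479, see note above
                # Targets are assigned left to right: the old tail's link is
                # set before `last` is rebound to the new cell.
                mylast = last[1] = last = [value, None]
            yield mylast[0]

    it = iter(xs)
    return tuple(gen(it, last) for _ in range(n))


a, b = mytee(iter("abc"), 2)
trace = [next(a), next(b), next(b), next(a)]
print(trace)
print(list(a), list(b))
```

Cells that every consumer has passed are no longer referenced by anything and are reclaimed by reference counting, which is the point Tim makes about the C prototype.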
Re: [Python-ideas] Accepting multiple mappings as positional arguments to create dicts
On 2018-04-12 18:03, Guido van Rossum wrote:
> It's a slippery slope indeed. While having to change update() alone
> wouldn't worry me, the subclass constructors do seem like they are going
> to want changing too, and that's indeed a bit much. So let's back off a
> bit. Not every three lines of code need a built-in shorthand.

This is disappointing since the dictionary is one of the most used but simultaneously limited of the builtin types. It doesn't support a lot of operations that strings, lists, tuples, sets, etc. do. These are the little niceties that make Python fun to program in. But, for some reason we're stingy when it comes to dictionaries, the foundation of the language.

Has anyone disagreed the dict constructor shouldn't take multiple arguments? Also, it isn't always three lines of code, but expands with the number that need to be merged.

My guess is that the dict is used an order of magnitude more than specialized subclasses, even more so now that the Ordered variant is unnecessary in newer versions. It wouldn't bother me at all if it took a few years for the improvement to get rolled out to subclasses or never, it's quite a minor disappointment compared to getting the functionality ~90% of the time.

Also wouldn't mind helping out with the subclasses if there is some lifting that needs to be done.

-Mike
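For context, the "three lines" under discussion look roughly like the helper below. `merge_mappings` is a hypothetical name, not a proposed or existing API; the sketch only shows that the loop form stays the same size however many mappings are merged.

```python
def merge_mappings(*mappings):
    """Merge any number of mappings into a new dict; later ones win."""
    merged = {}
    for mapping in mappings:
        merged.update(mapping)
    return merged


defaults = {"host": "localhost", "port": 8000}
config_file = {"port": 9000}
overrides = {"debug": True}

print(merge_mappings(defaults, config_file, overrides))
# Equivalent spelling with dict displays (Python 3.5+):
print({**defaults, **config_file, **overrides})
```

Neither form modifies the input dictionaries.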
Re: [Python-ideas] Spelling of Assignment Expressions PEP 572 (was post #4)
On 2018-04-15 08:58, Steven D'Aprano wrote:
> I've been mulling over the question why I think the expression needs to
> come first here, whereas I'm satisfied with the target coming first for
> assignment statements, and I think I've finally got the words to explain
> it. It is not just long familiarity with maths and languages that put
> the variable first (although that's also part of it). It has to do with
> what we're looking for when we read code, specifically what is the
> primary piece of information we're initially looking for.

Interesting. I think your arguments are pretty reasonable overall. But, for me, they just don't outweigh the fact that "->" is an ugly assignment operator that looks nothing like the existing one, whereas ":=" is a less-ugly one that has the additional benefit of looking like the existing one. From your arguments I am convinced that putting the expression first has some advantages, but they just don't seem as important to me as they apparently do to you.

-- Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
[Python-ideas] collections.Counter should implement __mul__, __rmul__
For most types that implement __add__, `x + x` is equal to `2 * x`.

That is true for all numbers, list, tuple, str, timedelta, etc. -- but not for collections.Counter. I can add two Counters, but I can't multiply one by a scalar. That seems like an oversight.

It would be worthwhile to implement multiplication because, among other reasons, Counters are a nice representation for discrete probability distributions, for which multiplication is an even more fundamental operation than addition.

Here's an implementation:

    def __mul__(self, scalar):
        "Multiply each entry by a scalar."
        result = Counter()
        for key in self:
            result[key] = self[key] * scalar
        return result

    def __rmul__(self, scalar):
        "Multiply each entry by a scalar."
        result = Counter()
        for key in self:
            result[key] = scalar * self[key]
        return result
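To try the proposal without patching the standard library, the two methods can be placed on a subclass. `ScalableCounter` is a hypothetical name used only for this sketch; it follows the loop implementation above, except that it builds the result with `type(self)()` so the subclass type is preserved.

```python
from collections import Counter


class ScalableCounter(Counter):
    """Counter with scalar multiplication, as sketched in the proposal."""

    def __mul__(self, scalar):
        "Multiply each entry by a scalar."
        result = type(self)()
        for key in self:
            result[key] = self[key] * scalar
        return result

    def __rmul__(self, scalar):
        "Multiply each entry by a scalar."
        result = type(self)()
        for key in self:
            result[key] = scalar * self[key]
        return result


c = ScalableCounter("aab")
print(c * 2)
print(2 * c)
print(c + c == 2 * c)
```

The `x + x == 2 * x` identity holds here because all counts are positive; Counter addition drops entries whose sum is zero or negative, so the two can differ for other inputs.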
Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__
Good call. Is it any faster to initialize Counter with a dict comprehension?

    return Counter({k: v*scalar for (k, v) in self.items()})

On Sun, Apr 15, 2018 at 5:05 PM, Peter Norvig wrote:
> For most types that implement __add__, `x + x` is equal to `2 * x`.
>
> That is true for all numbers, list, tuple, str, timedelta, etc. -- but not
> for collections.Counter. I can add two Counters, but I can't multiply one
> by a scalar. That seems like an oversight.
>
> It would be worthwhile to implement multiplication because, among other
> reasons, Counters are a nice representation for discrete probability
> distributions, for which multiplication is an even more fundamental
> operation than addition.
>
> Here's an implementation:
>
>     def __mul__(self, scalar):
>         "Multiply each entry by a scalar."
>         result = Counter()
>         for key in self:
>             result[key] = self[key] * scalar
>         return result
>
>     def __rmul__(self, scalar):
>         "Multiply each entry by a scalar."
>         result = Counter()
>         for key in self:
>             result[key] = scalar * self[key]
>         return result
Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__
That's actually how I coded it myself the first time. But I worried it would be wasteful to create an intermediate dict and discard it.

`timeit` results:

    3.79 µs for the for-loop, 5.08 µs for the dict-comprehension with a 10-key Counter
    257 µs for the for-loop, 169 µs for the dict-comprehension with a 1000-key Counter

So results are mixed, but you are probably right.

On Sun, Apr 15, 2018 at 3:46 PM Wes Turner wrote:
> Good call. Is it any faster to initialize Counter with a dict
> comprehension?
>
>     return Counter({k: v*scalar for (k, v) in self.items()})
>
> On Sun, Apr 15, 2018 at 5:05 PM, Peter Norvig wrote:
>
>> For most types that implement __add__, `x + x` is equal to `2 * x`.
>>
>> That is true for all numbers, list, tuple, str, timedelta, etc. -- but
>> not for collections.Counter. I can add two Counters, but I can't multiply
>> one by a scalar. That seems like an oversight.
>>
>> It would be worthwhile to implement multiplication because, among other
>> reasons, Counters are a nice representation for discrete probability
>> distributions, for which multiplication is an even more fundamental
>> operation than addition.
>>
>> Here's an implementation:
>>
>>     def __mul__(self, scalar):
>>         "Multiply each entry by a scalar."
>>         result = Counter()
>>         for key in self:
>>             result[key] = self[key] * scalar
>>         return result
>>
>>     def __rmul__(self, scalar):
>>         "Multiply each entry by a scalar."
>>         result = Counter()
>>         for key in self:
>>             result[key] = scalar * self[key]
>>         return result
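A sketch of how such a comparison could be reproduced with `timeit`. The function names, Counter sizes, and repetition counts are choices made for this example, not the setup behind the figures above, and the timings will vary by machine and Python version.

```python
import timeit
from collections import Counter


def scale_loop(counter, scalar):
    """For-loop version: fill an empty Counter key by key."""
    result = Counter()
    for key in counter:
        result[key] = counter[key] * scalar
    return result


def scale_comprehension(counter, scalar):
    """Comprehension version: build a dict, then pass it to Counter."""
    return Counter({k: v * scalar for k, v in counter.items()})


for size in (10, 1000):
    counter = Counter({i: i + 1 for i in range(size)})
    for fn in (scale_loop, scale_comprehension):
        runs = 200
        seconds = timeit.timeit(lambda: fn(counter, 3), number=runs)
        print(f"{size:>5} keys  {fn.__name__:<20} {seconds / runs * 1e6:8.2f} µs per call")
```

Both functions return equal Counters, so only the construction cost differs.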
Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__
> On Apr 15, 2018, at 2:05 PM, Peter Norvig wrote:
>
> For most types that implement __add__, `x + x` is equal to `2 * x`.
>
> ...
>
> That is true for all numbers, list, tuple, str, timedelta, etc. -- but not
> for collections.Counter. I can add two Counters, but I can't multiply one by
> a scalar. That seems like an oversight.

If you view the Counter as a sparse associative array of numeric values, it does seem like an oversight. If you view the Counter as a Multiset or Bag, it doesn't make sense at all ;-)

From an implementation point of view, Counter is just a kind of dict that has a __missing__() method that returns zero. That makes it trivially easy to subclass Counter to add new functionality or just use dictionary comprehensions for bulk updates.

> It would be worthwhile to implement multiplication because, among other
> reasons, Counters are a nice representation for discrete probability
> distributions, for which multiplication is an even more fundamental operation
> than addition.

There is an open issue on this topic. See: https://bugs.python.org/issue25478

One stumbling point is that a number of commenters are fiercely opposed to non-integer uses of Counter. Also, some of the use cases (such as those found in Allen Downey's "Think Stats" and "Think Bayes" books) need division and rescaling to a total (i.e. normalizing the total to 1.0) for a probability mass function.

If the idea were to go forward, it still isn't clear whether the correct API should be low level (__mul__ and __div__ and a "total" property) or higher level (such as a normalize() or rescale() method that produces a new Counter instance). The low level approach has the advantage that it is simple to understand and that it feels like a logical extension of the __add__ and __sub__ methods. The downside is that it doesn't really add any new capabilities (being just short-cuts for a simple dict comprehension or call to c.values()). And, it starts to feature creep the Counter class further away from its core mission of counting and ventures into the realm of generic sparse arrays with numeric values. There is also a learnability/intelligibility issue in that __add__ and __sub__ correspond to "elementwise" operations while __mul__ and __div__ would be "scalar broadcast" operations.

Peter, I'm really glad you chimed in. My advocacy lacked sufficient weight to move this idea forward.

Raymond
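As a rough illustration of the "simple dict comprehension or call to c.values()" mentioned above, rescaling a Counter to a total of 1.0 can be written with today's API alone. The helper name `normalized` is hypothetical, not an existing Counter method.

```python
from collections import Counter


def normalized(counter):
    """Return a new Counter whose values sum to 1.0 (hypothetical helper)."""
    total = sum(counter.values())
    return Counter({key: count / total for key, count in counter.items()})


counts = Counter("abracadabra")   # letter frequencies
pmf = normalized(counts)          # probability mass function as a Counter
```

Because the result is still a Counter, looking up a key that was never counted returns 0 without adding that key to the mapping.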
Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__
If you think of a Counter as a multiset, then it should support __or__, not __add__, right?

I do think it would have been fine if Counter did not support "+" at all (and/or if Counter was limited to integer values). But given where we are now, it feels like we should preserve `c + c == 2 * c`.

As to the "doesn't really add any new capabilities" argument, that's true, but it is also true for Counter as a whole: it doesn't add much over defaultdict(int), but it is certainly convenient to have a standard way to do what it does.

I agree with your intuition that low level is better. `total` would be useful. If you have total and mul, then as you and others have pointed out, normalize is just c *= 1/c.total.

I can also see the argument for a new FrequencyTable class in the statistics module. (By the way, I refactored my https://github.com/norvig/pytudes/blob/master/ipynb/Probability.ipynb a bit, and now I no longer need a `normalize` function.)
Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__
On Sunday, April 15, 2018, Peter Norvig wrote:

> I can also see the argument for a new FrequencyTable class in the
> statistics module. (By the way, I refactored my
> https://github.com/norvig/pytudes/blob/master/ipynb/Probability.ipynb a
> bit, and now I no longer need a `normalize` function.)

nltk.probability.FreqDist(collections.Counter) doesn't have a __mul__ either:
http://www.nltk.org/api/nltk.html#nltk.probability.FreqDist

numpy.unique(ar, return_counts=True) returns the sorted unique values together with an array of their counts (unique_counts), which has a __mul__:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html

scipy.stats.itemfreq returns an array sorted by value with a __mul__ and the items in the first column:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.itemfreq.html

pandas.Series.value_counts(normalize=False) returns a Series sorted by descending frequency:
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.value_counts.html
Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__
tf.bincount() returns a vector with integer counts:
https://www.tensorflow.org/api_docs/python/tf/bincount

Keras calls np.bincount in an mnist example. np.bincount returns an array with a __mul__:
https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.bincount.html

sklearn.preprocessing.normalize:
http://scikit-learn.org/stable/modules/preprocessing.html#preprocessing-normalization
http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.normalize.html

featuretools.primitives.NUnique has a normalize method:
https://docs.featuretools.com/generated/featuretools.primitives.NUnique.html#featuretools.primitives.NUnique

And I'm done sharing non-pure-python solutions for this problem, I promise.
Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__
> On Apr 15, 2018, at 5:44 PM, Peter Norvig wrote:
>
> If you think of a Counter as a multiset, then it should support __or__, not
> __add__, right?

FWIW, Counter is explicitly documented to support the four multiset-style mathematical operations discussed in Knuth TAOCP Volume II section 4.6.3 exercise 19:

    >>> c = Counter(a=3, b=1)
    >>> d = Counter(a=1, b=2)
    >>> c + d        # add two counters together: c[x] + d[x]
    Counter({'a': 4, 'b': 3})
    >>> c - d        # saturating subtraction (keeping only positive counts)
    Counter({'a': 2})
    >>> c & d        # intersection: min(c[x], d[x])
    Counter({'a': 1, 'b': 1})
    >>> c | d        # union: max(c[x], d[x])
    Counter({'a': 3, 'b': 2})

The Wikipedia article on multisets lists a further operation, inclusion, that is not currently supported:
https://en.wikipedia.org/wiki/Multiset#Basic_properties_and_operations

> I do think it would have been fine if Counter did not support "+" at all
> (and/or if Counter was limited to integer values). But given where we are
> now, it feels like we should preserve `c + c == 2 * c`.

The + operation has legitimate use cases (it is perfectly reasonable to want to combine the results of two separate counts). And, as you pointed out, it is what we already have and cannot change :-)

So, the API design issue that confronts us is that it would be a bit weird and disorienting for the arithmetic operators to have two different signatures:

    counter += counter
    counter -= counter
    counter *= scalar
    counter /= scalar

Also, we should respect the comments given by others on the tracker issue. In particular, there is a preference to not have an in-place operation and only allow a new counter instance to be created. That will help people avoid data structure modality problems:

    c[category] += 1    # Makes sense during the frequency counting or accumulation phase
    c /= c.total        # Convert to a probability mass function
    c[category] += 1    # This code looks correct but no longer makes any sense

> As to the "doesn't really add any new capabilities" argument, that's true,
> but it is also true for Counter as a whole: it doesn't add much over
> defaultdict(int), but it is certainly convenient to have a standard way to do
> what it does.

IIRC, the defaultdict(int) in your first version triggered a bug because the model inadvertently changed during the analysis phase rather than being frozen after the training phase. The Counter doesn't suffer from the same issue (modifying the dict on a failed lookup). Also, the Counter class does have a few value added features: Counter(iterable), c.most_common(), c.elements(), etc. But yes, at its heart the counter is mostly just a specialized dictionary. The thought I was trying to express is that suggestions to build out the Counter API are a little less compelling when we already have a way to do it that is flexible, fast, clear, and standard (i.e. dict comprehensions).

> I agree with your intuition that low level is better. `total` would be
> useful. If you have total and mul, then as you and others have pointed out,
> normalize is just c *= 1/c.total.

I fully support adding some functionality for scaling to support probability distributions, bayesian update steps, chi-square tests, and whatnot. The people who need convincing are the other respondents on the tracker. They had a strong mental model for the Counter class that is somewhat at odds with this proposal.

Raymond
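The multiset inclusion test mentioned above can be sketched on top of the existing Counter API. The name `is_included` is made up here, and this is only one reasonable reading of "inclusion" (every element of the first multiset occurs at least as often in the second).

```python
from collections import Counter


def is_included(inner, outer):
    """True if each element of `inner` occurs at least as often in `outer`.

    Hypothetical helper; relies on Counter returning 0 for missing keys.
    """
    return all(count <= outer[key] for key, count in inner.items())


c = Counter(a=3, b=1)
d = Counter(a=1, b=2)
common = c & d   # intersection: min(c[x], d[x])
```

With this definition the intersection `c & d` is included in both `c` and `d`, while `d` is not included in `c` because `d['b']` exceeds `c['b']`.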
Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__
> On Apr 15, 2018, at 7:18 PM, Wes Turner wrote: > > And I'm done sharing non-pure-python solutions for this problem, I promise Keep them coming :-) Thanks for the research. It helps to remind ourselves that almost none of our problems are new :-) Raymond ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__
On Mon, Apr 16, 2018 at 1:39 PM, Raymond Hettinger wrote:

> So, the API design issue that confronts us is that it would be a bit weird
> and disorienting for the arithmetic operators to have two different
> signatures:
>
>     counter += counter
>     counter -= counter
>     counter *= scalar
>     counter /= scalar

This needn't be a blocker. Strings can be added to strings, and strings can be multiplied by integers. If it's of practical value to multiply a Counter by a number, by all means do it.

ChrisA
Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__
On Sun, Apr 15, 2018 at 8:39 PM Raymond Hettinger <raymond.hettin...@gmail.com> wrote:

> FWIW, Counter is explicitly documented to support the four multiset-style
> mathematical operations discussed in Knuth TAOCP Volume II section 4.6.3
> exercise 19:

Wow, I never noticed "&" and "|" -- I guess when I got to "Common patterns for working with" in the documentation, I figured that there wouldn't be any new methods introduced after that and I stopped reading.

> it would be a bit weird and disorienting for the arithmetic operators to
> have two different signatures:
>
>     counter += counter
>     counter -= counter
>     counter *= scalar
>     counter /= scalar

Is it weird and disorienting to have:

    str += str
    str *= int
Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__
> On Apr 15, 2018, at 9:04 PM, Peter Norvig wrote:
>
> > it would be a bit weird and disorienting for the arithmetic operators to
> > have two different signatures:
> >
> >     counter += counter
> >     counter -= counter
> >     counter *= scalar
> >     counter /= scalar
>
> Is it weird and disorienting to have:
>
>     str += str
>     str *= int

Yes, there is a precedent that does seem to have worked out well in practice :-) It isn't exactly parallel because strings aren't containers of numbers, they don't have & and |, and there isn't a reason to want a / operation, but it does suggest that signature variation might not be problematic.

BTW, do you just want __mul__ and __rmul__? If those went in, presumably there will be a request to support __imul__ because otherwise c*=3 would still work but would be inefficient (that was the rationale for adding inplace variants for all the current arithmetic operators). Likewise, presumably someone would legitimately want __div__ to support the normalization use case. Perhaps less likely, there would also be a request for __floordiv__ to allow exactly scaled results to stay in the domain of integers. Which if any of these makes sense to you?

Also, any thoughts on the cleanest way to express the computation of a chi-squared statistic (for example, to compare observed first digit frequencies to the frequencies predicted by Benford's Law)? This isn't an arbitrary question (it came up when a professor first proposed a variant of this idea a few years ago).

Raymond
Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__
I don't have strong feelings, but I would say yes to __imul__, no to __div__ and __floordiv__ (with str/list/tuple as the precedent).

For chisquare, I would be perfectly happy with:

    digit_counts = Counter(...)
    scipy.stats.chisquare(list(digit_counts.values()))
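For comparison, here is a pure-Python sketch of the chi-squared statistic against Benford's Law, using only a Counter and a dict of expected counts. The helper names `benford_expected` and `chi_squared` and the sample data (powers of two) are assumptions chosen for illustration; this computes only Pearson's statistic, not a p-value.

```python
from collections import Counter
from math import log10


def benford_expected(n):
    """Expected first-digit counts for n observations under Benford's Law."""
    return {d: n * log10(1 + 1 / d) for d in range(1, 10)}


def chi_squared(observed, expected):
    """Pearson's statistic: sum of (O - E)**2 / E over all expected categories.

    `observed` should be a Counter so that unseen categories count as 0.
    """
    return sum((observed[k] - e) ** 2 / e for k, e in expected.items())


# Illustrative data only: first digits of the powers of two 2**1 .. 2**100.
values = [2 ** i for i in range(1, 101)]
digit_counts = Counter(int(str(v)[0]) for v in values)
statistic = chi_squared(digit_counts, benford_expected(len(values)))
```

Note that nothing here needs Counter multiplication; the expected counts are scaled inside `benford_expected` instead, which is one of the alternatives being weighed in this thread.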
Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__
[Peter Norvig]
> For most types that implement __add__, `x + x` is equal to `2 * x`.
>
> That is true for all numbers, list, tuple, str, timedelta, etc. -- but not
> for collections.Counter. I can add two Counters, but I can't multiply one
> by a scalar. That seems like an oversight.
>
> ...
> Here's an implementation:
>
>     def __mul__(self, scalar):
>         "Multiply each entry by a scalar."
>         result = Counter()
>         for key in self:
>             result[key] = self[key] * scalar
>         return result
>
>     def __rmul__(self, scalar):
>         "Multiply each entry by a scalar."
>         result = Counter()
>         for key in self:
>             result[key] = scalar * self[key]
>         return result

Adding Counter * integer doesn't bother me a bit, but the definition of what that should compute isn't obvious. In particular, that implementation doesn't preserve that `x+x == 2*x` if x has any negative values:

    >>> x = Counter(a=-1)
    >>> x
    Counter({'a': -1})
    >>> x+x
    Counter()

It would be strange if x+x != 2*x, and if x*-1 != -x:

    >>> y = Counter(a=1)
    >>> y
    Counter({'a': 1})
    >>> -y
    Counter()

Etc.

Then again, it's already the case that, e.g., x-y isn't always the same as x + -y:

    >>> x = Counter(a=1)
    >>> y = Counter(a=2)
    >>> x - y
    Counter()
    >>> x + -y
    Counter({'a': 1})

So screw obvious formal identities ;-)

I'm not clear on why "+" and "-" discard keys with values <= 0 to begin with. For "-" it's natural enough viewing "-" as being multiset difference, but for "+"? That's just made up ;-)

In any case, despite the oddities, I think your implementation would be least surprising overall (ignore the sign of the resulting values). At least for Counters that actually make sense as multisets (have no values <= 0), and for a positive integer multiplier `n > 0`, it does preserve that `x*n` == `x + x + ... + x` (with `n` instances of `x`).
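The identities discussed above can be checked directly. In this sketch, `scaled` is a stand-in for the proposed __mul__ (written as a plain function so that nothing is patched onto Counter); everything else is existing Counter behaviour.

```python
from collections import Counter


def scaled(counter, scalar):
    """Stand-in for the proposed __mul__: multiply every value, keep every key."""
    return Counter({k: v * scalar for k, v in counter.items()})


# With a negative value, "+" drops the key but scaling keeps it,
# so x + x and 2 * x disagree.
x = Counter(a=-1)
sum_twice = x + x            # "+" discards results <= 0
doubled = scaled(x, 2)       # keeps {'a': -2}

# For a proper multiset (all values > 0) and a positive integer n,
# scaling matches repeated addition.
m = Counter(a=3, b=1)
tripled = scaled(m, 3)
added_three_times = m + m + m
```

Here `sum_twice` is empty while `doubled` still holds `a: -2`, and `tripled` equals `added_three_times`.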
Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__
On Monday, April 16, 2018, Raymond Hettinger wrote:

> Also, any thoughts on the cleanest way to express the computation of a
> chi-squared statistic (for example, to compare observed first digit
> frequencies to the frequencies predicted by Benford's Law)? This isn't an
> arbitrary question (it came up when a professor first proposed a variant of
> this idea a few years ago).

https://en.wikipedia.org/wiki/Chi-squared_distribution
https://en.wikipedia.org/wiki/Chi-squared_test
https://en.wikipedia.org/wiki/Benford%27s_law
(How might one test this with e.g. *double* SHA256?)

proportions_chisquare(count, nobs, value=None)
https://www.statsmodels.org/dev/generated/statsmodels.stats.proportion.proportions_chisquare.html
https://www.statsmodels.org/dev/genindex.html?highlight=chi

scipy.stats.chisquare(f_obs, f_exp=None, ddof=0, axis=0)
https://docs.scipy.org/doc/scipy-0.18.1/reference/generated/scipy.stats.chisquare.html

sklearn.feature_selection.chi2(X, y)
http://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.chi2.html#sklearn.feature_selection.chi2

kernel_approximation.AdditiveChi2Sampler
kernel_approximation.SkewedChi2Sampler
http://scikit-learn.org/stable/modules/classes.html#module-sklearn.kernel_approximation

sklearn.metrics.pairwise.chi2_kernel(X, Y=None, gamma=1.0)
http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.chi2_kernel.html#sklearn.metrics.pairwise.chi2_kernel

sklearn.metrics.pairwise.additive_chi2_kernel(X, Y=None)
http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.additive_chi2_kernel.html#sklearn.metrics.pairwise.additive_chi2_kernel

... FreqDist(collections.Counter(odict)) ... sparse-coding ... One-Hot / Binarization
http://contrib.scikit-learn.org/categorical-encoding/

StandardScaler (for standardization) refuses to work with sparse matrices:
http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html#sklearn.preprocessing.StandardScaler
Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__
> On Apr 15, 2018, at 10:07 PM, Tim Peters wrote: > > Adding Counter * integer doesn't bother me a bit, but the definition > of what that should compute isn't obvious. Any thoughts on Counter * float? A key use case for what is being proposed is: c *= 1 / c.total Raymond ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__
[Tim]
>> Adding Counter * integer doesn't bother me a bit, but the definition
>> of what that should compute isn't obvious.

[Raymond]
> Any thoughts on Counter * float? A key use case for what is being proposed
> is:
>
>     c *= 1 / c.total

Ah, I thought I had already addressed that, but looks like my fingers forgot to type it ;-) By all means, yes! Indeed, that strengthens the "argument" for why `Counter * int` should ignore the signs of the values - if we allow multiplying by anything supporting __mul__, that clearly says we view multiplication as being outside the "multiset" view, and so there's no reason at all to suppress values <= 0.

I also have no problem with inplace operators. Or with adding `Counter /= scalar`, for that matter.

Perhaps whining could be reduced by rearranging the docs some: clearly separate operations designed to support the multiset view from the others. Then "but that operation makes no sense for multisets!" can be answered with "so don't use it on multisets - like the docs told you" ;-)
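To make the behaviour under discussion concrete, here is a minimal sketch of a Counter subclass with scalar multiplication and division that keeps every key regardless of sign. `ScalableCounter` is a hypothetical name, the restriction to `numbers.Number` is an assumption of this sketch rather than anything agreed in the thread, and there is no `total` attribute here, so the total is computed with `sum()`.

```python
from collections import Counter
from numbers import Number


class ScalableCounter(Counter):
    """Hypothetical subclass: scalar * and / that never discard keys."""

    def __mul__(self, scalar):
        if not isinstance(scalar, Number):
            return NotImplemented
        return type(self)({k: v * scalar for k, v in self.items()})

    # Numeric multiplication commutes, so 2 * c can reuse c * 2.
    __rmul__ = __mul__

    def __truediv__(self, scalar):
        if not isinstance(scalar, Number):
            return NotImplemented
        return type(self)({k: v / scalar for k, v in self.items()})


c = ScalableCounter(a=3, b=1)
pmf = c * (1 / sum(c.values()))     # Counter * float, as in the normalization use case
negative = ScalableCounter(a=-1) * 2
```

No `__imul__` is defined, so `c *= 0.5` still works but rebinds `c` to a new instance instead of updating in place, which is the inefficiency Raymond mentions earlier in the thread.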
Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__
On 16/04/2018 06:07, Tim Peters wrote:
> [Peter Norvig]
>> For most types that implement __add__, `x + x` is equal to `2 * x`.
>>
>> That is true for all numbers, list, tuple, str, timedelta, etc. -- but not
>> for collections.Counter. I can add two Counters, but I can't multiply one
>> by a scalar. That seems like an oversight.
>>
>> ...
>> Here's an implementation:
>>
>>     def __mul__(self, scalar):
>>         "Multiply each entry by a scalar."
>>         result = Counter()
>>         for key in self:
>>             result[key] = self[key] * scalar
>>         return result
>>
>>     def __rmul__(self, scalar):
>>         "Multiply each entry by a scalar."
>>         result = Counter()
>>         for key in self:
>>             result[key] = scalar * self[key]
>>         return result
>
> Adding Counter * integer doesn't bother me a bit, but the definition
> of what that should compute isn't obvious. In particular, that
> implementation doesn't preserve that `x+x == 2*x` if x has any
> negative values:
>
>     >>> x = Counter(a=-1)
>     >>> x
>     Counter({'a': -1})
>     >>> x+x
>     Counter()
>
> It would be strange if x+x != 2*x, and if x*-1 != -x:
>
>     >>> y = Counter(a=1)
>     >>> y
>     Counter({'a': 1})
>     >>> -y
>     Counter()
>
> Etc.
>
> Then again, it's already the case that, e.g., x-y isn't always the
> same as x + -y:
>
>     >>> x = Counter(a=1)
>     >>> y = Counter(a=2)
>     >>> x - y
>     Counter()
>     >>> x + -y
>     Counter({'a': 1})
>
> So screw obvious formal identities ;-)
>
> I'm not clear on why "+" and "-" discard keys with values <= 0 to
> begin with. For "-" it's natural enough viewing "-" as being multiset
> difference, but for "+"? That's just made up ;-)
>
> In any case, despite the oddities, I think your implementation would
> be least surprising overall (ignore the sign of the resulting values).
> At least for Counters that actually make sense as multisets (have no
> values <= 0), and for a positive integer multiplier `n > 0`, it does
> preserve that `x*n` = `x + x + ... + x` (with `n` instances of `x`).

Wouldn't it make sense to have the current counter behaviour (negative counts not allowed), and also a counter that did allow negative values (my bank doesn't seem to have a problem with my balance being able to go below zero), and possibly at the same time a counter class that allowed fractional counts?

Then:

    >>> x = Counter(a=1)
    >>> y = Counter(a=2)
    >>> x - y
    Counter()
    >>> x + -y
    Counter({'a': 1})

BUT:

    >>> x = Counter(a=1, allow_negative=True)
    >>> y = Counter(a=2, allow_negative=True)
    >>> x - y
    Counter({'a': -1})
    >>> x + -y
    Counter({'a': -1})

Likewise, for a Counter that was allowed to be fractional, some_counter / scalar would give (potentially) fractional results, and one that was not would give floor results.

-- 
Steve (Gadget) Barnes
Any opinions in this message are my personal opinions and do not reflect those of my employer.
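One way to experiment with the "negative counts allowed" variant is sketched below. `SignedCounter` is a hypothetical name, and a subclass is used instead of an `allow_negative` keyword because `Counter(**kwargs)` already treats keyword arguments as elements to count. Only `+`, binary `-` and unary `-` are overridden; the in-place operators are left as inherited.

```python
from collections import Counter


class SignedCounter(Counter):
    """Hypothetical Counter whose +, - and unary - keep counts <= 0."""

    def __add__(self, other):
        if not isinstance(other, Counter):
            return NotImplemented
        result = type(self)(self)          # copy of self, signs preserved
        for key, value in other.items():
            result[key] += value
        return result

    def __sub__(self, other):
        if not isinstance(other, Counter):
            return NotImplemented
        result = type(self)(self)
        for key, value in other.items():
            result[key] -= value
        return result

    def __neg__(self):
        result = type(self)()
        for key, value in self.items():
            result[key] = -value
        return result


# Current behaviour: the two expressions disagree.
print(Counter(a=1) - Counter(a=2))     # Counter()
print(Counter(a=1) + -Counter(a=2))    # Counter({'a': 1})

# Signed variant: both expressions agree.
x = SignedCounter(a=1)
y = SignedCounter(a=2)
print(dict(x - y))                     # {'a': -1}
print(dict(x + -y))                    # {'a': -1}
```

With signs preserved, `x - y` and `x + -y` give the same result, at the cost of no longer behaving like multiset difference.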
Re: [Python-ideas] collections.Counter should implement __mul__, __rmul__
> On Apr 15, 2018, at 10:51 PM, Tim Peters wrote:
>
> I also have no problem with inplace operators. Or with adding
> `Counter /= scalar`, for that matter.

But surely __rdiv__() would be over the top, harmonic means be damned ;-)

Raymond