Re: Speed vs Memory [was: On Lists and Iterables]
Excerpts from Xen's message of 2017-12-17 12:34:49 +0100:
>
> Original message
> Subject: Re: Speed vs Memory [was: On Lists and Iterables]
> Date: 17-12-2017 12:28
> From: Xen
> To: Neal McBurnett
>
> Just to summarize this.
>
> Xen wrote on 17-12-2017 12:22:
>
> > Meanwhile Python 3.4 can be excessively slower than 2.7. SO WHERE'S THE
> > GAIN?
>
> I haven't found any "definitive" benchmarks yet but there are plenty of
> people benchmarking and they reveal that 3.4 is much slower than 2.7.
>
> This is an example:
>
> Python 2.7: 24344.88 pystones/second
> Python 3.4: 17459.89 pystones/second
> Nuitka 2.7: 47243.92 pystones/second
> Nuitka 3.4: 28658.92 pystones/second
>
> That's 72% the performance of Python 2.7 and 60% while pre-compiling.
>
> And for that you change your language to decrease memory consumption for
> temporary scoped objects?
>

And better native threading. And predictable unicode support. And a cleaner reference implementation.

Flat out microbenchmark performance is rarely a reliable predictor of overall performance. Code spends more of its time waiting on other parts of the computer than slamming through the CPU. In fact, if it is CPU-intensive, you ought to write an extension in a more optimizable language like C or Rust.

What is often a predictor of productivity in software is how effectively a team can communicate with one another. IMO, you've compromised all of our ability to communicate with one another in this thread with huge rambling emails that are mostly centered around a few repeated points.

 * You don't participate in the Python community.
 * You don't write Python for Ubuntu.
 * You don't like the differences between Python 2 and 3.
 * You'd like to argue that Ubuntu should do something to stop Python 3.

I'd like to suggest that if you want to communicate with us, you stop repeating all of these. If you have said other things in your in-line replies, I missed them because I skimmed past the repetition.

To the larger point: Python has learned from the Python 3 experience by actually going through it. Ubuntu has in fact participated in it from the very beginning and seems to have embraced the changes even if that has turned out to be frustrating for some users. It's over. It happened. We're here, not there. Find peace with that, however you may, but please, if you've said something already, just stop. -- Ubuntu-devel-discuss mailing list Ubuntu-devel-discuss@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel-discuss
Re: On Lists and Iterables
On Sun, Dec 17, 2017 at 12:22:20PM +0100, Xen wrote:
> Neal McBurnett wrote on 16-12-2017 18:16:
> > For more on the rationale for changes related to iterators see
> > http://portingguide.readthedocs.io/en/latest/iterators.html
>
> That entire rationale is only explained with one word "memory consumption".
>
> So now you are changing the design of your _grammar_ just so that the
> resulting code will use less memory.
>
> That is the job of the compiler, not the developer.

I don't think the document above does a particularly good job of explaining it, and I think you've fundamentally misunderstood things, perhaps by extrapolating too much from toy examples.

zip() takes iterables as its inputs; concrete lists are only one kind of iterable. Iteration constructs are very widespread in non-trivial Python code, and it's common to make use of iterators to express constructions where you can cheaply extract a few elements but it would be expensive to extract them all.

For example, I spend most of my time working on database-backed web applications, which is a very popular application for Python. In that context, it's commonplace to make database queries via functions that return iterators and do lazy loading of results. You then iterate over these to build a page of results (which can use things like LIMIT and OFFSET when compiling its SQL queries), and you render and return that. If you accidentally call something that consumes the whole input iterable in the process, then it's going to do a *lot* of database traffic for some queries, and it doesn't take much of that to utterly destroy the performance of your application.

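A minimal sketch of the lazy-loading point, assuming a hypothetical `fetch_rows` generator in place of a real database cursor (the name, the row format and the log list are illustrative, not taken from any particular ORM). It counts how many rows are touched when a page of 20 results is built lazily and when the whole result is materialised first:

```python
from itertools import islice

def fetch_rows(total, log):
    """Hypothetical lazy query: yields one row at a time and records each fetch."""
    for row_id in range(total):
        log.append(row_id)  # stands in for a round trip to the database
        yield {"id": row_id}

# Lazy: only the rows needed for the page are ever fetched.
lazy_log = []
page = list(islice(fetch_rows(10**6, lazy_log), 20))

# Eager: building the full list first touches every row before slicing.
eager_log = []
eager_page = list(fetch_rows(1000, eager_log))[:20]

print(len(page), len(lazy_log))
print(len(eager_page), len(eager_log))
```

Both pages hold the same 20 rows; the difference is only in how much work was done to produce them.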
This is not something that the compiler can optimise, because the *contract* of zip et al in Python 2 was that it would consume the entire inputs (up to the shortest one in the case of zip, anyway); iteration is visible to the program and can have external side-effects, and it's not something that can be quietly optimised out given the design of the language. Talking about memory consumption of the result is relevant in some cases, sure, but it's certainly not the whole story; what often matters is the work involved in materialising the whole iterable, and that can be very significant indeed. In Python 2, there were many functions that took iterables as input and returned concrete lists, consuming the entire inputs in the process. In most cases there were versions of these that operated in a lazy fashion and returned iterables instead, but they were generally hidden off in the itertools module and less obvious compared to the built-in versions. Effectively, the language did the wrong thing by default. Python 3 changes these around to give preference to the versions that take iterables as input and return iterables as output, and says that if you want a list then you have to use list() or similar to get one. This reduces the cognitive load of the language, because now instead of remembering the different names for the list-returning and iterable-returning versions of various things, you only have to remember one version and the general rule that you use list() to materialise a whole iterable into a list (which was already useful for other things even in Python 2). It makes the language simpler to learn, because there are fewer rules and they compose well; and it makes it easier to do what's usually the right thing. This comes at the cost of a bit of porting effort for some code that started out in Python 2, of which there'll be less and less as time goes on. 
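The point that iteration is visible to the program can be sketched as follows, assuming a hypothetical `noisy` generator whose only job is to record which elements have actually been produced. In Python 3, zip() does no work until something asks for a result, and list() is the explicit step that consumes the rest:

```python
from collections.abc import Iterator

consumed = []

def noisy(values):
    """Yield values while recording each one that is actually produced."""
    for value in values:
        consumed.append(value)  # externally visible side effect
        yield value

pairs = zip(noisy([1, 2, 3]), "abc")
is_lazy = isinstance(pairs, Iterator)  # zip returns an iterator in Python 3
before = list(consumed)                # snapshot: nothing produced yet

first = next(pairs)                    # pulls exactly one element from each input
after_one = list(consumed)

rest = list(pairs)                     # list() materialises whatever remains
after_all = list(consumed)

print(is_lazy, before, first, after_one, rest, after_all)
```

Because `consumed` changes as elements are pulled, an implementation could not skip or reorder that work without changing what the program observes.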
To put it another way: "don't perform operations on collections of unbounded size" is pretty much the number one rule for webapps that I've picked up over the last few years, and Python 3 takes this lesson and applies it to the core language.

Toy examples involving zip([1, 2], [3, 4]) and the like miss the point because they simplify too much. This family of functions is almost always used in iteration constructs, usually "for ... in" or a comprehension, and in those common cases the programmer doesn't have to change anything at all. In cases where they do need to change something, it has the useful effect of highlighting that something a little unusual may be going on, rather than hiding behaviour that's potentially catastrophic at scale behind an innocuous-looking built-in function.

> Meanwhile Python 3.4 can be excessively slower than 2.7. SO WHERE'S THE
> GAIN?

It will no doubt depend on the benchmark, and rather than cherry-picking a single one it's likely more interesting to look at either a wide range of benchmarks, or at the specific application in question. Counterpoint, which also links to much more data:

  https://lwn.net/Articles/725114/

-- 
Colin Watson [cjwat...@ubuntu.com]

Re: On Lists and Iterables
On Sun, Dec 17, 2017 at 11:02:48AM +0100, Xen wrote:
> My topic was that Ubuntu obviously never opposed any of the changes, and
> most replying here pretty much evidence that they are in favour of them.
>
> Now, they claim that they can do nothing against upstream and that they have
> to follow the developments, that they are defending at every moment
> themselves
>
> It's playing the victim while you are playing a leading role in making it
> happen.

Observing that something is out of scope is not playing the victim. Ubuntu exists, in part, to get recent versions of free software into the hands of users as straightforwardly as possible. For the most part we aren't programming language designers, and running a campaign against how the custodians of the Python language have chosen to design the latest version of that language is simply out of scope in this project.

The economics are obvious in any case. Enough people want to use the latest version of the language that it's a clear requirement for us to provide it. In general we want to avoid shipping multiple versions of things where possible, because that's a maintenance burden. The Python 2-to-3 transition has been such that we've had no choice but to do so in this case, but the maintenance burden for us is only going to get worse once Python 2 stops being supported upstream. Thus it's clearly in our long-term interest to put effort into moving things over to Python 3 in ways that fall within the remit of a distribution. The Python packaging teams within most other major distributions have made much the same obvious decision, explicitly or implicitly.

We are absolutely taking a role in making it happen as smoothly as possible (though you give us too much credit by saying that it's a leading role), and I don't think anyone involved is ashamed of this. This is not the same as saying that we're powerless victims.

When we have issues with Python, we take them to the appropriate places, rather than posting long screeds to inappropriate mailing lists. Several Ubuntu developers over the years have been active contributors to Python upstream. Many of the changes in Python 3 releases have been the result of people working on updating their code to the latest version of the language, finding rough edges in the process, and discussing those in the right place so that they could be smoothed out. (For example, Python 3.3 reintroduced the u'...' syntax for Unicode strings even though it's strictly redundant, to make things easier for people forward-porting code; 3.5 restored "bytes % args", as mentioned recently; and so on.) These wouldn't have happened if people had stuck their heads in the sand and decided that they were going to hold on to Python 2 for dear life; they happened because people decided that it was better for the world if they collaborated to help build a better language.

And when you actually want many of the changes involved, that's the very opposite of playing the victim! I want to use Python 3 for the large codebases I help to maintain (when they aren't using Python 3 already, as several of them are). A few things I positively want to use are:

 * async/await syntax for coroutines; we do the best we can at the moment with things like @defer.inlineCallbacks in Twisted, but it's fiddly and ugly;
 * many improvements to the standard library's subprocess module; as one example, I filed https://bugs.python.org/issue1652, and while I can work around it or use a backport, the whole point was to make the improved behaviour available to everyone since the old behaviour was a nasty gotcha;
 * much more sensible behaviour related to inheritance of file descriptors across forks;
 * the enum module;
 * contextlib.ExitStack, which allows replacing some very cumbersome constructions in Python 2;
 * much more flexible/programmable access to module importing.

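As one illustration of the list above, here is a small sketch of contextlib.ExitStack managing a number of files that is only known at run time. The `concatenate` helper and the temporary file names are hypothetical, chosen for the example; the ExitStack calls themselves are the standard library API.

```python
import os
import tempfile
from contextlib import ExitStack

def concatenate(paths):
    """Read any number of files, closing all of them on exit or on error."""
    with ExitStack() as stack:
        # Each file is registered with the stack as soon as it is opened.
        files = [stack.enter_context(open(path)) for path in paths]
        return "".join(f.read() for f in files), files

with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for index, text in enumerate(["one\n", "two\n", "three\n"]):
        path = os.path.join(tmp, "part%d.txt" % index)
        with open(path, "w") as handle:
            handle.write(text)
        paths.append(path)
    combined, files = concatenate(paths)

print(combined)
print(all(f.closed for f in files))
```

Without ExitStack, the same guarantee for a variable number of files typically needs nested try/finally blocks or a hand-written cleanup loop.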
I mean, this is just off the top of my head, but hopefully it makes the point: sure, there are some stumbling blocks involved in porting to Python 3 (typically fewer of them in newer versions), but there are more than enough good things for it to be worthwhile in my opinion. I can get some of these with backported libraries, but not all, and in any case that sort of thing gets very cumbersome. Based on what you've said about how long it'd take to port all your code, it seems that you have rather little investment in Python; in fact, to be honest it sounds like you've spent more time in total writing inflammatory emails about it here than actually writing Python code. So why not spend some time listening to the views of people here who spend a lot of time writing Python code, as well as just the negative posts you found on the internet (some of which have since fed into improving the language anyway)? > No one here has evidenced being opposed to the forced nature of its > discontinuation, except for mr. Watson in saying that he would have > preferred a 'l
Re: Speed vs Memory [was: On Lists and Iterables]
Original message
Subject: Re: Speed vs Memory [was: On Lists and Iterables]
Date: 17-12-2017 12:28
From: Xen
To: Neal McBurnett

Just to summarize this.

Xen wrote on 17-12-2017 12:22:

> Meanwhile Python 3.4 can be excessively slower than 2.7. SO WHERE'S THE GAIN?

I haven't found any "definitive" benchmarks yet but there are plenty of people benchmarking and they reveal that 3.4 is much slower than 2.7.

This is an example:

Python 2.7: 24344.88 pystones/second
Python 3.4: 17459.89 pystones/second
Nuitka 2.7: 47243.92 pystones/second
Nuitka 3.4: 28658.92 pystones/second

That's 72% the performance of Python 2.7 and 60% while pre-compiling.

And for that you change your language to decrease memory consumption for temporary scoped objects?

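The two percentages can be rechecked from the four figures quoted in the message. This is only arithmetic on those numbers and says nothing about how representative pystone is:

```python
# Figures quoted in the message above, in pystones/second.
results = {
    "Python": {"2.7": 24344.88, "3.4": 17459.89},
    "Nuitka": {"2.7": 47243.92, "3.4": 28658.92},
}

# Ratio of the 3.4 figure to the 2.7 figure for each implementation.
ratios = {name: r["3.4"] / r["2.7"] for name, r in results.items()}

for name, ratio in ratios.items():
    print("%s 3.4 runs at %.1f%% of the 2.7 figure" % (name, 100 * ratio))
```

On these inputs the ratios come out at roughly 71.7% for CPython and 60.7% for Nuitka.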
Re: On Lists and Iterables
Neal McBurnett wrote on 16-12-2017 18:16:

> Though, as John Lenton notes, there are excellent reasons for most of the changes in Python 3

By the way, all of these "excellent reasons" are disputed. And citing "authoritative sources" does not make them any less disputed, or any more "proven".

The fact is and remains that this is a selling out to the gains of the material world, i.e. more efficiency.

Apart from the apparently very broken implementation of unicode, I think most of the changes (apart from new language features) come down to efficiency of execution. I'm not sure, using a broad sweep here. But for instance:

- print x
+ print(x)

Is not mandated by anything. You go from script syntax to compiled language syntax. This agrees with the notion that they wanted more efficiency at the cost of developer pleasure. We see an evolution from a script language to a compiled language here. That is not mandated by ANYTHING other than the choice to sacrifice readability for performance.

That's what they have done. They have sacrificed the attractiveness of their language for more cpu cycle output. That is what Python 3 represents.

> Why Python 3 exists https://snarky.ca/why-python-3-exists/

He only talks about unicode, which is a deficient and extremely poor solution to the problem that is apparently arguably worse than the situation that existed before, at least for low-level programmers who have to write protocols and such things. So the argument "because unicode" really needs to be qualified with "how good is your solution actually?" But that's the only argument he makes, and they've just failed at their task.

It simply means that Python 3 is a failure with respect to the task they've set themselves, even if the original intent (to solve the unicode mess) was good. It's possible to fail at what you do, you know. If I try to climb a tree it doesn't mean I succeed.
If I try to paint a house but the walls end up in all the wrong colours and then I say "Here, I'm done" that doesn't mean that I have succeeded. From the perspective of anyone that needs to format or convert strings that are not composed of unicode code points, the whole system is a failure. For more on the rationale for changes related to iterators see http://portingguide.readthedocs.io/en/latest/iterators.html That entire rationale is only explained with one word "memory consumption". So now you are changing the design of your _grammar_ just so that the resulting code will use less memory. That is the job of the compiler, not the developer. You are exposing language internals because this way the compiler (interpreter) doesn't need to do any work, and the developer now directly has to work with intermediate objects like I said, it's more work for the developer, but in *some* cases will lead his application to use a little less memory, even though these are discardable objects in that case and the memory will be reclaimed anyway. So even though it's the programmer's job to ensure the objects get garbage collected, by scoping them properly, you NOW make it his job to perform meaningless optimizations that for 90% of code will not matter a thing, AND you make that an essential element of language "design" (nondesign) that everyone suffers from. And you call THIS good? It's a freaking hack to consume less memory. Like what the hell is going on here. Virtually no one benefits from this. Python has excellent scoping. All those objects are released quickly enough. This is even more dumb than I thought. You are really insulting my intelligence here you know. If these temporary objects get discarded anyway, then what's the deal? We have 8GB of RAM. The compiler/interpreter can easily optimize this as well internally. And what do they do: "Alternatively, you can keep the higher-order function call, and wrap the result in list. 
However, many people will find the resulting code less readable:"

Ah, now we get to it! They have changed the result of map and filter to an iterable, so the original code is now much more ugly:

"powered = list(map(power_function, numbers))"

"less readable"

Yeah, you just did that!!! For your no-benefit language corruption in order to save a bit of memory that gets instantly garbage collected.

I am just perplexed. This is really face-palm. The extra step to convert to list can only be less efficient unless the interpreter optimizes that away, which I doubt.

This is plain dumb and plain ignorant.

So you all bought the rap that this would be beneficial but you didn't even look into it yourself.

Meanwhile Python 3.4 can be excessively slower than 2.7. SO WHERE'S THE GAIN?

https://www.raspberrypi.org/forums/viewtopic.php?t=183829

I have never heard of a dumber decision in any programming language than this.

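For reference, the options under discussion can be put side by side. This is a sketch only: the body of `power_function` is a hypothetical stand-in, since the message quotes just the one line `powered = list(map(power_function, numbers))`.

```python
def power_function(x):
    # Hypothetical stand-in for the function named in the quoted line.
    return x ** 2

numbers = [1, 2, 3, 4]

# Option 1: keep the higher-order call and materialise it explicitly.
wrapped = list(map(power_function, numbers))

# Option 2: a list comprehension, which builds the list directly.
comprehension = [power_function(n) for n in numbers]

# Option 3: when the result is only looped over, no list is needed at all.
total = 0
for value in map(power_function, numbers):
    total += value

print(wrapped, comprehension, total)
```

The first two produce the same list; the third is the case where the lazy result is consumed directly in a loop.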
Re: On Lists and Iterables
Neal McBurnett wrote on 16-12-2017 18:16:

> For example here I port your one-line Python 2 script that uses zip.
>
> $ cat porting_example.py
> print zip(["a","b"], ["c","d"])
>
> $ python porting_example.py
> [('a', 'c'), ('b', 'd')]
>
> $ 2to3 porting_example.py > porting_example.patch
>
> $ cat porting_example.patch
> --- porting_example.py (original)
> +++ porting_example.py (refactored)
> @@ -1 +1 @@
> -print zip(["a","b"], ["c","d"])
> +print(list(zip(["a","b"], ["c","d"])))
>
> $ patch -b < porting_example.patch # -b saves original as porting_example.py.orig
>
> $ cat porting_example.py
> print(list(zip(["a","b"], ["c","d"])))
>
> $ python3 porting_example.py
> [('a', 'c'), ('b', 'd')]

Well that's sweet, but I hope you realize how much more ugly that line of code has become.

My topic is that none of you, or almost none of you, opposes this, and then goes on to say that they "have no choice" but to "follow along with upstream". That is a basic lie and that's all that is and that's all I wanted to say.

It does mean my 30 minutes of porting just went down to 5 ;-). Or well, it would still be 30, but I would have to check everything out after the fact.

Here, I will do it for you, since you go to this effort anyway.

In my one file the only changes are these:

+ for vg in list(self.vgs.values()):

Verbosity has gone up in all of the 5 instances of this, but strangely, it isn't required in 3 of them, because they are for loops.

In the other file likewise. Though I am not sure my data object will work.

And it works! Or almost, I need to change to an integer division:

    keys = Gdk.Keymap.get_entries_for_keyval(Gdk.Keymap.get_default(), keyvals[ (direction + 1) / 2 ])
    TypeError: list indices must be integers or slices, not float

And works perfectly now :p.

The point is that I don't like this "language", I do not want this "language" and I would probably sooner switch to Ruby than have to write it in this language.

This is a small file, but all of the changes resulted in increased verbosity, while gaining zero benefits; how is this to my benefit? Performance? Don't make me laugh.

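The two porting changes mentioned in this message can be sketched in isolation. The `vgs` dict and `keyvals` list below are hypothetical stand-ins for `self.vgs` and the `keyvals` list in the original script, which is not shown in the thread. A plain for loop over `.values()` works without the list() wrapper; 2to3 keeps the wrapper in loops, presumably because it cannot tell whether the loop body modifies the dict. The index fix is the `//` operator.

```python
vgs = {"vg0": "system", "vg1": "data"}  # hypothetical stand-in for self.vgs

# A plain loop over .values() needs no list() wrapper in Python 3.
names = []
for vg in vgs.values():
    names.append(vg)

# list() matters when the dict is modified while looping over it.
for key in list(vgs.keys()):
    if key == "vg1":
        del vgs[key]

keyvals = ["left", "right"]             # hypothetical stand-in for keyvals
direction = 1
index = (direction + 1) // 2            # // yields an int; / would yield 1.0

print(names, sorted(vgs), keyvals[index])
```

In Python 3, `/` always returns a float, which is what triggers the TypeError quoted above when the result is used as a list index.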
Re: On Lists and Iterables
Jonathon Fernyhough wrote on 16-12-2017 16:29:

> I've only been vaguely following this thread as it doesn't appear to be related to Ubuntu Development, but it seems to me you're annoyed that you started learning Python2 before finding out that it is being replaced by Python3.

No, I don't like Python3. I am happy that I have the chance to still 'learn' Python2 (as if it has anything to do with learning; you can do that 20 years into the future also).

I have zero investment in Python2. Well, it would take me exactly 30 minutes to convert everything I have written to Python3, probably. And that would only be the "learning" part of what to change it into...

I just do not want to have to program in that language. I am more likely to start learning Ruby than to have to use Python 3.

> This isn't an Ubuntu problem - you started learning the wrong language version in the first place

I never opened the topic of language internals; mr. Watson did.

My topic was that Ubuntu obviously never opposed any of the changes, and most replying here pretty much evidence that they are in favour of them.

Now, they claim that they can do nothing against upstream and that they have to follow the developments, that they are defending at every moment themselves.

It's playing the victim while you are playing a leading role in making it happen.

No one here has evidenced being opposed to the abandonment of python2.

No one here has evidenced being opposed to the forced nature of its discontinuation, except for mr. Watson in saying that he would have preferred a 'legacy' mode for the interpreter.

Therefore, while you are all stout defenders of upstream policy, don't then go play the victim saying you can do nothing about it and you "have to".

Screw that, you are just lying to yourselves and to everyone.

I said that more forcefully in my mind ;-).

> If you have an issue with the _language syntax_ then you're definitely on the wrong list.

Again, my topic was Ubuntu's stance.