Re: Speed vs Memory [was: On Lists and Iterables]

2017-12-17 Thread Clint Byrum
Excerpts from Xen's message of 2017-12-17 12:34:49 +0100:
> 
> 
>  Original message 
> Subject: Re: Speed vs Memory [was: On Lists and Iterables]
> Date: 17-12-2017 12:28
> From: Xen 
> To: Neal McBurnett 
> 
> Just to summarize this.
> 
> Xen wrote on 17-12-2017 12:22:
> 
> > Meanwhile Python 3.4 can be excessively slower than 2.7. SO WHERE'S THE 
> > GAIN?
> 
> I haven't found any "definitive" benchmarks yet but there are plenty of 
> people benchmarking and they reveal that 3.4 is much slower than 2.7.
> 
> This is an example:
> 
> Python 2.7: 24344.88 pystones/second
> Python 3.4: 17459.89 pystones/second
> Nuitka 2.7: 47243.92 pystones/second
> Nuitka 3.4: 28658.92 pystones/second
> 
> That's 72% of the performance of Python 2.7, and 60% when pre-compiling.
> 
> And for that you change your language to decrease memory consumption for 
> temporary scoped objects?
> 

And better native threading. And predictable unicode support. And a cleaner
reference implementation.

Flat-out microbenchmark performance is rarely a reliable predictor of overall
performance. Most code spends more of its time waiting on other parts of the
computer than slamming through the CPU. In fact, if it is CPU-intensive, you
ought to write an extension in a more optimizable language like C or Rust.

What is often a predictor of productivity in software is how effectively a team
can communicate with one another. IMO, you've compromised all of our ability to
communicate with one another in this thread with huge rambling emails that are
mostly centered around a few repeated points.

 * You don't participate in the Python community.

 * You don't write Python for Ubuntu.

 * You don't like the differences between Python 2 and 3.

 * You'd like to argue that Ubuntu should do something to stop Python 3.

I'd like to suggest that if you want to communicate with us, you stop repeating
all of these. If you have said other things in your in-line replies, I missed
them because I skimmed past the repetition.

To the larger point: Python has learned from the Python 3 experience by
actually going through it. Ubuntu has in fact participated in it from
the very beginning and seems to have embraced the changes even if that
has turned out to be frustrating for some users.

It's over. It happened. We're here, not there. Find peace with that,
however you may, but please, if you've said something already, just stop.

-- 
Ubuntu-devel-discuss mailing list
Ubuntu-devel-discuss@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel-discuss


Re: On Lists and Iterables

2017-12-17 Thread Colin Watson
On Sun, Dec 17, 2017 at 12:22:20PM +0100, Xen wrote:
> Neal McBurnett wrote on 16-12-2017 18:16:
> > For more on the rationale for changes related to iterators see
> > http://portingguide.readthedocs.io/en/latest/iterators.html
> 
> That entire rationale is only explained with one word "memory consumption".
> 
> So now you are changing the design of your _grammar_ just so that the
> resulting code will use less memory.
> 
> That is the job of the compiler, not the developer.

I don't think the document above does a particularly good job of
explaining it, and I think you've fundamentally misunderstood things,
perhaps by extrapolating too much from toy examples.

zip() takes iterables as its inputs; concrete lists are only one kind of
iterable.  Iteration constructs are very widespread in non-trivial
Python code, and it's common to make use of iterators to express
constructions where you can cheaply extract a few elements but it would
be expensive to extract them all.

For example, I spend most of my time working on database-backed web
applications, which is a very popular application for Python.  In that
context, it's commonplace to make database queries via functions that
return iterators and do lazy loading of results.  You then iterate over
these to build a page of results (which can use things like LIMIT and
OFFSET when compiling its SQL queries), and you render and return that.
If you accidentally call something that consumes the whole input
iterable in the process, then it's going to do a *lot* of database
traffic for some queries, and it doesn't take much of that to utterly
destroy the performance of your application.
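
A minimal sketch of the lazy-loading pattern described above. It assumes a
hypothetical query_rows() generator standing in for a database query (it is
not any real ORM or driver API), and the fetched list exists only to count
how many rows the fake "database" had to produce.

```python
from itertools import islice

fetched = []  # records every row the fake "database" had to produce


def query_rows(total):
    """Hypothetical stand-in for a query that loads its rows lazily."""
    for row_id in range(total):
        fetched.append(row_id)  # pretend each row costs a round trip
        yield {"id": row_id}


def page(rows, offset, limit):
    """Materialise only one page of an arbitrarily long result iterator."""
    return list(islice(rows, offset, offset + limit))


first_page = page(query_rows(100000), offset=0, limit=10)
lazy_cost = len(fetched)  # only the rows on the page were produced

del fetched[:]
everything = list(query_rows(100000))  # what an eager call would force
eager_cost = len(fetched)
```

Building one page touches ten rows; materialising the whole iterable touches
all of them. A real application would normally also push LIMIT and OFFSET
into the SQL itself, which this sketch does not attempt to model.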

This is not something that the compiler can optimise, because the
*contract* of zip et al in Python 2 was that it would consume the entire
inputs (up to the shortest one in the case of zip, anyway); iteration is
visible to the program and can have external side-effects, and it's not
something that can be quietly optimised out given the design of the
language.  Talking about memory consumption of the result is relevant in
some cases, sure, but it's certainly not the whole story; what often
matters is the work involved in materialising the whole iterable, and
that can be very significant indeed.
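
To illustrate why this work cannot be silently skipped, here is a small
sketch in which iteration has an observable side effect. noisy() and log are
hypothetical names invented for the example; log stands in for any external
effect such as a database query or network read.

```python
log = []  # stands in for any externally visible effect (I/O, queries, ...)


def noisy(items):
    """Hypothetical iterable whose iteration can be observed from outside."""
    for item in items:
        log.append(item)
        yield item


pairs = zip(noisy("abc"), noisy([1, 2, 3]))  # Python 3: nothing consumed yet
seen_before = list(log)

first = next(pairs)  # pulls exactly one item from each input
seen_after_one = list(log)

rest = list(pairs)  # list() is what forces the remaining work
seen_after_all = list(log)
```

Under Python 2's contract the zip() call itself would have filled log
completely, and a program could legitimately depend on that, so an optimiser
could not remove it.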

In Python 2, there were many functions that took iterables as input and
returned concrete lists, consuming the entire inputs in the process.  In
most cases there were versions of these that operated in a lazy fashion
and returned iterables instead, but they were generally hidden off in
the itertools module and less obvious compared to the built-in versions.
Effectively, the language did the wrong thing by default.

Python 3 changes these around to give preference to the versions that
take iterables as input and return iterables as output, and says that if
you want a list then you have to use list() or similar to get one.  This
reduces the cognitive load of the language, because now instead of
remembering the different names for the list-returning and
iterable-returning versions of various things, you only have to remember
one version and the general rule that you use list() to materialise a
whole iterable into a list (which was already useful for other things
even in Python 2).  It makes the language simpler to learn, because
there are fewer rules and they compose well; and it makes it easier to
do what's usually the right thing.  This comes at the cost of a bit of
porting effort for some code that started out in Python 2, of which
there'll be less and less as time goes on.
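
As a short illustration of that single rule (the variable names are made up
for the example), the built-ins hand back iterators or views in Python 3, and
list() turns any of them into a concrete list:

```python
numbers = [1, 2, 3, 4]

squares = map(lambda n: n * n, numbers)  # an iterator in Python 3
is_list = isinstance(squares, list)

squares_list = list(squares)  # materialise explicitly when a list is needed
leftover = list(squares)  # the iterator has been used up by now

# The same list() call works for every iterable-returning built-in.
evens = list(filter(lambda n: n % 2 == 0, numbers))
pairs = list(zip("ab", "cd"))
keys = list({"x": 1, "y": 2}.keys())
```

The empty leftover list shows the one thing to keep in mind: an iterator can
only be consumed once, so materialise it if you need to go over it twice.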

To put it another way: "don't perform operations on collections of
unbounded size" is pretty much the number one rule for webapps that I've
picked up over the last few years, and Python 3 takes this lesson and
applies it to the core language.

Toy examples involving zip([1, 2], [3, 4]) and the like miss the point
because they simplify too much.  This family of functions is almost
always used in iteration constructs, usually "for ... in" or a
comprehension, and in those common cases the programmer doesn't have to
change anything at all.  In cases where they do need to change
something, it has the useful effect of highlighting that something a
little unusual may be going on, rather than hiding behaviour that's
potentially catastrophic at scale behind an innocuous-looking built-in
function.

> Meanwhile Python 3.4 can be excessively slower than 2.7. SO WHERE'S THE
> GAIN?

It will no doubt depend on the benchmark, and rather than cherry-picking
a single one it's likely more interesting to look at either a wide range
of benchmarks, or at the specific application in question.
Counterpoint, which also links to much more data:

  https://lwn.net/Articles/725114/

-- 
Colin Watson   [cjwat...@ubuntu.com]



Re: On Lists and Iterables

2017-12-17 Thread Colin Watson
On Sun, Dec 17, 2017 at 11:02:48AM +0100, Xen wrote:
> My topic was that Ubuntu obviously never opposed any of the changes, and
> most replying here pretty much evidence that they are in favour of them.
> 
> Now, they claim that they can do nothing against upstream and that they have
> to follow the developments, that they are defending at every moment
> themselves
> 
> It's playing the victim while you are playing a leading role in making it
> happen.

Observing that something is out of scope is not playing the victim.

Ubuntu exists, in part, to get recent versions of free software into the
hands of users as straightforwardly as possible.  For the most part we
aren't programming language designers, and running a campaign against
how the custodians of the Python language have chosen to design the
latest version of that language is simply out of scope in this project.

The economics are obvious in any case.  Enough people want to use the
latest version of the language that it's a clear requirement for us to
provide it.  In general we want to avoid shipping multiple versions of
things where possible, because that's a maintenance burden.  The Python
2-to-3 transition has been such that we've had no choice but to do so in
this case, but the maintenance burden for us is only going to get worse
once Python 2 stops being supported upstream.  Thus it's clearly in our
long-term interest to put effort into moving things over to Python 3 in
ways that fall within the remit of a distribution.  The Python packaging
teams within most other major distributions have made much the same
obvious decision, explicitly or implicitly.  We are absolutely taking a
role in making it happen as smoothly as possible (though you give us too
much credit by saying that it's a leading role), and I don't think
anyone involved is ashamed of this.

This is not the same as saying that we're powerless victims.  When we
have issues with Python, we take them to the appropriate places, rather
than posting long screeds to inappropriate mailing lists.  Several
Ubuntu developers over the years have been active contributors to Python
upstream.  Many of the changes in Python 3 releases have been the result
of people working on updating their code to the latest version of the
language, finding rough edges in the process, and discussing those in
the right place so that they could be smoothed out.  (For example,
Python 3.3 reintroduced the u'...' syntax for Unicode strings even
though it's strictly redundant, to make things easier for people
forward-porting code; 3.5 restored "bytes % args", as mentioned
recently; and so on.)  These wouldn't have happened if people had stuck
their heads in the sand and decided that they were going to hold on to
Python 2 for dear life; they happened because people decided that it was
better for the world if they collaborated to help build a better
language.
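
For reference, the two compatibility features mentioned in the parenthesis
look roughly like this on Python 3.5 or later (the header text and key/value
pair are invented for the example):

```python
# Python 3.3 (PEP 414): the u'' prefix is accepted again and yields a str.
text = u"caf\u00e9"
same_type = type(text) is str

# Python 3.5 (PEP 461): %-formatting works on bytes again, which is handy
# when building wire-protocol messages.
header = b"Content-Length: %d\r\n" % 42
pair = b"%s=%s" % (b"key", b"value")
```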

And when you actually want many of the changes involved, that's the very
opposite of playing the victim!  I want to use Python 3 for the large
codebases I help to maintain (when they aren't using Python 3 already,
as several of them are).  A few things I positively want to use are:

 * async/await syntax for coroutines; we do the best we can at the
   moment with things like @defer.inlineCallbacks in Twisted, but it's
   fiddly and ugly;

 * many improvements to the standard library's subprocess module; as one
   example, I filed https://bugs.python.org/issue1652, and while I can
   work around it or use a backport, the whole point was to make the
   improved behaviour available to everyone since the old behaviour was
   a nasty gotcha;

 * much more sensible behaviour related to inheritance of file
   descriptors across forks;

 * the enum module;

 * contextlib.ExitStack, which allows replacing some very cumbersome
   constructions in Python 2;

 * much more flexible/programmable access to module importing.
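
A small sketch of two items from the list above, the enum module and
contextlib.ExitStack. The State enum and the resource() context manager are
hypothetical names made up for the illustration; only Enum, ExitStack and
contextmanager come from the standard library.

```python
from contextlib import ExitStack, contextmanager
from enum import Enum


class State(Enum):
    OPENED = 1
    CLOSED = 2


events = []  # records what happened to each hypothetical resource


@contextmanager
def resource(name):
    """Hypothetical resource that logs when it is opened and closed."""
    events.append((name, State.OPENED))
    try:
        yield name
    finally:
        events.append((name, State.CLOSED))


# ExitStack manages a variable number of context managers without nesting
# one "with" statement per resource.
with ExitStack() as stack:
    held = [stack.enter_context(resource(n)) for n in ("a", "b", "c")]

closed_order = [name for name, state in events if state is State.CLOSED]
```

The resources are released in reverse order of acquisition when the block
exits, even if an exception is raised part-way through.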

I mean, this is just off the top of my head, but hopefully it makes the
point: sure, there are some stumbling blocks involved in porting to
Python 3 (typically fewer of them in newer versions), but there are more
than enough good things for it to be worthwhile in my opinion.  I can
get some of these with backported libraries, but not all, and in any
case that sort of thing gets very cumbersome.

Based on what you've said about how long it'd take to port all your
code, it seems that you have rather little investment in Python; in
fact, to be honest it sounds like you've spent more time in total
writing inflammatory emails about it here than actually writing Python
code.  So why not spend some time listening to the views of people here
who spend a lot of time writing Python code, as well as just the
negative posts you found on the internet (some of which have since fed
into improving the language anyway)?

> No one here has evidenced being opposed to the forced nature of its
> discontinuation, except for mr. Watson in saying that he would have
> preferred a 'l

Re: Speed vs Memory [was: On Lists and Iterables]

2017-12-17 Thread Xen



 Original message 
Subject: Re: Speed vs Memory [was: On Lists and Iterables]
Date: 17-12-2017 12:28
From: Xen 
To: Neal McBurnett 

Just to summarize this.

Xen wrote on 17-12-2017 12:22:

> Meanwhile Python 3.4 can be excessively slower than 2.7. SO WHERE'S THE 
> GAIN?


I haven't found any "definitive" benchmarks yet but there are plenty of 
people benchmarking and they reveal that 3.4 is much slower than 2.7.


This is an example:

Python 2.7: 24344.88 pystones/second
Python 3.4: 17459.89 pystones/second
Nuitka 2.7: 47243.92 pystones/second
Nuitka 3.4: 28658.92 pystones/second

That's 72% of the performance of Python 2.7, and 60% when pre-compiling.
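
The percentages can be checked directly from the four pystone figures quoted
above (the variable names are only for illustration):

```python
# pystones/second, copied from the figures above
python27, python34 = 24344.88, 17459.89
nuitka27, nuitka34 = 47243.92, 28658.92

interpreted_ratio = round(100 * python34 / python27, 1)  # 3.4 relative to 2.7
compiled_ratio = round(100 * nuitka34 / nuitka27, 1)  # same, under Nuitka
```

This gives roughly 71.7% for the plain interpreters and 60.7% for the Nuitka
builds on this one benchmark.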

And for that you change your language to decrease memory consumption for 
temporary scoped objects?




Re: On Lists and Iterables

2017-12-17 Thread Xen

Neal McBurnett wrote on 16-12-2017 18:16:

> Though, as John Lenton notes, there are excellent reasons for most of
> the changes in Python 3


By the way, all of these "excellent reasons" are disputed.

And citing "authoritative sources" does not make them any less disputed, 
or any more "proven".


The fact is and remains that this is a selling out to the gains of the 
material world, ie. more efficiency.


Apart from the apparently very broken implementation of unicode, I think 
most of the changes (apart from new language features) come down to 
efficiency of execution.


I'm not sure, using a broad sweep here.

But for instance:

- print x
+ print(x)

Is not mandated by anything. You go from script syntax to compiled 
language syntax. This agrees with the notion that they wanted more 
efficiency at the cost of developer pleasance.


We see an evolution from a script language to a compiled language here. 
That is not mandated by ANYTHING other than the choice to sacrifice 
readability for performance.


That's what they have done. They have sacrificed the attractiveness of 
their language for more cpu cycle output.


That is what Python 3 represents.


>    Why Python 3 exists https://snarky.ca/why-python-3-exists/


He only talks about unicode.

Which is a deficient and extremely poor solution to the problem that is 
apparently arguably worse than the situation that existed before, at 
least for lowlevel programmers who have to write protocols and such 
things.


So the argument "because unicode" really needs to be qualified with "how 
good is your solution actually?"


But that's the only argument he makes, and they've just failed at their 
task.


It simply means that Python 3 is a failure with respect to the task 
they've set themselves, even if the original intent (to solve the 
unicode mess) was good.


It's possible to fail at what you do, you know.

If I try to climb a tree it doesn't mean I succeed.

If I try to paint a house but the walls end up in all the wrong colours 
and then I say "Here, I'm done" that doesn't mean that I have succeeded.


From the perspective of anyone that needs to format or convert strings 
that are not composed of unicode code points, the whole system is a 
failure.



> For more on the rationale for changes related to iterators see
> http://portingguide.readthedocs.io/en/latest/iterators.html


That entire rationale is only explained with one word "memory 
consumption".


So now you are changing the design of your _grammar_ just so that the 
resulting code will use less memory.


That is the job of the compiler, not the developer.

You are exposing language internals because this way the compiler 
(interpreter) doesn't need to do any work, and the developer now has to 
work directly with intermediate objects. Like I said, it's more work for 
the developer, but in *some* cases it will lead his application to use a 
little less memory, even though these are discardable objects in that 
case and the memory will be reclaimed anyway.

So even though it's the programmer's job to ensure the objects get 
garbage collected, by scoping them properly, you NOW make it his job to 
perform meaningless optimizations that for 90% of code will not matter a 
thing, AND you make that an essential element of language "design" 
(nondesign) that everyone suffers from.


And you call THIS good?

It's a freaking hack to consume less memory.

Like what the hell is going on here.

Virtually no one benefits from this.

Python has excellent scoping. All those objects are released quickly 
enough.


This is even more dumb than I thought.

You are really insulting my intelligence here you know.

If these temporary objects get discarded anyway, then what's the deal?

We have 8GB of RAM.

The compiler/interpreter can easily optimize this as well internally.

And what do they do:

   "Alternatively, you can keep the higher-order function call,
   and wrap the result in list. However, many people will find
   the resulting code less readable:"

Ah, now we get to it! They have changed the result of map and filter to 
an iterable, so the original code is now much more ugly:


   "powered = list(map(power_function, numbers))"

   "less readable"

Yeah, you just did that!!!

For your no-benefit language corruption in order to save a bit of memory 
that gets instantly garbage collected.


I am just perplexed. This is really face-palm.

The extra step to convert to list can only be less efficient unless the 
interpreter optimizes that away, which I doubt.


This is plain dumb and plain ignorant.

So you all bought the rap that this would be beneficial but you didn't 
even look into it yourself.


Meanwhile Python 3.4 can be excessively slower than 2.7. SO WHERE'S THE 
GAIN?


https://www.raspberrypi.org/forums/viewtopic.php?t=183829

I have never heard of a dumber decision in any programming language than 
this.



Re: On Lists and Iterables

2017-12-17 Thread Xen

Neal McBurnett wrote on 16-12-2017 18:16:

> For example here I port your one-line Python 2 script that uses zip.
>
> $ cat porting_example.py
> print zip(["a","b"], ["c","d"])
>
> $ python porting_example.py
> [('a', 'c'), ('b', 'd')]
>
> $ 2to3 porting_example.py > porting_example.patch
>
> $ cat porting_example.patch
> --- porting_example.py  (original)
> +++ porting_example.py  (refactored)
> @@ -1 +1 @@
> -print zip(["a","b"], ["c","d"])
> +print(list(zip(["a","b"], ["c","d"])))
>
> $ patch -b < porting_example.patch  # -b saves original as porting_example.py.orig
>
> $ cat porting_example.py
> print(list(zip(["a","b"], ["c","d"])))
>
> $ python3 porting_example.py
> [('a', 'c'), ('b', 'd')]


Well that's sweet, but I hope you realize how much more ugly that line 
of code has become.


My topic is that none of you, or almost none of you, opposes this, and 
then goes on to say that they "have no choice" but to "follow along with 
upstream".


That is a basic lie and that's all that is and that's all I wanted to 
say.


It does mean my 30 minutes of porting just went down to 5 ;-). Or well, 
it would still be 30, but I would have to check everything out after the 
fact.


Here, I will do it for you, since you go to this effort anyway.

In my one file the only changes are these:

+   for vg in list(self.vgs.values()):

Verbosity has gone up in all of the 5 instances of this, but strangely, 
it isn't required in 3 of them, because they are for loops.
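
A sketch of when the list() wrapper around a dict view actually matters. The
vgs dict here is a hypothetical stand-in for self.vgs, with made-up contents;
2to3 adds list() conservatively because it cannot tell which loops mutate the
dict.

```python
vgs = {"vg0": 10, "vg1": 20, "vg2": 5}  # hypothetical stand-in for self.vgs

# Read-only loop: iterating the view directly is fine, no list() needed.
total = 0
for size in vgs.values():
    total += size

# Changing the dict's size while iterating over it fails in Python 3.
scratch = dict(vgs)
failed = False
try:
    for name in scratch:
        del scratch[name]
except RuntimeError:
    failed = True

# Taking a snapshot with list() first makes the same kind of loop safe.
for name in list(vgs):
    if vgs[name] < 8:
        del vgs[name]
```

So the wrapper can be dropped from plain for loops that only read the values,
and kept where the loop body adds or removes keys.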


In the other file likewise. Though I am not sure my data object will 
work.


And it works! Or almost, I need to change to an integer division:

keys = Gdk.Keymap.get_entries_for_keyval(Gdk.Keymap.get_default(), 
keyvals[ (direction + 1) / 2 ])


TypeError: list indices must be integers or slices, not float

And works perfectly now :p.
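
The fix amounts to swapping / for //. A minimal sketch, leaving out the Gdk
call: keyvals is a hypothetical two-element list and direction is assumed to
be -1 or +1.

```python
keyvals = ["left", "right"]  # hypothetical stand-in for the real keyval list
direction = 1  # assumed to be -1 or +1

true_quotient = (direction + 1) / 2  # "/" always gives a float in Python 3
floor_quotient = (direction + 1) // 2  # "//" keeps integers as integers

try:
    keyvals[true_quotient]
    raised = False
except TypeError:
    raised = True  # list indices must be integers or slices, not float

chosen = keyvals[floor_quotient]
```

The // operator also exists in Python 2, so the corrected line runs unchanged
on both versions.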

The point is that I don't like this "language", I do not want this 
"language" and I would probably sooner switch to Ruby than have to write 
it in this language.


This is a small file, but all of the changes resulted in increased 
verbosity, while gaining zero benefits; how is this to my benefit?


Performance? Don't make me laugh.



Re: On Lists and Iterables

2017-12-17 Thread Xen

Jonathon Fernyhough wrote on 16-12-2017 16:29:

> I've only been vaguely following this thread as it doesn't appear to be
> related to Ubuntu Development, but it seems to me you're annoyed that
> you started learning Python2 before finding out that it is being
> replaced by Python3.


No, I don't like Python3.

I am happy that I have the chance to still 'learn' Python2 (as if it has 
anything to do with learning; you can do that 20 years into the future 
also).


I have zero investment in Python2.

Well, it would take me exactly 30 minutes to convert everything I have 
written to Python3, probably.


And that would only be the "learning" part of what to change it into...

I just do not want to have to program in that language.

I am more likely to start learning Ruby than to have to use Python 3.


> This isn't an Ubuntu problem - you started learning the wrong language
> version in the first place


I never opened the topic of language internals; mr. Watson did.

My topic was that Ubuntu obviously never opposed any of the changes, and 
most replying here pretty much evidence that they are in favour of them.


Now, they claim that they can do nothing against upstream and that they 
have to follow the developments, that they are defending at every moment 
themselves


It's playing the victim while you are playing a leading role in making 
it happen.


No one here has evidenced being opposed to the abandonment of python2.

No one here has evidenced being opposed to the forced nature of its 
discontinuation, except for mr. Watson in saying that he would have 
preferred a 'legacy' mode for the interpreter.


Therefore, while you are all stout defenders of upstream policy, don't 
then go play the victim saying you can do nothing about it and you "have 
to".


Screw that, you are just lying to yourselves and to everyone.

I said that more forcefully in my mind ;-).


> If you have an issue with the _language syntax_ then you're definitely
> on the wrong list.


Again, my topic was Ubuntu's stance.
