maybe the suggestion should be "don't blindly apply six.iteritems or items" rather than "don't apply iteritems at all". admittedly, six.iteritems is a massive eyesore, but some projects really do deal with large data results, and enforcing the latter policy can have negative effects[1]. one "million item dictionary" might be negligible, but in a multi-user, multi-* environment that can have a significant impact on the amount of memory required to store everything.
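for anyone who wants to check the memory side rather than take it on faith, here is a rough sketch of one way to measure it. it is only an illustration under some assumptions: it runs on python 3 (where tracemalloc is available), it uses list(d.items()) as a stand-in for what py2's d.items() did (build a full list of tuples), and it treats py3's d.items() view as the equivalent of py2's d.iteritems(). the peak_bytes helper and the dictionary size are made up for the example, and no numbers are claimed here.

```python
import tracemalloc


def peak_bytes(make_iterable):
    """Return the peak number of bytes allocated while looping over make_iterable().

    Hypothetical helper for this sketch; only allocations made after
    tracemalloc.start() are counted, so the dict itself is excluded.
    """
    tracemalloc.start()
    try:
        for _ in make_iterable():
            pass
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


# Arbitrary size chosen so the sketch runs quickly; not taken from the thread.
d = dict(enumerate(range(100000)))

# Assumption: list(d.items()) mimics Python 2's d.items(), which copied
# every (key, value) pair into a new list before the loop started.
eager_peak = peak_bytes(lambda: list(d.items()))

# Python 3's d.items() is a view that yields pairs one at a time,
# comparable to Python 2's d.iteritems().
lazy_peak = peak_bytes(lambda: d.items())

print("eager copy peak: %d bytes" % eager_peak)
print("lazy view peak:  %d bytes" % lazy_peak)
```

the gap between the two figures is the per-iteration copy that would be paid by each concurrent request doing the same loop, which is the multi-user concern above. whether it matters for a given project still depends on real dictionary sizes and concurrency.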
[1] disclaimer: i have no real world results but i assume memory management was the reason for the switch in logic from py2 to py3

cheers,
gord

----------------------------------------
> Date: Wed, 10 Jun 2015 12:15:33 +1200
> From: robe...@robertcollins.net
> To: openstack-dev@lists.openstack.org
> Subject: [openstack-dev] [all][python3] use of six.iteritems()
>
> I'm very glad folk are working on Python3 ports.
>
> I'd like to call attention to one little wart in that process: I get
> the feeling that folk are applying a massive regex to find things like
> d.iteritems() and convert that to six.iteritems(d).
>
> I'd very much prefer that such a regex approach move things to
> d.items(), which is much easier to read.
>
> Here's why. Firstly, very very very few of our dict iterations are
> going to be performance sensitive in the way that iteritems() matters.
> Secondly, no really - unless you're doing HUGE dicts, it doesn't
> matter. Thirdly. Really, it doesn't.
>
> At 1 million items the overhead is 54ms[1]. If we're doing inner loops
> on million item dictionaries anywhere in OpenStack today, we have a
> problem. We might want to in e.g. the scheduler... if it held
> in-memory state on a million hypervisors at once, because I don't
> really want to imagine it pulling a million rows from a DB on every
> action. But then, we'd be looking at a whole 54ms. I think we could
> survive, if we did that (which we don't).
>
> So - please, no six.iteritems().
>
> Thanks,
> Rob
>
>
> [1]
> python2.7 -m timeit -s 'd=dict(enumerate(range(1000000)))' 'for i in d.items(): pass'
> 10 loops, best of 3: 76.6 msec per loop
> python2.7 -m timeit -s 'd=dict(enumerate(range(1000000)))' 'for i in d.iteritems(): pass'
> 100 loops, best of 3: 22.6 msec per loop
> python3.4 -m timeit -s 'd=dict(enumerate(range(1000000)))' 'for i in d.items(): pass'
> 10 loops, best of 3: 18.9 msec per loop
> pypy2.3 -m timeit -s 'd=dict(enumerate(range(1000000)))' 'for i in d.items(): pass'
> 10 loops, best of 3: 65.8 msec per loop
> # and out of interest, assuming that that hadn't triggered the JIT.... but it had.
> pypy -m timeit -n 1000 -s 'd=dict(enumerate(range(1000000)))' 'for i in d.items(): pass'
> 1000 loops, best of 3: 64.3 msec per loop
>
> --
> Robert Collins <rbtcoll...@hp.com>
> Distinguished Technologist
> HP Converged Cloud
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev