Re: [openstack-dev] [all][python3] use of six.iteritems()

gordon chung Tue, 09 Jun 2015 22:24:51 -0700

maybe the suggestion should be "don't blindly apply six.iteritems or items"
rather than "don't apply iteritems at all". admittedly, it's a massive eyesore,
but some projects really do deal with large data results, and enforcing the
latter policy can have negative effects[1]. one "million item dictionary"
might be negligible, but in a multi-user, multi-* environment that can have a
significant impact on the amount of memory required to store everything.


[1] disclaimer: i have no real world results but i assume memory management was 
the reason for the switch in logic from py2 to py3
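
a rough sketch of the memory side of this, for anyone who wants to measure
rather than assume. it runs on python 3 and uses list(d.items()) as a stand-in
for the list that py2's d.items() builds; the function name is made up for
illustration, and sys.getsizeof figures are CPython-specific and shallow (the
ints themselves are not counted), so treat the numbers as indicative only.

```python
import sys


def materialized_overhead(n):
    """Return (view_bytes, list_bytes) for a dict holding n items."""
    d = dict(enumerate(range(n)))
    view = d.items()           # py3: a lightweight view, nothing is copied
    copied = list(d.items())   # stand-in for the list py2's d.items() builds
    # getsizeof only counts the list container, so add the per-item tuples.
    list_bytes = sys.getsizeof(copied) + sum(sys.getsizeof(t) for t in copied)
    return sys.getsizeof(view), list_bytes


view_bytes, list_bytes = materialized_overhead(100000)
print("view: %d bytes, materialized list: %d bytes" % (view_bytes, list_bytes))
```

the view stays the same size no matter how big the dict gets, while the copied
list grows with the item count, and that growing copy is the cost that would
multiply across concurrent requests.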

cheers,
gord


----------------------------------------
> Date: Wed, 10 Jun 2015 12:15:33 +1200
> From: [email protected]
> To: [email protected]
> Subject: [openstack-dev] [all][python3] use of six.iteritems()
>
> I'm very glad folk are working on Python3 ports.
>
> I'd like to call attention to one little wart in that process: I get
> the feeling that folk are applying a massive regex to find things like
> d.iteritems() and convert that to six.iteritems(d).
>
> I'd very much prefer that such a regex approach move things to
> d.items(), which is much easier to read.
>
> Here's why. Firstly, very very very few of our dict iterations are
> going to be performance sensitive in the way that iteritems() matters.
> Secondly, no really - unless you're doing HUGE dicts, it doesn't
> matter. Thirdly. Really, it doesn't.
>
> At 1 million items the overhead is 54ms[1]. If we're doing inner loops
> on million item dictionaries anywhere in OpenStack today, we have a
> problem. We might want to in e.g. the scheduler... if it held
> in-memory state on a million hypervisors at once, because I don't
> really want to imagine it pulling a million rows from a DB on every
> action. But then, we'd be looking at a whole 54ms. I think we could
> survive, if we did that (which we don't).
>
> So - please, no six.iteritems().
>
> Thanks,
> Rob
>
>
> [1]
> python2.7 -m timeit -s 'd=dict(enumerate(range(1000000)))' 'for i in d.items(): pass'
> 10 loops, best of 3: 76.6 msec per loop
> python2.7 -m timeit -s 'd=dict(enumerate(range(1000000)))' 'for i in d.iteritems(): pass'
> 100 loops, best of 3: 22.6 msec per loop
> python3.4 -m timeit -s 'd=dict(enumerate(range(1000000)))' 'for i in d.items(): pass'
> 10 loops, best of 3: 18.9 msec per loop
> pypy2.3 -m timeit -s 'd=dict(enumerate(range(1000000)))' 'for i in d.items(): pass'
> 10 loops, best of 3: 65.8 msec per loop
> # and out of interest, assuming that that hadn't triggered the JIT.... but it had.
> pypy -m timeit -n 1000 -s 'd=dict(enumerate(range(1000000)))' 'for i in d.items(): pass'
> 1000 loops, best of 3: 64.3 msec per loop
>
> --
> Robert Collins <[email protected]>
> Distinguished Technologist
> HP Converged Cloud
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: [email protected]?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
                                          
