To quote from PEP 424:

Rationale
=========

Being able to pre-allocate lists based on the expected size, as estimated by ``__length_hint__``, can be a significant optimization. CPython has been
observed to run some code faster than PyPy, purely because of this optimization
being present.


Why is it a significant optimisation?

How much slower is it?
Where is the data?

*If* resizing list is so slow, then why not make it faster?

To quote wikipedia (http://en.wikipedia.org/wiki/Dynamic_array)

"""
As n elements are inserted, the capacities form a geometric progression.
Expanding the array by any constant proportion ensures that inserting n elements
takes O(n) time overall, meaning that each insertion takes amortized constant 
time.
The value of this proportion a leads to a time-space tradeoff: the average time 
per
insertion operation is about a/(a-1), while the number of wasted cells is 
bounded
above by (a-1)n. The choice of a depends on the library or application: some 
textbooks
use a = 2, but Java's ArrayList implementation uses a = 3/2 and the C 
implementation
of Python's list data structure uses a = 9/8.
"""

If resizing of lists is too slow, then we should reconsider the 9/8 factor
and/or look to tweak the resizing code.

Cheers,
Mark.

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Reply via email to