New submission from Serhiy Storchaka <storchaka+cpyt...@gmail.com>:

Currently list uses the following formula for allocating an array for items (if 
size != 0):

    allocated = size + (size >> 3) + (3 if size < 9 else 6)

If add items by 1 the growth pattern is:  0, 4, 8, 16, 25, 35, 46, 58, 72, 88, 
...

I think this strategy can be improved.

1. There is no much sense in allocating the space for 25 items. The standard 
Python allocator returns addresses aligned to 16 bytes on 64-bit platform. It 
will allocate a space for 26 items. It is more efficient if the starting 
address is a multiple of some power of two, and perhaps larger than the pointer 
size.

So it may be better to allocate a multiple of two or four items. Few first 
sizes are multiple of four, so this is good multiplier. I suggest to use the 
following expression:

    allocated = (size + (size >> 3) + 6) & ~3

It adds bits AND, but removes conditional value.

2. It is common enough case if the list is created from a sequence of a known 
size and do not adds items anymore. Or if it is created by concatenating of few 
sequences. In such cases the list can overallocate a space which will never be 
used. For example:

>>> import sys
>>> sys.getsizeof([1, 2, 3])
80
>>> sys.getsizeof([*(1, 2, 3)])
104

List display allocates a space for exactly 3 items. But a list created from a 
tuple allocates a space for 6 items.

Other example. Let extend a list by 10 items at a time.

size allocated
 10     17
 20     28
 30     39
 40     51
 50     51
 60     73
 70     73
 80     96
 90     96
100    118

We need to reallocate an array after first four steps. 17 is too small for 20, 
28 is too small for 30, 39 is too small for 40. So overallocating does not 
help, it just spends a space.

My idea, that if we adds several items at a time and need to reallocate an 
array, we check if the overallocated size is enough for adding the same amount 
of items next time. If it is not enough, we do not overallocate.

The final algorithm is:

    allocated = (size + (size >> 3) + 6) & ~3
    if size - old_size > allocated - size:
        allocated = (size + 3) & ~3

It will save a space if extend a list by many items few times.

----------
components: Interpreter Core
messages: 353976
nosy: rhettinger, serhiy.storchaka, tim.peters
priority: normal
severity: normal
status: open
title: List overallocation strategy
type: resource usage
versions: Python 3.9

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue38373>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to