This half-baked idea is inspired by this thread here: https://mail.python.org/archives/list/python-ideas@python.org/message/5LGWV3YLCNBVSL4QHQKJ7RPNTMWOALQA/
which in turn was inspired by this Stackoverflow post:
https://stackoverflow.com/questions/56966429/getting-pairs-of-one-item-and-the-rest-over-a-python-list

Problem: today, more and more people are using Python to work with Big Data, or at least Biggish Data. For some folks, it is not unusual to have lists with a gigabyte or more of data.

So imagine you have a list with a hundred million items, and you need a copy of the entire list. Easy:

    alist = [...]  # 100 million items
    blist = alist[:]

And that ought to be pretty efficient too: making the slice is basically just copying a bunch of pointers, plus a bit of overhead. A good implementation should have lightning-fast list slicing, close to the speed of memcpy in C. Give or take.

But now suppose you want a copy of all the items except the item in position (let's say) 1000. Here's a bad way:

    blist = alist[:]
    del blist[1000]

That's inefficient because the list has to shift almost a hundred million items over by one position. Even if the implementation does that shift as a single low-level memory move, it's still O(n) pointer copies. Which is slowish, even in C.

Here's another way:

    blist = alist[:1000] + alist[1001:]

That has to make a slice (copying 1000 items), then a second slice (copying nearly a hundred million more items), then a third copy to concatenate them, and finally garbage-collect the two temporary slices. And while you might have enough memory for double the original list (alist and blist), you might not have enough for triple (alist, the two slices, and blist).

There are lots of other variants we could come up with, but ultimately we're doing lots of extra work that has to be thrown away, and that costs time or memory or both.

How about if lists had an in-place method for doing a fast copy of pointers from one list to another? We could do this:

    blist = [None] * (len(alist) - 1)  # Preallocate.
    blist.copyfrom(alist, index=0, start=0, end=1000)
    blist.copyfrom(alist, index=1000, start=1001)

No temporary slices are needed. Here's a pure-Python (and therefore slow) implementation:

    def copyfrom(dest, source, index=0, start=0, end=None):
        # Copy source[start:end] into dest, beginning at dest[index].
        if end is None:
            end = len(source)
        if len(dest) - index < end - start:
            raise ValueError('destination too small')
        for i in range(start, end):
            dest[index + i - start] = source[i]

In pure Python, copying each item over one by one will save memory but cost a lot of time. Doing it as a method in C should save both memory and time. The implementation could optimize the case where the source is a list, and fall back on iteration for other sequence types.

Now *personally* I don't work with a lot of Big Data, so this doesn't really scratch any of my personal itches, but I thought I'd throw it out there for others to tell me why it's a dumb idea :-)

-- 
Steve
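
P.S. Here's a quick sanity check of the pure-Python copyfrom above, just to demonstrate the intended semantics on a small list: the two-call pattern gives the same result as the copy-and-delete version. (The names alist, blist and clist are only for illustration.)

    alist = list(range(10))

    # Proposed approach: preallocate, then copy around the "hole" at index 5.
    blist = [None] * (len(alist) - 1)
    copyfrom(blist, alist, index=0, start=0, end=5)
    copyfrom(blist, alist, index=5, start=6)

    # Copy-and-delete approach, for comparison.
    clist = alist[:]
    del clist[5]

    assert blist == clist == [0, 1, 2, 3, 4, 6, 7, 8, 9]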