On Sun, Mar 05, 2017 at 07:19:43AM +0000, Elliot Gorokhovsky wrote: > You may remember seeing some messages on here about optimizing list.sort() > by exploiting type-homogeneity: since comparing apples and oranges is > uncommon (though possible, i.e. float to int), it pays off to check if the > list is type-homogeneous
I sometimes need to know if a list is homogenous, but unfortunately checking large lists for a common type in pure Python is quote slow. Here is a radical thought... why don't lists track their common type themselves? There's only a few methods which can add items: - append - extend - insert - __setitem__ Suppose we gave lists a read-only attrribute, __type_hint__, which returns None for hetrogeneous lists and the type for homogeneous lists. Adding an item to the list does as follows: - if the list is empty, adding an item sets __type_hint__ to type(item); - if the list is not empty, adding an item tests whether type(item) is identical to (not a subclass) of __type_hint__, and if not, sets __type_hint__ to None; - removing an item doesn't change the __type_hint__ unless the list becomes empty, in which case it is reset to None; - if the internal allocated space of the list shrinks, that triggers a recalculation of the __type_hint__ if it is currently None. (There's no need to recalculate the hint if it is not None.) Optional: doing a list.sort() could also recalculate the hint. The effect will be: - if __type_hint__ is a type object, then you can be sure that the list is homogeneous; - if the __type_hint__ is None, then it might still be homogeneous, but it isn't safe to assume so. Not only could sorting take advantage of the type hint without needing to do a separate O(N) scan of the list, but so could other code. I know I would be interested in using this. I have a fair amount of code that has to track the type of any items seen in a list, and swap to a "type agnostic but slow" version if the list is not homogeneous. I could probably replace that with some variation of: if thelist.__type_hint__ is None: process_slow(thelist) else: process_fast(thelist) At the very least, I'd be interested in experimenting with this. Thoughts? -- Steve _______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/