Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)
On Tue, Mar 07, 2017 at 09:10:00PM +, Elliot Gorokhovsky wrote: > I think the bigger problem, though, is that most list use does *not* > involve sorting, There are other uses for a type hint apart from sorting. There may be optimized versions of other functions (sum, math.fsum, ...) and list methods (e.g. count, remove), anything that has to walk the list comparing items). All very hypothetical at the moment, I admit it... > so it would be a shame to impose the non-trivial overhead > of type-checking on *all* list use. You might be right, but my guess is that the overhead isn't quite as big as you may think, at least for some operations. The type-check itself is just a pointer compare, not an issubclass() operation. The types are either identical, or the list is hetrogeneous. True, `mylist[i] = x` should be a quick assignment into an array of pointers, but there's a bounds check to ensure i is within the correct range. Other methods are even more expensive: `mylist[i:j] = [a...z]` has to move list items around, possibly allocate more memory, and update the list length, compared to that the scan of a...z is probably minimal. `mylist.append(x)` is *usually* a simple assignment into the array (plus updating the list length) but every now and again it triggers a resize, which is costly. I don't think we can predict whether this is a nett gain or loss just from first principles. > Anyway, my patch could always be a precursor to a more general optimization > along these lines. Indeed! Even if nothing else comes of this than your patch, thank you! -- Steve ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)
On Tue, Mar 07, 2017 at 08:46:45PM +, Erik wrote: > I don't think anyone has mentioned this yet, but FWIW I think the 'type > hint' may need to be tri-state: heterogeneous (NULL), homogeneous (the > pointer to the type structure) and also "unknown" (a sentinel value - > the address of a static char or something). I thought about that and rejected it as an unnecessary complication. Hetrogeneous and unknown might as well be the same state: either way, you cannot use the homogeneous-type optimization. Part of the complexity here is that I'd like this flag to be available to Python code, not just a hidden internal state of the list. But my instinct for this could be wrong -- if anyone is interested to do some experiments on this, it might be that a three-state flag works out better in practice than two-states. > Otherwise, a delete operation on the list would need to scan the list to > work out if it had changed from heterogeneous to homogeneous (unless it > was acceptable that once heterogeneous, a list is always considered > heterogeneous - i.e., delete always sets the hint to NULL). In a later email, you corrected this: a delete operation need not touch the type-hint (except when the last item is deleted, at which point it resets to None/unknown. With a three-state flag, you can make a three-way decision: If the flag is Unknown: - do a O(N) scan to determine whether the list is homogeneous or hetrogeneous, then choose the optimized or unoptimized routine; if the flag is Hetrogeneous: - there is no point doing a scan, so always choose the unoptimized routine; if the flag is a type: - the list is definitely homogeneous, so depending on the type, you may be able to choose an optimized rountine. Compared to that, a two-state flag misses some opportunities to run the optimized routine: - list contains [1, 2, 3, 4, "foo"] so the hint is set to None; - delete the last item; - list is now homogeneous but the hint is still None. But also avoids bothering with an O(N) scan in some situations where the list really is hetrogeneous. So there's both an opportunity cost and a benefit. There may be other opportunities to do the scan, such as when the underlying pre-allocated array resizes, so even with the two-state flag, "unknown" need not stay unknown forever. > I'd prefer the sort optimization to be based on what my list contains > NOW, not on what it may have contained some time in the past, Remember, we're talking about opportunities for applying an optimization here, nothing more. You're not giving up anything: at worst, the ordinary, unoptimized routine will run and you're no worse off than you are today. > so I'm not > a fan of the "once heterogeneous, always considered heterogeneous" > behaviour if it's cheap enough to avoid it. It is not just a matter of the cost of tracking three states versus two. It is a matter of the complexity of the interface. I suppose this could be reported to Python code as None, False or the type. Although, I don't know about you, but I know I'll never be able to remember whether None means Unknown and False means Hetrogeneous, or the other way around. Ultimately, this is all very pie-in-the-sky unless somebody tests just how expensive this is and whether the benefit is worthwhile. That's not going to be me: I'm not able to hack on the list C code. -- Steve ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)
On Tue, Mar 7, 2017 at 4:36 PM, Erikwrote: > * listextend() - this should do the right thing with the type hint when >> extending one list with another. >> > > * Several other methods ('contains', 'remove', 'count', 'index') also use > PyObject_RichCompareBool(). They could also presumably benefit from the > same optimisation (perhaps it's not all about sort() - perhaps this gives a > little more weight to the idea). Good point about list.extend(). I don't think __type_hint__ could help with .__contains__() or .count() or .remove(). E.g.: In [7]: lst = [1.0, 2.0, 1+0j, F(1,1)] In [8]: from fractions import Fraction as F In [9]: lst = [1.0, 2.0, 1+0j, F(1,1)] In [10]: 1 in lst Out[10]: True In [11]: lst.count(1) Out[11]: 3 In [12]: l.index(1) Out[12]: 0 The list has absolutely nothing of the right type. Yet it contains an item, counts things that are equal, finds a position for an equal item. -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)
On Tue, Mar 7, 2017 at 1:47 PM Erikwrote: > > I'd prefer the sort optimization to be based on what my list contains > NOW, not on what it may have contained some time in the past, so I'm not > a fan of the "once heterogeneous, always considered heterogeneous" > behaviour if it's cheap enough to avoid it. > Sure. Dictionaries actually don't implement this, though: as soon as they see a non-string key, they permanently switch to a heterogeneous state (IIRC). I think the bigger problem, though, is that most list use does *not* involve sorting, so it would be a shame to impose the non-trivial overhead of type-checking on *all* list use. With dictionaries, you have to type-check inserts anyway (for equality testing), so it's less of a problem. But the fundamental list operations *don't* require type-checking currently, so why add it in? In practice, the pre-sort check is *very* cheap, and its cost is only imposed on sort usage. Anyway, my patch could always be a precursor to a more general optimization along these lines. I'm almost finished fixing the problem Tim identified earlier in this thread; after that, it'll be ready for review! ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)
On 07/03/17 20:46, Erik wrote: (unless it was acceptable that once heterogeneous, a list is always considered heterogeneous - i.e., delete always sets the hint to NULL). Rubbish. I meant that delete would not touch the hint at all. E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Wrapper for ctypes
Ever looked up cffi? You won't be disappointed. -- Ryan (ライアン) Yoko Shimomura > ryo (supercell/EGOIST) > Hiroyuki Sawano >> everyone else http://refi64.com On Mar 7, 2017 3:43 AM, "George Fischhof"wrote: > Hi Guys, > > right now I had to call functions from a dll, and I started using ctypes. > > > I found this library too > > https://pypi.python.org/pypi/pywrap/0.1.0 > > which says (qutation): > > Replace this: > > prototype = ctypes.WINFUNCTYPE(wintypes.HANDLE, wintypes.UINT, > wintypes.HANDLE)paramflags = (1, "uFormat"), (1, "hMem")SetClipboardData = > prototype(("SetClipboardData", user32), paramflags)SetClipboardData.errcheck > = null_errcheck > > With this: > > SetClipboardData = pywrap.wrap_winapi(name="SetClipboardData", > library=user32, > restype=wintypes.BOOL, > params=[ > Parameter("uFormat", > wintypes.UINT), > Parameter("hMem", wintypes.HANDLE) > ], > errcheck=null_errcheck) > > > > (end qutation) > > My idea: something like this library should be implemented in Python to make > it more simple to use ctypes. > > BR, > > George > > > ___ > Python-ideas mailing list > Python-ideas@python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Wrapper for ctypes
I have in the past written a small utility to use compact signatures like: i i -> i to indicate a function taking two ints and returning an int. See for example: https://github.com/stephanh42/armasm Op 7 mrt. 2017 10:43 a.m. schreef "George Fischhof": > Hi Guys, > > right now I had to call functions from a dll, and I started using ctypes. > > > I found this library too > > https://pypi.python.org/pypi/pywrap/0.1.0 > > which says (qutation): > > Replace this: > > prototype = ctypes.WINFUNCTYPE(wintypes.HANDLE, wintypes.UINT, > wintypes.HANDLE)paramflags = (1, "uFormat"), (1, "hMem")SetClipboardData = > prototype(("SetClipboardData", user32), paramflags)SetClipboardData.errcheck > = null_errcheck > > With this: > > SetClipboardData = pywrap.wrap_winapi(name="SetClipboardData", > library=user32, > restype=wintypes.BOOL, > params=[ > Parameter("uFormat", > wintypes.UINT), > Parameter("hMem", wintypes.HANDLE) > ], > errcheck=null_errcheck) > > > > (end qutation) > > My idea: something like this library should be implemented in Python to make > it more simple to use ctypes. > > BR, > > George > > > ___ > Python-ideas mailing list > Python-ideas@python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/