[Python-Dev] PEP-646 question: unpacking into single Generic parameter
Hello, first of all, thanks for working on variadic generics! I look forward to them. My question: all the examples in https://www.python.org/dev/peps/pep-0646/ unpack into variadic arguments. But can I write code like this? ``` Ts = TypeVarTuple("Ts") def enumerate_args(f: Callable[[*Tuple[int, Ts]]], *args: *Ts): f(*enumerate(args)) ``` In particular I'm talking about the `*Tuple[int, Ts]` syntax. All the examples from the PEP use `*Ts` so I don't know if this is legal, but I hope so. This should probably be clarified in the PEP. -- Best regards, Willi Schinmeyer ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/K76H6C5JWGJ7TW6DGOCZGOTVONWOFCWD/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Retrieve an arbitrary element from a set without removing it
Hi, recently I wrote an algorithm, in which very often I had to get an arbitrary element from a set without removing it. Three possibilities came to mind: 1. x = some_set.pop() some_set.add(x) 2. for x in some_set: break 3. x = iter(some_set).next() Of course, the third should be the fastest. It nevertheless goes through all the iterator creation stuff, which costs some time. I wondered, why the builtin set does not provide a more direct and efficient way for retrieving some element without removing it. Is there any reason for this? I imagine something like x = some_set.get() or x = some_set.pop(False) and am thinking about providing a patch against setobject.c (preferring the .get() solution being a stripped down pop()). Before, I would like to know whether I have overlooked something or whether this can be done in an already existing way. Thanks, wr ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] First shot at some_set.get()
Hi, here is the first shot to provide a faster means of retrieving an arbitrary element from a set without removing it. The times for = from timeit import * stat1 = "for i in xrange(100): iter(s).next()" stat2 = "for i in xrange(100): s.get()" for stat in [stat1, stat2]: t = Timer(stat, setup="s=set(range(1))") # outside the try/except try: print t.timeit(number=1000) except: t.print_exc() = are stat1: 0.0451760292053 stat2: 0.0148510932922 The patch is attached. Feel free to criticize. I would love to see something like that in the standard Python interpreter. Regards, wr Index: Objects/setobject.c === --- Objects/setobject.c (Revision 75593) +++ Objects/setobject.c (Arbeitskopie) @@ -757,6 +757,49 @@ PyDoc_STRVAR(pop_doc, "Remove and return an arbitrary set element.\n\ Raises KeyError if the set is empty."); +static PyObject * +set_get(PySetObject *so) +{ + register Py_ssize_t i = 0; + register setentry *entry; + PyObject *key; + + assert (PyAnySet_Check(so)); + if (so->used == 0) { + PyErr_SetString(PyExc_KeyError, "get from an empty set"); + return NULL; + } + + /* Set entry to "the first" unused or dummy set entry. We abuse + * the hash field of slot 0 to hold a search finger: + * If slot 0 has a value, use slot 0. + * Else slot 0 is being used to hold a search finger, + * and we use its hash value as the first index to look. + */ + entry = &so->table[0]; + if (entry->key == NULL || entry->key == dummy) { + i = entry->hash; + /* The hash field may be a real hash value, or it may be a + * legit search finger, or it may be a once-legit search + * finger that's out of bounds now because it wrapped around + * or the table shrunk -- simply make sure it's in bounds now. + */ + if (i > so->mask || i < 1) + i = 1; /* skip slot 0 */ + while ((entry = &so->table[i])->key == NULL || entry->key==dummy) { + i++; + if (i > so->mask) +i = 1; + } + } + key = entry->key; +Py_INCREF(key); + return key; +} + +PyDoc_STRVAR(get_doc, "Return an arbitrary set element without removing it.\n\ +Raises KeyError if the set is empty."); + static int set_traverse(PySetObject *so, visitproc visit, void *arg) { @@ -2043,6 +2086,8 @@ issuperset_doc}, {"pop", (PyCFunction)set_pop, METH_NOARGS, pop_doc}, + {"get", (PyCFunction)set_get, METH_NOARGS, + get_doc}, {"__reduce__", (PyCFunction)set_reduce, METH_NOARGS, reduce_doc}, {"remove", (PyCFunction)set_remove, METH_O, @@ -2355,6 +2400,16 @@ return set_pop((PySetObject *)set); } +PyObject * +PySet_Get(PyObject *set) +{ + if (!PySet_Check(set)) { + PyErr_BadInternalCall(); + return NULL; + } + return set_get((PySetObject *)set); +} + int _PySet_Update(PyObject *set, PyObject *iterable) { ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Retrieve an arbitrary element from a set without removing it
Hi, surprised about the performance of for/break provided by Vitor, I did some more testing. It revealed that indeed we can forget the get() (which was implemented as a stripped down pop()): from timeit import * stats = ["for i in xrange(1000): iter(s).next() ", "for i in xrange(1000): \n\tfor x in s: \n\t\tbreak", "for i in xrange(1000): s.add(s.pop()) ", "for i in xrange(1000): s.get() "] for stat in stats: t = Timer(stat, setup="s=set(range(100))") try: print "Time for %s:\t %f"%(stat, t.timeit(number=1000)) except: t.print_exc() $ ./test_get.py Time for for i in xrange(1000): iter(s).next() : 0.433080 Time for for i in xrange(1000): for x in s: break: 0.148695 Time for for i in xrange(1000): s.add(s.pop()) : 0.317418 Time for for i in xrange(1000): s.get() : 0.146673 In some tests, for/break was even slightly faster then get(). As always, intuition regarding performance bottlenecks is flawed ;-) Anyway, thanks for all the helpful comments, especially to Stefan for the http://comments.gmane.org/gmane.comp.python.ideas/5606 link. Regards, wr Am Freitag, 23. Oktober 2009 19:25:48 schrieb John Arbash Meinel: > Vitor Bosshard wrote: > > 2009/10/23 Willi Richert : > >> Hi, > >> > >> recently I wrote an algorithm, in which very often I had to get an > >> arbitrary element from a set without removing it. > >> > >> Three possibilities came to mind: > >> > >> 1. > >> x = some_set.pop() > >> some_set.add(x) > >> > >> 2. > >> for x in some_set: > >>break > >> > >> 3. > >> x = iter(some_set).next() > >> > >> > >> Of course, the third should be the fastest. It nevertheless goes through > >> all the iterator creation stuff, which costs some time. I wondered, why > >> the builtin set does not provide a more direct and efficient way for > >> retrieving some element without removing it. Is there any reason for > >> this? > >> > >> I imagine something like > >> > >> x = some_set.get() > > > > I see this as being useful for frozensets as well, where you can't get > > an arbitrary element easily due to the obvious lack of .pop(). I ran > > into this recently, when I had a frozenset that I knew had 1 element > > (it was the difference between 2 other sets), but couldn't get to that > > element easily (get the pun?) > > So in my testing (2) was actually the fastest. I assumed because .next() > was a function call overhead, while: > for x in some_set: > break > > Was evaluated inline. It probably still has to call PyObject_GetIter, > however it doesn't have to create a stack frame for it. > > This is what "timeit" tells me. All runs are of the form: > python -m timeit -s "s = set([10])" ... > > 0.101us "for x in s: break; x" > 0.130us "for x in s: pass; x" > 0.234us -s "n = next; i = iter" "x = n(i(s)); x" > 0.248us "x = next(iter(s)); x" > 0.341us "x = iter(s).next(); x" > > So 'for x in s: break' is about 2x faster than next(iter(s)) and 3x > faster than (iter(s).next()). > I was pretty surprised that it was 30% faster than "for x in s: pass". I > assume it has something to do with a potential "else:" statement? > > Note that all of these are significantly < 1us. So this only matters if > it is something you are doing often. > > I don't know your specific timings, but I would guess that: > for x in s: break > > Is actually going to be faster than your > s.get() > > Primarily because s.get() requires an attribute lookup. I would think > your version might be faster for: > stat2 = "g = s.get; for i in xrange(100): g()" > > However, that is still a function call, which may be treated differently > by the interpreter than the for:break loop. I certainly suggest you try > it and compare. > > John > =:-> > ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Retrieve an arbitrary element from a set without removing it
Hi, I agree. But, here are the pros/cons collected from the recent list repsonses: Pro: - more readable - newbies will encounter one of the fastest solution (.get()) before trying slower "first solutions" like (iter(set).next()) Cons: - no name consensus. get() getany() arbitrary() ? - BDFL moratorium, which I find very wise (get() is, however, no language extension, but std lib extension, which Guido did not moratorize, didn't he?) - other classes should then also be extended, like frozenset Regards, wr PS: what is the correct verb form of moratorium? Am Samstag, 24. Oktober 2009 00:49:38 schrieb Steven D'Aprano: > On Sat, 24 Oct 2009 07:53:24 am Willi Richert wrote: > > Hi, > > > > surprised about the performance of for/break provided by Vitor, I did > > some more testing. It revealed that indeed we can forget the get() > > (which was implemented as a stripped down pop()): > > I don't understand that conclusion. According to your tests, your > implementation of get() is as fast as "for x in set: break", and it's > certainly much, much more readable and straightforward. > ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Retrieve an arbitrary element from a set without removing it
Hi, someone on this list mentioned that much of the s.get() time is spend on the name lookup for get(). That is indeed the case: === from timeit import * stats = ["for i in xrange(1000): iter(s).next() ", "for i in xrange(1000): \n\tfor x in s: \n\t\tbreak", "for i in xrange(1000): s.add(s.pop()) ", "for i in xrange(1000): s.get() ", "g=s.get;\nfor i in xrange(1000): g() "] for stat in stats: t = Timer(stat, setup="s=set(range(1000))") print "Time for %s:\t %f"%(stat, t.timeit(number=1000)) == Time for for i in xrange(1000): iter(s).next() : 0.448227 Time for for i in xrange(1000): for x in s: break: 0.141669 Time for for i in xrange(1000): s.add(s.pop()) : 0.348055 Time for for i in xrange(1000): s.get() : 0.148580 Time for g=s.get; for i in xrange(1000): g() :0.080563 So, now set.get() is indeed the fastest and preferable solution if you need massive amounts of retrieving elements from a set without removing them. wr Am Freitag, 23. Oktober 2009 22:53:24 schrieb Willi Richert: > Hi, > > surprised about the performance of for/break provided by Vitor, I did some > more testing. It revealed that indeed we can forget the get() (which was > implemented as a stripped down pop()): > > from timeit import * > stats = ["for i in xrange(1000): iter(s).next() ", > "for i in xrange(1000): \n\tfor x in s: \n\t\tbreak", > "for i in xrange(1000): s.add(s.pop()) ", > "for i in xrange(1000): s.get() "] > > for stat in stats: > t = Timer(stat, setup="s=set(range(100))") > try: > print "Time for %s:\t %f"%(stat, t.timeit(number=1000)) > except: > t.print_exc() > > > > $ ./test_get.py > Time for for i in xrange(1000): iter(s).next() : 0.433080 > Time for for i in xrange(1000): > for x in s: > break: 0.148695 > Time for for i in xrange(1000): s.add(s.pop()) : 0.317418 > Time for for i in xrange(1000): s.get() : 0.146673 > > > In some tests, for/break was even slightly faster then get(). > > As always, intuition regarding performance bottlenecks is flawed ;-) > > Anyway, thanks for all the helpful comments, especially to Stefan for the > http://comments.gmane.org/gmane.comp.python.ideas/5606 link. > > Regards, > wr > > Am Freitag, 23. Oktober 2009 19:25:48 schrieb John Arbash Meinel: > > Vitor Bosshard wrote: > > > 2009/10/23 Willi Richert : > > >> Hi, > > >> > > >> recently I wrote an algorithm, in which very often I had to get an > > >> arbitrary element from a set without removing it. > > >> > > >> Three possibilities came to mind: > > >> > > >> 1. > > >> x = some_set.pop() > > >> some_set.add(x) > > >> > > >> 2. > > >> for x in some_set: > > >>break > > >> > > >> 3. > > >> x = iter(some_set).next() > > >> > > >> > > >> Of course, the third should be the fastest. It nevertheless goes > > >> through all the iterator creation stuff, which costs some time. I > > >> wondered, why the builtin set does not provide a more direct and > > >> efficient way for retrieving some element without removing it. Is > > >> there any reason for this? > > >> > > >> I imagine something like > > >> > > >> x = some_set.get() > > > > > > I see this as being useful for frozensets as well, where you can't get > > > an arbitrary element easily due to the obvious lack of .pop(). I ran > > > into this recently, when I had a frozenset that I knew had 1 element > > > (it was the difference between 2 other sets), but couldn't get to that > > > element easily (get the pun?) > > > > So in my testing (2) was actually the fastest. I assumed because .next() > > was a function call overhead, while: > > for x in some_set: > > break > > > > Was evaluated inline. It probably still has to call PyObject_GetIter, > > however it doesn't have to create a stack frame for it. > > > > This is what "timeit" tells me. All runs are of the form: > > python -m timeit -s "s = set([10])" ... > > > > 0.101us "for x in s: break; x" > > 0.1
Re: [Python-Dev] Retrieve an arbitrary element from a set without removing it
Hi, I totally agree regarding the efficiency. Code that relies on a fast "non- removing .pop()" probably has other worse bottlenecks that should be targetted first. This would, however, relief every programmer who needs this the first time in his Python experience, to research how this could be most reasonably done. And it is more of a style question. I find the "for x in some_set: break" rather ugly. wr Am Montag, 26. Oktober 2009 08:29:36 schrieben Sie: > I don't find the optimization issue to be very interesting in > the case of retrieving an arbitrary element. This isn't > the kind of thing that typically appears in an inner-loop; > afterall, if you've retrieved an arbitrary element without > removing it, then successive calls to "choose" could > potentially retrieve the exact same element again and again. > > > Raymond ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Retrieve an arbitrary element from a set withoutremoving it
For those of you who want to tinker with it, I posted the patch against the current trunk at http://bugs.python.org/issue7212 Have fun, wr Am Montag, 26. Oktober 2009 21:32:32 schrieb Guido van Rossum: > On Mon, Oct 26, 2009 at 1:19 PM, Antoine Pitrou wrote: > > Jesse Noller gmail.com> writes: > >> So far, fiddling with the PEP, I'm on the fence - adding a method to a > >> built-in object type is sort of a grey area (at least in my mind). It > >> depends on if doing so significantly alters the language/syntax. > > > > We have recently added things like float.fromhex() which IMHO shouldn't > > be blocked by the moratorium (*), although they technically add a method > > to a built-in. > > > > (*) it is a minor new feature aimed at catching up with some established > > standard for an exact, non-ambiguous string representation of floats > > Okay, so it remains a gray area. > ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Retrieve an arbitrary element from a set withoutremoving it
Am Freitag, 30. Oktober 2009 03:58:16 schrieb Steven D'Aprano: > To clarify point 3, given: > > x = set.get() > y = set.get() > > then x and y will only be the same element if set has length one. > However, given: > > x = set.get() > set.add(el) > set.remove(el) > y = set.get() > > there are no guarantees about x and y being different. > > I believe that the patch supplied by Willi Richart implemented these > behaviours. > > http://bugs.python.org/issue7212 > Actually, no. The patch makes no assumption about x and y being different. It does not try to extend the core functionality of set nor does it need to maintain any state that would be necessary for that. It is just a more obvious and cleaner way for saying "for x in some_set: break". So, as Raymond says, it is the Pythonic version of "arb" in setl. wr ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Retrieve an arbitrary element from a setwithoutremoving it
Am Sonntag, 1. November 2009 12:21:15 schrieben Sie: > It seems that even those originally asking for set retrieval have gone > silent Nope. Stilll following and waiting for the verdict of the community after having filed the corpus delicti [1] wr [1]: http://bugs.python.org/issue7212 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Retrieve an arbitrary element from a set withoutremoving it
Hi, all your points are valid -- for the experienced Python programmer who has stumbled across this issue already and solved it in one of several ways. All your points, however, do not support the "one obvious way to do it" philosophy of Python. It's all about making Python even more clean and beautiful. wr Am Montag, 2. November 2009 06:29:00 schrieb Cameron Simpson: > On 30Oct2009 20:43, Chris Bergstresser wrote: > | On Fri, Oct 30, 2009 at 8:29 PM, Steven D'Aprano wrote: > | >> > Iterating over an iterable is > | >> > what iterators are for. > | > > | > set.get(), or set.pick() as Wikipedia calls it, isn't for iterating > | > over sets. It is for getting an arbitrary element from the set. > > [...] > > | > The purpose is to > | > return an arbitrary item each time it is called. If two threads get the > | > same element, that's not a problem; if one thread misses an element > | > because another thread grabbed it first, that's not a problem either. > | > If people prefer a random element instead, I have no problem with > | > that -- personally I think that's overkill, but maybe that's just me. > | > |I also think returning a random one is overkill, in the base set. > | And I'd have the base implementation do the simplest thing possible: > | return a fixed element (either the first returned when iterated over, > | or the last stored, or whatever's easiest based on the internals). > | For me, leaving the choice of *which* element to return on each call > | is a feature. It allows subclasses to change the behavior to support > | other use cases, like a random or round-robin behavior. > > Personally, I'm for the iteration spec in a lot of ways. > > Firstly, a .get()/.pick() that always returns the same element feels > horrible. Is there anyone here who _likes_ it? > > Secondly, I think the thread-unsafe arguments are weak. Question: is the > existing set iteration scheme thread safe? i.e. does it return a fresh > iterator that a thread can happily use concurrently while another thread > uses its own iterator? (Absent set modifications). If the answer is yes, > then I don't buy the thread-unsafe complaints - there are already plenty > of thread unsafe things much as lots of iterators will break in the face > of changes to the data structures over which they're iterating. If > iter(set) gets you a safe iterator (absent set modification and multiple > users of that iterator) then thread safe methods exist and are easy to > use. No presently thread-safe program is going to be broken by adding an > iterating .get() method. > > Thirdly, an iteration-like .get() is dead easy to provide: you just keep a > _single_, cycling, internal iterator made on demand the same way __iter__ > already works. So why not do just do it? There's no need to construct it > before anyone calls .get() the first time. Somewhat like: > > def get(self): > for x in self: > return x > raise something here > > but not making a new iterator for every caller. Indeed, making a new > iterater per caller, in addition to being expensive, might easily return > the same element to every caller. > > Do anyone feel an iteration capable .get() unduely burdens subclasses > that want to impement different .get()s? Both the suggested potential > subclass choices (round robin and random) suggest iteration capable > .get()s (presuming "random" means "shuffle order" rather than potentially > returning the same element twice). > > Cheers, > ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com