Re: Planning a Python Course for Beginners
On 8/11/17 6:37 AM, Python wrote:
> Marko Rauhamaa wrote:
>> Python :
>>
>>> Marko Rauhamaa wrote:
>>> I didn't disagree with any of these statements about __hash__, but only
>>> your statement about id and __eq__:
>>>
>>>> id() is actually an ideal return value of __hash__(). The only
>>>> criterion is that the returned number should be different if the
>>>> __eq__() is False. That is definitely true for id()
>>>
>>> nan is a clear, simple, undeniable counterexample to that claim.
>>
>> Still, I don't see the point you are trying to make.
>
> You do have a cognitive disease, don't you?

Maybe it's time to just drop it, rather than starting to insult each other?

--Ned.
-- https://mail.python.org/mailman/listinfo/python-list
Re: Planning a Python Course for Beginners
Marko Rauhamaa wrote:
> Python :
>
>> Marko Rauhamaa wrote:
>> I didn't disagree with any of these statements about __hash__, but only
>> your statement about id and __eq__:
>>
>>> id() is actually an ideal return value of __hash__(). The only
>>> criterion is that the returned number should be different if the
>>> __eq__() is False. That is definitely true for id()
>>
>> nan is a clear, simple, undeniable counterexample to that claim.
>
> Still, I don't see the point you are trying to make.

You do have a cognitive disease, don't you?
Re: Planning a Python Course for Beginners
Marko Rauhamaa wrote:
> Of course, some algorithms can (and, we have learned, do) prefer some
> bits over others, but that's inside the implementation black box. I
> would think every bit should carry an approximately equal weight.

Ideally that would be true, but you need to consider the performance cost of making it so. Dict could go to the trouble of thoroughly scrambling the hash bits before even making the first probe, but that would slow down *every* dict lookup. The way things are, it uses a very simple technique for the first probe that *usually* gives good results, which speeds things up overall.

-- Greg
Re: Planning a Python Course for Beginners
Steve D'Aprano wrote:
> On Thu, 10 Aug 2017 07:00 pm, Peter Otten wrote:
>
>> /* bottom 3 or 4 bits are likely to be 0; rotate y by 4 to avoid
>>    excessive hash collisions for dicts and sets */
>
> which I think agrees with my comment: using the id() itself would put
> too many objects in the same bucket (i.e. too many collisions).

I suspect this is more of a minor performance tweak than a vital issue. Otherwise it would mean that dict's algorithm for assigning items to buckets based on the hash isn't all that great.

-- Greg
Re: Planning a Python Course for Beginners
Python wrote:
> Marko Rauhamaa wrote:
>> id() is actually an ideal return value of __hash__(). The only
>> criterion is that the returned number should be different if the
>> __eq__() is False. That is definitely true for id()
>
> nan is a clear, simple, undeniable counterexample to that claim.

It's a counterexample to the claim that id() *must* be different if __eq__() is False, but that's not the claim that was made. The claim was that it *should* be different, which allows for the possibility that it might not be different.

(I'll put away my hairsplitting axe and go away now.)

-- Greg
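For readers following the NaN argument, here is a minimal sketch of the behaviour being debated. It relies only on built-in float, dict and set; the variable names are illustrative. It shows that a NaN object is unequal to itself yet can still be looked up, because containers check identity before calling __eq__().

```python
nan = float("nan")

# NaN is never equal to itself, yet it is one object with one id().
print(nan == nan)          # False
print(id(nan) == id(nan))  # True

# Lookups compare by identity first and only then fall back to __eq__,
# so the very same NaN object is still found in a dict or set.
d = {nan: "found"}
print(d[nan])              # found
print(nan in {nan})        # True

# A second NaN object is neither identical nor equal, so it is not found.
other = float("nan")
print(other in {nan})      # False
```

This is why the "should" versus "must" distinction matters here: the hash rule is satisfied trivially for a single NaN object, while two distinct NaN objects are simply treated as different keys.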
Re: Planning a Python Course for Beginners
Chris Angelico :

> On Fri, Aug 11, 2017 at 7:17 AM, Marko Rauhamaa wrote:
>> That's interesting, but suggests there's something weird (~ suboptimal)
>> going on with CPython's scrambling algorithm. Also, your performance
>> test might yield completely different results on other Python
>> implementations.
>>
>> Apart from uniqueness, there's no particular reason to prefer one
>> __hash__() value over another as long as the interesting bits are inside
>> the CPU's simple integer range.
>
> Not true. Every time you probe a new location [1], you have to fetch
> more data from RAM. That delays you significantly. An ideal hashing
> system is going to give a high probability of giving you an empty
> bucket on the first try, to minimize the number of main memory
> accesses required.
>
> CPython's scrambling algorithm means that, even when its first try
> doesn't succeed, there's a good chance that its second will succeed.
> But that doesn't change the fact that you want the first one to
> succeed as often as possible.

What does all that have to do with where the unique bits are in the hash value?

   0x1000
   0x2000
   0x3000
   0x4000

should be no worse as __hash__() values than

   0x1000
   0x2000
   0x3000
   0x4000

or

   1
   2
   3
   4

Of course, some algorithms can (and, we have learned, do) prefer some bits over others, but that's inside the implementation black box. I would think every bit should carry an approximately equal weight.

Marko
Re: Planning a Python Course for Beginners
On Fri, Aug 11, 2017 at 7:17 AM, Marko Rauhamaa wrote:
> That's interesting, but suggests there's something weird (~ suboptimal)
> going on with CPython's scrambling algorithm. Also, your performance
> test might yield completely different results on other Python
> implementations.
>
> Apart from uniqueness, there's no particular reason to prefer one
> __hash__() value over another as long as the interesting bits are inside
> the CPU's simple integer range.

Not true. Every time you probe a new location [1], you have to fetch more data from RAM. That delays you significantly. An ideal hashing system is going to give a high probability of giving you an empty bucket on the first try, to minimize the number of main memory accesses required.

CPython's scrambling algorithm means that, even when its first try doesn't succeed, there's a good chance that its second will succeed. But that doesn't change the fact that you want the first one to succeed as often as possible.

ChrisA

[1] Unless it's within the same cache line, which is eight pointers wide on my CPU. Highly unlikely when working with 100,000 pointers.
Re: Planning a Python Course for Beginners
Chris Angelico :

> I'm aware of this. Doesn't change the fact that the *INITIAL INDEX* is
> based on exactly what I said.
>
> Yaknow?

What you're saying is that CPython heavily prefers the low-order bits to be unique performance-wise. I don't know why that particular heuristic bias was chosen.

Marko
Re: Planning a Python Course for Beginners
Peter Otten <__pete...@web.de>:

> Marko Rauhamaa wrote:
>> I see no point in CPython's rotation magic.
>
> Let's see:
>
> $ cat hashperf.py
> class A(object):
>     __slots__ = ["_hash"]
>
>     def __hash__(self):
>         return self._hash
>
> def no_magic():
>     a = A()
>     a._hash = id(a)
>     return a
>
> def magic():
>     a = A()
>     a._hash = id(a) >> 4
>     return a
>
> $ python3 -m timeit -s 'from hashperf import magic, no_magic; s = {no_magic() for _ in range(10**5)}' 'for x in s: x in s'
> 10 loops, best of 3: 70.7 msec per loop
>
> $ python3 -m timeit -s 'from hashperf import magic, no_magic; s = {magic() for _ in range(10**5)}' 'for x in s: x in s'
> 10 loops, best of 3: 52.8 msec per loop
>
> "magic" wins this makeshift test. Other than that you're right ;)

That's interesting, but suggests there's something weird (~ suboptimal) going on with CPython's scrambling algorithm. Also, your performance test might yield completely different results on other Python implementations.

Apart from uniqueness, there's no particular reason to prefer one __hash__() value over another as long as the interesting bits are inside the CPU's simple integer range.

Marko
Re: Planning a Python Course for Beginners
On Fri, Aug 11, 2017 at 6:56 AM, Marko Rauhamaa wrote:
> Chris Angelico :
>
>> On Fri, Aug 11, 2017 at 6:03 AM, Marko Rauhamaa wrote:
>>> I see no point in CPython's rotation magic.
>>
>> Have you ever implemented a hashtable? The most common way to pick a
>> bucket for an object is to use modulo on the number of buckets.
>
> Like I said earlier, CPython takes the __hash__() value and scrambles
> it. Look for "perturb" in:
>
>    https://github.com/python/cpython/blob/master/Objects/dictobject.c
>
> From a comment:
>
>    Now the probe sequence depends (eventually) on every bit in the hash
>    code, and the pseudo-scrambling property of recurring on 5*j+1 is
>    more valuable, because it quickly magnifies small differences in the
>    bits that didn't affect the initial index.

I'm aware of this. Doesn't change the fact that the *INITIAL INDEX* is based on exactly what I said.

Yaknow?

ChrisA
Re: Planning a Python Course for Beginners
Chris Angelico :

> On Fri, Aug 11, 2017 at 6:03 AM, Marko Rauhamaa wrote:
>> I see no point in CPython's rotation magic.
>
> Have you ever implemented a hashtable? The most common way to pick a
> bucket for an object is to use modulo on the number of buckets.

Like I said earlier, CPython takes the __hash__() value and scrambles it. Look for "perturb" in:

   https://github.com/python/cpython/blob/master/Objects/dictobject.c

From a comment:

   Now the probe sequence depends (eventually) on every bit in the hash
   code, and the pseudo-scrambling property of recurring on 5*j+1 is
   more valuable, because it quickly magnifies small differences in the
   bits that didn't affect the initial index.

Marko
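As a rough illustration of the quoted comment, the sketch below imitates the "5*j+1 plus perturb" recurrence in pure Python. It is an approximation under stated assumptions, not CPython's actual code: the function name probe_sequence is made up, the hash is treated as an unsigned 64-bit value, and the order of the shift and the index update has differed between CPython versions (this follows one variant). PERTURB_SHIFT = 5 is the constant used in dictobject.c.

```python
PERTURB_SHIFT = 5  # constant used in CPython's dictobject.c


def probe_sequence(hash_value, table_size, count):
    """Return the first `count` slot indices probed for a hash.

    Illustrative only; table_size must be a power of two.
    """
    mask = table_size - 1
    perturb = hash_value & (2**64 - 1)  # treat the hash as unsigned
    i = perturb & mask                  # initial index: low bits only
    slots = [i]
    for _ in range(count - 1):
        perturb >>= PERTURB_SHIFT       # feed in higher bits gradually
        i = (i * 5 + perturb + 1) & mask
        slots.append(i)
    return slots


print(probe_sequence(0x1000, 8, 4))  # [0, 1, 2, 3]
print(probe_sequence(0x2000, 8, 4))  # [0, 1, 6, 7]
```

In this sketch both hashes start at slot 0 and even share the second probe; the higher bits only separate them from the third probe on. That is consistent with both positions in the thread: every bit eventually matters, but the initial index depends on the low bits alone.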
Re: Planning a Python Course for Beginners
Marko Rauhamaa wrote:

> Peter Otten <__pete...@web.de>:
>
>> Steve D'Aprano wrote:
>>> The C code says:
>>>
>>>> /* bottom 3 or 4 bits are likely to be 0; rotate y by 4 to avoid
>>>>    excessive hash collisions for dicts and sets */
>>>
>>> which I think agrees with my comment: using the id() itself would put
>>> too many objects in the same bucket (i.e. too many collisions).
>>
>> There's a subtle difference: you expected objects with nearby memory
>> addresses to end up in the same "bucket" while actually all addresses
>> (are likely to) have the same low bits, and creation time does not
>> matter.
>
> I see no point in CPython's rotation magic.

Let's see:

$ cat hashperf.py
class A(object):
    __slots__ = ["_hash"]

    def __hash__(self):
        return self._hash

def no_magic():
    a = A()
    a._hash = id(a)
    return a

def magic():
    a = A()
    a._hash = id(a) >> 4
    return a

$ python3 -m timeit -s 'from hashperf import magic, no_magic; s = {no_magic() for _ in range(10**5)}' 'for x in s: x in s'
10 loops, best of 3: 70.7 msec per loop

$ python3 -m timeit -s 'from hashperf import magic, no_magic; s = {magic() for _ in range(10**5)}' 'for x in s: x in s'
10 loops, best of 3: 52.8 msec per loop

"magic" wins this makeshift test. Other than that you're right ;)
Re: Planning a Python Course for Beginners
On Fri, Aug 11, 2017 at 6:03 AM, Marko Rauhamaa wrote:
> Peter Otten <__pete...@web.de>:
>
>> Steve D'Aprano wrote:
>>> The C code says:
>>>
>>>> /* bottom 3 or 4 bits are likely to be 0; rotate y by 4 to avoid
>>>>    excessive hash collisions for dicts and sets */
>>>
>>> which I think agrees with my comment: using the id() itself would put
>>> too many objects in the same bucket (i.e. too many collisions).
>>
>> There's a subtle difference: you expected objects with nearby memory
>> addresses to end up in the same "bucket" while actually all addresses
>> (are likely to) have the same low bits, and creation time does not
>> matter.
>
> I see no point in CPython's rotation magic.

Have you ever implemented a hashtable? The most common way to pick a bucket for an object is to use modulo on the number of buckets.

ChrisA
Re: Planning a Python Course for Beginners
Peter Otten <__pete...@web.de>:

> Steve D'Aprano wrote:
>> The C code says:
>>
>>> /* bottom 3 or 4 bits are likely to be 0; rotate y by 4 to avoid
>>>    excessive hash collisions for dicts and sets */
>>
>> which I think agrees with my comment: using the id() itself would put
>> too many objects in the same bucket (i.e. too many collisions).
>
> There's a subtle difference: you expected objects with nearby memory
> addresses to end up in the same "bucket" while actually all addresses
> (are likely to) have the same low bits, and creation time does not
> matter.

I see no point in CPython's rotation magic.

Marko
Re: Planning a Python Course for Beginners
Steve D'Aprano wrote:

> The C code says:
>
>> /* bottom 3 or 4 bits are likely to be 0; rotate y by 4 to avoid
>>    excessive hash collisions for dicts and sets */
>
> which I think agrees with my comment: using the id() itself would put too
> many objects in the same bucket (i.e. too many collisions).

There's a subtle difference: you expected objects with nearby memory addresses to end up in the same "bucket" while actually all addresses (are likely to) have the same low bits, and creation time does not matter.
Re: Planning a Python Course for Beginners
Python :

> Marko Rauhamaa wrote:
> I didn't disagree with any of these statements about __hash__, but only
> your statement about id and __eq__:
>
>> id() is actually an ideal return value of __hash__(). The only
>> criterion is that the returned number should be different if the
>> __eq__() is False. That is definitely true for id()
>
> nan is a clear, simple, undeniable counterexample to that claim.

Still, I don't see the point you are trying to make.

Marko
Re: Planning a Python Course for Beginners
On Fri, Aug 11, 2017 at 2:41 AM, Steve D'Aprano wrote:
> On Fri, 11 Aug 2017 12:58 am, Chris Angelico wrote:
>
>> On Fri, Aug 11, 2017 at 12:45 AM, Steve D'Aprano wrote:
>>
>>> The C code says:
>>>
>>>> /* bottom 3 or 4 bits are likely to be 0; rotate y by 4 to avoid
>>>>    excessive hash collisions for dicts and sets */
>>>
>>> which I think agrees with my comment: using the id() itself would put too
>>> many objects in the same bucket (i.e. too many collisions).
>>>
>>>> If that were the problem it wouldn't be solved by the current approach:
>>>>
>>>> >>> sample = [object() for _ in range(10)]
>>>> >>> [hash(b) - hash(a) for a, b in zip(sample, sample[1:])]
>>>> [1, 1, 1, 1, 1, 1, 1, 1, 1]
>>
>> A difference of 1 in a hash is usually going to mean dropping
>> something into the next bucket. A difference of 4, 8, or 16 would mean
>> that a tiny dictionary (which has 8 slots and thus uses modulo-8)
>> would have everything on the same slot.
>
> Um... yes? And how does that relate to the comment given in the source code?
>
> "bottom 3 or 4 bits are likely to be 0; rotate y by 4 to avoid excessive hash
> collisions for dicts and sets"
>
> Are we in agreement so far?

Yes, we're in agreement. It may have been unclear from my quoting style, but the main point I was disagreeing with was this:

> If that were the problem it wouldn't be solved by the current approach:
>
> >>> sample = [object() for _ in range(10)]
> >>> [hash(b) - hash(a) for a, b in zip(sample, sample[1:])]
> [1, 1, 1, 1, 1, 1, 1, 1, 1]

Incrementing hashes by 1 usually will put things into successive buckets. Incrementing by 8 or 16 will usually put things into the same bucket.

Sorry for the confusion.

ChrisA
Re: Planning a Python Course for Beginners
On Fri, 11 Aug 2017 12:58 am, Chris Angelico wrote:

> On Fri, Aug 11, 2017 at 12:45 AM, Steve D'Aprano wrote:
>
>> The C code says:
>>
>>> /* bottom 3 or 4 bits are likely to be 0; rotate y by 4 to avoid
>>>    excessive hash collisions for dicts and sets */
>>
>> which I think agrees with my comment: using the id() itself would put too
>> many objects in the same bucket (i.e. too many collisions).
>>
>>> If that were the problem it wouldn't be solved by the current approach:
>>>
>>> >>> sample = [object() for _ in range(10)]
>>> >>> [hash(b) - hash(a) for a, b in zip(sample, sample[1:])]
>>> [1, 1, 1, 1, 1, 1, 1, 1, 1]
>
> A difference of 1 in a hash is usually going to mean dropping
> something into the next bucket. A difference of 4, 8, or 16 would mean
> that a tiny dictionary (which has 8 slots and thus uses modulo-8)
> would have everything on the same slot.

Um... yes? And how does that relate to the comment given in the source code?

"bottom 3 or 4 bits are likely to be 0; rotate y by 4 to avoid excessive hash collisions for dicts and sets"

According to the comment, IDs of objects are typically:

    0b(bunch of bits)1000
    0b(bunch of bits)0000

i.e. they're typically multiples of 8 or 16. Right? So modulo 8, they'll all map to the zeroth bucket; modulo 16, they'll all map to the zeroth or eighth bucket. Collisions, just like I said. (Was I wrong?)

By stripping off the lowest four bits, you get:

    0b(bunch of bits)

which will hopefully be well-mixed modulo 8 or 16.

Are we in agreement so far?

>> So my money is on object() being anomalous: because it is so small, the
>> hashes end up so similar. For "typical" classes, the hash function does a
>> much better job of mixing the hash values up.
>
> And this is also possible, but most likely the difference would simply
> widen; small dictionaries would still have all objects landing in the
> zeroth bucket. Hence rotating away the low bits.

I can't tell if you're disagreeing with me or not.

I said: using the object ID itself would be a terrible hash, because there would be lots of collisions. Lo and behold the source code for the default hash says (paraphrased), "don't use the ID itself, that will have lots of collisions" and processes the ID by doing a >> 4 to strip off the least significant four bits.

So I'm genuinely puzzled on where (if anywhere) our point of disagreement is.

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse.
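The bucket arithmetic described in the message above can be checked with a small sketch. The addresses here are made up (a hypothetical 16-byte-aligned base with objects 16 bytes apart), and the table is simplified to a plain modulo over 8 slots, so this is an illustration of the argument rather than a measurement of CPython.

```python
# Hypothetical addresses: eight objects allocated 16 bytes apart.
BASE = 0x7F1F05807000  # made-up, 16-byte-aligned base address
addresses = [BASE + 16 * i for i in range(8)]

TABLE_SIZE = 8  # slots in a small hash table

# Using the raw address as the hash: the low four bits are always zero,
# so every object lands in slot 0.
raw_slots = [addr % TABLE_SIZE for addr in addresses]

# Dropping the low four bits first spreads the same objects over all slots.
shifted_slots = [(addr >> 4) % TABLE_SIZE for addr in addresses]

print(raw_slots)      # [0, 0, 0, 0, 0, 0, 0, 0]
print(shifted_slots)  # [0, 1, 2, 3, 4, 5, 6, 7]
```

With consecutive allocations the shifted values differ by exactly 1, which matches the [1, 1, 1, ...] differences seen for object() instances elsewhere in the thread.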
Re: Planning a Python Course for Beginners
Marko Rauhamaa wrote:
> Python :
>
>> Marko Rauhamaa wrote:
>>> Python :
>>>
>>>> Marko Rauhamaa wrote:
>>>>> id() is actually an ideal return value of __hash__(). The only
>>>>> criterion is that the returned number should be different if the
>>>>> __eq__() is False. That is definitely true for id().
>>>>
>>>> $ python
>>>> Python 2.7.13 (default, Jan 19 2017, 14:48:08)
>>>> [GCC 6.3.0 20170118] on linux2
>>>> Type "help", "copyright", "credits" or "license" for more information.
>>>> >>> nan = float('NaN')
>>>> >>> id(nan) == id(nan)
>>>> True
>>>> >>> nan == nan
>>>> False
>>>
>>> Point being?
>>
>> It is a counter example to your claim that if __eq__(...) is false
>> then id should return different values.
>
> No it's not:
>
>  * __hash__() *should* return different values. It is neither possible
>    nor necessary in the general case.
>
>  * For NaN, there's no better alternative.
>
>  * Dictionaries and sets try "is" before __eq__(...) so everything
>    works anyway.
>
> So, to be precise, the __hash__() rule is:
>
>    a.__hash__() *should* return a different number than b.__hash__() if
>    a is not b and not a.__eq__(b)
>
>    a.__hash__() *must* return the same number as b.__hash__() if
>    a is b or a.__eq__(b)

I didn't disagree with any of these statements about __hash__, but only your statement about id and __eq__:

> id() is actually an ideal return value of __hash__(). The only
> criterion is that the returned number should be different if the
> __eq__() is False. That is definitely true for id()

nan is a clear, simple, undeniable counterexample to that claim.

the hash function for floats is quite interesting btw, you may want to look what is its value for nan.
Re: Planning a Python Course for Beginners
Python :

> Marko Rauhamaa wrote:
>> Python :
>>
>>> Marko Rauhamaa wrote:
>>>> id() is actually an ideal return value of __hash__(). The only
>>>> criterion is that the returned number should be different if the
>>>> __eq__() is False. That is definitely true for id().
>>>
>>> $ python
>>> Python 2.7.13 (default, Jan 19 2017, 14:48:08)
>>> [GCC 6.3.0 20170118] on linux2
>>> Type "help", "copyright", "credits" or "license" for more information.
>>> >>> nan = float('NaN')
>>> >>> id(nan) == id(nan)
>>> True
>>> >>> nan == nan
>>> False
>>
>> Point being?
>
> It is a counter example to your claim that if __eq__(...) is false
> then id should return different values.

No it's not:

 * __hash__() *should* return different values. It is neither possible
   nor necessary in the general case.

 * For NaN, there's no better alternative.

 * Dictionaries and sets try "is" before __eq__(...) so everything
   works anyway.

So, to be precise, the __hash__() rule is:

   a.__hash__() *should* return a different number than b.__hash__() if
   a is not b and not a.__eq__(b)

   a.__hash__() *must* return the same number as b.__hash__() if
   a is b or a.__eq__(b)

Marko
Re: Planning a Python Course for Beginners
On Fri, Aug 11, 2017 at 12:45 AM, Steve D'Aprano wrote:
> The C code says:
>
>> /* bottom 3 or 4 bits are likely to be 0; rotate y by 4 to avoid
>>    excessive hash collisions for dicts and sets */
>
> which I think agrees with my comment: using the id() itself would put too many
> objects in the same bucket (i.e. too many collisions).
>
>> If that were the problem it wouldn't be solved by the current approach:
>>
>> >>> sample = [object() for _ in range(10)]
>> >>> [hash(b) - hash(a) for a, b in zip(sample, sample[1:])]
>> [1, 1, 1, 1, 1, 1, 1, 1, 1]

A difference of 1 in a hash is usually going to mean dropping something into the next bucket. A difference of 4, 8, or 16 would mean that a tiny dictionary (which has 8 slots and thus uses modulo-8) would have everything on the same slot.

> So my money is on object() being anomalous: because it is so small, the hashes
> end up so similar. For "typical" classes, the hash function does a much better
> job of mixing the hash values up.

And this is also possible, but most likely the difference would simply widen; small dictionaries would still have all objects landing in the zeroth bucket. Hence rotating away the low bits.

ChrisA
Re: Planning a Python Course for Beginners
On Thu, 10 Aug 2017 07:00 pm, Peter Otten wrote:

> Steven D'Aprano wrote:
>
>> On Wed, 09 Aug 2017 20:07:48 +0300, Marko Rauhamaa wrote:
>>
>>> Good point! A very good __hash__() implementation is:
>>>
>>>     def __hash__(self):
>>>         return id(self)
>>>
>>> In fact, I didn't know Python (kinda) did this by default already. I
>>> can't find that information in the definition of object.__hash__():
>>
>> Hmmm... using id() as the hash would be a terrible hash function. Objects
>
> It's actually id(self) >> 4 (almost, see C code below), to account for
> memory alignment.

Thanks for tracking that down. As you show, the default hash isn't id() itself.

The C code says:

> /* bottom 3 or 4 bits are likely to be 0; rotate y by 4 to avoid
>    excessive hash collisions for dicts and sets */

which I think agrees with my comment: using the id() itself would put too many objects in the same bucket (i.e. too many collisions).

> >>> obj = object()
> >>> hex(id(obj))
> '0x7f1f058070b0'
> >>> hex(hash(obj))
> '0x7f1f058070b'
> >>> sample = (object() for _ in range(10))
> >>> all(id(obj) >> 4 == hash(obj) for obj in sample)
> True
>
>> would fall into similar buckets if they were created at similar times,
>> regardless of their value, rather than being well distributed.
>
> If that were the problem it wouldn't be solved by the current approach:
>
> >>> sample = [object() for _ in range(10)]
> >>> [hash(b) - hash(a) for a, b in zip(sample, sample[1:])]
> [1, 1, 1, 1, 1, 1, 1, 1, 1]

Arguably that's a flaw with the current approach that (maybe?) makes object()'s hashes cluster too closely. But:

- perhaps it doesn't matter in practice, since the hash is taken modulo the size of the hash table;

- or maybe Python's dicts and sets are good enough that a difference of 1 is sufficient to give a good distribution of objects in the hash table;

- or maybe it does matter, but since people hardly ever use object() itself as the keys in dicts, it doesn't come up.

Here's your example with a class that inherits from object, rather than object itself:

py> class X(object):
...     pass
...
py> sample = [X() for x in range(10)]
py> [hash(b) - hash(a) for a, b in zip(sample, sample[1:])]
[-5338, -10910, -2976, -2284, -21326, 4, -8, 2, -4]

So my money is on object() being anomalous: because it is so small, the hashes end up so similar. For "typical" classes, the hash function does a much better job of mixing the hash values up.

-- 
Steve
Re: Planning a Python Course for Beginners
Marko Rauhamaa wrote:
> Python :
>
>> Marko Rauhamaa wrote:
>>> id() is actually an ideal return value of __hash__(). The only
>>> criterion is that the returned number should be different if the
>>> __eq__() is False. That is definitely true for id().
>>
>> $ python
>> Python 2.7.13 (default, Jan 19 2017, 14:48:08)
>> [GCC 6.3.0 20170118] on linux2
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> nan = float('NaN')
>> >>> id(nan) == id(nan)
>> True
>> >>> nan == nan
>> False
>
> Point being?

It is a counter example to your claim that if __eq__(...) is false then id should return different values.
Re: Planning a Python Course for Beginners
Python :

> Marko Rauhamaa wrote:
>> id() is actually an ideal return value of __hash__(). The only criterion
>> is that the returned number should be different if the __eq__() is
>> False. That is definitely true for id().
>
> $ python
> Python 2.7.13 (default, Jan 19 2017, 14:48:08)
> [GCC 6.3.0 20170118] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> nan = float('NaN')
> >>> id(nan) == id(nan)
> True
> >>> nan == nan
> False

Point being?

Marko
Re: Planning a Python Course for Beginners
Marko Rauhamaa wrote:
> id() is actually an ideal return value of __hash__(). The only
> criterion is that the returned number should be different if the
> __eq__() is False. That is definitely true for id().

$ python
Python 2.7.13 (default, Jan 19 2017, 14:48:08)
[GCC 6.3.0 20170118] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> nan = float('NaN')
>>> id(nan) == id(nan)
True
>>> nan == nan
False
>>>
Re: Planning a Python Course for Beginners
Peter Otten <__pete...@web.de>:

> Steven D'Aprano wrote:
>> On Wed, 09 Aug 2017 20:07:48 +0300, Marko Rauhamaa wrote:
>>
>>> Good point! A very good __hash__() implementation is:
>>>
>>>     def __hash__(self):
>>>         return id(self)
>>>
>>> In fact, I didn't know Python (kinda) did this by default already. I
>>> can't find that information in the definition of object.__hash__():
>>
>> Hmmm... using id() as the hash would be a terrible hash function.

id() is actually an ideal return value of __hash__(). The only criterion is that the returned number should be different if the __eq__() is False. That is definitely true for id().

> It's actually id(self) >> 4 (almost, see C code below), to account for
> memory alignment.

Memory alignment makes no practical difference. If it is any good, the internal implementation will further scramble and scale the returned hash value. For example:

    index = hash(obj) % prime_table_size

>> would fall into similar buckets if they were created at similar
>> times, regardless of their value, rather than being well distributed.
>
> If that were the problem it wouldn't be solved by the current approach:

It is not a problem. Hash values don't need to be well distributed, they simply need to be discerning to tiny differences in equality.

> >>> sample = [object() for _ in range(10)]
> >>> [hash(b) - hash(a) for a, b in zip(sample, sample[1:])]
> [1, 1, 1, 1, 1, 1, 1, 1, 1]

Nice demo :-)

Marko
Re: Planning a Python Course for Beginners
Steven D'Aprano wrote:

> On Wed, 09 Aug 2017 20:07:48 +0300, Marko Rauhamaa wrote:
>
>> Good point! A very good __hash__() implementation is:
>>
>>     def __hash__(self):
>>         return id(self)
>>
>> In fact, I didn't know Python (kinda) did this by default already. I
>> can't find that information in the definition of object.__hash__():
>
> Hmmm... using id() as the hash would be a terrible hash function. Objects

It's actually id(self) >> 4 (almost, see C code below), to account for memory alignment.

>>> obj = object()
>>> hex(id(obj))
'0x7f1f058070b0'
>>> hex(hash(obj))
'0x7f1f058070b'
>>> sample = (object() for _ in range(10))
>>> all(id(obj) >> 4 == hash(obj) for obj in sample)
True

> would fall into similar buckets if they were created at similar times,
> regardless of their value, rather than being well distributed.

If that were the problem it wouldn't be solved by the current approach:

>>> sample = [object() for _ in range(10)]
>>> [hash(b) - hash(a) for a, b in zip(sample, sample[1:])]
[1, 1, 1, 1, 1, 1, 1, 1, 1]

Py_hash_t
_Py_HashPointer(void *p)
{
    Py_hash_t x;
    size_t y = (size_t)p;
    /* bottom 3 or 4 bits are likely to be 0; rotate y by 4 to avoid
       excessive hash collisions for dicts and sets */
    y = (y >> 4) | (y << (8 * SIZEOF_VOID_P - 4));
    x = (Py_hash_t)y;
    if (x == -1)
        x = -2;
    return x;
}
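For anyone who wants to experiment with the rotation without reading C, here is a pure-Python imitation of the function quoted above. It is a sketch: hash_pointer and pointer_bytes are illustrative names, and an 8-byte pointer (SIZEOF_VOID_P == 8) is assumed by default.

```python
def hash_pointer(address, pointer_bytes=8):
    """Mimic the rotation in _Py_HashPointer for an integer address."""
    bits = 8 * pointer_bytes
    mask = (1 << bits) - 1
    y = address & mask
    # Rotate right by 4: the low four bits wrap around to the top.
    y = ((y >> 4) | (y << (bits - 4))) & mask
    # Reinterpret the unsigned result as a signed Py_hash_t.
    x = y - (1 << bits) if y >= 1 << (bits - 1) else y
    # -1 is reserved as an error indicator, so it is remapped to -2.
    return -2 if x == -1 else x


print(hex(hash_pointer(0x7F1F058070B0)))  # 0x7f1f058070b
print(hash_pointer(139932454939896))      # -9223363291076342065
```

The first call reproduces the id/hash pair from the session above. The second uses one of the X() ids from Steven's earlier session in this thread: its low four bits are 1000, so the rotation moves a set bit into the sign position, which would explain the large negative hashes that made it look as if id() was not involved.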
Re: Planning a Python Course for Beginners
On Wed, 09 Aug 2017 20:07:48 +0300, Marko Rauhamaa wrote:

> Good point! A very good __hash__() implementation is:
>
>     def __hash__(self):
>         return id(self)
>
> In fact, I didn't know Python (kinda) did this by default already. I
> can't find that information in the definition of object.__hash__():

Hmmm... using id() as the hash would be a terrible hash function. Objects would fall into similar buckets if they were created at similar times, regardless of their value, rather than being well distributed.

But let's see whether or not objects actually do so, as you claim:

>>> a, b, c, d = "abc", "def", "ghi", "jki"
>>> [id(obj) for obj in (a,b,c,d)]
[139932454814752, 139932454814808, 139932454814920, 139932454913616]
>>> [hash(obj) for obj in (a,b,c,d)]
[7231609897320296628, -876470178105133015, -5049894847448874792, 5697571649565117128]

Wait, maybe you're referring to hash() of object(), inherited by classes that don't define their own __hash__. Let's check it out:

>>> a, b, c, d = [object() for i in range(4)]
>>> [id(obj) for obj in (a,b,c,d)]
[139932455747696, 139932455747712, 139932455747728, 139932455747744]
>>> [hash(obj) for obj in (a,b,c,d)]
[8745778484231, 8745778484232, 8745778484233, 8745778484234]

Maybe object does something different for itself than for subclasses?

>>> class X(object):
...     pass
...
>>> a, b, c, d = [X() for i in range(4)]
>>> [id(obj) for obj in (a,b,c,d)]
[139932454939952, 139932454939896, 139932454940008, 139932454940064]
>>> [hash(obj) for obj in (a,b,c,d)]
[8745778433747, -9223363291076342065, -9223363291076342058, 8745778433754]

I see zero evidence that Python uses id() as the default hash. Not even for classic classes in Python 2.

-- 
“You are deluded if you think software engineers who can't write operating systems or applications without security holes, can write virtualization layers without security holes.” (Theo de Raadt)
Re: Planning a Python Course for Beginners
Chris Angelico :

> On Wed, Aug 9, 2017 at 11:46 PM, Marko Rauhamaa wrote:
>> Really, the most obvious use case for hashed objects is their membership
>> in a set. For example:
>>
>>     invitees = set(self.bff)
>>     invitees |= self.classmates()
>>     invitees |= self.relatives()
>
> Okay. So you should define value by object identity - NOT any sort of
> external primary key.

Good point! A very good __hash__() implementation is:

    def __hash__(self):
        return id(self)

In fact, I didn't know Python (kinda) did this by default already. I can't find that information in the definition of object.__hash__():

   https://docs.python.org/3/reference/datamodel.html?#object.__hash__

I only found it out by trying it.

> That goes completely against your original statement, which I shall
> quote again:
>
>> In relational-database terms, your "value" is the primary key and
>> your "metadata" is the rest of the columns.
>
> If there is any possibility that you could have two objects in memory
> with the same primary key but other attributes different, you'd have
> major MAJOR problems with this kind of set operation.

In light of the above realization, don't override __hash__() in any way in your class, and your object works perfectly as a key or a set member. A __hash__() definition is only needed when your __eq__() definition is different from "is".

As for running into "major MAJOR" problems, yes, you need to know what you're doing and face the consequences. It's a bit analogous to sort() depending on the definitions of the "rich" comparison.

Marko
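A short sketch may help with the last point in the message above, that a custom __hash__() is only needed once __eq__() stops meaning "is". The classes Guest and Point are hypothetical examples, not taken from the thread.

```python
class Guest:
    """Default behaviour: equality and hash are both based on identity."""

    def __init__(self, name):
        self.name = name


class Point:
    """Value-based equality, so __hash__ must agree with __eq__."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Equal points must hash alike, so hash the same fields __eq__ compares.
        return hash((self.x, self.y))


alice = Guest("Alice")
invitees = {alice, Guest("Alice")}
print(len(invitees))                    # 2: distinct objects, distinct members
print(alice in invitees)                # True
print(len({Point(1, 2), Point(1, 2)}))  # 1: equal values collapse to one member
```

Note that in Python 3 a class that defines __eq__() without __hash__() becomes unhashable, which is one more reason to define the two together.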
Re: Planning a Python Course for Beginners
On Wed, 9 Aug 2017 11:46 pm, Marko Rauhamaa wrote:

> Typically, an object's equality is simply the "is" relation.

"Typically"? I don't think so. Are you sure you've programmed in Python before? *wink*

py> [1, 2] is [1, 2]
False

The most commonly used objects don't define equality as identity, e.g. strings, lists, tuples, ints, bytes, dicts etc don't. It would mean that two objects with the same value would nevertheless compare as unequal.

In general, caring about `is` (identity) is a failure of abstraction. Why should we care about object identity? Take the value 42 -- why should anyone care whether that is represented in computer memory by a single object or by a billion separate objects? You might care about memory constraints, but that's a leaky abstraction. Ideally, where memory is not a constraint, if you care about identity, you are probably doing it wrong.

I'll allow, in principle, that caring about the identity of stateless, valueless objects that are defined only by their identity such as None and NotImplemented may be acceptable. But the Singleton design pattern, as beloved by Java programmers, puts the emphasis on the wrong place: identity, instead of state. Why not have one or a million objects, so long as they have the same state? Hence the Borg design pattern.

There are, in my opinion, very few legitimate uses for `is` and identity checking, and nearly all of them are either:

- testing for None; or
- debugging implementation details.

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse.
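To make the value/identity distinction concrete, here is a minimal sketch. The function `describe` is a hypothetical example of the one use of `is` that the post endorses, namely testing for None.

```python
a = [1, 2]
b = [1, 2]
alias = a

print(a == b)      # same value
print(a is b)      # different objects
print(alias is a)  # same object under two names


def describe(value=None):
    # Identity check against None is the idiomatic use of `is`.
    if value is None:
        return "nothing given"
    return f"got {value!r}"


print(describe())
print(describe(0))   # 0 is falsy but is not None
```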
Re: Planning a Python Course for Beginners
On Wed, Aug 9, 2017 at 11:46 PM, Marko Rauhamaa wrote: > Chris Angelico : > >> On Wed, Aug 9, 2017 at 10:00 PM, Marko Rauhamaa wrote: >>> Chris Angelico : >>> Which means that its value won't change. That's what I said. Two things will be equal regardless of that metadata. >>> >>> In relational-database terms, your "value" is the primary key and >>> your "metadata" is the rest of the columns. >> >> I would say the primary key is the "identity" and the rest of the >> columns are the "value". > > Your response illustrates why you and I are not yet on the same page on > this. > > Typically, an object's equality is simply the "is" relation. The only > thing remaining for its usability as a key is a hash method. In fact, > just defining: > >def __hash__(self): >return 0 > > will technically make any class applicable as a key or set member. Yes, at the cost of making all your set operations into linear searches. Basically, if all your objects have the same hashes, you might as well use lists instead of sets, except that lists don't have the methods/operators you want. > The interesting fields of the object (which you disparagingly referred > to as "metadata") don't need to participate in the calculation of the > hash. You ought to pick the maximal collection of immutable fields as a > basis of your hash, and you are all set (no pun intended). The rules are (1) two objects that compare equal (__eq__) MUST have the same hash; and (2) an object's hash must never change. Also, for efficiency's sake, objects that compare unequal should ideally have different hashes. That's why I refer to it as metadata; it's not allowed to be part of the object's value, because two objects MUST compare equal even if those other fields change. Can you give me a real-world example of where two objects are equal but have important attributes that differ? 
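The two rules above can be sketched as follows. `Point` and `ConstantHash` are hypothetical classes: the first derives both __eq__ and __hash__ from the same fixed fields and leaves the mutable label out, the second is the "return 0" variant, which stays correct but gives up the spread of hash values that makes set lookups fast.

```python
class Point:
    """Equality and hash both derive from the same immutable fields."""
    def __init__(self, x, y, label=""):
        self._key = (x, y)      # treated as fixed after construction
        self.label = label      # mutable payload, ignored by __eq__/__hash__

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self._key == other._key

    def __hash__(self):
        return hash(self._key)


class ConstantHash(Point):
    """Legal but degenerate: every instance collides with every other."""
    def __hash__(self):
        return 0


p, q = Point(1, 2, "first"), Point(1, 2, "second")
print(p == q, hash(p) == hash(q))    # rule 1: equal objects share a hash

before = hash(p)
p.label = "renamed"
print(hash(p) == before)             # rule 2: the hash does not move

points = {ConstantHash(i, i) for i in range(100)}
print(len(points), ConstantHash(5, 5) in points)
```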
>> But if you're defining "value" solely by the PK, then you have to ask >> yourself what you're using this in a dictionary for - are you going to >> construct multiple objects representing the same underlying database >> row, and expect them to compare equal? > > Let's leave the relational world and return to objects. > > Really, the most obvious use case for hashed objects is their membership > in a set. For example: > > invitees = set(self.bff) > invitees |= self.classmates() > invitees |= self.relatives() Okay. So you should define value by object identity - NOT any sort of external primary key. That goes completely against your original statement, which I shall quote again: >>> In relational-database terms, your "value" is the primary key and >>> your "metadata" is the rest of the columns. If there is any possibility that you could have two objects in memory with the same primary key but other attributes different, you'd have major MAJOR problems with this kind of set operation. The best solution would be to use the IDs themselves (as integers) in the set, and ignore the whole question of identity and value. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Planning a Python Course for Beginners
Chris Angelico : > On Wed, Aug 9, 2017 at 10:00 PM, Marko Rauhamaa wrote: >> Chris Angelico : >> >>> Which means that its value won't change. That's what I said. Two >>> things will be equal regardless of that metadata. >> >> In relational-database terms, your "value" is the primary key and >> your "metadata" is the rest of the columns. > > I would say the primary key is the "identity" and the rest of the > columns are the "value". Your response illustrates why you and I are not yet on the same page on this. Typically, an object's equality is simply the "is" relation. The only thing remaining for its usability as a key is a hash method. In fact, just defining: def __hash__(self): return 0 will technically make any class applicable as a key or set member. The interesting fields of the object (which you disparagingly referred to as "metadata") don't need to participate in the calculation of the hash. You ought to pick the maximal collection of immutable fields as a basis of your hash, and you are all set (no pun intended). > But if you're defining "value" solely by the PK, then you have to ask > yourself what you're using this in a dictionary for - are you going to > construct multiple objects representing the same underlying database > row, and expect them to compare equal? Let's leave the relational world and return to objects. Really, the most obvious use case for hashed objects is their membership in a set. For example: invitees = set(self.bff) invitees |= self.classmates() invitees |= self.relatives() And Python doesn't enforce this in any way except for lists. That's somewhat unfortunate since sometimes you really would like an immutable (or rather, no-longer-mutable) list to act as a key. >>> >>> Then make a tuple out of it. Job done. You're trying to say that its >>> value won't now change. >> >> Yeah, when there's a will, there's a way. > > I don't understand your comment. Do you mean that if someone wants to > change it, s/he will? No. 
I mean coercing lists to tuples can be quite a hefty operation. On the other hand, so could hashing a list unless the value is memoized. More importantly, tuple(collection) goes against the grain of what a tuple is. Tuples are not collections. In particular, tuples in a given role ordinarily have the same arity. Marko -- https://mail.python.org/mailman/listinfo/python-list
Re: Planning a Python Course for Beginners
On 8/9/2017 9:25 AM, Marko Rauhamaa wrote:

> r...@zedat.fu-berlin.de (Stefan Ram):
>
>> Steve D'Aprano writes:
>>> There's a word for frozen list: "tuple".
>> Yes, but one should not forget that a tuple
>> can contain mutable entries (such as lists).
> Not when used as keys:
>
>     >>> hash(([], []))
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in <module>
>     TypeError: unhashable type: 'list'
>
> Marko

Hence the word 'can' and not 'will' or 'must' or 'shall' ...
Re: Planning a Python Course for Beginners
r...@zedat.fu-berlin.de (Stefan Ram):

> Steve D'Aprano writes:
>>There's a word for frozen list: "tuple".
>
> Yes, but one should not forget that a tuple
> can contain mutable entries (such as lists).

Not when used as keys:

    >>> hash(([], []))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unhashable type: 'list'

Marko
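Both observations hold at once, as this small sketch shows: a tuple may contain lists, and such a tuple is simply not hashable. The helper `is_hashable` is a hypothetical convenience, not a standard-library function.

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


print(is_hashable((1, "a", (2, 3))))   # all elements hashable
print(is_hashable(([], [])))           # contains lists

pair = ([], [])
pair[0].append(1)                      # the tuple is fixed, its contents are not
print(pair)
```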
Re: Planning a Python Course for Beginners
On Wed, Aug 9, 2017 at 10:00 PM, Marko Rauhamaa wrote: > Chris Angelico : > >> Which means that its value won't change. That's what I said. Two >> things will be equal regardless of that metadata. > > In relational-database terms, your "value" is the primary key and your > "metadata" is the rest of the columns. I would say the primary key is the "identity" and the rest of the columns are the "value". But if you're defining "value" solely by the PK, then you have to ask yourself what you're using this in a dictionary for - are you going to construct multiple objects representing the same underlying database row, and expect them to compare equal? And if they're equal without being identical, how do you know which one of them actually corresponds to the database? Down this path lies a form of madness that I want nothing to do with. >>> And Python doesn't enforce this in any way except for lists. That's >>> somewhat unfortunate since sometimes you really would like an >>> immutable (or rather, no-longer-mutable) list to act as a key. >> >> Then make a tuple out of it. Job done. You're trying to say that its >> value won't now change. > > Yeah, when there's a will, there's a way. I don't understand your comment. Do you mean that if someone wants to change it, s/he will? Because that's not really the point. If you're declaring that a list can now be safely compared by value, you don't want it to be mutable in any way. That's what a tuple is for. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Planning a Python Course for Beginners
On Wed, 9 Aug 2017 08:38 pm, Marko Rauhamaa wrote:

> sometimes you really would like an immutable
> (or rather, no-longer-mutable) list to act as a key.

There's a word for frozen list: "tuple". :-)

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse.
Re: Planning a Python Course for Beginners
On Wed, 9 Aug 2017 02:19 pm, Dennis Lee Bieber wrote: > On Tue, 8 Aug 2017 15:38:42 + (UTC), Grant Edwards > declaimed the following: > >>On 2017-08-08, Peter Heitzer wrote: [...] >>> The differences between blanks and tabs :-) >> >>You've misspelled "Tabs are evil and should never be used". ;) > > > Tabs are logical entities indicating structure and should always be used! > Amen to that brother! -- Steve “Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse. -- https://mail.python.org/mailman/listinfo/python-list
Re: Planning a Python Course for Beginners
On Wed, 9 Aug 2017 07:51 pm, Marko Rauhamaa wrote: > Dennis Lee Bieber : > >> Then there is the facet that tuples (being unmutable) can be used as >> keys into a dictionary... > > Mutable objects can be used as keys into a dictionary. Indeed. And people can also put their hand into a fire in order to pull out a red-hot burning coal. I wouldn't recommend either, at least not under uncontrolled conditions. -- Steve “Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse. -- https://mail.python.org/mailman/listinfo/python-list
Re: Planning a Python Course for Beginners
On Wed, Aug 9, 2017 at 8:00 AM, Marko Rauhamaa wrote:

> Yeah, when there's a will, there's a way.

My Dad used to say "Where there's a will, there's relatives."
Re: Planning a Python Course for Beginners
Chris Angelico : > On Wed, Aug 9, 2017 at 8:38 PM, Marko Rauhamaa wrote: >> Chris Angelico : >> >>> On Wed, Aug 9, 2017 at 7:51 PM, Marko Rauhamaa wrote: Mutable objects can be used as keys into a dictionary. >>> >>> Only when the objects' mutability does not affect their values. >> >> Up to equality. The objects can carry all kinds of mutable payload as >> long as __hash__() and __eq__() don't change with it. > > Which means that its value won't change. That's what I said. Two > things will be equal regardless of that metadata. In relational-database terms, your "value" is the primary key and your "metadata" is the rest of the columns. >> And Python doesn't enforce this in any way except for lists. That's >> somewhat unfortunate since sometimes you really would like an >> immutable (or rather, no-longer-mutable) list to act as a key. > > Then make a tuple out of it. Job done. You're trying to say that its > value won't now change. Yeah, when there's a will, there's a way. Marko -- https://mail.python.org/mailman/listinfo/python-list
Re: Planning a Python Course for Beginners
On Wed, Aug 9, 2017 at 8:38 PM, Marko Rauhamaa wrote:

> Chris Angelico :
>
>> On Wed, Aug 9, 2017 at 7:51 PM, Marko Rauhamaa wrote:
>>> Mutable objects can be used as keys into a dictionary.
>>
>> Only when the objects' mutability does not affect their values.
>
> Up to equality. The objects can carry all kinds of mutable payload as
> long as __hash__() and __eq__() don't change with it.

Which means that its value won't change. That's what I said. Two things will be equal regardless of that metadata.

> And Python doesn't enforce this in any way except for lists. That's
> somewhat unfortunate since sometimes you really would like an immutable
> (or rather, no-longer-mutable) list to act as a key.

Then make a tuple out of it. Job done. You're trying to say that its value won't now change.

ChrisA
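A minimal sketch of "make a tuple out of it", using made-up data. The tuple is a snapshot of the list's value at that moment, so later changes to the list do not disturb the dictionary.

```python
route = ["home", "office", "gym"]       # still being built up

distances = {}
distances[tuple(route)] = 12.5          # snapshot of the current value

route.append("home")                    # later changes do not affect the key
print(tuple(["home", "office", "gym"]) in distances)
print(tuple(route) in distances)

try:
    distances[route] = 15.0             # a list itself is rejected
except TypeError as exc:
    print("unhashable:", exc)
```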
Re: Planning a Python Course for Beginners
On Tue, 08 Aug 2017 14:19:53 +, Stefan Ram wrote: > I am planning a Python course. > > I started by writing the course akin to courses I gave in other > languages, that means, the course starts roughly with these topics: > > - number and string literals - types of number and string literals > (just giving the names »int«, »float«, and »string«) > - using simple predefined operators (+, -, *, /) > (including 2*"a" and "a"+"b") > - calling simple predefined functions (len, type, ...) > > . This is a little bit boring however and might not show off Python's > strength early in the course. > > So, I now think that maybe I should start to also include list (like > > [1,2,3] > > ) right from the start. A list conceptually is not much more difficult > than a string since a string "abc" resembles a list ["a","b","c"]. > I.e., the course then would start as follows: > > - number, string, and list literals - types of number, string and list > literals > (just giving the names »int«, »float«, »string«, and »list«) > - using simple predefined operators (+, -, *, /) > (including 2*"a", "a"+"b", 2*["a"], and [1]+[2]) > - calling simple predefined functions (len, type, ...) > > However, once the box has been opened, what else to let out? What > about tuples (like > > (1,2,3) > > ). Should I also teach tuples right from the start? > > But then how to explain to beginners why two different types (lists > AND tuples) are needed for the concept of a linear arrangement of > things? > > Are there any other very simple things that I have missed and that > should be covered very early in a Python course? > > (Especially things that can show off fantastic Python features that > are missing from other programming languages, but still only using > literals, operators and function calls.) 
If these are beginners with no basic programming knowledge, then try not to confuse them with anything unduly complicated. I would even go so far as to start with pseudocode on a pen-and-paper processor, and introduce the concept of different data types only when they have progressed to the point that they need to know.

-- 
Round Numbers are always false. -- Samuel Johnson
Re: Planning a Python Course for Beginners
Chris Angelico:

> On Wed, Aug 9, 2017 at 7:51 PM, Marko Rauhamaa wrote:
>> Mutable objects can be used as keys into a dictionary.
>
> Only when the objects' mutability does not affect their values.

Up to equality. The objects can carry all kinds of mutable payload as long as __hash__() and __eq__() don't change with it.

And Python doesn't enforce this in any way except for lists. That's somewhat unfortunate since sometimes you really would like an immutable (or rather, no-longer-mutable) list to act as a key.

Marko
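A sketch of what goes wrong when the condition above is violated. `Fragile` is a hypothetical class whose hash depends on a field that is mutated after the object has been used as a key; the entry is still in the dict, but lookups can no longer reach it.

```python
class Fragile:
    """Hash depends on a field that is later mutated (not recommended)."""
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, Fragile) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


key = Fragile(1)
table = {key: "payload"}
print(key in table)         # found while the hash is unchanged

key.value = 2               # the hash changes after insertion
print(key in table)         # the lookup now searches under the new hash
print(Fragile(1) in table)  # old hash matches, but __eq__ no longer does
print(len(table))           # the entry is still stored, just unreachable
```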
Re: Planning a Python Course for Beginners
On Wed, Aug 9, 2017 at 7:51 PM, Marko Rauhamaa wrote:

> Dennis Lee Bieber :
>
>> Then there is the facet that tuples (being unmutable) can be used as
>> keys into a dictionary...
>
> Mutable objects can be used as keys into a dictionary.

Only when the objects' mutability does not affect their values.

ChrisA
Re: Planning a Python Course for Beginners
Dennis Lee Bieber : > Then there is the facet that tuples (being unmutable) can be used as > keys into a dictionary... Mutable objects can be used as keys into a dictionary. Marko -- https://mail.python.org/mailman/listinfo/python-list
Re: Planning a Python Course for Beginners
Dennis Lee Bieber:

> Tabs are logical entities indicating structure and should always be
> used!

I wrote an entire database program using only tabs.

<http://dilbert.com/strip/1992-09-08>

Marko
Re: Planning a Python Course for Beginners
On Aug 8, 2017 10:20 AM, "Stefan Ram" wrote:

> I am planning a Python course.
>
> I started by writing the course akin to courses I gave
> in other languages, that means, the course starts roughly
> with these topics:
>
> - number and string literals
> - types of number and string literals
>   (just giving the names »int«, »float«, and »string«)
> - using simple predefined operators (+, -, *, /)
>   (including 2*"a" and "a"+"b")
> - calling simple predefined functions (len, type, ...)
>
> . This is a little bit boring however and might not
> show off Python's strength early in the course.
>
> So, I now think that maybe I should start to also
> include list (like
>
> [1,2,3]
>
> ) right from the start. A list conceptually is not
> much more difficult than a string since a string
> "abc" resembles a list ["a","b","c"]. I.e., the
> course then would start as follows:
>
> - number, string, and list literals
> - types of number, string and list literals
>   (just giving the names »int«, »float«, »string«,
>   and »list«)
> - using simple predefined operators (+, -, *, /)
>   (including 2*"a", "a"+"b", 2*["a"], and [1]+[2])
> - calling simple predefined functions (len, type, ...)
>
> However, once the box has been opened, what else
> to let out? What about tuples (like
>
> (1,2,3)
>
> ). Should I also teach tuples right from the start?
>
> But then how to explain to beginners why two
> different types (lists AND tuples) are needed for
> the concept of a linear arrangement of things?
>
> Are there any other very simple things that
> I have missed and that should be covered very
> early in a Python course?

IMHO it's a good idea to introduce conversational programming early. Start with input() and print(), then int(), if, while, break. Add one item at a time. This will be more interesting and useful than a bunch of data types and operators, and will answer a lot of questions that otherwise show up on the help and tutor lists.

Also explain tracebacks.

None of the above in great detail; just let students know there is more detail to come later.

> (Especially things that can show off fantastic
> Python features that are missing from other
> programming languages, but still only using
> literals, operators and function calls.)

I think program flow is more important than fantastic or unique features.
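As a rough illustration of what such a "conversational" first program might look like, here is a sketch that uses only input(), print(), int(), if, while and break. The function `guessing_game` and its `ask`/`say` parameters are invented for this example; they default to the real input() and print(), and are replaced by a scripted stand-in below so the sketch runs without a keyboard.

```python
def guessing_game(secret, ask=input, say=print):
    """Ask for guesses until the secret number is found; return the tries."""
    tries = 0
    while True:
        text = ask("Your guess (or q to quit): ")
        if text == "q":
            break
        tries += 1
        guess = int(text)          # raises ValueError on non-numeric input
        if guess < secret:
            say("Too low")
        elif guess > secret:
            say("Too high")
        else:
            say("Got it!")
            break
    return tries


# Scripted run: the lambda stands in for a person typing at the prompt.
answers = iter(["3", "9", "7"])
tries = guessing_game(7, ask=lambda prompt: next(answers))
print("tries:", tries)
```

Calling `guessing_game(7)` with no other arguments plays the game interactively. Typing something like `seven` then produces a ValueError traceback, which is a natural moment to explain how to read one.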
Re: Planning a Python Course for Beginners
On 2017-08-08, Peter Heitzer wrote:

> Stefan Ram wrote:
>> I am planning a Python course.
> [different topics]
>> Are there any other very simple things that
>> I have missed and that should be covered very
>> early in a Python course?
>
> The differences between blanks and tabs :-)

You've misspelled "Tabs are evil and should never be used". ;)

-- 
Grant Edwards               Yow! We just joined the civil hair patrol!
grant.b.edwards at gmail.com
Re: Planning a Python Course for Beginners
On Tue, Aug 8, 2017 at 7:19 AM, Stefan Ram wrote: > I am planning a Python course. > > I started by writing the course akin to courses I gave > in other languages, that means, the course starts roughly > with these topics: > > - number and string literals > - types of number and string literals > (just giving the names »int«, »float«, and »string«) > - using simple predefined operators (+, -, *, /) > (including 2*"a" and "a"+"b") > - calling simple predefined functions (len, type, ...) > > . This is a little bit boring however and might not > show off Python's strength early in the course. > > So, I now think that maybe I should start to also > include list (like > > [1,2,3] > > ) right from the start. A list conceptually is not > much more difficult than a string since a string > "abc" resembles a list ["a","b","c"]. I.e., the > course then would start as follows: > > - number, string, and list literals > - types of number, string and list literals > (just giving the names »int«, »float«, »string«, > and »list«) > - using simple predefined operators (+, -, *, /) > (including 2*"a", "a"+"b", 2*["a"], and [1]+[2]) > - calling simple predefined functions (len, type, ...) > > However, once the box has been opened, what else > to let out? What about tuples (like > > (1,2,3) > > ). Should I also teach tuples right from the start? > > But then how to explain to beginners why two > different types (lists AND tuples) are needed for > the concept of a linear arrangement of things? > > Are there any other very simple things that > I have missed and that should be covered very > early in a Python course? > > (Especially things that can show off fantastic > Python features that are missing from other > programming languages, but still only using > literals, operators and function calls.) > > -- > https://mail.python.org/mailman/listinfo/python-list > One thing I find that gets overlooked in a lot of beginner Python courses is Python's module system. 
Covering the module system will allow you to hit on the subject of namespacing and scope. Python's module system is one of its greatest strengths, in my opinion. No other language I've used makes it so simple to structure a project.

Another idea: dictionaries.

Dictionaries are a conceptually simple data structure that should be easy for beginners to grok. Obviously, their implementation is a bit complex, but I don't think you would need to get into that. Dictionaries are very powerful data structures that can be used to keep code DRY and to store important data that needs to be accessed from anywhere in an application or script. Lookups by key also take O(1) time on average, which makes them fairly efficient.
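A beginner-level sketch of the dictionary idea, with made-up data. The second half shows one way a dict can keep code DRY: a lookup table of operations in place of a repeated if/elif chain. The names `inventory`, `operations` and `calculate` are purely illustrative.

```python
inventory = {"apples": 3, "pears": 0}

inventory["plums"] = 12                 # add a key
inventory["apples"] += 1                # update a value
print(inventory["apples"])
print(inventory.get("kiwis", 0))        # default instead of KeyError

# A dict used as a dispatch table replaces a chain of if/elif tests.
operations = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def calculate(a, symbol, b):
    """Look up the operation by its symbol and apply it."""
    return operations[symbol](a, b)


print(calculate(6, "*", 7))
```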
Re: Planning a Python Course for Beginners
On Wed, Aug 9, 2017 at 1:02 AM, Stefan Ram wrote:

> Chris Angelico writes:
>>Why a new Python course?
>
> It is not a course in the sense of a written text
> (which I would call "course notes").
>
> It is a course in the sense of an event, where I will meet
> participants in a classroom. I will get paid for it, so this
> payment is the reason I do it (simplified ;-).

Ah. Well, that's a good answer to the question of "why are you even bothering to write this", but unfortunately doesn't answer the questions that I hoped it would, about target audience and such. Heh. C'est la vie.

> Since this is my first-ever Python course, but has not yet
> begun, I do not know the participants, but I can say this:
>
> - the course description requires that the participant
>   have experiences "working with a computer", but not that
>   they have any knowledge about programming.
>
> - the participants in my other courses for other
>   programming languages usually are slow learners, so I
>   prepare for this kind of audience. I try to avoid topics
>   that are abstract, advanced or complicated as much as
>   possible. I try to include very simple exercises.
>   Most books and tutorials assume faster learners,
>   so that's another reason why I don't use them.

So, this here is the important info. In that case, I would start with wowing them with the amazing stuff Python can do. Start with a few demonstrations of the simplicity and beauty of expression evaluation. It's okay to use concepts you haven't yet explained; instead of starting with concrete info and building up to something interesting, start with something interesting and then explain how it all works.

Just my opinion, of course, but if you didn't want totally unbacked opinions, you wouldn't have come to this list :)

ChrisA
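One possible set of opening demonstrations, offered only as a sketch. Each line is a single expression built from literals, operators and built-in function calls, in keeping with the original question; which ones would actually impress a given class is a guess.

```python
print(2 * "ab" + "c")                     # strings repeat and concatenate
print([1] + 2 * [2])                      # so do lists
print(2 ** 100)                           # integers do not overflow
print(sorted("python"))                   # a string is a sequence of characters
print(sum(range(1, 101)))                 # 1 + 2 + ... + 100
print(len("hello"), type(1.5).__name__)   # built-in functions on literals
```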
Re: Planning a Python Course for Beginners
Stefan Ram wrote:

> I am planning a Python course.
[different topics]
> Are there any other very simple things that
> I have missed and that should be covered very
> early in a Python course?

The differences between blanks and tabs :-)

-- 
Dipl.-Inform(FH) Peter Heitzer, peter.heit...@rz.uni-regensburg.de
Re: Planning a Python Course for Beginners
On Wed, Aug 9, 2017 at 12:19 AM, Stefan Ram wrote:

> I am planning a Python course.

Before answering any other questions, answer this one: Why a new Python course? How is it different from what already exists?

The answer to that will govern just about everything else. The specifics that you're asking about are unanswerable without first knowing your audience, for instance.

ChrisA