Re: Keeping track of things with dictionaries
On Wednesday 09 April 2014 05:47:37 Ian Kelly did opine:

> On Tue, Apr 8, 2014 at 9:31 PM, Gene Heskett ghesk...@wdtv.com wrote:
> > > 'Pneumonoultramicroscopicsilicovolcanoconiosis' has them all beat.
> >
> > Source citation please?
>
> http://en.wikipedia.org/wiki/Pneumonoultramicroscopicsilicovolcanoconiosis
> http://www.oxforddictionaries.com/definition/english/pneumonoultramicroscopicsilicovolcanoconiosis
> http://dictionary.reference.com/browse/Pneumonoultramicroscopicsilicovolcanoconiosis

Damn, I should know better than call your bluff. Serves me right. :(

Boil that down to its essence and it might be used for one of the miners lung problems (black or white lung) they eventually die from. I have a friend doing that as I type. And getting very close to the end of his rope.

Cheers, Gene
--
There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order. -Ed Howdershelt (Author)
Genes Web page http://geneslinuxbox.net:6309/gene
US V Castleman, SCOTUS, Mar 2014 is grounds for Impeaching SCOTUS
--
https://mail.python.org/mailman/listinfo/python-list
Re: Keeping track of things with dictionaries
On Monday, April 7, 2014 9:08:23 PM UTC-7, Chris Angelico wrote:

> That depends on whether calling Brand() unnecessarily is a problem. Using setdefault() is handy when you're working with a simple list or something, but if calling Brand() is costly, or (worse) if it has side effects that you don't want, then you need to use a defaultdict.
>
> I think this is a textbook example of why defaultdict exists, though, so I'd be inclined to just use it, rather than going for setdefault :)

Thanks for the clarification.
Re: Keeping track of things with dictionaries
Chris Angelico ros...@gmail.com wrote in message news:captjjmqfbt2xx+bdfnhz0gagordkhtpbzrr29duwn36girz...@mail.gmail.com...

> On Tue, Apr 8, 2014 at 2:02 PM, Josh English joshua.r.engl...@gmail.com wrote:
> > Would dict.setdefault() solve this problem? Is there any advantage to defaultdict over setdefault()
>
> That depends on whether calling Brand() unnecessarily is a problem. Using setdefault() is handy when you're working with a simple list or something, but if calling Brand() is costly, or (worse) if it has side effects that you don't want, then you need to use a defaultdict.

It appears that when you use 'setdefault', the default is always evaluated, even if the key exists.

>>> def get_value(val):
...     print('getting value', val)
...     return val*2
...
>>> my_dict = {}
>>> my_dict.setdefault('a', get_value('xyz'))
getting value xyz
'xyzxyz'
>>> my_dict.setdefault('a', get_value('abc'))
getting value abc
'xyzxyz'
>>> my_dict
{'a': 'xyzxyz'}

It seems odd. Is there a situation where this behaviour is useful?

Frank Millman
Re: Keeping track of things with dictionaries
On Tue, 08 Apr 2014 09:14:39 +0200, Frank Millman wrote:

> It appears that when you use 'setdefault', the default is always evaluated, even if the key exists.
>
> >>> def get_value(val):
> ...     print('getting value', val)
> ...     return val*2
> ...
> >>> my_dict = {}
> >>> my_dict.setdefault('a', get_value('xyz'))
> getting value xyz
> 'xyzxyz'
> >>> my_dict.setdefault('a', get_value('abc'))
> getting value abc
> 'xyzxyz'
> >>> my_dict
> {'a': 'xyzxyz'}
>
> It seems odd. Is there a situation where this behaviour is useful?

It's not a feature of setdefault. It's how Python works: arguments to functions and methods are always evaluated before the function is called. The same applies to most languages.

Only a very small number of syntactic features involve delayed evaluation. Off the top of my head:

- the second argument to the short-circuit "or" and "and" operators:

      if obj and obj[0]: ...

- ternary if:

      1/x if x != 0 else float('inf')

- generator expressions

- and of course the body of functions and methods don't execute until the function is called.

--
Steven
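The difference between eager argument evaluation and the delayed cases listed above can be observed directly. The following is only a sketch: `noisy` and `calls` are made-up names used to record which expressions actually ran.

```python
calls = []

def noisy(label, value):
    """Record that an expression was evaluated, then return its value."""
    calls.append(label)
    return value

# Ordinary call: the argument is evaluated before dict.setdefault runs,
# even though the key is already present.
d = {"a": 1}
d.setdefault("a", noisy("setdefault arg", 99))

# Short-circuit operator: the right operand is skipped when the left is falsy.
empty = []
skipped = empty and noisy("right of and", 1)

# Conditional expression: only the selected branch is evaluated,
# so 1 / x is never attempted when x is 0.
x = 0
result = noisy("true branch", 1 / x) if x != 0 else float("inf")

# Generator expression: the body runs only when the generator is consumed.
gen = (noisy("generator item", n) for n in range(2))
before_consumption = list(calls)
list(gen)
```

After this runs, `before_consumption` holds only the `setdefault` entry, and the two generator entries appear in `calls` once `list(gen)` has consumed the generator.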
Re: Keeping track of things with dictionaries
On Tue, Apr 8, 2014 at 1:14 AM, Frank Millman fr...@chagford.com wrote:

> Chris Angelico ros...@gmail.com wrote in message news:captjjmqfbt2xx+bdfnhz0gagordkhtpbzrr29duwn36girz...@mail.gmail.com...
> > On Tue, Apr 8, 2014 at 2:02 PM, Josh English joshua.r.engl...@gmail.com wrote:
> > > Would dict.setdefault() solve this problem? Is there any advantage to defaultdict over setdefault()
> >
> > That depends on whether calling Brand() unnecessarily is a problem. Using setdefault() is handy when you're working with a simple list or something, but if calling Brand() is costly, or (worse) if it has side effects that you don't want, then you need to use a defaultdict.
>
> It appears that when you use 'setdefault', the default is always evaluated, even if the key exists.
>
> >>> def get_value(val):
> ...     print('getting value', val)
> ...     return val*2
> ...
> >>> my_dict = {}
> >>> my_dict.setdefault('a', get_value('xyz'))
> getting value xyz
> 'xyzxyz'
> >>> my_dict.setdefault('a', get_value('abc'))
> getting value abc
> 'xyzxyz'
> >>> my_dict
> {'a': 'xyzxyz'}
>
> It seems odd. Is there a situation where this behaviour is useful?

No. The default argument is evaluated because it must be evaluated before it can be passed into the method, just like any other function argument in Python.

So why doesn't it take a callable instead of a value for its second argument? At a guess, because the method was probably added for efficiency, and the function call overhead might easily be slower than just doing a separate getitem and setitem.

The reason setdefault exists I think is primarily because it was added before defaultdict. The contributors at SO can't seem to come up with any particularly good use cases either:

http://stackoverflow.com/questions/3483520/use-cases-for-the-setdefault-dict-method

One thing I will note as a disadvantage of defaultdict is that sometimes you only want the default value behavior while you're initially building the dict, and then you just want a normal dict with KeyErrors from then on. defaultdict doesn't do that; once constructed, it will always be a defaultdict. You can copy the data into a normal dict using the dict() constructor, but this feels dirty to me.
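For reference, the dict() copy mentioned above might look like the sketch below. The sample data and variable names are invented for illustration; note that dict() makes a shallow copy, so the lists are shared between the two mappings.

```python
from collections import defaultdict

# Build phase: missing keys get an empty list automatically.
groups = defaultdict(list)
for name in ["apple", "avocado", "banana"]:
    groups[name[0]].append(name)

# Freeze phase: dict() makes a shallow copy with ordinary KeyError behaviour.
frozen = dict(groups)

try:
    frozen["z"]
    lookup_failed = False
except KeyError:
    lookup_failed = True
```

The failed lookup on `frozen` raises KeyError and does not insert a new key, which is the "normal dict" behaviour asked for.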
Re: Keeping track of things with dictionaries
On Tue, Apr 8, 2014 at 5:14 PM, Frank Millman fr...@chagford.com wrote:

> It appears that when you use 'setdefault', the default is always evaluated, even if the key exists.
>
> >>> def get_value(val):
> ...     print('getting value', val)
> ...     return val*2
> ...
> >>> my_dict = {}
> >>> my_dict.setdefault('a', get_value('xyz'))
> getting value xyz
> 'xyzxyz'
> >>> my_dict.setdefault('a', get_value('abc'))
> getting value abc
> 'xyzxyz'
> >>> my_dict
> {'a': 'xyzxyz'}
>
> It seems odd. Is there a situation where this behaviour is useful?

If the default value is cheap to define and has no side effects, it can be very clean.

    words_by_length = {}
    for word in open("/usr/share/dict/words"):
        words_by_length.setdefault(len(word), []).append(word)

This will, very conveniently, give you a list of all words of a particular length. (It's actually a little buggy but you get the idea.)

ChrisA
Re: Keeping track of things with dictionaries
Ian Kelly wrote:

> One thing I will note as a disadvantage of defaultdict is that sometimes you only want the default value behavior while you're initially building the dict, and then you just want a normal dict with KeyErrors from then on. defaultdict doesn't do that; once constructed, it will always be a defaultdict.

This is one of the statements that I won't believe without trying myself. As I'm posting you can probably guess my findings:

>>> from collections import defaultdict
>>> d = defaultdict(int)
>>> d[0]
0
>>> d.default_factory = str
>>> d[1]
''
>>> d.default_factory = None
>>> d[2]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 2
>>> d
defaultdict(None, {0: 0, 1: ''})

So you can change a defaultdict's default_factory any time you like, and if you set it to None there will be no default. It will still be a defaultdict, but it will act like a normal dict.

> You can copy the data into a normal dict using the dict() constructor, but this feels dirty to me.
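Applied to the "build first, then behave like a normal dict" use case, the default_factory trick could be sketched as follows. The counting example is made up for illustration.

```python
from collections import defaultdict

# Build phase: missing keys start at 0.
counts = defaultdict(int)
for letter in "abracadabra":
    counts[letter] += 1

# Switch off the default: lookups of missing keys now raise KeyError.
counts.default_factory = None

try:
    counts["z"]
    raised = False
except KeyError:
    raised = True
```

The object is still a defaultdict instance afterwards, so code that checks the exact type will notice the difference even though lookups behave like those of a plain dict.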
Re: Keeping track of things with dictionaries
Chris Angelico ros...@gmail.com wrote in message news:captjjmpk-rqx0fp6_4vxyus2z34vc5fq_qntj+q9+kn8y5u...@mail.gmail.com...

> On Tue, Apr 8, 2014 at 5:14 PM, Frank Millman fr...@chagford.com wrote:
> > It appears that when you use 'setdefault', the default is always evaluated, even if the key exists. It seems odd. Is there a situation where this behaviour is useful?
>
> If the default value is cheap to define and has no side effects, it can be very clean.
>
>     words_by_length = {}
>     for word in open("/usr/share/dict/words"):
>         words_by_length.setdefault(len(word), []).append(word)
>
> This will, very conveniently, give you a list of all words of a particular length. (It's actually a little buggy but you get the idea.)

Thanks, that is neat. I haven't spotted the bug yet! Can you give me a hint?

Frank
Re: Keeping track of things with dictionaries
Ian Kelly ian.g.ke...@gmail.com wrote in message news:CALwzidmP5Bevbace9GyQrVXe-_2T=jtpq1yvapsaepvomqe...@mail.gmail.com... On Tue, Apr 8, 2014 at 1:14 AM, Frank Millman fr...@chagford.com wrote: It appears that when you use 'setdefault', the default is always evaluated, even if the key exists. It seems odd. Is there a situation where this behaviour is useful? No. The default argument is evaluated because it must be evaluated before it can be passed into the method, just like any other function argument in Python. So why doesn't it take a callable instead of a value for its second argument? At a guess, because the method was probably added for efficiency, and the function call overhead might easily be slower than just doing a separate getitem and setitem. The reason setdefault exists I think is primarily because it was added before defaultdict. The contributors at SO can't seem to come up with any particularly good use cases either: http://stackoverflow.com/questions/3483520/use-cases-for-the-setdefault-dict-method One thing I will note as a disadvantage of defaultdict is that sometimes you only want the default value behavior while you're initially building the dict, and then you just want a normal dict with KeyErrors from then on. defaultdict doesn't do that; once constructed, it will always be a defaultdict. You can copy the data into a normal dict using the dict() constructor, but this feels dirty to me. Here is an idea, inspired by Peter Otten's suggestion earlier in this thread. Instead of defaultdict, subclass dict and use __missing__() to supply the default values. When the dictionary is set up, delete __missing__ from the subclass! Ugly, but it seems to work. Frank -- https://mail.python.org/mailman/listinfo/python-list
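A minimal sketch of the __missing__ idea, assuming a list default; the class name `BuildingDict` is hypothetical. Note that deleting the method from the class changes the behaviour of every instance of that class, not just one dictionary.

```python
class BuildingDict(dict):
    """dict subclass that supplies an empty list for missing keys."""

    def __missing__(self, key):
        value = self[key] = []
        return value

index = BuildingDict()
index["x"].append(1)

# "Freeze" by removing the hook from the class. This affects every
# instance of BuildingDict, not just `index`.
del BuildingDict.__missing__

try:
    index["y"]
    raised = False
except KeyError:
    raised = True
```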
Re: Keeping track of things with dictionaries
On Tue, Apr 8, 2014 at 6:26 PM, Frank Millman fr...@chagford.com wrote:

> >     words_by_length = {}
> >     for word in open("/usr/share/dict/words"):
> >         words_by_length.setdefault(len(word), []).append(word)
> >
> > This will, very conveniently, give you a list of all words of a particular length. (It's actually a little buggy but you get the idea.)
>
> Thanks, that is neat. I haven't spotted the bug yet! Can you give me a hint?

Run those lines in interactive Python (and change the file name if you're not on Unix or if you don't have a dictionary at that path), and then look at what's in words_by_length[23] - in the dictionary I have here (Debian Wheezy, using an American English dictionary - it's a symlink to (ultimately) /usr/share/dict/american-english), there are five entries in that list. Count how many letters there are in them.

Also, there's a technical bug [1] in that I ought to use 'with' to ensure that the file's properly closed. But for a simple example, that's not critical.

ChrisA

[1] As Julia Jellicoe pointed out, it's an awful thing to be haunted by a technical bug!
Re: Keeping track of things with dictionaries
Chris Angelico ros...@gmail.com wrote in message news:CAPTjJmoRxEhX02ZviHiLO+qi+dD+81smbGGYcPECpHb5E=p4=a...@mail.gmail.com...

> On Tue, Apr 8, 2014 at 6:26 PM, Frank Millman fr...@chagford.com wrote:
> > >     words_by_length = {}
> > >     for word in open("/usr/share/dict/words"):
> > >         words_by_length.setdefault(len(word), []).append(word)
> > >
> > > This will, very conveniently, give you a list of all words of a particular length. (It's actually a little buggy but you get the idea.)
> >
> > Thanks, that is neat. I haven't spotted the bug yet! Can you give me a hint?
>
> Run those lines in interactive Python (and change the file name if you're not on Unix or if you don't have a dictionary at that path), and then look at what's in words_by_length[23] - in the dictionary I have here (Debian Wheezy, using an American English dictionary - it's a symlink to (ultimately) /usr/share/dict/american-english), there are five entries in that list. Count how many letters there are in them.

I don't have a large dictionary to test with, and a small list of words (ls /etc dict) did not throw up any problems.

Are you saying that

    all([len(word) == 23 for word in words_by_length[23]])  # hope I got that right

will not return True?

Frank
Re: Keeping track of things with dictionaries
On Tue, Apr 8, 2014 at 7:28 PM, Frank Millman fr...@chagford.com wrote:

> Are you saying that
>
>     all([len(word) == 23 for word in words_by_length[23]])  # hope I got that right
>
> will not return True?

That'll return true. What it won't show, though, is the length of the word as you would understand it in the English language. You see, when you iterate over a file, you get strings that include a newline at the end, and that'll be included in the length :) So with a dictionary of English words, you'll see that "cat\n" is a four-letter word, and "python\n" is a seven-letter word. It's a subtle point, but an important one when you start looking at lengths of things that are suddenly off by one.

Obviously the solution is to strip them, but I didn't want to pollute the example with that (nor a 'with' block). I didn't think it particularly important, and just acknowledged the bug in what I thought was a throw-away line :)

ChrisA
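For completeness, a version with both fixes applied (newline stripped, file handled in a 'with' block) might look like this sketch. To keep it runnable anywhere, an in-memory `io.StringIO` stands in for the word list; that substitution and the three sample words are assumptions, and a real run would use `open("/usr/share/dict/words")` in the same place.

```python
import io

# Stand-in for a word list file; a real run would use
# open("/usr/share/dict/words") inside the same `with` statement.
fake_file = io.StringIO("cat\npython\ndog\n")

words_by_length = {}
with fake_file as lines:
    for line in lines:
        word = line.rstrip("\n")  # drop the trailing newline before measuring
        words_by_length.setdefault(len(word), []).append(word)
```

With the newline removed, "cat" is filed under 3 and "python" under 6, rather than 4 and 7.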
Re: Keeping track of things with dictionaries
Chris Angelico ros...@gmail.com wrote in message news:captjjmppaqmb6no7udddadqg_jv9yz0sn4d70kasksbwwr3...@mail.gmail.com...

> On Tue, Apr 8, 2014 at 7:28 PM, Frank Millman fr...@chagford.com wrote:
> > Are you saying that
> >
> >     all([len(word) == 23 for word in words_by_length[23]])  # hope I got that right
> >
> > will not return True?
>
> That'll return true. What it won't show, though, is the length of the word as you would understand it in the English language. You see, when you iterate over a file, you get strings that include a newline at the end, and that'll be included in the length :) So with a dictionary of English words, you'll see that "cat\n" is a four-letter word, and "python\n" is a seven-letter word. It's a subtle point, but an important one when you start looking at lengths of things that are suddenly off by one.
>
> Obviously the solution is to strip them, but I didn't want to pollute the example with that (nor a 'with' block). I didn't think it particularly important, and just acknowledged the bug in what I thought was a throw-away line :)

Got it - thanks

Frank
Re: Keeping track of things with dictionaries
Chris Angelico wrote:

> in the dictionary I have here (Debian Wheezy, using an American English dictionary - it's a symlink to (ultimately) /usr/share/dict/american-english), there are five entries in that list.

Mine's bigger than yours! On MacOSX 10.6 I get 41 words. (I think someone must have fed a medical dictionary into it... my list includes such obviously indispensable terms as laparocolpohysterotomy, thyroparathyroidectomy and ureterocystanastomosis.)

Unfortunately I seem to be missing antidisestablishmentarianism, because the longest words in my dict are only 24 characters, excluding the '\n'. Should I ask for my money back?

--
Greg
Re: Keeping track of things with dictionaries
On Wed, Apr 9, 2014 at 10:43 AM, Gregory Ewing greg.ew...@canterbury.ac.nz wrote: Chris Angelico wrote: in the dictionary I have here (Debian Wheezy, using an American English dictionary - it's a symlink to (ultimately) /usr/share/dict/american-english), there are five entries in that list. Mine's bigger than yours! On MacOSX 10.6 I get 41 words. (I think someone must have fed a medical dictionary into it... my list includes such obviously indispensible terms as laparocolpohysterotomy, thyroparathyroidectomy and ureterocystanastomosis.) Yeah, it'll vary based on the exact dictionary used. I went for something with multiple, but not prolific, entries. Unfortunately I seem to be missing antidisestablishmentarianism, because the longest words in my dict are only 24 characters, excluding the '\n'. Should I ask for my money back? I think you should. That's a fundamental flaw in the dictionary. Everyone knows that word's the longest! ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Keeping track of things with dictionaries
On 8/04/2014 6:31 PM, Frank Millman wrote: Here is an idea, inspired by Peter Otten's suggestion earlier in this thread. Instead of defaultdict, subclass dict and use __missing__() to supply the default values. When the dictionary is set up, delete __missing__ from the subclass! Ugly, but it seems to work. Ugly indeed. Replicating the behaviour of defaultdict and then deleting a method from the class seems a very heavyhanded 'solution', especially when you can just override a public attribute on defaultdict, as mentioned by Peter. -- https://mail.python.org/mailman/listinfo/python-list
Re: Keeping track of things with dictionaries
On 9/04/2014 12:33 PM, Chris Angelico wrote: Unfortunately I seem to be missing antidisestablishmentarianism, because the longest words in my dict are only 24 characters, excluding the '\n'. Should I ask for my money back? I think you should. That's a fundamental flaw in the dictionary. Everyone knows that word's the longest! It depends on whether you count 'supercalifragilisticexpialidocious'. If you don't, then 'pseudopseudohypoparathyroidism' is still slightly longer :) -- https://mail.python.org/mailman/listinfo/python-list
Re: Keeping track of things with dictionaries
On Tue, Apr 8, 2014 at 8:45 PM, alex23 wuwe...@gmail.com wrote: On 9/04/2014 12:33 PM, Chris Angelico wrote: Unfortunately I seem to be missing antidisestablishmentarianism, because the longest words in my dict are only 24 characters, excluding the '\n'. Should I ask for my money back? I think you should. That's a fundamental flaw in the dictionary. Everyone knows that word's the longest! It depends on whether you count 'supercalifragilisticexpialidocious'. If you don't, then 'pseudopseudohypoparathyroidism' is still slightly longer :) 'Pneumonoultramicroscopicsilicovolcanoconiosis' has them all beat. -- https://mail.python.org/mailman/listinfo/python-list
Re: Keeping track of things with dictionaries
On Tuesday 08 April 2014 23:31:35 Ian Kelly did opine: On Tue, Apr 8, 2014 at 8:45 PM, alex23 wuwe...@gmail.com wrote: On 9/04/2014 12:33 PM, Chris Angelico wrote: Unfortunately I seem to be missing antidisestablishmentarianism, because the longest words in my dict are only 24 characters, excluding the '\n'. Should I ask for my money back? I think you should. That's a fundamental flaw in the dictionary. Everyone knows that word's the longest! It depends on whether you count 'supercalifragilisticexpialidocious'. If you don't, then 'pseudopseudohypoparathyroidism' is still slightly longer :) 'Pneumonoultramicroscopicsilicovolcanoconiosis' has them all beat. Source citation please? Cheers, Gene -- There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order. -Ed Howdershelt (Author) Genes Web page http://geneslinuxbox.net:6309/gene US V Castleman, SCOTUS, Mar 2014 is grounds for Impeaching SCOTUS -- https://mail.python.org/mailman/listinfo/python-list
Re: Keeping track of things with dictionaries
On Tue, Apr 8, 2014 at 9:31 PM, Gene Heskett ghesk...@wdtv.com wrote: 'Pneumonoultramicroscopicsilicovolcanoconiosis' has them all beat. Source citation please? http://en.wikipedia.org/wiki/Pneumonoultramicroscopicsilicovolcanoconiosis http://www.oxforddictionaries.com/definition/english/pneumonoultramicroscopicsilicovolcanoconiosis http://dictionary.reference.com/browse/Pneumonoultramicroscopicsilicovolcanoconiosis -- https://mail.python.org/mailman/listinfo/python-list
Re: Keeping track of things with dictionaries
On Sunday, April 6, 2014 12:44:13 AM UTC-7, Giuliano Bertoletti wrote:

>     obj = brands_seen.get(brandname)
>     if obj is None:
>         obj = Brand()
>         brands_seen[brandname] = obj

Would dict.setdefault() solve this problem? Is there any advantage to defaultdict over setdefault()?

Josh
Re: Keeping track of things with dictionaries
On Tue, Apr 8, 2014 at 2:02 PM, Josh English joshua.r.engl...@gmail.com wrote:

> On Sunday, April 6, 2014 12:44:13 AM UTC-7, Giuliano Bertoletti wrote:
> >     obj = brands_seen.get(brandname)
> >     if obj is None:
> >         obj = Brand()
> >         brands_seen[brandname] = obj
>
> Would dict.setdefault() solve this problem? Is there any advantage to defaultdict over setdefault()

That depends on whether calling Brand() unnecessarily is a problem. Using setdefault() is handy when you're working with a simple list or something, but if calling Brand() is costly, or (worse) if it has side effects that you don't want, then you need to use a defaultdict.

I think this is a textbook example of why defaultdict exists, though, so I'd be inclined to just use it, rather than going for setdefault :)

ChrisA
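The cost difference described above can be made visible by counting constructor calls. This is only a sketch: the `Brand` class here is a hypothetical stand-in with a `created` counter, not the original poster's class.

```python
from collections import defaultdict

class Brand:
    """Hypothetical stand-in; counts how many instances get created."""
    created = 0

    def __init__(self):
        Brand.created += 1

names = ["acme", "acme", "acme"]

# setdefault: Brand() is an argument, so it runs on every iteration.
with_setdefault = {}
for name in names:
    with_setdefault.setdefault(name, Brand())
setdefault_calls = Brand.created

# defaultdict: Brand is called only when a key is missing.
Brand.created = 0
with_defaultdict = defaultdict(Brand)
for name in names:
    with_defaultdict[name]
defaultdict_calls = Brand.created
```

For three lookups of the same key, the setdefault loop constructs three Brand objects and discards two of them, while the defaultdict loop constructs one.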
Keeping track of things with dictionaries
I frequently use this pattern to keep track of incoming data (for example, to sum up sales of a specific brand):

    # read a brand record from a db
    ...
    # keep track of brands seen
    obj = brands_seen.get(brandname)
    if obj is None:
        obj = Brand()
        brands_seen[brandname] = obj
    obj.AddData(...)  # this might for example keep track of sales

As you might guess, brands_seen is a dictionary whose keys are brand names and whose values are brand objects.

Now the point is: is there a cleverer way to do this? Basically, what I'm doing is querying the dictionary twice if the object does not exist.

What I would like to understand is whether there's some built-in language logic to:

- supply a function which is meant to return a new object
- have the interpreter locate the point in the dictionary where the key is to be
- if the key is already there, return the associated value/object and stop
- if the key is not there, call the supplied function, assign the returned value to the dictionary and return the object.

Giulio.
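Written out by hand, the four steps listed above amount to something like the sketch below. `get_or_create` is a hypothetical helper, not a built-in; it takes the factory as a callable so that it runs only when the key is missing.

```python
def get_or_create(mapping, key, factory):
    """Return mapping[key], calling factory() to create it only if missing."""
    try:
        return mapping[key]
    except KeyError:
        value = mapping[key] = factory()
        return value

made = []

def make_list():
    """Example factory that records each time it is invoked."""
    made.append(1)
    return []

seen = {}
get_or_create(seen, "acme", make_list).append(10)
get_or_create(seen, "acme", make_list).append(20)
```

Here the factory runs once, on the first lookup; the second lookup finds the existing list and appends to it.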
Re: Keeping track of things with dictionaries
Giuliano Bertoletti wrote:

> I frequently use this pattern to keep track of incoming data (for example, to sum up sales of a specific brand):
>
>     # read a brand record from a db
>     ...
>     # keep track of brands seen
>     obj = brands_seen.get(brandname)
>     if obj is None:
>         obj = Brand()
>         brands_seen[brandname] = obj
>     obj.AddData(...)  # this might for example keep track of sales
>
> as you might guess, brands_seen is a dictionary whose keys are brandnames and whose values are brand objects.
>
> Now the point is: is there a cleverer way to do this? Basically what I'm doing is query the dictionary twice if the object does not exist.
>
> What I would like to understand is if there's some language built-in logic to:
>
> - supply a function which is meant to return a new object
> - have the interpreter to locate the point in the dictionary where the key is to be
> - if the key is already there, it returns the value/object associated and stops
> - if the key is not there, it calls the supplied function, assigns the returned value to the dictionary and return the object.

Kudos, you give a precise description of your problem in both English and code.

There is a data structure in the stdlib that fits your task. With a collections.defaultdict your code becomes

    from collections import defaultdict

    brands_seen = defaultdict(Brand)
    brands_seen[brandname].add_data(...)  # Method name adjusted to PEP 8

Side note: If you needed the key in the construction of the value you would have to subclass

    class BrandsSeen(dict):
        def __missing__(self, brandname):
            result = self[brandname] = Brand(brandname)
            return result

    brands_seen = BrandsSeen()
    brands_seen[brandname].add_data(...)