I noticed recently that *all* the examples for collections.defaultdict (https://docs.python.org/3.7/library/collections.html#collections.defaultdict) are cases of grouping (into an int, a list and a set) from an iterator that yields (key, value) pairs.
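For reference, here is a minimal sketch of the kind of grouping those doc examples boil down to. The sample pairs are made up for illustration (they are not copied from the docs), and only the three factories mentioned above are shown:

```python
from collections import defaultdict

# Hypothetical sample data: any iterable of (key, value) pairs would do.
pairs = [("yellow", 1), ("blue", 2), ("yellow", 3), ("blue", 4), ("red", 1)]

# list factory: collect every value seen for a key.
by_list = defaultdict(list)
for key, value in pairs:
    by_list[key].append(value)

# int factory: keep a running total per key.
by_int = defaultdict(int)
for key, value in pairs:
    by_int[key] += value

# set factory: collect the distinct values seen for a key.
by_set = defaultdict(set)
for key, value in pairs:
    by_set[key].add(value)
```

In all three cases the loop is the same; only the factory and the "combine" step (append, +=, add) change.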
I wondered how common those constructions were, and what else defaultdict is used for. So I took a little dive into a few libraries to see (std lib, pypy, pandas, tensorflow, ...), and essentially I saw:

A) Basic cases of "grouping" with a simple for loop and a default_dict[key].append(value). I saw many kinds of default factory in use: list, int, set, dict, and even defaultdict(list).
ex: https://frama.link/UtNqvpvb, https://frama.link/o3Hb3-4U, https://frama.link/dw92yJ1q, https://frama.link/1Gqoa7WM, https://frama.link/bWswbHsU, https://frama.link/SZh2q8pS

B) Cases of grouping where the for loop feeds more than one "grouper". That is pretty annoying if we want a single call that does the grouping.
ex: https://frama.link/Db-Ny49a, https://frama.link/bZakUR33, https://frama.link/MwJFqh5o

C) Class attribute initialization (the grouping is done by repeatedly calling a function, so any grouping constructor would be useless here).
ex: https://frama.link/GoGWuQwR, https://frama.link/BugcS8wU

D) Sometimes you just want a defaultdict inside a defaultdict inside a dict and just have fun:
https://frama.link/asBNLr1g, https://frama.link/8j7gzfA5

From what I saw, the most useful thing would be to add a method to defaultdict to fill it from an iterable, using a grouping method adapted to the default_factory (so __add__ for list, int and str, add for set, update for dict, and probably __add__ for anything else).

A sample code would be:

    from collections import defaultdict

    class groupingdict(defaultdict):
        def group_by_iterator(self, iterator):
            # Build one empty value to find out how values get combined.
            empty_element = self.default_factory()
            if hasattr(empty_element, "__add__"):
                # list, int, str, ...
                for key, element in iterator:
                    self[key] += element
            elif hasattr(empty_element, "add"):
                # set: checked before "update", because set has both
                for key, element in iterator:
                    self[key].add(element)
            elif hasattr(empty_element, "update"):
                # dict
                for key, element in iterator:
                    self[key].update(element)
            else:
                raise TypeError('default factory does not support __add__, add or update')
            return self

So that, for example:

    groupingdict(dict).group_by_iterator(
        (grouping_key, a_dict) for grouping_key, a_dict in [
            (1, {'a': 'c'}),
            (1, {'b': 'f'}),
            (1, {'a': 'e'}),
            (2, {'a': 'e'})
        ]
    )

returns

    groupingdict(dict, {1: {'a': 'e', 'b': 'f'}, 2: {'a': 'e'}})

My implementation is garbage, and there should be two methods, one returning the object and one modifying it, but I think it gives more leeway than just a function returning a dict.

2018-07-13 7:11 GMT+02:00 Chris Barker via Python-ideas <python-ideas@python.org>:

> On Mon, Jul 9, 2018 at 5:55 PM, Franklin? Lee <leewangzhong+pyt...@gmail.com> wrote:
>
>> >> - The storage container.
>> >
>> > so this means you're passing in a full set of storage containers? I'm a bit
>> > confused by that -- if they might be pre-populated, then they would need to
>> > be instances, and you'd need to have one for every key -- how would you know
>> > in advance what you needed???
>>
>> No, I mean the mapping (outer) container. For example, I can pass in
>> an empty OrderedDict, or a dict that already contained some groups
>> from a previous call to the grouping function.
>
> Sure -- that's what my prototype does if you pass a Mapping in (or use
> .update() )
>
> why not?
>
> -CHB
>
> --
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> chris.bar...@noaa.gov
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

--
Nicolas Rolin | Data Scientist
+ 33 631992617 - nicolas.ro...@tiime.fr
15 rue Auber, 75009 Paris
www.tiime.fr <http://www.tiime.fr>