Re: flattening a dict
On Feb 17, 6:18 am, Terry Jones [EMAIL PROTECTED] wrote: Hi Arnaud Benjamin Here's a version that's a bit more general. It handles keys whose values are empty dicts (assigning None to the value in the result), and also dict keys that are not strings (see the test data below). It's also less recursive as it only calls itself on values that are dicts. But it returns a dict whose keys are tuples. You then need to decide what to do with this. Hence the helper function strdictflatten for the case when all dict keys can be converted to str. As I told Arnaud in email, I greatly prefer his version for its elegance. Terry def dictflatten(d, prefix=None): result = {} if prefix is None: prefix = tuple() for k, v in d.iteritems(): key = prefix + (k,) if isinstance(v, dict): if v: result.update(dictflatten(v, key)) else: result[key] = None else: result[key] = v return result def strdictflatten(d, sep='/'): return dict((sep.join(map(str, k)), v) for k, v in dictflatten(d).iteritems()) Thanks. This is great. Now, how would I restrict the depth of the flattening? For example, what if I just wanted something like this: { mays : {eggs : spam}, jam : {soda : soda/love : dump}, lamba : 23 } They are left alone up the to second level of recursion. if __name__ == '__main__': test = { : {}, 4 : 5, mays : {eggs : spam}, jam : {soda : {love : dump}}, lamba : 23 } d = dictflatten(test) print strdictflatten(test) {'': None, 'lamba': 23, 'mays/eggs': 'spam', '4': 5, 'jam/soda/love': 'dump'} -- http://mail.python.org/mailman/listinfo/python-list
Re: flattening a dict
On Feb 21, 8:04 pm, Benjamin [EMAIL PROTECTED] wrote: On Feb 17, 6:18 am, Terry Jones [EMAIL PROTECTED] wrote: Hi Arnaud Benjamin Here's a version that's a bit more general. It handles keys whose values are empty dicts (assigning None to the value in the result), and also dict keys that are not strings (see the test data below). It's also less recursive as it only calls itself on values that are dicts. But it returns a dict whose keys are tuples. You then need to decide what to do with this. Hence the helper function strdictflatten for the case when all dict keys can be converted to str. As I told Arnaud in email, I greatly prefer his version for its elegance. Terry def dictflatten(d, prefix=None): result = {} if prefix is None: prefix = tuple() for k, v in d.iteritems(): key = prefix + (k,) if isinstance(v, dict): if v: result.update(dictflatten(v, key)) else: result[key] = None else: result[key] = v return result def strdictflatten(d, sep='/'): return dict((sep.join(map(str, k)), v) for k, v in dictflatten(d).iteritems()) Thanks. This is great. Now, how would I restrict the depth of the flattening? For example, what if I just wanted something like this: { mays : {eggs : spam}, jam : {soda : soda/love : dump}, lamba : 23} They are left alone up the to second level of recursion. Why don't you do your homework on your own for a change ? Or at least pretend you're even trying ? -- http://mail.python.org/mailman/listinfo/python-list
Re: flattening a dict
On Feb 21, 9:13 pm, George Sakkis [EMAIL PROTECTED] wrote: On Feb 21, 8:04 pm, Benjamin [EMAIL PROTECTED] wrote: On Feb 17, 6:18 am, Terry Jones [EMAIL PROTECTED] wrote: Hi Arnaud Benjamin Here's a version that's a bit more general. It handles keys whose values are empty dicts (assigning None to the value in the result), and also dict keys that are not strings (see the test data below). It's also less recursive as it only calls itself on values that are dicts. But it returns a dict whose keys are tuples. You then need to decide what to do with this. Hence the helper function strdictflatten for the case when all dict keys can be converted to str. As I told Arnaud in email, I greatly prefer his version for its elegance. Terry def dictflatten(d, prefix=None): result = {} if prefix is None: prefix = tuple() for k, v in d.iteritems(): key = prefix + (k,) if isinstance(v, dict): if v: result.update(dictflatten(v, key)) else: result[key] = None else: result[key] = v return result def strdictflatten(d, sep='/'): return dict((sep.join(map(str, k)), v) for k, v in dictflatten(d).iteritems()) Thanks. This is great. Now, how would I restrict the depth of the flattening? For example, what if I just wanted something like this: { mays : {eggs : spam}, jam : {soda : soda/love : dump}, lamba : 23} They are left alone up the to second level of recursion. Why don't you do your homework on your own for a change ? Or at least pretend you're even trying ? excuseme Here's a attempt: def dictflatten(d, prefix=None, depth=2, cur_depth=0): result = {} if prefix is None: prefix = tuple() if isinstance(prefix, basestring): prefix = (prefix,) cur_depth += 1 for k, v in d.iteritems(): key = prefix + (k,) print key, cur_depth if isinstance(v, dict): if v: if cur_depth = depth: result.update(dictflatten(v, key, depth, cur_depth)) else: result.update(dictflatten(v, k, depth, cur_depth)) else: result[key] = None else: result[k] = v return result /excuseme -- http://mail.python.org/mailman/listinfo/python-list
Re: flattening a dict
Steven D'Aprano [EMAIL PROTECTED] wrote: # Untested def flattendict(d): def gen(L): return (x for M in exp(L) for x in rec(M)) def exp(L): return (L+list(kv) for kv in L.pop().iteritems()) def rec(M): return gen(M) if isinstance(M[-1],dict) else [M] return dict((tuple(L[:-1]),L[-1]) for L in gen([d])) No. The function is hard to read, not because it uses lambdas, but because it is obfuscated Python. The lambda syntax doesn't contribute to the obfuscation. In this particular case I think the lambda does contribute to the obfuscation. Yes, they are single expressions, but only because that have been contorted to become single expressions. The first two return generators, so if you don't force them into a lambda you can write them out as for loops with yield. The last one is an if..else statement. Splitting them into multiple statements also removes the need to keep them as short as possible, so you don't need to use single letter variable names everywhere. And as for your point about bad code smells, no, I don't agree. If your function consists of a single expression, and you don't expect func.__name__ to have a meaningful value, then there's nothing wrong with using a named lambda. Anonymous functions are first-class objects in Python, just as classes and modules and named functions are, and people shouldn't make up superstitious rules about not assigning them to names. You've got it backwards here: the lambdas were given names. If they hadn't been assigned to names I'd have no objection. -- http://mail.python.org/mailman/listinfo/python-list
Re: flattening a dict
Duncan Booth wrote: In this particular case I think the lambda does contribute to the obfuscation. Yes, they are single expressions, but only because that have been contorted to become single expressions. The first two return generators, so if you don't force them into a lambda you can write them out as for loops with yield. The last one is an if..else statement. The key word is become, and in this particular case the discussion thread is witness to the fact that it doesn't apply as you say. The three single expressions came about from a (misshappen, OK, but I never pretended otherwise) attempt to divide into pieces what was already a single expression in Arnaud's code while keeping in the spirit. BTW, I have a particular issue with your statement : A lambda which is assigned directly to a variable is a bad code smell. My problem is that I can't discuss its pure content with which I simply disagree, because that would legitimize its abstracted form with which I have the strongest issues, although largely OT. Best, BB -- http://mail.python.org/mailman/listinfo/python-list
Re: flattening a dict
Boris Borcic [EMAIL PROTECTED] wrote: It is more elementary in the mathematician's sense, and therefore preferable all other things being equal, imo. I've tried to split 'gen' but I can't say the result is so much better. def flattendict(d) : gen = lambda L : (x for M in exp(L) for x in rec(M)) exp = lambda L : (L+list(kv) for kv in L.pop().iteritems()) rec = lambda M : gen(M) if isinstance(M[-1],dict) else [M] return dict((tuple(L[:-1]),L[-1]) for L in gen([d])) Why, why, why, why are you using lambda here? It only makes the code harder to read (and it is bad enough without that). A lambda which is assigned directly to a variable is a bad code smell. -- http://mail.python.org/mailman/listinfo/python-list
Re: flattening a dict
Arnaud Delobelle wrote: On Feb 17, 4:03 pm, Boris Borcic [EMAIL PROTECTED] wrote: George Sakkis wrote: On Feb 17, 7:51 am, Arnaud Delobelle [EMAIL PROTECTED] wrote: BTW, I keep using the idiom itertools.chain(*iterable). I guess that during function calls *iterable gets expanded to a tuple. Wouldn't it be nice to have an equivalent one-argument function that takes an iterable of iterables and return the 'flattened' iterable? Indeed; I don't have any exact numbers but I roughly use this idiom as often or more as the case where chain() takes a known fixed number of arguments. The equivalent function you describe is trivial: def chain2(iter_of_iters): for iterable in iter_of_iters: for i in iterable: yield i or fwiw chainstar = lambda iters : (x for it in iters for x in it) - a form that better suggests how to inline it in the calling expression, if applicable. Indeed: def flattendict(d): def gen(d, pfx=()): return (x for k, v in d.iteritems() for x in (gen(v, pfx+(k,)) if isinstance(v, dict) else ((pfx+(k,), v),))) return dict(gen(d)) I don't know, I find the chain(*...) version more readable, In this case I could concede that it is, although I find both forms overly convoluted for easy reading. although this one is probably better. It is more elementary in the mathematician's sense, and therefore preferable all other things being equal, imo. I've tried to split 'gen' but I can't say the result is so much better. def flattendict(d) : gen = lambda L : (x for M in exp(L) for x in rec(M)) exp = lambda L : (L+list(kv) for kv in L.pop().iteritems()) rec = lambda M : gen(M) if isinstance(M[-1],dict) else [M] return dict((tuple(L[:-1]),L[-1]) for L in gen([d])) For fun I've also written a (vaguely prologish) non-recursive generator-based version that exploits undocumented (but logical) behavior of list.extend class iterstack(list) : __iter__ = lambda self : self def next(self) : while self : try : return self[-1].next() except StopIteration : self.pop() raise StopIteration def flattendict(d,stk=[]) : res={} its=iterstack() its.extend(stk[-1].iteritems() for stk[len(its)-1:] in chain([[d]],its) if isinstance(stk[-1],dict) or res.update([(stk.pop(),tuple(stk))[::-1]])) return res (testing was minimal) Challenge for who really has time to waste : replace the iterstack list subclass definition with a simple list + a generator expression inside flattendict. Cheers, BB -- http://mail.python.org/mailman/listinfo/python-list
Re: flattening a dict
Duncan Booth wrote: Boris Borcic [EMAIL PROTECTED] wrote: It is more elementary in the mathematician's sense, and therefore preferable all other things being equal, imo. I've tried to split 'gen' but I can't say the result is so much better. def flattendict(d) : gen = lambda L : (x for M in exp(L) for x in rec(M)) exp = lambda L : (L+list(kv) for kv in L.pop().iteritems()) rec = lambda M : gen(M) if isinstance(M[-1],dict) else [M] return dict((tuple(L[:-1]),L[-1]) for L in gen([d])) Why, why, why, why are you using lambda here? Because the 3 lambdas make manifest that they could be combined into a single expression and thus reveal that they replace a single expression. It only makes the code harder to read Matter of perceptions. I myself tend to find def g(...) : def f(x) : return (whatever) return y(f) a bit hard to follow. So frankly I don't feel it betters anything to write def flattendict(d) : def gen(L) : return (x for M in exp(L) for x in rec(M)) def exp(L) : return (L+list(kv) for kv in L.pop().iteritems()) def rec(M) : return gen(M) if isinstance(M[-1],dict) else [M] return dict((tuple(L[:-1]),L[-1]) for L in gen([d])) But whatever to please you Cheers, BB -- http://mail.python.org/mailman/listinfo/python-list
Re: flattening a dict
On Mon, 18 Feb 2008 14:03:20 +, Duncan Booth wrote: Why, why, why, why are you using lambda here? It only makes the code harder to read (and it is bad enough without that). A lambda which is assigned directly to a variable is a bad code smell. Oh come on. I don't get this allergy to lambda that so many people have. Look at the example given: def flattendict(d) : gen = lambda L : (x for M in exp(L) for x in rec(M)) exp = lambda L : (L+list(kv) for kv in L.pop().iteritems()) rec = lambda M : gen(M) if isinstance(M[-1],dict) else [M] return dict((tuple(L[:-1]),L[-1]) for L in gen([d])) The hard-to-read doesn't come from the lambda (which only adds a keyword and a name to each function), but the algorithm, which combines recursion, generator expressions, and tight coupling between three functions. Would the function be any easier to read written like this? # Untested def flattendict(d): def gen(L): return (x for M in exp(L) for x in rec(M)) def exp(L): return (L+list(kv) for kv in L.pop().iteritems()) def rec(M): return gen(M) if isinstance(M[-1],dict) else [M] return dict((tuple(L[:-1]),L[-1]) for L in gen([d])) No. The function is hard to read, not because it uses lambdas, but because it is obfuscated Python. The lambda syntax doesn't contribute to the obfuscation. And as for your point about bad code smells, no, I don't agree. If your function consists of a single expression, and you don't expect func.__name__ to have a meaningful value, then there's nothing wrong with using a named lambda. Anonymous functions are first-class objects in Python, just as classes and modules and named functions are, and people shouldn't make up superstitious rules about not assigning them to names. def foo(x): return x+1 foo = lambda x: x+1 The first case uses TWO keywords, a name, a declared argument and an expression; the lambda form uses ONE keyword, a name, a declared argument and an expression. The problem with lambdas comes from people trying to hammer multi- expression functions into a single-expression lambda, hence obfuscating the algorithm. That's no different from people who obfuscate multi- expression functions by writing them as a generator expression. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: flattening a dict
On Feb 18, 10:22 pm, Steven D'Aprano [EMAIL PROTECTED] cybersource.com.au wrote: [...] The problem with lambdas comes from people trying to hammer multi- expression functions into a single-expression lambda, hence obfuscating the algorithm. That's no different from people who obfuscate multi- expression functions by writing them as a generator expression. I'm sorry if my Python is difficult to understand. That's because I've got a bit of a Lisp... -- Arnaud -- http://mail.python.org/mailman/listinfo/python-list
Re: flattening a dict
Arnaud Delobelle wrote: On Feb 18, 10:22 pm, Steven D'Aprano [EMAIL PROTECTED] cybersource.com.au wrote: [...] The problem with lambdas comes from people trying to hammer multi- expression functions into a single-expression lambda, hence obfuscating the algorithm. That's no different from people who obfuscate multi- expression functions by writing them as a generator expression. I'm sorry if my Python is difficult to understand. That's because I've got a bit of a Lisp... That may be the first lambda-related humor I've ever heard. -- http://mail.python.org/mailman/listinfo/python-list
Re: flattening a dict
On Feb 17, 3:56 am, Benjamin [EMAIL PROTECTED] wrote: How would I go about flattening a dict with many nested dicts within? The dicts might look like this: {mays : {eggs : spam}, jam : {soda : {love : dump}}, lamba : 23} I'd like it to put / inbetween the dicts to make it a one dimensional dict and look like this: {mays/eggs : spam, jam/soda/love : dump, lamba : 23 } In Python you can do anything, even flatten a dictionary: from itertools import chain def flattendict(d, pfx='', sep='/'): if isinstance(d, dict): if pfx: pfx += sep return chain(*(flattendict(v, pfx+k, sep) for k, v in d.iteritems())) else: return (pfx, d), test = {mays : {eggs : spam}, jam : {soda : {love : dump}}, lamba : 23 } print dict(flattendict(test)) {'lamba': 23, 'mays/eggs': 'spam', 'jam/soda/love': 'dump'} You an even have other separators ;) print dict(flattendict(test, sep='.')) {'lamba': 23, 'jam.soda.love': 'dump', 'mays.eggs': 'spam'} -- Arnaud -- http://mail.python.org/mailman/listinfo/python-list
Re: flattening a dict
Arnaud Delobelle wrote: In Python you can do anything, even ...pass the Turing test with a one-liner. Back after 9/11, when US patriotism was the rage, Python knew how to answer correctly the query filter(lambda W : W not in 'ILLITERATE','BULLSHIT') And Python 3.0 slated for next August offers it a chance to change its mind before the next elections... Cheers, BB -- http://mail.python.org/mailman/listinfo/python-list
Re: flattening a dict
Hi Arnaud Benjamin Here's a version that's a bit more general. It handles keys whose values are empty dicts (assigning None to the value in the result), and also dict keys that are not strings (see the test data below). It's also less recursive as it only calls itself on values that are dicts. But it returns a dict whose keys are tuples. You then need to decide what to do with this. Hence the helper function strdictflatten for the case when all dict keys can be converted to str. As I told Arnaud in email, I greatly prefer his version for its elegance. Terry def dictflatten(d, prefix=None): result = {} if prefix is None: prefix = tuple() for k, v in d.iteritems(): key = prefix + (k,) if isinstance(v, dict): if v: result.update(dictflatten(v, key)) else: result[key] = None else: result[key] = v return result def strdictflatten(d, sep='/'): return dict((sep.join(map(str, k)), v) for k, v in dictflatten(d).iteritems()) if __name__ == '__main__': test = { : {}, 4 : 5, mays : {eggs : spam}, jam : {soda : {love : dump}}, lamba : 23 } d = dictflatten(test) print strdictflatten(test) {'': None, 'lamba': 23, 'mays/eggs': 'spam', '4': 5, 'jam/soda/love': 'dump'} -- http://mail.python.org/mailman/listinfo/python-list
Re: flattening a dict
On Feb 17, 12:18 pm, Terry Jones [EMAIL PROTECTED] wrote: Hi Arnaud Benjamin Here's a version that's a bit more general. It handles keys whose values are empty dicts (assigning None to the value in the result), and also dict keys that are not strings (see the test data below). It's also less recursive as it only calls itself on values that are dicts. But it returns a dict whose keys are tuples. You then need to decide what to do with this. Hence the helper function strdictflatten for the case when all dict keys can be converted to str. As I told Arnaud in email, I greatly prefer his version for its elegance. Terry def dictflatten(d, prefix=None): result = {} if prefix is None: prefix = tuple() for k, v in d.iteritems(): key = prefix + (k,) if isinstance(v, dict): if v: result.update(dictflatten(v, key)) else: result[key] = None else: result[key] = v return result It's nice to do it this way I think. Here I post a generator-powered version of the same idea. from itertools import chain def flattendict(d): def gen(d, pfx=()): return chain(*(gen(v, pfx+(k,)) if isinstance(v, dict) else ((pfx+(k,), v),) for k, v in d.iteritems())) return dict(gen(d)) BTW, I keep using the idiom itertools.chain(*iterable). I guess that during function calls *iterable gets expanded to a tuple. Wouldn't it be nice to have an equivalent one-argument function that takes an iterable of iterables and return the 'flattened' iterable? -- Arnaud -- http://mail.python.org/mailman/listinfo/python-list
Re: flattening a dict
On Feb 17, 7:51 am, Arnaud Delobelle [EMAIL PROTECTED] wrote: BTW, I keep using the idiom itertools.chain(*iterable). I guess that during function calls *iterable gets expanded to a tuple. Wouldn't it be nice to have an equivalent one-argument function that takes an iterable of iterables and return the 'flattened' iterable? Indeed; I don't have any exact numbers but I roughly use this idiom as often or more as the case where chain() takes a known fixed number of arguments. The equivalent function you describe is trivial: def chain2(iter_of_iters): for iterable in iter_of_iters: for i in iterable: yield i but I usually don't bother, although if it was available in itertools I'd use it instead. Apart from introducing a new function for something quite similar, another idea would be to modify chain(*iterables) semantics if it is passed a single argument. Currently chain() is useful for 2 or more arguments; chain(x) is equivalent to iter(x), there's no reason to use it ever. On the downside, this might break backwards compatibility for cases like chain(*some_iterable) where some_iterable has length of 1 but I'd guess this is quite rare. George -- http://mail.python.org/mailman/listinfo/python-list
Re: flattening a dict
George Sakkis wrote: On Feb 17, 7:51 am, Arnaud Delobelle [EMAIL PROTECTED] wrote: BTW, I keep using the idiom itertools.chain(*iterable). I guess that during function calls *iterable gets expanded to a tuple. Wouldn't it be nice to have an equivalent one-argument function that takes an iterable of iterables and return the 'flattened' iterable? Indeed; I don't have any exact numbers but I roughly use this idiom as often or more as the case where chain() takes a known fixed number of arguments. The equivalent function you describe is trivial: def chain2(iter_of_iters): for iterable in iter_of_iters: for i in iterable: yield i or fwiw chainstar = lambda iters : (x for it in iters for x in it) - a form that better suggests how to inline it in the calling expression, if applicable. Cheers, BB -- http://mail.python.org/mailman/listinfo/python-list
Re: flattening a dict
Arnaud == Arnaud Delobelle [EMAIL PROTECTED] writes: Arnaud BTW, I keep using the idiom itertools.chain(*iterable). I guess Arnaud that during function calls *iterable gets expanded to a tuple. Arnaud Wouldn't it be nice to have an equivalent one-argument function Arnaud that takes an iterable of iterables and return the 'flattened' Arnaud iterable? This reminds me of a function I wrote a while back that iterates over all its arguments, even calling passed functions and iterating over their results. I knew *even less* about Python then than I do now; please set flamethrowers to Constructive Criticism. http://www.fluidinfo.com/terry/2007/05/07/iteranything/ Terry -- http://mail.python.org/mailman/listinfo/python-list
Re: flattening a dict
On Feb 17, 4:03 pm, Boris Borcic [EMAIL PROTECTED] wrote: George Sakkis wrote: On Feb 17, 7:51 am, Arnaud Delobelle [EMAIL PROTECTED] wrote: BTW, I keep using the idiom itertools.chain(*iterable). I guess that during function calls *iterable gets expanded to a tuple. Wouldn't it be nice to have an equivalent one-argument function that takes an iterable of iterables and return the 'flattened' iterable? Indeed; I don't have any exact numbers but I roughly use this idiom as often or more as the case where chain() takes a known fixed number of arguments. The equivalent function you describe is trivial: def chain2(iter_of_iters): for iterable in iter_of_iters: for i in iterable: yield i or fwiw chainstar = lambda iters : (x for it in iters for x in it) - a form that better suggests how to inline it in the calling expression, if applicable. Indeed: def flattendict(d): def gen(d, pfx=()): return (x for k, v in d.iteritems() for x in (gen(v, pfx+(k,)) if isinstance(v, dict) else ((pfx+(k,), v),))) return dict(gen(d)) I don't know, I find the chain(*...) version more readable, although this one is probably better. Please, Mr itertools, can we have chainstar? -- Arnaud -- http://mail.python.org/mailman/listinfo/python-list
flattening a dict
How would I go about flattening a dict with many nested dicts within? The dicts might look like this: {mays : {eggs : spam}, jam : {soda : {love : dump}}, lamba : 23 } I'd like it to put / inbetween the dicts to make it a one dimensional dict and look like this: {mays/eggs : spam, jam/soda/love : dump, lamba : 23 } -- http://mail.python.org/mailman/listinfo/python-list
Re: flattening a dict
Benjamin wrote: How would I go about flattening a dict with many nested dicts within? The dicts might look like this: {mays : {eggs : spam}, jam : {soda : {love : dump}}, lamba : 23 } I'd like it to put / inbetween the dicts to make it a one dimensional dict and look like this: {mays/eggs : spam, jam/soda/love : dump, lamba : 23 } What have you tried so far? -- http://mail.python.org/mailman/listinfo/python-list