Re: Classical FP problem in python : Hamming problem
Le vendredi 28 Janvier 2005 08:27, Paul Rubin a écrit :

Francis Girard [EMAIL PROTECTED] writes: Thank you Nick and Steven for the idea of a more generic imerge.

If you want to get fancy, the merge should use a priority queue (like in the heapsort algorithm) instead of a linear scan through the incoming iters, to find the next item to output. That lowers the running time to O(n log k) instead of O(n*k), where k is the number of iterators and n is the length.

The goal was only to show some FP constructs on small problems. Here the number of iterators is intended to be fairly small. Otherwise, yes, you could exploit the fact that, at one loop execution, you have already seen most of the elements at the previous loop execution, storing them in some heap structure and therefore not having to rescan the whole list of already-seen iterator values at each loop execution.

Thank you,

Francis Girard

-- http://mail.python.org/mailman/listinfo/python-list
Re: Classical FP problem in python : Hamming problem
Francis Girard [EMAIL PROTECTED] wrote: Thank you Nick and Steven for the idea of a more generic imerge.

You are welcome :-) [It came to me while walking the children to school!]

[snip]

    class IteratorDeiterator:
      def __init__(self, iterator):
        self._iterator = iterator.__iter__()
        self._firstVal = None ## Avoid consuming if not requested from outside
                              ## Works only if iterator itself can't return None

You can use a sentinel here if you want to avoid the "can't return None" limitation. For a sentinel you need an object your iterator couldn't possibly return. You can make one up, eg

    self._sentinel = object()
    self._firstVal = self._sentinel

Or you could use self (but I'm not 100% sure that your recursive functions wouldn't return it!)

      def __iter__(self):
        return self

      def next(self):
        valReturn = self._firstVal
        if valReturn is None:        # becomes: if valReturn is self._sentinel:
          valReturn = self._iterator.next()
        self._firstVal = None        # becomes: self._firstVal = self._sentinel

    etc..

[snip more code]

Thanks for some more examples of fp-style code. I find it hard to get my head round so it's been good exercise!

-- Nick Craig-Wood [EMAIL PROTECTED] -- http://www.craig-wood.com/nick
Re: Classical FP problem in python : Hamming problem
Le jeudi 27 Janvier 2005 10:30, Nick Craig-Wood a écrit :

Francis Girard [EMAIL PROTECTED] wrote: Thank you Nick and Steven for the idea of a more generic imerge.

You are welcome :-) [It came to me while walking the children to school!]

Sometimes fresh air and children's purity are all it takes. Much better than coffee, cigarettes and a flat screen.

[snip]

    class IteratorDeiterator:
      def __init__(self, iterator):
        self._iterator = iterator.__iter__()
        self._firstVal = None ## Avoid consuming if not requested from outside
                              ## Works only if iterator itself can't return None

You can use a sentinel here if you want to avoid the "can't return None" limitation. For a sentinel you need an object your iterator couldn't possibly return. You can make one up, eg

Great idea. I'll use it.

    self._sentinel = object()
    self._firstVal = self._sentinel

Or you could use self (but I'm not 100% sure that your recursive functions wouldn't return it!)

      def __iter__(self):
        return self

      def next(self):
        valReturn = self._firstVal
        if valReturn is None:        # becomes: if valReturn is self._sentinel:
          valReturn = self._iterator.next()
        self._firstVal = None        # becomes: self._firstVal = self._sentinel

    etc..

[snip more code]

Thanks for some more examples of fp-style code. I find it hard to get my head round so it's been good exercise!

Introduction to Functional Programming, Richard Bird and Philip Wadler, Prentice Hall, 1988.

This is the very best intro I ever read. The book is without hype, doesn't show its age and is language neutral. The authors are world leaders in the field today. Only really strong guys have the kindness to do understandable introductions without trying to hide the difficulties (because they are strong enough to face them with simplicity). It's been a real pleasure.

Regards,

Francis Girard
FRANCE

-- Nick Craig-Wood [EMAIL PROTECTED] -- http://www.craig-wood.com/nick
Re: Classical FP problem in python : Hamming problem
Francis Girard [EMAIL PROTECTED] writes: Thank you Nick and Steven for the idea of a more generic imerge. If you want to get fancy, the merge should use a priority queue (like in the heapsort algorithm) instead of a linear scan through the incoming iters, to find the next item to output. That lowers the running time to O(n log k) instead of O(n*k), where k is the number of iterators and n is the length. -- http://mail.python.org/mailman/listinfo/python-list
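[Editor's note: Paul's priority-queue suggestion can be sketched with the stdlib heapq module. This is a hypothetical imerge_heap, written in modern Python 3 syntax rather than the Python 2.x of the thread; it keeps one (head value, index, iterator) entry per live iterator on a heap, so each output costs O(log k) instead of a linear scan.]

```python
import heapq

def imerge_heap(*iterables):
    """Merge sorted iterables; O(n log k) for k iterators, n total items."""
    heap = []
    for index, it in enumerate(map(iter, iterables)):
        try:
            # index breaks ties so iterators themselves are never compared
            heap.append((next(it), index, it))
        except StopIteration:
            pass
    heapq.heapify(heap)
    while heap:
        value, index, it = heap[0]
        yield value
        try:
            # replace the root with that iterator's next head value
            heapq.heapreplace(heap, (next(it), index, it))
        except StopIteration:
            heapq.heappop(heap)  # this iterator is exhausted

print(list(imerge_heap([1, 4, 7], [2, 5, 8], [3, 6, 9])))
# -> [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Like the linear-scan versions in the thread, this emits duplicates (one per iterator holding the minimum); the Hamming problem additionally needs duplicate elimination on top of it.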
Re: Classical FP problem in python : Hamming problem
Paul Rubin wrote: Francis Girard [EMAIL PROTECTED] writes: Thank you Nick and Steven for the idea of a more generic imerge. If you want to get fancy, the merge should use a priority queue (like in the heapsort algorithm) instead of a linear scan through the incoming iters, to find the next item to output. That lowers the running time to O(n log k) instead of O(n*k), where k is the number of iterators and n is the length.

I looked at a heapq solution but didn't see any clean way of dealing with multiple iterators having equal values. The dict solution below deals cleanly with that, since one key can be shared by any number of iterators. Extracting the minimum, and the associated iterables, is fast, but the overall solution is still slower than the brute force approach for the 3 hamming iterators.

    def imerge(*iterables):
        cache = {}
        iterators = map(iter, iterables)
        number = len(iterables)
        exhausted = 0
        while 1:
            # First time through, look at all of them.
            # Subsequently, update only the minimum iterators.
            for it in iterators:
                try:
                    # Key each iterator by its next() value.
                    # Multiple iterators may share the same key.
                    cache.setdefault(it.next(), []).append(it)
                except StopIteration:
                    exhausted += 1
                    if exhausted == number:
                        raise StopIteration
            # Get the lowest value
            val = min(cache)
            # and all the iterators that have that value
            iterators = cache.pop(val)
            yield val

    >>> list(imerge([1, 2, 3, 6], [1, 2, 3, 7], [1, 2, 3, 4, 5]))
    [1, 2, 3, 4, 5, 6, 7]

Michael
Re: Classical FP problem in python : Hamming problem
Le mardi 25 Janvier 2005 09:01, Michael Spencer a écrit :

Francis Girard wrote: The following implementation is even more speaking as it makes self-evident and almost mechanical how to translate algorithms that run after their tail from recursion to tee usage :

Thanks, Francis and Jeff for raising a fascinating topic. I've enjoyed trying to get my head around both the algorithm and your non-recursive implementation.

Yes, it's been fun.

Here's a version of your implementation that uses a helper class to make the algorithm itself prettier.

    from itertools import tee, imap

    def hamming():
        def _hamming():
            yield 1
            for n in imerge(2 * hamming, imerge(3 * hamming, 5 * hamming)):
                yield n
        hamming = Tee(_hamming())
        return iter(hamming)

    class Tee(object):
        """Provides an independent iterator (using tee) on every iteration
        request.  Also implements lazy iterator arithmetic."""
        def __init__(self, iterator):
            self.iter = tee(iterator, 1)[0]
        def __iter__(self):
            return self.iter.__copy__()
        def __mul__(self, number):
            return imap(lambda x: x * number, self.__iter__())

    def imerge(xs, ys):
        x = xs.next()
        y = ys.next()
        while True:
            if x == y:
                yield x
                x = xs.next()
                y = ys.next()
            elif x < y:
                yield x
                x = xs.next()
            else:  # y < x
                yield y
                y = ys.next()

    >>> hg = hamming()
    >>> for i in range(10000):
    ...     n = hg.next()
    ...     if i % 1000 == 0:
    ...         print i, n
    ...
    0 1
    1000 5184
    2000 81
    3000 27993600
    4000 4707158941350
    5000 5096079360
    6000 4096000
    7000 2638827906662400
    8000 143327232
    9000 680244480

Interesting idea.

Regards
Michael
Re: Classical FP problem in python : Hamming problem
Le mardi 25 Janvier 2005 19:52, Steven Bethard a écrit :

Thank you Nick and Steven for the idea of a more generic imerge. To work with the Hamming problem, the imerge function _must_not_ allow duplicates, and we can assume all of the specified iterables are of the same size, i.e. infinite! Therefore, Nick's solution fulfills the need. But it is admittedly confusing to call the function imerge when it doesn't merge everything (including duplicates). Anyway, both solutions shed new light and bring us a bit farther. That's the beauty of many brains from all over the world collaborating. Really, it makes me emotional thinking about it.

For the imerge function, what we really need to make the formulation clear is a way to look at the next element of an iterable without consuming it. Or else, a way to put back consumed elements at the front of an iteration flow, much like the list constructors in FP languages like Haskell. It is simple to encapsulate an iterator inside another iterator class that would do just that. Here's one. The additional fst method returns the next element to consume without consuming it, and the isBottom method checks whether there is something left to consume from the iteration flow, without actually consuming anything.
I named the class IteratorDeiterator for lack of imagination :

-- BEGIN SNAP
class IteratorDeiterator:
  def __init__(self, iterator):
    self._iterator = iterator.__iter__()
    self._firstVal = None ## Avoid consuming if not requested from outside
                          ## Works only if iterator itself can't return None

  def __iter__(self):
    return self

  def next(self):
    valReturn = self._firstVal
    if valReturn is None:
      valReturn = self._iterator.next()
    self._firstVal = None
    return valReturn

  def fst(self):
    if self._firstVal is None:
      self._firstVal = self._iterator.next()
    return self._firstVal

  def isBottom(self):
    if self._firstVal is None:
      try:
        self._firstVal = self._iterator.next()
      except StopIteration:
        return True
    return False
-- END SNAP

Now the imerge_infinite, which merges infinite lists while allowing duplicates, is quite straightforward :

-- BEGIN SNAP
def imerge_infinite(*iterators):
  vItTopable = [IteratorDeiterator(it) for it in iterators]
  while vItTopable:
    yield reduce(lambda itSmallest, it:
                   itSmallest.fst() > it.fst() and it or itSmallest,
                 vItTopable).next()
-- END SNAP

To merge finite lists of possibly different lengths is a bit more work, as you have to eliminate iterators that reached the bottom before the others. This is quite easily done by simply filtering them out. The imerge_finite function below merges lists of possibly different sizes.

-- BEGIN SNAP
def imerge_finite(*iterators):
  vItDeit = [IteratorDeiterator(it) for it in iterators]
  vItDeit = filter(lambda it: not it.isBottom(), vItDeit)
  while vItDeit:
    yield reduce(lambda itSmallest, it:
                   itSmallest.fst() > it.fst() and it or itSmallest,
                 vItDeit).next()
    vItDeit = filter(lambda it: not it.isBottom(), vItDeit)
-- END SNAP

Now, we want versions of these two merge functions that do not allow duplicates. Building upon what we've already done in a semi-FP way, we just filter out the duplicates from the iteration streams returned by the above functions.
The job is the same for the infinite and finite versions, hence the imerge_nodup generic function.

-- BEGIN SNAP
def imerge_nodup(fImerge, *iterators):
  it = fImerge(*iterators)
  el = it.next()
  yield el
  while True:
    el = dropwhile(lambda _next: _next == el, it).next()
    yield el

imerge_inf_nodup = \
  lambda *iterators: imerge_nodup(imerge_infinite, *iterators)
imerge_finite_nodup = \
  lambda *iterators: imerge_nodup(imerge_finite, *iterators)
-- END SNAP

I used the lambda notation for imerge_inf_nodup and imerge_finite_nodup to avoid another useless for-yield loop. Here the notation really just expresses function equivalence with a bound argument (it would be great to have a notation for this in Python, i.e. binding a function argument to yield a new function).

The merge function to use with hamming() is imerge_inf_nodup.

Regards,

Francis Girard
FRANCE

Nick Craig-Wood wrote: Steven Bethard [EMAIL PROTECTED] wrote: Nick Craig-Wood wrote: Thinking about this some more leads me to believe a general purpose imerge taking any number of arguments will look neater, eg

    def imerge(*generators):
        values = [g.next() for g in generators]
        while True:
            x = min(values)
            yield x
            for i in range(len(values)):
                if values[i] == x:
                    values[i] = generators[i].next()

This kinda looks like it dies after the first generator is exhausted, but I'm not certain.

Yes it will stop iterating then (rather like zip() on lists of unequal size). Not sure what the specification should be! It works for the hamming problem though.
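[Editor's note: Francis's wish for "binding a function argument to yield a new function" later landed in the stdlib as functools.partial (Python 2.5). The sketch below is hypothetical, written in modern Python 3: it re-expresses imerge_nodup from above and uses heapq.merge as a stand-in for the imerge_infinite/imerge_finite functions, so that partial replaces the two lambda wrappers.]

```python
from functools import partial
from heapq import merge          # stand-in for the thread's merge functions
from itertools import dropwhile

def imerge_nodup(fImerge, *iterators):
    """Merge with fImerge, dropping consecutive duplicates (as above)."""
    it = fImerge(*iterators)
    el = next(it)
    yield el
    while True:
        try:
            # skip every element equal to the one just emitted
            el = next(dropwhile(lambda nxt: nxt == el, it))
        except StopIteration:
            return
        yield el

# partial binds the first argument, yielding a new function --
# exactly the "bound argument" notation Francis asked for
imerge_finite_nodup = partial(imerge_nodup, merge)

print(list(imerge_finite_nodup([1, 2, 3, 6], [1, 2, 3, 7])))
# -> [1, 2, 3, 6, 7]
```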
Re: Classical FP problem in python : Hamming problem
Francis Girard wrote: For the imerge function, what we really need to make the formulation clear is a way to look at the next element of an iterable without consuming it. Or else, a way to put back consumed elements at the front of an iteration flow, much like the list constructors in FP languages like Haskell. It is simple to encapsulate an iterator inside another iterator class that would do just that. Here's one. The additional fst method returns the next element to consume without consuming it, and the isBottom method checks whether there is something left to consume from the iteration flow, without actually consuming anything. I named the class IteratorDeiterator for lack of imagination :

[snip]

Another implementation is my "peekable" class:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/304373

It allows you to peek several elements ahead if that's necessary.

Steve
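[Editor's note: the recipe itself is at the URL above; for the record, the peek idea can be sketched in a few lines. This is a hypothetical Peekable class written in modern Python, not the actual recipe code: a deque buffers items pulled for lookahead, and peek(n) looks n elements ahead without consuming.]

```python
import collections

class Peekable:
    """Iterator wrapper allowing lookahead without consuming items."""
    def __init__(self, iterable):
        self._it = iter(iterable)
        self._buffer = collections.deque()

    def __iter__(self):
        return self

    def __next__(self):
        # serve buffered (already peeked) items first
        if self._buffer:
            return self._buffer.popleft()
        return next(self._it)

    def peek(self, n=0):
        # fill the buffer until it holds n+1 items, then look, don't consume
        while len(self._buffer) <= n:
            self._buffer.append(next(self._it))
        return self._buffer[n]

p = Peekable([1, 2, 3])
print(p.peek(), p.peek(1), next(p))  # -> 1 2 1
```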
Re: Classical FP problem in python : Hamming problem
Francis Girard wrote: The following implementation is even more speaking as it makes self-evident and almost mechanical how to translate algorithms that run after their tail from recursion to tee usage :

Thanks, Francis and Jeff for raising a fascinating topic. I've enjoyed trying to get my head around both the algorithm and your non-recursive implementation. Here's a version of your implementation that uses a helper class to make the algorithm itself prettier.

    from itertools import tee, imap

    def hamming():
        def _hamming():
            yield 1
            for n in imerge(2 * hamming, imerge(3 * hamming, 5 * hamming)):
                yield n
        hamming = Tee(_hamming())
        return iter(hamming)

    class Tee(object):
        """Provides an independent iterator (using tee) on every iteration
        request.  Also implements lazy iterator arithmetic."""
        def __init__(self, iterator):
            self.iter = tee(iterator, 1)[0]
        def __iter__(self):
            return self.iter.__copy__()
        def __mul__(self, number):
            return imap(lambda x: x * number, self.__iter__())

    def imerge(xs, ys):
        x = xs.next()
        y = ys.next()
        while True:
            if x == y:
                yield x
                x = xs.next()
                y = ys.next()
            elif x < y:
                yield x
                x = xs.next()
            else:  # y < x
                yield y
                y = ys.next()

    >>> hg = hamming()
    >>> for i in range(10000):
    ...     n = hg.next()
    ...     if i % 1000 == 0:
    ...         print i, n
    ...
    0 1
    1000 5184
    2000 81
    3000 27993600
    4000 4707158941350
    5000 5096079360
    6000 4096000
    7000 2638827906662400
    8000 143327232
    9000 680244480

Regards
Michael
Re: Classical FP problem in python : Hamming problem
Francis Girard [EMAIL PROTECTED] wrote:

    def hamming():
        def _hamming():
            yield 1
            hamming2 = hammingGenerators[0]
            hamming3 = hammingGenerators[1]
            hamming5 = hammingGenerators[2]
            for n in imerge(imap(lambda h: 2*h, iter(hamming2)),
                            imerge(imap(lambda h: 3*h, iter(hamming3)),
                                   imap(lambda h: 5*h, iter(hamming5)))):
                yield n
        hammingGenerators = tee(_hamming(), 4)
        return hammingGenerators[3]

If you are after readability, you might prefer this...

    def hamming():
        def _hamming():
            yield 1
            for n in imerge(imap(lambda h: 2*h, iter(hamming2)),
                            imerge(imap(lambda h: 3*h, iter(hamming3)),
                                   imap(lambda h: 5*h, iter(hamming5)))):
                yield n
        hamming2, hamming3, hamming5, result = tee(_hamming(), 4)
        return result

PS interesting thread - never heard of Hamming sequences before!

-- Nick Craig-Wood [EMAIL PROTECTED] -- http://www.craig-wood.com/nick
Re: Classical FP problem in python : Hamming problem
On 25 Jan 2005 08:30:03 GMT, Nick Craig-Wood [EMAIL PROTECTED] wrote: Francis Girard [EMAIL PROTECTED] wrote:

    def hamming():
        def _hamming():
            yield 1
            hamming2 = hammingGenerators[0]
            hamming3 = hammingGenerators[1]
            hamming5 = hammingGenerators[2]
            for n in imerge(imap(lambda h: 2*h, iter(hamming2)),
                            imerge(imap(lambda h: 3*h, iter(hamming3)),
                                   imap(lambda h: 5*h, iter(hamming5)))):
                yield n
        hammingGenerators = tee(_hamming(), 4)
        return hammingGenerators[3]

If you are after readability, you might prefer this...

    def hamming():
        def _hamming():
            yield 1
            for n in imerge(imap(lambda h: 2*h, iter(hamming2)),
                            imerge(imap(lambda h: 3*h, iter(hamming3)),
                                   imap(lambda h: 5*h, iter(hamming5)))):
                yield n
        hamming2, hamming3, hamming5, result = tee(_hamming(), 4)
        return result

PS interesting thread - never heard of Hamming sequences before!

Are the long words really that helpful?

    def hamming():
        def _hamming():
            yield 1
            for n in imerge(imap(lambda h: 2*h, iter(hg2)),
                            imerge(imap(lambda h: 3*h, iter(hg3)),
                                   imap(lambda h: 5*h, iter(hg5)))):
                yield n
        hg2, hg3, hg5, result = tee(_hamming(), 4)  # four hamming generators
        return result

Regards,
Bengt Richter
Re: Classical FP problem in python : Hamming problem
Francis Girard [EMAIL PROTECTED] wrote: The following implementation is even more speaking as it makes self-evident and almost mechanical how to translate algorithms that run after their tail from recursion to tee usage :

    *** BEGIN SNAP
    from itertools import tee, imap
    import sys

    def imerge(xs, ys):
        x = xs.next()
        y = ys.next()
        while True:
            if x == y:
                yield x
                x = xs.next()
                y = ys.next()
            elif x < y:
                yield x
                x = xs.next()
            else:
                yield y
                y = ys.next()

Thinking about this some more leads me to believe a general purpose imerge taking any number of arguments will look neater, eg

    def imerge(*generators):
        values = [g.next() for g in generators]
        while True:
            x = min(values)
            yield x
            for i in range(len(values)):
                if values[i] == x:
                    values[i] = generators[i].next()

    def hamming():
        def _hamming():
            yield 1
            hamming2 = hammingGenerators[0]
            hamming3 = hammingGenerators[1]
            hamming5 = hammingGenerators[2]
            for n in imerge(imap(lambda h: 2*h, iter(hamming2)),
                            imerge(imap(lambda h: 3*h, iter(hamming3)),
                                   imap(lambda h: 5*h, iter(hamming5)))):
                yield n
        hammingGenerators = tee(_hamming(), 4)
        return hammingGenerators[3]

This means that this can be further simplified thus,

    def hamming():
        def _hamming():
            yield 1
            for n in imerge( imap(lambda h: 2*h, hamming2),
                             imap(lambda h: 3*h, hamming3),
                             imap(lambda h: 5*h, hamming5) ):
                yield n
        hamming2, hamming3, hamming5, result = tee(_hamming(), 4)
        return result

(Note the iter(...) seemed not to be doing anything useful so I removed them)

-- Nick Craig-Wood [EMAIL PROTECTED] -- http://www.craig-wood.com/nick
Re: Classical FP problem in python : Hamming problem
Nick Craig-Wood wrote: Thinking about this some more leads me to believe a general purpose imerge taking any number of arguments will look neater, eg

    def imerge(*generators):
        values = [g.next() for g in generators]
        while True:
            x = min(values)
            yield x
            for i in range(len(values)):
                if values[i] == x:
                    values[i] = generators[i].next()

This kinda looks like it dies after the first generator is exhausted, but I'm not certain. An alternate version that doesn't search for 'i':

    py> def imerge(*iterables):
    ...     iters = [iter(i) for i in iterables]
    ...     values = [i.next() for i in iters]
    ...     while iters:
    ...         x, i = min((val, i) for i, val in enumerate(values))
    ...         yield x
    ...         try:
    ...             values[i] = iters[i].next()
    ...         except StopIteration:
    ...             del iters[i]
    ...             del values[i]
    ...
    py> list(imerge([1, 4, 7], [2, 5, 8], [3, 6, 9]))
    [1, 2, 3, 4, 5, 6, 7, 8, 9]
    py> list(imerge([3, 6, 9], [1, 4, 7], [2, 5, 8]))
    [1, 2, 3, 4, 5, 6, 7, 8, 9]
    py> list(imerge([1, 4, 7], [3, 6, 9], [2, 5, 8]))
    [1, 2, 3, 4, 5, 6, 7, 8, 9]

This means that this can be further simplified thus,

    def hamming():
        def _hamming():
            yield 1
            for n in imerge( imap(lambda h: 2*h, hamming2),
                             imap(lambda h: 3*h, hamming3),
                             imap(lambda h: 5*h, hamming5) ):
                yield n
        hamming2, hamming3, hamming5, result = tee(_hamming(), 4)
        return result

Nice.

Steve
Re: Classical FP problem in python : Hamming problem
Steven Bethard [EMAIL PROTECTED] wrote: Nick Craig-Wood wrote: Thinking about this some more leads me to believe a general purpose imerge taking any number of arguments will look neater, eg

    def imerge(*generators):
        values = [g.next() for g in generators]
        while True:
            x = min(values)
            yield x
            for i in range(len(values)):
                if values[i] == x:
                    values[i] = generators[i].next()

This kinda looks like it dies after the first generator is exhausted, but I'm not certain.

Yes it will stop iterating then (rather like zip() on lists of unequal size). Not sure what the specification should be! It works for the hamming problem though.

    >>> list(imerge(iter([1, 2]), iter([1, 2, 3]), iter([1, 2, 3, 4])))
    [1, 2]

An alternate version that doesn't search for 'i':

    py> def imerge(*iterables):
    ...     iters = [iter(i) for i in iterables]
    ...     values = [i.next() for i in iters]
    ...     while iters:
    ...         x, i = min((val, i) for i, val in enumerate(values))
    ...         yield x
    ...         try:
    ...             values[i] = iters[i].next()
    ...         except StopIteration:
    ...             del iters[i]
    ...             del values[i]
    ...
    py> list(imerge([1, 4, 7], [2, 5, 8], [3, 6, 9]))
    [1, 2, 3, 4, 5, 6, 7, 8, 9]
    py> list(imerge([3, 6, 9], [1, 4, 7], [2, 5, 8]))
    [1, 2, 3, 4, 5, 6, 7, 8, 9]
    py> list(imerge([1, 4, 7], [3, 6, 9], [2, 5, 8]))
    [1, 2, 3, 4, 5, 6, 7, 8, 9]

This isn't quite right...

    >>> list(imerge([1, 2, 3], [1, 2, 3], [1, 2, 3]))
    [1, 1, 1, 2, 2, 2, 3, 3, 3]

This should produce [1, 2, 3].

So I'm afraid the searching *is* necessary - you've got to find all the generators with the min value and move them on.

-- Nick Craig-Wood [EMAIL PROTECTED] -- http://www.craig-wood.com/nick
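[Editor's note: Nick's two requirements, advance every iterator that currently holds the minimum, and survive iterables of unequal length, can be combined in one sketch. This is hypothetical modern-Python code, not from the thread; a sentinel object marks exhausted iterators instead of deleting them.]

```python
_SENTINEL = object()  # marks an exhausted iterator's slot

def imerge(*iterables):
    """Sorted merge, de-duplicated across iterables, unequal lengths OK."""
    iters = [iter(i) for i in iterables]
    values = [next(i, _SENTINEL) for i in iters]
    while True:
        live = [v for v in values if v is not _SENTINEL]
        if not live:
            return
        x = min(live)
        yield x
        # advance *all* iterators holding the minimum, as Nick insists
        for i, v in enumerate(values):
            if v is not _SENTINEL and v == x:
                values[i] = next(iters[i], _SENTINEL)

print(list(imerge([1, 2, 3], [1, 2, 3], [1, 2, 3])))   # -> [1, 2, 3]
print(list(imerge([1, 2], [1, 2, 3], [1, 2, 3, 4])))   # -> [1, 2, 3, 4]
```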
Re: Classical FP problem in python : Hamming problem
Nick Craig-Wood wrote: Steven Bethard [EMAIL PROTECTED] wrote: Nick Craig-Wood wrote: Thinking about this some more leads me to believe a general purpose imerge taking any number of arguments will look neater, eg

    def imerge(*generators):
        values = [g.next() for g in generators]
        while True:
            x = min(values)
            yield x
            for i in range(len(values)):
                if values[i] == x:
                    values[i] = generators[i].next()

This kinda looks like it dies after the first generator is exhausted, but I'm not certain.

Yes it will stop iterating then (rather like zip() on lists of unequal size). Not sure what the specification should be! It works for the hamming problem though.

Actually, it stops iterating on lists of equal size too:

    py> def imerge(*iterators):
    ...     iterators = [iter(i) for i in iterators]
    ...     values = [i.next() for i in iterators]
    ...     while True:
    ...         x = min(values)
    ...         yield x
    ...         for i, val in enumerate(values):
    ...             if val == x:
    ...                 values[i] = iterators[i].next()
    ...
    py> list(imerge([1, 4, 7], [2, 5, 8], [3, 6, 9]))
    [1, 2, 3, 4, 5, 6, 7]

Note that 8 and 9 are not in the list.

Steve
Re: Classical FP problem in python : Hamming problem
Nick Craig-Wood wrote: Steven Bethard [EMAIL PROTECTED] wrote: [snip the general purpose imerge and the alternate version that doesn't search for 'i']

This isn't quite right...

    >>> list(imerge([1, 2, 3], [1, 2, 3], [1, 2, 3]))
    [1, 1, 1, 2, 2, 2, 3, 3, 3]

This should produce [1, 2, 3]. So I'm afraid the searching *is* necessary - you've got to find all the generators with the min value and move them on.

Here's a dict-based implementation: cute, but slow, at least for a small number of iterators

    >>> def imerge(*iterables):
    ...     cache = {}
    ...     iterators = map(iter, iterables)
    ...     number = len(iterables)
    ...     exhausted = 0
    ...     while 1:
    ...         for it in iterators:
    ...             try:
    ...                 cache.setdefault(it.next(), []).append(it)
    ...             except StopIteration:
    ...                 exhausted += 1
    ...                 if exhausted == number:
    ...                     raise StopIteration
    ...         val = min(cache)
    ...         iterators = cache.pop(val)
    ...         yield val
    >>> list(imerge([1, 2, 3, 6], [1, 2, 3, 7], [1, 2, 3, 4, 5]))
    [1, 2, 3, 4, 5, 6, 7]

Michael
Re: Classical FP problem in python : Hamming problem
Bengt Richter wrote: On 25 Jan 2005 08:30:03 GMT, Nick Craig-Wood [EMAIL PROTECTED] wrote: If you are after readability, you might prefer this...

    def hamming():
        def _hamming():
            yield 1
            for n in imerge(imap(lambda h: 2*h, iter(hamming2)),
                            imerge(imap(lambda h: 3*h, iter(hamming3)),
                                   imap(lambda h: 5*h, iter(hamming5)))):
                yield n
        hamming2, hamming3, hamming5, result = tee(_hamming(), 4)
        return result

Are the long words really that helpful?

    def hamming():
        def _hamming():
            yield 1
            for n in imerge(imap(lambda h: 2*h, iter(hg2)),
                            imerge(imap(lambda h: 3*h, iter(hg3)),
                                   imap(lambda h: 5*h, iter(hg5)))):
                yield n
        hg2, hg3, hg5, result = tee(_hamming(), 4)  # four hamming generators
        return result

Well, judging by the fact that shortening the identifiers made it so that you felt the need to add a comment indicating what they were identifying, I'd say that yes, the long words *are* helpful. ;) Comments are good, but self-documenting code is even better.

Jeff Shannon
Technician/Programmer
Credit International
Re: Classical FP problem in python : Hamming problem
Ok, I think that the bottom line is this :

For all the algorithms that run after their tail in an FP way, like the Hamming problem, or the Fibonacci sequence (but unlike the Sieve of Eratosthenes -- there's a subtle difference), i.e. all those algorithms that typically rely upon recursion to get the beginning of the generated elements in order to generate the next one ...

... so for this family of algorithms, we have, somehow, to abandon recursion, and to explicitly maintain one or many lists of what had been generated.

One solution for this is suggested in test_generators.py. But I think that this solution is evil as it looks like recursion is used but it is not, and it depends heavily on how the m235 function (for example) is invoked. Furthermore, it is _NOT_ memory efficient as pretended: it leaks! It internally maintains a list of generated results that grows forever, while it is useless to maintain results that have been consumed. Just for a gross estimate, on my system, the percentage of memory usage for this program grows rapidly, reaching 21.6 % after 5 minutes. CPU usage varies between 31%-36%. Ugly and inefficient.

The solution of Jeff Epler is far more elegant. The itertools.tee function does just what we want. It internally maintains a list of what had been generated and deletes the consumed elements as the algo unrolls. To follow with my gross estimate, memory usage grows from 1.2% to 1.9% after 5 minutes (probably only affected by the growing size of Huge Integer). CPU usage varies between 27%-29%. Beautiful and efficient.

You might think that we shouldn't be that fussy about memory usage on generating 100-digit numbers, but we're talking about a whole family of useful FP algorithms; and the best way to implement them should be established. For this family of algorithms, itertools.tee is the way to go. I think that the semi-tutorial in test_generators.py should be updated to use tee. Or, at least, a severe warning comment should be written.
Thank you,

Francis Girard
FRANCE

Le dimanche 23 Janvier 2005 23:27, Jeff Epler a écrit :

Your formulation in Python is recursive (hamming calls hamming()) and I think that's why your program gives up fairly early. Instead, use itertools.tee() [new in Python 2.4, or search the internet for an implementation that works in 2.3] to give a copy of the output sequence to each "multiply by N" function as well as one to be the return value.

Here's my implementation, which matched your list early on but effortlessly reached larger values. One large value it printed was 6412351813189632 (a Python long) which indeed has only the distinct prime factors 2 and 3. (2**43 * 3**6)

Jeff

    from itertools import tee
    import sys

    def imerge(xs, ys):
        x = xs.next()
        y = ys.next()
        while True:
            if x == y:
                yield x
                x = xs.next()
                y = ys.next()
            elif x < y:
                yield x
                x = xs.next()
            else:
                yield y
                y = ys.next()

    def hamming():
        def _hamming(j, k):
            yield 1
            hamming = generators[j]
            for i in hamming:
                yield i * k
        generators = []
        generator = imerge(imerge(_hamming(0, 2), _hamming(1, 3)),
                           _hamming(2, 5))
        generators[:] = tee(generator, 4)
        return generators[3]

    for i in hamming():
        print i,
        sys.stdout.flush()
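[Editor's note: Francis's tee-based formulation still works in modern Python, where the stdlib's heapq.merge can stand in for the hand-written imerge (it allows duplicates, so they are dropped on the fly). A sketch under those assumptions, not the thread's original code:]

```python
import heapq
from itertools import tee, islice

def hamming():
    def _hamming():
        yield 1
        # merge the three scaled copies of the output stream,
        # dropping duplicates on the fly (heapq.merge keeps them)
        last = None
        for n in heapq.merge((2 * h for h in h2),
                             (3 * h for h in h3),
                             (5 * h for h in h5)):
            if n != last:
                last = n
                yield n
    # three copies feed the recursion, the fourth is the result
    h2, h3, h5, result = tee(_hamming(), 4)
    return result

print(list(islice(hamming(), 10)))
# -> [1, 2, 3, 4, 5, 6, 8, 9, 10, 12]
```

As in Jeff's version, tee's internal buffering is what lets the stream consume its own earlier output without a memory leak.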
Re: Classical FP problem in python : Hamming problem
The following implementation is even more speaking as it makes self-evident and almost mechanical how to translate algorithms that run after their tail from recursion to tee usage :

*** BEGIN SNAP
from itertools import tee, imap
import sys

def imerge(xs, ys):
    x = xs.next()
    y = ys.next()
    while True:
        if x == y:
            yield x
            x = xs.next()
            y = ys.next()
        elif x < y:
            yield x
            x = xs.next()
        else:
            yield y
            y = ys.next()

def hamming():
    def _hamming():
        yield 1
        hamming2 = hammingGenerators[0]
        hamming3 = hammingGenerators[1]
        hamming5 = hammingGenerators[2]
        for n in imerge(imap(lambda h: 2*h, iter(hamming2)),
                        imerge(imap(lambda h: 3*h, iter(hamming3)),
                               imap(lambda h: 5*h, iter(hamming5)))):
            yield n
    hammingGenerators = tee(_hamming(), 4)
    return hammingGenerators[3]

for i in hamming():
    print i,
    sys.stdout.flush()
*** END SNAP

Here's an implementation of the Fibonacci sequence that uses tee :

*** BEGIN SNAP
from itertools import tee
import sys

def fib():
    def _fib():
        yield 1
        yield 1
        curGen = fibGenerators[0]
        curAheadGen = fibGenerators[1]
        curAheadGen.next()
        while True:
            yield curGen.next() + curAheadGen.next()
    fibGenerators = tee(_fib(), 3)
    return fibGenerators[2]

for n in fib():
    print n,
    sys.stdout.flush()
*** END SNAP

Francis Girard
FRANCE

Le lundi 24 Janvier 2005 14:09, Francis Girard a écrit :

Ok, I think that the bottom line is this : For all the algorithms that run after their tail in an FP way, like the Hamming problem, or the Fibonacci sequence (but unlike the Sieve of Eratosthenes -- there's a subtle difference), i.e. all those algorithms that typically rely upon recursion to get the beginning of the generated elements in order to generate the next one ... so for this family of algorithms, we have, somehow, to abandon recursion, and to explicitly maintain one or many lists of what had been generated. One solution for this is suggested in test_generators.py.
But I think that this solution is evil as it looks like recursion is used but it is not, and it depends heavily on how the m235 function (for example) is invoked. Furthermore, it is _NOT_ memory efficient as pretended: it leaks! It internally maintains a list of generated results that grows forever, while it is useless to maintain results that have been consumed. Just for a gross estimate, on my system, the percentage of memory usage for this program grows rapidly, reaching 21.6 % after 5 minutes. CPU usage varies between 31%-36%. Ugly and inefficient.

The solution of Jeff Epler is far more elegant. The itertools.tee function does just what we want. It internally maintains a list of what had been generated and deletes the consumed elements as the algo unrolls. To follow with my gross estimate, memory usage grows from 1.2% to 1.9% after 5 minutes (probably only affected by the growing size of Huge Integer). CPU usage varies between 27%-29%. Beautiful and efficient.

You might think that we shouldn't be that fussy about memory usage on generating 100-digit numbers, but we're talking about a whole family of useful FP algorithms; and the best way to implement them should be established. For this family of algorithms, itertools.tee is the way to go. I think that the semi-tutorial in test_generators.py should be updated to use tee. Or, at least, a severe warning comment should be written.

Thank you,

Francis Girard
FRANCE

Le dimanche 23 Janvier 2005 23:27, Jeff Epler a écrit :

Your formulation in Python is recursive (hamming calls hamming()) and I think that's why your program gives up fairly early. Instead, use itertools.tee() [new in Python 2.4, or search the internet for an implementation that works in 2.3] to give a copy of the output sequence to each "multiply by N" function as well as one to be the return value. Here's my implementation, which matched your list early on but effortlessly reached larger values.
One large value it printed was 6412351813189632 (a Python long), which indeed has only the distinct prime factors 2 and 3 (2**43 * 3**6).

Jeff

from itertools import tee
import sys

def imerge(xs, ys):
    x = xs.next()
    y = ys.next()
    while True:
        if x == y:
            yield x
            x = xs.next()
            y = ys.next()
        elif x < y:
            yield x
            x = xs.next()
        else:
            yield y
            y = ys.next()

def hamming():
    def _hamming(j, k):
        yield 1
        hamming = generators[j]
        for i in hamming:
            yield i * k
    generators = []
    generator = imerge(imerge(_hamming(0, 2), _hamming(1, 3)),
                       _hamming(2, 5))
    generators[:] = tee(generator, 4)
    return generators[3]

for i in hamming():
    print i,
    sys.stdout.flush()
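[Editor's note] The imerge above only merges two streams at a time, so k streams cost a chain of pairwise merges; as Paul Rubin notes earlier in the thread, a priority queue brings a k-way merge down to O(n log k). In modern Python the heap machinery is already packaged up. A sketch (the name `kmerge` is mine) using heapq.merge for the merging and itertools.groupby to collapse duplicates:

```python
import heapq
from itertools import count, groupby, islice

def kmerge(*iterables):
    # k-way merge of ascending iterators in O(n log k);
    # groupby collapses runs of equal elements to one occurrence.
    for value, _ in groupby(heapq.merge(*iterables)):
        yield value

# Merge the multiples of 2, 3 and 5, without duplicates.
merged = kmerge((2 * n for n in count(1)),
                (3 * n for n in count(1)),
                (5 * n for n in count(1)))
print(list(islice(merged, 10)))
```

As the thread says, with only three iterators the constant factors dominate, but the heap version scales to many streams.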
Re: Classical FP problem in python : Hamming problem
[Francis Girard]
> For all the algorithms that run after their tail in an FP way, like the
> Hamming problem, or the Fibonacci sequence (but unlike the Sieve of
> Eratosthenes -- there's a subtle difference), i.e. all those algorithms
> that typically rely upon recursion to get the beginning of the generated
> elements in order to generate the next one ...
> ... so for this family of algorithms, we have, somehow, to abandon
> recursion and to explicitly maintain one or many lists of what has been
> generated. One solution for this is suggested in test_generators.py. But
> I think that this solution is evil, as it looks like recursion is used
> when it is not, and it depends heavily on how the m235 function (for
> example) is invoked.

Well, yes -- Heh. Here's one way to get a shared list; the excruciating namespace renaming trick was intended to warn you in advance that it wasn't pretty <wink>.

> Furthermore, it is _NOT_ memory efficient as claimed: it leaks!

Yes. But there are two solutions to the problem in that file, and the second one is in fact extremely memory-efficient compared to the first one. "Efficiency" here was intended in a relative sense.

> It internally maintains a list of generated results that grows forever,
> while it is useless to keep results that have already been consumed. Just
> for a gross estimate, on my system, the percentage of memory usage for
> this program grows rapidly, reaching 21.6% after 5 minutes. CPU usage
> varies between 31% and 36%. Ugly and inefficient.

Try the first solution in the file for a better understanding of what "inefficient" means <wink>.

> The solution of Jeff Epler is far more elegant. The itertools.tee
> function does just what we want. It internally maintains a list of what
> has been generated and deletes the consumed elements as the algorithm
> unrolls. To follow with my gross estimate, memory usage grows from 1.2%
> to 1.9% after 5 minutes (probably only affected by the growing size of
> the huge integers). CPU usage varies between 27% and 29%. Beautiful and
> efficient.
Yes, it is better. tee() didn't exist when generators (and test_generators.py) were written, so of course nothing in the test file uses them.

> You might think that we shouldn't be that fussy about memory usage when
> generating 100-digit numbers, but we're talking about a whole family of
> useful FP algorithms, and the best way to implement them should be
> established.

Possibly -- there really aren't many Pythonistas who care about this.

> For this family of algorithms, itertools.tee is the way to go. I think
> that the semi-tutorial in test_generators.py should be updated to use
> tee. Or, at least, a severe warning comment should be written.

Please submit a patch. The purpose of that file is to test generators, so you should add a third way of doing it, not replace the two ways already there. It should also contain prose explaining why the third way is better (just as there's prose now explaining why the second way is better than the first).
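[Editor's note] To see why tee can stay frugal, a toy model helps. The real itertools.tee is implemented in C; the following two-way version is a sketch of the idea only, with all names mine. Each copy keeps a private deque of items its twin has pulled from the source ahead of it, and popleft() discards an item the moment the lagging copy consumes it, so memory stays proportional to how far the two copies drift apart -- in contrast to the first test_generators.py solution, whose shared list grows forever:

```python
from collections import deque

def tee2(iterable):
    # Minimal two-way tee: 'mine' holds items the other copy has
    # already taken from the source but this copy hasn't seen yet.
    it = iter(iterable)
    buf_a, buf_b = deque(), deque()

    def gen(mine, other):
        while True:
            if mine:
                # Consume (and discard) a buffered item.
                yield mine.popleft()
            else:
                # We're the leader: pull from the source and
                # leave a copy for the lagging twin.
                try:
                    item = next(it)
                except StopIteration:
                    return
                other.append(item)
                yield item

    return gen(buf_a, buf_b), gen(buf_b, buf_a)

a, b = tee2(range(5))
print(next(a), next(a), next(b))
```

After the two next(a) calls, b's buffer holds [0, 1]; the next(b) call pops 0 off immediately, freeing it.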
Re: Classical FP problem in python : Hamming problem
Ok, I'll submit the patch with the prose pretty soon.

Thank you,

Francis Girard
FRANCE

Le mardi 25 Janvier 2005 04:29, Tim Peters a écrit :
[snip -- full quote of Tim's reply, reproduced above, trimmed]
Re: Classical FP problem in python : Hamming problem
Your formulation in Python is recursive (hamming calls hamming()) and I think that's why your program gives up fairly early. Instead, use itertools.tee() [new in Python 2.4, or search the internet for an implementation that works in 2.3] to give a copy of the output sequence to each "multiply by N" function as well as one to be the return value. Here's my implementation, which matched your list early on but effortlessly reached larger values. One large value it printed was 6412351813189632 (a Python long), which indeed has only the distinct prime factors 2 and 3 (2**43 * 3**6).

Jeff

from itertools import tee
import sys

def imerge(xs, ys):
    x = xs.next()
    y = ys.next()
    while True:
        if x == y:
            yield x
            x = xs.next()
            y = ys.next()
        elif x < y:
            yield x
            x = xs.next()
        else:
            yield y
            y = ys.next()

def hamming():
    def _hamming(j, k):
        yield 1
        hamming = generators[j]
        for i in hamming:
            yield i * k
    generators = []
    generator = imerge(imerge(_hamming(0, 2), _hamming(1, 3)),
                       _hamming(2, 5))
    generators[:] = tee(generator, 4)
    return generators[3]

for i in hamming():
    print i,
    sys.stdout.flush()
Re: Classical FP problem in python : Hamming problem
[Francis Girard]
> ...
> In the meantime, I couldn't resist testing the new Python features for
> laziness on a classical FP problem, i.e. the Hamming problem.
> ...
> Nevertheless, while the Haskell version prints the Hamming sequence for
> as long as I can stand it, and with very little memory consumption, the
> Python version only prints:
> ...
> I think I should not have this kind of behaviour, even using recursion,
> since I'm only using lazy constructs all the time. At least, I would
> expect the program to produce many more results before surrendering.
> What's going on?

You can find an explanation in Lib/test/test_generators.py, which uses this problem as an example; you can also find one efficient way to write it there.