Re: Classical FP problem in python : Hamming problem

2005-01-28 Thread Francis Girard
On Friday 28 January 2005 08:27, Paul Rubin wrote:
 Francis Girard [EMAIL PROTECTED] writes:
  Thank you Nick and Steven for the idea of a more generic imerge.

 If you want to get fancy, the merge should use a priority queue (like
 in the heapsort algorithm) instead of a linear scan through the
 incoming iters, to find the next item to output.  That lowers the
 running time to O(n log k) instead of O(n*k), where k is the number of
 iterators and n is the length.
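
[Editor's note] In today's Python such a priority-queue merge can be sketched with heapq (the standard library's heapq.merge now implements essentially this); modern Python 3 syntax, and the function name is invented here:

```python
import heapq

def imerge_heap(*iterables):
    # k-way merge via a priority queue: O(n log k) overall.
    # The index breaks ties so iterators are never compared directly.
    heap = []
    for index, it in enumerate(map(iter, iterables)):
        try:
            heap.append((next(it), index, it))
        except StopIteration:
            pass                      # skip empty inputs
    heapq.heapify(heap)
    while heap:
        value, index, it = heap[0]
        yield value
        try:
            # Replace the root with that iterator's next value: O(log k).
            heapq.heapreplace(heap, (next(it), index, it))
        except StopIteration:
            heapq.heappop(heap)       # this iterator is exhausted

print(list(imerge_heap([1, 3, 5], [2, 4], [0, 6])))  # [0, 1, 2, 3, 4, 5, 6]
```

Like the linear-scan version discussed below, this keeps duplicates; deduplication is a separate concern.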

The goal was only to show some FP constructs on small problems. Here the number 
of iterators is intended to be fairly small.

Otherwise, yes, you could exploit the fact that, on each loop iteration, you 
have already seen most of the elements during the previous iteration: store 
them in a heap structure and thereby avoid rescanning the whole list of 
already-seen iterator values on every iteration.

Thank you

Francis Girard

--
http://mail.python.org/mailman/listinfo/python-list


Re: Classical FP problem in python : Hamming problem

2005-01-27 Thread Nick Craig-Wood
Francis Girard [EMAIL PROTECTED] wrote:
  Thank you Nick and Steven for the idea of a more generic imerge.

You are welcome :-)  [It came to me while walking the children to school!]

[snip]
  class IteratorDeiterator:
def __init__(self, iterator):
  self._iterator = iterator.__iter__()
  self._firstVal = None ## Avoid consuming if not requested from outside
## Works only if iterator itself can't return None

You can use a sentinel here if you want to avoid the "can't return
None" limitation.  For a sentinel you need an object your iterator
couldn't possibly return.  You can make one up, eg

   self._sentinel = object()
   self._firstVal = self._sentinel

Or you could use self (but I'm not 100% sure that your recursive
functions wouldn't return it!)

def __iter__(self): return self
 
def next(self):
  valReturn = self._firstVal
  if valReturn is None:

and
   if valReturn is self._sentinel:

valReturn = self._iterator.next()
  self._firstVal = None

   self._firstVal = self._sentinel

etc..
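
[Editor's note] Putting Nick's sentinel suggestion together, a complete look-ahead wrapper might read as follows (a modern Python 3 sketch; the class name is made up):

```python
class Peekable:
    # One-slot look-ahead wrapper using a sentinel, so the wrapped
    # iterator is free to yield None.
    _sentinel = object()              # nothing an iterator could ever yield

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._cached = self._sentinel

    def __iter__(self):
        return self

    def __next__(self):
        if self._cached is not self._sentinel:
            value, self._cached = self._cached, self._sentinel
            return value
        return next(self._it)

    def peek(self):
        # Fetch and hold the next value without handing it out yet.
        if self._cached is self._sentinel:
            self._cached = next(self._it)
        return self._cached

p = Peekable([None, 1])
print(p.peek())   # None -- unambiguous thanks to the sentinel
print(next(p))    # None
print(next(p))    # 1
```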

[snip more code]

Thanks for some more examples of fp-style code.  I find it hard to get
my head round so it's been good exercise!
-- 
Nick Craig-Wood [EMAIL PROTECTED] -- http://www.craig-wood.com/nick
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Classical FP problem in python : Hamming problem

2005-01-27 Thread Francis Girard
On Thursday 27 January 2005 10:30, Nick Craig-Wood wrote:
 Francis Girard [EMAIL PROTECTED] wrote:
   Thank you Nick and Steven for the idea of a more generic imerge.

 You are welcome :-)  [It came to me while walking the children to school!]


Sometimes fresh air and children's purity are all it takes. Much better than 
coffee, cigarettes and a flat screen.

 [snip]

   class IteratorDeiterator:
 def __init__(self, iterator):
   self._iterator = iterator.__iter__()
   self._firstVal = None ## Avoid consuming if not requested from outside
   ## Works only if iterator itself can't return None

 You can use a sentinel here if you want to avoid the can't return
 None limitation.  For a sentinel you need an object your iterator
 couldn't possibly return.  You can make one up, eg


Great idea. I'll use it.

self._sentinel = object()
self._firstVal = self._sentinel

 Or you could use self (but I'm not 100% sure that your recursive
 functions wouldn't return it!)

 def __iter__(self): return self
 
 def next(self):
   valReturn = self._firstVal
   if valReturn is None:

 and

if valReturn is self._sentinel:
 valReturn = self._iterator.next()
   self._firstVal = None

self._firstVal = self._sentinel

 etc..

 [snip more code]

 Thanks for some more examples of fp-style code.  I find it hard to get
 my head round so its been good exercise!

Introduction to functional programming
Richard Bird and Philip Wadler
Prentice Hall
1988

This is the very best intro I ever read. The book is without hype, doesn't 
show its age and is language neutral.
The authors are world leaders in the field today. Only really strong guys have 
the kindness to write understandable introductions without trying to hide the 
difficulties (because they are strong enough to face them with simplicity).

It's been a real pleasure.

Regards,

Francis Girard
FRANCE


 --
 Nick Craig-Wood [EMAIL PROTECTED] -- http://www.craig-wood.com/nick

--
http://mail.python.org/mailman/listinfo/python-list


Re: Classical FP problem in python : Hamming problem

2005-01-27 Thread Paul Rubin
Francis Girard [EMAIL PROTECTED] writes:
 Thank you Nick and Steven for the idea of a more generic imerge.

If you want to get fancy, the merge should use a priority queue (like
in the heapsort algorithm) instead of a linear scan through the
incoming iters, to find the next item to output.  That lowers the
running time to O(n log k) instead of O(n*k), where k is the number of
iterators and n is the length.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Classical FP problem in python : Hamming problem

2005-01-27 Thread Michael Spencer
Paul Rubin wrote:
Francis Girard [EMAIL PROTECTED] writes:
Thank you Nick and Steven for the idea of a more generic imerge.

If you want to get fancy, the merge should use a priority queue (like
in the heapsort algorithm) instead of a linear scan through the
incoming iters, to find the next item to output.  That lowers the
running time to O(n log k) instead of O(n*k), where k is the number of
iterators and n is the length.
I looked at a heapq solution but didn't see any clean way of dealing with 
multiple iterators having equal values.  The dict solution below deals cleanly 
with that, since one key can be shared by any number of iterators.  Extracting 
the minimum and the associated iterables is fast, but the overall solution is 
still slower than the brute-force approach for the 3 hamming iterators.
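
[Editor's note] For what it's worth, a heap can be made to cope with equal values by advancing every iterator sitting at the minimum before yielding it once; a modern Python 3 sketch (the function name is invented here):

```python
import heapq

def imerge_nodup_heap(*iterables):
    # Heap-based merge that yields each value once, even when several
    # input iterators produce it.
    heap = []
    for index, it in enumerate(map(iter, iterables)):
        try:
            heap.append((next(it), index, it))
        except StopIteration:
            pass
    heapq.heapify(heap)
    while heap:
        value = heap[0][0]
        yield value
        # Advance *every* iterator currently sitting at this value.
        while heap and heap[0][0] == value:
            _, index, it = heap[0]
            try:
                heapq.heapreplace(heap, (next(it), index, it))
            except StopIteration:
                heapq.heappop(heap)

print(list(imerge_nodup_heap([1, 2, 3, 6], [1, 2, 3, 7], [1, 2, 3, 4, 5])))
```

This keeps the O(n log k) bound while matching the duplicate-free behaviour of the dict version below.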

 >>> def imerge(*iterables):
 ...     cache = {}
 ...     iterators = map(iter,iterables)
 ...     number = len(iterables)
 ...     exhausted = 0
 ...     while 1:
 ...         # First time through, looked at all of them
 ...         # Subsequently, update only the minimum iterators
 ...         for it in iterators:
 ...             try:
 ...                 # Key each iterator by its next() value
 ...                 # Multiple iterators may share the same key
 ...                 cache.setdefault(it.next(),[]).append(it)
 ...             except StopIteration:
 ...                 exhausted += 1
 ...                 if exhausted == number:
 ...                     raise StopIteration
 ...         # Get the lowest value
 ...         val = min(cache)
 ...         # and all the iterators that have that value
 ...         iterators = cache.pop(val)
 ...         yield val
 >>> list(imerge([1, 2, 3, 6], [1, 2, 3, 7], [1, 2, 3, 4, 5]))
 [1, 2, 3, 4, 5, 6, 7]
 
Michael
--
http://mail.python.org/mailman/listinfo/python-list


Re: Classical FP problem in python : Hamming problem

2005-01-26 Thread Francis Girard
On Tuesday 25 January 2005 09:01, Michael Spencer wrote:
 Francis Girard wrote:
  The following implementation is even more expressive, as it makes
  self-evident and almost mechanical how to translate algorithms that run
  after their tail from recursion to tee usage :

 Thanks, Francis and Jeff for raising a fascinating topic.  I've enjoyed
 trying to get my head around both the algorithm and your non-recursive
 implementation.


Yes, it's been fun.

 Here's a version of your implementation that uses a helper class to make
 the algorithm itself prettier.

 from itertools import tee, imap

 def hamming():
  def _hamming():
  yield 1
  for n in imerge(2 * hamming, imerge(3 * hamming, 5 * hamming)):
  yield n

  hamming = Tee(_hamming())
  return iter(hamming)


 class Tee(object):
  """Provides an independent iterator (using tee) on every iteration
  request. Also implements lazy iterator arithmetic."""
  def __init__(self, iterator):
  self.iter = tee(iterator,1)[0]
  def __iter__(self):
  return self.iter.__copy__()
  def __mul__(self, number):
  return imap(lambda x: x * number,self.__iter__())

 def imerge(xs, ys):
x = xs.next()
y = ys.next()
while True:
  if x == y:
yield x
x = xs.next()
y = ys.next()
  elif x < y:
yield x
x = xs.next()
  else: # if y < x:
yield y
y = ys.next()

  >>> hg = hamming()
  >>> for i in range(10000):

 ... n = hg.next()
 ... if i % 1000 == 0: print i, n
 ...
 0 1
 1000 5184
 2000 81
 3000 27993600
 4000 4707158941350
 5000 5096079360
 6000 4096000
 7000 2638827906662400
 8000 143327232
 9000 680244480



Interesting idea.

 Regards

 Michael

--
http://mail.python.org/mailman/listinfo/python-list


Re: Classical FP problem in python : Hamming problem

2005-01-26 Thread Francis Girard
On Tuesday 25 January 2005 19:52, Steven Bethard wrote:

Thank you Nick and Steven for the idea of a more generic imerge.

To work with the Hamming problem, the imerge function _must_not_ allow 
duplicates, and we can assume all of the specified iterables are of the same 
size, i.e. infinite!

Therefore, Nick's solution fulfills the need.  But it is admittedly confusing 
to call the function imerge when it doesn't merge everything (including 
duplicates). Anyway, both solutions shed new light and bring us a bit 
farther.

That's the beauty of many brains from all over the world collaborating.
Really, it makes me emotional thinking about it.

For the imerge function, what we really need to make the formulation clear is 
a way to look at the next element of an iterable without consuming it. Or 
else, a way to put back consumed elements in the front of an iteration flow, 
much like the list constructors in FP languages like Haskell.

It is simple to encapsulate an iterator inside another iterator class that 
would do just that. Here's one. The additional fst method returns the next 
element to consume without consuming it, and the isBottom method checks if 
there is something left to consume from the iteration flow, without actually 
consuming anything. I named the class IteratorDeiterator for lack of 
imagination :

-- BEGIN SNAP
class IteratorDeiterator:
  def __init__(self, iterator):
self._iterator = iterator.__iter__()
self._firstVal = None ## Avoid consuming if not requested from outside
  ## Works only if iterator itself can't return None

  def __iter__(self): return self

  def next(self):
valReturn = self._firstVal
if valReturn is None:
  valReturn = self._iterator.next()
self._firstVal = None
return valReturn

  def fst(self):
if self._firstVal is None:
  self._firstVal = self._iterator.next()
return self._firstVal

  def isBottom(self):
if self._firstVal is None:
  try: self._firstVal = self._iterator.next()
  except StopIteration:
return True
return False
-- END SNAP

Now the imerge_infinite which merges infinite lists, while allowing 
duplicates, is quite straightforward :

-- BEGIN SNAP
def imerge_infinite(*iterators):
  vItTopable = [IteratorDeiterator(it) for it in iterators]
  while vItTopable:
    yield reduce(lambda itSmallest, it: 
                 itSmallest.fst() > it.fst() and it or itSmallest, 
                 vItTopable).next()
-- END SNAP

To merge finite lists of possibly different lengths is a bit more work as you 
have to eliminate iterators that reached the bottom before the others. This 
is quite easily done by simply filtering them out. 
The imerge_finite function below merges lists of possibly different sizes.

-- BEGIN SNAP
def imerge_finite(*iterators):
  vItDeit = [IteratorDeiterator(it) for it in iterators]
  vItDeit = filter(lambda it: not it.isBottom(), vItDeit)
  while vItDeit:
    yield reduce(lambda itSmallest, it: 
                 itSmallest.fst() > it.fst() and it or itSmallest, 
                 vItDeit).next()
vItDeit = filter(lambda it: not it.isBottom(), vItDeit)
-- END SNAP


Now, we want versions of these two merge functions that do not allow 
duplicates. Building upon what we've already done in a semi FP way, we just 
filter out the duplicates from the iteration streams returned by the above 
functions. The job is the same for the infinite and finite versions, hence 
the imerge_nodup generic function.

-- BEGIN SNAP
from itertools import dropwhile

def imerge_nodup(fImerge, *iterators):
  it = fImerge(*iterators)
  el = it.next()
  yield el
  while True:
el = dropwhile(lambda _next: _next == el, it).next()
yield el

imerge_inf_nodup = \
  lambda *iterators: imerge_nodup(imerge_infinite, *iterators)
imerge_finite_nodup = \
  lambda *iterators: imerge_nodup(imerge_finite, *iterators)
-- END SNAP

I used the lambda notation for imerge_inf_nodup and imerge_finite_nodup to 
avoid another useless for-yield loop. Here the notation really just expresses 
function equivalence with a bound argument (it would be great to have a 
notation for this in Python, i.e. binding a function argument to yield a new 
function).
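
[Editor's note] As it happens, Python later grew exactly this notation: functools.partial (added in Python 2.5) binds arguments and yields a new function. A small Python 3 sketch; heapq.merge stands in for the imerge_infinite of the post, and the names here are for the demo only:

```python
from functools import partial
import heapq

def imerge_nodup(fImerge, *iterators):
    # Same call shape as the generic function above: merge, skip repeats.
    seen = object()                  # sentinel: matches nothing at first
    for el in fImerge(*iterators):
        if el != seen:
            seen = el
            yield el

# partial(f, arg) binds f's first argument and yields a new function --
# the binding notation wished for above.
imerge_nodup_sorted = partial(imerge_nodup, heapq.merge)

print(list(imerge_nodup_sorted([1, 2, 3], [2, 3, 4])))  # [1, 2, 3, 4]
```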

The merge function to use with hamming() is imerge_inf_nodup.

Regards,

Francis Girard
FRANCE

 Nick Craig-Wood wrote:
  Steven Bethard [EMAIL PROTECTED] wrote:
  Nick Craig-Wood wrote:
 Thinking about this some more leads me to believe a general purpose
 imerge taking any number of arguments will look neater, eg
 
 def imerge(*generators):
 values = [ g.next() for g in generators ]
 while True:
 x = min(values)
 yield x
 for i in range(len(values)):
 if values[i] == x:
 values[i] = generators[i].next()
 
  This kinda looks like it dies after the first generator is exhausted,
  but I'm not certain.
 
  Yes it will stop iterating then (rather like zip() on lists of unequal
  size). Not sure what the specification should be!  It 

Re: Classical FP problem in python : Hamming problem

2005-01-26 Thread Steven Bethard
Francis Girard wrote:
For the imerge function, what we really need to make the formulation clear is 
a way to look at the next element of an iterable without consuming it. Or 
else, a way to put back consumed elements in the front of an iteration flow, 
much like the list constructors in FP languages like Haskell.

It is simple to encapsulate an iterator inside another iterator class that 
would do just that. Here's one. The additional fst method returns the next 
element to consume without consuming it, and the isBottom method checks if 
there is something left to consume from the iteration flow, without actually 
consuming anything. I named the class IteratorDeiterator for lack of 
imagination :

[snip]
Another implementation is my peekable class:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/304373
It allows you to peek several elements ahead if that's necessary.
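
[Editor's note] A rough sketch of that multi-element look-ahead idea, using collections.deque (the class and method names here are invented, not taken from Steven's recipe):

```python
from collections import deque

class MultiPeekable:
    # Look-ahead wrapper that can peek arbitrarily far ahead.
    def __init__(self, iterable):
        self._it = iter(iterable)
        self._buffer = deque()       # values fetched but not yet consumed

    def __iter__(self):
        return self

    def __next__(self):
        if self._buffer:
            return self._buffer.popleft()
        return next(self._it)

    def peek(self, n=0):
        # Return the element n positions ahead without consuming it.
        while len(self._buffer) <= n:
            self._buffer.append(next(self._it))
        return self._buffer[n]

m = MultiPeekable([10, 20, 30])
print(m.peek(2))   # 30
print(next(m))     # 10
```
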
Steve
--
http://mail.python.org/mailman/listinfo/python-list


Re: Classical FP problem in python : Hamming problem

2005-01-25 Thread Michael Spencer
Francis Girard wrote:
The following implementation is even more expressive, as it makes self-evident 
and almost mechanical how to translate algorithms that run after their tail 
from recursion to tee usage :

Thanks, Francis and Jeff for raising a fascinating topic.  I've enjoyed trying 
to get my head around both the algorithm and your non-recursive implementation.

Here's a version of your implementation that uses a helper class to make the 
algorithm itself prettier.

from itertools import tee, imap

def hamming():
    def _hamming():
        yield 1
        for n in imerge(2 * hamming, imerge(3 * hamming, 5 * hamming)):
            yield n
    hamming = Tee(_hamming())
    return iter(hamming)

class Tee(object):
    """Provides an independent iterator (using tee) on every iteration request
    Also implements lazy iterator arithmetic"""
    def __init__(self, iterator):
        self.iter = tee(iterator,1)[0]
    def __iter__(self):
        return self.iter.__copy__()
    def __mul__(self, number):
        return imap(lambda x: x * number,self.__iter__())

def imerge(xs, ys):
    x = xs.next()
    y = ys.next()
    while True:
        if x == y:
            yield x
            x = xs.next()
            y = ys.next()
        elif x < y:
            yield x
            x = xs.next()
        else: # if y < x:
            yield y
            y = ys.next()
 >>> hg = hamming()
 >>> for i in range(10000):
... n = hg.next()
... if i % 1000 == 0: print i, n
...
0 1
1000 5184
2000 81
3000 27993600
4000 4707158941350
5000 5096079360
6000 4096000
7000 2638827906662400
8000 143327232
9000 680244480
Regards
Michael
--
http://mail.python.org/mailman/listinfo/python-list


Re: Classical FP problem in python : Hamming problem

2005-01-25 Thread Nick Craig-Wood
Francis Girard [EMAIL PROTECTED] wrote:
  def hamming():
def _hamming():
  yield 1
  hamming2 = hammingGenerators[0]
  hamming3 = hammingGenerators[1]
  hamming5 = hammingGenerators[2]
  for n in imerge(imap(lambda h: 2*h, iter(hamming2)),
  imerge(imap(lambda h: 3*h, iter(hamming3)),
  imap(lambda h: 5*h, iter(hamming5)))):
yield n
hammingGenerators = tee(_hamming(), 4)
return hammingGenerators[3]

If you are after readability, you might prefer this...

def hamming():
  def _hamming():
yield 1
for n in imerge(imap(lambda h: 2*h, iter(hamming2)),
imerge(imap(lambda h: 3*h, iter(hamming3)),
    imap(lambda h: 5*h, iter(hamming5)))):
  yield n
  hamming2, hamming3, hamming5, result = tee(_hamming(), 4)
  return result

PS interesting thread - never heard of Hamming sequences before!
-- 
Nick Craig-Wood [EMAIL PROTECTED] -- http://www.craig-wood.com/nick
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Classical FP problem in python : Hamming problem

2005-01-25 Thread Bengt Richter
On 25 Jan 2005 08:30:03 GMT, Nick Craig-Wood [EMAIL PROTECTED] wrote:

Francis Girard [EMAIL PROTECTED] wrote:
  def hamming():
def _hamming():
  yield 1
  hamming2 = hammingGenerators[0]
  hamming3 = hammingGenerators[1]
  hamming5 = hammingGenerators[2]
  for n in imerge(imap(lambda h: 2*h, iter(hamming2)),
  imerge(imap(lambda h: 3*h, iter(hamming3)),
  imap(lambda h: 5*h, iter(hamming5)))):
yield n
hammingGenerators = tee(_hamming(), 4)
return hammingGenerators[3]

If you are after readability, you might prefer this...

def hamming():
  def _hamming():
yield 1
for n in imerge(imap(lambda h: 2*h, iter(hamming2)),
imerge(imap(lambda h: 3*h, iter(hamming3)),
    imap(lambda h: 5*h, iter(hamming5)))):
  yield n
  hamming2, hamming3, hamming5, result = tee(_hamming(), 4)
  return result

PS interesting thread - never heard of Hamming sequences before!

Are the long words really that helpful?

def hamming():
  def _hamming():
yield 1
for n in imerge(imap(lambda h: 2*h, iter(hg2)),
imerge(imap(lambda h: 3*h, iter(hg3)),
    imap(lambda h: 5*h, iter(hg5)))):
  yield n
  hg2, hg3, hg5, result = tee(_hamming(), 4) # four hamming generators
  return result

Regards,
Bengt Richter
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Classical FP problem in python : Hamming problem

2005-01-25 Thread Nick Craig-Wood
Francis Girard [EMAIL PROTECTED] wrote:
  The following implementation is even more speaking as it makes self-evident 
  and almost mechanical how to translate algorithms that run after their tail 
  from recursion to tee usage :
 
  *** BEGIN SNAP
  from itertools import tee, imap
  import sys
 
  def imerge(xs, ys):
x = xs.next()
y = ys.next()
while True:
  if x == y:
yield x
x = xs.next()
y = ys.next()
  elif x < y:
yield x
x = xs.next()
  else:
yield y
y = ys.next()

Thinking about this some more leads me to believe a general purpose
imerge taking any number of arguments will look neater, eg

def imerge(*generators):
values = [ g.next() for g in generators ]
while True:
x = min(values)
yield x
for i in range(len(values)):
if values[i] == x:
values[i] = generators[i].next()

  def hamming():
def _hamming():
  yield 1
  hamming2 = hammingGenerators[0]
  hamming3 = hammingGenerators[1]
  hamming5 = hammingGenerators[2]
  for n in imerge(imap(lambda h: 2*h, iter(hamming2)),
  imerge(imap(lambda h: 3*h, iter(hamming3)),
  imap(lambda h: 5*h, iter(hamming5)))):
yield n
hammingGenerators = tee(_hamming(), 4)
return hammingGenerators[3]

This means that this can be further simplified thus,

def hamming():
def _hamming():
yield 1
for n in imerge( imap(lambda h: 2*h, hamming2),
 imap(lambda h: 3*h, hamming3),
 imap(lambda h: 5*h, hamming5) ):
yield n
hamming2, hamming3, hamming5, result = tee(_hamming(), 4)
return result

(Note the iter(...) calls seemed not to be doing anything useful so I
removed them)

-- 
Nick Craig-Wood [EMAIL PROTECTED] -- http://www.craig-wood.com/nick
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Classical FP problem in python : Hamming problem

2005-01-25 Thread Steven Bethard
Nick Craig-Wood wrote:
Thinking about this some more leads me to believe a general purpose
imerge taking any number of arguments will look neater, eg
def imerge(*generators):
values = [ g.next() for g in generators ]
while True:
x = min(values)
yield x
for i in range(len(values)):
if values[i] == x:
values[i] = generators[i].next()
This kinda looks like it dies after the first generator is exhausted, 
but I'm not certain.  An alternate version that doesn't search for 'i':

py> def imerge(*iterables):
... iters = [iter(i) for i in iterables]
... values = [i.next() for i in iters]
... while iters:
... x, i = min((val, i) for i, val in enumerate(values))
... yield x
... try:
... values[i] = iters[i].next()
... except StopIteration:
... del iters[i]
... del values[i]
... 
py> list(imerge([1, 4, 7], [2, 5, 8], [3, 6, 9]))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
py> list(imerge([3, 6, 9], [1, 4, 7], [2, 5, 8]))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
py> list(imerge([1, 4, 7], [3, 6, 9], [2, 5, 8]))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
This means that this can be further simplified thus,
def hamming():
def _hamming():
yield 1
for n in imerge( imap(lambda h: 2*h, hamming2),
 imap(lambda h: 3*h, hamming3),
 imap(lambda h: 5*h, hamming5) ):
yield n
hamming2, hamming3, hamming5, result = tee(_hamming(), 4)
return result
Nice.
Steve
--
http://mail.python.org/mailman/listinfo/python-list


Re: Classical FP problem in python : Hamming problem

2005-01-25 Thread Nick Craig-Wood
Steven Bethard [EMAIL PROTECTED] wrote:
  Nick Craig-Wood wrote:
  Thinking about this some more leads me to believe a general purpose
  imerge taking any number of arguments will look neater, eg
  
  def imerge(*generators):
  values = [ g.next() for g in generators ]
  while True:
  x = min(values)
  yield x
  for i in range(len(values)):
  if values[i] == x:
  values[i] = generators[i].next()
  
 
  This kinda looks like it dies after the first generator is exhausted, 
  but I'm not certain.

Yes it will stop iterating then (rather like zip() on lists of unequal
size). Not sure what the specification should be!  It works for the
hamming problem though.

 >>> list(imerge(iter([1, 2]), iter([1, 2, 3]), iter([1, 2, 3, 4])))
[1, 2]

 An alternate version that doesn't search for 'i':
 
  py> def imerge(*iterables):
  ... iters = [iter(i) for i in iterables]
  ... values = [i.next() for i in iters]
  ... while iters:
  ... x, i = min((val, i) for i, val in enumerate(values))
  ... yield x
  ... try:
  ... values[i] = iters[i].next()
  ... except StopIteration:
  ...  del iters[i]
  ...  del values[i]
  ...  
  py> list(imerge([1, 4, 7], [2, 5, 8], [3, 6, 9]))
  [1, 2, 3, 4, 5, 6, 7, 8, 9]
  py> list(imerge([3, 6, 9], [1, 4, 7], [2, 5, 8]))
  [1, 2, 3, 4, 5, 6, 7, 8, 9]
  py> list(imerge([1, 4, 7], [3, 6, 9], [2, 5, 8]))
  [1, 2, 3, 4, 5, 6, 7, 8, 9]

This isn't quite right...

 >>> list(imerge([1, 2, 3], [1, 2, 3], [1, 2, 3]))
[1, 1, 1, 2, 2, 2, 3, 3, 3]

This should produce
[1, 2, 3]

So I'm afraid the searching *is* necessary - you've got to find all
the generators with the min value and move them on.
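
[Editor's note] Combining both observations -- advance *all* iterators sitting at the minimum, and drop exhausted ones instead of stopping -- gives something like this (a modern Python 3 sketch, not code from the thread):

```python
def imerge(*iterables):
    # Advance every iterator holding the minimum (so shared values
    # appear once), and drop exhausted iterators instead of dying.
    iters, values = [], []
    for iterable in iterables:
        it = iter(iterable)
        try:
            values.append(next(it))
        except StopIteration:
            continue                  # empty input: ignore it
        iters.append(it)
    while iters:
        x = min(values)
        yield x
        # Walk backwards so deletions don't shift unvisited indices.
        for i in reversed(range(len(values))):
            if values[i] == x:
                try:
                    values[i] = next(iters[i])
                except StopIteration:
                    del iters[i]
                    del values[i]

print(list(imerge([1, 2, 3], [1, 2, 3], [1, 2, 3])))  # [1, 2, 3]
print(list(imerge([1, 2], [1, 2, 3], [1, 2, 3, 4])))  # [1, 2, 3, 4]
```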

-- 
Nick Craig-Wood [EMAIL PROTECTED] -- http://www.craig-wood.com/nick
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Classical FP problem in python : Hamming problem

2005-01-25 Thread Steven Bethard
Nick Craig-Wood wrote:
Steven Bethard [EMAIL PROTECTED] wrote:
Nick Craig-Wood wrote:
Thinking about this some more leads me to believe a general purpose
imerge taking any number of arguments will look neater, eg
def imerge(*generators):
   values = [ g.next() for g in generators ]
   while True:
   x = min(values)
   yield x
   for i in range(len(values)):
   if values[i] == x:
   values[i] = generators[i].next()
This kinda looks like it dies after the first generator is exhausted, 
but I'm not certain.

Yes it will stop iterating then (rather like zip() on lists of unequal
size). Not sure what the specification should be!  It works for the
hamming problem though.
Actually, it stops iterating on lists of equal size too:
py> def imerge(*iterators):
... iterators = [iter(i) for i in iterators]
... values = [i.next() for i in iterators]
... while True:
... x = min(values)
... yield x
... for i, val in enumerate(values):
... if val == x:
... values[i] = iterators[i].next()
...
py> list(imerge([1, 4, 7], [2, 5, 8], [3, 6, 9]))
[1, 2, 3, 4, 5, 6, 7]
Note that 8 and 9 are not in the list.
Steve
--
http://mail.python.org/mailman/listinfo/python-list


Re: Classical FP problem in python : Hamming problem

2005-01-25 Thread Michael Spencer
Nick Craig-Wood wrote:
Steven Bethard [EMAIL PROTECTED] wrote:
Nick Craig-Wood wrote:
Thinking about this some more leads me to believe a general purpose
imerge taking any number of arguments will look neater, eg
def imerge(*generators):
   values = [ g.next() for g in generators ]
   while True:
   x = min(values)
   yield x
   for i in range(len(values)):
   if values[i] == x:
   values[i] = generators[i].next()
This kinda looks like it dies after the first generator is exhausted, 
but I'm not certain.

Yes it will stop iterating then (rather like zip() on lists of unequal
size). Not sure what the specification should be!  It works for the
hamming problem though.

>>> list(imerge(iter([1, 2]), iter([1, 2, 3]), iter([1, 2, 3, 4])))
[1, 2]

An alternate version that doesn't search for 'i':
py> def imerge(*iterables):
... iters = [iter(i) for i in iterables]
... values = [i.next() for i in iters]
... while iters:
... x, i = min((val, i) for i, val in enumerate(values))
... yield x
... try:
... values[i] = iters[i].next()
... except StopIteration:
... del iters[i]
... del values[i]
... 
py> list(imerge([1, 4, 7], [2, 5, 8], [3, 6, 9]))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
py> list(imerge([3, 6, 9], [1, 4, 7], [2, 5, 8]))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
py> list(imerge([1, 4, 7], [3, 6, 9], [2, 5, 8]))
[1, 2, 3, 4, 5, 6, 7, 8, 9]

This isn't quite right...

>>> list(imerge([1, 2, 3], [1, 2, 3], [1, 2, 3]))
[1, 1, 1, 2, 2, 2, 3, 3, 3]
This should produce
[1, 2, 3]
So I'm afraid the searching *is* necessary - you've got to find all
the generators with the min value and move them on.
Here's a dict-based implementation: cute, but slow, at least for a small number 
of iterators

 >>> def imerge(*iterables):
 ... cache = {}
 ... iterators = map(iter,iterables)
 ... number = len(iterables)
 ... exhausted = 0
 ... while 1:
 ... for it in iterators:
 ... try:
 ... cache.setdefault(it.next(),[]).append(it)
 ... except StopIteration:
 ... exhausted += 1
 ... if exhausted == number:
 ... raise StopIteration
 ... val = min(cache)
 ... iterators = cache.pop(val)
 ... yield val
 >>> list(imerge([1, 2, 3, 6], [1, 2, 3, 7], [1, 2, 3, 4, 5]))
[1, 2, 3, 4, 5, 6, 7]
 
Michael
--
http://mail.python.org/mailman/listinfo/python-list


Re: Classical FP problem in python : Hamming problem

2005-01-25 Thread Jeff Shannon
Bengt Richter wrote:
On 25 Jan 2005 08:30:03 GMT, Nick Craig-Wood [EMAIL PROTECTED] wrote:
If you are after readability, you might prefer this...
def hamming():
def _hamming():
  yield 1
  for n in imerge(imap(lambda h: 2*h, iter(hamming2)),
  imerge(imap(lambda h: 3*h, iter(hamming3)),
  imap(lambda h: 5*h, iter(hamming5)))):
yield n
hamming2, hamming3, hamming5, result = tee(_hamming(), 4)
return result
Are the long words really that helpful?
def hamming():
  def _hamming():
yield 1
for n in imerge(imap(lambda h: 2*h, iter(hg2)),
imerge(imap(lambda h: 3*h, iter(hg3)),
   imap(lambda h: 5*h, iter(hg5:
  yield n
  hg2, hg3, hg5, result = tee(_hamming(), 4) # four hamming generators
  return result
Well, judging by the fact that shortening the identifiers made it so 
that you felt the need to add a comment indicating what they were 
identifying, I'd say that yes, the long words *are* helpful.  ;) 
Comments are good, but self-documenting code is even better.

Jeff Shannon
Technician/Programmer
Credit International
--
http://mail.python.org/mailman/listinfo/python-list


Re: Classical FP problem in python : Hamming problem

2005-01-24 Thread Francis Girard
Ok, I think that the bottom line is this:

For all the algorithms that run after their tail in an FP way, like the 
Hamming problem, or the Fibonacci sequence (but unlike the Sieve of 
Eratosthenes -- there's a subtle difference), i.e. all those algorithms that 
typically rely upon recursion to get the beginning of the generated elements 
in order to generate the next one ...

... so for this family of algorithms, we have, somehow, to abandon recursion, 
and to explicitly maintain one or many lists of what has been generated.

One solution for this is suggested in test_generators.py. But I think that 
this solution is evil, as it looks like recursion is used when it is not, and 
it depends heavily on how the m235 function (for example) is invoked. 
Furthermore, it is _NOT_ memory efficient as claimed: it leaks! It internally 
maintains a list of generated results that grows forever, while it is useless 
to keep results that have already been consumed. Just for a gross estimate, 
on my system, the percentage of memory usage for this program grows rapidly, 
reaching 21.6 % after 5 minutes. CPU usage varies between 31%-36%.
Ugly and inefficient.

The solution of Jeff Epler is far more elegant. The itertools.tee function 
does just what we want. It internally maintains a list of what has been 
generated and deletes the consumed elements as the algorithm unrolls. To 
follow up my gross estimate, memory usage grows from 1.2% to 1.9% after 5 
minutes (probably only affected by the growing size of the huge integers). 
CPU usage varies between 27%-29%.
Beautiful and efficient.

You might think that we shouldn't be that fussy about memory usage when 
generating 100-digit numbers, but we're talking about a whole family of 
useful FP algorithms ; and the best way to implement them should be 
established.

For this family of algorithms, itertools.tee is the way to go.
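
[Editor's note] For reference, the tee-based Hamming generator translates to modern Python 3 like this (imap is gone; the built-in map is lazy now; the imerge2 name is ours):

```python
from itertools import tee

def imerge2(xs, ys):
    # Two-way duplicate-removing merge of sorted (infinite) streams.
    x, y = next(xs), next(ys)
    while True:
        if x == y:
            yield x
            x, y = next(xs), next(ys)
        elif x < y:
            yield x
            x = next(xs)
        else:
            yield y
            y = next(ys)

def hamming():
    def _hamming():
        yield 1
        for n in imerge2(map(lambda h: 2 * h, h2),
                         imerge2(map(lambda h: 3 * h, h3),
                                 map(lambda h: 5 * h, h5))):
            yield n
    # tee buffers only the window between the slowest and fastest copy.
    h2, h3, h5, result = tee(_hamming(), 4)
    return result

hg = hamming()
print([next(hg) for _ in range(10)])  # [1, 2, 3, 4, 5, 6, 8, 9, 10, 12]
```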

I think that the semi-tutorial in test_generators.py should be updated to 
use tee. Or, at least, a severe warning comment should be written.

Thank you,

Francis Girard
FRANCE

On Sunday 23 January 2005 23:27, Jeff Epler wrote:
 Your formulation in Python is recursive (hamming calls hamming()) and I
 think that's why your program gives up fairly early.

 Instead, use itertools.tee() [new in Python 2.4, or search the internet
 for an implementation that works in 2.3] to give a copy of the output
 sequence to each multiply by N function as well as one to be the
 return value.

 Here's my implementation, which matched your list early on but
 effortlessly reached larger values.  One large value it printed was
 6412351813189632 (a Python long) which indeed has only the distinct
 prime factors 2 and 3. (2**43 * 3**6)

 Jeff

 from itertools import tee
 import sys

 def imerge(xs, ys):
 x = xs.next()
 y = ys.next()
 while True:
 if x == y:
 yield x
 x = xs.next()
 y = ys.next()
  elif x < y:
 yield x
 x = xs.next()
 else:
 yield y
 y = ys.next()

 def hamming():
 def _hamming(j, k):
 yield 1
 hamming = generators[j]
 for i in hamming:
 yield i * k
 generators = []
  generator = imerge(imerge(_hamming(0, 2), _hamming(1, 3)), _hamming(2, 5))
  generators[:] = tee(generator, 4)
 return generators[3]

 for i in hamming():
 print i,
 sys.stdout.flush()

--
http://mail.python.org/mailman/listinfo/python-list


Re: Classical FP problem in python : Hamming problem

2005-01-24 Thread Francis Girard
The following implementation is even more expressive, as it makes self-evident 
and almost mechanical how to translate algorithms that run after their tail 
from recursion to tee usage :

*** BEGIN SNAP
from itertools import tee, imap
import sys

def imerge(xs, ys):
  x = xs.next()
  y = ys.next()
  while True:
if x == y:
  yield x
  x = xs.next()
  y = ys.next()
    elif x < y:
  yield x
  x = xs.next()
else:
  yield y
  y = ys.next()

  
def hamming():
  def _hamming():
yield 1
hamming2 = hammingGenerators[0]
hamming3 = hammingGenerators[1]
hamming5 = hammingGenerators[2]
for n in imerge(imap(lambda h: 2*h, iter(hamming2)),
imerge(imap(lambda h: 3*h, iter(hamming3)),
                           imap(lambda h: 5*h, iter(hamming5)))):
  yield n
  hammingGenerators = tee(_hamming(), 4)
  return hammingGenerators[3]
  
for i in hamming():
  print i,
  sys.stdout.flush()
*** END SNAP
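[Editor's note] The snippet above is Python 2 (`iterator.next()`, `print` as a statement). A Python 3 rendering of the same tee-based structure, using generator expressions in place of `imap`/`lambda`, might look like this; it is a sketch for modern readers, not part of the original post:

```python
from itertools import islice, tee

def imerge(xs, ys):
    # Merge two sorted, duplicate-free iterators into one,
    # emitting each common value only once.
    x = next(xs)
    y = next(ys)
    while True:
        if x == y:
            yield x
            x = next(xs)
            y = next(ys)
        elif x < y:
            yield x
            x = next(xs)
        else:
            yield y
            y = next(ys)

def hamming():
    def _hamming():
        yield 1
        # Each branch reads its own tee copy of the output stream.
        h2, h3, h5 = generators[0], generators[1], generators[2]
        for n in imerge((2 * h for h in h2),
                        imerge((3 * h for h in h3),
                               (5 * h for h in h5))):
            yield n
    # By the time _hamming's body runs (on the first next()),
    # 'generators' is already bound.
    generators = tee(_hamming(), 4)
    return generators[3]

print(list(islice(hamming(), 15)))
```

The first fifteen values printed are 1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24, matching the sequence discussed in the thread.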

Here's an implementation of the Fibonacci sequence that uses tee:

*** BEGIN SNAP
from itertools import tee
import sys

def fib():
  def _fib():
yield 1
yield 1
curGen = fibGenerators[0]
curAheadGen = fibGenerators[1]
curAheadGen.next()
while True:
  yield curGen.next() + curAheadGen.next()
  fibGenerators = tee(_fib(), 3)
  return fibGenerators[2]
  
for n in fib():
  print n,
  sys.stdout.flush()
*** END SNAP
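[Editor's note] In Python 3 the same trick reads as follows, with `next(it)` replacing `it.next()`; again a sketch assuming nothing beyond the standard library:

```python
from itertools import islice, tee

def fib():
    def _fib():
        yield 1
        yield 1
        cur = fibGenerators[0]    # lags one element behind...
        ahead = fibGenerators[1]  # ...this copy of the same stream
        next(ahead)               # advance 'ahead' so it leads 'cur' by one
        while True:
            yield next(cur) + next(ahead)
    fibGenerators = tee(_fib(), 3)
    return fibGenerators[2]

print(list(islice(fib(), 10)))
```

The first ten values are 1, 1, 2, 3, 5, 8, 13, 21, 34, 55; because the two helper copies always lag the consumer copy, tee serves them from its internal buffer and no deadlock can occur.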

Francis Girard
FRANCE


On Monday, 24 January 2005 at 14:09, Francis Girard wrote:
 Ok, I think that the bottom line is this :

 For all the algorithms that run after their tail in an FP way, like the
 Hamming problem, or the Fibonacci sequence (but unlike the Sieve of
 Eratosthenes -- there's a subtle difference), i.e. all those algorithms that
 typically rely upon recursion to get the beginning of the generated
 elements in order to generate the next one ...

 ... so for this family of algorithms, we have, somehow, to abandon
 recursion, and to explicitly maintain one or many lists of what had been
 generated.

 One solution for this is suggested in test_generators.py. But I think
 that this solution is evil as it looks like recursion is used but it is not
 and it depends heavily on how the m235 function (for example) is invoked.
 Furthermore, it is _NOT_ memory efficient as claimed: it leaks! It
 internally maintains a list of generated results that grows forever, even
 though it is useless to keep results that have already been consumed. Just for a
 gross estimate, on my system, percentage of memory usage for this program
 grows rapidly, reaching 21.6 % after 5 minutes. CPU usage varies between
 31%-36%. Ugly and inefficient.

 The solution of Jeff Epler is far more elegant. The itertools.tee
 function does just what we want. It internally maintains a list of what had
 been generated and deletes the consumed elements as the algo unrolls. To
 follow with my gross estimate, memory usage grows from 1.2% to 1.9% after 5
 minutes (probably only affected by the growing size of Huge Integer). CPU
 usage varies between 27%-29%.
 Beautiful and efficient.

 You might think that we shouldn't be that fussy about memory usage when
 generating 100-digit numbers, but we're talking about a whole family of
 useful FP algorithms; and the best way to implement them should be
 established.

 For this family of algorithms, itertools.tee is the way to go.

 I think that the semi-tutorial in test_generators.py should be updated to
 use tee. Or, at least, a severe warning comment should be written.

 Thank you,

 Francis Girard
 FRANCE

 On Sunday, 23 January 2005 at 23:27, Jeff Epler wrote:
  Your formulation in Python is recursive (hamming calls hamming()) and I
  think that's why your program gives up fairly early.
 
  Instead, use itertools.tee() [new in Python 2.4, or search the internet
  for an implementation that works in 2.3] to give a copy of the output
  sequence to each multiply by N function as well as one to be the
  return value.
 
  Here's my implementation, which matched your list early on but
  effortlessly reached larger values.  One large value it printed was
  6412351813189632 (a Python long) which indeed has only the distinct
  prime factors 2 and 3. (2**43 * 3**6)
 
  Jeff
 
  from itertools import tee
  import sys
 
  def imerge(xs, ys):
      x = xs.next()
      y = ys.next()
      while True:
          if x == y:
              yield x
              x = xs.next()
              y = ys.next()
          elif x < y:
              yield x
              x = xs.next()
          else:
              yield y
              y = ys.next()
 
  def hamming():
      def _hamming(j, k):
          yield 1
          hamming = generators[j]
          for i in hamming:
              yield i * k
      generators = []
      generator = imerge(imerge(_hamming(0, 2), _hamming(1, 3)), _hamming(2, 5))
      generators[:] = tee(generator, 4)
      return generators[3]

  for i in hamming():
      print i,
      sys.stdout.flush()

--

Re: Classical FP problem in python : Hamming problem

2005-01-24 Thread Tim Peters
[Francis Girard]
 For all the algorithms that run after their tail in an FP way, like the
 Hamming problem, or the Fibonacci sequence (but unlike the Sieve of Eratosthenes
 -- there's a subtle difference), i.e. all those algorithms that typically
 rely upon recursion to get the beginning of the generated elements in order
 to generate the next one ...

 ... so for this family of algorithms, we have, somehow, to abandon recursion,
 and to explicitly maintain one or many lists of what had been generated.
 
 One solution for this is suggested in test_generators.py. But I think that
 this solution is evil as it looks like recursion is used but it is not and it
 depends heavily on how the m235 function (for example) is invoked.

Well, yes -- "Heh.  Here's one way to get a shared list, complete with
an excruciating namespace renaming trick" was intended to warn you in
advance that it wasn't pretty <wink>.

 Furthermore, it is _NOT_ memory efficient as claimed: it leaks!

Yes.  But there are two solutions to the problem in that file, and the
second one is in fact extremely memory-efficient compared to the first
one.  Efficiency here was intended in a relative sense.

 It internally maintains a list of generated results that grows forever, even
 though it is useless to keep results that have already been consumed. Just for a gross
 estimate, on my system, percentage of memory usage for this program grows
 rapidly, reaching 21.6 % after 5 minutes. CPU usage varies between 31%-36%.
 Ugly and inefficient.

Try the first solution in the file for a better understanding of what
"inefficient" means <wink>.

 The solution of Jeff Epler is far more elegant. The itertools.tee function
 does just what we want. It internally maintains a list of what had been
 generated and deletes the consumed elements as the algo unrolls. To follow
 with my gross estimate, memory usage grows from 1.2% to 1.9% after 5 minutes
 (probably only affected by the growing size of Huge Integer). CPU usage
 varies between 27%-29%.
 Beautiful and efficient.

Yes, it is better.  tee() didn't exist when generators (and
test_generators.py) were written, so of course nothing in the test
file uses them.

 You might think that we shouldn't be that fussy about memory usage when
 generating 100-digit numbers, but we're talking about a whole family of
 useful FP algorithms; and the best way to implement them should be
 established.

Possibly -- there really aren't many Pythonistas who care about this.

 For this family of algorithms, itertools.tee is the way to go.

 I think that the semi-tutorial in test_generators.py should be updated to
 use tee. Or, at least, a severe warning comment should be written.

Please submit a patch.  The purpose of that file is to test
generators, so you should add a third way of doing it, not replace the
two ways already there.  It should also contain prose explaining why
the third way is better (just as there's prose now explaining why the
second way is better than the first).
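[Editor's note] The O(n log k) priority-queue merge suggested earlier in the thread later landed in the standard library as `heapq.merge` (added in Python 2.6, after this exchange). A generic k-way `imerge` that also collapses duplicates can then be sketched in a few lines; this illustration is mine, not from the thread:

```python
import heapq
from itertools import count, groupby, islice

def imerge_many(*iterables):
    # k-way merge of sorted iterables in O(n log k) via a heap;
    # groupby collapses each run of equal values to a single one.
    for value, _run in groupby(heapq.merge(*iterables)):
        yield value

# Distinct values among the multiples of 2, 3 and 5:
merged = imerge_many((2 * n for n in count(1)),
                     (3 * n for n in count(1)),
                     (5 * n for n in count(1)))
print(list(islice(merged, 10)))
```

Both `heapq.merge` and `groupby` are lazy, so this works on infinite inputs; the first ten merged values here are 2, 3, 4, 5, 6, 8, 9, 10, 12, 14.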
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Classical FP problem in python : Hamming problem

2005-01-24 Thread Francis Girard
OK, I'll submit the patch with the prose pretty soon.
Thank you
Francis Girard
FRANCE

On Tuesday, 25 January 2005 at 04:29, Tim Peters wrote:
 [Francis Girard]

  For all the algorithms that run after their tail in an FP way, like the
  Hamming problem, or the Fibonacci sequence (but unlike the Sieve of
  Eratosthenes -- there's a subtle difference), i.e. all those algorithms
  that typically rely upon recursion to get the beginning of the generated
  elements in order to generate the next one ...
 
  ... so for this family of algorithms, we have, somehow, to abandon
  recursion, and to explicitly maintain one or many lists of what had been
  generated.
 
  One solution for this is suggested in test_generators.py. But I think
  that this solution is evil as it looks like recursion is used but it is
  not and it depends heavily on how the m235 function (for example) is
  invoked.

 Well, yes -- "Heh.  Here's one way to get a shared list, complete with
 an excruciating namespace renaming trick" was intended to warn you in
 advance that it wasn't pretty <wink>.

  Furthermore, it is _NOT_ memory efficient as claimed: it leaks!

 Yes.  But there are two solutions to the problem in that file, and the
 second one is in fact extremely memory-efficient compared to the first
 one.  Efficiency here was intended in a relative sense.

  It internally maintains a list of generated results that grows forever,
  even though it is useless to keep results that have already been consumed. Just
  for a gross estimate, on my system, percentage of memory usage for this
  program grows rapidly, reaching 21.6 % after 5 minutes. CPU usage varies
  between 31%-36%. Ugly and inefficient.

 Try the first solution in the file for a better understanding of what
 "inefficient" means <wink>.

  The solution of Jeff Epler is far more elegant. The itertools.tee
  function does just what we want. It internally maintains a list of what
  had been generated and deletes the consumed elements as the algo
  unrolls. To follow with my gross estimate, memory usage grows from 1.2%
  to 1.9% after 5 minutes (probably only affected by the growing size of
  Huge Integer). CPU usage varies between 27%-29%.
  Beautiful and efficient.

 Yes, it is better.  tee() didn't exist when generators (and
 test_generators.py) were written, so of course nothing in the test
 file uses them.

  You might think that we shouldn't be that fussy about memory usage when
  generating 100-digit numbers, but we're talking about a whole family of
  useful FP algorithms; and the best way to implement them should be
  established.

 Possibly -- there really aren't many Pythonistas who care about this.

  For this family of algorithms, itertools.tee is the way to go.
 
  I think that the semi-tutorial in test_generators.py should be updated
  to use tee. Or, at least, a severe warning comment should be written.

 Please submit a patch.  The purpose of that file is to test
 generators, so you should add a third way of doing it, not replace the
 two ways already there.  It should also contain prose explaining why
 the third way is better (just as there's prose now explaining why the
 second way is better than the first).

--
http://mail.python.org/mailman/listinfo/python-list


Re: Classical FP problem in python : Hamming problem

2005-01-23 Thread Jeff Epler
Your formulation in Python is recursive (hamming calls hamming()) and I
think that's why your program gives up fairly early.

Instead, use itertools.tee() [new in Python 2.4, or search the internet
for an implementation that works in 2.3] to give a copy of the output
sequence to each multiply by N function as well as one to be the
return value.

Here's my implementation, which matched your list early on but
effortlessly reached larger values.  One large value it printed was
6412351813189632 (a Python long) which indeed has only the distinct
prime factors 2 and 3. (2**43 * 3**6)

Jeff

from itertools import tee
import sys

def imerge(xs, ys):
    x = xs.next()
    y = ys.next()
    while True:
        if x == y:
            yield x
            x = xs.next()
            y = ys.next()
        elif x < y:
            yield x
            x = xs.next()
        else:
            yield y
            y = ys.next()

def hamming():
    def _hamming(j, k):
        yield 1
        hamming = generators[j]
        for i in hamming:
            yield i * k
    generators = []
    generator = imerge(imerge(_hamming(0, 2), _hamming(1, 3)), _hamming(2, 5))
    generators[:] = tee(generator, 4)
    return generators[3]

for i in hamming():
    print i,
    sys.stdout.flush()
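[Editor's note] The factorization claim above is easy to check; a quick Python 3 verification, not part of the original message:

```python
n = 6412351813189632
assert n == 2**43 * 3**6  # the factorization Jeff reported

# Strip out all factors of 2 and 3; only 1 should remain,
# confirming that no other prime divides n.
for p in (2, 3):
    while n % p == 0:
        n //= p
print(n)  # → 1
```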


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Classical FP problem in python : Hamming problem

2005-01-23 Thread Tim Peters
[Francis Girard]
 ...
 In the meantime, I couldn't resist testing the new Python features for
 laziness on a classical FP problem, i.e. the Hamming problem.
...
 Nevertheless, while the Haskell version prints the Hamming sequence for as long
 as I can stand it, and with very little memory consumption, the Python version
 only prints:
...
 I think I should not get this kind of behaviour, even using recursion, since
 I'm only using lazy constructs all the time. At least, I would expect the
 program to produce many more results before surrendering.

 What's going on ?

You can find an explanation in Lib/test/test_generators.py, which uses
this problem as an example; you can also find one efficient way to
write it there.
-- 
http://mail.python.org/mailman/listinfo/python-list