Re: flatten a list of list

2009-08-17 Thread Tim Cook
On Aug 16, 6:47 am, Terry terry.yin...@gmail.com wrote:
 Hi,

 Is there a simple way (the pythonic way) to flatten a list of list?
 rather than my current solution:

 new_list=[]
 for l in list_of_list:
     new_list.extend(l)

 or,

 new_list=reduce(lambda x,y:x.extend(y), list_of_list)

 br, Terry

Well, This is not simple but it is comprhensive in that it has to do
several things.  I am using it to decompose deeply nested lists from
Pyparsing output that may have strings in a variety of languages.
Performance wise I do not know how it stacks up against the other
examples but it works for me.  :-)

def flatten(x):
flatten(sequence) - list

Returns a single, flat list which contains all elements retrieved
from the sequence and all recursively contained sub-sequences
(iterables). All strings are converted to unicode.


result = []
for el in x:
#if isinstance(el, (list, tuple)):
if hasattr(el, __iter__) and not isinstance(el, basestring):
result.extend(flatten(el))
else:
result.append(el)


# all strings must be unicode
rtnlist=[]
for x in result:
if isinstance(x,str):
# replace any brackets so Python doesn't think it's a list
and we still have a seperator.
x=x.replace('[','_')
x=x.replace(']','_')
try:
x=unicode(x, utf8)  # need more decode types here
except UnicodeDecodeError:
x=unicode(x, latin1)
except UnicodeDecodeError:
x=unicode(x,iso-8859-1)
except UnicodeDecodeError:
x=unicode(x,eucJP)

rtnlist.append(x)

return rtnlist
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: flatten a list of list

2009-08-16 Thread Chris Rebert
On Sun, Aug 16, 2009 at 5:47 AM, Terryterry.yin...@gmail.com wrote:
 Hi,

 Is there a simple way (the pythonic way) to flatten a list of list?
 rather than my current solution:

 new_list=[]
 for l in list_of_list:
    new_list.extend(l)

 or,

 new_list=reduce(lambda x,y:x.extend(y), list_of_list)

#only marginally better:
from operator import add
new_list = reduce(add, list_of_list)

#from the itertools recipes:
from itertools import chain
def flatten(listOfLists):
return list(chain.from_iterable(listOfLists))

Cheers,
Chris
-- 
http://blog.rebertia.com
-- 
http://mail.python.org/mailman/listinfo/python-list


flatten a list of list

2009-08-16 Thread Terry
Hi,

Is there a simple way (the pythonic way) to flatten a list of list?
rather than my current solution:

new_list=[]
for l in list_of_list:
new_list.extend(l)

or,

new_list=reduce(lambda x,y:x.extend(y), list_of_list)

br, Terry
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: flatten a list of list

2009-08-16 Thread Michael Fötsch

Terry wrote:

Is there a simple way (the pythonic way) to flatten a list of list?


This is probably the shortest it can get:

sum(list_of_lists, [])


Kind Regards,
M.F.
--
http://mail.python.org/mailman/listinfo/python-list


Re: flatten a list of list

2009-08-16 Thread Steven D'Aprano
On Sun, 16 Aug 2009 05:55:48 -0400, Chris Rebert wrote:

 On Sun, Aug 16, 2009 at 5:47 AM, Terryterry.yin...@gmail.com wrote:
 Hi,

 Is there a simple way (the pythonic way) to flatten a list of list?
 rather than my current solution:

 new_list=[]
 for l in list_of_list:
    new_list.extend(l)

 or,

 new_list=reduce(lambda x,y:x.extend(y), list_of_list)
 
 #only marginally better:
 from operator import add
 new_list = reduce(add, list_of_list)

Surely that's going to be O(N**2)?

I'd predict that would perform even worse than O(N**2) string 
concatenation, on account that a string of size N has to allocate space 
for N bytes, but a list of size N allocates space for N pointers (each 4 
bytes, or 8 depending on the system), rounded up to the nearest power of 
two. Also CPython can optimize string concatenation to O(N) under some 
circumstances, but not lists.


 from timeit import Timer
 setup = \\
... L = [ ([None]*5000) for x in xrange(%d) ]
... from operator import add
... 
 Timer(reduce(add, L), setup % 4).repeat(number=1000)
[0.53808808326721191, 0.51404905319213867, 0.51157188415527344]
 Timer(reduce(add, L), setup % 8).repeat(number=1000)
[2.1178171634674072, 3.8830602169036865, 4.72245192527771]
 Timer(reduce(add, L), setup % 16).repeat(number=1000)
[17.190337896347046, 21.572744131088257, 21.584265947341919]


Looks like O(N**2) behaviour to me.



-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: flatten a list of list

2009-08-16 Thread Chris Rebert
On Sun, Aug 16, 2009 at 6:49 AM, Steven
D'Apranost...@remove-this-cybersource.com.au wrote:
 On Sun, 16 Aug 2009 05:55:48 -0400, Chris Rebert wrote:
 On Sun, Aug 16, 2009 at 5:47 AM, Terryterry.yin...@gmail.com wrote:
 Hi,

 Is there a simple way (the pythonic way) to flatten a list of list?
 rather than my current solution:

 new_list=[]
 for l in list_of_list:
    new_list.extend(l)

 or,

 new_list=reduce(lambda x,y:x.extend(y), list_of_list)

 #only marginally better:
 from operator import add
 new_list = reduce(add, list_of_list)

 Surely that's going to be O(N**2)?

The OP asked for simple, not best, most proper, or fastest. My
comment was intended to mean that the code was marginally *simpler*,
not faster.

Cheers,
Chris
-- 
http://blog.rebertia.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: flatten a list of list

2009-08-16 Thread Steven D'Aprano
On Sun, 16 Aug 2009 12:03:53 +0200, Michael Fötsch wrote:

 Terry wrote:
 Is there a simple way (the pythonic way) to flatten a list of list?
 
 This is probably the shortest it can get:
 
 sum(list_of_lists, [])


That's also O(N**2).


 from timeit import Timer
 setup = L = [ ([None]*5000) for x in xrange(%d) ]
 Timer(sum(L, []), setup % 4).repeat(number=1000)
[0.6070549488067627, 0.54354715347290039, 0.54686999320983887]
 Timer(sum(L, []), setup % 8).repeat(number=1000)
[2.1285719871520996, 3.6722278594970703, 4.0785009860992432]
 Timer(sum(L, []), setup % 16).repeat(number=1000)
[18.370341062545776, 20.40509295463562, 21.871708869934082]



-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: flatten a list of list

2009-08-16 Thread Terry
On Aug 16, 6:59 pm, Chris Rebert c...@rebertia.com wrote:
 On Sun, Aug 16, 2009 at 6:49 AM, Steven





 D'Apranost...@remove-this-cybersource.com.au wrote:
  On Sun, 16 Aug 2009 05:55:48 -0400, Chris Rebert wrote:
  On Sun, Aug 16, 2009 at 5:47 AM, Terryterry.yin...@gmail.com wrote:
  Hi,

  Is there a simple way (the pythonic way) to flatten a list of list?
  rather than my current solution:

  new_list=[]
  for l in list_of_list:
     new_list.extend(l)

  or,

  new_list=reduce(lambda x,y:x.extend(y), list_of_list)

  #only marginally better:
  from operator import add
  new_list = reduce(add, list_of_list)

  Surely that's going to be O(N**2)?

 The OP asked for simple, not best, most proper, or fastest. My
 comment was intended to mean that the code was marginally *simpler*,
 not faster.

 Cheers,
 Chris
 --http://blog.rebertia.com

Well, if possible, I'd like not only to know a simple solution, but
also the 'best', the 'most proper' and the 'fastest':-)

If they are not the same.

br, Terry
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: flatten a list of list

2009-08-16 Thread Steven D'Aprano
On Sun, 16 Aug 2009 02:47:42 -0700, Terry wrote:

 Hi,
 
 Is there a simple way (the pythonic way) to flatten a list of list?
 rather than my current solution:
 
 new_list=[]
 for l in list_of_list:
 new_list.extend(l)

I don't think that scales terribly well. In my testing, it performs about 
as badly as the O(N**2) behaviours others have suggested (using sum or 
reduce with add) -- perhaps a bit worse for small N, but not quite as 
badly for large N.


 new_list=reduce(lambda x,y:x.extend(y), list_of_list)

That doesn't even work.

 list_of_list = [ [1,2,3], [2, 4, 8] ]
 new_list=reduce(lambda x,y:x.extend(y), list_of_list)
 new_list is None
True



Chris' suggestion using itertools seems pretty good:

 from timeit import Timer
 setup = \\
... L = [ [None]*5000 for _ in xrange(%d) ]
... from itertools import chain
... 
 Timer(list(chain.from_iterable(L)), setup % 4).repeat(number=1000)
[0.61839914321899414, 0.61799716949462891, 0.62065696716308594]
 Timer(list(chain.from_iterable(L)), setup % 8).repeat(number=1000)
[1.2618398666381836, 1.3385050296783447, 3.9113419055938721]
 Timer(list(chain.from_iterable(L)), setup % 16).repeat(number=1000)
[3.1349358558654785, 4.8554730415344238, 5.431217987061]


-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: flatten a list of list

2009-08-16 Thread Steven D'Aprano
On Sun, 16 Aug 2009 06:59:52 -0400, Chris Rebert wrote:

 Surely that's going to be O(N**2)?
 
 The OP asked for simple, not best, most proper, or fastest. My
 comment was intended to mean that the code was marginally *simpler*, not
 faster.

Fair enough, but he also asked for Pythonic, and while some people might 
argue that terrible performance is Pythonic, I hope you wouldn't be one 
of them! :)

If it soothes your ruffled sense of honour *wink*, I think your solution 
with itertools.chain is probably the best so far.


-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: flatten a list of list

2009-08-16 Thread Chris Rebert
On Sun, Aug 16, 2009 at 7:31 AM, Steven
D'Apranost...@remove-this-cybersource.com.au wrote:
 On Sun, 16 Aug 2009 06:59:52 -0400, Chris Rebert wrote:
 Surely that's going to be O(N**2)?

 The OP asked for simple, not best, most proper, or fastest. My
 comment was intended to mean that the code was marginally *simpler*, not
 faster.

 Fair enough, but he also asked for Pythonic, and while some people might
 argue that terrible performance is Pythonic, I hope you wouldn't be one
 of them! :)

Indeed not. :) I expected it would be worse performance-wise than the
OP's original due to all the intermediate lists that get produced; it
shouldn't be used in optimized production code.

 If it soothes your ruffled sense of honour *wink*, I think your solution
 with itertools.chain is probably the best so far.

Except it's not really my solution, it's whoever put it in the
itertools docs's. :(
But I'm glad to been able to help by pointing the recipe out. :-)

Cheers,
Chris
-- 
http://blog.rebertia.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: flatten a list of list

2009-08-16 Thread Bearophile
Chris Rebert:
 The OP asked for simple, not best, most proper, or fastest. My
 comment was intended to mean that the code was marginally *simpler*,
 not faster.

Yep, the OP has asked for simple code. But often this is not the right
way to solve this situation. A better way is to create (or copy) a
flatten that's efficient and well tested  debugged, put it into some
library of utilities, and then use importuse it when it's needed.

Bye,
bearophile
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: flatten a list of list

2009-08-16 Thread Scott David Daniels

Steven D'Aprano wrote:

On Sun, 16 Aug 2009 02:47:42 -0700, Terry wrote:

Is there a simple way (the pythonic way) to flatten a list of list?

Chris' suggestion using itertools seems pretty good:


from timeit import Timer
setup = \\

... L = [ [None]*5000 for _ in xrange(%d) ]
... from itertools import chain
... 

Timer(list(chain.from_iterable(L)), setup % 4).repeat(number=1000)

[0.61839914321899414, 0.61799716949462891, 0.62065696716308594]

Timer(list(chain.from_iterable(L)), setup % 8).repeat(number=1000)

[1.2618398666381836, 1.3385050296783447, 3.9113419055938721]

Timer(list(chain.from_iterable(L)), setup % 16).repeat(number=1000)

[3.1349358558654785, 4.8554730415344238, 5.431217987061]


OK, it definitely helps to get a size estimate before building:

 setup = \\
L = [ [None]*5000 for _ in xrange(%d) ]
import itertools

class Holder(object):
def __init__(self, list_of_lists):
self._list = list_of_lists
def __iter__(self):
return itertools.chain.from_iterable(self._list)
def __len__(self):
return sum(len(x) for x in self._list)


 timeit.Timer(list(Holder(L)), setup % 4).repeat(number=1000)
[0.59912279353940789, 0.59505886921382967, 0.59474989139681611]
 timeit.Timer(list(Holder(L)), setup % 8).repeat(number=1000)
[1.1898235669617208, 1.194797383466323, 1.1945367358141823]
 timeit.Timer(list(Holder(L)), setup % 16).repeat(number=1000)
[2.4244464031043123, 2.4261885239604482, 2.4050011942858589]

vs straight chain.from_iterable (on my machine):

[0.7828263089303249, 0.79326171343005925, 0.80967664884783019]
[1.499510971366476, 1.5263249938190455, 1.5599706107899181]
[3.4427520816193109, 3.632409426337702, 3.5290488036887382]

--Scott David Daniels
scott.dani...@acm.org
--
http://mail.python.org/mailman/listinfo/python-list


Re: flatten a list of list

2009-08-16 Thread Francesco Bochicchio
On Aug 16, 1:25 pm, Steven D'Aprano st...@remove-this-
cybersource.com.au wrote:

...

 Chris' suggestion using itertools seems pretty good:

  from timeit import Timer
  setup = \\

 ... L = [ [None]*5000 for _ in xrange(%d) ]
 ... from itertools import chain
 ...  Timer(list(chain.from_iterable(L)), setup % 
 4).repeat(number=1000)

 [0.61839914321899414, 0.61799716949462891, 0.62065696716308594] 
 Timer(list(chain.from_iterable(L)), setup % 8).repeat(number=1000)

 [1.2618398666381836, 1.3385050296783447, 3.9113419055938721] 
 Timer(list(chain.from_iterable(L)), setup % 16).repeat(number=1000)

 [3.1349358558654785, 4.8554730415344238, 5.431217987061]

 --
 Steven

I had a peek at itertools ( which is a C module btw) and realized that
chain solves the problem by creating a chain object,
which is a sort of generator. Both creating the chain object and
converting the chain object to a list seem to be O(N), so
the whole is O(N) too ...

Then I tried this pure python version:

# - CODE
from timeit import Timer
setup = \\
L = [ [None]*5000 for _ in range(%d) ]
def pychain( list_of_list ):
for l in list_of_list:
for elem in l:
yield elem


print( Timer(list(pychain(L)), setup % 4).repeat(number=1000))
print( Timer(list(pychain(L)), setup % 8).repeat(number=1000))
print( Timer(list(pychain(L)), setup % 16).repeat(number=1000))
# - END CODE


and got times that seem to confirm it :

[2.818755865097046, 2.7880589962005615, 2.79232120513916]
[5.588631868362427, 5.588244915008545, 5.587780952453613]
[11.620548009872437, 11.39465618133545, 11.40834903717041]

For reference, here are the times of the itertools.chain solution on
my laptop:

[0.6518809795379639, 0.6491332054138184, 0.6483590602874756]
[1.3188841342926025, 1.3173959255218506, 1.3207998275756836]
[2.7200729846954346, 2.5402050018310547, 2.543621063232422]

All this with Python 3.1 compiled from source on Xubuntu 8.10.

Ciao
-
FB
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: flatten a list of list

2009-08-16 Thread Alan G Isaac
On 8/16/2009 5:47 AM Terry apparently wrote:
 Is there a simple way (the pythonic way) to flatten a list of list? 
 rather than my current solution:

 new_list=[] 
 for l in list_of_list:
 new_list.extend(l)


new_list = list(xi for lst in list_of_list for xi in lst)

hth,
Alan Isaac
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: flatten a list of list

2009-08-16 Thread Paul Rubin
Terry terry.yin...@gmail.com writes:
 Is there a simple way (the pythonic way) to flatten a list of list?
 rather than my current solution:
 
 new_list=[]
 for l in list_of_list:
 new_list.extend(l)

from itertools import chain
new_list = list(chain(list_of_list))
-- 
http://mail.python.org/mailman/listinfo/python-list