Francesc Alted <faltet <at> pytables.org> writes:

> On Saturday 05 December 2009 02:06:47 Jon Olav Vik wrote:
> > I have many sorted tables of identical structure and would like to combine
> >  them into a single, sorted table. Python >= 2.6 offers
> >  heapq.merge(*iterators), but I cannot quite get it to work. Could somebody
> >  please tell me how to convert a list of Table instances into a list of
> >  iterators that heapq.merge() can use?
> > 
> > I'm particularly puzzled by the following (can be run after the code
> >  below). Note that t[0] is a Table object.
> > 
> > In [74]: for row in t[0]: print row
> >    ....:
> > (0, 100.0)
> > (3, 103.0)
> > (6, 106.0)
> > (9, 109.0)
> > 
> > In [75]: [row for row in t[0]]
> > Out[75]: [(9, 109.0), (9, 109.0), (9, 109.0), (9, 109.0)]
> 
> Yup, this is a FAQ.  The `row` object is in fact an *accessor* to data, not a 
> data container by itself.  This is the reason why:
> 
> for row in t[0]: print row
> 
> works well, because the accessor is *used* on every iteration.  However, in:
> 
> [row for row in t[0]]
> 
> you are only appending the *same* instance of the accessor each time, so every
> element of the list ends up showing the last row that was read.
> 
> Instead, you may want to use:
> 
> [row[:] for row in t[0]]
> 
> or, if you want NumPy objects instead of tuples:
> 
> [row.fetch_all_fields() for row in t[0]]
> 
> See http://www.pytables.org/docs/manual/ch04.html#RowClassDescr for more info.

Thank you very much. This is getting so subtle it hurts my brain, but I finally 
figured out what I need.

[row[:] for row in t[i]]

is very close, but it builds the entire list in memory before anything can use
it. What I need is

(row[:] for row in t[i]) 

a generator expression that wraps individual items generated from t[i], one at 
a time. And since I want to merge() several such generators, my final monster 
is:

In [29]: [r for r in heapq.merge(*[(row[:] for row in ti) for ti in t])]
Out[29]:
[(0, 100.0),
 (1, 101.0),
 (2, 102.0),
 (3, 103.0),
 (4, 104.0),
 (5, 105.0),
 (6, 106.0),
 (7, 107.0),
 (8, 108.0),
 (9, 109.0)]

where the argument unpacked into heapq.merge() is a list of generators:

In [30]: [(row[:] for row in ti) for ti in t]
Out[30]:
[<generator object <genexpr> at 0x1bdbde10>,
 <generator object <genexpr> at 0x1bdbdd20>,
 <generator object <genexpr> at 0x1bf50280>]
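
For anyone who wants to try the merge without a PyTables file at hand, here is
a minimal sketch. Plain lists of tuples stand in for the sorted tables; the
name `tables` and the contents of the second and third list are my assumption,
chosen only to be consistent with the output above. With real Table objects,
each inner generator would be (row[:] for row in ti) instead.

```python
import heapq

# Stand-ins for three sorted tables (assumed data, not read from PyTables).
tables = [
    [(0, 100.0), (3, 103.0), (6, 106.0), (9, 109.0)],
    [(1, 101.0), (4, 104.0), (7, 107.0)],
    [(2, 102.0), (5, 105.0), (8, 108.0)],
]

# One generator per table; no row is produced until merge() asks for it.
sources = [(tuple(row) for row in ti) for ti in tables]

# heapq.merge() is lazy as well: it yields one row at a time, in sorted order,
# holding only the current head of each input.
merged = heapq.merge(*sources)

first = next(merged)   # pulls just enough from each source to find the minimum
rest = list(merged)    # consume the remainder (only to inspect it here)
```

In real use one would probably loop over `merged` and append each row to the
output table, rather than calling list() on it, so that the full result never
sits in memory.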

(I never imagined I'd learn so much about the fine distinctions between [list 
comprehensions] and (generator expressions) in Python!)
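
To convince myself of what was going on, the behaviour can be imitated in pure
Python. This is only a sketch: `FakeRow` and `iter_rows` are hypothetical
stand-ins I made up to mimic an accessor that is reused on every iteration;
they are not part of PyTables and say nothing about how Row is implemented.

```python
class FakeRow(object):
    """Hypothetical accessor: a single object whose contents are swapped in place."""

    def __init__(self):
        self._data = None

    def __getitem__(self, key):
        # row[:] returns the underlying tuple, i.e. a value that will not change.
        return self._data[key]

    def __repr__(self):
        return repr(self._data)


def iter_rows(records):
    """Yield the *same* FakeRow instance for every record."""
    row = FakeRow()
    for rec in records:
        row._data = rec
        yield row


records = [(0, 100.0), (3, 103.0), (6, 106.0), (9, 109.0)]

# Four references to one accessor, which now points at the last record.
aliased = [row for row in iter_rows(records)]

# Four independent tuples, but the whole list is built at once.
copied = [row[:] for row in iter_rows(records)]

# Same copies, produced one at a time and only on demand.
lazy = (row[:] for row in iter_rows(records))
first_lazy = next(lazy)
```

Printing `aliased` shows the last record repeated four times, just like
Out[75] above, while `copied` and `lazy` give the rows I expected.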


Best regards,
Jon Olav


