On Thursday 14 May 2009 12:26:33, David Fokkema wrote:
> Hi list,
>
> Why is this different?
>
> [x for x in table]
> or
> [x for x in table.iterrows()]
>
> (which returns the first row over and over)
>
> and
>
> for x in table:
>     x
>
> or even
>
> [x[0] for x in table]
>
> (which returns all different rows)
>
> Probably there's some __iter__ or other magic going on, but this is not
> intuitive for me. Is this a (feature request) bug or am I simply missing
> something?

No, although it might seem so, there is no magic there. Rather, you should
regard the Row class as a data accessor more than a data container. As you
already know, Row is meant to be used inside iterators, providing access to
the data in the current row of the iterator (BTW, the current row is
accessible via the Row.nrow property). So, if what you want is the actual
data, you need to explicitly call a getter on Row.

Working with an example will clarify things. Let's consider the following
table:

In [24]: t
Out[24]:
/t (Table(10,)) ''
  description := {
  "f0": Int64Col(shape=(), dflt=0, pos=0),
  "f1": Float64Col(shape=(), dflt=0.0, pos=1)}
  byteorder := 'little'
  chunkshape := (512,)

Using the iterator without a getter in a classic loop gives:

In [26]: for r in t: r
   ....:
Out[26]: (0, 0.0)
Out[26]: (1, 1.0)
Out[26]: (2, 2.0)
Out[26]: (3, 3.0)
Out[26]: (4, 4.0)
Out[26]: (5, 5.0)
Out[26]: (6, 6.0)
Out[26]: (7, 7.0)
Out[26]: (8, 8.0)
Out[26]: (9, 9.0)

However, using the same iterator in a list comprehension gives:

In [25]: [r for r in t]
Out[25]:
[(9, 9.0),
 (9, 9.0),
 (9, 9.0),
 (9, 9.0),
 (9, 9.0),
 (9, 9.0),
 (9, 9.0),
 (9, 9.0),
 (9, 9.0),
 (9, 9.0)]

Why the difference? Well, the former returns a series of Row objects that
the IPython shell converts into a string representation *immediately* on
each iteration, while the latter returns a list of references to the *same*
Row object, which are not converted into their string representation until
the entire list has been built. By the time the list is displayed, the
iterator has already finished, and hence all the references to the Row
object fetch the data pointed to by its internal row counter, which is 9 by
that time.
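
The aliasing described above does not depend on PyTables. As a minimal
sketch, the following pure-Python stand-in reuses one accessor object for
every iteration, which is how the explanation above describes Row behaving.
The RowAccessor class and its internals are hypothetical and written only
for illustration; this is not how PyTables implements Row.

```python
class RowAccessor:
    """Hypothetical stand-in for Row: one reusable object that only
    points at the current row of the underlying data."""

    def __init__(self, data):
        self._data = data
        self.nrow = -1  # index of the row currently pointed at

    def __iter__(self):
        for i in range(len(self._data)):
            self.nrow = i
            yield self  # the very same object on every iteration

    def __getitem__(self, key):
        # A "getter": returns actual data taken from the current row.
        return self._data[self.nrow][key]

    def __repr__(self):
        # The representation is computed from wherever nrow points *now*.
        return repr(self._data[self.nrow])


data = [(i, float(i)) for i in range(10)]

refs = [r for r in RowAccessor(data)]       # ten references to one object
copies = [r[:] for r in RowAccessor(data)]  # ten independent tuples

print(refs[0] is refs[9])  # True
print(refs[:3])            # [(9, 9.0), (9, 9.0), (9, 9.0)]
print(copies[:3])          # [(0, 0.0), (1, 1.0), (2, 2.0)]
```

The first list holds the same object ten times, so its repr shows the last
row everywhere; the second list holds the data extracted at each step.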

This is easier to see by inspecting the items of the list built from t:

In [35]: l = [r for r in t]

In [36]: type(l[0])
Out[36]: <type 'tables.tableExtension.Row'>

In [37]: l[0].nrow
Out[37]: 9

In [38]: l[1].nrow
Out[38]: 9

In [39]: l[9].nrow
Out[39]: 9

As you can see, all the internal counters point to the same row (the last
one) because all the items in the list are references to the *same* Row
object.

As I said before, you can solve this by treating Row as a data accessor,
that is, by calling a getter. For example:

In [40]: [r[:] for r in t]
Out[40]:
[(0, 0.0),
 (1, 1.0),
 (2, 2.0),
 (3, 3.0),
 (4, 4.0),
 (5, 5.0),
 (6, 6.0),
 (7, 7.0),
 (8, 8.0),
 (9, 9.0)]

or:

In [41]: [r.fetch_all_fields() for r in t]
Out[41]:
[(0, 0.0),
 (1, 1.0),
 (2, 2.0),
 (3, 3.0),
 (4, 4.0),
 (5, 5.0),
 (6, 6.0),
 (7, 7.0),
 (8, 8.0),
 (9, 9.0)]

[see the manual for the difference between the '[:]' and
'.fetch_all_fields()' idioms]

Hope that helps,

-- 
Francesc Alted

"One would expect people to feel threatened by the 'giant brains or
machines that think'. In fact, the frightening computer becomes less
frightening if it is used only to simulate a familiar noncomputer."

-- Edsger W. Dijkstra