On Wednesday 23 June 2010 22:17:23, Felix Schlesinger wrote:
> Hi,
>
> I am dealing with a large table. One of the columns of that table should
> hold a variable number of pairs of integers. In order to implement this I
> use a pytables table for the fixed columns and a separate VLArray of
> IntAtom(4,2) for the variable column.

What is the approximate number of entries in each row?

> This works, but I have a bit of a performance problem when I build this
> structure row by row. Even though the table has 10 columns and most rows
> of the VLArray only have one pair of Ints, the inserts into the VLArray
> take much longer than those into the table.
> Is this expected?

Yes. The Table entity is one of the fastest in PyTables, so you should expect a noticeable difference when using a VLArray instead.

> Is there a way to avoid it by building the VLArray in
> memory and inserting it into the h5f file in one go?

What about using a CArray for all the IntAtom(4,2) atoms, named, say, 'values', and an additional array (saved in a different CArray or even in an attribute), say 'indices', for keeping track of the different 'row' indices? For example, if the first entry in the VLArray has 1500 elements, then the first 'row' index array will be [0, 1500]; if the second row is 2000 elements long, then the second 'row' index will be [1500, 3500], and so on. Then, in order to retrieve a 'variable-length' row you only have to do:

    values[indices[my_var_len_rowidx,0]:indices[my_var_len_rowidx,1]]

I'd say that this is going to be fast.

-- 
Francesc Alted
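
The 'values' plus 'indices' layout suggested in the reply can be sketched in plain Python, as a rough illustration only. Lists stand in for the two on-disk arrays, and the helper names `flatten_rows` and `get_row` are hypothetical, not part of PyTables. The idea is to build both structures in memory first, so that each could then be written to the HDF5 file in a single call rather than row by row.

```python
# Sketch of the flat 'values' + 'indices' layout, using plain lists
# in place of the on-disk arrays. All names here are illustrative.

# Variable-length rows; each element is a pair of integers.
rows = [
    [(1, 2)],
    [(3, 4), (5, 6), (7, 8)],
    [],
    [(9, 10)],
]


def flatten_rows(rows):
    """Return (values, indices): all pairs concatenated, plus one
    (start, stop) entry per row pointing into 'values'."""
    values = []
    indices = []
    for row in rows:
        start = len(values)          # where this row begins in 'values'
        values.extend(row)
        indices.append((start, len(values)))  # stop is exclusive
    return values, indices


def get_row(values, indices, rowidx):
    """Retrieve one variable-length row by slicing 'values'."""
    start, stop = indices[rowidx]
    return values[start:stop]


values, indices = flatten_rows(rows)
second_row = get_row(values, indices, 1)
```

Each stop index equals the next row's start index, so an empty row is simply a pair with start equal to stop. Storing only the stop offsets would be enough to reconstruct the slices, but keeping both columns matches the two-column lookup shown in the reply.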