Hi, I am dealing with a large table. One of the columns of that table should hold a variable number of pairs of integers. To implement this, I use a PyTables table for the fixed columns and a separate VLArray of IntAtom(4,2) for the variable column.
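
To make the layout concrete, here is a minimal plain-Python sketch of the two parallel structures. The names fixed_rows and splice_rows and all values are made up for illustration; in the real code the first is the table and the second is the VLArray, whose entries are 2D numpy arrays.

```python
# Hypothetical illustration only: names and values are not from the real code.
# Fixed columns: one record per table row (only three of the ten columns shown).
fixed_rows = [
    {"id": 1, "start": 100, "stop": 180},
    {"id": 2, "start": 250, "stop": 900},
    {"id": 3, "start": 400, "stop": 460},
]

# Variable column: entry i belongs to fixed_rows[i] and holds a variable
# number of integer pairs. Most entries hold exactly one pair.
splice_rows = [
    [(100, 180)],
    [(250, 300), (850, 900)],
    [(400, 460)],
]

# The two structures are kept in step: same length, same row order.
assert len(fixed_rows) == len(splice_rows)

# Number of pairs stored for each row of the variable column.
pairs_per_row = [len(pairs) for pairs in splice_rows]
```
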
This works, but I have a bit of a performance problem when I build this structure row by row. Even though the table has 10 columns and most rows of the VLArray hold only one pair of ints, the inserts into the VLArray take much longer than those into the table. Is this expected? Is there a way to avoid it by building the VLArray in memory and inserting it into the h5f file in one go?

The table is:

    Reads = [('bin', 'i4'), ('chr', 'u1'), ('strand', 'S1'), ('id', 'u4'),
             ('start', 'i4'), ('stop', 'i4'), ('blocks', 'i1'), ('splice', 'i4'),
             ('mate', 'i1'), ('mapping', 'i1')]
    h5f.createTable("/", name, np.array([], Reads), expectedrows=100000000)

The VLArray is:

    h5f.createVLArray("/", name+'Splice', tables.IntAtom(4,2), expectedsizeinMB=100)

The inserts into the table are done by constructing a row object and calling row.append(). The inserts into the VLArray are done by appending 2D numpy.array objects.

Thanks for any tips,
Felix

_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users