Hi,

I am dealing with a large table. One of the columns of that table should hold
a variable number of pairs of integers. To implement this I use a PyTables
table for the fixed columns and a separate VLArray of IntAtom(4,2) for the
variable column.

This works, but I have a performance problem when I build this structure
row by row. Even though the table has 10 columns and most rows of the
VLArray hold only one pair of ints, the inserts into the VLArray take much
longer than those into the table.
Is this expected? Is there a way to avoid it by building the VLArray in memory
and inserting it into the HDF5 file in one go?

The table is:
Reads = [('bin', 'i4'), ('chr', 'u1'), ('strand', 'S1'), ('id', 'u4'),
         ('start', 'i4'), ('stop', 'i4'), ('blocks', 'i1'), ('splice', 'i4'),
         ('mate', 'i1'), ('mapping', 'i1')]
h5f.createTable("/", name, np.array([], Reads),
                expectedrows=100000000)

The VLArray is:
h5f.createVLArray("/",name+'Splice',tables.IntAtom(4,2),
       expectedsizeinMB=100)

The inserts into the table are done by filling in a row object and calling
row.append(). The inserts into the VLArray are done by appending 2D
numpy.array objects (one array of pairs per table row).
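
To make the pattern concrete, here is a minimal sketch of the insert loop
described above. It is not my actual code: the function name build, the node
name "reads" and all column values are made up for illustration, and it is
written against the PyTables 3.x names (open_file, create_table,
create_vlarray, Int32Atom) rather than the 2.x names used in the snippets
above, so it assumes a 3.x install.

```python
import os
import tempfile

import numpy as np
import tables

# Same column layout as the Reads description above.
Reads = np.dtype([('bin', 'i4'), ('chr', 'u1'), ('strand', 'S1'), ('id', 'u4'),
                  ('start', 'i4'), ('stop', 'i4'), ('blocks', 'i1'),
                  ('splice', 'i4'), ('mate', 'i1'), ('mapping', 'i1')])


def build(path, n_rows, name="reads"):
    """Fill a table and a companion VLArray row by row (hypothetical helper)."""
    with tables.open_file(path, "w") as h5f:
        table = h5f.create_table("/", name, description=Reads,
                                 expectedrows=n_rows)
        # Each atom is one pair of 32-bit ints, like IntAtom(4,2) above.
        vla = h5f.create_vlarray("/", name + "Splice",
                                 tables.Int32Atom(shape=(2,)))
        row = table.row
        for i in range(n_rows):
            # Made-up values; the remaining columns keep their defaults.
            row['id'] = i
            row['strand'] = b'+'
            row['start'] = 100 * i
            row['stop'] = 100 * i + 50
            row['splice'] = i  # index of the matching VLArray row
            row.append()
            # One VLArray row per table row: a 2D array of shape (n_pairs, 2).
            vla.append(np.array([[100 * i, 100 * i + 50]], dtype='i4'))
        table.flush()
        return table.nrows, vla.nrows


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.h5")
    print(build(demo_path, 1000))
```

The two append calls in the loop are the ones whose cost I am comparing.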

Thanks for any tips,
  Felix


_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
