Hello,

I am writing an app that processes some genetic information.
I have already placed all my information in an H5 DB. Currently the
information that I am interested in resides in a Table that contains a
String for the Gene Name, a long string with the Gene Genetic Sequence, and
a 1x7 vector of type FloatCol. When I made the DB I made sure the
entries would be indexed by the Name column.

So, now I need to build some submatrices from the data, and to extract
them I am using the following code:

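from numpy import zeros

# One row per gene, one column per expression value.
unlabeled_matrix = zeros((num_genes, vec_len), dtype=float)
count = 0
for gene_name in instance.Unlabeled[0]:
    # Query the table for the row whose Name matches this gene.
    iterator = table.where(table.cols._f_col('Name') == gene_name)
    row = iterator.next()  # exactly one match is expected
    unlabeled_matrix[count, :] = row['Expression']
    count += 1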

Basically I am doing searches by name, and at least for this
application I have made sure that I always search for existing entries
and that there will be only one entry with each name. unlabeled_matrix
is a numpy matrix, and each row of this matrix holds one vector from
the Expression column of the table.
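
One alternative to the per-name queries, sketched here under assumptions:
read the rows once, build a dictionary from Name to Expression, and then
assemble the matrix in the order of the requested names. The stand-in rows
(plain dicts), the gene names, and the helper name build_matrix are
hypothetical and only for illustration; with a real table the loop would
run over the table rows instead, and I have not measured whether this is
actually faster.

```python
def build_matrix(rows, gene_names):
    """Return one expression vector per requested gene name, in order."""
    # Single pass over the rows: map each unique Name to a copy of its
    # Expression vector (copying avoids holding on to a reused row object).
    expression_by_name = {}
    for row in rows:
        expression_by_name[row['Name']] = list(row['Expression'])
    # Look up each requested name in the dictionary instead of querying again.
    return [expression_by_name[name] for name in gene_names]


# Hypothetical stand-in for table rows; a real table would be iterated instead.
rows = [
    {'Name': 'geneA', 'Expression': [0.1] * 7},
    {'Name': 'geneB', 'Expression': [0.2] * 7},
    {'Name': 'geneC', 'Expression': [0.3] * 7},
]
matrix = build_matrix(rows, ['geneC', 'geneA'])
```

This relies on the names being unique, as described above; a name that is
missing from the rows would raise a KeyError.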

Apart from the fact that this may not be the best way of accessing the
database (some comments on this would be appreciated), I am getting
performance that is lower than I was expecting (although I don't really
know what I should be expecting), so I would appreciate any suggestions
on how to approach this problem. For example, it takes between 3.5 and 4
seconds to generate a 20x7 matrix made of 20 gene expression vectors.
In addition, I need to process another matrix that would be 3800x7, and
I need to do this several times, which starts adding up. This
is without counting some mathematical manipulations on the
matrices and storing them in the DB again.

Thanks,
Pepe


_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
