Dear all,

I'm new to PyTables. I would like to use this package to store
biological data. In practice I have 2 input files.
The first consists of a long string of characters, with a length in
the range 10e7-10e8. I have to read this file character by character
and store each character in a table according to its position
(counting from 1). For example:

1 A
2 C
3 G
4 T
  and so on.
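
To make the first step concrete, here is a minimal sketch in plain
Python (no PyTables involved). The function name iter_positions and
the chunk size are only illustrative, and skipping line breaks is an
assumption about the input file, not something I have verified for
every case:

```python
import io


def iter_positions(handle, chunk_size=1 << 20):
    """Yield (position, character) pairs, counting positions from 1.

    The file is read in chunks so the whole string never has to be
    held in memory at once. Line breaks are skipped (an assumption).
    """
    position = 0
    while True:
        chunk = handle.read(chunk_size)
        if not chunk:
            break
        for char in chunk:
            if char in "\r\n":
                continue
            position += 1
            yield position, char


# Tiny in-memory stand-in for the real input file.
sample = io.StringIO("ACGT\n")
first_rows = list(iter_positions(sample))
```

With the sample above, first_rows holds the four (position, character)
pairs from the example.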

Then I have to read a second file and, while reading it, associate
another character with each character of the previous table,
according to its position. Example:

1 A A
2 C C
3 G G
4 T T
  and so on.

However, the same position can occur a variable number of times, so
I have to associate a variable number of characters with each
position. Example:

1 A AAAAA
2 C CCCCTC
3 G GGGGAGGGG
4 T TGGGTGTTTTTTTT
  and so on.
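
The structure I am after can be sketched in plain Python as follows.
This is only an illustration of the layout, not my actual script: the
(position, character) pairs stand in for the second file, whose real
format is not shown here, and group_by_position is a hypothetical
helper name.

```python
from collections import defaultdict

# Stand-in for the first file.
reference = "ACGT"

# Stand-in for the second file: (position, character) pairs in
# arbitrary order, where a position may occur any number of times.
observations = [
    (1, "A"), (2, "C"), (1, "A"), (3, "G"),
    (2, "T"), (4, "T"), (4, "G"),
]


def group_by_position(pairs):
    """Collect all characters seen at each position into one string."""
    grouped = defaultdict(list)
    for position, char in pairs:
        grouped[position].append(char)
    return {position: "".join(chars) for position, chars in grouped.items()}


grouped = group_by_position(observations)

# One row per position: (position, reference character, observed characters).
combined_rows = [
    (position, ref_char, grouped.get(position, ""))
    for position, ref_char in enumerate(reference, start=1)
]
```

Here every position ends up with a string whose length depends on how
often that position occurred in the second input.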

I tried to use a VLArray for each position, updating the array every
time it was needed. Creating the table according to the first
structure above was very fast. After adding the VLArrays, however, I
noticed a dramatic slowdown (from seconds to many hours; I stopped
the script before it finished).

Is there a way to speed up the process when VLArrays are involved?

I also tried to use the same table with a very large string size
instead of a VLArray, but in this case too the time needed to build
the table was very high.

Since I don't know PyTables very well, is there a way to improve the
performance?

Thank you very much in advance for any help.

Ernesto




  
