I have a similar problem: slow loading times for an NLMSA with a large
number of intervals (10^6). Sample code and an input file are here
(beware, the archive is 26 MB):

http://www.ics.uci.edu/~baldig/pygr_nlmsa_test.tar.gz

The input file (bowtie_mapped_reads.txt) contains 1 million reads from a
bowtie mapping. All the code needed for parsing, reading, and inserting
the data into pygr is in a single file (pygr_nlmsa_test.py). Using the
"time" module, I measure a loading time of ~50 sec for
pygr.Data.getResource(...). Usage as reported by "top" shows:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+    TIME DATA COMMAND
32134 kdaily    25   0  361m 268m 4336 S    0  1.7   1:29.49   1:29 264m ipython
...
Mem:  16475040k total,  1745548k used, 14729492k free,    70924k buffers
Swap: 65537156k total,     6812k used, 65530344k free,  1126804k cached
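
A minimal sketch of this kind of wall-clock measurement with the "time"
module, in case it helps anyone reproduce the number. The helper name
`timed` and the resource ID in the trailing comment are hypothetical (they
are not taken from the tarball), and the workload is a stand-in so the
sketch runs without pygr installed:

```python
import time


def timed(func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-in workload (NOT pygr): sum the integers below one million.
result, elapsed = timed(sum, range(1000000))
print("finished in %.3f sec" % elapsed)

# With pygr, the measured call would be something like the following;
# the resource ID is made up for illustration:
#   import pygr.Data
#   msa, elapsed = timed(pygr.Data.getResource, 'Test.NLMSA.bowtie_reads')
```

Wrapping only the getResource call keeps parsing and insertion time out of
the measurement, so the figure reflects loading alone.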

Also, any hints on better ways of implementing this kind of
functionality are appreciated!

Thanks,

Kenny

On Mar 12, 9:32 pm, Christopher Lee <[email protected]> wrote:
> I don't think it should take 2 minutes to load that much data. 300,000  
> * 24 = 7 MB.  That can be read in a fraction of a second.  Is it  
> possible that your system is somehow going into virtual memory  
> swapping or some other very slow state?  Otherwise, we need a  
> reproducible test case for your performance problem, so we can debug it.
> -- Chris
>
> On Mar 12, 2009, at 6:59 PM, Alexander Alekseyenko wrote:
>
>
>
> > a few hundred thousands.