I have an application that uses the UCSC multiple alignments. My problem is that I need to query the NLMSA around 10-100 million times and get pieces of sequence to compute on. However, I am having trouble figuring out how to increase the speed: the bottleneck seems to be accessing the NLMSA and extracting the sequence pieces, and my current best is only about 10 queries per second.
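
For concreteness, here is a minimal sketch of the kind of inner loop I mean. The worldbase resource names (Bio.Seq.Genome.HUMAN.hg18, Bio.MSA.UCSC.hg18_multiz28way) are just placeholders for whatever genome and NLMSA you actually have saved, and the interval coordinates are made up:

from pygr import worldbase

# open the reference genome and the UCSC NLMSA (placeholder resource names)
genome = worldbase.Bio.Seq.Genome.HUMAN.hg18()
msa = worldbase.Bio.MSA.UCSC.hg18_multiz28way()

def get_aligned_pieces(chrom, start, stop):
    'query the NLMSA for one interval and pull out the aligned sequence strings'
    ival = genome[chrom][start:stop]
    for src, dest, edge in msa[ival].edges():
        yield repr(dest), str(dest)   # id/coords of the aligned interval, plus its sequence

# this gets repeated ~10-100 million times with different intervals
for piece_id, piece_seq in get_aligned_pieces('chr1', 4000, 4400):
    pass  # compute on piece_seq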
Does anyone else have any strategies for handling this kind of throughput? Does decreasing the size of the LPO indexes help at all? I have thought about splitting the alignments by chromosome of the reference species and building those in memory, but rebuilding the alignment takes some time as well (although overall, probably less time than I am spending now). I also have lots of processors to throw at the problem, on the order of hundreds, but they are all accessing the same disk (although that disk is very fast and sits on very expensive hardware). Any ideas would be appreciated!

Kenny
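
P.S. In case it makes the per-chromosome idea clearer, here is a rough multiprocessing sketch of what I have in mind: group the query intervals by chromosome and let each worker process open its own handle to the alignment, so nothing with open file descriptors is shared. The resource names and intervals are placeholders again, and I have not benchmarked this:

from multiprocessing import Pool
from pygr import worldbase

msa = None
genome = None

def init_worker():
    # each worker opens its own NLMSA/genome handles instead of
    # sharing them across processes
    global msa, genome
    genome = worldbase.Bio.Seq.Genome.HUMAN.hg18()
    msa = worldbase.Bio.MSA.UCSC.hg18_multiz28way()

def query_interval(job):
    # job = (chromosome, start, stop); return the aligned sequence strings
    chrom, start, stop = job
    ival = genome[chrom][start:stop]
    return [str(dest) for src, dest, edge in msa[ival].edges()]

if __name__ == '__main__':
    jobs = [('chr1', 4000, 4400), ('chr2', 10000, 10400)]   # placeholder intervals
    pool = Pool(processes=8, initializer=init_worker)
    for pieces in pool.map(query_interval, jobs):
        print len(pieces)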
