Hi all,
I have a large set of genomes (hundreds) that I'm hoping to analyze. For
most positions in the genomes, I want to count the number of A's, G's, C's,
and T's. So far I've been doing this by generating the pileup and then parsing
the data at the positions I'm interested in. The problem is this process is
taking a ton of time. Just for one individual on chr20, my entire script
took 36 minutes, 34 of which were spent generating the pileup. If I want to
run this on the entire genome for all the samples it will simply take too
long to be reasonable.
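
For reference, the parsing step described above could look roughly like the following Java sketch. It counts A/C/G/T in the read-bases column of a single samtools pileup line. The class and method names (PileupCounts, countBases) are made up for illustration and the example line is invented; this is not the actual script.

```java
class PileupCounts {
    // Index order of the returned counts: A, C, G, T.
    static final String BASES = "ACGT";

    /**
     * Counts A/C/G/T in the read-bases column (5th field) of one pileup line.
     * refBase is the reference base (3rd field); '.' and ',' mean "same as reference".
     */
    static int[] countBases(char refBase, String readBases) {
        int[] counts = new int[4];
        char ref = Character.toUpperCase(refBase);
        int i = 0;
        while (i < readBases.length()) {
            char c = readBases.charAt(i);
            if (c == '^') {
                // Start-of-read marker: '^' plus one mapping-quality character.
                i += 2;
            } else if (c == '+' || c == '-') {
                // Indel: sign, decimal length, then that many inserted/deleted bases.
                int j = i + 1;
                int len = 0;
                while (j < readBases.length() && Character.isDigit(readBases.charAt(j))) {
                    len = len * 10 + (readBases.charAt(j) - '0');
                    j++;
                }
                i = j + len;
            } else {
                // '.', ',' resolve to the reference base; letters are mismatches.
                // '$', '*', '<', '>' and N fall through uncounted.
                char base = (c == '.' || c == ',') ? ref : Character.toUpperCase(c);
                int idx = BASES.indexOf(base);
                if (idx >= 0) {
                    counts[idx]++;
                }
                i++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Invented example line: chrom, pos, ref, depth, read bases, qualities.
        String line = "chr20\t100\tA\t6\t.,^]G.-2ac,+1gt$\tIIIIII";
        String[] f = line.split("\t");
        int[] c = countBases(f[2].charAt(0), f[4]);
        System.out.println(f[0] + ":" + f[1]
                + " A=" + c[0] + " C=" + c[1] + " G=" + c[2] + " T=" + c[3]);
    }
}
```

Parsing like this is cheap compared with generating the pileup itself, which is consistent with the timings above (34 of 36 minutes in pileup generation).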
I'm hoping for some advice or ideas on how to improve this. I've tried
using -r and -l together, and it actually slightly slows samtools down
versus generating the entire chromosome and filtering for the positions I
need. My script is written in Java, so I was thinking of trying to
write/alter a GATK walker which I could call directly from my script, but I
don't have any experience with walkers so I'm not thrilled about that
option. I've also done a bit of looking around and found piledriver, which
is built on bamtools, but haven't tried to use it yet. Does anyone have a
sense of how these options compare in performance?
Although I don't really know what I'm talking about, it seems that if I
have a sorted list of the positions I'm interested in and just need to get
counts for each base at those positions, it shouldn't be so
computationally demanding.
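
To make that intuition concrete, here is a minimal Java sketch of counting bases only at a sorted list of target positions, working directly from reads rather than from a full pileup. It rests on strong simplifying assumptions: reads are given as a 1-based start position plus sequence, every read is ungapped (no indels or clipping, so CIGAR is ignored), target positions are distinct, and everything is on one chromosome. The class name TargetBaseCounter and its methods are hypothetical. Reading real BAM records would need a library such as htsjdk, which is not shown here.

```java
import java.util.Arrays;

class TargetBaseCounter {
    // Index order of each count row: A, C, G, T.
    static final String BASES = "ACGT";

    private final int[] targets;  // sorted, distinct, 1-based positions
    private final int[][] counts; // counts[i] = {A, C, G, T} at targets[i]

    TargetBaseCounter(int[] sortedTargets) {
        this.targets = sortedTargets.clone();
        this.counts = new int[targets.length][4];
    }

    /** Adds one read, assumed ungapped: base k of seq sits at position start + k. */
    void addRead(int start, String seq) {
        // Binary search for the first target position >= start.
        int idx = Arrays.binarySearch(targets, start);
        if (idx < 0) {
            idx = -idx - 1; // not found: convert to insertion point
        }
        int end = start + seq.length(); // exclusive end of the read
        // Visit only the targets this read actually covers.
        for (; idx < targets.length && targets[idx] < end; idx++) {
            char base = Character.toUpperCase(seq.charAt(targets[idx] - start));
            int b = BASES.indexOf(base);
            if (b >= 0) {
                counts[idx][b]++;
            }
        }
    }

    /** Returns a copy of the {A, C, G, T} counts for the i-th target position. */
    int[] countsAt(int i) {
        return counts[i].clone();
    }

    public static void main(String[] args) {
        int[] positions = {100, 105, 200};
        TargetBaseCounter counter = new TargetBaseCounter(positions);
        counter.addRead(98, "ACGTACGTAC");
        counter.addRead(100, "TTTTTT");
        counter.addRead(196, "ACGTA");
        for (int i = 0; i < positions.length; i++) {
            System.out.println(positions[i] + " " + Arrays.toString(counter.countsAt(i)));
        }
    }
}
```

Under these assumptions the work per read is one binary search plus one step per covered target, so the counting itself stays light. The real cost is in decoding the reads, which this sketch does not address.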
Thanks for any and all the help/advice! I really appreciate it!
-Alissa