On 17/07/12 20:32, Galt Barber wrote:
> Hi, Tony!
>
> Reminder: there are many excellent short-read aligners available today.

Hi, Galt.

Yes, of course, and I suggested using BWA or Bowtie when I joined the lab. However, we are searching for CRAC chimeras, and my colleagues have found that BLAST works well for this purpose. They have already tried short-read aligners such as NovoAlign but, speed aside, BLAST gives better results. My suggestion was that we use BLAT instead of BLAST to speed things up, which is why I've been using it lately. Most recently, we have been looking for intronic sequences, which is why I tried to BLAT against a 2bit version of the Ensembl top-level FASTA file.

In case you're interested, I'm working in David Tollervey's lab at the University of Edinburgh: http://tollervey.bio.ed.ac.uk
[The pipeline I'm working on detects CRAC chimeric hybrid reads]

BLAT has been great for most of the things I've done to date. The problem is simply that "blat" sometimes segfaults when searching a 'large' database (> 4GiB). I understand why people are advising me to split the DB into smaller fragments. However, I find it slightly worrying that an otherwise reliable program will sometimes segfault, and that someone else running "blat" on different hardware, with a different query and a different >4GiB DB, can reproduce this behaviour almost exactly.

If I've got time, I'll do some more debugging. It might be a memory leak that only shows up when indexing >4GiB DB files. As I said, the same query files run fine against the hg19.2bit DB I downloaded; with that DB, however, "blat" only uses about 3.9GiB of RAM.

Thanks for your helpful suggestion,
Tony.

_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
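
For anyone who wants to try the "split the DB into smaller fragments" workaround discussed above, here is a minimal Python sketch. It is an illustration only, not part of BLAT or any UCSC tool: the function names and the `max_bases` parameter are hypothetical, and the right threshold for staying clear of the segfault is an assumption you would need to check on your own data. The sketch keeps every FASTA record whole, so alignment coordinates are unchanged, and groups records greedily, in file order, into chunks of at most `max_bases` bases.

```python
import os


def record_lengths(fasta_path):
    """Return [(header_line, n_bases), ...] for each record, streaming the file."""
    lengths = []
    with open(fasta_path) as fh:
        for line in fh:
            if line.startswith(">"):
                lengths.append([line.rstrip("\n"), 0])
            elif lengths:
                lengths[-1][1] += len(line.strip())
    return [(header, n) for header, n in lengths]


def plan_chunks(lengths, max_bases):
    """Greedily assign records, in file order, to chunks of at most max_bases.

    A single record longer than max_bases gets a chunk of its own; records
    are never split, because that would change alignment coordinates.
    """
    chunk_of = []        # chunk index for each record, in file order
    chunk, used = 0, 0
    for _, n in lengths:
        if used > 0 and used + n > max_bases:
            chunk += 1
            used = 0
        chunk_of.append(chunk)
        used += n
    return chunk_of


def split_fasta(fasta_path, out_dir, max_bases):
    """Write chunk_000.fa, chunk_001.fa, ... into out_dir; return their paths."""
    chunk_of = plan_chunks(record_lengths(fasta_path), max_bases)
    os.makedirs(out_dir, exist_ok=True)
    paths, out, record = [], None, -1
    with open(fasta_path) as fh:
        for line in fh:
            if line.startswith(">"):
                record += 1
                # Open a new output file whenever the planned chunk changes.
                if out is None or chunk_of[record] != len(paths) - 1:
                    if out is not None:
                        out.close()
                    name = f"chunk_{chunk_of[record]:03d}.fa"
                    paths.append(os.path.join(out_dir, name))
                    out = open(paths[-1], "w")
            if out is not None:  # lines before the first header are skipped
                out.write(line)
    if out is not None:
        out.close()
    return paths
```

Each resulting chunk could then be converted to 2bit (for example with faToTwoBit) and searched separately with "blat", with the per-chunk outputs combined afterwards. The file is read twice (once to measure records, once to write them) so that no sequence has to be held in memory.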
