Dear bioc-devel,

multicrispr (https://gitlab.gwdg.de/loosolab/software/multicrispr) provides functions for CRISPR/Cas9 gRNA design (and is being prepared for BioC). One task involves finding all genomic matches, exact and mismatched, of a 23-bp candidate Cas9 sequence. Currently this is done with `Biostrings::vcountPDict`, which gives the right results but is slow.

An alternative would be to switch from (Bio)string matching to short-read mapping, which requires a one-time indexing effort but makes subsequent alignment fast.
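To make the current approach concrete, here is a minimal sketch of the `vcountPDict` step on toy data. The sequences and object names (`chromosomes`, `candidates`, `hits_exact`, `hits_mm`) are made up for illustration and are not taken from multicrispr; a real run would use the chromosomes of a BSgenome object and would also have to search the reverse strand, which this sketch leaves out.

```r
library(Biostrings)

# Toy stand-ins for chromosomes (hypothetical sequences, not real genome data).
# The first contains the candidate exactly, the second with one substitution.
chromosomes <- DNAStringSet(c(
    chrA = "TTTTACGTTGCAAGCTTAGGCTCATGGTTTT",
    chrB = "CCCCACGTAGCAAGCTTAGGCTCATGGCCCC"
))

# One 23-bp candidate: 20-nt spacer followed by a 3-nt PAM.
candidates <- DNAStringSet(c(cand1 = "ACGTTGCAAGCTTAGGCTCATGG"))
stopifnot(all(width(candidates) == 23L))

# Exact matches only (forward strand).
hits_exact <- vcountPDict(candidates, chromosomes, max.mismatch = 0)

# Matches with up to 2 mismatches (forward strand).
hits_mm <- vcountPDict(candidates, chromosomes, max.mismatch = 2)

# Both results are integer matrices: one row per candidate,
# one column per subject sequence.
print(hits_exact)
print(hits_mm)

# Total number of (mis)matches per candidate across all sequences.
print(rowSums(hits_mm))
```

Here the candidates are passed as a plain `DNAStringSet` rather than a preprocessed `PDict`, which avoids having to define a trusted band when mismatches are allowed but is the slower of the two routes.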
`Rsubread::align` seems to be limited to a maximum of 16 `nBestLocations`, whereas I know from vcountPDict that some Cas9 candidates have hundreds of genomic matches. The documentation of `QuasR::qAlign` (which connects to Bowtie) does not mention an upper limit on `maxHits`.

Feedback request:

Michael, would QuasR/(R)bowtie be a good approach to do this?
Wei, did I overlook a way to do this with Rsubread?
Herve, is there an elegant way to speed up vcountPDict (parallelize?)

Thank you :)

Aditya

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel