Dear Gian,

If you want to reduce redundancy in a set of sequences before multiple alignment, blastclust (from the NCBI BLAST2 package) is OK. It's not R, but it would be easy enough to process its output in R. It is also perhaps a bit "crude", but that may be inevitable for this kind of task.

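A minimal sketch of that post-processing in R, assuming the blastclust output lists one cluster per line as whitespace-separated sequence identifiers. The function name parse_blastclust and the example identifiers are made up for illustration; in practice the lines would come from readLines() on the blastclust output file.

```r
# Turn blastclust output lines into a table mapping each sequence to an OTU.
# Assumption: one cluster per line, identifiers separated by whitespace.
# parse_blastclust is a hypothetical helper, not part of any package.
parse_blastclust <- function(lines) {
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]               # drop blank lines
  clusters <- strsplit(lines, "[[:space:]]+") # one character vector per cluster
  data.frame(
    otu = rep(seq_along(clusters), lengths(clusters)),
    seq_id = unlist(clusters),
    stringsAsFactors = FALSE
  )
}

# Made-up example standing in for readLines() on a real output file
example_lines <- c("seqA seqC seqD", "seqB seqE", "seqF")
otus <- parse_blastclust(example_lines)

# Keep one representative per OTU (here simply the first identifier on each line)
representatives <- otus$seq_id[!duplicated(otus$otu)]
```

The representatives vector can then be used to subset the sequences before alignment. Taking the first identifier per line is an arbitrary choice for the sketch; another rule, such as keeping the longest sequence, may suit the data better.
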
In general:

For protein sequences, use blastclust on the proteins.
For non-coding nucleotides, use blastclust on the nucleotides.
For coding nucleotides, use blastclust on the proteins (translations).

Best wishes,

Daniel

On 14/03/2013 21:18, "Sam Brown" <[email protected]> wrote:

>
>> Date: Thu, 14 Mar 2013 10:21:39 +0100
>> From: Gian Maria Niccolò Benucci <[email protected]>
>> To: [email protected]
>> Subject: [R-sig-phylo] How to cluster sequences in different OTUs
>>
>> Hi everyone,
>>
>> I was wondering if there is a function that is able to generate OTUs from a
>> sequence database.
>> Thank you very much in advance,
>>
>> --
>> Gian
>
>
>Hi Gian
>
>The function tclust() in the spider package (on CRAN) groups sequences
>based on their genetic distance below a given threshold.
>
>If you want something with likelihoods measured for it, check out the
>gmyc() function in the splits package (hosted on R Forge:
>https://r-forge.r-project.org/projects/splits/). This requires an
>ultrametric tree to create the OTUs.
>
>There may be others, but these two come immediately to mind.
>
>All the best.
>
>Sam
>
>
>Samuel Brown
>Postgraduate Student
>Bio-Protection Research Centre
>PO Box 84
>Lincoln University
>Lincoln 7647
>Canterbury
>New Zealand
>[email protected]
>http://www.the-praise-of-insects.blogspot.com
>
>_______________________________________________
>R-sig-phylo mailing list - [email protected]
>https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>Searchable archive at
>http://www.mail-archive.com/[email protected]/
>

Daniel

--
Daniel Barker
http://bio.st-andrews.ac.uk/staff/db60.htm
The University of St Andrews is a charity registered in Scotland : No SC013532
