Dear Gian,

If you want to reduce redundancy in a set of sequences before multiple
alignment, blastclust (part of the NCBI BLAST2 package) is adequate. It is
not an R tool, but its output is easy enough to process in R. It is also
perhaps a bit "crude", but that may be inevitable for this kind of task.
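
As a rough illustration of that post-processing step, here is a minimal
sketch in Python (the same few lines translate directly to R). It assumes
the usual blastclust output layout of one cluster per line with sequence
identifiers separated by spaces. The function names, the example
identifiers, and the choice of the first identifier as the cluster
representative are illustrative assumptions, not part of blastclust.

```python
def read_clusters(lines):
    """Parse cluster lines: one cluster per line, IDs separated by whitespace."""
    clusters = []
    for line in lines:
        ids = line.split()
        if ids:  # skip blank lines
            clusters.append(ids)
    return clusters


def representatives(clusters):
    """Assumption: keep the first ID listed in each cluster as its representative."""
    return [cluster[0] for cluster in clusters]


# Made-up example standing in for the lines of a blastclust output file.
example = ["seqA seqB seqC", "seqD seqE", "seqF"]
clusters = read_clusters(example)
reps = representatives(clusters)
print(reps)
```

The list of representatives can then be used to subset the original
sequence file before alignment.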

In general:

For protein sequences, use blastclust on the proteins.

For non-coding nucleotides, use blastclust on the nucleotides.

For coding nucleotides, use blastclust on the proteins (translations).
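
For the coding-nucleotide case, the translations have to be produced
first. A minimal sketch in Python, assuming the standard genetic code and
sequences that are already in frame (reading frame 1); the function name
and the example sequence are hypothetical, and a different code table
would be needed for, e.g., mitochondrial genes:

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(nuc, to_stop=True):
    """Translate an in-frame coding sequence; unknown codons become 'X'."""
    nuc = nuc.upper().replace("U", "T")
    protein = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(0, len(nuc) - 2, 3):
        aa = CODON_TABLE.get(nuc[i:i + 3], "X")
        if aa == "*" and to_stop:
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGGCCAAATAA"))
```

The translated sequences can then be written to a FASTA file and
clustered as proteins.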

Best wishes,

Daniel

On 14/03/2013 21:18, "Sam Brown" <[email protected]> wrote:

>
>> Date: Thu, 14 Mar 2013 10:21:39 +0100
>> From: Gian Maria Niccolò Benucci <[email protected]>
>> To: [email protected]
>> Subject: [R-sig-phylo] How to cluster sequences in different OTUs
>>
>> Hi everyone,
>> 
>> I was wondering whether there is a function that can generate OTUs from
>> a sequence database.
>> Thank you very much in advance,
>> 
>> -- 
>> Gian
>
>
>Hi Gian
>
>The function tclust() in the spider package (on CRAN) groups sequences
>based on their genetic distance below a given threshold.
>
>If you want something with likelihoods measured for it, check out the
>gmyc() function in the splits package (hosted on R-Forge:
>https://r-forge.r-project.org/projects/splits/). This requires an
>ultrametric tree to create the OTUs.
>
>There may be others, but these two come immediately to mind.
>
>All the best.
>
>Sam
>
>
>Samuel Brown
>Postgraduate Student
>Bio-Protection Research Centre
>PO Box 84
>Lincoln University
>Lincoln 7647
>Canterbury
>New Zealand
>[email protected]
>http://www.the-praise-of-insects.blogspot.com
>
>_______________________________________________
>R-sig-phylo mailing list - [email protected]
>https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>Searchable archive at
>http://www.mail-archive.com/[email protected]/
>




-- 
Daniel Barker
http://bio.st-andrews.ac.uk/staff/db60.htm
The University of St Andrews is a charity registered in Scotland : No
SC013532
