On 08/04/2015 06:43 AM, Nathan Olson wrote:
We are starting to work on an infrastructure for annotation of 16S metagenomic
sequencing datasets and would like your comments and/or contributions. Below are
links to two GitHub repositories: metagenomeFeatures and greengenes13.5MgDb.
The metagenomeFeatures package contains two classes: mgDb, for 16S sequence
databases, and metagenomeAnnotation, for annotating a sequence dataset with
taxonomic information from a mgDb object. The greengenes13.5MgDb package loads
a mgDb object with the Greengenes 13.5 database. Greengenes 13.5 was used as an
Does it make sense to use AnnotationHub to manage these resources? Instead of
downloading and managing the fasta and taxonomy files in .onLoad and
getGreenGenes13.5Db, .onLoad would be
hub = AnnotationHub()
db_seq = hub[["AH12345"]]
db_taxa_file = hub[["AH12346"]]
with a 'recipe' describing how the corresponding annotation hub resources are to
be created. This would move download and management to AnnotationHub, and
potentially allow use of the annotation hub records by people with other
interests. If that sounds interesting we can work up a pull request.
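To make the suggestion a little more concrete, here is a rough sketch of what such an .onLoad might look like. It is only an illustration under stated assumptions, not the package's actual code: "AH12345" and "AH12346" are the placeholder IDs from above (no such records are implied to exist), and fetch_gg_resources and .gg_env are hypothetical names introduced for the example.

```r
# Minimal sketch, not the actual greengenes13.5MgDb code.
# "AH12345" / "AH12346" are placeholder record IDs; real IDs would only
# exist once the resources had been added to AnnotationHub.

# Hypothetical helper: fetch the sequence and taxonomy resources.
# `hub` can be any object that supports `[[` lookup by record ID.
fetch_gg_resources <- function(hub,
                               seq_id = "AH12345",
                               taxa_id = "AH12346") {
  list(
    db_seq = hub[[seq_id]],
    db_taxa_file = hub[[taxa_id]]
  )
}

# Package-private environment (hypothetical name) holding what .onLoad fetches.
.gg_env <- new.env(parent = emptyenv())

.onLoad <- function(libname, pkgname) {
  # AnnotationHub() takes care of downloading and caching hub records.
  hub <- AnnotationHub::AnnotationHub()
  resources <- fetch_gg_resources(hub)
  assign("db_seq", resources$db_seq, envir = .gg_env)
  assign("db_taxa_file", resources$db_taxa_file, envir = .gg_env)
  invisible(NULL)
}
```

Keeping the lookup in a small helper means it can be tried on a plain named list without touching the network, for example fetch_gg_resources(list(AH12345 = "seqs", AH12346 = "taxa.txt")).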
Martin
example database; we plan on adding packages for other commonly used
databases, e.g. RDP and SILVA.
The metagenomeFeatures package includes two vignettes demonstrating the mgDb
and metagenomeAnnotation class methods, using greengenes13.5MgDb as an example
database.
We are planning on adding additional methods for the mgDb and
metagenomeAnnotation classes. For the mgDb class, we plan to add methods for
assigning query sequences to database sequences using the rRDP classifier
and/or the sequence alignment methods that are part of the Biostrings package.
For the metagenomeAnnotation class, we plan to include the ability to create a
phylogenetic tree from a metagenomeAnnotation object.
We would appreciate comments on the package and suggestions for additional
features.
Links to the package GitHub repositories:
https://github.com/HCBravoLab/metagenomeFeatures
https://github.com/HCBravoLab/greengenes13.5MgDb
Thanks
Nate Olson and Hector Corrada Bravo
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793
_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel