On 08/04/2015 06:43 AM, Nathan Olson wrote:
We are starting to work on an infrastructure for annotation of 16S metagenomic
sequencing datasets and would like your comments and/or contributions. Below are
links to two github repositories: metagenomeFeatures and greengenes13.5MgDb.
The metagenomeFeatures package contains two classes: mgDb, for 16S sequence
databases, and metagenomeAnnotation, for annotating a sequence dataset with
taxonomic information from an mgDb object.  The greengenes13.5MgDb package loads
an mgDb object with the greengenes 13.5 database.  greengenes 13.5 was used as an

Does it make sense to use AnnotationHub to manage these resources? Instead of downloading and managing the fasta and taxonomy files in .onLoad and getGreenGenes13.5Db, .onLoad would be

  library(AnnotationHub)

  ## "AH12345" / "AH12346" stand in for the eventual hub record ids
  hub = AnnotationHub()
  db_seq = hub[["AH12345"]]
  db_taxa_file = hub[["AH12346"]]

with a 'recipe' describing how the corresponding annotation hub resources are to be created. This would move download and management to AnnotationHub, and potentially allow use of the annotation hub records by people with other interests. If that sounds interesting we can work up a pull request.

Martin

example database; we plan to add packages for other commonly used
databases, e.g., RDP and Silva.

The metagenomeFeatures package includes two vignettes demonstrating the mgDb
and metagenomeAnnotation class methods, using greengenes13.5MgDb as an example
database.

We are planning to add further methods for the mgDb and
metagenomeAnnotation classes.  For the mgDb class, we plan to support assigning
query sequences to database sequences using the rRDP classifier and/or the
sequence alignment methods that are part of the Biostrings package.  For the
metagenomeAnnotation class, we plan to include the ability to create a
phylogenetic tree from a metagenomeAnnotation object.

We would appreciate comments on the package and suggestions for additional
features.
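
For illustration only, here is a rough sketch of what alignment-based assignment of query sequences to database sequences could look like with Biostrings. It is not taken from metagenomeFeatures: the sequences, the names db_seq, query_seq, scores and assignment, and the choice of a local alignment scored with the default parameters are all assumptions made for the example. Note that in recent Bioconductor releases pairwiseAlignment() is provided by the pwalign package, so that may need to be loaded instead.

```r
library(Biostrings)

## Toy stand-ins for a 16S reference database and a set of query reads
## (made-up sequences, not real greengenes records)
db_seq <- DNAStringSet(c(
    ref_A = "ACGTACGTTAGCCGATAGCTAGGCTA",
    ref_B = "TTGACCGATAGGCTAACCGGTTAACC"
))
query_seq <- DNAStringSet(c(
    read_1 = "ACGTACGTTAGCCGATAGC",
    read_2 = "GACCGATAGGCTAACCGG"
))

## Local alignment score of every query against every reference;
## rows are queries, columns are references
scores <- sapply(names(db_seq), function(ref) {
    pairwiseAlignment(query_seq, db_seq[[ref]],
                      type = "local", scoreOnly = TRUE)
})
rownames(scores) <- names(query_seq)

## Assign each query to its best-scoring reference
assignment <- colnames(scores)[apply(scores, 1, which.max)]
names(assignment) <- names(query_seq)
assignment
```

Each toy read is an exact substring of one reference, so it should be assigned to that reference. An all-against-all loop like this is only practical for small examples; a full reference database would need a faster search or a classifier-based approach.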

Links to package github repositories

https://github.com/HCBravoLab/metagenomeFeatures

https://github.com/HCBravoLab/greengenes13.5MgDb

Thanks

Nate Olson and Hector Corrada Bravo


--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
