Whether to use AnnotationHub or leave the data in your package is up to you.
We encourage the use of the hubs, as they can reach a broader audience and they separate functions from data, keeping packages lightweight. The hubs also give others an opportunity to work with the annotations independently of your package. But the decision is yours.

If you would like to proceed with using the hubs, I can assist you. We would give you temporary access to a temporary directory to upload the data, and we would then move it to the appropriate location on S3. We suggest having subdirectories for versioning if you anticipate the files being updated in the future.

To put the data into production on the hubs, the packages need a properly formatted metadata.csv. Additional information for annotation packages that use the hub can be found here:

https://github.com/Bioconductor/AnnotationHub/blob/master/vignettes/CreateAnAnnotationPackage.Rmd

The metadata.csv file can be tested for proper format with the AnnotationHubData::makeAnnotationHubMetadata function, e.g.

AnnotationHubData::makeAnnotationHubMetadata("GeneSetDb.MSigDB.Hsapiens.v61")

If you would like to proceed this route, please email me at lori.sheph...@roswellpark.org for access credentials.

Lori Shepherd
Bioconductor Core Team
Roswell Park Cancer Institute
Department of Biostatistics & Bioinformatics
Elm & Carlton Streets
Buffalo, New York 14263

________________________________
From: Bioc-devel <bioc-devel-boun...@r-project.org> on behalf of Steve Lianoglou <mailinglist.honey...@gmail.com>
Sent: Thursday, December 21, 2017 3:26:48 PM
To: bioc-devel@r-project.org
Subject: [Bioc-devel] Submit data package or use AnnotationHub?

Hi all,

Briefly: I'm looking to get guidance on how to handle data packages that support a suite of software packages I'd like to submit to Bioconductor.

More detail: We (Genentech) have open-sourced some packages I've been developing internally for the past few years that facilitate the execution and exploration of gene set enrichment analyses.
https://github.com/lianos/multiGSEA
https://github.com/lianos/multiGSEA.shiny

I will submit them to bioc "in the normal way", however my question is how I should do that because there are also data packages I have (that are Suggest(ed) by multiGSEA) that need to go in as well. Would these data go in as data packages or via AnnotationHub?

multiGSEA provides convenience wrappers to retrieve genesets from different sources. One of these resources is the gene set collections made available by MSigDB. Using multiGSEA, a user can get the hallmark and c2 gene set collections like so:

```
library(multiGSEA)
gdb.human <- getMSigGeneSetDb(c("h", "c2"), "human")
gdb.mouse <- getMSigGeneSetDb(c("h", "c2"), "mouse")
```

These function calls check if the following data packages are installed and retrieve the appropriate gene sets if so (otherwise they raise an error):

https://github.com/lianos/GeneSetDb.MSigDB.Hsapiens.v61
https://github.com/lianos/GeneSetDb.MSigDB.Mmusculus.v61

I've created these data packages so that they approximate what I think looks like something suitable for AnnotationHub (ie. with working inst/scripts/make-data.R scripts).

These data packages start with MSigDB's gene set *xml files (ie. 'msigdb_v6.1.xml') and convert them into multiGSEA::GeneSetDb *.rds objects which are then used by the multiGSEA and multiGSEA.shiny packages.

I'm curious how to proceed from here?

Thanks,
-steve

ps: I know bioc looks down on not using "foundational" bioc classes, so we can have this discussion during pkg review, but a GeneSetDb object is a reimagined take on the GSEABase::GeneSetCollection. Unfortunately the latter just wasn't providing the functionality I wanted for how I felt like I wanted to interact with collections of genesets ...
multiGSEA provides methods to convert a GeneSetCollection to a GeneSetDb, and vice versa.

--
Steve Lianoglou
Bioinformatics Scientist
Cancer Immunology
Genentech

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
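
For anyone following the hub route described in the reply at the top of this thread, a minimal sketch of drafting a single metadata.csv row in base R might look like the following. The column names are the ones the hub vignette is understood to require, and every value (title, source URL, maintainer, Bioconductor version, file path) is a hypothetical placeholder, so both should be checked against the current AnnotationHub documentation. The versioned subdirectory in RDataPath only illustrates the versioning suggestion made in the reply.

```r
# Sketch only: draft one metadata.csv row for a hub resource.
# All values below are hypothetical placeholders, not real hub records.
meta <- data.frame(
  Title              = "MSigDB v6.1 gene sets (human) as a GeneSetDb",
  Description        = "MSigDB gene set collections converted to a GeneSetDb object",
  BiocVersion        = "3.7",                    # assumed target release
  Genome             = NA_character_,            # not genome-specific here
  SourceType         = "XML",                    # assumed to be an accepted value
  SourceUrl          = "https://example.org/msigdb_v6.1.xml",  # placeholder URL
  SourceVersion      = "6.1",
  Species            = "Homo sapiens",
  TaxonomyId         = 9606,
  Coordinate_1_based = NA,
  DataProvider       = "Broad Institute",
  Maintainer         = "Jane Doe <jane.doe@example.org>",      # placeholder
  RDataClass         = "GeneSetDb",
  DispatchClass      = "Rds",
  # Versioned subdirectory, so later releases can sit alongside this one.
  RDataPath          = "GeneSetDb.MSigDB.Hsapiens.v61/v6.1/GeneSetDb.MSigDB.Hsapiens.v61.rds",
  Tags               = "MSigDB:GeneSets",
  stringsAsFactors   = FALSE
)

# Written to a temporary directory here; in a real package the file would
# normally live under inst/extdata/.
out <- file.path(tempdir(), "metadata.csv")
write.csv(meta, out, row.names = FALSE)

# With AnnotationHubData installed, the format check quoted in the thread
# would then be run against the package directory:
# AnnotationHubData::makeAnnotationHubMetadata("GeneSetDb.MSigDB.Hsapiens.v61")
```

Building the row as a data frame and writing it with write.csv keeps the quoting consistent, which matters once descriptions contain commas.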