I cannot speak for the core team. You should separate the data from the software methods and provide a data package containing the MAFs. This has the additional advantage of separating the versioning of the mutation data from that of your software. For a data package this does not sound large; the largest dataset is 3.7 MB. There is a potential privacy problem with sharing mutations, but I don't know at what level the mutations are described. I assume you have considered this?
Best,
Kasper

On Sun, Oct 1, 2017 at 9:16 PM, Anand MT <anand...@hotmail.com> wrote:

> Hi all,
>
> I maintain the maftools package, which offers a multitude of functions to
> perform various analyses and visualizations of MAF (Mutation Annotation
> Format) files from cancer cohorts.
>
> In the upcoming Bioconductor release, I plan to include all MAFs from 32
> TCGA cohorts as a part of the package. These TCGA MAFs will be stored as
> MAF objects containing curated somatic mutations along with clinical
> information in the extdata directory and can be loaded via a “tcga_load”
> function.
>
> I think this will help many researchers working with TCGA mutation data
> and save the time and hassle of going through various databases to search
> and assemble. I believe this also helps in reproducible research.
>
> However, the size of these MAF objects varies according to cohort size and
> mutation burden, with LAML (leukemia) being the smallest (91 KB) and LUAD
> (Lung Adenocarcinoma) being the largest (3.7 MB). Also, including these
> MAFs increases the package size to 46 MB (from 7 MB without these datasets).
>
> My questions are:
>
> * Is it okay for a package to be of this size?
> * I haven't tried to push these commits to the repository yet, but in case
>   git rejects my push due to the size limit, is it possible to make an
>   exception, given the scenario?
>
> If this can't be done in any way or if it breaks any rules of the package
> guidelines, I don't mind dropping the idea either.
>
> Thanks.
>
> -Anand.
>
> _______________________________________________
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel