On Wed, May 26, 2021 at 5:55 PM Stuart Lee <le...@wehi.edu.au> wrote:
> Hi You and Lori,
>
> Are fitted models in scope for ExperimentHub? I thought it was more for
> data. Maybe there should be a ModelHub for developers to include trained
> models from papers in their packages?
>
> @You: if that model has been fitted in R take a look at
> https://github.com/tidymodels/butcher for some ways of reducing its size.
>

Thanks for these suggestions Stuart! butcher certainly seems relevant. I
tried it out on the adabag output in You's package and was able to achieve
some nice reductions:

    > tr1 = x$trees[[1]]
    > obj_size(tr1)
    340,192 B
    > obj_size(axe_data(axe_call(axe_fitted(tr1))))
    171,896 B

So this, in conjunction with xz compression, could make this a moot point
for @You Zhou.

As for the ModelHub, two thoughts. First, I'd be more inclined at this stage
to partner with a system like kipoi.org, with fitted models archived there
and retrieved by API as needed by bioc packages. I wonder if there are any
good examples of this by now. Second, although I don't feel we have capacity
in core to introduce a new Hub at just this point, I think we'd be able to
help a motivated community-based team to produce one -- if the kipoi
suggestion isn't viable -- utilizing some Azure resources that have been
contributed by Microsoft Genomics. Interested parties should write to the
list.

I don't see a bioc slack channel devoted to AI/ML, and maybe there would be
good traffic on one. This "task area" could be added to biocchallenges, or
could be a topic for a developer forum meeting.

> Thanks
> Stuart
> ________________________________
> From: Bioc-devel <bioc-devel-boun...@r-project.org> on behalf of Kern,
> Lori <lori.sheph...@roswellpark.org>
> Sent: Wednesday, 26 May 2021 10:01 PM
> To: You Zhou <youzhoulearn...@gmail.com>; bioc-devel@r-project.org <
> bioc-devel@r-project.org>
> Subject: Re: [Bioc-devel] About the size limitation of the package
>
> Please consider using Experiment Hub to host the large data file.
> More information can be found here:
>
> https://bioconductor.org/packages/devel/bioc/vignettes/AnnotationHub/inst/doc/CreateAHubPackage.html
>
> Cheers,
>
> Lori Shepherd
> Bioconductor Core Team
> Roswell Park Comprehensive Cancer Center
> Department of Biostatistics & Bioinformatics
> Elm & Carlton Streets
> Buffalo, New York 14263
>
> ________________________________
> From: Bioc-devel <bioc-devel-boun...@r-project.org> on behalf of You Zhou
> <youzhoulearn...@gmail.com>
> Sent: Wednesday, May 26, 2021 5:09 AM
> To: bioc-devel@r-project.org <bioc-devel@r-project.org>
> Subject: [Bioc-devel] About the size limitation of the package
>
> Dear Bioc team,
>
> I am compiling a package "m6Aboost" and planning to submit it to
> Bioconductor. This package uses a trained machine learning model to
> identify the correct m6A signals from the miCLIP2 data set (more detail
> about this machine learning model can be found in our paper
> https://www.biorxiv.org/content/10.1101/2020.12.20.423675v1).
>
> I have run into a problem: the size of this machine learning model is
> 10 Mb, which is bigger than 5 Mb. Since this model is crucial for the
> package, I was wondering whether I can ignore the warning message about
> the size limitation. Thank you : )
>
> Best regards,
> You Zhou
_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
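The reply in this thread suggests pairing butcher with xz compression to get a fitted model under the package size limit. The following is a minimal base R sketch of the compression half only. The `model` object is a hypothetical stand-in (a list of numeric matrices), not the m6Aboost model, and the sizes it prints say nothing about what the real adabag object would achieve.

```r
# Hypothetical stand-in for a fitted model: a list of numeric matrices,
# loosely mimicking the bulk that a trained ensemble carries around.
set.seed(1)
model <- list(
  trees  = replicate(50, matrix(round(rnorm(2000), 2), ncol = 4),
                     simplify = FALSE),
  labels = c("positive", "negative")
)

# Write the same object once per compression method accepted by saveRDS().
methods <- c("gzip", "bzip2", "xz")
paths <- vapply(methods, function(m) {
  path <- tempfile(fileext = ".rds")
  saveRDS(model, path, compress = m)
  path
}, character(1))

# Compare the on-disk sizes in bytes.
sizes <- file.size(paths)
names(sizes) <- methods
print(sizes)

# Reading back needs no extra argument: readRDS() detects the compression.
restored <- readRDS(paths[["xz"]])
stopifnot(identical(restored, model))
```

For data already shipped as `.rda` files in a package, `tools::resaveRdaFiles()` can recompress them in place. Which method wins depends on the object, so it is worth measuring rather than assuming.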