Great thanks. I will add some ideas later this week or next week. Thomas
On Wed, Jun 29, 2016 at 12:44 PM Martin Morgan <martin.mor...@roswellpark.org> wrote:

> On 06/29/2016 03:42 PM, Thomas Girke wrote:
> > Yes, a "readSummarizedExperiment" would be a "modern-day analog of
> > Biobase::readExpressionSet". I also agree with the other suggestions
> > including github to get this started, and Vince's thoughts on binding
> > meta-data more tightly to source data as well as improving
> > interoperability.
>
> I started a repository at
>
> https://github.com/Bioconductor/TenStepReproducible
>
> I envision this as a package / white paper / eventually publication.
> Feel free to fork etc., and / or to contribute other ideas.
>
> Martin
>
> > As suggested I am sharing this discussion with the bioc-devel list.
> >
> > Thomas
> >
> > On Wed, Jun 29, 2016 at 06:22:49PM +0000, Vincent Carey wrote:
> >
> >> Thanks Thomas -- I think this should be circulated to biocore for
> >> further comments. I am in agreement that we need to do a better job
> >> at demonstrating the value of a) binding metadata to data, b) using
> >> standard containers through workflows, c) allowing interoperation.
> >> I learned some useful things about spreadsheet interoperation at the
> >> conference and need to learn more.
> >>
> >> In a sense we are giving a specific implementation of some of the
> >> rules in
> >>
> >> http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003285
> >>
> >> and I wonder whether we could come up with another topic for the "ten
> >> simple rules" series that addresses these concerns, or do something
> >> similar, perhaps for F1000Research, with a
> >> Bioconductor-interoperability focus on metadata.
> >
> >> On Wed, Jun 29, 2016 at 06:28:49PM +0000, Martin Morgan wrote:
> >
> >> I guess you mean a modern-day analog of Biobase::readExpressionSet? I
> >> like the idea of templates, and also drafting a 'Ten Steps Toward
> >> Reproducibility in R / Bioconductor'.
> >> Would be happy to start a github repo for same if there are any
> >> takers...
> >>
> >> Martin
> >
> >> This email message may contain legally privileged and/or confidential
> >> information. If you are not the intended recipient(s), or the employee
> >> or agent responsible for the delivery of this message to the intended
> >> recipient(s), you are hereby notified that any disclosure, copying,
> >> distribution, or use of this email message is prohibited. If you have
> >> received this message in error, please notify the sender immediately by
> >> e-mail and delete this email message from your computer. Thank you.
> >
> >> On 06/29/2016 01:57 PM, Thomas Girke wrote:
> >>> Hi Vince and Martin,
> >>>
> >>> It was great seeing you at the Bioc conference, and thanks for all your
> >>> time organizing the conference. As always it was a great success with a
> >>> lot of inspiring presentations and discussions.
> >>>
> >>> In one of our discussions you asked me for feedback on why I think
> >>> handling of meta-data is currently not straightforward for non-expert
> >>> users of Bioc packages such as biologists, data analysts or developers
> >>> coming from other languages.
> >>>
> >>> In my opinion, one main reason for this difficulty is that there is no
> >>> formal utility provided for importing meta-data from external files
> >>> (e.g. tabular, json or other formats). SummarizedExperiment has all
> >>> these great functionalities but it is not intuitive to non-expert users
> >>> how to import the data into the final object. For a developer it is
> >>> easy to write a custom import function, but not for non-R programmers.
> >>> Addressing this need would be trivial by providing an import function
> >>> that could read meta-data (optionally along with assay/range data)
> >>> provided by the user directly into SummarizedExperiment objects (and/or
> >>> RangedSummarizedExperiment).
> >>> To the best of my knowledge, a readSummarizedExperiment is currently
> >>> not available, but I might be wrong?
> >>>
> >>> Almost equally important would be an export function so that users can
> >>> easily report intermediate results and also share them with external
> >>> software outside of R. Clearly, for the latter need exporting to an Rd
> >>> file is not an option.
> >>>
> >>> Especially the import step overlaps substantially with how we
> >>> communicate with experimentalists via spreadsheets, a topic we
> >>> discussed at the meeting quite a bit. Providing one or two best
> >>> practice templates of how to organize experiments in the 'spirit' of
> >>> SummarizedExperiment could help to educate scientists on how to format
> >>> their meta-data in Excel or Google sheets so that they are easier to
> >>> process. This would also improve reproducibility since many sample
> >>> handling errors happen right at this level. As an example file one
> >>> could use here the current colData sample used by the
> >>> SummarizedExperiment vignette.
> >>>
> >>> That's really all.
> >>>
> >>> Best,
> >>>
> >>> Thomas

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
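
For anyone picking this up, the import/export idea in Thomas's original message could be sketched roughly as follows. This is only an illustration under stated assumptions, not an existing Bioconductor API: the function names readSummarizedExperiment() and writeSummarizedExperiment(), their arguments, and the two-file tab-delimited layout (one assay table, one sample table, each with IDs in the first column) are all hypothetical. Only SummarizedExperiment(), DataFrame(), assay(), colData() and the base R read/write functions are real.

```r
# Hypothetical sketch: neither function below exists in Bioconductor;
# the names are taken from the proposal in this thread.
library(SummarizedExperiment)

readSummarizedExperiment <- function(assayFile, colDataFile, sep = "\t") {
    # Assay table: first column holds feature IDs, the rest are samples.
    counts <- as.matrix(read.delim(assayFile, sep = sep, row.names = 1,
                                   check.names = FALSE))
    # Sample table: first column holds sample IDs, the rest are meta-data.
    samples <- read.delim(colDataFile, sep = sep, row.names = 1,
                          check.names = FALSE, stringsAsFactors = FALSE)
    # Fail early on sample-handling errors such as unannotated samples.
    missing <- setdiff(colnames(counts), rownames(samples))
    if (length(missing) > 0L)
        stop("samples without meta-data: ", paste(missing, collapse = ", "))
    # Reorder the meta-data so its rows line up with the assay columns.
    samples <- samples[colnames(counts), , drop = FALSE]
    SummarizedExperiment(
        assays = list(counts = counts),
        colData = DataFrame(samples, row.names = rownames(samples)))
}

writeSummarizedExperiment <- function(se, assayFile, colDataFile, sep = "\t") {
    # col.names = NA writes an empty header cell above the row names,
    # so the files can be opened in a spreadsheet or read back in.
    write.table(assay(se), assayFile, sep = sep, quote = FALSE,
                col.names = NA)
    write.table(as.data.frame(colData(se)), colDataFile, sep = sep,
                quote = FALSE, col.names = NA)
}

# Toy input files standing in for spreadsheets exported as tab-delimited
# text; note that the sample table is deliberately in a different order.
assayFile <- tempfile(fileext = ".tsv")
colDataFile <- tempfile(fileext = ".tsv")
writeLines(c("gene\tS1\tS2", "g1\t10\t0", "g2\t3\t7"), assayFile)
writeLines(c("sample\ttreatment", "S2\ttreated", "S1\tcontrol"), colDataFile)

se <- readSummarizedExperiment(assayFile, colDataFile)
colData(se)
```

Matching samples by ID rather than by position is a deliberate choice here, since it catches the kind of spreadsheet ordering mistakes mentioned above. Range data, multiple assays and formats such as json are left out of the sketch.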