So ideally, this wouldn't make anyone's life more difficult. In my mind, the nice thing about the eSet's use of MIAME is that it is voluntary and minimal, yet it still serves as a reminder of the things that could be important if the data object lands in someone else's hands in the future.
For example, some HTS-specific fields from GEO are:

  extract protocol (e.g. Illumina TruSeq Stranded Total RNA)
  instrument model
  read length
  single-end or paired-end

For an SE produced by summarizeOverlaps, what counting mode was used?
If applicable, something like "origin of features" (e.g.
TxDb.Hsapiens.UCSC.hg19.knownGene)?

best,

Mike

On 2/3/13 6:43 PM, Tim Triche, Jr. wrote:
> When I first started pulling GEO eSet representations into SE/sset
> objects, I found that I had to write something to handle the mandatory
> MIAME data:
>
>   setAs("MIAME", "SimpleList",
>     function(from) { # {{{
>       to = list()
>       for(i in slotNames(from)) if(i != '.__classVersion__')
>         to[[i]]=slot(from, i)
>       return(SimpleList(to))
>     }
>   ) # }}}
>
> And then of course the SimpleList went into the sset exptData slot.
>
> I've been doing this for a while to GEO data so that I can coerce it
> into sset/SE objects (I'll start calling them 'sset' even though it
> doesn't make sense as an acronym ;-)). But MIAME is, specifically,
> Minimal Information About a Microarray Experiment. The closest I can
> think of would be the MAGE-TAB representation for TCGA sequencing
> experiments. The investigation (library prep, sequencing,
> quantification, etc.) is described in the IDF:
>
> https://tcga-data.nci.nih.gov/tcgafiles/ftp_auth/distro_ftpusers/anonymous/tumor/thca/cgcc/unc.edu/illuminahiseq_rnaseqv2/rnaseqv2/unc.edu_THCA.IlluminaHiSeq_RNASeqV2.mage-tab.1.7.0/unc.edu_THCA.IlluminaHiSeq_RNASeqV2.1.7.0.idf.txt
>
> The samples are then described in the SDRF:
>
> https://tcga-data.nci.nih.gov/tcgafiles/ftp_auth/distro_ftpusers/anonymous/tumor/thca/cgcc/unc.edu/illuminahiseq_rnaseqv2/rnaseqv2/unc.edu_THCA.IlluminaHiSeq_RNASeqV2.mage-tab.1.7.0/unc.edu_THCA.IlluminaHiSeq_RNASeqV2.1.7.0.sdrf.txt
>
> And all the plain-English parts are further described here (thanks
> Katie!):
>
> https://tcga-data.nci.nih.gov/tcgafiles/ftp_auth/distro_ftpusers/anonymous/tumor/thca/cgcc/unc.edu/illuminahiseq_rnaseqv2/rnaseqv2/unc.edu_THCA.IlluminaHiSeq_RNASeqV2.mage-tab.1.7.0/DESCRIPTION.txt
>
> Speaking from experience, it is a pain in the (arbitrary appendage) to
> assemble these, but they are essentially self-contained experiments
> for the end user. This is one of the reasons I like using sset
> objects even for data from GEO: I can keep all the exptData, I can map
> all the probes/reads/etc. to the appropriate genome build (and
> swap/lift assemblies as needed), and it's trivial to compare (say)
> RNAseq results to HuEx to 3' array results.
>
> So I'm not against support for this, although it would make rival
> labs' lives easier, which isn't always my goal in life ;-)
>
> On Sun, Feb 3, 2013 at 9:32 AM, Martin Morgan <mtmor...@fhcrc.org> wrote:
>
> > On 02/03/2013 06:37 AM, Mike Love wrote:
> >
> > > hi,
> > >
> > > Does/should there exist a class similar to MIAME for
> > > sequencing data, e.g. slots concerning the library
> > > preparation, alignment, etc.?
> > >
> > > This could then be suggested as something to include in the
> > > exptData SimpleList of SummarizedExperiment.
> >
> > As it is one could certainly
> >
> >   se = SummarizedExperiment()
> >   exptData(se) = list(MIAME())
> >
> > If we want to go down this route then I think the right strategy
> > would be to make the exptData slot more strict. But what would the
> > MIAME-like container look like? The basics are probably shared,
> > but what else?
> >
> >   > slotNames("MIAME")
> >    [1] "name"              "lab"               "contact"
> >    [4] "title"             "abstract"          "url"
> >    [7] "pubMedIds"         "samples"           "hybridizations"
> >   [10] "normControls"      "preprocessing"     "other"
> >   [13] ".__classVersion__"
> >
> > Martin
> >
> > > best,
> > >
> > > Mike
> > >
> > > _______________________________________________
> > > Bioc-devel@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/bioc-devel
> >
> > --
> > Computational Biology / Fred Hutchinson Cancer Research Center
> > 1100 Fairview Ave. N.
> > PO Box 19024 Seattle, WA 98109
> >
> > Location: Arnold Building M1 B861
> > Phone: (206) 667-2793
>
> --
> A model is a lie that helps you see the truth.
>
> Howard Skipper
> <http://cancerres.aacrjournals.org/content/31/9/1173.full.pdf>
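
To make the question "what would the MIAME-like container look like?" a bit more concrete, here is a minimal sketch of a voluntary, MIAME-style S4 class holding the HTS-specific fields listed at the top of this message. Everything in it is an assumption for illustration: the class name `HTSExperimentInfo` and its slot names are hypothetical and do not exist in Biobase or any other Bioconductor package. It uses only the base `methods` package, and the coercion to a plain list follows the same slot-by-slot idea as the `setAs("MIAME", "SimpleList", ...)` quoted above.

```r
library(methods)

# Hypothetical container: the class name and slot names are invented here,
# loosely following the GEO-style fields suggested in this thread.
setClass("HTSExperimentInfo",
  representation(
    extractProtocol = "character",  # e.g. library prep kit
    instrumentModel = "character",
    readLength      = "integer",
    pairedEnd       = "logical",    # TRUE = paired-end, FALSE = single-end
    countingMode    = "character",  # e.g. the mode passed to summarizeOverlaps
    featureOrigin   = "character"   # e.g. name of the annotation package used
  ),
  # Every field defaults to NA, so filling it in stays voluntary.
  prototype(
    extractProtocol = NA_character_,
    instrumentModel = NA_character_,
    readLength      = NA_integer_,
    pairedEnd       = NA,
    countingMode    = NA_character_,
    featureOrigin   = NA_character_
  ),
  validity = function(object) {
    rl <- object@readLength
    if (length(rl) != 1L) return("readLength must have length 1")
    if (!is.na(rl) && rl <= 0L) return("readLength must be positive")
    TRUE
  }
)

# Slot-by-slot coercion to a named list.
setAs("HTSExperimentInfo", "list", function(from) {
  sapply(slotNames(from), function(s) slot(from, s), simplify = FALSE)
})

info <- new("HTSExperimentInfo",
            extractProtocol = "Illumina TruSeq Stranded Total RNA",
            pairedEnd       = TRUE,
            countingMode    = "Union",
            featureOrigin   = "TxDb.Hsapiens.UCSC.hg19.knownGene")

meta <- as(info, "list")
str(meta)
```

With the relevant Bioconductor packages loaded, a list like `meta` could presumably be placed in the exptData slot in the same way as the MIAME-derived SimpleList described above. That step is left out here so the sketch runs without any Bioconductor dependency.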