Hi Tim,

I posted the question here because the maintainer of most of the annotation packages is listed as "Bioconductor Package Maintainer", though I have also posted it on the support website (https://support.bioconductor.org/p/76545/). The Bioconductor core team probably has a common workflow for handling this kind of problem. Thank you.
Zhilong

On 6 January 2016 at 02:07, Tim Triche, Jr. <tim.tri...@gmail.com> wrote:

> I should have phrased this differently:
>
> "Don't create new .db0 packages _just to map symbols or sequences_."
>
> The .db0 infrastructure is marvelous for oligonucleotide arrays designed
> to measure transcription, but in some respects it "suffers" from the BioC
> release cycle. For example, suppose I have a bunch of hgu133plus2 and
> HuGeneST 1.1 arrays where I find that the probe sequences, when aligned to
> a more recent reference transcriptome than the arrays were designed
> against, actually pick up noncoding RNAs better than the
> (discarded-due-to-mismapping) mRNA targets they were originally designed
> against. In Du et al (2013,
> http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3702647/) we see one of
> several ways in which this information can be used.
>
> HOWEVER! With the new mappings of probes to genes/symbols/transcripts, we
> have a bit of a conundrum, especially in situations where RNAseq data is
> also available. mapToIds() and mapToRanges() certainly helps, although a
> helper function that does the same thing based on lifted transcriptomic
> coordinates might do as well as the latter, and the former sometimes won't
> find the correct IDs (again with the release cycle issues). So if I map a
> number of symbols to, say, Ensembl build 83 plus some other stuff (for
> example, a number of recently documented non-coding RNAs), it's going to be
> rough going to get things mapped back to where I want them. And then of
> course it would be nice to normalize everything in a sensible fashion.
>
> My suggestion, due to the final two stings in the tail, would be to look
> into a probedesign (pd) file for oligo, so that a person can use SCAN.UPC
> to compare RNAseq and microarray quantifications of the same transcripts
> across a larger number of samples.
> That's just my opinion, but as may be obvious from the above
> excruciating level of detail, along with several years as maintainer of
> .db0 packages for platforms where the .db0 infrastructure might not have
> been the best fit, I do think my opinion may help others.
>
> Of course, I could always be wrong. I've been wrong many times before.
> Hopefully by documenting the various ways in which I've tried doing things
> (right and wrong), there can be some benefit to others trying the same.
>
> Best,
>
> --t
>
> On Tue, Jan 5, 2016 at 5:11 PM, James W. MacDonald <jmac...@uw.edu> wrote:
>
>> On Jan 5, 2016 7:01 PM, "Tim Triche, Jr." <tim.tri...@gmail.com> wrote:
>> >
>> > 1) this is a support.bioconductor.org question
>> > 2) don't use .db0 packages, you will rue the day you did
>>
>> Can you expand on this statement? Right now all of the ChipDb are built
>> using a db0 package, so it's not clear to me why this might be a problem.
>>
>> > best,
>> >
>> > --t
>> >
>> > On Tue, Jan 5, 2016 at 3:53 PM, Zhilong Jia <zhilong...@gmail.com> wrote:
>> >
>> > > Hello,
>> > >
>> > > Happy new year.
>> > >
>> > > What is the common work-flow to build an microarray annotation package,
>> > > like hgu133a.db.
>> > >
>> > > For some array, there are probe sequences available, then maybe mapping is
>> > > used? While for other situations, how to deal with? If code used by the
>> > > team available, that will be great. Thank you.
>> > >
>> > > The specific goal is to build new platform annotation packages which are
>> > > not available now from Bioconductor (what I need is just probe to gene
>> > > symbols).
>> > >
>> > > It seems Bioconductor update the annotation package when a new version
>> > > releasing due to the update of gene symbols.
>> > >
>> > > BTW, why name it as hgu133a.db instead of GPL96.db (from GEO) in
>> > > Bioconductor?
>> > > And user have to find the mapping relationship between them,
>> > > though there are some mappings, such as
>> > > https://gist.github.com/seandavi/bc6b1b82dc65c47510c7#file-platformmap-txt .
>> > >
>> > > Regards,
>> > > Zhilong
>> > >
>> > > --
>> > > Zhilong JIA
>> > > zhilong...@gmail.com
>> > > https://github.com/zhilongjia
>> > >
>> > > [[alternative HTML version deleted]]
>> > >
>> > > _______________________________________________
>> > > Bioc-devel@r-project.org mailing list
>> > > https://stat.ethz.ch/mailman/listinfo/bioc-devel

--
Zhilong JIA
zhilong...@gmail.com
https://github.com/zhilongjia