Hi Ewan,

I just wanted to address the question embedded in your email below:

> Looking into the corresponding import of Ensembl into UCSC here:
>
> http://genome.ucsc.edu/cgi-bin/hgc?hgsid=173968291&o=99517493&t=99522910&g=ensGene&i=ENST00000499990
>
> This transcript is there, but I can't spot the "biotype" slot here - it is just
> that it is non coding (we have about ~20 other non coding biotypes, eg, snoRNAs,
> miRNAs etc)
>
> (Is this true - UCSC guys, would it be possible to get the concept of BioType in
> the Ensembl set?)

The BioType is not currently being included as part of our Ensembl track/table. I have passed along the suggestion to incorporate the BioType into our Ensembl track.

Please note, we do have a separate track, sno/miRNA (http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgRna), that contains miRNA and snoRNA and other data.

Katrina Learned
UCSC Genome Bioinformatics Group

Ewan Birney wrote, On 12/02/10 02:17:
> The Ensembl project explicitly aims to predict long intergenic non coding RNAs
> (lincRNAs) using a similar scheme (ie, histone modification patterns) and
> ESTs/cDNAs without coding potential in both Human and Mouse. They are explicitly
> characterised as lincRNAs. Like all our "predictions", they are biased towards
> a high specificity set and backed up by experimental evidence.
>
> An example one is here:
>
> http://www.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000245883;r=7:99517494-99522910;t=ENST00000499990
>
> Looking into the corresponding import of Ensembl into UCSC here:
>
> http://genome.ucsc.edu/cgi-bin/hgc?hgsid=173968291&o=99517493&t=99522910&g=ensGene&i=ENST00000499990
>
> This transcript is there, but I can't spot the "biotype" slot here - it is just
> that it is non coding (we have about ~20 other non coding biotypes, eg, snoRNAs,
> miRNAs etc)
>
> (Is this true - UCSC guys, would it be possible to get the concept of BioType in
> the Ensembl set?)
>
> Also the Havana project, which does manual curation, which is both merged in a principled
> way with the Ensembl set (ie, the Ensembl set is a super-set of Havana at the point of
> release) and is available in UCSC browser also has a large set of non coding RNAs.
>
> A count of lincRNAs in Human and Mouse in Ensembl are:
>
>   1443 - in Human
>
>   407 - in Mouse.
>
> It is probably possible to either download from UCSC and the biotypes from Ensembl with
> a script to join or of course download the set from ensembl. You might like to use
> our BioMart tool:
>
> (showing our west coast mirror here)
>
> http://uswest.ensembl.org/biomart/martview/
>
> On 2 Dec 2010, at 07:47, Bogdan Tanasa wrote:
>
>> Dear all,
>>
>> please could you recommend a track "Genes and Gene Prediction Tracks" that
>> has the highest number (with good accuracy) of known/ predicted long ncRNAs
>> (lincRNAs, etc) ?
>>
>> thanks,
>>
>> Bogdan
>> _______________________________________________
>> Genome maillist - [email protected]
>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
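
For anyone who wants to try the "script to join" route Ewan mentions while the BioType is not part of the UCSC Ensembl track, here is a minimal sketch in Python. It rests on several assumptions that are not stated in this thread and should be checked against your own files: that the UCSC ensGene download is tab-separated with the transcript ID in the second column (after a leading `bin` column), and that the BioMart export is tab-separated with header columns named "Ensembl Transcript ID" and "Transcript Biotype". The function names and the sample rows are made up for illustration; only ENST00000499990 and its coordinates come from the links above, and the strand, bin value and second transcript are placeholders.

```python
import csv
import io


def load_biotypes(lines, id_col="Ensembl Transcript ID",
                  biotype_col="Transcript Biotype"):
    """Build a transcript ID -> biotype lookup from a BioMart TSV export.

    The column names are assumptions; adjust them to match the header
    of the file you actually exported.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    return {row[id_col]: row[biotype_col] for row in reader}


def annotate(ens_gene_lines, biotypes, name_index=1):
    """Yield each ensGene row with the biotype appended as a last column.

    name_index=1 assumes the transcript ID is the second column
    (after 'bin'). Rows without a matching biotype get "NA".
    """
    for fields in csv.reader(ens_gene_lines, delimiter="\t"):
        # Skip blank lines and the '#'-prefixed header line.
        if not fields or fields[0].startswith("#"):
            continue
        yield fields + [biotypes.get(fields[name_index], "NA")]


# Illustrative inputs only; placeholder values, not real downloads.
BIOMART_TSV = (
    "Ensembl Transcript ID\tTranscript Biotype\n"
    "ENST00000499990\tlincRNA\n"
    "ENST00000000001\tprotein_coding\n"
)
ENSGENE_TSV = (
    "#bin\tname\tchrom\tstrand\ttxStart\ttxEnd\n"
    "1344\tENST00000499990\tchr7\t+\t99517493\t99522910\n"
    "585\tENST00000000002\tchr1\t+\t1000\t2000\n"
)

if __name__ == "__main__":
    lookup = load_biotypes(io.StringIO(BIOMART_TSV))
    for row in annotate(io.StringIO(ENSGENE_TSV), lookup):
        print("\t".join(row))
```

With real data you would pass open file handles instead of the `io.StringIO` objects. Keeping unmatched transcripts with "NA" rather than dropping them makes it easier to spot ID mismatches (for example, if one file carries version suffixes and the other does not).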
