Hello all,

If a gene has more than one exon, the old version gave a gene header like |XXX,XXX,XXX|XXX,XXX,XXX|, but the current version only shows |XXX|XXX|.
I did the following:

(1) I selected Caenorhabditis elegans genes (CEL160) under Dataset.
(2) I clicked Attributes (Features) --> the Sequences radio button --> SEQUENCES --> cDNA sequences --> header information, and ticked chromosome, Ensembl gene ID, description, Ensembl transcript ID, biotype, start position, end position, Ensembl CDS length, 5'UTR start, 5'UTR end, 3'UTR start, 3'UTR end, exon start and exon end. Then I clicked the Results button.

Here is one gene that shows the errors:

>I|Y95B8A.6|Y95B8A.6a.2.1|882920|889792|594|882920|883166|||882920|88316
>I|6|protein_coding|

According to NCBI there are 3 blocks of 5'UTR and 4 blocks of CDS. But your result gives only one block of 5'UTR: |882920|883166|. Its length is only 247, whereas the length of the 5'UTR should be 486. The CDS length is given as 594, but only one block, |882920|883166|, is reported, which is obviously wrong: it is a block of the 5'UTR. I used Ensembl half a year ago and it was correct.

Regards,
JiyuanAn

> I have a similar problem.
>
> The old version (half a year ago) was quite good for downloading gene data,
> but with the current version I get several errors when I download gene
> data from BioMart:
>
> 1. When a gene has several exons, only one exon's start and end (chr
> bp) can be obtained.

Do you mean in the FASTA sequence header? From the structure attribute page you can still get all the exons, but there is a problem with the FASTA header attributes that we need to fix.

> 2. Outputted transcript attributes are in the wrong position.

Again, is this in the FASTA header?
I will investigate.

Best wishes,
Damian

> _____
>
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Maratou, Klio
> Sent: Monday, 12 February 2007 10:18 PM
> To: [email protected]
> Subject: [mart-dev] FW: query output inconsistency
>
> Klio Maratou, PhD
> MRC Clinical Sciences Centre
> Imperial College School of Medicine
> Hammersmith Hospital
> Du Cane Road
> London, W12 ONN
> Email:[EMAIL PROTECTED]
> Tel: 020 8383 4319
> Fax: 020 8383 8577
>
> _____
>
> From: Maratou, Klio
> Sent: Mon 12/02/2007 11:00
> To: [EMAIL PROTECTED]
> Subject: query output inconsistency
>
> I have a question about an inconsistency that I found when I query the
> biomart Rattus norvegicus genes (RGSC3.4) dataset. If under Filters I
> use the chromosome number along with base start and end positions to
> query biomart, then I get a specific number of genes that are in this
> genomic interval. However, if I repeat this query using marker names
> for start and end, then I get a different number of genes for the same
> genomic interval. The sequence start and end positions that I use are
> based on the sequence positions of the markers.
>
> Why is there a difference in the output?
>
> Best wishes,
>
> Klio Maratou
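For anyone who wants to check block lengths from a header by hand, here is a minimal Python sketch of the arithmetic used in the C. elegans report in this thread. It assumes the old-style header layout described there, where a start field and an end field each hold comma-separated coordinates (|XXX,XXX,XXX|XXX,XXX,XXX|), and that coordinates are inclusive. The function names are hypothetical and not part of BioMart; the second example uses made-up coordinates purely for illustration.

```python
def block_lengths(starts_field, ends_field):
    """Pair comma-separated start/end fields and return inclusive block lengths.

    Assumption: fields look like "s1,s2,s3" and "e1,e2,e3"; an empty field
    (as for a missing 3'UTR) yields no blocks.
    """
    starts = [int(s) for s in starts_field.split(",") if s.strip()]
    ends = [int(e) for e in ends_field.split(",") if e.strip()]
    if len(starts) != len(ends):
        raise ValueError("start and end fields list different numbers of blocks")
    # Inclusive coordinates: a block from 10 to 12 covers 3 bases.
    return [end - start + 1 for start, end in zip(starts, ends)]


def total_length(starts_field, ends_field):
    """Sum the lengths of all blocks described by the two header fields."""
    return sum(block_lengths(starts_field, ends_field))


# The single 5'UTR block reported in the thread: |882920|883166|
print(total_length("882920", "883166"))  # 247

# Made-up multi-block fields in the old comma-separated style
print(total_length("1,11,21", "5,15,25"))  # 15
```

The single block gives 247, matching the figure quoted in the report; a header that listed every 5'UTR block would be expected to sum to the full UTR length instead.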
