On Tue, 2007-02-13 at 22:57 +1100, Jiyuan An wrote:
> Hello all,

Hi JiyuanAn
>
> If there is more than one exon, in the old version the gene header
> would look like |XXX,XXX,XXX|XXX,XXX,XXX|. But the current version only
> shows |XXX|XXX|.

That's correct: we lost this useful feature in 0.5, though only by
accident. We are working towards restoring it.

> > I did as follows:
> > (1) I selected Caenorhabditis elegans genes (CEL160) in Dataset.
> > (2) Click Attributes (Features) --> click radio button Sequences
> > --> click SEQUENCES --> click cDNA sequences --> click header
> > information
> > --> I ticked chromosome, Ensembl gene ID, description, Ensembl
> > transcript ID, biotype, start position, end position, Ensembl CDS
> > length, 5UTR start, 5UTR end, 3UTR start, 3UTR end, exon start and
> > exon end.
> >
> > Then I clicked the result button. Here is one gene that shows the errors:
> >
> > >I|Y95B8A.6|Y95B8A.6a.2.1|882920|889792|594|882920|883166|||882920|88316
> > >I|6|protein_coding|
> >
> > From NCBI there are 3 blocks of 5'UTR and 4 blocks of CDS.
> > But in your result only one block of 5'UTR is given: |882920|883166|.
> > The length is only 247, but the length of the 5'UTR should be 486!
> > The length of the CDS is 594, but you give just one block,
> > |882920|883166|. That is obviously totally wrong: it's a block of
> > 5'UTR! I used Ensembl half a year ago, and it was correct.

Again, because we are not displaying all the coordinates (for instance,
exons) as a comma-separated list, one can't work out the correct length.
However, the length of the actual sequence should be correct. As I said,
we will fix this soon to match how it was in 0.4.

Many thanks for your help.

Kind Regards
Syed

> > Regards,
> > JiyuanAn
> >
> > > I have a similar problem.
> > >
> > > The old version (half a year ago) was quite good for downloading
> > > gene data, but with the current version I got several errors when
> > > I downloaded gene data from BioMart:
> > >
> > > 1. When a gene has several exons, only one exon's start and end
> > > (chr bp) can be obtained.
> >
> > Do you mean on the FASTA sequence header? From the structure attribute
> > page you can still get all the exons, but there is a problem with the
> > FASTA header attributes that we need to fix.
> >
> > > 2. Outputted transcript attributes are in the wrong position.
> >
> > Again - is this on the FASTA header? I will investigate.
> >
> > best wishes
> > Damian
> >
> > > _____
> > >
> > > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
> > > Behalf Of Maratou, Klio
> > > Sent: Monday, 12 February 2007 10:18 PM
> > > To: [email protected]
> > > Subject: [mart-dev] FW: query output inconsistency
> > >
> > > Klio Maratou, PhD
> > > MRC Clinical Sciences Centre
> > > Imperial College School of Medicine
> > > Hammersmith Hospital
> > > Du Cane Road
> > > London, W12 ONN
> > > Email:[EMAIL PROTECTED]
> > > Tel: 020 8383 4319
> > > Fax: 020 8383 8577
> > >
> > > _____
> > >
> > > From: Maratou, Klio
> > > Sent: Mon 12/02/2007 11:00
> > > To: [EMAIL PROTECTED]
> > > Subject: query output inconsistency
> > >
> > > I have a question about an inconsistency that I found when I query the
> > > BioMart Rattus norvegicus genes (RGSC3.4) dataset. If under Filters I
> > > use the chromosome number along with base start and end positions to
> > > query BioMart, then I get a specific number of genes that are in this
> > > genomic interval. However, if I repeat this query using marker names
> > > for start and end, then I get a different number of genes for the same
> > > genomic interval. The sequence start and end positions that I use are
> > > based on the sequence positions of the markers.
> > >
> > > Why is there a difference in the output?
> > >
> > > Best wishes,
> > >
> > > Klio Maratou
> > >
> > > Klio Maratou, PhD
> > > MRC Clinical Sciences Centre
> > > Imperial College School of Medicine
> > > Hammersmith Hospital
> > > Du Cane Road
> > > London, W12 ONN
> > > Email:[EMAIL PROTECTED]
> > > Tel: 020 8383 4319
> > > Fax: 020 8383 8577

--
======================================
Syed Haider.
EMBL-European Bioinformatics Institute
Wellcome Trust Genome Campus, Hinxton,
Cambridge CB10 1SD, UK.
======================================
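
As a worked illustration of the length arithmetic discussed in this thread: with 1-based inclusive coordinates, a block from 882920 to 883166 spans 883166 - 882920 + 1 = 247 bases, and a feature split over several blocks needs every start/end pair before its total length can be computed. The Python sketch below assumes the 0.4-style layout mentioned above, where the starts and the ends each arrive as one comma-separated field. The function name and the three-block coordinates are hypothetical, chosen only for illustration, and this is not BioMart's own code.

```python
def total_block_length(starts_field, ends_field):
    """Sum the lengths of coordinate blocks given as comma-separated strings.

    Assumption: coordinates are 1-based and inclusive, so a single
    block covers end - start + 1 bases.
    """
    starts = [int(s) for s in starts_field.split(",") if s.strip()]
    ends = [int(e) for e in ends_field.split(",") if e.strip()]
    if len(starts) != len(ends):
        raise ValueError("start and end lists differ in length")
    return sum(end - start + 1 for start, end in zip(starts, ends))


# The single 5'UTR block reported in the thread.
print(total_block_length("882920", "883166"))  # 247

# Hypothetical three-block feature (made-up coordinates).
print(total_block_length("100,300,500", "149,349,599"))  # 200
```

With only one start and one end in the header, as in the 0.5 output quoted above, the same function can report just the one block's length, which is why the total cannot be recovered from that header alone.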
