Elena,

You should be able to follow this chain to get the accession numbers.

a. From Transcript
   http://www.ncbi.nlm.nih.gov/sites/entrez?db=gene&cmd=search&term=ENST00000169293
b. To Gene (the link to this Gene URL is on the Transcript page above)
   http://www.ncbi.nlm.nih.gov/nuccore/D28593.1
c. To Protein (the link to this Protein URL is on the Gene page above)
   http://www.ncbi.nlm.nih.gov/protein/471128

From this standpoint, I am led to believe that the transcript maps to a GenBank protein accession and should not be filtered out by the $query->addFilter("with_protein_id", ["Only"]) filter. Either way, I would like to understand why it is being filtered out, since I have to trust the data I get back and deal with it accordingly.

Likewise, the following URL also appears to chain the gene to the proper transcripts:
http://www.ebi.ac.uk/ena/data/view/D28593

It appears that, for some reason, the data in Ensembl is not mapping transcript ENST00000169293 (and many others in similar categories) to the proper protein accession. But that is just my theory, and I would love to understand it better. Thoughts?

Best regards,
Phillipe

________________________________
From: Elena Rivkin <[email protected]>
To: pip pipster <[email protected]>; Junjun Zhang <[email protected]>; "[email protected]" <[email protected]>
Cc: Rhoda Kinsella via RT <[email protected]>
Sent: Monday, August 22, 2011 2:04 PM
Subject: Re: [BioMart Users] Bug or User error with filtering?

Hi Phillipe,

When entering Protein GenBank ID BAA05928 and retrieving the Ensembl gene ID and transcript ID, I get the following:
ENSG00000127241 ENST00000337774

When entering Protein GenBank ID CAC17726 and retrieving the Ensembl gene ID and transcript ID, I get the following:
ENSG00000127152 ENST00000357195

It appears that in the Ensembl mart that you are querying, these GenBank IDs correspond to different transcripts (although to the same gene ID).
Regards,

Elena Rivkin, PhD
Outreach and Training Coordinator, Informatics and Bio-computing

Ontario Institute for Cancer Research
MaRS Centre, South Tower
101 College Street, Suite 800
Toronto, Ontario, Canada M5G 0A3

Tel: 647-258-4316
Toll-free: 1-866-678-6427
www.oicr.on.ca

This message and any attachments may contain confidential and/or privileged information for the sole use of the intended recipient. Any review or distribution by anyone other than the person for whom it was originally intended is strictly prohibited. If you have received this message in error, please contact the sender and delete all copies. Opinions, conclusions or other information contained in this message may not be that of the organization.

From: pip pipster <[email protected]>
Reply-To: pip pipster <[email protected]>
Date: Mon, 22 Aug 2011 13:51:56 -0400
To: Junjun Zhang <[email protected]>, "[email protected]" <[email protected]>
Cc: Rhoda Kinsella via RT <[email protected]>
Subject: Re: [BioMart Users] Bug or User error with filtering?

Thank you Junjun.

Elena, to answer your question, I believe the NCBI links in the thread below include a link to the protein, where you can get the protein accession number. For example, for the 2 transcripts below you will find links to the following proteins. You will also see that the transcripts correctly show up on those pages as protein coding.

http://www.ncbi.nlm.nih.gov/protein/471128 (accession BAA05928)
and
http://www.ncbi.nlm.nih.gov/protein/11558488 (accession CAC17726)

Thank you,
Phillipe

________________________________
From: Junjun Zhang <[email protected]>
To: pip pipster <[email protected]>; "[email protected]" <[email protected]>
Cc: Rhoda Kinsella via RT <[email protected]>
Sent: Monday, August 22, 2011 12:59 PM
Subject: Re: [BioMart Users] Bug or User error with filtering?

Hi Phillipe,

I am forwarding your questions to the Ensembl Helpdesk.
The Ensembl team is best placed to answer questions about data contents in Ensembl databases.

Cheers,
Junjun

From: Elena Rivkin <[email protected]>
Date: Mon, 22 Aug 2011 10:46:35 -0400
To: pip pipster <[email protected]>, Rhoda Kinsella <[email protected]>, "[email protected]" <[email protected]>
Subject: Re: [BioMart Users] Bug or User error with filtering?

Hi Phillipe,

>Can you let me know, for these two transcripts, what their GenBank protein
>accessions are? I can't find them.
>
>Thank you.
>Elena Rivkin, PhD
>
>From: pip pipster <[email protected]>
>Reply-To: pip pipster <[email protected]>
>Date: Mon, 22 Aug 2011 10:32:43 -0400
>To: Rhoda Kinsella <[email protected]>, "[email protected]" <[email protected]>
>Subject: Re: [BioMart Users] Bug or User error with filtering?
>
>After doing more investigation, something definitely isn't adding up. As it
>turns out, filtering by GenBank protein accession is what we want and we need
>the ability to exclude. The 2 transcripts below are examples (they show up as
>protein coding in GenBank as well as Ensembl) but there are thousands more like
>this. The filter below is taking them out despite them having a GenBank
>protein accession. What may be causing this?
>
>ENST00000169293
>http://www.ncbi.nlm.nih.gov/sites/entrez?db=gene&cmd=search&term=ENST00000169293
>http://www.ncbi.nlm.nih.gov/nuccore/D28593?
>http://useast.ensembl.org/Homo_sapiens/Transcript/Summary?g=ENSG00000127241;r=3:186964149-187009745;t=ENST00000169293
>
>ENST00000345514
>http://www.ncbi.nlm.nih.gov/gene?term=ENST00000345514
>http://useast.ensembl.org/Homo_sapiens/Transcript/Summary?g=ENSG00000127152;r=14:99635624-99737822;t=ENST00000345514
>
>Filter used:
>Manual (non-Perl)
>  Homo sapiens genes (GRCh37.p3)
>    Filters
>      with protein ID(s): Only
>    Attributes
>      Ensembl Gene ID
>      Ensembl Transcript ID
>
>Same problem occurs using Perl filter as well
>  $query->addFilter("with_protein_id", ["Only"]);
>
>________________________________
>From: pip pipster <[email protected]>
>To: Rhoda Kinsella <[email protected]>
>Cc: "[email protected]" <[email protected]>
>Sent: Monday, August 22, 2011 8:07 AM
>Subject: Re: [BioMart Users] Bug or User error with filtering?
>
>Rhoda,
>Thank you for the feedback, very helpful. The Gene Type filter,
>'protein_coding' will likely work, however it doesn't allow me to do an
>'exclude' type filter (i.e. give me everything except for the
>non-protein-coding genes). Do you know if you can still do an exclude using
>the method you described?
>
>Thank you!
>Phillipe
>
>________________________________
>From: Rhoda Kinsella <[email protected]>
>To: pip pipster <[email protected]>
>Cc: "[email protected]" <[email protected]>
>Sent: Monday, August 22, 2011 5:04 AM
>Subject: Re: [BioMart Users] Bug or User error with filtering?
>
>Hi Phillipe
>You are filtering using the protein ID (GenBank protein accession) and as this
>Ensembl protein ID does not have a corresponding GenBank protein accession,
>you will not get this ENSP. Please filter using the Gene type filter and
>select protein_coding. That way you will get the ENSP data you require.
>Regards
>Rhoda
>
>On 21 Aug 2011, at 22:54, pip pipster wrote:
>
>>We are seeing strange things occur with the protein ID filter. For example,
>>transcript ENST00000345514 is being filtered out by the search below.
>>However, you can see that it does have a Protein ID, shown here:
>>http://useast.ensembl.org/Homo_sapiens/Transcript/Summary?db=core;g=ENSG00000127152;r=14:99635624-99737861;t=ENST00000345514
>>Any idea why this is being filtered? Could this be a bug in BioMart/data
>>or user error?
>>
>>Manual (non-Perl)
>>  Homo sapiens genes (GRCh37.p3)
>>    Filters
>>      with protein ID(s): Only
>>    Attributes
>>      Ensembl Gene ID
>>      Ensembl Transcript ID
>>
>>Same problem occurs using Perl filter as well
>>  $query->addFilter("with_protein_id", ["Only"]);
>>
>>Thank you,
>>Phillipe
>>_______________________________________________
>>Users mailing list
>>[email protected]
>>https://lists.biomart.org/mailman/listinfo/users
>
>Rhoda Kinsella Ph.D.
>Ensembl Bioinformatician,
>European Bioinformatics Institute (EMBL-EBI),
>Wellcome Trust Genome Campus,
>Hinxton
>Cambridge CB10 1SD,
>UK.
