Hi Phillipe,

When entering GenBank protein ID BAA05928 and retrieving the Ensembl gene ID and transcript ID, I get the following:
ENSG00000127241 <http://www.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000127241>
ENST00000337774 <http://www.ensembl.org/Homo_sapiens/Transcript/Summary?db=core;t=ENST00000337774>

When entering GenBank protein ID CAC17726 and retrieving the Ensembl gene ID and transcript ID, I get the following:
ENSG00000127152
ENST00000357195

It appears that in the Ensembl mart that you are querying, these GenBank IDs correspond to different transcripts (although to the same gene ID).

Regards,
Elena Rivkin, PhD
Outreach and Training Coordinator, Informatics and Bio-computing
Ontario Institute for Cancer Research
MaRS Centre, South Tower
101 College Street, Suite 800
Toronto, Ontario, Canada M5G 0A3
Tel: 647-258-4316
Toll-free: 1-866-678-6427
www.oicr.on.ca

This message and any attachments may contain confidential and/or privileged information for the sole use of the intended recipient. Any review or distribution by anyone other than the person for whom it was originally intended is strictly prohibited. If you have received this message in error, please contact the sender and delete all copies. Opinions, conclusions or other information contained in this message may not be that of the organization.

From: pip pipster <[email protected]>
Reply-To: pip pipster <[email protected]>
Date: Mon, 22 Aug 2011 13:51:56 -0400
To: Junjun Zhang <[email protected]>, [email protected]
Cc: Rhoda Kinsella via RT <[email protected]>
Subject: Re: [BioMart Users] Bug or User error with filtering?

Thank you Junjun.

Elena, to answer your question, I believe the NCBI links in the thread below include a link to the protein, where you can get the protein accession number. For example, for the 2 transcripts below you will find links to the following proteins. You will also see that the transcripts correctly show up at those URLs as protein coding.

http://www.ncbi.nlm.nih.gov/protein/471128 (accession BAA05928)
and
http://www.ncbi.nlm.nih.gov/protein/11558488 (accession CAC17726)

Thank you,
Phillipe

________________________________
From: Junjun Zhang <[email protected]>
To: pip pipster <[email protected]>; [email protected]
Cc: Rhoda Kinsella via RT <[email protected]>
Sent: Monday, August 22, 2011 12:59 PM
Subject: Re: [BioMart Users] Bug or User error with filtering?

Hi Phillipe,

I am forwarding your questions to the Ensembl Helpdesk. The Ensembl team is best placed to answer questions about the data content of the Ensembl databases.

Cheers,
Junjun

From: Elena Rivkin <[email protected]>
Date: Mon, 22 Aug 2011 10:46:35 -0400
To: pip pipster <[email protected]>, Rhoda Kinsella <[email protected]>, [email protected]
Subject: Re: [BioMart Users] Bug or User error with filtering?

Hi Phillipe,

Can you let me know what the GenBank protein accessions are for these two transcripts? I can't find them.

Thank you.
Elena Rivkin, PhD

From: pip pipster <[email protected]>
Reply-To: pip pipster <[email protected]>
Date: Mon, 22 Aug 2011 10:32:43 -0400
To: Rhoda Kinsella <[email protected]>, [email protected]
Subject: Re: [BioMart Users] Bug or User error with filtering?

After doing more investigation, something definitely isn't adding up. As it turns out, filtering by GenBank protein accession is what we want, and we need the ability to exclude. The 2 transcripts below are examples (they show up as protein coding in GenBank as well as in Ensembl), but there are thousands more like this. The filter below is taking them out despite them having a GenBank protein accession. What may be causing this?

ENST00000169293
http://www.ncbi.nlm.nih.gov/sites/entrez?db=gene&cmd=search&term=ENST00000169293
http://www.ncbi.nlm.nih.gov/nuccore/D28593?
http://useast.ensembl.org/Homo_sapiens/Transcript/Summary?g=ENSG00000127241;r=3:186964149-187009745;t=ENST00000169293

ENST00000345514
http://www.ncbi.nlm.nih.gov/gene?term=ENST00000345514
http://useast.ensembl.org/Homo_sapiens/Transcript/Summary?g=ENSG00000127152;r=14:99635624-99737822;t=ENST00000345514

Filter used: Manual (non-Perl)
Homo sapiens genes (GRCh37.p3)
Filters:
  with protein ID(s): Only
Attributes:
  Ensembl Gene ID
  Ensembl Transcript ID

The same problem occurs using the Perl filter as well:
$query->addFilter("with_protein_id", ["Only"]);

________________________________
From: pip pipster <[email protected]>
To: Rhoda Kinsella <[email protected]>
Cc: [email protected]
Sent: Monday, August 22, 2011 8:07 AM
Subject: Re: [BioMart Users] Bug or User error with filtering?

Rhoda,

Thank you for the feedback, very helpful.
The Gene Type filter 'protein_coding' will likely work; however, it doesn't allow me to do an 'exclude' type filter (i.e. give me everything except for the non protein-coding genes). Do you know if you can still do an exclude using the method you described?

Thank you!
Phillipe

________________________________
From: Rhoda Kinsella <[email protected]>
To: pip pipster <[email protected]>
Cc: [email protected]
Sent: Monday, August 22, 2011 5:04 AM
Subject: Re: [BioMart Users] Bug or User error with filtering?

Hi Phillipe

You are filtering using the protein ID (GenBank protein accession), and as this Ensembl protein ID does not have a corresponding GenBank protein accession, you will not get this ENSP. Please filter using the Gene type filter and select protein_coding. That way you will get the ENSP data you require.

Regards
Rhoda

On 21 Aug 2011, at 22:54, pip pipster wrote:

We are seeing strange things occur with the protein ID filter. For example, transcript ENST00000345514 is being filtered out by the search below. However, you can see that it does have a Protein ID, shown here: http://useast.ensembl.org/Homo_sapiens/Transcript/Summary?db=core;g=ENSG00000127152;r=14:99635624-99737861;t=ENST00000345514 . Any idea why this is being filtered? Could this be a bug in BioMart or the data, or is it user error?

Manual (non-Perl)
Homo sapiens genes (GRCh37.p3)
Filters:
  with protein ID(s): Only
Attributes:
  Ensembl Gene ID
  Ensembl Transcript ID

The same problem occurs using the Perl filter as well:
$query->addFilter("with_protein_id", ["Only"]);

Thank you,
Phillipe

_______________________________________________
Users mailing list
[email protected]
https://lists.biomart.org/mailman/listinfo/users

Rhoda Kinsella Ph.D.
Ensembl Bioinformatician, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton Cambridge CB10 1SD, UK.
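One way to see exactly which protein-coding transcripts the with_protein_id filter drops is to export both result sets as TSV (one query with Gene type set to protein_coding, one with "with protein ID(s): Only") and compare them client-side. The Perl below is a minimal sketch of that comparison using core Perl only; it is not part of the BioMart API. The helper names parse_tsv and dropped_transcripts are hypothetical, and the sample rows are assembled from the IDs quoted in this thread (written in the standard 11-digit Ensembl form), not taken from real query output.

```perl
use strict;
use warnings;

# Parse tab-separated rows (gene ID, transcript ID) into a hash
# that maps each transcript ID to its gene ID.
sub parse_tsv {
    my ($text) = @_;
    my %gene_for;
    for my $line (split /\n/, $text) {
        next unless $line =~ /\S/;    # skip blank lines
        my ($gene, $transcript) = split /\t/, $line;
        $gene_for{$transcript} = $gene;
    }
    return \%gene_for;
}

# Return, in sorted order, the transcripts present in $all
# but absent from $kept.
sub dropped_transcripts {
    my ($all, $kept) = @_;
    my @dropped = grep { !exists $kept->{$_} } sort keys %$all;
    return @dropped;
}

# Illustrative rows only: IDs come from the messages in this thread,
# standing in for the two exported result sets.
my $protein_coding = parse_tsv(join "\n",
    "ENSG00000127241\tENST00000169293",
    "ENSG00000127241\tENST00000337774",
    "ENSG00000127152\tENST00000345514",
    "ENSG00000127152\tENST00000357195",
);
my $with_protein_id = parse_tsv(join "\n",
    "ENSG00000127241\tENST00000337774",
    "ENSG00000127152\tENST00000357195",
);

# Transcripts that are protein coding but lack a GenBank protein accession.
my @dropped = dropped_transcripts($protein_coding, $with_protein_id);
print "$_\t$protein_coding->{$_}\n" for @dropped;
```

With the sample rows above, the two transcripts reported as missing (ENST00000169293 and ENST00000345514) are the ones printed. Swapping the two arguments gives the reverse check, i.e. transcripts that have a protein ID but are not in the protein_coding set.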
