Hi Phillipe,
When entering the GenBank protein ID BAA05928 and retrieving the Ensembl gene ID 
and transcript ID, I get the following:
ENSG00000127241<http://www.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000127241>
 
ENST00000337774<http://www.ensembl.org/Homo_sapiens/Transcript/Summary?db=core;t=ENST00000337774>

When entering the GenBank protein ID CAC17726 and retrieving the Ensembl gene ID 
and transcript ID, I get the following:
ENSG00000127152, ENST00000357195

It appears that, in the Ensembl mart you are querying, these GenBank IDs 
correspond to different transcripts (although to the same gene IDs).
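
The comparison can be written out as a minimal sketch. It uses only the identifiers quoted in this thread, with Ensembl IDs in their standard 11-digit form; pairing each accession with a queried transcript by shared gene ID is an assumption, and the function name `compare` is just illustrative.

```python
# GenBank protein accession -> (gene ID, transcript ID) returned by the mart,
# as reported in this message.
returned = {
    "BAA05928": ("ENSG00000127241", "ENST00000337774"),
    "CAC17726": ("ENSG00000127152", "ENST00000357195"),
}

# Transcripts originally asked about, with gene IDs taken from the Ensembl
# URLs quoted further down the thread (pairing by gene ID is an assumption).
queried = {
    "BAA05928": ("ENSG00000127241", "ENST00000169293"),
    "CAC17726": ("ENSG00000127152", "ENST00000345514"),
}


def compare(returned, queried):
    """Report, per accession, whether gene and transcript IDs agree."""
    report = {}
    for accession, (gene, transcript) in returned.items():
        q_gene, q_transcript = queried[accession]
        report[accession] = {
            "same_gene": gene == q_gene,
            "same_transcript": transcript == q_transcript,
        }
    return report


if __name__ == "__main__":
    for accession, result in compare(returned, queried).items():
        print(accession, result)
```

For both accessions the gene IDs match while the transcript IDs differ, which is the pattern described above.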
Regards,
Elena Rivkin, PhD
Outreach and Training Coordinator, Informatics and Bio-computing

Ontario Institute for Cancer Research
MaRS Centre, South Tower
101 College Street, Suite 800
Toronto, Ontario, Canada M5G 0A3

Tel: 647-258-4316
Toll-free: 1-866-678-6427
www.oicr.on.ca

This message and any attachments may contain confidential and/or privileged 
information for the sole use of the intended recipient. Any review or 
distribution by anyone other than the person for whom it was originally 
intended is strictly prohibited. If you have received this message in error, 
please contact the sender and delete all copies. Opinions, conclusions or other 
information contained in this message may not be that of the organization.


From: pip pipster <[email protected]>
Reply-To: pip pipster <[email protected]>
Date: Mon, 22 Aug 2011 13:51:56 -0400
To: Junjun Zhang <[email protected]>, <[email protected]>
Cc: Rhoda Kinsella via RT <[email protected]>
Subject: Re: [BioMart Users] Bug or User error with filtering?

Thank you Junjun.

Elena, to answer your question, I believe the NCBI links in the thread below 
include a link to the protein, where you can get the protein accession number.  
For example, for the 2 transcripts below you will find links to the following 
proteins.  You will also see that the pages at those URLs correctly show the 
transcripts as protein coding.

http://www.ncbi.nlm.nih.gov/protein/471128 (accession BAA05928)
and
http://www.ncbi.nlm.nih.gov/protein/11558488 (accession CAC17726)

Thank you,
Phillipe

________________________________
From: Junjun Zhang <[email protected]>
To: pip pipster <[email protected]>; <[email protected]>
Cc: Rhoda Kinsella via RT <[email protected]>
Sent: Monday, August 22, 2011 12:59 PM
Subject: Re: [BioMart Users] Bug or User error with filtering?

Hi Phillipe,

I am forwarding your questions to the Ensembl Helpdesk. The Ensembl team is 
best placed to answer questions about data content in the Ensembl databases.

Cheers,
Junjun


From: Elena Rivkin <[email protected]>
Date: Mon, 22 Aug 2011 10:46:35 -0400
To: pip pipster <[email protected]>, Rhoda Kinsella <[email protected]>, 
<[email protected]>
Subject: Re: [BioMart Users] Bug or User error with filtering?

Hi Phillipe,
Can you let me know the GenBank protein accessions for these two transcripts? 
I can't find them.

Thank you.
Elena Rivkin, PhD


From: pip pipster <[email protected]>
Reply-To: pip pipster <[email protected]>
Date: Mon, 22 Aug 2011 10:32:43 -0400
To: Rhoda Kinsella <[email protected]>, <[email protected]>
Subject: Re: [BioMart Users] Bug or User error with filtering?

After doing more investigation, something definitely isn't adding up.  As it 
turns out, filtering by GenBank protein accession is what we want, and we need 
the ability to exclude.  The 2 transcripts below are examples (they show up as 
protein coding in GenBank as well as in Ensembl), but there are thousands more 
like this.  The filter below is taking them out despite them having a GenBank 
protein accession.  What may be causing this?

ENST00000169293
http://www.ncbi.nlm.nih.gov/sites/entrez?db=gene&cmd=search&term=ENST00000169293
http://www.ncbi.nlm.nih.gov/nuccore/D28593?
http://useast.ensembl.org/Homo_sapiens/Transcript/Summary?g=ENSG00000127241;r=3:186964149-187009745;t=ENST00000169293

ENST00000345514
http://www.ncbi.nlm.nih.gov/gene?term=ENST00000345514
http://useast.ensembl.org/Homo_sapiens/Transcript/Summary?g=ENSG00000127152;r=14:99635624-99737822;t=ENST00000345514


Filter used:
Manual (non-Perl)
    Homo sapiens genes (GRCh37.p3)
    Filters
        with protein ID(s): Only
    Attributes
        Ensembl Gene ID
        Ensembl Transcript ID

Same problem occurs using Perl filter as well
    $query->addFilter("with_protein_id", ["Only"]);
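
For reference, the same query can be written in BioMart's XML query form. The following is a minimal sketch that only builds the XML string and does not contact any server. The filter name `with_protein_id` comes from the Perl line above; the dataset name `hsapiens_gene_ensembl`, the attribute names `ensembl_gene_id` and `ensembl_transcript_id`, and the helper `build_query` are assumptions. In this XML form a boolean filter is expressed with `excluded="0"` ("Only") or `excluded="1"` ("Excluded"); whether the mart being queried accepts the excluded form for this filter is also an assumption.

```python
import xml.etree.ElementTree as ET


def build_query(dataset, attributes, boolean_filters):
    """Build a BioMart-style XML query string.

    boolean_filters maps a filter name to True (excluded) or False (only).
    """
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "0",
        "uniqueRows": "1",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    for name, excluded in boolean_filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "excluded": "1" if excluded else "0"})
    for attribute in attributes:
        ET.SubElement(ds, "Attribute", {"name": attribute})
    return ET.tostring(query, encoding="unicode")


if __name__ == "__main__":
    # "with protein ID(s): Only" corresponds to excluded="0".
    print(build_query(
        "hsapiens_gene_ensembl",  # assumed dataset name
        ["ensembl_gene_id", "ensembl_transcript_id"],  # assumed attribute names
        {"with_protein_id": False},
    ))
```

Passing `{"with_protein_id": True}` instead yields the complementary query, which is one way to see which transcripts the filter removes.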

________________________________
From: pip pipster <[email protected]>
To: Rhoda Kinsella <[email protected]>
Cc: <[email protected]>
Sent: Monday, August 22, 2011 8:07 AM
Subject: Re: [BioMart Users] Bug or User error with filtering?

Rhoda,
Thank you for the feedback, very helpful.  The Gene type filter with 
'protein_coding' will likely work; however, it doesn't allow me to do an 
'exclude' type filter (i.e. give me everything except for the 
non-protein-coding genes).  Do you know if you can still do an exclude using 
the method you described?
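
If the mart cannot exclude by gene type directly, one possible workaround is to export the gene type as an extra attribute and do the exclusion on the client side. A minimal sketch, assuming tab-separated output with a header row; the column names, the function `exclude_biotypes`, and the second sample row are hypothetical and not taken from any real export.

```python
import csv
import io


def exclude_biotypes(tsv_text, biotype_column, excluded):
    """Return the rows whose value in biotype_column is not in excluded."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row for row in reader if row[biotype_column] not in excluded]


if __name__ == "__main__":
    # Illustrative input only: the second row is a made-up placeholder.
    sample = (
        "Ensembl Transcript ID\tGene type\n"
        "ENST00000345514\tprotein_coding\n"
        "PLACEHOLDER_TRANSCRIPT\tpseudogene\n"
    )
    for row in exclude_biotypes(sample, "Gene type", {"pseudogene"}):
        print(row["Ensembl Transcript ID"], row["Gene type"])
```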

Thank you!
Phillipe

________________________________
From: Rhoda Kinsella <[email protected]>
To: pip pipster <[email protected]>
Cc: <[email protected]>
Sent: Monday, August 22, 2011 5:04 AM
Subject: Re: [BioMart Users] Bug or User error with filtering?

Hi Phillipe
You are filtering using the protein ID (GenBank protein accession) and, as this 
Ensembl protein ID does not have a corresponding GenBank protein accession, you 
will not get this ENSP. Please filter using the Gene type filter and select 
protein_coding. That way you will get the ENSP data you require.
Regards
Rhoda


On 21 Aug 2011, at 22:54, pip pipster wrote:

We are seeing strange things occur with the protein ID filter.  For example, 
transcript ENST00000345514 is being filtered out by the search below.  
However, you can see that it does have a protein ID, shown here:  
http://useast.ensembl.org/Homo_sapiens/Transcript/Summary?db=core;g=ENSG00000127152;r=14:99635624-99737861;t=ENST00000345514
Any idea why this is being filtered out?  Could this be a bug in BioMart or 
the data, or user error?

Manual (non-Perl)
    Homo sapiens genes (GRCh37.p3)
    Filters
        with protein ID(s): Only
    Attributes
        Ensembl Gene ID
        Ensembl Transcript ID

Same problem occurs using Perl filter as well
    $query->addFilter("with_protein_id", ["Only"]);

Thank you,
Phillipe
_______________________________________________
Users mailing list
[email protected]
https://lists.biomart.org/mailman/listinfo/users

Rhoda Kinsella Ph.D.
Ensembl Bioinformatician,
European Bioinformatics Institute (EMBL-EBI),
Wellcome Trust Genome Campus,
Hinxton
Cambridge CB10 1SD,
UK.






