Hi Elena,
1. Regarding: "I cannot switch to the next step to select
attributes: the "Output" and "Next" buttons on the top are not
working." Can you please describe what is not working? What
happens when you click on these buttons?
When I click on these buttons, nothing happens: I would expect to switch
to another window/dialog box to select attributes.
The switch does work if I don't select the "Ensembl Gene IDs" option in
the drop-down menu under the "ID list limit" checkbox (but then the
whole database is returned).
2. The ID Converter can be accessed here:
http://central.biomart.org/converter/#!/ID_converter/gene_ensembl_config_2
I just tried the link. I get "no data" returned. Here are the first few
rows of the "Conversion results" window:
Ensembl Gene ID     Ensembl Transcript ID
ENSMUSG00000000001  no data
ENSMUSG00000000003  no data
ENSMUSG00000000028  no data
ENSMUSG00000000031  no data
ENSMUSG00000000037  no data
ENSMUSG00000000049  no data
ENSMUSG00000000056  no data
ENSMUSG00000000058  no data
ENSMUSG00000000078  no data
That is puzzling.
Best wishes
Henri-Jean
From: Henri-Jean Garchon <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Mon, 8 Aug 2011 10:44:35 -0400
To: Microsoft Office User <[email protected]>
Cc: "[email protected]" <[email protected]>, Junjun Zhang <[email protected]>
Subject: Re: [BioMart Users] Mouse EnsemblTranscriptID retrieval
Hi Elena,
Do you get 93805 transcripts from the same list of IDs that you sent
us? Can you please describe how you get all the Ensembl genes (n =
36814)?
This was partly a mistake of mine. As I mentioned yesterday to Junjun
(I realize I didn't cc the BioMart Users list, sorry about that):
"I am speaking only of the 0.8 version BioMart server, using Internet
Explorer (IE9).
I was puzzled by the fact that when I had uploaded the 22308 Ensembl
gene ID list I am working with, a list of 36814 gene IDs and 93805
transcript IDs was returned, actually corresponding to the unfiltered
database. Uploading a subset of 5000 gene IDs, the output was the same.
But I realized that in the "Filters" section, although I had checked
the "ID list limit" box, I had not selected the "Ensembl Gene IDs"
option in the drop-down menu (which doesn't exist in the 0.7 version).
So there was no filtering, even though I had uploaded my gene list,
which makes complete sense.
Indeed, if I select "Ensembl Gene IDs" in the drop-down menu under the
"ID list limit" checkbox and I upload the list, I can now see the
proper gene ID list appear in the "Filters" section of the summary
panel on the left (nicely done!).
However:
my _new_ problem is that I cannot switch to the next step to select
attributes: the "Output" and "Next" buttons on the top are not working."
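A quick way to notice that an ID list filter was silently ignored is to compare the gene IDs in the result with the uploaded list: any returned ID that was never uploaded suggests the list was not applied. A minimal sketch in Python; the function name is hypothetical and both arguments are assumed to be plain collections of Ensembl gene ID strings.

```python
def unexpected_ids(returned_ids, uploaded_ids):
    """Return the sorted IDs that appear in the result but were not uploaded."""
    return sorted(set(returned_ids) - set(uploaded_ids))
```

An empty list means every returned gene was in the uploaded list; a long one (here it would have covered most of the 36814 genes) points to an unfiltered query.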
In addition to the method described by Junjun, there is another quick
way to get the transcript ID for a list of genes. In
central.biomart.org, under Tools – select ID Converter. Select Mus
musculus dataset, and upload your list of genes. Select Transcript ID
in the TO box. Using this method I get 75,966 rows, as before.
I can't find this tool.
Best regards
Henri-Jean
From: Henri-Jean Garchon <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Sun, 7 Aug 2011 06:28:57 -0400
To: Junjun Zhang <[email protected]>
Cc: Microsoft Office User <[email protected]>, "[email protected]" <[email protected]>
Subject: Re: [BioMart Users] Mouse EnsemblTranscriptID retrieval
Dear Junjun,
Following my previous mail, the issue was with the browser: I was
using Opera (my default).
With Internet Explorer, the "upload file" button works fine.
There are no duplicates in the output transcript list (93805
transcripts). So this is good.
The issue however is that all the Ensembl genes (n = 36814) are
retrieved.
Identical result if I upload a 5000 gene list.
Best wishes
Henri-Jean
On 05/08/2011 22:45, Junjun Zhang wrote:
Dear Henri-Jean,
After executing the same query directly with an SQL SELECT statement
against the database and testing the same query on a BioMart 0.8
server, we can confirm that neither has any problem with missing or
duplicated results. So the problem is caused by BioMart 0.7 query
batching.
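Since it is reported elsewhere in this thread that resubmitting only the missing IDs returned complete, duplicate-free results, one possible workaround on a 0.7 server is to submit the ID list in smaller pieces and merge the results on the client side. The following is only a sketch under assumptions: `fetch_rows` is a hypothetical caller-supplied function standing for whatever performs one query and returns rows, and the chunk size of 500 is arbitrary. Whether any particular chunk size avoids the batching problem has not been established here.

```python
def chunked(ids, size):
    """Yield consecutive slices of ids with at most size elements each."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]

def query_in_chunks(ids, fetch_rows, size=500):
    """Run fetch_rows on each chunk and merge the rows without duplicates.

    fetch_rows is a hypothetical callable: it takes a list of gene IDs and
    returns an iterable of rows (one sequence of strings per row).
    """
    seen = set()
    merged = []
    for chunk in chunked(list(ids), size):
        for row in fetch_rows(chunk):
            row = tuple(row)
            if row not in seen:      # keep only the first copy of each row
                seen.add(row)
                merged.append(row)
    return merged
```

Deduplicating on the whole row, rather than on the gene ID alone, keeps genes with several transcripts intact while dropping exact repeats.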
Please try this if you'd like to test your gene IDs:
http://central.biomart.org/martwizard/#!/Genome?mart=Ensembl+Genes+63+(WTSI%2C+UK)&step=1&datasets=mmusculus_gene_ensembl
Cheers,
Junjun
From: jzhang <[email protected]>
Date: Fri, 5 Aug 2011 01:33:34 -0400
To: "[email protected]" <[email protected]>, Elena Rivkin <[email protected]>
Cc: "[email protected]" <[email protected]>
Subject: Re: [BioMart Users] Mouse EnsemblTranscriptID retrieval
Dear Henri-Jean,
Thanks for sending the lists of IDs to us for testing. Based on
Elena's test here, we can confirm the problem exists. However,
we have not had a chance to look into it closely enough to
figure out exactly what causes it. It very likely has
something to do with BioMart 0.7's query batching (which I
described a few days ago in another thread), which may result in
missing or duplicated rows in the result. As I mentioned earlier,
this will not happen in 0.8, where batching is implemented in a
different way.
We will continue the investigation and get back to you when we
have found something concrete.
Best regards,
Junjun
From: Henri-Jean Garchon <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thu, 4 Aug 2011 09:49:43 -0400
To: Elena Rivkin <[email protected]>, jzhang <[email protected]>
Cc: "[email protected]" <[email protected]>
Subject: Re: [BioMart Users] Mouse EnsemblTranscriptID retrieval
Dear Elena, Dear JunJun,
Many thanks to both of you for having taken the time to
address my request a month ago.
I agree that using the viewer with limit set to 10 to
illustrate my issue was not very bright! Apologies.
I must say that things have changed substantially since then
and look much better today. The output file generated after
retrieving "EntrezGene.ID" is a lot more consistent than a
month ago. There are fewer duplicates (and actually no
duplicate rows in the output table, as there used to be) and
far fewer "NA" entries from Entrez (although I checked that
these null entries have an associated gene name). I guess
these are issues with the Entrez database. Perhaps most
importantly, all input Ensembl.Gene.IDs are present in
the output table.
My concern now is an issue with the retrieval of
Ensembl.Transcript.ID, one of the default attributes of BioMart.
I am working with a list of 22308 Ensembl gene IDs
mapped on the Affymetrix Mouse Gene 1.0 ST microarray.
I uploaded this list on the biomart.org website to filter my
query.
The database is Ensembl build 63, the dataset is NCBIM37.
I retrieve the output as a TSV file ("export all results
to", not checking "unique results only").
I then go to R to check this output file.
The output table has 75966 rows, of which 59458 are unique.
In other words, 42950 rows are unique and 16508 are
duplicated. There may be an explanation for why some rows are
duplicated and others are not.
My main concern is that 6467 input Ensembl.Gene.IDs are not
retrieved and are missing from the output table. These are
bona fide genes with regular associated gene names. If I
upload the list of these missing guys, I now get the
corresponding transcripts. All of them are retrieved and
there are no duplicate rows!
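The checks described above were done in R, and the exact commands are not given. A roughly equivalent sketch in Python follows, for anyone who wants to reproduce the same kind of counts on their own export. It assumes the export is a tab-separated file with one header line and the Ensembl gene ID in the first column; the function names and the file name in the usage note are hypothetical.

```python
import csv
from collections import Counter

def read_tsv(path):
    """Read a TSV export into a list of row tuples, skipping the header line."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        next(reader, None)  # drop the header row
        return [tuple(row) for row in reader]

def summarize_export(rows, input_ids, gene_col=0):
    """Count rows, distinct rows, repeated rows and missing input IDs.

    rows      -- data rows as tuples of strings (header already removed)
    input_ids -- the gene IDs that were uploaded as the filter
    gene_col  -- index of the gene ID column (assumed to be the first)
    """
    counts = Counter(rows)                      # occurrences of each distinct row
    returned = {row[gene_col] for row in rows}  # gene IDs present in the output
    return {
        "total_rows": len(rows),
        "distinct_rows": len(counts),
        "rows_seen_once": sum(1 for n in counts.values() if n == 1),
        "rows_seen_repeatedly": sum(1 for n in counts.values() if n > 1),
        "missing_ids": sorted(set(input_ids) - returned),
    }
```

A call such as `summarize_export(read_tsv("mart_export.txt"), my_ids)` would then report how many rows are repeated and which uploaded IDs never appear in the output.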
In anticipation I thank you very much for your valuable help
and comments
Best regards
Henri-Jean
On 24/06/2011 15:24, Elena Rivkin wrote:
Dr. Henri-Jean Garchon,
The reason you see EntrezGene IDs for only a subset is that
some transcripts do not have an EntrezGene ID associated
with them. If you select Ensembl Transcript ID as an
attribute, you will see which transcripts correspond to
which EntrezGene ID.
For example:
ENSMUSG00000026073 (Il1r2) - only one of its transcripts
(ENSMUST00000027243) has an EntrezGene ID
and
ENSMUSG00000035208 (Slfn8) - has two different
EntrezGene IDs, although only one transcript
(ENSMUST00000038141).
I hope it helps.
Elena
From: Henri-Jean GARCHON <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Fri, 24 Jun 2011 05:00:03 -0400
To: "[email protected]" <[email protected]>
Subject: [BioMart Users] Fwd: Returned mail: see transcript for details
-------- Original message --------
Subject: Returned mail: see transcript for details
Date: Fri, 24 Jun 2011 09:48:20 +0100
From: Mail Delivery Subsystem <[email protected]>
To: <[email protected]>
The original message was received at Fri, 24 Jun 2011 09:48:20 +0100
from mx1.ebi.ac.uk [193.62.197.214]
----- The following addresses had permanent fatal errors -----
[email protected]
(reason: 550 Host unknown)
(expanded from:<[email protected]>)
----- Transcript of session follows -----
550 [email protected]... Host unknown (Name server:
biomart.org.redirect: host not found)