Hi Daniel,
Your question never made it to the list; see the archive:
http://listserver.ebi.ac.uk/mailing-lists-archives/mart-dev/threads.html
Did you subscribe to the list before sending it?
Details on how to subscribe are here:
http://www.biomart.org/contact.html
I am posting this one to the list anyway.
For the answers to your questions, scroll down.
Daniel Murrell wrote:
Hi Syed,
I posted a question to the list... Did you get it on your side? I've had no
response yet though. I'm wondering if my question was too naive.
This was my question to the list...
"
This misunderstanding could stem from my relative newness to the field of
bio-informatics, but I'm trying to understand how one RefSeq DNA ID (
NM_033487<http://www.ncbi.nlm.nih.gov/sites/entrez?db=nuccore&cmd=search&term=NM_033487>)
can map to 5 of 16 gene transcripts (ENSTs).
http://www.biomart.org/biomart/martview?VIRTUALSCHEMANAME=default&ATTRIBUTES=hsapiens_gene_ensembl.default.feature_page.ensembl_gene_id|hsapiens_gene_ensembl.default.feature_page.ensembl_transcript_id|hsapiens_gene_ensembl.default.feature_page.refseq_dna&FILTERS=hsapiens_gene_ensembl.default.filters.ensembl_gene_id."ENSG00000008128"&VISIBLEPANEL=resultspanel
How is this mapping generated? Where can I find documentation about how
mappings between different identifiers are done in general?
Mappings are specific to each data provider, so the BioMart documentation
is not the right place to find this information. For Ensembl, the best
approach is to go through the helpdesk; they will get this mail anyway
(cc'ed).
While I'm asking questions, why do the docs make it seem like many people
would want to install their own instances of Biomart? Isn't the point of
Biomart to have one central place that people use to look up relationships?
I don't think the documentation should suggest that you need to install
your own BioMart. The only people who need to install BioMart are data
providers, not users. If you wish to use the Perl API, you can do a
partial install of the software and use the biomart library to access
data from other publicly available BioMart servers. This excludes
setting up any of the web-related components (Apache, mod_perl, and
several CPAN modules).
"
I've decided that for efficiency purposes I'd like to store the
relationships between a few entities like RefSeq DNA IDs and ENST IDs. I'm
trying to decide whether this relationship needs to be one-many or
many-many.
Select these two attributes in the interface and you will see the
relationship. Let's wait to hear what the Ensembl folks have to say about
this.
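
Once the two attributes have been exported as tab-separated rows, the
one-many versus many-many question can also be checked mechanically. The
following is only a sketch: the function names and the sample rows are
made up for illustration (they are not real BioMart output), and the
answer it gives describes only the rows it is handed, not the whole
dataset.

```python
from collections import defaultdict


def parse_tsv(text):
    """Split tab-separated text into (left, right) pairs, ignoring blank lines."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        left = fields[0].strip()
        right = fields[1].strip() if len(fields) > 1 else ""
        rows.append((left, right))
    return rows


def classify_relationship(pairs):
    """Classify (left, right) ID pairs as one-to-one, one-to-many, etc."""
    rights_per_left = defaultdict(set)
    lefts_per_right = defaultdict(set)
    for left, right in pairs:
        if not left or not right:
            continue  # skip rows where one identifier is missing
        rights_per_left[left].add(right)
        lefts_per_right[right].add(left)

    # Does any left ID map to several right IDs, and vice versa?
    left_fans_out = any(len(v) > 1 for v in rights_per_left.values())
    right_fans_out = any(len(v) > 1 for v in lefts_per_right.values())

    if left_fans_out and right_fans_out:
        return "many-to-many"
    if left_fans_out:
        return "one-to-many"
    if right_fans_out:
        return "many-to-one"
    return "one-to-one"


# Hypothetical identifiers: T* stands for transcript IDs, R* for RefSeq IDs.
sample = "T1\tR1\nT2\tR1\nT3\tR2\nT3\tR3\nT4\t\n"
print(classify_relationship(parse_tsv(sample)))
```

If the rows show fan-out in both directions, a join table (many-many) is
the safer schema choice.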
Best,
Syed
Thanks
Daniel
On Fri, Aug 7, 2009 at 2:05 PM, Daniel Murrell <[email protected]> wrote:
Hi Syed,
Yes thanks... the URL button tip is useful.
I have another question... I'll give the list a go :)
Thanks
Daniel
On Fri, Aug 7, 2009 at 2:01 PM, Syed Haider <[email protected]> wrote:
Hi Daniel,
Daniel Murrell wrote:
Hi Syed,
Where is the best place to ask Biomart related questions? Is it that list
that you posted my python code to?
My bio-informatics is a little sketchy so I don't want to waste too much
of one person's time.
The best place to post BioMart-related questions is the BioMart mailing
list:
[email protected]
However, if you clearly notice a data-related issue, you may also cc the
email to the data provider. In the case of Ensembl, you already have the
contacts. Most of the data providers are on mart-dev anyway, so you won't
miss anybody. Follow-ups are best handled if you keep everybody in the
loop.
I'm looking at ENSG00000008128 and the Ensembl site shows 16
transcripts...
http://www.ensembl.org/Homo_sapiens/Gene/Summary?g=ENSG00000008128
while the Biomart query,
http://www.biomart.org/biomart/martview/da8f87c419b2b3e6f81c9742da585baa ,
shows only 10. Why is there a difference in the number of transcripts?
You have this sorted now by choosing the option to show more results :)
Also, how long does the Biomart URL (like the one I've pasted in above)
store the query for?
How long the URL is stored differs from BioMart server to server
(biomart.org stores it for a day); I am not sure about ensembl.org. The
best way to bookmark your query is to hit the URL button on MartView and
store the URL it prints.
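
For reference, the long-form URL quoted elsewhere in this thread follows a
regular pattern, which can be sketched in Python as below. This is an
illustration only: the function name is made up, the layout is inferred
from that single example URL (including the "feature_page" segment), and
only one filter is handled. The URL button on MartView remains the
reliable way to get a correct link.

```python
def martview_url(dataset, attributes, filter_name, filter_value,
                 base="http://www.biomart.org/biomart/martview",
                 page="feature_page"):
    """Compose a MartView URL in the layout seen in this thread (assumed)."""
    # Each attribute is written as <dataset>.default.<page>.<attribute>,
    # and attributes are separated by '|'.
    attr_part = "|".join(
        "{0}.default.{1}.{2}".format(dataset, page, name)
        for name in attributes
    )
    # The single filter is written as <dataset>.default.filters.<name>."<value>".
    filter_part = '{0}.default.filters.{1}."{2}"'.format(
        dataset, filter_name, filter_value
    )
    return (base
            + "?VIRTUALSCHEMANAME=default"
            + "&ATTRIBUTES=" + attr_part
            + "&FILTERS=" + filter_part
            + "&VISIBLEPANEL=resultspanel")


print(martview_url(
    "hsapiens_gene_ensembl",
    ["ensembl_gene_id", "ensembl_transcript_id", "refseq_dna"],
    "ensembl_gene_id",
    "ENSG00000008128",
))
```

The result is not percent-encoded; some clients may need '|' and '"'
escaped before the URL can be used.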
Hope this helps.
Syed
Thanks
Daniel
On Thu, Aug 6, 2009 at 3:07 PM, Syed Haider <[email protected]> wrote:
Hi Daniel,
Storing data locally won't scale well from a management (updates) point
of view. That's what BioMart is meant to offer: federated access on the
fly.
Best,
Syed
Daniel Murrell wrote:
Hi Syed,
I don't have anything in particular in mind at the moment. I was just
wondering about the general practice of serving data that exists in other
databases already and whether that data should be stored in your own
database as well (maybe if access times were important) or if you should
be grabbing it live. I was thinking that I might just use other
identifiers to link to other databases from my entity pages (gene,
transcript, reagent etc...).
Thanks again for the help
Daniel
On Thu, Aug 6, 2009 at 2:52 PM, Syed Haider <[email protected]> wrote:
It really depends on the use case you want to address. It would be better
to access BioMart live and on the fly to stay up to date. Could you
please let me know what sort of usage you have in mind?
Best,
Syed
Daniel Murrell wrote:
Hi Syed,
Thanks for the info. I realise the naivety of my questions now.
I've made the following python script to achieve my ends.
# Python 2 script: send a BioMart query XML to the martservice endpoint.
import urllib
import urllib2

url = 'http://www.biomart.org/biomart/martservice'
# The XML goes into a form field named 'query'.
params = urllib.urlencode({
    'query': '''
<Query virtualSchemaName = "default" formatter = "TSV" header = "0"
       uniqueRows = "0" count = "" datasetConfigVersion = "0.6" >
    <Dataset name = "hsapiens_gene_ensembl" interface = "default" >
        <Filter name = "refseq_dna" value = "NM_001699,NM_021913"/>
        <Attribute name = "ensembl_transcript_id" />
    </Dataset>
</Query>
'''
})
# Supplying data makes urlopen issue a POST instead of a GET.
response = urllib2.urlopen(url, params).read()
print response
I've never used any REST before in Python.
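
For readers on Python 3, where urllib2 no longer exists, roughly the same
request can be sketched with urllib.request and urllib.parse. The helper
names, the 30 second timeout and the UTF-8 decoding of the response are
assumptions for illustration; the endpoint and the query are copied from
the script above, and whether that server still answers is not something
this sketch checks.

```python
import urllib.parse
import urllib.request

# Endpoint and query copied from the Python 2 script in this thread.
MARTSERVICE_URL = "http://www.biomart.org/biomart/martservice"

QUERY_XML = """<Query virtualSchemaName="default" formatter="TSV" header="0"
       uniqueRows="0" count="" datasetConfigVersion="0.6">
  <Dataset name="hsapiens_gene_ensembl" interface="default">
    <Filter name="refseq_dna" value="NM_001699,NM_021913"/>
    <Attribute name="ensembl_transcript_id"/>
  </Dataset>
</Query>"""


def build_request(url, query_xml):
    """Return a POST request carrying the XML in a form field named 'query'."""
    # urlencode yields ASCII text; Request needs the body as bytes.
    body = urllib.parse.urlencode({"query": query_xml}).encode("ascii")
    return urllib.request.Request(url, data=body)


def fetch(url=MARTSERVICE_URL, query_xml=QUERY_XML, timeout=30):
    """Send the query and return the response text (needs network access)."""
    request = build_request(url, query_xml)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Assumption: the TSV response can be decoded as UTF-8.
        return response.read().decode("utf-8")
```

Calling fetch() sends the request and returns the tab-separated rows as a
string; it is deliberately not called here because it needs the network.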
I've got one more question. Is it a common occurrence for people to query
Biomart from their web-frameworks in order to display relevant meta data
for certain entities? Are there any issues with response times here? I'm
pretty new to writing bio-informatics web front-ends. I'm guessing that
querying Biomart many times on each user access is preferable to querying
Biomart infrequently to update your own database with the information you
wish to display. Am I correct here?
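
If response times do become a concern, one option that sits between the
two approaches in the question is to query live but remember each answer
for a short while. The sketch below is a generic, hypothetical helper (the
class name and parameters are invented here and it is not part of
BioMart); the lookup function it wraps is whatever code performs the real
query.

```python
import time


class TTLCache:
    """Remember results of a lookup function for a limited time."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch        # function that performs the real lookup
        self._ttl = ttl_seconds    # how long a stored answer stays valid
        self._clock = clock        # injectable so it can be tested
        self._store = {}           # key -> (expiry time, value)

    def get(self, key):
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]          # still fresh: no remote call
        value = self._fetch(key)   # missing or expired: look it up again
        self._store[key] = (now + self._ttl, value)
        return value


# Stand-in lookup for demonstration; a real one would query the server.
def fake_lookup(refseq_id):
    return "result for " + refseq_id


cache = TTLCache(fake_lookup, ttl_seconds=3600)
print(cache.get("NM_001699"))
print(cache.get("NM_001699"))  # served from memory the second time
```

A short lifetime keeps the pages close to the live data while avoiding a
remote call on every page view; the right value depends on how often the
source data changes.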
Thanks again
Daniel
On Thu, Aug 6, 2009 at 2:19 PM, Syed Haider <[email protected]> wrote:
Hi Daniel,
Web service access to any BioMart server (biomart.org, ensembl.org, etc.)
is possible over REST as well as SOAP; REST is easier, though.
Retrieving data over the web services does not depend on the choice of
language: Python, Perl, Java, C, etc. should all work just the same.
Hence, you can send the query XML from any language and get your results
back in one of the available formats. The main task is to formulate the
XML query. MartView (the BioMart web interface) helps you build the
query; hit the XML button to see the XML equivalent. For how to send it
to the BioMart server (POST), please see sections a and b of this page:
http://www.biomart.org/martservice.html
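
As an alternative to copying the XML from MartView, the query document can
also be assembled in code. This is a minimal sketch using Python's
standard xml.etree.ElementTree; the function name is made up, and the
element and attribute names are taken from the query quoted elsewhere in
this thread rather than from a formal schema, so treat them as
assumptions.

```python
import xml.etree.ElementTree as ET


def build_query(dataset, filters, attributes):
    """Build a BioMart-style query XML string.

    filters maps a filter name to a list of values (joined with commas);
    attributes is a list of attribute names to return.
    """
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "0",
        "uniqueRows": "0",
        "count": "",
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset",
                       {"name": dataset, "interface": "default"})
    for name, values in filters.items():
        ET.SubElement(ds, "Filter",
                      {"name": name, "value": ",".join(values)})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    # encoding="unicode" returns a str rather than bytes.
    return ET.tostring(query, encoding="unicode")


xml_text = build_query(
    "hsapiens_gene_ensembl",
    {"refseq_dna": ["NM_001699", "NM_021913"]},
    ["ensembl_transcript_id"],
)
print(xml_text)
```

Building the document this way lets the library handle escaping of
attribute values; the resulting string is what would be sent in the POST.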
Best,
Syed
Daniel Murrell wrote:
Hi Glenn,
For the moment, I would like to do simple things like look for all the
Ensembl Transcript IDs associated with certain RefSeq DNA IDs. I know I
could do this with R scripts or even the web interface itself, but I
think looking for a Python-based solution might be more beneficial to me
in the long term, when I might have to do things on the fly, like
generating data to display on a webpage that is not directly stored in my
database but can be obtained through an API. My web framework uses
Python, hence the problem.
Would the Biomart team be able to recommend any Python libraries that
assist in communication with the webservices particularly well? Do they
know of other groups who use Python to interface with Biomart, and which
solutions they have come up with?
Thanks
Daniel
On Thu, Aug 6, 2009 at 1:44 PM, Glenn Proctor <[email protected]> wrote:
Hi Daniel
As far as I know there isn't a Python API for Biomart - Syed, is this the
case? There is a Perl API and also the BiomaRt/Bioconductor package that
is R-based.
However I'm sure there will be Python libraries that will assist you in
talking to webservices. What is it that you want to do, specifically?
Glenn.
On Thu, Aug 6, 2009 at 1:28 PM, Daniel Murrell <[email protected]> wrote:
Hi Glenn,
Do you mind asking the person that deals with the Biomart site what the
best way to use the API through Python is? Are REST and SOAP the only
options or are there other external APIs that I might be unaware of
(googling didn't get me far here)?
Thank you
Daniel