On 31 May 2006, at 14:13, Damian Smedley wrote:
Hi Ellen,
Apologies for the delay. Basically, there is no way of doing the query
you describe all in one step with our current datasets using the web or
the API, but it would be pretty easy to do some post-processing of the
results of a query to get the results.
Just to add to Damian's comments: it would be quite easy for us to
replace the boolean filters 'is 5 upstream' and 'is 3 upstream' with a
more informative filter storing the coordinates of a given SNP with
respect to a gene, with a user-defined cut-off, in a similar manner to
what we have now for the 'flank region' of a given upstream or
downstream sequence. That way you would not have to do any
post-processing and could use the API or MartView in the standard way.
a.
I would write a Perl script using our API (you can see some examples in
biomart-plib/scripts) to perform a gene dataset query that filters on
your gene IDs and retrieves the gene_id, snp_id, snp_position, and gene
start and end. I would then loop over the results, printing only the
rows where the snp_position is within start-1000..start or
end..end+1000.
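
The filtering step described above could be sketched roughly as follows.
This is a Python illustration of the post-processing only, not the Perl
API script itself. It assumes the query results have been saved as
tab-separated rows in the order gene_id, snp_id, snp_position,
gene_start, gene_end, with no header line; the function names, the
include_gene_body option, and the sample IDs are hypothetical.

```python
import csv

FLANK = 1000  # flank size in bp; the 1 kb cut-off from the thread


def in_flank(snp_pos, gene_start, gene_end, flank=FLANK,
             include_gene_body=False):
    """Return True if the SNP lies in start-flank..start or end..end+flank.

    With include_gene_body=True, SNPs inside the gene are accepted too
    (an assumption about what the final report should contain).
    """
    upstream = gene_start - flank <= snp_pos <= gene_start
    downstream = gene_end <= snp_pos <= gene_end + flank
    inside = include_gene_body and gene_start <= snp_pos <= gene_end
    return upstream or downstream or inside


def filter_rows(lines, flank=FLANK, include_gene_body=False):
    """Yield the rows whose SNP position passes the flank test.

    Assumed column order (hypothetical export layout):
    gene_id, snp_id, snp_position, gene_start, gene_end
    """
    for row in csv.reader(lines, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        gene_id, snp_id, snp_pos, start, end = row
        if in_flank(int(snp_pos), int(start), int(end),
                    flank, include_gene_body):
            yield row


# Made-up sample rows, for illustration only.
SAMPLE = (
    "geneA\tsnp1\t9500\t10000\t20000\n"   # within 1 kb before the start
    "geneA\tsnp2\t15000\t10000\t20000\n"  # inside the gene
    "geneA\tsnp3\t20800\t10000\t20000\n"  # within 1 kb after the end
    "geneA\tsnp4\t25000\t10000\t20000\n"  # too far away
)

if __name__ == "__main__":
    for row in filter_rows(SAMPLE.splitlines()):
        print("\t".join(row))
```

Passing include_gene_body=True would also keep SNPs that fall inside the
gene, which may be closer to the original request (SNPs within the gene
and within 1 kb of either end).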
Let us know if you want some help constructing the script. You could
construct the query using the API directly or using a Query XML (again
examples in biomart-plib/scripts).
Another option is to use Taverna (there is a link from the main
biomart.org page, I think). Tom, who is the main developer on Taverna,
showed me a nice pipeline he constructed to do just such a query using
BioMart queries as part of the workflow, and I'm sure he could share
this with you.
Hope this helps, and we will look into making this simpler in the future.
Best wishes
Damian
Hi all,
I am wondering if the following is possible and, if so, whether somebody
could suggest how to code it, please.
I want to search for genes by their ID and find all SNPs within the gene
(fair enough, I can do this!), but also within (e.g.) ±1 kb of the
start/end of these genes.
Any examples, web or mart XML, would be very gratefully received.
Many thanks,
Ellen.
--------------------------------------------------------------------
Ellen Adlem
JDRF/WT Diabetes and Inflammation Laboratory
Cambridge Institute for Medical Research
University of Cambridge
Wellcome Trust/MRC Building
Addenbrooke's Hospital
Cambridge
CB2 2XY
--------------------------------------------------------------------
------------------------------------------------------------------------
Arek Kasprzyk
EMBL-European Bioinformatics Institute.
Wellcome Trust Genome Campus, Hinxton,
Cambridge CB10 1SD, UK.
Tel: +44-(0)1223-494606
Fax: +44-(0)1223-494468
------------------------------------------------------------------------