Re my previous email, I'd like to download all human exon sequences.
Since there are a couple hundred thousand of them, Ensembl and BioMart
often time out on me. I tried telling BioMart to make a compressed web
file and email me about it, but 6 hours later it still hasn't done so.
I'm happy to do this one chromosome at a time, but I'd prefer as
lightweight a solution as possible.
Unfortunately, I can't get the web service to work. Am I right in
thinking that I just need to POST the right query to the right URL? I
tried to do so, following some instructions from the BioMart PDF docs.
I exported an XML query from BioMart:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query virtualSchemaName = "default" Header = "1" count = "" softwareVersion = "0.5" >
    <Dataset name = "hsapiens_gene_vega" interface = "default" >
        <Attribute name = "gene_stable_id" />
        <Attribute name = "chrom_name" />
        <Attribute name = "transcript_stable_id" />
        <Attribute name = "hugo" />
        <Filter name = "hugo" value = "ABL1,ABL2,AXL"/>
    </Dataset>
</Query>
I saved it as y.xml and then ran:
POST "http://www.biomart.org/biomart/martservice" < y.xml
(POST is just a link to lwp-request.)
When I try it, I get:
Query ERROR: caught BioMart::Exception: non-BioMart die(): Can't
use an undefined value as an ARRAY reference at
/ebi/www/main/biomart/www/biomart-perl/lib/BioMart/Query.pm line 1635.
Am I totally on the wrong track here? Do I need to use the BioMart Perl
API, even for something this simple?
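
For comparison, here is a minimal sketch of the same POST in Python. It
rests on one assumption that is not confirmed anywhere in this message:
that the service wants the XML sent as a form field named "query" rather
than as the raw request body, which is what the lwp-request command above
sends. The URL and the file name y.xml are the ones used above; the
function names are only illustrative.

```python
import urllib.parse
import urllib.request

# Endpoint as given in the command above.
MART_URL = "http://www.biomart.org/biomart/martservice"


def build_request(xml_text, field="query"):
    """Wrap the query XML in a form-encoded POST body.

    ASSUMPTION: the form field name "query" is a guess about what the
    service expects; change it if the docs say otherwise.
    """
    body = urllib.parse.urlencode({field: xml_text}).encode("utf-8")
    # Supplying data makes urllib issue a POST instead of a GET.
    return urllib.request.Request(MART_URL, data=body)


def run_query(xml_text):
    """Send the request and return the response text (needs network access)."""
    with urllib.request.urlopen(build_request(xml_text)) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # y.xml is the query file saved above.
    with open("y.xml") as fh:
        print(run_query(fh.read()))
```

If the form-field assumption holds, the equivalent lwp-request call would
need the same url-encoded "query=..." body instead of the bare file.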
Thanks in advance,
- Amir Karger
Research Computing
Life Sciences Division
Harvard University