Hello. 

A "500 read timeout" error is usually caused by a large query that takes a 
long time to run. While the query is being processed, the delay can exceed 
the maximum response time allowed by the web server you are communicating 
with via MartURLLocation. 

There are two solutions:

a) break your query down into smaller pieces (e.g. each with a smaller set of 
protein IDs in your ensembl_peptide_id filter), then programmatically recombine 
the results,

b) install a local mirror of the BioMart databases that you need so that you 
can configure your registry to use direct database connections instead of using 
MartURLLocations. Direct database connections are faster and do not suffer from 
timeouts imposed by web servers.
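
As a rough sketch of option (a), the batching could look something like the 
following. The helper names chunk_ids and run_in_batches, the placeholder IDs 
and the batch size of 500 are all made up for illustration; none of them come 
from the BioMart API, and the fetch routine here is a stand-in so that the 
sketch runs without BioMart installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a list of IDs into batches of at most $size elements.
# Returns a list of array references (hypothetical helper, not BioMart API).
sub chunk_ids {
    my ($ids, $size) = @_;
    die "batch size must be a positive integer\n" unless $size && $size > 0;
    my @pending = @$ids;    # work on a copy so the caller's array is untouched
    my @batches;
    push @batches, [ splice(@pending, 0, $size) ] while @pending;
    return @batches;
}

# Call $fetch once per batch and collect everything it returns, in order.
# $fetch is a code reference that receives one array reference of IDs.
sub run_in_batches {
    my ($ids, $size, $fetch) = @_;
    my @results;
    for my $batch (chunk_ids($ids, $size)) {
        push @results, $fetch->($batch);
    }
    return @results;
}

# Placeholder IDs and a stand-in fetch routine (assumptions for the sketch).
my @peptide_ids = map { sprintf "ID%05d", $_ } 1 .. 1050;
my @records = run_in_batches(\@peptide_ids, 500, sub {
    my ($batch) = @_;
    # In the real script: build a fresh BioMart::Query here, pass $batch to
    # addFilter("ensembl_peptide_id", $batch) and execute it.
    return map { "record for $_" } @$batch;
});

printf "%d records from %d batches\n",
    scalar @records, scalar chunk_ids(\@peptide_ids, 500);
```

In your script the stand-in routine would be replaced by the same setDataset, 
addFilter, addAttribute and execute calls you already have, run once per batch 
on a new query object, with the outputs concatenated afterwards. The batch 
size is an arbitrary starting point rather than a documented limit, so it may 
need tuning until each request finishes before the server's timeout.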

cheers,
Richard

On 20 Nov 2009, at 19:00, Chris Grassa wrote:

> Hello,
> 
> I have been having some trouble downloading sequence data via BioMart's Perl 
> API.  I've been trying to obtain sets of coding sequences (maybe on the order 
> of 35MB or so each), but every time I execute the script, the following error 
> is returned:
> 
> Problems with the web server: 500 read timeout
> 
> I seem to be getting exactly 22 sequences every time, instead of the 20,000 
> or so requested.  I certainly would appreciate any help you may be able to 
> offer.  I have included the code I am using below, which I mostly copied from 
> the Perl button on the Martview website.  Below the perl, I have included the 
> XML contained in the registry file.  Perhaps the data are available from a 
> host receiving less traffic?
> 
> Regards and best wishes,
> 
> Chris Grassa
> 
> S. Tonia Hsieh Lab
> University of Florida
> 
> 
> 
> 
> #!/usr/bin/perl -w
> 
> # An example script demonstrating the use of BioMart API.
> # This perl API representation is only available for configuration versions >= 0.5
> use strict;
> use BioMart::Initializer;
> use BioMart::Query;
> use BioMart::QueryRunner;
> 
> my $confFile = 
> "/home/grassa/src/biomart-perl/conf/ensembl_mart_56_registry.xml";
> #
> # NB: change action to 'clean' if you wish to start a fresh configuration
> # and to 'cached' if you want to skip configuration step on subsequent runs
> # from the same registry
> #
> 
> my $action='cached';
> my $initializer = BioMart::Initializer->new('registryFile'=>$confFile, 
> 'action'=>$action);
> my $registry = $initializer->getRegistry;
> 
> my $query = 
> BioMart::Query->new('registry'=>$registry,'virtualSchemaName'=>'default');
> 
> 
>       $query->setDataset("btaurus_gene_ensembl");
>       $query->addFilter("ensembl_peptide_id", [large array of Ensembl protein 
> IDs]);
>       $query->addAttribute("ensembl_gene_id");
>       $query->addAttribute("ensembl_transcript_id");
>       $query->addAttribute("coding");
>       $query->addAttribute("ensembl_peptide_id");
> 
> $query->formatter("FASTA");
> 
> my $query_runner = BioMart::QueryRunner->new();
> ############################## GET COUNT ############################
> # $query->count(1);
> # $query_runner->execute($query);
> # print $query_runner->getCount();
> #####################################################################
> 
> 
> ############################## GET RESULTS ##########################
> # to obtain unique rows only
> # $query_runner->uniqueRowsOnly(1);
> 
> $query_runner->execute($query);
> $query_runner->printHeader();
> $query_runner->printResults();
> $query_runner->printFooter();
> #####################################################################
> 
> 
> 
> 
> 
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE MartRegistry>
> <MartRegistry>
>  <MartURLLocation database="ensembl_mart_56" default="1" displayName="ENSEMBL 
> 56 GENES (SANGER UK)" host="www.biomart.org" includeDatasets="" martUser="" 
> name="ensembl" path="/biomart/martservice" port="80" 
> serverVirtualSchema="default" visible="1" />
>  <MartURLLocation database="snp_mart_56" default="0" displayName="ENSEMBL 56 
> VARIATION  (SANGER UK)" host="www.biomart.org" includeDatasets="" martUser="" 
> name="snp" path="/biomart/martservice" port="80" 
> serverVirtualSchema="default" visible="1" />
>  <MartURLLocation database="functional_genomics_mart_56" default="0" 
> displayName="ENSEMBL 56 FUNCTIONAL GENOMICS (SANGER UK)" 
> host="www.biomart.org" includeDatasets="" martUser="" 
> name="functional_genomics" path="/biomart/martservice" port="80" 
> serverVirtualSchema="default" visible="1" />
>  <MartURLLocation database="vega_mart_56" default="0" displayName="VEGA 36  
> (SANGER UK)" host="www.biomart.org" includeDatasets="" martUser="" 
> name="vega" path="/biomart/martservice" port="80" 
> serverVirtualSchema="default" visible="1" />
>  <MartURLLocation database="genomic_features_mart_56" default="0" 
> displayName="ENSEMBL 56 GENOMIC FEATURES (SANGER UK)" host="www.biomart.org" 
> includeDatasets="" martUser="" name="genomic_features" 
> path="/biomart/martservice" port="80" serverVirtualSchema="default" 
> visible="0" />
>  <MartURLLocation database="ontology_mart_56" default="0" 
> displayName="ENSEMBL 56 ONTOLOGY (SANGER UK)" host="www.biomart.org" 
> includeDatasets="" martUser="" name="ontology" path="/biomart/martservice" 
> port="80" serverVirtualSchema="default" visible="0" />
>  <MartURLLocation database="sequence_mart_56" default="0" 
> displayName="ENSEMBL 56 SEQUENCE (SANGER UK)" host="www.biomart.org" 
> includeDatasets="" martUser="" name="sequence" path="/biomart/martservice" 
> port="80" serverVirtualSchema="default" visible="0" />
> </MartRegistry>
> 
> 
> 
> 
> 
> --

--
Richard Holland, BSc MBCS
Operations and Delivery Director, Eagle Genomics Ltd
T: +44 (0)1223 654481 ext 3 | E: holl...@eaglegenomics.com
http://www.eaglegenomics.com/
