Hi,

I periodically grab annotation files in TSV format from martservice using
XML queries. One of the three files I transfer is relatively large (>500MB).
It starts transferring at a normal speed, but not far into the file
(10MB or so?) the transfer speed bottoms out, and then the server
periodically bursts a little bit of data at a time before stopping the
transfer for a while again. A transfer that previously took maybe a couple
of hours can now take 10-20 hours, it seems, or worse, the connection just
times out and after 10+ hours of transferring data I end up with an
incomplete file.

I am using Curl to download the file. An example command line I would use
that exhibits the problem is

curl -o var-annotations-unsorted.tsv.body --tr-encoding --verbose \
  -d @query.xml http://www.biomart.org/biomart/martservice

where query.xml contains the following data (in the actual file the XML
portion is URL-encoded, per the Curl directions; it is shown unencoded here)

query=<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>
<Query virtualSchemaName="default" formatter="TSV" header="0" uniqueRows="0" count="" datasetConfigVersion="0.6" >
<Dataset name="mmusculus_snp" interface = "default" >
<Attribute name="chr_name"/>
<Attribute name="chrom_start"/>
<Attribute name="refsnp_id"/>
<Attribute name="ensembl_gene_stable_id"/>
<Attribute name="consequence_type_tv"/>
</Dataset></Query>
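
For reference, a minimal sketch of how the URL-encoded body for query.xml
could be produced from the raw XML. Curl's "-d @file" sends the file
contents as they are (minus newlines), so the encoding has to be done
beforehand. The helper name build_post_body is hypothetical; the XML is the
query quoted in this message.

```python
from urllib.parse import urlencode

# The raw (unencoded) query from this message, joined into one string.
RAW_QUERY_XML = (
    '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>'
    '<Query virtualSchemaName="default" formatter="TSV" header="0" '
    'uniqueRows="0" count="" datasetConfigVersion="0.6">'
    '<Dataset name="mmusculus_snp" interface="default">'
    '<Attribute name="chr_name"/>'
    '<Attribute name="chrom_start"/>'
    '<Attribute name="refsnp_id"/>'
    '<Attribute name="ensembl_gene_stable_id"/>'
    '<Attribute name="consequence_type_tv"/>'
    '</Dataset></Query>'
)


def build_post_body(xml: str) -> str:
    """Return 'query=<url-encoded xml>', suitable for saving as query.xml."""
    return urlencode({"query": xml})


if __name__ == "__main__":
    # Print the encoded body; redirect to query.xml to use it with curl -d.
    print(build_post_body(RAW_QUERY_XML))
```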

I've tried this from both my work network and my home network to verify it
wasn't an issue with our work network, and the same throttling behavior is
exhibited. I'd be somewhat less concerned about it taking 10+ hours to
complete if the transfers didn't sometimes time out after many hours of
transfer.
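
Because a timed-out transfer leaves a partial file behind, a cheap sanity
check on the downloaded TSV can help. The sketch below is only a heuristic
and cannot prove a file is complete: it assumes every row ends with a
newline and has five tab-separated fields, matching the five attributes in
the query in this message. The function name looks_truncated is
hypothetical.

```python
EXPECTED_COLUMNS = 5  # the query in this message requests five attributes


def looks_truncated(path: str, expected_columns: int = EXPECTED_COLUMNS) -> bool:
    """Heuristic: return True if the last line of the TSV appears cut off."""
    with open(path, "rb") as fh:
        fh.seek(0, 2)              # jump to the end to learn the file size
        size = fh.tell()
        if size == 0:
            return True            # an empty file is certainly not complete
        fh.seek(max(0, size - 65536))
        tail = fh.read()           # inspect only the final 64 KiB

    if not tail.endswith(b"\n"):
        return True                # last row was interrupted mid-line

    last_line = tail.rstrip(b"\n").split(b"\n")[-1]
    # A full row has (columns - 1) tabs, even when some fields are empty.
    return last_line.count(b"\t") != expected_columns - 1
```

A file that passes this check may still be missing rows at the end, so it
only catches transfers that were cut off in the middle of a line.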

*Secondarily,* I was hoping to speed up the transfer by providing the
"--tr-encoding" or "--compressed" option in Curl, which would
allow the server to send the file over the wire as gzip, but it seems your
server doesn't support this, which is too bad because that could easily cut
down the number of bytes transferred by a factor of 10 or more. I've tried
both options and neither seems to do anything with the martservice servers.
Is there some other option I could specify that would compress the data
over the wire or before transfer? I can handle nearly any file format on my
side and would do nearly anything you offer to speed up these transfers.
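
To make the compression question concrete, here is a minimal sketch of what
I would check. It builds a POST request that advertises gzip support
(roughly what "--compressed" asks for) and inspects response headers for a
gzip Content-Encoding. The helper names build_request and is_gzip_response
are hypothetical, and nothing here implies the server honors the header.

```python
import urllib.request

MARTSERVICE_URL = "http://www.biomart.org/biomart/martservice"


def build_request(body: str) -> urllib.request.Request:
    """Build (but do not send) a POST that advertises gzip support."""
    return urllib.request.Request(
        MARTSERVICE_URL,
        data=body.encode("ascii"),   # URL-encoded bodies are plain ASCII
        headers={"Accept-Encoding": "gzip"},
    )


def is_gzip_response(headers) -> bool:
    """Return True if the response headers declare a gzip content coding."""
    for name, value in headers.items():
        if name.lower() == "content-encoding" and "gzip" in value.lower():
            return True
    return False
```

Passing the request to urllib.request.urlopen and calling
is_gzip_response on the response headers would show whether compression
was applied. urllib does not decompress automatically, so a gzip body
would still need to be decoded, for example with the gzip module.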

Any suggestions?
Kevin