Hi Arek,

Thanks for responding. I saw that biomart.org was back up so I tried again.
I started my script and I am seeing the exact same effect as before. It
starts normally, then after 10 MB or so it only periodically bursts a bit
of data across. In 34 minutes I am up to 113M (now at 37 minutes I am at
118M, but most of the time is spent sending no data at all). This has been
happening like this for a little over a week, but I cannot speak to when it
might have started because it has been several months since I did these
data transfers from martservice (this is something I only do a few times a
year). Looking back over my logs, these larger files have timed out nearly
every time, some at around 5 hours, some after more than 10 hours.
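
As a rough check on those figures, the transfer rates can be worked out
with a short Python sketch. It assumes "113M" and "118M" mean megabytes and
that the file is exactly 500 MB (only ">500MB" is known); the function
names are made up for illustration.

```python
def average_rate_kb_per_s(megabytes: float, minutes: float) -> float:
    """Average throughput in KB/s, treating 1 MB as 1024 KB."""
    return megabytes * 1024 / (minutes * 60)


def hours_remaining(total_mb: float, done_mb: float, rate_kb_per_s: float) -> float:
    """Hours left if the given rate held for the rest of the transfer."""
    return (total_mb - done_mb) * 1024 / rate_kb_per_s / 3600


# Figures from the message: 113 MB after 34 minutes, 118 MB after 37 minutes.
overall = average_rate_kb_per_s(113, 34)
recent = average_rate_kb_per_s(118 - 113, 37 - 34)

# ASSUMPTION: the file is exactly 500 MB; the thread only says ">500MB".
TOTAL_MB = 500
print(f"overall: {overall:.1f} KB/s, last 3 minutes: {recent:.1f} KB/s")
print(f"remaining at the recent rate: {hours_remaining(TOTAL_MB, 118, recent):.1f} h")
```

The projection only holds if the recent rate stays constant, so it is a
ballpark figure rather than a prediction.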

If you could look into this for me, it would be greatly appreciated.
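
In case it helps with reproducing the problem, the URL-encoded query.xml
body described in the quoted message below could be generated along these
lines. This is a minimal Python sketch, not the script actually used;
build_post_body is a hypothetical helper, and the XML is the quoted query
with its wrapped lines rejoined.

```python
from urllib.parse import urlencode

# The query XML from the quoted message, with the wrapped line breaks rejoined.
QUERY_XML = (
    '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>'
    '<Query virtualSchemaName="default" formatter="TSV" header="0" '
    'uniqueRows="0" count="" datasetConfigVersion="0.6">'
    '<Dataset name="mmusculus_snp" interface="default">'
    '<Attribute name="chr_name"/>'
    '<Attribute name="chrom_start"/>'
    '<Attribute name="refsnp_id"/>'
    '<Attribute name="ensembl_gene_stable_id"/>'
    '<Attribute name="consequence_type_tv"/>'
    '</Dataset></Query>'
)


def build_post_body(xml: str) -> str:
    """Return 'query=<percent-encoded XML>' for use with curl -d @file."""
    return urlencode({"query": xml})


if __name__ == "__main__":
    # Writes the file that the curl command reads with `-d @query.xml`.
    with open("query.xml", "w", encoding="ascii") as handle:
        handle.write(build_post_body(QUERY_XML))
```

Only the XML value is percent-encoded; the leading "query=" stays literal,
which is the form-encoded layout that curl's -d option posts as-is.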

Thanks,
Kevin

On Thu, Nov 10, 2011 at 7:37 AM, Arek Kasprzyk <[email protected]> wrote:

> Hi Kevin
> there seem to be some problems with the service recently and now
> biomart.org is down. The OICR team are working to restore the service.
> Once it is restored, please try again and let us know if you are still
> experiencing those problems, and we'll be able to look into it in more
> detail.
>
> a
>
> On Wed, Nov 9, 2011 at 11:59 AM, Kevin C. Dorff
> <[email protected]> wrote:
>
>> Hi,
>>
>> I periodically grab annotation files in TSV format using martservice via
>> XML. One of the three files I transfer is relatively large (>500MB). It
>> starts transferring at a normal speed but before too far into the file
>> (10MB or so?) the transfer speed just bottoms out and then periodically
>> bursts a little bit of data at a time before stopping transfer for a while
>> again. The transfer that previously took maybe a couple hours now can take
>> 10-20 hours, it seems, or worse, the connection just times out and after
>> 10+ hours of transferring data I end up with an incomplete file.
>>
>> I am using Curl to download the file. An example command line I would use
>> that exhibits the problem is
>>
>> curl -o var-annotations-unsorted.tsv.body --tr-encoding --verbose -d
>> @query.xml http://www.biomart.org/biomart/martservice
>>
>> where query.xml contains the following data (with the XML portion
>> URL-encoded, per curl's directions):
>>
>> query=<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query><Query
>> virtualSchemaName="default" formatter="TSV" header="0" uniqueRows="0"
>> count="" datasetConfigVersion="0.6"><Dataset name="mmusculus_snp"
>> interface="default"><Attribute name="chr_name"/><Attribute
>> name="chrom_start"/><Attribute name="refsnp_id"/><Attribute
>> name="ensembl_gene_stable_id"/><Attribute
>> name="consequence_type_tv"/></Dataset></Query>
>>
>> I've tried this from both my work network and my home network to verify
>> it wasn't an issue with our work network, and the same
>> throttling behavior is exhibited. I'd be somewhat less concerned about it
>> taking 10+ hours to complete if the transfers didn't sometimes time out
>> after many hours of transfer.
>>
>> *Secondarily,* I was hoping to speed up the transfer by providing the
>> "--tr-encoding" or "--compressed" option in Curl, which would
>> allow the server to send the file over the wire as gzip, but it seems your
>> server doesn't support this, which is too bad because that could easily cut
>> down the number of bytes transferred by a factor of 10 or more. I've tried
>> both options and neither seems to do anything with the martservice servers.
>> Is there some other option I could specify that would compress the data
>> over the wire or before transfer? I can handle nearly any file format on my
>> side and would do nearly anything you offer to speed up these transfers.
>>
>> Any suggestions?
>> Kevin
>>
>>
>> _______________________________________________
>> Users mailing list
>> [email protected]
>> https://lists.biomart.org/mailman/listinfo/users
>>
>>
>
