Hi Kevin,

BioMart 0.7 does not handle large or long-running queries well (the snp 
marts are large), and the recent high server load may have made things worse. 
There are two options you can use to alleviate the situation.


 1.  Break the query down into multiple queries, say, one query per chromosome. 
To do this, add a filter like <Filter name = "chr_name" value = "1"/> to your 
query. This way, you can track each query more easily and rerun only the 
ones that fail.
 2.  Use the email notification option in the martview web GUI (this is not 
available for script-driven queries).
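
As a rough sketch of option 1, the script below builds one query per chromosome and downloads each to its own file. The dataset, attributes and URL are taken from Kevin's original query; the function names, the output file names and the chromosome list (1-19, X, Y, MT for mouse) are assumptions made for illustration, and the chr_name values should be checked against the mart before relying on them.

```shell
#!/bin/sh
# Sketch only: one martservice request per chromosome instead of one huge one.
# build_query, fetch_chromosome and the file names are hypothetical helpers.

MART_URL="http://www.biomart.org/biomart/martservice"

# Print the query XML for one chromosome (the chr_name filter is the only
# addition to the original query).
build_query() {
    chr="$1"
    printf '%s' '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>'
    printf '%s' '<Query virtualSchemaName="default" formatter="TSV" header="0" uniqueRows="0" count="" datasetConfigVersion="0.6">'
    printf '%s' '<Dataset name="mmusculus_snp" interface="default">'
    printf '<Filter name="chr_name" value="%s"/>' "$chr"
    printf '%s' '<Attribute name="chr_name"/><Attribute name="chrom_start"/>'
    printf '%s' '<Attribute name="refsnp_id"/><Attribute name="ensembl_gene_stable_id"/>'
    printf '%s' '<Attribute name="consequence_type_tv"/>'
    printf '%s\n' '</Dataset></Query>'
}

# Download one chromosome. The data goes to a .part file first, so a failed
# transfer never leaves something that looks like a finished file.
fetch_chromosome() {
    chr="$1"
    out="var-annotations-chr${chr}.tsv"
    # --data-urlencode makes curl URL-encode the XML for us;
    # --fail turns HTTP-level errors into a non-zero exit status.
    if curl --fail --silent --show-error \
            --data-urlencode "query=$(build_query "$chr")" \
            -o "${out}.part" "$MART_URL"; then
        mv "${out}.part" "$out"
    else
        echo "chromosome $chr failed" >&2
        return 1
    fi
}

# Only contact the server when the script is started with --run.
if [ "${1:-}" = "--run" ]; then
    failed=""
    for chr in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 X Y MT; do
        fetch_chromosome "$chr" || failed="$failed $chr"
    done
    [ -z "$failed" ] || echo "rerun these chromosomes:$failed" >&2
fi
```

Running the script with --run fetches every chromosome in turn and lists the ones that need to be rerun. Note that --fail only catches HTTP-level errors, so a file cut short by a dropped connection or a server-side error message returned inside the response body would still need a separate check.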

Hope this helps, let us know how it goes.

Best regards,
Junjun


From: "Kevin C. Dorff" <[email protected]>
Date: Thu, 10 Nov 2011 12:09:20 -0500
To: "[email protected]" <[email protected]>
Subject: Re: [BioMart Users] Transfer very, very slow from martservice on 
large-ish requests

Hi Arek,

Thanks for responding. I saw that biomart.org was back up so I tried again. I 
started my script and I am seeing the exact same effect as before. It starts 
normally, then after 10MB or so it only periodically bursts a bit of data 
across. In 34 minutes I am up to 113M (now at 37 minutes I am at 118M, but 
most of the time is spent sending no data at all). This has been happening for 
a little over a week, but I cannot say when it might have started because it 
has been several months since I last did these data transfers from martservice 
(this is something I only do a few times a year). Looking back over my logs, 
these larger files have timed out nearly every time, some after around 5 
hours, some after more than 10.

If you could look into this for me, it would be greatly appreciated.

Thanks,
Kevin

On Thu, Nov 10, 2011 at 7:37 AM, Arek Kasprzyk 
<[email protected]> wrote:
Hi Kevin,
there seem to have been some problems with the service recently, and 
biomart.org is now down. The OICR team is working to restore the service. Once 
it is restored, please try again and let us know if you are still experiencing 
those problems, and we'll be able to look into it in more detail.

a

On Wed, Nov 9, 2011 at 11:59 AM, Kevin C. Dorff 
<[email protected]> wrote:
Hi,

I periodically grab annotation files in TSV format using martservice via XML. 
One of the three files I transfer is relatively large (>500MB). It starts 
transferring at a normal speed but before too far into the file (10MB or so?) 
the transfer speed just bottoms out and then periodically bursts a little bit 
of data at a time before stopping transfer for a while again. The transfer that 
previously took maybe a couple hours now can take 10-20 hours, it seems, or 
worse, the connection just times out and after 10+ hours of transferring data I 
end up with an incomplete file.

I am using Curl to download the file. An example command line I would use that 
exhibits the problem is

curl -o var-annotations-unsorted.tsv.body --tr-encoding --verbose -d @query.xml 
http://www.biomart.org/biomart/martservice

where query.xml contains the data below (in the actual file the XML portion is 
URL-encoded, per curl's directions):

query=<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>
<Query virtualSchemaName="default" formatter="TSV" header="0" uniqueRows="0" count="" datasetConfigVersion="0.6" >
<Dataset name="mmusculus_snp" interface = "default" >
<Attribute name="chr_name"/>
<Attribute name="chrom_start"/>
<Attribute name="refsnp_id"/>
<Attribute name="ensembl_gene_stable_id"/>
<Attribute name="consequence_type_tv"/>
</Dataset></Query>

I've tried this from both my work network and my home network to verify it 
wasn't an issue with our work network, and the same throttling behavior is 
exhibited. I'd be somewhat less concerned about it taking 10+ hours to complete 
if the transfers didn't sometimes time out after many hours of transfer.

Secondly, I was hoping to speed up the transfer by passing the "--tr-encoding" 
or "--compressed" option to Curl, which would allow the server to send the 
file over the wire as gzip, but it seems your server doesn't support this. 
That is too bad, because it could easily cut the number of bytes transferred 
by a factor of 10 or more. I've tried both options and neither seems to do 
anything with the martservice servers. Is there some other option I could 
specify that would compress the data over the wire or before transfer? I can 
handle nearly any file format on my side and would do nearly anything you 
offer to speed up these transfers.

Any suggestions?
Kevin


_______________________________________________
Users mailing list
[email protected]
https://lists.biomart.org/mailman/listinfo/users


