Kevin,

The best way forward is to try the email option in MartView. It performs all the database retrieval and storage on the server side and sends you a link to download the results when they are ready.

Syed


On 10/11/2011 21:50, Kevin C. Dorff wrote:
I've modified my script to fetch these one chromosome at a time, as you 
mentioned.

My fear is that given that the splits are very unbalanced (some chromosomes are clearly 
going to be much larger files than others) I'll still get timeouts. For instance, I am 
currently transferring chromosome "X" and it is exhibiting the same stalling / 
bursting behavior. 18 minutes so far and it's at 56.9MB (now at 22 minutes it is at 
62.7MB) and just occasionally adding new data to the file but most of the time just 
sitting there transferring nothing. This feels to me like there is some flaw in the 
transfer system unless you've designed it to really throttle any transfers over a certain 
size and are throttling very, very aggressively. I'll review the transfer output tomorrow 
morning for timeouts, etc.

Kevin

On Thu, Nov 10, 2011 at 1:03 PM, Junjun Zhang <[email protected]> wrote:
Hi Kevin,

BioMart 0.7 does not handle large or long-running queries well (the SNP marts 
are large), and the recent high server load may have made things worse. There are 
two options you can use to alleviate the situation.


  1.  Break the query down into multiple queries, say, one query per chromosome. To do this, add a filter 
like <Filter name = "chr_name" value = "1"/> to your query. This way you can 
track each query more easily and rerun a failed query separately.
  2.  Use the email notification option in the MartView web GUI (this is not 
available for script-driven queries).
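
Option 1 above could be scripted roughly as follows. This is only a sketch: the attribute list and dataset name are copied from the query quoted later in this thread, while the function names build_query and pending and the chromosome list are illustrative assumptions, not part of BioMart.

```python
from xml.sax.saxutils import quoteattr

# Attributes copied from the original query in this thread.
ATTRIBUTES = [
    "chr_name",
    "chrom_start",
    "refsnp_id",
    "ensembl_gene_stable_id",
    "consequence_type_tv",
]

# Assumption: the mart names mouse chromosomes 1-19, X and Y.
# Check the values the chr_name filter really accepts before relying on this.
CHROMOSOMES = [str(n) for n in range(1, 20)] + ["X", "Y"]


def build_query(chromosome, dataset="mmusculus_snp"):
    """Return the martservice query XML restricted to one chromosome."""
    attrs = "".join(f"<Attribute name={quoteattr(a)}/>" for a in ATTRIBUTES)
    return (
        '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>'
        '<Query virtualSchemaName="default" formatter="TSV" header="0" '
        'uniqueRows="0" count="" datasetConfigVersion="0.6">'
        f'<Dataset name={quoteattr(dataset)} interface="default">'
        f'<Filter name="chr_name" value={quoteattr(chromosome)}/>'
        f"{attrs}</Dataset></Query>"
    )


def pending(chromosomes, completed):
    """Chromosomes still to fetch, so that only failed ones are rerun."""
    done = set(completed)
    return [c for c in chromosomes if c not in done]
```

A driver script would loop over pending(CHROMOSOMES, completed), submit build_query(c) for each one, and record a chromosome as completed only after its download finishes without a timeout.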

Hope this helps, let us know how it goes.

Best regards,
Junjun


From: "Kevin C. Dorff" <[email protected]>
Date: Thu, 10 Nov 2011 12:09:20 -0500
To: "[email protected]" <[email protected]>
Subject: Re: [BioMart Users] Transfer very, very slow from martservice on large-ish requests

Hi Arek,

Thanks for responding. I saw that biomart.org was back up so I tried again. I 
started my script and I am seeing the exact same effect as before. It starts 
normally, then after 10MB or so it only periodically bursts a bit of data across. 
In 34 minutes I am up to 113M (now at 37 minutes I am at 118M, but most of the 
time is spent sending no data at all). This has been happening for a little over 
a week, but I cannot speak to when it might have started because it has been 
several months since I last did these data transfers from martservice (this is 
something I only do a few times a year). Looking back over my logs, these larger 
files have timed out nearly every time, some at around 5 hours, some at more 
than 10.

If you could look into this, it would be greatly appreciated.

Thanks,
Kevin

On Thu, Nov 10, 2011 at 7:37 AM, Arek Kasprzyk <[email protected]> wrote:
Hi Kevin
There seem to have been some problems with the service recently, and 
biomart.org is now down. The OICR team is working to restore the service. Once 
it is restored, please try again and let us know if you are still experiencing 
those problems, and we'll be able to look into it in more detail.

a

On Wed, Nov 9, 2011 at 11:59 AM, Kevin C. Dorff <[email protected]> wrote:
Hi,

I periodically grab annotation files in TSV format using martservice via XML. One 
of the three files I transfer is relatively large (>500MB). It starts 
transferring at a normal speed, but not far into the file (10MB or so?) the 
transfer speed bottoms out and then periodically bursts a little bit of data 
at a time before stopping again for a while. A transfer that previously 
took maybe a couple of hours can now take 10-20 hours, it seems, or worse, the 
connection times out and after 10+ hours of transferring data I end up with 
an incomplete file.

I am using Curl to download the file. An example command line I would use that 
exhibits the problem is

curl -o var-annotations-unsorted.tsv.body --tr-encoding --verbose -d @query.xml \
    http://www.biomart.org/biomart/martservice

where query.xml contains the following data (in the actual file the XML portion 
is URL-encoded, per the Curl directions):

query=<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>
<Query virtualSchemaName="default" formatter="TSV" header="0" uniqueRows="0" count="" datasetConfigVersion="0.6">
<Dataset name="mmusculus_snp" interface = "default">
<Attribute name="chr_name"/>
<Attribute name="chrom_start"/>
<Attribute name="refsnp_id"/>
<Attribute name="ensembl_gene_stable_id"/>
<Attribute name="consequence_type_tv"/>
</Dataset></Query>
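
As an illustration of the URL-encoding step, the body of query.xml can be generated with Python's standard library rather than by hand. The helper name encode_query_body is hypothetical; the sketch only assumes that martservice expects a form field named query, as in the example above.

```python
from urllib.parse import urlencode


def encode_query_body(query_xml):
    """Return the form-encoded POST body: 'query=<percent-encoded XML>'."""
    # urlencode percent-encodes characters such as <, >, = and quotes,
    # and encodes spaces as '+', which is standard for form data.
    return urlencode({"query": query_xml})
```

Writing the returned string to query.xml gives a file that can be posted with the same "-d @query.xml" option used in the command above.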

I've tried this from both my work network and my home network to verify it 
wasn't an issue with our work network, and the same throttling behavior is 
exhibited. I'd be somewhat less concerned about it taking 10+ hours to complete 
if the transfers didn't sometimes time out after many hours of transfer.

Secondarily, I was hoping to speed up the transfer by passing the 
"--tr-encoding" or "--compressed" option to Curl, which would allow the server 
to send the file over the wire as gzip. It seems your server doesn't support this, 
which is too bad because it could easily cut the number of bytes transferred by a 
factor of 10 or more. I've tried both options and neither seems to do anything with 
the martservice servers. Is there some other option I could specify that would 
compress the data over the wire or before transfer? I can handle nearly any file 
format on my side and would do nearly anything you offer to speed up these 
transfers.
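
For reference, a client-side check for compressed responses could look like the sketch below. It asks for gzip via the Accept-Encoding header and decompresses only when the response's Content-Encoding header says the body is gzip. The helper names are hypothetical, and nothing in this thread indicates that martservice honors the header; so far the server appears to ignore it.

```python
import gzip
from urllib.request import Request

# URL taken from the curl command earlier in this message.
MARTSERVICE_URL = "http://www.biomart.org/biomart/martservice"


def build_request(body, url=MARTSERVICE_URL):
    """Build a POST request that advertises gzip support to the server."""
    # body is the form-encoded string, e.g. 'query=%3C%3Fxml...'.
    return Request(url, data=body.encode("ascii"),
                   headers={"Accept-Encoding": "gzip"})


def decode_body(content_encoding, raw):
    """Decompress only if the server reports a gzip-encoded body."""
    if (content_encoding or "").strip().lower() == "gzip":
        return gzip.decompress(raw)
    return raw
```

After opening the request with urllib.request.urlopen, the Content-Encoding response header shows whether compression was actually applied. For a file of 500MB or more the response would be better streamed to disk in chunks than read into memory in one call.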

Any suggestions?
Kevin


_______________________________________________
Users mailing list
[email protected]
https://lists.biomart.org/mailman/listinfo/users




