Hi Kevin (and all other readers),

Thank you very much for your detailed answer. I decided to go with the Python-script option.
I was working through the tutorials you mentioned at https://wiki.humanconnectome.org/display/DataUse/Exploring+ConnectomeDB+with+Python . I have downloaded and installed all the libraries and managed to open a working session. However, I am unable to do the basics, such as listing the projects. For example, when I run:

>>> cdb.select.projects().get()

I get the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/pyxnat-0.9.5.3-py2.7.egg/pyxnat/core/resources.py", line 842, in get
    return [urllib.unquote(uri_last(eobj._uri)) for eobj in self]
  File "/usr/local/lib/python2.7/dist-packages/pyxnat-0.9.5.3-py2.7.egg/pyxnat/core/resources.py", line 663, in __iter__
    eid = urllib.unquote(res[id_header])
KeyError: 'ID'

However, when I run the following command and specify a project this time, there are no errors:

>>> q2 = cdb.select.project('HCP_Q2')

But when I try to see what is inside this project object, as described in the tutorial, I get an empty list where I should have got a list of subjects:

>>> [subject.label() for subject in q2.subjects()]
[]

I am running these on an Ubuntu 13.04 machine with Python 2.7.4. Am I doing something wrong?

Best wishes,
Salim

________________________________
From: "Archie, Kevin" <arch...@mir.wustl.edu>
To: Salim Arslan <salim.ars...@yahoo.com>; "hcp-users@humanconnectome.org" <hcp-users@humanconnectome.org>
Sent: Wednesday, November 5, 2014 5:02 PM
Subject: RE: [HCP-Users] Error: Server Refused Request (Code 34)

Salim,

I'm sorry you've been having trouble with the downloads. I don't know exactly what's going wrong, and I will suggest a workaround, but you might want to consider whether this is really what you want to do.
The full 500-subject dataset is 18 TB zipped. I looked at the server logs of your successful downloads to get an estimate of the data rate between here and you, and that works out to 16 days of continuous downloading to get everything (though you might want to check my arithmetic). Even if you're just getting a subset of the data, that would still be days of downloading. And once you're done, you've got a bunch of zip files to unpack and you need somewhere to put it all. Connectome in a Box (http://humanconnectome.org/data/connectome-in-a-box.html) really is a bargain if you want a substantial fraction of the data.

If I haven't sold you on the box, let's figure out how you can make this download work. I'm impressed by your script to keep the web session alive; as it turns out that won't make a difference (once the Aspera download is running, ConnectomeDB isn't involved anymore), but it was a good idea. The Aspera software does pretty well (in our experience) on the scale of hours, but doesn't seem to hold up well over days. As you suggested, if you partition the requests you can probably get better reliability, or at least lose less when something breaks.

I would look into making the requests from a Python script, using the tools described here: https://wiki.humanconnectome.org/display/DataUse/Exploring+ConnectomeDB+with+Python . There's a section, "Accessing imaging data", that shows how to use the Aspera plugin to download data. The tutorial is a little out of date (it was written after the Q1 release and then patched for Q2), but with a bit of tweaking it could be adapted to the 500-subject release. A single call to cdb.packages.download(...) acts like a single request from the webapp, so you probably want to partition the subjects and loop over the partitions, and maybe wrap it all up in some exception handling to automatically re-request any parts that fail.

If that sounds like a disaster, there's work underway to push a copy of all the HCP data to Amazon S3.
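In rough outline, that partition-and-loop approach might look something like this. This is just a sketch: the cdb.packages.download(...) call below is a placeholder modeled on the tutorial (check the tutorial for the real signature), and the chunk size and retry count are arbitrary:

```python
# Sketch of a partitioned, retrying download loop. The cdb object and
# cdb.packages.download(...) signature are assumptions based on the
# "Exploring ConnectomeDB with Python" tutorial, not a verified API.

def partition(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def download_all(cdb, subjects, package, chunk_size=25, max_retries=3):
    """Request one package download per subject chunk, retrying failures."""
    for chunk in partition(subjects, chunk_size):
        for attempt in range(max_retries):
            try:
                # Hypothetical call; adjust to the tutorial's real signature.
                cdb.packages.download(subjects=chunk, package=package)
                break
            except Exception as e:
                print('chunk starting at %s failed (attempt %d): %s'
                      % (chunk[0], attempt + 1, e))
        else:
            print('giving up on chunk starting at %s' % chunk[0])
```

With a loop like this, a broken transfer only costs you one chunk instead of the whole request, and the failed chunks get re-requested automatically.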
It'll probably be a few weeks before this is done, and if you're downloading to anywhere outside of AWS it'll be slower (and therefore less reliable in aggregate) than Aspera. But you could do your own partitioning, and if you've already worked with S3, or if dealing with S3 seems less burdensome than getting pyxnat working, this might be attractive to you. Watch this mailing list for an announcement.

Good luck, and please don't hesitate to send mail to the list if you have questions or run into trouble.

Kevin

________________________________
From: hcp-users-boun...@humanconnectome.org [hcp-users-boun...@humanconnectome.org] on behalf of Salim Arslan [salim.ars...@yahoo.com]
Sent: Wednesday, November 05, 2014 6:08 AM
To: hcp-users@humanconnectome.org
Subject: [HCP-Users] Error: Server Refused Request (Code 34)

Hi,

I have been trying to download the 500+ dataset, but, I guess due to its size, the downloading process eventually freezes, and when I try to re-establish the connection I get this error: Server Refused Request (Code 34). On the Aspera web site the error is explained as "Unauthorized by external auth server" (https://support.asperasoft.com/entries/22895528-fasp-Error-Codes) and it is not retryable. I have even used a script to periodically renew my session on the ConnectomeDB website, but it did not help. My questions are:

1. Is there a way to download the 500-subject dataset without getting this error?
2. If not, can I create my own subsets and download them separately? Let's say, dividing the whole dataset into five 100-subject subsets. (I was able to download the 100-subject dataset without getting any errors.)

Thanks in advance for any help!

Best wishes,
Salim

_______________________________________________
HCP-Users mailing list
HCP-Users@humanconnectome.org
http://lists.humanconnectome.org/mailman/listinfo/hcp-users

________________________________
The material in this message is private and may contain Protected Healthcare Information (PHI).
If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying, or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error, please immediately notify the sender via telephone or return mail.