This is absolutely essential information. Thank you, Tim. My standard use case is a pre-configured Ubuntu Linux worker, launched from my own AMI in the same region in which the HCP data are provided (us-east-1). I then download the data from there, which can take some time depending on the bandwidth of the chosen instance, and push my results to my own S3 repositories. I was thinking that mounting the data could be a good idea: it would potentially cut down the transfer time and solve the capacity problems of the worker, which can have limited disk space. I personally would not have a problem switching to the NITRC AMI if it behaves like a standard Linux worker.
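A minimal sketch of this copy-based workflow using the AWS CLI (the HCP/100307 prefix inside the hcp-openaccess bucket is an assumed example path, and my-results-bucket is a placeholder for your own bucket; credentials are assumed to be set up via `aws configure`):

    # Pull one subject's structural data from the HCP OpenAccess bucket:
    aws s3 sync s3://hcp-openaccess/HCP/100307/T1w/ /data/100307/T1w/

    # ...run processing on the instance...

    # Push results to your own bucket:
    aws s3 sync /data/100307/results/ s3://my-results-bucket/100307/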
Thank you once more,
Denis

On Wed, Oct 19, 2016 at 7:55 PM Timothy B. Brown <[email protected]> wrote:

Hello Denis,

I understand that Robert Oostenveld is planning to send you some materials from the latest HCP Course that illustrate how to mount the HCP OpenAccess S3 bucket as a directory accessible from a running EC2 instance. However, I'd like to clarify a few things.

First, the materials you will receive from Robert assume that you are using an Amazon EC2 instance (virtual machine) *that is based on an AMI supplied by NITRC* (analogous to a DVD of software supplied and configured by NITRC to be loaded on your virtual machine). In fact, the instructions show you how to create a new EC2 instance based on that NITRC AMI. The folks at NITRC have done a lot of the work for you (like including the necessary software to mount an S3 bucket) and provided a web interface for you to specify your credentials for accessing the HCP OpenAccess S3 bucket.

If you want to create an EC2 instance based on the NITRC AMI, then things should work well for you, and the materials Robert sends you should be helpful. But they will not be particularly useful if you are using an EC2 instance that is *not* based on the NITRC AMI. In that case, you will have to do a bit more work. You will need to install a tool called *s3fs* ("S3 File System") on your instance and then configure s3fs to mount the HCP OpenAccess S3 bucket. This configuration includes storing your AWS access key information in a secure file on your running instance. A good starting point for instructions can be found at:

https://forums.aws.amazon.com/message.jspa?messageID=313009

This may not cover all the issues you encounter, and you may have to search for other documentation on using s3fs under Linux to get things fully configured. The information at:

https://rameshpalanisamy.wordpress.com/aws/adding-s3-bucket-and-mounting-it-to-linux/

may also be helpful.

Second, once you get the S3 bucket mounted, it is very important to realize that it is *read-only* from your system. By mounting the S3 bucket using s3fs, you have not created an actual EBS volume on your system that contains the HCP OpenAccess data, just a mount point from which you can *read* the files in the S3 bucket. You will likely want to create a separate EBS volume on which to run pipelines, generate new files, and do any further analysis. To work with the data, you will want the HCP OpenAccess S3 bucket data to at least *appear* to be on that separate EBS volume.

One approach would be to selectively copy data files from the mounted S3 data onto your EBS volume. However, this would duplicate a lot of data onto the EBS volume, taking a long time and costing you money for storage of data that is already in the S3 bucket. I think a better approach is to create a directory structure on your EBS volume that contains files which are actually symbolic links to the read-only data accessible via your S3 mount point.

The materials that Robert sent (or will send) you contain instructions for how to get and use a script that I've written that creates such a directory structure of symbolic links. After looking over those instructions, if it is not obvious which script I'm referring to and how to use it, feel free to send me a follow-up question.
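On a stock Ubuntu instance (i.e. not the NITRC AMI), the s3fs setup described above might look roughly like the following minimal sketch; the package name, mount point, and cache path are assumptions, and the key values are placeholders for your HCP credentials:

    # Install s3fs (packaged on recent Ubuntu releases; older guides
    # build s3fs-fuse from source):
    sudo apt-get update && sudo apt-get install -y s3fs

    # Store the AWS access key pair in the ACCESS_KEY:SECRET_KEY format
    # that s3fs expects, readable only by you:
    echo "YOUR_ACCESS_KEY_ID:YOUR_SECRET_ACCESS_KEY" > "$HOME/.passwd-s3fs"
    chmod 600 "$HOME/.passwd-s3fs"

    # Create a mount point you own, then mount the bucket read-only:
    sudo mkdir -p /s3/hcp-openaccess
    sudo chown "$USER" /s3/hcp-openaccess
    s3fs hcp-openaccess /s3/hcp-openaccess \
        -o passwd_file="$HOME/.passwd-s3fs" \
        -o ro -o use_cache=/tmp/s3fs-cache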
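Setting up the separate EBS working volume is a standard step. Assuming a freshly attached, empty volume visible as /dev/xvdf (device names vary by instance type; it may appear as /dev/nvme1n1 on newer instances) and /data as the mount point:

    # Format and mount a newly attached, empty EBS volume:
    sudo mkfs -t ext4 /dev/xvdf
    sudo mkdir -p /data
    sudo mount /dev/xvdf /data
    sudo chown "$USER" /data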
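The symbolic-link approach can also be sketched directly. The following is a hypothetical stand-in, *not* the script Tim mentions (that comes with the HCP Course materials): it mirrors one subject's directory tree from the read-only S3 mount onto the EBS volume, linking each file instead of copying it. The subject ID and both paths are assumptions:

    #!/bin/bash
    # Hypothetical sketch: mirror a subject's directory tree, replacing
    # each file with a symlink to the read-only S3 mount.
    SRC=/s3/hcp-openaccess/HCP/100307   # read-only s3fs mount
    DST=/data/100307                    # writable EBS working copy

    # Recreate the directory structure...
    find "$SRC" -type d | while read -r dir; do
        mkdir -p "$DST${dir#"$SRC"}"
    done

    # ...and symlink every file instead of copying it:
    find "$SRC" -type f | while read -r file; do
        ln -s "$file" "$DST${file#"$SRC"}"
    done

With GNU coreutils, the same effect is available in one line, since `cp -s` makes symbolic links instead of copies: `cp -rs "$SRC" /data/`.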
Hope that's helpful,
Tim

On 10/18/2016 10:51 AM, Denis-Alexander Engemann wrote:

Dear HCPers,

I recently had a conversation with Robert, who suggested to me that it should be possible to directly mount the HCP data like an EBS volume instead of using the S3 tools to copy the data file by file. Any hint would be appreciated.

Cheers,
Denis

--
Timothy B. Brown
Business & Technology Application Analyst III
Pipeline Developer (Human Connectome Project)
tbbrown(at)wustl.edu

_______________________________________________
HCP-Users mailing list
[email protected]
http://lists.humanconnectome.org/mailman/listinfo/hcp-users
