To Geoff and other hcp-users,

Thanks, Geoff, for your questions about the architecture of the HCP informatics domain. Your "interpretation #3" is closest to the mark for where HCP is headed. In many instances, users will find it desirable to have HCP process data queries centrally (on our high-performance computer linked to the ConnectomeDB database) and then return the results to Connectome Workbench running on a local platform. In other circumstances, investigators with computationally intensive queries will be better served by having a local copy of the data and processing it locally. In such cases, investigators may obtain “Connectome in a Box” at cost (hard drive plus shipping). We will provide additional information and guidance about these and other options in conjunction with our February release.

Although it's more than you asked for, I'd like to take this opportunity to give the hcp-users group a high-level picture of what will (and will not) become available in the coming months.

What's the purpose of the HCP October initial data release?

Our October data release is intended to let investigators 'get their feet wet' with an initial bolus of high-quality datasets (12 subjects, ~25 GB/subject). For example, some investigators may want to modify their own analysis tools in order to make the best use of HCP data. By providing the October dataset as an example, investigators have lead time to do work of this kind, so that they can begin real data analysis once the Q1 data release occurs in February 2013.

For two reasons, the October data are not intended for analyses that will lead immediately to scientific publications. (1) The minimally preprocessed datasets released in October come with a caveat emptor, because our pipelines have not been completely finalized. Indeed, a recent (Nov 9) email reported a glitch in how the fMRI timeseries data were processed; in the meantime we have also made several improvements to the pipelines. Version 2 of the initial datasets (the same 12 subjects released in October) is currently being reprocessed and will be released in about a week. (2) In addition, many of the 12 subjects are related to one another (twins or non-twin siblings), which could bias the results of some analyses.

What data will be released in February?

We expect to release data for ~70 subjects for whom we have complete scan sessions acquired during the first quarter. (Recall that it will take three years to scan all 1200 subjects!) This will include the unprocessed data and the minimally processed data, akin to those provided in the October data release.

Restricted Access Data. To obtain information about family structure (identification of twins and siblings), investigators will be required to sign a special data use agreement. Many investigators may instead (or also) elect to start with data from a group of 20 unrelated subjects that will be made freely accessible (so that there are no complications regarding family structure).

Fully processed data. Fully processed data will likely include 'dense connectomes' for functional and structural connectivity, plus task-fMRI activation patterns. We also hope to include 'parcellated connectomes' based on initial connectivity-based parcellations derived from HCP data - but no promises! These fully processed datasets will be based on the initial group of 20 unrelated subjects.

What is ConnectomeDB?

ConnectomeDB is the external-facing HCP database, based on the XNAT platform developed in Dan Marcus' lab.
Data storage is on a BlueArc hardware system whose eventual capacity will be ~1 petabyte. In conjunction with the February data release, we will allow the community to access HCP data via ConnectomeDB and to download selected datasets from within ConnectomeDB, albeit with file-size restrictions. Larger datasets will be downloadable via ftp or obtainable via the “Connectome in a Box” solution mentioned above.

What data mining capabilities will be offered in February?

This is a work in progress. For those familiar with XNAT, many of XNAT's search capabilities will be available in ConnectomeDB, customized to handle unique aspects of the HCP datasets. This will include options to select subgroups of individuals based on a variety of behavioral and other measures. We are also aiming to provide options to view average functional connectivity maps for different groups and different brain regions of interest.

What is the Connectome Toolbox?

The 'connectome toolbox' you asked about is not a formally defined entity; rather, it is our name for the growing collection of tools that HCP will provide to the community in addition to the data itself. This will include the Connectome Workbench visualization and analysis platform, plus resources such as the code for our analysis pipelines and our in-scanner task-fMRI tasks. We intend to make all data and tools freely available to those who want them, although this will have to occur in an orderly and logical progression as the tools themselves become ready.

There's a lot more under the hood, but hopefully this brief overview will be useful.
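Since bandwidth and storage were part of your question, here is a rough back-of-the-envelope sketch (a short Python snippet) of the data volumes behind the options above. It uses only the approximate figures quoted in this email (~25 GB of minimally preprocessed data per subject; 12, ~70, and 1200 subjects) and assumes, purely for illustration, a sustained 100 Mbit/s download link; actual per-subject sizes and transfer rates will of course vary, so treat these as order-of-magnitude estimates rather than official numbers.

    # Back-of-the-envelope HCP data volumes (illustrative only).
    # The subject counts and ~25 GB/subject figure are the approximate
    # numbers quoted in this email; the 100 Mbit/s link speed is an
    # assumption made purely for illustration.
    GB_PER_SUBJECT = 25                  # ~25 GB/subject, minimally preprocessed
    LINK_BYTES_PER_SEC = 100e6 / 8       # sustained 100 Mbit/s ~= 12.5 MB/s

    releases = [
        ("October initial release", 12),
        ("Q1 release (Feb 2013)", 70),
        ("Full study (3 years)", 1200),
    ]

    for label, n_subjects in releases:
        total_gb = n_subjects * GB_PER_SUBJECT
        days = (total_gb * 1e9) / LINK_BYTES_PER_SEC / 86400
        print(f"{label}: ~{total_gb:,} GB, ~{days:.1f} days at 100 Mbit/s")

    # Approximate output:
    #   October initial release: ~300 GB, ~0.3 days at 100 Mbit/s
    #   Q1 release (Feb 2013): ~1,750 GB, ~1.6 days at 100 Mbit/s
    #   Full study (3 years): ~30,000 GB, ~27.8 days at 100 Mbit/s

These rough numbers illustrate why we expect selected downloads and server-side processing to cover many use cases, while investigators who want large fractions of the data locally will generally be better served by ftp transfers or "Connectome in a Box."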
If you know colleagues who may be interested in this information, please feel free to forward them this email; also, encourage them to join hcp-users at http://lists.humanconnectome.org/mailman/listinfo/hcp-users

David VE

On Nov 9, 2012, at 12:18 PM, Geoff Pope wrote:

Hi. I'm trying to get a general understanding of the architecture of the system: what are all the software components, where will they run, how will we use them, and what are the bandwidth and storage requirements?

I have looked at this http://www.humanconnectome.org/connectome/ and this http://www.humanconnectome.org/about/project/informatics.html, but there is more than one possible interpretation; please clarify.

Interpretation 1:
- the 1200 subjects' data are stored on the connectome server.
- the "connectome toolbox" is software running on the connectome server.
- users log into the connectome server and run scripts which call "connectome toolbox" functions, to select subjects and scans, set up statistical tests, and produce output images.
- the workbench runs on the user's desktop machine (the client).
- the workbench downloads the output images created by the "connectome toolbox" running on the server, and displays them (the workbench is analogous to fsl's fslview or freesurfer's tksurfer, with added ftp functionality).
(Here only the output images are downloaded, so this option has low bandwidth and client storage requirements.)

Interpretation 2:
- the 1200 subjects' data are stored on the connectome server.
- the "connectome toolbox" is command-line software running on the user's computer (the client).
- the connectome toolbox is used to select and download scans from the connectome server, to set up statistical tests, and to do processing on the client.
- the workbench is a client-side tool for displaying the images created by the client-side "connectome toolbox".
(This option has high bandwidth and client storage requirements.)

Interpretation 3:
- the 1200 subjects' data are stored on the connectome server.
- the workbench runs on the user's desktop machine (the client).
- the workbench is used to orchestrate processing on the server (select subjects and scans, set up statistical tests, display output images, all using a GUI).
(Here only the output images are downloaded, so this option has low bandwidth and client storage requirements.)

How will it work?

Thanks,
Geoff Pope

_______________________________________________
HCP-Users mailing list
[email protected]
http://lists.humanconnectome.org/mailman/listinfo/hcp-users
