Re: [HCP-Users] Verifying HCP data versions

Harms, Michael Tue, 15 Dec 2015 09:25:32 -0800

Determining the recon version ("r177" vs "r227") of old Q1-Q3 released data is different for dMRI vs. fMRI.

For fMRI, we have never changed the image reconstruction algorithm for a given subject's data across data releases. So the "fMRI_3T_ReconVrs" variable that is currently available in the database will tell you whether the recon version was r177 or r227 (for a small number of subjects there is a mixture of recon versions, noted as "r177 r227" for that variable). That applies regardless of whether you are working with fMRI data from the Q1, Q2, Q3, S500, or S900 releases.
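
As a rough illustration of the fMRI lookup, here is a minimal Python sketch. It assumes you have exported the database variables to a CSV file; the "Subject" column name, the function name, and the sample rows are assumptions made for illustration. Only the "fMRI_3T_ReconVrs" variable name comes from the description above.

```python
import csv
import io  # only needed for the in-memory example at the bottom


def fmri_recon_versions(csv_file, subject_id, id_column="Subject"):
    """Return the set of recon versions listed for one subject, or None.

    csv_file is an open text file (or file-like object) holding an exported
    table of database variables. The id column name is an assumption.
    """
    for row in csv.DictReader(csv_file):
        if row.get(id_column) == str(subject_id):
            # A mixed subject is recorded as "r177 r227", so split on spaces.
            return set((row.get("fMRI_3T_ReconVrs") or "").split())
    return None  # subject not present in the table


# Made-up rows, purely to show the call.
sample = "Subject,fMRI_3T_ReconVrs\n100307,r177\n100408,r177 r227\n"
print(fmri_recon_versions(io.StringIO(sample), 100408))
```

Returning a set keeps the mixed "r177 r227" case explicit, so a script can test for one version or for a mixture.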

For dMRI, you have to determine the data release of the particular dMRI data you are looking at to infer the recon version. dMRI data from the Q1 or Q2 releases will have been reconstructed using the "r177" recon algorithm. dMRI data from the Q3, S500, and S900 releases will have been reconstructed with the "r227" recon algorithm.
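
The dMRI rule above amounts to a small lookup table. A minimal Python sketch follows; the function and dictionary names are hypothetical, and the key must be the release the data files actually came from, not the "Release" variable in the database.

```python
# Release-to-recon mapping as stated in the paragraph above.
DMRI_RECON_BY_DATA_RELEASE = {
    "Q1": "r177",
    "Q2": "r177",
    "Q3": "r227",
    "S500": "r227",
    "S900": "r227",
}


def dmri_recon_version(data_release):
    """Map the release a dMRI dataset was downloaded from to its recon version."""
    key = data_release.strip().upper()
    try:
        return DMRI_RECON_BY_DATA_RELEASE[key]
    except KeyError:
        raise ValueError(f"unknown data release: {data_release!r}") from None


print(dmri_recon_version("Q2"))
```

Raising on an unknown release, rather than guessing, keeps a batch script from silently mislabeling data.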

Of note, to avoid any confusion: the "Release" variable that you see in the database refers to when a given subject was first publicly released. It does not tell you whether the particular data you are looking at come from the Q1-Q3, S500, or S900 releases (with the exception of the latest release, since a "Release=S900" subject naturally couldn't have data released in one of the earlier, pre-S900, releases). For example, depending on when you downloaded/obtained the particular data you are looking at, a subject designated as "Release=Q1" in the database could have had its data processed using the Q1 pipelines, the Q2/Q3 pipelines, or the S500/S900 pipelines.

We encourage any users of Q1-Q3 released data to switch to using S500/S900 released data. S500/S900 released data are processed with a consistent set of pipelines, and differ only in that additional files/data have been added to the S900 released data compared to the S500 released data.

Hope that helps.

cheers,

-MH

Michael Harms, Ph.D.

-----------------------------------------------------------

Conte Center for the Neuroscience of Mental Disorders

Washington University School of Medicine

Department of Psychiatry, Box 8134

660 South Euclid Ave. Tel: 314-747-6173

St. Louis, MO 63110 Email: mha...@wustl.edu

From: Timothy Brown <tbbr...@wustl.edu>
Reply-To: Timothy Brown <tbbr...@wustl.edu>
Date: Tuesday, December 15, 2015 10:31 AM
To: Matthew George Liptrot <matthew.lipt...@di.ku.dk>, HCP Listserv <hcp-users@humanconnectome.org>
Subject: Re: [HCP-Users] Verifying HCP data versions

Hi Matthew,

Re: Your question 1) below about detecting which release your data is part of

I can't really speak to differentiating between Q1 vs. Q2 vs Q3 data, but I can think of a technique that should work and be scriptable for detecting whether a subject's data is from the 500subjects release or the 900subjects release. The technique would be based on the contents of at least 1 release notes file for the subject.

If you have downloaded the structurally preprocessed data package for a subject (e.g. 100307_3T_Structural_preproc.zip) and unzipped that package, then you should have a sub-directory below the directory at which you did the unzipping named 100307 and a sub-directory below that named release-notes. In the 100307/release-notes sub-directory, you should find a file named Structural_preproc.txt (the release notes for the Structural_preproc package).

I think it might be slightly problematic to count on the modification date of that release notes file as an indication of whether the data is part of the 500subjects release or 900subjects release. In my opinion it is too easy for that file modification date to be inadvertently changed or updated. (The tool used to unzip the package may not preserve modification dates; someone might accidentally touch the file and thus change its modification date; someone may open the file in an editor intending simply to look at it, accidentally add a space, and save the result.) I think it would be more reliable to differentiate between the 500subjects release and the 900subjects release based on the contents of the release notes file. (I'm supposing that people are unlikely to purposely change the contents of these files and wouldn't be likely to accidentally change the date written in the file or substantially change the contents.)

For the 500subjects release, the first few lines of the release notes file should look something like the following (without the line numbers):

100307_3T_Structural_preproc.zip
Sat Mar 29 13:21:24 CDT 2014
Structural Pipeline v3.1
Execution 1
These data were generated and made available by the Human Connectome Project, ...

For the 900subjects release, the first few lines of the release notes file should look something like the following (without the line numbers):

100307_3T_Structural_preproc.zip
Mon Nov 30 23:44:34 CST 2015
These data were generated and made available by the Human Connectome Project,

Since all the Structural Preproc packages for the 900subjects release were finalized after 31 Oct 2015, you could read in the 3rd line of the release notes file, parse the date, and check to see if the date is before or after 31 Oct 2015. If it is before 31 Oct 2015, then the data is from the 500subjects release. If it's after 31 Oct 2015, then the data is from the 900subjects release.
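
A minimal Python sketch of the date comparison described above. The function name and return labels are hypothetical. Rather than trusting a fixed line number, it scans the first few lines for something shaped like the date stamps in the examples above, and it ignores the timezone abbreviation (CDT/CST) because only the calendar date matters here.

```python
import re
from datetime import date

MONTHS = {name: number for number, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# Matches stamps such as "Sat Mar 29 13:21:24 CDT 2014".
DATE_RE = re.compile(
    r"^\w{3} (\w{3}) +(\d{1,2}) \d{2}:\d{2}:\d{2} \w+ (\d{4})$")

CUTOFF = date(2015, 10, 31)  # cutoff date taken from the paragraph above


def release_from_date(lines, max_lines=5):
    """Classify release notes as 500subjects or 900subjects by date stamp.

    lines is a list of strings (the start of the release notes file).
    Returns None if no date stamp is found in the first max_lines lines.
    A stamp falling exactly on the cutoff is treated as 500subjects here,
    which is an arbitrary choice.
    """
    for line in lines[:max_lines]:
        match = DATE_RE.match(line.strip())
        if match and match.group(1) in MONTHS:
            stamp = date(int(match.group(3)),
                         MONTHS[match.group(1)],
                         int(match.group(2)))
            return "900subjects" if stamp > CUTOFF else "500subjects"
    return None
```

To use it on a real file, read the file into a list of lines first, for example with `open(path).read().splitlines()`.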

If parsing and comparing the dates is cumbersome, you could also simply look at the contents of line 4 in the release notes file. If it starts with "Structural Pipeline" then you're working with 500subjects data. If it is a blank line, then you're working with 900subjects data. (In the 900subjects form of the release notes, the pipeline version numbers come at the end of the release notes file instead of right after the date.)
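
The simpler check could be sketched in Python as below. The function name and return labels are hypothetical. It looks for a "Structural Pipeline" line anywhere in the first few lines instead of at one exact line number, and it stops early on purpose, since the 900subjects notes list the pipeline versions at the end of the file.

```python
def release_from_header(lines, max_lines=6):
    """Classify release notes by whether the header names the pipeline.

    lines is a list of strings (the start of the release notes file).
    Returns None when the header is empty, since nothing can be inferred.
    """
    head = [line.strip() for line in lines[:max_lines]]
    if not any(head):
        return None  # empty or all-blank header
    if any(line.startswith("Structural Pipeline") for line in head):
        return "500subjects"
    return "900subjects"
```

This only distinguishes the two forms described above; a file from some other release would be reported as 900subjects, so it is worth pairing with the date check when in doubt.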

If you don't have the structurally preprocessed data for a subject, you could probably extrapolate this technique to use the release notes file for the package(s) you do have.

Offhand, I don't know what the release notes files look like for Q1, Q2, and Q3, but if you have some of that data, you might be able to extend this method to differentiate between those releases by examining the contents of those release notes files.

I realize that this isn't a particularly elegant mechanism. Maybe someone else can think of a quicker or more elegant solution (maybe simply based on the presence or absence of a particular file generated by the pipelines).

The above technique should allow you to differentiate between releases, but as for your question 2), detecting the version of the image reconstruction algorithm applied, I don't have a good answer for that.

Hope this is at least somewhat helpful,

Tim

On Tue, Dec 8, 2015, at 04:26, Matthew George Liptrot wrote:

Hi,

1) I understand that some of the processing of HCP data is different for the different releases (Q1, Q2/3, 500subjects etc).

Is there a scriptable way to see which version/release my downloaded data came from? (I am working on several different HCP releases with various groups of co-workers and it would be nice if my scripts could check for this automatically)

2) I also understand that the image reconstruction method is different for some releases (from the wiki: “Two versions of the image reconstruction algorithm applied to dMRI and fMRI data have been used in HCP to date: version r177 for subjects scanned in Q1 through mid-Q3, version r227 for subjects scanned mid-Q3 and after.”)

Again, is there any scriptable way to check which version was used for data that is already downloaded? (Same reason as above)

Many thanks,

M@

--

Matthew George Liptrot

Department of Computer Science

University of Copenhagen

&

Section for Cognitive Systems

Department of Applied Mathematics and Computer Science

Technical University of Denmark

http://about.me/matthewliptrot

_______________________________________________
HCP-Users mailing list
HCP-Users@humanconnectome.org
http://lists.humanconnectome.org/mailman/listinfo/hcp-users

Timothy B. Brown

Business & Technology Application Analyst III

Pipeline Developer (Human Connectome Project)

tbbrown(at)wustl.edu

________________________________________

The material in this message is private and may contain Protected Healthcare Information (PHI).

If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying

or the taking of any action in reliance on the contents of this information is strictly prohibited.

If you have received this email in error, please immediately notify the sender via telephone or

return mail.
