Dear list,

Thanks to all of you who engaged with this puzzling and worrisome topic.
I appreciate the pointers to the MegaTrawl and OHBM COBIDAS reports linked in 
the correspondence below.
I agree that it’s not the role of the Human Connectome Project to police the 
results obtained with the data you provide.

I do think that there is a substantive contribution to be made: an analysis 
of this issue, a list of cautions, and at least a partial list of scenarios 
which do or do not pose problems, together with suggested approaches for 
dealing with them.
For the universe of researchers, I don’t think we’re doing as well as we could 
through simple, honest reporting of the comparisons we’ve made.
After all, that’s really been incumbent on all of us prior to this.
And it doesn’t really address how we even satisfy ourselves as individual 
researchers that we’re not being fooled by our own results from HCP data.

This problem is not confined to HCP data nor is its domain confined to human 
brain imaging, e.g. there are numerous tumor registries which, to my 
understanding, face the same issue.
A letter to a journal with wide scope would, I hope, be welcomed by editors and 
be of real value to the scientific community.
Had I the requisite knowledge and understanding, I would do the thinking and 
write it.
I hope that one or more of you will consider this.

Best - Don

From: Stephen Smith []
Sent: Wednesday, September 14, 2016 9:02 AM
To: Thomas Nichols
Cc: Krieger, Donald N.;
Subject: Re: [HCP-Users] Same data / Multiple comparisons ?

Hi Tom

actually the latest MegaTrawl version is here:

But yes indeed - early on within the HCP we discussed options for trying to 
deal with this large-scale-multiple-comparisons-problem, but quickly agreed 
that it really wasn't our role (and wasn't practical) for us to try to "police" 
on this issue.    Another possible form of "policing" would be to keep-back a 
truly-left-out sample of subjects, but there were too many ethical, practical 
and statistical problems with that.   We also (light-heartedly!) discussed 
having some kind of "multiple-comparison sin counter" that ticks up every time 
someone does a new analysis....  ;-)   but as you say there was no 
straightforward solution presenting itself...


On 14 Sep 2016, at 13:52, Thomas Nichols wrote:

Dear Don, (Simon?)

I see two classes of use for the HCP data sets.
(1)    The HCP participant results may be used as norms, compared against 
measures we collect from matched participants of our own.
(2)    The HCP participant results may be used exclusively.

I think it is only the latter, (2), for which there is a problem although I 
certainly could be wrong.

I agree, the first doesn't have multiplicity problems (though accuracy with 
which you can match subjects & scanner data is another concern).

 Tom, you used the scenario of a bunch of labs using the data to do one test 
each and stated: “…I would say that requires a 'science-wide' correction 
applied by the reader of the 250 papers. …”
That gets at what I’m asking.
If I’m the author of one of those papers, I don’t want to be fooled or to fool 
any of my readers with the results from my laboratory by failing to correct for 
all the other comparisons which have been run on the same data.

Yes, but the basic problem you, the individual author, face is what sort of 
correction should you apply.  You only studied variable #132; should you do a 
correction just for the 20 others in that domain, or all 250?  That's why I 
think all you can do is be open, and honestly report the scope of variables you 
considered (and if you did, e.g., search over 20 variables in a domain, correct 
over those), and report your result.  If the reader collects your result with 
50 other papers, they can use the appropriate level of criticism for that 
collection, which will be different from that of a reader who collects 
measures from 250 papers.

If I do that now, perhaps it’s workable to take account of all the work that 
has appeared to date when correcting for multiple comparisons.
But what about a laboratory which runs some other test 5 years from now?
They must use a more stringent criterion given all the additional results which 
have since been published.
At some point, it will become impossible to find a reliable result.
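To make that concern concrete, here is a minimal sketch (my illustration, not from the thread) of how a Bonferroni-style per-test threshold shrinks as the cumulative number of tests run on a shared dataset grows:

```python
# Illustrative only: a Bonferroni correction divides the family-wise
# significance level by the total number of tests, so the per-test
# threshold shrinks as more tests accumulate on the same data.
alpha = 0.05

for n_tests in [1, 20, 250, 1000]:
    threshold = alpha / n_tests  # Bonferroni: controls family-wise error at alpha
    print(f"{n_tests:>5} tests -> per-test p-value threshold {threshold:.6f}")
```

At 250 tests the threshold is already 0.0002, and it keeps shrinking as later studies add comparisons, which is exactly the "impossible to find a reliable result" worry above.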


Of course, these notions apply to reviewers and other readers too, which 
places a new level of responsibility on them compared with reading papers 
today.
For editors and reviewers, the problem is particularly acute.
If the authors of a paper used the correction criterion suggested by their 
isolated analysis but a ‘science-wide’ reading calls for a more stringent 
criterion, do they bounce the paper back or accept it?

As you point out, Tom, there are no simple answers to the base question, and 
there are lots of scenarios which would be worth understanding in this context.
I wonder if there are those lurking on the list who would consider thinking 
this through and if they deem it valuable, lay it out formally as a letter or a 
paper for all of us.
Those who are most directly involved with the HCP likely have thought about it 
already and perhaps have something.

I hope others in the HCP team will chime in, but in our internal discussions we 
could never arrive at a conclusive course of action.  That is, the decision was made
early on that this is an *open* project; hypotheses will not be recorded and 
registered, and data kept in a lock-box, only made available to those who agree 
to study some particular hypothesis (though note, some large scale projects are 
run exactly like that).

Rather, it is left up to authors to honestly report to readers the scope of the 
variables considered.  Steve Smith's MegaTrawl openly acknowledges the 
consideration of nearly every behavioral and demographic measure in the HCP.  
See also the OHBM COBIDAS report, which implores authors to be completely 
transparent about variables and hypotheses considered but not necessarily 
highlighted in a publication.


Best - Don

On Behalf Of Thomas Nichols
Sent: Tuesday, September 13, 2016 10:53 AM
To: Krieger, Donald N.
Subject: Re: [HCP-Users] Same data / Multiple comparisons ?

Dear Don,

There are no simple answers to this question.  Firstly, always be totally 
transparent about the set of questions/contrasts you're investigating when you 
write up your results.  But when it comes to deciding over what set of results 
to control multiple testing, I don't think you need to naively correct for 
every question in a paper.  For example, if you look at sex differences, and 
then you look at age effects, I wouldn't correct, as there is a literature on 
sex differences and a separate one on ageing.  But if there is a natural set of 
questions that you are implicitly or explicitly looking at together, then you 
should correct.  For example, if you did an ICA dual regression to get (say) 8 
spatial maps of the main RSNs, and then tested for sex differences over those 8 
and reported all of them, you probably should do a correction for those 8 tests.

About different labs: if each lab is working independently, they're surely 
going to make slightly different choices about the analysis, and then it will 
be a confidence-building result if they all get the same/similar results.  But 
if you're considering the thought experiment where 250 labs each publish one 
paper on 1 variable among the 250+ behavioral/demographic measures in the HCP 
data, I would say that requires a 'science-wide' correction applied by the 
reader of the 250 papers.

You can use Bonferroni, changing a 0.05 threshold to 0.05/8 = 0.00625, but 
alternatively you can use PALM, which offers a sharper (less conservative) 
correction over the 8 tests using Tippett's method.
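For readers unfamiliar with the min-p idea behind Tippett-style corrections, here is a rough permutation sketch on toy data (my illustration of the general max-statistic technique, not PALM's actual implementation; the data and variable names are invented):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: n subjects, k = 8 outcome variables (e.g. 8 RSN measures),
# two groups of equal size (e.g. a sex-difference comparison).
n, k = 60, 8
y = rng.normal(size=(n, k))
group = np.repeat([0, 1], n // 2)

def abs_t_stats(y, group):
    """Equal-variance two-sample |t| statistic for each of the k columns."""
    a, b = y[group == 0], y[group == 1]
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * a.var(ddof=1, axis=0)
           + (nb - 1) * b.var(ddof=1, axis=0)) / (na + nb - 2)
    t = (a.mean(axis=0) - b.mean(axis=0)) / np.sqrt(sp2 * (1 / na + 1 / nb))
    return np.abs(t)

t_obs = abs_t_stats(y, group)

# Build the permutation null of the MAXIMUM statistic across the 8 tests
# (equivalently, the minimum p-value -- the min-p / Tippett idea):
# shuffle group labels and record max |t| each time.
n_perm = 2000
max_null = np.empty(n_perm)
for i in range(n_perm):
    max_null[i] = abs_t_stats(y, rng.permutation(group)).max()

# FWE-corrected p-value for each test: the fraction of permutations whose
# maximum statistic meets or exceeds that test's observed statistic.
p_fwe = (1 + (max_null[:, None] >= t_obs).sum(axis=0)) / (n_perm + 1)
print("FWE-corrected p-values:", np.round(p_fwe, 3))
```

Because the correction uses the joint permutation distribution of the maximum, it accounts for correlation among the 8 tests and is typically less conservative than Bonferroni while still controlling family-wise error.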

Hope this helps.


On Tue, Sep 13, 2016 at 2:00 PM, Krieger, Donald N. wrote:
Dear List,

When a lab analyzes their own data, they control for the degradation in 
confidence due to multiple comparisons.
But how does that work when you have many labs analyzing the same data?

At the one end, several labs could do exactly the same analysis and get the 
same results.
At the other end, several labs could run entirely different tests, each 
controlling for the comparisons they do, and reporting their results with the 
confidence levels they compute under the assumption that theirs are the only 
tests run on the data.
But since the total number of tests under these circumstances is the sum 
across all the labs, isn’t that the number of comparisons for which each lab 
must correct?

I hope I’ve expressed this clearly enough.
I admit to being confused by the question.
What do you think?

Best - Don

HCP-Users mailing list

Thomas Nichols, PhD
Professor, Head of Neuroimaging Statistics
Department of Statistics & Warwick Manufacturing Group
University of Warwick, Coventry  CV4 7AL, United Kingdom
Tel, Stats: +44 24761 51086, WMG: +44 24761 50752
Fax: +44 24 7652 4532



Stephen M. Smith, Professor of Biomedical Engineering
Head of Analysis,  Oxford University FMRIB Centre

FMRIB, JR Hospital, Headington, Oxford  OX3 9DU, UK
+44 (0) 1865 222726  (fax 222717)

Stop the cultural destruction of Tibet
