Dear Tom and Simon,

Thanks for responding.

I see two classes of use for the HCP data sets. (1) The HCP participant results may be used as norms for comparison with matched participants from whom we capture comparable measures. (2) The HCP participant results may be used exclusively. I think it is only the latter, (2), for which there is a problem, although I certainly could be wrong.

Tom, you used the scenario of a bunch of labs using the data to do one test each and stated: “…I would say that requires a 'science-wide' correction applied by the reader of the 250 papers. …” That gets at what I’m asking.

If I’m the author of one of those papers, I don’t want to be fooled or to fool any of my readers with the results from my laboratory by failing to correct for all the other comparisons which have been run on the same data. If I do that now, perhaps it’s workable to take account of all the work which has appeared to date to do the correction for multiple comparisons. But what about a laboratory which runs some other test 5 years from now? They must use a more stringent criterion given all the additional results which have since been published. At some point, it will become impossible to find a reliable result.

Of course, these notions apply to reviewers and other readers too, which places a new level of responsibility on them compared with reading papers today. For editors and reviewers, the problem is particularly acute. If the authors of a paper used the correction criterion suggested by their isolated analysis but a ‘science-wide’ reading calls for a more stringent criterion, do they bounce the paper back or accept it?

As you point out, Tom, there are no simple answers to the base question, and there are lots of scenarios which would be worth understanding in this context. I wonder if there are those lurking on the list who would consider thinking this through and, if they deem it valuable, laying it out formally as a letter or a paper for all of us.
Those who are most directly involved with the HCP likely have thought about it already and perhaps have something.

Best - Don

From: [email protected] On Behalf Of Thomas Nichols
Sent: Tuesday, September 13, 2016 10:53 AM
To: Krieger, Donald N.
Cc: [email protected]
Subject: Re: [HCP-Users] Same data / Multiple comparisons ?

Dear Don,

There are no simple answers to this question.

Firstly, always be totally transparent about the set of questions/contrasts you're investigating when you write up your results.

But, when it comes to deciding over what set of results to control multiple testing, I don't think you need to naively correct for every question in a paper. For example, if you look at sex differences, and then you look at age effects, I won't correct, as there is a literature on sex differences and a separate one on ageing.

But, if there is a natural set of questions that you are implicitly or explicitly looking at together, then you should correct. For example, if you did an ICA dual regression to get (say) 8 spatial maps of the main RSNs, and then test for sex differences over those 8 and report all of them, you probably should do a correction for those 8 comparisons.

About different labs, if each lab is working independently, they're surely going to make slightly different choices about the analysis, and then it will be a confidence building result if they all get the same/similar results. But, if you're considering the thought experiment where 250 labs each publish one paper on 1 variable in the 250+ behavioral/demographic measures in the HCP data, I would say that requires a 'science-wide' correction applied by the reader of the 250 papers.

For the 8 comparisons in the dual regression example, you can use Bonferroni, changing a 0.05 threshold to 0.05/8 = 0.00625; alternatively you can use PALM, which can apply a sharper (less conservative) correction using "Tippett's method" to correct for the 8 tests.

Hope this helps.

-Tom

On Tue, Sep 13, 2016 at 2:00 PM, Krieger, Donald N.
<[email protected]> wrote:

Dear List,

When a lab analyzes their own data, they control for the degradation in confidence due to multiple comparisons. But how does that work when you have many labs analyzing the same data?

At the one end, several labs could do exactly the same analysis and get the same results. At the other end, several labs could run entirely different tests, each controlling for the comparisons they do, and reporting their results with the confidence levels they compute under the assumption that those are the only tests. But since the total number of tests under these circumstances is the sum for all the labs, isn’t that the number of comparisons for which each lab must control?

I hope I’ve expressed this clearly enough. I admit to being confused by the question. What do you think?

Best - Don

_______________________________________________
HCP-Users mailing list
[email protected]
http://lists.humanconnectome.org/mailman/listinfo/hcp-users

--
__________________________________________________________
Thomas Nichols, PhD
Professor, Head of Neuroimaging Statistics
Department of Statistics & Warwick Manufacturing Group
University of Warwick, Coventry CV4 7AL, United Kingdom

Web: http://warwick.ac.uk/tenichols
Email: [email protected]
Tel, Stats: +44 24761 51086, WMG: +44 24761 50752
Fx, +44 24 7652 4532
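
For anyone who wants to see the arithmetic discussed in this thread, here is a minimal Python sketch. It is only an illustration, not anything taken from PALM or the HCP tools. The function names are made up for this example, the test count of 1000 is a hypothetical stand-in for "some years from now", and the family-wise error calculation assumes independent tests with every null hypothesis true, which tests run on the same shared data will generally not satisfy.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold that keeps the family-wise error rate <= alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def uncorrected_fwer(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n_tests uncorrected tests.

    Assumption: tests are independent and all null hypotheses are true.
    Tests on the same participants are usually correlated, so treat this
    as a rough guide only.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    alpha = 0.05
    # 8 = the dual regression example, 250 = the 250-paper thought experiment,
    # 1000 = hypothetical cumulative count after more labs have published.
    for n_tests in (1, 8, 250, 1000):
        threshold = bonferroni_threshold(alpha, n_tests)
        fwer = uncorrected_fwer(alpha, n_tests)
        print(f"{n_tests:5d} tests: Bonferroni threshold {threshold:.6g}, "
              f"uncorrected FWER {fwer:.3f}")
```

The shrinking threshold as the test count grows is the concern raised above about a lab publishing five years later: under a literal 'science-wide' Bonferroni correction, each new test tightens the criterion for every one that follows.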
