lanierb;352978 Wrote:
> First, this is a very small study with only 10-30 individuals per test.
> The findings are only marginally statistically significant (I computed
> the p-values), and in some cases not significant.
> They don't say how they calculated their statistical tests (their
> F-stats), but it looks to me like they are treating different samples
> of the same individual as independent observations, probably a no-no,
> and treating them otherwise would likely make the results statistically
> insignificant. It's not clear that it's double blind so it's probably
> not. (It is stated to be single blind: the listener doesn't know which
> sample he/she is listening to.) All of this would lead us to think
> that the results are at best suggestive and definitely not conclusive.
> Given that there are many other studies that find no effect in similar
> tests, one would have to still lean toward thinking that it's more
> likely there is no effect. This is particularly true given the lack of
> a theory for the effect.
I am not familiar with the ANOVA method or Fisher's PLSD test they used (just one example of the many methods and tests in the paper), but when the authors state that this combination is appropriate for assessing statistical significance, one should be very familiar with those methods and tests, and have access to all the data collected and calculations performed, before declaring the analysis flawed. All results were p < 0.05, and many scores were p < 0.01, including in the subjective evaluation tests.

As far as I understand these tests (reading this helped: http://www.hydrogenaudio.org/forums/index.php?showtopic=16295), you already have strong evidence when a single subject scores p < 0.01; replicating that with more subjects mainly rules out "fake results" from someone testing alone and rigging the outcome. Here the subjects were not able to fake it because of the professional supervision. When 28 subjects score p < 0.05, that is accepted as significant worldwide, as far as I know.

Not double blind? The passage you quote ("It is stated to be single blind: the listener doesn't know which sample he/she is listening to") can't be found with my search function in the document, but you will be able to find these: "Neither the subjects nor the experimenters knew which conditions were being performed" and "Neither the subjects nor the experimenter knew what the sound conditions were, although they did know that the presentation was in an A-B-B-A fashion". So please follow me here: although it doesn't use the words "double blind", it actually is double blind.

You say that one subject should only do one sample, and that using more than one sample per subject would make the results insignificant? You know that for certain, and better than those ten scientists? Why then does every ABX test I know of use many samples per subject, like 16 or so? With just one sample you have a 50% chance of guessing right, but the more trials you repeat, the smaller the chance of guessing them all right becomes (see the quick calculation in the P.S. at the end). I do know that much about statistics.

I think you will come up with anything to try to poke holes in it. But you should realize that this isn't a "bunch of friends of some hi-fi magazine's author who had a fun night doing some testing" kind of test. It's a professional study, and they don't rig it up or change results just for kicks. Most, if not all, of the researchers are in the medical field and don't care at all about how good the next set of speakers sounds. As with every professional study, I think you can trust that the math is correct.

cheers,
Nick.

--
DeVerm
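P.S. To put some numbers behind the guessing-odds point, here is a minimal Python sketch, assuming each trial is a forced choice with a 50% chance of guessing right. The trial counts (16 trials per subject, 28 subjects) are my own example figures from above, not numbers taken from the study's data, and the last line assumes the subjects are independent of each other.

```python
# Rough illustration of why repeated trials shrink the chance of "lucky guessing".
# Assumes each trial is a forced choice with a 50% guessing chance; the trial
# counts below are example numbers, not data from the study.
from math import comb

def p_guess_at_least(k: int, n: int, p: float = 0.5) -> float:
    """Probability of getting at least k of n trials right by pure guessing."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# One trial: a guesser is right half the time.
print(p_guess_at_least(1, 1))      # 0.5

# 16 trials, as in a typical ABX run: guessing 12 or more right is already rare.
print(p_guess_at_least(12, 16))    # ~0.038, i.e. p < 0.05

# Guessing 14 or more of 16 right is very rare.
print(p_guess_at_least(14, 16))    # ~0.002, i.e. p < 0.01

# Chance that 28 independent subjects would *each* clear p < 0.05 by luck alone.
print(0.05 ** 28)                  # ~3.7e-37
```

The point is simply that one lucky answer proves nothing, but long runs of correct answers, repeated across many subjects, are extremely hard to achieve by guessing.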
