Re: Hypothesis testing and magic - episode 2
Jerry Dallal wrote:

As Tukey has pointed out, the null hypothesis of no effect is not that we think there is no effect, but that we are uncertain of the direction. I wish I knew more about Delaney and its application. One problem, pointed out by David Salsburg, is that a substance that eliminates one of many competing risks would appear to increase the other risks. For example, people no longer subject to heart disease would undoubtedly see an increased incidence of cancer, with all-cause mortality holding steady at 100%.

I would hope that such risks would be measured as probability per unit time, so that the first-order effects of `we all die' would be removed. That still leaves the second-order effects due to the lengthy induction process of many cancers.

BTW, an even greater problem in animal testing seems to be due to using feed-on-demand systems. The little critters are usually bored out of their minds and overeat, causing a variety of health problems. So any drug that makes them mildly unwell can easily spoil their appetite -- and make them look healthier.

Peter

===
This list is open to everyone. Occasionally, less thoughtful people send inappropriate messages. Please DO NOT COMPLAIN TO THE POSTMASTER about these messages because the postmaster has no way of controlling them, and excessive complaints will result in termination of the list. For information about this list, including information about the problem of inappropriate messages and information about how to unsubscribe, please see the web page at http://jse.stat.ncsu.edu/
===
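[Editorial note: Salsburg's competing-risks point, and Peter's suggestion of measuring risk as probability per unit time, can be illustrated with a toy simulation. This is a sketch, not from the thread; the hazard rates are made-up numbers.]

```python
# Toy competing-risks model: each person faces constant hazards for
# heart disease and cancer, and dies of whichever cause strikes first.
# Eliminating heart disease leaves the cancer hazard per unit time
# unchanged, yet lifetime cancer incidence jumps to 100%.
import random

def fraction_dying_of_cancer(h_heart, h_cancer, trials=100_000, seed=1):
    """Simulated fraction of deaths attributed to cancer."""
    random.seed(seed)
    cancer_deaths = 0
    for _ in range(trials):
        t_cancer = random.expovariate(h_cancer)
        # h_heart = 0 models a substance that fully eliminates heart disease
        t_heart = random.expovariate(h_heart) if h_heart > 0 else float("inf")
        if t_cancer < t_heart:
            cancer_deaths += 1
    return cancer_deaths / trials

print(fraction_dying_of_cancer(0.02, 0.01))  # about 1/3 = h_c / (h_h + h_c)
print(fraction_dying_of_cancer(0.0, 0.01))   # 1.0: everyone now dies of cancer
```

The per-unit-time hazard for cancer (0.01) is identical in both runs, which is Peter's point: measured as a rate, the "increase" in cancer risk disappears; only the lifetime incidence changes.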
Re: Hypothesis testing and magic - episode 2
On Thu, 20 Apr 2000 10:48:38 +0100, "P.G.Hamer" [EMAIL PROTECTED] wrote:

[snip -- interesting stuff about how proper age-adjusted life-tables, with proper adjustment of baseline Ns, would not show an increase in competing causes of death]

BTW, an even greater problem in animal testing seems to be due to using feed-on-demand systems. The little critters are usually bored out of their minds and overeat, causing a variety of health problems. So any drug that makes them mildly unwell can easily spoil their appetite -- and make them look healthier.

I never knew that! But that might be similar to, or might underlie, another thing I was once told about laboratory rats. I had been impressed by the newspaper reports that rats lived longer if they were underfed, i.e., on very-low-calorie diets. Then my lab-tech friends told me that the lab rats tended to live to a certain *size* rather than age. The starved ones took 30% longer to reach that same size. So my friends were not at all impressed by those news reports. [There may be newer data that are more impressive.]

I later realized that humans and dogs are in the minority among mammals, in that we achieve "adult" size and then stop growing. For elephants and moose and bears, etc., the stereotype from childhood nature stories is not all invention. If the clever "old man of the woods/jungle/forest" is the wisest and the oldest, he is likely to be the biggest, because most critters never stop growing. That seemed to tie in to the rat life-spans, too.

--
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
Re: Hypothesis testing and magic - episode 2
Herman Rubin wrote:

The truth myth is highly persistent. We have the Delaney Clause, which requires the FDA to ban any additive "which has been found to cause cancer in humans or animals". Now what does this mean? It is unlikely that anything does not affect the cancer rate. We do not have the truth, and will not get it. That point null hypothesis is false. So we need to get off the tack that we want to accept if it is true, and reject if it is false.

As Tukey has pointed out, the null hypothesis of no effect is not that we think there is no effect, but that we are uncertain of the direction.

I wish I knew more about Delaney and its application. One problem, pointed out by David Salsburg, is that a substance that eliminates one of many competing risks would appear to increase the other risks. For example, people no longer subject to heart disease would undoubtedly see an increased incidence of cancer, with all-cause mortality holding steady at 100%.
Re: Hypothesis testing and magic - episode 2
On Thu, 13 Apr 2000, Alan McLean wrote:

Some more comments on hypothesis testing: My impression of the hypothesis test controversy, which seems to exist primarily in the areas of psychology, education and the like, is that it is at least partly a consequence of the sheer difficulty of carrying out quantitative research in those fields. A root of the problem seems to be definitional. I am referring here to the definition of the variables involved. In, say, an agricultural research problem it is usually easy enough to define the variables. For a very simple example, if one is interested in ...

In addition to defining the variables, some areas do a better job of defining and therefore testing their models. The ag example is one where not only are the variables relatively clear, so are the models. That is, there is one highly plausible reason for rejecting a null that fertilizer does not affect crop production: fertilizer increases crop production. You have rejected a model of no effect in favor of a model positing an effect.

But in some areas of psychology you will have a situation where many theoretical perspectives predict the same outcome relative to a zero-valued null, while the zero-valued null reflects no theoretical perspective. In this situation rejecting a zero-valued null supports all theoretical perspectives equally and differentiates among none of them.

In a recent example, a student was citing the research literature supporting the convergent validity of some measure. The evidence used by all investigators was that the null of rho = 0 was rejected. I've seen this same thing many times, but this time I saw something different. The smallest sample (n about 95) failed to reject rho = 0, while the remaining samples (all n's > 200) successfully rejected rho = 0, and convergent validity was declared. (No r's were actually reported in this review.)
A quick thought experiment, and a check of critical value tables, suggests that the best estimate of rho from the evidence provided is some value greater than 0 but less than .20. In this case it seems to me that testing the default zero-valued null was misleading rather than informative. In addition to convergent validity, it seems to me that correlations in the range 0 - .20 could easily be explained by at least a couple of other competing models that would not support the conclusions drawn. Only the most trivial link between theoretical models and statistical hypotheses exists in this case.

Using Alan's ethnicity and statistical ability example, and assuming for the moment that all measures were useful, the first time we reject a no-effect null we have some sort of useful information. Now, imagine that 12 researchers generate 12 different hypotheses explaining the cause of these differences. Current practice has all 12 of these researchers collecting data, testing to eliminate the chance model, and then declaring that their hypothesis has been confirmed.

I agree that measurement is a problem, but even with good measurement the lack of connection between statistical hypotheses and theoretical predictions is a fatal flaw in too many areas.

Michael

Regards again,
Alan
--
Alan McLean ([EMAIL PROTECTED])
Department of Econometrics and Business Statistics
Monash University, Caulfield Campus, Melbourne
Tel: +61 03 9903 2102  Fax: +61 03 9903 2007

***
Michael M. Granaas
Associate Professor    [EMAIL PROTECTED]
Department of Psychology
University of South Dakota, Vermillion, SD 57069
Phone: (605) 677-5295    FAX: (605) 677-6604
***
All views expressed are those of the author and do not necessarily reflect those of the University of South Dakota, or the South Dakota Board of Regents.
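[Editorial note: Michael's thought experiment can be roughly reconstructed. The sketch below (mine, not from the original post) computes the smallest |r| that rejects H0: rho = 0 at the .05 level for the two sample sizes he mentions; it uses the large-sample Fisher z approximation rather than exact critical-value tables, so the cutoffs are approximate.]

```python
# Smallest |r| significant at alpha (two-sided) under H0: rho = 0,
# via the Fisher z approximation: z = atanh(r) ~ N(0, 1/(n-3)) under H0.
# (The exact test uses a t distribution with n - 2 df; values are close.)
import math
from statistics import NormalDist

def critical_r(n, alpha=0.05):
    """Approximate smallest |r| that rejects H0: rho = 0 at level alpha."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # 1.96 for alpha = .05
    return math.tanh(z / math.sqrt(n - 3))    # invert Fisher's z transform

print(round(critical_r(95), 3))    # n about 95  -> roughly 0.20
print(round(critical_r(200), 3))   # n = 200     -> roughly 0.14
```

With n = 95 non-significant and n >= 200 significant, the observed r's were plausibly between about .14 and .20 -- consistent with Michael's estimate of "greater than 0 but less than .20".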
Re: Hypothesis testing and magic - episode 2
At 08:37 AM 4/13/00 -0400, Art Kendall wrote:

in the "harder to do" sciences it is common to distinguish an experiment from a quasi-experiment. Part of the difficulty of these fields is that we cannot (or ethically may not) manipulate many independent variables. Therefore we lose the opportunity to assert "ceteris paribus" ("everything else being equal") that is part of a true experiment.

there goes medicine! if this is a real distinction ... then, instead of having 'hard' and 'soft' sciences ... we should think of it as: hard and soft investigations ...

but, if we follow this to some logical conclusion ... this could be rephrased as meaning ... situations where you have essentially complete control over variable manipulation = situations where you can establish 'the truth' (in terms of the impacts of these variables on things) ... but, this is precisely what many have been arguing on the list ... that hypothesis testing ... statistical significance testing, that is ... is in NO position to help you assert 'the truth' ... truth is a metaphysical notion ... not statistical

in essence, if 'the truth' is a laudable goal and, for some reason, we can 'learn of it' through 'scientific investigation' ... then it is NOT significance testing that leads us to it ... rather it is the DESIGN of investigations that is the key ...
Re: Hypothesis testing and magic - episode 2
At 10:23 AM 4/13/00 -0500, Michael Granaas wrote:

In addition to defining the variables, some areas do a better job of defining and therefore testing their models. The ag example is one where not only are the variables relatively clear, so are the models. That is, there is one highly plausible reason for rejecting a null that fertilizer does not affect crop production: fertilizer increases crop production. You have rejected a model of no effect in favor of a model positing an effect.

i did not know that ag research ... in this case, production figures ... was so easily accomplished ... it might be relatively easy to distribute fertilizer in different amounts ... over plots ... but even there, there is considerable error ... check out the way our spreaders work on our lawns! and in addition ... every fertilizer i know of is a product that is an amalgamation of several subproducts ... and inert stuff too ... so the distribution of it over plots will not produce identical spreads ... and ... how is production measured? to compare across plots means gathering in crops ... and making some kind of 'volume' measurements ... and that seems much easier said than done

now, i would not like to say that doing a fertilizer experiment has the same amount of 'error' as maybe one where we ask if different levels of intelligence impact differentially on problem success in later life ... but these differences are more a matter of degree ... than that in one instance it is easy ... and in others it is not

maybe we should ask the ag researchers if THEY think doing their research is simple
Re: Hypothesis testing and magic - episode 2
On Thu, 13 Apr 2000, dennis roberts wrote:

At 10:23 AM 4/13/00 -0500, Michael Granaas wrote:

In addition to defining the variables, some areas do a better job of defining and therefore testing their models. The ag example is one where not only are the variables relatively clear, so are the models. That is, there is one highly plausible reason for rejecting a null that fertilizer does not affect crop production: fertilizer increases crop production. You have rejected a model of no effect in favor of a model positing an effect.

i did not know that ag research ... in this case, production figures ... was so easily accomplished ...

I didn't say that there wasn't a lot of work involved. What I said was that there is a clear link between the experimental manipulation, the outcome variable, the hypothesis test results, and the question asked. This particular example was based on the type of question that Fisher might have been dealing with circa 1925. If my interpretation of history is correct and Fisher's ag research was focused on treatment/no-treatment effects, I think it helps us understand both the strength of his method in that setting and a potential weakness in our own use. The methods of Fisher are useful when there is a strong link between the substantive and statistical hypotheses.

The convergent validity example I used, I thought, showed a weak link between the substantive question and the statistical hypothesis (rho = 0?). (I admit my ignorance of current practice, but I am pretty sure a correlation merely different from 0 is not evidence that two measures are measuring the same thing.) The weakness of that link leaves the researcher without any useful information from their statistical decision. (I.e., knowing that the correlation is other than zero does not establish convergent validity.) Testing a null hypothesis of, for example, rho <= .7 (there may be a more appropriate value; I just picked this one because I like 7's and it illustrates my point) would provide a much better match between the statistical decision and the substantive question. (Rejecting rho <= .7 would indicate a degree of correlation consistent with convergent validity; failing to do so would leave the question open.)

There are areas in psychology that have also done a good job of making the links between their substantive and statistical hypotheses, and they seem to have made a good deal of progress in knowledge generation. There are others that have not. (I have heard roughly the same argument made for physics, chemistry, and biology, and I expect that this is generally true for a number of other research disciplines.)

With or without a link between substantive and statistical hypotheses, I acknowledge, nay, I proclaim, that research in any discipline is hard work, for all the reasons you suggest and more. Any disrespect for the field of agricultural research was unintended, and I apologize to anyone I may have offended. I also freely admit that ag research today is going to be very different from the ag research of 1925.

Michael

***
Michael M. Granaas
Associate Professor    [EMAIL PROTECTED]
Department of Psychology
University of South Dakota, Vermillion, SD 57069
Phone: (605) 677-5295    FAX: (605) 677-6604
***
All views expressed are those of the author and do not necessarily reflect those of the University of South Dakota, or the South Dakota Board of Regents.
Re: Hypothesis testing and magic - episode 2
Hi Michael,

This sounds to me like lousy experimental design. Surely the purpose of the experiment is to distinguish between competing theoretical models?

Michael Granaas wrote:

But in some areas of psychology you will have a situation where many theoretical perspectives predict the same outcome relative to a zero-valued null, while the zero-valued null reflects no theoretical perspective. In this situation rejecting a zero-valued null supports all theoretical perspectives equally and differentiates among none of them.

and I think that is what you are saying here:

I agree that measurement is a problem, but even with good measurement the lack of connection between statistical hypotheses and theoretical predictions is a fatal flaw in too many areas.

Regards,
Alan
--
Alan McLean ([EMAIL PROTECTED])
Department of Econometrics and Business Statistics
Monash University, Caulfield Campus, Melbourne
Tel: +61 03 9903 2102  Fax: +61 03 9903 2007
Re: Hypothesis testing and magic - episode 2
dennis roberts wrote:

but, if we follow this to some logical conclusion ... this could be rephrased as meaning ... situations where you have essentially complete control over variable manipulation = situations where you can establish 'the truth' (in terms of the impacts of these variables on things) ... but, this is precisely what many have been arguing on the list ... that hypothesis testing ... statistical significance testing, that is ... is in NO position to help you assert 'the truth' ... truth is a metaphysical notion ... not statistical

in essence, if 'the truth' is a laudable goal and, for some reason, we can 'learn of it' through 'scientific investigation' ... then it is NOT significance testing that leads us to it ... rather it is the DESIGN of investigations that is the key ...

Truth has nothing to do with it. We construct stories of how the universe operates -- we call these stories 'theories' or 'models'. Significance testing is one way in which we choose between stories as to which is (probably) more useful in a specified context.

Alan
--
Alan McLean ([EMAIL PROTECTED])
Department of Econometrics and Business Statistics
Monash University, Caulfield Campus, Melbourne
Tel: +61 03 9903 2102  Fax: +61 03 9903 2007
Re: Hypothesis testing and magic - episode 2
----- Original Message -----
From: Michael Granaas [EMAIL PROTECTED]
To: EDSTAT list [EMAIL PROTECTED]
Sent: Thursday, April 13, 2000 8:23 AM
Subject: Re: Hypothesis testing and magic - episode 2

[snip -- Michael's post on zero-valued nulls, the convergent validity example, and the 12 researchers, quoted in full earlier in this thread]

Good example of many of the current problems.

1. If testing the null hypothesis provides no conclusive information, why structure the experiment around the null hypothesis? I quoted R.A. Fisher in a previous message, so to repeat it here, I say etcetera. If the hypothesis explains the measured outcome, the problem is: does it do so conclusively? There is a lot of very well done psychology work that comes to valid conclusions and gets published in Science. The point is that the work was very thorough. You have to be very careful in establishing the research objectives and the roadmap.

2. Re the 12 researchers with different claimed valid hypotheses: happens all the time. Any significant work with startling claims will be retested using alternate approaches. In this case the proof is not the statistical test, but the fact that others can demonstrate, under different conditions, that the cause put forth by one of the researchers produces the observed result. The theory works. If they can't repeat the findings, then regardless of statistics, the theory is not accepted. There is a lot of stuff in the "hard sciences" that gets disproved, because it just doesn't hold up under a more careful experiment.

3. If what is being done is just mathematical exercises (the main output from the bulk of the university stat departments), sure, then arguing endlessly about the null hypothesis is fine.
One gets visibility among one's peers when this is done. But it sure doesn't help the researchers build up a really good plan and method to do a first-class investigation. I fail to see why there is so much emphasis on a null hypothesis test if the result really is not important.

DAHeiser
Not Associated with any Stat Department, School or University
Re: Hypothesis testing and magic - episode 2
At 09:30 AM 4/13/00 +1000, Alan McLean wrote:

In the soft sciences it is easy enough to identify a characteristic of interest ...

alan makes good points as usual ... but i totally object to the term 'soft' sciences ... what does soft imply? that the science is bad ... or merely that the variables are more 'difficult' to measure? if that is the case, these ought to be called the 'hard' sciences ... the unpleasant associations with the term 'soft' are uncalled for ... there are excellent 'scientists' (whatEVER that means) in all fields ... and some pretty weak ones too (and gee ... BOTH kinds get tenure!) ... science is science ... and some practice it well ... some don't ... should it be some demerit against them that they happen to have opted for a field of interest ... even if many of the variables are difficult to measure? perhaps that makes it even more challenging ...

finally, i would not be so quick to claim that in the areas that are non social science based ... the variables are all that clear and clean cut ... there seems to be tremendous infighting about theories and how to 'validate' them in medicine ... astronomy ... physics ... it is not like everything there is so simple ... maybe don can pop in here with some relevant examples ... i am sure there are 'mean' differences in terms of these things but ... there is a lot more WITHin variation in terms of hardness/softness ... than between disciplines

==
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/droberts.htm