Re: Means of semantic differential scales
Jay Tanzman wrote: Jay Warner wrote: Jay Tanzman wrote: I just got chewed out by my boss for modelling the means of some 7-point semantic differential scales. The scales were part of a written, self-administered questionnaire, and were laid out like this: Not stressful 1__ 2__ 3__ 4__ 5__ 6__ 7__ Very stressful So, why or why not is it kosher to model the means of scales like this? -Jay My boss's objection was that he believes categorically (sorry) that semantic differential scales are ordinal. 1)Why do you think the scale is interval data, and not ordinal or categorical? Why would anyone think it is ordinal and not interval? Most of the scales were measuring abstract, subjective constructs, such as empathy and satisfaction, for which there is no underlying physical or biological measurement. Why not, then, _define_ degree of empathy as the subjects' rating on a 1-to-7 scale? Why not indeed?! Of course you can do this - and in fact you are doing this. The question is really - what properties should this variable possess in order that it is meaningful - that is, that it reflects 'reality' meaningfully. If it does not do this, then whatever conclusions you come to about your variable are of no use whatsoever. It is certainly true that your variable is ordinal. Is it more than this? It is extremely unlikely that it is fully numeric (that is, 'interval') because the difference between 1 and 2 is unlikely to have the same meaning as the difference between 4 and 5. You cannot simply define these differences to be equal - you need your variable to reflect reality! However, it is probable that the scale is 'reasonably numeric', so the assumption that the variable is interval may be reasonable. But this will be a model, using a number of assumptions - as all these things are. It is important that you recognise this modelling aspect of your data definition. Regards, Alan -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list, remarks about the problem of INAPPROPRIATE MESSAGES, and archives are available at http://jse.stat.ncsu.edu/ =
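A minimal sketch of the modelling choice discussed above (nothing here is from the thread: the group labels, sizes and response probabilities are all invented). It treats simulated 1-to-7 ratings once as interval data (a t test on means) and once as purely ordinal data (Mann-Whitney), which is one way to see how much the "reasonably numeric" assumption is doing.

```python
# Hypothetical illustration: 1-7 semantic differential ratings for two groups,
# analysed under an interval model (t test on means) and an ordinal model
# (Mann-Whitney). All names and probabilities are invented for the sketch.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
group_a = rng.choice(np.arange(1, 8), size=100,
                     p=[.05, .10, .20, .30, .20, .10, .05])
group_b = rng.choice(np.arange(1, 8), size=100,
                     p=[.02, .05, .13, .25, .25, .20, .10])

t_stat, p_interval = stats.ttest_ind(group_a, group_b)                # interval model
u_stat, p_ordinal = stats.mannwhitneyu(group_a, group_b,
                                       alternative="two-sided")       # ordinal model

print(f"means: {group_a.mean():.2f} vs {group_b.mean():.2f}")
print(f"t test (interval assumption) p = {p_interval:.4f}")
print(f"Mann-Whitney (ordinal only)  p = {p_ordinal:.4f}")
```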
Re: Statistical Distributions
Hi Dennis, Dennis Roberts wrote: not to disagree with alan but, my goal was to parallel what glass and stanley did and that is all ...seems like there are all kinds of distributions one might discuss AND, there may be more than one order that is acceptable Sure, I realised that your goal was limited to paralleling GS - but you did ask for suggestions for developing it, and a natural extension of the coverage is one possibility. (And someone recently has been advocating discussion of relationships between the distributions.) It occurs to me that fitting the Poisson into the set also might be a good idea - that would more or less cover the 'basic' distributions. most books of recent vintage (and g and s was 1970) don't even discuss what g and s did but, just for clarity sake ... are you saying that the nd is a logical SECOND step TO the binomial or, that if you look at the binomial, one could (in many circumstances of n and p) say that the binomial is essentially a nd (very good approximation).. ? The former. the order i had for the nd, chis square, F and t seemed to make sense but, i don't necessarily buy that one NEED to START with the binominal certainly, however, if one talks about the binomial, then the link to the nd is a must What I had in mind is something I have thought for a long time (not at all actively, I confess!) but have never seen dealt with, so maybe it is totally off track. That is the idea that a normal distribution can *always* be seen as a limiting expression of a binomial. The binomial is clearly a more basic distribution than the normal, in the sense that it applies to a nominal variable - more specifically, to a dummy variable defined for one value of the nominal variable. It is concerned with whether the value occurs or does not. This registration of occurrence is more primitive than measuring a numerical value of a numeric variable. I believe that the idea expressed above is so, but I am having problems defining it. If anyone has come across this idea, I would be delighted to find a reference to it. Regards, Alan At 06:36 PM 2/17/02 -0500, Timothy W. Victor wrote: I also think Alan's idea is sound. I start my students off with some binomial expansion theory. Alan McLean wrote: This is a good idea, Dennis. I would like to see the sequence start with the binomial - in a very real way, the normal occurs naturally as an 'approximation' to the binomial. Dennis Roberts, 208 Cedar Bldg., University Park PA 16802 Emailto: [EMAIL PROTECTED] WWW: http://roberts.ed.psu.edu/users/droberts/drober~1.htm AC 8148632401 = Instructions for joining and leaving this list, remarks about the problem of INAPPROPRIATE MESSAGES, and archives are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list, remarks about the problem of INAPPROPRIATE MESSAGES, and archives are available at http://jse.stat.ncsu.edu/ =
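The limiting-case idea can be shown numerically. A rough sketch (the choices of p and n are arbitrary) comparing the Binomial(n, p) CDF with a matching, continuity-corrected normal CDF as n grows:

```python
# Sketch: the normal as a limiting expression of the binomial. For each n,
# compare the Binomial(n, p) CDF with a normal CDF having the same mean and
# variance (with a continuity correction). Parameter choices are arbitrary.
import numpy as np
from scipy import stats

p = 0.3
for n in (10, 50, 500):
    k = np.arange(n + 1)
    binom_cdf = stats.binom.cdf(k, n, p)
    norm_cdf = stats.norm.cdf(k + 0.5, loc=n * p, scale=np.sqrt(n * p * (1 - p)))
    max_gap = np.max(np.abs(binom_cdf - norm_cdf))
    print(f"n = {n:4d}: largest CDF discrepancy = {max_gap:.4f}")
```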
Re: Statistical Distributions
This is a good idea, Dennis. I would like to see the sequence start with the binomial - in a very real way, the normal occurs naturally as an 'approximation' to the binomial. Alan Dennis Roberts wrote: Back in 1970, Glass and Stanley in their excellent Statistical Methods in Education and Psychology book, Prentice-Hall ... had an excellent chapter on several of the more important distributions used in statistical work (normal, chi square, F, and t) and developed how each was derived from the other(s). Most recent books do not develop distributions in this fashion anymore: they tend to discuss distributions ONLY when a specific test is discussed. I have found this to be a more disjointed treatment. Anyway, I have developed a handout that parallels their chapter, and have used Minitab to do simulation work that supplements what they have presented. The first form of this can be found in a PDF file at: http://roberts.ed.psu.edu/users/droberts/papers/statdist2.PDF Now, there is still some editing work to do AND, working with the spacing of text. Acrobat does not allow too much in the way of EDITING features and, trying to edit the original document and then convert to pdf, is also somewhat of a hit and miss operation. When I get an improved version with better spacing, I will simply copy over the file above. In the meantime, I would appreciate any feedback about this document and the general thrust of it. Feel free to pass the url along to students and others; copy freely and use if you find this helpful. Dennis Roberts, 208 Cedar Bldg., University Park PA 16802 Emailto: [EMAIL PROTECTED] WWW: http://roberts.ed.psu.edu/users/droberts/drober~1.htm AC 8148632401 = Instructions for joining and leaving this list, remarks about the problem of INAPPROPRIATE MESSAGES, and archives are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list, remarks about the problem of INAPPROPRIATE MESSAGES, and archives are available at http://jse.stat.ncsu.edu/ =
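A small simulation sketch of the kind of development Dennis describes (the degrees of freedom and replication count are arbitrary choices): chi-square, t and F each built from standard normal draws, then checked against the theoretical distributions.

```python
# Sketch of the Glass-and-Stanley style development: chi-square as a sum of
# squared standard normals, t as a normal over the root of a scaled chi-square,
# F as a ratio of scaled chi-squares. Simulated quantiles are compared with
# the theoretical ones.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
reps, df1, df2 = 100_000, 5, 10

z = rng.standard_normal((reps, df1))
chi2 = (z ** 2).sum(axis=1)                              # sum of df1 squared normals
t = rng.standard_normal(reps) / np.sqrt(chi2 / df1)      # normal / sqrt(chi2/df)
w = (rng.standard_normal((reps, df2)) ** 2).sum(axis=1)
f = (chi2 / df1) / (w / df2)                             # ratio of scaled chi-squares

print(np.quantile(chi2, 0.95), stats.chi2.ppf(0.95, df1))
print(np.quantile(t, 0.95), stats.t.ppf(0.95, df1))
print(np.quantile(f, 0.95), stats.f.ppf(0.95, df1, df2))
```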
Re: One-tailed, two-tailed
Hi Stan, This is sent to both you and edstat. Have you proven that one gas gives better mileage than the other? If so, which one is better? There are two points. The first is that you have not 'proved' anything - except in the most casual interpretation of 'proof'. What you have done is provide an answer, in which you can be very confident, to the question posed. So the first amendment is to something like: Can you reasonably conclude that one gas gives better mileage than the other? If so, which one is better? Second, the question is confusingly - sloppily - posed. It appears to be two questions. The first leads to a two tailed test - does one gas give better mileage than the other? This is the question that is answered. The second question leads to a one tailed test, which is the one you are trying to answer, I gather as an extra to the original question. As soon as you try to answer both questions simultaneously you run into logical problems. You *have* to be very clear from the start which of the two you are interested in. In this case, do you only want to know (in the sense of 'conclude with some confidence') if: * one gas is better than the other (so you will do a two sided test); or * gas B is better than gas A (so you will do a one sided test). (You can also pose the question whether gas A is better than gas B, but the sample evidence is obviously against this.) This is one of the bits that causes students most problems - identifying the question being asked! It also seems to be a problem with many researchers, yet it is fundamental to research. Happy New Year, Alan Stan Brown wrote: I think I've got some sort of mental block on the following point. Can someone explain this to me, plainly and simply, please? Let me start with a sample problem, NOT created by me: [The student is led to enter two sets of unpaired figures into Excel. They represent miles per gallon with gasoline A and gasoline B. I won't give the actual figures, but here's a summary: A: mean = 21.9727, variance = 0.4722, n = 11 B: mean = 22.9571, variance = 0.2165, n = 14 The question is whether there is a difference in gasoline mileage. The student is led to a two-sample F test for homoscedasticity; p=0.1886 so the samples are treated as homoscedastic. Now the problem says: ] Now the main t-test ... Two Sample Assuming Equal Variances. ... Use two-tail results (since '=/=' in Ha). ... What is the P-val for the t-test? [Answer: p=.0002885] What's your conclusion about the difference in gas mileage? [Answer: At significance level 5%, previously selected, there is a difference between them.] Now we come to the part I'm having conceptual trouble with: Have you proven that one gas gives better mileage than the other? If so, which one is better? Now obviously if the two are different then one is better, and if one is better it's probably B since B had the higher sample mean. But are we in fact justified in jumping from a two-tailed test (=/=) to a one-tailed result (<)? Here we have a tiny p-value, and in fact a one-tailed test gives a p-value of 0.0001443. But something seems a little smarmy about first setting out to discover whether there is a difference -- just a difference, unequal means -- then computing a two-tailed test and deciding to announce a one-tailed result. Am I being over-scrupulous here? Am I not even asking the right question? Thanks for any enlightenment. (If you send me an e-mail copy of a public follow-up, please let me know that it's a copy so I know to reply publicly.)
-- Stan Brown, Oak Road Systems, Cortland County, New York, USA http://oakroadsystems.com/ My theory was a perfectly good one. The facts were misleading. -- /The Lady Vanishes/ (1938) = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
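For reference, the p-values quoted in the problem can be reproduced from the summary figures alone. The sketch below assumes the pooled-variance (equal variances) two-sample t test the problem specifies, and shows both the two-tailed and the one-tailed p-value:

```python
# Sketch: pooled two-sample t test from the summary statistics quoted above.
import numpy as np
from scipy import stats

mean_a, var_a, n_a = 21.9727, 0.4722, 11
mean_b, var_b, n_b = 22.9571, 0.2165, 14

df = n_a + n_b - 2
sp2 = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df   # pooled variance
se = np.sqrt(sp2 * (1 / n_a + 1 / n_b))
t = (mean_b - mean_a) / se

p_two_tailed = 2 * stats.t.sf(abs(t), df)   # Ha: the means differ
p_one_tailed = stats.t.sf(t, df)            # Ha: B gives better mileage than A
print(f"t = {t:.3f}, two-tailed p = {p_two_tailed:.7f}, one-tailed p = {p_one_tailed:.7f}")
```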
Re: used books
Try http://www.abebooks.com/ Alan IPEK wrote: Do you know any online used bookstore other than Amazon? I need to find some old stat and OR books. = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Interpreting p-value = .99
Gus, Stan's two alternatives were correct as stated - they were two one sided tests, not a one sided and a two sided test. Stan, in practical terms, the conclusion 'fail to reject the null' is simply not true. You do in reality 'accept the null'. The catch is that this is, in the research situation, a tentative acceptance - you recognise that you may be wrong, so you carry forward the idea that the null may be 'true' but - on the sample evidence - probably is not. On the other hand, this should also be the case when you 'reject the null' - the rejection may be wrong, so the rejection is also tentative. The difference is that the null has this privileged position. In areas like quality control, of course, it is quite clear that you decide, and act as if, the null is true or is not true. Regards, Alan Gus Gassmann wrote: Stan Brown wrote: On a quiz, I set the following problem to my statistics class: The manufacturer of a patent medicine claims that it is 90% effective(*) in relieving an allergy for a period of 8 hours. In a sample of 200 people who had the allergy, the medicine provided relief for 170 people. Determine whether the manufacturer's claim was legitimate, to the 0.01 significance level. (The problem was adapted from Spiegel and Stevens, /Schaum's Outline: Statistics/, problem 10.6.) I believe a one-tailed test, not a two-tailed test, is appropriate. (It would be silly to test for 'effectiveness differs from 90%' since no one would object if the medicine helps more than 90% of patients.) Framing the alternative hypothesis as 'the manufacturer's claim is not legitimate' gives Ho: p = .9; Ha: p < .9; p-value = .0092 on a one-tailed test. Therefore we reject Ho and conclude that the drug is less than 90% effective. But -- and in retrospect I should have seen it coming -- some students framed the hypotheses so that the alternative hypothesis was 'the drug is effective as claimed'. They had Ho: p = .9; Ha: p > .9; p-value = .9908. I don't understand where they get the .9908 from. Whether you test a one- or a two-sided alternative, the test statistic is the same. So the p-value for the two-sided version of the test should be simply twice the p-value for the one-sided alternative, 0.0184. Hence the paradox you speak of is an illusion. Unfortunately for you, the two versions of the test lead to different conclusions. If the correct p-value is given, I would give full marks (perhaps, depending on how much the problem is worth overall, subtracting 1 out of 10 marks for the nonsensical form of Ha).
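Both p-values quoted in the thread (.0092 and .9908) come from the same test statistic with opposite alternatives. A sketch using the normal approximation to the binomial that the Schaum problem intends (the exact binomial figure is shown only for comparison):

```python
# Sketch: Ho: p = 0.9 tested against each one-sided alternative,
# using the normal approximation for a sample proportion.
import math
from scipy import stats

n, x, p0 = 200, 170, 0.9
p_hat = x / n
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

p_less = stats.norm.cdf(z)       # Ha: p < 0.9  -> approx 0.0092
p_greater = stats.norm.sf(z)     # Ha: p > 0.9  -> approx 0.9908
print(f"z = {z:.3f}, P(less) = {p_less:.4f}, P(greater) = {p_greater:.4f}")

# Exact one-tailed binomial probability, for comparison only
print("exact P(X <= 170 | p = 0.9) =", stats.binom.cdf(x, n, p0))
```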
Re: best inference
Happy holiday, Dennis. I have two answers to this question - pick one! First, the recognition that all of statistics, but particularly inference, is about providing, and assessing the strength of, evidence - in circumstances where some measurement(s) can sensibly be defined, and these measurements are in some manner repeated - as to the probable usefulness of some proposal about those measurements. That one comes out fairly clumsy, as a result of trying to be very careful. You may prefer my second answer: The recognition that all concepts/procedures/skills in statistics are closely interrelated and you cannot sensibly pick out one! Regards, Alan Dennis Roberts wrote: on this near holiday ... at least in the usa ... i wonder if you might consider for a moment: what is the SINGLE most valuable concept/procedure/skill (just one!) ... that you would think is most important when it comes to passing along to students studying inferential statistics what i am mainly looking for would be answers like: the notion of being able to do __ that sort of thing something that if ANY instructor in stat, say at the introductory level failed to discuss and emphasize ... he/she is really missing the boat and doing a disservice to students _ dennis roberts, educational psychology, penn state university 208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED] http://roberts.ed.psu.edu/users/droberts/drober~1.htm = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Evaluating students
Thom Baguley wrote: Alan McLean wrote: This describes a BAD closed book exam. It also describes a bad open book exam. Not entirely. I have found that many students still worry about such things regardless of the information they have about the exam. A good one-hour exam would have three, or at most four, multi-part PROBLEMS. A good exam would be one which someone who has merely memorized the book would fail, and one who understands the concepts but has forgotten all the formulas would do extremely well on. Since to understand the concepts almost always means understanding (and hence knowing) the formulas, I would interpret someone who has 'forgotten all the formulas' as understanding the concepts only in the most superficial manner, and so should do badly! I don't agree here. As a semi-trivial counterexample, would you suggest that I don't understand a concept if I am given an unfamiliar formula (e.g., because it is rearranged for some purpose such as ease of calculation, or because it uses a notation that I am unfamiliar with)? A single concept can give rise to an infinite number of formulae or forms of notation. In the context of evaluating a student, if you test memory for a formula as a component of a question, this leaves you unable to distinguish poor performance due to a complete lack of understanding from that of a student who has a partial understanding (but can't recall the formula). I was responding to a comment about a student who had 'forgotten ALL the formulas' - and I consider my comment perfectly accurate. In any examination you are testing the student's memory, so if you are asking a student to carry out some activity, you are testing his or her memory of how to carry out that activity. By all means provide them with a formula sheet for at least the more complex formulas, or allow them to use their own resources - but the student has to KNOW the formula at some level in order to carry out the activity. Alan -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102 Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Evaluating students
Herman Rubin wrote: In article [EMAIL PROTECTED], Thom Baguley [EMAIL PROTECTED] wrote: Glen wrote: As a student I *always* preferred closed book exams. If I know the material I don't need the book, and if I don't know the material, the book isn't going to help in the exam enough anyway. For open Yes. Also, closed book exams tend to be easier because the range of questions is more restricted. I have found them a way to avoid students spending most of their time memorizing near-useless material. The main reason why closed book exams tend to be easier for students is that they actually realise they have to do some work in preparation! On the contrary, closed book exams emphasize memorizing near-useless material. This describes a BAD closed book exam. It also describes a bad open book exam. A good one-hour exam would have three, or at most four, multi-part PROBLEMS. A good exam would be one which someone who has merely memorized the book would fail, and one who understands the concepts but has forgotten all the formulas would do extremely well on. Since to understand the concepts almost always means understanding (and hence knowing) the formulas, I would interpret someone who has 'forgotten all the formulas' as understanding the concepts only in the most superficial manner, and so should do badly! Overall, the evaluation of students is driven mostly by budget, (lecturers') time, lecturers' interest, the number of students, politics - the best one can do is to assess students as honestly as possible within the range allowed by these factors! My eight cents' worth. Alan -- This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University. Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399 [EMAIL PROTECTED] Phone: (765)494-6054 FAX: (765)494-0558 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Evaluating students
Students also confuse histograms with time series graphs. They describe a graph as, for example, 'starting low, increasing then decreasing again'. It's easy enough to see how they get this approach from their school maths. It's much more difficult to get them to see a histogram as rather more like a map, to be viewed from above. (I must admit to being something of an offender here. I emphasise the role of the inflexions in the normal curve as the only points on the peak which are identifiable without reference to the scale - except for the maximum - so can be used to measure the width of the peak. To describe them I ask students to imagine that they are riding a motor bike along the curve, The inflexions are where they momentarily straighten up) Alan Carl Lee wrote: Using introductory statistics as an example, concepts are built in a certain sequence. If students get lost at a certain stage, s/he will have difficulty to connect the later concepts together. Therefore, it is crucial to test the understanding of the connection (or relationship) among related concepts. For example, you may be surprised that the concept of histogram is much more difficult for students than we thought. Try the following problem in your final exam, you may be surprised by the outcome: If you collect a random sample of 100 salaries of working individuals who are 40 years or older. Ask students to describe the shape of the histogram that is more likely to occur, and their reason. Then, ask students to verbally describe the Y-axis and X-axis of this histogram. I have collected data for this problem for several years. When I first asked this question, I was shocked that 80% of students got confused between scatter plot and histogram. I began to pay attention and used a variety of strategies to help students. We usually think people have seen histograms all the time, it must be simple. However, this test problem seems to indicate that we may have overlooked simple concepts such as this. If we think about the construction of histogram a little more, we see that a histogram is a transformation of raw data into two-dimensional presentation for a response variable. This indeed is very different from our common experience of two-dimensional plot, which is usually involved with two response variables, a scatter plot. One assessment tool I use to test student's understanding of concepts is to test how well they understand the relationship among related concepts, not just stand-alone concept. For example, the relationship among time series plot, box plot, histogram, outliers, mean, median, standard deviation and range is important for understanding variation, distribution and later the sampling distribution of sample mean. I have developed a series of questions for testing their understanding of the relationships using the project of investigating stock prices. There is no formula neither computation is required by students in answering these questions. Another assessment tool that I use is to ask students give the reasons of their answers verbally. Again, no formula neither computation is needed. What I intend to find out is how they think and how they solve the problem. This has helped me greatly to study how students learn a variety of statistics concepts and which concept students tend to get lost at the early stage of their learning. Assessment, learning and teaching are closely connected. And understanding how students learn is most important of the three. 
A first step toward understanding how learning take place is to conduct a good assessment, especially assessing the process of reasoning. Teaching strategies and instructional material can then be better prepared. Carl Carl Lee, Professor of Statistics Assessment Coordinator of CMU (1999-2001) Department of Mathematics, Central Michigan University Mt. Pleasant, MI 48859 e-mail: [EMAIL PROTECTED] Learning without Thinking, I am soon confused. Thinking without Doing, I can never fully understand it. -- Donald Burrill wrote: On Wed, 14 Nov 2001, Alan McLean wrote in part: Herman Rubin wrote: A good exam would be one which someone who has merely memorized the book would fail, and one who understands the concepts but has forgotten all the formulas would do extremely well on. Since to understand the concepts almost always means understanding (and hence knowing) the formulas, I would interpret someone who has 'forgotten all the formulas' as understanding the concepts only in the most superficial manner, and so should do badly! Non sequitur. To know formulas (in a deep sense of understanding them) is one thing; to be able to write them verbatim is another thing altogether (and something that xerographic copiers do better than people do, by and large). Of course, it is easier to ask questions about
Re: Help for DL students in doing assignments
Ignoring the error in saying (2) that all primes are odd - where has 2 disappeared to? - you are highly confused about the difference between 'if ... then' and 'if and only if ... then'. Correcting (3) to: The sum of any two primes greater than 2 is even. This is true - but it does NOT imply the reverse - that any even number is the sum of two primes. Alan Dr. Fairman wrote: Stuart Gall [EMAIL PROTECTED] wrote in message news:9qa466$4je$[EMAIL PROTECTED]... Dr. Fairman [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]... Well no I am afraid not, because although for all p prime, p = 2*n+1 is true, it is not true that for all n in N, 2*n+1 is prime, which is what you would need for your proof to be valid. Are you pulling my leg in return? if so touche :-) If you are not pulling my leg, I would say that the probability that you have a PhD in mathematics and do not recognise Q2 is vanishingly small. PS if you can solve Q1 you could make much more money by publishing the solution in a book. Hello Stuart, 1. Is the sum of every two odds even? (Y/N) Answer: Yes. 2. Is every prime odd? (Y/N) Answer: Yes. 3. Generalizing items #1 and #2, is the sum of any two primes even? (Y/N) Answer: Yes. 4. If you agree with item #3 (if not - please argue why), it means that you also agree with the statement: every even is (in particular) the sum of two primes. That's what you needed me to prove. Do you still have any objections? If YES - please argue which of my items are wrong and why. Dr. Fairman. = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102 Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Help for DL students in doing assignments
Can I claim the $1,000,000? There is certainly an even prime: 2. Alan Nomen Nescio wrote: Mr. Dawson wrote: Well, they do say what goes around comes around; I'd love to see what mark the dishonest DL student gets having had his homework done for him by somebody who: (a) believes all primes to be odd; ... ### Let's assume that any prime is NOT odd ### It means that it is even (no other way among integers!) ### So that prime has 3 divisors: 1, this prime and 2 ### which contradicts the definition of a prime: ### (a prime is an integer that has only two divisors: 1 and the prime itself) ### Dear Mr. Dawson, please send me at least ONE even prime ### and I shall give you $1,000,000. -Robert Dawson = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102 Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Mean and Standard Deviation
Of course the SD can be larger than the mean. If this were not so we would not have the standard normal... If the variable can take negative values, the mean may be close to zero, or even negative - while the SD has to be positive. If the variable cannot take negative values, it is still possible for the SD to be larger than the mean, but the distribution will then not be symmetric. Alan Edward Dreyer wrote: A colleague of mine - not a subscriber to this helpful list - asked me if it is possible for the standard deviation to be larger than the mean. If so, under what conditions? At first blush I do not think so - but then I believe I have seen some research results in which the standard deviation was larger than the mean. Any help will be greatly appreciated. cheers, ECD ___ Edward C. Dreyer Political Science The University of Tulsa = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102 Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
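A quick numerical illustration of the point, using two invented data sets - one that takes negative values and one that is non-negative but skewed:

```python
# Sketch: two small made-up data sets, both with SD larger than the mean.
import numpy as np

near_zero = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])           # mean 0, SD > 0
skewed = np.array([1, 1, 1, 1, 2, 2, 3, 50], dtype=float)   # non-negative, skewed

for name, x in [("near zero", near_zero), ("skewed", skewed)]:
    print(f"{name}: mean = {x.mean():.2f}, SD = {x.std(ddof=1):.2f}")
```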
Re: ranging opines about the range
This is news to me - I have only ever heard the range defined as 'maximum - minimum' (and then usually wiped out as a mostly useless statistic...) I usually point out to students that in everyday language the word 'range' is used for the interval - as in 'prices for cabbages ranged from $1 to $2.50', so the statistical usage is (another) one of these words with a different meaning in the field. For a continuous (numeric) variable, the range only makes sense using the max - min definition. If the min is 27.324 and the max is 33.654, the range is 6.330. For a discrete (numeric) variable, you can argue that the concept of 'range' requires continuity, so that we have to assume the values are rounded. So for exam marks, recorded to the nearest per cent, a max of 97 is assumed to be rounded from somewhere in the interval 96.5 to 97.5; a min of 35 is likewise considered to be rounded from 34.5 to 35.5. With this model, the max value may have been as high as 97.5 and the min as low as 34.5, so the range is calculated as 97.5 - 34.5. This gives you your +1 calculation - that is, it is a correction for continuity. Regards, Alan jeff rasmussen wrote: Dear statistically-enamored, There was a question in my undergrad class concerning how to define the range, where a student pointed out that contrary to my edict, the range was the difference between the maximum and minimum. I'd always believed that the correct answer was the difference between the maximum and minimum plus one; and irrespective of what the students' textbook and also SPSS said (when I ran some numbers through it) I thought that was the commonly accepted answer. I favor the plus one account as I feel that it balances out the minus one of degrees of freedom and thus puts the Tao correctly in balance. I asked a colleague who also came up with the same answer. Below in I and II are answers from internet sites that also agree. There are also however some sites that define it nakedly as the difference between the maximum and minimum; my theory is that the Evil SPSS Empire bought them off as part of their plan for world domination Finally, we have a waffler's answer in III below... Curious to hear what you think about this defining issue for our times. best, JR from http://www.cuny.edu/tony/edstat22.html I. Measures of Dispersion or Spread Range - is the difference between the highest and lowest values in a group of values plus one. For example, the range of the following group of values 60,70,80,90,100 is 41 and is calculated by subtracting the lowest value (60) from the highest value (100) = 40 plus 1 = 41. from http://www.uwsp.edu/psych/stat/5/CT-Var.htm#II1 II. Range As we noted when discussing the rules for creation of a grouped frequency distribution, the range is given by the highest score in the distribution minus the lowest score plus one. R = XH - XL + 1 from http://luna.cas.usf.edu/~rasch/stat.html III. Measures of Dispersion Range: The Inclusive Range is the highest score minus the lowest score in a distribution plus 1. If the highest score on an examination is 97 and the lowest score 65, the range is 33. The plus 1 correction captures the values from 64.50 to 97.49. The Exclusive Range is just the highest score minus the lowest score. In the above example 32.
Jeff Rasmussen http://www.symynet.com website graphic design quantitative software spirit of tao te ching paperback taoism = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
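The two definitions discussed above are easy to compare side by side; a small sketch with invented exam marks:

```python
# Sketch: exclusive vs. inclusive (continuity-corrected) range for exam marks
# recorded to the nearest whole per cent. The marks are made up.
marks = [35, 52, 67, 71, 88, 97]

exclusive_range = max(marks) - min(marks)       # 97 - 35 = 62
inclusive_range = max(marks) - min(marks) + 1   # continuity-corrected: 62 + 1

print("exclusive (max - min):     ", exclusive_range)
print("inclusive (max - min + 1): ", inclusive_range)
```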
Re: They look different; are they really?
Stan Brown wrote: I had already decided to lead off with an assessment test the first day of class next time, for the students' benefit. (If they should be in a more or less advanced class, the sooner they know it the better for them.) But as you point out, that will benefit me too. The other instructor has developed a pre-assessment test over the past couple of years, and has offered to let me use it too, so we'll be able to establish comparable baselines. The two classes are in the same subject, aren't they? How come one group is treated differently (given a pre-assessment test) from the other? Alan -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: effect size/significance
jim clark wrote: Sometimes I think that people are looking for some magic bullet in statistics (i.e., significance, effect size, whatever) that is going to avoid all of the problems and misinterpretations that arise from existing practices. I think that is a naive belief and that we need to teach how to use all of the tools wisely because all of the tools are prone to abuse and misinterpretation. Spot on! Alan -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Definitions of Likert scale, Likert item, etc.
5 is 5 times 0? Alan dennis roberts wrote: TO TALK about these things as ratio scales is downright silly look at the item: stat will help me in my professional work don't agree |(0)__(5)__| agree you aren't going to claim that the agree means 5 times a stronger view than don't agree ... are you??? -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Definitions of Likert scale, Likert item, etc.
It is certainly true that the variable X = distance from the left hand end of the line (in whatever units you choose) is a ratio variable, because the zero is not arbitrary. But the variable Y = level of agreement, recorded as distance from the left hand end of this particular line, is not a ratio variable. This is the case because the choice of this line, and whereabouts on it you choose to put 'zero agreement' (whatever that might mean), is quite arbitrary. A ratio variable is one where the zero is 'natural' - not arbitrarily chosen. Alan Sure do, I think that if you redid it so that the scale was now: don't agree |_______________| strongly agree that would give you a ratio scale between no agreement and strong agreement. You would then be able to use, e.g. ANOVA, on your test results, which would be numeric in millimeters. cheers Michelle blush = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102 Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Definitions of Likert scale, Likert item, etc.
It's certainly true that there is a semantic problem, with people interpreting terms in different ways. (So what's new?) Having started life (so to speak) as a mathematician, a 'scale' is a characteristic of the variable being measured. The construct that a couple of people have referred to as a 'Likert scale' I would call a 'variable' (or possibly 'measure'). The range of possible values, and the way they are laid out (eg, for 0 to 100, hopefully 'interval' or even 'ratio') forms the scale for this variable. The common usage of 'Likert scale' to mean an ordinal scale, usually from 1 to 5, usually expressing level of agreement with a proposition fits this view of the terms. An individual item of this type defines a variable, and this variable has a Likert scale, in this sense of the term. The composite variable or measure (hopefully) has a reasonably numeric scale. Regards, Alan Dennis Roberts wrote: we do have a semantics problem with terms like this ... scale ... and confuse sometimes the actual physical paper and pencil instrument with the underlying continuum on which we are trying to place people so, even in likert's work ... he refers to THE attitude scales ... and then lists the items on each ... thus, it is easy to see an equating made between the collection of items ... nicely printed ... BEING the scale ... but really, the scale is not that ... one has to think about the SCORE value range ... that is possible ... when this physical thing (nicely printed collection of items) is administered to Ss ... thus ... for 10 typically response worded likert items with SA to SD ... the range of scores on the scale might be 10 to 50 ... of which any particular S might get any one of those values somewhere along the continuum but of course, scale is even deeper than that since, what we really have is a psychophysical problem ... that is, what is the functional relationship that links the physical scale ... 10 to 50 ... to the (assumed to exist) underlying psychological continuum ... PHYSICAL SCALE 10 (NEGATIVE) 50 (POSITIVE) PSYCHOLOGICAL CONTINUUM MOST NEGATIVE MOST POSITIVE problems like ... do equal distances along the physical scale ... equate to the same and equal distances along the psychological continuum? is there a linear relationship between these two? curvilinear? so, i think what we really mean by scale is this construct ... ie, the psychological continuum ... and a scale value would be where a S is along it ... but, about the best we can do to assess this is to see where the S is along the physical scale ... ie, where from 10 to 50 ... and use this as our PROXY measure ... BUT IN any case ... i think it is helpful NOT to call the actual instrument ... the paper and pencil collection of items ... THE scale ... _ dennis roberts, educational psychology, penn state university 208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED] http://roberts.ed.psu.edu/users/droberts/drober~1.htm = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Venn diagram program?
You can draw Venn diagrams very easily in Powerpoint using the ellipse/circle and box/rectangle tools. Draw the diagram, group all the bits together, and copy it into Word or whatever. Whether it is 'publication quality' depends on your definition of this term. Alan Donald Burrill wrote: On 16 Aug 2001, John Uebersax asked for software that produces publication quality Venn diagrams: I want something to summarize and communicate to non-statisticians (e.g., physicians) the overlap between two sets (such as patients who have Major Depression and those who receive antidepressant meds). Do you have reason to believe that your clients are particularly familiar with, and accustomed to interpreting, Venn diagrams? If not, why not use a simple two-way table of frequencies (or proportions)? This has the possible virtue of being readily extensible to three or more sets, whereas the characteristics you ask for below can be guaranteed only for two sets in Venn diagrams (and even then not for the complementary space representing the elements that belong to neither set). The diagram should show the area of each circle as proportional [to] its N, and the overlap area as proportional to the number of cases in both groups. Venn diagrams don't strictly need to be displayed in terms of circles; it's merely customary, or perhaps conventional. (Possibly because rough circles are easier to draw on a blackboard in more or less recognizable form than squares or rectangles.) The geometric task would be easier if you used squares, for which this kind of proportionality is fairly easy to arrange (and construct). Of course, in no case can you manage to get the area of the circles (or squares, or whatever figures please you) to be proportional to their respective N's *and* have the area of the complementary set (those that are neither 'A' nor 'B') proportional to its N, unless the complementary set is rather large in comparison to 'A' and 'B'. It would be possible to subdivide a square or rectangular space into four subsets whose areas are proportional as described; but I do not think one could guarantee that more than three of the four subsets would be rectangular (the fourth might be L-shaped), nor that the sets 'A' and 'B' (both of which contain 'AB') would both be rectangular. Tables are more general, and in some senses simpler (the subspaces are all rectangular, you can display 'A' and 'B' with differently colored outlines, and their intersection is obvious). But perhaps this approach would not be viable, if you happen to be dealing with numerophobes for clients. (OTOH, the *logical* relationships are fairly clear, and one can always avoid talking about the actual *numbers* involved.) Donald F. Burrill [EMAIL PROTECTED] 184 Nashua Road, Bedford, NH 03110 603-471-7128 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102 Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Forecasting Seasonal Indices Question (Long)
0.55492 1.23524 1.201144 Another approach is to fit a regression line to the data, find a ratio of actual to trend and then average the indices for each period. That approach yields the indices: 1.01504 0.559943 1.232433 1.184109 3.991525 Scaling everything to total to 4.00 and comparing the results, we have:

              Forward     Average
Centered MA   Ended MA    Difference   Regression
-----------   --------    ----------   ----------
   0.9234      0.7277       1.0087       1.0172
   0.5802      0.4752       0.5549       0.5611
   1.2068      1.8719       1.2352       1.2350
   1.2896      0.9253       1.2011       1.1866

Now, I understand why the results might be slightly different but it seems to me that they should be closer than they are. Any comments? Dr. Ronny Richardson Associate Professor of Management Southern Polytechnic State University School of Management 1100 South Marietta Parkway Marietta, GA 30060-2896 Phone: (770) 528-5542 Fax: (770) 528-4967 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102 Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
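The two approaches being compared can be sketched in a few lines. The quarterly series below is hypothetical (the original data are not quoted in the message), so its indices will not match the table above; the point is only the mechanics of ratio-to-centred-moving-average versus ratio-to-regression-trend, each averaged by quarter and rescaled to total 4.

```python
# Sketch: seasonal indices from (a) ratios to a centred 4-quarter moving
# average and (b) ratios to a fitted linear trend, using an invented series.
import numpy as np
import pandas as pd

y = pd.Series([102, 56, 124, 119, 110, 60, 133, 128, 118, 65, 142, 136],
              index=pd.period_range("2000Q1", periods=12, freq="Q"))

# (a) centred 4-quarter moving average: average of two adjacent 4-quarter MAs
cma = y.rolling(4).mean().rolling(2).mean().shift(-2)
ratio_cma = (y / cma).groupby(y.index.quarter).mean()
idx_cma = 4 * ratio_cma / ratio_cma.sum()

# (b) linear regression trend
t = np.arange(len(y))
b1, b0 = np.polyfit(t, y.values, 1)          # slope, intercept
trend = b0 + b1 * t
ratio_reg = pd.Series(y.values / trend, index=y.index).groupby(y.index.quarter).mean()
idx_reg = 4 * ratio_reg / ratio_reg.sum()

print(pd.DataFrame({"centred MA": idx_cma, "regression": idx_reg}).round(3))
```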
Re: regressive question
Thanks to everyone who answered my question. The various reservations about such a test were spot on, and helpful. My own reservations were, I think, because it is not at all clear what the null would be in this case. Are you testing mu = beta_0 (so using the null model with fixed mean) or beta_0 = mu (so using the regression model with potentially variable mean)? Alan -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102 Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: A question
Thanks, Robert, and to anyone else who has kindly answered what I realised, belatedly, was a simple question (given that I was looking for the simple normal case). Regards, Alan Robert J. MacG. Dawson wrote: Alan McLean wrote: Hi to all. Can anyone tell me what the distribution of the ratio of sample variances is when the ratio of population variances is not 1, but some specified other number? *If* the population distributions are normal (and this is not a robust assumption - in other words, if it's moderately wrong you are *not* safe from error) it's just a scaled F distribution. If X has variance a^2 and Y has variance b^2, then (b^2/a^2) s^2_X/s^2_Y = s^2_(X/a)/s^2_(Y/b) ~ F. -Robert Dawson = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102 Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
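Robert's answer translates directly into a calculation. A sketch assuming normal populations; the sample sizes are invented, and r stands for the assumed population variance ratio:

```python
# Sketch: if the true variance ratio var_X/var_Y is r, then (s2_X/s2_Y)/r
# follows an F distribution with (n_X - 1, n_Y - 1) degrees of freedom,
# assuming normal populations. Sample sizes below are hypothetical.
from scipy import stats

n_x, n_y = 15, 12      # hypothetical sample sizes
r = 2.0                # assumed population variance ratio var_X / var_Y

# Probability of a sample ratio of 1 or less when the true ratio is 2
prob = stats.f.cdf(1.0 / r, n_x - 1, n_y - 1)
print(f"P(s2_X/s2_Y <= 1 | true ratio = {r}) = {prob:.4f}")
```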
A question
Hi to all. Can anyone tell me what the distribution of the ratio of sample variances is when the ratio of population variances is not 1, but some specified other number? I want to be able to calculate the probability of getting a sample ratio of 1 when the population ratio is, say, 2. Many thanks in advance. Alan -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102 Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Artifacts in stats: (Was Student's t vs. z tests)
Herman Rubin wrote: In article [EMAIL PROTECTED], Alan McLean [EMAIL PROTECTED] wrote: Robert J. MacG. Dawson wrote: Alan McLean wrote: The p value is a direct measure of 'strength of evidence'. and Lise DeShea responded: ... There is certainly no contradiction. A small p value indicates that the effect (whatever its size!) is (probably) valid. (Use the word 'genuine' if you prefer.) The effect is (probably) valid in any case. What is being tested, which is often not what it is said is being tested, is almost certainly false. The effect may be too small to be of much use, but that is a very different question. But this should be the only question. What action should be taken? It cannot possibly be the only question. One of the roles of statistics, and it is performed particularly by hypothesis testing, is to be conservative - to stop people from taking foolish actions by jumping to conclusions. If you observe a large effect, you shout whoopee! and jump in - invest your life savings, write your world shattering paper, or whatever. Then your friendly neighbourhood statistician does a test on your data and points out that this large effect appears to be mostly a matter of chance - it was not 'significant'. He does say that it *might* be genuine! But you are more likely to get egg on your face... Of course the size of the (apparent) effect and its significance are related. But both are important. On a different issue, the frequent claim that 'the null is always false' is a meaningless statement - at best, irrelevant. A significance test compares two *models*, providing evidence as to which of them is (probably) the better choice. It does not pretend to say anything about 'true' values of parameters, and does not deal with exactitude. Unfortunately it is usually taught in those terms - leading to such ideas as 'the null is always false'! Regards, to all, Alan -- This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University. Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399 [EMAIL PROTECTED] Phone: (765)494-6054 FAX: (765)494-0558 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: p- values Was: Re: Artifacts in stats: (Was Student's t vs. z
Jerry Dallal wrote: Herman Rubin wrote: A p-value tells me nothing of importance. It's hard to resist the challenge, except I have to agree (if we qualify it by adding the word 'alone', that is, 'A p-value alone tells me nothing of importance.') I give in to the challenge too. Here is a p value alone: 0.023456789 Of course it tells me nothing - of importance or otherwise. Alan -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Artifacts in stats: (Was Student's t vs. z tests)
I agree - although students do need tables in (written) exams... But we use a computer program called Tuteman in our teaching and testing, so the natural way to find critical values or p-values is via the computer - we use Excel mainly. In general, I emphasise the use of p values - in many ways it is a more natural way than using critical values to carry out a test. The p value is a direct measure of 'strength of evidence'. Alan Paul W. Jeffries wrote: Robert Dawson said that one of his approaches to dealing with z test is to treat it as a historical anecdote. I like that approach and must give it a try. But this approach made me think about artifacts in statistics. What are list members views on teaching students to use tables. In the computer age, tables are an anachronism. The vast majority of students will never use a t table. They will just rely on the computer to print the p value. And those rare students that might want to check something on a table will probably be the ones who know enough stats so that they can quickly figure out how to read a table. Does fussing with tables get in the way of students' understanding hypothesis testing or do tables help? I am interested to hear the views of list members. Paul W. Jeffries Department of Psychology SUNY--Stony Brook Stony Brook NY 11794-2500 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
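A small sketch of the 'computer instead of tables' point: the p-value and critical value for a made-up t statistic, with the legacy Excel function calls noted in the comments for comparison.

```python
# Sketch: what students would otherwise look up in a t table.
from scipy import stats

t_obs, df = 2.31, 18                            # made-up test statistic and df

p_two_tailed = 2 * stats.t.sf(abs(t_obs), df)   # Excel (legacy): TDIST(2.31, 18, 2)
crit = stats.t.ppf(0.975, df)                   # Excel (legacy): TINV(0.05, 18)
print(f"p = {p_two_tailed:.4f}, two-tailed 5% critical value = {crit:.3f}")
```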
Re: Artifacts in stats: (Was Student's t vs. z tests)
Robert J. MacG. Dawson wrote: Alan McLean wrote: The p value is a direct measure of 'strength of evidence'. and Lise DeShea responded: I disagree. The p-value may be small when a study has enormous power yet a small effect size. A p-value by itself doesn't say much. I don't think there's actually a contradiction here, provided that strenth of evidence [against the null hypothesis] is not misunderstood to mean strength of evidence for the conclusion you are trying to draw, this latter rarely being the literal denial of the null hypothesis. -Robert Dawson There is certainly no contradiction. A small p value indicates that the effect (whatever its size!) is (probably) valid. (Use the word 'genuine' if you prefer.) The effect may be too small to be of much use, but that is a very different question. Alan -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Student's t vs. z tests
I can't help but be reminded of learning to ride a bicycle. Nearly 100% of people ride one with two wheels (natch!) - but many children do start to learn with training wheels... Alan dennis roberts wrote: the fundamental issue here is ... is it reasonable to expect ... that when you are making some inference about a population mean ... that you will KNOW the variance in the population? i suspect that the answer is no ... in all but the most convoluted cases ... or, to say it another way ... in 99.99% (or more) of the cases where we talk about making an inference about the mean in a population ... we have no more info about the variance than we do the mean ... ie, X bar is the best we can do as an estimate of mu ... and, S^2 is the best we can do as an estimate of sigma squared ... this is why i personally don't like to start with the case where you assume that you know sigma ... as a simplification ... since it is totally unrealistic ... start with the realistic case ... even if it takes a bit more doing to explain it = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102 Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Student's t vs. z tests
All of your observations about the deficiencies of data are perfectly valid. But what do you do? Just give up because your data are messy, and your assumptions are doubtful and all that? Go and dig ditches instead? You can only analyse data by making assumptions - by working with models of the world. The models may be shonky, but they are presumably the best you can do. And within those models you have to assume the data is what you think it is. I agree that we do not, in general, make it sufficiently clear to students that all statistical analysis deals with models, and those models involve assumptions which are frequently heroic - but you do have to get down to doing some analysis at some time, you can't just whinge about the lousy data, and to do that analysis you pick the techniques appropriate to the models you are working with. Alan dennis roberts wrote: At 08:46 AM 4/20/01 +1000, Alan McLean wrote: So the two good reasons are - that the z test is the basis for the t, and the understanding that knowledge has a very direct value. I hasten to add that 'knowledge' here is always understood to be 'assumed knowledge' - as it always is in statistics. My eight cents worth. Alan the problem with all these details is that ... the quality of data we get and the methods we use to get it ... PALE^2 in comparison to what such methods might tell us IF everything were clean DATA ARE NOT CLEAN! but, we prefer it seems to emphasize all this minutiae .. rather than spend much much more time on formulating clear questions to ask and, designing good ways to develop measures and collect good data every book i have seen so causally says: assume a SRS of n=40 ... when SRS are nearly impossible to get we dust off assumptions (like normality) with the flick of a cigarette ash ... we pay NO attention to whether some measure we use provides us with reliable data ... the lack of random assignment in even the simplest of experimental designs ... seems to cause barely a whimper we pound statistical significance into the ground when, it has such LIMITED application and the list goes on and on and on but yet, we get in a tizzy (me too i guess) and fight tooth and nail over such silly things as should we start the discussion of hypothesis testing for a mean with z or t? WHO CARES? ... the difference is trivial at best in the overall process of research and gathering data ... the process of analysis is the LEAST important aspect of it ... let's face it ... errors that are made in papers/articles/research projects are rarely caused by faulty analysis applications ... though sure, now and then screw ups do happen ... the biggest (by a light year) problem is bad data ... collected in a bad way ... hoping to chase answers to bad questions ... or highly overrated and/or unimportant questions NO analysis will salvage these problems ... and to worry and agonize over z or t ... and a hundred other such things is putting too much weight on the wrong things AND ALL IN ONE COURSE TOO! (as some advisors are hoping is all that their students will EVER have to take!) 
-- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = == dennis roberts, penn state university educational psychology, 8148632401 http://roberts.ed.psu.edu/users/droberts/drober~1.htm = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: normal approx. to binomial
I think you are confusing the idea of a sample with the source of a binomial random variable. The binomial model applies when some action is repeated a specified number of times, n; when we are interested in the occurrence or not of some outcome; when the probability of that outcome is the same for all repetitions of the action (ie all trials); when trials are independent; and when the variable of interest is the number of times the outcome of interest occurs. Dead simple example: toss a coin 10 times; assume the 10 tosses are independent; we want the number of heads; assume the probability of a head on each toss is 0.5 (or whatever). The variable X = number of heads out of 10 tosses is binomially distributed - more precisely, the binomial model is a (very) good model for this situation. A sampling distribution is just a probability distribution which occurs as a result of sampling. In the present context, we might take a sample of values of X. A sample of size 20, for example, would mean repeating the whole shebang 20 times - each time you toss the coin 10 times and record the number of heads. Now suppose we want to measure some characteristic of this sample - for example, the mean value of X, or the proportion of times the value of X is greater than 5, or .. This measure is a statistic of the sample. It is clearly also a random variable since it varies over the samples taken. The probability model which describes how the statistic varies over the population of all possible samples of that size is called the sampling distribution for that statistic. So a sampling distribution is just an ordinary probability distribution, in the particular case where the population is a population of samples. If you take samples of size 1, and the statistic you record for that sample is simply the value of X, you have the 'parent' distribution - so the latter is just one of the sampling distributions you can have for a particular situation. With the binomial there is a complication - if you have a particular characteristic in a population, and you take a simple random sample from that population, and measure the number of times the characteristic occurs in the sample, the binomial model describes this. Exactly, if the sampling is with replacement, approximately if without replacement. To answer your first question - ANY binomial model, whatever its origin, is approximately normal for large enough n. This has nothing to do with sampling (except that the application may be in sampling, as in the previous paragraph). A bit long winded - sorry! Alan James Ankeny wrote: Hello, I have a question regarding the so-called normal approx. to the binomial distribution. According to most textbooks I have looked at (these are undergraduate stats books), there is some talk of how a binomial random variable is approximately normal for large n, and may be approximated by the normal distribution. My question is, are they saying that the sampling distribution of a binomial rv is approximately normal for large n? Typically, a binomial rv is not thought of as a statistic, at least in these books, but this is the only way that the approximation makes sense to me. Perhaps, the sampling distribution of a binomial rv may be normal, kind of like the sampling distribution of x-bar may be normal? This way, one could calculate a statistic from a sample, like the number of successes, and form a confidence interval. 
Please tell me if this is way off, but when they say that a binomial rv may be normal for large n, it seems like this would only be true if they were talking about a sampling distribution where repeated samples are selected and the number of successes calculated. ___ Send a cool gift with your E-Card http://www.bluemountain.com/giftcenter/ = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
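A small numerical sketch of the point (Python, with n and p chosen arbitrarily): the approximation is to the binomial variable itself, and no repeated sampling is involved.

# Sketch: exact binomial probabilities versus the normal approximation.
from scipy import stats

n, p = 100, 0.5
binom = stats.binom(n, p)
normal = stats.norm(loc=n * p, scale=(n * p * (1 - p)) ** 0.5)

for k in (40, 45, 50, 55, 60):
    exact = binom.cdf(k)
    approx = normal.cdf(k + 0.5)            # with a continuity correction
    print(f"P(X <= {k}): exact = {exact:.4f}, normal approx = {approx:.4f}")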
[Fwd: Re: Statistical teaching/learning software]
Original Message Subject: Re: Statistical teaching/learning software Date: Tue, 27 Mar 2001 09:57:13 +1000 From: Catherine Rytmeister [EMAIL PROTECTED] Organization: DEFS, Macquarie University To: [EMAIL PROTECTED] Hi Alan, For some reason (although I can receive the Stat-Ed list) I am not able to send to it. I will look into this but in the meantime could you forward the following to the list: Hi Alan and others, We use a CD-ROM in our Introductory Statistics unit here at Macquarie University. It was developed and produced by Don McNeil, Jenny Middeldorp, Hilary Green and myself, with the assistance of our Centre for Flexible Learning and a grant from the University. It is currently in its second version. The total enrolment in all offerings of this unit over the year is about 3000, so there is a fair bit of investment in this course and the accompanying teaching materials. The CD includes lecture slides in pdf format (based on the Powerpoint slides used in face-to-face lectures), with annotations; worked examples; multiple choice quizzes with hints and feedback for each response; and an Excel stats package add-in called EcStat, which is Don's baby. Students also purchase materials to complement lectures and direct practical work, and the textbook for the unit is Don McNeil's Modern Statistics: A Graphical Introduction. There are weekly prac tests and quizzes using the Web-CT facility to record attempts and marks. If anyone is interested in obtaining a copy of the CD I will discuss whether we can send samples out - Don will have to make the final decision. Depends on how many requests we get, and on how much we need the money! :-) Cathy Rytmeister = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: elementary prob./stats concepts
Here's a (hopefully) simple explanation of some of this. With your sample of n weights, the natural way to think of them is that you have one random variable, X = weight of a randomly chosen individual, and you have n observations of this one variable. The mathematical model of this situation is rather different. We suppose we have n random variables: X1 = weight of the first randomly chosen individual, X2 = weight of the second randomly chosen individual, X3 = weight of the third randomly chosen individual, etc. We also suppose that these variables have the same distribution. Furthermore, we assume that the variables are independent - this is valid because of the random selection. This approach is not unreasonable in practical terms, because it is at least feasible that as you proceed to take your sample, the distribution changes, particularly if the population is small, so the assumption of 'identically distributed' is indeed an assumption. More importantly, it enables us to use the mathematics of functions of random variables. For example, E(X1+X2+...+Xn) = E(X1) + E(X2) + ... + E(Xn) = n*E(X), so that E(Xbar) = E(X). James Ankeny wrote: According to a textbook I have, a random sample of n objects from a random variable X, is composed of n random variables itself, namely, X1,X2,...,Xn. I am having some difficulties in figuring out how to interpret this. For example, suppose that you are considering the population of adult males in the U.S., and the random variable is weight. If you take a random sample of n individuals, are the elements of the sample random (prior to observing them, of course) because you might observe something different in another sample due to measurement error? Or perhaps you might get something different if you took the sample at a different time when weight has changed? Also, if the elements of a random sample are random variables themselves, do they have their own parameters, such as mean and standard deviation, as well as their own density functions and cumulative distribution functions? Also, if a statistic is a function of random variables, can a statistic take the form of a density function with a random vector representing the n variables? I know, conceptually, that the sampling distribution of a statistic is purely theoretical and that it represents how a statistic varies from one sample to another. Mathematically, however, I do not understand how to represent this, or if the sampling distribution of a statistic is analogous to the distribution of a random variable which may have a density function. I do not know if these questions even make any sense, but the concepts are fairly confusing to me. Any help would be greatly appreciated. ___ Send a cool gift with your E-Card http://www.bluemountain.com/giftcenter/ = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
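A simulation sketch of the X1,...,Xn formulation (Python; the mean, standard deviation, sample size and number of samples are all invented): each row of the array is one sample, and the sample mean Xbar varies from row to row around E(X), with the spread the theory predicts.

# Sketch: each row is one sample of n 'weights'; Xbar is one statistic per row.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, n_samples = 85.0, 12.0, 25, 10_000        # hypothetical values

samples = rng.normal(mu, sigma, size=(n_samples, n))     # rows are X1..Xn
xbars = samples.mean(axis=1)                              # one Xbar per sample

print(f"E(X) assumed = {mu}, average of the Xbar's = {xbars.mean():.2f}")
print(f"sd of the Xbar's = {xbars.std(ddof=1):.2f}  (theory: {sigma / np.sqrt(n):.2f})")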
Statistical teaching/learning software
Hi to All, I am at present trying to find sources of computer software, including web resources, for teaching and learning statistics. My interest is in question-and-answer software, the sort that is used for providing practice exercises with help, preferably more than just drill questions. My special interest is in the use of randomisation by these resources/tools/packages. I would appreciate it if people could tell me what they know of such. Thanks in advance, Alan -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Most Common Mistake In Statistical Inference
The second sentence here ensures that generalisability to a population IS an issue for statistics. And a big issue, usually overlooked. For that matter, many applications of statistics do use sampling, not random assignment (market surveys, for example) and in these applications Dennis' observation is spot on. Alan Elliot Cramer wrote: given random assignment the generalizability of results to a population is not an issue for statistics. It's a question of what a plausible population is, given the procedure for obtaining subjects On Thu, 22 Mar 2001, dennis roberts wrote: using and interpreting inference procedures under the assumption of SRS simple random samples ... when they just can't be = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: One tailed vs. Two tailed test
I agree that it's the detail about which we disagree! However, one detail is pretty important - I still think you are confusing the trial and the statistical test. The same confusion is shown on the web site. I agree totally that if the treatment appears to be significantly worse than the control treatment (as in your last paragraph below, and as you illustrate with an example on the web page) you have to do something about it. But - this 'something' is quite different from the 'something' you do if you conclude that the treatment is significantly better than the control. In essence, you are setting up a second question - that is, a second pair of hypotheses. The primary question is: Is the new treatment better than the control? (This has to be the primary question in most such research - it would certainly be unethical to trial a treatment that you think is worse than the control.) The secondary question is: Is the new treatment worse than the control? Actually the secondary question is: If the new treatment is no better, is it worse than the control? I concede that you can view these two questions as one, but I think that that is confusing and (therefore) not good design. Regards, Alan Jerry Dallal wrote: We don't really disagree. Any apparent disagreement is probably due to the abbreviated kind of discussion that takes place in Usenet. See http://www.tufts.edu/~gdallal/onesided.htm Alan McLean ([EMAIL PROTECTED]) wrote: My point however is still true - that the person who receives the control treatment is presumably getting an inferior treatment. You certainly don't test a new treatment if you think it is worse than nothing, or worse than current treatments! Equipoise demands the investigator be uncertain of the direction. The problem with one-tailed tests is that they imply the irrelevance of differences in a particular direction. I've yet to meet the researcher who is willing to say they are irrelevant regardless of what they might be. For the sample data I compute xbar (the difference of sample means if there is a control group). There are three possibilities. 1. xbar is negative If 1 does happen, we would conclude either that the new treatment is no better than the control, and may be worse. In either case we junk the new treatment. The question is, do you look to see how much worse? If the answer is no, then I've no argument. But everyone looks. It's unethical not to! --Jerry = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: One tailed vs. Two tailed test
Will Hopkins wrote: Responses to various folks. And to everyone touchy about one-tailed tests, let me make it quite clear that I am only promoting them as a way of making a sensible statement about probability. A two-tailed p value has no real meaning, because no real effects are ever null. A one-tailed p value, for a normally distributed statistic, does have a real meaning, as I pointed out. But precision of estimation--confidence limits--is paramount. Hypothesis testing is passe. ... The only use for a test statistic is to help you work out a confidence interval. Don't ever report them in your papers. This is arguably the case for research matters when estimating/testing a mean - a confidence interval and a test are two ways of approaching the same thing. Even there, the hypothesis testing approach is a useful way of thinking. It is exactly the scientific method writ small. I also happen to think that all tests should be one tailed, but almost certainly not for the same reasons as Will's. In 'practical statistics' such as quality control, one is only interested if the sample mean is sufficiently close to what it should be that one can proceed as if it does equal what it should - that is, accept the null model and proceed - or not. If it is not, the 'true value' (meaningless phrase!) is of no interest, so obtaining a confidence interval is a waste of time. It could be done, but offers nothing. Hypothesis testing is essentially a method of selecting between models. Should I use the model with mu = 0, or a model with mu not= 0? If the latter, what value of mu should I use? A more illuminating example is simple linear regression. Should I use the model with beta = 0 (that is, the 'constant mean' model, Y = mu + epsilon) or the model with beta not= 0 (that is, the varying mean model, Y = alpha + beta*X)? This is clearly a choice between two different models. Again one can resolve it by using a test statistic or by calculating a confidence interval, but in both cases you are doing the same thing - deciding between the two models. The questionable thing about hypothesis testing is the fact that the null model is privileged over the alternative. But this is resolved as follows: if a test statistic is not significant (or equivalently, if the confidence interval includes zero) then it does not matter which model you choose. But you do have to choose, at least tentatively. (In a quality control application you have to decide really; in research, you choose tentatively.) All this means is that you make your decision on some other basis than the statistics. For the regression example, we would decide on the basis of simplicity. In a court case we decide on the basis of fairness. In the case of research we decide on the basis of accepted theory. Hypothesis testing is certainly not passe! Regards, Alan -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
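A sketch of the regression example (Python, simulated data with an arbitrarily chosen true slope): the slope test is exactly the choice between the constant-mean model and the varying-mean model.

# Sketch: choosing between Y = mu + epsilon and Y = alpha + beta*X + epsilon
# by testing beta = 0 on simulated data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, size=40)
y = 3.0 + 0.5 * x + rng.normal(0, 2.0, size=40)    # true beta = 0.5 (invented)

result = stats.linregress(x, y)                     # slope test against beta = 0
print(f"beta-hat = {result.slope:.3f}, p = {result.pvalue:.4f}")
# Small p: use the varying-mean model. Large p: either model will do, and we
# keep the simpler, constant-mean one - a choice made on nonstatistical grounds.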
Re: Two sided test with the chi-square distribution?
I think some of this is a matter of vocabulary. Do you say 'one tailed test' or 'one sided test'? (Ditto for 'two'.) People seem to use the two phrases fairly interchangeably. In this context, it does not matter whether you think of the F distribution as having two 'ends' - and you can use one or both of them in a test - or two tails (one very short and stubby, one long and skinny) - and you can use one or both of them in a test. I used the term 'one-tailed' in my previous email. If you prefer, change this to 'one sided'. dennis roberts wrote: distributions are inherently TWO ended ... at least i have never seen one that had, say ... an upper end but no lower end ... how a particular significance TEST uses a distribution ... one end or both ... is a function of how the test statistic is defined It is not the test statistic, but the test hypotheses that determine whether a test is one or two sided. Alan -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: p values
dennis roberts wrote: in an article ... that some might be able to access ... http://bmj.com/cgi/content/full/322/7280/226 by Jonathan A C Sterne, senior lecturer in medical statistics, George Davey Smith, professor of clinical epidemiology. Department of Social Medicine, University of Bristol, Bristol BS8 2PR one of the summary points made is the following: "P values, or significance levels, measure the strength of the evidence against the null hypothesis; the smaller the P value, the stronger the evidence against the null hypothesis" my main questions of this are: 1. does the general statistical community accept this as being correct? 2. if the answer to #1 is yes ... then what does this tell us (only this p value) about what the real parameter value is? (are) It doesn't say anything about the actual value - and why should it? It is not a measure of the value, but a measure of the strength of the (sample) evidence *about* the value! Alan _ dennis roberts, educational psychology, penn state university 208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED] http://roberts.ed.psu.edu/users/droberts/drober~1.htm = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: p values
Hi Mike, An hypothesis test is only done when the sample evidence disagrees with the null hypothesis - for example, the sample mean is different from the mean postulated by the null. So to all intents and purposes, there is always evidence against the null. (Another way to express this - if we were only working from the sample, so we do not have this idea that the null 'should' be true', we would accept the evidence of the sample, in that we would estimate the mean based on the sample mean.) What a p value does is provide a measure of the strength of the (sample) evidence against the null - it is not itself that evidence! The interpretation of numerical values of p is largely a matter of common agreement. A p-value between say .2 and .9 is commonly thought to indicate that the evidence is so weak that there is no question of rejecting the null. (Note that to say that it indicates there is NO evidence, as some authors do, is simply wrong.) Between .1 and .2, the evidence is pretty weak, and mostly people would not reject the null. And so on. The real rider in all of this is that this all depends on the overall model (in effect, the test) used is reasonably appropriate! This is decided on evidence - from common experience, including research, from analysing the sample data; and to a varying degree, from hope and wishful thinking! Regards, Alan Mike Granaas wrote: On Mon, 29 Jan 2001, dennis roberts wrote: one of the summary points made is the following: "P values, or significance levels, measure the strength of the evidence against the null hypothesis; the smaller the P value, the stronger the evidence against the null hypothesis" I would add that the authors also discuss p-values between .1 and .9 as providing weak evidence against the null. And at this level I am not at all comfortable with the notion of a p-value as evidence against the null. If anything large p-values should indicate that the data is quite likely if the null is true. It is only when the p-values become small that we are confronted with the possibility of a) bad data or b) bad null. Even then we have to hedge our bets since high power can give us small p-values with small effect sizes. Michael my main questions of this are: 1. does the general statistical community accept this as being correct? 2. if the answer to #1 is yes ... then what does this tell us (only this p value) about what the real parameter value is? (are) _ dennis roberts, educational psychology, penn state university 208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED] http://roberts.ed.psu.edu/users/droberts/drober~1.htm = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = *** Michael M. Granaas Associate Professor[EMAIL PROTECTED] Department of Psychology University of South Dakota Phone: (605) 677-5295 Vermillion, SD 57069 FAX: (605) 677-6604 *** All views expressed are those of the author and do not necessarily reflect those of the University of South Dakota, or the South Dakota Board of Regents. 
= Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: p values
dennis roberts wrote: At 08:42 AM 1/31/01 +1100, Alan McLean wrote: It doesn't say anything about the actual value - and why should it? It is not a measure of the value, but a measure of the strength of the (sample) evidence *about* the value! Alan alan, seems like we are going 'round in circles ... we agree that there has to be some null, right? we agree that there has to be some sample data, with which to test the null, right? (or if you prefer, to compare TO the null) so, if that is the case ... and maybe it is not ... it is the interplay between sample data and the null, correct? if the p is low or high ... it has to say something about this interplay ... so, if the p is low ... then it suggests (more) that the sample data are not in alignment with the null (whatever it is) and if p is larger ... then it suggests this less, right? if this is not a pretty close to correct interpretation ... please clarify For hypothesis testing there does have to be a null model - that is the first feature that identifies hypothesis testing from other forms of model selection. A hypothesis test is only carried out if the sample data disagrees with the null. If it is a point null (eg mu = 20) this is almost guaranteed. The p value is essentially a measure of the level of disagreement between the sample data and the null. If p is low, there is strong disagreement; if p is high, there is weak disagreement. So I agree that your interpretation is reasonably correct. BUT - the p value still does not say anything about the actual value - only about the level of disagreement between what the null says it should be and the sample says it should be. Alan _ dennis roberts, educational psychology, penn state university 208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED] http://roberts.ed.psu.edu/users/droberts/drober~1.htm = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 == dennis roberts, penn state university educational psychology, 8148632401 http://roberts.ed.psu.edu/users/droberts/drober~1.htm -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: p values
dennis roberts wrote: At 12:14 PM 1/31/01 +1100, Alan McLean wrote: For hypothesis testing there does have to be a null model - that is the first feature that identifies hypothesis testing from other forms of model selection. check A hypothesis test is only carried out if the sample data disagrees with the null. If it is a point null (eg mu = 20) this is almost guaranteed. The p value is essentially a measure of the level of disagreement between the sample data and the null. If p is low, there is strong disagreement; if p is high, there is weak disagreement. So I agree that your interpretation is reasonably correct. phew ... thanks (at least so far) BUT - the p value still does not say anything about the actual value - only about the level of disagreement between what the null says it should be and the sample says it should be. well, since there are a zillion different possible nulls ... then sure but, in a given context there are not a zillion nulls ... only 1 (i assume) ... like, mu = 90 ... or, rho = .7 ... or sigma squared = 256 and since the null in a context is a constant ... but, the sample could be telling you varying things ... what we have is a difference value ... between a variable and a constant ... and while your argument seems to focus on the actual "difference" value ... it is not a floating difference at both ends ... only ONE end ... the sample end ... so, in fact, the difference value will lead you to that constant ... even if the variable (sample value) is moving ... which leads you right back to THAT null value so, if that is the case ... and while you might say that if the null had been mu = 90 and ... a given p is attached to that test ... that the p says nothing about THAT particular null ... would i be correct in saying that the p says something therefore about 91 ... or 87 ... or NO value? you might be technically correct ... and, if you want, i will concede that you are ... but, the practical distinction you are making escapes me ... if the p doesn't say something about the null you have posited ... i am wondering what the use of positing that null was in the first place and, then ... what help p is really bestowing upon you (whether it be p=.09 or .03 or .008?) with respect to that posited null i can't wait to try to make this distinction to my students ... In any given test, there is only one null. Importantly, this is determined by the research question. Further, for a given sample there is only one sample statistic - not 'varying things'. Certainly the particular sample result depends on which particular sample was taken, but here we are talking about using the result of one particular sample to test one particular null model. Suppose the null is that mu = 90, the alternative is then that mu =/ 90. You decide to take a sample and test this using the sample mean - to do a t test. (You could choose a different test - but we are talking about one test.) Suppose the sample mean is 85. Here is evidence that mu = 90 is not the best choice; on the basis of the sample, mu should be equal to (about) 85. Now - if we had taken the sample without this idea that mu should be 90, we would simply estimate mu, on the basis of the sample result, as 85. But since we do have the idea that mu should be 90, we have to decide whether to go ahead on the idea that mu = 90 or mu = 85. Note that in both cases this is a *model* - we do not particularly believe that mu is exactly equal to either. (The comment that the null is always false is meaningless. The null does not say what *is*, but what we will take 'it' to be.) The difference here is that if we cannot pick between the two choices, we will plump for the null - in this case, mu = 90. This is for nonstatistical reasons - simplicity, fairness, innate conservatism, ... So the 'value' that you originally referred to was (as I understood it) the value of mu - either 90 or 85. The test, incidentally, does not say anything about any other choices. My statement was that the p value does not say anything about what this value is, in the sense of what the choices are. What it does do is to help us, given the two choices, to decide between the two choices: if p is low we will select 85; if p is high we will select 90. If you want to interpret this final sentence as saying something about the value - and in a sense it does - then we agree. Alan -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
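A sketch of the mu = 90 example (Python, with an invented sample centred near 85): the p value grades how strongly the sample disagrees with the mu = 90 model, and the decision is then between that model and the sample-based one.

# Sketch: testing the null model mu = 90 with a sample whose mean is about 85.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
sample = rng.normal(loc=85, scale=10, size=30)      # hypothetical sample

t_stat, p = stats.ttest_1samp(sample, popmean=90)
print(f"sample mean = {sample.mean():.1f}, t = {t_stat:.2f}, p = {p:.4f}")
# Low p: work with the sample value (about 85). High p: keep the null model (90).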
Re: The meaning of the p value
Will Hopkins wrote: I haven't followed this thread closely, but I would like to state the only valid and useful interpretation of the p value that I know. If you observe a positive effect, then p/2 is the probability that the true value of the effect is negative. Equivalently, 1-p/2 is the probability that the true value is positive. The probability that the null hypothesis is true is exactly 0. The probability that it is false is exactly 1. Estimation is the name of the game. Hypothesis testing belongs in another century--the 20th. Unless, that is, you base hypotheses not on the null effect but on trivial effects... With respect, Will, this is a very limited view of statistics in general and hypothesis testing in particular. One of the features of this view is that you think in terms of 'true values' rather than models. A null hypothesis is not 'true' - it may or may not be 'valid' in the sense that using it enables reasonable predictions. The same comment can be made of any scientific theory. In what sense is Relativity 'true'? But it enables reasonable predictions. Estimation is obviously important - but hypothesis testing, properly considered, is also essential. Regards, Alan Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Quantiles in Excel
Does anyone know the formulas that Excel uses in its QUARTILE and PERCENTILE functions? I couldn't find them in Help. Thanks in advance, Alan -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
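For what it is worth, the rule usually attributed to these functions is linear interpolation at position 1 + (n-1)p in the sorted data. I have not been able to confirm that from the Help either, so the sketch below (Python; the function name and example data are mine) is offered only as something to check against Excel's own output, not as a statement of its documented method.

# Conjectured rule for PERCENTILE (and QUARTILE = PERCENTILE at p = 0, .25, .5,
# .75, 1): linear interpolation at 1-based position 1 + (n-1)*p in sorted data.
def percentile_excel_style(data, p):
    xs = sorted(data)
    pos = 1 + (len(xs) - 1) * p        # fractional 1-based position
    lo = int(pos)                       # integer part
    frac = pos - lo                     # fractional part
    if lo >= len(xs):
        return xs[-1]
    return xs[lo - 1] + frac * (xs[lo] - xs[lo - 1])

data = [15, 20, 35, 40, 50]             # made-up example
print(percentile_excel_style(data, 0.25))   # 20.0 under this rule
print(percentile_excel_style(data, 0.40))   # 29.0 under this rule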
Re: Software (fwd)
Hi Bob, What is FreeBSD? Alan Bob Hayden wrote: - Forwarded message from Ken - Of course you'll get what you pay for. "Comet" [EMAIL PROTECTED] wrote in message [EMAIL PROTECTED]">news:[EMAIL PROTECTED]... I search a good and free:) sofware of stat - End of forwarded message from Ken - When I installed Win98 on my computer at home it crashed multiple times per day. I'm writing this on a FreeBSD system that supports thousands of users and crashes less than once a year. _ | |Robert W. Hayden | | Work: Department of Mathematics / |Plymouth State College MSC#29 | |Plymouth, New Hampshire 03264 USA | * |fax (603) 535-2943 /| Home: 82 River Street (use this in the summer) | )Ashland, NH 03217 L_/(603) 968-9914 (use this year-round) Map of New[EMAIL PROTECTED] (works year-round) Hampshire http://mathpc04.plymouth.edu (works year-round) The State of New Hampshire takes no responsibility for what this map looks like if you are not using a fixed-width font such as Courier. "Opportunity is missed by most people because it is dressed in overalls and looks like work." --Thomas Edison = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ ========= -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Re : Hypothesis testing
The answer to 1) is 'one'. Alan sunny wrote: I need help with the following hypothesis testing question : "The data processing department at a large company has installed new LCD monitors to replace the colour monitors used previously. The 95 operators trained to use the new monitors averaged 7.2 hours before achieving a satisfactory level of performance. Their sample variance was 16.2 squared hours. Past experience with operators on the old colour monitors showed that they averaged 8.1 hours on the machines before their performances were satisfactory. At the 0.01 significance level, should the supervisor of the department conclude that LCD monitor help the operator to learn?" Based on the above question, I have the following query: 1) whether the above is one population or two population? 2) If it is two population - is it independent or related? 3) If it is related - do u think the above sample variance given is actually the difference in the variance between 95 operators using color monitors and LCD monitors? 4) If it is independent - do u think the above sample variance provided is actually pooled variance? I will appreciate if anybody out there can help me with the above. Regards, Sunny [EMAIL PROTECTED] = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
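A sketch of the one-population reading of the question (Python; past experience supplies mu0 = 8.1 as the hypothesised mean, and the test is one sided, lower tail, at the 1% level):

# Sketch: one-sample t test from the summary figures given in the question.
from math import sqrt
from scipy import stats

n, xbar, s2, mu0, alpha = 95, 7.2, 16.2, 8.1, 0.01

t_stat = (xbar - mu0) / sqrt(s2 / n)
p = stats.t.cdf(t_stat, df=n - 1)        # one-sided (lower tail) p value
print(f"t = {t_stat:.2f}, p = {p:.4f}, reject H0 at 1%? {p < alpha}")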
Re:
You could try Mathtype, which works very well. The equation editor in Word is a cut down version of this. URL is http://www.mathtype.com/. Regards, Alan "Paul W. Jeffries" wrote: Dear List, What software would you recommend for writing documents that contain mathematical symbols? Microsoft Word does not have all the symbols I need. Paul W. Jeffries Department of Psychology SUNY--Stony Brook Stony Brook NY 11794-2500 [EMAIL PROTECTED] -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Laplace quote
Laplace once said: 'Probability is merely common sense reduced to numbers.' Can anyone provide a reference for this? My thanks, Alan McLean = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: .05 level of significance
Michael Granaas wrote: Someone, I think it was on this thread, mentioned Abelson's book "Statistics as Principled Argument". In this book Abelson argues that individual studies simply provide pieces of evidence for or against a particular hypothesis. It is the accumulation of the evidence that allows us to make a conclusion. (My apologies to Abelson if I have misremembered his arguments.) It is perfectly true that 'individual studies simply provide pieces of evidence for or against a particular hypothesis' - but it is equally true that multiple studies do the same. Assuming the multiple studies show the same results, the evidence is of course stronger - but it is still 'only' evidence. One can legitimately draw a conclusion on one or several studies. One's confidence (and the confidence of others!) in the conclusion depends on the strength of the evidence. One well designed, well carried out study with clear results provides strong evidence which may be enough to convince most people. Several such studies which support each other provide even stronger evidence. On the other hand, replications of poorly designed studies leading to unclear results may give a little more evidence, but not enough to convince people. In an individual study, the p-value(s) used is a measure of the strength of the evidence provided by the study - BUT it is totally dependent on the validity of the design of the study, the choice of variables, the selection of the sample, the appropriateness of the models used to obtain the p-value. So it is important, but certainly only one brick in the wall. And of course treating 5% as some God-given rule of importance is ridiculous. (It is nearly as bad as the N>30 'law' for treating a sample as 'large'.) But it is a useful benchmark figure. Regards, Alan -- Alan McLean (alan.buseco.monash.edu.au) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Hypothesis testing
Periodically there is a burst of discussion of hypothesis testing on this list, often with quite a lot of verbal pyrotechnics. With the current discussion going on, it seems an appropriate time to comment that a few weeks ago I sent out a call for people interested in presenting papers on Hypothesis Testing at the ICOTS6 Conference in Durban, South Africa in July 2002 to contact me. (ICOTS = International Conference on Teaching Statistics.) So - are any of you people who vigorously express views about the topic on the list interested in presenting those views at ICOTS? The abstract for the session topic is as follows. I'm sure that some of you could contribute very well to it. ++ A more complete title for this session is: "The varied roles of hypothesis testing and their place in statistical literacy". I hope that speakers within the ambit of this topic will address one or more of the following questions. What role or roles does hypothesis testing perform in statistics? If it performs multiple roles, what are the differences between the roles? Does hypothesis testing perform different roles in different disciplines; for example, in social sciences, particularly psychology and education marketing and related business areas finance and related business areas economics and econometrics biological sciences, particularly agriculture and medical research physical sciences? Is hypothesis testing perceived differently in different disciplines? How do these differing roles (if they do differ) influence the way the topic is viewed in these research disciplines? How do they influence the choice of methods used? How do they influence the way hypothesis testing is taught within these disciplines? How should they influence the way it is taught? Can a student in any of these disciplines be regarded as statistically literate if he or she is not strongly familiar with the concepts and techniques of hypothesis testing? Within these 'role' questions speakers may want to refer to 'old faithfuls' such as: What is the significance of a significant p-value? Do hypothesis testing and confidence intervals do the same things? ++= -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: memorizing formulas
"Karl L. Wuensch" wrote: I have always thought that success in stats courses was much more a function of a student's verbal aptitude and ability to think analytically, rather than mathematical aptitude. Has anybody actually tested this hypothesis? 1. This clearly depends on the particular (type of) stats course. 2. I would find 'ability to think analytically' hard to distinguish from 'mathematical aptitude' - although I accept that some narrow definitions of both characteristics may have minimal overlap. Alan -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: About Probability
I am sure there is a multitude of possible answers to this one. One way I would answer it is to say that probability is only applicable to *observable* events - that is, the occurrence of something which is in some way directly measurable. The existence of God is not observable in this sense, so probability is irrelevant to any discussion about the existence of God. Another, related way to express this is to say that belief in the existence of God is a *model* for the universe. Within that model probability questions can be asked, but one cannot talk meaningfully of the existence of the model. (The same comment applies, for example, about general relativity as a theory which models the universe.) Repeatability is certainly (oops! - with high probability) not a prerequisite for probability to make sense. Have fun. Alan Valar wrote: Hello to everyone! I has a question for you that comes from a discussion that I had with a friend of mine. Due to the fact that with the Bayes Probability definition we can define a probability even for events that doesn't occur necessarily several times he say that is possible to associate a probability to the existence of God. But I think that in this case probability has non sense because I think that the Bayes definition is usable only with events that are a priori reapeatable (even if they occurred only one time or they never occurred) or that are composed by some sub-events reapeatable For example he said me that we can associate a probability that with certain coditions it will rain, and we can do that even if these conditions occurr one time in our life, but I say that there is an important difference because the calculus of thi probability is made with physical consideration about several sub-events each with a probability that came from experience and physical models (that are based on experience too) What is the right opinion? Thanx for the attention See you Valar PS I'm sorry for my english that isn't very good -- Posted from mailsrv.sa.infn.it [193.205.70.3] via Mailgate.ORG Server - http://www.Mailgate.ORG = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ = -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: What are the differences between Statistics and Econometric?
I'm not really an econometrician, despite the department I work in. But having observed the guys who are - econometrics is concerned with the estimation and validation of economic models. So the emphasis is very much on regression methods (including multi-equation methods) and on time series. Rich Ulrich wrote: On Tue, 12 Sep 2000 22:47:46 -, "Jennifer Howser" [EMAIL PROTECTED] wrote: What are the differences between Statistics and Econometric? Thanks "Statistics" could be the general term than includes a number of narrower specialties. The specialties are apt to publish in different journals, and often use different vocabularies for describing common tests -- the basic sorts of data that they refer back to are apt to differ. One broad area is biostatistics. I don't know if "biometrics" is a proper subset, or if it overlaps. The tag "-metrics" has been added to quite a few terms, as in cliometrics (study of history with an emphasis on validating by quantification, such as, the logistics of feeding a city or hosting an army -- if it wasn't logically possible, then it probably did not happen with that many people). Most of Economics is concerned with numerical relations; I am not sure how "econometrics" fits as its sub-set. Economics has enormously long time-series (daily, if you want), of related variables. By its name, "econometrics" should be especially concerned with issues of "measurement." I guess that means that you leave out the political side of economics, unless you can measure it. I would be interested in seeing other descriptions or definitions. Does an encyclopedia of statistics include definitions? -- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ ===== -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Point vs. Interal estimation
Many people - including me - have been saying this for at least 20 years. The trouble is that people have different opinions on what the 'concepts' are. Plus maths is in many ways the best way to explain some of the concepts. And then you need to relate the maths to the methodological issues... Regards, Alan Jerry Dallal wrote: FWIW, if Prof. Rubin is a voice crying in the wilderness, the frontier is getting pushed back a bit. At the JSM, more than one speaker was suggesting that a first course in statistics focus on concepts and ignore the mathematics. John Bailar is one name that comes to mind -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
'Components of chi square'
Hi to all, For some years I have been teaching a technique which I know as testing the components of chi square in a standard contingency table problem. If you calculate the standardised residual SR = (fo - fe)/sqrt(fe) for each cell, these residuals are approximately normally distributed with mean zero and standard error given by SE = sqrt((1 - rowsum/overallsum)*(1-columnsum/overallsum)) provided the expected frequencies are large enough (as for the use of chi square itself). My problem is that I have no source for this technique. I have never seen it in a textbook. (I have no doubt about its validity, and frankly don't understand why textbooks do not refer to it.) Can anyone give me a reference to it? Ideally, a reference to its original publication. My thanks in advance. Alan -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 === This list is open to everyone. Occasionally, less thoughtful people send inappropriate messages. Please DO NOT COMPLAIN TO THE POSTMASTER about these messages because the postmaster has no way of controlling them, and excessive complaints will result in termination of the list. For information about this list, including information about the problem of inappropriate messages and information about how to unsubscribe, please see the web page at http://jse.stat.ncsu.edu/ ===
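For anyone who wants to try the calculation, here it is spelled out (Python, with an invented 2x3 table), using exactly the formulas above. As for a reference, I believe these are what the categorical data literature calls 'adjusted residuals', usually attributed to Haberman, but I would be glad of confirmation.

# Sketch of the 'components of chi square' calculation described above:
# SR = (fo - fe)/sqrt(fe), SE = sqrt((1 - rowsum/total)*(1 - colsum/total)),
# so SR/SE should be approximately standard normal cell by cell.
import numpy as np

fo = np.array([[20, 30, 25],
               [10, 40, 35]], dtype=float)           # observed frequencies (invented)

total = fo.sum()
rowsums = fo.sum(axis=1, keepdims=True)
colsums = fo.sum(axis=0, keepdims=True)
fe = rowsums * colsums / total                        # expected frequencies

sr = (fo - fe) / np.sqrt(fe)                          # standardised residuals
se = np.sqrt((1 - rowsums / total) * (1 - colsums / total))
adjusted = sr / se                                    # approx N(0,1) per cell

print(np.round(adjusted, 2))
print("chi-square =", round((sr ** 2).sum(), 2))      # the usual overall statistic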
Re: Rates and proportions
One might also ask what is meant by the 'population escape rate' in this context. Is the data not population data? Alan Dale Berger wrote: Hi Don et al., If we observe one escape out of 1250 inmates, why can't we reliably rule out zero as the population escape rate? The normal approximation to the binomial may not be appropriate here. Dale Berger "Unreliable" or "useless"? Well, the basic graininess in a rate is one escapee more (or less) than was reported. A rate of .08 per 100 is about 1 out of 1250. If the data on which the rate was based were 1 escapee out of 1250 inmates, one cannot _reliably_ tell the rate from zero. If the data were 13 escapees out of 16,200 inmates, one would have more faith in the rate, at least insofar as representing a small value different from (not equal to!) zero. Unfortunately, the rate itself does not tell one how grainy the data were. - Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 === This list is open to everyone. Occasionally, less thoughtful people send inappropriate messages. Please DO NOT COMPLAIN TO THE POSTMASTER about these messages because the postmaster has no way of controlling them, and excessive complaints will result in termination of the list. For information about this list, including information about the problem of inappropriate messages and information about how to unsubscribe, please see the web page at http://jse.stat.ncsu.edu/ ===
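To put numbers on the graininess, assuming the rate really did come from 1 escapee out of 1250 inmates (Python, exact Clopper-Pearson limits):

# Sketch: exact 95% interval for an underlying rate of 1 out of 1250,
# expressed per 100 inmates to match the reported 0.08 figure.
from scipy import stats

x, n = 1, 1250                                       # 1 escapee out of 1250
lower = stats.beta.ppf(0.025, x, n - x + 1)          # exact (Clopper-Pearson) limits
upper = stats.beta.ppf(0.975, x + 1, n - x)

print(f"observed rate = {100 * x / n:.3f} per 100")
print(f"95% CI: {100 * lower:.4f} to {100 * upper:.3f} per 100")
# The interval runs from nearly 0 up to several times the observed rate, so a
# reported 0.08 per 100 based on such data cannot reliably be told from zero.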
An interesting (I hope) problem
Hi to all. A friend of mine has a problem. The following is my understanding of the problem. She has a box of, say, 50 physically identical (to the eye, anyway) objects, but they vary in chemical composition - there may be half a dozen or so different compositions in the box. She has another of these objects, physically similar to those in the box. She needs to test the objects in the box to determine if the single object came from, or could have come from, this box. If one of the box objects matches the single object in chemical composition (presumably this match is within some level of precision) then she will be able to say that the object (probably) came from the box (or could have come from the box). If none of the box objects matches the single object, then the latter could not have come from the box. She has been asked to give a statistical formula to identify the sample size she will need to take to answer the question. The problem is not very clear - I think the people asking for the formula are not statisticians, but managers who think that anything that can possibly be quantified should be quantified. But maybe someone can come up with a suggestion I can pass on. All the best, Alan -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 === This list is open to everyone. Occasionally, less thoughtful people send inappropriate messages. Please DO NOT COMPLAIN TO THE POSTMASTER about these messages because the postmaster has no way of controlling them, and excessive complaints will result in termination of the list. For information about this list, including information about the problem of inappropriate messages and information about how to unsubscribe, please see the web page at http://jse.stat.ncsu.edu/ ===
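One possible way to put a formula on it, under an assumption that is itself the weak point: suppose that if the single object did come from the box, at least k of the 50 box objects share its composition. Then the chance that a simple random sample of m objects misses every one of them is hypergeometric, and m can be chosen to make that miss probability small. The sketch below (Python; the values of k are pure invention) just tabulates this. Of course it only controls the chance of a false 'could not have come from the box'; finding a match says no more than that it could have.

# Sketch: probability that a sample of m from the 50 objects contains none of
# the k assumed matching objects (hypergeometric 'miss' probability).
from math import comb

N = 50                                    # objects in the box
for k in (5, 8, 10):                      # assumed number of matching objects
    for m in (10, 15, 20):                # candidate sample sizes
        p_miss = comb(N - k, m) / comb(N, m)
        print(f"k = {k:2d}, sample m = {m:2d}: P(no match found) = {p_miss:.3f}")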
Re: What is standard deviation exactly?
There are a couple of (practical) features of the standard deviation that are worth noting. First, as a *descriptor* of the variation in a distribution, it is generally not very good. I mean this in the sense that if you want to visualise the amount of variation in a distribution the SD is only useful if the distribution is at least symmetric and preferably approximately normal. This appears to me to contribute to the difficulty that students have with it. Second, for a normal distribution, it is easily seen that the variation can be described (and measured) by the 'width of the peak'. The question is, at what point do we measure the width? Geometrically, the only two uniquely identifiable points on the curve, other than the maximum, are the two inflexions. (I usually describe these to students by getting them to imagine they are ants riding a motor bike along the curve; they lean into the curve one way, then straighten up and lean the other way.) Consequently, the only measure of the width of the peak that makes sense is the distance between these points - and this is twice the standard deviation. Hence (I think) the word 'standard'. Regards, Alan Herman Rubin wrote: In article [EMAIL PROTECTED], Neil [EMAIL PROTECTED] wrote: I was wondering what the standard deviation means exactly? I believe the reason it is called the STANDARD deviation is that if the probability distribution is concentrated equally at the two points one standard deviation from the mean, the first two moments agree with that of the original distribution; the deviation from the mean to get this is the standard deviation. -- -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 === This list is open to everyone. Occasionally, less thoughtful people send inappropriate messages. Please DO NOT COMPLAIN TO THE POSTMASTER about these messages because the postmaster has no way of controlling them, and excessive complaints will result in termination of the list. For information about this list, including information about the problem of inappropriate messages and information about how to unsubscribe, please see the web page at http://jse.stat.ncsu.edu/ ===
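The inflexion point claim is easy to verify symbolically; a quick check in Python with sympy (my own, just confirming the statement above that the inflexions sit one standard deviation either side of the mean):

import sympy as sp

x, mu = sp.symbols('x mu', real=True)
sigma = sp.symbols('sigma', positive=True)
pdf = sp.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * sp.sqrt(2 * sp.pi))
second = sp.diff(pdf, x, 2)              # second derivative of the normal curve
print(sp.solve(sp.Eq(second, 0), x))     # [mu - sigma, mu + sigma]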
Re: one-tailed vs. two-tailed tests
I also agree, and have the same concern. To test Ho: d <= 0 against Ha: d > 0 one gives the null the 'maximum benefit of the doubt' (expressed in terms of court cases) and uses the boundary value d = 0 to assess the alternative. This happens to have the mathematical advantage of providing a unique value of the parameter d so that the test can actually be carried out. Regards, Alan "William B. Ware" wrote: On 8 May 2000, Richard M. Barton wrote: ***Technically incorrect? I'm not so sure. I just looked in stat books by 13 authors, to see how null and alternative hypotheses were presented in a one-tailed testing situation. Most of my books are from the social sciences. Results: 1 text presented hypotheses in the form of Ho: d <= 0, Ha: d > 0; 10 presented them in the form of Ho: d = 0, Ha: d > 0; 2 presented both forms. I happen to agree with Don's assertion that this "incomplete" form is technically incorrect. My concern is the implication of Richard's finding as an assessment of the quality of the texts that are being used... WBW __ William B. Ware, Professor and Chair Educational Psychology, CB# 3500 Measurement, and Evaluation University of North Carolina PHONE (919)-962-7848 Chapel Hill, NC 27599-3500 FAX: (919)-962-1533 http://www.unc.edu/~wbware/ EMAIL: [EMAIL PROTECTED] __ === This list is open to everyone. Occasionally, less thoughtful people send inappropriate messages. Please DO NOT COMPLAIN TO THE POSTMASTER about these messages because the postmaster has no way of controlling them, and excessive complaints will result in termination of the list. For information about this list, including information about the problem of inappropriate messages and information about how to unsubscribe, please see the web page at http://jse.stat.ncsu.edu/ === -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 === This list is open to everyone. Occasionally, less thoughtful people send inappropriate messages. Please DO NOT COMPLAIN TO THE POSTMASTER about these messages because the postmaster has no way of controlling them, and excessive complaints will result in termination of the list. For information about this list, including information about the problem of inappropriate messages and information about how to unsubscribe, please see the web page at http://jse.stat.ncsu.edu/ ===
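A small simulation (mine; the sample size, alpha and effect values are arbitrary) showing why the boundary value d = 0 is the value that matters when testing Ho: d <= 0 against Ha: d > 0 - the rejection rate is highest at the boundary and falls away for values further inside the null region:

import numpy as np
from scipy.stats import t

rng = np.random.default_rng(0)
n, alpha, reps = 30, 0.05, 20000
crit = t.ppf(1 - alpha, df=n - 1)          # cutoff computed at the boundary d = 0

for d in (0.0, -0.2, -0.5):                # true values of d inside the null region
    x = rng.normal(loc=d, scale=1.0, size=(reps, n))
    tstat = x.mean(axis=1) / (x.std(axis=1, ddof=1) / np.sqrt(n))
    print(f"true d = {d:5.2f}: rejection rate = {(tstat > crit).mean():.3f}")
# Expect roughly 0.05 at d = 0 and noticeably less for d < 0.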
Re: no correlation assumption among X's in MLR
Hi Don, There are times when I realise the rust that has accumulated, and this is one of them. Changing the order of things a little, you (and DS) are of course quite correct that X variables are typically correlated, and that if they are not the coefficients are the same as if a set of simple regressions are carried out. Coincidentally, I was pointing this out to a class a couple of days ago - but the class is 'not mathematically able', like most these days, so the explanation was not of course at all technical. Rust.. With regard to correlation and collinearity - I have become used to 'explaining' collinearity to my classes in terms only of pairs of explanatory variables, forgetting that the collinearity could involve a set of three or more variables, and this 'pair-wise no collinearity' is, as I understand it, equivalent to 'no linear correlation'. This suggests, incidentally, that 'not collinear' is stronger than 'uncorrelated' (not *linearly* correlated) which doesn't agree with your statement - is this so? It also suggests that 'collinearity' means more than just 'correlated'. A useful way of picturing the situation is that each variable corresponds to an axis, with the angles between the axes determined by the correlation coefficient. (I think, very uncertainly, that the correlation coefficient is the cosine of the angle.) If variables are uncorrelated, the axes are orthogonal; if they are perfectly correlated, the axes are identical. If there is a linear combination between variables, the corresponding dimensions collapse to a 'plane'. (This is all happening in k dimensions.) This corresponds to the matrix X'X having rank less than k (for k variables) so leads (as I understand it) to the collinearity problem. In terms of the data, there is unlikely to be total collapse (just as a sample correlation of exactly zero is highly unlikely) but you might get near collapse. For only two variables highly correlated, the axes are nearly indistinguishable; for three variables you will get a very low hill (this is difficult to describe!). The problem then is to decide whether or not to exclude variables - is the hill high enough to count as three variables, or so low that one variable should be excluded? I think I stand by my original observation, that *in the data* there is always some evidence of collinearity/correlation; if this evidence is strong enough you have to reduce it by reselecting the variables. In your third paragraph you seem to be identifying collinearity with correlation - more precisely, that the problems with collinearity are those of correlation - and to a large extent identifying 'the trouble' that I spoke of. Thanks for helping to chip off some of the rust. I know there is a lot more. Regards, Alan "Donald F. Burrill" wrote: On Tue, 2 May 2000, Alan McLean wrote: 'No collinearity' *means* the X variables are uncorrelated! This is not my understanding. "Uncorrelated" means that the correlation between two variables is zero, or that the intercorrelations among several variables are all zero. "Not collinear" means that there is not a linear dependency lurking among the variables (or some subset of them). "Uncorrelated" is a much stronger condition than "not collinear". The basic OLS method assumes the variables are uncorrelated (as you say). 
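The 'cosine of the angle' statement is in fact correct for mean-centred data vectors; a short numerical check in Python (the data are simulated, purely for illustration):

import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 0.6 * x + rng.normal(size=100)

xc, yc = x - x.mean(), y - y.mean()        # mean-centred data vectors
cos_angle = xc @ yc / (np.linalg.norm(xc) * np.linalg.norm(yc))
print(cos_angle, np.corrcoef(x, y)[0, 1])  # the two numbers agree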
Not as presented in, e.g., Draper and Smith, who go to some trouble to show how one can produce from a set of correlated variables a set of orthogonal (= mutually uncorrelated) variables, and remark on the advantages that accrue if the X-matrix is orthogonal. But it is clear that they expect predictors to be correlated as a general rule. In practice there is usually some correlation, but the estimates are reasonably robust to this. If there is *substantial* collinearity you are in trouble. If there is collinearity _at_all_ you are in trouble; further, if the correlations among some of the predictors are high enough (= close enough to unity), a computing system with finite precision may be unable to detect the difference between a set of variables that are technically not collinear but are highly correlated, and a set of variables that _are_ collinear. (E.g., X and X^4 are not collinear; but if the range of X in the data is, say, 101 to 110, a plot of X^4 vs X will look very much like a straight line.) For this reason various safety features are usually built in to regression programs: variables whose tolerance value with respect to the other predictors is lower than a certain threshold (or whose variance inflation factor -- the reciprocal of tolerance -- is above a corresponding threshold) are usually excluded from an analysis; although it is often possible to override the system defaults if one thinks it necessary. The existence of such defaults is clear evidence that at least the persons responsible for system packages expect highly correlated, near-collinear predictors to turn up in practice.
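The X versus X^4 example is easy to put numbers on; a quick Python sketch (mine) computing the correlation, tolerance and variance inflation factor over the range 101 to 110:

import numpy as np

x = np.arange(101.0, 111.0)      # 101, 102, ..., 110
x4 = x ** 4

r = np.corrcoef(x, x4)[0, 1]
tol = 1.0 - r ** 2               # tolerance for one predictor regressed on the other
vif = 1.0 / tol                  # variance inflation factor
print(f"corr(X, X^4) = {r:.8f}, tolerance = {tol:.2e}, VIF = {vif:.0f}")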
Re: no correlation assumption among X's in MLR
'No collinearity' *means* the X variables are uncorrelated! The basic OLS method assumes the variables are uncorrelated (as you say). In practice there is usually some correlation, but the estimates are reasonably robust to this. If there is *substantial* collinearity you are in trouble. Alan James Eales wrote: Actually Gujarati is just listing his version of the assumptions which guarantee that OLS is BLUE. One of these is that there is no collinearity between the X variables. By this he means that the matrix of independent variables must have full rank, otherwise OLS estimates cannot be calculated. He is not assuming that the Xs are uncorrelated. -- James Eales Dept. of Agricultural Economics Purdue University [EMAIL PROTECTED] 765/494-4212 === This list is open to everyone. Occasionally, less thoughtful people send inappropriate messages. Please DO NOT COMPLAIN TO THE POSTMASTER about these messages because the postmaster has no way of controlling them, and excessive complaints will result in termination of the list. For information about this list, including information about the problem of inappropriate messages and information about how to unsubscribe, please see the web page at http://jse.stat.ncsu.edu/ === -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 === This list is open to everyone. Occasionally, less thoughtful people send inappropriate messages. Please DO NOT COMPLAIN TO THE POSTMASTER about these messages because the postmaster has no way of controlling them, and excessive complaints will result in termination of the list. For information about this list, including information about the problem of inappropriate messages and information about how to unsubscribe, please see the web page at http://jse.stat.ncsu.edu/ ===
Re: hyp testing -Reply
Spot on, Robert. Alan Robert Dawson wrote: Joe Ward wrote: Yes, there occasionally were discussions in our Air Force research whether or not we were working with the POPULATION or a SAMPLE. As Dennis comments: | | the flaw here is that ... she has population data i presume ... or about | as | close as one can come to it ... within the institution ... via the budget | or comptroller's office ... THE salary data are known ... so, whatever | differences are found ... DEMS are it! | One of my Professors used to use the Invertebrate Paleontologists as his example of a POPULATION. I think at that time there were less than 20 people who were Invertebrate Paleontologists. OK. Now, suppose that you knew them all, and noticed that ten of them drove convertibles. You would probably make some generalization about invertebrate paleontologists, consider that this was a genuine phenomenon, and assume that if one more invertebrate paleontologist *did* turn up, it might well be in a convertible. [Maybe convertibles are easier than sedans to get into if you're invertebrate? grin] Suppose there were also exactly two extraterrestrial paleontologists in the world, and one of them drove a convertible. You would be less likely to think in the same way. Now, if you discovered that around 50% of the vertebrate paleontologists in the world drove convertibles, you would consider that you had ironclad proof that something was going on. I suggest that even if these groups are not true random samples (and they are not - more on that later) that the informal inferential process described has much in common with formal statistical inference. And, if it walks like a duck and quacks like a duck, it makes some sense to cook it like a duck. (Similarly, if you were to toss a coin and cover it unseen, and offer a frequentist various odds that it had landed heads, most frequentists would put their cutoff between accepting and rejecting the wager at odds corresponding to a 50% probability, even if they refused to admit that that was the probability that the coin was heads-up.) There are obvious problems with the sampling technique - though probably less than if a convenience sample of (say) the most accessible half the population had been taken. As far as random samples are concerned: it is *very* rare for a true random sample, based on an equal-probability sample of the population to which the inference is intended to extend, to be taken. Say a researcher is studying the behaviour of humans. (S)he may take a random sample from the student subject pool, but not from the human race; and yet the paper published will claim to be about "Artificially Inducing The Gag Reflex in Humans", not "Artificially Inducing The Gag Reflex in Students Enrolled in Psych 1000 at Miskatonic U. (Fall '00)". Even if some future world government were to allow researchers access to a list of all humans alive at some moment to use as a sampling frame, most researchers would not disclaim any applicability of their research to those dead or not yet born. The implicit "Platonic" population larger than that available for study is a problem that is always with us; a bad sample is one in which this causes bias. The situation in which the entire actual population is available for study is an extreme case, of course. -Robert Dawson === This list is open to everyone. Occasionally, less thoughtful people send inappropriate messages. 
Please DO NOT COMPLAIN TO THE POSTMASTER about these messages because the postmaster has no way of controlling them, and excessive complaints will result in termination of the list. For information about this list, including information about the problem of inappropriate messages and information about how to unsubscribe, please see the web page at http://jse.stat.ncsu.edu/ ======= -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 === This list is open to everyone. Occasionally, less thoughtful people send inappropriate messages. Please DO NOT COMPLAIN TO THE POSTMASTER about these messages because the postmaster has no way of controlling them, and excessive complaints will result in termination of the list. For information about this list, including information about the problem of inappropriate messages and information about how to unsubscribe, please see the web page at http://jse.stat.ncsu.edu/ ===
Re: hyp testing -Reply
s can be expected to be about right ... or, we say that samples won't be good ... and if that is true ... forget the notion of using the standard error in some rigid way for making hypothesis tests ... confidence intervals ... and the like when we use such error estimates as: stan error of the mean = S / sqrt n .. does this apply no matter what? there is a daisy chain here ... the hypothesis is about a population ... and, we use the data from our sample to make a decision about that hypothesized parameter ... BUT, if our sample cannot be considered (within some fudge factor) to be representative of the population to which we have made this hypothetical stab ... seems like we need to pack it in let's say that we take a sample by any means ... and, the question we have formulated is that .. in the population ... we will find some 6's ... AND, we happen to find 1 or more 6's ... now, i don't care how you took the sample ... good way or bad way ... we have confirmed our question ... but, what if you don't find any 6's??? i would say in this case ... you are up the creek ... since there is no model we can apply given we know nothing about how the sample was taken ... if we have to assume that samples can be anything ... since we can never EXACTLY get a truly random sample ... then we are in a peck of trouble ... i recall a number of posts that alan made ... arguing rather vehemently about the fact that we need a model for our data ... well, what is the model for our data if we have no control over our sampling ... nor any way to have a crack at estimating the error BASED on that sample information? but now ... in telling robert ... spot on ... in the context that robert is implying that it is ok to go ahead and make these inferences even if our sampling methods are poor ... so, the way i read this ... alan is more or less agreeing with that .. and that does not appear to be a very consistent approach to things ... bottom line: how goes your samples ... that's where your inferences are headed but that is just my read of it -Robert Dawson === This list is open to everyone. Occasionally, less thoughtful people send inappropriate messages. Please DO NOT COMPLAIN TO THE POSTMASTER about these messages because the postmaster has no way of controlling them, and excessive complaints will result in termination of the list. For information about this list, including information about the problem of inappropriate messages and information about how to unsubscribe, please see the web page at http://jse.stat.ncsu.edu/ ======= -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 === This list is open to everyone. Occasionally, less thoughtful people send inappropriate messages. Please DO NOT COMPLAIN TO THE POSTMASTER about these messages because the postmaster has no way of controlling them, and excessive complaints will result in termination of the list. For information about this list, including information about the problem of inappropriate messages and information about how to unsubscribe, please see the web page at http://jse.stat.ncsu.edu/ === == dennis roberts, penn state university educational psychology, 8148632401 http://roberts.ed.psu.edu/users/droberts/droberts.htm === This list is open to everyone. Occasionally, less thoughtful people send inappropriate messages. 
Re: linear model or interactive model?
The model y = b0 + b1 * x1 + b2 * x2 + b3 * x1*x2 is a nonlinear model, just as in engineering. However, it is 'linear in the parameters'. In statistics this is useful, because in estimating the model from a data set, one can define a 'new' variable x3 = x1*x2 and apply, for example, a linear regression algorithm. But in interpreting the results you have to remember that the model is nonlinear! Regards, Alan Wen-Feng Hsiao wrote: Dear Hartig, Thanks for your reply. I am sorry for my poor knowledge in statistics. But I wonder why the definition of 'linearity' of statistics is different from that of engineering mathematics, which defines 'linear' as: Each unknown xj appears to the first power only, and there are no cross product terms xi*xj with i!=j. Wen-Feng In article [EMAIL PROTECTED], [EMAIL PROTECTED] says... Generally, you can include an interaction (or moderator) term in a linear model, like y = b0 + b1 * x1 + b2 * x2 + b3 * x1*x2, and the model still is linear. If you decide not to include x1 and x2, like y = b0 + b1 * x1*x2, you still have a linear model. === This list is open to everyone. Occasionally, less thoughtful people send inappropriate messages. Please DO NOT COMPLAIN TO THE POSTMASTER about these messages because the postmaster has no way of controlling them, and excessive complaints will result in termination of the list. For information about this list, including information about the problem of inappropriate messages and information about how to unsubscribe, please see the web page at http://jse.stat.ncsu.edu/ === -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 === This list is open to everyone. Occasionally, less thoughtful people send inappropriate messages. Please DO NOT COMPLAIN TO THE POSTMASTER about these messages because the postmaster has no way of controlling them, and excessive complaints will result in termination of the list. For information about this list, including information about the problem of inappropriate messages and information about how to unsubscribe, please see the web page at http://jse.stat.ncsu.edu/ ===
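A minimal sketch of the 'new variable' trick in Python (numpy only; the data and coefficient values are invented for illustration) - the product column is created explicitly and passed to an ordinary least squares routine, which works because the model is linear in the coefficients b0 to b3:

import numpy as np

rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + 0.8 * x1 * x2 + rng.normal(scale=0.5, size=n)

x3 = x1 * x2                                    # the 'new' variable
X = np.column_stack([np.ones(n), x1, x2, x3])   # design matrix
b, *rest = np.linalg.lstsq(X, y, rcond=None)    # OLS estimates of b0..b3
print(np.round(b, 2))                           # close to 1.0, 2.0, -1.5, 0.8
# Interpretation: the effect of x1 on y is b1 + b3*x2, so it depends on x2 -
# that is the nonlinearity that has to be remembered.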
Re: normal distribution
Hi Jan, I have always understood that the word 'normal' in this context means 'perpendicular'. You might remember calculus exercises in which you were asked to find 'the equation to the normal to a curve', just after you were asked to find the equation to the tangent. The reason why this name applies is because of the orthogonality properties of the (multi)normal distribution. If you take a simple random sample from a normal distribution, and represent each Xi by a different axis, the axes will be mutually perpendicular. Obviously there is more to it than this, but I can't remember the details. But you should be able to chase it up. Regards, Alan Jan Souman wrote: Does anybody know why the normal distribution is called 'normal'? The most plausible explanations I've encountered so far are: 1. The value of a variable that has a normal distribution is determined by many different factors, each contributing a small part of the total value. Because this is the case with many real life variables, like length and intelligence, the resulting distribution of values is called normal. 2. Many probability distributions are approximated by the normal distribution for large sample sizes. Maybe there are other explanations and maybe someone knows the source of the name? Jan Souman Dpt. of Social Sciences University of Utrecht, Netherlands === This list is open to everyone. Occasionally, less thoughtful people send inappropriate messages. Please DO NOT COMPLAIN TO THE POSTMASTER about these messages because the postmaster has no way of controlling them, and excessive complaints will result in termination of the list. For information about this list, including information about the problem of inappropriate messages and information about how to unsubscribe, please see the web page at http://jse.stat.ncsu.edu/ === -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 === This list is open to everyone. Occasionally, less thoughtful people send inappropriate messages. Please DO NOT COMPLAIN TO THE POSTMASTER about these messages because the postmaster has no way of controlling them, and excessive complaints will result in termination of the list. For information about this list, including information about the problem of inappropriate messages and information about how to unsubscribe, please see the web page at http://jse.stat.ncsu.edu/ ===
Re: Hypothesis testing and magic - episode 2
Hi Michael, This sounds to me like lousy experimental design. Surely the purpose of the experiment is to distinguish between competing theoretical models? Michael Granaas wrote: But in some areas in psychology you will have a situation where many theoretical perspectives predict the same outcome relative to a zero valued null while the zero valued null reflects no theoretical perspective. In this situation rejecting a zero valued null supports all theoretical perspectives equally and differentiates among none of them. and I think that is what you are saying here. I agree that measurement is a problem, but even with good measurement the lack of connection between statistical hypotheses and theoretical predictions is a fatal flaw in too many areas. Regards, Alan -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 === This list is open to everyone. Occasionally, less thoughtful people send inappropriate messages. Please DO NOT COMPLAIN TO THE POSTMASTER about these messages because the postmaster has no way of controlling them, and excessive complaints will result in termination of the list. For information about this list, including information about the problem of inappropriate messages and information about how to unsubscribe, please see the web page at http://jse.stat.ncsu.edu/ ===
Re: Hypothesis testing and magic - episode 2
dennis roberts wrote: but, if we follow this to some logical conclusion ... this could be rephrased as meaning ... situations where you have essentially complete control over variable manipulation = situations where you can establish 'the truth' (in terms of the impacts of these variables on things) ... but, this is precisely what many have been arguing on the list about that hypothesis testing ... statistical significance testing that is ... is in NO position to help you assert 'the truth' ... truth is a metaphysical notion ... not statistical in essence, if 'the truth' is a laudable goal and, for some reason we can 'learn of it' through 'scientific investigation' ... then it is NOT significance testing that leads us to it ... ... rather it is the DESIGN of investigations that is the key ... Truth has nothing to do with it. We construct stories of how the universe operates - we call these stories 'theories' or 'models'. Significance testing is one way in which we choose between stories as to which is (probably) more useful in a specified context. Alan -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 === This list is open to everyone. Occasionally, less thoughtful people send inappropriate messages. Please DO NOT COMPLAIN TO THE POSTMASTER about these messages because the postmaster has no way of controlling them, and excessive complaints will result in termination of the list. For information about this list, including information about the problem of inappropriate messages and information about how to unsubscribe, please see the web page at http://jse.stat.ncsu.edu/ ===
Hypothesis testing and magic - episode 2
Some more comments on hypothesis testing: My impression of the hypothesis test controversy, which seems to exist primarily in the areas of psychology, education and the like (this is coming from someone who has been involved in education for all my working life, but with a scientific/mathematical background), is that it is at least partly a consequence of the sheer difficulty of carrying out quantitative research in those fields. A root of the problem seems to be definitional. I am referring here to the definition of the variables involved. In, say, an agricultural research problem it is usually easy enough to define the variables. For a very simple example, if one is interested in comparing two strains of a crop for yield, it is very easy to define the variable of interest. It is reasonably easy to design an experiment to vary fairly obvious factors and to carry out the experiment. In the soft sciences it is easy enough to identify a characteristic of interest - the problem is how to measure it. If I am interested in the relationship between ability in statistics and ethnic background, for example, I measure the statistics ability using a test of some sort; I measure ethnic background by defining a set of ethnicities. There are literally an infinite number of combinations that I can use - infinitely many different tests, all purporting to measure statistics ability (even if I change only one word in a test, I cannot be absolutely certain of its effect, so it is a different test!), and a very large number of definitions of ethnicity. This is of course not news to anyone reading this. But I am coming to my point. Suppose I carry out an experiment - I apply the test to a group of people of varying ethnicity, score them on the test and analyse the results, including a hypothesis test to decide if statistics ability is related to ethnicity. This test might be a simple ANOVA, or a Kruskal-Wallis or a chi square test, depending on how I score the test. As I said earlier, a hypothesis test only helps the user to decide which of two models is probably better. The point of the above paragraphs is this: the definition of the models being compared includes the definition of the variables used. If I reject the null model (a label I prefer to 'null hypothesis') - that is, I decide that the alternative model is (likely to work) better - I am NOT saying that there is a relationship between statistics ability and ethnicity. All I am saying is that there is a relationship between the two variables I used. Please note that the test is not saying this - I am. The test merely gives me a measure of the strength of the evidence provided by the data (significant at 1%, or a p-value of .0135); this measure is only relevant if the models I have used are appropriate. I can use other evidence (experience is what we usually use! but there may be related tests that help) to decide if the model is appropriate. So there are three levels at which judgement is used to make decisions: deciding what variables are to be used to measure the characteristics of interest, and how any relationship between them relates to the characteristics; deciding on the model to be used, and how to test it; and deciding the conclusion for the model. In each of these there is evidence we use to help us make the decision. The hypothesis test itself provides the test for the third. Finally (at least for the moment) whether we choose the null or alternative model, it IS a decision. 
In research, accepting the null means that we decide to accept it at least for the moment, so it is not necessarily a committed decision. On the other hand, if a line of investigation is not yielding results, the researcher is likely not to continue on that line, so it is a decision which does lead to an action. For non-research applications such as quality control, accepting the null model quite clearly is a decision to act on the basis of that. For example, with a bottle filling machine which is periodically tested as to the mean contents, the null is that the machine is filling the bottles correctly. Rejecting the null entails stopping the machine; accepting it means the machine will not be stopped. Traditional hypothesis testing does incorporate a decision-theoretic loss function - the p-value. Regards again, Alan -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 === This list is open to everyone. Occasionally, less thoughtful people send inappropriate messages. Please DO NOT COMPLAIN TO THE POSTMASTER about these messages because the postmaster has no way of controlling them, and excessive complaints will result in termination of the list. For information about this list, including information about the problem of inappropriate messages and information about how to unsubscribe, please see the web page at http://jse.stat.ncsu.edu/ ===
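A toy version of the bottle-filling example in Python (scipy assumed; the 500 ml target, the sample of 25 bottles and the 5% level are all invented for illustration), showing how the accept/reject decision maps directly onto an action:

import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.default_rng(3)
sample = rng.normal(loc=498.5, scale=3.0, size=25)   # 25 bottles from this hour

result = ttest_1samp(sample, popmean=500.0)
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.4f}")

if result.pvalue < 0.05:
    print("Reject the null model: stop the machine and recalibrate.")
else:
    print("Accept the null model (for now): leave the machine running.")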
Hypothesis testing and magic
I have been reading all the back and forth about hypothesis testing with some degree of fascination. It's a topic of particular interest to me - I presented a paper called 'Hypothesis testing and the Westminster System' at the ISI conference in Helsinki last year. What I find fascinating is the way that hypothesis testing is regarded as a technique for finding out 'truth'. Just wave a magic wand, and truth will appear out of a set of data (and mutter the magic number 0.05 while you are waving it). Hypothesis testing does nothing of the sort - of course. First, hypothesis testing is not restricted to statistics or 'research'. If you are told some piece of news or gossip, you automatically check it out for plausibility against your knowledge and experience. (This is known colloquially as a 'shit filter'.) If you are at a seminar, you listen to the presenter in the same way. If what you hear is consistent with your knowledge and experience you accept that it is probably true. If it is very consistent, you may accept that it IS true. If it is not consistent, you will question it, and conclude that it is probably not true. IF the news is something that requires some action on your part, you will act according to your assessment of the information. If the news is important to you, and you cannot decide which way to go on prior knowledge, you will presumably go and get corroborative information, hopefully in some sense objective information. This describes hypothesis testing almost exactly; the difference is a matter of formalism. Next - a statistical hypothesis test compares two probability models of 'reality'. If you are interested in the possible difference between two populations on some numeric variable - for example, between heights of men and heights of women in some population group - and you choose to express the difference in terms of means, you are comparing a model which says: height of a randomly chosen individual = overall mean + random fluctuation, with one which says: height of a randomly chosen individual = overall mean + factor due to sex + random fluctuation. You then make assumptions about the 'random fluctuations'. Note that one of these models is embedded within the other - the first model is a particular case of the second. It is only in this situation that standard hypothesis testing is applicable. Neither of these models is 'true' - but either or both may be good descriptions of the two populations. Good in the sense that if you do start to randomly select individuals, the results agree acceptably well with what the model predicts. The role of hypothesis testing is to help you decide which of these is (PROBABLY) the better model - or if neither is. In standard hypothesis testing, one of these models is 'privileged' in that it is assumed 'true' - that is, if neither model is better, then you will use the privileged model. In most cases, this means the SIMPLER model. More accurately - if you decide that the models are equally good (or bad) you are saying that you cannot distinguish between them on the basis of the information and the statistical technique used! To decide between them you will need either to use a different technique, or more realistically, some other criterion. For example, in a court case, if you cannot decide between the models 'Guilty' and 'Innocent', you may always choose 'Innocent'. There is no reason why one model is thus privileged. 
In my paper I stressed my belief that this approach reflects our (and Fisher's) cultural heritage rather than any need for it to be that way. One can for example express the choice as between the embedded model and the model suggested by the data. For a test on the difference between two means, this considers the models mu(diff) = 0 and mu(diff) = xbar. The interesting thing is that this is what we actually do! - although it is dressed up in the language and technique of the general model mu(diff) not= 0. (This dressing up is a lot of the reason why students have trouble with hypothesis testing.) To conclude: hypothesis testing is NECESSARY. We do it all the time. Assessment of effect sizes is also necessary, but the two should not be confused. Regards, Alan -- Alan McLean ([EMAIL PROTECTED]) Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007 === This list is open to everyone. Occasionally, less thoughtful people send inappropriate messages. Please DO NOT COMPLAIN TO THE POSTMASTER about these messages because the postmaster has no way of controlling them, and excessive complaints will result in termination of the list. For information about this list, including information about the problem of inappropriate messages and information about how to unsubscribe, please see the web page at http://jse.stat.ncsu.edu/ ===
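To make the 'embedded models' point concrete for the heights example, here is a small Python sketch (mine; the heights are simulated, not real data) fitting both stories by least squares and showing that the usual two-sample t test is exactly the comparison of the two fits:

import numpy as np
from scipy.stats import ttest_ind, f

rng = np.random.default_rng(4)
men = rng.normal(loc=177.0, scale=7.0, size=60)
women = rng.normal(loc=164.0, scale=6.5, size=60)
y = np.concatenate([men, women])

# Residual sums of squares for the two stories.
rss_null = ((y - y.mean()) ** 2).sum()                       # overall mean only
rss_alt = ((men - men.mean()) ** 2).sum() + ((women - women.mean()) ** 2).sum()
F = (rss_null - rss_alt) / (rss_alt / (len(y) - 2))
p_from_F = f.sf(F, 1, len(y) - 2)

t_result = ttest_ind(men, women)                             # equal-variance t test
print(F, t_result.statistic ** 2)                            # F equals t squared
print(p_from_F, t_result.pvalue)                             # identical p-values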
Re: teaching statistical methods by rules?
[EMAIL PROTECTED] wrote: In article [EMAIL PROTECTED], [EMAIL PROTECTED] says... snip On the other hand, a body of knowledge can be thought of as a set of 'rules'. The important thing is that this set is constructed by the individual, so our aim should not be to teach statistics as a set of rules, but in such a way that each student can develop his or her own set of rules. They won't be the same for all, and they will be different from the teacher's, but they hopefully will work. (If you like, this is a definition of a 'good student' - one who manages to construct a successful set of rules for each subject.) Either undergraduate students in Australia are much smarter than those living in the United States, or you live on a different planet. The last time I taught an undergraduate introductory statistics class, some students couldn't even do fractions and simple algebra. Can you expect them to develop their own rules? My comment above has nothing to do with students' 'smartness' or with their level of skill (two different things!). It is simply a way of describing what learning is. Why are people so obsessed with T and Z? When the degrees of freedom exceed, say, 30, the difference between T and Z is practically negligible. You can use T or Z in such a case. However, the P-value from Z is easier to compute. Your interpretation of 'practically negligible' is different from mine, that's all. And with a computer, the p-value for t is exactly as easy to compute as the p-value for z. Regards, Alan -- Tjen-Sien Lim [EMAIL PROTECTED] www.Recursive-Partitioning.com Get your free Web-based email! http://recursive-partitioning.zzn.com -- Alan McLean ([EMAIL PROTECTED]) Acting Deputy Head, Department of Econometrics and Business Statistics Monash University, Caulfield Campus, Melbourne Tel: +61 03 9903 2102Fax: +61 03 9903 2007
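For what the disagreement amounts to in numbers, a short Python check (scipy assumed) of t against z around 30 degrees of freedom:

from scipy.stats import norm, t

for df in (10, 30, 100):
    print(df, round(norm.ppf(0.975), 4), round(t.ppf(0.975, df), 4))
# At 30 degrees of freedom the two-sided 5% cutoffs are about 1.96 (z) and 2.04 (t);
# for an observed statistic of 2.0 the p-values are about 0.046 (z) and 0.055 (t).
print(2 * norm.sf(2.0), 2 * t.sf(2.0, 30))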