Re: [UAI] Has anyone else noticed how odd many frequentist techniques are?
Hi Rich,

If you are looking for a forum where these issues are frequently discussed, I recommend Andrew Gelman's blog: http://andrewgelman.com

If you are looking for formal sources, there are the references cited in Kevin's attachment (in addition to his book, of course). In particular, if you are aiming to write something on the topic, I recommend perusing the book by Jaynes (and his papers more generally).

Regards,
Konrad

On Sat, Sep 27, 2014 at 12:44 PM, Richard E Neapolitan richard.neapoli...@northwestern.edu wrote:

Thanks, Kevin. Well, I guess they are not too well known. I asked my mentor on Bayesian stats, Sandy Zabell (a prominent Bayesian statistician), about it. Although he agreed with me, he did not really have references stating how pathological these frequentist techniques are. I will tell Sandy about your book. He still teaches stats at NU.
Best,
Rich

On 9/27/2014 1:08 PM, Kevin Murphy wrote:

Yes, these problems are very well known. I am attaching a brief summary (from my textbook http://www.cs.ubc.ca/%7Emurphyk/MLbook/index.html) of some of the most famous pathologies of frequentist statistics (cited references can be found in the bibliography here: http://www.cs.ubc.ca/%7Emurphyk/MLbook/pml-bib.pdf). There are several more pathologies, but I didn't want to go overboard :)
Kevin

PS. A very nice practical book for teaching undergrad stats from a Bayesian POV is this:

@book{Kruschke10,
  title     = {{Doing Bayesian Data Analysis: A Tutorial Introduction with R and BUGS}},
  author    = {J. Kruschke},
  year      = {2010},
  publisher = {Academic Press}
}

On Fri, Sep 26, 2014 at 1:59 PM, Richard E Neapolitan richard.neapoli...@northwestern.edu wrote:

Dear Colleagues,

Since I converted to Bayesian statistics in the late 1980s, I have not looked at most frequentist methods. However, every time I look at them again, I notice how apparently preposterous many of them are.

First there was the Bonferroni correction, which makes me update my belief about the results of an experiment based on how many other experiments I happen to conduct with it (and which, of course, implicitly assigns a low prior probability). One researcher even told me once that he has students conduct fewer experiments so that a finding has a better chance of being significant. I just walked away scratching my head.

Now, in the process of designing a small test for a student, I noticed that two-tailed hypothesis testing is completely unreasonable. Along with the one-tailed test, it gives me decision rules which enable me to reject the hypothesis that the mean is less than or equal to 0, but not reject the hypothesis that it equals 0. The explanation is wrapped up in a story about the question asked and the long-run behavior of other similar experiments that are not even run. So two people can walk away from the same experiment with different updated beliefs about whether the mean is 0, based not on their prior beliefs, but on the question they happened to ask.

In general, hypothesis testing does not seem to be the way to go. We should simply compute confidence intervals or posterior probability intervals. The Bayesian's world is so much simpler. She updates her belief solely on her prior beliefs and the data. There is no story that leads to strange results.

All this matters, especially in medical applications, because so many studies are deemed significant or not significant based on the enigmatic p-value and the Bonferroni correction. I like to say that in medicine, for every study there is an equal and opposite study.
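To see the two-tailed oddity in numbers, consider a z-test at the conventional 0.05 level. This is a minimal sketch; the statistic z = 1.8 and the 0.05 level are illustrative assumptions, not values from the post:

from math import erf, sqrt

def phi(z):
    # Standard normal CDF.
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

z = 1.8        # observed test statistic (hypothetical)
alpha = 0.05   # conventional significance level (assumed)

p_one = 1.0 - phi(z)          # one-tailed: H0: mu <= 0 vs H1: mu > 0
p_two = 2.0 * (1.0 - phi(z))  # two-tailed: H0: mu = 0  vs H1: mu != 0

print(f"one-tailed p = {p_one:.4f} -> reject mu <= 0: {p_one < alpha}")  # 0.0359 -> True
print(f"two-tailed p = {p_two:.4f} -> reject mu = 0:  {p_two < alpha}")  # 0.0719 -> False

The same data let one reject the composite hypothesis mu <= 0 while failing to reject the point hypothesis mu = 0 that it contains, which is exactly the asymmetry described above.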
I am writing this because I wonder who else has noticed these oddities. I never read about them; I simply observed them independently. I find it curious that they have persisted for so long and that more is not said about them.
Best,
Rich

--
Richard E. Neapolitan, Ph.D., Professor
Division of Health and Biomedical Informatics
Department of Preventive Medicine
Northwestern University Feinberg School of Medicine
750 N. Lake Shore Drive, 11th Floor
Chicago, IL 60611

--
Konrad Scheffler
University of California, San Diego
http://id.ucsd.edu/faculty/KonradSchefflerPhD.shtml
Re: [UAI] A perplexing problem - Version 2
On Mon, 23 Feb 2009, Francisco Javier Diez wrote:
> Konrad Scheffler wrote:
>> I agree this is problematic - the notion of calibration (i.e. that you can say P(S|70%) = .7) does not really make sense in the subjective Bayesian framework where different individuals are working with different priors, because different individuals will have different posteriors and they can't all be equal to 0.7.
>
> I apologize if I have missed your point, but I think it does make sense. If different people have different posteriors, it means that some people will agree that the TWC reports are calibrated, while others will disagree.

I think this is another way of saying the same thing - if you define the concept of calibration such that people will, depending on their priors, disagree over whether the reports are calibrated, then it is still problematic to prescribe calibration in the problem formulation, because it will mean different things to different people. Unless you take "TWC is calibrated" to mean "everyone has the same prior as TWC", which I don't think was the intention in the original question.

In my opinion the source of confusion here is the use of a subjective Bayesian framework (i.e. one where the prior is not explicitly stated and is assumed to be different for different people). If instead we use an objective Bayesian framework, where all priors are stated explicitly, the difficulties disappear.

> Who is right? In the case of unrepeatable events, this question would not make sense, because it is not possible to determine the true probability, and therefore whether a person or a model is calibrated or not is a subjective opinion (of an external observer). However, in the case of repeatable events - and I acknowledge that repeatability is a fuzzy concept - it does make sense to speak of an objective probability, which can be identified with the relative frequency. Subjective probabilities that agree with the objective probability (frequency) can be said to be correct, and models that give the correct probability for each scenario will be considered calibrated. If we accept that snow is a repeatable event, then all individuals should agree on the same probability. If it is not, P(S|70%) may be different for each individual, because they have different priors and perhaps different likelihoods, or even different structures in their models.

I strongly disagree with this. The (true) relative frequency is not the same thing as the correct posterior. One can imagine a situation where the correct posterior (calculated from the available information) is very far from the relative frequency one would obtain given the opportunity to perform exhaustive experiments. Probabilities (in any variant of the Bayesian framework) do not describe reality directly; they describe what we know about reality (typically in the absence of complete information).

> Coming back to the main problem, I agree again with Peter Szolovits in making the distinction between likelihood and posterior probability.
>
> a) If I take the TWC forecast as the posterior probability returned by a calibrated model (the TWC's model), then I accept that the probability of snow is 70%.
>
> b) However, if I take "70% probability of snow" as a finding to be introduced in my model, then I should combine my prior with the likelihood ratio associated with this finding, and after some computation I will arrive at P(S|70%) = 0.70. [Otherwise, I would be incoherent with my assumption that the model used by the TWC is calibrated.]
> Of course, if I think that the TWC's model is calibrated, I do not need to build a model of TWC's reports that will return as an output the same probability estimate that I introduce as an input. Therefore I see no contradiction in the Bayesian framework.

But this argument only considers the case where your prior is identical to TWC's prior. If your prior were _different_ from theirs (the more interesting case), then you would not agree that they are calibrated.
Re: [UAI] A perplexing problem - Version 2
I agree this is problematic - the notion of calibration (i.e. that you can say P(S|70%) = .7) does not really make sense in the subjective Bayesian framework where different individuals are working with different priors, because different individuals will have different posteriors and they can't all be equal to 0.7.

Instead, you need a notion of calibration with respect to a particular prior. Hopefully the TWC forecasts are calibrated with respect to their own prior (otherwise they are reporting something other than what they believe). In this case your subjective posterior P(S|70%) will only be equal to .7 if your prior happens to be identical to theirs.

Hope this helps,
Konrad

> Consider the following revised version.
>
> The TWC problem:
> 1. Question: What is the chance that it will snow next Monday?
> 2. My subjective prior: 5%.
> 3. Evidence: The Weather Channel (TWC) says there is a 70% chance of snow on Monday.
> 4. TWC forecasts of snow are calibrated.
>
> Notice that I did not justify my subjective prior with a base rate. From P(S) = .05 and P(S|70%) = .7 I can deduce that P(70%|S)/P(70%|~S) = 44.33. So now I can deduce from my prior and the evidence odds that P(S|70%) = .7. But this seems silly. Suppose my subjective prior was 20%. Then P(70%|S)/P(70%|~S) = 9.33, and again I can deduce P(S|70%) = .7.
>
> My latest quandary is that it seems odd that my subjective conditional probability of the evidence should depend on my subjective prior. This may be coherent, but it is too counterintuitive for me to easily accept. It would also suggest that when receiving a single evidence item in the form of a judgment from a calibrated source, my posterior belief does not depend on my prior belief. In effect, when forecasting snow, one should ignore priors and listen to The Weather Channel. Is this correct? If so, does this bother anyone else?
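The back-calculation in the quoted message is easy to check numerically. A minimal sketch (the function name is ours): by Bayes' rule in odds form, insisting on a calibrated posterior of 0.7 forces the likelihood ratio of the evidence to track the prior, reproducing the 44.33 and 9.33 quoted above.

def implied_likelihood_ratio(prior, posterior=0.7):
    # LR = posterior odds / prior odds, by Bayes' rule in odds form.
    prior_odds = prior / (1.0 - prior)
    posterior_odds = posterior / (1.0 - posterior)
    return posterior_odds / prior_odds

for prior in (0.05, 0.20):
    print(f"prior = {prior:.2f} -> implied P(70%|S)/P(70%|~S) = "
          f"{implied_likelihood_ratio(prior):.2f}")
# prior = 0.05 -> implied P(70%|S)/P(70%|~S) = 44.33
# prior = 0.20 -> implied P(70%|S)/P(70%|~S) = 9.33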
Re: [UAI] A perplexing problem
Hi Paul,

Your calculation is correct, but the numbers in the example are odd. If TWC really only manages to predict snow 10% of the time (a 90% false negative rate), you would be right not to assign much value to their predictions (you do assign _some_, hence the seven-fold increase from your prior to your posterior, but with prediction performance like that TWC cannot possibly think there is really a 70% chance of snow). Change the 10% true positives to 90%, and your posterior goes up to 82.6% - much more believable.

Also, it's important not to think that the figure of 70% has any bearing on the problem. I appreciate that you put it in as a red herring to challenge the students, but be aware that it may also lead to confusion.

Konrad

On Fri, 13 Feb 2009, Lehner, Paul E. wrote:

> I was working on a set of instructions to teach simple two-hypothesis/one-evidence Bayesian updating. I came across a problem that perplexed me. This can't be a new problem, so I'm hoping someone will clear things up for me.
>
> The problem:
> 1. Question: What is the chance that it will snow next Monday?
> 2. My prior: 5% (because it typically snows about 5% of the days during the winter).
> 3. Evidence: The Weather Channel (TWC) says there is a 70% chance of snow on Monday.
> 4. TWC forecasts of snow are calibrated.
>
> My initial answer is to claim that this problem is underspecified. So I add:
>
> 5. On winter days that it snows, TWC forecasts a 70% chance of snow about 10% of the time.
> 6. On winter days that it does not snow, TWC forecasts a 70% chance of snow about 1% of the time.
>
> So now, from P(S) = .05, P(70%|S) = .10, and P(70%|~S) = .01, I apply Bayes' rule and deduce my posterior probability to be P(S|70%) = .3448. Now it seems particularly odd that I would conclude there is only a 34% chance of snow when TWC says there is a 70% chance. TWC knows so much more about weather forecasting than I do. What am I doing wrong?
>
> Paul E. Lehner, Ph.D.
> Consulting Scientist
> The MITRE Corporation
> (703) 983-7968
> pleh...@mitre.org
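For completeness, here is the two-hypothesis update with the numbers quoted in this thread, including the suggested change of the 10% true-positive rate to 90%; a minimal sketch:

def posterior(prior, p_e_given_s, p_e_given_not_s):
    # Two-hypothesis Bayes rule: P(S | evidence).
    num = p_e_given_s * prior
    return num / (num + p_e_given_not_s * (1.0 - prior))

print(round(posterior(0.05, 0.10, 0.01), 4))  # 0.3448 - Paul's numbers
print(round(posterior(0.05, 0.90, 0.01), 4))  # 0.8257 - Konrad's 90% variant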
Re: [UAI] Computation with Imprecise Probabilities--The problem of Vera's age
Dear Prof Zadeh,

Perhaps you could elucidate what you mean by "cointensive"? (I assume this is explained in detail in your paper, but I also assume that one purpose of your post here is to convince people that it will be worth investing the time to read the paper.)

Also, what do you understand under "probability"? Your distinction between elasticity of meaning and probability of meaning sounds very similar to the distinction between the Bayesian and frequentist interpretations of probability (as I understand elasticity of meaning, the former encapsulates it while the latter does not - perhaps you can convince me otherwise).

Regards,
Konrad

Dr Konrad Scheffler
Computer Science Division
Dept of Mathematical Sciences
University of Stellenbosch
+27-21-808-4306
http://www.cs.sun.ac.za/~kscheffler/

On Mon, 21 Jul 2008, Lotfi A. Zadeh wrote:

Dear Dr. Mitola:

Thank you for your constructive comment and for bringing the works of George Lakoff, Johnson and Rhor, Jackendoff and Tom Ziemke to the attention of the UAI community. I am very familiar with the work of George Lakoff, my good friend, and am familiar with the work of Jackendoff.

The issue that you raise - context-dependence of meaning - is of basic importance. In natural languages, meaning is for the most part context-dependent. In synthetic languages, meaning is for the most part context-free. Context-dependence serves an important purpose, namely, reduction in the number of words in the vocabulary. Note that such words as small, near, tall and young are even more context-dependent than the words and phrases cited in your comment.

In the examples given in my message, the information set, I, and the question, q, are described in a natural language. To come up with an answer to the question, it is necessary to precisiate the meaning of the propositions in I. To illustrate, in the problem of Vera's age, it is necessary to precisiate the meaning of "mother's age at birth of a child is usually between approximately twenty and approximately forty." Precisiation should be cointensive in the sense that the meaning of the result of precisiation should be close to the meaning of the object of precisiation (Zadeh 2008, http://dx.doi.org/10.1016/j.ins.2008.02.012). The issue of cointensive precisiation is addressed neither in the literature of cognitive linguistics nor in the literature of computational linguistics. What is needed for this purpose is a fuzzy logic-based approach to precisiation of meaning (Zadeh 2004, http://www.aaai.org/ojs/index.php/aimagazine/article/view/1778/1676).

In Precisiated Natural Language (PNL) it is the elasticity of meaning, rather than the probability of meaning, that plays a pivotal role. What this means is that the meaning of words can be stretched, with context governing elasticity. It is this concept that is needed to deal with context-dependence and, more particularly, with computation with imprecise probabilities, e.g., likely and usually, which are described in a natural language.

In computation with imprecise probabilities, the first step involves precisiation of the information set, I. Precisiation of I can be carried out in various ways, leading to various models of I. A model, M, of I is associated with two metrics: (a) cointension; and (b) computational complexity. In general, the higher the cointension, the higher the computational complexity. A good model of I involves a compromise. In the problem of Vera's age, I consists of three propositions:
p_1: Vera has a daughter in the mid-thirties;
p_2: Vera has a son in the mid-twenties; and
p_3 (world knowledge): mother's age at the birth of her child is usually between approximately 20 and approximately 40.

The simplest and least cointensive model, M_1, is one in which mid-thirties is precisiated as 35; mid-twenties is precisiated as 25; approximately 20 is precisiated as 20; approximately 40 is precisiated as 40; and usually is precisiated as always. In this model, p_1 precisiates as: Vera has a 35-year-old daughter; p_2 precisiates as: Vera has a 25-year-old son; and p_3 precisiates as: mother's age at the birth of her child varies from 20 to 40.

Precisiated p_1 constrains the age of Vera to the interval [55, 75]; precisiated p_2, taken on its own, constrains it to the interval [45, 65]. Conjunction (fusion) of the two constraints leads to the answer: Vera's age lies in the interval [55, 65]. Note that the lower bound is determined by the lower bound in p_1, while the upper bound is determined by the upper bound in p_2.

A higher level of cointension may be achieved by moving from M_1 to M_2. In M_2, terms such as mid-twenties and mid-thirties are precisiated as intervals, e.g. mid-twenties is precisiated as [24, 26], with usually still precisiated as always. Elementary interval ...
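The archived message breaks off at this point. A minimal sketch of the M_1 computation follows, extended to M_2 under the natural analogous precisiation of mid-thirties as [34, 36]; that interval and the resulting M_2 answer are our own arithmetic, not stated in the original post:

def add_intervals(a, b):
    # [a0, a1] + [b0, b1] = [a0 + b0, a1 + b1]
    return (a[0] + b[0], a[1] + b[1])

def intersect(a, b):
    # Conjunction (fusion) of two interval constraints.
    return (max(a[0], b[0]), min(a[1], b[1]))

birth_age = (20, 40)  # "usually between approximately 20 and approximately 40"

# Model M_1: point estimates for the children's ages.
m1 = intersect(add_intervals((35, 35), birth_age),   # via the daughter: [55, 75]
               add_intervals((25, 25), birth_age))   # via the son:      [45, 65]
print(m1)  # (55, 65)

# Model M_2: the children's ages as intervals (mid-thirties as [34, 36] is assumed).
m2 = intersect(add_intervals((34, 36), birth_age),   # via the daughter: [54, 76]
               add_intervals((24, 26), birth_age))   # via the son:      [44, 66]
print(m2)  # (54, 66)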
[UAI] Studentships available in evolutionary modelling at Stellenbosch University
The National Bioinformatics Network (NBN) of South Africa has awarded funds for a project in evolutionary modelling at the Computer Science Division, Department of Mathematical Sciences, Stellenbosch University, South Africa. The project, which has ties with researchers at the University of Cape Town, the University of California San Diego, and Stanford University, focusses on using ideas from machine learning and probabilistic modelling to model the evolution of recombining protein-coding sequences, with application to studying the evolution of HIV.

We are seeking postgraduate students to fill one PhD and one MSc position (available immediately); more positions are likely to become available for students wanting to start in January 2009. The ideal candidate will have a strong background in a mathematical science (e.g. Computer Science, Applied Mathematics, Engineering, Bioinformatics, Mathematics, Statistics; knowledge of machine learning/probabilistic modelling is a plus); however, applicants with a background in other subjects (e.g. Genetics, Biochemistry, Molecular Biology) will also be considered, provided they have strong computer programming skills. Successful candidates will be expected to complete coursework in a variety of bioinformatics topics offered by the NBN. Remuneration is according to the highly competitive NBN scales.

With roots going back to 1866, Stellenbosch University is one of the oldest universities in South Africa. It is a medium-sized comprehensive public university, situated in a classic university town and surrounded by the magnificent mountain scenery of the Jonkershoek Valley. Stellenbosch University is proud of being recognized as a research-driven university, with over 33 percent of its enrolments at the postgraduate level and the highest publication productivity in the country.

If you meet the requirements above and would like to get involved in an exciting and challenging research project, with the potential to have an impact on the important South African HIV research arena, please send a complete academic CV (which should include information on your most advanced computer programming project to date) and a covering letter to Dr Konrad Scheffler ([EMAIL PROTECTED]). Alternatively, please get in touch by e-mail or phone (021 808 4306) to request more information about the project.

Dr Konrad Scheffler
Computer Science Division
Dept of Mathematical Sciences
University of Stellenbosch
+27-21-808-4306
http://www.cs.sun.ac.za/~kscheffler/
Re: [UAI] A test problem involving imprecise probabilities
Hmm, no takers on this one yet? I'll rephrase the problem in a way that makes more sense to me (since the original contains words I don't know the meaning of):

X and Y are unknown variables taking values in the set {1, 2, ..., n}. The entries in the joint probability matrix, P, are unknown and of the form a_ij, where the a_ij take values in the unit interval and add up to unity. What is the marginal probability distribution of X in each of the following cases?

a) For each a_ij we are given a fixed interval, with the distribution of a_ij being uniform inside and zero outside this interval. (I assume the intention here was that the width of the interval is known?)

b) (I'll leave the translation of this one to a fuzzy specialist.)

Stated this way, case (a) is unproblematic. It's worth adding that using a finite interval with zero probability density outside the interval will often be a bad thing to do in practical problems, and is not recommended if you have any say in the problem design. Instead, a distribution that is nonzero throughout the unit interval will avoid nonsensical results in cases where the correct value of a_ij is outside the interval. Replacing the hard distribution in (a) with a soft distribution that is small but nonzero outside the given interval is easily done, and the solution remains unproblematic.

Konrad

On Thu, 22 Sep 2005, Lotfi Zadeh wrote:

> X and Y are random variables taking values in the set {1, 2, ..., n}. The entries in the joint probability matrix, P, are of the form "approximately a_ij", where the a_ij take values in the unit interval and add up to unity. What is the marginal probability distribution of X? Two special cases: (a) "approximately a_ij" is interpreted as an interval centering on a_ij; and (b) "approximately a_ij" is interpreted as a fuzzy triangular number centering on a_ij.
>
> Warm regards to all,
> Lotfi
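Case (a) can be illustrated with a small Monte Carlo sketch. The 2x2 intervals below are invented for illustration, and renormalizing each sampled matrix is used as a simple stand-in for conditioning on the entries summing to exactly one - an approximation, not part of the original problem statement:

import random

# Interval bounds for each a_ij (values invented for illustration).
lo = [[0.10, 0.20], [0.20, 0.30]]
hi = [[0.20, 0.30], [0.30, 0.40]]

def sample_marginal():
    # Draw each a_ij uniformly from its interval, renormalize so the matrix
    # sums to one, and return the marginal of X (the row sums).
    a = [[random.uniform(lo[i][j], hi[i][j]) for j in range(2)] for i in range(2)]
    total = sum(sum(row) for row in a)
    a = [[x / total for x in row] for row in a]
    return [sum(row) for row in a]

samples = [sample_marginal() for _ in range(100000)]
p_x1 = [s[0] for s in samples]
print(f"P(X=1): mean {sum(p_x1)/len(p_x1):.3f}, "
      f"range [{min(p_x1):.3f}, {max(p_x1):.3f}]")

The marginal of X is thus itself a distribution over probabilities, induced by the uncertainty in the a_ij, rather than a single number.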
Re: [UAI] Is it luck or is it skill - my resolution
Hi Rich,

In your analysis you present a frequentist and a Bayesian approach, arguing that the paradox exists only for the frequentist case. Fair enough. I would just like to point out that the frequentist approach (orthodox hypothesis testing) is even more problematic than that, in that it effectively makes assumptions it claims not to.

In the frequentist exposition, you state: "I have no idea whether my population includes clairvoyants (or at least I do not want to impose my prior beliefs)." You then give us an example of a circumstance under which you would reject the null hypothesis. However, from your example we can calculate bounds on your prior belief that a randomly chosen individual is clairvoyant:

P(clairvoyant) < 0.5. (To explain your default belief in the null hypothesis.)

P(clairvoyant) > 1/10001 (approximately .0001). (Otherwise it would be irrational to reject the null hypothesis on observing success - the alternative would still be less likely.)

If you are willing to use the commonly used p-value threshold of 0.01, we get a stronger bound: P(clairvoyant) > 1/101.

Here I am assuming that you are willing to believe a hypothesis whenever it has probability > 0.5; if instead you prefer to build in a grey area where you do not accept any beliefs, the bounds on your prior again become more stringent. So despite the explicit denial, this method does impose your prior beliefs.

regards,
Konrad

On Tue, 28 Jun 2005, Rich Neapolitan wrote:

> I thank all those who responded to my query and discussed the matter with me. Here is my resolution. First, I'll re-describe the problem using some numbers and terminology provided by Francisco Javier Diez.
>
> Suppose there is some task such that P(success) = .0001 if someone is not clairvoyant and P(success) = 1 if someone is clairvoyant. I have no idea whether my population includes clairvoyants (or at least I do not want to impose my prior beliefs). Mike claims he is one. My null hypothesis is that he is not. When he succeeds, a very unlikely event (.0001) has occurred if the null hypothesis is true. So I reject that hypothesis and believe Mike probably is clairvoyant.
>
> Next I have 10,000 people making claims that they are clairvoyants. My null hypothesis is that none are. If the null hypothesis is true, the probability of at least one succeeding is 1 - (.9999)^10,000 = .63. So if Mike alone succeeds I have no reason to reject the null hypothesis. I would need quite a few people succeeding to reject it. So I have little reason for believing Mike or anyone else in the group is clairvoyant. There is no way out of this if we insist on obtaining our beliefs from hypothesis testing.
>
> However, if, as I. J. Good said, we don't sweep our prior beliefs under the carpet, we can solve the problem using Bayes' theorem. Suppose we believe that there is a .01 probability that some individual (say Mike) is clairvoyant. Then
>
> P(clairvoyant|success)
>   = P(success|clairvoyant)P(clairvoyant) / [P(success|clairvoyant)P(clairvoyant) + P(success|not clairvoyant)P(not clairvoyant)]
>   = 1 x .01 / [1 x .01 + .0001 x .99] = .99.
>
> So when Mike succeeds, we believe he is probably clairvoyant regardless of how many other people attempt the task or succeed.
>
> In applications to situations like Buffett predicting stock performance, I think with a little analysis we can formulate reasonable priors, etc., and analyze the problem this second way. In applications like coin tossing we can also assign extremely small priors to someone having special ability.
> Actually, out of a large group I could see how someone might have some talent for forcing heads. So I really mean a random experiment in which we control for all known tricks. There still could be some very small probability that someone has psychic ability.
>
> Rich
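Both calculations in this thread are easy to reproduce. A minimal sketch (the function name is ours) of Rich's Bayesian resolution and of Konrad's break-even priors:

def posterior_clairvoyant(prior, p_success_given_not=1e-4):
    # P(clairvoyant | success), with P(success | clairvoyant) = 1.
    return prior / (prior + p_success_given_not * (1.0 - prior))

print(round(posterior_clairvoyant(0.01), 4))           # 0.9902 - Rich's ~.99
print(round(posterior_clairvoyant(1 / 10001), 4))      # 0.5 - Konrad's break-even prior
print(round(posterior_clairvoyant(1 / 101, 0.01), 4))  # 0.5 - break-even at the .01 threshold

At a prior of exactly 1/10001 the posterior is 0.5, so any smaller prior makes rejecting the null irrational, which is the lower bound Konrad derives.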