Dear Prof. Zadeh, Thanks for your response. [I have added your attachment (Fuzzy solution) after the quote of your message below.]
It seems to me that you are not explaining sufficiently clearly why Bayesian methods fail on this problem. You cite as a deficiency in the Maximum Entropy principle that one cannot be sure about the answer one obtains by it, but I don't see where you have made the point that Fuzzy methods allow one to be sure. You say that the Fuzzy solution gives you a fuzzy answer while the Bayesian/MaxEnt solution gives you a crisp answer. But in fact a Bayesian would be only too happy to give you a posterior distribution of height as the answer. The question was "What is the height ... ?", so I gave a height as the answer. If you asked me to give you all the information I have about the height of Swedes that might assist you in making some decision, then I would give you a posterior distribution. I don't see how assumptions made on Fuzzy variables and a Fuzzy answer are superior to the Bayesian prior assumptions and posterior distribution answer.

Actually part of the point of my post was that distributions and fuzzy answers are never the _real_ answer. They are something that you, or nature, might use in determining a real answer. But ultimately, something non-probabilistic and non-fuzzy must happen, even in quantum mechanics. So it's perfectly natural that one would be required to give a specific answer to the question "What is the average height ... ?" since ultimately one must make a commitment. It's certainly possible for the question to be "What is the distribution of the height ... ?"; for example, if one is manufacturing clothing in Sweden, one would want to know the distribution so that one could make appropriate numbers of the different sizes. But here too one still has ultimately to make a choice of exactly how many of each size to make.

Regarding the problem versions you give, I think the Bayes/MaxEnt method can readily deal with them and provide a posterior distribution of the height.
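
A minimal sketch of such a MaxEnt computation for Version 2 might look as follows. It is only an illustration under stated assumptions: the fuzzy quantities are read crisply (70* as 0.70, 170* as 170 cm), heights are discretized into 1 cm bins, and the function names are hypothetical. With a single constraint on the mass above a threshold, the maximum-entropy distribution is uniform within each of the two regions.

```python
def maxent_heights(lo=140.0, hi=220.0, threshold=170.0,
                   min_mass_above=0.70, step=1.0):
    """Discretized maximum-entropy distribution on [lo, hi] subject to
    P(height > threshold) >= min_mass_above.

    Assumes lo < threshold < hi. Returns (bin midpoints, probabilities).
    """
    n_bins = round((hi - lo) / step)
    mids = [lo + (i + 0.5) * step for i in range(n_bins)]
    above = [m > threshold for m in mids]
    n_above = sum(above)
    n_below = n_bins - n_above

    # Without the constraint, MaxEnt is uniform. The constraint only binds
    # when the uniform distribution puts too little mass above the threshold.
    mass_above = max(min_mass_above, n_above / n_bins)

    probs = [mass_above / n_above if is_above
             else (1.0 - mass_above) / n_below
             for is_above in above]
    return mids, probs


def mean_height(mids, probs):
    """Mean of the discretized distribution."""
    return sum(m * p for m, p in zip(mids, probs))


mids, probs = maxent_heights()
print(round(mean_height(mids, probs), 2))
```

Under these crisp readings the mean comes out at about 183 cm (0.3 x 155 + 0.7 x 195), compared with 180 cm for the unconstrained uniform case. The full list `probs` is the distribution itself, which is what one would hand over if asked for more than a single number.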
You give various rough constraints which can be modeled as prior probabilities, and standard probability and Maximum Entropy can be used to determine a posterior distribution of the height consistent with your constraints and otherwise remaining maximally non-committal.

It does indeed seem odd to assume a uniform distribution for height, but it's equally odd to assume a hard upper bound and a lower bound greater than zero on the possible heights. If nature did in fact restrict heights to a certain interval, then she might very well distribute them uniformly in that interval. As nature does not impose this restriction, the MaxEnt posterior that we get by imposing it is naturally incongruous.

If you ask me the average height of Swedes, and tell me that it is extremely important that the answer be correct, then I would likely discretize the possible heights (to use 0-1 loss), determine some posterior distribution of the heights (given what I know, or what I believe weighted by the belief), calculate the risk associated with each possible estimate by integrating the loss times the posterior over all other values, and then choose a height from the set of heights with the least risk. Or if I were feeling particularly Bayesian, I might try a whole set of priors weighted by my belief that they could be the true prior, and determine the risk of each estimate averaged over the possible priors. This is "model comparison", which relies for its consistency on the universal norm of probability.

Of course I would never be "sure" of my answer, but I have to give one, just as someone using Fuzzy Logic would have to give an answer and be unsure of it. If the Fuzzy Logicist is allowed to give fuzzy answers, then the Bayesian must be allowed to give posterior distributions.

Kind regards,
Jason

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Lotfi A.
Zadeh
Sent: Friday, January 23, 2004 10:27 AM
To: [EMAIL PROTECTED]
Subject: Re: [UAI] functional_vs_causal models

Dear Jason:

Thank you for solutions to the test problems. Your solutions show that you have a high level of expertise in standard probability theory (PT) and the maximum entropy principle. However, in my view your solutions support, indirectly, my contention that PT and the maximum entropy principle do not have a capability to deal with perception-based information.

To make my point, I will focus on the tall Swedes problem. For convenience, I will formulate a progression of versions of this problem. In these versions, what varies is the initial dataset. The question is the same: What is the average height of Swedes? In the following, a* denotes "approximately a."

Version 1 (crisp). Swedes over 20 range in height from 140cm to 220cm. Let h be the height of a Swede picked at random. I am told that the distribution of h is uniform, and am asked, "What is the average height of Swedes?" My answer is: 180cm. If I am asked, "Are you sure?" my answer would be "Yes."

Now, I ask you the same question but without telling you what is the probability distribution of h. You invoke the maximum entropy principle and in response to my question tell me that the average height is 180cm. But then I ask you, "Jason, are you sure that the average height is 180cm? If not, I may be in serious trouble." Your answer would have to be: "No, I am not sure." This is a fundamental flaw of the maximum entropy principle. Furthermore, as I have pointed out in earlier messages, the principle is not applicable when information is perception-based.

Version 2 (fuzzy). Swedes over 20 range in height from 140cm to 220cm. Over 70* percent are taller than 170*cm. What is the average height of Swedes over 20? A fuzzy logic solution is described in the attachment.

Version 3 (fuzzy). Swedes over 20 range in height from 140cm to 220cm. Over 70* percent are taller than 170*cm.
Less than 10* percent are shorter than 150*cm. Less than 15* percent are taller than 200*cm. What is the average height of Swedes over 20?

I would be very interested in your solutions to these versions, and your answer to my question: Are you sure that the answer is correct? What if an incorrect answer may lead to a serious loss?

With my warm regards,
Lotfi

Attachment:

Version 2. Swedes over 20 range in height from 140cm to 220cm. Over 70* percent are taller than 170*cm. What is the average height of Swedes over 20?

Fuzzy logic solution. Consider a population of Swedes over 20, S = {Swede_1, Swede_2, ..., Swede_N}, with h_i, i = 1, ..., N, being the height of Swede_i. The datum "Over 70* percent of S are taller than 170*cm" constrains the h_i in h = (h_1, ..., h_N). The constraint is precisiated through translation into the Generalized Constraint Language, GCL. More specifically, let X denote a variable taking values in S, and let X|(h(X) is ≥ 170*) denote a fuzzy subset of S induced by the constraint h(X) is ≥ 170*. Then

  Over 70* percent of S are taller than 170* → (GCL): 1/N Count(X|h(X) is ≥ 170*) is ≥ 0.7*

where Count is the fuzzy count of X's which satisfy the fuzzy constraint h(X) is ≥ 170*.

A general deduction rule in fuzzy logic is the following. In this rule, X is a variable which takes values in a finite set U = {u_1, u_2, ..., u_N}, and a(X) is a real-valued attribute of X, with a_i = a(u_i) and a = (a_1, ..., a_N):

  1/N Count(X|a(X) is C) is B
  ---------------------------
  Av(X) is ?D

where Av(X) is the fuzzy average value of X over U. Thus, computation of the average value, D, reduces to the solution of the fuzzy nonlinear programming problem

  μ_D(v) = max_a μ_B((1/N) sum_i μ_C(a_i))
  subject to v = (1/N) sum_i a_i (average height)

where μ_B, μ_C and μ_D are the membership functions of B, C and D, respectively. This is the fuzzy logic solution to Version 2. Note that computation of D requires calibration of the membership functions of ≥ 170* and ≥ 0.7*.
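
The programming problem above can be sketched by brute force for a very small population. The sketch reads the rule as mu_D(v) = max over a of mu_B((1/N) sum_i mu_C(a_i)) subject to v = (1/N) sum_i a_i. Everything else in it is an assumption made purely for illustration and is not taken from the attachment: the linear ramp membership functions and their breakpoints, the population size N = 4, and the 10 cm height grid.

```python
from itertools import combinations_with_replacement


def ramp(x, lo, hi):
    """Nondecreasing membership: 0 at or below lo, 1 at or above hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def mu_C(h):
    """Assumed calibration of 'taller than 170*' (heights in cm)."""
    return ramp(h, 165.0, 175.0)


def mu_B(r):
    """Assumed calibration of 'over 0.7*' (r is a proportion)."""
    return ramp(r, 0.6, 0.7)


def fuzzy_average(heights, n):
    """Map each attainable average v to mu_D(v) by exhaustive search.

    heights: candidate height values (a coarse grid over 140..220 cm)
    n:       population size (kept tiny so enumeration is feasible)
    """
    mu_D = {}
    # Order within the population does not matter, so multisets suffice.
    for a in combinations_with_replacement(heights, n):
        v = sum(a) / n                           # average height
        proportion = sum(mu_C(h) for h in a) / n  # relative fuzzy count
        degree = mu_B(proportion)
        mu_D[v] = max(mu_D.get(v, 0.0), degree)
    return mu_D


mu_D = fuzzy_average(range(140, 221, 10), n=4)
for v in sorted(mu_D):
    print(f"{v:6.1f}  {mu_D[v]:.2f}")
```

The result is a membership function over candidate averages, not a single number. Different calibrations of `mu_C` and `mu_B` would change its shape.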
Note also that the fuzzy logic solution is a solution in the sense that it reduces the original problem to a well-defined mathematical problem.

Jason, your Bayesian solution of Version 2 would yield a crisp value of D. The fuzzy logic solution leads to a fuzzy value of D, in consequence of the fuzziness of the initial dataset. This is an instance of the principle: fuzzy in, fuzzy out.

--
Lotfi A. Zadeh
Professor in the Graduate School, Computer Science Division
Department of Electrical Engineering and Computer Sciences
University of California
Berkeley, CA 94720-1776
Director, Berkeley Initiative in Soft Computing (BISC)

Address:
Computer Science Division
University of California
Berkeley, CA 94720-1776
[EMAIL PROTECTED]
Tel.(office): (510) 642-4959
Fax (office): (510) 642-1712
Tel.(home): (510) 526-2569
Fax (home): (510) 526-2433
Fax (home): (510) 526-5181
http://www.cs.berkeley.edu/People/Faculty/Homepages/zadeh.html

BISC Homepage URLs:
URL: http://www-bisc.cs.berkeley/
URL: http://zadeh.cs.berkeley.edu/
