As usual, Professor Tillers is raising thought-provoking questions. Can his questions be addressed through the use of standard probability theory, as suggested by Jason Palmer? The view articulated in the following is that Professor Tillers' questions relate, in the main, to partiality of truth rather than to partiality of certainty. More specifically, the questions "Are roller skates motor vehicles?" or, more contentiously, "Are motorized wheelchairs motor vehicles?" are, in effect, questions about the degrees to which roller skates and motorized wheelchairs are members of the fuzzy set of motor vehicles. Equivalently, grades of membership may be interpreted as truth values of the propositions "Roller skates are motor vehicles" and "Motorized wheelchairs are motor vehicles." Generally, grades of membership and truth values are context-dependent. This is consistent with the points made in the exchange of messages between Professor Tillers and Jason Palmer. Note that measures of similarity may be interpreted as grades of membership or, equivalently, as truth values.
In a statement quoted by Professor Tillers, David Larkin points out that Bayesian network theory, call it BNT, can deal with continuous variables, and argues that this capability entails the capability of BNT to deal with partiality of truth. Unfortunately, this is not the case. In fact, the incapability of BNT and, more generally, standard probability theory, call it PT, to deal with partiality of truth is a serious limitation of both BNT and PT. A consequence of this limitation is that BNT and PT do not have the capability to operate on perception-based information expressed in a natural language. The following relatively simple test problems are intended to lend support to this contention.

1. The balls-in-box problem. A box contains black and white balls. My perceptions are: (a) there are about twenty balls; (b) most are black; and (c) there are several times as many black balls as white balls. What is the probability that a ball drawn at random is white?

2. The Robert example. Usually Robert leaves his office at about 5:30pm. Usually it takes him about thirty minutes to get home. What is the probability that Robert is home at about 6:15pm?

3. The tall Swedes problem. My perception is that most Swedes are tall. What is the average height of Swedes?

4. X is a real-valued random variable. Usually X is not very large. Usually X is not very small. What is the probability that X is neither small nor large? What is the expected value of X?

5. X is a real-valued random variable. My perception of the probability distribution of X may be described as: Prob(X is small) is low; Prob(X is medium) is high; Prob(X is large) is low. What is the expected value of X?

What are the tools that are needed to solve such problems? What is needed is a generalization of PT--a generalization which adds to PT the capability to operate on perception-based information expressed in a natural language.
Such a generalization, call it PTp, was described in my paper, "Toward a Perception-Based Theory of Probabilistic Reasoning with Imprecise Probabilities," Journal of Statistical Planning and Inference, Vol. 105, 233-264, 2002. (Downloadable at http://www-bisc.cs.berkeley.edu/BISCProgram/Projects.htm). For illustration, a very brief sketch of PTp-based solutions of Problems 1 and 3 is presented in the following.

Problem 1. Let X, Y and P denote, respectively, the number of black balls, the number of white balls and the probability that a ball drawn at random is white. Let a* denote "approximately a," with "approximately a" defined as a fuzzy set centering on a. Translating perception-based information into the Generalized Constraint Language (GCL), we arrive at the following equations:

(X+Y) is 20*
X is most × 20*
X is several × Y
P is Y/20*

In these equations, most and several are fuzzy numbers which are subjectively defined through their membership functions. X, Y and P are fuzzy numbers which are solutions of the equations. X, Y and P can readily be computed through the use of fuzzy integer programming.

Problem 3. Assume that the height of Swedes ranges from h_min to h_max. Let g(u) denote the count density function, meaning that g(u)du is the proportion of Swedes whose height lies in the interval between u and u+du. The proposition "Most Swedes are tall" translates into "The integral over the interval [h_min, h_max] of g(u) times the membership function of tall, t(u), is most," where most is a fuzzy number which is subjectively defined through its membership function. The average height, h_ave, is the integral over the interval [h_min, h_max] of g(u) times u. If g were known, this integral would give the average height of Swedes. In our problem, g is not known, but what we know is that it is constrained by the translation of the proposition "Most Swedes are tall." Through constraint propagation, the constraint on g induces a constraint on the average height.
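The induced constraint in Problem 3 can be written out explicitly. The following is only a sketch in the notation used above: the symbols \mu_{most} (membership function of most) and \mu_{ave} (membership function of the fuzzy average height) are shorthand introduced here, and the normalization of g is assumed from its reading as a proportion density.

```latex
\mu_{ave}(v) \;=\; \sup_{g}\; \mu_{most}\!\left( \int_{h_{min}}^{h_{max}} g(u)\, t(u)\, du \right)
\quad \text{subject to} \quad
v = \int_{h_{min}}^{h_{max}} u\, g(u)\, du, \qquad
\int_{h_{min}}^{h_{max}} g(u)\, du = 1, \qquad
g(u) \ge 0 .
```

In words: a candidate average height v receives the highest degree to which some admissible g with that average satisfies "Most Swedes are tall."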
The rule governing constraint propagation is the extension principle of fuzzy logic. Applying this principle to the problem in question leads to the membership function of the fuzzy set which describes the average height of Swedes. Details relating to use of the extension principle may be found in my JSPI paper.

To understand why reasoning with perception-based information described in a natural language is beyond the reach of PT and BNT, it is helpful to introduce the concept of dimensionality of natural languages. Basically, a natural language is a system for describing perceptions. Among the many perceptions which underlie human cognition, there are three that stand out in importance: perception of truth (verity); perception of certainty (probability); and perception of possibility. These perceptions are distinct and each is associated with a degree which may be interpreted as a coordinate along a dimension. Thus, we can speak of the dimension of truth (verity), the dimension of certainty (probability) and the dimension of possibility. Natural languages are three-dimensional in the sense that, in general, a proposition in a natural language is partially true, or partially certain, or partially possible, or some combination of the three. For example, "It is very likely that Robert is tall" is associated with partial certainty and partial possibility, while "It is quite true that Mary is rich" is associated with partial truth and partial possibility. Standard probability theory, PT, is one-dimensional in that it deals only with partiality of certainty and not with partiality of truth nor with partiality of possibility. The mismatch in dimensionalities is the reason why PT and BNT are ill-equipped for dealing with perception-based information expressed in a natural language. Note that, unlike PT, PTp is three-dimensional.
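Returning to Problem 1, the GCL constraints can be explored numerically. The sketch below is not the fuzzy integer programming procedure of the JSPI paper; it simply enumerates integer pairs (X, Y), grades each pair by the minimum of the three constraint memberships, and carries that grade over to P = Y/(X+Y) by taking the maximum over all pairs that yield the same P. The membership functions for 20*, most and several are assumptions chosen for illustration, as are all function names.

```python
from fractions import Fraction


def trapezoid(x, a, b, c, d):
    """Grade of x in a trapezoidal fuzzy set with support (a, d) and core [b, c]."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


# Assumed, subjective membership functions (illustrative, not from the source).
def about_20(n):
    """'20*': approximately twenty balls."""
    return trapezoid(n, 15, 20, 20, 25)


def most(r):
    """'most' as a proportion: grade 0 up to 0.5, rising linearly to 1 at 0.8."""
    return min(1.0, max(0.0, (r - 0.5) / 0.3))


def several(k):
    """'several' as a multiplier: fully satisfied between 3 and 6."""
    return trapezoid(k, 2, 3, 6, 8)


def solve(max_balls=30):
    """Return {P: grade}, a discrete fuzzy set for the probability of white."""
    mu_p = {}
    for x in range(1, max_balls + 1):            # number of black balls
        for y in range(1, max_balls + 1 - x):    # number of white balls
            n = x + y
            # Joint grade of (x, y): all three perceptions must hold.
            grade = min(about_20(n), most(x / n), several(x / y))
            if grade > 0:
                p = Fraction(y, n)
                # Keep the best grade among pairs giving the same P.
                mu_p[p] = max(mu_p.get(p, 0.0), grade)
    return mu_p


if __name__ == "__main__":
    for p, grade in sorted(solve().items()):
        print(f"P = {p}  grade = {grade:.2f}")
```

Under these assumed membership functions, P = 1/5 (sixteen black balls and four white) receives grade 1, while P = 1/2 receives grade 0 and is therefore absent from the result. Different subjective definitions of 20*, most and several would give a different fuzzy set for P.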
In retrospect, historians of science may find it difficult to understand why what is so obvious--that partiality of certainty and partiality of truth are distinct concepts and require different modes of treatment--encountered so much denial and resistance. Partiality of truth and partiality of certainty play pivotal roles in human cognition. In the realm of law, however, partiality of truth and partiality of class membership are much more pervasive than partiality of certainty. In many instances, they occur in combination.

Warm regards to all,

Lotfi

Lotfi A. Zadeh
Computer Science Division
University of California
Berkeley, CA 94720-1776
Tel(office): (510) 642-4959
Fax(office): (510) 642-1712
