First of all, I would like to commend both Lotfi and Kathy for the 
reasoned, respectful and open-minded dialog, a nice contrast to the too 
many dogmatic and antagonistic "dialogs" I have seen in the past within the 
fuzzy/Bayesian/AI community.

Now, a few quick observations:

1)  At least as a starting point, I have always found it most useful to 
view "fuzzy sets" as a shorthand for "membership functions defined over 
[ordinary or "crisp"] sets".  In other words, rather than treating the 
class of fuzzy sets as a super-class of the class of crisp sets, I view it 
as a class of constructs that are tied to crisp sets.  I believe this 
distinction may clarify some of the discussion, and remove some of the 
unnecessarily competitive treatment the "crisp vs. fuzzy" issue has 
received.  The issue now becomes whether someone can construct a class of 
membership functions that has some value (conceptual, computational, 
pedagogical, whatever) to its users.

2)  Typically, the membership functions are restricted to the ordered 
interval [0,1], with specific properties defining the 0 and 1 endpoints 
and at least an ordinal scale.  It is conceivable to permit a more general 
class of functions onto a partially-ordered set (e.g. intervals), but I 
have doubts about the practical value of such a weak construct.  On the 
other hand, attempts to interpret the function's range as cardinal run the 
risk of reducibility to second-order probabilities.

3)  There may be some confusion between "imprecise probabilities" and 
"probabilities of imprecisely-defined events".

a)  In the former, your space of events can pass a rigorous binary clarity 
test, but you admit imperfections in the process of assessing 
probabilities.  ("I can't say for sure what the probability is that Jane 
will arrive home before 6:30, but I'd say it's about 75%.")  This may be a 
way of capturing something about the myriad possible influencing factors 
that are not modeled explicitly (e.g. all the possible events that might 
have delayed Jane's train), or about the method of generating the numbers 
(show a subject an exact probability distribution for a long time, then 
hide it and ask the subject the probabilities of various intervals, and 
you'll get variability).  The value of admitting such variability is that 
for decision purposes (e.g. when should I start cooking dinner) it may be 
unnecessary to determine the precise probability.
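
To make the point about decision robustness concrete, here is a minimal 
Python sketch.  The actions, the payoffs, and the interval [0.70, 0.80] 
for the probability are made-up illustrations of my own, not anything 
claimed above; the sketch simply checks whether the preferred action 
changes anywhere in the admissible interval.  If it does not, the precise 
probability really is unnecessary for the decision.

# Hypothetical payoffs: (utility if Jane arrives early, utility if she is late).
PAYOFFS = {
    "start_cooking_now": (10.0, 2.0),   # dinner ready on time vs. food gets cold
    "wait_for_her":      (4.0, 6.0),    # dinner is late vs. cooked fresh on arrival
}

def expected_utility(action, p_early):
    """Expected utility of an action given P(Jane arrives before 6:30) = p_early."""
    u_early, u_late = PAYOFFS[action]
    return p_early * u_early + (1.0 - p_early) * u_late

# Sweep every admissible probability in the interval [0.70, 0.80].
candidates = [0.70 + 0.01 * k for k in range(11)]
best = {p: max(PAYOFFS, key=lambda a: expected_utility(a, p)) for p in candidates}

if len(set(best.values())) == 1:
    print("Decision is robust over the whole interval:", best[candidates[0]])
else:
    print("Decision depends on the precise probability:", best)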

b)  In the latter case, you may have a perfectly well-defined crisp 
probability distribution over the (crisp) event domain, but still wish to 
define fuzzy events over that crisp domain and derive or reason about their 
probabilities.  ("Here is my exact probability distribution over the time 
Jane arrives home.  Here is my fuzzy membership function for 'early'.  Now, 
what's the probability that she arrives home early?").  Here, using the 
"level sets" interpretation of the membership functions, we can use a fuzzy 
description as a compact (at least approximate) representation of a nested 
set of interval-valued problems, indexed by the membership value.  (E.g. 
use the alpha-level set on all your membership functions to compute the 
interval values on probabilities, utilities, etc., and derive the interval 
for your answer, then "stack" these alpha-level sets to assemble the fuzzy set 
for the answer.  Then, if desired, apply a set of predetermined linguistic 
filters to translate the answer into the appropriate language.)  In 
decision situations, the value is that you obtain an answer with a 
degree-of-confidence measure attached.
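
As a concrete (and entirely made-up) illustration of this level-set 
procedure, here is a small Python sketch.  The discretized arrival-time 
distribution and the membership function for "early" are hypothetical 
numbers of my own; the point is only the mechanics: each alpha-cut of 
"early" is an ordinary crisp event whose probability can be computed 
directly, and stacking those results over alpha reassembles a fuzzy 
description of the answer.

# Arrival times (minutes after 5:00 PM) and an exact, crisp distribution.
times = [0, 15, 30, 45, 60, 75, 90]
prob  = [0.05, 0.15, 0.30, 0.25, 0.15, 0.07, 0.03]   # sums to 1.00

def mu_early(t):
    """Hypothetical membership for "early": fully early up to 5:15,
    definitely not early after 6:15, linear in between."""
    if t <= 15:
        return 1.0
    if t >= 75:
        return 0.0
    return (75 - t) / 60.0

# For each alpha, the alpha-cut of "early" is the crisp event
# {t : mu_early(t) >= alpha}, whose probability is an ordinary crisp number.
alphas = [0.1 * k for k in range(1, 11)]
stack = [(a, sum(p for t, p in zip(times, prob) if mu_early(t) >= a))
         for a in alphas]

# "Stacking" the alpha-cuts gives the answer as a function of membership level.
for a, p in stack:
    print(f"alpha = {a:.1f}:  P(arrival in the alpha-cut of 'early') = {p:.2f}")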

Sure, it's possible to include both aspects of fuzziness in the same 
problem, but I'd prefer to start with the simpler cases, where there is 
more to agree on, and then add complexity.  And if you want to go even 
further -- what about fuzzy-valued membership functions? -- don't go 
there yet. :-)

4)  Natural language, even when restricted to static aspects of word 
meaning, is a separate issue from fuzziness.  You could address the 
problems in 3a and 3b above using explicit membership functions rather than 
words.  Words, as input data or output descriptors, simply add another 
layer of complexity with possibly different (meta-)semantics.  The 
fundamental assumption for linguistic descriptors, in my humble opinion, is 
that the community of speakers and listeners share some common agreement on 
approximately the same interpretations for verbal descriptors, including 
underlying assumptions about context.  So in the context of 
coming-home-from-work, people might share the view that arrival time has 
an inherent noise level on the order of 30 minutes (again, in a 
particular context), which is essential to interpreting the word 
"early".  It is a separate and highly worthwhile challenge to model this 
social-convergence phenomenon, which may require the combined arsenals of 
fuzzy logic and probability theory.  But I'm afraid that its complexity can 
only obscure the more fundamental non-linguistic issues involved in 
"imprecise probability" theory.

5)  With respect to the use of biological or artificial neural nets as 
models for how we treat probability, some caution is in order.  As the 
"normative systems" school has been preaching for decades now, the behavior 
of humans or animals should be the minimal criterion for success, not the 
optimal goal to be emulated.  Certainly organisms in their natural 
environments can deal effectively with chance and fuzziness within a 
comfortable range, and our engineered systems ought to do at least as well 
there, but those very strengths may turn into weaknesses when extended to 
extreme or anomalous situations.  Psychologists have documented many 
instances of suboptimal and even inconsistent behavior in probability 
judgment and decision making.  Hybrid or dual-mode approaches may be the 
best practical way to capture benefits of both the perceptual and the 
analytic approaches, but that still leaves the analytic-approach modelers 
with the same fundamental issues to address (plus the problem of how to 
meld the components into a single system).

6)  One final thing -- let's not over-invoke Occam's razor here.  Logically 
equivalent concepts may have very different semantics that lead to a 
valuable diversity in conceptual approaches.  Diversity of viewpoint and 
even insularity have their benefits, as long as there is enough 
understanding and communication to transfer lessons learned and to ensure 
consistency.   Human society has benefited from the diversity of natural 
languages, and although some might argue, so has computer science.  So 
those who want to study t-norms, please do so, and those who like 
second-order probabilities, please do so too, and likewise those who want 
to study computational semantics.  I hope we can share and benefit from one 
another's results.

Jonathan Weiss

At 12:13 PM 2001-09-30, zadeh wrote:
>Dear Kathy:
>
>         Thanks for the insightful comments.  Here is what I have to say.
>
>         (a)  Please note that my comment regarding imprecise
>probabilities relates to standard axiomatics of standard probability
>theory, PT, and not to what may be found in research monographs.
>However, construction of an axiomatic system for probability theory with
>imprecise probabilities is complicated by the fact that there are many
>different ways in which probabilities may be imprecise. Can you point me
>to a  comprehensive theory which goes beyond what may be found in
>Walley's treatise on imprecise probabilities?  Is there a general
>definition of conditional probability when the underlying probabilities
>are imprecise?
>
>        (b)  When we describe an imprecise probability by a second-order
>probability distribution, we assume that the latter is known precisely.
>Is this realistic?  Furthermore, if at the end of  analysis we compute
>expectations, as we usually do, then the use of second-order
>probabilities is equivalent to equating the imprecise probability to the
>expected value of the second-order probability. For these and other
>reasons, second-order probabilities are not in favor within the
>probability community.
>
>        (c)  When an imprecise probability is assumed to be
>interval-valued, what is likely to happen is that after a few stages of
>computation the bounding interval will be close to [0,1].
>
>         (d)  With regard to your comment on perceptions, see my paper,"
>A New Direction in AI--Toward a Computational Theory of Perceptions," in
>the Spring issue of the AI Magazine.  In my approach, the point of
>departure is not a collection of raw perceptions, but their description
>in a natural language, e.g., "it is very unlikely that Jane is very
>rich."  Standard probability theory cannot deal with perception-based
>information because there is no mechanism in the theory for
>understanding natural language.
>
>        (e)  Your points regarding novel modes of computation are well
>taken. No disagreement.
>
>                                                    With my warm regards.
>
>
>Lotfi
>
>
>--
>Professor in the Graduate School, Computer Science Division
>Department of Electrical Engineering and Computer Sciences
>University of California
>Berkeley, CA 94720-1776
>Director, Berkeley Initiative in Soft Computing (BISC)
>
>Address:
>Computer Science Division
>University of California
>Berkeley, CA 94720-1776
>Tel(office): (510) 642-4959 Fax(office): (510) 642-1712
>Tel(home): (510) 526-2569
>Fax(home): (510) 526-2433, (510) 526-5181
>[EMAIL PROTECTED]
>http://www.cs.berkeley.edu/People/Faculty/Homepages/zadeh.html

