The opinion at www.ca7.uscourts.gov/op3.fwx?submit1=showop&caseno=01-4208.PDF is decidedly at variance with the dominant legal perspective on -- the dominant legal aversion to -- overtly probabilistic and statistical evidence and argument that I tried to describe in my original message [on the Uncertainty in Artificial Intelligence list], which is reproduced below. It is noteworthy -- and not coincidental -- that the author of this opinion is Richard Posner, now a federal judge but formerly a law professor and the founder of the "law and economics" movement in the U.S. (an academic movement that has long since crossed national boundaries).
The warmth with which Judge Posner greets overtly probabilistic argument about evidence is not limited to the conservative side of the political and judicial spectrum. Judge Jack B. Weinstein, who belongs to the liberal side of any such spectrum, has long been a prominent advocate of more extensive use of probabilistic and statistical methods in law. (Disclosure: I confess to having served as a court-appointed expert witness -- with David Schum -- in one case, at the behest of Judge Weinstein. I also readily confess that David was the true expert in that case, which is why I enlisted his help.)

*****
Peter Tillers
http://tillers.net
Professor of Law
Cardozo School of Law, Yeshiva University
55 Fifth Avenue, New York, NY 10003
(212) 790-0334; FAX (212) 790-0205
[EMAIL PROTECTED]

-----Original Message-----
From: Peter Tillers [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, June 18, 2003 1:02 PM
To: [EMAIL PROTECTED]
Cc: Lotfi Zadeh
Subject: RE: [UAI] A deceptively simple test of deductive capability

To whom it may concern:

Professor Zadeh suggests that his "deceptively simple test of deductive capability" raises a serious question about "the validity of much of probability-based reasoning in the realm of law." Please see his latest message to the UAI list, below.

I am a law teacher, and I would like to comment on Prof. Zadeh's challenge and suggestion. I will begin by circling around his suggestion/challenge: first I will provide a bit of legal background, and then I will give my own crude "take" on the question of the limits and uses -- the epistemically legitimate uses -- of conventional probability theory in legal proceedings. (I will not comment directly on the formal details of Prof. Zadeh's argument; I have breathed in his argument without fully understanding it. But my guess, my hope, is that my comments below have at least a tangential bearing on his formal argument.)
***

There has been a vigorous debate in some U.S. and U.K. academic circles for over 30 years about the use of probability theory and statistical methods in legal proceedings. There are very significant differences of opinion about what that debate has been about. For example, some protagonists view the debate as being fundamentally about the admissibility of probability theory and statistical methods on the back (so to speak) of scientific evidence such as genetic evidence. Other protagonists maintain (at least occasionally) that their concern is not fundamentally with the use of probability theory in trials but with the nature of uncertain reasoning about states of the world ("factual questions"). Still other protagonists have maintained that the debate is not, or ought not to be, about either of those matters, but about the social effects of the use of probability theory and similar methods in trials -- for example, about the effect of the use of "overtly probabilistic" evidence in criminal trials to show "guilt beyond a reasonable doubt." The truth, of course, is that "the" debate I am referring to has been about all of these issues.

As a general matter, there is fierce resistance in the real-world legal profession -- in the real world of litigation -- to the use of conventional "formal" probability theory in proof in litigation. There are exceptions to this rule of resistance. For example, when the issues in a lawsuit involve well-defined ("crisp") and repetitive phenomena -- e.g., mechanisms involved in biological heredity, or radioactive decay, etc.
-- experts whose judgments rest in part on reasoning involving formal probability theory are often allowed to testify. [But rarely if ever are such experts allowed to combine their probabilistic assessments about phenomena within the purview of their expertise with "soft" uncertain judgments -- such as uncertain judgments about a person's intentions on some occasion -- and experts are not allowed to opine on how someone else, such as a juror, should combine "hard probabilities" with whatever "soft uncertainties" that someone else might happen to entertain.]

There are other, more surprising exceptions to the rule of legal resistance to the use of standard probability theory in trials. One striking exception is the common admissibility of statistical evidence in employment discrimination cases -- statistical evidence that can be admitted, for example, as presumptive or prima facie evidence of an employer's probable behavior toward a particular employee on a particular occasion.

But these are exceptional situations, in at least the two senses mentioned next. First, no court in the United States or (as far as I know) anywhere in the Commonwealth believes that the law allows probability theory to be used in ordinary trials to assess "ordinary" evidence. (In the occasional situations where this has been attempted in trials, appellate courts have quickly and firmly condemned the practice. The most famous case of this sort is People v. Collins, 68 Cal.2d 319, 438 P.2d 33 (1968). But there have been other such appellate decisions -- in those very rare situations in which trial judges have allowed trial lawyers, for example, to use the product rule in an argument to a jury about the probability of the guilt of some defendant.
See, e.g., the recent case in the U.K., the Sally Clark case, http://www.sallyclark.org.uk/, involving attempted probability computations, effectively with the product rule, in a "sudden infant death syndrome" case. Cf. Wilson v. Maryland, 370 Md. 191, 803 A.2d 1034 (August 7, 2002). But compared to _Collins_, these are borderline cases -- cases lying closer to the border of legitimate use of probability -- because here at least there were some statistics at hand, however ill-suited they were for their intended purpose.)

Second, almost no judge in the U.S. or in the Commonwealth believes that the law would be well-advised to allow judges or jurors to use probability theory to evaluate the evidence put before them in a trial. The legal profession's opposition to the use of probability theory for the assessment of "ordinary" evidence rests only in part on the legal profession's awareness of the widespread innumeracy of judges, lawyers, and jurors. There is also a very firm sense among almost all legal professionals that the attempt to translate the law's injunctions about the handling of inconclusive evidence into formal probabilistic terms ends up getting things wrong -- that the attempted translation is, necessarily, imperfect, incorrect.

This is where things stand. The question is, in part, whether it is an accident that things stand where they now do, or whether there is something in the nature of most evidence and factual issues in trials (and in litigation generally) that makes them unsuitable for dissection via standard probability theory. It would be too much to expect the legal profession to speak with one voice when trying to explain why it seems so obvious to it that formal probability theory is the wrong instrument for the occasion. But there is at least one persistent strand in the legal profession's thinking about uncertain "historical" events -- (hypothetically) non-recurring events -- that I think may begin to touch the kind of argument that Prof.
Zadeh may have made in his message below and with his deceptively simple hypothetical problem. Lawyers and judges always worry about the validity of "extrapolating" from one set of experiences to another, or from one set of individuals to another, or from (the behavior of) a set of individuals to (the behavior of) a particular individual with a unique set or combination of attributes. One way to see this concern is to see it (coldly) as a variant of the fabled problem of intersecting reference classes. The legal profession (if not all philosophers, logicians, statisticians, etc.) is generally powerfully impressed by the hypothesis that "every situation [person, etc.] is different" -- and, in the face of statistical studies attempting to account for relevant variables, legal professionals are quick to spot variables about which data have not been collected. Equally important, legal professionals (as a general matter) are quick to see, or ready to assert, that some or many "samples" -- collections of observations -- are useless because (in part, sometimes) the criteria for determining whether some event has or has not occurred are imprecise, vague, indeterminate, or fuzzy.
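The intersecting-reference-class worry can be made concrete with a toy calculation. All of the counts and class labels below are invented for illustration; the point is only that the same individual belongs to several overlapping classes, and each class supports a different base rate.

```python
# Toy illustration (invented counts) of the reference-class problem:
# an individual belongs to several overlapping classes, and each class
# yields a different base rate for the event in question.

def base_rate(events, members):
    """Relative frequency of the event within a reference class."""
    return events / members

# Hypothetical counts for three classes the same person belongs to.
classes = {
    "class A (e.g., an age group)":   base_rate(30, 1000),   # 3.0%
    "class B (e.g., an occupation)":  base_rate(120, 1000),  # 12.0%
    "A and B (the intersection)":     base_rate(2, 40),      # 5.0%
}

for name, rate in classes.items():
    print(f"{name}: {rate:.1%}")

# Probability theory alone does not say which class is the "right" one
# for this individual -- and the narrower the intersection, the smaller
# (and noisier) the sample behind its base rate.
```

Nothing in the calculus adjudicates among the three figures; that choice is exactly the extrapolation judgment lawyers distrust.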
(For example, I have little doubt that practically all lawyers could and would launch a legally devastating attack on a statistical study that relied on a collection of data about the relationship between "violent behavior" and "jealous rage.") So let's put it this way: most legal professionals have the sense that it is in principle impossible to collect meaningful data about certain kinds of possible states of affairs -- either because there is a serious question about the extent to which the prior events observed are "like" the event whose occurrence or non-occurrence is now in issue, or because there is something about the current event in issue -- it has some additional attribute or attributes -- that quite possibly distinguishes it from the events described by the collected data. In fancy terms, two problems: the events in an alleged reference class may not be truly representative (because our methods of classifying those events are vague, fuzzy, rough, etc.), and, even if we've got nice crisp reference classes, the events in issue may have other attributes that make predictions based on existing reference classes bad. (I will leave it to you folks to convert my rude language into mathematically, philosophically, and statistically refined and rigorous language.)

Thus far, my argument may suggest that statistical reasoning has little value only in connection with fuzzy events or categories such as "anger," "irritation," and "jealousy." But the problem or phenomenon here goes much farther, much deeper: many people -- many witnesses -- talk in rough or fuzzy ways about crisp and recurring events. This fuzzy and rough talk adds, of course, uncertainty to the problem at hand. The law can try to force witnesses to speak in crisp terms -- judge: "just the facts, ma'am" -- but past legal experience -- e.g.
with a version of the lay opinion rule that told witnesses to speak only in terms of basic sense data rather than inferences [scenario: the law tells the witness: don't tell me whether you think he was drunk, but describe the (precise) behavior that led you to believe or conclude that he was drunk; typical answer: "well, he walked in a sort of herky-jerky way, he stumbled around quite a bit, he didn't seem to see straight"] -- past legal experience firmly suggests that, without forcing people to embrace propositions they don't actually believe, it is often impossible or very, very difficult to get people -- witnesses -- to always or usually speak purely in crisp terms.

One escape route from these sorts of problems is to say that although we can't really (ordinarily) use (rigorous) statistical inference to deal with the sort of evidence and issues we usually have in legal proceedings, we can and should think _logically_ about factual uncertainty. Thus, we should honor principles such as the complementarity of the probabilities of disjoint and exhaustive hypotheses, and -- it has been repeatedly argued -- we should use something like subjective probability (probability judgments not based on any real statistics, used simply as expressions of the degree of our own personal or subjective uncertainty) to reason about evidence, or at least to think about how jurors etc. reason about ordinary evidence. This escape hatch is known to all of you. Is it available in law? My own answer: it is useful, sometimes, to think about problems of evidence and inference in law by casting them (or portions of them, or simplified versions of them) into expressions sanctioned by, and having meaning in, standard probability theory. But one should not expect too much of such Gedankenexperimente.
First, real-world problems of evidence and inference almost always have so many ingredients -- they involve so many cascaded inferences, they involve inference networks with so many arcs and nodes -- that it is beyond human capacity to use formal probability to capture all recognized points of uncertainty. See David Schum's analysis of a very small portion of the evidence in the Sacco-Vanzetti case. It took Schum several years to analyze just a small fraction of the evidence in that case.

Second, practically all evidence in trials comes clothed in "semantic uncertainty." Witness Officer Smith testifies that David Defendant made a "furtive gesture"; testimony: David Defendant was "a bit nervous," he was "edgy"; hearsay evidence: Witness X heard Witness Y, who is now dead and therefore not in the courtroom, say that Peter Plaintiff was driving "carelessly"; John Smith, accused of later driving while under the influence, is reported to have said before the alleged drunk driving that he was feeling "free and easy," and John Smith invokes the Fifth Amendment privilege at trial and does not testify, so we have only his out-of-court words or confession; etc., etc., etc. For what it's worth: the idea of using probability theory to capture the uncertainty associated with such vague words and expressions boggles my mind.

Next question: can the theory of fuzzy sets or the theory of rough sets do any better? My answer: I don't know. The question here is, obviously, not whether lawyers or judges can be convinced to allow actors and decision makers in trials to use fuzzy or rough probability theory. The answer to that question, at present, is clear: this will not happen any time soon, simply because lawyers and judges do not, as a general matter, have any idea of how such theories or methods work.
The question now, for my purposes, is whether fuzzy set theory or rough set theory, or the two approaches taken together, do a better job of picturing the kind of uncertainty involved when there are ambiguous or fuzzy testimonial reports, or when there are testimonial reports about fuzzy or fluid things. I don't yet have an answer. But let me say one more thing: the theory of fuzzy sets, it seems to me, has at least one big advantage over a theory such as Pearl's causal interpretation of Bayes nets. As I understand the two theories, the theory of fuzzy and rough sets does not depend on -- does not need -- an understanding of the underlying "factors" that make language work as well as it does; but Pearl's approach, if it is to be successful, is all about finding hidden or omitted variables. I just cannot imagine that we can use, in the here and now, any formalization of reasoning that demands -- or seems to demand -- that we know, or be able to infer with some assurance, what factors -- what variables, ALL the factors -- "really" explain or account for the phenomenon or phenomena that we (think we) have before us. The power of fuzzy sets lies in part (oddly) in its proclamation, or claim, or presupposition that we can control or manage our environment (to a substantial extent) just by understanding how our language already works -- or, by what perhaps amounts to the same thing, by constructing an artificial logic or language that mimics natural language. There is an affinity, isn't there, between fuzzy set theory's confidence in natural language and the law's -- and some AI's -- belief in common sense reasoning: they all assume, with some pretty good "reason," that human beings can use "crude" expressions such as "hard" and "soft" to good effect.

***

Have I said anything interesting or important? I wonder. But the above is the best I can do in a couple of hours late at night. I hope to write with much more care and after much more reflection in a week or two.
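A minimal sketch of how fuzzy set theory represents a vague testimonial predicate like "drunk": instead of a crisp yes/no classification, each observation receives a degree of membership in [0, 1]. The impairment score and the membership thresholds below are entirely invented for illustration.

```python
# Minimal fuzzy-membership sketch (invented scale and thresholds):
# the vague predicate "drunk" treated as a degree in [0, 1]
# rather than a crisp yes/no.

def membership_drunk(impairment_score):
    """Piecewise-linear membership function: 0 below a score of 2,
    1 above a score of 8, and linear in between."""
    if impairment_score <= 2:
        return 0.0
    if impairment_score >= 8:
        return 1.0
    return (impairment_score - 2) / 6

# "He stumbled around quite a bit" might map to a middling score:
for score in (1, 5, 9):
    print(f"impairment {score} -> 'drunk' to degree {membership_drunk(score):.2f}")
```

The design point is that the membership function encodes how the word is used, without purporting to identify the hidden variables that cause drunkenness -- which is the contrast with the causal-modeling approach drawn above.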
Peter T
