The opinion at www.ca7.uscourts.gov/op3.fwx?submit1=showop&caseno=01-4208.PDF is decidedly at variance with the dominant legal perspective on -- the dominant legal aversion to -- overtly probabilistic and statistical evidence and argument that I tried to describe in my original message [on the Uncertainty in Artificial Intelligence list], which is reproduced below. It is noteworthy -- and not coincidental -- that the author of this opinion is Richard Posner, now a federal judge but formerly a law professor and the founder of the "law and economics" movement in the U.S. (an academic movement that has long since crossed national boundaries).
The warmth with which Judge Posner greets overtly probabilistic argument about evidence is not limited to the conservative side of the political and judicial spectrum. Judge Jack B. Weinstein, who belongs to the liberal side of any such spectrum, has long been a prominent advocate of more extensive use of probabilistic and statistical methods in law. (Disclosure: I confess to having served as a court-appointed expert witness -- with David Schum -- in one case, at the behest of Judge Weinstein. I also readily confess that David was the true expert in that case, which is why I enlisted his help.)

*****
Peter Tillers
http://tillers.net
Professor of Law
Cardozo School of Law, Yeshiva University
55 Fifth Avenue, New York, NY 10003
(212) 790-0334; FAX (212) 790-0205
[EMAIL PROTECTED]

-----Original Message-----
From: Peter Tillers [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, June 18, 2003 1:02 PM
To: [EMAIL PROTECTED]
Cc: Lotfi Zadeh
Subject: RE: [UAI] A deceptively simple test of deductive capability

To whom it may concern:

Professor Zadeh suggests that his "deceptively simple test of deductive capability" raises a serious question about "the validity of much of probability-based reasoning in the realm of law." Please see his latest message to the UAI list, below.

I am a law teacher, and I would like to comment on Prof. Zadeh's challenge and suggestion. I will begin by circling around his suggestion/challenge: first I will provide a bit of legal background, and then I will give my own crude "take" on the question of the limits and uses -- the epistemically legitimate uses -- of conventional probability theory in legal proceedings. (I will not comment directly on the formal details of Prof. Zadeh's argument; I have breathed in his argument without fully understanding it. But my guess, my hope, is that my comments below have at least a tangential bearing on his formal argument.)
***

There has been a vigorous debate in some U.S. and U.K. academic circles for over 30 years about the use of probability theory and statistical methods in legal proceedings. There are very significant differences of opinion about what that debate has been about. For example, some protagonists view the debate as being fundamentally about the admissibility of probability theory and statistical methods on the back (so to speak) of scientific evidence such as genetic evidence. Other protagonists maintain (at least occasionally) that their concern is not fundamentally with the use of probability theory in trials but with the nature of uncertain reasoning about states of the world ("factual questions"). Still other protagonists have maintained that the debate is not, or ought not to be, about either of those matters, but about the social effects of the use of probability theory and similar methods in trials -- for example, about the effect of the use of "overtly probabilistic" evidence in criminal trials to show "guilt beyond a reasonable doubt." The truth, of course, is that "the" debate I am referring to has been about all of these issues.

As a general matter, there is fierce resistance in the real-world legal profession -- in the real world of litigation -- to the use of conventional "formal" probability theory in proof in litigation. There are exceptions to this rule of resistance. For example, when the issues in a lawsuit involve well-defined ("crisp") and repetitive phenomena -- e.g., mechanisms involved in biological heredity, or radioactive decay, etc.
-- experts whose judgments rest in part on reasoning involving formal probability theory are often allowed to testify. [But rarely if ever are such experts allowed to combine their probabilistic assessments about phenomena within the purview of their expertise with "soft" uncertain judgments -- such as uncertain judgments about a person's intentions on some occasion -- and experts are not allowed to opine on how someone else, such as a juror, should combine "hard probabilities" with whatever "soft uncertainties" that someone else might happen to entertain.]

There are other, more surprising exceptions to the rule of legal resistance to the use of standard probability theory in trials. One striking exception is the common admissibility of statistical evidence in employment discrimination cases -- statistical evidence that can be admitted, for example, as presumptive or prima facie evidence of an employer's probable behavior toward a particular employee on a particular occasion.

But these are exceptional situations, in at least the two senses mentioned next. First, no court in the United States or (as far as I know) anywhere in the Commonwealth believes that the law allows probability theory to be used in ordinary trials to assess "ordinary" evidence. (In the occasional situations where this has been attempted in trials, appellate courts have quickly and firmly condemned the practice. The most famous case of this sort is People v. Collins, 68 Cal.2d 319, 438 P.2d 33 (1968). But there have been other such appellate decisions -- in those very rare situations in which trial judges have allowed trial lawyers, for example, to use the product rule in an argument to a jury about the probability of the guilt of some defendant.
See, e.g., the recent case in the U.K., the Sally Clark case, http://www.sallyclark.org.uk/, involving attempted probability computations, effectively with the product rule, in a "sudden infant death syndrome" case. Cf. Wilson v. Maryland, 370 Md. 191, 803 A.2d 1034 (August 7, 2002). But compared to _Collins_, these are borderline cases -- cases lying closer to the border of legitimate use of probability -- because here at least there were some statistics at hand, however ill-suited they were for their intended purpose.)

Second, almost no judge in the U.S. or in the Commonwealth believes that the law would be well-advised to allow judges or jurors to use probability theory to evaluate the evidence put before them in a trial. The legal profession's opposition to the use of probability theory for the assessment of "ordinary" evidence rests only in part on the legal profession's awareness of the widespread innumeracy of judges, lawyers, and jurors. There is also a very firm sense among almost all legal professionals that the attempt to translate the law's injunctions about the handling of inconclusive evidence into formal probabilistic terms ends up getting things wrong -- that the attempted translation is, necessarily, imperfect, incorrect.

This is where things stand. The question is, in part, whether it is an accident that things stand where they now do, or whether there is something in the nature of most evidence and factual issues in trials (and in litigation generally) that makes them unsuitable for dissection via standard probability theory. It would be too much to expect the legal profession to speak with one voice when trying to explain why it seems so obvious to it that formal probability theory is the wrong instrument for the occasion. But there is at least one persistent strand in the legal profession's thinking about uncertain "historical" events -- (hypothetically) non-recurring events -- that I think may begin to touch the kind of argument that Prof.
Zadeh may have made in his message below and with his deceptively simple hypothetical problem. Lawyers and judges always worry about the validity of "extrapolating" from one set of experiences to another, or from one set of individuals to another, or from (the behavior of) a set of individuals to (the behavior of) a particular individual with a unique set or combination of attributes. One way to see this concern is to see it (coldly) as a variant of the fabled problem of intersecting reference classes. The legal profession (if not all philosophers, logicians, statisticians, etc.) is generally powerfully impressed by the hypothesis that "every situation [person, etc.] is different" -- and, in the face of statistical studies attempting to account for relevant variables, legal professionals are quick to spot variables about which data have not been collected. Equally important, legal professionals (as a general matter) are quick to see, or ready to assert, that some or many "samples" -- collections of observations -- are useless because (in part, sometimes) the criteria for determining whether some event has or has not occurred are imprecise, vague, indeterminate, or fuzzy.
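The intersecting-reference-class worry can be made concrete with a toy calculation. All of the counts and class labels below are invented for illustration; the point is only that the same individual belongs to several overlapping classes, and each class supports a different base rate.

```python
# Toy illustration (invented counts) of the reference-class problem:
# an individual belongs to several overlapping classes, and each class
# yields a different base rate for the event in question.

def base_rate(events, members):
    """Relative frequency of the event within a reference class."""
    return events / members

# Hypothetical counts for three classes the same person belongs to.
classes = {
    "class A (e.g., an age group)":   base_rate(30, 1000),   # 3.0%
    "class B (e.g., an occupation)":  base_rate(120, 1000),  # 12.0%
    "A and B (the intersection)":     base_rate(2, 40),      # 5.0%
}

for name, rate in classes.items():
    print(f"{name}: {rate:.1%}")

# Probability theory alone does not say which class is the "right" one
# for this individual -- and the narrower the intersection, the smaller
# (and noisier) the sample behind its base rate.
```

Nothing in the calculus adjudicates among the three figures; that choice is exactly the extrapolation judgment lawyers distrust.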
(For example, I have little doubt that practically all lawyers could and would launch a legally devastating attack on a statistical study that relied on a collection of data about the relationship between "violent behavior" and "jealous rage.") So let's put it this way: most legal professionals have the sense that it is in principle impossible to collect meaningful data about certain kinds of possible states of affairs -- either because there is a serious question about the extent to which the prior events observed are "like" the event whose occurrence or non-occurrence is now in issue, or because there is something about the current event in issue -- it has some additional attribute or attributes -- that quite possibly distinguishes it from the events described by the collected data. In fancy terms, two problems: the events in an alleged reference class may not be truly representative (because our methods of classifying those events are vague, fuzzy, rough, etc.), and, even if we've got nice crisp reference classes, the events in issue may have other attributes that make predictions based on existing reference classes bad. (I will leave it to you folks to convert my rude language into mathematically, philosophically, and statistically refined and rigorous language.)

Thus far, my argument may suggest that statistical reasoning has little value only in connection with fuzzy events or categories such as "anger," "irritation," and "jealousy." But the problem or phenomenon here goes much farther, much deeper: many people -- many witnesses -- talk in rough or fuzzy ways about crisp and recurring events. This fuzzy and rough talk adds, of course, uncertainty to the problem at hand. The law can try to force witnesses to speak in crisp terms -- judge: "just the facts, ma'am" -- but past legal experience -- e.g.
with a version of the lay opinion rule that told witnesses to speak only in terms of basic sense data rather than inferences [scenario: the law tells the witness: don't tell me whether you think he was drunk, but describe the (precise) behavior that led you to believe or conclude that he was drunk; typical answer: "well, he walked in a sort of herky-jerky way, he stumbled around quite a bit, he didn't seem to see straight"] -- past legal experience firmly suggests that, without forcing people to embrace propositions they don't actually believe, it is often impossible or very, very difficult to get people -- witnesses -- to always or usually speak purely in crisp terms.

One escape route from these sorts of problems is to say that although we can't really (ordinarily) use (rigorous) statistical inference to deal with the sort of evidence and issues we usually have in legal proceedings, we can and should think _logically_ about factual uncertainty. Thus, we should honor principles such as the complementarity of the probabilities of disjoint and exhaustive hypotheses, and -- it has been repeatedly argued -- we should use something like subjective probability (probability judgments not based on any real statistics, used simply as expressions of the degree of our own personal or subjective uncertainty) to reason about evidence, or at least to think about how jurors etc. reason about ordinary evidence. This escape hatch is known to all of you. Is it available in law? My own answer: it is useful, sometimes, to think about problems of evidence and inference in law by casting them (or portions of them, or simplified versions of them) into expressions sanctioned by, and having meaning in, standard probability theory. But one should not expect too much of such Gedankenexperimente.
First, real-world problems of evidence and inference almost always have so many ingredients -- they involve so many cascaded inferences, they involve inference networks with so many arcs and nodes -- that it is beyond human capacity to use formal probability to capture all recognized points of uncertainty. See David Schum's analysis of a very small portion of the evidence in the Sacco-Vanzetti case. It took Schum several years to analyze just a small fraction of the evidence in that case.

Second, practically all evidence in trials comes clothed in "semantic uncertainty." Witness Officer Smith testifies that David Defendant made a "furtive gesture"; testimony: David Defendant was "a bit nervous," he was "edgy"; hearsay evidence: Witness X heard Witness Y, who is now dead and therefore not in the courtroom, say that Peter Plaintiff was driving "carelessly"; John Smith, accused of later driving while under the influence, is reported to have said before the alleged drunk driving that he was feeling "free and easy," and John Smith invokes the Fifth Amendment privilege at trial and does not testify, so we have only his out-of-court words or confession; etc., etc., etc. For what it's worth: the idea of using probability theory to capture the uncertainty associated with such vague words and expressions boggles my mind.

Next question: can the theory of fuzzy sets or the theory of rough sets do any better? My answer: I don't know. The question here is, obviously, not whether lawyers or judges can be convinced to allow actors and decision makers in trials to use fuzzy or rough probability theory. The answer to that question, at present, is clear: this will not happen any time soon, simply because lawyers and judges do not, as a general matter, have any idea of how such theories or methods work.
The question now, for my purposes, is whether fuzzy set theory or rough set theory, or the two approaches taken together, do a better job of picturing the kind of uncertainty involved when there are ambiguous or fuzzy testimonial reports, or when there are testimonial reports about fuzzy or fluid things. I don't yet have an answer. But let me say one more thing: the theory of fuzzy sets, it seems to me, has at least one big advantage over a theory such as Pearl's causal interpretation of Bayes nets. As I understand the two theories, the theory of fuzzy and rough sets does not depend on -- does not need -- an understanding of the underlying "factors" that make language work as well as it does; but Pearl's approach, if it is to be successful, is all about finding hidden or omitted variables. I just cannot imagine that we can use, in the here and now, any formalization of reasoning that demands -- or seems to demand -- that we know, or be able to infer with some assurance, what factors -- what variables, ALL the factors -- "really" explain or account for the phenomenon or phenomena that we (think we) have before us. The power of fuzzy sets lies in part (oddly) in its proclamation, or claim, or presupposition that we can control or manage our environment (to a substantial extent) just by understanding how our language already works -- or, by what perhaps amounts to the same thing, by constructing an artificial logic or language that mimics natural language. There is an affinity, isn't there, between fuzzy set theory's confidence in natural language and the law's -- and some AI's -- belief in common sense reasoning: they all assume, with some pretty good "reason," that human beings can use "crude" expressions such as "hard" and "soft" to good effect.

***

Have I said anything interesting or important? I wonder. But the above is the best I can do in a couple of hours late at night. I hope to write with much more care and after much more reflection in a week or two.
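A minimal sketch of how fuzzy set theory represents a vague testimonial predicate like "drunk": instead of a crisp yes/no classification, each observation receives a degree of membership in [0, 1]. The impairment score and the membership thresholds below are entirely invented for illustration.

```python
# Minimal fuzzy-membership sketch (invented scale and thresholds):
# the vague predicate "drunk" treated as a degree in [0, 1]
# rather than a crisp yes/no.

def membership_drunk(impairment_score):
    """Piecewise-linear membership function: 0 below a score of 2,
    1 above a score of 8, and linear in between."""
    if impairment_score <= 2:
        return 0.0
    if impairment_score >= 8:
        return 1.0
    return (impairment_score - 2) / 6

# "He stumbled around quite a bit" might map to a middling score:
for score in (1, 5, 9):
    print(f"impairment {score} -> 'drunk' to degree {membership_drunk(score):.2f}")
```

The design point is that the membership function encodes how the word is used, without purporting to identify the hidden variables that cause drunkenness -- which is the contrast with the causal-modeling approach drawn above.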
Peter T
