Rich,

One thing I really dislike about the "standard stat text" definition 
of random variables and distributions is that the random variables 
themselves change when the sample space is incrementally extended.

Suppose I have defined a belief network for inferring what disease a 
patient has from a bunch of diseases, symptoms and background 
variables.  Now some doctor comes along and tells me he has been 
seeing a brand-new skin rash no one has ever seen before in his 
clinic.  We keep an eye out and sure enough, this skin rash starts 
cropping up all over the place.  We then discover it is due to a 
virus that used to infect only squirrels and is spread by fleas, but 
has now mutated to infect the human population.

So we add a new disease to our list of diseases and a new symptom to 
our list of symptoms, which means the BN now has two new variables it 
didn't have before.  We add some arcs connecting relevant background 
information (such as region of the country and whether the patient has 
spent time outdoors) to these new variables. The rest of the BN stays 
the same.

According to the standard statistics texts, I now have a new sample 
space, which means all my random variables (including ones bearing no 
relation to the new disease) are now different mathematical objects 
from what they were before.  "Mammogram," for example, used to be a 
function from the old sample space (the cross-product of all the 
state spaces of the previously modeled symptoms, background variables 
and diseases) to the values "positive, negative, inconclusive."  Now 
it's a function from the new sample space (a cross-product with 2 
additional dimensions for my two new random variables) to the same 
outcome set.
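
To make this concrete, here is a small Python sketch. The state spaces and the names "NewDisease" and "Rash" are made up for illustration, and the network is far smaller than a real one. It builds "Mammogram" as an explicit function on the old and the new cross-product sample spaces and checks that the two functions have different domains but the same outcome set.

```python
from itertools import product

# Hypothetical, tiny state spaces chosen only for illustration.
old_spaces = {
    "Disease": ["cancer", "none"],
    "Mammogram": ["positive", "negative", "inconclusive"],
}
# Extend the model with two new variables; the old ones are untouched.
new_spaces = dict(old_spaces, NewDisease=["present", "absent"], Rash=["yes", "no"])


def sample_space(spaces):
    """Cross-product of all state spaces, as a list of outcome tuples."""
    return list(product(*spaces.values()))


def coordinate_variable(spaces, name):
    """The random variable `name` as an explicit function (a dict)
    from each point of the sample space to that variable's value."""
    idx = list(spaces).index(name)
    return {omega: omega[idx] for omega in sample_space(spaces)}


mammogram_old = coordinate_variable(old_spaces, "Mammogram")
mammogram_new = coordinate_variable(new_spaces, "Mammogram")

# Different domains, hence different mathematical objects...
domains_differ = set(mammogram_old) != set(mammogram_new)
# ...but the same set of possible values.
same_outcomes = set(mammogram_old.values()) == set(mammogram_new.values())
```

In this toy case the old function is defined on 6 points and the new one on 24, even though nothing about mammograms has changed.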

I find it much less confusing to use the incremental specification as 
the basic definition.  Your defining things that way was one thing I 
liked about your old text (which I used before it went out of print 
and I couldn't get it any more).  However, neither way of doing it is 
"right." I tell students it can be done either way, because either 
way is a valid way of specifying a joint probability distribution.
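
A rough sketch of the incremental view, again in Python with invented numbers and a deliberately simplified structure (here "Mammogram" has no parents, which a real network would not assume): each variable is specified by its own local table, the joint is the product of those tables, and adding the two new variables leaves the distribution over the old ones unchanged.

```python
from itertools import product

# Each node: (parents, states, CPT mapping parent values -> {state: prob}).
# All structure and numbers below are assumptions for illustration only.
old_bn = {
    "Outdoors": ((), ["yes", "no"], {(): {"yes": 0.3, "no": 0.7}}),
    "Mammogram": ((), ["positive", "negative", "inconclusive"],
                  {(): {"positive": 0.1, "negative": 0.85, "inconclusive": 0.05}}),
}

# Incremental extension: add new nodes, leave the old specifications alone.
new_bn = dict(old_bn)
new_bn["NewDisease"] = (("Outdoors",), ["present", "absent"], {
    ("yes",): {"present": 0.02, "absent": 0.98},
    ("no",): {"present": 0.001, "absent": 0.999},
})
new_bn["Rash"] = (("NewDisease",), ["yes", "no"], {
    ("present",): {"yes": 0.9, "no": 0.1},
    ("absent",): {"yes": 0.01, "no": 0.99},
})


def joint(bn):
    """Joint distribution as the product of the local conditional tables."""
    names = list(bn)
    table = {}
    for values in product(*(bn[n][1] for n in names)):
        assignment = dict(zip(names, values))
        p = 1.0
        for n in names:
            parents, _, cpt = bn[n]
            p *= cpt[tuple(assignment[q] for q in parents)][assignment[n]]
        table[values] = p
    return names, table


def marginal(names, table, keep):
    """Sum the joint table down to the variables listed in `keep`."""
    idx = [names.index(k) for k in keep]
    out = {}
    for values, p in table.items():
        key = tuple(values[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return out


old_names, old_joint = joint(old_bn)
new_names, new_joint = joint(new_bn)
old_from_new = marginal(new_names, new_joint, old_names)
```

Both joints sum to one, and `old_from_new` agrees with `old_joint` up to rounding, so either way of writing things down specifies a valid joint distribution.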

Kathy Laskey

