# RE: Observation selection effects

-----Original Message-----
From: Jesse Mazer [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 05, 2004 8:45 PM
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: RE: Observation selection effects

If the range of the smaller amount is infinite, as in my P(x)=1/e^x example, then it would no longer make sense to say that the range of the larger amount is r times larger.

Sure it does; r*inf=inf.  P(s)=exp(-x) -> P(l)=exp(-x/r)

But it would make just as much sense to say that the second range is 3r times wider, since by the same logic 3r*inf=inf. In other words, this step in your proof doesn't make sense:

In other words, the range of possible amounts is such that the larger and smaller amount do not overlap. Then, for any interval of the range (x,x+dx) for the smaller amount with probability p, there is a corresponding interval (r*x, r*x+r*dx) with probability p for the larger amount.  Since the latter interval is longer by a factor of r

P(l|m)/P(s|m) = r ,

In other words, no matter what m is, it is r-times more likely to fall in a large-amount interval than in a small-amount interval.

As for your statement that "P(s)=exp(-x) -> P(l)=exp(-x/r)", that can't be true. The amount in the second envelope can be anything from 0 to infinity, but the integral of exp(-x/r) from 0 to infinity is r, not 1, so for any r other than 1, exp(-x/r) is not a valid probability distribution.
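The normalization point can be checked numerically. The following is only a minimal sketch: the function name `integrate_exp_decay`, the trapezoid rule, the cutoff at 50*r, and the sample values of r are illustrative assumptions, not anything from the thread. It estimates the integral of exp(-x/r) over a long finite interval and shows that it comes out close to r, so dividing by r, i.e. using (1/r)*exp(-x/r), is what brings the total back to 1.

```python
import math

def integrate_exp_decay(r, upper_multiple=50.0, steps=200_000):
    """Trapezoid-rule estimate of the integral of exp(-x/r) over [0, upper_multiple * r].

    The tail beyond upper_multiple * r is on the order of r * exp(-50), so it is ignored.
    """
    upper = upper_multiple * r
    h = upper / steps
    # Endpoint terms f(0) = 1 and f(upper), each weighted by 1/2.
    total = 0.5 * (1.0 + math.exp(-upper / r))
    for i in range(1, steps):
        total += math.exp(-(i * h) / r)
    return total * h

for r in (2.0, 3.0):  # hypothetical ratios, chosen only for illustration
    unnormalized = integrate_exp_decay(r)  # close to r, not 1
    normalized = unnormalized / r          # close to 1 once the 1/r factor is included
    print(f"r={r}: integral of exp(-x/r) ~ {unnormalized:.4f}, with 1/r factor ~ {normalized:.4f}")
```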

Also, now that I think more about it, I'm not even sure the step in your proof I quoted above makes sense even for a probability distribution with finite range. What exactly does the equation "P(l|m)/P(s|m) = r" mean, anyway? It can't mean that if I choose an envelope at random, then before I even open it I can say that the amount m inside is r times more likely to have been picked from the larger distribution, since I know there is a 50% chance I will pick the envelope whose amount was picked from the larger distribution.

Is it supposed to mean that if we let the number of trials go to infinity and then look at the subset of trials where the envelope I opened contained m dollars, it is r times more likely that the envelope was picked from the larger distribution on any given trial? This can't be true for every specific m. For example, if the smaller distribution had a range of 0 to 100 and the larger had a range of 0 to 200, and I set m=150, then in every single trial where I found 150 dollars in the envelope it must have been selected from the larger distribution.

You could do a weighted average over all possible values of m, like "integral over all possible values of m of P('I found m dollars in the envelope I selected')*P('the envelope I selected had an amount taken from the smaller distribution' | 'I found m dollars in the envelope I selected')", which you could write as "integral over m of P(m)*P(s|m)". But I don't think the ratio "integral over m of P(m)*P(l|m)"/"integral over m of P(m)*P(s|m)" would be equal to r. In fact I think both integrals would always come out to 1/2, so the ratio would always be 1. And even if I'm wrong, replacing P(l|m)/P(s|m) with this ratio of integrals would mess up the rest of your proof.
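The 0-to-100 versus 0-to-200 example can be tried out with a small Monte Carlo sketch. Several things here are assumptions for illustration only: the smaller amount is drawn uniformly on (0, 100) (the message gives only the range, not the shape), the larger amount is exactly r = 2 times the smaller, and the function name `simulate` and its parameters are made up. The sketch reports how often the opened envelope was the larger one overall, and separately for trials where the amount found was above or below 100.

```python
import random

def simulate(trials=200_000, r=2.0, small_max=100.0, seed=0):
    """Two-envelope trials: smaller amount uniform on (0, small_max), larger = r * smaller."""
    rng = random.Random(seed)
    opened_large = 0
    above = [0, 0]  # [trials with m > small_max, how many of those opened the larger envelope]
    below = [0, 0]  # same counts for m <= small_max
    for _ in range(trials):
        small = rng.uniform(0.0, small_max)
        large = r * small
        is_large = rng.random() < 0.5  # pick one of the two envelopes at random
        m = large if is_large else small
        opened_large += is_large
        bucket = above if m > small_max else below
        bucket[0] += 1
        bucket[1] += is_large
    return {
        "p_large_overall": opened_large / trials,
        "p_large_given_m_above": above[1] / above[0],
        "p_large_given_m_below": below[1] / below[0],
    }

if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the overall fraction should sit near 1/2, the fraction for m above 100 is exactly 1 (only the larger envelope can hold such an amount), and the fraction for m below 100 should sit near 1/3, so the ratio P(l|m)/P(s|m) is not a single constant r across all m.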

Jesse