On 21-Feb-2009, at 18:33, Chris Gehlker wrote:
> On Feb 21, 2009, at 2:14 PM, LuKreme wrote:
>> Let's say I am trying to find out what the probability is of
>> something happening in a game, and let's say that the mechanics are
>> hidden (so no dice, but a computer calculation)
>>
>> Let's say I do something in the game 78 times, and 19 times I have a
>> 'positive' result.
>>
>> How sure am I that the 'positive result percentage' is 24.35897%?
>
> You can't actually get a point estimate like that. Obviously if you
> try one more time you are going to get either 24.05063291% or
> 25.3164557%. What you  can get is a range that is a function of both
> sample size and the risk that you are willing to take that the true
> value falls outside the  range.

Right, I know that the 'real' probability is unlikely to be 24.35897%,
and I know that every result will change the current percentage up or
down, but my sense is that with 19/78 I have pretty decent
confidence that the 'real' percentage is going to be pretty close to
25%.  If I have a result of 190/780 then I have a MUCH STRONGER sense
that the 'real' percentage is pretty damn close to 24.3%, and if I had
19,000/78,000 I'd feel pretty confident in saying that the real
percentage is 24.359%.  Make sense?

My question, and I suspect I am phrasing it badly or not using
the right terms, is: how do I measure that confidence?  I mean, I could
just go with the old Algebra II notion of significant digits, but that
doesn't give me any sort of confidence interval.

> In practice this means that you use the "plus four" method for
> computing a confidence interval. Let p* = (positive results + 2)/
> (trials +4) and then your interval is p* + or - sqrt(p* (1-p*)/(trials
> + 4)) * Z. You get Z from a table in the  back of a  stat book or from
> a spreadsheet function. For example if I wanted to be 95% sure that my
> interval contained the true value I would type =NORMSINV(0.975) into
> Excel and  get a value of 1.959963985. If I was willing to live with
> 90% confidence I'd use NORMSINV(0.95) and get 1.644853627. The
> variable is always (1 - half chance of the error I'm willing to
> accept) because I want a symmetric estimate with the  chance of
> overestimating = the chance of underestimating.

So in this example case, say I wanted 95% certainty, I would then plug
the numbers into the formula you gave, like this:

p* = (19 + 2)/(78 + 4) = 21/82 = 0.2561

0.2561 ± [ sqrt(0.2561 * 0.7439 / 82) * 1.9600 ]

0.2561 ± 0.0945 = 0.1616 <-> 0.3506

So, I am 95% certain that the true probability is in the range of
16.2% to 35.1% or so.  So, my 'sense' that 19/78 is pretty close to
25% is pretty much shit.
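
For anyone who wants to check the arithmetic, here is a minimal Python
sketch of the "plus four" calculation described above.  The function name
plus_four_interval is my own invention, and it assumes Python 3.8 or
later so that statistics.NormalDist can stand in for Excel's NORMSINV.

```python
from math import sqrt
from statistics import NormalDist  # standard library, Python 3.8+


def plus_four_interval(successes, trials, confidence=0.95):
    """Return (p_star, margin) for the "plus four" interval.

    Hypothetical helper for illustration only.
    """
    # Two-sided critical value; plays the role of NORMSINV(0.975) at 95%.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = trials + 4                      # add four imaginary trials...
    p_star = (successes + 2) / n        # ...two of them positive
    margin = z * sqrt(p_star * (1 - p_star) / n)
    return p_star, margin


p_star, margin = plus_four_interval(19, 78)
print(f"p* = {p_star:.4f}")
print(f"interval = {p_star - margin:.4f} to {p_star + margin:.4f}")
```

Passing confidence=0.90 instead should reproduce the NORMSINV(0.95) case.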

If we move up an order of magnitude (190/780) we get

p* = (190 + 2)/(780 + 4) = 192/784 = 0.2449

0.2449 ± [ sqrt(0.2449 * 0.7551 / 784) * 1.9600 ]

0.2449 ± 0.0301

Still a range of about ±3 percentage points even after nearly 800 trials.
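
To get a feel for how slowly the range tightens, here is a rough Python
sketch that repeats the same plus-four margin at 19/78 scaled up by
factors of ten.  The helper name plus_four_margin and the choice of
scales are mine; Z is hard-coded to the NORMSINV(0.975) value quoted
above, so this only covers the 95% case.

```python
from math import sqrt

Z_95 = 1.959963985  # NORMSINV(0.975), as quoted earlier in the thread


def plus_four_margin(successes, trials, z=Z_95):
    """Half-width of the plus-four interval (hypothetical helper)."""
    n = trials + 4
    p_star = (successes + 2) / n
    return z * sqrt(p_star * (1 - p_star) / n)


margins = {}
for scale in (1, 10, 100, 1000):
    successes, trials = 19 * scale, 78 * scale
    margins[trials] = plus_four_margin(successes, trials)
    print(f"{successes:>6}/{trials:<6}  +/- {margins[trials]:.4f}")
```

The margin goes roughly as 1/sqrt(trials), so it takes about a hundred
times as many trials to cut the range by a factor of ten.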

OK, that at least tells me that my 'sense' is completely worthless in
this case.  What I have to do is decide the confidence level I
want, and then calculate the possible range of values, which is
exactly the opposite of what I was thinking (calculate the value, and then
calculate a ± confidence on that value).
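
As a sketch of that "pick the confidence level first" workflow, this
Python snippet prints the plus-four range for the same 19/78 data at
three levels.  The levels 90%, 95% and 99% are just examples I picked,
and it again assumes Python 3.8+ for statistics.NormalDist.

```python
from math import sqrt
from statistics import NormalDist  # standard library, Python 3.8+

successes, trials = 19, 78
n = trials + 4
p_star = (successes + 2) / n
std_err = sqrt(p_star * (1 - p_star) / n)

intervals = {}
for confidence in (0.90, 0.95, 0.99):
    # Same rule as the spreadsheet: 1 - half the error I'm willing to accept.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * std_err
    intervals[confidence] = (p_star - margin, p_star + margin)
    print(f"{confidence:.0%}: {p_star - margin:.4f} to {p_star + margin:.4f}")
```

The more certain I want to be, the wider the range has to get.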

I knew one of you would know....

-- 
If fashion is your trade then when you're naked I guess you must be
        unemployed.
