Thursday, February 20, 2003, 8:11:48 PM, Ben Goertzel wrote:

CS> Somehow I see this ending up as finding a set of bell curves (i.e.
CS> their height, spread and optimum) for each estimate.  That is to say I
CS> don't see *just* the probability as relevant but the probability
CS> distribution...if I sample 10 people, the curves are all "wider" than
CS> if I sample 100 people out of a 1,000 total.

BG> You can do that, it's true.  One option we prototyped in Novamente was using
BG> "probability distribution truth values" instead of simple probability truth
BG> values.  However, it vastly increases the computational cost, and in many
BG> cases there's not enough data to support a distributional truth value
BG> meaningfully.

BG> So the system is designed to be able to switch between distributional and
BG> simple truth values adaptively ;-)

BG> However, a truth value distribution need not be a bell curve.

BG> For example, how about the truth value of

BG> P( male | human )

BG> As a number it's .5

BG> As a distribution, it's bimodal, not Gaussian at all....

Well, it's bimodal *IF* you know the *real* probability distribution.

Put it this way: given 100 samples from a 1,000 total population

  100 samples
   51 exhibit property X
    3 exhibit property Y

Let's say we *know* that property X is "male" and the sampled
population is "people" (i.e. those are givens); then 51 out of 100 is
both expected and not too far off.

But property Y, because it's so "small", doesn't tell us as much about
"the probability of property Y".  E.g., if Y is having green eyes and
red hair, and this is a rare property, a few more or less in a sample
of 100 is "pretty possible".  Whereas a number of males >> .5n or
<< .5n is much less likely.  If in a random sample of 100 people only
5 were male, we'd be, ehm, surprised.

Isn't there some equation that determines the likelihood of out-of-
bounds results in a random sample?  My probability theory is too rusty
here...
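
For what it's worth, the textbook answer to this kind of question is the
binomial distribution (or the hypergeometric one, when sampling without
replacement from a finite population such as 100 out of 1,000). Below is
a minimal sketch, not anything from Novamente. The function names, the
assumed true rates (0.5 for "male", 0.03 for property Y) and the
assumption that exactly 500 of the 1,000 people are male are all
illustrative choices of mine.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(exactly k hits in n independent draws, each with hit rate p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_range(lo, hi, n, p):
    """P(lo <= hits <= hi) under the binomial model."""
    return sum(binom_pmf(k, n, p) for k in range(lo, hi + 1))

def hypergeom_pmf(k, n, K, N):
    """P(exactly k hits when drawing n items without replacement
    from a population of N items, K of which are hits)."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Assumed true rates, for illustration only.
P_MALE = 0.5
P_Y = 0.03

# "Only 5 males in a sample of 100": a vanishingly small probability.
p_five_or_fewer_males = binom_range(0, 5, 100, P_MALE)

# Property Y: anything from 1 to 6 hits is entirely plausible.
p_y_between_1_and_6 = binom_range(1, 6, 100, P_Y)

# Same "5 or fewer males" question, but sampling 100 of 1,000 people
# without replacement, assuming exactly 500 of them are male.
p_five_or_fewer_males_finite = sum(
    hypergeom_pmf(k, 100, 500, 1000) for k in range(6)
)

print(p_five_or_fewer_males, p_y_between_1_and_6, p_five_or_fewer_males_finite)
```

The point of the comparison: a count of 3 could easily have come out as
1 or 6, so it pins down the rate of Y only loosely, while a count of 5
males would be wildly out of line with a rate of .5.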

Isn't there some way, if a "full curve" is too computationally
expensive, of expressing, say, 2 sigmas (standard deviations)
or whatever? E.g. about 68% (for a bell curve) will fall within 1
standard dev. of optimum X?
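
One cheap summary along these lines is the standard error of a sampled
proportion, optionally shrunk by the finite population correction when
the sample is a sizable fraction of the whole population. The sketch
below is only an illustration under a normal approximation, which is
reasonable for the 51-in-100 case but poor for rare properties like 3 in
100. The function names are my own, not part of any system discussed
here.

```python
from math import erf, sqrt

def std_error(p_hat, n, population=None):
    """Standard error of an observed proportion p_hat from n samples.

    If the total population size is given, apply the finite population
    correction for sampling without replacement.
    """
    se = sqrt(p_hat * (1 - p_hat) / n)
    if population is not None:
        se *= sqrt((population - n) / (population - 1))
    return se

def normal_coverage(k):
    """Fraction of a normal distribution within k standard deviations of its mean."""
    return erf(k / sqrt(2))

# The curve is "wider" for 10 samples than for 100 samples.
se_10 = std_error(0.5, 10)
se_100 = std_error(0.5, 100)

# 51 males in 100 samples, drawn from a population of 1,000.
se_males = std_error(0.51, 100, population=1000)

print(se_10, se_100, se_males)
print(normal_coverage(1), normal_coverage(2))
```

So instead of a full curve, each estimate could carry just two numbers:
the observed proportion and its standard error.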

Finally, isn't there some precise equation or set of equations you are
approximating?


--
Cliff
