Dear friends in the J community,

I wish to share with you a formula for Bayesian induction. Your comments and 
improvements are welcome. 

Verbs for deduction and induction are defined like this.

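   a     =. *`%`:3"2             NB. on each 3-row table: row0 * row1 % row2
   b     =. ,: (%:@* -.)         NB. stack the table on top of %: table * 1 - table
   c     =. (,: , 1:) % +/@]     NB. rows x, y and 1, each divided by +/ y
   deduc =. a@b@c f.             NB. x: sample size, y: counts per color in the urn
   T     =. -@(+ #)              NB. T y is -(y + #y) and x T y is -(x + #y)
   induc =. (T@}: , }.)@(T~ deduc T) f.   NB. x: sample counts, y: urn size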

Consider (following Laplace) an urn containing, say, 50 red balls, 30 yellow
balls and 20 green balls. Close your eyes and pick 10 balls out of the urn. How
many balls of each color will you get? You cannot know for sure, but the order
of magnitude will be 5 red balls, 3 yellow balls and 2 green balls, computed
like this.

   10 (* % +/@]) 50 30 20
5 3 2

The statistical uncertainties are written under the orders of magnitude like 
this.

   10 deduc 50 30 20
      5      3       2
1.50756 1.3817 1.20605

You get 5 red balls, give or take 1.5. 
You get 3 yellow balls, give or take 1.4. 
You get 2 green balls, give or take 1.2. 
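
For readers who find the tacit definition hard to follow, here is a minimal
sketch of the same computation in explicit form. It assumes that deduc returns
the mean and standard deviation for drawing without replacement, which is
consistent with the results shown above, and it assumes a J version in which x
and y name the arguments of an explicit definition. The verb name deducx and
the local names tot, p, mean and sd are illustrative and are not part of the
original definitions.

```j
NB. Sketch only: deducx is a hypothetical explicit-form counterpart of deduc.
NB. x is the number of balls drawn, y is the list of counts per color in the urn.
deducx =: 4 : 0
tot  =. +/ y                                      NB. total number of balls in the urn
p    =. y % tot                                   NB. proportion of each color
mean =. x * p                                     NB. expected count of each color in the sample
sd   =. %: x * p * (1 - p) * (tot - x) % tot - 1  NB. spread when drawing without replacement
mean ,: sd                                        NB. two rows, laid out like the result of deduc
)
```

With this definition, 10 deducx 50 30 20 should reproduce the two rows printed
above for 10 deduc 50 30 20.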

Here are some simple examples.

   1 deduc 1 1 NB. uncertain result
0.5 0.5
0.5 0.5

   1 deduc 2 0 NB. but absolute certainty when both balls have the same color 
1 0
0 0

   2 deduc 1 1 NB. or when both balls are picked
1 1
0 0

The examples above, where the population is known and the sample is unknown,
are deduction. When the sample is known and the population is unknown, it is
called induction.

Close your eyes and pick 10 balls out of an urn containing 100 balls. Open your
eyes and count 5 red, 3 yellow and 2 green balls. What can be said about the
number of balls of each color in the urn?

   5 3 2 induc 100
46.5385 30.6923 22.7692
12.8279 11.8764 10.8416

There are 47 red balls, give or take 13.
There are 31 yellow balls, give or take 12.
There are 23 green balls, give or take 11.
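
The induction verb can also be unfolded into explicit form. The sketch below is
derived by expanding induc by hand, so it is a reading of the definition rather
than the author's own formulation. It again assumes a J version in which x and
y name the arguments. The names inducx, k, ss, q, mean and sd are illustrative.

```j
NB. Sketch only: inducx is a hypothetical explicit-form counterpart of induc.
NB. x is the list of counts per color in the sample, y is the urn size.
inducx =: 4 : 0
k    =. # x                    NB. number of colors
ss   =. +/ x                   NB. sample size
q    =. (x + 1) % ss + k       NB. proportions after adding one ball of each color
mean =. _1 + (y + k) * q       NB. estimated count of each color in the urn
sd   =. %: (y + k) * q * (1 - q) * (y - ss) % ss + k + 1
mean ,: sd                     NB. two rows, laid out like the result of induc
)
```

With this definition, 5 3 2 inducx 100 should reproduce the two rows printed
above. The factor y - ss makes the uncertainty vanish when the whole urn has
been inspected.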

For fun, here are some examples in which all the results are exact integers.

   1 0 induc 4
3 1
1 1

NB. Even if there are no yellow balls in the sample, there may still be some
yellow balls in the population. As is well known in the philosophy of science,
induction is not absolutely certain, unless you investigate the whole
population, of course.

   4 0 induc 4
4 0
0 0

   0 0 0 induc 3
1 1 1
1 1 1

   0 0 induc 6
3 3
2 2

   1 1 induc 18
9 9
4 4

   2 2 induc 12
6 6
2 2

   2 0 induc 62
47 15
12 12

The technical term for 'order of magnitude' is mean value, or expected value.
The technical term for 'statistical uncertainty' is standard deviation.
With only two colors, the deduction is known as the hypergeometric distribution
(with more colors, the multivariate hypergeometric distribution).

But the induction formula, and its relation to the transformation T, does not
seem to be well known among statisticians, although it is very useful.
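
The relation to T can be spelled out by unfolding the definitions by hand. The
derivation below is a sketch read off from the code. The symbols are introduced
here for illustration only: s and c_i are the left and right arguments of
deduc, with C the sum of the c_i. For induction, n_i are the sample counts with
sum n, k is the number of colors and N is the urn size.

```latex
% What deduc computes for sample size s and urn counts c_i, with C = \sum_i c_i
\mu_i = \frac{s\,c_i}{C}, \qquad
\sigma_i = \sqrt{ s\,\frac{c_i}{C}\Bigl(1-\frac{c_i}{C}\Bigr)\frac{C-s}{C-1} }

% induc calls deduc with arguments transformed by T:
%   s   = -(N + k)
%   c_i = -(n_i + 1), \quad\text{hence}\quad C = -(n + k)
% and then maps the mean row back with \mu \mapsto -(\mu + 1).
\hat{N}_i = \frac{(N+k)(n_i+1)}{n+k} - 1, \qquad
\hat{\sigma}_i = \sqrt{ (N+k)\,q_i\,(1-q_i)\,\frac{N-n}{n+k+1} },
\qquad q_i = \frac{n_i+1}{n+k}
```

For the sample 5 3 2 from an urn of 100 balls, the first formula gives
103 * 6 / 13 - 1 for the red balls, which agrees with the 46.5385 shown above.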

Have fun!

-Bo
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
