Hi,
I have read papers on multinomial estimation for sparse variables (e.g. "A
Natural Law of Succession" by Eric Sven Ristad), because I need good
probability estimates for calculating the joint entropy of sets of such
variables. However, I have found that the estimators I have looked at so
far break the chain rule for entropy, with the exception of simple
frequency counting and the Laplace estimate, both of which are
discouraged for very sparse data.
The entropy of a random variable X having possible classes A_X is
defined as:
H(X) = - \sum_{x \in A_X} p(x) \log p(x)
The joint entropy of two random variables X,Y having possible classes
A_X and A_Y is defined as:
H(X,Y) = - \sum_{x \in A_X} \sum_{y \in A_Y} p(x,y) \log p(x,y)
and has the property that H(X,Y) = H(Y,X).
The chain rule for entropy states that:
H(X,Y) = H(X) + H(Y|X) = H(Y) + H(X|Y)
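As a sanity check, here is a minimal sketch (Python; the joint counts
for two three-valued variables are made up) that verifies the identity
when all probabilities are plain frequency-count estimates read off one
joint table:

import numpy as np

counts = np.array([[4, 1, 0],   # made-up joint counts n(x,y): rows = x, cols = y
                   [2, 3, 1],
                   [0, 1, 5]], dtype=float)
N = counts.sum()

def H(p):
    # entropy in bits of an array of probabilities, ignoring zero cells
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

p_xy = counts / N                  # p(x,y), plain frequency counting
p_x = p_xy.sum(axis=1)             # p(x) derived from the same joint
# H(Y|X) = sum_x p(x) * H(Y | X = x), with p(y|x) = n(x,y) / n(x)
H_y_given_x = sum(p_x[i] * H(counts[i] / counts[i].sum())
                  for i in range(len(p_x)))

print(H(p_xy))                     # H(X,Y)
print(H(p_x) + H_y_given_x)        # H(X) + H(Y|X): the same number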
When I plug in probability estimates obtained with, e.g., Ristad's
formula, however, the above identity no longer holds. Because I use the
chain rule to compute joint entropies, I get inconsistent results
depending on the order in which I apply it to the variables.
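To make the problem concrete, here is a sketch in the same spirit as the
one above. As a hypothetical stand-in for a Ristad-style estimator (it
is not Ristad's actual formula) it uses add-1/2 (Krichevsky-Trofimov)
smoothing, applied independently to each marginal and each conditional
distribution; because those estimates no longer come from one consistent
joint distribution, the two orderings of the chain rule give different
numbers:

import numpy as np

counts = np.array([[4, 1, 0],   # same made-up sparse joint counts n(x,y)
                   [2, 3, 1],
                   [0, 1, 5]], dtype=float)
kx, ky = counts.shape            # |A_X|, |A_Y|

def H(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def smooth(c):
    # add-1/2 smoothed probabilities of a vector of counts
    return (c + 0.5) / (c.sum() + 0.5 * len(c))

p_x = smooth(counts.sum(axis=1))   # p(x), smoothed on its own
p_y = smooth(counts.sum(axis=0))   # p(y), smoothed on its own
H_y_given_x = sum(p_x[i] * H(smooth(counts[i]))    for i in range(kx))
H_x_given_y = sum(p_y[j] * H(smooth(counts[:, j])) for j in range(ky))

print(H(p_x) + H_y_given_x)        # H(X) + H(Y|X)
print(H(p_y) + H_x_given_y)        # H(Y) + H(X|Y): a different value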
Does anybody know of multinomial estimators that preserve the laws of
entropy and give good results on sparse variables?
Best regards,
Markus Mottl
--
Markus Mottl http://www.oefai.at/~markus [EMAIL PROTECTED]