Re: Question about log-likelihood formulation

Ted Dunning Sat, 29 Jan 2011 11:32:29 -0800

The contents of the 2 x 2 matrix are more easily understood if we augment it
with row and column sums:


     1and2     1not2   |   preferring1
   not1and2  not1not2  |   not1
  ---------------------+------------
preferring2     not2   |   all

So out of this:

   1not2    = preferring1 - 1and2
   not1and2 = preferring2 - 1and2
   not2     = all - preferring2
   not1not2 = not2 - 1not2 = all - preferring1 - preferring2 + 1and2

Thus I think your code should be:

   double logLikelihood = twoLogLambda(preferring1and2,
                                       preferring1 - preferring1and2,
                                       preferring2 - preferring1and2,
                                       numUsers - preferring1 - preferring2
                                           + preferring1and2);
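
As a sanity check on those four arguments, here is a minimal sketch that builds the cells from the totals and rejects inconsistent inputs. The class name ContingencySketch and the helper cells are made up for illustration and are not part of Mahout; the counts in main are invented.

```java
import java.util.Arrays;

public class ContingencySketch {

    /**
     * Hypothetical helper: returns {k11, k12, k21, k22} for the 2 x 2 table,
     * given the joint count, the two item totals, and the number of users.
     */
    static long[] cells(long preferring1and2, long preferring1,
                        long preferring2, long numUsers) {
        long k11 = preferring1and2;                    // 1and2
        long k12 = preferring1 - preferring1and2;      // 1not2
        long k21 = preferring2 - preferring1and2;      // not1and2
        long k22 = numUsers - preferring1 - preferring2
                   + preferring1and2;                  // not1not2
        if (k11 < 0 || k12 < 0 || k21 < 0 || k22 < 0) {
            throw new IllegalArgumentException("inconsistent totals");
        }
        return new long[] {k11, k12, k21, k22};
    }

    public static void main(String[] args) {
        // Invented numbers: 100 users, 30 prefer item 1, 20 prefer item 2,
        // 10 prefer both.
        long[] k = cells(10, 30, 20, 100);
        System.out.println(Arrays.toString(k)); // prints [10, 20, 10, 60]
    }
}
```

The four cells always add up to numUsers, which is a quick way to catch an argument passed in the wrong slot.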

I find it easier to understand the twoLogLambda code if it is written this
way:

   double twoLogLambda(int k11, int k12, int k21, int k22) {
     return 2 * (kLogP(k11, k12, k21, k22)
                 - kLogP(k11 + k12, k21 + k22)
                 - kLogP(k11 + k21, k12 + k22));
   }

   double kLogP(int... values) {
      double total = 0;
      for (int x : values) {
         total += x;
      }
      double result = 0;
      for (int x : values) {
         if (x > 0) {
            result += x * Math.log(x / total);
         }
      }
      return result;
   }

We have code in Mahout that does something like this.  It is also
essentially the same as the R code I gave earlier.
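
For anyone who wants to try this out, here is a self-contained sketch of the same two functions with a small driver. The class name LlrSketch and the test counts are invented for illustration, and long is used instead of int only to leave room for large counts; this is not the actual Mahout code. Two properties are worth checking: the score should be zero (up to rounding) when the table is exactly independent, and it should not change when items 1 and 2 are swapped.

```java
public class LlrSketch {

    // Sum of x * log(x / total) over the non-zero counts.
    static double kLogP(long... values) {
        double total = 0;
        for (long x : values) {
            total += x;
        }
        double result = 0;
        for (long x : values) {
            if (x > 0) {
                result += x * Math.log(x / total);
            }
        }
        return result;
    }

    // Cells minus row sums minus column sums, times two.
    static double twoLogLambda(long k11, long k12, long k21, long k22) {
        return 2.0 * (kLogP(k11, k12, k21, k22)
                      - kLogP(k11 + k12, k21 + k22)
                      - kLogP(k11 + k21, k12 + k22));
    }

    public static void main(String[] args) {
        // Every cell equals rowSum * colSum / total here, so the items are
        // independent and the score is zero up to floating-point rounding.
        System.out.println(twoLogLambda(10, 20, 20, 40));

        // Item 1 is over-represented among users who prefer item 2,
        // so the score is positive.
        System.out.println(twoLogLambda(10, 20, 10, 60));

        // Swapping the two items transposes the table; the score is the same.
        System.out.println(twoLogLambda(10, 10, 20, 60));
    }
}
```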

On Sat, Jan 29, 2011 at 11:09 AM, Sean Owen wrote:

> Maybe the formulation in the code now is slightly wrong. The key math is:
>
>    double logLikelihood = twoLogLambda(preferring1and2,
>                                        preferring1 - preferring1and2,
>                                        preferring2,
>                                        numUsers - preferring2);
>
>
>  static double twoLogLambda(double k1, double k2, double n1, double n2) {
>    double p = (k1 + k2) / (n1 + n2);
>    return 2.0 * (logL(k1 / n1, k1, n1)
>                  + logL(k2 / n2, k2, n2)
>                  - logL(p, k1, n1)
>                  - logL(p, k2, n2));
>  }
>
>  private static double logL(double p, double k, double n) {
>    return k * Math.log(p) + (n - k) * Math.log(1.0 - p);
>  }
>
