That is very helpful, Ted. Thanks so much. There seem to be a fair
number of papers in the world of psycholinguistics that use the term
MI when they mean PMI.

Thanks for the pointer to the thesis. I really appreciate it.

Yours,

Cyrus



On Tue, Jun 7, 2011 at 6:42 PM, Ted Pedersen <tpede...@d.umn.edu> wrote:
> Hi Cyrus,
>
> There's nothing wrong with your formulation, although I would refer
> to what you describe as Pointwise Mutual Information (PMI), since it
> only uses the probability of observing A B C D all together, plus
> the probability of each word separately, and does not include the
> probabilities of A (not B) C D, and so forth. If you were including
> those, you'd be more in the realm of Mutual Information (or tmi, as
> we call it).
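>
> For instance, a minimal Python sketch of that 4-gram PMI (the
> function name and the way counts are passed in are made up for
> illustration; this is not NSP's actual code):
>
>     import math
>
>     def pmi_4gram(n_abcd, n_a, n_b, n_c, n_d, n_total):
>         # PMI = log( P(ABCD) / (P(A) P(B) P(C) P(D)) ), with each
>         # probability estimated from raw corpus counts
>         p_abcd = n_abcd / n_total
>         p_independent = (n_a * n_b * n_c * n_d) / n_total ** 4
>         return math.log(p_abcd / p_independent)
>
>     # e.g., a 4-gram seen 5 times in a corpus of 1,000,000 tokens,
>     # whose words occur 100, 200, 300, and 400 times respectively
>     print(pmi_4gram(5, 100, 200, 300, 400, 1000000))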
>
> Note that NSP does include a 3-d version of PMI that essentially
> follows your definition.
>
> http://search.cpan.org/dist/Text-NSP/lib/Text/NSP/Measures/3D/MI/pmi.pm
>
> Extending to 4-d would not be difficult.
>
> If on the other hand you would like to do Mutual Information,
> remember that it differs from the Log-Likelihood Ratio only by a
> constant factor, so you could use our 4-d ll measure for that...
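>
> As a quick sanity check on that relationship, here is a rough Python
> sketch (not NSP's code) that computes both G^2 and MI from a 2x2
> contingency table; the same relationship carries over to the
> higher-order tables:
>
>     import math
>
>     def g2_and_mi(table):
>         n = sum(sum(row) for row in table)
>         row_totals = [sum(row) for row in table]
>         col_totals = [sum(col) for col in zip(*table)]
>         g2 = mi = 0.0
>         for i, row in enumerate(table):
>             for j, observed in enumerate(row):
>                 if observed == 0:
>                     continue   # treat 0 * log(0) as 0
>                 expected = row_totals[i] * col_totals[j] / n
>                 g2 += 2 * observed * math.log(observed / expected)
>                 mi += (observed / n) * math.log(observed / expected)
>         return g2, mi
>
>     g2, mi = g2_and_mi([[10, 20], [30, 940]])
>     print(g2, 2 * 1000 * mi)   # G^2 == 2 * N * MI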
>
> http://search.cpan.org/dist/Text-NSP/lib/Text/NSP/Measures/4D/MI/ll.pm
>
> Also, some of the background for these trigram and 4-gram measures is
> described in Bridget McInnes' MS thesis...
>
> Extending the Log-Likelihood Ratio to Improve Collocation
> Identification (McInnes) - Master of Science Thesis, Department of
> Computer Science, University of Minnesota, Duluth, December, 2004.
> http://www.d.umn.edu/~tpederse/Pubs/bridget-thesis.pdf
>
> There are some additional subtleties when you move beyond bigrams,
> because rather than simply comparing the occurrence of an ngram to
> the model of independence (i.e., P(A,B) / (P(A) P(B))), you have the
> option of comparing to other models (i.e., P(A,B,C) / (P(A,B) P(C))).
> This becomes its own big complicated issue which I won't go into
> much here, but it does open up a lot of interesting possibilities
> for longer ngrams that you don't have with bigrams. Some of this is
> discussed in more detail in Bridget's thesis.
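>
> To make that concrete, here is a small Python sketch of those two
> trigram models (a hypothetical helper written for illustration, not
> part of NSP):
>
>     import math
>
>     def trigram_pmi(p_abc, p_ab, p_a, p_b, p_c):
>         # model 1: compare against all three words being independent
>         full_independence = math.log(p_abc / (p_a * p_b * p_c))
>         # model 2: treat "A B" as a unit and ask how strongly C
>         # attaches to it
>         ab_then_c = math.log(p_abc / (p_ab * p_c))
>         return full_independence, ab_then_c
>
> The two scores differ by exactly log(P(A,B) / (P(A) P(B))), so for a
> trigram like "New York City", where "A B" is itself a strong
> collocation, the two models can rank the same trigram very
> differently.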
>
> I hope this helps, and please do let us know if you have any
> additional questions, observations or ideas.
>
> Good luck,
> Ted
>
> On Tue, Jun 7, 2011 at 5:26 PM, Cyrus Shaoul <cyrus.sha...@ualberta.ca> wrote:
>>
>> Hi everyone,
>>
>> My apologies if this has been asked many times before, but
>>
>> would this be an appropriate way to calculate the Mutual Information
>> for a 4-gram made up of words A, B, C, and D?
>>
>> MI(ABCD) = log(P(ABCD) / (P(A) x P(B) x P(C) x P(D)))
>>
>> If not, what is a better way? Why is this bad?
>>
>> Thanks for your help,
>>
>> Cyrus
>>
>>
>
>
> --
> Ted Pedersen
> http://www.d.umn.edu/~tpederse



-- 
http://www.ualberta.ca/~cshaoul/
