Thanks for both replies. They helped me a lot.

On Jun 21, 2012, at 11:20 PM, Ted Dunning wrote:
> Most correlation measures have trouble with small counts. They ascribe very
> high score to coincidence (hence the title of the original paper)
>
> Sent from my iPhone
>
> On Jun 21, 2012, at 2:01 PM, Nimrod Priell <[email protected]> wrote:
>
>>
>> I did note Lingpipe uses a different type of scoring, Pearson C_2 goodness
>> of fit (it seems different from LLR, but I didn't dig deep) to do their
>> collocation scoring:
>> http://alias-i.com/lingpipe/demos/tutorial/interestingPhrases/read-me.html
>> (the exact method is documented in the code,
>> http://alias-i.com/lingpipe/docs/api/com/aliasi/lm/TokenizedLM.html#chiSquaredIndependence(int[])
>> ). Is that method a good way to capture what I'd like?
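
To make the small-counts point concrete, here is a rough sketch comparing the textbook Pearson chi-squared statistic with the log-likelihood ratio (G^2) on a 2x2 bigram contingency table. The function names and the example counts are made up for illustration, and this is not meant to reproduce what LingPipe's chiSquaredIndependence actually computes. It also assumes every row and column total is nonzero.

```python
import math

def expected_counts(table):
    """Expected cell counts under independence for a 2x2 table.

    table = ((k11, k12), (k21, k22)), where k11 counts the bigram "A B",
    k12 counts "A, not B", k21 counts "not A, B", k22 counts neither.
    Assumes all row and column totals are nonzero.
    """
    (k11, k12), (k21, k22) = table
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    return [[rows[i] * cols[j] / n for j in range(2)] for i in range(2)]

def pearson_chi2(table):
    """Pearson chi-squared: sum of (observed - expected)^2 / expected."""
    exp = expected_counts(table)
    return sum(
        (table[i][j] - exp[i][j]) ** 2 / exp[i][j]
        for i in range(2)
        for j in range(2)
    )

def llr_g2(table):
    """Log-likelihood ratio G^2 = 2 * sum(observed * ln(observed / expected)).

    Cells with an observed count of zero contribute nothing (0 * ln 0 = 0).
    """
    exp = expected_counts(table)
    return 2.0 * sum(
        table[i][j] * math.log(table[i][j] / exp[i][j])
        for i in range(2)
        for j in range(2)
        if table[i][j] > 0
    )

# Hypothetical corpus of 100,000 bigrams in which two words each occur
# exactly once, and that one time they occur together.
rare = ((1, 0), (0, 99999))
rare_chi2 = pearson_chi2(rare)
rare_g2 = llr_g2(rare)
print("Pearson chi-squared:", rare_chi2)
print("LLR (G^2):          ", rare_g2)
```

On this single coincidence, Pearson gives a score of roughly 100,000 while G^2 comes out at about 25, so the chi-squared score rewards a one-off co-occurrence far more heavily.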
