Thanks for both replies. They helped me a lot.

On Jun 21, 2012, at 11:20 PM, Ted Dunning wrote:

> Most correlation measures have trouble with small counts. They ascribe very 
> high scores to coincidence (hence the title of the original paper).
> 
> On Jun 21, 2012, at 2:01 PM, Nimrod Priell <[email protected]> wrote:
> 
>> 
>> I did note that LingPipe uses a different type of scoring, Pearson 
>> chi-squared goodness of fit (it seems different from LLR, but I didn't dig 
>> deep), for its collocation scoring: 
>> http://alias-i.com/lingpipe/demos/tutorial/interestingPhrases/read-me.html 
>> (the exact method is documented in the code, 
>> http://alias-i.com/lingpipe/docs/api/com/aliasi/lm/TokenizedLM.html#chiSquaredIndependence(int[])
>>  ). Is that method a good way to capture what I'd like?
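
To make the small-counts point concrete, here is a minimal sketch comparing the two scores on a 2x2 bigram contingency table. This is not LingPipe's or Mahout's code; the function names and the example table are my own assumptions, and it assumes every row and column total is nonzero. The table describes a bigram seen exactly once, whose two words never occur anywhere else, in a hypothetical corpus of 100,000 bigrams.

```python
import math


def expected(table):
    """Expected counts under independence for a 2x2 table [[k11, k12], [k21, k22]]."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    return [[rows[i] * cols[j] / n for j in range(2)] for i in range(2)]


def pearson_chi2(table):
    """Pearson chi-squared statistic: sum of (observed - expected)^2 / expected."""
    e = expected(table)
    return sum(
        (table[i][j] - e[i][j]) ** 2 / e[i][j]
        for i in range(2)
        for j in range(2)
    )


def llr(table):
    """Log-likelihood ratio (G^2): 2 * sum of observed * ln(observed / expected).

    Cells with an observed count of zero contribute nothing to the sum.
    """
    e = expected(table)
    return 2.0 * sum(
        table[i][j] * math.log(table[i][j] / e[i][j])
        for i in range(2)
        for j in range(2)
        if table[i][j] > 0
    )


# Hypothetical counts: k11 = bigram (A, B) seen once; A and B never occur apart.
coincidence = [[1, 0], [0, 99999]]
chi2_score = pearson_chi2(coincidence)
llr_score = llr(coincidence)
print(f"chi-squared = {chi2_score:.2f}, LLR = {llr_score:.2f}")
```

For this single coincidence the chi-squared statistic comes out equal to the corpus size (100,000), while the LLR stays at roughly 25, so a ranking by chi-squared would put this one-off pair far ahead of anything with real support.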
