Re: Collocation and Seq2Sparse Questions

Ted Dunning Thu, 27 May 2010 08:58:58 -0700

Just to forestall some effort on this: LLR is very good as a threshold, but
its value is a poor score, so substituting TF or TFIDF is entirely
appropriate.


There may be use cases for keeping LLR if only for diagnostic purposes.
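
To make the threshold-versus-score distinction concrete, here is a minimal
sketch in Python (not Mahout code). It assumes each candidate bigram comes
with a 2x2 contingency table (k11 = A followed by B, k12 = A followed by
something else, k21 = something else followed by B, k22 = neither) and a term
frequency. The function names, the candidate tuple layout, and the cutoff
value are all hypothetical and chosen only for illustration. LLR decides
which collocations survive, TF decides the order, and the LLR value is
carried along purely for diagnostics.

```python
import math


def x_log_x(x):
    # x * ln(x), with the usual convention that 0 * ln(0) = 0
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    # Unnormalized entropy of a set of counts: N ln N - sum(k ln k)
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def llr(k11, k12, k21, k22):
    # Log-likelihood ratio (G^2) for a 2x2 contingency table
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point round-off
    return max(0.0, 2.0 * (row + col - mat))


def select_collocations(candidates, min_llr):
    """Filter by LLR, then rank by TF.

    candidates: iterable of (ngram, tf, (k11, k12, k21, k22)) tuples
    (a hypothetical layout, not the seq2sparse format).
    Returns a list of (ngram, tf, llr) sorted by tf, highest first.
    """
    kept = []
    for ngram, tf, table in candidates:
        score = llr(*table)
        if score >= min_llr:          # LLR used only as a threshold
            kept.append((ngram, tf, score))
    kept.sort(key=lambda item: item[1], reverse=True)  # TF used as the score
    return kept
```

Note that in this sketch a collocation with a higher LLR can still rank below
one with a higher TF; the LLR only controls whether it is kept at all.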

On Thu, May 27, 2010 at 8:52 AM, Drew Farris <[email protected]> wrote:

> > 2. How can I, given a vector, get the top collocations for that Vector,
> > as ranked by LLR?
> >
>
> If I recall correctly, the LLR score gets dropped in seq2sparse in favor of
> TF or TFIDF depending on the nature of the vectors being generated.
> Meanwhile, CollocDriver simply emits a list of collocations in a collection
> ranked by llr, so neither is strictly what you are interested in. Is there
> a good way to include both something like TF >and< LLR in the output of
> seq2sparse -- would it be necessary to resort to emitting 2 separate sets
> of vectors?
>
