Re: Label-dependent Features

Jim - FooBar(); Wed, 02 Oct 2013 04:04:39 -0700

The most straightforward way to do what you want would be to define a string-similarity measure (e.g. Levenshtein distance) and then, for each word in S2, iterate over S1 and disregard every word whose distance exceeds some predefined threshold. You are actually overcomplicating the problem by using maxent...
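
A minimal sketch of that idea in Python, for illustration only. The function names (levenshtein, close_matches) and the max_distance parameter are made up for this example and are not part of OpenNLP; S1 and S2 are assumed to be plain lists of strings, and the threshold is something you would have to tune on your own data.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the rows as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def close_matches(s1, s2, max_distance):
    """Map each word in s2 to the words of s1 within max_distance edits.

    Hypothetical helper: words of s1 farther away than max_distance are
    disregarded; the rest are listed closest first.
    """
    matches = {}
    for target in s2:
        scored = [(levenshtein(target, word), word) for word in s1]
        kept = sorted(pair for pair in scored if pair[0] <= max_distance)
        matches[target] = [word for _, word in kept]
    return matches
```

Note that this compares every word in S2 against every word in S1, so the cost grows with ||S1|| * ||S2||. That should be fine when S2 is small, but for a very large S1 you may want to index or pre-filter it first.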


hope that helps,
Jim




On 02/10/13 04:10, George Ramonov wrote:

Hi everyone,

I am new to the OpenNLP maxent classifier, and I have a question regarding
using features that are label-dependent.

I have two sets of words (S1 and S2, where ||S1|| >> ||S2||), and I am
trying to find words from S2 that are most similar to S1 using
features I designed. I turned this into a classification problem, treating
words from S2 as labels, and built a nice training set. However, my
features are dependent on the labels themselves. I can't find a simple way in
OpenNLP to utilize labels in the prediction process. My guess is I would
have to subclass MaxentModel and implement the eval() method? Is there an
easier way to solve this problem? Or perhaps, maximum entropy is not the
best algorithm of choice?

Thanks,
George
