Re: OpenNLP Maxent Data Format

David Young Tue, 04 Sep 2012 09:19:38 -0700

Hi, thanks for the reply. I am not as familiar with Java, so I thought I'd
produce a model first with SharpEntropy.
I have not really modified the simple example, so it is only several lines
of basic code.

This is how it works:
http://pastebin.com/LK9tNsrj

The training data is as follows:
http://pastebin.com/3icni8Jc

This example works fine. The problem appears when I try to use any word that
is not in the training data.
For example:
            context.Add("oWord=someNewWord")...

This gives an unknown-key error because the predicate is not recognised. But
I still want to make a prediction from what is known: the surrounding context.

Since this is a maximum entropy model, I have lots of words in the training
data that should be taken into account when available, in addition to each
word's POS. But in the real data I want to evaluate, I have the POS for every
word and some words that are in the training data, while the context may
also contain words that are not in the training data. How do I still get
a prediction in this case using the rest of the context?
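
To make the question concrete, here is a rough sketch of the behaviour I am
after, written in plain Java rather than against SharpEntropy or OpenNLP. It
is only an illustration under assumptions: keepKnown, the knownPredicates set
and the oPOS/nPOS feature names are made up for this example, and I do not
know whether either library exposes its list of known predicates this way.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class KnownContextFilter {

    // Hypothetical helper: keeps only the features whose predicate string
    // was seen in training, so unseen ones never reach the model.
    public static String[] keepKnown(String[] context, Set<String> knownPredicates) {
        List<String> kept = new ArrayList<>();
        for (String feature : context) {
            if (knownPredicates.contains(feature)) {
                kept.add(feature);
            }
        }
        return kept.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Assumed stand-in for the predicates collected from the training data.
        Set<String> known = new HashSet<>(
                Arrays.asList("oWord=the", "oPOS=DT", "nPOS=NN"));

        // "oWord=someNewWord" is not in the training data; the POS features are.
        String[] context = {"oWord=someNewWord", "oPOS=DT", "nPOS=NN"};

        String[] filtered = keepKnown(context, known);
        System.out.println(Arrays.toString(filtered)); // prints [oPOS=DT, nPOS=NN]
    }
}
```

The filtered array is what I would then hand to the model for evaluation, so
the prediction would rest only on the parts of the context it knows.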

Thanks for your time.

On Tue, Sep 4, 2012 at 10:50 AM, Jörn Kottmann <[email protected]> wrote:

> On 09/03/2012 01:45 AM, David Young wrote:
>
>> But my question is; what happens when I want to use something like
>> "next=WordNotInModel", a word that does not exist in the training data,
>> and
>> still want to get a prediction using the rest of the surrounding context?
>> Even If I use "next=Unknown" or "next=null" or Null I get an error
>> "predicateLabel KeyNotFoundException was unhandled". Because "next=
>> WordNotInModel" is not a known key.
>>
>
> Usually maxent is used as an API, can you post some code here
> so we can see what you are doing? Or do you use one of the command
> line utils?
>
> Thanks,
> Jörn
>
