Thanks a lot Jörn, it works now. I don't know why I typed SKIP instead of
SPLIT; I was too focused on the error message.

Sorry for taking your time.

Best wishes,

Jean-Claude


On Fri, May 13, 2011 at 11:47 AM, Jörn Kottmann <[email protected]> wrote:

> On 5/13/11 11:33 AM, Jean-Claude Dauphin wrote:
>
>> Hi,
>>
>> I tried to train models for French from a set of French human
>> resource position data which is split into sentences, and use it as a
>> sample
>> training data stream.
>> It works fine for the sentence detector model using *
>> SentenceDetectorME.train*
>>
>> However, if I use the same sample as tokenizer training content with *
>> opennlp.tools.tokenize.TokenizerME.train*, I get the following error:
>>
>> The maxent model is not compatible!
>>
>
> The error message sounds a bit strange; what it means is that you only
> trained
> with NO_SPLIT events (I guess). The produced model would not be able to
> split any tokens.
>
> We should fix the model validation code, or output a more meaningful
> error
> message.
>
> Anyway, to solve your problem, rename your <SKIP> tags to <SPLIT>.
>
> Have a look at our documentation here:
>
> http://incubator.apache.org/opennlp/documentation/manual/opennlp.html#tools.tokenizer.cmdline.training
>
> Hope that helps,
> Jörn
>
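For anyone finding this thread later, here is a minimal sketch of what the tokenizer training data and command line look like, following the manual linked above. The file names are made up, and the exact CLI options may differ between OpenNLP versions, so check the manual for your release:

```shell
# Tokenizer training data: one sentence per line, with <SPLIT> tags
# marking positions where tokens must be split without whitespace.
# (<SKIP> is not a recognized tag, so it yields only NO_SPLIT events.)
cat > fr-token.train <<'EOF'
Responsable des ressources humaines (H/F)<SPLIT>, CDI<SPLIT>.
Chargé de recrutement<SPLIT>, temps plein<SPLIT>.
EOF

# Train a French tokenizer model from it (hypothetical file names):
bin/opennlp TokenizerTrainer -encoding UTF-8 -lang fr \
    -data fr-token.train -model fr-token.bin
```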



-- 
Jean-Claude Dauphin

[email protected]
[email protected]

http://kenai.com/projects/j-isis/
http://www.unesco.org/isis/
http://www.unesco.org/idams/
http://www.greenstone.org
