Hi,

On 13.01.2017 at 08:19, Richard Eckart de Castilho wrote:
> ...
>
> In theory there is also a trainer for the tokenizer, but I haven't yet been
> able to set up a working unit test for it. I think that was due to an
> immediate lack of suitable training data. So it remains on the todo list.
>

We have several OpenNLP tokenizer models. Aren't most corpora, e.g., those
annotated with POS tags, suitable as training data?

Best,

Peter



> Cheers,
>
> -- Richard
>
>> On 11.01.2017, at 23:51, Joern Kottmann <kottm...@gmail.com> wrote:
>>
>> Hello all,
>>
>> the UIMA integration contains AEs which can be used to train models if
>> a UIMA pipeline is set up to process some kind of corpus.
>>
>> I have the impression that this is kind of dead/unused code.
>>
>> I opened an issue [1] to deprecate it and would like to know if there
>> is any interest in keeping that code? Do we have someone here using
>> that?
>>
>> Please share your opinion with us so we can make a good decision!
>>
>> Jörn
>>
>> [1] https://issues.apache.org/jira/browse/OPENNLP-928
