Just today I was testing the TokenizerTrainer and found a bug with the
isSkipAlphaNumerics parameter: in the initialize() method it is also
declared as a local variable, so the instance variable never gets
assigned, and this causes an NPE in collectionProcessComplete().
The fix is simply to remove the "Boolean" type declaration at line 111 of
TokenizerTrainer [1], so that the configuration parameter value is
assigned to the instance variable.
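
To make the problem concrete, here is a minimal, self-contained sketch of the
shadowing pattern. It is not the actual TokenizerTrainer code: the class names,
the PARAM key and the Map used in place of the UIMA context are all
hypothetical stand-ins, and only the isSkipAlphaNumerics field and the two
method names come from the real class.

```java
import java.util.HashMap;
import java.util.Map;

public class ShadowingSketch {

    // Hypothetical parameter key, used only for this illustration.
    static final String PARAM = "SkipAlphaNumerics";

    // Buggy variant: initialize() declares a new local with the field's name.
    static class BuggyTrainer {
        private Boolean isSkipAlphaNumerics; // never assigned, stays null

        void initialize(Map<String, Object> params) {
            // The leading "Boolean" makes this a local variable that shadows
            // the field, so the value is lost when the method returns.
            Boolean isSkipAlphaNumerics = (Boolean) params.get(PARAM);
        }

        boolean collectionProcessComplete() {
            // Auto-unboxing the null field throws a NullPointerException.
            return isSkipAlphaNumerics;
        }
    }

    // Fixed variant: without the type, the statement assigns to the field.
    static class FixedTrainer {
        private Boolean isSkipAlphaNumerics;

        void initialize(Map<String, Object> params) {
            isSkipAlphaNumerics = (Boolean) params.get(PARAM);
        }

        boolean collectionProcessComplete() {
            return isSkipAlphaNumerics;
        }
    }

    // Runs the action and reports whether it threw a NullPointerException.
    static boolean throwsNpe(Runnable action) {
        try {
            action.run();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Object> params = new HashMap<>();
        params.put(PARAM, Boolean.TRUE);

        BuggyTrainer buggy = new BuggyTrainer();
        buggy.initialize(params);
        System.out.println("buggy throws NPE: "
                + throwsNpe(buggy::collectionProcessComplete));

        FixedTrainer fixed = new FixedTrainer();
        fixed.initialize(params);
        System.out.println("fixed throws NPE: "
                + throwsNpe(fixed::collectionProcessComplete));
    }
}
```

The only difference between the two variants is the type in front of the
assignment, which is the same one-word change as the proposed fix.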
Tommaso

[1] :
http://svn.apache.org/viewvc/incubator/opennlp/trunk/opennlp-uima/src/main/java/opennlp/uima/tokenize/TokenizerTrainer.java?view=markup


2011/4/1 Tommaso Teofili <[email protected]>

> 2011/4/1 Jörn Kottmann <[email protected]>
>
>> On 4/1/11 12:58 PM, Tommaso Teofili wrote:
>>
>>> One issue I found is that the opennlp.uima.Language parameter is not
>>> defined
>>> in the trainers' descriptors causing them to fail during initialization
>>> since the *Trainer classes need the language as a mandatory parameter
>>> (that
>>> is good I think since the statistical model built is language dependent).
>>> Am I right or am I missing something?
>>>
>>
>> No, that really sounds like a mistake, seems like I simply forgot to put
>> the parameter
>> declaration into the descriptor. I will change it on Monday, or of course
>> a patch is welcome :)
>
>
> I didn't run into any other issues, will provide a patch for the descriptors
> tomorrow or Sunday :)
> Tommaso
>
>
>>
>>  p.s.:
>>> Also within that fail case it seems
>>> the org.apache.uima.UIMAException_Messages is missing, but I'd not
>>> consider
>>> this a bug at the moment since I am doing tests in a separate project
>>> which
>>> could need some tweaks but I thought it was still useful to report
>>>
>>
>> I will have a look, thanks for pointing it out, even if it's not a bug we might
>> want
>> to improve it.
>>
>> Jörn
>>
>
>
