[ 
https://issues.apache.org/jira/browse/OPENNLP-238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073010#comment-13073010
 ] 

William Colen commented on OPENNLP-238:
---------------------------------------

I don't know. I added a breakpoint to the method that validates the model, and 
it was never hit while running the cross validator. I'll investigate that 
and open a new Jira.

Also, I will open a Jira for a new tool that creates POS tag dictionaries and 
optionally checks whether the tagset is valid, perhaps by extracting the 
tagset from the training corpus using a cutoff.
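
The cutoff idea could look roughly like the sketch below. It assumes the 
training corpus is given as sentences of whitespace-separated word_tag tokens; 
the class TagsetSketch and the method extractTagset are hypothetical names for 
illustration and are not part of OpenNLP.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not an OpenNLP class: collects the tagset of a corpus.
class TagsetSketch {

    /**
     * Counts how often each tag occurs and keeps only the tags that occur
     * at least 'cutoff' times. Assumes tokens of the form word_tag.
     */
    static Set<String> extractTagset(List<String> sentences, int cutoff) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String sentence : sentences) {
            for (String token : sentence.trim().split("\\s+")) {
                // The tag is whatever follows the last underscore.
                int sep = token.lastIndexOf('_');
                if (sep < 0 || sep == token.length() - 1) {
                    continue; // skip empty or malformed tokens
                }
                counts.merge(token.substring(sep + 1), 1, Integer::sum);
            }
        }
        Set<String> tagset = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() >= cutoff) {
                tagset.add(entry.getKey());
            }
        }
        return tagset;
    }
}
```

A dictionary builder could then reject or flag any dictionary entry whose tag 
is not in the returned set. Rare tags fall below the cutoff, which filters out 
likely annotation errors at the cost of dropping tags that are genuinely rare.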

> BestSequence method in BeamSearch can cause NullPointerException if it can 
> not find a valid sequence
> ----------------------------------------------------------------------------------------------------
>
>                 Key: OPENNLP-238
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-238
>             Project: OpenNLP
>          Issue Type: Bug
>          Components: POS Tagger
>    Affects Versions: tools-1.5.2-incubating
>            Reporter: William Colen
>            Assignee: William Colen
>             Fix For: tools-1.5.2-incubating
>
>
> I am using the standard sequence validator of the POS Tagger with a 
> TagDictionary. Sometimes no outcome matches the tags in the dictionary, 
> which causes a NullPointerException in the bestSequence method.
> I think we should add an extra validation: if the heap 'next' is still empty 
> after advancing all valid sequences (line 159), we should let it advance 
> invalid sequences. It would make the POS Tagger more robust.
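
The proposed fallback can be illustrated with a simplified sketch. This is not 
the OpenNLP BeamSearch code: scoring and the heap are omitted, and the class 
BeamFallbackSketch and the method advance are hypothetical names. It only 
shows the idea of keeping invalid extensions when no valid one exists.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical, simplified stand-in for one beam-advance step.
class BeamFallbackSketch {

    /**
     * Extends every sequence in the beam by every candidate outcome.
     * Extensions accepted by the validator are preferred. If none is
     * accepted, the rejected extensions are returned instead, so the
     * caller never receives an empty beam.
     */
    static List<List<String>> advance(List<List<String>> beam,
                                      String[] outcomes,
                                      BiPredicate<List<String>, String> validator) {
        List<List<String>> next = new ArrayList<>();
        List<List<String>> rejected = new ArrayList<>();
        for (List<String> sequence : beam) {
            for (String outcome : outcomes) {
                List<String> extended = new ArrayList<>(sequence);
                extended.add(outcome);
                if (validator.test(sequence, outcome)) {
                    next.add(extended);
                } else {
                    rejected.add(extended);
                }
            }
        }
        // Extra validation: fall back to invalid sequences rather than
        // leaving 'next' empty, which is the situation behind the NPE.
        return next.isEmpty() ? rejected : next;
    }
}
```

With a validator that rejects every outcome, advance still returns one 
extension per outcome, so a later step that picks the best sequence has 
something to choose from.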

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
