[ 
https://issues.apache.org/jira/browse/OPENNLP-597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13781903#comment-13781903
 ] 

Ioan Barbulescu commented on OPENNLP-597:
-----------------------------------------

Hi Joern

I agree with you on case two. I checked some more documentation on the TPB-II 
format, and this is a valid form.

Cases one and three already throw an exception. Admittedly, it is a 
NullPointerException rather than a specific parsing exception, but it is an 
exception nevertheless.

I also agree that the best approach is to generate proper parse trees rather 
than incomplete ones; otherwise we only propagate the problem.

So, in light of these new considerations, I would suggest simply closing this 
issue as "won't fix".
Merely replacing one type of exception with another doesn't bring much 
value ...

Your opinion?

Thank you.
Ioan

> Code in tools/parser throws some NullPointerExceptions when dealing with poor 
> training data
> -------------------------------------------------------------------------------------------
>
>                 Key: OPENNLP-597
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-597
>             Project: OpenNLP
>          Issue Type: Bug
>          Components: Parser
>    Affects Versions: tools-1.5.3
>         Environment: Windows 7 + java 1.7.0_21 
>            Reporter: Ioan Barbulescu
>            Priority: Minor
>             Fix For: 1.6.0
>
>         Attachments: tools.patch
>
>
> I was trying to train the Treebank Parser with some new data.
> Truth be told, the data was poorly formatted. Specifically, instead of 
> "(-RRB- -RRB-)", it contained "( -RRB-)".
> The same was true of the -LRB- constructions.
> Because of this input data, the parsing code threw several 
> NullPointerExceptions.
> The fixes consist of some additional "if()" checks to guard against null 
> pointers.
> The fixes are in 3 files, attached as a diff. The diff was created with svn, 
> run in the opennlp-tool/.../parser directory.
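
To illustrate the malformed input described above, here is a minimal Java 
sketch that flags a node such as "( -RRB-)" in the training data before it 
reaches a parser. TreebankLineCheck and its methods are hypothetical names 
made up for this illustration; they are not part of OpenNLP or of the attached 
tools.patch. The sketch assumes one bracketed parse per line, with literal 
parentheses in the text already escaped as -LRB- / -RRB-, and it deliberately 
accepts an unlabeled node that only wraps child nodes, e.g. "( (S ...))".

```java
/** Hypothetical pre-check for bracketed parse lines; not an OpenNLP class. */
final class TreebankLineCheck {

    private TreebankLineCheck() {
    }

    /**
     * Returns the index of the first '(' that opens a node with no label and
     * no child node, such as "( -RRB-)", or -1 if no such node exists.
     */
    static int firstUnlabeledLeaf(String line) {
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) != '(') {
                continue;
            }
            // Skip any whitespace that follows the opening bracket.
            int j = i + 1;
            while (j < line.length() && Character.isWhitespace(line.charAt(j))) {
                j++;
            }
            boolean atEnd = j >= line.length();
            // A label must start immediately after '(' and must not be a bracket.
            boolean hasLabel = !atEnd && j == i + 1
                    && line.charAt(j) != '(' && line.charAt(j) != ')';
            // Assumption: an unlabeled node that wraps a child node is accepted.
            boolean wrapsChild = !atEnd && line.charAt(j) == '(';
            if (!hasLabel && !wrapsChild) {
                return i;
            }
        }
        return -1;
    }

    /** Throws a descriptive exception instead of letting a later step fail. */
    static void requireLabeledLeaves(String line) {
        int pos = firstUnlabeledLeaf(line);
        if (pos >= 0) {
            throw new IllegalArgumentException(
                    "Node without a label at column " + pos + ": " + line);
        }
    }
}
```

Calling TreebankLineCheck.requireLabeledLeaves(line) on each training line 
would report the offending column up front. Whether such a check belongs in 
the parser or in a separate data-cleaning step is a design choice this sketch 
does not settle.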



--
This message was sent by Atlassian JIRA
(v6.1#6144)
