[
https://issues.apache.org/jira/browse/OPENNLP-820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14942454#comment-14942454
]
Steven Owens commented on OPENNLP-820:
--------------------------------------
As you can see, the part-of-speech tagger doesn't have this problem. A simple
way to fix this issue, and others like it, would be to make the
POSTaggerFactory and ParserChunkerFactory used in training the parser
configurable, either by allowing them to be passed to the train method or by
storing them in some kind of ParserFactory (and moving the HeadRules into that
as well). The parser model could then be retrained using a POSTaggerFactory
like the one used to create the POSTagger model. I can do the code refactoring
work; I just want a second opinion that this is the right solution.
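
To make the proposal concrete, here is a rough, self-contained sketch of the
"ParserFactory" shape. Every type below (TaggerFactory, ChunkerFactory,
HeadRulesLike, ParserFactorySketch, ParserTrainerSketch) is a hypothetical
stand-in invented for illustration, not the existing OpenNLP API; the only
point is that the trainer receives its factories instead of creating defaults
internally.

```java
import java.util.Objects;

// Hypothetical stand-ins for the real factory and head-rule types.
interface TaggerFactory { String name(); }
interface ChunkerFactory { String name(); }
interface HeadRulesLike { String name(); }

// Hypothetical holder bundling everything parser training would need.
final class ParserFactorySketch {
    private final TaggerFactory taggerFactory;
    private final ChunkerFactory chunkerFactory;
    private final HeadRulesLike headRules;

    ParserFactorySketch(TaggerFactory taggerFactory,
                        ChunkerFactory chunkerFactory,
                        HeadRulesLike headRules) {
        this.taggerFactory = Objects.requireNonNull(taggerFactory);
        this.chunkerFactory = Objects.requireNonNull(chunkerFactory);
        this.headRules = Objects.requireNonNull(headRules);
    }

    TaggerFactory taggerFactory() { return taggerFactory; }
    ChunkerFactory chunkerFactory() { return chunkerFactory; }
    HeadRulesLike headRules() { return headRules; }
}

final class ParserTrainerSketch {
    // The trainer uses whatever factories the caller supplies. Here it only
    // reports them; a real implementation would train the sub-models.
    static String train(ParserFactorySketch factory) {
        return "tagger=" + factory.taggerFactory().name()
            + ", chunker=" + factory.chunkerFactory().name()
            + ", headRules=" + factory.headRules().name();
    }
}
```

With this shape, retraining the parser with the same tagger configuration as
the POSTagger model would only mean passing a different TaggerFactory.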
> parser is mistagging quotes
> ---------------------------
>
> Key: OPENNLP-820
> URL: https://issues.apache.org/jira/browse/OPENNLP-820
> Project: OpenNLP
> Issue Type: Bug
> Components: Parser
> Affects Versions: 1.6.0
> Reporter: Steven Owens
> Labels: english
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> The parser is mistagging quotes (both single and double) with the default
> English model. I notice it most on opening quotes, but it also happens to
> closing quotes.
> Example: (TOP (NP (NP-S-NP (NN "))(ADVP-C-NP (RB Here))(. ?)(. "))). Both
> double quotes should be labeled '' (two single quotes).
> The same sentence labeled with the part-of-speech tagger using the default
> English model: "_`` Here_RB ?_. "_''
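
A small, self-contained check along these lines could be used to spot the
problem in parser output such as the bracketed parse quoted above. The class
and method names (QuoteTagCheck, mistaggedQuotes) are made up for this sketch.
It only inspects double-quote tokens, since a lone apostrophe can also be a
possessive marker, and it treats both `` and '' as acceptable quote tags.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuoteTagCheck {
    // Matches a preterminal "(TAG token)" where neither part contains
    // whitespace or parentheses.
    private static final Pattern LEAF =
        Pattern.compile("\\(([^\\s()]+) ([^\\s()]+)\\)");

    // Returns "TAG token" for every double-quote token whose tag is
    // neither `` nor ''.
    static List<String> mistaggedQuotes(String parse) {
        List<String> bad = new ArrayList<>();
        Matcher m = LEAF.matcher(parse);
        while (m.find()) {
            String tag = m.group(1);
            String token = m.group(2);
            boolean isQuote = token.equals("\"");
            boolean hasQuoteTag = tag.equals("``") || tag.equals("''");
            if (isQuote && !hasQuoteTag) {
                bad.add(tag + " " + token);
            }
        }
        return bad;
    }
}
```

Run against the parse from this issue, the check flags both quote tokens (one
tagged NN, one tagged .).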