[ https://issues.apache.org/jira/browse/OPENNLP-231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071835#comment-13071835 ]
William Colen commented on OPENNLP-231:
---------------------------------------
The ngram dictionary is created from the sample data. The
POSTaggerCrossValidator class expects an ngram dictionary in its constructor,
but if we build that dictionary from the entire sample and pass it to
POSTaggerCrossValidator, the evaluation would be unfair: the dictionary would
already contain ngrams from the samples held out for testing.
Instead of passing the ngram dictionary, we should pass the cutoff and let the
evaluate method create the dictionary from the training sample only.
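
To illustrate the idea, here is a minimal sketch in plain Java. It does not
use the OpenNLP classes: FoldNGramSketch, buildDictionary and
dictionariesPerFold are hypothetical names, the "dictionary" is just a set of
strings, and the fold split is a simple modulo partition. It only shows that
the dictionary for each fold is built from that fold's training samples and
the cutoff, never from the held-out samples.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch, not the OpenNLP API. */
class FoldNGramSketch {

    /**
     * Counts every n-gram of length 1..maxN in the given sentences and keeps
     * those that occur at least `cutoff` times.
     */
    static Set<String> buildDictionary(List<String[]> sentences, int cutoff, int maxN) {
        Map<String, Integer> counts = new HashMap<>();
        for (String[] tokens : sentences) {
            for (int n = 1; n <= maxN; n++) {
                for (int i = 0; i + n <= tokens.length; i++) {
                    String gram = String.join(" ", Arrays.copyOfRange(tokens, i, i + n));
                    counts.merge(gram, 1, Integer::sum);
                }
            }
        }
        Set<String> dictionary = new HashSet<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() >= cutoff) {
                dictionary.add(entry.getKey());
            }
        }
        return dictionary;
    }

    /**
     * Returns one dictionary per fold. Sample i is held out in fold (i % folds),
     * so it never contributes to that fold's dictionary.
     */
    static List<Set<String>> dictionariesPerFold(
            List<String[]> samples, int folds, int cutoff, int maxN) {
        List<Set<String>> result = new ArrayList<>();
        for (int fold = 0; fold < folds; fold++) {
            List<String[]> training = new ArrayList<>();
            for (int i = 0; i < samples.size(); i++) {
                if (i % folds != fold) {
                    training.add(samples.get(i));
                }
            }
            result.add(buildDictionary(training, cutoff, maxN));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> samples = new ArrayList<>();
        samples.add(new String[] {"the", "cat"});
        samples.add(new String[] {"the", "dog"});
        samples.add(new String[] {"a", "bird"});
        samples.add(new String[] {"the", "fish"});
        // Two folds, cutoff 2, unigrams only.
        System.out.println(dictionariesPerFold(samples, 2, 2, 1));
    }
}
```

In this toy data "the" occurs three times overall, so a dictionary built from
the entire sample with cutoff 2 would always contain it. Built per fold, it
only enters the dictionary when the training part alone reaches the cutoff.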
> POS Tagger cross validator tool is not evaluating models that includes ngram
> dictionaries.
> ------------------------------------------------------------------------------------------
>
> Key: OPENNLP-231
> URL: https://issues.apache.org/jira/browse/OPENNLP-231
> Project: OpenNLP
> Issue Type: Improvement
> Components: Command Line Interface, POS Tagger
> Affects Versions: tools-1.5.2-incubating
> Reporter: William Colen
> Assignee: William Colen
> Priority: Minor
> Fix For: tools-1.5.2-incubating
>
>
> The -ngram parameter is available in the POS Tagger trainer tool, but not in
> the cross validator (CV) tool.