[
https://issues.apache.org/jira/browse/OPENNLP-177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
William Colen resolved OPENNLP-177.
-----------------------------------
Resolution: Fixed
Created the command-line tool:
$ bin/opennlp DoccatCrossValidator
Usage: opennlp DoccatCrossValidator[.leipzig] [-reportOutputFile outputFile]
        [-misclassified true|false] [-folds num] [-params paramsFile] -lang language
        -data sampleData [-encoding charsetName]

Arguments description:
        -reportOutputFile outputFile
                the path of the fine-grained report file.
        -misclassified true|false
                if true will print false negatives and false positives.
        -folds num
                number of folds, default is 10.
        -params paramsFile
                training parameters file.
        -lang language
                language which is being processed.
        -data sampleData
                data to be used, usually a file name.
        -encoding charsetName
                encoding for reading and writing text, if absent the system
                default is used.

The file given by -reportOutputFile will contain the detailed F-Measure for each
category and the confusion matrix.
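
For readers unfamiliar with the terms, here is a minimal Python sketch of what k-fold evaluation with a per-category F-Measure and a confusion matrix involves. It is not the OpenNLP implementation: the function names are hypothetical, and the round-robin way folds are assigned below is an assumption, since this message does not say how the tool partitions the data.

```python
from collections import Counter


def k_fold_splits(samples, k=10):
    """Yield (train, test) lists; fold i tests on every k-th sample from i.

    Round-robin assignment is an assumption made for this sketch.
    """
    for i in range(k):
        test = [s for j, s in enumerate(samples) if j % k == i]
        train = [s for j, s in enumerate(samples) if j % k != i]
        yield train, test


def confusion_matrix(gold, predicted):
    """Count how often each (gold category, predicted category) pair occurs."""
    return Counter(zip(gold, predicted))


def per_category_f_measure(gold, predicted):
    """Return {category: (precision, recall, f1)} from parallel label lists."""
    matrix = confusion_matrix(gold, predicted)
    scores = {}
    for cat in sorted(set(gold) | set(predicted)):
        tp = matrix[(cat, cat)]
        # False positives: predicted as cat, but the gold label differs.
        fp = sum(n for (g, p), n in matrix.items() if p == cat and g != cat)
        # False negatives: gold label is cat, but something else was predicted.
        fn = sum(n for (g, p), n in matrix.items() if g == cat and p != cat)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        scores[cat] = (precision, recall, f1)
    return scores
```

In a full run, a model would be trained on each train list and its predictions on the matching test list collected, so that every sample is scored exactly once by a model that never saw it. That is what makes cross validation usable on a data set without separate test data.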
> Add cross validation support to doccat
> --------------------------------------
>
> Key: OPENNLP-177
> URL: https://issues.apache.org/jira/browse/OPENNLP-177
> Project: OpenNLP
> Issue Type: Improvement
> Components: Doccat
> Reporter: Joern Kottmann
> Assignee: William Colen
>
> Doccat should support cross validation in order to measure the performance on
> a data set without test data.