Repository: opennlp
Updated Branches:
  refs/heads/master 804797c80 -> 31d463b54
No Jira: updated autogenerated CLI documentation

closes apache/opennlp#103

Project: http://git-wip-us.apache.org/repos/asf/opennlp/repo
Commit: http://git-wip-us.apache.org/repos/asf/opennlp/commit/31d463b5
Tree: http://git-wip-us.apache.org/repos/asf/opennlp/tree/31d463b5
Diff: http://git-wip-us.apache.org/repos/asf/opennlp/diff/31d463b5

Branch: refs/heads/master
Commit: 31d463b540f98a1f78ec4c606c4c4a0f925b3045
Parents: 804797c
Author: William D C M SILVA <[email protected]>
Authored: Mon Jan 30 16:53:50 2017 -0200
Committer: William D C M SILVA <[email protected]>
Committed: Mon Jan 30 16:54:38 2017 -0200

----------------------------------------------------------------------
 opennlp-docs/src/docbkx/cli.xml | 527 ++++++++++++++++++++++-------------
 1 file changed, 340 insertions(+), 187 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/opennlp/blob/31d463b5/opennlp-docs/src/docbkx/cli.xml
----------------------------------------------------------------------
diff --git a/opennlp-docs/src/docbkx/cli.xml b/opennlp-docs/src/docbkx/cli.xml
index 5227f47..3dc66b7 100644
--- a/opennlp-docs/src/docbkx/cli.xml
+++ b/opennlp-docs/src/docbkx/cli.xml
@@ -113,15 +113,15 @@ Arguments description:
 <screen>
 <![CDATA[
-Usage: opennlp DoccatEvaluator[.leipzig] [-reportOutputFile outputFile] [-misclassified true|false] -model
-        model -data sampleData [-encoding charsetName]
+Usage: opennlp DoccatEvaluator[.leipzig] [-misclassified true|false] -model model [-reportOutputFile
+        outputFile] -data sampleData [-encoding charsetName]
 Arguments description:
-        -reportOutputFile outputFile
-                the path of the fine-grained report file.
         -misclassified true|false
                 if true will print false negatives and false positives.
         -model model
                 the model file to be evaluated.
+        -reportOutputFile outputFile
+                the path of the fine-grained report file.
         -data sampleData
                 data to be used, usually a file name.
         -encoding charsetName
@@ -160,16 +160,14 @@ Arguments description:
 <screen>
 <![CDATA[
-Usage: opennlp DoccatCrossValidator[.leipzig] [-reportOutputFile outputFile] [-misclassified true|false]
-        [-folds num] [-factory factoryName] [-tokenizer tokenizer] [-featureGenerators fg] [-params
-        paramsFile] -lang language -data sampleData [-encoding charsetName]
+Usage: opennlp DoccatCrossValidator[.leipzig] [-folds num] [-misclassified true|false] [-factory factoryName]
+        [-tokenizer tokenizer] [-featureGenerators fg] [-params paramsFile] -lang language [-reportOutputFile
+        outputFile] -data sampleData [-encoding charsetName]
 Arguments description:
-        -reportOutputFile outputFile
-                the path of the fine-grained report file.
-        -misclassified true|false
-                if true will print false negatives and false positives.
         -folds num
                 number of folds, default is 10.
+        -misclassified true|false
+                if true will print false negatives and false positives.
         -factory factoryName
                 A sub-class of DoccatFactory where to get implementation and resources.
         -tokenizer tokenizer
@@ -180,6 +178,8 @@ Arguments description:
                 training parameters file.
         -lang language
                 language which is being processed.
+        -reportOutputFile outputFile
+                the path of the fine-grained report file.
         -data sampleData
                 data to be used, usually a file name.
         -encoding charsetName
@@ -351,18 +351,18 @@ Arguments description:
 <entry>Encoding for reading and writing text, if absent the system default is used.</entry>
 </row>
 <row>
-<entry>splitHyphenatedTokens</entry>
-<entry>split</entry>
-<entry>Yes</entry>
-<entry>If true all hyphenated tokens will be separated (default true)</entry>
-</row>
-<row>
 <entry>lang</entry>
 <entry>language</entry>
 <entry>No</entry>
 <entry>Language which is being processed.</entry>
 </row>
 <row>
+<entry>splitHyphenatedTokens</entry>
+<entry>split</entry>
+<entry>Yes</entry>
+<entry>If true all hyphenated tokens will be separated (default true)</entry>
+</row>
+<row>
 <entry>data</entry>
 <entry>sampleData</entry>
 <entry>No</entry>
@@ -490,18 +490,18 @@ Arguments description:
 <entry>Encoding for reading and writing text, if absent the system default is used.</entry>
 </row>
 <row>
-<entry>splitHyphenatedTokens</entry>
-<entry>split</entry>
-<entry>Yes</entry>
-<entry>If true all hyphenated tokens will be separated (default true)</entry>
-</row>
-<row>
 <entry>lang</entry>
 <entry>language</entry>
 <entry>No</entry>
 <entry>Language which is being processed.</entry>
 </row>
 <row>
+<entry>splitHyphenatedTokens</entry>
+<entry>split</entry>
+<entry>Yes</entry>
+<entry>If true all hyphenated tokens will be separated (default true)</entry>
+</row>
+<row>
 <entry>data</entry>
 <entry>sampleData</entry>
 <entry>No</entry>
@@ -602,14 +602,14 @@ Arguments description:
 <screen>
 <![CDATA[
-Usage: opennlp TokenizerCrossValidator[.ad|.pos|.conllx|.namefinder|.parse] [-misclassified true|false]
-        [-folds num] [-factory factoryName] [-abbDict path] [-alphaNumOpt isAlphaNumOpt] [-params paramsFile]
+Usage: opennlp TokenizerCrossValidator[.ad|.pos|.conllx|.namefinder|.parse] [-folds num] [-misclassified
+        true|false] [-factory factoryName] [-abbDict path] [-alphaNumOpt isAlphaNumOpt] [-params paramsFile]
         -lang language -data sampleData [-encoding charsetName]
 Arguments description:
-        -misclassified true|false
-                if true will print false negatives and false positives.
         -folds num
                 number of folds, default is 10.
+        -misclassified true|false
+                if true will print false negatives and false positives.
         -factory factoryName
                 A sub-class of TokenizerFactory where to get implementation and resources.
         -abbDict path
@@ -640,18 +640,18 @@ Arguments description:
 <entry>Encoding for reading and writing text, if absent the system default is used.</entry>
 </row>
 <row>
-<entry>splitHyphenatedTokens</entry>
-<entry>split</entry>
-<entry>Yes</entry>
-<entry>If true all hyphenated tokens will be separated (default true)</entry>
-</row>
-<row>
 <entry>lang</entry>
 <entry>language</entry>
 <entry>No</entry>
 <entry>Language which is being processed.</entry>
 </row>
 <row>
+<entry>splitHyphenatedTokens</entry>
+<entry>split</entry>
+<entry>Yes</entry>
+<entry>If true all hyphenated tokens will be separated (default true)</entry>
+</row>
+<row>
 <entry>data</entry>
 <entry>sampleData</entry>
 <entry>No</entry>
@@ -769,18 +769,18 @@ Usage: opennlp TokenizerConverter help|ad|pos|conllx|namefinder|parse [help|opti
 <entry>Encoding for reading and writing text, if absent the system default is used.</entry>
 </row>
 <row>
-<entry>splitHyphenatedTokens</entry>
-<entry>split</entry>
-<entry>Yes</entry>
-<entry>If true all hyphenated tokens will be separated (default true)</entry>
-</row>
-<row>
 <entry>lang</entry>
 <entry>language</entry>
 <entry>No</entry>
 <entry>Language which is being processed.</entry>
 </row>
 <row>
+<entry>splitHyphenatedTokens</entry>
+<entry>split</entry>
+<entry>Yes</entry>
+<entry>If true all hyphenated tokens will be separated (default true)</entry>
+</row>
+<row>
 <entry>data</entry>
 <entry>sampleData</entry>
 <entry>No</entry>
@@ -915,9 +915,9 @@ Usage: opennlp SentenceDetector model < sentences
 <screen>
 <![CDATA[
-Usage: opennlp SentenceDetectorTrainer[.ad|.pos|.conllx|.namefinder|.parse] [-factory factoryName] [-eosChars
-        string] [-abbDict path] [-params paramsFile] -lang language -model modelFile -data sampleData
-        [-encoding charsetName]
+Usage: opennlp SentenceDetectorTrainer[.ad|.pos|.conllx|.namefinder|.parse|.moses|.letsmt] [-factory
+        factoryName] [-eosChars string] [-abbDict path] [-params paramsFile] -lang language -model modelFile
+        -data sampleData [-encoding charsetName]
 Arguments description:
         -factory factoryName
                 A sub-class of SentenceDetectorFactory where to get implementation and resources.
@@ -951,18 +951,18 @@ Arguments description:
 <entry>Encoding for reading and writing text.</entry>
 </row>
 <row>
-<entry>includeTitles</entry>
-<entry>includeTitles</entry>
-<entry>Yes</entry>
-<entry>If true will include sentences marked as headlines.</entry>
-</row>
-<row>
 <entry>lang</entry>
 <entry>language</entry>
 <entry>No</entry>
 <entry>Language which is being processed.</entry>
 </row>
 <row>
+<entry>includeTitles</entry>
+<entry>includeTitles</entry>
+<entry>Yes</entry>
+<entry>If true will include sentences marked as headlines.</entry>
+</row>
+<row>
 <entry>data</entry>
 <entry>sampleData</entry>
 <entry>No</entry>
@@ -1044,6 +1044,38 @@ Arguments description:
 <entry>No</entry>
 <entry>Specifies the file with detokenizer dictionary.</entry>
 </row>
+<row>
+<entry morerows='1' valign='middle'>moses</entry>
+<entry>data</entry>
+<entry>sampleData</entry>
+<entry>No</entry>
+<entry>Data to be used, usually a file name.</entry>
+</row>
+<row>
+<entry>encoding</entry>
+<entry>charsetName</entry>
+<entry>Yes</entry>
+<entry>Encoding for reading and writing text, if absent the system default is used.</entry>
+</row>
+<row>
+<entry morerows='2' valign='middle'>letsmt</entry>
+<entry>detokenizer</entry>
+<entry>dictionary</entry>
+<entry>Yes</entry>
+<entry>Specifies the file with detokenizer dictionary.</entry>
+</row>
+<row>
+<entry>data</entry>
+<entry>sampleData</entry>
+<entry>No</entry>
+<entry>Data to be used, usually a file name.</entry>
+</row>
+<row>
+<entry>encoding</entry>
+<entry>charsetName</entry>
+<entry>Yes</entry>
+<entry>Encoding for reading and writing text, if absent the system default is used.</entry>
+</row>
 </tbody>
 </tgroup></informaltable>
@@ -1057,8 +1089,8 @@ Arguments description:
 <screen>
 <![CDATA[
-Usage: opennlp SentenceDetectorEvaluator[.ad|.pos|.conllx|.namefinder|.parse] [-misclassified true|false]
-        -model model -data sampleData [-encoding charsetName]
+Usage: opennlp SentenceDetectorEvaluator[.ad|.pos|.conllx|.namefinder|.parse|.moses|.letsmt] [-misclassified
+        true|false] -model model -data sampleData [-encoding charsetName]
 Arguments description:
         -misclassified true|false
                 if true will print false negatives and false positives.
@@ -1084,18 +1116,18 @@ Arguments description:
 <entry>Encoding for reading and writing text.</entry>
 </row>
 <row>
-<entry>includeTitles</entry>
-<entry>includeTitles</entry>
-<entry>Yes</entry>
-<entry>If true will include sentences marked as headlines.</entry>
-</row>
-<row>
 <entry>lang</entry>
 <entry>language</entry>
 <entry>No</entry>
 <entry>Language which is being processed.</entry>
 </row>
 <row>
+<entry>includeTitles</entry>
+<entry>includeTitles</entry>
+<entry>Yes</entry>
+<entry>If true will include sentences marked as headlines.</entry>
+</row>
+<row>
 <entry>data</entry>
 <entry>sampleData</entry>
 <entry>No</entry>
@@ -1177,6 +1209,38 @@ Arguments description:
 <entry>No</entry>
 <entry>Specifies the file with detokenizer dictionary.</entry>
 </row>
+<row>
+<entry morerows='1' valign='middle'>moses</entry>
+<entry>data</entry>
+<entry>sampleData</entry>
+<entry>No</entry>
+<entry>Data to be used, usually a file name.</entry>
+</row>
+<row>
+<entry>encoding</entry>
+<entry>charsetName</entry>
+<entry>Yes</entry>
+<entry>Encoding for reading and writing text, if absent the system default is used.</entry>
+</row>
+<row>
+<entry morerows='2' valign='middle'>letsmt</entry>
+<entry>detokenizer</entry>
+<entry>dictionary</entry>
+<entry>Yes</entry>
+<entry>Specifies the file with detokenizer dictionary.</entry>
+</row>
+<row>
+<entry>data</entry>
+<entry>sampleData</entry>
+<entry>No</entry>
+<entry>Data to be used, usually a file name.</entry>
+</row>
+<row>
+<entry>encoding</entry>
+<entry>charsetName</entry>
+<entry>Yes</entry>
+<entry>Encoding for reading and writing text, if absent the system default is used.</entry>
+</row>
 </tbody>
 </tgroup></informaltable>
@@ -1190,9 +1254,9 @@ Arguments description:
 <screen>
 <![CDATA[
-Usage: opennlp SentenceDetectorCrossValidator[.ad|.pos|.conllx|.namefinder|.parse] [-factory factoryName]
-        [-eosChars string] [-abbDict path] [-params paramsFile] -lang language [-misclassified true|false]
-        [-folds num] -data sampleData [-encoding charsetName]
+Usage: opennlp SentenceDetectorCrossValidator[.ad|.pos|.conllx|.namefinder|.parse|.moses|.letsmt] [-factory
+        factoryName] [-eosChars string] [-abbDict path] [-params paramsFile] -lang language [-folds num]
+        [-misclassified true|false] -data sampleData [-encoding charsetName]
 Arguments description:
         -factory factoryName
                 A sub-class of SentenceDetectorFactory where to get implementation and resources.
@@ -1204,10 +1268,10 @@ Arguments description:
                 training parameters file.
         -lang language
                 language which is being processed.
-        -misclassified true|false
-                if true will print false negatives and false positives.
         -folds num
                 number of folds, default is 10.
+        -misclassified true|false
+                if true will print false negatives and false positives.
         -data sampleData
                 data to be used, usually a file name.
         -encoding charsetName
@@ -1228,18 +1292,18 @@ Arguments description:
 <entry>Encoding for reading and writing text.</entry>
 </row>
 <row>
-<entry>includeTitles</entry>
-<entry>includeTitles</entry>
-<entry>Yes</entry>
-<entry>If true will include sentences marked as headlines.</entry>
-</row>
-<row>
 <entry>lang</entry>
 <entry>language</entry>
 <entry>No</entry>
 <entry>Language which is being processed.</entry>
 </row>
 <row>
+<entry>includeTitles</entry>
+<entry>includeTitles</entry>
+<entry>Yes</entry>
+<entry>If true will include sentences marked as headlines.</entry>
+</row>
+<row>
 <entry>data</entry>
 <entry>sampleData</entry>
 <entry>No</entry>
@@ -1321,6 +1385,38 @@ Arguments description:
 <entry>No</entry>
 <entry>Specifies the file with detokenizer dictionary.</entry>
 </row>
+<row>
+<entry morerows='1' valign='middle'>moses</entry>
+<entry>data</entry>
+<entry>sampleData</entry>
+<entry>No</entry>
+<entry>Data to be used, usually a file name.</entry>
+</row>
+<row>
+<entry>encoding</entry>
+<entry>charsetName</entry>
+<entry>Yes</entry>
+<entry>Encoding for reading and writing text, if absent the system default is used.</entry>
+</row>
+<row>
+<entry morerows='2' valign='middle'>letsmt</entry>
+<entry>detokenizer</entry>
+<entry>dictionary</entry>
+<entry>Yes</entry>
+<entry>Specifies the file with detokenizer dictionary.</entry>
+</row>
+<row>
+<entry>data</entry>
+<entry>sampleData</entry>
+<entry>No</entry>
+<entry>Data to be used, usually a file name.</entry>
+</row>
+<row>
+<entry>encoding</entry>
+<entry>charsetName</entry>
+<entry>Yes</entry>
+<entry>Encoding for reading and writing text, if absent the system default is used.</entry>
+</row>
 </tbody>
 </tgroup></informaltable>
@@ -1330,11 +1426,11 @@ Arguments description:
 <title>SentenceDetectorConverter</title>
-<para>Converts foreign data formats (ad,pos,conllx,namefinder,parse) to native OpenNLP format</para>
+<para>Converts foreign data formats (ad,pos,conllx,namefinder,parse,moses,letsmt) to native OpenNLP format</para>
 <screen>
 <![CDATA[
-Usage: opennlp SentenceDetectorConverter help|ad|pos|conllx|namefinder|parse [help|options...]
+Usage: opennlp SentenceDetectorConverter help|ad|pos|conllx|namefinder|parse|moses|letsmt [help|options...]
 ]]>
 </screen>
@@ -1351,18 +1447,18 @@ Usage: opennlp SentenceDetectorConverter help|ad|pos|conllx|namefinder|parse [he
 <entry>Encoding for reading and writing text.</entry>
 </row>
 <row>
-<entry>includeTitles</entry>
-<entry>includeTitles</entry>
-<entry>Yes</entry>
-<entry>If true will include sentences marked as headlines.</entry>
-</row>
-<row>
 <entry>lang</entry>
 <entry>language</entry>
 <entry>No</entry>
 <entry>Language which is being processed.</entry>
 </row>
 <row>
+<entry>includeTitles</entry>
+<entry>includeTitles</entry>
+<entry>Yes</entry>
+<entry>If true will include sentences marked as headlines.</entry>
+</row>
+<row>
 <entry>data</entry>
 <entry>sampleData</entry>
 <entry>No</entry>
@@ -1444,6 +1540,38 @@ Usage: opennlp SentenceDetectorConverter help|ad|pos|conllx|namefinder|parse [he
 <entry>No</entry>
 <entry>Specifies the file with detokenizer dictionary.</entry>
 </row>
+<row>
+<entry morerows='1' valign='middle'>moses</entry>
+<entry>data</entry>
+<entry>sampleData</entry>
+<entry>No</entry>
+<entry>Data to be used, usually a file name.</entry>
+</row>
+<row>
+<entry>encoding</entry>
+<entry>charsetName</entry>
+<entry>Yes</entry>
+<entry>Encoding for reading and writing text, if absent the system default is used.</entry>
+</row>
+<row>
+<entry morerows='2' valign='middle'>letsmt</entry>
+<entry>detokenizer</entry>
+<entry>dictionary</entry>
+<entry>Yes</entry>
+<entry>Specifies the file with detokenizer dictionary.</entry>
+</row>
+<row>
+<entry>data</entry>
+<entry>sampleData</entry>
+<entry>No</entry>
+<entry>Data to be used, usually a file name.</entry>
+</row>
+<row>
+<entry>encoding</entry>
+<entry>charsetName</entry>
+<entry>Yes</entry>
+<entry>Encoding for reading and writing text, if absent the system default is used.</entry>
+</row>
 </tbody>
 </tgroup></informaltable>
@@ -1545,18 +1673,18 @@ Arguments description:
 <entry>Encoding for reading and writing text, if absent the system default is used.</entry>
 </row>
 <row>
-<entry>splitHyphenatedTokens</entry>
-<entry>split</entry>
-<entry>Yes</entry>
-<entry>If true all hyphenated tokens will be separated (default true)</entry>
-</row>
-<row>
 <entry>lang</entry>
 <entry>language</entry>
 <entry>No</entry>
 <entry>Language which is being processed.</entry>
 </row>
 <row>
+<entry>splitHyphenatedTokens</entry>
+<entry>split</entry>
+<entry>Yes</entry>
+<entry>If true all hyphenated tokens will be separated (default true)</entry>
+</row>
+<row>
 <entry>data</entry>
 <entry>sampleData</entry>
 <entry>No</entry>
@@ -1708,8 +1836,8 @@ Arguments description:
 <screen>
 <![CDATA[
 Usage: opennlp TokenNameFinderEvaluator[.evalita|.ad|.conll03|.bionlp2004|.conll02|.muc6|.ontonotes|.brat]
-        [-nameTypes types] [-misclassified true|false] -model model [-detailedF true|false] -data sampleData
-        [-encoding charsetName]
+        [-nameTypes types] [-misclassified true|false] -model model [-detailedF true|false]
+        [-reportOutputFile outputFile] -data sampleData [-encoding charsetName]
 Arguments description:
         -nameTypes types
                 name types to use for evaluation
@@ -1719,6 +1847,8 @@ Arguments description:
                 the model file to be evaluated.
         -detailedF true|false
                 if true will print detailed FMeasure results.
+        -reportOutputFile outputFile
+                the path of the fine-grained report file.
         -data sampleData
                 data to be used, usually a file name.
         -encoding charsetName
@@ -1764,18 +1894,18 @@ Arguments description:
 <entry>Encoding for reading and writing text, if absent the system default is used.</entry>
 </row>
 <row>
-<entry>splitHyphenatedTokens</entry>
-<entry>split</entry>
-<entry>Yes</entry>
-<entry>If true all hyphenated tokens will be separated (default true)</entry>
-</row>
-<row>
 <entry>lang</entry>
 <entry>language</entry>
 <entry>No</entry>
 <entry>Language which is being processed.</entry>
 </row>
 <row>
+<entry>splitHyphenatedTokens</entry>
+<entry>split</entry>
+<entry>Yes</entry>
+<entry>If true all hyphenated tokens will be separated (default true)</entry>
+</row>
+<row>
 <entry>data</entry>
 <entry>sampleData</entry>
 <entry>No</entry>
@@ -1929,8 +2059,9 @@ Arguments description:
 Usage: opennlp TokenNameFinderCrossValidator[.evalita|.ad|.conll03|.bionlp2004|.conll02|.muc6|.ontonotes|.brat]
         [-factory factoryName] [-resources resourcesDir] [-type modelType] [-featuregen featuregenFile]
-        [-nameTypes types] [-sequenceCodec codec] [-params paramsFile] -lang language [-misclassified
-        true|false] [-folds num] [-detailedF true|false] -data sampleData [-encoding charsetName]
+        [-nameTypes types] [-sequenceCodec codec] [-params paramsFile] -lang language [-folds num]
+        [-misclassified true|false] [-detailedF true|false] [-reportOutputFile outputFile] -data sampleData
+        [-encoding charsetName]
 Arguments description:
         -factory factoryName
                 A sub-class of TokenNameFinderFactory
@@ -1948,12 +2079,14 @@ Arguments description:
                 training parameters file.
         -lang language
                 language which is being processed.
-        -misclassified true|false
-                if true will print false negatives and false positives.
         -folds num
                 number of folds, default is 10.
+        -misclassified true|false
+                if true will print false negatives and false positives.
         -detailedF true|false
                 if true will print detailed FMeasure results.
+        -reportOutputFile outputFile
+                the path of the fine-grained report file.
         -data sampleData
                 data to be used, usually a file name.
         -encoding charsetName
@@ -1999,18 +2132,18 @@ Arguments description:
 <entry>Encoding for reading and writing text, if absent the system default is used.</entry>
 </row>
 <row>
-<entry>splitHyphenatedTokens</entry>
-<entry>split</entry>
-<entry>Yes</entry>
-<entry>If true all hyphenated tokens will be separated (default true)</entry>
-</row>
-<row>
 <entry>lang</entry>
 <entry>language</entry>
 <entry>No</entry>
 <entry>Language which is being processed.</entry>
 </row>
 <row>
+<entry>splitHyphenatedTokens</entry>
+<entry>split</entry>
+<entry>Yes</entry>
+<entry>If true all hyphenated tokens will be separated (default true)</entry>
+</row>
+<row>
 <entry>data</entry>
 <entry>sampleData</entry>
 <entry>No</entry>
@@ -2203,18 +2336,18 @@ Usage: opennlp TokenNameFinderConverter help|evalita|ad|conll03|bionlp2004|conll
 <entry>Encoding for reading and writing text, if absent the system default is used.</entry>
 </row>
 <row>
-<entry>splitHyphenatedTokens</entry>
-<entry>split</entry>
-<entry>Yes</entry>
-<entry>If true all hyphenated tokens will be separated (default true)</entry>
-</row>
-<row>
 <entry>lang</entry>
 <entry>language</entry>
 <entry>No</entry>
 <entry>Language which is being processed.</entry>
 </row>
 <row>
+<entry>splitHyphenatedTokens</entry>
+<entry>split</entry>
+<entry>Yes</entry>
+<entry>If true all hyphenated tokens will be separated (default true)</entry>
+</row>
+<row>
 <entry>data</entry>
 <entry>sampleData</entry>
 <entry>No</entry>
@@ -2446,6 +2579,12 @@ Arguments description:
 <entry>Encoding for reading and writing text, if absent the system default is used.</entry>
 </row>
 <row>
+<entry>lang</entry>
+<entry>language</entry>
+<entry>No</entry>
+<entry>Language which is being processed.</entry>
+</row>
+<row>
 <entry>expandME</entry>
 <entry>expandME</entry>
 <entry>Yes</entry>
@@ -2458,12 +2597,6 @@ Arguments description:
 <entry>Combine POS Tags with word features, like number and gender.</entry>
 </row>
 <row>
-<entry>lang</entry>
-<entry>language</entry>
-<entry>No</entry>
-<entry>Language which is being processed.</entry>
-</row>
-<row>
 <entry>data</entry>
 <entry>sampleData</entry>
 <entry>No</entry>
@@ -2515,15 +2648,15 @@ Arguments description:
 <screen>
 <![CDATA[
-Usage: opennlp POSTaggerEvaluator[.ad|.conllx|.parse|.ontonotes] [-reportOutputFile outputFile]
-        [-misclassified true|false] -model model -data sampleData [-encoding charsetName]
+Usage: opennlp POSTaggerEvaluator[.ad|.conllx|.parse|.ontonotes] [-misclassified true|false] -model model
+        [-reportOutputFile outputFile] -data sampleData [-encoding charsetName]
 Arguments description:
-        -reportOutputFile outputFile
-                the path of the fine-grained report file.
         -misclassified true|false
                 if true will print false negatives and false positives.
         -model model
                 the model file to be evaluated.
+        -reportOutputFile outputFile
+                the path of the fine-grained report file.
         -data sampleData
                 data to be used, usually a file name.
         -encoding charsetName
@@ -2544,6 +2677,12 @@ Arguments description:
 <entry>Encoding for reading and writing text, if absent the system default is used.</entry>
 </row>
 <row>
+<entry>lang</entry>
+<entry>language</entry>
+<entry>No</entry>
+<entry>Language which is being processed.</entry>
+</row>
+<row>
 <entry>expandME</entry>
 <entry>expandME</entry>
 <entry>Yes</entry>
@@ -2556,12 +2695,6 @@ Arguments description:
 <entry>Combine POS Tags with word features, like number and gender.</entry>
 </row>
 <row>
-<entry>lang</entry>
-<entry>language</entry>
-<entry>No</entry>
-<entry>Language which is being processed.</entry>
-</row>
-<row>
 <entry>data</entry>
 <entry>sampleData</entry>
 <entry>No</entry>
@@ -2613,17 +2746,15 @@ Arguments description:
 <screen>
 <![CDATA[
-Usage: opennlp POSTaggerCrossValidator[.ad|.conllx|.parse|.ontonotes] [-reportOutputFile outputFile]
-        [-misclassified true|false] [-folds num] [-factory factoryName] [-type
-        maxent|perceptron|perceptron_sequence] [-dict dictionaryPath] [-ngram cutoff] [-tagDictCutoff
-        tagDictCutoff] [-params paramsFile] -lang language -data sampleData [-encoding charsetName]
+Usage: opennlp POSTaggerCrossValidator[.ad|.conllx|.parse|.ontonotes] [-folds num] [-misclassified
+        true|false] [-factory factoryName] [-type maxent|perceptron|perceptron_sequence] [-dict
+        dictionaryPath] [-ngram cutoff] [-tagDictCutoff tagDictCutoff] [-params paramsFile] -lang language
+        [-reportOutputFile outputFile] -data sampleData [-encoding charsetName]
 Arguments description:
-        -reportOutputFile outputFile
-                the path of the fine-grained report file.
-        -misclassified true|false
-                if true will print false negatives and false positives.
         -folds num
                 number of folds, default is 10.
+        -misclassified true|false
+                if true will print false negatives and false positives.
         -factory factoryName
                 A sub-class of POSTaggerFactory where to get implementation and resources.
         -type maxent|perceptron|perceptron_sequence
@@ -2638,6 +2769,8 @@ Arguments description:
                 training parameters file.
         -lang language
                 language which is being processed.
+        -reportOutputFile outputFile
+                the path of the fine-grained report file.
         -data sampleData
                 data to be used, usually a file name.
         -encoding charsetName
@@ -2658,6 +2791,12 @@ Arguments description:
 <entry>Encoding for reading and writing text, if absent the system default is used.</entry>
 </row>
 <row>
+<entry>lang</entry>
+<entry>language</entry>
+<entry>No</entry>
+<entry>Language which is being processed.</entry>
+</row>
+<row>
 <entry>expandME</entry>
 <entry>expandME</entry>
 <entry>Yes</entry>
@@ -2670,12 +2809,6 @@ Arguments description:
 <entry>Combine POS Tags with word features, like number and gender.</entry>
 </row>
 <row>
-<entry>lang</entry>
-<entry>language</entry>
-<entry>No</entry>
-<entry>Language which is being processed.</entry>
-</row>
-<row>
 <entry>data</entry>
 <entry>sampleData</entry>
 <entry>No</entry>
@@ -2744,6 +2877,12 @@ Usage: opennlp POSTaggerConverter help|ad|conllx|parse|ontonotes [help|options..
 <entry>Encoding for reading and writing text, if absent the system default is used.</entry>
 </row>
 <row>
+<entry>lang</entry>
+<entry>language</entry>
+<entry>No</entry>
+<entry>Language which is being processed.</entry>
+</row>
+<row>
 <entry>expandME</entry>
 <entry>expandME</entry>
 <entry>Yes</entry>
@@ -2756,12 +2895,6 @@ Usage: opennlp POSTaggerConverter help|ad|conllx|parse|ontonotes [help|options..
 <entry>Combine POS Tags with word features, like number and gender.</entry>
 </row>
 <row>
-<entry>lang</entry>
-<entry>language</entry>
-<entry>No</entry>
-<entry>Language which is being processed.</entry>
-</row>
-<row>
 <entry>data</entry>
 <entry>sampleData</entry>
 <entry>No</entry>
@@ -2869,15 +3002,15 @@ Arguments description:
 <screen>
 <![CDATA[
-Usage: opennlp LemmatizerEvaluator [-reportOutputFile outputFile] [-misclassified true|false] -model model
+Usage: opennlp LemmatizerEvaluator [-misclassified true|false] -model model [-reportOutputFile outputFile]
         -data sampleData [-encoding charsetName]
 Arguments description:
-        -reportOutputFile outputFile
-                the path of the fine-grained report file.
         -misclassified true|false
                 if true will print false negatives and false positives.
         -model model
                 the model file to be evaluated.
+        -reportOutputFile outputFile
+                the path of the fine-grained report file.
         -data sampleData
                 data to be used, usually a file name.
         -encoding charsetName
@@ -2960,10 +3093,10 @@ Arguments description:
 <entry>Language which is being processed.</entry>
 </row>
 <row>
-<entry>data</entry>
-<entry>sampleData</entry>
-<entry>No</entry>
-<entry>Data to be used, usually a file name.</entry>
+<entry>start</entry>
+<entry>start</entry>
+<entry>Yes</entry>
+<entry>Index of first sentence</entry>
 </row>
 <row>
 <entry>end</entry>
@@ -2972,10 +3105,10 @@ Arguments description:
 <entry>Index of last sentence</entry>
 </row>
 <row>
-<entry>start</entry>
-<entry>start</entry>
-<entry>Yes</entry>
-<entry>Index of first sentence</entry>
+<entry>data</entry>
+<entry>sampleData</entry>
+<entry>No</entry>
+<entry>Data to be used, usually a file name.</entry>
 </row>
 </tbody>
 </tgroup></informaltable>
@@ -3025,10 +3158,10 @@ Arguments description:
 <entry>Language which is being processed.</entry>
 </row>
 <row>
-<entry>data</entry>
-<entry>sampleData</entry>
-<entry>No</entry>
-<entry>Data to be used, usually a file name.</entry>
+<entry>start</entry>
+<entry>start</entry>
+<entry>Yes</entry>
+<entry>Index of first sentence</entry>
 </row>
 <row>
 <entry>end</entry>
@@ -3037,10 +3170,10 @@ Arguments description:
 <entry>Index of last sentence</entry>
 </row>
 <row>
-<entry>start</entry>
-<entry>start</entry>
-<entry>Yes</entry>
-<entry>Index of first sentence</entry>
+<entry>data</entry>
+<entry>sampleData</entry>
+<entry>No</entry>
+<entry>Data to be used, usually a file name.</entry>
 </row>
 </tbody>
 </tgroup></informaltable>
@@ -3055,9 +3188,8 @@ Arguments description:
 <screen>
 <![CDATA[
-Usage: opennlp ChunkerCrossValidator[.ad] [-factory factoryName] [-params paramsFile] -lang language
-        [-misclassified true|false] [-folds num] [-detailedF true|false] -data sampleData [-encoding
-        charsetName]
+Usage: opennlp ChunkerCrossValidator[.ad] [-factory factoryName] [-params paramsFile] -lang language [-folds
+        num] [-misclassified true|false] [-detailedF true|false] -data sampleData [-encoding charsetName]
 Arguments description:
         -factory factoryName
                 A sub-class of ChunkerFactory where to get implementation and resources.
@@ -3065,10 +3197,10 @@ Arguments description:
                 training parameters file.
         -lang language
                 language which is being processed.
-        -misclassified true|false
-                if true will print false negatives and false positives.
         -folds num
                 number of folds, default is 10.
+        -misclassified true|false
+                if true will print false negatives and false positives.
         -detailedF true|false
                 if true will print detailed FMeasure results.
         -data sampleData
@@ -3097,10 +3229,10 @@ Arguments description:
 <entry>Language which is being processed.</entry>
 </row>
 <row>
-<entry>data</entry>
-<entry>sampleData</entry>
-<entry>No</entry>
-<entry>Data to be used, usually a file name.</entry>
+<entry>start</entry>
+<entry>start</entry>
+<entry>Yes</entry>
+<entry>Index of first sentence</entry>
 </row>
 <row>
 <entry>end</entry>
@@ -3109,10 +3241,10 @@ Arguments description:
 <entry>Index of last sentence</entry>
 </row>
 <row>
-<entry>start</entry>
-<entry>start</entry>
-<entry>Yes</entry>
-<entry>Index of first sentence</entry>
+<entry>data</entry>
+<entry>sampleData</entry>
+<entry>No</entry>
+<entry>Data to be used, usually a file name.</entry>
 </row>
 </tbody>
 </tgroup></informaltable>
@@ -3150,10 +3282,10 @@ Usage: opennlp ChunkerConverter help|ad [help|options...]
 <entry>Language which is being processed.</entry>
 </row>
 <row>
-<entry>data</entry>
-<entry>sampleData</entry>
-<entry>No</entry>
-<entry>Data to be used, usually a file name.</entry>
+<entry>start</entry>
+<entry>start</entry>
+<entry>Yes</entry>
+<entry>Index of first sentence</entry>
 </row>
 <row>
 <entry>end</entry>
@@ -3162,10 +3294,10 @@ Usage: opennlp ChunkerConverter help|ad [help|options...]
 <entry>Index of last sentence</entry>
 </row>
 <row>
-<entry>start</entry>
-<entry>start</entry>
-<entry>Yes</entry>
-<entry>Index of first sentence</entry>
+<entry>data</entry>
+<entry>sampleData</entry>
+<entry>No</entry>
+<entry>Data to be used, usually a file name.</entry>
 </row>
 </tbody>
 </tgroup></informaltable>
@@ -3186,10 +3318,11 @@ Usage: opennlp ChunkerConverter help|ad [help|options...]
 <screen>
 <![CDATA[
-Usage: opennlp Parser [-bs n -ap n -k n] model < sentences
+Usage: opennlp Parser [-bs n -ap n -k n -tk tok_model] model < sentences
 -bs n: Use a beam size of n.
 -ap f: Advance outcomes in with at least f% of the probability mass.
 -k n: Show the top n parses. This will also display their log-probablities.
+-tk tok_model: Use the specified tokenizer model to tokenize the sentences. Defaults to a WhitespaceTokenizer.
 ]]>
 </screen>
@@ -3203,18 +3336,18 @@ Usage: opennlp Parser [-bs n -ap n -k n] model < sentences
 <screen>
 <![CDATA[
-Usage: opennlp ParserTrainer[.ontonotes|.frenchtreebank] [-fun true|false] [-headRulesSerializerImpl
-        className] -headRules headRulesFile [-parserType CHUNKING|TREEINSERT] [-params paramsFile] -lang
-        language -model modelFile [-encoding charsetName] -data sampleData
+Usage: opennlp ParserTrainer[.ontonotes|.frenchtreebank] [-headRulesSerializerImpl className] -headRules
+        headRulesFile [-parserType CHUNKING|TREEINSERT] [-fun true|false] [-params paramsFile] -lang language
+        -model modelFile [-encoding charsetName] -data sampleData
 Arguments description:
-        -fun true|false
-                Learn to generate function tags.
         -headRulesSerializerImpl className
                 head rules artifact serializer class name
         -headRules headRulesFile
                 head rules file.
         -parserType CHUNKING|TREEINSERT
                 one of CHUNKING or TREEINSERT, default is CHUNKING.
+        -fun true|false
+                Learn to generate function tags.
         -params paramsFile
                 training parameters file.
         -lang language
@@ -3496,6 +3629,26 @@ Usage: opennlp EntityLinker model < sentences
 </section>
+<section id='tools.cli.languagemodel'>
+
+<title>Languagemodel</title>
+
+<section id='tools.cli.languagemodel.LanguageModel'>
+
+<title>LanguageModel</title>
+
+<para>Gives the probability of a sequence of tokens in a language model</para>
+
+<screen>
+<![CDATA[
+Usage: opennlp LanguageModel model
+
+]]>
+</screen>
+</section>
+
+</section>
+
 </chapter>
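
Note for readers of the reordered usage strings (not part of the commit): a minimal Python sketch that assembles an evaluator command line in the option order now documented for the `*Evaluator` tools in the diff above. The helper name `evaluator_command` and the file names are hypothetical; only the tool names and flags (`-misclassified`, `-model`, `-reportOutputFile`, `-data`, `-encoding`) are taken from the usage text. Nothing here invokes OpenNLP, it only builds the argument list.

```python
import shlex


def evaluator_command(tool, model, data, report_file=None,
                      misclassified=None, encoding=None, fmt=None):
    """Build an argv list for `opennlp <tool>[.<fmt>]`.

    Options are emitted in the order shown in the updated usage text:
    -misclassified, -model, -reportOutputFile, -data, -encoding.
    This helper is illustrative only and is not part of OpenNLP.
    """
    name = tool if fmt is None else f"{tool}.{fmt}"
    argv = ["opennlp", name]
    if misclassified is not None:
        argv += ["-misclassified", "true" if misclassified else "false"]
    argv += ["-model", model]          # required per the usage text
    if report_file is not None:
        argv += ["-reportOutputFile", report_file]
    argv += ["-data", data]            # required per the usage text
    if encoding is not None:
        argv += ["-encoding", encoding]
    return argv


# Hypothetical file names, chosen only for illustration.
cmd = evaluator_command(
    "POSTaggerEvaluator",
    model="en-pos.bin",
    data="en-pos.eval",
    report_file="report.txt",
    misclassified=True,
    encoding="UTF-8",
)
print(shlex.join(cmd))
```

The list form can be handed to `subprocess.run` directly if the `opennlp` launcher is on the `PATH`; `shlex.join` is used here only to display the command.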
