Bruno P. Kinoshita created OPENNLP-1479:
-------------------------------------------
Summary: Write better tests for pattern verification (tokenizers)
Key: OPENNLP-1479
URL: https://issues.apache.org/jira/browse/OPENNLP-1479
Project: OpenNLP
Issue Type: Improvement
Components: Tokenizer
Affects Versions: 2.1.1
Reporter: Bruno P. Kinoshita
Fix For: 2.1.2
From [https://github.com/apache/opennlp/pull/516#issuecomment-1455015772]
At the moment, our tests only verify that the tokenizer objects are created
correctly (i.e. they exercise getters, setters, constructors, etc.), without
verifying the actual behavior when those objects are used in conjunction with
other classes (factories, tokenizers, trainers, etc.).
It would be best to test the patterns used in the factories for different
languages against some interesting sample data (maybe something from Project
Gutenberg, open source news sites, etc.).
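As a rough illustration of the kind of check meant here, the sketch below runs
two patterns over a handful of sample tokens and counts how many each one
accepts. It uses only java.util.regex, not the OpenNLP API. The class name, the
two patterns and the sample tokens are assumptions made up for this example;
they are not the patterns shipped in the factories.

```java
import java.util.List;
import java.util.regex.Pattern;

public class PatternSampleCheck {

    // Hypothetical stand-in for an ASCII-only alphanumeric pattern.
    public static final Pattern ASCII_ALNUM = Pattern.compile("^[A-Za-z0-9]+$");

    // Hypothetical stand-in for a language-aware pattern (any Unicode letter or number).
    public static final Pattern UNICODE_ALNUM = Pattern.compile("^[\\p{L}\\p{N}]+$");

    // Counts the tokens that the given pattern matches in full.
    public static long countMatches(Pattern pattern, List<String> tokens) {
        return tokens.stream()
                .filter(token -> pattern.matcher(token).matches())
                .count();
    }

    public static void main(String[] args) {
        // Made-up Portuguese sample tokens, written with Unicode escapes so the
        // source compiles regardless of file encoding:
        // "coração", "não", "2023", "casa", "avó".
        List<String> sample = List.of(
                "cora\u00e7\u00e3o", "n\u00e3o", "2023", "casa", "av\u00f3");

        // The ASCII pattern accepts only "2023" and "casa".
        System.out.println(countMatches(ASCII_ALNUM, sample));   // prints 2

        // The Unicode-aware pattern accepts all five tokens.
        System.out.println(countMatches(UNICODE_ALNUM, sample)); // prints 5
    }
}
```

A real test would presumably obtain the pattern from the language-specific
factory and read the sample tokens from a text resource, then assert on the
tokens produced by the tokenizer instead of a simple count.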
--
This message was sent by Atlassian Jira
(v8.20.10#820010)