[ 
https://issues.apache.org/jira/browse/OPENNLP-1479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17789918#comment-17789918
 ] 

Lara Marinov edited comment on OPENNLP-1479 at 11/27/23 6:17 AM:
-----------------------------------------------------------------

I read the discussion in [https://github.com/apache/opennlp/pull/516]. For 
which languages would it be good to add new tests in 
[https://github.com/apache/opennlp/blob/main/opennlp-tools/src/test/java/opennlp/tools/tokenize/TokenizerFactoryTest.java]?


was (Author: JIRAUSER303288):
I read the discussion in 
[https://github.com/apache/opennlp/pull/516#issuecomment-1455015772]. For which 
languages would it be good to add new tests in 
[https://github.com/apache/opennlp/blob/main/opennlp-tools/src/test/java/opennlp/tools/tokenize/TokenizerFactoryTest.java]?

> Write better tests for pattern verification (tokenizers)
> --------------------------------------------------------
>
>                 Key: OPENNLP-1479
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-1479
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Tokenizer
>    Affects Versions: 2.1.1
>            Reporter: Bruno P. Kinoshita
>            Priority: Major
>
> From [https://github.com/apache/opennlp/pull/516#issuecomment-1455015772]
> At the moment our tests verify that the tokenizer objects are created 
> correctly (i.e. they test getters and setters, constructors, etc.), without 
> verifying the actual behavior when used in conjunction with other classes 
> (factory, tokenizer, trainers, etc.).
> It would be best to test the patterns used in the factories for different 
> languages with some interesting sample data (maybe something from Project 
> Gutenberg, open source news sites, etc.).
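
As a rough illustration of the kind of pattern check described above, here is a 
minimal sketch that uses only java.util.regex. The class name, the helper 
method, the two patterns, and the sample sentence are all hypothetical and are 
not taken from OpenNLP; a real test would obtain the pattern from the factory 
under test instead of hard-coding it.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Illustrative only: the patterns below are assumptions, not OpenNLP's own.
class AlphanumericPatternSketch {

    // Hypothetical per-language alphanumeric patterns keyed by language code.
    static final Map<String, Pattern> PATTERNS = new LinkedHashMap<>();
    static {
        // ASCII letters and digits only.
        PATTERNS.put("en", Pattern.compile("^[A-Za-z0-9]+$"));
        // Any Unicode letter plus ASCII digits (covers umlauts and sharp s).
        PATTERNS.put("de", Pattern.compile("^[\\p{L}0-9]+$"));
    }

    // Returns true if the whole token matches the pattern for the language.
    public static boolean isAlphanumeric(String lang, String token) {
        return PATTERNS.get(lang).matcher(token).matches();
    }

    public static void main(String[] args) {
        // German sample with non-ASCII letters written as Unicode escapes
        // (o-umlaut, sharp s, a-umlaut, u-umlaut) to stay encoding-safe.
        String sample = "Die Gr\u00f6\u00dfe der H\u00e4user "
                + "\u00fcberraschte 3 Besucher";
        for (String token : sample.split(" ")) {
            System.out.printf("%s en=%b de=%b%n", token,
                    isAlphanumeric("en", token),
                    isAlphanumeric("de", token));
        }
    }
}
```

A test along these lines would assert, per language, which tokens of a sample 
text the pattern accepts and which it rejects, so that a pattern that is too 
narrow for a language shows up as a failing assertion.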



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
