If a rule test fails because no error was found in the bad example sentence, one possible reason is that the tokenization of the bad example sentence does not match the tokenization expected by the rule itself.
To identify these cases more easily, add the token readings to the assertion message.

Signed-off-by: Silvan Jegen <s.je...@gmail.com>
---
Hi

I had difficulties when creating Japanese rules because the MeCab program I used to determine the tokenization of the example phrases produced different tokens than the tokenization library used in LanguageTool. It took me quite a while to find out why the errors in my bad example sentences were not found. Having the tokenized readings of the bad example sentences in the assertion message makes it easier to identify issues with tokenization.

I realize that this change may be less useful for languages with simpler tokenization, but I still think it would be worth discussing whether it makes sense to include this output. Maybe there is other functionality in LanguageTool that I do not know of and that would make the suggested change unnecessary?

If including the analyzed token readings is useful in other assertion messages as well, it may also be better to refactor the token reading code into its own function to make it less ad hoc.

What do you think?

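To make the refactoring idea a bit more concrete, here is a rough sketch of what such a helper could look like. It is not part of the patch: the class name TokenReadingsMessage and the method name describeTokenReadings are made up for illustration, and the helper takes a plain Object[] so that it compiles without LanguageTool on the classpath. In PatternRuleTest the argument would presumably be the AnalyzedTokenReadings[] returned by analyzedSentence.getTokens().

```java
// Hypothetical helper, not part of the patch. It only relies on toString(),
// so any array of token-like objects can be passed in.
class TokenReadingsMessage {

    // Builds the same "Analyzed token readings: ..." text that the patch
    // assembles inline before the assertion.
    static String describeTokenReadings(Object[] tokenReadings) {
        StringBuilder sb = new StringBuilder("Analyzed token readings:");
        for (Object reading : tokenReadings) {
            // Separate the readings with a single space, as in the patch.
            sb.append(' ').append(reading);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Placeholder strings standing in for real token readings.
        Object[] placeholders = {"tokenA", "tokenB"};
        // Prints: Analyzed token readings: tokenA tokenB
        System.out.println(describeTokenReadings(placeholders));
    }
}
```

Other assertion messages in the test could then append the result of this one call instead of repeating the loop.
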
(If you want to include the patch, I can open a pull request on GitHub.)

Cheers,

Silvan

 .../org/languagetool/rules/patterns/PatternRuleTest.java | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/languagetool-core/src/test/java/org/languagetool/rules/patterns/PatternRuleTest.java b/languagetool-core/src/test/java/org/languagetool/rules/patterns/PatternRuleTest.java
index 0d5580d..d279b36 100644
--- a/languagetool-core/src/test/java/org/languagetool/rules/patterns/PatternRuleTest.java
+++ b/languagetool-core/src/test/java/org/languagetool/rules/patterns/PatternRuleTest.java
@@ -22,6 +22,7 @@ import java.io.File;
 import java.io.IOException;
 import java.io.InputStream;
 import java.lang.String;
+import java.lang.StringBuilder;
 import java.util.*;
 
 import junit.framework.TestCase;
@@ -281,9 +282,17 @@ public class PatternRuleTest extends TestCase {
         }
 
         if (!rule.isWithComplexPhrase()) {
-          assertTrue(lang + ": Did expect one error in: \"" + badSentence
-                  + "\" (Rule: " + rule + "), but found " + matches.size()
-                  + ". Additional info:" + rule.getMessage() + ", Matches: " + matches, matches.size() == 1);
+          if (matches.size() != 1) {
+            final AnalyzedSentence analyzedSentence = languageTool.getAnalyzedSentence(badSentence);
+            final AnalyzedTokenReadings[] analyzedTR = analyzedSentence.getTokens();
+            final StringBuilder sb = new StringBuilder("Analyzed token readings:");
+            for (AnalyzedTokenReadings atr : analyzedTR) {
+              sb.append(" " + atr.toString());
+            }
+            assertTrue(lang + ": Did expect one error in: \"" + badSentence
+                    + "\" (Rule: " + rule + "), but found " + matches.size()
+                    + ". Additional info:" + rule.getMessage() + ", " + sb.toString() + ", Matches: " + matches, matches.size() == 1);
+          }
           assertEquals(lang
                   + ": Incorrect match position markup (start) for rule " + rule + ", sentence: " + badSentence,
                   expectedMatchStart, matches.get(0).getFromPos());
-- 
2.0.4