[
https://issues.apache.org/jira/browse/OPENNLP-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17788112#comment-17788112
]
ASF GitHub Bot commented on OPENNLP-1519:
-----------------------------------------
Sujishark commented on PR #556:
URL: https://github.com/apache/opennlp/pull/556#issuecomment-1819541012
Greetings,
I've added the environment details.
A similar issue was already addressed in this PR:
https://github.com/apache/opennlp/pull/387
I made this change because the `equals` method, which is meant to compare the
elements of a Set, also checks the order in which those elements are iterated.
https://github.com/apache/opennlp/blob/dab19af803e9139b57b972eb4e1af4c978d2ff3f/opennlp-tools/src/main/java/opennlp/tools/dictionary/Dictionary.java#L356
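To illustrate the difference, here is a minimal sketch. It is not the code at the linked line of Dictionary.java; the class `OrderSensitiveEqualsSketch` and the helper `equalsInIterationOrder` are hypothetical names made up for this example. It contrasts the standard `Set.equals` contract, which ignores order, with a comparison that walks both sets in iteration order:

```java
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class OrderSensitiveEqualsSketch {

  // Hypothetical helper: compares two sets element by element in iteration
  // order, so the same elements in a different order compare as unequal.
  static boolean equalsInIterationOrder(Set<String> a, Set<String> b) {
    if (a.size() != b.size()) {
      return false;
    }
    Iterator<String> itA = a.iterator();
    Iterator<String> itB = b.iterator();
    while (itA.hasNext()) {
      if (!itA.next().equals(itB.next())) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // LinkedHashSet keeps insertion order, which makes the two orders explicit.
    Set<String> first = new LinkedHashSet<>(List.of("1a", "1b"));
    Set<String> second = new LinkedHashSet<>(List.of("1b", "1a"));

    // Set.equals only checks size and membership.
    System.out.println(first.equals(second));                   // true
    // The order-sensitive walk sees "1a" vs "1b" first and stops.
    System.out.println(equalsInIterationOrder(first, second));  // false
  }
}
```

Any comparison of the second kind will pass or fail depending on the iteration order of the underlying set.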
The `entrySet` here is a HashSet, which does not guarantee a constant
iteration order, as noted in the
[Java 17 documentation](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/HashSet.html).
https://github.com/apache/opennlp/blob/dab19af803e9139b57b972eb4e1af4c978d2ff3f/opennlp-tools/src/main/java/opennlp/tools/dictionary/Dictionary.java#L95
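As a small sketch of what this means in practice (the class `IterationOrderSketch`, the helper `iterationOrder`, and the sample entries are illustrative assumptions, not taken from the OpenNLP sources): a LinkedHashSet iterates in insertion order, a HashSet makes no such promise, and an order-independent check such as `Set.equals` holds for both.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class IterationOrderSketch {

  // Hypothetical helper: captures the order in which a set is iterated.
  static List<String> iterationOrder(Set<String> set) {
    return new ArrayList<>(set);
  }

  public static void main(String[] args) {
    // Illustrative entries only.
    List<String> entries = Arrays.asList("Berlin", "Stockholm", "London", "Paris");

    Set<String> linked = new LinkedHashSet<>(entries);
    Set<String> hashed = new HashSet<>(entries);

    // LinkedHashSet: iteration follows insertion order.
    System.out.println(iterationOrder(linked)); // [Berlin, Stockholm, London, Paris]

    // HashSet: the order is unspecified and may differ between JVMs or runs
    // under NonDex, so a test must not depend on it.
    System.out.println(iterationOrder(hashed));

    // An order-independent assertion is safe for both implementations.
    System.out.println(hashed.equals(linked)); // true
  }
}
```

A test that asserts on the printed order of the HashSet is therefore flaky by construction, while one that asserts on membership is not.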
For now, I've changed only the test files and left the actual code
untouched.
> Use LinkedHashSet for deterministic iteration order
> ---------------------------------------------------
>
> Key: OPENNLP-1519
> URL: https://issues.apache.org/jira/browse/OPENNLP-1519
> Project: OpenNLP
> Issue Type: Improvement
> Reporter: Sujithra Rajan
> Priority: Minor
>
> Two tests,
> * `opennlp.tools.dictionary.DictionaryAsSetCaseInsensitiveTest.testEquals`
> * `opennlp.tools.dictionary.DictionaryAsSetCaseInsensitiveTest.testEqualsDifferentCase`
>
> rely on a `Dictionary` whose `entrySet` is initialized as a HashSet, so the
> iteration order is not guaranteed to be constant.
> This was found using the
> [NonDex](https://github.com/TestingResearchIllinois/NonDex) tool.
> The following error messages were encountered:
> {quote}org.opentest4j.AssertionFailedError: expected: <[1a, 1b]> but was: <[1b, 1a]>
> at opennlp.tools.dictionary.DictionaryAsSetCaseInsensitiveTest.testEquals(DictionaryAsSetCaseInsensitiveTest.java:121)
> {quote}
> {quote}org.opentest4j.AssertionFailedError: expected: <[1a, 1b]> but was: <[1B, 1A]>
> at opennlp.tools.dictionary.DictionaryAsSetCaseInsensitiveTest.testEqualsDifferentCase(DictionaryAsSetCaseInsensitiveTest.java:142)
> {quote}
> {quote}org.opentest4j.AssertionFailedError: expected: <[[Berlin], [Stockholm], [New,York], [London], [Copenhagen], [Paris]]> but was: <[[Copenhagen], [London], [New,York], [Stockholm], [Paris], [Berlin]]>
> at opennlp.uima.dictionary.DictionaryResourceTest.testDictionaryWasLoaded(DictionaryResourceTest.java:76)
> {quote}
>
> The fix is to change HashSet to LinkedHashSet so that the iteration order
> always follows insertion order and stays stable across runs.
> The assertion in `testDictionaryWasLoaded` was modified to match the
> exact ordering of entries in dictionary.dic.
>
> REPRODUCE:
> ```
> mvn edu.illinois:nondex-maven-plugin:2.1.7-SNAPSHOT:nondex \
>   -Dtest=opennlp.tools.dictionary.DictionaryAsSetCaseInsensitiveTest#testEquals
> ```
> Can I proceed and create a PR?
--
This message was sent by Atlassian Jira
(v8.20.10#820010)