njain2208 opened a new pull request, #1476: URL: https://github.com/apache/tika/pull/1476
PR Overview:
_________________________________________________________________________________________________________
This PR fixes the flaky, non-deterministic behavior of the following test, which fails because it assumes a particular ordering of the output.

[org.apache.tika.cli.TikaCLITest#testJsonMetadataOutput](https://github.com/apache/tika/blob/7b79f881e7b47f9272d626ebdfb9456fc206e08f/tika-app/src/test/java/org/apache/tika/cli/TikaCLITest.java#L248)

Test Overview:
_________________________________________________________________________________________________________
The output returned by getParamOutContent is non-deterministic, which causes testJsonMetadataOutput to fail intermittently. This flakiness was identified with the [NonDex tool](https://github.com/TestingResearchIllinois/NonDex), created by researchers at UIUC.

```
[ERROR] TikaCLITest.testJsonMetadataOutput:258 expected: <true> but was: <false>
```

You can reproduce the issue by running the following commands:

```
mvn install -pl tika-app -am -DskipTests
mvn test -pl tika-app -Dtest=org.apache.tika.cli.TikaCLITest#testJsonMetadataOutput
mvn -pl tika-app edu.illinois:nondex-maven-plugin:2.1.1:nondex -Dtest=org.apache.tika.cli.TikaCLITest#testJsonMetadataOutput
```

Fix:
_________________________________________________________________________________________________________
To fix the issue, I decided to sort the JSON string. For the sorting I used Google's Gson library, adding the "com.google.code.gson" dependency to the tika-app/pom.xml file. This makes the output deterministic, and the test now passes.

https://github.com/njain2208/tika/blob/8fce6e9afcc6872e0d5cdfac08ccd2c7dfd7d0b0/tika-app/src/test/java/org/apache/tika/cli/TikaCLITest.java#L251-L271

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
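For readers who want the idea behind the fix in isolation, here is a minimal sketch of why sorting keys makes an order-sensitive comparison deterministic. It is not the code from the PR: it uses only the JDK (a `TreeMap`) rather than Gson, it handles only flat string-to-string metadata, and the class name `SortedJsonSketch`, the method `toSortedJson`, and the sample metadata values are all made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeMap;

public class SortedJsonSketch {

    // Render a flat string-to-string map as a JSON object whose keys appear
    // in sorted order, so the text no longer depends on iteration order.
    // Hypothetical helper; the PR itself sorts the JSON with Gson.
    public static String toSortedJson(Map<String, String> metadata) {
        StringJoiner joiner = new StringJoiner(",", "{", "}");
        // Copying into a TreeMap orders the entries by key.
        for (Map.Entry<String, String> e : new TreeMap<>(metadata).entrySet()) {
            joiner.add(quote(e.getKey()) + ":" + quote(e.getValue()));
        }
        return joiner.toString();
    }

    // Minimal escaping (backslash and double quote only); enough for a sketch,
    // not a full JSON string encoder.
    public static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        // Same entries, inserted in different orders (sample values are made up).
        Map<String, String> first = new LinkedHashMap<>();
        first.put("title", "Test Document");
        first.put("Content-Type", "application/msword");

        Map<String, String> second = new LinkedHashMap<>();
        second.put("Content-Type", "application/msword");
        second.put("title", "Test Document");

        // Order-sensitive text comparison: the two renderings differ.
        System.out.println(first.toString().equals(second.toString())); // false

        // After sorting the keys, both render to the same string.
        System.out.println(toSortedJson(first).equals(toSortedJson(second))); // true
    }
}
```

The same principle applies to nested JSON, where a parser such as Gson is the more practical choice because it handles arrays, nested objects, and full string escaping.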