njain2208 opened a new pull request, #1476:
URL: https://github.com/apache/tika/pull/1476

   PR Overview:
   
_________________________________________________________________________________________________________
   This PR fixes the flaky, non-deterministic behavior of the following test, which assumes a particular ordering of the output.
   
   
[org.apache.tika.cli.TikaCLITest#testJsonMetadataOutput](https://github.com/apache/tika/blob/7b79f881e7b47f9272d626ebdfb9456fc206e08f/tika-app/src/test/java/org/apache/tika/cli/TikaCLITest.java#L248)
   
   Test Overview:
   
_________________________________________________________________________________________________________
   The ordering of the output returned by getParamOutContent is non-deterministic, which causes testJsonMetadataOutput to fail intermittently.
   
   This flakiness was identified by the [NonDex tool](https://github.com/TestingResearchIllinois/NonDex), created by researchers at UIUC.
   
   ```
   [ERROR]   TikaCLITest.testJsonMetadataOutput:258 expected: <true> but was: <false>
   ```
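
As a minimal sketch of this failure mode (not Tika code): the class, the helper names and the sample fields below are hypothetical, and the exact assertion at line 258 is not reproduced here. The sketch only assumes that the test compares the JSON text against something that depends on key order. NonDex shuffles the iteration order of collections whose order is unspecified, which is simulated here with two different insertion orders.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical illustration; none of these names come from Tika.
class OrderDependentJson {

    // Renders a flat map as a JSON object in the map's own iteration order.
    // Keys and values are assumed to need no escaping.
    static String toJson(Map<String, String> fields) {
        StringJoiner joiner = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            joiner.add("\"" + e.getKey() + "\":\"" + e.getValue() + "\"");
        }
        return joiner.toString();
    }

    // Builds a map that remembers insertion order from key/value pairs.
    static Map<String, String> fields(String... keyValues) {
        Map<String, String> map = new LinkedHashMap<>();
        for (int i = 0; i < keyValues.length; i += 2) {
            map.put(keyValues[i], keyValues[i + 1]);
        }
        return map;
    }

    public static void main(String[] args) {
        String expected = "{\"title\":\"Test\",\"author\":\"Ann\"}";
        String sameOrder = toJson(fields("title", "Test", "author", "Ann"));
        String otherOrder = toJson(fields("author", "Ann", "title", "Test"));

        // Both strings describe the same object, but only one equals the
        // fixed expected text.
        System.out.println(sameOrder.equals(expected));  // true
        System.out.println(otherOrder.equals(expected)); // false
    }
}
```

A check written against the exact text passes or fails depending on which order the collection happens to iterate in.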
   
   You can reproduce the issue by running the following commands:
   
   ```
   mvn install -pl tika-app -am -DskipTests
   mvn test -pl tika-app -Dtest=org.apache.tika.cli.TikaCLITest#testJsonMetadataOutput
   mvn -pl tika-app edu.illinois:nondex-maven-plugin:2.1.1:nondex -Dtest=org.apache.tika.cli.TikaCLITest#testJsonMetadataOutput
   ```
   Fix:
   
_________________________________________________________________________________________________________
   To fix the issue, I sort the JSON string before checking it. For the sorting I used Google's Gson library ("com.google.code.gson"), added in the tika-app/pom.xml file. This makes the output deterministic, and the test now passes.
   
   
https://github.com/njain2208/tika/blob/8fce6e9afcc6872e0d5cdfac08ccd2c7dfd7d0b0/tika-app/src/test/java/org/apache/tika/cli/TikaCLITest.java#L251-L271
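
The idea behind the fix can be sketched as follows. This is not the code from the PR, which uses Gson. It is a standard-library-only illustration on a flat map of strings, and the class and method names are hypothetical. It assumes that putting the keys into a canonical (sorted) order before comparing is enough to remove the dependence on iteration order.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeMap;

// Hypothetical illustration; the PR itself relies on Gson instead.
class SortedJson {

    // Renders a flat map as a JSON object with keys in sorted order, so the
    // result does not depend on the iteration order of the input map.
    // Keys and values are assumed to need no escaping.
    static String toSortedJson(Map<String, String> fields) {
        StringJoiner joiner = new StringJoiner(",", "{", "}");
        // TreeMap iterates its keys in natural (lexicographic) order.
        for (Map.Entry<String, String> e : new TreeMap<>(fields).entrySet()) {
            joiner.add("\"" + e.getKey() + "\":\"" + e.getValue() + "\"");
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        Map<String, String> first = new LinkedHashMap<>();
        first.put("title", "Test");
        first.put("author", "Ann");

        Map<String, String> second = new LinkedHashMap<>();
        second.put("author", "Ann");
        second.put("title", "Test");

        // Same content, different insertion order, identical sorted output.
        System.out.println(toSortedJson(first).equals(toSortedJson(second))); // true
    }
}
```

Sorting only changes how the data is presented to the assertion. The set of keys and values being checked stays the same.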


