GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/20584

[SPARK-23390][SQL] Flaky Test Suite: FileBasedDataSourceSuite in Spark 2.3/hadoop 2.7

## What changes were proposed in this pull request?

This test only fails with sbt on Hadoop 2.7. I can't reproduce it locally, but here is my speculation from looking at the code:
1. FileSystem.delete doesn't delete the directory entirely, and somehow we can still open the file as a 0-length empty file (just speculation).
2. ORC intentionally allows empty files, and the reader fails during reading without closing the file stream.

This PR improves the test to make sure all files are deleted and can't be opened.

## How was this patch tested?

N/A

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark flaky-test

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20584.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20584

----
commit 51bb48a4189aeb0322dd4ccd0f02416a52e963c3
Author: Wenchen Fan <wenchen@...>
Date:   2018-02-12T04:24:35Z

    make sure all files are deleted when testing IGNORE_MISSING_FILES

----
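
The check described in the PR (delete every file, then confirm that none of them can still be opened) can be illustrated with a minimal sketch. This is not the code from the patch: it uses only the Python standard library on a local filesystem rather than Hadoop's FileSystem API, and the names `delete_and_verify` and `demo`, as well as the `part-N.orc` file names, are hypothetical.

```python
import os
import shutil
import tempfile


def delete_and_verify(dir_path):
    """Delete dir_path recursively, then confirm none of its files can be opened.

    Hypothetical helper for illustration only. Returns the list of file
    paths that were checked.
    """
    # Record every file under the directory before deleting it.
    files = [
        os.path.join(root, name)
        for root, _dirs, names in os.walk(dir_path)
        for name in names
    ]

    shutil.rmtree(dir_path)

    # The scenario the PR speculates about: a path that still opens
    # (for example as a 0-length file) after the delete call returned.
    for path in files:
        try:
            with open(path, "rb"):
                pass
        except FileNotFoundError:
            continue
        raise AssertionError(f"{path} can still be opened after delete")
    return files


def demo():
    """Create three small files, delete them, and return how many were checked."""
    base = tempfile.mkdtemp()
    data_dir = os.path.join(base, "data")
    os.makedirs(data_dir)
    for i in range(3):
        with open(os.path.join(data_dir, f"part-{i}.orc"), "wb") as f:
            f.write(b"x")
    checked = delete_and_verify(data_dir)
    shutil.rmtree(base)
    return len(checked)
```

Calling `demo()` exercises the helper on a temporary directory. On a filesystem where a deleted path could still be opened, `delete_and_verify` would raise instead of returning silently, which is the failure mode the improved test is meant to rule out.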