GitHub user httfighter opened a pull request: https://github.com/apache/spark/pull/22487

    [SPARK-25477] “INSERT OVERWRITE LOCAL DIRECTORY”, the data files allocated on the non-driver node will not be written to the specified output directory

## What changes were proposed in this pull request?

The "INSERT OVERWRITE LOCAL DIRECTORY" feature uses a local staging directory to load data into the specified output directory. As a result, data files written on non-driver nodes are not written to the specified output directory.

In saveAsHiveFile.scala, the code uses the output directory to decide whether to use a local staging directory or a distributed staging directory.

I changed the call to the getStagingDir() method, modifying its first parameter from `new Path(extURI.getScheme, extURI.getAuthority, extURI.getPath)` to `new Path(extURI.getPath)`.

With this change, if Spark depends on a distributed storage system, that system is used first for the staging directory. If it does not, the local file system is used. The choice is made automatically instead of being dictated by the output directory.

## How was this patch tested?

manual tests

Please review http://spark.apache.org/contributing.html before opening a pull request.
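
The effect of the changed first argument to getStagingDir() can be illustrated with a minimal sketch that uses only `java.net.URI`. This is not the Spark or Hadoop code: `StagingDemo`, `qualify`, the output directory `file:///tmp/result`, and the default filesystem `hdfs://namenode:8020` are all hypothetical, and `qualify` only approximates the assumption that a path without a scheme is resolved against the configured default filesystem.

```scala
import java.net.URI

object StagingDemo {
  // Hypothetical helper (not a Spark or Hadoop API): a path that already
  // carries a scheme is left alone; a scheme-less path is qualified
  // against the default filesystem.
  def qualify(path: URI, defaultFs: URI): URI =
    if (path.getScheme != null) path
    else new URI(defaultFs.getScheme, defaultFs.getAuthority, path.getPath, null, null)

  // Example output directory for INSERT OVERWRITE LOCAL DIRECTORY (assumed value).
  val extURI: URI = new URI("file:///tmp/result")

  // Example default filesystem of the cluster (assumed value).
  val defaultFs: URI = new URI("hdfs://namenode:8020")

  // Before the change: scheme and authority are carried over from the output
  // directory, so the staging location stays on the local "file" scheme.
  val before: URI = new URI(extURI.getScheme, extURI.getAuthority, extURI.getPath, null, null)

  // After the change: only the path component is kept, so the default
  // filesystem decides where the staging directory lives.
  val after: URI = new URI(null, null, extURI.getPath, null, null)

  def main(args: Array[String]): Unit = {
    println(s"before: ${qualify(before, defaultFs)}")
    println(s"after:  ${qualify(after, defaultFs)}")
  }
}
```

In this sketch the "before" form keeps the `file` scheme regardless of the cluster configuration, while the "after" form picks up the `hdfs` scheme from the assumed default filesystem. If the default filesystem were itself local, both forms would resolve to a local path.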

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/httfighter/spark SPARK-25477

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22487.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22487

----
commit 8fe6d095fd2ce1a1a129a46345b1cecf6df70d8c
Author: 韩田田00222924 <han.tiantian@...>
Date:   2018-09-20T07:57:06Z

    [SPARK-25477] “INSERT OVERWRITE LOCAL DIRECTORY”, the data files allocated on the non-driver node will not be written to the specified output directory

----