GitHub user srowen opened a pull request:

    https://github.com/apache/spark/pull/2670

    SPARK-3811 [CORE] More robust / standard Utils.deleteRecursively, 
Utils.createTempDir

    I noticed a few issues with how temp directories are created and deleted:
    
    *Minor*
    
    * Guava's `Files.createTempDir()` plus `File.deleteOnExit()` is used in 
many tests to make a temp dir, but `Utils.createTempDir()` seems to be the 
standard Spark mechanism
* The call to `File.deleteOnExit()` could be pushed into 
`Utils.createTempDir()` as well, as part of this replacement
    * _I messed up the message in an exception in `Utils` in SPARK-3794; fixed 
here_
    
    *Bit Less Minor*
    
    * `Utils.deleteRecursively()` fails immediately if any `IOException` 
occurs, instead of trying to delete the remaining files and subdirectories. 
I've observed this leave temp dirs behind. I suggest changing it to continue 
in the face of an exception and, at the end, throw one of the possibly 
several exceptions that occurred.
    * `Utils.createTempDir()` adds a JVM shutdown hook every time the method 
is called, even when the new dir is a subdirectory of one already registered 
for deletion, since that check happens inside the hook rather than at 
registration time. However, `Utils` already manages a set of all dirs to 
delete on shutdown, called `shutdownDeletePaths`. A single hook can be 
registered to delete all of these on exit. This is how Tachyon temp paths are 
cleaned up in `TachyonBlockManager`.
    
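The two fixes above can be sketched together. This is a hypothetical Java illustration (the class name `TempDirs` and its methods are mine, not Spark's actual Scala `Utils` code): `deleteRecursively` keeps going past an `IOException` and rethrows one failure at the end, and a single static shutdown hook deletes every registered dir instead of one hook per `createTempDir` call.

```java
import java.io.File;
import java.io.IOException;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the proposed behavior, not Spark's real code.
public class TempDirs {
    // One shared registry of dirs to delete, analogous to shutdownDeletePaths.
    private static final Set<File> shutdownDeletePaths =
        ConcurrentHashMap.newKeySet();

    static {
        // A single hook, registered once, covers every dir ever created,
        // instead of adding a new hook on each createTempDir() call.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            for (File dir : shutdownDeletePaths) {
                try {
                    deleteRecursively(dir);
                } catch (IOException e) {
                    // Best effort on shutdown; nothing useful to do here.
                }
            }
        }));
    }

    public static File createTempDir(File root) throws IOException {
        File dir = new File(root, "spark-" + UUID.randomUUID());
        if (!dir.mkdirs()) {
            throw new IOException("Failed to create dir " + dir);
        }
        shutdownDeletePaths.add(dir);  // register; no per-call hook
        return dir;
    }

    public static void deleteRecursively(File file) throws IOException {
        IOException saved = null;
        File[] children = file.listFiles();
        if (children != null) {
            for (File child : children) {
                try {
                    deleteRecursively(child);
                } catch (IOException e) {
                    saved = e;  // remember, but keep deleting siblings
                }
            }
        }
        if (!file.delete() && file.exists()) {
            saved = new IOException("Failed to delete " + file);
        }
        if (saved != null) {
            throw saved;  // surface one of the accumulated failures
        }
    }
}
```

The key difference from a fail-fast delete is that an unreadable or locked file no longer strands its siblings on disk; the exception is still reported, just after the rest of the tree has been cleaned up.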
    I noticed a few other things that might be changed but wanted to ask first:
    
    * Shouldn't the set of dirs to delete hold `File` objects, not just `String` paths?
    * `Utils` manages the set of `TachyonFile` that have been registered for 
deletion, but the shutdown hook is managed in `TachyonBlockManager`. Shouldn't 
this logic live together, and outside `Utils`? It's specific to Tachyon, and 
looks slightly odd to import in such a generic place.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/srowen/spark SPARK-3811

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2670.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2670
    
----
commit 3a0faa4e151cac3d9d9b4b4ee87cd024d260c9b1
Author: Sean Owen <so...@cloudera.com>
Date:   2014-10-06T10:19:01Z

    Standardize on Utils.createTempDir instead of Files.createTempDir

commit da0146de0fd21f375843afb47441a2d9a4db146d
Author: Sean Owen <so...@cloudera.com>
Date:   2014-10-06T10:19:30Z

    Make Utils.deleteRecursively try to delete all paths even when an exception 
occurs; use one shutdown hook instead of one per method call to delete temp dirs

----


