GitHub user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5136#discussion_r27020752
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala ---
    @@ -91,7 +90,12 @@ private[spark] class DiskBlockManager(blockManager: BlockManager, conf: SparkCon
       /** List all the files currently stored on disk by the disk manager. */
       def getAllFiles(): Seq[File] = {
         // Get all the files inside the array of array of directories
    -    subDirs.flatten.filter(_ != null).flatMap { dir =>
    +    subDirs.flatMap { dir =>
    --- End diff --
    
    I think you have a decent point. Yes, the example I gave happened to 
involve strings, which have `final` fields, but imagine a different example 
that doesn't. I think I am implicitly reasoning that the file creation, for 
example, must happen-before the assignment within one thread; that is ordering 
within a single thread, not a question of the Java memory model and 
cross-thread visibility. I also don't think you can observe a memory location 
before the object's default initialization finishes, since that step (as 
opposed to the constructor body) is atomic with respect to the Java program. On 
top of that, the end of the first `synchronized` block acts as a memory barrier 
that makes all of those writes visible.
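
    To make the reasoning above concrete, here is a minimal sketch of the 
pattern under discussion. It is not the actual `DiskBlockManager` code: the 
class name `SubDirTable` and the methods `getOrCreate` and `snapshot` are 
hypothetical, and it assumes the writer and the reader lock the same inner 
array. Under that assumption, the unlock at the end of the writer's 
`synchronized` block happens-before the reader's later lock, so the reader's 
copy sees either `null` or a fully created directory.

```scala
import java.io.{File, IOException}

// Hypothetical, simplified stand-in for a lazily filled table of sub-directories.
class SubDirTable(rootDirs: Array[File], subDirsPerRoot: Int) {
  // Every slot starts out as null and is filled on first use.
  private val subDirs: Array[Array[File]] =
    Array.fill(rootDirs.length)(new Array[File](subDirsPerRoot))

  /** Create the sub-directory on first use, holding the lock on the inner array. */
  def getOrCreate(dirId: Int, subDirId: Int): File = subDirs(dirId).synchronized {
    val existing = subDirs(dirId)(subDirId)
    if (existing != null) {
      existing
    } else {
      val created = new File(rootDirs(dirId), "%02x".format(subDirId))
      if (!created.exists() && !created.mkdirs()) {
        throw new IOException(s"Failed to create directory $created")
      }
      // Within this thread, the directory creation precedes this assignment.
      subDirs(dirId)(subDirId) = created
      created
    }
  }

  /** The "extra copy": clone each inner array under the same lock, then drop empty slots. */
  def snapshot(): Seq[File] =
    subDirs.toSeq.flatMap { dir =>
      val copy = dir.synchronized { dir.clone() }
      copy.filter(_ != null).toSeq
    }
}
```

    The copy costs one short array clone per root directory, and the lock is 
held only for the duration of the clone, not while the caller walks the files.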
    
    So my assertion that "this sort of thing can never happen" was too strong, 
since it does depend a little more on exactly what's happening.
    
    So, hm, just given that there was discussion here, I can see the argument 
for being safe and leaving in the extra copy "just in case". I suppose the 
question is how expensive or error-prone it is.

