Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1799#discussion_r16029607
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkEnv.scala ---
    @@ -246,8 +250,13 @@ object SparkEnv extends Logging {
           "."
         }
     
    -    val shuffleManager = instantiateClass[ShuffleManager](
    -      "spark.shuffle.manager", "org.apache.spark.shuffle.hash.HashShuffleManager")
    +    // Let the user specify short names for shuffle managers
    +    val shortShuffleMgrNames = Map(
    +      "hash" -> "org.apache.spark.shuffle.hash.HashShuffleManager",
    +      "sort" -> "org.apache.spark.shuffle.sort.SortShuffleManager")
    +    val shuffleMgrName = conf.get("spark.shuffle.manager", "hash")
    --- End diff --
    
    I ran into a problem using these short names: in ShuffleBlockManager,
    there's a line that looks at the `spark.shuffle.manager` property to see
    whether we're using sort-based shuffle:
    
    ```scala
      // Are we using sort-based shuffle?
      val sortBasedShuffle =
        conf.get("spark.shuffle.manager", "") == classOf[SortShuffleManager].getName
    ```
    
    This check evaluates to false when the configuration property is set to one
    of the short names (e.g. `sort`), because it compares the raw property
    value against the full class name.
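
    To make the failure concrete, here is a minimal standalone sketch of that
    comparison. It is not the actual ShuffleBlockManager code: a plain `Map`
    stands in for SparkConf, and the hypothetical `ShortNameCheck` object
    hard-codes the class name from the diff above instead of calling
    `classOf[SortShuffleManager].getName`.

    ```scala
    object ShortNameCheck {
      // Assumption: stand-in for classOf[SortShuffleManager].getName,
      // using the class name listed in the diff, so no Spark dependency is needed.
      val sortManagerClassName = "org.apache.spark.shuffle.sort.SortShuffleManager"

      // Same comparison as the snippet above, with a Map in place of SparkConf.
      def sortBasedShuffle(settings: Map[String, String]): Boolean =
        settings.getOrElse("spark.shuffle.manager", "") == sortManagerClassName
    }
    ```

    With `Map("spark.shuffle.manager" -> "sort")` this returns false, even
    though sort-based shuffle was requested; it only returns true for the full
    class name.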
    
    We can't just re-assign the property to the full name at this point,
    because the BlockManager will already have been created, and it will have
    created the ShuffleBlockManager with the wrong (short-name) property value.
    Similarly, the ShuffleBlockManager can't access SparkEnv to inspect the
    actual ShuffleManager, because SparkEnv won't be fully initialized yet.
    
    I think we should perform all configuration normalization / mutation at a
    single top-level location and then treat the configuration as immutable
    from that point forward, since that seems easier to reason about.  What do
    you think about moving the aliasing / normalization to the top of SparkEnv?
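
    Roughly, the idea could be sketched like this. It is only an illustration
    under stated assumptions: a plain immutable `Map` stands in for SparkConf,
    and `ShuffleManagerAliases`, `Key`, and `normalize` are hypothetical names
    that don't exist in Spark. The short-name table and the "hash" default are
    copied from the diff above.

    ```scala
    object ShuffleManagerAliases {
      // Short names copied from the diff above.
      val shortShuffleMgrNames: Map[String, String] = Map(
        "hash" -> "org.apache.spark.shuffle.hash.HashShuffleManager",
        "sort" -> "org.apache.spark.shuffle.sort.SortShuffleManager")

      val Key = "spark.shuffle.manager"

      // Resolve the alias once, up front, and return a new settings map.
      // Values that are not known short names are passed through unchanged,
      // so full class names keep working.
      def normalize(settings: Map[String, String]): Map[String, String] = {
        val configured = settings.getOrElse(Key, "hash")
        val fullName = shortShuffleMgrNames.getOrElse(configured, configured)
        settings.updated(Key, fullName)
      }
    }
    ```

    If something like `normalize` ran before the BlockManager is constructed,
    every later reader of `spark.shuffle.manager` would see the full class
    name, and the existing comparison in ShuffleBlockManager would not need to
    change.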

