Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/20091
    @jiangxb1987 I am not disagreeing with your hypothesis that the default
parallelism might not be optimal in all cases within an application (for
example, when different RDDs in an application have widely varying
cardinalities).
    However, since spark.default.parallelism is an exposed interface that
applications depend on, changing its semantics here would be a functional
regression and would break an exposed contract in Spark.
    This is why we have the option of explicitly overriding the number of
partitions when the default does not work well.
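    To make that last point concrete, here is a minimal, standalone sketch
(not part of this PR; the numbers, app name, and dataset sizes are purely
illustrative) of keeping spark.default.parallelism as the application-wide
default while overriding the partition count on a specific operation whose
cardinality is very different:

```scala
// Sketch only: shows the existing override mechanism, not a change proposed here.
import org.apache.spark.sql.SparkSession

object PartitionOverrideSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("partition-override-sketch")   // hypothetical app name
      .master("local[4]")
      .config("spark.default.parallelism", "8") // application-wide default
      .getOrCreate()
    val sc = spark.sparkContext

    // Small RDD: relies on spark.default.parallelism (8) for the shuffle.
    val small = sc.parallelize(1 to 100)
      .map(x => (x % 10, x))
      .reduceByKey(_ + _)

    // Much larger RDD: explicitly overrides the number of shuffle partitions
    // instead of inheriting the application-wide default.
    val large = sc.parallelize(1 to 10000000)
      .map(x => (x % 1000, x))
      .reduceByKey(_ + _, 200)

    println(s"small partitions: ${small.getNumPartitions}, " +
      s"large partitions: ${large.getNumPartitions}")
    spark.stop()
  }
}
```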