Re: partition size inherited from parent: auto coalesce

2017-01-16 Thread Takeshi Yamamuro
Hi, Coalescing does not happen automatically at present, so you need to control the number of partitions yourself. Basically, the number of partitions respects `spark.default.parallelism`, which by default (in local mode) is the number of cores on your machine. http://spark.apache.org/docs/latest/configuration.html#execution-behavior // maropu On
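
To illustrate what "controlling the number yourself" might look like, here is a minimal sketch in Python. It is not from the thread: the helper names `target_partitions` and `coalesce_to_fit`, and the `records_per_partition` threshold, are hypothetical and chosen only for illustration. The only Spark calls used are the RDD methods `getNumPartitions()`, `count()`, and `coalesce()`. Note that `count()` triggers a job of its own, so this is only worthwhile when the RDD is cached or cheap to compute.

```python
import math


def target_partitions(n_records, records_per_partition, current_partitions):
    """Pick a partition count for a shrunken dataset.

    Hypothetical helper: aims for roughly `records_per_partition` records
    in each partition, never going above the current partition count and
    never below 1.
    """
    if records_per_partition <= 0:
        raise ValueError("records_per_partition must be positive")
    wanted = max(1, math.ceil(n_records / records_per_partition))
    return min(wanted, current_partitions)


def coalesce_to_fit(rdd, records_per_partition=100_000):
    """Coalesce `rdd` by hand after a step that drops most of the data.

    `rdd` is assumed to be a PySpark RDD. The default threshold is an
    arbitrary illustration value, not a recommendation.
    """
    current = rdd.getNumPartitions()
    n = target_partitions(rdd.count(), records_per_partition, current)
    # coalesce() merges existing partitions without a full shuffle.
    return rdd.coalesce(n) if n < current else rdd
```

With an RDD produced by a selective filter, the call would be along the lines of `small = coalesce_to_fit(parent_rdd.filter(pred))`, where `parent_rdd` and `pred` stand in for whatever the job actually uses.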

partition size inherited from parent: auto coalesce

2017-01-16 Thread Suzen, Mehmet
Hello List, I was wondering what the design principle is behind the partition size of an RDD being inherited from its parent. See one simple example below [*]. 'ngauss_rdd2' has significantly less data; intuitively, in such cases, shouldn't Spark invoke coalesce automatically for performance? What would
