Hi folks, I'm puzzled by something pretty simple: I have a standalone cluster with a default parallelism of 2, and a spark-shell running with 2 cores.
sc.textFile("README.md").partitions.size returns 2 (this makes sense).
sc.textFile("README.md").coalesce(100, true).partitions.size returns 100, which also makes sense.
But sc.textFile("README.md", 100).partitions.size gives 102. I was expecting this to be equivalent to the previous statement (i.e. to result in 100 partitions).
I'd appreciate it if someone could enlighten me as to why I end up with 102. This is on Spark 1.2. Thanks!
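For what it's worth, my guess is that the second argument to textFile is only a minimum hint: it gets passed down as numSplits to Hadoop's old org.apache.hadoop.mapred.FileInputFormat.getSplits, whose split math uses integer division, so the actual split count can come out slightly above what was requested. Here is a small standalone sketch of that split math (the file size 3673 is a made-up stand-in for my README, not a measured value, and the block size is just a plausible local default):

```scala
// Back-of-the-envelope model of Hadoop's FileInputFormat.getSplits
// (the old "mapred" API that sc.textFile goes through in Spark 1.2).
object SplitMath {
  val SPLIT_SLOP = 1.1 // same slop factor FileInputFormat uses for the tail

  def numSplits(totalSize: Long, requested: Int,
                minSize: Long = 1L,
                blockSize: Long = 32L * 1024 * 1024): Int = {
    val goalSize  = totalSize / requested // integer division: rounds DOWN
    val splitSize = math.max(minSize, math.min(goalSize, blockSize))
    var remaining = totalSize
    var count = 0
    // carve off full-size splits while more than 1.1x a split remains
    while (remaining.toDouble / splitSize > SPLIT_SLOP) {
      count += 1
      remaining -= splitSize
    }
    if (remaining != 0) count += 1 // the leftover tail becomes one more split
    count
  }

  def main(args: Array[String]): Unit = {
    // e.g. a ~3.6 KB file: goalSize = 3673 / 100 = 36 bytes, so the file
    // is cut into 36-byte chunks and we get MORE than the 100 requested
    println(SplitMath.numSplits(3673L, 100)) // 102
  }
}
```

Because goalSize is floored, each split is a bit smaller than totalSize / 100 "should" be, and the extra bytes spill over into one or two additional splits, whereas coalesce(100, true) shuffles into exactly the number you ask for.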