Hi folks, puzzled by something pretty simple:
I have a standalone cluster with default parallelism of 2, spark-shell
running with 2 cores
sc.textFile("README.md").partitions.size returns 2 (this makes sense)
sc.textFile("README.md").coalesce(100,true).partitions.size returns 100,
also makes sense
but
sc.textFile("README.md", 100).partitions.size
gives 102 -- I was expecting this to be equivalent to the last statement
(i.e. to result in 100 partitions).
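From poking around, my guess is that the second argument is only a minPartitions hint that gets passed down to Hadoop's FileInputFormat.getSplits, which can emit a couple of extra splits when the file size doesn't divide evenly. Here's a minimal sketch of that split arithmetic as I understand it -- the 3673-byte file size is made up for illustration, and the constants only mirror what I believe Hadoop does, not the actual Hadoop code:

```scala
// Rough sketch of the FileInputFormat.getSplits arithmetic (my reading of it,
// not the real Hadoop source). File size below is hypothetical.
object SplitSketch {
  val SPLIT_SLOP = 1.1 // a split may run up to 10% over splitSize before closing

  def computeSplits(totalSize: Long, minPartitions: Int,
                    blockSize: Long = 32L * 1024 * 1024): Int = {
    val goalSize  = totalSize / math.max(1, minPartitions)
    val splitSize = math.max(1L, math.min(goalSize, blockSize))
    var remaining = totalSize
    var splits    = 0
    while (remaining.toDouble / splitSize > SPLIT_SLOP) {
      splits += 1
      remaining -= splitSize
    }
    if (remaining > 0) splits += 1 // leftover tail becomes one more split
    splits
  }

  def main(args: Array[String]): Unit = {
    // With a (made-up) 3673-byte file, asking for 100 partitions gives
    // goalSize = 36 bytes, and the remainder pushes the count past 100.
    println(computeSplits(3673L, 100)) // prints 102
  }
}
```

So if something like this is what's happening, the count is a lower bound rather than an exact target, which would explain 102 -- but I'd appreciate confirmation.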
I'd appreciate it if someone could enlighten me as to why I end up with 102.
This is on Spark 1.2.
thanks