Hi,
How did you check the number of splits in your file? Did you run your MR
job, or did you calculate it?
The formula for the split size is
max(minSize, min(maxSize, blockSize)). Can you check whether it holds in
your case?
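As a rough sketch of that formula: the class name, method name, and sample values below are made up for illustration and are not taken from anyone's actual configuration.

```java
// Illustrative sketch of the split-size rule quoted above.
// SplitSizeSketch and the sample values are hypothetical.
final class SplitSizeSketch {

    // max(minSize, min(maxSize, blockSize))
    static long computeSplitSize(long minSize, long maxSize, long blockSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;

        // No effective min/max limits: the block size decides.
        System.out.println(computeSplitSize(1L, Long.MAX_VALUE, 128 * mb) / mb + " MB");

        // A max size below the block size shrinks the split.
        System.out.println(computeSplitSize(1L, 64 * mb, 128 * mb) / mb + " MB");

        // A min size above the block size enlarges the split.
        System.out.println(computeSplitSize(256 * mb, Long.MAX_VALUE, 128 * mb) / mb + " MB");
    }
}
```

With these assumed inputs the three calls give 128 MB, 64 MB and 256 MB, so the block size only decides the split size when neither limit overrides it.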
Thanks & Regards,
Archit Thakur.
On Saturday, April 25, 2015, Wenlei Xie wrote:
Hi,
I checked the number of partitions with
System.out.println("INFO: RDD with " + rdd.partitions().size() + " partitions created.");
Each split is about 100 MB. I am currently loading the data from the
local file system; would this explain the observation?
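One possible explanation, sketched under assumptions rather than confirmed for this setup: Hadoop's local file system reports a default block size of 32 MB (the fs.local.block.size default, as far as I know), and FileInputFormat keeps cutting splits of that size while the remainder is more than 1.1 times the split size. The class name, the exact 100 MB size, and the count of four input pieces below are assumptions for illustration.

```java
// Hypothetical sketch: counts the splits a single file would get,
// following the loop used by Hadoop's FileInputFormat.getSplits.
final class SplitCountSketch {

    // Slack factor: the last split may be up to 10% larger than splitSize.
    private static final double SPLIT_SLOP = 1.1;

    static int countSplits(long fileSize, long splitSize) {
        int splits = 0;
        long remaining = fileSize;
        // Cut full-size splits while the remainder is clearly larger than one split.
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        // Whatever is left becomes the final split.
        if (remaining != 0) {
            splits++;
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long fileSize = 100 * mb;   // "about 100MB" in the thread; exact size assumed
        long splitSize = 32 * mb;   // assumed local file system block size
        int perFile = countSplits(fileSize, splitSize);
        System.out.println(perFile + " splits per piece, " + (4 * perFile) + " for 4 pieces");
    }
}
```

Under those assumptions each 100 MB piece yields 4 splits, so four pieces would give 16 partitions. A piece slightly under 99.2 MB would yield only 3, so the real sizes matter.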
Thank you!
Best,
Wenlei
On Tue,
Hi,
It should generate the same number of partitions as the number of splits.
How did you check the number of partitions? Also, please paste your file
size, hdfs-site.xml, and mapred-site.xml here.
Thanks and Regards,
Archit Thakur.
On Sat, Apr 18, 2015 at 6:20 PM, Wenlei Xie wrote:
Hi,
I am wondering what mechanism determines the number of partitions
created by SparkContext.sequenceFile.
For example, although my file has only 4 splits, Spark creates 16
partitions for it. Is it determined by the file size? Is there any way to
control it? (Looks like I can only tune