How big is your input dataset?
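
One possible reason the size matters, sketched as a toy model in plain Python rather than Spark's actual shuffle code: if records are handed over in batches and whole batches are dealt across the output partitions in turn, then a small result that fits in a single batch may land entirely in one partition and leave the other empty. The function names, the sample word counts, and the batch size of 1024 are all assumptions made up for illustration.

```python
# Toy model only: plain Python, not Spark's real implementation.

def round_robin(items, num_partitions):
    """Deal items into num_partitions lists, one at a time, in turn."""
    partitions = [[] for _ in range(num_partitions)]
    for i, item in enumerate(items):
        partitions[i % num_partitions].append(item)
    return partitions


def batched(records, batch_size):
    """Group records into consecutive batches of at most batch_size."""
    return [records[i:i + batch_size] for i in range(0, len(records), batch_size)]


# Hypothetical word counts standing in for the reduceByKey output.
counts = [("spark", 3), ("hdfs", 2), ("rdd", 1)]

# Assumption: records travel in batches; 1024 is an illustrative batch size.
batches = batched(counts, batch_size=1024)

# Count how many records end up in each of the 2 output partitions.
sizes = [sum(len(batch) for batch in part) for part in round_robin(batches, 2)]
print(sizes)  # [3, 0]
```

In this model, dealing out the three records individually would give partitions of 2 and 1 records, so the empty file would be an effect of the small input rather than of uneven work. On a real cluster, something like `rdd.glom().map(len).collect()` should show the actual number of records per partition.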

On Thursday, November 27, 2014, Praveen Sripati wrote:

> Hi,
> When I run the program below, I see two files in HDFS because the
> number of partitions is 2. But one of the files is empty. Why is that? Is
> the work not distributed equally across all the tasks?
> (textFile.flatMap(lambda line: line.split())
>     .map(lambda word: (word, 1))
>     .reduceByKey(lambda a, b: a + b)
>     .repartition(2)
>     .saveAsTextFile("hdfs://localhost:9000/user/praveen/output/"))
> Thanks,
> Praveen

- Rishi
