How big is your input dataset?

On Thursday, November 27, 2014, Praveen Sripati <praveensrip...@gmail.com>
wrote:

> Hi,
>
> When I run the program below, I see two files in HDFS because the
> number of partitions is 2. But one of the files is empty. Why is that? Is
> the work not distributed equally across all the tasks?
>
> (textFile.flatMap(lambda line: line.split())
>     .map(lambda word: (word, 1))
>     .reduceByKey(lambda a, b: a + b)
>     .repartition(2)
>     .saveAsTextFile("hdfs://localhost:9000/user/praveen/output/"))
>
> Thanks,
> Praveen
>


-- 
- Rishi
