I'm running a MapReduce job that uses HFileOutputFormat to create HFiles out of CSVs.

* The MapReduce job operates on 75 files, each containing 1 million rows.
The total comes to 16 GB. [With a replication factor of 2, the total DFS used
is 32 GB.]
* There are 300 map tasks.
* The map phase completes successfully.
* There are 3 slave nodes (each with a 145 GB hard disk), so I set
job.setNumReduceTasks(3), i.e. 3 reducers (see the driver sketch after this
list).
* When the reduce phase is about to finish, the disk space on all the slave
nodes runs out.
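For reference, the driver looks roughly like this. This is a simplified
sketch, not my exact code: the CsvToHFileMapper class, the "cf"/"q" column
family and qualifier, and the "rowkey,value" CSV layout are all placeholders.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.hbase.mapreduce.KeyValueSortReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CsvToHFiles {

  // Placeholder mapper: assumes "rowkey,value" CSV lines and writes each
  // row into column family "cf", qualifier "q" (all assumptions).
  public static class CsvToHFileMapper
      extends Mapper<LongWritable, Text, ImmutableBytesWritable, KeyValue> {
    @Override
    protected void map(LongWritable offset, Text line, Context ctx)
        throws IOException, InterruptedException {
      String[] fields = line.toString().split(",", 2);
      byte[] row = Bytes.toBytes(fields[0]);
      KeyValue kv = new KeyValue(row, Bytes.toBytes("cf"),
          Bytes.toBytes("q"), Bytes.toBytes(fields[1]));
      ctx.write(new ImmutableBytesWritable(row), kv);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "csv-to-hfiles");
    job.setJarByClass(CsvToHFiles.class);

    // Mapper emits (ImmutableBytesWritable rowKey, KeyValue) pairs.
    job.setMapperClass(CsvToHFileMapper.class);
    job.setInputFormatClass(TextInputFormat.class);
    job.setMapOutputKeyClass(ImmutableBytesWritable.class);
    job.setMapOutputValueClass(KeyValue.class);

    // Sorts the KeyValues for each row before they are written out.
    job.setReducerClass(KeyValueSortReducer.class);
    job.setNumReduceTasks(3); // one reducer per slave node

    job.setOutputFormatClass(HFileOutputFormat.class);
    job.setOutputKeyClass(ImmutableBytesWritable.class);
    job.setOutputValueClass(KeyValue.class);

    FileInputFormat.addInputPath(job, new Path(args[0]));   // CSV input dir
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // HFile output dir

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}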

I am confused. Why does my disk space run out during the reduce phase (in the
shuffle)?
