Hi, I have been doing a row count via MapReduce, and it's taking about 4-5 hours to count 500 million rows in a table. I was wondering if there are any MapReduce tunings I can do so it will go much faster.
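To put the current speed in numbers, here is a quick back-of-the-envelope sketch of the throughput those figures imply. The helper name `rows_per_second` is just something I made up for illustration, and the 500 million figure is approximate.

```python
def rows_per_second(total_rows: int, hours: float) -> float:
    """Average rows counted per second over the whole job."""
    return total_rows / (hours * 3600)


TOTAL_ROWS = 500_000_000  # approximate table size, as stated above

# The job takes somewhere between 4 and 5 hours, so compute both ends.
for hours in (4, 5):
    rate = rows_per_second(TOTAL_ROWS, hours)
    print(f"{hours} h -> about {rate:,.0f} rows/s across the whole job")
```

So the job as a whole is averaging somewhere around 28,000 to 35,000 rows per second.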
I have a 10-node cluster; each node has 8 CPUs and 64 GB of memory. Any tuning advice would be much appreciated.

--
Get your facts first, then you can distort them as you please.
