One of my task is to calculate some statistics from a very large amount of log 
files for our customers. We are trying out hadoop to solve this problem. 
>From what I can see, it is a perfect problem for hadoop designed to solve.

The mapper and reducer code are very straight ward. But when we try to run it 
on a two node cluster, it is surprisingly slow. It has been running for three 
hours and did not finish half of 250M log files. And, I am not seeing much disk 
or network usage.

I want to know if there is something I should check. (Maybe some 
configurations?)

Any help will be appreciated. If there is more information needed, let me know. 

Thanks in advance

       
---------------------------------
Ahhh...imagining that irresistible "new car" smell?
 Check outnew cars at Yahoo! Autos.

Reply via email to