Hi All,
I'm quite new to Pig/Hadoop. So maybe my cluster size will make you laugh.
I wrote a script on Pig handling 1.5GB of logs in less than one hour in
pig local mode on a Intel core 2 duo with 3GB of RAM.
Then I tried this script on a simple 2 nodes cluster. These 2 nodes are
not servers but simple computers:
- Intel core 2 duo with 3GB of RAM.
- Intel Quad with 4GB of RAM.
Well I was aware that hadoop has overhead and that it won't be done in
half an hour (time in local divided by number of nodes). But I was
surprised to see this morning it took 7 hours to complete!!!
My configuration was made according to this link:
http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Multi-Node_Cluster%29
My question is simple: Is it normal?
Cheers
Vincent