Ryan,

Thank you for the answer.

> I have found with my tests that 3 nodes is wholly insufficient.  I think it's
> causing me to hit the xciever limit sooner than I would if I was running 10+
> machines.  The issue is that with r=3 on HDFS and only 3 machines, you get
> reliability but no spreading of load.  I don't know how big the 'large EC2'
> instances are, but you might want to consider running more of the smaller
> ones for the same cost if possible.  You get better spread of load across
> machines, and that should increase overall performance.
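
To put the r=3 point in numbers, here is a rough sketch. It assumes
replicas are spread evenly across datanodes, which is an idealization
(real HDFS placement is rack-aware), and the function name is only
illustrative:

```python
def replica_share_per_node(num_nodes: int, replication: int = 3) -> float:
    """Fraction of all blocks each datanode holds a replica of.

    Assumes replicas are spread evenly across nodes (an idealization;
    this does not model HDFS's rack-aware placement policy).
    """
    if num_nodes < replication:
        raise ValueError("need at least as many nodes as replicas")
    return replication / num_nodes


for nodes in (3, 6, 10):
    share = replica_share_per_node(nodes)
    print(f"{nodes} nodes: each node stores {share:.0%} of all blocks")
```

With 3 nodes every node stores every block, so every write touches
all machines; with 10 nodes each one sees roughly a third of them.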

So far the performance of our cluster looks OK. We will see how it goes.

> Also, how is it running on EC2?  What happens when your machines go away?
> You have to rewrite and copy the config around, do you not?

No. We created our own images with a bootstrap script, which copies
the configs, etc.

> One last thing, the master is very important, but also takes the least
> load.  Running bigger iron for it seems pointless to me.  My master has a
> load average of 0.00 at all times, including when I am running intense
> import MR tasks that put a LA of 6+ on all my region server/datanode
> servers.

Thank you for your cooperation,
M.
