Ryan,
Thank you for the answer.
> I have found with my tests that 3 nodes is wholly insufficient. I think it's
> causing me to hit the xciever limit sooner than I would if I was running 10+
> machines. The issue is with r=3 on HDFS, and you have 3 machines, you get
> reliability but no spreading of load. I don't know how big the 'large EC2'
> instances are, but you might want to consider running more smaller ones for
> the same cost if possible. You get better spread of load across machines,
> and should increase overall performance.
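To put rough numbers on the point about r=3 and spreading load, here is a minimal sketch. It assumes replicas are placed uniformly on distinct nodes, which is a simplification (real HDFS placement is rack-aware), and the function name replica_share is made up for this illustration.

```python
def replica_share(num_nodes: int, replication: int = 3) -> float:
    """Fraction of all blocks for which a single node holds a replica.

    Assumes uniform placement on distinct nodes (a simplification of
    real HDFS placement). A node cannot hold two replicas of the same
    block, so the effective replication is capped at num_nodes.
    """
    if num_nodes < 1 or replication < 1:
        raise ValueError("num_nodes and replication must be >= 1")
    effective = min(replication, num_nodes)
    return effective / num_nodes


if __name__ == "__main__":
    # Compare a 3-node cluster with larger ones at r=3.
    for n in (3, 5, 10, 20):
        print(f"{n:>2} nodes: each node holds ~{replica_share(n):.0%} of blocks")
```

Under that assumption, with 3 nodes and r=3 every node stores a replica of every block and so takes part in every write, while with 10 nodes each one sees roughly 30% of them.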
The performance of our cluster looks OK so far. We will see how it goes.
> Also, how is it running on EC2? What happens when your machines go away?
> You have to rewrite and copy the config around, do you not?
No. We created our own images with a bootstrap script, which copies
the configs, etc.
> One last thing, the master is very important, but also takes the least
> load. Running bigger iron for it seems pointless to me. My master has a
> load average of 0.00 at all times, including when I am running intense
> import MR tasks that put a LA of 6+ on all my region server/datanode
> servers.
Thank you for your cooperation,
M.