Here is an interesting anecdote.  I had regionservers running on each node
of an 8-node Hadoop cluster.  Yesterday morning, I ran a series of MR jobs
where the last MR job does batched inserts into a production MySQL server.
All other MR jobs have 3 mappers and 3 reducers running on each node.  The
db job has 3 mappers and 1 reducer on each node.  The regionservers stayed
up until the reducers of the db job started running, then the heartbeats to
ZooKeeper were lost and they all went down.

It looks to be network or swap related.  I will dig into the cacti results
more.
