This might be better on the user list? Anyway... How many IPC handlers have you configured? m1.xlarge has very little CPU: not only does it have just 4 cores (more cores allow more concurrent threads with less context switching), but those cores are severely underpowered. I would recommend at least c1.xlarge, which is only a bit more expensive. If you happen to be doing heavy GC, with 1-2 compactions running and many writes incoming, you will quickly use up quite a bit of CPU. What are the load and CPU usage on 10.38.106.234:50010?
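
If you want to double-check what the region server is actually picking up, something like this should print it (an untested sketch; it assumes I mean hbase.regionserver.handler.count, that your hbase-site.xml is on the classpath, and the fallback value is just a placeholder, not your release's default):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;

  public class HandlerCountCheck {
    public static void main(String[] args) {
      // Loads hbase-default.xml and hbase-site.xml from the classpath
      Configuration conf = HBaseConfiguration.create();
      // The second argument is only a fallback; check your release for the real default
      int handlers = conf.getInt("hbase.regionserver.handler.count", 10);
      System.out.println("hbase.regionserver.handler.count = " + handlers);
    }
  }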
Did you see anything about blocking updates in the HBase logs? How much memstore are you giving? (See the config sketch below the quoted mail.)

On Thu, Jan 16, 2014 at 1:17 PM, Andrew Purtell <[email protected]> wrote:

> On Wed, Jan 15, 2014 at 5:32 PM,
> Vladimir Rodionov <[email protected]> wrote:
>
> > Yes, I am using ephemeral (local) storage. I found that iostat is most of
> > the time idle on 3K load with periodic bursts up to 10% iowait.
> >
>
> Ok, sounds like the problem is higher up the stack.
>
> I see in later emails on this thread a log snippet that shows an issue with
> the WAL writer pipeline, one of the datanodes is slow, sick, or partially
> unreachable. If you have uneven point to point ping times among your
> cluster instances, or periodic loss, it might still be AWS's fault,
> otherwise I wonder why the DFSClient says a datanode is sick.
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>
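
Regarding the memstore question above: updates get blocked per region once a memstore grows past the flush size times the block multiplier, and throttled globally once all memstores exceed the upper-limit fraction of the heap. A rough, untested sketch of how those settings resolve (the property names are the older ones and the fallback values are only placeholders; check your release's defaults):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;

  public class MemstoreLimits {
    public static void main(String[] args) {
      Configuration conf = HBaseConfiguration.create();
      // Per-region flush trigger and the multiplier at which updates get blocked
      long flushSize = conf.getLong("hbase.hregion.memstore.flush.size", 128L * 1024 * 1024);
      long multiplier = conf.getLong("hbase.hregion.memstore.block.multiplier", 2L);
      // Fraction of the RS heap all memstores may use before writes are throttled
      float upperLimit = conf.getFloat("hbase.regionserver.global.memstore.upperLimit", 0.4f);
      System.out.println("per-region blocking threshold (bytes) = " + (flushSize * multiplier));
      System.out.println("global memstore upper limit (fraction of heap) = " + upperLimit);
    }
  }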
