Without metrics on what the cluster is doing during the stall, it's hard to say for sure. I would turn on GC logging to rule out garbage collection pauses, collect IO metrics while the stall is happening, and look through the logs to see what they report at the moment throughput drops.
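Once GC logging is on, a small script can tell you quickly whether any collection lines up with the stalls. The sketch below is only an illustration and rests on assumptions: it assumes a HotSpot JVM started with -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<file>, which makes each GC event end with a "[Times: user=... sys=..., real=... secs]" entry. The function name long_pauses and the 1 second threshold are made up for this example.

```python
import re

# Assumed log format: HotSpot with -XX:+PrintGCDetails ends each GC event with
# an entry such as "[Times: user=0.13 sys=0.01, real=0.03 secs]".
REAL_TIME = re.compile(r"real=(\d+(?:\.\d+)?) secs")


def long_pauses(lines, threshold_secs=1.0):
    """Return (line_number, seconds, text) for GC events at or above the threshold.

    long_pauses and threshold_secs are hypothetical names used for this sketch.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        match = REAL_TIME.search(line)
        if match is None:
            continue  # not a GC timing line
        seconds = float(match.group(1))
        if seconds >= threshold_secs:
            hits.append((number, seconds, line.strip()))
    return hits
```

To use it, pass an open log file (for example long_pauses(open("gc-regionserver.log")), where the file name is a placeholder) and compare the timestamps of the reported lines with the times your client sees 0 inserts/sec. Note that "real" is the wall-clock time of the logged phase; with a concurrent collector not every such phase stops the application, so treat the hits as candidates to look at rather than proof.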
On Wed, Jun 20, 2012 at 6:39 AM, Martin Alig <[email protected]> wrote:
> Hi
>
> I'm doing some evaluations with HBase. The workload I'm facing is mainly
> insert-only.
> Currently I'm inserting 1KB rows, where 100Bytes go into one column.
>
> I have the following cluster machines at disposal:
>
> Intel Xeon L5520 2.26 Ghz (Nehalem, with HT enabled)
> 24 GiB Memory
> 1 GigE
> 2x 15k RPM Sas 73 GB (RAID1)
>
> I have 10 Nodes.
> The first node runs:
>
> Namenode, SecondaryNamenode, Datanode, HMaster, Zookeeper, and a
> RegionServer
>
> The other nodes run:
>
> Datanode and RegionServer
>
>
> Now running my test client and inserting rows, the throughput goes up to
> 150'000 inserts/sec. But then after some time the throughput drops down to
> 0 inserts/sec for quite some time, before it goes up again.
> My assumption is, that it happens when the RegionServers start to write the
> data from memory to the disks. I know, that the recommended hardware for
> HBase should contain multiple disks using JBOD or RAID 0.
> But at that point I am limited right now.
>
> I am just asking if in my hardware setup, the blocking periods are really
> caused by the non-optimal disk configuration.
>
>
> Thank you in advance for any suggestions.
>
>
> Martin
>
