Vinay: Can you get a jstack of the 2 region servers during load and pastebin them?
Thanks

On Fri, Oct 25, 2013 at 7:53 AM, Vinay Kashyap <[email protected]> wrote:

> Hi Lars,
>
> Yes, I understand that it is not advisable to configure memstore with such
> a huge value, but I wanted to test HBase for the scalability of HBase when
> data is completely in memory and also I want to avoid disk access as I have
> a single disk configured with RAID-1, which is not optimized for HDFS.
>
> I have disabled write to WAL also.
>
> A few more observations:
> 1. Out of 25 region servers, 23 region servers serve around 60K ops till
> completion, but only 2 region servers start with not more than 17k ops.
> 2. These 2 region servers end up with less than 100 ops at the final stages,
> dragging the overall time taken to a bigger value (say 1500 seconds) where
> all the other region servers are finished (say in 200 seconds).
> 3. I verified no other processes are running in these 2 region servers to
> put the load.
> 4. If the number of regions is increased, say table precreated with 50,
> 100, 200 etc., the load on these region servers is reduced and they serve
> more requests. (But still with a notable difference from the other region
> servers.)
>
> So, I wanted to understand what extra work the 2 region servers are packed
> up with to see a reduced performance like this.
>
>
> Thanks and regards
> Vinay Kashyap
>
>
> On Fri, Oct 25, 2013 at 2:16 PM, lars hofhansl <[email protected]> wrote:
>
> > No this is different.
> >
> > All your data is in the memstore still.
> >
> > The memstore is organized as a skip list, nobody has ever tested that with
> > 72gb. 256mb, 512mb, 1gb, sure... 72gb... no way.
> > Same with a 96gb of java heap. Not with Oracle or OpenJDK and an
> > application specifically for such large heaps.
> > I would keep it under 30gb.
> >
> >
> > I think what you want is the following:
> > 1. disable WAL writes (you don't care if you lose data)
> > 2. lower your memstore size, so you'll see some flushes and eventually
> > compactions.
> > 3. Don't give the JVM more than 30g or so
> > 4. Flush your memstores to disk. They'll end up in the block cache that way
> >
> > Currently we can't fill the block cache without flushing to disk.
> >
> > Maybe HBase is not the right solution. If you need a large ephemeral
> > in-memory store, maybe look at memcache?
> >
> >
> > -- Lars
> >
> >
> > ________________________________
> > From: Vinay Kashyap <[email protected]>
> > To: [email protected]
> > Sent: Wednesday, October 23, 2013 5:57 PM
> > Subject: Fwd: High CPU utilization in few Region servers during read
> >
> >
> > From the thread dump it looks like many threads are stuck at
> >
> > org.apache.hadoop.hbase.KeyValue$KVComparator.compare(KeyValue.java:1535)
> > org.apache.hadoop.hbase.KeyValue$KVComparator.compare(KeyValue.java:1523)
> > java.util.concurrent.ConcurrentSkipListMap$ComparableUsingComparator.compareTo(ConcurrentSkipListMap.java:647)
> > java.util.concurrent.ConcurrentSkipListMap.findNear(ConcurrentSkipListMap.java:1346)
> >
> > Is this similar to the HBASE-9428 issue?
> >
> > Waiting for some help regarding this.. :)
> >
> >
> > Thanks and regards
> > Vinay S Kashyap
> >
> >
> > ---------- Forwarded message ----------
> > From: Vinay Kashyap <[email protected]>
> > Date: Tue, Oct 22, 2013 at 8:47 PM
> > Subject: High CPU utilization in few Region servers during read
> > To: [email protected]
> >
> >
> > Hi,
> >
> > I am running HBase 0.94.6 (cdh-4.4.0) with 25 region servers.
> > I am testing a scenario to read and write only from/to RAM.
> >
> > I have the following settings
> > Table precreated with 25 regions.
> > HFile size - 48 GB
> > MemStore size - 72 GB
> > Heap size - 96 GB
> >
> > These settings are to avoid any flushes to the disk. Data need not be
> > persisted.
> >
> > I am able to achieve a load throughput of 75K ops per region server.
> >
> > While reading, 23 region servers are serving requests with throughput of 55K
> > ops, but randomly 2 of the region servers always end up serving a few 100
> > ops.
> >
> > In these 2 region servers the CPU usage is very high and close to 100%
> > continuously, bringing down the overall throughput. I did not observe any
> > long GC pauses in this time.
> >
> > I also tried applying the patch for the HBASE-9428 issue, but still faced the
> > same problem.
> > Thread dump for the affected region server is at
> > http://pastebin.com/JGx9gXnm
> >
> > Any hints on how to solve this?
> >
> > Thanks and regards
> > Vinay S Kashyap
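
For anyone reading the thread dump quoted above: the frames show reads going through ConcurrentSkipListMap.findNear and calling the KeyValue comparator. Below is a minimal sketch, not HBase code, that counts comparator calls per lookup in a plain java.util.concurrent.ConcurrentSkipListMap as the map grows. The class name SkipListCompareCount, the Long keys, and the map sizes are all assumptions made for illustration. A real memstore compares KeyValue byte arrays, which costs far more per call than Long.compare.

```java
import java.util.Comparator;
import java.util.Random;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo class; it only illustrates skip-list lookup cost.
final class SkipListCompareCount {

    /** Average number of comparator calls per ceilingKey() lookup in a map of n keys. */
    public static double comparisonsPerLookup(int n, int lookups) {
        AtomicLong calls = new AtomicLong();
        // Comparator that counts every invocation before delegating.
        Comparator<Long> counting = (a, b) -> {
            calls.incrementAndGet();
            return Long.compare(a, b);
        };
        ConcurrentSkipListMap<Long, Long> map = new ConcurrentSkipListMap<>(counting);
        for (long i = 0; i < n; i++) {
            map.put(i * 2, i); // even keys only, so about half the lookups miss
        }
        calls.set(0); // ignore comparisons made while loading
        Random rnd = new Random(42);
        for (int i = 0; i < lookups; i++) {
            map.ceilingKey((long) rnd.nextInt(2 * n));
        }
        return (double) calls.get() / lookups;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1 << 8, 1 << 13, 1 << 18}) {
            System.out.printf("%,d entries: %.1f comparisons per lookup%n",
                    n, comparisonsPerLookup(n, 100_000));
        }
    }
}
```

The count should grow roughly with the logarithm of the number of entries, so every read against a very large, never-flushed memstore pays for a long chain of key comparisons. This does not reproduce the 100% CPU seen on the two region servers, and the thread does not confirm a root cause. It only shows where the comparator frames in the dump come from.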
