subject:"High cpu usage on a region server"

Re: High cpu usage on a region server

2013-09-15 Thread OpenSource Dev

We patched HBase 0.94.6 with HBASE-9428, and now the difference is like
night and day.
Read latency has been very consistent, and we haven't seen any CPU load
issues in the last 24+ hrs.

Thank you all for helping us resolve this issue.

Bikrant

On Thu, Sep 12, 2013 at 10:25 AM, lars hofhansl la...@apache.org wrote:
Not that I am aware of. Reducing the HFile block size will lessen this problem
(but then cause other issues).

It's just a fix to the RegexStringComparator. You can just recompile that and
deploy it to the RegionServers (you need to make sure it's in the class path
before the HBase jars).
Probably easier to roll a new release. It's a shame we did not see this
earlier.

-- Lars

Re: High cpu usage on a region server

2013-09-15 Thread lars hofhansl

Thanks for reporting back, Bikrant. Glad that turned out to be the issue.

From: OpenSource Dev dev.opensou...@gmail.com
To: user@hbase.apache.org; lars hofhansl la...@apache.org 
Sent: Saturday, September 14, 2013 11:21 PM
Subject: Re: High cpu usage on a region server

We patched HBase 0.94.6 with HBASE-9428, and now the difference is like
night and day.
Read latency has been very consistent, and we haven't seen any CPU load
issues in the last 24+ hrs.

Thank you all for helping us resolve this issue.

Bikrant

Re: High cpu usage on a region server

2013-09-12 Thread lars hofhansl

Yep... Very likely HBASE-9428:

8 threads:
   java.lang.Thread.State: RUNNABLE
    at java.util.Arrays.copyOf(Arrays.java:2786)
    at java.lang.StringCoding.decode(StringCoding.java:178)
    at java.lang.String.<init>(String.java:483)
    at org.apache.hadoop.hbase.filter.RegexStringComparator.compareTo(RegexStringComparator.java:96)
    ...

4 threads:
   java.lang.Thread.State: RUNNABLE
    at sun.nio.cs.ISO_8859_1$Decoder.decodeArrayLoop(ISO_8859_1.java:79)
    at sun.nio.cs.ISO_8859_1$Decoder.decodeLoop(ISO_8859_1.java:106)
    at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:544)
    at java.lang.StringCoding$StringDecoder.decode(StringCoding.java:140)
    at java.lang.StringCoding.decode(StringCoding.java:179)
    at java.lang.String.<init>(String.java:483)
    at org.apache.hadoop.hbase.filter.RegexStringComparator.compareTo(RegexStringComparator.java:96)

It's also consistent with what you see: Lots of garbage (hence tweaking your GC 
options had a significant effect)
The fix is in 0.94.12, which is in RC right now, probably to be released early 
next week.

-- Lars
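
For readers following along, here is a minimal Java sketch of the allocation pattern visible in the stack traces above: every comparison builds a fresh String from bytes through an ISO-8859-1 decode and then runs the regex. This is not the HBase source. The class and method names (RegexCompareSketch, compareDecodingWholeArray, compareDecodingSlice) and the 64 KB buffer are hypothetical, and the idea that the cost tracks the size of the decoded buffer is an assumption suggested by the block-size remark elsewhere in this thread, not something the thread confirms.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration only; NOT the HBase RegexStringComparator.
class RegexCompareSketch {
    // ISO-8859-1 maps one byte to one char, so byte offsets equal char offsets.
    static final Charset CHARSET = StandardCharsets.ISO_8859_1;

    // Decodes the entire backing array on every call, then cuts out the slice.
    // The garbage produced per comparison grows with backing.length.
    static int compareDecodingWholeArray(Pattern p, byte[] backing, int offset, int length) {
        String all = new String(backing, CHARSET);
        return p.matcher(all.substring(offset, offset + length)).find() ? 0 : 1;
    }

    // Decodes only the bytes that are actually being compared.
    static int compareDecodingSlice(Pattern p, byte[] backing, int offset, int length) {
        return p.matcher(new String(backing, offset, length, CHARSET)).find() ? 0 : 1;
    }

    public static void main(String[] args) {
        byte[] block = new byte[64 * 1024];            // assumed block-sized buffer
        byte[] value = "host=web01".getBytes(CHARSET); // made-up cell value
        System.arraycopy(value, 0, block, 1000, value.length);
        Pattern p = Pattern.compile("web\\d+");

        // Same result either way; only the amount of decoding differs.
        System.out.println(compareDecodingWholeArray(p, block, 1000, value.length));
        System.out.println(compareDecodingSlice(p, block, 1000, value.length));
    }
}
```

Both variants return the same answer (0 for a match, 1 otherwise). The difference is how many bytes are decoded and copied per call, which is where the garbage would come from under the stated assumption.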




 From: OpenSource Dev dev.opensou...@gmail.com
To: user@hbase.apache.org 
Sent: Thursday, September 12, 2013 8:15 AM
Subject: Re: High cpu usage on a region server
 

A server started getting busy last night, but this time it took ~5 hrs
to get from 15% busy to 75% busy. It is not running 80% flat-out yet,
but this is still very high compared to other servers, which are running
under ~25% CPU usage. The only change I made yesterday was the
addition of -XX:+UseParNewGC to the HBase startup command.

http://pastebin.com/VRmujgyH

Re: High cpu usage on a region server

2013-09-12 Thread Jean-Daniel Cryans

Or roll back to CDH 4.2's HBase. They are fully compatible.

J-D

On Thu, Sep 12, 2013 at 10:25 AM, lars hofhansl la...@apache.org wrote:

Not that I am aware of. Reducing the HFile block size will lessen this
problem (but then cause other issues).

-- Lars

Re: High cpu usage on a region server

2013-09-12 Thread lars hofhansl

Not that I am aware of. Reducing the HFile block size will lessen this problem
(but then cause other issues).

It's just a fix to the RegexStringComparator. You can just recompile that and
deploy it to the RegionServers (you need to make sure it's in the class path
before the HBase jars).
Probably easier to roll a new release. It's a shame we did not see this earlier.


-- Lars




 From: OpenSource Dev dev.opensou...@gmail.com
To: user@hbase.apache.org; lars hofhansl la...@apache.org 
Sent: Thursday, September 12, 2013 9:52 AM
Subject: Re: High cpu usage on a region server
 

Thanks Lars.

Are there any other workarounds for this issue until we get the fix?
If not, we might have to apply the patch and roll out a custom package.

Re: High cpu usage on a region server

2013-09-12 Thread OpenSource Dev

Thanks Lars.

Are there any other workarounds for this issue until we get the fix?
If not, we might have to apply the patch and roll out a custom package.

On Thu, Sep 12, 2013 at 8:36 AM, lars hofhansl la...@apache.org wrote:
Yep... Very likely HBASE-9428:

-- Lars

Re: High cpu usage on a region server

2013-09-12 Thread OpenSource Dev

A server started getting busy last night, but this time it took ~5 hrs
to get from 15% busy to 75% busy. It is not running 80% flat-out yet,
but this is still very high compared to other servers, which are running
under ~25% CPU usage. The only change I made yesterday was the
addition of -XX:+UseParNewGC to the HBase startup command.

http://pastebin.com/VRmujgyH

On Wed, Sep 11, 2013 at 2:28 PM, Stack st...@duboce.net wrote:
 Can you thread dump the busy server and pastebin it?
 Thanks,
 St.Ack


Re: High cpu usage on a region server

2013-09-11 Thread lars hofhansl

You might have run into HBASE-9428

-- Lars

Re: High cpu usage on a region server

2013-09-11 Thread Ted Yu

Have you turned on short-circuit reads?

Cheers


Re: High cpu usage on a region server

2013-09-11 Thread Stack

Can you thread dump the busy server and pastebin it?
Thanks,
St.Ack
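
Once a dump is captured (jstack against the region server process is the usual tool), it helps to tally how many RUNNABLE threads share a given frame. The sketch below does that over jstack-style text. The class name ThreadDumpTally, the method name, and the sample thread names are made up for illustration, and the code assumes thread entries are separated by blank lines.

```java
// Hypothetical helper for reading a saved thread dump; not part of HBase or the JDK.
class ThreadDumpTally {
    // Counts thread entries that are RUNNABLE and mention frameText anywhere in their stack.
    static int countRunnableWithFrame(String dump, String frameText) {
        int count = 0;
        // Assumption: one thread entry per block, blocks separated by blank lines.
        for (String block : dump.split("\\r?\\n\\s*\\r?\\n")) {
            if (block.contains("java.lang.Thread.State: RUNNABLE") && block.contains(frameText)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Tiny made-up sample in jstack's layout; a real dump would be read from a file.
        String dump =
              "\"handler-1\" daemon prio=10\n"
            + "   java.lang.Thread.State: RUNNABLE\n"
            + "\tat java.lang.String.<init>(String.java:483)\n"
            + "\tat org.apache.hadoop.hbase.filter.RegexStringComparator.compareTo(RegexStringComparator.java:96)\n"
            + "\n"
            + "\"handler-2\" daemon prio=10\n"
            + "   java.lang.Thread.State: WAITING (parking)\n"
            + "\tat sun.misc.Unsafe.park(Native Method)\n";
        System.out.println(countRunnableWithFrame(dump, "RegexStringComparator.compareTo"));
    }
}
```

A high count for a single frame on the busy server, compared with a quiet one, points at where the CPU time is going.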


Re: High cpu usage on a region server

2013-09-11 Thread OpenSource Dev

No, dfs.client.read.shortcircuit is set to false by default in our cluster.

Looks like this is a good performance improvement parameter. Are there
any side effects of turning it on?

Thx

On Wed, Sep 11, 2013 at 1:57 PM, Ted Yu yuzhih...@gmail.com wrote:
 Have you turned on short-circuit reads?

 Cheers


High cpu usage on a region server

2013-09-11 Thread OpenSource Dev

Hi,

I'm using HBase 0.94.6 (CDH 4.3) for OpenTSDB. So far I have had no
issues with writes/puts. The system handles up to 800k puts per second
without issue. On average we do 250k puts per second.

I am having a problem with reads. I've isolated where the
problem is but have not been able to find the root cause.

I have 16 machines running HBase region servers, each with ~35 regions.
Once in a while CPU goes flat-out at 80% on 1 region server. These are the
things I've noticed in Ganglia:

hbase.regionserver.request - evenly distributed. Not seeing any spikes
on the busy server
hbase.regionserver.blockCacheSize - between 500MB and 1000MB
hbase.regionserver.compactionQueueSize - avg 2 or less
hbase.regionserver.blockCacheHitRatio - 30% on busy node, 60% on other nodes


JVM Heap size is set to 16GB and I'm using -XX:+UseParNewGC
-XX:+UseConcMarkSweepGC

I've noticed the system load moves to a different region server, sometimes
within a minute, if the busy region server is restarted.

Any suggestions on what could be causing the load and/or what other
metrics I should check?


Thank you!

Re: High cpu usage on a region server

2013-09-11 Thread OpenSource Dev

Load has not gone up in the last 5 hrs :)
Will get the dump if it goes up again.

thx

On Wed, Sep 11, 2013 at 2:28 PM, Stack st...@duboce.net wrote:
 Can you thread dump the busy server and pastebin it?
 Thanks,
 St.Ack


Re: High cpu usage on a region server

2013-09-11 Thread OpenSource Dev

Hi Lars,

All the read & write requests are equally distributed across all region servers.

If it is caused by the HBASE-9428 bug, any idea why it would impact
only 1 region server at a given time?

Thx


On Wed, Sep 11, 2013 at 1:55 PM, lars hofhansl la...@apache.org wrote:
 You might have run into HBASE-9428

 -- Lars



 
Re: High cpu usage on a region server

2013-09-11 Thread lars hofhansl

It might be a larger scan (maybe gathering many data points for a metric)
hitting many regions. In that case you'd see only a single region server being
busy at a given time, since HBase scans only one region at a time for a single
client scan.


A thread dump would give us a better idea. J-D specifically mentions OpenTSDB
in that JIRA.


-- Lars
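
The region-at-a-time behaviour described above can be pictured with a toy model. The sketch below is only an illustration and not HBase client code: regions are fixed-width key ranges assigned round-robin to servers, and the names SequentialScanSketch and serversVisited, the integer keys, and the placement scheme are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model for illustration only; real HBase region boundaries and placement differ.
class SequentialScanSketch {
    // Assumed layout: region i covers keys [i*regionWidth, (i+1)*regionWidth)
    // and is hosted on server (i % servers).
    // Returns the servers a single scan over [startKey, stopKey) touches, in order.
    static List<Integer> serversVisited(int startKey, int stopKey, int regionWidth, int servers) {
        List<Integer> order = new ArrayList<>();
        if (stopKey <= startKey) {
            return order; // empty range, nothing to scan
        }
        // The scan walks regions one after another, never several at once.
        for (int region = startKey / regionWidth; region * regionWidth < stopKey; region++) {
            order.add(region % servers);
        }
        return order;
    }

    public static void main(String[] args) {
        // 16 servers, regions 100 keys wide, scan over keys 0..349.
        System.out.println(serversVisited(0, 350, 100, 16));
    }
}
```

With 16 servers and a scan spanning four regions, the model visits servers 0, 1, 2 and 3 one after another, so at any moment only one of them is doing the work for that scan.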




 From: OpenSource Dev dev.opensou...@gmail.com
To: user@hbase.apache.org; lars hofhansl la...@apache.org 
Sent: Wednesday, September 11, 2013 8:59 PM
Subject: Re: High cpu usage on a region server
 

Hi Lars,

All the read & write requests are equally distributed across all region servers.

If it is caused by the HBASE-9428 bug, any idea why it would impact
only 1 region server at a given time?

Thx

