Re: High iowait in idle hbase cluster

2015-09-07 Thread Akmal Abbasov
What configs are used to tune the run frequency of the block scanner? Or what event triggers it to run? Thanks. > On 07 Sep 2015, at 15:17, Ted Yu wrote: > > W.r.t. Upgrade, this thread may be of interest to you: > > http://search-hadoop.com/m/uOzYt48qItawnLv1 >

Re: High iowait in idle hbase cluster

2015-09-07 Thread Ted Yu
W.r.t. Upgrade, this thread may be of interest to you: http://search-hadoop.com/m/uOzYt48qItawnLv1 > On Sep 7, 2015, at 5:15 AM, Akmal Abbasov wrote: > > While looking into this problem, I found that I have large > dncp_block_verification.log.curr and dncp_block_verification.log.prev files.

Re: High iowait in idle hbase cluster

2015-09-07 Thread Akmal Abbasov
While looking into this problem, I found that I have large dncp_block_verification.log.curr and dncp_block_verification.log.prev files. They are 294G each on the node which has high IOWAIT, even when the cluster was almost idle. The other nodes have 0 for dncp_block_verification.log.curr, and <1
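To compare these file sizes across nodes, a small script can walk a datanode data directory and list the block verification logs it finds. This is only a sketch: the function name `find_verification_logs` and the `min_bytes` parameter are made up for illustration, and the directory to scan is assumed to be whatever `dfs.datanode.data.dir` points to on each node.

```python
import os

# File name prefix taken from the message above; matches both the
# .curr and .prev variants.
LOG_PREFIX = "dncp_block_verification.log"


def find_verification_logs(data_dir, min_bytes=0):
    """Return (path, size_in_bytes) pairs for block verification logs.

    data_dir is assumed to be a datanode data directory (hypothetical
    input, e.g. the value of dfs.datanode.data.dir). Results are sorted
    largest first and filtered to files of at least min_bytes.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(data_dir):
        for name in filenames:
            if not name.startswith(LOG_PREFIX):
                continue
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # The file may be rotated or removed while we walk.
                continue
            if size >= min_bytes:
                found.append((path, size))
    return sorted(found, key=lambda item: item[1], reverse=True)
```

Running it with `min_bytes` set to something like `10 * 1024**3` on every datanode would show quickly whether only the high-iowait node carries oversized logs.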

Re: High iowait in idle hbase cluster

2015-09-04 Thread Adrien Mogenet
What is your disk configuration? JBOD? If RAID, possibly a dysfunctional RAID controller, or a constantly-rebuilding array. Do you have any idea which files the blocks being read belong to? On 4 September 2015 at 11:02, Akmal Abbasov wrote: > Hi Adrien, > for the last 24 hours all RS are up and r

Re: High iowait in idle hbase cluster

2015-09-04 Thread Akmal Abbasov
Hi Adrien, for the last 24 hours all RS are up and running. There were no region transitions. The overall cluster iowait has decreased, but 2 RS still have very high iowait, while there is no load on the cluster. My assumption about the high number of HDFS_READ/HDFS_WRITE in RS logs has failed,

Re: High iowait in idle hbase cluster

2015-09-03 Thread Adrien Mogenet
Is the uptime of RS "normal"? No quick and global reboot that could lead to a region-reallocation storm? On 3 September 2015 at 18:42, Akmal Abbasov wrote: > Hi Adrien, > I’ve tried to run hdfs fsck and hbase hbck, and hdfs is healthy, also > hbase is consistent. > I’m using default value of

Re: High iowait in idle hbase cluster

2015-09-03 Thread Akmal Abbasov
Hi Adrien, I’ve tried to run hdfs fsck and hbase hbck, and hdfs is healthy, also hbase is consistent. I’m using the default value of the replication, so it is 3. There are some under replicated HBase master (node 10.10.8.55) is reading constantly from regionservers. Only today, it sent >150.000 HDFS_

Re: High iowait in idle hbase cluster

2015-09-03 Thread Adrien Mogenet
Is your HDFS healthy (fsck /)? Same for hbase hbck? What's your replication level? Can you see constant network use as well? Anything that might be triggered by the HBase master? (something like a virtually dead RS, due to ZK race-condition, etc.) Your 3-weeks-ago balancer shouldn't have any ef

Re: High iowait in idle hbase cluster

2015-09-03 Thread Akmal Abbasov
I started the HDFS balancer, but stopped it immediately after learning that it is not a good idea. That was around 3 weeks ago; is it possible that it had an influence on the cluster behaviour I’m seeing now? Thanks. > On 03 Sep 2015, at 14:23, Akmal Abbasov wrote: > > Hi Ted, > No there

Re: High iowait in idle hbase cluster

2015-09-03 Thread Akmal Abbasov
Hi Ted, No, there is no short-circuit read configured. The datanode logs on 10.10.8.55 are full of the following messages: 2015-09-03 12:03:56,324 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.10.8.55:50010, dest: /10.10.8.53:58622, bytes: 77, op: HDFS_READ, cliID:
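To see which peers generate most of this traffic, the clienttrace lines can be tallied per operation and address pair. The sketch below parses only the fields visible in the quoted line (src, dest, bytes, op); the real log line continues after `cliID:` and those fields are ignored. The names `summarize_clienttrace` and `SAMPLE_LINE` are illustrative, not part of Hadoop.

```python
import re
from collections import Counter

# The quoted log line, kept exactly as it appears above (it is cut off
# after "cliID:" in the archive).
SAMPLE_LINE = (
    "2015-09-03 12:03:56,324 INFO "
    "org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: "
    "src: /10.10.8.55:50010, dest: /10.10.8.53:58622, "
    "bytes: 77, op: HDFS_READ, cliID:"
)

# Assumes the field order shown in SAMPLE_LINE; ports are dropped so that
# connections from the same host are grouped together.
CLIENTTRACE_RE = re.compile(
    r"clienttrace: src: /(?P<src>[\d.]+):\d+, "
    r"dest: /(?P<dest>[\d.]+):\d+, "
    r"bytes: (?P<bytes>\d+), op: (?P<op>\w+)"
)


def summarize_clienttrace(lines):
    """Count operations and sum bytes per (op, src, dest) triple."""
    counts = Counter()
    total_bytes = Counter()
    for line in lines:
        match = CLIENTTRACE_RE.search(line)
        if match is None:
            continue  # not a clienttrace line, or a different layout
        key = (match.group("op"), match.group("src"), match.group("dest"))
        counts[key] += 1
        total_bytes[key] += int(match.group("bytes"))
    return counts, total_bytes
```

Passing an open datanode log file as `lines` and printing `counts.most_common(10)` would show the busiest op/src/dest combinations, which helps tell whether the reads come from a region server, the master, or the datanode itself.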

Re: High iowait in idle hbase cluster

2015-09-02 Thread Ted Yu
I assume you have enabled short-circuit read. Can you capture region server stack trace(s) and pastebin them ? Thanks On Wed, Sep 2, 2015 at 12:11 PM, Akmal Abbasov wrote: > Hi Ted, > I’ve checked the time when addresses were changed, and this strange > behaviour started weeks before it. > > y

Re: High iowait in idle hbase cluster

2015-09-02 Thread Akmal Abbasov
Hi Ted, I’ve checked the time when the addresses were changed, and this strange behaviour started weeks before that. Yes, 10.10.8.55 is a region server and 10.10.8.54 is an HBase master. Any thoughts? Thanks > On 02 Sep 2015, at 18:45, Ted Yu wrote: > > bq. change the ip addresses of the cluster nodes

Re: High iowait in idle hbase cluster

2015-09-02 Thread Ted Yu
bq. change the ip addresses of the cluster nodes Did this happen recently ? If high iowait was observed after the change (you can look at ganglia graph), there is a chance that the change was related. BTW I assume 10.10.8.55 is where your region server resides. Cheers

Re: High iowait in idle hbase cluster

2015-09-02 Thread Akmal Abbasov
Hi Ted, sorry, I forgot to mention > release of hbase / hadoop you're using hbase hbase-0.98.7-hadoop2, hadoop hadoop-2.5.1 > were region servers doing compaction ? I ran major compactions manually earlier today, but it seems that they have already completed, looking at the compactionQueueSize. > ha

Re: High iowait in idle hbase cluster

2015-09-02 Thread Ted Yu
Please provide some more information: release of hbase / hadoop you're using were region servers doing compaction ? have you checked region server logs ? Thanks On Wed, Sep 2, 2015 at 9:11 AM, Akmal Abbasov wrote: > Hi, > I’m having strange behaviour in hbase cluster. It is almost idle, only <

High iowait in idle hbase cluster

2015-09-02 Thread Akmal Abbasov
Hi, I’m seeing strange behaviour in an HBase cluster. It is almost idle, with only <5 puts and gets. But the data in HDFS is increasing, and the region servers have very high iowait (>100, on a 2-core CPU). iotop shows that the datanode process is reading and writing all the time. Any suggestions? Thanks.