Hi Ted,

No, there is no short-circuit read configured.
The datanode logs on 10.10.8.55 are full of messages like the following:

2015-09-03 12:03:56,324 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.10.8.55:50010, dest: /10.10.8.53:58622, bytes: 77, op: HDFS_READ, cliID: DFSClient_NONMAPREDUCE_-483065515_1, offset: 0, srvID: ee7d0634-89a3-4ada-a8ad-7848214397be, blockid: BP-439084760-10.32.0.180-1387281790961:blk_1075349331_1612273, duration: 276448307
2015-09-03 12:03:56,494 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.10.8.55:50010, dest: /10.10.8.53:58622, bytes: 538, op: HDFS_READ, cliID: DFSClient_NONMAPREDUCE_-483065515_1, offset: 0, srvID: ee7d0634-89a3-4ada-a8ad-7848214397be, blockid: BP-439084760-10.32.0.180-1387281790961:blk_1075349334_1612276, duration: 60550244
2015-09-03 12:03:59,561 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.10.8.55:50010, dest: /10.10.8.53:58622, bytes: 455, op: HDFS_READ, cliID: DFSClient_NONMAPREDUCE_-483065515_1, offset: 0, srvID: ee7d0634-89a3-4ada-a8ad-7848214397be, blockid: BP-439084760-10.32.0.180-1387281790961:blk_1075351814_1614757, duration: 755613819

There are more than 100,000 of them for today alone. The situation on the other region servers is similar.

Node 10.10.8.53 is the hbase-master node, and the process on that port is also hbase-master.
So if there is no load on the cluster, why is there so much IO happening?
Any thoughts?

Thanks.

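In case it helps to quantify this, here is a minimal sketch for summarizing the clienttrace lines by operation, destination and client ID. It is only an illustration: the regular expression is derived from the three log lines quoted in this message, and the name summarize is my own, not part of any Hadoop tooling.

```python
import re
from collections import Counter

# Field layout taken from the clienttrace lines in this message;
# other Hadoop versions may format these lines differently (assumption).
TRACE_RE = re.compile(
    r"clienttrace: src: /(?P<src>[\d.]+):\d+, dest: /(?P<dest>[\d.]+):\d+, "
    r"bytes: (?P<bytes>\d+), op: (?P<op>\w+), cliID: (?P<cli>[^,\s]+),"
)


def summarize(lines):
    """Return (request counts, total bytes), both keyed by (op, dest IP, cliID)."""
    counts = Counter()
    total_bytes = Counter()
    for line in lines:
        match = TRACE_RE.search(line)
        if match is None:
            continue  # not a clienttrace line, skip it
        key = (match.group("op"), match.group("dest"), match.group("cli"))
        counts[key] += 1
        total_bytes[key] += int(match.group("bytes"))
    return counts, total_bytes
```

Passing an open datanode log file to summarize() and printing counts.most_common(10) should show which clients account for most of the reads.
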
> On 02 Sep 2015, at 21:57, Ted Yu <[email protected]> wrote:
>
> I assume you have enabled short-circuit read.
>
> Can you capture region server stack trace(s) and pastebin them ?
>
> Thanks
>
> On Wed, Sep 2, 2015 at 12:11 PM, Akmal Abbasov <[email protected]> wrote:
> Hi Ted,
> I’ve checked the time when addresses were changed, and this strange behaviour
> started weeks before it.
>
> yes, 10.10.8.55 is region server and 10.10.8.54 is a hbase master.
> any thoughts?
>
> Thanks
>
>> On 02 Sep 2015, at 18:45, Ted Yu <[email protected]> wrote:
>>
>> bq. change the ip addresses of the cluster nodes
>>
>> Did this happen recently ? If high iowait was observed after the change (you
>> can look at ganglia graph), there is a chance that the change was related.
>>
>> BTW I assume 10.10.8.55 is where your region server resides.
>>
>> Cheers
>>
>> On Wed, Sep 2, 2015 at 9:39 AM, Akmal Abbasov <[email protected]> wrote:
>> Hi Ted,
>> sorry forget to mention
>>
>>> release of hbase / hadoop you're using
>>
>> hbase hbase-0.98.7-hadoop2, hadoop hadoop-2.5.1
>>
>>> were region servers doing compaction ?
>>
>> I’ve run major compactions manually earlier today, but it seems that they
>> already completed, looking at the compactionQueueSize.
>>
>>> have you checked region server logs ?
>> The logs of datanode is full of this kind of messages
>> 2015-09-02 16:37:06,950 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.10.8.55:50010, dest: /10.10.8.54:32959, bytes: 19673, op: HDFS_READ, cliID: DFSClient_NONMAPREDUCE_1225374853_1, offset: 0, srvID: ee7d0634-89a3-4ada-a8ad-7848217327be, blockid: BP-329084760-10.32.0.180-1387281790961:blk_1075277914_1540222, duration: 7881815
>>
>> p.s. we had to change the ip addresses of the cluster nodes, is it relevant?
>>
>> Thanks.
>>
>>> On 02 Sep 2015, at 18:20, Ted Yu <[email protected]> wrote:
>>>
>>> Please provide some more information:
>>>
>>> release of hbase / hadoop you're using
>>> were region servers doing compaction ?
>>> have you checked region server logs ?
>>>
>>> Thanks
>>>
>>> On Wed, Sep 2, 2015 at 9:11 AM, Akmal Abbasov <[email protected]> wrote:
>>> Hi,
>>> I’m having strange behaviour in hbase cluster. It is almost idle, only <5
>>> puts and gets.
>>> But the data in hdfs is increasing, and region servers have very high
>>> iowait(>100, in 2 core CPU).
>>> iotop shows that datanode process is reading and writing all the time.
>>> Any suggestions?
>>>
>>> Thanks.
