responseTooSlow wasn't mentioned in your first email. Have you checked the GC log to see whether GC pauses correlate with the slow responses? How much load was your cluster under around the time the responses were slow?
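
One way to run the GC check is sketched below. It is a minimal sketch, not a definitive tool. It assumes the master's GC log was written by a JDK 7/8 JVM with -XX:+PrintGCDetails and -XX:+PrintGCDateStamps, so each event line starts with an ISO date stamp and ends with a "real=N secs" figure. The function names, thresholds, and sample lines are hypothetical and exist only for illustration.

```python
import re
from datetime import datetime, timedelta

# Assumed line shape (JDK 7/8, -XX:+PrintGCDetails -XX:+PrintGCDateStamps):
#   2015-09-07T01:37:41.120+0800: 551.2: [Full GC ..., 9.81 secs] [Times: ... real=9.81 secs]
STAMP = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})")
REAL = re.compile(r"real=(\d+\.\d+) secs")


def long_pauses(lines, min_pause_secs=1.0):
    """Yield (timestamp, pause_seconds) for GC events at or above the threshold."""
    for line in lines:
        line = line.strip()
        stamp = STAMP.match(line)
        real = REAL.search(line)
        if stamp and real:
            pause = float(real.group(1))
            if pause >= min_pause_secs:
                when = datetime.strptime(stamp.group(1), "%Y-%m-%dT%H:%M:%S")
                yield when, pause


def pauses_near(lines, event_time, window_secs=120, min_pause_secs=1.0):
    """Return the long pauses that fall within window_secs of event_time."""
    window = timedelta(seconds=window_secs)
    return [
        (when, pause)
        for when, pause in long_pauses(lines, min_pause_secs)
        if abs(when - event_time) <= window
    ]


# Hypothetical GC log lines, made up for illustration (not from the real cluster).
SAMPLE = [
    "2015-09-07T01:36:50.100+0800: 500.1: [GC [ParNew: ...], 0.04 secs] "
    "[Times: user=0.10 sys=0.00, real=0.04 secs]",
    "2015-09-07T01:37:41.120+0800: 551.2: [Full GC ..., 9.81 secs] "
    "[Times: user=9.70 sys=0.05, real=9.81 secs]",
    "2015-09-07T03:00:00.000+0800: 5490.0: [Full GC ..., 5.00 secs] "
    "[Times: user=4.90 sys=0.02, real=5.00 secs]",
]

if __name__ == "__main__":
    # Look for long pauses around the time a slow response was logged.
    for when, pause in pauses_near(SAMPLE, datetime(2015, 9, 7, 1, 37, 53)):
        print(when.isoformat(), pause)
```

In practice you would pass the open GC log file instead of SAMPLE and use the timestamp of a responseTooSlow entry (or of the client timeout) as event_time. No long pause in the window would point away from GC and toward load or RPC queueing.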
Thanks

> On Oct 10, 2015, at 1:53 AM, "[email protected]" <[email protected]> wrote:
>
> The exception the client gets is sometimes MasterNotRunningException,
> and the master prints this in its log:
> responseTooSlow
>
> [email protected]
>
> From: Ted Yu
> Date: 2015-10-10 16:31
> To: [email protected]
> Subject: Re: client getTableDescriptors from master timeout
>
> Did this exception happen repeatedly or intermittently?
>
> Does your cluster run secure HBase?
>
> Cheers
>
> On Sat, Oct 10, 2015 at 1:23 AM, [email protected] <[email protected]> wrote:
>
>> We use hbase0.98.10. The exception is as follows:
>>
>> Caused by: java.lang.reflect.UndeclaredThrowableException
>>     at com.sun.proxy.$Proxy6.getHTableDescriptors(Unknown Source)
>>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHTableDescriptor(HConnectionManager.java:1835)
>>     at org.apache.hadoop.hbase.client.HTable.getTableDescriptor(HTable.java:403)
>>     at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat.configureCompression(HFileOutputFormat.java:436)
>>     at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat.configureIncrementalLoad(HFileOutputFormat.java:368)
>>     at com.fiberhome.bigdata.bulkload.BulkLoadJobExecutor.initBulkLoadJob(BulkLoadJobExecutor.java:101)
>>     at com.fiberhome.nebula.datacenter.bulkload.app.GroupMRTask.call(GroupMRTask.java:92)
>> Caused by: java.io.IOException: Call to hm:60000 failed on local exception: org.apache.hadoop.hbase.ipc.HBaseClient$CallTimeoutException: Call id=25409697, waitTime=68690, rpcTimetout=60000
>>     at org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:1056)
>>     at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:1025)
>>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
>>     ... 15 more
>> Caused by: org.apache.hadoop.hbase.ipc.HBaseClient$CallTimeoutException: Call id=25409697, waitTime=68690, rpcTimetout=60000
>>     at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.cleanupCalls(HBaseClient.java:786)
>>     at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:715)
>>     at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:606)
>>
>> The master log from the same time is as follows:
>>
>> 2015-09-07 01:37:36,471 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing because balanced cluster; servers=10 regions=169 average=16.9 mostloaded=17 leastloaded=16
>> 2015-09-07 01:37:36,471 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing because balanced cluster; servers=10 regions=2 average=0.2 mostloaded=1 leastloaded=0
>> ...
>>
>> 2015-09-07 01:37:49,814 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Creating scanner over .META. starting at key ''
>> 2015-09-07 01:37:49,815 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Advancing internal scanner to startKey at ''
>> 2015-09-07 01:37:53,325 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Finished with scanning at {NAME => '.META.,,1', STARTKEY => '', ENDKEY => '', ENCODED => 1028785192,}
>> 2015-09-07 01:37:53,378 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 49295 catalog row(s) and gc'd 0 unreferenced parent region(s)
>>
>> It seems the master is just doing its ordinary periodic work,
>> so I am wondering why the timeout happens.
>>
>> [email protected]
