Actually, going further back in the RS logs I see these:

java.io.IOException: Got error, status message org.apache.hadoop.yarn.server.nodemanager.util.UtilizationBasedNodeBusyChecker CPU: 18.97175> 10 , for OP_READ_BLOCK, self=/25.123.83.126:41098, remote=/10.27.138.10:10010, for file /hbase/SomeData/data/default/SomeTable122016/bfce55b49e2ade82e1bac73c4205d967/info/995f0a2a24b84a048ea55a4879f46e28, for pool BP-575538346-25.126.51.77-1446116651710 block 1096040129_23473372
    at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:142)
    at org.apache.hadoop.hdfs.RemoteBlockReader2.checkSuccess(RemoteBlockReader2.java:456)
    at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:424)
    at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:821)
    at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:700)
    at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:358)
    at org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:729)
    at org.apache.hadoop.hdfs.DFSInputStream.actualGetFromOneDataNode(DFSInputStream.java:1651)
    at org.apache.hadoop.hdfs.DFSInputStream$3.call(DFSInputStream.java:1610)
    at org.apache.hadoop.hdfs.DFSInputStream$3.call(DFSInputStream.java:1602)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

________________________________
From: jeff saremi <jeffsar...@hotmail.com>
Sent: Friday, August 4, 2017 2:22:54 PM
To: user@hbase.apache.org
Subject: Baffling RPC exceptions with our Thrift servers

Every once in a while (and this is getting more frequent) our Thrift clients report errors all over. When I check one of the Thrift server logs, I see a lot of lines like the following:

2017-08-04 14:15:17,089 INFO [thrift-worker-29] client.RpcRetryingCaller: Call exception, tries=14, retries=35, started=108853 ms ago, cancelled=false, msg=row 'http://hobartexchange.com.au/classifieds/_g397381.html' on table 'ClickStreamTable122016' at region=ClickStreamTable122016,http://hifimov.com/youtube-videos/mcent-hack-unlimited-money-cracked-apk,1501285634230.bfce55b49e2ade82e1bac73c4205d967., hostname=co4aps197b537e,16020,1501339699340, seqNum=54295

I go to the master and check its status: no issues whatsoever. I check the logs for the RS mentioned in the log line: no issues that I can tell. I restarted all the Thrift servers and that didn't help. I bounced the active master; still nothing.

What else can I check? What could be the reason? How can we get Thrift working again?

Thanks,
Jeff