Hi all, I am facing a strange issue. I am running an HBase bulk load to load an HFile into my HBase table, and I keep hitting the same exception over and over again:

java.io.IOException: BulkLoad encountered an unrecoverable problem
    at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.bulkLoadPhase(LoadIncrementalHFiles.java:381)
    at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:310)
    at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.run(LoadIncrementalHFiles.java:896)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.kylin.job.hadoop.hbase.BulkLoadJob.run(BulkLoadJob.java:83)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.kylin.job.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:132)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=35, exceptions:
Tue Jul 14 23:18:48 PDT 2015, org.apache.hadoop.hbase.client.RpcRetryingCaller@216b1af0, java.net.UnknownHostException: unknown host: prod-hadoop-data02
Tue Jul 14 23:18:49 PDT 2015, org.apache.hadoop.hbase.client.RpcRetryingCaller@216b1af0, java.net.UnknownHostException: unknown host: prod-hadoop-data02

Several jobs initiate the table load process, and one of them fails intermittently. To clarify, the failing job is not the same every time.
For instance, last time job 1 failed, but today it was job 2; in every case the exception is the same. I have the corresponding host entry on all the hosts of my 10-node Hadoop YARN cluster (10 data nodes). Any suggestions are appreciated. Thanks!
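
Since the root cause in the trace is a java.net.UnknownHostException, one way to narrow this down is to check, from inside a JVM on each node (including the node that submits the bulk load), whether the region server hostnames actually resolve. The following is only a minimal sketch: the class name HostCheck and the UNRESOLVED marker are made up for illustration, and the only hostname taken from the trace is prod-hadoop-data02. It uses the standard java.net.InetAddress lookup, which is not necessarily the exact code path HBase uses internally.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of HBase or Kylin): resolves each hostname
// with the JVM's resolver and reports which ones fail on this machine.
class HostCheck {
    static final String UNRESOLVED = "UNRESOLVED";

    // Returns hostname -> resolved IP address, or UNRESOLVED if the lookup fails.
    static Map<String, String> resolveAll(List<String> hosts) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String host : hosts) {
            try {
                result.put(host, InetAddress.getByName(host).getHostAddress());
            } catch (UnknownHostException e) {
                result.put(host, UNRESOLVED);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Pass the cluster hostnames as arguments; the default is the host from the trace.
        List<String> hosts = args.length > 0
                ? Arrays.asList(args)
                : Arrays.asList("prod-hadoop-data02");
        resolveAll(hosts).forEach((host, addr) -> System.out.println(host + " -> " + addr));
    }
}
```

Running this on every node with the full list of data node hostnames would show whether one particular machine fails to resolve prod-hadoop-data02. If so, that would fit the intermittent pattern, because the failure would then depend on where a given job happens to run.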
