Jacky Woo created KYLIN-3675:
--------------------------------

             Summary: Unknown host exception at the "Create HTable" step when building a cube
                 Key: KYLIN-3675
                 URL: https://issues.apache.org/jira/browse/KYLIN-3675
             Project: Kylin
          Issue Type: Bug
          Components: Job Engine, Storage - HBase
    Affects Versions: v2.5.0
            Reporter: Jacky Woo
         Attachments: hbase.hdfs.xml, kylin.properties
Hi all,

I got an "UnknownHostException" when building a cube. Below is the stack trace:

{panel:title=stack trace}
2018-11-08 18:44:55,069 ERROR [Scheduler 321750220 Job 42a75dbe-4b37-bb8a-8361-0bab7bcea106-849] common.HadoopShellExecutable:65 : error execute HadoopShellExecutable{id=42a75dbe-4b37-bb8a-8361-0bab7bcea106-04, name=Create HTable, state=RUNNING}
org.apache.hadoop.hbase.DoNotRetryIOException: org.apache.hadoop.hbase.DoNotRetryIOException: java.net.UnknownHostException: data-batch-hdfs Set hbase.table.sanity.checks to false at conf or table descriptor if you want to bypass sanity checks
        at org.apache.hadoop.hbase.master.HMaster.warnOrThrowExceptionForFailure(HMaster.java:1785)
        at org.apache.hadoop.hbase.master.HMaster.sanityCheckTableDescriptor(HMaster.java:1646)
        at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1576)
        at org.apache.hadoop.hbase.master.MasterRpcServices.createTable(MasterRpcServices.java:469)
        at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:55682)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2183)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:185)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:165)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.translateException(RpcRetryingCaller.java:236)
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.translateException(RpcRetryingCaller.java:254)
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:150)
        at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4313)
        at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4305)
        at org.apache.hadoop.hbase.client.HBaseAdmin.createTableAsyncV2(HBaseAdmin.java:768)
        at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:689)
        at org.apache.kylin.storage.hbase.steps.CubeHTableUtil.createHTable(CubeHTableUtil.java:107)
        at org.apache.kylin.storage.hbase.steps.CreateHTableJob.run(CreateHTableJob.java:120)
        at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:92)
        at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:163)
        at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:69)
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:163)
        at org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:111)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
{panel}

I use Kylin 2.5.0 for CDH 5.7. There are two HDFS clusters in my deployment:
* "data-batch-hdfs": for cube building
* "hbase-hdfs": for HBase

"data-batch-hdfs" is configured as fs.defaultFS in core-site.xml.
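To show concretely how the two clusters are wired on the Kylin job server, here is a small standalone check (the class name CheckHdfsResolution is only for illustration, it is not Kylin code) that prints which HDFS a plain HBase client configuration resolves to:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class CheckHdfsResolution {
    public static void main(String[] args) throws Exception {
        // HBase client config = hbase-site.xml layered on top of core-site.xml
        Configuration hconf = HBaseConfiguration.create();

        // fs.defaultFS comes from core-site.xml, i.e. hdfs://data-batch-hdfs in this setup
        System.out.println("fs.defaultFS  = " + hconf.get(FileSystem.FS_DEFAULT_NAME_KEY));

        // hbase.rootdir should point at hbase-hdfs (see the attached hbase.hdfs.xml)
        System.out.println("hbase.rootdir = " + hconf.get("hbase.rootdir"));

        // FileSystem.get(hconf) resolves against fs.defaultFS, so a step that uses this
        // plain config uploads the coprocessor jar to data-batch-hdfs, which the HBase
        // master cannot resolve -> UnknownHostException: data-batch-hdfs
        System.out.println("FileSystem    = " + FileSystem.get(hconf).getUri());
    }
}
{code}

With these configs on the classpath, FileSystem.get(hconf) should resolve to hdfs://data-batch-hdfs, which matches the unexpected coprocessor jar location described below.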
{color:#FF0000}*The error occurs at the "Create HTable" step, but it is intermittent, not always reproducible.*{color}

I did some research:

1. According to the stack trace, the error is thrown from the "CubeHTableUtil.createHTable" method. So I added some logging to print the HBase table descriptor and found that the coprocessor jar location is on "data-batch-hdfs", which is not expected (the coprocessor should be located on "hbase-hdfs").

2. So I added more logging in the "DeployCoprocessorCLI.initHTableCoprocessor" method to print which HDFS is used and the HBase configurations held by all threads (the ThreadUtils.listThreadLocal helper I used is sketched at the end of this description). Below is the code:

{code:java}
private static void initHTableCoprocessor(HTableDescriptor desc) throws IOException {
    KylinConfig kylinConfig = KylinConfig.getInstanceFromEnv();
    Configuration hconf = HBaseConnection.getCurrentHBaseConfiguration();
    FileSystem fileSystem = FileSystem.get(hconf);
    String localCoprocessorJar = kylinConfig.getCoprocessorLocalJar();
    Path hdfsCoprocessorJar = DeployCoprocessorCLI.uploadCoprocessorJar(localCoprocessorJar, fileSystem, null);

    // debug logging: only triggered when the coprocessor jar was uploaded to the wrong (batch) cluster
    if (fileSystem.getUri().toString().contains("data-batch-hdfs")) {
        logger.info("use hdfs " + hconf.get(FileSystem.FS_DEFAULT_NAME_KEY));
        logger.info(String.format("use hdfs %s when deploy coprocessor, current thread %s",
                fileSystem.getUri().toString(),
                Thread.currentThread().getId() + "-" + Thread.currentThread().getName()));

        // print HBaseConnection.configThreadLocal of all threads
        Map<String, Configuration> hbaseConfs = ThreadUtils.listThreadLocal(HBaseConnection.configThreadLocal);
        StringBuffer sb = new StringBuffer();
        for (Map.Entry<String, Configuration> e : hbaseConfs.entrySet()) {
            sb.append("\n\t").append(e.getKey()).append(" : ")
                    .append(e.getValue().get(FileSystem.FS_DEFAULT_NAME_KEY)).append("\t")
                    .append(e.getValue().get(DFSConfigKeys.DFS_NAMESERVICES));
        }
        sb.append("\n");
        logger.info("HBaseConnection configThreadLocal : " + sb.toString());
    }

    logger.info("coprocessor path " + hdfsCoprocessorJar);
    DeployCoprocessorCLI.addCoprocessorOnHTable(desc, hdfsCoprocessorJar);
}
{code}

3. Then I started some building jobs and got the following log:

{code:java}
2018-11-08 18:44:55,002 INFO [Scheduler 321750220 Job 42a75dbe-4b37-bb8a-8361-0bab7bcea106-849] util.DeployCoprocessorCLI:275 : use hdfs hdfs://data-batch-hdfs
2018-11-08 18:44:55,002 INFO [Scheduler 321750220 Job 42a75dbe-4b37-bb8a-8361-0bab7bcea106-849] util.DeployCoprocessorCLI:276 : use hdfs hdfs://data-batch-hdfs when deploy coprocessor, current thread 849-Scheduler 321750220 Job 42a75dbe-4b37-bb8a-8361-0bab7bcea106-849
2018-11-08 18:44:55,012 INFO [Scheduler 321750220 Job 42a75dbe-4b37-bb8a-8361-0bab7bcea106-849] util.DeployCoprocessorCLI:288 : HBaseConnection configThreadLocal :
	849##Scheduler 321750220 Job 42a75dbe-4b37-bb8a-8361-0bab7bcea106-849 : hdfs://data-batch-hdfs	data-batch-hdfs,hbase-hdfs
	404##pool-12-thread-9 : hdfs://hbase-hdfs	data-batch-hdfs,hbase-hdfs
	1231##pool-12-thread-20 : hdfs://data-batch-hdfs	data-batch-hdfs,hbase-hdfs
	383##pool-12-thread-8 : hdfs://hbase-hdfs	data-batch-hdfs,hbase-hdfs
	1073##pool-12-thread-19 : hdfs://data-batch-hdfs	data-batch-hdfs,hbase-hdfs
	538##pool-12-thread-13 : hdfs://hbase-hdfs	data-batch-hdfs,hbase-hdfs
{code}

According to the logic of "HBaseConnection", the HBase configuration of every thread should use "hbase-hdfs", not "data-batch-hdfs", yet some threads above hold "data-batch-hdfs". What is the reason?

The attachments are my Kylin configuration (kylin.properties) and the HDFS configuration of the HBase cluster (hbase.hdfs.xml).
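For reference, ThreadUtils.listThreadLocal used in the logging code above is a debugging utility, not a Kylin class. A rough sketch of what it does is below (the actual implementation may differ slightly); it enumerates the value each live thread holds for a given ThreadLocal by reading JDK-internal ThreadLocalMap fields via reflection, so it is debug-only and assumes a JDK 8 runtime like the one in the stack trace:

{code:java}
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

public class ThreadUtils {

    /**
     * Collects, for every live thread, the value it currently holds for the given
     * ThreadLocal. The map key is "<threadId>##<threadName>", matching the log output above.
     */
    @SuppressWarnings("unchecked")
    public static <T> Map<String, T> listThreadLocal(ThreadLocal<T> threadLocal) {
        Map<String, T> result = new HashMap<>();
        try {
            // Thread.threadLocals holds the per-thread ThreadLocal.ThreadLocalMap instance
            Field threadLocalsField = Thread.class.getDeclaredField("threadLocals");
            threadLocalsField.setAccessible(true);

            Class<?> mapClass = Class.forName("java.lang.ThreadLocal$ThreadLocalMap");
            Method getEntryMethod = mapClass.getDeclaredMethod("getEntry", ThreadLocal.class);
            getEntryMethod.setAccessible(true);

            Class<?> entryClass = Class.forName("java.lang.ThreadLocal$ThreadLocalMap$Entry");
            Field valueField = entryClass.getDeclaredField("value");
            valueField.setAccessible(true);

            for (Thread t : Thread.getAllStackTraces().keySet()) {
                Object map = threadLocalsField.get(t);
                if (map == null) {
                    continue; // thread has never touched any ThreadLocal
                }
                Object entry = getEntryMethod.invoke(map, threadLocal);
                if (entry == null) {
                    continue; // thread has no value for this particular ThreadLocal
                }
                result.put(t.getId() + "##" + t.getName(), (T) valueField.get(entry));
            }
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException("failed to inspect ThreadLocal values", e);
        }
        return result;
    }
}
{code}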