Actually, as I've now noticed, it's not the coprocessor itself that's failing, but HBase when trying to load the coprocessor jar from HDFS (from a reference, presumably the coprocessor path stored in the table descriptor, that still points to the old HDFS namenode). See also the P.S. below the quoted thread for the refresh command from the howto page.
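
For reference, here is a sketch of how the stale reference can be inspected (and, presumably, cleared) from the HBase shell. The table name is a placeholder, the describe output is illustrative, and I haven't verified the unset step end-to-end:

  hbase shell
  # The table descriptor carries the coprocessor jar location; 'describe' shows it, e.g.:
  describe 'KYLIN_EXAMPLE_TABLE'
  #   TABLE_ATTRIBUTES => {coprocessor$1 => 'hdfs://<old-namenode>:8020/kylin/
  #     .../coprocessor/kylin-coprocessor-<version>.jar|org.apache.kylin.storage.
  #     hbase.cube.v2.coprocessor.endpoint.CubeVisitService|1001'}
  # Dropping the stale attribute so the regions can open (Kylin can re-deploy it later):
  disable 'KYLIN_EXAMPLE_TABLE'
  alter 'KYLIN_EXAMPLE_TABLE', METHOD => 'table_att_unset', NAME => 'coprocessor$1'
  enable 'KYLIN_EXAMPLE_TABLE'
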
On Thu, Jun 27, 2019 at 10:19 AM Andras Nagy <[email protected]> wrote:

> Hi ShaoFeng,
>
> After disabling the "KYLIN_*" tables (but not 'kylin_metadata'), the
> RegionServers could indeed start up and the coprocessor refresh succeeded.
>
> But after re-enabling those tables, the issue continues, and again the
> RegionServers fail by trying to connect to the old master node. What I
> noticed now from the stacktrace is that the coprocessor is actually trying
> to connect to the old HDFS namenode on port 8020 (and not to the HBase
> master).
>
> Best regards,
> Andras
>
>
> On Thu, Jun 27, 2019 at 4:21 AM ShaoFeng Shi <[email protected]> wrote:
>
>> I see; can you try this: disable all "KYLIN_*" tables in the HBase
>> console, and then see whether the RegionServers can start.
>>
>> If they can start, then run the above command to refresh the coprocessor.
>>
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>> Apache Kylin PMC
>> Email: [email protected]
>>
>> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
>> Join Kylin user mail group: [email protected]
>> Join Kylin dev mail group: [email protected]
>>
>>
>> On Wed, Jun 26, 2019 at 10:57 PM Andras Nagy <[email protected]> wrote:
>>
>>> Hi ShaoFeng,
>>>
>>> Yes, I tried that, but it fails as well. Actually it fails because the
>>> RegionServers are not running (as they fail when starting up).
>>>
>>> Best regards,
>>> Andras
>>>
>>> On Wed, Jun 26, 2019 at 4:42 PM ShaoFeng Shi <[email protected]> wrote:
>>>
>>>> Hi Andras,
>>>>
>>>> Did you try this?
>>>> https://kylin.apache.org/docs/howto/howto_update_coprocessor.html
>>>>
>>>> Best regards,
>>>>
>>>> Shaofeng Shi 史少锋
>>>> Apache Kylin PMC
>>>> Email: [email protected]
>>>>
>>>> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
>>>> Join Kylin user mail group: [email protected]
>>>> Join Kylin dev mail group: [email protected]
>>>>
>>>>
>>>> On Wed, Jun 26, 2019 at 10:05 PM Andras Nagy <[email protected]> wrote:
>>>>
>>>>> Greetings,
>>>>>
>>>>> I'm testing a setup where HBase runs on AWS EMR and the HBase data is
>>>>> stored on S3. It has been working fine so far, but when I terminate
>>>>> the EMR cluster and recreate it with the same S3 location for HBase,
>>>>> HBase won't start up properly. Before shutting down, I did execute the
>>>>> disable_all_tables.sh script to flush HBase state to S3.
>>>>>
>>>>> More precisely, the issue is that the RegionServers don't start up.
>>>>> Maybe I'm missing something in the EMR setup rather than the Kylin
>>>>> setup, but the exceptions I get in the RegionServer's log point at
>>>>> Kylin's CubeVisitService coprocessor, which is still trying to connect
>>>>> to the old EMR cluster's master node and fails with:
>>>>> "coprocessor.CoprocessorHost: The coprocessor
>>>>> org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.CubeVisitService
>>>>> threw java.net.NoRouteToHostException: No Route to Host from
>>>>> ip-172-35-5-11/172.35.5.11 to
>>>>> ip-172-35-7-125.us-west-2.compute.internal:8020 failed on socket
>>>>> timeout exception: java.net.NoRouteToHostException: No route to host;"
>>>>>
>>>>> (Here, ip-172-35-7-125 was the old cluster's master node.)
>>>>>
>>>>> Does anyone have any idea what I'm doing wrong here?
>>>>> The HBase master node's address seems to be cached somewhere, and when
>>>>> starting up HBase on the new cluster with the same S3 location for
>>>>> HFiles, this old address is still used.
>>>>> Is there anything specific I have missed to get this scenario to work
>>>>> properly?
>>>>>
>>>>> This is the full stacktrace:
>>>>>
>>>>> 2019-06-26 12:33:53,352 ERROR [RS_OPEN_REGION-ip-172-35-5-11:16020-1] coprocessor.CoprocessorHost: The coprocessor org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.CubeVisitService threw java.net.NoRouteToHostException: No Route to Host from ip-172-35-5-11/172.35.5.11 to ip-172-35-7-125.us-west-2.compute.internal:8020 failed on socket timeout exception: java.net.NoRouteToHostException: No route to host; For more details see: http://wiki.apache.org/hadoop/NoRouteToHost
>>>>> java.net.NoRouteToHostException: No Route to Host from ip-172-35-5-11/172.35.5.11 to ip-172-35-7-125.us-west-2.compute.internal:8020 failed on socket timeout exception: java.net.NoRouteToHostException: No route to host; For more details see: http://wiki.apache.org/hadoop/NoRouteToHost
>>>>>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>>>>   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>>>>>   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>>>>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>>>>>   at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:801)
>>>>>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:758)
>>>>>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1493)
>>>>>   at org.apache.hadoop.ipc.Client.call(Client.java:1435)
>>>>>   at org.apache.hadoop.ipc.Client.call(Client.java:1345)
>>>>>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
>>>>>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>>>>>   at com.sun.proxy.$Proxy36.getFileInfo(Unknown Source)
>>>>>   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:796)
>>>>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>   at java.lang.reflect.Method.invoke(Method.java:498)
>>>>>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:409)
>>>>>   at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
>>>>>   at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
>>>>>   at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>>>>>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346)
>>>>>   at com.sun.proxy.$Proxy37.getFileInfo(Unknown Source)
>>>>>   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1649)
>>>>>   at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1440)
>>>>>   at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1437)
>>>>>   at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>>>>>   at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1452)
>>>>>   at org.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1466)
>>>>>   at org.apache.hadoop.hbase.util.CoprocessorClassLoader.getClassLoader(CoprocessorClassLoader.java:264)
>>>>>   at org.apache.hadoop.hbase.coprocessor.CoprocessorHost.load(CoprocessorHost.java:214)
>>>>>   at org.apache.hadoop.hbase.coprocessor.CoprocessorHost.load(CoprocessorHost.java:188)
>>>>>   at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.loadTableCoprocessors(RegionCoprocessorHost.java:376)
>>>>>   at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.<init>(RegionCoprocessorHost.java:238)
>>>>>   at org.apache.hadoop.hbase.regionserver.HRegion.<init>(HRegion.java:802)
>>>>>   at org.apache.hadoop.hbase.regionserver.HRegion.<init>(HRegion.java:710)
>>>>>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>>>>   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>>>>>   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>>>>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>>>>>   at org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:6716)
>>>>>   at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7020)
>>>>>   at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6992)
>>>>>   at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6948)
>>>>>   at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6899)
>>>>>   at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:364)
>>>>>   at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:131)
>>>>>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
>>>>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>>>   at java.lang.Thread.run(Thread.java:748)
>>>>> Caused by: java.net.NoRouteToHostException: No route to host
>>>>>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>>>>>   at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
>>>>>   at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:685)
>>>>>   at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:788)
>>>>>   at org.apache.hadoop.ipc.Client$Connection.access$3500(Client.java:410)
>>>>>   at org.apache.hadoop.ipc.Client.getConnection(Client.java:1550)
>>>>>   at org.apache.hadoop.ipc.Client.call(Client.java:1381)
>>>>>   ... 43 more
>>>>>
>>>>> Many thanks,
>>>>> Andras
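
P.S. For anyone landing on this thread later: the coprocessor refresh mentioned above is the CLI from the howto page ShaoFeng linked. A sketch of the invocation (the jar name and paths vary by installation and Kylin version, so adjust as needed):

  # Uploads the coprocessor jar to the current cluster's working directory and
  # rewrites the coprocessor attribute on the Kylin HBase tables to point at it.
  $KYLIN_HOME/bin/kylin.sh org.apache.kylin.storage.hbase.util.DeployCoprocessorCLI \
      $KYLIN_HOME/lib/kylin-coprocessor-*.jar all

Note that this can only succeed once the RegionServers are running, since updating the table descriptors needs a healthy cluster; that is why disabling the "KYLIN_*" tables first was necessary.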
