[jira] [Commented] (HBASE-20145) HMaster start fails with IllegalStateException when HADOOP_HOME is set
[ https://issues.apache.org/jira/browse/HBASE-20145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16436668#comment-16436668 ]

Rohith Sharma K S commented on HBASE-20145:
-------------------------------------------

Thanks [~jojochuang] for trying it out. I have found the root cause of the failure. Did you build HBase-2.0.0-beta1 from source, or did you directly use the hbase*.tar.gz that is available on the mirrors?

Btw, I have answered the Stack Overflow question; see [hbase-error-illegalstateexception-when-starting-master-hsync|https://stackoverflow.com/questions/48709569/hbase-error-illegalstateexception-when-starting-master-hsync]. This should help.

> HMaster start fails with IllegalStateException when HADOOP_HOME is set
> ----------------------------------------------------------------------
>
>                 Key: HBASE-20145
>                 URL: https://issues.apache.org/jira/browse/HBASE-20145
>             Project: HBase
>          Issue Type: Bug
>         Environment: HBase-2.0-beta1.
> Hadoop trunk version.
> java version "1.8.0_144"
>            Reporter: Rohith Sharma K S
>            Assignee: Wei-Chiu Chuang
>            Priority: Critical
>
> HMaster fails to start when HADOOP_HOME is set in the environment while
> starting HMaster. HADOOP_HOME points to the Hadoop trunk version.
> {noformat}
> 2018-03-07 16:59:52,654 ERROR [master//10.200.4.200:16000] master.HMaster: Failed to become active master
> java.lang.IllegalStateException: The procedure WAL relies on the ability to hsync for proper operation during component failures, but the underlying filesystem does not support doing so. Please check the config value of 'hbase.procedure.store.wal.use.hsync' to set the desired level of robustness and ensure the config value of 'hbase.wal.dir' points to a FileSystem mount that can provide it.
>     at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.rollWriter(WALProcedureStore.java:1036)
>     at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.recoverLease(WALProcedureStore.java:374)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.start(ProcedureExecutor.java:532)
>     at org.apache.hadoop.hbase.master.HMaster.startProcedureExecutor(HMaster.java:1232)
>     at org.apache.hadoop.hbase.master.HMaster.startServiceThreads(HMaster.java:1145)
>     at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:837)
>     at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2026)
>     at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:547)
>     at java.lang.Thread.run(Thread.java:748)
> {noformat}
> The same configuration works properly with an HBase-1.2.6 build.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
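The exception text above names two configuration keys, 'hbase.procedure.store.wal.use.hsync' and 'hbase.wal.dir'. As a first diagnostic step, one could check whether an hbase-site.xml sets them explicitly. The following is only a minimal sketch: the function names and the sample XML are hypothetical, and it reports only what the file itself contains, not the defaults HBase would apply.

```python
import xml.etree.ElementTree as ET

# Keys named in the IllegalStateException message quoted above.
KEYS_OF_INTEREST = ("hbase.procedure.store.wal.use.hsync", "hbase.wal.dir")


def read_site_properties(xml_text):
    """Return a {name: value} dict from Hadoop-style <configuration> XML."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def report(xml_text, keys=KEYS_OF_INTEREST):
    """Map each key to its explicit value, or None when the file does not set it."""
    props = read_site_properties(xml_text)
    return {key: props.get(key) for key in keys}


# Hypothetical hbase-site.xml content, for illustration only.
SAMPLE = """<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://127.0.0.1:9000/hbase</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    for key, value in report(SAMPLE).items():
        print(key, "=", value if value is not None else "(not set in this file)")
```

A key reported as "not set" only means the file does not override it; the effective value then comes from whatever defaults the installed HBase version ships.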
[jira] [Commented] (HBASE-20145) HMaster start fails with IllegalStateException when HADOOP_HOME is set
[ https://issues.apache.org/jira/browse/HBASE-20145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16389491#comment-16389491 ]

Rohith Sharma K S commented on HBASE-20145:
-------------------------------------------

I haven't explicitly set any configuration for Hadoop; the Hadoop cluster is running with default configurations. For HBase, I have set only the basic configurations: _hbase.rootdir_, _hbase.cluster.distributed_ and _hbase.zookeeper.quorum_.
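For readers unfamiliar with the three properties mentioned in the comment above, here is a rough sketch of how such a minimal hbase-site.xml could be generated. The helper name and every value in it are placeholders chosen for illustration; they are not the reporter's actual settings.

```python
import xml.etree.ElementTree as ET


def build_site_xml(props):
    """Render a dict as Hadoop-style <configuration> XML."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# All values below are hypothetical placeholders.
EXAMPLE_PROPS = {
    "hbase.rootdir": "hdfs://namenode.example:9000/hbase",
    "hbase.cluster.distributed": "true",
    "hbase.zookeeper.quorum": "zk1.example,zk2.example,zk3.example",
}

if __name__ == "__main__":
    print(build_site_xml(EXAMPLE_PROPS))
```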
[jira] [Commented] (HBASE-20145) HMaster start fails with IllegalStateException when HADOOP_HOME is set
[ https://issues.apache.org/jira/browse/HBASE-20145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16389438#comment-16389438 ]

Rohith Sharma K S commented on HBASE-20145:
-------------------------------------------

Here is the tail log trace
{noformat}
2018-03-07 16:59:52,620 INFO [master//10.200.4.200:16000] wal.WALProcedureStore: Starting WAL Procedure Store lease recovery
2018-03-07 16:59:52,623 INFO [master//10.200.4.200:16000] util.FSHDFSUtils: Recover lease on dfs file hdfs://127.0.0.1:9000/hbase/MasterProcWALs/pv2-0001.log
2018-03-07 16:59:52,629 INFO [master//10.200.4.200:16000] util.FSHDFSUtils: Recovered lease, attempt=0 on file=hdfs://127.0.0.1:9000/hbase/MasterProcWALs/pv2-0001.log after 6ms
2018-03-07 16:59:52,630 WARN [master//10.200.4.200:16000] wal.WALProcedureStore: Remove uninitialized log: FileStatus{path=hdfs://127.0.0.1:9000/hbase/MasterProcWALs/pv2-0001.log; isDirectory=false; length=0; replication=3; blocksize=134217728; modification_time=1520422065550; access_time=1520422062450; owner=rsharmaks; group=supergroup; permission=rw-r--r--; isSymlink=false}
2018-03-07 16:59:52,630 INFO [master//10.200.4.200:16000] wal.ProcedureWALFile: Archiving hdfs://127.0.0.1:9000/hbase/MasterProcWALs/pv2-0001.log to hdfs://127.0.0.1:9000/hbase/oldWALs/pv2-0001.log
2018-03-07 16:59:52,654 ERROR [master//10.200.4.200:16000] master.HMaster: Failed to become active master
java.lang.IllegalStateException: The procedure WAL relies on the ability to hsync for proper operation during component failures, but the underlying filesystem does not support doing so. Please check the config value of 'hbase.procedure.store.wal.use.hsync' to set the desired level of robustness and ensure the config value of 'hbase.wal.dir' points to a FileSystem mount that can provide it.
    at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.rollWriter(WALProcedureStore.java:1036)
    at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.recoverLease(WALProcedureStore.java:374)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.start(ProcedureExecutor.java:532)
    at org.apache.hadoop.hbase.master.HMaster.startProcedureExecutor(HMaster.java:1232)
    at org.apache.hadoop.hbase.master.HMaster.startServiceThreads(HMaster.java:1145)
    at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:837)
    at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2026)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:547)
    at java.lang.Thread.run(Thread.java:748)
2018-03-07 16:59:52,655 ERROR [master//10.200.4.200:16000] master.HMaster: Master server abort: loaded coprocessors are: []
2018-03-07 16:59:52,655 ERROR [master//10.200.4.200:16000] master.HMaster: Unhandled exception. Starting shutdown.
java.lang.IllegalStateException: The procedure WAL relies on the ability to hsync for proper operation during component failures, but the underlying filesystem does not support doing so. Please check the config value of 'hbase.procedure.store.wal.use.hsync' to set the desired level of robustness and ensure the config value of 'hbase.wal.dir' points to a FileSystem mount that can provide it.
    at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.rollWriter(WALProcedureStore.java:1036)
    at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.recoverLease(WALProcedureStore.java:374)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.start(ProcedureExecutor.java:532)
    at org.apache.hadoop.hbase.master.HMaster.startProcedureExecutor(HMaster.java:1232)
    at org.apache.hadoop.hbase.master.HMaster.startServiceThreads(HMaster.java:1145)
    at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:837)
    at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2026)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:547)
    at java.lang.Thread.run(Thread.java:748)
{noformat}
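When a trace like the one above is buried in a much longer master log, it can help to group each timestamped entry with its stack frames and count entries by level. The sketch below assumes the line layout seen in this trace (timestamp, level, bracketed thread name, logger, message); the regular expression and helper names are my own and may need adjusting for other log4j patterns. The sample lines are abridged from the log above.

```python
import re
from collections import Counter

# Matches lines such as:
# 2018-03-07 16:59:52,654 ERROR [master//10.200.4.200:16000] master.HMaster: Failed ...
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+)\s+\[(?P<thread>[^\]]+)\] "
    r"(?P<logger>[\w.$]+): (?P<msg>.*)$"
)


def parse_entries(lines):
    """Yield one dict per log entry; non-matching lines (stack frames) are attached."""
    entry = None
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            if entry is not None:
                yield entry
            entry = match.groupdict()
            entry["extra"] = []
        elif entry is not None and line.strip():
            entry["extra"].append(line.strip())
    if entry is not None:
        yield entry


def count_by_level(lines):
    """Count log entries per level (INFO, WARN, ERROR, ...)."""
    return Counter(entry["level"] for entry in parse_entries(lines))


# Abridged sample taken from the trace above.
SAMPLE_LINES = [
    "2018-03-07 16:59:52,620 INFO [master//10.200.4.200:16000] wal.WALProcedureStore: Starting WAL Procedure Store lease recovery",
    "2018-03-07 16:59:52,654 ERROR [master//10.200.4.200:16000] master.HMaster: Failed to become active master",
    "java.lang.IllegalStateException: The procedure WAL relies on the ability to hsync (message abridged)",
    "    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:547)",
]

if __name__ == "__main__":
    print(dict(count_by_level(SAMPLE_LINES)))
    for entry in parse_entries(SAMPLE_LINES):
        if entry["level"] == "ERROR":
            print(entry["ts"], entry["logger"], entry["msg"])
```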
[jira] [Moved] (HBASE-20145) HMaster start fails with IllegalStateException when HADOOP_HOME is set
[ https://issues.apache.org/jira/browse/HBASE-20145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohith Sharma K S moved HDFS-13242 to HBASE-20145:
--------------------------------------------------

        Key: HBASE-20145  (was: HDFS-13242)
    Project: HBase  (was: Hadoop HDFS)
[jira] [Commented] (HBASE-19883) HM and RS are going down with Failed to find any Kerberos tgt after token lifetime expired
[ https://issues.apache.org/jira/browse/HBASE-19883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16343114#comment-16343114 ]

Rohith Sharma K S commented on HBASE-19883:
-------------------------------------------

I found that HADOOP-10786 fixes an issue with reloginFromKeytab from hadoop-2.6.1 onwards. HBase-1.2.6 is compiled against hadoop-2.5.1, in which the HADOOP-10786 fix is not available. We see this error with Java 1.8. Compiling HBase-1.2.6 against hadoop >= 2.6.1 would probably work (I have not verified it).

> HM and RS are going down with Failed to find any Kerberos tgt after token lifetime expired
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-19883
>                 URL: https://issues.apache.org/jira/browse/HBASE-19883
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.2.6
>         Environment: Java Version : 1.8 Open JDK
> OS : centos-7 64-bit
>            Reporter: Rohith Sharma K S
>            Priority: Critical
>
> A secure, non-HA HBase cluster was installed and was running regular
> operations successfully. The HDFS service was HA, and the NameNode switched
> back and forth a couple of times on the first day.
> After 24 hours, i.e. when the token lifetime expired, the HBase cluster
> daemons (HMaster and HRegionServer) shut down with the security exception
> No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)!
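The reasoning in the comment above comes down to a version comparison: the Hadoop version HBase was compiled against (2.5.1) is older than the first release said to carry the HADOOP-10786 fix (2.6.1). A small sketch of that check follows; the function names are hypothetical, and the 2.6.1 threshold is taken from the comment rather than verified independently.

```python
def parse_version(text):
    """Turn '2.6.1' into (2, 6, 1); a suffix such as '-SNAPSHOT' is dropped."""
    core = text.split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


# First Hadoop release with the HADOOP-10786 fix, per the comment above.
FIRST_FIXED = parse_version("2.6.1")


def has_relogin_fix(hadoop_version):
    """True when the given Hadoop version is at or after the first fixed release."""
    return parse_version(hadoop_version) >= FIRST_FIXED


if __name__ == "__main__":
    for version in ("2.5.1", "2.6.1", "2.7.3"):
        print(version, has_relogin_fix(version))
```

Note that this compares release numbers only; a vendor build may backport the fix to an older version number, which such a check cannot detect.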
[jira] [Commented] (HBASE-19883) HM and RS are going down with Failed to find any Kerberos tgt after token lifetime expired
[ https://issues.apache.org/jira/browse/HBASE-19883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16342994#comment-16342994 ]

Rohith Sharma K S commented on HBASE-19883:
-------------------------------------------

After 24 hours, i.e. when the token lifetime expires, the RS and HM show the logs below. At this point, the HDFS service was up and running.
{noformat}
2018-01-24 09:13:18,177 WARN [LeaseRenewer:yarn@mycluster] security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
2018-01-24 09:13:20,413 WARN [LeaseRenewer:yarn@mycluster] security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
2018-01-24 09:13:23,158 WARN [LeaseRenewer:yarn@mycluster] security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
2018-01-24 09:13:24,446 WARN [LeaseRenewer:yarn@mycluster] security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
2018-01-24 09:13:27,509 WARN [LeaseRenewer:yarn@mycluster] ipc.Client: Couldn't setup connection for yarn/ctr-e137-1514896590304-33059-01-03.hwx.s...@example.com to ctr-e137-1514896590304-33059-01-03.hwx.site/172.27.12.21:8020
2018-01-24 09:13:27,510 INFO [LeaseRenewer:yarn@mycluster] retry.RetryInvocationHandler: Exception while invoking renewLease of class ClientNamenodeProtocolTranslatorPB over ctr-e137-1514896590304-33059-01-03.hwx.site/172.27.12.21:8020. Trying to fail over immediately.
java.io.IOException: Failed on local exception: java.io.IOException: Couldn't setup connection for yarn/ctr-e137-1514896590304-33059-01-03.hwx.s...@example.com to ctr-e137-1514896590304-33059-01-03.hwx.site/172.27.12.21:8020; Host Details : local host is: "ctr-e137-1514896590304-33059-01-03.hwx.site/172.27.12.21"; destination host is: "ctr-e137-1514896590304-33059-01-03.hwx.site":8020;
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
    at org.apache.hadoop.ipc.Client.call(Client.java:1415)
    at org.apache.hadoop.ipc.Client.call(Client.java:1364)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy15.renewLease(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.renewLease(ClientNamenodeProtocolTranslatorPB.java:540)
    at sun.reflect.GeneratedMethodAccessor22.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy16.renewLease(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor22.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:279)
    at com.sun.proxy.$Proxy17.renewLease(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.renewLease(DFSClient.java:814)
    at org.apache.hadoop.hdfs.LeaseRenewer.renew(LeaseRenewer.java:417)
    at org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:442)
    at org.apache.hadoop.hdfs.LeaseRenewer.access$700(LeaseRenewer.java:71)
    at org.apache.hadoop.hdfs.LeaseRenewer$1.run(LeaseRenewer.java:298)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Couldn't setup connection for yarn/ctr-e137-1514896590304-33059-01-03.hwx.s...@example.com to ctr-e137-1514896590304-33059-01-03.hwx.site/172.27.12.21:8020
    at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:671)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
    at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:642)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:725)
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1463)
    at org.apache.hadoop.ipc.Client.call(Client.java:1382)
    ... 21 more
Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
    at org.apache.hadoop.security.SaslRpcClient.saslCon
[jira] [Created] (HBASE-19883) HM and RS are going down with Failed to find any Kerberos tgt after token lifetime expired
Rohith Sharma K S created HBASE-19883:
-----------------------------------------

             Summary: HM and RS are going down with Failed to find any Kerberos tgt after token lifetime expired
                 Key: HBASE-19883
                 URL: https://issues.apache.org/jira/browse/HBASE-19883
             Project: HBase
          Issue Type: Bug
    Affects Versions: 1.2.6
         Environment: Java Version : 1.8 Open JDK
OS : centos-7 64-bit
            Reporter: Rohith Sharma K S

A secure, non-HA HBase cluster was installed and was running regular operations successfully. The HDFS service was HA, and the NameNode switched back and forth a couple of times on the first day.

After 24 hours, i.e. when the token lifetime expired, the HBase cluster daemons (HMaster and HRegionServer) shut down with the security exception No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)!