[jira] [Commented] (ATLAS-3561) Atlas start fails in embedded-hbase mode with zookeeper error

chaitali borole (Jira) Mon, 16 Dec 2019 00:52:10 -0800


    [ 
https://issues.apache.org/jira/browse/ATLAS-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997088#comment-16997088
 ]


chaitali borole commented on ATLAS-3561:
----------------------------------------

The error logs above are from the Atlas application.log file. Below is a 
snippet from the HBase master log file.


{noformat}
2019-12-16 13:53:18,197 WARN  [Thread-22] wal.WALProcedureStore: Remove uninitialized log: DeprecatedRawLocalFileStatus{path=file:/User/atlas/distro/target/apache-atlas-3.0.0-SNAPSHOT-bin/apache-atlas-3.0.0-SNAPSHOT/data/hbase-root/MasterProcWALs/pv2-00000000000000000021.log; isDirectory=false; length=0; replication=1; blocksize=33554432; modification_time=1576483024000; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false}
2019-12-16 13:53:18,197 WARN  [Thread-22] wal.WALProcedureStore: Remove uninitialized log: DeprecatedRawLocalFileStatus{path=file:/User/atlas/distro/target/apache-atlas-3.0.0-SNAPSHOT-bin/apache-atlas-3.0.0-SNAPSHOT/data/hbase-root/MasterProcWALs/pv2-00000000000000000021.log; isDirectory=false; length=0; replication=1; blocksize=33554432; modification_time=1576483024000; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false}
2019-12-16 13:53:18,197 INFO  [Thread-22] wal.ProcedureWALFile: Archiving file:/User/atlas/distro/target/apache-atlas-3.0.0-SNAPSHOT-bin/apache-atlas-3.0.0-SNAPSHOT/data/hbase-root/MasterProcWALs/pv2-00000000000000000021.log to file:/User/atlas/distro/target/apache-atlas-3.0.0-SNAPSHOT-bin/apache-atlas-3.0.0-SNAPSHOT/data/hbase-root/oldWALs/pv2-00000000000000000021.log
2019-12-16 13:53:18,246 ERROR [Thread-22] master.HMaster: Failed to become active master
java.lang.IllegalStateException: The procedure WAL relies on the ability to hsync for proper operation during component failures, but the underlying filesystem does not support doing so. Please check the config value of 'hbase.procedure.store.wal.use.hsync' to set the desired level of robustness and ensure the config value of 'hbase.wal.dir' points to a FileSystem mount that can provide it.
	at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.rollWriter(WALProcedureStore.java:1044)
	at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.recoverLease(WALProcedureStore.java:383)
	at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:649)
	at org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1282)
	at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:842)
	at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2086)
	at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:553)
	at java.lang.Thread.run(Thread.java:748)
2019-12-16 13:53:18,247 ERROR [Thread-22] master.HMaster: ***** ABORTING master fipl5,61500,1576484594079: Unhandled exception. Starting shutdown. *****
java.lang.IllegalStateException: The procedure WAL relies on the ability to hsync for proper operation during component failures, but the underlying filesystem does not support doing so. Please check the config value of 'hbase.procedure.store.wal.use.hsync' to set the desired level of robustness and ensure the config value of 'hbase.wal.dir' points to a FileSystem mount that can provide it.
	at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.rollWriter(WALProcedureStore.java:1044)
	at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.recoverLease(WALProcedureStore.java:383)
	at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:649)
	at org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1282)
	at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:842)
	at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2086)
	at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:553)
	at java.lang.Thread.run(Thread.java:748)
2019-12-16 13:53:18,248 INFO  [Thread-22] regionserver.HRegionServer: ***** STOPPING region server 'fipl5,61500,1576484594079' *****
2019-12-16 13:53:18,248 INFO  [Thread-22] regionserver.HRegionServer: STOPPED: Stopped by Thread-22
2019-12-16 13:53:18,248 INFO  [M:0;fipl5:61500] regionserver.HRegionServer: Stopping infoServer
2019-12-16 13:53:18,259 INFO  [M:0;fipl5:61500] handler.ContextHandler: Stopped o.e.j.w.WebAppContext@39c1fe0b{/,null,UNAVAILABLE}{file:/User/atlas/distro/target/apache-atlas-3.0.0-SNAPSHOT-bin/apache-atlas-3.0.0-SNAPSHOT/hbase/hbase-webapps/master}

{noformat}

This shows that Atlas failed to start because the HBase Master failed to 
start, with the following error:
{code:none}
master.HMaster: Failed to become active master
java.lang.IllegalStateException: The procedure WAL relies on the ability to hsync for proper operation during component failures, but the underlying filesystem does not support doing so. Please check the config value of 'hbase.procedure.store.wal.use.hsync' to set the desired level of robustness and ensure the config value of 'hbase.wal.dir' points to a FileSystem mount that can provide it.
{code}
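
To check whether another embedded-hbase setup hits the same root cause, rather than a genuine ZooKeeper problem, a small sketch along these lines could scan the HBase master log for the two messages quoted above. The function name and the marker list are illustrative assumptions, not part of Atlas or HBase.

```python
# Substrings taken from the HBase master log quoted in this comment.
MARKERS = (
    "Failed to become active master",
    "The procedure WAL relies on the ability to hsync",
)


def find_hsync_failure(log_lines):
    """Return (line_number, marker) pairs for each marker found in the log.

    log_lines is any iterable of strings, for example an open log file.
    Line numbers are 1-based. An empty result means neither message was seen.
    """
    hits = []
    for number, line in enumerate(log_lines, start=1):
        for marker in MARKERS:
            if marker in line:
                hits.append((number, marker))
    return hits
```

It could be called as {{find_hsync_failure(open(path))}}, where {{path}} points at the HBase master log of the Atlas distribution. If both markers are reported, the "Connection refused" messages in application.log are most likely a symptom of the HBase Master aborting.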

After adding the following property:
{noformat}
<property>
  <name>hbase.unsafe.stream.capability.enforce</name>
  <value>false</value>
</property>
{noformat}
the HBase Master started successfully, and Atlas also started successfully.
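
For an existing configuration file, the property above could also be added with a script instead of by hand. The following is only a minimal sketch: the function name {{ensure_property}} is hypothetical, it assumes the file is well-formed XML with a {{<configuration>}} root, and note that Python's ElementTree drops XML comments (such as a license header) when the file is rewritten.

```python
import xml.etree.ElementTree as ET

PROP_NAME = "hbase.unsafe.stream.capability.enforce"


def ensure_property(site_xml_path, name=PROP_NAME, value="false"):
    """Set the property `name` to `value` in an hbase-site.xml style file.

    Returns True if the file was modified, False if it already had the value.
    Assumption: the root element is <configuration> containing <property>
    children with <name> and <value> elements.
    """
    tree = ET.parse(site_xml_path)
    root = tree.getroot()

    for prop in root.findall("property"):
        if (prop.findtext("name") or "").strip() == name:
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            if (value_el.text or "").strip() == value:
                return False  # already configured, leave the file untouched
            value_el.text = value
            tree.write(site_xml_path, encoding="utf-8", xml_declaration=True)
            return True

    # Property not present yet: append a new <property> element.
    prop = ET.SubElement(root, "property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    tree.write(site_xml_path, encoding="utf-8", xml_declaration=True)
    return True
```

Calling {{ensure_property("hbase-site.xml")}} a second time returns False and does not rewrite the file. HBase would still need to be restarted for the change to take effect.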

> Atlas start fails in embedded-hbase mode with zookeeper error
> -------------------------------------------------------------
>
>                 Key: ATLAS-3561
>                 URL: https://issues.apache.org/jira/browse/ATLAS-3561
>             Project: Atlas
>          Issue Type: Bug
>          Components:  atlas-core
>    Affects Versions: 3.0.0
>            Reporter: chaitali borole
>            Assignee: chaitali borole
>            Priority: Minor
>             Fix For: 3.0.0
>
>
> After compiling Atlas with {{mvn clean package -Pdist,embedded-hbase-solr}}
> and starting it with the embedded HBase, Solr and Kafka services using 
> {{atlas_start.py}}, Atlas fails to start with the following error in 
> {{application.log}}:
> {noformat}
> 2019-12-09 16:01:28,839 INFO  - [main:] ~ Not running setup per configuration 
> atlas.server.run.setup.on.start. (SetupSteps$SetupRequired:189)
> 2019-12-09 16:01:32,786 WARN  - 
> [ReadOnlyZKClient-localhost:2181@0x0fa5f81c-SendThread(localhost:2181):] ~ 
> Session 0x16eea36b27b0003 for server null, unexpected error, closing socket 
> connection and attempting reconnect (ClientCnxn$SendThread:1102)
> java.net.ConnectException: Connection refused
>       at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>       at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>       at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>       at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> 2019-12-09 16:01:32,889 WARN  - [ReadOnlyZKClient-localhost:2181@0x0fa5f81c:] 
> ~ 0x0fa5f81c to localhost:2181 failed for get of /hbase/meta-region-server, 
> code = CONNECTIONLOSS, retries = 1 (ReadOnlyZKClient$ZKTask$1:183)
> 2019-12-09 16:01:34,004 WARN  - 
> [ReadOnlyZKClient-localhost:2181@0x0fa5f81c-SendThread(localhost:2181):] ~ 
> Session 0x16eea36b27b0003 for server null, unexpected error, closing socket 
> connection and attempting reconnect (ClientCnxn$SendThread:1102)
> java.net.ConnectException: Connection refused
>       at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>       at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>       at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>       at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> {noformat}
>  
> *Workaround*
> Adding the property below to {{hbase-site.xml.template}} and re-running
> {{mvn clean package -Pdist,embedded-hbase-solr}} resolves the issue.
>  
> {code:none}
> <property>
>   <name>hbase.unsafe.stream.capability.enforce</name>
>   <value>false</value>
> </property>
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
