[ https://issues.apache.org/jira/browse/ATLAS-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997088#comment-16997088 ]
chaitali borole commented on ATLAS-3561:
----------------------------------------

The error logs above are from the Atlas application.log file; below is a snippet from the HBase master log file.

{noformat}
2019-12-16 13:53:18,197 WARN [Thread-22] wal.WALProcedureStore: Remove uninitialized log: DeprecatedRawLocalFileStatus\{path=file:/User/atlas/distro/target/apache-atlas-3.0.0-SNAPSHOT-bin/apache-atlas-3.0.0-SNAPSHOT/data/hbase-root/MasterProcWALs/pv2-00000000000000000021.log; isDirectory=false; length=0; replication=1; blocksize=33554432; modification_time=1576483024000; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false}
2019-12-16 13:53:18,197 WARN [Thread-22] wal.WALProcedureStore: Remove uninitialized log: DeprecatedRawLocalFileStatus\{path=file:/User/atlas/distro/target/apache-atlas-3.0.0-SNAPSHOT-bin/apache-atlas-3.0.0-SNAPSHOT/data/hbase-root/MasterProcWALs/pv2-00000000000000000021.log; isDirectory=false; length=0; replication=1; blocksize=33554432; modification_time=1576483024000; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false}
2019-12-16 13:53:18,197 INFO [Thread-22] wal.ProcedureWALFile: Archiving file:/User/atlas/distro/target/apache-atlas-3.0.0-SNAPSHOT-bin/apache-atlas-3.0.0-SNAPSHOT/data/hbase-root/MasterProcWALs/pv2-00000000000000000021.log to file:/User/atlas/distro/target/apache-atlas-3.0.0-SNAPSHOT-bin/apache-atlas-3.0.0-SNAPSHOT/data/hbase-root/oldWALs/pv2-00000000000000000021.log
2019-12-16 13:53:18,246 ERROR [Thread-22] master.HMaster: Failed to become active master
java.lang.IllegalStateException: The procedure WAL relies on the ability to hsync for proper operation during component failures, but the underlying filesystem does not support doing so. Please check the config value of 'hbase.procedure.store.wal.use.hsync' to set the desired level of robustness and ensure the config value of 'hbase.wal.dir' points to a FileSystem mount that can provide it.
    at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.rollWriter(WALProcedureStore.java:1044)
    at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.recoverLease(WALProcedureStore.java:383)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:649)
    at org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1282)
    at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:842)
    at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2086)
    at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:553)
    at java.lang.Thread.run(Thread.java:748)
2019-12-16 13:53:18,247 ERROR [Thread-22] master.HMaster: ***** ABORTING master fipl5,61500,1576484594079: Unhandled exception. Starting shutdown. *****
java.lang.IllegalStateException: The procedure WAL relies on the ability to hsync for proper operation during component failures, but the underlying filesystem does not support doing so. Please check the config value of 'hbase.procedure.store.wal.use.hsync' to set the desired level of robustness and ensure the config value of 'hbase.wal.dir' points to a FileSystem mount that can provide it.
    at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.rollWriter(WALProcedureStore.java:1044)
    at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.recoverLease(WALProcedureStore.java:383)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:649)
    at org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1282)
    at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:842)
    at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2086)
    at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:553)
    at java.lang.Thread.run(Thread.java:748)
2019-12-16 13:53:18,248 INFO [Thread-22] regionserver.HRegionServer: ***** STOPPING region server 'fipl5,61500,1576484594079' *****
2019-12-16 13:53:18,248 INFO [Thread-22] regionserver.HRegionServer: STOPPED: Stopped by Thread-22
2019-12-16 13:53:18,248 INFO [M:0;fipl5:61500] regionserver.HRegionServer: Stopping infoServer
2019-12-16 13:53:18,259 INFO [M:0;fipl5:61500] handler.ContextHandler: Stopped o.e.j.w.WebAppContext@39c1fe0b\{/,null,UNAVAILABLE}{file:/User/atlas/distro/target/apache-atlas-3.0.0-SNAPSHOT-bin/apache-atlas-3.0.0-SNAPSHOT/hbase/hbase-webapps/master}
{noformat}

This shows that Atlas failed to start because the HBase Master failed to start, with the following error:

{code:none}
_master.HMaster: Failed to become active master
java.lang.IllegalStateException: The procedure WAL relies on the ability to hsync for proper operation during component failures, but the underlying filesystem does not support doing so.
Please check the config value of 'hbase.procedure.store.wal.use.hsync' to set the desired level of robustness and ensure the config value of 'hbase.wal.dir' points to a FileSystem mount that can provide it._
{code}

After adding the following property, the HBase Master starts successfully and Atlas starts as well:

{noformat}
<property>
  <name>hbase.unsafe.stream.capability.enforce</name>
  <value>false</value>
</property>
{noformat}

> Atlas start fails in embedded-hbase mode with zookeeper error
> -------------------------------------------------------------
>
>                 Key: ATLAS-3561
>                 URL: https://issues.apache.org/jira/browse/ATLAS-3561
>             Project: Atlas
>          Issue Type: Bug
>          Components: atlas-core
>    Affects Versions: 3.0.0
>            Reporter: chaitali borole
>            Assignee: chaitali borole
>            Priority: Minor
>             Fix For: 3.0.0
>
>
> After compiling Atlas with {{mvn clean package -Pdist,embedded-hbase-solr}}
> and starting Atlas with the embedded services HBase, Solr and Kafka using
> {{atlas_start.py}}, Atlas fails to start with the error below in
> {{application.log}}:
> {noformat}
> 2019-12-09 16:01:28,839 INFO - [main:] ~ Not running setup per configuration atlas.server.run.setup.on.start.
> (SetupSteps$SetupRequired:189)
> 2019-12-09 16:01:32,786 WARN - [ReadOnlyZKClient-localhost:2181@0x0fa5f81c-SendThread(localhost:2181):] ~ Session 0x16eea36b27b0003 for server null, unexpected error, closing socket connection and attempting reconnect (ClientCnxn$SendThread:1102)
> java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>     at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> 2019-12-09 16:01:32,889 WARN - [ReadOnlyZKClient-localhost:2181@0x0fa5f81c:] ~ 0x0fa5f81c to localhost:2181 failed for get of /hbase/meta-region-server, code = CONNECTIONLOSS, retries = 1 (ReadOnlyZKClient$ZKTask$1:183)
> 2019-12-09 16:01:34,004 WARN - [ReadOnlyZKClient-localhost:2181@0x0fa5f81c-SendThread(localhost:2181):] ~ Session 0x16eea36b27b0003 for server null, unexpected error, closing socket connection and attempting reconnect (ClientCnxn$SendThread:1102)
> java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>     at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> {noformat}
>
> *Workaround*
> Adding the property below to {{hbase-site.xml.template}} and then running
> {{mvn clean package -Pdist,embedded-hbase-solr}} resolves the issue.
>
> {code:none}
> <property>
>   <name>hbase.unsafe.stream.capability.enforce</name>
>   <value>false</value>
> </property>
> {code}
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)