[ https://issues.apache.org/jira/browse/HDFS-13198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16378027#comment-16378027 ]
Íñigo Goiri commented on HDFS-13198:
------------------------------------

I was able to reproduce it locally.
The sequence is: {{StateStoreZooKeeperImpl}} gets stuck when connecting to ZooKeeper; then, when {{RouterSafemodeService}} calls {{updateRouterState()}}, the state store is not yet initialized.
I haven't figured out why the internal version doesn't have this issue, but I think it makes sense to check if the driver is ready before triggering a change in state.

> RBF: RouterHeartbeatService throws out CachedStateStore related exceptions when starting router
> -----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-13198
>                 URL: https://issues.apache.org/jira/browse/HDFS-13198
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Wei Yan
>            Assignee: Wei Yan
>            Priority: Minor
>
> Exception looks like:
> {code:java}
> 2018-02-23 19:04:56,341 ERROR router.RouterHeartbeatService: Cannot get version for class org.apache.hadoop.hdfs.server.federation.store.MembershipStore: Cached State Store not initialized, MembershipState records not valid
> 2018-02-23 19:04:56,341 ERROR router.RouterHeartbeatService: Cannot get version for class org.apache.hadoop.hdfs.server.federation.store.MountTableStore: Cached State Store not initialized, MountTable records not valid
> Exception in thread "Router Heartbeat Async" java.lang.NullPointerException
>     at org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreSerializableImpl.serialize(StateStoreSerializableImpl.java:60)
>     at org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreZooKeeperImpl.putAll(StateStoreZooKeeperImpl.java:191)
>     at org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreBaseImpl.put(StateStoreBaseImpl.java:75)
>     at org.apache.hadoop.hdfs.server.federation.store.impl.RouterStoreImpl.routerHeartbeat(RouterStoreImpl.java:88)
>     at org.apache.hadoop.hdfs.server.federation.router.RouterHeartbeatService.updateStateStore(RouterHeartbeatService.java:95)
>     at org.apache.hadoop.hdfs.server.federation.router.RouterHeartbeatService.access$000(RouterHeartbeatService.java:43)
>     at org.apache.hadoop.hdfs.server.federation.router.RouterHeartbeatService$1.run(RouterHeartbeatService.java:68)
>     at java.lang.Thread.run(Thread.java:748){code}
> This is because, while the Router is starting, the CachedStateStore hasn't been initialized and cannot serve requests. Although the router will still start, it would be better to fix the exceptions.
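
The guard suggested in the comment (check that the driver is ready before triggering a change in state) could look roughly like the sketch below. It is self-contained and does not use the Hadoop classes: {{Driver}}, {{SlowDriver}}, {{Heartbeat}} and their methods are hypothetical stand-ins invented for illustration, and the actual fix for HDFS-13198 may be structured differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-ins for illustration; these are not the Hadoop RBF classes. */
final class ReadyGuardSketch {

  /** Minimal view of a State Store driver that may not be connected yet. */
  interface Driver {
    boolean isReady();
    void put(String record);
  }

  /** Driver that rejects writes until init() has completed. */
  static final class SlowDriver implements Driver {
    private volatile boolean ready = false;
    private final List<String> records = new ArrayList<>();

    /** Simulates the connection to the backing store finishing. */
    void init() {
      ready = true;
    }

    @Override
    public boolean isReady() {
      return ready;
    }

    @Override
    public void put(String record) {
      if (!ready) {
        throw new IllegalStateException("driver not initialized");
      }
      records.add(record);
    }

    List<String> records() {
      return records;
    }
  }

  /** Heartbeat that checks driver readiness before writing the router state. */
  static final class Heartbeat {
    private final Driver driver;

    Heartbeat(Driver driver) {
      this.driver = driver;
    }

    /** Returns true if the state was written, false if the update was skipped. */
    boolean updateRouterState(String state) {
      if (!driver.isReady()) {
        // Skip this round instead of failing; a later heartbeat can retry.
        return false;
      }
      driver.put(state);
      return true;
    }
  }

  public static void main(String[] args) {
    SlowDriver driver = new SlowDriver();
    Heartbeat heartbeat = new Heartbeat(driver);

    // Driver is still connecting, so the update is skipped without an exception.
    System.out.println("written before init: " + heartbeat.updateRouterState("SAFEMODE"));

    driver.init();
    System.out.println("written after init: " + heartbeat.updateRouterState("RUNNING"));
  }
}
```

Skipping the update rather than throwing keeps the heartbeat thread alive during startup, so the next scheduled heartbeat can write the state once the driver has connected.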