Jaehwa Jung created TAJO-1586:
---------------------------------
Summary: TajoMaster HA startup failure on Yarn.
Key: TAJO-1586
URL: https://issues.apache.org/jira/browse/TAJO-1586
Project: Tajo
Issue Type: Bug
Components: tajo master
Affects Versions: 0.10.0
Reporter: Jaehwa Jung
Assignee: Jaehwa Jung
I tried to deploy Tajo on YARN with Slider, but the deployment failed because
of a TajoMaster HA failure. TajoWorker failed to load the TajoMaster address,
as follows.
{code:xml}
2015-04-28 04:52:22,266 INFO org.apache.hadoop.service.AbstractService: Service org.apache.tajo.worker.TajoWorker failed in state STARTED; cause: org.apache.tajo.service.ServiceTrackerException: org.apache.tajo.service.ServiceTrackerException: No active master entry
org.apache.tajo.service.ServiceTrackerException: org.apache.tajo.service.ServiceTrackerException: No active master entry
    at org.apache.tajo.ha.HdfsServiceTracker.getAddressElements(HdfsServiceTracker.java:441)
    at org.apache.tajo.ha.HdfsServiceTracker.getUmbilicalAddress(HdfsServiceTracker.java:348)
    at org.apache.tajo.worker.TajoWorker.serviceStart(TajoWorker.java:318)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.tajo.worker.TajoWorker.startWorker(TajoWorker.java:141)
    at org.apache.tajo.worker.TajoWorker.main(TajoWorker.java:627)
Caused by: org.apache.tajo.service.ServiceTrackerException: No active master entry
    at org.apache.tajo.ha.HdfsServiceTracker.getAddressElements(HdfsServiceTracker.java:413)
    ... 5 more
2015-04-28 04:52:22,307 INFO org.apache.hadoop.service.AbstractService: Service WorkerHeartbeatService failed in state STOPPED; cause: java.lang.NullPointerException
java.lang.NullPointerException
    at org.apache.tajo.worker.WorkerHeartbeatService$WorkerHeartbeatThread.access$000(WorkerHeartbeatService.java:101)
    at org.apache.tajo.worker.WorkerHeartbeatService.serviceStop(WorkerHeartbeatService.java:90)
    at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
    at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
    at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
    at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157)
    at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131)
    at org.apache.tajo.worker.TajoWorker.serviceStop(TajoWorker.java:375)
    at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
    at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
    at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:203)
    at org.apache.tajo.worker.TajoWorker.startWorker(TajoWorker.java:141)
    at org.apache.tajo.worker.TajoWorker.main(TajoWorker.java:627){code}
I think the cause of this failure is a timing difference between TajoMaster
and TajoWorker startup.
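
If the timing guess is right, one possible mitigation would be for the worker
to poll for the active master for a while instead of failing on the first
miss. The following is only a minimal sketch of that idea in plain Java, not
a patch. The class MasterLookupRetry, the method waitForActiveMaster, the
Supplier-based lookup and the address string are all hypothetical stand-ins;
none of them are Tajo APIs, and the real lookup in HdfsServiceTracker is not
modeled here.

```java
import java.util.Optional;
import java.util.function.Supplier;

public class MasterLookupRetry {

    /**
     * Polls the given lookup until it returns an address or the attempts
     * run out. The lookup is a hypothetical stand-in for "ask the service
     * tracker for the active master"; it is not a Tajo API.
     */
    public static Optional<String> waitForActiveMaster(
            Supplier<Optional<String>> lookup, int maxAttempts, long delayMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Optional<String> address = lookup.get();
            if (address.isPresent()) {
                return address;
            }
            // Sleep only between attempts, not after the last one.
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and give up early.
                    Thread.currentThread().interrupt();
                    return Optional.empty();
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Simulated tracker: the master only shows up on the third poll.
        // The address is a placeholder, not taken from the log above.
        int[] polls = {0};
        Supplier<Optional<String>> lookup = () ->
                ++polls[0] < 3
                        ? Optional.<String>empty()
                        : Optional.of("tajo-master.example:26001");

        Optional<String> master = waitForActiveMaster(lookup, 5, 10L);
        System.out.println(master.orElse("no active master")
                + " after " + polls[0] + " polls");
    }
}
```

With a bounded number of attempts the worker would still fail eventually if no
master ever becomes active, so a real startup problem is not hidden.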