[ https://issues.apache.org/jira/browse/TAJO-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520860#comment-14520860 ]
ASF GitHub Bot commented on TAJO-1586:
--------------------------------------
Github user jinossy commented on a diff in the pull request:
https://github.com/apache/tajo/pull/560#discussion_r29403727
--- Diff: tajo-core/src/main/java/org/apache/tajo/ha/HdfsServiceTracker.java ---
@@ -70,12 +70,45 @@
   public HdfsServiceTracker(TajoConf conf) throws IOException {
     this.conf = conf;
+  }
+
+  @Override
+  public void register() throws IOException {
     initSystemDirectory();
     InetSocketAddress socketAddress = conf.getSocketAddrVar(ConfVars.TAJO_MASTER_UMBILICAL_RPC_ADDRESS);
-    this.masterName = socketAddress.getAddress().getHostAddress() + ":" + socketAddress.getPort();
+    this.masterName = socketAddress.getAddress().getHostName()+ ":" + socketAddress.getPort();
--- End diff --
If the hostname is not an FQDN, resolving it may throw an UnknownHostException.
In my opinion, using the IP address is safer. What do you think?
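
To make the trade-off in this comment concrete, here is a minimal Java sketch. It is not the actual HdfsServiceTracker code: the class MasterNameSketch, its two helper methods, the address 10.0.0.5, the host name tajo-master, and port 26001 are all hypothetical and chosen only for illustration. It relies only on java.net.InetSocketAddress from the JDK.

```java
import java.net.InetSocketAddress;

public class MasterNameSketch {

    // Hypothetical helper: builds "ip:port" from a resolved address.
    // Fails fast if the address was never resolved, instead of hitting
    // a NullPointerException on getAddress().
    static String masterNameByIp(InetSocketAddress addr) {
        if (addr.isUnresolved()) {
            throw new IllegalStateException("unresolved address: " + addr);
        }
        return addr.getAddress().getHostAddress() + ":" + addr.getPort();
    }

    // Hypothetical helper: builds "host:port" from whatever host string the
    // address carries. getHostString() does not trigger a reverse lookup.
    static String masterNameByHost(InetSocketAddress addr) {
        return addr.getHostString() + ":" + addr.getPort();
    }

    public static void main(String[] args) {
        // A literal IP is parsed without a DNS query.
        InetSocketAddress resolved = new InetSocketAddress("10.0.0.5", 26001);
        System.out.println(masterNameByIp(resolved));

        // A short (non-FQDN) name that this node never resolved.
        InetSocketAddress shortName =
            InetSocketAddress.createUnresolved("tajo-master", 26001);
        System.out.println(masterNameByHost(shortName));

        // A reader of "tajo-master:26001" still has to resolve the name
        // itself; when that fails, the address stays unresolved.
        System.out.println(shortName.isUnresolved());
    }
}
```

An entry written as ip:port can be used by any worker that can route to that address, whereas a host:port entry only works if every worker can resolve the same name.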
> TajoMaster HA startup failure on Yarn.
> --------------------------------------
>
> Key: TAJO-1586
> URL: https://issues.apache.org/jira/browse/TAJO-1586
> Project: Tajo
> Issue Type: Bug
> Components: tajo master
> Affects Versions: 0.10.0
> Reporter: Jaehwa Jung
> Assignee: Jaehwa Jung
>
> I tried to deploy Tajo on YARN with Slider, but the deployment failed
> because of a TajoMaster HA failure. TajoWorker failed to load the TajoMaster
> address, as follows.
> {code:xml}
> 2015-04-28 04:52:22,266 INFO org.apache.hadoop.service.AbstractService: Service org.apache.tajo.worker.TajoWorker failed in state STARTED; cause: org.apache.tajo.service.ServiceTrackerException: org.apache.tajo.service.ServiceTrackerException: No active master entry
> org.apache.tajo.service.ServiceTrackerException: org.apache.tajo.service.ServiceTrackerException: No active master entry
>     at org.apache.tajo.ha.HdfsServiceTracker.getAddressElements(HdfsServiceTracker.java:441)
>     at org.apache.tajo.ha.HdfsServiceTracker.getUmbilicalAddress(HdfsServiceTracker.java:348)
>     at org.apache.tajo.worker.TajoWorker.serviceStart(TajoWorker.java:318)
>     at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>     at org.apache.tajo.worker.TajoWorker.startWorker(TajoWorker.java:141)
>     at org.apache.tajo.worker.TajoWorker.main(TajoWorker.java:627)
> Caused by: org.apache.tajo.service.ServiceTrackerException: No active master entry
>     at org.apache.tajo.ha.HdfsServiceTracker.getAddressElements(HdfsServiceTracker.java:413)
>     ... 5 more
> 2015-04-28 04:52:22,307 INFO org.apache.hadoop.service.AbstractService: Service WorkerHeartbeatService failed in state STOPPED; cause: java.lang.NullPointerException
> java.lang.NullPointerException
>     at org.apache.tajo.worker.WorkerHeartbeatService$WorkerHeartbeatThread.access$000(WorkerHeartbeatService.java:101)
>     at org.apache.tajo.worker.WorkerHeartbeatService.serviceStop(WorkerHeartbeatService.java:90)
>     at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
>     at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
>     at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
>     at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157)
>     at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131)
>     at org.apache.tajo.worker.TajoWorker.serviceStop(TajoWorker.java:375)
>     at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
>     at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
>     at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
>     at org.apache.hadoop.service.AbstractService.start(AbstractService.java:203)
>     at org.apache.tajo.worker.TajoWorker.startWorker(TajoWorker.java:141)
>     at org.apache.tajo.worker.TajoWorker.main(TajoWorker.java:627)
> {code}
> I think the cause of this failure is the time difference between TajoMaster
> and TajoWorker.