[ https://issues.apache.org/jira/browse/HIVE-10648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15010040#comment-15010040 ]
Sergey Shelukhin commented on HIVE-10648:
-----------------------------------------
[~gopalv] any update on this?
> LLAP: registry; Tez attempted to schedule to daemon that didn't exist
> ---------------------------------------------------------------------
>
> Key: HIVE-10648
> URL: https://issues.apache.org/jira/browse/HIVE-10648
> Project: Hive
> Issue Type: Sub-task
> Reporter: Sergey Shelukhin
> Assignee: Gopal V
>
> I can post the logs externally; for now, the app IDs on the test cluster are
> application_1429683757595_0784 and application_1429683757595_0783. I also
> have the logs copied over.
> The AM found the node (the same log lines appear for other nodes):
> {noformat}
> 2015-05-07 12:13:28,074 INFO
> [ServiceThread:org.apache.tez.dag.app.rm.TaskSchedulerEventHandler]
> impl.LlapYarnRegistryImpl: Adding new worker
> 342f4992-2608-43ab-a119-b50882e35f75 which mapped to DynamicServiceInstance
> [alive=true, host=cn059-10.l42scl.hortonworks.com:15001 with
> resources=<memory:20480, vCores:6>]
> ....
> 2015-05-07 12:13:28,082 INFO [Dispatcher thread: Central] node.AMNodeTracker:
> Num cluster nodes = 19
> {noformat}
> The trouble is that this node never actually existed. The cluster only had 15
> nodes.
> As the job progressed, the AM repeatedly tried to schedule to this node and
> failed. No other LLAP cluster was running at the same time.
> In fact, given that I always start a 15-node cluster, I am not sure where
> the 19-node data could conceivably come from.
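One way to cross-check the registry against the real cluster is to pull every host out of the "Adding new worker" lines in the AM log and diff them against the hosts that were actually started. The following is a minimal Python sketch, assuming the log format shown in the description above. The function names and the expected-host list are hypothetical and are not part of Hive or Tez.

```python
import re

# Matches the "Adding new worker" entries as they appear in the AM log above.
# Assumption: the format is exactly the one quoted in the issue description.
WORKER_RE = re.compile(
    r"Adding new worker\s+(?P<worker>\S+)\s+which mapped to\s+"
    r"DynamicServiceInstance\s+\[alive=(?P<alive>\w+),\s+"
    r"host=(?P<host>[^\s:]+):(?P<port>\d+)"
)


def registered_hosts(log_text):
    """Return {host: [worker_id, ...]} for every 'Adding new worker' entry."""
    # Collapse whitespace so entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    hosts = {}
    for match in WORKER_RE.finditer(flat):
        hosts.setdefault(match.group("host"), []).append(match.group("worker"))
    return hosts


def unexpected_hosts(log_text, expected_hosts):
    """Hosts the AM registered that are not in the (hypothetical) expected list."""
    return sorted(set(registered_hosts(log_text)) - set(expected_hosts))


# Sample taken from the log excerpt in the issue description.
SAMPLE_LOG = """
2015-05-07 12:13:28,074 INFO
[ServiceThread:org.apache.tez.dag.app.rm.TaskSchedulerEventHandler]
impl.LlapYarnRegistryImpl: Adding new worker
342f4992-2608-43ab-a119-b50882e35f75 which mapped to DynamicServiceInstance
[alive=true, host=cn059-10.l42scl.hortonworks.com:15001 with
resources=<memory:20480, vCores:6>]
"""

if __name__ == "__main__":
    # The expected list is a placeholder; fill in the 15 hosts actually started.
    expected = []
    for host in unexpected_hosts(SAMPLE_LOG, expected):
        print("registered but not expected:", host)
```

Running this over the full AM log with the real 15-host list would show which registry entries account for the extra nodes, and whether any host was registered under more than one worker ID.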
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)