[
https://issues.apache.org/jira/browse/HAWQ-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lin Wen resolved HAWQ-979.
--------------------------
Resolution: Fixed
> Resource Broker Should Reconnect Hadoop Yarn When Failed to Get Cluster Report
> ------------------------------------------------------------------------------
>
> Key: HAWQ-979
> URL: https://issues.apache.org/jira/browse/HAWQ-979
> Project: Apache HAWQ
> Issue Type: Bug
> Components: Resource Manager
> Reporter: Lin Wen
> Assignee: Lin Wen
> Fix For: 2.0.1.0-incubating
>
>
> While HAWQ is running in YARN mode, the heartbeat thread of libyarn can
> sometimes fail and quit (e.g. when the YARN RM restarts):
> 2016-08-03 18:45:27.913838
> PDT,,,p34645,th-1290610400,,,,0,con4,,seg-10000,,,,,"WARNING","01000","YARN
> mode resource broker failed to get YARN queue report of queue default.
> LibYarnClient::getQueueInfo, Catch the Exception:LibYarnClient::libyarn AM
> heartbeat thread has stopped.",,,,,,,0,,"resourcebroker_LIBYARN_proc.c",1840,
> The resource broker process should re-register HAWQ with YARN in this case,
> but it does not.
> The reason is:
> In function handleRM2RB_GetClusterReport(), when RB2YARN_getQueueReport()
> fails, function sendRBGetClusterReportErrorData() is called, but
> sendRBGetClusterReportErrorData() returns OK (it should return
> RESBROK_ERROR_GRM), so the failure is not propagated and no re-registration
> is triggered.
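
The return-code problem described above can be illustrated with a minimal sketch in C. This is not the HAWQ source: every name and value below is hypothetical and only mirrors the shape of the bug, assuming the caller decides whether to reconnect purely from the handler's return code.

```c
#include <stdbool.h>

/* Hypothetical status codes. Only the name RESBROK_ERROR_GRM appears in the
 * issue; the numeric values here are made up for illustration. */
enum {
    SKETCH_OK = 0,
    SKETCH_RESBROK_ERROR_GRM = 1
};

/* Stand-in for the queue report call: fails when the YARN side is down. */
int sketch_get_queue_report(bool yarn_alive)
{
    return yarn_alive ? SKETCH_OK : SKETCH_RESBROK_ERROR_GRM;
}

/* Buggy variant: the error reply is sent, but OK is returned,
 * so the failure is hidden from the caller. */
int sketch_send_error_buggy(void)
{
    return SKETCH_OK;
}

/* Fixed variant: the error reply is sent and the error code is returned. */
int sketch_send_error_fixed(void)
{
    return SKETCH_RESBROK_ERROR_GRM;
}

/* Stand-in for the request handler: on failure it returns whatever the
 * error-sending function returns. */
int sketch_handle_get_cluster_report(bool yarn_alive, int (*send_error)(void))
{
    if (sketch_get_queue_report(yarn_alive) != SKETCH_OK) {
        return send_error();
    }
    return SKETCH_OK;
}

/* Assumed caller policy: reconnect only when the handler reports an error. */
bool sketch_should_reconnect(int handler_result)
{
    return handler_result != SKETCH_OK;
}
```

With the buggy variant, sketch_should_reconnect() is false even though the queue report failed. With the fixed variant it becomes true, which is the condition under which a broker could re-register.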