[
https://issues.apache.org/jira/browse/AMBARI-19416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15810928#comment-15810928
]
Hadoop QA commented on AMBARI-19416:
------------------------------------
{color:green}+1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12846213/AMBARI-19416.v2.patch
against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author
tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new
or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.
{color:green}+1 release audit{color}. The applied patch does not increase
the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in
ambari-agent.
Test results:
https://builds.apache.org/job/Ambari-trunk-test-patch/9957//testReport/
Console output:
https://builds.apache.org/job/Ambari-trunk-test-patch/9957//console
This message is automatically generated.
> Ambari agents remain in heartbeat lost state after ambari server restart
> ------------------------------------------------------------------------
>
> Key: AMBARI-19416
> URL: https://issues.apache.org/jira/browse/AMBARI-19416
> Project: Ambari
> Issue Type: Bug
> Reporter: Sebastian Toader
> Assignee: Sebastian Toader
> Priority: Critical
> Fix For: 3.0.0
>
> Attachments: AMBARI-19416.v2.patch
>
>
> With the implementation of https://issues.apache.org/jira/browse/AMBARI-18505
> the execution of status commands is done in a separate child process. Status
> commands received from the server by the ambari agent are passed to the status
> command executor child process via a queue ({{multiprocessing.Queue()}}). If
> the child process is killed, either manually or by the parent process,
> the queue may end up in a bad state (see: http://bugs.python.org/issue20527),
> so the re-spawned status command executor child process may no longer receive
> new status commands.
> When the ambari server is restarted, the agent re-registers with it and,
> upon re-registration, re-spawns the status command child process in order
> to receive up-to-date agent configs
> (https://issues.apache.org/jira/browse/AMBARI-19392). In this case the queue
> may get stuck, so the status commands are not received by the status command
> executor child process and the ambari agent stays in the heartbeat lost
> state.
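
For illustration only, here is a minimal Python 3 sketch of one way to sidestep the stuck-queue problem described above: whenever the status command child process is re-spawned, the parent discards the old queue and gives the new child a fresh one. This is not necessarily what the attached patch does. The names StatusExecutorSupervisor, status_worker, respawn, submit and stop are hypothetical, and the sketch assumes a Unix host where the "fork" start method is available.

```python
import multiprocessing

# Assumption: a Unix host, so the "fork" start method is available.
_ctx = multiprocessing.get_context("fork")


def status_worker(commands, results):
    """Child process loop: handle commands until a None sentinel arrives."""
    while True:
        command = commands.get()
        if command is None:
            break
        results.put("handled " + command)


class StatusExecutorSupervisor(object):
    """Hypothetical parent-side helper, not Ambari's actual class."""

    def __init__(self):
        self.commands = None
        self.results = None
        self.process = None

    def respawn(self):
        # Kill the old child, if any. It may die while holding the old
        # queue's internal lock, which is what can leave that queue stuck.
        if self.process is not None and self.process.is_alive():
            self.process.terminate()
            self.process.join()
        # Never reuse the old queues: hand the new child fresh ones.
        self.commands = _ctx.Queue()
        self.results = _ctx.Queue()
        self.process = _ctx.Process(
            target=status_worker, args=(self.commands, self.results))
        self.process.daemon = True
        self.process.start()

    def submit(self, command):
        self.commands.put(command)

    def stop(self):
        # Ask the current child to exit cleanly via the sentinel.
        if self.process is not None and self.process.is_alive():
            self.commands.put(None)
            self.process.join(10)
```

In this sketch the parent would call respawn() on every re-registration and submit() for each status command received from the server. The trade-off is that any commands still sitting in the old queue are dropped on respawn, which seems acceptable for periodic status commands but is an assumption of the sketch.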
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)