I had to do some network reconfiguration on our cluster. After rebooting
everything and restarting the Ambari server and the Ambari agents, the
server reports (via the UI) that it is not receiving heartbeats. However,
when I look at the server and agent logs, I see heartbeat activity:

agent:
INFO 2013-07-15 11:40:12,169 Heartbeat.py:61 - Sending heartbeat with response id: 251 and timestamp: 1373902812168
INFO 2013-07-15 11:40:12,214 Controller.py:176 - No commands sent from the Server.

server:
11:41:44,760  INFO HeartBeatHandler:108 - Received heartbeat from host, hostname=foo.net, currentResponseId=260, receivedResponseId=260
11:41:44,761  INFO AgentResource:109 - Sending heartbeat response with response id 261

(The response IDs don't match because I didn't capture the two logs at the
same time.) I suspect there may be persisted state in the PostgreSQL
database from the previous network configuration that is causing the
problem. Any suggestions for a fix short of a complete redeploy?
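
In case it helps anyone line up the two logs more carefully than I did, here
is a rough Python sketch that pulls the response IDs out of each log and
reports which ones appear on both sides. It assumes the log line formats
quoted above; the function names and the regular expressions are my own
and are not part of Ambari.

```python
import re

# Patterns are assumptions based only on the log lines quoted in this post.
# \s+ tolerates a line wrap between "response" and "id:".
AGENT_RE = re.compile(r"Sending heartbeat with response\s+id:\s*(\d+)")
SERVER_RE = re.compile(r"receivedResponseId=(\d+)")


def response_ids(text, pattern):
    """Return every response ID matched by pattern, in order of appearance."""
    return [int(value) for value in pattern.findall(text)]


def common_ids(agent_text, server_text):
    """Return the sorted IDs the agent sent that the server also logged."""
    sent = set(response_ids(agent_text, AGENT_RE))
    received = set(response_ids(server_text, SERVER_RE))
    return sorted(sent & received)
```

Reading both log files into strings and passing them to common_ids() should
show whether the heartbeats the agent sends are the ones the server logs as
received, which is all this checks.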

TIA

Brian
