Hi,

I read the following in the DUCC book:

Agents monitor nodes, sending heartbeat packets with node statistics to interested components (such as the RM and web server).

Status

   This shows the current state of a machine. Values include:

   defined
       The node is in the DUCCnodes file
       <http://192.168.10.144:52133/doc/duccbook.html#x1-23600012.6>,
       but no DUCC process has been started there, or else there is a
       communication problem and the state messages are not being
       delivered.
   up
       The node has a DUCC Agent process running on it and the web
       server is receiving regular heartbeat packets from it.
   down
       The node had a healthy DUCC Agent on it at some point in the
       past (since the last DUCC boot), but the web server has stopped
       receiving heartbeats from it.

       The agent may have been manually shut down, may have crashed, or
       there may be a communication problem.

       Additionally, very heavy loads from jobs running on the node
       can cause the DUCC Agent's heartbeats to be delayed.
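
To check my understanding of the description above, here is how I picture the three states being derived from heartbeat arrival times. This is only my own sketch, not DUCC code: the function name `node_status`, the 60-second timeout, and the use of plain timestamps are all assumptions of mine.

```python
from typing import Optional

# Assumed value for illustration only; I do not know DUCC's real threshold.
HEARTBEAT_TIMEOUT_SECONDS = 60.0


def node_status(last_heartbeat: Optional[float],
                now: float,
                timeout: float = HEARTBEAT_TIMEOUT_SECONDS) -> str:
    """Classify a node from the time of its most recent heartbeat.

    last_heartbeat: timestamp (seconds) of the last heartbeat received,
                    or None if none has arrived since the last boot.
    now:            current timestamp (seconds).
    """
    if last_heartbeat is None:
        # Listed in the nodes file, but nothing has been heard from it yet.
        return "defined"
    if now - last_heartbeat <= timeout:
        # Heartbeats are arriving regularly.
        return "up"
    # It was healthy at some point, but the heartbeats have stopped.
    return "down"
```

If this picture is right, a heartbeat that is merely delayed (by heavy load or a slow network) would look the same as a crashed Agent once the timeout passes.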

I have a few questions:

1.    What are heartbeat packets?
2.    Are they the same as the heartbeats described at this URL: http://250bpm.com/blog:22?
3.    How do the daemons broadcast a heartbeat?
4.    How do the Agents on the nodes send heartbeat packets?

I ask because my DUCC Agents keep going down repeatedly during a particular time period.

5.   How can I tell whether the Agents went down because of a network issue?

Thanks in advance.

Reshu.
