Hi list,

My team would like to report an unusual pattern we're seeing, and find
out whether it's a bug and, if so, whether it will be fixed in a future
OSSEC version.

Our setup is an OSSEC v2.4.1 manager monitoring around a dozen servers
and workstations (using agent v2.4.1), a mix of Windows 2003 and *nix.
Our systems are strictly one-way; that is, the agents can talk to the
manager, but the manager cannot talk to the agents. Our *nix agents
behave as expected, but our Windows 2003 agents frequently show up as
"disconnected" when checked with agent_control. However, the Windows
agents ARE connected, and are sending events before and after
agent_control shows them as disconnected.

It was our understanding that agent_control marks an agent as
disconnected if that agent hasn't sent an event within the last 15
minutes (if this is incorrect, please tell us what the actual rule
is). Our systems tend to be very quiet for long periods of time, so we
originally thought that the "disconnects" were happening simply
because there were no events.

To fix this, we implemented a heartbeat that uses OSSEC's process
monitoring tool to periodically send events. On Windows, the process
monitor sends an event every 5-7 minutes. We expected that if OSSEC
saw a heartbeat every 5-7 minutes, the 15-minute disconnect period in
agent_control would never trigger.
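
To make that expectation concrete, here is a minimal Python sketch of
the rule as we understand it. It is only our mental model, not OSSEC's
actual implementation: the 15-minute value and the names
DISCONNECT_AFTER, expected_status and ever_disconnected are our own
assumptions for illustration.

```python
from datetime import datetime, timedelta

# Assumption: agent_control flags an agent once this much time passes
# without an event. This is our understanding, not a value read from OSSEC.
DISCONNECT_AFTER = timedelta(minutes=15)


def expected_status(last_event: datetime, checked_at: datetime) -> str:
    """Return the status we would expect under the assumed 15-minute rule."""
    if checked_at - last_event > DISCONNECT_AFTER:
        return "Disconnected"
    return "Active"


def ever_disconnected(heartbeat_every: timedelta, hours: int = 24) -> bool:
    """Check once a minute for `hours` hours, with a heartbeat arriving at a
    fixed interval, and report whether the assumed rule would ever trip."""
    start = datetime(2011, 1, 1)  # arbitrary start time for the simulation
    last_event = start
    for minute in range(hours * 60):
        now = start + timedelta(minutes=minute)
        # Advance to the most recent heartbeat at or before `now`.
        while last_event + heartbeat_every <= now:
            last_event += heartbeat_every
        if expected_status(last_event, now) == "Disconnected":
            return True
    return False


if __name__ == "__main__":
    for minutes in (5, 7, 20):
        tripped = ever_disconnected(timedelta(minutes=minutes))
        print(f"heartbeat every {minutes} min -> ever disconnected: {tripped}")
```

Under this model a heartbeat every 5 or 7 minutes never trips the rule,
while a 20-minute interval does, which is why we expected the heartbeat
to keep the agents listed as active.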

However, even with the heartbeat, the Windows agents continue to show
as "disconnected" for long periods of time, even when there was a
heartbeat a few minutes before we ran the agent_control query. In
addition, events continue to arrive at the manager while the agent is
shown as disconnected.

Below is the output of two agent_control runs, followed by two alerts
from OSSEC's alerts.log that show heartbeats three minutes before the
first agent_control run and one minute after the second:
_____________________________________________________________________________________
# date
Mon Apr 11 18:10:51 GMT 2011
# ./agent_control -l

OSSEC HIDS agent_control. List of available agents:
   ID: 000, Name: manager (server), IP: 127.0.0.1, Active/Local
   <snip>
   ID: 009, Name: win-server1, IP: <IP>, Disconnected

# date
Mon Apr 11 18:12:51 GMT 2011
# ./agent_control -l

OSSEC HIDS agent_control. List of available agents:
   ID: 000, Name: manager (server), IP: 127.0.0.1, Active/Local
   <snip>
   ID: 009, Name: win-server1, IP: <IP>, Disconnected

/opt/ossec/logs/alerts/alerts.log:

** Alert 1302545271.2550465: - local,syslog,
2011 Apr 11 18:07:51 (win-server1) <IP>->"C\\Program Files\ossec-agent\win_heartbeat.bat"
Rule: 101003 (level 1) -> 'OSSEC heartbeat: OSSEC Windows is running'
Src IP: (none)
User: (none)
ossec: output: '"C\\Program Files\ossec-agent\win_heartbeat.bat"': Mon 04/11/2011  18:07 PM ossec-win-hb: Heartbeat: ossec-agent.exe               5852                            0      4,652 K

<snip>

** Alert 1302545632.2566243: - local,syslog,
2011 Apr 11 18:13:52 (win-server1) <IP>->"C\\Program Files\ossec-agent\win_heartbeat.bat"
Rule: 101003 (level 1) -> 'OSSEC heartbeat: OSSEC Windows is running'
Src IP: (none)
User: (none)
ossec: output: '"C\\Program Files\ossec-agent\win_heartbeat.bat"': Mon 04/11/2011  18:13 PM ossec-win-hb: Heartbeat: ossec-agent.exe               5852                            0      4,644 K
_____________________________________________________________________________________
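
As a sanity check on the timing, the short Python sketch below works
out how long before each agent_control run the most recent heartbeat
alert was logged. The epoch values come from the two "** Alert" header
lines and the check times from the `date` output above; the 15-minute
limit is our assumption, and the helper names are hypothetical.

```python
from datetime import datetime, timezone

# Epoch seconds taken from the two "** Alert <epoch>.<id>" header lines.
HEARTBEAT_EPOCHS = [1302545271, 1302545632]

# Output of `date` immediately before each `agent_control -l` run.
CHECK_TIMES = ["Mon Apr 11 18:10:51 GMT 2011", "Mon Apr 11 18:12:51 GMT 2011"]

# Assumed disconnect limit (our understanding, not confirmed by OSSEC docs).
ASSUMED_LIMIT_SECONDS = 15 * 60


def parse_check(text: str) -> int:
    """Convert a GMT `date` output line to epoch seconds."""
    parsed = datetime.strptime(text, "%a %b %d %H:%M:%S GMT %Y")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


def seconds_since_heartbeat(check_epoch: int) -> int:
    """Seconds between a check and the latest heartbeat at or before it."""
    earlier = [hb for hb in HEARTBEAT_EPOCHS if hb <= check_epoch]
    return check_epoch - max(earlier)


gaps = [seconds_since_heartbeat(parse_check(t)) for t in CHECK_TIMES]
for text, gap in zip(CHECK_TIMES, gaps):
    print(f"{text}: {gap} s since last heartbeat "
          f"(assumed limit {ASSUMED_LIMIT_SECONDS} s)")
```

Both gaps (180 and 300 seconds) are well under the assumed 900-second
limit, so by our understanding of the rule neither run should have
reported win-server1 as disconnected.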

Although our current setup is OSSEC v2.4.1, we've been seeing this
since OSSEC v1.6.1. The problem is definitely not that the Windows
agents are disconnecting, as in the bug reports from a few years back;
rather, agent_control is marking them as disconnected while they are
still connected and still sending events. We have several instances of
this setup (dev, lab, test, etc.) and it happens in all of them.

What is going on here? Is there some reason why the process monitor
heartbeat isn't registering as an event to the manager? What else
could be causing agent_control to display the Windows agents as
disconnected when they're not?

Thanks!
-Alisha Kloc
