[ossec-list] Re: agent_control disconnect bug?

Alisha Kloc Wed, 13 Apr 2011 15:22:38 -0700

Hi Daniel,

Why does setting the agent to be one-way stop the keepalives, and why
does it do so only on the Windows agents? The *nix agents are
apparently sending their keepalives normally, since none of them
appear disconnected, even though they are configured one-way as well,
and behind the same network firewall as the Windows agents.


Also, we did some more testing in our lab overnight, and we realized
that a one-way Windows agent which can still hear from the manager
(i.e., it is configured to be one-way, but there isn't any network
configuration preventing the manager from communicating back) remains
properly connected.

However, the same one-way Windows agent, when it can't hear from the
manager (i.e., there's a router blocking all UDP traffic between the
manager and the agent), shows as disconnected in agent_control.

So it seems to be a bug specifically in the one-way Windows agents...

Thanks!
-Alisha




On Apr 12, 12:08 pm, Daniel Cid <[email protected]> wrote:
> Hi Alisha,
>
> Always Windows giving us problems :)
>
> The manager itself (remoted) doesn't keep state for any of the agents
> (it doesn't know whether an agent is on or off). But whenever it
> receives a keepalive from an agent, it updates the timestamp of that
> agent's file in the queue.
>
> That agent file is read by the other tools (agent_control, monitord,
> etc.) to list which agents are on and off. So even if your agent is
> sending lots of events and communicating properly with the manager, if
> it doesn't send the keepalive, the manager will never update the file
> and all the tools will think it is disconnected...
>
> Hope I made sense.
>
> thanks,
>
> On Tue, Apr 12, 2011 at 3:03 PM, Alisha Kloc <[email protected]> wrote:
> > Hi,
>
> > I don't think it's because of the one-way setup. We've seen the
> > disconnect issue in lab/troubleshoot setups where one-way isn't
> > configured; also, our *nix agents are configured to be one-way as
> > well, but they always show properly as connected. It's only the
> > Windows agents that do this.
>
> > Would there be any reason for the OSSEC manager to not attribute a
> > process monitor event to a Windows agent, so it doesn't realize that
> > the heartbeat sent by the process monitor is an event for purposes of
> > keeping the agent marked alive?
>
> > Also, how exactly does OSSEC determine whether an agent is connected
> > or not? We were originally told that it just marks an agent as
> > disconnected if it hasn't seen an event from the agent in 15 minutes,
> > but you just now said the agents send keepalives. Could you clarify?
>
> > Thanks!
> > -Alisha
>
> > On Apr 12, 10:38 am, Daniel Cid <[email protected]> wrote:
> >> Hey,
>
> >> Maybe because it is set up as one-way? The manager uses the keepalives
> >> (sent internally by the agent) to keep track of whether agents are up
> >> or not. Since you have it disabled, those keepalives are never sent,
> >> so the agents show up as disconnected.
>
> >> thanks,
>
> >> On Mon, Apr 11, 2011 at 4:36 PM, Alisha Kloc <[email protected]> wrote:
> >> > Hi list,
>
> >> > My team would like to report an unusual pattern we're seeing, and find
> >> > out whether it's a bug and if it will be fixed in a future OSSEC
> >> > version.
>
> >> > Our setup is an OSSEC v2.4.1 manager monitoring around a dozen servers
> >> > and workstations (using agent v2.4.1), mixed Windows 2003 and *nix.
> >> > Our systems are strict one-way; that is, the agents can talk to the
> >> > manager, but the manager cannot talk to the agents. Our *nix agents
> >> > behave as expected, but our Windows 2003 agents frequently show up as
> >> > "disconnected" when checked with agent_control. However, the Windows
> >> > agents ARE connected, and are sending events before and after
> >> > agent_control shows them as disconnected.
>
> >> > It was our understanding that agent_control marks an agent as
> >> > disconnected if that agent hasn't sent an event within the last 15
> >> > minutes (if this is incorrect, please tell me what it really is). Our
> >> > systems tend to be very quiet for long periods of time, so we
> >> > originally thought that the "disconnects" were happening simply
> >> > because there were no events.
>
> >> > We implemented a heartbeat to fix this, which uses OSSEC's process
> >> > monitoring tool to periodically send events. On Windows, the process
> >> > monitor sends an event every 5-7 minutes. We expected that if OSSEC
> >> > saw a heartbeat every 5-7 minutes, the 15-minute disconnect period in
> >> > agent_control would never trigger.
>
> >> > However, even with the heartbeat, the Windows agents continue to show
> >> > as "disconnected" for long periods of time - even if there was a
> >> > heartbeat a few minutes prior to running the agent_control query. In
> >> > addition, events continue to arrive at the manager despite the agent
> >> > showing disconnected.
>
> >> > Below is the output of two agent_control commands, and two alerts from
> >> > OSSEC's alerts.log which show heartbeats 3 minutes before the first
> >> > agent_control and one minute after the second:
> >> > _____________________________________________________________________________________
> >> > # date
> >> > Mon Apr 11 18:10:51 GMT 2011
> >> > # ./agent_control -l
>
> >> > OSSEC HIDS agent_control. List of available agents:
> >> >   ID: 000, Name: manager (server), IP: 127.0.0.1, Active/Local
> >> >   <snip>
> >> >   ID: 009, Name: win-server1, IP: <IP>, Disconnected
>
> >> > # date
> >> > Mon Apr 11 18:12:51 GMT 2011
> >> > # ./agent_control -l
>
> >> > OSSEC HIDS agent_control. List of available agents:
> >> >   ID: 000, Name: manager (server), IP: 127.0.0.1, Active/Local
> >> >   <snip>
> >> >   ID: 009, Name: win-server1, IP: <IP>, Disconnected
>
> >> > /opt/ossec/logs/alerts/alerts.log:
>
> >> > ** Alert 1302545271.2550465: - local,syslog,
> >> > 2011 Apr 11 18:07:51 (win-server1) <IP>->"C\\Program Files\ossec-agent\win_heartbeat.bat"
> >> > Rule: 101003 (level 1) -> 'OSSEC heartbeat: OSSEC Windows is running'
> >> > Src IP: (none)
> >> > User: (none)
> >> > ossec: output: '"C\\Program Files\ossec-agent\win_heartbeat.bat"': Mon 04/11/2011  18:07 PM ossec-win-hb: Heartbeat: ossec-agent.exe               5852                            0      4,652 K
>
> >> > <snip>
>
> >> > ** Alert 1302545632.2566243: - local,syslog,
> >> > 2011 Apr 11 18:13:52 (win-server1) <IP>->"C\\Program Files\ossec-agent\win_heartbeat.bat"
> >> > Rule: 101003 (level 1) -> 'OSSEC heartbeat: OSSEC Windows is running'
> >> > Src IP: (none)
> >> > User: (none)
> >> > ossec: output: '"C\\Program Files\ossec-agent\win_heartbeat.bat"': Mon 04/11/2011  18:13 PM ossec-win-hb: Heartbeat: ossec-agent.exe               5852                            0      4,644 K
> >> > _____________________________________________________________________________________
>
> >> > Although our current setup is OSSEC v2.4.1, we've been seeing this
> >> > since OSSEC v1.6.1. It's definitely not that the Windows agents are
> >> > disconnecting, as in the bug reports from a few years back, but that
> >> > agent_control is marking them as disconnected when they are still
> >> > connected and still sending events. We have several instances of this
> >> > setup (dev, lab, test, etc) and it happens in all of them.
>
> >> > What is going on here? Is there some reason why the process monitor
> >> > heartbeat isn't registering as an event to the manager? What else
> >> > could be causing agent_control to display the Windows agents as
> >> > disconnected when they're not?
>
> >> > Thanks!
> >> > -Alisha Kloc
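
For readers following the thread, the status logic Daniel describes (remoted updates a per-agent file's timestamp on each keepalive, and tools such as agent_control read that timestamp) can be sketched roughly as below. This is only an illustration, not OSSEC's actual code: the function names are made up for this example, the file path passed in is hypothetical, the "Never connected" label for a missing file is illustrative, and the 15-minute limit is simply the figure mentioned in this thread, not a value confirmed from the OSSEC source.

```python
import os
import time

# Assumption: 15 minutes is the figure quoted in this thread; the real
# threshold used by agent_control is not confirmed here.
DISCONNECT_AFTER_SECONDS = 15 * 60


def status_from_timestamp(last_keepalive, now, limit=DISCONNECT_AFTER_SECONDS):
    """Return 'Active' if the last keepalive is recent enough, else 'Disconnected'."""
    return "Active" if (now - last_keepalive) <= limit else "Disconnected"


def status_from_file(agent_file, now=None, limit=DISCONNECT_AFTER_SECONDS):
    """Derive a status from the modification time of a per-agent file.

    agent_file is a hypothetical path standing in for "the agent file in
    the queue"; only keepalives are assumed to refresh its timestamp.
    """
    if now is None:
        now = time.time()
    try:
        last_keepalive = os.path.getmtime(agent_file)
    except OSError:
        # No file at all: no keepalive has ever been recorded for this agent.
        return "Never connected"
    return status_from_timestamp(last_keepalive, now, limit)
```

Under this model, ordinary events (such as the process-monitor heartbeat) never touch the file, which would explain why an agent can keep generating alerts while still being listed as Disconnected.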
