Hello,

I just wanted to share my experience with agents showing up as disconnected in 
agent_control.
I have a setup with 200+ agents deployed, and every agent was in a connected 
state until 2 days ago.
I have some Windows agents, but most of them are Unix (RedHat/CentOS/AIX). I 
have several VLANs; some are protected by a firewall, others are not.
I don't use SELinux, and iptables always worked before. We didn't make any 
major changes to the infrastructure.

I have a Zabbix graph keeping an eye on all connected OSSEC agents, which gives 
us a history of all deployed agents. Everything was fine until 2 days ago, when 
the count dropped from 169 to 15.
After running agent_control -l on the OSSEC server, I discovered that most of 
the agents were marked as disconnected. I also discovered that some of the 
sysadmins had been cloning machines, and with them the OSSEC keys, across 
multiple machines.
This created issues with the rids files (I have an OSSEC rule warning me about 
such issues, and I got way too many emails to ignore the problem). I found the 
duplicate keys and fixed them, then started working on the disconnected agents.
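
For anyone who wants to track the same numbers, here is a minimal Python sketch 
that tallies agent states from the output of agent_control -l. It assumes each 
agent line looks like "ID: 001, Name: host1, IP: 10.0.0.1, Active", with the 
status as the last comma-separated field. That layout, the function name, and 
the sample hosts are assumptions, so check them against your own output.

```python
from collections import Counter


def count_agent_states(listing: str) -> Counter:
    """Tally the status field of each agent line in `agent_control -l` output."""
    states = Counter()
    for line in listing.splitlines():
        line = line.strip()
        # Only agent lines start with "ID:"; skip the header and blank lines.
        if not line.startswith("ID:"):
            continue
        # Assumed layout: the status is the last comma-separated field.
        status = line.rsplit(",", 1)[-1].strip()
        states[status] += 1
    return states


if __name__ == "__main__":
    # Hypothetical sample; real input would be the text printed by
    # agent_control -l on the OSSEC server.
    sample = """\
OSSEC HIDS agent_control. List of available agents:
   ID: 000, Name: server (server), IP: 127.0.0.1, Active/Local
   ID: 001, Name: host1, IP: 10.0.0.1, Active
   ID: 002, Name: host2, IP: 10.0.0.2, Disconnected
   ID: 003, Name: host3, IP: any, Never connected
"""
    for status, count in count_agent_states(sample).items():
        print(f"{status}: {count}")
```

Feeding the resulting counts into a monitoring system makes a drop like the one 
described above visible as soon as it happens.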

So, after spending 2 days trying to figure out why most of my agents were 
disconnected, and reading every piece of information I could find about it, I 
finally found the root cause.
It is related to the rids files under the queue/rids directory. I cleaned them 
up on each side (server and clients), and everything is back to normal.
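
A minimal sketch of that cleanup in Python follows. It is an illustration under 
stated assumptions rather than an official procedure: the function name 
clean_rids is hypothetical, it only removes regular files directly inside the 
directory it is given, and it defaults to a dry run. It is generally safer to 
stop OSSEC on the machine before removing the files and to start it again 
afterwards. The demo at the bottom runs against a throwaway temporary directory 
with placeholder files, not a real OSSEC installation.

```python
import tempfile
from pathlib import Path


def clean_rids(rids_dir: str, dry_run: bool = True) -> list:
    """Remove the regular files directly under a queue/rids directory.

    Returns the names of the files that were removed (or, in a dry run,
    that would be removed). Subdirectories are left untouched.
    """
    removed = []
    for entry in sorted(Path(rids_dir).iterdir()):
        if entry.is_file():
            if not dry_run:
                entry.unlink()
            removed.append(entry.name)
    return removed


if __name__ == "__main__":
    # Demo on a temporary directory with placeholder files, so nothing
    # on a real installation is touched.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("001", "002"):
            (Path(tmp) / name).write_text("placeholder")
        print("would remove:", clean_rids(tmp, dry_run=True))
        print("removed:", clean_rids(tmp, dry_run=False))
        print("left over:", clean_rids(tmp, dry_run=True))
```

On a real system the function would be pointed at the queue/rids directory of 
the OSSEC installation (commonly /var/ossec/queue/rids, but that path is an 
assumption), first with dry_run=True to review the list, then with 
dry_run=False, on the server and on each affected agent.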

I thought I would share the story in case anybody is in the same situation.
I didn't want to move my server, and I didn't want to upgrade it (well, I'm 
already running 2.6).

-StephaneR
