Daniel,

On August 3rd 2010, we observed that all of the Ossec Clients suddenly reconnected, and they have stayed connected since. We did not make any changes to the environment on that day. The Ossec Client-Master disconnect issue appears to have been clear for the last couple of days. We will continue to monitor the environment to see whether the issue recurs.
Just a re-cap ... We started receiving these disconnects again on Wednesday July 7th 2010. The clients disconnected daily and failed to reconnect. This issue was also observed in May on 5/10/2010. That issue lasted for two weeks and then suddenly stopped. Cleaning out the "rids" directory and disabling the counters on both the Ossec Master Server and all Clients (set verify_msg_id to 0 in internal_options.conf) cleared the issue for a couple of days, and then the Client disconnect issue re-emerged on July 15th 2010. As stated above, the issue then cleared again without any intervention on August 3rd 2010.

Below is what I think you are looking for:

[r...@nydcossec01 hourly-average]# pwd
/opt/ossec/stats/hourly-average
[r...@nydcossec01 hourly-average]# ls -ltar
total 108
drwxr-x--- 5 ossec ossec 4096 Apr 17 16:03 ..
drwxr-x--- 2 ossec ossec 4096 Apr 18 00:00 .
-rw-r----- 1 ossec ossec    5 Aug  6 00:00 9
-rw-r----- 1 ossec ossec    5 Aug  6 00:00 8
-rw-r----- 1 ossec ossec    5 Aug  6 00:00 7
-rw-r----- 1 ossec ossec    6 Aug  6 00:00 6
-rw-r----- 1 ossec ossec    5 Aug  6 00:00 5
-rw-r----- 1 ossec ossec    5 Aug  6 00:00 4
-rw-r----- 1 ossec ossec    5 Aug  6 00:00 3
-rw-r----- 1 ossec ossec    3 Aug  6 00:00 24
-rw-r----- 1 ossec ossec    5 Aug  6 00:00 23
-rw-r----- 1 ossec ossec    5 Aug  6 00:00 22
-rw-r----- 1 ossec ossec    5 Aug  6 00:00 21
-rw-r----- 1 ossec ossec    5 Aug  6 00:00 20
-rw-r----- 1 ossec ossec    5 Aug  6 00:00 2
-rw-r----- 1 ossec ossec    5 Aug  6 00:00 19
-rw-r----- 1 ossec ossec    5 Aug  6 00:00 18
-rw-r----- 1 ossec ossec    5 Aug  6 00:00 17
-rw-r----- 1 ossec ossec    5 Aug  6 00:00 16
-rw-r----- 1 ossec ossec    5 Aug  6 00:00 15
-rw-r----- 1 ossec ossec    5 Aug  6 00:00 14
-rw-r----- 1 ossec ossec    5 Aug  6 00:00 13
-rw-r----- 1 ossec ossec    5 Aug  6 00:00 12
-rw-r----- 1 ossec ossec    5 Aug  6 00:00 11
-rw-r----- 1 ossec ossec    5 Aug  6 00:00 10
-rw-r----- 1 ossec ossec    5 Aug  6 00:00 1
-rw-r----- 1 ossec ossec    5 Aug  6 00:00 0
[r...@nydcossec01 hourly-average]# cat 0
30662

Thanks,
Robert
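To put the value above in the "events per second" terms Daniel asked about, here is a minimal Python sketch. It assumes that each numbered file under hourly-average holds the average event count for that hour of the day; this is my reading of the stats directory and is not confirmed anywhere in this thread. The function names are hypothetical helpers, not part of OSSEC.

```python
from pathlib import Path

SECONDS_PER_HOUR = 3600


def events_per_second(hourly_count: int) -> float:
    """Convert an hourly event count into an average per-second rate."""
    return hourly_count / SECONDS_PER_HOUR


def read_hourly_averages(stats_dir) -> dict:
    """Read every numerically named file in stats_dir.

    Assumption: each file contains a single integer, the average number
    of events seen during that hour. Anything else is skipped.
    """
    averages = {}
    for entry in Path(stats_dir).iterdir():
        if entry.is_file() and entry.name.isdigit():
            text = entry.read_text().strip()
            if text.isdigit():
                averages[int(entry.name)] = int(text)
    # Return the hours in ascending order for easier reading.
    return dict(sorted(averages.items()))
```

Under that assumption, the 30662 in file 0 works out to roughly 8.5 events per second for the midnight hour. Calling read_hourly_averages("/opt/ossec/stats/hourly-average") on the manager would give the same figure for the remaining hours.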
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Daniel Cid
Sent: Friday, August 06, 2010 5:49 PM
To: [email protected]
Subject: Re: [ossec-list] All UNIX/LINUX agents disconnecting and failing to reconnect

Hi Robert,

How many events per second is the manager processing right now? Such high CPU utilization is not normal. You can see these values at /var/ossec/stats .

Also, is there anything special in the manager log when the disconnect happens?

Thanks,

--
Daniel B. Cid
dcid ( at ) ossec.net

On Thu, Aug 5, 2010 at 7:49 PM, Griffith, Robert <[email protected]> wrote:
>
> Attached is a portion of the Client Log that shows the exact time when an
> OSSEC client disconnects & fails to reconnect. Also attached is
> snswmsta1.pcap, which contains the network trace of that particular instance
> and the communication between both servers. It shows the conversation
> between the client (Source) and the OSSEC server (Destination).
>
> Thank you,
> Robert
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> On Behalf Of Griffith, Robert
> Sent: Wednesday, July 28, 2010 3:36 PM
> To: '[email protected]'
> Subject: RE: [ossec-list] All UNIX/LINUX agents disconnecting and
> failing to reconnect
>
>
> Also, on our Ossec Master Server, we are observing that the "ossec-analysisd"
> uses ~100% of a single CPU (4 CPUs are available). Could this be causing
> any issues?
>
>  PID USER  PR NI VIRT  RES SHR S %CPU %MEM     TIME+ COMMAND
> 2779 ossec 25  0 8236 2932 704 R 93.2  0.1 119:14.70 ossec-analysisd
>
> Thank you,
> Robert
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> On Behalf Of Griffith, Robert
> Sent: Wednesday, July 28, 2010 5:14 AM
> To: '[email protected]'
> Subject: [ossec-list] All UNIX/LINUX agents disconnecting and failing
> to reconnect
> Importance: High
>
> We continue to observe the Ossec Client disconnects to our Ossec Master
> Server over the network. We started receiving these disconnects again on
> Wednesday July 7th 2010. The clients disconnected daily and failed to
> reconnect for hours (some clients took days or never reconnected again).
> This issue was also observed in May on 5/10/2010. That issue lasted for two
> weeks and then suddenly stopped without any Ossec configuration changes.
>
> We implemented the fix you provided below after we encountered the issue
> again on Wednesday July 7th 2010. We cleaned the "rids" directory and
> disabled the counters on both the Ossec Master Server and all UNIX/Linux
> Clients (set verify_msg_id to 0 on the internal_options.conf). The "fix" you
> provided cleared the issue for 5 days and then the Client disconnect issue
> re-emerged. We re-applied the "fix" without success.
>
> We are again experiencing disconnects and failed re-connects on all
> UNIX/LINUX Ossec agents.
>
> FYI: We are using Ossec Version 2.4. Counters are disabled.
>
> Thank you,
> Robert
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> On Behalf Of Daniel Cid
> Sent: Friday, May 14, 2010 9:43 AM
> To: [email protected]
> Subject: Re: [ossec-list] RE: All UNIX/LINUX agents disconnecting
>
> Hi Lucio,
>
> There are two issues in this thread. One, the agent disconnects and then
> reconnects by itself. That's fine and can happen in a high-load environment
> or when a message gets dropped.
>
> The second issue that Mike mentioned happens when the counters get out of
> sync and the agent never reconnects. For this problem, you have to either
> clean the "rids" directory on the manager or disable the counters. To disable
> them, set verify_msg_id to 0 in the internal_options.conf file:
>
> # Verify msg id (set to 0 to disable it)
> remoted.verify_msg_id=0
>
> Thanks,
>
> --
> Daniel B. Cid
> dcid ( at ) ossec.net
>
>
> On Thu, May 13, 2010 at 1:21 PM, Lucio Emanuel Soldo <[email protected]>
> wrote:
>> Hi Mike, how are you? Could you post the final solution your team has
>> produced in order to fix its problem?
>>
>> Thanks a lot!
>>
>> On Tue, May 11, 2010 at 6:56 PM, Pendergrast, Michael L
>> <[email protected]> wrote:
>>>
>>> Yes we have
>>>
>>> although we have v1.6
>>>
>>> I don't have the details as my team has worked the problem and is
>>> currently deployed.
>>>
>>> What we did find is that there is a counter in the agent and in the
>>> manager and if they get out of sequence the agent will stop
>>> (basically they get out of sequence). We also found that at
>>> startup of the UNIX agents that if multiple agents all start at the
>>> same time, the agents will stop. In this case, for initial startup
>>> we had to sequence the startup in about 10 min increments.
>>>
>>> Mike
>>> ________________________________
>>> From: [email protected]
>>> [mailto:[email protected]] On Behalf Of Griffith, Robert
>>> Sent: Tuesday, May 11, 2010 12:26 PM
>>> To: '[email protected]'
>>> Subject: [ossec-list] All UNIX/LINUX agents disconnecting
>>> Importance: High
>>>
>>> We have been running the new version of Ossec 2.4 in our
>>> environment for 3 weeks. Yesterday all of our UNIX/LINUX client agents
>>> started disconnecting. None of our Windows Server client agents have
>>> disconnected. Has anyone experienced this and/or found a resolution for
>>> this issue?
>>>
>>> Thank you,
>>> Robert
>>>
>>
>
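For anyone applying the counter workaround quoted above to many hosts, the following Python sketch shows one way to set the option in the text of internal_options.conf. Only the option name and value (remoted.verify_msg_id=0) come from the thread. The function name is hypothetical, and the code assumes the file uses plain key=value lines with "#" comments, as in the snippet Daniel posted.

```python
import re

OPTION = "remoted.verify_msg_id"

# Match an active (uncommented) setting of the option on its own line.
_OPTION_LINE = re.compile(r"^[ \t]*remoted\.verify_msg_id[ \t]*=.*$", re.MULTILINE)


def disable_msg_id_check(conf_text: str) -> str:
    """Return conf_text with remoted.verify_msg_id set to 0.

    An existing active setting is rewritten in place; if none is found,
    the setting is appended at the end. Commented lines are left alone.
    """
    new_line = f"{OPTION}=0"
    if _OPTION_LINE.search(conf_text):
        return _OPTION_LINE.sub(new_line, conf_text)
    # Make sure the appended setting starts on its own line.
    separator = "" if conf_text.endswith("\n") or not conf_text else "\n"
    return f"{conf_text}{separator}{new_line}\n"
```

The function only transforms text, so the caller decides where internal_options.conf lives and how to write the result back. As described in the thread, the change was made on both the manager and the agents, and the processes presumably need a restart before the new value takes effect.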
