Re: [ossec-list] Re: 2010 bug?

Rich Rumble Fri, 08 Jan 2010 12:43:46 -0800

These were the messages on the server around midnight, but I know space was not
an issue: we have extensive monitoring on the server via Nagios and Cacti, and
both showed disk usage at around 50%, just as it is now...

2010/01/01 00:00:00 alerts(1107): ERROR: Unable to create directory: '/logs/archives/2010/'
2010/01/01 00:00:00 ossec-remoted: socketerr (not available).
2010/01/01 00:00:00 ossec-remoted(1210): ERROR: Queue '/queue/ossec/queue' not accessible: 'Connection refused'.
2010/01/01 00:00:03 ossec-remoted(1210): ERROR: Queue '/queue/ossec/queue' not accessible: 'Connection refused'.
2010/01/01 00:00:03 ossec-remoted(1211): ERROR: Unable to access queue: '/queue/ossec/queue'. Giving up..
2010/01/01 00:00:03 ossec-logcollector: socketerr (not available).
2010/01/01 00:01:21 ossec-monitord: socketerr (not available).
2010/01/01 00:01:21 ossec-monitord(1224): ERROR: Error sending message to queue.
2010/01/01 00:01:21 ossec-monitord: socketerr (not available).
2010/01/01 00:01:21 ossec-monitord(1224): ERROR: Error sending message to queue.
2010/01/01 00:01:21 ossec-monitord: socketerr (not available).
2010/01/01 00:01:21 ossec-monitord(1224): ERROR: Error sending message to queue.
2010/01/01 00:01:21 ossec-monitord: socketerr (not available).
2010/01/01 00:01:21 ossec-monitord(1224): ERROR: Error sending message to queue.
2010/01/01 00:01:43 ossec-monitord: Compression error: /logs/alerts/2009/Dec/ossec-alerts-31.log.gz: No space left on device
2010/01/01 00:01:43 ossec-monitord: Compression error: /logs/alerts/2009/Dec/ossec-alerts-31.log.gz: No space left on device
2010/01/01 00:04:23 ossec-logcollector: socketerr (not available).
2010/01/01 00:05:50 ossec-monitord: socketerr (not available).
2010/01/01 00:05:50 ossec-monitord(1224): ERROR: Error sending message to queue.
2010/01/01 00:06:33 ossec-logcollector: socketerr (not available).
2010/01/01 00:07:50 ossec-monitord: socketerr (not available).
2010/01/01 00:08:43 ossec-logcollector: socketerr (not available).
2010/01/01 00:10:53 ossec-logcollector: socketerr (not available).
2010/01/01 00:13:03 ossec-logcollector: socketerr (not available).
2010/01/01 01:31:52 ossec-syscheckd: socketerr (not available).
2010/01/01 01:31:52 ossec-syscheckd(1224): ERROR: Error sending message to queue.
2010/01/01 01:31:55 ossec-syscheckd(1210): ERROR: Queue '/var/ossec/queue/ossec/queue' not accessible: 'Connection refused'.
2010/01/01 01:31:55 ossec-syscheckd(1211): ERROR: Unable to access queue: '/var/ossec/queue/ossec/queue'. Giving up..
2010/01/02 00:02:01 ossec-monitord(1103): ERROR: Unable to open file '/logs/archives/2010/Jan/ossec-archive-01.log.sum'.
2010/01/02 00:02:01 ossec-monitord: File '/logs/alerts/2010/Jan/ossec-alerts-01.log' not found. MD5 checksum skipped.
2010/01/02 00:02:01 ossec-monitord: File '/logs/alerts/2010/Jan/ossec-alerts-01.log' not found. SHA1 checksum skipped.
2010/01/02 00:02:01 ossec-monitord(1103): ERROR: Unable to open file '/logs/alerts/2010/Jan/ossec-alerts-01.log.sum'.
2010/01/02 00:02:01 ossec-monitord(1103): ERROR: Unable to open file '/logs/firewall/2010/Jan/ossec-firewall-01.log.sum'.
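
The log above reports "No space left on device" even though block usage was
only around 50%. One thing that may be worth ruling out: the same error is
also returned when a filesystem runs out of inodes, which a usage graph based
on blocks would not show. The sketch below reports both figures for a given
path. The function names are illustrative only, and /var/ossec is assumed as
the install prefix because it appears in the queue path in the log; this is
not a confirmed diagnosis.

```python
import os


def usage_percent(used, total):
    """Return used/total as a percentage, or 0.0 when total is 0."""
    return 100.0 * used / total if total else 0.0


def filesystem_usage(path):
    """Report block and inode usage for the filesystem that holds `path`."""
    st = os.statvfs(path)  # POSIX only
    blocks_used = st.f_blocks - st.f_bfree
    inodes_used = st.f_files - st.f_ffree
    return {
        "blocks_pct": usage_percent(blocks_used, st.f_blocks),
        # Some filesystems report 0 total inodes; that yields 0.0 here.
        "inodes_pct": usage_percent(inodes_used, st.f_files),
    }


if __name__ == "__main__":
    # Assumed install prefix; adjust to the partition that holds the logs.
    path = "/var/ossec"
    if os.path.exists(path):
        usage = filesystem_usage(path)
        print(f"{path}: blocks {usage['blocks_pct']:.1f}% used, "
              f"inodes {usage['inodes_pct']:.1f}% used")
```

If the inode figure is close to 100% while the block figure is not, that would
be consistent with the compression error above; a per-user quota would be
another candidate to check separately.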

I restarted the OSSEC service on the 4th, and from then on all the logs have
been created for 2010/Jan.
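
To double-check that the dated log files exist for a given day, something like
the following sketch could be used. It rebuilds the per-day paths in the
layout seen in the messages above (logs/<kind>/<year>/<Mon>/<name>-<DD>.log)
and lists the ones that are missing. The base directory /var/ossec and the
helper names are assumptions, as is the idea that each .sum file sits next to
a .log file of the same name.

```python
import os
from datetime import date

# Fixed English abbreviations, so the result does not depend on the locale.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

# (directory name, file name stem) pairs taken from the log messages.
LOG_KINDS = (
    ("alerts", "ossec-alerts"),
    ("archives", "ossec-archive"),
    ("firewall", "ossec-firewall"),
)


def expected_log_paths(base, day):
    """Build the per-day log paths for `day` under the assumed `base`."""
    month = MONTHS[day.month - 1]
    return [
        os.path.join(base, "logs", kind, str(day.year), month,
                     f"{stem}-{day.day:02d}.log")
        for kind, stem in LOG_KINDS
    ]


def missing_paths(base, day):
    """Return the expected paths that do not exist on disk."""
    return [p for p in expected_log_paths(base, day) if not os.path.exists(p)]


if __name__ == "__main__":
    # Assumed install prefix and an example date from this thread.
    for path in missing_paths("/var/ossec", date(2010, 1, 1)):
        print("missing:", path)
```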

The odd part is that the emails only report the disconnects, and never the
reconnects, unless I actually restart the agent from the Windows GUI on the
hosts. Hosts newly added as of yesterday seem "immune" and have not been
disconnected so far (10 new hosts running 2.3).

On Fri, Jan 8, 2010 at 1:56 PM, Daniel Cid <[email protected]> wrote:
> Hi Rich,
>
> Strange issue for sure... I have not heard of anyone else having problems
> during the 2010 change and I don't think we have anything in the code that
> would be affected by the year change.
>
> What is showing up on the manager's log? Can you check if it has any
> partition full or anything like that? For all the agents to be affected,
> I would guess an issue on the manager side...
>
> Thanks,
>
> --
> Daniel B. Cid
> dcid ( at ) ossec.net
>
>
>
>
>
> On Thu, Jan 7, 2010 at 11:54 AM, Rich Rumble <[email protected]> wrote:
>> Still ongoing issues here... I'm getting emails every 2 minutes listing
>> all the hosts disconnecting from the server. They seem to be going in
>> order of their number: the last, 160, disconnects first, then 159, etc.,
>> all the way down to 001; then all hosts start to come back in the
>> opposite direction, 001, then 002, etc... The server has been rebooted. I
>> cannot reboot the clients, but their services have been restarted and
>> keys copied over...
>> When they are "connected" after the whole cycle they never email about
>> being connected; their status just shows connected:
>>  /var/ossec/bin/agent_control -lc shows the connected boxes
>> ... Again, the server is running 2.3 code and most clients are 2.2.
>> No changes for the past 6 months other than upgrading to 2.3 on the
>> server a few weeks ago. The clients are showing the issues I stated
>> previously; again, all of their trouble began at the rollover to 2010.
>>
>>> 2010/01/02 01:43:08 ossec-agent: Error waiting mutex (timeout).
>>> 2010/01/02 01:44:53 ossec-agent: Error waiting mutex (timeout).
>>> 2010/01/02 01:46:38 ossec-agent: Error waiting mutex (timeout).
>>> 2010/01/02 01:48:23 ossec-agent: Error waiting mutex (timeout).
>>> 2010/01/02 01:50:08 ossec-agent: Error waiting mutex (timeout).
>>> 2010/01/02 01:50:17 ossec-agent: INFO: Trying to connect to server (10.2.2.6:1514).
>>> and on and on.
>>> I've restarted the server and restarted the agents, to no avail. Some are
>>> now reporting duplicate counter errors, and deleting the rids files is
>>> not fixing them this time around.
>>> The server is 2.3 and most agents are 2.2, Windows only.
>>
>
