Hi Rich,

Looking at your logs, it seems that your permissions are wrong (or there is a
hardware problem), because OSSEC was not able to create its log directories
and log files:

2010/01/01 00:00:00 alerts(1107): ERROR: Unable to create directory: '/logs/archives/2010/'
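
Both possibilities can be checked on the manager with something like the
Python sketch below. It rests on assumptions: the log paths above are taken
to be relative to the OSSEC install directory, assumed here to be /var/ossec
(as in the syscheckd queue path in the quoted logs), and check_log_dir is a
hypothetical helper name, not part of OSSEC. It reports whether the directory
is writable by the current user and how much free space and how many free
inodes the filesystem has, since "No space left on device" can also mean the
filesystem ran out of inodes, which disk-usage graphs may not show.

```python
import os


def check_log_dir(path):
    """Report whether new sub-directories could be created under `path`.

    Hypothetical helper for illustration only; not part of OSSEC.
    """
    report = {"path": path, "exists": os.path.isdir(path)}
    if not report["exists"]:
        return report

    # Creating a sub-directory needs write and search (execute) permission.
    # os.access() checks the real uid/gid of the calling process, so run this
    # as the user the OSSEC daemons run as; root passes these checks anyway.
    report["writable"] = os.access(path, os.W_OK | os.X_OK)

    # Free space and free inodes available to unprivileged users (Unix only).
    st = os.statvfs(path)
    report["free_bytes"] = st.f_bavail * st.f_frsize
    report["free_inodes"] = st.f_favail
    return report


if __name__ == "__main__":
    # Assumed location: install directory /var/ossec plus the path in the log.
    for key, value in check_log_dir("/var/ossec/logs/archives").items():
        print(f"{key}: {value}")
```

If "writable" comes back False for the OSSEC user, that points at permissions;
if "free_bytes" or "free_inodes" is close to zero, that points at the
filesystem instead.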


Because of that, analysisd probably crashed, causing all the other processes
to stop working.

Try re-running install.sh and updating to version 2.3 to see if that
fixes the problem.

Thanks,

--
Daniel B. Cid
dcid ( at ) ossec.net

On Fri, Jan 8, 2010 at 4:11 PM, Rich Rumble <[email protected]> wrote:
> These were the messages on the server around midnight, but I know space was
> not an issue, as we have extensive monitoring on the server via nagios and
> cacti, both showing du was around 50%, just as it is now...
>
> 2010/01/01 00:00:00 alerts(1107): ERROR: Unable to create directory: '/logs/archives/2010/'
> 2010/01/01 00:00:00 ossec-remoted: socketerr (not available).
> 2010/01/01 00:00:00 ossec-remoted(1210): ERROR: Queue '/queue/ossec/queue' not accessible: 'Connection refused'.
> 2010/01/01 00:00:03 ossec-remoted(1210): ERROR: Queue '/queue/ossec/queue' not accessible: 'Connection refused'.
> 2010/01/01 00:00:03 ossec-remoted(1211): ERROR: Unable to access queue: '/queue/ossec/queue'. Giving up..
> 2010/01/01 00:00:03 ossec-logcollector: socketerr (not available).
> 2010/01/01 00:01:21 ossec-monitord: socketerr (not available).
> 2010/01/01 00:01:21 ossec-monitord(1224): ERROR: Error sending message to queue.
> 2010/01/01 00:01:21 ossec-monitord: socketerr (not available).
> 2010/01/01 00:01:21 ossec-monitord(1224): ERROR: Error sending message to queue.
> 2010/01/01 00:01:21 ossec-monitord: socketerr (not available).
> 2010/01/01 00:01:21 ossec-monitord(1224): ERROR: Error sending message to queue.
> 2010/01/01 00:01:21 ossec-monitord: socketerr (not available).
> 2010/01/01 00:01:21 ossec-monitord(1224): ERROR: Error sending message to queue.
> 2010/01/01 00:01:43 ossec-monitord: Compression error: /logs/alerts/2009/Dec/ossec-alerts-31.log.gz: No space left on device
> 2010/01/01 00:01:43 ossec-monitord: Compression error: /logs/alerts/2009/Dec/ossec-alerts-31.log.gz: No space left on device
> 2010/01/01 00:04:23 ossec-logcollector: socketerr (not available).
> 2010/01/01 00:05:50 ossec-monitord: socketerr (not available).
> 2010/01/01 00:05:50 ossec-monitord(1224): ERROR: Error sending message to queue.
> 2010/01/01 00:06:33 ossec-logcollector: socketerr (not available).
> 2010/01/01 00:07:50 ossec-monitord: socketerr (not available).
> 2010/01/01 00:08:43 ossec-logcollector: socketerr (not available).
> 2010/01/01 00:10:53 ossec-logcollector: socketerr (not available).
> 2010/01/01 00:13:03 ossec-logcollector: socketerr (not available).
> 2010/01/01 01:31:52 ossec-syscheckd: socketerr (not available).
> 2010/01/01 01:31:52 ossec-syscheckd(1224): ERROR: Error sending message to queue.
> 2010/01/01 01:31:55 ossec-syscheckd(1210): ERROR: Queue '/var/ossec/queue/ossec/queue' not accessible: 'Connection refused'.
> 2010/01/01 01:31:55 ossec-syscheckd(1211): ERROR: Unable to access queue: '/var/ossec/queue/ossec/queue'. Giving up..
> 2010/01/02 00:02:01 ossec-monitord(1103): ERROR: Unable to open file '/logs/archives/2010/Jan/ossec-archive-01.log.sum'.
> 2010/01/02 00:02:01 ossec-monitord: File '/logs/alerts/2010/Jan/ossec-alerts-01.log' not found. MD5 checksum skipped.
> 2010/01/02 00:02:01 ossec-monitord: File '/logs/alerts/2010/Jan/ossec-alerts-01.log' not found. SHA1 checksum skipped.
> 2010/01/02 00:02:01 ossec-monitord(1103): ERROR: Unable to open file '/logs/alerts/2010/Jan/ossec-alerts-01.log.sum'.
> 2010/01/02 00:02:01 ossec-monitord(1103): ERROR: Unable to open file '/logs/firewall/2010/Jan/ossec-firewall-01.log.sum'.
>
> I restarted the ossec service on the 4th, and from then on, all the
> logs have been created for 2010/Jan.
>
> The odd part is that the emails only report the disconnects, never the
> reconnects, unless I actually restart the agent from the Windows GUI on the
> hosts. Hosts newly added as of yesterday are "immune" and have not been
> disconnected so far (10 new hosts running 2.3).
>
> On Fri, Jan 8, 2010 at 1:56 PM, Daniel Cid <[email protected]> wrote:
>> Hi Rich,
>>
>> Strange issue for sure... I have not heard of anyone else having problems
>> during the 2010 change, and I don't think we have anything in the code that
>> would be affected by the year change.
>>
>> What is showing up in the manager's log? Can you check whether any
>> partition is full, or anything like that? For all the agents to be
>> affected, I would guess there is an issue on the manager side...
>>
>> Thanks,
>>
>> --
>> Daniel B. Cid
>> dcid ( at ) ossec.net
>>
>>
>>
>>
>>
>> On Thu, Jan 7, 2010 at 11:54 AM, Rich Rumble <[email protected]> wrote:
>>> Still ongoing issues here... I'm getting emails every 2 minutes listing
>>> all the hosts disconnecting from the server. They seem to be going in
>>> order of their number: the last one, 160, disconnects first, then 159,
>>> etc., all the way down to 001; then all hosts start to come back in the
>>> opposite direction, 001, then 002, etc. The server has been rebooted. I
>>> cannot reboot the clients, but their services have been restarted, keys
>>> copied over...
>>> When they are "connected" after the whole cycle, they never email about
>>> being connected; their status just shows connected:
>>>  /var/ossec/bin/agent_control -lc shows the connected boxes
>>> ... Again, the server is running 2.3 code and most clients are 2.2.
>>> No changes for the past 6 months other than upgrading to 2.3 on the
>>> server a few weeks ago. The clients are showing the issues I stated
>>> previously; again, all of their trouble began at the rollover to 2010.
>>>
>>>> 2010/01/02 01:43:08 ossec-agent: Error waiting mutex (timeout).
>>>> 2010/01/02 01:44:53 ossec-agent: Error waiting mutex (timeout).
>>>> 2010/01/02 01:46:38 ossec-agent: Error waiting mutex (timeout).
>>>> 2010/01/02 01:48:23 ossec-agent: Error waiting mutex (timeout).
>>>> 2010/01/02 01:50:08 ossec-agent: Error waiting mutex (timeout).
>>>> 2010/01/02 01:50:17 ossec-agent: INFO: Trying to connect to server (10.2.2.6:1514).
>>>> and on and on.
>>>> I've restarted the server and restarted the agents, to no avail. Some are
>>>> now reporting duplicate counter errors, and deleting the rids files is not
>>>> fixing them this time around.
>>>> The server is 2.3 and most agents are 2.2, Windows only.
>>>
>>
>
