Hi John,

For the first issue (using wildcards), this will work if you update to
the latest snapshot:

http://www.ossec.net/files/snapshots/ossec-hids-090330.tar.gz

For the second issue, judging by the strace output you sent and the
logs, it is being caused by rootcheck (which does the rootkit
detection) and not by syscheck. However, rootcheck is called from
inside syscheck, which is why you are seeing the ossec-syscheckd
process going crazy.

If you want to disable rootcheck, just set <disabled> to yes under the
rootcheck configuration and this problem should go away. Also, judging
by the strace, the CPU was going very high during the process-checking
phase, where it loops through all available pids and compares the
output of getpid, getpgid, getsid, proc and ps, looking for anomalies.
So the process was not dead or hung.
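
As a rough sketch, the rootcheck change would look something like the
fragment below. I am assuming the default install location
(/var/ossec/etc/ossec.conf) and that you keep whatever other rootcheck
options you already have in that section; only the <disabled> line is
the actual change.

```xml
<ossec_config>
  <!-- Assumed fragment of /var/ossec/etc/ossec.conf (default install path) -->
  <rootcheck>
    <!-- Turns off the rootkit detection scan; syscheck itself keeps running -->
    <disabled>yes</disabled>
  </rootcheck>
</ossec_config>
```

The agent has to be restarted for the change to take effect (on a
default install that should be /var/ossec/bin/ossec-control restart).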


Thanks,

--
Daniel B. Cid
dcid ( at ) ossec.net



On Mon, Mar 30, 2009 at 12:01 PM, John A. Sullivan III
<[email protected]> wrote:
> On Mon, 2009-03-30 at 08:05 -0400, John A. Sullivan III wrote:
>> On Mon, 2009-03-30 at 07:10 -0400, John A. Sullivan III wrote:
>> > On Mon, 2009-03-30 at 07:04 -0400, John A. Sullivan III wrote:
>> > > On Mon, 2009-03-30 at 06:58 -0400, John A. Sullivan III wrote:
>> > > > On Tue, 2009-03-24 at 11:49 -0400, John A. Sullivan III wrote:
>> > > > > Here it is.  There is another problem.  My apologies for wondering 
>> > > > > why
>> > > > > the list was so slow to respond.  I am not receiving any email from 
>> > > > > the
>> > > > > list including Nerijus' response below. I only received your direct
>> > > > > responses, Daniel.  Does one need a gmail account to use 
>> > > > > googlegroups?
>> > > > >
>> > > > > In any event, here is the bzip2 file.  Thanks - John
>> > > > >
>> > > > > On Tue, 2009-03-24 at 11:44 -0300, Daniel Cid wrote:
>> > > > > > Yes, try zipping it and sending to the list (or directly to my 
>> > > > > > email
>> > > > > > if you think it may contain confidential
>> > > > > > information). It will certainly help us debug this issue.
>> > > > > >
>> > > > > > Thanks,
>> > > > > >
>> > > > > > --
>> > > > > > Daniel B. Cid
>> > > > > > dcid ( at ) ossec.net
>> > > > > >
>> > > > > > On Fri, Mar 20, 2009 at 3:13 AM, Nerijus Krukauskas
>> > > > > > <[email protected]> wrote:
>> > > > > > >
>> > > > > > > On 19/03/2009, John A. Sullivan III 
>> > > > > > > <[email protected]> wrote:
>> > > > > > >>
>> > > > > > >> Thanks, Daniel.  I have the trace but it is a 40 MB file.  How 
>> > > > > > >> shall I
>> > > > > > >> send it to you? - John
>> > > > > > >
>> > > > > > >  I believe that if you try to zip it, it's gonna be something 
>> > > > > > > around 4 MB... :)
>> > > > > > >
>> > > > > > > --
>> > > > > > > http://nk99.org/
>> > > > > > >
>> > > > Hello, all.  I do have some more information on this serious bug.  It
>> > > > has now bitten us on two out of two vservers.
>> > > >
>> > > > We first thought it might have to do with our use of wildcards in the
>> > > > localfile definitions, e.g.,
>> > > >   <localfile>
>> > > >     <log_format>syslog</log_format>
>> > > >     <location>/vservers/[a-zA-Z0-9]*/var/log/maillog</location>
>> > > >   </localfile>
>> > > > So we pulled them all out.  We still had the same problem.  However, it
>> > > > did seem to be coincidental with not being able to find specified 
>> > > > files.
>> > > > We had mistyped some file names and paths and saw this in the error 
>> > > > logs
>> > > > before the service spun out of control:
>> > > >
>> > > > 2009/03/30 04:57:14 ossec-syscheckd: INFO: Starting syscheck scan (db).
>> > > > 2009/03/30 04:58:41 ossec-logcollector(1103): ERROR: Unable to open 
>> > > > file '/vservers/w01/var/log/httpd/ssipki.error_log'.
>> > > > 2009/03/30 04:58:41 ossec-logcollector(1103): ERROR: Unable to open 
>> > > > file '/vservers/w01/var/log/httpd/ssipki.access_log'.
>> > > > 2009/03/30 04:58:41 ossec-logcollector(1103): ERROR: Unable to open 
>> > > > file '/var/log/dirsrv/admin-serv/error'.
>> > > > 2009/03/30 04:58:41 ossec-logcollector(1103): ERROR: Unable to open 
>> > > > file '/var/log/dirsrv/admin-serv/access'.
>> > > > 2009/03/30 04:58:41 ossec-logcollector(1103): ERROR: Unable to open 
>> > > > file '/var/log/dirsrv/slapd-ldap01/errors'.
>> > > > 2009/03/30 05:00:51 ossec-logcollector(1103): ERROR: Unable to open 
>> > > > file '/vservers/w01/var/log/httpd/ssipki.error_log'.
>> > > > 2009/03/30 05:00:51 ossec-logcollector(1103): ERROR: Unable to open 
>> > > > file '/vservers/w01/var/log/httpd/ssipki.access_log'.
>> > > > 2009/03/30 05:00:51 ossec-logcollector(1103): ERROR: Unable to open 
>> > > > file '/var/log/dirsrv/admin-serv/error'.
>> > > > 2009/03/30 05:00:51 ossec-logcollector(1103): ERROR: Unable to open 
>> > > > file '/var/log/dirsrv/admin-serv/access'.
>> > > > 2009/03/30 05:00:51 ossec-logcollector(1103): ERROR: Unable to open 
>> > > > file '/var/log/dirsrv/slapd-ldap01/errors'.
>> > > > 2009/03/30 05:03:01 ossec-logcollector(1103): ERROR: Unable to open 
>> > > > file '/vservers/w01/var/log/httpd/ssipki.error_log'.
>> > > > 2009/03/30 05:03:01 ossec-logcollector(1103): ERROR: Unable to open 
>> > > > file '/vservers/w01/var/log/httpd/ssipki.access_log'.
>> > > > 2009/03/30 05:03:01 ossec-logcollector(1103): ERROR: Unable to open 
>> > > > file '/var/log/dirsrv/admin-serv/error'.
>> > > > 2009/03/30 05:03:01 ossec-logcollector(1103): ERROR: Unable to open 
>> > > > file '/var/log/dirsrv/admin-serv/access'.
>> > > > 2009/03/30 05:03:01 ossec-logcollector(1103): ERROR: Unable to open 
>> > > > file '/var/log/dirsrv/slapd-ldap01/errors'.
>> > > > 2009/03/30 05:05:11 ossec-logcollector(1103): ERROR: Unable to open 
>> > > > file '/vservers/w01/var/log/httpd/ssipki.error_log'.
>> > > > 2009/03/30 05:05:11 ossec-logcollector(1103): ERROR: Unable to open 
>> > > > file '/vservers/w01/var/log/httpd/ssipki.access_log'.
>> > > > 2009/03/30 05:05:11 ossec-logcollector(1103): ERROR: Unable to open 
>> > > > file '/var/log/dirsrv/admin-serv/error'.
>> > > > 2009/03/30 05:05:11 ossec-logcollector(1103): ERROR: Unable to open 
>> > > > file '/var/log/dirsrv/admin-serv/access'.
>> > > > 2009/03/30 05:05:11 ossec-logcollector(1103): ERROR: Unable to open 
>> > > > file '/var/log/dirsrv/slapd-ldap01/errors'.
>> > > > 2009/03/30 05:07:21 ossec-logcollector(1103): ERROR: Unable to open 
>> > > > file '/vservers/w01/var/log/httpd/ssipki.error_log'.
>> > > > 2009/03/30 05:07:21 ossec-logcollector(1103): ERROR: Unable to open 
>> > > > file '/vservers/w01/var/log/httpd/ssipki.access_log'.
>> > > > 2009/03/30 05:07:21 ossec-logcollector(1103): ERROR: Unable to open 
>> > > > file '/var/log/dirsrv/admin-serv/error'.
>> > > > 2009/03/30 05:07:21 ossec-logcollector(1103): ERROR: Unable to open 
>> > > > file '/var/log/dirsrv/admin-serv/access'.
>> > > > 2009/03/30 05:07:21 ossec-logcollector(1103): ERROR: Unable to open 
>> > > > file '/var/log/dirsrv/slapd-ldap01/errors'.
>> > > > 2009/03/30 05:09:32 ossec-logcollector(1904): INFO: File not 
>> > > > available, ignoring it: '/vservers/w01/var/log/httpd/ssipki.error_log'.
>> > > > 2009/03/30 05:09:32 ossec-logcollector(1904): INFO: File not 
>> > > > available, ignoring it: 
>> > > > '/vservers/w01/var/log/httpd/ssipki.access_log'.
>> > > > 2009/03/30 05:09:32 ossec-logcollector(1904): INFO: File not 
>> > > > available, ignoring it: '/var/log/dirsrv/admin-serv/error'.
>> > > > 2009/03/30 05:09:32 ossec-logcollector(1904): INFO: File not 
>> > > > available, ignoring it: '/var/log/dirsrv/admin-serv/access'.
>> > > > 2009/03/30 05:09:32 ossec-logcollector(1904): INFO: File not 
>> > > > available, ignoring it: '/var/log/dirsrv/slapd-ldap01/errors'.
>> > > > 2009/03/30 05:16:10 ossec-syscheckd: INFO: Ending syscheck scan (db).
>> > > >
>> > > > On our second vserver, we did try wildcards in the directories
>> > > > definitions.  That gave us the following before spinning out of 
>> > > > control:
>> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: 
>> > > > '/user/local/sbin': No such file or directory
>> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: 
>> > > > '/vservers/*/etc': No such file or directory
>> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: 
>> > > > '/vservers/*/usr/bin': No such file or directory
>> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: 
>> > > > '/vservers/*/usr/sbin': No such file or directory
>> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: 
>> > > > '/vservers/*/bin': No such file or directory
>> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: 
>> > > > '/vservers/*/sbin': No such file or directory
>> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: 
>> > > > '/vservers/*/usr/local/bin': No such file or directory
>> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: 
>> > > > '/vservers/*/user/local/sbin': No such file or directory
>> > > > 2009/03/30 05:49:22 ossec-syscheckd: Error opening directory: 
>> > > > '/vservers/*/usr/local/etc': No such file or directory
>> > > > 2009/03/30 05:51:22 ossec-syscheckd: INFO: Starting syscheck scan (db).
>> > > >
>> > > > Having corrected the paths in the first vserver and taken out the wild
>> > > > cards, it seems to be behaving itself.  However, not being able to use
>> > > > wild cards or regex's in the directories and localfiles definitions is
>> > > > certainly inconvenient when we anticipate hundreds of virtual machines
>> > > > on some of these systems.
>> > > >
>> > > > That still leaves us with the base problem.  It appears that if ossec
>> > > > syscheckd encounters enough missing files, it does spin out of control
>> > > > and requires a power cycle of the system to recover.  Thanks - John
>> > > >
>> > > > PS - I'm still not receiving any emails from the mail list.
>> > > >
>> > > Oops! I spoke too soon.  The first vserver just went out of control but
>> > > again, it is about missing files.  We had defined some directories we
>> > > knew didn't have any files just in case they were populated in the
>> > > future.  We would hope we could do that to prevent human error.  Here is
>> > > what the logs showed before CPU usage spiked to 100%:
>> > >
>> > > 2009/03/30 06:22:20 ossec-syscheckd: Error opening directory: 
>> > > '/user/local/sbin': No such file or directory
>> > > 2009/03/30 06:23:07 ossec-syscheckd: Error opening directory: 
>> > > '/vservers/ns02/user/local/sbin': No such file or directory
>> > > 2009/03/30 06:23:57 ossec-syscheckd: Error opening directory: 
>> > > '/vservers/w01/user/local/sbin': No such file or directory
>> > > 2009/03/30 06:25:18 ossec-syscheckd: Error opening directory: 
>> > > '/vservers/pg01/user/local/sbin': No such file or directory
>> > > 2009/03/30 06:26:43 ossec-syscheckd: Error opening directory: 
>> > > '/vservers/ld01/user/local/sbin': No such file or directory
>> > > 2009/03/30 06:28:43 ossec-syscheckd: INFO: Starting syscheck scan (db).
>> > >
>> > >
>> > Talk about embarrassment - I just noticed the typo - however, it again
>> > emphasizes the point that ossec gets very unhappy if it can't find
>> > something that has been defined in ossec.conf - John
>>
>> Bad news! The first vserver spun out of control again.  This is with all
>> typos corrected and no wild cards.  Here is the log since the last
>> reboot:
>>
>> 2009/03/30 07:09:44 ossec-execd: INFO: Started (pid: 5743).
>> 2009/03/30 07:09:44 ossec-agentd(1410): INFO: Reading authentication keys 
>> file.
>> 2009/03/30 07:09:44 ossec-agentd: INFO: No previous counter available for 
>> 'vs01'.
>> 2009/03/30 07:09:44 ossec-agentd: INFO: Assigning counter for agent 
>> vserver01: '0:0'.
>> 2009/03/30 07:09:44 ossec-agentd: INFO: Assigning sender counter: 6:4637
>> 2009/03/30 07:09:44 ossec-agentd: INFO: Started (pid: 5747).
>> 2009/03/30 07:09:44 ossec-agentd: INFO: Server IP Address: 172.x.x.30
>> 2009/03/30 07:09:44 ossec-agentd: INFO: Trying to connect to server 
>> (172.x.x.30:1514).
>> 2009/03/30 07:09:48 ossec-syscheckd: INFO: Started (pid: 5755).
>> 2009/03/30 07:09:48 ossec-rootcheck: INFO: Started (pid: 5755).
>> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: 
>> '/var/log/messages'.
>> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: 
>> '/var/log/secure'.
>> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: 
>> '/var/log/maillog'.
>> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: 
>> '/var/log/cron'.
>> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: 
>> '/vservers/w01/var/log/httpd/ssipkipub.error_log'.
>> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: 
>> '/vservers/w01/var/log/httpd/ssipkipub.access_log'.
>> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: 
>> '/vservers/w01/var/log/httpd/error_log'.
>> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: 
>> '/vservers/w01/var/log/httpd/access_log'.
>> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: 
>> '/vservers/w01/var/log/httpd/ssl_error_log'.
>> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: 
>> '/vservers/w01/var/log/httpd/ssl_access_log'.
>> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: 
>> '/vservers/ld01/var/log/dirsrv/admin-serv/error'.
>> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: 
>> '/vservers/ld01/var/log/dirsrv/admin-serv/access'.
>> 2009/03/30 07:09:50 ossec-logcollector(1950): INFO: Analyzing file: 
>> '/vservers/ld01/var/log/dirsrv/slapd-ldap01/errors'.
>> 2009/03/30 07:09:50 ossec-logcollector: INFO: Started (pid: 5751).
>> 2009/03/30 07:09:59 ossec-agentd(4102): INFO: Connected to the server 
>> (172.x.x.30:1514).
>> 2009/03/30 07:19:03 ossec-syscheckd: INFO: Starting syscheck scan (db).
>> 2009/03/30 07:38:01 ossec-syscheckd: INFO: Ending syscheck scan (db).
>> 2009/03/30 07:38:21 ossec-rootcheck: INFO: Starting rootcheck scan.
>>
>> I suppose this implies it is not about not finding files but something
>> specific to searching these vserver directories. They should appear as
>> normal file systems.  I will next try it without any vserver directories
>> - John
> <snip>
> Argh! Even worse news.  It still hangs - not a single mention of vserver
> directories.  As far as I can tell, this should be just like a regular
> server - we are only scanning the host.  No clues in the log files other
> than it didn't take long to lock.  Here's the log from restart:
>
> 2009/03/30 08:11:31 ossec-execd: INFO: Started (pid: 4373).
> 2009/03/30 08:11:31 ossec-agentd(1410): INFO: Reading authentication keys 
> file.
> 2009/03/30 08:11:31 ossec-agentd: INFO: No previous counter available for 
> 'vserver01'.
> 2009/03/30 08:11:31 ossec-agentd: INFO: Assigning counter for agent 
> vserver01: '0:0'.
> 2009/03/30 08:11:31 ossec-agentd: INFO: Assigning sender counter: 7:3613
> 2009/03/30 08:11:31 ossec-agentd: INFO: Started (pid: 4377).
> 2009/03/30 08:11:31 ossec-agentd: INFO: Server IP Address: 172.30.10.30
> 2009/03/30 08:11:31 ossec-agentd: INFO: Trying to connect to server 
> (172.30.10.30:1514).
> 2009/03/30 08:11:36 ossec-syscheckd: INFO: Started (pid: 4385).
> 2009/03/30 08:11:36 ossec-rootcheck: INFO: Started (pid: 4385).
> 2009/03/30 08:11:37 ossec-logcollector(1950): INFO: Analyzing file: 
> '/var/log/messages'.
> 2009/03/30 08:11:37 ossec-logcollector(1950): INFO: Analyzing file: 
> '/var/log/secure'.
> 2009/03/30 08:11:37 ossec-logcollector(1950): INFO: Analyzing file: 
> '/var/log/maillog'.
> 2009/03/30 08:11:37 ossec-logcollector(1950): INFO: Analyzing file: 
> '/var/log/cron'.
> 2009/03/30 08:11:37 ossec-logcollector: INFO: Started (pid: 4381).
> 2009/03/30 08:11:46 ossec-agentd(4102): INFO: Connected to the server 
> (172.30.10.30:1514).
> 2009/03/30 08:11:57 ossec-logcollector(1225): INFO: SIGNAL Received. Exit 
> Cleaning...
> 2009/03/30 08:11:57 ossec-syscheckd(1225): INFO: SIGNAL Received. Exit 
> Cleaning...
> 2009/03/30 08:11:57 ossec-agentd(1225): INFO: SIGNAL Received. Exit 
> Cleaning...
> 2009/03/30 08:11:57 ossec-execd(1314): INFO: Shutdown received. Deleting 
> responses.
> 2009/03/30 08:11:57 ossec-execd(1225): INFO: SIGNAL Received. Exit Cleaning...
> 2009/03/30 08:12:50 ossec-execd: INFO: Started (pid: 5438).
> 2009/03/30 08:12:50 ossec-agentd(1410): INFO: Reading authentication keys 
> file.
> 2009/03/30 08:12:50 ossec-agentd: INFO: No previous counter available for 
> 'vserver01'.
> 2009/03/30 08:12:50 ossec-agentd: INFO: Assigning counter for agent 
> vserver01: '0:0'.
> 2009/03/30 08:12:50 ossec-agentd: INFO: Assigning sender counter: 7:3623
> 2009/03/30 08:12:50 ossec-agentd: INFO: Started (pid: 5442).
> 2009/03/30 08:12:50 ossec-agentd: INFO: Server IP Address: 172.30.10.30
> 2009/03/30 08:12:50 ossec-agentd: INFO: Trying to connect to server 
> (172.30.10.30:1514).
> 2009/03/30 08:12:51 ossec-agentd(4102): INFO: Connected to the server 
> (172.30.10.30:1514).
> 2009/03/30 08:12:54 ossec-syscheckd: INFO: Started (pid: 5450).
> 2009/03/30 08:12:54 ossec-rootcheck: INFO: Started (pid: 5450).
> 2009/03/30 08:12:56 ossec-logcollector(1950): INFO: Analyzing file: 
> '/var/log/messages'.
> 2009/03/30 08:12:56 ossec-logcollector(1950): INFO: Analyzing file: 
> '/var/log/secure'.
> 2009/03/30 08:12:56 ossec-logcollector(1950): INFO: Analyzing file: 
> '/var/log/maillog'.
> 2009/03/30 08:12:56 ossec-logcollector(1950): INFO: Analyzing file: 
> '/var/log/cron'.
> 2009/03/30 08:12:56 ossec-logcollector: INFO: Started (pid: 5446).
> 2009/03/30 08:17:46 ossec-syscheckd: INFO: Starting syscheck scan (db).
> 2009/03/30 08:24:22 ossec-syscheckd: INFO: Ending syscheck scan (db).
> 2009/03/30 08:24:42 ossec-rootcheck: INFO: Starting rootcheck scan.
>
> Where do I look now to solve this problem? Thanks - John
> --
> John A. Sullivan III
> Open Source Development Corporation
> +1 207-985-7880
> [email protected]
>
> http://www.spiritualoutreach.com
> Making Christianity intelligible to secular society
>
>
