Hi John,

It looks to me like syscheck got stuck while trying to read a file
(probably in the middle of a system call, since you can't kill it).
In a normal environment, we check that a file is regular (and not a
socket, device, etc.) before we read it. In a virtual environment,
however, that check may be failing.
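
As a rough illustration of the regular-file check mentioned above, here is a minimal shell sketch. It is not the syscheck source code: the function name is_regular_file is made up for this example, and it relies on the shell's own file tests rather than on whatever syscheck calls internally.

```shell
#!/bin/sh
# Hypothetical helper, not taken from OSSEC: succeed only when the path
# is a regular file and not a symlink, socket, FIFO, directory, or device node.
is_regular_file() {
    # -L is true for symlinks; reject them first because -f follows links.
    [ ! -L "$1" ] && [ -f "$1" ]
}

# Example use: only read files that pass the check.
# if is_regular_file /etc/passwd; then md5sum /etc/passwd; fi
```

To get a quick overview of anything under a monitored directory that is not a regular file, directory, or symlink, something like `find /etc ! -type f ! -type d ! -type l` should work.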

Can you try the following? Before syscheckd starts to run out of
control, attach strace to it to see what it is doing. For example:

r...@ourhome:/root# ps auwx |grep syscheckd
root     26897  1.4  0.0   2028   524 ?        R    17:05   0:02 /var/ossec/bin/ossec-syscheckd

r...@ourhome:/root# strace -F -T -p 26897

After a while, the trace should show us where it gets stuck.

You may also want to redirect the output to a file:

strace -F -T -p pid > /tmp/log 2>&1
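
The steps above can be combined into one small helper. This is only a sketch: the function name trace_syscheckd is invented here, it assumes pgrep is available and that you run it as root, and it uses -f where the commands above use -F (on recent strace versions -F is, as far as I know, just a deprecated spelling of -f).

```shell
#!/bin/sh
# Hypothetical helper: attach strace to the running ossec-syscheckd and
# write the trace to a log file (default /tmp/log). Needs root to attach.
trace_syscheckd() {
    log="${1:-/tmp/log}"
    # -o picks the oldest matching process, -x requires an exact name match.
    pid=$(pgrep -o -x ossec-syscheckd) || {
        echo "ossec-syscheckd is not running" >&2
        return 1
    }
    echo "attaching strace to pid $pid, logging to $log" >&2
    # -f follows child processes, -T prints the time spent in each system call.
    strace -f -T -p "$pid" > "$log" 2>&1
}

# Example use (blocks until strace is interrupted or the process exits):
# trace_syscheckd /tmp/log
```

The last lines of the log file are the interesting part, since they show the final system call made before the process stopped responding.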

Let us know the results and we will try to track down the issue.

Thanks,

--
Daniel B. Cid
dcid ( at ) ossec.net

On Wed, Mar 18, 2009 at 1:50 PM, John A. Sullivan III
<[email protected]> wrote:
>
> Hello, all.  We are suddenly having a bit of a nightmare with our
> otherwise usually delightful OSSEC.  We've installed it on a dual quad
> core AMD server with 32GB of RAM running CentOS 5.2 but with kernel
> 2.6.28.7 (the CentOS kernel panics with open-iscsi) and VServer
> 2.3.0.36.7.
>
> After a while, a syscheckd process spins completely out of control
> consuming 100% of one processor.  It refuses to die.  kill does not
> work, kill -9 does not work, service ossec stop does not work.  Only
> rebooting seems to work.  The console is flooded with:
> BUG: soft lockup - CPU#3 stuck for 61s! [ossec-syscheckd:4625]
>
> The VServer host (the source of the runaway process) is an OSSEC agent.
> Originally, the OSSEC server was running as one of its guests but we
> thought that was the problem.  We moved the OSSEC server to another
> piece of hardware yet the problem has persisted.
>
> We are using OSSEC http://www.ossec.net/files/ossec-hids-2.0.tar.gz
> downloaded today.  Checksum matched.
>
> Here is the log since the last start.  Notice that it thinks syscheckd
> has stopped:
> 2009/03/18 11:55:43 ossec-execd: INFO: Started (pid: 4613).
> 2009/03/18 11:55:43 ossec-agentd(1410): INFO: Reading authentication keys file.
> 2009/03/18 11:55:43 ossec-agentd: INFO: No previous counter available for 'vserver'.
> 2009/03/18 11:55:43 ossec-agentd: INFO: Assigning counter for agent vserver01: '0:0'.
> 2009/03/18 11:55:43 ossec-agentd: INFO: Assigning sender counter: 3:3930
> 2009/03/18 11:55:43 ossec-agentd: INFO: Started (pid: 4617).
> 2009/03/18 11:55:43 ossec-agentd: INFO: Server IP Address: 172.x.x.30
> 2009/03/18 11:55:43 ossec-agentd: INFO: Trying to connect to server (172.x.x.30:1514).
> 2009/03/18 11:55:44 ossec-agentd(4102): INFO: Connected to the server (172.x.x.30:1514).
> 2009/03/18 11:55:47 ossec-syscheckd: INFO: Started (pid: 4625).
> 2009/03/18 11:55:47 ossec-rootcheck: INFO: Started (pid: 4625).
> 2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/messages'.
> 2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/secure'.
> 2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/maillog'.
> 2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/cron'.
> 2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/basevs/var/log/messages'.
> 2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/h01/var/log/messages'.
> 2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/ns02/var/log/messages'.
> 2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/basevs/var/log/secure'.
> 2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/h01/var/log/secure'.
> 2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/ns02/var/log/secure'.
> 2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/basevs/var/log/maillog'.
> 2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/h01/var/log/maillog'.
> 2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/ns02/var/log/maillog'.
> 2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/basevs/var/log/cron'.
> 2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/h01/var/log/cron'.
> 2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/ns02/var/log/cron'.
> 2009/03/18 11:55:49 ossec-logcollector: INFO: Started (pid: 4621).
> 2009/03/18 11:58:28 ossec-syscheckd: Error opening directory: '/user/local/sbin': No such file or directory
> 2009/03/18 11:59:13 ossec-syscheckd: Error opening directory: '/vservers/ns02/user/local/sbin': No such file or directory
> 2009/03/18 12:01:13 ossec-syscheckd: INFO: Starting syscheck scan (db).
> 2009/03/18 12:09:53 ossec-syscheckd: INFO: Ending syscheck scan (db).
> 2009/03/18 12:10:13 ossec-rootcheck: INFO: Starting rootcheck scan.
>
>
> Here is ossec.conf on the VServer host:
> <ossec_config>
>  <client>
>    <server-ip>172.30.10.30</server-ip>
>  </client>
>
>  <syscheck>
>    <!-- Frequency that syscheck is executed - default to every 6 hours -->
>    <frequency>21600</frequency>
>    <alert_new_files>yes</alert_new_files>
>
>    <!-- Directories to check  (perform all possible verifications) -->
>    <directories check_all="yes">/etc,/usr/bin,/usr/sbin</directories>
>    <directories check_all="yes">/bin,/sbin,/usr/local/bin,/user/local/sbin,/usr/local/etc</directories>
>    <directories check_all="yes">/vservers/ns02/etc,/vservers/ns02/usr/bin,/vservers/ns02/usr/sbin</directories>
>    <directories check_all="yes">/vservers/ns02/bin,/vservers/ns02/sbin,/vservers/ns02/usr/local/bin,/vservers/ns02/user/local/sbin,/vservers/ns02/usr/local/etc</directories>
>
>    <!-- Files/directories to ignore -->
>    <ignore>/etc/mtab</ignore>
>    <ignore>/etc/mnttab</ignore>
>    <ignore>/etc/hosts.deny</ignore>
>    <ignore>/etc/mail/statistics</ignore>
>    <ignore>/etc/random-seed</ignore>
>    <ignore>/etc/adjtime</ignore>
>    <ignore>/etc/httpd/logs</ignore>
>    <ignore>/etc/utmpx</ignore>
>    <ignore>/etc/wtmpx</ignore>
>    <ignore>/etc/cups/certs</ignore>
>    <ignore>/etc/dumpdates</ignore>
>    <ignore>/etc/svc/volatile</ignore>
>
>    <!-- Windows files to ignore -->
>    <ignore>C:\WINDOWS/System32/LogFiles</ignore>
>    <ignore>C:\WINDOWS/Debug</ignore>
>    <ignore>C:\WINDOWS/WindowsUpdate.log</ignore>
>    <ignore>C:\WINDOWS/iis6.log</ignore>
>    <ignore>C:\WINDOWS/system32/wbem/Logs</ignore>
>    <ignore>C:\WINDOWS/system32/wbem/Repository</ignore>
>    <ignore>C:\WINDOWS/Prefetch</ignore>
>    <ignore>C:\WINDOWS/PCHEALTH/HELPCTR/DataColl</ignore>
>    <ignore>C:\WINDOWS/SoftwareDistribution</ignore>
>    <ignore>C:\WINDOWS/Temp</ignore>
>    <ignore>C:\WINDOWS/system32/config</ignore>
>    <ignore>C:\WINDOWS/system32/spool</ignore>
>    <ignore>C:\WINDOWS/system32/CatRoot</ignore>
>  </syscheck>
>
>  <rootcheck>
>    <rootkit_files>/usr/local/ossec/etc/shared/rootkit_files.txt</rootkit_files>
>    <rootkit_trojans>/usr/local/ossec/etc/shared/rootkit_trojans.txt</rootkit_trojans>
>    <system_audit>/usr/local/ossec/etc/shared/system_audit_rcl.txt</system_audit>
>    <system_audit>/usr/local/ossec/etc/shared/cis_debian_linux_rcl.txt</system_audit>
>    <system_audit>/usr/local/ossec/etc/shared/cis_rhel_linux_rcl.txt</system_audit>
>    <system_audit>/usr/local/ossec/etc/shared/cis_rhel5_linux_rcl.txt</system_audit>
>  </rootcheck>
>  <!-- Files to monitor (localfiles) -->
>
>  <localfile>
>    <log_format>syslog</log_format>
>    <location>/var/log/messages</location>
>  </localfile>
>
>  <localfile>
>    <log_format>syslog</log_format>
>    <location>/var/log/secure</location>
>  </localfile>
>
>  <localfile>
>    <log_format>syslog</log_format>
>    <location>/var/log/maillog</location>
>  </localfile>
>
>  <localfile>
>    <log_format>syslog</log_format>
>    <location>/var/log/cron</location>
>  </localfile>
>
>  <localfile>
>    <log_format>syslog</log_format>
>    <location>/vservers/[a-zA-Z0-9]*/var/log/messages</location>
>  </localfile>
>
>  <localfile>
>    <log_format>syslog</log_format>
>    <location>/vservers/[a-zA-Z0-9]*/var/log/secure</location>
>  </localfile>
>
>  <localfile>
>    <log_format>syslog</log_format>
>    <location>/vservers/[a-zA-Z0-9]*/var/log/maillog</location>
>  </localfile>
>
>  <localfile>
>    <log_format>syslog</log_format>
>    <location>/vservers/[a-zA-Z0-9]*/var/log/cron</location>
>  </localfile>
> </ossec_config>
>
> Any idea what is causing this? How to kill the process without
> rebooting? How to fix it?
>
> We're starting to fall behind on this critical project so any help is
> greatly appreciated.  Thanks - John
>
> --
> John A. Sullivan III
> Open Source Development Corporation
> +1 207-985-7880
> [email protected]
>
> http://www.spiritualoutreach.com
> Making Christianity intelligible to secular society
>
>
