Hello, all.  We are suddenly having a bit of a nightmare with our
otherwise usually delightful OSSEC.  We've installed it on a dual
quad-core AMD server with 32 GB of RAM running CentOS 5.2 but with kernel
2.6.28.7 (the CentOS kernel panics with open-iscsi) and VServer
2.3.0.36.7.

After a while, a syscheckd process spins completely out of control
consuming 100% of one processor.  It refuses to die.  kill does not
work, kill -9 does not work, service ossec stop does not work.  Only
rebooting seems to work.  The console is flooded with:
BUG: soft lockup - CPU#3 stuck for 61s! [ossec-syscheckd:4625]
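
For anyone trying to reproduce this, a minimal sketch of how the state of
the stuck process could be read on Linux, assuming /proc is mounted.  The
helper names parse_proc_stat and read_state are hypothetical and not part
of OSSEC.  A state of "R" alongside the soft lockup message would be
consistent with the process spinning inside the kernel, where signals
(including SIGKILL) are not acted on until it returns to user mode.

```python
def parse_proc_stat(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST ')' instead of on whitespace.
    head, _, tail = stat_line.rpartition(")")
    pid_text, _, comm = head.partition("(")
    # The first field after the command name is the one-letter state,
    # e.g. R (running), S (sleeping), D (uninterruptible sleep), Z (zombie).
    state = tail.split()[0]
    return int(pid_text), comm, state


def read_state(pid):
    """Read and parse /proc/<pid>/stat for a live process (Linux only)."""
    with open("/proc/%d/stat" % pid) as handle:
        return parse_proc_stat(handle.read())
```

Calling read_state(4625) on the affected host would report the state of
the syscheckd process named in the console message above.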

The VServer host (the source of the runaway process) is an OSSEC agent.
Originally, the OSSEC server was running as one of its guests but we
thought that was the problem.  We moved the OSSEC server to another
piece of hardware yet the problem has persisted.

We are using OSSEC 2.0 (http://www.ossec.net/files/ossec-hids-2.0.tar.gz),
downloaded today.  The checksum matched.

Here is the log since the last start.  Notice that it thinks syscheckd
has stopped:
2009/03/18 11:55:43 ossec-execd: INFO: Started (pid: 4613).
2009/03/18 11:55:43 ossec-agentd(1410): INFO: Reading authentication keys file.
2009/03/18 11:55:43 ossec-agentd: INFO: No previous counter available for 'vserver'.
2009/03/18 11:55:43 ossec-agentd: INFO: Assigning counter for agent vserver01: '0:0'.
2009/03/18 11:55:43 ossec-agentd: INFO: Assigning sender counter: 3:3930
2009/03/18 11:55:43 ossec-agentd: INFO: Started (pid: 4617).
2009/03/18 11:55:43 ossec-agentd: INFO: Server IP Address: 172.x.x.30
2009/03/18 11:55:43 ossec-agentd: INFO: Trying to connect to server (172.x.x.30:1514).
2009/03/18 11:55:44 ossec-agentd(4102): INFO: Connected to the server (172.x.x.30:1514).
2009/03/18 11:55:47 ossec-syscheckd: INFO: Started (pid: 4625).
2009/03/18 11:55:47 ossec-rootcheck: INFO: Started (pid: 4625).
2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/messages'.
2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/secure'.
2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/maillog'.
2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/var/log/cron'.
2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/basevs/var/log/messages'.
2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/h01/var/log/messages'.
2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/ns02/var/log/messages'.
2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/basevs/var/log/secure'.
2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/h01/var/log/secure'.
2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/ns02/var/log/secure'.
2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/basevs/var/log/maillog'.
2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/h01/var/log/maillog'.
2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/ns02/var/log/maillog'.
2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/basevs/var/log/cron'.
2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/h01/var/log/cron'.
2009/03/18 11:55:49 ossec-logcollector(1950): INFO: Analyzing file: '/vservers/ns02/var/log/cron'.
2009/03/18 11:55:49 ossec-logcollector: INFO: Started (pid: 4621).
2009/03/18 11:58:28 ossec-syscheckd: Error opening directory: '/user/local/sbin': No such file or directory
2009/03/18 11:59:13 ossec-syscheckd: Error opening directory: '/vservers/ns02/user/local/sbin': No such file or directory
2009/03/18 12:01:13 ossec-syscheckd: INFO: Starting syscheck scan (db).
2009/03/18 12:09:53 ossec-syscheckd: INFO: Ending syscheck scan (db).
2009/03/18 12:10:13 ossec-rootcheck: INFO: Starting rootcheck scan.


Here is ossec.conf on the VServer host:
<ossec_config>
  <client>
    <server-ip>172.30.10.30</server-ip>
  </client>

  <syscheck>
    <!-- Frequency that syscheck is executed - default to every 6 hours -->
    <frequency>21600</frequency>
    <alert_new_files>yes</alert_new_files>

    <!-- Directories to check  (perform all possible verifications) -->
    <directories check_all="yes">/etc,/usr/bin,/usr/sbin</directories>
    <directories check_all="yes">/bin,/sbin,/usr/local/bin,/user/local/sbin,/usr/local/etc</directories>
    <directories check_all="yes">/vservers/ns02/etc,/vservers/ns02/usr/bin,/vservers/ns02/usr/sbin</directories>
    <directories check_all="yes">/vservers/ns02/bin,/vservers/ns02/sbin,/vservers/ns02/usr/local/bin,/vservers/ns02/user/local/sbin,/vservers/ns02/usr/local/etc</directories>

    <!-- Files/directories to ignore -->
    <ignore>/etc/mtab</ignore>
    <ignore>/etc/mnttab</ignore>
    <ignore>/etc/hosts.deny</ignore>
    <ignore>/etc/mail/statistics</ignore>
    <ignore>/etc/random-seed</ignore>
    <ignore>/etc/adjtime</ignore>
    <ignore>/etc/httpd/logs</ignore>
    <ignore>/etc/utmpx</ignore>
    <ignore>/etc/wtmpx</ignore>
    <ignore>/etc/cups/certs</ignore>
    <ignore>/etc/dumpdates</ignore>
    <ignore>/etc/svc/volatile</ignore>

    <!-- Windows files to ignore -->
    <ignore>C:\WINDOWS/System32/LogFiles</ignore>
    <ignore>C:\WINDOWS/Debug</ignore>
    <ignore>C:\WINDOWS/WindowsUpdate.log</ignore>
    <ignore>C:\WINDOWS/iis6.log</ignore>
    <ignore>C:\WINDOWS/system32/wbem/Logs</ignore>
    <ignore>C:\WINDOWS/system32/wbem/Repository</ignore>
    <ignore>C:\WINDOWS/Prefetch</ignore>
    <ignore>C:\WINDOWS/PCHEALTH/HELPCTR/DataColl</ignore>
    <ignore>C:\WINDOWS/SoftwareDistribution</ignore>
    <ignore>C:\WINDOWS/Temp</ignore>
    <ignore>C:\WINDOWS/system32/config</ignore>
    <ignore>C:\WINDOWS/system32/spool</ignore>
    <ignore>C:\WINDOWS/system32/CatRoot</ignore>
  </syscheck>

  <rootcheck>
    <rootkit_files>/usr/local/ossec/etc/shared/rootkit_files.txt</rootkit_files>
    <rootkit_trojans>/usr/local/ossec/etc/shared/rootkit_trojans.txt</rootkit_trojans>
    <system_audit>/usr/local/ossec/etc/shared/system_audit_rcl.txt</system_audit>
    <system_audit>/usr/local/ossec/etc/shared/cis_debian_linux_rcl.txt</system_audit>
    <system_audit>/usr/local/ossec/etc/shared/cis_rhel_linux_rcl.txt</system_audit>
    <system_audit>/usr/local/ossec/etc/shared/cis_rhel5_linux_rcl.txt</system_audit>
  </rootcheck>
  <!-- Files to monitor (localfiles) -->

  <localfile>
    <log_format>syslog</log_format>
    <location>/var/log/messages</location>
  </localfile>

  <localfile>
    <log_format>syslog</log_format>
    <location>/var/log/secure</location>
  </localfile>

  <localfile>
    <log_format>syslog</log_format>
    <location>/var/log/maillog</location>
  </localfile>

  <localfile>
    <log_format>syslog</log_format>
    <location>/var/log/cron</location>
  </localfile>

  <localfile>
    <log_format>syslog</log_format>
    <location>/vservers/[a-zA-Z0-9]*/var/log/messages</location>
  </localfile>

  <localfile>
    <log_format>syslog</log_format>
    <location>/vservers/[a-zA-Z0-9]*/var/log/secure</location>
  </localfile>

  <localfile>
    <log_format>syslog</log_format>
    <location>/vservers/[a-zA-Z0-9]*/var/log/maillog</location>
  </localfile>

  <localfile>
    <log_format>syslog</log_format>
    <location>/vservers/[a-zA-Z0-9]*/var/log/cron</location>
  </localfile>
</ossec_config>
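
The two "Error opening directory" lines in the log correspond to the
/user/local/sbin entries in the <directories> elements above, which may
be a typo for /usr/local/sbin; that is probably unrelated to the lockup.
As a rough sketch, configured paths that do not exist could be listed
like this.  The function names are hypothetical, and the code assumes the
file has a single <ossec_config> root element, as this one does.

```python
import os
import xml.etree.ElementTree as ET


def configured_directories(conf_text):
    """Return every path listed in <directories> elements of an ossec.conf."""
    root = ET.fromstring(conf_text)
    paths = []
    for node in root.iter("directories"):
        # Each element holds a comma-separated list of paths.
        for item in (node.text or "").split(","):
            item = item.strip()
            if item:
                paths.append(item)
    return paths


def missing_directories(conf_text, exists=os.path.isdir):
    """Return the configured paths for which exists(path) is false."""
    return [path for path in configured_directories(conf_text)
            if not exists(path)]
```

Passing the text of /usr/local/ossec/etc/ossec.conf to
missing_directories() on the agent would list the paths syscheckd cannot
open.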

Any idea what is causing this?  How can we kill the process without
rebooting, and how can we fix the underlying problem?

We're starting to fall behind on this critical project so any help is
greatly appreciated.  Thanks - John

-- 
John A. Sullivan III
Open Source Development Corporation
+1 207-985-7880

http://www.spiritualoutreach.com
Making Christianity intelligible to secular society