On Thu, Aug 16, 2012 at 1:12 AM, Jim <[email protected]> wrote:
> Hello,
>
> Any further thoughts on fixing this core dump problem?
>
> Thanks,
>
>
> --JIM
>

Is there any chance you can run it in gdb?

gdb /var/ossec/bin/ossec-analysisd
set follow-fork-mode child
run
*CRASH*
bt


There are probably other things you should be running in addition to
bt, but I don't know it all that well.

>
> On Monday, August 13, 2012 7:41:39 PM UTC-4, Jim wrote:
>>
>> Here are the logs from the ossec.log, which was running in debug.  Which
>> reports until you can see analysisd core dumps.  IPs and hostnames have been
>> changed from the original...
>>
>> 2012/08/12 15:07:09 ossec-logcollector: DEBUG: Reading syslog message:
>> 'Aug 12 15:07:08 casrv9-hidn.local1 ipmon[123]: 15:07:07.674864 e1000g0
>> @0:63 p 192.168.2.81,46618 -> 192.168.2.205,994 PR tcp len 20 142 -AP IN'
>> 2012/08/12 15:07:09 ossec-logcollector: DEBUG: Reading syslog message:
>> 'Aug 12 15:07:08 casrv9-hidn.local1 ipmon[123]: 15:07:07.675807 e1000g0
>> @0:63 p 192.168.2.81,46618 -> 192.168.2.205,994 PR tcp len 20 52 -A IN'
>> 2012/08/12 15:07:09 ossec-logcollector: DEBUG: Reading syslog message:
>> 'Aug 12 15:07:08 batch.local1 batch kernel: VFS: busy inodes on changed
>> media or resized disk hdb'
>> 2012/08/12 15:07:09 ossec-logcollector: DEBUG: Reading syslog message:
>> 'Aug 12 15:07:08 n58.local1 ntpd[2816]: kernel time sync enabled 0001'
>> 2012/08/12 15:07:09 ossec-logcollector: DEBUG: Reading syslog message:
>> 'Aug 12 15:07:09 n44.local1 smartd[2942]: Device: /dev/sda, 661 Currently
>> unreadable (pending) sectors '
>> 2012/08/12 15:07:11 ossec-logcollector: DEBUG: Reading syslog message:
>> 'Aug 12 15:07:10 FW1.local1 kernel: ACCEPTED IN=eth0 OUT=eth4
>> SRC=192.168.1.200 DST=172.16.4.34 LEN=84 TOS=0x00 PREC=0x00 TTL=252 ID=27140
>> PROTO=ICMP TYPE=8 CODE=0 ID=16044 SEQ=0 '
>> 2012/08/12 15:07:11 ossec-logcollector: DEBUG: Reading syslog message:
>> 'Aug 12 15:07:10 batch.local1 batch kernel: VFS: busy inodes on changed
>> media or resized disk hdb'
>> 2012/08/12 15:07:11 ossec-logcollector: DEBUG: Reading syslog message:
>> 'Aug 12 15:07:11 casrv9-hidn.local1 ipmon[123]: 15:07:10.875337 e1000g0
>> @0:60 p 192.168.4.52,49168 -> 192.168.2.205,6667 PR tcp len 20 72 -AP IN'
>> 2012/08/12 15:07:11 ossec-logcollector: DEBUG: Reading syslog message:
>> 'Aug 12 15:07:11 casrv9-hidn.local1 ipmon[123]: 15:07:10.876005 e1000g0
>> @0:60 p 192.168.4.52,49168 -> 192.168.2.205,6667 PR tcp len 20 52 -A IN'
>> 2012/08/12 15:07:11 ossec-logcollector: DEBUG: Reading syslog message:
>> 'Aug 12 15:07:11 casrv9-hidn.local1 ipmon[123]: 15:07:11.071610 e1000g0
>> @0:63 p 192.168.2.127,56476 -> 192.168.2.205,994 PR tcp len 20 142 -AP IN'
>> 2012/08/12 15:07:11 ossec-logcollector: DEBUG: Reading syslog message:
>> 'Aug 12 15:07:11 casrv9-hidn.local1 ipmon[123]: 15:07:11.074955 e1000g0
>> @0:63 p 192.168.2.127,56476 -> 192.168.2.205,994 PR tcp len 20 52 -A IN'
>> 2012/08/12 15:07:11 ossec-logcollector: DEBUG: Reading syslog message:
>> 'Aug 12 15:07:11 casrv9-hidn.local1 ipmon[123]: 15:07:11.140966 e1000g0
>> @0:63 p 192.168.2.81,46618 -> 192.168.2.205,994 PR tcp len 20 126 -AP IN'
>> 2012/08/12 15:07:11 ossec-logcollector: DEBUG: Reading syslog message:
>> 'Aug 12 15:07:11 casrv9-hidn.local1 ipmon[123]: 15:07:11.142122 e1000g0
>> @0:63 p 192.168.2.81,46618 -> 192.168.2.205,994 PR tcp len 20 52 -A IN'
>> 2012/08/12 15:07:13 ossec-logcollector: DEBUG: Reading syslog message:
>> 'Aug 12 15:07:12 loghost unix: NOTICE: core_log: ossec-analysisd[713] core
>> dumped: /var/cores/core.ossec-analysisd.713'
>>
>>
>> And here is the firewall log from ossec.  I can see that these messages
>> stop before the above messages:
>>
>> 2012 Aug 12 15:06:57 casrv9-hidn.local1->/var/adm/systemlog ALLOW tcp
>> 192.168.4.142:45755->192.168.2.205:994
>> 2012 Aug 12 15:06:59 casrv9-hidn.local1->/var/adm/systemlog ALLOW tcp
>> 192.168.4.142:45755->192.168.2.205:994
>> 2012 Aug 12 15:06:59 casrv9-hidn.local1->/var/adm/systemlog ALLOW tcp
>> 192.168.4.127:58951->192.168.2.205:994
>> 2012 Aug 12 15:06:59 casrv9-hidn.local1->/var/adm/systemlog ALLOW tcp
>> 192.168.4.127:58951->192.168.2.205:994
>> 2012 Aug 12 15:07:01 casrv9-hidn.local1->/var/adm/systemlog ALLOW tcp
>> 192.168.15.70:53959->192.168.2.205:994
>> 2012 Aug 12 15:07:01 casrv9-hidn.local1->/var/adm/systemlog ALLOW tcp
>> 192.168.15.70:53959->192.168.2.205:994
>> 2012 Aug 12 15:07:05 casrv9-hidn.local1->/var/adm/systemlog ALLOW tcp
>> 192.168.4.61:49195->192.168.2.205:6667
>> 2012 Aug 12 15:07:05 casrv9-hidn.local1->/var/adm/systemlog ALLOW tcp
>> 192.168.4.61:49195->192.168.2.205:6667
>> 2012 Aug 12 15:07:09 casrv9-hidn.local1->/var/adm/systemlog ALLOW tcp
>> 192.168.2.81:46618->192.168.2.205:994
>> 2012 Aug 12 15:07:09 casrv9-hidn.local1->/var/adm/systemlog ALLOW tcp
>> 192.168.2.81:46618->192.168.2.205:994
>>
>> Last couple of lines in firewall log match the first couple of lines in
>> the ossec.log snippet.  I am guessing things went bad after the last entry
>> reported in the firewall log?
>>
>>
>> --JIM
>>
>>
>> On Monday, August 13, 2012 3:03:53 PM UTC-4, JB wrote:
>>>
>>> Seg fault occurred at ./analysisd/alerts/log.c around line 261:
>>>     fprintf(_fflog,
>>>             "%d %s %02d %s %s%s%s %s %s %s:%s->%s:%s\n",
>>>             lf->year,
>>>             lf->mon,
>>>             lf->day,
>>>             lf->hour,
>>>             lf->hostname != lf->location?lf->hostname:"",
>>>             lf->hostname != lf->location?"->":"",
>>>             lf->location,
>>>             lf->action,
>>>             lf->protocol,
>>>             lf->srcip,
>>>             lf->srcport,
>>>             lf->dstip,
>>>             lf->dstport);
>>>
>>> Could you provide some Solaris firewall raw log entries prior to the time
>>> of this core dump?
>>> Also, provide same OSSEC firewall log entries (e.g.,
>>> /var/ossec/logs/fiewwall/2012/Aug/oosec-firewall-12.log).
>>> It seems one of the lf-> fields was null but I do not know which one yet.
>>>
>>> On Sunday, August 12, 2012 12:19:23 PM UTC-7, Jim wrote:
>>>>
>>>> Hi,
>>>>
>>>> I am trying to get OSSEC version 2.6 running on one of our Solaris 10
>>>> loghosts but I am running into a problem where analysisd cores after about 
>>>> a
>>>> minute.  It will actually report some e-mail alerts for a minute or so
>>>> before analysisd cores.  Below is an MDeBug of the core file:
>>>>
>>>>
>>>>
>>>> ******************************************************************************
>>>>   Application core Dump Analysis Output                     MDeBug Rev
>>>> 1.0
>>>>   Sun Aug 12 15:13:15 EDT 2012                   Files:
>>>> /var/ossec/bin/ossec-analysisd  core.ossec-analysisd.713
>>>>
>>>> ******************************************************************************
>>>>
>>>>
>>>>
>>>>                 ** Core file status **
>>>>                 ------------------------
>>>> debugging core file of ossec-analysisd (32-bit) from ex
>>>> file: /var/ossec/bin/ossec-analysisd
>>>> initial argv: /var/ossec/bin/ossec-analysisd -d -d
>>>> threading model: multi-threaded
>>>> status: process terminated by SIGSEGV (Segmentation Fault)
>>>>
>>>>
>>>>                 ** Thread stack($c) **
>>>>                 ----------------------
>>>> libc.so.1`strlen+0xc(8095e28, 8046478, 80bbc10, 0)
>>>> libc.so.1`fprintf+0x99(80bbc10, 8095e08, 7dc, 81b31f2, c, 81b31e8)
>>>> FW_Log+0x38e(81b3178, 81b3178, 8046500, 805763c)
>>>> OS_ReadMSG+0x5d0(4, 808d48b, 2c9, 808d46b)
>>>> main+0x98a(3, 8047db0, 8047dc0)
>>>> _start+0x80(3, 8047e74, 8047e93, 8047e96, 0, 8047e99)
>>>>
>>>>
>>>>                 ** Shared objects **
>>>>                 ----------------------
>>>>     BASE    LIMIT     SIZE NAME
>>>>  8050000  80a8000    58000 /var/ossec/bin/ossec-analysisd
>>>> fef90000 fef9b000     b000 /lib/libsocket.so.1
>>>> feef0000 fef71000    81000 /lib/libnsl.so.1
>>>> feea0000 feed6000    36000 /lib/libresolv.so.2
>>>> fed70000 fee7e000   10e000 /lib/libc.so.1
>>>> fefc3000 fefeb000    28000 /lib/ld.so.1
>>>>
>>>>
>>>>                 Thread stack for MT app
>>>>                 ------------------------
>>>> stack pointer for thread 1: 8046440
>>>> [ 08046440 libc.so.1`strlen+0xc() ]
>>>>   08046468 libc.so.1`fprintf+0x99()
>>>>   080464c8 FW_Log+0x38e()
>>>>   08047d28 OS_ReadMSG+0x5d0()
>>>>   08047d8c main+0x98a()
>>>>   08047da4 _start+0x80()
>>>>
>>>>
>>>> Then a disassembly of FW_log.  Let me know if you need me to dig deeper
>>>> here...
>>>>
>>>> bash-3.00# mdb /var/ossec/bin/ossec-analysisd core.ossec-analysisd.713
>>>> Loading modules: [ libc.so.1 ld.so.1 ]
>>>> > ::stack
>>>> libc.so.1`strlen+0xc(8095e28, 8046478, 80bbc10, 0)
>>>> libc.so.1`fprintf+0x99(80bbc10, 8095e08, 7dc, 81b31f2, c, 81b31e8)
>>>> FW_Log+0x38e(81b3178, 81b3178, 8046500, 805763c)
>>>> OS_ReadMSG+0x5d0(4, 808d48b, 2c9, 808d46b)
>>>> main+0x98a(3, 8047db0, 8047dc0)
>>>> _start+0x80(3, 8047e74, 8047e93, 8047e96, 0, 8047e99)
>>>>
>>>> > FW_Log+0x38e::dis
>>>> FW_Log+0x36b:                   movl   0x8(%ebp),%eax
>>>> FW_Log+0x36e:                   pushl  0x68(%eax)
>>>> FW_Log+0x371:                   movl   0x8(%ebp),%eax
>>>> FW_Log+0x374:                   addl   $0x7a,%eax
>>>> FW_Log+0x377:                   pushl  %eax
>>>> FW_Log+0x378:                   movl   0x8(%ebp),%eax
>>>> FW_Log+0x37b:                   pushl  0x6c(%eax)
>>>> FW_Log+0x37e:                   pushl  $0x8095e08
>>>> FW_Log+0x383:                   pushl  0x80c068c
>>>> FW_Log+0x389:                   call   -0x20f6c <PLT=libc.so.1`fprintf>
>>>> FW_Log+0x38e:                   addl   $0x40,%esp
>>>> FW_Log+0x391:                   subl   $0xc,%esp
>>>> FW_Log+0x394:                   pushl  0x80c068c
>>>> FW_Log+0x39a:                   call   -0x20e5d <PLT=libc.so.1`fflush>
>>>> FW_Log+0x39f:                   addl   $0x10,%esp
>>>> FW_Log+0x3a2:                   movl   $0x1,-0x8(%ebp)
>>>> FW_Log+0x3a9:                   movl   -0x8(%ebp),%eax
>>>> FW_Log+0x3ac:                   movl   -0x4(%ebp),%ebx
>>>> FW_Log+0x3af:                   leave
>>>> FW_Log+0x3b0:                   ret
>>>> mknod:                          pushl  %ebp
>>>> > ::quit
>>>>
>>>>
>>>>
>>>> --JIM

Reply via email to