I am not very knowledgeable about gdb. I tried running the following:

ossec2:/home/glantz # gdb /var/ossec/bin/ossec-remoted
GNU gdb 6.6
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and
you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for
details.
This GDB was configured as "x86_64-suse-linux"...
Using host libthread_db library "/lib64/libthread_db.so.1".
(gdb) set follow-fork-mode child
(gdb) run
Starting program: /var/ossec/bin/ossec-remoted
[Thread debugging using libthread_db enabled]
[New Thread 47921259252448 (LWP 844)]

Program exited with code 01.
(gdb) bt
No stack.
(gdb)

I ran a backtrace after the daemon crashed, but there is no stack. I am
not sure what I am doing wrong.

Our environment is SUSE 10.1 running inside Xen.
uname -a:
Linux ossec2 2.6.16.54-0.2.3-xen #1 SMP Thu Nov 22 18:32:07 UTC 2007
x86_64 x86_64 x86_64 GNU/Linux

The server is the latest 1.4, although the agents are older, I believe
1.3. This problem was occurring even when they were all running the same
version, 1.3. I upgraded the server in an attempt to troubleshoot. (I
noticed some people upgrading because of maild stopping.)

I was watching this all day yesterday and I don't see a pattern or
anything obvious as to why remoted crashes. analysisd does crash
occasionally as well, as you said. I have a Perl script set up that
monitors the ossec processes and emails me which processes were stopped.
Mostly it's just remoted that stops, but occasionally analysisd,
remoted and logcollector all stop. I also had the Perl script run
strace every time it reported remoted down. I left that on for 12
hours overnight. I have not gone through the output yet, but maybe I
will find a pattern there. I am stumped. Thank you for your time on
this, I appreciate it.
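
A minimal sketch of that kind of process monitor, written in Python
rather than Perl and not the actual script. The daemon list and the
`ps -eo args=` invocation are assumptions for illustration, and the
emailing and strace steps are left out:

```python
import os
import subprocess

# Daemon names as mentioned in this thread (assumed); adjust for your install.
EXPECTED = ("ossec-remoted", "ossec-analysisd", "ossec-logcollector")


def running_commands():
    """Return the basename of argv[0] for every running process."""
    # "args=" prints the full command line without a header; "comm" is
    # avoided because it truncates long names such as ossec-logcollector.
    out = subprocess.run(
        ["ps", "-eo", "args="],
        capture_output=True, text=True, check=True,
    ).stdout
    names = []
    for line in out.splitlines():
        parts = line.split()
        if parts:
            names.append(os.path.basename(parts[0]))
    return names


def missing_daemons(running, expected=EXPECTED):
    """Return the expected daemons that are absent from `running`."""
    present = set(running)
    return [name for name in expected if name not in present]


if __name__ == "__main__":
    down = missing_daemons(running_commands())
    if down:
        # The real script emails this list; printing keeps the sketch small.
        print("stopped:", ", ".join(down))
```

Run from cron every minute or so, this reports which daemons stopped
and roughly when, which can then be lined up against the strace output.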

George

On Feb 6, 5:54 pm, "Daniel Cid" <[EMAIL PROTECTED]> wrote:
> Hi George,
>
> First of all, thanks for the detailed explanation. This latter problem
> (all the socketerr + error
> sending message to queue) is caused when ossec-analysisd is not
> running. Can you check
> in your logs for anything from it? It creates all the queues to
> receive the events from the
> other daemons...
>
> As for remoted segfaulting, can you run it with gdb?
>
> # gdb /var/ossec/bin/ossec-remoted
> (gdb) set follow-fork-mode child
> (gdb) run  --> (after it crashes/exits, run bt)
> (gdb) bt
>
> Btw, which OS + ossec version are you using?
>
> Thanks,
>
> --
> Daniel B. Cid
> dcid ( at ) ossec.net
>
> On Feb 6, 2008 1:54 PM, glantz <[EMAIL PROTECTED]> wrote:
>
>
>
> > Some more info. I was looking through the /var/ossec/logs/ossec.log
> > this morning and noticed this chunk, which was around the same time I
> > received another segfault:
>
> > 2008/02/06 09:08:56 ossec-remoted: socketerr (not available).
> > 2008/02/06 09:08:56 ossec-remoted(1210): Queue '/queue/ossec/queue'
> > not accessible: 'Connection refused'.
> > 2008/02/06 09:08:56 ossec-logcollector: socketerr (not available).
> > 2008/02/06 09:08:56 ossec-logcollector(1224): Error sending message to
> > queue.
> > 2008/02/06 09:08:59 ossec-remoted(1210): Queue '/queue/ossec/queue'
> > not accessible: 'Connection refused'.
> > 2008/02/06 09:08:59 ossec-remoted(1211): Unable to access queue: '/
> > queue/ossec/queue'. Giving up..
> > 2008/02/06 09:08:59 ossec-logcollector(1210): Queue '/var/ossec/queue/
> > ossec/queue' not accessible: 'Connection refused'.
> > 2008/02/06 09:08:59 ossec-logcollector(1211): Unable to access queue:
> > '/var/ossec/queue/ossec/queue'. Giving up..
>
> > What would cause this, or, what things can I possibly check to try and
> > troubleshoot?
>
> > Thank you,
> > George
>
> > On Feb 4, 2:20 pm, glantz <[EMAIL PROTECTED]> wrote:
> > > We have about 150 agents pointing to our ossec server. Something seems
> > > to be killing ossec-remoted, possibly one of the agents. Nothing
> > > suspicious in the ossec logs that I can see. However,
> > > grep remoted /var/log/messages shows:
>
> > > Feb  4 13:51:05 ossec2 kernel: ossec-remoted[21608] general protection
> > > rip:2ae9b135f8b3 rsp:7ffff99eb378 error:0
> > > Feb  4 13:57:34 ossec2 kernel: ossec-remoted[21803]: segfault at
> > > 00000000000002d0 rip 00002ab0ad0b38b3 rsp 00007ffffdc976f8 error 4
>
> > > I have an strace capture, way too big to post here, since it usually
> > > runs correctly for 5-10 minutes at a time with at least 150 agents.
> > > Here are the last few lines, with the IP removed.
>
> > > 21976 stat("/queue/ossec/.wait", 0x7fff3a949050) = -1 ENOENT (No such
> > > file or directory)
> > > 21976 sendto(5, "1:(linux-246) 10.x.x.x->ossec"..., 52, 0, NULL, 0) =
> > > 52
> > > 21976 recvfrom(4, ":\3703\265\313N\363\277\4>\211\3p|\332z\23X
> > > \36\27\177\277"..., 6144, 0, {sa_family=AF_INET,
> > > sin_port=htons(32784), sin_addr=inet_addr("10.x.x.x")}, [16]) = 73
> > > 21976 time(NULL)                        = 1202155456
> > > 21976 --- SIGSEGV (Segmentation fault) @ 0 (0) ---
> > > 21977 <... recvfrom resumed> 0x407ff970, 1023, 0, 0x5405e0, 0x53f710)
> > > = ? ERESTARTSYS (To be restarted)
> > > 21978 <... futex resumed> )             = -1 EINTR (Interrupted system
> > > call)
> > > 21977 +++ killed by SIGSEGV +++
> > > 21978 +++ killed by SIGSEGV +++
