Hi,

It is a race condition.  While I am adding debug methods the problem
goes away.  After a pause of some time the problem starts again.

Judging by the logs and strace, it seems that the noop service ends
too fast and the event loop waits for a process that has already
terminated.
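
To illustrate the kind of race I mean, here is a minimal sketch in
Python.  It is not taken from the amanda source.  The function name
run_and_reap and the NOOP command are hypothetical stand-ins, and the
sketch only assumes that the child can exit before the parent starts
watching it.  Checking the child's status before every wait means a
child that has already terminated is still noticed:

```python
import subprocess
import sys
import time


def run_and_reap(argv, timeout=5.0, poll_interval=0.05):
    """Start a child and return its exit status, even if it exits at once."""
    child = subprocess.Popen(argv)
    deadline = time.monotonic() + timeout
    while True:
        # poll() collects the exit status if the child has already finished,
        # so a child that ended before this loop started is not missed.
        status = child.poll()
        if status is not None:
            return status
        if time.monotonic() >= deadline:
            # Give up: stop the child and reap it so no zombie is left behind.
            child.kill()
            child.wait()
            raise TimeoutError(f"{argv!r} still running after {timeout}s")
        time.sleep(poll_interval)


# Hypothetical stand-in for a service that finishes almost instantly.
NOOP = [sys.executable, "-c", "pass"]

if __name__ == "__main__":
    print(run_and_reap(NOOP))
```

A loop that only waits for a "child exited" notification, without
first checking the status, can block until its timeout when the child
finished before the wait began.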

Kind regards
Jose M Calhariz

On Fri, Jul 10, 2020 at 02:42:38PM +0100, Jose M Calhariz wrote:
> Hi,
> 
> I have done more research on this problem.  More information about it
> inline.
> 
> On Tue, Jun 30, 2020 at 12:18:55PM +0100, Jose M Calhariz wrote:
> > On Mon, Jun 29, 2020 at 11:06:46AM -0600, Charles Curley wrote:
> > > On Mon, 29 Jun 2020 16:36:33 +0100
> > > Jose M Calhariz <jose.calha...@tecnico.ulisboa.pt> wrote:
> > > 
> > > > On my main amanda installation I have a client that times out
> > > > when doing backups.  I have researched and checked the most common
> > > > problems.  In the end I have found that:
> > > > 
> > > > - "amcheck Config -c client" times out after 30 seconds.
> > > 
> > > I do not see that. I checked two clients, one AMD64, the other i386.
> > 
> > On my main amanda installation I have dozens of Debian clients.
> > Only one client is failing and I do not understand why.
> 
> Now I have two clients with problems, but the main problems are different.
> 
> > 
> > > 
> > > > 
> > > > - I do not have any clue on the logs at the client.
> > > > 
> > > > - Running the command by hand on the client I get a segmentation fault.
> > > > 
> > > > /usr/lib/amanda/amandad -auth=ssh amdump
> > > > Segmentation fault
> > > 
> > > I see that, on both machines.
> > >
> > 
> > OK, so the problem is something else.  More research to do.  Thank you.
> > I will post here when I find more info.
> 
> 
> This night I have done more investigation.  When I run "amcheck Conf
> -c client" it times out after 30 seconds.  I increased this timeout
> some time ago because of another problem.  So YMMV.
> 
> On the client, amandad runs successfully but never tries to run
> selfcheck and does not give an error.  The proof is that I have no logs
> in /var/log/amanda/client/Conf.  I made some changes, increased
> the client debug logs and some other things, and selfcheck started
> to run.  Today selfcheck is not running again.
> 
> 
> Does anyone know where in the amandad code selfcheck is launched, so
> I can quickly find the place and possibly add more debugging code?
> 
> 
> > 
> > 
> > > 
> > > > 
> > > > - Running the command inside gdb I see a NULL pointer.
> > > > 
> > > > gdb /usr/lib/amanda/amandad 
> > > > GNU gdb (Debian 8.2.1-2+b3) 8.2.1
> > > 
> > > I do not have symbols installed, so of course I can't examine the
> > > > variables. I do see a segmentation fault:
> > > 
> > > ...
> > > (gdb) run -auth=ssh amdump
> > > Starting program: /usr/lib/amanda/amandad -auth=ssh amdump
> > > [Thread debugging using libthread_db enabled]
> > > Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> > > 
> > > Program received signal SIGSEGV, Segmentation fault.
> > > 0x00007ffff7f3a687 in stream_sendpkt () from 
> > > /usr/lib/x86_64-linux-gnu/amanda/libamanda-3.5.1.so
> > > (gdb)
> > > ...
> > > 
> > > I have no idea what is going on here, not having the source in front of
> > > me. But I wonder if this is because amanda is trying to use an SSH
> > > connection that isn't there?
> > > 
> > > > The first time, I SSHed in as root, did an su - to backup, then ran
> > > > gdb. To test my hypothesis above, I went to my amanda server, did an
> > > > su - to backup, then SSHed to the client. This time I did not get a
> > > > segmentation fault, and used Ctrl-C to end the process:
> > > 
> > > (gdb) run -auth=ssh amdump
> > > Starting program: /usr/lib/amanda/amandad -auth=ssh amdump
> > > [Thread debugging using libthread_db enabled]
> > > Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> > > ^C
> > > Program received signal SIGINT, Interrupt.
> > > 0x00007ffff76a27e4 in __GI___poll (fds=0x55555559e3d0, nfds=2, 
> > > timeout=30000) at ../sysdeps/unix/sysv/linux/poll.c:29
> > > 29        ../sysdeps/unix/sysv/linux/poll.c: No such file or directory.
> > > (gdb) 
> > > 
> > > 
> > > 
> > > This is with the bog standard Buster version of amanda:
> > > 
> > > root@dzur:~# pre amanda
> > > amanda-client     1:3.5.1-2+b2            amd64
> > > amanda-common     1:3.5.1-2+b2            amd64
> > > root@dzur:~# 
> > > 
> > > 
> > 
> > Kind regards
> > Jose M Calhariz
> > 
> > 
> 
> Kind regards
> Jose M Calhariz
> 
> 
> 



-- 
When one does not want to... the other insists.
