Hi, I have done more research on this problem. More information about it inline.

On Tue, Jun 30, 2020 at 12:18:55PM +0100, Jose M Calhariz wrote:
> On Mon, Jun 29, 2020 at 11:06:46AM -0600, Charles Curley wrote:
> > On Mon, 29 Jun 2020 16:36:33 +0100
> > Jose M Calhariz <[email protected]> wrote:
> >
> > > On my main amanda installation I have a client that gives time out
> > > when doing backups. I have researched and checked out the most common
> > > problems. In the end I have found that:
> > >
> > > - "amcheck Config -c client" gives 30 seconds of timeout.
> >
> > I do not see that. I checked two clients, one AMD64, the other i386.
>
> On mine main amanda installation and have dozens of Debian clients.
> It is one client only that is failing and I do not understand why.

Now I have two clients with problems, but their main problems are
different.

> >
> > >
> > >
> > > - I do not have any clue on the logs at the client.
> > >
> > > - Running the command by hand on the client I get segmentation fault.
> > >
> > > /usr/lib/amanda/amandad -auth=ssh amdump
> > > Segmentation fault
> >
> > I see that, on both machines.
> >
>
> OK, so the problem is another. More research to do. Thank you.
> I will post here when I find more info.

Tonight I did more investigation. When I run "amcheck Conf -c client"
it times out after 30 seconds. I increased this timeout some time ago
because of another problem, so YMMV. On the client, amandad runs
successfully, but it never tries to run selfcheck and does not report
an error. The proof is that there are no logs in
/var/log/amanda/client/Conf. I made some changes, increasing the client
debug logging among other things, and selfcheck started to run. Today
selfcheck is not running again.

Does anyone know where in the amandad code selfcheck is launched, so I
can quickly find the place and possibly add more debugging code?

> >
> > >
> > >
> > > - Running the command inside gdb I see a NULL pointer.
> > >
> > > gdb /usr/lib/amanda/amandad
> > > GNU gdb (Debian 8.2.1-2+b3) 8.2.1
> >
> > I do not have symbols installed, so of course I can't examine the
> > variables. I do see a segment fault:
> >
> > ...
> > (gdb) run -auth=ssh amdump
> > Starting program: /usr/lib/amanda/amandad -auth=ssh amdump
> > [Thread debugging using libthread_db enabled]
> > Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > 0x00007ffff7f3a687 in stream_sendpkt () from
> > /usr/lib/x86_64-linux-gnu/amanda/libamanda-3.5.1.so
> > (gdb)
> > ...
> >
> > I have no idea what is going on here, not having the source in front of
> > me. But I wonder if this is because amanda is trying to use an SSH
> > connection that isn't there?
> >
> > The first time, I SSHed in as root, did an su - to backup, then ran
> > gdb. To test my hypothesis above, I went to my amanda server, su - to
> > backup, then sshed to the client. This time I did not get a seg fault,
> > and used Ctl-c to end the process:
> >
> > (gdb) run -auth=ssh amdump
> > Starting program: /usr/lib/amanda/amandad -auth=ssh amdump
> > [Thread debugging using libthread_db enabled]
> > Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> > ^C
> > Program received signal SIGINT, Interrupt.
> > 0x00007ffff76a27e4 in __GI___poll (fds=0x55555559e3d0, nfds=2,
> > timeout=30000) at ../sysdeps/unix/sysv/linux/poll.c:29
> > 29 ../sysdeps/unix/sysv/linux/poll.c: No such file or directory.
> > (gdb)
> >
> >
> >
> > This is with the bog standard Buster version of amanda:
> >
> > root@dzur:~# pre amanda
> > amanda-client 1:3.5.1-2+b2 amd64
> > amanda-common 1:3.5.1-2+b2 amd64
> > root@dzur:~#
> >
> >
>
> Kind regards
> Jose M Calhariz
>

Kind regards
Jose M Calhariz

--
--
Those who love to be feared fear to be loved, and they themselves are
more afraid than anyone, because while other men fear only them, they
fear everything
	--São Francisco de Sales
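
P.S. In case it helps anyone repeat the "no logs in
/var/log/amanda/client/Conf" check on their own client, here is a
minimal sketch in Python of how one could test whether selfcheck wrote
a debug file recently. The function name recent_debug_files and the
one-hour window are only illustrative choices of mine, and I am
assuming that the debug file names start with the program name; adjust
the prefix if your installation names them differently.

```python
import os
import time


def recent_debug_files(log_dir, program, max_age_seconds=3600, now=None):
    """Return the files in log_dir whose name starts with `program` and
    that were modified within the last max_age_seconds (hypothetical helper)."""
    now = time.time() if now is None else now
    try:
        names = sorted(os.listdir(log_dir))
    except FileNotFoundError:
        return []  # directory missing: nothing was ever logged there
    recent = []
    for name in names:
        path = os.path.join(log_dir, name)
        # Assumption: debug files are named after the program that wrote them.
        if not name.startswith(program) or not os.path.isfile(path):
            continue
        if now - os.path.getmtime(path) <= max_age_seconds:
            recent.append(name)
    return recent


if __name__ == "__main__":
    log_dir = "/var/log/amanda/client/Conf"  # directory from my setup
    found = recent_debug_files(log_dir, "selfcheck")
    print("selfcheck ->", found if found else "no recent debug files")
```

Running it right after "amcheck Conf -c client" on the client should
show whether selfcheck was started at all during that check.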
