>>>>> On Wed, 17 Jun 2020 13:02:13 -0400, Phil Stracchino said:
> 
> On 2020-06-17 12:28, Martin Simmons wrote:
> >>>>>> On Wed, 17 Jun 2020 08:37:37 -0400, Phil Stracchino said:
> >>
> >> OK, so, I reinstalled with debug binaries, having figured out how to
> >> override global make.conf for specific builds.  This morning, three jobs
> >> were hung, and once again it seemed to be random clients hung on random
> >> files.  There was nothing much of interest showing in the client trace
> >> files (at d200) except for heartbeat messages one after another.
> > 
> > In that case, you'll need to attach gdb to the director, client and storage
> > daemons when it is hanging (before cancelling the job) and use the "thread
> > apply all bt" command in gdb to get backtraces.  You may need to recompile
> > those daemons with debug info too.
> 
> 
> I recompiled all of the Bacula packages, on all of the Gentoo machines.
> One of the clients that has hung several times is a Fedora system not
> set up for development, and it's still running the 9.4.4 client that's
> the newest available for Fedora, which suggests the client isn't
> actually the problem.

You can try installing the Bacula debuginfo RPMs for Fedora, if they exist
(https://fedoraproject.org/wiki/StackTraces#What_are_debuginfo_rpms.2C_and_how_do_I_get_them.3F).
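
If it helps, the backtrace collection can be scripted so that it is quick to
run against each daemon while a job is hung.  The following is only a rough
sketch, not something I have tested on your setup: the helper names and the
bt-<pid>.txt output file names are made up for illustration, and it assumes
gdb is installed and that you run it as root (or as the user owning the
daemon) so that attaching is permitted.  It uses gdb's -p, -batch and -ex
options to attach, run "thread apply all bt" and then exit, which detaches
from the process again.

```python
import subprocess
import sys


def gdb_backtrace_command(pid):
    """Build the gdb command line that attaches to pid and dumps all threads.

    -p attaches to the running process, -batch makes gdb exit after running
    its commands, and -ex supplies the command to run.
    """
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def collect_backtraces(pid, out_path):
    """Run gdb against pid and save everything it prints to out_path.

    out_path is a hypothetical file name chosen by the caller.
    """
    with open(out_path, "w") as out:
        subprocess.run(
            gdb_backtrace_command(pid),
            stdout=out,
            stderr=subprocess.STDOUT,
            check=False,  # keep going even if gdb reports an error
        )


if __name__ == "__main__":
    # Usage: python3 collect_bt.py <dir-pid> <sd-pid> <fd-pid>
    for arg in sys.argv[1:]:
        pid = int(arg)
        collect_backtraces(pid, f"bt-{pid}.txt")
```

Run it with the pids of the director, storage and client daemons before
cancelling the hung job.  Without debug info the backtraces will mostly show
addresses rather than function names and line numbers.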


> 
> 
> > OK, so the MySQL client library is calling abort (possibly some failed call
> > to assert).  To get more detail, you'll need to recompile
> > /usr/lib64/libmysqlclient.so.21 with debug information as well.
> 
> 
> That's interesting, because MySQL (actually MariaDB 10.4.12) has not
> changed since 41 days before I updated Bacula from 9.6.3 to 9.6.5.  The
> problem started only after Bacula updated, which suggests it's not
> MariaDB's fault.  Now that I know how to do per-package build
> environment overrides on Gentoo, though, it's easy enough to just add
> the debug flags to dev-db/mariadb.

Yes, this crash is probably not a bug in MariaDB.  It might also be unrelated
to the hanging.

> 
> When I look at the 9.6.4/9.6.5 release notes, I don't immediately see
> any changes that appear obviously relevant to this problem.  It would be
> pretty simple for me to roll back to 9.6.3 as a sanity check to verify
> that the problem goes away.

I wouldn't change the version at the moment if you want to debug the problem,
in case it goes into hibernation -- or should that be lockdown? :-)

__Martin


_______________________________________________
Bacula-devel mailing list
Bacula-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-devel
