Hello Phil,
Can you try rebuilding the failing Director (and as many other
components as you can), with
--enable-smartalloc \
in your ./configure script. I have an uneasy feeling that if one is not
using smartalloc problems could like yours could happen.
Best regards,
Kern
On 6/23/20 5:11 PM, Phil Stracchino wrote:
Everything else re-upgraded to 9.6.5. No problems overnight. Looks so
far like the issue is triggered by the combination of Bacula 9.6.5 +
Galera cluster + HAproxy.
This *does* seem to imply that some change in the Director between 9.6.3
and 9.6.5 is a factor. It also appears it is a problem only with
HAproxy in the picture. Both client and server timeouts in my HAproxy
config are set to 500000 (ms). *Could* this perhaps be a timeout issue
that for some reason affects 9.6.5 but not 9.6.3? I can see the
potential for a Director crash if HAproxy has terminated the connection
but the Director thinks it's still open. However, I've never
encountered this problem prior to 9.6.5, and I've had Bacula running
against the Galera cluster for years now and relaying via HAproxy for
many months.
I can test this theory by setting the Director's DB connection back to
via HAproxy and increasing HAproxy's timeout.
_______________________________________________
Bacula-devel mailing list
Bacula-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-devel