>>>>> On Wed, 17 Jun 2020 08:37:37 -0400, Phil Stracchino said:
> 
> OK, so, I reinstalled with debug binaries, having figured out how to
> override global make.conf for specific builds.  This morning, three jobs
> were hung, and once again it seemed to be random clients hung on random
> files.  There was nothing much of interest showing in the client trace
> files (at d200) except for heartbeat messages one after another.

In that case, you'll need to attach gdb to the director, client and storage
daemons while a job is hung (before cancelling it) and run the "thread apply
all bt" command in gdb to get a backtrace for every thread.  You may need to
recompile those daemons with debug info too.
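
As a rough sketch of that step, something like the script below could be run
on each host while the job is still hung.  It assumes the daemons run under
the process names bacula-dir, bacula-fd and bacula-sd, and that gdb and pgrep
are installed; DAEMONS, DRY_RUN and bt_dump are names made up for this example.

```shell
#!/bin/sh
# Sketch: collect "thread apply all bt" output from each running Bacula daemon.
# The process names below are assumed defaults; adjust them for your install.
DAEMONS="bacula-dir bacula-fd bacula-sd"

# With DRY_RUN=1 (the default here) the gdb command is only printed, not run.
DRY_RUN="${DRY_RUN:-1}"

# bt_dump PID: attach to PID, print backtraces of all threads, then detach.
bt_dump() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "gdb -p $1 -batch -ex 'thread apply all bt'"
    else
        gdb -p "$1" -batch -ex "thread apply all bt"
    fi
}

for d in $DAEMONS; do
    # pgrep -x matches the exact process name; it prints nothing if not running.
    for pid in $(pgrep -x "$d" 2>/dev/null); do
        echo "== $d (pid $pid) =="
        bt_dump "$pid"
    done
done
```

Running it with DRY_RUN=0 and the output redirected to a file gives one trace
per daemon.  Attaching to another user's process normally requires root, and
the daemon is paused while gdb collects the backtraces.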


> On trying to kill the first stuck job, nothing happened except that it
> did not terminate.  On trying to kill the second, the director crashed
> again.
> 
> Upon restarting the Director and re-queueing the failed jobs, all ran
> successfully.
> 
> 
> Here's the traceback:
> ...
> Thread 5 (Thread 0x7f74c67fc700 (LWP 23810)):
> #0  0x00007f74d89439ae in waitpid () from /lib64/libpthread.so.0
> #1  0x00007f74d899a74c in signal_handler (sig=6) at signal.c:233
> #2  <signal handler called>
> #3  0x00007f74d8537621 in raise () from /lib64/libc.so.6
> #4  0x00007f74d852155b in abort () from /lib64/libc.so.6
> #5  0x00007f74d7e88c01 in ?? () from /usr/lib64/libmysqlclient.so.21
> #6  0x00007f74d7e8b5c7 in ?? () from /usr/lib64/libmysqlclient.so.21
> #7  0x00007f74d7e398d0 in ?? () from /usr/lib64/libmysqlclient.so.21
> #8  0x00007f74d7e39e1d in ?? () from /usr/lib64/libmysqlclient.so.21
> #9  0x00007f74d7e35146 in mysql_send_query () from
> /usr/lib64/libmysqlclient.so.21
> #10 0x00007f74d7e35385 in mysql_real_query () from
> /usr/lib64/libmysqlclient.so.21
> #11 0x00007f74d89e0138 in BDB_MYSQL::sql_query (this=0x558706673108,
> query=0x7f74c801b320 "UPDATE Job SET JobStatus='f',EndTime='2020-06-17
> 08:24:13',ClientId=31,JobBytes=15644859,ReadBytes=0,JobFiles=19,JobErrors=1,VolSessionId=33,VolSessionTime=1592158122,PoolId=5,FileSetId=31,JobTDate=15"...,
> flags=0) at mysql.c:537
> #12 0x00007f74d89f323c in BDB::UpdateDB (this=this@entry=0x558706673108,
> jcr=jcr@entry=0x558706672128, cmd=0x7f74c801b320 "UPDATE Job SET
> JobStatus='f',EndTime='2020-06-17
> 08:24:13',ClientId=31,JobBytes=15644859,ReadBytes=0,JobFiles=19,JobErrors=1,VolSessionId=33,VolSessionTime=1592158122,PoolId=5,FileSetId=31,JobTDate=15"...,
> can_be_empty=can_be_empty@entry=false, file=file@entry=0x7f74d8a040d7
> "bdb.h", line=line@entry=140) at sql.c:474
> #13 0x00007f74d8a0292c in BDB::bdb_update_job_end_record
> (this=0x558706673108, jcr=jcr@entry=0x558706672128,
> jr=jr@entry=0x558706672618) at sql_update.c:190
> #14 0x000055870484aff1 in update_job_end_record (jcr=0x558706672128) at
> job.c:1369
> #15 0x0000558704835c1c in backup_cleanup (jcr=jcr@entry=0x558706672128,
> TermCode=TermCode@entry=69) at backup.c:898
> #16 0x00005587048491aa in job_thread (arg=0x558706672128) at job.c:455
> #17 0x000055870484f7fb in jobq_server (arg=0x5587048c76a0 <job_queue>)
> at jobq.c:468
> #18 0x00007f74d8938ea7 in start_thread () from /lib64/libpthread.so.0
> #19 0x00007f74d85f7c6f in clone () from /lib64/libc.so.6

OK, so the MySQL client library is calling abort (possibly from a failed
assert).  Frames #5 to #8 show as ?? because gdb has no symbols for the
library, so to get more detail you'll need to recompile
/usr/lib64/libmysqlclient.so.21 with debug information as well.
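
After rebuilding, a quick check like the one below may help confirm that the
library really carries debug information before waiting for the next crash.
It is only a sketch: has_debug_info is a helper name made up here, it assumes
readelf from binutils is available, and it will report "no" if your build
puts the debug info in a separate file rather than in the library itself.

```shell
#!/bin/sh
# has_debug_info FILE: succeed if FILE contains a .debug_info ELF section.
# This does not detect debug info that was split into a separate file.
has_debug_info() {
    readelf -S "$1" 2>/dev/null | grep -q '\.debug_info'
}

# Library path taken from the backtrace above.
LIB=/usr/lib64/libmysqlclient.so.21

if has_debug_info "$LIB"; then
    echo "$LIB: .debug_info present"
else
    echo "$LIB: no .debug_info section found"
fi
```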

__Martin


_______________________________________________
Bacula-devel mailing list
Bacula-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-devel
