I've never used "list nextvol" before, but I tried it
after seeing Tilman's report, and sure enough, it kills my
Director too.  I'm running Bacula 2.2.8 on Scientific Linux 4.5
(essentially a recompiled Red Hat Enterprise Linux 4.5), with the
catalog in a PostgreSQL 8.2.6 server on the same machine (an old
Dell PowerEdge 1500SC) where the Director and the Storage Daemon run.

To provide more details: I type 'list nextvol' at bconsole's
prompt, and it shows me a list of job numbers to choose from. I
pick one at random, and bconsole immediately exits, returning me
to my shell prompt.

  *list nextvol
  Automatically selected Catalog: MainCatalog
  Using Catalog "MainCatalog"
  The defined Job resources are:
       1: Test Job
       2: Ad Hoc with Generic defaults
       3: BackupCatalog
       .
       .
       .
      47: walecki
      48: wiglaf
      49: zeus2
  Select Job resource (1-49): 27
  502 backup $

ps confirms that the director process has disappeared.  I restart
it, and when I reconnect with bconsole, the first messages I
receive inform me of the director's exit:

  *messages
  01-Mar 16:41 backup-dir: Fatal Error at sql_get.c:580 because: 
  rwl_writelock failure. stat=22: ERR=Invalid argument
  01-Mar 16:41 backup-dir: Fatal Error because: 
  Bacula interrupted by signal 11: Segmentation violation
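
For readers unfamiliar with the numbers in that message: stat=22 looks
like an ordinary errno value, and 22 is EINVAL on Linux, which matches
the "Invalid argument" text. The short Python sketch below only decodes
the number; describe_errno is a hypothetical helper name, not anything
from Bacula. The log does not show why the lock call returned EINVAL.
One guess, by analogy with POSIX lock functions, is that the lock
object was uninitialized or already destroyed when rwl_writelock was
called.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message text) for an errno value."""
    # errno.errorcode maps numbers to names such as "EINVAL".
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror returns the C library's message for that number.
    return name, os.strerror(code)


# stat=22 is the value reported in the Director's fatal error message.
name, text = describe_errno(22)
print(f"stat=22 -> {name}: {text}")
```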

My bacula account was also emailed the output of a failed stack trace attempt:

  Using host libthread_db library "/lib/tls/libthread_db.so.1".
  ptrace: Operation not permitted.
  /backup/working/16588: No such file or directory.
  $1 = '\0' <repeats 29 times>
  $2 = 0x0
  $3 = 0x0
  $4 = 0x0
  $5 = 0x80cbc98 "2.2.8 (26 January 2008)"
  $6 = 0x80b295e "i686-pc-linux-gnu"
  $7 = 0x80b2957 "redhat"
  $8 = 0x80b2953 "4.5"
  No stack.
  /usr/local/bacula/sbin/btraceback.gdb:11: Error in sourced command file:
  No stack.

From experiments attaching gdb to bacula-dir by hand, I speculate
that the "ptrace: Operation not permitted." is a byproduct of
using "bacula-dir -u bacula" to switch the director to a non-root
user; i.e. when bacula's signal handler, running as bacula,
attempts to attach gdb to the bacula-dir process, it fails
because the process was originally started by root.  In any
event, as root, I can successfully attach to bacula-dir before
provoking the error, in which case I get the following stack
trace:

  0x009277a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
  (gdb) 
  (gdb) continue
  Continuing.
  [New Thread -1229907040 (LWP 17888)]

  Program received signal SIGSEGV, Segmentation fault.
  [Switching to Thread -1229907040 (LWP 17888)]
  sm_sizeof_pool_memory (fname=0x80cbcb8 "message.c", lineno=1273, obuf=0x7265 <Address 0x7265 out of bounds>) at mem_pool.c:165
  165     mem_pool.c: No such file or directory.
          in mem_pool.c
  Current language:  auto; currently c++
  (gdb) where
  #0  sm_sizeof_pool_memory (fname=0x80cbcb8 "message.c", lineno=1273, obuf=0x7265 <Address 0x7265 out of bounds>) at mem_pool.c:165
  #1  0x0809fea9 in Mmsg ([EMAIL PROTECTED],
      fmt=0x80c5bf4 "SELECT PoolId,Name,NumVols,MaxVols,UseOnce,UseCatalog,AcceptAnyVolume,AutoPrune,Recycle,VolRetention,VolUseDuration,MaxVolJobs,MaxVolFiles,MaxVolBytes,PoolType,LabelType,LabelFormat,RecyclePoolId FROM"...) at message.c:1273
  #2  0x08089f04 in db_get_pool_record (jcr=0x98ed630, mdb=0x98ee958, pdbr=0xb6b10600) at sql_get.c:588
  #3  0x08072037 in do_list_cmd (ua=0x98ee608, cmd=Variable "cmd" is not available.
  ) at ua_output.c:482
  #4  0x08069356 in do_a_command (ua=0x98ee608, cmd=0x98ea500 "27") at ua_cmds.c:180
  #5  0x0807ce2b in handle_UA_client_request (arg=0x98eb7e8) at ua_server.c:147
  #6  0x080aebb4 in workq_server (arg=0x80d4d40) at workq.c:357
  #7  0x00ab13cc in start_thread () from /lib/tls/libpthread.so.0
  #8  0x00a09c3e in clone () from /lib/tls/libc.so.6
  (gdb) quit
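
One detail in the trace that may or may not mean anything: the bad
pointer obuf=0x7265 consists entirely of printable ASCII bytes. On a
little-endian i686 machine those bytes would sit in memory as "er"
followed by NULs. That can happen when a pointer has been overwritten
by string data or when freed memory is reused, but it could equally be
a coincidence. The sketch below only shows the decoding;
pointer_as_ascii is a hypothetical helper, and the 4-byte little-endian
layout is an assumption based on the i686 build.

```python
def pointer_as_ascii(value, width=4):
    """Show a pointer value as the text its bytes would spell in memory.

    Assumes a little-endian machine with `width`-byte pointers.
    """
    raw = value.to_bytes(width, "little")
    # Drop the trailing NUL padding, then decode whatever is left.
    return raw.rstrip(b"\x00").decode("ascii", errors="replace")


# obuf value taken from frame #0 of the gdb backtrace.
print(repr(pointer_as_ascii(0x7265)))
```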

I added "-d 200 -v -f" to the bacula-dir process to see if extra tracing
provided more clues:

  .
  .
  .
  backup-dir: ua_cmds.c:1836-0 UA Open database
  backup-dir: postgresql.c:103-0 db_open first time
  backup-dir: postgresql.c:194-0 pg_real_connect done
  backup-dir: postgresql.c:196-0 db_user=bacula db_name=bacula db_password=<WHATEVER>
  backup-dir: ua_cmds.c:1854-0 DB bacula opened
  backup-dir: ua_output.c:259-0 list: list nextvol
  backup-dir: ua_output.c:589-0 now=1204412320 runtime=1204354500
  backup-dir: ua_output.c:589-0 now=1204412320 runtime=1204440900
  backup-dir: ua_output.c:591-0 Found it level=73 I
  backup-dir: job.c:1127-0 wstorage=ML6000
  backup-dir: job.c:1136-0 wstore=ML6000 where=Pool resource
  backup-dir: ua_output.c:617-0 complete_jcr close db
  backup-dir: ua_output.c:622-0 complete_jcr open db
  backup-dir: postgresql.c:103-0 db_open first time
  backup-dir: postgresql.c:194-0 pg_real_connect done
  backup-dir: postgresql.c:196-0 db_user=bacula db_name=bacula db_password=<WHATEVER>
  Kaboom! bacula-dir, backup-dir got signal 11 - Segmentation violation. Attempting traceback.
  Kaboom! exepath=/usr/local/bacula/sbin/
  Calling: /usr/local/bacula/sbin/btraceback /usr/local/bacula/sbin/bacula-dir 17874
  backup-dir: scheduler.c:273-0 enter find_runs()
  backup-dir: scheduler.c:288-0 now = 47c9dfe0: h=16 m=2 md=0 wd=6 wom=0 woy=9
  backup-dir: scheduler.c:306-0 nh = 47c9edf0: h=17 m=2 md=0 wd=6 wom=0 woy=9
  backup-dir: scheduler.c:315-0 Got job: Test Job
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: Ad Hoc with Generic defaults
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: BackupCatalog
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: abla
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: agnesi
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: alvis
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: antar
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: araneida
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: aufbau
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: backup
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: bayes
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: beowulf
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: cadmium
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: capella
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: castor
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: cerebus
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: dickens
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: free
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: grendel
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: gridiron
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: hercules
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: hilbert
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: jordan
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: juno
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: jupiter
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: leto
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: mercury
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: mercury2
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: merlin
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: mesmer
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: mobius
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: open
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: perron
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: ramanujan
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: riccati
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: schur
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: ssd1
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: ssd2
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: sun
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: tdi
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: venus
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: vonneumann
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: walecki
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: wiglaf
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:315-0 Got job: zeus2
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:358-0 [EMAIL PROTECTED]: run_now=0 run_nh=0
  backup-dir: scheduler.c:377-0 Leave find_runs()
  Traceback complete, attempting cleanup ...
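
To line this trace up with the PostgreSQL log, it helps to decode the
timestamps. The now= and runtime= values in the ua_output.c lines look
like Unix epoch seconds, and the scheduler.c "now = 47c9dfe0" value
looks like the same kind of number printed in hexadecimal; both
readings are assumptions based on the magnitudes, not on the Bacula
source. A small Python sketch, with hypothetical helper names:

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds):
    """Format Unix epoch seconds as a UTC timestamp string."""
    stamp = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return stamp.strftime("%Y-%m-%d %H:%M:%S")


def hex_epoch_to_utc(hex_text):
    """Same as epoch_to_utc, for a value printed in hexadecimal."""
    return epoch_to_utc(int(hex_text, 16))


# Values copied from the debug trace.
print("list nextvol now :", epoch_to_utc(1204412320))
print("candidate runtime:", epoch_to_utc(1204440900))
print("scheduler now    :", hex_epoch_to_utc("47c9dfe0"))
```

With the values from the trace this gives 22:58:40 UTC for the list
command and 22:59:44 UTC for the scheduler pass on 2008-03-01. Those
would match the 16:58:40 and 16:59:44 entries in the PostgreSQL log if
the server's clock is six hours behind UTC.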

The postgres server's log doesn't provide any clues that I
recognize:

  Mar  1 16:57:33 backup postgres[17877]: [1-1] LOG:  connection received: host=[local]
  Mar  1 16:57:33 backup postgres[17877]: [2-1] LOG:  connection authorized: user=bacula database=bacula
  Mar  1 16:58:37 backup postgres[17889]: [1-1] LOG:  connection received: host=[local]
  Mar  1 16:58:37 backup postgres[17889]: [2-1] LOG:  connection authorized: user=bacula database=bacula
  Mar  1 16:58:40 backup postgres[17890]: [1-1] LOG:  connection received: host=[local]
  Mar  1 16:58:40 backup postgres[17890]: [2-1] LOG:  connection authorized: user=bacula database=bacula
  Mar  1 16:59:44 backup postgres[17890]: [3-1] LOG:  unexpected EOF on client connection

Since I haven't used "list nextvol", I'm quite content
simply eschewing the command, but I thought I'd send this "me
too" in case it helps track the problem down for people who do
need "list nextvol".

