All,

I've got a Bacula installation with a peculiar problem that I need some help resolving:
Once a job starts (that is, once the SD has accepted it as a job to run), I am unable to cancel it. Using the "cancel" bconsole command, I receive the following output:

*cancel jobid=205
2001 Job fileserver2_Backup.2010-12-09_09.39.25_04 marked to be canceled.
3000 Job fileserver2_Backup.2010-12-09_09.39.25_04 marked to be canceled.
You have messages.
*messages
09-Dec 09:45 director.f.q.d.n-dir JobId 205: Fatal error: Network error with FD during Backup: ERR=Interrupted system call
09-Dec 09:45 sd.f.q.d.n-sd JobId 205: JobId=205 Job="fileserver2_Backup.2010-12-09_09.39.25_04" marked to be canceled.
09-Dec 09:45 sd.f.q.d.n-sd JobId 205: JobId=205 Job="fileserver2_Backup.2010-12-09_09.39.25_04" marked to be canceled.
09-Dec 09:45 director.f.q.d.n-dir JobId 205: Fatal error: No Job status returned from FD.
09-Dec 09:45 director.f.q.d.n-dir JobId 205: Bacula director.f.q.d.n-dir 5.0.3 (04Aug10): 09-Dec-2010 09:45:16
  Build OS:               i686-pc-linux-gnu redhat Enterprise release
  JobId:                  205
  Job:                    fileserver2_Backup.2010-12-09_09.39.25_04
  Backup Level:           Full
  Client:                 "fileserver2.f.q.d.n-fd" 5.0.3 (04Aug10) Linux,Cross-compile,Win32
  FileSet:                "fileserver2 Set" 2010-11-08 23:05:00
  Pool:                   "Default" (From Job resource)
  Catalog:                "Catalog" (From Client resource)
  Storage:                "tapelibrary" (From Pool resource)
  Scheduled time:         09-Dec-2010 09:39:25
  Start time:             09-Dec-2010 09:39:28
  End time:               09-Dec-2010 09:45:16
  Elapsed time:           5 mins 48 secs
  Priority:               10
  FD Files Written:       0
  SD Files Written:       0
  FD Bytes Written:       0 (0 B)
  SD Bytes Written:       0 (0 B)
  Rate:                   0.0 KB/s
  Software Compression:   None
  VSS:                    no
  Encryption:             no
  Accurate:               yes
  Volume name(s):
  Volume Session Id:      1
  Volume Session Time:    1291904997
  Last Volume Bytes:      187,697,986,560 (187.6 GB)
  Non-fatal FD errors:    0
  SD Errors:              0
  FD termination status:  Error
  SD termination status:  Error
  Termination:            Backup Canceled

The director then treats this job as canceled, but both the FD and the SD treat it as if it is still running.
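
The mismatch described above (the director reports the job as canceled while the FD and SD still consider it running) can be captured in one pass by asking each daemon for its own status. What follows is only a sketch: the function name `bacula_status_cmds` is made up for illustration, and the bconsole binary and config file paths in the usage comment are assumptions based on the -sbindir and -sysconfdir values of this build. The function only prints bconsole input; it does not change any daemon state by itself.

```shell
#!/bin/sh
# Hypothetical helper: print the bconsole commands that ask the director,
# one FD and one SD for their own view of the currently running jobs.
bacula_status_cmds() {
    client="$1"    # Client resource name, e.g. fileserver2.f.q.d.n-fd
    storage="$2"   # Storage resource name, e.g. tapelibrary
    printf 'status dir\n'
    printf 'status client=%s\n' "$client"
    printf 'status storage=%s\n' "$storage"
    printf 'quit\n'
}

# Usage (assumed paths; adjust to the actual install):
#   bacula_status_cmds fileserver2.f.q.d.n-fd tapelibrary \
#       | /scratch/bacula/bin/bconsole -c /scratch/bacula/etc/bconsole.conf
```

Running this once before and once after the cancel should show whether JobId 205 disappears from the director's list while it remains in the "Running Jobs" sections reported by the FD and the SD.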
Canceling a job before it becomes runnable (say, if the SD has exceeded the Max Storage Jobs directive) works perfectly fine. As a side effect of this problem, the SD will keep the backup device open indefinitely. I have to kill both the FD and the SD in order to reset things back to a working state following a canceled job.

I'd appreciate some guidance on how to go about diagnosing and resolving this problem.

Thanks!

--
Alan Gerber

Some configuration notes:

I'd be happy to provide additional configuration details (such as the contents of the bacula-*.conf files) if anyone thinks that would be helpful.

The director and SD are running on the same machine. The example output above was generated by an FD running on a different machine from the director and SD, but I can replicate the problem on an FD running locally on the same machine as the director and SD, as well as on any other FD in my installation.

uname -a outputs "Linux director.f.q.d.n 2.6.18-194.8.1.el5 #1 SMP Wed Jun 23 10:58:38 EDT 2010 i686 i686 i386 GNU/Linux"

Bacula was built from source, using the following configure:

./configure \
    -sbindir=/scratch/bacula/bin \
    -sysconfdir=/scratch/bacula/etc \
    -enable-smartalloc \
    -enable-batch-insert \
    -enable-largefile \
    -with-mysql \
    -with-openssl \
    -enable-conio \
    -with-working-dir=/scratch/bacula/var

...which resulted in the following summary output:

Configuration on Thu Sep 9 11:25:22 EDT 2010:

  Host:                       i686-pc-linux-gnu -- redhat Enterprise release
  Bacula version:             Bacula 5.0.3 (04 August 2010)
  Source code location:       .
  Install binaries:           /scratch/bacula/bin
  Install libraries:          /usr/lib
  Install config files:       /scratch/bacula/etc
  Scripts directory:          /scratch/bacula/etc
  Archive directory:          /tmp
  Working directory:          /scratch/bacula/var
  PID directory:              /var/run
  Subsys directory:           /var/lock/subsys
  Man directory:              ${datarootdir}/man
  Data directory:             /usr/share
  Plugin directory:           /usr/lib
  C Compiler:                 gcc 4.1.2
  C++ Compiler:               /usr/bin/g++ 4.1.2
  Compiler flags:             -g -O2 -Wall -fno-strict-aliasing -fno-exceptions -fno-rtti
  Linker flags:
  Libraries:                  -lpthread -ldl
  Statically Linked Tools:    no
  Statically Linked FD:       no
  Statically Linked SD:       no
  Statically Linked DIR:      no
  Statically Linked CONS:     no
  Database type:              MySQL
  Database port:
  Database lib:               -L/usr/lib/mysql -lmysqlclient_r -lz
  Database name:              bacula
  Database user:              bacula

  Job Output Email:           r...@localhost
  Traceback Email:            r...@localhost
  SMTP Host Address:          localhost

  Director Port:              9101
  File daemon Port:           9102
  Storage daemon Port:        9103

  Director User:
  Director Group:
  Storage Daemon User:
  Storage DaemonGroup:
  File Daemon User:
  File Daemon Group:

  SQL binaries Directory      /usr/bin

  Large file support:         yes
  Bacula conio support:       yes -lncurses
  readline support:           no
  TCP Wrappers support:       no
  TLS support:                yes
  Encryption support:         yes
  ZLIB support:               yes
  enable-smartalloc:          yes
  enable-lockmgr:             no
  bat support:                no
  enable-gnome:               no
  enable-bwx-console:         no
  enable-tray-monitor:        no
  client-only:                no
  build-dird:                 yes
  build-stored:               yes
  Plugin support:             yes
  AFS support:                no
  ACL support:                no
  XATTR support:              yes
  Python support:             no
  Batch insert enabled:       yes

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users