Hi guys,
For my company I've been trying to get Bacula up and running properly.

My current situation:

Host 'leiden':
  Located at my home, with multiple large (8TB) RAID arrays attached, and therefore running bacula-sd and bacula-dir.
  More than 100 Mbit download bandwidth.
  Running Debian testing, Bacula version 5.0.3.

Multiple hosts to be backed up:
  On a 100/100 connection.
  Debian stable, Bacula 5.0.3, running bacula-fd with the default config.

The complete bacula-dir.conf is located at: http://pastebin.com/8JvCdmL9
Please note that I have replaced all passwords with an X.

Relevant parts are:

  Director {                            # define myself
    Name = leiden-dir
    QueryFile = "/etc/bacula/scripts/query.sql"
    WorkingDirectory = "/var/lib/bacula"
    PidDirectory = "/var/run/bacula"
    Maximum Concurrent Jobs = 10
    Password = "X"                      # Console password
    Messages = Daemon
    DirAddresses = {
      ip = { addr = 192.168.1.44; port = 9101 }
      ip = { addr = 127.0.0.1; port = 9101 }
    }
  }

  JobDefs {
    Name = "sql-weekly"
    Type = Backup
    Level = Incremental
    Client = sql
    FileSet = "Full Set"
    Schedule = "WeeklyCycle"
    Storage = leiden-filestorage
    Messages = Standard
    Pool = LeidenPool
    Priority = 10
  }

  JobDefs {
    Name = "mail-weekly"
    Type = Backup
    Level = Incremental
    Client = mail
    FileSet = "Full Set"
    Schedule = "WeeklyCycle"
    Storage = leiden-filestorage
    Messages = Standard
    Pool = LeidenPool
    Priority = 10
  }

  Job {
    Name = "sqljob"
    JobDefs = "sql-weekly"
    Write Bootstrap = "/var/lib/bacula/sql.bsr"
  }

  Job {
    Name = "mailjob"
    JobDefs = "mail-weekly"
    Write Bootstrap = "/var/lib/bacula/mail.bsr"
  }

  # Client (File Services) to backup
  Client {
    Name = sql
    Address = sql.boudewijnector.nl
    FDPort = 9102
    Catalog = MyCatalog
    Password = "X"                      # password for FileDaemon
    File Retention = 30 days            # 30 days
    Job Retention = 6 months            # six months
    AutoPrune = yes                     # Prune expired Jobs/Files
  }

  Client {
    Name = mail
    Address = mail.boudewijnector.nl
    FDPort = 9102
    Catalog = MyCatalog
    Password = "X"                      # password for FileDaemon
    File Retention = 30 days            # 30 days
    Job Retention = 6 months            # six months
    AutoPrune = yes                     # Prune expired Jobs/Files
  }

The current problem is that I get errors on some hosts, such as:

  17-Jul 02:52 leiden-dir JobId 94: Fatal error: Network error with FD during Backup: ERR=Connection reset by peer
  17-Jul 02:52 leiden-dir JobId 94: Fatal error: No Job status returned from FD.
  17-Jul 02:52 leiden-dir JobId 94: Error: Bacula leiden-dir 5.0.3 (04Aug10): 17-Jul-2011 02:52:30
    Build OS:               i486-pc-linux-gnu debian wheezy/sid
    JobId:                  94
    Job:                    BLAjob.2011-07-17_00.52.14_10
    Backup Level:           Full (upgraded from Incremental)
    Client:                 "client4" 5.0.2 (28Apr10) x86_64-pc-linux-gnu,debian,squeeze/sid
    FileSet:                "Home Set" 2011-07-16 23:49:43
    Pool:                   "LeidenPool" (From Job resource)
    Catalog:                "MyCatalog" (From Client resource)
    Storage:                "leiden-filestorage" (From Job resource)
    Scheduled time:         17-Jul-2011 00:52:13
    Start time:             17-Jul-2011 00:52:16
    End time:               17-Jul-2011 02:52:30
    Elapsed time:           2 hours 14 secs
    Priority:               10
    FD Files Written:       0
    SD Files Written:       137,033
    FD Bytes Written:       0 (0 B)
    SD Bytes Written:       3,586,674,915 (3.586 GB)
    Rate:                   0.0 KB/s
    Software Compression:   None
    VSS:                    no
    Encryption:             no
    Accurate:               no
    Volume name(s):         LeidenVol0005
    Volume Session Id:      20
    Volume Session Time:    1310599400
    Last Volume Bytes:      12,025,925,394 (12.02 GB)
    Non-fatal FD errors:    0
    SD Errors:              0
    FD termination status:  Error
    SD termination status:  OK
    Termination:            *** Backup Error ***

When I rerun the job, it also fails after 2 hours.

I tried to fix it this way: in the Job resource in bacula-dir.conf I added "Max Run Time = 144000", because it seemed like Bacula shut down the connection after 2 hours.
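As a quick sanity check on the numbers above, the following Python sketch recomputes the elapsed time from the Start and End timestamps in the job report and converts the Max Run Time value to hours. It assumes the bare value 144000 is interpreted as seconds; that unit is an assumption here, not something the job report confirms.

```python
from datetime import datetime

# Start and End time copied from the job report for JobId 94.
start = datetime(2011, 7, 17, 0, 52, 16)
end = datetime(2011, 7, 17, 2, 52, 30)

# Wall-clock duration of the failed job, in seconds.
elapsed_seconds = int((end - start).total_seconds())

# Value added to the Job resource; assumed to be in seconds.
max_run_time_seconds = 144000
max_run_time_hours = max_run_time_seconds / 3600

print(f"job ran for {elapsed_seconds} s before the connection was reset")
print(f"Max Run Time allows {max_run_time_hours:.0f} h")
```

The job was cut off after 7214 seconds, just past the 2-hour mark, while the configured limit would be 40 hours under that assumption. So the limit itself is far above the point where the job fails.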

I also changed the keep-alive time on the machine running bacula-dir:

  sysctl -w net.ipv4.tcp_keepalive_time=60

When I did so, it failed completely:

    Elapsed time:           15 hours 22 mins 58 secs
    Priority:               10
    FD Files Written:       0
    SD Files Written:       0
    FD Bytes Written:       0 (0 B)
    SD Bytes Written:       0 (0 B)
    Rate:                   0.0 KB/s
    Software Compression:   None
    VSS:                    no
    Encryption:             no
    Accurate:               no
    Volume name(s):
    Volume Session Id:      33
    Volume Session Time:    1310599400

That's really bad: my router did not detect any traffic at all, except for a few bytes when the connection was set up.

Can someone please point out where I should start investigating this problem?

From the internet, I can reach the director and the SD on the 'leiden' system. I can reach the FDs on all servers that are to be backed up.

Cheers,

Boudewijn Ector

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
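
The reachability checks mentioned in the post can be sketched in Python as below. This is only an illustration, not part of Bacula: the helper name `can_connect` is hypothetical, port 9102 is taken from the Client resources above, and the SD port 9103 mentioned in the usage note is Bacula's usual default rather than something shown in the posted config. A successful TCP connect only shows that the port is open at that moment; it says nothing about whether a long-running connection survives.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration; it only tests the TCP
    handshake and does not speak the Bacula protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


def check_targets(targets):
    """Map each "host:port" string to its reachability result."""
    return {f"{host}:{port}": can_connect(host, port) for host, port in targets}
```

For example, `check_targets([("sql.boudewijnector.nl", 9102), ("mail.boudewijnector.nl", 9102)])` run on 'leiden' would test the FDs. Running the same check on a client against the SD address and port (9103 if the default is in use) tests the opposite direction, which matters if, as is commonly described for Bacula, the FD opens the data connection to the SD.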