On 28-nov-2006, at 10:05, Dahlgren Mattias wrote:

> Hello everyone.
>
> Im trying to set up bacula to do the backup of the about 12 FreeBSD
> webservers we have.
>
> I got it working on all but 2 servers, on these servers i keep
> continuosly getting errors that the operation times out. The strange
> thing is that it seems to ALWAYS occur after almost the exact same
> time on both servers. That time is: 2 hours 10 mins 10 secs. The secs
> can vary between 10-14 but its definitely the same time.
>
> I'v read some other posts here about similar problems but nothing that
> exactly seems to match our issue.
>
> I have tried setting the heartbeat interval in the SD resource to 15
> seconds as i saw mentioned in another post which didnt help. I tried
> setting it in the Client resource aswell as suggested in the Bacula
> manual. However this causes Bacula-dir to refuse to start saying there
> is a syntax error in the config file and pointing to this exact line
> in the client resource.
>
> Basically im lost and i really need to get this operational, is there
> anyone who has any ideas? I imagine it could be the network somehow
> timing out since its happening after the exact same elapsed time on
> both servers but i cant think of where to change this time out.
>
> Here is a cut from my log file with regards to this issue:
>
> 23-Nov 01:47 xxxx-dir: No prior Full backup Job record found.
> 23-Nov 01:47 xxxx-dir: No prior or suitable Full backup found. Doing FULL backup.
> 23-Nov 01:47 xxxx-dir: Start Backup JobId 1046, Job=xxxx.2006-11-23_00.30.01
> 23-Nov 01:47 xxx-sd: Volume "xxxxFull-0002" previously written, moving to end of data.
> 23-Nov 03:57 xxxx-dir: xxxx.2006-11-23_00.30.01 Fatal error: Network error with FD during Backup: ERR=Operation timed out
> 23-Nov 03:57 xxxx-dir: obelix.2006-11-23_00.30.01 Fatal error: No Job status returned from FD.
> 23-Nov 03:57 xxxx-dir: obelix.2006-11-23_00.30.01 Error: Bacula 1.38.11 (28Jun06): 23-Nov-2006 03:57:43
>   JobId: 1046
>   Job: xxxx.2006-11-23_00.30.01
>   Backup Level: Full (upgraded from Incremental)
>   Client: "xxxx-fd" i386-portbld-freebsd6.1,freebsd,6.1-STABLE
>   FileSet: "xxxx Full FileSet" 2006-11-21 17:28:05
>   Pool: "xxxx-Full-Pool"
>   Storage: "File3"
>   Scheduled time: 23-Nov-2006 00:30:00
>   Start time: 23-Nov-2006 01:47:33
>   End time: 23-Nov-2006 03:57:43
>   Elapsed time: 2 hours 10 mins 10 secs
>   Priority: 10
>   FD Files Written: 0
>   SD Files Written: 0
>   FD Bytes Written: 0 (0 B)
>   SD Bytes Written: 0 (0 B)
>   Rate: 0.0 KB/s
>   Software Compression: None
>   Volume name(s): xxxxFull-0002
>   Volume Session Id: 6
>   Volume Session Time: 1164209750
>   Last Volume Bytes: 31,997,951,399 (31.99 GB)
>   Non-fatal FD errors: 0
>   SD Errors: 0
>   FD termination status: Error
>   SD termination status: Error
>   Termination: *** Backup Error ***
>
>
> Any help would be appreciated.

We've been having the exact same issue, with backups timing out after _exactly_ 2 hours, 11 minutes and 15 seconds.

This weekend I finally found the most likely culprit: a (Checkpoint) firewall between the director and the client. It started dropping ACK packets after exactly 2 hours, because it considered the TCP session timed out, so it would only accept SYN packets. The director would then retransmit the packets at a 75-second interval for 11 minutes and 15 seconds, after which the timeout occurred.

The strange thing is that TCP_TIMEOUT was set to 60 minutes on the firewall and the Heartbeat Interval on the client was set to 5 minutes, so something's weird about this...

I've changed TCP_TIMEOUT to 3 hours on the firewall and decreased the Heartbeat Interval to 2 minutes on the clients. It's too soon to tell whether this has actually resolved the issue, because it manifested itself intermittently and unpredictably. However, I haven't seen a timeout in three days.

So you should check the network components (routers, firewalls) between your director and clients.

Leander
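
A quick sanity check on the timing reported in the reply, as a minimal sketch only. It assumes the 2-hour cutoff and the 75-second retransmit interval are exact; the variable names are illustrative and not taken from Bacula or the firewall. It splits the observed failure time into the idle period and the retransmit window:

```python
from datetime import timedelta

# Figures taken from the reply; treating them as exact is an assumption.
idle_before_drop = timedelta(hours=2)           # firewall starts dropping ACKs
retransmit_interval = timedelta(seconds=75)     # spacing between retransmits
observed_failure = timedelta(hours=2, minutes=11, seconds=15)

# Time spent retransmitting before the connection is declared dead.
retransmit_window = observed_failure - idle_before_drop

# Number of whole 75-second intervals that fit in that window.
attempts = retransmit_window // retransmit_interval

print("retransmit window:", retransmit_window)
print("75-second intervals:", attempts)
```

The window divides evenly into 75-second intervals, which fits the description of a fixed idle cutoff followed by a fixed number of retries rather than a load-dependent failure.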