[Bacula-users] Bacula 5.0.x Max Wait Time
Hi,

I wonder if the MaxWaitTime problem is still unsolved in Bacula 5.0.x. It was discussed and bug-reported several times in earlier years, and again and again promised to be solved in the next release.

There is a chart in the Bacula manual (in the Job section of the director configuration) showing the relations between:

  Max Run Sched Time
  Max Start Delay
  Max Run Time
  Max Wait Time

This illustration implies that MaxWaitTime is unaffected by the span of time between schedule and start. According to the chart, MaxWaitTime begins to count _after_ the job has started, once a blocking situation occurs. This is also the meaning of the textual definition there:

  »Max Wait Time = time
  The time specifies the maximum allowed time that a job may block waiting
  for a resource (such as waiting for a tape to be mounted, or waiting for
  the storage or file daemons to perform their duties), counted from when
  the job starts (not necessarily the same as when the job was scheduled).
  This directive works as expected since bacula 2.3.18.«

De facto, this isn't the case. I've got this job definition:

  ...
  MaxStartDelay = 6 hours
  MaxRunTime = 2 hours
  MaxWaitTime = 1 hour
  ...

According to the chart in the manual, it should be OK if the job has to wait for up to 6 hours to start after it was scheduled (for instance, if jobs with a higher priority keep on running). Once started, it may run for up to 2 hours; within this span of time, it may be blocked for up to 1 hour.

But, again and again, I get this job cancelled with this log:

  23-May 00:15 bup-serv-dir JobId 140008: Fatal error: Max wait time exceeded. Job canceled.
  ...
  Scheduled time: 22-May-2013 23:15:00
  Start time:     23-May-2013 04:07:05
  End time:       23-May-2013 04:07:05
  Elapsed time:   0 secs

That is, the job is _scheduled_ at 23:15 and cancelled due to exceeded MaxWaitTime one hour later, at 00:15, when it hadn't started at all. Obviously, MaxWaitTime is counted from when the job is SCHEDULED, and NOT from when the job STARTS.
Thus, either the manual is wrong (text + chart), or this is a bug. Am I right?

Best regards,
Robert

+++ German Research Center for Artificial Intelligence +++
Dipl.-Inform. Robert V. Wirth, Campus D3_2, D-66123 Saarbruecken
@office: +49-681-85775-5078 / -5572 +++ @fax: +49-681-85775-5020
mailto:robert.wi...@dfki.de ++ http://www.dfki.de/~wirth

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
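The two readings of MaxWaitTime can be pinned down with a tiny sketch (Python, illustrative only -- this is not Bacula source code, and the function names are mine):

```python
from datetime import datetime, timedelta

MAX_WAIT = timedelta(hours=1)

def expired_per_manual(now, started, max_wait=MAX_WAIT):
    # Manual's reading: the wait clock only runs once the job has started;
    # a job that is still waiting to start can never hit MaxWaitTime.
    return started is not None and now - started > max_wait

def expired_as_observed(now, scheduled, max_wait=MAX_WAIT):
    # Behaviour observed in 5.0.x: the clock starts at the *scheduled* time.
    return now - scheduled > max_wait

scheduled = datetime(2013, 5, 22, 23, 15)   # from the log above
now = datetime(2013, 5, 23, 0, 16)          # an hour later, job not yet started
print(expired_per_manual(now, started=None))   # False: should keep waiting
print(expired_as_observed(now, scheduled))     # True: job gets cancelled
```

With the same inputs, the manual's reading would let the job wait (MaxStartDelay governs), while the observed reading cancels it at 00:15 -- exactly the log shown above.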
Re: [Bacula-users] Full job terminates claiming no space left but tape gets more data
Marco van Wieringen m...@planets.elm.net writes:

>> ... Fseek on attributes file failed: ERR=No space left on device ...
>> Last Volume Bytes: 1,283,646,163,968 (1.283 TB)
>> Really, the tape isn't full! The other 5 jobs (and this one after being
>> rescheduled) continue to write that tape AD0016L5, which by now has
>> 1,633,895,424,000 bytes written and is still in Append state.
>> Question: What may be the cause of that ERR=No space left on device?
>
> Older versions of Bacula (at least the old ones you talk about) never did
> so-called attribute spooling. The problem is probably the amount of
> storage on the storage daemon; see the commit_attribute_spool() function
> that emits the error "Fseek on attributes file failed".

Oh, yes. The working dir filesystem was full. Oops... I didn't realize...

> You have a couple of options:
> - disable attribute spooling (see the SpoolAttributes setting on the
>   director)
> - make sure enough space is available on disk for spooling the attributes
>   to disk first, before they are batch-inserted into the database.
> Keep in mind that disabling attribute spooling and batch insert will
> probably slow down the overall backup.
> The attribute spool files are created in either the spool directory or
> the working directory.

Thanks a lot! This is the solution.

Best regards,
Robert

+++ German Research Center for Artificial Intelligence +++
Robert Wirth, Campus D32, Stuhlsatzenhausweg 3, 66123 Saarbruecken
@phone: +49-681-85775-5078 Germany mailto:robert.wi...@dfki.de
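Since the root cause here was a full working-directory filesystem, a pre-flight check is cheap. A minimal sketch (Python; the path and the 5 GB threshold are my assumptions, not Bacula settings):

```python
import shutil

def has_headroom(path, needed_bytes):
    """True if the filesystem holding `path` has at least `needed_bytes` free."""
    return shutil.disk_usage(path).free >= needed_bytes

# Example: warn before a big full backup if the daemon's working
# directory (path is a stand-in -- use your own) is low on space.
WORKING_DIR = "/tmp"            # e.g. /var/bacula/working in a real setup
if not has_headroom(WORKING_DIR, 5 * 1024**3):
    print("low space: attribute spooling may fail with ENOSPC")
```

Something like this could run from a RunBeforeJob script so the job fails early and loudly instead of dying mid-despool.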
[Bacula-users] Full job terminates claiming no space left but tape gets more data
Hi,

Situation:
----------
The backup strategy is a yearly Full, monthly Diff and daily Incr. This worked well for years with Bacula 1.x and 2.x. Some months ago, I upgraded SD and DIR from Bacula 2.2.8 to 5.0.1. The Diff/Incr jobs continued to do well. But now, while doing a Full for the first time under 5.0.1, I got errors (and can reproduce them).

Hardware is two LTO-5 drives with a changer and a lot of tapes.

  MaximumSpoolSize = 1024 GB
  MaximumJobSpoolSize = 32 GB

What's up?
----------
I have 6 bigger jobs (TB size each) in the same pool and start them at the same time. They're spooling and despooling alternately to the same tape(s), as expected. Tape AD0018L5 is written by the 6 jobs and, when full, automatically replaced by the next free tape, AD0016L5. The latter is also written by the 6 jobs...

Then, one of the 6 jobs fails (and is rescheduled):

  Committing spooled data to Volume AD0016L5. Despooling 22,147,812,023 bytes ...
  Fseek on attributes file failed: ERR=No space left on device
  ...
  Last Volume Bytes: 1,283,646,163,968 (1.283 TB)

Really, the tape isn't full! The other 5 jobs (and this one after being rescheduled) continue to write that tape AD0016L5, which by now has 1,633,895,424,000 bytes written and is still in Append state.

Question
--------
What may be the cause of that "ERR=No space left on device"? The same configuration worked well with Bacula <= 2.2.8 and LTO-3 (with many more full tapes and tape changes during a Full session than today when filling LTO-5).

More info:
----------
  08-Dez 05:46 bup-serv-dir JobId 130936: Start Backup JobId 130936, Job=lnv-91163.2012-12-08_04.46.11_20
  08-Dez 05:46 bup-serv-dir JobId 130936: Using Device Tape1A
  08-Dez 05:46 bup-serv-sd JobId 130936: Spooling data ...
  ...
  09-Dez 20:17 bup-serv-sd JobId 130936: Committing spooled data to Volume AD0016L5. Despooling 22,147,812,023 bytes ...
  10-Dez 00:52 bup-serv-sd JobId 130936: Despooling elapsed time = 00:34:51, Transfer rate = 10.59 M Bytes/second
  10-Dez 00:52 bup-serv-sd JobId 130936: Fatal error: Fseek on attributes file failed: ERR=No space left on device
  10-Dez 00:52 bup-serv-dir JobId 130936: Error: Bacula bup-serv-dir 5.0.1 (24Feb10): 10-Dez-2012 00:52:50
  ...
  Elapsed time:           1 day 19 hours 6 mins 36 secs
  Priority:               11
  FD Files Written:       445,124
  SD Files Written:       445,124
  FD Bytes Written:       277,792,113,411 (277.7 GB)
  SD Bytes Written:       277,875,653,641 (277.8 GB)
  ...
  Volume name(s):         AD0018L5|AD0016L5
  Volume Session Id:      202
  Volume Session Time:    1354723843
  Last Volume Bytes:      1,283,646,163,968 (1.283 TB)
  Non-fatal FD errors:    0
  SD Errors:              0
  FD termination status:  OK
  SD termination status:  Error
  Termination:            *** Backup Error ***

list volumes pool=ServerFull
  ...
  | 1,012 | AD0016L5 | Append | 1 | 1,633,895,424,000 | 1,634 | 63,072,000 | 1 | 56 | 1 | LTO-5 | 2012-12-10 10:24:20 |
  ...
  | 1,014 | AD0018L5 | Full   | 1 | 1,896,767,502,336 | 1,897 | 63,072,000 | 0 | 58 | 1 | LTO-5 | 2012-12-08 14:45:09 |

Regards,
Robert

+++ German Research Center for Artificial Intelligence +++
Robert Wirth, Campus D32, Stuhlsatzenhausweg 3, 66123 Saarbruecken
@phone: +49-681-85775-5078 Germany mailto:robert.wi...@dfki.de
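For reference, the attribute-spooling knob discussed in the reply above lives in the director's Job resource. A sketch only (the job name and values are placeholders, not a recommendation):

```
# Sketch of the relevant Job directives (placeholder values):
Job {
  Name = "big-full"        # hypothetical job name
  ...
  SpoolData = yes
  SpoolAttributes = no     # skip the on-disk attribute spool file;
                           # attributes are sent to the DIR as they come,
                           # at some cost in catalog insert speed
}
```

The alternative is to leave SpoolAttributes on and simply guarantee enough free space in the daemon's spool/working directory.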
Re: [Bacula-users] backup slowdown (mysqld) after tape autochange
> On Tue, December 14, 2010 11:48 am, Robert Wirth wrote:
>> Hi, strange problem. Here's some hardware where Bacula has been running
>> successfully for ca. 5 years. It was release 1.38.11 under Solaris 10 x86.
>> [... full problem description snipped; see the original post ...]
>> Starting a bunch of full backup jobs that DO NOT FIT into 1 single tape
>> proceeds as follows (with a fresh tape forced by setting the former one
>> to read-only):
>> - first, the jobs run well and write their data to the first fresh tape
>>   of the corresponding pool. Speed is similar to the old OS.
>> - when the tape is full with around 600 GB of data, it is marked Full
>>   and unloaded, and the next free tape of the pool is loaded.
>> - from this moment on, writing to the new fresh tape becomes incredibly
>>   slow (4 GB/hour) and mysqld constantly has 95%-100% CPU load.
>> The only further effect I can see is that the table bacula.JobMedia is
>> growing. No errors in the system log, no MySQL errors, none in Bacula's
>> log. What I mainly don't understand is why this happens after a tape
>> change. The MaxSpoolSize is 32 GB, and I'm backing up 7 systems. Each of
>> them had several spool steps during the first tape. From the view of
>> Bacula and its program logic, what has changed when the tape was
>> changed? I guess it's all the same: spooling data, writing them to tape
>> and updating the catalog, regardless of first, second or later tape...?!?
>
> What do you see under "Running Jobs" in the 'status dir' output before
> and after the first tape has filled? If you have only the 'after' just
> now, that might be interesting.

Will try to see this on the next try.

/* Actual state: the second tape has been loaded and in use since yesterday 02:14 p.m., and it has really been written with only 29 GB :-((( */

And, I wrote wrong: there IS something strange in the Bacula log. Oooh.

Yesterday, I started the 7 not-fitting big backup jobs around 9:45 a.m. The first tape was filled with 540 GB by 02:45 p.m.:

  | 833 | 90L1 | Full | 1 | 543,602,949,120 | 544 | 63,072,000 | 0 | 1 | 1 | LTO-3 | 2010-12-14 14:45:52 |

During this time, I could follow the spooling via messages on bconsole. It all looked correct: parallel spooling and writing of data.

/* PARENTHESIS: Actual state: the second tape was loaded yesterday 02:45 p.m., and it has really been written with only 29 GB since then, over 18 hours :-(((

  | 834 | 98L1 | Append | 1 | 29,998,080,000 | 30 | 63,072,000 | 0 | 4 | 1 | LTO-3 | 2010-12-15 08:14:46 |

END OF PARENTHESIS */

In the log, there's only information about the job which I started first (lnv-102). No word about the other 6 jobs and their spooling.
[Bacula-users] backup slowdown (mysqld) after tape autochange
Hi, strange problem.

Here's some hardware where Bacula has been running successfully for ca. 5 years. It was release 1.38.11 under Solaris 10 x86. Last month, we had a system disk crash on the backup system. No backup data has been lost; we just had to reinstall the backup system.

Since this was our only Solaris x86 system, we decided to migrate to Linux and to a newer Bacula release. Until the repaired hardware was available, we started with a virtualized new system, just for the daily incremental backups to disk volumes. Since most of our current systems are Ubuntu Hardy Server LTS, we chose that distribution's Bacula 2.2.8 as our new version (well, it's old, but 1.38.11 was running well, and 2.2.8 was the default). We upgraded Bacula's MySQL database with the corresponding script from 1.38.11 to 2.2.8 and imported the updated DB using mysqldump into the new system, which has MySQL 5.1.41 and Linux kernel 2.6.32. The virtualized system worked well all the time.

Now the hardware version of the system is ready, and a yearly full backup, which goes directly to tape, is imminent. And now the strange things are coming...

/* The system is a 2x2-core AMD Opteron system, 4 GB RAM, 6x LSI SCSI U320 MegaRAID with separated channels for external disks, tape drives and autochanger; 23 TB disk storage on external RAIDs, autochanger and HP drives for LTO-3 tapes. */

NOW BACKING UP...

Starting a bunch of full backup jobs which fit into 1 SINGLE TAPE produces NO PROBLEMS: the jobs start, run and write, and terminate within the usual span of time. In so doing, I can back up a dozen systems with 360 GB in total on one tape in a few hours.

FACING THE PROBLEM...

Starting a bunch of full backup jobs that DO NOT FIT into 1 single tape proceeds as follows (with a fresh tape forced by setting the former one to read-only):

- first, the jobs run well and write their data to the first fresh tape of the corresponding pool. Speed is similar to what we knew from the old OS.
- when the tape is full with around 600 GB of data, it is marked Full and unloaded, and the next free tape of the pool is loaded.
- from this moment on, writing to the new fresh tape becomes incredibly slow (4 GB/hour) and mysqld constantly has 95%-100% CPU load. No other process has a significant load, and the MySQL load isn't reflected in the system's load values:

  Cpu(s): 3.3%us, 2.2%sy, 0.0%ni, 91.6%id, 2.1%wa, 0.1%hi, 0.7%si, 0.0%st
  Mem:  3961616k total, 3850072k used, 111544k free, 17532k buffers
  Swap: 3906552k total, 0k used, 3906552k free, 3579956k cached

  PID  USER  PR NI VIRT RES SHR  S %CPU %MEM TIME+     COMMAND
  1356 mysql 20 0  144m 31m 2376 S 98   0.8  163:57.79 mysqld
  1    root  20 0  2620 948 528  S 0    0.0  0:00.63   init
  2    root  20 0  0    0   0    S 0    0.0  0:00.00   kthreadd

The only further effect I can see is that the table bacula.JobMedia is growing. No errors in the system log, no MySQL errors, none in Bacula's log.

What I mainly don't understand is why this happens after a tape change. The MaxSpoolSize is 32 GB, and I'm backing up 7 systems. Each of them had several spool steps during the first tape. From the view of Bacula and its program logic, what has changed when the tape was changed? I guess it's all the same: spooling data, writing them to tape and updating the catalog, regardless of first, second or later tape...?!?

Regards,
Robert

+++ German Research Center for Artificial Intelligence +++
Dipl.-Inform. Robert V. Wirth, Campus D3_2, D-66123 Saarbruecken
@office: +49-681-85775-5078 / -5572 +++ @fax: +49-681-85775-5020
mailto:robert.wi...@dfki.de ++ http://www.dfki.de/~wirth

Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern
Geschaeftsfuehrung (executive board):
- Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
- Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats (supervisory board chairman):
- Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern, HRB 2313
[Bacula-users] How to change MaxStartDelay of a job?
Hi!

Running Bacula 1.38.11, I want to change the MaxStartDelay of a job.

Situation:
- several jobs are started at 23:10 with priority 10
- the Catalog job is started at 23:15 with priority 13

The MaxStartDelay for the Catalog job was 1 hour. This was too short (it came from a misunderstanding of MaxStartDelay, which I have now learned behaves more like a "MaxStartDelayAfterSchedule"). The Catalog job is aborted (without an email warning/error message) because higher-priority jobs are still running 1 hour after it was _scheduled_ (while it is still waiting for the higher-priority jobs to complete). A manual run of the Catalog job in the morning works, so the job itself is OK.

Now I have increased the MaxStartDelay in bacula-dir.conf to 6 hours, long enough that all higher-priority jobs complete before. After reloading the configuration using bconsole, nothing has changed :-(

I wonder if the value of MaxStartDelay is buffered somewhere in the Bacula DB, so that changes in the configuration file will not take effect without an update of the DB entries? (I tried to find out the value of MaxStartDelay using bconsole's "show", but I can't find it.)

Any help welcome.

Regards,
Robert

+++ German Research Center for Artificial Intelligence +++
+++ I N F R A - S T R U C T U R E - G R O U P +++
DFKI GmbH, ISG, Campus D32, Stuhlsatzenhausweg 3, 66123 Saarbruecken
@office: +49-681-302-5572, -5514, -5078 @telefax: +49-681-302-5020
mailto:isg-sb(at)dfki.de Germany http://www.dfki.de/web/
Re: [Bacula-users] How to change MaxStartDelay of a job?
11.01.2008 13:11:28, Arno Lehmann wrote:

> "show job=..." might reveal it. Other than that, I can only tell you
> that these values are not stored in the catalog, as far as I know. My
> job changes show up immediately after a 'reload'. But I haven't worked
> with the delay settings much...

I know that. But where's the MaxStartDelay in it? The only large number there is 3,600; I think this is the time span for rescheduling the job.

*show job=Catalog
Job: name=Catalog JobType=66 level=Full Priority=13 Enabled=1
  MaxJobs=1 Resched=1 Times=1 Interval=3,600 Spool=1 WritePartAfterJob=0
-- Client: name=-fd address=X FDport=9102 MaxJobs=8
  JobRetention=2 years FileRetention=2 years AutoPrune=1
-- Catalog: name=MySQL address=*None* DBport=0 db_name=bacula
  db_user=bacula MutliDBConn=0
-- FileSet: name=Catalog
  O M N
  I /var/services/Dump/mysql/bacula.sql
  N
-- Schedule: name=Catalog
-- Run Level=Full hour=23
  mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
  month=0 1 2 3 4 5 6 7 8 9 10 11 wday=0 1 2 3 4 5 6 wom=0 1 2 3 4
  woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53
  mins=15
-- RunBefore=/usr/local/share/bacula/scripts/make_catalog_backup bacula bacula X
-- RunAfter=/usr/local/share/bacula/scripts/delete_catalog_backup bacula
-- WriteBootstrap=/var/services/bacula/work/bootstraps/Catalog.bsr
[SNIP] /* the rest of the record corresponds to Storage, Pool etc. settings */

This is the Job record in bacula-dir.conf:

Job {
  Name = Catalog
  Client = -fd
  Type = Backup
  Level = Full
  Messages = Standard
  Schedule = Catalog
  FileSet = Catalog
  Storage = Tape1
  Pool = Test
  MaximumConcurrentJobs = 1
  MaxStartDelay = 6 hours   ## was 1 hour before the change
  MaxRunTime = 6 hours
  MaxWaitTime = 1 hour
  PruneJobs = no
  PruneFiles = no
  PruneVolumes = no
  RerunFailedLevels = no
  RescheduleOnError = yes
  RescheduleInterval = 1 hour
  RescheduleTimes = 1
  SpoolData = yes
  SpoolAttributes = no
  Priority = 13
  WriteBootstrap = /var/services/bacula/work/bootstraps/Catalog.bsr
  RunBeforeJob = /usr/local/share/bacula/scripts/make_catalog_backup bacula bacula XXX
  RunAfterJob = /usr/local/share/bacula/scripts/delete_catalog_backup bacula
}

Best,
Robert

+++ German Research Center for Artificial Intelligence +++
+++ I N F R A - S T R U C T U R E - G R O U P +++
DFKI GmbH, ISG, Campus D32, Stuhlsatzenhausweg 3, 66123 Saarbruecken
@office: +49-681-302-5572, -5514, -5078 @telefax: +49-681-302-5020
mailto:isg-sb(at)dfki.de Germany http://www.dfki.de/web/
[Bacula-users] [Hint] corrupt ext3 filesystem forces giant job (_no_ problem)
Hi!

FYI: (still) running Bacula 1.38.11, I found an interesting effect last night.

On a system with 35 GB disk space, running SuSE Linux 10.1, Bacula tried to save a several-TB job. The backup was canceled after 1.2 TB had been written to the SD, due to the MaxRunTime directive. I don't know whether it would ever have terminated otherwise, as long as tape media were available...

The cause is simple: there was an error in the ext3 filesystem. There was an inode claimed to be some TB big. fsck recognized that when I searched for the cause. But bacula-fd can't, obviously. Before correction with fsck, the effect was reproducible, i.e. the next backup job would have run the same way.

Fortunately, the virtual mega-file consisted of only a few different bytes (or of one byte only). Thus, the hardware compression of my tape drive fit that 1.2 TB into a short stretch of tape.

Best,
Robert

++ Deutsches Forschungszentrum fuer Kuenstliche Intelligenz ++
Dipl.Inf. Robert Wirth, Stuhlsatzenhausweg 3, 66123 Saarbruecken
@office: +49-681-302-5078 oder -5572 ++ @fax: +49-681-302-5020
mailto:[EMAIL PROTECTED] http://www.dfki.de/~wirth
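fsck is the real fix, of course; still, a quick scan for implausibly large files can flag such a corrupt inode before a backup runs into it. An illustrative sketch (Python; not part of Bacula, and the 100 GB threshold is arbitrary):

```python
import os

def oversized_files(root, limit_bytes):
    """Return (path, size) pairs for files whose reported size exceeds limit_bytes."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                size = os.lstat(path).st_size
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if size > limit_bytes:
                hits.append((path, size))
    return hits

# Example: anything over 100 GB is suspicious on a 35 GB disk.
for path, size in oversized_files("/var/tmp", 100 * 1024**3):
    print(path, size)
```

Note this only catches the "inode claims an absurd st_size" case; it cannot detect filesystem corruption in general.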
[Bacula-users] File count after FD termination _or_ out of MaxRunTime
Hi,

using Bacula version 1.38.11 (28 June 2006), this happens:

A.. While a backup job is running, the FD terminates accidentally (process kill, machine powered off, etc.). The backup job is canceled by Bacula (an error mail is sent etc.). ! The volume file count is wrong after that -- with a difference of 1. The next job, waiting for a writable volume in the same pool, is scheduled and marks the volume (tape) with Error.

B.. A backup job reaches MaxRunTime. Bacula decides to cancel the job; that's OK. Again, the file count isn't updated correctly. That's not OK. The result is the same as in A..: file count mismatch, then Error state when the next job tries to write on the volume.

My question: why is there a file count mismatch?

Consider case B.., which I think is the heavier one: the complete Cancel operation is under the control of Bacula; it's a Cancel from within the system! It should be easy for Bacula to enter the correct file count in the database...?! Even in case A.., where the interrupt comes from outside, the SD, after losing the connection to the FD, should be able to tell the DIR the correct file count it has written to storage, and the DIR should be able to update the count in the database.

So, it's a bug, isn't it? Any help with that?

Currently, I correct file counts manually using bconsole. That's a tedious thing, because the volumes with wrong file counts are still in status Append as long as no further job fails on them. Thus, I have to check all Cancel mails every morning for the exact reason.

Gruezi and thanks,
Robert

++ German Research Center for Artificial Intelligence ++
Dipl.Inf. Robert Wirth, Stuhlsatzenhausweg 3, 66123 Saarbruecken
@office: +49-681-302-5078, or -5572 +++ @fax: +49-681-302-5020
mailto:[EMAIL PROTECTED] http://www.dfki.de/~wirth
Re: [Bacula-users] Problem: Bacula doesn't use a new tape
Hi Ralf,

this is my experience as well. As a workaround, I'm using only 23 hours as a one-day usage period; this happens to work.

Regards,
Robert

++ German Research Center for Artificial Intelligence ++
Robert Wirth, Stuhlsatzenhausweg 3, D-66123 Saarbruecken
@office: +49-681-302-5078/5572 ++ @fax: +49-681-302-5020
mailto:[EMAIL PROTECTED] ++ http://www.dfki.de/~wirth

Ralf Winkler wrote:

> I use Bacula version 1.38.11 on 2 different systems. On both systems
> Bacula writes to a hard drive and shall use the file for a period of
> 24 hours (23 hours 50 mins exactly). If I understand the docs right,
> I have to define this in the Pool definition. So I did:
>
> # Default pool definition
> Pool {
>   Name = Default
>   Pool Type = Backup
>   Recycle = yes               # Bacula can automatically recycle Volumes
>   LabelFormat = File-Name
>   AutoPrune = yes             # Prune expired volumes
>   Volume Retention = 10 days  # changed to 10 days instead of 1 year
>   Maximum Volumes = 10        # set to 12 tapes (with 2 spares)
>   Volume Use Duration = 1430m # set to 23h 50m
>   Accept Any Volume = yes     # write on any volume in the pool
>
> And now my problem: on one system, Bacula changes and writes to a new
> file every day. On the other system Bacula always wants to write to the
> same file. I have to set the file to Used in bconsole; then Bacula will
> use the next file, but it doesn't switch to the next file after 24
> hours; I have to do it again by hand. The definition is, except for the
> Label, the same on both machines. Did I miss something?
Re: [Bacula-users] Backup Method with Multiple Media Types - Resend
Hi Alan,

a job definition allows only one Storage directive, of course. All directives are keywords, so they must be unique. But within a Schedule definition, you may change the storage for the scheduled job. Example:

Schedule {
  Name = Foo
  Run = Full Storage=Tape Pool=FooFull 1st sat on dec at 06:10
  Run = Differential Storage=Tape DifferentialPool=FooDiff 1st sat on jan-nov at 23:10
  Run = Incremental Storage=Disk IncrementalPool=FooIncr sun-fri at 23:10
  Run = Incremental Storage=Disk IncrementalPool=FooIncr 2nd-5th sat at 23:10
}

JobDefs {
  Name = FooJob
  Level = Incremental
  Schedule = Foo
  Storage = Tape
  Pool = ...
  ...
}

Job {
  Name = FooHost
  JobDefs = FooJob
  Client = FooHost-fd
  ...
}

Client {
  Name = FooHost-fd
  ...
}

Best,
Robert

++ German Research Center for Artificial Intelligence ++
Robert Wirth, Stuhlsatzenhausweg 3, D-66123 Saarbruecken
@office: +49-681-302-5078/5572 ++ @fax: +49-681-302-5020
mailto:[EMAIL PROTECTED] ++ http://www.dfki.de/~wirth
[Bacula-users] How to enforce Bacula to use tapes one after another?
Hi!

I've got two LTO-3 drives with one autochanger device. Backups of dozens of clients into several pools work well, in general. My Bacula version is 1.38.7.

There remains one annoying problem: I've defined a pool with two tapes for daily snapshot backups of some data of temporary interest from several clients. For simplicity, all snapshot jobs are defined identically. As a consequence, they all have the same start time.

The idea is that Bacula uses the first tape until it's full, then continues with the second tape. The retention periods are chosen such that, when the second tape is completed (Full, Used etc.), all data on the first tape will be out of date, so the first tape can be recycled automatically. And the other way round...

But, good idea -- bad implementation: Bacula uses the tapes in an arbitrary order. One day, only one tape is used. Another day, both tapes are used. As a consequence, when tape recycling is needed some day, both tapes will contain current backup data.

For instance, have a look at the current situation (the jobs start daily at 20:05, using the same Schedule and JobDefs):

Pool: Snapshot
+---------+-----+-----------------+----------+-----+---------------------+
| MediaId | ... | VolBytes        | VolFiles | ... | LastWritten         |
+---------+-----+-----------------+----------+-----+---------------------+
|      54 | ... | 497,058,209,311 |      520 | ... | 2006-06-09 20:14:29 |
|      86 | ... | 110,145,463,630 |      115 | ... | 2006-06-09 20:19:40 |
+---------+-----+-----------------+----------+-----+---------------------+

You can see: tape 54 (the first in the pool) is used mostly, but not exclusively, although the tape is still in Append state.

I can't find a hint whether/how I can instruct Bacula to use the volumes of a pool consecutively. I'd expect this to be the default behaviour, but obviously it's not. Is there any solution with Bacula?
Best regards,
Robert

++ German Research Center for Artificial Intelligence ++
Robert Wirth, Stuhlsatzenhausweg 3, D-66123 Saarbruecken
@office: +49-681-302-5078/5572 ++ @fax: +49-681-302-5341
mailto:[EMAIL PROTECTED] ++ http://www.dfki.de/~wirth
Re: [Bacula-users] How enforce Bacula to use tapes one after another?
> I'm pretty sure that the normal way of working for an SQL engine is to
> always use a higher number when creating new records. That's how the
> serial type in PostgreSQL works. I don't think an RDBMS tries to [...]

My experience is the same with MySQL: MediaIDs of deleted volumes aren't reused. A new volume gets a MediaID higher than the highest one ever used.

Regards, Robert
[Bacula-users] director crash(?) and empty report email
Hi!

I'm using Bacula 1.38.7. Today, when I changed the director's configuration a bit -- I just changed a MaxWaitTime entry in a JobDefs resource -- the director daemon terminated after the reload and sent the email attached here. The email has a subject I don't understand, and an empty body, so I can't figure out what went wrong. Can anybody give me a hint?

Around the same time, a backup job was running that saved the catalog database. I wonder whether the reload itself, the reload during the backup, or something else caused the crash.

Regards, Robert

---BeginMessage---
---End Message---
[Bacula-users] Question: how tapes are chosen from a pool
Hi,

I recently studied the Pool/Volumes section in the manual, but I didn't find a hint about which algorithm runs when Bacula has to choose a tape from a pool for a given job.

Example (or rather my current problem, you know ;-)

- The system is an autochanger with barcode reader, 60 slots and 2 LTO-3 drives, Bacula version 1.38.7, running well with dozens of clients, in principle.
- There's a new pool with two freshly labeled, empty tapes.
- Two jobs are scheduled at the same time, doing full backups into that pool. The jobs have the same priority and so on; only the client names differ.
- The first tape is loaded into drive 0, and the first job completes successfully. The tape remains in the drive, with some 100 MB written and state Append.
- Then the second job unloads that tape, loads the other one into the same drive and writes to _that_ tape. Why? The first tape was OK and loaded!
- Finally, I have both new tapes written, and will get problems in the future, because I want the two tapes used alternately via automatic volume retention and recycling. The nearly identical initial timestamps on the tapes will be confusing.

I'm wondering if there's a clear algorithm in Bacula that decides which tape to choose. I've got some more pools with similar configuration, and there, the tapes are (until today) used one after another.

Regards, Robert
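As a manual workaround, marking the unwanted tape so it is no longer an append candidate should steer the next job to the other one. A bconsole fragment along these lines may help (the volume name, storage name and slot are placeholders; older versions may prompt interactively instead of accepting the one-line form):

```
*update volume=Tape-2 volstatus=Used
*mount storage=LTO-Changer slot=1
```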
[Bacula-users] help with bacula-fd and inetd
Hi!

I'm using bacula-fd via inetd. This is an example line from inetd.conf:

bacula-fd stream tcp nowait root /usr/local/sbin/bacula-fd bacula-fd -i -c /etc/opt/bacula/bacula-fd.conf

As you can see, I'm using the option -i, as required for Bacula's work with inetd. This works well in the normal case, when everything is alright: the file daemon is started for backup, sends all data to the storage daemon and terminates correctly.

But if I try to cancel a job using bconsole, the communication between director and file daemon fails. I found that the director wants to establish another connection to port 9102 (I guess, to send the cancel command), and inetd then tries to start a second process instance of the file daemon. This instance fails because it finds the pidfile of the first, still running backup instance.

The same thing works well if I run the file daemon in the foreground. Is this the expected behaviour, or am I missing some option/directive? The system is Solaris 9 with its own inetd. Or is this a general limitation of inetd?

Regards, Robert Wirth
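For completeness, an inetd setup like the one above also assumes a matching /etc/services entry for the file daemon's port (9101 and 9103 are the director's and storage daemon's registered defaults, listed here only for reference):

```
bacula-dir      9101/tcp
bacula-fd       9102/tcp
bacula-sd       9103/tcp
```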
[Bacula-users] storage forces jobs to wait
Hi!

Just another problem. What happens:

- The director starts a backup job. The file daemon is not present correctly, so the job is waiting. That's ok.
- I cancel the job in the director via the console. 'status director' shows the job as canceled.
- I ensure the file daemon is not running (it is started via inetd).
- I ensure all parameters are alright now and start the SAME job again.

Now it could work, but... when I look at 'status director', the canceled job is still canceled, and the new job is 'waiting for max Storage'. This is still the same some hours later. Then I looked at 'status storage': there's a storage job waiting for the file daemon to connect. This storage job seems to correspond to the first, canceled director job.

I understand that the director has initiated and then canceled the first job, but the storage daemon didn't understand the cancel, or simply wasn't informed about it by the director. And the director is blocked by the storage daemon (for this job's client), and is unable to influence the storage daemon in a way that cancels the 'storage job'!?

Again, the only solution was to stop/start the storage daemon and reconnect the director to it. Any chance to reload/reset the storage daemon softly from the director?

Best regards, Robert Wirth
[Bacula-users] BLOCKED device
Hi,

What happened: there was a missing full backup, so an incremental job was upgraded. But there was no volume available for the full backup, thus the tape device is now blocked, waiting for the mount.

I decided to cancel the waiting job, because I didn't want to do the full backup now (it is a daily job), and I'd like to run some other small backup jobs now. The cancel is done, but the device is still BLOCKED waiting for media. I tried to unmount, but it didn't help.

How can I tell the SD to forget all former mount/wait/block state on a certain device (if no job is present anymore)? In general, I could restart the daemon, but this is no option at the moment, because the second tape drive is doing some other backup jobs on another volume.

Again: how can I tell the SD to unblock a device at runtime?

Regards, Robert Wirth
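For reference, the bconsole sequence that normally releases a waiting device looks roughly like this (the jobid, storage name and slot are placeholders; on 1.38 this may still fail when the SD keeps a stale job attached, which matches the behaviour described above):

```
*cancel jobid=1234
*unmount storage=LTO-Changer
*mount storage=LTO-Changer slot=7
```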