[Bacula-users] Bacula 5.0.x Max Wait Time

2013-05-24 Thread Robert Wirth
Hi,

I wonder whether the MaxWaitTime problem is still unsolved in Bacula 5.0.x.
It was discussed and reported as a bug several times in earlier years, and
again and again a fix was promised for the next release.

There is a chart in the Bacula manual (in the Job section of the director
configuration), showing the relations between

Max Run Sched Time
Max Start Delay
Max Run Time
Max Wait Time

This illustration implies that MaxWaitTime is unaffected by the span
of time between scheduling and start.  According to the chart, MaxWaitTime
begins to count _after_ the job has started, once a blocking situation
occurs.  This also matches the textual definition there:

»Max Wait Time = time
The time specifies the maximum allowed time that a job may block waiting 
for a resource (such as waiting for a tape to be mounted, or waiting for the 
storage or file daemons to perform their duties), counted from when the job 
starts (not 
necessarily the same as when the job was scheduled). This directive works as 
expected since bacula 2.3.18. «

De facto, this isn't the case.  I've got this job definition:

...
MaxStartDelay = 6 hours
MaxRunTime = 2 hours
MaxWaitTime = 1 hour
...

According to the chart in the manual, it should be fine if the job has
to wait up to 6 hours for its start after it was scheduled (e.g. if
jobs with a higher priority keep on running).  Once started, it
may run for up to 2 hours, and within this span of time it may be
blocked for up to 1 hour.

But, again and again, I get this job cancelled with this log:

23-May 00:15 bup-serv-dir JobId 140008: Fatal error: Max wait time exceeded. 
Job canceled.
...
  Scheduled time: 22-May-2013 23:15:00
  Start time: 23-May-2013 04:07:05
  End time:   23-May-2013 04:07:05
  Elapsed time:   0 secs

That is, the job was _scheduled_ at 23:15 and cancelled for exceeding
MaxWaitTime one hour later, at 00:15, although it had not started at all.
Obviously, MaxWaitTime is counted from when the job is SCHEDULED,
and NOT from when the job STARTS.
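The arithmetic can be checked mechanically.  A small sketch (timestamps taken
from the log above; this is only an illustration, not Bacula code) shows that
the observed cancel time fits only the schedule-based interpretation:

```python
from datetime import datetime, timedelta

scheduled = datetime(2013, 5, 22, 23, 15, 0)   # Scheduled time from the log
started   = datetime(2013, 5, 23, 4, 7, 5)     # Start time from the log
max_wait  = timedelta(hours=1)                 # MaxWaitTime = 1 hour

# Interpretation per the manual: the wait clock starts when the job starts.
cancel_per_manual = started + max_wait
# Observed behaviour: the wait clock starts when the job is scheduled.
cancel_observed = scheduled + max_wait

print(cancel_per_manual)  # 2013-05-23 05:07:05
print(cancel_observed)    # 2013-05-23 00:15:00 -- matches the cancel in the log
```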

Thus, either the manual is wrong (text + chart), or this is a bug.  


Am I right?


Best regards,

Robert



+++German Research Center for Artificial Intelligence+++

Dipl.-Inform. Robert V. Wirth, Campus D3_2, D-66123 Saarbruecken
@office: +49-681-85775-5078 / -5572 +++ @fax: +49-681-85775-5020
mailto:robert.wi...@dfki.de ++ http://www.dfki.de/~wirth



___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Full job terminates claiming no space left but tape gets more data

2012-12-11 Thread Robert Wirth
Marco van Wieringen m...@planets.elm.net writes:
  ...
  Fseek on attributes file failed: ERR=No space left on device
  ...
  Last Volume Bytes:  1,283,646,163,968 (1.283 TB)
  
  Really, the tape isn't full!  The other 5 jobs (and this one 
  after being rescheduled) continue to write that tape AD0016L5
  which by now has 1,633,895,424,000 Bytes written and is still 
  in Append state.
  
  Question
  
  What may be the cause for that ERR=No space left on device
  
 Older versions of Bacula (at least the old ones you talk about)
 never did so-called attribute spooling. The problem is probably the
 amount of free storage on the storage daemon; see the commit_attribute_spool()
 function, which emits the error Fseek on attributes file failed.

Oh, yes.  Working dir filesystem was full.  Oops... I didn't realize...

 You have a couple of options:
 
 - disable attribute spooling (see the spoolattributes setting in the
   director configuration)
 - make sure enough space is available on disk for spooling the
   attributes to disk before they are batch-inserted into
   the database. Keep in mind that disabling attribute spooling
   and batch insert will probably slow down the overall backup.
   The attribute spool files are created in either the spool directory
   or the working directory.
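For the first option, a minimal sketch of where the directive would go (the
job name is illustrative; SpoolAttributes is the per-Job form used elsewhere
in this archive, e.g. in the Catalog job further down):

```
Job {
  Name = "big-full-job"    # illustrative name
  # ... usual Client/FileSet/Storage/Pool settings ...
  SpoolAttributes = no     # skip attribute spooling; attributes are
                           # inserted into the catalog as the job runs
}
```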

Thank you a lot!  This is the solution.

Best regards,

Robert


---
+++   German Research Center for Artificial Intelligence+++
---
Robert Wirth,Campus D32,Stuhlsatzenhausweg 3,66123 Saarbruecken
@phone:+49-681-85775-5078  Germany  mailto:robert.wi...@dfki.de
---





[Bacula-users] Full job terminates claiming no space left but tape gets more data

2012-12-10 Thread Robert Wirth
Hi,

Situation:
--
The backup strategy is a yearly Full, monthly Diff and daily Incr.
This worked well for years with Bacula 1.x and 2.x.

Some months ago, I upgraded the SD and DIR from Bacula 2.2.8 to 5.0.1.
The Diff/Incr jobs continued to do well.

But now, while doing a Full for the first time under 5.0.1,
I got errors (and can reproduce them).

Hardware is two LTO-5 drives with a changer and a lot of tapes.

MaximumSpoolSize = 1024 GB
MaximumJobSpoolSize = 32 GB
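(For context: these two directives live in the storage daemon's Device
resource; a rough sketch with illustrative names and paths, not my actual
config:)

```
Device {
  Name = "Tape1A"                      # drive name as in the log below
  Media Type = LTO-5
  Archive Device = /dev/nst0           # illustrative
  Spool Directory = /var/bacula/spool  # illustrative; must have enough space
  Maximum Spool Size = 1024 GB
  Maximum Job Spool Size = 32 GB
}
```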



What's up?
--
I have 6 bigger jobs (TB size each) in the same pool and start them
at the same time.  They spool and despool alternately
to the same tape(s), as expected.

Tape AD0018L5 is written by the 6 jobs and, when full, is
automatically replaced by the next free tape, AD0016L5.  The
latter is also written by the 6 jobs...

Then, one of the 6 jobs fails (and is rescheduled):

Committing spooled data to Volume AD0016L5. Despooling 22,147,812,023 bytes 
...
Fseek on attributes file failed: ERR=No space left on device
...
Last Volume Bytes:  1,283,646,163,968 (1.283 TB)

Really, the tape isn't full!  The other 5 jobs (and this one,
after being rescheduled) continue to write to tape AD0016L5,
which by now has 1,633,895,424,000 bytes written and is still
in Append state.


Question

What may be the cause of that ERR=No space left on device?

The same configuration worked well with Bacula <= 2.2.8 and LTO-3
(with many more full tapes and tape changes during a Full session
than today when filling LTO-5).


More info:
--

08-Dez 05:46 bup-serv-dir JobId 130936: Start Backup JobId 130936, 
Job=lnv-91163.2012-12-08_04.46.11_20
08-Dez 05:46 bup-serv-dir JobId 130936: Using Device Tape1A
08-Dez 05:46 bup-serv-sd JobId 130936: Spooling data ...
...
09-Dez 20:17 bup-serv-sd JobId 130936: Committing spooled data to Volume 
AD0016L5. Despooling 22,147,812,023 bytes ...
10-Dez 00:52 bup-serv-sd JobId 130936: Despooling elapsed time = 00:34:51, 
Transfer rate = 10.59 M Bytes/second
10-Dez 00:52 bup-serv-sd JobId 130936: Fatal error: Fseek on attributes file 
failed: ERR=No space left on device
10-Dez 00:52 bup-serv-dir JobId 130936: Error: Bacula bup-serv-dir 5.0.1 
(24Feb10): 10-Dez-2012 00:52:50
...
  Elapsed time:   1 day 19 hours 6 mins 36 secs
  Priority:   11
  FD Files Written:   445,124
  SD Files Written:   445,124
  FD Bytes Written:   277,792,113,411 (277.7 GB)
  SD Bytes Written:   277,875,653,641 (277.8 GB)
...
  Volume name(s): AD0018L5|AD0016L5
  Volume Session Id:  202
  Volume Session Time:1354723843
  Last Volume Bytes:  1,283,646,163,968 (1.283 TB)
  Non-fatal FD errors:0
  SD Errors:  0
  FD termination status:  OK
  SD termination status:  Error
  Termination:*** Backup Error ***


list volumes pool=ServerFull
...
|   1,012 | AD0016L5   | Append |   1 | 1,633,895,424,000 | 1,634 | 63,072,000 |   1 |   56 | 1 | LTO-5 | 2012-12-10 10:24:20 |
...
|   1,014 | AD0018L5   | Full   |   1 | 1,896,767,502,336 | 1,897 | 63,072,000 |   0 |   58 | 1 | LTO-5 | 2012-12-08 14:45:09 |


Regards, 

Robert

---
+++   German Research Center for Artificial Intelligence+++
---
Robert Wirth,Campus D32,Stuhlsatzenhausweg 3,66123 Saarbruecken
@phone:+49-681-85775-5078  Germany  mailto:robert.wi...@dfki.de
---





Re: [Bacula-users] backup slowdown (mysqld) after tape autochange

2010-12-15 Thread Robert Wirth
 
 On Tue, December 14, 2010 11:48 am, Robert Wirth wrote:
  Hi,
 
  strange problem.  Here's some hardware where Bacula has been running
  successfully for ca. 5 years.  It was release 1.38.11 under Solaris 10x86.
 
  Last month, we had a system disk crash on the backup system.  No backup
  data was lost.  We just had to reinstall the backup system.
  Since this was our only Solaris x86 system, we decided to migrate
  to Linux and to a newer Bacula release.  Until the repaired hardware
  was present, we started with a virtualized new system, just for the
  daily incremental backups to disk volumes.
 
  Since most of our current systems are Ubuntu Hardy LTS servers, we
  chose Bacula 2.2.8 from this distribution as our new version (well,
  it's old, but 1.38.11 was running well, and 2.2.8 was the default)
 
  We upgraded Bacula's mysql database with the corresponding script
  from 1.38.11 to 2.2.8.  We imported the updated DB using mysql_dump
  into the new system which has MySQL 5.1.41 and Linux Kernel 2.6.32
  The virtualized system worked well all the time.
 
  Now, the hardware version of the system is ready, and a yearly full
  backup, which goes directly to tape, is imminent.
 
  And now, the strange things are coming...
 
 
  /* The system is a 2x2 core AMD Opteron system, 4 GB RAM, 6xLSI SCSI U320
  Megaraid with separated channels for external disks, tape readers and
  autochanger.  23 TB disk storage on external RAIDs, autochanger and
  HP-readers for LTO-3 tapes.   System: see above. */
 
 
  NOW BACKING UP...
 
  Starting a bunch of full backup jobs which fit into 1 SINGLE TAPE
  produces NO PROBLEMS:  the jobs start, run and write, and terminate
  within a usual span of time.  In so doing, I can backup a dozen
  systems with totally 360 GB on one tape in a few hours.
 
 
  FACING THE PROBLEM...
 
  Starting a bunch of full backup jobs that DO NOT FIT into 1 single
  tape proceeds like follows  (with a fresh tape forced by setting the
  former one to readonly):
 
  - first, the jobs run well and write their data to the first fresh tape
of the corresponding pool.  Speed is similar as known from the old OS.
 
  - when the tape is full with around 600GB of data, it is marked as
Full, being unloaded, and the next free tape of the pool is loaded.
 
  - from this moment on, writing to the new fresh tape becomes incredibly
slow (4 GB/hour) and mysqld has constantly 95%-100% CPU load.
No other process has an important load, and the mysql load isn't
represented in the system's load values:
 
  Cpu(s):  3.3%us,  2.2%sy,  0.0%ni, 91.6%id,  2.1%wa,  0.1%hi,  0.7%si,
  0.0%st
  Mem:   3961616k total,  3850072k used,   111544k free,17532k buffers
  Swap:  3906552k total,0k used,  3906552k free,  3579956k cached
 
PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
   1356 mysql 20   0  144m  31m 2376 S   98  0.8 163:57.79 mysqld
  1 root  20   0  2620  948  528 S0  0.0   0:00.63 init
  2 root  20   0 000 S0  0.0   0:00.00 kthreadd
   
 
  The only further effect I can see is that the table bacula.JobMedia is
  growing.   No errors in system log, no mysql errors, nor in Baculas log.
 
  What I mainly don't understand is why this happens after a tape change.
  The MaxSpoolSize is 32GB, and I'm backing up 7 systems.  Each of them
  had several spool steps during the first tape.
 
 From the view of Bacula and its program logic, what has changed when
  the tape has been changed?  I guess it's all the same:  spooling data,
  writing them to tape and update the catalog, regardless of first, second
  or later tape...?!?
 
 What do you see under Running Jobs in the 'status dir' output before and
 after the first tape has filled?
 
 If you have only the 'after' just now, that might be interesting.

I'll try to capture this on the next run.

/* Current state:  the second tape has been loaded and in use since yesterday 
02:14 p.m., and only 29 GB have really been written :-((( */


And I was wrong earlier:  there IS something strange in the Bacula log.  Oooh.

Yesterday, I started the 7 big, non-fitting backup jobs around 9:45 a.m.
The first tape was filled with 540 GB by 02:45 p.m.

| 833 | 90L1   | Full  |   1 | 543,602,949,120 |  544 | 63,072,000 |   0 |    1 | 1 | LTO-3 | 2010-12-14 14:45:52 |

During this time, I could follow the spooling via messages in bconsole.
It all looked correct:  parallel spooling and writing of data.

/* PARENTHESIS:
Current state:  the second tape was loaded yesterday at 02:45 p.m.,
and only 29 GB have been written to it in the 18 hours since :-(((
| 834 | 98L1   | Append|   1 |  29,998,080,000 |   30 | 63,072,000 |   0 |    4 | 1 | LTO-3 | 2010-12-15 08:14:46 |
END OF PARENTHESIS */


In the log, there's only information about the job I started
first (lnv-102).  No word about the other 6 jobs and their spooling.

[Bacula-users] backup slowdown (mysqld) after tape autochange

2010-12-14 Thread Robert Wirth
Hi,

strange problem.  Here's some hardware where Bacula has been running 
successfully for ca. 5 years.  It was release 1.38.11 under Solaris 10x86.

Last month, we had a system disk crash on the backup system.  No backup 
data was lost.  We just had to reinstall the backup system.
Since this was our only Solaris x86 system, we decided to migrate 
to Linux and to a newer Bacula release.  Until the repaired hardware
was present, we started with a virtualized new system, just for the
daily incremental backups to disk volumes.

Since most of our current systems are Ubuntu Hardy LTS servers, we 
chose Bacula 2.2.8 from this distribution as our new version (well,
it's old, but 1.38.11 was running well, and 2.2.8 was the default)

We upgraded Bacula's MySQL database with the corresponding script
from 1.38.11 to 2.2.8.  We imported the updated DB using mysqldump
into the new system, which has MySQL 5.1.41 and Linux kernel 2.6.32.
The virtualized system worked well the whole time.

Now, the hardware version of the system is ready, and a yearly full 
backup, which goes directly to tape, is imminent.  

And now, the strange things are coming...


/* The system is a 2x2-core AMD Opteron system, 4 GB RAM, 6x LSI SCSI U320
Megaraid with separated channels for external disks, tape drives and
the autochanger.  23 TB disk storage on external RAIDs, autochanger and 
HP drives for LTO-3 tapes.   System: see above. */


NOW BACKING UP...

Starting a bunch of full backup jobs which fit onto 1 SINGLE TAPE
produces NO PROBLEMS:  the jobs start, run, write, and terminate 
within the usual span of time.  This way, I can back up a dozen
systems with a total of 360 GB onto one tape in a few hours. 


FACING THE PROBLEM...

Starting a bunch of full backup jobs that DO NOT FIT onto 1 single 
tape proceeds as follows  (with a fresh tape forced by setting the
former one to read-only):

- first, the jobs run well and write their data to the first fresh tape 
  of the corresponding pool.  Speed is similar to what we saw on the old OS.

- when the tape is full with around 600 GB of data, it is marked 
  Full and unloaded, and the next free tape of the pool is loaded.

- from this moment on, writing to the new fresh tape becomes incredibly 
  slow (4 GB/hour) and mysqld constantly shows 95%-100% CPU load. 
  No other process carries a significant load, and the mysql load isn't 
  reflected in the system's load values:
 
Cpu(s):  3.3%us,  2.2%sy,  0.0%ni, 91.6%id,  2.1%wa,  0.1%hi,  0.7%si,  0.0%st
Mem:   3961616k total,  3850072k used,   111544k free,17532k buffers
Swap:  3906552k total,0k used,  3906552k free,  3579956k cached

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
 1356 mysql 20   0  144m  31m 2376 S   98  0.8 163:57.79 mysqld
1 root  20   0  2620  948  528 S0  0.0   0:00.63 init
2 root  20   0 000 S0  0.0   0:00.00 kthreadd
 

The only further effect I can see is that the table bacula.JobMedia is
growing.   No errors in the system log, no MySQL errors, none in Bacula's log.

What I mainly don't understand is why this happens after a tape change.
The MaxSpoolSize is 32 GB, and I'm backing up 7 systems.  Each of them
went through several spool cycles during the first tape.

From the point of view of Bacula and its program logic, what has changed
once the tape has been changed?  I'd guess it's all the same:  spool data,
write it to tape, and update the catalog, regardless of whether it's the
first, second or a later tape...?!?
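The point that the despool/catalog path was already exercised before the tape
change can be made concrete with a rough count (numbers from above; a
back-of-envelope sketch, not a measurement):

```python
tape_fill_gb = 600   # approximate data on the first tape when it filled
max_spool_gb = 32    # MaxSpoolSize from the configuration above

# Lower bound on despool/catalog-update rounds needed to fill the first tape:
rounds = tape_fill_gb / max_spool_gb
print(round(rounds, 1))  # ~18.8 -- the despool path ran many times without slowdown
```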


Regards,

Robert



+++German Research Center for Artificial Intelligence+++

Dipl.-Inform. Robert V. Wirth, Campus D3_2, D-66123 Saarbruecken
@office: +49-681-85775-5078 / -5572 +++ @fax: +49-681-85775-5020
mailto:robert.wi...@dfki.de ++ http://www.dfki.de/~wirth

Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
- Firmensitz Trippstadter Strasse 122, D-67663 Kaiserslautern

Geschaeftsfuehrung (executive board):
- Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
- Dr. Walter Olthoff

Vorsitzender des Aufsichtsrats (supervisory board chairman):
- Prof. Dr. h.c. Hans A. Aukes

Amtsgericht Kaiserslautern, HRB 2313






[Bacula-users] How to change MaxStartDelay of a job?

2008-01-11 Thread Robert Wirth
Hi!

Running Bacula 1.38.11, I want to change the MaxStartDelay of a job.
Situation:  

- several Jobs are started at 23:10 with priority 10
- the Catalog job is started at 23:15 with priority 13

The MaxStartDelay for the Catalog job was 1 hour.  This was too short 
(it came from a misunderstanding of MaxStartDelay, which, I have now
learned, effectively means MaxStartDelayAfterSchedule).
The Catalog job is aborted (without an email warning/error message) because
higher-priority jobs are still running 1 hour after it was _scheduled_ (while
it is still waiting for those jobs to complete).

A manual run of the Catalog job in the morning works, so the job itself is ok.

Now, I have increased the MaxStartDelay in bacula-dir.conf to 6 hours,
quite long enough for all higher-priority jobs to complete first.  But after
reloading the configuration using bconsole, nothing has changed :-(

I wonder whether the value of MaxStartDelay is buffered somewhere in the
Bacula DB, so that changes to the configuration file do not take effect
without an update of the DB entries?
(I tried to find the value of MaxStartDelay using bconsole's show command,
but I can't find it.)

Any help welcome.

Regards,

Robert


+++German Research Center for Artificial Intelligence+++
+++   I N F R A  -  S T R U C T U R E  -  G R O U P  +++

DFKI GmbH,ISG,Campus D32,Stuhlsatzenhausweg 3,66123 Saarbruecken
@office: +49-681-302-5572,-5514,-5078 @telefax: +49-681-302-5020
mailto:isg-sb(at)dfki.de Germany http://www.dfki.de/web/








Re: [Bacula-users] How to change MaxStartDelay of a job?

2008-01-11 Thread Robert Wirth
11.01.2008 13:11:28, Arno Lehmann wrote:

 Show job=... might reveal it.
 
 Other than that, I can only tell you that these values are not stored 
 in the catalog, as far as I know. My job changes show up immediately 
 after a 'reload'. But I haven't worked with the delay settings much...

I know that.  But where's the MaxStartDelay in it?  The only large number
there is 3,600; I think this is the time span for rescheduling the job.

*show job=Catalog
Job: name=Catalog JobType=66 level=Full Priority=13 Enabled=1
 MaxJobs=1 Resched=1 Times=1 Interval=3,600 Spool=1 WritePartAfterJob=0
  -- Client: name=-fd address=X FDport=9102 MaxJobs=8
  JobRetention=2 years  FileRetention=2 years  AutoPrune=1
  -- Catalog: name=MySQL address=*None* DBport=0 db_name=bacula
  db_user=bacula MutliDBConn=0
  -- FileSet: name=Catalog
  O M
  N
  I /var/services/Dump/mysql/bacula.sql
  N
  -- Schedule: name=Catalog
  -- Run Level=Full
  hour=23 
  mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 
26 27 28 29 30 
  month=0 1 2 3 4 5 6 7 8 9 10 11 
  wday=0 1 2 3 4 5 6 
  wom=0 1 2 3 4 
  woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 
26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 
52 53 
  mins=15
  -- RunBefore=/usr/local/share/bacula/scripts/make_catalog_backup bacula bacula X
  -- RunAfter=/usr/local/share/bacula/scripts/delete_catalog_backup bacula
  -- WriteBootstrap=/var/services/bacula/work/bootstraps/Catalog.bsr
[SNIP]
/* the rest of the record corresponds to Storage, Pool etc. settings */


This is the Job record in bacula-dir.conf:

Job {
Name = Catalog
Client = -fd
Type = Backup
Level = Full
Messages = Standard
Schedule = Catalog
FileSet = Catalog
Storage = Tape1 
Pool = Test
MaximumConcurrentJobs = 1
MaxStartDelay = 6 hours ## was 1 hour before the change
MaxRunTime = 6 hours
MaxWaitTime = 1 hour
PruneJobs = no
PruneFiles = no
PruneVolumes = no
RerunFailedLevels = no 
RescheduleOnError = yes
RescheduleInterval = 1 hour
RescheduleTimes = 1
SpoolData = yes
SpoolAttributes = no
Priority = 13
WriteBootstrap = /var/services/bacula/work/bootstraps/Catalog.bsr
RunBeforeJob = /usr/local/share/bacula/scripts/make_catalog_backup bacula bacula XXX
RunAfterJob  = /usr/local/share/bacula/scripts/delete_catalog_backup bacula
}


Best,

Robert


+++German Research Center for Artificial Intelligence+++
+++   I N F R A  -  S T R U C T U R E  -  G R O U P  +++

DFKI GmbH,ISG,Campus D32,Stuhlsatzenhausweg 3,66123 Saarbruecken
@office: +49-681-302-5572,-5514,-5078 @telefax: +49-681-302-5020
mailto:isg-sb(at)dfki.de Germany http://www.dfki.de/web/







[Bacula-users] [Hint] corrupt ext3 filesystem forces giant job (_no_ problem)

2007-04-10 Thread Robert Wirth
Hi!

FYI: (still) running Bacula 1.38.11, I found an interesting effect
last night.  On a system with 35 GB of disk space, running SuSE Linux 10.1,
Bacula tried to save a multi-TB job.  The backup was canceled, after 1.2 TB 
had been written to the SD, due to the MaxRunTime directive.  I don't know
whether it would ever have terminated otherwise, as long as tape media were
available...

The cause is simple:  there was an error in the ext3 filesystem.  An
inode claimed to be several TB in size.  fsck recognized that when I 
searched for the cause.  But bacula-fd obviously can't.  Before the
correction with fsck, the effect was reproducible, i.e. the next backup
job would have run the same way.

Fortunately, the virtual mega-file consisted of only a few distinct bytes
(or of one byte only).  Thus, the hardware compression of my tape drive 
fit that 1.2 TB into a short stretch of tape.

Best,

Robert

--
++ Deutsches Forschungszentrum fuer Kuenstliche Intelligenz ++
--
Dipl.Inf. Robert Wirth,Stuhlsatzenhausweg 3,66123 Saarbruecken
@office: +49-681-302-5078 oder -5572 ++ @fax: +49-681-302-5020
mailto:[EMAIL PROTECTED]  http://www.dfki.de/~wirth
--





[Bacula-users] File count after FD termination _or_ out of MaxRunTime

2007-02-02 Thread Robert Wirth
Hi,

using Bacula version 1.38.11 (28 June 2006), this happens:


A.. While a backup job is running, the FD terminates accidentally 
(process kill, machine power-off, etc.).
The backup job is canceled by Bacula (an error mail is sent, etc.)

  ! The volume file count is wrong after that -- it is off by 1.
 
The next job, waiting for a writable volume in the same pool, 
is scheduled and marks the volume (tape) with Error. 


B.. A backup job reaches MaxRunTime.  Bacula decides to cancel the job; 
that's ok.  Again, the file count isn't updated correctly.  That's not ok.

The result is the same as in A..: file count mismatch, Error state,
when the next job tries to write on the volume.


My question:  why is there a file count mismatch?  Consider case B.., which
I think is the serious one:  the complete Cancel operation is under the control 
of Bacula; it's a Cancel from within the system!  It should be easy for 
Bacula to enter the correct file count in the database...?!

Even in case A.., where the interruption comes from outside, the SD, after
losing the connection to the FD, should be able to tell the DIR the correct
file count it has written to storage, and the DIR should be able to update 
the count in the database.


So, it's a bug, isn't it?  Any help with that?


Currently, I correct the file counts manually using bconsole.  That's 
a tedious task, because the volumes with wrong file counts remain
in status Append as long as no further job fails on them.  Thus, I have
to check all Cancel mails every morning for the exact reason.

Gruezi and thanks,

Robert

--
++German Research Center for Artificial Intelligence++
--
Dipl.Inf. Robert Wirth,Stuhlsatzenhausweg 3,66123 Saarbruecken
@office: +49-681-302-5078, or -5572 +++ @fax: +49-681-302-5020
mailto:[EMAIL PROTECTED]  http://www.dfki.de/~wirth
--





Re: [Bacula-users] Problem: Bacula don't use a new tape

2007-01-22 Thread Robert Wirth
Hi Ralf, 

This is my experience as well.  As a workaround, I'm using only 
23 hours as the one-day usage period; this happens to work.
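A minimal sketch of that workaround in a Pool resource like the one quoted
below (all other settings unchanged; only the duration differs):

```
Pool {
  Name = Default
  Pool Type = Backup
  Volume Use Duration = 23h   # just under one day, so the daily job
                              # reliably rolls over to the next volume
}
```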

Regards,

Robert


++ German Research Center for Artificial Intelligence ++

Robert Wirth, Stuhlsatzenhausweg 3, D-66123 Saarbruecken
@office: +49-681-302-5078/5572 ++ @fax: +49-681-302-5020
mailto:[EMAIL PROTECTED] ++ http://www.dfki.de/~wirth


Ralf Winkler wrote:

 I use Bacula Version 1.38.11 on 2 different systems.
 On both systems Bacula writes to a hard drive and shall use the file for a
 period of 24 hours (23 hours 50 mins exactly).
 If I understand the docs correctly, I have to define this in the Pool
 definition.
 So I did.
 
 # Default pool definition
 Pool {
   Name = Default
   Pool Type = Backup
   Recycle = yes   # Bacula can automatically recycle
 Volumes
   LabelFormat = File-Name
   AutoPrune = yes # Prune expired volumes
   Volume Retention = 10 days  # changed to 10 days instead of 1 year
   Maximum Volumes = 10# set to 12 tapes (with 2 spares)
   Volume Use Duration = 1430m # set to 23h 50m
   Accept Any Volume = yes # write on any volume in the pool
 
 
 And now my problem:  on one system, Bacula changes to and writes a new file
 every day.  On the other system, Bacula always wants to write to the same file.
 I have to set the file to Used in bconsole; then Bacula will use the next
 file, but it doesn't switch to the next file after 24 hours, so I have to do
 it again by hand.
 
 The definition is, except for the Label, the same on both machines.
 Did I miss something?






Re: [Bacula-users] Backup Method with Multiple Media Types - Resend

2006-11-03 Thread Robert Wirth
Hi Alan,

a job definition allows one Storage directive only, of course.
All directives are keywords, thus they must be unique.  But, 
within a Schedule definition, you may change the storage for 
the scheduled job.  Example:

Schedule {
Name = Foo
Run = Full Storage=Tape Pool=FooFull 1st sat on dec at 06:10
Run = Differential Storage=Tape DifferentialPool=FooDiff 1st sat on jan-nov at 23:10
Run = Incremental Storage=Disk IncrementalPool=FooIncr sun-fri at 23:10
Run = Incremental Storage=Disk IncrementalPool=FooIncr 2nd-5th sat at 23:10
}

JobDefs {
Name = FooJob
Level = Incremental
Schedule = Foo
Storage = Tape 
Pool = ...
...
}

Job {
Name = FooHost
JobDefs = FooJob
Client = FooHost-fd
...
}

Client {
Name = FooHost-fd
...
}




Best,

Robert


++ German Research Center for Artificial Intelligence ++

Robert Wirth, Stuhlsatzenhausweg 3, D-66123 Saarbruecken
@office: +49-681-302-5078/5572 ++ @fax: +49-681-302-5020
mailto:[EMAIL PROTECTED] ++ http://www.dfki.de/~wirth






[Bacula-users] How enforce Bacula to use tapes one after another?

2006-06-12 Thread Robert Wirth
Hi!

I've got two LTO-3 drives and one autochanger device.  Backups of 
dozens of clients into several pools work well, in general.  My Bacula
version is 1.38.7.

There remains one annoying problem: 

I've defined a pool with two tapes for daily snapshot backups of 
some data of temporary interest from several clients.  
For simplicity, all snapshot jobs are defined identically. 
As a consequence, they all have the same start time.
  
The idea is that Bacula uses the first tape until it's full,
then continues with the second tape. 
The retention periods are chosen such that, when the second 
tape is completed (Full, Used, etc.), all data on the first tape 
will be out of retention, so the first tape can be recycled automatically.  
And vice versa...

But: good idea, bad implementation:

Bacula uses the tapes in an arbitrary order.  Some days, only
one tape is used.  Other days, both tapes are used.  As a
consequence, when tape recycling is needed some day, both tapes
will still contain current backup data.

For instance, have a look at the current situation (the jobs start daily
at 20:05, using the same Schedule and JobDefs):


Pool: Snapshot
+---------+-----+-----------------+----------+-----+---------------------+
| MediaId | ... | VolBytes        | VolFiles | ... | LastWritten         |
+---------+-----+-----------------+----------+-----+---------------------+
|      54 | ... | 497,058,209,311 |      520 | ... | 2006-06-09 20:14:29 |
|      86 | ... | 110,145,463,630 |      115 | ... | 2006-06-09 20:19:40 |
+---------+-----+-----------------+----------+-----+---------------------+

You can see: tape 54 (the first in the pool) is used mostly, but not
exclusively, although it is still in Append state.


I can't find a hint on whether/how I can instruct Bacula to use the
volumes of a pool consecutively.  I'd expect this to be the default
behaviour, but obviously it's not.

Is there any solution with Bacula?
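A crude workaround I've been considering, sketched as bconsole commands
(the volume names are illustrative, and I haven't verified this on 1.38.7):
take the second volume out of the running by hand until the first one fills.

```
# in bconsole; volume names are illustrative
update volume=tape-086 volstatus=Used     # keep the second tape out of rotation
# ... later, once tape-054 is marked Full:
update volume=tape-086 volstatus=Append   # make the second tape selectable again
```

Of course this needs manual attention each cycle, which is what I'd like to avoid.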

Best regards,

Robert


++ German Research Center for Artificial Intelligence ++

Robert Wirth, Stuhlsatzenhausweg 3, D-66123 Saarbruecken
@office: +49-681-302-5078/5572 ++ @fax: +49-681-302-5341 
mailto:[EMAIL PROTECTED] ++ http://www.dfki.de/~wirth








Re: [Bacula-users] How enforce Bacula to use tapes one after another?

2006-06-12 Thread Robert Wirth
> I'm pretty sure that the normal way of working for an SQL engine is to
> always use a higher number when creating new records. That's how the
> serial type in PostgreSQL works. I don't think an RDBMS tries to

My experience with MySQL is the same: MediaIds of deleted volumes aren't
reused.  A new volume gets a MediaId higher than the highest one ever used.
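For what it's worth, the behaviour is easy to demonstrate with a toy table.
The sketch below uses SQLite's AUTOINCREMENT as a stand-in for MySQL's
AUTO_INCREMENT; the Media/MediaId names are only illustrative, not Bacula's
real catalog schema:

```python
import sqlite3

# Toy demonstration: with AUTOINCREMENT, ids of deleted rows are never reused.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Media ("
             "MediaId INTEGER PRIMARY KEY AUTOINCREMENT, VolumeName TEXT)")
conn.execute("INSERT INTO Media (VolumeName) VALUES ('tape-054')")  # MediaId 1
conn.execute("INSERT INTO Media (VolumeName) VALUES ('tape-086')")  # MediaId 2

# Delete the highest-numbered row...
conn.execute("DELETE FROM Media WHERE MediaId = 2")

# ...and insert a new one: it gets the next-higher id, not the freed-up 2.
cur = conn.execute("INSERT INTO Media (VolumeName) VALUES ('tape-087')")
new_id = cur.lastrowid
print(new_id)  # 3
```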

Regards,

Robert


Robert Wirth, Stuhlsatzenhausweg 3, D-66123 Saarbruecken
@office: +49-681-302-5078/5572 ++ @fax: +49-681-302-5341 
mailto:[EMAIL PROTECTED] ++ http://www.dfki.de/~wirth







[Bacula-users] director crash(?) and empty report email

2006-05-24 Thread Robert Wirth
Hi!

I'm using bacula 1.38.7.  Today, when I changed the director's 
configuration a bit --just changed a MaxWaitTime entry in a JobDefs 
resource--, the director daemon terminated after the reload and sent
the email attached here. 

The email has a subject which I don't understand, and an empty body.
Thus, I can't figure out what was going wrong.

Can anybody give me a hint?  Around the same time, a backup job was
running that saved the catalog database.  I wonder whether the reload
itself, the reload during the backup, or something else caused the crash.


Regards,

Robert

[attached message: empty]

++ German Research Center for Artificial Intelligence ++

Robert Wirth, Stuhlsatzenhausweg 3, D-66123 Saarbruecken
@office: +49-681-302-5078/5572 ++ @fax: +49-681-302-5341 
mailto:[EMAIL PROTECTED] ++ http://www.dfki.de/~wirth



[Bacula-users] Question: decision how tapes are choosed from a pool

2006-05-11 Thread Robert Wirth
Hi,

I recently studied the Pool/Volumes section of the manual, but I didn't
find a hint about which algorithm Bacula uses when it has to choose
a tape from a pool for a given job.


Example (or my current problem, you know ;-)

- the system is an autochanger with barcodes, 60 slots and 2 LTO-3 drives,
  Bacula version 1.38.7, running well with dozens of clients, in principle

- there's a new pool with two freshly labeled, empty tapes

- two jobs are scheduled at the same time, doing full backups into that pool;
  the jobs have the same priority etc., only the client names differ

- the first tape is loaded into drive 0, and the first job completes
  successfully.  The tape remains in the drive, with some 100 MB written
  and in state Append

- then the second job unloads that tape, loads the other one into the same
  drive and writes to _that_ tape.  Why?  The first tape was OK and loaded!

- finally, both new tapes have been written to, and I will get problems
  in the future, because I want the two tapes used alternately with
  automatic volume retention and recycling.  The near-identical initial
  timestamps on the tapes will be confusing.


I'm wondering whether there's a deterministic algorithm in Bacula that
decides which tape to choose.  I've got some more pools with similar
configurations, and there the tapes are (so far) used one after another.
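One directive that might be related (a guess on my side, not verified
against 1.38.7): Prefer Mounted Volumes in the Job resource, which is
supposed to make the director favour a volume that is already loaded
in a drive:

```
Job {
  Name = ...
  ...
  # supposedly the default, but worth checking it wasn't set to no
  Prefer Mounted Volumes = yes
}
```

If someone knows how this interacts with two drives in one autochanger,
I'd be glad to hear it.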

Regards,

Robert


++ German Research Center for Artificial Intelligence ++

Robert Wirth, Stuhlsatzenhausweg 3, D-66123 Saarbruecken
@office: +49-681-302-5078/5572 ++ @fax: +49-681-302-5341 
mailto:[EMAIL PROTECTED] ++ http://www.dfki.de/~wirth







[Bacula-users] help with bacula-fd and inetd

2006-04-07 Thread Robert . Wirth
Hi!

I'm using bacula-fd via inetd.  This is an example line in inetd.conf:

bacula-fd  stream  tcp  nowait  root  /usr/local/sbin/bacula-fd  bacula-fd -i -c /etc/opt/bacula/bacula-fd.conf

As you can see, I'm using the -i option as required for running Bacula
under inetd.  This works well in the normal case, when everything is
alright: the file daemon is started for a backup, sends all its data to
the storage daemon and terminates correctly.

But if I try to cancel a job using bconsole, the communication between
director and file daemon fails.  I found that the director wants to
establish another connection to port 9102 (I guess, to send the cancel
command), and inetd then tries to start a second process instance of
the file daemon.  This instance fails because it finds the pidfile of
the first, still-running backup instance.

The same thing works well if I run the file daemon in the foreground.

Is this the expected behaviour?  Or am I missing some option/directive?
The system is Solaris 9 with its own inetd.  Or is this a general
limitation of inetd?

Regards,

Robert Wirth


++ German Research Center for Artificial Intelligence ++

Robert Wirth, Stuhlsatzenhausweg 3, D-66123 Saarbruecken
@office: +49-681-302-5078/5572 ++ @fax: +49-681-302-5341 
mailto:[EMAIL PROTECTED] ++ http://www.dfki.de/~wirth








[Bacula-users] storage forces jobs to wait

2006-04-07 Thread Robert . Wirth
Hi!

Just another problem.  What happens:

- the director starts a backup job.  The file daemon is not responding
  correctly, so the job is waiting.  That's ok.
- I cancel the job in the director via the console.
  'status director' shows the job as canceled.
- I ensure the file daemon is not running (it is started via inetd)
- I ensure all parameters are alright now and start the SAME job again.
  Now it could work, but...

When I look at 'status director', the canceled job is still canceled,
and the new job is 'waiting for max Storage'.  The same is true some
hours later.

Then I looked at 'status storage': there's a storage job waiting for
the file daemon to connect.  This storage job seems to correspond to
the first, canceled director job.

I understand that the director has initiated and then canceled the
first job, but the storage daemon either didn't understand the cancel
or was never informed about it by the director.  And the director is
blocked by the storage daemon (for this job's client) and is unable to
influence it in a way that cancels that 'storage job'!?

Again, the only solution was to stop and restart the storage daemon and
reconnect the director to it.

Is there any way to reload/reset the storage daemon softly from the director?

Best regards, 

Robert Wirth

--
+++ German Research Center for Artificial Intelligence +++
+++I N F R A  -  S T R U C T U R E  -  G R O U P   +++
--
DFKI GmbH, ISG, Stuhlsatzenhausweg 3, D-66123 Saarbruecken
@office: +49-681-302-5572/5514/5078 @fax: +49-681-302-5341
mailto:[EMAIL PROTECTED] ++ Germany ++ http://www.dfki.de/isg
--







[Bacula-users] BLOCKED device

2006-04-04 Thread Robert . Wirth
Hi,

what happened: a full backup was missing, so an incremental job was
upgraded.  But no volume was available for the full backup, so the
tape device is now blocked, waiting for the mount.

I decided to cancel the waiting job, because I didn't want to do the
full backup now (daily), and I'd like to run some other small backup
jobs now.  The cancel is done, but the device is still BLOCKED waiting
for media.  I tried to unmount, but it didn't help.

How can I tell the SD to forget everything about former mount/wait/block
state on a certain device (if no job is present anymore)?

In general, I could restart the daemon.  But this is no option at
the moment, because the second tape drive is running some other
backup jobs on another volume.

Again:  how can I tell the SD to unblock a device during runtime?
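Besides unmount, I also wondered about bconsole's release command; whether
it clears a blocked-for-mount drive in this situation, I can't say (the
storage name below is illustrative):

```
# in bconsole; storage name is illustrative
release storage=Tape
```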

Regards,

Robert Wirth


++ German Research Center for Artificial Intelligence ++

Robert Wirth, Stuhlsatzenhausweg 3, D-66123 Saarbruecken
@office: +49-681-302-5078/5572 ++ @fax: +49-681-302-5341 
mailto:[EMAIL PROTECTED] ++ http://www.dfki.de/~wirth





