On Friday 2016-11-04 10:33:03 jahlives wrote:
> Hi Josip
> 
> Am 04.11.2016 um 09:54 schrieb Josip Deanovic:
> > I am guessing that you have other jobs with higher priority that
> > got run before the job you are monitoring, and by the time
> > bacula was ready to run it, its scheduled time was already past
> > by more than 600 seconds (10 minutes) and the job is configured to
> > fail in that case.
> 
> maybe you got me wrong :-)

Maybe, but I am still convinced that I understood your case correctly.
I hope that others will add their comments on the subject as well.

> It's not the question why this job has been canceled. It's canceled
> because this is an hourly job and huge daily backups run around this
> time. So we decided to "kill" the hourly jobs quite fast, as an hour
> later will be the next run.

OK, but wouldn't it be a better idea to let the current job complete
and cancel the next job if it fails to start at its scheduled time?
Otherwise you might end up with backup jobs constantly failing and
with no useful backup available. For that you would just need to
increase the "Max Run Time".

> The question was more: why is the job not completely canceled by 02:10
> (10min after scheduled starttime)? Why did it take until 02:36 to cancel
> it, although bacula logged at 02:10 that the job is canceled?
> 
> 04-Nov 02:10 zabbix-dir JobId 571889: Fatal error: Max run time
> exceeded. Job canceled.
> 04-Nov 02:36 zabbix-dir JobId 571889: Fatal error: Job canceled because
> max start delay time exceeded.

Because, according to the report you got from bacula, the job didn't
start at 02:00; it started at 02:36.

I think that by reporting all the possible reasons for canceling the
job, bacula misled you into thinking that it failed because the Max
run time was exceeded and not because the Max start delay was
exceeded.
The truth is that both settings apply here, but the job didn't start
at 02:00. It started at 02:36, and you have "Max Start Delay" set
to 10 minutes, which means that as soon as it started at 02:36 it was
already a candidate for cancellation, which is what bacula did.

I can't say why it started at 02:36, but you could look into
what prevented the job from running at the scheduled time (02:00).

Also, there is a helpful chart here:
http://www.bacula.org/5.2.x-manuals/en/main/main/img24.png


I'll paste this from my last message because I think it's useful:

Notice what you are asking bacula to do:
> Max Run Time = 600
> Max Run Sched Time = 600
> Max Start Delay = 600

And what bacula actually did (according to its report):
> Scheduled time:         04-Nov-2016 02:00:00
> Start time:             04-Nov-2016 02:36:34
> End time:               04-Nov-2016 02:36:34
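
As a quick sanity check of those numbers, here is a minimal Python
sketch. It is purely illustrative and not anything bacula itself runs;
the variable names are made up, and the values are copied from the
configuration and report quoted above. It only compares the gap between
the scheduled time and the start time against "Max Start Delay":

```python
from datetime import datetime

# Values copied from the job report quoted above.
scheduled = datetime(2016, 11, 4, 2, 0, 0)    # Scheduled time: 04-Nov-2016 02:00:00
started = datetime(2016, 11, 4, 2, 36, 34)    # Start time:     04-Nov-2016 02:36:34

# Value copied from the job configuration ("Max Start Delay = 600").
max_start_delay = 600  # seconds

# Gap between when the job was scheduled and when it actually started.
start_delay = int((started - scheduled).total_seconds())

print(f"start delay: {start_delay} s (limit {max_start_delay} s)")
print("exceeds Max Start Delay:", start_delay > max_start_delay)
```

The gap comes out at 2194 seconds, well over the 600 second limit, which
fits the "max start delay time exceeded" message logged at 02:36.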


-- 
Josip Deanovic
