On 2019-12-03 06:56, Radosław Korzeniewski wrote:
> Hello,
> 
> pon., 2 gru 2019 o 20:05 Phil Stracchino <ph...@caerllewys.net
> <mailto:ph...@caerllewys.net>> napisał(a):
> 
>     It seems to me like this is a bug, and the selection criteria for a COPY
>     job should not be resolved until the COPY job is actually started.
> 
> 
> Well, I see a contradiction here. You want to order the dinner but you
> will tell what dish you want after a cook start cooking. But what dish
> he should start cooking? So it won't work, IMHO.

No, Radoslaw, I think you're missing my point.


Here's what the schedule looks like:

03:10 - Clean database
04:30 - Client backups scheduled to start, with priority 10
04:55 - Database backup scheduled to start, with priority 15
05:00 - Archive copy job scheduled to start, with priority 20


What happens now:

The DB cleanup is done long before any jobs start.
At 04:30, all of the backups start.
At 04:55, the catalog backup is added to the queue, but because it is at
priority 15, it cannot RUN until all the other jobs complete.
At 05:00, the archive copy job gets added to the queue.  At priority 20,
it cannot RUN until all the client backups AND the catalog backup - YOU
KNOW, THE JOBS IT'S SUPPOSED TO COPY - complete.

But its job selection query is run NOW, even though it will be nearly
seven hours before it can actually RUN.  None of the jobs it is to copy
have completed yet, so they do not match its selection criteria.

Along about 11:something, the last of the client backups completes.  NOW
the catalog backup can run.  It executes its run-before-job database
dump (yesterday it executed the dump at 11:37), dumping the state of the
catalog NOW, then backs up those dump files.

Now, finally, at about 11:50, having twiddled its thumbs for nearly
seven hours, the archive copy job starts, concludes that SEVEN HOURS AGO
there was nothing for it to do, picks up its marbles and goes home,
having copied nothing.  NOW there are six jobs for it to copy.  But it
won't copy them because its selection criteria were evaluated not when
it STARTED, but when it was QUEUED.



What I am saying SHOULD happen is that the selection query for a SQL
QUERY selection type should not be evaluated when the job is QUEUED.  It
should be evaluated when the job actually STARTS.  And I assert that
evaluating that query when the job gets queued, hours or even possibly
days before the job actually STARTS, is a bug.
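
To make the difference concrete, here is a minimal sketch in Python.  It
is NOT Bacula code: select_jobs() is a hypothetical stand-in for the SQL
selection query, and the job names and finish times are invented to
roughly match the timeline above.  It only shows how the same query
gives different answers depending on when it is evaluated.

```python
from datetime import time

# Hypothetical finish times for the six jobs the copy job should pick up
# (five client backups plus the catalog backup). Invented for illustration.
FINISHED_AT = {
    "client-a": time(6, 40),
    "client-b": time(7, 55),
    "client-c": time(9, 10),
    "client-d": time(10, 25),
    "client-e": time(11, 30),
    "catalog": time(11, 45),
}


def select_jobs(now):
    """Stand-in for the selection query: jobs already completed at `now`."""
    return sorted(name for name, done in FINISHED_AT.items() if done <= now)


QUEUED = time(5, 0)     # copy job is added to the queue
STARTED = time(11, 50)  # copy job is finally allowed to run

# Query evaluated when the job is queued: nothing has finished yet.
selected_when_queued = select_jobs(QUEUED)

# Same query evaluated when the job actually starts.
selected_when_started = select_jobs(STARTED)

print("evaluated at queue time:", selected_when_queued)
print("evaluated at start time:", selected_when_started)
```

With the query evaluated at queue time the list is empty, and with the
same query evaluated at start time all six jobs are selected.  Nothing
about the schedule or the priorities has to change for that to work.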

You say,
"You want to order the dinner but you will tell what dish you want after
a cook start cooking. But what dish he should start cooking?"

Well, let's look at that another way.  The way the selection query is
evaluated NOW, if I want to copy a set of jobs, I cannot SCHEDULE the
copy job until after the set of jobs it is supposed to copy have
finished.  But I don't know what time the jobs it is to copy will
finish.  Perhaps there will be a delay.  Perhaps one job will run
slowly.  So how can I know what time I can safely schedule the copy job for?

...I can't.  Because I can't know when the previous jobs will finish,
and Bacula does not wait until the copy job actually STARTS to evaluate
its selection query.


Do you see the problem?  The only thing I can do is schedule the archive
copy FAR, FAR out to be certain the jobs it is to copy have finished.
But then I have a window of vulnerability during which those monthly
full backups have not yet been copied off-system.  (I plan eventually to
extend this to at least the weekly Differentials as well.)




-- 
  Phil Stracchino
  Babylon Communications
  ph...@caerllewys.net
  p...@co.ordinate.org
  Landline: +1.603.293.8485
  Mobile:   +1.603.998.6958


_______________________________________________
Bacula-devel mailing list
Bacula-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-devel
