> On Nov 16, 2018, at 12:11 PM, Austin S. Hemmelgarn <[email protected]>
> wrote:
>
> On 2018-11-16 12:27, Chris Miller wrote:
>> Hi Folks,
>> I'm unclear on the timing of the flush from holding disk to vtape. Suppose I
> run two backup jobs, and each uses the holding disk. When will the second job
>> start? Obviously, after the client has sent everything... Before the holding
>> disk flush starts, or after the holding disk flush has completed?
> If by 'jobs' you mean 'amanda configurations', the second one starts when you
> start it. Note that `amdump` does not return until everything is finished
> dumping and optionally taping if anything would be taped, so you can
> literally just run each one sequentially in a shell script and they won't run
> in parallel.
>
> If by 'jobs' you mean DLE's, they run as concurrently as you tell Amanda to
> run them. If you've got things serialized (`inparallel` is set to 1 in your
> config), then the next DLE will start dumping once the previous one is
> finished dumping to the holding disk. Otherwise, however many you've said
> can run in parallel run (within per-host limits), and DLE's start when the
> previous one in sequence for that dumper finishes. Taping can (by default)
> run in parallel with dumping if you're using a holding disk, which is
> generally a good thing, though you can also easily configure it to wait for
> some amount of data to be buffered on the holding disk before it starts
> taping.
>> Is there any way to defer the holding disk flush until all backup jobs for a
>> given night have completed?
> Generically, set `autoflush no` in each configuration, and then run `amflush`
> for each configuration once all the dumps are done.
>
> However, unless you've got an odd arrangement where every system saturates
> the network link while actually dumping and you are sharing a single link on
> the Amanda server for both dumping and taping, this actually probably won't
> do anything for your performance. You can easily configure Amanda to flush
> backups from each DLE as soon as they are done, and it will wait to exit
> until everything is actually flushed.
>
> Building from that, if you just want to ensure the `amdump` instances don't
> run in parallel, just use a tool to fire them off sequentially in the
> foreground. Stuff like Ansible is great for this (especially because you can
> easily conditionally back up your index and tapelist when the dump finishes).
> As long as the next `amdump` command isn't started until the previous one
> returns, you won't have to worry about them fighting each other for bandwidth.
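The sequential pattern Austin describes can be sketched as a small shell loop. The config names "set1" and "set2" below are placeholders for illustration, and the commands are only echoed so the sketch is safe to run as-is; drop the echo to run them for real (check `amflush`'s options on your version before relying on `-b`):

```shell
#!/bin/sh
# Run two Amanda configurations strictly in sequence. amdump does not
# return until dumping (and any taping) for that config is finished,
# so a plain foreground loop already serializes them.
run_amanda_configs() {
    for config in set1 set2; do
        echo amdump "$config"        # dumps (and tapes, if configured)
    done
    # With "autoflush no" in each config, flush to tape only after
    # every configuration has finished dumping:
    for config in set1 set2; do
        echo amflush -b "$config"    # -b: batch mode, no prompting
    done
}
run_amanda_configs
```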
Chris: you have some control over when DLEs go from the holding disk to the
actual tape (or vtape).
This paragraph is from the examples, and I keep it in my config files so I
remember how to set up these params:
# New amanda includes these explanatory paragraphs:
# flush-threshold-dumped, flush-threshold-scheduled, taperflush, and autoflush
# are used to control tape utilization. See the amanda.conf (5) manpage for
# details on how they work. Taping will not start until all criteria are
# satisfied. Here are some examples:
#
# You want to fill tapes completely even in the case of failed dumps, and
# don't care if some dumps are left on the holding disk after a run:
# flush-threshold-dumped 100 # (or more)
# flush-threshold-scheduled 100 # (or more)
# taperflush 100
# autoflush yes
#
# You want to improve tape performance by waiting for a complete tape of data
# before writing anything. However, all dumps will be flushed; none will
# be left on the holding disk.
# flush-threshold-dumped 100 # (or more)
# flush-threshold-scheduled 100 # (or more)
# taperflush 0
#
# You don't want to use a new tape for every run, but want to start writing
# to tape as soon as possible:
# flush-threshold-dumped 0 # (or more)
# flush-threshold-scheduled 100 # (or more)
# taperflush 100
# autoflush yes
# maxdumpsize 100k # amount of data to dump each run; see above.
#
# You want to keep the most recent dumps on holding disk, for faster recovery.
# Older dumps will be rotated to tape during each run.
# flush-threshold-dumped 300 # (or more)
# flush-threshold-scheduled 300 # (or more)
# taperflush 300
# autoflush yes
#
# Defaults:
# (no restrictions; flush to tape immediately; don't flush old dumps.)
#flush-threshold-dumped 0
#flush-threshold-scheduled 0
#taperflush 0
#autoflush no
#
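The "taping will not start until all criteria are satisfied" rule above can be modeled as a tiny predicate. This is only a toy illustration of the two threshold checks as the manpage excerpts describe them, not Amanda's actual code; all percentages are relative to volume size:

```shell
#!/bin/sh
# Toy model of when taping may begin. Arguments:
#   $1 = data already dumped to the holding disk (%)
#   $2 = estimated data still scheduled to be dumped this run (%)
#   $3 = flush-threshold-dumped
#   $4 = flush-threshold-scheduled
taping_may_start() {
    [ "$1" -ge "$3" ] && [ $(($1 + $2)) -ge "$4" ]
}

# Defaults (0/0): taping starts as soon as anything is dumped.
taping_may_start 1 0 0 0 && echo "start" || echo "wait"        # start
# "Full tape first" example (100/100): 60% dumped, 20% scheduled,
# so neither criterion is satisfied yet.
taping_may_start 60 20 100 100 && echo "start" || echo "wait"  # wait
```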
—————
Here is part of my setup, with further comments beside each param. I may have
written some of these comments myself, so I hope they are completely correct;
I think they are.
———————
## with LTO5 tapes, as of 2/27/2015, I still only USE one tape.
## Don't faff around; just write to the silly tape. But to avoid
## shoe shining, let some amount accumulate. Else we'd be writing
## the first tiny file and then waiting .....
## see <othernode> if you need LTO5 settings.
## Enzo is using an LTO4, so revert to these:
# You want to improve tape performance by waiting for a complete tape of data
# before writing anything. However, all dumps will be flushed; none will
# be left on the holding disk.
# flush-threshold-dumped 100 # (or more)
# flush-threshold-scheduled 100 # (or more)
# taperflush 0
flush-threshold-dumped 100      # Default: 0.
        # Amanda will not begin writing data to a new tape volume
        # until the amount of data on the holding disk is at least
        # this percentage of the volume size. The idea is to
        # accumulate a bunch of files, so the "Greedy Algorithm"
        # fill algorithm has some choices to work with.
        # The value of this parameter may not exceed that of the
        # flush-threshold-scheduled parameter.
flush-threshold-scheduled 100   # Default: 0.
        # Amanda will not begin writing data to a new volume until
        # the sum of the amount of data on the holding disk and the
        # estimated amount of data remaining to be dumped during
        # this run is at least this percentage of the volume size.
        # The value of this parameter may not be less than that of
        # the flush-threshold-dumped or taperflush parameters.
taperflush 0                    # Default: 0.
        # At the end of a run, Amanda will start a new tape to flush
        # remaining data if there is more data on the holding disk at
        # the end of a run than this setting allows; the amount is
        # specified as a percentage of the capacity of a single volume.
        #### dsbdsb i.e. 0 == start a new tape if any data is still
        #### on the holding disk. Good.
## Constraints: taperflush <= flush-threshold-scheduled
##              flush-threshold-dumped <= flush-threshold-scheduled
#autoflush yes  # only flushes dumps NAMED on the command line. Use ALL. 6/28/13
autoflush all   # flush leftovers from a crash, or a ran-out-of-tape condition
NOTE THE PART THAT SURPRISED ME A FEW VERSIONS BACK:
autoflush takes the values no / yes / all,
and “yes” and “all” behave slightly differently.
Deb Baddorf
Fermilab