> On Nov 16, 2018, at 1:37 PM, Gene Heskett <[email protected]> wrote:
>
> On Friday 16 November 2018 13:59:59 Debra S Baddorf wrote:
>
>>> On Nov 16, 2018, at 12:11 PM, Austin S. Hemmelgarn
>>> <[email protected]> wrote:
>>>
>>> On 2018-11-16 12:27, Chris Miller wrote:
>>>> Hi Folks,
>>>> I'm unclear on the timing of the flush from holding disk to vtape.
>>>> Suppose I run two backup jobs, and each uses the holding disk. When
>>>> will the second job start? Obviously, after the client has sent
>>>> everything... Before the holding disk flush starts, or after the
>>>> holding disk flush has completed?
>>>
>>> If by 'jobs' you mean 'amanda configurations', the second one starts
>>> when you start it. Note that `amdump` does not return until
>>> everything is finished dumping and optionally taping if anything
>>> would be taped, so you can literally just run each one sequentially
>>> in a shell script and they won't run in parallel.
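>>>
>>> For instance, a minimal wrapper along these lines (the configuration
>>> names DailySet1 and DailySet2 are placeholders for whatever your
>>> configurations are called) runs them back to back:
>>>
>>>     #!/bin/sh
>>>     # Run each Amanda configuration in sequence.  amdump blocks until
>>>     # dumping (and any taping) for that run has finished, so the second
>>>     # run cannot overlap the first.
>>>     for config in DailySet1 DailySet2; do
>>>         amdump "$config" || echo "amdump $config exited non-zero" >&2
>>>     done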
>>>
>>> If by 'jobs' you mean DLE's, they run as concurrently as you tell
>>> Amanda to run them. If you've got things serialized (`inparallel`
>>> is set to 1 in your config), then the next DLE will start dumping
>>> once the previous one is finished dumping to the holding disk.
>>> Otherwise, as many DLE's run at once as you've allowed (within
>>> per-host limits), and each DLE starts when the previous one in sequence
>>> for that dumper finishes.  Taping can (by default) run in parallel
>>> with dumping if you're using a holding disk, which is generally a
>>> good thing, though you can also easily configure it to wait for some
>>> amount of data to be buffered on the holding disk before it starts
>>> taping.
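>>>
>>> As a sketch of the relevant server-side knobs (the numbers and the
>>> dumptype name are only illustrative; see amanda.conf(5) for the exact
>>> semantics):
>>>
>>>     # amanda.conf -- illustrative values only
>>>     inparallel 4                  # up to 4 dumpers run at once across all clients
>>>     flush-threshold-dumped 50     # optionally: wait until half a volume of finished
>>>     flush-threshold-scheduled 50  # dumps is buffered before taping starts
>>>
>>>     define dumptype global-settings {
>>>         maxdumps 1                # at most 1 simultaneous dump per client host
>>>     }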
>>>
>>>> Is there any way to defer the holding disk flush until all backup
>>>> jobs for a given night have completed?
>>>
>>> Generically, set `autoflush no` in each configuration, and then run
>>> `amflush` for each configuration once all the dumps are done.
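>>>
>>> A rough sketch of that (configuration names are placeholders; check
>>> amflush(8) for the flags on your version):
>>>
>>>     #!/bin/sh
>>>     # With "autoflush no" in each configuration, the dumps stay on the
>>>     # holding disk; once every amdump run has finished, flush each config.
>>>     for config in DailySet1 DailySet2; do
>>>         amflush -b -f "$config"   # -b: no prompting, -f: wait for the flush to finish
>>>     done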
>>>
>>> However, unless you've got an odd arrangement where every system
>>> saturates the network link while actually dumping and you are
>>> sharing a single link on the Amanda server for both dumping and
>>> taping, this actually probably won't do anything for your
>>> performance. You can easily configure amanda to flush backups from
>>> each DLE as soon as they are done, and it will wait to exit until
>>> everything is actually flushed.
>>>
>>> Building from that, if you just want to ensure the `amdump`
>>> instances don't run in parallel, just use a tool to fire them off
>>> sequentially in the foreground. Stuff like Ansible is great for
>>> this (especially because you can easily conditionally back up your
>>> index and tapelist when the dump finishes). As long as the next
>>> `amdump` command isn't started until the previous one returns, you
>>> won't have to worry about them fighting each other for bandwidth.
>>
>> Chris: you have some control over when DLEs go from the holding disk
>> to the actual tape (or vtape).  This paragraph is from the examples,
>> and I keep it in my config files so I remember how to set up these
>> params:
>>
>> # New amanda includes these explanatory paragraphs:
>>
>> # flush-threshold-dumped, flush-threshold-scheduled, taperflush, and autoflush
>> # are used to control tape utilization.  See the amanda.conf(5) manpage for
>> # details on how they work.  Taping will not start until all criteria are
>> # satisfied.  Here are some examples:
>> #
>> # You want to fill tapes completely even in the case of failed dumps, and
>> # don't care if some dumps are left on the holding disk after a run:
>> #   flush-threshold-dumped 100    # (or more)
>> #   flush-threshold-scheduled 100 # (or more)
>> #   taperflush 100
>> #   autoflush yes
>> #
>> # You want to improve tape performance by waiting for a complete tape of data
>> # before writing anything.  However, all dumps will be flushed; none will
>> # be left on the holding disk.
>> #   flush-threshold-dumped 100    # (or more)
>> #   flush-threshold-scheduled 100 # (or more)
>> #   taperflush 0
>> #
>> # You don't want to use a new tape for every run, but want to start writing
>> # to tape as soon as possible:
>> #   flush-threshold-dumped 0      # (or more)
>> #   flush-threshold-scheduled 100 # (or more)
>> #   taperflush 100
>> #   autoflush yes
>> #   maxdumpsize 100k              # amount of data to dump each run; see above.
>> #
>> # You want to keep the most recent dumps on holding disk, for faster
>> # recovery.  Older dumps will be rotated to tape during each run.
>> #   flush-threshold-dumped 300    # (or more)
>> #   flush-threshold-scheduled 300 # (or more)
>> #   taperflush 300
>> #   autoflush yes
>> #
>> # Defaults:
>> # (no restrictions; flush to tape immediately; don't flush old dumps.)
>> #flush-threshold-dumped 0
>> #flush-threshold-scheduled 0
>> #taperflush 0
>> #autoflush no
>> #
>> —————
>> Here is part of my setup, with further comments beside each param. I
>> may have written some of these comments, so I hope they are completely
>> correct. I think they are.
>> ———————
>> ## with LTO5 tapes, as of 2/27/2015, I still only USE one tape.
>> ## Don't faff around; just write to the silly tape. But to avoid
>> ## shoe shining, let some amount accumulate. Else we'd be writing
>> ## the first tiny file and then waiting .....
>>
>> ## see <othernode> if you need LTO5 settings.
>> ## Enzo is using an LTO4, so revert to these:
>>
>> # You want to improve tape performance by waiting for a complete tape of data
>> # before writing anything.  However, all dumps will be flushed; none will
>> # be left on the holding disk.
>> #   flush-threshold-dumped 100    # (or more)
>> #   flush-threshold-scheduled 100 # (or more)
>> #   taperflush 0
>>
>> flush-threshold-dumped 100        # Default: 0.
>>         # Amanda will not begin writing data to a new tape volume
>>         # until the amount of data on the holding disk is at least this
>>         # percentage of the volume size.  The idea is to accumulate a bunch
>>         # of files, so the fill algorithm ("Greedy Algorithm") has some
>>         # choices to work with.  The value of this parameter may not exceed
>>         # that of the flush-threshold-scheduled parameter.
>>
>> flush-threshold-scheduled 100     # Default: 0.
>>         # Amanda will not begin writing data to a new volume until the sum
>>         # of the amount of data on the holding disk and the estimated amount
>>         # of data remaining to be dumped during this run is at least this
>>         # percentage of the volume size.
>>         # The value of this parameter may not be less than that of the
>>         # flush-threshold-dumped or taperflush parameters.
>>
>>
>> taperflush 0                      # Default: 0.
>>         # At the end of a run, Amanda will start a new tape to flush
>>         # remaining data if there is more data on the holding disk at the
>>         # end of a run than this setting allows; the amount is specified as
>>         # a percentage of the capacity of a single volume.
>>         #### dsbdsb  i.e. 0 == start a new tape if any data is still on
>>         #### the holding disk.  Good.
>>         ## taperflush <= flush-threshold-scheduled
>>         ## flush-threshold-dumped <= flush-threshold-scheduled
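>>
>> To make those percentages concrete, here is a rough worked example (the
>> ~800 GB figure is native LTO-4 capacity and is only illustrative):
>>
>>     # Assuming a volume of roughly 800 GB native:
>>     #   flush-threshold-dumped 100    -> taping waits until >= 800 GB of finished
>>     #                                    dumps sit on the holding disk
>>     #   flush-threshold-scheduled 100 -> and (dumped data + data still expected
>>     #                                    this run) is also >= 800 GB
>>     #   taperflush 0                  -> at the end of the run, start another tape
>>     #                                    if anything at all is left on the holding disk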
>>
>> #autoflush yes   # only flushes those NAMED on the command line.  Use ALL.  6/28/13
>> autoflush all    # flush leftovers from a crash, or a ran-out-of-tape condition
>>
>> NOTE THE PART THAT SURPRISED ME, A FEW VERSIONS BACK:
>> autoflush has values of no / yes / all
>> “yes” and “all” behave slightly differently.
>>
>> Deb Baddorf
>> Fermilab
>
> Thank you a bunch, Deb; that is better explained than the manpages ever
> manage.
>
>
> Copyright 2018 by Maurice E. Heskett
> --
> Cheers, Gene Heskett
:D Glad it wasn’t “too much info”.  I think my concepts have been gathered
from past dialogs on this mailing list.
Note that the 100 settings require you to have holding-disk space at least
equal to one tape/vtape space.
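
For instance, a holding-disk definition sized to hold at least one full volume
might look roughly like this (the path and sizes are placeholders; see the
holdingdisk section of amanda.conf(5) for the exact syntax on your version):

    holdingdisk hd1 {
        directory "/amanda/holding"   # placeholder path
        use 900 Gb                    # comfortably more than one ~800 GB volume
        chunksize 1 Gb                # split big dumps into chunks on the holding disk
    }
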
Deb