On Friday 16 November 2018 14:51:21 Debra S Baddorf wrote:
> > On Nov 16, 2018, at 1:37 PM, Gene Heskett <[email protected]>
> > wrote:
> >
> > On Friday 16 November 2018 13:59:59 Debra S Baddorf wrote:
> >>> On Nov 16, 2018, at 12:11 PM, Austin S. Hemmelgarn
> >>> <[email protected]> wrote:
> >>>
> >>> On 2018-11-16 12:27, Chris Miller wrote:
> >>>> Hi Folks,
> >>>> I'm unclear on the timing of the flush from holding disk to
> >>>> vtape. Suppose I run two backup jobs, and each uses the holding
> >>>> disk. When will the second job start? Obviously, after the client
> >>>> has sent everything... Before the holding disk flush starts, or
> >>>> after the holding disk flush has completed?
> >>>
> >>> If by 'jobs' you mean 'amanda configurations', the second one
> >>> starts when you start it. Note that `amdump` does not return
> >>> until everything is finished dumping, and optionally taping if
> >>> anything would be taped, so you can literally just run each one
> >>> sequentially in a shell script and they won't run in parallel.
> >>>
> >>> If by 'jobs' you mean DLEs, they run as concurrently as you tell
> >>> Amanda to run them. If you've got things serialized (`inparallel`
> >>> is set to 1 in your config), then the next DLE will start dumping
> >>> once the previous one is finished dumping to the holding disk.
> >>> Otherwise, however many you've said can run in parallel will run
> >>> (within per-host limits), and DLEs start when the previous one in
> >>> sequence for that dumper finishes. Taping can (by default) run in
> >>> parallel with dumping if you're using a holding disk, which is
> >>> generally a good thing, though you can also easily configure it to
> >>> wait for some amount of data to be buffered on the holding disk
> >>> before it starts taping.
> >>>
> >>>> Is there any way to defer the holding disk flush until all backup
> >>>> jobs for a given night have completed?
> >>>
> >>> Generically, set `autoflush no` in each configuration, and then
> >>> run `amflush` for each configuration once all the dumps are done.
> >>>
> >>> However, unless you've got an odd arrangement where every system
> >>> saturates the network link while actually dumping and you are
> >>> sharing a single link on the Amanda server for both dumping and
> >>> taping, this probably won't do anything for your performance. You
> >>> can easily configure Amanda to flush backups from each DLE as soon
> >>> as they are done, and it will wait to exit until everything is
> >>> actually flushed.
> >>>
> >>> Building from that, if you just want to ensure the `amdump`
> >>> instances don't run in parallel, just use a tool to fire them off
> >>> sequentially in the foreground. Stuff like Ansible is great for
> >>> this (especially because you can easily conditionally back up your
> >>> index and tapelist when the dump finishes). As long as the next
> >>> `amdump` command isn't started until the previous one returns, you
> >>> won't have to worry about them fighting each other for bandwidth.
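For example, a minimal shell wrapper in that spirit might look like the
sketch below ("daily" and "weekly" are placeholder configuration names,
not ones taken from this thread):

    #!/bin/sh
    # Sketch: run as the Amanda user. amdump returns only when dumping
    # (and any taping) for a config has finished, so the runs never
    # overlap.
    for cfg in daily weekly; do
        amdump "$cfg"
    done
    # With 'autoflush no' in each config, nothing was taped above;
    # flush each config's holding disk once all dumps are complete.
    # -b = batch mode (no prompting), -f = stay in the foreground.
    for cfg in daily weekly; do
        amflush -b -f "$cfg"
    done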
> >> Chris: you have some control over when DLEs go from the holding
> >> disk to the actual tape (or vtape). This paragraph is from the
> >> examples, and I keep it in my config files so I remember how to
> >> set up these params:
> >> # New amanda includes these explanatory paragraphs:
> >>
> >> # flush-threshold-dumped, flush-threshold-scheduled, taperflush, and autoflush
> >> # are used to control tape utilization. See the amanda.conf (5) manpage for
> >> # details on how they work. Taping will not start until all criteria are
> >> # satisfied. Here are some examples:
> >> #
> >> # You want to fill tapes completely even in the case of failed dumps, and
> >> # don't care if some dumps are left on the holding disk after a run:
> >> # flush-threshold-dumped 100    # (or more)
> >> # flush-threshold-scheduled 100 # (or more)
> >> # taperflush 100
> >> # autoflush yes
> >> #
> >> # You want to improve tape performance by waiting for a complete tape of data
> >> # before writing anything. However, all dumps will be flushed; none will
> >> # be left on the holding disk.
> >> # flush-threshold-dumped 100    # (or more)
> >> # flush-threshold-scheduled 100 # (or more)
> >> # taperflush 0
> >> #
> >> # You don't want to use a new tape for every run, but want to start writing
> >> # to tape as soon as possible:
> >> # flush-threshold-dumped 0      # (or more)
> >> # flush-threshold-scheduled 100 # (or more)
> >> # taperflush 100
> >> # autoflush yes
> >> # maxdumpsize 100k              # amount of data to dump each run; see above.
> >> #
> >> # You want to keep the most recent dumps on holding disk, for faster recovery.
> >> # Older dumps will be rotated to tape during each run.
> >> # flush-threshold-dumped 300    # (or more)
> >> # flush-threshold-scheduled 300 # (or more)
> >> # taperflush 300
> >> # autoflush yes
> >> #
> >> # Defaults:
> >> # (no restrictions; flush to tape immediately; don't flush old dumps.)
> >> #flush-threshold-dumped 0
> >> #flush-threshold-scheduled 0
> >> #taperflush 0
> >> #autoflush no
> >> #
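To make the percentages concrete: with a 40 GB volume (the vtape size
mentioned later in this thread), the "complete tape of data" example
above works out as in this sketch; the volume size is an assumption,
and the comments just restate the manpage semantics.

    # Hedged sketch for a hypothetical 40 GB (v)tape; percentages are
    # relative to the volume size, and taping starts only once BOTH
    # thresholds are satisfied.
    flush-threshold-dumped    100   # wait until >= 40 GB of finished
                                    # dumps sit on the holding disk
    flush-threshold-scheduled 100   # ...and finished dumps plus dumps
                                    # still expected total >= 40 GB
    taperflush 0                    # end-of-run flush: leave nothing
                                    # behind, starting a new volume if
                                    # anything remains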
> >> —————
> >> Here is part of my setup, with further comments beside each param.
> >> I may have written some of these comments, so I hope they are
> >> completely correct. I think they are.
> >> ———————
> >> ## with LTO5 tapes, as of 2/27/2015, I still only USE one tape.
> >> ## Don't faff around; just write to the silly tape. But to avoid
> >> ## shoe-shining, let some amount accumulate. Else we'd be writing
> >> ## the first tiny file and then waiting .....
> >>
> >> ## see <othernode> if you need LTO5 settings.
> >> ## Enzo is using an LTO4, so revert to these:
> >>
> >> # You want to improve tape performance by waiting for a complete tape of data
> >> # before writing anything. However, all dumps will be flushed; none will
> >> # be left on the holding disk.
> >> # flush-threshold-dumped 100    # (or more)
> >> # flush-threshold-scheduled 100 # (or more)
> >> # taperflush 0
> >>
> >> flush-threshold-dumped 100    # Default: 0.
> >> # Amanda will not begin writing data to a new tape volume
> >> # until the amount of data on the holding disk is at least this percentage
> >> # of the volume size. The idea is to accumulate a bunch of files,
> >> # so the fill algorithm (the "Greedy Algorithm") has some choices to work with.
> >> # The value of this parameter may not exceed that of the
> >> # flush-threshold-scheduled parameter.
> >>
> >> flush-threshold-scheduled 100 # Default: 0.
> >> # Amanda will not begin writing data to a new volume until the sum of
> >> # the amount of data on the holding disk and the estimated amount of data
> >> # remaining to be dumped during this run is at least this percentage of
> >> # the volume size.
> >> # The value of this parameter may not be less than that of the
> >> # flush-threshold-dumped or taperflush parameters.
> >>
> >> taperflush 0                  # Default: 0.
> >> # At the end of a run, Amanda will start a new tape to flush remaining data
> >> # if there is more data on the holding disk at the end of a run than this
> >> # setting allows; the amount is specified as a percentage of the capacity
> >> # of a single volume.
> >> #### dsbdsb i.e. 0 == start a new tape if any data is still on holding disk.
> >> #### Good.
> >> ## taperflush <= flush-threshold-scheduled
> >> ## flush-threshold-dumped <= flush-threshold-scheduled
> >>
> >> #autoflush yes # only flushes those NAMED on the command line. Use ALL. 6/28/13
> >> autoflush all  # flush leftovers from a crash, or a ran-out-of-tape condition
> >>
> >> NOTE THE PART THAT SURPRISED ME, A FEW VERSIONS BACK:
> >> autoflush has values of no / yes / all
> >> “yes” and “all” behave slightly differently.
> >>
> >> Deb Baddorf
> >> Fermilab
> >
> > Thank you a bunch Deb, that is better explained than the manpages
> > ever do.
> >
> > Copyright 2018 by Maurice E. Heskett
> > --
> > Cheers, Gene Heskett
>
> :D Glad it wasn’t “too much info”. I think my concepts have been
> gathered from past dialogs on this mailing list.
>
> Note that the 100 settings require you to have holding-disk space at
> least equal to one tape/vtape space.
>
> Deb
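As a sketch of what that rule of thumb can look like in amanda.conf
(the path and sizes here are illustrative assumptions, not values from
this thread):

    # Illustrative holding-disk definition: with 40 GB (v)tapes and
    # 100-percent thresholds, 'use' should allow at least one full
    # volume's worth of dumps, plus some slack.
    holdingdisk hd1 {
        directory "/dumps/amanda"   # hypothetical mount point
        use 50 Gb                   # >= one 40 GB volume, with headroom
        chunksize 1 Gb              # split big dumps into 1 GB chunks
    }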
It's in /usr/dumps, and can currently use 650 to 700 GB, as that 1 TB
drive is currently at 27%. vtapes are 40 GB ATM, so it looks like I'm
good to very good there. Thanks Deb.

Copyright 2018 by Maurice E. Heskett
--
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page <http://geneslinuxbox.net:6309/gene>
