On 2018-07-27 12:23, Stefan G. Weichinger wrote:
On 2018-07-27 17:02, Jean-Francois Malouin wrote:
You should also consider playing with dumporder.
I have it set to 'TTTTTTTT' and that makes the longest (time-wise)
dumps go first so that the fast ones get pushed to the end.
In one config I have:
dumporder "TTTTTTTT"
flush-threshold-dumped 100
flush-threshold-scheduled 100
taperflush 100
autoflush yes
so that all the dumps will wait until the longest ones are done.
It also won't go until it can fill one volume (100%). You can
obviously go further than that if you have enough holding disk.
Or at least it's my understanding...
(the ML was down for a while, which is the reason for my delayed
response; it should work now)
I checked "dumporder" in that config, it was "BTBT...", I changed it to
"TTT..." now for a test.
Although I am not 100% convinced that this will do the trick ;-)
We will see.
I never fully understood that parameter and its influence so far; to me
it's a bit "unintuitive".
Perhaps I can help with that.
Part of what Amanda's scheduling does is figure out the size that each
backup will be on each run (based on the estimate process), how much
bandwidth it will need while dumping (based on the bandwidth settings
for that particular dump type), and the amount of time it will take
(predicted based on the size, prior timing data, and possibly the
bandwidth). That information is then used together with the 'dumporder'
setting to control how each dumper chooses what dump to do next when it
finishes dumping. Each letter in the value corresponds to exactly one
dumper, and controls only that dumper's selection.
The size-based selection is generally the easiest to explain: it just
says to pick the largest (for 'S') or smallest (for 's') dump out of the
set and run that next.
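In rough Python terms (just my own illustration, not Amanda's code; the
hosts, numbers, and the "size" field are made up), 'S' and 's' boil down
to something like:

# estimated sizes of the dumps still waiting to run
pending = [{"host": "a", "size": 50}, {"host": "b", "size": 5}]
biggest  = max(pending, key=lambda d: d["size"])  # what an 'S' dumper picks
smallest = min(pending, key=lambda d: d["size"])  # what an 's' dumper picks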
The bandwidth-based selection is only relevant if you have bandwidth
settings configured. Without them, it treats all dumps as equal, and
picks the next dump based solely on the order that amanda has them
sorted (which, IIRC, matches the order found in the disk list). With
them, it uses a similar selection method to the size-based selection,
just looking at bandwidth instead of size.
The time-based selection is where things get tricky, but they get tricky
because of how complicated it is to predict how long a dump will take,
not because the selection is complicated (it works just like size-based
selection, just looking at estimated runtime instead of size). Pretty
much, the timing data is extrapolated by looking at previous dumps of
the DLE, correlating size and actual run-time. I'm not sure what
fitting method it uses for the extrapolation (my first guess would be
simple linear extrapolation, because that's easy and should work most of
the time), and I'm also not sure what, if any, impact bandwidth has on
the calculation.
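Just to make that concrete, here's the kind of extrapolation I mean as a
Python sketch. This is purely my own illustration of "simple linear
extrapolation", not Amanda's actual code, and the numbers are invented:

# Guess a dump's runtime from (size, runtime) pairs recorded for
# previous dumps of the same DLE, assuming runtime scales roughly
# linearly with size.  Illustration only.
def predict_runtime(history, new_size):
    if not history:
        return None  # nothing to extrapolate from
    total_size = sum(size for size, _ in history)
    total_time = sum(secs for _, secs in history)
    if total_size == 0:
        return None
    return new_size * (total_time / total_size)

# Three prior dumps at roughly 10 MB/s; a 6 GB dump comes out
# at about 600 seconds.
print(predict_runtime([(1_000_000_000, 100),
                       (2_000_000_000, 210),
                       (500_000_000, 45)], 6_000_000_000))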
So, in short you have:
* 'S' and 's': Simple deterministic selection based on the predicted
size of the dump.
* 'B' and 'b': Simple deterministic selection based on bandwidth
settings if they are defined, otherwise trivial FIFO selection.
* 'T' and 't': Not quite deterministic selection based on predicted
execution time of the dump process.
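Putting that list into a little Python sketch (again, my own
illustration of the idea, not Amanda's code, and the field names are
made up):

def pick_next(letter, pending):
    # pending: the dumps still waiting, each with the estimates
    # described above, e.g. {"size": ..., "bandwidth": ..., "time": ...}
    key = {"S": "size", "s": "size",
           "B": "bandwidth", "b": "bandwidth",
           "T": "time", "t": "time"}[letter]
    # Uppercase picks the largest value, lowercase the smallest.
    # (Without bandwidth settings every "bandwidth" value would be the
    # same, so 'B'/'b' just fall back to the order of the list.)
    choose = max if letter.isupper() else min
    return choose(pending, key=lambda d: d[key])

Each dumper simply calls this with its own letter from the dumporder
string whenever it finishes a dump and needs another one.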
So, for a couple of examples:
* The default setting 'BTBTBTBT': This will have half the dumpers select
dumps that will take the largest amount of time, and the other half
select the ones that will take the largest amount of bandwidth. This
works reasonably well if you have bandwidth settings configured and a
wide variance in dump size.
* What you're looking at testing, 'TTTTTTTT': This is a trivial case of
all dumpers selecting the dumps that will take the longest time. If
you're dumping almost all similar hosts, this will be essentially
equivalent to just selecting the largest. If you're dumping a wide
variety of different hosts, it will be equivalent to selecting the
largest on the first dump, but after that will select based on which
system takes the longest.
* What I use on my own systems, 'SSss' (I only run four dumpers, not
eight): This is a reasonably simple option that gives a good balance
between getting dumps done as quickly as possible, and not wasting time
waiting on the big ones. Two of the dumpers select whatever dump is the
largest, so that some of the big ones get started right away, while the
other two select the smallest dumps, so that those get backed up
immediately. I've done some really simple testing that indicates this
actually gets all the dumps done faster on average than the default, at
least when all your systems can dump data at about the same rate.
* What we use where I work, 'TTSSSSss': This is one where things get a
bit complicated. There are three different ways things get selected
here. First, two of the eight dumpers will select dumps that are going
to take the longest amount of time. Then, you have four that will pull
the largest ones, and two that will pull the smallest. This gets really
good behavior where I work because we have a handful of decade-old
systems that we need to keep backed up and which take _forever_ to back
up, but most of our other systems are new and don't take too long. On the
first dump, this is equivalent to 'SSSSSSss', but after that, the slow
systems get priority to run while everything else is dumping even though
they are not the largest or smallest dumps, so the backup process
doesn't stall out waiting on them to finish at the end.
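And to tie that back to the pick_next sketch from earlier (made-up
numbers, just to show why the slow old boxes jump the queue for the 'T'
dumpers):

pending = [
    {"name": "old-box",  "size": 20, "bandwidth": 1, "time": 300},
    {"name": "new-big",  "size": 90, "bandwidth": 1, "time": 120},
    {"name": "new-tiny", "size":  2, "bandwidth": 1, "time":   5},
]
for letter in "TSs":
    print(letter, pick_next(letter, pending)["name"])
# T old-box   <- longest predicted runtime, even though it's mid-sized
# S new-big   <- largest
# s new-tiny  <- smallest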