On 2018-03-22 19:03, Ryan, Lyle (US) wrote:
I've got an Amanda 3.4.5 server running on CentOS 7 now, and I'm able to do rudimentary backups of a remote client.

But in spite of reading man pages, HowTos, etc., I need help choosing config params.  I don't mind continuing to read and experiment, but if someone could get me at least in the ballpark, I'd really appreciate it.

The server has an 11TB filesystem to store the backups in.  I should probably be fancier and split this up more, but not now.   So I've got my holding, state, and vtapes directories all in there.

The main client has 4TB I want to back up.  It's almost all in one filesystem, but the HowTo for splitting DLEs with exclude lists is clear, so it should be easy to split this into (say) 10 smaller individual dumps.  The bulk of the data is pretty static, maybe 10%/month changes.  It's hard to imagine 20%/month changing.
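
Just to illustrate that split: a disklist sketch along those lines might look like this (host name, paths, dumptype, and exclude-file names are all made up; the exclude files live on the client, and each part excludes the subtrees the other parts cover):

    # disklist -- two of the ten pieces; same filesystem, different excludes
    client.example.com /data-part1 /data {
        user-tar
        exclude list "/etc/amanda/exclude.part1"
    }
    client.example.com /data-part2 /data {
        user-tar
        exclude list "/etc/amanda/exclude.part2"
    }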

For a start, I'd like to get a full done every 2 weeks, and incrementals/differentials on the intervening days.  If I have room to keep 2 fulls (2 complete dumpcycles), that would be great.
Given what you've said, you should have enough room to do so, but only if you use compression. Assuming the rate of change you quote above is approximately constant and doesn't result in bumping to a level higher than 1, then without compression you will need roughly 4.2TB per cycle (4TB for the full backup, plus ~200GB of incrementals at roughly 0.38% change per day, i.e. ~15.4GB/day, for 13 days), plus 4TB of space for the holding disk (because you have to have room for a full backup _there_ prior to taping anything). With compression, and assuming you get a compression ratio of about 50%, you should actually be able to fit four complete cycles (roughly 2.1TB per cycle), though if you decide you want that, I would bump the tapecycle to 60 and the number of slots to 60.

So I'm thinking:

- dumpcycle = 14

- runspercycle = 0 (default)

- tapecycle = 30

- runtapes = 1 (default)

I'd break the filesystem into 10 pieces, so ~400GB each, and make the vtapes 400GB each (via the tapetype length), relying on server-side compression to make it fit.
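
In amanda.conf terms, that plan would look roughly like this (the tapetype name is just a placeholder, and the units are one example of the syntax):

    dumpcycle 14 days       # full backup at least every 2 weeks
    runspercycle 0          # every run counts (default)
    tapecycle 30 tapes      # retain ~2 complete cycles of vtapes
    runtapes 1              # one vtape per run (default)

    define tapetype VTAPE-400 {
        length 400 gbytes
    }
    tapetype "VTAPE-400"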

The HowTo "Use pigz to speed compression" looks clear, and the DL380 G7 isn't doing anything else, so server-side compression sounds good.

Any advice on this or better ideas?  Maybe I'm off in left-field.

And one bonus question:  I'm assuming Amanda will just make vtapes as necessary, but is there any guidance as to how many vtape slots I should create ahead of time?  If my dumpcycle=14, maybe create 14 slots just to make tapes easier to find?

Debra covered the requirements for vtapes, slots, and everything very well in her reply, so I won't repeat any of that here. I do however have some other more generic advice I can give based on my own experience:

* Make your vtapes as large as possible. They won't take up any space beyond what's stored on them (in storage terminology, they're thinly provisioned), so their total 'virtual' size can be far more than your actual storage capacity. If you can make it so that a full backup always fits on a single vtape, it will make figuring out how many vtapes you need easier, and additionally give a slight boost to taping performance (because the taper never has to stop to switch to a new vtape). In your case, I'd say setting 5TB for your vtape size is reasonable; that would give you some extra room if you suddenly have more data, without being insanely over-sized.
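
For example, with the chg-disk changer you can let Amanda create slots on demand (config name "daily" and the vtape path below are placeholders):

    # amanda.conf: chg-disk changer that auto-creates slots as needed
    define changer vtape_changer {
        tpchanger "chg-disk:/backup/vtapes"
        property "num-slot" "30"
        property "auto-create-slot" "yes"
    }
    tpchanger "vtape_changer"

Or pre-create and label the slots by hand:

    for i in $(seq 1 30); do
        mkdir -p /backup/vtapes/slot$i
        amlabel daily daily-$(printf '%02d' $i) slot $i
    done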

* Make sure to set a reasonable part_size for your vtapes. While you wouldn't have to worry about splitting dumps if you take my advice above about vtape size, using parts has some other performance-related advantages. I normally use 1G, but all of my dumps are under 100G in size. In your case, with ten 400G dumps, I'd probably go with 4G for the part size.
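
Concretely, combining this with the vtape-size advice above, the tapetype might become:

    define tapetype VTAPE-5T {
        length 5 tbytes      # thin-provisioned; only used space counts
        part_size 4 gbytes
    }
    tapetype "VTAPE-5T"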

* Match your holding disk chunk size to your vtape's part_size. I have no hard number to back this up, but it appears to provide a slight performance improvement while dumping data.
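
That is, something along these lines (the directory and reserve amount are placeholders):

    define holdingdisk hd1 {
        directory "/backup/holding"
        use -100 gbytes      # use all but 100G of the volume
        chunksize 4 gbytes   # match the vtape part_size
    }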

* Don't worry right now about parallelizing the taping process. It's somewhat complicated to get it working right, significantly changes how you have to calculate vtape slots and sizes, and will probably not provide much benefit unless you're taping to a really fast RAID array that does a very good job of handling parallel writes.

* There's essentially zero performance benefit to having your holding disk on a separate partition from your final storage unless you have it on a completely separate disk. There are some benefits in terms of reliability, but realizing them requires some significant planning (you have to figure out exactly what amount of space your holding disk will need).

* If you're indexing the backups, store the working index directory (the one Amanda actually reads and writes) on a separate drive from both the holding disk and the final backup storage, and make sure it doesn't get included in the backup if you're backing up your local system as part of this configuration. This is the single biggest performance booster I've found so far when dealing with Amanda. You can still copy the index over to the final backup storage location (and I would actually encourage you to do so); just make sure Amanda isn't reading or writing the index at that location while backups are being taped.
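
As a sketch (paths invented): point indexdir at the separate drive in amanda.conf, then sync a copy over after each run, e.g. from cron once amdump has finished:

    # amanda.conf
    indexdir "/var/lib/amanda/index"    # on its own drive

    # cron, after the nightly amdump:
    rsync -a --delete /var/lib/amanda/index/ /backup/index-copy/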

* Given that you're going to need compression, I would suggest doing some actual testing to see how much processing power you can throw at it. In particular, try test dumps a couple of times with different compression types to see how fast each type runs and how much space it saves you. Keep in mind that you can pass extra options to any compression program you want by using the custom compression support and a wrapper script like this:

    #!/bin/bash
    # pass Amanda's arguments straight through (it adds -d when restoring)
    exec /path/to/program --options "$@"

If you can get it on your distribution, I'd suggest looking into Zstandard [1] for compression. Its default settings compress both better _and_ faster than the default gzip settings.
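
A zstd wrapper is the same one-liner pattern (path assumed; zstd accepts the -d that Amanda passes when restoring):

    #!/bin/bash
    # filter stdin to stdout; -c forces stdout, -d is passed through on restore
    exec /usr/bin/zstd -c "$@"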

* Given that you're only backing up to a local disk, try tweaking the device_output_buffer_size and see how that impacts your performance. 1M seems to be a good starting point for local disks, but higher values may get you much better performance.
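
That's a single line in amanda.conf; start at 1M and benchmark from there:

    device_output_buffer_size 1m    # try larger values too, e.g. 4m or 16m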
