Re: Backups to tape consistently under 60% tape capacity

Debra S Baddorf Tue, 21 Oct 2014 09:39:50 -0700

Since nobody else is chiming in, I’ll have another go.
I don’t think there IS a dry-run of the taping process, since so much depends on
the timing of when a DLE is finished and ready to go to tape, and on physically
fitting it onto the tape (although, since you have a virtual tape, presumably
that isn’t as subject to variation as a real tape might be).


I wonder if your root (or boot or sys or whatever you call them) partitions are
now just slightly bigger, after your operating system upgrade. That would affect
the way things fit onto the tape. One has to put the biggest things in first,
then the next biggest that will still fit, etc., to make the most of the tape
size. (See http://www.appleseeds.org/Big-Rocks_Covey.htm for the
motivational-speech version, which uses this principle too.)

Yet you, Tom, are telling amanda to finish all the small things first, and then
put them onto tape as soon as they are done:
  dumporder "sssS"
  taperalgo first
I have mine set to finish the big dumps first, so I can put them on the tape
first:
   dumporder "BTBTBTBTBT"

Then I want amanda to wait until it has a whole tapeful before it starts
writing, just so that all those “big pieces” are done and available to be
chosen:
     flush-threshold-dumped        100

And THEN I tell amanda to use the principle in the above motivational speech:
PUT THE BIG THINGS IN FIRST to be sure they fit (and so that I don’t have 40% of
the tape left at the end which still isn’t big enough for that Big DLE that just
now finished).
    taperalgo largestfit    # pick the biggest file that will fit in space left
                        # "Greedy Algorithm" -- best polynomial time choice
                        #   (err, I think it was maybe my suggestion that caused
                        #   the creation of this option, cuz of the Knapsack
                        #   problem & the Greedy Algorithm from comp sci classes.
                        #   Which is the same as the motivational speech above.)
                        #   Put the big stuff in first! Then you can always fit
                        #   the little stuff in the remaining space.
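
To make the difference concrete, here is a toy Python sketch. It is not amanda's code and not a real dry run. It assumes every dump is already in the holding disk, that a dump is never split across tapes, and that a dump which does not fit is left for the next run. The names pack_first and pack_largest_fit and all the sizes are made up for illustration; they only loosely mimic "taperalgo first" and "taperalgo largestfit".

```python
def pack_first(sizes, capacity):
    """Write dumps in the order they became ready; skip any that no longer fit."""
    written, remaining = [], capacity
    for size in sizes:
        if size <= remaining:
            written.append(size)
            remaining -= size
    return written


def pack_largest_fit(sizes, capacity):
    """Greedy: repeatedly write the largest waiting dump that still fits."""
    written, remaining = [], capacity
    # Walking the sizes from largest to smallest and taking each one that fits
    # is the same as "pick the biggest that fits, repeat".
    for size in sorted(sizes, reverse=True):
        if size <= remaining:
            written.append(size)
            remaining -= size
    return written


if __name__ == "__main__":
    capacity = 100                      # hypothetical tape size, arbitrary units
    ready_order = [5, 10, 15, 20, 60]   # small dumps finish first, big one last
    for name, packer in (("first", pack_first), ("largestfit", pack_largest_fit)):
        written = packer(ready_order, capacity)
        percent = 100 * sum(written) // capacity
        print(f"{name:10s} wrote {written} -> {percent}% full")
```

With these made-up numbers, taking dumps in the order they finished fills half the tape and leaves the 60-unit dump behind, while largest-fit fills the tape and leaves only the 10-unit dump to flush next time.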

SO TRY THIS:
If your operating system DLE is now big enough that it doesn’t fit in that last
40% of the tape, then make sure it is ready earlier
     dumporder "BBB"   or  "BTBT"  etc
and that the taper waits till it has a whole tape’s worth
     flush-threshold-dumped 100
AND that it chooses the biggest bits first
     taperalgo largestfit

Make those three changes and see if it helps. I bet your tapes will again be
mostly full, and only the little bits will be left over to flush next time.

Deb Baddorf
Fermilab

(PS: the caps aren’t shouting; they are meant to make skimming this long-winded
diatribe easier!)


On Oct 20, 2014, at 6:51 PM, Tom Robinson <[email protected]> wrote:

> Hi Debra,
> 
> Thanks for your comments, especially regarding 'no record'. I did already make
> that setting in my disklist file for all DLEs, e.g.:
> 
> host /path {
>        root-tar
>        strategy noinc
>        record no
> }
> 
> I didn't check it though until you mentioned it, so thanks again.
> 
> I did read the man page regarding the settings for autoflush to distinguish
> the no/yes/all semantics. I specifically chose 'yes' ('With yes, only dump
> [sic] matching the command line argument are flushed.').
> 
> Since I'm using 'yes' and not 'all' for autoflush, I don't think that has been
> interfering.
> 
> When I ran the manual flush I did have to override the flush settings because
> amanda didn't want to flush to tape at all. It just sat there waiting for more
> data, I guess. I didn't record the command and it's no longer in my history.
> From memory, I think it was:
> 
> $ amflush -o flush-threshold-dumped=0 -o flush-threshold-scheduled=0 \
>       -o taperflush=0 -o autoflush=no weekly
> 
> So essentially I was trying to flush with 'defaults' restored. Would that mess
> with my scheduled runs?
> 
> Does anyone have some clues about 'dry running', to see what tuning I need
> without actually doing it?
> 
> Regards,
> Tom
> 
> 
> On 21/10/14 10:27, Debra S Baddorf wrote:
>> Not an actual answer, but two comments:
>> 1- you’ve added a new config “archive”. Make sure you set it to “no record” so
>> that when IT does a level 0 of some disk, your normal config doesn’t read
>> that as ITS level 0. The “level 0 was done <date>” info is not specific to
>> the configuration, but to the disk itself. For a “dump” type dump (as opposed
>> to tar) it is stored in /etc/dumpdates, and any dump done gets written there.
>> Amanda’s configurations are “meta data” that amanda knows about but that the
>> disk itself doesn’t know about. So your archive config might be changing the
>> dump level patterns of your other config, unless you set the archive config
>> to “no record”.
>>    I’m not sure if this is affecting your current setup, but you did just add
>> that new config.
>> 
>> 2- I became aware about a year ago that “autoflush yes” is no longer the only
>> opposite of “autoflush no”. There is also a new-ish “autoflush all”.
>>     If you type “amdump MyConfig” then either “yes” or “all” should flush
>> everything. But if you type “amdump MyConfig aParticularNodeName” then it
>> will only flush DLEs that match that node name, unless you set it to
>> “autoflush all”.
>>    You did mention that you had to do a few flushes lately. If you really
>> meant that you had to allow some DLEs to auto-flush, then the “all” vs “yes”
>> might make a difference to you.
>> 
>> Other people:   how can he do a “dry run” here?
>> 
>> Deb
>> 
>> On Oct 20, 2014, at 6:05 PM, Tom Robinson <[email protected]> wrote:
>> 
>>> Thanks Debra. I know there's a lot of info I dumped in my original email, so
>>> maybe my question/message wasn't clear.
>>> 
>>> I'm still confused over this. I only started dabbling with the flush settings
>>> because I wasn't getting more than about 56% on the tape. I can't see how
>>> setting it back will change that.
>>> 
>>> When I add up what flushed and what's not flushed, it appears as if it would
>>> all fit on the tape.
>>> 
>>> Is there any way of testing this in a so-called 'dry run'? Otherwise I'll be
>>> waiting weeks to see what a couple of tweaks here and there will actually do.
>>> 
>>> On 21/10/14 08:28, Debra S Baddorf wrote:
>>>> Here’s a thought: 
>>>> orig:
>>>>>> flush-threshold-dumped 100
>>>>>> flush-threshold-scheduled 100
>>>>>> taperflush 100
>>>>>> autoflush yes
>>>> now:
>>>>>> flush-threshold-dumped 50
>>>>>> flush-threshold-scheduled 100
>>>>>> taperflush 0
>>>>>> autoflush yes
>>>> You now allow amanda to start writing to tape when only 50% of a tape’s
>>>> worth of data is ready (flush-threshold-dumped). Previously, 100% had to be
>>>> ready, and THAT allows the best fit of DLEs onto tape. I.e.:
>>>> - pick the biggest DLE that will fit.  Write it to tape.
>>>> - repeat.
>>>> 
>>>> Now, the biggest one may not be done yet. But you’ve already started writing
>>>> all the small pieces onto the tape, so maybe when you reach the Big Guy,
>>>> there is no space for it.
>>>>  The “Greedy Algorithm” (above: pick biggest, repeat) works best when all
>>>> the parts are available for it to choose.
>>>> 
>>>> Try setting flush-threshold-dumped back to 100. It won’t write as SOON, cuz
>>>> it waits till 100% of a tape is available, but it might FILL the tape better.
>>>> 
>>>> I think.
>>>> 
>>>> Deb Baddorf
>>>> Fermilab
>>>> 
>>>> On Oct 20, 2014, at 3:44 PM, Tom Robinson <[email protected]> 
>>>> wrote:
>>>> 
>>>>> Anyone care to comment?
>>>>> 
>>>>> On 20/10/14 10:49, Tom Robinson wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> I'm not sure why I'm not getting such good tape usage any more and wonder
>>>>>> if someone can help me.
>>>>>> 
>>>>>> Until recently I was getting quite good tape usage on my 'weekly' config:
>>>>>> 
>>>>>> USAGE BY TAPE:
>>>>>> Label               Time         Size      %  DLEs Parts
>>>>>> weekly01            3:10  1749362651K  117.9    16    16
>>>>>> weekly02            3:09  1667194493K  112.4    21    21
>>>>>> weekly03            3:08  1714523420K  115.5    16    16
>>>>>> weekly04            3:04  1664570982K  112.2    21    21
>>>>>> weekly05            3:11  1698357067K  114.5    17    17
>>>>>> weekly06            3:07  1686467027K  113.7    21    21
>>>>>> weekly07            3:03  1708584546K  115.1    17    17
>>>>>> weekly08            3:11  1657764181K  111.7    21    21
>>>>>> weekly09            3:03  1725209913K  116.3    17    17
>>>>>> weekly10            3:12  1643311109K  110.7    21    21
>>>>>> weekly01            3:06  1694157008K  114.2    17    17
>>>>>> 
>>>>>> For that last entry, the mail report looked like this:
>>>>>> 
>>>>>> These dumps were to tape weekly01.
>>>>>> Not using all tapes because 1 tapes filled; runtapes=1 does not allow additional tapes.
>>>>>> There are 198378440K of dumps left in the holding disk.
>>>>>> They will be flushed on the next run.
>>>>>> 
>>>>>> Which was fairly typical and to be expected, since the flush settings
>>>>>> were tuned as follows:
>>>>>> 
>>>>>> flush-threshold-dumped 100
>>>>>> flush-threshold-scheduled 100
>>>>>> taperflush 100
>>>>>> autoflush yes
>>>>>> 
>>>>>> Now, unexpectedly, the dumps have started to look like this:
>>>>>> 
>>>>>> weekly02            3:21  1289271529K   86.9    10    10
>>>>>> weekly03            3:17   854362421K   57.6    11    11
>>>>>> weekly04            3:20   839198404K   56.6    11    11
>>>>>> weekly05            9:40   637259676K   42.9     5     5
>>>>>> weekly06           10:54   806737591K   54.4    15    15
>>>>>> weekly09            1:12    35523072K    2.4     1     1
>>>>>> weekly09            3:21   841844504K   56.7    11    11
>>>>>> weekly01            3:16   842557835K   56.8    19    19
>>>>>> 
>>>>>> About the time it started looking different, I introduced a second config
>>>>>> for 'archive', but I can't see why that would affect my 'weekly' run.
>>>>>> 
>>>>>> I had a couple of bad runs and had to flush them manually. I'm not sure
>>>>>> what happened with tapes weekly07 and weekly08 (they appear to be missing),
>>>>>> and weekly09 is dumped to twice in succession. This looks very weird.
>>>>>> 
>>>>>> $ amadmin weekly find | grep weekly07
>>>>>> 2014-09-14 00:00:00 monza /data/backup/amanda/vtapes/daily/slot4  0 weekly07   1 1/-1 PARTIAL PARTIAL
>>>>>> $ amadmin weekly find | grep weekly08
>>>>>> 2014-09-14 00:00:00 monza /data/backup/amanda/vtapes/daily/slot4  0 weekly08   1 1/-1 PARTIAL PARTIAL
>>>>>> $ amadmin weekly find | grep weekly09
>>>>>> 2014-09-21 00:00:00 monza /                                       0 weekly09   9 1/1  OK
>>>>>> 2014-09-21 00:00:00 monza /data/backup/amanda/vtapes/daily/slot1  0 weekly09  10 1/1  OK
>>>>>> 2014-09-21 00:00:00 monza /data/backup/amanda/vtapes/daily/slot2  0 weekly09  11 1/-1 OK PARTIAL
>>>>>> 2014-09-14 00:00:00 monza /data/backup/amanda/vtapes/daily/slot4  0 weekly09   1 1/1  OK
>>>>>> 2014-09-14 00:00:00 monza /data/backup/amanda/vtapes/daily/slot5  0 weekly09   2 1/1  OK
>>>>>> 2014-09-14 00:00:00 monza /data/backup/amanda/vtapes/daily/slot6  0 weekly09   3 1/1  OK
>>>>>> 2014-09-14 00:00:00 monza /data/backup/amanda/vtapes/daily/slot7  0 weekly09   4 1/1  OK
>>>>>> 2014-09-14 00:00:00 monza /data/backup/amanda/vtapes/daily/slot8  0 weekly09   5 1/1  OK
>>>>>> 2014-09-14 00:00:00 monza /export                                 0 weekly09   6 1/1  OK
>>>>>> 2014-09-14 00:00:00 monza /export/home                            0 weekly09   7 1/1  OK
>>>>>> 2014-09-14 00:00:00 monza /export/home/tom                        0 weekly09   8 1/1  OK
>>>>>> 
>>>>>> 
>>>>>> More recently (about three weeks ago) I upgraded the OS. I don't think it
>>>>>> has anything to do with this, but I mention it for completeness.
>>>>>> 
>>>>>> To get as much on tape as possible I was originally using:
>>>>>> 
>>>>>> flush-threshold-dumped 100
>>>>>> flush-threshold-scheduled 100
>>>>>> taperflush 100
>>>>>> autoflush yes
>>>>>> 
>>>>>> But now, in an effort to get better tape usage, I've dabbled with the
>>>>>> settings. My full amanda.conf is below. It pulls in some other configs
>>>>>> (include statements), but I have only shown robot.conf and tapetypes.conf,
>>>>>> as dumptypes.conf and networks.conf are pretty much stock standard and
>>>>>> haven't been modified.
>>>>>> 
>>>>>> Kind regards,
>>>>>> Tom
>>>>>> 
>>>>>> 
>>>>>> #amanda.conf
>>>>>> org      "somedomain.com weekly"
>>>>>> mailto   "[email protected]"
>>>>>> dumpuser "amanda"
>>>>>> inparallel 4
>>>>>> dumporder "sssS"
>>>>>> taperalgo first
>>>>>> displayunit "k"
>>>>>> netusage  8000 Kbps
>>>>>> dumpcycle 8 weeks
>>>>>> runspercycle 8
>>>>>> tapecycle 10 tapes
>>>>>> bumpsize 20 Mb
>>>>>> bumppercent 20
>>>>>> bumpdays 1
>>>>>> bumpmult 4
>>>>>> etimeout 3000
>>>>>> dtimeout 1800
>>>>>> ctimeout 30
>>>>>> device_output_buffer_size 81920k
>>>>>> usetimestamps yes
>>>>>> flush-threshold-dumped 50
>>>>>> flush-threshold-scheduled 100
>>>>>> taperflush 0
>>>>>> autoflush yes
>>>>>> runtapes 1
>>>>>> includefile "/etc/opt/csw/amanda/robot.conf"
>>>>>> maxdumpsize -1
>>>>>> tapetype ULT3580-TD5
>>>>>> labelstr "^weekly[0-9][0-9]*$"
>>>>>> amrecover_changer "changer"
>>>>>> holdingdisk hd1 {
>>>>>>  comment "main holding disk"
>>>>>>  directory "/data/spool/amanda/hold/monza"
>>>>>>  use -100 Mb
>>>>>>  chunksize 1Gb
>>>>>>  }
>>>>>> infofile "/etc/opt/csw/amanda/weekly/curinfo"
>>>>>> logdir   "/etc/opt/csw/amanda/weekly"
>>>>>> indexdir "/etc/opt/csw/amanda/weekly/index"
>>>>>> includefile "/etc/opt/csw/amanda/dumptypes.conf"
>>>>>> includefile "/etc/opt/csw/amanda/networks.conf"
>>>>>> includefile "/etc/opt/csw/amanda/tapetypes.conf"
>>>>>> 
>>>>>> #robot.conf
>>>>>> define changer robot {
>>>>>>      tpchanger "chg-robot:/dev/scsi/changer/c1t5000E11156304003d1"
>>>>>>      property "tape-device" "0=tape:/dev/rmt/0bn"
>>>>>>      #property "eject-before-unload" "yes"
>>>>>>      property "use-slots" "1-23"
>>>>>>      device-property "BLOCK_SIZE" "512k"
>>>>>>      device-property "READ_BLOCK_SIZE" "512k"
>>>>>>      device-property "FSF_AFTER_FILEMARK" "false"               
>>>>>>      device-property "LEOM" "TRUE"
>>>>>> }
>>>>>> tapedev "robot"
>>>>>> 
>>>>>> # tapetypes.conf
>>>>>> define tapetype global {
>>>>>>  part_size 3G
>>>>>>  part_cache_type none
>>>>>> }
>>>>>> 
>>>>>> define tapetype ULT3580-TD5 {
>>>>>>  comment "Created by amtapetype; compression enabled"
>>>>>>  length 1483868160 kbytes
>>>>>>  filemark 868 kbytes
>>>>>>  speed 85837 kps
>>>>>>  blocksize 512 kbytes
>>>>>> }
>>>>>> 
>>>>>> 
>>>>>> 
>>> 
> 
>
