Re: All level 0 on the same run?

2018-10-31 Thread Gene Heskett
On Wednesday 31 October 2018 22:40:58 Nathan Stratton Treadway wrote:

> On Wed, Oct 31, 2018 at 13:26:40 -0400, Gene Heskett wrote:
> > That makes more sense than anything else I've found. Now, I have
> > 3.3.7p1, 3.4.3, and 3.5.1, which I've been running about that long.
> > So let's install 3.4.3 for tonight. Building now; apparently it had not
> > been done before. And on the install and amcheck, I had to move the
>
> Okay, let us know if 3.4.3 behaves any differently for those small
> DLEs. I took a quick look at the source code commits in the 3.5
> timeframe and nothing jumped out at me as touching the
> info-file-updating, so it would not surprise me too much if that bug
> were in 3.4 as well.
>
> But in any case, it would help narrow down where to look to know if it
> was or was not fixed by downgrading.
>
The next step down would be 3.3.7p1, which has run here for years. I also 
have 3.3.6, but that's as old as I go, except for some amanda-4.x.x-alpha 
stuff that the previous programmer, who was fond of perl, was doing, but I 
can't remember his name ATM.  Most of that also seemed to work. So if 
it's present in 3.3.7p1, then test 3.3.6, then start on the alpha stuff. 
At least till someone comes up with a patch.  What file do you think the 
bug is in?
Take care now, Nathan & thank you.
-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page <http://geneslinuxbox.net:6309/gene>


Re: All level 0 on the same run?

2018-10-31 Thread Nathan Stratton Treadway
On Wed, Oct 31, 2018 at 13:26:40 -0400, Gene Heskett wrote:
> That makes more sense than anything else I've found. Now, I have 3.3.7p1, 
> 3.4.3, and 3.5.1, which I've been running about that long.
> So let's install 3.4.3 for tonight. Building now; apparently it had not been 
> done before. And on the install and amcheck, I had to move the 

Okay, let us know if 3.4.3 behaves any differently for those small DLEs.
I took a quick look at the source code commits in the 3.5 timeframe and
nothing jumped out at me as touching the info-file-updating, so it would
not surprise me too much if that bug were in 3.4 as well.

But in any case, it would help narrow down where to look to know if it
was or was not fixed by downgrading.


Nathan



Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


Re: All level 0 on the same run?

2018-10-31 Thread Gene Heskett
On Wednesday 31 October 2018 14:52:46 Debra S Baddorf wrote:

> “  Those datestamps are obviously wrong, should be 20181031  "
>
> The two DLE’s that you showed, with datestamp “wrong”, are among the
> “too small” disks you are talking about.   So it seems that the
> datestamp probably isn’t wrong? These sets are being missed?
>
> (Not completely following, cuz it’s not (so far) relevant to me.)
> But the date thing seems to be real, so I thought I’d point it out.
>
> Deb Baddorf
>
Nope, Deb, those dle's are present, and carrying today's date when the 
vtape is accessed by an ls -l, as I posted a few hours ago.

-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page <http://geneslinuxbox.net:6309/gene>



Re: All level 0 on the same run?

2018-10-31 Thread Nathan Stratton Treadway
On Wed, Oct 31, 2018 at 12:07:06 -0400, Gene Heskett wrote:
> root@coyote:/amandatapes/Dailys# ls -l data/
> total 18143556
> -rw------- 1 amanda amanda      32768 Oct 31 03:03 0.Dailys-27
> -rw------- 1 amanda amanda      70603 Oct 31 03:03 1.shop._root.0
> -rw------- 1 amanda amanda      32934 Oct 31 03:03 2.shop._var_amanda.0
> -rw------- 1 amanda amanda     272120 Oct 31 03:03 3.GO704._root.0
[...]
> -rw------- 1 amanda amanda   13353012 Oct 31 03:44 00064.GO704._usr_local.0
> -rw------- 1 amanda amanda    2163585 Oct 31 03:44 00065.GO704._usr_lib_amanda.0
> -rw-r--r-- 1 amanda amanda     112640 Oct 31 03:45 configuration.tar
> -rw-r--r-- 1 amanda amanda  469934080 Oct 31 03:45 indices.tar
> 
> So all 66 dle's are there, subbing out the first line & the last 2.
> 
[...]
> Now, possibly interesting: does amadmin skip some that aren't due? It 
> only sees 65 lines of output.
> root@coyote:/amandatapes/Dailys# su amanda -c "/usr/local/sbin/amadmin Daily 
> dles"|wc -l
> 65

Note that in the vtape directory, the 0 file is the tape label, so
actually there are only 65 DLEs backed up there (thus matching your
"amadmin ... dles" output).  

(Amanda generally does _some_ dump for every DLE, to make sure to catch
changes since the previous dump... though of course if nothing has
changed on that DLE you may end up with an empty incremental dump for
that DLE.)

> 
> So where does amanda keep the file with the last backup, is that amandates?

(In 3.5 I seem to remember it's possible to choose from different
storage back-ends, but generally) the "amadmin ... info" data is stored
in a pile of
  /var/lib/amanda/<config>/curinfo/<host>/<disk>/info 
text files (one file for each DLE).
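
For example, with Gene's "Daily" config one would expect paths along
these lines (the exact prefix depends on how Amanda was built --
localstatedir -- and the <disk> directory name has slashes turned into
underscores, so treat the paths below as an assumption):

  $ find /var/lib/amanda/Daily/curinfo -type f -name info
  /var/lib/amanda/Daily/curinfo/shop/_var_amanda/info
  /var/lib/amanda/Daily/curinfo/shop/_usr_local/info
  ...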

> On shop, they don't resemble dates:
> gene@shop:/etc$ ls -l amandates
> -rw-r----- 1 amandabackup disk 380 Oct 31 03:28 amandates
> gene@shop:/etc$ cat amandates
> /etc 0 1540969428
> /etc 1 1540796552

I don't believe amandates has anything to do with the "amadmin ...
info/due/balance" commands, but anyway those are
seconds-since-Unix-epoch numbers.  For a reasonably up-to-date version
of the GNU date command, you can translate that to a human-readable
date-time string with date --date="@n", e.g.

  $ date --date="@1540969428"
  Wed Oct 31 03:03:48 EDT 2018
  $ date --date="@1540796552"
  Mon Oct 29 03:02:32 EDT 2018
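
To translate a whole amandates file at once, a one-liner works; a
sketch, assuming the three-column layout shown above and no comment
lines:

  $ while read disk lev ts; do printf '%s lev %s  %s\n' \
      "$disk" "$lev" "$(date --date=@$ts)"; done < /etc/amandates
  /etc lev 0  Wed Oct 31 03:03:48 EDT 2018
  /etc lev 1  Mon Oct 29 03:02:32 EDT 2018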


Nathan


Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


Re: All level 0 on the same run?

2018-10-31 Thread Debra S Baddorf
We may have found an answer for Gene's problem.
Has the original poster, Chris Nighswonger, found an answer?

Deb

> On Oct 30, 2018, at 1:32 PM, Debra S Baddorf  wrote:
> 
> Is this the first backup run for a long while?  If so, then they are all DUE, 
> so amanda feels it has to schedule them all, now.
> 
> Is this the first backup ever?   Ditto above.
> 
> Did you perhaps run "amadmin <config> force <host> *", which forces a level 0 on 
> all disks?
> Did you specify "strategy noinc", which does the same?
> Or "skip-incr yes"?  Ditto.
> 
> Did you replace a whole disk, making all the files look like they’ve never 
> been backed up?
> 
> Okay,  failing all the above obvious reasons,  I’ll leave others to discuss 
> “planner” reasons.  Sorry!
> Deb Baddorf
> Fermilab
> 
>> On Oct 30, 2018, at 1:20 PM, Chris Nighswonger wrote:
>> 
>> Why in the world does Amanda plan level 0 backups for all entries in a DLE 
>> for the same run? This causes all sorts of problems.
>> Is there any solution for this? I've read some of the creative suggestions, 
>> but it seems a bunch of trouble.
>> Kind regards,
>> Chris
>> 
>> 0 19098649k waiting for dumping
>> 0 9214891k waiting for dumping
>> 0 718824k waiting for dumping
>> 0 365207k waiting for dumping
>> 0 2083027k waiting for dumping
>> 0 3886869k waiting for dumping
>> 0 84910k waiting for dumping
>> 0 22489k dump done (7:23:34), waiting for writing to tape
>> 0 304k dump done (7:22:30), waiting for writing to tape
>> 0 2613k waiting for dumping
>> 0 30k dump done (7:23:07), waiting for writing to tape
>> 0 39642k dump done (7:23:07), waiting for writing to tape
>> 0 8513409k waiting for dumping
>> 0 39519558k waiting for dumping
>> 0 47954k waiting for dumping
>> 0 149877984k dumping 145307840k ( 96.95%) (7:22:15)
>> 0 742804k waiting for dumping
>> p" 0 88758k waiting for dumping
>> 0 12463k dump done (7:24:19), waiting for writing to tape
>> 0 5544352k waiting for dumping
>> 0 191676480k waiting for dumping
>> 0 3799277k waiting for dumping
>> 0 3177171k waiting for dumping
>> 0 11058544k waiting for dumping
>> 0 230026440k dump done (7:22:13), waiting for writing to tape
>> 0 8k dump done (7:24:24), waiting for writing to tape
>> 0 184k dump done (7:24:19), waiting for writing to tape
>> 0 1292009k waiting for dumping
>> 0 2870k dump done (7:23:23), waiting for writing to tape
>> 0 13893263k waiting for dumping
>> 0 6025026k waiting for dumping
>> 0 6k dump done (7:22:15), waiting for writing to tape
>> 0 42k dump done (7:24:24), waiting for writing to tape
>> 0 53k dump done (7:24:19), waiting for writing to tape
>> 0 74462169k waiting for dumping
>> 0 205032k waiting for dumping
>> 0 32914k waiting for dumping
>> 0 1k dump done (7:24:02), waiting for writing to tape
>> 0 854272k waiting for dumping
>> 
> 




Re: All level 0 on the same run?

2018-10-31 Thread Jon LaBadie
On Wed, Oct 31, 2018 at 08:25:39AM -0400, Chris Nighswonger wrote:
> So, looking at this more, it may be self-inflicted. Last week I changed
> blocksize to 512k, and began amrmtape and amlabel with the oldest tape
> first, working backward day by day. I run backups 5 nights per week with
> a cycle of 13 tapes (see below). I would have thought that this would have
> allowed the change in blocksize to run seamlessly. Maybe not. I'm now
> suspecting that running amrmtape --cleanup caused Amanda to bork and fall
> back to level 0 backups. She did this two nights in a row!!!
> 
> Anyway, I'm going to hold off any further concerns until I finish a
> complete tapecycle. If the problem continues after that point, I'll pick
> back up.
> 
> Relevant lines from amanda.conf:
> 
> dumpcycle 5 days
> runspercycle 5
> tapecycle 13 tapes
> runtapes 1
> flush-threshold-dumped 50
> bumpsize 10 Mbytes
> bumppercent 0
> bumpmult 1.5
> bumpdays 2
> 
> Kind regards,
> Chris
> 

When I introduce a lot of DLEs into the disklist (or start a
new amanda instance) I typically will comment out all the new
DLEs and only uncomment a few each amdump run.
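
For illustration, with a made-up hostname and dumptype (disklist lines
are "host disk dumptype"):

  # phase these in one or two per amdump run:
  #newhost  /home  comp-user-tar
  #newhost  /var   comp-user-tar
  newhost   /etc   comp-user-tar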

Jon
-- 
Jon H. LaBadie j...@jgcomp.com
 11226 South Shore Rd.  (703) 787-0688 (H)
 Reston, VA  20190  (703) 935-6720 (C)


Re: All level 0 on the same run?

2018-10-31 Thread Nathan Stratton Treadway
On Wed, Oct 31, 2018 at 08:29:08 -0400, Chris Nighswonger wrote:
> FWIW, here is the output of amadmin balance before last night's run and
> again this morning. No overdues, so I guess that's good. I'm not
> experienced enough to make much of the balance percentages, but am now
> wondering if I should work at breaking up the large DLEs into smaller
> subsets as several have suggested.
> 
> root@scriptor:/home/manager# su backup -c "/usr/sbin/amadmin campus balance"
> 
>  due-date  #fsorig kB out kB   balance
> --
> 10/30 Tue   25  359009166  284972243   +102.1%
> 
> 11/03 Sat   15  632526057  420083122   +197.9%
> --
> TOTAL   40  991535223  705055365 141011073
>   (estimated 5 runs per dumpcycle)
>  (13 filesystems overdue. The most being overdue 1 day.)

Regarding "balance percentages": if you divide the "TOTAL out kB" figure
by the number of runs per dmpcycle, you'll get the number in the bottom
right corner of the chart (i.e. "141011073").  Amanda tries to spread
out the full dumps so that volume of full dumps happens each day (or, in
other words, so that the full dumps are split evenly over each day in
the dumpcycle).  The "balance" figures is simply a calculation of how
the currently-scheduled cycle compares to that ideal -- so in this case
the 10/30 figure is about twice the average (102% above the target),
while the 11/03 figure is about three times the target.
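
Using the numbers from your first listing, the arithmetic checks out
(sketch with bc):

  $ echo '705055365 / 5' | bc
  141011073
  $ echo 'scale=4; 284972243 / 141011073' | bc
  2.0209

i.e. the 10/30 batch is about 2.02 times the per-run target, hence
"+102.1%".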

> root@scriptor:/home/manager# su backup -c "/usr/sbin/amadmin campus balance"
> 
>  due-date  #fsorig kB out kB   balance
> --
> 10/31 Wed1  0  0  ---
> 
> 11/03 Sat   39 1079730080  776153947   +400.0%
> 11/04 Sun0  0  0  ---
> --
> TOTAL   40 1079730080  776153947 155230789
>   (estimated 5 runs per dumpcycle)

Am I correct that you actually ran two separate amdump runs within the
calendar day of 10/30 (with the first "balance" command executed between
the runs)?  That would explain why all 39 DLEs are now showing as due on
the same day.

Anyway, the "clump" showing here is a direct result of the fact that all
your DLEs got set to full dumps yesterday (for whatever reason that
happened).  

Over the rest of this cycle, you should see the planner promoting about
one fifth of your full-dump volume each day, so that the full dumps
spread back out to near a "zero" balance figure.  (That is, some level
0s will happen sooner than a full dumpcycle after the last one, to
spread things back out.)

If you know for sure that certain DLEs are larger than (or very close
to) the balance size, it will definitely help to split them.  Otherwise,
I'd say you might as well wait 5 days and then take a look at the
balance listing, to see if Amanda ended up having trouble evening things
back out over the course of the cycle.

Nathan


Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


Re: All level 0 on the same run?

2018-10-31 Thread Jose M Calhariz
On Wed, Oct 31, 2018 at 02:13:27PM -0400, Nathan Stratton Treadway wrote:
> On Wed, Oct 31, 2018 at 14:38:43 +0000, Jose M Calhariz wrote:
> > I bet these DLEs are very small and that you did the upgrade of
> > the Amanda server between 20 and 20+tapecycle runs ago.
> > 
> > There is a bug in recent Amanda servers: if a DLE is very small, it
> > will not properly update the internal database after a level 0,
> > making that DLE overdue.
> 
> Yes, that definitely would explain what Gene was seeing.
> 
> Did you receive or create a fix for that bug?  (I didn't immediately
> recognize any of the debian/patch files in the amanda_3.5.1-3_WIP_2
> source package I downloaded from you a few weeks ago as applying to this
> bug.)

No fix.  I was waiting for a new release before reporting this bug,
as I do not like to have many patches applied to the Debian packages
besides the ones needed to debianize the software.


> 
> Do you know how small a DLE has to be to trigger this problem?

No; usually they show as size 0 in amreport, whether they are actually MB or GB.

> 
>   
>   Nathan
> 
> 
> Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
> Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
>  GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
>  Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239
> 
>

Kind regards
Jose M Calhariz

-- 
--
Nothing is so good that someone, somewhere, will not hate it.
-- Joseph Murphy




Re: All level 0 on the same run?

2018-10-31 Thread Nathan Stratton Treadway
On Wed, Oct 31, 2018 at 18:52:46 +0000, Debra S Baddorf wrote:
> "Those datestamps are obviously wrong, should be 20181031"
> 
> The two DLE's that you showed, with datestamp "wrong", are among the
> "too small" disks you are talking about.  So it seems that the
> datestamp probably isn't wrong? These sets are being missed?
> 
> (Not completely following, cuz it's not (so far) relevant to me.)  
> But the date thing seems to be real, so I thought I'd point it out.

Actually, in the listing of the vtape directory (found in Gene's message
dated "Wed, 31 Oct 2018 12:07:06 -0400") one can see that the DLEs in
question _did_ get dumped.

So it appears that almost all parts of the process are working
correctly; the only problem is that the info database for those
particular dumps is not getting updated.  This in turn causes Amanda to
constantly think they are overdue, and thus I assume it always schedules
them for level 0 dumps (as well as showing them as "overdue" in the
"amadmin ... due" output) -- but since they are so tiny, presumably
that ends up having a negligible effect on the overall balance.  (And in
fact the vtape listing confirms that many of Gene's level 1 dumps are
much larger than those 5 specific level 0s.)

Nathan


Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239
 


Re: All level 0 on the same run?

2018-10-31 Thread Debra S Baddorf
"Those datestamps are obviously wrong, should be 20181031"

The two DLE’s that you showed, with datestamp “wrong”, are among the “too small”
disks you are talking about.   So it seems that the datestamp probably isn’t 
wrong?
These sets are being missed?

(Not completely following, cuz it’s not (so far) relevant to me.)  
But the date thing seems to be real, so I thought I’d point it out.

Deb Baddorf


> On Oct 31, 2018, at 11:07 AM, Gene Heskett  wrote:
> 
> On Wednesday 31 October 2018 10:37:11 Nathan Stratton Treadway wrote:
> 
>> On Wed, Oct 31, 2018 at 08:50:55 -0400, Gene Heskett wrote:
>>> On Wednesday 31 October 2018 08:39:43 Nathan Stratton Treadway wrote:
>>>> On Wed, Oct 31, 2018 at 07:59:02 -0400, Gene Heskett wrote:
>>>>> yadda yadda. So just where the hell do I look for these?
>>>>> root@coyote:/amandatapes/Dailys# su amanda -c
>>>>> "/usr/local/sbin/amadmin Daily due"|grep Overdue
>>>>> Overdue 21 days: shop:/usr/local
>>>>> Overdue 21 days: shop:/var/amanda
>>>>> Overdue 21 days: lathe:/usr/local
>>>>> Overdue 21 days: lathe:/var/amanda
>>>>> Overdue 21 days: GO704:/var/amanda
>>>>> 
>>>>> So I look in the vtapes, and find its being done too.
>>>>> Checking for lathe:/var/amanda, its there
>>>>> Checking for lathe:/usr/local, they've been done too
>>>> 
>>>> What do "amadmin Daily info shop", etc. say?
>>> 
>>> That "info" directory does not exist, never has existed here that I
>>> can
>> 
>> "info" is a subcommand of "amadmin", along the lines of "balance" and
>> "due".
>> 
> I don't believe it's being treated as a subcommand, or maybe I didn't 
> give it all the arguments, lemme look at the man page. Yes, I seem to 
> have failed to assert the "Daily", and now I get output like:
> shop:/var/amanda:
> Current info for shop /var/amanda:
>  Stats: dump rates (kps), Full:1.0,   1.0,   1.0
>Incremental:1.0,   1.0,   1.0
>  compressed size, Full:  10.0%, 10.0%, 10.0%
>Incremental:  10.0%, 10.0%, 10.0%
>  Dumps: lev datestmp  tape       file  origK  compK  secs
>          0  20181001  Dailys-27     2     10      1     0
>          1  20181004  Dailys-56    13     10      1     1
> 
> And shop:/usr/local:
> Current info for shop /usr/local:
>  Stats: dump rates (kps), Full:1.0,   1.0,   1.0
>Incremental:1.0,   1.0,   1.0
>  compressed size, Full:   2.5%,  2.5%,  2.5%
>Incremental:   2.5%,  2.5%,  2.5%
>  Dumps: lev datestmp  tape       file  origK  compK  secs
>          0  20181001  Dailys-27     6     40      1     1
>          1  20181004  Dailys-59    23     40      1     1
> 
> Those datestamps are obviously wrong, should be 20181031
> 
> root@coyote:/amandatapes/Dailys# ls -l data
> lrwxrwxrwx 1 amanda amanda 6 Oct 31 03:01 data -> slot27
> 
> root@coyote:/amandatapes/Dailys# ls -l data/
> total 18143556
> -rw------- 1 amanda amanda      32768 Oct 31 03:03 0.Dailys-27
> -rw------- 1 amanda amanda      70603 Oct 31 03:03 1.shop._root.0
> -rw------- 1 amanda amanda      32934 Oct 31 03:03 2.shop._var_amanda.0
> -rw------- 1 amanda amanda     272120 Oct 31 03:03 3.GO704._root.0
> -rw------- 1 amanda amanda      32943 Oct 31 03:03 4.GO704._var_amanda.0
> -rw------- 1 amanda amanda    5013684 Oct 31 03:03 5.lathe._usr_src.0
> -rw------- 1 amanda amanda      33687 Oct 31 03:03 6.shop._usr_local.0
> -rw------- 1 amanda amanda     206886 Oct 31 03:03 7.shop._var_lib_amanda.0
> -rw------- 1 amanda amanda    1156067 Oct 31 03:03 8.lathe._etc.0
> -rw------- 1 amanda amanda      32933 Oct 31 03:03 9.lathe._var_amanda.0
> -rw------- 1 amanda amanda    1024710 Oct 31 03:03 00010.shop._etc.0
> -rw------- 1 amanda amanda      33689 Oct 31 03:03 00011.lathe._usr_local.0
> -rw------- 1 amanda amanda     264766 Oct 31 03:03 00012.lathe._var_lib_amanda.0
> -rw------- 1 amanda amanda     113999 Oct 31 03:04 00013.lathe._root.0
> -rw------- 1 amanda amanda   22061583 Oct 31 03:04 00014.shop._lib_firmware.0
> -rw------- 1 amanda amanda    1325898 Oct 31 03:04 00015.shop._usr_lib_amanda.0
> -rw------- 1 amanda amanda   21973059 Oct 31 03:05 00016.lathe._lib_firmware.0
> -rw------- 1 amanda amanda    1325856 Oct 31 03:05 00017.lathe._usr_lib_amanda.0
> -rw------- 1 amanda amanda 3193481270 Oct 31 03:09 00018.coyote._home_gene_src.0
> -rw------- 1 amanda amanda 3298505184 Oct 31 03:23 00019.coyote._home_gene.0
> -rw------- 1 amanda

Re: All level 0 on the same run?

2018-10-31 Thread Nathan Stratton Treadway
On Wed, Oct 31, 2018 at 14:38:43 +0000, Jose M Calhariz wrote:
> I bet these DLEs are very small and that you did the upgrade of
> the Amanda server between 20 and 20+tapecycle runs ago.
> 
> There is a bug in recent Amanda servers: if a DLE is very small, it
> will not properly update the internal database after a level 0,
> making that DLE overdue.

Yes, that definitely would explain what Gene was seeing.

Did you receive or create a fix for that bug?  (I didn't immediately
recognize any of the debian/patch files in the amanda_3.5.1-3_WIP_2
source package I downloaded from you a few weeks ago as applying to this
bug.)

Do you know how small a DLE has to be to trigger this problem?


Nathan


Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


Re: All level 0 on the same run?

2018-10-31 Thread Gene Heskett
On Wednesday 31 October 2018 10:38:43 Jose M Calhariz wrote:

> On Wed, Oct 31, 2018 at 08:39:43AM -0400, Nathan Stratton Treadway wrote:
> > On Wed, Oct 31, 2018 at 07:59:02 -0400, Gene Heskett wrote:
> > > yadda yadda. So just where the hell do I look for these?
> > > root@coyote:/amandatapes/Dailys# su amanda -c
> > > "/usr/local/sbin/amadmin Daily due"|grep Overdue
> > > Overdue 21 days: shop:/usr/local
> > > Overdue 21 days: shop:/var/amanda
> > > Overdue 21 days: lathe:/usr/local
> > > Overdue 21 days: lathe:/var/amanda
> > > Overdue 21 days: GO704:/var/amanda
>
> I bet these DLEs are very small and that you did the upgrade of
> the Amanda server between 20 and 20+tapecycle runs ago.
>
> There is a bug in recent Amanda servers: if a DLE is very small, it
> will not properly update the internal database after a level 0,
> making that DLE overdue.
>
That makes more sense than anything else I've found. Now, I have 3.3.7p1, 
3.4.3, and 3.5.1, which I've been running about that long.
So let's install 3.4.3 for tonight. Building now; apparently it had not been 
done before. And on the install and amcheck, I had to move the 
amanda-security.conf file up one level to /usr/local/etc. Might be wiser 
to softlink it if they are going to bounce it around like that.
So we'll see how it works in the morning. amcheck is at least happy with 
it.
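
Something like this would do it (the paths are my guess at the layout,
with the config dir under /usr/local/etc/amanda):

  # mv /usr/local/etc/amanda/amanda-security.conf /usr/local/etc/
  # ln -s /usr/local/etc/amanda-security.conf \
      /usr/local/etc/amanda/amanda-security.conf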

Thanks for the bug report, Jose M Calhariz; you may have saved what little 
hair I have not yet pulled out looking for this.

Copyright 2018 by Maurice E. Heskett
-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page <http://geneslinuxbox.net:6309/gene>


Re: All level 0 on the same run?

2018-10-31 Thread Gene Heskett
On Wednesday 31 October 2018 10:37:11 Nathan Stratton Treadway wrote:

> On Wed, Oct 31, 2018 at 08:50:55 -0400, Gene Heskett wrote:
> > On Wednesday 31 October 2018 08:39:43 Nathan Stratton Treadway wrote:
> > > On Wed, Oct 31, 2018 at 07:59:02 -0400, Gene Heskett wrote:
> > > > yadda yadda. So just where the hell do I look for these?
> > > > root@coyote:/amandatapes/Dailys# su amanda -c
> > > > "/usr/local/sbin/amadmin Daily due"|grep Overdue
> > > > Overdue 21 days: shop:/usr/local
> > > > Overdue 21 days: shop:/var/amanda
> > > > Overdue 21 days: lathe:/usr/local
> > > > Overdue 21 days: lathe:/var/amanda
> > > > Overdue 21 days: GO704:/var/amanda
> > > >
> > > > So I look in the vtapes, and find its being done too.
> > > > Checking for lathe:/var/amanda, its there
> > > > Checking for lathe:/usr/local, they've been done too
> > >
> > > What do "amadmin Daily info shop", etc. say?
> >
> > That "info" directory does not exist, never has existed here that I
> > can
>
> "info" is a subcommand of "amadmin", along the lines of "balance" and
> "due".
>
I don't believe it's being treated as a subcommand, or maybe I didn't 
give it all the arguments, lemme look at the man page. Yes, I seem to 
have failed to assert the "Daily", and now I get output like:
shop:/var/amanda:
Current info for shop /var/amanda:
  Stats: dump rates (kps), Full:1.0,   1.0,   1.0
Incremental:1.0,   1.0,   1.0
  compressed size, Full:  10.0%, 10.0%, 10.0%
Incremental:  10.0%, 10.0%, 10.0%
  Dumps: lev datestmp  tape       file  origK  compK  secs
          0  20181001  Dailys-27     2     10      1     0
          1  20181004  Dailys-56    13     10      1     1

And shop:/usr/local:
Current info for shop /usr/local:
  Stats: dump rates (kps), Full:1.0,   1.0,   1.0
Incremental:1.0,   1.0,   1.0
  compressed size, Full:   2.5%,  2.5%,  2.5%
Incremental:   2.5%,  2.5%,  2.5%
  Dumps: lev datestmp  tape       file  origK  compK  secs
          0  20181001  Dailys-27     6     40      1     1
          1  20181004  Dailys-59    23     40      1     1

Those datestamps are obviously wrong, should be 20181031

root@coyote:/amandatapes/Dailys# ls -l data
lrwxrwxrwx 1 amanda amanda 6 Oct 31 03:01 data -> slot27

root@coyote:/amandatapes/Dailys# ls -l data/
total 18143556
-rw------- 1 amanda amanda      32768 Oct 31 03:03 0.Dailys-27
-rw------- 1 amanda amanda      70603 Oct 31 03:03 1.shop._root.0
-rw------- 1 amanda amanda      32934 Oct 31 03:03 2.shop._var_amanda.0
-rw------- 1 amanda amanda     272120 Oct 31 03:03 3.GO704._root.0
-rw------- 1 amanda amanda      32943 Oct 31 03:03 4.GO704._var_amanda.0
-rw------- 1 amanda amanda    5013684 Oct 31 03:03 5.lathe._usr_src.0
-rw------- 1 amanda amanda      33687 Oct 31 03:03 6.shop._usr_local.0
-rw------- 1 amanda amanda     206886 Oct 31 03:03 7.shop._var_lib_amanda.0
-rw------- 1 amanda amanda    1156067 Oct 31 03:03 8.lathe._etc.0
-rw------- 1 amanda amanda      32933 Oct 31 03:03 9.lathe._var_amanda.0
-rw------- 1 amanda amanda    1024710 Oct 31 03:03 00010.shop._etc.0
-rw------- 1 amanda amanda      33689 Oct 31 03:03 00011.lathe._usr_local.0
-rw------- 1 amanda amanda     264766 Oct 31 03:03 00012.lathe._var_lib_amanda.0
-rw------- 1 amanda amanda     113999 Oct 31 03:04 00013.lathe._root.0
-rw------- 1 amanda amanda   22061583 Oct 31 03:04 00014.shop._lib_firmware.0
-rw------- 1 amanda amanda    1325898 Oct 31 03:04 00015.shop._usr_lib_amanda.0
-rw------- 1 amanda amanda   21973059 Oct 31 03:05 00016.lathe._lib_firmware.0
-rw------- 1 amanda amanda    1325856 Oct 31 03:05 00017.lathe._usr_lib_amanda.0
-rw------- 1 amanda amanda 3193481270 Oct 31 03:09 00018.coyote._home_gene_src.0
-rw------- 1 amanda amanda 3298505184 Oct 31 03:23 00019.coyote._home_gene.0
-rw------- 1 amanda amanda     163341 Oct 31 03:23 00020.coyote._home_gene_Downloads.2
-rw------- 1 amanda amanda       3123 Oct 31 03:23 00021.coyote._home_ups.0
-rw------- 1 amanda amanda 6189105783 Oct 31 03:28 00022.picnc._.0
-rw------- 1 amanda amanda   68618563 Oct 31 03:29 00023.shop._home.0
-rw------- 1 amanda amanda   36166458 Oct 31 03:29 00024.lathe._home.0
-rw------- 1 amanda amanda  827898079 Oct 31 03:34 00025.coyote._home_amanda.0
-rw------- 1 amanda amanda    1117337 Oct 31 03:34 00026.coyote._home_nut.0
-rw------- 1 amanda amanda   27806189 Oct 31 03:34 00027.coyote._home_gene_Mail.1
-rw------- 1 amanda amanda      45027 Oct 31 03:35 00028.coyote._home_gene_Download.1
-rw------- 1 amanda amanda  339806208 Oct 31 03:35 00029.coyote._usr_dlds_misc.0
-rw------- 1 amanda amanda  101449728 Oct 31 03:35 00

Re: All level 0 on the same run?

2018-10-31 Thread Jose M Calhariz
On Wed, Oct 31, 2018 at 08:39:43AM -0400, Nathan Stratton Treadway wrote:
> On Wed, Oct 31, 2018 at 07:59:02 -0400, Gene Heskett wrote:
> > yadda yadda. So just where the hell do I look for these?
> > root@coyote:/amandatapes/Dailys# su amanda -c "/usr/local/sbin/amadmin 
> > Daily due"|grep Overdue
> > Overdue 21 days: shop:/usr/local
> > Overdue 21 days: shop:/var/amanda
> > Overdue 21 days: lathe:/usr/local
> > Overdue 21 days: lathe:/var/amanda
> > Overdue 21 days: GO704:/var/amanda

I bet these DLEs are very small and that you did the upgrade of
the Amanda server between 20 and 20+tapecycle runs ago.

There is a bug in recent Amanda servers: if a DLE is very small, it
will not properly update the internal database after a level 0,
making that DLE overdue.
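
A quick way to spot a DLE hit by it is to compare the info database
against what is actually on the vtape; for example (config, host and
disk names here are taken from Gene's messages, adjust to taste):

  $ su amanda -c "/usr/local/sbin/amadmin Daily info shop /var/amanda"

and check whether the "Dumps: ... datestmp" lines lag behind the dates
shown by "ls -l" on the current vtape slot.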


> > 
> > So I look in the vtapes, and find its being done too.
> > Checking for lathe:/var/amanda, its there
> > Checking for lathe:/usr/local, they've been done too
> 
> What do "amadmin Daily info shop", etc. say?
> 
>   Nathan
> 
> Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
> Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
>  GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
>  Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239
> 
>

Kind regards
Jose M Calhariz

-- 
--
Life is like rapadura: it is sweet, but it is not soft.




Re: All level 0 on the same run?

2018-10-31 Thread Nathan Stratton Treadway
On Wed, Oct 31, 2018 at 08:50:55 -0400, Gene Heskett wrote:
> On Wednesday 31 October 2018 08:39:43 Nathan Stratton Treadway wrote:
> 
> > On Wed, Oct 31, 2018 at 07:59:02 -0400, Gene Heskett wrote:
> > > yadda yadda. So just where the hell do I look for these?
> > > root@coyote:/amandatapes/Dailys# su amanda -c
> > > "/usr/local/sbin/amadmin Daily due"|grep Overdue
> > > Overdue 21 days: shop:/usr/local
> > > Overdue 21 days: shop:/var/amanda
> > > Overdue 21 days: lathe:/usr/local
> > > Overdue 21 days: lathe:/var/amanda
> > > Overdue 21 days: GO704:/var/amanda
> > >
> > > So I look in the vtapes, and find its being done too.
> > > Checking for lathe:/var/amanda, its there
> > > Checking for lathe:/usr/local, they've been done too
> >
> > What do "amadmin Daily info shop", etc. say?
> >
> That "info" directory does not exist, never has existed here that I can 

"info" is a subcommand of "amadmin", along the lines of "balance" and
"due".

Nathan


Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


Re: All level 0 on the same run?

2018-10-31 Thread Gene Heskett
On Wednesday 31 October 2018 08:34:21 Nathan Stratton Treadway wrote:

> On Wed, Oct 31, 2018 at 08:18:47 -0400, Gene Heskett wrote:
> > that link from tapelist.last_write -> 27062 is dead, there is no
> > 27062 file to be found. WTH is that? And how can that be causeing
> > the erroneous amadmin due's.
>
> (You successfully moved your server to Amanda 3.5.x, right?)
>
Yes, both are local builds, so all I need to do to switch is install the 
older version. All built in /home/amanda.

> The last_write symlink is used for locking or something; in any case
> it's supposed to point to a number (perhaps a sequence number?) rather
> than an actual file.

It does; looks like it might be a PID. A big one, but...

> > So I'm apparently plumb bumfuzzled at this point. Help! Where is JLM
> > when I need him.
>
> (Yeah.)
>   Nathan
> Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
> Ray Ontko & Co.  -  Software consulting services  -  http://www.ontko.com/
>  GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
>  Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239



Copyright 2018 by Maurice E. Heskett
-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page <http://geneslinuxbox.net:6309/gene>


Re: All level 0 on the same run?

2018-10-31 Thread Gene Heskett
On Wednesday 31 October 2018 08:39:43 Nathan Stratton Treadway wrote:

> On Wed, Oct 31, 2018 at 07:59:02 -0400, Gene Heskett wrote:
> > yadda yadda. So just where the hell do I look for these?
> > root@coyote:/amandatapes/Dailys# su amanda -c
> > "/usr/local/sbin/amadmin Daily due"|grep Overdue
> > Overdue 21 days: shop:/usr/local
> > Overdue 21 days: shop:/var/amanda
> > Overdue 21 days: lathe:/usr/local
> > Overdue 21 days: lathe:/var/amanda
> > Overdue 21 days: GO704:/var/amanda
> >
> > So I look in the vtapes, and find its being done too.
> > Checking for lathe:/var/amanda, its there
> > Checking for lathe:/usr/local, they've been done too
>
> What do "amadmin Daily info shop", etc. say?
>
That "info" directory does not exist, never has existed here that I can 
recall.

>   Nathan
> Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
> Ray Ontko & Co.  -  Software consulting services  -  http://www.ontko.com/
>  GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
>  Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239



Copyright 2018 by Maurice E. Heskett
-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page <http://geneslinuxbox.net:6309/gene>


Re: All level 0 on the same run?

2018-10-31 Thread Nathan Stratton Treadway
On Wed, Oct 31, 2018 at 07:59:02 -0400, Gene Heskett wrote:
> yadda yadda. So just where the hell do I look for these?
> root@coyote:/amandatapes/Dailys# su amanda -c "/usr/local/sbin/amadmin 
> Daily due"|grep Overdue
> Overdue 21 days: shop:/usr/local
> Overdue 21 days: shop:/var/amanda
> Overdue 21 days: lathe:/usr/local
> Overdue 21 days: lathe:/var/amanda
> Overdue 21 days: GO704:/var/amanda
> 
> So I look in the vtapes, and find its being done too.
> Checking for lathe:/var/amanda, its there
> Checking for lathe:/usr/local, they've been done too

What do "amadmin Daily info shop", etc. say?

Nathan

Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


Re: All level 0 on the same run?

2018-10-31 Thread Nathan Stratton Treadway
On Wed, Oct 31, 2018 at 08:18:47 -0400, Gene Heskett wrote:
> that link from tapelist.last_write -> 27062 is dead, there is no 27062 
> file to be found. WTH is that? And how can that be causing the 
> erroneous amadmin due's.

(You successfully moved your server to Amanda 3.5.x, right?)

The last_write symlink is used for locking or something; in any case
it's supposed to point to a number (perhaps a sequence number?) rather than
an actual file.
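
You can confirm the link target is a bare value rather than a path with
readlink; a sketch, run from the config directory shown in your listing:

  $ cd /usr/local/etc/amanda/Daily
  $ readlink tapelist.last_write
  27062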

> So I'm apparently plumb bumfuzzled at this point. Help! Where is JLM when 
> I need him.

(Yeah.)
Nathan

Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


Re: All level 0 on the same run?

2018-10-31 Thread Chris Nighswonger
FWIW, here is the output of amadmin balance before last night's run and
again this morning. No overdues, so I guess that's good. I'm not
experienced enough to make much of the balance percentages, but am now
wondering if I should work at breaking up the large DLEs into smaller
subsets as several have suggested.

root@scriptor:/home/manager# su backup -c "/usr/sbin/amadmin campus balance"

 due-date  #fsorig kB out kB   balance
--
10/30 Tue   25  359009166  284972243   +102.1%

11/03 Sat   15  632526057  420083122   +197.9%
--
TOTAL   40  991535223  705055365 141011073
  (estimated 5 runs per dumpcycle)
 (13 filesystems overdue. The most being overdue 1 day.)
root@scriptor:/home/manager# su backup -c "/usr/sbin/amadmin campus balance"

 due-date  #fsorig kB out kB   balance
--
10/31 Wed1  0  0  ---

11/03 Sat   39 1079730080  776153947   +400.0%
11/04 Sun0  0  0  ---
--
TOTAL   40 1079730080  776153947 155230789
  (estimated 5 runs per dumpcycle)


On Tue, Oct 30, 2018 at 3:56 PM Gene Heskett  wrote:

> On Tuesday 30 October 2018 15:29:37 Nathan Stratton Treadway wrote:
>
> > On Tue, Oct 30, 2018 at 14:20:55 -0400, Chris Nighswonger wrote:
> > > Why in the world does Amanda plan level 0 backups for all entries in
> > > a DLE for the same run? This causes all sorts of problems.
> > >
> > > Is there any solution for this? I've read some of the creative
> > > suggestions, but it seems a bunch of trouble.
> >
> > The operation of Amanda's planner depends on many inputs, both "fixed"
> > (e.g. configuration options) and constantly-varying (e.g. estimate
> > sizes and dump history), and I suspect there are only a few people in
> > the world who really understand it fully -- and I don't know how many
> > of them still read this mailing list :(.  But even one of those people
> > would probably need to look at a lot of information in order to know
> > what exactly was going on.
> >
> >
> > The good news is that I have noticed that the planner records a bunch
> > of interesting information in the amdump.DATETIMESTAMP log file, so at
> > least that seems like the place to start investigating.  Look in
> > particular for the following sections: DONE QUEUE, ANALYZING
> > ESTIMATES, INITIAL SCHEDULE, DELAYING DUMPS IF NEEDED, PROMOTING DUMPS
> > IF NEEDED, and finally GENERATING SCHEDULE.
> >
> > In your case, it seems likely that the  PROMOTING DUMPS section should
> > have a bunch of activity listed; if so, that might explain what it's
> > "thinking".
> >
> > If that doesn't give a clear answer, does the INITIAL SCHEDULE section
> > show all the dumps are already scheduled for level 0?  If not, pick a
> > DLE that is not shown at level 0 there and follow it down the log to
> > see if you can figure out what stage bumps it back to level 0...
> >
> >
> > On a different track of investigation, the  output of "amadmin CONFIG
> > balance" might show something useful (though off hand it seems
> > unlikely to explain why _all_ DLEs would be switched to level 0).
> >
> >
> > Let us know what you find out :)
> >   Nathan
> >
> I just changed the length of the dumpcycle and runspercycle up to 10,
> about last Friday while I was making the bump* stuff more attractive,
> but the above command returns that there are 5 filesystems out of date:
> su amanda -c "/usr/local/sbin/amadmin Daily balance"
>
>  due-date  #fsorig MB out MB   balance
> --
> 10/30 Tue5  0  0  ---
> 10/31 Wed1  17355   8958-45.3%
> 11/01 Thu2  10896  10887-33.5%
> 11/02 Fri4  35944   9298-43.2%
> 11/03 Sat4  14122  10835-33.8%
> 11/04 Sun3  57736  57736   +252.7%
> 11/05 Mon2  39947  30635+87.1%
> 11/06 Tue8   4235   4215-74.3%
> 11/07 Wed4  19503  14732-10.0%
> 11/08 Thu   32  31783  16408 +0.2%
> --
> TOTAL   65 231521 163704 16370
>   (estimated 10 runs per dumpcycle)
>  (5 filesystems overdue. The most being overdue 20 days.)
>
> That last line is disturbing. Ideas anyone? I'll certainly keep an eye on
> it.
>
> Cheers & thanks, Gene Heskett
> --
> "There are four boxes to be used in defense of liberty:
>  soap, ballot, jury, and ammo. Please use in that order."
> -Ed Howdershelt (Author)
> Genes Web page <http://geneslinuxbox.net:6309/gene>
>


Re: All level 0 on the same run?

2018-10-31 Thread Chris Nighswonger
So, looking at this more, it may be self-inflicted. Last week I changed
blocksize to 512k, and began amrmtape and amlabel with the oldest tape
first, working backward day by day. I run backups 5 nights per week with
a cycle of 13 tapes (see below). I would have thought that this would have
allowed the change in blocksize to run seamlessly. Maybe not. I'm now
suspecting that running amrmtape --cleanup caused Amanda to bork and fall
back to level 0 backups. She did this two nights in a row!!!

Anyway, I'm going to hold off any further concerns until I finish a
complete tapecycle. If the problem continues after that point, I'll pick
back up.

Relevant lines from amanda.conf:

dumpcycle 5 days
runspercycle 5
tapecycle 13 tapes
runtapes 1
flush-threshold-dumped 50
bumpsize 10 Mbytes
bumppercent 0
bumpmult 1.5
bumpdays 2
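
If I remember Amanda's bump logic right (treat this as an assumption,
not gospel), with bumppercent 0 the threshold for bumping to the next
level is bumpsize * bumpmult^(level-1), so these settings work out to
roughly:

  bump 1 -> 2: 10 MB
  bump 2 -> 3: 10 * 1.5   = 15 MB
  bump 3 -> 4: 10 * 1.5^2 = 22.5 MB

with bumpdays 2 meaning a level has to stay put two runs before bumping
again.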

Kind regards,
Chris

On Tue, Oct 30, 2018 at 2:32 PM Debra S Baddorf  wrote:

> Is this the first backup run for a long while?  If so, then they are all
> DUE, so amanda feels it has to schedule them all, now.
>
> Is this the first backup ever?   Ditto above.
>
> Did you perhaps run "amadmin <config> force <host> *", which forces a level 0
> on all disks?
> Did you specify "strategy noinc", which does the same?
> Or "skip-incr yes"?  Ditto.
>
> Did you replace a whole disk, making all the files look like they’ve never
> been backed up?
>
> Okay,  failing all the above obvious reasons,  I’ll leave others to
> discuss “planner” reasons.  Sorry!
> Deb Baddorf
> Fermilab
>
> > On Oct 30, 2018, at 1:20 PM, Chris Nighswonger <
> cnighswon...@foundations.edu> wrote:
> >
> > Why in the world does Amanda plan level 0 backups for all entries in a
> DLE for the same run? This causes all sorts of problems.
> > Is there any solution for this? I've read some of the creative
> suggestions, but it seems a bunch of trouble.
> > Kind regards,
> > Chris
> >
> > 0 19098649k waiting for dumping
> > 0 9214891k waiting for dumping
> > 0 718824k waiting for dumping
> > 0 365207k waiting for dumping
> > 0 2083027k waiting for dumping
> > 0 3886869k waiting for dumping
> > 0 84910k waiting for dumping
> > 0 22489k dump done (7:23:34), waiting for writing to tape
> > 0 304k dump done (7:22:30), waiting for writing to tape
> > 0 2613k waiting for dumping
> > 0 30k dump done (7:23:07), waiting for writing to tape
> > 0 39642k dump done (7:23:07), waiting for writing to tape
> > 0 8513409k waiting for dumping
> > 0 39519558k waiting for dumping
> > 0 47954k waiting for dumping
> > 0 149877984k dumping 145307840k ( 96.95%) (7:22:15)
> > 0 742804k waiting for dumping
> > p" 0 88758k waiting for dumping
> > 0 12463k dump done (7:24:19), waiting for writing to tape
> > 0 5544352k waiting for dumping
> > 0 191676480k waiting for dumping
> > 0 3799277k waiting for dumping
> > 0 3177171k waiting for dumping
> > 0 11058544k waiting for dumping
> > 0 230026440k dump done (7:22:13), waiting for writing to tape
> > 0 8k dump done (7:24:24), waiting for writing to tape
> > 0 184k dump done (7:24:19), waiting for writing to tape
> > 0 1292009k waiting for dumping
> > 0 2870k dump done (7:23:23), waiting for writing to tape
> > 0 13893263k waiting for dumping
> > 0 6025026k waiting for dumping
> > 0 6k dump done (7:22:15), waiting for writing to tape
> > 0 42k dump done (7:24:24), waiting for writing to tape
> > 0 53k dump done (7:24:19), waiting for writing to tape
> > 0 74462169k waiting for dumping
> > 0 205032k waiting for dumping
> > 0 32914k waiting for dumping
> > 0 1k dump done (7:24:02), waiting for writing to tape
> > 0 854272k waiting for dumping
> >
>
>


Re: All level 0 on the same run?

2018-10-31 Thread Gene Heskett
On Wednesday 31 October 2018 07:02:53 Nathan Stratton Treadway wrote:

> On Wed, Oct 31, 2018 at 06:32:41 -0400, Gene Heskett wrote:
> > I'll see if I can find the logs, I assume on the clients marked
> > guilty?
>
> Personally I'd probably start with "amstatus" on the server to see if
> it said anything about the DLEs in question, then maybe look into the
> amdump.1 log file (the one mentioned at the top of the amstatus
> report) for more details on that DLE.  If there is evidence in those
> places that it actually tried contacting the client to kick off a
> dump, that would tell me it was worth going over to the client's logs
> to try to track down those specific requests.
>
>   Nathan
Not it at all, Nathan; the backups are in the vtapes with zero errors 
logged. See my previous post a few minutes ago. There is either 
something goofy in the $config or a nilmerg, and the only thing I see 
that's odd is:

root@coyote:/usr/local/etc/amanda/Daily# ls -l
total 132
-rw-r--r-- 1 amanda disk   21488 Oct 25  2005 3hole.ps
-rw-r--r-- 1 amanda disk    5887 Oct 25  2005 8.5x11.ps
-rw-r--r-- 1 amanda disk   25423 Oct 28 05:19 amanda.conf
-rw-r--r-- 1 amanda disk   24655 Apr 20  2012 amanda.conf~
-rw------- 1 amanda disk     222 Oct  4 04:11 chg-disk
-rw-r--r-- 1 amanda disk       2 Aug 24 13:42 chg-disk-access
-rw-r--r-- 1 amanda disk       3 Aug 24 13:42 chg-disk-clean
-rw-r--r-- 1 amanda disk       2 Aug 24 13:42 chg-disk-slot
-rw-r--r-- 1 amanda disk     765 May 22  2004 chg-scsi.conf
-rw------- 1 amanda disk      18 Oct 31 03:44 command_file
-rw-r--r-- 1 amanda disk    3977 Aug 30 06:28 disklist
-rw-r--r-- 1 amanda disk    5002 Apr  3  2012 disklist~
-rw------- 1 amanda amanda  3809 Oct 31 03:03 tapelist
-rw------- 1 amanda disk    1071 Aug 24 13:22 tapelist.amlabel
lrwxrwxrwx 1 amanda amanda     5 Oct 31 03:03 tapelist.last_write -> 27062
-rw------- 1 amanda disk       0 Aug 31 03:03 tapelist.lock

that link from tapelist.last_write -> 27062 is dead, there is no 27062 
file to be found. WTH is that? And how can that be causing the 
erroneous amadmin due's.

So I'm apparently plumb bumfuzzled at this point. Help! Where is JLM when 
I need him.

-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page <http://geneslinuxbox.net:6309/gene>


Re: All level 0 on the same run?

2018-10-31 Thread Gene Heskett
On Wednesday 31 October 2018 06:32:41 Gene Heskett wrote:

> On Tuesday 30 October 2018 17:56:33 Gene Heskett wrote:
> > On Tuesday 30 October 2018 16:45:50 Nathan Stratton Treadway wrote:
> > > On Tue, Oct 30, 2018 at 15:51:36 -0400, Gene Heskett wrote:
> > > I just changed the length of the dumpcycle and runspercycle up
> > > to 10, about last Friday while I was making the bump* stuff
> > > more attractive, but the above command returns that there are 5
> > > filesystems out of date: su amanda -c "/usr/local/sbin/amadmin
> > > Daily balance"
> > > >
> > > >  due-date  #fsorig MB out MB   balance
> > > > --
> > > > 10/30 Tue5  0  0  ---
> > > > 10/31 Wed1  17355   8958-45.3%
> > > > 11/01 Thu2  10896  10887-33.5%
> > > > 11/02 Fri4  35944   9298-43.2%
> > > > 11/03 Sat4  14122  10835-33.8%
> > > > 11/04 Sun3  57736  57736   +252.7%
> > > > 11/05 Mon2  39947  30635+87.1%
> > > > 11/06 Tue8   4235   4215-74.3%
> > > > 11/07 Wed4  19503  14732-10.0%
> > > > 11/08 Thu   32  31783  16408 +0.2%
> > > > --
> > > > TOTAL   65 231521 163704 16370
> > > >   (estimated 10 runs per dumpcycle)
> > > >  (5 filesystems overdue. The most being overdue 20 days.)
> > > >
> > > > That last line is disturbing. Ideas anyone? I'll certainly keep
> > > > an eye on it.
> > >
> > > Did you already run today's (10/30's) dump?  Assuming you are
> > > running one dump per day, running the "balance" command after the
> > > dump has occurred generates confusing output, because the output
> > > includes a line for today's date but actually no (new) dumps will
> > > be happening today. So if you are able to run the balance command
> > > before a particular day's dump (but on the day in question), the
> > > output is a little bit more helpful.
> > >
> > > Anyway by "last line" are you talking about the "overdue" line?
> > > "amadmin ... due" should tell you which DLEs are overdue, and you
> > > can then look back through your Amanda Mail Reports to see
> > > if there's any indication of why they are overdue (especially the
> > > one that's 20 days overdue...)
> > >
> > > Anyway, the other thing that jumps out from the above listing is
> > > the line for 11/4, with a balance of 250%.  We can't tell from the
> > > listing what the relative sizes of the three DLEs in question
> > > are, though. Here's where a true Amanda expert could advise you
> > > better, but off hand I'd guess that one particular one of those
> > > three DLEs is much larger than all the rest of your DLEs and that
> > > fact is making it hard for the planner to come up with 10
> > > consecutive days of plans that when taken as a group actually work
> > > out to a functional cycle.
> > >
> > > ("amadmin ... due" should help you figure out which three DLEs are
> > > scheduled for that day, if you don't already know off hand which
> > > one is super large.)
> > >
> > > Hmmm, it would be interesting to know if the super-large DLE
> > > is also the one that's 20 days overdue.  Perhaps it's so big it
> > > can't fit on a tape, or something?
> > >
> > >   Nathan
> >
> > It's rigged to use 2 40GB vtapes if it has to.  But I found some
> > amandates problems in my kicking the tires, so we'll see what it does
> > for tonight's after-midnight run.
>
> And my perms & linkages fixes for amandates didn't help a bit; this
> morning after the run, it's still showing the same 5 dle's as 21 days
> overdue.
>
> I'll see if I can find the logs, I assume on the clients marked
> guilty?
>
Apparently not on the client; I've read every 20181031 log 
containing /usr/local, no failures there. Back to here I guess. Trawl 
thru another 20 megs of logs, no "fail" to be found by grep for this 
morning's date.  Humm, go look in /amandatapes/Dailys/data, and lo and 
behold even, that backup was done! Picked out of 65 files:
-rw------- 1 amanda amanda   13353012 Oct 31 03:44 00064.GO704._usr_local.0

Sooo, let's back up a day, and t

Re: All level 0 on the same run?

2018-10-31 Thread Nathan Stratton Treadway
On Wed, Oct 31, 2018 at 06:32:41 -0400, Gene Heskett wrote:
> I'll see if I can find the logs, I assume on the clients marked guilty?

Personally I'd probably start with "amstatus" on the server to see if it
said anything about the DLEs in question, then maybe look into the
amdump.1 log file (the one mentioned at the top of the amstatus report)
for more details on that DLE.  If there is evidence in those places that
it actually tried contacting the client to kick off a dump, that would
tell me it was worth going over to the client's logs to try to track
down those specific requests.
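
For example (the amdump.1 path is an assumption; it lives wherever your
logdir setting points):

  $ su amanda -c "/usr/local/sbin/amstatus Daily"
  $ su amanda -c "/usr/local/sbin/amstatus Daily --file /var/log/amanda/Daily/amdump.1"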

Nathan


Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


Re: All level 0 on the same run?

2018-10-31 Thread Gene Heskett
On Tuesday 30 October 2018 17:56:33 Gene Heskett wrote:

> On Tuesday 30 October 2018 16:45:50 Nathan Stratton Treadway wrote:
> > On Tue, Oct 30, 2018 at 15:51:36 -0400, Gene Heskett wrote:
> > > I just changed the length of the dumpcycle and runspercycle up to
> > > 10, about last Friday while I was making the bump* stuff more
> > > attractive, but the above command returns that there are 5
> > > filesystems out of date: su amanda -c "/usr/local/sbin/amadmin
> > > Daily balance"
> > >
> > >  due-date  #fsorig MB out MB   balance
> > > --
> > > 10/30 Tue5  0  0  ---
> > > 10/31 Wed1  17355   8958-45.3%
> > > 11/01 Thu2  10896  10887-33.5%
> > > 11/02 Fri4  35944   9298-43.2%
> > > 11/03 Sat4  14122  10835-33.8%
> > > 11/04 Sun3  57736  57736   +252.7%
> > > 11/05 Mon2  39947  30635+87.1%
> > > 11/06 Tue8   4235   4215-74.3%
> > > 11/07 Wed4  19503  14732-10.0%
> > > 11/08 Thu   32  31783  16408 +0.2%
> > > --
> > > TOTAL   65 231521 163704 16370
> > >   (estimated 10 runs per dumpcycle)
> > >  (5 filesystems overdue. The most being overdue 20 days.)
> > >
> > > That last line is disturbing. Ideas anyone? I'll certainly keep an
> > > eye on it.
> >
> > Did you already run today's (10/30's) dump?  Assuming you are
> > running one dump per day, running the "balance" command after the
> > dump has occurred generates confusing output, because the output
> > includes a line for today's date but actually no (new) dumps will be
> > happening today. So if you are able to run the balance command
> > before a particular day's dump (but on the day in question), the
> > output is a little bit more helpful.
> >
> > Anyway by "last line" are you talking about the "overdue" line?
> > "amadmin ... due" should tell you which DLEs are overdue, and you
> > can then look back through your Amanda Mail Reports to see
> > if there's any indication of why they are overdue (especially the
> > one that's 20 days overdue...)
> >
> > Anyway, the other thing that jumps out from the above listing is the
> > line for 11/4, with a balance of 250%.  We can't tell from the
> > listing what the relative sizes of the three DLEs in question are,
> > though. Here's where a true Amanda expert could advise you better,
> > but off hand I'd guess that one particular one of those three DLEs is
> > much larger than all the rest of your DLEs and that fact is making
> > it hard for the planner to come up with 10 consecutive days of plans
> > that when taken as a group actually work out to a functional cycle.
> >
> > ("amadmin ... due" should help you figure out which three DLEs are
> > scheduled for that day, if you don't already know off hand which one
> > is super large.)
> >
> > Hmmm, it would be interesting to know if the super-large DLE is
> > also the one that's 20 days overdue.  Perhaps it's so big it can't
> > fit on a tape, or something?
> >
> > Nathan
>
> It's rigged to use 2 40GB vtapes if it has to.  But I found some
> amandates problems in my kicking the tires, so we'll see what it does
> for tonight's after-midnight run.
>
And my perms & linkages fixes for amandates didn't help a bit; this 
morning after the run, it's still showing the same 5 dle's as 21 days 
overdue.

I'll see if I can find the logs, I assume on the clients marked guilty?

> Funny is that 5 machines are still running wheezy, but the default
> names in both /etc/passwd and in /etc/group don't match! One of them
> isn't a Wurlitzer, but since they were installed at different times
> over the history here...  Who knows, and I'm too lazy to look
> in /var/cache/apt/archives to check versions. We'll see what happens
> tonight.
>
> > Nathan Stratton Treadway  -  natha...@ontko.com  -  Mid-Atlantic region
> > Ray Ontko & Co.  -  Software consulting services  -  http://www.ontko.com/
> >  GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
> >  Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239
>
> Copyright 2018 by Maurice E. Heskett



Copyright 2018 by Maurice E. Heskett
-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page <http://geneslinuxbox.net:6309/gene>