On Wednesday 26 February 2014 12:19:18 Gene Heskett did opine:

> On Wednesday 26 February 2014 10:39:13 Gene Heskett did opine:
> > Greetings;
> > 
> > For the last 3 backups, with no change to amanda.conf in months, I
> > have awakened to a hung tar task using 100% of a core, more than 5
> > hours after the run should have completed.
> > 
> > It is in that state now.  How can I find what is causing this
> > blockage? Here is the report from yesterday's attempt, received after
> > I had used htop to send this stuck tar instance a normal quit signal.
> > 
> > These dumps were to tape Dailys-9.
> > The next 2 tapes Amanda expects to use are: Dailys-10, Dailys-11.
> > 
> > FAILURE DUMP SUMMARY:
> >   planner: ERROR Some estimate timeout on coyote, using server estimate if possible
> >   coyote /CoCo lev 0  FAILED [too many dumper retry: [request failed: Connection timed out]]
> >   coyote /GenesAmandaHelper-0.61 lev 1  FAILED [too many dumper retry: [request failed: Connection timed out]]
> >   coyote /home lev 2  FAILED [too many dumper retry: [request failed: Connection timed out]]
> >   coyote /lib lev 0  FAILED [disk /lib, all estimate timed out]
> >   coyote /opt lev 0  FAILED [disk /opt, all estimate timed out]
> >   coyote /root lev 0  FAILED [disk /root, all estimate timed out]
> >   coyote /sbin lev 0  FAILED [disk /sbin, all estimate timed out]
> >   coyote /var lev 0  FAILED [disk /var, all estimate timed out]
> >   coyote /usr/bin lev 0  FAILED [disk /usr/bin, all estimate timed out]
> >   coyote /usr/dlds/misc lev 0  FAILED [disk /usr/dlds/misc, all estimate timed out]
> >   coyote /usr/dlds/tgzs lev 0  FAILED [disk /usr/dlds/tgzs, all estimate timed out]
> >   coyote /usr/dlds/books lev 0  FAILED [disk /usr/dlds/books, all estimate timed out]
> >   coyote /usr/include lev 0  FAILED [disk /usr/include, all estimate timed out]
> >   coyote /usr/lib lev 0  FAILED [disk /usr/lib, all estimate timed out]
> >   coyote /usr/libexec lev 0  FAILED [disk /usr/libexec, all estimate timed out]
> >   coyote /usr/movies lev 0  FAILED [disk /usr/movies, all estimate timed out]
> >   coyote /usr/local lev 0  FAILED [disk /usr/local, all estimate timed out]
> >   coyote /usr/music lev 0  FAILED [disk /usr/music, all estimate timed out]
> >   coyote /usr/pix lev 0  FAILED [disk /usr/pix, all estimate timed out]
> >   coyote /usr/sbin lev 0  FAILED [disk /usr/sbin, all estimate timed out]
> >   coyote /usr/share lev 0  FAILED [disk /usr/share, all estimate timed out]
> >   coyote /usr/src lev 0  FAILED [disk /usr/src, all estimate timed out]
> >   coyote /usr/games lev 0  FAILED [disk /usr/games, all estimate timed out]
> >   coyote /CoCo lev 0  FAILED Got empty header
> >   coyote /CoCo lev 0  FAILED Got empty header
> >   coyote /GenesAmandaHelper-0.61 lev 1  FAILED Got empty header
> >   coyote /GenesAmandaHelper-0.61 lev 1  FAILED Got empty header
> >   coyote /boot lev 0  FAILED Got empty header
> >   coyote /home lev 2  FAILED Got empty header
> >   coyote /home lev 2  FAILED Got empty header
> > 
> > However, at the bottom of the report, the remote systems were backed
> > up just fine.
> > 
> > lathe   /home                 1       1       0    5.6  0:00   169.9  0:36     1.2
> > lathe   /usr/lib/amanda       1       0       0    3.3  0:05     0.4  0:00    10.0
> > lathe   /usr/local            1       0       0    2.0  0:05     0.4  0:00    10.0
> > lathe   /var/lib/amanda       1       0       0   22.0  0:00   354.6  0:00   220.0
> > shop    /home                 3       4       0    8.2  0:07    43.6  0:00  3080.0
> > shop    /usr/lib/amanda       1       0       0    3.3  0:05     0.4  0:00    10.0
> > shop    /usr/local            1       0       0    2.0  0:05     0.4  0:00    10.0
> > shop    /var/lib/amanda       1       2       0   17.8  0:01   584.4  0:00  2950.0
> > 
> > (brought to you by Amanda version 4.0.0alpha.svn.4761)
> > 
> > Now, the thing that _has_ changed is the running kernel, from a 3.12.9
> > that seemed to work well with amanda, to a 3.13.5 that I had one heck
> > of a time building because of Kconfig dependency errors that caused
> > all of the many "media" options to disappear from the "make ?config"
> > operations.  So this kernel may well be missing something that tar
> > needs.
> > 
> > So, what, from this, would be the most likely candidate? The config.gz
> > is attached.
> > 
> > Thank you very much for any insight that can be determined from this.
> > 
> > Cheers, Gene
> 
> Ping!  In the meantime I have rebuilt this kernel 3 times, getting an
> unbootable one once, but without finding the option that seems to throw
> tar into a forever loop.
> 
> FWIW, when tar is in that state, the only drive activity comes from
> fetchmail, which runs every 3 minutes; tar apparently gets stuck
> hammering on something it can't access.  And yet /lib, the DLE it
> appears to be stuck on while attempting an estimate, can be listed
> with an ls -laR with no problems.
> 
> This is the distro's copy of tar-1.22, but I've no clue what options it
> was compiled with.  Is this 1.22 a known bad actor under some
> conditions?
> 
> Cheers, Gene
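
One way to look at a spinning tar before killing it is to read what the kernel exposes about it under /proc. The following is only a minimal sketch, assuming a Linux /proc filesystem and enough privilege (usually root) to read another user's process; the helper names parse_status and snapshot are made up for illustration and are not part of Amanda or tar.

```python
import os
from pathlib import Path


def parse_status(text):
    """Turn the 'Key:<tab>value' lines of /proc/<pid>/status into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def snapshot(pid):
    """Collect the name, scheduler state and open files of one process.

    Assumes Linux; raises OSError if /proc/<pid> is missing or unreadable.
    """
    proc = Path("/proc") / str(pid)
    status = parse_status((proc / "status").read_text())
    open_files = []
    for fd in sorted((proc / "fd").iterdir(), key=lambda p: int(p.name)):
        try:
            open_files.append((int(fd.name), os.readlink(fd)))
        except OSError:
            # The descriptor was closed between listing and reading it.
            continue
    return {
        "name": status.get("Name"),
        "state": status.get("State"),
        "open_files": open_files,
    }
```

Calling snapshot() with the PID that htop shows for the stuck tar returns its state (a process burning a full core normally reports "R (running)") and the paths behind its open descriptors, which may hint at the file or directory it is looping on.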

I am building tar-1.27 to see if it fixes the hangs; it passed everything
in make check that it was supposed to.

This IS tar from the GNU site, but when I gave the configure script the
path to the newly installed 1.27, I got a message from configure that it
is NOT GNU tar, but that it will be used.

Is this a concern?  Oh, oh, facepalm: I should have told it
/usr/_local_/bin/tar.  I'll fix & rebuild again before installing.
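
A quick way to confirm which tar a given path really points at is to ask the binary itself. This is a minimal sketch, not necessarily the test Amanda's configure performs: it assumes only that GNU tar names itself on the first line of its --version output (for example "tar (GNU tar) 1.27"), and the function names are made up for illustration.

```python
import subprocess


def looks_like_gnu_tar(version_output):
    """Return True if the first line of `tar --version` output names GNU tar."""
    lines = version_output.splitlines()
    return bool(lines) and "GNU tar" in lines[0]


def tar_version_output(tar_path):
    """Run `<tar_path> --version` and return what it printed.

    Raises FileNotFoundError if tar_path does not exist.
    """
    result = subprocess.run(
        [tar_path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        check=False,
    )
    return result.stdout
```

Running looks_like_gnu_tar(tar_version_output("/usr/local/bin/tar")) and the same for the distro's tar shows at a glance whether the path handed to configure was the one intended.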

Make check was happy.

Installed, amcheck is happy.

Now we see what happens tonight.

Thanks for reading.

Cheers, Gene
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page <http://geneslinuxbox.net:6309/gene>

NOTICE: Will pay 100 USD for an HP-4815A defective but
complete probe assembly.
