On Wednesday 26 February 2014 10:39:13 Gene Heskett did opine:

> Greetings;
> 
> 3 backups ago, with no change to the amanda.conf in months, I have
> awakened to a hung tar task using 100% of a core, more than 5 hours
> after it should have completed.
> 
> It is in that state now.  How can I find what is causing this blockage?
> Here is the report from yesterday's attempt, received after I had used
> htop to send this stuck tar instance a normal quit signal.
> 
> These dumps were to tape Dailys-9.
> The next 2 tapes Amanda expects to use are: Dailys-10, Dailys-11.
> FAILURE DUMP SUMMARY:
>   planner: ERROR Some estimate timeout on coyote, using server estimate if possible
>   coyote /CoCo lev 0  FAILED [too many dumper retry: [request failed: Connection timed out]]
>   coyote /GenesAmandaHelper-0.61 lev 1  FAILED [too many dumper retry: [request failed: Connection timed out]]
>   coyote /home lev 2  FAILED [too many dumper retry: [request failed: Connection timed out]]
>   coyote /lib lev 0  FAILED [disk /lib, all estimate timed out]
>   coyote /opt lev 0  FAILED [disk /opt, all estimate timed out]
>   coyote /root lev 0  FAILED [disk /root, all estimate timed out]
>   coyote /sbin lev 0  FAILED [disk /sbin, all estimate timed out]
>   coyote /var lev 0  FAILED [disk /var, all estimate timed out]
>   coyote /usr/bin lev 0  FAILED [disk /usr/bin, all estimate timed out]
>   coyote /usr/dlds/misc lev 0  FAILED [disk /usr/dlds/misc, all estimate timed out]
>   coyote /usr/dlds/tgzs lev 0  FAILED [disk /usr/dlds/tgzs, all estimate timed out]
>   coyote /usr/dlds/books lev 0  FAILED [disk /usr/dlds/books, all estimate timed out]
>   coyote /usr/include lev 0  FAILED [disk /usr/include, all estimate timed out]
>   coyote /usr/lib lev 0  FAILED [disk /usr/lib, all estimate timed out]
>   coyote /usr/libexec lev 0  FAILED [disk /usr/libexec, all estimate timed out]
>   coyote /usr/movies lev 0  FAILED [disk /usr/movies, all estimate timed out]
>   coyote /usr/local lev 0  FAILED [disk /usr/local, all estimate timed out]
>   coyote /usr/music lev 0  FAILED [disk /usr/music, all estimate timed out]
>   coyote /usr/pix lev 0  FAILED [disk /usr/pix, all estimate timed out]
>   coyote /usr/sbin lev 0  FAILED [disk /usr/sbin, all estimate timed out]
>   coyote /usr/share lev 0  FAILED [disk /usr/share, all estimate timed out]
>   coyote /usr/src lev 0  FAILED [disk /usr/src, all estimate timed out]
>   coyote /usr/games lev 0  FAILED [disk /usr/games, all estimate timed out]
>   coyote /CoCo lev 0  FAILED Got empty header
>   coyote /CoCo lev 0  FAILED Got empty header
>   coyote /GenesAmandaHelper-0.61 lev 1  FAILED Got empty header
>   coyote /GenesAmandaHelper-0.61 lev 1  FAILED Got empty header
>   coyote /boot lev 0  FAILED Got empty header
>   coyote /home lev 2  FAILED Got empty header
>   coyote /home lev 2  FAILED Got empty header
> 
> However, at the bottom of the report, the remote systems were backed up
> just fine.
>
> lathe   /home                       1       1       0   5.6  0:00   169.9  0:36     1.2
> lathe   /usr/lib/amanda             1       0       0   3.3  0:05     0.4  0:00    10.0
> lathe   /usr/local                  1       0       0   2.0  0:05     0.4  0:00    10.0
> lathe   /var/lib/amanda             1       0       0  22.0  0:00   354.6  0:00   220.0
> shop    /home                       3       4       0   8.2  0:07    43.6  0:00  3080.0
> shop    /usr/lib/amanda             1       0       0   3.3  0:05     0.4  0:00    10.0
> shop    /usr/local                  1       0       0   2.0  0:05     0.4  0:00    10.0
> shop    /var/lib/amanda             1       2       0  17.8  0:01   584.4  0:00  2950.0
> 
> (brought to you by Amanda version 4.0.0alpha.svn.4761)
> 
> Now, the thing that _has_ changed is the running kernel, from a 3.12.9
> that seemed to work well with amanda, to a 3.13.5 that I had one heck
> of a time building because of Kconfig dependency errors that caused all
> of the many "media" options to disappear from the "make ?config"
> operations, and it is likely this one could be missing something that
> tar needs.
> 
> So, what, from this, would be the most likely candidate? The config.gz
> is attached.
> 
> Thank you very much for any insight that can be determined from this.
> 
> Cheers, Gene

Ping!  In the meantime I have rebuilt this kernel 3 times, getting an 
unbootable one once, but without finding the option that seems to throw tar 
into a forever loop.

FWIW, when tar is in that state, the only drive activity comes from 
fetchmail, which loops every 3 minutes, so tar is apparently stuck 
hammering on something it can't access.  And yet the DLE it appears to be 
stuck on while attempting an estimate, /lib, can be listed with ls -laR 
with no problems.
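
For anyone wanting to look at a spinning process like this from the outside, here is a minimal sketch, assuming a Linux /proc filesystem and enough privilege (probably root) to read another user's /proc entries.  It samples /proc/<pid>/stat twice to see whether the CPU time is piling up in user mode or in kernel mode, then lists the working directory and open files.  The script and its function names (parse_stat, read_stat, open_files) are made up for illustration; they are not part of Amanda or tar, and this is not a confirmed diagnosis of the hang.

```python
import os
import sys
import time


def parse_stat(text):
    """Parse the contents of /proc/<pid>/stat into a small dict."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST closing parenthesis.
    lparen = text.index("(")
    rparen = text.rindex(")")
    rest = text[rparen + 2:].split()
    return {
        "pid": int(text[:lparen]),
        "comm": text[lparen + 1:rparen],
        "state": rest[0],        # field 3: R = running, D = uninterruptible sleep
        "utime": int(rest[11]),  # field 14: clock ticks spent in user mode
        "stime": int(rest[12]),  # field 15: clock ticks spent in kernel mode
    }


def read_stat(pid):
    """Read and parse /proc/<pid>/stat for a live process."""
    with open(f"/proc/{pid}/stat") as fh:
        return parse_stat(fh.read())


def open_files(pid):
    """Return (fd, target) pairs for the process's open file descriptors."""
    fd_dir = f"/proc/{pid}/fd"
    targets = []
    for fd in sorted(os.listdir(fd_dir), key=int):
        try:
            targets.append((int(fd), os.readlink(os.path.join(fd_dir, fd))))
        except OSError:
            pass  # descriptor was closed between listdir() and readlink()
    return targets


if __name__ == "__main__":
    pid = int(sys.argv[1])  # PID of the stuck tar, e.g. as shown by htop
    before = read_stat(pid)
    time.sleep(5)
    after = read_stat(pid)
    print(f"{after['comm']} (pid {pid}) state={after['state']}")
    print("user ticks in 5s:  ", after["utime"] - before["utime"])
    print("kernel ticks in 5s:", after["stime"] - before["stime"])
    print("cwd:", os.readlink(f"/proc/{pid}/cwd"))
    for fd, target in open_files(pid):
        print(f"fd {fd} -> {target}")
```

Saved under a hypothetical name such as procpeek.py, it would be run as "python3 procpeek.py <pid>".  If the kernel tick count is the one growing, the loop is most likely inside system calls, and attaching strace -p <pid> should show which call is being repeated; if only the user tick count grows, the loop is inside tar itself.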

This is the distro's copy of tar-1.22, but I've no clue what options it was 
compiled with.  Is this 1.22 a known bad actor under some conditions?

Cheers, Gene
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page <http://geneslinuxbox.net:6309/gene>

NOTICE: Will pay 100 USD for an HP-4815A defective but
complete probe assembly.
