I think most of your problems are explained in the amanda report.  I have
some questions that may help find the problem:

  - How much free space do you have on the holding disk, and how much
    space is used?
    
  - How big are the backups every day?

  - Do you use hardware compression on the tape?
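
If it helps to collect the first two numbers, here is a minimal Python
sketch (not part of amanda).  It assumes the amstatus line layout shown in
your message below, with the state sometimes wrapped onto the next line.
The function names are my own, and the holding disk path is whatever your
amanda.conf defines; "." in the example is only a placeholder.

```python
import re
import shutil

# Assumed layout, taken from the amstatus output quoted in this thread:
#   host:disk   YYYYMMDDhhmmss  level  <size>k  [state, possibly on next line]
DUMP_LINE = re.compile(r"^\s*(?:>\s*)?(\S+:\S+)\s+(\d{14})\s+(\d+)\s+(\d+)k\b")


def pending_kbytes(status_text):
    """Sum the sizes (the 'k' figures) of dumps still waiting for the tape."""
    lines = status_text.splitlines()
    total = 0
    for i, line in enumerate(lines):
        m = DUMP_LINE.match(line)
        if not m:
            continue
        state = line[m.end():]
        # The mail client wrapped the state onto the following line;
        # only use that line if it is not itself another dump entry.
        if i + 1 < len(lines) and not DUMP_LINE.match(lines[i + 1]):
            state += " " + lines[i + 1]
        if "wait for writing" in state:
            total += int(m.group(4))
    return total


def holding_disk_usage(path):
    """Return total/used/free space of the filesystem holding `path`, in KiB."""
    usage = shutil.disk_usage(path)
    return {
        "total_k": usage.total // 1024,
        "used_k": usage.used // 1024,
        "free_k": usage.free // 1024,
    }


# Three entries copied from the quoted display, wrapped as in the mail.
SAMPLE = """\
gobook3.slsware.lan:/usr                       20211203133758 0   4519450k
dump done, wait for writing
pi.slsware.lan:/lib                            20211203133758 0   1720720k
dump done, writing (1518944k done (88.27%)) (14:03:16)
sbox.slsware.lan:/home                         20211203133758 0  74493810k
dump done, wait for writing
"""

if __name__ == "__main__":
    print("waiting for tape (k):", pending_kbytes(SAMPLE))
    print("holding disk (placeholder path '.'):", holding_disk_usage("."))
```

Comparing the first figure with the free space on the holding disk shows
whether the dumps waiting for the tape still fit there.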


Kind regards
Jose M Calhariz


On Fri, Dec 03, 2021 at 11:18:08PM +0000, ghe2001 wrote:
> amanda version: amadmin-3.5.1
> OS: Debian Linux, Buster
> Host: Supermicro
> PCI card: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 
> [Falcon] (rev 03)
> Tape drive: Quantum LTO-5
> Tapes: Quantum Ultrium 5
> 
> Problem: Over the years, I've written a bunch of amanda scripts.  Here's what 
> the one that displays backup progress says at one point:
> 
> gobook3.slsware.lan:/usr                       20211203133758 0   4519450k 
> dump done, wait for writing
> pi.slsware.lan:/lib                            20211203133758 0   1720720k 
> dump done, writing (1518944k done (88.27%)) (14:03:16)
> pi.slsware.lan:/toshiba1                       20211203133758 0   5089430k 
> dump done, wait for writing
> pi.slsware.lan:/usr                            20211203133758 0   3108660k 
> dump done, wait for writing
> sbox.slsware.lan:/blackHole/amanda             20211203133758 0   2251540k 
> dump done, wait for writing
> sbox.slsware.lan:/home                         20211203133758 0  74493810k 
> dump done, wait for writing
> sbox.slsware.lan:/usr                          20211203133758 0   5144210k 
> dump done, wait for writing
>                0: writing (pi.slsware.lan:/lib)
> === Fri Dec 03, 2021 02:03:35 PM
> 
>   This is normal.  A lot of flushing's been done, and all the small files.  
> The script futzes with amstatus and displays every 5 seconds.
> 
> .
> 
>   The '.' means amstatus has written the same line, and nothing of interest 
> has happened.
> 
> gobook3.slsware.lan:/usr                       20211203133758 0   4519450k 
> dump done, wait for writing
> pi.slsware.lan:/toshiba1                       20211203133758 0   5089430k 
> dump done, wait for writing
> pi.slsware.lan:/usr                            20211203133758 0   3108660k 
> dump done, wait for writing
> sbox.slsware.lan:/blackHole/amanda             20211203133758 0   2251540k 
> dump done, wait for writing
> sbox.slsware.lan:/home                         20211203133758 0  74493810k 
> dump done, wait for writing
> sbox.slsware.lan:/usr                          20211203133758 0   5144210k 
> dump done, wait for writing
>                0: tape error: Couldn't rewind device to finish: No such 
> device, splitting not enabled (pi.slsware.lan:/lib)
> === Fri Dec 03, 2021 02:03:52 PM
> 
>   Notice that less than ~30 seconds passed since the last display.  And who's 
> trying to rewind and finding no device (/dev/nst0, I suppose)?
> 
>   After this, every line says "terminated while waiting for writing."
> 
>   This looks to me like maybe something can't deal with files larger than 1G.  
> But I've seen it get to 55G or so.
> 
> 
> This morning, Logwatch said:
> 
> WARNING:  Kernel Errors Present
>     st 0:0:3:0: [st0] Error 10000 (driver bt ...:  2 Time(s)
>     st 0:0:3:0: [st0] Error e0000 (driver bt ...:  1 Time(s)
>     st 0:0:4:0: [st0] Error 10000 (driver bt ...:  1 Time(s)
>     st 0:0:4:0: [st0] Error e0000 (driver bt ...:  1 Time(s)
>     st 0:0:5:0: [st0] Error 10000 (driver bt ...:  1 Time(s)
>     st 0:0:5:0: [st0] Error e0000 (driver bt ...:  1 Time(s)
>     st 0:0:6:0: [st0] Error 10000 (driver bt ...:  1 Time(s)
>     st 0:0:6:0: [st0] Error e0000 (driver bt ...:  1 Time(s)
>     st 0:0:7:0: [st0] Error 10000 (driver bt ...:  1 Time(s)
>     st 0:0:7:0: [st0] Error 70000 (driver bt ...:  1 Time(s)
>     st 0:0:7:0: [st0] Error e0000 (driver bt ...:  1 Time(s)
> 
> st0 is the rewinding tape device; nst0 is the non-rewinding one.  As best I 
> know, nobody does anything with st0 -- nst0 is what I, and amanda, use.
> 
> This looks like there might be a bent driver.
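
A note on the numbers in the Logwatch lines above: as far as I know, the
value the st driver prints after "Error" is the SCSI command result in
hex, with the host byte in bits 16-23.  The sketch below decodes it.  The
DID_* names are written from my memory of the Linux kernel SCSI headers,
so please check them against the headers of the kernel you run;
decode_scsi_result is a name I made up.

```python
# Host byte names as I recall them from the Linux kernel SCSI headers
# (an assumption to verify against your own kernel sources).
HOST_BYTE_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x06: "DID_PARITY",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
    0x09: "DID_BAD_INTR",
    0x0A: "DID_PASSTHROUGH",
    0x0B: "DID_SOFT_ERROR",
    0x0C: "DID_IMM_RETRY",
    0x0D: "DID_REQUEUE",
    0x0E: "DID_TRANSPORT_DISRUPTED",
    0x0F: "DID_TRANSPORT_FAILFAST",
}


def decode_scsi_result(hex_text):
    """Split a hex SCSI result such as 'e0000' into its four bytes."""
    result = int(hex_text, 16)
    host = (result >> 16) & 0xFF
    return {
        "status": result & 0xFF,          # SCSI status byte
        "msg": (result >> 8) & 0xFF,      # message byte
        "host": host,                     # set by the host adapter driver
        "driver": (result >> 24) & 0xFF,  # driver byte
        "host_name": HOST_BYTE_NAMES.get(host, "unknown"),
    }


if __name__ == "__main__":
    # The three distinct values from the Logwatch report.
    for value in ("10000", "e0000", "70000"):
        print(value, "->", decode_scsi_result(value)["host_name"])
```

If that reading is right, all three values have zero status and driver
bytes and a nonzero host byte, which would mean the errors are reported by
the host adapter side rather than by the tape drive itself.  That would
also fit the "No such device" message and the SCSI target number changing
from 0:0:3:0 up to 0:0:7:0 in the same report.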
> 
> 
> Amcheck:
> 
> slot 1: volume 'sls-9'
> Will write to volume 'sls-9' in slot 1.
> Writing label 'sls-9' to check writability
> Volume 'sls-9' is writeable.
> Server check took 25.073 seconds
> (brought to you by Amanda 3.5.1)
> 
> 
> I'm at a loss.  Amanda worked for a decade or so with the old DLT drive, then 
> for 3 or 4 years with the LTO -- then started breaking in every backup.  I 
> trusted Quantum and LSI -- I ran tar to copy my entire / directory to the 
> Quantum tape that came with the drive, and it worked -- that ruled out the 
> card and the drive, so I bought a new collection of Quantum tapes to replace 
> the HPs I had been using.  No joy.
> 
> Tar used a block size of 512, and amanda uses 32768.  I thought the block 
> size might be the problem, so I looked on the 'Net and found a little bit 
> about changing the size, but I already knew how to do that in mt.  There was 
> nothing about changing it in amanda.  Besides, it'd been working for years at 
> 32768.
> 
> The only thought I have left is that something's wrong with the Linux driver.  
> But I'd expect to have seen a lot of traffic in this list if there were 
> something wrong with it.  The change to failure did, I think, happen right 
> after an update, though.
> 
> Any ideas, explanations, fixes?
> 

-- 
        Unhappy the people that needs heroes.
                -- Bertolt Brecht
