What's the time difference between the first and last line in the amandad.debug file?
Is it bigger than your etimeout setting? If so, increase etimeout.
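
A minimal sketch of that check in Python, assuming the timestamped lines in amandad.debug start with epoch seconds followed by a colon (as in the excerpt quoted below). The names `elapsed_seconds` and `exceeds_timeout` are made up for illustration and are not part of Amanda; `timeout_seconds` is whatever time budget you want to compare against, such as your etimeout value.

```python
import re

# Matches lines such as "1289263081.214946: amandad: ..." (epoch-seconds prefix).
TIMESTAMP = re.compile(r"^(\d+\.\d+):")


def elapsed_seconds(lines):
    """Return seconds between the first and last timestamped line, or None."""
    stamps = [float(m.group(1)) for m in map(TIMESTAMP.match, lines) if m]
    if not stamps:
        return None
    return stamps[-1] - stamps[0]


def exceeds_timeout(lines, timeout_seconds):
    """True if the timestamped span of the log is longer than timeout_seconds."""
    elapsed = elapsed_seconds(lines)
    return elapsed is not None and elapsed > timeout_seconds
```

To use it, read the debug file into a list of lines (for example with `open(path).readlines()`) and pass that list in. Lines without a timestamp prefix are simply skipped.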

Jean-Louis

Klas Heggemann wrote:
Hi list.

We're trying to back up snapshots of some ZFS filesystems on an OpenSolaris host. The client is running Amanda 2.6.1p2; the server is running 2.5.3 (with some Kerberos fixes added) on Solaris 10.

Backup of this client is via ssh and keys. The client is not built with 
Kerberos.
Backup is with gnutar (/usr/sfw/bin/gtar). ssh is Solaris ssh. No firewalls are involved.


However, the client fails to communicate with the server. It seems to calculate 
the sizes correctly.
This is from the amandad.debug file:

OPTIONS features=ffffffff9ffeffffffff7f;
/.zfs/snapshot/daily.0 1 SIZE 39529810
/.zfs/snapshot/daily.0 2 SIZE 39529810
/opt/.zfs/snapshot/daily.0 1 SIZE 3968010
/opt/.zfs/snapshot/daily.0 2 SIZE 3968010
/sandbox/.zfs/snapshot/daily.0 1 SIZE 10
/sandbox/.zfs/snapshot/daily.0 2 SIZE 10
/sandbox/klas/.zfs/snapshot/daily.0 1 SIZE 71922300
/sandbox/klas/.zfs/snapshot/daily.0 2 SIZE 71922300
/sandbox/klas2/.zfs/snapshot/daily.0 1 SIZE 38611950
1289263081.214946: amandad: security_stream_seterr(8074e80, write error to: Bad file number)
1289263081.214961: amandad: security_seterror(handle=8074d30, driver=fef3a688 (SSH) error=write error to: Bad file number)
1289263137.443175: amandad: sending PREP pkt:
<<<<<


(I notice that the last filesystem is not repeated the way the other ones are.)
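
For a longer debug file, that observation can be checked mechanically. The following Python sketch assumes the estimate lines have the shape `<disk> <number> SIZE <size>` seen in the excerpt above. It makes no claim about what the number means and only reports disks that are missing a number the other disks have. The function names are hypothetical, not part of Amanda.

```python
import re
from collections import defaultdict

# Matches lines such as "/opt/.zfs/snapshot/daily.0 1 SIZE 3968010".
SIZE_LINE = re.compile(r"^(\S+) (\d+) SIZE (\d+)$")


def estimates_by_disk(lines):
    """Map each disk to {number: size} for every SIZE line found."""
    found = defaultdict(dict)
    for line in lines:
        m = SIZE_LINE.match(line.strip())
        if m:
            found[m.group(1)][int(m.group(2))] = int(m.group(3))
    return dict(found)


def incomplete_disks(lines):
    """Return disks that lack a number reported for at least one other disk."""
    found = estimates_by_disk(lines)
    all_numbers = set().union(*(d.keys() for d in found.values()))
    return sorted(disk for disk, d in found.items() if set(d) != all_numbers)
```

Run against the excerpt above, `incomplete_disks` returns only the `/sandbox/klas2` snapshot, which matches the point where the write error appears in the log.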
The server will log:


   client /sandbox/klas2/.zfs/snapshot/daily.0 lev 0 FAILED [disk /sandbox/klas2/.zfs/snapshot/daily.0, all estimate timed out]
   client /sandbox/klas/.zfs/snapshot/daily.0 lev 0 FAILED [disk /sandbox/klas/.zfs/snapshot/daily.0, all estimate timed out]
   client /sandbox/.zfs/snapshot/daily.0 lev 0 FAILED [disk /sandbox/.zfs/snapshot/daily.0, all estimate timed out]
   client /opt/.zfs/snapshot/daily.0 lev 0 FAILED [disk /opt/.zfs/snapshot/daily.0, all estimate timed out]
   client /.zfs/snapshot/daily.0 lev 0 FAILED [disk /.zfs/snapshot/daily.0, all estimate timed out]


I guess something is broken in the ssh communication between server and client. Has anyone seen anything like this before, and does anyone have an idea about what is going wrong? Could these be real timeouts, with the calculations taking more time than expected?



Klas Heggemann
sys adm  at  csc kth .se



