Jean-Louis,

Thank you for replying.

OK ... but I'm beginning to think there are network issues. Last
night this machine failed during the auth phase. First time that's
happened. 'amcheck -c' later succeeded.

Would the amandad.*.debug REQ packet received be an entry like this?

  <<<<<
  SERVICE noop
  OPTIONS features=ffffffff9ffeffffffff00;

Increasing rep_tries? I'm grasping for a solution, and presuming
(from the context of REP in the *debug logs) that REP indicates an
attempt to reply to the server. More tries = greater chance of
success? Evidently not.

There are spare NICs on both server and client; I'm going to try a
backup over a cross-over cable (i.e. take the existing cables /
ports / switches out of the picture ...).

Bryan

On Fri, Jul 24, 2009 at 07:10:53AM -0400, Jean-Louis Martineau wrote:
> You should look at the REQ packet in the amandad.*.debug file to find
> what is bogus in it.
> I don't understand why you want to increase rep_tries?
>
> Jean-Louis
>
> Bryan wrote:
> >Looking for some guidance on a perplexing problem, plus a summary
> >update on my earlier note (below):
> >
> >Answers to my earlier questions were 'all works with 2.5.1p3'.
> >However, 2.5.1 is problematic for a client with ~750 ZFS file
> >systems; on test and production runs, variously 10% to 50% of
> >the client's ZFS file systems failed to back up due to missing
> >ACKs on sendsize.
> >
> >Building 2.6.x on a SUNWCreq installation on Solaris 9/10 with
> >SUNWspro 12 and/or SUNWgccfss was NOT successful (after current
> >glib / pkg-config were installed). It builds on a SUNWCprog
> >cluster installation. Will investigate this further later.
> >
> >Current problem: Moving from 2.5.1 to 2.5.2 ...
> >
> >Solaris 9 server, Solaris 10 client, both running 2.5.2p1, both
> >built (no problems noted) with SUNWspro 12.
> >Tape loads OK, bsd auth, client selfcheck succeeds, client
> >sendsize fails, apparently due to:
> >
> >  sendsize: debug 1 pid 15227 ruid 50 euid 50: start at Thu Jul 23 12:01:15 2009
> >  sendsize: version 2.5.2p1
> >  Reading conf file "/etc/amanda/amanda-client.conf".
> >  Could not open conf file "/etc/amanda/DAILY/amanda-client.conf": No such file or directory
> >  sendsize: debug 1 pid 15227 ruid 50 euid 50: rename at Thu Jul 23 12:01:15 2009
> >* sendsize[15227]: time 0.541: REQ packet is bogus: no dumpdate
> >  sendsize: time 0.541: pid 15227 finish time Thu Jul 23 12:01:15 2009
> >
> >amandad agrees:
> >
> >  <<<<<
> >  OPTIONS features=ffffffff9ffeffffffff00;
> >  FORMAT ERROR IN REQUEST PACKET
> >  >>>>>
> >  amandad: udpbsd_sendpkt: enter
> >  amandad: time 8.705: bsd: pkthdr2str handle '000-00000001'
> >  amandad: time 8.705: sec: udpbsd_sendpkt: PREP (2) pkt_t (len 72) contains:
> >
> >  "OPTIONS features=ffffffff9ffeffffffff00;
> >  FORMAT ERROR IN REQUEST PACKET
> >  "
> >
> >sendsize functioned (imperfectly) on 2.5.1. The effort to move
> >from 2.5.1 to 2.5.2 is an attempt to gain 'rep_tries' in
> >amanda-client.conf. The client conf file says:
> >
> >  connect_tries 10
> >  rep_tries 50
> >  debug_amandad 1
> >  debug_amidxtaped 1
> >  debug_amindexd 1
> >  debug_amrecover 1
> >  debug_auth 1
> >  debug_event 1
> >  debug_holding 1
> >  debug_protocol 1
> >  debug_selfcheck 1
> >  debug_sendsize 1
> >  debug_sendbackup 1
> >
> >disklist entries for the client all use spindle -1. The client
> >has its own dumptype entry that says 'maxdumps 6', and otherwise
> >inherits the usual gtar parameters.
> >
> >The problem is consistently reproducible, before and after
> >'amadmin delete' of the Solaris 10 client.
> >
> >Hoping that someone can provide some guidance or suggestions.
> >
> >Bryan
> >
> >On Tue, Jul 14, 2009 at 10:49:36AM -0400, Bryan wrote:
> >
> >>Folks,
> >>
> >>We've been running 2.4.x (presently 2.4.4p1) on Solaris 9 since
> >>2002 (and earlier versions back to about 1997 ..). The 2.4.4
> >>release has been enormously successful, and has saved our tail
> >>feathers any number of times. Clients are Solaris and Linux of
> >>various vintages (including some Fedora that should be retired.)
> >>
> >>We're bringing up ZFS file systems; looks like it's time to move
> >>to 2.6.1. I've read the UPGRADING file in the distribution, and
> >>am seeking guidance on some finer points.
> >>
> >>We keep daily backups for 4 weeks, weeklies for 4 months, and
> >>monthlies for 1 year. Can I expect that 2.6.1 amanda will be
> >>able to 'amadmin find' and recover tapes written by 2.4.4? (My
> >>guess is 'yes'.) Is there compelling merit to doing an upgrade to
> >>2.5.1 as a transitional step? (Guess = no).
> >>
> >>Updating all of the clients in one fell swoop would present
> >>challenges.
> >>
> >>A number of client machines (but not all) are presently running
> >>2.5.1, the result of a prior but incomplete attempt to upgrade.
> >>The 2.5.1 clients work fine with the 2.4.4 server. Can I expect
> >>the older clients to work with a 2.6.1 server? (Guess = I hope
> >>so.) Or should I upgrade the clients first and the server
> >>second? (Guess = no.)
> >>
> >>My guesses are just that, and have no basis. Some guidance (even
> >>if only more informed guesses) would be appreciated.
> >>
> >>Thank you.
> >>
> >>Bryan
> >>

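One way to act on the suggestion to look at the REQ packet is to pull
every block between the <<<<< and >>>>> markers out of an
amandad.*.debug file and check each one for a SERVICE line and for the
"FORMAT ERROR IN REQUEST PACKET" text. The following is only a rough
sketch based on the excerpts quoted in this thread. It assumes the
markers sit at the end of their own lines and that a packet body looks
like the samples above; the function names extract_packets and
summarize are made up for illustration and are not part of Amanda.

```python
def extract_packets(log_text):
    """Return the text blocks found between '<<<<<' and '>>>>>' marker lines.

    Assumption: each marker ends its own line, as in the excerpts in this thread.
    """
    packets, current = [], None
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped.endswith("<<<<<"):
            current = []                      # start of a packet dump
        elif stripped.endswith(">>>>>") and current is not None:
            packets.append("\n".join(current))
            current = None                    # end of the packet dump
        elif current is not None:
            current.append(stripped)
    return packets


def summarize(packet):
    """Report the SERVICE name (if any) and whether a format error is flagged."""
    service = None
    for line in packet.splitlines():
        if line.startswith("SERVICE "):
            service = line.split(None, 1)[1]
            break
    return {
        "service": service,
        "format_error": "FORMAT ERROR IN REQUEST PACKET" in packet,
    }


if __name__ == "__main__":
    # Packet text copied from the amandad excerpt quoted in this thread.
    sample = (
        "<<<<<\n"
        "OPTIONS features=ffffffff9ffeffffffff00;\n"
        "FORMAT ERROR IN REQUEST PACKET\n"
        ">>>>>\n"
        "amandad: udpbsd_sendpkt: enter\n"
    )
    for pkt in extract_packets(sample):
        print(summarize(pkt))
```

Running the same thing over the debug files from a good run and a
failing run would at least show whether the packets that arrive differ
between the two, which may help separate a network problem from a
malformed request.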