Re: upgrade issue, 2.5.1 to 2.5.1p2: things are worse than I thought

2006-11-26 Thread Jean-Louis Martineau

Steve,

Could you send me the amdump.1 log file?
What is at the beginning of the holding disk files? Could you send me one 
of the headers?
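
One way to look at the start of a holding-disk file is sketched below. It rests on two assumptions that do not come from this thread: that the header is a text block of at most 32 KiB at the start of the file, and that a valid header begins with the literal string "AMANDA: ". The sample header written by demo() is made up for illustration; it is not output from a real dump.

```python
import os
import tempfile

HEADER_BYTES = 32 * 1024  # assumed header block size; adjust if your build differs


def read_dump_header(path, size=HEADER_BYTES):
    """Return the text found at the start of a holding-disk file."""
    with open(path, "rb") as handle:
        block = handle.read(size)
    # Assumption: the header text is padded with NUL bytes; stop at the first one.
    text = block.split(b"\0", 1)[0]
    return text.decode("ascii", errors="replace")


def looks_like_amanda_header(text):
    """Assumption: a valid header starts with the literal string 'AMANDA: '."""
    return text.startswith("AMANDA: ")


def demo():
    """Write a made-up header to a temporary file and read it back."""
    fake = b"AMANDA: FILE 20061125001502 examplehost /example lev 0\n"
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(fake.ljust(HEADER_BYTES, b"\0"))
        return read_dump_header(path)
    finally:
        os.remove(path)


print(looks_like_amanda_header(demo()))
```

To inspect a real file, call read_dump_header() on one of the files under the holding directory (for example under /nobackup/AMANDASPOOL/20061125001502) and compare the first line with what the flushing version expects.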


Jean-Louis

Steve Newcomb wrote:

As a temporary measure, I have re-installed 2.5.1 on my server
and it's working again, but I'd prefer to be running the same
version everywhere.



Well, I lied.  amcheck worked OK, but amflush 2.5.1 refuses to flush
the dumps made with 2.5.1p2 on the holding disk, saying:

[EMAIL PROTECTED]:~$ amflush coolheads
Scanning /nobackup/AMANDASPOOL...
  20061125001502: found Amanda directory.
Could not find any valid dump image, check directory.

The directory has many dump files in it, and it looks OK.

Is there a trick I should know?

-- Steve

Steven R. Newcomb, Consultant
Coolheads Consulting

Co-editor, Topic Maps International Standard (ISO/IEC 13250)
Co-editor, draft Topic Maps -- Reference Model (ISO/IEC 13250-5)

[EMAIL PROTECTED]
http://www.coolheads.com

direct: +1 540 951 9773
main:   +1 540 951 9774
fax:    +1 540 951 9775

208 Highview Drive
Blacksburg, Virginia 24060 USA


(Confidential to all US government personnel to whom this private
letter is not addressed and who are reading it in the absence of a
specific search warrant: You, along with the corrupt and pusillanimous
109th Congress, are co-conspiring to subvert the Constitution that you
are sworn to defend.  You can either refuse to commit this crime, or
you can expect to suffer criminal sanctions in the future, when the
current administration of the United States of America has been
replaced by one that respects the rule of law.  I do not envy you for
having to make this difficult choice, but I urge you to make it
wisely.)
  




Re: 2.5.1p2: planner returns error=EOF on read

2006-11-26 Thread Jean-Louis Martineau

Amanda expects tar to exit on a "broken pipe".
It looks like we need to kill it.
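
A minimal sketch of the failure mode, not Amanda code: when SIGPIPE is ignored in the writing process, a write to a pipe with no reader fails with EPIPE instead of terminating the writer, so a program that does not treat that error as fatal can keep looping. Whether this is what happened to gtar here is an assumption. CPython ignores SIGPIPE at startup, which makes it convenient for showing the EPIPE case.

```python
import errno
import os


def write_after_reader_gone(size=10240):
    """Write to a pipe whose read end is already closed; return the errno seen."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)                # the "reader" goes away
    try:
        os.write(write_fd, b"\0" * size)
    except OSError as exc:           # BrokenPipeError is a subclass of OSError
        return exc.errno
    finally:
        os.close(write_fd)
    return None                      # the write unexpectedly succeeded


result = write_after_reader_gone()
print("EPIPE" if result == errno.EPIPE else result)
```

On a POSIX system the returned value should be errno.EPIPE, which is 32 on Linux and matches the "errno = 32 (Broken pipe)" trace line quoted in this thread.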

Jean-Louis

Jean-Francois Malouin wrote:

* Jean-Louis Martineau <[EMAIL PROTECTED]> [20061123 17:20]:
  

Is amandad running?



To be sure that I would hit the problem again I didn't load new tapes,
so after the holding disk filled up I have a bunch of gtar processes running
and just wasting a lot of cycles. A typical example (sorry for the long lines):

  amanda 3555936 3557855  0        - -    0:00  
  amanda 3557855       1  0 05:37:39 ?    1:30 /opt/amanda/amanda1/libexec/sendbackup amandad bsdtcp
    root 3568087 3557855  0 05:40:02 ?  277:35 gtar --create --file - --directory /data/mafalda/mafalda1/susanita/sandra/DTI -

Tracing the sendbackup process yields no output while the gtar one
gives just an uninterrupted flow of lines:

   68mS tar(3568087): write(1, ..., 10240) errno = 32 (Broken pipe)

Don't know if this is going to be of any help...
jf

  
Could you try to find out what each process (amandad, sendbackup, tar) is 
doing, and on which system call it is hung?

On Linux, I would use strace for that.
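
A small sketch of that suggestion, assuming a Linux host with strace installed (the hosts discussed in this thread run IRIX, where a different tracer is needed). The helper name is hypothetical, and the PIDs are the sendbackup and gtar processes from the ps listing quoted elsewhere in this thread.

```python
import shlex


def strace_attach_command(pid, follow_children=True):
    """Build an strace command line that attaches to a running process."""
    command = ["strace"]
    if follow_children:
        command.append("-f")         # also trace children forked after attaching
    command += ["-p", str(pid)]      # attach to an existing process by PID
    return command


# Print the commands rather than running them; attaching usually needs root.
for pid in (3557855, 3568087):
    print(shlex.join(strace_attach_command(pid)))
```

A process blocked in a system call shows that call as the last, unfinished line of the trace, which answers the "on which system call" question directly.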

Jean-Louis

Jean-Francois Malouin wrote:


Jean-Louis,

The amdump just finished, and because amanda ran out of tapes
(runtapes=10) it completed badly, leaving stuff on the holding
disk. I'm flushing it at the moment, but I noticed that when amanda
gave up it left a lot of gnutar processes lying around, with amandad as parent.
I had to kill them manually. It looks like signals are not being handled properly.
I've attached one example of a DLE's sendbackup and runtar debug files.
It's not the first time I've noticed that when a DLE fails to make it
to tape successfully, processes are left running...

* Jean-Louis Martineau <[EMAIL PROTECTED]> [20061122 09:32]:
 
  

Could you post amandad..debug file from yorick?

Jean-Francois Malouin wrote:
   


On a server running irix-6.5 and amanda 2.5.1p2
planner debug shows:

security_getdriver(name=bsdtcp) returns 4075aa8
security_handleinit(handle=1001d4a8, driver=4075aa8 (BSDTCP))
security_streaminit(stream=1005e538, driver=4075aa8 (BSDTCP))
security_close(handle=1001cfe0, driver=4075aa8 (BSDTCP))
security_stream_close(10053a80)
security_stream_seterr(1005e538, SOCKET_EOF)
security_seterror(handle=1001d4a8, driver=4075aa8 (BSDTCP) error=EOF
on read from yorick)
security_close(handle=1001d4a8, driver=4075aa8 (BSDTCP))
security_stream_close(1005e538)

and the amanda report shows things like:

yorick  DATA_sub101   lev 0  FAILED [hmm, disk was stranded on waitq]
planner: ERROR Request to yorick failed: EOF on read from yorick

and the backup fails. The funny thing is that I have 2 other configurations
running 2.5.1p2 in parallel that don't exhibit this behaviour.
Any clues?
jf

 
  
 



runtar: debug 1 pid 2826413 ruid 666 euid 0: start at Thu Nov 23 08:04:13 2006
runtar: version 2.5.1p2
/usr/freeware/bin/tar version: tar (GNU tar) 1.13.25
config: stk_80-conf1
runtar: debug 1 pid 2826413 ruid 0 euid 0: rename at Thu Nov 23 08:04:13 2006
running: /usr/freeware/bin/tar: 'gtar' '--create' '--file' '-' '--directory' '/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101' '--one-file-system' '--listed-incremental' '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new' '--sparse' '--ignore-failed-read' '--totals' '.'
runtar: pid 2826413 finish time Thu Nov 23 08:04:13 2006
 



sendbackup: debug 1 pid 2836263 ruid 666 euid 666: start at Thu Nov 23 08:03:20 2006
sendbackup: version 2.5.1p2
Could not open conf file "/opt/amanda/amanda1/etc/amanda/amanda-client.conf": No such file or directory
Reading conf file "/opt/amanda/amanda1/etc/amanda/stk_80-conf1/amanda-client.conf".
sendbackup: debug 1 pid 2836263 ruid 666 euid 666: rename at Thu Nov 23 08:03:20 2006
 sendbackup req:  /data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101 1 2006:11:13:13:21:32 OPTIONS |;auth=bsdtcp;index;>
 parsed request as: program `GNUTAR'
                    disk `sub101'
                    device `/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101'
                    level 1
                    since 2006:11:13:13:21:32
                    options `|;auth=bsdtcp;index;'
sendbackup: start: yorick:sub101 lev 1
sendbackup-gnutar: time 0.208: doing level 1 dump as listed-incremental from '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_0' to '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new'
sendbackup-gnutar: time 53.057: doing level 1 dump from date: 2006-11-13 13:22:26 GMT
sendbackup: time 53.246: started index creator: "/usr/freeware/bin/tar -tf - 2>/dev/null | sed -e 's/^\.//'"
sendbackup: time 53.254: spawning /opt/amanda/amanda1/libexec/runtar in pipeline
sendbackup: argument list: runtar stk_80-conf1 gtar --create --

Re: lev 1 FAILED [no backup size line]

2006-11-26 Thread Jean-Louis Martineau

Geert Uytterhoeven wrote:

Hi,

For the past few days, one of my DLEs has consistently failed with:

| /--  anakin /home/src lev 1 FAILED [no backup size line]
| sendbackup: start [anakin:/home/src level 1]
| sendbackup: info BACKUP=/bin/tar
| sendbackup: info RECOVER_CMD=/bin/gzip -dc |/bin/tar -f - ...
| sendbackup: info COMPRESS_SUFFIX=.gz
| sendbackup: info end
| ? gtar: /var/lib/amanda/gnutar-lists/anakin_home_src_1.new: Missing record terminator
| ? gtar: Error is not recoverable: exiting now
| sendbackup: error [no backup size line]

sendbackup.20061125005608.debug has:

| sendbackup: start: anakin:/home/src lev 1
| sendbackup: time 0.000: spawning /bin/gzip in pipeline
| sendbackup: argument list: /bin/gzip --fast
| sendbackup-gnutar: time 0.001: pid 8618: /bin/gzip --fast
| sendbackup-gnutar: time 7.362: doing level 1 dump as listed-incremental from '/var/lib/amanda/gnutar-lists/anakin_home_src_0' to '/var/lib/amanda/gnutar-lists/anakin_home_src_1.new'
| sendbackup-gnutar: time 7.407: doing level 1 dump from date: 2006-11-20 12:45:43 GMT
| sendbackup: time 7.419: started index creator: "/bin/tar -tf - 2>/dev/null | sed -e 's/^\.//'"
| sendbackup: time 7.420: spawning /usr/lib/amanda/runtar in pipeline
| sendbackup: argument list: runtar DailySet1 gtar --create --file - --directory /home --one-file-system --listed-incremental /var/lib/amanda/gnutar-lists/anakin_home_src_1.new --sparse --ignore-failed-read --totals --exclude-from /tmp/amanda/sendbackup._home_src.20061125005616.exclude --files-from /tmp/amanda/sendbackup._home_src.20061125005616.include
| sendbackup-gnutar: time 7.423: /usr/lib/amanda/runtar: pid 8623
| sendbackup: time 7.423: started backup
| sendbackup: time 13.134: 118: strange(?): gtar: /var/lib/amanda/gnutar-lists/anakin_home_src_1.new: Missing record terminator
| sendbackup: time 13.148: 118: strange(?): gtar: Error is not recoverable: exiting now
| sendbackup: time 13.162: index created successfully
| sendbackup: time 13.163: error [no backup size line]
| sendbackup: time 13.163: pid 8616 finish time Sat Nov 25 00:56:21 2006

Anyone with a clue? Could this be /var running out of disk space?
  


Yes, or /var/lib/amanda/gnutar-lists/anakin_home_src_0 is corrupted; 
you should force a level 0.
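
Both checks can be sketched as follows. This is a hedged illustration, not part of Amanda: the helper names are hypothetical, the config, host and disk values are taken from the debug output quoted above, and the command is only printed, not executed. It assumes the usual "amadmin <config> force <host> <disk>" form.

```python
import shutil


def free_megabytes(path):
    """Free space on the filesystem holding `path`, in MiB."""
    return shutil.disk_usage(path).free // (1024 * 1024)


def force_full_command(config, host, disk):
    """Build the amadmin command that schedules a level 0 for one DLE."""
    return ["amadmin", config, "force", host, disk]


# Values from the report above; run the printed command as the Amanda user.
print(" ".join(force_full_command("DailySet1", "anakin", "/home/src")))
```

Calling free_megabytes("/var") on the client answers the disk-space question; if there is plenty of room, the forced level 0 rebuilds the listed-incremental file from scratch.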



I'm using Debian, with Amanda 2.5.1p1-2 and tar 1.16-1.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [EMAIL PROTECTED]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
  




Re: amandas group membership in FC6?

2006-11-26 Thread Gene Heskett
On Sunday 26 November 2006 10:09, Ross Vandegrift wrote:
>On Sat, Nov 25, 2006 at 11:22:38PM -0500, Gene Heskett wrote:
>> >See what that number maps to in /etc/group.  I'm betting it
>> >goes to an 'amanda' group and not the 'disk' group.
>>
>> There was not, and still is not, a group named amanda, just the amanda
>> entry in the disk line.
>
>Is there any chance you're using ldap/nis/winbind/etc for groups in
>nsswitch.conf?  The "amanda" group must be coming from somewhere.  If
>it's not listed in /etc/group, I'm wondering if there's another source
>of groups on your system.

Pretty close to zero on the above.

[EMAIL PROTECTED] etc]# grep -R amanda *
aliases:amanda: root
amandates:/amanda 0 1155793225
amandates:/amanda 1 1155879121
amandates:/amanda 2 1155966111
amandates:/amanda 3 1156050701
amandates:/amanda 4 1144996628
group:disk:x:6:amanda,root
group:amanda:x:501:
group-:disk:x:6:root,amanda
group-:amanda:x:501:
group.bak:disk:x:6:root,amanda
group.bak:amanda:x:501:
gshadow:disk:!::amanda,root
gshadow:amanda:!::
gshadow-:amanda:!::
gshadow.bak:disk:!::root,amanda
gshadow.bak:amanda:!::
kde/kdm/kdmrc:HiddenUsers=adm,alias,amanda,apache,bin,bind,daemon,exim,falken,ftp,games,gdm,gopher,halt,httpd,ident,ingres,kmem,lp,mail,mailnull,man,mta,mysql,named,news,nfsnobody,nobody,nscd,ntp,operator,pcap,pop,postfix,postgres,qmaild,qmaill,qmailp,qmailq,qmailr,qmails,radvd,reboot,rpc,rpcuser,rpm,sendmail,shutdown,squid,sympa,sync,tty,uucp,xfs,xten
mtab:/dev/hdd3 /amandatapes ext3 rw 0 0
passwd:amanda:x:501:6::/home/amanda:/bin/bash
passwd-:amanda:x:501:501::/home/amanda:/bin/bash
services:amanda     10080/tcp   # amanda backup services
services:amanda     10080/udp   # amanda backup services
services:kamanda    10081/tcp   # amanda backup services (Kerberos)
services:kamanda    10081/udp   # amanda backup services (Kerberos)
services:amandaidx  10082/tcp   # amanda backup services
services:amidxtape  10083/tcp   # amanda backup services
shadow:amanda::13477:0:0
shadow-:amanda:!!:13461:0:9:7:::
shadow.bak:amanda:!!:13461:0:9:7:::
X11/xdm/kdmrc:HiddenUsers=adm,alias,amanda,apache,bin,bind,daemon,exim,falken,ftp,games,gdm,gopher,halt,httpd,ident,ingres,kmem,lp,mail,mailnull,man,mta,mysql,named,news,nfsnobody,nobody,nscd,ntp,operator,pcap,pop,postfix,postgres,qmaild,qmaill,qmailp,qmailq,qmailr,qmails,radvd,reboot,rpc,rpcuser,rpm,sendmail,shutdown,squid,sympa,sync,tty,uucp,xfs,xten
xinetd.d/amanda:service amanda
xinetd.d/amanda:user= amanda
xinetd.d/amanda:server  = /usr/local/libexec/amandad
xinetd.d/amanda:service amandaidx
xinetd.d/amanda:user= amanda
xinetd.d/amanda:user= amanda
[EMAIL PROTECTED] etc]#  
===
I removed the obvious errors and several kilobytes of SELinux-related 
stuff from the above, as it's set to permissive.

I expect some of the above, like the group entries for a group 501, could 
just as well be deleted.  But are they a factor?  I don't know.  I will 
get rid of the group 501 entries though, just to make me feel better.
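
The lookup Ross suggested (which group the numeric gid maps to) can be sketched offline from the grep output above. This is a minimal illustration with hypothetical helper names; it only parses the passwd and group lines shown in this message and does not consult NSS sources such as LDAP or NIS.

```python
def parse_group_lines(lines):
    """Map gid -> (group name, member list) from /etc/group-style lines."""
    groups = {}
    for line in lines:
        name, _password, gid, members = line.strip().split(":")
        groups[int(gid)] = (name, [m for m in members.split(",") if m])
    return groups


def groups_for_user(passwd_line, group_lines):
    """Return (primary group name, sorted supplementary group names)."""
    fields = passwd_line.strip().split(":")
    user, primary_gid = fields[0], int(fields[3])
    groups = parse_group_lines(group_lines)
    primary = groups[primary_gid][0] if primary_gid in groups else None
    supplementary = sorted(
        name for name, members in groups.values() if user in members
    )
    return primary, supplementary


GROUP = ["disk:x:6:amanda,root", "amanda:x:501:"]
CURRENT = "amanda:x:501:6::/home/amanda:/bin/bash"      # from passwd
PREVIOUS = "amanda:x:501:501::/home/amanda:/bin/bash"   # from passwd-

print(groups_for_user(CURRENT, GROUP))
print(groups_for_user(PREVIOUS, GROUP))
```

With the current passwd entry the primary group resolves to disk; with the older passwd- entry it resolves to the amanda group (gid 501), which may be where the unexpected "amanda" group came from.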

FWIW, amanda ran just fine this morning.

Thanks Ross.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Yahoo.com and AOL/TW attorneys please note, additions to the above
message by Gene Heskett are:
Copyright 2006 by Maurice Eugene Heskett, all rights reserved.


Re: amandas group membership in FC6?

2006-11-26 Thread Ross Vandegrift
On Sat, Nov 25, 2006 at 11:22:38PM -0500, Gene Heskett wrote:
> >See what that number maps to in /etc/group.  I'm betting it 
> >goes to an 'amanda' group and not the 'disk' group.
> 
> There was not, and still is not, a group named amanda, just the amanda 
> entry in the disk line.

Is there any chance you're using ldap/nis/winbind/etc for groups in
nsswitch.conf?  The "amanda" group must be coming from somewhere.  If
it's not listed in /etc/group, I'm wondering if there's another source
of groups on your system.

-- 
Ross Vandegrift
[EMAIL PROTECTED]

"The good Christian should beware of mathematicians, and all those who
make empty prophecies. The danger already exists that the mathematicians
have made a covenant with the devil to darken the spirit and to confine
man in the bonds of Hell."
--St. Augustine, De Genesi ad Litteram, Book II, xviii, 37