[BackupPC-users] BackupPC and PowerEdge E1410 CPU 1 IERR

2006-12-15 Thread Jonathan Dill
Maybe this is a shot in the dark.  I have already asked on the Linux 
PowerEdge mailing list; I'm just hoping that someone has had a very 
similar problem and can help narrow down the possibilities.

I have a PowerEdge 1900 with Ubuntu Dapper 6.06.1 LTS x86_64 with dual 
Xeon 5110 processors, PERC 5/i with 2 SATA drives in RAID1, 3 more 
drives on motherboard SATA with LVM for the backup data.  The server is 
stable as long as I don't run BackupPC.  I called Dell and they 
recommended taking out the Intel add-in gigabit card and trying the 
onboard Broadcom.  I tried that and it still crashed with the same 
error, so it's not the network card.  I have run a regular rsync of about 200 GB from 
the server to another computer, and that worked fine, and that was more 
CPU and I/O intensive than BackupPC.  I really need to get some more 
time to do more hardware diagnostics, but I'm having to do it off hours 
and get someone there to escort me so it's been hard to schedule downtime.

I disabled all but one host, then ran BackupPC_dump -v -f host from the 
console while also tailing /var/log/messages.  There were no errors in 
the system log.  The dump was just a ways into the stream of dumped 
files (create / pool); toward the end there are a few "Can't link X to" 
errors.  I started the dump around 11pm and it crashed at 4:13am.  This 
was only supposed to be about 18 GB, so it seems like it should have 
been quicker than that.
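
To put a number on "should have been quicker", here is a rough sketch of 
the arithmetic.  It assumes the start time was exactly 23:00, that 
"18 GB" means 18 GiB, and (hypothetically) that the whole 18 GiB had 
been written by 4:13am.  The dump had not actually finished, so the 
result is an upper bound on the real average rate.  The function names 
are made up for illustration.

```shell
#!/bin/sh
# Seconds between two wall-clock times (hour minute hour minute), allowing
# the end time to fall on the following day, as it does here.
elapsed_seconds() {
    start=$(( $1 * 60 + $2 ))
    end=$(( $3 * 60 + $4 ))
    if [ "$end" -lt "$start" ]; then
        end=$(( end + 24 * 60 ))
    fi
    echo $(( (end - start) * 60 ))
}

# Integer KiB/s needed to move <GiB> in <seconds>.
rate_kib_per_s() {
    echo $(( $1 * 1024 * 1024 / $2 ))
}

secs=$(elapsed_seconds 23 0 4 13)
echo "elapsed: ${secs} s"
echo "18 GiB in that time: $(rate_kib_per_s 18 "$secs") KiB/s"
```

That works out to 18780 seconds and about 1005 KiB/s, i.e. roughly 
1 MB/s at best, which is very slow for a local gigabit network.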

There are a lot of hardware things that it could be, CPU, memory, riser 
board, PERC controller, motherboard.  The disks that it is dumping to 
are on the motherboard SATA controller and not on the PERC.  Any ideas 
that could help me to pinpoint this problem?

Thanks,
Jonathan

-
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT  business topics through brief surveys - and earn cash
http://www.techsay.com/default.php?page=join.phpp=sourceforgeCID=DEVDEV
___
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/backuppc-users
http://backuppc.sourceforge.net/


Re: [BackupPC-users] BackupPC and PowerEdge E1410 CPU 1 IERR

2006-12-15 Thread Guus Houtzager
On Fri, 2006-12-15 at 10:01 -0500, Jonathan Dill wrote:

[...]

Ok, more details please, you're being too vague. Is your linux box
crashing (kernel oops, freeze, spontaneous reboot) or is just the backup
failing? What version of backuppc are you running? If you get an oops,
can you post it here? What are the last lines of the LOG file for the
host you tried?
What backup method are you using (rsync/ssh, sshd, smb, tar)? What
filesystem do you use for your (c)pool? How many files does the backup
you tried consist of?

Usually if it's hardware related, the memory is the culprit. Try a
memtest86+ run and see if that shows any errors.

Hth,

Guus




Re: [BackupPC-users] BackupPC and PowerEdge E1410 CPU 1 IERR

2006-12-15 Thread Jonathan Dill
Guus Houtzager wrote:
> Ok, more details please, you're being too vague. Is your linux box
> crashing (kernel oops, freeze, spontaneous reboot) or is just the backup
> failing? What version of backuppc are you running? If you get an oops,
> can you post it here? What are the last lines of the LOG file for the
> host you tried?

No kernel oops, no spontaneous reboots, no errors in the system log, no 
thermal warnings, the system freezes and the LCD on the front bezel 
shows E1410 CPU 1 IERR.  The Dell tech told me this usually indicates 
a problem on the PCI bus, most often the add-in network card, but we 
ruled that out.

backuppc-2.1.2-2ubuntu5
perl-5.8.7-10ubuntu
rsync-2.6.6-1ubuntu2

[EMAIL PROTECTED]:~# uname -a
Linux imageserver 2.6.15-27-amd64-server #1 SMP Fri Dec 8 18:02:49 UTC 
2006 x86_64 GNU/Linux

LOG (entire contents):
2006-12-14 23:00:00 full backup started for directory DriveC

From XferLOG it looks like I should add an exclude for $NtServicePack$, 
but that's just noise.  There is an occasional error like the one below, 
but no other errors:

  create   644      18/544      247693 WINDOWS/$NtServicePackUninstall$/msoe.chm
Can't link /var/lib/backuppc/pc/sandraxp2k3/new/fDriveC/fWINDOWS/f$NtServicePackUninstall$/fmsoe.dll to /var/lib/backuppc/pool/9/5/4/954484d7dde3b57fa9e7a971a7200bdf
  pool     755      18/544     1176064 WINDOWS/$NtServicePackUninstall$/msoe.dll
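
To see how "occasional" the link errors really are, something like the 
following could tally the line types in an XferLOG.  This is only a 
sketch: summarize_xferlog is a made-up helper name, it assumes the log 
has already been uncompressed to plain text, and it assumes each entry 
sits on a single line starting with "create", "pool" or "Can't link", 
as in the excerpt above.

```shell
#!/bin/sh
# Read an uncompressed XferLOG on stdin and count new files ("create"),
# pool hits ("pool") and failed hard links ("Can't link").
summarize_xferlog() {
    awk '
        $1 == "create"  { create++ }
        $1 == "pool"    { pool++ }
        /^Can.t link /  { failed++ }
        END { printf "create=%d pool=%d cant_link=%d\n", create, pool, failed }
    '
}

# Demo on the three lines quoted in this message.
summarize_xferlog <<'EOF'
  create   644      18/544      247693 WINDOWS/$NtServicePackUninstall$/msoe.chm
Can't link /var/lib/backuppc/pc/sandraxp2k3/new/fDriveC/fWINDOWS/f$NtServicePackUninstall$/fmsoe.dll to /var/lib/backuppc/pool/9/5/4/954484d7dde3b57fa9e7a971a7200bdf
  pool     755      18/544     1176064 WINDOWS/$NtServicePackUninstall$/msoe.dll
EOF
```

On a real log the input would come from BackupPC_zcat piped into 
summarize_xferlog; where BackupPC_zcat and the XferLOG file live depends 
on the install, so check the paths locally.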

I have also previously tried just a Linux host, and that also locked up 
the system.  I should try just backing up the localhost and see what 
that does.

> What backup method are you using (rsync/ssh, sshd, smb, tar)? What
> filesystem do you use for your (c)pool? How many files does the backup
> you tried consist of?

I'm using rsyncd on the client as the backup method.

I'm using ReiserFS 3.6 with standard journal for the pool filesystem, 
and the fs is on an LVM volume.

> Usually if it's hardware related, the memory is the culprit. Try a
> memtest86+ run and see if that shows any errors.

Yeah, that's definitely suspect, but it could take an hour or so to 
check 2 GB of memory unless it hits an error early on.  Scheduling the 
time to take the server offline is the hard part.

Thanks,
Jonathan



Re: [BackupPC-users] BackupPC and PowerEdge E1410 CPU 1 IERR

2006-12-15 Thread Tino Schwarze
On Fri, Dec 15, 2006 at 04:21:11PM +0100, Guus Houtzager wrote:

> Usually if it's hardware related, the memory is the culprit. Try a
> memtest86+ run and see if that shows any errors.

You might want to use the Dell tools to check the memory, since the Dell
guys usually want their results before they care.

HTH,

Tino.

-- 
www.quantenfeuerwerk.de
www.spiritualdesign-chemnitz.de
www.lebensraum11.de
