It sounds very much like a hardware problem, perhaps slightly toasted IDE controllers? It looks like a commodity box; can you move all the disks to another machine and fire it up? Oh, and go get a decent UPS! :-)
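Since the array dropped two members once already, it might be worth recording its state between crashes. Here is a minimal sketch, assuming a POSIX shell with awk; `degraded_arrays` is a made-up helper name, and it only looks at the `[UUU]`-style field that /proc/mdstat prints, so treat it as a rough check rather than a diagnosis:

```shell
#!/bin/sh
# Sketch: list md arrays whose status field shows a missing member.
# Reads /proc/mdstat-style text on stdin, so it also works on saved output.
# degraded_arrays is a hypothetical helper name, not part of any tool.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }          # remember the current array name
        /\[[U_]+\]/ {                        # status field, e.g. [UUU] or [UU_]
            if (match($0, /\[[U_]+\]/)) {
                state = substr($0, RSTART, RLENGTH)
                # an underscore marks a member that is not up
                if (index(state, "_") > 0) print name, state
            }
        }
    '
}

# Typical use on the live system (prints nothing when all members are up):
#   degraded_arrays < /proc/mdstat
```

Run from cron with the output appended to a file on /var, something like this could show whether the array degrades before a hang or only after the reboot.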
brien

Klaas Vantournhout wrote:
> Dear all,
>
> The real questions are at the bottom; the rest is just an intro
> that explains the background to the questions.
>
> Two days ago, we had a power outage in our department which caused a
> rather brutal shutdown of the computers. All of the computers survived,
> which is a good thing. But one of them picked up a peculiar character,
> and of course it had to be the backup server.
>
> At this point I am not blaming BackupPC at all. I'm just trying
> to isolate the problem, and that is why I need your help.
>
> Okay, so what does the bastard (read: server) do now? Well, not much:
> it just hangs or reboots from time to time, seemingly at random.
>
> The first thing we noticed in /var/log/messages was that after the
> power outage, the ntpd daemon could no longer set its clock correctly.
>
> <snip /var/log/messages>
> # cat /var/log/messages | grep ntpd
> Mar 29 10:39:43 inwtheo1 ntpd: ntpd startup succeeded
> Mar 29 10:39:43 inwtheo1 ntpd[5689]: ntp engine ready
> Mar 29 08:40:06 inwtheo1 ntpd[5689]: peer 157.193.40.37 now valid
> Mar 29 10:40:57 inwtheo1 ntpd[5688]: adjusting local clock by 166.241134s
> Mar 29 10:41:59 inwtheo1 ntpd[5688]: adjusting local clock by 166.240065s
> Mar 29 10:44:13 inwtheo1 ntpd[5688]: adjusting local clock by 166.238681s
> Mar 29 10:45:13 inwtheo1 ntpd[5688]: adjusting local clock by 166.174413s
> Mar 29 10:46:15 inwtheo1 ntpd[5688]: adjusting local clock by 187.903248s
> Mar 29 10:55:11 inwtheo1 ntpd: ntpd startup succeeded
> Mar 29 10:55:11 inwtheo1 ntpd[5607]: ntp engine ready
> Mar 29 08:55:32 inwtheo1 ntpd[5607]: peer 157.193.40.37 now valid
> <end snip>
>
> While trying to understand this problem, I noticed that switching from
> openntpd to ntp did the trick and got the time correct. Since we were
> unsure about this solution, we switched off the daemon to be 100% sure
> it was not the cause of the reboots and/or crashes.
>
> init 1 and 2 ran stable (backuppc is not running in init 2).
> init 3 didn't (backuppc runs there).
> Starting all services by hand to go from 2 to 3 also did not cause any
> problem, but running the command "init 3" does. If we remove backuppc
> from init 3, the server is stable.
>
> So at this point we started to suspect that something goes wrong when
> backuppc is running, but we also noticed that sometimes something went
> wrong when backuppc was not running. So no conclusion yet.
>
> Since it frequently happens that backuppc initiates the crashes, we
> are wondering why this could be, and that is why I am writing here.
>
> Our server is very basic. We are running version 3.0.0, the whole
> system is located on /dev/hda in several partitions, and the backup
> config files and data are on RAID 5 across 3 separate disks.
>
> [EMAIL PROTECTED] ~]$ df
> Filesystem  Size  Used  Avail  Use%  Mounted on
> /dev/hda7   9.9G  1.4G  8.1G   15%   /
> /dev/hda1   479M  12M   443M   3%    /boot
> /dev/hda8   44G   172M  44G    1%    /home
> /dev/hda6   20G   729M  18G    4%    /var
> /dev/md0    461G  194G  243G   45%   /var/backups
> [EMAIL PROTECTED] ~]$ cat /proc/mdstat
> Personalities : [raid5]
> md0 : active raid5 hdb1[0] hdg1[2] hde1[1]
>       490223232 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]
>
> unused devices: <none>
>
>
> A test also showed that init 3 without backuppc and without /dev/md0
> mounted was very stable.
>
> I should also mention that one time when the system rebooted
> unexpectedly, the RAID array lost 2 of its drives for no apparent reason.
> The next bootup simply repaired the array. Hence we are starting to think
> something is wrong with the RAID. fsck reports no problems whatsoever.
>
> ** If you skipped the top, here are the questions **
>
> What we are wondering now is: does backuppc run some other system
> commands which could trigger the hang?
>
> The power outage was in the middle of some full backups. Is it possible
> that this causes problems?
> We have, for example, in a couple of client directories
> a directory new/, even without a backup going on. Can I safely delete
> this directory?
>
> Is there more going on that I am not aware of, and how can I see it?
>
> Has anybody had the same problem? And if so, how did you solve it?
>
> Regards
> klaas

_______________________________________________
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/backuppc-users
http://backuppc.sourceforge.net/