Holger Parplies wrote:
> Hi,
>
> Matthias Meyer wrote on 2009-01-18 15:33:30 +0100 [Re: [BackupPC-users]
> errors in cpool after e2fsck corrections]:
>> Johan Ehnberg wrote:
>> > Quoting Matthias Meyer <matthias.me...@gmx.li>:
>> >
>> >> After a system crash and tons of errors in my ext3 filesystem I had
>> >> to run e2fsck.
>> >> During this I lost some GB of data in /var/lib/backuppc.
>> >> [...]
>> >> I believe the reason is that
>> >> /var/lib/backuppc/cpool/8/4/5/845a684e4a8c9fe22d11484dc13e24fc
>> >> is a directory and not a file. Probably created during the e2fsck run.
>
> no, e2fsck does not *create* directories; this is clear evidence of
> on-disk data corruption.

I had a lot of HTREE errors, inodes claiming duplicate blocks, and
unclaimed inodes. Maybe those files had already become directories before
e2fsck ran.

>> >> Should I delete all directories in /var/lib/backuppc/cpool/?/?/?/*
>> >> or would BackupPC_nightly do this job?
>
> I doubt it. BackupPC_nightly is not designed to fix broken file systems.

OK, I will delete these directories. On your responsibility (just a joke :-)
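Something like this should list them first, so I can check before removing
anything (untested; it assumes the standard cpool layout of three
single-character hashing levels, so pool entries sit at depth 4):

    # stop BackupPC first, then list directories where only files belong
    find /var/lib/backuppc/cpool -mindepth 4 -maxdepth 4 -type d

    # once the list looks plausible, remove them
    find /var/lib/backuppc/cpool -mindepth 4 -maxdepth 4 -type d -exec rm -rf {} +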
>> > Sorry to hear about that. I would recommend the following:
>> > - Consider all the backed-up data corrupt (don't build any new backups
>> >   on it)
>> > - Start a fresh pool, saving the old one for the duration of your
>> >   normal cycle
>> > - Look for the reason for the crash/corruption and prevent it from
>> >   happening
>> [...]
>> I would believe the filesystem should be OK in the meantime. e2fsck
>> needed to run 3 or 4 times and took more than 2 days in total. After
>> this, lost+found contains approximately 10% of my data :-( No chance to
>> reconstruct all of it.
>>
>> 1) So you would recommend:
>> mv /var/lib/backuppc/cpool /var/lib/backuppc/cpool.sav
>> mkdir /var/lib/backuppc/cpool
>
> No. The point seems to be *getting rid of the corrupt file system*. You
> don't know what exactly was corrupted on-disk. You have definite evidence
> that a lot was - and possibly still is (3 or 4 e2fscks to find all
> problems? What should a subsequent check find that a previous one
> didn't?). You can trust in everything being ok now, but you might as well
> trust in not needing your backups in the first place. You can't really
> verify it.

But if a file is backed up again, the new copy should be OK, and it should
be possible to restore that file.

>
> The key phrase is
>
>> > - Look for the reason for the crash/corruption and prevent it
>> > from happening
>
> - this can likely mean exchanging the disk (cables, mainboard, memory,
> power supply ...). You don't give any details about your system crash or
> hardware setup, so there is little point in guessing what might have gone
> wrong.

Debian stable in VMware. 4 SATA disks on an Adaptec 1420A, in software
RAID5 with LVM2 on top. Both are handled inside the VMware guest.

Extracts from /var/log/messages:

Jan 13 17:38:04 FileServer -- MARK --
Jan 13 17:49:30 FileServer kernel: mptscsih: ioc0: attempting task abort! (sc=c15be840)
Jan 13 17:49:30 FileServer kernel: sd 0:0:1:0:
Jan 13 17:49:30 FileServer kernel: command: cdb[0]=0x2a: 2a 00 00 00 ce 49 00 00 30 00
Jan 13 17:49:30 FileServer kernel: mptbase: ioc0: IOCStatus(0x004b): SCSI IOC Terminated
Jan 13 17:49:30 FileServer kernel: mptscsih: ioc0: task abort: SUCCESS (sc=c15be840)
Jan 13 18:18:05 FileServer -- MARK --
Jan 13 22:18:13 FileServer -- MARK --
Jan 13 22:18:14 FileServer kernel: rpc-srv/tcp: nfsd: got error -104 when sending 32900 bytes - shutting down socket
Jan 13 22:38:13 FileServer -- MARK --
Jan 13 22:38:57 FileServer shutdown[24187]: shutting down for system reboot
Jan 13 22:42:02 FileServer kernel: NFSD: starting 90-second grace period
Jan 13 23:01:52 FileServer -- MARK --
Jan 14 02:41:54 FileServer -- MARK --
Jan 14 02:49:29 FileServer kernel: mptscsih: ioc0: attempting task abort! (sc=c15c0720)
Jan 14 02:49:29 FileServer kernel: sd 0:0:3:0:
Jan 14 02:49:29 FileServer kernel: command: cdb[0]=0x2a: 2a 00 0b 8d 0a b9 00 00 08 00
Jan 14 02:49:29 FileServer kernel: mptbase: ioc0: IOCStatus(0x004b): SCSI IOC Terminated
Jan 14 02:49:29 FileServer kernel: mptscsih: ioc0: task abort: SUCCESS (sc=c15c0720)
Jan 14 02:49:29 FileServer kernel: mptscsih: ioc0: attempting task abort! (sc=c15c0960)
Jan 14 02:49:29 FileServer kernel: sd 0:0:2:0:
Jan 14 02:49:29 FileServer kernel: command: cdb[0]=0x2a: 2a 00 0c 63 38 91 00 00 10 00
Jan 14 02:49:29 FileServer kernel: mptbase: ioc0: IOCStatus(0x004b): SCSI IOC Terminated
Jan 14 02:49:29 FileServer kernel: mptscsih: ioc0: task abort: SUCCESS (sc=c15c0960)
Jan 14 02:49:29 FileServer kernel: mptscsih: ioc0: attempting task abort! (sc=c15c0180)
Jan 14 02:49:30 FileServer kernel: sd 0:0:1:0:
Jan 14 02:49:30 FileServer kernel: command: cdb[0]=0x2a: 2a 00 0c 63 38 81 00 00 10 00
Jan 14 02:49:30 FileServer kernel: mptbase: ioc0: IOCStatus(0x004b): SCSI IOC Terminated
Jan 14 02:49:30 FileServer kernel: mptscsih: ioc0: task abort: SUCCESS (sc=c15c0180)
Jan 14 02:52:45 FileServer kernel: mptscsih: ioc0: attempting task abort! (sc=da8bf3e0)
Jan 14 02:52:45 FileServer kernel: sd 0:0:1:0:
Jan 14 02:52:45 FileServer kernel: command: cdb[0]=0x2a: 2a 00 09 cc 5e b9 00 00 08 00
Jan 14 02:52:45 FileServer kernel: command: cdb[0]=0x2a: 2a 00 09 cc 5e b9 00 00 08 00
Jan 14 02:52:45 FileServer kernel: mptbase: ioc0: IOCStatus(0x004b): SCSI IOC Terminated
Jan 14 02:52:45 FileServer kernel: mptscsih: ioc0: task abort: SUCCESS (sc=da8bf3e0)
Jan 14 03:21:54 FileServer -- MARK --
Jan 14 07:42:24 FileServer -- MARK --
Jan 14 07:48:20 FileServer kernel: mptscsih: ioc0: attempting task abort! (sc=c15c0060)
Jan 14 07:48:20 FileServer kernel: sd 0:0:2:0:
Jan 14 07:48:20 FileServer kernel: command: cdb[0]=0x2a: 2a 00 0b 8d 23 59 00 00 08 00
Jan 14 07:48:20 FileServer kernel: mptbase: ioc0: IOCStatus(0x004b): SCSI IOC Terminated
Jan 14 07:48:20 FileServer kernel: mptscsih: ioc0: task abort: SUCCESS (sc=c15c0060)
Jan 14 08:02:24 FileServer -- MARK --
Jan 14 12:42:28 FileServer -- MARK --
Jan 14 20:23:48 FileServer syslogd 1.4.1#18: restart.
Jan 14 20:23:48 FileServer kernel: klogd 1.4.1#18, log source = /proc/kmsg started.
Jan 14 20:23:48 FileServer kernel: Linux version 2.6.18 (r...@fileserver.privatelan.at) (gcc version 4.1.3 20070812 (prerelease) (Debian 4.1.2-15$
Jan 14 20:23:48 FileServer kernel: BIOS-provided physical RAM map:
Jan 14 20:23:48 FileServer kernel: BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
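To see which disks are involved, something like this should work against
the syslog extract (untested; adjust the log path if necessary):

    # count the task aborts per SCSI device
    grep -A1 'attempting task abort' /var/log/messages \
        | grep -o 'sd [0-9]:[0-9]:[0-9]:[0-9]' | sort | uniq -c

In the extract above that shows aborts for sd 0:0:1:0, 0:0:2:0 and 0:0:3:0.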
I believe it is a problem with the SATA cable. I am discussing that on a
German Debian mailing list. But if somebody has a hint for me - thanks a
lot!

>
> Either your backup data and history are vitally important to you, in which
> case you don't want to trust the current state of your pool file system
> for future backups, or they aren't, in which case you can get rid of them
> and save yourself future headaches. If you can avoid it, you probably
> don't want to overwrite your current pool for a while, in case you need to
> restore something. Making an archive of the last backup(s) seems unlikely
> to get every file content right, so you might need to resort to versions
> of files in older backups ...
>
>> [...]
>> When old backups are deleted, old (maybe corrupt) files in the cpool are
>> deleted as well. So possibly corrupt files in the cpool will disappear
>> automatically over the next month.
>
> Yes, but do you know the implementation of the ext[23] file system well
> enough to tell what will happen to possible corruption of file system
> metadata?

No, I surely don't know enough. But e2fsck now tells me that everything is
all right with the filesystem. So I probably have files that ended up with
the wrong blocks or inodes, and I cannot trust their contents. But BackupPC
verifies each newly backed-up file against the cpool, byte by byte. So I
believe BackupPC will verify my files over the next weeks :-)

> Regards,
> Holger

--
Don't Panic
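PS: If I want to spot-check a single pool file by hand before the next
backups get around to it, BackupPC_zcat should do, since cpool files are
stored compressed (the install path below is Debian's and may differ; both
file paths are placeholders):

    # decompress the pool file and compare it byte by byte with the source
    /usr/share/backuppc/bin/BackupPC_zcat \
        /var/lib/backuppc/cpool/X/Y/Z/poolfile | cmp - /path/to/original/file

cmp exits with 0 if the contents are identical.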