Re: Raid5 assemble after dual sata port failure

2007-11-16 Thread Chris Eddington
Yes, this is exactly the kind of symptoms I've experienced. I was losing a drive here and there every couple of months (mostly the last two drives sdc and sdd) which I thought were cable problems (shut down, re-plug the cables and restart and it would always work, with add/rebuild the 4th

Re: Raid5 assemble after dual sata port failure

2007-11-11 Thread David Greaves
Chris Eddington wrote: Hi, Thanks for the pointer on xfs_repair -n, it actually tells me something (some listed below) but I'm not sure what it means but there seems to be a lot of data loss. One complication is I see an error message in ata6, so I moved the disks around thinking it was a

Re: Raid5 assemble after dual sata port failure

2007-11-11 Thread Chris Eddington
Yes, there is some kind of media error message in dmesg, below. It is not random, it happens at exactly the same moments in each xfs_repair -n run. Nov 11 09:48:25 altair kernel: [37043.300691] res 51/40:00:01:00:00/00:00:00:00:00/e1 Emask 0x9 (media error) Nov 11 09:48:25 altair

Re: Raid5 assemble after dual sata port failure

2007-11-11 Thread David Greaves
Chris Eddington wrote: Yes, there is some kind of media error message in dmesg, below. It is not random, it happens at exactly the same moments in each xfs_repair -n run. Nov 11 09:48:25 altair kernel: [37043.300691] res 51/40:00:01:00:00/00:00:00:00:00/e1 Emask 0x9 (media error)

Re: Raid5 assemble after dual sata port failure

2007-11-11 Thread Bill Davidsen
David Greaves wrote: Chris Eddington wrote: Yes, there is some kind of media error message in dmesg, below. It is not random, it happens at exactly the same moments in each xfs_repair -n run. Nov 11 09:48:25 altair kernel: [37043.300691] res 51/40:00:01:00:00/00:00:00:00:00/e1

Re: Raid5 assemble after dual sata port failure

2007-11-10 Thread David Greaves
Ok - it looks like the raid array is up. There will have been an event count mismatch which is why you needed --force. This may well have caused some (hopefully minor) corruption. FWIW, xfs_check is almost never worth running :) (It runs out of memory easily). xfs_repair -n is much better. What

Re: Raid5 assemble after dual sata port failure

2007-11-10 Thread Chris Eddington
Hi, Thanks for the pointer on xfs_repair -n, it actually tells me something (some listed below) but I'm not sure what it means but there seems to be a lot of data loss. One complication is I see an error message in ata6, so I moved the disks around thinking it was a flaky sata port, but I

Re: Raid5 assemble after dual sata port failure

2007-11-09 Thread Chris Eddington
Thanks David. I've had cable/port failures in the past and after re-adding the drive, the order changed - I'm not sure why, but I noticed it some time ago and don't remember the exact order. On my initial attempt to assemble, it came up with only two drives in the array. Then I tried

Re: Raid5 assemble after dual sata port failure

2007-11-09 Thread Chris Eddington
Hi David, I ran xfs_check and get this: ERROR: The filesystem has valuable metadata changes in a log which needs to be replayed. Mount the filesystem to replay the log, and unmount it before re-running xfs_check. If you are unable to mount the filesystem, then use the xfs_repair -L option to

Re: Raid5 assemble after dual sata port failure

2007-11-08 Thread David Greaves
Chris Eddington wrote: Hi, Hi While on vacation I had one SATA port/cable fail, and then four hours later a second one fail. After fixing/moving the SATA ports, I can reboot and all drives seem to be OK now, but when assembled it won't recognize the filesystem. That's unusual - if the

Raid5 assemble after dual sata port failure

2007-11-07 Thread chrise
Hi, While on vacation I had one SATA port/cable fail, and then four hours later a second one fail. After fixing/moving the SATA ports, I can reboot and all drives seem to be OK now, but when assembled it won't recognize the filesystem. After futzing around with assemble options like --force