Re: gmirror not synced
On Thu 2012-01-05 (09:56), Matthew Seaman wrote:

> drive is actually generating errors.) Also try a few passes of memtest86 to try and spot problems with RAM.

Yes, that was the problem. I have gotten rid of a faulty DIMM and everything is looking a lot saner, thanks.

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: gmirror not synced
On 01/04/12 14:43, Gareth de Vaux wrote:

> Hi all, I've noticed that the md5 hashes of a couple of files on a gmirror change when I recalculate the hashes. The output usually cycles between 2 hashes per file.
>
> I'm guessing this is because each calculation reads the file randomly from 1 of 2 component drives, and the files in question had a few bit flips during their original sync.
>
> I also assume this's something you have to live with for gmirror? Is removing and completely rebuilding the secondary drive the only thing you can do (which might fix these bit flips but incur others elsewhere)?

Hi.

Bit-flipping is unlikely, but you can test this hypothesis by having it only ever read from one disk. Use gmirror configure -b to change the balancing algorithm of the array to prefer (which reads from the highest-priority member) and gmirror configure -p to change the priority of one of the members. Then, repeat the test for the other member.

What I would say is more likely is that you've got bad memory or CPU cache in the machine. I've had this happen to me and that turned out to be the case.

And, as the other reply said, it's a good idea to make sure you've got a good backup at this point.

-Boris
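The "repeat the test" step above can be scripted. What follows is only a minimal sketch, assuming Python is available on the machine; the function names md5_of and repeated_digests are illustrative and do not come from the thread. It hashes the same file several times and counts how many distinct digests appear. Pinning reads to one mirror member is still done with gmirror itself, as described above.

```python
import hashlib
import sys
from collections import Counter


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at path, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def repeated_digests(path, passes=10):
    """Hash the same file `passes` times and count each distinct digest."""
    return Counter(md5_of(path) for _ in range(passes))


if __name__ == "__main__":
    # Hypothetical usage: python rehash.py /path/to/file [more files...]
    for path in sys.argv[1:]:
        counts = repeated_digests(path)
        status = "stable" if len(counts) == 1 else "UNSTABLE"
        print(f"{path}: {status}")
        for digest, count in counts.items():
            print(f"  {digest}  x{count}")
```

A healthy file should yield exactly one digest. Note that repeated reads may be served from the buffer cache rather than from the disks, so a stable result on its own does not prove that both mirror members hold identical data.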
Re: gmirror not synced
On Thu, Jan 05, 2012 at 09:56:02AM +, Matthew Seaman wrote:

> On 04/01/2012 19:43, Gareth de Vaux wrote:
>
> > Hi all, I've noticed that the md5 hashes of a couple of files on a gmirror change when I recalculate the hashes. The output usually cycles between 2 hashes per file.
> >
> > I'm guessing this is because each calculation reads the file randomly from 1 of 2 component drives, and the files in question had a few bit flips during their original sync.
> >
> > I also assume this's something you have to live with for gmirror? Is removing and completely rebuilding the secondary drive the only thing you can do (which might fix these bit flips but incur others elsewhere)?
>
> No, that's not something acceptable at all. Randomly flipping bits in files is a really nasty failure mode.
>
> What does 'gmirror list' tell you about the state of the gmirror?
>
> Is there any possibility that your hardware is failing? Check the SMART attributes of the disk in the first instance (it isn't brilliant for picking up impending failure, but it should be pretty accurate once the drive is actually generating errors.) Also try a few passes of memtest86 to try and spot problems with RAM. Cleaning dust out of air vents and heatsinks and generally making sure the machine is not overheating is a good idea too.

Another possibility is a disk with an intermittently faulty cache, or a drive that has basically given up (firmware bug, design flaw, etc.) honouring ECC[1][2] when reading/writing sectors.

For the former point, SMART statistics from the drives could help determine if this is the case, but I stress the word could. This is usually stored in Attribute 184 (End-to-End_Error) but is not available on very many drives.

Gareth, please install ports/sysutils/smartmontools (make sure it's version 5.42 or newer) and provide output from smartctl -x /dev/disk and I'll review it for you.
[1]: http://www.storagereview.com/guide/error.html (read all subsections too)
[2]: http://www.dewassoc.com/kbase/hard_drives/hard_disk_sector_structures.htm

-- 
| Jeremy Chadwick                      jdc at parodius.com |
| Parodius Networking             http://www.parodius.com/ |
| UNIX Systems Administrator         Mountain View, CA, US |
| Making life hard for others since 1977.     PGP 4BD6C0CB |
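For anyone wanting to check Attribute 184 in saved smartctl output before posting it, a rough sketch follows. It assumes the attribute table has one row per attribute, with the numeric ID in the first column, the name in the second, and the raw value in the last. The function name find_attribute and the SAMPLE rows are invented for illustration; they are not output from the drives discussed in this thread, and real output may be laid out differently.

```python
def find_attribute(smartctl_text, attr_id):
    """Return (name, raw_value) for the attribute row with this ID, or None."""
    wanted = str(attr_id)
    for line in smartctl_text.splitlines():
        fields = line.split()
        # Assumption: attribute rows begin with the numeric ID, then the name,
        # and end with the raw value as a single whitespace-free token.
        if len(fields) >= 3 and fields[0] == wanted:
            return fields[1], fields[-1]
    return None


# Made-up sample rows for illustration only.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4321
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
"""
```

Calling find_attribute(SAMPLE, 184) returns the name and raw value of that row, and None comes back when the drive does not report the attribute at all, which, as noted above, is the common case.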
Re: gmirror not synced
On 04/01/2012 19:43, Gareth de Vaux wrote:

> Hi all, I've noticed that the md5 hashes of a couple of files on a gmirror change when I recalculate the hashes. The output usually cycles between 2 hashes per file.
>
> I'm guessing this is because each calculation reads the file randomly from 1 of 2 component drives, and the files in question had a few bit flips during their original sync.
>
> I also assume this's something you have to live with for gmirror? Is removing and completely rebuilding the secondary drive the only thing you can do (which might fix these bit flips but incur others elsewhere)?

No, that's not something acceptable at all. Randomly flipping bits in files is a really nasty failure mode.

What does 'gmirror list' tell you about the state of the gmirror?

Is there any possibility that your hardware is failing? Check the SMART attributes of the disk in the first instance (it isn't brilliant for picking up impending failure, but it should be pretty accurate once the drive is actually generating errors.) Also try a few passes of memtest86 to try and spot problems with RAM. Cleaning dust out of air vents and heatsinks and generally making sure the machine is not overheating is a good idea too.

Actually, first thing to do is make sure you have really good backups. Bonus points if you have been backing everything up religiously, and can extract a known good copy of the files in question from some of the older ones.

Cheers,

Matthew

-- 
Dr Matthew J Seaman MA, D.Phil.                   7 Priory Courtyard
                                                  Flat 3
PGP: http://www.infracaninophile.co.uk/pgpkey     Ramsgate
JID: matt...@infracaninophile.co.uk               Kent, CT11 9PW
gmirror not synced
Hi all, I've noticed that the md5 hashes of a couple of files on a gmirror change when I recalculate the hashes. The output usually cycles between 2 hashes per file.

I'm guessing this is because each calculation reads the file randomly from 1 of 2 component drives, and the files in question had a few bit flips during their original sync.

I also assume this's something you have to live with for gmirror? Is removing and completely rebuilding the secondary drive the only thing you can do (which might fix these bit flips but incur others elsewhere)?