Re: automatic fsck on gmirror failure

2008-02-22 Thread Wojciech Puchar


> $ grep -i fsck /etc/defaults/rc.conf
> fsck_y_enable=NO  # Set to YES to do fsck -y if the initial preen fails.
>
> gmirror(8) / geom(8) should automatically remove (degrade) components
> with bad I/O operations after a certain threshold, but I'm pretty sure
> it doesn't.


yes it does
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: automatic fsck on gmirror failure

2008-02-22 Thread Brian A. Seklecki

On Sun, 2008-02-03 at 23:39 +0100, Wojciech Puchar wrote:
> it failed while rebuilding with badly written data on the disk that was
> used, while other rebuild.
>
> now it can't read it.
>
> if you are sure that it doesn't pass through fsck before second reboot, do
> the following.
>
> 1) turn off gmirror
>
> 2) clear gmirror header on both providers
>
> 3) run fsck on the other drive (not ad6, but the other one used in the mirror).
 

Also don't forget about:

$ grep -i fsck /etc/defaults/rc.conf
fsck_y_enable=NO  # Set to YES to do fsck -y if the initial preen fails.

gmirror(8) / geom(8) should automatically remove (degrade) components
with bad I/O operations after a certain threshold, but I'm pretty sure
it doesn't.
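
A minimal sketch of how that knob would be overridden, assuming the usual FreeBSD convention that local settings go in /etc/rc.conf rather than in /etc/defaults/rc.conf. Only fsck_y_enable comes from the grep output above; the comments are explanatory.

```shell
# /etc/rc.conf  (local overrides; leave /etc/defaults/rc.conf untouched)
# If the boot-time preen (fsck -p) fails, rerun the check as fsck -y
# instead of stopping the boot and waiting for an operator.
# Caution: fsck -y answers "yes" to every repair prompt, so damaged
# files can be removed without anyone confirming it.
fsck_y_enable="YES"
```

This trades a possible loss of damaged files for an unattended boot, which may or may not be acceptable on a given server.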

~BAS


> 4) pray
>
> 5) after fsck finishes successfully (it should), create the gmirror with
> the disk you checked
>
> gmirror label options gmirror-name /dev/thedisk
>
> 6) reboot and start the system. should go well.




Re: automatic fsck on gmirror failure

2008-02-22 Thread Wojciech Puchar

> gmirror(8) / geom(8) should automatically remove (degrade) components
> with bad I/O operations after a certain threshold, but I'm pretty sure
> it doesn't.

but i'm absolutely sure it does because it did several times for me


Re: automatic fsck on gmirror failure

2008-02-22 Thread Brian A. Seklecki




On Fri, 22 Feb 2008, Wojciech Puchar wrote:



>> $ grep -i fsck /etc/defaults/rc.conf
>> fsck_y_enable=NO  # Set to YES to do fsck -y if the initial preen fails.
>>
>> gmirror(8) / geom(8) should automatically remove (degrade) components
>> with bad I/O operations after a certain threshold, but I'm pretty sure
>> it doesn't.
>
> yes it does



Maybe my experiences didn't hit the threshold.  I'm checking the code now.
The threshold is likely compile-time adjustable.


~BAS


Re: automatic fsck on gmirror failure

2008-02-22 Thread Brian A. Seklecki

On Fri, 22 Feb 2008, Wojciech Puchar wrote:


>> gmirror(8) / geom(8) should automatically remove (degrade) components
>> with bad I/O operations after a certain threshold, but I'm pretty sure
>> it doesn't.
>
> but i'm absolutely sure it does because it did several times for me



Finally I had some time to research.  939 of geom_mirror -- 
kern.geom.mirror.disconnect_on_failure -- It's a newer 6.x thing 
apparently:


The threshold itself is not tunable; disconnection happens on a single
failure.  I ask about a tunable threshold because some cheap IDE (Maxtor)
disks can fail, then recover.


6.3/amd64:

  [EMAIL PROTECTED] /usr/src-RELENG_6_3]# sysctl -a|grep -i kern.geom.mirror
  kern.geom.mirror.sync_requests: 2
  kern.geom.mirror.disconnect_on_failure: 1
  kern.geom.mirror.idletime: 5
  kern.geom.mirror.timeout: 4
  kern.geom.mirror.debug: 0


But:

 FreeBSD wingspan 5.5-RELEASE-p10 FreeBSD 5.5-RELEASE-p10 #0: Fri Jan 12

 [EMAIL PROTECTED]:/home/seklecki$ sysctl -a|grep -i kern.geom.mirror

 kern.geom.mirror.debug: 0
 kern.geom.mirror.timeout: 0
 kern.geom.mirror.idletime: 5
 kern.geom.mirror.reqs_per_sync: 5
 kern.geom.mirror.syncs_per_sec: 1000




automatic fsck on gmirror failure

2008-02-03 Thread Gunther Mayer

Hi there,

I have a RAID 1 mirror implemented with gmirror and we recently had some 
power issues at our data centre which caused fsck to fail mysteriously. 
The server lost power unexpectedly, then came back up again for a 
minute, power died again and shortly after the next boot the following 
appears in my /var/log/messages


   Feb  2 05:20:19 myserver fsck: /dev/mirror/gm0s1f: INCORRECT BLOCK COUNT I=777684 (8 should be 0) (CORRECTED)
   Feb  2 05:20:19 myserver fsck: /dev/mirror/gm0s1f: CANNOT READ BLK: 12417184
   Feb  2 05:20:19 myserver fsck: /dev/mirror/gm0s1f: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY.


gm0s1f is my /usr partition. This was followed by countless errors that 
look like


   Feb  2 05:20:38 myserver ad6: TIMEOUT - READ_DMA retrying (1 retry left) LBA=29096879
   Feb  2 05:20:43 myserver ad6: TIMEOUT - READ_DMA retrying (0 retries left) LBA=29096879
   Feb  2 05:20:48 myserver ad6: FAILURE - READ_DMA timed out LBA=29096879
   Feb  2 05:20:48 myserver g_vfs_done():mirror/gm0s1f[READ(offset=6357598208, length=16384)]error = 5


and with it went any sort of remote access to the box. We had to get 
physical access, fsck -y and reboot for the machine to be put back into 
service.
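
To see at a glance which disk and which sectors are involved, the log can be reduced to one line per failed read. This is only a sketch: failed_reads is a made-up helper name, and the pattern matches the exact message wording shown above, which may differ on other releases or drivers.

```shell
#!/bin/sh
# failed_reads (hypothetical helper): read syslog lines on stdin and print
# "device LBA" once for each distinct read that ended in FAILURE.
# TIMEOUT/retry lines are ignored on purpose.
failed_reads() {
    sed -n 's/.* \([a-z][a-z]*[0-9][0-9]*\): FAILURE - READ_DMA timed out LBA=\([0-9][0-9]*\).*/\1 \2/p' |
        sort -u
}

# Sample lines copied from the log excerpt above.
sample='Feb  2 05:20:38 myserver ad6: TIMEOUT - READ_DMA retrying (1 retry left) LBA=29096879
Feb  2 05:20:43 myserver ad6: TIMEOUT - READ_DMA retrying (0 retries left) LBA=29096879
Feb  2 05:20:48 myserver ad6: FAILURE - READ_DMA timed out LBA=29096879'

printf '%s\n' "$sample" | failed_reads   # prints: ad6 29096879
```

On the server itself this would be run as "failed_reads < /var/log/messages". Many distinct LBAs on the same disk would suggest a failing drive rather than a one-off glitch.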


Now my question is: Why did fsck die on me? I thought in this day and 
age file system corruptions caused by power failures are repaired 
automatically upon reboot. Or is it possible that interrupting fsck 
itself caused the problem when the system went down again after the very 
brief uptime in between?


I am really concerned about this as this caused a lot of unnecessary 
downtime and I really don't want this to ever happen again. I know, 
solving the power issues is the real solution but I want my several 
layers of peace of mind.


Oh, I run 6.2 RELEASE.

Gunther


Re: automatic fsck on gmirror failure

2008-02-03 Thread Wojciech Puchar
it failed while rebuilding with badly written data on the disk that was 
used, while other rebuild.


now it can't read it.

if you are sure that it doesn't pass through fsck before second reboot, do 
the following.


1) turn off gmirror

2) clear gmirror header on both providers

3) run fsck on the other drive (not ad6, but the other one used in the mirror).

4) pray

5) after fsck finishes successfully (it should), create the gmirror with
the disk you checked


gmirror label options gmirror-name /dev/thedisk

6) reboot and start the system. should go well.

7) once the system is running and disk I/O is light, do

gmirror insert gmirror-name /dev/ad6

8) pray again, but with much less fear.

9) if gmirror finishes the rebuild, all is well.

if you get write errors in the log, ad6 needs to be replaced.
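
The steps above can be sketched as a dry-run script. It only PRINTS the commands unless DRY_RUN=0 is set, because several of them are destructive. Assumptions not taken from the thread: the healthy disk is called ad4, the mirror is gm0 built on whole disks, the damaged file system is slice 1 partition f, and steps 1-5 are done from single-user mode or rescue media. The helper names run, recover_first_half and recover_second_half are made up for illustration.

```shell
#!/bin/sh
# Dry-run sketch of the recovery steps. Prints commands by default;
# set DRY_RUN=0 to execute them, and only after reading every line.
MIRROR=${MIRROR:-gm0}   # mirror name (from /dev/mirror/gm0s1f)
GOOD=${GOOD:-ad4}       # ASSUMED name of the healthy disk
BAD=${BAD:-ad6}         # disk that logged the READ_DMA failures
PART=${PART:-s1f}       # ASSUMED slice/partition holding /usr
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"     # show what would be executed
    else
        "$@"
    fi
}

recover_first_half() {  # steps 1-5
    run gmirror stop "$MIRROR"                 # 1) turn off the mirror
    run gmirror clear "$GOOD" "$BAD"           # 2) clear metadata on both providers
    run fsck -t ufs -y "/dev/${GOOD}${PART}"   # 3) check the healthy disk
    run gmirror label "$MIRROR" "/dev/$GOOD"   # 5) recreate a one-disk mirror
}

recover_second_half() { # steps 7-9, after the reboot in step 6
    run gmirror insert "$MIRROR" "/dev/$BAD"   # 7) re-add ad6 and start the resync
    run gmirror status "$MIRROR"               # 9) check rebuild progress
}

recover_first_half
recover_second_half
```

Run as is, it lists six commands prefixed with "+". Nothing touches the disks until DRY_RUN=0 is set explicitly.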


hope it helps.