Re: [dm-devel] Re: [RFD] BIO_RW_BARRIER - what it means for devices, filesystems, and dm/md.

2007-07-13 Thread Ric Wheeler
} development; [EMAIL PROTECTED]; [EMAIL PROTECTED]; } linux-raid@vger.kernel.org; Jens Axboe; David Chinner; Andreas Dilger } Subject: Re: [dm-devel] Re: [RFD] BIO_RW_BARRIER - what it means for } devices, filesystems, and dm/md. } } On Wed, 11 Jul 2007 18:44:21 EDT, Ric Wheeler said: } [EMAIL

Re: [dm-devel] Re: [RFD] BIO_RW_BARRIER - what it means for devices, filesystems, and dm/md.

2007-07-12 Thread Ric Wheeler
[EMAIL PROTECTED] wrote: On Wed, 11 Jul 2007 18:44:21 EDT, Ric Wheeler said: [EMAIL PROTECTED] wrote: On Tue, 10 Jul 2007 14:39:41 EDT, Ric Wheeler said: All of the high end arrays have non-volatile cache (read, on power loss, it is a promise that it will get all of your data out

Re: [dm-devel] Re: [RFD] BIO_RW_BARRIER - what it means for devices, filesystems, and dm/md.

2007-07-11 Thread Ric Wheeler
[EMAIL PROTECTED] wrote: On Tue, 10 Jul 2007 14:39:41 EDT, Ric Wheeler said: All of the high end arrays have non-volatile cache (read, on power loss, it is a promise that it will get all of your data out to permanent storage). You don't need to ask this kind of array to drain the cache

Re: parity check for read?

2007-04-04 Thread Ric Wheeler
Mirko Benz wrote: Neil, Exactly what I had in mind. Some vendors claim they do parity checking for reads. Technically it should be possible for Linux RAID as well but is not implemented – correct? Reliability data for unrecoverable read errors: - enterprise SAS drive (ST3300655SS): 1 in

Re: end to end error recovery musings

2007-02-27 Thread Ric Wheeler
Martin K. Petersen wrote: Eric == Moore, Eric [EMAIL PROTECTED] writes: Eric> Martin K. Petersen on Data Integrity Feature, which is also called EEDP (End to End Data Protection), which he presented some ideas/suggestions of adding an API in linux for this. T10 DIF is interesting

Re: end to end error recovery musings

2007-02-26 Thread Ric Wheeler
Alan wrote: the new location. I believe this should be always true, so presumably with all modern disk drives a write error should mean something very serious has happened. Not quite that simple. I think that write errors are normally quite serious, but there are exceptions which might

Re: end to end error recovery musings

2007-02-26 Thread Ric Wheeler
Alan wrote: I think that this is mostly true, but we also need to balance this against the need for higher levels to get a timely response. In a really large IO, a naive retry of a very large write could lead to a non-responsive system for a very large time... And losing the I/O could

Re: end to end error recovery musings

2007-02-26 Thread Ric Wheeler
Jeff Garzik wrote: Theodore Tso wrote: Can someone with knowledge of current disk drive behavior confirm that for all drives that support bad block sparing, if an attempt to write to a particular spot on disk results in an error due to bad media at that spot, the disk drive will automatically

end to end error recovery musings

2007-02-23 Thread Ric Wheeler
In the IO/FS workshop, one idea we kicked around is the need to provide better and more specific error messages between the IO stack and the file system layer. My group has been working to stabilize a relatively up to date libata + MD based box, so I can try to lay out at least one appliance

Re: [patch] latency problem in md driver

2006-12-22 Thread Ric Wheeler
Jeff Garzik wrote: Lars Ellenberg wrote: md raidX make_request functions strip off the BIO_RW_SYNC flag, thus introducing additional latency. below is a suggested patch for the raid1.c. other suggested solutions would be to let the bio_clone do its work, and not reassign thereby stripping off

Re: libata hotplug and md raid?

2006-09-13 Thread Ric Wheeler
(Adding Tejun & Greg KH to this thread) Leon Woestenberg wrote: Hello all, I am testing the (work-in-progress / upcoming) libata SATA hotplug. Hotplugging alone seems to work, but not well in combination with md RAID. Here is my report and a question about intended behaviour. Mainstream

Re: Failed Hard Disk... help!

2006-06-10 Thread Ric Wheeler
David M. Strang wrote: /Patrick wrote: pretty sure smartctl -d ata -a /dev/sdwhatever will tell you the serial number. (Hopefully the kernel is new enough that it supports SATA/smart, otherwise you need a kernel patch which won't be any better...) Yep... 2.6.15 or better... I need the

Re: md: Change ENOTSUPP to EOPNOTSUPP

2006-04-29 Thread Ric Wheeler
Molle Bestefich wrote: Ric Wheeler wrote: You are absolutely right - if you do not have a validated, working barrier for your low level devices (or a high end, battery backed array or JBOD), you should disable the write cache on your RAIDed partitions and on your normal file systems

Re: [PATCH 003 of 5] md: Change ENOTSUPP to EOPNOTSUPP

2006-04-28 Thread Ric Wheeler
Molle Bestefich wrote: NeilBrown wrote: Change ENOTSUPP to EOPNOTSUPP Because that is what you get if a BIO_RW_BARRIER isn't supported! Dumb question, hope someone can answer it :). Does this mean that any version of MD up till now won't know that SATA disks do not support barriers,

Re: md faster than h/w?

2006-01-16 Thread Ric Wheeler
Max Waterman wrote: Mark Hahn wrote: I've written a fairly simple bandwidth-reporting tool: http://www.sharcnet.ca/~hahn/iorate.c it prints incremental bandwidth, which I find helpful because it shows recording zones, like this slightly odd Samsung:

Re: disk light remains on

2006-01-03 Thread Ric Wheeler
[EMAIL PROTECTED] wrote: Thanks for the reply. On Mon, Jan 02, 2006 at 11:49:14PM -0500, Ross Vandegrift wrote: I just began using RAID-1 (in 2.6.12) on a pair of SATA drives, and now the hard disk drive light comes on during booting--about when the RAID system is loaded--and stays on.

4-way RAID-1 group never finishes md_do_sync()?

2005-01-31 Thread Ric Wheeler
We have a setup where the system partitions (/, /usr, /var) are all mirrored across 4-volume RAID-1 devices. On some set of our nodes running both a SLES based kernel and 2.6.10, we have a condition where a device gets stuck in the md_do_sync() code and never makes progress, never