Re: end to end error recovery musings

2007-03-01 Thread James Bottomley
On Wed, 2007-02-28 at 17:28 -0800, H. Peter Anvin wrote: James Bottomley wrote: On Wed, 2007-02-28 at 12:42 -0500, Martin K. Petersen wrote: 4104. It's 8 bytes per hardware sector. At least for T10... Er ... that won't look good to the 512 ATA compatibility remapping ... Well, in

Re: end to end error recovery musings

2007-02-28 Thread Douglas Gilbert
Martin K. Petersen wrote: Alan == Alan [EMAIL PROTECTED] writes: Not sure you're up-to-date on the T10 data integrity feature. Essentially it's an extension of the 520 byte sectors common in disk [...] Alan but here's a minor bit of passing bad news - quite a few older Alan ATA

RE: end to end error recovery musings

2007-02-28 Thread Moore, Eric
On Tuesday, February 27, 2007 12:07 PM, Martin K. Petersen wrote: Not sure you're up-to-date on the T10 data integrity feature. Essentially it's an extension of the 520 byte sectors common in disk arrays. For each 512 byte sector (or 4KB ditto) you get 8 bytes of protection data. There's
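A minimal sketch of the "8 bytes of protection data per 512 byte sector" idea described above. The split of those 8 bytes into a 16-bit guard CRC, a 16-bit application tag, and a 32-bit reference tag is an assumption taken from the published T10 DIF layout, not from this message; the function names are hypothetical.

```python
import struct

SECTOR_SIZE = 512   # data bytes per sector
PI_SIZE = 8         # protection bytes per sector, as stated in the message


def crc16_t10dif(data: bytes) -> int:
    """Bitwise CRC-16, polynomial 0x8BB7, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8BB7) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def protect(sector: bytes, app_tag: int, ref_tag: int) -> bytes:
    """Append an 8-byte protection tuple, giving a 520-byte block."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("expected exactly one 512-byte sector")
    guard = crc16_t10dif(sector)
    # Assumed layout: guard (2 bytes), app tag (2 bytes), ref tag (4 bytes), big-endian.
    return sector + struct.pack(">HHI", guard, app_tag & 0xFFFF, ref_tag & 0xFFFFFFFF)


def verify(block: bytes) -> bool:
    """Recompute the guard CRC over the data and compare with the stored one."""
    data, pi = block[:SECTOR_SIZE], block[SECTOR_SIZE:]
    guard, _app_tag, _ref_tag = struct.unpack(">HHI", pi)
    return guard == crc16_t10dif(data)
```

Calling `verify(protect(sector, app_tag, ref_tag))` returns True for an unmodified block, and flipping any single data bit makes it return False.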

Re: end to end error recovery musings

2007-02-28 Thread Martin K. Petersen
Doug == Douglas Gilbert [EMAIL PROTECTED] writes: Doug Work on SAT-2 is now underway and one of the agenda items is Doug end to end data protection and is in the hands of the t13 Doug ATA8-ACS technical editor. So it looks like data integrity is on Doug the radar in the SATA world. It's cool

Re: end to end error recovery musings

2007-02-28 Thread James Bottomley
On Wed, 2007-02-28 at 12:16 -0500, Martin K. Petersen wrote: It's cool that it's on the radar in terms of the protocol. That doesn't mean that drive manufacturers are going to implement it, though. The ones I've talked to were unwilling to sacrifice capacity because that's the main

Re: end to end error recovery musings

2007-02-28 Thread Martin K. Petersen
James == James Bottomley [EMAIL PROTECTED] writes: James However, I could see the SATA manufacturers selling capacity at James 512 (or the new 4096) sectors but allowing their OEMs to James reformat them 520 (or 4160) 4104. It's 8 bytes per hardware sector. At least for T10... -- Martin K.

Re: end to end error recovery musings

2007-02-28 Thread James Bottomley
On Wed, 2007-02-28 at 12:42 -0500, Martin K. Petersen wrote: 4104. It's 8 bytes per hardware sector. At least for T10... Er ... that won't look good to the 512 ATA compatibility remapping ... James - To unsubscribe from this list: send the line unsubscribe linux-raid in the body of a

Re: end to end error recovery musings

2007-02-28 Thread H. Peter Anvin
James Bottomley wrote: On Wed, 2007-02-28 at 12:42 -0500, Martin K. Petersen wrote: 4104. It's 8 bytes per hardware sector. At least for T10... Er ... that won't look good to the 512 ATA compatibility remapping ... Well, in that case you'd only see 8x512 data bytes, no metadata...
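The 4104 versus 4160 disagreement in this sub-thread is plain arithmetic, sketched below. The helper name is hypothetical; the only inputs are the figures quoted in the messages (8 protection bytes per hardware sector, 512 or 4096 data bytes).

```python
PI_BYTES = 8  # protection bytes per hardware sector, per the thread


def formatted_sector_size(data_bytes: int) -> int:
    """Size of one hardware sector once the protection bytes are appended."""
    return data_bytes + PI_BYTES


# One 512-byte hardware sector with protection data.
small = formatted_sector_size(512)             # 520

# One 4096-byte hardware sector carries a single 8-byte tuple.
large = formatted_sector_size(4096)            # 4104

# Eight separately protected 512-byte sectors would need eight tuples.
eight_small = 8 * formatted_sector_size(512)   # 4160

# Under 512-byte emulation of a 4104-byte sector, the host sees only
# the data bytes (8 x 512), not the metadata.
visible_data = large - PI_BYTES                # 4096
```

The 56-byte gap between 4160 and 4104 is the seven extra tuples that per-512-byte protection would require.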

Re: end to end error recovery musings

2007-02-27 Thread Martin K. Petersen
Eric == Moore, Eric [EMAIL PROTECTED] writes: Eric Martin K. Petersen on Data Integrity Feature, which is also Eric called EEDP (End to End Data Protection), which he presented some Eric ideas/suggestions of adding an API in linux for this. T10 DIF is interesting for a few things: -

Re: end to end error recovery musings

2007-02-27 Thread Alan
These features make the most sense in terms of WRITE. Disks already have plenty of CRC on the data so if a READ fails on a regular drive we already know about it. Don't bet on it. If you want to do this seriously you need an end to end (media to host ram) checksum. We do see bizarre and quite

Re: end to end error recovery musings

2007-02-27 Thread Andreas Dilger
On Feb 27, 2007 19:02 +, Alan wrote: It would be great if the app tag was more than 16 bits. Ted mentioned that ideally he'd like to store the inode number in the app tag. But as it stands there isn't room. The lowest few bits are the most important with ext2/ext3 because you
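A rough sketch of why a 16-bit app tag cannot hold a whole inode number, and what keeping only the low bits would look like. The function names are hypothetical and this is not an existing ext2/ext3 or block-layer interface.

```python
APP_TAG_BITS = 16
APP_TAG_MASK = (1 << APP_TAG_BITS) - 1


def app_tag_for_inode(ino: int) -> int:
    """Keep only the low 16 bits of the inode number."""
    return ino & APP_TAG_MASK


def tag_matches_inode(tag: int, ino: int) -> bool:
    """A match is only a hint: inodes 65536 apart share the same tag."""
    return tag == app_tag_for_inode(ino)
```

With this scheme, inodes 5 and 65541 produce the same tag, so a mismatch proves the block belongs to another inode, while a match proves nothing.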

Re: end to end error recovery musings

2007-02-27 Thread Ric Wheeler
Martin K. Petersen wrote: Eric == Moore, Eric [EMAIL PROTECTED] writes: Eric Martin K. Petersen on Data Integrity Feature, which is also Eric called EEDP (End to End Data Protection), which he presented some Eric ideas/suggestions of adding an API in linux for this. T10 DIF is interesting

Re: end to end error recovery musings

2007-02-27 Thread Martin K. Petersen
Alan == Alan [EMAIL PROTECTED] writes: These features make the most sense in terms of WRITE. Disks already have plenty of CRC on the data so if a READ fails on a regular drive we already know about it. Alan Don't bet on it. This is why I mentioned that I want to expose the protection

Re: end to end error recovery musings

2007-02-27 Thread Alan
Not sure you're up-to-date on the T10 data integrity feature. Essentially it's an extension of the 520 byte sectors common in disk I saw the basics but not the detail. Thanks for the explanation it was most helpful and promises to fix a few things for some controllers.. but here's a minor bit

Re: end to end error recovery musings

2007-02-27 Thread Martin K. Petersen
Alan == Alan [EMAIL PROTECTED] writes: Not sure you're up-to-date on the T10 data integrity feature. Essentially it's an extension of the 520 byte sectors common in disk [...] Alan but here's a minor bit of passing bad news - quite a few older Alan ATA controllers can't issue DMA transfers

Re: end to end error recovery musings

2007-02-26 Thread Theodore Tso
On Mon, Feb 26, 2007 at 04:33:37PM +1100, Neil Brown wrote: Do we want a path in the other direction to handle write errors? The file system could say Don't worry too much if this block cannot be written, just return an error and I will write it somewhere else? This might allow md not to fail

Re: end to end error recovery musings

2007-02-26 Thread Alan
the new location. I believe this should be always true, so presumably with all modern disk drives a write error should mean something very serious has happened. Not quite that simple. If you write a block aligned size the same size as the physical media block size maybe this is true. If you

Re: end to end error recovery musings

2007-02-26 Thread James Bottomley
On Mon, 2007-02-26 at 08:25 -0500, Theodore Tso wrote: Somewhat off-topic, but my one big regret with how the dm vs. evms competition settled out was that evms had the ability to perform block device snapshots using a non-LVM volume as the base --- and that EVMS allowed a single drive to be

Re: end to end error recovery musings

2007-02-26 Thread Ric Wheeler
Alan wrote: the new location. I believe this should be always true, so presumably with all modern disk drives a write error should mean something very serious has happened. Not quite that simple. I think that write errors are normally quite serious, but there are exceptions which might

Re: end to end error recovery musings

2007-02-26 Thread Alan
I think that this is mostly true, but we also need to balance this against the need for higher levels to get a timely response. In a really large IO, a naive retry of a very large write could lead to a non-responsive system for a very large time... And losing the I/O could result in a

Re: end to end error recovery musings

2007-02-26 Thread Ric Wheeler
Alan wrote: I think that this is mostly true, but we also need to balance this against the need for higher levels to get a timely response. In a really large IO, a naive retry of a very large write could lead to a non-responsive system for a very large time... And losing the I/O could

Re: end to end error recovery musings

2007-02-26 Thread H. Peter Anvin
Theodore Tso wrote: In any case, the reason why I bring this up is that it would be really nice if there was a way with a single laptop drive to be able to do snapshots and background fsck's without having to use initrd's with device mapper. This is a major part of why I've been trying to

Re: end to end error recovery musings

2007-02-26 Thread Jeff Garzik
Theodore Tso wrote: Can someone with knowledge of current disk drive behavior confirm that for all drives that support bad block sparing, if an attempt to write to a particular spot on disk results in an error due to bad media at that spot, the disk drive will automatically rewrite the sector to

Re: end to end error recovery musings

2007-02-26 Thread Ric Wheeler
Jeff Garzik wrote: Theodore Tso wrote: Can someone with knowledge of current disk drive behavior confirm that for all drives that support bad block sparing, if an attempt to write to a particular spot on disk results in an error due to bad media at that spot, the disk drive will automatically

Re: end to end error recovery musings

2007-02-26 Thread Alan
One interesting counter example is a smaller write than a full page - say 512 bytes out of 4k. If we need to do a read-modify-write and it just so happens that 1 of the 7 sectors we need to read is flaky, will this look like a write failure? The current core kernel code can't handle
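The counter example above can be sketched with a toy in-memory disk: writing one 512-byte sector into a 4k page requires reading the other seven sectors first, so a flaky sector that is not even being written can make the write fail. `ToyDisk` and `write_sector_via_page` are hypothetical names for illustration, not kernel interfaces.

```python
SECTOR = 512
SECTORS_PER_PAGE = 8
PAGE = SECTOR * SECTORS_PER_PAGE


class ToyDisk:
    """In-memory stand-in for a drive with a set of unreadable sectors."""

    def __init__(self, n_sectors, bad=()):
        self.data = bytearray(n_sectors * SECTOR)
        self.bad = set(bad)

    def read_sector(self, lba):
        if lba in self.bad:
            raise OSError(f"unreadable sector {lba}")
        return bytes(self.data[lba * SECTOR:(lba + 1) * SECTOR])

    def write_page(self, page, buf):
        self.data[page * PAGE:(page + 1) * PAGE] = buf


def write_sector_via_page(disk, lba, payload):
    """Read-modify-write: rebuild the whole page around one new sector."""
    page = lba // SECTORS_PER_PAGE
    first = page * SECTORS_PER_PAGE
    buf = bytearray()
    for s in range(first, first + SECTORS_PER_PAGE):
        if s == lba:
            buf += payload
        else:
            # A read error here surfaces to the caller as a failed write.
            buf += disk.read_sector(s)
    disk.write_page(page, bytes(buf))
```

On `ToyDisk(8, bad={5})`, a write to sector 2 raises `OSError` even though sector 2 itself is fine, which is the ambiguity the message asks about.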

RE: end to end error recovery musings

2007-02-26 Thread Moore, Eric
On Monday, February 26, 2007 9:42 AM, Ric Wheeler wrote: Which brings us back to a recent discussion at the file system workshop on being more repair oriented in file system design so we can survive situations like this a bit more reliably ;-) On the second day of the workshop, there

Re: end to end error recovery musings

2007-02-25 Thread Neil Brown
On Friday February 23, [EMAIL PROTECTED] wrote: On Fri, Feb 23, 2007 at 05:37:23PM -0700, Andreas Dilger wrote: Probably the only sane thing to do is to remember the bad sectors and avoid attempting reading them; that would mean marking automatic versus explicitly requested requests to

Re: end to end error recovery musings

2007-02-25 Thread Douglas Gilbert
H. Peter Anvin wrote: Ric Wheeler wrote: We still have the following challenges: (1) read-ahead often means that we will retry every bad sector at least twice from the file system level. The first time, the fs read ahead request triggers a speculative read that includes the bad sector

Re: end to end error recovery musings

2007-02-24 Thread Chris Wedgwood
On Fri, Feb 23, 2007 at 09:32:29PM -0500, Theodore Tso wrote: And having a way of making this list available to both the filesystem and to a userspace utility, so they can more easily deal with doing a forced rewrite of the bad sector, after determining which file is involved and perhaps

end to end error recovery musings

2007-02-23 Thread Ric Wheeler
In the IO/FS workshop, one idea we kicked around is the need to provide better and more specific error messages between the IO stack and the file system layer. My group has been working to stabilize a relatively up to date libata + MD based box, so I can try to lay out at least one appliance

Re: end to end error recovery musings

2007-02-23 Thread H. Peter Anvin
Ric Wheeler wrote: We still have the following challenges: (1) read-ahead often means that we will retry every bad sector at least twice from the file system level. The first time, the fs read ahead request triggers a speculative read that includes the bad sector (triggering the error

Re: end to end error recovery musings

2007-02-23 Thread Andreas Dilger
On Feb 23, 2007 16:03 -0800, H. Peter Anvin wrote: Ric Wheeler wrote: (1) read-ahead often means that we will retry every bad sector at least twice from the file system level. The first time, the fs read ahead request triggers a speculative read that includes the bad sector

Re: end to end error recovery musings

2007-02-23 Thread H. Peter Anvin
Andreas Dilger wrote: And clearing this list when the sector is overwritten, as it will almost certainly be relocated at the disk level. Certainly if the overwrite is successful. -hpa
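A small sketch of the bad-sector list discussed in this sub-thread: remember sectors that failed a read, avoid re-reading them, and forget a sector only after a successful overwrite. The class and method names are hypothetical; nothing here reflects an existing kernel structure.

```python
class BadSectorList:
    """Tracks sectors that returned read errors."""

    def __init__(self):
        self._bad = set()

    def note_read_error(self, lba):
        """Remember a sector whose read failed."""
        self._bad.add(lba)

    def is_known_bad(self, lba):
        """True if a read of this sector is expected to fail again."""
        return lba in self._bad

    def note_write(self, lba, ok):
        """Clear the entry only when the overwrite succeeded."""
        if ok:
            self._bad.discard(lba)
```

A failed overwrite leaves the entry in place, matching the "certainly if the overwrite is successful" qualification.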