Re: end to end error recovery musings

2007-03-01 Thread James Bottomley
On Wed, 2007-02-28 at 17:28 -0800, H. Peter Anvin wrote: James Bottomley wrote: On Wed, 2007-02-28 at 12:42 -0500, Martin K. Petersen wrote: 4104. It's 8 bytes per hardware sector. At least for T10... Er ... that won't look good to the 512 ATA compatibility remapping ... Well, in

Re: end to end error recovery musings

2007-03-01 Thread H. Peter Anvin
James Bottomley wrote: On Wed, 2007-02-28 at 17:28 -0800, H. Peter Anvin wrote: James Bottomley wrote: On Wed, 2007-02-28 at 12:42 -0500, Martin K. Petersen wrote: 4104. It's 8 bytes per hardware sector. At least for T10... Er ... that won't look good to the 512 ATA compatibility remapping

RE: end to end error recovery musings

2007-02-28 Thread Martin K. Petersen
Eric == Moore, Eric [EMAIL PROTECTED] writes: [Trimmed the worldwide broadcast CC: list down to linux-scsi] Eric I from the scsi lld perspective, all we need 32 byte cdbs, and a Eric mechanism to pass the tags down from above. Ok, so your board only supports Type 2 protection? Eric It

RE: end to end error recovery musings

2007-02-28 Thread Moore, Eric
On Tuesday, February 27, 2007 12:07 PM, Martin K. Petersen wrote: Not sure you're up-to-date on the T10 data integrity feature. Essentially it's an extension of the 520 byte sectors common in disk arrays. For each 512 byte sector (or 4KB ditto) you get 8 bytes of protection data. There's

Re: end to end error recovery musings

2007-02-28 Thread James Bottomley
On Wed, 2007-02-28 at 12:16 -0500, Martin K. Petersen wrote: It's cool that it's on the radar in terms of the protocol. That doesn't mean that drive manufacturers are going to implement it, though. The ones I've talked to were unwilling to sacrifice capacity because that's the main

Re: end to end error recovery musings

2007-02-28 Thread Martin K. Petersen
James == James Bottomley [EMAIL PROTECTED] writes: James However, I could see the SATA manufacturers selling capacity at James 512 (or the new 4096) sectors but allowing their OEMs to James reformat them 520 (or 4160) 4104. It's 8 bytes per hardware sector. At least for T10... -- Martin K.

Re: end to end error recovery musings

2007-02-28 Thread H. Peter Anvin
James Bottomley wrote: On Wed, 2007-02-28 at 12:42 -0500, Martin K. Petersen wrote: 4104. It's 8 bytes per hardware sector. At least for T10... Er ... that won't look good to the 512 ATA compatibility remapping ... Well, in that case you'd only see 8x512 data bytes, no metadata...
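
The sector sizes compared in this exchange can be checked with a little arithmetic. The following is a minimal Python sketch, assuming 8 bytes of protection data per hardware sector as stated in the thread. The function and constant names are illustrative only and do not come from the kernel or the T10 specification.

```python
# Illustrative only: names are hypothetical, not from the kernel or T10 spec.
PI_BYTES = 8  # protection bytes per hardware sector, as stated in the thread


def formatted_sector_size(data_bytes: int) -> int:
    """Size of one hardware sector plus a single 8-byte protection tuple."""
    return data_bytes + PI_BYTES


def per_512_sector_size(data_bytes: int) -> int:
    """Size if every 512-byte chunk carried its own 8-byte tuple."""
    return (data_bytes // 512) * (512 + PI_BYTES)


for n in (512, 4096):
    print(n, formatted_sector_size(n), per_512_sector_size(n))
```

For a 4096-byte hardware sector the first function gives 4104 and the second gives 4160 (eight 520-byte sectors), which is the difference between the two figures quoted above. With a single tuple per 4096-byte sector, a 512-byte compatibility view would expose only the 8x512 data bytes.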

Re: end to end error recovery musings

2007-02-27 Thread Martin K. Petersen
Eric == Moore, Eric [EMAIL PROTECTED] writes: Eric Martin K. Petersen on Data Integrity Feature, which is also Eric called EEDP (End to End Data Protection), which he presented some Eric ideas/suggestions of adding an API in linux for this. T10 DIF is interesting for a few things: -

Re: end to end error recovery musings

2007-02-27 Thread Alan
These features make the most sense in terms of WRITE. Disks already have plenty of CRC on the data so if a READ fails on a regular drive we already know about it. Don't bet on it. If you want to do this seriously you need an end to end (media to host ram) checksum. We do see bizarre and quite

Re: end to end error recovery musings

2007-02-27 Thread Andreas Dilger
On Feb 27, 2007 19:02 +, Alan wrote: It would be great if the app tag was more than 16 bits. Ted mentioned that ideally he'd like to store the inode number in the app tag. But as it stands there isn't room. The lowest few bits are the most important with ext2/ext3 because you
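
To make the size constraint concrete, here is a sketch of the 8-byte protection tuple. It assumes the commonly described T10 DIF layout of a 16-bit guard tag, a 16-bit application tag, and a 32-bit reference tag, stored big-endian. The helper names are hypothetical, and the guard value is passed in as a placeholder rather than computed as a real CRC.

```python
import struct

# Assumed layout: guard tag (16 bits), application tag (16 bits),
# reference tag (32 bits), big-endian. 2 + 2 + 4 = 8 bytes per sector.
DIF_TUPLE = struct.Struct(">HHI")


def pack_dif(guard: int, app_tag: int, ref_tag: int) -> bytes:
    """Pack one protection tuple; reject application tags wider than 16 bits."""
    if not 0 <= app_tag <= 0xFFFF:
        raise ValueError("application tag must fit in 16 bits")
    return DIF_TUPLE.pack(guard, app_tag, ref_tag)


def inode_low_bits(inode: int) -> int:
    """Hypothetical workaround: keep only the low 16 bits of an inode number."""
    return inode & 0xFFFF


raw = pack_dif(0, inode_low_bits(0x12345678), 7)
print(len(raw), DIF_TUPLE.unpack(raw))
```

A 32-bit inode number does not fit in the application tag, so `pack_dif` refuses it outright. Truncating to the low-order bits is shown purely as an illustration of the trade-off being discussed, not as what any filesystem actually does.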

Re: end to end error recovery musings

2007-02-27 Thread Ric Wheeler
Martin K. Petersen wrote: Eric == Moore, Eric [EMAIL PROTECTED] writes: Eric Martin K. Petersen on Data Integrity Feature, which is also Eric called EEDP (End to End Data Protection), which he presented some Eric ideas/suggestions of adding an API in linux for this. T10 DIF is interesting

Re: end to end error recovery musings

2007-02-27 Thread Martin K. Petersen
Alan == Alan [EMAIL PROTECTED] writes: Not sure you're up-to-date on the T10 data integrity feature. Essentially it's an extension of the 520 byte sectors common in disk [...] Alan but here's a minor bit of passing bad news - quite a few older Alan ATA controllers can't issue DMA transfers

Re: end to end error recovery musings

2007-02-26 Thread Ric Wheeler
Alan wrote: I think that this is mostly true, but we also need to balance this against the need for higher levels to get a timely response. In a really large IO, a naive retry of a very large write could lead to a non-responsive system for a very large time... And losing the I/O could

Re: end to end error recovery musings

2007-02-26 Thread H. Peter Anvin
Theodore Tso wrote: In any case, the reason why I bring this up is that it would be really nice if there was a way with a single laptop drive to be able to do snapshots and background fsck's without having to use initrd's with device mapper. This is a major part of why I've been trying to

Re: end to end error recovery musings

2007-02-26 Thread Ric Wheeler
Jeff Garzik wrote: Theodore Tso wrote: Can someone with knowledge of current disk drive behavior confirm that for all drives that support bad block sparing, if an attempt to write to a particular spot on disk results in an error due to bad media at that spot, the disk drive will automatically

Re: end to end error recovery musings

2007-02-26 Thread Alan
One interesting counter example is a smaller write than a full page - say 512 bytes out of 4k. If we need to do a read-modify-write and it just so happens that 1 of the 7 sectors we need to read is flaky, will this look like a write failure? The current core kernel code can't handle

Re: end to end error recovery musings

2007-02-25 Thread Douglas Gilbert
H. Peter Anvin wrote: Ric Wheeler wrote: We still have the following challenges: (1) read-ahead often means that we will retry every bad sector at least twice from the file system level. The first time, the fs read ahead request triggers a speculative read that includes the bad sector

Re: end to end error recovery musings

2007-02-24 Thread Chris Wedgwood
On Fri, Feb 23, 2007 at 09:32:29PM -0500, Theodore Tso wrote: And having a way of making this list available to both the filesystem and to a userspace utility, so they can more easily deal with doing a forced rewrite of the bad sector, after determining which file is involved and perhaps

end to end error recovery musings

2007-02-23 Thread Ric Wheeler
In the IO/FS workshop, one idea we kicked around is the need to provide better and more specific error messages between the IO stack and the file system layer. My group has been working to stabilize a relatively up to date libata + MD based box, so I can try to lay out at least one appliance

Re: end to end error recovery musings

2007-02-23 Thread H. Peter Anvin
Ric Wheeler wrote: We still have the following challenges: (1) read-ahead often means that we will retry every bad sector at least twice from the file system level. The first time, the fs read ahead request triggers a speculative read that includes the bad sector (triggering the error

Re: end to end error recovery musings

2007-02-23 Thread Andreas Dilger
On Feb 23, 2007 16:03 -0800, H. Peter Anvin wrote: Ric Wheeler wrote: (1) read-ahead often means that we will retry every bad sector at least twice from the file system level. The first time, the fs read ahead request triggers a speculative read that includes the bad sector

Re: end to end error recovery musings

2007-02-23 Thread H. Peter Anvin
Andreas Dilger wrote: And clearing this list when the sector is overwritten, as it will almost certainly be relocated at the disk level. Certainly if the overwrite is successful. -hpa

Re: end to end error recovery musings

2007-02-23 Thread Theodore Tso
On Fri, Feb 23, 2007 at 05:37:23PM -0700, Andreas Dilger wrote: Probably the only sane thing to do is to remember the bad sectors and avoid attempting reading them; that would mean marking automatic versus explicitly requested requests to determine whether or not to filter them against