Linux Foundation's open source file & storage conference CFP

2017-01-05 Thread Ric Wheeler
Hi all, The CFP for the Linux Foundation's Vault conference is coming to a close. The event is being held this year in Cambridge, Massachusetts on the days following the LSF/MM summit. The first two years' events have been solid, focused events in my (slightly biased) opinion, so worth

Re: [PATCH v2 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware

2016-10-21 Thread Ric Wheeler
On 10/20/2016 05:05 AM, Sumit Saxena wrote: Relative to the previous patch, we have the changes below in v2: 1. Updated the change log to provide more detail. 2. Agreed to remove the module parameter. If we remove the module parameter, we can ask the customer to disable WCE on the drive to get similar

Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware

2016-10-18 Thread Ric Wheeler
On 10/17/2016 09:57 AM, Tomas Henzl wrote: On 17.10.2016 15:28, Sumit Saxena wrote: -Original Message- From: Tomas Henzl [mailto:the...@redhat.com] Sent: Monday, October 17, 2016 6:44 PM To: Sumit Saxena; linux-scsi@vger.kernel.org Cc: martin.peter...@oracle.com;

Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware

2016-10-17 Thread Ric Wheeler
On 10/17/2016 01:19 PM, James Bottomley wrote: That's not what I get from the change log. What it says to me is that the caches are currently firmware managed. Barring firmware bugs, that means that we currently don't have any integrity issues. Your understanding (or the change log)

Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware

2016-10-17 Thread Ric Wheeler
On 10/17/2016 12:20 PM, James Bottomley wrote: We really need to have some ways to validate that our IO stack is properly and safely configured. I would love to see a couple of things: * having T10 & T13 report the existence of a volatile write cache - this is different than WCE set,
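On the validation point, a minimal userspace sketch of one such check, assuming the queue/write_cache sysfs attribute (Linux 4.7+) and sda as a stand-in device; note it only reports what the kernel already believes, which is exactly the gap being discussed:

#include <stdio.h>
#include <string.h>

int main(void)
{
    /* The kernel's view of the device cache: "write back" means it
     * will send SYNCHRONIZE_CACHE / FUA, "write through" means not. */
    FILE *f = fopen("/sys/block/sda/queue/write_cache", "r");
    char buf[32] = "";

    if (!f) {
        perror("fopen");
        return 2;
    }
    if (!fgets(buf, sizeof(buf), f))
        buf[0] = '\0';
    fclose(f);
    printf("sda cache mode: %s", buf);
    return strncmp(buf, "write back", 10) ? 1 : 0;
}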

Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware

2016-10-17 Thread Ric Wheeler
On 10/17/2016 12:20 PM, James Bottomley wrote: On Mon, 2016-10-17 at 09:01 -0400, Ric Wheeler wrote: On 10/17/2016 07:34 AM, Hannes Reinecke wrote: On 10/17/2016 12:24 PM, Sumit Saxena wrote: megaraid_sas driver returns SYNCHRONIZE_CACHE command back to SCSI layer without sending

Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware

2016-10-17 Thread Ric Wheeler
On 10/17/2016 11:55 AM, Christoph Hellwig wrote: On Mon, Oct 17, 2016 at 09:01:29AM -0400, Ric Wheeler wrote: This must go in - without this fix, there is no data integrity for any file system. megaraid always had odd ideas on cache flushing, and this might be an opportunity to write down all

Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware

2016-10-17 Thread Ric Wheeler
On 10/17/2016 07:34 AM, Hannes Reinecke wrote: On 10/17/2016 12:24 PM, Sumit Saxena wrote: megaraid_sas driver returns SYNCHRONIZE_CACHE command back to SCSI layer without sending it to firmware as firmware takes care of flushing cache. This patch will change the driver behavior wrt
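A toy userspace model of the behavior change under review; every type and function name below is a hypothetical stand-in, not actual megaraid_sas code:

#include <stdio.h>

#define SYNCHRONIZE_CACHE    0x35
#define SYNCHRONIZE_CACHE_16 0x91

struct toy_cmd { unsigned char opcode; };

static int send_to_firmware(struct toy_cmd *cmd)
{
    printf("opcode 0x%02x -> firmware (cache actually flushed)\n",
           cmd->opcode);
    return 0;
}

static int complete_in_driver(struct toy_cmd *cmd)
{
    printf("opcode 0x%02x -> completed in driver, never reaches media\n",
           cmd->opcode);
    return 0;
}

/* Old behavior: short-circuit flushes in the driver, trusting firmware.
 * New behavior: pass them down so firmware makes the flush/no-op call. */
static int queue_cmd(struct toy_cmd *cmd, int pass_flush_to_fw)
{
    if ((cmd->opcode == SYNCHRONIZE_CACHE ||
         cmd->opcode == SYNCHRONIZE_CACHE_16) && !pass_flush_to_fw)
        return complete_in_driver(cmd);
    return send_to_firmware(cmd);
}

int main(void)
{
    struct toy_cmd flush = { SYNCHRONIZE_CACHE };

    queue_cmd(&flush, 0); /* pre-patch */
    queue_cmd(&flush, 1); /* post-patch */
    return 0;
}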

Re: [Lsf-pc] [LSF/MM TOPIC] really large storage sectors - going beyond 4096 bytes

2014-01-22 Thread Ric Wheeler
On 01/22/2014 04:34 AM, Mel Gorman wrote: On Tue, Jan 21, 2014 at 10:04:29PM -0500, Ric Wheeler wrote: One topic that has been lurking forever at the edges is the current 4k limitation for file system block sizes. Some devices in production today and others coming soon have larger sectors

Re: [Lsf-pc] [LSF/MM TOPIC] really large storage sectors - going beyond 4096 bytes

2014-01-22 Thread Ric Wheeler
On 01/22/2014 09:34 AM, Mel Gorman wrote: On Wed, Jan 22, 2014 at 09:10:48AM -0500, Ric Wheeler wrote: On 01/22/2014 04:34 AM, Mel Gorman wrote: On Tue, Jan 21, 2014 at 10:04:29PM -0500, Ric Wheeler wrote: One topic that has been lurking forever at the edges is the current 4k limitation

Re: [Lsf-pc] [LSF/MM TOPIC] really large storage sectors - going beyond 4096 bytes

2014-01-22 Thread Ric Wheeler
On 01/22/2014 11:03 AM, James Bottomley wrote: On Wed, 2014-01-22 at 15:14 +0000, Chris Mason wrote: On Wed, 2014-01-22 at 09:34 +0000, Mel Gorman wrote: On Tue, Jan 21, 2014 at 10:04:29PM -0500, Ric Wheeler wrote: One topic that has been lurking forever at the edges is the current 4k

Re: [Lsf-pc] [LSF/MM TOPIC] really large storage sectors - going beyond 4096 bytes

2014-01-22 Thread Ric Wheeler
On 01/22/2014 01:13 PM, James Bottomley wrote: On Wed, 2014-01-22 at 18:02 +0000, Chris Mason wrote: On Wed, 2014-01-22 at 09:21 -0800, James Bottomley wrote: On Wed, 2014-01-22 at 17:02 +0000, Chris Mason wrote: [ I like big sectors and I cannot lie ] I think I might be sceptical, but I

Re: [Lsf-pc] [LSF/MM TOPIC] really large storage sectors - going beyond 4096 bytes

2014-01-22 Thread Ric Wheeler
On 01/22/2014 01:35 PM, James Bottomley wrote: On Wed, 2014-01-22 at 13:17 -0500, Ric Wheeler wrote: On 01/22/2014 01:13 PM, James Bottomley wrote: On Wed, 2014-01-22 at 18:02 +0000, Chris Mason wrote: On Wed, 2014-01-22 at 09:21 -0800, James Bottomley wrote: On Wed, 2014-01-22 at 17:02

Re: [Lsf-pc] [LSF/MM TOPIC] really large storage sectors - going beyond 4096 bytes

2014-01-22 Thread Ric Wheeler
On 01/22/2014 01:37 PM, Chris Mason wrote: Circling back to what we might talk about at the conference, Ric do you have any ideas on when these drives might hit the wild? -chris I will poke at vendors to see if we can get someone to make a public statement, but I cannot do that for them. Ric

[LSF/MM TOPIC] really large storage sectors - going beyond 4096 bytes

2014-01-21 Thread Ric Wheeler
One topic that has been lurking forever at the edges is the current 4k limitation for file system block sizes. Some devices in production today and others coming soon have larger sectors and it would be interesting to see if it is time to poke at this topic again. LSF/MM seems to be pretty

Re: status of block-integrity

2014-01-07 Thread Ric Wheeler
On 12/23/2013 09:35 PM, Martin K. Petersen wrote: Christoph == Christoph Hellwig <h...@infradead.org> writes: Christoph> We have the block integrity code to support DIF/DIX in the tree for about 5 and a half years, and we still don't have a single consumer of it. What do
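For readers new to the thread, the metadata in question is the 8-byte T10 DIF tuple appended to each 512-byte sector; a sketch of its layout (the struct name is ours, and the fields are big-endian on the wire):

#include <stdint.h>
#include <stdio.h>

struct t10_dif_tuple {
    uint16_t guard_tag; /* CRC16 over the 512 data bytes */
    uint16_t app_tag;   /* application-defined, opaque to the transport */
    uint32_t ref_tag;   /* typically the low 32 bits of the target LBA */
} __attribute__((packed));

int main(void)
{
    printf("PI tuple: %zu bytes per sector\n",
           sizeof(struct t10_dif_tuple)); /* 8 */
    return 0;
}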

[LSF/MM TOPIC] [ATTEND] persistent memory progress, management of storage file systems

2014-01-06 Thread Ric Wheeler
I would like to attend this year and continue to talk about the work on enabling the new class of persistent memory devices. Specifically, I am very interested in talking about both using a block driver under our existing stack and also progress at the file system layer (adding xip/mmap tweaks to

Re: [PATCHv2 0/7] Limit overall SCSI EH runtime

2013-09-24 Thread Ric Wheeler
On 08/29/2013 09:06 AM, Hannes Reinecke wrote: Hi James, On 08/07/2013 08:43 AM, Ren Mingxin wrote: Hi, James: On 07/11/2013 04:35 AM, Ewan Milne wrote: Looks good. We have been testing this extensively. Acked-by: Ewan D. Milne <emi...@redhat.com> Do you think this patchset can be applied?

Re: [RFC PATCH] scsi: Add failfast mode to avoid infinite retry loop

2013-08-23 Thread Ric Wheeler
On 08/23/2013 05:10 AM, Eiichi Tsukata wrote: (2013/08/21 3:09), Ewan Milne wrote: On Tue, 2013-08-20 at 16:13 +0900, Eiichi Tsukata wrote: (2013/08/19 23:30), James Bottomley wrote: On Mon, 2013-08-19 at 18:39 +0900, Eiichi Tsukata wrote: Hello, This patch adds scsi device failfast mode to
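A toy model of the policy being proposed: bound retries by both count and elapsed time instead of looping indefinitely. The limits and names below are invented for illustration:

#include <stdio.h>
#include <time.h>

#define MAX_RETRIES    5
#define MAX_EH_SECONDS 30

static int issue_io(void)
{
    return -1; /* pretend the device keeps failing */
}

int main(void)
{
    time_t start = time(NULL);
    int tries;

    for (tries = 0; tries < MAX_RETRIES; tries++) {
        if (issue_io() == 0)
            return 0;
        if (time(NULL) - start > MAX_EH_SECONDS) {
            fprintf(stderr, "giving up: time budget exhausted\n");
            return 1;
        }
    }
    fprintf(stderr, "giving up after %d retries\n", tries);
    return 1;
}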

Re: [Ksummit-2013-discuss] [ATTEND] scsi-mq prototype discussion

2013-07-19 Thread Ric Wheeler
On 07/17/2013 12:52 AM, James Bottomley wrote: On Tue, 2013-07-16 at 15:15 -0600, Jens Axboe wrote: On Tue, Jul 16 2013, Nicholas A. Bellinger wrote: On Sat, 2013-07-13 at 06:53 +0000, James Bottomley wrote: On Fri, 2013-07-12 at 12:52 +0200, Hannes Reinecke wrote: On 07/12/2013 03:33 AM,

Re: Linux Plumbers IO File System Micro-conference

2013-07-15 Thread Ric Wheeler
developers? thanks! Ric -Original Message- From: linux-fsdevel-ow...@vger.kernel.org [mailto:linux-fsdevel-ow...@vger.kernel.org] On Behalf Of Ric Wheeler Sent: Friday, July 12, 2013 1:21 PM To: linux-scsi@vger.kernel.org; Linux FS Devel; linux-...@vger.kernel.org; linux-btrfs; xfs-oss

Linux Plumbers IO File System Micro-conference

2013-07-12 Thread Ric Wheeler
Linux Plumbers has approved a file and storage microconference. The overview page is here: http://wiki.linuxplumbersconf.org/2013:file_and_storage_systems I would like to start gathering ideas for topics. I have been approached already with a request to cover the copy_range work Zach

Re: atomic write T10 standards

2013-07-05 Thread Ric Wheeler
...@vger.kernel.org [mailto:linux-scsi-ow...@vger.kernel.org] On Behalf Of Ric Wheeler Sent: Thursday, 04 July, 2013 7:35 AM To: Vladislav Bolkhovitin Cc: Chris Mason; James Bottomley; Martin K. Petersen; linux-s...@vger.kernel.org Subject: Re: atomic write T10 standards On 07/03/2013 11:18 PM, Vladislav

Re: atomic write T10 standards

2013-07-04 Thread Ric Wheeler
On 07/03/2013 11:18 PM, Vladislav Bolkhovitin wrote: Ric Wheeler, on 07/03/2013 11:31 AM wrote: Journals are normally big (128MB or so?) - I don't think that this is unique to xfs. We're mixing a bunch of concepts here. The filesystems have a lot of different requirements, and atomics

atomic write T10 standards

2013-07-03 Thread Ric Wheeler
On 07/03/2013 11:00 AM, James Bottomley wrote: On Wed, 2013-07-03 at 10:56 -0400, Ric Wheeler wrote: On 07/03/2013 10:38 AM, Chris Mason wrote: Quoting Ric Wheeler (2013-07-03 10:34:04) As I was out walking Skeeter this morning, I was thinking a bit about the new T10 atomic write proposal
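For what it is worth, the userspace shape this idea eventually took, many years after this thread, is pwritev2() with RWF_ATOMIC (Linux 6.11+, O_DIRECT, device-advertised size/alignment limits); a sketch, with /dev/sdX and the 4 KiB size as placeholders:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef RWF_ATOMIC
#define RWF_ATOMIC 0x00000040
#endif

int main(void)
{
    void *buf;
    struct iovec iov;
    ssize_t n;
    int fd;

    if (posix_memalign(&buf, 4096, 4096))
        return 1;
    fd = open("/dev/sdX", O_WRONLY | O_DIRECT);
    if (fd < 0)
        return 1;

    iov.iov_base = buf;
    iov.iov_len = 4096;
    /* All or nothing: after a crash this range holds either the old
     * or the new 4 KiB, never a torn mix. */
    n = pwritev2(fd, &iov, 1, 0, RWF_ATOMIC);

    close(fd);
    free(buf);
    return n == 4096 ? 0 : 1;
}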

Re: atomic write T10 standards

2013-07-03 Thread Ric Wheeler
On 07/03/2013 11:37 AM, James Bottomley wrote: On Wed, 2013-07-03 at 11:27 -0400, Ric Wheeler wrote: On 07/03/2013 11:22 AM, James Bottomley wrote: On Wed, 2013-07-03 at 11:04 -0400, Ric Wheeler wrote: Why not have the atomic write actually imply that it is atomic and durable for just

Re: atomic write T10 standards

2013-07-03 Thread Ric Wheeler
On 07/03/2013 11:54 AM, Chris Mason wrote: Quoting Ric Wheeler (2013-07-03 11:42:38) On 07/03/2013 11:37 AM, James Bottomley wrote: On Wed, 2013-07-03 at 11:27 -0400, Ric Wheeler wrote: On 07/03/2013 11:22 AM, James Bottomley wrote: On Wed, 2013-07-03 at 11:04 -0400, Ric Wheeler wrote: Why

Re: atomic write T10 standards

2013-07-03 Thread Ric Wheeler
On 07/03/2013 02:54 PM, Chris Mason wrote: Quoting Ric Wheeler (2013-07-03 14:31:59) On 07/03/2013 11:54 AM, Chris Mason wrote: Quoting Ric Wheeler (2013-07-03 11:42:38) On 07/03/2013 11:37 AM, James Bottomley wrote: On Wed, 2013-07-03 at 11:27 -0400, Ric Wheeler wrote: On 07/03/2013 11:22

LPC2013: File and Storage Systems uconf Approved

2013-04-30 Thread Ric Wheeler
Just a heads up, Linux Plumbers has approved the FS Storage micro-conf for this year. Please have a look at the wiki for details: http://www.linuxplumbersconf.org/2013/ I have posted some suggested topics at the site, let's get the planning rolling so we can get people registered in time

Re: T10 WCE interpretation in Linux device level access

2013-04-25 Thread Ric Wheeler
On 04/24/2013 06:46 PM, James Bottomley wrote: On Wed, 2013-04-24 at 18:36 -0400, Ric Wheeler wrote: On 04/24/2013 06:09 PM, James Bottomley wrote: On Wed, 2013-04-24 at 23:54 +0200, Paolo Bonzini wrote: On 24/04/2013 23:02, James Bottomley wrote: That just leaves us with random

Re: T10 WCE interpretation in Linux device level access

2013-04-24 Thread Ric Wheeler
applications to manually override our flush settings (reluctantly!). Regards, Ric -Original Message- From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-ow...@vger.kernel.org] On Behalf Of Jeremy Linton Sent: Tuesday, April 23, 2013 5:40 PM To: James Bottomley Cc: Ric

Re: T10 WCE interpretation in Linux device level access

2013-04-24 Thread Ric Wheeler
On 04/24/2013 08:08 AM, Paolo Bonzini wrote: On 24/04/2013 14:07, Hannes Reinecke wrote: On 04/24/2013 01:17 PM, Paolo Bonzini wrote: On 23/04/2013 22:07, James Bottomley wrote: On Tue, 2013-04-23 at 15:41 -0400, Ric Wheeler wrote: For many years, we have used WCE as an indication

Re: T10 WCE interpretation in Linux device level access

2013-04-24 Thread Ric Wheeler
On 04/24/2013 02:20 PM, Black, David wrote: Jeremy, It looks like you, Paolo, and Ric have hit the nail on the head here - this is a nice summary, IMHO: On 4/24/2013 7:57 AM, Paolo Bonzini wrote: If the device can promise this, we don't care (and don't know) how it manages that promise. It

Re: T10 WCE interpretation in Linux device level access

2013-04-24 Thread Ric Wheeler
On 04/24/2013 06:09 PM, James Bottomley wrote: On Wed, 2013-04-24 at 23:54 +0200, Paolo Bonzini wrote: On 24/04/2013 23:02, James Bottomley wrote: That just leaves us with random standards behaviour. Let's permit the deterministic thing instead for the distros. It kills two birds with

T10 WCE interpretation in Linux device level access

2013-04-23 Thread Ric Wheeler
For many years, we have used WCE as an indication that a device has a volatile write cache (not just a write cache) and used this as a trigger to send down SYNCHRONIZE_CACHE commands as needed. Some arrays with non-volatile cache seem to have WCE set and simply ignore the command. Some
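For reference, the bit in question lives in the SCSI Caching mode page (08h): WCE is bit 2 of byte 2. A sketch of the check sd effectively performs (the example buffer contents are made up):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

static bool caching_page_wce(const uint8_t *page)
{
    /* page[0]: page code in the low 6 bits; page[2]: WCE is bit 2 */
    return (page[0] & 0x3f) == 0x08 && (page[2] & 0x04);
}

int main(void)
{
    const uint8_t example[20] = { 0x08, 0x12, 0x04 }; /* WCE set */

    printf("WCE=%d -> flushes will be sent\n", caching_page_wce(example));
    return 0;
}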

Re: getting I/O errors in super_written()...any ideas what would cause this?

2012-12-05 Thread Ric Wheeler
On 12/05/2012 04:20 AM, James Bottomley wrote: On Tue, 2012-12-04 at 16:00 -0600, Chris Friesen wrote: As another data point, it looks like we may be doing a SEND DIAGNOSTIC command specifying the default self-test in addition to the background short self-test. This seems a bit risky and

Re: getting I/O errors in super_written()...any ideas what would cause this?

2012-12-04 Thread Ric Wheeler
On 12/04/2012 05:00 PM, Chris Friesen wrote: On 12/03/2012 03:53 PM, Ric Wheeler wrote: On 12/03/2012 04:08 PM, Chris Friesen wrote: On 12/03/2012 02:52 PM, Ric Wheeler wrote: I jumped into this thread late - can you repost detail on the specific drive and HBA used here? In any case

Re: getting I/O errors in super_written()...any ideas what would cause this?

2012-12-03 Thread Ric Wheeler
On 12/03/2012 04:08 PM, Chris Friesen wrote: On 12/03/2012 02:52 PM, Ric Wheeler wrote: I jumped into this thread late - can you repost detail on the specific drive and HBA used here? In any case, it sounds like this is a better topic for the linux-scsi or linux-ide list where most of the low

Re: [SCSI PATCH] sd: max-retries becomes configurable

2012-10-01 Thread Ric Wheeler
On 09/25/2012 04:08 PM, James Bottomley wrote: On Tue, 2012-09-25 at 01:21 -0400, Jeff Garzik wrote: On 09/25/2012 12:06 AM, James Bottomley wrote: On Mon, 2012-09-24 at 17:00 -0400, Jeff Garzik wrote: drivers/scsi/sd.c | 4 ++++ drivers/scsi/sd.h | 2 +- 2 files changed, 5

Re: [Ping^3] Re: [PATCH] sg_io: allow UNMAP and WRITE SAME without CAP_SYS_RAWIO

2012-09-06 Thread Ric Wheeler
On 09/06/2012 02:31 AM, Paolo Bonzini wrote: On 05/09/2012 22:18, Ric Wheeler wrote: Hi Paolo, Both of these commands are destructive. WRITE_SAME (if done without the discard bits set) can also take a very long time to be destructive and tie up the storage. FORMAT_UNIT has the same

Re: [Ping^3] Re: [PATCH] sg_io: allow UNMAP and WRITE SAME without CAP_SYS_RAWIO

2012-09-06 Thread Ric Wheeler
On 09/06/2012 07:49 AM, Paolo Bonzini wrote: On 06/09/2012 13:31, Ric Wheeler wrote: Both of these commands are destructive. WRITE_SAME (if done without the discard bits set) can also take a very long time to be destructive and tie up the storage. FORMAT_UNIT has the same characteristics

Re: [Ping^3] Re: [PATCH] sg_io: allow UNMAP and WRITE SAME without CAP_SYS_RAWIO

2012-09-05 Thread Ric Wheeler
On 09/05/2012 10:41 AM, Paolo Bonzini wrote: On 28/08/2012 13:04, Paolo Bonzini wrote: On 01/08/2012 17:53, Paolo Bonzini wrote: On 20/07/2012 18:30, Paolo Bonzini wrote: These commands cannot be issued right now without giving CAP_SYS_RAWIO to the process that wishes to send
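A sketch of the failure mode under discussion, with /dev/sda1 as a stand-in partition: an unprivileged WRITE SAME(10) issued through SG_IO is bounced by the block layer's command filter with EPERM unless the process holds CAP_SYS_RAWIO:

#include <fcntl.h>
#include <scsi/sg.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
    /* WRITE SAME(10), LBA 0, one block -- destructive if it succeeds! */
    unsigned char cdb[10] = { 0x41, 0, 0, 0, 0, 0, 0, 0, 1, 0 };
    unsigned char data[512] = { 0 };
    unsigned char sense[32];
    struct sg_io_hdr io;
    int fd = open("/dev/sda1", O_RDWR);

    if (fd < 0) {
        perror("open");
        return 1;
    }
    memset(&io, 0, sizeof(io));
    io.interface_id    = 'S';
    io.cmd_len         = sizeof(cdb);
    io.cmdp            = cdb;
    io.dxfer_direction = SG_DXFER_TO_DEV;
    io.dxferp          = data;
    io.dxfer_len       = sizeof(data);
    io.sbp             = sense;
    io.mx_sb_len       = sizeof(sense);
    io.timeout         = 10000; /* ms */

    if (ioctl(fd, SG_IO, &io) < 0)
        perror("SG_IO"); /* expect EPERM without CAP_SYS_RAWIO */
    close(fd);
    return 0;
}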

Re: impact of 4k sector size on the IO FS stack

2007-03-12 Thread Ric Wheeler
Alan Cox wrote: First generation of 1K sector drives will continue to use the same 512-byte ATA sector size you are familiar with. A single 512-byte write will cause the drive to perform a read-modify-write cycle. This configuration is physical 1K sector, logical 512b sector. The problem
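The read-modify-write condition described above reduces to a one-line predicate; a sketch (offsets and lengths in bytes):

#include <stdbool.h>
#include <stdio.h>

static bool causes_rmw(unsigned long long off, unsigned long long len,
                       unsigned int phys)
{
    /* Any write not aligned to, and a multiple of, the physical sector */
    return (off % phys) || (len % phys);
}

int main(void)
{
    /* 1 KiB physical sectors, 512 B logical sectors */
    printf("512 B write at offset 0:   RMW=%d\n", causes_rmw(0, 512, 1024));
    printf("1 KiB write at offset 0:   RMW=%d\n", causes_rmw(0, 1024, 1024));
    printf("1 KiB write at offset 512: RMW=%d\n", causes_rmw(512, 1024, 1024));
    return 0;
}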

impact of 4k sector size on the IO FS stack

2007-03-11 Thread Ric Wheeler
During the recent IO/FS workshop, we spoke briefly about the coming change to a 4k sector size for disks on linux. If I recall correctly, the general feeling was that the impact was not significant since we already do most file system IO in 4k page sizes and should be fine as long as we
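Using today's interfaces (BLKPBSZGET postdates this 2007 thread), userspace can query the logical/physical split directly; a sketch with /dev/sda as a stand-in:

#include <fcntl.h>
#include <linux/fs.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
    int lss = 0;
    unsigned int pss = 0;
    int fd = open("/dev/sda", O_RDONLY);

    if (fd < 0) {
        perror("open");
        return 1;
    }
    ioctl(fd, BLKSSZGET, &lss);  /* logical sector size */
    ioctl(fd, BLKPBSZGET, &pss); /* physical sector size */
    printf("logical=%d physical=%u%s\n", lss, pss,
           pss > (unsigned int)lss ? " (sub-physical writes imply RMW)" : "");
    close(fd);
    return 0;
}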

Re: impact of 4k sector size on the IO FS stack

2007-03-11 Thread Ric Wheeler
Jeff Garzik wrote: Alan Cox wrote: I would be interested to know what the disk vendors intend to use as their strategy when (with ATA) they have a 512 byte write from an older file system/setup into a 4K block. The case where errors magically appear Well, you have logical and physical

Re: impact of 4k sector size on the IO FS stack

2007-03-11 Thread Ric Wheeler
Alan Cox wrote: Are there other concerns in the IO or FS stack that we should bring up with vendors? I have been asked to summarize the impact of 4k sectors on linux for a disk vendor gathering and want to make sure that I put all of our linux specific items into that summary... We

Re: impact of 4k sector size on the IO FS stack

2007-03-11 Thread Ric Wheeler
Jan Engelhardt wrote: On Mar 11 2007 18:51, Ric Wheeler wrote: During the recent IO/FS workshop, we spoke briefly about the coming change to a 4k sector size for disks on linux. If I recall correctly, the general feeling was that the impact was not significant since we already do most file

Re: end to end error recovery musings

2007-02-27 Thread Ric Wheeler
Martin K. Petersen wrote: Eric == Moore, Eric [EMAIL PROTECTED] writes: Eric> Martin K. Petersen on Data Integrity Feature, which is also Eric> called EEDP (End to End Data Protection), which he presented some Eric> ideas/suggestions of adding an API in Linux for this. T10 DIF is interesting

Re: end to end error recovery musings

2007-02-26 Thread Ric Wheeler
Alan wrote: I think that this is mostly true, but we also need to balance this against the need for higher levels to get a timely response. In a really large IO, a naive retry of a very large write could lead to a non-responsive system for a very long time... And losing the I/O could

Re: end to end error recovery musings

2007-02-26 Thread Ric Wheeler
Jeff Garzik wrote: Theodore Tso wrote: Can someone with knowledge of current disk drive behavior confirm that for all drives that support bad block sparing, if an attempt to write to a particular spot on disk results in an error due to bad media at that spot, the disk drive will automatically

end to end error recovery musings

2007-02-23 Thread Ric Wheeler
In the IO/FS workshop, one idea we kicked around is the need to provide better and more specific error messages between the IO stack and the file system layer. My group has been working to stabilize a relatively up-to-date libata + MD based box, so I can try to lay out at least one appliance
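A toy model of the idea: give the filesystem an error taxonomy rather than a bare -EIO. The enum and the reactions below are invented here, though the kernel did later grow a similar split (blk_status_t, 2017):

#include <stdio.h>

enum toy_io_status {
    IO_OK,
    IO_MEDIUM_ERROR, /* bad sector: FS may relocate and rewrite */
    IO_TRANSPORT,    /* path/cable: a retry may succeed */
    IO_TARGET,       /* device gone: fail fast, do not retry */
};

static const char *fs_reaction(enum toy_io_status s)
{
    switch (s) {
    case IO_OK:           return "nothing to do";
    case IO_MEDIUM_ERROR: return "rewrite block to trigger sparing";
    case IO_TRANSPORT:    return "retry, possibly on another path";
    case IO_TARGET:       return "shut down the filesystem cleanly";
    }
    return "?";
}

int main(void)
{
    printf("MEDIUM_ERROR -> %s\n", fs_reaction(IO_MEDIUM_ERROR));
    return 0;
}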

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-02-02 Thread Ric Wheeler
James Bottomley wrote: On Thu, 2007-02-01 at 15:02 -0500, Mark Lord wrote: I believe you made the first change in response to my prodding at the time, when libata was not returning valid sense data (no LBA) for media errors. The SCSI EH handling of that was rather poor at the time, and so

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-02-02 Thread Ric Wheeler
James Bottomley wrote: On Fri, 2007-02-02 at 14:42 +0000, Alan wrote: The interesting point of this question is about the typical pattern of IO errors. On a read, it is safe to assume that you will have issues with some bounded numbers of adjacent sectors. Which in theory you
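The recovery pattern this implies, sketched in userspace in the style of ddrescue: on a read error, skip past the (bounded) bad region rather than hammering the same spot:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static void salvage(int fd, off_t start, off_t end)
{
    char buf[4096];
    off_t off = start;

    while (off < end) {
        if (pread(fd, buf, sizeof(buf), off) < 0) {
            fprintf(stderr, "bad region near %lld, skipping 64 KiB\n",
                    (long long)off);
            off += 64 * 1024; /* assume bounded damage, move on */
            continue;
        }
        off += sizeof(buf); /* good data: hand to caller, advance */
    }
}

int main(int argc, char **argv)
{
    int fd = open(argc > 1 ? argv[1] : "/dev/sda", O_RDONLY);

    if (fd < 0) {
        perror("open");
        return 1;
    }
    salvage(fd, 0, 1 << 20); /* scan the first 1 MiB */
    close(fd);
    return 0;
}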

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-01-31 Thread Ric Wheeler
Jeff Garzik wrote: Mark Lord wrote: Eric D. Mudama wrote: Actually, it's possibly worse, since each failure in libata will generate 3-4 retries. With existing ATA error recovery in the drives, that's about 3 seconds per retry on average, or 12 seconds per failure. Multiply that by the

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-01-31 Thread Ric Wheeler
Alan wrote: When libata reports a MEDIUM_ERROR to us, we *know* it's non-recoverable, as the drive itself has already done internal retries (libata uses the with retry ATA opcodes for this). This depends on the firmware. Some of the raid firmware drives don't appear to do retries in

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-01-30 Thread Ric Wheeler
Mark Lord wrote: Eric D. Mudama wrote: Actually, it's possibly worse, since each failure in libata will generate 3-4 retries. With existing ATA error recovery in the drives, that's about 3 seconds per retry on average, or 12 seconds per failure. Multiply that by the number of blocks
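Completing the multiplication from the quote above: with 4 retries at roughly 3 seconds each, a failed 1 MiB write retried sector by sector stalls for hours, which is the heart of the objection:

#include <stdio.h>

int main(void)
{
    unsigned long long io_bytes = 1ULL << 20;              /* one 1 MiB write */
    unsigned int sectors = (unsigned int)(io_bytes / 512); /* 2048 */
    unsigned int retries = 4, secs_per_try = 3;

    printf("worst case: %u sectors * %u retries * %u s = %u s (~%u h)\n",
           sectors, retries, secs_per_try,
           sectors * retries * secs_per_try,
           sectors * retries * secs_per_try / 3600); /* 24576 s, ~6 h */
    return 0;
}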