Re: scsi-mq V2

2014-07-14 Thread Sagi Grimberg
On 7/8/2014 5:48 PM, Christoph Hellwig wrote: SNIP I've pushed out a new scsi-mq.3 branch, which has been rebased on the latest core-for-3.17 tree + the RFC: clean up command setup series from June 29th. Robert Elliott found a problem with not fully zeroed out UNMAP CDBs, which is fixed by the

Re: scsi-mq V2

2014-07-14 Thread Benjamin LaHaise
Hi Robert, On Sun, Jul 13, 2014 at 05:15:15PM +, Elliott, Robert (Server Storage) wrote: I will see if that solves the problem with the scsi-mq-3 tree, or at least some of the bisect trees leading up to it. scsi-mq-3 is still going after 45 minutes. I'll leave it running

RE: scsi-mq V2

2014-07-13 Thread Elliott, Robert (Server Storage)
Bottomley; Bart Van Assche; linux-scsi@vger.kernel.org; linux- ker...@vger.kernel.org Subject: RE: scsi-mq V2 I will see if that solves the problem with the scsi-mq-3 tree, or at least some of the bisect trees leading up to it. scsi-mq-3 is still going after 45 minutes. I'll leave

RE: scsi-mq V2

2014-07-12 Thread Elliott, Robert (Server Storage)
...@vger.kernel.org Subject: Re: scsi-mq V2 ... Can you try the below totally untested patch instead? It looks like put_reqs_available() is not irq-safe. With that addition alone, fio still runs into the same problem. I added the same fix to get_reqs_available, which also accesses kcpu
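
The irq-safety concern in this message can be illustrated with a toy model. The sketch below is not the kernel code: `unsafe_add`, `safe_add`, `run`, and the dict-based counter are hypothetical names, and the "interrupt" is simply a callback that runs between the load and the store of a read-modify-write on a per-CPU count.

```python
def unsafe_add(counter, n, irq=None):
    """Read-modify-write with an interrupt window between load and store."""
    tmp = counter["value"]          # load the per-CPU count
    if irq is not None:
        irq()                       # interrupt handler updates the same count
    counter["value"] = tmp + n      # store overwrites the handler's update


def safe_add(counter, n, irq=None):
    """Same update, modelled as running with interrupts held off."""
    counter["value"] = counter["value"] + n   # load and store are not split
    if irq is not None:
        irq()                       # the deferred interrupt runs afterwards


def run(add):
    """Apply one update of +1 while an interrupt also adds +1."""
    counter = {"value": 0}

    def handler():
        unsafe_add(counter, 1)      # the handler itself is not interrupted

    add(counter, 1, irq=handler)
    return counter["value"]


lost = run(unsafe_add)   # the handler's increment is lost
kept = run(safe_add)     # both increments survive
```

In this model the unprotected path ends with a count of 1 instead of 2, which is the kind of lost update that would make available request slots leak over time.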

RE: scsi-mq V2

2014-07-12 Thread Elliott, Robert (Server Storage)
I will see if that solves the problem with the scsi-mq-3 tree, or at least some of the bisect trees leading up to it. scsi-mq-3 is still going after 45 minutes. I'll leave it running overnight. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to

RE: scsi-mq V2

2014-07-11 Thread Elliott, Robert (Server Storage)
-Original Message- From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi- ow...@vger.kernel.org] On Behalf Of Elliott, Robert (Server Storage) I added some prints in aio_setup_ring and ioctx_alloc and rebooted. This time it took much longer to hit the problem. It survived

Re: scsi-mq V2

2014-07-11 Thread Christoph Hellwig
On Fri, Jul 11, 2014 at 06:02:03AM +, Elliott, Robert (Server Storage) wrote: Allowing longer run times before declaring success, the problem does appear in all of the bisect trees. I just let fio continue to run for many minutes - no ^Cs necessary. no-rebase: good for 45 minutes (I

RE: scsi-mq V2

2014-07-11 Thread Elliott, Robert (Server Storage)
@vger.kernel.org; linux-ker...@vger.kernel.org Subject: Re: scsi-mq V2 On Fri, Jul 11, 2014 at 06:02:03AM +, Elliott, Robert (Server Storage) wrote: Allowing longer run times before declaring success, the problem does appear in all of the bisect trees. I just let fio continue to run

Re: scsi-mq V2

2014-07-11 Thread Benjamin LaHaise
On Fri, Jul 11, 2014 at 02:33:12PM +, Elliott, Robert (Server Storage) wrote: That ran 9 total hours with no problem. Rather than revert in the bisect trees, I added just this single additional patch to the no-rebase tree, and the problem appeared: Can you try the below totally untested

Re: scsi-mq V2

2014-07-10 Thread Christoph Hellwig
On Thu, Jul 10, 2014 at 12:53:36AM +, Elliott, Robert (Server Storage) wrote: the problem still occurs - fio results in low or 0 IOPS, with perf top reporting unusual amounts of time spent in do_io_submit and io_submit. The diff between the two versions doesn't show too much other possible

Re: scsi-mq V2

2014-07-10 Thread Benjamin LaHaise
On Wed, Jul 09, 2014 at 11:20:40PM -0700, Christoph Hellwig wrote: On Thu, Jul 10, 2014 at 12:53:36AM +, Elliott, Robert (Server Storage) wrote: the problem still occurs - fio results in low or 0 IOPS, with perf top reporting unusual amounts of time spent in do_io_submit and io_submit.

Re: scsi-mq V2

2014-07-10 Thread Jens Axboe
On 2014-07-10 15:36, Benjamin LaHaise wrote: On Wed, Jul 09, 2014 at 11:20:40PM -0700, Christoph Hellwig wrote: On Thu, Jul 10, 2014 at 12:53:36AM +, Elliott, Robert (Server Storage) wrote: the problem still occurs - fio results in low or 0 IOPS, with perf top reporting unusual amounts of

Re: scsi-mq V2

2014-07-10 Thread Benjamin LaHaise
On Thu, Jul 10, 2014 at 03:39:57PM +0200, Jens Axboe wrote: That's how fio always runs, it sets up the context with the exact queue depth that it needs. Do we have a good enough understanding of other aio use cases to say that this isn't the norm? I would expect it to be, it's the way that

Re: scsi-mq V2

2014-07-10 Thread Jens Axboe
On 2014-07-10 15:44, Benjamin LaHaise wrote: On Thu, Jul 10, 2014 at 03:39:57PM +0200, Jens Axboe wrote: That's how fio always runs, it sets up the context with the exact queue depth that it needs. Do we have a good enough understanding of other aio use cases to say that this isn't the norm? I

Re: scsi-mq V2

2014-07-10 Thread Benjamin LaHaise
On Thu, Jul 10, 2014 at 03:48:10PM +0200, Jens Axboe wrote: On 2014-07-10 15:44, Benjamin LaHaise wrote: On Thu, Jul 10, 2014 at 03:39:57PM +0200, Jens Axboe wrote: That's how fio always runs, it sets up the context with the exact queue depth that it needs. Do we have a good enough

Re: scsi-mq V2

2014-07-10 Thread Christoph Hellwig
On Thu, Jul 10, 2014 at 09:36:09AM -0400, Benjamin LaHaise wrote: There is one possible concern that could be exacerbated by other changes in the system: if the application is running close to the bare minimum number of requests allocated in io_setup(), the per cpu reference counters will
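
One way to picture how per-CPU counters could interact with a minimal io_setup() allocation is the toy model below. It is a sketch under stated assumptions, not the kernel implementation: `ToyCtx`, its fields, and the batching rule (a CPU takes a whole batch from the shared pool and hands one back only once it holds two batches) are hypothetical choices made for illustration.

```python
class ToyCtx:
    """Toy model of request slots split between a shared pool and per-CPU caches."""

    def __init__(self, nr_events, nr_cpus, batch):
        self.global_free = nr_events     # slots in the shared pool
        self.batch = batch               # slots moved per refill (assumed rule)
        self.local = [0] * nr_cpus       # slots cached on each CPU

    def get(self, cpu):
        """Take one slot on `cpu`, refilling from the shared pool by a batch."""
        if self.local[cpu] == 0:
            if self.global_free < self.batch:
                return False             # nothing to refill from
            self.global_free -= self.batch
            self.local[cpu] += self.batch
        self.local[cpu] -= 1
        return True

    def put(self, cpu):
        """Return one slot on `cpu`; spill to the pool only at two batches."""
        self.local[cpu] += 1
        while self.local[cpu] >= 2 * self.batch:
            self.local[cpu] -= self.batch
            self.global_free += self.batch


# Queue depth exactly equal to the number of allocated slots.
ctx = ToyCtx(nr_events=8, nr_cpus=2, batch=4)
submitted = sum(ctx.get(0) for _ in range(8))    # CPU 0 submits everything
for _ in range(7):
    ctx.put(1)                                   # completions land on CPU 1
idle_slots = ctx.global_free + sum(ctx.local)    # slots not in use anywhere
can_submit = ctx.get(0)                          # CPU 0 tries to submit again
```

In this model seven slots are idle, yet all of them sit in CPU 1's cache, so CPU 0 cannot submit. An application that sizes its context to exactly its queue depth has no slack to absorb that kind of imbalance.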

Re: scsi-mq V2

2014-07-10 Thread Jens Axboe
On 2014-07-10 15:50, Christoph Hellwig wrote: On Thu, Jul 10, 2014 at 09:36:09AM -0400, Benjamin LaHaise wrote: There is one possible concern that could be exacerbated by other changes in the system: if the application is running close to the bare minimum number of requests allocated in

Re: scsi-mq V2

2014-07-10 Thread Jens Axboe
On 2014-07-10 15:50, Benjamin LaHaise wrote: On Thu, Jul 10, 2014 at 03:48:10PM +0200, Jens Axboe wrote: On 2014-07-10 15:44, Benjamin LaHaise wrote: On Thu, Jul 10, 2014 at 03:39:57PM +0200, Jens Axboe wrote: That's how fio always runs, it sets up the context with the exact queue depth that

RE: scsi-mq V2

2014-07-10 Thread Elliott, Robert (Server Storage)
...@vger.kernel.org Subject: Re: scsi-mq V2 On 2014-07-10 15:50, Christoph Hellwig wrote: On Thu, Jul 10, 2014 at 09:36:09AM -0400, Benjamin LaHaise wrote: There is one possible concern that could be exacerbated by other changes in the system: if the application is running close to the bare minimum

Re: scsi-mq V2

2014-07-10 Thread Benjamin LaHaise
...@interlog.com; James Bottomley; Bart Van Assche; linux-scsi@vger.kernel.org; linux-ker...@vger.kernel.org Subject: Re: scsi-mq V2 On 2014-07-10 15:50, Christoph Hellwig wrote: On Thu, Jul 10, 2014 at 09:36:09AM -0400, Benjamin LaHaise wrote: There is one possible concern that could

Re: scsi-mq V2

2014-07-10 Thread Jeff Moyer
Benjamin LaHaise b...@kvack.org writes: [ 186.339064] ioctx_alloc: nr_events=-2 aio_max_nr=65536 [ 186.339065] ioctx_alloc: nr_events=-2 aio_max_nr=65536 [ 186.339067] ioctx_alloc: nr_events=-2 aio_max_nr=65536 [ 186.339068] ioctx_alloc: nr_events=-2 aio_max_nr=65536 [ 186.339069]

RE: scsi-mq V2

2014-07-10 Thread Elliott, Robert (Server Storage)
; linux- ker...@vger.kernel.org Subject: Re: scsi-mq V2 On Thu, Jul 10, 2014 at 12:53:36AM +, Elliott, Robert (Server Storage) wrote: the problem still occurs - fio results in low or 0 IOPS, with perf top reporting unusual amounts of time spent in do_io_submit and io_submit

Re: scsi-mq V2

2014-07-10 Thread Christoph Hellwig
On Thu, Jul 10, 2014 at 03:51:44PM +, Elliott, Robert (Server Storage) wrote: scsi-mq.3-bisect-1 branch that is rebased to just before the merge of the block tree good. and a scsi-mq.3-bisect-2 branch that is just after the merge of the block tree to get started. good. It's

Re: scsi-mq V2

2014-07-10 Thread Christoph Hellwig
On Thu, Jul 10, 2014 at 09:04:22AM -0700, Christoph Hellwig wrote: It's starting to look weird. I'll prepare another two bisect branches around some MM changes, which seems the only other possible candidate. I've pushed out scsi-mq.3-bisect-3 and scsi-mq.3-bisect-4 for you.

RE: scsi-mq V2

2014-07-10 Thread Elliott, Robert (Server Storage)
...@vger.kernel.org Subject: Re: scsi-mq V2 On Thu, Jul 10, 2014 at 09:04:22AM -0700, Christoph Hellwig wrote: It's starting to look weird. I'll prepare another two bisect branches around some MM changes, which seems the only other possible candidate. I've pushed out scsi-mq.3-bisect-3 Good

Re: scsi-mq V2

2014-07-10 Thread Jeff Moyer
; Benjamin LaHaise; linux-scsi@vger.kernel.org; linux-ker...@vger.kernel.org Subject: Re: scsi-mq V2 On Thu, Jul 10, 2014 at 09:04:22AM -0700, Christoph Hellwig wrote: It's starting to look weird. I'll prepare another two bisect branches around some MM changes, which seems the only other possible

Re: scsi-mq V2

2014-07-10 Thread Jeff Moyer
Jeff Moyer jmo...@redhat.com writes: Hi, Rob, Can you get sysrq-t output for me? I don't know how/why we'd continue to get io_submits for an exiting process. Also, do you know what sys_io_submit is returning?

Re: scsi-mq V2

2014-07-10 Thread Jens Axboe
On 2014-07-10 17:11, Jeff Moyer wrote: Benjamin LaHaise b...@kvack.org writes: [ 186.339064] ioctx_alloc: nr_events=-2 aio_max_nr=65536 [ 186.339065] ioctx_alloc: nr_events=-2 aio_max_nr=65536 [ 186.339067] ioctx_alloc: nr_events=-2 aio_max_nr=65536 [ 186.339068] ioctx_alloc: nr_events=-2

Re: scsi-mq V2

2014-07-10 Thread Jeff Moyer
Jens Axboe ax...@kernel.dk writes: On 2014-07-10 17:11, Jeff Moyer wrote: Benjamin LaHaise b...@kvack.org writes: [ 186.339064] ioctx_alloc: nr_events=-2 aio_max_nr=65536 [ 186.339065] ioctx_alloc: nr_events=-2 aio_max_nr=65536 [ 186.339067] ioctx_alloc: nr_events=-2 aio_max_nr=65536 [

Re: scsi-mq V2

2014-07-10 Thread Jens Axboe
On 2014-07-10 22:05, Jeff Moyer wrote: Jens Axboe ax...@kernel.dk writes: On 2014-07-10 17:11, Jeff Moyer wrote: Benjamin LaHaise b...@kvack.org writes: [ 186.339064] ioctx_alloc: nr_events=-2 aio_max_nr=65536 [ 186.339065] ioctx_alloc: nr_events=-2 aio_max_nr=65536 [ 186.339067]

RE: scsi-mq V2

2014-07-10 Thread Elliott, Robert (Server Storage)
- ker...@vger.kernel.org Subject: Re: scsi-mq V2 Elliott, Robert (Server Storage) elli...@hp.com writes: -Original Message- From: Christoph Hellwig [mailto:h...@infradead.org] Sent: Thursday, 10 July, 2014 11:15 AM To: Elliott, Robert (Server Storage) Cc: Jens Axboe; dgilb

Re: scsi-mq V2

2014-07-09 Thread Douglas Gilbert
On 14-07-08 10:48 AM, Christoph Hellwig wrote: On Wed, Jun 25, 2014 at 06:51:47PM +0200, Christoph Hellwig wrote: Changes from V1: - rebased on top of the core-for-3.17 branch, most notably the scsi logging changes - fixed handling of cmd_list to prevent crashes for some heavy

Re: scsi-mq V2

2014-07-09 Thread Jens Axboe
On 2014-07-09 18:39, Douglas Gilbert wrote: On 14-07-08 10:48 AM, Christoph Hellwig wrote: On Wed, Jun 25, 2014 at 06:51:47PM +0200, Christoph Hellwig wrote: Changes from V1: - rebased on top of the core-for-3.17 branch, most notably the scsi logging changes - fixed handling of

RE: scsi-mq V2

2014-07-09 Thread Elliott, Robert (Server Storage)
: Re: scsi-mq V2 On 2014-07-09 18:39, Douglas Gilbert wrote: On 14-07-08 10:48 AM, Christoph Hellwig wrote: On Wed, Jun 25, 2014 at 06:51:47PM +0200, Christoph Hellwig wrote: Changes from V1: - rebased on top of the core-for-3.17 branch, most notably the scsi logging changes

Re: scsi-mq V2

2014-07-08 Thread Christoph Hellwig
On Wed, Jun 25, 2014 at 06:51:47PM +0200, Christoph Hellwig wrote: Changes from V1: - rebased on top of the core-for-3.17 branch, most notably the scsi logging changes - fixed handling of cmd_list to prevent crashes for some heavy workloads - fixed incorrect handling of

Re: scsi-mq V2

2014-06-30 Thread Jens Axboe
On 06/25/2014 10:50 PM, Jens Axboe wrote: On 2014-06-25 10:51, Christoph Hellwig wrote: This is the second post of the scsi-mq series. At this point the code is ready for merging and use by developers and early adopters. The core blk-mq code isn't that suitable for slow devices yet, mostly

Re: scsi-mq V2

2014-06-30 Thread Christoph Hellwig
On Mon, Jun 30, 2014 at 09:20:51AM -0600, Jens Axboe wrote: Ran stress testing from Friday to now, 65h of beating up on it and no problems observed. 47TB read and 20TB written for a total of 17.7 billion IOs issued and completed. Latencies look good. I officially declare this code for bug

Re: scsi-mq V2

2014-06-30 Thread Martin K. Petersen
Christoph == Christoph Hellwig h...@infradead.org writes: Christoph I'm still looking for one (or better two) persons familiar Christoph with the SCSI and/or block code to go over it and do a real Christoph detailed review. I'm on vacation for a couple of days. Will review Wednesday. -- Martin

Re: scsi-mq V2

2014-06-27 Thread Bart Van Assche
; linux-ker...@vger.kernel.org Subject: Re: scsi-mq V2 On 2014-06-25 10:51, Christoph Hellwig wrote: This is the second post of the scsi-mq series. ... Changes from V1: - rebased on top of the core-for-3.17 branch, most notably the scsi logging changes - fixed handling of cmd_list

RE: scsi-mq V2

2014-06-26 Thread Elliott, Robert (Server Storage)
-Original Message- From: Jens Axboe [mailto:ax...@kernel.dk] Sent: Wednesday, 25 June, 2014 11:51 PM To: Christoph Hellwig; James Bottomley Cc: Bart Van Assche; Elliott, Robert (Server Storage); linux- s...@vger.kernel.org; linux-ker...@vger.kernel.org Subject: Re: scsi-mq V2

Re: scsi-mq V2

2014-06-25 Thread Jens Axboe
On 2014-06-25 10:51, Christoph Hellwig wrote: This is the second post of the scsi-mq series. At this point the code is ready for merging and use by developers and early adopters. The core blk-mq code isn't that suitable for slow devices yet, mostly due to the lack of an I/O scheduler, but Jens