Re: RFC: split scsi passthrough fields out of struct request

2017-01-12 Thread Christoph Hellwig
On Wed, Jan 11, 2017 at 05:41:42PM -0500, Mike Snitzer wrote: > I removed blk-mq on request_fn paths support because it was one of the > permutations that I felt least useful/stable (see commit c5248f79f3 "dm: > remove support for stacking dm-mq on .request_fn device(s)") > > As for all of the

Re: blk_queue_bounce_limit() broken for mask=0xffffffff on 64bit archs

2017-01-12 Thread Ming Lei
Hi, On Tue, Jan 10, 2017 at 4:48 AM, Nikita Yushchenko wrote: > Hi > > There is a use case when architecture is 64-bit but hardware supports > only DMA to lower 4G of address space. E.g. NVMe device on RCar PCIe host. > > For such cases, it looks proper to call

Re: blk_queue_bounce_limit() broken for mask=0xffffffff on 64bit archs

2017-01-12 Thread Nikita Yushchenko
>> There is a use case when architecture is 64-bit but hardware supports >> only DMA to lower 4G of address space. E.g. NVMe device on RCar PCIe host. >> >> For such cases, it looks proper to call blk_queue_bounce_limit() with >> mask set to 0xffffffff - thus making the block layer use bounce

Re: [PATCH 07/10] blk-mq: abstract out helpers for allocating/freeing tag maps

2017-01-12 Thread Jens Axboe
On Thu, Jan 12 2017, Bart Van Assche wrote:
> On Wed, 2017-01-11 at 14:40 -0700, Jens Axboe wrote:
> > @@ -2392,12 +2425,12 @@ int blk_mq_alloc_tag_set(struct blk_mq_tag_set *set)
> > 	if (set->nr_hw_queues > nr_cpu_ids)
> > 		set->nr_hw_queues = nr_cpu_ids;
> >
> > +	ret =

Re: [PATCH 10/10] blk-mq-sched: allow setting of default IO scheduler

2017-01-12 Thread Bart Van Assche
On Wed, 2017-01-11 at 14:40 -0700, Jens Axboe wrote: > Add Kconfig entries to manage what devices get assigned an MQ > scheduler, and add a blk-mq flag for drivers to opt out of scheduling. > The latter is useful for admin type queues that still allocate a blk-mq > queue and tag set, but aren't

Re: [Lsf-pc] [LSF/MM TOPIC] [LSF/MM ATTEND] md raid general discussion

2017-01-12 Thread Coly Li
On 2017/1/12 at 11:09 PM, Sagi Grimberg wrote: > Hey Coly, >> Also I receive reports from users that raid1 performance is desired when >> it is built on NVMe SSDs as a cache (maybe bcache or dm-cache). I am >> working on some raid1 performance improvement (e.g. new raid1 I/O >> barrier and lockless

RE: [Regression] fstrim hangs on Hyper-V: caused by "block: improve handling of the magic discard payload"

2017-01-12 Thread Dexuan Cui
> From: Christoph Hellwig [mailto:h...@lst.de] > Sent: Friday, January 13, 2017 02:19 > To: Dexuan Cui > Cc: linux-block@vger.kernel.org; KY Srinivasan ; Chris > Valean (Cloudbase Solutions SRL) > Subject: Re: [Regression] fstrim

RE: [Regression] fstrim hangs on Hyper-V: caused by "block: improve handling of the magic discard payload"

2017-01-12 Thread Dexuan Cui
> From: Dexuan Cui > Sent: Friday, January 13, 2017 11:05 > To: 'Christoph Hellwig' > Cc: linux-block@vger.kernel.org; KY Srinivasan ; Chris > Valean (Cloudbase Solutions SRL) > Subject: RE: [Regression] fstrim hangs on Hyper-V: caused by

Re: [PATCH 05/10] blk-mq: export some helpers we need to the scheduling framework

2017-01-12 Thread Bart Van Assche
On Wed, 2017-01-11 at 14:39 -0700, Jens Axboe wrote: > [ ... ] Reviewed-by: Bart Van Assche

Re: [PATCH 06/10] blk-mq-tag: cleanup the normal/reserved tag allocation

2017-01-12 Thread Jens Axboe
On Thu, Jan 12 2017, Bart Van Assche wrote: > On Wed, 2017-01-11 at 14:39 -0700, Jens Axboe wrote: > > This is in preparation for having another tag set available. Cleanup > > the parameters, and allow passing in of tags fo blk_mq_put_tag(). > > It seems like an 'r' is missing from the

Re: [PATCH 05/15] dm: remove incomple BLOCK_PC support

2017-01-12 Thread Mike Snitzer
On Thu, Jan 12 2017 at 3:00am -0500, Christoph Hellwig wrote: > On Wed, Jan 11, 2017 at 08:09:37PM -0500, Mike Snitzer wrote: > > I'm not following your reasoning. > > > > dm_blk_ioctl calls __blkdev_driver_ioctl and will call scsi_cmd_ioctl > > (sd_ioctl -> scsi_cmd_blk_ioctl ->

Re: [PATCH 0/2] Rename blk_queue_zone_size and bdev_zone_size

2017-01-12 Thread Jens Axboe
On 01/11/2017 09:38 PM, Jens Axboe wrote: > On 01/11/2017 09:36 PM, Damien Le Moal wrote: >> Jens, >> >> On 1/12/17 12:52, Jens Axboe wrote: >>> On Thu, Jan 12 2017, Damien Le Moal wrote: All block device data fields and functions returning a number of 512B sectors are by convention

[LSF/MM TOPIC] Memory hotplug, ZONE_DEVICE, and the future of struct page

2017-01-12 Thread Dan Williams
Back when we were first attempting to support DMA for DAX mappings of persistent memory the plan was to forgo 'struct page' completely and develop a pfn-to-scatterlist capability for the dma-mapping-api. That effort died in this thread: https://lkml.org/lkml/2015/8/14/3 ...where we learned

Re: [LSF/MM TOPIC] Memory hotplug, ZONE_DEVICE, and the future of struct page

2017-01-12 Thread Dan Williams
On Thu, Jan 12, 2017 at 3:14 PM, Jerome Glisse wrote: > On Thu, Jan 12, 2017 at 02:43:03PM -0800, Dan Williams wrote: >> Back when we were first attempting to support DMA for DAX mappings of >> persistent memory the plan was to forgo 'struct page' completely and >> develop a

Re: [PATCH 0/2] Rename blk_queue_zone_size and bdev_zone_size

2017-01-12 Thread Damien Le Moal
Jens, On 1/13/17 05:49, Jens Axboe wrote: > Just in case you missed it, I had to fold your two patches. Looking at > it again, what is going on? You rename a function, and then patch #2 > renames the use of that function in a different spot? How did that ever > pass your testing? For something

Re: [PATCH] preview - block layer help to detect sequential IO

2017-01-12 Thread Jeff Moyer
Hi, Kashyap, I'm CC-ing Kent, seeing how this is his code. Kashyap Desai writes: > Objective of this patch is - > > To move code used in bcache module in block layer which is used to > find IO stream. Reference code @drivers/md/bcache/request.c >

Re: [PATCH 06/10] blk-mq-tag: cleanup the normal/reserved tag allocation

2017-01-12 Thread Bart Van Assche
On Wed, 2017-01-11 at 14:39 -0700, Jens Axboe wrote: > This is in preparation for having another tag set available. Cleanup > the parameters, and allow passing in of tags fo blk_mq_put_tag(). It seems like an 'r' is missing from the description ("tags fo")? Anyway: Reviewed-by: Bart Van Assche

Re: [PATCH 08/10] blk-mq-sched: add framework for MQ capable IO schedulers

2017-01-12 Thread Bart Van Assche
On Wed, 2017-01-11 at 14:40 -0700, Jens Axboe wrote:
> @@ -451,11 +456,11 @@ void blk_insert_flush(struct request *rq)
>  	 * processed directly without going through flush machinery. Queue
>  	 * for normal execution.
>  	 */
> -	if ((policy & REQ_FSEQ_DATA) &&
> -

Re: [PATCH 03/10] block: move rq_ioc() to blk.h

2017-01-12 Thread Johannes Thumshirn
On Wed, Jan 11, 2017 at 02:39:56PM -0700, Jens Axboe wrote: > We want to use it outside of blk-core.c. > > Signed-off-by: Jens Axboe > --- Looks good, Reviewed-by: Johannes Thumshirn -- Johannes Thumshirn Storage

Re: [LSF/MM TOPIC][LSF/MM ATTEND] NAPI polling for block drivers

2017-01-12 Thread Sagi Grimberg
I'd like to attend LSF/MM and would like to discuss polling for block drivers. Currently there is blk-iopoll, but it is not as widely used as NAPI is in the networking field, and according to Sagi's findings in [1] performance with polling is not on par with IRQ usage. On LSF/MM I'd like to

[Regression] fstrim hangs on Hyper-V: caused by "block: improve handling of the magic discard payload"

2017-01-12 Thread Dexuan Cui
Hi, Recently fstrim and mkfs always hang in a Linux VM running on Hyper-V 2012 R2 or 2016. The VM uses the latest mainline kernel (v4.10-rc3). git-bisect shows the patch "block: improve handling of the magic discard payload (f9d03f96)" causes the issue. If I revert the patch, the issue will go

Re: [LSF/MM TOPIC][LSF/MM ATTEND] NAPI polling for block drivers

2017-01-12 Thread Sagi Grimberg
Hi all, I'd like to attend LSF/MM and would like to discuss polling for block drivers. Currently there is blk-iopoll, but it is not as widely used as NAPI is in the networking field, and according to Sagi's findings in [1] performance with polling is not on par with IRQ usage. On LSF/MM I'd

Re: [PATCH 01/10] block: move existing elevator ops to union

2017-01-12 Thread Johannes Thumshirn
On Wed, Jan 11, 2017 at 02:39:54PM -0700, Jens Axboe wrote: > Prep patch for adding MQ ops as well, since doing anon unions with > named initializers doesn't work on older compilers. > > Signed-off-by: Jens Axboe > --- Looks good, Reviewed-by: Johannes Thumshirn

Re: [PATCH 02/10] blk-mq: make mq_ops a const pointer

2017-01-12 Thread Johannes Thumshirn
On Wed, Jan 11, 2017 at 02:39:55PM -0700, Jens Axboe wrote: > We never change it, make that clear. > > Signed-off-by: Jens Axboe > Reviewed-by: Bart Van Assche > --- Looks good, Reviewed-by: Johannes Thumshirn -- Johannes

Re: [LSF/MM TOPIC][LSF/MM ATTEND] NAPI polling for block drivers

2017-01-12 Thread Sagi Grimberg
I agree with Jens that we'll need some analysis if we want the discussion to be effective, and I can spend some time on this if I can find volunteers with high-end nvme devices (I only have access to client nvme devices. I have a P3700 but somehow burned the FW. Let me see if I can bring it back

Re: [Lsf-pc] [LFS/MM TOPIC][LFS/MM ATTEND]: - Storage Stack and Driver Testing methodology.

2017-01-12 Thread Sagi Grimberg
Hi Folks, I would like to propose a general discussion on storage stack and device driver testing. I think it's very useful and needed. Purpose: - The main objective of this discussion is to address the need for a Unified Test Automation Framework which can be used by

Re: RFC: 512e ZBC host-managed disks

2017-01-12 Thread Jeff Moyer
Christoph Hellwig writes: > On Thu, Jan 12, 2017 at 05:13:52PM +0900, Damien Le Moal wrote: >> (3) Any other idea ? > > Do nothing and ignore the problem. This whole idea is so braindead that > the person coming up with the T10 language should be shot. Either a device > has 511

Re: [Lsf-pc] [LSF/MM TOPIC] [LSF/MM ATTEND] md raid general discussion

2017-01-12 Thread Sagi Grimberg
Hey Coly, Also I receive reports from users that raid1 performance is desired when it is built on NVMe SSDs as a cache (maybe bcache or dm-cache). I am working on some raid1 performance improvement (e.g. new raid1 I/O barrier and lockless raid1 I/O submit), and have some more ideas to discuss.

Re: [Lsf-pc] [LSF/MM TOPIC][LSF/MM ATTEND] NAPI polling for block drivers

2017-01-12 Thread Sagi Grimberg
**Note: when I ran multiple threads on more cpus the performance degradation phenomenon disappeared, but I tested on a VM with qemu emulation backed by null_blk so I figured I had some other bottleneck somewhere (that's why I asked for some more testing). That could be because of the vmexits

Re: [Regression] fstrim hangs on Hyper-V: caused by "block: improve handling of the magic discard payload"

2017-01-12 Thread Christoph Hellwig
Hi Dexuan, sorry for dropping the ball on the previous private report, I hoped I could get my hands on a Hyper-V VM and reproduce it myself, but that has obviously not happened. Can you send me the output of the provisioning_mode file for the scsi disk in question to get started?

Re: [LSF/MM TOPIC][LSF/MM ATTEND] NAPI polling for block drivers

2017-01-12 Thread Johannes Thumshirn
On Thu, Jan 12, 2017 at 01:44:05PM +0200, Sagi Grimberg wrote: [...]
> Its pretty basic:
> --
> [global]
> group_reporting
> cpus_allowed=0
> cpus_allowed_policy=split
> rw=randrw
> bs=4k
> numjobs=4
> iodepth=32
> runtime=60
> time_based
> loops=1
> ioengine=libaio
> direct=1
> invalidate=1
>

Re: [Regression] fstrim hangs on Hyper-V: caused by "block: improve handling of the magic discard payload"

2017-01-12 Thread Christoph Hellwig
On Fri, Jan 13, 2017 at 06:16:02AM +, Dexuan Cui wrote: > IMO this means not only SCSI Unmap command is affected, but > some other SCSI commands can be affected too? > And it looks the bare metal can be affected too? This affects all drivers looking at the sdb.length field for the total I/O

RE: [Regression] fstrim hangs on Hyper-V: caused by "block: improve handling of the magic discard payload"

2017-01-12 Thread Dexuan Cui
> From: Christoph Hellwig [mailto:h...@lst.de] > Sent: Thursday, January 12, 2017 21:44 > To: Dexuan Cui > Cc: Christoph Hellwig ; linux-block@vger.kernel.org; Jens Axboe > ; Vitaly Kuznetsov ; linux- > ker...@vger.kernel.org;

Re: [Regression] fstrim hangs on Hyper-V: caused by "block: improve handling of the magic discard payload"

2017-01-12 Thread Christoph Hellwig
Can you check if this debug printk triggers for the discard commands?

---
diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index 888e16e..7ab7d08 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -1031,6 +1031,10 @@ static void

Re: [PATCHSET v6] blk-mq scheduling framework

2017-01-12 Thread Bart Van Assche
On Wed, 2017-01-11 at 14:39 -0700, Jens Axboe wrote: > I've reworked bits of this to get rid of the shadow requests, thanks > to Bart for the inspiration. The missing piece, for me, was the fact > that we have the tags->rqs[] indirection array already. I've done this > somewhat differently,

Re: [LSF/MM TOPIC] Memory hotplug, ZONE_DEVICE, and the future of struct page

2017-01-12 Thread Jerome Glisse
On Thu, Jan 12, 2017 at 02:43:03PM -0800, Dan Williams wrote: > Back when we were first attempting to support DMA for DAX mappings of > persistent memory the plan was to forgo 'struct page' completely and > develop a pfn-to-scatterlist capability for the dma-mapping-api. That > effort died in this

Re: [Regression] fstrim hangs on Hyper-V: caused by "block: improve handling of the magic discard payload"

2017-01-12 Thread Christoph Hellwig
Next try: (I've also dropped most of the Cc list)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index c35b6de..2f358f7 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1018,7 +1018,10 @@ static int scsi_init_sgtable(struct request *req, struct

Re: [Lsf-pc] [LSF/MM TOPIC][LSF/MM ATTEND] NAPI polling for block drivers

2017-01-12 Thread Johannes Thumshirn
On Thu, Jan 12, 2017 at 04:41:00PM +0200, Sagi Grimberg wrote: > > >>**Note: when I ran multiple threads on more cpus the performance > >>degradation phenomenon disappeared, but I tested on a VM with > >>qemu emulation backed by null_blk so I figured I had some other > >>bottleneck somewhere

RE: [Regression] fstrim hangs on Hyper-V: caused by "block: improve handling of the magic discard payload"

2017-01-12 Thread Chris Valean (Cloudbase Solutions SRL)
Hi Christoph, Adding Nick and Alex to the thread. We'll give it a try along with Dexuan and update you with the results. Thank you! Chris Valean -Original Message- From: Christoph Hellwig [mailto:h...@lst.de] Sent: Thursday, January 12, 2017 8:19 PM To: Dexuan Cui

Re: [LSF/MM TOPIC][LSF/MM ATTEND] NAPI polling for block drivers

2017-01-12 Thread Bart Van Assche
On Thu, 2017-01-12 at 10:41 +0200, Sagi Grimberg wrote: > First, when the nvme device fires an interrupt, the driver consumes > the completion(s) from the interrupt (usually there will be some more > completions waiting in the cq by the time the host starts processing it). > With irq-poll, we