Re: blk-mq: takes hours for scsi scanning finish when thousands of LUNs

2015-10-23 Thread jason
Hi, Jeff On Friday, October 23, 2015 03:04 AM, Jeff Moyer wrote: Jens Axboe writes: On 10/22/2015 09:53 AM, Jeff Moyer wrote: I think that percolating BLK_MQ_F_TAG_SHARED up to the tag set would allow newly created hctxs to simply inherit the shared state (in blk_mq_init_hctx), and you won't

Re: blk-mq: takes hours for scsi scanning finish when thousands of LUNs

2015-10-22 Thread Ming Lei
On Thu, Oct 22, 2015 at 5:15 PM, jason wrote: > > > On Thursday, October 22, 2015 04:47 PM, Tejun Heo wrote: >> >> Hello, >> >> On Mon, Oct 19, 2015 at 07:40:13AM -0700, Zhangqing Luo wrote: >> >> > So every time blk_mq_freeze_queue_start, it runs in this way >> > >> >

Re: blk-mq: takes hours for scsi scanning finish when thousands of LUNs

2015-10-22 Thread Jeff Moyer
Jens Axboe writes: > On 10/22/2015 09:53 AM, Jeff Moyer wrote: >> I think that percolating BLK_MQ_F_TAG_SHARED up to the tag set would >> allow newly created hctxs to simply inherit the shared state (in >> blk_mq_init_hctx), and you won't need to freeze every queue in order to >> guarantee that.

Re: blk-mq: takes hours for scsi scanning finish when thousands of LUNs

2015-10-22 Thread Jens Axboe
On 10/22/2015 09:53 AM, Jeff Moyer wrote: Jens Axboe writes: I agree with the optimizing hot paths by cheaper percpu operation, but how much does it affect the performance? A lot, since the queue referencing happens twice per IO. The switch to percpu was done to use shared/common code for

Re: blk-mq: takes hours for scsi scanning finish when thousands of LUNs

2015-10-22 Thread Jeff Moyer
Jens Axboe writes: >> I agree with the optimizing hot paths by cheaper percpu operation, >> but how much does it affect the performance? > > A lot, since the queue referencing happens twice per IO. The switch to > percpu was done to use shared/common code for this, the previous > version was a

Re: blk-mq: takes hours for scsi scanning finish when thousands of LUNs

2015-10-22 Thread Jens Axboe
On 10/22/2015 03:15 AM, jason wrote: On Thursday, October 22, 2015 04:47 PM, Tejun Heo wrote: Hello, On Mon, Oct 19, 2015 at 07:40:13AM -0700, Zhangqing Luo wrote: > So every time blk_mq_freeze_queue_start, it runs in this way > > blk_mq_freeze_queue_start >

Re: blk-mq: takes hours for scsi scanning finish when thousands of LUNs

2015-10-22 Thread jason
On Thursday, October 22, 2015 04:47 PM, Tejun Heo wrote: Hello, On Mon, Oct 19, 2015 at 07:40:13AM -0700, Zhangqing Luo wrote: > So every time blk_mq_freeze_queue_start, it runs in this way > > blk_mq_freeze_queue_start > ->percpu_ref_kill->percpu_ref_kill_and_confirm >

Re: blk-mq: takes hours for scsi scanning finish when thousands of LUNs

2015-10-22 Thread Tejun Heo
Hello, On Mon, Oct 19, 2015 at 07:40:13AM -0700, Zhangqing Luo wrote: > So every time blk_mq_freeze_queue_start, it runs in this way > > blk_mq_freeze_queue_start > ->percpu_ref_kill->percpu_ref_kill_and_confirm > ->__percpu_ref_switch_to_atomic >

blk-mq: takes hours for scsi scanning finish when thousands of LUNs

2015-10-19 Thread Zhangqing Luo
Hi Jens, Tejun, The problem we are currently seeing: when multi-queue is used for SCSI, the interval between LUN probes grows as the number of block request queues goes up, and eventually it takes hours for the initiator to finish scanning thousands of LUNs. Kernel version we're using: 4.1.6-14 I
