Re: [PATCH v2 4/5] block, scsi: Rework runtime power management

2018-08-02 Thread Bart Van Assche
On Mon, 2018-07-30 at 09:56 +0800, jianchao.wang wrote:
> static int sdev_runtime_suspend(struct device *dev)
> {
>   const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
>   struct scsi_device *sdev = to_scsi_device(dev);
>   int err = 0;
> 
>   err = blk_pre_runtime_suspend(sdev->request_queue);
>   if (err)
>   return err;
>   if (pm && pm->runtime_suspend)
>   err = pm->runtime_suspend(dev);
>   blk_post_runtime_suspend(sdev->request_queue, err);
> 
>   return err;
> }
> 
> If blk_pre_runtime_suspend returns -EBUSY, blk_post_runtime_suspend will not 
> be invoked.

Right, I will fix this.
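
One possible shape for that fix (just a sketch, not the actual follow-up patch)
would be to let blk_pre_runtime_suspend() undo its own state change before
returning an error, since the caller bails out early on failure:

	int blk_pre_runtime_suspend(struct request_queue *q)
	{
		int ret = 0;

		if (!q->dev)
			return ret;

		blk_pm_runtime_lock(q);
		blk_set_preempt_only(q);
		percpu_ref_switch_to_atomic_sync(&q->q_usage_counter);

		spin_lock_irq(q->queue_lock);
		if (!percpu_ref_is_zero(&q->q_usage_counter)) {
			ret = -EBUSY;
			pm_runtime_mark_last_busy(q->dev);
		} else {
			q->rpm_status = RPM_SUSPENDING;
		}
		spin_unlock_irq(q->queue_lock);

		percpu_ref_switch_to_percpu(&q->q_usage_counter);

		/*
		 * The caller returns early on failure and never calls
		 * blk_post_runtime_suspend(), so undo the preempt-only
		 * state here.
		 */
		if (ret)
			blk_clear_preempt_only(q);

		blk_pm_runtime_unlock(q);

		return ret;
	}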

> > > The request_queue should be set to preempt-only mode only when we confirm
> > > we could set rpm_status to RPM_SUSPENDING or RPM_RESUMING.
> > 
> > Why do you think this?
> 
> https://marc.info/?l=linux-scsi&m=133727953625963&w=2
> "
> If q->rpm_status is RPM_SUSPENDED, they shouldn't do anything -- act as though
> the queue is empty.  If q->rpm_status is RPM_SUSPENDING or RPM_RESUMING, they
> should hand over the request only if it has the REQ_PM flag set.
> "

I think the blk_pre_runtime_suspend() callers guarantee that q->rpm_status ==
RPM_ACTIVE before blk_pre_runtime_suspend() is called. I will add a
WARN_ON_ONCE() statement that verifies that.
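
Roughly (a sketch of the intended check, not the exact patch), near the top of
blk_pre_runtime_suspend():

	/* Runtime PM should only try to suspend a queue that is still active. */
	WARN_ON_ONCE(q->rpm_status != RPM_ACTIVE);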

> In addition, if we set preempt-only here unconditionally, normal I/O will be
> blocked during blk_pre_runtime_suspend(). In your patch, q_usage_counter will
> be switched to atomic mode, and this could cost some time. Is it really OK?

I will see what I can do about this.

Bart.



Re: [PATCH v2 4/5] block, scsi: Rework runtime power management

2018-07-29 Thread jianchao.wang



On 07/27/2018 11:29 PM, Bart Van Assche wrote:
> On Fri, 2018-07-27 at 09:57 +0800, jianchao.wang wrote:
>> Something like:
>>
>> 	percpu_ref_switch_to_atomic_sync(&q->q_usage_counter);
>> 	if (!percpu_ref_is_zero(&q->q_usage_counter)) {
>> 		ret = -EBUSY;
>> 		pm_runtime_mark_last_busy(q->dev);
>> 	} else {
>> 		blk_set_preempt_only(q);
>> 		if (!percpu_ref_is_zero(&q->q_usage_counter)) {
>> 			ret = -EBUSY;
>> 			pm_runtime_mark_last_busy(q->dev);
>> 			blk_clear_preempt_only(q);
>> 		} else {
>> 			q->rpm_status = RPM_SUSPENDING;
>> 		}
>> 	}
> 
> I think this code is racy. Because there is no memory barrier in
> blk_queue_enter() between the percpu_ref_tryget_live() and the
> blk_queue_preempt_only() calls, the context that sets the PREEMPT_ONLY flag
> has to use synchronize_rcu() or call_rcu() to ensure that blk_queue_enter()
> sees the PREEMPT_ONLY flag after it has called percpu_ref_tryget_live().
> See also http://lwn.net/Articles/573497/.

Yes, a synchronize_rcu() is indeed needed here to ensure that each
blk_queue_enter() either takes a q_usage_counter reference (incrementing the
counter) or fails without incrementing it.
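
A rough sketch of that ordering (hypothetical code; it assumes blk_queue_enter()
checks the PREEMPT_ONLY flag inside an RCU read-side critical section right
after percpu_ref_tryget_live(), as the current blk_queue_enter() does):

	blk_set_preempt_only(q);
	/*
	 * Wait for an RCU grace period: any blk_queue_enter() that did not
	 * see PREEMPT_ONLY has finished its read-side critical section, so
	 * either its reference is visible in q_usage_counter below or it
	 * gave up without taking one.
	 */
	synchronize_rcu();
	percpu_ref_switch_to_atomic_sync(&q->q_usage_counter);
	if (!percpu_ref_is_zero(&q->q_usage_counter)) {
		ret = -EBUSY;
		pm_runtime_mark_last_busy(q->dev);
	} else {
		q->rpm_status = RPM_SUSPENDING;
	}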

Thanks
Jianchao


Re: [PATCH v2 4/5] block, scsi: Rework runtime power management

2018-07-29 Thread jianchao.wang


Hi Bart

On 07/27/2018 11:09 PM, Bart Van Assche wrote:
> On Fri, 2018-07-27 at 09:57 +0800, jianchao.wang wrote:
>> If q_usage_counter is not zero here, we will leave the request_queue in 
>> preempt only mode.
> 
> That's on purpose. If q_usage_counter is not zero then blk_pre_runtime_suspend()
> will return -EBUSY. That error code will be passed to blk_post_runtime_suspend()
> and that will cause that function to clear QUEUE_FLAG_PREEMPT_ONLY.
> 

static int sdev_runtime_suspend(struct device *dev)
{
	const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
	struct scsi_device *sdev = to_scsi_device(dev);
	int err = 0;

	err = blk_pre_runtime_suspend(sdev->request_queue);
	if (err)
		return err;
	if (pm && pm->runtime_suspend)
		err = pm->runtime_suspend(dev);
	blk_post_runtime_suspend(sdev->request_queue, err);

	return err;
}

If blk_pre_runtime_suspend returns -EBUSY, blk_post_runtime_suspend will not be 
invoked.

>> The request_queue should be set to preempt-only mode only when we confirm we
>> could set rpm_status to RPM_SUSPENDING or RPM_RESUMING.
> 
> Why do you think this?

https://marc.info/?l=linux-scsi&m=133727953625963&w=2
"
If q->rpm_status is RPM_SUSPENDED, they shouldn't do anything -- act as though
the queue is empty.  If q->rpm_status is RPM_SUSPENDING or RPM_RESUMING, they
should hand over the request only if it has the REQ_PM flag set.
"
In addition, if we set preempt-only here unconditionally, normal I/O will be
blocked during blk_pre_runtime_suspend(). In your patch, q_usage_counter will be
switched to atomic mode, and this could cost some time. Is it really OK?

Thanks
Jianchao
> 
> Bart.
> 
> 


Re: [PATCH v2 4/5] block, scsi: Rework runtime power management

2018-07-27 Thread Bart Van Assche
On Fri, 2018-07-27 at 09:57 +0800, jianchao.wang wrote:
> Something like:
> 
> 	percpu_ref_switch_to_atomic_sync(&q->q_usage_counter);
> 	if (!percpu_ref_is_zero(&q->q_usage_counter)) {
> 		ret = -EBUSY;
> 		pm_runtime_mark_last_busy(q->dev);
> 	} else {
> 		blk_set_preempt_only(q);
> 		if (!percpu_ref_is_zero(&q->q_usage_counter)) {
> 			ret = -EBUSY;
> 			pm_runtime_mark_last_busy(q->dev);
> 			blk_clear_preempt_only(q);
> 		} else {
> 			q->rpm_status = RPM_SUSPENDING;
> 		}
> 	}

I think this code is racy. Because there is no memory barrier in
blk_queue_enter() between the percpu_ref_tryget_live() and the
blk_queue_preempt_only() calls, the context that sets the PREEMPT_ONLY flag
has to use synchronize_rcu() or call_rcu() to ensure that blk_queue_enter()
sees the PREEMPT_ONLY flag after it has called percpu_ref_tryget_live().
See also http://lwn.net/Articles/573497/.

Bart.


Re: [PATCH v2 4/5] block, scsi: Rework runtime power management

2018-07-27 Thread Bart Van Assche
On Fri, 2018-07-27 at 09:57 +0800, jianchao.wang wrote:
> If q_usage_counter is not zero here, we will leave the request_queue in 
> preempt only mode.

That's on purpose. If q_usage_counter is not zero then blk_pre_runtime_suspend()
will return -EBUSY. That error code will be passed to blk_post_runtime_suspend()
and that will cause that function to clear QUEUE_FLAG_PREEMPT_ONLY.
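
Roughly, the error path being described has this shape (paraphrasing, not
quoting the patch verbatim):

	void blk_post_runtime_suspend(struct request_queue *q, int err)
	{
		if (!q->dev)
			return;

		spin_lock_irq(q->queue_lock);
		if (!err) {
			q->rpm_status = RPM_SUSPENDED;
		} else {
			q->rpm_status = RPM_ACTIVE;
			pm_runtime_mark_last_busy(q->dev);
		}
		spin_unlock_irq(q->queue_lock);

		/* On failure, let normal requests into the queue again. */
		if (err)
			blk_clear_preempt_only(q);
	}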

> The request_queue should be set to preempt-only mode only when we confirm we
> could set rpm_status to RPM_SUSPENDING or RPM_RESUMING.

Why do you think this?

Bart.



Re: [PATCH v2 4/5] block, scsi: Rework runtime power management

2018-07-26 Thread jianchao.wang



On 07/27/2018 06:24 AM, Bart Van Assche wrote:
> On Thu, 2018-07-26 at 10:45 +0800, jianchao.wang wrote:
>> On 07/26/2018 06:26 AM, Bart Van Assche wrote:
>>> @@ -102,9 +109,11 @@ int blk_pre_runtime_suspend(struct request_queue *q)
>>> return ret;
>>>  
>>> blk_pm_runtime_lock(q);
>>> +   blk_set_preempt_only(q);
>>
>> We only stop non-RQF_PM requests from entering when the status is
>> RPM_SUSPENDING or RPM_RESUMING. blk_pre_runtime_suspend() should only
>> _check_ whether runtime suspend is allowed, so we should not set
>> preempt_only here.
>>
>>> +   percpu_ref_switch_to_atomic_sync(&q->q_usage_counter);
>>>  
>>> spin_lock_irq(q->queue_lock);
>>> -   if (q->nr_pending) {
>>> +   if (!percpu_ref_is_zero(&q->q_usage_counter)) {
>>> ret = -EBUSY;
>>> pm_runtime_mark_last_busy(q->dev);
>>> } else {
>>> @@ -112,6 +121,7 @@ int blk_pre_runtime_suspend(struct request_queue *q)
>>> }
>>> spin_unlock_irq(q->queue_lock);
>>>  
>>> +   percpu_ref_switch_to_percpu(&q->q_usage_counter);
>>> blk_pm_runtime_unlock(q);
> 
> Hello Jianchao,
> 
> There is only one caller of blk_post_runtime_suspend(), namely
> sdev_runtime_suspend(). That function calls pm->runtime_suspend() before
> it calls blk_post_runtime_suspend(). I think it would be wrong to set the
> PREEMPT_ONLY flag from inside blk_post_runtime_suspend() because that could
> cause pm->runtime_suspend() to be called while a request is in progress.
> 

Hi Bart

If q_usage_counter is not zero here, we will leave the request_queue in
preempt-only mode.
The request_queue should be set to preempt-only mode only when we confirm we
could set rpm_status to RPM_SUSPENDING or RPM_RESUMING.

Maybe we could set the PREEMPT_ONLY after we confirm we could set the rpm_status
to RPM_SUSPENDING.
Something like:

	percpu_ref_switch_to_atomic_sync(&q->q_usage_counter);
	if (!percpu_ref_is_zero(&q->q_usage_counter)) {
		ret = -EBUSY;
		pm_runtime_mark_last_busy(q->dev);
	} else {
		blk_set_preempt_only(q);
		if (!percpu_ref_is_zero(&q->q_usage_counter)) {
			ret = -EBUSY;
			pm_runtime_mark_last_busy(q->dev);
			blk_clear_preempt_only(q);
		} else {
			q->rpm_status = RPM_SUSPENDING;
		}
	}


Thanks
Jianchao


Re: [PATCH v2 4/5] block, scsi: Rework runtime power management

2018-07-26 Thread Bart Van Assche
On Thu, 2018-07-26 at 10:45 +0800, jianchao.wang wrote:
> On 07/26/2018 06:26 AM, Bart Van Assche wrote:
> > @@ -102,9 +109,11 @@ int blk_pre_runtime_suspend(struct request_queue *q)
> > return ret;
> >  
> > blk_pm_runtime_lock(q);
> > +   blk_set_preempt_only(q);
> 
> We only stop non-RQF_PM requests from entering when the status is
> RPM_SUSPENDING or RPM_RESUMING. blk_pre_runtime_suspend() should only
> _check_ whether runtime suspend is allowed, so we should not set
> preempt_only here.
> 
> > +   percpu_ref_switch_to_atomic_sync(&q->q_usage_counter);
> >  
> > spin_lock_irq(q->queue_lock);
> > -   if (q->nr_pending) {
> > +   if (!percpu_ref_is_zero(&q->q_usage_counter)) {
> > ret = -EBUSY;
> > pm_runtime_mark_last_busy(q->dev);
> > } else {
> > @@ -112,6 +121,7 @@ int blk_pre_runtime_suspend(struct request_queue *q)
> > }
> > spin_unlock_irq(q->queue_lock);
> >  
> > +   percpu_ref_switch_to_percpu(&q->q_usage_counter);
> > blk_pm_runtime_unlock(q);

Hello Jianchao,

There is only one caller of blk_post_runtime_suspend(), namely
sdev_runtime_suspend(). That function calls pm->runtime_suspend() before
it calls blk_post_runtime_suspend(). I think it would be wrong to set the
PREEMPT_ONLY flag from inside blk_post_runtime_suspend() because that could
cause pm->runtime_suspend() to be called while a request is in progress.

Bart.



Re: [PATCH v2 4/5] block, scsi: Rework runtime power management

2018-07-26 Thread jianchao.wang



On 07/26/2018 03:52 PM, jianchao.wang wrote:
> In addition, .runtime_suspend is invoked under a spinlock with interrupts
> disabled, so sleeping is forbidden here.
> Please refer to rpm_suspend:
> 
> * This function must be called under dev->power.lock with interrupts disabled
> 
__rpm_callback() will unlock the spinlock before invoking the callback.
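For reference, a simplified view of that part of __rpm_callback() in
drivers/base/power/runtime.c (ignoring the irq_safe and other details):

	/* The runtime PM core drops dev->power.lock around the callback. */
	spin_unlock_irq(&dev->power.lock);
	retval = cb(dev);
	spin_lock_irq(&dev->power.lock);
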
Sorry for the noise.

Thanks
Jianchao



Re: [PATCH v2 4/5] block, scsi: Rework runtime power management

2018-07-26 Thread jianchao.wang



On 07/26/2018 10:45 AM, jianchao.wang wrote:
> Hi Bart
> 
> On 07/26/2018 06:26 AM, Bart Van Assche wrote:
>> @@ -102,9 +109,11 @@ int blk_pre_runtime_suspend(struct request_queue *q)
>>  return ret;
>>  
>>  blk_pm_runtime_lock(q);
>> +blk_set_preempt_only(q);
> 
> We only stop non-RQF_PM requests from entering when the status is
> RPM_SUSPENDING or RPM_RESUMING. blk_pre_runtime_suspend() should only
> _check_ whether runtime suspend is allowed, so we should not set
> preempt_only here.
> 
> 
>> +percpu_ref_switch_to_atomic_sync(&q->q_usage_counter);

In addition, .runtime_suspend is invoked under a spinlock with interrupts
disabled, so sleeping is forbidden here.
Please refer to rpm_suspend:

* This function must be called under dev->power.lock with interrupts disabled

Thanks
Jianchao
>>  
>>  spin_lock_irq(q->queue_lock);
>> -if (q->nr_pending) {
>> +if (!percpu_ref_is_zero(&q->q_usage_counter)) {
>>  ret = -EBUSY;
>>  pm_runtime_mark_last_busy(q->dev);
>>  } else {
>> @@ -112,6 +121,7 @@ int blk_pre_runtime_suspend(struct request_queue *q)
>>  }
>>  spin_unlock_irq(q->queue_lock);
>>  
>> +percpu_ref_switch_to_percpu(&q->q_usage_counter);
>>  blk_pm_runtime_unlock(q);
> 


Re: [PATCH v2 4/5] block, scsi: Rework runtime power management

2018-07-25 Thread jianchao.wang
Hi Bart

On 07/26/2018 06:26 AM, Bart Van Assche wrote:
> @@ -102,9 +109,11 @@ int blk_pre_runtime_suspend(struct request_queue *q)
>   return ret;
>  
>   blk_pm_runtime_lock(q);
> + blk_set_preempt_only(q);

We only stop non-RQF_PM requests from entering when the status is RPM_SUSPENDING
or RPM_RESUMING. blk_pre_runtime_suspend() should only _check_ whether runtime
suspend is allowed, so we should not set preempt_only here.


> + percpu_ref_switch_to_atomic_sync(&q->q_usage_counter);
>  
>   spin_lock_irq(q->queue_lock);
> - if (q->nr_pending) {
> + if (!percpu_ref_is_zero(&q->q_usage_counter)) {
>   ret = -EBUSY;
>   pm_runtime_mark_last_busy(q->dev);
>   } else {
> @@ -112,6 +121,7 @@ int blk_pre_runtime_suspend(struct request_queue *q)
>   }
>   spin_unlock_irq(q->queue_lock);
>  
> + percpu_ref_switch_to_percpu(&q->q_usage_counter);
>   blk_pm_runtime_unlock(q);