Hi Jens
On 11/13/18 12:26 AM, Jens Axboe wrote:
> On 11/12/18 2:35 AM, jianchao.wang wrote:
>> Hi Jens
>>
>> On 11/10/18 11:13 PM, Jens Axboe wrote:
>>> We only really need the barrier if we're going to be sleeping,
>>> if we're just polling we're fine with the __set_current_state()
>>> variant.
>>>
>>> Signed-off-by: Jens Axboe <[email protected]>
>>> ---
>>> fs/block_dev.c | 25 +++++++++++++++++++++----
>>> 1 file changed, 21 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/fs/block_dev.c b/fs/block_dev.c
>>> index c039abfb2052..e7550b1f9670 100644
>>> --- a/fs/block_dev.c
>>> +++ b/fs/block_dev.c
>>> @@ -237,12 +237,23 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct
>>> iov_iter *iter,
>>>
>>> qc = submit_bio(&bio);
>>> for (;;) {
>>> - set_current_state(TASK_UNINTERRUPTIBLE);
>>> + /*
>>> + * Using non-atomic task state for polling
>>> + */
>>> + __set_current_state(TASK_UNINTERRUPTIBLE);
>>> +
>>> if (!READ_ONCE(bio.bi_private))
>>> break;
>>
>> When we don't use polling, the blkdev_bio_end_io_simple may come on a
>> different cpu before
>> submit_bio returns
>> For example,
>> __blkdev_direct_IO_simple
>> qc = submit_bio(&bio)
>> blkdev_bio_end_io_simple
>> WRITE_ONCE(bio.bi_private, NULL)
>> wake_up_process(waiter)
>> __set_current_state(TASK_UNINTERRUPTIBLE)
>> READ_ONCE(bio.bi_private)
>
> While I agree that you are right, that's an existing issue. The store
> barrier implied by set_current_state() doesn't fix that.
>
> How about this variant:
>
> https://urldefense.proofpoint.com/v2/url?u=http-3A__git.kernel.dk_cgit_linux-2Dblock_commit_-3Fh-3Dmq-2Dperf-26id-3D036ba7a8d1334889c0fe55d4858d78f4e8022f12&d=DwICaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=7WdAxUBeiTUTCy8v-7zXyr4qk7sx26ATvfo6QSTvZyQ&m=pjbk_9rxuy7OS1pCkQ9MAHzFf4eT27Yt4RL8jASDJCA&s=AnmYMIcGpIILUrjDFsabj61pcTyMCRHB1Pu8XXi47t0&e=
>
> which then changes the later wake up helper patch to be:
>
> https://urldefense.proofpoint.com/v2/url?u=http-3A__git.kernel.dk_cgit_linux-2Dblock_commit_-3Fh-3Dmq-2Dperf-26id-3Df8c3f188425967adb040cfefb799b0a5a1df769d&d=DwICaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=7WdAxUBeiTUTCy8v-7zXyr4qk7sx26ATvfo6QSTvZyQ&m=pjbk_9rxuy7OS1pCkQ9MAHzFf4eT27Yt4RL8jASDJCA&s=PdajuUul55e6p0FED_AzzWt5Gip0ekLi1eGYBMLiZPo&e=
>
The wake up path has contained a full memory barrier with the raw_spin_lock and
following smp_mb__after_spinlock
wake_up_process
-> try_to_wake_up
raw_spin_lock_irqsave(&p->pi_lock, flags);
smp_mb__after_spinlock();
So a smp_rmb() could be enough. :)
Thanks
Jianchao