Jens Axboe <[email protected]> writes:
> On 5/1/19 5:56 AM, Jeff Moyer wrote:
>> Shenghui Wang <[email protected]> writes:
>>
>>> This issue is found by running liburing/test/io_uring_setup test.
>>>
>>> When test run, the testcase "attempt to bind to invalid cpu" would not
>>> pass with messages like:
>>> io_uring_setup(1, 0xbfc2f7c8), \
>>> flags: IORING_SETUP_SQPOLL|IORING_SETUP_SQ_AFF, \
>>> resv: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000, \
>>> sq_thread_cpu: 2
>>> expected -1, got 3
>>> FAIL
>>>
>>> On my system, there is:
>>> CPU(s) possible : 0-3
>>> CPU(s) online : 0-1
>>> CPU(s) offline : 2-3
>>> CPU(s) present : 0-1
>>>
>>> The sq_thread_cpu 2 is offline on my system, so the bind should fail.
>>> But cpu_possible() will pass the check. We shouldn't be able to bind
>>> to an offline cpu. Use cpu_online() to do the check.
>>>
>>> After the change, the testcase run as expected: EINVAL will be returned
>>> for cpu offlined.
>>>
>>> Signed-off-by: Shenghui Wang <[email protected]>
>>> ---
>>> fs/io_uring.c | 4 ++--
>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>>> index 0e9fb2cb1984..aa3d39860a1c 100644
>>> --- a/fs/io_uring.c
>>> +++ b/fs/io_uring.c
>>> @@ -2241,7 +2241,7 @@ static int io_sq_offload_start(struct io_ring_ctx
>>> *ctx,
>>> ctx->sqo_mm = current->mm;
>>>
>>> ret = -EINVAL;
>>> - if (!cpu_possible(p->sq_thread_cpu))
>>> + if (!cpu_online(p->sq_thread_cpu))
>>> goto err;
>>>
>>> if (ctx->flags & IORING_SETUP_SQPOLL) {
>>> @@ -2258,7 +2258,7 @@ static int io_sq_offload_start(struct io_ring_ctx
>>> *ctx,
>>>
>>> cpu = array_index_nospec(p->sq_thread_cpu, NR_CPUS);
>>> ret = -EINVAL;
>>> - if (!cpu_possible(p->sq_thread_cpu))
>>> + if (!cpu_online(p->sq_thread_cpu))
>>> goto err;
>>>
>>> ctx->sqo_thread = kthread_create_on_cpu(io_sq_thread,
>>
>> Hmm. Why are we doing this check twice? Oh... Jens, I think you
>> braino'd commit 917257daa0fea. Have a look. You probably wanted to get
>> rid of the first check for cpu_possible.
>
> Added a fixup patch the other day:
>
> http://git.kernel.dk/cgit/linux-block/commit/?h=for-linus&id=362bf8670efccebca22efda1ee5a5ee831ec5efb

@@ -2333,13 +2329,14 @@ static int io_sq_offload_start(struct io_ring_ctx *ctx,
 		ctx->sq_thread_idle = HZ;

 	if (p->flags & IORING_SETUP_SQ_AFF) {
-		int cpu;
+		int cpu = p->sq_thread_cpu;

-		cpu = array_index_nospec(p->sq_thread_cpu, NR_CPUS);
 		ret = -EINVAL;
-		if (!cpu_possible(p->sq_thread_cpu))
+		if (cpu >= nr_cpu_ids || !cpu_possible(cpu))
 			goto err;

+		cpu = array_index_nospec(cpu, nr_cpu_ids);
+

Why do you do the array_index_nospec last?  Why wouldn't that be written
as:

	if (p->flags & IORING_SETUP_SQ_AFF) {
		int cpu = array_index_nospec(p->sq_thread_cpu, nr_cpu_ids);

		ret = -EINVAL;
		if (!cpu_possible(cpu))
			goto err;

		ctx->sqo_thread = kthread_create_on_cpu(io_sq_thread,
							ctx, cpu,
							"io_uring-sq");
	} else {
	...

That would take away some head-scratching for me.

Cheers,
Jeff
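
One possible reading of the ordering question, offered as a sketch and not as a statement from the patch author: array_index_nospec() yields 0 for an out-of-range index, so clamping before the validity test could turn an invalid sq_thread_cpu into CPU 0 instead of -EINVAL. The userspace model below only mimics that return value; it is not a speculation barrier. All names (model_index_nospec, model_cpu_possible, MODEL_NR_CPU_IDS, the two bind_* helpers) are hypothetical stand-ins, and the four-CPU range is borrowed from the "possible: 0-3" report quoted above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's nr_cpu_ids and EINVAL. */
#define MODEL_NR_CPU_IDS 4UL
#define MODEL_EINVAL     22

/* Assumption: CPUs 0-3 are possible, matching the report in this thread. */
static bool model_cpu_possible(unsigned long cpu)
{
	return cpu < MODEL_NR_CPU_IDS;
}

/*
 * Models only the result of array_index_nospec(): the index itself when
 * index < size, otherwise 0. A compiler may emit a branch here, so this
 * is NOT a Spectre mitigation, just a model of the semantics.
 */
static unsigned long model_index_nospec(unsigned long index,
					unsigned long size)
{
	unsigned long mask = 0UL - (unsigned long)(index < size);

	return index & mask;
}

/* Order used in the fixup hunk: validate first, clamp afterwards. */
static long bind_check_then_clamp(unsigned long requested)
{
	if (requested >= MODEL_NR_CPU_IDS || !model_cpu_possible(requested))
		return -MODEL_EINVAL;

	return (long)model_index_nospec(requested, MODEL_NR_CPU_IDS);
}

/* Order proposed in the reply: clamp first, then validate. */
static long bind_clamp_then_check(unsigned long requested)
{
	unsigned long cpu = model_index_nospec(requested, MODEL_NR_CPU_IDS);

	if (!model_cpu_possible(cpu))
		return -MODEL_EINVAL;

	return (long)cpu;
}
```

In this model, bind_check_then_clamp(1000) returns -22, while bind_clamp_then_check(1000) returns 0, i.e. the request is silently accepted as CPU 0. Both helpers agree for in-range values such as 2.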