Re: [PATCH 2/2] arch: add pidfd and io_uring syscalls everywhere

2019-04-04 Thread Michael Ellerman
Jens Axboe  writes:
> On 4/3/19 5:11 AM, Will Deacon wrote:
>> On Wed, Apr 03, 2019 at 01:47:50PM +1100, Michael Ellerman wrote:
>>> Arnd Bergmann  writes:
>>>> diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
>>>> index b18abb0c3dae..00f5a63c8d9a 100644
>>>> --- a/arch/powerpc/kernel/syscalls/syscall.tbl
>>>> +++ b/arch/powerpc/kernel/syscalls/syscall.tbl
>>>> @@ -505,3 +505,7 @@
>>>>  421   32      rt_sigtimedwait_time64        sys_rt_sigtimedwait        compat_sys_rt_sigtimedwait_time64
>>>>  422   32      futex_time64                  sys_futex                  sys_futex
>>>>  423   32      sched_rr_get_interval_time64  sys_sched_rr_get_interval  sys_sched_rr_get_interval
>>>> +424   common  pidfd_send_signal             sys_pidfd_send_signal
>>>> +425   common  io_uring_setup                sys_io_uring_setup
>>>> +426   common  io_uring_enter                sys_io_uring_enter
>>>> +427   common  io_uring_register             sys_io_uring_register
>>>
>>> Acked-by: Michael Ellerman  (powerpc)
>>>
>>> Lightly tested.
>>>
>>> The pidfd_test selftest passes.
>> 
>> That reports pass for me too, although it fails to unshare the pid ns, which I
>> assume is benign.

If you run it as root it should work?

>>> Ran the io_uring example from fio, which prints lots of:
>> 
>> How did you invoke that? I had a play with the tests in:
>
> It's t/io_uring from the fio repo:
>
> git://git.kernel.dk/fio
>
> and you just run it ala:
>
> # make t/io_uring
> # t/io_uring /dev/some_device

Yeah that's all I did.

>> will@autoplooker:~/liburing/test$ ./io_uring_register 
>> RELIMIT_MEMLOCK: 67108864 (67108864)
>> [   35.477875] Unable to handle kernel NULL pointer dereference at virtual 
>> address 0070
>> [   35.478969] Mem abort info:
>> [   35.479296]   ESR = 0x9604
>> [   35.479785]   Exception class = DABT (current EL), IL = 32 bits
>> [   35.480528]   SET = 0, FnV = 0
>> [   35.480980]   EA = 0, S1PTW = 0
>> [   35.481345] Data abort info:
>> [   35.481680]   ISV = 0, ISS = 0x0004
>> [   35.482267]   CM = 0, WnR = 0
>> [   35.482618] user pgtable: 4k pages, 48-bit VAs, pgdp = (ptrval)
>> [   35.483486] [0070] pgd=
>> [   35.484041] Internal error: Oops: 9604 [#1] PREEMPT SMP
>> [   35.484788] Modules linked in:
>> [   35.485311] CPU: 113 PID: 3973 Comm: io_uring_regist Not tainted 
>> 5.1.0-rc3-00012-g40b114779944 #1
>> [   35.486712] Hardware name: linux,dummy-virt (DT)
>> [   35.487450] pstate: 2045 (nzCv daif +PAN -UAO)
>> [   35.488228] pc : link_pwq+0x10/0x60
>> [   35.488794] lr : apply_wqattrs_commit+0xe0/0x118
>> [   35.489550] sp : 17e2bbc0
>
> Huh, this looks odd, it's crashing inside the wq setup.

Looks like you found a bug :)

cheers


Re: [PATCH 2/2] arch: add pidfd and io_uring syscalls everywhere

2019-04-03 Thread Jens Axboe
On 4/3/19 9:49 AM, Will Deacon wrote:
> On Wed, Apr 03, 2019 at 09:39:52AM -0600, Jens Axboe wrote:
>> On 4/3/19 9:19 AM, Will Deacon wrote:
>>> On Wed, Apr 03, 2019 at 07:49:26AM -0600, Jens Axboe wrote:
>>>> On 4/3/19 5:11 AM, Will Deacon wrote:
>>>>> will@autoplooker:~/liburing/test$ ./io_uring_register
>>>>> RELIMIT_MEMLOCK: 67108864 (67108864)
>>>>> [   35.477875] Unable to handle kernel NULL pointer dereference at virtual address 0070
>>>>> [   35.478969] Mem abort info:
>>>>> [   35.479296]   ESR = 0x9604
>>>>> [   35.479785]   Exception class = DABT (current EL), IL = 32 bits
>>>>> [   35.480528]   SET = 0, FnV = 0
>>>>> [   35.480980]   EA = 0, S1PTW = 0
>>>>> [   35.481345] Data abort info:
>>>>> [   35.481680]   ISV = 0, ISS = 0x0004
>>>>> [   35.482267]   CM = 0, WnR = 0
>>>>> [   35.482618] user pgtable: 4k pages, 48-bit VAs, pgdp = (ptrval)
>>>>> [   35.483486] [0070] pgd=
>>>>> [   35.484041] Internal error: Oops: 9604 [#1] PREEMPT SMP
>>>>> [   35.484788] Modules linked in:
>>>>> [   35.485311] CPU: 113 PID: 3973 Comm: io_uring_regist Not tainted 5.1.0-rc3-00012-g40b114779944 #1
>>>>> [   35.486712] Hardware name: linux,dummy-virt (DT)
>>>>> [   35.487450] pstate: 2045 (nzCv daif +PAN -UAO)
>>>>> [   35.488228] pc : link_pwq+0x10/0x60
>>>>> [   35.488794] lr : apply_wqattrs_commit+0xe0/0x118
>>>>> [   35.489550] sp : 17e2bbc0
>>>>
>>>> Huh, this looks odd, it's crashing inside the wq setup.
>>>
>>> Enabling KASAN seems to indicate a double-free, which may well be related.
>>
>> Does this help?
> 
> Yes, thanks for the quick patch. Feel free to add:
> 
> Reported-by: Will Deacon 
> Tested-by: Will Deacon 
> 
> if you spin a proper patch.

Great, thanks for reporting/testing.

-- 
Jens Axboe



Re: [PATCH 2/2] arch: add pidfd and io_uring syscalls everywhere

2019-04-03 Thread Will Deacon
On Wed, Apr 03, 2019 at 09:39:52AM -0600, Jens Axboe wrote:
> On 4/3/19 9:19 AM, Will Deacon wrote:
> > On Wed, Apr 03, 2019 at 07:49:26AM -0600, Jens Axboe wrote:
> >> On 4/3/19 5:11 AM, Will Deacon wrote:
> >>> will@autoplooker:~/liburing/test$ ./io_uring_register 
> >>> RELIMIT_MEMLOCK: 67108864 (67108864)
> >>> [   35.477875] Unable to handle kernel NULL pointer dereference at 
> >>> virtual address 0070
> >>> [   35.478969] Mem abort info:
> >>> [   35.479296]   ESR = 0x9604
> >>> [   35.479785]   Exception class = DABT (current EL), IL = 32 bits
> >>> [   35.480528]   SET = 0, FnV = 0
> >>> [   35.480980]   EA = 0, S1PTW = 0
> >>> [   35.481345] Data abort info:
> >>> [   35.481680]   ISV = 0, ISS = 0x0004
> >>> [   35.482267]   CM = 0, WnR = 0
> >>> [   35.482618] user pgtable: 4k pages, 48-bit VAs, pgdp = (ptrval)
> >>> [   35.483486] [0070] pgd=
> >>> [   35.484041] Internal error: Oops: 9604 [#1] PREEMPT SMP
> >>> [   35.484788] Modules linked in:
> >>> [   35.485311] CPU: 113 PID: 3973 Comm: io_uring_regist Not tainted 
> >>> 5.1.0-rc3-00012-g40b114779944 #1
> >>> [   35.486712] Hardware name: linux,dummy-virt (DT)
> >>> [   35.487450] pstate: 2045 (nzCv daif +PAN -UAO)
> >>> [   35.488228] pc : link_pwq+0x10/0x60
> >>> [   35.488794] lr : apply_wqattrs_commit+0xe0/0x118
> >>> [   35.489550] sp : 17e2bbc0
> >>
> >> Huh, this looks odd, it's crashing inside the wq setup.
> > 
> > Enabling KASAN seems to indicate a double-free, which may well be related.
> 
> Does this help?

Yes, thanks for the quick patch. Feel free to add:

Reported-by: Will Deacon 
Tested-by: Will Deacon 

if you spin a proper patch.

Will

> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index bbdbd56cf2ac..07d6ef195d05 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -2215,6 +2215,7 @@ static int io_sqe_files_register(struct io_ring_ctx 
> *ctx, void __user *arg,
>   fput(ctx->user_files[i]);
>  
>   kfree(ctx->user_files);
> + ctx->user_files = NULL;
>   ctx->nr_user_files = 0;
>   return ret;
>   }
> 
> -- 
> Jens Axboe
> 


Re: [PATCH 2/2] arch: add pidfd and io_uring syscalls everywhere

2019-04-03 Thread Jens Axboe
On 4/3/19 9:19 AM, Will Deacon wrote:
> Hi Jens,
> 
> On Wed, Apr 03, 2019 at 07:49:26AM -0600, Jens Axboe wrote:
>> On 4/3/19 5:11 AM, Will Deacon wrote:
>>> will@autoplooker:~/liburing/test$ ./io_uring_register 
>>> RELIMIT_MEMLOCK: 67108864 (67108864)
>>> [   35.477875] Unable to handle kernel NULL pointer dereference at virtual 
>>> address 0070
>>> [   35.478969] Mem abort info:
>>> [   35.479296]   ESR = 0x9604
>>> [   35.479785]   Exception class = DABT (current EL), IL = 32 bits
>>> [   35.480528]   SET = 0, FnV = 0
>>> [   35.480980]   EA = 0, S1PTW = 0
>>> [   35.481345] Data abort info:
>>> [   35.481680]   ISV = 0, ISS = 0x0004
>>> [   35.482267]   CM = 0, WnR = 0
>>> [   35.482618] user pgtable: 4k pages, 48-bit VAs, pgdp = (ptrval)
>>> [   35.483486] [0070] pgd=
>>> [   35.484041] Internal error: Oops: 9604 [#1] PREEMPT SMP
>>> [   35.484788] Modules linked in:
>>> [   35.485311] CPU: 113 PID: 3973 Comm: io_uring_regist Not tainted 
>>> 5.1.0-rc3-00012-g40b114779944 #1
>>> [   35.486712] Hardware name: linux,dummy-virt (DT)
>>> [   35.487450] pstate: 2045 (nzCv daif +PAN -UAO)
>>> [   35.488228] pc : link_pwq+0x10/0x60
>>> [   35.488794] lr : apply_wqattrs_commit+0xe0/0x118
>>> [   35.489550] sp : 17e2bbc0
>>
>> Huh, this looks odd, it's crashing inside the wq setup.
> 
> Enabling KASAN seems to indicate a double-free, which may well be related.

Does this help?


diff --git a/fs/io_uring.c b/fs/io_uring.c
index bbdbd56cf2ac..07d6ef195d05 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2215,6 +2215,7 @@ static int io_sqe_files_register(struct io_ring_ctx *ctx, void __user *arg,
 			fput(ctx->user_files[i]);
 
 		kfree(ctx->user_files);
+		ctx->user_files = NULL;
 		ctx->nr_user_files = 0;
 		return ret;
 	}

-- 
Jens Axboe
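
Why clearing the pointer matters can be shown outside the kernel. The following is a minimal user-space sketch, not the kernel code: struct file_table and the table_* helpers are hypothetical stand-ins for ctx->user_files, the registration error path, and io_sqe_files_unregister(). It assumes the later teardown path skips a table whose pointer is already NULL, so resetting the pointer on the error path keeps teardown from freeing the same allocation a second time.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the file table kept in struct io_ring_ctx. */
struct file_table {
    void **files;        /* models ctx->user_files */
    unsigned nr_files;   /* models ctx->nr_user_files */
};

/* Allocate a table of n slots; returns 0 on success, -1 on failure. */
static int table_register(struct file_table *t, unsigned n)
{
    t->files = calloc(n, sizeof(*t->files));
    if (!t->files)
        return -1;
    t->nr_files = n;
    return 0;
}

/* Models the registration error path after the fix. */
static void table_reset_on_error(struct file_table *t)
{
    free(t->files);
    t->files = NULL;     /* the line the patch adds */
    t->nr_files = 0;
}

/*
 * Models the later teardown. Returns 1 if it freed a table and 0 if
 * there was nothing to free. Without the NULL reset above, this would
 * see a stale pointer and free it again.
 */
static int table_unregister(struct file_table *t)
{
    if (!t->files)
        return 0;
    free(t->files);
    t->files = NULL;
    t->nr_files = 0;
    return 1;
}
```

Calling table_reset_on_error() followed by table_unregister() leaves the second call with nothing to do, which is the behaviour the patch is after.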



Re: [PATCH 2/2] arch: add pidfd and io_uring syscalls everywhere

2019-04-03 Thread Will Deacon
Hi Jens,

On Wed, Apr 03, 2019 at 07:49:26AM -0600, Jens Axboe wrote:
> On 4/3/19 5:11 AM, Will Deacon wrote:
> > will@autoplooker:~/liburing/test$ ./io_uring_register 
> > RELIMIT_MEMLOCK: 67108864 (67108864)
> > [   35.477875] Unable to handle kernel NULL pointer dereference at virtual 
> > address 0070
> > [   35.478969] Mem abort info:
> > [   35.479296]   ESR = 0x9604
> > [   35.479785]   Exception class = DABT (current EL), IL = 32 bits
> > [   35.480528]   SET = 0, FnV = 0
> > [   35.480980]   EA = 0, S1PTW = 0
> > [   35.481345] Data abort info:
> > [   35.481680]   ISV = 0, ISS = 0x0004
> > [   35.482267]   CM = 0, WnR = 0
> > [   35.482618] user pgtable: 4k pages, 48-bit VAs, pgdp = (ptrval)
> > [   35.483486] [0070] pgd=
> > [   35.484041] Internal error: Oops: 9604 [#1] PREEMPT SMP
> > [   35.484788] Modules linked in:
> > [   35.485311] CPU: 113 PID: 3973 Comm: io_uring_regist Not tainted 
> > 5.1.0-rc3-00012-g40b114779944 #1
> > [   35.486712] Hardware name: linux,dummy-virt (DT)
> > [   35.487450] pstate: 2045 (nzCv daif +PAN -UAO)
> > [   35.488228] pc : link_pwq+0x10/0x60
> > [   35.488794] lr : apply_wqattrs_commit+0xe0/0x118
> > [   35.489550] sp : 17e2bbc0
> 
> Huh, this looks odd, it's crashing inside the wq setup.

Enabling KASAN seems to indicate a double-free, which may well be related.

Will

[  149.890370] 
==
[  149.891266] BUG: KASAN: double-free or invalid-free in 
io_sqe_files_unregister+0xa8/0x140
[  149.892218] 
[  149.892411] CPU: 113 PID: 3974 Comm: io_uring_regist Tainted: GB 
5.1.0-rc3-00012-g40b114779944 #3
[  149.893623] Hardware name: linux,dummy-virt (DT)
[  149.894169] Call trace:
[  149.894539]  dump_backtrace+0x0/0x228
[  149.895172]  show_stack+0x14/0x20
[  149.895747]  dump_stack+0xe8/0x124
[  149.896335]  print_address_description+0x60/0x258
[  149.897148]  kasan_report_invalid_free+0x78/0xb8
[  149.897936]  __kasan_slab_free+0x1fc/0x228
[  149.898641]  kasan_slab_free+0x10/0x18
[  149.899283]  kfree+0x70/0x1f8
[  149.899798]  io_sqe_files_unregister+0xa8/0x140
[  149.900574]  io_ring_ctx_wait_and_kill+0x190/0x3c0
[  149.901402]  io_uring_release+0x2c/0x48
[  149.902068]  __fput+0x18c/0x510
[  149.902612]  fput+0xc/0x18
[  149.903146]  task_work_run+0xf0/0x148
[  149.903778]  do_notify_resume+0x554/0x748
[  149.904467]  work_pending+0x8/0x10
[  149.905060] 
[  149.905331] Allocated by task 3974:
[  149.905934]  __kasan_kmalloc.isra.0.part.1+0x48/0xf8
[  149.906786]  __kasan_kmalloc.isra.0+0xb8/0xd8
[  149.907531]  kasan_kmalloc+0xc/0x18
[  149.908134]  __kmalloc+0x168/0x248
[  149.908724]  __arm64_sys_io_uring_register+0x2b8/0x15a8
[  149.909622]  el0_svc_common+0x100/0x258
[  149.910281]  el0_svc_handler+0x48/0xc0
[  149.910928]  el0_svc+0x8/0xc
[  149.911425] 
[  149.911696] Freed by task 3974:
[  149.912242]  __kasan_slab_free+0x114/0x228
[  149.912955]  kasan_slab_free+0x10/0x18
[  149.913602]  kfree+0x70/0x1f8
[  149.914118]  __arm64_sys_io_uring_register+0xc2c/0x15a8
[  149.915009]  el0_svc_common+0x100/0x258
[  149.915670]  el0_svc_handler+0x48/0xc0
[  149.916317]  el0_svc+0x8/0xc
[  149.916817] 
[  149.917101] The buggy address belongs to the object at 8004ce07ed00
[  149.917101]  which belongs to the cache kmalloc-128 of size 128
[  149.919197] The buggy address is located 0 bytes inside of
[  149.919197]  128-byte region [8004ce07ed00, 8004ce07ed80)
[  149.921142] The buggy address belongs to the page:
[  149.921953] page:7e0013381f00 count:1 mapcount:0 
mapping:800503417c00 index:0x0 compound_mapcount: 0
[  149.923595] flags: 0x10010200(slab|head)
[  149.924388] raw: 10010200 dead0100 dead0200 
800503417c00
[  149.925706] raw:  80400040 0001 

[  149.927011] page dumped because: kasan: bad access detected
[  149.927956] 
[  149.928224] Memory state around the buggy address:
[  149.929054]  8004ce07ec00: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc 
fc
[  149.930274]  8004ce07ec80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc 
fc
[  149.931494] >8004ce07ed00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb 
fb
[  149.932712]^
[  149.933281]  8004ce07ed80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc 
fc
[  149.934508]  8004ce07ee00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc 
fc
[  149.935725] 
==


Re: [PATCH 2/2] arch: add pidfd and io_uring syscalls everywhere

2019-04-03 Thread Jens Axboe
On 4/3/19 5:11 AM, Will Deacon wrote:
> Hi Michael,
> 
> On Wed, Apr 03, 2019 at 01:47:50PM +1100, Michael Ellerman wrote:
>> Arnd Bergmann  writes:
>>> diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl 
>>> b/arch/powerpc/kernel/syscalls/syscall.tbl
>>> index b18abb0c3dae..00f5a63c8d9a 100644
>>> --- a/arch/powerpc/kernel/syscalls/syscall.tbl
>>> +++ b/arch/powerpc/kernel/syscalls/syscall.tbl
>>> @@ -505,3 +505,7 @@
>>>  421   32      rt_sigtimedwait_time64        sys_rt_sigtimedwait        compat_sys_rt_sigtimedwait_time64
>>>  422   32      futex_time64                  sys_futex                  sys_futex
>>>  423   32      sched_rr_get_interval_time64  sys_sched_rr_get_interval  sys_sched_rr_get_interval
>>> +424   common  pidfd_send_signal             sys_pidfd_send_signal
>>> +425   common  io_uring_setup                sys_io_uring_setup
>>> +426   common  io_uring_enter                sys_io_uring_enter
>>> +427   common  io_uring_register             sys_io_uring_register
>>
>> Acked-by: Michael Ellerman  (powerpc)
>>
>> Lightly tested.
>>
>> The pidfd_test selftest passes.
> 
> That reports pass for me too, although it fails to unshare the pid ns, which I
> assume is benign.
> 
>> Ran the io_uring example from fio, which prints lots of:
> 
> How did you invoke that? I had a play with the tests in:

It's t/io_uring from the fio repo:

git://git.kernel.dk/fio

and you just run it ala:

# make t/io_uring
# t/io_uring /dev/some_device

>   git://git.kernel.dk/liburing
> 
> but I quickly ran into the kernel oops below.
> 
> Will
> 
> --->8
> 
> will@autoplooker:~/liburing/test$ ./io_uring_register 
> RELIMIT_MEMLOCK: 67108864 (67108864)
> [   35.477875] Unable to handle kernel NULL pointer dereference at virtual 
> address 0070
> [   35.478969] Mem abort info:
> [   35.479296]   ESR = 0x9604
> [   35.479785]   Exception class = DABT (current EL), IL = 32 bits
> [   35.480528]   SET = 0, FnV = 0
> [   35.480980]   EA = 0, S1PTW = 0
> [   35.481345] Data abort info:
> [   35.481680]   ISV = 0, ISS = 0x0004
> [   35.482267]   CM = 0, WnR = 0
> [   35.482618] user pgtable: 4k pages, 48-bit VAs, pgdp = (ptrval)
> [   35.483486] [0070] pgd=
> [   35.484041] Internal error: Oops: 9604 [#1] PREEMPT SMP
> [   35.484788] Modules linked in:
> [   35.485311] CPU: 113 PID: 3973 Comm: io_uring_regist Not tainted 
> 5.1.0-rc3-00012-g40b114779944 #1
> [   35.486712] Hardware name: linux,dummy-virt (DT)
> [   35.487450] pstate: 2045 (nzCv daif +PAN -UAO)
> [   35.488228] pc : link_pwq+0x10/0x60
> [   35.488794] lr : apply_wqattrs_commit+0xe0/0x118
> [   35.489550] sp : 17e2bbc0

Huh, this looks odd, it's crashing inside the wq setup.


-- 
Jens Axboe



Re: [PATCH 2/2] arch: add pidfd and io_uring syscalls everywhere

2019-04-03 Thread Will Deacon
Hi Michael,

On Wed, Apr 03, 2019 at 01:47:50PM +1100, Michael Ellerman wrote:
> Arnd Bergmann  writes:
> > diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl 
> > b/arch/powerpc/kernel/syscalls/syscall.tbl
> > index b18abb0c3dae..00f5a63c8d9a 100644
> > --- a/arch/powerpc/kernel/syscalls/syscall.tbl
> > +++ b/arch/powerpc/kernel/syscalls/syscall.tbl
> > @@ -505,3 +505,7 @@
> >  421   32      rt_sigtimedwait_time64        sys_rt_sigtimedwait        compat_sys_rt_sigtimedwait_time64
> >  422   32      futex_time64                  sys_futex                  sys_futex
> >  423   32      sched_rr_get_interval_time64  sys_sched_rr_get_interval  sys_sched_rr_get_interval
> > +424   common  pidfd_send_signal             sys_pidfd_send_signal
> > +425   common  io_uring_setup                sys_io_uring_setup
> > +426   common  io_uring_enter                sys_io_uring_enter
> > +427   common  io_uring_register             sys_io_uring_register
> 
> Acked-by: Michael Ellerman  (powerpc)
> 
> Lightly tested.
> 
> The pidfd_test selftest passes.

That reports pass for me too, although it fails to unshare the pid ns, which I
assume is benign.

> Ran the io_uring example from fio, which prints lots of:

How did you invoke that? I had a play with the tests in:

  git://git.kernel.dk/liburing

but I quickly ran into the kernel oops below.

Will

--->8

will@autoplooker:~/liburing/test$ ./io_uring_register 
RELIMIT_MEMLOCK: 67108864 (67108864)
[   35.477875] Unable to handle kernel NULL pointer dereference at virtual 
address 0070
[   35.478969] Mem abort info:
[   35.479296]   ESR = 0x9604
[   35.479785]   Exception class = DABT (current EL), IL = 32 bits
[   35.480528]   SET = 0, FnV = 0
[   35.480980]   EA = 0, S1PTW = 0
[   35.481345] Data abort info:
[   35.481680]   ISV = 0, ISS = 0x0004
[   35.482267]   CM = 0, WnR = 0
[   35.482618] user pgtable: 4k pages, 48-bit VAs, pgdp = (ptrval)
[   35.483486] [0070] pgd=
[   35.484041] Internal error: Oops: 9604 [#1] PREEMPT SMP
[   35.484788] Modules linked in:
[   35.485311] CPU: 113 PID: 3973 Comm: io_uring_regist Not tainted 
5.1.0-rc3-00012-g40b114779944 #1
[   35.486712] Hardware name: linux,dummy-virt (DT)
[   35.487450] pstate: 2045 (nzCv daif +PAN -UAO)
[   35.488228] pc : link_pwq+0x10/0x60
[   35.488794] lr : apply_wqattrs_commit+0xe0/0x118
[   35.489550] sp : 17e2bbc0
[   35.490088] x29: 17e2bbc0 x28: 8004b9118000 
[   35.490939] x27:  x26: 8004c21c4200 
[   35.491786] x25: 0004 x24: 1123e1b0 
[   35.492640] x23: 8004c539 x22: 8004bb440500 
[   35.493502] x21: 8004bb440500 x20: 0070 
[   35.494355] x19: 0022 x18:  
[   35.495202] x17:  x16:  
[   35.496054] x15:  x14: 7e0012e8a240 
[   35.496910] x13: 4a73a5e663e2 x12:  
[   35.497764] x11: 0001 x10: 0070 
[   35.498611] x9 : 8004cb49d610 x8 :  
[   35.499462] x7 : 8004c4ff9c70 x6 : 8004cb49ccb0 
[   35.500308] x5 : 8004c66cc4c0 x4 : 0001 
[   35.501173] x3 :  x2 : 0040 
[   35.502019] x1 : 0004 x0 :  
[   35.502872] Process io_uring_regist (pid: 3973, stack limit = 
0x(ptrval))
[   35.504052] Call trace:
[   35.504463]  link_pwq+0x10/0x60
[   35.504987]  apply_wqattrs_commit+0xe0/0x118
[   35.505681]  apply_workqueue_attrs_locked+0x3c/0x80
[   35.506460]  apply_workqueue_attrs+0x3c/0x60
[   35.507152]  alloc_workqueue+0x264/0x430
[   35.507786]  io_uring_setup+0x478/0x6a8
[   35.508414]  __arm64_sys_io_uring_setup+0x18/0x20
[   35.509183]  el0_svc_common+0x80/0xf0
[   35.509786]  el0_svc_handler+0x2c/0x80
[   35.510393]  el0_svc+0x8/0xc
[   35.510873] Code: a9bd7bfd 910003fd a90153f3 9101c014 (f9403802) 
[   35.511843] ---[ end trace 0a53e45ee26def4c ]---
Segmentation fault


Re: [PATCH 2/2] arch: add pidfd and io_uring syscalls everywhere

2019-04-02 Thread Michael Ellerman
Arnd Bergmann  writes:
> diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl 
> b/arch/powerpc/kernel/syscalls/syscall.tbl
> index b18abb0c3dae..00f5a63c8d9a 100644
> --- a/arch/powerpc/kernel/syscalls/syscall.tbl
> +++ b/arch/powerpc/kernel/syscalls/syscall.tbl
> @@ -505,3 +505,7 @@
>  421   32      rt_sigtimedwait_time64        sys_rt_sigtimedwait        compat_sys_rt_sigtimedwait_time64
>  422   32      futex_time64                  sys_futex                  sys_futex
>  423   32      sched_rr_get_interval_time64  sys_sched_rr_get_interval  sys_sched_rr_get_interval
> +424   common  pidfd_send_signal             sys_pidfd_send_signal
> +425   common  io_uring_setup                sys_io_uring_setup
> +426   common  io_uring_enter                sys_io_uring_enter
> +427   common  io_uring_register             sys_io_uring_register

Acked-by: Michael Ellerman  (powerpc)

Lightly tested.

The pidfd_test selftest passes.

Ran the io_uring example from fio, which prints lots of:

IOPS=209952, IOS/call=32/32, inflight=117 (117), Cachehit=0.00%
IOPS=209952, IOS/call=32/32, inflight=116 (116), Cachehit=0.00%
IOPS=209920, IOS/call=32/32, inflight=115 (115), Cachehit=0.00%
IOPS=209952, IOS/call=32/32, inflight=115 (115), Cachehit=0.00%
IOPS=209920, IOS/call=32/32, inflight=115 (115), Cachehit=0.00%
IOPS=209952, IOS/call=32/32, inflight=115 (115), Cachehit=0.00%
IOPS=210016, IOS/call=32/32, inflight=114 (114), Cachehit=0.00%
IOPS=210016, IOS/call=32/32, inflight=113 (113), Cachehit=0.00%
IOPS=210048, IOS/call=32/32, inflight=113 (113), Cachehit=0.00%
IOPS=210016, IOS/call=32/32, inflight=113 (113), Cachehit=0.00%
IOPS=210048, IOS/call=32/32, inflight=112 (112), Cachehit=0.00%
IOPS=210016, IOS/call=32/32, inflight=110 (110), Cachehit=0.00%
IOPS=210048, IOS/call=32/32, inflight=105 (105), Cachehit=0.00%
IOPS=210048, IOS/call=32/32, inflight=104 (104), Cachehit=0.00%
IOPS=210080, IOS/call=32/32, inflight=102 (102), Cachehit=0.00%
IOPS=210112, IOS/call=32/32, inflight=100 (100), Cachehit=0.00%
IOPS=210080, IOS/call=32/32, inflight=97 (97), Cachehit=0.00%
IOPS=210112, IOS/call=32/32, inflight=97 (97), Cachehit=0.00%
IOPS=210112, IOS/call=32/31, inflight=126 (126), Cachehit=0.00%
IOPS=210048, IOS/call=32/32, inflight=126 (126), Cachehit=0.00%
IOPS=210048, IOS/call=32/32, inflight=125 (125), Cachehit=0.00%
IOPS=210016, IOS/call=32/32, inflight=119 (119), Cachehit=0.00%
IOPS=210048, IOS/call=32/32, inflight=117 (117), Cachehit=0.00%
IOPS=210016, IOS/call=32/32, inflight=114 (114), Cachehit=0.00%
IOPS=210048, IOS/call=32/32, inflight=111 (111), Cachehit=0.00%
IOPS=210048, IOS/call=32/32, inflight=108 (108), Cachehit=0.00%
IOPS=210048, IOS/call=32/32, inflight=107 (107), Cachehit=0.00%
IOPS=210048, IOS/call=32/32, inflight=105 (105), Cachehit=0.00%

Which is good I think?


cheers


Re: [PATCH 2/2] arch: add pidfd and io_uring syscalls everywhere

2019-04-02 Thread Michael Ellerman
Arnd Bergmann  writes:
> On Sun, Mar 31, 2019 at 5:47 PM Michael Ellerman  wrote:
>>
>> Arnd Bergmann  writes:
>> > Add the io_uring and pidfd_send_signal system calls to all architectures.
>> >
>> > These system calls are designed to handle both native and compat tasks,
>> > so all entries are the same across architectures, only arm-compat and
>> > the generic table still use an old format.
>> >
>> > Signed-off-by: Arnd Bergmann 
>> > ---
>> >  arch/alpha/kernel/syscalls/syscall.tbl  | 4 
>> >  arch/arm/tools/syscall.tbl  | 4 
>> >  arch/arm64/include/asm/unistd.h | 2 +-
>> >  arch/arm64/include/asm/unistd32.h   | 8 
>> >  arch/ia64/kernel/syscalls/syscall.tbl   | 4 
>> >  arch/m68k/kernel/syscalls/syscall.tbl   | 4 
>> >  arch/microblaze/kernel/syscalls/syscall.tbl | 4 
>> >  arch/mips/kernel/syscalls/syscall_n32.tbl   | 4 
>> >  arch/mips/kernel/syscalls/syscall_n64.tbl   | 4 
>> >  arch/mips/kernel/syscalls/syscall_o32.tbl   | 4 
>> >  arch/parisc/kernel/syscalls/syscall.tbl | 4 
>> >  arch/powerpc/kernel/syscalls/syscall.tbl| 4 
>>
>> Have you done any testing?
>>
>> I'd rather not wire up syscalls that have never been tested at all on
>> powerpc.
>
> No, I have not. I did review the system calls carefully and added the first
> patch to fix the bug on x86 compat mode before adding the same bug
> on the other compat architectures though ;-)
>
> Generally, my feeling is that adding system calls is not fundamentally
> different from adding other ABIs, and we should really do it at
> the same time across all architectures, rather than waiting for each
> maintainer to get around to reviewing and testing the new calls
> first. This is not a problem on powerpc, but a lot of other architectures
> are less active, which is how we have always ended up with
> different sets of system calls across architectures.

Well it's still something of a problem on powerpc. No one has
volunteered to test io_uring on powerpc, so at this stage it will go in
completely untested.

If there was a selftest in the tree I'd be a bit happier, because at
least then our CI would start testing it as soon as the syscalls were
wired up in linux-next.

And yeah obviously I should test it, but I don't have infinite time
unfortunately.

> The problem here is that this makes it harder for the C library to
> know when a system call is guaranteed to be available. glibc
> still needs a feature test for newly added syscalls to see if they
> are working (they might be backported to an older kernel, or
> disabled), but whenever the minimum kernel version is increased,
> it makes sense to drop those checks and assume non-optional
> system calls will work if they were part of that minimum version.

But that's the thing, if we just wire them up untested they may not
actually work. And then you have the far worse situation where the
syscall exists in kernel version x but does not actually work properly.

See the mess we have with pkeys for example.

> In the future, I'd hope that any new system calls get added
> right away on all architectures when they land (it was a bit
> tricky this time, because I still did a bunch of reworks that
> conflicted with the new calls). Bugs will happen of course, but
> I think adding them sooner makes it more likely to catch those
> bugs early on so we have a chance to fix them properly,
> and need fewer arch specific workarounds (ideally none)
> for system calls.

For syscalls that have a selftest in the tree, and don't rely on
anything arch specific I agree.

I'm a bit more wary of things that are not easily tested and have the
potential to work differently across arches.

cheers


Re: [PATCH 2/2] arch: add pidfd and io_uring syscalls everywhere

2019-04-01 Thread Geert Uytterhoeven
On Mon, Mar 25, 2019 at 3:48 PM Arnd Bergmann  wrote:
> Add the io_uring and pidfd_send_signal system calls to all architectures.
>
> These system calls are designed to handle both native and compat tasks,
> so all entries are the same across architectures, only arm-compat and
> the generic table still use an old format.
>
> Signed-off-by: Arnd Bergmann 

>  arch/m68k/kernel/syscalls/syscall.tbl   | 4 

Acked-by: Geert Uytterhoeven 

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH 2/2] arch: add pidfd and io_uring syscalls everywhere

2019-03-31 Thread Arnd Bergmann
On Sun, Mar 31, 2019 at 5:47 PM Michael Ellerman  wrote:
>
> Arnd Bergmann  writes:
> > Add the io_uring and pidfd_send_signal system calls to all architectures.
> >
> > These system calls are designed to handle both native and compat tasks,
> > so all entries are the same across architectures, only arm-compat and
> > the generic table still use an old format.
> >
> > Signed-off-by: Arnd Bergmann 
> > ---
> >  arch/alpha/kernel/syscalls/syscall.tbl  | 4 
> >  arch/arm/tools/syscall.tbl  | 4 
> >  arch/arm64/include/asm/unistd.h | 2 +-
> >  arch/arm64/include/asm/unistd32.h   | 8 
> >  arch/ia64/kernel/syscalls/syscall.tbl   | 4 
> >  arch/m68k/kernel/syscalls/syscall.tbl   | 4 
> >  arch/microblaze/kernel/syscalls/syscall.tbl | 4 
> >  arch/mips/kernel/syscalls/syscall_n32.tbl   | 4 
> >  arch/mips/kernel/syscalls/syscall_n64.tbl   | 4 
> >  arch/mips/kernel/syscalls/syscall_o32.tbl   | 4 
> >  arch/parisc/kernel/syscalls/syscall.tbl | 4 
> >  arch/powerpc/kernel/syscalls/syscall.tbl| 4 
>
> Have you done any testing?
>
> I'd rather not wire up syscalls that have never been tested at all on
> powerpc.

No, I have not. I did review the system calls carefully and added the first
patch to fix the bug on x86 compat mode before adding the same bug
on the other compat architectures though ;-)

Generally, my feeling is that adding system calls is not fundamentally
different from adding other ABIs, and we should really do it at
the same time across all architectures, rather than waiting for each
maintainer to get around to reviewing and testing the new calls
first. This is not a problem on powerpc, but a lot of other architectures
are less active, which is how we have always ended up with
different sets of system calls across architectures.

The problem here is that this makes it harder for the C library to
know when a system call is guaranteed to be available. glibc
still needs a feature test for newly added syscalls to see if they
are working (they might be backported to an older kernel, or
disabled), but whenever the minimum kernel version is increased,
it makes sense to drop those checks and assume non-optional
system calls will work if they were part of that minimum version.

In the future, I'd hope that any new system calls get added
right away on all architectures when they land (it was a bit
tricky this time, because I still did a bunch of reworks that
conflicted with the new calls). Bugs will happen of course, but
I think adding them sooner makes it more likely to catch those
bugs early on so we have a chance to fix them properly,
and need fewer arch specific workarounds (ideally none)
for system calls.

   Arnd
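
The feature test mentioned above can be sketched roughly as follows. This is an illustration only, not glibc's implementation: syscall_is_wired_up() and NR_IO_URING_SETUP are hypothetical names, the number 425 is taken from the table entries in this patch, and the probe assumes that calling the syscall with all-zero arguments fails harmlessly when the call does exist. Any error other than ENOSYS (for example EINVAL, EFAULT or EPERM) is treated as "the kernel knows this number".

```c
#define _GNU_SOURCE
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Number assigned to io_uring_setup in the tables touched by this patch. */
#define NR_IO_URING_SETUP 425L

/*
 * Returns 0 if the kernel reports ENOSYS for syscall number nr and 1
 * otherwise. All arguments are passed as zero, so this is only suitable
 * for calls that reject such input without side effects (an assumption
 * the caller has to check per syscall).
 */
static int syscall_is_wired_up(long nr)
{
    errno = 0;
    long ret = syscall(nr, 0L, 0L, 0L, 0L, 0L, 0L);
    return !(ret == -1 && errno == ENOSYS);
}
```

A caller would use syscall_is_wired_up(NR_IO_URING_SETUP) once at startup and fall back to another I/O path when it returns 0. As the replies in this thread point out, a syscall that is wired up is not necessarily one that works, so a probe like this says nothing about correctness.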


Re: [PATCH 2/2] arch: add pidfd and io_uring syscalls everywhere

2019-03-31 Thread Michael Ellerman
Arnd Bergmann  writes:
> Add the io_uring and pidfd_send_signal system calls to all architectures.
>
> These system calls are designed to handle both native and compat tasks,
> so all entries are the same across architectures, only arm-compat and
> the generic table still use an old format.
>
> Signed-off-by: Arnd Bergmann 
> ---
>  arch/alpha/kernel/syscalls/syscall.tbl  | 4 
>  arch/arm/tools/syscall.tbl  | 4 
>  arch/arm64/include/asm/unistd.h | 2 +-
>  arch/arm64/include/asm/unistd32.h   | 8 
>  arch/ia64/kernel/syscalls/syscall.tbl   | 4 
>  arch/m68k/kernel/syscalls/syscall.tbl   | 4 
>  arch/microblaze/kernel/syscalls/syscall.tbl | 4 
>  arch/mips/kernel/syscalls/syscall_n32.tbl   | 4 
>  arch/mips/kernel/syscalls/syscall_n64.tbl   | 4 
>  arch/mips/kernel/syscalls/syscall_o32.tbl   | 4 
>  arch/parisc/kernel/syscalls/syscall.tbl | 4 
>  arch/powerpc/kernel/syscalls/syscall.tbl| 4 

Have you done any testing?

I'd rather not wire up syscalls that have never been tested at all on
powerpc.

cheers


Re: [PATCH 2/2] arch: add pidfd and io_uring syscalls everywhere

2019-03-30 Thread Heiko Carstens
On Mon, Mar 25, 2019 at 03:47:37PM +0100, Arnd Bergmann wrote:
> Add the io_uring and pidfd_send_signal system calls to all architectures.
> 
> These system calls are designed to handle both native and compat tasks,
> so all entries are the same across architectures, only arm-compat and
> the generic table still use an old format.
> 
> Signed-off-by: Arnd Bergmann 

> diff --git a/arch/s390/kernel/syscalls/syscall.tbl 
> b/arch/s390/kernel/syscalls/syscall.tbl
> index 02579f95f391..3eb56e639b96 100644
> --- a/arch/s390/kernel/syscalls/syscall.tbl
> +++ b/arch/s390/kernel/syscalls/syscall.tbl
> @@ -426,3 +426,7 @@
>  421   32      rt_sigtimedwait_time64        -                      compat_sys_rt_sigtimedwait_time64
>  422   32      futex_time64                  -                      sys_futex
>  423   32      sched_rr_get_interval_time64  -                      sys_sched_rr_get_interval
> +424   common  pidfd_send_signal             sys_pidfd_send_signal
> +425   common  io_uring_setup                sys_io_uring_setup
> +426   common  io_uring_enter                sys_io_uring_enter
> +427   common  io_uring_register             sys_io_uring_register

I was just about to write that io_uring_enter is missing compat
handling, but your first patch actually fixes that. Would have been
good to be cc'ed on both patches :)

For s390:
Acked-by: Heiko Carstens 



Re: [PATCH 2/2] arch: add pidfd and io_uring syscalls everywhere

2019-03-26 Thread Arnd Bergmann
On Mon, Mar 25, 2019 at 6:37 PM Paul Burton  wrote:
> On Mon, Mar 25, 2019 at 03:47:37PM +0100, Arnd Bergmann wrote:
> > Add the io_uring and pidfd_send_signal system calls to all architectures.
> >
> > These system calls are designed to handle both native and compat tasks,
> > so all entries are the same across architectures; only arm-compat and
> > the generic table still use an old format.
> >
> > Signed-off-by: Arnd Bergmann 
> > ---
> >%
> > diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl
> > index c85502e67b44..c4a49f7d57bb 100644
> > --- a/arch/mips/kernel/syscalls/syscall_n64.tbl
> > +++ b/arch/mips/kernel/syscalls/syscall_n64.tbl
> > @@ -338,3 +338,7 @@
> >  327  n64     rseq               sys_rseq
> >  328  n64     io_pgetevents      sys_io_pgetevents
> >  # 329 through 423 are reserved to sync up with other architectures
> > +424  common  pidfd_send_signal  sys_pidfd_send_signal
> > +425  common  io_uring_setup     sys_io_uring_setup
> > +426  common  io_uring_enter     sys_io_uring_enter
> > +427  common  io_uring_register  sys_io_uring_register
>
> Shouldn't these declare the ABI as "n64"?
>
> I don't see anywhere that it would actually change the generated code,
> but a comment at the top of the file says that every entry should use
> "n64" and so far they all do. Did you have something else in mind here?

You are right, the use of 'common' here is unintentional but harmless,
and I should have used 'n64' here.

We may decide to do things differently in the future, i.e. we could
have just a single global file for newly added system calls once
it turns out that the tables are consistent across all architectures,
but I'd probably go on with the separate identical entries for a bit
before changing that.
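
To illustrate why the ABI column is harmless here, the following is a
rough sketch of a table-to-header step. It is not the kernel's actual
generator script: the struct tbl_entry type and the parse_tbl_line and
emit_define helpers are hypothetical names, and the sketch assumes a
generator that never looks at the ABI field when emitting a define, so
a "common" row and an "n64" row produce the same line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One row of a syscall.tbl-style file: "<number> <abi> <name> <entry point>". */
struct tbl_entry {
    int  nr;
    char abi[16];
    char name[64];
    char entry[64];
};

/*
 * Parse one table line. Returns 1 on success, 0 for blank lines,
 * comment lines starting with '#', and rows with fewer than four fields.
 */
int parse_tbl_line(const char *line, struct tbl_entry *e)
{
    assert(line != NULL && e != NULL);
    while (*line == ' ' || *line == '\t')
        line++;
    if (*line == '\0' || *line == '\n' || *line == '#')
        return 0;
    return sscanf(line, "%d %15s %63s %63s",
                  &e->nr, e->abi, e->name, e->entry) == 4;
}

/*
 * Hypothetical header generator: writes "#define __NR_<name> <nr>".
 * It never reads e->abi, so the ABI column does not affect the output.
 */
int emit_define(const struct tbl_entry *e, char *buf, size_t len)
{
    return snprintf(buf, len, "#define __NR_%s %d", e->name, e->nr);
}
```

Feeding it the 424 row once with "common" and once with "n64" gives
identical defines under that assumption.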

 Arnd


Re: [PATCH 2/2] arch: add pidfd and io_uring syscalls everywhere

2019-03-25 Thread Paul Burton
Hi Arnd,

On Mon, Mar 25, 2019 at 03:47:37PM +0100, Arnd Bergmann wrote:
> Add the io_uring and pidfd_send_signal system calls to all architectures.
> 
> These system calls are designed to handle both native and compat tasks,
> so all entries are the same across architectures; only arm-compat and
> the generic table still use an old format.
> 
> Signed-off-by: Arnd Bergmann 
> ---
>%
> diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl
> index c85502e67b44..c4a49f7d57bb 100644
> --- a/arch/mips/kernel/syscalls/syscall_n64.tbl
> +++ b/arch/mips/kernel/syscalls/syscall_n64.tbl
> @@ -338,3 +338,7 @@
>  327  n64     rseq               sys_rseq
>  328  n64     io_pgetevents      sys_io_pgetevents
>  # 329 through 423 are reserved to sync up with other architectures
> +424  common  pidfd_send_signal  sys_pidfd_send_signal
> +425  common  io_uring_setup     sys_io_uring_setup
> +426  common  io_uring_enter     sys_io_uring_enter
> +427  common  io_uring_register  sys_io_uring_register

Shouldn't these declare the ABI as "n64"?

I don't see anywhere that it would actually change the generated code,
but a comment at the top of the file says that every entry should use
"n64" and so far they all do. Did you have something else in mind here?

Thanks,
Paul


[PATCH 2/2] arch: add pidfd and io_uring syscalls everywhere

2019-03-25 Thread Arnd Bergmann
Add the io_uring and pidfd_send_signal system calls to all architectures.

These system calls are designed to handle both native and compat tasks,
so all entries are the same across architectures; only arm-compat and
the generic table still use an old format.

Signed-off-by: Arnd Bergmann 
---
 arch/alpha/kernel/syscalls/syscall.tbl      | 4 ++++
 arch/arm/tools/syscall.tbl                  | 4 ++++
 arch/arm64/include/asm/unistd.h             | 2 +-
 arch/arm64/include/asm/unistd32.h           | 8 ++++++++
 arch/ia64/kernel/syscalls/syscall.tbl       | 4 ++++
 arch/m68k/kernel/syscalls/syscall.tbl       | 4 ++++
 arch/microblaze/kernel/syscalls/syscall.tbl | 4 ++++
 arch/mips/kernel/syscalls/syscall_n32.tbl   | 4 ++++
 arch/mips/kernel/syscalls/syscall_n64.tbl   | 4 ++++
 arch/mips/kernel/syscalls/syscall_o32.tbl   | 4 ++++
 arch/parisc/kernel/syscalls/syscall.tbl     | 4 ++++
 arch/powerpc/kernel/syscalls/syscall.tbl    | 4 ++++
 arch/s390/kernel/syscalls/syscall.tbl       | 4 ++++
 arch/sh/kernel/syscalls/syscall.tbl         | 4 ++++
 arch/sparc/kernel/syscalls/syscall.tbl      | 4 ++++
 arch/xtensa/kernel/syscalls/syscall.tbl     | 4 ++++
 16 files changed, 65 insertions(+), 1 deletion(-)

diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index 63ed39cbd3bd..165f268beafc 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -463,3 +463,7 @@
 532  common  getppid            sys_getppid
 # all other architectures have common numbers for new syscall, alpha
 # is the exception.
+534  common  pidfd_send_signal  sys_pidfd_send_signal
+535  common  io_uring_setup     sys_io_uring_setup
+536  common  io_uring_enter     sys_io_uring_enter
+537  common  io_uring_register  sys_io_uring_register
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 9016f4081bb9..0393917eaa57 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -437,3 +437,7 @@
 421  common  rt_sigtimedwait_time64        sys_rt_sigtimedwait
 422  common  futex_time64                  sys_futex
 423  common  sched_rr_get_interval_time64  sys_sched_rr_get_interval
+424  common  pidfd_send_signal             sys_pidfd_send_signal
+425  common  io_uring_setup                sys_io_uring_setup
+426  common  io_uring_enter                sys_io_uring_enter
+427  common  io_uring_register             sys_io_uring_register
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index 310d8f1cae7a..c6946fe640e6 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -49,7 +49,7 @@
 #define __ARM_NR_compat_set_tls  (__ARM_NR_COMPAT_BASE + 5)
 #define __ARM_NR_COMPAT_END      (__ARM_NR_COMPAT_BASE + 0x800)
 
-#define __NR_compat_syscalls   424
+#define __NR_compat_syscalls   428
 #endif
 
 #define __ARCH_WANT_SYS_CLONE
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index 5590f2623690..23f1a44acada 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -866,6 +866,14 @@ __SYSCALL(__NR_rt_sigtimedwait_time64, compat_sys_rt_sigtimedwait_time64)
 __SYSCALL(__NR_futex_time64, sys_futex)
 #define __NR_sched_rr_get_interval_time64 423
 __SYSCALL(__NR_sched_rr_get_interval_time64, sys_sched_rr_get_interval)
+#define __NR_pidfd_send_signal 424
+__SYSCALL(__NR_pidfd_send_signal, sys_pidfd_send_signal)
+#define __NR_io_uring_setup 425
+__SYSCALL(__NR_io_uring_setup, sys_io_uring_setup)
+#define __NR_io_uring_enter 426
+__SYSCALL(__NR_io_uring_enter, sys_io_uring_enter)
+#define __NR_io_uring_register 427
+__SYSCALL(__NR_io_uring_register, sys_io_uring_register)
 
 /*
  * Please add new compat syscalls above this comment and update
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index ab9cda5f6136..56e3d0b685e1 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -344,3 +344,7 @@
 332  common  pkey_free          sys_pkey_free
 333  common  rseq               sys_rseq
 # 334 through 423 are reserved to sync up with other architectures
+424  common  pidfd_send_signal  sys_pidfd_send_signal
+425  common  io_uring_setup     sys_io_uring_setup
+426  common  io_uring_enter     sys_io_uring_enter
+427  common  io_uring_register  sys_io_uring_register
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index 125c14178979..df4ec3ec71d1 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -423,3 +423,7 @@
 421  common  rt_sigtimedwait_time64  sys_rt_sigtimedwait
 422  common  futex_time64            sys_futex