Re: [PATCH] fcntl: make F_GETOWN(EX) return 0 on dead owner task

2021-02-08 Thread Jeff Layton
On Mon, 2021-02-08 at 15:57 +0300, Pavel Tikhomirov wrote:
> 
> On 2/8/21 3:31 PM, Jeff Layton wrote:
> > On Thu, 2021-02-04 at 01:17 +0300, Cyrill Gorcunov wrote:
> > > On Thu, Feb 04, 2021 at 12:35:42AM +0300, Pavel Tikhomirov wrote:
> > > > 
> > > > AFAICS if pid is held only by 1) fowner refcount and by 2) single process
> > > > (without threads, group and session for simplicity), on process exit we go
> > > > through:
> > > > 
> > > > do_exit
> > > >   exit_notify
> > > >     release_task
> > > >       __exit_signal
> > > >         __unhash_process
> > > >           detach_pid
> > > >             __change_pid
> > > >               free_pid
> > > >                 idr_remove
> > > > 
> > > > So pid is removed from idr, and after that alloc_pid can reuse pid numbers
> > > > even if old pid structure is still alive and is still held by fowner.
> > > ...
> > > > Hope this answers your question, Thanks!
> > > 
> > > Yeah, indeed, thanks! So the change is sane; still, I'm
> > > a bit worried about backward compatibility. Gimme some
> > > time, I'll try to refresh my memory first, in a couple
> > > of days or over the weekend (though there are a number of
> > > experienced developers CC'ed, maybe they'll reply even faster).
> > 
> > I always find it helpful to refer to the POSIX spec [1] for this sort of
> > thing. In this case, it says:
> > 
> > F_GETOWN
> >  If fildes refers to a socket, get the process ID or process group ID
> > specified to receive SIGURG signals when out-of-band data is available.
> > Positive values shall indicate a process ID; negative values, other than
> > -1, shall indicate a process group ID; the value zero shall indicate
> > that no SIGURG signals are to be sent. If fildes does not refer to a
> > socket, the results are unspecified.
> > 
> > In the event that the PID is reused, the kernel won't send signals to
> > the replacement task, correct?
> 
> Correct. It looks like the only places that send a signal to the owner are
> send_sigio() and send_sigurg() (at least nobody else dereferences
> fown->pid_type). In both places we look up the task to signal with
> pid_task() or do_each_pid_task() (similar to what I do in the patch) and
> will not find any task if the pid was reused. Thus no signal would be sent.
> 

Thanks for confirming it. I queued it up for linux-next (with a small
amendment to the changelog), and it should be there later today or
tomorrow. It might not hurt to roll up a manpage patch too if you have
the cycles. It'd be nice to spell out what a 0 return means there.

> > Assuming that's the case, then this patch
> > looks fine to me too. I'll plan to pick it for linux-next later today,
> > and we can hopefully get this into v5.12.
> > 
> > [1]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/fcntl.html#tag_16_122
> > 
> 
> Thanks for finding it!
> 

No problem. That site is worth bookmarking for this sort of thing... ;)
-- 
Jeff Layton 



Re: [PATCH] fcntl: make F_GETOWN(EX) return 0 on dead owner task

2021-02-08 Thread Pavel Tikhomirov




On 2/8/21 3:31 PM, Jeff Layton wrote:
> On Thu, 2021-02-04 at 01:17 +0300, Cyrill Gorcunov wrote:
> > On Thu, Feb 04, 2021 at 12:35:42AM +0300, Pavel Tikhomirov wrote:
> > > 
> > > AFAICS if pid is held only by 1) fowner refcount and by 2) single process
> > > (without threads, group and session for simplicity), on process exit we go
> > > through:
> > > 
> > > do_exit
> > >   exit_notify
> > >     release_task
> > >       __exit_signal
> > >         __unhash_process
> > >           detach_pid
> > >             __change_pid
> > >               free_pid
> > >                 idr_remove
> > > 
> > > So pid is removed from idr, and after that alloc_pid can reuse pid numbers
> > > even if old pid structure is still alive and is still held by fowner.
> > ...
> > > Hope this answers your question, Thanks!
> > 
> > Yeah, indeed, thanks! So the change is sane; still, I'm
> > a bit worried about backward compatibility. Gimme some
> > time, I'll try to refresh my memory first, in a couple
> > of days or over the weekend (though there are a number of
> > experienced developers CC'ed, maybe they'll reply even faster).
> 
> I always find it helpful to refer to the POSIX spec [1] for this sort of
> thing. In this case, it says:
> 
> F_GETOWN
> If fildes refers to a socket, get the process ID or process group ID
> specified to receive SIGURG signals when out-of-band data is available.
> Positive values shall indicate a process ID; negative values, other than
> -1, shall indicate a process group ID; the value zero shall indicate
> that no SIGURG signals are to be sent. If fildes does not refer to a
> socket, the results are unspecified.
> 
> In the event that the PID is reused, the kernel won't send signals to
> the replacement task, correct?

Correct. Looks like only places to send signal to owner are send_sigio() 
and send_sigurg() (at least nobody else dereferences fown->pid_type). 
And in both places we look up the task to send the signal to with pid_task() 
or do_each_pid_task() (similar to what I do in patch) and will not find 
any task if pid was reused. Thus no signal would be sent.
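
To make the lifetime argument concrete, here is a toy user-space model in C. It is not kernel code: every toy_* name is invented for illustration, the "idr" is a plain array, and the allocator picks the lowest free number, whereas the kernel hands out pid numbers cyclically. It only sketches the point discussed in this thread: the number can be reused while the old pid structure is kept alive by the fowner reference, and a task lookup through the old structure then finds nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_PID_MAX 8

struct toy_task { int unused; };

struct toy_pid {
    int nr;                 /* numeric pid */
    int refcount;           /* references held by the task, an fowner, ... */
    struct toy_task *task;  /* attached task, NULL once detached */
};

/* Stand-in for the idr: maps a pid number to its structure. */
static struct toy_pid *toy_idr[TOY_PID_MAX];

/* Allocate the lowest free number (simplification, see lead-in). */
struct toy_pid *toy_alloc_pid(struct toy_task *task)
{
    for (int nr = 1; nr < TOY_PID_MAX; nr++) {
        if (toy_idr[nr] == NULL) {
            struct toy_pid *p = calloc(1, sizeof(*p));
            if (p == NULL)
                return NULL;
            p->nr = nr;
            p->refcount = 1;        /* reference held by the task itself */
            p->task = task;
            toy_idr[nr] = p;
            return p;
        }
    }
    return NULL;
}

/* Take an extra reference, as an fowner would. */
struct toy_pid *toy_get_pid(struct toy_pid *p)
{
    assert(p->refcount > 0);
    p->refcount++;
    return p;
}

/* Drop a reference; the structure is freed only with the last one. */
void toy_put_pid(struct toy_pid *p)
{
    assert(p->refcount > 0);
    if (--p->refcount == 0)
        free(p);
}

/* Process exit: detach the task and make the number reusable. */
void toy_exit_task(struct toy_pid *p)
{
    p->task = NULL;         /* corresponds to detaching the pid */
    toy_idr[p->nr] = NULL;  /* corresponds to the idr removal */
    toy_put_pid(p);         /* drop the task's own reference */
}

/* Lookup used before signalling: NULL if no task is attached. */
struct toy_task *toy_pid_task(const struct toy_pid *p)
{
    return p->task;
}

/* Owner query in the spirit of the patch: 0 when the owner is dead. */
int toy_f_getown(const struct toy_pid *owner)
{
    return toy_pid_task(owner) != NULL ? owner->nr : 0;
}
```

In this model, after the first task exits, a second toy_alloc_pid() gets the same number back in a new structure, while the structure held by the owner reference reports no task.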



> Assuming that's the case, then this patch
> looks fine to me too. I'll plan to pick it for linux-next later today,
> and we can hopefully get this into v5.12.
> 
> [1]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/fcntl.html#tag_16_122



Thanks for finding it!

--
Best regards, Tikhomirov Pavel
Software Developer, Virtuozzo.


Re: [PATCH] fcntl: make F_GETOWN(EX) return 0 on dead owner task

2021-02-08 Thread Jeff Layton
On Thu, 2021-02-04 at 01:17 +0300, Cyrill Gorcunov wrote:
> On Thu, Feb 04, 2021 at 12:35:42AM +0300, Pavel Tikhomirov wrote:
> > 
> > AFAICS if pid is held only by 1) fowner refcount and by 2) single process
> > (without threads, group and session for simplicity), on process exit we go
> > through:
> > 
> > do_exit
> >   exit_notify
> >     release_task
> >       __exit_signal
> >         __unhash_process
> >           detach_pid
> >             __change_pid
> >               free_pid
> >                 idr_remove
> > 
> > So pid is removed from idr, and after that alloc_pid can reuse pid numbers
> > even if old pid structure is still alive and is still held by fowner.
> ...
> > Hope this answers your question, Thanks!
> 
> Yeah, indeed, thanks! So the change is sane; still, I'm
> a bit worried about backward compatibility. Gimme some
> time, I'll try to refresh my memory first, in a couple
> of days or over the weekend (though there are a number of
> experienced developers CC'ed, maybe they'll reply even faster).

I always find it helpful to refer to the POSIX spec [1] for this sort of
thing. In this case, it says:

F_GETOWN
If fildes refers to a socket, get the process ID or process group ID
specified to receive SIGURG signals when out-of-band data is available.
Positive values shall indicate a process ID; negative values, other than
-1, shall indicate a process group ID; the value zero shall indicate
that no SIGURG signals are to be sent. If fildes does not refer to a
socket, the results are unspecified.
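
As a small illustration of the value ranges in that paragraph, the return value could be decoded in user space roughly like this. This is only a sketch in C; the enum and the function name are made up for the example and are not part of any API.

```c
/* Hypothetical names, used only for this illustration. */
enum fown_kind {
    FOWN_ERROR,    /* -1: the fcntl() call itself failed */
    FOWN_NONE,     /*  0: no signals are to be sent */
    FOWN_PROCESS,  /* >0: value is a process ID */
    FOWN_PGRP      /* <-1: negated value is a process group ID */
};

/* Classify the value returned by fcntl(fd, F_GETOWN). */
enum fown_kind classify_getown(int ret)
{
    if (ret == -1)
        return FOWN_ERROR;
    if (ret == 0)
        return FOWN_NONE;
    return ret > 0 ? FOWN_PROCESS : FOWN_PGRP;
}
```

With the proposed change, a dead owner would fall into the FOWN_NONE case.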

In the event that the PID is reused, the kernel won't send signals to
the replacement task, correct? Assuming that's the case, then this patch
looks fine to me too. I'll plan to pick it for linux-next later today,
and we can hopefully get this into v5.12.

[1]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/fcntl.html#tag_16_122
-- 
Jeff Layton 



Re: [PATCH] fcntl: make F_GETOWN(EX) return 0 on dead owner task

2021-02-07 Thread Cyrill Gorcunov
On Wed, Feb 03, 2021 at 03:41:56PM +0300, Pavel Tikhomirov wrote:
> Currently there is no way to distinguish a file with a live owner
> from a file whose owner is dead but whose pid has been reused. That's
> why CRIU can't actually know whether it needs to restore the file
> owner or not: if it restores the owner but the actual owner was dead,
> this can introduce unexpected signals to the "false" owner (which
> reused the pid).
> 
> Let's change the api, so that F_GETOWN(EX) returns 0 in case actual
> owner is dead already.
> 
> Cc: Jeff Layton 
> Cc: "J. Bruce Fields" 
> Cc: Alexander Viro 
> Cc: linux-fsde...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: Cyrill Gorcunov 
> Cc: Andrei Vagin 
> Signed-off-by: Pavel Tikhomirov 

I can't imagine a scenario where we could break some backward
compatibility with this change, so

Reviewed-by: Cyrill Gorcunov 


Re: [PATCH] fcntl: make F_GETOWN(EX) return 0 on dead owner task

2021-02-03 Thread Cyrill Gorcunov
On Thu, Feb 04, 2021 at 12:35:42AM +0300, Pavel Tikhomirov wrote:
> 
> AFAICS if pid is held only by 1) fowner refcount and by 2) single process
> (without threads, group and session for simplicity), on process exit we go
> through:
> 
> do_exit
>   exit_notify
>     release_task
>       __exit_signal
>         __unhash_process
>           detach_pid
>             __change_pid
>               free_pid
>                 idr_remove
> 
> So pid is removed from idr, and after that alloc_pid can reuse pid numbers
> even if old pid structure is still alive and is still held by fowner.
...
> Hope this answers your question, Thanks!

Yeah, indeed, thanks! So the change is sane; still, I'm
a bit worried about backward compatibility. Gimme some
time, I'll try to refresh my memory first, in a couple
of days or over the weekend (though there are a number of
experienced developers CC'ed, maybe they'll reply even faster).


Re: [PATCH] fcntl: make F_GETOWN(EX) return 0 on dead owner task

2021-02-03 Thread Pavel Tikhomirov

On 2/3/21 10:32 PM, Cyrill Gorcunov wrote:
> On Wed, Feb 03, 2021 at 03:41:56PM +0300, Pavel Tikhomirov wrote:
> > Currently there is no way to distinguish a file with a live owner
> > from a file whose owner is dead but whose pid has been reused. That's
> > why CRIU can't actually know whether it needs to restore the file
> > owner or not: if it restores the owner but the actual owner was dead,
> > this can introduce unexpected signals to the "false" owner (which
> > reused the pid).
> 
> Hi! Thanks for the patch. You know, I managed to forget the fowner internals.
> Could you please enlighten me -- when owner is set with some pid we do
> 
> f_setown_ex
>   __f_setown
>     f_modown
>       filp->f_owner.pid = get_pid(pid);
> 
> Thus the pid gets its refcount incremented.

Hi, and yes, you are right that the refcount is held.

> Then the owner exits but the refcounter
> should be still up and running and pid should not be reused, no? Or
> am I missing something obvious?

AFAICS if pid is held only by 1) fowner refcount and by 2) single
process (without threads, group and session for simplicity), on process
exit we go through:

do_exit
  exit_notify
    release_task
      __exit_signal
        __unhash_process
          detach_pid
            __change_pid
              free_pid
                idr_remove

So pid is removed from idr, and after that alloc_pid can reuse pid
numbers even if old pid structure is still alive and is still held by
fowner.


Also I've added criu-zdtm test which reproduces the problem:

https://src.openvz.org/projects/OVZ/repos/criu/commits/e25904c35dbc535f6837e55da58ca0f5a5caf4b3#test/zdtm/static/file_fown_reuse.c

Hope this answers your question, Thanks!



> The patch itself looks ok on a first glance.



--
Best regards, Tikhomirov Pavel
Software Developer, Virtuozzo.


Re: [PATCH] fcntl: make F_GETOWN(EX) return 0 on dead owner task

2021-02-03 Thread Cyrill Gorcunov
On Wed, Feb 03, 2021 at 03:41:56PM +0300, Pavel Tikhomirov wrote:
> Currently there is no way to distinguish a file with a live owner
> from a file whose owner is dead but whose pid has been reused. That's
> why CRIU can't actually know whether it needs to restore the file
> owner or not: if it restores the owner but the actual owner was dead,
> this can introduce unexpected signals to the "false" owner (which
> reused the pid).

Hi! Thanks for the patch. You know, I managed to forget the fowner internals.
Could you please enlighten me -- when owner is set with some pid we do

f_setown_ex
  __f_setown
    f_modown
      filp->f_owner.pid = get_pid(pid);

Thus the pid gets its refcount incremented. Then the owner exits but the
refcounter should be still up and running and pid should not be reused, no?
Or am I missing something obvious?

The patch itself looks ok on a first glance.