On Mon, Sep 30, 2024 at 02:34:33PM GMT, Christian Brauner wrote:
> On Mon, Sep 30, 2024 at 11:39:49AM GMT, Lorenzo Stoakes wrote:
> > On Mon, Sep 30, 2024 at 12:33:18PM GMT, Florian Weimer wrote:
> > > * Lorenzo Stoakes:
> > >
> > > > If you wish to utilise a pidfd interface to refer to the current process
> > > > (from the point of view of userland - from the kernel point of view - 
> > > > the
> > > > thread group leader), it is rather cumbersome, requiring something like:
> > > >
> > > >         int pidfd = pidfd_open(getpid(), 0);
> > > >
> > > >         ...
> > > >
> > > >         close(pidfd);
> > > >
> > > > Or the equivalent call opening /proc/self. It is more convenient to use 
> > > > a
> > > > sentinel value to indicate to an interface that accepts a pidfd that we
> > > > simply wish to refer to the current process.
> > >
> > > The descriptor will refer to the current thread, not process, right?
> >
> > No it refers to the current process (i.e. thread group leader from kernel
> > perspective). Unless you specify PIDFD_THREAD, this is the same if you did 
> > the above.
> >
> > >
> > > The distinction matters for pidfd_getfd if a process contains multiple
> > > threads with different file descriptor tables, and probably for
> > > pidfd_send_signal as well.
> >
> > You mean if you did a strange set of flags to clone()? Otherwise these are
> > shared right?
> >
> > Again, we are explicitly looking at process not thread from userland
> > perspective. A PIDFD_SELF_THREAD might be possible, but this series doesn't 
> > try
> > to implement that.
>
> Florian raises a good point. Currently we have:
>
> (1) int pidfd_tgid = pidfd_open(getpid(), 0);
> (2) int pidfd_thread = pidfd_open(getpid(), PIDFD_THREAD);
>
> and this instructs:
>
> pidfd_send_signal()
> pidfd_getfd()
>
> to do different things. For pidfd_send_signal() it's whether the
> operation has thread-group scope or thread-scope for pidfd_send_signal()
> and for pidfd_getfd() it determines the fdtable to use.
>
> The thing is that if you pass:
>
> pidfd_getfd(PDIFD_SELF)
>
> and you have:
>
> TGID
>
> T1 {
>     clone(CLONE_THREAD)
>     unshare(CLONE_FILES)
> }
>
> T2 {
>     clone(CLONE_THREAD)
>     unshare(CLONE_FILES)
> }
>
> You have 3 threads in the same thread-group that all have distinct file
> descriptor tables from each other.
>
> So if T1 did:
>
> pidfd_getfd(PIDFD_SELF, ...)
>
> and we mirror the PIDTYPE_TGID behavior then T1 will very likely expect
> to get the fd from its file descriptor table. IOW, its reasonable to
> expect that T1 is interested in their very own resource, not someone
> else's even if it is the thread-group leader.
>
> But what T1 will get in reality is an fd from TGID's file descriptor
> table (and similar for T2).
>
> Iirc, yes that confusion exists already with /proc/self. But the
> question is whether we should add the same confusion to the pidfd api or
> whether we make PIDFD_SELF actually mean PIDTYPE_PID aka the actual
> calling thread.
>
> My thinking is that if you have the reasonable suspicion that you're
> multi-threaded and that you're interested in the thread-group resource
> then you should be using:
>
> int pidfd = pidfd_open(getpid(), 0)
>
> and hand that thread-group leader pidfd around since you're interested
> in another thread. But if you're really just interested in your own
> resource then pidfd_open(getpid(), 0) makes no sense and you would want
> PIDFD_SELF.
>
> Thoughts?

I mean from my perspective, my aim is to get current->mm for
process_madvise() so both work for me :) however you both raise a very good
point here (sorry Florian, perhaps I was a little too dismissive as to your
point, you're absolutely right).

My intent was for PIDFD_SELF to simply mirror the pidfd_open(getpid(), 0)
behaviour, but you and Florian make a strong case that you'd _probably_
find this very confusing had you unshared in this fashion.

I mean in general this confusion already exists, and is for what
PIDFD_THREAD was created, but I suspect ideally if you could go back you
might actually do this by default Christian + let the TGL behaviour be the
optional thing?

For most users this will not be an issue, but for those they'd get the same
result whichever they used, but yes actually I think you're both right -
PIDFD_SELF should in effect imply PIDFD_THREAD.

We can adjust the pidfd_send_signal() call to infer the correct scope
(actually nicely we can do that without any change there, by having
__pidfd_get_pid() set f_flags accordingly).

So TL;DR: I agree, I will respin with PIDFD_SELF referring to the thread.

My question in return here then is - should we introduce PIDFD_SELF_PROCESS
also (do advise if you feel this naming isn't quite right) - to provide
thread group leader behaviour?

Thanks!

Reply via email to