On Wed, Nov 05, 2025 at 03:32:10PM +0100, Oleg Nesterov wrote:
> I am still thinking about another approach, will write another email.
> But let me take a closer look at your patch.
> 
> First of all, can you split it? See below.
> 
> On 08/21, Bernd Edlinger wrote:
> >
> > -static int de_thread(struct task_struct *tsk)
> > +static int de_thread(struct task_struct *tsk, struct linux_binprm *bprm)
> >  {
> >     struct signal_struct *sig = tsk->signal;
> >     struct sighand_struct *oldsighand = tsk->sighand;
> >     spinlock_t *lock = &oldsighand->siglock;
> > +   struct task_struct *t;
> > +   bool unsafe_execve_in_progress = false;
> >
> >     if (thread_group_empty(tsk))
> >             goto no_thread_group;
> > @@ -932,6 +934,19 @@ static int de_thread(struct task_struct *tsk)
> >     if (!thread_group_leader(tsk))
> >             sig->notify_count--;
> >
> > +   for_other_threads(tsk, t) {
> > +           if (unlikely(t->ptrace)
> > +               && (t != tsk->group_leader || !t->exit_state))
> > +                   unsafe_execve_in_progress = true;
> 
> you can add "break" into the "if ()" block...
> 
> But this is minor. Why do we need "bool unsafe_execve_in_progress" ?
> If this patch is correct, de_thread() can drop/reacquire cred_guard_mutex
> unconditionally.
> 
> If you really think it makes sense, please make another patch with the
> changelog.
> 
> I'd certainly prefer to avoid this boolean at least for the start. If nothing
> else to catch the potential problems earlier.
> 
> > +   if (unlikely(unsafe_execve_in_progress)) {
> > +           spin_unlock_irq(lock);
> > +           sig->exec_bprm = bprm;
> > +           mutex_unlock(&sig->cred_guard_mutex);
> > +           spin_lock_irq(lock);
> 
> I don't think spin_unlock_irq() + spin_lock_irq() makes any sense...
> 
> > @@ -1114,13 +1139,31 @@ int begin_new_exec(struct linux_binprm * bprm)
> >      */
> >     trace_sched_prepare_exec(current, bprm);
> >
> > +   /* If the binary is not readable then enforce mm->dumpable=0 */
> > +   would_dump(bprm, bprm->file);
> > +   if (bprm->have_execfd)
> > +           would_dump(bprm, bprm->executable);
> > +
> > +   /*
> > +    * Figure out dumpability. Note that this checking only of current
> > +    * is wrong, but userspace depends on it. This should be testing
> > +    * bprm->secureexec instead.
> > +    */
> > +   if (bprm->interp_flags & BINPRM_FLAGS_ENFORCE_NONDUMP ||
> > +       is_dumpability_changed(current_cred(), bprm->cred) ||
> > +       !(uid_eq(current_euid(), current_uid()) &&
> > +         gid_eq(current_egid(), current_gid())))
> > +           set_dumpable(bprm->mm, suid_dumpable);
> > +   else
> > +           set_dumpable(bprm->mm, SUID_DUMP_USER);
> > +
> 
> OK, we need to do this before de_thread() drops cred_guard_mutex.
> But imo this too should be done in a separate patch, the changelog should
> explain this change.
> 
> > @@ -1361,6 +1387,11 @@ static int prepare_bprm_creds(struct linux_binprm 
> > *bprm)
> >     if (mutex_lock_interruptible(&current->signal->cred_guard_mutex))
> >             return -ERESTARTNOINTR;
> >
> > +   if (unlikely(current->signal->exec_bprm)) {
> > +           mutex_unlock(&current->signal->cred_guard_mutex);
> > +           return -ERESTARTNOINTR;
> > +   }
> 
> OK, if signal->exec_bprm != NULL, then current is already killed. But
> proc_pid_attr_write() and ptrace_traceme() do the same. So how about
> something like
> 
>       int lock_current_cgm(void)
>       {
>               if 
> (mutex_lock_interruptible(&current->signal->cred_guard_mutex))
>                       return -ERESTARTNOINTR;
> 
>               if (!current->signal->group_exec_task)
>                       return 0;
> 
>               WARN_ON(!fatal_signal_pending(current));
>               mutex_unlock(&current->signal->cred_guard_mutex);
>               return -ERESTARTNOINTR;
>       }
> 
> ?
> 
> Note that it checks ->group_exec_task, not ->exec_bprm. So this change can
> come in a separate patch too, but I won't insist.
> 
> > @@ -453,6 +454,28 @@ static int ptrace_attach(struct task_struct *task, 
> > long request,
> >                             return retval;
> >             }
> >
> > +           if (unlikely(task == task->signal->group_exec_task)) {
> > +                   retval = 
> > down_write_killable(&task->signal->exec_update_lock);
> > +                   if (retval)
> > +                           return retval;
> > +
> > +                   scoped_guard (task_lock, task) {
> > +                           struct linux_binprm *bprm = 
> > task->signal->exec_bprm;
> > +                           const struct cred __rcu *old_cred = 
> > task->real_cred;
> > +                           struct mm_struct *old_mm = task->mm;
> > +
> > +                           rcu_assign_pointer(task->real_cred, bprm->cred);
> > +                           task->mm = bprm->mm;
> > +                           retval = __ptrace_may_access(task, 
> > PTRACE_MODE_ATTACH_REALCREDS);
> > +                           rcu_assign_pointer(task->real_cred, old_cred);
> > +                           task->mm = old_mm;
> > +                   }
> 
> This is the most problematic change which I can't review...
> 
> Firstly, it changes task->mm/real_cred for __ptrace_may_access() and this
> looks dangerous to me.

Yeah, that is not ok. This is effectively override_creds for real_cred
and that is not a pattern I want to see us establish at all! Temporary
credential overrides for the subjective credentials is already terrible
but at least we have the explicit split between real_cred and cred
expressely for that. So no, that's not an acceptable solution.

> 
> Say, current_is_single_threaded() called by another CLONE_VM process can
> miss group_exec_task and falsely return true. Probably not that bad, in
> this case old_mm should go away soon, but still...
> 
> And I don't know if this can fool the users of task_cred_xxx/__task_cred
> somehow.
> 
> Or. check_unsafe_exec() sets LSM_UNSAFE_PTRACE if ptrace. Is it safe to
> ptrace the execing task after that? I have no idea what the security hooks
> can do...
> 
> Again, can't review this part.
> 
> Oleg.
> 

Reply via email to