Hi,

I'm working on an LSM that uses the task_alloc hook to do some work. This
hook needs to check the "real_parent" field held by the task_struct
structure. I'm very confused because, navigating the source code, I see
many different ways to access this field, and I don't understand why each
method is used in each case. So I don't know how to implement this access
in my LSM in a safe way.

The real_parent field is defined in the task_struct structure as:

struct task_struct __rcu        *real_parent;

So, as far as I understand, this pointer is protected by the "Read-Copy
Update" (RCU) mechanism.

Below I show some examples of the different kinds of access:

--------------------------------------------------------------------------
Example 1
--------------------------------------------------------------------------

void proc_fork_connector(struct task_struct *task)
{
        [...]
        rcu_read_lock();
        parent = rcu_dereference(task->real_parent);
        ev->event_data.fork.parent_pid = parent->pid;
        ev->event_data.fork.parent_tgid = parent->tgid;
        rcu_read_unlock();
        [...]
}

Here, to access the real_parent field, the code uses rcu_dereference()
inside the rcu_read_lock()/rcu_read_unlock() block.

--------------------------------------------------------------------------
Example 2
--------------------------------------------------------------------------

static struct pid *
get_children_pid(struct inode *inode, struct pid *pid_prev, loff_t pos)
{
        [...]
        read_lock(&tasklist_lock);
        [...]
                if (task && task->real_parent == start &&
                    !(list_empty(&task->sibling))) {
        [...]
        read_unlock(&tasklist_lock);
        [...]
}

Here, to access the real_parent field, the code reads the pointer directly
inside the read_lock/read_unlock block. Is no RCU block needed? Why is
rcu_dereference() not used?

--------------------------------------------------------------------------
Example 3
--------------------------------------------------------------------------

struct task_struct init_task __aligned(L1_CACHE_BYTES) = {
        [...]
        .real_parent    = &init_task,
        .parent         = &init_task,
        [...]
        RCU_POINTER_INITIALIZER(real_cred, &init_cred),
        RCU_POINTER_INITIALIZER(cred, &init_cred),
        [...]
};

Here the initialization is direct. If the pointer is declared __rcu,
shouldn't the assignment to real_parent use the RCU_POINTER_INITIALIZER
macro?

--------------------------------------------------------------------------
Example 4
--------------------------------------------------------------------------

static void forget_original_parent(struct task_struct *father,
                                        struct list_head *dead)
{
        [...]
                        RCU_INIT_POINTER(t->real_parent, reaper);
        [...]
}

Here the assignment uses the RCU_INIT_POINTER macro. It is not direct.

--------------------------------------------------------------------------
Example 5
--------------------------------------------------------------------------

SYSCALL_DEFINE2(setpgid, pid_t, pid, pid_t, pgid)
{
        [...]
        rcu_read_lock();

        /* From this point forward we keep holding onto the tasklist lock
         * so that our parent does not change from under us. -DaveM
         */
        write_lock_irq(&tasklist_lock);
        [...]

        if (same_thread_group(p->real_parent, group_leader)) {

        [...]
        write_unlock_irq(&tasklist_lock);
        rcu_read_unlock();
        [...]
}

Here, to access the real_parent field, the code reads the pointer directly
inside the write_lock_irq/write_unlock_irq block, which is nested in an
rcu_read_lock/rcu_read_unlock block.

--------------------------------------------------------------------------
Example 6
--------------------------------------------------------------------------

long keyctl_session_to_parent(void)
{
        [...]
        rcu_read_lock();
        write_lock_irq(&tasklist_lock);

        [...]
        parent = rcu_dereference_protected(me->real_parent,
                                           lockdep_is_held(&tasklist_lock));

        [...]
        write_unlock_irq(&tasklist_lock);
        rcu_read_unlock();
        [...]
}

But here, to access the real_parent field, the code uses rcu_dereference_*
inside the write_lock_irq/write_unlock_irq block, which is nested in an
rcu_read_lock/rcu_read_unlock block. The nested blocks are the same as in
Example 5, but the access is not direct: it uses rcu_dereference_*. Why?

Extracted from the documentation:

[1] The variant rcu_dereference_protected() can be used outside of an RCU
read-side critical section as long as the usage is protected by locks
acquired by the update-side code. This variant avoids the lockdep warning
that would happen when using (for example) rcu_dereference() without
rcu_read_lock() protection. Using rcu_dereference_protected() also has
the advantage of permitting compiler optimizations that rcu_dereference()
must prohibit. The rcu_dereference_protected() variant takes a lockdep
expression to indicate which locks must be acquired by the caller. If the
indicated protection is not provided, a lockdep splat is emitted.

Moreover, why is rcu_dereference_protected() used inside an RCU block?

There are more examples, but they are similar to the ones shown. So my
question is how to read the "real_parent" field correctly. If I can
understand all the above examples, I think I will have the knowledge to
implement my LSM correctly.

Any help that points me in the right direction will be greatly appreciated.
Some rules for knowing when and why to use one method or another are welcome.

Thanks in advance. Regards,
John Wood

