Serge E. Hallyn wrote:
> Quoting Oren Laadan ([email protected]):
>>
>> Serge E. Hallyn wrote:
>>> Quoting Sukadev Bhattiprolu ([email protected]):
>>>> Subject: [RFC][v4][PATCH 7/7]: Define clone_extended() syscall
>>>>
>>>> Container restart requires that a task have the same pid it had when it was
>>>> checkpointed. When containers are nested the tasks within the containers
>>>> exist in multiple pid namespaces and hence have multiple pids to specify
>>>> during restart.
>>>>
>>>> This patch defines a new system call, clone_extended(), which is like clone()
>>>> but takes a new 'pid_set' parameter.  This parameter lets the caller choose
>>>> specific pid numbers for the child process, in the process's active and
>>>> ancestor pid namespaces. (Descendant pid namespaces in general don't matter
>>>> since processes don't have pids in them anyway, but see comments in
>>>> copy_target_pids() regarding CLONE_NEWPID).
>>>>
>>>> Unlike clone(), however, clone_extended() needs CAP_SYS_ADMIN, at least for
>>>> now, to prevent unprivileged processes from misusing this interface.
>>> It only needs that when specifying pids.
>>>
>>>> While the main motivation for this interface is the need to let a process
>>>> choose its 'pid numbers', the clone_extended() interface uses 64-bit clone
>>>> flags.  The 'higher' portion of the clone flags is unused and is only
>>>> included to preclude yet another version of clone when a new clone flag is
>>>> needed.
>>>>
>>>> ===== Interface:
>>>>
>>>> Compared to clone(), clone_extended() needs to pass in three more pieces
>>>> of information:
>>>>
>>>>    - an additional 32 bits of clone_flags
>>>>    - number of pids in the set
>>>>    - user buffer containing the list of pids.
>>>>
>>>> But since clone() already takes 5 parameters and some (all?) architectures
>>>> are restricted to 6 parameters to a system-call, additional data-structures
>>>> (and copy_from_user()) are needed.
>>>>
>>>> The proposed interface for clone_extended() is:
>>>>
>>>>    struct clone_tid_info {
>>>>            void *parent_tid;       /* parent_tid_ptr parameter */
>>>>            void *child_tid;        /* child_tid_ptr parameter */
>>>>    };
>>>>
>>>>    struct pid_set {
>>>>            int num_pids;
>>>>            pid_t *pids;
>>>>    };
>>>>
>>>>    int clone_extended(int flags_low, int flags_high, void *child_stack,
>>>>                    void *unused, struct clone_tid_info *tid_ptrs,
>>>>                    struct pid_set *pid_setp);
>>> I was thinking additional flags would be passed in the (renamed)
>>> struct pid_set.
>> Yes.
>>
>> But maybe in (renamed) 'struct clone_info' instead of 'struct pid_set' ?
>>
>> I vaguely recall a strong preference to not require copy-from-user
>> during a fast-path clone, because it may hurt performance.
>>
>> *If* this is the case, then maybe place extra flags among the
>> "base" args, or at least a CLONE_EXTRA would indicate that more
>> arguments need to be pulled from user-space ?
> 
> Wouldn't passing NULL for struct clone_info suffice?

:o

Actually, I misread the original prototype, and I prefer Suka's
current suggestion.
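
For reference, a rough user-space sketch of how the quoted prototype could be read. It is not taken from the patch: combine_clone_flags() and has_extra_args() are hypothetical helpers, and the only assumptions are that flags_low/flags_high form the two halves of the 64-bit clone flags and that a NULL pid_setp means "no pids requested", so no extra copy_from_user() is needed on the fast path.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Copied from the proposed interface quoted above. */
struct pid_set {
        int num_pids;
        pid_t *pids;
};

/*
 * Hypothetical helper (not in the patch): join the two 32-bit halves
 * passed as flags_low/flags_high into one 64-bit clone-flags value.
 * The casts through uint32_t avoid sign extension of the low half.
 */
static uint64_t combine_clone_flags(int flags_low, int flags_high)
{
        return ((uint64_t)(uint32_t)flags_high << 32) | (uint32_t)flags_low;
}

/*
 * Hypothetical helper (not in the patch): a NULL pointer is assumed to
 * mean the caller wants plain clone() behaviour, so nothing has to be
 * pulled in from user space and no privilege check on pids is needed.
 */
static int has_extra_args(const struct pid_set *pid_setp)
{
        return pid_setp != NULL;
}
```

With this reading, a caller that passes NULL for the last argument pays only for a pointer comparison, and a separate CLONE_EXTRA-style flag is not required to signal that more arguments follow.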

Oren.

> 
>> Do you intend to get feedback from LKML too ?
>>
>> Oren.