The proctrack/cgroup plugin is designed to more easily support PIDs from
processes launched outside of Slurm. As I recall, the path used for
cgroups is of this form:
user_id/job_id/step_id/task_id

For example, PAM could put processes into the appropriate cgroup, at
least down to the level of the user_id directory, although I don't know how
you would handle binding tasks to CPUs if Slurm isn't launching the tasks.
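
A minimal sketch of what an external tool would have to do, assuming the
path layout recalled above. The function names, the root argument, and the
use of bare numeric directory names are hypothetical; the names the plugin
actually uses may differ. The only kernel behaviour relied on is that
writing a pid to a cgroup's cgroup.procs file moves that process into the
cgroup.

```c
#include <stdio.h>
#include <sys/types.h>

/* Build "<root>/<uid>/<job>/<step>/<task>" into buf.
 * The layout mirrors the form recalled above; it is an assumption.
 * Returns 0 on success, -1 if buf is too small. */
int build_cgroup_path(char *buf, size_t len, const char *root,
                      unsigned uid, unsigned job, unsigned step,
                      unsigned task)
{
    int n = snprintf(buf, len, "%s/%u/%u/%u/%u", root, uid, job, step, task);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Move pid into the cgroup directory dir by writing it to
 * <dir>/cgroup.procs. Returns 0 on success, -1 on any error. */
int attach_pid(const char *dir, pid_t pid)
{
    char file[4096];
    int n = snprintf(file, sizeof(file), "%s/cgroup.procs", dir);
    if (n < 0 || (size_t)n >= sizeof(file))
        return -1;

    FILE *fp = fopen(file, "w");
    if (!fp)
        return -1;

    int rc = (fprintf(fp, "%ld\n", (long)pid) < 0) ? -1 : 0;
    if (fclose(fp) != 0)
        rc = -1;
    return rc;
}
```

A PAM module that only knows the user would stop at the user_id directory
and call attach_pid() on that; CPU binding is not addressed by this sketch.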

Quoting Michel Bourget <[email protected]>:

>
> Hi all,
>
> Was the issue of monitoring pids that come and go addressed (or
> debated) in the past, or is it planned for the future, with regard to
> proctrack and jobacct_gather?
>
> I mean, since pids can fork() children and go away later, proctrack seems
> not to be able to track this dynamically, since it works "on demand". The
> same goes for jobacct_gather, since its pid set is fixed "in stone" when a
> step is launched. And because proctrack is on-demand and the jobacct_gather
> pids are set in stone at the beginning, newly discovered on-demand pids
> never intersect with those jobacct pids.
>
> Maybe an approach like using the kernel process connector socket,
> seeded with an initial set of pids (monitoring fork() and exit()), and then
> having proctrack/jobacct_gather use that list instead, would be useful
> and feasible? In that case, I would think the additional information
> attached to the obtained pid list would be something along the lines of:
>
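>   #include <pthread.h>
>
>   typedef struct pid_info {
>          pthread_mutex_t lock;    // Record lock (pthread mutex assumed)
>          int is_active;           // 0 means the pid was once live but is now gone
>          struct jobacctinfo acct; // Accounting gathered for that pid so far
>          /* more ? */
>   } pid_info_t;
>
>   typedef struct pid_list {
>          pthread_mutex_t lock;    // Global list lock (pthread mutex assumed)
>          int n;                   // Number of records
>          pid_info_t *info;        // Array of n records
>          /* more ? */
>   } pid_list_t;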
>
> Given the above, proctrack services would key on pids where is_active=1,
> and jobacct_gather services would key on the jobacctinfo gathered so far,
> regardless of is_active. I would even venture that proctrack and
> jobacct_gather could then be independent of each other, which is not the
> case today, I believe.
>
> I have to admit the above would also make it a lot easier to inject
> out-of-band pids into slurm. I am thinking of those using mpirun
> in an salloc, or similar. "Similar" refers to the sgimpi
> implementation I maintain here at SGI. I understand this
> sounds SGI-specific, but I believe there is generic value
> in the above-mentioned approach that would benefit SLURM in
> general.
>
> Hopefully I am not off track ;-)
>
> Too evil? Not worth it? Comments?
>
> --
>
> -----------------------------------------------------------
>       Michel Bourget - SGI - Linux Software Engineering
>      "Past BIOS POST, everything else is extra" (travis)
> -----------------------------------------------------------
>
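
The bookkeeping proposed in the quoted message could be sketched roughly as
follows. This is only an illustration under stated assumptions: every name
in it is hypothetical, the pid field is added because the quoted struct does
not list one, a plain counter stands in for struct jobacctinfo, and locking
is left out to keep it short. The fork and exit events are assumed to be
delivered by some caller, for instance one reading PROC_EVENT_FORK and
PROC_EVENT_EXIT from the kernel process connector.

```c
#include <stdlib.h>
#include <sys/types.h>

typedef struct {
    pid_t pid;            /* assumed field, not in the quoted struct */
    int is_active;        /* 0 once the pid has exited */
    unsigned long usage;  /* stand-in for struct jobacctinfo */
} pid_rec_t;

typedef struct {
    int n;                /* number of records in use */
    int cap;              /* allocated capacity */
    pid_rec_t *info;
} pid_table_t;

/* Find the live record for pid; exited records are skipped so that a
 * reused pid number does not match a stale entry. */
static pid_rec_t *find_rec(pid_table_t *t, pid_t pid)
{
    for (int i = 0; i < t->n; i++)
        if (t->info[i].pid == pid && t->info[i].is_active)
            return &t->info[i];
    return NULL;
}

/* Add a pid (initial set or discovered child). 0 on success, -1 on error. */
int track_add(pid_table_t *t, pid_t pid)
{
    if (t->n == t->cap) {
        int cap = t->cap ? t->cap * 2 : 8;
        pid_rec_t *p = realloc(t->info, (size_t)cap * sizeof(*p));
        if (!p)
            return -1;
        t->info = p;
        t->cap = cap;
    }
    t->info[t->n++] = (pid_rec_t){ .pid = pid, .is_active = 1, .usage = 0 };
    return 0;
}

/* Fork event: track the child only if the parent is already tracked.
 * Returns 1 if added, 0 if the parent is unknown, -1 on error. */
int track_fork(pid_table_t *t, pid_t parent, pid_t child)
{
    if (!find_rec(t, parent))
        return 0;
    return track_add(t, child) == 0 ? 1 : -1;
}

/* Exit event: keep the record and its accounting, mark it inactive. */
void track_exit(pid_table_t *t, pid_t pid, unsigned long final_usage)
{
    pid_rec_t *r = find_rec(t, pid);
    if (r) {
        r->usage = final_usage;
        r->is_active = 0;
    }
}

/* proctrack-style view: only pids that are still alive. */
int count_active(const pid_table_t *t)
{
    int c = 0;
    for (int i = 0; i < t->n; i++)
        c += t->info[i].is_active;
    return c;
}

/* jobacct_gather-style view: all records, regardless of is_active. */
unsigned long total_usage(const pid_table_t *t)
{
    unsigned long sum = 0;
    for (int i = 0; i < t->n; i++)
        sum += t->info[i].usage;
    return sum;
}
```

The two views at the end correspond to the split described in the quote:
one consumer filters on is_active, the other sums over every record.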
