While making the changes for the slurmctld core file location (see
previous messages in this thread), I took a look at "slurmd" and
"slurmdbd" and I see that both of them do a similar thing to slurmctld
with regard to the core file location. Here is the code from "slurmd.c",
which does a "chdir" to either the "SlurmdLogFile" directory or the
"SlurmdSpoolDir":
    if (conf->daemonize) {
        if (conf->logfile && (conf->logfile[0] == '/')) {
            char *slash_ptr, *work_dir;
            work_dir = xstrdup(conf->logfile);
            slash_ptr = strrchr(work_dir, '/');
            if (slash_ptr == work_dir)
                work_dir[1] = '\0';
            else
                slash_ptr[0] = '\0';
            if (chdir(work_dir) < 0) {
                error("Unable to chdir to %s", work_dir);
                xfree(work_dir);
                return SLURM_FAILURE;
            }
            xfree(work_dir);
        } else {
            if (chdir(conf->spooldir) < 0) {
                error("Unable to chdir to %s",
                      conf->spooldir);
                return SLURM_FAILURE;
            }
        }
    }
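To make the path handling above easier to follow in isolation, here is a minimal standalone sketch of the same "directory containing the log file" computation. It is not the Slurm code: the function name log_parent_dir is hypothetical, and plain malloc()/free() stand in for xstrdup()/xfree() so that it compiles without the Slurm headers.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Slurm): return a newly allocated copy
 * of the directory that contains `logfile`, or NULL if `logfile` is NULL,
 * is not an absolute path, or memory cannot be allocated.
 * The caller releases the result with free().
 */
char *log_parent_dir(const char *logfile)
{
    char *work_dir, *slash_ptr;
    size_t len;

    if (logfile == NULL || logfile[0] != '/')
        return NULL;

    len = strlen(logfile);
    work_dir = malloc(len + 1);
    if (work_dir == NULL)
        return NULL;
    memcpy(work_dir, logfile, len + 1);

    /* Never NULL here, because the path starts with '/'. */
    slash_ptr = strrchr(work_dir, '/');
    if (slash_ptr == work_dir)
        work_dir[1] = '\0';     /* file sits directly in "/": keep the root slash */
    else
        slash_ptr[0] = '\0';    /* cut off the trailing "/<file name>" */

    return work_dir;
}
```

With this helper, a log file of "/var/log/slurmd.log" gives "/var/log", a log file directly under the root gives "/", and a relative path gives NULL, which corresponds to the branch where slurmd falls back to the spool directory.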
In the case of "slurmd", the core file is produced even if the
"SlurmdLogFile" directory points to someplace like "/var/log", because
slurmd never relinquishes "root" privileges and thus has root permission
to write the core file. In the interest of making "slurmctld" and
"slurmd" consistent with respect to the core file, do you think I should
create a patch to default to "SlurmdSpoolDir", or just leave well enough
alone? There is currently nothing in the "slurmd" man page to indicate
where core files are supposed to go for slurmd.
The situation is a little different for "slurmdbd". The current code in
"slurmdbd.c" is as follows:
    if (slurmdbd_conf->log_file &&
        (slurmdbd_conf->log_file[0] == '/')) {
        char *slash_ptr, *work_dir;
        work_dir = xstrdup(slurmdbd_conf->log_file);
        slash_ptr = strrchr(work_dir, '/');
        if (slash_ptr == work_dir)
            work_dir[1] = '\0';
        else
            slash_ptr[0] = '\0';
        if (chdir(work_dir) < 0)
            fatal("chdir(%s): %m", work_dir);
        xfree(work_dir);
    }
This code suffers from the same problem as slurmctld. It sets the working
directory to the directory containing "LogFile" and later drops its
privileges by setting its GID and UID to those of "SlurmUser". Thus, if
"SlurmUser" does not have permission to create files in that directory,
slurmdbd cannot write a core file on abort. Unfortunately, for slurmdbd
there is no convenient alternate directory, like "StateSaveLocation" or
"SlurmdSpoolDir", to chdir to so that a core file can be written.
Any suggestions?
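For the sake of discussion, one possible direction is to probe a short list of candidate directories after the privilege drop and chdir to the first one in which the daemon can still create files. The sketch below is only an illustration under that assumption: the helper name first_writable_dir and the idea of a candidate list are hypothetical and not part of Slurm. Note that access() tests against the real UID/GID, so the sketch assumes it is called after both the real and effective IDs have been switched to "SlurmUser".

```c
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of Slurm): return the first entry of
 * `candidates` that is an existing directory in which the calling
 * process may create files, or NULL if none qualifies.
 * NULL entries in the list are skipped.
 */
const char *first_writable_dir(const char *const candidates[], size_t n)
{
    struct stat st;
    size_t i;

    for (i = 0; i < n; i++) {
        if (candidates[i] == NULL)
            continue;
        if (stat(candidates[i], &st) != 0 || !S_ISDIR(st.st_mode))
            continue;           /* missing, or not a directory */
        /* Creating a file needs write and search permission on the directory. */
        if (access(candidates[i], W_OK | X_OK) == 0)
            return candidates[i];
    }
    return NULL;
}
```

A caller could pass the log file directory first, followed by whatever fallback the site prefers, and chdir() to the result. If the helper returns NULL, the daemon could keep its current working directory and log a warning that no core file can be written.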
-Don Albert-