On 2015-04-25 01:59, David Bigagli wrote:
Hi all,
the reason to compile without optimization is to be able to have
a meaningful stack when attaching gdb to the daemons or when analysing
core files. If optimization is on, crucial variables in the stack are
optimized out, preventing exact diagnosis of issues. This is of course
configurable; we only changed the default. Sites that wish to compile with
optimization can use the config option --disable-debug. So this is not about
sweeping bugs under the carpet. It is exactly the opposite: it is a tool
to debug more efficiently.
FWIW, newer gcc versions (4.8 and later) have an option "-Og", which enables optimizations that don't
interfere with debugging. Might be worth adding a configure check if one uses a recent enough gcc,
and enable that option then? IIRC the optimizations are roughly similar to what "-O1"
gives.
Anyway, is there a way to enable optimization but keep the debug symbols? For our
production builds, I think we'd like to have "-O2 -g".
The reason for using statvfs rather than statfs is that statfs is deprecated
and replaced by the POSIX statvfs, which is portable across platforms;
indeed NetBSD and Solaris do not have statfs.
Since all platforms have statvfs, the code in get_tmp_disk() under
#elif defined(HAVE_STATFS) is obsolete and will possibly be removed in the
next major release.
Indeed, statvfs is in POSIX and should work everywhere on a reasonably new
system. However, as I mentioned in my previous message, on Linux prior to
kernel 2.6.36 and glibc 2.13, it's not as robust as the (non-standard) statfs.
Hence I would prefer that statfs be used on Linux in preference to statvfs,
as most slurm clusters are presumably still running on older kernel/glibc
versions. Something like the attached patch?
--
Janne Blomqvist, D.Sc. (Tech.), Scientific Computing Specialist
Aalto University School of Science, PHYS & NBE
+358503841576 || [email protected]
diff --git a/src/slurmd/slurmd/get_mach_stat.c b/src/slurmd/slurmd/get_mach_stat.c
index d7c5eb1..81994a6 100644
--- a/src/slurmd/slurmd/get_mach_stat.c
+++ b/src/slurmd/slurmd/get_mach_stat.c
@@ -216,57 +216,42 @@ extern int
get_tmp_disk(uint32_t *tmp_disk, char *tmp_fs)
{
int error_code = 0;
-
-#if defined(HAVE_STATVFS)
- struct statvfs stat_buf;
- uint64_t total_size = 0;
+ unsigned long long total_size = 0;
char *tmp_fs_name = tmp_fs;
- *tmp_disk = 0;
- total_size = 0;
-
if (tmp_fs_name == NULL)
tmp_fs_name = "/tmp";
- if (statvfs(tmp_fs_name, &stat_buf) == 0) {
- total_size = stat_buf.f_blocks * stat_buf.f_frsize;
- total_size /= 1024 * 1024;
- }
- else if (errno != ENOENT) {
- error_code = errno;
- error ("get_tmp_disk: error %d executing statvfs on %s",
- errno, tmp_fs_name);
- }
- *tmp_disk += (uint32_t)total_size;
-#elif defined(HAVE_STATFS)
+#if defined(__linux__)
+ /* Prior to Linux 2.6.36 and glibc 2.13, statvfs() can get
+ * stuck if ANY mount in the system is hung, so use the
+ * non-standard statfs() instead. Furthermore, as of Linux
+ * 2.6+ struct statfs contains the f_frsize field which gives
+ * the size of the blocks reported in the f_blocks field. */
struct statfs stat_buf;
- long total_size;
- float page_size;
- char *tmp_fs_name = tmp_fs;
-
- *tmp_disk = 0;
- total_size = 0;
- page_size = (sysconf(_SC_PAGE_SIZE) / 1048576.0); /* MG per page */
- if (tmp_fs_name == NULL)
- tmp_fs_name = "/tmp";
-#if defined (__sun)
- if (statfs(tmp_fs_name, &stat_buf, 0, 0) == 0) {
-#else
if (statfs(tmp_fs_name, &stat_buf) == 0) {
-#endif
- total_size = (long)stat_buf.f_blocks;
+ total_size = stat_buf.f_blocks * stat_buf.f_frsize;
+ total_size /= 1024 * 1024;
+ } else if (errno != ENOENT) {
+ error_code = errno;
+ error ("get_tmp_disk: error %d executing statfs on %s",
+ errno, tmp_fs_name);
+ }
+#elif defined(HAVE_STATVFS)
+ struct statvfs stat_buf;
+
+ if (statvfs(tmp_fs_name, &stat_buf) == 0) {
+ total_size = stat_buf.f_blocks * stat_buf.f_frsize;
+ total_size /= 1024 * 1024;
}
else if (errno != ENOENT) {
error_code = errno;
- error ("get_tmp_disk: error %d executing statfs on %s",
+ error ("get_tmp_disk: error %d executing statvfs on %s",
errno, tmp_fs_name);
}
-
- *tmp_disk += (uint32_t)(total_size * page_size);
-#else
- *tmp_disk = 1;
#endif
+ *tmp_disk = (uint32_t)total_size;
return error_code;
}
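
For anyone who wants to try the statfs-on-Linux, statvfs-elsewhere
selection outside of slurmd, the logic of the patch can be sketched as a
standalone helper. This is only a minimal sketch under stated assumptions:
tmp_disk_mib() is a hypothetical name that does not exist in Slurm, it
returns errno for every failure (the patch ignores ENOENT and logs the
rest), and it widens both operands before multiplying, which the patch
does not do.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#if defined(__linux__)
#include <sys/vfs.h>      /* statfs(), Linux-specific */
#else
#include <sys/statvfs.h>  /* statvfs(), POSIX */
#endif

/* Hypothetical helper, not part of Slurm: stores the total size in MiB of
 * the filesystem containing `path` in *mib and returns 0. On failure it
 * leaves *mib at 0 and returns the errno value of the failed call. */
static int tmp_disk_mib(const char *path, unsigned long long *mib)
{
	assert(mib != NULL);
	*mib = 0;
	if (path == NULL)
		path = "/tmp";    /* same default as get_tmp_disk() */

#if defined(__linux__)
	/* Linux: prefer the non-standard statfs(), as argued above */
	struct statfs buf;
	if (statfs(path, &buf) != 0)
		return errno;
#else
	/* Everywhere else: the portable POSIX call */
	struct statvfs buf;
	if (statvfs(path, &buf) != 0)
		return errno;
#endif
	/* Widen before multiplying so the product cannot wrap on 32-bit builds */
	*mib = (unsigned long long)buf.f_blocks *
	       (unsigned long long)buf.f_frsize;
	*mib /= 1024 * 1024;
	return 0;
}
```

Calling tmp_disk_mib("/tmp", &mib) from a small main() and printing mib
should give roughly the same figure as the total size "df -m /tmp" reports,
which is a quick way to compare the two code paths on a given machine.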