Dear FreeIPMI developers,

In freeipmi-1.6.11, the driver libfreeipmi/driver/ipmi-openipmi-driver.c calls the select() function at line 532:

    if ((n = select (ctx->device_fd + 1,
      ...

The Slurm Resource Manager, a batch queue system for Linux clusters, has been hit by crashes that have been attributed to the use of select() in ipmi-openipmi-driver.c.

The issue is described in this bug report: https://bugs.schedmd.com/show_bug.cgi?id=17639#c30

On a very busy Linux cluster it seems that the file descriptor used by ipmi-openipmi-driver.c may sometimes be numbered 1024 or higher, which is beyond what select() can handle (FD_SETSIZE is 1024 in glibc). The select(2) man-page states:

BUGS
       POSIX allows an implementation to define an upper limit, advertised via
       the  constant  FD_SETSIZE, on the range of file descriptors that can be
       specified in a file descriptor set.  The Linux kernel imposes no  fixed
       limit,  but  the  glibc  implementation makes fd_set a fixed-size type,
       with FD_SETSIZE defined  as  1024,  and  the  FD_*()  macros  operating
       according  to  that  limit.   To  monitor file descriptors greater than
       1023, use poll(2) instead.

Question: Would it be possible for you to replace select() with poll() in the FreeIPMI driver/ipmi-openipmi-driver.c code?
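
As a rough idea of what such a replacement might look like, here is a minimal sketch of waiting for a single descriptor to become readable with poll(). It assumes the driver only needs to wait on one device descriptor with a timeout; the function name wait_readable and its millisecond timeout parameter are hypothetical and not taken from the FreeIPMI source.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>

/* Hypothetical helper (not FreeIPMI code): wait up to timeout_ms
 * milliseconds for fd to become readable.  Returns 1 if the descriptor
 * is ready, 0 on timeout, and -1 on error with errno set.  Unlike
 * select(), poll() has no FD_SETSIZE limit on the descriptor number. */
static int
wait_readable (int fd, int timeout_ms)
{
  struct pollfd pfd;
  int n;

  assert (fd >= 0);

  pfd.fd = fd;
  pfd.events = POLLIN;
  pfd.revents = 0;

  n = poll (&pfd, 1, timeout_ms);
  if (n <= 0)
    return (n);                 /* 0: timeout, -1: error (errno set) */

  if (pfd.revents & POLLNVAL)
    {
      /* Descriptor is not open; select() would fail with EBADF here. */
      errno = EBADF;
      return (-1);
    }

  /* POLLIN, POLLHUP or POLLERR: a following read() will not block. */
  return (1);
}
```

A caller would use wait_readable (fd, timeout_ms) where the select() call and its FD_ZERO()/FD_SET() setup are today, converting the struct timeval timeout to milliseconds.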

Thanks a lot,
Ole

--
Ole Holm Nielsen
PhD, Senior HPC Officer
Department of Physics, Technical University of Denmark

_______________________________________________
Freeipmi-users mailing list
Freeipmi-users@gnu.org
https://lists.gnu.org/mailman/listinfo/freeipmi-users
