[bug #64721] segfault when exceeding 1024 fd - replace select() by poll() in ipmi-openipmi-driver.c

2023-12-17 Thread Albert Chu
Update of bug #64721 (group freeipmi):

 Open/Closed: Open => Closed

___

Follow-up Comment #5:

Closing; this has been merged for a while.


___

Reply to this item at:

  

___
Message sent via Savannah
https://savannah.gnu.org/




Re: [bug #64721] segfault when exceeding 1024 fd - replace select() by poll() in ipmi-openipmi-driver.c

2023-09-27 Thread Felip Moll
Thanks Albert for looking into this.

I asked the original user to test it and report whether it works. I will
let you know ASAP.
I haven't tested it extensively on my own yet.

I requested the test to be done here:
https://bugs.schedmd.com/show_bug.cgi?id=17639#c41




--
Felip Moll - felip.m...@schedmd.com
Senior Support Engineer
SchedMD - http://www.schedmd.com



On Wed, Sep 27, 2023 at 6:32 PM Albert Chu wrote:

> Follow-up Comment #4, bug #64721 (project freeipmi):
>
> Oops, I didn't finish.
>
> I'll go ahead and apply in master and 1.6.X stable branch.
>
> If we could get the original user to verify, then we could get a release out
> sooner rather than later (if important).


[bug #64721] segfault when exceeding 1024 fd - replace select() by poll() in ipmi-openipmi-driver.c

2023-09-27 Thread Albert Chu
Follow-up Comment #4, bug #64721 (project freeipmi):

Oops, I didn't finish.

I'll go ahead and apply in master and 1.6.X stable branch.

If we could get the original user to verify, then we could get a release out
sooner rather than later (if important).







[bug #64721] segfault when exceeding 1024 fd - replace select() by poll() in ipmi-openipmi-driver.c

2023-09-27 Thread Albert Chu
Follow-up Comment #3, bug #64721 (project freeipmi):

Hi, from a high-level glance and very loose testing, this seems to work. It
seems the original user with the issue hasn't confirmed that it works yet?

https://bugs.schedmd.com/show_bug.cgi?id=17639#c30


I'll go ahead and apply in master and 1.6.X stable branch.








[bug #64721] segfault when exceeding 1024 fd - replace select() by poll() in ipmi-openipmi-driver.c

2023-09-27 Thread Felip Moll
Follow-up Comment #2, bug #64721 (project freeipmi):

[comment #1:]
> I attach a patch proposal which I think may fix the issue. It might need
> some testing.
>
> http://savannah.gnu.org/bugs/download.php?file_id=55173

Sorry, I opened this bug as an anonymous user from https://savannah.gnu.org/.
I have registered now; I hope that is the correct way to do so.






[bug #64721] segfault when exceeding 1024 fd - replace select() by poll() in ipmi-openipmi-driver.c

2023-09-27 Thread anonymous
Follow-up Comment #1, bug #64721 (project freeipmi):

I attach a patch proposal which I think may fix the issue. It might need
some testing.

http://savannah.gnu.org/bugs/download.php?file_id=55173






[bug #64721] segfault when exceeding 1024 fd - replace select() by poll() in ipmi-openipmi-driver.c

2023-09-27 Thread anonymous
Additional Item Attachment, bug #64721 (project freeipmi):

File name: freeipmi_bug64721_master_v1.patch    Size: 3 KB
   








[bug #64721] segfault when exceeding 1024 fd - replace select() by poll() in ipmi-openipmi-driver.c

2023-09-27 Thread anonymous
URL:
  

         Summary: segfault when exceeding 1024 fd - replace select() by poll() in ipmi-openipmi-driver.c
           Group: GNU FreeIPMI
       Submitter: None
       Submitted: Wed 27 Sep 2023 01:14:26 PM UTC
        Category: libfreeipmi
        Severity: 3 - Normal
        Priority: 5 - Normal
      Item Group: None
          Status: None
         Privacy: Public
     Assigned to: None
     Open/Closed: Open
 Discussion Lock: Any
Operating System: None


___

Follow-up Comments:


---
Date: Wed 27 Sep 2023 01:14:26 PM UTC By: Anonymous
In libfreeipmi/driver/ipmi-openipmi-driver.c we loop on a select() call.

According to the select(2) man page:

   WARNING: select() can monitor only file descriptors numbers that are
   less than FD_SETSIZE (1024)—an unreasonably low limit for many modern
   applications—and this limitation will not change.  All modern
   applications should instead use poll(2) or epoll(7), which do not
   suffer this limitation.

This causes a segfault if the application already has 1024 or more fds open
when it tries to open the openipmi driver. This happens easily in multithreaded
applications like Slurm (slurmd) when they are handling many RPCs. See Slurm bug
https://bugs.schedmd.com/show_bug.cgi?id=17639
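
For background, an fd_set is a fixed-size bitmap with FD_SETSIZE bits, so
FD_SET() on a larger descriptor number addresses memory outside the set. The
__fdelt_warn and __chk_fail frames in the backtrace below look consistent with
glibc's fortified FD_SET() check aborting on such a descriptor. The sketch
below only illustrates the limit; select_readable_checked and demo_guard are
hypothetical names and are not part of libfreeipmi.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical wrapper, for illustration only.
 * Refuses descriptor numbers that do not fit in an fd_set instead of
 * letting FD_SET() address memory outside the set.
 * Returns the select() result, or -1 with errno = EINVAL if fd is
 * negative or >= FD_SETSIZE. */
static int select_readable_checked(int fd, struct timeval *timeout)
{
    fd_set rfds;

    if (fd < 0 || fd >= FD_SETSIZE) {
        errno = EINVAL;
        return -1;
    }

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    return select(fd + 1, &rfds, NULL, NULL, timeout);
}

/* Descriptor number FD_SETSIZE is the first one the guard rejects. */
static void demo_guard(void)
{
    int rc = select_readable_checked(FD_SETSIZE, NULL);

    assert(rc == -1);
    assert(errno == EINVAL);
    (void) rc; /* silence unused-variable warnings when NDEBUG is set */
}
```

A guard like this only turns the crash into an error return. It does not let
the caller wait on descriptors numbered 1024 or higher.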

Here is an excerpt of the backtrace from the core file in Slurm bug 17639:

Thread 1 (Thread 0x14c5ed8b5700 (LWP 53050)):
#0  0x14c5ef6b2c6b in raise () from /lib64/libc.so.6
#1  0x14c5ef6b4305 in abort () from /lib64/libc.so.6
#2  0x14c5ef6f8a97 in __libc_message () from /lib64/libc.so.6
#3  0x14c5ef790812 in __fortify_fail () from /lib64/libc.so.6
#4  0x14c5ef78ec40 in __chk_fail () from /lib64/libc.so.6
#5  0x14c5ef79071a in __fdelt_warn () from /lib64/libc.so.6
#6  0x14c5ee42a8c4 in ?? () from /usr/lib64/libfreeipmi.so.17
#7  0x14c5ee42aecf in ipmi_openipmi_cmd () from /usr/lib64/libfreeipmi.so.17
#8  0x14c5ee40a0ab in ?? () from /usr/lib64/libfreeipmi.so.17
#9  0x14c5ee3f75e7 in ipmi_cmd () from /usr/lib64/libfreeipmi.so.17
#10 0x14c5ee3f8568 in ?? () from /usr/lib64/libfreeipmi.so.17
#11 0x14c5ee3fab15 in ipmi_cmd_dcmi_get_power_reading () from /usr/lib64/libfreeipmi.so.17
#12 0x14c5eebded89 in _get_dcmi_power_reading (dcmi_mode=) at acct_gather_energy_ipmi.c:500
#13 _read_ipmi_dcmi_values () at acct_gather_energy_ipmi.c:526
#14 _read_ipmi_values () at acct_gather_energy_ipmi.c:642
#15 _thread_update_node_energy () at acct_gather_energy_ipmi.c:696
#16 0x14c5eebe096b in acct_gather_energy_p_get_data (data_type=, data=0x14c5dc0f4a10) at acct_gather_energy_ipmi.c:1187
#17 0x14c5f03ee3ab in acct_gather_energy_g_get_data (context_id=0, data_type=data_type@entry=ENERGY_DATA_JOULES_TASK, data=0x14c5dc0f4a10) at acct_gather_energy.c:362
#18 0x0041691e in _rpc_acct_gather_energy (msg=msg@entry=0x14c5dc001350) at req.c:3431
#19 0x0041e353 in slurmd_req (msg=msg@entry=0x14c5dc001350) at req.c:388
#20 0x0040dc37 in _service_connection (arg=) at slurmd.c:625
#21 0x14c5ef9db6ea in start_thread () from /lib64/libpthread.so.0
#22 0x14c5ef77f94f in clone () from /lib64/libc.so.6

The natural fix is to replace the select() call with poll(), so that
applications with 1024 or more open fds are supported.
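
As a rough illustration of that direction (this is not the attached patch, and
wait_readable and demo_wait_readable are hypothetical names), a select()-based
wait on a single descriptor could be written with poll() roughly as follows.
Unlike fd_set, struct pollfd stores the descriptor number as a plain int, so
the FD_SETSIZE limit does not apply.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper, not the function used in libfreeipmi.
 * Waits until fd is readable or timeout_ms milliseconds elapse.
 * Returns 1 if readable, 0 on timeout, -1 on error with errno set. */
static int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int n;

    pfd.fd = fd;            /* any open descriptor, also >= FD_SETSIZE */
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Retry if a signal interrupts the wait. For simplicity the full
     * timeout is restarted on every retry. */
    do {
        n = poll(&pfd, 1, timeout_ms);
    } while (n < 0 && errno == EINTR);

    if (n <= 0)
        return n;           /* 0: timed out, -1: poll() failed */

    if (pfd.revents & POLLNVAL) {
        errno = EBADF;      /* fd was not an open descriptor */
        return -1;
    }
    if (pfd.revents & POLLERR) {
        errno = EIO;
        return -1;
    }
    return 1;               /* POLLIN or POLLHUP: read() will not block */
}

/* Small demonstration on a pipe.
 * Returns 0 if the helper behaved as described above, -1 otherwise. */
static int demo_wait_readable(void)
{
    int p[2];
    int before, after;
    ssize_t written;

    if (pipe(p) != 0)
        return -1;

    before = wait_readable(p[0], 0);    /* empty pipe: expect timeout */
    written = write(p[1], "x", 1);
    after = wait_readable(p[0], 100);   /* one byte pending: expect readable */

    close(p[0]);
    close(p[1]);

    assert(written == 1);
    (void) written; /* silence unused-variable warnings when NDEBUG is set */
    return (before == 0 && after == 1) ? 0 : -1;
}
```

A real change in the driver would also need to keep the existing timeout and
error semantics of the select() loop, which this sketch does not attempt to
reproduce.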







