[EMAIL PROTECTED] wrote on 05/12/2008 10:38:51 AM:

> Hello Zhang Huan,
> I was finally able to reproduce the bug in my i386 server installed 
> with RHEL5.1.
> 
> As identified by you, this bug is reproducible only when the 
> openhpid process is run in the background
> (done by invoking the openhpid using service utility, i.e. "service 
> openhpid start" or
> by running the openhpid in background directly i.e. "openhpid -c 
> /etc/openhpi/openhpi.conf")
> 
(snip)
> 
> When the openhpid is started in background, ipmi connection error is
> reported and the openhpid continuously tries to discover with no
> success.
> 
> When the hpitop client is started, there is stream socket connection
> established between openhpid process and hpitop client program.
> Since openhpid is running in the background, File handlers like 
> stderr, stdout and stdin are not attached to any TTY.

A daemon inherits file descriptors from its parent and has access to them 
even when running in the background.

> But for reporting the ipmi connection error, the IPMI plugin tries 
> to write into stderr using fprintf (plugins/ipmi/ipmi.c, line no 
> 604), this error message gets into data stream being read by client:
> 
> Extract of ipmi.c:
> 
>  602         while (ipmi_handler->fully_up == 0) {
>  603                 if (!ipmi_handler->connected) {
>  604                         fprintf(stderr, "IPMI connection is down\n");
>  605                         return SA_ERR_HPI_NO_RESPONSE;
>  606                 }
> 
> 
> (gdb) print data
> $12 = 0xbfc14d2b "IPMI connection is down\n\001\021"
> (gdb) print *data
> $13 = 73 'I'
> 
> Provided above is the corrupted message content received at hpitop 
> client's side (Captured in gdb).
> Due to this problem, ReadMsg reports a "wrong version" error.
> 
(snip)

This analysis helped me understand the problem a great deal. Thanks MS.

I found that the daemon was closing the stdout and stderr descriptors on 
startup. The problem was harder to find since the return code of fprintf 
is not examined by the code. I believe fprintf must be returning an error 
in that case, since it's trying to write to an invalid descriptor.
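
For illustration, here is a minimal sketch of another way the message could end up in the client's stream, consistent with the gdb capture above. It assumes that once descriptor 2 is closed, the kernel hands that same number to the next descriptor the process opens (POSIX returns the lowest free number). The helper name stderr_leak_demo is hypothetical, and a pipe stands in for the client socket; this is not OpenHPI code.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical demo, not OpenHPI code.
 * Closes stderr, lets a pipe take over descriptor 2, writes the IPMI
 * error with fprintf(stderr, ...), then restores the real stderr.
 * Returns the number of bytes that landed in the pipe, or -1 on failure.
 */
static long stderr_leak_demo(char *buf, size_t buflen)
{
    int pipefd[2];
    int saved_stderr;
    int reused;
    ssize_t n;

    assert(buflen > 0);

    saved_stderr = dup(STDERR_FILENO);      /* keep a copy to restore later */
    if (saved_stderr < 0)
        return -1;
    if (pipe(pipefd) < 0) {
        close(saved_stderr);
        return -1;
    }

    close(STDERR_FILENO);                   /* what the daemon did on startup */

    /* dup() returns the lowest free number: 2, if 0 and 1 are open. */
    reused = dup(pipefd[1]);
    if (reused == STDERR_FILENO) {
        fprintf(stderr, "IPMI connection is down\n");
        fflush(stderr);
    } else if (reused >= 0) {
        close(reused);
    }

    dup2(saved_stderr, STDERR_FILENO);      /* put the real stderr back */
    close(saved_stderr);
    close(pipefd[1]);

    if (reused != STDERR_FILENO) {
        close(pipefd[0]);
        return -1;
    }

    n = read(pipefd[0], buf, buflen - 1);   /* what the "client" would see */
    close(pipefd[0]);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return (long)n;
}
```

If the daemon's accepted client socket receives descriptor 2 in the same way, every fprintf(stderr, ...) in a plugin would be written into the client's data stream instead of failing.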

> 
> Testing and findings on openhpi.2.11.2:
> =============================
> The scenario here is quite different: the "dbg" statements are
> defined to write to stderr, and hence most of the dbg output
> gets mixed with the data being read by the hpitop client program.
> 
> In 2.10.2, the dbg statements were written to syslog rather than stderr.
> 
> Any suggestions on finding a solution for openhpi trunk?

I have attached a patch that keeps stdout and stderr open. I was able to 
reproduce the problem at home and the patch fixes the "wrong version" 
error.
One thing to try though is to set the openhpid service to start on machine 
startup with the same openhpi.conf, and test with the hpi clients after a 
fresh reboot. If the "wrong version" problem appears again, even with the 
patch, then we need to figure out how to create stdout/stderr when they 
don't exist for the daemon initially.
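
As a rough sketch of that fallback (not part of the attached patch), a daemon can make sure descriptors 0, 1 and 2 exist by opening /dev/null until the returned number is past 2. The helper name ensure_std_fds is hypothetical, and the sketch assumes /dev/null is available and that discarding output is acceptable when no real stdout/stderr was inherited.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper, not OpenHPI code.
 * Makes sure descriptors 0, 1 and 2 are open, pointing any missing one
 * at /dev/null. Descriptors that are already open are left untouched.
 * Returns 0 on success, -1 if /dev/null cannot be opened.
 */
static int ensure_std_fds(void)
{
    int fd;

    /* open() returns the lowest free number, so each pass fills one gap. */
    do {
        fd = open("/dev/null", O_RDWR);
        if (fd < 0)
            return -1;
    } while (fd <= STDERR_FILENO);

    assert(fd > STDERR_FILENO);   /* 0, 1 and 2 are all in use now */
    close(fd);                    /* the extra descriptor is not needed */
    return 0;
}
```

Calling this once at daemon startup, before any socket is created, would keep later sockets from ever being assigned descriptor numbers 0 through 2.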

Please, try the patch out and let me know.

        --Renier

Attachment: fd_keep.diff
