Hello Renier,

After applying your patch, I tested the following scenarios on the i386
server running RHEL 5.1:

[EMAIL PROTECTED] ~]# uname -a
Linux aotpprl3 2.6.18-53.el5PAE #1 SMP Wed Oct 10 16:48:18 EDT 2007 i686 i686 
i386 GNU/Linux
Scenarios:
1. Run openhpid in the background directly from a shell (i.e. openhpid -c
/etc/openhpi/openhpi.conf) and run the hpitop client (at least 2-3 times).
2. Start openhpid using the service utility (service openhpid start) and run
the hpitop client (at least 2-3 times).
3. Incorporate openhpid into the standard run levels of the OS, start
openhpid using the service utility, reboot the server and check that
openhpid is running after the restart. Run the hpitop client (at least 2-3
times).

Scenario 1: Running openhpid in background directly in a shell:

1. export OPENHPI_DEBUG=YES and export OPENHPI_DEBUG_TRACE=YES (this step is
not mandatory; I set these only to surface any additional errors).
2. Started openhpid -c /etc/openhpi/openhpi.conf (the same conf used for the
previous runs; it contains a libipmi entry pointing to a non-existent
machine).
3. Ran hpitop multiple times; the "wrong version" error is not reported.

Scenario 2: Running openhpid using service utility

1. service openhpid start
2. Ran hpitop multiple times; the "wrong version" error is not reported.

Scenario 3: Incorporate openhpid startup into rc.local, start openhpid using
the service utility, reboot the server and check that openhpid is running
after the restart. Run the hpitop client (at least 2-3 times):

1. openhpid is started when the OS boots up.
2. Ran hpitop multiple times; the "wrong version" error is not reported.

Based on the above testing, I believe the patch addresses the issue and
fixes bug id 1939812.

Thanks for the patch.

Regards,
MS



________________________________
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Renier Morales
Sent: Tuesday, May 13, 2008 10:30 AM
To: [email protected]
Subject: Re: [Openhpi-devel] Regarding bug id 1939812 (openhpid doesnt work 
correctly for non-existent machine)


[EMAIL PROTECTED] wrote on 05/12/2008 10:38:51 AM:

> Hello Zhang Huan,
> I was finally able to reproduce the bug in my i386 server installed
> with RHEL5.1.
>
> As identified by you, this bug is reproducible only when the
> openhpid process is run in the background
> (done by invoking the openhpid using service utility, i.e. "service
> openhpid start" or
> by running the openhpid in background directly i.e. "openhpid -c
> /etc/openhpi/openhpi.conf")
>
(snip)
>
> When the openhpid is started in background, ipmi connection error is
> reported and the openhpid continuously tries to discover with no success.
>
> When the hpitop client is started, there is stream socket connection
> established between openhpid process and hpitop client program.
> Since openhpid is running in the background, File handlers like
> stderr, stdout and stdin are not attached to any TTY.

A daemon inherits file descriptors from its parent and has access to them even 
when running in the background.

> But for reporting the ipmi connection error, the IPMI plugin tries
> to write into stderr using fprintf (plugins/ipmi/ipmi.c, line no
> 604), this error message gets into data stream being read by client:
>
> Extract of ipmi.c:
>
>  602         while (ipmi_handler->fully_up == 0) {
>  603                 if (!ipmi_handler->connected) {
>  604                         fprintf(stderr, "IPMI connection is down\n");
>  605                         return SA_ERR_HPI_NO_RESPONSE;
>  606                 }
>
>
> (gdb) print data
> $12 = 0xbfc14d2b "IPMI connection is down\n\001\021"
> (gdb) print *data
> $13 = 73 'I'
>
> Provided above is the corrupted message content received at hpitop
> client's side (Captured in gdb).
> Due to this problem the ReadMsg  reports wrong version error.
>
(snip)

This analysis helped me understand the problem a great deal. Thanks MS.

I found that the daemon was closing the stdout and stderr descriptors on
startup. The problem was harder to find because the code does not check the
return value of fprintf. I believe fprintf must be returning an error in that
case, since it's trying to write to an invalid descriptor.
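
For illustration only, here is a minimal sketch of one way a closed stderr
could end up feeding the client, which would be consistent with the gdb
capture above. It relies on the POSIX rule that new descriptors take the
lowest free number, so once fd 2 is closed a later socket can be given
number 2. This is not OpenHPI code: stderr_leak_demo is a hypothetical name,
and a socketpair stands in for the daemon/client connection.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical demo: close fd 2, create a socket pair, and write to stderr.
 * If one end of the pair was given descriptor number 2, the message is
 * read back from the peer end. Returns the number of bytes the peer
 * received, or -1 if the setup did not work out.
 */
ssize_t stderr_leak_demo(char *buf, size_t len)
{
    int sv[2];
    int saved;
    ssize_t n = -1;

    assert(buf != NULL && len > 1);

    fflush(stderr);
    saved = dup(STDERR_FILENO);       /* remember the real stderr */
    if (saved < 0)
        return -1;
    close(STDERR_FILENO);             /* mimic the daemon closing fd 2 */

    /* New descriptors take the lowest free numbers, so one end of the
     * pair becomes fd 2 (assuming fds 0 and 1 are open). */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == 0) {
        if (sv[0] == STDERR_FILENO || sv[1] == STDERR_FILENO) {
            int peer = (sv[0] == STDERR_FILENO) ? sv[1] : sv[0];

            fprintf(stderr, "IPMI connection is down\n");
            fflush(stderr);
            n = read(peer, buf, len - 1);   /* the "client" side */
            if (n >= 0)
                buf[n] = '\0';
        }
        close(sv[0]);
        close(sv[1]);
    }

    dup2(saved, STDERR_FILENO);       /* put the real stderr back */
    close(saved);
    return n;
}
```

Calling stderr_leak_demo(buf, sizeof buf) from a small main() with a 64-byte
buffer should leave the error string in buf, i.e. the peer reads what was
meant for stderr.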

>
> Testing and findings on openhpi.2.11.2:
> =============================
> The scenario here is much different, the definition of the "dbg"
> statements are written to stderr and hence most of the dbg statement
> gets mixed with the data being read by hpitop client program.
>
> In 2.10.2, the dbg statements were written to syslog rather than stderr.
>
> Any suggestions on finding the solution for openhpi trunk.

I have attached a patch that keeps stdout and stderr open. I was able to 
reproduce the problem at home and the patch fixes the "wrong version" error.
One thing to try though is to set the openhpid service to start on machine 
startup with the same openhpi.conf, and test with the hpi clients after a fresh 
reboot. If the "wrong version" problem appears again, even with the patch, then 
we need to figure out how to create stdout/stderr when they don't exist for the 
daemon initially.
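
In case it comes to that, one common idiom is to point any missing standard
descriptor at /dev/null before doing anything else. The sketch below is only
a suggestion, not part of the attached patch; ensure_std_fds is a
hypothetical helper name.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: make sure descriptors 0, 1 and 2 are open.
 * open() returns the lowest free descriptor, so we keep opening /dev/null
 * until the result is above 2; every descriptor below that is then in use.
 * Returns 0 on success, -1 if /dev/null cannot be opened.
 */
int ensure_std_fds(void)
{
    int fd;

    do {
        fd = open("/dev/null", O_RDWR);
        if (fd < 0)
            return -1;
    } while (fd <= STDERR_FILENO);

    assert(fd > STDERR_FILENO);   /* this last one was not needed */
    close(fd);
    return 0;
}
```

The daemon would call this once, early in startup and before any sockets are
created, so that no client connection can be handed descriptor 0, 1 or 2.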

Please try the patch out and let me know.

        --Renier
