Hey Eric,

On Tue, 2010-05-04 at 19:16 -0700, Eric Pooch wrote:
> From: epo...@cox.net
> Subject: Re: [Freeipmi-users] Disabled temp sensors
> Date: May 4, 2010 7:15:43 PM PDT
> To: ch...@llnl.gov
>
> Al,
> see below:
> On May 4, 2010, at 10:48 AM, Al Chu wrote:
>
> > Hey Eric,
> >
> > Ahhhh. That would do it. The slave address is probably wrong in the
> > SDR. When you run w/ bridging, the sensors attempts to bridge to an
> > address that is probably non-functional/illegal.
> >
> Yep.
>
> > LMK what your final patch looks like. I can work it into a workaround
> > of some sort for ipmi-sensors. (e.g.
> > --workaround-flags=assumebmcslaveaddr).
> >
>
> I can work on it, but I wanted to make sure that I don't just have
> some errors in the SDR that are causing the problem. If I issue the
> Clear SDR Repository command, will this cause me to lose information,
> or will the SDR repository get rebuilt fresh on its own?

I don't know of HW that will rebuild an SDR, so I wouldn't recommend
that. Best bet is a firmware reflash.

> > What does the ipmi-sel output look like? I'm wondering if SEL events
> > are reporting right/wrong slave addresses and sensor related outputs
> > are outputting correctly or not.
> >
>
> sudo ipmi-sel
> ipmi_sel_parse: internal IPMI error

Hmmm. Maybe a similar internal issue. Can you send --debug output?

> >> My fans still look messed up.
> >>
> >
> > It certainly depends on if the SDR is correct or not. From the output
> > below, it looks as though the Fans are "transition" fans. They only
> > report the transition state instead of an RPM. If they aren't
> > "transition" fans, then the SDR might be wrong which is leading
> > to this kind of output.
> >
> There is also a valid sensor reading, but it doesn't look like the
> library supports that.
>
> > BTW, you forgot the debug output from your previous e-mail.
> >
> I did send it as an attachment, but I think it got filtered out.

Maybe it did. What are you sending it as? If the output is too big, a
gzip should be sufficient.

Al

> Thanks a lot
> --Eric
>
> > Al
> >
> > On Mon, 2010-05-03 at 21:32 -0700, Eric Pooch wrote:
> >
> >> OK, I think I found the problem on my computer's implementation
> >> of IPMI
> >> I edited:
> >> /freeipmi-0.8.5/libfreeipmi/src/sensor-read/ipmi-sensor-read.c
> >>
> >> /* if (slave_address == IPMI_SLAVE_ADDRESS_BMC)*/
> >> if (slave_address != IPMI_SLAVE_ADDRESS_BMC)
> >> And received what looks like good data:
> >>
> >> 1 | Fan 1 | Fan | N/A | N/A | 'transition to Off Line'
> >> 2 | Fan 2 | Fan | N/A | N/A | 'transition to Running' 'transition to On Line'
> >> 3 | Fan 3 | Fan | N/A | N/A | 'transition to Running' 'transition to On Line'
> >> 4 | Fan 4 | Fan | N/A | N/A | 'transition to Running' 'transition to On Line'
> >> 5 | PCI Fan | Fan | N/A | N/A | 'transition to Off Line'
> >> 6 | Memory | Memory | N/A | N/A | 'OK'
> >> 7 | CPU 1 | Processor | N/A | N/A | 'Processor Presence detected'
> >> 8 | CPU 2 | Processor | N/A | N/A | 'Processor Presence detected'
> >> 9 | VRM | Voltage | N/A | N/A | 'OK'
> >> 10 | CPU1 Temperature | Temperature | 35.00 | C | 'OK'
> >> 11 | CPU2 Temperature | Temperature | 33.00 | C | 'OK'
> >> 12 | Thermal Trip | Temperature | N/A | N/A | 'OK'
> >> 13 | Sys Temperature | Temperature | 31.00 | C | 'OK'
> >> 14 | DDR 1.25V | Voltage | 1.25 | V | 'OK'
> >> 15 | Sys 3.3V | Voltage | 3.25 | V | 'OK'
> >> 16 | Sys 5V | Voltage | 5.00 | V | 'OK'
> >> 17 | CIOBE 1.2V | Voltage | 1.21 | V | 'OK'
> >> 18 | CIOBE 2.5V | Voltage | 2.52 | V | 'OK'
> >> 19 | BIOS Progress | System Firmware Progress | N/A | N/A | N/A
> >> 20 | Watchdog | Watchdog 2 | N/A | N/A | N/A
> >>
> >> This is much better, and I get info for almost all of the sensors
> >> that just showed N/A before. My fans still look messed up.
> >> I will figure out more details, make it a bit cleaner and send a
> >> patch for users of this flawed IPMI implementation.
> >> Thanks
> >> --Eric
> >>
> >> On May 3, 2010, at 8:42 PM, Eric Pooch wrote:
> >>
> >>> Ok, I updated to 0.8.5 and attached an archive of the debug log
> >>> from:
> >>> $ sudo ipmi-sensors --debug
> >>> see below:
> >>> On May 3, 2010, at 5:23 PM, Al Chu wrote:
> >>>
> >>>> Hey Eric,
> >>>>
> >>>>> Also, bmc-info returns IPMI version 1.0 that is probably not
> >>>>> supported by FreeIPMI, but ipmi-locate, returns "IPMI Version:
> >>>>> 1.5" for all of the devices.
> >>>>
> >>>> Doing a quick online search, this machine appears to be pretty
> >>>> old. It is possible that it does not support IPMI 1.5. The output
> >>>> from ipmi-locate you're seeing may be the defaults and not actual
> >>>> outputs from the machine (this is confusing many people so I'm
> >>>> changing this output for the next 0.9.1 release). If it is only
> >>>> IPMI 1.0, there's probably not much I can do to help you, since
> >>>> many of the IPMI commands will just not be supported on your
> >>>> motherboard.
> >>>>
> >>> Ok. I understand
> >>>
> >>>>> First, all my sensor values come back as [NA] even though most
> >>>>> work properly under ipmitool.
> >>>>
> >>>> I assume you're using FreeIPMI 0.7.X b/c the newest one (0.8.X
> >>>> line) does not have "[NA]" output. There have certainly been fixes
> >>>> since then, so you may wish to upgrade. My initial guess was
> >>>> bridging, but you seem to have tried that.
> >>>>
> >>>> I've noticed on some motherboards that there are issues b/c I find
> >>>> errors/problems in other parts of IPMI that ipmitool doesn't,
> >>>> thus I output errors and they don't. We need to dig into the core
> >>>> of the errors on your board to figure out what they/I are doing
> >>>> wrong/differently. Can you provide --debug output?
> >>>>
> >>>>> So, I think maybe there is something that is disabling the Temp
> >>>>> sensor at another level. I noticed on the HP lightsout user guide
> >>>>> that they have a setting "o PEF Control - Enables or disables the
> >>>>> sensor."
> >>>>
> >>>> Based on some of the error messages you posted from ipmitool
> >>>> (BTW, in the future could you indicate what tools the error
> >>>> messages came from, I thought you were indicating FreeIPMI errors
> >>>> and couldn't find them at first),
> >>>>
> >>> Sorry, I thought I was listing FreeIPMI errors, but I guess I
> >>> posted errors from the wrong log.
> >>>
> >>>> my guess is bridging is not supported on your motherboard and/or
> >>>> there is a firmware issue w/ bridging, so the temp sensors can't
> >>>> be reached.
> >>>>
> >>> I would agree, except that the standard IPMI raw "get sensor
> >>> reading" command works fine. It is almost like ipmi-sensors and
> >>> ipmitool are finding something they don't like in the sdr and not
> >>> trying to read the sensor at all.
> >>>
> >>> $ sudo ipmi-raw 0 04 2d 0A
> >>> rcvd: 2D 00 23 C0 00 00
> >>>
> >>> 0x23 = 35 degrees celsius, which seems right for my processor
> >>> temp. As I mentioned before, it varies proportionally with server
> >>> load, seems like the value I need, and is the correct command as
> >>> far as I can tell from the IPMI v 1.5 specs
> >>>
> >>>> It's hard to say. If you can provide me --debug output from
> >>>> ipmi-sensors, I can maybe analyze it deeper.
> >>>>
> >>> $ sudo ipmi-sensors --debug
> >>> see attachment
> >>>
> >>> $ sudo ipmi-sensors --bridge
> >>> ipmi_sensor_read: internal IPMI error
> >>>
> >>>> Does any HP specific software work for you for all these
> >>>> sensors? If their software does, and ipmitool/FreeIPMI does not,
> >>>> it indicates there is something kooky on your motherboard.
> >>>>
> >>> I don't know, I don't have access to Windows.
> >>> If it won't work with FreeIPMI, I understand that my motherboard
> >>> is old, but it just seems strange that I can get the sensor
> >>> reading using ipmi-raw, but not ipmi-sensors.
> >>>
> >>> Thanks a lot for your help.
> >>> --Eric
> >>>
> >>>> Al
> >>>>
> >>>> On Sun, 2010-05-02 at 10:10 -0700, Eric Pooch wrote:
> >>>>
> >>>>> I am having several problems on my HP ProLiant DL140 G1
> >>>>>
> >>>>> First, all my sensor values come back as [NA] even though most
> >>>>> work properly under ipmitool.
> >>>>> I get the debug errors from ipmi-sensors:
> >>>>>
> >>>>> Error reading event status for sensor #09: Invalid command
> >>>>> ...
> >>>>> Error reading event enable for sensor #09: Invalid command
> >>>>>
> >>>>> When I try ipmi-raw to send those commands, I also get the same
> >>>>> error, so I think the commands are not supported on the sensors.
> >>>>> The sensors are returning the proper information when I send a
> >>>>> raw command to get their readings. (see below)
> >>>>>
> >>>>> However, none of my temp sensors work properly in either
> >>>>> freeipmi or ipmitool and I get a debug error:
> >>>>> Error reading sensor CPU1 Temperature (#0a): Destination unavailable
> >>>>>
> >>>>> I get the same "destination unavailable" message from event
> >>>>> status and event enable. However, when I enter the raw ipmi
> >>>>> command to read the temp sensor:
> >>>>> sudo ipmi-raw 0 04 2d 0A
> >>>>>
> >>>>> it responds correctly:
> >>>>> rcvd: 2D 00 1B C0 00 00
> >>>>>
> >>>>> The 1B is the correct temperature in Celsius that rises with
> >>>>> processor load. It is definitely the correct temperature.
> >>>>> I have tried the bridge mode but I get an error also.
> >>>>> It seems like the sensor is responding correctly, but is
> >>>>> disabled as far as the sdr is concerned?
> >>>>> I can't enable it through a raw command because none of the
> >>>>> sensors respond to the "event status" or "event enable"
> >>>>> commands. So, I think maybe there is something that is disabling
> >>>>> the Temp sensor at another level. I noticed on the HP lightsout
> >>>>> user guide that they have a setting "o PEF Control - Enables or
> >>>>> disables the sensor."
> >>>>> I am not really sure how to make a change that would cause the
> >>>>> sensor to be enabled.
> >>>>>
> >>>>> Also, bmc-info returns IPMI version 1.0 that is probably not
> >>>>> supported by FreeIPMI, but ipmi-locate, returns "IPMI Version:
> >>>>> 1.5" for all of the devices.
> >>>>>
> >>>>> Thanks for any help!
> >>>>> _______________________________________________
> >>>>> Freeipmi-users mailing list
> >>>>> Freeipmi-users@gnu.org
> >>>>> http://lists.gnu.org/mailman/listinfo/freeipmi-users
> >>>>>
> >>>> --
> >>>> Albert Chu
> >>>> ch...@llnl.gov
> >>>> Computer Scientist
> >>>> High Performance Systems Division
> >>>> Lawrence Livermore National Laboratory
> >>>>
--
Albert Chu
ch...@llnl.gov
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory
_______________________________________________
Freeipmi-users mailing list
Freeipmi-users@gnu.org
http://lists.gnu.org/mailman/listinfo/freeipmi-users
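
For reference, the raw "get sensor reading" replies quoted in this thread
(e.g. "rcvd: 2D 00 23 C0 00 00") can be picked apart by hand. Below is a
minimal Python sketch of that decoding; it is not FreeIPMI code, and the
function name decode_get_sensor_reading is made up for illustration. It
assumes the ipmi-raw output layout seen above (command byte, completion
code, then response data) and the Get Sensor Reading status bits from the
IPMI spec (bit 7 event messages enabled, bit 6 scanning enabled, bit 5
reading unavailable). It returns only the raw reading byte. Treating that
byte directly as degrees Celsius, as done in the thread, is an assumption
that holds only if the SDR conversion factors for the sensor are trivial.

```python
def decode_get_sensor_reading(rcvd: str) -> dict:
    """Decode an ipmi-raw 'rcvd:' line for Get Sensor Reading (cmd 0x2D).

    Hypothetical helper for illustration only; not part of FreeIPMI.
    """
    # Drop the "rcvd:" label if present, then parse the hex bytes.
    text = rcvd.split(":", 1)[1] if ":" in rcvd else rcvd
    data = [int(tok, 16) for tok in text.split()]
    if len(data) < 2:
        raise ValueError("response too short")

    result = {"cmd": data[0], "completion_code": data[1]}
    # A non-zero completion code means the command failed; no reading follows.
    if data[1] != 0x00 or len(data) < 4:
        return result

    raw, status = data[2], data[3]
    result.update(
        raw_reading=raw,                            # unconverted sensor byte
        event_messages_enabled=bool(status & 0x80), # status bit 7
        scanning_enabled=bool(status & 0x40),       # status bit 6
        reading_unavailable=bool(status & 0x20),    # status bit 5
    )
    return result


if __name__ == "__main__":
    for line in ("rcvd: 2D 00 23 C0 00 00", "rcvd: 2D 00 1B C0 00 00"):
        print(line, "->", decode_get_sensor_reading(line))
```

Read this way, the status byte 0xC0 in both replies has the scanning
enabled bit set and the reading unavailable bit clear, which would suggest
the BMC itself does not consider the sensor disabled.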