Hey Al, here is the sdr-cache. 'sdr-cache-p300slg01.10.136.17.128' is the file for gtseval-ipmi; 'sdr-cache-p300slg01.10.136.17.170' is another cache file, from a call of ipmi-sensors that works fine.
I'm using FreeIPMI on a system with SUSE 10.1.

---------
p300slg01:/usr/local/src # uname -a
Linux p300slg01 2.6.16.27-0.9-smp #1 SMP Tue Feb 13 09:35:18 UTC 2007 i686 i686 i386 GNU/Linux
---------

In your test4-code, I had to change the following lines to compile w/o errors:

common/src/pstdout.c
-243: fprintf(stderr, "Default stack size = %li bytes \n", mystacksize);
+243: fprintf(stderr, "Default stack size = %li bytes \n", (long)mystacksize);
+501: va_list vacpy;

---------

I've tested FreeIPMI locally again. I was wrong: it crashes there, too. I guess I confused it with IPMItool, which runs fine locally but gives warnings over the network. I don't know whether this helps you:

Locally:

[EMAIL PROTECTED]:~/ipmi/usr/bin> ./ipmitool -I open sensor
ACPI State       | 0x1        | discrete   | 0x0180| na        | na        | na        | na        | na        | na
System Reset     | 0x0        | discrete   | 0x0080| na        | na        | na        | na        | na        | na
POST Error       | na         | discrete   | na    | na        | na        | na        | na        | na        | na
Memory ECC       | na         | discrete   | na    | na        | na        | na        | na        | na        | na
PCI Error        | na         | discrete   | na    | na        | na        | na        | na        | na        | na
Fan Error        | na         | discrete   | na    | na        | na        | na        | na        | na        | na
Watchdog         | na         | discrete   | na    | na        | na        | na        | na        | na        | na
CPU Fan 1        | 9992.006   | RPM        | ok    | na        | na        | na        | 3996.803  | 3475.480  | na
CPU Fan 2        | 10426.441  | RPM        | ok    | na        | na        | na        | 3996.803  | 3475.480  | na
CPU Fan 3        | 9992.006   | RPM        | ok    | na        | na        | na        | 3996.803  | 3475.480  | na
CPU Fan 4        | 10426.441  | RPM        | ok    | na        | na        | na        | 3996.803  | 3475.480  | na
CPU Fan 5        | 9223.391   | RPM        | ok    | na        | na        | na        | 3996.803  | 3475.480  | na
CPU Fan 6        | 10900.371  | RPM        | ok    | na        | na        | na        | 3996.803  | 3475.480  | na
CPU Fan 7        | 9992.006   | RPM        | ok    | na        | na        | na        | 3996.803  | 3475.480  | na
CPU Fan 8        | 10900.371  | RPM        | ok    | na        | na        | na        | 3996.803  | 3475.480  | na
CPU Fan 9        | 9992.006   | RPM        | ok    | na        | na        | na        | 3996.803  | 3475.480  | na
CPU Fan 10       | 10426.441  | RPM        | ok    | na        | na        | na        | 3996.803  | 3475.480  | na
System Fan 1     | 9992.006   | RPM        | ok    | na        | na        | na        | 3996.803  | 3475.480  | na
System Fan 2     | 10900.371  | RPM        | ok    | na        | na        | na        | 3996.803  | 3475.480  | na
CPU0 Vcore       | 1.107      | Volts      | ok    | na        | 0.402     | 0.500     | 1.597     | 1.695     | na
CPU1 Vcore       | na         | Volts      | na    | na        | 0.402     | 0.500     | 1.597     | 1.695     | na
Standby 5V       | 4.969      | Volts      | ok    | na        | 4.263     | 4.528     | 5.527     | 5.792     | na
System 5V        | 4.851      | Volts      | ok    | na        | 4.263     | 4.528     | 5.527     | 5.792     | na
System 3.3V      | 3.234      | Volts      | ok    | na        | 2.822     | 2.999     | 3.675     | 3.851     | na
3V CMOS Sense    | 3.028      | Volts      | ok    | na        | 2.617     | 2.781     | na        | na        | na
CPU0 Therm Diode | na         | degrees C  | na    | na        | 10.000    | na        | 68.000    | 80.000    | 95.000
CPU1 Therm Diode | na         | degrees C  | na    | na        | 10.000    | na        | 68.000    | 80.000    | 95.000
CPU0 ThermDiode2 | na         | degrees C  | na    | na        | 10.000    | na        | 68.000    | 80.000    | 95.000
CPU1 ThermDiode2 | na         | degrees C  | na    | na        | 10.000    | na        | 68.000    | 80.000    | 95.000
AMB Temp         | 29.000     | degrees C  | ok    | na        | 10.000    | na        | 30.000    | 45.000    | na
MultiBit ECC ER  | 0x0        | discrete   | 0x0180| na        | na        | na        | na        | na        | na
VDD Power Fail   | 0x0        | discrete   | 0x0180| na        | na        | na        | na        | na        | na
Reset            | 0x0        | discrete   | 0x0180| na        | na        | na        | na        | na        | na
Identify         | 0x0        | discrete   | 0x0180| na        | na        | na        | na        | na        | na
NMI              | 0x0        | discrete   | 0x0180| na        | na        | na        | na        | na        | na
CPU0 Therm-Trip  | 0x0        | discrete   | 0x0180| na        | na        | na        | na        | na        | na
CPU1 Therm-Trip  | na         | discrete   | na    | na        | na        | na        | na        | na        | na
CPU0 IERR        | 0x0        | discrete   | 0x0180| na        | na        | na        | na        | na        | na
CPU1 IERR        | na         | discrete   | na    | na        | na        | na        | na        | na        | na
CPU0 Prochot     | 0x0        | discrete   | 0x0180| na        | na        | na        | na        | na        | na
CPU1 Prochot     | na         | discrete   | na    | na        | na        | na        | na        | na        | na
CPU0 SocketOcc   | 0x1        | discrete   | 0x0280| na        | na        | na        | na        | na        | na
CPU1 SocketOcc   | 0x0        | discrete   | 0x0180| na        | na        | na        | na        | na        | na
CPU0 Dmn 0 Temp  | 45.000     | degrees C  | ok    | na        | na        | na        | na        | 85.000    | 95.000
CPU1 Dmn 0 Temp  | na         | degrees C  | na    | na        | na        | na        | na        | 85.000    | 95.000
CPU0 Dmn 1 Temp  | 46.000     | degrees C  | ok    | na        | na        | na        | na        | 85.000    | 95.000
CPU1 Dmn 1 Temp  | na         | degrees C  | na    | na        | na        | na        | na        | 85.000    | 95.000

Over an RMCP+ session:

[...]
System Reset     | 0x0        | discrete   | 0x0080| na        | na        | na        | na        | na        | na
Error reading sensor POST Error (#01)
Error reading sensor Memory ECC (#02)
Error reading sensor PCI Error (#03)
Error reading sensor Fan Error (#04)
Watchdog         | na         | discrete   | na    | na        | na        | na        | na        | na        | na
CPU Fan 1        | 9992.006   | RPM        | ok    | na        | na        | na        | 3996.803  | 3475.480  | na
[...]

The lines omitted above ([...]) are identical to the local output.

-----------

I've also run ipmi-sensors from an x86_64 machine against gtseval-ipmi, and it crashes with the same error (second attachment).

So... Enough debugging for today.

Have a nice day,
Gregor

Al Chu wrote:
> Hey Gregor,
>
> Although it's unlikely your problem, I saw one other potential issue.
> So I added a fix in this slightly newer tar.gz.
>
> Thanks,
> Al
>
> On Mon, 2007-10-08 at 11:51 -0700, Al Chu wrote:
>> Hey Gregor,
>>
>> Here's another tar.gz. Could you run ./configure with --enable-debug
>> and run with --debug again? The gdb output confirms the line I believed
>> was causing the problem, but I still can't quite figure out how the
>> corruption is happening. So I put in a lot more printfs.
>>
>> I do have at least two other suspicions that depend on your system. So
>> do you think you could also send me the SDR from ~/.freeipmi/sdr-cache/
>> for me to analyze, and also could you tell me what Linux you are running
>> on the i386 box? I'm wondering if you have some older distribution (b/c
>> it's i386) and it has slightly different threads behavior that I'm not
>> handling properly.
>>
>> Thanks,
>> Al
>>
>>
>> On Sun, 2007-10-07 at 12:12 +0200, Gregor Dschung wrote:
>>> Hi Al,
>>>
>>> I attach again the output of the call with --debug and the backtrace. It
>>> was the first time that I used gdb, so I hope I understood the tutorials
>>> :)
>>>
>>> At the moment I'm not able to run ipmi-sensors locally, because I'm not
>>> root on "gtseval" (the host of gtseval-ipmi) and I have to wait until I get
>>> rw-rights for /dev/ipmi0 again. And it's the weekend ;)
>>>
>>> You are right, I'm running IPMItool and FreeIPMI on an i386. gtseval
>>> is a 64-bit system, so perhaps this is the reason for not crashing
>>> locally.
>>>
>>> Have a nice Sunday,
>>> Gregor
>>>
>>>
>>>> Hey Gregor,
>>>>
>>>> Can't see anything suspicious in the code. Here's another tar.gz that I
>>>> added a whole bunch of extra printfs to try and give me more information,
>>>> could you run again (./configure --enable-debug and run ipmi-sensors with
>>>> --debug again). Also, you mentioned that ipmi-sensors completes locally
>>>> without issue. Is the number of sensors listed below (ending w/ CPU1 Dmn
>>>> 1 Temp) the same as the number of sensors listed when you run locally?
>>>>
>>>> Also, is a core dump being output by this crash? Could you run gdb
>>>> against the core and get a backtrace? That'd be a lot of help too.
>>>>
>>>> Thanks for helping me look into this,
>>>>
>>>> Al
>>>>
>>>>> Hi Al,
>>>>>
>>>>> thanks for your fast answer.
>>>>>
>>>>> I've tested your test version and it seems to be on the right track. It
>>>>> still crashes, but now I get sensor data :) :
>>>>>
>>>>> [...]
>>>>>
>>>>
>>>> --
>>>> Albert Chu
>>>> [EMAIL PROTECTED]
>>>> 925-422-5311
>>>> Computer Scientist
>>>> High Performance Systems Division
>>>> Lawrence Livermore National Laboratory
>>>>

--
Gregor Dschung
System Life Guard, HiWi
Fraunhofer-Institut für Techno- und Wirtschaftsmathematik ITWM
Fraunhofer-Platz 1
D-67663 Kaiserslautern
E-Mail: [EMAIL PROTECTED]
Internet: www.itwm.fraunhofer.de
sdr-cache.tar.bz2
Description: application/bzip
ipmi-sensors.debug.tar.bz2
Description: application/bzip
ipmi-sensors_x64.debug.tar.bz2
Description: application/bzip
_______________________________________________
Freeipmi-devel mailing list
Freeipmi-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/freeipmi-devel