There are clearly some communication problems with the motherboard, leading to the "internal IPMI errors". Many times we send a request and never see a response. In at least one earlier case, the response wasn't even a fully formed packet.
But this made me realize what the problem might be. When you run IPMI commands (e.g. ipmi-sensors), are you using one of the kernel device drivers (e.g. Linux defaults to /dev/ipmi0) as your communication driver? The default ipmimonitoring-sensors example happens to use the KCS driver, which is separate from and unrelated to the kernel one. It may be conflicting with the kernel device driver: effectively, both are communicating with the BMC but not sharing a lock. If you are using /dev/ipmi0, changing the ipmimonitoring example to use the IPMI_MONITORING_DRIVER_TYPE_OPENIPMI driver will probably make things work.

Al

On Tue, 2017-07-18 at 11:43 -0700, Sohan Chowdary Kollu wrote:
> I am using 1.5.5 version.
>
> Below are the packet details along with errors. Except for the 3rd
> scenario all other errors are very frequent
>
> 1)
> Failed right away (first sdr request in the trace)
>
> Get SDR Repository Info Request
> =====================================================
> KCS Header:
> ------------
> [ 0h] = lun[ 2b]
> [ Ah] = net_fn[ 6b]
> IPMI Command Data:
> ------------------
> [ 20h] = cmd[ 8b]
> (ipmi_monitoring_sdr_cache.c, ipmi_monitoring_sdr_cache_load, 314): ipmi_sdr_cache_open: internal IPMI error
> ipmi_monitoring_sensor_readings_by_record_id: internal error
>
> 2)
> a) Failed right away (first sdr request in the trace)
> =====================================================
> Get SDR Repository Info Request
> =====================================================
> KCS Header:
> ------------
> [ 0h] = lun[ 2b]
> [ Ah] = net_fn[ 6b]
> IPMI Command Data:
> ------------------
> [ 20h] = cmd[ 8b]
> (ipmi_monitoring_sdr_cache.c, _ipmi_monitoring_sdr_cache_retrieve, 223): ipmi_sdr_cache_create: internal IPMI error
> ipmi_monitoring_sensor_readings_by_record_id: internal error
>
> b) Failed after going though some sdr requests
> =====================================================
> Get SDR Request
> =====================================================
> KCS Header:
> ------------
> [ 0h] = lun[ 2b]
> [ Ah] = net_fn[ 6b]
> IPMI Command Data:
> ------------------
> [ 23h] = cmd[ 8b]
> [ 8820h] = reservation_id[16b]
> [ 82h] = record_id[16b]
> [ 25h] = offset_into_record[ 8b]
> [ 10h] = bytes_to_read[ 8b]
> (ipmi_monitoring_sdr_cache.c, _ipmi_monitoring_sdr_cache_retrieve, 223): ipmi_sdr_cache_create: internal IPMI error
> ipmi_monitoring_sensor_readings_by_record_id: internal error
>
> 3)
> Failed right away (first sdr request in the trace). Seen this only
> twice
> =====================================================
> Get SDR Repository Info Request
> =====================================================
> KCS Header:
> ------------
> [ 0h] = lun[ 2b]
> [ Ah] = net_fn[ 6b]
> IPMI Command Data:
> ------------------
> [ 20h] = cmd[ 8b]
> (ipmi_monitoring_sdr_cache.c, ipmi_monitoring_sdr_cache_load, 336): ipmi_sdr_cache_open: internal IPMI error
> ipmi_monitoring_sensor_readings_by_record_id: internal error
>
> 4)
> a) Failed at Reading Request
> =====================================================
> Get Sensor Reading Request
> =====================================================
> KCS Header:
> ------------
> [ 0h] = lun[ 2b]
> [ 4h] = net_fn[ 6b]
> IPMI Command Data:
> ------------------
> [ 2Dh] = cmd[ 8b]
> [ B0h] = sensor_number[ 8b]
> (ipmi_monitoring_sensor_reading.c, _get_sensor_reading, 356): ipmi_sensor_read: internal IPMI error
> (ipmi_monitoring.c, _ipmi_monitoring_sensor_readings_by_record_id, 1449): ipmi_sdr_cache_iterate: error returned in callback
> ipmi_monitoring_sensor_readings_by_record_id: internal error
>
> b) Failed at Reading Response
> =====================================================
> Get Sensor Reading Request
> =====================================================
> KCS Header:
> ------------
> [ 0h] = lun[ 2b]
> [ 4h] = net_fn[ 6b]
> IPMI Command Data:
> ------------------
> [ 2Dh] = cmd[ 8b]
> [ 90h] = sensor_number[ 8b]
> =====================================================
> Get Sensor Reading Response
> =====================================================
> KCS Header:
> ------------
> [ 0h] = lun[ 2b]
> [ 5h] = net_fn[ 6b]
> IPMI Command Data:
> ------------------
> [ 0h] = cmd[ 8b]
> (ipmi_monitoring_sensor_reading.c, _get_sensor_reading, 356): ipmi_sensor_read: internal IPMI error
> (ipmi_monitoring.c, _ipmi_monitoring_sensor_readings_by_record_id, 1449): ipmi_sdr_cache_iterate: error returned in callback
> ipmi_monitoring_sensor_readings_by_record_id: internal error
>
> Thanks
>
> On Mon, Jul 17, 2017 at 11:46 PM, Albert Chu <achu.de...@gmail.com>
> wrote:
> Hi,
>
> What version of FreeIPMI are you using? The line numbers
> don't quite line up with the master branch.
>
> Also, could you set IPMI_MONITORING_FLAGS_DEBUG_IPMI_PACKETS
> and show the IPMI packet that occurs right before the error
> line?
>
> Thanks,
>
> Al
>
> On Mon, Jul 17, 2017 at 4:28 PM, Sohan Chowdary Kollu
> <sko...@ncsu.edu> wrote:
> Hi Albert,
>
> Thanks for quick response. I have set the flags for
> debugging and found it failing at one of the three
> instances below in different runs.
>
> 1) (ipmi_monitoring_sensor_reading.c, _get_sensor_reading, 356): ipmi_sensor_read: internal system error
> (ipmi_monitoring.c, _ipmi_monitoring_sensor_readings_by_record_id, 1449): ipmi_sdr_cache_iterate: error returned in callback
> ipmi_monitoring_sensor_readings_by_record_id: internal error
>
> 2) (ipmi_monitoring_sdr_cache.c, ipmi_monitoring_sdr_cache_load, 314): ipmi_sdr_cache_open: internal IPMI error
> ipmi_monitoring_sensor_readings_by_record_id: internal error
>
> 3) (ipmi_monitoring_sdr_cache.c, _ipmi_monitoring_sdr_cache_retrieve, 223): ipmi_sdr_cache_create: internal IPMI error
> ipmi_monitoring_sensor_readings_by_record_id: internal error
>
> Thanks
>
> On Mon, Jul 17, 2017 at 2:34 PM, Albert Chu
> <ch...@llnl.gov> wrote:
> The "internal error" indicates some logical error that the library
> doesn't know how to handle. Given its coming from
> ipmi_monitoring_sensor_readings_by_record_id and it occurs when you run
> the program back to back, I would bet there is some internal IPMI issue
> on your system. Perhaps its a new error code or something like that
> that I do not handle gracefully correctly.
>
> To try and debug, could you set the flag "IPMI_MONITORING_FLAGS_DEBUG |
> IPMI_MONITORING_FLAGS_DEBUG_IPMI_PACKETS" when calling
> ipmimonitoring_init() in the example code. Hopefully that'll be enough
> to figure out the issue.
>
> Al
>
> On Mon, 2017-07-17 at 13:03 -0700, Sohan Chowdary Kollu wrote:
> > Hi,
> >
> > I am executing the ipmimonitoring-sensors.c example provided in the
> > freeipmi library. It throws internal error sometimes. Issue is
> > reproducible when i execute the program back to back couple of times.
> > I need to wait approximately 30 sec or more after the last execution
> > for the program to run properly.
> >
> > This is the error ipmi_monitoring_sensor_readings_by_record_id:
> > internal error
> >
> > I ran some of the commands on terminal back to back , including
> > ipmi-sensors with group option, ipmimonitoring etc. None of them thew
> > any errors. Error occurs only when i am use the API.
> >
> > Has anyone faced this issue before? If yes, can you tell me how to
> > avoid it
> >
> > Thanks,
> > Sohan
> >
> > _______________________________________________
> > Freeipmi-devel mailing list
> > Freeipmi-devel@gnu.org
> > https://lists.gnu.org/mailman/listinfo/freeipmi-devel
>
> --
> Albert Chu
> ch...@llnl.gov
> Computer Scientist
> High Performance Systems Division
> Lawrence Livermore National Laboratory
>
> --
> Thanks,
> Sohan
>
> --
> Thanks,
> Sohan

--
Albert Chu
ch...@llnl.gov
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory