I don't have the I2C spec, but I was assured by the patch author that this failure was a temporary failure.
What is bizarre, though, is that this is only used when going out on the IPMB; it shouldn't have any effect on local BMC messages. I'd be surprised if this sensor was on the IPMB, or even if this box had an IPMB at all.

I suspect that there is some incorrect SDR information or information that ipmitool is misinterpreting. I don't think you can dump the raw SDRs via ipmitool. You can with openipmi (with the "mc sdr" command). You could pull the smi connection up in the GUI, dump the SDRs on mc (0.20), and find the sensor in the tree and get the information about it there.

-Corey

Bela Lubkin wrote:
>
> ...
>
> So. One reason I was pursuing this anomaly was that, as I said, it got
> _much_ worse with some driver changes. I have now gone back and serially
> layered on all the patches I'm trying to integrate.
>
> The cause of the extra slowdowns is the "Retryable return codes" patch,
>
>   http://www.mail-archive.com/[email protected]/msg00451.html
>
> i.e.:
>
> ipmi_msghandler.c:ipmi_smi_msg_received():
>
>       if ((msg->rsp_size >= 3) && (msg->rsp[2] != 0)
>           && (msg->rsp[2] != IPMI_NODE_BUSY_ERR)
> -         && (msg->rsp[2] != IPMI_LOST_ARBITRATION_ERR))
> +         && (msg->rsp[2] != IPMI_LOST_ARBITRATION_ERR)
> +         && (msg->rsp[2] != IPMI_BUS_ERR)
> +         && (msg->rsp[2] != IPMI_NAK_ON_WRITE_ERR))
>
> I instrumented this and found that the driver is getting lots of
> IPMI_NAK_ON_WRITE_ERRs. No IPMI_BUS_ERRs. Each of the slow sensors
> hits 5 (exactly 5) IPMI_NAK_ON_WRITE_ERRs before completing. This
> number 5 corresponds to ipmi_msghandler.c:i_ipmi_request():
>
>       if (addr->addr_type == IPMI_IPMB_BROADCAST_ADDR_TYPE)
>           retries = 0; /* Don't retry broadcasts. */
>       else
> -->       retries = 4;
>
> It's retrying the command 4 times (== 5 total) before failing.
>
> Even if I set retries = 0 here, it's still much slower than without
> checking for IPMI_NAK_ON_WRITE_ERR.
> Total runtime for `ipmitool sensor` goes from 5s (no
> IPMI_NAK_ON_WRITE_ERR checking) to 24s (checking + 0 retries) to 112s
> (checking + 4 retries).
>
> Is it right that there's a 1s delay on the failure path, even when
> it's on its last (re-)try?
>
> Anyway, for my setup, that patch is very harmful to performance.
>
> Meanwhile, on the trail of what's happening with the hardware, I
> return to a slice of my original output:
>
>   ##PS Redundancy |0x0 |discrete|0x0080|na|na|na|na|na|na
>   ##Drive |0x0 |discrete|0x0080|na|na|na|na|na|na
>   ##############ECC Corr Err|na |discrete|na |na|na|na|na|na|na
>   #####ECC Uncorr Err |na |discrete|na |na|na|na|na|na|na
>   #####I/O Channel Chk |na |discrete|na |na|na|na|na|na|na
>
> I think those "na" outputs in column 4 mean that we're not getting
> any information about those sensors. I should have noticed that
> earlier...
>
> According to `ipmitool sdr list -v all`, all of the problematic
> sensors correspond to "Entity ID: 34.6 (BIOS)". None of the happy
> sensors are provided by the BIOS.
>
> For the moment I am omitting the "Retryable return codes" patch
> from my working environment; then these sensors fail immediately
> instead of suffering 5-each 1s timeouts.
>
> Matt, is it expected that the BMC on a PE1800 can't get any sensor
> readings from the BIOS?
>
> >Bela<
>
> -------------------------------------------------------------------------
> Take Surveys. Earn Cash. Influence the Future of IT
> Join SourceForge.net's Techsay panel and you'll get the chance to share your
> opinions on IT & business topics through brief surveys - and earn cash
> http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
> _______________________________________________
> Openipmi-developer mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/openipmi-developer
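
For reference, here is a minimal sketch of the retry accounting described in the quoted message. It is not the driver's code. The helper names (cc_fails_immediately, total_attempts, worst_case_delay_ms) are hypothetical, the numeric completion-code values are an assumption (believed to match the kernel's ipmi_msgdefs.h, but check your tree), and the 1000 ms per-attempt timeout is taken only from the "1s timeouts" mentioned in the message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Completion codes named in the patch. The numeric values are an
 * assumption; verify them against ipmi_msgdefs.h in your kernel tree. */
#define IPMI_NODE_BUSY_ERR        0xc0
#define IPMI_LOST_ARBITRATION_ERR 0x81
#define IPMI_BUS_ERR              0x82
#define IPMI_NAK_ON_WRITE_ERR     0x83

/* Hypothetical helper mirroring the patched condition: a non-zero
 * completion code fails at once unless it is on the retryable list. */
bool cc_fails_immediately(uint8_t cc)
{
    return cc != 0
        && cc != IPMI_NODE_BUSY_ERR
        && cc != IPMI_LOST_ARBITRATION_ERR
        && cc != IPMI_BUS_ERR
        && cc != IPMI_NAK_ON_WRITE_ERR;
}

/* retries = 4 means one initial send plus four resends. */
unsigned total_attempts(unsigned retries)
{
    return retries + 1;
}

/* Upper bound on the time one command can spend waiting if every
 * attempt runs into the per-attempt timeout. */
unsigned worst_case_delay_ms(unsigned retries, unsigned timeout_ms)
{
    return total_attempts(retries) * timeout_ms;
}
```

With retries = 4 and a 1000 ms timeout this gives 5 attempts and up to 5000 ms per sensor, which is in line with the "5-each 1s timeouts" reported above. Without the patch, IPMI_NAK_ON_WRITE_ERR is not on the retryable list, so the same response fails immediately.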
