Hi Pat,

I understand that the fix for this problem is already included in the rc version and will be officially released as 1.8.14.

Thanks,
Kazuyuki Funada

(2014/04/17 22:40), Pat Donlin wrote:
> Funada,
>
> I have seen similar retry failures in the lanplus driver, and have
> submitted fixes which are presently part of the 1.8.14-rc1 build. The
> specific problem you describe is likely the lack of cleanup on the prior
> request before starting the retry request. The last 3 lines added to the
> diff below highlights the call to clean up the old request off the list.
>
> In general I am an advocate of increasing the default timeout for
> lanplus from 1 second to at least 4 seconds. My experience with Intel
> Romley and Grantley platforms has shown a steady increase in load on
> BMCs and more frequent occasions where the BMC simply cannot respond
> within 1 second.
>
> Regards,
>
> Pat Donlin
> Principal Engineer
> SGI
>
>
>
> diff -c lanplus.c.orig lanplus.c.leakfix
> *** lanplus.c.orig 2014-01-15 07:48:36.000000000 -0600
> --- lanplus.c.leakfix 2014-01-15 07:48:43.000000000 -0600
> ***************
> *** 2099,2104 ****
> --- 2099,2105 ----
>       uint8_t * msg_data;
>       int msg_length;
>       struct ipmi_session * session = intf->session;
> +     struct ipmi_rq_entry * entry = NULL;
>       int try = 0;
>       int xmit = 1;
>       time_t ltime;
> ***************
> *** 2123,2129 ****
>           /*
>            * Build an IPMI v1.5 or v2 command
>            */
> -         struct ipmi_rq_entry * entry;
>           struct ipmi_rq * ipmi_request = payload->payload.ipmi_request.request;
>
>           lprintf(LOG_DEBUG, "");
> --- 2124,2129 ----
> ***************
> *** 2304,2309 ****
> --- 2304,2312 ----
>
>           if (rsp)
>               break;
> +         // req timed out, remove entry
> +         if ((payload->payload_type == IPMI_PAYLOAD_TYPE_IPMI) && entry)
> +             ipmi_req_remove_entry( entry->rq_seq, entry->req.msg.cmd);
>       }
>
>       /* only timeout if time exceeds the timeout value */
>
> On 4/17/2014 7:22 AM, ipmitool-devel-requ...@lists.sourceforge.net wrote:
>> Today's Topics:
>>
>>    1. regarding libipmitool library (sarath azad)
>>    2. Implementation of lanplus for retry (Kazuyuki Funada)
>>    3. [BMR #81324] PigeonPoint Systems various patches (Dmitry Bazhenov)
>>
>>
>> ----------------------------------------------------------------------
>>
>>
>> ------------------------------
>>
>> Message: 2
>> Date: Fri, 11 Apr 2014 09:25:58 +0000
>> From: Kazuyuki Funada <funada-...@necst.nec.co.jp>
>> Subject: [Ipmitool-devel] Implementation of lanplus for retry
>> To: "ipmitool-devel@lists.sourceforge.net"
>>     <ipmitool-devel@lists.sourceforge.net>
>> Message-ID:
>>     <c91f56d9b1a7be42b84e6a85bcb0aaf301004...@bpxm22gp.gisp.nec.co.jp>
>> Content-Type: text/plain; charset="iso-2022-jp"
>>
>>
>> Hello. I'm a newbie in this list and I have a question about current
>> implementation of lanplus for retry.
>> Our firmware developers including me have faced a issue that target
>> controller occasionally returns 0xc1 response to "Get Chassis Status"
>> command.
>> We found the issue happened when target controller could not respond within
>> 1 second and ipmitool retried sending packet. We checked debugging output of
>> ipmitool and understood the mechanism was below.
>>
>> - user issued command(netfn=0x00 command=0x01) by ipmitool using lanplus I/F
>> - ipmitool sent command(netfn=0x06 command=0x01) and added it to list with
>>   seq#2
>>   (target controller did not respond within 1 second)
>> - ipmitool sent command again and added it to list with seq#2
>> - target controller sent response for 1st command
>> - ipmitool received it and removed 1st entry(seq#2)
>> - ipmitool sent command(netfn=0x2c command=0x00) and added it to list with
>>   seq#3
>> - target controller sent response for 2nd command(retry)
>> - ipmitool received it and removed 2nd entry(seq#2)
>> - ipmitool sent user command(netfn=0x00 command=0x01) and added it to list
>>   with seq#4
>> - target controller sent response for 3rd command(seq#3) and it had 0xc1
>>   response
>> - ipmitool received it and removed 3rd entry(seq#3)
>> - ipmitool returned this 0xc1 response to user even though it was not for
>>   user command(seq#4).
>>
>> I guess ipmitool should remove 1st entry by itself before adding retried
>> command to list.
>> I also checked source code of version 1.8.13 and found "lanplus.c" does not
>> have a code for the purpose but "lan.c" has it.
>> My question is whether current implementation of "lanplus.c" is correct or
>> not.
>> I also know we can avoid this problem by using "-N" option and we will use
>> it for a while.
>>
>> Best Regards,
>> Kazuyuki Funada
>>
>>
>> ------------------------------
>>
>
>
> ------------------------------------------------------------------------------
> Learn Graph Databases - Download FREE O'Reilly Book
> "Graph Databases" is the definitive new guide to graph databases and their
> applications. Written by three acclaimed leaders in the field,
> this first edition is now available. Download your free book today!
> http://p.sf.net/sfu/NeoTech
> _______________________________________________
> Ipmitool-devel mailing list
> Ipmitool-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/ipmitool-devel
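
For readers following the thread, the retry problem discussed above (a timed-out request left on the pending list while its retry is added with the same sequence number) can be illustrated with a small stand-alone sketch. This is not ipmitool code: `pending_list`, `pending_add`, `pending_remove`, `pending_count` and `send_with_retry` are hypothetical names invented for the illustration, and the timeout is simulated rather than measured. The patch quoted in the thread performs the cleanup with `ipmi_req_remove_entry`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_PENDING 8

/* Hypothetical, simplified stand-in for one outstanding request. */
struct pending_entry {
    uint8_t rq_seq;
    uint8_t cmd;
    int in_use;
};

struct pending_list {
    struct pending_entry slots[MAX_PENDING];
};

static void pending_init(struct pending_list *l)
{
    memset(l, 0, sizeof(*l));
}

/* Record a sent request; returns NULL if the list is full. */
static struct pending_entry *pending_add(struct pending_list *l,
                                         uint8_t seq, uint8_t cmd)
{
    for (int i = 0; i < MAX_PENDING; i++) {
        if (!l->slots[i].in_use) {
            l->slots[i].rq_seq = seq;
            l->slots[i].cmd = cmd;
            l->slots[i].in_use = 1;
            return &l->slots[i];
        }
    }
    return NULL;
}

/* Remove the first entry matching (seq, cmd); returns 1 if one was removed. */
static int pending_remove(struct pending_list *l, uint8_t seq, uint8_t cmd)
{
    for (int i = 0; i < MAX_PENDING; i++) {
        struct pending_entry *e = &l->slots[i];
        if (e->in_use && e->rq_seq == seq && e->cmd == cmd) {
            e->in_use = 0;
            return 1;
        }
    }
    return 0;
}

static int pending_count(const struct pending_list *l)
{
    int n = 0;
    for (int i = 0; i < MAX_PENDING; i++)
        n += l->slots[i].in_use;
    return n;
}

/*
 * Simulate a request that times out on every attempt except the last.
 * Each attempt adds an entry with the same sequence number. When
 * cleanup_on_timeout is non-zero, the timed-out entry is removed before
 * the retry. Returns the number of entries left after the response to
 * the final attempt has been matched.
 */
static int send_with_retry(struct pending_list *l, uint8_t seq, uint8_t cmd,
                           int attempts, int cleanup_on_timeout)
{
    for (int attempt = 0; attempt < attempts; attempt++) {
        struct pending_entry *e = pending_add(l, seq, cmd);
        assert(e != NULL); /* sketch only: assumes attempts <= MAX_PENDING */

        if (attempt == attempts - 1) {
            /* Response arrived: matching removes one (seq, cmd) entry. */
            pending_remove(l, seq, cmd);
            break;
        }
        /* Simulated timeout. */
        if (cleanup_on_timeout)
            pending_remove(l, e->rq_seq, e->cmd);
    }
    return pending_count(l);
}
```

Calling `send_with_retry` on a freshly initialised list with two attempts and `cleanup_on_timeout` set to 0 leaves one stale seq#2 entry behind, which is the situation in which a later response can be paired with the wrong request. With `cleanup_on_timeout` set to 1 the list ends up empty.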