On Thu, Aug 14, 2025 at 06:23:23PM +0100, Mark Bannister wrote:
> > > Thanks for the bug report and debugging info.  I think I know what is
> > > going on, I've attached a patch that should hopefully fix it.
> > > Basically, it looks like the BMC is alive enough that it sort of
> > > responds to the host, but not alive enough to actually complete a
> > > transaction.  The driver needs to not immediately retry in that case, it
> > > needs to delay a bit.
> > >
> > > It passes all my tests, but the situation you are in would be hard to
> > > manufacture for me.
> > >
> > > Can you try this patch?
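
The "delay a bit instead of retrying immediately" idea described above can be sketched roughly as follows. This is not the attached patch and not the driver's code; it is a user-space illustration, and `transact_with_delay`, `try_once`, `retry_delay_s` and `max_attempts` are hypothetical names chosen for the example.

```python
import time


def transact_with_delay(try_once, retry_delay_s=0.01, max_attempts=10,
                        sleep=time.sleep):
    """Retry a transaction, pausing between attempts.

    try_once is a hypothetical callable that returns True when the
    transaction completed and False when the peer responded but did not
    finish it. Returns the attempt number that succeeded, or None if
    every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        if try_once():
            return attempt
        # Do not hammer a half-alive peer: wait before the next attempt.
        if attempt < max_attempts:
            sleep(retry_delay_s)
    return None
```

The delay value and attempt limit here are arbitrary placeholders; what the real driver uses is not stated in this thread.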
> >
> > Thanks for the super quick response, I'll try out this patch and report
> > back my findings.
> >
> > Best regards
> > Mark
> 
> The patch looks good.  Without the patch I was able to reproduce the
> problem on kernels 6.6 and 6.12 (but not 6.1) after 5-20 attempts of
> running 'ipmitool mc reset cold' every 2 minutes.  With the patch, I have
> run it 50 times without incident.
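
For anyone wanting to repeat the reproduction described above, a minimal sketch of the loop might look like this. Only the `ipmitool mc reset cold` command, the 2-minute interval and the count of 50 come from the report; the function name, its parameters and the use of the exit status are assumptions made for the example. How the problem actually shows up is not described in this message, so the sketch only counts runs where `ipmitool` exits non-zero.

```python
import subprocess
import time


def run_reset_cycles(attempts=50, interval_s=120,
                     run=subprocess.run, sleep=time.sleep):
    """Issue 'ipmitool mc reset cold' repeatedly.

    Returns the number of runs whose exit status was non-zero. The run
    and sleep parameters exist so the loop can be exercised without real
    hardware; they are not part of any ipmitool interface.
    """
    failures = 0
    for i in range(attempts):
        result = run(["ipmitool", "mc", "reset", "cold"])
        if result.returncode != 0:
            failures += 1
        # Wait between resets, but not after the final one.
        if i < attempts - 1:
            sleep(interval_s)
    return failures
```

Calling `run_reset_cycles()` with the defaults takes well over an hour and cold-resets the BMC each time, so it is only suitable for a test machine.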

Perfect, I'll queue it for the next kernel release.  I can get it into
the current release if it's urgent.

The change that caused this was c608966f3f9c "ipmi: fix msg stack when
IPMI is disconnected" and it came in between 6.1 and 6.6.  I'm adding
the author of that patch because this change may affect that.

In hindsight I think the fix that caused this is wrong.  I don't see
how the situation its author described could occur, since there is a
limit of 100 messages per user.  I am inclined right now to revert
that change.

> The hosed counter isn't as much of an
> indicator as I thought, I saw it in the tens of thousands with and without
> the patch, I have also seen it in the hundreds of thousands without the
> patch and on other hardware I have seen it reach 5 million in one hour
> without the patch (but also without incident).

Yeah, that's just a count of how many issues it has with the BMC.  You
will still see it go up.

-corey

> 
> We will incorporate your patch into our builds so that we avoid hitting
> this problem in production again.
> 
> Best regards
> Mark


_______________________________________________
Openipmi-developer mailing list
Openipmi-developer@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openipmi-developer
