> > Thanks for the bug report and debugging info. I think I know what is
> > going on, I've attached a patch that should hopefully fix it.
> > Basically, it looks like the BMC is alive enough that it sort of
> > responds to the host, but not alive enough to actually complete a
> > transaction. The driver needs to not immediately retry in that case, it
> > needs to delay a bit.
> >
> > It passes all my tests, but the situation you are in would be hard to
> > manufacture for me.
> >
> > Can you try this patch?
>
> Thanks for the super quick response, I'll try out this patch and report
> back my findings.
>
> Best regards
> Mark
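
For illustration only, the "delay a bit instead of retrying immediately" idea quoted above can be sketched in user-space C. This is not the attached patch and not the kernel driver code: the names `txn_result`, `txn_fn`, `run_with_delayed_retry` and `fake_bmc` are all hypothetical, and the simulated BMC simply reports "not ready" a fixed number of times before it completes a transaction.

```c
#define _POSIX_C_SOURCE 199309L
#include <time.h>

/* Hypothetical result codes; these are not the real driver states. */
enum txn_result { TXN_OK, TXN_NOT_READY };

/* Hypothetical callback that attempts one transaction with the BMC. */
typedef enum txn_result (*txn_fn)(void *ctx);

/* Sleep for the given number of milliseconds. */
static void sleep_ms(long ms)
{
    struct timespec ts = {
        .tv_sec = ms / 1000,
        .tv_nsec = (ms % 1000) * 1000000L,
    };
    nanosleep(&ts, NULL);
}

/*
 * Attempt a transaction up to max_attempts times, pausing delay_ms
 * between failed attempts instead of retrying immediately.
 * Returns the number of attempts used on success, or -1 if every
 * attempt failed.
 */
static int run_with_delayed_retry(txn_fn attempt, void *ctx,
                                  int max_attempts, long delay_ms)
{
    for (int i = 1; i <= max_attempts; i++) {
        if (attempt(ctx) == TXN_OK)
            return i;
        if (i < max_attempts)
            sleep_ms(delay_ms);   /* give the BMC time to recover */
    }
    return -1;
}

/* Simulated BMC that answers but cannot complete a transaction yet. */
struct fake_bmc {
    int not_ready_left;   /* attempts that will still fail */
};

static enum txn_result fake_bmc_attempt(void *ctx)
{
    struct fake_bmc *bmc = ctx;

    if (bmc->not_ready_left > 0) {
        bmc->not_ready_left--;
        return TXN_NOT_READY;
    }
    return TXN_OK;
}
```

With a `struct fake_bmc` whose `not_ready_left` is 3, calling `run_with_delayed_retry(fake_bmc_attempt, &bmc, 5, 1)` succeeds on the fourth attempt. The actual delay and retry limits used by the driver are not stated in this thread, so the values here are placeholders.
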

The patch looks good. Without the patch, I was able to reproduce the
problem on kernels 6.6 and 6.12 (but not 6.1) after 5-20 attempts of
running 'ipmitool mc reset cold' every 2 minutes. With the patch, I have
run it 50 times without incident.

The hosed counter isn't as much of an indicator as I thought. I saw it in
the tens of thousands both with and without the patch. I have also seen it
in the hundreds of thousands without the patch, and on other hardware I
have seen it reach 5 million in one hour without the patch (but also
without incident).

We will incorporate your patch into our builds so that we avoid hitting
this problem in production again.

Best regards
Mark

_______________________________________________
Openipmi-developer mailing list
Openipmi-developer@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openipmi-developer