> From: Corey Minyard [mailto:tcminy...@gmail.com] On Behalf Of Corey Minyard
>
> On 08/17/2015 09:54 PM, 河合英宏 / KAWAI,HIDEHIRO wrote:
> >> From: Corey Minyard [mailto:tcminy...@gmail.com] On Behalf Of Corey Minyard
> >>
> >> This patch will break ATN handling on the interfaces.  So we can't do this.
> > I understand.  So how about doing like this:
> >
> >     /* All states wait for ibf, so just do it here. */
> > -   if (!check_ibf(kcs, status, time))
> > +   if (kcs->state != KCS_IDLE && !check_ibf(kcs, status, time))
> >         return SI_SM_CALL_WITH_DELAY;
> >
> > I think it is not necessary to wait IBF when the state is IDLE.
> > In this way, we can also handle the ATN case.
>
> I think it would be more reliable to go up a level and add a timeout.

It may be so, but we should address this issue separately (at least I
think the above solution reasonably solves the issue).

This issue happens after all queued messages have been processed or
dropped by timeout.  There is no current message.  So what should we
set a timeout against?

We can add a timeout to my new flush_messages(), but that is meaningful
only in panic context.  That doesn't help in normal context; we would
perform a busy loop of smi_event_handler() and schedule() in
ipmi_thread().

Regards,

Hidehiro Kawai

> One should
> be there, anyway.  I thought they were all covered, but I may have missed
> something.
>
> -corey
>
> >
> > Regards,
> >
> > Hidehiro Kawai
> > Hitachi, Ltd. Research & Development Group
> >
> >> It's going to be extremely hard to recover if the BMC is not working
> >> correctly when a panic happens.  I'm not sure what can be done, but if
> >> you can fix it another way it would be good.
> >>
> >> -corey
> >>
> >> On 07/27/2015 12:55 AM, Hidehiro Kawai wrote:
> >>> If a BMC is unresponsive for some reason, it ends up completing
> >>> the requested message as an error, then kcs_event() is called once
> >>> to advance the state machine.  However, since the BMC is
> >>> unresponsive now, the status of the KCS interface may not be
> >>> idle.  As the result, the state machine can continue to run and
> >>> consume CPU time indefinitely even if there is no more request
> >>> message.  Moreover, if this happens in run-to-completion mode
> >>> (i.e. context of panic_event()), the kernel hangs up.
> >>>
> >>> To fix this problem, this patch ignores kcs_event() call if there
> >>> is no request message to be processed.
> >>>
> >>> Signed-off-by: Hidehiro Kawai <hidehiro.kawai...@hitachi.com>
> >>> ---
> >>>  drivers/char/ipmi/ipmi_kcs_sm.c | 4 ++++
> >>>  1 file changed, 4 insertions(+)
> >>>
> >>> diff --git a/drivers/char/ipmi/ipmi_kcs_sm.c b/drivers/char/ipmi/ipmi_kcs_sm.c
> >>> index 8c25f59..0e187fb 100644
> >>> --- a/drivers/char/ipmi/ipmi_kcs_sm.c
> >>> +++ b/drivers/char/ipmi/ipmi_kcs_sm.c
> >>> @@ -353,6 +353,10 @@ static enum si_sm_result kcs_event(struct si_sm_data *kcs, long time)
> >>>      if (kcs_debug & KCS_DEBUG_STATES)
> >>>          printk(KERN_DEBUG "KCS: State = %d, %x\n", kcs->state, status);
> >>>
> >>> +    /* We don't want to run the state machine when the state is IDLE */
> >>> +    if (kcs->state == KCS_IDLE)
> >>> +        return SI_SM_IDLE;
> >>> +
> >>>      /* All states wait for ibf, so just do it here. */
> >>>      if (!check_ibf(kcs, status, time))
> >>>          return SI_SM_CALL_WITH_DELAY;
> >>>
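
For anyone following the thread, here is a rough toy model of the three
variants discussed above: the unpatched IBF check, the first patch (early
return when idle), and the follow-up proposal (skip the IBF wait only when
idle).  It is only a sketch and not the driver code.  Every toy_* / TOY_*
name is made up for illustration, the IBF wait is reduced to a single bit
test (the real check_ibf() also handles timeouts), and the idle branch is
assumed to do nothing but report ATN.

```c
#include <assert.h>

/* Toy model only: hypothetical names, not the ipmi_kcs_sm.c implementation. */
enum toy_result { TOY_CALL_WITH_DELAY, TOY_IDLE, TOY_ATTN, TOY_BUSY };
enum toy_state { TOY_KCS_IDLE, TOY_KCS_WRITE };
enum toy_variant { VARIANT_ORIGINAL, VARIANT_FIRST_PATCH, VARIANT_PROPOSED };

#define TOY_STATUS_IBF 0x02 /* input buffer full: BMC has not taken our byte */
#define TOY_STATUS_ATN 0x04 /* BMC is asking for attention */

static enum toy_result toy_kcs_event(enum toy_variant v, enum toy_state state,
                                     unsigned char status)
{
    /* First patch: bail out when idle, before ATN is ever examined. */
    if (v == VARIANT_FIRST_PATCH && state == TOY_KCS_IDLE)
        return TOY_IDLE;

    if (v == VARIANT_PROPOSED) {
        /* Proposal: only wait for IBF while a transfer is in progress. */
        if (state != TOY_KCS_IDLE && (status & TOY_STATUS_IBF))
            return TOY_CALL_WITH_DELAY;
    } else if (status & TOY_STATUS_IBF) {
        /* Unpatched behaviour: every state waits for IBF to clear. */
        return TOY_CALL_WITH_DELAY;
    }

    /* Assumed idle handling: report ATN if the BMC raised it. */
    if (state == TOY_KCS_IDLE)
        return (status & TOY_STATUS_ATN) ? TOY_ATTN : TOY_IDLE;

    return TOY_BUSY; /* stand-in for the rest of the state machine */
}

/* Walks through the two cases argued about in the thread. */
static void toy_self_check(void)
{
    /* Unresponsive BMC: idle, IBF stuck high, nothing queued. */
    assert(toy_kcs_event(VARIANT_ORIGINAL, TOY_KCS_IDLE, TOY_STATUS_IBF)
           == TOY_CALL_WITH_DELAY); /* caller keeps polling */
    assert(toy_kcs_event(VARIANT_PROPOSED, TOY_KCS_IDLE, TOY_STATUS_IBF)
           == TOY_IDLE);

    /* Healthy BMC raising ATN while we are idle. */
    assert(toy_kcs_event(VARIANT_FIRST_PATCH, TOY_KCS_IDLE, TOY_STATUS_ATN)
           == TOY_IDLE);            /* ATN is lost */
    assert(toy_kcs_event(VARIANT_PROPOSED, TOY_KCS_IDLE, TOY_STATUS_ATN)
           == TOY_ATTN);
}
```

Calling toy_self_check() from a small main() exercises both cases.  Under
these assumptions, only the third variant both stops polling on a stuck IBF
and still reports ATN, which is the trade-off the thread is about.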