Hi,Corey Ok , We will test it.
But, please wait a little because this problem happens only on our specific products ,which we have a small numbers of. We have to reserve that first amog our sections internally. Regards, M.Gouji. > -----Original Message----- > From: Corey Minyard [mailto:[email protected]] On Behalf Of Corey Minyard > Sent: Sunday, September 06, 2015 8:03 AM > To: Gouji, Masayuki/郷治 将之; [email protected] > Subject: Re: [Openipmi-developer] [PATCH] ipmi:driver not to stop when the > hardware not respond in irq/revised > > On 09/05/2015 03:27 AM, Gouji, Masayuki wrote: > > Hi, Corey > > > > I think the reason why timer stops is that timer system in IPMI is based > on only the request from outside. > > > > This is typical case > > > > 1) IPMI driver is in idle state. No pending messages , SI_timer is already > stopped. > > 2) Some internal events happens and BMC raised interrupt. > > 3) Driver send IPMI_GET_MSG_FLAGS_CMD from smi_event_handler via > smi_info->handlers->start_transaction > > start_kcs_transaction(=handlers->start_transaction) never set > > timer > > Ah then that's the bug. The attached patch should start the timer for > internal messages as well as external messages. I can't believe I missed > that. > > If you could test this, it would be great. > > Thanks, > > -corey > > > > > From kcs , it doesn't matter who sent the message . > > For working around the hang up , the timer is needed for those situasion > either. > > Ipmi_timeout may be invoked in many cases.But it won't start si_timeout > because of the reason described below. > > > > > > Some simular messages invoked internally , like > IPMI_READ_EVENT_MSG_BUFFER_CMD/ IPMI_GET_MSG_CMD and so on , may fall in > the same situation. > > All are sent directly , using smi_info->handlers->start_transaction > > > > Watching where check_start_timer_thread is called from ,these are > sender() and set_need_watch(). > > > > > > First, in check_start_timer_thread() , It has > > > > if (smi_info->si_state == SI_NORMAL && smi_info->curr_msg == NULL) > > > > We have situation where both conditions are not met . > > When sending ie. IPMI_GET_MSG_FLAGS_CMD , smi_info->si_state > ==SI_GETTING_FLAGS and smi_info->curr_msg is occupied by > IPMI_GET_MSG_FLAGS_CMD. > > > > Simply , the condition for timer restart is complies for the messages > sent by APPL or other handlers. > > > > So, I think this code should be like this. > > > > > > if ((smi_info->si_state == SI_NORMAL && smi_info->curr_msg == NULL ) > > || smi_info->si_state == SI_GETTING_FLAGS > > || smi_info->si_state == SI_GETTING_MESSAGES > > ..... ) > > > > This is just showing the rough sketch for the fix. > > > > Second , > > > > Ipmi_timeout stops under some condition. > > > > - there is no event_waiter , which may happen eveb if rare. > > - there is no pending messages after sending > > - > > In this condition, ipmi_timeout may be restarted by need_waiter(). > > > > This means ipmi_timeout can remain stopping while the internall messages > are being sent. > > > > This may require some fix too. > > > > > > > > If you think the way to let check_start_timer_thread() restart timer > > correctly is better, I'd think again according to this idea. > > > > By the way , my new patch still includes a BUG. > > > > I wanted to detect the firast attempt just after status switched to > > KCS_WAIT_WRITE_START and used kcs->write_count == 1 But this for the > > last attempt. I have to find another way if I adhere to it. > > Anyway this is wrong yet. > > > > > > > > Regards, > > M.Gouji > > > > > >> -----Original Message----- > >> From: Corey Minyard [mailto:[email protected]] On Behalf Of Corey > >> Minyard > >> Sent: Saturday, September 05, 2015 4:04 AM > >> To: Gouji, Masayuki/郷治 将之; > [email protected] > >> Subject: Re: [Openipmi-developer] [PATCH] ipmi:driver not to stop > >> when the hardware not respond in irq/revised > >> > >> On 09/04/2015 04:56 AM, Gouji, Masayuki wrote: > >>> Hi, Corey > >>> > >>> This is new one which includes the missed braces and function for > >>> modifying > >> waiting period (default 0). > >>> I think it complies to your comment. > >> Well, not exactly. Thinking through this, if you are going to delay > >> and spin, you might as well spin completely in the KCS code, spinning > >> on the IBF value. That would be a lot simpler and it would return as > >> soon as IBF was set. However, spinning is just not a good idea, > >> especially in hardirq or softirq mode (where some of this would be > happening). > >> > >> I've spent some time thinking about this and looking at the code, and > >> I realized that the timer should be running whenever a message is being > sent. > >> Which is why no one has seen this before, and it means we are dealing > >> with a different issue. If you look at the sender() function in > >> ipmi_si_intf.c, it calls check_start_timer_thread(), which will start > >> the timer and start the message send. The timer code checks at the > >> end if it is idle, and if it is idle it doesn't restart the timer. > >> > >> So somehow the timer is not getting started or is getting canceled. > >> That's the problem we need to fix. > >> > >> Looking at the code: > >> > >> static void check_start_timer_thread(struct smi_info *smi_info) { > >> if (smi_info->si_state == SI_NORMAL && smi_info->curr_msg == NULL) > { > >> smi_mod_timer(smi_info, jiffies + SI_TIMEOUT_JIFFIES); > >> > >> if (smi_info->thread) > >> wake_up_process(smi_info->thread); > >> > >> start_next_msg(smi_info); > >> smi_event_handler(smi_info, 0); > >> } > >> } > >> > >> perhaps start_next_msg() should come before smi_mod_timer(). Maybe > >> there's a race where the timer goes off before start_next_msg()? > >> Well, that shouldn't be possible because this code is called under > >> the lock and the first the time timeout does is claim the lock. But > >> something is going on here. > >> > >> -corey > >> ------------------------------------------------------------------------------ _______________________________________________ Openipmi-developer mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/openipmi-developer
