On 12/03/2013 10:40 PM, Corey Minyard wrote:

> On 12/03/2013 12:18 AM, srinivas_g_go...@dell.com wrote:
>> Dell - Internal Use - Confidential 
>>
>> Hi Corey,
>>> Unfortunately, that would start the timer unnecessarily.  You don't want to 
>>> start timers unnecessarily in the kernel or the power management police 
>>> will come after you.
>>    I still see the issue after applying the patch. 
>>    I have one question, for commands such as IPMI_READ_EVENT_MSG_BUFFER_CMD 
>> in smi_event_handler() that are invoked by the driver itself where do you 
>> expect the timer to be set ??
> 
> The timer should be done by the calling function.  Since the timer is
> actually used to time out complete operations, you don't want to reset
> it from interrupts or the kthread.  The patch I added would start the
> timer if it wasn't running and the event returned something non-idle.
> 
> So what should happen is smi_event_handler() starts the event message
> buffer read, goes back to the top of smi_event_handler() and handles the
> new message.  It should return something besides idle in that case.  And
> thus the timer should be started.
> 
>>> The patch I sent did have this call in the non-idle portion of the kernel 
>>> thread and that should have done the same thing.  Plus, if you are using 
>>> the kernel thread, it's going to run periodically and should kick things 
>>> off again if they get stuck.  I'm suspicious now that something else is 
>>> going on.
>>   ipmi_thread() is getting invoked when the issue is seen, but I keep 
>> hitting this condition
>>   else if (smi_result == SI_SM_CALL_WITH_DELAY && busy_wait)
>>   Since the timer is not armed for IPMI_READ_EVENT_MSG_BUFFER_CMD, 
>> smi_event_handler calls with time value 0, which doesn't help when OBF is 
>> stuck.
>>   We keep running around this loop since we never hit the kcs error states.
> 
> Maybe the timer start function needs to be called from that else clause,
> but it didn't seem so to me.  It wouldn't hurt, I suppose.  But the
> event handler should eventually return something non-idle and the timer
> start function should get called.  Is the ipmi thread just stuck using
> 100% CPU?  

Yes

> Does the final else clause ever get hit?  Maybe the bug is in
> ipmi_thread_busy_wait().
> 

No, I did not see final else in ipmi_thread ever getting hit.

I am looking at  ipmi_thread_busy_wait(),
 by default the below condition never gets set unitl I explicetly set 
"kipmid_max_busy_us" to some value.
if (smi_info->intf_num < num_max_busy_us)

This means by default "max_busy_us" is always 0. Hence Ill basically end up 
only doing this
if (max_busy_us == 0 || smi_result != SI_SM_CALL_WITH_DELAY)
                ipmi_si_set_not_busy(busy_until);

That probably explains why I never hit the else condition in ipmi_thread.

I see ipmi_start_timer_if_necessary() getting called from ipmi_thread() after 
setting "kipmid_max_busy_us"
Ill set "kipmid_max_busy_us=300" and run the stress. I am hoping that we will 
not see the issue in this case.


Thanks,
G



> Thanks,
> 
> -corey
> 
>>
>>
>> Thanks,
>> G
>>
>> -----Original Message-----
>> From: Corey Minyard [mailto:tcminy...@gmail.com] On Behalf Of Corey Minyard
>> Sent: Tuesday, December 03, 2013 2:34 AM
>> To: Gowda, Srinivas G
>> Cc: tcminy...@gmail.com; linux-kernel@vger.kernel.org; openi...@mvista.com
>> Subject: Re: [PATCH 1/1] ipmi: setting mod_timer for read_event_msg buffer 
>> cmd On 12/02/2013 08:49 AM, srinivas_g_go...@dell.com wrote: 
>>> Thanks for the patch Corey. 
>>> I am afraid that the system does not have interrupts enabled. It uses 
>>> polling mode. 
>>>
>>> When the error is seen, I know for a fact that in function
>>> ipmi_thread() smi_result is SI_SM_CALL_WITH_DELAY, I have some logs where 
>>> in busy_wait always reads as 1. Not sure if it was ever set to 0. (ill 
>>> check this again).
>>> Ill anyway run the test using the patch that you have shared. 
>>>
>>> b/w would it harm if we were to do to something like this ?
>> Unfortunately, that would start the timer unnecessarily.  You don't want to 
>> start timers unnecessarily in the kernel or the power management police will 
>> come after you.
>> The patch I sent did have this call in the non-idle portion of the kernel 
>> thread and that should have done the same thing.  Plus, if you are using the 
>> kernel thread, it's going to run periodically and should kick things off 
>> again if they get stuck.  I'm suspicious now that something else is going on.
>> -corey 
>>> Signed-off-by: Srinivas Gowda <srinivas_g_go...@dell.com>
>>> ---
>>>   drivers/char/ipmi/ipmi_si_intf.c | 1 +
>>>   1 file changed, 1 insertion(+)
>>>
>>> diff --git a/drivers/char/ipmi/ipmi_si_intf.c
>>> b/drivers/char/ipmi/ipmi_si_intf.c
>>> index 15e4a60..e23484f 100644
>>> --- a/drivers/char/ipmi/ipmi_si_intf.c
>>> +++ b/drivers/char/ipmi/ipmi_si_intf.c
>>> @@ -1008,6 +1008,7 @@ static int ipmi_thread(void *data)
>>>                spin_unlock_irqrestore(&(smi_info->si_lock), flags);
>>>                busy_wait = ipmi_thread_busy_wait(smi_result, smi_info,
>>>                                                  &busy_until);
>>> +             ipmi_start_timer_if_necessary(smi_info);
>>>                if (smi_result == SI_SM_CALL_WITHOUT_DELAY)
>>>                        ; /* do nothing */
>>>                else if (smi_result == SI_SM_CALL_WITH_DELAY && 
>>> busy_wait)
> 



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to