On 03.01.23 01:57, Peng Fan wrote:
>> To: Peng Fan <[email protected]>; [email protected]
>> Subject: Re: Question: ivshmem-net
>>
>> On 21.11.22 07:47, Peng Fan wrote:
>>> Hi,
>>>
>>> I have a question regarding ivshmem-net, not sure whether we need to
>> use local_bh_disable/enable to guard napi_schedule.
>>>
>>> See below patch:
>>>
>> https://patc
>>> hes.linaro.org%2Fproject%2Flinux-
>> usb%2Fpatch%2F877dk162mo.ffs%40nanos.
>>>
>> tec.linutronix.de%2F&data=05%7C01%7Cpeng.fan%40nxp.com%7C10674af
>> e061c4
>>>
>> fbdd5cd08daec94e54a%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0
>> %7C63808
>>>
>> 2421486787248%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiL
>> CJQIjoiV2lu
>>>
>> MzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=nxYB
>> 9%2BjoTRb
>>> hTF7GGaa32nmr7hU7zCk2fEOL5ZbbL34%3D&reserved=0
>>> "
>>> The driver invokes napi_schedule() in several places from task
>>> context. napi_schedule() raises the NET_RX softirq bit and relies on
>>> the calling context to ensure that the softirq is handled. That's
>>> usually on return from interrupt or on the outermost local_bh_enable().
>>>
>>> But that's not the case here which causes the soft interrupt handling
>>> to be delayed to the next interrupt or local_bh_enable(). If the task
>>> in which context this is invoked is the last runnable task on a CPU
>>> and the CPU goes idle before an interrupt arrives or a
>>> local_bh_disable/enable() pair handles the pending soft interrupt then
>>> the NOHZ idle code emits the following warning.
>>>
>>>   NOHZ tick-stop error: Non-RCU local softirq work is pending, handler
>> #08!!!
>>>
>>> Prevent this by wrapping the napi_schedule() invocation from task
>>> context into a local_bh_disable/enable() pair.
>>> "
>>>
>> https://lore
>>> .kernel.org%2Fall%2F87y28b9nyn.ffs%40tglx%2Ft%2F&data=05%7C01%7C
>> peng.f
>>>
>> an%40nxp.com%7C10674afe061c4fbdd5cd08daec94e54a%7C686ea1d3bc2b
>> 4c6fa92c
>>>
>> d99c5c301635%7C0%7C0%7C638082421486787248%7CUnknown%7CTWFp
>> bGZsb3d8eyJW
>>>
>> IjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7
>> C3000%
>>>
>> 7C%7C%7C&sdata=g%2FR48KwZCOVGt6momplzAv5VwmWpykE6c0e29pCpv
>> 9k%3D&reserv
>>> ed=0
>>>
>>> I draft one:
>>> diff --git a/drivers/net/ivshmem-net.c b/drivers/net/ivshmem-net.c
>>> index 3bcd39b91176e..81e19d80bd0a7 100644
>>> --- a/drivers/net/ivshmem-net.c
>>> +++ b/drivers/net/ivshmem-net.c
>>> @@ -558,7 +558,9 @@ static void ivshm_net_run(struct net_device *ndev)
>>>
>>>         netif_start_queue(ndev);
>>>         napi_enable(&in->napi);
>>> +       local_bh_disable();
>>>         napi_schedule(&in->napi);
>>> +       local_bh_enable();
>>>         ivshm_net_set_state(in, IVSHM_NET_STATE_RUN);  }
>>>
>>> There are other places calling napi_schedule, but seems no need
>> local_bh_disable/enable to protect.
>>>
>>> Not sure the upper change is valid or not, please help check.
>>>
>>
>> Looks suspicious, indeed. I wonder why there are not runtime checks in
>> napi_schedule to detect its needs and report a wrong bh state.
>>
>> Does this issue only trigger with NOHZ enabled?
> [Peng Fan] 
> 
> I not tested other configurations, but I think so from various
> lkml thread reading.
> 

Ok, makes sense. And Peter even wrote a detection patch as it seems.

Would you write a proper patch from your change, maybe also explaining
why the other spots do not need this? Then I could merge it into
queues/jailhouse where we track the driver history (it's usually folded
in stable jailhouse-enabling branches).

Thanks,
Jan

-- 
Siemens AG, Technology
Competence Center Embedded Linux

-- 
You received this message because you are subscribed to the Google Groups 
"Jailhouse" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jailhouse-dev/5786bf3d-f0f2-48d3-609e-a9fdc68de40f%40siemens.com.

Reply via email to