Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running

2021-03-16 Thread Xie He
On Tue, Mar 16, 2021 at 8:17 AM Martin Schiller  wrote:
>
> On 2021-03-14 02:59, Xie He wrote:
> > Hi Martin,
> >
> > Could you ack? Thanks!
>
> Acked-by: Martin Schiller 

Thank you!!


Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running

2021-03-16 Thread Martin Schiller

On 2021-03-14 02:59, Xie He wrote:

Hi Martin,

Could you ack? Thanks!


Acked-by: Martin Schiller 


Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running

2021-03-13 Thread Xie He
Hi Martin,

Could you ack? Thanks!


Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running

2021-03-11 Thread Jakub Kicinski
On Thu, 11 Mar 2021 16:28:47 -0800 Xie He wrote:
> On Thu, Mar 11, 2021 at 4:10 PM Jakub Kicinski  wrote:
> >
> > And the "noqueue" queue is there because it's on top of hdlc_fr.c
> > somehow or some out of tree driver? Or do you install it manually?  
> 
> No, this driver is not related to "hdlc_fr.c" or any out-of-tree
> driver. The default qdisc is "noqueue" for this driver because this
> driver doesn't set "tx_queue_len". This means the value of
> "tx_queue_len" would be 0. In this case, "alloc_netdev_mqs" will
> automatically add the "IFF_NO_QUEUE" flag to the device, then
> "attach_one_default_qdisc" in "net/sched/sch_generic.c" will attach
> the "noqueue" qdisc for devices with the "IFF_NO_QUEUE" flag.

I see.


Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running

2021-03-11 Thread Xie He
On Thu, Mar 11, 2021 at 4:10 PM Jakub Kicinski  wrote:
>
> And the "noqueue" queue is there because it's on top of hdlc_fr.c
> somehow or some out of tree driver? Or do you install it manually?

No, this driver is not related to "hdlc_fr.c" or any out-of-tree
driver. The default qdisc is "noqueue" for this driver because this
driver doesn't set "tx_queue_len". This means the value of
"tx_queue_len" would be 0. In this case, "alloc_netdev_mqs" will
automatically add the "IFF_NO_QUEUE" flag to the device, then
"attach_one_default_qdisc" in "net/sched/sch_generic.c" will attach
the "noqueue" qdisc for devices with the "IFF_NO_QUEUE" flag.


Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running

2021-03-11 Thread Jakub Kicinski
On Thu, 11 Mar 2021 15:13:01 -0800 Xie He wrote:
> On Thu, Mar 11, 2021 at 2:52 PM Jakub Kicinski  wrote:
> >
> > Normally driver's ndo_stop() calls netif_tx_disable() which takes TX
> > locks, so unless your driver is lockless (LLTX) there should be no xmit
> > calls after that point.  
> 
> Do you mean I should call "netif_tx_disable" inside my "ndo_stop"
> function as a fix for the racing between "ndo_stop" and
> "ndo_start_xmit"?
> 
> I can't call "netif_tx_disable" inside my "ndo_stop" function because
> "netif_tx_disable" will call "netif_tx_stop_queue", which causes
> another racing problem. Please see my recent commit f7d9d4854519
> ("net: lapbether: Remove netif_start_queue / netif_stop_queue")

And the "noqueue" queue is there because it's on top of hdlc_fr.c
somehow or some out of tree driver? Or do you install it manually?


Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running

2021-03-11 Thread Xie He
On Thu, Mar 11, 2021 at 2:52 PM Jakub Kicinski  wrote:
>
> Normally driver's ndo_stop() calls netif_tx_disable() which takes TX
> locks, so unless your driver is lockless (LLTX) there should be no xmit
> calls after that point.

Do you mean I should call "netif_tx_disable" inside my "ndo_stop"
function as a fix for the racing between "ndo_stop" and
"ndo_start_xmit"?

I can't call "netif_tx_disable" inside my "ndo_stop" function because
"netif_tx_disable" will call "netif_tx_stop_queue", which causes
another racing problem. Please see my recent commit f7d9d4854519
("net: lapbether: Remove netif_start_queue / netif_stop_queue")


Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running

2021-03-11 Thread Jakub Kicinski
On Thu, 11 Mar 2021 13:12:25 -0800 Xie He wrote:
> On Thu, Mar 11, 2021 at 12:43 PM Jakub Kicinski  wrote:
> >
> > Is this a theoretical issues or do you see a path where it triggers?
> >
> > Who are the callers sending frames to a device which went down?  
> 
> This is a theoretical issue. I didn't see this issue in practice.
> 
> When "__dev_queue_xmit" and "sch_direct_xmit" call
> "dev_hard_start_xmit", there appears to be no locking mechanism
> preventing the netif from going down while "dev_hard_start_xmit" is
> doing its work.

Normally driver's ndo_stop() calls netif_tx_disable() which takes TX
locks, so unless your driver is lockless (LLTX) there should be no xmit
calls after that point.


Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running

2021-03-11 Thread Xie He
On Thu, Mar 11, 2021 at 12:43 PM Jakub Kicinski  wrote:
>
> Is this a theoretical issues or do you see a path where it triggers?
>
> Who are the callers sending frames to a device which went down?

This is a theoretical issue. I didn't see this issue in practice.

When "__dev_queue_xmit" and "sch_direct_xmit" call
"dev_hard_start_xmit", there appears to be no locking mechanism
preventing the netif from going down while "dev_hard_start_xmit" is
doing its work.

David once confirmed in an email that a driver's "ndo_stop" function
would indeed race with its "ndo_start_xmit" function:
https://lore.kernel.org/netdev/20190520.200922.2277656639346033061.da...@davemloft.net/


Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running

2021-03-11 Thread Jakub Kicinski
On Wed, 10 Mar 2021 23:23:09 -0800 Xie He wrote:
> There are two "netif_running" checks in this driver. One is in
> "lapbeth_xmit" and the other is in "lapbeth_rcv". They serve to make
> sure that the LAPB APIs called in these functions are called before
> "lapb_unregister" is called by the "ndo_stop" function.
> 
> However, these "netif_running" checks are unreliable, because it's
> possible that immediately after "netif_running" returns true, "ndo_stop"
> is called (which causes "lapb_unregister" to be called).
> 
> This patch adds locking to make sure "lapbeth_xmit" and "lapbeth_rcv" can
> reliably check and ensure the netif is running while doing their work.
> 
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Xie He 

Is this a theoretical issues or do you see a path where it triggers?

Who are the callers sending frames to a device which went down?