Re: [PATCH net v5] net: lapb: Add locking to the lapb module

2021-01-24 Thread Martin Schiller

On 2021-01-24 05:45, Jakub Kicinski wrote:

On Fri, 22 Jan 2021 10:07:05 +0100 Martin Schiller wrote:

On 2021-01-21 01:21, Xie He wrote:
> In the lapb module, the timers may run concurrently with other code in
> this module, and there is currently no locking to prevent the code from
> racing on "struct lapb_cb". This patch adds locking to prevent racing.
>
> 1. Add "spinlock_t lock" to "struct lapb_cb"; Add "spin_lock_bh" and
> "spin_unlock_bh" to APIs, timer functions and notifier functions.
>
> 2. Add "bool t1timer_stop, t2timer_stop" to "struct lapb_cb" to make us
> able to ask running timers to abort; Modify "lapb_stop_t1timer" and
> "lapb_stop_t2timer" to make them able to abort running timers;
> Modify "lapb_t2timer_expiry" and "lapb_t1timer_expiry" to make them
> abort after they are stopped by "lapb_stop_t1timer",
> "lapb_stop_t2timer",
> and "lapb_start_t1timer", "lapb_start_t2timer".
>
> 3. Let lapb_unregister wait for other API functions and running timers
> to stop.
>
> 4. The lapb_device_event function calls lapb_disconnect_request. In
> order to avoid trying to hold the lock twice, add a new function named
> "__lapb_disconnect_request" which assumes the lock is held, and make
> it called by lapb_disconnect_request and lapb_device_event.
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Cc: Martin Schiller 
> Signed-off-by: Xie He 

I don't have the opportunity to test this at the moment, but code 
looks

reasonable so far. Have you tested this at runtime?


Are you okay with this being merged or would you like to review
further/test?

Nothing jumps out to me either (other than a few nit picks).


Adding a small delay in the while loop is a good idea.
Otherwise: Yes, I agree with merging this.




> Change from v4:
> Make lapb_unregister wait for other refs to "lapb" to drop, to ensure
> that other LAPB API calls have all finished.
>
> Change from v3:
> In lapb_unregister make sure the self-restarting t1timer has really
> been
> stopped.
>
> Change from v2:
> Create a new __lapb_disconnect_request function to reduce redundant
> code.
>
> Change from v1:
> Broke long lines to keep the line lengths within 80 characters.



> @@ -178,11 +182,23 @@ int lapb_unregister(struct net_device *dev)
>goto out;
>lapb_put(lapb);
>
> +  /* Wait for other refs to "lapb" to drop */
> +  while (refcount_read(>refcnt) > 2)
> +  ;


Tight loop like this is a little scary, perhaps add a small
usleep_range() here?


> +
> +  spin_lock_bh(>lock);
> +
>lapb_stop_t1timer(lapb);
>lapb_stop_t2timer(lapb);
>
>lapb_clear_queues(lapb);
>
> +  spin_unlock_bh(>lock);
> +
> +  /* Wait for running timers to stop */
> +  del_timer_sync(>t1timer);
> +  del_timer_sync(>t2timer);
> +
>__lapb_remove_cb(lapb);
>
>lapb_put(lapb);



> -int lapb_disconnect_request(struct net_device *dev)
> +static int __lapb_disconnect_request(struct lapb_cb *lapb)
>  {
> -  struct lapb_cb *lapb = lapb_devtostruct(dev);
> -  int rc = LAPB_BADTOKEN;
> -
> -  if (!lapb)
> -  goto out;
> -
>switch (lapb->state) {
>case LAPB_STATE_0:
> -  rc = LAPB_NOTCONNECTED;
> -  goto out_put;
> +  return LAPB_NOTCONNECTED;
>
>case LAPB_STATE_1:
>lapb_dbg(1, "(%p) S1 TX DISC(1)\n", lapb->dev);
> @@ -310,12 +328,10 @@ int lapb_disconnect_request(struct net_device
> *dev)
>lapb_send_control(lapb, LAPB_DISC, LAPB_POLLON, LAPB_COMMAND);
>lapb->state = LAPB_STATE_0;
>lapb_start_t1timer(lapb);
> -  rc = LAPB_NOTCONNECTED;
> -  goto out_put;
> +  return LAPB_NOTCONNECTED;
>
>case LAPB_STATE_2:
> -  rc = LAPB_OK;
> -  goto out_put;
> +  return LAPB_OK;
>}
>
>lapb_clear_queues(lapb);
> @@ -328,8 +344,22 @@ int lapb_disconnect_request(struct net_device
> *dev)
>lapb_dbg(1, "(%p) S3 DISC(1)\n", lapb->dev);
>lapb_dbg(0, "(%p) S3 -> S2\n", lapb->dev);
>
> -  rc = LAPB_OK;
> -out_put:
> +  return LAPB_OK;
> +}


Since this is a fix for net, I'd advise against converting the goto
into direct returns (as much as I generally like such conversion).


Re: [PATCH net v5] net: lapb: Add locking to the lapb module

2021-01-24 Thread Xie He
On Sat, Jan 23, 2021 at 8:45 PM Jakub Kicinski  wrote:
>
> > > @@ -178,11 +182,23 @@ int lapb_unregister(struct net_device *dev)
> > > goto out;
> > > lapb_put(lapb);
> > >
> > > +   /* Wait for other refs to "lapb" to drop */
> > > +   while (refcount_read(>refcnt) > 2)
> > > +   ;
>
> Tight loop like this is a little scary, perhaps add a small
> usleep_range() here?

OK, sure. I'll add a usleep_range(1, 10) here.

> > > -int lapb_disconnect_request(struct net_device *dev)
> > > +static int __lapb_disconnect_request(struct lapb_cb *lapb)
> > >  {
> > > -   struct lapb_cb *lapb = lapb_devtostruct(dev);
> > > -   int rc = LAPB_BADTOKEN;
> > > -
> > > -   if (!lapb)
> > > -   goto out;
> > > -
> > > switch (lapb->state) {
> > > case LAPB_STATE_0:
> > > -   rc = LAPB_NOTCONNECTED;
> > > -   goto out_put;
> > > +   return LAPB_NOTCONNECTED;
> > >
> > > case LAPB_STATE_1:
> > > lapb_dbg(1, "(%p) S1 TX DISC(1)\n", lapb->dev);
> > > @@ -310,12 +328,10 @@ int lapb_disconnect_request(struct net_device
> > > *dev)
> > > lapb_send_control(lapb, LAPB_DISC, LAPB_POLLON, LAPB_COMMAND);
> > > lapb->state = LAPB_STATE_0;
> > > lapb_start_t1timer(lapb);
> > > -   rc = LAPB_NOTCONNECTED;
> > > -   goto out_put;
> > > +   return LAPB_NOTCONNECTED;
> > >
> > > case LAPB_STATE_2:
> > > -   rc = LAPB_OK;
> > > -   goto out_put;
> > > +   return LAPB_OK;
> > > }
> > >
> > > lapb_clear_queues(lapb);
> > > @@ -328,8 +344,22 @@ int lapb_disconnect_request(struct net_device
> > > *dev)
> > > lapb_dbg(1, "(%p) S3 DISC(1)\n", lapb->dev);
> > > lapb_dbg(0, "(%p) S3 -> S2\n", lapb->dev);
> > >
> > > -   rc = LAPB_OK;
> > > -out_put:
> > > +   return LAPB_OK;
> > > +}
>
> Since this is a fix for net, I'd advise against converting the goto
> into direct returns (as much as I generally like such conversion).

This part is actually splitting "lapb_disconnect_request" into two
functions - a "__lapb_disconnect_request" without locking, and a
"lapb_disconnect_request" which provides the locking and calls
"__lapb_disconnect_request". The splitting is necessary for
"lapb_device_event" to directly call "__lapb_disconnect_request" with
the lock already held. After the splitting, the "out_put" tag would
actually be in the caller function so there's nowhere we can "goto".


Re: [PATCH net v5] net: lapb: Add locking to the lapb module

2021-01-23 Thread Jakub Kicinski
On Fri, 22 Jan 2021 10:07:05 +0100 Martin Schiller wrote:
> On 2021-01-21 01:21, Xie He wrote:
> > In the lapb module, the timers may run concurrently with other code in
> > this module, and there is currently no locking to prevent the code from
> > racing on "struct lapb_cb". This patch adds locking to prevent racing.
> > 
> > 1. Add "spinlock_t lock" to "struct lapb_cb"; Add "spin_lock_bh" and
> > "spin_unlock_bh" to APIs, timer functions and notifier functions.
> > 
> > 2. Add "bool t1timer_stop, t2timer_stop" to "struct lapb_cb" to make us
> > able to ask running timers to abort; Modify "lapb_stop_t1timer" and
> > "lapb_stop_t2timer" to make them able to abort running timers;
> > Modify "lapb_t2timer_expiry" and "lapb_t1timer_expiry" to make them
> > abort after they are stopped by "lapb_stop_t1timer", 
> > "lapb_stop_t2timer",
> > and "lapb_start_t1timer", "lapb_start_t2timer".
> > 
> > 3. Let lapb_unregister wait for other API functions and running timers
> > to stop.
> > 
> > 4. The lapb_device_event function calls lapb_disconnect_request. In
> > order to avoid trying to hold the lock twice, add a new function named
> > "__lapb_disconnect_request" which assumes the lock is held, and make
> > it called by lapb_disconnect_request and lapb_device_event.
> > 
> > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > Cc: Martin Schiller 
> > Signed-off-by: Xie He   
> 
> I don't have the opportunity to test this at the moment, but code looks
> reasonable so far. Have you tested this at runtime?

Are you okay with this being merged or would you like to review
further/test?

Nothing jumps out to me either (other than a few nit picks).

> > Change from v4:
> > Make lapb_unregister wait for other refs to "lapb" to drop, to ensure
> > that other LAPB API calls have all finished.
> > 
> > Change from v3:
> > In lapb_unregister make sure the self-restarting t1timer has really 
> > been
> > stopped.
> > 
> > Change from v2:
> > Create a new __lapb_disconnect_request function to reduce redundant 
> > code.
> > 
> > Change from v1:
> > Broke long lines to keep the line lengths within 80 characters.

> > @@ -178,11 +182,23 @@ int lapb_unregister(struct net_device *dev)
> > goto out;
> > lapb_put(lapb);
> > 
> > +   /* Wait for other refs to "lapb" to drop */
> > +   while (refcount_read(>refcnt) > 2)
> > +   ;

Tight loop like this is a little scary, perhaps add a small
usleep_range() here?

> > +
> > +   spin_lock_bh(>lock);
> > +
> > lapb_stop_t1timer(lapb);
> > lapb_stop_t2timer(lapb);
> > 
> > lapb_clear_queues(lapb);
> > 
> > +   spin_unlock_bh(>lock);
> > +
> > +   /* Wait for running timers to stop */
> > +   del_timer_sync(>t1timer);
> > +   del_timer_sync(>t2timer);
> > +
> > __lapb_remove_cb(lapb);
> > 
> > lapb_put(lapb);

> > -int lapb_disconnect_request(struct net_device *dev)
> > +static int __lapb_disconnect_request(struct lapb_cb *lapb)
> >  {
> > -   struct lapb_cb *lapb = lapb_devtostruct(dev);
> > -   int rc = LAPB_BADTOKEN;
> > -
> > -   if (!lapb)
> > -   goto out;
> > -
> > switch (lapb->state) {
> > case LAPB_STATE_0:
> > -   rc = LAPB_NOTCONNECTED;
> > -   goto out_put;
> > +   return LAPB_NOTCONNECTED;
> > 
> > case LAPB_STATE_1:
> > lapb_dbg(1, "(%p) S1 TX DISC(1)\n", lapb->dev);
> > @@ -310,12 +328,10 @@ int lapb_disconnect_request(struct net_device 
> > *dev)
> > lapb_send_control(lapb, LAPB_DISC, LAPB_POLLON, LAPB_COMMAND);
> > lapb->state = LAPB_STATE_0;
> > lapb_start_t1timer(lapb);
> > -   rc = LAPB_NOTCONNECTED;
> > -   goto out_put;
> > +   return LAPB_NOTCONNECTED;
> > 
> > case LAPB_STATE_2:
> > -   rc = LAPB_OK;
> > -   goto out_put;
> > +   return LAPB_OK;
> > }
> > 
> > lapb_clear_queues(lapb);
> > @@ -328,8 +344,22 @@ int lapb_disconnect_request(struct net_device 
> > *dev)
> > lapb_dbg(1, "(%p) S3 DISC(1)\n", lapb->dev);
> > lapb_dbg(0, "(%p) S3 -> S2\n", lapb->dev);
> > 
> > -   rc = LAPB_OK;
> > -out_put:
> > +   return LAPB_OK;
> > +}

Since this is a fix for net, I'd advise against converting the goto
into direct returns (as much as I generally like such conversion).




Re: [PATCH net v5] net: lapb: Add locking to the lapb module

2021-01-22 Thread Xie He
On Fri, Jan 22, 2021 at 1:07 AM Martin Schiller  wrote:
>
> I don't have the opportunity to test this at the moment, but code looks
> reasonable so far. Have you tested this at runtime?

Thanks! Yes, I have tested this using hdlc_x25.c, lapbether.c and (the
deleted) x25_asy.c drivers.


Re: [PATCH net v5] net: lapb: Add locking to the lapb module

2021-01-22 Thread Martin Schiller

On 2021-01-21 01:21, Xie He wrote:

In the lapb module, the timers may run concurrently with other code in
this module, and there is currently no locking to prevent the code from
racing on "struct lapb_cb". This patch adds locking to prevent racing.

1. Add "spinlock_t lock" to "struct lapb_cb"; Add "spin_lock_bh" and
"spin_unlock_bh" to APIs, timer functions and notifier functions.

2. Add "bool t1timer_stop, t2timer_stop" to "struct lapb_cb" to make us
able to ask running timers to abort; Modify "lapb_stop_t1timer" and
"lapb_stop_t2timer" to make them able to abort running timers;
Modify "lapb_t2timer_expiry" and "lapb_t1timer_expiry" to make them
abort after they are stopped by "lapb_stop_t1timer", 
"lapb_stop_t2timer",

and "lapb_start_t1timer", "lapb_start_t2timer".

3. Let lapb_unregister wait for other API functions and running timers
to stop.

4. The lapb_device_event function calls lapb_disconnect_request. In
order to avoid trying to hold the lock twice, add a new function named
"__lapb_disconnect_request" which assumes the lock is held, and make
it called by lapb_disconnect_request and lapb_device_event.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: Martin Schiller 
Signed-off-by: Xie He 



I don't have the opportunity to test this at the moment, but code looks
reasonable so far. Have you tested this at runtime?


---

Change from v4:
Make lapb_unregister wait for other refs to "lapb" to drop, to ensure
that other LAPB API calls have all finished.

Change from v3:
In lapb_unregister make sure the self-restarting t1timer has really 
been

stopped.

Change from v2:
Create a new __lapb_disconnect_request function to reduce redundant 
code.


Change from v1:
Broke long lines to keep the line lengths within 80 characters.

---
 include/net/lapb.h|  2 ++
 net/lapb/lapb_iface.c | 70 +--
 net/lapb/lapb_timer.c | 30 ---
 3 files changed, 82 insertions(+), 20 deletions(-)

diff --git a/include/net/lapb.h b/include/net/lapb.h
index ccc3d1f020b0..eee73442a1ba 100644
--- a/include/net/lapb.h
+++ b/include/net/lapb.h
@@ -92,6 +92,7 @@ struct lapb_cb {
unsigned short  n2, n2count;
unsigned short  t1, t2;
struct timer_list   t1timer, t2timer;
+   boolt1timer_stop, t2timer_stop;

/* Internal control information */
struct sk_buff_head write_queue;
@@ -103,6 +104,7 @@ struct lapb_cb {
struct lapb_frame   frmr_data;
unsigned char   frmr_type;

+   spinlock_t  lock;
refcount_t  refcnt;
 };

diff --git a/net/lapb/lapb_iface.c b/net/lapb/lapb_iface.c
index 40961889e9c0..b028dfc438ce 100644
--- a/net/lapb/lapb_iface.c
+++ b/net/lapb/lapb_iface.c
@@ -122,6 +122,8 @@ static struct lapb_cb *lapb_create_cb(void)

timer_setup(>t1timer, NULL, 0);
timer_setup(>t2timer, NULL, 0);
+   lapb->t1timer_stop = true;
+   lapb->t2timer_stop = true;

lapb->t1  = LAPB_DEFAULT_T1;
lapb->t2  = LAPB_DEFAULT_T2;
@@ -129,6 +131,8 @@ static struct lapb_cb *lapb_create_cb(void)
lapb->mode= LAPB_DEFAULT_MODE;
lapb->window  = LAPB_DEFAULT_WINDOW;
lapb->state   = LAPB_STATE_0;
+
+   spin_lock_init(>lock);
refcount_set(>refcnt, 1);
 out:
return lapb;
@@ -178,11 +182,23 @@ int lapb_unregister(struct net_device *dev)
goto out;
lapb_put(lapb);

+   /* Wait for other refs to "lapb" to drop */
+   while (refcount_read(>refcnt) > 2)
+   ;
+
+   spin_lock_bh(>lock);
+
lapb_stop_t1timer(lapb);
lapb_stop_t2timer(lapb);

lapb_clear_queues(lapb);

+   spin_unlock_bh(>lock);
+
+   /* Wait for running timers to stop */
+   del_timer_sync(>t1timer);
+   del_timer_sync(>t2timer);
+
__lapb_remove_cb(lapb);

lapb_put(lapb);
@@ -201,6 +217,8 @@ int lapb_getparms(struct net_device *dev, struct
lapb_parms_struct *parms)
if (!lapb)
goto out;

+   spin_lock_bh(>lock);
+
parms->t1  = lapb->t1 / HZ;
parms->t2  = lapb->t2 / HZ;
parms->n2  = lapb->n2;
@@ -219,6 +237,7 @@ int lapb_getparms(struct net_device *dev, struct
lapb_parms_struct *parms)
else
parms->t2timer = (lapb->t2timer.expires - jiffies) / HZ;

+   spin_unlock_bh(>lock);
lapb_put(lapb);
rc = LAPB_OK;
 out:
@@ -234,6 +253,8 @@ int lapb_setparms(struct net_device *dev, struct
lapb_parms_struct *parms)
if (!lapb)
goto out;

+   spin_lock_bh(>lock);
+
rc = LAPB_INVALUE;
if (parms->t1 < 1 || parms->t2 < 1 || parms->n2 < 1)
goto out_put;
@@ -256,6 +277,7 @@ int lapb_setparms(struct net_device *dev, struct
lapb_parms_struct *parms)

rc = LAPB_OK;
 out_put:
+   spin_unlock_bh(>lock);
lapb_put(lapb);
 out:
return rc;
@@