Re: [PATCH net-next 7/7] net: marvell: prestera: fix port event handling on init

2021-02-05 Thread Vadym Kochan
Hi Jakub,

On Thu, Feb 04, 2021 at 09:19:34PM -0800, Jakub Kicinski wrote:
> On Wed,  3 Feb 2021 18:54:58 +0200 Vadym Kochan wrote:
> > For some reason there might be a crash during ports creation if port
> > events are handling at the same time  because fw may send initial
> > port event with down state.
> > 
> > The crash points to cancel_delayed_work() which is called when port went
> > is down.  Currently I did not find out the real cause of the issue, so
> > fixed it by cancel port stats work only if previous port's state was up
> > & runnig.
> 
> Maybe you just need to move the DELAYED_WORK_INIT() earlier?
> Not sure why it's at the end of prestera_port_create(), it 
> just initializes some fields.
> 

Thanks for suggestion, but it does not help. Will try to find-out the
real root cause but this is the only fix I 'v came up.

> > [   28.489791] Call trace:
> > [   28.492259]  get_work_pool+0x48/0x60
> > [   28.495874]  cancel_delayed_work+0x38/0xb0
> > [   28.500011]  prestera_port_handle_event+0x90/0xa0 [prestera]
> > [   28.505743]  prestera_evt_recv+0x98/0xe0 [prestera]
> > [   28.510683]  prestera_fw_evt_work_fn+0x180/0x228 [prestera_pci]
> > [   28.516660]  process_one_work+0x1e8/0x360
> > [   28.520710]  worker_thread+0x44/0x480
> > [   28.524412]  kthread+0x154/0x160
> > [   28.527670]  ret_from_fork+0x10/0x38
> > [   28.531290] Code: a8c17bfd d50323bf d65f03c0 9278dc21 (f9400020)
> > [   28.537429] ---[ end trace 5eced933df3a080b ]---
> > 
> > Signed-off-by: Vadym Kochan 
> > ---
> >  drivers/net/ethernet/marvell/prestera/prestera_main.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/net/ethernet/marvell/prestera/prestera_main.c 
> > b/drivers/net/ethernet/marvell/prestera/prestera_main.c
> > index 39465e65d09b..122324dae47d 100644
> > --- a/drivers/net/ethernet/marvell/prestera/prestera_main.c
> > +++ b/drivers/net/ethernet/marvell/prestera/prestera_main.c
> > @@ -433,7 +433,8 @@ static void prestera_port_handle_event(struct 
> > prestera_switch *sw,
> > netif_carrier_on(port->dev);
> > if (!delayed_work_pending(caching_dw))
> > queue_delayed_work(prestera_wq, caching_dw, 0);
> > -   } else {
> > +   } else if (netif_running(port->dev) &&
> > +  netif_carrier_ok(port->dev)) {
> > netif_carrier_off(port->dev);
> > if (delayed_work_pending(caching_dw))
> > cancel_delayed_work(caching_dw);
> 


Re: [PATCH net-next 7/7] net: marvell: prestera: fix port event handling on init

2021-02-04 Thread Jakub Kicinski
On Wed,  3 Feb 2021 18:54:58 +0200 Vadym Kochan wrote:
> For some reason there might be a crash during ports creation if port
> events are handling at the same time  because fw may send initial
> port event with down state.
> 
> The crash points to cancel_delayed_work() which is called when port went
> is down.  Currently I did not find out the real cause of the issue, so
> fixed it by cancel port stats work only if previous port's state was up
> & runnig.

Maybe you just need to move the DELAYED_WORK_INIT() earlier?
Not sure why it's at the end of prestera_port_create(), it 
just initializes some fields.

> [   28.489791] Call trace:
> [   28.492259]  get_work_pool+0x48/0x60
> [   28.495874]  cancel_delayed_work+0x38/0xb0
> [   28.500011]  prestera_port_handle_event+0x90/0xa0 [prestera]
> [   28.505743]  prestera_evt_recv+0x98/0xe0 [prestera]
> [   28.510683]  prestera_fw_evt_work_fn+0x180/0x228 [prestera_pci]
> [   28.516660]  process_one_work+0x1e8/0x360
> [   28.520710]  worker_thread+0x44/0x480
> [   28.524412]  kthread+0x154/0x160
> [   28.527670]  ret_from_fork+0x10/0x38
> [   28.531290] Code: a8c17bfd d50323bf d65f03c0 9278dc21 (f9400020)
> [   28.537429] ---[ end trace 5eced933df3a080b ]---
> 
> Signed-off-by: Vadym Kochan 
> ---
>  drivers/net/ethernet/marvell/prestera/prestera_main.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/marvell/prestera/prestera_main.c 
> b/drivers/net/ethernet/marvell/prestera/prestera_main.c
> index 39465e65d09b..122324dae47d 100644
> --- a/drivers/net/ethernet/marvell/prestera/prestera_main.c
> +++ b/drivers/net/ethernet/marvell/prestera/prestera_main.c
> @@ -433,7 +433,8 @@ static void prestera_port_handle_event(struct 
> prestera_switch *sw,
>   netif_carrier_on(port->dev);
>   if (!delayed_work_pending(caching_dw))
>   queue_delayed_work(prestera_wq, caching_dw, 0);
> - } else {
> + } else if (netif_running(port->dev) &&
> +netif_carrier_ok(port->dev)) {
>   netif_carrier_off(port->dev);
>   if (delayed_work_pending(caching_dw))
>   cancel_delayed_work(caching_dw);



[PATCH net-next 7/7] net: marvell: prestera: fix port event handling on init

2021-02-03 Thread Vadym Kochan
For some reason there might be a crash during ports creation if port
events are handling at the same time  because fw may send initial
port event with down state.

The crash points to cancel_delayed_work() which is called when port went
is down.  Currently I did not find out the real cause of the issue, so
fixed it by cancel port stats work only if previous port's state was up
& runnig.

The following is the crash which can be triggered:

[   28.311104] Unable to handle kernel paging request at virtual address
71775f776600
[   28.319097] Mem abort info:
[   28.321914]   ESR = 0x9604
[   28.324996]   EC = 0x25: DABT (current EL), IL = 32 bits
[   28.330350]   SET = 0, FnV = 0
[   28.333430]   EA = 0, S1PTW = 0
[   28.336597] Data abort info:
[   28.339499]   ISV = 0, ISS = 0x0004
[   28.343362]   CM = 0, WnR = 0
[   28.346354] user pgtable: 4k pages, 48-bit VAs, pgdp=000100bf7000
[   28.352842] [71775f776600] pgd=,
p4d=
[   28.359695] Internal error: Oops: 9604 [#1] PREEMPT SMP
[   28.365310] Modules linked in: prestera_pci(+) prestera
uio_pdrv_genirq
[   28.372005] CPU: 0 PID: 1291 Comm: kworker/0:1H Not tainted
5.11.0-rc4 #1
[   28.378846] Hardware name: DNI AmazonGo1 A7040 board (DT)
[   28.384283] Workqueue: prestera_fw_wq prestera_fw_evt_work_fn
[prestera_pci]
[   28.391413] pstate: 6085 (nZCv daIf -PAN -UAO -TCO BTYPE=--)
[   28.397468] pc : get_work_pool+0x48/0x60
[   28.401442] lr : try_to_grab_pending+0x6c/0x1b0
[   28.406018] sp : 80001391bc60
[   28.409358] x29: 80001391bc60 x28: 
[   28.414725] x27: 000104fc8b40 x26: 80001127de88
[   28.420089] x25:  x24: 000106119760
[   28.425452] x23: 00010775dd60 x22: 00010567e000
[   28.430814] x21:  x20: 80001391bcb0
[   28.436175] x19: 00010775deb8 x18: 00c0
[   28.441537] x17:  x16: 8d9b0e88
[   28.446898] x15: 0001 x14: 02ba
[   28.452261] x13: 80a3002c0002 x12: 05f4
[   28.457622] x11: 0030 x10: 000c
[   28.462985] x9 : 000c x8 : 0030
[   28.468346] x7 : 80001440 x6 : 000106119758
[   28.473708] x5 : 0003 x4 : 00010775dc60
[   28.479068] x3 :  x2 : 0060
[   28.484429] x1 : 71775f776600 x0 : 00010775deb8
[   28.489791] Call trace:
[   28.492259]  get_work_pool+0x48/0x60
[   28.495874]  cancel_delayed_work+0x38/0xb0
[   28.500011]  prestera_port_handle_event+0x90/0xa0 [prestera]
[   28.505743]  prestera_evt_recv+0x98/0xe0 [prestera]
[   28.510683]  prestera_fw_evt_work_fn+0x180/0x228 [prestera_pci]
[   28.516660]  process_one_work+0x1e8/0x360
[   28.520710]  worker_thread+0x44/0x480
[   28.524412]  kthread+0x154/0x160
[   28.527670]  ret_from_fork+0x10/0x38
[   28.531290] Code: a8c17bfd d50323bf d65f03c0 9278dc21 (f9400020)
[   28.537429] ---[ end trace 5eced933df3a080b ]---

Signed-off-by: Vadym Kochan 
---
 drivers/net/ethernet/marvell/prestera/prestera_main.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/marvell/prestera/prestera_main.c 
b/drivers/net/ethernet/marvell/prestera/prestera_main.c
index 39465e65d09b..122324dae47d 100644
--- a/drivers/net/ethernet/marvell/prestera/prestera_main.c
+++ b/drivers/net/ethernet/marvell/prestera/prestera_main.c
@@ -433,7 +433,8 @@ static void prestera_port_handle_event(struct 
prestera_switch *sw,
netif_carrier_on(port->dev);
if (!delayed_work_pending(caching_dw))
queue_delayed_work(prestera_wq, caching_dw, 0);
-   } else {
+   } else if (netif_running(port->dev) &&
+  netif_carrier_ok(port->dev)) {
netif_carrier_off(port->dev);
if (delayed_work_pending(caching_dw))
cancel_delayed_work(caching_dw);
-- 
2.17.1