Re: [ovs-dev] dpif-netdev:fix reload pmd's dead lock

2019-02-11 Thread Lilijun (Jerry, Cloud Networking)
Ok, I got it, thanks for your explanation.

> -Original Message-
> From: Ilya Maximets [mailto:i.maxim...@samsung.com]
> Sent: Monday, February 11, 2019 9:55 PM
> To: Lilijun (Jerry, Cloud Networking) ; Stokes, Ian
> ; d...@openvswitch.org
> Subject: Re: [ovs-dev] dpif-netdev:fix reload pmd's dead lock
> 
> On 11.02.2019 6:22, Lilijun (Jerry, Cloud Networking) wrote:
> > Yes, Thanks for your help again.
> >
> > Let's forget the patch :). But in my opinion, the function
> reload_affected_pmds() doesn't need be locked within  dp->port_mutex.  Is
> that true?
> 
> That's true that 'reload_affected_pmds()' itself doesn't need to be locked.
> However, you can't just unlock the mutex and lock it again, because upper
> layer functions that calls 'reconfigure_datapath()' expects that mutex will be
> held all the way down.
> For example, 'dp_netdev_free()' iterates over the 'dp->ports' hmap and it
> doesn't expect that 'do_del_port()' could unlock the mutex. If hmap will be
> changed (some ports added), the iterator could crash.
> Another example is 'dpif_dummy_change_port_number()' that calls the
> reconfiguration twice in a row expecting no changes in 'dp->ports' between.
> 
> Best regards, Ilya Maximets.
> 
> >
> >> -Original Message-
> >> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> >> boun...@openvswitch.org] On Behalf Of Stokes, Ian
> >> Sent: Thursday, February 07, 2019 5:25 PM
> >> To: Ilya Maximets ; d...@openvswitch.org
> >> Subject: Re: [ovs-dev] dpif-netdev:fix reload pmd's dead lock
> >>
> >>> On 31.01.2019 11:48, Lilijun wrote:
> >>>> This patch fix the dead lock when using dpdk userspace datapath.
> >>>> The problem is described as follows:
> >>>> 1) when add or delete port, the main thread will call
> >>>> reconfigure_datapath() in the function dpif_netdev_run()
> >>>> 2) Here the dp->port_mutex is locked. In dp_netdev_reload_pmd__(),
> >>>> it will notify each pmd to reload.
> >>>> 3) If pmd is doing packet upcall in fast_path_processing() and try
> >>>> to get
> >>>> dp->port_mutex in
> >>>> do_xlate_actions()-->tnl_route_lookup_flow()--
> >>>> dpif_netdev_port_query_by_name().
> >>>> Here pmd get lock failed because the main thread has got the lock
> >>>> in step 2.
> >>>> 4) So the main thread was stuck for waiting pmd to reload done. Now
> >>>> we  got a dead lock.
> >>>>
> >>>> Here reload_affected_pmds() may not need to lock dp->port_mutex.
> So
> >>>> we release the lock temporarily when calling  reload_affected_pmds().
> >>>>
> >>>> Signed-off-by: Lilijun 
> >>>
> >>> Replying just to keep answers on a list/patchwork consistent.
> >>> The deadlock caused by some local changes done by the user.
> >>> Not possible in upstream master. See discussion for the previous
> >>> version of this patch that is missing in patchwork for some reason:
> >>>
> >>>
> >>> https://mail.openvswitch.org/pipermail/ovs-dev/2019-
> >> January/355756.htm
> >>> l
> >>>
> >>> Best regards, Ilya Maximets.
> >>>
> >>
> >> Thanks for clarifying Ilya, there was discussion of this at the
> >> community meeting yesterday but it seems a non-issue now.
> >>
> >> Thanks
> >> Ian
> >> ___
> >> dev mailing list
> >> d...@openvswitch.org
> >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
> >
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] dpif-netdev:fix reload pmd's dead lock

2019-02-10 Thread Lilijun (Jerry, Cloud Networking)
Yes, Thanks for your help again.

Let's forget the patch :). But in my opinion, the function 
reload_affected_pmds() doesn't need be locked within  dp->port_mutex.  Is that 
true?

> -Original Message-
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> boun...@openvswitch.org] On Behalf Of Stokes, Ian
> Sent: Thursday, February 07, 2019 5:25 PM
> To: Ilya Maximets ; d...@openvswitch.org
> Subject: Re: [ovs-dev] dpif-netdev:fix reload pmd's dead lock
> 
> > On 31.01.2019 11:48, Lilijun wrote:
> > > This patch fix the dead lock when using dpdk userspace datapath. The
> > > problem is described as follows:
> > > 1) when add or delete port, the main thread will call
> > > reconfigure_datapath() in the function dpif_netdev_run()
> > > 2) Here the dp->port_mutex is locked. In dp_netdev_reload_pmd__(),
> > > it will notify each pmd to reload.
> > > 3) If pmd is doing packet upcall in fast_path_processing() and try
> > > to get
> > > dp->port_mutex in
> > > do_xlate_actions()-->tnl_route_lookup_flow()--
> > >dpif_netdev_port_query_by_name().
> > > Here pmd get lock failed because the main thread has got the lock in
> > >step 2.
> > > 4) So the main thread was stuck for waiting pmd to reload done. Now
> > >we  got a dead lock.
> > >
> > > Here reload_affected_pmds() may not need to lock dp->port_mutex. So
> > > we release the lock temporarily when calling  reload_affected_pmds().
> > >
> > > Signed-off-by: Lilijun 
> >
> > Replying just to keep answers on a list/patchwork consistent.
> > The deadlock caused by some local changes done by the user.
> > Not possible in upstream master. See discussion for the previous
> > version of this patch that is missing in patchwork for some reason:
> >
> >
> > https://mail.openvswitch.org/pipermail/ovs-dev/2019-
> January/355756.htm
> > l
> >
> > Best regards, Ilya Maximets.
> >
> 
> Thanks for clarifying Ilya, there was discussion of this at the community
> meeting yesterday but it seems a non-issue now.
> 
> Thanks
> Ian
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] dpif-netdev:fix reload pmd's dead lock

2019-02-07 Thread Stokes, Ian
> On 31.01.2019 11:48, Lilijun wrote:
> > This patch fix the dead lock when using dpdk userspace datapath. The
> > problem is described as follows:
> > 1) when add or delete port, the main thread will call
> > reconfigure_datapath() in the function dpif_netdev_run()
> > 2) Here the dp->port_mutex is locked. In dp_netdev_reload_pmd__(), it
> > will notify each pmd to reload.
> > 3) If pmd is doing packet upcall in fast_path_processing() and try to
> > get
> > dp->port_mutex in
> > do_xlate_actions()-->tnl_route_lookup_flow()--
> >dpif_netdev_port_query_by_name().
> > Here pmd get lock failed because the main thread has got the lock in
> > step 2.
> > 4) So the main thread was stuck for waiting pmd to reload done. Now we
> > got a dead lock.
> >
> > Here reload_affected_pmds() may not need to lock dp->port_mutex. So we
> > release the lock temporarily when calling  reload_affected_pmds().
> >
> > Signed-off-by: Lilijun 
> 
> Replying just to keep answers on a list/patchwork consistent.
> The deadlock caused by some local changes done by the user.
> Not possible in upstream master. See discussion for the previous version
> of this patch that is missing in patchwork for some reason:
> 
>https://mail.openvswitch.org/pipermail/ovs-dev/2019-January/355756.html
> 
> Best regards, Ilya Maximets.
> 

Thanks for clarifying Ilya, there was discussion of this at the community 
meeting yesterday but it seems a non-issue now.

Thanks
Ian
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] dpif-netdev:fix reload pmd's dead lock

2019-02-06 Thread Ilya Maximets
On 31.01.2019 11:48, Lilijun wrote:
> This patch fix the dead lock when using dpdk userspace datapath. The
> problem is described as follows:
> 1) when add or delete port, the main thread will call
> reconfigure_datapath() in the function dpif_netdev_run()
> 2) Here the dp->port_mutex is locked. In dp_netdev_reload_pmd__(), it
> will notify each pmd to reload.
> 3) If pmd is doing packet upcall in fast_path_processing() and try to get
> dp->port_mutex in
> do_xlate_actions()-->tnl_route_lookup_flow()-->dpif_netdev_port_query_by_name().
> Here pmd get lock failed because the main thread has got the lock in
> step 2.
> 4) So the main thread was stuck for waiting pmd to reload done. Now we
> got a dead lock.
> 
> Here reload_affected_pmds() may not need to lock dp->port_mutex. So we
> release the lock temporarily when calling  reload_affected_pmds().
> 
> Signed-off-by: Lilijun 

Replying just to keep answers on a list/patchwork consistent.
The deadlock caused by some local changes done by the user.
Not possible in upstream master. See discussion for the previous
version of this patch that is missing in patchwork for some reason:

   https://mail.openvswitch.org/pipermail/ovs-dev/2019-January/355756.html

Best regards, Ilya Maximets.

> ---
>  lib/dpif-netdev.c | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 0f57e3f8a..fc3bdae66 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -4813,8 +4813,11 @@ reconfigure_datapath(struct dp_netdev *dp)
>  
>  /* Reload affected pmd threads.  We must wait for the pmd threads before
>   * reconfiguring the ports, because a port cannot be reconfigured while
> - * it's being used. */
> + * it's being used. We need release dp->port_mutex to make sure that pmds
> + * don't wait for getting the mutex when handling packet upcalls */
> +ovs_mutex_unlock(>port_mutex);
>  reload_affected_pmds(dp);
> +ovs_mutex_lock(>port_mutex);
>  
>  /* Step 3: Reconfigure ports. */
>  
> @@ -4877,7 +4880,9 @@ reconfigure_datapath(struct dp_netdev *dp)
>  /* Reload affected pmd threads.  We must wait for the pmd threads to 
> remove
>   * the old queues before readding them, otherwise a queue can be polled 
> by
>   * two threads at the same time. */
> +ovs_mutex_unlock(>port_mutex);
>  reload_affected_pmds(dp);
> +ovs_mutex_lock(>port_mutex);
>  
>  /* Step 6: Add queues from scheduling, if they're not there already. */
>  HMAP_FOR_EACH (port, node, >ports) {
> @@ -4909,7 +4914,9 @@ reconfigure_datapath(struct dp_netdev *dp)
>  }
>  
>  /* Reload affected pmd threads. */
> +ovs_mutex_unlock(>port_mutex);
>  reload_affected_pmds(dp);
> +ovs_mutex_lock(>port_mutex);
>  
>  /* Check if PMD Auto LB is to be enabled */
>  set_pmd_auto_lb(dp);
> 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] dpif-netdev:fix reload pmd's dead lock

2019-01-31 Thread Lilijun (Jerry, Cloud Networking)
Yes you're right, the master branch has no problem here. I am sorry for that 
it's my own code's errors causing the deadlock.
We MUST make sure that dp->port_mutex is not used in pmd thread context.

Thanks very much.

> -Original Message-
> From: Ilya Maximets [mailto:i.maxim...@samsung.com]
> Sent: Thursday, January 31, 2019 5:12 PM
> To: Lilijun (Jerry, Cloud Networking) ;
> d...@openvswitch.org
> Cc: Stokes, Ian 
> Subject: Re: [ovs-dev] dpif-netdev:fix reload pmd's dead lock
> 
> Hi.
> 
> On 31.01.2019 8:57, Lilijun wrote:
> > This patch fix the dead lock when using dpdk userspace datapath. The
> > problem is described as follows:
> > 1) when add or delete port, the main thread will call
> > reconfigure_datapath() in the function dpif_netdev_run()
> > 2) Here the dp->port_mutex is locked. In dp_netdev_reload_pmd__(), it
> > will notify each pmd to reload.
> > 3) If pmd is doing packet upcall in fast_path_processing() and try to
> > get
> > dp->port_mutex in
> > do_xlate_actions()-->tnl_route_lookup_flow()--
> >dpif_netdev_port_query_by_name().
> 
> Beside the fact that this patch could not be applied to master because of
> some conflicts, I do not see how the dpif_netdev_port_query_by_name()
> could be called from tnl_route_lookup_flow(). What OVS version you're
> using ?
> Could you, please, provide more detailed call trace for this deadlock ?
> 
> Also, reload_affected_pmds() called 4 times during the reconfiguration.
> Why you unlocking only for these two ?
> 
> > Here pmd get lock failed because the main thread has got the lock in
> > step 2.
> > 4) So the main thread was stuck for waiting pmd to reload done. Now we
> > got a dead lock.
> >
> > Here reload_affected_pmds() may not need to lock dp->port_mutex. So
> we
> > release the lock temporarily when calling  reload_affected_pmds().
> >
> > Signed-off-by: Lilijun 
> > ---
> >  lib/dpif-netdev.c | 7 ++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index
> > 1564db9c6..bfd6aa74c 100644
> > --- a/lib/dpif-netdev.c
> > +++ b/lib/dpif-netdev.c
> > @@ -4665,8 +4665,11 @@ reconfigure_datapath(struct dp_netdev *dp)
> >
> >  /* Reload affected pmd threads.  We must wait for the pmd threads
> before
> >   * reconfiguring the ports, because a port cannot be reconfigured while
> > - * it's being used. */
> > + * it's being used. We need release dp->port_mutex to make sure that
> pmds
> > + * don't wait for getting the mutex when handling packet upcalls*/
> > +ovs_mutex_unlock(>port_mutex);
> >  reload_affected_pmds(dp);
> > +ovs_mutex_lock(>port_mutex);
> >
> >  /* Step 3: Reconfigure ports. */
> >
> > @@ -4761,7 +4764,9 @@ reconfigure_datapath(struct dp_netdev *dp)
> >  }
> >
> >  /* Reload affected pmd threads. */
> > +ovs_mutex_unlock(>port_mutex);
> >  reload_affected_pmds(dp);
> > +ovs_mutex_lock(>port_mutex);
> >  }
> >
> >  /* Returns true if one of the netdevs in 'dp' requires a
> > reconfiguration */
> >
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] dpif-netdev:fix reload pmd's dead lock

2019-01-31 Thread Ilya Maximets
Hi.

On 31.01.2019 8:57, Lilijun wrote:
> This patch fix the dead lock when using dpdk userspace datapath. The
> problem is described as follows:
> 1) when add or delete port, the main thread will call
> reconfigure_datapath() in the function dpif_netdev_run()
> 2) Here the dp->port_mutex is locked. In dp_netdev_reload_pmd__(), it
> will notify each pmd to reload.
> 3) If pmd is doing packet upcall in fast_path_processing() and try to get
> dp->port_mutex in
> do_xlate_actions()-->tnl_route_lookup_flow()-->dpif_netdev_port_query_by_name().

Beside the fact that this patch could not be applied to master because of
some conflicts, I do not see how the dpif_netdev_port_query_by_name() could
be called from tnl_route_lookup_flow(). What OVS version you're using ?
Could you, please, provide more detailed call trace for this deadlock ?

Also, reload_affected_pmds() called 4 times during the reconfiguration.
Why you unlocking only for these two ?

> Here pmd get lock failed because the main thread has got the lock in
> step 2.
> 4) So the main thread was stuck for waiting pmd to reload done. Now we
> got a dead lock.
> 
> Here reload_affected_pmds() may not need to lock dp->port_mutex. So we
> release the lock temporarily when calling  reload_affected_pmds().
> 
> Signed-off-by: Lilijun 
> ---
>  lib/dpif-netdev.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 1564db9c6..bfd6aa74c 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -4665,8 +4665,11 @@ reconfigure_datapath(struct dp_netdev *dp)
>  
>  /* Reload affected pmd threads.  We must wait for the pmd threads before
>   * reconfiguring the ports, because a port cannot be reconfigured while
> - * it's being used. */
> + * it's being used. We need release dp->port_mutex to make sure that pmds
> + * don't wait for getting the mutex when handling packet upcalls*/
> +ovs_mutex_unlock(>port_mutex);
>  reload_affected_pmds(dp);
> +ovs_mutex_lock(>port_mutex);
>  
>  /* Step 3: Reconfigure ports. */
>  
> @@ -4761,7 +4764,9 @@ reconfigure_datapath(struct dp_netdev *dp)
>  }
>  
>  /* Reload affected pmd threads. */
> +ovs_mutex_unlock(>port_mutex);
>  reload_affected_pmds(dp);
> +ovs_mutex_lock(>port_mutex);
>  }
>  
>  /* Returns true if one of the netdevs in 'dp' requires a reconfiguration */
> 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] dpif-netdev:fix reload pmd's dead lock

2019-01-30 Thread 0-day Robot
Bleep bloop.  Greetings Lilijun, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


git-am:
Failed to merge in the changes.
Patch failed at 0001 dpif-netdev:fix reload pmd's dead lock
The copy of the patch that failed is found in:
   
/var/lib/jenkins/jobs/upstream_build_from_pw/workspace/.git/rebase-apply/patch
When you have resolved this problem, run "git am --resolved".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


Please check this out.  If you feel there has been an error, please email 
acon...@bytheb.org

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev