Re: [PATCH net-next v6 00/11] Modify action API for implementing lockless actions

2018-07-13 Thread Vlad Buslov


On Fri 13 Jul 2018 at 03:54, Cong Wang  wrote:
> On Sat, Jul 7, 2018 at 8:43 PM David Miller  wrote:
>>
>> From: Vlad Buslov 
>> Date: Thu,  5 Jul 2018 17:24:22 +0300
>>
>> > Currently, all netlink protocol handlers for updating rules, actions and
>> > qdiscs are protected with single global rtnl lock which removes any
>> > possibility for parallelism. This patch set is a first step to remove
>> > rtnl lock dependency from TC rules update path.
>>  ...
>>
>> I'll apply this for now, I reviewed it a few more times and I see
>> where you are going with this.
>
> Dear David,
>
> I don't understand why you even believe the claim of lockless
> updaters here, it at least should raise a red flag when you see any
> claim of this kind.
>
> I know you don't trust me, how about thinking it in this way:
>
> Why does RCU still require a lock for RCU writers? (Or at least
> RCU recommends a lock, if anyone really wants to point out some
> lockless algorithm here.)
>
> or:
>
> If writers could really go lockless as easily as Vlad claims, how could
> even Paul E. McKenney never bring it into RCU?
>
> Maybe Vlad is much cleverer than any of us here, and maybe he really
> discovers a very brilliant algorithm to allow TC actions to be updated
> locklessly, why not wait until he shows a proof (either code or a paper)?
> Is there a rush? I don't see it.
>
> In fact, I discussed this with Vlad a little bit at netdev TC workshop.
> I never saw any brilliant algorithm in his slides, and I was
> told by him he used "copy and replace" to achieve parallel updaters, I
> told him that is basically how RCU works and RCU writers have to be
> sync'ed with a lock (or at least recommended).
>
> Also, to confirm my judgement, I checked this with Paul privately too.
> Paul said you have to be extremely careful to go lockless, it is very hard
> to be bug free for lockless, although he _never_ says it is impossible.
>
> My _personal_ bet is that, lockless updates for TC filters or actions
> are impossible unless there are more things hiding behind "copy and
> replace", for example, some brilliant lockless algorithm. If lockless is
> really impossible in this circumstance, then many of your efforts in
> this patchset are vain, by the way.
>
> I _do_ believe you can break RTNL down to per device, per filter or per
> action, but no matter how small the locking scope is, there is still a lock.
> With a lock, there is no need to make things friendly to lockless, like
> making an integer increment inside an action to be atomic (your patch
> 02/11).
>
> Please _do_ prove my personal judgement is wrong, by showing your
> final code or a formal paper/article. I am very *happy* to be proved
> to be wrong here, I am very open to change my mind here.
>
> Vlad, we need your proof. Please prove I am wrong, seriously!!! :)
>
> Thanks to anyone who proves me wrong, just in case!!! :)

Dear Cong,

I never claimed to have some new brilliant algorithm that completely
removes all locks from the rules update path. Obviously, fine-grained
locking is introduced where necessary. I'm sorry if my liberal usage of
the term "lockless" confused you. I guess I should have been more specific.
I fully agree with you that totally removing any and all locks from the
rules update path would require some engineering marvel.


Re: [PATCH net-next v6 00/11] Modify action API for implementing lockless actions

2018-07-12 Thread Cong Wang
On Sat, Jul 7, 2018 at 8:43 PM David Miller  wrote:
>
> From: Vlad Buslov 
> Date: Thu,  5 Jul 2018 17:24:22 +0300
>
> > Currently, all netlink protocol handlers for updating rules, actions and
> > qdiscs are protected with single global rtnl lock which removes any
> > possibility for parallelism. This patch set is a first step to remove
> > rtnl lock dependency from TC rules update path.
>  ...
>
> I'll apply this for now, I reviewed it a few more times and I see
> where you are going with this.

Dear David,

I don't understand why you even believe the claim of lockless
updaters here, it at least should raise a red flag when you see any
claim of this kind.

I know you don't trust me, how about thinking it in this way:

Why does RCU still require a lock for RCU writers? (Or at least
RCU recommends a lock, if anyone really wants to point out some
lockless algorithm here.)

or:

If writers could really go lockless as easily as Vlad claims, how could
even Paul E. McKenney never bring it into RCU?

Maybe Vlad is much cleverer than any of us here, and maybe he really
discovers a very brilliant algorithm to allow TC actions to be updated
locklessly, why not wait until he shows a proof (either code or a paper)?
Is there a rush? I don't see it.

In fact, I discussed this with Vlad a little bit at netdev TC workshop.
I never saw any brilliant algorithm in his slides, and I was
told by him he used "copy and replace" to achieve parallel updaters, I
told him that is basically how RCU works and RCU writers have to be
sync'ed with a lock (or at least recommended).

Also, to confirm my judgement, I checked this with Paul privately too.
Paul said you have to be extremely careful to go lockless, it is very hard
to be bug free for lockless, although he _never_ says it is impossible.

My _personal_ bet is that, lockless updates for TC filters or actions
are impossible unless there are more things hiding behind "copy and
replace", for example, some brilliant lockless algorithm. If lockless is
really impossible in this circumstance, then many of your efforts in
this patchset are vain, by the way.

I _do_ believe you can break RTNL down to per device, per filter or per
action, but no matter how small the locking scope is, there is still a lock.
With a lock, there is no need to make things friendly to lockless, like
making an integer increment inside an action to be atomic (your patch
02/11).

Please _do_ prove my personal judgement is wrong, by showing your
final code or a formal paper/article. I am very *happy* to be proved
to be wrong here, I am very open to change my mind here.

Vlad, we need your proof. Please prove I am wrong, seriously!!! :)

Thanks to anyone who proves me wrong, just in case!!! :)


Re: [PATCH net-next v6 00/11] Modify action API for implementing lockless actions

2018-07-07 Thread David Miller
From: Vlad Buslov 
Date: Thu,  5 Jul 2018 17:24:22 +0300

> Currently, all netlink protocol handlers for updating rules, actions and
> qdiscs are protected with single global rtnl lock which removes any
> possibility for parallelism. This patch set is a first step to remove
> rtnl lock dependency from TC rules update path.
 ...

I'll apply this for now, I reviewed it a few more times and I see
where you are going with this.

I hope there are no new performance regressions in the control path
for cases people care about, and if there are I definitely expect
you to address them.

Thank you.


Re: [PATCH net-next v6 00/11] Modify action API for implementing lockless actions

2018-07-07 Thread David Miller
From: Vlad Buslov 
Date: Thu,  5 Jul 2018 17:24:22 +0300

> Currently, all netlink protocol handlers for updating rules, actions and
> qdiscs are protected with single global rtnl lock which removes any
> possibility for parallelism. This patch set is a first step to remove
> rtnl lock dependency from TC rules update path.

I've reviewed this a few times but since this is a rather non-trivial
set of changes I'm going to let others have a chance to review and
give feedback as well.

Thanks.


[PATCH net-next v6 00/11] Modify action API for implementing lockless actions

2018-07-05 Thread Vlad Buslov
Currently, all netlink protocol handlers for updating rules, actions and
qdiscs are protected with single global rtnl lock which removes any
possibility for parallelism. This patch set is a first step to remove
rtnl lock dependency from TC rules update path.

Recently, a new rtnl registration flag, RTNL_FLAG_DOIT_UNLOCKED, was added.
Handlers registered with this flag are called without RTNL taken. The end
goal is to have rule update handlers (RTM_NEWTFILTER, RTM_DELTFILTER,
etc.) registered with the UNLOCKED flag to allow parallel execution.
However, there is no intention to completely remove or split the rtnl lock
itself. This patch set addresses specific problems in the action API that
prevent it from being executed concurrently. This patch set does not
completely unlock the rules or actions update path. Additional patch sets
are required to refactor individual action and filter updates for
parallel execution.

As preparation for executing TC rules update handlers without the rtnl
lock, the action API code was audited to determine areas that assume
external synchronization with the rtnl lock and must be changed to allow
safe concurrent access, with the following results:

1. The action idr is already protected with a spinlock. However, some code
   paths assume that idr state does not change between several
   consecutive tcf_idr_* function calls.
2. tc_action reference and bind counters are implemented as plain
   integers. Their purpose was to allow single actions to be shared
   between multiple filters, not to provide a means for concurrent
   modification.
3. The tc_action 'cookie' pointer field is not protected against
   modification.
4. Action API functions that work with a set of actions use an intrusive
   linked list, which cannot be used concurrently without additional
   synchronization.
5. Action API functions don't take a reference to actions while using
   them, assuming external synchronization with the rtnl lock.

The following solutions to these problems are implemented:

1. To remove the assumption that idr state doesn't change between tcf_idr_*
   calls, implement new functions that atomically perform several
   operations on the idr without releasing the idr spinlock (a function to
   atomically look up and delete an action by index, a function to
   atomically check if an action exists and allocate a new one if
   necessary, etc.).
2. Use atomic operations on counters to make them suitable for
   concurrent get/put operations.
3. Data that 'cookie' points to is never modified, so it is enough to
   refactor it to an rcu pointer to prevent concurrent de-allocation.
4. The action API doesn't actually use any linked-list-specific operations
   on the actions intrusive linked list, so it can be refactored to an
   array in a straightforward manner.
5. Always take a reference to an action while accessing it in the action
   API. The tcf_idr_search function is modified to take a reference to the
   action before returning it, so there is no way to look up an action
   without incrementing its reference counter. All users of this function
   are modified to release the reference after they are done using the
   action. With all users using reference counting, it is now safe to
   concurrently delete actions.
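
Items 1, 2 and 5 above can be illustrated with a small userspace sketch.
This is not the kernel code from the series: struct action, struct
action_table and all function names below are hypothetical stand-ins, a
fixed array plus a pthread mutex replaces the idr and its spinlock, and
C11 atomics replace the kernel counters. It only shows the general shape:
check-and-allocate in one critical section, lookup that always returns a
referenced object, and a put that unlinks under the same lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TABLE_SIZE 16

/* Hypothetical stand-in for an action; not the kernel's struct tc_action. */
struct action {
	unsigned int index;
	atomic_int refcnt;
};

/* Hypothetical stand-in for the action idr and its spinlock. */
struct action_table {
	pthread_mutex_t lock;
	struct action *slots[TABLE_SIZE];
};

static void table_init(struct action_table *t)
{
	pthread_mutex_init(&t->lock, NULL);
	for (size_t i = 0; i < TABLE_SIZE; i++)
		t->slots[i] = NULL;
}

/* Check for an existing action and allocate a new one if needed, all in
 * one critical section, so the table cannot change between the steps. */
static struct action *action_get_or_create(struct action_table *t,
					   unsigned int index, bool *created)
{
	struct action *a;

	if (index >= TABLE_SIZE)
		return NULL;
	*created = false;
	pthread_mutex_lock(&t->lock);
	a = t->slots[index];
	if (a) {
		atomic_fetch_add(&a->refcnt, 1);
	} else {
		a = malloc(sizeof(*a));
		if (a) {
			a->index = index;
			atomic_init(&a->refcnt, 1);
			t->slots[index] = a;
			*created = true;
		}
	}
	pthread_mutex_unlock(&t->lock);
	return a;
}

/* Lookup never returns an unreferenced action: the reference is taken
 * before the lock is dropped. */
static struct action *action_search(struct action_table *t, unsigned int index)
{
	struct action *a = NULL;

	if (index >= TABLE_SIZE)
		return NULL;
	pthread_mutex_lock(&t->lock);
	a = t->slots[index];
	if (a)
		atomic_fetch_add(&a->refcnt, 1);
	pthread_mutex_unlock(&t->lock);
	return a;
}

/* A caller that already holds a reference may take another without the lock. */
static void action_get(struct action *a)
{
	int old = atomic_fetch_add(&a->refcnt, 1);

	assert(old > 0);
}

/* Drop a reference. The final put unlinks the action under the table lock,
 * so a concurrent search sees either a live action or an empty slot.
 * Returns true if the action was freed. */
static bool action_put(struct action_table *t, struct action *a)
{
	bool last;

	pthread_mutex_lock(&t->lock);
	last = atomic_fetch_sub(&a->refcnt, 1) == 1;
	if (last)
		t->slots[a->index] = NULL;
	pthread_mutex_unlock(&t->lock);
	if (last)
		free(a);
	return last;
}
```

In this sketch every successful action_get_or_create() or action_search()
must be paired with one action_put(); taking the lock on every put is a
simplification chosen for clarity, not a claim about how the kernel
implements it.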

Additionally, the actions init function signature was expanded with an
'rtnl_held' argument that allows actions that have an internal dependency
on the rtnl lock to take/release it when necessary.
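
A rough userspace sketch of how such a flag can be used follows. It is
only an illustration under stated assumptions: big_lock stands in for
the rtnl lock, and action_init(), legacy_setup() and the big_lock_owned
debugging flag are hypothetical names, not functions from the series.
The init function takes the lock itself only when the caller reports
that it is not already held.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the global rtnl lock. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

/* Demo-only ownership flag, written only while big_lock is held. */
static bool big_lock_owned;

static void big_lock_acquire(void)
{
	pthread_mutex_lock(&big_lock);
	big_lock_owned = true;
}

static void big_lock_release(void)
{
	assert(big_lock_owned);
	big_lock_owned = false;
	pthread_mutex_unlock(&big_lock);
}

/* Hypothetical helper that still depends on the big lock being held. */
static int legacy_setup(int *state)
{
	if (!big_lock_owned)
		return -1;
	*state += 1;
	return 0;
}

/* Init function with an extra argument describing the caller's locking
 * context: it takes and releases the lock only when the caller does not
 * already hold it. */
static int action_init(int *state, bool big_lock_held)
{
	int err;

	if (!big_lock_held)
		big_lock_acquire();
	err = legacy_setup(state);
	if (!big_lock_held)
		big_lock_release();
	return err;
}
```

A caller running without the lock passes false and a caller that already
holds it passes true; in both cases the lock state on return matches the
state on entry.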

Since the only shared state in the action API module is the actions
themselves and the action idr, these changes are sufficient to stop relying
on the global rtnl lock for protection of internal action API data structures.

Changes from V5 to V6:
- Rebase on current net-next
- When action is deleted, set pointer in actions array to NULL to
  prevent double freeing.

Changes from V4 to V5:
- Change action delete API to track actions that were deleted, to
  prevent releasing them on error.

Changes from V3 to V4:
- Expand cover letter.
- Reduce actions array size in tcf_action_init_1.
- Rebase on latest net-next.

Changes from V2 to V3:
- Re-send with changelog copied to individual patches.

Changes from V1 to V2:
- Removed redundant actions ops lookup during delete.
- Merge action ops delete definition and implementation.
- Assume all actions have delete implemented and don't check for it
  explicitly.
- Resplit action lookup/release code to prevent memory leaks in
  individual patches.
- Make __tcf_idr_check function static
- Remove unique idr insertion function. Change original idr insert to do
  the same thing.
- Merge changes that take reference to action when performing lookup and
  changes that account for this additional reference when dumping action
  to user space into single patch.
- Change convoluted commit message.
- Rename "unlocked" to "rtnl_held" for clarity.
- Remove estimator lock add patch.
- Refactor action check-alloc code into standalone function.
- Rename tcf_idr_find_delete to tcf_idr_delete_index.
- Rearrange variable definitions in tc_action_delete.
- Add patch that refactors action API code to use array of pointers to
  actions