Re: [PATCH net-next v4 00/11] Modify action API for implementing lockless actions

2018-06-01 Thread Vlad Buslov


On Fri 01 Jun 2018 at 12:24, Jamal Hadi Salim  wrote:
> On 31/05/18 08:38 AM, Vlad Buslov wrote:
>
>> Hi Jamal,
>> 
>> On current net-next I still have action with single reference after last
>> step:
>> ~$ sudo $TC -s actions ls action skbedit
>> total acts 1
>> 
>>  action order 0:  skbedit mark 1 pipe
>>   index 1 ref 2 bind 1 installed 47 sec used 47 sec
>>  Action statistics:
>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>  backlog 0b 0p requeues 0
>> ~$ sudo $TC filter del dev lo parent : protocol ip prio 1 \
>>> u32 match ip dst 127.0.0.1/32 flowid 1:1
>> ~$ sudo $TC -s actions ls action skbedit
>> total acts 1
>> 
>>  action order 0:  skbedit mark 1 pipe
>>   index 1 ref 1 bind 0 installed 80 sec used 80 sec
>>  Action statistics:
>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>  backlog 0b 0p requeues 0
>> 
>> Which branch are you testing on?
>
> You are correct - this is how it works now (I was thinking of a
> very old version before Cong made some changes a while back).
> Just vet this continues to work as above.
>
> cheers,
> jamal

Indeed, there was a problem that changed the behavior in this case.
I fixed it, re-ran the test suite, manually checked with the test you
proposed in this thread, and sent V5.

Thanks,
Vlad


Re: [PATCH net-next v4 00/11] Modify action API for implementing lockless actions

2018-06-01 Thread Jamal Hadi Salim

On 31/05/18 08:38 AM, Vlad Buslov wrote:

> Hi Jamal,
>
> On current net-next I still have action with single reference after last
> step:
> ~$ sudo $TC -s actions ls action skbedit
> total acts 1
>
>  action order 0:  skbedit mark 1 pipe
>   index 1 ref 2 bind 1 installed 47 sec used 47 sec
>  Action statistics:
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
> ~$ sudo $TC filter del dev lo parent : protocol ip prio 1 \
>> u32 match ip dst 127.0.0.1/32 flowid 1:1
> ~$ sudo $TC -s actions ls action skbedit
> total acts 1
>
>  action order 0:  skbedit mark 1 pipe
>   index 1 ref 1 bind 0 installed 80 sec used 80 sec
>  Action statistics:
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
>
> Which branch are you testing on?


You are correct - this is how it works now (I was thinking of a
very old version before Cong made some changes a while back).
Just vet this continues to work as above.

cheers,
jamal


Re: [PATCH net-next v4 00/11] Modify action API for implementing lockless actions

2018-05-31 Thread Vlad Buslov


On Thu 31 May 2018 at 10:01, Jamal Hadi Salim  wrote:
> Hi Vlad,
>
> Can you try one simple test below with these patches?
>
> #create an action
> sudo $TC actions add action skbedit mark 1 pipe
> #
> sudo $TC qdisc del dev lo parent :
> sudo $TC qdisc add dev lo ingress
> # bind action to filter
> sudo $TC filter add dev lo parent : protocol ip prio 1 \
> u32 match ip dst 127.0.0.1/32 flowid 1:1 action skbedit index 1
>
> #now delete that action multiple times while it is still bound
> sudo $TC actions del action skbedit index 1
> sudo $TC actions del action skbedit index 1
> sudo $TC actions del action skbedit index 1
>
> #check the refcount and bindcount
> sudo $TC -s actions ls action skbedit
>
> #delete the filter (which should remove the bindcnt)
>
> sudo $TC filter del dev lo parent : protocol ip prio 1 \
> u32 match ip dst 127.0.0.1/32 flowid 1:1
>
> #check the refcount and bindcount
> sudo $TC -s actions ls action skbedit
>
> Current behavior: I believe the action is gone in this last step.
> Your patches may change behavior so that the action is still
> around. I don't think this is a big deal, but just wanted to be sure
> it is not something more unexpected.
>
> cheers,
> jamal

Hi Jamal,

On current net-next I still have action with single reference after last
step:
~$ sudo $TC -s actions ls action skbedit
total acts 1

action order 0:  skbedit mark 1 pipe
 index 1 ref 2 bind 1 installed 47 sec used 47 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
~$ sudo $TC filter del dev lo parent : protocol ip prio 1 \
> u32 match ip dst 127.0.0.1/32 flowid 1:1
~$ sudo $TC -s actions ls action skbedit
total acts 1

action order 0:  skbedit mark 1 pipe
 index 1 ref 1 bind 0 installed 80 sec used 80 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0

Which branch are you testing on?

Regards,
Vlad


Re: [PATCH net-next v4 00/11] Modify action API for implementing lockless actions

2018-05-31 Thread Jamal Hadi Salim

Hi Vlad,

Can you try one simple test below with these patches?

#create an action
sudo $TC actions add action skbedit mark 1 pipe
#
sudo $TC qdisc del dev lo parent :
sudo $TC qdisc add dev lo ingress
# bind action to filter
sudo $TC filter add dev lo parent : protocol ip prio 1 \
u32 match ip dst 127.0.0.1/32 flowid 1:1 action skbedit index 1

#now delete that action multiple times while it is still bound
sudo $TC actions del action skbedit index 1
sudo $TC actions del action skbedit index 1
sudo $TC actions del action skbedit index 1

#check the refcount and bindcount
sudo $TC -s actions ls action skbedit

#delete the filter (which should remove the bindcnt)

sudo $TC filter del dev lo parent : protocol ip prio 1 \
u32 match ip dst 127.0.0.1/32 flowid 1:1

#check the refcount and bindcount
sudo $TC -s actions ls action skbedit

Current behavior: I believe the action is gone in this last step.
Your patches may change behavior so that the action is still
around. I don't think this is a big deal, but just wanted to be sure
it is not something more unexpected.

cheers,
jamal


[PATCH net-next v4 00/11] Modify action API for implementing lockless actions

2018-05-31 Thread Vlad Buslov
Currently, all netlink protocol handlers for updating rules, actions and
qdiscs are protected with a single global rtnl lock, which removes any
possibility for parallelism. This patch set is a first step to remove
the rtnl lock dependency from the TC rules update path.

Recently, a new rtnl registration flag, RTNL_FLAG_DOIT_UNLOCKED, was added.
Handlers registered with this flag are called without RTNL taken. The end
goal is to have rule update handlers (RTM_NEWTFILTER, RTM_DELTFILTER,
etc.) registered with the UNLOCKED flag to allow parallel execution.
However, there is no intention to completely remove or split the rtnl lock
itself. This patch set addresses specific problems in the action API that
prevent it from being executed concurrently. This patch set does not
completely unlock the rules or actions update path. Additional patch sets
are required to refactor individual actions and filters update for
parallel execution.

As preparation for executing TC rules update handlers without the rtnl
lock, the action API code was audited to determine areas that assume
external synchronization with the rtnl lock and must be changed to allow
safe concurrent access, with the following results:

1. Action idr is already protected with a spinlock. However, some code
   paths assume that idr state does not change between several
   consecutive tcf_idr_* function calls.
2. tc_action reference and bind counters are implemented as plain
   integers. Their purpose was to allow single actions to be shared
   between multiple filters, not to provide means for concurrent
   modification.
3. tc_action 'cookie' pointer field is not protected against
   modification.
4. Action API functions that work with a set of actions use an intrusive
   linked list, which cannot be used concurrently without additional
   synchronization.
5. Action API functions don't take a reference to actions while using
   them, assuming external synchronization with the rtnl lock.

The following solutions to these problems are implemented:

1. To remove the assumption that idr state doesn't change between tcf_idr_*
   calls, implement new functions that atomically perform several
   operations on idr without releasing the idr spinlock (a function to
   atomically look up and delete an action by index, a function to
   atomically check if an action exists and allocate a new one if
   necessary, etc.).
2. Use atomic operations on counters to make them suitable for
   concurrent get/put operations.
3. Data that 'cookie' points to is never modified, so it is enough to
   refactor it to an rcu pointer to prevent concurrent de-allocation.
4. Action API doesn't actually use any linked list specific operations
   on the actions intrusive linked list, so it can be refactored to an
   array in a straightforward manner.
5. Always take a reference to an action while accessing it in action API.
   The tcf_idr_search function is modified to take a reference to the
   action before returning it, so there is no way to look up an action
   without incrementing its reference counter. All users of this function
   are modified to release the reference after they are done using the
   action. With all users using reference counting, it is now safe to
   concurrently delete actions.
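
A minimal user-space sketch of points 1, 2 and 5 above. This is an
illustration only, not the kernel code: a pthread mutex stands in for the
idr spinlock, a fixed array stands in for the idr, the bind counter is
left out, and the names action_search, action_check_alloc,
action_delete_index and TABLE_SIZE are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

#define TABLE_SIZE 16   /* hypothetical fixed-size table instead of an idr */

struct action {
	unsigned int index;
	atomic_int refcnt;  /* atomic, so get/put need no external lock */
};

static struct action *table[TABLE_SIZE];
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Drop one reference; free the action when the last one goes away. */
static void action_put(struct action *a)
{
	if (atomic_fetch_sub(&a->refcnt, 1) == 1)
		free(a);
}

/* Look up by index and take a reference before the lock is released,
 * so a caller never sees an action it does not hold a reference to. */
static struct action *action_search(unsigned int index)
{
	struct action *a;

	assert(index < TABLE_SIZE);
	pthread_mutex_lock(&table_lock);
	a = table[index];
	if (a)
		atomic_fetch_add(&a->refcnt, 1);
	pthread_mutex_unlock(&table_lock);
	return a;
}

/* Check for an existing action and allocate a new one if needed, all in
 * one critical section. Returns 1 if created, 0 if it already existed,
 * -1 on allocation failure. On success *out carries a caller reference. */
static int action_check_alloc(unsigned int index, struct action **out)
{
	struct action *a;
	int created = 0;

	assert(index < TABLE_SIZE);
	pthread_mutex_lock(&table_lock);
	a = table[index];
	if (!a) {
		a = malloc(sizeof(*a));
		if (!a) {
			pthread_mutex_unlock(&table_lock);
			return -1;
		}
		a->index = index;
		atomic_init(&a->refcnt, 1);  /* the table's own reference */
		table[index] = a;
		created = 1;
	}
	atomic_fetch_add(&a->refcnt, 1);     /* the caller's reference */
	pthread_mutex_unlock(&table_lock);
	*out = a;
	return created;
}

/* Look up and unlink in one critical section, then drop the table's
 * reference. Returns 0 on success, -1 if there is no such action. */
static int action_delete_index(unsigned int index)
{
	struct action *a;

	assert(index < TABLE_SIZE);
	pthread_mutex_lock(&table_lock);
	a = table[index];
	table[index] = NULL;
	pthread_mutex_unlock(&table_lock);
	if (!a)
		return -1;
	action_put(a);
	return 0;
}
```

In this sketch a concurrent delete only unlinks the action and drops the
table's reference; the memory stays valid until every user that obtained
the action through a lookup has called action_put.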

Additionally, the actions init function signature was expanded with an
'rtnl_held' argument that allows actions that have an internal dependency
on the rtnl lock to take/release it when necessary.
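
A rough user-space sketch of how such a flag can be used, under the
assumption that the init function needs the lock only for part of its
work. The mutex big_lock stands in for the rtnl lock, and all names here
(big_lock_acquire, touch_protected_state, action_init) are hypothetical,
not the kernel API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER; /* rtnl stand-in */
static bool big_lock_held;      /* bookkeeping for the assertion below */
static int protected_counter;   /* state that must only change under the lock */

static void big_lock_acquire(void)
{
	pthread_mutex_lock(&big_lock);
	big_lock_held = true;
}

static void big_lock_release(void)
{
	big_lock_held = false;
	pthread_mutex_unlock(&big_lock);
}

/* Work with an internal dependency on the big lock. */
static void touch_protected_state(void)
{
	assert(big_lock_held);
	protected_counter++;
}

/* The caller reports whether it already holds the lock. The function
 * takes the lock only if the caller does not hold it, and leaves the
 * lock state exactly as it found it. */
static int action_init(bool lock_held)
{
	if (!lock_held)
		big_lock_acquire();

	touch_protected_state();

	if (!lock_held)
		big_lock_release();
	return 0;
}
```

A handler that runs without the lock would call action_init(false), while
a handler that is still called with the lock taken would pass true.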

Since the only shared state in the action API module is the actions
themselves and the action idr, these changes are sufficient to not rely
on the global rtnl lock for protection of internal action API data
structures.

Changes from V3 to V4:
- Expand cover letter.
- Reduce actions array size in tcf_action_init_1.
- Rebase on latest net-next.

Changes from V2 to V3:
- Re-send with changelog copied to individual patches.

Changes from V1 to V2:
- Removed redundant actions ops lookup during delete.
- Merge action ops delete definition and implementation.
- Assume all actions have delete implemented and don't check for it
  explicitly.
- Resplit action lookup/release code to prevent memory leaks in
  individual patches.
- Make __tcf_idr_check function static.
- Remove unique idr insertion function. Change original idr insert to do
  the same thing.
- Merge changes that take reference to action when performing lookup and
  changes that account for this additional reference when dumping action
  to user space into single patch.
- Change convoluted commit message.
- Rename "unlocked" to "rtnl_held" for clarity.
- Remove estimator lock add patch.
- Refactor action check-alloc code into standalone function.
- Rename tcf_idr_find_delete to tcf_idr_delete_index.
- Rearrange variable definitions in tc_action_delete.
- Add patch that refactors action API code to use array of pointers to
  actions instead of intrusive linked list.
- Expand cover letter.

Vlad Buslov (11):
  net: sched: use rcu for action cookie update
  net: sched: change type of reference and bind counters
  net: sched: implement unlocked action init API
  net: sched: always take reference to act