** Description changed:

  Please follow this at: http://people.canonical.com/~inaddy/lp1328088/.
  It holds this same description, with the text updated on a daily basis.
  
  --
- 
- It was brought to my attention that "fake router creation" scalability
- was affected during kernel development.
+ It was brought to my attention that network namespace creation
+ scalability was affected during kernel development.
  
  The following scripts were used for all the tests and chart generation:
  
  http://people.canonical.com/~inaddy/lp1328088/make_fake_routers.sh
  http://people.canonical.com/~inaddy/lp1328088/parse.py
  
  I measured how many "fake routers" (created by the script above) could
  be added per second, from 0 up to the 4000 created routers mark. Using
  this script and a git bisect on the kernel tree, I was led to one
  specific commit causing the
- regression: #911af505 "rcu: Provide compile-time control for no-CBs
- CPUs".
+ regression: #911af50 "rcu: Provide compile-time control for no-CBs
+ CPUs". Even though this change was experimental at that point, it
+ introduced a performance/scalability regression (explained below) that
+ still persists, and it seems to be the default option for distributions
+ nowadays.
  
- It appeared that rcu, rcu callbacks and no-cb cpus were causing the
- issue so every commit that changed any of this files: "kernel/rcutree.c
- kernel/rcutree.h kernel/rcutree_plugin.h include/trace/events/rcu.h
- include/linux/rcupdate.h" was tested. The idea was to check performance
- regression during rcu development. In the worst case I would have data
- for performance regression during kernel development (since we have rcu
- commits from 3.8 to 3.14).
+ RCU-related code appeared to be responsible for the problem. Because of
+ that, every commit in the range v3.8..master that changed any of these
+ files: "kernel/rcutree.c kernel/rcutree.h kernel/rcutree_plugin.h
+ include/trace/events/rcu.h include/linux/rcupdate.h" was tested. The
+ idea was to check for performance regressions during rcu development.
+ In the worst case, with the regression not being related to rcu, I would
+ still have data to interpret the performance/scalability regression.
  
  All the text below refers to 2 groups of charts, generated during the
  study:
  
  1) Kernel git tags from 3.8 to 3.14.
  http://people.canonical.com/~inaddy/lp1328088/charts/250-tag.html
  
  2) Kernel git commits for rcu development (111 commits).
  http://people.canonical.com/~inaddy/lp1328088/charts/250.html
  
  Since the results differed depending on how many cpus were used and how
  the no-cb cpus were configured, 3 kernel config options were used on
  every measurement:
  
- - CONFIG_RCU_NOCB_CPU (disabled: nocbno)
- - CONFIG_RCU_NOCB_CPU_ALL (enabled: nocball)
- - CONFIG_RCU_NOCB_CPU_NONE (enabled: nocbnone)
+ - CONFIG_RCU_NOCB_CPU (disabled): nocbno
+ - CONFIG_RCU_NOCB_CPU_ALL (enabled): nocball
+ - CONFIG_RCU_NOCB_CPU_NONE (enabled): nocbnone
  
- After charts generation and study it was clear that NOCB_CPU_ALL (4
- cpus) affected the "fake routers" creation process performance and this
+ Obs: for the 1 cpu cases, nocbno, nocbnone and nocball behave the same,
+ since with only 1 cpu there is no no-cb cpu.
+ 
+ After the charts were generated it was clear that NOCB_CPU_ALL (4 cpus)
+ affected the "fake routers" creation process performance, and this
  regression persists up to the upstream version. It was also clear that,
- after this commit, there is no scalability executing this test with more
- than 1 cpu.
+ after commit #911af50, having more than 1 cpu does not improve
+ performance/scalability for netns; it makes it worse.
+ 
+ #911af50
+ ...
+ +#ifdef CONFIG_RCU_NOCB_CPU_ALL
+ +   pr_info("\tExperimental no-CBs for all CPUs\n");
+ +   cpumask_setall(rcu_nocb_mask);
+ +#endif /* #ifdef CONFIG_RCU_NOCB_CPU_ALL */
+ ...
  
  Comparing the points that stand out (see charts):
  
  #81e5949 - good
  #911af50 - bad
- #6faf728 - not good enough
  
- I was able to see that from the script above the following lines were
- affected:
+ I was able to see that, from the script above, the following lines
+ cause a major impact on netns scalability/performance:
  
- 1) ip netns add -> huge performance regression
+ 1) ip netns add -> huge performance regression:
+     1 cpu: no regression
+     4 cpu: regression for NOCB_CPU_ALL
+     obs: regression from 250 netns/sec to 50 netns/sec
+          at the 500 already-created netns mark
+ 
  2) ip netns exec -> some performance regression
+     1 cpu: no regression
+     4 cpu: regression for NOCB_CPU_ALL
+     obs: regression from 40 netns/sec (+1 exec per netns
+          creation) to 20 netns/sec at the 500 created
+          netns mark
  
- #
- # Assumption
- #
+ # Assumption (to be confirmed)
  
  rcu callbacks being offloaded to other cpus caused a regression in
- unshare(CLONE_NEWNET) code.
- 
- # Specific kernel entry being investigated:
- 
- unshare(CLONE_NEWNET)
+ copy_net_ns <- create_new_namespaces or unshare(CLONE_NEWNET).
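
For anyone wanting to reproduce the "netns created per second" numbers
without the linked scripts, the measurement could be sketched roughly as
below. This is a minimal sketch and not the make_fake_routers.sh or
parse.py used for the charts: the function names, the "fake-N" namespace
names and the step size are assumptions made for illustration, and it
only times "ip netns add", not the full fake router setup.

```python
import subprocess
import time


def create_netns(name):
    # Assumption: iproute2 is installed and the caller is root.
    subprocess.run(["ip", "netns", "add", name], check=True)


def measure_rates(total, step, create=create_netns, clock=time.monotonic):
    """Return [(created_so_far, netns_per_second), ...], one entry per step."""
    rates = []
    window_start = clock()
    for i in range(1, total + 1):
        # "fake-N" is a hypothetical naming scheme, not the original script's.
        create("fake-%d" % i)
        if i % step == 0:
            now = clock()
            elapsed = now - window_start
            # Rate for the last `step` creations only, not a running average.
            rates.append((i, step / elapsed if elapsed > 0 else float("inf")))
            window_start = now
    return rates
```

Run as root, something like measure_rates(4000, 250) would give one rate
per 250 namespaces up to the 4000 mark. The rate is computed per window
so that a slowdown as namespaces accumulate stays visible instead of
being averaged away.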

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1328088

Title:
  Kernel network namespace performance regression during rcu development
  on kernels above 3.8
