** Description changed:

  Please, follow this in: http://people.canonical.com/~inaddy/lp1328088/.
  The same description is kept there, updated on a daily basis.

  --

- It was brought to my attention that "fake router creation" scalability
- was affected during kernel development.
+ It was brought to my attention that network namespace creation
+ scalability was affected during kernel development.

  The following scripts were used for all the tests and chart generation:

  http://people.canonical.com/~inaddy/lp1328088/make_fake_routers.sh
  http://people.canonical.com/~inaddy/lp1328088/parse.py

  I measured how many "fake routers" (above script) could be added per
  second from 0 to the 4000 created routers mark. Using this script and a
  git bisect on the kernel tree I was led to one specific commit causing
- regression: #911af505 "rcu: Provide compile-time control for no-CBs
- CPUs".
+ regression: #911af50 "rcu: Provide compile-time control for no-CBs
+ CPUs". Even though this change was experimental at that point, it
+ introduced a performance scalability regression (explained below) that
+ still lasts, and it seems to be the default option for distributions
+ nowadays.

- It appeared that rcu, rcu callbacks and no-cb cpus were causing the
- issue so every commit that changed any of this files: "kernel/rcutree.c
- kernel/rcutree.h kernel/rcutree_plugin.h include/trace/events/rcu.h
- include/linux/rcupdate.h" was tested. The idea was to check performance
- regression during rcu development. In the worst case I would have data
- for performance regression during kernel development (since we have rcu
- commits from 3.8 to 3.14).
+ RCU-related code appeared to be responsible for the problem. With
+ that, every commit from tag v3.8..master that changed any of these
+ files: "kernel/rcutree.c kernel/rcutree.h kernel/rcutree_plugin.h
+ include/trace/events/rcu.h include/linux/rcupdate.h" was tested. The
+ idea was to check performance regression during rcu development.
+ In the worst case, if the regression were not related to rcu, I would
+ still have data to interpret the performance/scalability regression.

  All text below refers to 2 groups of charts, generated during the
  study:

  1) Kernel git tags from 3.8 to 3.14.
  http://people.canonical.com/~inaddy/lp1328088/charts/250-tag.html

  2) Kernel git commits for rcu development (111 commits).
  http://people.canonical.com/~inaddy/lp1328088/charts/250.html

  Since results differed depending on how many cpus were used and on how
  the no-cb cpus were configured, 3 kernel config options were used on
  every measurement:

- - CONFIG_RCU_NOCB_CPU (disabled: nocbno)
- - CONFIG_RCU_NOCB_CPU_ALL (enabled: nocball)
- - CONFIG_RCU_NOCB_CPU_NONE (enabled: nocbnone)
+ - CONFIG_RCU_NOCB_CPU (disabled): nocbno
+ - CONFIG_RCU_NOCB_CPU_ALL (enabled): nocball
+ - CONFIG_RCU_NOCB_CPU_NONE (enabled): nocbnone

- After charts generation and study it was clear that NOCB_CPU_ALL (4
- cpus) affected the "fake routers" creation process performance and this
+ Obs: For 1 cpu cases, nocbno, nocbnone and nocball behave the same,
+ since with only 1 cpu there is no no-cb cpu.
+
+ After the charts were generated it was clear that NOCB_CPU_ALL (4 cpus)
+ affected the "fake routers" creation process performance and this
  regression continues up to the upstream version. It was also clear that,
- after this commit, there is no scalability executing this test with more
- than 1 cpu.
+ after commit #911af50, having more than 1 cpu does not improve
+ performance/scalability for netns; it makes it worse.
+
+ #911af50
+ ...
+ +#ifdef CONFIG_RCU_NOCB_CPU_ALL
+ +	pr_info("\tExperimental no-CBs for all CPUs\n");
+ +	cpumask_setall(rcu_nocb_mask);
+ +#endif /* #ifdef CONFIG_RCU_NOCB_CPU_ALL */
+ ...
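To make the three labels used in the measurements easier to reproduce, here is a minimal Python sketch that derives the label (nocbno, nocball, nocbnone) from the text of a kernel .config file. The function name `nocb_label` and the "other" fallback are assumptions made for illustration; this is not part of the scripts linked in this report.

```python
def nocb_label(config_text):
    """Map kernel .config text to the labels used in this report.

    Hypothetical helper: it only looks at the CONFIG_RCU_NOCB_CPU* options.
    """
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Enabled options look like "CONFIG_FOO=y"; disabled ones appear as
        # "# CONFIG_FOO is not set" or are absent altogether.
        if line.startswith("CONFIG_RCU_NOCB_CPU") and line.endswith("=y"):
            enabled.add(line[:-2])  # drop the trailing "=y"
    if "CONFIG_RCU_NOCB_CPU" not in enabled:
        return "nocbno"
    if "CONFIG_RCU_NOCB_CPU_ALL" in enabled:
        return "nocball"
    if "CONFIG_RCU_NOCB_CPU_NONE" in enabled:
        return "nocbnone"
    return "other"  # any other no-CBs choice, not measured in this report
```

The function takes the config text as a string, so it can be fed from wherever the running kernel's configuration is stored on a given system (that location varies between distributions and is not assumed here).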
  Comparing the points that stand out (see charts):

  #81e5949 - good
  #911af50 - bad
- #6faf728 - not good enough

- I was able to see that from the script above the following lines were
- affected:
+ I was able to see that, from the script above, the following lines
+ have a major impact on netns scalability/performance:

- 1) ip netns add -> huge performance regression
+ 1) ip netns add -> huge performance regression:
+
+    1 cpu: no regression
+    4 cpu: regression for NOCB_CPU_ALL
+
+    obs: regression from 250 netns/sec to 50 netns/sec
+         at the 500 netns already created mark
+
  2) ip netns exec -> some performance regression
+
+    1 cpu: no regression
+    4 cpu: regression for NOCB_CPU_ALL
+
+    obs: regression from 40 netns/sec (+1 exec per netns
+         creation) to 20 netns/sec at the 500 netns created
+         mark

- #
- # Assumption
- #
+ # Assumption (to be confirmed)

  rcu callbacks being offloaded to other cpus caused regression in
- unshare(CLONE_NEWNET) code.
-
- # Specific kernel entry being investigated:
-
- unshare(CLONE_NEWNET)
+ copy_net_ns<-create_new_namespaces or unshare(CLONE_NEWNET).
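The "netns per second at the N created mark" figures quoted above can be approximated with a small Python sketch like the one below. It is only an illustration of the measurement idea and is not the make_fake_routers.sh or parse.py scripts linked in this report. The function names, the `prefix` argument and the default window of 250 are assumptions. `time_netns_creation` calls `ip netns add`, so it needs root and the iproute2 `ip` tool.

```python
import subprocess
import time


def creation_rates(marks, window=250):
    """Return (created_count, netns_per_second) for each full window.

    marks[k] is the time at which k namespaces existed, so marks[0] is
    the start time and len(marks) - 1 is the total number created.
    """
    rates = []
    for end in range(window, len(marks), window):
        elapsed = marks[end] - marks[end - window]
        rate = window / elapsed if elapsed > 0 else float("inf")
        rates.append((end, rate))
    return rates


def time_netns_creation(total, prefix="fakerouter"):
    """Create `total` network namespaces and record a timestamp after each.

    Hypothetical helper: requires root. Namespaces are left in place and
    can be removed afterwards with `ip netns delete <name>`.
    """
    marks = [time.monotonic()]
    for i in range(total):
        subprocess.run(["ip", "netns", "add", f"{prefix}{i}"], check=True)
        marks.append(time.monotonic())
    return marks
```

As root, something like `creation_rates(time_netns_creation(4000))` would give one rate per 250 created namespaces, which can then be compared across kernels and across the nocbno, nocball and nocbnone configurations.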
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1328088

Title:
  Kernel network namespace performance regression during rcu development
  on kernels above 3.8

To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/1328088/+subscriptions

--
ubuntu-bugs mailing list
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
