[Kernel-packages] [Bug 1328088] Re: Kernel network namespace performance regression during rcu development on kernels above 3.8

2017-06-13 Thread Don Bowman
I wonder if this fix fell out or is somehow different now?

The script below, on 4.11-rc7:

1cpu: 0m1.379s
12cpu: 1m36.556s
72cpu: 2m20.118s

This is a *huge* impact for neutron L3 agent on my OpenStack system.

# cd /tmp
# ip netns add foo
# ip netns add bar
# for i in `seq 0 1000` ; do echo -e 'netns exec foo echo\nnetns exec bar echo' >> ipnetns.batch ; done
# time ip -b ipnetns.batch > /dev/null
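If the generation loop itself is slow on a given box, the same 2002-line batch file can be produced in a single pass (a sketch equivalent to the loop above, assuming awk is available):

```shell
# Generate the same 2002-line batch file as the loop above in one awk
# pass: 1001 iterations, each emitting one "foo" and one "bar" line.
awk 'BEGIN { for (i = 0; i <= 1000; i++)
               print "netns exec foo echo\nnetns exec bar echo" }' > ipnetns.batch
# Sanity-check the line count before feeding it to `ip -b`:
wc -l < ipnetns.batch   # prints the line count (2002)
```

As with the original reproducer, the `time ip -b ipnetns.batch` step still needs root to actually enter the namespaces.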

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1328088

Title:
  Kernel network namespace performance regression during rcu development
  on kernels above 3.8

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Trusty:
  Fix Released
Status in linux source package in Utopic:
  Fix Released

Bug description:
  SRU Justification:

  Impact: network namespace creation has a performance regression since v3.5.
  Fix: my analysis, lkml discussion, upstream patch
  Testcase: 
   
   http://people.canonical.com/~inaddy/lp1328088/make_fake_routers.sh
   http://people.canonical.com/~inaddy/lp1328088/parse.py
   http://people.canonical.com/~inaddy/lp1328088/charts/250.html
   http://people.canonical.com/~inaddy/lp1328088/charts/250-tag.html

   Running make_fake_routers.sh 4000 and using parse.py, you can check
   whether "fake routers" are being created at a good rate per second
   (and compare against all the generated charts).
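For reference, the kind of per-interval rate check parse.py performs can be sketched as follows. The actual input format of parse.py is not shown in this bug, so this assumes two columns per sample: elapsed seconds and total routers created so far.

```shell
# creation_rate: read "elapsed_seconds total_created" samples on stdin
# and print the average creation rate (routers/sec) between each pair
# of consecutive samples.
creation_rate() {
  awk 'NR > 1 && $1 > t { printf "%.1f\n", ($2 - n) / ($1 - t) }
       { t = $1; n = $2 }'
}

# Example: 250 routers in the first second, then a slowdown to 50/sec
# (the shape of the regression described below):
printf '0 0\n1 250\n3 350\n' | creation_rate   # prints 250.0, then 50.0
```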

  

  Original Description:

  Please follow this at
  http://people.canonical.com/~inaddy/lp1328088/, where the same
  description is kept as a daily-updated text.

  --
  It was brought to my attention that network namespace creation
  scalability was affected during kernel development.

  The following scripts were used for all tests and chart generation:

  http://people.canonical.com/~inaddy/lp1328088/make_fake_routers.sh
  http://people.canonical.com/~inaddy/lp1328088/parse.py

  I measured how many "fake routers" (created by the script above) could
  be added per second, from 0 up to the 4000-routers-created mark. Using
  this script and a git bisect on the kernel tree, I was led to one
  specific commit causing the regression: #911af50 "rcu: Provide
  compile-time control for no-CBs CPUs". Even though this change was
  experimental at that point, it introduced a performance/scalability
  regression (explained below) that still lasts, and the option seems to
  be enabled by default in distributions nowadays.

  RCU-related code appeared to be responsible for the problem. With that
  in mind, every commit from v3.8..master that changed any of these
  files: "kernel/rcutree.c kernel/rcutree.h kernel/rcutree_plugin.h
  include/trace/events/rcu.h include/linux/rcupdate.h" was tested. The
  idea was to check for performance regressions during RCU development.
  In the worst case, if the regression turned out not to be related to
  RCU, I would still have data to interpret the performance/scalability
  regression.

  All text below refers to 2 groups of charts generated during the
  study:

  1) Kernel git tags from 3.8 to 3.14.
  http://people.canonical.com/~inaddy/lp1328088/charts/250-tag.html

  2) Kernel git commits for rcu development (111 commits).
  http://people.canonical.com/~inaddy/lp1328088/charts/250.html

  Since results differed depending on how many CPUs there were and how
  the no-CBs CPUs were configured, 3 kernel config options were used for
  every measurement:

  - CONFIG_RCU_NOCB_CPU (disabled): nocbno
  - CONFIG_RCU_NOCB_CPU_ALL (enabled): nocball
  - CONFIG_RCU_NOCB_CPU_NONE (enabled): nocbnone

  Obs: For the 1-CPU cases, nocbno, nocbnone and nocball behave the
  same, since with only 1 CPU there is no no-CBs CPU.
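Which of these three options a given kernel was built with can be checked from the shipped config (a sketch; it assumes the distro exposes the config under /boot, or via /proc/config.gz):

```shell
# show_rcu_nocb: print the RCU no-CBs options from a kernel config file.
# Defaults to the running kernel's config if no path is given.
show_rcu_nocb() {
  cfg="${1:-/boot/config-$(uname -r)}"
  if [ -r "$cfg" ]; then
    grep 'CONFIG_RCU_NOCB_CPU' "$cfg" || echo "no RCU_NOCB options set"
  elif [ -r /proc/config.gz ]; then
    zcat /proc/config.gz | grep 'CONFIG_RCU_NOCB_CPU' \
      || echo "no RCU_NOCB options set"
  else
    echo "kernel config not readable: $cfg"
  fi
}
show_rcu_nocb
```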

  After the charts were generated, it was clear that NOCB_CPU_ALL (4
  CPUs) hurt the "fake routers" creation performance and that this
  regression persists up to the current upstream version. It was also
  clear that, after commit #911af50, having more than 1 CPU does not
  improve performance/scalability for netns; it makes it worse.

  #911af50
  ...
  +#ifdef CONFIG_RCU_NOCB_CPU_ALL
  +   pr_info("\tExperimental no-CBs for all CPUs\n");
  +   cpumask_setall(rcu_nocb_mask);
  +#endif /* #ifdef CONFIG_RCU_NOCB_CPU_ALL */
  ...

  Comparing standout points (see charts):

  #81e5949 - good
  #911af50 - bad

  From the script above, I was able to see that the following commands
  cause a major impact on netns scalability/performance:

  1) ip netns add -> huge performance regression:
     1 cpu: no regression
     4 cpu: regression for NOCB_CPU_ALL
     obs: rate drops from 250 netns/sec to 50 netns/sec at the
      500-netns-already-created mark

  2) ip netns exec -> some performance regression:
     1 cpu: no regression
     4 cpu: regression for NOCB_CPU_ALL
     obs: rate drops from 40 netns/sec (+1 exec per netns created) to
      20 netns/sec at the 500-netns-created mark

  # Assumption (to be confirmed)

  rcu callbacks being offloaded to other cpus caused regression in
  copy_net_ns<-created_new_namespaces or unshare(clone_newnet).

[Kernel-packages] [Bug 1328088] Re: Kernel network namespace performance regression during rcu development on kernels above 3.8

2017-03-16 Thread Dave Chiluk
@gnanasekarkas. Not that we know of.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1328088/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : 

[Kernel-packages] [Bug 1328088] Re: Kernel network namespace performance regression during rcu development on kernels above 3.8

2017-03-15 Thread Gnanasekar Velu
Do we have this bug on kernel 4.4.0-67-generic in the Trusty release?


[Kernel-packages] [Bug 1328088] Re: Kernel network namespace performance regression during rcu development on kernels above 3.8

2015-10-15 Thread Jorge Niedbalski
** Changed in: linux (Ubuntu Trusty)
 Assignee: (unassigned) => Rafael David Tinoco (inaddy)

** Changed in: linux (Ubuntu Utopic)
 Assignee: (unassigned) => Rafael David Tinoco (inaddy)


[Kernel-packages] [Bug 1328088] Re: Kernel network namespace performance regression during rcu development on kernels above 3.8

2015-05-13 Thread Rafael David Tinoco
** Changed in: linux (Ubuntu)
 Assignee: (unassigned) => Rafael David Tinoco (inaddy)


[Kernel-packages] [Bug 1328088] Re: Kernel network namespace performance regression during rcu development on kernels above 3.8

2015-05-01 Thread Rafael David Tinoco
** Changed in: linux
 Assignee: Rafael David Tinoco (inaddy) => (unassigned)

** Changed in: linux (Ubuntu Trusty)
 Assignee: Rafael David Tinoco (inaddy) => (unassigned)

** No longer affects: linux


[Kernel-packages] [Bug 1328088] Re: Kernel network namespace performance regression during rcu development on kernels above 3.8

2014-10-10 Thread Jorge Niedbalski
** Tags added: cts

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1328088

Title:
  Kernel network namespace performance regression during rcu development
  on kernels above 3.8

Status in The Linux Kernel:
  In Progress
Status in “linux” package in Ubuntu:
  Fix Released
Status in “linux” source package in Trusty:
  Fix Released
Status in “linux” source package in Utopic:
  Fix Released

Bug description:
  SRU Justification:

  Impact: network namespace creation has performance regression since v3.5.
  Fix: my analysis, lklm discussion, upstream patch
  Testcase: 
   
   http://people.canonical.com/~inaddy/lp1328088/make_fake_routers.sh
   http://people.canonical.com/~inaddy/lp1328088/parse.py
   http://people.canonical.com/~inaddy/lp1328088/charts/250.html
   http://people.canonical.com/~inaddy/lp1328088/charts/250-tag.html

   Running make_fake_routers.sh 4000 and using parse.py you can check if 
   fake routers are being created in a good rate /sec (and you can
   compare with all generated charts). 

  

  Original Description:

  Please, follow this in:
  http://people.canonical.com/~inaddy/lp1328088/. Same description on
  daily-basis updated text.

  --
  It was brought to my attention that network namespace creation scalability 
was affected during kernel development.

  The following script was used for all the tests and charts generation:

  http://people.canonical.com/~inaddy/lp1328088/make_fake_routers.sh
  http://people.canonical.com/~inaddy/lp1328088/parse.py

  I measured how many fake routers (using the script above) could be
  added per second, from 0 up to the 4000-created-routers mark. Using
  this script and a git bisect on the kernel tree, I was led to one
  specific commit causing the regression: 911af50 ("rcu: Provide
  compile-time control for no-CBs CPUs"). Even though this change was
  experimental at that point, it introduced a performance/scalability
  regression (explained below) that still lasts, and the option seems to
  be the default for distributions nowadays.

  RCU-related code looked to be responsible for the problem. With that,
  every commit from tag v3.8..master that changed any of these files was
  tested: kernel/rcutree.c, kernel/rcutree.h, kernel/rcutree_plugin.h,
  include/trace/events/rcu.h, include/linux/rcupdate.h. The idea was to
  check for performance regressions during rcu development. In the worst
  case (the regression not being related to rcu), I would still have
  data to interpret the performance/scalability regression.

  All text below refers to 2 groups of charts, generated during the
  study:

  1) Kernel git tags from 3.8 to 3.14.
  http://people.canonical.com/~inaddy/lp1328088/charts/250-tag.html

  2) Kernel git commits for rcu development (111 commits).
  http://people.canonical.com/~inaddy/lp1328088/charts/250.html

  Since there was difference in results depending on how many cpus or
  how the no-cb cpus were configured, 3 kernel config options were used
  on every measure:

  - CONFIG_RCU_NOCB_CPU (disabled): nocbno
  - CONFIG_RCU_NOCB_CPU_ALL (enabled): nocball
  - CONFIG_RCU_NOCB_CPU_NONE (enabled): nocbnone

  Obs: For the 1-cpu cases, nocbno, nocbnone and nocball behave the
  same, since with only 1 cpu there is no no-cb cpu.
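For reference, the three setups correspond to .config fragments along these lines (option names as found in v3.8-era kernels; exact dependency lines may vary by version):

```
# nocbno: callback offloading compiled out
# CONFIG_RCU_NOCB_CPU is not set

# nocbnone: offloading compiled in, no CPUs offloaded by default
CONFIG_RCU_NOCB_CPU=y
CONFIG_RCU_NOCB_CPU_NONE=y

# nocball: all CPUs have their RCU callbacks offloaded
CONFIG_RCU_NOCB_CPU=y
CONFIG_RCU_NOCB_CPU_ALL=y
```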

  After the charts were generated, it was clear that NOCB_CPU_ALL (4
  cpus) hurt the fake-router creation performance, and this regression
  continues up to the upstream version. It was also clear that, after
  commit 911af50, having more than 1 cpu does not improve netns
  performance/scalability; it makes it worse.

  #911af50
  ...
  +#ifdef CONFIG_RCU_NOCB_CPU_ALL
  +   pr_info("\tExperimental no-CBs for all CPUs\n");
  +   cpumask_setall(rcu_nocb_mask);
  +#endif /* #ifdef CONFIG_RCU_NOCB_CPU_ALL */
  ...

  Comparing standing out points (see charts):

  #81e5949 - good
  #911af50 - bad

  I was able to see, from the script above, that the following lines
  cause a major impact on netns scalability/performance:

  1) ip netns add - huge performance regression:
  1 cpu: no regression
  4 cpu: regression for NOCB_CPU_ALL
  obs: regression from 250 netns/sec to 50 netns/sec
   at the 500-netns-already-created mark

  2) ip netns exec - some performance regression:
  1 cpu: no regression
  4 cpu: regression for NOCB_CPU_ALL
  obs: regression from 40 netns/sec (+1 exec per netns
   creation) to 20 netns/sec at the 500-netns-created
   mark
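A minimal way to reproduce the "ip netns add" measurement is to generate an ip(8) batch file and time its execution; only the final step needs root, so it is left commented out here. (The fake-router script does more work per namespace; this is a reduced sketch.)

```shell
# Generate a batch file that creates 500 network namespaces.
# Only generation happens here; executing it needs CAP_SYS_ADMIN.
for i in $(seq 1 500); do
    echo "netns add fake-$i"
done > /tmp/netns.batch

wc -l < /tmp/netns.batch        # 500 batch commands
# time ip -b /tmp/netns.batch   # run as root to measure creation rate
```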

  # Assumption (to be confirmed)

  rcu callbacks being offloaded to other cpus caused the regression in
  copy_net_ns() -> create_new_namespaces() or unshare(CLONE_NEWNET).

To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/1328088/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : 

[Kernel-packages] [Bug 1328088] Re: Kernel network namespace performance regression during rcu development on kernels above 3.8

2014-09-22 Thread Launchpad Bug Tracker
This bug was fixed in the package linux - 3.13.0-36.63

---
linux (3.13.0-36.63) trusty; urgency=low

  [ Joseph Salisbury ]

  * Release Tracking Bug
- LP: #1365052

  [ Feng Kan ]

  * SAUCE: (no-up) irqchip:gic: change access of gicc_ctrl register to read
modify write.
- LP: #1357527
  * SAUCE: (no-up) arm64: optimized copy_to_user and copy_from_user
assembly code
- LP: #1358949

  [ Ming Lei ]

  * SAUCE: (no-up) Drop APM X-Gene SoC Ethernet driver
- LP: #1360140
  * [Config] Drop XGENE entries
- LP: #1360140
  * [Config] CONFIG_NET_XGENE=m for arm64
- LP: #1360140

  [ Stefan Bader ]

  * SAUCE: Add compat macro for skb_get_hash
- LP: #1358162
  * SAUCE: bcache: prevent crash on changing writeback_running
- LP: #1357295

  [ Suman Tripathi ]

  * SAUCE: (no-up) arm64: Fix the csr-mask for APM X-Gene SoC AHCI SATA PHY
clock DTS node.
- LP: #1359489
  * SAUCE: (no-up) ahci_xgene: Skip the PHY and clock initialization if
already configured by the firmware.
- LP: #1359501
  * SAUCE: (no-up) ahci_xgene: Fix the link down in first attempt for the
APM X-Gene SoC AHCI SATA host controller driver.
- LP: #1359507

  [ Tuan Phan ]

  * SAUCE: (no-up) pci-xgene-msi: fixed deadlock in irq_set_affinity
- LP: #1359514

  [ Upstream Kernel Changes ]

  * iwlwifi: mvm: Add a missed beacons threshold
- LP: #1349572
  * mac80211: reset probe_send_count also in HW_CONNECTION_MONITOR case
- LP: #1349572
  * genirq: Add an accessor for IRQ_PER_CPU flag
- LP: #1357527
  * arm64: perf: add support for percpu pmu interrupt
- LP: #1357527
  * cifs: sanity check length of data to send before sending
- LP: #1283101
  * KVM: nVMX: Pass vmexit parameters to nested_vmx_vmexit
- LP: #1329434
  * KVM: nVMX: Rework interception of IRQs and NMIs
- LP: #1329434
  * KVM: vmx: disable APIC virtualization in nested guests
- LP: #1329434
  * HID: Add transport-driver functions to the USB HID interface.
- LP: #1353021
  * ahci_xgene: Removing NCQ support from the APM X-Gene SoC AHCI SATA Host
Controller driver.
- LP: #1358498
  * fold d_kill() and d_free()
- LP: #1354234
  * fold try_prune_one_dentry()
- LP: #1354234
  * new helper: dentry_free()
- LP: #1354234
  * expand the call of dentry_lru_del() in dentry_kill()
- LP: #1354234
  * dentry_kill(): don't try to remove from shrink list
- LP: #1354234
  * don't remove from shrink list in select_collect()
- LP: #1354234
  * more graceful recovery in umount_collect()
- LP: #1354234
  * dcache: don't need rcu in shrink_dentry_list()
- LP: #1354234
  * lift the already marked killed case into shrink_dentry_list()
  * split dentry_kill()
- LP: #1354234
  * expand dentry_kill(dentry, 0) in shrink_dentry_list()
- LP: #1354234
  * shrink_dentry_list(): take parent's -d_lock earlier
- LP: #1354234
  * dealing with the rest of shrink_dentry_list() livelock
- LP: #1354234
  * dentry_kill() doesn't need the second argument now
- LP: #1354234
  * dcache: add missing lockdep annotation
- LP: #1354234
  * fs: convert use of typedef ctl_table to struct ctl_table
- LP: #1354234
  * lock_parent: don't step on stale -d_parent of all-but-freed one
- LP: #1354234
  * tools/testing/selftests/ptrace/peeksiginfo.c: add PAGE_SIZE definition
- LP: #1358855
  * x86, irq, pic: Probe for legacy PIC and set legacy_pic appropriately
- LP: #1317697
  * bnx2x: Fix kernel crash and data miscompare after EEH recovery
- LP: #1353105
  * bnx2x: Adapter not recovery from EEH error injection
- LP: #1353105
  * Fix: module signature vs tracepoints: add new TAINT_UNSIGNED_MODULE
- LP: #1359670
  * bcache: fix crash on shutdown in passthrough mode
- LP: #1357295
  * bcache: fix uninterruptible sleep in writeback thread
- LP: #1357295
  * namespaces: Use task_lock and not rcu to protect nsproxy
- LP: #1328088
  * MAINTAINERS: Add entry for APM X-Gene SoC ethernet driver
- LP: #1360140
  * Documentation: dts: Add bindings for APM X-Gene SoC ethernet driver
- LP: #1360140
  * dts: Add bindings for APM X-Gene SoC ethernet driver
- LP: #1360140
  * drivers: net: Add APM X-Gene SoC ethernet driver support.
- LP: #1360140
  * powerpc/mm: Add new set flag argument to pte/pmd update function
- LP: #1357014
  * powerpc/thp: Add write barrier after updating the valid bit
- LP: #1357014
  * powerpc/thp: Don't recompute vsid and ssize in loop on invalidate
- LP: #1357014
  * powerpc/thp: Invalidate old 64K based hash page mapping before insert
of 4k pte
- LP: #1357014
  * powerpc/thp: Handle combo pages in invalidate
- LP: #1357014
  * powerpc/thp: Invalidate with vpn in loop
- LP: #1357014
  * powerpc/thp: Use ACCESS_ONCE when loading pmdp
- LP: #1357014
  * powerpc/mm: Use read barrier when creating real_pte
- LP: #1357014
  * powerpc/thp: Add tracepoints to track hugepage 

[Kernel-packages] [Bug 1328088] Re: Kernel network namespace performance regression during rcu development on kernels above 3.8

2014-09-22 Thread Launchpad Bug Tracker
** Branch linked: lp:ubuntu/precise-security/linux-lts-trusty

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1328088

Title:
  Kernel network namespace performance regression during rcu development
  on kernels above 3.8

Status in The Linux Kernel:
  In Progress
Status in “linux” package in Ubuntu:
  Fix Released
Status in “linux” source package in Trusty:
  Fix Released
Status in “linux” source package in Utopic:
  Fix Released


To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/1328088/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : 

[Kernel-packages] [Bug 1328088] Re: Kernel network namespace performance regression during rcu development on kernels above 3.8

2014-09-09 Thread Rafael David Tinoco
** Tags removed: verification-needed-trusty
** Tags added: verification-done

** Tags removed: verification-done
** Tags added: verification-done-trusty

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1328088

Title:
  Kernel network namespace performance regression during rcu development
  on kernels above 3.8

Status in The Linux Kernel:
  In Progress
Status in “linux” package in Ubuntu:
  Fix Released
Status in “linux” source package in Trusty:
  Fix Committed
Status in “linux” source package in Utopic:
  Fix Released


To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/1328088/+subscriptions

-- 
Mailing list: 

[Kernel-packages] [Bug 1328088] Re: Kernel network namespace performance regression during rcu development on kernels above 3.8

2014-09-08 Thread Brad Figg
This bug is awaiting verification that the kernel in -proposed solves
the problem. Please test the kernel and update this bug with the
results. If the problem is solved, change the tag 'verification-needed-
trusty' to 'verification-done-trusty'.

If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how
to enable and use -proposed. Thank you!


** Tags added: verification-needed-trusty

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1328088

Title:
  Kernel network namespace performance regression during rcu development
  on kernels above 3.8

Status in The Linux Kernel:
  In Progress
Status in “linux” package in Ubuntu:
  Fix Released
Status in “linux” source package in Trusty:
  Fix Committed
Status in “linux” source package in Utopic:
  Fix Released


[Kernel-packages] [Bug 1328088] Re: Kernel network namespace performance regression during rcu development on kernels above 3.8

2014-09-05 Thread Launchpad Bug Tracker
** Branch linked: lp:ubuntu/precise-proposed/linux-lts-trusty

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1328088

Title:
  Kernel network namespace performance regression during rcu development
  on kernels above 3.8

Status in The Linux Kernel:
  In Progress
Status in “linux” package in Ubuntu:
  Fix Released
Status in “linux” source package in Trusty:
  Fix Committed
Status in “linux” source package in Utopic:
  Fix Released


To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/1328088/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : 

[Kernel-packages] [Bug 1328088] Re: Kernel network namespace performance regression during rcu development on kernels above 3.8

2014-09-04 Thread Launchpad Bug Tracker
** Branch linked: lp:ubuntu/trusty-proposed/linux-keystone

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1328088

Title:
  Kernel network namespace performance regression during rcu development
  on kernels above 3.8

Status in The Linux Kernel:
  In Progress
Status in “linux” package in Ubuntu:
  Fix Released
Status in “linux” source package in Trusty:
  Fix Committed
Status in “linux” source package in Utopic:
  Fix Released

Bug description:
  SRU Justification:

  Impact: network namespace creation has performance regression since v3.5.
  Fix: my analysis, lklm discussion, upstream patch
  Testcase: 
   
   http://people.canonical.com/~inaddy/lp1328088/make_fake_routers.sh
   http://people.canonical.com/~inaddy/lp1328088/parse.py
   http://people.canonical.com/~inaddy/lp1328088/charts/250.html
   http://people.canonical.com/~inaddy/lp1328088/charts/250-tag.html

   Running make_fake_routers.sh 4000 and using parse.py you can check if 
   fake routers are being created in a good rate /sec (and you can
   compare with all generated charts). 

  

  Original Description:

  Please, follow this in:
  http://people.canonical.com/~inaddy/lp1328088/. Same description on
  daily-basis updated text.

  --
  It was brought to my attention that network namespace creation scalability 
was affected during kernel development.

  The following script was used for all the tests and charts generation:

  http://people.canonical.com/~inaddy/lp1328088/make_fake_routers.sh
  http://people.canonical.com/~inaddy/lp1328088/parse.py

  I measured how many fake routers (above script) could be added per
  second from 0 to 4000 created routers mark. Using this script and a
  git bisect on kernel tree I was led to one specific commit causing
  regression: #911af50 rcu: Provide compile-time control for no-CBs
  CPUs. Even Though this change was experimental at that point, it
  introduced a performance scalability regression (explained below) that
  still last and seems to be the default option for distributions
  nowadays.

  RCU related code looked like to be responsible for the problem. With
  that, every commit from tag v3.8..master that changed any of this
  files: kernel/rcutree.c kernel/rcutree.h kernel/rcutree_plugin.h
  include/trace/events/rcu.h include/linux/rcupdate.h was tested. The
  idea was to check performance regression during rcu development. In
  the worst case, the regression not being related to rcu, I would still
  have data to interpret the performance/scalability regression.

  All text below this refer to 2 groups of charts, generated during the
  study:

  1) Kernel git tags from 3.8 to 3.14.
  http://people.canonical.com/~inaddy/lp1328088/charts/250-tag.html

  2) Kernel git commits for rcu development (111 commits).
  http://people.canonical.com/~inaddy/lp1328088/charts/250.html

  Since there was difference in results depending on how many cpus or
  how the no-cb cpus were configured, 3 kernel config options were used
  on every measure:

  - CONFIG_RCU_NOCB_CPU (disabled): nocbno
  - CONFIG_RCU_NOCB_CPU_ALL (enabled): nocball
  - CONFIG_RCU_NOCB_CPU_NONE (enabled): nocbnone

  Obs: For the 1-cpu cases, nocbno, nocbnone and nocball behave the
  same, since with only 1 cpu there is no no-cb cpu.
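  As a minimal sketch of how a given build maps to one of these three
  labels (the function name is made up; in practice the input would be
  /boot/config-$(uname -r) rather than an inlined fragment):

```shell
# Classify a kernel config fragment into the labels used in the charts.
classify_nocb() {
  # $1 = kernel config text
  if ! printf '%s\n' "$1" | grep -q '^CONFIG_RCU_NOCB_CPU=y$'; then
    echo nocbno      # CONFIG_RCU_NOCB_CPU disabled
  elif printf '%s\n' "$1" | grep -q '^CONFIG_RCU_NOCB_CPU_ALL=y$'; then
    echo nocball     # every CPU is a no-CBs CPU
  else
    echo nocbnone    # NOCB_CPU enabled, but no CPU offloaded by default
  fi
}

classify_nocb 'CONFIG_RCU_NOCB_CPU=y
CONFIG_RCU_NOCB_CPU_ALL=y'    # prints: nocball
```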

  After the charts were generated it was clear that NOCB_CPU_ALL (4
  cpus) affected the fake router creation performance, and that this
  regression continues up to the upstream version. It was also clear
  that, after commit #911af50, having more than 1 cpu does not improve
  performance/scalability for netns; it makes it worse.

  #911af50
  ...
  +#ifdef CONFIG_RCU_NOCB_CPU_ALL
  +   pr_info("\tExperimental no-CBs for all CPUs\n");
  +   cpumask_setall(rcu_nocb_mask);
  +#endif /* #ifdef CONFIG_RCU_NOCB_CPU_ALL */
  ...

  Comparing standing out points (see charts):

  #81e5949 - good
  #911af50 - bad

  I was able to see that, from the script above, the following lines
  cause the major impact on netns scalability/performance:

  1) ip netns add - huge performance regression:
  1 cpu: no regression
  4 cpu: regression for NOCB_CPU_ALL
  obs: regression from 250 netns/sec to 50 netns/sec
   at the 500-netns-already-created mark

  2) ip netns exec - some performance regression:
  1 cpu: no regression
  4 cpu: regression for NOCB_CPU_ALL
  obs: regression from 40 netns/sec (+1 exec per netns
   creation) to 20 netns/sec at the 500-netns-created
   mark
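  A hedged sketch of the kind of loop behind these numbers (assumed to
  mirror make_fake_routers.sh, which remains the authoritative script;
  namespace creation needs root, so that part is shown only as comments
  and only the rate arithmetic is executable here):

```shell
# The measured loop, roughly (as root):
#   start=$(date +%s)
#   for i in $(seq 1 500); do
#     ip netns add "ns$i"          # huge regression under NOCB_CPU_ALL
#     ip netns exec "ns$i" true    # some regression
#   done
#   elapsed=$(( $(date +%s) - start ))

netns_per_sec() {
  # $1 = namespaces created, $2 = elapsed seconds -> integer rate
  echo $(( $1 / $2 ))
}

netns_per_sec 500 2    # -> 250 (good kernels at the 500-netns mark)
netns_per_sec 500 10   # -> 50  (NOCB_CPU_ALL regression)
```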

  # Assumption (to be confirmed)

  rcu callbacks being offloaded to other cpus caused a regression in
  copy_net_ns (called from create_new_namespaces) or in
  unshare(CLONE_NEWNET).

To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/1328088/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : 

[Kernel-packages] [Bug 1328088] Re: Kernel network namespace performance regression during rcu development on kernels above 3.8

2014-08-28 Thread Launchpad Bug Tracker
This bug was fixed in the package linux - 3.16.0-11.16

---
linux (3.16.0-11.16) utopic; urgency=low

  [ Mauricio Faria de Oliveira ]

  * [Config] Switch kernel to vmlinuz (from vmlinux) on ppc64el
- LP: #1358920

  [ Peter Zijlstra ]

  * SAUCE: (no-up) mmu_notifier: add call_srcu and sync function for listener 
to delay call and sync
- LP: #1361300

  [ Tim Gardner ]

  * [Config] CONFIG_ZPOOL=y
- LP: #1360428
  * Release Tracking Bug
- LP: #1361308

  [ Upstream Kernel Changes ]

  * Revert net/mlx4_en: Fix bad use of dev_id
- LP: #1347012
  * net/mlx4_en: Reduce memory consumption on kdump kernel
- LP: #1347012
  * net/mlx4_en: Fix mac_hash database inconsistency
- LP: #1347012
  * net/mlx4_en: Disable blueflame using ethtool private flags
- LP: #1347012
  * net/mlx4_en: current_mac isn't updated in port up
- LP: #1347012
  * net/mlx4_core: Use low memory profile on kdump kernel
- LP: #1347012
  * Drivers: scsi: storvsc: Change the limits to reflect the values on the host
- LP: #1347169
  * Drivers: scsi: storvsc: Set cmd_per_lun to reflect value supported by the 
Host
- LP: #1347169
  * Drivers: scsi: storvsc: Filter commands based on the storage protocol 
version
- LP: #1347169
  * Drivers: scsi: storvsc: Fix a bug in handling VMBUS protocol version
- LP: #1347169
  * Drivers: scsi: storvsc: Implement a eh_timed_out handler
- LP: #1347169
  * drivers: scsi: storvsc: Set srb_flags in all cases
- LP: #1347169
  * drivers: scsi: storvsc: Correctly handle TEST_UNIT_READY failure
- LP: #1347169
  * namespaces: Use task_lock and not rcu to protect nsproxy
- LP: #1328088
  * net: xgene: Check negative return value of xgene_enet_get_ring_size()
  * mm/zbud: change zbud_alloc size type to size_t
- LP: #1360428
  * mm/zpool: implement common zpool api to zbud/zsmalloc
- LP: #1360428
  * mm/zpool: zbud/zsmalloc implement zpool
- LP: #1360428
  * mm/zpool: update zswap to use zpool
- LP: #1360428
  * ideapad-laptop: Change Lenovo Yoga 2 series rfkill handling
- LP: #1341296
  * iommu/amd: Fix for pasid initialization
- LP: #1361300
  * iommu/amd: Moving PPR fault flags macros definitions
- LP: #1361300
  * iommu/amd: Drop oprofile dependency
- LP: #1361300
  * iommu/amd: Fix typo in amd_iommu_v2 driver
- LP: #1361300
  * iommu/amd: Don't call mmu_notifer_unregister in __unbind_pasid
- LP: #1361300
  * iommu/amd: Don't free pasid_state in mn_release path
- LP: #1361300
  * iommu/amd: Get rid of __unbind_pasid
- LP: #1361300
  * iommu/amd: Drop pasid_state reference in ppr_notifer error path
- LP: #1361300
  * iommu/amd: Add pasid_state-invalid flag
- LP: #1361300
  * iommu/amd: Don't hold a reference to mm_struct
- LP: #1361300
  * iommu/amd: Don't hold a reference to task_struct
- LP: #1361300
  * iommu/amd: Don't call the inv_ctx_cb when pasid is not set up
- LP: #1361300
  * iommu/amd: Don't set pasid_state-mm to NULL in unbind_pasid
- LP: #1361300
  * iommu/amd: Remove change_pte mmu_notifier call-back
- LP: #1361300
  * iommu/amd: Fix device_state reference counting
- LP: #1361300
  * iommu/amd: Fix 2 typos in comments
- LP: #1361300
 -- Tim Gardner tim.gard...@canonical.com   Fri, 22 Aug 2014 08:45:54 -0400

** Changed in: linux (Ubuntu Utopic)
   Status: Fix Committed => Fix Released


[Kernel-packages] [Bug 1328088] Re: Kernel network namespace performance regression during rcu development on kernels above 3.8

2014-08-22 Thread Rafael David Tinoco
** Description changed:

+ SRU Justification:
+ 
+ Impact: network namespace creation has performance regression since v3.5.
+ Fix: my analysis, lklm discussion, upstream patch
+ Testcase: 
+  
+  http://people.canonical.com/~inaddy/lp1328088/make_fake_routers.sh
+  http://people.canonical.com/~inaddy/lp1328088/parse.py
+  http://people.canonical.com/~inaddy/lp1328088/charts/250.html
+  http://people.canonical.com/~inaddy/lp1328088/charts/250-tag.html
+ 
+  Running make_fake_routers.sh 4000 and using parse.py you can check if 
+  fake routers are being created in a good rate /sec (and you can
+  compare with all generated charts). 
+ 
+ 
+ 
+ Original Description:
+ 
  Please, follow this in: http://people.canonical.com/~inaddy/lp1328088/.
  Same description on daily-basis updated text.
  
  --
  It was brought to my attention that network namespace creation scalability 
was affected during kernel development.
  
  The following script was used for all the tests and charts generation:
  
  http://people.canonical.com/~inaddy/lp1328088/make_fake_routers.sh
  http://people.canonical.com/~inaddy/lp1328088/parse.py
  
  I measured how many fake routers (above script) could be added per
  second from 0 to 4000 created routers mark. Using this script and a git
  bisect on kernel tree I was led to one specific commit causing
  regression: #911af50 rcu: Provide compile-time control for no-CBs
  CPUs. Even Though this change was experimental at that point, it
  introduced a performance scalability regression (explained below) that
  still last and seems to be the default option for distributions
  nowadays.
  
  RCU related code looked like to be responsible for the problem. With
  that, every commit from tag v3.8..master that changed any of this files:
  kernel/rcutree.c kernel/rcutree.h kernel/rcutree_plugin.h
  include/trace/events/rcu.h include/linux/rcupdate.h was tested. The
  idea was to check performance regression during rcu development. In the
  worst case, the regression not being related to rcu, I would still have
  data to interpret the performance/scalability regression.
  
  All text below this refer to 2 groups of charts, generated during the
  study:
  
  1) Kernel git tags from 3.8 to 3.14.
  http://people.canonical.com/~inaddy/lp1328088/charts/250-tag.html
  
  2) Kernel git commits for rcu development (111 commits).
  http://people.canonical.com/~inaddy/lp1328088/charts/250.html
  
  Since there was difference in results depending on how many cpus or how
  the no-cb cpus were configured, 3 kernel config options were used on
  every measure:
  
  - CONFIG_RCU_NOCB_CPU (disabled): nocbno
  - CONFIG_RCU_NOCB_CPU_ALL (enabled): nocball
  - CONFIG_RCU_NOCB_CPU_NONE (enabled): nocbnone
  
  Obs: For 1 cpu cases: nocbno, nocbnone, nocball behaves the same since
  w/ only 1 cpu there is no no-cb cpu
  
  After charts being generated it was clear that NOCB_CPU_ALL (4 cpus)
  affected the fake routers creation process performance and this
  regression continues up to upstream version. It was also clear that,
  after commit #911af50, having more than 1 cpu does not improve
  performance/scalability for netns, makes it worse.
  
  #911af50
  ...
  +#ifdef CONFIG_RCU_NOCB_CPU_ALL
  +   pr_info(\tExperimental no-CBs for all CPUs\n);
  +   cpumask_setall(rcu_nocb_mask);
  +#endif /* #ifdef CONFIG_RCU_NOCB_CPU_ALL */
  ...
  
  Comparing standing out points (see charts):
  
  #81e5949 - good
  #911af50 - bad
  
  I was able to see that, from the script above, the following lines
  causes major impact on netns scalability/performance:
  
  1) ip netns add - huge performance regression:
- 1 cpu: no regression
- 4 cpu: regression for NOCB_CPU_ALL
- obs: regression from 250 netns/sec to 50 netns/sec 
-  on 500 netns already created mark
+ 1 cpu: no regression
+ 4 cpu: regression for NOCB_CPU_ALL
+ obs: regression from 250 netns/sec to 50 netns/sec
+  on 500 netns already created mark
  
  2) ip netns exec - some performance regression
- 1 cpu: no regression
- 4 cpu: regression for NOCB_CPU_ALL
- obs: regression from 40 netns (+1 exec per netns 
-  creation) to 20 netns/sec on 500 netns created 
-  mark
+ 1 cpu: no regression
+ 4 cpu: regression for NOCB_CPU_ALL
+ obs: regression from 40 netns (+1 exec per netns
+  creation) to 20 netns/sec on 500 netns created
+  mark
  
  # Assumption (to be confirmed)
  
  rcu callbacks being offloaded to other cpus caused regression in
  copy_net_ns-created_new_namespaces or unshare(clone_newnet).


[Kernel-packages] [Bug 1328088] Re: Kernel network namespace performance regression during rcu development on kernels above 3.8

2014-08-22 Thread Tim Gardner
** Changed in: linux (Ubuntu)
   Status: New => Fix Committed

** Also affects: linux (Ubuntu Trusty)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Utopic)
   Importance: Undecided
   Status: Fix Committed

** Changed in: linux (Ubuntu Trusty)
   Status: New => Fix Committed

** Changed in: linux (Ubuntu Trusty)
 Assignee: (unassigned) => Rafael David Tinoco (inaddy)


[Kernel-packages] [Bug 1328088] Re: Kernel network namespace performance regression during rcu development on kernels above 3.8

2014-08-21 Thread Rafael David Tinoco
Upstream suggestions/observations were accepted and code was changed:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=728dba3a39c66b3d8ac889ddbe38b5b1c264aec3

These changes were already tested and worked well for the performance
regression.

Suggesting SRU to our kernel team soon.

Thank you

-Rafael



[Kernel-packages] [Bug 1328088] Re: Kernel network namespace performance regression during rcu development on kernels above 3.8

2014-06-11 Thread Christopher M. Penalver
** Tags added: bisect-done



[Kernel-packages] [Bug 1328088] Re: Kernel network namespace performance regression during rcu development on kernels above 3.8

2014-06-11 Thread Chris J Arges
Related upstream discussion:
https://lkml.org/lkml/2014/6/11/42



[Kernel-packages] [Bug 1328088] Re: Kernel network namespace performance regression during rcu development on kernels above 3.8

2014-06-10 Thread Chris J Arges
** Also affects: linux (Ubuntu)
   Importance: Undecided
   Status: New


Bug description:
  Please, follow this in:
  http://people.canonical.com/~inaddy/lp1328088/. Same description on
  daily-basis updated text.

  --

  It was brought to my attention that fake router creation scalability
  was affected during kernel development.

  The following script was used for all the tests and charts generation:

  http://people.canonical.com/~inaddy/lp1328088/make_fake_routers.sh
  http://people.canonical.com/~inaddy/lp1328088/parse.py

  I measured how many fake routers (above script) could be added per
  second, from 0 up to the 4000-created-routers mark. Using this script
  and a git bisect on the kernel tree I was led to one specific commit
  causing the regression: #911af505 ("rcu: Provide compile-time control
  for no-CBs CPUs").

  It appeared that rcu, rcu callbacks and no-cb cpus were causing the
  issue, so every commit that changed any of these files:
  kernel/rcutree.c kernel/rcutree.h kernel/rcutree_plugin.h
  include/trace/events/rcu.h include/linux/rcupdate.h was tested. The
  idea was to check for performance regressions during rcu development.
  In the worst case I would still have data on performance regressions
  during kernel development (since we have rcu commits from 3.8 to 3.14).

  All text below refers to 2 groups of charts, generated during the
  study:

  1) Kernel git tags from 3.8 to 3.14.
  http://people.canonical.com/~inaddy/lp1328088/charts/250-tag.html

  2) Kernel git commits for rcu development (111 commits).
  http://people.canonical.com/~inaddy/lp1328088/charts/250.html

  Since there were differences in results depending on how many cpus
  were used or how the no-cb cpus were configured, 3 kernel config
  options were used for every measurement:

  - CONFIG_RCU_NOCB_CPU (disabled: nocbno)
  - CONFIG_RCU_NOCB_CPU_ALL (enabled: nocball)
  - CONFIG_RCU_NOCB_CPU_NONE (enabled: nocbnone)
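  For reference, the three setups correspond to kconfig fragments along
  these lines (a sketch; in kernels of this era NOCB_CPU_NONE/ALL are a
  choice that depends on CONFIG_RCU_NOCB_CPU):

```
# nocbno: callback offloading compiled out
# CONFIG_RCU_NOCB_CPU is not set

# nocbnone: offloading available, no CPUs offloaded by default
CONFIG_RCU_NOCB_CPU=y
CONFIG_RCU_NOCB_CPU_NONE=y

# nocball: offloading available, every CPU offloaded
CONFIG_RCU_NOCB_CPU=y
CONFIG_RCU_NOCB_CPU_ALL=y
```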

  After the charts were generated and studied it was clear that
  NOCB_CPU_ALL (4 cpus) hurt the fake-router creation performance, and
  that this regression continues up to the current upstream version. It
  was also clear that, after this commit, this test does not scale when
  executed with more than 1 cpu.

  Comparing the stand-out points (see charts):

  #81e5949 - good
  #911af50 - bad
  #6faf728 - not good enough

  From the script above, I was able to see that the following commands
  were affected:

  1) ip netns add - huge performance regression
  2) ip netns exec - some performance regression
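  A minimal way to spot-check those two commands on a given kernel (a
  sketch: the namespace name is arbitrary, and "ip netns" needs root,
  so the check skips cleanly when unprivileged):

```shell
#!/bin/sh
# Time one "ip netns add" and one "ip netns exec" on this kernel.
# Requires root and iproute2; otherwise reports "skipped".
if [ "$(id -u)" -eq 0 ] && command -v ip >/dev/null 2>&1; then
    ns=lp1328088-demo
    start=$(date +%s%N)
    ip netns add "$ns"            # 1) the heavily regressed operation
    mid=$(date +%s%N)
    ip netns exec "$ns" true      # 2) the mildly regressed operation
    end=$(date +%s%N)
    ip netns del "$ns"
    result="add $(( (mid - start) / 1000000 ))ms exec $(( (end - mid) / 1000000 ))ms"
else
    result="skipped"
fi
echo "$result"
```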

  #
  # Assumption
  #

  rcu callbacks being offloaded to other cpus caused regression in
  unshare(CLONE_NEWNET) code.

  # Specific kernel entry being investigated:

  unshare(CLONE_NEWNET)
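  This entry point can be exercised without the ip(8) wrapper via
  util-linux unshare(1), which issues unshare(CLONE_NEWNET) itself (a
  sketch: -r enters a user namespace first, so it may run unprivileged
  where the kernel allows that, and the iteration count is arbitrary):

```shell
#!/bin/sh
# Call unshare(CLONE_NEWNET) in a loop and report a rough rate.
# "unshare -r -n" works unprivileged only where unprivileged user
# namespaces are allowed; otherwise this reports "skipped".
if unshare -r -n true 2>/dev/null; then
    n=25
    start=$(date +%s%N)
    i=0
    while [ "$i" -lt "$n" ]; do
        unshare -r -n true       # one unshare(CLONE_NEWNET) per call
        i=$((i + 1))
    done
    end=$(date +%s%N)
    msg="created $n net namespaces in $(( (end - start) / 1000000 )) ms"
else
    msg="skipped: unshare -r -n not permitted here"
fi
echo "$msg"
```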

To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/1328088/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1328088] Re: Kernel network namespace performance regression during rcu development on kernels above 3.8

2014-06-10 Thread Rafael David Tinoco
** Description changed:

  Please follow this at: http://people.canonical.com/~inaddy/lp1328088/.
  The same description is kept there and updated on a daily basis.
  
  --
- 
- It was brought to my attention that fake router creation scalability
- was affected during kernel development.
+ It was brought to my attention that network namespace creation
+ scalability was affected during kernel development.
  
  The following script was used for all the tests and charts generation:
  
  http://people.canonical.com/~inaddy/lp1328088/make_fake_routers.sh
  http://people.canonical.com/~inaddy/lp1328088/parse.py
  
  I measured how many fake routers (using the script above) could be
  added per second, from 0 up to the 4000-created-routers mark. Using
  this script and a git bisect on the kernel tree I was led to one
  specific commit causing the
- regression: #911af505 rcu: Provide compile-time control for no-CBs
- CPUs.
+ regression: #911af50 rcu: Provide compile-time control for no-CBs
+ CPUs. Even though this change was experimental at that point, it
+ introduced a performance/scalability regression (explained below) that
+ still lasts, and the offending option seems to be the default for
+ distributions nowadays.
  
- It appeared that rcu, rcu callbacks and no-cb cpus were causing the
- issue so every commit that changed any of this files: kernel/rcutree.c
- kernel/rcutree.h kernel/rcutree_plugin.h include/trace/events/rcu.h
- include/linux/rcupdate.h was tested. The idea was to check performance
- regression during rcu development. In the worst case I would have data
- for performance regression during kernel development (since we have rcu
- commits from 3.8 to 3.14).
+ RCU-related code looked to be responsible for the problem. With
+ that, every commit in v3.8..master that changed any of these files:
+ kernel/rcutree.c kernel/rcutree.h kernel/rcutree_plugin.h
+ include/trace/events/rcu.h include/linux/rcupdate.h was tested. The
+ idea was to check for performance regressions during rcu development.
+ In the worst case, the regression not being related to rcu, I would
+ still have data to interpret the performance/scalability regression.
  
  All text below refers to 2 groups of charts, generated during the
  study:
  
  1) Kernel git tags from 3.8 to 3.14.
  http://people.canonical.com/~inaddy/lp1328088/charts/250-tag.html
  
  2) Kernel git commits for rcu development (111 commits).
  http://people.canonical.com/~inaddy/lp1328088/charts/250.html
  
  Since there were differences in results depending on how many cpus or
  how the no-cb cpus were configured, 3 kernel config options were used
  for every measurement:
  
- - CONFIG_RCU_NOCB_CPU (disabled: nocbno)
- - CONFIG_RCU_NOCB_CPU_ALL (enabled: nocball)
- - CONFIG_RCU_NOCB_CPU_NONE (enabled: nocbnone)
+ - CONFIG_RCU_NOCB_CPU (disabled): nocbno
+ - CONFIG_RCU_NOCB_CPU_ALL (enabled): nocball
+ - CONFIG_RCU_NOCB_CPU_NONE (enabled): nocbnone
  
- After charts generation and study it was clear that NOCB_CPU_ALL (4
- cpus) affected the fake routers creation process performance and this
+ Obs: For the 1 cpu case, nocbno, nocbnone and nocball behave the same,
+ since with only 1 cpu there is no no-cb cpu.
+ 
+ After the charts were generated it was clear that NOCB_CPU_ALL (4 cpus)
+ affected the fake routers creation process performance and this
  regression continues up to upstream version. It was also clear that,
- after this commit, there is no scalability executing this test with more
- than 1 cpu.
+ after commit #911af50, having more than 1 cpu does not improve netns
+ performance/scalability; it makes it worse.
+ 
+ #911af50
+ ...
+ +#ifdef CONFIG_RCU_NOCB_CPU_ALL
+ +   pr_info("\tExperimental no-CBs for all CPUs\n");
+ +   cpumask_setall(rcu_nocb_mask);
+ +#endif /* #ifdef CONFIG_RCU_NOCB_CPU_ALL */
+ ...
  
  Comparing standing out points (see charts):
  
  #81e5949 - good
  #911af50 - bad
- #6faf728 - not good enough
  
- I was able to see that from the script above the following lines were
- affected:
+ I was able to see that, from the script above, the following commands
+ cause major impact on netns scalability/performance:
  
- 1) ip netns add - huge performance regression
+ 1) ip netns add - huge performance regression:
+ 1 cpu: no regression
+ 4 cpu: regression for NOCB_CPU_ALL
+ obs: regression from 250 netns/sec to 50 netns/sec
+  at the 500-netns-already-created mark
+ 
  2) ip netns exec - some performance regression
+ 1 cpu: no regression
+ 4 cpu: regression for NOCB_CPU_ALL
+ obs: regression from 40 netns/sec (+1 exec per netns
+  creation) to 20 netns/sec at the 500-netns-created
+  mark
  
- #
- # Assumption
- #
+ # Assumption (to be confirmed)
  
  rcu callbacks being offloaded to other cpus caused regression in
- unshare(CLONE_NEWNET) code.
- 
- # Specific kernel entry being investigated:
- 
- unshare(CLONE_NEWNET)
+ copy_net_ns() (called from create_new_namespaces()) or
+ unshare(CLONE_NEWNET).

-- 
You received this bug notification because you are a member of Kernel
Packages, which is