Hello,

    Recently I found a serious issue with network node startup time:
neutron-rootwrap eats a lot of CPU cycles, far more than the processes it wraps.

On a database with 1 public network, 192 private networks, 192 routers, and 192 nano VMs, with OVS plugin:


Network node setup time (rootwrap): 24 minutes
Network node setup time (sudo):     10 minutes


   That is the time from rebooting a network node until all namespaces
and services are restored.


As appendix [1] shows, this extra 14-minute overhead matches the fact that rootwrap needs 0.3 s to start and launch a system command (once filtered).

    14 minutes = 840 s.
(840 s / 192 resources) / 0.3 s ~= 15 operations per resource (qdhcp + qrouter): iptables, OVS port creation and tagging, starting child processes, etc.
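
The estimate can be reproduced with a few lines of Python. The variable names are only illustrative; the numbers are the ones measured in this mail.

```python
# Back-of-the-envelope check of the rootwrap overhead estimate.
ROOTWRAP_SETUP_MIN = 24   # measured node setup time with rootwrap
SUDO_SETUP_MIN = 10       # measured node setup time with plain sudo
RESOURCES = 192           # qdhcp + qrouter pairs on the node
ROOTWRAP_STARTUP_S = 0.3  # per-invocation cost, see appendix [1]

overhead_s = (ROOTWRAP_SETUP_MIN - SUDO_SETUP_MIN) * 60
per_resource_s = overhead_s / RESOURCES
ops_per_resource = per_resource_s / ROOTWRAP_STARTUP_S

print(f"overhead: {overhead_s} s")
print(f"per resource: {per_resource_s:.2f} s")
print(f"implied rootwrap calls per resource: {ops_per_resource:.1f}")
```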

   The overhead comes from python startup time + rootwrap loading.

I suppose rootwrap was designed for a lower number of system command invocations (nova?).

And I understand what rootwrap provides: a level of filtering that sudo cannot offer. But it raises some questions:

1) Is anyone actually using rootwrap in production?

2) What alternatives can we think of to improve this situation?

0) Already being done: coalescing system calls. But I'm not sure that's enough (if we coalesce 15 calls into 3 on this system we get 192*3*0.3/60 ~= 3 minutes of overhead on a 10-minute operation).
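
To see how far coalescing alone could go, here is a rough projection under the same assumptions (0.3 s per rootwrap invocation, 192 resources). The function name is made up for this sketch, and it ignores everything except rootwrap startup cost.

```python
def rootwrap_overhead_minutes(resources, calls_per_resource, startup_s=0.3):
    """Estimated time spent only on rootwrap startup, in minutes."""
    return resources * calls_per_resource * startup_s / 60


# 15 calls is the current estimate, 3 is the coalesced case discussed above.
for calls in (15, 3, 1):
    minutes = rootwrap_overhead_minutes(192, calls)
    print(f"{calls:2d} calls/resource -> {minutes:.1f} min of overhead")
```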

a) Rewriting the rules into sudo (to the extent that's possible), and living with that.

b) How well protected is neutron against command injection at that point? How much is user input filtered on the API calls?

c) Even if "b" is OK, I suppose that if the DB gets compromised, that could lead to command injection.

d) Rewriting rootwrap in C (it's 600 lines of Python now).

e) Doing the command filtering on the neutron side, as a library, and living with sudo with simple filtering (this kills the python/rootwrap startup overhead).
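
A minimal sketch of what option "e" could look like, just to make the idea concrete. Everything here is hypothetical: the allowlist entries, the exception and the function name are invented for illustration and are not neutron's or rootwrap's real filters. Note that this check runs in the unprivileged process, so sudo's own rules would remain the only barrier on the root side.

```python
import re

# Hypothetical allowlist: executable name -> one regex per argument.
# These entries are illustrative only, not neutron's real filter rules.
ALLOWED = {
    "ip": [r"netns", r"(add|delete)", r"q(dhcp|router)-[0-9a-f-]{36}"],
    "ovs-vsctl": [r"(add-port|del-port)", r"br-int", r"[A-Za-z0-9_.-]{1,15}"],
}


class CommandRejected(Exception):
    """Raised when a command does not match the allowlist."""


def build_sudo_command(argv):
    """Validate argv against ALLOWED and return the sudo command line.

    Nothing is executed here; the caller would hand the returned list
    to subprocess without going through a shell.
    """
    if not argv or argv[0] not in ALLOWED:
        raise CommandRejected(argv)
    patterns = ALLOWED[argv[0]]
    args = argv[1:]
    if len(args) != len(patterns):
        raise CommandRejected(argv)
    for arg, pattern in zip(args, patterns):
        # fullmatch so that extra characters cannot be smuggled into an argument
        if not re.fullmatch(pattern, arg):
            raise CommandRejected(argv)
    return ["sudo"] + list(argv)
```

With this, the per-command cost would be one sudo invocation (about 0.03 s in appendix [1]) instead of a full python + rootwrap startup.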

3) I also find 10 minutes a long time to set up 192 networks and basic tenant structures. I wonder if that time could be reduced by converting
system process calls into system library calls (I know we don't have
libraries for iproute, iptables?, and many other things... but it's a
problem that's probably worth looking at).
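
As a small illustration of the process call vs. library call idea, limited to something the Python standard library already covers: listing interface names through socket.if_nameindex() instead of spawning "ip". This is far from what the agents actually need (namespaces, iptables, OVS), and the function names are made up for the example.

```python
import socket
import subprocess


def interfaces_via_library():
    """Interface names from a libc call, no child process involved."""
    return [name for _index, name in socket.if_nameindex()]


def interfaces_via_process():
    """Interface names by spawning `ip -o link show` and parsing its output."""
    out = subprocess.check_output(["ip", "-o", "link", "show"], text=True)
    # Each line looks like "2: eth0: <...>" or "5: veth0@if4: <...>"
    return [line.split(": ")[1].split("@")[0] for line in out.splitlines()]


if __name__ == "__main__":
    print(interfaces_via_library())
```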

Best,
Miguel Ángel Ajo.


Appendix:

[1] Analyzing overhead:

[root@rhos4-neutron2 ~]# echo "int main() { return 0; }" > test.c
[root@rhos4-neutron2 ~]# gcc test.c -o test
[root@rhos4-neutron2 ~]# time test # to time process invocation on this machine

real    0m0.000s
user    0m0.000s
sys    0m0.000s


[root@rhos4-neutron2 ~]# time sudo bash -c 'exit 0'

real    0m0.032s
user    0m0.010s
sys    0m0.019s


[root@rhos4-neutron2 ~]# time python -c'import sys;sys.exit(0)'

real    0m0.057s
user    0m0.016s
sys    0m0.011s

[root@rhos4-neutron2 ~]# time neutron-rootwrap --help
/usr/bin/neutron-rootwrap: No command specified

real    0m0.309s
user    0m0.128s
sys    0m0.037s
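
A single "time" run is noisy, so here is a small helper that averages several runs of any command. It is only a sketch: the function name is made up, and it measures wall-clock time, not user/sys.

```python
import subprocess
import sys
import time


def average_startup_seconds(argv, runs=10):
    """Run argv `runs` times and return the mean wall-clock time in seconds."""
    start = time.perf_counter()
    for _ in range(runs):
        # The exit status is ignored on purpose: a command that fails after
        # starting up (e.g. "No command specified") has still paid the
        # startup cost we want to measure.
        subprocess.call(argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    # Baseline: bare interpreter startup, comparable to the python -c run above.
    baseline = average_startup_seconds([sys.executable, "-c", "pass"])
    print(f"python startup: {baseline:.3f} s")
```

The same helper could be pointed at ["neutron-rootwrap", "--help"] on a network node to compare against the baseline.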

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
