We have a default rule in OVS which, I assume, makes it behave like a regular
L2 switch:
cookie=0x0, duration=71407.425s, table=0, n_packets=33577078,
n_bytes=38722336595, idle_age=0, hard_age=65534, priority=0 actions=NORMAL
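By "regular L2 switch" we mean MAC learning with flooding of broadcast and unknown unicast frames. The Python sketch below is only a toy illustration of that behaviour, not OVS code; the class name, method name, and MAC addresses are made up for the example.

```python
class LearningSwitch:
    """Toy MAC-learning switch: learn source MACs, flood unknown destinations."""

    BROADCAST = "ff:ff:ff:ff:ff:ff"

    def __init__(self, ports):
        self.ports = set(ports)
        self.mac_table = {}  # (vlan, mac) -> port

    def receive(self, in_port, vlan, src_mac, dst_mac):
        """Return the set of ports the frame is sent out of."""
        # Learn (or refresh) which port the source MAC lives behind.
        self.mac_table[(vlan, src_mac)] = in_port
        out_port = self.mac_table.get((vlan, dst_mac))
        if dst_mac == self.BROADCAST or out_port is None:
            # Broadcast or unknown unicast: flood to every port except the ingress.
            return self.ports - {in_port}
        if out_port == in_port:
            # Destination is behind the ingress port, so nothing is forwarded.
            return set()
        return {out_port}


switch = LearningSwitch(ports=[1, 2, 3])
# Destination not yet learned: the frame is flooded.
print(switch.receive(1, 100, "aa:aa:aa:aa:aa:01", "aa:aa:aa:aa:aa:02"))
# The reply goes to a MAC learned from the first frame: single output port.
print(switch.receive(2, 100, "aa:aa:aa:aa:aa:02", "aa:aa:aa:aa:aa:01"))
```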
Using a traffic generator, we are sending unknown unicast and broadcast
traffic to and from about 10000 hosts at roughly 500 pkts/sec. This causes
very high CPU utilization in the revalidator threads, as shown:
 PID USER PR NI   VIRT   RES  SHR S %CPU %MEM     TIME+ COMMAND
1522 root 20  0 413360 55972 3916 S 85.4  0.7 603:29.96 revalidator8
1521 root 20  0 413360 55972 3916 R 79.7  0.7 616:24.68 revalidator9
The following messages appear in ovs-vswitchd.log:
2017-02-02T21:27:15.474Z|00009|poll_loop(revalidator23)|INFO|wakeup due to
[POLLIN] on fd 52 (FIFO pipe:[23153437]) at lib/ovs-thread.c:306 (54% CPU
usage)
2017-02-02T21:27:15.530Z|00014|poll_loop(revalidator22)|INFO|wakeup due to
[POLLIN] on fd 50 (FIFO pipe:[23153436]) at lib/ovs-thread.c:306 (58% CPU
usage)
2017-02-02T21:27:15.532Z|00015|poll_loop(revalidator22)|INFO|wakeup due to
[POLLIN] on fd 50 (FIFO pipe:[23153436]) at lib/ovs-thread.c:306 (58% CPU
usage)
2017-02-02T21:27:21.444Z|00016|poll_loop(revalidator22)|INFO|Dropped 242
log messages in last 5 seconds (most recently, 0 seconds ago) due to
excessive rate
2017-02-02T21:27:21.445Z|00017|poll_loop(revalidator22)|INFO|wakeup due to
[POLLIN] on fd 50 (FIFO pipe:[23153436]) at lib/ovs-thread.c:306 (73% CPU
usage)
2017-02-02T21:27:27.471Z|00010|poll_loop(revalidator23)|INFO|Dropped 190
log messages in last 6 seconds (most recently, 0 seconds ago) due to
excessive rate
2017-02-02T21:27:27.471Z|00011|poll_loop(revalidator23)|INFO|wakeup due to
[POLLIN] on fd 52 (FIFO pipe:[23153437]) at lib/ovs-thread.c:306 (82% CPU
usage)
2017-02-02T21:27:33.439Z|00012|poll_loop(revalidator23)|INFO|Dropped 195
log messages in last 6 seconds (most recently, 0 seconds ago) due to
excessive rate
2017-02-02T21:27:33.439Z|00013|poll_loop(revalidator23)|INFO|wakeup due to
[POLLIN] on fd 52 (FIFO pipe:[23153437]) at lib/ovs-thread.c:306 (88% CPU
usage)
2017-02-02T21:27:39.479Z|00014|poll_loop(revalidator23)|INFO|Dropped 203
log messages in last 6 seconds (most recently, 0 seconds ago) due to
excessive rate
2017-02-02T21:27:39.479Z|00015|poll_loop(revalidator23)|INFO|wakeup due to
[POLLIN] on fd 52 (FIFO pipe:[23153437]) at lib/ovs-thread.c:306 (78% CPU
usage)
2017-02-02T21:27:45.469Z|00016|poll_loop(revalidator23)|INFO|Dropped 239
log messages in last 6 seconds (most recently, 0 seconds ago) due to
excessive rate
2017-02-02T21:27:45.469Z|00017|poll_loop(revalidator23)|INFO|wakeup due to
[POLLIN] on fd 52 (FIFO pipe:[23153437]) at lib/ovs-thread.c:306 (80% CPU
usage)
2017-02-02T21:27:51.733Z|00018|poll_loop(revalidator22)|INFO|Dropped 213
log messages in last 7 seconds (most recently, 1 seconds ago) due to
excessive rate
2017-02-02T21:27:51.733Z|00019|poll_loop(revalidator22)|INFO|wakeup due to
422-ms timeout at ofproto/ofproto-dpif-upcall.c:917 (71% CPU usage)
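To compare the threads over a longer capture, the CPU figures that poll_loop reports can be tallied per thread. This is a minimal Python sketch, not part of OVS; it assumes each record sits on a single line in the log file (the lines above are wrapped by the mail client), and the name summarize_cpu is made up for illustration.

```python
import re
from collections import defaultdict

# Matches records such as:
#   ...|poll_loop(revalidator23)|INFO|wakeup due to ... (54% CPU usage)
# "Dropped N log messages" records carry no CPU figure and are skipped.
CPU_RE = re.compile(
    r"\|poll_loop\((?P<thread>[^)]+)\)\|INFO\|wakeup due to .*"
    r"\((?P<cpu>\d+)% CPU usage\)"
)


def summarize_cpu(lines):
    """Return {thread: (samples, mean_cpu, peak_cpu)} for poll_loop wakeups."""
    samples = defaultdict(list)
    for line in lines:
        match = CPU_RE.search(line)
        if match:
            samples[match.group("thread")].append(int(match.group("cpu")))
    return {
        thread: (len(values), sum(values) / len(values), max(values))
        for thread, values in samples.items()
    }


def print_summary(path):
    """Print one summary line per thread for the log file at `path`."""
    with open(path) as log:
        for thread, (count, mean, peak) in sorted(summarize_cpu(log).items()):
            print(f"{thread}: {count} samples, mean {mean:.1f}%, peak {peak}%")
```

Calling print_summary() with the path to ovs-vswitchd.log prints one line per thread. Because the log is rate-limited, the figures are samples of the reported CPU usage rather than a complete record.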
Are there any tips for improving OVS performance under such traffic, where the
kernel flow cache may be constantly thrashed?
Is there a way to wildcard the Layer 2 information in the packets and forward
purely based on VLAN and port, so that the kernel cache undergoes less
thrashing?
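To make the second question concrete: our understanding (which may be wrong) is that with the NORMAL action the cached kernel flows end up matching on source and destination MAC, so each new host pair needs its own entry, whereas a match on port and VLAN alone would not. The toy Python model below only counts distinct cache keys under the two schemes. It is not how OVS builds its flows, and the packet fields, port count, VLAN ID, and function names are invented for illustration.

```python
from collections import namedtuple

# Hypothetical, simplified header; real datapath flow keys carry many more fields.
Packet = namedtuple("Packet", "in_port vlan src_mac dst_mac")


def l2_key(pkt):
    """Cache key that includes the MAC addresses."""
    return (pkt.in_port, pkt.vlan, pkt.src_mac, pkt.dst_mac)


def port_vlan_key(pkt):
    """Cache key with the MAC addresses wildcarded."""
    return (pkt.in_port, pkt.vlan)


def count_entries(packets, key_fn):
    """Number of distinct cache entries needed to cover the packets."""
    return len({key_fn(pkt) for pkt in packets})


def synthetic_traffic(n_hosts, n_ports=2, vlan=100):
    """Each host sends one frame to the next host; hosts are spread over n_ports."""
    for host in range(n_hosts):
        yield Packet(
            in_port=host % n_ports + 1,
            vlan=vlan,
            src_mac=host,  # integers stand in for MAC addresses
            dst_mac=(host + 1) % n_hosts,
        )


traffic = list(synthetic_traffic(10000))
print("entries keyed on MACs:     ", count_entries(traffic, l2_key))
print("entries keyed on port+VLAN:", count_entries(traffic, port_vlan_key))
```

With 10000 hosts spread over two ports in a single VLAN, the model needs 10000 entries when keyed on MACs and 2 when keyed on port and VLAN only, which is the kind of reduction we are hoping for.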
Note that we have now set n-handler-threads to 2 and n-revalidator-threads to 1
so that the revalidator threads burn at most one core.