The keepalive feature is aimed at achieving Fastpath Service Assurance in OVS-DPDK deployments. It adds support for monitoring the packet processing cores (PMD thread cores) by dispatching heartbeats at regular intervals. In case of heartbeat misses, additional health checks are enabled on the PMD thread to detect the failure, and the failure is reported to higher-level fault management systems/frameworks.
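
To make the heartbeat mechanism concrete, here is a minimal sketch in C of how a monitor could track heartbeat misses per core. It is not taken from the patches: the ka_sketch_* names, the three states, and the rule that a second consecutive miss marks the core dead are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-core states; the names are illustrative only. */
enum ka_sketch_state {
    KA_SKETCH_ALIVE,     /* Heartbeat seen within the last interval. */
    KA_SKETCH_MISSING,   /* One interval without a heartbeat. */
    KA_SKETCH_DEAD       /* Still silent on the following check (assumed rule). */
};

struct ka_sketch_core {
    uint64_t last_seen_ms;          /* Time of the most recent heartbeat. */
    enum ka_sketch_state state;
};

/* Called on behalf of a packet processing core to record a heartbeat. */
void
ka_sketch_mark_alive(struct ka_sketch_core *core, uint64_t now_ms)
{
    core->last_seen_ms = now_ms;
    core->state = KA_SKETCH_ALIVE;
}

/* Called by the monitor once per interval.  Returns true when the heartbeat
 * was missed, i.e. when additional health checks should be enabled. */
bool
ka_sketch_check(struct ka_sketch_core *core, uint64_t now_ms,
                uint64_t interval_ms)
{
    if (now_ms - core->last_seen_ms <= interval_ms) {
        core->state = KA_SKETCH_ALIVE;
        return false;
    }
    /* First miss moves ALIVE to MISSING; any further miss ends in DEAD. */
    core->state = (core->state == KA_SKETCH_ALIVE
                   ? KA_SKETCH_MISSING : KA_SKETCH_DEAD);
    return true;
}
```

A monitor built this way would call ka_sketch_check() once per keepalive interval for each registered core and start the additional health checks whenever it returns true.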
The implementation uses OVSDB for reporting the datapath status and the health of the PMD threads. Any external monitoring application can read the status from OVSDB at regular intervals, or subscribe to updates in OVSDB so that it is notified when changes happen.

A POSIX shared memory object is created and initialized for storing the status of the PMD threads. It is initialized by the main thread (vswitchd) as part of the init process and is periodically updated by the 'keepalive' thread.

The keepalive feature can be enabled through the OVSDB settings below.

    enable-keepalive=true
      - Keepalive feature is disabled by default.

    keepalive-interval="5000"
      - Timer interval in milliseconds for monitoring the packet
        processing cores.

    keepalive-shm-name="/ovs_keepalive_shm_name"
      - Shared memory block name where the events shall be updated.

When KA is enabled, the 'ovs-keepalive' thread is spawned. It wakes up at regular intervals to update the timestamp and status of the PMD cores in the shared memory region. The vswitchd thread reads this information and writes the status into the 'keepalive' column of the Open_vSwitch table in OVSDB.

An external monitoring framework like collectd with ovs events support can read, or subscribe to, the datapath status changes in OVSDB. When the state is updated, collectd is notified and eventually relays the status to the ceilometer service running on the controller. The diagram further below gives a high-level overview of the deployment model.
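
Ahead of the deployment diagram, a rough illustration of the shared memory handling described above may help. The sketch below shows one way a main thread could create and zero a named POSIX shared memory object, and how a periodic thread could then record per-core timestamps and status in it. The struct layout, the ka_sketch_* names, and the 128-core bound are assumptions for illustration and do not reflect the layout used in the patches.

```c
#define _POSIX_C_SOURCE 200809L

#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define KA_SKETCH_MAX_CORES 128   /* Assumed bound, not from the series. */

/* Hypothetical layout of the shared status block. */
struct ka_sketch_shm {
    uint32_t n_cores;             /* Highest core id seen so far, plus one. */
    struct {
        uint64_t last_seen_ms;    /* Timestamp of the last update. */
        uint32_t state;           /* Opaque per-core status code. */
    } core[KA_SKETCH_MAX_CORES];
};

/* Main thread, at init: create, size, map, and zero the named object.
 * Returns NULL on failure. */
struct ka_sketch_shm *
ka_sketch_shm_create(const char *name)
{
    size_t size = sizeof(struct ka_sketch_shm);
    int fd = shm_open(name, O_CREAT | O_RDWR, 0644);

    if (fd < 0) {
        return NULL;
    }
    if (ftruncate(fd, (off_t) size) < 0) {
        close(fd);
        shm_unlink(name);
        return NULL;
    }

    void *mem = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);   /* The mapping stays valid after the descriptor is closed. */
    if (mem == MAP_FAILED) {
        shm_unlink(name);
        return NULL;
    }
    memset(mem, 0, size);
    return mem;
}

/* Periodic thread: record one core's timestamp and status.
 * Returns 0 on success, -1 if core_id is out of range. */
int
ka_sketch_shm_update(struct ka_sketch_shm *shm, uint32_t core_id,
                     uint64_t now_ms, uint32_t state)
{
    if (core_id >= KA_SKETCH_MAX_CORES) {
        return -1;
    }
    shm->core[core_id].last_seen_ms = now_ms;
    shm->core[core_id].state = state;
    if (core_id >= shm->n_cores) {
        shm->n_cores = core_id + 1;
    }
    return 0;
}
```

With the example setting above, the object would be created with ka_sketch_shm_create("/ovs_keepalive_shm_name"). A reader process would map the same name and only read the block. On older glibc versions, shm_open() may require linking with -lrt.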

        Compute Node                 Controller               Compute Node

         Collectd  <---------------> Ceilometer <-----------> Collectd

         OvS DPDK                                             OvS DPDK

     +-----+
     | VM  |
     +--+--+
    \---+---/
        |
     +--+---+       +------------+----------+     +------+-------+
     | OVS  |-----> |   ovsevents plugin    | --> |   collectd   |
     +--+---+       +------------+----------+     +------+-------+

 +------+-----+     +---------------+------------+     |
 | Ceilometer | <-- | collectd ceilometer plugin |  <---
 +------+-----+     +---------------+------------+

Performance impact:
  No noticeable performance or latency impact is observed with
  KA feature enabled.

-------------------------------
v1 -> v2
  * Merged the xml and schema commits to later commit where the actual
    implementation is done (suggested by Ben).
  * Fix ovs-appctl keepalive/* hang issue when KA disabled.
  * Fixed memory leaks with appctl commands for keepalive/pmd-health-show,
    pmd-xstats-show.
  * Refactored code and fixed APIs dealing with PMD health monitoring.

Bhanuprakash Bodireddy (19):

  [9] patches help update OVSDB with keepalive status
  dpdk: Add helper functions for DPDK datapath keepalive.
  process: Retrieve process status.
  Keepalive: Add initial keepalive support.
  bridge: Invoke keepalive framework.
  keepalive: Add more helper functions to KA framework.
  dpif-netdev: Register packet processing cores to KA framework.
  dpif-netdev: Enable heartbeats for DPDK datapath.
  keepalive: Retrieve PMD status periodically.
  bridge: Update keepalive status in OVSDB
  keepalive: Add support to query keepalive statistics.
  keepalive: Add support to query keepalive status.
  dpif-netdev: Add helper function to check false positives.

  [5] Patches add additional health checks in case of heartbeat failure.
  dpif-netdev: Add additional datapath health checks.
  keepalive: Check the link status as part of PMD health checks.
  keepalive: Check the packet statistics as part of PMD health checks.
  keepalive: Check the PMD cycle stats as part of PMD health checks.
  netdev-dpdk: Enable PMD health checks on heartbeat failure.

  keepalive: Display extended Keepalive status.
  Documentation: Update DPDK doc with Keepalive feature.

 Documentation/howto/dpdk.rst |  95 +++++
 lib/automake.mk              |   2 +
 lib/dpdk-stub.c              |  30 ++
 lib/dpdk.c                   |  61 ++++
 lib/dpdk.h                   |  13 +
 lib/dpif-netdev.c            | 166 ++++++++-
 lib/dpif-netdev.h            |   6 +
 lib/keepalive.c              | 846 +++++++++++++++++++++++++++++++++++++++++++
 lib/keepalive.h              | 132 +++++++
 lib/netdev-dpdk.c            | 129 ++++++-
 lib/netdev-dpdk.h            |   5 +
 lib/process.c                |  73 ++++
 lib/process.h                |  11 +
 vswitchd/bridge.c            |  30 ++
 vswitchd/vswitch.ovsschema   |   7 +-
 vswitchd/vswitch.xml         |  59 +++
 16 files changed, 1656 insertions(+), 9 deletions(-)
 create mode 100644 lib/keepalive.c
 create mode 100644 lib/keepalive.h

-- 
2.4.11