On 12/08/2017 12:04 PM, Bhanuprakash Bodireddy wrote:
> Keepalive feature is aimed at achieving Fastpath Service Assurance
> in OVS-DPDK deployments. It adds support for monitoring the packet
> processing threads by dispatching heartbeats at regular intervals.
>  

Hi Bhanu,

I guess there's comments elsewhere about the need for this feature, but
looking at this solution I had a few comments on it. Not a full review.

This is creating a dependency on internal OVS threading model from
collectd/ceilometer. That seems like something that could cause problems
in the future. What if someone wanted to put multiple pmds on a core, or
put a pmd into interrupt mode. Would it be blocked in OVS because it
would break collectd/ceilometer? or how would the projects coordinate?

I see there is a sleep state, so maybe that could be used for things
like an interrupt mode etc. but please consider that that OVS may not
always spin the pmds in the manner it does now and at least allow for a
"don't monitor this pmd" type option, so OVS is not restricted by trying
to please collectd/ceilometer.

It also seems strange that collectd/ceilometer would be monitoring pmd
health but be ok with pmds appearing and disappearing due to internal
OVS rxq assignment.

> keepalive feature can be enabled through below OVSDB settings.
> 
>     enable-keepalive=true
>       - Keepalive feature is disabled by default and should be enabled
>         at startup before ovs-vswitchd daemon is started.
> 
>     keepalive-interval="5000"
>       - Timer interval in milliseconds for monitoring the packet
>         processing cores.
> 
> TESTING:
>     The testing of keepalive is done using stress cmd (simulating the stalls).
>       - pmd-cpu-mask=0xf [MQ enabled on DPDK ports]
>       - stress -c 1 &          [tid is usually the __tid + 1 of the output]
>       - chrt -r -p 99 <tid>    [set realtime priority for stress thread]
>       - taskset -p 0x8 <tid>   [Pin the stress thread to the core PMD is 
> running]
>       - PMD thread will be descheduled due to its normal priority and yields
>         core to stress thread.
> 
>       - ovs-appctl keepalive/pmd-health-show   [Display that the thread is 
> GONE]
>       - ./ovsdb/ovsdb-client monitor Open_vSwitch  [Should update the status]
> 
>       - taskset -p 0x10 <tid>  [This brings back pmd thread to life as stress 
> thread
>                                 is moved to idle core]
> 
>       (watch out for stress threads, and carefully pin them to core not to 
> hang your DUTs
>        during tesing).
> 
> v5 -> v6
>   * Remove 2 patches from series
>      - xnanosleep was applied to master as part of high resolution timeout 
> support.
>      - Extend get_process_info() API was also applied to master earlier.
>   * Remove KA_STATE_DOZING as it was initially meant to handle Core C states, 
> not needed
>     for now.
>   * Fixed ka_destroy(), to fix unit test cases 536, 537.
>   * A minor performance degradation(0.5%) is observed with Keepalive enabled.
>     [Tested with loopback case using 1000 IXIA streams/64 byte udp pkts and
>     1 PMD thread(9.239 vs 9.177Mpps) at 10ms ka-interval timeout]
>   * Verified with sparse, MSVC compilers(appveyor).
> 
> v4 -> v5
>   * Add 3 more patches to the series
>      - xnanosleep()
>      - Documentation
>      - Update to NEWS
>   * Remove all references to core_id and instead implemented thread based 
> tracking.
>   * Addressed most of the comments in v4.
> 
> v3 -> v4
>   * Split the functionality in to 2 parts. This patch series only updates
>     PMD status to OVSDB. The incremental patch series to handle false 
> positives,
>     negatives and more checking and stats. 
>   * Remove code from netdev layer and dependency on rte_keepalive lib.
>   * Merged few patches and simplified the patch series.
>   * Timestamp in human readable form.
> 
> v2 -> v3
>   * Rebase.
>   * Verified with dpdk-stable-17.05.1 release.
>   * Fixed build issues with MSVC and cross checked with appveyor.
> 
> v1 -> v2
>   * Rebase
>   * Drop 01/20 Patch "Consolidate process related APIs" of V1 as it
>     is already applied as separate patch.
> 
> RFCv3 -> v1
>   * Made changes to fix failures in some unit test cases.
>   * some more code cleanup w.r.t process related APIs.
> 
> RFCv2 -> RFCv3
>   * Remove POSIX shared memory block implementation (suggested by Aaron).
>   * Rework the logic to register and track threads instead of cores. This way
>     in the future any thread can be registered to KA framework. For now only 
> PMD
>     threads are tracked (suggested by Aaron).
>   * Refactor few APIs and further clean up the code.
>    
> RFCv1 -> RFCv2
>   * Merged the xml and schema commits to later commit where the actual
>     implementation is done(suggested by Ben).
>   * Fix ovs-appctl keepalive/* hang issue when KA disabled.
>   * Fixed memory leaks with appctl commands for keepalive/pmd-health-show,
>     pmd-xstats-show.
>   * Refactor code and fixed APIs dealing with PMD health monitoring.
> 
> 
> Bhanuprakash Bodireddy (8):
>   Keepalive: Add initial keepalive configuration.
>   dpif-netdev: Register packet processing cores to KA framework.
>   dpif-netdev: Enable heartbeats for DPDK datapath.
>   keepalive: Retrieve PMD status periodically.
>   bridge: Update keepalive status in OVSDB.
>   keepalive: Add support to query keepalive status and statistics.
>   Documentation: Update DPDK doc with Keepalive feature.
>   NEWS: Add keepalive support information in NEWS.
> 
>  Documentation/howto/dpdk.rst | 112 +++++++++
>  NEWS                         |   2 +
>  lib/automake.mk              |   2 +
>  lib/dpif-netdev.c            |  92 ++++++++
>  lib/keepalive.c              | 552 
> +++++++++++++++++++++++++++++++++++++++++++
>  lib/keepalive.h              | 109 +++++++++
>  lib/ovs-thread.c             |   6 +
>  lib/ovs-thread.h             |   1 +
>  lib/util.c                   |  22 ++
>  lib/util.h                   |   1 +
>  vswitchd/bridge.c            |  29 +++
>  vswitchd/vswitch.ovsschema   |   8 +-
>  vswitchd/vswitch.xml         |  49 ++++
>  13 files changed, 983 insertions(+), 2 deletions(-)
>  create mode 100644 lib/keepalive.c
>  create mode 100644 lib/keepalive.h
> 

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to