With this change and CFS in effect, it effectively means that the DPDK control threads need to be on different cores than the PMD threads, or the response latency may be too long for their control work. Have we tested having the control threads on the same CPU as a PMD thread running at nice -20?
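To make the question concrete, the overlap case could be set up roughly like this (a sketch only: the 0x4 masks are illustrative, and timing `ovs-vsctl list` is just one crude way to observe control-path responsiveness on the shared core):

```shell
# Deliberately overlap the masks so the DPDK control (lcore) threads and a
# PMD thread share core 2 -- the configuration the new doc note advises against.
ovs-vsctl set Open_vSwitch . other_config:dpdk-lcore-mask=0x4
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x4

# Verify the PMD thread is pinned to core 2 and running at nice -20.
ps -eLo comm,policy,psr,nice | grep pmd

# Rough probe of control-plane latency while the PMD spins at nice -20.
time ovs-vsctl list Open_vSwitch >/dev/null
```

If the -20 PMD effectively starves the lcore threads in this setup, that would confirm the concern above.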
I see the comment is added below:

+ It is recommended that the OVS control thread and pmd thread shouldn't be
+ pinned to the same core i.e 'dpdk-lcore-mask' and 'pmd-cpu-mask' cpu mask
+ settings should be non-overlapping.

I understand that other heavy threads would be a problem for PMD threads, and we want to encourage these to be on different cores in the situation where a pmd-cpu-mask is in use. However, here we are almost shutting down other threads by default on the same core as the PMD threads using -20 nice, even those with little CPU load that just need a reasonable latency. Will this aggravate the argument from some quarters that using DPDK requires too much CPU reservation?

On 6/25/17, 3:06 PM, "[email protected] on behalf of Bhanuprakash Bodireddy" <[email protected] on behalf of [email protected]> wrote:

Increase the DPDK pmd thread scheduling priority by lowering the nice
value. This will advise the kernel scheduler to prioritize the pmd
thread over other processes and will help the PMD to provide
deterministic performance in out-of-the-box deployments.

This patch sets the nice value of PMD threads to '-20'.

    $ ps -eLo comm,policy,psr,nice | grep pmd
    COMMAND  POLICY  PROCESSOR  NICE
    pmd62    TS      3          -20
    pmd63    TS      0          -20
    pmd64    TS      1          -20
    pmd65    TS      2          -20

Signed-off-by: Bhanuprakash Bodireddy <[email protected]>
Tested-by: Billy O'Mahony <[email protected]>
Acked-by: Billy O'Mahony <[email protected]>
---
v9->v10:
* Return error code if setpriority fails.

v8->v9:
* Rebase

v7->v8:
* Rebase
* Update the documentation file @Documentation/intro/install/dpdk-advanced.rst

v6->v7:
* Remove realtime scheduling policy logic.
* Increase pmd thread scheduling priority by lowering nice value to -20.
* Update doc accordingly.

v5->v6:
* Prohibit spawning pmd thread on the lowest core in dpdk-lcore-mask if
  lcore-mask and pmd-mask affinity are identical.
* Updated Note section in INSTALL.DPDK-ADVANCED doc.
* Tested below cases to verify system stability with pmd priority patch

v4->v5:
* Reword Note section in DPDK-ADVANCED.md

v3->v4:
* Document update
* Use ovs_strerror for reporting errors in lib-numa.c

v2->v3:
* Move set_priority() function to lib/ovs-numa.c
* Apply realtime scheduling policy and priority to pmd thread only if
  pmd-cpu-mask is passed.
* Update INSTALL.DPDK-ADVANCED.

v1->v2:
* Removed #ifdef and introduced dummy function "pmd_thread_setpriority"
  in netdev-dpdk.h
* Rebase

 Documentation/intro/install/dpdk.rst |  8 +++++++-
 lib/dpif-netdev.c                    |  4 ++++
 lib/ovs-numa.c                       | 22 ++++++++++++++++++++++
 lib/ovs-numa.h                       |  1 +
 4 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst
index e83f852..b5c26ba 100644
--- a/Documentation/intro/install/dpdk.rst
+++ b/Documentation/intro/install/dpdk.rst
@@ -453,7 +453,8 @@ affinitized accordingly.
   to be affinitized to isolated cores for optimum performance.
 
   By setting a bit in the mask, a pmd thread is created and pinned to the
-  corresponding CPU core. e.g. to run a pmd thread on core 2::
+  corresponding CPU core with nice value set to -20.
+  e.g. to run a pmd thread on core 2::
 
     $ ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x4
 
@@ -493,6 +494,11 @@ improvements as there will be more total CPU occupancy available::
 
     NIC port0 <-> OVS <-> VM <-> OVS <-> NIC port 1
 
+  .. note::
+    It is recommended that the OVS control thread and pmd thread shouldn't be
+    pinned to the same core i.e 'dpdk-lcore-mask' and 'pmd-cpu-mask' cpu mask
+    settings should be non-overlapping.
+
 DPDK Physical Port Rx Queues
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 4e29085..e952cf9 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -3712,6 +3712,10 @@ pmd_thread_main(void *f_)
     ovs_numa_thread_setaffinity_core(pmd->core_id);
     dpdk_set_lcore_id(pmd->core_id);
     poll_cnt = pmd_load_queues_and_ports(pmd, &poll_list);
+
+    /* Set pmd thread's nice value to -20 */
+#define MIN_NICE -20
+    ovs_numa_thread_setpriority(MIN_NICE);
 
 reload:
     emc_cache_init(&pmd->flow_cache);
diff --git a/lib/ovs-numa.c b/lib/ovs-numa.c
index 98e97cb..9cf6bd4 100644
--- a/lib/ovs-numa.c
+++ b/lib/ovs-numa.c
@@ -23,6 +23,7 @@
 #include <dirent.h>
 #include <stddef.h>
 #include <string.h>
+#include <sys/resource.h>
 #include <sys/types.h>
 #include <unistd.h>
 #endif /* __linux__ */
@@ -570,3 +571,24 @@ int ovs_numa_thread_setaffinity_core(unsigned core_id OVS_UNUSED)
     return EOPNOTSUPP;
 #endif /* __linux__ */
 }
+
+int
+ovs_numa_thread_setpriority(int nice OVS_UNUSED)
+{
+    if (dummy_numa) {
+        return 0;
+    }
+
+#ifndef _WIN32
+    int err;
+    err = setpriority(PRIO_PROCESS, 0, nice);
+    if (err) {
+        VLOG_ERR("Thread priority error %s", ovs_strerror(err));
+        return err;
+    }
+
+    return 0;
+#else
+    return EOPNOTSUPP;
+#endif
+}
diff --git a/lib/ovs-numa.h b/lib/ovs-numa.h
index 6946cdc..e132483 100644
--- a/lib/ovs-numa.h
+++ b/lib/ovs-numa.h
@@ -62,6 +62,7 @@ bool ovs_numa_dump_contains_core(const struct ovs_numa_dump *,
 size_t ovs_numa_dump_count(const struct ovs_numa_dump *);
 void ovs_numa_dump_destroy(struct ovs_numa_dump *);
 int ovs_numa_thread_setaffinity_core(unsigned core_id);
+int ovs_numa_thread_setpriority(int nice);
 
 #define FOR_EACH_CORE_ON_DUMP(ITER, DUMP)                    \
     HMAP_FOR_EACH((ITER), hmap_node, &(DUMP)->cores)
-- 
2.4.11

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
