> -----Original Message----- > From: David Marchand <david.march...@redhat.com> > Sent: Tuesday, October 13, 2020 8:45 PM > To: Van Haaren, Harry <harry.van.haa...@intel.com> > Cc: dev <dev@dpdk.org>; Lukasz Wojciechowski > <l.wojciec...@partner.samsung.com>; Honnappa Nagarahalli > <honnappa.nagaraha...@arm.com>; Phil Yang <phil.y...@arm.com>; Aaron > Conole <acon...@redhat.com> > Subject: Re: [PATCH v6 2/2] test/service: fix race condition on stopping lcore > > Hello Harry, > > Long time no see :-) > > On Mon, Sep 21, 2020 at 4:51 PM David Marchand > <david.march...@redhat.com> wrote: > > > > On Mon, Sep 14, 2020 at 4:30 PM Harry van Haaren > > <harry.van.haa...@intel.com> wrote: > > > > > > This commit fixes a potential race condition in the tests > > > where the lcore running a service would increment a counter > > > that was already reset by the test-suite thread. The resulting > > > race-condition incremented value could cause CI failures, as > > > indicated by DPDK's CI. > > > > > > This patch fixes the race-condition by making use of the > > > added rte_service_lcore_active() API, which indicates when > > > a service-core is no longer in the service-core polling loop. > > > > > > The unit test makes use of the above function to detect when > > > all statistics increments are done in the service-core thread, > > > and then the unit test continues finalizing and checking state. > > > > > > Fixes: f28f3594ded2 ("service: add attribute API") > > > > > > Reported-by: David Marchand <david.march...@redhat.com> > > > Signed-off-by: Harry van Haaren <harry.van.haa...@intel.com> > > > Reviewed-by: Phil Yang <phil.y...@arm.com> > > > Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com> > > > > We probably need a followup fix for: > https://travis-ci.com/github/DPDK/dpdk/jobs/398954463#L10088 > > > The race is in service_attr_get where we look at/reset spent cycles > while a service lcore is still running. > Quoting this test code: > > rte_service_lcore_stop(slcore_id);
/* TODO: implement wait for slcore_id to stop polling here */ > TEST_ASSERT_EQUAL(0, rte_service_attr_get(id, attr_calls, &attr_value), > "Valid attr_get() call didn't return success"); > TEST_ASSERT_EQUAL(1, (attr_value > 0), > "attr_get() call didn't get call count (zero)"); > > TEST_ASSERT_EQUAL(0, rte_service_attr_reset_all(id), > "Valid attr_reset_all() return success"); > > TEST_ASSERT_EQUAL(0, rte_service_attr_get(id, attr_id, &attr_value), > "Valid attr_get() call didn't return success"); Based on the output you provided ( https://travis-ci.com/github/DPDK/dpdk/jobs/398954463#L10088 ) and the above, indeed it seems that the core may not have stopped yet (race-cond). Will send a patch - thanks for reporting with details, -Harry