On Thu, Aug 4, 2016 at 10:59 AM, Mike Holmes <[email protected]> wrote:
>
> On 4 August 2016 at 11:47, Bill Fischofer <[email protected]> wrote:
>
>> On Thu, Aug 4, 2016 at 10:36 AM, Mike Holmes <[email protected]> wrote:
>>
>>> On my vanilla x86 I don't get any issues. I'm keen to get this in and
>>> have CI run it on lots of HW to see what happens. Many of the other
>>> tests completely fail in process mode, so I think we will expose a lot
>>> as we add them.
>>>
>>> On 4 August 2016 at 11:33, Bill Fischofer <[email protected]> wrote:
>>>
>>>> On Thu, Aug 4, 2016 at 10:26 AM, Brian Brooks <[email protected]> wrote:
>>>>
>>>>> Reviewed-by: Brian Brooks <[email protected]>
>>>>>
>>>>> On 08/04 09:18:14, Mike Holmes wrote:
>>>>> > +ret=0
>>>>> > +
>>>>> > +run()
>>>>> > +{
>>>>> > +	echo odp_scheduling_run_proc starts with $1 worker threads
>>>>> > +	echo =====================================================
>>>>> > +
>>>>> > +	$PERFORMANCE/odp_scheduling${EXEEXT} --odph_proc -c $1 || ret=1
>>>>> > +}
>>>>> > +
>>>>> > +run 1
>>>>> > +run 8
>>>>> > +
>>>>> > +exit $ret
>>>>>
>>>>> Seeing this randomly in both multithread and multiprocess modes:
>>>>>
>>>>
>>>> Before or after you apply this patch? What environment are you seeing
>>>> these errors in? They should definitely not be happening.
>>>>
>>>>>
>>>>> ../../../odp/platform/linux-generic/odp_queue.c:328:odp_queue_destroy():queue "sched_00_07" not empty
>>>>> ../../../odp/platform/linux-generic/odp_schedule.c:271:schedule_term_global():Queue not empty
>>>>> ../../../odp/platform/linux-generic/odp_schedule.c:294:schedule_term_global():Pool destroy fail.
>>>>> ../../../odp/platform/linux-generic/odp_init.c:188:_odp_term_global():ODP schedule term failed.
>>>>> ../../../odp/platform/linux-generic/odp_queue.c:170:odp_queue_term_global():Not destroyed queue: sched_00_07
>>>>> ../../../odp/platform/linux-generic/odp_init.c:195:_odp_term_global():ODP queue term failed.
>>>>> ../../../odp/platform/linux-generic/odp_pool.c:149:odp_pool_term_global():Not destroyed pool: odp_sched_pool
>>>>> ../../../odp/platform/linux-generic/odp_pool.c:149:odp_pool_term_global():Not destroyed pool: msg_pool
>>>>> ../../../odp/platform/linux-generic/odp_init.c:202:_odp_term_global():ODP buffer pool term failed.
>>>>> ~/odp_incoming/odp_build/test/common_plat/performance$ echo $?
>>>>> 0
>>>>>
>>>>> Potentially two items: one for correctly returning the failure code, and
>>>>> another related to teardown. Both beyond the scope of this patch which
>>>>> LGTM.
>>>>>
>>
>> Looks like we have a real issue that somehow crept into master. I can
>> sporadically reproduce these same errors on my x86 system. It looks like
>> this is also present in the monarch_lts branch.
>>
>
> I think that we agreed that Monarch would not support Process mode because
> we never tested for it, but for TgrM we need to start fixing it.
>

Unfortunately the issue Brian identified has nothing to do with process
mode. This happens in regular pthread mode on all levels past v1.10.0.0
as far as I can see.

>
> Mike
>
> --
> Mike Holmes
> Technical Manager - Linaro Networking Group
> Linaro.org <http://www.linaro.org/> *│ *Open source software for ARM SoCs
> "Work should be fun and collaborative, the rest follows"
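
To make Brian's first item concrete: the run() wrapper in the patch only
notices a non-zero exit status, so a binary that logs teardown errors but
still exits 0 (as in the "echo $?" output quoted above) passes the script.
The sketch below is illustrative only. quiet_failure and loud_failure are
hypothetical stand-ins for the test binary, not ODP code, and run() is
simplified to take the command as its arguments.

```shell
#!/bin/sh
# Hypothetical stand-ins for the test binary (assumptions, not ODP code).
# quiet_failure logs an error but exits 0, like the run quoted above.
quiet_failure() { echo "ODP schedule term failed." >&2; return 0; }
# loud_failure logs the same error and also exits non-zero.
loud_failure()  { echo "ODP schedule term failed." >&2; return 1; }

ret=0

# Same shape as run() in the patch: a non-zero exit status latches ret to 1.
run()
{
	"$@" || ret=1
}

run quiet_failure
echo "after quiet_failure: ret=$ret"   # ret is still 0, the error goes unnoticed

run loud_failure
echo "after loud_failure: ret=$ret"    # ret is now 1, the script would fail
```

So the script's status handling looks fine as written, but it can only
report what the binary returns. Propagating the term failure into the exit
code would have to happen in the test program itself.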
