26/02/2021 00:33, Honnappa Nagarahalli:
> +Thomas, David, Konstantin for input
> 
> <snip>
> 
> > > Subject: [RFC 3/5] eal: lcore state FINISHED is not required
> > >
> > > FINISHED state seems to be used to indicate that the worker's update
> > > of the 'state' is not visible to other threads. There seems to be no
> > > requirement to have such a state.
> > 
> > I am not sure "FINISHED" is necessary to be removed, and I propose some of 
> > my
> > profiles for discussion.
> >  There are three states for lcore now:
> > "WAIT": indicate lcore can start working
> > "RUNNING": indicate lcore is working
> > "FINISHED": indicate lcore has finished its working and wait to be reset
> If you look at the definitions of "WAIT" and "FINISHED" states, they look 
> similar, except for "wait to be reset" in "FINISHED" state . The code really 
> does not do anything to reset the lcore. It just changes the state to "WAIT".
> 
> > 
> > From the description above, we can find "FINISHED" is different from 
> > "WAIT", it
> > can shows that lcore has done the work and finished it. Thus, if we remove
> > "FINISHED", maybe we will not know whether the lcore finishes its work or 
> > just
> > doesn't start, because this two state has the same tag "WAIT".
> Looking at "eal_thread_loop", the worker thread sets the state to "RUNNING" 
> before sending the ack back to main core. After that it is guaranteed that 
> the worker will run the assigned function. Only case where it will not run 
> the assigned function is when the 'write' syscall fails, in which case it 
> results in a panic.

Quick note: it should not panic.
We must find a way to return an error
without crashing the whole application.

 
> > Furthermore, consider such a scenario:
> > Core 1 need to monitor Core 2 state, if Core 2 finishes one task, Core 1 
> > can start
> > its working.
> > However, if there is only  one tag "WAIT", Core 1 maybe  start its work at 
> > the
> > wrong time, when Core 2 still does not start its task at state "WAIT".
> > This is just my guess, and at present, there is no similar application 
> > scenario in
> > dpdk.
> To be able to do this effectively, core 1 needs to observe the state change 
> from WAIT->RUNNING->FINISHED. This requires that core 1 should be calling 
> rte_eal_remote_launch and rte_eal_wait_lcore functions. It is not possible to 
> observe this state transition from a 3rd core (for ex: a worker might go from 
> RUNNING->FINISHED->WAIT->RUNNING which a 3rd core might not be able to 
> observe).
> 
> > 
> > On the other hand, if we decide to remove "FINISHED", please consider the
> > following files:
> > 1. lib/librte_eal/linux/eal_thread.c: line 31
> >     lib/librte_eal/windows/eal_thread.c: line 22
> >     lib/librte_eal/freebsd/eal_thread.c: line 31
> I have looked at these lines, they do not capture "why" FINISHED state is 
> required.
> 
>  2.
> > lib/librte_eal/include/rte_launch.h: line 24, 44, 121, 123, 131 3. 
> > examples/l2fwd-
> > keepalive/main.c: line 510
> > rte_eal_wait_lcore(id_core) can be removed. Because the core state has been
> > checked as "WAIT", this is a redundant operation



Reply via email to