Re: [Pvfs2-developers] terminating state machines

Walter B. Ligon III Wed, 26 Jul 2006 14:07:04 -0700


Phil Carns wrote:

Walter B. Ligon III wrote:
OK, guys, I have another issue I want input on. When child SMs terminate they have to notify their parent. The parent has to wait for all the children to terminate. So I've been thinking to use the job subsystem for this: the parent would post a job to wait for N children, and each child would post a job, the last one releasing the parent.
Now I see two ways to implement this - one is to implement this directly in the state machine code. The parent simply stops running (because it does not schedule a job yet returns DEFERRED). Each child decrements a counter, and when it hits 0 the parent is restarted. This is a little ugly because the waiting parent is not being held on any list or queue (up to now all waiting SMs are in the job subsystem), also the last terminating child becomes the parent as it starts executing the parent code. Things can get weird when one SM starts children that start children, and so on.
Now the other way to implement this is with the job subsystem as I suggested above. Much cleaner except for one thing: up to now the state machine subsystem has had no dependency at all on the job subsystem. If we do it this way, this function only works with the job system intact. I'd prefer not to do this, but it does seem the cleanest, most logical means.
I like the job approach. I guess this is an extra dependency because the sms would be calling these particular job functions implicitly, rather than relying on the state functions to handle those posts and releases? We definitely haven't done that before, but at least in this case the job function that the sm infrastructure would be depending on is the simplest one in the arsenal :) It shouldn't be hard for someone to reimplement that particular functionality if they wanted to use the state machine mechanism in another project.
If you weren't planning on these job calls to be implicit, then I'm not sure where the extra dependency is: we already use jobs to trigger all of the other "normal" transitions.
This reminded me of a question, though: is there going to be a standard mechanism for the children to report each of their independent error codes to the parent sm? Or do the children need to just keep a reference to the parent sm structure and manually fill in an array or something?
I guess I have a broader question of how data that the children generate (like a handle value or an attr structure) gets transferred to the parent. Does the parent copy this stuff from the child after the child finishes, or does the child copy it to the parent before it exits? I think we talked about this before at some point but I forgot what the plan is. It would be nice if we made the developer define macros or something to dictate which input parameters need to be filled in when invoking a child and which output parameters can be retrieved when it finishes. Otherwise it starts getting tricky to remember what fields need to be set in the sm structure before kicking something off.

Phil, first your questions: The parent will push a "frame" onto a stack for each child it is starting. A frame is everything that used to be in either a s_op or sm_p on the server or client, except for the stuff that actually runs the SM (now in an smcb). The parent can pass in anything it wants by filling in the fields appropriately. When each child runs that struct will appear to be its "current" frame. Each child can leave that frame in any condition it wants, with any values of buffers the child wants to leave for the parent. After the children are done the parent can pop each frame off the stack and do what it wants with it. Thus there is plenty of flexibility on how you want to handle passing things in and out, all under control of the server or client code.
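
To make the frame idea concrete, here is a minimal sketch in C. It is not the PVFS2 code: the struct layouts, field names, MAX_FRAMES and the functions frame_push(), frame_pop() and child_run() are all hypothetical, made up for illustration. It only shows the flow described above: the parent fills in a frame and pushes it, the child treats that frame as its current frame and leaves results in it, and the parent pops the frame afterwards to read the results and the error code.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FRAMES 8   /* hypothetical fixed limit, just for this sketch */

/* Hypothetical frame: the per-operation data, kept apart from the
 * control block that actually runs the state machine. */
struct frame {
    int in_value;    /* input the parent fills in before starting the child */
    int out_value;   /* result the child leaves behind for the parent */
    int error_code;  /* this child's own error code */
};

/* Hypothetical control block (the "smcb" role): it only tracks frames here. */
struct smcb {
    struct frame *stack[MAX_FRAMES];
    int depth;
};

/* Parent side: push one frame per child. Returns -1 if the stack is full. */
int frame_push(struct smcb *cb, struct frame *f)
{
    if (cb->depth >= MAX_FRAMES)
        return -1;
    cb->stack[cb->depth++] = f;
    return 0;
}

/* Parent side: pop the most recently pushed frame after the children finish. */
struct frame *frame_pop(struct smcb *cb)
{
    assert(cb->depth > 0);
    return cb->stack[--cb->depth];
}

/* Stand-in for a child SM: it sees exactly one frame as its current frame
 * and may leave it in any condition it wants. */
void child_run(struct frame *cur)
{
    cur->out_value = cur->in_value * 2;
    cur->error_code = 0;
}
```

With this shape, per-child error codes need no separate reporting mechanism: each one travels back in the frame the parent pops.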

As for providing macros for setting up and tearing down frames, we can certainly do that. I'm not sure how much that really helps, but we can do it.

Now, an implementation question - one approach to this job/counter thing is to have two job calls, one for the parent, and one for the children. Another approach is for the parent to simply set a counter and not call anything. The children come along, decrement the count, and if zero, call job_null() to awaken the parent. Requires no modification in the job layer, minimizes dependency. What do you think? Should the job layer have more of a role, or keep it minimal?
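
The second approach could look roughly like the sketch below. Everything in it is hypothetical (struct parent_sm, parent_start_children(), child_terminated(), mark_woken()), and the wake callback is only a stand-in for the point where a child would call job_null(); the real job_null() arguments are deliberately not reproduced here. The sketch also assumes the children terminate on a single thread, so the counter would need a lock or an atomic decrement otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parent record for this sketch. */
struct parent_sm {
    int children_running;                  /* set by the parent before it defers */
    void (*wake)(struct parent_sm *self);  /* stand-in for the job_null() wakeup */
    int woken;                             /* counts wakeups, for illustration */
};

/* Parent side: record how many children were started, then return
 * DEFERRED without posting anything itself. */
void parent_start_children(struct parent_sm *p, int n,
                           void (*wake)(struct parent_sm *))
{
    p->children_running = n;
    p->wake = wake;
    p->woken = 0;
}

/* Child side: called once by each child as it terminates.
 * Returns 1 if this call released the parent, 0 otherwise. */
int child_terminated(struct parent_sm *p)
{
    assert(p->children_running > 0);
    if (--p->children_running == 0) {
        p->wake(p);   /* only the last child wakes the parent */
        return 1;
    }
    return 0;
}

/* Trivial wake callback used to observe the wakeup. */
void mark_woken(struct parent_sm *p)
{
    p->woken++;
}
```

Because the wakeup goes through a callback rather than a direct call, the last child does not start executing the parent's code itself; it only asks for the parent to be rescheduled.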


Walt

--
Dr. Walter B. Ligon III
Associate Professor
ECE Department
Clemson University
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
