Hi Konstantina -
On 6/6/15, 11:38 AM, "Panagiotopoulou, Konstantina" <[email protected]>
wrote:

>Hi Michael,
>
>About the endCount pointer in ChapelBase.chpl
>
>> I believe that the compiler is adding the endCount pointer as an
>> argument to the 'on' clause function. You can see in
>> ChapelBase.chpl:865 that there is a primitive that lets the module
>> code get the end count that is possibly added by the compiler.
>
>I added some printing to the downEndCount function to see what's going
>on. It now looks like this:
>  proc _downEndCount(e: _EndCount) {
>    extern proc printf(s: c_string, l: int(32));
>    printf ("SUB:: %d :: started \n", chpl_nodeID);
>    if(e!=nil){
>      printf ("SUB:: %d :: in e!=nil \n", chpl_nodeID);
>      e.i.sub(1, memory_order_release);
>      printf ("SUB:: %d :: passed SUB \n", chpl_nodeID);
>    }else{
>      printf ("SUB:: %d :: e ==nil - dont' do anyhting \n",
>              chpl_nodeID);
>    }
>  }
>
>Briefly, what I am trying to do is this:
>In the chapel program :
>on Locales[1] do begin{ foo(); }
>In the runtime:
>execute foo() on Locale 0 (because locale 1 has failed)
>
>When I migrate foo() on Locale 0 (copying fork_t and args)
>the function call is correct (foo() executes on locale 0) and then the
>_downEndCount function is called.
>Depending on the way I am re-launching the task on the "other" locale I
>get a different result.

Generally - I would expect your approach to work. The end count is
a wide pointer in multi-locale runs, so you should be able to
move the task to a different locale....

Could you describe a little bit more how you are redirecting the
function call to the other locale? Are you doing this entirely from C?
(I think so, but I'm not sure)... Which functions in the C runtime
have you modified?

With uses of __primitive("get end count") - the compiler adds
an _endCount argument to whatever function it was called in...
and I would not generally expect that to work outside of the
special wrapper functions the compiler creates for tasks.
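To make the terminology concrete, here is a minimal C sketch of the
end-count idea as described above: the parent bumps a counter before
launching a task, and the task decrements it (with release ordering,
matching the e.i.sub(1, memory_order_release) call in the quoted code)
when it finishes. This is only a simplified model, not the actual Chapel
module or runtime code. The names end_count_t, up_end_count,
down_end_count and end_count_done are hypothetical, and the sketch
ignores the wide-pointer (multi-locale) part entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for _EndCount: one atomic counter of
   outstanding tasks. Not the real Chapel data structure. */
typedef struct {
    atomic_int i;
} end_count_t;

static void end_count_init(end_count_t *e) {
    atomic_init(&e->i, 0);
}

/* Parent side: record one more outstanding task before launching it. */
static void up_end_count(end_count_t *e) {
    assert(e != NULL);
    atomic_fetch_add_explicit(&e->i, 1, memory_order_relaxed);
}

/* Task side: signal completion. Returns false when no end count was
   passed in (the "e == nil" branch in the quoted debugging code). */
static bool down_end_count(end_count_t *e) {
    if (e == NULL) {
        return false;
    }
    atomic_fetch_sub_explicit(&e->i, 1, memory_order_release);
    return true;
}

/* Waiter side: true once every task that was counted has finished. */
static bool end_count_done(end_count_t *e) {
    assert(e != NULL);
    return atomic_load_explicit(&e->i, memory_order_acquire) == 0;
}
```

In this model, a task that is re-launched somewhere else still has to
be handed the same end_count_t pointer; if it receives NULL instead,
the decrement is skipped and a waiter would never see the count reach
zero.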
So, I don't understand
 * why you get e=nil in downEndCount in the first attempt
 * why you would need to call __primitive("get end count") at all
 * or what's going wrong in the ftable_call attempt

I'm not sure that I can help any more without looking specifically
at your modification. Have you already been using --savec and
inspecting the C code? Can you track in the debugger/C code when
the end count argument is correct and when it becomes wrong?

Another debugging approach would be to run a program without the
redirection and one with it at the same time, to try to understand
where they diverge in behavior (and then figure out where exactly
we're ending up with the wrong end count).

Do you already have this program working:

  on Locales[1] do foo();

In the runtime:
execute foo() on Locale 0 (because locale 1 has failed)

?

Cheers,

-michael
