Hi Konstantina -

On 6/6/15, 11:38 AM, "Panagiotopoulou, Konstantina" <[email protected]> wrote:

>Hi Michael,
>
>About the endCount pointer in ChapelBase.chpl
>
>> I believe that the compiler is adding the endCount pointer as an
>>argument
>> to the 'on' clause function. You can see in ChapelBase.chpl:865 that
>> there is a primitive that lets the module code get the end count that
>> is possibly added by the compiler.
>
>I added some printing to the downEndCount function to see what's going
>on. It now looks like this:
>  proc _downEndCount(e: _EndCount) {
>       extern proc printf(s: c_string, l: int(32));
>       printf ("SUB:: %d :: started \n", chpl_nodeID);
>       if(e!=nil){ 
>               printf ("SUB:: %d :: in  e!=nil \n", chpl_nodeID);
>               e.i.sub(1, memory_order_release);
>               printf ("SUB:: %d :: passed  SUB \n", chpl_nodeID);
>        }else{
>               printf ("SUB:: %d :: e ==nil - dont' do anyhting \n", 
> chpl_nodeID);
>        }
>  }
>
>Briefly, what I am trying to do is this:
>In the chapel program :
>on Locales[1] do begin{ foo(); }
>In the runtime:
>execute foo() on Locale 0  (because locale 1 has failed)
>
>When I migrate foo() on Locale 0 (copying fork_t and args)
>the function call is correct (foo() executes on locale 0) and then the
>_downEndCount function is called.
>Depending on the way I am re-launching the task on the "other" locale I
>get a different result.


In general, I would expect your approach to work. In
multi-locale runs the end count is a wide pointer, so
you should be able to move the task to a different
locale.
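
To illustrate what I mean by a wide pointer, here is a rough sketch
in C. It is a toy model, not the actual runtime code: the type names,
the remote_ops counter, and down_end_count are all made up for
illustration. The point is only that the pointer carries its owning
locale along with the address, so the decrement reaches the same
counter no matter which locale ends up running the task.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an end count: the number of tasks still running. */
typedef struct {
    int64_t outstanding;
} end_count_t;

/* Hypothetical wide pointer: owning locale plus an address valid on that locale. */
typedef struct {
    int32_t      locale;
    end_count_t *addr;
} wide_end_count_t;

/* Counts simulated remote operations, only so the model is observable. */
int remote_ops = 0;

/* Decrement the end count on behalf of a task that ran on locale `here`. */
void down_end_count(int32_t here, wide_end_count_t e) {
    if (e.addr == NULL) {
        return; /* no end count was passed along */
    }
    if (e.locale != here) {
        /* A real runtime would communicate with e.locale here;
           this single-process model just records that it happened. */
        remote_ops++;
    }
    assert(e.addr->outstanding > 0);
    e.addr->outstanding -= 1;
}
```

With an end count owned by locale 0, a call made from locale 1 and a
call made from locale 0 both decrement the same counter; only the call
from locale 1 is counted as remote in this model.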

Could you describe a little bit more how you are redirecting the
function call to the other locale? Are you doing this
entirely from C? (I think so, but I'm not sure)...
Which functions in the C runtime have you modified?

When __primitive("get end count") is used, the compiler
adds an _endCount argument to whatever function it was called
in, and I would not generally expect that to work outside
of the special wrapper functions the compiler creates for tasks.

So, I don't understand
 * why you get e == nil in _downEndCount in the first attempt
 * why you would need to call __primitive("get end count")
   at all
 * or what's going wrong in the ftable_call attempt

I'm not sure that I can help any more without looking specifically
at your modification. Have you already been using --savec and
inspecting the C code? Can you track in the debugger/C code
when the end count argument is correct and when it becomes wrong?
Another debugging approach would be to run the program with and
without the redirection side by side, to see where their behavior
diverges (and then work out exactly where we end up with the
wrong end count).

Do you already have this program working?

on Locales[1] do foo();
In the runtime:
execute foo() on Locale 0 (because locale 1 has failed)

Cheers,

-michael

