Hi Michael,

About the endCount pointer in ChapelBase.chpl

> I believe that the compiler is adding the endCount pointer as an argument
> to the 'on' clause function. You can see in ChapelBase.chpl:865 that
> there is a primitive that lets the module code get the end count that
> is possibly added by the compiler.

I added some printing to the downEndCount function to see what's going on. It 
now looks like this: 
  proc _downEndCount(e: _EndCount) {
    // Debug printing only; the message strings match the output shown below.
    extern proc printf(s: c_string, l: int(32));
    printf("SUB:: %d :: started \n", chpl_nodeID);
    if (e != nil) {
      printf("SUB:: %d :: in  e!=nil \n", chpl_nodeID);
      e.i.sub(1, memory_order_release);
      printf("SUB:: %d :: passed  SUB \n", chpl_nodeID);
    } else {
      printf("SUB:: %d :: e ==nil - dont' do anyhting \n", chpl_nodeID);
    }
  }

Briefly, this is what I am trying to do.
In the Chapel program:
on Locales[1] do begin { foo(); }
In the runtime:
execute foo() on Locale 0 (because Locale 1 has failed)

When I migrate foo() to Locale 0 (copying the fork_t and its args),
the function call itself is correct (foo() executes on Locale 0), and then the
_downEndCount function is called.
Depending on how I relaunch the task on the "other" locale, I get a
different result.

A. When using a new thread on Locale 0 with fork_nb_wrapper(..) or 
chpl_task_startMovedTask(..) I get:

(PROGRAM OUTPUT) arg= 1----------on locale 0  //execution of foo() function
(MODULE) SUB:: 0 :: started 
(MODULE) SUB:: 0 ::  e ==nil - dont' do anyhting 

The program executes the relaunched function (and probably decrements the
task counter for this new thread),
but then hangs waiting for the task on Locale 1 to complete, since it has no
pointer to that endCount.
(Actually it hangs much earlier, but this is what I see when I use the
--taskreport flag.)
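
For illustration, the counting protocol behind this hang can be modeled in a few lines of C. This is only a simplified sketch, not the runtime's code: the type end_count_t and the functions up_end_count, down_end_count_or_skip, and all_tasks_done are hypothetical names, and the real wait blocks rather than polls. It shows that when the task is handed a nil endCount, the decrement is skipped and the parent's counter never reaches zero.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an end count: one atomic counter of live tasks. */
typedef struct { atomic_long i; } end_count_t;

/* Parent side: register one more outstanding task. */
void up_end_count(end_count_t *e) {
  atomic_fetch_add_explicit(&e->i, 1, memory_order_release);
}

/* Task side: mirrors the nil check in the instrumented _downEndCount.
 * Returns false (and changes nothing) when no end count was passed in. */
bool down_end_count_or_skip(end_count_t *e) {
  if (e == NULL) return false;
  atomic_fetch_sub_explicit(&e->i, 1, memory_order_release);
  return true;
}

/* Non-blocking form of the condition the parent waits on. */
bool all_tasks_done(end_count_t *e) {
  return atomic_load_explicit(&e->i, memory_order_acquire) == 0;
}
```

In this model, calling down_end_count_or_skip(NULL) after one up_end_count leaves all_tasks_done false, which corresponds to a parent that waits forever.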
Now if I try to get the wide pointer to the endCount of the relaunched task 
using the primitive:
var a = __primitive("get end count"); 
a.i.sub(1, memory_order_release);

the compiler gives me "internal error: number of actuals does not match number 
of formals"
which I realise is coming from compiler/AST/primitive.cpp:
prim_def(PRIM_GET_END_COUNT, "get end count", returnInfoEndCount);

B. When using serial local execution (chpl_ftable_call(..)) I get:
(PROGRAM OUTPUT) arg= 1----------on locale 0  //execution of foo() function
(MODULE) SUB:: 0 :: started 
(MODULE) SUB:: 0 :: in  e!=nil 
(COMM LAYER) 0 chpl_comm_fork::  Loc 0 -> Loc 457826136 
.... 
[0] /usr/bin/gstack 20149

Since the parent of Locale 1 is Locale 0 (here), I would expect the
wide_endCount pointer to point to the endCount in local memory.
Instead it points to 457826136 (corrupted memory, I suppose), which is read
as a locale ID, and a remote sub is attempted on that. Eventually it segfaults.

So I am confused.
Shouldn't Locale 0 be able to read the correct wide_EndCount pointer, since
1. it is copied from the args sent to Locale 1, and
2. the endCount lives in local memory?
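
One way to catch this kind of corruption before it reaches the comm layer might be to validate the locale field of the wide pointer first. The sketch below is an assumption-laden model, not the Chapel runtime's definitions: wide_end_count_t, down_result_t, and checked_down_end_count are hypothetical names, and the only facts taken from this thread are that a wide pointer has a .locale and an .addr field. The remote case is only reported, not performed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the end count and its wide pointer. */
typedef struct { atomic_long i; } end_count_t;

typedef struct {
  int32_t locale;     /* locale that stores the instance */
  end_count_t *addr;  /* address of the instance on that locale */
} wide_end_count_t;

typedef enum { DOWN_LOCAL, DOWN_REMOTE, DOWN_NIL, DOWN_BAD_LOCALE } down_result_t;

/* Decrement only when the wide pointer is plausible and local.
 * A locale outside [0, num_locales) is reported instead of being
 * handed to a remote operation. */
down_result_t checked_down_end_count(wide_end_count_t w,
                                     int32_t here, int32_t num_locales) {
  assert(num_locales > 0 && here >= 0 && here < num_locales);
  if (w.addr == NULL) return DOWN_NIL;
  if (w.locale < 0 || w.locale >= num_locales) return DOWN_BAD_LOCALE;
  if (w.locale != here) return DOWN_REMOTE;  /* remote sub not modeled here */
  atomic_fetch_sub_explicit(&w.addr->i, 1, memory_order_release);
  return DOWN_LOCAL;
}
```

With two locales, a wide pointer whose locale field is 457826136 would be rejected as DOWN_BAD_LOCALE here, which at least points at the copied args rather than at the comm layer.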

--Konstantina

________________________________________
From: Panagiotopoulou, Konstantina [[email protected]]
Sent: 02 June 2015 22:12
To: Michael Ferguson
Cc: [email protected]
Subject: Re: [Chapel-developers] chpl_wide_EndCount

Hi Michael,

I have copied the args and function ID using memcpy to the other locale.
So, I thought the endCount pointer was already part of the args and could be
referenced from Locale X.

I will look into the primitive a bit more.
Thanks.

--Konstantina
________________________________________
From: Michael Ferguson [[email protected]]
Sent: 02 June 2015 21:46
To: Panagiotopoulou, Konstantina
Subject: Re: [Chapel-developers] chpl_wide_EndCount

Hi Konstantina -


endCount is basically used to wait for a task to exit.
More below, but keep in mind I haven't looked closely
at your example or the compiler code for this...

On 6/2/15, 4:23 PM, "Panagiotopoulou, Konstantina" <[email protected]> wrote:

>Hi Michael, Ben,
>
>I am trying this minimal example :
>
>on Locales[1] do begin
> foo();
>
>Locale 0 allocates the endCount, calls add() and waits on 0.
>Locale 1 assigns the endCount pointer to args_foron ->_1_endCount
>and then to the wrapper and eventually calls sub()
>I have 2 questions:
>
>1. The endCount of Locale 1 is a pointer or a copy of the endCount
>allocated initially on Locale 0?

A pointer. Since endCount is a class instance, it's always passed
as a pointer.
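
To make the pointer-versus-copy distinction concrete, here is a small C sketch. It is illustrative only and makes assumptions beyond this thread: on_args_t, copy_args, and finish_task are hypothetical names, and the real argument bundle generated by the compiler will look different. It shows that byte-copying a bundle copies the pointer value, so a decrement through the copy still reaches the original counter.

```c
#include <stdatomic.h>
#include <string.h>

/* Hypothetical end count: one atomic counter. */
typedef struct { atomic_long i; } end_count_t;

/* Hypothetical argument bundle for an 'on' body. */
typedef struct {
  int arg;
  end_count_t *end_count;  /* a pointer to the instance, not an embedded copy */
} on_args_t;

/* Byte-copy the whole bundle, as a relaunch path might do. */
void copy_args(on_args_t *dst, const on_args_t *src) {
  memcpy(dst, src, sizeof *dst);
}

/* Task exit: decrement through whichever bundle the task was handed. */
void finish_task(const on_args_t *args) {
  atomic_fetch_sub_explicit(&args->end_count->i, 1, memory_order_release);
}
```

The copy only stays valid if the full bundle is copied and the pointer is still meaningful where it is dereferenced; a copy that is too short would leave the end_count field holding garbage.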

>
>2. Is there a way to access the endCount (the .addr field) from the
>module code?
>I assume modules know nothing about wide_endCounts or wide pointers in
>general.

I believe that the compiler is adding the endCount pointer as an argument
to the 'on' clause function. You can see in ChapelBase.chpl:865 that
there is a primitive that lets the module code get the end count that
is possibly added by the compiler.

>
>The rationale behind all this is to find out what happens if a migrated
>task is "redirected" to execute on
>another locale X, rather than Locale 1. In this case, I manage to get
>foo() to execute on Locale X and right after that I am getting
>a nil reference, coming from the downCount() function.
>So I am wondering if it is possible to access the endCount from Locale X?

I think it should be. When you move the task to another locale - I'm
guessing that the __primitive("get end count") is returning the
wrong value - nil instead of a wide pointer to the end count allocated
on Locale 0.

It depends presumably on how you migrated that task to another
locale.

-michael

>
>
>
>_______________________________________
>From: Michael Ferguson [[email protected]]
>Sent: 02 June 2015 20:51
>To: Panagiotopoulou, Konstantina; [email protected]
>Subject: Re: [Chapel-developers] chpl_wide_EndCount
>
>Hi -
>
>class _EndCount is declared in ChapelBase.chpl.
>wide_EndCount is a wide pointer - as you
>guessed, .locale is the locale storing the instance.
>.addr is the address of the instance on that locale.
>
>Hope that helps,
>
>-michael
>
>
>On 6/2/15, 2:22 PM, "Panagiotopoulou, Konstantina" <[email protected]> wrote:
>
>>Hello,
>>
>>Could someone point me to the def of wide_endCounts?
>>
>>
>>Looking at C intermediate source I see that there are two attributes
>>
>>
>>.locale and .addr
>>
>>
>>Supposing .locale is the parent locale where the endCount is allocated,
>>what is .addr?
>>
>>Regards,
>>Konstantina
>>________________________________________
>>From: Panagiotopoulou, Konstantina
>>Sent: 02 June 2015 19:20
>>To: [email protected]
>>Subject: chpl_wide_EndCount
>>
>>
>>Hello,
>>
>>Could someone point me to the def of wide_endCounts?
>>
>>
>>Looking at C intermediate source I
>>We invite research leaders and ambitious early career researchers to join
>>us in leading and driving research in key inter-disciplinary themes.
>>Please see www.hw.ac.uk/researchleaders for further information and how
>>to apply.
>>
>>Heriot-Watt University is a Scottish charity registered under charity
>>number SC000278.
>>
>





------------------------------------------------------------------------------
_______________________________________________
Chapel-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/chapel-developers

