> -----Original Message-----
> From: Jakub Jelinek [mailto:[email protected]]
> Sent: Friday, January 31, 2014 11:04 AM
> To: Iyer, Balaji V
> Cc: [email protected]
> Subject: Re: Issue with _Cilk_for
>
> On Fri, Jan 31, 2014 at 04:42:57PM +0100, Jakub Jelinek wrote:
> > Can you explain why you emit anything in between the parallel and
> _Cilk_for?
> > I don't see why it should be needed.
> > Depending on what the Cilk+ standard allows you to do (can array.begin
> > () be evaluated multiple times or not, and if not, can you invoke say
> > a copy constructor on it), you should just emit an EXPR_STMT that
> > initializes an artificial scalar (say get_temp_regvar created) to
> > (array.end () - array.begin ()) before you add the parallel, or, if
> > array.begin () can't be evaluated multiple times, then construct some
> > temporary before the parallel with array.begin () as initializer and
> > then do the subtraction between array.end () and that temporary.
> >
> > Then the parallel, with _Cilk_for immediately in it and just another
> > temporary scalar as loop iterator there, and only inside of _Cilk_for
> > declare the iterator var and construct (if it has been declared in
> > _Cilk_for, otherwise just initialize it) to, depending on what Cilk+
> > requires, either to array.begin () and then operator+ it to the
> > corresponding value, or construct already to array.begin () + scalariv.
>
> Actually, small correction, the iterator variable likely should be constructed
> before the parallel and just marked as firstprivate on the parallel, then you
> even don't need any special temporary.
>
> So for:
>
> _Cilk_for (vector<int>::iterator iter = array.begin(); iter != array.end();
> iter++)
> {
> if (*iter == 6)
> *iter = 13;
> }
>
> emit basically:
> {
> Iterator iter;
> Iterator::Iterator(&iter, array.begin());
> ptrdiff_t tmpcount = Iterator::operator-(array.end(), &iter);
> ptrdiff_t tmpiv;
> ptrdiff_t tmplast = 0;
> #pragma omp parallel firstprivate(iter, tmpcount, tmplast) private(tmpiv)
> {
> _Cilk_for(tmpiv = 0; tmpiv < tmpcount; tmpiv++)
> {
> Iterator::operator+(&iter, tmpiv - tmplast);
> tmplast = tmpiv;
> {
> if (*iter == 6)
> *iter = 13;
> }
> }
> }
> }
>
I tried to do this. The thing is, you had me model like this:
#pragma omp parallel for
_Cilk_for (...)
{
Body
}
Now, the compiler will do something like this:
#pragma OMP parallel
{
#pragma Omp for
{
_Cilk_for (...)
Body
}
}
Now, if by the time it starts to look at evaluating/breaking up the _Cilk-for's
parts, we are already at the scope inside #pragma omp parallel. If I try to pop
here, it has some scope issues with #pragma omp parallel. This requires a lot
of rewrite of existing OMP code to port it for _Cilk_for. This can be done but
will take significant time which I don't think I have since we are close to end
of stage3.
But, if I had the parallel around the body, then the body is transplanted into
the child function and everything I need for _Cilk-for is done. Yes, it does
not make sense for an OMP expert. I am nowhere close to being an expert and
even I had a hard time to wrap my head around this.
We can look into what you want, but in the meantime, can you accept this patch
with the way I have modelled so that the feature can make it into 4.9?
Thanks,
Balaji V. Iyer.
> Jakub