> -----Original Message----- > From: Jakub Jelinek [mailto:ja...@redhat.com] > Sent: Friday, January 31, 2014 11:04 AM > To: Iyer, Balaji V > Cc: gcc-patches@gcc.gnu.org > Subject: Re: Issue with _Cilk_for > > On Fri, Jan 31, 2014 at 04:42:57PM +0100, Jakub Jelinek wrote: > > Can you explain why you emit anything in between the parallel and > _Cilk_for? > > I don't see why it should be needed. > > Depending on what the Cilk+ standard allows you to do (can array.begin > > () be evaluated multiple times or not, and if not, can you invoke say > > a copy constructor on it), you should just emit an EXPR_STMT that > > initializes an artificial scalar (say get_temp_regvar created) to > > (array.end () - array.begin ()) before you add the parallel, or, if > > array.begin () can't be evaluated multiple times, then construct some > > temporary before the parallel with array.begin () as initializer and > > then do the subtraction between array.end () and that temporary. > > > > Then the parallel, with _Cilk_for immediately in it and just another > > temporary scalar as loop iterator there, and only inside of _Cilk_for > > declare the iterator var and construct (if it has been declared in > > _Cilk_for, otherwise just initialize it) to, depending on what Cilk+ > > requires, either to array.begin () and then operator+ it to the > > corresponding value, or construct already to array.begin () + scalariv. > > Actually, small correction, the iterator variable likely should be constructed > before the parallel and just marked as firstprivate on the parallel, then you > even don't need any special temporary. > > So for: > > _Cilk_for (vector<int>::iterator iter = array.begin(); iter != array.end(); > iter++) > { > if (*iter == 6) > *iter = 13; > } > > emit basically: > { > Iterator iter; > Iterator::Iterator(&iter, array.begin()); > ptrdiff_t tmpcount = Iterator::operator-(array.end(), &iter); > ptrdiff_t tmpiv; > ptrdiff_t tmplast = 0; > #pragma omp parallel firstprivate(iter, tmpcount, tmplast) private(tmpiv) > { > _Cilk_for(tmpiv = 0; tmpiv < tmpcount; tmpiv++) > { > Iterator::operator+(&iter, tmpiv - tmplast); > tmplast = tmpiv; > { > if (*iter == 6) > *iter = 13; > } > } > } > } >
I tried to do this. The thing is, you had me model like this: #pragma omp parallel for _Cilk_for (...) { Body } Now, the compiler will do something like this: #pragma OMP parallel { #pragma Omp for { _Cilk_for (...) Body } } Now, if by the time it starts to look at evaluating/breaking up the _Cilk-for's parts, we are already at the scope inside #pragma omp parallel. If I try to pop here, it has some scope issues with #pragma omp parallel. This requires a lot of rewrite of existing OMP code to port it for _Cilk_for. This can be done but will take significant time which I don't think I have since we are close to end of stage3. But, if I had the parallel around the body, then the body is transplanted into the child function and everything I need for _Cilk-for is done. Yes, it does not make sense for an OMP expert. I am nowhere close to being an expert and even I had a hard time to wrap my head around this. We can look into what you want, but in the meantime, can you accept this patch with the way I have modelled so that the feature can make it into 4.9? Thanks, Balaji V. Iyer. > Jakub