By "construction below", I mean this:
results = SharedArray(Float64, (m, n))
@sync @parallel for i = 1:n
    results[:, i] = complicatedfunction(inputs[i])
end
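
For that to work, complicatedfunction (and anything it calls) has to be defined on every worker before the loop runs. A minimal self-contained sketch, using a toy stand-in for complicatedfunction:

addprocs(3)                                # start worker processes first

@everywhere function complicatedfunction(x)
    # toy stand-in for the real per-column computation
    return fill(Float64(x), 4)
end

m, n = 4, 10
inputs = collect(1:n)
results = SharedArray(Float64, (m, n))     # shared memory, writable by all workers

@sync @parallel for i = 1:n                # @sync blocks until all iterations finish
    results[:, i] = complicatedfunction(inputs[i])
end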
On Saturday, January 30, 2016 at 2:31:40 PM UTC-5, Christopher Alexander
wrote:
>
> I have tried the construction below with no success. In v0.4.3, I end up
> getting a segmentation fault. In the latest v0.5.0, the run time is 3-4x
> that of the non-parallelized version, and the array constructed is vastly
> different from the one produced by the non-parallelized code.
> Below is the C++ code that I am essentially trying to emulate:
>
> void TreeLattice<Impl>::stepback(Size i, const Array& values,
>                                  Array& newValues) const {
>     #pragma omp parallel for
>     for (Size j=0; j<this->impl().size(i); j++) {
>         Real value = 0.0;
>         for (Size l=0; l<n_; l++) {
>             value += this->impl().probability(i,j,l) *
>                      values[this->impl().descendant(i,j,l)];
>         }
>         value *= this->impl().discount(i,j);
>         newValues[j] = value;
>     }
> }
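>
> For reference, this is roughly the Julia translation I have been
> attempting (a sketch: statesize, nbranches, and lattice are placeholder
> names, the probability, descendant, and discount calls stand in for the
> methods on my actual lattice objects, all of which would need @everywhere
> definitions, and indices here are 1-based):
>
> # placeholder names; real methods live in my module and need @everywhere
> newvalues = SharedArray(Float64, statesize)
> @sync @parallel for j = 1:statesize
>     value = 0.0
>     for l = 1:nbranches
>         value += probability(lattice, i, j, l) *
>                  values[descendant(lattice, i, j, l)]
>     end
>     newvalues[j] = value * discount(lattice, i, j)
> end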
>
> The calls to probability, descendant, and discount all end up accessing
> data in other objects, so I tried prefixing those function and type
> definitions with @everywhere. However, that started a long chain of having
> to wrap each file in my module in @everywhere, and there were still errors
> complaining about things not being defined. At this point I am really
> confused about how to construct what would appear to be a rather simple
> parallelized for loop that generates the same results as the
> non-parallelized code. I've pored over both this forum and other
> resources, and nothing has really worked.
>
> Any help would be appreciated.
>
> Thanks!
>
> Chris
>
>
> On Thursday, August 20, 2015 at 4:52:52 AM UTC-4, Nils Gudat wrote:
>>
>> Sebastian, I'm not sure I understand you correctly, but point (1) in your
>> list can usually be taken care of by wrapping all the necessary
>> usings/requires/includes and definitions in an @everywhere begin ... end
>> block.
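>>
>> For example (a sketch; the package, file, and function names here are
>> just placeholders):
>>
>> @everywhere begin
>>     using Distributions           # placeholder package
>>     include("model_code.jl")      # placeholder file with your definitions
>>     helper(x) = 2x                # placeholder function
>> end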
>>
>> Julio, as for your original problem, I think Tim's advice about
>> SharedArrays was perfectly reasonable. Without having looked at your
>> problem in detail, I think you should be able to do something like this
>> (and I also think this gets close enough to what Sebastian was talking
>> about, and to Matlab's parfor, unless I'm completely misunderstanding your
>> problem):
>>
>> nprocs() == CPU_CORES || addprocs(CPU_CORES - 1)
>> results = SharedArray(Float64, (m, n))
>>
>> @sync @parallel for i = 1:n
>>     results[:, i] = complicatedfunction(inputs[i])
>> end
>>
>