Replying to @GordonBGood and @mratsim:

> > There is no reason to have such a high overhead especially in high 
> > performance computing. OpenMP is way way way lower.
> 
> I agree, and I'm still not 100% sure where all the time is going

Are you able to share the benchmark you were using with us (as well as 
information like back-end compiler and version, processor, etc)? In our 
experiments, Chapel has generally been shown to be competitive with OpenMP, so it 
would be interesting for us to understand better what you were doing (prior to 
resorting to a homegrown thread pool) in order to make sure nothing's going 
horribly awry. I'd also be curious whether you were using CHPL_TASKS=qthreads 
or CHPL_TASKS=fifo. Thanks.

> I expect [Chapel's data parallelism] is similar to CoArray Fortran

Chapel's data parallelism is significantly different from Co-Array Fortran's, 
which takes an array-of-arrays approach to distributed arrays. In contrast, 
Chapel's data parallelism is based on global-view domains (index sets) and 
arrays, which are an evolution of concepts that were pioneered by ZPL in the 
1990's.
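To make the contrast concrete, here is a minimal sketch of the global-view style (a sketch only; it uses the `blockDist` factory from recent Chapel releases, and the problem size is arbitrary):

```chapel
use BlockDist;

config const n = 1000;

// One logical 2D index set (domain), block-distributed across the locales
// (compute nodes) the program is running on.
const D = blockDist.createDomain({1..n, 1..n});

// One logical array declared over that domain; Chapel maps its elements
// to the locales, so there is no per-image/co-array bookkeeping.
var A: [D] real;

// Indexing is global: the same (i, j) works regardless of which locale
// owns the element.
forall (i, j) in D do
  A[i, j] = i + j / 1000.0;
```

The key difference from the Co-Array Fortran style is that the programmer writes against the single global index set `D` rather than against each node's local slab plus explicit co-indices.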

By default, most data parallelism in Chapel is implemented using #cores tasks, 
where cores is the number of processor cores across which the index set or 
array is distributed. This ensures that the computational granularity is based on 
the hardware parallelism rather than the data collection (though programmers 
can override these defaults if they want something finer/coarser).
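As a rough sketch of what that looks like in practice (the `dataPar*` names are Chapel's actual config constants; the size here is arbitrary):

```chapel
config const n = 1_000_000;
var A: [1..n] real;

// By default this forall uses roughly one task per core on the locale,
// independent of n -- granularity follows the hardware, not the data size.
forall i in 1..n do
  A[i] = i: real;
```

The defaults can be overridden without recompiling, e.g. at execution time via
`--dataParTasksPerLocale=4` or `--dataParMinGranularity=1024`, to request
coarser or finer task granularity.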

> I can see that it could be of interest if one had access to such a machine or 
> group of machines, which most of us probably never will.

For those not in the market for a Cray or commodity cluster, I suspect AWS, 
Azure, and Google Cloud would be happy to offer you such access for a 
reasonable fee. :)
