On Friday, 4 December 2015 at 19:31:22 UTC, anonymous wrote:
Why did you expect the C++ inspired version to be faster? Just
because the original was written in C++?
From a quick skim the two versions seem to be pretty much
identical, aside from names and struct vs class.
Names don't make a difference of course. It would be easier to
compare the two versions if you used the same names, though.
The differences between structs on the heap and classes are
subtle. It's not surprising that they don't have an impact here.
If there are substantial differences between the two versions,
please point them out.
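For reference, a minimal sketch of the struct vs class difference discussed here. The names NodeS and NodeC are hypothetical and not taken from the original program; the point is only that both end up as heap allocations reached through a pointer or reference, with the class instance carrying some extra hidden fields.

```d
import std.stdio : writeln;

// Hypothetical node types, for illustration only.
struct NodeS { NodeS* left, right; }  // value type; `new` yields a NodeS*
class NodeC { NodeC left, right; }    // reference type; instance has hidden header fields

void main()
{
    NodeS* s = new NodeS;  // GC-allocated struct, accessed through a pointer
    NodeC c = new NodeC;   // GC-allocated class instance, accessed through a reference

    // Size of the struct itself vs. size of the class instance (not the reference).
    writeln("struct size:         ", NodeS.sizeof);
    writeln("class instance size: ", __traits(classInstanceSize, NodeC));
}
```

Both variants allocate once per node, so for an allocation-heavy tree benchmark it is plausible that allocation cost dominates and the few extra bytes per class instance do not show up in the timings.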
Yes, I missed this, sorry. The main part of the question was
probably about the difference between class and struct. I thought
working with structs and pointers would be faster than with
classes.
2. auto depthind = iota(min_depth, max_depth+1, 2);
foreach(dummy_i, d; taskPool.parallel(depthind))
Works for me. Maybe show the exact full program you tried, and
tell what compiler version you're using.
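A self-contained sketch of the indexed parallel foreach over an iota range, in case it helps to compare against the failing program. The helper labelDepths and the depth values are assumptions made up for this example; only the iota/taskPool.parallel pattern comes from the snippet above.

```d
import std.conv : text;
import std.parallelism : taskPool;
import std.range : iota;
import std.stdio : writeln;

// Hypothetical helper: builds one label per depth, filled in parallel.
string[] labelDepths(int min_depth, int max_depth)
{
    auto depthind = iota(min_depth, max_depth + 1, 2);
    auto results = new string[depthind.length];

    // i is the position in the range, d the depth value.
    // Each iteration writes only to its own slot, so no locking is needed.
    foreach (i, d; taskPool.parallel(depthind))
        results[i] = text("depth ", d);

    return results;
}

void main()
{
    // Depth bounds are assumed values for illustration.
    foreach (line; labelDepths(4, 12))
        writeln(line);
}
```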
Ok, this was strange, but I found the crux. The correct question
is:
Why is the parallel version slower than the sequential one?
If you set
int n = 14 in the main function
the parallel version is MUCH slower than the sequential one. On my
machine it is 7x slower. Shouldn't it be the other way round?
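One way to narrow this down is to time both loops in the same program. The sketch below is only a measuring harness: work is a hypothetical, allocation-free stand-in for the real per-depth computation, and the depth range is an assumed value. If a CPU-only body like this scales well while the real program does not, the slowdown may come from something the real body does, for example contention on garbage-collector allocations, but that is a guess and not something established in this thread.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.parallelism : taskPool;
import std.range : iota;
import std.stdio : writefln;

// Hypothetical stand-in for the per-depth work; pure arithmetic, no allocation.
long work(int d)
{
    long sum = 0;
    foreach (long k; 0 .. 1L << d)
        sum += k & 1;
    return sum;
}

void main()
{
    auto depths = iota(4, 25, 2);  // assumed depth range
    auto results = new long[depths.length];

    auto sw = StopWatch(AutoStart.yes);
    foreach (i; 0 .. depths.length)
        results[i] = work(depths[i]);
    writefln("sequential: %s ms", sw.peek.total!"msecs");

    sw.reset();  // restarts the measurement; the stopwatch keeps running
    // Work unit size 1: each depth becomes its own task.
    foreach (i, d; taskPool.parallel(depths, 1))
        results[i] = work(d);
    writefln("parallel:   %s ms", sw.peek.total!"msecs");
}
```

The second argument to taskPool.parallel is the work unit size. With only a handful of depths of very unequal cost, a size of 1 lets the pool hand out each depth separately instead of grouping several into one task.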
3. The compilation was done by:
dmd -O -release -boundscheck=off [filename.d]
Is there anything else to improve performance significantly?
The other compilers, ldc and gdc, usually produce faster code
than dmd.
Thanks for the hint!
As ldc doesn't have the experimental modules I import, I
compared using the first version. The result: the program compiled
with ldc2, with the same parameters, was 5% slower. Nothing crucial, I
think.
Just reiterating what I said re the first question: I don't
really see a difference. If you think there is, please point it
out. Or if you're not sure, feel free to ask about specific
parts.
Yeah... so the answer here for me is that I can stay with my way
of thinking in C# style. :)