Two quick notes: a) Profile first, then optimise.
b) This probably wouldn't make a 4x difference, but in the C++ code you're passing most objects around by reference. In the D version you're passing structs by value.
They are only small, but there's a tight recursive loop to consider...
That said, I don't know the details of D optimisation all that well and it may be a non-issue in release builds.
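To make (b) concrete, here's a rough sketch of the two calling styles on the C++ side. The names (`Vec3`, `sumByValue`, `sumByRef`) are made up for illustration and aren't taken from your code. As far as I know, the D equivalent of the second form would be marking the parameter `ref`.

```cpp
#include <cstddef>

// Hypothetical small struct, standing in for whatever the real code passes around.
struct Vec3 {
    double x, y, z;
};

// By value: every recursive call receives its own copy of v.
double sumByValue(Vec3 v, std::size_t depth) {
    if (depth == 0) return 0.0;
    return v.x + v.y + v.z + sumByValue(v, depth - 1);
}

// By const reference: every call receives only a reference to the caller's object.
double sumByRef(const Vec3& v, std::size_t depth) {
    if (depth == 0) return 0.0;
    return v.x + v.y + v.z + sumByRef(v, depth - 1);
}
```

Both functions return the same result. The only difference is how the argument is passed at each level of recursion, and an optimising build may well remove that difference, so it's worth timing both before changing anything.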
Stewart
