On Friday, 14 March 2014 at 06:21:27 UTC, Manu wrote:
So, I'm constantly running into issues with not having control over inline. I've run into it again doing experiments in preparation for my dconf talk...

I have identified 2 cases which come up regularly:
1. A function that should always be inline unconditionally (std.simd is
effectively blocked on this)
2. A particular invocation of a function should be inlined for this call
only

The first case is just about having control over code gen. Some functions should effectively be macros or pseudo-intrinsics (i.e., intrinsic wrappers in std.simd, beauty wrappers around asm code, etc.), and I don't ever want to see a symbol appear in the binary.

My suggestion is introduction of __forceinline or something like it. We
need this.


The second case is interesting, and I've found it comes up a few times on
different occasions.
In my current instance, I'm trying to build a generic framework to perform efficient composable data processing, and a basic requirement is that the components are inlined, such that the optimiser can interleave the work properly.

Let's imagine I have a template which implements a work loop, which wants to call a bunch of work elements it receives by alias. The issue is, each of those must be inlined, for this call instance only, and there's no way
to do this.
I'm gonna draw the line at stringified code to use with mixin; I hate that, and I don't want to encourage use of mixin or stringified code in user-facing APIs as a matter of practice. Also, some of these work elements might be useful functions in their own right, which means they can indeed be functions existing somewhere else that shouldn't themselves be attributed as __forceinline.
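
A minimal sketch of the kind of work loop described above, for illustration only. The names `workLoop`, `square` and `addTen` are hypothetical and do not come from any real library; the work elements are ordinary functions passed by alias.

```d
import std.stdio;

// Hypothetical work elements: ordinary functions that are useful in
// their own right and so should not be marked force-inline themselves.
int square(int x) { return x * x; }
int addTen(int x) { return x + 10; }

// Hypothetical work loop: receives its stages by alias and applies
// them, in order, to every element of the data in place.
void workLoop(Stages...)(int[] data)
{
    foreach (ref x; data)
    {
        // Iterating over an alias sequence is unrolled at compile time,
        // so each stage call appears directly in the loop body.
        foreach (stage; Stages)
            x = stage(x);
    }
}

void main()
{
    auto data = [1, 2, 3];
    workLoop!(square, addTen)(data);
    writeln(data); // [11, 14, 19]
}
```

Nothing in this sketch forces `square` or `addTen` to be inlined into the loop body. Whether that happens is left entirely to the optimiser, which is exactly the lack of control at issue.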

What are the current options to force that some code is inlined?

My feeling is that an ideal solution would be something like an enhancement which would allow the 'mixin' keyword to be used with regular function
calls.
What this would do is 'mix in' the function call at this location, i.e., effectively inline that particular call, and it leverages a keyword and concept that we already have. It would obviously produce a compile error if the code is not available.

I quite like this idea, but there is a potential syntactical problem; how
to assign the return value?

int func(int y) { return y*y+10; }

int output = mixin func(10); // the 'mixin' keyword seems to kinda 'get in the way' of the output
int output = mixin(func(10)); // now i feel paren spammy...
mixin(int output = func(10)); // this doesn't feel right...

My feeling is the first is the best, but I'm not sure about that
grammatically.


The other thing that comes to mind is that it seems like this might make a case for AST macros... but I think that's probably overkill for this situation, and I'm not confident we're ever gonna attempt to crack that nut. I'd like to see something practical and unobjectionable preferably.


This problem is fairly far-reaching; Phobos receives a lot of lambdas these days, which I've found don't reliably inline and interfere with the optimiser's ability to optimise the code.
There was some discussion about a code unrolling API some time back, and this would apply there (the suggested solution used string mixins! >_<). Debug build performance is a problem which would be improved with this
feature.

As much as I like the idea:

Something always tells me this is the compiler's job... What clever reasoning are you applying that the compiler's inliner can't? It seems like a different situation to, say, SIMD code, where correctly structuring loops can require a lot of gymnastics that the compiler can't or won't (floating point conformance) do. The inlining decision seems easily automatable in comparison.

I understand that unoptimised builds for debugging are a problem, but a sensible compiler lets you hand-pick your optimisation passes.

In short: why are compilers not good enough at this that the programmer needs to be involved?
