On Friday, 14 March 2014 at 06:21:27 UTC, Manu wrote:
So, I'm constantly running into issues with not having control over
inlining.
I've run into it again doing experiments in preparation for my DConf
talk...
I have identified 2 cases which come up regularly:
1. A function that should always be inlined unconditionally (std.simd
is effectively blocked on this)
2. A particular invocation of a function should be inlined for this
call only
The first case is just about having control over code gen. Some
functions should effectively be macros or pseudo-intrinsics (i.e.,
intrinsic wrappers in std.simd, beauty wrappers around asm code, etc.),
and I don't ever want to see a symbol appear in the binary.
My suggestion is the introduction of __forceinline or something like
it. We need this.
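To illustrate the kind of function I mean in the first case, here is a rough sketch of a thin wrapper around an intrinsic. The name floorLog2 is made up for this example; bsr is the bit-scan-reverse function from core.bitop. Nothing in this code forces the wrapper to be inlined, which is exactly the gap described above.

```d
import core.bitop : bsr;

// Hypothetical wrapper around the bsr intrinsic (bit scan reverse).
// Returns the index of the highest set bit, i.e. floor(log2(x)).
// The result is undefined for x == 0, as with bsr itself.
int floorLog2(uint x)
{
    return bsr(x);
}
```

A wrapper like this only pays off if the call disappears entirely and just the underlying instruction remains at the call site.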
The second case is interesting, and I've found it comes up a few times
on different occasions.
In my current instance, I'm trying to build a generic framework to
perform efficient composable data processing, and a basic requirement
is that the components are inlined, such that the optimiser can
interleave the work properly.
Let's imagine I have a template which implements a work loop, which
wants to call a bunch of work elements it receives by alias. The issue
is, each of those must be inlined, for this call instance only, and
there's no way to do this.
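A minimal sketch of such a work loop, to make the scenario concrete. All names here (workLoop, addOne, twice) are hypothetical, and the real framework would be more general than a loop over int[]. Whether the calls to first and second get inlined is left entirely to the compiler's inliner.

```d
// Hypothetical work elements; each is a useful function in its own right.
int addOne(int x) { return x + 1; }
int twice(int x) { return x * 2; }

// Hypothetical work loop: receives two work elements by alias and
// applies them, in order, to every item of data in place.
void workLoop(alias first, alias second)(int[] data)
{
    foreach (ref item; data)
        item = second(first(item));
}
```

Usage would look like workLoop!(addOne, twice)(data), or with lambdas, workLoop!(x => x + 1, x => x * 2)(data). If either call fails to inline, the optimiser can no longer interleave the two steps inside the loop.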
I'm gonna draw the line at stringified code to use with mixin; I hate
that, and I don't want to encourage use of mixin or stringified code in
user-facing APIs as a matter of practice. Also, some of these work
elements might be useful functions in their own right, which means they
can indeed be functions existing somewhere else that shouldn't
themselves be attributed as __forceinline.
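For reference, the stringified approach I'm ruling out looks roughly like the sketch below. applyString is a made-up name; the work element is passed as a string of D code that refers to the parameter x, and a string mixin pastes it into the function body.

```d
// Hypothetical helper: the work element arrives as a string of D code
// that refers to the parameter x and is mixed in as an expression.
int applyString(string code)(int x)
{
    return mixin(code);
}
```

A call such as applyString!"x * x + 10"(10) evaluates to 110. The pasted code is always expanded in place, but the caller has to write code inside a string, and an existing function can't be passed in without wrapping a call to it in yet another string.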
What are the current options to force that some code is inlined?
My feeling is that an ideal solution would be something like an
enhancement which would allow the 'mixin' keyword to be used with
regular function calls.
What this would do is 'mix in' the function call at this location,
i.e., effectively inline that particular call, and it leverages a
keyword and concept that we already have. It would obviously produce a
compile error if the code is not available.
I quite like this idea, but there is a potential syntactical problem:
how to assign the return value?
int func(int y) { return y*y+10; }
int output = mixin func(10); // the 'mixin' keyword seems to kinda 'get in the way' of the output
int output = mixin(func(10)); // now i feel paren spammy...
mixin(int output = func(10)); // this doesn't feel right...
My feeling is the first is the best, but I'm not sure about that
grammatically.
The other thing that comes to mind is that it seems like this might
make a case for AST macros... but I think that's probably overkill for
this situation, and I'm not confident we're ever gonna attempt to crack
that nut. I'd like to see something practical and unobjectionable
preferably.
This problem is fairly far reaching; Phobos receives a lot of lambdas
these days, which I've found don't reliably inline and interfere with
the optimiser's ability to optimise the code.
There was some discussion about a code unrolling API some time back,
and this would apply there (the suggested solution used string mixins!
>_<).
Debug build performance is a problem which would be improved with this
feature.
As much as I like the idea:
Something always tells me this is the compiler's job... What clever
reasoning are you applying that the compiler's inliner can't? It seems
like a different situation to, say, SIMD code, where correctly
structuring loops can require a lot of gymnastics that the compiler
can't or won't (floating point conformance) do. The inlining decision
seems easily automatable in comparison.
I understand that unoptimised builds for debugging are a problem, but a
sensible compiler lets you hand-pick your optimisation passes.
In short: why are compilers not good enough at this that the
programmer needs to be involved?