Re: Is there a list of things which are slow to compile?

2020-06-05 Thread H. S. Teoh via Digitalmars-d-learn
On Fri, Jun 05, 2020 at 08:25:13AM +0000, aberba via Digitalmars-d-learn wrote:
> On Wednesday, 3 June 2020 at 17:02:35 UTC, H. S. Teoh wrote:
> > On Wed, Jun 03, 2020 at 09:36:52AM +0000, drathier via
> > Digitalmars-d-learn wrote:
> > > I'm wondering if there's a place that lists things which are
> > > slower/faster to compile? DMD is pretty famed for compiling
> > > quickly, but I'm not seeing particularly high speed at all, and I
> > > want to fix that.
> > 
> > The two usual culprits are:
> > - Recursive/chained templates
> > - Excessive CTFE
[...]
> I'm thinking about a resource hub for D with information like this.
> Can I use this information?
[...]

Of course. No need to reference this thread; what I wrote above is
pretty much common knowledge for anyone who has worked with D long
enough.


T

-- 
Famous last words: I *think* this will work...


Re: Is there a list of things which are slow to compile?

2020-06-05 Thread aberba via Digitalmars-d-learn

On Wednesday, 3 June 2020 at 17:02:35 UTC, H. S. Teoh wrote:
> On Wed, Jun 03, 2020 at 09:36:52AM +0000, drathier via
> Digitalmars-d-learn wrote:
> > I'm wondering if there's a place that lists things which are
> > slower/faster to compile? DMD is pretty famed for compiling
> > quickly, but I'm not seeing particularly high speed at all, and I
> > want to fix that.
>
> The two usual culprits are:
> - Recursive/chained templates
> - Excessive CTFE
[...]

I'm thinking about a resource hub for D with information like
this. Can I use this information? ...of course I'll reference
this thread and you can always call for changes.


Re: Is there a list of things which are slow to compile?

2020-06-04 Thread drathier via Digitalmars-d-learn

On Wednesday, 3 June 2020 at 17:02:35 UTC, H. S. Teoh wrote:
> On Wed, Jun 03, 2020 at 09:36:52AM +0000, drathier via
> Digitalmars-d-learn wrote:
> > I'm wondering if there's a place that lists things which are
> > slower/faster to compile? DMD is pretty famed for compiling
> > quickly, but I'm not seeing particularly high speed at all, and I
> > want to fix that.
>
> The two usual culprits are:
> - Recursive/chained templates
> - Excessive CTFE
>
> ...
>
> T


Thanks for the comprehensive answer!

I'm not using CTFE at all because, as you suspected, Variants
aren't supported in CTFE. I had to go out of my way to keep CTFE
from running, since it crashes on Variants. I'm not using UFCS, and
the long identifiers I was talking about are around 50 characters
long, from mangling package name + module name + variable name
together in the source language.


I'm guessing it's mainly templates from my code gen then, and
there's not much I can do about that: I'm generating code from a
functional language where polymorphism is literally everywhere,
so templates are everywhere too.


Regarding std.format, std.regex and such, would it be possible to 
put those into their own package or something, so `dub` doesn't 
rebuild them every time? It feels like that'd save a lot of time.


Re: Is there a list of things which are slow to compile?

2020-06-03 Thread H. S. Teoh via Digitalmars-d-learn
On Wed, Jun 03, 2020 at 09:36:52AM +0000, drathier via Digitalmars-d-learn
wrote:
> I'm wondering if there's a place that lists things which are
> slower/faster to compile? DMD is pretty famed for compiling quickly,
> but I'm not seeing particularly high speed at all, and I want to fix
> that.

The two usual culprits are:
- Recursive/chained templates
- Excessive CTFE

Note that while the current CTFE engine is slow, it's still reasonably
fast for short computations. Just don't write nested loops or loops with
a huge number of iterations inside your CTFE code, and you should be
fine. Even running std.format, with all of its complexity, inside CTFE
is reasonably fast as long as you don't do it too often. So generally
you won't see a problem here unless you have a loop with too many
iterations, or too deeply-nested loops, running in CTFE.
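
A minimal sketch of the point about loop size in CTFE, assuming a
hypothetical helper `sumSquares` that is not part of this thread. The
`enum` initializer forces the compiler's interpreter to run the loop,
while the call with a run-time argument is only compiled, so its trip
count should not affect build time.

```d
import std.stdio : writeln;

// Hypothetical helper for illustration: sums i*i for i in [0, n).
ulong sumSquares(uint n) pure nothrow @safe
{
    ulong total = 0;
    foreach (i; 0 .. n)
        total += cast(ulong) i * i;
    return total;
}

// An enum initializer must be known at compile time, so this call runs
// in CTFE. With only 100 iterations the interpreter cost is small.
enum small = sumSquares(100);
static assert(small == 328_350);

void main()
{
    // Same function, evaluated at run time: the million iterations are
    // executed by the compiled program, not by the compiler.
    writeln(small, " ", sumSquares(1_000_000));
}
```

Raising the argument in the `enum` line to something like 100_000_000
would be the kind of CTFE loop the paragraph above warns about.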

Templates are generally OK until you use too many recursive templates,
or chain too many of them together, e.g. in excessively long UFCS
chains of Phobos algorithms. Short chains are generally OK, but once
they get long they generate large symbols and large numbers of
instantiations. Large symbols used to be a big problem, but ever since
Rainer's fix they have generally been a lot tamer. Still, it's
something to avoid unless you can't help it.
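
As a rough illustration of how chaining grows symbols, the sketch
below prints the length of the mangled type name for a one-stage and a
three-stage pipeline. The function names `shortChain` and `longChain`
are hypothetical, and mangled-name length is used here only as an
approximate proxy for symbol size; the actual numbers depend on the
compiler version.

```d
import std.algorithm : equal, filter, map;
import std.range : iota;
import std.stdio : writeln;

// One stage: the result type wraps the type returned by iota.
auto shortChain()
{
    return iota(10).map!(x => x + 1);
}

// Three stages: each stage's type wraps the type of the stage before it.
auto longChain()
{
    return shortChain().filter!(x => x % 2 == 0).map!(x => x * 3);
}

void main()
{
    assert(longChain().equal([6, 12, 18, 24, 30]));

    // .mangleof is a compile-time string describing the full nested type.
    writeln(typeof(shortChain()).mangleof.length);
    writeln(typeof(longChain()).mangleof.length);
}
```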

Recursive templates are generally bad because they tend to produce a
super-linear number of instantiations, which consume lots of compiler
memory and also slow things down. Use too many of them, and things will
quickly slow to a crawl.
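
A small sketch of the difference, using two hypothetical helpers that
count how many `int`s appear in a type list. The recursive form
instantiates one template per remaining tail of the list, so n types
produce n+1 instances whose argument lists add up to roughly n^2/2
entries. The loop form needs a single instantiation per list.

```d
// Recursive form: CountInts!(A, B, C) instantiates CountInts!(B, C),
// CountInts!(C) and CountInts!(), each a distinct template instance.
template CountInts(T...)
{
    static if (T.length == 0)
        enum CountInts = 0;
    else
        enum CountInts = (is(T[0] == int) ? 1 : 0) + CountInts!(T[1 .. $]);
}

// Loop form: one instantiation per argument list; the foreach over the
// type sequence is unrolled inside a single function body.
size_t countInts(T...)()
{
    size_t n = 0;
    foreach (U; T)
    {
        static if (is(U == int))
            ++n;
    }
    return n;
}

static assert(CountInts!(int, string, int, double) == 2);
static assert(countInts!(int, string, int, double)() == 2);

void main() {}
```

Whether the loop form is measurably faster for a given code base is
something to check with the compiler's own timing, not to assume.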

Worst is if you combine both deeply-nested templates and CTFE, like
std.regex does. Similarly, std.format (which includes writefln & co)
tends to add 1-2 seconds to compile time.

Another is an excessively long function body: IIRC there are some
O(n^2) algorithms in the compiler w.r.t. the length of the function
body. But I don't expect normal code to reach the point where this
begins to matter; generally you won't run into this unless your code is
*really* poorly written (like the entire application inside main()), or
you're using excessive code generation (like the mixin of a huge
procedurally generated string).
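
To make the code-generation case concrete, here is a minimal sketch
with a hypothetical generator `makeAdds`. The mixin pastes one
statement per iteration into a single function body, so the body grows
with the argument, whereas the plain loop keeps the body the same size.

```d
import std.conv : to;

// Hypothetical generator: returns D source text containing n statements.
string makeAdds(uint n)
{
    string code;
    foreach (i; 0 .. n)
        code ~= "total += " ~ i.to!string ~ ";\n";
    return code;
}

int sumViaMixin()
{
    int total = 0;
    // makeAdds(50) runs in CTFE, and its 50 statements are pasted into
    // this one function body before semantic analysis.
    mixin(makeAdds(50));
    return total;
}

int sumViaLoop()
{
    int total = 0;
    foreach (i; 0 .. 50) // same result, but the function body stays short
        total += i;
    return total;
}

void main()
{
    assert(sumViaMixin() == 1225);
    assert(sumViaLoop() == 1225);
}
```

With 50 statements this is harmless; the concern above applies when
the generated string runs to many thousands of statements.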

Identifier lengths are generally no problem unless you're talking about
100KB-long identifiers, which used to be a problem until Rainer
implemented backreferences in the mangling. I don't expect normal code
to generate symbols of this order of magnitude unless you're using
excessively long UFCS chains with nested templates. Identifier length
doesn't even register on the radar unless the identifiers are
ridiculously long, like tens or hundreds of KB -- not something a human
would type. What humans would consider a long identifier, like a
Java-style name that spans 50 characters, is mere round-off error and
probably doesn't even make a measurable difference. The problem really
only begins to surface when you have 10,000 or more characters in your
identifier.

Comments are not even a blip on the radar: lexing is the fastest part of
the compilation process.  Similarly, aliases are extremely cheap; they
don't even register. Delegates have only a runtime cost; they are
similarly unnoticeably cheap during compilation.  As are Variants,
unless you're running Variants inside CTFE (which I don't think even
works).


T

-- 
Why waste time reinventing the wheel, when you could be reinventing the engine? 
-- Damian Conway


Re: Is there a list of things which are slow to compile?

2020-06-03 Thread drathier via Digitalmars-d-learn

On Wednesday, 3 June 2020 at 09:36:52 UTC, drathier wrote:
> Currently at ~1ksloc/s of d input without optimizing anything,
> which corresponds to 350ksloc/s if measuring by `-vcg-ast`
> output instead of d source input, while using the same time
> measurement from before, so the flag doesn't cost time.

Sorry, that should read `44ksloc/s`, not `350ksloc/s`.