On Sun, 08 Apr 2012 15:01:56 +0400, Dmitry Olshansky <[email protected]> wrote:
> I think it's been ages since I meant to ask why nobody (as in compiler
> vendors) does what I think is a rather simple optimization.
>
> In the short term the plan is to introduce a "link-time" flavored
> optimization at the code generation or (better) the link step.
>
> For simplicity, let's assume the compiler does all of the following
> during generation of an object file.
>
> 1. Every time a function (or pretty much any symbol) is generated, not
> only its size is calculated but also a checksum* of its data.
> (If we go for link-time optimization we should find a place to stick it
> in the object file.)
>
> 2. The compiler maintains a hash table mapping symbol_size ---> list
> (~ array) of (reference to data, checksum) pairs for all symbols of
> that size. Call it a duplicate table. Every generated function, and
> more generally any global immutable data, should end up there.
>
> 3. After a function is generated, the compiler looks for an entry in
> the duplicate table with matching size, then matching checksum, and
> only then (if required) does a straight memcmp. If there is a match,
> the compiler just throws the generated code away and _aliases_ its
> symbol to that of the matched entry.
> (So there has to be an alias table, if there isn't one already.)
>
> *Yes, a checksum. I think it should be a really simple and easily
> parallelizable hash function. The checksum alone is not reliable
> (hence the memcmp above), but some amount of balancing and profiling
> is obviously required when picking this function.
>
> Applicability:
> It's not only const/immutable bloat, which can be alleviated with
> inout. There are plenty of places where the exact same code is
> generated: e.g. sealed containers of int vs uint, std.find on dchar[]
> vs uint[]/int[], and so on.
> In general, coarse-grained parametrization is the root of all evil,
> and it is inevitable since we are just humans after all.
>
> Notes:
> 1. If we do the checksum calculation on the fly during codegen it
> comes at virtually no cost, as the data is already in the CPU data
> cache. A preliminary version can avoid hacking this part of the
> backend, though.
>
> 2. By _alias_ I mean the ability of the compiler to emit references to
> a given symbol as if it were some other symbol (should be really
> straightforward).
>
> 3. The linker has more data and is able to achieve colossal size
> savings by essentially running through the same algorithm before
> actually linking things. Again, it's an easily separable step and can
> be opt-in.
>
> 4. Ironically, the exact same thing works for any kind of immutable
> data structure. It looks like string pooling is superseded by this
> proposal.
>
> Thoughts?

Thoughts? Nothing much. I thought of that a while ago, but as an external
program that finds function calls by disassembling the binary and removes
dead/duplicate code. So I agree with you.

A similar feature is a CTFE cache (or general code cache) that checksums a
function's source code and fetches the compiled version from a cache.

Template bloat could be especially important to 'fix' on embedded systems.
But I don't consider it important enough at the moment. :/ Let's wait until
the bugs and important features are implemented, or hack the compiler
ourselves.

--
Marco
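[Editor's note: to spell out the duplicate-table lookup from steps 1-3 of
the quoted proposal, here is a minimal, purely illustrative sketch in D.
It is not DMD backend code; the Candidate struct, emitSymbol, the FNV-style
placeholder hash, and the symbol names are all invented for the example,
and a real implementation would operate on emitted object code rather than
strings. Only the size -> checksum -> memcmp -> alias chain is taken from
the proposal.]

    import std.stdio : writeln;

    struct Candidate
    {
        string symbol;          // name of the symbol already emitted
        uint checksum;          // cheap hash of its data
        const(ubyte)[] data;    // the emitted bytes themselves
    }

    Candidate[][size_t] dupTable;   // duplicate table: size -> candidates
    string[string] aliasTable;      // alias table: new name -> existing name

    uint cheapChecksum(const(ubyte)[] data)
    {
        // Placeholder FNV-1a; stands in for whatever simple, easily
        // parallelizable hash the proposal has in mind.
        uint h = 2166136261u;
        foreach (b; data)
            h = (h ^ b) * 16777619u;
        return h;
    }

    // Returns the symbol the compiler should actually reference: the new
    // one, or an identical existing one the new name gets aliased to.
    string emitSymbol(string name, const(ubyte)[] data)
    {
        immutable sum = cheapChecksum(data);

        // Step 3: match on size, then checksum, then a full compare.
        if (auto candidates = data.length in dupTable)
        {
            foreach (ref c; *candidates)
            {
                if (c.checksum == sum && c.data == data)
                {
                    aliasTable[name] = c.symbol;  // discard copy, alias it
                    return c.symbol;
                }
            }
        }

        // Steps 1-2: first occurrence of this body, record it.
        dupTable[data.length] ~= Candidate(name, sum, data);
        return name;
    }

    void main()
    {
        auto bodyA = cast(const(ubyte)[]) "identical machine code";
        auto bodyB = cast(const(ubyte)[]) "identical machine code";
        writeln(emitSymbol("std.algorithm.find!(dchar[])", bodyA));
        writeln(emitSymbol("std.algorithm.find!(uint[])",  bodyB)); // aliased
    }

The point of the two-stage filter is that the comparatively expensive
full comparison only runs when both the size and the cheap checksum
already agree, which is exactly why the proposal argues the scheme is
nearly free at codegen time.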
