On 2/26/2016 11:17 AM, David Nadlinger wrote:
I was referring to something different in my post, though, as the question
concerned "low-hanging fruit". The problem there is really just that template
names sometimes grow unreasonably long pretty quickly. As an example, without
wanting to divulge any internals, some of the mangled symbols (!) in the Weka
codebase are several hundred kilobytes in size. core.demangle gives up on them
anyway, and they appear to be extremely repetitive. Note that just like in
Steven's post which I linked earlier, the code in question does not involve any
crazy recursive meta-templates, but IIRC makes use of Voldemort types. Tracking
down and fixing this – one would almost be tempted to just use standard data
compression – would lead to a noticeable decrease in compile and link times for
affected code.
A simple solution is to just use LZ77 compression on the strings. This is used
for Win32 and works well. (I had a PR to put that in Phobos, but it was rejected.)
https://www.digitalmars.com/sargon/lz77.html
As a snide aside, the mangling schemes used by Microsoft and g++ have a built-in
compression scheme, but they are overly complex and produce lousy results. LZ77
is simpler and far more effective :-)
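
To illustrate why LZ77 suits highly repetitive names, here is a minimal sketch in Python. It is not the Win32 or sargon implementation: the token format, the window size, and the sample name are all made up for the example, and the match search is a naive linear scan.

```python
def lz77_compress(data: str, window: int = 4096, min_match: int = 3):
    """Toy LZ77: return a list of tokens, each either a literal character
    or a (distance, length) back-reference into the already-seen text.
    The token format is an assumption made for this sketch."""
    tokens = []
    i, n = 0, len(data)
    while i < n:
        best_len, best_dist = 0, 0
        # Naive search: try every start position inside the window.
        for j in range(max(0, i - window), i):
            k = 0
            # The match may run past i (overlap), which is what makes
            # long runs of a repeated unit collapse into one token.
            while i + k < n and data[j + k] == data[i + k]:
                k += 1
            if k > best_len:
                best_len, best_dist = k, i - j
        if best_len >= min_match:
            tokens.append((best_dist, best_len))
            i += best_len
        else:
            tokens.append(data[i])
            i += 1
    return tokens


def lz77_decompress(tokens) -> str:
    """Invert lz77_compress by replaying literals and back-references."""
    out = []
    for t in tokens:
        if isinstance(t, tuple):
            dist, length = t
            # Copy one character at a time so overlapping references work.
            for _ in range(length):
                out.append(out[-dist])
        else:
            out.append(t)
    return "".join(out)


# Hypothetical, made-up "mangled name" with a heavily repeated fragment.
name = "_D4demo" + "__T4WrapTS4demo" * 20 + "Zv"
tokens = lz77_compress(name)
restored = lz77_decompress(tokens)
```

The point of the sketch is that the result stays reversible: `restored` equals `name`, so a demangler could still recover the original, while the token list is much shorter than the input because the repeated fragment turns into back-references.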
An alternative is to generate an SHA hash of the name, which will be unique for
all practical purposes, but the downside is that it is not reversible, so the
symbol cannot be demangled.
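
A rough sketch of the hashing alternative, again in Python and again only as an illustration: the choice of SHA-256, the `_D_sha256_` prefix, and the function name are assumptions for the example, not anything the compiler does.

```python
import hashlib


def hashed_symbol(mangled: str, prefix: str = "_D_sha256_") -> str:
    """Replace a mangled name with a fixed-length symbol derived from its
    SHA-256 digest. The prefix is hypothetical, chosen for this sketch."""
    digest = hashlib.sha256(mangled.encode("utf-8")).hexdigest()
    return prefix + digest


# Hypothetical, made-up long name; the output length does not depend on it.
long_name = "_D4demo" + "__T4WrapTS4demo" * 20000 + "Zv"
symbol = hashed_symbol(long_name)
```

The symbol has the same length no matter how long the input is, which fixes the size problem, but nothing in it lets you get the original name back. Demangling would need a separate table mapping hashes to names.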