Compiler benchmarks for an alternative to std.uni.asLowerCase.

Jon D via Digitalmars-d Sun, 08 May 2016 16:42:07 -0700

I did a performance study on speeding up case conversion in std.uni.asLowerCase. Specifics for asLowerCase have been added to issue https://issues.dlang.org/show_bug.cgi?id=11229. Publishing here as some of the more general observations may be of wider interest.

Background - Case conversion can generally be sped up by checking if a character is ascii before invoking a full unicode case conversion. The single character std.uni.toLower does this optimization, but std.uni.asLowerCase does not. asLowerCase does a lazy conversion of a range. For the test, I created a replacement for asLowerCase which uses map and toLower. In essence, `map!(x => x.toLower)` or `map!(x => x.byDchar.toLower)`.
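For readers who want a concrete picture, a minimal sketch of such a replacement could look like the following. The name mapAsLowerCase matches the one used later in this post, but the body here is an assumption built only from map, std.utf.byDchar and the single-character std.uni.toLower; the code actually timed is at the dpaste link under "Other details" and may differ.

```d
import std.algorithm : equal, map;
import std.uni : toLower;
import std.utf : byDchar;

/// Illustrative stand-in for the post's mapAsLowerCase (body is assumed).
/// Lazily lower-cases any string or character range, one code point at a time.
auto mapAsLowerCase(Range)(Range text)
{
    // byDchar decodes to dchar lazily. The dchar overload of toLower is the
    // one that checks for ascii before doing the full unicode conversion.
    return text.byDchar.map!(c => c.toLower);
}

void main()
{
    import std.stdio : writeln;

    // Nothing is converted until the range is consumed.
    auto lowered = mapAsLowerCase("Hello WORLD");
    assert(lowered.equal("hello world"));
    writeln(lowered);
}
```

The result is a lazy range, like the one asLowerCase returns, so no new string is allocated unless the caller asks for one.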

Testing was with DMD (2.071) and LDC 1.0.0-beta1 (Phobos 2.070) on OSX. Compiler settings were `-release -O -boundscheck=off`. DMD was tested with and without `-inline`. LDC turns on inlining (-enable-inlining=1) by default with -O, but DMD does not. Texts tried were in Japanese, Chinese, Finnish, English, German, and Spanish. Timing was done both including and excluding decoding from utf-8 to dchar.


Performance delta including decoding to dchar:

| Language group  | Pct Ascii | LDC gain   | DMD gain  | DMD noinline |
|-----------------+-----------+------------+-----------+--------------|
| Latin           | 95-99%    | 64% (2.7x) | 93% (14x) | 48% (1.9x)   |
| Asian (Jpn/Chn) | 2.4-3.7%  | 36% (1.6x) | 80% (5x)  | -1%          |

Performance delta excluding decoding to dchar:

| Language group  | Pct Ascii | LDC gain   | DMD gain  | DMD noinline |
|-----------------+-----------+------------+-----------+--------------|
| Latin           | 95-99%    | 60% (2.5x) | 95% (20x) | 60% (2.5x)   |
| Asian (Jpn/Chn) | 2.4-3.7%  | 50% (2x)   | 95% (20x) | -2%          |
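
A note on reading the tables: the percentages and the factors in parentheses appear to be two views of the same measurement. The sketch below assumes that "gain" means the fraction of run time removed, so that speedup = 1 / (1 - gain). That formula is my reading of the numbers, not something stated explicitly above, and speedupFromGain is a made-up helper name.

```d
import std.stdio : writefln;

/// Speedup factor implied by a fractional reduction in run time.
/// Assumes gain = 1 - newTime / oldTime (an interpretation of the tables).
double speedupFromGain(double gain)
{
    return 1.0 / (1.0 - gain);
}

void main()
{
    // Gains taken from the tables above, expressed as fractions.
    foreach (gain; [0.36, 0.48, 0.50, 0.60, 0.80, 0.95])
        writefln("%2.0f%% less time -> about %.1fx faster",
                 gain * 100, speedupFromGain(gain));
}
```

Under this reading a 50% gain is 2x, 80% is 5x and 95% is 20x, which agrees with the table entries. Small differences elsewhere would be expected because the percentages shown are rounded.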

Observations:

* mapAsLowerCase was faster than asLowerCase across the board. That it was better for Asian texts suggests the improvement involved more than just the ascii check optimization.
* Performance varied widely between compilers, and for DMD, whether the -inline flag was included. The performance delta between asLowerCase and the mapAsLowerCase replacement was very dependent on these choices. Similarly, the delta between inclusion and exclusion of auto-decoding was highly dependent on these selections.
* DMD improvement by using -inline: 30% for asLowerCase (1.5x), 90% for mapAsLowerCase (10x).
* DMD (-inline) vs LDC: For asLowerCase, LDC was 65-85% faster. For mapAsLowerCase, DMD was 10-40% faster. There were changes to the map implementation in 2.071, so these were not equivalent, but still, it's interesting that DMD beat LDC in this case.


Thoughts:

* The large variances between compiler settings imply extra diligence is needed when performance tuning at the source code level, especially for code intended for multiple compilers.
* Perhaps DMD -O should also turn on -inline. This would present a better performance picture to new users. It's also helpful when the different compilers agree on the rough meaning of compiler switches.
* Auto-decoding is an oft discussed concern. It doesn't show up in the tables above, but the data I looked at suggests the cost/penalty may vary quite a bit depending on usage context and compiler/settings. I wasn't studying this aspect explicitly. It may be worth its own analysis.


Other details:

* Code for mapAsLowerCase and the timing program is at: https://dpaste.dzfl.pl/a0e2fa1c71fd
* Texts used for timing were books in several languages from the Project Gutenberg site (http://www.gutenberg.org/), with boilerplate text removed.


--Jon
