Let's avoid range specializations - make ranges great again!

Seb via Digitalmars-d Sat, 04 Jun 2016 10:41:12 -0700

In Phobos (especially in std.algorithm) a lot of specializationbetween a RandomAccessRange and an InputRange are used eventhough only an InputRange is required.The reason for this is solely performance - imho we should startto tackle this and remove such specializations. Let's createbetter code!

To quote David from the recent PR(https://github.com/dlang/phobos/pull/4265) where adding aidentity specialization achieves a 5-10x speedup for{min,max}Element.

I think that we have a larger issue at hand here. Ranges are ofprime strategic importance for D as an accessible way ofcomposing complex operations out of reusable primitives. Alarge part of this comes from the fact that they are supposedto be zero-cost abstractions – after the compiler optimiser hashad its way with a piece of code, we expect it to be equivalentto a hand-written loop.

1) Should it matter whether generic range looping or"specialized" r[i] access is used?

Answer for dmd it does (7x slower), ldc does the job (nearly)correctly.

```

dmd -release -O -boundscheck=off test_looping.d &&./test_looping

random_access 685 ms, 351 μs, and 9 hnsecs
foreach       685 ms, 888 μs, and 4 hnsecs
foreach_ref   685 ms, 654 μs, and 4 hnsecs
for_range_api 7 secs, 211 ms, 530 μs, and 3 hnsecs

```

ldc -release -O3 -boundscheck=off test_looping.d &&./test_looping

random_access 142 ms and 88 μs
foreach       142 ms, 8 μs, and 3 hnsecs
foreach_ref   142 ms, 400 μs, and 9 hnsecs
for_range_api 143 ms, 394 μs, and 1 hnsec
```

Benchmarking code:https://github.com/wilzbach/perf-ranges/blob/master/test_looping.d

Bug report for DMD: https://issues.dlang.org/show_bug.cgi?id=16120

(Tested with both ldc 0.17 and 1.0)

2) Should it matter whether I use r.save or r[i..$]?

Answer: it does for dmd and ldc.

```

dmd -release -O -boundscheck=off test_save.d && ./test_save

random_access  424 ms and 72 μs
foreach_random 420 ms, 754 μs, and 5 hnsecs
for_range_api  2 secs, 562 ms, 415 μs, and 5 hnsecs

ldc -release -O3 -boundscheck=off test_save.d && ./test_save

random_access  245 ms, 713 μs, and 4 hnsecs
foreach_random 241 ms, 533 μs, and 1 hnsec
for_range_api  349 ms, 582 μs, and 9 hnsecs
```

Benchmarking code:https://github.com/wilzbach/perf-ranges/blob/master/test_save.d

Open question: how can we make DMD/LDC smarter, so that theyinline range primitives and the promise about the zero-costabstractions is full-filled?

Let's avoid range specializations - make ranges great again!

Reply via email to