On Wednesday, 14 May 2014 at 06:00:33 UTC, Ola Fosheim Grøstad wrote:
Besides, D ranges will never perform as well as an optimized explicit loop, so you might as well aim for usability over speed.

There is no inherent reason why this should be true. In fact, ranges of medium complexity already come very close to explicit loops in D when you use a compiler like LDC.

Consider a sample program like this:

    void main()
    {
        import std.stdio;
        int max;
        readf("%d", &max);
        assert(foo1(max) == foo2(max));
    }

    int foo1(int max)
    {
        int sum = 0;
        for (auto i = 0; i < max; ++i)
        {
            sum += i*2;
        }
        return sum;
    }

    int foo2(int max)
    {
        import std.algorithm : reduce, map;
        import std.range : iota;
        return reduce!((a, b) => a + b)(0, iota(0, max).map!(a => a*2));
    }

Disassembly (ldmd2 -O -release -inline):

  0000000000402df0 <_D4test4foo1FiZi>:
    402df0:   31 c0                   xor    %eax,%eax
    402df2:   85 ff                   test   %edi,%edi
    402df4:   7e 10                   jle    402e06 <_D4test4foo1FiZi+0x16>
    402df6:   8d 47 ff                lea    -0x1(%rdi),%eax
    402df9:   8d 4f fe                lea    -0x2(%rdi),%ecx
    402dfc:   0f af c8                imul   %eax,%ecx
    402dff:   83 e1 fe                and    $0xfffffffe,%ecx
    402e02:   8d 44 79 fe             lea    -0x2(%rcx,%rdi,2),%eax
    402e06:   c3                      retq
    402e07:   66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
    402e0e:   00 00

  0000000000402e10 <_D4test4foo2FiZi>:
    402e10:   31 c0                   xor    %eax,%eax
    402e12:   85 ff                   test   %edi,%edi
    402e14:   0f 48 f8                cmovs  %eax,%edi
    402e17:   85 ff                   test   %edi,%edi
    402e19:   74 10                   je     402e2b <_D4test4foo2FiZi+0x1b>
    402e1b:   8d 47 ff                lea    -0x1(%rdi),%eax
    402e1e:   8d 4f fe                lea    -0x2(%rdi),%ecx
    402e21:   0f af c8                imul   %eax,%ecx
    402e24:   83 e1 fe                and    $0xfffffffe,%ecx
    402e27:   8d 44 79 fe             lea    -0x2(%rcx,%rdi,2),%eax
    402e2b:   c3                      retq
    402e2c:   0f 1f 40 00             nopl   0x0(%rax)

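The two are almost identical: the arithmetic is the same instruction for instruction, and the only difference is the guard at the top (foo1 uses a single test/jle, while foo2 clamps a negative max to zero with cmovs and then tests again). I fully expect them to become 100% identical once the compiler matures a bit more.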
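It is also worth noting why neither listing contains a loop. Reading the instructions, both functions appear to compute (max-1)*(max-2) + 2*max - 2, which simplifies to max*(max-1), i.e. the closed form of the sum of 2*i for i in [0, max). The sketch below is only a way to check that reading of the disassembly, not something taken from the compiler's output: closedForm and rangeSum are hypothetical helper names, and the check assumes values of max small enough that int does not overflow.

```d
import std.algorithm : map, sum;
import std.range : iota;

// Hypothetical helper: what the optimized code appears to compute.
// Non-positive max yields 0, matching the early return in both listings.
int closedForm(int max)
{
    return max > 0 ? max * (max - 1) : 0;
}

// The same sum written with ranges, as in foo2 (iota is empty when max <= 0).
int rangeSum(int max)
{
    return iota(0, max).map!(a => a * 2).sum;
}

void main()
{
    // Compare both versions over a few small inputs, including
    // negative and zero values of max.
    foreach (max; [-3, 0, 1, 2, 10, 1000])
    {
        assert(rangeSum(max) == closedForm(max));
    }
}
```

If that reading is right, the range version costs nothing extra here because the whole pipeline is inlined before the loop is folded away.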
