Re: stride in slices

DigitalDesigns via Digitalmars-d Tue, 05 Jun 2018 09:56:25 -0700

On Tuesday, 5 June 2018 at 13:05:56 UTC, Steven Schveighoffer wrote:

On 6/4/18 5:52 PM, DigitalDesigns wrote:
On Monday, 4 June 2018 at 17:40:57 UTC, Dennis wrote:
On Monday, 4 June 2018 at 15:43:20 UTC, Steven Schveighoffer wrote:
Note, it's not going to necessarily be as efficient, but it's likely to be close.
-Steve
I've compared the range versions with a for-loop. For integers and longs or high stride amounts the time is roughly equal, but for bytes with low stride amounts it can be up to twice as slow.
https://run.dlang.io/is/BoTflQ
50 Mb array, type = byte, stride = 3, compiler = LDC -O4 -release
For-loop  18 ms
Fill(0)   33 ms
each!     33 ms

With stride = 13:
For-loop  7.3 ms
Fill(0)   7.5 ms
each!     7.8 ms
This is why I wanted to make sure! I would be using it for a stride of 2 and it seems it might have doubled the cost for no other reason than using ranges. Ranges are great but one can't reason about what is happening in them as easily as in a direct loop, so I wanted to be sure. Thanks for running the test!
See later postings from Ethan and others. It's a matter of optimization being able to see the "whole thing". This is why for loops are sometimes better. It's not inherent with ranges, but if you use the right optimization flags, it's done as fast as if you hand-wrote it.
What I've found with D (and especially LDC) is that when you give the compiler everything to work with, it can do some seemingly magic things.
-Steve
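
For reference, the three variants compared above might look roughly like the sketch below. The linked benchmark is not reproduced here, so the function names (zeroLoop, zeroFill, zeroEach) and the choice of writing zero are assumptions for illustration, not Dennis's actual code.

```d
import std.algorithm : each, fill;
import std.range : stride;

// Hand-written version: zero every `step`-th element with an index loop.
void zeroLoop(byte[] arr, size_t step)
{
    for (size_t i = 0; i < arr.length; i += step)
        arr[i] = 0;
}

// Range version: stride over the slice and fill the visited elements.
void zeroFill(byte[] arr, size_t step)
{
    arr.stride(step).fill(cast(byte) 0);
}

// Range version: stride over the slice and assign through a ref lambda.
void zeroEach(byte[] arr, size_t step)
{
    arr.stride(step).each!((ref e) { e = 0; });
}
```

All three touch the same elements (indices 0, step, 2*step, ...), so any timing difference comes down to what the optimizer can see through the range wrappers.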

It would be nice if testing could be done. Maybe even profiling in unit tests to make sure ranges are within some margin of error (10%). One of the main reasons I don't use ranges is I simply don't have faith they will be as fast as direct encoding. While they might offer a slightly easier syntax, I don't know what is going on under the hood so I can't reason about it (unless I look up the source). With a for loop, it is pretty much a wrapper on internal CPU logic so it will be nearly as fast as possible.
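
A minimal sketch of what such a check could look like, using benchmark from std.datetime.stopwatch. The helper name rangeToLoopRatio, the PerfTests version identifier, and the array size and run count are all made up for illustration; nothing like this exists in Phobos as far as I know.

```d
import std.algorithm : fill;
import std.datetime.stopwatch : benchmark;
import std.range : stride;

// Hypothetical helper: time a strided write done with a plain loop and
// with stride + fill, and return rangeTime / loopTime.
double rangeToLoopRatio(size_t n, size_t step, uint runs)
{
    auto a = new byte[n];
    auto b = new byte[n];

    void viaLoop()
    {
        for (size_t i = 0; i < a.length; i += step)
            a[i] = 1;
    }

    void viaRange()
    {
        b.stride(step).fill(cast(byte) 1);
    }

    // benchmark runs each function `runs` times and returns total durations.
    auto results = benchmark!(viaLoop, viaRange)(runs);
    immutable long loopTime = results[0].total!"hnsecs";
    immutable long rangeTime = results[1].total!"hnsecs";
    return loopTime > 0 ? cast(double) rangeTime / loopTime : double.infinity;
}

// Only compiled with -unittest -version=PerfTests (assumed identifier),
// so ordinary unit test runs are not affected by timing noise.
version (PerfTests) unittest
{
    // The 10% margin mentioned above.
    assert(rangeToLoopRatio(50_000_000, 2, 20) <= 1.10);
}
```

A timing assertion like this is only meaningful in an optimized build on a quiet machine, and it will be noisy even then, so the margin and run count would probably need tuning per compiler and platform.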

I suppose in the long run ranges do have the potential to outperform since they do abstract, but there is no guarantee they will even come close. Having some "proof" that they are working well would ease my mind. As this thread shows, ranges have some major issues. Imagine having some code on your machine that is very performant but on another machine in slightly different circumstances it runs poorly. Now, say it is the stride issue... One normally would not think of that being an issue, so one will look in other areas and could waste time. At least with direct loops you pretty much get what you see. It is very easy for ranges to be slow but more difficult for them to be fast.
