Re: Help optimizing code?

2018-01-01 Thread Uknown via Digitalmars-d-learn

On Tuesday, 2 January 2018 at 07:17:23 UTC, Uknown wrote:

[snip]
0. Use LDC. It is significantly faster.
1. Utilize the fact that the Mandelbrot set is symmetric about 
the X axis. You can halve the time taken.

2. Use std.parallelism for using multiple cores on the CPU
3. Use @fastmath of LDC
4. imageData.reserve(width * height * 3) before the loop
5. [1] is a great article on this specific topic
[snip]


Forgot to mention that since you already know some of the edges, 
you can avoid unnecessarily looping through some regions. That 
saves a lot of time.
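
One common way to skip known regions (an assumption about what is meant here, not code from the linked repository) is to test each point against the main cardioid and the period-2 bulb before iterating. Both lie entirely inside the set, so points in them never escape. The function name `inKnownInterior` is hypothetical:

```d
/// Returns true if c = x + iy lies in the main cardioid or the
/// period-2 bulb, both of which are entirely inside the set.
bool inKnownInterior(double x, double y) pure nothrow @nogc @safe
{
    // Main cardioid: q * (q + (x - 1/4)) <= y^2 / 4,
    // where q = (x - 1/4)^2 + y^2.
    immutable xs = x - 0.25;
    immutable q = xs * xs + y * y;
    if (q * (q + xs) <= 0.25 * y * y)
        return true;

    // Period-2 bulb: disc of radius 1/4 centred at (-1, 0).
    immutable xb = x + 1.0;
    return xb * xb + y * y <= 0.0625;
}
```

Calling this before the escape-time loop and marking the pixel as "inside" when it returns true avoids running the full iteration count for those points, which are otherwise the most expensive ones.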


Re: Help optimizing code?

2018-01-01 Thread Uknown via Digitalmars-d-learn

On Monday, 1 January 2018 at 15:09:53 UTC, Lily wrote:
I started learning D a few days ago, coming from some very 
basic C++ knowledge, and I'd like some help getting a program 
to run faster. The code is here: 
https://github.com/IndigoLily/D-mandelbrot/blob/master/mandelbrot.d


Right now it runs slower than my JavaScript Mandelbrot renderer 
on the same quality settings, which is clearly ridiculous, but 
I don't know what to do to fix it. Sorry for the lack of 
comments, but I can never tell what will and won't be obvious 
to other people.


Hey! I happened to also write a Mandelbrot generator in D. It was 
based on the version given on Rosetta Code for C [0].

Some of the optimizations I used were:

0. Use LDC. It is significantly faster.
1. Utilize the fact that the Mandelbrot set is symmetric about 
the X axis. You can halve the time taken.

2. Use std.parallelism for using multiple cores on the CPU
3. Use @fastmath of LDC
4. imageData.reserve(width * height * 3) before the loop
5. [1] is a great article on this specific topic
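
Points 1 and 2 can be combined roughly as follows. This is a minimal sketch, not the code from [2]: the names `escapeTime` and `render` and the view bounds are made up for illustration, and it assumes the view is centred on the real axis so that row `r` and row `height - 1 - r` are exact mirror images.

```d
import std.parallelism : parallel;
import std.range : iota;

/// Escape-time count for the point (cx, cy); hypothetical helper.
uint escapeTime(double cx, double cy, uint maxIter) pure nothrow @nogc @safe
{
    double x = 0, y = 0;
    uint i = 0;
    while (i < maxIter && x * x + y * y <= 4.0)
    {
        immutable xt = x * x - y * y + cx;
        y = 2 * x * y + cy;
        x = xt;
        ++i;
    }
    return i;
}

/// Computes iteration counts for a view centred on the real axis.
uint[] render(size_t width, size_t height, uint maxIter)
{
    auto counts = new uint[width * height];
    immutable double xMin = -2.0, xMax = 1.0, yMax = 1.2; // assumed view
    immutable dx = (xMax - xMin) / width;
    immutable dy = 2 * yMax / height;

    // Only the top half is computed. Rows are independent of each
    // other, so they can be distributed across worker threads.
    foreach (row; parallel(iota((height + 1) / 2)))
    {
        immutable cy = yMax - (row + 0.5) * dy;
        immutable mirror = height - 1 - row;
        foreach (col; 0 .. width)
        {
            immutable n = escapeTime(xMin + (col + 0.5) * dx, cy, maxIter);
            counts[row * width + col] = n;
            counts[mirror * width + col] = n; // mirrored row, same value
        }
    }
    return counts;
}
```

Each thread writes to its own pair of rows, so no locking is needed. Filling a preallocated array by index is used here instead of `~=` because appending from several threads at once would not be safe.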

For reference, on my 28W 2 core i5, a 2560x1600 image took about 
2 minutes to render, with 500,000 iterations per pixel.
[2] is my own version.

[0]: 
https://rosettacode.org/wiki/Mandelbrot_set#PPM_non_interactive
[1]: 
https://randomascii.wordpress.com/2011/08/13/faster-fractals-through-algebra/
[2]: 
https://github.com/Sirsireesh/Khoj-2017/blob/master/Mandelbrot-set/mandlebrot.d


Re: Help optimizing code?

2018-01-01 Thread Muld via Digitalmars-d-learn

On Monday, 1 January 2018 at 16:47:40 UTC, Adam D. Ruppe wrote:

On Monday, 1 January 2018 at 16:13:37 UTC, Muld wrote:
If you use .ptr then you get zero detection, even in debug 
builds.


It is limited to the one expression where you wrote it, instead 
of on the ENTIRE program like the build switches do.


It is a lot easier to check correctness in an individual 
expression than it is to check the entire program, including 
stuff you didn't even realize might have been a problem.


With the .ptr pattern, it is correct by default and you 
individually change ones you (should) look carefully at. With 
-boundscheck, it is wrong by default and most people don't even 
look at it - people suggest it to newbies as an optimization 
without mentioning how nasty it is.


It won't be just one line though. You pretty much have to 
use it EVERYWHERE to get the optimization you want. It makes more 
sense to just turn off the check for the entire program and use 
your own asserts() where they are actually needed. That way you 
still get the checks in debug builds and have asserts where they 
are actually necessary.


I'd rather there be a potential bug than the program running 
too slow to be usable


That's a ridiculous exaggeration. In this program, I saw a < 1% 
time difference using those flags. -O -inline make a 50x bigger 
difference!


Read the sentence right before this.. Jesus. People only read 
what they want.



or have zero debugging for indices in debug builds.


You shouldn't be using .ptr until after you've carefully 
checked and debugged the line of code where you are writing it. 
That's the beauty of the pattern: it only affects one line of 
code, so you can test it before you use it without affecting 
the rest of the program.


It won't just be one line, and that's not beautiful. What happens 
when code gets refactored? Are you constantly going to be 
flip-flopping the source code rather than flipping a compiler flag 
or using multiple build configurations? How long are you even going 
to test for? The error that might happen in the code is probably 
difficult to detect; if it weren't, then having bounds checking at 
all wouldn't be necessary. Just test your code, that's the beauty 
of testing!




Re: Help optimizing code?

2018-01-01 Thread Adam D. Ruppe via Digitalmars-d-learn

On Monday, 1 January 2018 at 16:13:37 UTC, Muld wrote:
If you use .ptr then you get zero detection, even in debug 
builds.


It is limited to the one expression where you wrote it, instead 
of on the ENTIRE program like the build switches do.


It is a lot easier to check correctness in an individual 
expression than it is to check the entire program, including 
stuff you didn't even realize might have been a problem.


With the .ptr pattern, it is correct by default and you 
individually change ones you (should) look carefully at. With 
-boundscheck, it is wrong by default and most people don't even 
look at it - people suggest it to newbies as an optimization 
without mentioning how nasty it is.


I'd rather there be a potential bug than the program running too 
slow to be usable


That's a ridiculous exaggeration. In this program, I saw a < 1% 
time difference using those flags. -O -inline make a 50x bigger 
difference!



or have zero debugging for indices in debug builds.


You shouldn't be using .ptr until after you've carefully checked 
and debugged the line of code where you are writing it. That's 
the beauty of the pattern: it only affects one line of code, so 
you can test it before you use it without affecting the rest of 
the program.
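
A minimal sketch of the `.ptr` pattern described above. The function `sumRed` and the packed-RGB layout are assumptions made for illustration, not code from the original program:

```d
/// Sums every third byte (e.g. the red channel of packed RGB data).
ulong sumRed(const(ubyte)[] pixels) @system
{
    // Checked once, up front. This invariant is what makes the
    // unchecked accesses below possible to reason about.
    assert(pixels.length % 3 == 0, "expected packed RGB triples");

    ulong total = 0;
    for (size_t i = 0; i < pixels.length; i += 3)
    {
        // pixels[i] would be bounds-checked; pixels.ptr[i] is a raw
        // pointer index and is not checked in any build mode.
        total += pixels.ptr[i];
    }
    return total;
}
```

Only the one indexing expression loses its check; every other array access in the program keeps it, which is the contrast with `-boundscheck=off` being drawn here.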


Re: Help optimizing code?

2018-01-01 Thread Muld via Digitalmars-d-learn

On Monday, 1 January 2018 at 15:54:33 UTC, Adam D. Ruppe wrote:

On Monday, 1 January 2018 at 15:29:28 UTC, user1234 wrote:

dmd mandelbrot.d -O -release -inline -boundscheck=off


-O and -inline are OK, but -release and -boundscheck are 
harmful and shouldn't be used. Yeah, you can squeeze a bit of 
speed out of them, but there's another way to do it - `.ptr` on 
the individual accesses or versioning out unwanted `assert` 
statements - and those avoid major bug and security baggage 
that -release and -boundscheck=off bring.


If you use .ptr then you get zero detection, even in debug builds.

In this program, I didn't see a major improvement with the 
boundscheck skipping... and in this program, it seems to be 
written without the bugs, but still, I am against that switch 
on principle. It is so so so easy to break things with them.


In this program, it's relatively small and doesn't look like it 
does its calculations in realtime. I'd rather there be a 
potential bug than the program running too slow to be usable, or 
have zero debugging for indices in debug builds.







Re: Help optimizing code?

2018-01-01 Thread Adam D. Ruppe via Digitalmars-d-learn

On Monday, 1 January 2018 at 15:29:28 UTC, user1234 wrote:

dmd mandelbrot.d -O -release -inline -boundscheck=off


-O and -inline are OK, but -release and -boundscheck are harmful 
and shouldn't be used. Yeah, you can squeeze a bit of speed out 
of them, but there's another way to do it - `.ptr` on the 
individual accesses or versioning out unwanted `assert` 
statements - and those avoid major bug and security baggage that 
-release and -boundscheck=off bring.


In this program, I didn't see a major improvement with the 
boundscheck skipping... and in this program, it seems to be 
written without the bugs, but still, I am against that switch on 
principle. It is so so so easy to break things with them.



- I'd use "double" instead of "real".


On my computer at least, float gave 2x speed compared to double. 
You could try both though and see which works better.
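
To try the types side by side, something like the following could be used. It is only a sketch: `escapeTime` and `timeWith` are hypothetical names, the grid and iteration count are arbitrary, and the timings will depend on the compiler, flags, and CPU.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

/// Escape-time loop templated on the floating-point type T.
uint escapeTime(T)(T cx, T cy, uint maxIter)
{
    T x = 0, y = 0;
    uint i = 0;
    while (i < maxIter && x * x + y * y <= 4)
    {
        immutable T xt = x * x - y * y + cx;
        y = 2 * x * y + cy;
        x = xt;
        ++i;
    }
    return i;
}

/// Times a small grid of points using type T and prints the result.
void timeWith(T)(string name)
{
    auto sw = StopWatch(AutoStart.yes);
    ulong total = 0;
    foreach (row; 0 .. 200)
        foreach (col; 0 .. 300)
            total += escapeTime!T(cast(T)(-2.0 + col * 0.01),
                                  cast(T)(-1.0 + row * 0.01), 1000);
    sw.stop();
    // The checksum keeps the optimizer from discarding the loop.
    writefln("%-6s %5d ms (checksum %d)", name,
             sw.peek.total!"msecs", total);
}
```

Calling `timeWith!float("float")`, `timeWith!double("double")` and `timeWith!real("real")` from `main` prints one line per type. Keep in mind that `float` runs out of precision sooner than `double` at deep zoom levels, so the faster type is not automatically the right one.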


Re: Help optimizing code?

2018-01-01 Thread user1234 via Digitalmars-d-learn

On Monday, 1 January 2018 at 15:23:19 UTC, Adam D. Ruppe wrote:

On Monday, 1 January 2018 at 15:09:53 UTC, Lily wrote:
I started learning D a few days ago, coming from some very 
basic C++ knowledge, and I'd like some help getting a program 
to run faster.


So a few easy things you can do:

1) use `float` instead of `real`. real sucks, it is really slow 
and weird. Making that one switch doubled the speed on my 
computer.


Yes, I've also advised double. Double is better if the target arch 
is x86_64 since part of the operations will be done with SSE. 
With "real" the OP was **sure** to get 100% of the maths done in 
the FPU (although for all the trig stuff there's no choice).




2) preallocate the imageData. before the loop, 
`imageData.reserve(width*height*3)`. Small savings on my 
computer but an easy one.


3) make sure you use the compiler optimization options like 
`-O` and `-inline` on dmd (or use the gdc and ldc compilers 
both of which generally optimize better than dmd out of the 
box).



And if that isn't enough we can look into smaller things, but 
these overall brought the time down to about 1/3 what it 
started on my box.





Re: Help optimizing code?

2018-01-01 Thread user1234 via Digitalmars-d-learn

On Monday, 1 January 2018 at 15:09:53 UTC, Lily wrote:
I started learning D a few days ago, coming from some very 
basic C++ knowledge, and I'd like some help getting a program 
to run faster. The code is here: 
https://github.com/IndigoLily/D-mandelbrot/blob/master/mandelbrot.d


Right now it runs slower than my JavaScript Mandelbrot renderer 
on the same quality settings, which is clearly ridiculous, but 
I don't know what to do to fix it. Sorry for the lack of 
comments, but I can never tell what will and won't be obvious 
to other people.


- The first thing is to compile with the best options:

dmd mandelbrot.d -O -release -inline -boundscheck=off

- You append a lot, which can cause reallocs for imageData; Try

   import std.array;
   Appender!(ubyte[]) imageData;

   The code will not have to be changed for "~=" since Appender 
overloads this operator.
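
A short sketch of what that could look like. The function `buildImage` and the gradient colours are placeholders, not taken from the original mandelbrot.d:

```d
import std.array : Appender;

/// Builds packed RGB data for a width x height image. The gradient
/// stands in for the real per-pixel colour computation.
ubyte[] buildImage(size_t width, size_t height)
{
    Appender!(ubyte[]) imageData;

    foreach (y; 0 .. height)
        foreach (x; 0 .. width)
        {
            imageData ~= cast(ubyte)(x * 255 / width);  // red
            imageData ~= cast(ubyte)(y * 255 / height); // green
            imageData ~= cast(ubyte) 0;                 // blue
        }

    return imageData.data; // the accumulated ubyte[]
}
```

The loop body is the same as it would be with a plain `ubyte[]`; the one difference is that the finished array is read through `.data` instead of using the variable directly.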


- I'd use "double" instead of "real".


Re: Help optimizing code?

2018-01-01 Thread Adam D. Ruppe via Digitalmars-d-learn

On Monday, 1 January 2018 at 15:09:53 UTC, Lily wrote:
I started learning D a few days ago, coming from some very 
basic C++ knowledge, and I'd like some help getting a program 
to run faster.


So a few easy things you can do:

1) use `float` instead of `real`. real sucks, it is really slow 
and weird. Making that one switch doubled the speed on my 
computer.


2) preallocate the imageData. before the loop, 
`imageData.reserve(width*height*3)`. Small savings on my computer 
but an easy one.


3) make sure you use the compiler optimization options like `-O` 
and `-inline` on dmd (or use the gdc and ldc compilers both of 
which generally optimize better than dmd out of the box).
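
For point 2, a minimal sketch of preallocating a built-in dynamic array. The function name `makeBuffer` and the byte values written are placeholders for illustration:

```d
/// Fills a packed RGB buffer after reserving its full size up front.
ubyte[] makeBuffer(size_t width, size_t height)
{
    ubyte[] imageData;
    immutable needed = width * height * 3;

    // Ask the runtime for room for every byte before the loop, so
    // the appends below do not have to grow the array as they go.
    imageData.reserve(needed);

    foreach (i; 0 .. needed)
        imageData ~= cast(ubyte)(i & 0xFF); // placeholder pixel data

    return imageData;
}
```

`reserve` only changes the capacity, not the length, so the existing `~=` code keeps working unchanged.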



And if that isn't enough we can look into smaller things, but 
these overall brought the time down to about 1/3 what it started 
on my box.