Thanks for reviewing my code and making the adjustments, Vasco! I’ve integrated 
the changes in my patch.

I’ve finished the port of the new bool_eval() function to OpenCL. Although 
performance improved, it wasn’t enough to outperform the ANSI C code with 
the Release build. 

For the havoc scene, I now get 1.56 sec vs 2.10 sec before when running the 
OpenCL code on my GPU (command ‘rt -z1 -l5 -s1024’). For reference, the same 
scene renders in 0.63 sec with the ANSI C code currently in the trunk.

However, when I ran the OpenCL code on my CPU, I got 0.64 sec now vs 
2.79 sec before (command ‘rt -z1 -l5 -s1024’).

I am a little intrigued by this, because smaller scenes like operators.g 
are clearly faster when using the GPU (0.06 sec GPU vs 0.16 sec CPU). Any 

Another thing that caught my attention was how close the RTFM and wallclock 
lines from the ‘rt’ output are when running the OpenCL code on the CPU, compared 
with the same lines when running the OpenCL code on the GPU (i.e., 0.60 and 
0.65 sec on the CPU vs 0.32 and 1.65 sec on the GPU).
Couldn’t the big difference on the GPU side be caused by CPU-GPU transfers 
rather than by the ray-intersection, boolean evaluation, and shading 
operations? Is there a way to investigate this?
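
As a rough first look at that gap, the two pairs of numbers above can be 
compared directly. The sketch below only does the arithmetic on the reported 
figures; the struct and function names are hypothetical, and it assumes the 
RTFM line reflects time measured inside the process while wallclock is total 
elapsed time. It cannot tell whether the gap is transfer time, host idle time 
while kernels run, or something else.

```c
#include <stdio.h>

/* One pair of timings, in seconds, as read from the 'rt' output.
 * Hypothetical type for illustration; not part of BRL-CAD. */
struct rt_timing {
    double rtfm;      /* value from the RTFM line */
    double wallclock; /* value from the wallclock line */
};

/* Figures quoted in this message. */
const struct rt_timing gpu_run = { 0.32, 1.65 };
const struct rt_timing cpu_run = { 0.60, 0.65 };

/* Seconds of wallclock time not covered by the RTFM figure. */
double overhead_sec(struct rt_timing t)
{
    return t.wallclock - t.rtfm;
}

/* The same gap as a fraction of total wallclock time. */
double overhead_fraction(struct rt_timing t)
{
    return overhead_sec(t) / t.wallclock;
}

/* Print one summary line for a run. */
void report(const char *label, struct rt_timing t)
{
    printf("%s: gap %.2f sec (%.0f%% of wallclock)\n",
           label, overhead_sec(t), 100.0 * overhead_fraction(t));
}
```

Calling report("gpu", gpu_run) and report("cpu", cpu_run) gives a gap of about 
1.33 sec (roughly 80% of wallclock) on the GPU against about 0.05 sec (roughly 
8%) on the CPU. To attribute the GPU gap more precisely, OpenCL event profiling 
(a command queue created with CL_QUEUE_PROFILING_ENABLE, then 
clGetEventProfilingInfo on the transfer and kernel events) could time each 
buffer transfer and kernel launch separately.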

Tomorrow I will update the tables that I shared previously in my document, 
now using Release builds. I will also include side-by-side image comparisons 
between the ANSI C and OpenCL results for each scene. 


Attachment: new_bool_eval.patch
Description: Binary data

BRL-CAD Developer mailing list
