Hi Vasco,

Thanks for reviewing my code and making the adjustments! I’ve integrated the changes into my patch.
I’ve finished porting the new bool_eval() function to OpenCL. Performance improved, but not enough to outperform the ANSI C code in a Release build. For the havoc scene, the OpenCL code running on my GPU now takes 1.56 sec vs 2.10 sec before (command ‘rt -z1 -l5 -s1024’). For reference, the same scene renders in 0.63 sec with the ANSI C code currently in the trunk.

However, when I ran the OpenCL code on my CPU, I got 0.64 sec now vs 2.79 sec before (same command, ‘rt -z1 -l5 -s1024’). This intrigues me, because smaller scenes like operators.g are clearly faster on the GPU (0.06 sec GPU vs 0.16 sec CPU). Any explanation?

Another thing that caught my attention is how close the RTFM and wallclock lines of the ‘rt’ output are when the OpenCL code runs on the CPU, compared with the same lines when it runs on the GPU (i.e. 0.60 and 0.65 sec on the CPU vs 0.32 and 1.65 sec on the GPU). Could the large gap on the GPU side be caused by CPU-GPU transfers rather than by ray intersection, boolean evaluation, and shading? Is there a way to investigate this?

Tomorrow I will update the tables I shared earlier in my document, this time using Release builds. I will also include side-by-side image comparisons between the ANSI C and OpenCL results for each scene.

Regards,
Marco
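P.S. On separating CPU-GPU transfer time from compute time: one option that might work is OpenCL event profiling. Assuming the command queue is created with CL_QUEUE_PROFILING_ENABLE, clGetEventProfilingInfo() can report device timestamps in nanoseconds (CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END) for each enqueued write, kernel, and read. Below is a minimal sketch of the bookkeeping only. The names cmd_kind, profile, profile_add, profile_transfer_sec, and profile_unaccounted_sec are hypothetical and are not part of BRL-CAD or of the patch.

```c
#include <assert.h>

/* Hypothetical categories for the commands the renderer enqueues. */
enum cmd_kind { CMD_WRITE, CMD_KERNEL, CMD_READ, CMD_KINDS };

/* Accumulated device time, in seconds, per command category. */
struct profile {
    double sec[CMD_KINDS];
};

/* Add one command's duration to its category.
 * start_ns and end_ns are device timestamps in nanoseconds, in the form
 * clGetEventProfilingInfo() reports for CL_PROFILING_COMMAND_START and
 * CL_PROFILING_COMMAND_END (assumption: profiling is enabled on the queue). */
static void profile_add(struct profile *p, enum cmd_kind kind,
                        unsigned long long start_ns,
                        unsigned long long end_ns)
{
    assert(end_ns >= start_ns);
    p->sec[kind] += (double)(end_ns - start_ns) * 1e-9;
}

/* Total time spent moving buffers in either direction. */
static double profile_transfer_sec(const struct profile *p)
{
    return p->sec[CMD_WRITE] + p->sec[CMD_READ];
}

/* Wallclock time the device timestamps do not account for, e.g. kernel
 * compilation, queue setup, or host-side synchronization. */
static double profile_unaccounted_sec(const struct profile *p,
                                      double wallclock_sec)
{
    return wallclock_sec - (profile_transfer_sec(p) + p->sec[CMD_KERNEL]);
}
```

Calling profile_add() once per completed event and then comparing profile_transfer_sec() against the kernel total should show whether transfers dominate. A large value from profile_unaccounted_sec() would instead point to host-side overhead.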
new_bool_eval.patch
Description: Binary data
_______________________________________________ BRL-CAD Developer mailing list brlcad-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/brlcad-devel