On Tue, Jul 25, 2017 at 7:15 PM, Marco Domingues
<marcodomingue...@gmail.com> wrote:

> I’ve just finished gathering the statistics on the scenes in the
> ‘share/db’ directory. I will attach the pdf with the results.
>

It is striking how much slower c) and d) are than b). They should have been
only about 2x slower. I guess this is due to the larger working set in
memory. With the list of segments spread over a large amount of memory, the
'shade_segs' phase will have poor memory coherency. It is particularly bad
in goliath.g, which is the scene with the most depth complexity.

It does not make sense that g) is faster than f) though.

You should use more appropriate units, e.g. 's' or 'ms' for each cell,
depending on how much time it takes, instead of fractions. Likewise MB vs
KB, etc. Also use the same number format everywhere (e.g. %.2f) and use the
American decimal separator, i.e. '.' rather than ','.
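
Something along these lines would do for the table cells. This is only a
sketch: format_time() and format_size() are hypothetical helper names, not
anything in BRL-CAD, and the 1 s and 1 MB thresholds are just an assumption
about where to switch units. As long as setlocale() is never called, printf
uses the "C" locale, so the decimal separator is '.'.

```c
#include <stdio.h>

/* Hypothetical helper: format a duration given in seconds, using "ms"
 * below one second and "s" otherwise, always with two decimals. */
static void format_time(double seconds, char *buf, size_t len)
{
    if (seconds < 1.0)
        snprintf(buf, len, "%.2f ms", seconds * 1000.0);
    else
        snprintf(buf, len, "%.2f s", seconds);
}

/* Hypothetical helper: format a byte count, using "KB" below 1 MB and
 * "MB" otherwise (1 KB = 1024 bytes here), always with two decimals. */
static void format_size(double bytes, char *buf, size_t len)
{
    const double kb = 1024.0;
    const double mb = 1024.0 * 1024.0;

    if (bytes < mb)
        snprintf(buf, len, "%.2f KB", bytes / kb);
    else
        snprintf(buf, len, "%.2f MB", bytes / mb);
}
```

Calling format_time(0.25, buf, sizeof buf) fills buf with "250.00 ms", and
format_size(3.0 * 1024 * 1024, buf, sizeof buf) fills it with "3.00 MB".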

In Table 3 "Other metrics", what does xx/yy under "Partitions" mean? Is
this used vs allocated partitions? If that is the case, the amount of
wasted memory seems particularly bad in truck.g. It would be nice to reduce
the memory consumed by partitions further. Still, at least all these
scenes would easily fit into the typical memory of a graphics card, with
under 512 MB total footprint.


> I couldn’t really figure out how to use the profiling tools you mentioned.
> Well, the AMD CodeXL only allowed me to use CPU Profiling Time-based
> Sampling with my hardware, but I couldn’t really understand the output from
> it. The other tools I had trouble installing/running with BRL-CAD, so I
> ended  gathering the data with the output from the ‘rt’ command.
>

We'll have to talk about this over Skype, I guess. I'm going to be a bit
busy the next couple of days, though, so perhaps we'll have to do it Friday
or early next week. Still, the statistics you gathered are enough to start
optimizing the code.


> Maybe its not very accurate, but I tried to compare the code with the
> different kernels enabled/disabled and it pretty much confirms the
> bottleneck in the rt_boolfinal kernel.
>

Everywhere you see loops within loops, large branch-heavy kernels, or
memory walks with large strides, it is a good hint that the code could be
further optimized.
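
To illustrate the stride point with a generic example (plain C, not taken
from the BRL-CAD kernels; the function names are made up): both functions
below sum the same row-major array and return the same result, but the
second one jumps 'cols' elements between consecutive accesses, so on large
arrays it tends to make much poorer use of caches and memory bandwidth.

```c
#include <stddef.h>

/* Unit-stride walk: consecutive iterations touch adjacent elements. */
static double sum_rows_first(const double *a, size_t rows, size_t cols)
{
    double sum = 0.0;

    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            sum += a[r * cols + c];
    return sum;
}

/* Large-stride walk: consecutive iterations are 'cols' elements apart. */
static double sum_cols_first(const double *a, size_t rows, size_t cols)
{
    double sum = 0.0;

    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            sum += a[r * cols + c];
    return sum;
}
```

The same reasoning applies to a GPU kernel: when neighbouring work items
read neighbouring addresses, the accesses can usually be served together,
whereas widely scattered reads cannot.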

-- 
Vasco Alexandre da Silva Costa
PhD in Computer Engineering (Computer Graphics)
Instituto Superior Técnico/University of Lisbon, Portugal
_______________________________________________
BRL-CAD Developer mailing list
brlcad-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/brlcad-devel
