On Tue, Jul 25, 2017 at 9:54 PM, Marco Domingues <marcodomingue...@gmail.com
> wrote:
> On 25 Jul 2017, at 20:49, Vasco Alexandre da Silva Costa <
> vasco.co...@gmail.com> wrote:
>
> It is amazing how c) and d) are so much slower than b). It should have
> only been like 2x slower. I guess this is due to the larger working set in
> memory. With the list of segments spread over a large amount of memory the
> 'shade_segs' phase will have poor memory coherency. It is particularly bad
> in goliath.g which is the scene with most depth complexity.
>
I take my comment back. It seems this is just because it needs to traverse
the segments 3x vs 1x, so it usually takes around 3x as much time.
Regarding memory consumption and complexity I have more comments to make.
[[
I've been thinking about how we can further reduce memory consumption...
sizeof() this:
struct hit {
    double3 hit_point;
    double3 hit_normal;
    double3 hit_vpriv;
    double hit_dist;
    int hit_surfno;
};
vs this:
struct hit {
    double hit_point[3];
    double hit_normal[3];
    double hit_vpriv[3];
    double hit_dist;
    int hit_surfno;
};
8*4*3+8+4 = 108 bytes vs
8*3*3+8+4 = 84 bytes
i.e. a compression ratio of 1.29x on struct hit BUT at the expense of lots
of source code changes on the primitives.
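
The byte counts above can be reproduced with a small host-side C sketch. Plain C has no double3, so `double3_emul` below is a stand-in I am assuming for illustration: an OpenCL double3 occupies the storage of a double4 (32 bytes). The function names are hypothetical too, and the totals assume an 8-byte double and a 4-byte int.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for OpenCL's double3, which takes the storage of a double4
 * (32 bytes). This type is an assumption made for host-side C. */
typedef struct { _Alignas(32) double s[4]; } double3_emul;

struct hit_vec {            /* layout using the emulated double3 */
    double3_emul hit_point;
    double3_emul hit_normal;
    double3_emul hit_vpriv;
    double hit_dist;
    int hit_surfno;
};

struct hit_arr {            /* layout using plain double[3] arrays */
    double hit_point[3];
    double hit_normal[3];
    double hit_vpriv[3];
    double hit_dist;
    int hit_surfno;
};

/* Sum of the member sizes, ignoring padding: 8*4*3+8+4. */
size_t hit_vec_member_bytes(void)
{
    return 3 * sizeof(double3_emul) + sizeof(double) + sizeof(int);
}

/* Sum of the member sizes, ignoring padding: 8*3*3+8+4. */
size_t hit_arr_member_bytes(void)
{
    return 3 * sizeof(double[3]) + sizeof(double) + sizeof(int);
}

/* sizeof() can only match or exceed the member sum, never undercut it. */
void check_hit_sizes(void)
{
    assert(sizeof(struct hit_vec) >= hit_vec_member_bytes());
    assert(sizeof(struct hit_arr) >= hit_arr_member_bytes());
}
```

Note that these are member sums. What sizeof() actually reports may be larger because of trailing padding, and the 32-byte alignment of the vector version would likely cost more padding than the array version, so the real ratio could be somewhat better than the member sums suggest.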
As for struct partition, we could use pointers to hits or indices to hits
instead of storing the actual hits, e.g.:
    idx = (seg_idx << 1) + (in|out)bit
This should save a lot more memory on the partitions: 238 bytes vs 30
bytes, i.e. a compression ratio of 7.93x.
However, it would increase memory thrashing quite a lot, so we should at
least cache the 'hit_dist' for the hits.
This would be 46 bytes per partition, i.e. a compression ratio of 5.17x.
With struct hit pointers it would be 54 bytes per partition, i.e. a
compression ratio of 4.41x (with 64-bit pointers).
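
A minimal C sketch of the index encoding and of the per-partition byte budgets. All names are hypothetical. The 22 bytes of "other" partition fields are not stated anywhere; I am inferring them as 238 - 2*108, and I am also assuming the 54-byte figure includes the two cached 'hit_dist' values.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 0 selects the in (0) or out (1) hit of a segment; the remaining
 * bits hold the segment index. Names are illustrative only. */
enum hit_side { HIT_IN = 0, HIT_OUT = 1 };

uint32_t hit_idx_encode(uint32_t seg_idx, enum hit_side side)
{
    assert(seg_idx < (UINT32_C(1) << 31)); /* top bit must stay free */
    return (seg_idx << 1) | (uint32_t)side;
}

uint32_t hit_idx_seg(uint32_t idx)
{
    return idx >> 1;
}

enum hit_side hit_idx_side(uint32_t idx)
{
    return (enum hit_side)(idx & 1u);
}

/* Byte budgets per partition, ignoring padding. */
enum {
    HIT_BYTES          = 108,                 /* struct hit with double3 */
    OTHER_BYTES        = 238 - 2 * HIT_BYTES, /* inferred: 22 bytes */
    PART_COPIES        = 2 * HIT_BYTES + OTHER_BYTES,
    PART_INDICES       = 2 * 4 + OTHER_BYTES, /* two 32-bit indices */
    PART_INDICES_DIST  = PART_INDICES + 2 * 8, /* plus two cached doubles */
    PART_POINTERS_DIST = 2 * 8 + 2 * 8 + OTHER_BYTES /* 64-bit pointers */
};
```

One bit for in/out leaves 31 bits for the segment index in a 32-bit value, which is why the encoder rejects larger indices.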
]]
In short: my advice is that you change struct partition from storing
copies of the struct hits to holding pointers to the struct hits, i.e.
struct partition {
    global struct hit *inhit;
    global struct hit *outhit;
    uint inseg;
    uint outseg;
    ...
};
That should save quite a lot of memory at the expense of some pointer
chasing, which can be mitigated with a 'hit_dist' cache for the
inhit and outhit.
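
To make the 'hit_dist' cache idea concrete, here is a rough host-side C sketch. The fields `in_dist` and `out_dist` and both helper functions are hypothetical names of mine, and `global` is defined away because plain C has no OpenCL address-space qualifiers. It only shows that distance comparisons can run on the cached values without following the pointers.

```c
#include <assert.h>
#include <stddef.h>

#define global /* OpenCL address-space qualifier; empty in host C */

struct hit {
    double hit_point[3];
    double hit_normal[3];
    double hit_vpriv[3];
    double hit_dist;
    int hit_surfno;
};

struct partition {
    global struct hit *inhit;
    global struct hit *outhit;
    double in_dist;   /* cached copy of inhit->hit_dist (hypothetical) */
    double out_dist;  /* cached copy of outhit->hit_dist (hypothetical) */
    unsigned int inseg;
    unsigned int outseg;
};

/* Point the partition at its hits and fill the distance cache once. */
void partition_set_hits(struct partition *pp,
                        struct hit *in, unsigned int inseg,
                        struct hit *out, unsigned int outseg)
{
    assert(pp != NULL && in != NULL && out != NULL);
    pp->inhit = in;
    pp->outhit = out;
    pp->in_dist = in->hit_dist;
    pp->out_dist = out->hit_dist;
    pp->inseg = inseg;
    pp->outseg = outseg;
}

/* Interval overlap test that reads only the cached distances, so no
 * struct hit has to be fetched from memory. */
int partition_overlaps(const struct partition *a, const struct partition *b)
{
    return a->in_dist < b->out_dist && b->in_dist < a->out_dist;
}
```

The pointers would then only be followed when the full hit is needed, for example to read the normal during shading.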
Also, let me get this straight: the ANSI C code in rt_boolfinal takes
partitions from an input queue, evaluates them, and puts the valid ones in
an output queue. You also evaluate them, but then you just mark the valid
partitions as such. While this does save memory, it may make the final
shading stage slower than it would otherwise be, because it needs to
skip invalid partitions.
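
The two strategies can be contrasted with a small C sketch. This is not the actual rt_boolfinal code nor the OpenCL kernel; `struct part_ref` and both functions are hypothetical and only illustrate the trade-off between marking partitions in place and copying the valid ones out.

```c
#include <assert.h>
#include <stddef.h>

struct part_ref {
    unsigned int part_idx; /* index of the partition in a larger array */
    int valid;             /* nonzero if boolean evaluation accepted it */
};

/* Marking approach: no extra storage, but the shading loop has to test
 * every entry and skip the invalid ones. */
size_t count_shaded_marked(const struct part_ref *parts, size_t n)
{
    size_t shaded = 0;
    for (size_t i = 0; i < n; i++) {
        if (!parts[i].valid)
            continue;      /* wasted iteration on an invalid partition */
        shaded++;          /* shading of partition i would happen here */
    }
    return shaded;
}

/* Queue-like approach: copy the valid entries, in order, into 'out'
 * (which must have room for n entries) and return how many were kept.
 * Shading can then walk 'out' without any validity checks. */
size_t compact_valid(const struct part_ref *parts, size_t n,
                     struct part_ref *out)
{
    assert(parts != NULL && out != NULL);
    size_t m = 0;
    for (size_t i = 0; i < n; i++) {
        if (parts[i].valid)
            out[m++] = parts[i];
    }
    return m;
}
```

Compaction costs a second buffer, which is the memory the marking approach saves. Which one wins would depend on how many partitions turn out to be invalid.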
--
Vasco Alexandre da Silva Costa
PhD in Computer Engineering (Computer Graphics)
Instituto Superior Técnico/University of Lisbon, Portugal
_______________________________________________
BRL-CAD Developer mailing list
brlcad-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/brlcad-devel