On Mon, Jul 17, 2017 at 11:23 PM, Marco Domingues <
marcodomingue...@gmail.com> wrote:

> Hello,
>
> As combined, I have made some improvements on the code (based on Vasco’s
> feedback) and implemented an overlap handler
>

This is one of those words where Portuguese does not translate well into
English.
Instead of 'as combined' it is better to say 'as discussed' or 'as agreed'.

> There are still some bugs in the code as:
>
>    - Incorrect shading for some partitions (some normals seem to be off)
>

This is quite likely because of the issue with primitive ids being in BVH
order on the OpenCL side that we talked about on IRC.

>
>    - Some more complex scenes have less partitions evaluated than
>    expected. (I believe this issue comes from some partitions reporting
>    overlap when they shouldn’t, but still have to further investigate what is
>    causing that)
>
> Apart from that, the results look promising and ok for most scenes I have
> tested (this includes the scenes 'share/db/operators.g' and
> 'share/db/boolean-ops.g'). Some images can be found here:
> https://brlcad.org/wiki/User:Marco-domingues/GSoC17/Log#12_-_14_July
>
> Right now, I am using a bitarray to represent the regions involved in each
> partition, and this seems to be an inefficient solution, because of the
> high number of regions that some complex scenes have (to evaluate one
> partition and to resolve any overlap that may occur, I have to iterate over
> all the regions and test the bits, which can be very slow for some scenes).
> The ‘share/db/goliath.g’ scene has a total of 1447 regions, and despite
> that, the max number of regions involved in one partition is 15 (from some
> profiling I made over the ansi c code).
>

If the issue is with the partition bitarrays, try adding an intermediate
step where the bitarrays from rt_boolweave are converted into integer lists
prior to rt_boolfinal or the render phase.
I think this should be simpler than modifying rt_boolweave to use
lists. For one, you won't need dynamic lists. You can use popcount to count
the set bits in each bitarray (the cost is proportional to the number of
bitarray words), sum those counts to compute how much memory to allocate,
and then allocate an array where you store the ids of the regions whose
bits are set in each partition.
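
A minimal sketch of that conversion step, in plain C rather than OpenCL.
The names bitarrays_to_lists, popcount32 and offsets are hypothetical and
not from the BRL-CAD source, and I am assuming region id r is stored as
bit (r & 31) of word (r >> 5):

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: portable population count of one 32-bit word. */
static unsigned popcount32(uint32_t w)
{
    unsigned c = 0;
    while (w) {
        w &= w - 1;  /* clear the lowest set bit */
        c++;
    }
    return c;
}

/*
 * Convert nparts bitarrays, each `words` words long and stored back to
 * back in `bits`, into one flat list of region ids.
 * offsets must have nparts + 1 entries; the ids of partition p end up in
 * ids[offsets[p]] .. ids[offsets[p + 1] - 1].
 * Returns a malloc'd array (caller frees) or NULL on allocation failure.
 */
static uint32_t *bitarrays_to_lists(const uint32_t *bits, size_t nparts,
                                    size_t words, size_t *offsets)
{
    size_t p, w, total = 0;
    uint32_t *ids;

    /* Pass 1: count the set bits to size the output exactly. */
    for (p = 0; p < nparts; p++) {
        offsets[p] = total;
        for (w = 0; w < words; w++)
            total += popcount32(bits[p * words + w]);
    }
    offsets[nparts] = total;

    ids = malloc((total ? total : 1) * sizeof *ids);
    if (!ids)
        return NULL;

    /* Pass 2: write out the id of every set bit. */
    for (p = 0; p < nparts; p++) {
        size_t out = offsets[p];
        for (w = 0; w < words; w++) {
            uint32_t word = bits[p * words + w];
            uint32_t b;
            for (b = 0; word != 0; b++, word >>= 1)
                if (word & 1u)
                    ids[out++] = (uint32_t)(w * 32 + b);
        }
    }
    return ids;
}
```

On the OpenCL side the built-in popcount() would replace popcount32, and
the offsets pass is the usual prefix sum.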

If you want to reduce the number of bits in the partition segment
bitarrays, you can also consider that a partition's bitarray only needs to
be as large as the number of segments in that specific ray, not the maximum
number of segments in the largest ray segment list. This complicates the
code a bit but should significantly reduce the memory required to store
the bitarrays on complex scenes.
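
One way the per-ray sizing could be laid out, as a rough sketch in plain C.
The function and array names (layout_ray_bitarrays, nsegs, nparts, start)
are made up for illustration, and I am assuming you already know the
segment and partition counts for each ray:

```c
#include <stddef.h>
#include <stdint.h>

/* Number of 32-bit words needed to hold nbits bits. */
static size_t words_for_bits(size_t nbits)
{
    return (nbits + 31) / 32;
}

/*
 * For each ray r with nsegs[r] segments and nparts[r] partitions, compute
 * the word offset start[r] at which that ray's partition bitarrays begin
 * inside one shared buffer. Each partition of ray r gets
 * words_for_bits(nsegs[r]) words instead of the scene-wide maximum.
 * Returns the total number of words to allocate.
 */
static size_t layout_ray_bitarrays(const size_t *nsegs, const size_t *nparts,
                                   size_t nrays, size_t *start)
{
    size_t r, total = 0;

    for (r = 0; r < nrays; r++) {
        start[r] = total;
        total += nparts[r] * words_for_bits(nsegs[r]);
    }
    return total;
}
```

Partition k of ray r would then live at word
start[r] + k * words_for_bits(nsegs[r]).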

As for the regiontable bitarray in rt_boolfinal, I don't see you initialize
it to zero in the OpenCL code. It also seems to me that all the ray threads
are writing to the same memory space, so there are bound to be errors in it,
i.e.:

In build_regiontable(), pp_idx = current_index, which is the id of a
partition. You use:
set(regiontable, pp_idx + (rt_index - 1), m) i.e.
regiontable[pp_idx + (rt_index - 1) + bindex(m)] & bmask(m) i.e.
regiontable[current_index + (total_regions/32 + 1 - 1) + (m >> 5)] & (1 << m)

This doesn't look correct. This seems more like what you want:
set(regiontable, current_index * rt_index, m)

Think about it. Two adjacent partitions will write to overlapping memory
areas of the regiontable bitarray. You don't want that.
Also, I think this same error is in other bitarrays you have.
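
To make the indexing concrete, here is a small sketch in plain C. The
helpers set_region and test_region are hypothetical, and bindex/bmask are
my guess at your definitions (word index m >> 5, bit m & 31). Each
partition owns rt_index consecutive words, so the base offset is the
partition id times rt_index:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed definitions: word index and bit mask for region id m. */
static size_t bindex(unsigned m) { return m >> 5; }
static uint32_t bmask(unsigned m) { return (uint32_t)1 << (m & 31); }

/* Set the bit of region m in the bitarray owned by partition pp_idx. */
static void set_region(uint32_t *regiontable, size_t pp_idx,
                       size_t rt_index, unsigned m)
{
    regiontable[pp_idx * rt_index + bindex(m)] |= bmask(m);
}

/* Return nonzero if region m is set for partition pp_idx. */
static int test_region(const uint32_t *regiontable, size_t pp_idx,
                       size_t rt_index, unsigned m)
{
    return (regiontable[pp_idx * rt_index + bindex(m)] & bmask(m)) != 0;
}
```

With 70 regions (rt_index = 3), partition 0 uses words 0..2 and partition
1 uses words 3..5. With the additive offset, region 40 of partition 0 and
region 5 of partition 1 would both land in word 3. The table also has to
be zeroed before the first set_region call.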

Also, your code to iterate the bitarray is still inefficient. You check the
leading zeros in a uint, but then you step through the bits one by one to
find all the set bits. This should be faster:

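uint mask; /* initialized with some bits set. */
uint i, n;
i = 0;
while (mask != 0) {
  n = clz(mask);
  i += n;
  /* bit (31 - i) is set, counting from the least significant bit */
  mask <<= n;
  mask <<= 1; /* two shifts: one shift by n+1 would be by 32 when n == 31 */
  i++;        /* step past the bit just handled */
}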

or some variation of this.

>
> In the next days I’m planning to work on replacing the bitarray to a
> solution that uses a list to represent the regions involved in each
> partition (regiontable), which despite using more memory, should be
> considerably faster.
>

-- 
Vasco Alexandre da Silva Costa
PhD in Computer Engineering (Computer Graphics)
Instituto Superior Técnico/University of Lisbon, Portugal
_______________________________________________
BRL-CAD Developer mailing list
brlcad-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/brlcad-devel
