On Mar 20, 2015, at 7:17 PM, Vasco Alexandre da Silva Costa
<vasco.co...@gmail.com> wrote:
> I came up with this tentative work plan for improving the OpenCL (CL) RT code
> on BRL-CAD:
Excellent!
> - Hook into shot and db load routines (hooks disabled by default) in order to
> capture ellipsoid primitives and shots into the CL side. Get familiar with
> these top level interfaces of the code. (2 weeks)
This might require a little more inspection. For places where you might “get
opencl ready”, I would expect you to hook in when a raytrace context is created
(e.g., rt_dirbuild()) or during geometry prep, which every object implements
and is called during directory building. If you see the rt_ell_prep() function
in the opencl branch, that’s effectively where this is already happening for
our quick test.
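
To make the "hooks disabled by default" idea concrete, here is a minimal sketch in C of a capture hook called from a prep-style routine. Every name in it (ell_params_t, cl_capture_ell(), demo_ell_prep(), and so on) is hypothetical and stands in for whatever the real code would use. It is not the librt API and not what rt_ell_prep() in the opencl branch actually does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for prepped ellipsoid data; not a BRL-CAD structure. */
typedef struct {
    float center[3];
    float radii[3];
} ell_params_t;

#define CL_MAX_ELLS 64

/* Flat array that would later be copied to the OpenCL device. */
static ell_params_t cl_ells[CL_MAX_ELLS];
static size_t cl_nells = 0;
static int cl_capture_enabled = 0;  /* hook is off by default */

void cl_capture_enable(int on) { cl_capture_enabled = on; }
size_t cl_captured_count(void) { return cl_nells; }

/* Copy one ellipsoid to the CL-side array; returns 1 if captured, 0 if not. */
int cl_capture_ell(const ell_params_t *e)
{
    assert(e != NULL);
    if (!cl_capture_enabled || cl_nells >= CL_MAX_ELLS)
        return 0;
    cl_ells[cl_nells++] = *e;
    return 1;
}

/* Stand-in for a per-primitive prep routine (assumed shape, not rt_ell_prep). */
int demo_ell_prep(const ell_params_t *e)
{
    /* ... the normal CPU-side prep work would happen here ... */
    (void)cl_capture_ell(e);  /* no effect unless the hook was enabled */
    return 0;
}
```

With the hook left disabled, demo_ell_prep() behaves exactly as before, so the existing CPU path is unaffected until someone opts in.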
> - CL megakernel for boolean operations. I need to check how involved this
> will be to have a more accurate time estimate. Probably requires storing some
> primitive hierarchy on CL. Integration with C++ side of things for other
> primitives may be problematic. (2 weeks?)
This may be a large task, possibly the bulk of your timeframe, as boolean
weaving is one of the most complex aspects in BRL-CAD. Understanding it is
almost certainly going to take you some time. However, you’ll probably be able
to publish a paper on it when you’re done! :)
Something to consider: you could even propose *only* focusing on this aspect
for GSoC. What I mean is that you could spend time restructuring the rt
dispatch logic, which is currently something like

  forAllPixels->shootOneRay->traverseScene->evalHits->weaveBooleans->colorizePixel

and iterates over all pixels in a depth-first, pipeline style. You’d
restructure it into two phases, something like

  phase1: forAllPixels->shootOneRay->traverseScene->evalHits
  phase2: CLweaveAllBooleans->CLcolorizePixels
If you got it working, you’d speed up ray tracing by somewhere between 25% and
50% (as that is about how much time Boolean weaving takes) for all geometry,
not just ellipsoids, and it could go into immediate production use. It’d be in
users’ hands. More on this later.
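
The two-phase restructuring described above can be sketched roughly as follows in C. This is only an illustration under stated assumptions. The types and functions (seg_t, hitlist_t, phase1_gather(), phase2_weave_and_shade(), demo_shoot()) are hypothetical and are not BRL-CAD's, and weave_union() is a deliberately trivial placeholder (nearest entry point of any segment) rather than real boolean weaving. The point is the shape of the dispatch: phase 1 buffers raw hits per pixel, and phase 2 is a separate batch loop of the kind that could be handed to an OpenCL kernel.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SEGS 8

/* Hypothetical ray segment: where the ray enters and exits one solid. */
typedef struct {
    int   region;  /* id of the solid that was hit */
    float t_in;    /* entry distance along the ray */
    float t_out;   /* exit distance along the ray */
} seg_t;

/* Raw, unwoven hits for one pixel's ray. */
typedef struct {
    seg_t  segs[MAX_SEGS];
    size_t nsegs;
} hitlist_t;

/* Stand-in for shootOneRay->traverseScene->evalHits. */
typedef void (*shoot_fn)(size_t pixel, hitlist_t *out);

/* Placeholder for weaving: nearest t_in of any segment, or -1 on a miss. */
static float weave_union(const hitlist_t *h)
{
    float best = -1.0f;
    for (size_t i = 0; i < h->nsegs; i++)
        if (best < 0.0f || h->segs[i].t_in < best)
            best = h->segs[i].t_in;
    return best;
}

/* Phase 1: per pixel, stop after hit evaluation and buffer the segments. */
void phase1_gather(size_t npixels, shoot_fn shoot, hitlist_t *hits)
{
    assert(shoot != NULL && hits != NULL);
    for (size_t p = 0; p < npixels; p++) {
        hits[p].nsegs = 0;
        shoot(p, &hits[p]);
    }
}

/* Phase 2: one batch pass over all buffered hit lists (the GPU candidate). */
void phase2_weave_and_shade(size_t npixels, const hitlist_t *hits, float *depth)
{
    assert(hits != NULL && depth != NULL);
    for (size_t p = 0; p < npixels; p++)
        depth[p] = weave_union(&hits[p]);
}

/* Demo shooter: even pixels hit two solids, odd pixels miss everything. */
void demo_shoot(size_t pixel, hitlist_t *out)
{
    if (pixel % 2 == 0) {
        out->segs[0] = (seg_t){ 1, 5.0f, 6.0f };
        out->segs[1] = (seg_t){ 0, 2.0f, 4.0f };
        out->nsegs = 2;
    }
}
```

Calling phase1_gather() and then phase2_weave_and_shade() over the same buffers yields one depth value per pixel. Because phase 2 only reads the buffered hit lists, it can be swapped for a device-side implementation without touching the CPU shot routines.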
Time permitting, you could then work on the rest of the pipeline, like
implementing the top-level bundled dispatch and scene traversal acceleration,
and primitive shot() routines.
> - Implement CL rectilinear grids [1] (improvement on the Mike Gigante Nugrid
> currently used in BRL-CAD) spatial partitioning construction and traversal.
> Should reuse most of the regular grid construction code but requires some
> extra construction steps and has different traversal scheme (2.5 weeks)
Sounds good, but know that we don’t actually use Nugrid. It’s faster for some
scenes, slower for others — overall a wash. I think there’s also a bug
somewhere in there. We use a custom BSP traversal for production work that is
much more thoroughly tested and robust. Grids generally do much better on the
GPU than they do on the CPU.
> - Cleanups, bugfixes, final tests, docs. (2.5 weeks)
So the only planning concern is that this isn’t an incremental plan. It’s
always a big red flag when a proposal has a big “test and document” at the end.
If anything goes wrong or takes a couple weeks longer than planned with one of
your early steps, we usually end up with something incomplete that cannot be
shipped to users.
Assume something will go wrong, that it will take longer, or your computer will
explode next week. How would you change your approach so that users still get
something? If you had to immediately stop working, there should be something
usable (without requiring additional work by you or others to make it usable).
This is an aspect of “coding complete” mentioned on
http://brlcad.org/wiki/Google_Summer_of_Code/Acceptance#Write_complete_code
Basically, this means thinking through how you can incrementally break up the
task into phases where each phase gets tested and could be put into production
use without detriment. It not only means you’re continually cleaning up,
testing, and updating docs, it means you’re continually focused on the user and
their benefit instead of your development comfort. If that means getting a
little less done, that’s okay. Users will love you for it.
My other concern was that your objective doesn’t result in a feature that
affects users. Yay, devs can raytrace ellipsoids MUCH faster … and users see
nothing. That’s why I’d suggest either focusing only on boolean weaving, or on
bundled ray dispatch+traversal, or hit+result gathering, etc — something that
could be put into immediate use, even if it’s not going to give the 10x speedup
until the rest of the pipeline is converted. Changing BRL-CAD’s render
pipeline to support this style of evaluation is going to be a lot of work.
Still, outstanding proposal progress. This and other proposals are making me
pretty darn excited!
Cheers!
Sean
p.s. Some technical libraries to consider in lieu of directly binding to OpenCL:
http://viennacl.sourceforge.net
http://ddemidov.github.io/vexcl/
> I am fairly sure on the time estimates I made for the CL side of things but
> am unsure on the BRL-CAD integration. Perhaps you know better how involved
> these tasks would be?
Assume they will be VERY involved.
_______________________________________________
BRL-CAD Developer mailing list
brlcad-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/brlcad-devel