In this particular example, we do two forms of global illumination in
separate passes. While there are techniques we could apply to discretise
the scene and save I/O and memory overhead per tile, the generic nature
of the scenes (we have both indoor and outdoor) means we have lights
with infinite range and render from nothing more discrete than a polygon
soup. Also, the layout of the tiles is in most cases not convenient for
discretisation, since the faces are laid out to optimise space.
We already use two separate passes: the tiles are rendered once, and the
results are used to compute the second pass. I think this would
translate very well into Hadoop, as I have seen examples of multi-pass
jobs on this mailing list. While I do not think there is much scope for
optimising the discretisation of the scene, I believe I could make
substantial gains in reducing I/O and job distribution overhead by using
Hadoop.
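To give an idea of the shape I have in mind, the chaining really amounts
to nothing more than feeding the first job's output directory to the
second job as its input. Something like the untested sketch below, where
the HDFS paths, pass names and executables are all made up:

    // Untested sketch: paths and executables below are placeholders, but
    // the chaining itself is just "output of job one becomes the input of
    // job two", driven here through the bin/hadoop pipes command.
    #include <cstdlib>

    int main() {
      // Pass one: render every tile once (one map task per tile command).
      int rc = std::system(
          "bin/hadoop pipes"
          " -input /render/tile-commands"   // hypothetical list of tile commands
          " -output /render/pass1"
          " -program /render/bin/render-pass1");
      if (rc != 0) return rc;

      // Pass two: compute the second bounce from the first pass's results.
      return std::system(
          "bin/hadoop pipes"
          " -input /render/pass1"           // first job's output feeds the second
          " -output /render/pass2"
          " -program /render/bin/render-pass2");
    }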
My main questions are really about the finer points of a C++ versus Java
implementation. I haven't coded Java in years, so I'm interested in how
easy it is to simply wrap the existing C++ components for the map and
reduce phases. I notice there is a C++ API with methods for manipulating
files in the DFS, which sounds ideal, as the first job I run (I think
this is equivalent to the Mapper, is that right?) generates the scene
description and composes the tile rendering commands.
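To make that question concrete, this is roughly what I imagined the
wrapper looking like, going by the Pipes word count example. The tile
command format and renderTile() are stand-ins for my existing renderer,
so treat it as a sketch rather than working code:

    #include <string>

    #include "hadoop/Pipes.hh"
    #include "hadoop/TemplateFactory.hh"

    // Stand-in for the existing renderer: in reality this would call into
    // the current C++ rendering code and return the rendered tile data.
    std::string renderTile(const std::string& tileCommand) {
      return tileCommand;  // placeholder
    }

    class TileMapper : public HadoopPipes::Mapper {
    public:
      TileMapper(HadoopPipes::TaskContext& context) {}

      void map(HadoopPipes::MapContext& context) {
        // Each input value is assumed to be one tile rendering command,
        // e.g. "tile_0001 0 0 256 256" -- this format is made up.
        std::string command = context.getInputValue();
        std::string tileId  = command.substr(0, command.find(' '));

        // The big scene file could be read here via libhdfs
        // (hdfsConnect / hdfsOpenFile / hdfsRead) or from a local copy.
        context.emit(tileId, renderTile(command));
      }
    };

    class StitchReducer : public HadoopPipes::Reducer {
    public:
      StitchReducer(HadoopPipes::TaskContext& context) {}

      void reduce(HadoopPipes::ReduceContext& context) {
        // All fragments for a tile arrive together; "compositing" here is
        // just concatenation as a placeholder for the real stitching code.
        std::string composited;
        while (context.nextValue()) {
          composited += context.getInputValue();
        }
        context.emit(context.getInputKey(), composited);
      }
    };

    int main(int argc, char* argv[]) {
      return HadoopPipes::runTask(
          HadoopPipes::TemplateFactory<TileMapper, StitchReducer>());
    }

If wrapping really is as simple as subclassing Mapper and Reducer like
this and calling into my existing code, that settles most of the C++
versus Java question for me.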
-Rob
Ted Dunning wrote:
This is interesting.
One of the poster children for Google's map-reduce is rendering for Google
maps. Each object in the world is keyed according to the tile that it
affects in the map and then the reduce renders the tile given all of the
objects that affect it. Very slick. Very fast.
The question with 3d rendering is whether you can limit effects in this way,
especially if you are using things like radiosity where illumination on
objects way across the screen can affect the lighting on other objects.
It may be that multiple map-reduce passes could be used to do this, but I
don't know.
If you are only passing the entire scene to independent tile renderers, then
you really don't have much to do. Just put your scene description into the
cluster with a huge replication and run.
On 10/16/07 8:37 AM, "Robert Jessop" <[EMAIL PROTECTED]> wrote:
Hi there. I did a search of the mailing list archives looking for
something similar to this, but I didn't find anything, so apologies if
this has been discussed before.
I'm investigating using Hadoop for distributed rendering. The Mapper
would define the tiles to be rendered, and the nodes would render them
using the scene data (which is, for the sake of argument, all wrapped up
in one big binary file on HDFS). The reducer would take the output
tiles and stitch them together to form the final image. I have a system
that does this already, but it has none of the advantages of a
distributed file system: there are lots of I/O bottlenecks and
communication overheads. All the code is currently in C++.
Does this sound like a good use for Hadoop?
-Rob