Hi,

I would like to bundle a binary with a Hadoop job and call it from
inside the mappers/reducers.

The binary is a C++ program that I do not want to re-implement in Java.
I want to fork it as a subprocess from inside the mappers/reducers and
capture its output (on stdout).
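
To make "calling it" concrete, here is a minimal sketch of the
mapper-side forking I have in mind, using the old mapred API.
ForkingMapper, ./mybinary, and the output key are names I made up, and
it assumes the binary is already sitting in the task's working
directory:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class ForkingMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    // Fork the C++ binary, passing the record on argv and reading
    // whatever it prints to stdout.  (./mybinary is a placeholder.)
    ProcessBuilder pb = new ProcessBuilder("./mybinary", value.toString());
    pb.redirectErrorStream(true);  // fold stderr into stdout
    Process p = pb.start();

    BufferedReader out = new BufferedReader(
        new InputStreamReader(p.getInputStream()));
    String line;
    while ((line = out.readLine()) != null) {
      output.collect(new Text("result"), new Text(line));
    }

    try {
      int rc = p.waitFor();
      if (rc != 0) {
        throw new IOException("mybinary exited with " + rc);
      }
    } catch (InterruptedException e) {
      throw new IOException("interrupted waiting for mybinary");
    }
  }
}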

So, I need to get the binary onto the compute nodes and figure out how
to call it there. Ideally, the binary would be shipped to the compute
nodes alongside the job jar. (I'm not interested in solutions that
involve copying the binary to each node by hand.)
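
From skimming the API docs, the DistributedCache looks like it might be
the intended mechanism for this, though I have not tried it. A rough
sketch of the driver-side setup I imagine (the local path, HDFS path,
and class name are all placeholders):

import java.net.URI;

import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;

public class JobSetup {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(JobSetup.class);

    // Push the binary into HDFS so the framework can fan it out.
    FileSystem fs = FileSystem.get(conf);
    fs.copyFromLocalFile(new Path("bin/mybinary"),    // placeholder
                         new Path("/tmp/mybinary"));  // placeholder

    // Have each task's working directory get a symlink named
    // "mybinary" (the URI fragment) pointing at the cached copy,
    // so the mapper can exec ./mybinary.
    DistributedCache.createSymlink(conf);
    DistributedCache.addCacheFile(new URI("/tmp/mybinary#mybinary"), conf);

    // ... set mapper, input/output paths, etc., then
    // JobClient.runJob(conf).
  }
}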

Note that Streaming is not a solution here: the binary itself is not
the mapper or reducer; the binary needs to be *called* from the
mapper/reducer.

Does anyone have experience with this? Any suggestions are much appreciated!

-daren