Hello all,
I have Hadoop up and running and an embarrassingly parallel problem, but I can't
figure out how to map the problem onto it. My apologies in advance if this is
obvious and I'm just not getting it.
My HPC application isn't a batch program, but runs in a continuous loop (like a
server) *outside* of the Hadoop machines, and it should occasionally farm out a
large computation to Hadoop and use the results. However, all the examples I
have come across interact with Hadoop via files and the command line. (Perhaps
I am looking in the wrong places?)
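
To make the shape of this concrete, here is a compilable sketch of the control
flow I have in mind. Everything Hadoop-related in it (JobHandle,
submit_to_hadoop, job_finished, fetch_results) is a made-up placeholder, not a
real API; it only illustrates the interaction I'm after:

#include <cstdio>
#include <vector>

// Hypothetical placeholders -- NOT a real Hadoop API, just stubs that
// show the shape of the interaction I am after.
struct JobHandle { bool done = true; };

JobHandle submit_to_hadoop(const std::vector<double>& input) {
    // Imagine this hands `input` to the cluster directly from memory,
    // without writing files or shelling out to the hadoop command.
    std::printf("submitting %zu values\n", input.size());
    return JobHandle{};
}

bool job_finished(const JobHandle& job) { return job.done; }

std::vector<double> fetch_results(const JobHandle&) {
    return {};  // results would come back from the cluster here
}

int main() {
    for (int i = 0; i < 3; ++i) {  // stands in for the continuous server loop
        // ... normal request handling ...

        // Occasionally a large, embarrassingly parallel job comes up:
        std::vector<double> input(1000, 1.0);  // stand-in for real data
        JobHandle job = submit_to_hadoop(input);

        // Keep serving while the cluster works; poll or block for completion.
        while (!job_finished(job)) { /* do other useful work */ }

        std::vector<double> results = fetch_results(job);
        (void)results;  // ... fold results back into the application ...
    }
    return 0;
}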
So,
* is Hadoop the right platform for this kind of problem?
* is it possible to use Hadoop without going through the command line and
writing all input data to files?
If so, could someone point me to some examples and documentation? I am coding
in C/C++, in case that is relevant, but examples in any language would be
helpful.
Thanks for any suggestions,
Parker