Raymond Jennings III wrote:
In other words: I have a situation where I want to feed the output of the first iteration of my MapReduce job into a second iteration, and so on. I have a "for" loop in my main method that sets up the job parameters and runs the job through all iterations, but on about the third run the Hadoop processes lose their association with the 'jps' command and then weird things start happening. I remember reading somewhere about "chaining". Is that what is needed? I'm not sure what causes jps to stop reporting the Hadoop processes even though they are still active, as can be seen with the "ps" command. Thanks. (This is on version 0.20.1.)
Yes. Here is something that does this, to do a PageRank-style ranking of things:
http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/extras/citerank/src/org/smartfrog/services/hadoop/benchmark/citerank/CiteRank.java?revision=7728&view=markup
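
Separately from the linked CiteRank code, here is a rough sketch of the general shape of such a driver loop: run one pass, wait for it to finish, then use its output directory as the input of the next pass. It is deliberately Hadoop-free so it runs on its own. `PassRunner`, `IterativeDriver`, `runChain` and the `iter-N` directory naming are all made up for illustration and are not Hadoop types or conventions. In a real 0.20 driver the body of `run` would presumably build a fresh job configuration and submit it with a blocking call such as `JobClient.runJob(JobConf)` or `Job.waitForCompletion(true)`. This only illustrates the chaining pattern and says nothing about the jps behaviour.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for "configure and run one MapReduce pass"; not a Hadoop type. */
interface PassRunner {
    /** Runs one pass from input to output, blocking until done; returns true on success. */
    boolean run(String input, String output);
}

final class IterativeDriver {
    /**
     * Runs the passes one after another, feeding each pass's output
     * directory to the next pass as its input.
     * Returns the path holding the final result.
     */
    static String runChain(String initialInput, String workDir,
                           int iterations, PassRunner runner) {
        String input = initialInput;
        for (int i = 0; i < iterations; i++) {
            // Each pass gets its own output directory (assumed naming scheme);
            // Hadoop refuses to start a job whose output directory already exists.
            String output = workDir + "/iter-" + i;
            // Stop at the first failed pass instead of feeding bad data forward.
            if (!runner.run(input, output)) {
                throw new IllegalStateException(
                        "pass " + i + " failed: " + input + " -> " + output);
            }
            input = output; // this pass's output is the next pass's input
        }
        return input;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        // Fake runner that only records which paths each pass would use.
        String last = runChain("in", "work", 3, (in, out) -> {
            log.add(in + " -> " + out);
            return true;
        });
        System.out.println(log);
        System.out.println("final output: " + last);
    }
}
```

Keeping the loop in the driver and building a new job object per pass keeps each iteration independent of the previous one; only the path is carried over.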
