Michael Basnight wrote:
I have a Java app that runs in Tomcat and now needs to talk to my Hadoop
infrastructure. Typically, all the testing I've done / examples I've seen show
starting something that uses Hadoop via the 'bin/hadoop jar' command, but
as you can imagine this is no good for an existing Tomcat app. I've looked
through the .sh files in the bin/ directory, and it would require extensive work
to modify the scripts to export the environment variables so that Tomcat can be
restarted without a special init script (and with those variables
intact). The last thing I want to do is hand-crank files that may or may
not change in new Hadoop distros. Is there a known way to use the Hadoop
infrastructure outside of the 'bin/hadoop jar' command?
mb
What are you trying to do? Submit jobs, or start Hadoop itself?
Hadoop is tricky to start up in-VM; the HADOOP-3628 branch of trunk can
do this, but your security manager needs to intercept the odd call to
System.exit(), and there are a lot of singletons for monitoring, so it is
better to start up Hadoop in a new VM.
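To illustrate the security-manager point, here is a minimal sketch of trapping System.exit() so that an embedded Hadoop call cannot take Tomcat down with it. This is a generic JDK technique, not code from the HADOOP-3628 branch, and SecurityManager is deprecated on modern JVMs (Java 17+), so treat it as period-appropriate illustration only:

```java
import java.security.Permission;

// Sketch: refuse System.exit() calls made by in-VM Hadoop code so the
// hosting server (e.g. Tomcat) survives. Hypothetical class name.
public class ExitTrap extends SecurityManager {
    @Override
    public void checkExit(int status) {
        // Turn the exit attempt into an exception the caller can catch
        throw new SecurityException("intercepted System.exit(" + status + ")");
    }

    @Override
    public void checkPermission(Permission perm) {
        // Allow everything else; a real deployment would be stricter
    }

    public static void main(String[] args) {
        System.setSecurityManager(new ExitTrap());
        try {
            System.exit(1); // what a Hadoop daemon might do on a fatal error
        } catch (SecurityException expected) {
            System.out.println("caught: " + expected.getMessage());
        }
    }
}
```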
1. Job submission can be done remotely by way of the JobClient API:
http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/components/hadoop/src/org/smartfrog/services/hadoop/components/submitter/SubmitterImpl.java?view=markup
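A minimal sketch of remote submission through the old org.apache.hadoop.mapred JobClient API. The jobtracker/namenode hosts, paths, and class name here are made-up placeholders; the key point is that the JobConf is built in-process rather than inherited from bin/hadoop's environment:

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RunningJob;

// Hypothetical driver class; requires the Hadoop jars on the classpath
public class RemoteSubmit {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf();
        // Point at the remote cluster explicitly, since there is no
        // bin/hadoop script setting these for us (hosts are examples)
        conf.set("mapred.job.tracker", "jobtracker.example.org:9001");
        conf.set("fs.default.name", "hdfs://namenode.example.org:9000");
        // The jar containing your map/reduce classes must be shipped too
        conf.setJarByClass(RemoteSubmit.class);
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        FileInputFormat.setInputPaths(conf, new Path("/input"));
        FileOutputFormat.setOutputPath(conf, new Path("/output"));

        // Submits the job and blocks until it completes
        RunningJob job = JobClient.runJob(conf);
        System.out.println("job succeeded: " + job.isSuccessful());
    }
}
```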
2. You can also run any instance of Tool by creating and then invoking
it, with any configuration you choose to run:
http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/components/hadoop/src/org/smartfrog/services/hadoop/components/submitter/ToolRunnerComponentImpl.java?view=markup
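A minimal sketch of the second option: driving a Tool in-process with ToolRunner, handing it a Configuration you build yourself. The Tool implementation and the host name are placeholders, not code from the linked component:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical example; requires the Hadoop jars on the classpath
public class EmbeddedToolRun {
    // Stand-in Tool; substitute your own job driver here
    static class MyTool extends Configured implements Tool {
        public int run(String[] toolArgs) throws Exception {
            // A real Tool would build and submit a job using getConf()
            System.out.println("fs is " + getConf().get("fs.default.name"));
            return 0;
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Example host; configure to match your cluster
        conf.set("fs.default.name", "hdfs://namenode.example.org:9000");
        // ToolRunner parses generic options (-D, -fs, -jt) from the args,
        // applies them to conf, then calls the tool's run() method
        int exit = ToolRunner.run(conf, new MyTool(),
                new String[] {"/input", "/output"});
        System.out.println("tool exit code: " + exit);
    }
}
```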
Both suffer from version sensitivity: everything has to be on exactly
the same version of Hadoop. The Hadoop cluster also has to be reachable
from Tomcat, which can be a problem if Tomcat is in a DMZ and you are
trying to secure Hadoop.