Re: running hadoop remotely from inside a java program

Steve Loughran Wed, 09 Jul 2008 02:28:39 -0700

Deyaa Adranale wrote:

Thanks for your help.

Please, I need more explanation on these points:

* it is not too far away, network-wise
What do you mean by network-wise? What are the requirements for the connection between the client and the server? I ask because I think my cluster is protected by a firewall.

I don't know if that will work or not. Given what Hadoop security is like (minimal), a firewall between the cluster and the rest of the world is important.

* the client hadoop configuration is in sync with the servers
How do I do this?
Until now I have only run jobs on Hadoop; I have never configured it. This will not mean that the client machine becomes a node in the cluster, right?

The client machine's XML files need to be synchronised with those on the server, otherwise

And what if my client does not have a Hadoop installation and I don't want to force him to install one just to use my tool? Can't I simply submit jobs to the cluster remotely from my Java code, using SSH for example?


That's one option:
* scp the JARs to a machine in the cluster
* ssh in to that machine
* use the command line tools there to run the job

* use distcp to get the results back to the local filesystem, and scp them back to the user

It's workable, and would avoid having to have the Hadoop client-side JARs installed on every machine. Getting the data in and out efficiently could be the tricky part.
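As a rough illustration of the scp/ssh steps listed above, here is a minimal Java sketch that drives them through the local scp and ssh command-line clients with ProcessBuilder. Everything named here is an assumption made for illustration: the class RemoteJobSubmitter, the gateway string (a user@host on the cluster that accepts key-based SSH logins), and all paths. It also assumes the hadoop script is on the remote PATH, and it does no shell quoting, so arguments containing spaces would need extra care.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds and runs the scp/ssh commands for each step.
class RemoteJobSubmitter {
    // user@host of a cluster machine reachable over SSH (an assumption)
    private final String gateway;

    RemoteJobSubmitter(String gateway) {
        this.gateway = gateway;
    }

    // Step 1: scp the job JAR to a directory on the cluster machine.
    public List<String> copyJarCommand(String localJar, String remoteDir) {
        return Arrays.asList("scp", localJar, gateway + ":" + remoteDir + "/");
    }

    // Steps 2 and 3: ssh in and run the job with the command-line tools there.
    public List<String> runJobCommand(String remoteJar, String mainClass, String... jobArgs) {
        StringBuilder remote = new StringBuilder("hadoop jar ")
                .append(remoteJar).append(' ').append(mainClass);
        for (String arg : jobArgs) {
            remote.append(' ').append(arg);
        }
        return Arrays.asList("ssh", gateway, remote.toString());
    }

    // Step 4a: copy the results out of the cluster filesystem with distcp.
    // remoteLocalDir must be an absolute path on the cluster machine.
    public List<String> exportResultsCommand(String hdfsDir, String remoteLocalDir) {
        return Arrays.asList("ssh", gateway,
                "hadoop distcp " + hdfsDir + " file://" + remoteLocalDir);
    }

    // Step 4b: scp the exported results back to the user's machine.
    public List<String> fetchResultsCommand(String remoteLocalDir, String localDir) {
        return Arrays.asList("scp", "-r", gateway + ":" + remoteLocalDir, localDir);
    }

    // Runs one command, forwarding its output, and fails on a non-zero exit code.
    public static void run(List<String> command) throws IOException, InterruptedException {
        int exit = new ProcessBuilder(command).inheritIO().start().waitFor();
        if (exit != 0) {
            throw new IOException("Command failed (" + exit + "): " + command);
        }
    }
}
```

A caller would pass each built command to RemoteJobSubmitter.run in order: copy the JAR, run the job, export the results, fetch them. Because distcp itself runs as a job on the cluster, whether a file:// destination ends up on the machine you ssh into depends on the setup; "hadoop fs -get" run over ssh may be a simpler way to pull a modest amount of output onto that machine.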


-steve
