Deyaa Adranale wrote:

Thanks for your help.

Please, I need more explanation on these:

* it is not too far away, network-wise
What do you mean by network-wise? What are the requirements for the connection between the client and the server? I ask because I think my cluster is protected by a firewall.

I don't know if that will work or not. Given what Hadoop security is like (minimal), a firewall between the cluster and the rest of the world is important.


* the client hadoop configuration is in sync with the servers
How do I do this?
Until now I have only run jobs on Hadoop; I have never configured it. This does not mean that the client machine will be a node in the cluster, right?

The client machine's XML files need to be synchronised with those on the server, otherwise
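
A minimal sketch of one way to check whether a client copy of a Hadoop XML configuration file agrees with the server copy. It assumes the usual <configuration>/<property>/<name>/<value> layout of those files; the function names are hypothetical, and how you fetch the server's copy is left open.

```python
import xml.etree.ElementTree as ET


def load_properties(xml_text):
    """Parse a Hadoop-style configuration document into a name -> value dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def config_differences(client_xml, server_xml):
    """Return {property: (client_value, server_value)} for every mismatch.

    A property missing on one side is reported with None on that side.
    """
    client = load_properties(client_xml)
    server = load_properties(server_xml)
    return {
        name: (client.get(name), server.get(name))
        for name in sorted(set(client) | set(server))
        if client.get(name) != server.get(name)
    }
```

An empty result means the two files define the same properties with the same values. This does not make the client a node in the cluster; it only compares settings.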


And what if my client does not have a Hadoop installation, and I don't want to force them to install one just to use my tool? Can't I simply submit jobs to the cluster remotely from my Java code, using SSH for example?

That's one option:
* scp the JARs to a machine in the cluster
* ssh in to that machine
* use the command line tools there to run the job
* use distcp to get the results back to the local filesystem, and scp them back to the user
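
The steps above can be sketched as a small script that builds the scp and ssh command lines. This is only an illustration under stated assumptions: the gateway host, the /tmp/remote-job staging directory and the function names are hypothetical, and it assumes key-based SSH access and that the hadoop command is on the remote PATH. The same commands could equally be launched from Java.

```python
import posixpath
import shlex
import subprocess


def build_remote_job_commands(gateway, local_jar, main_class, job_args,
                              hdfs_output, local_results_dir,
                              remote_dir="/tmp/remote-job"):
    """Return the command lines for the workflow, in order, as argument lists.

    gateway is a machine inside the cluster (for example "user@host");
    remote_dir is a hypothetical staging directory on that machine.
    """
    remote_jar = posixpath.join(remote_dir, posixpath.basename(local_jar))
    remote_results = posixpath.join(remote_dir, "results")
    return [
        # Make sure the staging directory exists on the cluster machine.
        ["ssh", gateway, shlex.join(["mkdir", "-p", remote_dir])],
        # 1. scp the JAR to a machine in the cluster.
        ["scp", local_jar, f"{gateway}:{remote_jar}"],
        # 2 and 3. ssh in and run the job with the command line tools there.
        ["ssh", gateway,
         shlex.join(["hadoop", "jar", remote_jar, main_class, *job_args])],
        # 4a. distcp the results to the cluster machine's local filesystem.
        ["ssh", gateway,
         shlex.join(["hadoop", "distcp", hdfs_output,
                     f"file://{remote_results}"])],
        # 4b. scp the results back to the user.
        ["scp", "-r", f"{gateway}:{remote_results}", local_results_dir],
    ]


def run_all(commands):
    """Run each command in turn, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

Calling run_all(build_remote_job_commands(...)) would execute the steps in sequence. Building the commands separately from running them makes it easy to print and inspect them first.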

It's workable, and would avoid having to have the Hadoop client-side JARs installed on every machine. Getting the data in and out efficiently could be the tricky part.

-steve
