Deyaa Adranale wrote:
Thanks for your help.
Please, I need more explanation on these:
* it is not too far away, network-wise
What do you mean by network-wise? What are the requirements for the
connection between the client and server? I ask because I think that my
cluster is protected by a firewall.
I don't know if that will work or not. Given what Hadoop security is like
(minimal), a firewall between the cluster and the rest of the world is
important.
* the client hadoop configuration is in sync with the servers
How do I do this?
Until now I have only been running jobs on Hadoop; I have never
configured it.
This will not mean that the client machine becomes a node in the
cluster, right?
The client machine's XML files need to be synchronised with those on the
server, otherwise
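One way to check that the client and server XML files have not drifted apart is to compare their properties. What follows is only a minimal sketch, assuming both files use the usual Hadoop layout of <property> elements with <name> and <value> children. The class ConfigDrift and its methods are hypothetical helpers, not part of Hadoop, and the file paths are whatever you pass on the command line.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: reports properties that differ between two Hadoop-style XML files.
public class ConfigDrift {

    // Parse <property><name>..</name><value>..</value></property> entries into a sorted map.
    static Map<String, String> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new TreeMap<>();
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element property = (Element) nodes.item(i);
                String name = childText(property, "name");
                if (name != null) {
                    props.put(name, childText(property, "value"));
                }
            }
            return props;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse configuration XML", e);
        }
    }

    // Text of the first child element with the given tag, or null if there is none.
    static String childText(Element parent, String tag) {
        NodeList list = parent.getElementsByTagName(tag);
        return list.getLength() == 0 ? null : list.item(0).getTextContent().trim();
    }

    // Every property whose value differs, or that exists on only one side.
    static Map<String, String> differences(Map<String, String> client, Map<String, String> server) {
        TreeSet<String> names = new TreeSet<>(client.keySet());
        names.addAll(server.keySet());
        Map<String, String> diff = new TreeMap<>();
        for (String name : names) {
            String c = client.get(name);
            String s = server.get(name);
            if (!Objects.equals(c, s)) {
                diff.put(name, "client=" + c + " server=" + s);
            }
        }
        return diff;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: ConfigDrift <client.xml> <server.xml>");
            return;
        }
        String client = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        String server = new String(Files.readAllBytes(Paths.get(args[1])), StandardCharsets.UTF_8);
        differences(parse(client), parse(server))
                .forEach((name, values) -> System.out.println(name + ": " + values));
    }
}
```

Run it against a copy of the server's file fetched by whatever means you have; an empty output means the two files define the same properties with the same values.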
And what if my client does not have a Hadoop installation and I don't
want to force him to install one just to use my tool? Can't I simply
submit jobs to the cluster remotely from my Java code, using SSH for
example?
That's one option:
* scp the JARs to a machine in the cluster
* ssh in to that machine
* use the command line tools there to run the job
* use distcp to get the results back to the local filesystem, and scp
them back to the user
It's workable, and would avoid having to have the Hadoop client-side
JARs installed on every machine. Getting the data in and out
efficiently could be the tricky part.
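The four steps above could be driven from Java roughly as follows. This is only a sketch under stated assumptions: the scp and ssh binaries are on the client's PATH, key-based login to a cluster machine already works, and the host, paths and class name are hypothetical placeholders. It uses "hadoop fs -get" rather than distcp to pull the output onto the cluster machine's local disk, which is a swapped-in choice for the sketch. The class RemoteJobCommands is made up for illustration.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds the scp/ssh command lines for running a job remotely.
// Arguments are not shell-quoted, so paths containing spaces would need extra care.
public class RemoteJobCommands {

    // Step 1: copy the job JAR to a directory on a machine in the cluster.
    static List<String> uploadJar(String localJar, String gateway, String remoteDir) {
        return Arrays.asList("scp", localJar, gateway + ":" + remoteDir + "/");
    }

    // Steps 2 and 3: ssh in and run the job with the command line tools there.
    static List<String> runJob(String gateway, String remoteJar, String mainClass,
                               String hdfsInput, String hdfsOutput) {
        String remote = "hadoop jar " + remoteJar + " " + mainClass
                + " " + hdfsInput + " " + hdfsOutput;
        return Arrays.asList("ssh", gateway, remote);
    }

    // Step 4a: copy the job output from HDFS to the cluster machine's local disk.
    static List<String> fetchToGateway(String gateway, String hdfsOutput, String remoteLocalDir) {
        return Arrays.asList("ssh", gateway, "hadoop fs -get " + hdfsOutput + " " + remoteLocalDir);
    }

    // Step 4b: copy the results from the cluster machine back to the user.
    static List<String> downloadResults(String gateway, String remoteLocalDir, String localDir) {
        return Arrays.asList("scp", "-r", gateway + ":" + remoteLocalDir, localDir);
    }

    // Run one command, passing its output through, and return its exit code.
    static int run(List<String> command) throws IOException, InterruptedException {
        return new ProcessBuilder(command).inheritIO().start().waitFor();
    }

    public static void main(String[] args) {
        String gateway = "user@gateway.example.org"; // placeholder host
        // Only print the commands here; call run(...) on each to actually execute them.
        System.out.println(uploadJar("myjob.jar", gateway, "/tmp/jobs"));
        System.out.println(runJob(gateway, "/tmp/jobs/myjob.jar", "org.example.MyJob",
                "/data/in", "/data/out"));
        System.out.println(fetchToGateway(gateway, "/data/out", "/tmp/jobs/out"));
        System.out.println(downloadResults(gateway, "/tmp/jobs/out", "results"));
    }
}
```

Each step should check the exit code returned by run(...) before moving on, since a failed upload or job would otherwise go unnoticed until the final copy.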
-steve