Interesting thread.

This relates to HADOOP-288.

It also relates to the thread I started last week on using URLs in general for input arguments. It seems like we should just take a URL for the jar, which could be file: or hdfs:.

Thoughts?
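
To make the URL idea concrete, here is a rough sketch of how a client might decide where the jar lives by looking only at the scheme. The class name JarLocation, the Kind enum, and the rule that a bare path counts as a local file are all assumptions for illustration, not existing Hadoop code. It uses java.net.URI rather than java.net.URL, since URL rejects schemes such as hdfs: unless a protocol handler is registered.

```java
import java.net.URI;

public class JarLocation {
    /** Where the job jar lives, judged only by the URI scheme (hypothetical type). */
    public enum Kind { LOCAL_FILE, HDFS }

    public static Kind classify(String jarUrl) {
        URI uri = URI.create(jarUrl);
        String scheme = uri.getScheme();
        // Assumption: a bare path with no scheme is treated as a local file.
        if (scheme == null || scheme.equalsIgnoreCase("file")) {
            return Kind.LOCAL_FILE;
        }
        if (scheme.equalsIgnoreCase("hdfs")) {
            return Kind.HDFS;
        }
        throw new IllegalArgumentException("Unsupported jar URL scheme: " + scheme);
    }

    public static void main(String[] args) {
        // The host, port, and paths below are made-up examples.
        System.out.println(classify("file:/tmp/job.jar"));
        System.out.println(classify("hdfs://namenode:9000/jobs/job.jar"));
    }
}
```

With something like this, a file: jar would be copied up from the client as today, while an hdfs: jar could be referenced in place.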

On Aug 31, 2006, at 10:54 AM, Doug Cutting wrote:

Frédéric Bertin wrote:
This should run on the client side, since it depends on the username, which is different on the server.
Then what about passing the username as a parameter to JobSubmissionProtocol.submitJob(...)? This avoids loading the whole JobConf on the client side just to set the username.

That sounds like a reasonable change to me.
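
A minimal sketch of what that change might look like, assuming a two-argument submitJob. The interface below is a hypothetical stand-in, not the actual JobSubmissionProtocol signature, and InMemoryTracker is a toy server used only to show the username travelling with the call.

```java
import java.util.HashMap;
import java.util.Map;

public class SubmitWithUser {
    /** Hypothetical variant of the submission interface; not the real Hadoop signature. */
    public interface SubmissionProtocol {
        String submitJob(String jobFile, String userName);
    }

    /** Toy in-memory server that records which user submitted each job. */
    public static class InMemoryTracker implements SubmissionProtocol {
        private final Map<String, String> owners = new HashMap<>();

        @Override
        public synchronized String submitJob(String jobFile, String userName) {
            String jobId = "job_" + (owners.size() + 1);
            owners.put(jobId, userName);
            return jobId;
        }

        public synchronized String ownerOf(String jobId) {
            return owners.get(jobId);
        }
    }

    public static void main(String[] args) {
        InMemoryTracker tracker = new InMemoryTracker();
        // The client only reads its own username; it never parses the job configuration.
        String user = System.getProperty("user.name");
        // "/tmp/job.xml" is a made-up path for illustration.
        String jobId = tracker.submitJob("/tmp/job.xml", user);
        System.out.println(jobId + " submitted by " + tracker.ownerOf(jobId));
    }
}
```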

Why not move it into the JobSubmissionProtocol (the JobTracker's submitJob method)?

These could probably run on the server. They're currently run on the client in an attempt to return errors as quickly as possible when jobs are misconfigured.
Is it really quicker to perform all those checks remotely from the client than to ask the JobTracker to perform them locally? (Just a question; I really have no idea of the answer.)

We'd need to be careful that this is not a synchronized method on the server, so it doesn't interfere with other server activities. Also, checking the input and output has to be much faster than the RPC timeout, which it should be, since this just checks for the existence of directories, not of individual files.
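
One way to picture the concern above: the directory checks run outside any lock, and only the brief bookkeeping step is synchronized. This is a sketch under stated assumptions, not the JobTracker's code. The class UnsynchronizedChecks is hypothetical, the local filesystem stands in for the DFS, and the rule that the output directory must not already exist is an assumption of the example.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class UnsynchronizedChecks {
    private final List<String> jobs = new ArrayList<>();

    // Deliberately NOT a synchronized method: the directory checks run
    // without holding the tracker's lock, so other calls are not blocked.
    public String submitJob(String inputDir, String outputDir) {
        validate(Paths.get(inputDir), Paths.get(outputDir));
        // Only the short state update takes the lock.
        synchronized (this) {
            jobs.add(inputDir + " -> " + outputDir);
            return "job_" + jobs.size();
        }
    }

    // Checks directories only, never individual files, so it stays cheap.
    static void validate(Path input, Path output) {
        if (!Files.isDirectory(input)) {
            throw new IllegalArgumentException("Input directory does not exist: " + input);
        }
        // Assumption for this sketch: an existing output directory is an error.
        if (Files.exists(output)) {
            throw new IllegalArgumentException("Output directory already exists: " + output);
        }
    }

    public static void main(String[] args) {
        String tmp = System.getProperty("java.io.tmpdir");
        UnsynchronizedChecks tracker = new UnsynchronizedChecks();
        System.out.println(tracker.submitJob(tmp, tmp + "/out-" + System.nanoTime()));
    }
}
```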

Doug
