[ 
https://issues.apache.org/jira/browse/HADOOP-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HADOOP-2778:
----------------------------------

    Attachment: 2778-1.patch

Unfortunately, this no longer compiles against trunk. HADOOP-2818 removed the 
deprecated {{Counter.Group::getDisplayName(String)}} and 
{{Counter.Group::getCounterNames()}} and HADOOP-3162 changed the way input and 
output paths are specified in 0.17. Also, {{RunningJob::getJobID()}}, 
{{JobClient::getJob(String)}}, and {{FileSystem::delete(Path)}} have been 
deprecated in 0.18.

The attached patch updates the counter code, removes the deprecation warnings, 
and adjusts some of the spacing to be closer to the coding standards. Any 
further changes bringing it closer to those standards (spacing around 
operators, etc.) would be appreciated.

Other comments:
* The javadoc explaining how to use this should probably be attached to the 
class, rather than the private {{HMRExec::getPropertiesFile}}. Surrounding the 
xml in a javadoc inline tag such as \{@code ... \} will, with luck, avoid the 
requirement to escape all the reserved chars.
* Some of the code is commented out, other parts disabled (e.g. line 884). If 
any feature is only partially supported/implemented, it should probably be 
removed (e.g. {{getTaskLogs}})
* Speaking of the retry code, what are the rules for this? It looks like a job 
will be re-executed if a parent was re-executed, but with different rules for 
the status files. With the retry failure logic disabled, is this distinction 
necessary?
* Any specification or documentation on this would be invaluable. A testcase 
would be difficult to write, but is there an example or some other way this can 
be validated to ensure it's kept up to date? A reference job would also be very 
helpful to prospective users (and reviewers :) )
* {{doPostSubmitStuff}} could use {{RunningJob::waitForCompletion()}} instead 
of reimplementing it, but I couldn't find any callers in the framework so I 
don't know its status. Neither should be swallowing the InterruptedException...
* {{wasParentRerun}} returns true if any parent completed after this job began, 
but false if this is the first time the job has run. Is a first run really the 
only case where {{DateFormat::parse}} can throw? Could the check for exceptions 
from the parent tasks be separated from this logic, so the +100 years tweak 
isn't necessary?
* loadProperties looks like it's being used for a number of checks incidental 
to its purpose, and each use of it appears to rely on a subset of what it does. 
For example, registerJob ignores its return value completely, though the call 
is still relevant because it could generate a log message; similarly, checks 
for null return values are inconsistently enforced, and exceptions are used in 
its control logic. It's difficult to tease out exactly what role it plays. Is 
it possible to refactor this section a bit?
* The percentage of maps/reduces failed or killed is set to 0% if the job is 
successful, which is probably too optimistic: a successful job can still have 
had failed or killed task attempts.
* Instead of:
{noformat}
       throw new IOException(e.getMessage());
{noformat}
It's usually more helpful to preserve the cause:
{noformat}
       throw (IOException)new IOException().initCause(e);
{noformat}
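
Two of the exception-handling points above (preserving the cause, and not swallowing InterruptedException) can be illustrated with a minimal, self-contained sketch. The class and method names here ({{ExceptionHandlingSketch}}, {{wrap}}, {{pause}}) are hypothetical and are not taken from the patch; only standard JDK calls are used.

```java
import java.io.IOException;

// Hypothetical illustration; none of these names come from HMRExec.
class ExceptionHandlingSketch {

  /** Wraps an exception in an IOException, keeping the original as the cause. */
  public static IOException wrap(Exception e) {
    // initCause returns Throwable, hence the cast back to IOException.
    return (IOException) new IOException(e.getMessage()).initCause(e);
  }

  /**
   * Sleeps for the given interval. If interrupted, restores the thread's
   * interrupt status instead of discarding it, and reports false.
   */
  public static boolean pause(long millis) {
    try {
      Thread.sleep(millis);
      return true;
    } catch (InterruptedException ie) {
      // Re-assert the interrupt so callers further up can still see it.
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    IOException wrapped = wrap(new IllegalStateException("status file missing"));
    // The original exception type and stack trace remain reachable.
    System.out.println(wrapped.getCause().getClass().getName());

    Thread.currentThread().interrupt();   // simulate an interrupt request
    System.out.println(pause(10));        // sleep aborts immediately
    System.out.println(Thread.interrupted());
  }
}
```

The cast-and-{{initCause}} form is needed on Java 5, where IOException has no constructor taking a cause; Java 6 added {{IOException(String, Throwable)}}, which expresses the same thing more directly.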

> Hadoop job submission via ant using HMRExec
> -------------------------------------------
>
>                 Key: HADOOP-2778
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2778
>             Project: Hadoop Core
>          Issue Type: New Feature
>         Environment: Submit/monitor hadoop map-reduce jobs via ant
>            Reporter: Srikanth Kakani
>         Attachments: 2778-1.patch, hadoop-hmrexec.patch
>
>
> Patch attached please check

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
