[
https://issues.apache.org/jira/browse/OOZIE-983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454399#comment-13454399
]
Mona Chitnis commented on OOZIE-983:
------------------------------------
Sorry for not explaining this better. I meant that Oozie can get the NN URI
from the workflow's global conf and construct the WebHDFS end-point by
appending the http port of 'dfs.http.address', obtained from the default
hadoop conf. This assumes the user has not configured a non-default dfs http
port. If he has, he will be required to either supply his own end-point or
provide a *-site.xml in Oozie's hadoop-conf for the port to be picked up from.
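
A rough sketch of that end-point construction, in Python for brevity. The
function name, the example host, and the fallback to the Hadoop 1.x default
'dfs.http.address' value (0.0.0.0:50070) are assumptions for illustration,
not an existing Oozie API:

```python
from urllib.parse import urlsplit

# Assumed fallback: the Hadoop 1.x default value of 'dfs.http.address'.
# A *-site.xml in Oozie's hadoop-conf would override this.
DEFAULT_DFS_HTTP_ADDRESS = "0.0.0.0:50070"


def webhdfs_endpoint(nn_uri, dfs_http_address=DEFAULT_DFS_HTTP_ADDRESS):
    """Combine the NameNode host with the http port of dfs.http.address."""
    host = urlsplit(nn_uri).hostname
    if host is None:
        raise ValueError("no host in NameNode URI: %r" % nn_uri)
    # Only the port of dfs.http.address is used; its host part is
    # typically the wildcard 0.0.0.0 and is ignored here.
    port = int(dfs_http_address.rsplit(":", 1)[1])
    return "http://%s:%d/webhdfs/v1" % (host, port)


print(webhdfs_endpoint("hdfs://nn.example.com:8020"))
```

If the cluster uses a non-default http port, the caller would pass the
configured 'dfs.http.address' value as the second argument.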
I will recheck the design about kerberos auth for this one.
So the tool will be a wrapper that converts the CLI arguments 'oozie deploy...'
into RESTful WebHDFS requests.
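
To make the wrapper idea concrete, here is a minimal sketch that only plans
the WebHDFS requests for a local application directory and does not send
them. The function name and parameters are hypothetical; the MKDIRS and
CREATE operations and the 'user.name' query parameter (used for
pseudo-authentication on non-secure clusters) come from the WebHDFS REST API:

```python
import os
from urllib.parse import quote, urlencode


def plan_deploy(endpoint, local_dir, remote_base, user, overwrite=False):
    """Return the (method, url) pairs a deploy would issue, in order."""

    def url(path, **params):
        params["user.name"] = user
        query = urlencode(sorted(params.items()))
        return "%s%s?%s" % (endpoint, quote(path), query)

    remote_base = "/" + remote_base.strip("/")
    requests = [("PUT", url(remote_base, op="MKDIRS"))]
    for root, dirs, files in os.walk(local_dir):
        dirs.sort()  # deterministic traversal order
        rel = os.path.relpath(root, local_dir)
        remote_dir = remote_base
        if rel != ".":
            remote_dir += "/" + rel.replace(os.sep, "/")
        for name in dirs:
            requests.append(("PUT", url(remote_dir + "/" + name, op="MKDIRS")))
        for name in sorted(files):
            # CREATE is a two-step call: the NameNode answers with a
            # redirect to a DataNode, and the file body goes to that URL.
            flag = str(overwrite).lower()
            requests.append(
                ("PUT", url(remote_dir + "/" + name, op="CREATE", overwrite=flag))
            )
    return requests
```

A real tool would then issue each request, follow the CREATE redirect, and
stream the file contents. Kerberos handling is left out of this sketch.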
> [Design] Automatic Oozie application deployment using WebHDFS
> -------------------------------------------------------------
>
> Key: OOZIE-983
> URL: https://issues.apache.org/jira/browse/OOZIE-983
> Project: Oozie
> Issue Type: Bug
> Reporter: Mohammad Kamrul Islam
> Assignee: Mona Chitnis
>
> Problem:
> 1. A user can't upload the Oozie application from his dev box. The user needs
> access to a specialized box (such as a gateway) to run those hadoop commands.
> This is inconvenient: it requires multiple steps and imposes restrictions.
> 2. Automatic Oozie application versioning. If a user wants to deploy a new
> version of Oozie application, he needs to run multiple commands. In addition,
> there is no standard for this.
> Proposal:
> 1. Oozie will provide a tool that will automatically deploy the application
> and maintain a rigid versioning mechanism.
> 2. It could be a new script (e.g. oozie-deploy) or it can extend the existing
> oozie command (e.g. "oozie -deploy....."). TBD
> 3. The new script will get the necessary information to launch a WebHDFS
> command from the user and upload the necessary files. It includes: WebHDFS
> end point, security token (for secured version), local application directory
> and remote application base path.
> 4. Using the appropriate WebHDFS REST API, the tool will deploy the
> application. User can choose whether to override an existing application
> path.
> 5. The user can ask to upload a new version of the application. The new
> version could be user provided or auto created by the script. For auto
> version selection, the oozie tool will check the existing application path
> for the pattern "v?" and then select the new version number.
> 6. To upload a new application version, the oozie tool will first upload
> the application and then kill the old job (How to get the old job id?).
> Finally, it will submit the new application.
> Open question:
> 1. How to pass the kerberos token? Especially from a dev box.
> 2. Who will determine the new version? The user, or the tool automatically?
> Other key points:
> 1. Only supported for Hadoop 1.0.2+
> 2. Need to use/develop some wrapper tools which can hide most of the WebHDFS
> details. There are already two such tools: a) for Python,
> https://github.com/drelu/webhdfs-py and b) for Ruby,
> https://github.com/zenja/webhdfs-ruby. At this point the options are:
> * Write a new Java wrapper class.
> * Write a new wrapper tool using pure shell commands.
> * Reuse python or Ruby libraries.
> Overall, we need to do it correctly from the beginning. Comments from
> others are highly appreciated.
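
For the auto version selection in item 5 of the proposal, one possible
reading is sketched below. It assumes that "v?" means a directory named "v"
followed by an integer, and that the existing names come from listing the
application base path (for example the pathSuffix fields of a WebHDFS
LISTSTATUS response). The function name is hypothetical:

```python
import re

# Assumed interpretation of the "v?" pattern: "v" followed by digits only.
_VERSION_DIR = re.compile(r"^v(\d+)$")


def next_version(existing_names):
    """Pick the next "vN" name given the entries under the base path."""
    versions = [
        int(match.group(1))
        for match in map(_VERSION_DIR.match, existing_names)
        if match
    ]
    # Start at v1 when no versioned directory exists yet.
    return "v%d" % (max(versions) + 1 if versions else 1)


print(next_version(["v1", "v2", "lib", "v10"]))
```

Comparing the numbers as integers rather than as strings keeps "v10" ahead
of "v2"; entries that do not match the pattern are ignored.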