[ 
https://issues.apache.org/jira/browse/OOZIE-1658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861862#comment-13861862
 ] 

Rohini Palaniswamy commented on OOZIE-1658:
-------------------------------------------

Some additional context:
   Oozie already puts most of this information in the name of the launched job, 
and some of it in the job configuration.

For example:
Name of the Job:
oozie:launcher:T=java:W=my-wf:A=startup:ID=0149397-131211211232989-oozie-wrkf-W 
(for a launcher job)
oozie:action:T=map-reduce:W=my_wf:A=transform:ID=0110342-131211211232989-oozie-wrkf-W
  (for a child job)

Configuration:
oozie.action.id         0110342-131211211232989-oozie-wrkf-W @transform 
oozie.job.id    0110342-131211211232989-oozie-wrkf-W  
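
As a rough sketch of how the default job name shown above could be taken apart 
for analytics: the helper below is hypothetical (it is not part of Oozie), and it 
assumes that no field value contains a ":" and that the name has not been 
overridden.

```python
def parse_oozie_job_name(name):
    """Split a default Oozie job name into its fields.

    Hypothetical helper. Assumes the layout
    oozie:<launcher|action>:T=<type>:W=<workflow>:A=<action>:ID=<workflow job id>
    and that no field value contains ':'.
    Returns None when the name does not start with 'oozie:'.
    """
    parts = name.split(":")
    if len(parts) < 3 or parts[0] != "oozie":
        return None  # name was overridden or is not an Oozie job
    fields = {"kind": parts[1]}  # 'launcher' or 'action'
    for part in parts[2:]:
        key, sep, value = part.partition("=")
        if sep:  # skip segments that are not key=value
            fields[key] = value
    return fields


launcher = parse_oozie_job_name(
    "oozie:launcher:T=java:W=my-wf:A=startup:"
    "ID=0149397-131211211232989-oozie-wrkf-W"
)
print(launcher)
```

A job whose name was overridden returns None here, which is exactly the case 
where the name can no longer be relied on.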

But OOZIE-949 allows overriding the name of the job, and many users do so, in 
which case the information carried in the job name is lost. Also, the 
configuration does not contain any information about the bundle or coordinator. 
This jira adds those. At present we have to run a lot of queries against the 
Oozie database to gather analytics on how many jobs Oozie is launching, and 
with purging set to 30 days, we cannot gather data for more than a month. The 
queries also need a lot of joins, which puts load on the database. We have the 
job configuration from job history stored in Hive tables, so this information 
can be used to run more analytics on Oozie with MapReduce, without loading the 
Oozie database.

 This will also make debugging faster, since there is no need to traverse from 
the workflow to its coordinator or bundle through job logs, the UI, or the 
command line. For now, this option is disabled by default. 


> Add bundle, coord, wf and action related information to launched M/R jobs.
> --------------------------------------------------------------------------
>
>                 Key: OOZIE-1658
>                 URL: https://issues.apache.org/jira/browse/OOZIE-1658
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: purshotam shah
>            Assignee: purshotam shah
>
> We should add workflow-related information to the launched job. This 
> information can be used for analytics, such as how many Oozie jobs were 
> submitted in a particular period or the total number of failed Pig jobs, 
> drawn from the MapReduce job history logs and configuration.
> JobInfo will contain information about the bundle, coordinator, workflow, and 
> actions. If enabled, the Hadoop job will have a property (oozie.job.info) 
> whose value is multiple key/value pairs separated by ";". 
> Users can also add custom workflow properties to the jobinfo by adding 
> properties prefixed with "oozie.job.info."
> Eg.
> oozie.job.info="bundle.id=;bundle.name=;coord.name=;coord.nominal.time=;coord.name=;wf.id=;
> wf.name=;action.name=;action.type=;launcher=true"
>  
> Jobinfo can be enabled/disabled using "oozie.action.jobinfo.enable"
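
For illustration, a value in the format described above (key/value pairs 
separated by ";") could be parsed roughly as follows. This is only a sketch: the 
function name is hypothetical, the sample values are made up, and it assumes that 
neither keys nor values contain ";".

```python
def parse_job_info(value):
    """Parse an oozie.job.info-style string into a dict.

    Hypothetical helper. Assumes pairs are separated by ';' and each pair
    is split on its first '='. Empty values (e.g. 'bundle.id=') are kept
    as empty strings.
    """
    info = {}
    for pair in value.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing ';' or stray whitespace
        key, _, val = pair.partition("=")
        info[key.strip()] = val.strip()
    return info


# Made-up sample value; real values depend on the workflow being run.
sample = (
    "bundle.id=;bundle.name=;wf.name=my-wf;"
    "action.name=transform;action.type=map-reduce;launcher=true"
)
info = parse_job_info(sample)
print(info["action.type"], info["launcher"])
```

Splitting on the first "=" only means a value that itself contains "=" is kept 
intact.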



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
