What about the existing ways of getting job details: the CLI, the REST API,
or the Java API?

I believe most of the data should already be available. If something is missing 
we might need to extend that API.
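
As a rough sketch of what I mean, here is how the three audit items could be
pulled from the REST API's job-info call. The base URL and job ID are
placeholders, and the JSON field names (actions, name, startTime, endTime,
status, errorCode, errorMessage) are written from my reading of the Web
Services API documentation, so please check them against your Oozie version.
Note that this polls after the fact; it does not fire immediately before or
after each action.

```python
import json
from urllib.request import urlopen


def fetch_job_info(oozie_url, job_id):
    """Fetch workflow job info as a dict from the Oozie REST API.

    oozie_url is a placeholder such as "http://localhost:11000/oozie".
    """
    with urlopen(f"{oozie_url}/v1/job/{job_id}?show=info") as resp:
        return json.load(resp)


def audit_records(job_info):
    """Build one audit record per action from a job-info dict.

    Field names are assumed from the Web Services API docs; fields that
    are absent in the response come back as None.
    """
    records = []
    for action in job_info.get("actions", []):
        records.append({
            "name": action.get("name"),
            "started": action.get("startTime"),
            "completed": action.get("endTime"),
            "status": action.get("status"),
            "errorCode": action.get("errorCode"),
            "errorMessage": action.get("errorMessage"),
        })
    return records
```

Calling audit_records(fetch_job_info(url, job_id)) from outside the workflow
would also cover the case where the workflow never reaches a final logging
action.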




________________________________
 From: Alexey Yakubovich <[email protected]>
To: [email protected] 
Sent: Wednesday, September 18, 2013 10:53 AM
Subject: Logging from Oozie workflow
 

Hi, this post is mostly about audit logging from an Oozie workflow.

Suppose a silly client wants an audit log that records, immediately
before and after each action:
1. The name of the action and the time it was submitted for execution
2. The time the action completed
3. If it completed with an error, the error code and error message

I see some possible ways to do it, but I do not like any of them.

1. You can get all that data (and more) from the Oozie database (tables
SLA_EVENTS, WF_JOBS, WF_ACTIONS ...) if SLA is enabled.
   1.1. Cons: The Oozie database is not part of ... public API
   1.2. Cons: The only reasonable way to do it (as I see it) is in special
Java action(s) at the end of the workflow. So if the workflow for any reason
never reaches that action, no audit logging happens.

2. Override the class that writes data to the Oozie SLA tables and make it
also send some data to some other sink.
   2.1. Cons: That seems not so transparent (clean); that class may be
responsible for more than it seems, and it is called often. So writing from
it to, e.g., another (audit) database seems a bit rude to Oozie.

3. Override the launchers for all (most) actions, as suggested in
http://svn.apache.org/repos/asf/oozie/trunk/examples/src/main/apps/custom-main/workflow.xml
    3.1. Cons: That is also not so obvious (clean), and it makes you
implementation-dependent: what if, in the next version, e.g.
org.apache.oozie.action.hadoop.PigMain changes its representation of the
action name (actionXML), error code, error message, etc.?
    3.2. Cons: If the code added to the overridden launcher is not "simple
and light", it can hinder Oozie.

4. Add a logging node after every Oozie action node.
          But that looks ... simply clumsy.

So, if I am not completely lost, it probably makes sense to add some "log
hooks" to actions, and to extend the public API to express them.

If that is really what should be done to support audit logging for
workflows, I am willing to try to implement it once the design is agreed.

Thank you
Alexey
