Re: MiniOozie for local dryrun or other options for doing dryrun of oozie workflows?

2016-12-07 Thread Robert Kanter
If you get MiniOozie to work, it should start Mini HDFS and Mini MapReduce
clusters (YARN + MR or just MR, depending on whether it's Hadoop 2 or Hadoop 1).
There are methods in the parent class to get the URIs for these.  That should
let you submit workflows containing most action types.  For the ones that
require an external service (e.g. Hive2), you'll need to set up either a mini
or a real version of that service.
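
For illustration, here's a rough sketch of what such a test could look like. The class and helper names (MiniOozieTestCase, getFsTestCaseDir, getNameNodeUri, getJobTrackerUri and their package) are from memory, so treat them as assumptions and double-check them against the test classes in your Oozie version:

    import org.apache.hadoop.fs.Path
    import org.apache.oozie.client.OozieClient
    import org.apache.oozie.test.MiniOozieTestCase

    // Sketch only: the base class is expected to start the mini HDFS/MR
    // clusters before each test and tear them down afterwards.
    class MyWorkflowTest extends MiniOozieTestCase {

      def testRunWorkflow(): Unit = {
        val fs = getFileSystem                          // HDFS of the mini cluster
        val appPath = new Path(getFsTestCaseDir, "app")
        fs.mkdirs(appPath)
        // ... write workflow.xml under appPath, as in your snippet below ...

        val wc = getClient                              // local Oozie client
        val conf = wc.createConfiguration()
        conf.setProperty(OozieClient.APP_PATH, new Path(appPath, "workflow.xml").toString)
        conf.setProperty(OozieClient.USER_NAME, getTestUser)
        // mini cluster URIs, for use as ${nameNode}/${jobTracker} inside workflow.xml
        conf.setProperty("nameNode", getNameNodeUri)
        conf.setProperty("jobTracker", getJobTrackerUri)

        val jobId = wc.run(conf)                        // actually runs the workflow
        // poll wc.getJobInfo(jobId).getStatus until it reaches a terminal state
      }
    }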

- Robert



On Sat, Dec 3, 2016 at 3:08 AM, Serega Sheypak 
wrote:

> Hi, thanks for the reply. I did hit many issues with MiniOozie from CDH
> 5.5.1 (Oozie 4.1 + some Cloudera patches on top of it):
> 1. It doesn't work without the Oozie git repo. I've fixed that.
> 2. I wasn't able to run LocalOozie, so I'm using DagEngine directly.
>
> dryrun works fine. But what if I want to execute my workflow while mocking
> the actual actions?
>
> 2016-12-02 21:46 GMT+01:00 Robert Kanter :
>
> > Hi Serega,
> >
> > It turns out that LocalOozieClient is missing some methods that it should
> > be implementing to call DagEngine (see OOZIE-2751).  One of those is the
> > dryrun method, which is why that's not working for you.  For now, calling
> > DagEngine directly should work fine (that's essentially what
> > LocalOozieClient does).
> >
> > In your last line, dryRunSubmit doesn't return a jobId because it doesn't
> > actually start a job.  It does a dryrun,
> > which essentially just checks that everything resolves correctly in the
> > workflow.xml without actually running any of the actions.  If successful,
> > it returns the String "OK".  If there's a problem, it throws an exception
> > that should contain the details of the problem.
> >
> >
> > - Robert
> >
> > On Fri, Dec 2, 2016 at 1:22 AM, Serega Sheypak  >
> > wrote:
> >
> > > Hi, did anyone make it work properly in their project?
> > > I need to do a dry run of my workflows.
> > > The use case is:
> > > A user writes a workflow and wants to:
> > > 1. Check that it is valid
> > > 2. Do a dryrun and see how it flows without executing the steps.
> > >
> > > Let's say I have a workflow with three steps:
> > >
> > > 1. distcp data from $A to $B
> > > 2. run a Spark action with $B as input
> > > 3. distcp $B to $C
> > >
> > > I want to do a dryrun and check how my variables were interpolated in the
> > > workflow.
> > > The killer feature is: I want to imitate a Spark action failure and check
> > > what my kill node looks like.
> > >
> > > I gave up on making MiniOozie work. But I was able to start DagEngine.
> > >
> > >     val fs: FileSystem = FileSystem.getLocal(new Configuration())
> > >     val appPath: Path = new Path("build", "app")
> > >     fs.mkdirs(appPath)
> > >     fs.mkdirs(new Path(appPath, "lib"))
> > >
> > >     val writer = new OutputStreamWriter(fs.create(new Path(appPath, "workflow.xml")))
> > >     writer.write(pipelineXml)
> > >     writer.close()
> > >
> > >     val wc = getClient
> > >
> > >     val workflowConfiguration = wc.createConfiguration
> > >     workflowConfiguration.setProperty(OozieClient.APP_PATH, new Path(appPath, "workflow.xml").toString)
> > >     workflowConfiguration.setProperty(OozieClient.USER_NAME, getTestUser)
> > >     usrDefinedProperties.foreach { case (k, v) => workflowConfiguration.put(k.toString, v.toString) }
> > >
> > >     val dagEngine: DagEngine = Services.get.get(classOf[DagEngineService]).getDagEngine(getTestUser)
> > >     val conf = new Configuration()
> > >     workflowConfiguration.asScala.toMap.foreach { case (k, v) => conf.set(k, v) }
> > >     val jobId = dagEngine.dryRunSubmit(conf)
> > >
> > > But I would like to check how the workflow flows when you pass in
> > > different parameters, without actually executing the steps.
> > >
> >
>


Re: MiniOozie for local dryrun or other options for doing dryrun of oozie workflows?

2016-12-05 Thread Serega Sheypak
Yeah, I see it. I found a way to test the workflow locally, but it's super
complex. I have to start local MR, local HDFS and local Oozie services. Then
I mock the XML actions on the fly with my test actions and run the workflow.
It's super complex and fragile, unfortunately...
I'll try to reach the dev group. I found the LiteWorkflow classes; they seem
to be the engine that executes the nodes, and it's super lightweight.
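
To make the "mock on the fly" part concrete, this is roughly the idea, sketched with scala.xml; the <fs> mkdir replacement, the /tmp/mock path and the namespace handling are simplifications:

    import scala.xml._
    import scala.xml.transform.{RewriteRule, RuleTransformer}

    // Swap every real action body (spark, distcp, ...) for a harmless <fs> action,
    // keeping the action name and its <ok>/<error> transitions so the control flow
    // (including the path to the kill node) stays the same.
    object MockActionBodies extends RewriteRule {
      override def transform(n: Node): Seq[Node] = n match {
        case e: Elem if e.label == "action" =>
          val transitions = e.child.filter(c => c.label == "ok" || c.label == "error")
          val mockBody = <fs><mkdir path={"/tmp/mock/" + (e \ "@name").text}/></fs>
          e.copy(child = mockBody ++ transitions)
        case other => other
      }
    }

    val mockedXml: String =
      new RuleTransformer(MockActionBodies).transform(XML.loadString(pipelineXml)).head.toString
    // write mockedXml instead of pipelineXml as workflow.xml, then run the workflow for real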

2016-12-05 14:11 GMT+01:00 Andras Piros :

> Hi Serega,
>
> as per the *Oozie documentation* on Dryrun of a Workflow Job, we can see
> that the -dryrun option does not create or run a job.
>
> So as for the killer feature request, I think it's not possible at the
> moment.
>
> Regards,
>
> Andras
>
> --
> Andras PIROS
> Software Engineer
> 
>
> On Thu, Dec 1, 2016 at 8:33 PM, Serega Sheypak 
> wrote:
>
> > Hi, did anyone make it work properly in their project?
> > I need to do a dry run of my workflows.
> > The use case is:
> > A user writes a workflow and wants to:
> > 1. Check that it is valid
> > 2. Do a dryrun and see how it flows without executing the steps.
> >
> > Let's say I have a workflow with three steps:
> >
> > 1. distcp data from $A to $B
> > 2. run a Spark action with $B as input
> > 3. distcp $B to $C
> >
> > I want to do a dryrun and check how my variables were interpolated in the
> > workflow.
> > The killer feature is: I want to imitate a Spark action failure and check
> > what my kill node looks like.
> >
>


Re: MiniOozie for local dryrun or other options for doing dryrun of oozie workflows?

2016-12-05 Thread Andras Piros
Hi Serega,

as per the *Oozie documentation* on Dryrun of a Workflow Job, we can see
that the -dryrun option does not create or run a job.

So as for the killer feature request, I think it's not possible at the
moment.
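
For reference, the client-side counterpart of the -dryrun flag should be OozieClient.dryrun(). A rough sketch (the URL and app path below are made up, and please double-check the exact method signature against your client version):

    import org.apache.oozie.client.OozieClient

    // Asks the server to validate and resolve the workflow without creating a job.
    val client = new OozieClient("http://localhost:11000/oozie")        // made-up URL
    val conf = client.createConfiguration()
    conf.setProperty(OozieClient.APP_PATH, "hdfs:///user/serega/app")   // made-up path
    conf.setProperty(OozieClient.USER_NAME, "serega")
    val result = client.dryrun(conf)  // "OK" on success, throws if the workflow doesn't resolve
    println(result)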

Regards,

Andras

--
Andras PIROS
Software Engineer


On Thu, Dec 1, 2016 at 8:33 PM, Serega Sheypak 
wrote:

> Hi, did anyone make it work properly in their project?
> I need to do a dry run of my workflows.
> The use case is:
> A user writes a workflow and wants to:
> 1. Check that it is valid
> 2. Do a dryrun and see how it flows without executing the steps.
>
> Let's say I have a workflow with three steps:
>
> 1. distcp data from $A to $B
> 2. run a Spark action with $B as input
> 3. distcp $B to $C
>
> I want to do a dryrun and check how my variables were interpolated in the workflow.
> The killer feature is: I want to imitate a Spark action failure and check what
> my kill node looks like.
>


Re: MiniOozie for local dryrun or other options for doing dryrun of oozie workflows?

2016-12-03 Thread Serega Sheypak
Hi, thanks for the reply. I did hit many issues with MiniOozie from CDH
5.5.1 (Oozie 4.1 + some Cloudera patches on top of it):
1. It doesn't work without the Oozie git repo. I've fixed that.
2. I wasn't able to run LocalOozie, so I'm using DagEngine directly.

dryrun works fine. But what if I want to execute my workflow while mocking
the actual actions?

2016-12-02 21:46 GMT+01:00 Robert Kanter :

> Hi Serega,
>
> It turns out that LocalOozieClient is missing some methods that it should
> be implementing to call DagEngine (see OOZIE-2751).  One of those is the
> dryrun method, which is why that's not working for you.  For now, calling
> DagEngine directly should work fine (that's essentially what
> LocalOozieClient does).
>
> In your last line, dryRunSubmit doesn't return a jobId because it doesn't
> actually start a job.  It does a dryrun,
> which essentially just checks that everything resolves correctly in the
> workflow.xml without actually running any of the actions.  If successful,
> it returns the String "OK".  If there's a problem, it throws an exception
> that should contain the details of the problem.
>
>
> - Robert
>
> On Fri, Dec 2, 2016 at 1:22 AM, Serega Sheypak 
> wrote:
>
> > Hi, did anyone make it work properly in their project?
> > I need to do a dry run of my workflows.
> > The use case is:
> > A user writes a workflow and wants to:
> > 1. Check that it is valid
> > 2. Do a dryrun and see how it flows without executing the steps.
> >
> > Let's say I have a workflow with three steps:
> >
> > 1. distcp data from $A to $B
> > 2. run a Spark action with $B as input
> > 3. distcp $B to $C
> >
> > I want to do a dryrun and check how my variables were interpolated in the
> > workflow.
> > The killer feature is: I want to imitate a Spark action failure and check
> > what my kill node looks like.
> >
> > I gave up on making MiniOozie work. But I was able to start DagEngine.
> >
> >     val fs: FileSystem = FileSystem.getLocal(new Configuration())
> >     val appPath: Path = new Path("build", "app")
> >     fs.mkdirs(appPath)
> >     fs.mkdirs(new Path(appPath, "lib"))
> >
> >     val writer = new OutputStreamWriter(fs.create(new Path(appPath, "workflow.xml")))
> >     writer.write(pipelineXml)
> >     writer.close()
> >
> >     val wc = getClient
> >
> >     val workflowConfiguration = wc.createConfiguration
> >     workflowConfiguration.setProperty(OozieClient.APP_PATH, new Path(appPath, "workflow.xml").toString)
> >     workflowConfiguration.setProperty(OozieClient.USER_NAME, getTestUser)
> >     usrDefinedProperties.foreach { case (k, v) => workflowConfiguration.put(k.toString, v.toString) }
> >
> >     val dagEngine: DagEngine = Services.get.get(classOf[DagEngineService]).getDagEngine(getTestUser)
> >     val conf = new Configuration()
> >     workflowConfiguration.asScala.toMap.foreach { case (k, v) => conf.set(k, v) }
> >     val jobId = dagEngine.dryRunSubmit(conf)
> >
> > But I would like to check how the workflow flows when you pass in
> > different parameters, without actually executing the steps.
> >
>


Re: MiniOozie for local dryrun or other options for doing dryrun of oozie workflows?

2016-12-02 Thread Robert Kanter
Hi Serega,

It turns out that LocalOozieClient is missing some methods that it should
be implementing to call DagEngine (see OOZIE-2751).  One of those is the
dryrun method, which is why that's not working for you.  For now, calling
DagEngine directly should work fine (that's essentially what
LocalOozieClient does).

In your last line, dryRunSubmit doesn't return a jobId because it doesn't
actually start a job.  It does a dryrun,
which essentially just checks that everything resolves correctly in the
workflow.xml without actually running any of the actions.  If successful,
it returns the String "OK".  If there's a problem, it throws an exception
that should contain the details of the problem.
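
In your snippet that would look roughly like this (just a sketch, wrapping the same dagEngine and conf you already build):

    import scala.util.{Failure, Success, Try}

    // dryRunSubmit never creates a job: it either returns "OK" or throws
    // an exception whose message describes what didn't resolve.
    Try(dagEngine.dryRunSubmit(conf)) match {
      case Success(ok) => println(s"dryrun passed: $ok")                      // "OK"
      case Failure(e)  => println(s"workflow did not resolve: ${e.getMessage}")
    }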


- Robert

On Fri, Dec 2, 2016 at 1:22 AM, Serega Sheypak 
wrote:

> Hi, did anyone make it work properly in their project?
> I need to do a dry run of my workflows.
> The use case is:
> A user writes a workflow and wants to:
> 1. Check that it is valid
> 2. Do a dryrun and see how it flows without executing the steps.
>
> Let's say I have a workflow with three steps:
>
> 1. distcp data from $A to $B
> 2. run a Spark action with $B as input
> 3. distcp $B to $C
>
> I want to do a dryrun and check how my variables were interpolated in the workflow.
> The killer feature is: I want to imitate a Spark action failure and check what
> my kill node looks like.
>
> I gave up on making MiniOozie work. But I was able to start DagEngine.
>
>     val fs: FileSystem = FileSystem.getLocal(new Configuration())
>     val appPath: Path = new Path("build", "app")
>     fs.mkdirs(appPath)
>     fs.mkdirs(new Path(appPath, "lib"))
>
>     val writer = new OutputStreamWriter(fs.create(new Path(appPath, "workflow.xml")))
>     writer.write(pipelineXml)
>     writer.close()
>
>     val wc = getClient
>
>     val workflowConfiguration = wc.createConfiguration
>     workflowConfiguration.setProperty(OozieClient.APP_PATH, new Path(appPath, "workflow.xml").toString)
>     workflowConfiguration.setProperty(OozieClient.USER_NAME, getTestUser)
>     usrDefinedProperties.foreach { case (k, v) => workflowConfiguration.put(k.toString, v.toString) }
>
>     val dagEngine: DagEngine = Services.get.get(classOf[DagEngineService]).getDagEngine(getTestUser)
>     val conf = new Configuration()
>     workflowConfiguration.asScala.toMap.foreach { case (k, v) => conf.set(k, v) }
>     val jobId = dagEngine.dryRunSubmit(conf)
>
> But I would like to check how the workflow flows when you pass in different
> parameters, without actually executing the steps.
>