Re: [openstack-dev] [sahara] Shell Action, Re: Running HBase Jobs (was: About Sahara Oozie plan)

2015-02-13 Thread michael mccune

On 02/12/2015 05:15 PM, Trevor McKay wrote:

> Hi folks,
>
> Here is another way to do this.  Lu had mentioned Oozie shell actions
> previously. Sahara doesn't support them, but I played with it from
> the Oozie command line to verify that it solves our HBase problem,
> too.
>
> We can potentially create a blueprint to build a simple Shell action
> around a user-supplied script and supporting files.  The script and
> files would all be stored in Sahara as job binaries (Swift or
> internal db) and referenced the same way. The exec target must be on
> the path at runtime, or included in the working dir.
>
> To do this, I simply put workflow.xml, doit.sh, and the test jar into
> a directory in HDFS.  Then I ran it with the Oozie CLI using the
> job.xml config file configured to point at the HDFS dir.  Nothing
> special here, just standard Oozie job execution.



very cool Trevor, i wonder if there is a greater pattern here we could 
identify. namely, the idea that a user could upload multiple job 
binaries and create some sort of nesting structure to the way they are 
interpreted. maybe there could be a standard substitution method for 
script-like files. in this respect a user could create a standard
wrapper script and allow different binary names to be substituted into 
the script. this may be too complicated but it occurred to me while 
reading your results.
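
The substitution idea above could be sketched roughly as follows, using Python's standard string.Template. This is only an illustration, not anything Sahara implements: the placeholder names (java_bin, jar, main_class) and the render_wrapper helper are hypothetical, and the wrapper text is modeled on the doit.sh from this thread.

```python
from string import Template

# Hypothetical wrapper script stored once; the placeholders are filled in
# per job. A literal "$" needed by the shell itself would be written "$$".
WRAPPER = Template(
    "#!/bin/bash\n"
    "$java_bin -cp $jar:`hbase classpath` $main_class\n"
)


def render_wrapper(jar, main_class, java_bin="java"):
    """Return the wrapper script with the job binary names substituted in."""
    return WRAPPER.substitute(jar=jar, main_class=main_class, java_bin=java_bin)


print(render_wrapper("HBaseTest.jar", "HBaseTest"))
```

substitute() raises KeyError when a placeholder is left unfilled, which would catch a job that forgot to name one of its binaries.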


regardless of the greater pattern, this is a good window into more ways 
for us to control the command arguments.


mike


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [sahara] Shell Action, Re: Running HBase Jobs (was: About Sahara Oozie plan)

2015-02-12 Thread Trevor McKay
Hmm, my attachments were removed :)

Well, the interesting parts were the doit.sh and workflow.xml:

$ more doit.sh 
#!/bin/bash
/usr/lib/jvm/java-7-oracle-cloudera/bin/java -cp HBaseTest.jar:`hbase classpath` HBaseTest

$ more workflow.xml




${jobTracker}
${nameNode}


  mapred.job.queue.name
  default


doit.sh
HBaseTest.jar
doit.sh





Script failed, error message[${wf:errorMessage(wf:lastErrorNode())}]





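The values that survive in the listing above (${jobTracker}, ${nameNode}, the queue name property, doit.sh, HBaseTest.jar, and the kill message) fit the usual shape of an Oozie shell action workflow. A minimal sketch that builds a workflow of that shape with Python's xml.etree.ElementTree is shown here. The element names follow the Oozie workflow and shell-action schemas; the workflow name, the node names (shell-node, fail, end), the schema versions, and the reading of the two entries after the exec line as file elements are all assumptions, not Trevor's actual file.

```python
import xml.etree.ElementTree as ET

WF_NS = "uri:oozie:workflow:0.4"         # assumed schema version
SHELL_NS = "uri:oozie:shell-action:0.2"  # assumed schema version
KILL_MSG = "Script failed, error message[${wf:errorMessage(wf:lastErrorNode())}]"


def build_workflow(exec_name, files):
    """Return workflow.xml text for a single shell action (names are made up)."""
    wf = ET.Element("workflow-app", {"xmlns": WF_NS, "name": "shell-wf"})
    ET.SubElement(wf, "start", {"to": "shell-node"})
    action = ET.SubElement(wf, "action", {"name": "shell-node"})
    shell = ET.SubElement(action, "shell", {"xmlns": SHELL_NS})
    ET.SubElement(shell, "job-tracker").text = "${jobTracker}"
    ET.SubElement(shell, "name-node").text = "${nameNode}"
    conf = ET.SubElement(shell, "configuration")
    prop = ET.SubElement(conf, "property")
    ET.SubElement(prop, "name").text = "mapred.job.queue.name"
    ET.SubElement(prop, "value").text = "default"
    # The exec target, then every file copied into the action's working dir.
    ET.SubElement(shell, "exec").text = exec_name
    for name in files:
        ET.SubElement(shell, "file").text = name
    ET.SubElement(action, "ok", {"to": "end"})
    ET.SubElement(action, "error", {"to": "fail"})
    kill = ET.SubElement(wf, "kill", {"name": "fail"})
    ET.SubElement(kill, "message").text = KILL_MSG
    ET.SubElement(wf, "end", {"name": "end"})
    return ET.tostring(wf, encoding="unicode")


print(build_workflow("doit.sh", ["HBaseTest.jar", "doit.sh"]))
```

Listing doit.sh both as the exec target and as a file is what puts the script in the working dir at runtime, which matches the requirement stated in the original message.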


[openstack-dev] [sahara] Shell Action, Re: Running HBase Jobs (was: About Sahara Oozie plan)

2015-02-12 Thread Trevor McKay
Hi folks,

Here is another way to do this.  Lu had mentioned Oozie shell actions
previously. Sahara doesn't support them, but I played with it from the
Oozie command line to verify that it solves our HBase problem, too.

We can potentially create a blueprint to build a simple Shell action
around a user-supplied script and supporting files.  The script and
files would all be stored in Sahara as job binaries (Swift or internal
db) and referenced the same way. The exec target must be on the path at
runtime, or included in the working dir.

To do this, I simply put workflow.xml, doit.sh, and the test jar into
a directory in HDFS.  Then I ran it with the Oozie CLI using the
job.xml config file configured to point at the HDFS dir.  Nothing
special here, just standard Oozie job execution.

I've attached everything here but the test jar.

$ oozie job -oozie http://localhost:11000/oozie -config job.xml -run
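
The job.xml passed with -config did not survive in the archive. As a rough sketch under stated assumptions, a file of that kind uses the Hadoop configuration XML format and at minimum points oozie.wf.application.path at the HDFS dir holding workflow.xml. The host names, ports, and the HDFS path below are hypothetical, and build_job_xml is just an illustrative helper.

```python
import xml.etree.ElementTree as ET


def build_job_xml(props):
    """Return Hadoop-style configuration XML for the given name/value pairs."""
    conf = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(conf, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(conf, encoding="unicode")


# All values here are placeholders for illustration only.
job_xml = build_job_xml({
    "nameNode": "hdfs://namenode:8020",
    "jobTracker": "namenode:8021",
    "oozie.wf.application.path": "${nameNode}/user/hypothetical/shell-test",
})
print(job_xml)
```

The nameNode and jobTracker properties are the ones the workflow refers to as ${nameNode} and ${jobTracker}.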

Best,

Trev

On Thu, 2015-02-12 at 08:39 -0500, Trevor McKay wrote:

> Hi Lu, folks,
> 
> I've been investigating how to run Java actions in Sahara EDP that
> depend on HBase libraries (see snippet from the original question by
> Lu below).
>
> In a nutshell, we need to use Oozie sharelibs for this. I am working
> on a spec now, thinking about the best way to support this in Sahara,
> but here is a semi-manual intermediate solution that will work if you
> would like to run such a job from Sahara.
> 
> 1) Create your own Oozie sharelib that contains the HBase jars.
> 
> This ultimately is just an HDFS dir holding the jars.  On any node in
> your cluster with HBase installed, run the attached script or
> something like it (I like Python better than bash :) ). It simply
> separates the classpath and uploads all the jars to the specified
> HDFS dir.
> 
> $ parsePath.py /user/myhbaselib
> 
> 2) Run your Java action from EDP, but use the oozie.libpath
> configuration value when you launch the job.  For example, on the job
> configure tab set oozie.libpath like this:
>
> Name            Value
>
> oozie.libpath   hdfs://namenode:8020/user/myhbaselib
> 
> (note, support for this was added in
> https://review.openstack.org/#/c/154214/)
> 
> That's it! In general, you can add any jars that you want to a
> sharelib and then set the oozie.libpath for the job to access them.
> 
> Here is a good blog entry about sharelibs and extra jars in Oozie
> jobs:
> 
> http://blog.cloudera.com/blog/2014/05/how-to-use-the-sharelib-in-apache-oozie-cdh-5/
> 
> Best,
> 
> Trevor
> 
> --- original question
> (1) EDP job in Java action
> 
> The background is that we want to write integration test cases for
> newly added services like HBase and ZooKeeper, just like the
> edp-examples do (sample code under sahara/etc/edp-examples/). So I
> thought I could write an example EDP job using a Java action to test
> the HBase service. I wrote HBaseTest.java, packaged it as a jar
> file, and ran the jar manually with the command "java -cp `hbase
> classpath` HBaseTest.jar HBaseTest". It works well in the VM
> (provisioned by Sahara with the CDH plugin):
> “/usr/lib/jvm/java-7-oracle-cloudera/bin/java -cp
> "HBaseTest.jar:`hbase classpath`" HBaseTest”
> So I want to run this job via Horizon on the Sahara job execution
> page, but found no place to pass the `hbase classpath` parameter (I
> have tried java_opts, configuration, and args; all failed). When I
> pass “-cp `hbase classpath`” to java_opts on the Horizon job
> execution page, Oozie raises the error below.
> 
> 
> 

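The parsePath.py script mentioned in step 1 of the quoted message was an attachment and is not preserved here. A minimal sketch of what such a script might do, based only on the description ("separates the classpath and uploads all the jars to the specified HDFS dir"), is given below. The function names are hypothetical, and the use of `hadoop fs -mkdir -p` and `hadoop fs -put -f` for the upload is an assumption about the cluster's Hadoop version.

```python
import glob
import os
import subprocess


def jars_from_classpath(classpath):
    """Split a Java classpath and return the jar files it names, no duplicates."""
    jars = []
    for entry in classpath.strip().split(os.pathsep):
        # A trailing "*" is Java's wildcard for every jar in a directory.
        if entry.endswith("*"):
            candidates = sorted(glob.glob(entry + ".jar"))
        else:
            candidates = [entry]
        for path in candidates:
            if path.endswith(".jar") and os.path.isfile(path) and path not in jars:
                jars.append(path)
    return jars


def upload(jars, hdfs_dir):
    """Copy the local jars into hdfs_dir (assumes the hadoop CLI is on PATH)."""
    if not jars:
        return
    subprocess.check_call(["hadoop", "fs", "-mkdir", "-p", hdfs_dir])
    subprocess.check_call(["hadoop", "fs", "-put", "-f"] + jars + [hdfs_dir])


def main(hdfs_dir):
    """Upload every jar on the HBase classpath, e.g. main("/user/myhbaselib")."""
    classpath = subprocess.check_output(
        ["hbase", "classpath"], universal_newlines=True)
    upload(jars_from_classpath(classpath), hdfs_dir)
```

Directories and config files on the classpath are skipped because only jars belong in a sharelib dir.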

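Step 2 of the quoted message sets oozie.libpath on the job configure tab. For anyone launching the job through the API instead, the same value would presumably travel in the job_configs section of the job execution request. The sketch below only builds that dictionary; the overall body shape and the edp.java.main_class key reflect my understanding of Sahara EDP and should be treated as assumptions, and the cluster id is a placeholder.

```python
import json


def java_job_configs(main_class, libpath, args=()):
    """Build a job_configs dict for a Java action that uses an extra sharelib."""
    return {
        "configs": {
            "edp.java.main_class": main_class,  # assumed Sahara config key
            "oozie.libpath": libpath,
        },
        "args": list(args),
    }


body = {
    "cluster_id": "<cluster-id>",  # placeholder, not a real id
    "job_configs": java_job_configs(
        "HBaseTest", "hdfs://namenode:8020/user/myhbaselib"),
}
print(json.dumps(body, indent=2))
```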


workflow.xml
Description: XML document


doit.sh
Description: application/shellscript


job.xml
Description: XML document