Re: [openstack-dev] [sahara] Shell Action, Re: Running HBase Jobs (was: About Sahara Oozie plan)

2015-02-13 Thread michael mccune

On 02/12/2015 05:15 PM, Trevor McKay wrote:

Hi folks,

Here is another way to do this.  Lu had mentioned Oozie shell actions
previously.  Sahara doesn't support them, but I played with it from the
Oozie command line to verify that it solves our HBase problem, too.

We can potentially create a blueprint to build a simple Shell action
around a user-supplied script and supporting files.  The script and files
would all be stored in Sahara as job binaries (Swift or internal db) and
referenced the same way.  The exec target must be on the path at runtime,
or included in the working dir.

To do this, I simply put workflow.xml, doit.sh, and the test jar into
a directory in HDFS.  Then I ran it with the Oozie CLI using the job.xml
config file configured to point at the HDFS dir.  Nothing special here,
just standard Oozie job execution.



very cool Trevor, i wonder if there is a greater pattern here we could 
identify. namely, the idea that a user could upload multiple job 
binaries and create some sort of nesting structure to the way they are 
interpreted. maybe there could be a standard substitution method for 
script-like files. in this respect a user could create a standard 
wrapper script and allow different binary names to be substituted into 
the script. this may be too complicated but it occurred to me while 
reading your results.
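
as a rough illustration of that substitution idea (a sketch only, nothing sahara does today), a wrapper script could be kept as a template and filled in with the names of the uploaded binaries. the placeholder names main_jar and main_class and the helper render_wrapper are made up for this example.

```python
from string import Template

# Hypothetical wrapper script, stored once as a job binary.
# $main_jar and $main_class are placeholder names invented for this sketch.
WRAPPER = Template(
    "#!/bin/bash\n"
    "java -cp ${main_jar}:`hbase classpath` ${main_class}\n"
)


def render_wrapper(main_jar: str, main_class: str) -> str:
    """Fill the wrapper template with the names of the uploaded binaries."""
    return WRAPPER.substitute(main_jar=main_jar, main_class=main_class)


script = render_wrapper("HBaseTest.jar", "HBaseTest")
print(script)
```

the same wrapper could then be reused for any jar that needs the hbase classpath, with only the two names changing per job.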


regardless of the greater pattern, this is a good window into more ways 
for us to control the command arguments.


mike


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [sahara] Shell Action, Re: Running HBase Jobs (was: About Sahara Oozie plan)

2015-02-12 Thread Trevor McKay
Hi folks,

Here is another way to do this.  Lu had mentioned Oozie shell actions
previously.  Sahara doesn't support them, but I played with it from the
Oozie command line to verify that it solves our HBase problem, too.

We can potentially create a blueprint to build a simple Shell action
around a user-supplied script and supporting files.  The script and files
would all be stored in Sahara as job binaries (Swift or internal db) and
referenced the same way.  The exec target must be on the path at runtime,
or included in the working dir.

To do this, I simply put workflow.xml, doit.sh, and the test jar into
a directory in HDFS.  Then I ran it with the Oozie CLI using the job.xml
config file configured to point at the HDFS dir.  Nothing special here,
just standard Oozie job execution.

I've attached everything here but the test jar.

$ oozie job -oozie http://localhost:11000/oozie -config job.xml -run
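
The job.xml attachment did not survive the list archive. As a rough sketch only, a config file of that kind could be generated as below. The host names, ports, and HDFS path are placeholders, and the property set (nameNode, jobTracker, oozie.wf.application.path) is an assumption based on the values workflow.xml expects, not a copy of the original file.

```python
import xml.etree.ElementTree as ET


def build_job_config(props: dict) -> str:
    """Serialize properties into Hadoop-style <configuration> XML."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values; adjust to the cluster at hand.
props = {
    "nameNode": "hdfs://namenode:8020",
    "jobTracker": "jobtracker:8021",
    # HDFS dir holding workflow.xml, doit.sh, and the test jar (assumed path).
    "oozie.wf.application.path": "hdfs://namenode:8020/user/hadoop/shell-wf",
}

print(build_job_config(props))
```

The printed XML would be saved as job.xml and passed to the command above with -config.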

Best,

Trev

On Thu, 2015-02-12 at 08:39 -0500, Trevor McKay wrote:

 Hi Lu, folks,
 
 I've been investigating how to run Java actions in Sahara EDP that
 depend on HBase libraries (see snippet from the original question by Lu
 below).
 
 In a nutshell, we need to use Oozie sharelibs for this. I am working
 on a spec now, thinking about the best way to support this in Sahara,
 but here is a semi-manual intermediate solution that will work if you
 would like to run such a job from Sahara.
 
 1) Create your own Oozie sharelib that contains the HBase jars.
 
 This ultimately is just an HDFS dir holding the jars.  On any node in
 your cluster with HBase installed, run the attached script or something
 like it (I like Python better than bash :) ).  It simply separates the
 classpath and uploads all the jars to the specified HDFS dir.
 
 $ parsePath.py /user/myhbaselib
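
The attached parsePath.py is not preserved in the archive. A minimal sketch of a script along those lines, assuming the `hbase` and `hadoop` commands are on the path, might look like this. The function names are invented for the example and this is not the original script.

```python
import glob
import os
import subprocess


def jars_from_classpath(classpath: str) -> list:
    """Split a Java classpath and return the jar files it refers to."""
    jars = []
    for entry in classpath.strip().split(os.pathsep):
        if entry.endswith("*"):
            # Java wildcard entry: every jar in that directory.
            candidates = sorted(glob.glob(entry + ".jar"))
        elif os.path.isdir(entry):
            # Plain directory of classes or config files, no jars to upload.
            continue
        else:
            candidates = [entry]
        for path in candidates:
            if path.endswith(".jar") and os.path.isfile(path) and path not in jars:
                jars.append(path)
    return jars


def upload(jars: list, hdfs_dir: str) -> None:
    """Copy the jars into the given HDFS directory."""
    if not jars:
        return
    subprocess.check_call(["hadoop", "fs", "-mkdir", "-p", hdfs_dir])
    subprocess.check_call(["hadoop", "fs", "-put", "-f"] + jars + [hdfs_dir])


def main(hdfs_dir: str) -> None:
    """Upload every jar on the HBase classpath to hdfs_dir."""
    classpath = subprocess.check_output(["hbase", "classpath"]).decode()
    upload(jars_from_classpath(classpath), hdfs_dir)
```

Calling `main("/user/myhbaselib")` on a node with HBase installed would correspond to the command above.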
 
 2) Run your Java action from EDP, but use the oozie.libpath
 configuration value when you launch the job.  For example, on the job
 configure tab set oozie.libpath like this:
 
 Name             Value
 oozie.libpath    hdfs://namenode:8020/user/myhbaselib
 
 (note, support for this was added in
 https://review.openstack.org/#/c/154214/)
 
 That's it! In general, you can add any jars that you want to a
 sharelib and then set the oozie.libpath for the job to access them.
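
For anyone launching the job through the API instead of the job configure tab, the same setting would presumably travel in the job configs. The sketch below only builds the dictionary. The layout ("configs" plus "args") and the edp.java.main_class key are stated here as assumptions about Sahara EDP, and the helper name java_job_configs is invented for the example.

```python
import json


def java_job_configs(main_class: str, libpath: str) -> dict:
    """Build a job_configs-style dict for a Java action (assumed layout)."""
    return {
        "configs": {
            "edp.java.main_class": main_class,
            # Extra HDFS dir whose jars Oozie adds to the job classpath.
            "oozie.libpath": libpath,
        },
        "args": [],
    }


configs = java_job_configs("HBaseTest", "hdfs://namenode:8020/user/myhbaselib")
print(json.dumps(configs, indent=2))
```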
 
 Here is a good blog entry about sharelibs and extra jars in Oozie
 jobs:
 
 http://blog.cloudera.com/blog/2014/05/how-to-use-the-sharelib-in-apache-oozie-cdh-5/
 
 Best,
 
 Trevor
 
 --- original question
 (1) EDP job in Java action
 
 The background is that we want to write integration test cases for
 newly added services like HBase and ZooKeeper, just the way the
 edp-examples do (sample code under sahara/etc/edp-examples/). So I
 thought I could write an example EDP job using a Java action to test
 the HBase service. I wrote HBaseTest.java, packaged it as a jar
 file, and ran this jar manually with the command java -cp `hbase
 classpath` HBaseTest.jar HBaseTest; it works well in the
 VM (provisioned by Sahara with the CDH plugin).
 “/usr/lib/jvm/java-7-oracle-cloudera/bin/java -cp
 HBaseTest.jar:`hbase classpath` HBaseTest”
 So I want to run this job via Horizon on the Sahara job execution page,
 but found no place to pass the `hbase classpath` parameter (I have tried
 java_opts, configuration, and args; all failed). When I pass “-cp
 `hbase classpath`” to java_opts on the Horizon job execution page, Oozie
 raises the error below:
 
 
 




workflow.xml
Description: XML document


doit.sh
Description: application/shellscript


job.xml
Description: XML document


Re: [openstack-dev] [sahara] Shell Action, Re: Running HBase Jobs (was: About Sahara Oozie plan)

2015-02-12 Thread Trevor McKay
Hmm, my attachments were removed :)

Well, the interesting parts were the doit.sh and workflow.xml:

$ more doit.sh
#!/bin/bash
/usr/lib/jvm/java-7-oracle-cloudera/bin/java -cp HBaseTest.jar:`hbase classpath` HBaseTest

$ more workflow.xml
<workflow-app xmlns='uri:oozie:workflow:0.3' name='shell-wf'>
    <start to='shell1' />
    <action name='shell1'>
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                  <name>mapred.job.queue.name</name>
                  <value>default</value>
                </property>
            </configuration>
            <exec>doit.sh</exec>
            <file>HBaseTest.jar</file>
            <file>doit.sh</file>
        </shell>
        <ok to="end" />
        <error to="fail" />
    </action>
    <kill name="fail">
        <message>Script failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name='end' />
</workflow-app>


