Just to clarify my point and promote Azkaban...

This is part of an Oozie config:

<workflow-app xmlns='uri:oozie:workflow:0.1' name='demo-wf'>
  ...
  <action name="hdfs_1">
    <fs>
      <mkdir path="${nameNode}/tmp/${wf:user()}/hdfsdir1" />
    </fs>
    <ok to="hadoop_streaming_1" />
    <error to="fail_1" />
  </action>
  ...
  <decision name="decision_1">
    <switch>
      <case to="map_reduce_2">
        ${hadoop:counters('mapred_1')[RECORDS][REDUCE_OUT] > 100000}
      </case>
      <default to="end" />
    </switch>
  </decision>
  ...
  <kill name="fail_1">
    <message>Demo workflow failed, error message: ${wf:errorMessage(lastErrorNode())}</message>
  </kill>
  ...
</workflow-app>

I have no idea what that does because I cannot parse obscure XML.

This is an Azkaban config that reads my data's schema on HDFS, builds a
read-only Voldemort store out of my data on Hadoop, then does a distributed
load right into a live Voldemort cluster - where it could be live on the
site!  The config file is called foo_data.job:

job.class=com.linkedin.batch.jobs.VoldemortBuildAndPushJob
hadoop.job.ugi=rjurney,foo
build.input.path=/user/rjurney/voldemort/foo_data
build.output.dir=/user/rjurney/voldemort/foo_data.store
push.store.name=foo_data
push.cluster=tcp://voldemort-server:6666

It is run with:

ant run -Djob.name=foo_data

The end.  If I want to schedule that job, I place the config in the jobs
directory, then go to the Azkaban web app to schedule it periodically.

That's the kind of simplicity I value in my scheduler/workflow system,
because it is as simple as Pig itself.  Although I'd personally prefer to
nuke the config completely.
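
To make the point about simplicity concrete: a job file like foo_data.job
above is just flat key=value lines, so a few lines of Python can read it.
This is only a rough sketch, not how Azkaban itself loads jobs.
parse_job_file is a made-up helper name, and treating '#' lines as comments
is my assumption based on the usual properties-file convention.

```python
# Illustrative only: a tiny reader for flat key=value job files.
# NOT Azkaban's own loader; parse_job_file is a hypothetical helper.

def parse_job_file(text):
    """Return a dict of the key=value pairs found in a .job file's text."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and '#' comments (assumed comment syntax).
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values may contain '=' themselves.
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError("expected key=value, got: " + line)
        props[key.strip()] = value.strip()
    return props


# The contents of foo_data.job, copied from the message above.
FOO_DATA_JOB = """\
job.class=com.linkedin.batch.jobs.VoldemortBuildAndPushJob
hadoop.job.ugi=rjurney,foo
build.input.path=/user/rjurney/voldemort/foo_data
build.output.dir=/user/rjurney/voldemort/foo_data.store
push.store.name=foo_data
push.cluster=tcp://voldemort-server:6666
"""

props = parse_job_file(FOO_DATA_JOB)
for key in sorted(props):
    print(key, "->", props[key])
```

Six keys, no nesting, no namespaces.  The equivalent for the Oozie workflow
would need an XML parser plus an expression-language evaluator.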

http://sna-projects.com/azkaban/quickstart.php

The Azkaban guys have NAILED this problem.  It's great :)

Russ

On Thu, Jun 24, 2010 at 2:06 PM, Mridul Muralidharan
<mrid...@yahoo-inc.com> wrote:

>
> As an aside, if you are using Azkaban for purpose of cron, etc - you might
> want to take a look at oozie : I think it has been released - and iirc going
> to be opensourced too.
>
>
> Regards,
> Mridul
>
>
> On Friday 25 June 2010 12:21 AM, Russell Jurney wrote:
>
>> Wrote a... thing about Pig at LinkedIn that might be useful to some:
>>
>> http://sna-projects.com/blog/2010/06/when-pigs-fly-apache-pig-open-source-and-understanding-systems/
>>
>> Russ
>>
>
>
