I have a workflow that periodically executes a Pig script, concatenates the
output into a single tab-separated value file, compresses the file using
Gzip and then FTPs it to a remote server.  I could do this pretty easily
using a standard shell script, but was looking into whether or not this
would be a good candidate for an Oozie workflow.  I am new to Oozie and
spent the last few days learning how to write and debug Oozie workflows.  I
have run into a few issues and was wondering if anyone had some advice.

Is there an easy way to concatenate output using Oozie?  The Hadoop file
system shell supports the getmerge command, but it does not appear to be
available as an Oozie action.  Would it make sense to execute this command
using a shell or SSH action?  Likewise, I would like to compress and FTP
the output using shell or SSH actions.
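
For reference, here is a rough sketch of the three post-Pig steps as a
standalone script, which is roughly what a shell or SSH action would end up
invoking.  It assumes the hadoop client is on the PATH of whichever host
runs it; the function names, paths, host and credentials are all
placeholders I made up for illustration, not anything Oozie provides.

```python
import gzip
import shutil
import subprocess
from ftplib import FTP


def merge_hdfs_output(hdfs_dir, local_path):
    """Concatenate the part files under hdfs_dir into one local file."""
    # Assumes the `hadoop` client is installed and configured on this host.
    subprocess.run(
        ["hadoop", "fs", "-getmerge", hdfs_dir, local_path],
        check=True,  # raise if the hadoop command exits non-zero
    )


def gzip_file(src_path):
    """Write a gzip-compressed copy of src_path and return its path."""
    dst_path = src_path + ".gz"
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return dst_path


def ftp_upload(host, user, password, local_path, remote_name):
    """Upload local_path to the FTP server as remote_name."""
    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        with open(local_path, "rb") as fh:
            ftp.storbinary("STOR " + remote_name, fh)


def run_export(hdfs_dir, local_path, host, user, password):
    """Run the merge, compress and upload steps in order."""
    # All arguments are hypothetical placeholders supplied by the caller.
    merge_hdfs_output(hdfs_dir, local_path)
    gz_path = gzip_file(local_path)
    ftp_upload(host, user, password, gz_path, gz_path.rsplit("/", 1)[-1])
```

Each step raises an exception on failure, so a wrapper script would exit
non-zero, which is the kind of signal I would expect a workflow action to
use to mark the step as failed.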

I guess I have two basic questions.  First, is there an easy way to do all
of this in Oozie?  Second, is this a good use case for Oozie?  Reading
through the Oozie use cases and working through the examples, this doesn't
seem to be one of the primary use cases for Oozie.  Would it be better to
run this as a standard cron job using a shell script?

I appreciate any experience or feedback you might have.

Thanks,
Shawn
