Hey all,

We're big users of Flume, but now we're looking to integrate a workflow engine 
to manage dependencies between data imports, scheduled reports, and 
intermittent data generation for a Hadoop-based data warehouse & analytics 
system.

I thought I'd reach out to get the community's opinions:

- Do you use Yahoo (Apache) Oozie?
* What do you think of it? (pros/cons) 
* Would you recommend it?

- Do you use something else? 
* What do you think of it? (pros/cons) 
* Would you recommend it?

Any suggestions/comments greatly appreciated. 

I'm reaching out to the Flume list because I'd be especially interested to
hear about any bespoke Flume integrations the community has built (e.g.
checking that data from all machines is available before starting a job).

-- 
Matthew Rathbone
Foursquare | Software Engineer | Server Engineering Team
[email protected] | @rathboma (http://twitter.com/rathboma) |
4sq (http://foursquare.com/rathboma)

