Have a look at 'hamake' on google source.

On Thu, Feb 17, 2011 at 2:27 AM, Guy Doulberg <[email protected]> wrote:
> Thank you all for your suggestions,
>
> The suggestions you gave me are more on the "how" should I develop my app
> side, and not "what" can I use instead of building an app of my own.
>
> Going over Cascalog, Cascading and Pig, I didn't find exactly what I need.
>
> I need a batch that periodically runs and samples folders for data. If it
> finds data there, it takes the data and transforms it according to preset
> transformations. I want to be able to change the transformations easily. The
> transformations that the data should go through are determined by the
> directory it came from or a pattern in the data itself.
>
> This app sounds very similar to Flume, besides the fact that it digests the
> entire data that has arrived in one map/reduce.
>
>
>
> -----Original Message-----
> From: Chris K Wensel [mailto:[email protected]]
> Sent: Wednesday, February 16, 2011 10:18 PM
> To: [email protected]
> Subject: Re: DataCreator
>
>> was thinking of using cascading, but cascading, requires me for each change
>> in the data flow, to recompile and deploy. Maybe cascading can be part of
>> the implementation but not the solution.
>
> Cascading is well suited for this.
>
> Multitool was written with Cascading, you can spawn reasonably complex
> filtering, conversion, and joins from the command line (no recompiling).
> Amazon promotes this for searching S3 buckets from EMR.
>
> Cascading.JRuby allows you to create complex jobs from a JRuby script, no
> compiling. Etsy uses this for their web site funnel analysis.
>
> Cascalog is much more sophisticated, and can be driven from a Clojure shell
> (repl), obviously no compiling there either. Quite a few companies use this
> to power their analytics and analysis.
>
> all of which can be found here
> http://www.cascading.org/modules.html
>
> And a number of companies have built proprietary web UIs to Hadoop with
> Cascading as the query planner and processing engine. Some of which will ship
> as products this year.
>
> fyi, there will be a Cascalog workshop this Saturday (I'll be attending)
> http://www.cascading.org/2011/02/cascalog-workshop-february-19t.html
>
> cheers,
> chris
>
> --
> Chris K Wensel
> [email protected]
> http://www.concurrentinc.com
>
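
For what it's worth, the batch described in the quoted message (poll folders, pick the transformations by source directory or by a pattern in the data, change them without recompiling) could be prototyped roughly as below. This is only a minimal single-machine Python sketch under my own assumptions, not a map/reduce job and not how hamake, Cascading or Flume do it. The rule format and every name in it (TRANSFORMS, RULES, pick_steps, transform_line, run_once) are made up for illustration.

```python
import fnmatch
import re
from pathlib import Path

# Hypothetical transformations; the names are illustrative only.
TRANSFORMS = {
    "strip": str.strip,
    "upper": str.upper,
    "mask_digits": lambda line: re.sub(r"\d", "#", line),
}

# Rules are plain data, so they could be loaded from a config file and
# edited without recompiling or redeploying. The first matching rule wins.
RULES = [
    # Chosen by the directory the data came from.
    {"dir_glob": "*/logs", "line_regex": None, "steps": ["strip", "mask_digits"]},
    # Chosen by a pattern in the data itself.
    {"dir_glob": "*", "line_regex": r"^ERROR", "steps": ["strip", "upper"]},
]


def pick_steps(directory, line, rules=RULES):
    """Return the transformation names of the first rule matching this record."""
    for rule in rules:
        if not fnmatch.fnmatchcase(directory, rule["dir_glob"]):
            continue
        if rule["line_regex"] and not re.search(rule["line_regex"], line):
            continue
        return rule["steps"]
    return []  # no rule matched: leave the record untouched


def transform_line(directory, line, rules=RULES):
    """Apply the selected transformations to one line, in order."""
    for name in pick_steps(directory, line, rules):
        line = TRANSFORMS[name](line)
    return line


def run_once(root, rules=RULES):
    """One polling pass: transform every line of every file found under root."""
    results = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            directory = path.parent.as_posix()
            with path.open() as handle:
                results[path.as_posix()] = [
                    transform_line(directory, line.rstrip("\n"), rules)
                    for line in handle
                ]
    return results
```

Calling run_once from cron or a similar scheduler would cover the "periodically runs" part, and keeping RULES in a JSON or YAML file is one way to change the transformations without touching code.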
--
Lance Norskog
[email protected]
