Have a look at 'hamake' on Google Code.

On Thu, Feb 17, 2011 at 2:27 AM, Guy Doulberg <[email protected]> wrote:
> Thank you all for your suggestions.
>
> The suggestions you gave me are more about "how" I should develop my app, 
> and not "what" I can use instead of building an app of my own.
>
> Going over Cascalog, Cascading and Pig, I didn't find exactly what I need.
>
> I need a batch job that runs periodically and samples folders for data. If 
> it finds data there, it takes the data and transforms it according to preset 
> transformations. I want to be able to change the transformations easily. The 
> transformations that the data should go through are determined by the 
> directory it came from or by a pattern in the data itself.
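
To make that requirement concrete, here is a rough sketch of the dispatch
part only, in plain Python rather than in any of the tools mentioned in this
thread. Everything in it is an assumption for illustration: the names
TRANSFORMS, RULES, pick_transforms, process and poll_once are hypothetical,
the transformations are trivial string functions standing in for real jobs,
and the rules are matched against the file path only (not against a pattern
in the data itself).

```python
import fnmatch
from pathlib import Path

# Hypothetical transformations; real ones would likely launch Hadoop jobs.
TRANSFORMS = {
    "strip": str.strip,
    "upper": str.upper,
}

# Hypothetical rules: (glob pattern on the file path, transform names in order).
# Keeping these as data is what lets the flow change without recompiling.
RULES = [
    ("*/logs/*", ["strip", "upper"]),
    ("*/raw/*", ["strip"]),
]


def pick_transforms(path, rules=RULES):
    """Return the transform names of the first rule whose pattern matches."""
    for pattern, names in rules:
        if fnmatch.fnmatch(str(path), pattern):
            return names
    return []  # no rule matched: leave the data untouched


def process(path, rules=RULES):
    """Read one file and apply the transforms selected by its path."""
    text = Path(path).read_text()
    for name in pick_transforms(path, rules):
        text = TRANSFORMS[name](text)
    return text


def poll_once(root, rules=RULES):
    """One polling pass: transform every file currently under root."""
    files = sorted(p for p in Path(root).rglob("*") if p.is_file())
    return {str(p): process(p, rules) for p in files}
```

Running poll_once from cron (or any scheduler) would give the periodic part;
loading RULES from a config file instead of hard-coding it would be the
obvious next step.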
>
> This app sounds very similar to Flume, except that it digests all the 
> data that has arrived in one map/reduce.
>
>
>
> -----Original Message-----
> From: Chris K Wensel [mailto:[email protected]]
> Sent: Wednesday, February 16, 2011 10:18 PM
> To: [email protected]
> Subject: Re: DataCreator
>
>> was thinking of using Cascading, but Cascading requires me to recompile 
>> and deploy for each change in the data flow. Maybe Cascading can be part 
>> of the implementation but not the solution.
>
> Cascading is well suited for this.
>
> Multitool was written with Cascading, you can spawn reasonably complex 
> filtering, conversion, and joins from the command line (no recompiling). 
> Amazon promotes this for searching S3 buckets from EMR.
>
> Cascading.JRuby allows you to create complex jobs from a JRuby script, no 
> compiling. Etsy uses this for their web site funnel analysis.
>
> Cascalog is much more sophisticated, and can be driven from a Clojure shell 
> (repl), obviously no compiling there either. Quite a few companies use this 
> to power their analytics and analysis.
>
> all of which can be found here
> http://www.cascading.org/modules.html
>
> And a number of companies have built proprietary web UIs to Hadoop with 
> Cascading as the query planner and processing engine, some of which will 
> ship as products this year.
>
> fyi, there will be a Cascalog workshop this Saturday (I'll be attending)
> http://www.cascading.org/2011/02/cascalog-workshop-february-19t.html
>
> cheers,
> chris
>
> --
> Chris K Wensel
> [email protected]
> http://www.concurrentinc.com
>



-- 
Lance Norskog
[email protected]
