+1 (binding)
On Thu, Mar 21, 2013 at 7:32 PM, Niall Pemberton <niall.pember...@gmail.com> wrote:
> +1 to Falcon - or whatever name it finalises on!
>
> Niall
>
> On Thu, Mar 21, 2013 at 4:54 AM, Srikanth Sundarrajan
> <srikanth.sundarra...@inmobi.com> wrote:
> > Hi,
> >
> > Thanks for participating in the proposal discussion on Falcon
> > (formerly Ivory). I'd like to call a VOTE for acceptance of Apache
> > Falcon into the Incubator. I'll let the vote run till (Tue 3/26 6pm IST).
> >
> > [ ] +1 Accept Apache Falcon into the Incubator
> > [ ] +0 Don't care.
> > [ ] -1 Don't accept Apache Falcon into the Incubator because...
> >
> > Full proposal is pasted at the bottom of this email, and the
> > corresponding wiki is http://wiki.apache.org/incubator/FalconProposal.
> >
> >
> > Only VOTEs from Incubator PMC members are binding, but all are welcome
> > to express their thoughts.
> >
> > Thanks,
> > Srikanth Sundarrajan
> >
> > = Falcon Proposal =
> >
> > == Abstract ==
> > Falcon is a data processing and management solution for Hadoop
> > designed for data motion, coordination of data pipelines, lifecycle
> > management, and data discovery. Falcon enables end consumers to
> > quickly onboard their data and its associated processing and
> > management tasks on Hadoop clusters.
> >
> > == Proposal ==
> > Falcon will enable easy data management via declarative mechanism for
> > Hadoop. Users of Falcon platform simply define infrastructure
> > endpoints, data sets and processing rules declaratively. These
> > declarative configurations are expressed in such a way that the
> > dependencies between these configured entities are explicitly
> > described. This information about inter-dependencies between various
> > entities allows Falcon to orchestrate and manage various data
> > management functions.
> >
> > The key use cases that Falcon addresses are:
> > * Data Motion
> > * Process orchestration and scheduling
> > * Policy-based Lifecycle Management
> > * Data Discovery
> > * Operability/Usability
> >
> > With these features it is possible for users to onboard their data
> > sets with a comprehensive and holistic understanding of how, when and
> > where their data is managed across its lifecycle. Complex functions
> > such as retrying failures, identifying possible SLA breaches or
> > automated handling of input data changes are now simple directives.
> > All the administrative functions and user level functions are
> > available via RESTful APIs. CLI is simply a wrapper over the RESTful
> > APIs.
> >
> > == Background ==
> > Hadoop and its ecosystem of products have made storing and processing
> > massive amounts of data commonplace. This has enabled numerous
> > organizations to gain valuable insights that they never could have
> > achieved in the past. While it is easy to leverage Hadoop for
> > crunching large volumes of data, organizing data, managing life cycle
> > of data and processing data is fairly involved. This is solved
> > adequately well in a classic data platform involving data warehouses
> > and standard ETL (extract-transform-load) tools, but remains largely
> > unsolved today. In addition to data processing complexities, Hadoop
> > presents new sets of challenges and opportunities relating to
> > management of data.
> >
> > Data Management on Hadoop encompasses data motion, process
> > orchestration, lifecycle management, data discovery, etc. among other
> > concerns that are beyond ETL. Falcon is a new data processing and
> > management platform for Hadoop that solves this problem and creates
> > additional opportunities by building on existing components within the
> > Hadoop ecosystem (ex. Apache Oozie, Apache Hadoop DistCp etc.) without
> > reinventing the wheel.
> > Falcon has been in production at InMobi, going
> > on its second year and has been managing hundreds of feeds and
> > processes.
> >
> > Falcon is being developed by engineers employed with InMobi and
> > Hortonworks. This platform addition will increase the adoption of
> > Apache Hadoop by making data management tractable for end users. We
> > are therefore proposing to make Falcon an Apache open source project.
> >
> > == Rationale ==
> > The Falcon project aims to improve the usability of Apache Hadoop. As
> > a result Apache Hadoop will grow its community of users by increasing
> > the places Hadoop can be utilized and the use cases it will solve. By
> > developing Falcon in Apache we hope to gather a diverse community of
> > contributors, helping to ensure that Falcon is deployable for a broad
> > range of scenarios. Members of the Hadoop development community will
> > be able to influence Falcon’s roadmap, and contribute to it. We
> > believe having Falcon as part of the Apache Hadoop ecosystem will be a
> > great benefit to all of Hadoop's users.
> >
> > == Current Status ==
> > Falcon is widely deployed in production within InMobi and moving on to
> > its second year. A version with a valuable set of features is
> > developed by the list of initial committers and is hosted on github.
> >
> > === Meritocracy ===
> > Our intent with this incubator proposal is to start building a diverse
> > developer community around Falcon following the Apache meritocracy
> > model. We have wanted to make the project open source and encourage
> > contributors from multiple organizations from the start. We plan to
> > provide plenty of support to new developers and to quickly recruit
> > those who make solid contributions to committer status.
> >
> > === Community ===
> > We are happy to report that the initial team already represents
> > multiple organizations.
> > We hope to extend the user and developer base
> > further in the future and build a solid open source community around
> > Falcon.
> >
> > === Core Developers ===
> > Falcon is currently being developed by three engineers from InMobi –
> > Srikanth Sundarrajan, Shwetha G S, and Shaik Idris, and two Hortonworks
> > employees – Sanjay Radia and Venkatesh Seetharam. In addition, Rohini
> > Palaniswamy and Thiruvel Thirumoolan were also involved in the
> > initial design discussions. Srikanth, Shwetha and Shaik are the
> > original developers. All the engineers have built two generations of
> > Data Management on Hadoop, have deep expertise in Hadoop, and are
> > quite familiar with the Hadoop Ecosystem. Samarth Gupta & Rishu
> > Mehrothra, both from InMobi, have built the QA automation for Falcon.
> >
> > === Alignment ===
> > The ASF is a natural host for Falcon given that it is already the home
> > of Hadoop, Pig, Knox, HCatalog, and other emerging “big data” software
> > projects. Falcon has been designed to solve the data management
> > challenges and opportunities of the Hadoop ecosystem family of
> > products. Falcon fills the gap that the Hadoop ecosystem has had
> > in the areas of data processing and data lifecycle management.
> >
> > == Known Risks ==
> >
> > === Orphaned products & Reliance on Salaried Developers ===
> > The core developers plan to work full time on the project. There is
> > very little risk of Falcon getting orphaned. Falcon is in use by
> > the companies we work for, so they have an interest in its
> > continued vitality.
> >
> > === Inexperience with Open Source ===
> > All of the core developers are active users and followers of open
> > source. Srikanth Sundarrajan has been contributing patches to Apache
> > Hadoop and Apache Oozie, Shwetha GS has been contributing patches to
> > Apache Oozie. Seetharam Venkatesh is a committer on Apache Knox.
> > Sharad Agarwal, Amareshwari SR (also an Apache Hive PMC member) and
> > Sanjay Radia are PMC members on Apache Hadoop.
> >
> > === Homogeneous Developers ===
> > The current core developers are from a diverse set of organizations such
> > as InMobi and Hortonworks. We expect to quickly establish a developer
> > community that includes contributors from several corporations post
> > incubation.
> >
> > === Reliance on Salaried Developers ===
> > Currently, most developers are paid to do work on Falcon but few are
> > contributing in their spare time. However, once the project has a
> > community built around it post incubation, we expect to get committers
> > and developers from outside the current core developers.
> >
> > === Relationships with Other Apache Products ===
> > Falcon is going to be used by the users of Hadoop and the Hadoop
> > ecosystem in general.
> >
> > === An Excessive Fascination with the Apache Brand ===
> > While we respect the reputation of the Apache brand and have no doubts
> > that it will attract contributors and users, our interest is primarily
> > to give Falcon a solid home as an open source project following an
> > established development model. We have also given reasons in the
> > Rationale and Alignment sections.
> >
> > == Documentation ==
> > http://wiki.apache.org/incubator/FalconProposal
> >
> > == Initial Source ==
> > The source is currently in a GitHub repository at:
> > https://github.com/sriksun/Falcon
> >
> > == Source and Intellectual Property Submission Plan ==
> > The complete Falcon code is under Apache Software License 2.
> >
> > == External Dependencies ==
> > The dependencies all have Apache compatible licenses. These include
> > BSD- and MIT-licensed dependencies.
> >
> > == Cryptography ==
> > None
> >
> > == Required Resources ==
> >
> > === Mailing lists ===
> >
> > * falcon-dev AT incubator DOT apache DOT org
> > * falcon-commits AT incubator DOT apache DOT org
> > * falcon-user AT incubator DOT apache DOT org
> > * falcon-private AT incubator DOT apache DOT org
> >
> > === Subversion Directory ===
> > Git is the preferred source control system: git://git.apache.org/falcon
> >
> > === Issue Tracking ===
> > JIRA FALCON
> >
> > == Initial Committers ==
> > * Srikanth Sundarrajan (Srikanth.Sundarrajan AT inmobi DOT com)
> > * Shwetha GS (shwetha.gs AT inmobi DOT com)
> > * Shaik Idris (shaik.idris AT inmobi DOT com)
> > * Venkatesh Seetharam (Venkatesh AT apache DOT org)
> > * Sanjay Radia (sanjay AT apache DOT org)
> > * Sharad Agarwal (sharad AT apache DOT org)
> > * Amareshwari SR (amareshwari AT apache DOT org)
> > * Samarth Gupta (samarth.gupta AT inmobi DOT com)
> > * Rishu Mehrothra (rishu.mehrothra AT inmobi DOT com)
> >
> > == Affiliations ==
> > * Srikanth Sundarrajan (InMobi)
> > * Shwetha GS (InMobi)
> > * Shaik Idris (InMobi)
> > * Venkatesh Seetharam (Hortonworks Inc.)
> > * Sanjay Radia (Hortonworks Inc.)
> > * Sharad Agarwal (InMobi)
> > * Amareshwari SR (InMobi)
> > * Samarth Gupta (InMobi)
> > * Rishu Mehrothra (InMobi)
> >
> > == Sponsors ==
> >
> > === Champion ===
> > * Arun C Murthy (acmurthy at apache dot org)
> >
> > === Nominated Mentors ===
> > * Alan Gates (gates AT apache DOT org)
> > * Chris Douglas (cdouglas AT apache DOT org)
> > * Devaraj Das (ddas AT apache DOT org)
> > * Owen O’Malley (omalley AT apache DOT org)
> >
> > === Sponsoring Entity ===
> > Incubator PMC