+1 (binding)

> On Nov 26, 2015, at 11:50 AM, Konstantin Boudnik <c...@apache.org> wrote:
>
> Come to think of it a bit more, yes I am not satisfied with the outcome of
> the CTR/RTC exchange in the project.
>
> Hence changing my vote to
> -1 [binding]
>
> On Thu, Nov 26, 2015 at 11:47AM, Konstantin Boudnik wrote:
>> -0 [binding]
>>
>> On Tue, Nov 24, 2015 at 01:03PM, Henry Robinson wrote:
>>> Hi -
>>>
>>> The [DISCUSS] thread has been quiet for a few days, so I think there's been
>>> sufficient opportunity for discussion around our proposal to bring Impala
>>> to the ASF Incubator.
>>>
>>> I'd like to call a VOTE on that proposal, which is on the wiki at
>>> https://wiki.apache.org/incubator/ImpalaProposal, and which I've pasted
>>> below.
>>>
>>> During the discussion period, the proposal has been amended to add Brock
>>> Noland as a new mentor, to add one missed committer from the list and to
>>> correct some issues with the dependency list.
>>>
>>> Please cast your votes as follows:
>>>
>>> [] +1, accept Impala into the Incubator
>>> [] +/-0, non-counted vote to express a disposition
>>> [] -1, do not accept Impala into the Incubator (please give your reason(s))
>>>
>>> As with the concurrent Kudu vote, I propose leaving the vote open for a
>>> full seven days (to close at Tuesday, December 1st at noon PST), due to the
>>> upcoming US holiday.
>>>
>>> Thanks,
>>> Henry
>>>
>>> --------
>>>
>>> = Abstract =
>>> Impala is a high-performance C++ and Java SQL query engine for data stored
>>> in Apache Hadoop-based clusters.
>>>
>>> = Proposal =
>>>
>>> We propose to contribute the Impala codebase and associated artifacts (e.g.
>>> documentation, web-site content etc.) to the Apache Software Foundation
>>> with the intent of forming a productive, meritocratic and open community
>>> around Impala’s continued development, according to the ‘Apache Way’.
>>>
>>> Cloudera owns several trademarks regarding Impala, and proposes to transfer
>>> ownership of those trademarks in full to the ASF.
>>>
>>> = Background =
>>> Engineers at Cloudera developed Impala and released it as an
>>> Apache-licensed open-source project in Fall 2012. Impala was written as a
>>> brand-new, modern C++ SQL engine targeted from the start for data stored in
>>> Apache Hadoop clusters.
>>>
>>> Impala’s most important benefit to users is high performance, making it
>>> extremely appropriate for common enterprise analytic and business
>>> intelligence workloads. This is achieved by a number of software
>>> techniques, including: native support for data stored in HDFS and related
>>> filesystems, just-in-time compilation and optimization of individual query
>>> plans, a high-performance C++ codebase and a massively parallel distributed
>>> architecture. In benchmarks, Impala is routinely amongst the very highest
>>> performing SQL query engines.
>>>
>>> = Rationale =
>>>
>>> Despite the exciting innovation in the so-called ‘big-data’ space, SQL
>>> remains by far the most common interface for interacting with data in both
>>> traditional warehouses and modern ‘big-data’ clusters. There is clearly a
>>> need, as evidenced by the eager adoption of Impala and other SQL engines in
>>> enterprise contexts, for a query engine that offers the familiar SQL
>>> interface, but that has been specifically designed to operate in massive,
>>> distributed clusters rather than in traditional, fixed-hardware,
>>> warehouse-specific deployments. Impala is one such query engine.
>>>
>>> We believe that the ASF is the right venue to foster an open-source
>>> community around Impala’s development. We expect that Impala will benefit
>>> from more productive collaboration with related Apache projects, and under
>>> the auspices of the ASF will attract talented contributors who will push
>>> Impala’s development forward at pace.
>>>
>>> We believe that the timing is right for Impala’s development to move
>>> wholesale to the ASF: Impala is well-established, has been Apache-licensed
>>> open-source for more than three years, and the core project is relatively
>>> stable. We are excited to see where an ASF-based community can take Impala
>>> from this strong starting point.
>>>
>>> = Initial Goals =
>>> Our initial goals are as follows:
>>>
>>> * Establish ASF-compatible engineering practices and workflows
>>> * Refactor and publish existing internal build scripts and test
>>> infrastructure, in order to make them usable by any community member.
>>> * Transfer source code, documentation and associated artifacts to the ASF.
>>> * Grow the user and developer communities
>>>
>>> = Current Status =
>>>
>>> Impala is developed as an Apache-licensed open-source project. The source
>>> code is available at http://github.com/cloudera/Impala, and developer
>>> documentation is at https://github.com/cloudera/Impala/wiki. The majority
>>> of commits to the project have come from Cloudera-employed developers, but
>>> we have accepted some contributions from individuals from other
>>> organizations.
>>>
>>> All code reviews are done via a public instance of the Gerrit review tool
>>> at http://gerrit.cloudera.org:8080/, and discussed on a public mailing
>>> list. All patches must be reviewed before they are accepted into the
>>> codebase, via a voting mechanism that is similar to that used on Apache
>>> projects such as Hadoop and HBase.
>>>
>>> Before a patch is committed, it must pass a suite of pre-commit tests.
>>> These tests are currently run on Cloudera’s internal infrastructure. One of
>>> our initial goals will be to work with the ASF Infrastructure team to find
>>> a way to run these tests in an acceptable way on publicly accessible
>>> machines.
>>>
>>> Issues are tracked in JIRA at https://issues.cloudera.org/projects/IMPALA,
>>> in a way that is extremely similar to existing practices at other ASF
>>> projects.
>>>
>>> = Meritocracy =
>>>
>>> We understand the central importance of meritocracy to the Apache Way. We
>>> will work to establish a welcoming, fair and meritocratic community, in
>>> part by expanding the set of committers on the project. Although Impala’s
>>> committer list will initially be dominated by members of the Impala
>>> engineering team at Cloudera, we look forward to growing a rich user and
>>> developer community.
>>>
>>> = Community =
>>> Impala has a strong user community (see
>>> https://groups.google.com/a/cloudera.org/forum/#!forum/impala-user), and a
>>> growing developer community (see
>>> https://groups.google.com/a/cloudera.org/forum/#!forum/impala-dev). We wish
>>> to attract more developers to the project, and we believe that the ASF’s
>>> open and meritocratic philosophy will help us with this. We note the
>>> success of other, similar projects already part of the ASF.
>>>
>>> = Core Developers =
>>> Most - but not all - of Impala’s core developers are not currently
>>> affiliated with the ASF, and will require new ICLAs.
>>>
>>> = Alignment =
>>> Impala is related to several other Apache projects:
>>>
>>> * Data that is read by Impala is very often stored in Apache Hadoop
>>> clusters powered by the HDFS filesystem.
>>> * Impala can also read data stored in Apache HBase
>>> * Metadata for databases, tables and so on is read by Impala from Apache
>>> Hive.
>>> * The preferred data format for HDFS-based tables is Apache Parquet, and
>>> Apache Avro is also a supported data format.
>>> * Impala is closely integrated with Kudu, which is also being proposed to
>>> the Incubator.
>>> * Impala uses Apache Thrift as its RPC and serialization framework of
>>> choice.
>>>
>>> = Known Risks =
>>>
>>> == Orphaned Products ==
>>> Impala is used by most of Cloudera’s customers, and Cloudera remains
>>> committed to developing and supporting the project. Cloudera has a strong
>>> track record in standing behind projects that were contributed to the ASF
>>> by its employees, including Apache Flume, Apache Sqoop, and others. Other
>>> companies both ship and support Impala, lending credence to the idea that
>>> Impala is not at risk of being suddenly orphaned.
>>>
>>> == Inexperience with Open Source ==
>>> Although all committers on the initial list have significant experience
>>> with at least one open-source project - namely Impala - fewer have much
>>> experience with ASF-based software projects as contributors and community
>>> members. However, with the guidance of our mentors, committers who do have
>>> ASF experience, and time to learn during Incubation, we are confident that
>>> the project can be run in accordance with Apache principles on an ongoing
>>> basis.
>>>
>>> == Homogeneous Developers ==
>>>
>>> The initial committers are employees of Cloudera.
>>>
>>> The project has received some contributions from developers outside of
>>> Cloudera, from individuals belonging to organizations such as Intel and
>>> Google, from hobbyists and from students using Impala to advance their
>>> understanding of distributed databases. The project attracted an active
>>> user community as well. We hope to continue to encourage contributions from
>>> these developers and community members and grow them into committers after
>>> they have had time to continue their contributions.
>>>
>>> == Reliance on Salaried Developers ==
>>>
>>> Many of Impala’s initial set of committers work full-time on Impala, and
>>> are paid to do so. However, as mentioned elsewhere, we anticipate growth in
>>> the developer community which we hope will include hobbyists and academics
>>> who have an interest in distributed data systems.
>>>
>>> == An Excessive Fascination with the Apache Brand ==
>>> Although we hope that Impala benefits from the Apache Brand, any reflected
>>> goodwill to Cloudera as the contributing entity is not the goal of
>>> establishing Impala as an Apache project. We will work with the Incubator
>>> PMC and the PRC to ensure that the Apache Brand is respected.
>>>
>>> = Documentation =
>>> Impala: A Modern, Open-Source SQL Engine for Hadoop (
>>> http://www.cidrdb.org/cidr2015/Papers/CIDR15_Paper28.pdf)
>>>
>>> Impala’s developer wiki (https://github.com/cloudera/Impala/wiki)
>>>
>>> Impala’s auto-generated API documentation (
>>> http://impala.io/doc/html/index.html)
>>>
>>> = Initial Source =
>>> Impala’s initial source contribution will come from
>>> http://github.com/cloudera/Impala/.
>>>
>>> = External Dependencies =
>>>
>>> Impala depends upon a number of third-party libraries, which we list below.
>>> We intend to compile a LICENSE.txt file in the very short term (see
>>> https://issues.cloudera.org/browse/IMPALA-2670).
>>>
>>> * Google gflags (BSD)
>>> * Google glog (BSD)
>>> * Apache Thrift (Apache Software License v2.0)
>>> * Apache Commons (Apache Software License v2.0)
>>> * Apache Hadoop (Apache Software License v2.0)
>>> * Apache HBase (Apache Software License v2.0)
>>> * Apache Hive (Apache Software License v2.0)
>>> * Boost (Boost Software License)
>>> * OpenLDAP (OpenLDAP Software License)
>>> * rapidjson (MIT)
>>> * Google RE2 (BSD-style)
>>> * lz4 (BSD)
>>> * snappy (BSD)
>>> * cyrus-sasl (CMU License)
>>> * Apache Avro (Apache Software License v2.0)
>>> * Cloudera squeasel (Apache Software License v2.0)
>>> * Apache htrace (Incubating) (Apache Software License v2.0)
>>> * Apache Sentry (Incubating) (Apache Software License v2.0)
>>> * Apache Shiro (Apache Software License v2.0)
>>> * Twitter Bootstrap (Apache Software License v2.0)
>>> * d3 (BSD)
>>> * LLVM (BSD-like)
>>>
>>> Build and test dependencies:
>>>
>>> * ant (Apache Software License v2.0)
>>> * Apache Maven (Apache Software License v2.0)
>>> * cmake (BSD)
>>> * clang (BSD)
>>> * Google gtest (Apache Software License v2.0)
>>>
>>> = Required Resources =
>>>
>>> We request that the following resources be created for the project to use:
>>>
>>> == Mailing lists ==
>>>
>>> * priv...@impala.incubator.apache.org (moderated subscriptions)
>>> * comm...@impala.incubator.apache.org
>>> * d...@impala.incubator.apache.org
>>> * iss...@impala.incubator.apache.org
>>> * u...@impala.incubator.apache.org
>>>
>>> == Git repository ==
>>> https://git.apache.org/impala.git
>>>
>>> == JIRA instance ==
>>> JIRA project IMPALA (IMPALA or IMP)
>>>
>>> == Other Resources ==
>>> We hope to continue using Gerrit for our code review and commit workflow.
>>> We are involved with discussions that the Kudu team at Cloudera have been
>>> having with Jake Farrell to start discussions on how Gerrit can fit into
>>> the ASF. We know that several other ASF projects or podlings are also
>>> interested in Gerrit.
>>>
>>> If the Infrastructure team does not have the bandwidth to support Gerrit,
>>> we will continue to support our own instance of Gerrit for Impala, and make
>>> the necessary integrations such that commits are properly authenticated and
>>> maintain sufficient provenance to uphold the ASF standards (e.g. via the
>>> solution adopted by the AsterixDB podling).
>>>
>>> = Initial Committers =
>>>
>>> * Tim Armstrong
>>> * Alex Behm
>>> * Taras Bobrovytsky
>>> * Casey Ching
>>> * Martin Grund
>>> * Daniel Hecht
>>> * Michael Ho
>>> * Matthew Jacobs
>>> * Ishaan Joshi
>>> * Lenni Kuff
>>> * Marcel Kornacker
>>> * Sailesh Mukil
>>> * Henry Robinson
>>> * John Russell
>>> * Dimitris Tsirogiannis
>>> * Skye Wanderman-Milne
>>> * Juan Yu
>>>
>>> == Affiliations ==
>>> All: Cloudera Inc.
>>>
>>> = Sponsors =
>>>
>>> == Champion ==
>>> Tom White
>>>
>>> == Nominated Mentors ==
>>> * Tom White (Cloudera)
>>> * Todd Lipcon (Cloudera)
>>> * Carl Steinbach (LinkedIn)
>>> * Brock Noland (StreamSets)
>>>
>>>
>>> = Sponsoring Entity =
>>> We ask that the Incubator PMC sponsor this proposal.