[RESULT] [VOTE] Accept Kudu into the Apache incubator

2015-12-01 Thread Todd Lipcon
The vote for accepting Kudu into the Incubator has now closed. The vote
count is as follows:

+1 (binding): 29
+1 (non-binding): 9
-1 (binding): 2

The full list of voters is reproduced below. Apologies if I mistakenly
counted a binding vote as non-binding (had to check the IPMC members list
for a number of people who didn't specify).

The -1 votes were related to code review policy, which was discussed
thoroughly in a separate thread.

As this is a majority vote, the vote passes.

Thanks for voting!
-Todd

+1 (binding)
--
Todd Lipcon
Reynold Xin
Jarek Jarcec Cecho
Jacques Nadeau
Andrew Purtell
Arvind Prabhakar
Alex Karasulu
Jean-Baptiste Onofre
Chris Mattmann
Jake Farrell
Edward Yoon
Julien Le Dem
Sean Busbey
Hyunsik Choi
John D Ament
Carl Steinbach
Brock Noland
Owen O'Malley
Tom White
Rob Vesse
Roman Shaposhnik
Chris Douglas
Doug Cutting
Hitesh Shah
Julian Hyde
Ted Dunning
Andrew Bayer
Michael Stack
Andrei Savu

+1 (non-binding)
--
Mike Percy
Ashish Paliwal
Patrick Angeles
Luke Han
Amol Kekre
Joe Witt
Tony Kurc
Henry Robinson
Sree V

-1 (binding)
--
Ralph Goers
Greg Stein


Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-30 Thread stack
+1 (binding)
St.Ack
On Nov 24, 2015 11:33 AM, "Todd Lipcon" wrote:

> Hi all,
>
> Discussion on the [DISCUSS] thread seems to have wound down, so I'd like to
> call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
> pasted below and also available on the wiki at:
> https://wiki.apache.org/incubator/KuduProposal
>
> The proposal is unchanged since the original version, except for the
> addition of Carl Steinbach as a Mentor.
>
> Please cast your votes:
>
> [] +1, accept Kudu into the Incubator
> [] +/-0, positive/negative non-counted expression of feelings
> [] -1, do not accept Kudu into the incubator (please state reasoning)
>
> Given the US holiday this week, I imagine many folks are traveling or
> otherwise offline. So, let's run the vote for a full week rather than the
> traditional 72 hours. Unless the IPMC objects to the extended voting
> period, the vote will close on Tues, Dec 1st at noon PST.
>
> Thanks
> -Todd
> -
>
> = Kudu Proposal =
>
> == Abstract ==
>
> Kudu is a distributed columnar storage engine built for the Apache Hadoop
> ecosystem.
>
> == Proposal ==
>
> Kudu is an open source storage engine for structured data which supports
> low-latency random access together with efficient analytical access
> patterns. Kudu distributes data using horizontal partitioning and
> replicates each partition using Raft consensus, providing low
> mean-time-to-recovery and low tail latencies. Kudu is designed within the
> context of the Apache Hadoop ecosystem and supports many integrations with
> other data analytics projects both inside and outside of the Apache
> Software Foundation.
>
>
>
> We propose to incubate Kudu as a project of the Apache Software Foundation.
>
> == Background ==
>
> In recent years, explosive growth in the amount of data being generated and
> captured by enterprises has resulted in the rapid adoption of open source
> technology which is able to store massive data sets at scale and at low
> cost. In particular, the Apache Hadoop ecosystem has become a focal point
> for such “big data” workloads, because many traditional open source
> database systems have lagged in offering a scalable alternative.
>
>
>
> Structured storage in the Hadoop ecosystem has typically been achieved in
> two ways: for static data sets, data is typically stored on Apache HDFS
> using binary data formats such as Apache Avro or Apache Parquet. However,
> neither HDFS nor these formats has any provision for updating individual
> records, or for efficient random access. Mutable data sets are typically
> stored in semi-structured stores such as Apache HBase or Apache Cassandra.
> These systems allow for low-latency record-level reads and writes, but lag
> far behind the static file formats in terms of sequential read throughput
> for applications such as SQL-based analytics or machine learning.
>
>
>
> Kudu is a new storage system designed and implemented from the ground up to
> fill this gap between high-throughput sequential-access storage systems
> such as HDFS and low-latency random-access systems such as HBase or
> Cassandra. While these existing systems continue to hold advantages in some
> situations, Kudu offers a “happy medium” alternative that can dramatically
> simplify the architecture of many common workloads. In particular, Kudu
> offers a simple API for row-level inserts, updates, and deletes, while
> providing table scans at throughputs similar to Parquet, a commonly-used
> columnar format for static data.
>
>
>
> More information on Kudu can be found at the existing open source project
> website: http://getkudu.io and in particular in the Kudu white-paper PDF:
> http://getkudu.io/kudu.pdf from which the above was excerpted.
>
> == Rationale ==
>
> As described above, Kudu fills an important gap in the open source storage
> ecosystem. After our initial open source project release in September 2015,
> we have seen a great amount of interest across a diverse set of users and
> companies. We believe that, as a storage system, it is critical to build an
> equally diverse set of contributors in the development community. Our
> experiences as committers and PMC members on other Apache projects have
> taught us the value of diverse communities in ensuring both longevity and
> high quality for such foundational systems.
>
> == Initial Goals ==
>
>  * Move the existing codebase, website, documentation, and mailing lists to
> Apache-hosted infrastructure
>  * Work with the infrastructure team to implement and approve our code
> review, build, and testing workflows in the context of the ASF
>  * Incremental development and releases per Apache guidelines
>
> == Current Status ==
>
>  Releases 
>
> Kudu has undergone one public release, tagged here
> https://github.com/cloudera/kudu/tree/kudu0.5.0-release
>
> This initial release was not performed in the typical ASF fashion -- no
> source tarball was released, but rather only convenience binaries made
> 

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-30 Thread Sree V
+1 (non-binding)

Thanking you.
With Regards
Sree

On Monday, November 30, 2015 9:33 AM, stack wrote:

+1 (binding)
St.Ack

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-29 Thread Henry Robinson
+1 (non-binding).

Thanks,
Henry

On 27 November 2015 at 07:14, Andrew Bayer wrote:

> +1 binding
>
> On Thursday, November 26, 2015, Ted Dunning wrote:
>
> > +1 (binding)
> >
> > I think that forcing experienced community developers into one model or the
> > other is unnecessary. Let them in as they would like.
> >
> > On Wed, Nov 25, 2015 at 4:51 PM, Greg Stein wrote:
> >
> > > -1 (binding)
> > >
> > > Starting with RTC is a poor way to attract new community members. I'd like
> > > to see this community use CTR instead of mandating gerrit reviews.
> > >
> > > (ref: other-threads about lack of trust, and control issues; poor basis for
> > > a community)

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-29 Thread Andrei Savu
+1 (binding)

-- Andrei Savu


Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-27 Thread Andrew Bayer
+1 binding

On Thursday, November 26, 2015, Ted Dunning wrote:

> +1 (binding)
>
> I think that forcing experienced community developers into one model or the
> other is unnecessary. Let them in as they would like.
>
> On Wed, Nov 25, 2015 at 4:51 PM, Greg Stein wrote:
>
> > -1 (binding)
> >
> > Starting with RTC is a poor way to attract new community members. I'd like
> > to see this community use CTR instead of mandating gerrit reviews.
> >
> > (ref: other-threads about lack of trust, and control issues; poor basis for
> > a community)

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-26 Thread Joe Witt
+1 (non-binding)

On Wed, Nov 25, 2015 at 5:26 PM, Hitesh Shah wrote:
> +1 (binding)
>
> — Hitesh
Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-26 Thread Tony Kurc
+1 (non-binding)
On Nov 26, 2015 3:04 PM, "Joe Witt" wrote:

> +1 (non-binding)
>
> On Wed, Nov 25, 2015 at 5:26 PM, Hitesh Shah wrote:
> > +1 (binding)
> >
> > — Hitesh

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-26 Thread Ted Dunning
+1 (binding)

I think that forcing experienced community developers into one model or the
other is unnecessary. Let them in as they would like.



On Wed, Nov 25, 2015 at 4:51 PM, Greg Stein  wrote:

> -1 (binding)
>
> Starting with RTC is a poor way to attract new community members. I'd like
> to see this community use CTR instead of mandating gerrit reviews.
>
> (ref: other-threads about lack of trust, and control issues; poor basis for
> a community)

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-25 Thread Doug Cutting
+1 (binding)

Doug

On Wed, Nov 25, 2015 at 8:45 AM, Chris Douglas  wrote:

> +1 (binding) -C

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-25 Thread Chris Douglas
+1 (binding) -C

On Tue, Nov 24, 2015 at 11:32 AM, Todd Lipcon  wrote:
> Hi all,
>
> Discussion on the [DISCUSS] thread seems to have wound down, so I'd like to
> call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
> pasted below and also available on the wiki at:
> https://wiki.apache.org/incubator/KuduProposal
>
> The proposal is unchanged since the original version, except for the
> addition of Carl Steinbach as a Mentor.
>
> Please cast your votes:
>
> [] +1, accept Kudu into the Incubator
> [] +/-0, positive/negative non-counted expression of feelings
> [] -1, do not accept Kudu into the incubator (please state reasoning)
>
> Given the US holiday this week, I imagine many folks are traveling or
> otherwise offline. So, let's run the vote for a full week rather than the
> traditional 72 hours. Unless the IPMC objects to the extended voting
> period, the vote will close on Tues, Dec 1st at noon PST.
>
> Thanks
> -Todd
> -
>
> = Kudu Proposal =
>
> == Abstract ==
>
> Kudu is a distributed columnar storage engine built for the Apache Hadoop
> ecosystem.
>
> == Proposal ==
>
> Kudu is an open source storage engine for structured data which supports
> low-latency random access together with efficient analytical access
> patterns. Kudu distributes data using horizontal partitioning and
> replicates each partition using Raft consensus, providing low
> mean-time-to-recovery and low tail latencies. Kudu is designed within the
> context of the Apache Hadoop ecosystem and supports many integrations with
> other data analytics projects both inside and outside of the Apache
> Software Foundation.
>
>
>
> We propose to incubate Kudu as a project of the Apache Software Foundation.
>
> == Background ==
>
> In recent years, explosive growth in the amount of data being generated and
> captured by enterprises has resulted in the rapid adoption of open source
> technology which is able to store massive data sets at scale and at low
> cost. In particular, the Apache Hadoop ecosystem has become a focal point
> for such “big data” workloads, because many traditional open source
> database systems have lagged in offering a scalable alternative.
>
>
>
> Structured storage in the Hadoop ecosystem has typically been achieved in
> two ways: for static data sets, data is typically stored on Apache HDFS
> using binary data formats such as Apache Avro or Apache Parquet. However,
> neither HDFS nor these formats has any provision for updating individual
> records, or for efficient random access. Mutable data sets are typically
> stored in semi-structured stores such as Apache HBase or Apache Cassandra.
> These systems allow for low-latency record-level reads and writes, but lag
> far behind the static file formats in terms of sequential read throughput
> for applications such as SQL-based analytics or machine learning.
>
>
>
> Kudu is a new storage system designed and implemented from the ground up to
> fill this gap between high-throughput sequential-access storage systems
> such as HDFS and low-latency random-access systems such as HBase or
> Cassandra. While these existing systems continue to hold advantages in some
> situations, Kudu offers a “happy medium” alternative that can dramatically
> simplify the architecture of many common workloads. In particular, Kudu
> offers a simple API for row-level inserts, updates, and deletes, while
> providing table scans at throughputs similar to Parquet, a commonly-used
> columnar format for static data.
>
>
>
> More information on Kudu can be found at the existing open source project
> website: http://getkudu.io and in particular in the Kudu white-paper PDF:
> http://getkudu.io/kudu.pdf from which the above was excerpted.
>
> == Rationale ==
>
> As described above, Kudu fills an important gap in the open source storage
> ecosystem. After our initial open source project release in September 2015,
> we have seen a great amount of interest across a diverse set of users and
> companies. We believe that, as a storage system, it is critical to build an
> equally diverse set of contributors in the development community. Our
> experiences as committers and PMC members on other Apache projects have
> taught us the value of diverse communities in ensuring both longevity and
> high quality for such foundational systems.
>
> == Initial Goals ==
>
>  * Move the existing codebase, website, documentation, and mailing lists to
> Apache-hosted infrastructure
>  * Work with the infrastructure team to implement and approve our code
> review, build, and testing workflows in the context of the ASF
>  * Incremental development and releases per Apache guidelines
>
> == Current Status ==
>
>  Releases 
>
> Kudu has undergone one public release, tagged here
> https://github.com/cloudera/kudu/tree/kudu0.5.0-release
>
> This initial release was not performed in the typical ASF fashion -- no
> source tarball was released, but rather only convenience binaries made
> 

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-25 Thread Hitesh Shah
+1 (binding)

— Hitesh


Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-25 Thread Rob Vesse
+1 (binding)

Rob


Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-25 Thread Tom White
+1 (binding)

Tom

On Tue, Nov 24, 2015 at 7:32 PM, Todd Lipcon  wrote:
> Hi all,
>
> Discussion on the [DISCUSS] thread seems to have wound down, so I'd like to
> call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
> pasted below and also available on the wiki at:
> https://wiki.apache.org/incubator/KuduProposal
>
> The proposal is unchanged since the original version, except for the
> addition of Carl Steinbach as a Mentor.
>
> Please cast your votes:
>
> [] +1, accept Kudu into the Incubator
> [] +/-0, positive/negative non-counted expression of feelings
> [] -1, do not accept Kudu into the incubator (please state reasoning)
>
> Given the US holiday this week, I imagine many folks are traveling or
> otherwise offline. So, let's run the vote for a full week rather than the
> traditional 72 hours. Unless the IPMC objects to the extended voting
> period, the vote will close on Tues, Dec 1st at noon PST.
>
> Thanks
> -Todd
> -
>
> = Kudu Proposal =
>
> == Abstract ==
>
> Kudu is a distributed columnar storage engine built for the Apache Hadoop
> ecosystem.
>
> == Proposal ==
>
> Kudu is an open source storage engine for structured data which supports
> low-latency random access together with efficient analytical access
> patterns. Kudu distributes data using horizontal partitioning and
> replicates each partition using Raft consensus, providing low
> mean-time-to-recovery and low tail latencies. Kudu is designed within the
> context of the Apache Hadoop ecosystem and supports many integrations with
> other data analytics projects both inside and outside of the Apache
> Software Foundation.
>
>
>
> We propose to incubate Kudu as a project of the Apache Software Foundation.
>
> == Background ==
>
> In recent years, explosive growth in the amount of data being generated and
> captured by enterprises has resulted in the rapid adoption of open source
> technology which is able to store massive data sets at scale and at low
> cost. In particular, the Apache Hadoop ecosystem has become a focal point
> for such “big data” workloads, because many traditional open source
> database systems have lagged in offering a scalable alternative.
>
>
>
> Structured storage in the Hadoop ecosystem has typically been achieved in
> two ways: for static data sets, data is typically stored on Apache HDFS
> using binary data formats such as Apache Avro or Apache Parquet. However,
> neither HDFS nor these formats has any provision for updating individual
> records, or for efficient random access. Mutable data sets are typically
> stored in semi-structured stores such as Apache HBase or Apache Cassandra.
> These systems allow for low-latency record-level reads and writes, but lag
> far behind the static file formats in terms of sequential read throughput
> for applications such as SQL-based analytics or machine learning.
>
>
>
> Kudu is a new storage system designed and implemented from the ground up to
> fill this gap between high-throughput sequential-access storage systems
> such as HDFS and low-latency random-access systems such as HBase or
> Cassandra. While these existing systems continue to hold advantages in some
> situations, Kudu offers a “happy medium” alternative that can dramatically
> simplify the architecture of many common workloads. In particular, Kudu
> offers a simple API for row-level inserts, updates, and deletes, while
> providing table scans at throughputs similar to Parquet, a commonly-used
> columnar format for static data.
>
>
>
> More information on Kudu can be found at the existing open source project
> website: http://getkudu.io and in particular in the Kudu white-paper PDF:
> http://getkudu.io/kudu.pdf from which the above was excerpted.
>
> == Rationale ==
>
> As described above, Kudu fills an important gap in the open source storage
> ecosystem. After our initial open source project release in September 2015,
> we have seen a great amount of interest across a diverse set of users and
> companies. We believe that, as a storage system, it is critical to build an
> equally diverse set of contributors in the development community. Our
> experiences as committers and PMC members on other Apache projects have
> taught us the value of diverse communities in ensuring both longevity and
> high quality for such foundational systems.
>
> == Initial Goals ==
>
>  * Move the existing codebase, website, documentation, and mailing lists to
> Apache-hosted infrastructure
>  * Work with the infrastructure team to implement and approve our code
> review, build, and testing workflows in the context of the ASF
>  * Incremental development and releases per Apache guidelines
>
> == Current Status ==
>
> === Releases ===
>
> Kudu has undergone one public release, tagged here
> https://github.com/cloudera/kudu/tree/kudu0.5.0-release
>
> This initial release was not performed in the typical ASF fashion -- no
> source tarball was released, but rather only convenience binaries made
> 

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-25 Thread Amol Kekre
+1 (non-binding)

Amol


On Wed, Nov 25, 2015 at 3:19 AM, Roman Shaposhnik  wrote:

> On Tue, Nov 24, 2015 at 11:32 AM, Todd Lipcon  wrote:
> > Hi all,
> >
> > Discussion on the [DISCUSS] thread seems to have wound down, so I'd like
> to
> > call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
> > pasted below and also available on the wiki at:
> > https://wiki.apache.org/incubator/KuduProposal
> >
> > The proposal is unchanged since the original version, except for the
> > addition of Carl Steinbach as a Mentor.
> >
> > Please cast your votes:
> >
> > [] +1, accept Kudu into the Incubator
> > [] +/-0, positive/negative non-counted expression of feelings
> > [] -1, do not accept Kudu into the incubator (please state reasoning)
>
> +1 (binding)
>
> Best of luck, guys!
>
> Thanks,
> Roman.
>
> -
> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> For additional commands, e-mail: general-h...@incubator.apache.org
>
>


Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-25 Thread Roman Shaposhnik
On Tue, Nov 24, 2015 at 11:32 AM, Todd Lipcon  wrote:
> Hi all,
>
> Discussion on the [DISCUSS] thread seems to have wound down, so I'd like to
> call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
> pasted below and also available on the wiki at:
> https://wiki.apache.org/incubator/KuduProposal
>
> The proposal is unchanged since the original version, except for the
> addition of Carl Steinbach as a Mentor.
>
> Please cast your votes:
>
> [] +1, accept Kudu into the Incubator
> [] +/-0, positive/negative non-counted expression of feelings
> [] -1, do not accept Kudu into the incubator (please state reasoning)

+1 (binding)

Best of luck, guys!

Thanks,
Roman.




Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-24 Thread Luke Han
+1 (non-binding)


Best Regards!
-

Luke Han

On Wed, Nov 25, 2015 at 7:47 AM, Julien Le Dem  wrote:

> +1 (binding)
>
> On Tue, Nov 24, 2015 at 1:57 PM, Edward J. Yoon wrote:
>
> > +1 (binding)
> >
> > On Wed, Nov 25, 2015 at 6:26 AM, Patrick Angeles wrote:
> > > +1 (non-binding)
> > >
> > > On Tue, Nov 24, 2015 at 4:23 PM, Jake Farrell wrote:
> > >
> > >> +1 (binding)
> > >>
> > >> -Jake
> > >>
> > >> On Tue, Nov 24, 2015 at 2:32 PM, Todd Lipcon  wrote:
> > >>
> > >> > Hi all,
> > >> >
> > >> > Discussion on the [DISCUSS] thread seems to have wound down, so I'd
> > like
> > >> to
> > >> > call a VOTE on acceptance of Kudu into the ASF Incubator. The
> > proposal is
> > >> > pasted below and also available on the wiki at:
> > >> > https://wiki.apache.org/incubator/KuduProposal
> > >> >
> > >> > The proposal is unchanged since the original version, except for the
> > >> > addition of Carl Steinbach as a Mentor.
> > >> >
> > >> > Please cast your votes:
> > >> >
> > >> > [] +1, accept Kudu into the Incubator
> > >> > [] +/-0, positive/negative non-counted expression of feelings
> > >> > [] -1, do not accept Kudu into the incubator (please state
> reasoning)
> > >> >
> > >> > Given the US holiday this week, I imagine many folks are traveling
> or
> > >> > otherwise offline. So, let's run the vote for a full week rather
> than
> > the
> > >> > traditional 72 hours. Unless the IPMC objects to the extended voting
> > >> > period, the vote will close on Tues, Dec 1st at noon PST.
> > >> >
> > >> > Thanks
> > >> > -Todd
> > >> > -
> > >> >
> > >> > = Kudu Proposal =
> > >> >
> > >> > == Abstract ==
> > >> >
> > >> > Kudu is a distributed columnar storage engine built for the Apache
> > Hadoop
> > >> > ecosystem.
> > >> >
> > >> > == Proposal ==
> > >> >
> > >> > Kudu is an open source storage engine for structured data which
> > supports
> > >> > low-latency random access together with efficient analytical access
> > >> > patterns. Kudu distributes data using horizontal partitioning and
> > >> > replicates each partition using Raft consensus, providing low
> > >> > mean-time-to-recovery and low tail latencies. Kudu is designed
> within
> > the
> > >> > context of the Apache Hadoop ecosystem and supports many
> integrations
> > >> with
> > >> > other data analytics projects both inside and outside of the Apache
> > >> > Software Foundation.
> > >> >
> > >> >
> > >> >
> > >> > We propose to incubate Kudu as a project of the Apache Software
> > >> Foundation.
> > >> >
> > >> > == Background ==
> > >> >
> > >> > In recent years, explosive growth in the amount of data being
> > generated
> > >> and
> > >> > captured by enterprises has resulted in the rapid adoption of open
> > source
> > >> > technology which is able to store massive data sets at scale and at
> > low
> > >> > cost. In particular, the Apache Hadoop ecosystem has become a focal
> > point
> > >> > for such “big data” workloads, because many traditional open source
> > >> > database systems have lagged in offering a scalable alternative.
> > >> >
> > >> >
> > >> >
> > >> > Structured storage in the Hadoop ecosystem has typically been
> > achieved in
> > >> > two ways: for static data sets, data is typically stored on Apache
> > HDFS
> > >> > using binary data formats such as Apache Avro or Apache Parquet.
> > However,
> > >> > neither HDFS nor these formats has any provision for updating
> > individual
> > >> > records, or for efficient random access. Mutable data sets are
> > typically
> > >> > stored in semi-structured stores such as Apache HBase or Apache
> > >> Cassandra.
> > >> > These systems allow for low-latency record-level reads and writes,
> but
> > >> lag
> > >> > far behind the static file formats in terms of sequential read
> > throughput
> > >> > for applications such as SQL-based analytics or machine learning.
> > >> >
> > >> >
> > >> >
> > >> > Kudu is a new storage system designed and implemented from the
> ground
> > up
> > >> to
> > >> > fill this gap between high-throughput sequential-access storage
> > systems
> > >> > such as HDFS and low-latency random-access systems such as HBase or
> > >> > Cassandra. While these existing systems continue to hold advantages
> in
> > >> some
> > >> > situations, Kudu offers a “happy medium” alternative that can
> > >> dramatically
> > >> > simplify the architecture of many common workloads. In particular,
> > Kudu
> > >> > offers a simple API for row-level inserts, updates, and deletes,
> while
> > >> > providing table scans at throughputs similar to Parquet, a
> > commonly-used
> > >> > columnar format for static data.
> > >> >
> > >> >
> > >> >
> > >> > More information on Kudu can be found at the existing open source
> > project
> > >> > website: http://getkudu.io and in particular in the Kudu
> white-paper
> > >> PDF:
> > >> > 

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-24 Thread John D. Ament
+1
On Nov 24, 2015 14:33, "Todd Lipcon"  wrote:

> Hi all,
>
> Discussion on the [DISCUSS] thread seems to have wound down, so I'd like to
> call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
> pasted below and also available on the wiki at:
> https://wiki.apache.org/incubator/KuduProposal
>
> The proposal is unchanged since the original version, except for the
> addition of Carl Steinbach as a Mentor.
>
> Please cast your votes:
>
> [] +1, accept Kudu into the Incubator
> [] +/-0, positive/negative non-counted expression of feelings
> [] -1, do not accept Kudu into the incubator (please state reasoning)
>
> Given the US holiday this week, I imagine many folks are traveling or
> otherwise offline. So, let's run the vote for a full week rather than the
> traditional 72 hours. Unless the IPMC objects to the extended voting
> period, the vote will close on Tues, Dec 1st at noon PST.
>
> Thanks
> -Todd
> -
>
> = Kudu Proposal =
>
> == Abstract ==
>
> Kudu is a distributed columnar storage engine built for the Apache Hadoop
> ecosystem.
>
> == Proposal ==
>
> Kudu is an open source storage engine for structured data which supports
> low-latency random access together with efficient analytical access
> patterns. Kudu distributes data using horizontal partitioning and
> replicates each partition using Raft consensus, providing low
> mean-time-to-recovery and low tail latencies. Kudu is designed within the
> context of the Apache Hadoop ecosystem and supports many integrations with
> other data analytics projects both inside and outside of the Apache
> Software Foundation.
>
>
>
> We propose to incubate Kudu as a project of the Apache Software Foundation.
>
> == Background ==
>
> In recent years, explosive growth in the amount of data being generated and
> captured by enterprises has resulted in the rapid adoption of open source
> technology which is able to store massive data sets at scale and at low
> cost. In particular, the Apache Hadoop ecosystem has become a focal point
> for such “big data” workloads, because many traditional open source
> database systems have lagged in offering a scalable alternative.
>
>
>
> Structured storage in the Hadoop ecosystem has typically been achieved in
> two ways: for static data sets, data is typically stored on Apache HDFS
> using binary data formats such as Apache Avro or Apache Parquet. However,
> neither HDFS nor these formats has any provision for updating individual
> records, or for efficient random access. Mutable data sets are typically
> stored in semi-structured stores such as Apache HBase or Apache Cassandra.
> These systems allow for low-latency record-level reads and writes, but lag
> far behind the static file formats in terms of sequential read throughput
> for applications such as SQL-based analytics or machine learning.
>
>
>
> Kudu is a new storage system designed and implemented from the ground up to
> fill this gap between high-throughput sequential-access storage systems
> such as HDFS and low-latency random-access systems such as HBase or
> Cassandra. While these existing systems continue to hold advantages in some
> situations, Kudu offers a “happy medium” alternative that can dramatically
> simplify the architecture of many common workloads. In particular, Kudu
> offers a simple API for row-level inserts, updates, and deletes, while
> providing table scans at throughputs similar to Parquet, a commonly-used
> columnar format for static data.
>
>
>
> More information on Kudu can be found at the existing open source project
> website: http://getkudu.io and in particular in the Kudu white-paper PDF:
> http://getkudu.io/kudu.pdf from which the above was excerpted.
>
> == Rationale ==
>
> As described above, Kudu fills an important gap in the open source storage
> ecosystem. After our initial open source project release in September 2015,
> we have seen a great amount of interest across a diverse set of users and
> companies. We believe that, as a storage system, it is critical to build an
> equally diverse set of contributors in the development community. Our
> experiences as committers and PMC members on other Apache projects have
> taught us the value of diverse communities in ensuring both longevity and
> high quality for such foundational systems.
>
> == Initial Goals ==
>
>  * Move the existing codebase, website, documentation, and mailing lists to
> Apache-hosted infrastructure
>  * Work with the infrastructure team to implement and approve our code
> review, build, and testing workflows in the context of the ASF
>  * Incremental development and releases per Apache guidelines
>
> == Current Status ==
>
> === Releases ===
>
> Kudu has undergone one public release, tagged here
> https://github.com/cloudera/kudu/tree/kudu0.5.0-release
>
> This initial release was not performed in the typical ASF fashion -- no
> source tarball was released, but rather only convenience binaries made
> available in Cloudera’s 

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-24 Thread Hyunsik Choi
+1 (binding)

Good luck!

On Tue, Nov 24, 2015 at 5:29 PM, Sean Busbey  wrote:
> +1 (binding)
>
> On Tue, Nov 24, 2015 at 1:32 PM, Todd Lipcon  wrote:
>
>> Hi all,
>>
>> Discussion on the [DISCUSS] thread seems to have wound down, so I'd like to
>> call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
>> pasted below and also available on the wiki at:
>> https://wiki.apache.org/incubator/KuduProposal
>>
>> The proposal is unchanged since the original version, except for the
>> addition of Carl Steinbach as a Mentor.
>>
>> Please cast your votes:
>>
>> [] +1, accept Kudu into the Incubator
>> [] +/-0, positive/negative non-counted expression of feelings
>> [] -1, do not accept Kudu into the incubator (please state reasoning)
>>
>> Given the US holiday this week, I imagine many folks are traveling or
>> otherwise offline. So, let's run the vote for a full week rather than the
>> traditional 72 hours. Unless the IPMC objects to the extended voting
>> period, the vote will close on Tues, Dec 1st at noon PST.
>>
>> Thanks
>> -Todd
>> -
>>
>> = Kudu Proposal =
>>
>> == Abstract ==
>>
>> Kudu is a distributed columnar storage engine built for the Apache Hadoop
>> ecosystem.
>>
>> == Proposal ==
>>
>> Kudu is an open source storage engine for structured data which supports
>> low-latency random access together with efficient analytical access
>> patterns. Kudu distributes data using horizontal partitioning and
>> replicates each partition using Raft consensus, providing low
>> mean-time-to-recovery and low tail latencies. Kudu is designed within the
>> context of the Apache Hadoop ecosystem and supports many integrations with
>> other data analytics projects both inside and outside of the Apache
>> Software Foundation.
>>
>>
>>
>> We propose to incubate Kudu as a project of the Apache Software Foundation.
>>
>> == Background ==
>>
>> In recent years, explosive growth in the amount of data being generated and
>> captured by enterprises has resulted in the rapid adoption of open source
>> technology which is able to store massive data sets at scale and at low
>> cost. In particular, the Apache Hadoop ecosystem has become a focal point
>> for such “big data” workloads, because many traditional open source
>> database systems have lagged in offering a scalable alternative.
>>
>>
>>
>> Structured storage in the Hadoop ecosystem has typically been achieved in
>> two ways: for static data sets, data is typically stored on Apache HDFS
>> using binary data formats such as Apache Avro or Apache Parquet. However,
>> neither HDFS nor these formats has any provision for updating individual
>> records, or for efficient random access. Mutable data sets are typically
>> stored in semi-structured stores such as Apache HBase or Apache Cassandra.
>> These systems allow for low-latency record-level reads and writes, but lag
>> far behind the static file formats in terms of sequential read throughput
>> for applications such as SQL-based analytics or machine learning.
>>
>>
>>
>> Kudu is a new storage system designed and implemented from the ground up to
>> fill this gap between high-throughput sequential-access storage systems
>> such as HDFS and low-latency random-access systems such as HBase or
>> Cassandra. While these existing systems continue to hold advantages in some
>> situations, Kudu offers a “happy medium” alternative that can dramatically
>> simplify the architecture of many common workloads. In particular, Kudu
>> offers a simple API for row-level inserts, updates, and deletes, while
>> providing table scans at throughputs similar to Parquet, a commonly-used
>> columnar format for static data.
>>
>>
>>
>> More information on Kudu can be found at the existing open source project
>> website: http://getkudu.io and in particular in the Kudu white-paper PDF:
>> http://getkudu.io/kudu.pdf from which the above was excerpted.
>>
>> == Rationale ==
>>
>> As described above, Kudu fills an important gap in the open source storage
>> ecosystem. After our initial open source project release in September 2015,
>> we have seen a great amount of interest across a diverse set of users and
>> companies. We believe that, as a storage system, it is critical to build an
>> equally diverse set of contributors in the development community. Our
>> experiences as committers and PMC members on other Apache projects have
>> taught us the value of diverse communities in ensuring both longevity and
>> high quality for such foundational systems.
>>
>> == Initial Goals ==
>>
>>  * Move the existing codebase, website, documentation, and mailing lists to
>> Apache-hosted infrastructure
>>  * Work with the infrastructure team to implement and approve our code
>> review, build, and testing workflows in the context of the ASF
>>  * Incremental development and releases per Apache guidelines
>>
>> == Current Status ==
>>
>> === Releases ===
>>
>> Kudu has undergone one public release, tagged 

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-24 Thread Sean Busbey
+1 (binding)

On Tue, Nov 24, 2015 at 1:32 PM, Todd Lipcon  wrote:

> Hi all,
>
> Discussion on the [DISCUSS] thread seems to have wound down, so I'd like to
> call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
> pasted below and also available on the wiki at:
> https://wiki.apache.org/incubator/KuduProposal
>
> The proposal is unchanged since the original version, except for the
> addition of Carl Steinbach as a Mentor.
>
> Please cast your votes:
>
> [] +1, accept Kudu into the Incubator
> [] +/-0, positive/negative non-counted expression of feelings
> [] -1, do not accept Kudu into the incubator (please state reasoning)
>
> Given the US holiday this week, I imagine many folks are traveling or
> otherwise offline. So, let's run the vote for a full week rather than the
> traditional 72 hours. Unless the IPMC objects to the extended voting
> period, the vote will close on Tues, Dec 1st at noon PST.
>
> Thanks
> -Todd
> -
>
> = Kudu Proposal =
>
> == Abstract ==
>
> Kudu is a distributed columnar storage engine built for the Apache Hadoop
> ecosystem.
>
> == Proposal ==
>
> Kudu is an open source storage engine for structured data which supports
> low-latency random access together with efficient analytical access
> patterns. Kudu distributes data using horizontal partitioning and
> replicates each partition using Raft consensus, providing low
> mean-time-to-recovery and low tail latencies. Kudu is designed within the
> context of the Apache Hadoop ecosystem and supports many integrations with
> other data analytics projects both inside and outside of the Apache
> Software Foundation.
>
>
>
> We propose to incubate Kudu as a project of the Apache Software Foundation.
>
> == Background ==
>
> In recent years, explosive growth in the amount of data being generated and
> captured by enterprises has resulted in the rapid adoption of open source
> technology which is able to store massive data sets at scale and at low
> cost. In particular, the Apache Hadoop ecosystem has become a focal point
> for such “big data” workloads, because many traditional open source
> database systems have lagged in offering a scalable alternative.
>
>
>
> Structured storage in the Hadoop ecosystem has typically been achieved in
> two ways: for static data sets, data is typically stored on Apache HDFS
> using binary data formats such as Apache Avro or Apache Parquet. However,
> neither HDFS nor these formats has any provision for updating individual
> records, or for efficient random access. Mutable data sets are typically
> stored in semi-structured stores such as Apache HBase or Apache Cassandra.
> These systems allow for low-latency record-level reads and writes, but lag
> far behind the static file formats in terms of sequential read throughput
> for applications such as SQL-based analytics or machine learning.
>
>
>
> Kudu is a new storage system designed and implemented from the ground up to
> fill this gap between high-throughput sequential-access storage systems
> such as HDFS and low-latency random-access systems such as HBase or
> Cassandra. While these existing systems continue to hold advantages in some
> situations, Kudu offers a “happy medium” alternative that can dramatically
> simplify the architecture of many common workloads. In particular, Kudu
> offers a simple API for row-level inserts, updates, and deletes, while
> providing table scans at throughputs similar to Parquet, a commonly-used
> columnar format for static data.
>
>
>
> More information on Kudu can be found at the existing open source project
> website: http://getkudu.io and in particular in the Kudu white-paper PDF:
> http://getkudu.io/kudu.pdf from which the above was excerpted.
>
> == Rationale ==
>
> As described above, Kudu fills an important gap in the open source storage
> ecosystem. After our initial open source project release in September 2015,
> we have seen a great amount of interest across a diverse set of users and
> companies. We believe that, as a storage system, it is critical to build an
> equally diverse set of contributors in the development community. Our
> experiences as committers and PMC members on other Apache projects have
> taught us the value of diverse communities in ensuring both longevity and
> high quality for such foundational systems.
>
> == Initial Goals ==
>
>  * Move the existing codebase, website, documentation, and mailing lists to
> Apache-hosted infrastructure
>  * Work with the infrastructure team to implement and approve our code
> review, build, and testing workflows in the context of the ASF
>  * Incremental development and releases per Apache guidelines
>
> == Current Status ==
>
> === Releases ===
>
> Kudu has undergone one public release, tagged here
> https://github.com/cloudera/kudu/tree/kudu0.5.0-release
>
> This initial release was not performed in the typical ASF fashion -- no
> source tarball was released, but rather only convenience binaries made
> 

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-24 Thread Carl Steinbach
+1 (binding)


On Tue, Nov 24, 2015 at 5:39 PM, John D. Ament wrote:

> +1
> On Nov 24, 2015 14:33, "Todd Lipcon"  wrote:
>
> > Hi all,
> >
> > Discussion on the [DISCUSS] thread seems to have wound down, so I'd like
> to
> > call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
> > pasted below and also available on the wiki at:
> > https://wiki.apache.org/incubator/KuduProposal
> >
> > The proposal is unchanged since the original version, except for the
> > addition of Carl Steinbach as a Mentor.
> >
> > Please cast your votes:
> >
> > [] +1, accept Kudu into the Incubator
> > [] +/-0, positive/negative non-counted expression of feelings
> > [] -1, do not accept Kudu into the incubator (please state reasoning)
> >
> > Given the US holiday this week, I imagine many folks are traveling or
> > otherwise offline. So, let's run the vote for a full week rather than the
> > traditional 72 hours. Unless the IPMC objects to the extended voting
> > period, the vote will close on Tues, Dec 1st at noon PST.
> >
> > Thanks
> > -Todd
> > -
> >
> > = Kudu Proposal =
> >
> > == Abstract ==
> >
> > Kudu is a distributed columnar storage engine built for the Apache Hadoop
> > ecosystem.
> >
> > == Proposal ==
> >
> > Kudu is an open source storage engine for structured data which supports
> > low-latency random access together with efficient analytical access
> > patterns. Kudu distributes data using horizontal partitioning and
> > replicates each partition using Raft consensus, providing low
> > mean-time-to-recovery and low tail latencies. Kudu is designed within the
> > context of the Apache Hadoop ecosystem and supports many integrations
> with
> > other data analytics projects both inside and outside of the Apache
> > Software Foundation.
> >
> >
> >
> > We propose to incubate Kudu as a project of the Apache Software
> Foundation.
> >
> > == Background ==
> >
> > In recent years, explosive growth in the amount of data being generated
> and
> > captured by enterprises has resulted in the rapid adoption of open source
> > technology which is able to store massive data sets at scale and at low
> > cost. In particular, the Apache Hadoop ecosystem has become a focal point
> > for such “big data” workloads, because many traditional open source
> > database systems have lagged in offering a scalable alternative.
> >
> >
> >
> > Structured storage in the Hadoop ecosystem has typically been achieved in
> > two ways: for static data sets, data is typically stored on Apache HDFS
> > using binary data formats such as Apache Avro or Apache Parquet. However,
> > neither HDFS nor these formats has any provision for updating individual
> > records, or for efficient random access. Mutable data sets are typically
> > stored in semi-structured stores such as Apache HBase or Apache
> Cassandra.
> > These systems allow for low-latency record-level reads and writes, but
> lag
> > far behind the static file formats in terms of sequential read throughput
> > for applications such as SQL-based analytics or machine learning.
> >
> >
> >
> > Kudu is a new storage system designed and implemented from the ground up
> to
> > fill this gap between high-throughput sequential-access storage systems
> > such as HDFS and low-latency random-access systems such as HBase or
> > Cassandra. While these existing systems continue to hold advantages in
> some
> > situations, Kudu offers a “happy medium” alternative that can
> dramatically
> > simplify the architecture of many common workloads. In particular, Kudu
> > offers a simple API for row-level inserts, updates, and deletes, while
> > providing table scans at throughputs similar to Parquet, a commonly-used
> > columnar format for static data.
> >
> >
> >
> > More information on Kudu can be found at the existing open source project
> > website: http://getkudu.io and in particular in the Kudu white-paper
> PDF:
> > http://getkudu.io/kudu.pdf from which the above was excerpted.
> >
> > == Rationale ==
> >
> > As described above, Kudu fills an important gap in the open source
> storage
> > ecosystem. After our initial open source project release in September
> 2015,
> > we have seen a great amount of interest across a diverse set of users and
> > companies. We believe that, as a storage system, it is critical to build
> an
> > equally diverse set of contributors in the development community. Our
> > experiences as committers and PMC members on other Apache projects have
> > taught us the value of diverse communities in ensuring both longevity and
> > high quality for such foundational systems.
> >
> > == Initial Goals ==
> >
> >  * Move the existing codebase, website, documentation, and mailing lists
> to
> > Apache-hosted infrastructure
> >  * Work with the infrastructure team to implement and approve our code
> > review, build, and testing workflows in the context of the ASF
> >  * Incremental development and releases per Apache guidelines

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-24 Thread Brock Noland
+1 (binding)

On Tuesday, November 24, 2015, Carl Steinbach  wrote:

> +1 (binding)
>
>
> On Tue, Nov 24, 2015 at 5:39 PM, John D. Ament wrote:
>
> > +1
> > On Nov 24, 2015 14:33, "Todd Lipcon" wrote:
> >
> > > Hi all,
> > >
> > > Discussion on the [DISCUSS] thread seems to have wound down, so I'd
> like
> > to
> > > call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal
> is
> > > pasted below and also available on the wiki at:
> > > https://wiki.apache.org/incubator/KuduProposal
> > >
> > > The proposal is unchanged since the original version, except for the
> > > addition of Carl Steinbach as a Mentor.
> > >
> > > Please cast your votes:
> > >
> > > [] +1, accept Kudu into the Incubator
> > > [] +/-0, positive/negative non-counted expression of feelings
> > > [] -1, do not accept Kudu into the incubator (please state reasoning)
> > >
> > > Given the US holiday this week, I imagine many folks are traveling or
> > > otherwise offline. So, let's run the vote for a full week rather than
> the
> > > traditional 72 hours. Unless the IPMC objects to the extended voting
> > > period, the vote will close on Tues, Dec 1st at noon PST.
> > >
> > > Thanks
> > > -Todd
> > > -
> > >
> > > = Kudu Proposal =
> > >
> > > == Abstract ==
> > >
> > > Kudu is a distributed columnar storage engine built for the Apache
> Hadoop
> > > ecosystem.
> > >
> > > == Proposal ==
> > >
> > > Kudu is an open source storage engine for structured data which
> supports
> > > low-latency random access together with efficient analytical access
> > > patterns. Kudu distributes data using horizontal partitioning and
> > > replicates each partition using Raft consensus, providing low
> > > mean-time-to-recovery and low tail latencies. Kudu is designed within
> the
> > > context of the Apache Hadoop ecosystem and supports many integrations
> > with
> > > other data analytics projects both inside and outside of the Apache
> > > Software Foundation.
> > >
> > >
> > >
> > > We propose to incubate Kudu as a project of the Apache Software
> > Foundation.
> > >
> > > == Background ==
> > >
> > > In recent years, explosive growth in the amount of data being generated and
> > > captured by enterprises has resulted in the rapid adoption of open source
> > > technology which is able to store massive data sets at scale and at low
> > > cost. In particular, the Apache Hadoop ecosystem has become a focal point
> > > for such “big data” workloads, because many traditional open source
> > > database systems have lagged in offering a scalable alternative.
> > >
> > >
> > >
> > > Structured storage in the Hadoop ecosystem has typically been achieved in
> > > two ways: for static data sets, data is typically stored on Apache HDFS
> > > using binary data formats such as Apache Avro or Apache Parquet. However,
> > > neither HDFS nor these formats has any provision for updating individual
> > > records, or for efficient random access. Mutable data sets are typically
> > > stored in semi-structured stores such as Apache HBase or Apache Cassandra.
> > > These systems allow for low-latency record-level reads and writes, but lag
> > > far behind the static file formats in terms of sequential read throughput
> > > for applications such as SQL-based analytics or machine learning.
> > >
> > >
> > >
> > > Kudu is a new storage system designed and implemented from the ground up to
> > > fill this gap between high-throughput sequential-access storage systems
> > > such as HDFS and low-latency random-access systems such as HBase or
> > > Cassandra. While these existing systems continue to hold advantages in some
> > > situations, Kudu offers a “happy medium” alternative that can dramatically
> > > simplify the architecture of many common workloads. In particular, Kudu
> > > offers a simple API for row-level inserts, updates, and deletes, while
> > > providing table scans at throughputs similar to Parquet, a commonly-used
> > > columnar format for static data.
> > >
> > >
> > >
> > > More information on Kudu can be found at the existing open source project
> > > website: http://getkudu.io and in particular in the Kudu white-paper PDF:
> > > http://getkudu.io/kudu.pdf from which the above was excerpted.
> > >
> > > == Rationale ==
> > >
> > > As described above, Kudu fills an important gap in the open source storage
> > > ecosystem. After our initial open source project release in September 2015,
> > > we have seen a great amount of interest across a diverse set of users and
> > > companies. We believe that, as a storage system, it is critical to build an
> > > equally diverse set of contributors in the development community. Our
> > > experiences as committers and PMC members on other Apache projects have
> > > taught us the value of diverse communities in ensuring both longevity and
> > > high 

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-24 Thread Ralph Goers
-1 (binding)
I’d like to see the project start with CTR and use RTC only for specific cases
(like where tests must be modified, over X (1000 lines?) of code added, etc.).

I must say I do find the part about achieving quality through automation 
attractive, but following that up with requiring RTC leads me to conclude that 
the project doesn’t really believe that to be true.

Ralph

> On Nov 24, 2015, at 12:32 PM, Todd Lipcon  wrote:
> 
> Hi all,
> 
> Discussion on the [DISCUSS] thread seems to have wound down, so I'd like to
> call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
> pasted below and also available on the wiki at:
> https://wiki.apache.org/incubator/KuduProposal
> 
> The proposal is unchanged since the original version, except for the
> addition of Carl Steinbach as a Mentor.
> 
> Please cast your votes:
> 
> [] +1, accept Kudu into the Incubator
> [] +/-0, positive/negative non-counted expression of feelings
> [] -1, do not accept Kudu into the incubator (please state reasoning)
> 
> Given the US holiday this week, I imagine many folks are traveling or
> otherwise offline. So, let's run the vote for a full week rather than the
> traditional 72 hours. Unless the IPMC objects to the extended voting
> period, the vote will close on Tues, Dec 1st at noon PST.
> 
> Thanks
> -Todd
> -
> 
> = Kudu Proposal =
> 
> == Abstract ==
> 
> Kudu is a distributed columnar storage engine built for the Apache Hadoop
> ecosystem.
> 
> == Proposal ==
> 
> Kudu is an open source storage engine for structured data which supports
> low-latency random access together with efficient analytical access
> patterns. Kudu distributes data using horizontal partitioning and
> replicates each partition using Raft consensus, providing low
> mean-time-to-recovery and low tail latencies. Kudu is designed within the
> context of the Apache Hadoop ecosystem and supports many integrations with
> other data analytics projects both inside and outside of the Apache
> Software Foundation.
> 
> 
> 
> We propose to incubate Kudu as a project of the Apache Software Foundation.
> 
> == Background ==
> 
> In recent years, explosive growth in the amount of data being generated and
> captured by enterprises has resulted in the rapid adoption of open source
> technology which is able to store massive data sets at scale and at low
> cost. In particular, the Apache Hadoop ecosystem has become a focal point
> for such “big data” workloads, because many traditional open source
> database systems have lagged in offering a scalable alternative.
> 
> 
> 
> Structured storage in the Hadoop ecosystem has typically been achieved in
> two ways: for static data sets, data is typically stored on Apache HDFS
> using binary data formats such as Apache Avro or Apache Parquet. However,
> neither HDFS nor these formats has any provision for updating individual
> records, or for efficient random access. Mutable data sets are typically
> stored in semi-structured stores such as Apache HBase or Apache Cassandra.
> These systems allow for low-latency record-level reads and writes, but lag
> far behind the static file formats in terms of sequential read throughput
> for applications such as SQL-based analytics or machine learning.
> 
> 
> 
> Kudu is a new storage system designed and implemented from the ground up to
> fill this gap between high-throughput sequential-access storage systems
> such as HDFS and low-latency random-access systems such as HBase or
> Cassandra. While these existing systems continue to hold advantages in some
> situations, Kudu offers a “happy medium” alternative that can dramatically
> simplify the architecture of many common workloads. In particular, Kudu
> offers a simple API for row-level inserts, updates, and deletes, while
> providing table scans at throughputs similar to Parquet, a commonly-used
> columnar format for static data.
> 
> 
> 
> More information on Kudu can be found at the existing open source project
> website: http://getkudu.io and in particular in the Kudu white-paper PDF:
> http://getkudu.io/kudu.pdf from which the above was excerpted.
> 
> == Rationale ==
> 
> As described above, Kudu fills an important gap in the open source storage
> ecosystem. After our initial open source project release in September 2015,
> we have seen a great amount of interest across a diverse set of users and
> companies. We believe that, as a storage system, it is critical to build an
> equally diverse set of contributors in the development community. Our
> experiences as committers and PMC members on other Apache projects have
> taught us the value of diverse communities in ensuring both longevity and
> high quality for such foundational systems.
> 
> == Initial Goals ==
> 
> * Move the existing codebase, website, documentation, and mailing lists to
> Apache-hosted infrastructure
> * Work with the infrastructure team to implement and approve our code
> review, build, and testing workflows in the 

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-24 Thread Owen O'Malley
+1 (binding)

On Tue, Nov 24, 2015 at 9:13 PM, Ralph Goers wrote:

> -1 (binding)
> I’d like to see the project start with CTR and use RTC only for specific
> cases (like where tests must be modified, over X (1000 lines?) of code
> added, etc.).
>
> I must say I do find the part about achieving quality through automation
> attractive, but following that up with requiring RTC leads me to conclude
> that the project doesn’t really believe that to be true.
>
> Ralph
>
> > On Nov 24, 2015, at 12:32 PM, Todd Lipcon  wrote:
> >
> > Hi all,
> >
> > Discussion on the [DISCUSS] thread seems to have wound down, so I'd like to
> > call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
> > pasted below and also available on the wiki at:
> > https://wiki.apache.org/incubator/KuduProposal
> >
> > The proposal is unchanged since the original version, except for the
> > addition of Carl Steinbach as a Mentor.
> >
> > Please cast your votes:
> >
> > [] +1, accept Kudu into the Incubator
> > [] +/-0, positive/negative non-counted expression of feelings
> > [] -1, do not accept Kudu into the incubator (please state reasoning)
> >
> > Given the US holiday this week, I imagine many folks are traveling or
> > otherwise offline. So, let's run the vote for a full week rather than the
> > traditional 72 hours. Unless the IPMC objects to the extended voting
> > period, the vote will close on Tues, Dec 1st at noon PST.
> >
> > Thanks
> > -Todd
> > -
> >
> > = Kudu Proposal =
> >
> > == Abstract ==
> >
> > Kudu is a distributed columnar storage engine built for the Apache Hadoop
> > ecosystem.
> >
> > == Proposal ==
> >
> > Kudu is an open source storage engine for structured data which supports
> > low-latency random access together with efficient analytical access
> > patterns. Kudu distributes data using horizontal partitioning and
> > replicates each partition using Raft consensus, providing low
> > mean-time-to-recovery and low tail latencies. Kudu is designed within the
> > context of the Apache Hadoop ecosystem and supports many integrations with
> > other data analytics projects both inside and outside of the Apache
> > Software Foundation.
> >
> >
> >
> > We propose to incubate Kudu as a project of the Apache Software Foundation.
> >
> > == Background ==
> >
> > In recent years, explosive growth in the amount of data being generated and
> > captured by enterprises has resulted in the rapid adoption of open source
> > technology which is able to store massive data sets at scale and at low
> > cost. In particular, the Apache Hadoop ecosystem has become a focal point
> > for such “big data” workloads, because many traditional open source
> > database systems have lagged in offering a scalable alternative.
> >
> >
> >
> > Structured storage in the Hadoop ecosystem has typically been achieved in
> > two ways: for static data sets, data is typically stored on Apache HDFS
> > using binary data formats such as Apache Avro or Apache Parquet. However,
> > neither HDFS nor these formats has any provision for updating individual
> > records, or for efficient random access. Mutable data sets are typically
> > stored in semi-structured stores such as Apache HBase or Apache Cassandra.
> > These systems allow for low-latency record-level reads and writes, but lag
> > far behind the static file formats in terms of sequential read throughput
> > for applications such as SQL-based analytics or machine learning.
> >
> >
> >
> > Kudu is a new storage system designed and implemented from the ground up to
> > fill this gap between high-throughput sequential-access storage systems
> > such as HDFS and low-latency random-access systems such as HBase or
> > Cassandra. While these existing systems continue to hold advantages in some
> > situations, Kudu offers a “happy medium” alternative that can dramatically
> > simplify the architecture of many common workloads. In particular, Kudu
> > offers a simple API for row-level inserts, updates, and deletes, while
> > providing table scans at throughputs similar to Parquet, a commonly-used
> > columnar format for static data.
> >
> >
> >
> > More information on Kudu can be found at the existing open source project
> > website: http://getkudu.io and in particular in the Kudu white-paper PDF:
> > http://getkudu.io/kudu.pdf from which the above was excerpted.
> >
> > == Rationale ==
> >
> > As described above, Kudu fills an important gap in the open source storage
> > ecosystem. After our initial open source project release in September 2015,
> > we have seen a great amount of interest across a diverse set of users and
> > companies. We believe that, as a storage system, it is critical to build an
> > equally diverse set of contributors in the development community. Our
> > experiences as committers and PMC members on other Apache projects have
> > taught us the value of diverse communities in ensuring both longevity 

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-24 Thread Greg Stein
-1 (binding)

Starting with RTC is a poor way to attract new community members. I'd like
to see this community use CTR instead of mandating gerrit reviews.

(ref: other-threads about lack of trust, and control issues; poor basis for
a community)

On Tue, Nov 24, 2015 at 1:32 PM, Todd Lipcon  wrote:

> Hi all,
>
> Discussion on the [DISCUSS] thread seems to have wound down, so I'd like to
> call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
> pasted below and also available on the wiki at:
> https://wiki.apache.org/incubator/KuduProposal
>
> The proposal is unchanged since the original version, except for the
> addition of Carl Steinbach as a Mentor.
>
> Please cast your votes:
>
> [] +1, accept Kudu into the Incubator
> [] +/-0, positive/negative non-counted expression of feelings
> [] -1, do not accept Kudu into the incubator (please state reasoning)
>
> Given the US holiday this week, I imagine many folks are traveling or
> otherwise offline. So, let's run the vote for a full week rather than the
> traditional 72 hours. Unless the IPMC objects to the extended voting
> period, the vote will close on Tues, Dec 1st at noon PST.
>
> Thanks
> -Todd
> -
>
> = Kudu Proposal =
>
> == Abstract ==
>
> Kudu is a distributed columnar storage engine built for the Apache Hadoop
> ecosystem.
>
> == Proposal ==
>
> Kudu is an open source storage engine for structured data which supports
> low-latency random access together with efficient analytical access
> patterns. Kudu distributes data using horizontal partitioning and
> replicates each partition using Raft consensus, providing low
> mean-time-to-recovery and low tail latencies. Kudu is designed within the
> context of the Apache Hadoop ecosystem and supports many integrations with
> other data analytics projects both inside and outside of the Apache
> Software Foundation.
>
>
>
> We propose to incubate Kudu as a project of the Apache Software Foundation.
>
> == Background ==
>
> In recent years, explosive growth in the amount of data being generated and
> captured by enterprises has resulted in the rapid adoption of open source
> technology which is able to store massive data sets at scale and at low
> cost. In particular, the Apache Hadoop ecosystem has become a focal point
> for such “big data” workloads, because many traditional open source
> database systems have lagged in offering a scalable alternative.
>
>
>
> Structured storage in the Hadoop ecosystem has typically been achieved in
> two ways: for static data sets, data is typically stored on Apache HDFS
> using binary data formats such as Apache Avro or Apache Parquet. However,
> neither HDFS nor these formats has any provision for updating individual
> records, or for efficient random access. Mutable data sets are typically
> stored in semi-structured stores such as Apache HBase or Apache Cassandra.
> These systems allow for low-latency record-level reads and writes, but lag
> far behind the static file formats in terms of sequential read throughput
> for applications such as SQL-based analytics or machine learning.
>
>
>
> Kudu is a new storage system designed and implemented from the ground up to
> fill this gap between high-throughput sequential-access storage systems
> such as HDFS and low-latency random-access systems such as HBase or
> Cassandra. While these existing systems continue to hold advantages in some
> situations, Kudu offers a “happy medium” alternative that can dramatically
> simplify the architecture of many common workloads. In particular, Kudu
> offers a simple API for row-level inserts, updates, and deletes, while
> providing table scans at throughputs similar to Parquet, a commonly-used
> columnar format for static data.
>
>
>
> More information on Kudu can be found at the existing open source project
> website: http://getkudu.io and in particular in the Kudu white-paper PDF:
> http://getkudu.io/kudu.pdf from which the above was excerpted.
>
> == Rationale ==
>
> As described above, Kudu fills an important gap in the open source storage
> ecosystem. After our initial open source project release in September 2015,
> we have seen a great amount of interest across a diverse set of users and
> companies. We believe that, as a storage system, it is critical to build an
> equally diverse set of contributors in the development community. Our
> experiences as committers and PMC members on other Apache projects have
> taught us the value of diverse communities in ensuring both longevity and
> high quality for such foundational systems.
>
> == Initial Goals ==
>
>  * Move the existing codebase, website, documentation, and mailing lists to
> Apache-hosted infrastructure
>  * Work with the infrastructure team to implement and approve our code
> review, build, and testing workflows in the context of the ASF
>  * Incremental development and releases per Apache guidelines
>
> == Current Status ==
>
>  Releases 
>
> Kudu has undergone one public 

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-24 Thread Julien Le Dem
+1 (binding)

On Tue, Nov 24, 2015 at 1:57 PM, Edward J. Yoon wrote:

> +1 (binding)
>
> On Wed, Nov 25, 2015 at 6:26 AM, Patrick Angeles wrote:
> > +1 (non-binding)
> >
> > On Tue, Nov 24, 2015 at 4:23 PM, Jake Farrell wrote:
> >
> >> +1 (binding)
> >>
> >> -Jake
> >>
> >> On Tue, Nov 24, 2015 at 2:32 PM, Todd Lipcon  wrote:
> >>
> >> > Hi all,
> >> >
> >> > Discussion on the [DISCUSS] thread seems to have wound down, so I'd like to
> >> > call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
> >> > pasted below and also available on the wiki at:
> >> > https://wiki.apache.org/incubator/KuduProposal
> >> >
> >> > The proposal is unchanged since the original version, except for the
> >> > addition of Carl Steinbach as a Mentor.
> >> >
> >> > Please cast your votes:
> >> >
> >> > [] +1, accept Kudu into the Incubator
> >> > [] +/-0, positive/negative non-counted expression of feelings
> >> > [] -1, do not accept Kudu into the incubator (please state reasoning)
> >> >
> >> > Given the US holiday this week, I imagine many folks are traveling or
> >> > otherwise offline. So, let's run the vote for a full week rather than the
> >> > traditional 72 hours. Unless the IPMC objects to the extended voting
> >> > period, the vote will close on Tues, Dec 1st at noon PST.
> >> >
> >> > Thanks
> >> > -Todd
> >> > -
> >> >
> >> > = Kudu Proposal =
> >> >
> >> > == Abstract ==
> >> >
> >> > Kudu is a distributed columnar storage engine built for the Apache Hadoop
> >> > ecosystem.
> >> >
> >> > == Proposal ==
> >> >
> >> > Kudu is an open source storage engine for structured data which supports
> >> > low-latency random access together with efficient analytical access
> >> > patterns. Kudu distributes data using horizontal partitioning and
> >> > replicates each partition using Raft consensus, providing low
> >> > mean-time-to-recovery and low tail latencies. Kudu is designed within the
> >> > context of the Apache Hadoop ecosystem and supports many integrations with
> >> > other data analytics projects both inside and outside of the Apache
> >> > Software Foundation.
> >> >
> >> >
> >> >
> >> > We propose to incubate Kudu as a project of the Apache Software Foundation.
> >> >
> >> > == Background ==
> >> >
> >> > In recent years, explosive growth in the amount of data being generated and
> >> > captured by enterprises has resulted in the rapid adoption of open source
> >> > technology which is able to store massive data sets at scale and at low
> >> > cost. In particular, the Apache Hadoop ecosystem has become a focal point
> >> > for such “big data” workloads, because many traditional open source
> >> > database systems have lagged in offering a scalable alternative.
> >> >
> >> >
> >> >
> >> > Structured storage in the Hadoop ecosystem has typically been achieved in
> >> > two ways: for static data sets, data is typically stored on Apache HDFS
> >> > using binary data formats such as Apache Avro or Apache Parquet. However,
> >> > neither HDFS nor these formats has any provision for updating individual
> >> > records, or for efficient random access. Mutable data sets are typically
> >> > stored in semi-structured stores such as Apache HBase or Apache Cassandra.
> >> > These systems allow for low-latency record-level reads and writes, but lag
> >> > far behind the static file formats in terms of sequential read throughput
> >> > for applications such as SQL-based analytics or machine learning.
> >> >
> >> >
> >> >
> >> > Kudu is a new storage system designed and implemented from the ground up to
> >> > fill this gap between high-throughput sequential-access storage systems
> >> > such as HDFS and low-latency random-access systems such as HBase or
> >> > Cassandra. While these existing systems continue to hold advantages in some
> >> > situations, Kudu offers a “happy medium” alternative that can dramatically
> >> > simplify the architecture of many common workloads. In particular, Kudu
> >> > offers a simple API for row-level inserts, updates, and deletes, while
> >> > providing table scans at throughputs similar to Parquet, a commonly-used
> >> > columnar format for static data.
> >> >
> >> >
> >> >
> >> > More information on Kudu can be found at the existing open source project
> >> > website: http://getkudu.io and in particular in the Kudu white-paper PDF:
> >> > http://getkudu.io/kudu.pdf from which the above was excerpted.
> >> >
> >> > == Rationale ==
> >> >
> >> > As described above, Kudu fills an important gap in the open source storage
> >> > ecosystem. After our initial open source project release in September 2015,
> >> > we have seen a great amount of interest across a diverse set of users and
> >> > companies. We believe that, as a storage system, it is critical to build an
> >> > 

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-24 Thread Jacques Nadeau
+1

Great to see this coming to the foundation!

On Tue, Nov 24, 2015 at 11:33 AM, Todd Lipcon  wrote:

> On Tue, Nov 24, 2015 at 11:32 AM, Todd Lipcon  wrote:
>
> > Hi all,
> >
> > Discussion on the [DISCUSS] thread seems to have wound down, so I'd like
> > to call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal
> > is pasted below and also available on the wiki at:
> > https://wiki.apache.org/incubator/KuduProposal
> >
> > The proposal is unchanged since the original version, except for the
> > addition of Carl Steinbach as a Mentor.
> >
> > Please cast your votes:
> >
> > [] +1, accept Kudu into the Incubator
> > [] +/-0, positive/negative non-counted expression of feelings
> > [] -1, do not accept Kudu into the incubator (please state reasoning)
> >
> >
> I'll start the voting with my +1 (binding, assuming it's permitted to vote
> on your own proposal!)
>


Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-24 Thread Arvind Prabhakar
+1 (binding)

Regards,
Arvind Prabhakar

On Tue, Nov 24, 2015 at 11:32 AM, Todd Lipcon  wrote:

> Hi all,
>
> Discussion on the [DISCUSS] thread seems to have wound down, so I'd like to
> call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
> pasted below and also available on the wiki at:
> https://wiki.apache.org/incubator/KuduProposal
>
> The proposal is unchanged since the original version, except for the
> addition of Carl Steinbach as a Mentor.
>
> Please cast your votes:
>
> [] +1, accept Kudu into the Incubator
> [] +/-0, positive/negative non-counted expression of feelings
> [] -1, do not accept Kudu into the incubator (please state reasoning)
>
> Given the US holiday this week, I imagine many folks are traveling or
> otherwise offline. So, let's run the vote for a full week rather than the
> traditional 72 hours. Unless the IPMC objects to the extended voting
> period, the vote will close on Tues, Dec 1st at noon PST.
>
> Thanks
> -Todd
> -
>
> = Kudu Proposal =
>
> == Abstract ==
>
> Kudu is a distributed columnar storage engine built for the Apache Hadoop
> ecosystem.
>
> == Proposal ==
>
> Kudu is an open source storage engine for structured data which supports
> low-latency random access together with efficient analytical access
> patterns. Kudu distributes data using horizontal partitioning and
> replicates each partition using Raft consensus, providing low
> mean-time-to-recovery and low tail latencies. Kudu is designed within the
> context of the Apache Hadoop ecosystem and supports many integrations with
> other data analytics projects both inside and outside of the Apache
> Software Foundation.
>
>
>
> We propose to incubate Kudu as a project of the Apache Software Foundation.
>
> == Background ==
>
> In recent years, explosive growth in the amount of data being generated and
> captured by enterprises has resulted in the rapid adoption of open source
> technology which is able to store massive data sets at scale and at low
> cost. In particular, the Apache Hadoop ecosystem has become a focal point
> for such “big data” workloads, because many traditional open source
> database systems have lagged in offering a scalable alternative.
>
>
>
> Structured storage in the Hadoop ecosystem has typically been achieved in
> two ways: for static data sets, data is typically stored on Apache HDFS
> using binary data formats such as Apache Avro or Apache Parquet. However,
> neither HDFS nor these formats has any provision for updating individual
> records, or for efficient random access. Mutable data sets are typically
> stored in semi-structured stores such as Apache HBase or Apache Cassandra.
> These systems allow for low-latency record-level reads and writes, but lag
> far behind the static file formats in terms of sequential read throughput
> for applications such as SQL-based analytics or machine learning.
>
>
>
> Kudu is a new storage system designed and implemented from the ground up to
> fill this gap between high-throughput sequential-access storage systems
> such as HDFS and low-latency random-access systems such as HBase or
> Cassandra. While these existing systems continue to hold advantages in some
> situations, Kudu offers a “happy medium” alternative that can dramatically
> simplify the architecture of many common workloads. In particular, Kudu
> offers a simple API for row-level inserts, updates, and deletes, while
> providing table scans at throughputs similar to Parquet, a commonly-used
> columnar format for static data.
>
>
>
> More information on Kudu can be found at the existing open source project
> website: http://getkudu.io and in particular in the Kudu white-paper PDF:
> http://getkudu.io/kudu.pdf from which the above was excerpted.
>
> == Rationale ==
>
> As described above, Kudu fills an important gap in the open source storage
> ecosystem. After our initial open source project release in September 2015,
> we have seen a great amount of interest across a diverse set of users and
> companies. We believe that, as a storage system, it is critical to build an
> equally diverse set of contributors in the development community. Our
> experiences as committers and PMC members on other Apache projects have
> taught us the value of diverse communities in ensuring both longevity and
> high quality for such foundational systems.
>
> == Initial Goals ==
>
>  * Move the existing codebase, website, documentation, and mailing lists to
> Apache-hosted infrastructure
>  * Work with the infrastructure team to implement and approve our code
> review, build, and testing workflows in the context of the ASF
>  * Incremental development and releases per Apache guidelines
>
> == Current Status ==
>
>  Releases 
>
> Kudu has undergone one public release, tagged here
> https://github.com/cloudera/kudu/tree/kudu0.5.0-release
>
> This initial release was not performed in the typical ASF fashion -- no
> source tarball was released, but rather only 

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-24 Thread Jake Farrell
+1 (binding)

-Jake

On Tue, Nov 24, 2015 at 2:32 PM, Todd Lipcon  wrote:

> Hi all,
>
> Discussion on the [DISCUSS] thread seems to have wound down, so I'd like to
> call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
> pasted below and also available on the wiki at:
> https://wiki.apache.org/incubator/KuduProposal
>
> The proposal is unchanged since the original version, except for the
> addition of Carl Steinbach as a Mentor.
>
> Please cast your votes:
>
> [] +1, accept Kudu into the Incubator
> [] +/-0, positive/negative non-counted expression of feelings
> [] -1, do not accept Kudu into the incubator (please state reasoning)
>
> Given the US holiday this week, I imagine many folks are traveling or
> otherwise offline. So, let's run the vote for a full week rather than the
> traditional 72 hours. Unless the IPMC objects to the extended voting
> period, the vote will close on Tues, Dec 1st at noon PST.
>
> Thanks
> -Todd
> -
>
> = Kudu Proposal =
>
> == Abstract ==
>
> Kudu is a distributed columnar storage engine built for the Apache Hadoop
> ecosystem.
>
> == Proposal ==
>
> Kudu is an open source storage engine for structured data which supports
> low-latency random access together with efficient analytical access
> patterns. Kudu distributes data using horizontal partitioning and
> replicates each partition using Raft consensus, providing low
> mean-time-to-recovery and low tail latencies. Kudu is designed within the
> context of the Apache Hadoop ecosystem and supports many integrations with
> other data analytics projects both inside and outside of the Apache
> Software Foundation.
>
>
>
> We propose to incubate Kudu as a project of the Apache Software Foundation.
>
> == Background ==
>
> In recent years, explosive growth in the amount of data being generated and
> captured by enterprises has resulted in the rapid adoption of open source
> technology which is able to store massive data sets at scale and at low
> cost. In particular, the Apache Hadoop ecosystem has become a focal point
> for such “big data” workloads, because many traditional open source
> database systems have lagged in offering a scalable alternative.
>
>
>
> Structured storage in the Hadoop ecosystem has typically been achieved in
> two ways: for static data sets, data is typically stored on Apache HDFS
> using binary data formats such as Apache Avro or Apache Parquet. However,
> neither HDFS nor these formats has any provision for updating individual
> records, or for efficient random access. Mutable data sets are typically
> stored in semi-structured stores such as Apache HBase or Apache Cassandra.
> These systems allow for low-latency record-level reads and writes, but lag
> far behind the static file formats in terms of sequential read throughput
> for applications such as SQL-based analytics or machine learning.
>
>
>
> Kudu is a new storage system designed and implemented from the ground up to
> fill this gap between high-throughput sequential-access storage systems
> such as HDFS and low-latency random-access systems such as HBase or
> Cassandra. While these existing systems continue to hold advantages in some
> situations, Kudu offers a “happy medium” alternative that can dramatically
> simplify the architecture of many common workloads. In particular, Kudu
> offers a simple API for row-level inserts, updates, and deletes, while
> providing table scans at throughputs similar to Parquet, a commonly-used
> columnar format for static data.
>
>
>
> More information on Kudu can be found at the existing open source project
> website: http://getkudu.io and in particular in the Kudu white-paper PDF:
> http://getkudu.io/kudu.pdf from which the above was excerpted.
>
> == Rationale ==
>
> As described above, Kudu fills an important gap in the open source storage
> ecosystem. After our initial open source project release in September 2015,
> we have seen a great amount of interest across a diverse set of users and
> companies. We believe that, as a storage system, it is critical to build an
> equally diverse set of contributors in the development community. Our
> experiences as committers and PMC members on other Apache projects have
> taught us the value of diverse communities in ensuring both longevity and
> high quality for such foundational systems.
>
> == Initial Goals ==
>
>  * Move the existing codebase, website, documentation, and mailing lists to
> Apache-hosted infrastructure
>  * Work with the infrastructure team to implement and approve our code
> review, build, and testing workflows in the context of the ASF
>  * Incremental development and releases per Apache guidelines
>
> == Current Status ==
>
>  Releases 
>
> Kudu has undergone one public release, tagged here
> https://github.com/cloudera/kudu/tree/kudu0.5.0-release
>
> This initial release was not performed in the typical ASF fashion -- no
> source tarball was released, but rather only convenience binaries made

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-24 Thread Mike Percy
On Tue, Nov 24, 2015 at 11:32 AM, Todd Lipcon  wrote:

> Hi all,
>
> Discussion on the [DISCUSS] thread seems to have wound down, so I'd like to
> call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
> pasted below and also available on the wiki at:
> https://wiki.apache.org/incubator/KuduProposal
>
> The proposal is unchanged since the original version, except for the
> addition of Carl Steinbach as a Mentor.
>
> Please cast your votes:
>
> [] +1, accept Kudu into the Incubator
> [] +/-0, positive/negative non-counted expression of feelings
> [] -1, do not accept Kudu into the incubator (please state reasoning)
>

+1 (non-binding)

Mike


Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-24 Thread Mattmann, Chris A (3980)
+1 from me.

++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++





-----Original Message-----
From: <t...@cloudera.com> on behalf of Todd Lipcon <t...@apache.org>
Reply-To: "general@incubator.apache.org" <general@incubator.apache.org>
Date: Tuesday, November 24, 2015 at 11:32 AM
To: "general@incubator.apache.org" <general@incubator.apache.org>
Subject: [VOTE] Accept Kudu into the Apache Incubator

>Hi all,
>
>Discussion on the [DISCUSS] thread seems to have wound down, so I'd like to
>call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
>pasted below and also available on the wiki at:
>https://wiki.apache.org/incubator/KuduProposal
>
>The proposal is unchanged since the original version, except for the
>addition of Carl Steinbach as a Mentor.
>
>Please cast your votes:
>
>[] +1, accept Kudu into the Incubator
>[] +/-0, positive/negative non-counted expression of feelings
>[] -1, do not accept Kudu into the incubator (please state reasoning)
>
>Given the US holiday this week, I imagine many folks are traveling or
>otherwise offline. So, let's run the vote for a full week rather than the
>traditional 72 hours. Unless the IPMC objects to the extended voting
>period, the vote will close on Tues, Dec 1st at noon PST.
>
>Thanks
>-Todd
>-
>
>= Kudu Proposal =
>
>== Abstract ==
>
>Kudu is a distributed columnar storage engine built for the Apache Hadoop
>ecosystem.
>
>== Proposal ==
>
>Kudu is an open source storage engine for structured data which supports
>low-latency random access together with efficient analytical access
>patterns. Kudu distributes data using horizontal partitioning and
>replicates each partition using Raft consensus, providing low
>mean-time-to-recovery and low tail latencies. Kudu is designed within the
>context of the Apache Hadoop ecosystem and supports many integrations with
>other data analytics projects both inside and outside of the Apache
>Software Foundation.
>
>
>
>We propose to incubate Kudu as a project of the Apache Software Foundation.
>
>== Background ==
>
>In recent years, explosive growth in the amount of data being generated and
>captured by enterprises has resulted in the rapid adoption of open source
>technology which is able to store massive data sets at scale and at low
>cost. In particular, the Apache Hadoop ecosystem has become a focal point
>for such “big data” workloads, because many traditional open source
>database systems have lagged in offering a scalable alternative.
>
>
>
>Structured storage in the Hadoop ecosystem has typically been achieved in
>two ways: for static data sets, data is typically stored on Apache HDFS
>using binary data formats such as Apache Avro or Apache Parquet. However,
>neither HDFS nor these formats has any provision for updating individual
>records, or for efficient random access. Mutable data sets are typically
>stored in semi-structured stores such as Apache HBase or Apache Cassandra.
>These systems allow for low-latency record-level reads and writes, but lag
>far behind the static file formats in terms of sequential read throughput
>for applications such as SQL-based analytics or machine learning.
>
>
>
>Kudu is a new storage system designed and implemented from the ground up to
>fill this gap between high-throughput sequential-access storage systems
>such as HDFS and low-latency random-access systems such as HBase or
>Cassandra. While these existing systems continue to hold advantages in some
>situations, Kudu offers a “happy medium” alternative that can dramatically
>simplify the architecture of many common workloads. In particular, Kudu
>offers a simple API for row-level inserts, updates, and deletes, while
>providing table scans at throughputs similar to Parquet, a commonly-used
>columnar format for static data.
>
>
>
>More information on Kudu can be found at the existing open source project
>website: http://getkudu.io and in particular in the Kudu white-paper PDF:
>http://getkudu.io/kudu.pdf from which the above was excerpted.
>
>== Rationale ==
>
>As described above, Kudu fills an important gap in the open source storage
>ecosystem. After our initial o

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-24 Thread Andrew Purtell
+1 (binding)

Good luck!


On Tue, Nov 24, 2015 at 11:32 AM, Todd Lipcon  wrote:

> Hi all,
>
> Discussion on the [DISCUSS] thread seems to have wound down, so I'd like to
> call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
> pasted below and also available on the wiki at:
> https://wiki.apache.org/incubator/KuduProposal
>
> The proposal is unchanged since the original version, except for the
> addition of Carl Steinbach as a Mentor.
>
> Please cast your votes:
>
> [] +1, accept Kudu into the Incubator
> [] +/-0, positive/negative non-counted expression of feelings
> [] -1, do not accept Kudu into the incubator (please state reasoning)
>
> Given the US holiday this week, I imagine many folks are traveling or
> otherwise offline. So, let's run the vote for a full week rather than the
> traditional 72 hours. Unless the IPMC objects to the extended voting
> period, the vote will close on Tues, Dec 1st at noon PST.
>
> Thanks
> -Todd
> -
>
> = Kudu Proposal =
>
> == Abstract ==
>
> Kudu is a distributed columnar storage engine built for the Apache Hadoop
> ecosystem.
>
> == Proposal ==
>
> Kudu is an open source storage engine for structured data which supports
> low-latency random access together with efficient analytical access
> patterns. Kudu distributes data using horizontal partitioning and
> replicates each partition using Raft consensus, providing low
> mean-time-to-recovery and low tail latencies. Kudu is designed within the
> context of the Apache Hadoop ecosystem and supports many integrations with
> other data analytics projects both inside and outside of the Apache
> Software Foundation.
>
>
>
> We propose to incubate Kudu as a project of the Apache Software Foundation.
>
> == Background ==
>
> In recent years, explosive growth in the amount of data being generated and
> captured by enterprises has resulted in the rapid adoption of open source
> technology which is able to store massive data sets at scale and at low
> cost. In particular, the Apache Hadoop ecosystem has become a focal point
> for such “big data” workloads, because many traditional open source
> database systems have lagged in offering a scalable alternative.
>
>
>
> Structured storage in the Hadoop ecosystem has typically been achieved in
> two ways: for static data sets, data is typically stored on Apache HDFS
> using binary data formats such as Apache Avro or Apache Parquet. However,
> neither HDFS nor these formats has any provision for updating individual
> records, or for efficient random access. Mutable data sets are typically
> stored in semi-structured stores such as Apache HBase or Apache Cassandra.
> These systems allow for low-latency record-level reads and writes, but lag
> far behind the static file formats in terms of sequential read throughput
> for applications such as SQL-based analytics or machine learning.
>
>
>
> Kudu is a new storage system designed and implemented from the ground up to
> fill this gap between high-throughput sequential-access storage systems
> such as HDFS and low-latency random-access systems such as HBase or
> Cassandra. While these existing systems continue to hold advantages in some
> situations, Kudu offers a “happy medium” alternative that can dramatically
> simplify the architecture of many common workloads. In particular, Kudu
> offers a simple API for row-level inserts, updates, and deletes, while
> providing table scans at throughputs similar to Parquet, a commonly-used
> columnar format for static data.
>
>
>
> More information on Kudu can be found at the existing open source project
> website: http://getkudu.io and in particular in the Kudu white-paper PDF:
> http://getkudu.io/kudu.pdf from which the above was excerpted.
>
> == Rationale ==
>
> As described above, Kudu fills an important gap in the open source storage
> ecosystem. After our initial open source project release in September 2015,
> we have seen a great amount of interest across a diverse set of users and
> companies. We believe that, as a storage system, it is critical to build an
> equally diverse set of contributors in the development community. Our
> experiences as committers and PMC members on other Apache projects have
> taught us the value of diverse communities in ensuring both longevity and
> high quality for such foundational systems.
>
> == Initial Goals ==
>
>  * Move the existing codebase, website, documentation, and mailing lists to
> Apache-hosted infrastructure
>  * Work with the infrastructure team to implement and approve our code
> review, build, and testing workflows in the context of the ASF
>  * Incremental development and releases per Apache guidelines
>
> == Current Status ==
>
>  Releases 
>
> Kudu has undergone one public release, tagged here
> https://github.com/cloudera/kudu/tree/kudu0.5.0-release
>
> This initial release was not performed in the typical ASF fashion -- no
> source tarball was released, but rather only convenience 

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-24 Thread Jean-Baptiste Onofré

+1

Regards
JB

On 11/24/2015 08:32 PM, Todd Lipcon wrote:

Hi all,

Discussion on the [DISCUSS] thread seems to have wound down, so I'd like to
call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
pasted below and also available on the wiki at:
https://wiki.apache.org/incubator/KuduProposal

The proposal is unchanged since the original version, except for the
addition of Carl Steinbach as a Mentor.

Please cast your votes:

[] +1, accept Kudu into the Incubator
[] +/-0, positive/negative non-counted expression of feelings
[] -1, do not accept Kudu into the incubator (please state reasoning)

Given the US holiday this week, I imagine many folks are traveling or
otherwise offline. So, let's run the vote for a full week rather than the
traditional 72 hours. Unless the IPMC objects to the extended voting
period, the vote will close on Tues, Dec 1st at noon PST.

Thanks
-Todd
-

= Kudu Proposal =

== Abstract ==

Kudu is a distributed columnar storage engine built for the Apache Hadoop
ecosystem.

== Proposal ==

Kudu is an open source storage engine for structured data which supports
low-latency random access together with efficient analytical access
patterns. Kudu distributes data using horizontal partitioning and
replicates each partition using Raft consensus, providing low
mean-time-to-recovery and low tail latencies. Kudu is designed within the
context of the Apache Hadoop ecosystem and supports many integrations with
other data analytics projects both inside and outside of the Apache
Software Foundation.



We propose to incubate Kudu as a project of the Apache Software Foundation.

== Background ==

In recent years, explosive growth in the amount of data being generated and
captured by enterprises has resulted in the rapid adoption of open source
technology which is able to store massive data sets at scale and at low
cost. In particular, the Apache Hadoop ecosystem has become a focal point
for such “big data” workloads, because many traditional open source
database systems have lagged in offering a scalable alternative.



Structured storage in the Hadoop ecosystem has typically been achieved in
two ways: for static data sets, data is typically stored on Apache HDFS
using binary data formats such as Apache Avro or Apache Parquet. However,
neither HDFS nor these formats has any provision for updating individual
records, or for efficient random access. Mutable data sets are typically
stored in semi-structured stores such as Apache HBase or Apache Cassandra.
These systems allow for low-latency record-level reads and writes, but lag
far behind the static file formats in terms of sequential read throughput
for applications such as SQL-based analytics or machine learning.



Kudu is a new storage system designed and implemented from the ground up to
fill this gap between high-throughput sequential-access storage systems
such as HDFS and low-latency random-access systems such as HBase or
Cassandra. While these existing systems continue to hold advantages in some
situations, Kudu offers a “happy medium” alternative that can dramatically
simplify the architecture of many common workloads. In particular, Kudu
offers a simple API for row-level inserts, updates, and deletes, while
providing table scans at throughputs similar to Parquet, a commonly-used
columnar format for static data.



More information on Kudu can be found at the existing open source project
website: http://getkudu.io and in particular in the Kudu white-paper PDF:
http://getkudu.io/kudu.pdf from which the above was excerpted.

== Rationale ==

As described above, Kudu fills an important gap in the open source storage
ecosystem. After our initial open source project release in September 2015,
we have seen a great amount of interest across a diverse set of users and
companies. We believe that, as a storage system, it is critical to build an
equally diverse set of contributors in the development community. Our
experiences as committers and PMC members on other Apache projects have
taught us the value of diverse communities in ensuring both longevity and
high quality for such foundational systems.

== Initial Goals ==

  * Move the existing codebase, website, documentation, and mailing lists to
Apache-hosted infrastructure
  * Work with the infrastructure team to implement and approve our code
review, build, and testing workflows in the context of the ASF
  * Incremental development and releases per Apache guidelines

== Current Status ==

 Releases 

Kudu has undergone one public release, tagged here
https://github.com/cloudera/kudu/tree/kudu0.5.0-release

This initial release was not performed in the typical ASF fashion -- no
source tarball was released, but rather only convenience binaries made
available in Cloudera’s repositories. We will adopt the ASF source release
process upon joining the incubator.


 Source 

Kudu’s source is currently hosted on GitHub at
https://github.com/cloudera/kudu

This 

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-24 Thread Alex Karasulu
+1 (binding)

On Tue, Nov 24, 2015 at 10:08 PM, Arvind Prabhakar wrote:

> +1 (binding)
>
> Regards,
> Arvind Prabhakar
>
> On Tue, Nov 24, 2015 at 11:32 AM, Todd Lipcon  wrote:
>
> > Hi all,
> >
> > Discussion on the [DISCUSS] thread seems to have wound down, so I'd like to
> > call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
> > pasted below and also available on the wiki at:
> > https://wiki.apache.org/incubator/KuduProposal
> >
> > The proposal is unchanged since the original version, except for the
> > addition of Carl Steinbach as a Mentor.
> >
> > Please cast your votes:
> >
> > [] +1, accept Kudu into the Incubator
> > [] +/-0, positive/negative non-counted expression of feelings
> > [] -1, do not accept Kudu into the incubator (please state reasoning)
> >
> > Given the US holiday this week, I imagine many folks are traveling or
> > otherwise offline. So, let's run the vote for a full week rather than the
> > traditional 72 hours. Unless the IPMC objects to the extended voting
> > period, the vote will close on Tues, Dec 1st at noon PST.
> >
> > Thanks
> > -Todd
> > -
> >
> > = Kudu Proposal =
> >
> > == Abstract ==
> >
> > Kudu is a distributed columnar storage engine built for the Apache Hadoop
> > ecosystem.
> >
> > == Proposal ==
> >
> > Kudu is an open source storage engine for structured data which supports
> > low-latency random access together with efficient analytical access
> > patterns. Kudu distributes data using horizontal partitioning and
> > replicates each partition using Raft consensus, providing low
> > mean-time-to-recovery and low tail latencies. Kudu is designed within the
> > context of the Apache Hadoop ecosystem and supports many integrations with
> > other data analytics projects both inside and outside of the Apache
> > Software Foundation.
> >
> >
> >
> > We propose to incubate Kudu as a project of the Apache Software Foundation.
> >
> > == Background ==
> >
> > In recent years, explosive growth in the amount of data being generated and
> > captured by enterprises has resulted in the rapid adoption of open source
> > technology which is able to store massive data sets at scale and at low
> > cost. In particular, the Apache Hadoop ecosystem has become a focal point
> > for such “big data” workloads, because many traditional open source
> > database systems have lagged in offering a scalable alternative.
> >
> >
> >
> > Structured storage in the Hadoop ecosystem has typically been achieved in
> > two ways: for static data sets, data is typically stored on Apache HDFS
> > using binary data formats such as Apache Avro or Apache Parquet. However,
> > neither HDFS nor these formats has any provision for updating individual
> > records, or for efficient random access. Mutable data sets are typically
> > stored in semi-structured stores such as Apache HBase or Apache Cassandra.
> > These systems allow for low-latency record-level reads and writes, but lag
> > far behind the static file formats in terms of sequential read throughput
> > for applications such as SQL-based analytics or machine learning.
> >
> >
> >
> > Kudu is a new storage system designed and implemented from the ground up to
> > fill this gap between high-throughput sequential-access storage systems
> > such as HDFS and low-latency random-access systems such as HBase or
> > Cassandra. While these existing systems continue to hold advantages in some
> > situations, Kudu offers a “happy medium” alternative that can dramatically
> > simplify the architecture of many common workloads. In particular, Kudu
> > offers a simple API for row-level inserts, updates, and deletes, while
> > providing table scans at throughputs similar to Parquet, a commonly-used
> > columnar format for static data.
> >
> >
> >
> > More information on Kudu can be found at the existing open source project
> > website: http://getkudu.io and in particular in the Kudu white-paper PDF:
> > http://getkudu.io/kudu.pdf from which the above was excerpted.
> >
> > == Rationale ==
> >
> > As described above, Kudu fills an important gap in the open source storage
> > ecosystem. After our initial open source project release in September 2015,
> > we have seen a great amount of interest across a diverse set of users and
> > companies. We believe that, as a storage system, it is critical to build an
> > equally diverse set of contributors in the development community. Our
> > experiences as committers and PMC members on other Apache projects have
> > taught us the value of diverse communities in ensuring both longevity and
> > high quality for such foundational systems.
> >
> > == Initial Goals ==
> >
> >  * Move the existing codebase, website, documentation, and mailing lists to
> > Apache-hosted infrastructure
> >  * Work with the infrastructure team to implement and approve our code
> > review, build, and testing workflows in the context of the ASF
> >  * 

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-24 Thread Patrick Angeles
+1 (non-binding)

On Tue, Nov 24, 2015 at 4:23 PM, Jake Farrell  wrote:

> +1 (binding)
>
> -Jake
>
> On Tue, Nov 24, 2015 at 2:32 PM, Todd Lipcon  wrote:
>
> > Hi all,
> >
> > Discussion on the [DISCUSS] thread seems to have wound down, so I'd like to
> > call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
> > pasted below and also available on the wiki at:
> > https://wiki.apache.org/incubator/KuduProposal
> >
> > The proposal is unchanged since the original version, except for the
> > addition of Carl Steinbach as a Mentor.
> >
> > Please cast your votes:
> >
> > [] +1, accept Kudu into the Incubator
> > [] +/-0, positive/negative non-counted expression of feelings
> > [] -1, do not accept Kudu into the incubator (please state reasoning)
> >
> > Given the US holiday this week, I imagine many folks are traveling or
> > otherwise offline. So, let's run the vote for a full week rather than the
> > traditional 72 hours. Unless the IPMC objects to the extended voting
> > period, the vote will close on Tues, Dec 1st at noon PST.
> >
> > Thanks
> > -Todd
> > -
> >
> > = Kudu Proposal =
> >
> > == Abstract ==
> >
> > Kudu is a distributed columnar storage engine built for the Apache Hadoop
> > ecosystem.
> >
> > == Proposal ==
> >
> > Kudu is an open source storage engine for structured data which supports
> > low-latency random access together with efficient analytical access
> > patterns. Kudu distributes data using horizontal partitioning and
> > replicates each partition using Raft consensus, providing low
> > mean-time-to-recovery and low tail latencies. Kudu is designed within the
> > context of the Apache Hadoop ecosystem and supports many integrations with
> > other data analytics projects both inside and outside of the Apache
> > Software Foundation.
> >
> >
> >
> > We propose to incubate Kudu as a project of the Apache Software Foundation.
> >
> > == Background ==
> >
> > In recent years, explosive growth in the amount of data being generated and
> > captured by enterprises has resulted in the rapid adoption of open source
> > technology which is able to store massive data sets at scale and at low
> > cost. In particular, the Apache Hadoop ecosystem has become a focal point
> > for such “big data” workloads, because many traditional open source
> > database systems have lagged in offering a scalable alternative.
> >
> >
> >
> > Structured storage in the Hadoop ecosystem has typically been achieved in
> > two ways: for static data sets, data is typically stored on Apache HDFS
> > using binary data formats such as Apache Avro or Apache Parquet. However,
> > neither HDFS nor these formats has any provision for updating individual
> > records, or for efficient random access. Mutable data sets are typically
> > stored in semi-structured stores such as Apache HBase or Apache Cassandra.
> > These systems allow for low-latency record-level reads and writes, but lag
> > far behind the static file formats in terms of sequential read throughput
> > for applications such as SQL-based analytics or machine learning.
> >
> >
> >
> > Kudu is a new storage system designed and implemented from the ground up to
> > fill this gap between high-throughput sequential-access storage systems
> > such as HDFS and low-latency random-access systems such as HBase or
> > Cassandra. While these existing systems continue to hold advantages in some
> > situations, Kudu offers a “happy medium” alternative that can dramatically
> > simplify the architecture of many common workloads. In particular, Kudu
> > offers a simple API for row-level inserts, updates, and deletes, while
> > providing table scans at throughputs similar to Parquet, a commonly-used
> > columnar format for static data.
> >
> >
> >
> > More information on Kudu can be found at the existing open source project
> > website: http://getkudu.io and in particular in the Kudu white-paper PDF:
> > http://getkudu.io/kudu.pdf from which the above was excerpted.
> >
> > == Rationale ==
> >
> > As described above, Kudu fills an important gap in the open source storage
> > ecosystem. After our initial open source project release in September 2015,
> > we have seen a great amount of interest across a diverse set of users and
> > companies. We believe that, as a storage system, it is critical to build an
> > equally diverse set of contributors in the development community. Our
> > experiences as committers and PMC members on other Apache projects have
> > taught us the value of diverse communities in ensuring both longevity and
> > high quality for such foundational systems.
> >
> > == Initial Goals ==
> >
> >  * Move the existing codebase, website, documentation, and mailing lists to
> > Apache-hosted infrastructure
> >  * Work with the infrastructure team to implement and approve our code
> > review, build, and testing workflows in the context of the ASF
> >  * Incremental development and 

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-24 Thread Ashish
+1 (non-binding)

On Tue, Nov 24, 2015 at 11:40 AM, Jarek Jarcec Cecho  wrote:
>> [X] +1, accept Kudu into the Incubator
>
> (binding)
>
> Jarcec
>
>> On Nov 24, 2015, at 11:32 AM, Todd Lipcon  wrote:
>>
>> Hi all,
>>
>> Discussion on the [DISCUSS] thread seems to have wound down, so I'd like to
>> call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
>> pasted below and also available on the wiki at:
>> https://wiki.apache.org/incubator/KuduProposal
>>
>> The proposal is unchanged since the original version, except for the
>> addition of Carl Steinbach as a Mentor.
>>
>> Please cast your votes:
>>
>> [] +1, accept Kudu into the Incubator
>> [] +/-0, positive/negative non-counted expression of feelings
>> [] -1, do not accept Kudu into the incubator (please state reasoning)
>>
>> Given the US holiday this week, I imagine many folks are traveling or
>> otherwise offline. So, let's run the vote for a full week rather than the
>> traditional 72 hours. Unless the IPMC objects to the extended voting
>> period, the vote will close on Tues, Dec 1st at noon PST.
>>
>> Thanks
>> -Todd
>> -
>>
>> = Kudu Proposal =
>>
>> == Abstract ==
>>
>> Kudu is a distributed columnar storage engine built for the Apache Hadoop
>> ecosystem.
>>
>> == Proposal ==
>>
>> Kudu is an open source storage engine for structured data which supports
>> low-latency random access together with efficient analytical access
>> patterns. Kudu distributes data using horizontal partitioning and
>> replicates each partition using Raft consensus, providing low
>> mean-time-to-recovery and low tail latencies. Kudu is designed within the
>> context of the Apache Hadoop ecosystem and supports many integrations with
>> other data analytics projects both inside and outside of the Apache
>> Software Foundation.
>>
>>
>>
>> We propose to incubate Kudu as a project of the Apache Software Foundation.
>>
>> == Background ==
>>
>> In recent years, explosive growth in the amount of data being generated and
>> captured by enterprises has resulted in the rapid adoption of open source
>> technology which is able to store massive data sets at scale and at low
>> cost. In particular, the Apache Hadoop ecosystem has become a focal point
>> for such “big data” workloads, because many traditional open source
>> database systems have lagged in offering a scalable alternative.
>>
>>
>>
>> Structured storage in the Hadoop ecosystem has typically been achieved in
>> two ways: for static data sets, data is typically stored on Apache HDFS
>> using binary data formats such as Apache Avro or Apache Parquet. However,
>> neither HDFS nor these formats has any provision for updating individual
>> records, or for efficient random access. Mutable data sets are typically
>> stored in semi-structured stores such as Apache HBase or Apache Cassandra.
>> These systems allow for low-latency record-level reads and writes, but lag
>> far behind the static file formats in terms of sequential read throughput
>> for applications such as SQL-based analytics or machine learning.
>>
>>
>>
>> Kudu is a new storage system designed and implemented from the ground up to
>> fill this gap between high-throughput sequential-access storage systems
>> such as HDFS and low-latency random-access systems such as HBase or
>> Cassandra. While these existing systems continue to hold advantages in some
>> situations, Kudu offers a “happy medium” alternative that can dramatically
>> simplify the architecture of many common workloads. In particular, Kudu
>> offers a simple API for row-level inserts, updates, and deletes, while
>> providing table scans at throughputs similar to Parquet, a commonly-used
>> columnar format for static data.
>>
>>
>>
>> More information on Kudu can be found at the existing open source project
>> website: http://getkudu.io and in particular in the Kudu white-paper PDF:
>> http://getkudu.io/kudu.pdf from which the above was excerpted.
>>
>> == Rationale ==
>>
>> As described above, Kudu fills an important gap in the open source storage
>> ecosystem. After our initial open source project release in September 2015,
>> we have seen a great amount of interest across a diverse set of users and
>> companies. We believe that, as a storage system, it is critical to build an
>> equally diverse set of contributors in the development community. Our
>> experiences as committers and PMC members on other Apache projects have
>> taught us the value of diverse communities in ensuring both longevity and
>> high quality for such foundational systems.
>>
>> == Initial Goals ==
>>
>> * Move the existing codebase, website, documentation, and mailing lists to
>> Apache-hosted infrastructure
>> * Work with the infrastructure team to implement and approve our code
>> review, build, and testing workflows in the context of the ASF
>> * Incremental development and releases per Apache guidelines
>>
>> == Current Status ==
>>
>> === Releases ===
>>
>> 

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-24 Thread Edward J. Yoon
+1 (binding)

On Wed, Nov 25, 2015 at 6:26 AM, Patrick Angeles wrote:
> +1 (non-binding)
>
> On Tue, Nov 24, 2015 at 4:23 PM, Jake Farrell  wrote:
>
>> +1 (binding)
>>
>> -Jake
>>
>> On Tue, Nov 24, 2015 at 2:32 PM, Todd Lipcon  wrote:
>>
>> > Hi all,
>> >
>> > Discussion on the [DISCUSS] thread seems to have wound down, so I'd like to
>> > call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
>> > pasted below and also available on the wiki at:
>> > https://wiki.apache.org/incubator/KuduProposal
>> >
>> > The proposal is unchanged since the original version, except for the
>> > addition of Carl Steinbach as a Mentor.
>> >
>> > Please cast your votes:
>> >
>> > [] +1, accept Kudu into the Incubator
>> > [] +/-0, positive/negative non-counted expression of feelings
>> > [] -1, do not accept Kudu into the incubator (please state reasoning)

[VOTE] Accept Kudu into the Apache Incubator

2015-11-24 Thread Todd Lipcon
Hi all,

Discussion on the [DISCUSS] thread seems to have wound down, so I'd like to
call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
pasted below and also available on the wiki at:
https://wiki.apache.org/incubator/KuduProposal

The proposal is unchanged since the original version, except for the
addition of Carl Steinbach as a Mentor.

Please cast your votes:

[] +1, accept Kudu into the Incubator
[] +/-0, positive/negative non-counted expression of feelings
[] -1, do not accept Kudu into the incubator (please state reasoning)

Given the US holiday this week, I imagine many folks are traveling or
otherwise offline. So, let's run the vote for a full week rather than the
traditional 72 hours. Unless the IPMC objects to the extended voting
period, the vote will close on Tues, Dec 1st at noon PST.

Thanks
-Todd
-

= Kudu Proposal =

== Abstract ==

Kudu is a distributed columnar storage engine built for the Apache Hadoop
ecosystem.

== Proposal ==

Kudu is an open source storage engine for structured data which supports
low-latency random access together with efficient analytical access
patterns. Kudu distributes data using horizontal partitioning and
replicates each partition using Raft consensus, providing low
mean-time-to-recovery and low tail latencies. Kudu is designed within the
context of the Apache Hadoop ecosystem and supports many integrations with
other data analytics projects both inside and outside of the Apache
Software Foundation.



We propose to incubate Kudu as a project of the Apache Software Foundation.

== Background ==

In recent years, explosive growth in the amount of data being generated and
captured by enterprises has resulted in the rapid adoption of open source
technology which is able to store massive data sets at scale and at low
cost. In particular, the Apache Hadoop ecosystem has become a focal point
for such “big data” workloads, because many traditional open source
database systems have lagged in offering a scalable alternative.



Structured storage in the Hadoop ecosystem has typically been achieved in
two ways: for static data sets, data is typically stored on Apache HDFS
using binary data formats such as Apache Avro or Apache Parquet. However,
neither HDFS nor these formats has any provision for updating individual
records, or for efficient random access. Mutable data sets are typically
stored in semi-structured stores such as Apache HBase or Apache Cassandra.
These systems allow for low-latency record-level reads and writes, but lag
far behind the static file formats in terms of sequential read throughput
for applications such as SQL-based analytics or machine learning.



Kudu is a new storage system designed and implemented from the ground up to
fill this gap between high-throughput sequential-access storage systems
such as HDFS and low-latency random-access systems such as HBase or
Cassandra. While these existing systems continue to hold advantages in some
situations, Kudu offers a “happy medium” alternative that can dramatically
simplify the architecture of many common workloads. In particular, Kudu
offers a simple API for row-level inserts, updates, and deletes, while
providing table scans at throughputs similar to Parquet, a commonly-used
columnar format for static data.



More information on Kudu can be found at the existing open source project
website: http://getkudu.io and in particular in the Kudu white-paper PDF:
http://getkudu.io/kudu.pdf from which the above was excerpted.

== Rationale ==

As described above, Kudu fills an important gap in the open source storage
ecosystem. After our initial open source project release in September 2015,
we have seen a great amount of interest across a diverse set of users and
companies. We believe that, as a storage system, it is critical to build an
equally diverse set of contributors in the development community. Our
experiences as committers and PMC members on other Apache projects have
taught us the value of diverse communities in ensuring both longevity and
high quality for such foundational systems.

== Initial Goals ==

 * Move the existing codebase, website, documentation, and mailing lists to
Apache-hosted infrastructure
 * Work with the infrastructure team to implement and approve our code
review, build, and testing workflows in the context of the ASF
 * Incremental development and releases per Apache guidelines

== Current Status ==

=== Releases ===

Kudu has undergone one public release, tagged here
https://github.com/cloudera/kudu/tree/kudu0.5.0-release

This initial release was not performed in the typical ASF fashion -- no
source tarball was released, but rather only convenience binaries made
available in Cloudera’s repositories. We will adopt the ASF source release
process upon joining the incubator.


=== Source ===

Kudu’s source is currently hosted on GitHub at
https://github.com/cloudera/kudu

This repository will be transitioned to Apache’s git hosting during

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-24 Thread Todd Lipcon
On Tue, Nov 24, 2015 at 11:32 AM, Todd Lipcon  wrote:

> Hi all,
>
> Discussion on the [DISCUSS] thread seems to have wound down, so I'd like
> to call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal
> is pasted below and also available on the wiki at:
> https://wiki.apache.org/incubator/KuduProposal
>
> The proposal is unchanged since the original version, except for the
> addition of Carl Steinbach as a Mentor.
>
> Please cast your votes:
>
> [] +1, accept Kudu into the Incubator
> [] +/-0, positive/negative non-counted expression of feelings
> [] -1, do not accept Kudu into the incubator (please state reasoning)
>
>
I'll start the voting with my +1 (binding, assuming it's permitted to vote
on your own proposal!)


Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-24 Thread Julian Hyde
+1

> On Nov 24, 2015, at 11:33 AM, Todd Lipcon  wrote:
> 
> On Tue, Nov 24, 2015 at 11:32 AM, Todd Lipcon  wrote:
> 
>> Hi all,
>> 
>> Discussion on the [DISCUSS] thread seems to have wound down, so I'd like
>> to call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal
>> is pasted below and also available on the wiki at:
>> https://wiki.apache.org/incubator/KuduProposal
>> 
>> The proposal is unchanged since the original version, except for the
>> addition of Carl Steinbach as a Mentor.
>> 
>> Please cast your votes:
>> 
>> [] +1, accept Kudu into the Incubator
>> [] +/-0, positive/negative non-counted expression of feelings
>> [] -1, do not accept Kudu into the incubator (please state reasoning)
>> 
>> 
> I'll start the voting with my +1 (binding, assuming it's permitted to vote
> on your own proposal!)





Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-24 Thread Reynold Xin
+1


On Tue, Nov 24, 2015 at 11:32 AM, Todd Lipcon  wrote:

> Hi all,
>
> Discussion on the [DISCUSS] thread seems to have wound down, so I'd like to
> call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
> pasted below and also available on the wiki at:
> https://wiki.apache.org/incubator/KuduProposal
>
> The proposal is unchanged since the original version, except for the
> addition of Carl Steinbach as a Mentor.
>
> Please cast your votes:
>
> [] +1, accept Kudu into the Incubator
> [] +/-0, positive/negative non-counted expression of feelings
> [] -1, do not accept Kudu into the incubator (please state reasoning)

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-24 Thread Jarek Jarcec Cecho
> [X] +1, accept Kudu into the Incubator

(binding)

Jarcec

> On Nov 24, 2015, at 11:32 AM, Todd Lipcon  wrote:
> 
> Hi all,
> 
> Discussion on the [DISCUSS] thread seems to have wound down, so I'd like to
> call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
> pasted below and also available on the wiki at:
> https://wiki.apache.org/incubator/KuduProposal
> 
> The proposal is unchanged since the original version, except for the
> addition of Carl Steinbach as a Mentor.
> 
> Please cast your votes:
> 
> [] +1, accept Kudu into the Incubator
> [] +/-0, positive/negative non-counted expression of feelings
> [] -1, do not accept Kudu into the incubator (please state reasoning)
> 
> Given the US holiday this week, I imagine many folks are traveling or
> otherwise offline. So, let's run the vote for a full week rather than the
> traditional 72 hours. Unless the IPMC objects to the extended voting
> period, the vote will close on Tues, Dec 1st at noon PST.
> 
> Thanks
> -Todd
> -
> 
> = Kudu Proposal =
> 
> == Abstract ==
> 
> Kudu is a distributed columnar storage engine built for the Apache Hadoop
> ecosystem.
> 
> == Proposal ==
> 
> Kudu is an open source storage engine for structured data which supports
> low-latency random access together with efficient analytical access
> patterns. Kudu distributes data using horizontal partitioning and
> replicates each partition using Raft consensus, providing low
> mean-time-to-recovery and low tail latencies. Kudu is designed within the
> context of the Apache Hadoop ecosystem and supports many integrations with
> other data analytics projects both inside and outside of the Apache
> Software Foundation.
> 
> 
> 
> We propose to incubate Kudu as a project of the Apache Software Foundation.
> 
> == Background ==
> 
> In recent years, explosive growth in the amount of data being generated and
> captured by enterprises has resulted in the rapid adoption of open source
> technology which is able to store massive data sets at scale and at low
> cost. In particular, the Apache Hadoop ecosystem has become a focal point
> for such “big data” workloads, because many traditional open source
> database systems have lagged in offering a scalable alternative.
> 
> 
> 
> Structured storage in the Hadoop ecosystem has typically been achieved in
> two ways: for static data sets, data is typically stored on Apache HDFS
> using binary data formats such as Apache Avro or Apache Parquet. However,
> neither HDFS nor these formats has any provision for updating individual
> records, or for efficient random access. Mutable data sets are typically
> stored in semi-structured stores such as Apache HBase or Apache Cassandra.
> These systems allow for low-latency record-level reads and writes, but lag
> far behind the static file formats in terms of sequential read throughput
> for applications such as SQL-based analytics or machine learning.
> 
> 
> 
> Kudu is a new storage system designed and implemented from the ground up to
> fill this gap between high-throughput sequential-access storage systems
> such as HDFS and low-latency random-access systems such as HBase or
> Cassandra. While these existing systems continue to hold advantages in some
> situations, Kudu offers a “happy medium” alternative that can dramatically
> simplify the architecture of many common workloads. In particular, Kudu
> offers a simple API for row-level inserts, updates, and deletes, while
> providing table scans at throughputs similar to Parquet, a commonly-used
> columnar format for static data.
> 
> 
> 
> More information on Kudu can be found at the existing open source project
> website: http://getkudu.io and in particular in the Kudu white-paper PDF:
> http://getkudu.io/kudu.pdf from which the above was excerpted.
> 
> == Rationale ==
> 
> As described above, Kudu fills an important gap in the open source storage
> ecosystem. After our initial open source project release in September 2015,
> we have seen a great amount of interest across a diverse set of users and
> companies. We believe that, as a storage system, it is critical to build an
> equally diverse set of contributors in the development community. Our
> experiences as committers and PMC members on other Apache projects have
> taught us the value of diverse communities in ensuring both longevity and
> high quality for such foundational systems.
> 
> == Initial Goals ==
> 
> * Move the existing codebase, website, documentation, and mailing lists to
> Apache-hosted infrastructure
> * Work with the infrastructure team to implement and approve our code
> review, build, and testing workflows in the context of the ASF
> * Incremental development and releases per Apache guidelines
> 
> == Current Status ==
> 
> === Releases ===
> 
> Kudu has undergone one public release, tagged here
> https://github.com/cloudera/kudu/tree/kudu0.5.0-release
> 
> This initial release was not performed in the typical ASF fashion -- no
>