Re: [Result][Vote] vote for IoTDB incubation proposal

2018-11-15 Thread 黄向东
> - When you say "open source" repo, do you mean private repo vs public
> repo?

Yes.

> 
> - I believe Craig as Secretary will say an SGA never hurts but isn't
> everything already licensed ASLv2?  It's been a few weeks and a few
> proposals reviewed so it could be my memory.

Currently, the licenses of IoTDB's dependency libraries include Apache 2.0,
BSD (antlr3), EPL 1.0 (logback), and EPL 2.0 (junit).
We are checking all the licenses once more to avoid mistakes.

Regards,
Xiangdong Huang
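
The license re-check described above can be sketched as a small script that groups dependencies by ASF third-party license category. This is only an illustrative sketch: the `ASF_CATEGORY` table and the `classify` helper are hypothetical names, and the category assignments (Apache 2.0 and BSD under Category A, EPL 1.0 and EPL 2.0 under Category B) reflect one reading of the ASF third-party license policy that should be verified against the current policy page.

```python
# Hypothetical lookup table: license label -> ASF third-party category.
# The assignments are an assumption to be checked against ASF legal policy.
ASF_CATEGORY = {
    "Apache-2.0": "A",
    "BSD": "A",
    "EPL-1.0": "B",
    "EPL-2.0": "B",
}


def classify(deps):
    """Group dependency names by category; unknown licenses land under '?'."""
    groups = {}
    for name, license_id in deps.items():
        category = ASF_CATEGORY.get(license_id, "?")
        groups.setdefault(category, []).append(name)
    return {cat: sorted(names) for cat, names in groups.items()}


# Dependencies and licenses exactly as listed in the email above.
deps = {
    "antlr3": "BSD",
    "logback": "EPL-1.0",
    "junit": "EPL-2.0",
}

report = classify(deps)
for cat in sorted(report):
    print(cat, report[cat])
```

Anything that lands under `?` would need a manual look, and Category B entries usually deserve a second review of how they are bundled, so a report like this is a starting point for the check rather than a verdict.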


> On Nov 15, 2018, at 10:43 PM, Kevin A. McGrail wrote:
> 
> Well, first, let's ask some questions:
> 
> - When you say "open source" repo, do you mean private repo vs public
> repo?
> 
> - I believe Craig as Secretary will say an SGA never hurts but isn't
> everything already licensed ASLv2?  It's been a few weeks and a few
> proposals reviewed so it could be my memory.
> 
> Regards,
> KAM
> 
> --
> Kevin A. McGrail
> VP Fundraising, Apache Software Foundation
> Chair Emeritus Apache SpamAssassin Project
> https://www.linkedin.com/in/kmcgrail - 703.798.0171
> 
> 
> On Thu, Nov 15, 2018 at 7:27 AM hxd  wrote:
> 
> >> Currently, there are 6 repositories in total (IoTDB, IoTDB-JDBC, TsFile,
> >> Spark-Connector, Hive-Connector, and Grafana-Connector), and we will
> >> merge them all into one repository.
> >>
> >> Only the first one is private.
> >>
> >> Actually, we lack experience with how to open source a project.
>> 
>> Should we open all the source now or after all the Apache legal documents
>> are done?
>> 
>> Best,
>> 
>> Xiangdong Huang
>> 
> >>> On Nov 15, 2018, at 5:06 PM, Willem Jiang wrote:
> >>>
> >>> Here is a question about the source code repository:
>>> 
>>> The main source git repo[1] is still a private repo.  I think we need
>>> to open source the repo before sending the SGA?
>>> 
>>> 
>>> [1]https://github.com/thulab/iotdb
>>> 
>>> Willem Jiang
>>> 
>>> Twitter: willemjiang
>>> Weibo: 姜宁willem
>>> On Thu, Nov 15, 2018 at 4:08 PM hxd  wrote:
 
> >>>> Hi,
> >>>>
> >>>> In the proposal discussion process, we got 3 mentors: Justin Mclean,
> >>>> Christofer Dutz, and Willem Ning Jiang.
> >>>>
> >>>> In the vote process, we got a new mentor, Joe Witt.
> >>>>
> >>>> In total, there is one Champion and four mentors:
> >>>>
> >>>> Kevin A. McGrail (the Champion),
> >>>> Justin Mclean,
> >>>> Christofer Dutz,
> >>>> Willem Ning Jiang, and
> >>>> Joe Witt
> >>>>
> >>>> I have checked their names on
> >>>> http://people.apache.org/committer-index.html, and they are accurate now.
> >>>> The name list on the proposal page
> >>>> (https://wiki.apache.org/incubator/IoTDBProposal) is also correct.
> >>>>
> >>>> Regards,
> >>>> Xiangdong Huang
> >>>>
> >>>> On Nov 15, 2018, at 12:51 AM, Kevin A. McGrail wrote:
> >>>>
> >>>> Congratulations!  As champion, I think the next steps are:
> >>>>
> >>>> 1 - Xiangdong, can you confirm the list of mentors on the proposal is
> >>>> accurate?
> >>>>
> >>>> 2 - Also Xiangdong, is there anyone else that stepped forward as a
> >>>> mentor during the voting process that the project wants the IPMC to approve?
> >>>>
> >>>> 3 - Justin, I think you have to request the creation of the podling and
> >>>> then I as champion work on things like the meta data file from this page,
> >>>> https://incubator.apache.org/policy/incubation.html, correct?
> >>>>
> >>>> Regards,
> >>>> KAM
> >>>>
> >>>> --
> >>>> Kevin A. McGrail
> >>>> VP Fundraising, Apache Software Foundation
> >>>> Chair Emeritus Apache SpamAssassin Project
> >>>> https://www.linkedin.com/in/kmcgrail - 703.798.0171
> >>>>
> >>>> On Wed, Nov 14, 2018 at 6:29 AM hxd wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> With 8 +1 binding votes, 2 +1 non-binding votes, and no +/-0 or -1
> >>>>> votes, this VOTE passes.
> >>>>>
> >>>>> Thanks to everyone who voted!
> >>>>>
> >>>>> Below is the voting tally:
> >>>>>
> >>>>> Binding
> >>>>> Von Gosling
> >>>>> Christofer Dutz
> >>>>> Kevin A. McGrail
> >>>>> Felix Cheung
> >>>>> Matt Sticker
> >>>>> Joe Witt
> >>>>> Justin Mclean
> >>>>> Willem Jiang
> >>>>>
> >>>>> Non-binding
> >>>>> Sheng Wu
> >>>>> Yang Bo
> >>>>>
> >>>>> The vote thread:
> >>>>> https://lists.apache.org/thread.html/077f029ab2b52a2b19fc8d41c07438f660a8e93dd87b3895d262263c@%3Cgeneral.incubator.apache.org%3E
> >>>>> The proposal: https://wiki.apache.org/incubator/IoTDBProposal
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> Xiangdong Huang
> >>>>>
> >>>>>> On Nov 7, 2018, at 3:46 PM, hxd wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> Sorry for the previous mail with bad formatting.
> >>>>>> I'd like to call a VOTE to accept the IoTDB project, a database for
> >>>>>> managing large amounts of time series data from IoT sensors in industrial
> >>>>>> applications, into the Apache Incubator.
> >>>>>> The full proposal is available on the wiki:
> >>>>>> https://wiki.apache.org/incubator/IoTDBProposal
> >>>>>> and it is also attached below for your convenience.
> >>>>>>
> >>>>>> Please cast your vote:
> >>>>>>
> >>>>>> [ ]

Re: [Result][Vote] vote for IoTDB incubation proposal

2018-11-15 Thread Kevin A. McGrail
Well, first, let's ask some questions:

- When you say "open source" repo, do you mean private repo vs public
repo?

- I believe Craig as Secretary will say an SGA never hurts but isn't
everything already licensed ASLv2?  It's been a few weeks and a few
proposals reviewed so it could be my memory.

Regards,
KAM

--
Kevin A. McGrail
VP Fundraising, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171


On Thu, Nov 15, 2018 at 7:27 AM hxd  wrote:

> Currently, there are 6 repositories in total (IoTDB, IoTDB-JDBC, TsFile,
> Spark-Connector, Hive-Connector, and Grafana-Connector), and we will
> merge them all into one repository.
>
> Only the first one is private.
>
> Actually, we lack experience with how to open source a project.
>
> Should we open all the source now or after all the Apache legal documents
> are done?
>
> Best,
>
> Xiangdong Huang
>
> > On Nov 15, 2018, at 5:06 PM, Willem Jiang wrote:
> >
> > Here is a question about the source code repository:
> >
> > The main source git repo[1] is still a private repo.  I think we need
> > to open source the repo before sending the SGA?
> >
> >
> > [1]https://github.com/thulab/iotdb
> >
> > Willem Jiang
> >
> > Twitter: willemjiang
> > Weibo: 姜宁willem
> > On Thu, Nov 15, 2018 at 4:08 PM hxd  wrote:
> >>
> >> Hi,
> >>
> >> In the proposal discussion process, we got 3 mentors: Justin Mclean,
> >> Christofer Dutz, and Willem Ning Jiang.
> >>
> >> In the vote process, we got a new mentor, Joe Witt.
> >>
> >> In total, there is one Champion and four mentors:
> >>
> >> Kevin A. McGrail (the Champion),
> >> Justin Mclean,
> >> Christofer Dutz,
> >> Willem Ning Jiang, and
> >> Joe Witt
> >>
> >> I have checked their names on
> >> http://people.apache.org/committer-index.html, and they are accurate now.
> >> The name list on the proposal page
> >> (https://wiki.apache.org/incubator/IoTDBProposal) is also correct.
> >>
> >> Regards,
> >> Xiangdong Huang
> >>
> >>
> >>
> >> On Nov 15, 2018, at 12:51 AM, Kevin A. McGrail wrote:
> >>
> >> Congratulations!  As champion, I think the next steps are:
> >>
> >> 1 - Xiangdong, can you confirm the list of mentors on the proposal is
> >> accurate?
> >>
> >> 2 - Also Xiangdong, is there anyone else that stepped forward as a
> >> mentor during the voting process that the project wants the IPMC to approve?
> >>
> >> 3 - Justin, I think you have to request the creation of the podling and
> >> then I as champion work on things like the meta data file from this page,
> >> https://incubator.apache.org/policy/incubation.html, correct?
> >>
> >> Regards,
> >> KAM
> >>
> >>
> >>
> >>
> >> --
> >> Kevin A. McGrail
> >> VP Fundraising, Apache Software Foundation
> >> Chair Emeritus Apache SpamAssassin Project
> >> https://www.linkedin.com/in/kmcgrail - 703.798.0171
> >>
> >>
> >> On Wed, Nov 14, 2018 at 6:29 AM hxd  wrote:
> >>>
> >>> Hi,
> >>>
> >>> With 8 +1 binding votes, 2 +1 non-binding votes, and no +/-0 or -1
> >>> votes, this VOTE passes.
> >>>
> >>> Thanks to everyone who voted!
> >>>
> >>> Below is the voting tally:
> >>>
> >>> Binding
> >>> Von Gosling
> >>> Christofer Dutz
> >>> Kevin A. McGrail
> >>> Felix Cheung
> >>> Matt Sticker
> >>> Joe Witt
> >>> Justin Mclean
> >>> Willem Jiang
> >>>
> >>>
> >>> Non-binding
> >>> Sheng Wu
> >>> Yang Bo
> >>>
> >>> The vote thread:
> >>> https://lists.apache.org/thread.html/077f029ab2b52a2b19fc8d41c07438f660a8e93dd87b3895d262263c@%3Cgeneral.incubator.apache.org%3E
> >>> The proposal: https://wiki.apache.org/incubator/IoTDBProposal
> >>>
> >>> Thanks,
> >>>
> >>> Xiangdong Huang
> >>>
> >>>
> >>>> On Nov 7, 2018, at 3:46 PM, hxd wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> Sorry for the previous mail with bad formatting.
> >>>> I'd like to call a VOTE to accept the IoTDB project, a database for
> >>>> managing large amounts of time series data from IoT sensors in industrial
> >>>> applications, into the Apache Incubator.
> >>>> The full proposal is available on the wiki:
> >>>> https://wiki.apache.org/incubator/IoTDBProposal
> >>>> and it is also attached below for your convenience.
> >>>>
> >>>> Please cast your vote:
> >>>>
> >>>>   [ ] +1, bring IoTDB into Incubator
> >>>>   [ ] +0, I don't care either way,
> >>>>   [ ] -1, do not bring IoTDB into Incubator, because...
> >>>>
> >>>> The vote will be open for at least 72 hours.
> >>>>
> >>>> Thanks,
> >>>> Xiangdong Huang.
> >>>>
> >>>>
> >>>> = IoTDB Proposal =
> >>>> v0.1.1
> >>>>
> >>>>
> >>>> == Abstract ==
> >>>> IoTDB is a data store for managing large amounts of time series data
> >>>> such as timestamped data from IoT sensors in industrial applications.
> >>>>
> >>>> == Proposal ==
> >>>> IoTDB is a database for managing large amounts of time series data
> >>>> with columnar storage, data encoding, pre-computation, and index
> >>>> techniques. It has a SQL-like interface to

Re: [VOTE] Accept the Iceberg project for incubation

2018-11-15 Thread Kenneth Knowles
+1 (non-binding)

On Thu, Nov 15, 2018 at 9:57 AM Michael Wall  wrote:

> +1 (binding)
>
> On Thu, Nov 15, 2018 at 3:03 AM Olivier Lamy  wrote:
>
> > +1
> >
> > On Wed, 14 Nov 2018 at 03:07, Ryan Blue  wrote:
> >
> > > The discuss thread seems to have reached consensus, so I propose
> > accepting
> > > the Iceberg project for incubation.
> > >
> > > The proposal is copied below and in the wiki:
> > > https://wiki.apache.org/incubator/IcebergProposal
> > >
> > > Please vote on whether to accept Iceberg in the next 72 hours:
> > >
> > > [ ] +1, accept Iceberg for incubation
> > > [ ] -1, reject the Iceberg proposal because . . .
> > >
> > > Thank you for reviewing the proposal and voting,
> > >
> > > rb
> > > --
> > > Iceberg Proposal Abstract
> > >
> > > Iceberg is a table format for large, slow-moving tabular data.
> > >
> > > It is designed to improve on the de-facto standard table layout built
> > into
> > > Apache Hive, Presto, and Apache Spark.
> > > Proposal
> > >
> > > The purpose of Iceberg is to provide SQL-like tables that are backed by
> > > large sets of data files. Iceberg is similar to the Hive table layout,
> > the
> > > de-facto standard structure used to track files in a table, but
> provides
> > > additional guarantees and performance optimizations:
> > >
> > > >- Atomicity - Each change to the table will either complete or fail.
> > > >“Do or do not. There is no try.”
> > >- Snapshot isolation - Reads use one and only one snapshot of a
> table
> > at
> > >some time without holding a lock.
> > >- Safe schema evolution - A table’s schema can change in
> well-defined
> > >ways, without breaking older data files.
> > >- Column projection - An engine may request a subset of the
> available
> > >columns, including nested fields.
> > >- Predicate pushdown - An engine can push filters into read planning
> > to
> > >improve performance using partition data and file-level statistics.
> > >
> > > Iceberg does NOT define a new file format. All data is stored in Apache
> > > Avro, Apache ORC, or Apache Parquet files.
> > >
> > > Additionally, Iceberg is designed to work well when data files are
> stored
> > > in cloud blob stores, even when those systems provide weaker guarantees
> > > than a file system, including:
> > >
> > >- Eventual consistency in the namespace
> > >- High latency for directory listings
> > >- No renames of objects
> > >- No folder hierarchy
> > >
> > > Rationale
> > >
> > > > Initial benchmarks show dramatic improvements in query planning. For
> > > > example, in Netflix’s Atlas use case, which stores time-series metrics
> > > > from Netflix runtime systems, 1 month of data is stored across 2.7
> > > > million files in 2,688 partitions:
> > >
> > >- Hive table using Parquet:
> > >   - 400k+ splits, not combined
> > >   - Explain query: 9.6 minutes wall time (planning only)
> > >- Iceberg table with partition filtering:
> > >   - 15,218 splits, combined
> > >   - Planning: 10 seconds
> > >   - Query wall time: 13 minutes
> > >- Iceberg table with partition and min/max filtering:
> > >   - 412 splits
> > >   - Planning: 25 seconds
> > >   - Query wall time: 42 seconds
> > >
> > > These performance gains combined with the cross-engine compatibility
> are
> > a
> > > very compelling story.
> > > Initial Goals
> > >
> > > The initial goal will be to move the existing codebase to Apache and
> > > integrate with the Apache development process and infrastructure. A
> > primary
> > > goal of incubation will be to grow and diversify the Iceberg community.
> > We
> > > are well aware that the project community is largely comprised of
> > > individuals from a single company. We aim to change that during
> > incubation.
> > > Current Status
> > >
> > > As previously mentioned, Iceberg is under active development at
> Netflix,
> > > and is being used in processing large volumes of data in Amazon EC2.
> > >
> > > Iceberg license documentation is already based on Apache guidelines for
> > > LICENSE and NOTICE content.
> > > Meritocracy
> > >
> > > We value meritocracy and we understand that it is the basis for an open
> > > community that encourages multiple companies and individuals to
> > contribute
> > > and be invested in the project’s future. We will encourage and monitor
> > > participation and make sure to extend privileges and responsibilities
> to
> > > all contributors.
> > > Community
> > >
> > > Iceberg is currently being used by developers at Netflix and a growing
> > > number of users are actively using it in production environments.
> Iceberg
> > > has received contributions from developers working at Hortonworks,
> > WeWork,
> > > and Palantir. By bringing Iceberg to Apache we aim to assure current
> and
> > > future contributors that the Iceberg community is meritocratic and
> open,
> > in
> > > > order to broaden and diversify the user and developer 

Re: [Result][Vote] vote for IoTDB incubation proposal

2018-11-15 Thread Kevin A. McGrail
I will defer the intake of code to the secretary.

On Thu, Nov 15, 2018, 12:20 黄向东 wrote:

> > - When you say "open source" repo, do you mean private repo vs public
> > repo?
>
> Yes.
>
> >
> > - I believe Craig as Secretary will say an SGA never hurts but isn't
> > everything already licensed ASLv2?  It's been a few weeks and a few
> > proposals reviewed so it could be my memory.
>
> Currently, the licenses of IoTDB's dependency libraries include
> Apache 2.0, BSD (antlr3), EPL 1.0 (logback), and EPL 2.0 (junit).
> We are checking all the licenses once more to avoid mistakes.
>
> Regards,
> Xiangdong Huang
>
>
> > On Nov 15, 2018, at 10:43 PM, Kevin A. McGrail wrote:
> >
> > Well, first, let's ask some questions:
> >
> > - When you say "open source" repo, do you mean private repo vs public
> > repo?
> >
> > - I believe Craig as Secretary will say an SGA never hurts but isn't
> > everything already licensed ASLv2?  It's been a few weeks and a few
> > proposals reviewed so it could be my memory.
> >
> > Regards,
> > KAM
> >
> > --
> > Kevin A. McGrail
> > VP Fundraising, Apache Software Foundation
> > Chair Emeritus Apache SpamAssassin Project
> > https://www.linkedin.com/in/kmcgrail - 703.798.0171
> >
> >
> > On Thu, Nov 15, 2018 at 7:27 AM hxd  wrote:
> >
> >> Currently, there are 6 repositories in total (IoTDB, IoTDB-JDBC, TsFile,
> >> Spark-Connector, Hive-Connector, and Grafana-Connector), and we will
> >> merge them all into one repository.
> >>
> >> Only the first one is private.
> >>
> >> Actually, we lack experience with how to open source a project.
> >>
> >> Should we open all the source now or after all the Apache legal
> >> documents are done?
> >>
> >> Best,
> >>
> >> Xiangdong Huang
> >>
> >>> On Nov 15, 2018, at 5:06 PM, Willem Jiang wrote:
> >>>
> >>> Here is a question about the source code repository:
> >>>
> >>> The main source git repo[1] is still a private repo.  I think we need
> >>> to open source the repo before sending the SGA?
> >>>
> >>>
> >>> [1]https://github.com/thulab/iotdb
> >>>
> >>> Willem Jiang
> >>>
> >>> Twitter: willemjiang
> >>> Weibo: 姜宁willem
> >>> On Thu, Nov 15, 2018 at 4:08 PM hxd  wrote:
> 
> >>>> Hi,
> >>>>
> >>>> In the proposal discussion process, we got 3 mentors: Justin Mclean,
> >>>> Christofer Dutz, and Willem Ning Jiang.
> >>>>
> >>>> In the vote process, we got a new mentor, Joe Witt.
> >>>>
> >>>> In total, there is one Champion and four mentors:
> >>>>
> >>>> Kevin A. McGrail (the Champion),
> >>>> Justin Mclean,
> >>>> Christofer Dutz,
> >>>> Willem Ning Jiang, and
> >>>> Joe Witt
> >>>>
> >>>> I have checked their names on
> >>>> http://people.apache.org/committer-index.html, and they are accurate now.
> >>>> The name list on the proposal page
> >>>> (https://wiki.apache.org/incubator/IoTDBProposal) is also correct.
> >>>>
> >>>> Regards,
> >>>> Xiangdong Huang
> >>>>
> >>>> On Nov 15, 2018, at 12:51 AM, Kevin A. McGrail wrote:
> >>>>
> >>>> Congratulations!  As champion, I think the next steps are:
> >>>>
> >>>> 1 - Xiangdong, can you confirm the list of mentors on the proposal is
> >>>> accurate?
> >>>>
> >>>> 2 - Also Xiangdong, is there anyone else that stepped forward as a
> >>>> mentor during the voting process that the project wants the IPMC to approve?
> >>>>
> >>>> 3 - Justin, I think you have to request the creation of the podling and
> >>>> then I as champion work on things like the meta data file from this page,
> >>>> https://incubator.apache.org/policy/incubation.html, correct?
> >>>>
> >>>> Regards,
> >>>> KAM
> >>>>
> >>>> --
> >>>> Kevin A. McGrail
> >>>> VP Fundraising, Apache Software Foundation
> >>>> Chair Emeritus Apache SpamAssassin Project
> >>>> https://www.linkedin.com/in/kmcgrail - 703.798.0171
> >>>>
> >>>> On Wed, Nov 14, 2018 at 6:29 AM hxd wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> With 8 +1 binding votes, 2 +1 non-binding votes, and no +/-0 or -1
> >>>>> votes, this VOTE passes.
> >>>>>
> >>>>> Thanks to everyone who voted!
> >>>>>
> >>>>> Below is the voting tally:
> >>>>>
> >>>>> Binding
> >>>>> Von Gosling
> >>>>> Christofer Dutz
> >>>>> Kevin A. McGrail
> >>>>> Felix Cheung
> >>>>> Matt Sticker
> >>>>> Joe Witt
> >>>>> Justin Mclean
> >>>>> Willem Jiang
> >>>>>
> >>>>> Non-binding
> >>>>> Sheng Wu
> >>>>> Yang Bo
> >>>>>
> >>>>> The vote thread:
> >>>>> https://lists.apache.org/thread.html/077f029ab2b52a2b19fc8d41c07438f660a8e93dd87b3895d262263c@%3Cgeneral.incubator.apache.org%3E
> >>>>> The proposal: https://wiki.apache.org/incubator/IoTDBProposal
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> Xiangdong Huang
> >>>>>
> >>>>>> On Nov 7, 2018, at 3:46 PM, hxd wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> Sorry for the previous mail with bad formatting.
> >>>>>> I'd like to call a VOTE to accept 

Re: [PROPOSAL] Changing requirements for IPMC

2018-11-15 Thread Henry Saputra
+1 to the new proposals.

The key is to have potential new members VOTED in to the IPMC, meaning
someone from the existing IPMC should know the criteria for being invited.

- Henry

On Tue, Nov 6, 2018 at 12:20 AM Justin Mclean 
wrote:

> Hi,
>
> I looked at the board resolution for the creation of the IPMC [1] and it
> says nothing about how IPMC members should be added so from that I take it
> that the IPMC can decide how it wants to do that.
>
> Currently the IPMC can vote people in (which is not so common) or an ASF
> member can request it. I’m not sure where the ASF member requirement came
> from and wasn’t able to find the discussion about this on the incubator
> list. (If anyone knows please point me to it.)
>
> In theory an ASF member should have the knowledge and skills to mentor a
> project, however I also think those who have gone through the incubating
> process, have voted on releases and proposed or accepted new committers and
> PMC members probably know just as much even if they are not ASF members.
> They may not have as much experience but should at least know the basics.
>
> Now, identifying everyone who has done this would not be easy, and the
> Venn diagram of them and people who want to be mentors is
> probably small (but still significant in numbers).
>
> So I propose this:
>
> If someone has done several of the following:
> - has been involved in an incubating project from start to finish
> - has been a release manager
> - has assembled LICENSE and NOTICE files
> - has reviewed and voted on releases
> - has proposed or accepted committers/PPMC members
>
> Then they can ask to join the IPMC by sending an email to private@
> listing what they have been involved in. The IPMC would VOTE on them, and
> there’s a chance they could be rejected, but given it’s a private vote I
> don’t think any harm is done if that happens. Also people could nominate
> other people who fit into this above group.
>
> I’d like to see this used for people who are wanting to be mentors, rather
> than just having binding votes on releases. I don’t have an issue with the
> latter (and I think the IPMC currently does a decent job of catching any
> issues with releases that come their way), but that’s what I’m trying to
> solve with this proposal, i.e., we currently need more mentors and will need
> even more as ASF scales up.
>
> The subject line is actually a lie. All this really changes is that people
> can bring themselves or be brought to the attention of the IPMC, rather
> than having the IPMC actively trying to find people from graduated projects
> who then may or may not want to be IPMC members.
>
> We could start this off as an experiment, take the first few people
> who request it, and see how it goes with more experienced mentors observing
> and/or helping them.
>
> What do people and the IPMC think of this proposal? Good idea or not?
> Could it work with some modifications? Is it not needed at all?
>
> Thanks,
> Justin
>
>
> 1.
> https://svn.apache.org/repos/infra/websites/production/incubator/content/official/resolution.html
> -
> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> For additional commands, e-mail: general-h...@incubator.apache.org
>
>


Re: [VOTE] Accept the Iceberg project for incubation

2018-11-15 Thread Michael Wall
+1 (binding)

On Thu, Nov 15, 2018 at 3:03 AM Olivier Lamy  wrote:

> +1
>
> On Wed, 14 Nov 2018 at 03:07, Ryan Blue  wrote:
>
> > The discuss thread seems to have reached consensus, so I propose
> accepting
> > the Iceberg project for incubation.
> >
> > The proposal is copied below and in the wiki:
> > https://wiki.apache.org/incubator/IcebergProposal
> >
> > Please vote on whether to accept Iceberg in the next 72 hours:
> >
> > [ ] +1, accept Iceberg for incubation
> > [ ] -1, reject the Iceberg proposal because . . .
> >
> > Thank you for reviewing the proposal and voting,
> >
> > rb
> > --
> > Iceberg Proposal Abstract
> >
> > Iceberg is a table format for large, slow-moving tabular data.
> >
> > It is designed to improve on the de-facto standard table layout built
> into
> > Apache Hive, Presto, and Apache Spark.
> > Proposal
> >
> > The purpose of Iceberg is to provide SQL-like tables that are backed by
> > large sets of data files. Iceberg is similar to the Hive table layout,
> the
> > de-facto standard structure used to track files in a table, but provides
> > additional guarantees and performance optimizations:
> >
> >- Atomicity - Each change to the table will either complete or fail.
> >“Do or do not. There is no try.”
> >- Snapshot isolation - Reads use one and only one snapshot of a table
> at
> >some time without holding a lock.
> >- Safe schema evolution - A table’s schema can change in well-defined
> >ways, without breaking older data files.
> >- Column projection - An engine may request a subset of the available
> >columns, including nested fields.
> >- Predicate pushdown - An engine can push filters into read planning
> to
> >improve performance using partition data and file-level statistics.
> >
> > Iceberg does NOT define a new file format. All data is stored in Apache
> > Avro, Apache ORC, or Apache Parquet files.
> >
> > Additionally, Iceberg is designed to work well when data files are stored
> > in cloud blob stores, even when those systems provide weaker guarantees
> > than a file system, including:
> >
> >- Eventual consistency in the namespace
> >- High latency for directory listings
> >- No renames of objects
> >- No folder hierarchy
> >
> > Rationale
> >
> > Initial benchmarks show dramatic improvements in query planning. For
> > example, in Netflix’s Atlas use case, which stores time-series metrics
> > from Netflix runtime systems, 1 month of data is stored across 2.7
> > million files in 2,688 partitions:
> >
> >- Hive table using Parquet:
> >   - 400k+ splits, not combined
> >   - Explain query: 9.6 minutes wall time (planning only)
> >- Iceberg table with partition filtering:
> >   - 15,218 splits, combined
> >   - Planning: 10 seconds
> >   - Query wall time: 13 minutes
> >- Iceberg table with partition and min/max filtering:
> >   - 412 splits
> >   - Planning: 25 seconds
> >   - Query wall time: 42 seconds
> >
> > These performance gains combined with the cross-engine compatibility are
> a
> > very compelling story.
> > Initial Goals
> >
> > The initial goal will be to move the existing codebase to Apache and
> > integrate with the Apache development process and infrastructure. A
> primary
> > goal of incubation will be to grow and diversify the Iceberg community.
> We
> > are well aware that the project community is largely comprised of
> > individuals from a single company. We aim to change that during
> incubation.
> > Current Status
> >
> > As previously mentioned, Iceberg is under active development at Netflix,
> > and is being used in processing large volumes of data in Amazon EC2.
> >
> > Iceberg license documentation is already based on Apache guidelines for
> > LICENSE and NOTICE content.
> > Meritocracy
> >
> > We value meritocracy and we understand that it is the basis for an open
> > community that encourages multiple companies and individuals to
> contribute
> > and be invested in the project’s future. We will encourage and monitor
> > participation and make sure to extend privileges and responsibilities to
> > all contributors.
> > Community
> >
> > Iceberg is currently being used by developers at Netflix and a growing
> > number of users are actively using it in production environments. Iceberg
> > has received contributions from developers working at Hortonworks,
> WeWork,
> > and Palantir. By bringing Iceberg to Apache we aim to assure current and
> > future contributors that the Iceberg community is meritocratic and open,
> in
> > order to broaden and diversify the user and developer community.
> > Core Developers
> >
> > Iceberg was initially developed at Netflix and is under active development.
> > We believe Iceberg will be of interest to a broad range of users and
> > developers and that incubating the project at the ASF will help us build a
> > diverse, sustainable community.
> > Alignment
> >
> > Iceberg utilizes 

Re: [VOTE] Heron Release 0.20.0-incubating Candidate 5

2018-11-15 Thread P. Taylor Goetz
+1 (binding)

- Checksum and signature look good
- L & N files look okay
- DISCLAIMER present
- ASF headers present
- No wayward binaries

(Note: I didn’t try to build; I just examined the source package.)

-Taylor
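
The checksum part of the checks listed above can be reproduced with a few lines of Python. This is a minimal sketch, assuming the source tarball has already been downloaded locally; `sha512_of` and `checksum_matches` are hypothetical helper names, and the sketch does not cover the signature check.

```python
import hashlib


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected_hex):
    """Compare a file's digest with a published value, ignoring case and whitespace."""
    return sha512_of(path) == expected_hex.strip().lower()
```

For this candidate, the expected value would be the SHA-512 string published in the vote email, compared against the digest of incubator-heron-v-0.20.0-incubating-candidate-5.tar.gz.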


> On Oct 31, 2018, at 1:31 PM, Neng Lu  wrote:
> 
> Hi All,
> 
> This is the 5th release candidate for Apache Heron, version
> 0.20.0-incubating. Thanks everyone for providing various feedback for the
> previous release candidates at the @dev mailing list voting process. This
> release candidate passed the project's dev voting process so we are
> bringing it to a broader voting process.
> 
> It is the starting point of Heron and contains Heron's main features, such
> as core streaming processing, stateful processing, streamlet API, API server,
> eco support, etc.
> 
> The full list of changes and fixes is available at:
> https://github.com/apache/incubator-heron/compare/0.17.8...release/v-0.20.0-incubating
> 
> *** Please download, test and vote on this release. This vote will stay open
> for at least 72 hours ***
> 
> Source files:
> https://dist.apache.org/repos/dist/dev/incubator/heron/heron-0.20.0-incubating-candidate-5/
> 
> SHA-512 checksums:
> 27890ab30fc3e69b627f47d58d178d1a7dffa9dbe4ebbb5a5aa77caaac882fdc2b6f98b3b76210020db0fa3fd86e294cba214f86072e449837e1b7615cd6124a
> incubator-heron-v-0.20.0-incubating-candidate-5.tar.gz
> 
> The tag to be voted upon:
> v0.20.0-incubating-candidate-5 (45043bb6dcef1e8089c0834f17f8be0cc3f451d3)
> https://github.com/apache/incubator-heron/releases/tag/v-0.20.0-incubating-candidate-5
> 
> Please download the source package, and follow the compiling guide
> (https://apache.github.io/incubator-heron/docs/developers/compiling/compiling/)
> to build and run Heron locally.
> 
> -- 
> Best Regards,
> Neng





Re: [VOTE] Heron Release 0.20.0-incubating Candidate 5

2018-11-15 Thread Justin Mclean
Hi,

> - CloudPickle has NOTICE requirements where the copyrights should be.
> Please fix in the next release.

What are these? BSD licensed software has no NOTICE requirements. The
copyright statements are included in the BSD license file, and that's where they
should be copied to or pointed at from our LICENSE file.

Thanks,
Justin



Re: [VOTE] Heron Release 0.20.0-incubating Candidate 5

2018-11-15 Thread Ning Wang
I believe in the LICENSE file, we have this information:


Third party BSD 3-Clause licenses


The following components are provided under the BSD 3-Clause license.
See project link for details.

JSXTransformer(v0.10.0)
  -> heron/tools/ui/resources/static/js/JSXTransformer.0.10.0.js
autogen.sh
  -> config/autogen.sh
cloudpickle(https://github.com/cloudpipe/cloudpickle/blob/master/LICENSE)
  -> heronpy/api/cloudpickle.py
cpplint(https://github.com/cpplint/cpplint/blob/master/LICENSE)
  -> third_party/python/cpplint/cpplint.py
d3(v3.4.11, https://github.com/d3/d3/blob/master/LICENSE)
  -> heron/tools/ui/resources/static/js/d3.min.3.4.11.js

Please feel free to let me know if it is not the right way. We will update.
Thanks in advance.


On Thu, Nov 15, 2018 at 2:13 PM Justin Mclean 
wrote:

> Hi,
>
> > - CloudPickle has NOTICE requirements where the copyrights should be.
> > Please fix in the next release.
>
> What are these? BSD licensed software has no NOTICE requirements. The
> copyright statements are included in the BSD license file, and that's where they
> should be copied to or pointed at from our LICENSE file.
>
> Thanks,
> Justin
>
>


Re: [VOTE] Heron Release 0.20.0-incubating Candidate 5

2018-11-15 Thread Dave Fisher
+1 (binding)

I checked the signature and checksum.
I did a rat check and there are no unexpected binaries.
- Note: CONTRIBUTING.md is a link to a file that is not included. Please fix in
the next release.
- CloudPickle has NOTICE requirements where the copyrights should be. Please
fix in the next release.

I could not build on my macOS 10.14, but this could easily be due to 
configuration issues related to other projects.

Regards,
Dave


> On Nov 13, 2018, at 2:46 PM, FatJ Love  wrote:
> 
> The package looks good to me +1
> 
> 
> On Thu, Nov 8, 2018 at 3:43 PM Neng Lu  wrote:
> 
>> Hi All,
>> 
>> This vote has been open for 8 days. If you have some time, please provide
>> any feedback to help us improve.
>> Thank you very much!
>> 
>> On Wed, Oct 31, 2018 at 10:31 AM Neng Lu  wrote:
>> 
>>> Hi All,
>>> 
>>> This is the 5th release candidate for Apache Heron, version
>>> 0.20.0-incubating. Thanks everyone for providing various feedback for the
>>> previous release candidates at the @dev mailing list voting process. This
>>> release candidate passed the project's dev voting process so we are
>>> bringing it to a broader voting process.
>>> 
>>> It is the starting point of Heron and contains heron's main features,
>> such
>>> as core streaming
>>> processing, stateful processing, streamlet API, API server, eco support,
>>> etc.
>>> 
>>> The full list of changes and fixes are available:
>>> 
>>> 
>> https://github.com/apache/incubator-heron/compare/0.17.8...release/v-0.20.0-incubating
>>> 
>>> *** Please download, test and vote on this release. This vote will stay
>>> open
>>> for at least 72 hours ***
>>> 
>>> Source files:
>>> 
>>> 
>> https://dist.apache.org/repos/dist/dev/incubator/heron/heron-0.20.0-incubating-candidate-5/
>>> 
>>> SHA-512 checksums:
>>> 
>>> 
>> 27890ab30fc3e69b627f47d58d178d1a7dffa9dbe4ebbb5a5aa77caaac882fdc2b6f98b3b76210020db0fa3fd86e294cba214f86072e449837e1b7615cd6124a
>>> incubator-heron-v-0.20.0-incubating-candidate-5.tar.gz
>>> 
>>> The tag to be voted upon:
>>> v0.20.0-incubating-candidate-5 (45043bb6dcef1e8089c0834f17f8be0cc3f451d3)
>>> 
>>> 
>> https://github.com/apache/incubator-heron/releases/tag/v-0.20.0-incubating-candidate-5
>>> 
>>> Please download the source package, and follow the compiling guide(
>>> 
>> https://apache.github.io/incubator-heron/docs/developers/compiling/compiling/
>> )
>>> to build and run the Heron locally.
>>> 
>>> --
>>> Best Regards,
>>> Neng
>>> 
>> 
>> 
>> --
>> Best Regards,
>> Neng
>> 


-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org



Re: [VOTE] Heron Release 0.20.0-incubating Candidate 5

2018-11-15 Thread Dave Fisher



> On Nov 15, 2018, at 9:23 AM, Dave Fisher  wrote:
> 
> +1 (binding)
> 
> I checked the signature and checksum.
> I did a rat check and there are no unexpected binaries.
> - Note CONTRIBUTING.md is a link to a not included file. Please fix in the 
> next release.
> - CloudPickle has NOTICE requirements where the copyrights should be. Please 
> fix in the next release.
> No unexpected binary
> 
> I could not build on my macOS 10.14, but this could easily be due to 
> configuration issues related to other projects.

With Ning’s help I was able to install the correct version of Bazel and have a 
successful build.

Regards,
Dave

> 
> Regards,
> Dave
> 
> 
>> On Nov 13, 2018, at 2:46 PM, FatJ Love  wrote:
>> 
>> The package looks good to me +1
>> 
>> 
>> On Thu, Nov 8, 2018 at 3:43 PM Neng Lu  wrote:
>> 
>>> Hi All,
>>> 
>>> This vote has been open for 8 days. If you have some time, please provide
>>> any feedback to help us improve.
>>> Thank you very much!
>>> 
>>> On Wed, Oct 31, 2018 at 10:31 AM Neng Lu  wrote:
>>> 
>>>> Hi All,
>>>> 
>>>> This is the 5th release candidate for Apache Heron, version
>>>> 0.20.0-incubating. Thanks everyone for providing various feedback for the
>>>> previous release candidates at the @dev mailing list voting process. This
>>>> release candidate passed the project's dev voting process so we are
>>>> bringing it to a broader voting process.
>>>> 
>>>> It is the starting point of Heron and contains heron's main features,
>>> such
>>>> as core streaming
>>>> processing, stateful processing, streamlet API, API server, eco support,
>>>> etc.
>>>> 
>>>> The full list of changes and fixes are available:
>>>> 
>>>> 
>>> https://github.com/apache/incubator-heron/compare/0.17.8...release/v-0.20.0-incubating
>>>> 
>>>> *** Please download, test and vote on this release. This vote will stay
>>>> open
>>>> for at least 72 hours ***
>>>> 
>>>> Source files:
>>>> 
>>>> 
>>> https://dist.apache.org/repos/dist/dev/incubator/heron/heron-0.20.0-incubating-candidate-5/
>>>> 
>>>> SHA-512 checksums:
>>>> 
>>>> 
>>> 27890ab30fc3e69b627f47d58d178d1a7dffa9dbe4ebbb5a5aa77caaac882fdc2b6f98b3b76210020db0fa3fd86e294cba214f86072e449837e1b7615cd6124a
>>>> incubator-heron-v-0.20.0-incubating-candidate-5.tar.gz
>>>> 
>>>> The tag to be voted upon:
>>>> v0.20.0-incubating-candidate-5 (45043bb6dcef1e8089c0834f17f8be0cc3f451d3)
>>>> 
>>>> 
>>> https://github.com/apache/incubator-heron/releases/tag/v-0.20.0-incubating-candidate-5
>>>> 
>>>> Please download the source package, and follow the compiling guide(
>>>> 
>>> https://apache.github.io/incubator-heron/docs/developers/compiling/compiling/
>>> )
>>>> to build and run the Heron locally.
>>>> 
>>>> --
>>>> Best Regards,
>>>> Neng
>>>> 
>>> 
>>> 
>>> --
>>> Best Regards,
>>> Neng
>>> 
> 
> 
> 





Re: [PROPOSAL] Changing requirements for IPMC

2018-11-15 Thread Liang Chen
Hi

+1. I am the chair of a TLP and have been involved in multiple incubating
projects from start to finish. I have also been the release manager for
multiple releases.

I would be glad to help if new incubator projects need it.

Regards
Liang


Justin Mclean wrote
> Hi,
> 
> I looked at the board resolution for the creation of the IPMC [1] and it
> says nothing about how IPMC members should be added so from that I take it
> that the IPMC can decide how it wants to do that.
> 
> Currently the IPMC can vote people in (which is not so common) or an ASF
> member can request it. I’m not sure where the ASF member requirement came
> from and wasn’t able to find the discussion about this on the incubator
> list. (If anyone knows please point me to it.)
> 
> In theory an ASF member should have the knowledge and skills to mentor a
> project, however I also think those who have gone through the incubating
> process, have voted on releases and proposed or accepted new committers
> and PMC members probably know just as much even if they are not ASF
> members. They may not have as much experience but should at least know the
> basics.
> 
> Now identifying everyone who has done this would not be easy to
> determine and the Venn diagram of them and people who want to be mentors
> is probably small (but still significant in numbers).
> 
> So I propose this:
> 
> If someone has done several of the following:
> - has been involved in an incubating project from start to finish
> - has been a release manager
> - has assembled LICENSE and NOTICE files
> - has reviewed and voted on releases
> - has proposed or accepted committers/PPMC members
> 
> Then they can ask to join the IPMC by sending an email to private@
> listing what they have been involved in. The IPMC would VOTE on them, and
> there’s a chance they could be rejected, but given it’s a private vote I
> don’t think any harm is done if that happens. Also people could nominate
> other people who fit into this above group.
> 
> I’d like to see this used for people who are wanting to be mentors, rather
> than just having binding votes on releases. I don’t have an issue with the
> latter (and I think the IPMC currently does a decent job of catching any
> issues with releases that come their way), but that’s what I’m trying to
> solve with this proposal. i.e. We currently need more mentors and will
> need even more as ASF scales up.
> 
> The subject line is actually a lie. All this really changes is that people
> can bring themselves or be brought to the attention of the IPMC, rather
> than having the IPMC actively trying to find people from graduated
> projects who then may or may not want to be IPMC members.
> 
> We could start this off as an experiment and take the first few people
> who request it, and see how it goes with more experienced mentors
> observing and/or helping them.
> 
> What do people and the IPMC think of this proposal? Good idea or not?
> Could it work with some modifications? Is it not needed at all?
> 
> Thanks,
> Justin
> 
> 
> 1.
> https://svn.apache.org/repos/infra/websites/production/incubator/content/official/resolution.html








Re: [Result][Vote] vote for IoTDB incubation proposal

2018-11-15 Thread hxd
Hi,

In the proposal discussion process, we got 3 mentors: Justin Mclean, 
Christofer Dutz, and Willem Ning Jiang. 

In the vote process, we got a new mentor, Joe Witt.

In total, there are one champion and four mentors:

Kevin A. McGrail (the Champion),
Justin Mclean, 
Christofer Dutz, 
Willem Ning Jiang, and
Joe Witt

I have checked their names on http://people.apache.org/committer-index.html, 
and they are accurate now. 
The name list on the proposal 
(https://wiki.apache.org/incubator/IoTDBProposal) is also correct.

Regards,
Xiangdong Huang

 

> On Nov 15, 2018, at 12:51 AM, Kevin A. McGrail  wrote:
> 
> Congratulations!  As champion, I think the next steps are:
> 
> 1 - Xiangdong, Can you confirm the list of mentors on the proposal is 
> accurate?
> 
> 2 - Also Xiangdong, Is there anyone else that stepped forward as a mentor 
> during the voting process that the project wants the IPMC to approve?  
> 
> 3 - Justin, I think you have to request the creation of the podling and then 
> I as champion work on things like the meta data file from this page, 
> https://incubator.apache.org/policy/incubation.html, correct?
> 
> Regards,
> KAM
> 
> 
> 
> 
> --
> Kevin A. McGrail
> VP Fundraising, Apache Software Foundation
> Chair Emeritus Apache SpamAssassin Project
> https://www.linkedin.com/in/kmcgrail - 703.798.0171
> 
> On Wed, Nov 14, 2018 at 6:29 AM hxd <hxd...@qq.com> wrote:
> Hi,
> 
> With 8 +1 binding votes,  2 +1 non-binding votes and No +/-0 or -1 votes, 
> this VOTE passes. 
> 
> Thanks to everyone who voted!
> 
> Below is the voting tally:
> 
> Binding
> Von Gosling
>  Christofer Dutz 
>  Kevin A. McGrail
>  Felix Cheung
>  Matt Sticker
>  Joe Witt
>  Justin Mclean 
>  Willem Jiang 
> 
> 
> Non-binding
>  Sheng Wu
>  Yang Bo
> 
> The vote thread: 
> https://lists.apache.org/thread.html/077f029ab2b52a2b19fc8d41c07438f660a8e93dd87b3895d262263c@%3Cgeneral.incubator.apache.org%3E
> 
> The proposal: https://wiki.apache.org/incubator/IoTDBProposal
> 
> Thanks,
> 
> Xiangdong Huang
> 
> 
> > On Nov 7, 2018, at 3:46 PM, hxd <hxd...@qq.com> wrote:
> > 
> > Hi,
> > 
> > Sorry for the previous mail with bad format.
> > I'd like to call a VOTE to accept IoTDB project, a database for managing 
> > large amounts of time series data  from IoT sensors in industrial 
> > applications, into the Apache Incubator. 
> > The full proposal is available on the wiki: 
> > https://wiki.apache.org/incubator/IoTDBProposal 
> > 
> > and it is also attached below for your convenience.
> > 
> > Please cast your vote:
> > 
> >   [ ] +1, bring IoTDB into Incubator
> >   [ ] +0, I don't care either way,
> >   [ ] -1, do not bring IoTDB into Incubator, because...
> > 
> > The vote will open at least for 72 hours.
> > 
> > Thanks,
> > Xiangdong Huang.
> > 
> > 
> > = IoTDB Proposal  =
> > v0.1.1
> > 
> > 
> > == Abstract ==
> > IoTDB is a data store for managing large amounts of time series data such 
> > as timestamped data from IoT sensors in industrial applications.
> > 
> > == Proposal ==
> > IoTDB is a database for managing large amount of time series data with 
> > columnar storage, data encoding, pre-computation, and index techniques. It 
> > has SQL-like interface to write millions of data points per second per node 
> > and is optimized to get query results in few seconds over trillions of data 
> > points. It can also be easily integrated with Apache Hadoop MapReduce and 
> > Apache Spark for analytics.
> > 
> > == Background ==
> > 
> > A new class of data management system requirements is becoming increasingly 
> > important with the rise of the Internet of Things. There are some database 
> > systems and technologies aimed at time series data management.  For 
> > example, Gorilla and InfluxDB which are mainly built for data centers and 
> > monitoring application metrics. Other systems, for example, OpenTSDB and 
> > KairosDB, are built on Apache HBase and Apache Cassandra, respectively. 
> > 
> > However, many applications for time series data management have more 
> > requirements especially in industrial applications as follows:
> > 
> >  * Supporting time series data which has high data frequency. For 

Re: [Result][Vote] vote for IoTDB incubation proposal

2018-11-15 Thread Willem Jiang
Here is a question for the source code repository

The main source git repo[1] is still a private repo.  I think we need
to open source the repo before sending the SGA?


[1]https://github.com/thulab/iotdb

Willem Jiang

Twitter: willemjiang
Weibo: 姜宁willem
On Thu, Nov 15, 2018 at 4:08 PM hxd  wrote:
>
> Hi,
>
> In the proposal discussion process, we got 3 mentors,  Justin Mclean, 
> Christofer Dutz, and Willem Ning Jiang.
>
> In the vote process, we got a new mentor, Joe Witt.
>
> Totally, there are one Champion and four mentors, they are:
>
> Kevin A. McGrail (the Champion),
> Justin Mclean,
> Christofer Dutz,
> Willem Ning Jiang, and
> Joe Witt
>
> I have checked their name on http://people.apache.org/committer-index.html, 
> and they are accurate now.
> The name list on the proposal list 
> (https://wiki.apache.org/incubator/IoTDBProposal) is also correct.
>
> Regards,
> Xiangdong Huang
>
>
>
> On Nov 15, 2018, at 12:51 AM, Kevin A. McGrail  wrote:
>
> Congratulations!  As champion, I think the next steps are:
>
> 1 - Xiangdong, Can you confirm the list of mentors on the proposal is 
> accurate?
>
> 2 - Also Xiangdong, Is there anyone else that stepped forward as a mentor 
> during the voting process that the project wants the IPMC to approve?
>
> 3 - Justin, I think you have to request the creation of the podling and then 
> I as champion work on things like the meta data file from this page,
> https://incubator.apache.org/policy/incubation.html, correct?
>
> Regards,
> KAM
>
>
>
>
> --
> Kevin A. McGrail
> VP Fundraising, Apache Software Foundation
> Chair Emeritus Apache SpamAssassin Project
> https://www.linkedin.com/in/kmcgrail - 703.798.0171
>
>
> On Wed, Nov 14, 2018 at 6:29 AM hxd  wrote:
>>
>> Hi,
>>
>> With 8 +1 binding votes,  2 +1 non-binding votes and No +/-0 or -1 votes, 
>> this VOTE passes.
>>
>> Thanks to everyone who voted!
>>
>> Below is the voting tally:
>>
>> Binding
>> Von Gosling
>>  Christofer Dutz
>>  Kevin A. McGrail
>>  Felix Cheung
>>  Matt Sticker
>>  Joe Witt
>>  Justin Mclean
>>  Willem Jiang
>>
>>
>> Non-binding
>>  Sheng Wu
>>  Yang Bo
>>
>> The vote thread: 
>> https://lists.apache.org/thread.html/077f029ab2b52a2b19fc8d41c07438f660a8e93dd87b3895d262263c@%3Cgeneral.incubator.apache.org%3E
>> The proposal: https://wiki.apache.org/incubator/IoTDBProposal 
>> 
>>
>> Thanks,
>>
>> Xiangdong Huang
>>
>>
>> > On Nov 7, 2018, at 3:46 PM, hxd  wrote:
>> >
>> > Hi,
>> >
>> > Sorry for the previous mail with bad format.
>> > I'd like to call a VOTE to accept IoTDB project, a database for managing 
>> > large amounts of time series data  from IoT sensors in industrial 
>> > applications, into the Apache Incubator.
>> > The full proposal is available on the wiki: 
>> > https://wiki.apache.org/incubator/IoTDBProposal
>> > and it is also attached below for your convenience.
>> >
>> > Please cast your vote:
>> >
>> >   [ ] +1, bring IoTDB into Incubator
>> >   [ ] +0, I don't care either way,
>> >   [ ] -1, do not bring IoTDB into Incubator, because...
>> >
>> > The vote will open at least for 72 hours.
>> >
>> > Thanks,
>> > Xiangdong Huang.
>> >
>> >
>> > = IoTDB Proposal  =
>> > v0.1.1
>> >
>> >
>> > == Abstract ==
>> > IoTDB is a data store for managing large amounts of time series data such 
>> > as timestamped data from IoT sensors in industrial applications.
>> >
>> > == Proposal ==
>> > IoTDB is a database for managing large amount of time series data with 
>> > columnar storage, data encoding, pre-computation, and index techniques. It 
>> > has SQL-like interface to write millions of data points per second per 
>> > node and is optimized to get query results in few seconds over trillions 
>> > of data points. It can also be easily integrated with Apache Hadoop 
>> > MapReduce and Apache Spark for analytics.
>> >
>> > == Background ==
>> >
>> > A new class of data management system requirements is becoming 
>> > increasingly important with the rise of the Internet of Things. There are 
>> > some database systems and technologies aimed at time series data 
>> > management.  For example, Gorilla and InfluxDB which are mainly built for 
>> > data centers and monitoring application metrics. Other systems, for 
>> > example, OpenTSDB and KairosDB, are built on Apache HBase and Apache 
>> > Cassandra, respectively.
>> >
>> > However, many applications for time series data management have more 
>> > requirements especially in industrial applications as follows:
>> >
>> >  * Supporting time series data which has high data frequency. For example, 
>> > a turbine engine may generate 1000 points per second (i.e., 1000Hz), while 
>> > each CPU only reports 1 data points per 5 seconds in a data center 
>> > monitoring application.
>> >
>> >  * Supporting scanning data multi-resolutionally. For example, aggregation 
>> > operation is 

Re: [VOTE] Accept the Iceberg project for incubation

2018-11-15 Thread Olivier Lamy
+1

On Wed, 14 Nov 2018 at 03:07, Ryan Blue  wrote:

> The discuss thread seems to have reached consensus, so I propose accepting
> the Iceberg project for incubation.
>
> The proposal is copied below and in the wiki:
> https://wiki.apache.org/incubator/IcebergProposal
>
> Please vote on whether to accept Iceberg in the next 72 hours:
>
> [ ] +1, accept Iceberg for incubation
> [ ] -1, reject the Iceberg proposal because . . .
>
> Thank you for reviewing the proposal and voting,
>
> rb
> --
> Iceberg Proposal Abstract
>
> Iceberg is a table format for large, slow-moving tabular data.
>
> It is designed to improve on the de-facto standard table layout built into
> Apache Hive, Presto, and Apache Spark.
> Proposal
>
> The purpose of Iceberg is to provide SQL-like tables that are backed by
> large sets of data files. Iceberg is similar to the Hive table layout, the
> de-facto standard structure used to track files in a table, but provides
> additional guarantees and performance optimizations:
>
>- Atomicity - Each change to the table will either complete or fail.
>“Do or do not. There is no try.”
>- Snapshot isolation - Reads use one and only one snapshot of a table at
>some time without holding a lock.
>- Safe schema evolution - A table’s schema can change in well-defined
>ways, without breaking older data files.
>- Column projection - An engine may request a subset of the available
>columns, including nested fields.
>- Predicate pushdown - An engine can push filters into read planning to
>improve performance using partition data and file-level statistics.
>
> Iceberg does NOT define a new file format. All data is stored in Apache
> Avro, Apache ORC, or Apache Parquet files.
>
> Additionally, Iceberg is designed to work well when data files are stored
> in cloud blob stores, even when those systems provide weaker guarantees
> than a file system, including:
>
>- Eventual consistency in the namespace
>- High latency for directory listings
>- No renames of objects
>- No folder hierarchy
>
> Rationale
>
> Initial benchmarks show dramatic improvements in query planning. For
> example, in Netflix’s Atlas use case, which stores time-series metrics from
> Netflix runtime systems, 1 month of data is stored across 2.7 million files in
> 2,688 partitions:
>
>- Hive table using Parquet:
>   - 400k+ splits, not combined
>   - Explain query: 9.6 minutes wall time (planning only)
>- Iceberg table with partition filtering:
>   - 15,218 splits, combined
>   - Planning: 10 seconds
>   - Query wall time: 13 minutes
>- Iceberg table with partition and min/max filtering:
>   - 412 splits
>   - Planning: 25 seconds
>   - Query wall time: 42 seconds
>
> These performance gains combined with the cross-engine compatibility are a
> very compelling story.
> Initial Goals
>
> The initial goal will be to move the existing codebase to Apache and
> integrate with the Apache development process and infrastructure. A primary
> goal of incubation will be to grow and diversify the Iceberg community. We
> are well aware that the project community is largely comprised of
> individuals from a single company. We aim to change that during incubation.
> Current Status
>
> As previously mentioned, Iceberg is under active development at Netflix,
> and is being used in processing large volumes of data in Amazon EC2.
>
> Iceberg license documentation is already based on Apache guidelines for
> LICENSE and NOTICE content.
> Meritocracy
>
> We value meritocracy and we understand that it is the basis for an open
> community that encourages multiple companies and individuals to contribute
> and be invested in the project’s future. We will encourage and monitor
> participation and make sure to extend privileges and responsibilities to
> all contributors.
> Community
>
> Iceberg is currently being used by developers at Netflix and a growing
> number of users are actively using it in production environments. Iceberg
> has received contributions from developers working at Hortonworks, WeWork,
> and Palantir. By bringing Iceberg to Apache we aim to assure current and
> future contributors that the Iceberg community is meritocratic and open, in
> order to broaden and diversify the user and developer community.
> Core Developers
>
> Iceberg was initially developed at Netflix and is under active development.
> We believe Iceberg will be of interest to a broad range of users and
> developers and that incubating the project at the ASF will help us build a
> diverse, sustainable community.
> Alignment
>
> Iceberg utilizes other Apache projects such as Avro, Hadoop, Hive, ORC,
> Parquet, Pig, and Spark. We anticipate integration with additional Apache
> projects as the Iceberg community and interest in the project grows.
> Known Risks
>
> Orphaned Products
>
> Netflix is committed to the future development of Iceberg and understands
> that 

Re: [Result][Vote] vote for IoTDB incubation proposal

2018-11-15 Thread hxd
Currently, there are 6 repositories in total (IoTDB, IoTDB-JDBC, TsFile, 
Spark-Connector, Hive-Connector, and Grafana-Connector), and we will 
merge them all into one repository. 

Only the first one is private. 

Actually, we lack experience in how to open source a project. 

Should we open all the source now or after all the Apache legal documents are 
done? 

Best,

Xiangdong Huang  

> On Nov 15, 2018, at 5:06 PM, Willem Jiang  wrote:
> 
> Here is a question for the source code repository
> 
> The main source git repo[1] is still a private repo.  I think we need
> to open source the repo before sending the SGA?
> 
> 
> [1]https://github.com/thulab/iotdb
> 
> Willem Jiang
> 
> Twitter: willemjiang
> Weibo: 姜宁willem
> On Thu, Nov 15, 2018 at 4:08 PM hxd  wrote:
>> 
>> Hi,
>> 
>> In the proposal discussion process, we got 3 mentors,  Justin Mclean, 
>> Christofer Dutz, and Willem Ning Jiang.
>> 
>> In the vote process, we got a new mentor, Joe Witt.
>> 
>> Totally, there are one Champion and four mentors, they are:
>> 
>> Kevin A. McGrail (the Champion),
>> Justin Mclean,
>> Christofer Dutz,
>> Willem Ning Jiang, and
>> Joe Witt
>> 
>> I have checked their name on http://people.apache.org/committer-index.html, 
>> and they are accurate now.
>> The name list on the proposal list 
>> (https://wiki.apache.org/incubator/IoTDBProposal) is also correct.
>> 
>> Regards,
>> Xiangdong Huang
>> 
>> 
>> 
>> On Nov 15, 2018, at 12:51 AM, Kevin A. McGrail  wrote:
>> 
>> Congratulations!  As champion, I think the next steps are:
>> 
>> 1 - Xiangdong, Can you confirm the list of mentors on the proposal is 
>> accurate?
>> 
>> 2 - Also Xiangdong, Is there anyone else that stepped forward as a mentor 
>> during the voting process that the project wants the IPMC to approve?
>> 
>> 3 - Justin, I think you have to request the creation of the podling and then 
>> I as champion work on things like the meta data file from this page,
>> https://incubator.apache.org/policy/incubation.html, correct?
>> 
>> Regards,
>> KAM
>> 
>> 
>> 
>> 
>> --
>> Kevin A. McGrail
>> VP Fundraising, Apache Software Foundation
>> Chair Emeritus Apache SpamAssassin Project
>> https://www.linkedin.com/in/kmcgrail - 703.798.0171
>> 
>> 
>> On Wed, Nov 14, 2018 at 6:29 AM hxd  wrote:
>>> 
>>> Hi,
>>> 
>>> With 8 +1 binding votes,  2 +1 non-binding votes and No +/-0 or -1 votes, 
>>> this VOTE passes.
>>> 
>>> Thanks to everyone who voted!
>>> 
>>> Below is the voting tally:
>>> 
>>> Binding
>>> Von Gosling
>>> Christofer Dutz
>>> Kevin A. McGrail
>>> Felix Cheung
>>> Matt Sticker
>>> Joe Witt
>>> Justin Mclean
>>> Willem Jiang
>>> 
>>> 
>>> Non-binding
>>> Sheng Wu
>>> Yang Bo
>>> 
>>> The vote thread: 
>>> https://lists.apache.org/thread.html/077f029ab2b52a2b19fc8d41c07438f660a8e93dd87b3895d262263c@%3Cgeneral.incubator.apache.org%3E
>>> The proposal: https://wiki.apache.org/incubator/IoTDBProposal 
>>> 
>>> 
>>> Thanks,
>>> 
>>> Xiangdong Huang
>>> 
>>> 
>>>> On Nov 7, 2018, at 3:46 PM, hxd  wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> Sorry for the previous mail with bad format.
>>>> I'd like to call a VOTE to accept IoTDB project, a database for managing 
>>>> large amounts of time series data  from IoT sensors in industrial 
>>>> applications, into the Apache Incubator.
>>>> The full proposal is available on the wiki: 
>>>> https://wiki.apache.org/incubator/IoTDBProposal
>>>> and it is also attached below for your convenience.
>>>> 
>>>> Please cast your vote:
>>>> 
>>>>  [ ] +1, bring IoTDB into Incubator
>>>>  [ ] +0, I don't care either way,
>>>>  [ ] -1, do not bring IoTDB into Incubator, because...
>>>> 
>>>> The vote will open at least for 72 hours.
>>>> 
>>>> Thanks,
>>>> Xiangdong Huang.
>>>> 
>>>> 
>>>> = IoTDB Proposal  =
>>>> v0.1.1
>>>> 
>>>> 
>>>> == Abstract ==
>>>> IoTDB is a data store for managing large amounts of time series data such 
>>>> as timestamped data from IoT sensors in industrial applications.
>>>> 
>>>> == Proposal ==
>>>> IoTDB is a database for managing large amount of time series data with 
>>>> columnar storage, data encoding, pre-computation, and index techniques. It 
>>>> has SQL-like interface to write millions of data points per second per 
>>>> node and is optimized to get query results in few seconds over trillions 
>>>> of data points. It can also be easily integrated with Apache Hadoop 
>>>> MapReduce and Apache Spark for analytics.
>>>> 
>>>> == Background ==
>>>> 
>>>> A new class of data management system requirements is becoming 
>>>> increasingly important with the rise of the Internet of Things. There are 
>>>> some database systems and technologies aimed at time series data 
>>>> management.  For example, Gorilla and InfluxDB which are mainly built for 
>>>> data centers and monitoring application metrics. Other systems, for 
>>>> example, OpenTSDB and KairosDB, are