Re: [VOTE] Accept the Iceberg project for incubation

2018-11-16 Thread Julien Le Dem
>
> +1

>
> From: Kenneth Knowles 
> Date: Thu, Nov 15, 2018 at 10:01 AM
> Subject: Re: [VOTE] Accept the Iceberg project for incubation
> To: 
>
>
> +1 (non-binding)
>
> On Thu, Nov 15, 2018 at 9:57 AM Michael Wall  wrote:
>
> > +1 (binding)
> >
> > On Thu, Nov 15, 2018 at 3:03 AM Olivier Lamy  wrote:
> >
> > > +1
> > >
> > > On Wed, 14 Nov 2018 at 03:07, Ryan Blue  wrote:
> > >
> > > > > The discuss thread seems to have reached consensus, so I propose
> > > > > accepting the Iceberg project for incubation.
> > > >
> > > > The proposal is copied below and in the wiki:
> > > > https://wiki.apache.org/incubator/IcebergProposal
> > > >
> > > > Please vote on whether to accept Iceberg in the next 72 hours:
> > > >
> > > > [ ] +1, accept Iceberg for incubation
> > > > [ ] -1, reject the Iceberg proposal because . . .
> > > >
> > > > Thank you for reviewing the proposal and voting,
> > > >
> > > > rb
> > > > --
> > > > > Iceberg Proposal
> > > > >
> > > > > Abstract
> > > > >
> > > > > Iceberg is a table format for large, slow-moving tabular data.
> > > > >
> > > > > It is designed to improve on the de-facto standard table layout
> > > > > built into Apache Hive, Presto, and Apache Spark.
> > > > >
> > > > > Proposal
> > > > >
> > > > > The purpose of Iceberg is to provide SQL-like tables that are backed
> > > > > by large sets of data files. Iceberg is similar to the Hive table
> > > > > layout, the de-facto standard structure used to track files in a
> > > > > table, but provides additional guarantees and performance
> > > > > optimizations:
> > > >
> > > > >    - Atomicity - Each change to the table will be complete or will
> > > > >      fail. “Do or do not. There is no try.”
> > > > >    - Snapshot isolation - Reads use one and only one snapshot of a
> > > > >      table at some time without holding a lock.
> > > > >    - Safe schema evolution - A table’s schema can change in
> > > > >      well-defined ways, without breaking older data files.
> > > > >    - Column projection - An engine may request a subset of the
> > > > >      available columns, including nested fields.
> > > > >    - Predicate pushdown - An engine can push filters into read
> > > > >      planning to improve performance using partition data and
> > > > >      file-level statistics.
> > > >
> > > > > Iceberg does NOT define a new file format. All data is stored in
> > > > > Apache Avro, Apache ORC, or Apache Parquet files.
> > > > >
> > > > > Additionally, Iceberg is designed to work well when data files are
> > > > > stored in cloud blob stores, even when those systems provide weaker
> > > > > guarantees than a file system, including:
> > > > >
> > > > >    - Eventual consistency in the namespace
> > > > >    - High latency for directory listings
> > > > >    - No renames of objects
> > > > >    - No folder hierarchy
> > > >
> > > > > Rationale
> > > > >
> > > > > Initial benchmarks show dramatic improvements in query planning. For
> > > > > example, in Netflix’s Atlas use case, which stores time-series
> > > > > metrics from Netflix runtime systems, 1 month of data is stored
> > > > > across 2.7 million files in 2,688 partitions:
> > > > >
> > > > >    - Hive table using Parquet:
> > > > >       - 400k+ splits, not combined
> > > > >       - Explain query: 9.6 minutes wall time (planning only)
> > > > >    - Iceberg table with partition filtering:
> > > > >       - 15,218 splits, combined
> > > > >       - Planning: 10 seconds
> > > > >       - Query wall time: 13 minutes
> > > > >    - Iceberg table with partition and min/max filtering:
> > > > >       - 412 splits
> > > > >       - Planning: 25 seconds
> > > > >       - Query wall time: 42 seconds
> > > > >
> > > > > These performance gains combined with the cross-engine compatibility
> > > > > are a very compelling story.
> > > > >
> > > > > Initial Goals
> > > > >
> > > > > The initial goal will be to move the existing codebase to Apache and
> > > > > integrate with the Apache development process and infrastructure. A
> > > > > primary goal of incubation will be to grow and diversify the Iceberg
> > > > > community. We are well aware that the project community is largely
> > > > > comprised of individuals from a single company. We aim to change
> > > > > that during incubation.
> > > > >
> > > > > Current Status
> > > > >
> > > > > As previously mentioned, Iceberg is under active development at
> > > > > Netflix, and is being used in processing large volumes of data in
> > > > > Amazon EC2.
> > > > >
> > > > > Iceberg license documentation is already based on Apache guidelines
> > > > > for LICENSE and NOTICE content.
> > > > >
> > > > > Meritocracy
> > > > >
> > > > > We value meritocracy and we understand that it is the basis for an
> > > > > open community that encourages multiple companies and individuals to
> > > > > contribute and be invested in the project’s future. We will
> > > > > encourage and monitor participation and make sure to extend
> > > > > privileges and responsibilities to all contributors.
> > > > >
> > > > Community
> > > >
> > > > Iceberg is currently being used by 

Re: [Incubator Wiki] Update of "November2018" by christhistlethwaite

2018-11-16 Thread Chris Lambertus



> On Nov 12, 2018, at 12:18 PM, Justin Mclean  wrote:
> 
> Hi,
> 
>> -   [ ](warble) Daniel Takamori
>> -  Comments:
>>   [X](warble) Chris Lambertus
>>  Comments: Community building and documentation of the framework 
>>  continue to be the core focus of Warble.
> 
> Any reason for removing Daniel as a mentor? He’s still listed on the roster. 
> [1]


I’m not sure if Chris T has responded yet, but we are unsure if Daniel will 
continue participating in the project. I’m not sure that he needs to be removed 
from the list, but I wouldn’t expect him to provide any sign-offs at any time 
in the foreseeable future. As such, we only currently have one active mentor, 
me. If anyone from the IPMC is interested in the Warble codebase and project 
goals, you are most welcome and encouraged to participate as a mentor, but 
given the current PPMC, we don’t feel additional mentorship is needed at this 
time. The project is expected to proceed at a slow pace.

-Chris



> 
> Thanks,
> Justin
> 
> 1. https://whimsy.apache.org/roster/ppmc/warble
> -
> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> For additional commands, e-mail: general-h...@incubator.apache.org
> 





Re: [VOTE] Heron Release 0.20.0-incubating Candidate 5

2018-11-16 Thread Neng Lu
Hi all,

With three +1 binding votes, three +1 non-binding votes and no -1 or +/-0
votes, this release vote passes! Heron is now ready for the first Apache
release!

Thanks to our mentors and contributors for helping us make it happen.
Thanks to everyone who checked the release and voted for it.

The voter list is attached:

+1 Binding:
  Justin Mclean
  Dave Fisher
  P. Taylor Goetz

+1 Non-binding:
  Karthik Ramasamy
  Fu Maosong
  huijun.wu

Thanks,
Neng


On Wed, Oct 31, 2018 at 10:31 AM Neng Lu  wrote:

> Hi All,
>
> This is the 5th release candidate for Apache Heron, version
> 0.20.0-incubating. Thanks everyone for providing various feedback for the
> previous release candidates at the @dev mailing list voting process. This
> release candidate passed the project's dev voting process so we are
> bringing it to a broader voting process.
>
> It is the starting point of Heron and contains Heron's main features, such
> as core streaming processing, stateful processing, streamlet API, API
> server, eco support, etc.
>
> The full list of changes and fixes is available at:
>
> https://github.com/apache/incubator-heron/compare/0.17.8...release/v-0.20.0-incubating
>
> *** Please download, test and vote on this release. This vote will stay
> open for at least 72 hours ***
>
> Source files:
>
> https://dist.apache.org/repos/dist/dev/incubator/heron/heron-0.20.0-incubating-candidate-5/
>
> SHA-512 checksums:
>
> 27890ab30fc3e69b627f47d58d178d1a7dffa9dbe4ebbb5a5aa77caaac882fdc2b6f98b3b76210020db0fa3fd86e294cba214f86072e449837e1b7615cd6124a
> incubator-heron-v-0.20.0-incubating-candidate-5.tar.gz
>
> The tag to be voted upon:
> v0.20.0-incubating-candidate-5 (45043bb6dcef1e8089c0834f17f8be0cc3f451d3)
>
> https://github.com/apache/incubator-heron/releases/tag/v-0.20.0-incubating-candidate-5
>
> Please download the source package and follow the compiling guide
> (https://apache.github.io/incubator-heron/docs/developers/compiling/compiling/)
> to build and run Heron locally.
>
> --
> Best Regards,
> Neng
>


-- 
Best Regards,
Neng
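
For anyone checking a candidate like the one above, here is a minimal sketch of
comparing a downloaded tarball against a published SHA-512 digest. It uses only
Python's standard hashlib and hmac modules. The function names are illustrative
assumptions and are not part of any Heron or ASF tooling. It covers only the
digest comparison, not signature verification.

```python
import hashlib
import hmac


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, published_hex):
    """Compare the computed digest with a published one.

    Whitespace and letter case in the published value are ignored, since
    checksum files are formatted in different ways.
    """
    expected = "".join(published_hex.split()).lower()
    return hmac.compare_digest(sha512_of_file(path), expected)
```

Usage would look like matches_published_digest("release.tar.gz", digest_text),
where both the file name and digest_text are supplied by the reviewer.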


[VOTE] Retire ODF Toolkit

2018-11-16 Thread Dave Fisher
The ODF Toolkit community has VOTEd to Retire.

The thread: 
https://lists.apache.org/thread.html/6de81dc33cb8311a38e38b927bfa1df7290c47ba7e7b39311cd06ce6@%3Codf-dev.incubator.apache.org%3E

The results: 
https://lists.apache.org/thread.html/d4b4b33470a7bc3e70dc6192527168bb5907810045df5453c784cc93@%3Codf-dev.incubator.apache.org%3E

Note that of the 6 VOTES there were 5 from IPMC members including the current 
Mentor and two original Mentors.

This Vote will continue with the IPMC votes at 5 +1 and will be open for the 
next week.

[ ] +1 - Retire ODF Toolkit.
[ ] -1 - Do not retire ODF Toolkit. I intend to become a Mentor and try to make 
the community happen.

Regards,
Dave



Re: [NEW MENTORS REQUIRED] Apache SensSoft

2018-11-16 Thread Dave Meikle
Thanks Joshua.

I've just subscribed to the mailing list and forked the repos on GitHub.
Looking forward to hearing your plans and seeing where I can help.

Cheers,
Dave

On Tue, 13 Nov 2018 at 01:59, Joshua Poore  wrote:

> Hi Dave, Atri!
>
> Thanks for reaching out. I work very closely with Lewis—I’ve been with the
> project since day one, and a few years before that :). I’m happy to help
> answer any questions you might have about the podling, where we’re strong,
> where we’re weak, and what we’d like to do with it. Feel free to take the
> conversation over to dev:
>
> dev-subscr...@senssoft.incubator.apache.org
>
> Best,
>
> Josh
>
>
>
>
> >
> >> On Nov 9, 2018, at 1:29 PM, loo...@gmail.com wrote:
> >>
> >> Hi Lewis,
> >>
> >> I'd be up for helping too.
> >>
> >> Cheers,
> >> Dave
> >>
> >> On Thu, 8 Nov 2018 at 22:06, lewis john mcgibbney 
> >> wrote:
> >>
> >>> Hi Folks,
> >>> We are looking for new motivated mentors for the Apache SensSoft
> >>> (Incubating) project [0] look at that kick ass Website :)
> >>> In a nutshell SensSoft is a generalized user behavioral logging
> >>> platform for web pages and thin-client applications.
> >>> The podling was accepted into the Apache Incubator on 2016-07-13 so has
> >>> been maturing and has made one release during that timeframe.
> >>> We have been struggling somewhat with active mentorship which has
> >>> contributed to the podling struggling with the final push through to
> >>> graduation.
> >>> Interestingly the SensSoft community is also going through the process
> >>> of a PODLINGNAMESEARCH meaning that we will no longer be SensSoft but
> >>> something else. This is an excellent time for a mentor or two to come
> >>> aboard and help us drive onwards to TLP status.
> >>> Please let us know at d...@senssoft.apache.org if you are interested,
> >>> Lewis
> >>>
> >>> [0] http://senssoft.apache.org/
> >>>
> >>> --
> >>> http://home.apache.org/~lewismc/
> >>> http://people.apache.org/keys/committer/lewismc
> >>>
> >
>
>
>
>


IPMC members not signed up to private list

2018-11-16 Thread Justin Mclean
Hi,

I’ve just sent an email out to the 111 IPMC members who are not signed up to 
the private incubator list.

Just in case you don’t get the email: if you are an IPMC member, remember that
you should be signed up to the IPMC private list.

Now this could be because you are subscribed with an address that is not in 
your LDAP record. If so you can fix this here [1].

Or if you are not subscribed please subscribe to the private list by sending an 
email to:
private-subscr...@incubator.apache.org

Or use whimsy to subscribe to the list [2],

Thanks,
Justin
V.P. Incubator

1. https://whimsy.apache.org/roster/committer/__self__
2. https://whimsy.apache.org/committers/subscribe





Re: [VOTE] Retire ODF Toolkit

2018-11-16 Thread Liang Chen
Hi

+1

Regards
Liang


Dave Fisher-5 wrote
> The ODF Toolkit community has VOTEd to Retire.
> 
> The thread:
> https://lists.apache.org/thread.html/6de81dc33cb8311a38e38b927bfa1df7290c47ba7e7b39311cd06ce6@%3Codf-dev.incubator.apache.org%3E
> 
> The results:
> https://lists.apache.org/thread.html/d4b4b33470a7bc3e70dc6192527168bb5907810045df5453c784cc93@%3Codf-dev.incubator.apache.org%3E
> 
> Note that of the 6 VOTES there were 5 from IPMC members including the
> current Mentor and two original Mentors.
> 
> This Vote will continue with the IPMC votes at 5 +1 and will be open for
> the next week.
> 
> [ ] +1 - Retire ODF Toolkit.
> [ ] -1 - Do not retire ODF Toolkit. I intend to become a Mentor and try to
> make the community happen.
> 
> Regards,
> Dave



Re: How to review so-called "binary releases"?

2018-11-16 Thread Jim Jagielski



> On Nov 15, 2018, at 2:41 AM, Bertrand Delacretaz  
> wrote:
> 
> 
> I see this as a two-level thing:
> 
> a) The source release is an Act of the Foundation, it is what the
> foundation produces
> 
> b) For the binaries, the PMC states that it thinks they are good and
> declares that the published digests and signatures are the correct
> ones. The Foundation does not state anything about them - use at your
> own risk but in practice that risk is very low if the PMC members
> collectively recommend using them.
> 
> That's not very different from what other open source projects do - we
> need a) for our legal shield but b) is exactly like random open source
> projects operate.
> 
> You have to trust an open source project when you use their binaries,
> and you can use digests and signatures to verify that those binaries
> are the same that everyone else uses - I don't think anyone provides
> more guarantees than that, except when you pay for someone to state
> that those binaries are good.
> 
> If people agree with this view we might need to explain this better,
> "unofficial" does not mean much, this two-level view might be more
> useful.

Agree 100%. Thx for very clearly and accurately describing all this.



[RESULT] [VOTE] Accept the Iceberg project for incubation

2018-11-16 Thread Ryan Blue
The vote passes with 13 binding +1 and 5 non-binding +1 votes.

Thank you for voting, everyone! I'll get started with the next steps.

+1 votes:
Ryan Blue*
Matt Sicker*
Felix Cheung
Dave Fisher*
Owen O'Malley*
Hugo Louro
Arthur Wiedmer
Julian Hyde*
Kevin A. McGrail*
Willem Jiang*
James Taylor*
Uwe Korn
Lars Francke*
Jean-Baptiste Onofré*
Olivier Lamy*
Michael Wall*
Kenneth Knowles
Julien Le Dem*

* = binding

On Tue, Nov 13, 2018 at 9:06 AM Ryan Blue  wrote:

> The discuss thread seems to have reached consensus, so I propose accepting
> the Iceberg project for incubation.
>
> The proposal is copied below and in the wiki:
> https://wiki.apache.org/incubator/IcebergProposal
>
> Please vote on whether to accept Iceberg in the next 72 hours:
>
> [ ] +1, accept Iceberg for incubation
> [ ] -1, reject the Iceberg proposal because . . .
>
> Thank you for reviewing the proposal and voting,
>
> rb
> --
> Iceberg Proposal
>
> Abstract
>
> Iceberg is a table format for large, slow-moving tabular data.
>
> It is designed to improve on the de-facto standard table layout built into
> Apache Hive, Presto, and Apache Spark.
>
> Proposal
>
> The purpose of Iceberg is to provide SQL-like tables that are backed by
> large sets of data files. Iceberg is similar to the Hive table layout, the
> de-facto standard structure used to track files in a table, but provides
> additional guarantees and performance optimizations:
>
>    - Atomicity - Each change to the table will be complete or will
>      fail. “Do or do not. There is no try.”
>    - Snapshot isolation - Reads use one and only one snapshot of a table
>      at some time without holding a lock.
>    - Safe schema evolution - A table’s schema can change in well-defined
>      ways, without breaking older data files.
>    - Column projection - An engine may request a subset of the available
>      columns, including nested fields.
>    - Predicate pushdown - An engine can push filters into read planning
>      to improve performance using partition data and file-level statistics.
>
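
The atomicity and snapshot isolation guarantees listed above can be
illustrated with a small sketch. This is NOT Iceberg's implementation. The
Table class, its methods, and the use of an in-process lock are assumptions
made only for illustration. The idea shown is that a table's state is one
immutable value, readers take a reference to it without locking, and a writer
swaps in a new value only if nothing changed since it started.

```python
import threading


class Table:
    """Toy table: the state is a single immutable (version, files) tuple."""

    def __init__(self):
        self._state = (0, ())                  # version 0, no data files
        self._commit_lock = threading.Lock()   # guards only the pointer swap

    def snapshot(self):
        """Readers take one reference; the tuple they hold never changes."""
        return self._state

    def commit(self, base_version, new_files):
        """Install a new snapshot, or fail without applying anything."""
        with self._commit_lock:
            version, files = self._state
            if base_version != version:
                return False   # another writer committed first; caller may retry
            self._state = (version + 1, files + tuple(new_files))
            return True
```

A reader that called snapshot() before a commit keeps seeing the old file
list, and a writer whose base version is stale gets False instead of a
partial change.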
> Iceberg does NOT define a new file format. All data is stored in Apache
> Avro, Apache ORC, or Apache Parquet files.
>
> Additionally, Iceberg is designed to work well when data files are stored
> in cloud blob stores, even when those systems provide weaker guarantees
> than a file system, including:
>
>    - Eventual consistency in the namespace
>    - High latency for directory listings
>    - No renames of objects
>    - No folder hierarchy
>
> Rationale
>
> Initial benchmarks show dramatic improvements in query planning. For
> example, in Netflix’s Atlas use case, which stores time-series metrics from
> Netflix runtime systems, 1 month of data is stored across 2.7 million files
> in 2,688 partitions:
>
>    - Hive table using Parquet:
>       - 400k+ splits, not combined
>       - Explain query: 9.6 minutes wall time (planning only)
>    - Iceberg table with partition filtering:
>       - 15,218 splits, combined
>       - Planning: 10 seconds
>       - Query wall time: 13 minutes
>    - Iceberg table with partition and min/max filtering:
>       - 412 splits
>       - Planning: 25 seconds
>       - Query wall time: 42 seconds
>
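
The drop in split counts reported above comes from skipping files during
planning. A rough sketch of that idea follows, assuming each data file carries
a partition value and min/max bounds for one filtered column. The DataFile
fields and the plan_files function are hypothetical names chosen for this
example and do not reflect Iceberg's actual metadata layout or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    path: str
    partition: str   # hypothetical partition value, e.g. a day string
    min_value: int   # lower bound of the filtered column within this file
    max_value: int   # upper bound of the filtered column within this file


def plan_files(files, partition, lower, upper):
    """Keep files in the partition whose [min, max] overlaps [lower, upper]."""
    selected = []
    for data_file in files:
        if data_file.partition != partition:
            continue   # partition filtering: wrong partition, skip the file
        if data_file.max_value < lower or data_file.min_value > upper:
            continue   # min/max filtering: no row in this file can match
        selected.append(data_file)
    return selected
```

Only the files that survive both checks need to be turned into splits, which
is why planning with file-level statistics can touch far fewer files than a
directory listing would.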
> These performance gains combined with the cross-engine compatibility are a
> very compelling story.
>
> Initial Goals
>
> The initial goal will be to move the existing codebase to Apache and
> integrate with the Apache development process and infrastructure. A primary
> goal of incubation will be to grow and diversify the Iceberg community. We
> are well aware that the project community is largely comprised of
> individuals from a single company. We aim to change that during incubation.
>
> Current Status
>
> As previously mentioned, Iceberg is under active development at Netflix,
> and is being used in processing large volumes of data in Amazon EC2.
>
> Iceberg license documentation is already based on Apache guidelines for
> LICENSE and NOTICE content.
>
> Meritocracy
>
> We value meritocracy and we understand that it is the basis for an open
> community that encourages multiple companies and individuals to contribute
> and be invested in the project’s future. We will encourage and monitor
> participation and make sure to extend privileges and responsibilities to
> all contributors.
>
> Community
>
> Iceberg is currently being used by developers at Netflix and a growing
> number of users are actively using it in production environments. Iceberg
> has received contributions from developers working at Hortonworks, WeWork,
> and Palantir. By bringing Iceberg to Apache we aim to assure current and
> future contributors that the Iceberg community is meritocratic and open, in
> order to broaden and diversify the user and developer community.
>
> Core Developers
>
> Iceberg was initially developed at Netflix and is under active
> development. We believe Iceberg will be of interest to a broad range of
> users and developers and that incubating the project at the ASF