Re: [VOTE] Accept Zeppelin into the Apache Incubator

2014-12-24 Thread Konstantin Boudnik
+1 (binding)

On Thu, Dec 18, 2014 at 09:29PM, Roman Shaposhnik wrote:
 Following the discussion earlier:
 http://s.apache.org/kTp
 
 I would like to call a VOTE for accepting
 Zeppelin as a new Incubator project.
 
 The proposal is available at:
 https://wiki.apache.org/incubator/ZeppelinProposal
 and is also attached to the end of this email.
 
 Vote is open until at least Sunday, 21st December 2014,
 23:59:00 PST
 
 [ ] +1 Accept Zeppelin into the Incubator
 [ ] ±0 Indifferent to the acceptance of Zeppelin
 [ ] -1 Do not accept Zeppelin because ...
 
 Thanks,
 Roman.
 
 == Abstract ==
 Zeppelin is a collaborative data analytics and visualization tool for
 distributed, general-purpose data processing systems such as Apache
 Spark, Apache Flink, etc.
 
 == Proposal ==
 Zeppelin is a modern web-based tool for data scientists to
 collaborate on large-scale data exploration and visualization
 projects. It is a notebook-style interpreter that enables users to
 share collaborative analysis sessions. Zeppelin is independent of
 the execution framework itself. The current version runs on top of
 Apache Spark, but it has pluggable interpreter APIs to support other
 data processing systems. More execution frameworks could be added at a
 later date, e.g. Apache Flink and Crunch, as well as SQL-like backends
 such as Hive, Tajo and MRQL.
 
 We have a strong preference for the project to be called Zeppelin. In
 case that may not be feasible, alternative names could be: “Mir”,
 “Yuga” or “Sora”.
 
 == Background ==
 A large-scale data analysis workflow includes multiple steps such as
 data acquisition, pre-processing and visualization, and may involve
 interoperation of multiple tools and technologies. With the
 widespread adoption of open source general-purpose data processing
 systems like Spark, there is a lack of open source, modern,
 user-friendly tools that combine the strengths of an interpreted
 language for data analysis with new in-browser visualization libraries
 and collaborative capabilities.
 
 Zeppelin initially started as a GUI tool for a diverse set of
 SQL-over-Hadoop systems like Hive, Presto, Shark, etc. It has been
 open source since its inception in Sep 2013. Later, it became clear
 that there was a need for a more general web-based tool for data
 scientists to collaborate on data exploration over large-scale
 projects, not limited to SQL. So Zeppelin integrated full support for
 Apache Spark while adding a collaborative environment with the ability
 to run and share interpreter sessions in-browser.
 
 == Rationale ==
 There are no open source alternatives for a collaborative
 notebook-based interpreter with support of multiple distributed data
 processing systems.
 
 As the number of companies adopting and contributing back to Zeppelin
 is growing, we think that a long-term home at the Apache Software
 Foundation would be a great fit for the project, ensuring that
 processes and procedures are in place to keep the project and
 community “healthy” and free of any commercial, political or legal
 faults.
 
 == Initial Goals ==
 The initial goals will be to move the existing codebase to Apache and
 integrate with the Apache development process. This includes moving
 all infrastructure that we currently maintain, such as a website, a
 mailing list, an issue tracker and a Jenkins CI, as mentioned in the
 “Required Resources” section of the current proposal.
 Once this is accomplished, we plan for incremental development and
 releases that follow the Apache guidelines.
 To increase adoption, the major goal for the project would be to
 provide integration with as many projects from the Apache data
 ecosystem as possible, including new interpreters for Apache Hive and
 Apache Drill, and adding a Zeppelin distribution to Apache Bigtop.
 On the community building side, the main goal is to attract a diverse
 set of contributors by promoting Zeppelin to a wide variety of
 engineers, starting Zeppelin user groups around the globe and
 engaging with other existing Apache project communities online.
 
 
 == Current Status ==
 Currently, Zeppelin has 4 released versions and is used in production
 at a number of companies across the globe, mentioned in the
 Affiliation section. The current implementation status is pre-release,
 with the public API not yet finalized. The current main and default
 backend processing engine is Apache Spark, with consistent support for
 SparkSQL.
 Zeppelin is distributed as a binary package which includes an embedded
 webserver, the application itself, a set of libraries and
 startup/shutdown scripts. No platform-specific installation packages
 are provided yet, but it is something we are looking to provide as
 part of Apache Bigtop integration.
 The project codebase is currently hosted at github.com, which will
 form the basis of the Apache git repository.
 
 === Meritocracy ===
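 Zeppelin is an open source project that already leverages meritocracy
 principles. It was started by a handful of people and now it has
 multiple contributors, although as the number of contributions grows we
 want to build a diverse developer and user community that is governed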

Re: [VOTE] Accept Zeppelin into the Apache Incubator

2014-12-23 Thread Roman Shaposhnik
+1 (binding)

Thanks,
Roman.




[RESULT] [VOTE] Accept Zeppelin into the Apache Incubator

2014-12-23 Thread Roman Shaposhnik
With 8 binding +1s, 4 non-binding +1s, and no ±0 or -1s, the vote
has passed. Thanks to everyone who voted! Here's the vote tally:

Binding +1:
   Ted Dunning
   Henry Saputra
   Hyunsik Choi
   Hadrian Zbarcea
   janI
   Arvind Prabhakar
   Ate Douma
   Roman Shaposhnik

Non-binding +1:
   Sharad Agarwal
   Jaideep Dhok
   Fabian Hueske
   Naresh Agarwal

I will now proceed with the next steps of establishing a podling
according to the IPMC guidelines.

Thanks,
Roman.


Re: [VOTE] Accept Zeppelin into the Apache Incubator

2014-12-20 Thread Ate Douma

+1 (binding)


Re: [VOTE] Accept Zeppelin into the Apache Incubator

2014-12-19 Thread Fabian Hueske
+1 (non-binding)

2014-12-19 7:24 GMT+01:00 Jaideep Dhok jaideep.d...@inmobi.com:

 +1 (non-binding)

 Thanks,
 Jaideep

 On Fri, Dec 19, 2014 at 11:50 AM, Hyunsik Choi hyun...@apache.org wrote:
 
  +1 (binding)
 

Re: [VOTE] Accept Zeppelin into the Apache Incubator

2014-12-19 Thread Hadrian Zbarcea

+1 (binding)



Re: [VOTE] Accept Zeppelin into the Apache Incubator

2014-12-19 Thread jan i
+1 (binding)

On 19 December 2014 at 14:09, Hadrian Zbarcea hzbar...@gmail.com wrote:

 +1 (binding)




Re: [VOTE] Accept Zeppelin into the Apache Incubator

2014-12-19 Thread Naresh Agarwal
+1 (non-binding)

Thanks
Naresh

On Fri, Dec 19, 2014 at 4:32 PM, Fabian Hueske fhue...@apache.org wrote:

 +1 (non-binding)

 2014-12-19 7:24 GMT+01:00 Jaideep Dhok jaideep.d...@inmobi.com:

  +1 (non-binding)
 
  Thanks,
  Jaideep
 
  On Fri, Dec 19, 2014 at 11:50 AM, Hyunsik Choi hyun...@apache.org wrote:
  
   +1 (binding)
  

Re: [VOTE] Accept Zeppelin into the Apache Incubator

2014-12-19 Thread Arvind Prabhakar
+1 (binding)

Regards,
Arvind Prabhakar


[VOTE] Accept Zeppelin into the Apache Incubator

2014-12-18 Thread Roman Shaposhnik
Following the discussion earlier:
http://s.apache.org/kTp

I would like to call a VOTE for accepting
Zeppelin as a new Incubator project.

The proposal is available at:
https://wiki.apache.org/incubator/ZeppelinProposal
and is also attached to the end of this email.

Vote is open until at least Sunday, 21st December 2014,
23:59:00 PST

[ ] +1 Accept Zeppelin into the Incubator
[ ] ±0 Indifferent to the acceptance of Zeppelin
[ ] -1 Do not accept Zeppelin because ...

Thanks,
Roman.

== Abstract ==
Zeppelin is a collaborative data analytics and visualization tool for
distributed, general-purpose data processing systems such as Apache
Spark, Apache Flink, etc.

== Proposal ==
Zeppelin is a modern web-based tool for data scientists to
collaborate over large-scale data exploration and visualization
projects. It is a notebook-style interpreter that lets users share
collaborative analysis sessions. Zeppelin is independent of
the execution framework itself. The current version runs on top of Apache
Spark, but it has pluggable interpreter APIs to support other data
processing systems. More execution frameworks could be added at a
later date, e.g. Apache Flink and Crunch, as well as SQL-like backends such
as Hive, Tajo, and MRQL.

We have a strong preference for the project to be called Zeppelin. In
case that may not be feasible, alternative names could be: “Mir”,
“Yuga” or “Sora”.

== Background ==
A large-scale data analysis workflow includes multiple steps such as data
acquisition, pre-processing, and visualization, and may involve
inter-operation of multiple tools and technologies. With the
widespread adoption of open source general-purpose data processing systems
like Spark, there is a lack of open source, modern, user-friendly tools
that combine the strengths of an interpreted language for data analysis with
new in-browser visualization libraries and collaborative capabilities.

Zeppelin initially started as a GUI tool for a diverse set of
SQL-over-Hadoop systems like Hive, Presto, Shark, etc. It has been open
source since its inception in Sep 2013. Later, it became clear that
there was a need for a broader web-based tool for data scientists to
collaborate on data exploration over large-scale projects, not
limited to SQL. So Zeppelin integrated full support for Apache Spark
while adding a collaborative environment with the ability to run and
share interpreter sessions in-browser.

== Rationale ==
There are no open source alternatives for a collaborative
notebook-based interpreter with support of multiple distributed data
processing systems.

As the number of companies adopting and contributing back to Zeppelin is
growing, we think that having a long-term home at the Apache Software
Foundation would be a great fit for the project, ensuring that processes and
procedures are in place to keep the project and community “healthy” and
free of any commercial, political or legal faults.

== Initial Goals ==
The initial goals will be to move the existing codebase to Apache and
integrate with the Apache development process. This includes moving
all infrastructure that we currently maintain, such as a website, a
mailing list, an issue tracker and a Jenkins CI, as mentioned in
the “Required Resources” section of this proposal.
Once this is accomplished, we plan for incremental development and
releases that follow the Apache guidelines.
To increase adoption, the major goal for the project would be to
provide integration with as many projects from the Apache data ecosystem
as possible, including new interpreters for Apache Hive, Apache Drill
and adding a Zeppelin distribution to Apache Bigtop.
On the community-building side, the main goal is to attract a diverse
set of contributors by promoting Zeppelin to a wide variety of
engineers, starting Zeppelin user groups around the globe, and
engaging with other existing Apache project communities online.


== Current Status ==
Currently, Zeppelin has 4 released versions and is used in production
at a number of companies across the globe, mentioned in the Affiliation
section. The current implementation status is pre-release, with the public
API not yet finalized. The current main and default backend processing
engine is Apache Spark, with consistent support for SparkSQL.
Zeppelin is distributed as a binary package which includes an embedded
webserver, the application itself, a set of libraries and startup/shutdown
scripts. No platform-specific installation packages are provided yet,
but it is something we are looking to provide as part of Apache Bigtop
integration.
The project codebase is currently hosted at github.com, which will form
the basis of the Apache git repository.

=== Meritocracy ===
Zeppelin is an open source project that already leverages meritocracy
principles.  It was started by a handful of people and now it has
multiple contributors, although as the number of contributions grows we
want to build a diverse developer and user community that is governed
by the Apache way. Users and new contributors will be treated 

Re: [VOTE] Accept Zeppelin into the Apache Incubator

2014-12-18 Thread Ted Dunning
+1 (binding)




Re: [VOTE] Accept Zeppelin into the Apache Incubator

2014-12-18 Thread Sharad Agarwal
+1 (non-binding)


Re: [VOTE] Accept Zeppelin into the Apache Incubator

2014-12-18 Thread Henry Saputra
+1


Re: [VOTE] Accept Zeppelin into the Apache Incubator

2014-12-18 Thread Hyunsik Choi
+1 (binding)

On Friday, December 19, 2014, Roman Shaposhnik r...@apache.org wrote:

 Following the discussion earlier:
 http://s.apache.org/kTp

 I would like to call a VOTE for accepting
 Zeppelin as a new Incubator project.

 The proposal is available at:
 https://wiki.apache.org/incubator/ZeppelinProposal
 and is also attached to the end of this email.

 Vote is open until at least Sunday, 21st December 2014,
 23:59:00 PST

 [ ] +1 Accept Zeppelin into the Incubator
 [ ] ±0 Indifferent to the acceptance of Zeppelin
 [ ] -1 Do not accept Zeppelin because ...

 Thanks,
 Roman.

 == Abstract ==
 Zeppelin is a collaborative data analytics and visualization tool for
 distributed, general-purpose data processing systems such as Apache
 Spark, Apache Flink, etc.

 == Proposal ==
 Zeppelin is a modern web-based tool for data scientists to
 collaborate over large-scale data exploration and visualization
 projects. It is a notebook-style interpreter that lets users share
 collaborative analysis sessions. Zeppelin is independent of
 the execution framework itself. The current version runs on top of Apache
 Spark, but it has pluggable interpreter APIs to support other data
 processing systems. More execution frameworks could be added at a
 later date, e.g. Apache Flink and Crunch, as well as SQL-like backends such
 as Hive, Tajo, and MRQL.

 We have a strong preference for the project to be called Zeppelin. In
 case that may not be feasible, alternative names could be: “Mir”,
 “Yuga” or “Sora”.

 == Background ==
 A large-scale data analysis workflow includes multiple steps such as data
 acquisition, pre-processing, and visualization, and may involve
 inter-operation of multiple tools and technologies. With the
 widespread adoption of open source general-purpose data processing systems
 like Spark, there is a lack of open source, modern, user-friendly tools
 that combine the strengths of an interpreted language for data analysis with
 new in-browser visualization libraries and collaborative capabilities.

 Zeppelin initially started as a GUI tool for a diverse set of
 SQL-over-Hadoop systems like Hive, Presto, Shark, etc. It has been open
 source since its inception in Sep 2013. Later, it became clear that
 there was a need for a broader web-based tool for data scientists to
 collaborate on data exploration over large-scale projects, not
 limited to SQL. So Zeppelin integrated full support for Apache Spark
 while adding a collaborative environment with the ability to run and
 share interpreter sessions in-browser.

 == Rationale ==
 There are no open source alternatives for a collaborative
 notebook-based interpreter with support for multiple distributed data
 processing systems.

 As the number of companies adopting and contributing back to Zeppelin is
 growing, we think that a long-term home at the Apache Software Foundation
 would be a great fit for the project, ensuring that processes and
 procedures are in place to keep the project and community “healthy” and
 free of any commercial, political or legal faults.

 == Initial Goals ==
 The initial goals will be to move the existing codebase to Apache and
 integrate with the Apache development process. This includes moving
 all infrastructure that we currently maintain, such as a website, a
 mailing list, an issue tracker and a Jenkins CI, as mentioned in the
 “Required Resources” section of the current proposal.
 Once this is accomplished, we plan for incremental development and
 releases that follow the Apache guidelines.
 To increase adoption, the major goal for the project would be to
 provide integration with as many projects from the Apache data ecosystem
 as possible, including new interpreters for Apache Hive and Apache Drill,
 and adding a Zeppelin distribution to Apache Bigtop.
 On the community building side, the main goal is to attract a diverse
 set of contributors by promoting Zeppelin to a wide variety of
 engineers, starting Zeppelin user groups around the globe and
 engaging with other existing Apache project communities online.


 == Current Status ==
 Currently, Zeppelin has 4 released versions and is used in production
 at a number of companies across the globe, mentioned in the Affiliation
 section. The current implementation status is pre-release, with the public
 API not yet finalized. The current main and default backend processing
 engine is Apache Spark, with consistent support for SparkSQL.
 Zeppelin is distributed as a binary package which includes an embedded
 webserver, the application itself, a set of libraries and startup/shutdown
 scripts. No platform-specific installation packages are provided yet,
 but it is something we are looking to provide as part of Apache Bigtop
 integration.
 The project codebase is currently hosted at github.com, which will form
 the basis of the Apache git repository.

 === Meritocracy ===
 Zeppelin is an open source project that already leverages meritocracy
 principles.  It was started by a handful of people and now it has
 multiple contributors, 

Re: [VOTE] Accept Zeppelin into the Apache Incubator

2014-12-18 Thread Jaideep Dhok
+1 (non-binding)

Thanks,
Jaideep

On Fri, Dec 19, 2014 at 11:50 AM, Hyunsik Choi hyun...@apache.org wrote:

 +1 (binding)

 On Friday, December 19, 2014, Roman Shaposhnik r...@apache.org wrote:

  Following the discussion earlier:
  http://s.apache.org/kTp
 
  I would like to call a VOTE for accepting
  Zeppelin as a new Incubator project.
 
  The proposal is available at:
  https://wiki.apache.org/incubator/ZeppelinProposal
  and is also attached to the end of this email.
 
  Vote is open until at least Sunday, 21st December 2014,
  23:59:00 PST
 
  [ ] +1 Accept Zeppelin into the Incubator
  [ ] ±0 Indifferent to the acceptance of Zeppelin
  [ ] -1 Do not accept Zeppelin because ...
 
  Thanks,
  Roman.
 
  == Abstract ==
  Zeppelin is a collaborative data analytics and visualization tool for
  distributed, general-purpose data processing systems such as Apache
  Spark, Apache Flink, etc.
 
  == Proposal ==
  Zeppelin is a modern web-based tool for data scientists to
  collaborate on large-scale data exploration and visualization
  projects. It is a notebook-style interpreter that enables users to
  share collaborative analysis sessions. Zeppelin is independent of
  the execution framework itself. The current version runs on top of Apache
  Spark, but it has pluggable interpreter APIs to support other data
  processing systems. More execution frameworks could be added at a
  later date, e.g. Apache Flink and Crunch, as well as SQL-like backends such
  as Hive, Tajo, and MRQL.
 
  We have a strong preference for the project to be called Zeppelin. In
  case that may not be feasible, alternative names could be: “Mir”,
  “Yuga” or “Sora”.
 
  == Background ==
  Large-scale data analysis workflows include multiple steps such as data
  acquisition, pre-processing, visualization, etc., and may involve the
  interoperation of multiple different tools and technologies. With the
  widespread adoption of open source general-purpose data processing systems
  like Spark, there is a lack of open source, modern, user-friendly tools
  that combine the strengths of an interpreted language for data analysis with
  new in-browser visualization libraries and collaborative capabilities.
 
  Zeppelin initially started as a GUI tool for a diverse set of
  SQL-over-Hadoop systems like Hive, Presto, Shark, etc. It has been open
  source since its inception in Sep 2013. Later, it became clear that
  there was a need for a more general web-based tool for data scientists to
  collaborate on data exploration over large-scale projects, not
  limited to SQL. So Zeppelin integrated full support for Apache Spark
  while adding a collaborative environment with the ability to run and
  share interpreter sessions in-browser.
 
  == Rationale ==
  There are no open source alternatives for a collaborative
  notebook-based interpreter with support for multiple distributed data
  processing systems.
 
  As the number of companies adopting and contributing back to Zeppelin is
  growing, we think that a long-term home at the Apache Software Foundation
  would be a great fit for the project, ensuring that processes and
  procedures are in place to keep the project and community “healthy” and
  free of any commercial, political or legal faults.
 
  == Initial Goals ==
  The initial goals will be to move the existing codebase to Apache and
  integrate with the Apache development process. This includes moving
  all infrastructure that we currently maintain, such as a website, a
  mailing list, an issue tracker and a Jenkins CI, as mentioned in the
  “Required Resources” section of the current proposal.
  Once this is accomplished, we plan for incremental development and
  releases that follow the Apache guidelines.
  To increase adoption, the major goal for the project would be to
  provide integration with as many projects from the Apache data ecosystem
  as possible, including new interpreters for Apache Hive and Apache Drill,
  and adding a Zeppelin distribution to Apache Bigtop.
  On the community building side, the main goal is to attract a diverse
  set of contributors by promoting Zeppelin to a wide variety of
  engineers, starting Zeppelin user groups around the globe and
  engaging with other existing Apache project communities online.
 
 
  == Current Status ==
  Currently, Zeppelin has 4 released versions and is used in production
  at a number of companies across the globe, mentioned in the Affiliation
  section. The current implementation status is pre-release, with the public
  API not yet finalized. The current main and default backend processing
  engine is Apache Spark, with consistent support for SparkSQL.
  Zeppelin is distributed as a binary package which includes an embedded
  webserver, the application itself, a set of libraries and startup/shutdown
  scripts. No platform-specific installation packages are provided yet,
  but it is something we are looking to provide as part of Apache Bigtop
  integration.
  The project codebase is currently hosted at github.com, which will form
  the basis of the