Re: [RESULT] Re: [VOTE] Livy to enter Apache Incubator

2017-06-05 Thread Bikas Saha
Hi,

I am sorry I was off email when all of this happened. I would like to add my
post-result +1 to join everyone in supporting this effort!

Thanks!
Bikas


From: Sean Busbey <bus...@apache.org>
Sent: Monday, June 5, 2017 10:34:45 AM
To: general@incubator.apache.org
Subject: [RESULT] Re: [VOTE] Livy to enter Apache Incubator

With 7 binding +1 votes (and 14 non-binding +1 votes), this vote passes.

Thanks to everyone who took the time to vote!

I'll coordinate with the mentors to start the initial paperwork today. (And 
we'd be thrilled for the 4th mentor JB!)

binding:
Larry McCay
Jean-Baptiste Onofré
Luciano Resende
Andrew Purtell
Brock Noland
Raphael Bircher
Hitesh Shah

non-binding:
Sean Busbey
Ismaël Mejía
Marcelo Vanzin
Neelesh Salian
Phillip Rhodes
Kostas Sakellis
Jeff Zhang
Saisai Shao
tim shea
Pierre Smits
Arpit Agarwal
Madhawa Kasun Gunasekara
Bruno Mahé
Felix Cheung

-busbey

On 2017-05-31 08:03 (-0500), "Sean Busbey"<bus...@apache.org> wrote:
> Hi folks!
>
> I'm calling a vote to accept "Livy" into the Apache Incubator.
>
> The full proposal is available below, and is also available in the wiki:
>
> https://wiki.apache.org/incubator/LivyProposal
>
> For additional context, please see the discussion thread:
>
> https://s.apache.org/incubator-livy-proposal-thread
>
> Please cast your vote:
>
> [ ] +1, bring Livy into Incubator
> [ ] -1, do not bring Livy into Incubator, because...
>
> The vote will open at least for 72 hours and only votes from the Incubator
> PMC are binding.
>
> I start with my vote:
> +1
>


-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org



[RESULT] Re: [VOTE] Livy to enter Apache Incubator

2017-06-05 Thread Sean Busbey
With 7 binding +1 votes (and 14 non-binding +1 votes), this vote passes.

Thanks to everyone who took the time to vote!

I'll coordinate with the mentors to start the initial paperwork today. (And 
we'd be thrilled for the 4th mentor JB!)

binding:
Larry McCay
Jean-Baptiste Onofré
Luciano Resende
Andrew Purtell
Brock Noland
Raphael Bircher
Hitesh Shah

non-binding:
Sean Busbey
Ismaël Mejía
Marcelo Vanzin
Neelesh Salian
Phillip Rhodes
Kostas Sakellis
Jeff Zhang
Saisai Shao
tim shea
Pierre Smits
Arpit Agarwal
Madhawa Kasun Gunasekara
Bruno Mahé
Felix Cheung

-busbey

On 2017-05-31 08:03 (-0500), "Sean Busbey" wrote: 
> Hi folks!
> 
> I'm calling a vote to accept "Livy" into the Apache Incubator.
> 
> The full proposal is available below, and is also available in the wiki:
> 
> https://wiki.apache.org/incubator/LivyProposal
> 
> For additional context, please see the discussion thread:
> 
> https://s.apache.org/incubator-livy-proposal-thread
> 
> Please cast your vote:
> 
> [ ] +1, bring Livy into Incubator
> [ ] -1, do not bring Livy into Incubator, because...
> 
> The vote will open at least for 72 hours and only votes from the Incubator
> PMC are binding.
> 
> I start with my vote:
> +1
> 





Re: [VOTE] Livy to enter Apache Incubator

2017-06-02 Thread Felix Cheung
+1

On Thu, Jun 1, 2017 at 11:14 AM Hitesh Shah  wrote:

> +1
>
> -- Hitesh
>
> On Wed, May 31, 2017 at 6:03 AM, Sean Busbey  wrote:
>
> > Hi folks!
> >
> > I'm calling a vote to accept "Livy" into the Apache Incubator.
> >
> > The full proposal is available below, and is also available in the wiki:
> >
> > https://wiki.apache.org/incubator/LivyProposal
> >
> > For additional context, please see the discussion thread:
> >
> > https://s.apache.org/incubator-livy-proposal-thread
> >
> > Please cast your vote:
> >
> > [ ] +1, bring Livy into Incubator
> > [ ] -1, do not bring Livy into Incubator, because...
> >
> > The vote will open at least for 72 hours and only votes from the Incubator
> > PMC are binding.
> >
> > I start with my vote:
> > +1
> >
> >
> >
> > = Abstract =
> >
> > Livy is web service that exposes a REST interface for managing long running
> > Apache Spark contexts in your cluster. With Livy, new applications can be
> > built on top of Apache Spark that require fine grained interaction with many
> > Spark contexts.
> >
> > = Proposal =
> >
> > Livy is an open-source REST service for Apache Spark. Livy enables
> > applications to submit Spark applications and retrieve results without a
> > co-location requirement on the Spark cluster.
> >
> > We propose to contribute the Livy codebase and associated artifacts (e.g.
> > documentation, web-site context etc) to the Apache Software Foundation.
> >
> > = Background =
> >
> > Apache Spark is a fast and general purpose distributed compute engine, with
> > a versatile API. It enables processing of large quantities of static data
> > distributed over a cluster of machines, as well as processing of continuous
> > streams of data. It is the preferred distributed data processing engine for
> > data engineering, stream processing and data science workloads. Each Spark
> > application uses a construct called the SparkContext, which is the
> > application’s connection or entry point to the Spark engine. Each Spark
> > application will have its own SparkContext.
> >
> > Livy enables clients to interact with one or more Spark sessions through the
> > Livy Server, which acts as a proxy layer. Livy Clients have fine grained
> > control over the lifecycle of the Spark sessions, as well as the ability to
> > submit jobs and retrieve results, all over HTTP. Clients have two modes of
> > interaction: RPC Client API, available in Java and Python, which allows
> > results to be retrieved as Java or Python objects. The serialization and
> > deserialization of the results is handled by the Livy framework. HTTP based
> > API that allows submission of code snippets, and retrieval of the results in
> > different formats.
> >
> > Multi-tenant resource allocation and security: Livy enables multiple
> > independent Spark sessions to be managed simultaneously. Multiple clients
> > can also interact simultaneously with the same Spark session and share the
> > resources of that Spark session. Livy can also enforce secure, authenticated
> > communication between the clients and their respective Spark sessions.
> >
> > More information on Livy can be found at the existing open source website:
> > http://livy.io/
> >
> > = Rationale =
> >
> > Users want to use Spark’s powerful processing engine and API as the data
> > processing backend for interactive applications. However, the job submission
> > and application interaction mechanisms built into Apache Spark are
> > insufficient and cumbersome for multi-user interactive applications.
> >
> > The primary mechanism for applications to submit Spark jobs is via
> > spark-submit
> > (http://spark.apache.org/docs/latest/submitting-applications.html), which is
> > available as a command line tool as well as a programmatic API. However,
> > spark-submit has the following limitations that make it difficult to build
> > interactive applications: It is slow: each invocation of spark-submit
> > involves a setup phase where cluster resources are acquired, new processes
> > are forked, etc. This setup phase runs for many seconds, or even minutes,
> > and hence is too slow for interactive applications. It is cumbersome and
> > lacks flexibility: application code and dependencies have to be pre-compiled
> > and submitted as jars, and can not be submitted interactively.
> >
> > Apache Spark comes with an ODBC/JDBC server, which can be used to submit SQL
> > queries to Spark. However, this solution is limited to SQL and does not
> > allow the client to leverage the rest of the Spark API, such as RDDs, MLlib
> > and Streaming.
> >
> > A third way of using Spark is via its command-line shell, which allows the
> > interactive submission of snippets of Spark code. However, the shell entails
> > running Spark code on the client machine and hence is not a viable mechanism
> > for remote clients to submit Spark jobs.
> >
> > Livy solves the

Re: [VOTE] Livy to enter Apache Incubator

2017-06-01 Thread Hitesh Shah
+1

-- Hitesh

On Wed, May 31, 2017 at 6:03 AM, Sean Busbey  wrote:

> Hi folks!
>
> I'm calling a vote to accept "Livy" into the Apache Incubator.
>
> The full proposal is available below, and is also available in the wiki:
>
> https://wiki.apache.org/incubator/LivyProposal
>
> For additional context, please see the discussion thread:
>
> https://s.apache.org/incubator-livy-proposal-thread
>
> Please cast your vote:
>
> [ ] +1, bring Livy into Incubator
> [ ] -1, do not bring Livy into Incubator, because...
>
> The vote will open at least for 72 hours and only votes from the Incubator
> PMC are binding.
>
> I start with my vote:
> +1
>
> 
>
> = Abstract =
>
> Livy is web service that exposes a REST interface for managing long running
> Apache Spark contexts in your cluster. With Livy, new applications can be
> built on top of Apache Spark that require fine grained interaction with
> many
> Spark contexts.
>
> = Proposal =
>
> Livy is an open-source REST service for Apache Spark. Livy enables
> applications to submit Spark applications and retrieve results without a
> co-location requirement on the Spark cluster.
>
> We propose to contribute the Livy codebase and associated artifacts (e.g.
> documentation, web-site context etc) to the Apache Software Foundation.
>
> = Background =
>
> Apache Spark is a fast and general purpose distributed compute engine, with
> a versatile API. It enables processing of large quantities of static data
> distributed over a cluster of machines, as well as processing of continuous
> streams of data. It is the preferred distributed data processing engine for
> data engineering, stream processing and data science workloads. Each Spark
> application uses a construct called the SparkContext, which is the
> application’s connection or entry point to the Spark engine. Each Spark
> application will have its own SparkContext.
>
> Livy enables clients to interact with one or more Spark sessions through
> the
> Livy Server, which acts as a proxy layer. Livy Clients have fine grained
> control over the lifecycle of the Spark sessions, as well as the ability to
> submit jobs and retrieve results, all over HTTP. Clients have two modes of
> interaction: RPC Client API, available in Java and Python, which allows
> results to be retrieved as Java or Python objects. The serialization and
> deserialization of the results is handled by the Livy framework. HTTP based
> API that allows submission of code snippets, and retrieval of the results
> in
> different formats.
>
> Multi-tenant resource allocation and security: Livy enables multiple
> independent Spark sessions to be managed simultaneously. Multiple clients
> can also interact simultaneously with the same Spark session and share the
> resources of that Spark session. Livy can also enforce secure,
> authenticated
> communication between the clients and their respective Spark sessions.
>
> More information on Livy can be found at the existing open source website:
> http://livy.io/
>
> = Rationale =
>
> Users want to use Spark’s powerful processing engine and API as the data
> processing backend for interactive applications. However, the job
> submission
> and application interaction mechanisms built into Apache Spark are
> insufficient and cumbersome for multi-user interactive applications.
>
> The primary mechanism for applications to submit Spark jobs is via
> spark-submit
> (http://spark.apache.org/docs/latest/submitting-applications.html), which
> is
> available as a command line tool as well as a programmatic API. However,
> spark-submit has the following limitations that make it difficult to build
> interactive applications: It is slow: each invocation of spark-submit
> involves a setup phase where cluster resources are acquired, new processes
> are forked, etc. This setup phase runs for many seconds, or even minutes,
> and hence is too slow for interactive applications. It is cumbersome and
> lacks flexibility: application code and dependencies have to be
> pre-compiled
> and submitted as jars, and can not be submitted interactively.
>
> Apache Spark comes with an ODBC/JDBC server, which can be used to submit
> SQL
> queries to Spark. However, this solution is limited to SQL and does not
> allow the client to leverage the rest of the Spark API, such as RDDs, MLlib
> and Streaming.
>
> A third way of using Spark is via its command-line shell, which allows the
> interactive submission of snippets of Spark code. However, the shell
> entails
> running Spark code on the client machine and hence is not a viable
> mechanism
> for remote clients to submit Spark jobs.
>
> Livy solves the limitations of the above three mechanisms, and provides the
> full Spark API as a multi-tenant service to remote clients.
>
> Since the open source release of Livy in late 2015, we have seen tremendous
> interest among a diverse set of application developers and ISVs that want
> to
> build applications with Apache Spark. To make Livy a 

Re: [VOTE] Livy to enter Apache Incubator

2017-06-01 Thread Pierre Smits
+1 (from the cheap seats).

Best regards,

Pierre Smits

ORRTIZ.COM 
OFBiz based solutions & services

OFBiz Extensions Marketplace
http://oem.ofbizci.net/oci-2/

On Wed, May 31, 2017 at 10:18 PM, tim shea  wrote:

> +1 (non-binding)
>
> Great project (and I've used it).
>
>
> On 5/31/17 11:59 AM, Kostas Sakellis wrote:
>
>> +1 (non-binding)
>>
>> On Wed, May 31, 2017 at 11:46 AM, Andrew Purtell <
>> andrew.purt...@gmail.com>
>> wrote:
>>
>>> +1 (binding)
>>>
>>> On May 31, 2017, at 6:03 AM, Sean Busbey  wrote:
>>>
>>>> Hi folks!
>>>>
>>>> I'm calling a vote to accept "Livy" into the Apache Incubator.
>>>>
>>>> The full proposal is available below, and is also available in the wiki:
>>>>
>>>> https://wiki.apache.org/incubator/LivyProposal
>>>>
>>>> For additional context, please see the discussion thread:
>>>>
>>>> https://s.apache.org/incubator-livy-proposal-thread
>>>>
>>>> Please cast your vote:
>>>>
>>>> [ ] +1, bring Livy into Incubator
>>>> [ ] -1, do not bring Livy into Incubator, because...
>>>>
>>>> The vote will open at least for 72 hours and only votes from the Incubator
>>>> PMC are binding.
>>>>
>>>> I start with my vote:
>>>> +1
>>>>
>>>>
>>>>
>>>> = Abstract =
>>>>
>>>> Livy is web service that exposes a REST interface for managing long running
>>>> Apache Spark contexts in your cluster. With Livy, new applications can be
>>>> built on top of Apache Spark that require fine grained interaction with many
>>>> Spark contexts.
>>>>
>>>> = Proposal =
>>>>
>>>> Livy is an open-source REST service for Apache Spark. Livy enables
>>>> applications to submit Spark applications and retrieve results without a
>>>> co-location requirement on the Spark cluster.
>>>>
>>>> We propose to contribute the Livy codebase and associated artifacts (e.g.
>>>> documentation, web-site context etc) to the Apache Software Foundation.
>>>>
>>>> = Background =
>>>>
>>>> Apache Spark is a fast and general purpose distributed compute engine, with
>>>> a versatile API. It enables processing of large quantities of static data
>>>> distributed over a cluster of machines, as well as processing of continuous
>>>> streams of data. It is the preferred distributed data processing engine for
>>>> data engineering, stream processing and data science workloads. Each Spark
>>>> application uses a construct called the SparkContext, which is the
>>>> application’s connection or entry point to the Spark engine. Each Spark
>>>> application will have its own SparkContext.
>>>>
>>>> Livy enables clients to interact with one or more Spark sessions through the
>>>> Livy Server, which acts as a proxy layer. Livy Clients have fine grained
>>>> control over the lifecycle of the Spark sessions, as well as the ability to
>>>> submit jobs and retrieve results, all over HTTP. Clients have two modes of
>>>> interaction: RPC Client API, available in Java and Python, which allows
>>>> results to be retrieved as Java or Python objects. The serialization and
>>>> deserialization of the results is handled by the Livy framework. HTTP based
>>>> API that allows submission of code snippets, and retrieval of the results in
>>>> different formats.
>>>>
>>>> Multi-tenant resource allocation and security: Livy enables multiple
>>>> independent Spark sessions to be managed simultaneously. Multiple clients
>>>> can also interact simultaneously with the same Spark session and share the
>>>> resources of that Spark session. Livy can also enforce secure, authenticated
>>>> communication between the clients and their respective Spark sessions.
>>>>
>>>> More information on Livy can be found at the existing open source website:
>>>> http://livy.io/
>>>>
>>>> = Rationale =
>>>>
>>>> Users want to use Spark’s powerful processing engine and API as the data
>>>> processing backend for interactive applications. However, the job submission
>>>> and application interaction mechanisms built into Apache Spark are
>>>> insufficient and cumbersome for multi-user interactive applications.
>>>>
>>>> The primary mechanism for applications to submit Spark jobs is via
>>>> spark-submit
>>>> (http://spark.apache.org/docs/latest/submitting-applications.html), which is
>>>> available as a command line tool as well as a programmatic API. However,
>>>> spark-submit has the following limitations that make it difficult to build
>>>> interactive applications: It is slow: each invocation of spark-submit
>>>> involves a setup phase where cluster resources are acquired, new processes
>>>> are forked, etc. This setup phase runs for many seconds, or even minutes,
>>>> and hence is too slow for interactive applications. It is cumbersome and
>>>> lacks flexibility:

Re: [VOTE] Livy to enter Apache Incubator

2017-06-01 Thread tim shea

+1 (non-binding)

Great project (and I've used it).

On 5/31/17 11:59 AM, Kostas Sakellis wrote:

> +1 (non-binding)
>
> On Wed, May 31, 2017 at 11:46 AM, Andrew Purtell wrote:
>
>> +1 (binding)
>>
>> On May 31, 2017, at 6:03 AM, Sean Busbey  wrote:
>>
>>> Hi folks!
>>>
>>> I'm calling a vote to accept "Livy" into the Apache Incubator.
>>>
>>> The full proposal is available below, and is also available in the wiki:
>>>
>>> https://wiki.apache.org/incubator/LivyProposal
>>>
>>> For additional context, please see the discussion thread:
>>>
>>> https://s.apache.org/incubator-livy-proposal-thread
>>>
>>> Please cast your vote:
>>>
>>> [ ] +1, bring Livy into Incubator
>>> [ ] -1, do not bring Livy into Incubator, because...
>>>
>>> The vote will open at least for 72 hours and only votes from the Incubator
>>> PMC are binding.
>>>
>>> I start with my vote:
>>> +1
>>>
>>>
>>>
>>> = Abstract =
>>>
>>> Livy is web service that exposes a REST interface for managing long running
>>> Apache Spark contexts in your cluster. With Livy, new applications can be
>>> built on top of Apache Spark that require fine grained interaction with many
>>> Spark contexts.
>>>
>>> = Proposal =
>>>
>>> Livy is an open-source REST service for Apache Spark. Livy enables
>>> applications to submit Spark applications and retrieve results without a
>>> co-location requirement on the Spark cluster.
>>>
>>> We propose to contribute the Livy codebase and associated artifacts (e.g.
>>> documentation, web-site context etc) to the Apache Software Foundation.
>>>
>>> = Background =
>>>
>>> Apache Spark is a fast and general purpose distributed compute engine, with
>>> a versatile API. It enables processing of large quantities of static data
>>> distributed over a cluster of machines, as well as processing of continuous
>>> streams of data. It is the preferred distributed data processing engine for
>>> data engineering, stream processing and data science workloads. Each Spark
>>> application uses a construct called the SparkContext, which is the
>>> application’s connection or entry point to the Spark engine. Each Spark
>>> application will have its own SparkContext.
>>>
>>> Livy enables clients to interact with one or more Spark sessions through the
>>> Livy Server, which acts as a proxy layer. Livy Clients have fine grained
>>> control over the lifecycle of the Spark sessions, as well as the ability to
>>> submit jobs and retrieve results, all over HTTP. Clients have two modes of
>>> interaction: RPC Client API, available in Java and Python, which allows
>>> results to be retrieved as Java or Python objects. The serialization and
>>> deserialization of the results is handled by the Livy framework. HTTP based
>>> API that allows submission of code snippets, and retrieval of the results in
>>> different formats.
>>>
>>> Multi-tenant resource allocation and security: Livy enables multiple
>>> independent Spark sessions to be managed simultaneously. Multiple clients
>>> can also interact simultaneously with the same Spark session and share the
>>> resources of that Spark session. Livy can also enforce secure, authenticated
>>> communication between the clients and their respective Spark sessions.
>>>
>>> More information on Livy can be found at the existing open source website:
>>> http://livy.io/
>>>
>>> = Rationale =
>>>
>>> Users want to use Spark’s powerful processing engine and API as the data
>>> processing backend for interactive applications. However, the job submission
>>> and application interaction mechanisms built into Apache Spark are
>>> insufficient and cumbersome for multi-user interactive applications.
>>>
>>> The primary mechanism for applications to submit Spark jobs is via
>>> spark-submit
>>> (http://spark.apache.org/docs/latest/submitting-applications.html), which is
>>> available as a command line tool as well as a programmatic API. However,
>>> spark-submit has the following limitations that make it difficult to build
>>> interactive applications: It is slow: each invocation of spark-submit
>>> involves a setup phase where cluster resources are acquired, new processes
>>> are forked, etc. This setup phase runs for many seconds, or even minutes,
>>> and hence is too slow for interactive applications. It is cumbersome and
>>> lacks flexibility: application code and dependencies have to be pre-compiled
>>> and submitted as jars, and can not be submitted interactively.
>>>
>>> Apache Spark comes with an ODBC/JDBC server, which can be used to submit SQL
>>> queries to Spark. However, this solution is limited to SQL and does not
>>> allow the client to leverage the rest of the Spark API, such as RDDs, MLlib
>>> and Streaming.
>>>
>>> A third way of using Spark is via its command-line shell, which allows the
>>> interactive submission of snippets of Spark code. However, the shell entails
>>> running Spark code on the client machine and hence is not a viable mechanism
>>> for remote clients to submit Spark jobs.
>>>
>>> Livy solves the limitations of the above three mechanisms, and provides the
>>> full Spark API as a multi-tenant service to remote clients.
>>>
>>> Since the open source release of Livy in late 2015, we have seen tremendous
>>> interest among a diverse set of application developers and ISVs that want to
>>> build

Re: [VOTE] Livy to enter Apache Incubator

2017-06-01 Thread Alex Bozarth


+1 (non-binding)

On 2017-05-31 17:27 (-0700), "Saisai sh...@gmail.com> wrote:
> +1 (non-binding)


   
Alex Bozarth
Software Engineer
Spark Technology Center

E-mail: ajboz...@us.ibm.com
GitHub: github.com/ajbozarth

505 Howard Street
San Francisco, CA 94105
United States






Re: [VOTE] Livy to enter Apache Incubator

2017-06-01 Thread Bruno Mahé

+1 (non binding)


Thanks,
Bruno

On 05/31/2017 06:03 AM, Sean Busbey wrote:

Hi folks!

I'm calling a vote to accept "Livy" into the Apache Incubator.

The full proposal is available below, and is also available in the wiki:

https://wiki.apache.org/incubator/LivyProposal

For additional context, please see the discussion thread:

https://s.apache.org/incubator-livy-proposal-thread

Please cast your vote:

[ ] +1, bring Livy into Incubator
[ ] -1, do not bring Livy into Incubator, because...

The vote will open at least for 72 hours and only votes from the Incubator
PMC are binding.

I start with my vote:
+1



= Abstract =

Livy is web service that exposes a REST interface for managing long running
Apache Spark contexts in your cluster. With Livy, new applications can be
built on top of Apache Spark that require fine grained interaction with many
Spark contexts.

= Proposal =

Livy is an open-source REST service for Apache Spark. Livy enables
applications to submit Spark applications and retrieve results without a
co-location requirement on the Spark cluster.

We propose to contribute the Livy codebase and associated artifacts (e.g.
documentation, web-site context etc) to the Apache Software Foundation.

= Background =

Apache Spark is a fast and general purpose distributed compute engine, with
a versatile API. It enables processing of large quantities of static data
distributed over a cluster of machines, as well as processing of continuous
streams of data. It is the preferred distributed data processing engine for
data engineering, stream processing and data science workloads. Each Spark
application uses a construct called the SparkContext, which is the
application’s connection or entry point to the Spark engine. Each Spark
application will have its own SparkContext.

Livy enables clients to interact with one or more Spark sessions through the
Livy Server, which acts as a proxy layer. Livy Clients have fine grained
control over the lifecycle of the Spark sessions, as well as the ability to
submit jobs and retrieve results, all over HTTP. Clients have two modes of
interaction: RPC Client API, available in Java and Python, which allows
results to be retrieved as Java or Python objects. The serialization and
deserialization of the results is handled by the Livy framework. HTTP based
API that allows submission of code snippets, and retrieval of the results in
different formats.

Multi-tenant resource allocation and security: Livy enables multiple
independent Spark sessions to be managed simultaneously. Multiple clients
can also interact simultaneously with the same Spark session and share the
resources of that Spark session. Livy can also enforce secure, authenticated
communication between the clients and their respective Spark sessions.

More information on Livy can be found at the existing open source website:
http://livy.io/

= Rationale =

Users want to use Spark’s powerful processing engine and API as the data
processing backend for interactive applications. However, the job submission
and application interaction mechanisms built into Apache Spark are
insufficient and cumbersome for multi-user interactive applications.

The primary mechanism for applications to submit Spark jobs is via
spark-submit
(http://spark.apache.org/docs/latest/submitting-applications.html), which is
available as a command line tool as well as a programmatic API. However,
spark-submit has the following limitations that make it difficult to build
interactive applications: It is slow: each invocation of spark-submit
involves a setup phase where cluster resources are acquired, new processes
are forked, etc. This setup phase runs for many seconds, or even minutes,
and hence is too slow for interactive applications. It is cumbersome and
lacks flexibility: application code and dependencies have to be pre-compiled
and submitted as jars, and can not be submitted interactively.

Apache Spark comes with an ODBC/JDBC server, which can be used to submit SQL
queries to Spark. However, this solution is limited to SQL and does not
allow the client to leverage the rest of the Spark API, such as RDDs, MLlib
and Streaming.

A third way of using Spark is via its command-line shell, which allows the
interactive submission of snippets of Spark code. However, the shell entails
running Spark code on the client machine and hence is not a viable mechanism
for remote clients to submit Spark jobs.

Livy solves the limitations of the above three mechanisms, and provides the
full Spark API as a multi-tenant service to remote clients.

Since the open source release of Livy in late 2015, we have seen tremendous
interest among a diverse set of application developers and ISVs that want to
build applications with Apache Spark. To make Livy a robust and flexible
solution that will enable a broad and growing set of applications, it is
important to grow a large and varied community of contributors.

= Initial Goals =

   * Move existing codebase, 

Re: [VOTE] Livy to enter Apache Incubator

2017-05-31 Thread Madhawa Kasun Gunasekara
+1 (non binding)

Madhawa

On Thu, Jun 1, 2017 at 6:23 AM, Raphael Bircher 
wrote:

> +1 (binding)
>
>
> On .05.2017 at 15:03, Sean Busbey wrote:
>
>> Hi folks!
>>
>> I'm calling a vote to accept "Livy" into the Apache Incubator.
>>
>> The full proposal is available below, and is also available in the wiki:
>>
>> https://wiki.apache.org/incubator/LivyProposal
>>
>> For additional context, please see the discussion thread:
>>
>> https://s.apache.org/incubator-livy-proposal-thread
>>
>> Please cast your vote:
>>
>> [ ] +1, bring Livy into Incubator
>> [ ] -1, do not bring Livy into Incubator, because...
>>
>> The vote will open at least for 72 hours and only votes from the Incubator
>> PMC are binding.
>>
>> I start with my vote:
>> +1
>>
>> 
>>
>> = Abstract =
>>
>> Livy is web service that exposes a REST interface for managing long
>> running
>> Apache Spark contexts in your cluster. With Livy, new applications can be
>> built on top of Apache Spark that require fine grained interaction with
>> many
>> Spark contexts.
>>
>> = Proposal =
>>
>> Livy is an open-source REST service for Apache Spark. Livy enables
>> applications to submit Spark applications and retrieve results without a
>> co-location requirement on the Spark cluster.
>>
>> We propose to contribute the Livy codebase and associated artifacts (e.g.
>> documentation, web-site context etc) to the Apache Software Foundation.
>>
>> = Background =
>>
>> Apache Spark is a fast and general purpose distributed compute engine,
>> with
>> a versatile API. It enables processing of large quantities of static data
>> distributed over a cluster of machines, as well as processing of
>> continuous
>> streams of data. It is the preferred distributed data processing engine
>> for
>> data engineering, stream processing and data science workloads. Each Spark
>> application uses a construct called the SparkContext, which is the
>> application’s connection or entry point to the Spark engine. Each Spark
>> application will have its own SparkContext.
>>
>> Livy enables clients to interact with one or more Spark sessions through
>> the
>> Livy Server, which acts as a proxy layer. Livy Clients have fine grained
>> control over the lifecycle of the Spark sessions, as well as the ability
>> to
>> submit jobs and retrieve results, all over HTTP. Clients have two modes of
>> interaction: RPC Client API, available in Java and Python, which allows
>> results to be retrieved as Java or Python objects. The serialization and
>> deserialization of the results is handled by the Livy framework. HTTP
>> based
>> API that allows submission of code snippets, and retrieval of the results
>> in
>> different formats.
>>
>> Multi-tenant resource allocation and security: Livy enables multiple
>> independent Spark sessions to be managed simultaneously. Multiple clients
>> can also interact simultaneously with the same Spark session and share the
>> resources of that Spark session. Livy can also enforce secure,
>> authenticated
>> communication between the clients and their respective Spark sessions.
>>
>> More information on Livy can be found at the existing open source website:
>> http://livy.io/
>>
>> = Rationale =
>>
>> Users want to use Spark’s powerful processing engine and API as the data
>> processing backend for interactive applications. However, the job
>> submission
>> and application interaction mechanisms built into Apache Spark are
>> insufficient and cumbersome for multi-user interactive applications.
>>
>> The primary mechanism for applications to submit Spark jobs is via
>> spark-submit
>> (http://spark.apache.org/docs/latest/submitting-applications.html), which is
>> available as a command line tool as well as a programmatic API. However,
>> spark-submit has the following limitations that make it difficult to build
>> interactive applications:
>>  * It is slow: each invocation of spark-submit involves a setup phase where
>>    cluster resources are acquired, new processes are forked, etc. This setup
>>    phase runs for many seconds, or even minutes, and hence is too slow for
>>    interactive applications.
>>  * It is cumbersome and lacks flexibility: application code and dependencies
>>    have to be pre-compiled and submitted as jars, and can not be submitted
>>    interactively.
>>
>> Apache Spark comes with an ODBC/JDBC server, which can be used to submit SQL
>> queries to Spark. However, this solution is limited to SQL and does not
>> allow the client to leverage the rest of the Spark API, such as RDDs, MLlib
>> and Streaming.
>>
>> A third way of using Spark is via its command-line shell, which allows the
>> interactive submission of snippets of Spark code. However, the shell entails
>> running Spark code on the client machine and hence is not a viable mechanism
>> for remote clients to submit Spark jobs.
>>
>> Livy solves the limitations of the above three mechanisms, and provides the
>> full Spark API as a 

Re: [VOTE] Livy to enter Apache Incubator

2017-05-31 Thread Raphael Bircher

+1 (binding)

On .05.2017 at 15:03, Sean Busbey wrote:


Hi folks!

I'm calling a vote to accept "Livy" into the Apache Incubator.

The full proposal is available below, and is also available in the wiki:

https://wiki.apache.org/incubator/LivyProposal

For additional context, please see the discussion thread:

https://s.apache.org/incubator-livy-proposal-thread

Please cast your vote:

[ ] +1, bring Livy into Incubator
[ ] -1, do not bring Livy into Incubator, because...

The vote will open at least for 72 hours and only votes from the Incubator
PMC are binding.

I start with my vote:
+1



= Abstract =

Livy is a web service that exposes a REST interface for managing long running
Apache Spark contexts in your cluster. With Livy, new applications can be
built on top of Apache Spark that require fine grained interaction with many
Spark contexts.

= Proposal =

Livy is an open-source REST service for Apache Spark. Livy enables
applications to submit Spark applications and retrieve results without a
co-location requirement on the Spark cluster.

We propose to contribute the Livy codebase and associated artifacts (e.g.
documentation, web-site content, etc.) to the Apache Software Foundation.

= Background =

Apache Spark is a fast and general purpose distributed compute engine, with
a versatile API. It enables processing of large quantities of static data
distributed over a cluster of machines, as well as processing of continuous
streams of data. It is the preferred distributed data processing engine for
data engineering, stream processing and data science workloads. Each Spark
application uses a construct called the SparkContext, which is the
application’s connection or entry point to the Spark engine. Each Spark
application will have its own SparkContext.

Livy enables clients to interact with one or more Spark sessions through the
Livy Server, which acts as a proxy layer. Livy Clients have fine grained
control over the lifecycle of the Spark sessions, as well as the ability to
submit jobs and retrieve results, all over HTTP. Clients have two modes of
interaction:
 * RPC Client API, available in Java and Python, which allows results to be
   retrieved as Java or Python objects. The serialization and
   deserialization of the results is handled by the Livy framework.
 * HTTP based API that allows submission of code snippets, and retrieval of
   the results in different formats.
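
As a rough illustration of the HTTP based mode described above, the Python
sketch below builds (but does not send) the requests a client might issue to
create a session, submit a code snippet, poll for its result and shut the
session down. The /sessions and /sessions/{id}/statements paths and the
"kind" and "code" fields are taken from Livy's REST API; the server address,
the helper names and the sample snippet are assumptions made for this
example only, not part of the proposal.

```python
import json
from urllib.request import Request

# Assumed address of a Livy server; adjust for a real deployment.
LIVY_URL = "http://localhost:8998"


def build_request(method, path, payload=None):
    """Build (but do not send) a JSON request for a Livy server."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    return Request(
        LIVY_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )


def create_session(kind="spark"):
    # Ask the server to start a long running interactive Spark session.
    return build_request("POST", "/sessions", {"kind": kind})


def submit_statement(session_id, code):
    # Submit a code snippet for execution inside an existing session.
    return build_request(
        "POST", f"/sessions/{session_id}/statements", {"code": code}
    )


def get_statement(session_id, statement_id):
    # Poll for the state and output of a previously submitted snippet.
    return build_request(
        "GET", f"/sessions/{session_id}/statements/{statement_id}"
    )


def delete_session(session_id):
    # Shut the session down and release its cluster resources.
    return build_request("DELETE", f"/sessions/{session_id}")
```

Passing any of these objects to urllib.request.urlopen would send it to a
running server. Because the session outlives each individual request, several
snippets can be submitted to the same Spark context without paying the
start-up cost again.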

Multi-tenant resource allocation and security: Livy enables multiple
independent Spark sessions to be managed simultaneously. Multiple clients
can also interact simultaneously with the same Spark session and share the
resources of that Spark session. Livy can also enforce secure, authenticated
communication between the clients and their respective Spark sessions.

More information on Livy can be found at the existing open source website:
http://livy.io/

= Rationale =

Users want to use Spark’s powerful processing engine and API as the data
processing backend for interactive applications. However, the job submission
and application interaction mechanisms built into Apache Spark are
insufficient and cumbersome for multi-user interactive applications.

The primary mechanism for applications to submit Spark jobs is via
spark-submit
(http://spark.apache.org/docs/latest/submitting-applications.html), which is
available as a command line tool as well as a programmatic API. However,
spark-submit has the following limitations that make it difficult to build
interactive applications:
 * It is slow: each invocation of spark-submit involves a setup phase where
   cluster resources are acquired, new processes are forked, etc. This setup
   phase runs for many seconds, or even minutes, and hence is too slow for
   interactive applications.
 * It is cumbersome and lacks flexibility: application code and dependencies
   have to be pre-compiled and submitted as jars, and can not be submitted
   interactively.

Apache Spark comes with an ODBC/JDBC server, which can be used to submit SQL
queries to Spark. However, this solution is limited to SQL and does not
allow the client to leverage the rest of the Spark API, such as RDDs, MLlib
and Streaming.

A third way of using Spark is via its command-line shell, which allows the
interactive submission of snippets of Spark code. However, the shell entails
running Spark code on the client machine and hence is not a viable mechanism
for remote clients to submit Spark jobs.

Livy solves the limitations of the above three mechanisms, and provides the
full Spark API as a multi-tenant service to remote clients.

Since the open source release of Livy in late 2015, we have seen tremendous
interest among a diverse set of application developers and ISVs that want to
build applications with Apache Spark. To make Livy a robust and flexible
solution that will enable a broad and growing set of applications, it is
important to grow a large and varied 

Re: [VOTE] Livy to enter Apache Incubator

2017-05-31 Thread Saisai Shao
+1 (non-binding)




Re: [VOTE] Livy to enter Apache Incubator

2017-05-31 Thread Arpit Agarwal
+1 (non-binding)



Re: [VOTE] Livy to enter Apache Incubator

2017-05-31 Thread Jeff Zhang
+1 (non-binding)


Brock Noland wrote on Thu, Jun 1, 2017 at 3:13 AM:

> +1 (binding)
>
> On Wed, May 31, 2017 at 1:59 PM, Kostas Sakellis 
> wrote:
>
> > +1 (non-binding)
> >
> > On Wed, May 31, 2017 at 11:46 AM, Andrew Purtell <andrew.purt...@gmail.com>
> > wrote:
> >
> > > +1 (binding)

Re: [VOTE] Livy to enter Apache Incubator

2017-05-31 Thread Brock Noland
+1 (binding)

On Wed, May 31, 2017 at 1:59 PM, Kostas Sakellis 
wrote:

> +1 (non-binding)
>
> On Wed, May 31, 2017 at 11:46 AM, Andrew Purtell
> wrote:
>
> > +1 (binding)

Re: [VOTE] Livy to enter Apache Incubator

2017-05-31 Thread Kostas Sakellis
+1 (non-binding)

On Wed, May 31, 2017 at 11:46 AM, Andrew Purtell 
wrote:

> +1 (binding)

Re: [VOTE] Livy to enter Apache Incubator

2017-05-31 Thread Andrew Purtell
+1 (binding)


Re: [VOTE] Livy to enter Apache Incubator

2017-05-31 Thread Phillip Rhodes
+1

Looking forward to this...


Phil

This message optimized for indexing by NSA PRISM


On Wed, May 31, 2017 at 12:54 PM, Neelesh Salian
 wrote:
> +1 (non-binding)
> Thanks for putting this together.
>
> On May 31, 2017 9:46 AM, "Marcelo Vanzin"  wrote:
>
>> +1 (non-binding)
>>
>> On Wed, May 31, 2017 at 6:03 AM, Sean Busbey  wrote:
>> > Hi folks!
>> >
>> > I'm calling a vote to accept "Livy" into the Apache Incubator.
>> >
>> > The full proposal is available below, and is also available in the wiki:
>> >
>> > https://wiki.apache.org/incubator/LivyProposal
>> >
>> > For additional context, please see the discussion thread:
>> >
>> > https://s.apache.org/incubator-livy-proposal-thread
>> >
>> > Please cast your vote:
>> >
>> > [ ] +1, bring Livy into Incubator
>> > [ ] -1, do not bring Livy into Incubator, because...
>> >
>> > The vote will open at least for 72 hours and only votes from the
>> > Incubator PMC are binding.
>> >
>> > I start with my vote:
>> > +1
>> >

Re: [VOTE] Livy to enter Apache Incubator

2017-05-31 Thread Neelesh Salian
+1 (non-binding)
Thanks for putting this together.

On May 31, 2017 9:46 AM, "Marcelo Vanzin"  wrote:

> +1 (non-binding)
>
> On Wed, May 31, 2017 at 6:03 AM, Sean Busbey  wrote:
> > Hi folks!
> >
> > I'm calling a vote to accept "Livy" into the Apache Incubator.
> >
> > The full proposal is available below, and is also available in the wiki:
> >
> > https://wiki.apache.org/incubator/LivyProposal
> >
> > For additional context, please see the discussion thread:
> >
> > https://s.apache.org/incubator-livy-proposal-thread
> >
> > Please cast your vote:
> >
> > [ ] +1, bring Livy into Incubator
> > [ ] -1, do not bring Livy into Incubator, because...
> >
> > The vote will open at least for 72 hours and only votes from the
> > Incubator PMC are binding.
> >
> > I start with my vote:
> > +1
> >

Re: [VOTE] Livy to enter Apache Incubator

2017-05-31 Thread Marcelo Vanzin
+1 (non-binding)

On Wed, May 31, 2017 at 6:03 AM, Sean Busbey  wrote:
> Hi folks!
>
> I'm calling a vote to accept "Livy" into the Apache Incubator.
>
> The full proposal is available below, and is also available in the wiki:
>
> https://wiki.apache.org/incubator/LivyProposal
>
> For additional context, please see the discussion thread:
>
> https://s.apache.org/incubator-livy-proposal-thread
>
> Please cast your vote:
>
> [ ] +1, bring Livy into Incubator
> [ ] -1, do not bring Livy into Incubator, because...
>
> The vote will open at least for 72 hours and only votes from the Incubator
> PMC are binding.
>
> I start with my vote:
> +1
>
> 
>
> = Abstract =
>
> Livy is a web service that exposes a REST interface for managing long running
> Apache Spark contexts in your cluster. With Livy, new applications can be
> built on top of Apache Spark that require fine grained interaction with many
> Spark contexts.
>
> = Proposal =
>
> Livy is an open-source REST service for Apache Spark. Livy enables
> applications to submit Spark applications and retrieve results without a
> co-location requirement on the Spark cluster.
>
> We propose to contribute the Livy codebase and associated artifacts (e.g.
> documentation, web-site context etc) to the Apache Software Foundation.
>
> = Background =
>
> Apache Spark is a fast and general purpose distributed compute engine, with
> a versatile API. It enables processing of large quantities of static data
> distributed over a cluster of machines, as well as processing of continuous
> streams of data. It is the preferred distributed data processing engine for
> data engineering, stream processing and data science workloads. Each Spark
> application uses a construct called the SparkContext, which is the
> application’s connection or entry point to the Spark engine. Each Spark
> application will have its own SparkContext.
>
> Livy enables clients to interact with one or more Spark sessions through the
> Livy Server, which acts as a proxy layer. Livy Clients have fine grained
> control over the lifecycle of the Spark sessions, as well as the ability to
> submit jobs and retrieve results, all over HTTP. Clients have two modes of
> interaction: RPC Client API, available in Java and Python, which allows
> results to be retrieved as Java or Python objects. The serialization and
> deserialization of the results is handled by the Livy framework. HTTP based
> API that allows submission of code snippets, and retrieval of the results in
> different formats.
>
> Multi-tenant resource allocation and security: Livy enables multiple
> independent Spark sessions to be managed simultaneously. Multiple clients
> can also interact simultaneously with the same Spark session and share the
> resources of that Spark session. Livy can also enforce secure, authenticated
> communication between the clients and their respective Spark sessions.
>
> More information on Livy can be found at the existing open source website:
> http://livy.io/
>
> = Rationale =
>
> Users want to use Spark’s powerful processing engine and API as the data
> processing backend for interactive applications. However, the job submission
> and application interaction mechanisms built into Apache Spark are
> insufficient and cumbersome for multi-user interactive applications.
>
> The primary mechanism for applications to submit Spark jobs is via
> spark-submit
> (http://spark.apache.org/docs/latest/submitting-applications.html), which is
> available as a command line tool as well as a programmatic API. However,
> spark-submit has the following limitations that make it difficult to build
> interactive applications: It is slow: each invocation of spark-submit
> involves a setup phase where cluster resources are acquired, new processes
> are forked, etc. This setup phase runs for many seconds, or even minutes,
> and hence is too slow for interactive applications. It is cumbersome and
> lacks flexibility: application code and dependencies have to be pre-compiled
> and submitted as jars, and can not be submitted interactively.
>
> Apache Spark comes with an ODBC/JDBC server, which can be used to submit SQL
> queries to Spark. However, this solution is limited to SQL and does not
> allow the client to leverage the rest of the Spark API, such as RDDs, MLlib
> and Streaming.
>
> A third way of using Spark is via its command-line shell, which allows the
> interactive submission of snippets of Spark code. However, the shell entails
> running Spark code on the client machine and hence is not a viable mechanism
> for remote clients to submit Spark jobs.
>
> Livy solves the limitations of the above three mechanisms, and provides the
> full Spark API as a multi-tenant service to remote clients.
>
> Since the open source release of Livy in late 2015, we have seen tremendous
> interest among a diverse set of application developers and ISVs that want to
> build applications with Apache Spark. To make Livy a robust and flexible
> 

Re: [VOTE] Livy to enter Apache Incubator

2017-05-31 Thread Luciano Resende
+1 (binding)

On Wed, May 31, 2017 at 6:03 AM, Sean Busbey  wrote:

> Hi folks!
>
> I'm calling a vote to accept "Livy" into the Apache Incubator.
>
> The full proposal is available below, and is also available in the wiki:
>
> https://wiki.apache.org/incubator/LivyProposal
>
> For additional context, please see the discussion thread:
>
> https://s.apache.org/incubator-livy-proposal-thread
>
> Please cast your vote:
>
> [ ] +1, bring Livy into Incubator
> [ ] -1, do not bring Livy into Incubator, because...
>
> The vote will open at least for 72 hours and only votes from the Incubator
> PMC are binding.
>
> I start with my vote:
> +1
>

Re: [VOTE] Livy to enter Apache Incubator

2017-05-31 Thread Jean-Baptiste Onofré

+1 (binding)

If you need an additional mentor, please let me know; I'm interested in the
project!


Regards
JB

On 05/31/2017 03:03 PM, Sean Busbey wrote:

Hi folks!

I'm calling a vote to accept "Livy" into the Apache Incubator.

The full proposal is available below, and is also available in the wiki:

https://wiki.apache.org/incubator/LivyProposal

For additional context, please see the discussion thread:

https://s.apache.org/incubator-livy-proposal-thread

Please cast your vote:

[ ] +1, bring Livy into Incubator
[ ] -1, do not bring Livy into Incubator, because...

The vote will open at least for 72 hours and only votes from the Incubator
PMC are binding.

I start with my vote:
+1



= Abstract =

Livy is a web service that exposes a REST interface for managing long running
Apache Spark contexts in your cluster. With Livy, new applications can be
built on top of Apache Spark that require fine grained interaction with many
Spark contexts.

= Proposal =

Livy is an open-source REST service for Apache Spark. Livy enables
applications to submit Spark applications and retrieve results without a
co-location requirement on the Spark cluster.

We propose to contribute the Livy codebase and associated artifacts (e.g.
documentation, web-site context etc) to the Apache Software Foundation.

= Background =

Apache Spark is a fast and general purpose distributed compute engine, with
a versatile API. It enables processing of large quantities of static data
distributed over a cluster of machines, as well as processing of continuous
streams of data. It is the preferred distributed data processing engine for
data engineering, stream processing and data science workloads. Each Spark
application uses a construct called the SparkContext, which is the
application’s connection or entry point to the Spark engine. Each Spark
application will have its own SparkContext.

Livy enables clients to interact with one or more Spark sessions through the
Livy Server, which acts as a proxy layer. Livy Clients have fine grained
control over the lifecycle of the Spark sessions, as well as the ability to
submit jobs and retrieve results, all over HTTP. Clients have two modes of
interaction: RPC Client API, available in Java and Python, which allows
results to be retrieved as Java or Python objects. The serialization and
deserialization of the results is handled by the Livy framework. HTTP based
API that allows submission of code snippets, and retrieval of the results in
different formats.

Multi-tenant resource allocation and security: Livy enables multiple
independent Spark sessions to be managed simultaneously. Multiple clients
can also interact simultaneously with the same Spark session and share the
resources of that Spark session. Livy can also enforce secure, authenticated
communication between the clients and their respective Spark sessions.

More information on Livy can be found at the existing open source website:
http://livy.io/

= Rationale =

Users want to use Spark’s powerful processing engine and API as the data
processing backend for interactive applications. However, the job submission
and application interaction mechanisms built into Apache Spark are
insufficient and cumbersome for multi-user interactive applications.

The primary mechanism for applications to submit Spark jobs is via
spark-submit
(http://spark.apache.org/docs/latest/submitting-applications.html), which is
available as a command line tool as well as a programmatic API. However,
spark-submit has the following limitations that make it difficult to build
interactive applications: It is slow: each invocation of spark-submit
involves a setup phase where cluster resources are acquired, new processes
are forked, etc. This setup phase runs for many seconds, or even minutes,
and hence is too slow for interactive applications. It is cumbersome and
lacks flexibility: application code and dependencies have to be pre-compiled
and submitted as jars, and can not be submitted interactively.

Apache Spark comes with an ODBC/JDBC server, which can be used to submit SQL
queries to Spark. However, this solution is limited to SQL and does not
allow the client to leverage the rest of the Spark API, such as RDDs, MLlib
and Streaming.

A third way of using Spark is via its command-line shell, which allows the
interactive submission of snippets of Spark code. However, the shell entails
running Spark code on the client machine and hence is not a viable mechanism
for remote clients to submit Spark jobs.

Livy solves the limitations of the above three mechanisms, and provides the
full Spark API as a multi-tenant service to remote clients.

Since the open source release of Livy in late 2015, we have seen tremendous
interest among a diverse set of application developers and ISVs that want to
build applications with Apache Spark. To make Livy a robust and flexible
solution that will enable a broad and growing set of applications, it is
important to grow a large and varied 

Re: [VOTE] Livy to enter Apache Incubator

2017-05-31 Thread Ismaël Mejía
A missing backend element with a community around it, definitely a
great project to have at Apache.

+1 (non-binding)

Ismaël


On Wed, May 31, 2017 at 3:29 PM, larry mccay  wrote:
> This will be a great addition.
>
> +1
>
> On Wed, May 31, 2017 at 9:03 AM, Sean Busbey  wrote:
>
>> Hi folks!
>>
>> I'm calling a vote to accept "Livy" into the Apache Incubator.
>>
>> The full proposal is available below, and is also available in the wiki:
>>
>> https://wiki.apache.org/incubator/LivyProposal
>>
>> For additional context, please see the discussion thread:
>>
>> https://s.apache.org/incubator-livy-proposal-thread
>>
>> Please cast your vote:
>>
>> [ ] +1, bring Livy into Incubator
>> [ ] -1, do not bring Livy into Incubator, because...
>>
>> The vote will open at least for 72 hours and only votes from the Incubator
>> PMC are binding.
>>
>> I start with my vote:
>> +1
>>

Re: [VOTE] Livy to enter Apache Incubator

2017-05-31 Thread larry mccay
This will be a great addition.

+1

On Wed, May 31, 2017 at 9:03 AM, Sean Busbey  wrote:

> Hi folks!
>
> I'm calling a vote to accept "Livy" into the Apache Incubator.
>
> The full proposal is available below, and is also available in the wiki:
>
> https://wiki.apache.org/incubator/LivyProposal
>
> For additional context, please see the discussion thread:
>
> https://s.apache.org/incubator-livy-proposal-thread
>
> Please cast your vote:
>
> [ ] +1, bring Livy into Incubator
> [ ] -1, do not bring Livy into Incubator, because...
>
> The vote will open at least for 72 hours and only votes from the Incubator
> PMC are binding.
>
> I start with my vote:
> +1
>
> 
>
> = Abstract =
>
> Livy is a web service that exposes a REST interface for managing long running
> Apache Spark contexts in your cluster. With Livy, new applications can be
> built on top of Apache Spark that require fine grained interaction with
> many
> Spark contexts.
>
> = Proposal =
>
> Livy is an open-source REST service for Apache Spark. Livy enables
> applications to submit Spark applications and retrieve results without a
> co-location requirement on the Spark cluster.
>
> We propose to contribute the Livy codebase and associated artifacts (e.g.
> documentation, web-site context etc) to the Apache Software Foundation.
>
> = Background =
>
> Apache Spark is a fast and general purpose distributed compute engine, with
> a versatile API. It enables processing of large quantities of static data
> distributed over a cluster of machines, as well as processing of continuous
> streams of data. It is the preferred distributed data processing engine for
> data engineering, stream processing and data science workloads. Each Spark
> application uses a construct called the SparkContext, which is the
> application’s connection or entry point to the Spark engine. Each Spark
> application will have its own SparkContext.
>
> Livy enables clients to interact with one or more Spark sessions through the
> Livy Server, which acts as a proxy layer. Livy clients have fine-grained
> control over the lifecycle of the Spark sessions, as well as the ability to
> submit jobs and retrieve results, all over HTTP. Clients have two modes of
> interaction:
>
>   * RPC Client API, available in Java and Python, which allows results to be
>     retrieved as Java or Python objects. The serialization and
>     deserialization of the results is handled by the Livy framework.
>   * HTTP-based API that allows submission of code snippets and retrieval of
>     the results in different formats.
>
> Multi-tenant resource allocation and security: Livy enables multiple
> independent Spark sessions to be managed simultaneously. Multiple clients
> can also interact simultaneously with the same Spark session and share the
> resources of that Spark session. Livy can also enforce secure, authenticated
> communication between the clients and their respective Spark sessions.
>
> More information on Livy can be found at the existing open source website:
> http://livy.io/
>
> = Rationale =
>
> Users want to use Spark’s powerful processing engine and API as the data
> processing backend for interactive applications. However, the job submission
> and application interaction mechanisms built into Apache Spark are
> insufficient and cumbersome for multi-user interactive applications.
>
> The primary mechanism for applications to submit Spark jobs is via
> spark-submit
> (http://spark.apache.org/docs/latest/submitting-applications.html), which is
> available as a command line tool as well as a programmatic API. However,
> spark-submit has the following limitations that make it difficult to build
> interactive applications:
>
>   * It is slow: each invocation of spark-submit involves a setup phase where
>     cluster resources are acquired, new processes are forked, etc. This setup
>     phase runs for many seconds, or even minutes, and hence is too slow for
>     interactive applications.
>   * It is cumbersome and lacks flexibility: application code and dependencies
>     have to be pre-compiled and submitted as jars, and cannot be submitted
>     interactively.
>
> Apache Spark comes with an ODBC/JDBC server, which can be used to submit SQL
> queries to Spark. However, this solution is limited to SQL and does not
> allow the client to leverage the rest of the Spark API, such as RDDs, MLlib
> and Streaming.
>
> A third way of using Spark is via its command-line shell, which allows the
> interactive submission of snippets of Spark code. However, the shell entails
> running Spark code on the client machine and hence is not a viable mechanism
> for remote clients to submit Spark jobs.
>
> Livy solves the limitations of the above three mechanisms, and provides the
> full Spark API as a multi-tenant service to remote clients.
>
> Since the open source release of Livy in late 2015, we have seen tremendous
> interest among a diverse set of application developers and ISVs that want to
> build applications with Apache 

[VOTE] Livy to enter Apache Incubator

2017-05-31 Thread Sean Busbey
Hi folks!

I'm calling a vote to accept "Livy" into the Apache Incubator.

The full proposal is available below, and is also available in the wiki:

https://wiki.apache.org/incubator/LivyProposal

For additional context, please see the discussion thread:

https://s.apache.org/incubator-livy-proposal-thread

Please cast your vote:

[ ] +1, bring Livy into Incubator
[ ] -1, do not bring Livy into Incubator, because...

The vote will be open for at least 72 hours, and only votes from the
Incubator PMC are binding.

I start with my vote:
+1



= Abstract =

Livy is a web service that exposes a REST interface for managing long-running
Apache Spark contexts in your cluster. With Livy, new applications can be
built on top of Apache Spark that require fine-grained interaction with many
Spark contexts.

= Proposal =

Livy is an open-source REST service for Apache Spark. Livy enables
applications to submit Spark applications and retrieve results without a
co-location requirement on the Spark cluster. 

We propose to contribute the Livy codebase and associated artifacts (e.g.
documentation, website content, etc.) to the Apache Software Foundation.

= Background =

Apache Spark is a fast and general purpose distributed compute engine, with
a versatile API. It enables processing of large quantities of static data
distributed over a cluster of machines, as well as processing of continuous
streams of data. It is the preferred distributed data processing engine for
data engineering, stream processing and data science workloads. Each Spark
application uses a construct called the SparkContext, which is the
application’s connection or entry point to the Spark engine. Each Spark
application will have its own SparkContext.

Livy enables clients to interact with one or more Spark sessions through the
Livy Server, which acts as a proxy layer. Livy clients have fine-grained
control over the lifecycle of the Spark sessions, as well as the ability to
submit jobs and retrieve results, all over HTTP. Clients have two modes of
interaction:

  * RPC Client API, available in Java and Python, which allows results to be
    retrieved as Java or Python objects. The serialization and
    deserialization of the results is handled by the Livy framework.
  * HTTP-based API that allows submission of code snippets and retrieval of
    the results in different formats.
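
To make the HTTP-based mode of interaction more concrete, here is a rough
sketch of what a client might do. It is not part of the proposal itself. It
assumes a Livy server at http://localhost:8998 (the address is an assumption
for illustration), and the endpoint paths and JSON fields follow the REST API
as documented by the Livy project, so please check them against the version
you run. The helper names (build_request, send, etc.) are made up for this
example.

```python
import json
import urllib.request

# Assumption: a Livy server is reachable here; adjust for your deployment.
LIVY_URL = "http://localhost:8998"


def build_request(path, payload=None, base_url=LIVY_URL):
    """Build (but do not send) a JSON request: POST if there is a payload, else GET."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    return urllib.request.Request(
        base_url + path,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST" if payload is not None else "GET",
    )


def create_session_request(kind="spark"):
    """Request that asks the server to start a new interactive Spark session."""
    return build_request("/sessions", {"kind": kind})


def submit_statement_request(session_id, code):
    """Request that submits a code snippet to an existing session."""
    return build_request(f"/sessions/{session_id}/statements", {"code": code})


def get_statement_request(session_id, statement_id):
    """Request that fetches the state and output of a submitted snippet."""
    return build_request(f"/sessions/{session_id}/statements/{statement_id}")


def send(request):
    """Send a request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running, a client would call send(create_session_request()),
wait for the session to become idle, then call
send(submit_statement_request(session_id, "1 + 1")) and poll
send(get_statement_request(session_id, statement_id)) until the result is
available. Building the requests separately from sending them keeps the
sketch easy to inspect without a cluster.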

Multi-tenant resource allocation and security: Livy enables multiple
independent Spark sessions to be managed simultaneously. Multiple clients
can also interact simultaneously with the same Spark session and share the
resources of that Spark session. Livy can also enforce secure, authenticated
communication between the clients and their respective Spark sessions.

More information on Livy can be found at the existing open source website:
http://livy.io/

= Rationale =

Users want to use Spark’s powerful processing engine and API as the data
processing backend for interactive applications. However, the job submission
and application interaction mechanisms built into Apache Spark are
insufficient and cumbersome for multi-user interactive applications.

The primary mechanism for applications to submit Spark jobs is via
spark-submit
(http://spark.apache.org/docs/latest/submitting-applications.html), which is
available as a command line tool as well as a programmatic API. However,
spark-submit has the following limitations that make it difficult to build
interactive applications:

  * It is slow: each invocation of spark-submit involves a setup phase where
    cluster resources are acquired, new processes are forked, etc. This setup
    phase runs for many seconds, or even minutes, and hence is too slow for
    interactive applications.
  * It is cumbersome and lacks flexibility: application code and dependencies
    have to be pre-compiled and submitted as jars, and cannot be submitted
    interactively.

Apache Spark comes with an ODBC/JDBC server, which can be used to submit SQL
queries to Spark. However, this solution is limited to SQL and does not
allow the client to leverage the rest of the Spark API, such as RDDs, MLlib
and Streaming.

A third way of using Spark is via its command-line shell, which allows the
interactive submission of snippets of Spark code. However, the shell entails
running Spark code on the client machine and hence is not a viable mechanism
for remote clients to submit Spark jobs.

Livy solves the limitations of the above three mechanisms, and provides the
full Spark API as a multi-tenant service to remote clients. 

Since the open source release of Livy in late 2015, we have seen tremendous
interest among a diverse set of application developers and ISVs that want to
build applications with Apache Spark. To make Livy a robust and flexible
solution that will enable a broad and growing set of applications, it is
important to grow a large and varied community of contributors.

= Initial Goals =

  * Move existing codebase, website, documentation and mailing lists to
Apache-hosted infrastructure
  *