Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-17 Thread P. Taylor Goetz
This vote is now closed and passes with 15 binding +1 votes, 9 non-binding +1 
votes, and no 0 or -1 votes.

Vote tally (* indicates a binding vote):

+1:
Pramod Immaneni
Chris Nauroth*
Gaurav Gupta
Julian Hyde*
Seetharam Venkatesh
Hitesh Shah*
Ted Dunning*
Alan Gates*
P. Taylor Goetz*
Henry Saputra*
Ashwin Chandra Putta
David Yan
John D. Ament*
Amol Kekre
Luke Han
Atri Sharma
Chris Douglas*
Justin Mclean*
Naresh Agarwal
Bertrand Delacretaz*
Jan Iversen*
Amareshwari Sriramdasu*
Roman Shaposhnik*
Niall Pemberton*

0:
-none-

-1:
-none-

Thank you to all who voted.

-Taylor



 On Aug 13, 2015, at 10:48 AM, P. Taylor Goetz ptgo...@apache.org wrote:
 
 Following the discussion thread [1], I would like to call a VOTE for 
 Accepting Apex as a new Apache Incubator project.
 
 The proposal is available on the wiki [2] and is also attached below.
 
 The VOTE will be open for at least 72 hours.
 
 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…
 
 Thanks,
 
 -Taylor
 
 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal
 
 
 == Abstract ==
 Apex is an enterprise grade native YARN big data-in-motion platform that 
 unifies stream processing as well as batch processing. Apex processes big 
 data in-motion in a highly scalable, highly performant, fault tolerant, 
 stateful, secure, distributed, and easily operable way. It provides a 
 simple API that enables users to write or re-use generic Java code, thereby 
 lowering the expertise needed to write big data applications.
 
 Functional and operational specifications are separated. Apex is designed in 
 a way to enable users to write their own code (aka user defined functions) as 
 is and leave all operability to the platform. The API is very simple and is 
 designed to allow users to drop in their code as is. The platform mainly 
 deals with operability and treats functional code as a black box. Operability 
 includes fault tolerance, scalability, security, ease of use, metrics API, 
 web services, etc. In other words, there is no separation of UDFs (user-defined 
 functions), as all functional code is a UDF. This frees users to focus on 
 functional development, and lets the platform provide operability support. The 
 same code runs as is with different operability attributes. The 
 data-in-motion architecture of Apex unifies stream as well as batch 
 processing in a single platform. Since Apex is a native YARN application, it 
 leverages all the components of YARN without duplication. Apex was developed 
 with YARN in mind and has no overlapping components/functionality with YARN.
 
 The Apex platform is supplemented by project Malhar, which is a library of 
 operators that implement common business logic functions needed by customers 
 who want to quickly develop applications. These operators provide access to 
 HDFS, S3, NFS, FTP, and other file systems; Kafka, ActiveMQ, RabbitMQ, JMS, 
 and other message systems; MySQL, Cassandra, MongoDB, Redis, HBase, CouchDB, 
 and other databases, along with JDBC connectors. The Malhar library also 
 includes a host of other common business logic patterns that help users to 
 significantly reduce the time it takes to go into production. Ease of 
 integration with all other big data technologies is one of the primary 
 missions of Malhar.
 
 == Proposal ==
 The goal of this proposal is to establish the core engine of the DataTorrent 
 RTS product as an Apache Software Foundation (ASF) project in order to build a 
 vibrant, diverse, and self-governed open source community around the 
 technology. DataTorrent will continue to sell management tools, application 
 building tools, easy to use big data applications, and custom high end 
 business logic operators. This proposal covers the Apex source code (written 
 in Java), Apex documentation and other materials currently available on 
 https://github.com/DataTorrent/Apex. 
 This proposal also covers the Malhar source code (written in Java), Malhar 
 documentation, and other materials currently available on 
 https://github.com/DataTorrent/Malhar. We have done a trademark check on 
 the name Apex, and have concluded that the Apex name is likely to be a 
 suitable project name.
 
 == Background ==
 DataTorrent RTS is a mature and robust product developed as a native YARN 
 application. RTS 1.0 was launched in the summer of 2014; RTS 2.0 was launched in 
 January 2015. Both were well received by customers. RTS 3.0 was launched at the end 
 of July 2015. RTS is among the first enterprise-grade platforms 
 developed from the ground up as a native YARN application. DataTorrent RTS is 
 currently maintained by engineers as a closed source project. Even though the 
 engineers behind RTS are experienced software engineers and are knowledge 
 leaders 

Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-14 Thread jan i
+1 (binding)

have fun
jan i

On Friday, August 14, 2015, Bertrand Delacretaz bdelacre...@apache.org
wrote:

 On Thu, Aug 13, 2015 at 4:48 PM, P. Taylor Goetz ptgo...@apache.org wrote:
  Following the discussion thread [1], I would like to call a VOTE for
  Accepting Apex as a new Apache Incubator project.

 +1, binding

 -Bertrand

 -
 To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
 For additional commands, e-mail: general-h...@incubator.apache.org





Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-14 Thread Bertrand Delacretaz
On Thu, Aug 13, 2015 at 4:48 PM, P. Taylor Goetz ptgo...@apache.org wrote:
 Following the discussion thread [1], I would like to call a VOTE for
 Accepting Apex as a new Apache Incubator project.

+1, binding

-Bertrand




Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-14 Thread Amareshwari Sriramdasu
+1 binding

On Thu, Aug 13, 2015 at 8:18 PM, P. Taylor Goetz ptgo...@apache.org wrote:

 Following the discussion thread [1], I would like to call a VOTE for
 Accepting Apex as a new Apache Incubator project.

 The proposal is available on the wiki [2] and is also attached below.

 The VOTE will be open for at least 72 hours.

 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…

 Thanks,

 -Taylor

 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal


 == Abstract ==
 Apex is an enterprise grade native YARN big data-in-motion platform that
 unifies stream processing as well as batch processing. Apex processes big
 data in-motion in a highly scalable, highly performant, fault tolerant,
 stateful, secure, distributed, and easily operable way. It provides a
 simple API that enables users to write or re-use generic Java code, thereby
 lowering the expertise needed to write big data applications.

 Functional and operational specifications are separated. Apex is designed
 in a way to enable users to write their own code (aka user defined
 functions) as is and leave all operability to the platform. The API is very
 simple and is designed to allow users to drop in their code as is. The
 platform mainly deals with operability and treats functional code as a
 black box. Operability includes fault tolerance, scalability, security,
 ease of use, metrics API, web services, etc. In other words, there is no
 separation of UDFs (user-defined functions), as all functional code is a UDF.
 This frees users to focus on functional development, and lets the platform
 provide operability support. The same code runs as is with different
 operability attributes. The data-in-motion architecture of Apex unifies
 stream as well as batch processing in a single platform. Since Apex is a
 native YARN application, it leverages all the components of YARN without
 duplication. Apex was developed with YARN in mind and has no overlapping
 components/functionality with YARN.

 The Apex platform is supplemented by project Malhar, which is a library of
 operators that implement common business logic functions needed by
 customers who want to quickly develop applications. These operators provide
 access to HDFS, S3, NFS, FTP, and other file systems; Kafka, ActiveMQ,
 RabbitMQ, JMS, and other message systems; MySQL, Cassandra, MongoDB, Redis,
 HBase, CouchDB, and other databases, along with JDBC connectors. The Malhar
 library also includes a host of other common business logic patterns that
 help users to significantly reduce the time it takes to go into production.
 Ease of integration with all other big data technologies is one of the
 primary missions of Malhar.

 == Proposal ==
 The goal of this proposal is to establish the core engine of the DataTorrent
 RTS product as an Apache Software Foundation (ASF) project in order to
 build a vibrant, diverse, and self-governed open source community around
 the technology. DataTorrent will continue to sell management tools,
 application building tools, easy to use big data applications, and custom
 high end business logic operators. This proposal covers the Apex source
 code (written in Java), Apex documentation and other materials currently
 available on https://github.com/DataTorrent/Apex. This proposal also
 covers the Malhar source code (written in Java), Malhar documentation, and
 other materials currently available on
 https://github.com/DataTorrent/Malhar. We have done a trademark check on
 the name Apex, and have concluded that the Apex name is likely to be a
 suitable project name.

 == Background ==
 DataTorrent RTS is a mature and robust product developed as a native YARN
 application. RTS 1.0 was launched in the summer of 2014; RTS 2.0 was launched
 in January 2015. Both were well received by customers. RTS 3.0 was launched at
 the end of July 2015. RTS is among the first enterprise-grade platforms
 developed from the ground up as a native YARN application. DataTorrent RTS is
 currently maintained by engineers as a closed source project. Even though
 the engineers behind RTS are experienced software engineers and are
 knowledge leaders in data-in-motion platforms, they have had little
 exposure to the open source governance process. Customers are currently
 running applications based on DataTorrent RTS in production.

 == Rationale ==
 Big data applications written for non-Hadoop platforms typically require
 major rewrites to get them to work with Hadoop. This rewriting creates a
 significant bottleneck in terms of resources (expertise), which in turn
 jeopardizes the viability of such an endeavour. It is hard enough to
 acquire big data expertise; demanding additional expertise to do a major
 code conversion makes it a very hard problem for projects to successfully
 migrate to Hadoop. Also, due to the batch processing nature of Hadoop’s
 MapReduce paradigm, users often have to wait tens of minutes to see 

Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-14 Thread Roman Shaposhnik
On Thu, Aug 13, 2015 at 7:48 AM, P. Taylor Goetz ptgo...@apache.org wrote:
 Following the discussion thread [1], I would like to call a VOTE for
 Accepting Apex as a new Apache Incubator project.

 The proposal is available on the wiki [2] and is also attached below.

 The VOTE will be open for at least 72 hours.

 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…

+1 (binding)

Thanks,
Roman.




Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-14 Thread Niall Pemberton
+1

Niall

On Thu, Aug 13, 2015 at 3:48 PM, P. Taylor Goetz ptgo...@apache.org wrote:

 Following the discussion thread [1], I would like to call a VOTE for
 Accepting Apex as a new Apache Incubator project.

 The proposal is available on the wiki [2] and is also attached below.

 The VOTE will be open for at least 72 hours.

 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…

 Thanks,

 -Taylor

 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal



Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Chris Nauroth
+1 (binding)

I believe the current proposal covers everything required.  Thank you to Amol 
for incorporating the community's feedback.

--Chris Nauroth

From: P. Taylor Goetz ptgo...@apache.org
Reply-To: general@incubator.apache.org
Date: Thursday, August 13, 2015 at 7:48 AM
To: Incubator general@incubator.apache.org
Subject: [VOTE] Accept Apex into the Apache Incubator

Following the discussion thread [1], I would like to call a VOTE for Accepting 
Apex as a new Apache Incubator project.

The proposal is available on the wiki [2] and is also attached below.

The VOTE will be open for at least 72 hours.

[ ] +1 Accept Apex into the Incubator
[ ] ±0 No opinion
[ ] -1 Do not accept Apex into the Incubator because…

Thanks,

-Taylor

[1] http://s.apache.org/apex_discuss
[2] https://wiki.apache.org/incubator/ApexProposal



Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Pramod Immaneni
+1 (Non-binding)

On Thu, Aug 13, 2015 at 7:48 AM, P. Taylor Goetz ptgo...@apache.org wrote:

 Following the discussion thread [1], I would like to call a VOTE for
 Accepting Apex as a new Apache Incubator project.

 The proposal is available on the wiki [2] and is also attached below.

 The VOTE will be open for at least 72 hours.

 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…

 Thanks,

 -Taylor

 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal



Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Gaurav Gupta
+1 (Non-binding)

-Gaurav

 On Aug 13, 2015, at 10:22 AM, Pramod Immaneni pra...@datatorrent.com wrote:
 
 +1 (Non-binding)
 
 On Thu, Aug 13, 2015 at 7:48 AM, P. Taylor Goetz ptgo...@apache.org wrote:
 
 Following the discussion thread [1], I would like to call a VOTE for
 Accepting Apex as a new Apache Incubator project.
 
 The proposal is available on the wiki [2] and is also attached below.
 
 The VOTE will be open for at least 72 hours.
 
 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…
 
 Thanks,
 
 -Taylor
 
 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal
 
 

Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Henry Saputra
+1 (binding)

Good luck guys!

- Henry

On Thu, Aug 13, 2015 at 7:48 AM, P. Taylor Goetz ptgo...@apache.org wrote:
 Following the discussion thread [1], I would like to call a VOTE for
 Accepting Apex as a new Apache Incubator project.

 The proposal is available on the wiki [2] and is also attached below.

 The VOTE will be open for at least 72 hours.

 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…

 Thanks,

 -Taylor

 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal


 == Abstract ==
 Apex is an enterprise grade native YARN big data-in-motion platform that
 unifies stream processing as well as batch processing. Apex processes big
 data in-motion in a highly scalable, highly performant, fault tolerant,
 stateful, secure, distributed, and an easily operable way. It provides a
 simple API that enables users to write or re-use generic Java code, thereby
 lowering the expertise needed to write big data applications.

 Functional and operational specifications are separated. Apex is designed in
 a way to enable users to write their own code (aka user defined functions)
 as is and leave all operability to the platform. The API is very simple and
 is designed to allow users to drop in their code as is. The platform mainly
 deals with operability and treats functional code as a black box.
 Operability includes fault tolerance, scalability, security, ease of use,
 metrics api, webservices, etc. In other words there is no separation of UDF
 (user defined functions), as all functional code is UDF. This frees users to
 focus on functional development, and lets platform provide operability
 support. The same code runs as is with different operability attributes. The
 data-in-motion architecture of Apex unifies stream as well as batch
 processing in a single platform. Since Apex is a native YARN application, it
 leverages all the components of YARN without duplication. Apex was developed
 with YARN in mind and has no overlapping components/functionality with YARN.

 The Apex platform is supplemented by project Malhar, which is a library of
 operators that implement common business logic functions needed by customers
 who want to quickly develop applications. These operators provide access to
 HDFS, S3, NFS, FTP, and other file systems;  Kafka, ActiveMQ, RabbitMQ, JMS,
 and other message systems; MySql, Cassandra, MongoDB, Redis, HBase, CouchDB
 and other databases along with JDBC connectors. The Malhar library also
 includes a host of other common business logic patterns that help users to
 significantly reduce the time it takes to go into production. Ease of
 integration with all other big data technologies is one of the primary
 missions of Malhar.

 == Proposal ==
 The goal of this proposal is to establish the core engine of DataTorrent RTS
 product as an Apache Software Foundation (ASF) project in order to build a
 vibrant, diverse, and self-governed open source community around the
 technology. DataTorrent will continue to sell management tools, application
 building tools, easy to use big data applications, and custom high end
 business logic operators. This proposal covers the Apex source code (written
 in Java), Apex documentation and other materials currently available on
 https://github.com/DataTorrent/Apex. This proposal also covers the Malhar
 source code (written in Java), Malhar documentation, and other materials
 currently available on https://github.com/DataTorrent/Malhar. We have done a
 trademark check on the name Apex, and have concluded that the Apex name is
 likely to be a suitable project name.

 == Background ==
 DataTorrent RTS is a mature and robust product developed as a native YARN
 application. RTS 1.0 was launched in the summer of 2014; RTS 2.0 was
 launched in January 2015. Both were well received by customers. RTS 3.0 was
 launched at the end of July 2015. RTS is among the first enterprise-grade
 platforms developed from the ground up as a native YARN application.
 DataTorrent RTS is currently maintained by engineers as a closed source
 project. Even though the engineers behind RTS are experienced software
 engineers and are knowledge leaders in data-in-motion platforms, they have
 had little exposure to the open source governance process. Customers are
 currently running applications based on DataTorrent RTS in production.

 == Rationale ==
 Big data applications written for non-Hadoop platforms typically require
 major rewrites to get them to work with Hadoop. This rewriting creates a
 significant bottleneck in terms of resources (expertise), which in turn
 jeopardizes the viability of such an endeavour. It is hard enough to acquire
 big data expertise; demanding additional expertise to do a major code
 conversion makes it very hard for projects to successfully migrate to
 Hadoop. Also, due to the batch processing nature of Hadoop’s MapReduce
 paradigm, users often have to

Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread David Yan
+1 (Non-binding)

On Thu, Aug 13, 2015 at 2:32 PM, Alan Gates alanfga...@gmail.com wrote:

 +1.

 Alan.

 Chris Nauroth cnaur...@hortonworks.com
 August 13, 2015 at 9:59
 +1 (binding)

 I believe the current proposal covers everything required. Thank you to
 Amol for incorporating the community's feedback.

 --Chris Nauroth

 From: P. Taylor Goetz ptgo...@apache.org
 Reply-To: general@incubator.apache.org
 Date: Thursday, August 13, 2015 at 7:48 AM
 To: Incubator general@incubator.apache.org
 Subject: [VOTE] Accept Apex into the Apache Incubator

 Following the discussion thread [1], I would like to call a VOTE for
 Accepting Apex as a new Apache Incubator project.

 The proposal is available on the wiki [2] and is also attached below.

 The VOTE will be open for at least 72 hours.

 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…

 Thanks,

 -Taylor

 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal



Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread P. Taylor Goetz
+1 (binding)

-Taylor

 On Aug 13, 2015, at 10:48 AM, P. Taylor Goetz ptgo...@apache.org wrote:
 
 Following the discussion thread [1], I would like to call a VOTE for 
 Accepting Apex as a new Apache Incubator project.
 
 The proposal is available on the wiki [2] and is also attached below.
 
 The VOTE will be open for at least 72 hours.
 
 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…
 
 Thanks,
 
 -Taylor
 
 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal
 
 

Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Ashwin Chandra Putta
+1 (non-binding)

On Thu, Aug 13, 2015 at 7:48 AM, P. Taylor Goetz ptgo...@apache.org wrote:

 Following the discussion thread [1], I would like to call a VOTE for
 Accepting Apex as a new Apache Incubator project.

 The proposal is available on the wiki [2] and is also attached below.

 The VOTE will be open for at least 72 hours.

 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…

 Thanks,

 -Taylor

 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal


 == Abstract ==
 Apex is an enterprise grade native YARN big data-in-motion platform that
 unifies stream processing as well as batch processing. Apex processes big
 data in-motion in a highly scalable, highly performant, fault tolerant,
 stateful, secure, distributed, and an easily operable way. It provides a
 simple API that enables users to write or re-use generic Java code, thereby
 lowering the expertise needed to write big data applications.

 Functional and operational specifications are separated. Apex is designed
 in a way to enable users to write their own code (aka user defined
 functions) as is and leave all operability to the platform. The API is very
 simple and is designed to allow users to drop in their code as is. The
 platform mainly deals with operability and treats functional code as a
 black box. Operability includes fault tolerance, scalability, security,
 ease of use, metrics api, webservices, etc. In other words there is no
 separation of UDF (user defined functions), as all functional code is UDF.
 This frees users to focus on functional development, and lets platform
 provide operability support. The same code runs as is with different
 operability attributes. The data-in-motion architecture of Apex unifies
 stream as well as batch processing in a single platform. Since Apex is a
 native YARN application, it leverages all the components of YARN without
 duplication. Apex was developed with YARN in mind and has no overlapping
 components/functionality with YARN.

 The Apex platform is supplemented by project Malhar, which is a library of
 operators that implement common business logic functions needed by
 customers who want to quickly develop applications. These operators provide
 access to HDFS, S3, NFS, FTP, and other file systems;  Kafka, ActiveMQ,
 RabbitMQ, JMS, and other message systems; MySql, Cassandra, MongoDB, Redis,
 HBase, CouchDB and other databases along with JDBC connectors. The Malhar
 library also includes a host of other common business logic patterns that
 help users to significantly reduce the time it takes to go into production.
 Ease of integration with all other big data technologies is one of the
 primary missions of Malhar.

 == Proposal ==
 The goal of this proposal is to establish the core engine of DataTorrent
 RTS product as an Apache Software Foundation (ASF) project in order to
 build a vibrant, diverse, and self-governed open source community around
 the technology. DataTorrent will continue to sell management tools,
 application building tools, easy to use big data applications, and custom
 high end business logic operators. This proposal covers the Apex source
 code (written in Java), Apex documentation and other materials currently
 available on https://github.com/DataTorrent/Apex. This proposal also
 covers the Malhar source code (written in Java), Malhar documentation, and
 other materials currently available on
 https://github.com/DataTorrent/Malhar. We have done a trademark check on
 the name Apex, and have concluded that the Apex name is likely to be a
 suitable project name.

 == Background ==
 DataTorrent RTS is a mature and robust product developed as a native YARN
 application. RTS 1.0 was launched in the summer of 2014; RTS 2.0 was
 launched in January 2015. Both were well received by customers. RTS 3.0 was
 launched at the end of July 2015. RTS is among the first enterprise-grade
 platforms developed from the ground up as a native YARN application.
 DataTorrent RTS is currently maintained by engineers as a closed source
 project. Even though the engineers behind RTS are experienced software
 engineers and are knowledge leaders in data-in-motion platforms, they have
 had little exposure to the open source governance process. Customers are
 currently running applications based on DataTorrent RTS in production.

 == Rationale ==
 Big data applications written for non-Hadoop platforms typically require
 major rewrites to get them to work with Hadoop. This rewriting creates a
 significant bottleneck in terms of resources (expertise), which in turn
 jeopardizes the viability of such an endeavour. It is hard enough to
 acquire big data expertise; demanding additional expertise to do a major
 code conversion makes it very hard for projects to successfully
 migrate to Hadoop. Also, due to the batch processing nature of Hadoop’s
 MapReduce paradigm, users often have to wait tens of minutes

Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread John D. Ament
+1

On Thu, Aug 13, 2015 at 12:48 PM P. Taylor Goetz ptgo...@apache.org wrote:

 Following the discussion thread [1], I would like to call a VOTE for
 Accepting Apex as a new Apache Incubator project.

 The proposal is available on the wiki [2] and is also attached below.

 The VOTE will be open for at least 72 hours.

 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…

 Thanks,

 -Taylor

 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal


 == Abstract ==
 Apex is an enterprise grade native YARN big data-in-motion platform that
 unifies stream processing as well as batch processing. Apex processes big
 data in-motion in a highly scalable, highly performant, fault tolerant,
 stateful, secure, distributed, and an easily operable way. It provides a
 simple API that enables users to write or re-use generic Java code, thereby
 lowering the expertise needed to write big data applications.

 Functional and operational specifications are separated. Apex is designed
 in a way to enable users to write their own code (aka user defined
 functions) as is and leave all operability to the platform. The API is very
 simple and is designed to allow users to drop in their code as is. The
 platform mainly deals with operability and treats functional code as a
 black box. Operability includes fault tolerance, scalability, security,
 ease of use, metrics api, webservices, etc. In other words there is no
 separation of UDF (user defined functions), as all functional code is UDF.
 This frees users to focus on functional development, and lets platform
 provide operability support. The same code runs as is with different
 operability attributes. The data-in-motion architecture of Apex unifies
 stream as well as batch processing in a single platform. Since Apex is a
 native YARN application, it leverages all the components of YARN without
 duplication. Apex was developed with YARN in mind and has no overlapping
 components/functionality with YARN.

 The Apex platform is supplemented by project Malhar, which is a library of
 operators that implement common business logic functions needed by
 customers who want to quickly develop applications. These operators provide
 access to HDFS, S3, NFS, FTP, and other file systems;  Kafka, ActiveMQ,
 RabbitMQ, JMS, and other message systems; MySql, Cassandra, MongoDB, Redis,
 HBase, CouchDB and other databases along with JDBC connectors. The Malhar
 library also includes a host of other common business logic patterns that
 help users to significantly reduce the time it takes to go into production.
 Ease of integration with all other big data technologies is one of the
 primary missions of Malhar.

 == Proposal ==
 The goal of this proposal is to establish the core engine of DataTorrent
 RTS product as an Apache Software Foundation (ASF) project in order to
 build a vibrant, diverse, and self-governed open source community around
 the technology. DataTorrent will continue to sell management tools,
 application building tools, easy to use big data applications, and custom
 high end business logic operators. This proposal covers the Apex source
 code (written in Java), Apex documentation and other materials currently
 available on https://github.com/DataTorrent/Apex. This proposal also
 covers the Malhar source code (written in Java), Malhar documentation, and
 other materials currently available on
 https://github.com/DataTorrent/Malhar. We have done a trademark check on
 the name Apex, and have concluded that the Apex name is likely to be a
 suitable project name.

 == Background ==
 DataTorrent RTS is a mature and robust product developed as a native YARN
 application. RTS 1.0 was launched in the summer of 2014; RTS 2.0 was
 launched in January 2015. Both were well received by customers. RTS 3.0 was
 launched at the end of July 2015. RTS is among the first enterprise-grade
 platforms developed from the ground up as a native YARN application.
 DataTorrent RTS is currently maintained by engineers as a closed source
 project. Even though the engineers behind RTS are experienced software
 engineers and are knowledge leaders in data-in-motion platforms, they have
 had little exposure to the open source governance process. Customers are
 currently running applications based on DataTorrent RTS in production.

 == Rationale ==
 Big data applications written for non-Hadoop platforms typically require
 major rewrites to get them to work with Hadoop. This rewriting creates a
 significant bottleneck in terms of resources (expertise), which in turn
 jeopardizes the viability of such an endeavour. It is hard enough to
 acquire big data expertise; demanding additional expertise to do a major
 code conversion makes it very hard for projects to successfully
 migrate to Hadoop. Also, due to the batch processing nature of Hadoop’s
 MapReduce paradigm, users often have to wait tens of minutes to see results
 

Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Seetharam Venkatesh
+1 (Non-binding)

On Thu, Aug 13, 2015 at 1:09 PM Julian Hyde jh...@apache.org wrote:

 +1 (binding)

 Julian


  On Aug 13, 2015, at 12:40 PM, Gaurav Gupta gau...@datatorrent.com wrote:
 
  +1 (Non-binding)
 
  -Gaurav
 
  On Aug 13, 2015, at 10:22 AM, Pramod Immaneni pra...@datatorrent.com wrote:
 
  +1 (Non-binding)
 
  On Thu, Aug 13, 2015 at 7:48 AM, P. Taylor Goetz ptgo...@apache.org wrote:
 
  Following the discussion thread [1], I would like to call a VOTE for
  Accepting Apex as a new Apache Incubator project.
 
  The proposal is available on the wiki [2] and is also attached below.
 
  The VOTE will be open for at least 72 hours.
 
  [ ] +1 Accept Apex into the Incubator
  [ ] ±0 No opinion
  [ ] -1 Do not accept Apex into the Incubator because…
 
  Thanks,
 
  -Taylor
 
  [1] http://s.apache.org/apex_discuss
  [2] https://wiki.apache.org/incubator/ApexProposal
 
 

Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Ted Dunning
+1 (binding)



On Thu, Aug 13, 2015 at 1:47 PM, Hitesh Shah hit...@apache.org wrote:

 +1 (binding)

 — Hitesh

 On Aug 13, 2015, at 7:48 AM, P. Taylor Goetz ptgo...@apache.org wrote:

  Following the discussion thread [1], I would like to call a VOTE for
 Accepting Apex as a new Apache Incubator project.
 
  The proposal is available on the wiki [2] and is also attached below.
 
  The VOTE will be open for at least 72 hours.
 
  [ ] +1 Accept Apex into the Incubator
  [ ] ±0 No opinion
  [ ] -1 Do not accept Apex into the Incubator because…
 
  Thanks,
 
  -Taylor
 
  [1] http://s.apache.org/apex_discuss
  [2] https://wiki.apache.org/incubator/ApexProposal
 
 

Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Alan Gates
+1.

Alan.

 Chris Nauroth cnaur...@hortonworks.com
 August 13, 2015 at 9:59
 +1 (binding)

 I believe the current proposal covers everything required. Thank you
 to Amol for incorporating the community's feedback.

 --Chris Nauroth

 From: P. Taylor Goetz ptgo...@apache.org
 Reply-To: general@incubator.apache.org
 Date: Thursday, August 13, 2015 at 7:48 AM
 To: Incubator general@incubator.apache.org
 Subject: [VOTE] Accept Apex into the Apache Incubator

 Following the discussion thread [1], I would like to call a VOTE for
 Accepting Apex as a new Apache Incubator project.

 The proposal is available on the wiki [2] and is also attached below.

 The VOTE will be open for at least 72 hours.

 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…

 Thanks,

 -Taylor

 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal



Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Julian Hyde
+1 (binding)

Julian


 On Aug 13, 2015, at 12:40 PM, Gaurav Gupta gau...@datatorrent.com wrote:
 
 +1 (Non-binding)
 
 -Gaurav
 
 On Aug 13, 2015, at 10:22 AM, Pramod Immaneni pra...@datatorrent.com wrote:
 
 +1 (Non-binding)
 
 On Thu, Aug 13, 2015 at 7:48 AM, P. Taylor Goetz ptgo...@apache.org wrote:
 
 Following the discussion thread [1], I would like to call a VOTE for
 Accepting Apex as a new Apache Incubator project.
 
 The proposal is available on the wiki [2] and is also attached below.
 
 The VOTE will be open for at least 72 hours.
 
 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…
 
 Thanks,
 
 -Taylor
 
 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal
 
 

Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Luke Han
+1 (Non-binding)


Best Regards!
-

Luke Han

On Fri, Aug 14, 2015 at 9:52 AM, Amol Kekre a...@datatorrent.com wrote:

 +1 (Non-binding)

 Amol

 On Thu, Aug 13, 2015 at 7:48 AM, P. Taylor Goetz ptgo...@apache.org
 wrote:

  Following the discussion thread [1], I would like to call a VOTE for
  Accepting Apex as a new Apache Incubator project.
 
  The proposal is available on the wiki [2] and is also attached below.
 
  The VOTE will be open for at least 72 hours.
 
  [ ] +1 Accept Apex into the Incubator
  [ ] ±0 No opinion
  [ ] -1 Do not accept Apex into the Incubator because…
 
  Thanks,
 
  -Taylor
 
  [1] http://s.apache.org/apex_discuss
  [2] https://wiki.apache.org/incubator/ApexProposal
 
 

Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Amol Kekre
+1 (Non-binding)

Amol

On Thu, Aug 13, 2015 at 7:48 AM, P. Taylor Goetz ptgo...@apache.org wrote:

 Following the discussion thread [1], I would like to call a VOTE for
 Accepting Apex as a new Apache Incubator project.

 The proposal is available on the wiki [2] and is also attached below.

 The VOTE will be open for at least 72 hours.

 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…

 Thanks,

 -Taylor

 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal



Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Atri Sharma
+1 (Non Binding)
On 13 Aug 2015 22:18, P. Taylor Goetz ptgo...@apache.org wrote:

 Following the discussion thread [1], I would like to call a VOTE for
 Accepting Apex as a new Apache Incubator project.

 The proposal is available on the wiki [2] and is also attached below.

 The VOTE will be open for at least 72 hours.

 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…

 Thanks,

 -Taylor

 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal



Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Chris Douglas
+1 (binding) -C

On Thu, Aug 13, 2015 at 7:48 AM, P. Taylor Goetz ptgo...@apache.org wrote:
 Following the discussion thread [1], I would like to call a VOTE for
 Accepting Apex as a new Apache Incubator project.

 The proposal is available on the wiki [2] and is also attached below.

 The VOTE will be open for at least 72 hours.

 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…

 Thanks,

 -Taylor

 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal


 == Abstract ==
 Apex is an enterprise grade native YARN big data-in-motion platform that
 unifies stream processing as well as batch processing. Apex processes big
 data in-motion in a highly scalable, highly performant, fault tolerant,
 stateful, secure, distributed, and an easily operable way. It provides a
 simple API that enables users to write or re-use generic Java code, thereby
 lowering the expertise needed to write big data applications.

 Functional and operational specifications are separated. Apex is designed in
 a way to enable users to write their own code (aka user defined functions)
 as is and leave all operability to the platform. The API is very simple and
 is designed to allow users to drop in their code as is. The platform mainly
 deals with operability and treats functional code as a black box.
 Operability includes fault tolerance, scalability, security, ease of use,
 metrics api, webservices, etc. In other words there is no separation of UDF
 (user defined functions), as all functional code is UDF. This frees users to
 focus on functional development, and lets platform provide operability
 support. The same code runs as is with different operability attributes. The
 data-in-motion architecture of Apex unifies stream as well as batch
 processing in a single platform. Since Apex is a native YARN application, it
 leverages all the components of YARN without duplication. Apex was developed
 with YARN in mind and has no overlapping components/functionality with YARN.

 The Apex platform is supplemented by project Malhar, which is a library of
 operators that implement common business logic functions needed by customers
 who want to quickly develop applications. These operators provide access to
 HDFS, S3, NFS, FTP, and other file systems;  Kafka, ActiveMQ, RabbitMQ, JMS,
 and other message systems; MySql, Cassandra, MongoDB, Redis, HBase, CouchDB
 and other databases along with JDBC connectors. The Malhar library also
 includes a host of other common business logic patterns that help users to
 significantly reduce the time it takes to go into production. Ease of
 integration with all other big data technologies is one of the primary
 missions of Malhar.

 == Proposal ==
 The goal of this proposal is to establish the core engine of the DataTorrent RTS
 product as an Apache Software Foundation (ASF) project in order to build a
 vibrant, diverse, and self-governed open source community around the
 technology. DataTorrent will continue to sell management tools, application
 building tools, easy to use big data applications, and custom high end
 business logic operators. This proposal covers the Apex source code (written
 in Java), Apex documentation and other materials currently available on
 https://github.com/DataTorrent/Apex. This proposal also covers the Malhar
 source code (written in Java), Malhar documentation, and other materials
 currently available on https://github.com/DataTorrent/Malhar. We have done a
 trademark check on the name Apex, and have concluded that the Apex name is
 likely to be a suitable project name.

 == Background ==
 DataTorrent RTS is a mature and robust product developed as a native YARN
 application. RTS 1.0 was launched in the summer of 2014; RTS 2.0 was launched
 in Jan 2015. Both were well received by customers. RTS 3.0 was launched at the
 end of July 2015. RTS is among the first enterprise grade platforms that were
 developed from the ground up as a native YARN application. DataTorrent RTS is
 currently maintained by engineers as a closed source project. Even though
 the engineers behind RTS are experienced software engineers and are
 knowledge leaders in data-in-motion platforms, they have had little exposure
 to the open source governance process. Customers are currently running
 applications based on DataTorrent RTS in production.

 == Rationale ==
 Big data applications written for non-Hadoop platforms typically require
 major rewrites to get them to work with Hadoop. This rewriting creates a
 significant bottleneck in terms of resources (expertise) which in turn
 jeopardizes the viability of such an endeavour. It is hard enough to acquire
 big data expertise; demanding additional expertise to do a major code
 conversion makes it a very hard problem for projects to successfully migrate
 to Hadoop. Also, due to the batch processing nature of Hadoop’s MapReduce
 paradigm, users often have to wait tens of minutes to 

Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Justin Mclean
+1 binding

-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org



Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Naresh Agarwal
+1 (non-binding)

Thanks
Naresh

On Fri, Aug 14, 2015 at 11:14 AM, Justin Mclean jus...@classsoftware.com
wrote:

 +1 binding






Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Hitesh Shah
+1 (binding)

— Hitesh

On Aug 13, 2015, at 7:48 AM, P. Taylor Goetz ptgo...@apache.org wrote:

 Following the discussion thread [1], I would like to call a VOTE for 
 Accepting Apex as a new Apache Incubator project.
 
 The proposal is available on the wiki [2] and is also attached below.
 
 The VOTE will be open for at least 72 hours.
 
 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…
 
 Thanks,
 
 -Taylor
 
 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal
 
 
 == Abstract ==
 Apex is an enterprise grade native YARN big data-in-motion platform that 
 unifies stream processing as well as batch processing. Apex processes big 
 data in-motion in a highly scalable, highly performant, fault tolerant, 
 stateful, secure, distributed, and an easily operable way. It provides a 
 simple API that enables users to write or re-use generic Java code, thereby 
 lowering the expertise needed to write big data applications.
 
 Functional and operational specifications are separated. Apex is designed in 
 a way to enable users to write their own code (aka user defined functions) as 
 is and leave all operability to the platform. The API is very simple and is 
 designed to allow users to drop in their code as is. The platform mainly 
 deals with operability and treats functional code as a black box. Operability 
 includes fault tolerance, scalability, security, ease of use, metrics API,
 web services, etc. In other words, there is no separation of UDF (user defined
 functions), as all functional code is UDF. This frees users to focus on
 functional development, and lets the platform provide operability support. The
 same code runs as is with different operability attributes. The 
 data-in-motion architecture of Apex unifies stream as well as batch 
 processing in a single platform. Since Apex is a native YARN application, it 
 leverages all the components of YARN without duplication. Apex was developed 
 with YARN in mind and has no overlapping components/functionality with YARN.
 
 The Apex platform is supplemented by project Malhar, which is a library of 
 operators that implement common business logic functions needed by customers 
 who want to quickly develop applications. These operators provide access to 
 HDFS, S3, NFS, FTP, and other file systems; Kafka, ActiveMQ, RabbitMQ, JMS,
 and other message systems; MySQL, Cassandra, MongoDB, Redis, HBase, CouchDB
 and other databases along with JDBC connectors. The Malhar library also 
 includes a host of other common business logic patterns that help users to 
 significantly reduce the time it takes to go into production. Ease of 
 integration with all other big data technologies is one of the primary 
 missions of Malhar.
 
 == Proposal ==
 The goal of this proposal is to establish the core engine of the DataTorrent RTS
 product as an Apache Software Foundation (ASF) project in order to build a 
 vibrant, diverse, and self-governed open source community around the 
 technology. DataTorrent will continue to sell management tools, application 
 building tools, easy to use big data applications, and custom high end 
 business logic operators. This proposal covers the Apex source code (written 
 in Java), Apex documentation and other materials currently available on 
 https://github.com/DataTorrent/Apex. This proposal also covers the Malhar 
 source code (written in Java), Malhar documentation, and other materials 
 currently available on https://github.com/DataTorrent/Malhar. We have done a 
 trademark check on the name Apex, and have concluded that the Apex name is 
 likely to be a suitable project name. 
 
 == Background ==
 DataTorrent RTS is a mature and robust product developed as a native YARN 
 application. RTS 1.0 was launched in the summer of 2014; RTS 2.0 was launched
 in Jan 2015. Both were well received by customers. RTS 3.0 was launched at the
 end of July 2015. RTS is among the first enterprise grade platforms that were
 developed from the ground up as a native YARN application. DataTorrent RTS is
 currently maintained by engineers as a closed source project. Even though the 
 engineers behind RTS are experienced software engineers and are knowledge 
 leaders in data-in-motion platforms, they have had little exposure to the 
 open source governance process. Customers are currently running applications 
 based on DataTorrent RTS in production.
 
 == Rationale ==
 Big data applications written for non-Hadoop platforms typically require 
 major rewrites to get them to work with Hadoop. This rewriting creates a
 significant bottleneck in terms of resources (expertise) which in turn 
 jeopardizes the viability of such an endeavour. It is hard enough to acquire 
 big data expertise; demanding additional expertise to do a major code
 conversion makes it a very hard problem for projects to successfully migrate 
 to Hadoop. Also, due to the batch processing nature of