Re: [VOTE] Graduate Apache Usergrid from the Incubator

2015-08-13 Thread Jim Jagielski
+1!!
 On Aug 10, 2015, at 12:15 PM, Dave snoopd...@gmail.com wrote:
 
 The Usergrid project has made three releases from the Incubator (1.0.0,
 1.0.1 and 1.0.2), has added multiple and diverse committers, and the
 project has completed all required items on the graduation check-list [1].
 Consensus appears to be ([2] and [3]) that the project is ready to graduate
 and so I'm calling this vote and sharing the Usergrid Top Level Project
 Resolution (see below).
 
 The vote will run for 72 hours, ending 3pm EST Thursday Aug 13, 2015.
 Everyone in the Usergrid and Incubator communities is invited and
 encouraged to vote, although only PPMC votes are binding.
 
 [ ] +1 Graduate Apache Usergrid from the Incubator.
 [ ] +0 Don't care.
 [ ] -1 Don't graduate Apache Usergrid from the Incubator because ...
 
 Here's my binding vote: +1.
 
 Thanks,
 Dave
 
 [1] http://incubator.apache.org/projects/usergrid.html
 [2] Dev list discussion:
 http://mail-archives.apache.org/mod_mbox/incubator-usergrid-dev/201507.mbox/%3CCAF1aazBvhYD3ZM_nKDDbrwO%3D4y6d%2BR1nH1M-2FWc9GZNuPtAjw%40mail.gmail.com%3E
 [3] Incubator discussion:
 http://mail-archives.apache.org/mod_mbox/incubator-general/201508.mbox/%3CCAF1aazCBCKNNYGT42%2BuGo%3DAdMb9uLkBrEm04rWj2tPe5LG%2BE9A%40mail.gmail.com%3E
 
 
 Apache Usergrid top-level project resolution:
 
   WHEREAS, the Board of Directors deems it to be in the best
   interests of the Foundation and consistent with the
   Foundation's purpose to establish a Project Management
   Committee charged with the creation and maintenance of
   open-source software related to the Usergrid BaaS software,
   for distribution at no charge to the public.
 
   NOW, THEREFORE, BE IT RESOLVED, that a Project Management
   Committee (PMC), to be known as the Apache Usergrid Project,
   be and hereby is established pursuant to Bylaws of the
   Foundation; and be it further
 
   RESOLVED, that the Apache Usergrid Project be and hereby is
   responsible for the creation and maintenance of open-source
   software related to the Usergrid BaaS software; and be it further
 
   RESOLVED, that the office of Vice President, Usergrid be and
   hereby is created, the person holding such office to serve at
   the direction of the Board of Directors as the chair of the
   Apache Usergrid Project, and to have primary responsibility for
   management of the projects within the scope of responsibility
   of the Apache Usergrid Project; and be it further
 
   RESOLVED, that the persons listed immediately below be and
   hereby are appointed to serve as the initial members of the
   Apache Usergrid Project:
 
   * Tim Anglade       timangl...@apache.org
   * Askhat Asanaliev  aasanal...@apache.org
   * John D. Ament     johndam...@apache.org
   * Ed Anuff          edan...@apache.org
   * Furkan Bıçak      fbi...@apache.org
   * Ryan Bridges      ry...@apache.org
   * Jake Farrell      jfarr...@apache.org
   * Scott Ganyo       scottga...@apache.org
   * Sungju Jin        sun...@apache.org
   * Dave Johnson      snoopd...@apache.org
   * Alex Karasulu     akaras...@apache.org
   * Salih Kardan      skar...@apache.org
   * Jim Jagielski     j...@apache.org
   * Shaozhuang Liu    st...@apache.org
   * Nate McCall       zzn...@apache.org
   * John Mcgibbney    lewi...@apache.org
   * Alex Muramoto     amuram...@apache.org
   * Todd Nine         toddn...@apache.org
   * Luciano Resende   lrese...@apache.org
   * Yiğit Şaplı       yig...@apache.org
   * Rod Simpson       rockers...@apache.org
   * Jeff West         jeffreyaw...@apache.org
 
   NOW, THEREFORE, BE IT FURTHER RESOLVED, that Todd Nine
   be appointed to the office of Vice President, Usergrid, to serve
   in accordance with and subject to the direction of the Board of
   Directors and the Bylaws of the Foundation until death,
   resignation, retirement, removal or disqualification, or until
   a successor is appointed; and be it further
 
   RESOLVED, that the initial Apache Usergrid Project be and hereby
   is tasked with the migration and rationalization of the Apache
   Incubator Usergrid podling; and be it further
 
   RESOLVED, that all responsibility pertaining to the Apache
   Incubator Usergrid podling encumbered upon the Apache Incubator
   PMC are hereafter discharged.


-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org



[VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread P. Taylor Goetz
Following the discussion thread [1], I would like to call a VOTE for Accepting 
Apex as a new Apache Incubator project.

The proposal is available on the wiki [2] and is also attached below.

The VOTE will be open for at least 72 hours.

[ ] +1 Accept Apex into the Incubator
[ ] ±0 No opinion
[ ] -1 Do not accept Apex into the Incubator because…

Thanks,

-Taylor

[1] http://s.apache.org/apex_discuss
[2] https://wiki.apache.org/incubator/ApexProposal


== Abstract ==
Apex is an enterprise grade native YARN big data-in-motion platform that 
unifies stream processing as well as batch processing. Apex processes big data 
in-motion in a highly scalable, highly performant, fault tolerant, stateful, 
secure, distributed, and easily operable way. It provides a simple API that 
enables users to write or re-use generic Java code, thereby lowering the 
expertise needed to write big data applications.

Functional and operational specifications are separated. Apex is designed in a 
way to enable users to write their own code (aka user defined functions) as is 
and leave all operability to the platform. The API is very simple and is 
designed to allow users to drop in their code as is. The platform mainly deals 
with operability and treats functional code as a black box. Operability 
includes fault tolerance, scalability, security, ease of use, metrics api, 
webservices, etc. In other words there is no separation of UDF (user defined 
functions), as all functional code is UDF. This frees users to focus on 
functional development, and lets the platform provide operability support. The same 
code runs as is with different operability attributes. The data-in-motion 
architecture of Apex unifies stream as well as batch processing in a single 
platform. Since Apex is a native YARN application, it leverages all the 
components of YARN without duplication. Apex was developed with YARN in mind 
and has no overlapping components/functionality with YARN.

The Apex platform is supplemented by project Malhar, which is a library of 
operators that implement common business logic functions needed by customers 
who want to quickly develop applications. These operators provide access to 
HDFS, S3, NFS, FTP, and other file systems;  Kafka, ActiveMQ, RabbitMQ, JMS, 
and other message systems; MySql, Cassandra, MongoDB, Redis, HBase, CouchDB and 
other databases along with JDBC connectors. The Malhar library also includes a 
host of other common business logic patterns that help users to significantly 
reduce the time it takes to go into production. Ease of integration with all 
other big data technologies is one of the primary missions of Malhar.

== Proposal ==
The goal of this proposal is to establish the core engine of DataTorrent RTS 
product as an Apache Software Foundation (ASF) project in order to build a 
vibrant, diverse, and self-governed open source community around the 
technology. DataTorrent will continue to sell management tools, application 
building tools, easy to use big data applications, and custom high end business 
logic operators. This proposal covers the Apex source code (written in Java), 
Apex documentation and other materials currently available on 
https://github.com/DataTorrent/Apex. This proposal also covers the Malhar 
source code (written in Java), Malhar documentation, and other materials 
currently available on https://github.com/DataTorrent/Malhar. We have done a 
trademark check on the name Apex, and have concluded that the Apex name is 
likely to be a suitable project name.

== Background ==
DataTorrent RTS is a mature and robust product developed as a native YARN 
application. RTS 1.0 was launched in summer of 2014; RTS 2.0 was launched in 
Jan 2015. Both were well received by customers. RTS 3.0 was launched at the end of 
July 2015. RTS is among the first enterprise-grade platforms developed 
from the ground up as a native YARN application. DataTorrent RTS is currently 
maintained by engineers as a closed source project. Even though the engineers 
behind RTS are experienced software engineers and are knowledge leaders in 
data-in-motion platforms, they have had little exposure to the open source 
governance process. Customers are currently running applications based on 
DataTorrent RTS in production.

== Rationale ==
Big data applications written for non-Hadoop platforms typically require major 
rewrites to get them to work with Hadoop. This rewriting creates a significant 
bottleneck in terms of resources (expertise), which in turn jeopardizes the 
viability of such an endeavour. It is hard enough to acquire big data 
expertise; demanding additional expertise to do a major code conversion makes 
it very hard for projects to migrate to Hadoop successfully. Also, 
due to the batch processing nature of Hadoop’s MapReduce paradigm, users often 
have to wait tens of minutes to see results and act on them due to various 
delays in data flow. DataTorrent’s RTS 

Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Chris Nauroth
+1 (binding)

I believe the current proposal covers everything required.  Thank you to Amol 
for incorporating the community's feedback.

--Chris Nauroth

From: P. Taylor Goetz ptgo...@apache.org
Reply-To: general@incubator.apache.org
Date: Thursday, August 13, 2015 at 7:48 AM
To: Incubator general@incubator.apache.org
Subject: [VOTE] Accept Apex into the Apache Incubator

Following the discussion thread [1], I would like to call a VOTE for Accepting 
Apex as a new Apache Incubator project.

The proposal is available on the wiki [2] and is also attached below.

The VOTE will be open for at least 72 hours.

[ ] +1 Accept Apex into the Incubator
[ ] ±0 No opinion
[ ] -1 Do not accept Apex into the Incubator because…

Thanks,

-Taylor

[1] http://s.apache.org/apex_discuss
[2] https://wiki.apache.org/incubator/ApexProposal



Re: [VOTE] Graduate Usergrid from the incubator

2015-08-13 Thread Jim Jagielski
+1! (PS: I was away on a two-week vacation, which is why this is late!)

 On Aug 7, 2015, at 4:17 PM, Dave snoopd...@gmail.com wrote:
 
 Below is a revised TLP resolution for Usergrid for review.  I added to the
 list our mentors Jim J and Jake F. I believe the list is complete now.
 
 Also, I removed the paragraph below, which is unnecessary and only exists
 because I copied some other project's resolution. We don't need special
 Usergrid bylaws; the ASF bylaws are good enough.
 
   RESOLVED, that the initial Apache Usergrid Project be and hereby
   is tasked with the creation of a set of bylaws intended to
   encourage open development and increased participation in the
   Usergrid Project; and be it further
 
 I welcome any suggestions or other feedback on this resolution.
 
 Dave
 
 
 
 Apache Usergrid top-level project resolution:
 
   WHEREAS, the Board of Directors deems it to be in the best
   interests of the Foundation and consistent with the
   Foundation's purpose to establish a Project Management
   Committee charged with the creation and maintenance of
   open-source software related to the Usergrid BaaS software,
   for distribution at no charge to the public.
 
   NOW, THEREFORE, BE IT RESOLVED, that a Project Management
   Committee (PMC), to be known as the Apache Usergrid Project,
   be and hereby is established pursuant to Bylaws of the
   Foundation; and be it further
 
   RESOLVED, that the Apache Usergrid Project be and hereby is
   responsible for the creation and maintenance of open-source
   software related to the Usergrid BaaS software; and be it further
 
   RESOLVED, that the office of Vice President, Usergrid be and
   hereby is created, the person holding such office to serve at
   the direction of the Board of Directors as the chair of the
   Apache Usergrid Project, and to have primary responsibility for
   management of the projects within the scope of responsibility
   of the Apache Usergrid Project; and be it further
 
   RESOLVED, that the persons listed immediately below be and
   hereby are appointed to serve as the initial members of the
   Apache Usergrid Project:
 
   * Tim Anglade       timangl...@apache.org
   * Askhat Asanaliev  aasanal...@apache.org
   * John D. Ament     johndam...@apache.org
   * Ed Anuff          edan...@apache.org
   * Furkan Bıçak      fbi...@apache.org
   * Ryan Bridges      ry...@apache.org
   * Jake Farrell      jfarr...@apache.org
   * Scott Ganyo       scottga...@apache.org
   * Sungju Jin        sun...@apache.org
   * Dave Johnson      snoopd...@apache.org
   * Alex Karasulu     akaras...@apache.org
   * Salih Kardan      skar...@apache.org
   * Jim Jagielski     j...@apache.org
   * Shaozhuang Liu    st...@apache.org
   * Nate McCall       zzn...@apache.org
   * John Mcgibbney    lewi...@apache.org
   * Alex Muramoto     amuram...@apache.org
   * Todd Nine         toddn...@apache.org
   * Luciano Resende   lrese...@apache.org
   * Yiğit Şaplı       yig...@apache.org
   * Rod Simpson       rockers...@apache.org
   * Jeff West         jeffreyaw...@apache.org
 
   NOW, THEREFORE, BE IT FURTHER RESOLVED, that Todd Nine
   be appointed to the office of Vice President, Usergrid, to serve
   in accordance with and subject to the direction of the Board of
   Directors and the Bylaws of the Foundation until death,
   resignation, retirement, removal or disqualification, or until
   a successor is appointed; and be it further
 
   RESOLVED, that the initial Apache Usergrid Project be and hereby
   is tasked with the migration and rationalization of the Apache
   Incubator Usergrid podling; and be it further
 
   RESOLVED, that all responsibility pertaining to the Apache
   Incubator Usergrid podling encumbered upon the Apache Incubator
   PMC are hereafter discharged.
 
 
 
 
 
 
 
 
 On Fri, Aug 7, 2015 at 3:51 PM Dave snoopd...@gmail.com wrote:
 
 There is no VOTE in progress.
 
 We voted to graduate on the Usergrid dev list, I forwarded the results of
 the vote to this list and added a draft TLP resolution for review. As I
 said when I forwarded the email, I will call for an IPMC vote shortly.
 
 http://incubator.apache.org/guides/graduation.html
 
 Dave
 
 
 On Fri, Aug 7, 2015 at 3:35 PM Ted Dunning ted.dunn...@gmail.com wrote:
 
 On Fri, Aug 7, 2015 at 11:10 AM, Marvin Humphrey mar...@rectangular.com
 wrote:
 
 However, please do not change the text of any VOTE (graduation, entry
 into incubation, release approval, anything...) while it is underway.
 That retroactively changes the meaning of votes already cast, which is
 problematic. There are better ways to be flexible.
 
 
 To be concrete, some of the ways that I know about include:
 
 

Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Pramod Immaneni
+1 (Non-binding)

On Thu, Aug 13, 2015 at 7:48 AM, P. Taylor Goetz ptgo...@apache.org wrote:

 Following the discussion thread [1], I would like to call a VOTE for
 Accepting Apex as a new Apache Incubator project.

 The proposal is available on the wiki [2] and is also attached below.

 The VOTE will be open for at least 72 hours.

 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…

 Thanks,

 -Taylor

 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal



Re: [VOTE] Apache Johnzon 0.9-incubating release

2015-08-13 Thread Romain Manni-Bucau
+1


Romain Manni-Bucau
@rmannibucau https://twitter.com/rmannibucau |  Blog
http://rmannibucau.wordpress.com | Github https://github.com/rmannibucau |
LinkedIn https://www.linkedin.com/in/rmannibucau | Tomitriber
http://www.tomitribe.com

2015-08-13 11:03 GMT-07:00 Hendrik Dev hendrikde...@gmail.com:

 The Apache Johnzon PPMC has voted to release Apache Johnzon
 0.9-incubating based on the second release candidate described below. Now
 it is the IPMC's turn to vote.

 Git commit for the release is

 https://git-wip-us.apache.org/repos/asf?p=incubator-johnzon.git;a=commit;h=ab542d7208c2b60aa6fdab1b11fa01a1e9ad8241

 Maven staging repo:
 https://repository.apache.org/content/repositories/orgapachejohnzon-1007

 Source releases (zip/tar.gz):

 https://repository.apache.org/content/repositories/orgapachejohnzon-1007/org/apache/johnzon/apache-johnzon/0.9-incubating/apache-johnzon-0.9-incubating-src.zip
 SHA-1:51ab45e8fb5f315c482593923c9d85af6d1bc18b


 https://repository.apache.org/content/repositories/orgapachejohnzon-1007/org/apache/johnzon/apache-johnzon/0.9-incubating/apache-johnzon-0.9-incubating-src.tar.gz
 SHA-1:f80bfce2717d632e38eb12930e1f6a5f30f0180a

 The files are also here
 https://dist.apache.org/repos/dist/dev/incubator/johnzon/

 PGP release keys (signed using 90910A83):
 https://dist.apache.org/repos/dist/release/incubator/johnzon/KEYS
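
For reviewers checking the source archives listed above, here is a minimal sketch of comparing a downloaded file against a published SHA-1 value. The helper names (sha1_of_file, matches_published) and the local file path are hypothetical and not part of the Johnzon release tooling; only Python's standard hashlib module is used. It does not cover the PGP signature check against the KEYS file.

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large archives need not fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """True if the file's SHA-1 equals the published value (case-insensitive)."""
    return sha1_of_file(path) == published_hex.strip().lower()
```

Assuming the zip has been downloaded to the current directory, a reviewer could call matches_published("apache-johnzon-0.9-incubating-src.zip", "51ab45e8fb5f315c482593923c9d85af6d1bc18b") and expect True for an intact download.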

 This release is mainly a bug-fix release. The mapper also gains basic map
 support for nested converters, and snapshot dependencies have been removed.

 The vote will be open for at least 72 hours.

 Project vote passes with 3 binding +1 votes and no -1 votes:

 http://mail-archives.apache.org/mod_mbox/incubator-johnzon-dev/201508.mbox/%3C8173EB96-FC9B-433A-B659-06A5D83E4A14%40apache.org%3E
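
The pass condition quoted above (3 binding +1 votes and no -1 votes) can be sketched as a small tally. This is an illustration only: the function names are hypothetical, and the rule encoded here (at least three binding +1 votes and more binding +1 than binding -1) is an assumption based on the usual ASF release-vote convention, not something stated in this thread.

```python
from collections import Counter


def tally_binding(votes):
    """Count binding votes.

    `votes` is an iterable of (value, is_binding) pairs, where value is
    +1, 0, or -1. Non-binding votes are ignored for the tally.
    """
    counts = Counter(value for value, is_binding in votes if is_binding)
    return counts[1], counts[-1]


def release_vote_passes(votes, min_binding_plus=3):
    """Assumed rule: enough binding +1 votes, and more +1 than -1."""
    plus, minus = tally_binding(votes)
    return plus >= min_binding_plus and plus > minus
```

Under these assumptions, three binding +1 votes and one non-binding +1 pass, while two binding +1 votes plus any number of non-binding +1 votes do not.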

 [ ] +1  approve
 [ ] -1  disapprove (and reason why)

 Thanks

 --
 Hendrik Saly (salyh, hendrikdev22)
 @hendrikdev22
 PGP: 0x22D7F6EC





Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Gaurav Gupta
+1 (Non-binding)

-Gaurav

 On Aug 13, 2015, at 10:22 AM, Pramod Immaneni pra...@datatorrent.com wrote:
 
 +1 (Non-binding)
 
 On Thu, Aug 13, 2015 at 7:48 AM, P. Taylor Goetz ptgo...@apache.org wrote:
 
 Following the discussion thread [1], I would like to call a VOTE for
 Accepting Apex as a new Apache Incubator project.
 
 The proposal is available on the wiki [2] and is also attached below.
 
 The VOTE will be open for at least 72 hours.
 
 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…
 
 Thanks,
 
 -Taylor
 
 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal
 
 

Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Henry Saputra
+1 (binding)

Good luck guys!

- Henry

On Thu, Aug 13, 2015 at 7:48 AM, P. Taylor Goetz ptgo...@apache.org wrote:
 Following the discussion thread [1], I would like to call a VOTE for
 Accepting Apex as a new Apache Incubator project.

 The proposal is available on the wiki [2] and is also attached below.

 The VOTE will be open for at least 72 hours.

 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…

 Thanks,

 -Taylor

 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal


 == Abstract ==
 Apex is an enterprise grade native YARN big data-in-motion platform that
 unifies stream processing as well as batch processing. Apex processes big
 data in-motion in a highly scalable, highly performant, fault tolerant,
 stateful, secure, distributed, and an easily operable way. It provides a
 simple API that enables users to write or re-use generic Java code, thereby
 lowering the expertise needed to write big data applications.

 Functional and operational specifications are separated. Apex is designed in
 a way to enable users to write their own code (aka user defined functions)
 as is and leave all operability to the platform. The API is very simple and
 is designed to allow users to drop in their code as is. The platform mainly
 deals with operability and treats functional code as a black box.
 Operability includes fault tolerance, scalability, security, ease of use,
 metrics api, webservices, etc. In other words there is no separation of UDF
 (user defined functions), as all functional code is UDF. This frees users to
 focus on functional development, and lets platform provide operability
 support. The same code runs as is with different operability attributes. The
 data-in-motion architecture of Apex unifies stream as well as batch
 processing in a single platform. Since Apex is a native YARN application, it
 leverages all the components of YARN without duplication. Apex was developed
 with YARN in mind and has no overlapping components/functionality with YARN.

 The Apex platform is supplemented by project Malhar, which is a library of
 operators that implement common business logic functions needed by customers
 who want to quickly develop applications. These operators provide access to
 HDFS, S3, NFS, FTP, and other file systems;  Kafka, ActiveMQ, RabbitMQ, JMS,
 and other message systems; MySql, Cassandra, MongoDB, Redis, HBase, CouchDB
 and other databases along with JDBC connectors. The Malhar library also
 includes a host of other common business logic patterns that help users to
 significantly reduce the time it takes to go into production. Ease of
 integration with all other big data technologies is one of the primary
 missions of Malhar.

 == Proposal ==
 The goal of this proposal is to establish the core engine of DataTorrent RTS
 product as an Apache Software Foundation (ASF) project in order to build a
 vibrant, diverse, and self-governed open source community around the
 technology. DataTorrent will continue to sell management tools, application
 building tools, easy to use big data applications, and custom high end
 business logic operators. This proposal covers the Apex source code (written
 in Java), Apex documentation and other materials currently available on
 https://github.com/DataTorrent/Apex. This proposal also covers the Malhar
 source code (written in Java), Malhar documentation, and other materials
 currently available on https://github.com/DataTorrent/Malhar. We have done a
 trademark check on the name Apex, and have concluded that the Apex name is
 likely to be a suitable project name.

 == Background ==
 DataTorrent RTS is a mature and robust product developed as a native YARN
 application. RTS 1.0 was launched in summer of 2014; RTS 2.0 was launched in
 Jan 2015. Both were well received by customers. RTS 3.0 was launched at end
 of July 2015. RTS is among the first enterprise grade platforms to be
 developed from the ground up as a native YARN application. DataTorrent RTS is
 currently maintained by engineers as a closed source project. Even though
 the engineers behind RTS are experienced software engineers and are
 knowledge leaders in data-in-motion platforms, they have had little exposure
 to the open source governance process. Customers are currently running
 applications based on DataTorrent RTS in production.

 == Rationale ==
 Big data applications written for non-Hadoop platforms typically require
 major rewrites  to get them to work with Hadoop. This rewriting creates a
 significant bottleneck in terms of resources (expertise) which in turn
 jeopardizes the viability of such an endeavour. It is hard enough to acquire
 big data expertise, demanding additional expertise to do a major code
 conversion makes it a very hard problem for projects to successfully migrate
 to Hadoop. Also, due to the batch processing nature of Hadoop’s MapReduce
 paradigm, users often have to 

Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread David Yan
+1 (Non-binding)

On Thu, Aug 13, 2015 at 2:32 PM, Alan Gates alanfga...@gmail.com wrote:

 +1.

 Alan.

 Chris Nauroth cnaur...@hortonworks.com
 August 13, 2015 at 9:59
 +1 (binding)

 I believe the current proposal covers everything required. Thank you to
 Amol for incorporating the community's feedback.

 --Chris Nauroth

 From: P. Taylor Goetz ptgo...@apache.org
 Reply-To: general@incubator.apache.org
 Date: Thursday, August 13, 2015 at 7:48 AM
 To: Incubator general@incubator.apache.org
 Subject: [VOTE] Accept Apex into the Apache Incubator

 Following the discussion thread [1], I would like to call a VOTE for
 Accepting Apex as a new Apache Incubator project.

 The proposal is available on the wiki [2] and is also attached below.

 The VOTE will be open for at least 72 hours.

 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…

 Thanks,

 -Taylor

 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal


 == Abstract ==
 Apex is an enterprise grade native YARN big data-in-motion platform that
 unifies stream processing as well as batch processing. Apex processes big
 data in-motion in a highly scalable, highly performant, fault tolerant,
 stateful, secure, distributed, and an easily operable way. It provides a
 simple API that enables users to write or re-use generic Java code, thereby
 lowering the expertise needed to write big data applications.

 Functional and operational specifications are separated. Apex is designed
 in a way to enable users to write their own code (aka user defined
 functions) as is and leave all operability to the platform. The API is very
 simple and is designed to allow users to drop in their code as is. The
 platform mainly deals with operability and treats functional code as a
 black box. Operability includes fault tolerance, scalability, security,
 ease of use, metrics api, webservices, etc. In other words there is no
 separation of UDF (user defined functions), as all functional code is UDF.
 This frees users to focus on functional development, and lets platform
 provide operability support. The same code runs as is with different
 operability attributes. The data-in-motion architecture of Apex unifies
 stream as well as batch processing in a single platform. Since Apex is a
 native YARN application, it leverages all the components of YARN without
 duplication. Apex was developed with YARN in mind and has no overlapping
 components/functionality with YARN.

 The Apex platform is supplemented by project Malhar, which is a library of
 operators that implement common business logic functions needed by
 customers who want to quickly develop applications. These operators provide
 access to HDFS, S3, NFS, FTP, and other file systems; Kafka, ActiveMQ,
 RabbitMQ, JMS, and other message systems; MySql, Cassandra, MongoDB, Redis,
 HBase, CouchDB and other databases along with JDBC connectors. The Malhar
 library also includes a host of other common business logic patterns that
 help users to significantly reduce the time it takes to go into production.
 Ease of integration with all other big data technologies is one of the
 primary missions of Malhar.

 == Proposal ==
 The goal of this proposal is to establish the core engine of DataTorrent
 RTS product as an Apache Software Foundation (ASF) project in order to
 build a vibrant, diverse, and self-governed open source community around
 the technology. DataTorrent will continue to sell management tools,
 application building tools, easy to use big data applications, and custom
 high end business logic operators. This proposal covers the Apex source
 code (written in Java), Apex documentation and other materials currently
 available on https://github.com/DataTorrent/Apex. This proposal also
 covers the Malhar source code (written in Java), Malhar documentation, and
 other materials currently available on
 https://github.com/DataTorrent/Malhar. We have done a trademark check on
 the name Apex, and have concluded that the Apex name is likely to be a
 suitable project name.

 == Background ==
 DataTorrent RTS is a mature and robust product developed as a native YARN
 application. RTS 1.0 was launched in summer of 2014; RTS 2.0 was launched
 in Jan 2015. Both were well received by customers. RTS 3.0 was launched at
 end of July 2015. RTS is among the first enterprise grade platforms to be
 developed from the ground up as a native YARN application. DataTorrent RTS is
 currently maintained by engineers as a closed source project. Even though
 the engineers behind RTS are experienced software engineers and are
 knowledge leaders in data-in-motion platforms, they have had little
 exposure to the open source governance process. Customers are currently
 running applications 

Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread P. Taylor Goetz
+1 (binding)

-Taylor

 On Aug 13, 2015, at 10:48 AM, P. Taylor Goetz ptgo...@apache.org wrote:
 
 Following the discussion thread [1], I would like to call a VOTE for 
 Accepting Apex as a new Apache Incubator project.
 
 The proposal is available on the wiki [2] and is also attached below.
 
 The VOTE will be open for at least 72 hours.
 
 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…
 
 Thanks,
 
 -Taylor
 
 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal
 
 
 == Abstract ==
 Apex is an enterprise grade native YARN big data-in-motion platform that 
 unifies stream processing as well as batch processing. Apex processes big 
 data in-motion in a highly scalable, highly performant, fault tolerant, 
 stateful, secure, distributed, and an easily operable way. It provides a 
 simple API that enables users to write or re-use generic Java code, thereby 
 lowering the expertise needed to write big data applications.
 
 Functional and operational specifications are separated. Apex is designed in 
 a way to enable users to write their own code (aka user defined functions) as 
 is and leave all operability to the platform. The API is very simple and is 
 designed to allow users to drop in their code as is. The platform mainly 
 deals with operability and treats functional code as a black box. Operability 
 includes fault tolerance, scalability, security, ease of use, metrics api, 
 webservices, etc. In other words there is no separation of UDF (user defined 
 functions), as all functional code is UDF. This frees users to focus on 
 functional development, and lets platform provide operability support. The 
 same code runs as is with different operability attributes. The 
 data-in-motion architecture of Apex unifies stream as well as batch 
 processing in a single platform. Since Apex is a native YARN application, it 
 leverages all the components of YARN without duplication. Apex was developed 
 with YARN in mind and has no overlapping components/functionality with YARN.
 
 The Apex platform is supplemented by project Malhar, which is a library of 
 operators that implement common business logic functions needed by customers 
 who want to quickly develop applications. These operators provide access to 
 HDFS, S3, NFS, FTP, and other file systems;  Kafka, ActiveMQ, RabbitMQ, JMS, 
 and other message systems; MySql, Cassandra, MongoDB, Redis, HBase, CouchDB 
 and other databases along with JDBC connectors. The Malhar library also 
 includes a host of other common business logic patterns that help users to 
 significantly reduce the time it takes to go into production. Ease of 
 integration with all other big data technologies is one of the primary 
 missions of Malhar.
 
 == Proposal ==
 The goal of this proposal is to establish the core engine of DataTorrent RTS 
 product as an Apache Software Foundation (ASF) project in order to build a 
 vibrant, diverse, and self-governed open source community around the 
 technology. DataTorrent will continue to sell management tools, application 
 building tools, easy to use big data applications, and custom high end 
 business logic operators. This proposal covers the Apex source code (written 
 in Java), Apex documentation and other materials currently available on 
 https://github.com/DataTorrent/Apex.
 This proposal also covers the Malhar source code (written in Java), Malhar 
 documentation, and other materials currently available on 
 https://github.com/DataTorrent/Malhar. We have done a trademark check on
 the name Apex, and have concluded that the Apex name is likely to be a 
 suitable project name.
 
 == Background ==
 DataTorrent RTS is a mature and robust product developed as a native YARN 
 application. RTS 1.0 was launched in summer of 2014; RTS 2.0 was launched in 
 Jan 2015. Both were well received by customers. RTS 3.0 was launched at end 
 of July 2015. RTS is among the first enterprise grade platforms to be
 developed from the ground up as a native YARN application. DataTorrent RTS is
 currently maintained by engineers as a closed source project. Even though the 
 engineers behind RTS are experienced software engineers and are knowledge 
 leaders in data-in-motion platforms, they have had little exposure to the 
 open source governance process. Customers are currently running applications 
 based on DataTorrent RTS in production.
 
 == Rationale ==
 Big data applications written for non-Hadoop platforms typically require 
 major rewrites  to get them to work with Hadoop. This rewriting creates a 
 significant bottleneck in terms of resources (expertise) which in turn 
 jeopardizes the viability of such an endeavour. It is hard enough to acquire 
 big data expertise, demanding additional 

Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Ashwin Chandra Putta
+1 (non-binding)

On Thu, Aug 13, 2015 at 7:48 AM, P. Taylor Goetz ptgo...@apache.org wrote:

 Following the discussion thread [1], I would like to call a VOTE for
 Accepting Apex as a new Apache Incubator project.

 The proposal is available on the wiki [2] and is also attached below.

 The VOTE will be open for at least 72 hours.

 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…

 Thanks,

 -Taylor

 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal


 == Abstract ==
 Apex is an enterprise grade native YARN big data-in-motion platform that
 unifies stream processing as well as batch processing. Apex processes big
 data in-motion in a highly scalable, highly performant, fault tolerant,
 stateful, secure, distributed, and an easily operable way. It provides a
 simple API that enables users to write or re-use generic Java code, thereby
 lowering the expertise needed to write big data applications.

 Functional and operational specifications are separated. Apex is designed
 in a way to enable users to write their own code (aka user defined
 functions) as is and leave all operability to the platform. The API is very
 simple and is designed to allow users to drop in their code as is. The
 platform mainly deals with operability and treats functional code as a
 black box. Operability includes fault tolerance, scalability, security,
 ease of use, metrics api, webservices, etc. In other words there is no
 separation of UDF (user defined functions), as all functional code is UDF.
 This frees users to focus on functional development, and lets platform
 provide operability support. The same code runs as is with different
 operability attributes. The data-in-motion architecture of Apex unifies
 stream as well as batch processing in a single platform. Since Apex is a
 native YARN application, it leverages all the components of YARN without
 duplication. Apex was developed with YARN in mind and has no overlapping
 components/functionality with YARN.

 The Apex platform is supplemented by project Malhar, which is a library of
 operators that implement common business logic functions needed by
 customers who want to quickly develop applications. These operators provide
 access to HDFS, S3, NFS, FTP, and other file systems;  Kafka, ActiveMQ,
 RabbitMQ, JMS, and other message systems; MySql, Cassandra, MongoDB, Redis,
 HBase, CouchDB and other databases along with JDBC connectors. The Malhar
 library also includes a host of other common business logic patterns that
 help users to significantly reduce the time it takes to go into production.
 Ease of integration with all other big data technologies is one of the
 primary missions of Malhar.

 == Proposal ==
 The goal of this proposal is to establish the core engine of DataTorrent
 RTS product as an Apache Software Foundation (ASF) project in order to
 build a vibrant, diverse, and self-governed open source community around
 the technology. DataTorrent will continue to sell management tools,
 application building tools, easy to use big data applications, and custom
 high end business logic operators. This proposal covers the Apex source
 code (written in Java), Apex documentation and other materials currently
 available on https://github.com/DataTorrent/Apex. This proposal also
 covers the Malhar source code (written in Java), Malhar documentation, and
 other materials currently available on
 https://github.com/DataTorrent/Malhar. We have done a trademark check on
 the name Apex, and have concluded that the Apex name is likely to be a
 suitable project name.

 == Background ==
 DataTorrent RTS is a mature and robust product developed as a native YARN
 application. RTS 1.0 was launched in summer of 2014; RTS 2.0 was launched
 in Jan 2015. Both were well received by customers. RTS 3.0 was launched at
 end of July 2015. RTS is among the first enterprise grade platforms to be
 developed from the ground up as a native YARN application. DataTorrent RTS is
 currently maintained by engineers as a closed source project. Even though
 the engineers behind RTS are experienced software engineers and are
 knowledge leaders in data-in-motion platforms, they have had little
 exposure to the open source governance process. Customers are currently
 running applications based on DataTorrent RTS in production.

 == Rationale ==
 Big data applications written for non-Hadoop platforms typically require
 major rewrites  to get them to work with Hadoop. This rewriting creates a
 significant bottleneck in terms of resources (expertise) which in turn
 jeopardizes the viability of such an endeavour. It is hard enough to
 acquire big data expertise, demanding additional expertise to do a major
 code conversion makes it a very hard problem for projects to successfully
 migrate to Hadoop. Also, due to the batch processing nature of Hadoop’s
 MapReduce paradigm, users often have to wait tens of minutes 

Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread John D. Ament
+1

On Thu, Aug 13, 2015 at 12:48 PM P. Taylor Goetz ptgo...@apache.org wrote:

 Following the discussion thread [1], I would like to call a VOTE for
 Accepting Apex as a new Apache Incubator project.

 The proposal is available on the wiki [2] and is also attached below.

 The VOTE will be open for at least 72 hours.

 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…

 Thanks,

 -Taylor

 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal


 == Abstract ==
 Apex is an enterprise grade native YARN big data-in-motion platform that
 unifies stream processing as well as batch processing. Apex processes big
 data in-motion in a highly scalable, highly performant, fault tolerant,
 stateful, secure, distributed, and an easily operable way. It provides a
 simple API that enables users to write or re-use generic Java code, thereby
 lowering the expertise needed to write big data applications.

 Functional and operational specifications are separated. Apex is designed
 in a way to enable users to write their own code (aka user defined
 functions) as is and leave all operability to the platform. The API is very
 simple and is designed to allow users to drop in their code as is. The
 platform mainly deals with operability and treats functional code as a
 black box. Operability includes fault tolerance, scalability, security,
 ease of use, metrics api, webservices, etc. In other words there is no
 separation of UDF (user defined functions), as all functional code is UDF.
 This frees users to focus on functional development, and lets platform
 provide operability support. The same code runs as is with different
 operability attributes. The data-in-motion architecture of Apex unifies
 stream as well as batch processing in a single platform. Since Apex is a
 native YARN application, it leverages all the components of YARN without
 duplication. Apex was developed with YARN in mind and has no overlapping
 components/functionality with YARN.

 The Apex platform is supplemented by project Malhar, which is a library of
 operators that implement common business logic functions needed by
 customers who want to quickly develop applications. These operators provide
 access to HDFS, S3, NFS, FTP, and other file systems;  Kafka, ActiveMQ,
 RabbitMQ, JMS, and other message systems; MySql, Cassandra, MongoDB, Redis,
 HBase, CouchDB and other databases along with JDBC connectors. The Malhar
 library also includes a host of other common business logic patterns that
 help users to significantly reduce the time it takes to go into production.
 Ease of integration with all other big data technologies is one of the
 primary missions of Malhar.

 == Proposal ==
 The goal of this proposal is to establish the core engine of DataTorrent
 RTS product as an Apache Software Foundation (ASF) project in order to
 build a vibrant, diverse, and self-governed open source community around
 the technology. DataTorrent will continue to sell management tools,
 application building tools, easy to use big data applications, and custom
 high end business logic operators. This proposal covers the Apex source
 code (written in Java), Apex documentation and other materials currently
 available on https://github.com/DataTorrent/Apex. This proposal also
 covers the Malhar source code (written in Java), Malhar documentation, and
 other materials currently available on
 https://github.com/DataTorrent/Malhar. We have done a trademark check on
 the name Apex, and have concluded that the Apex name is likely to be a
 suitable project name.

 == Background ==
 DataTorrent RTS is a mature and robust product developed as a native YARN
 application. RTS 1.0 was launched in summer of 2014; RTS 2.0 was launched
 in Jan 2015. Both were well received by customers. RTS 3.0 was launched at
 end of July 2015. RTS is among the first enterprise grade platforms to be
 developed from the ground up as a native YARN application. DataTorrent RTS is
 currently maintained by engineers as a closed source project. Even though
 the engineers behind RTS are experienced software engineers and are
 knowledge leaders in data-in-motion platforms, they have had little
 exposure to the open source governance process. Customers are currently
 running applications based on DataTorrent RTS in production.

 == Rationale ==
 Big data applications written for non-Hadoop platforms typically require
 major rewrites  to get them to work with Hadoop. This rewriting creates a
 significant bottleneck in terms of resources (expertise) which in turn
 jeopardizes the viability of such an endeavour. It is hard enough to
 acquire big data expertise, demanding additional expertise to do a major
 code conversion makes it a very hard problem for projects to successfully
 migrate to Hadoop. Also, due to the batch processing nature of Hadoop’s
 MapReduce paradigm, users often have to wait tens of minutes to see results
 

Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Seetharam Venkatesh
+1 (Non-binding)

On Thu, Aug 13, 2015 at 1:09 PM Julian Hyde jh...@apache.org wrote:

 +1 (binding)

 Julian


  On Aug 13, 2015, at 12:40 PM, Gaurav Gupta gau...@datatorrent.com wrote:
 
  +1 (Non-binding)
 
  -Gaurav
 
  On Aug 13, 2015, at 10:22 AM, Pramod Immaneni pra...@datatorrent.com wrote:
 
  +1 (Non-binding)
 
  On Thu, Aug 13, 2015 at 7:48 AM, P. Taylor Goetz ptgo...@apache.org wrote:
 
  Following the discussion thread [1], I would like to call a VOTE for
  Accepting Apex as a new Apache Incubator project.
 
  The proposal is available on the wiki [2] and is also attached below.
 
  The VOTE will be open for at least 72 hours.
 
  [ ] +1 Accept Apex into the Incubator
  [ ] ±0 No opinion
  [ ] -1 Do not accept Apex into the Incubator because…
 
  Thanks,
 
  -Taylor
 
  [1] http://s.apache.org/apex_discuss
  [2] https://wiki.apache.org/incubator/ApexProposal
 
 
  == Abstract ==
  Apex is an enterprise grade native YARN big data-in-motion platform that
  unifies stream processing as well as batch processing. Apex processes big
  data in-motion in a highly scalable, highly performant, fault tolerant,
  stateful, secure, distributed, and an easily operable way. It provides a
  simple API that enables users to write or re-use generic Java code, thereby
  lowering the expertise needed to write big data applications.
 
  Functional and operational specifications are separated. Apex is designed
  in a way to enable users to write their own code (aka user defined
  functions) as is and leave all operability to the platform. The API is very
  simple and is designed to allow users to drop in their code as is. The
  platform mainly deals with operability and treats functional code as a
  black box. Operability includes fault tolerance, scalability, security,
  ease of use, metrics api, webservices, etc. In other words there is no
  separation of UDF (user defined functions), as all functional code is UDF.
  This frees users to focus on functional development, and lets platform
  provide operability support. The same code runs as is with different
  operability attributes. The data-in-motion architecture of Apex unifies
  stream as well as batch processing in a single platform. Since Apex is a
  native YARN application, it leverages all the components of YARN without
  duplication. Apex was developed with YARN in mind and has no overlapping
  components/functionality with YARN.
 
  The Apex platform is supplemented by project Malhar, which is a library of
  operators that implement common business logic functions needed by
  customers who want to quickly develop applications. These operators provide
  access to HDFS, S3, NFS, FTP, and other file systems; Kafka, ActiveMQ,
  RabbitMQ, JMS, and other message systems; MySql, Cassandra, MongoDB, Redis,
  HBase, CouchDB and other databases along with JDBC connectors. The Malhar
  library also includes a host of other common business logic patterns that
  help users to significantly reduce the time it takes to go into production.
  Ease of integration with all other big data technologies is one of the
  primary missions of Malhar.
 
  == Proposal ==
  The goal of this proposal is to establish the core engine of DataTorrent
  RTS product as an Apache Software Foundation (ASF) project in order to
  build a vibrant, diverse, and self-governed open source community around
  the technology. DataTorrent will continue to sell management tools,
  application building tools, easy to use big data applications, and custom
  high end business logic operators. This proposal covers the Apex source
  code (written in Java), Apex documentation and other materials currently
  available on https://github.com/DataTorrent/Apex. This proposal also
  covers the Malhar source code (written in Java), Malhar documentation, and
  other materials currently available on
  https://github.com/DataTorrent/Malhar. We have done a trademark check on
  the name Apex, and have concluded that the Apex name is likely to be a
  suitable project name.
 
  == Background ==
  DataTorrent RTS is a mature and robust product developed as a native YARN
  application. RTS 1.0 was launched in summer of 2014; RTS 2.0 was launched
  in Jan 2015. Both were well received by customers. RTS 3.0 was launched at
  end of July 2015. RTS is among the first enterprise grade platforms to be
  developed from the ground up as a native YARN application. DataTorrent RTS
  is currently maintained by engineers as a closed source project. Even though
  the engineers behind RTS are experienced software engineers and are
  knowledge leaders in data-in-motion platforms, they have had little
  exposure to the open source governance process. Customers are currently
  running applications based on DataTorrent RTS in production.
 
  == Rationale ==
  Big data applications written for non-Hadoop platforms typically require
  major rewrites to get them to work with Hadoop. This rewriting creates a
  

Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Ted Dunning
+1 (binding)



On Thu, Aug 13, 2015 at 1:47 PM, Hitesh Shah hit...@apache.org wrote:

 +1 (binding)

 — Hitesh

 On Aug 13, 2015, at 7:48 AM, P. Taylor Goetz ptgo...@apache.org wrote:

  Following the discussion thread [1], I would like to call a VOTE for
  Accepting Apex as a new Apache Incubator project.
 
  The proposal is available on the wiki [2] and is also attached below.
 
  The VOTE will be open for at least 72 hours.
 
  [ ] +1 Accept Apex into the Incubator
  [ ] ±0 No opinion
  [ ] -1 Do not accept Apex into the Incubator because…
 
  Thanks,
 
  -Taylor
 
  [1] http://s.apache.org/apex_discuss
  [2] https://wiki.apache.org/incubator/ApexProposal
 
 
  == Abstract ==
  Apex is an enterprise grade native YARN big data-in-motion platform that
  unifies stream processing as well as batch processing. Apex processes big
  data in-motion in a highly scalable, highly performant, fault tolerant,
  stateful, secure, distributed, and an easily operable way. It provides a
  simple API that enables users to write or re-use generic Java code, thereby
  lowering the expertise needed to write big data applications.
 
  Functional and operational specifications are separated. Apex is
  designed in a way to enable users to write their own code (aka user defined
  functions) as is and leave all operability to the platform. The API is very
  simple and is designed to allow users to drop in their code as is. The
  platform mainly deals with operability and treats functional code as a
  black box. Operability includes fault tolerance, scalability, security,
  ease of use, metrics api, webservices, etc. In other words there is no
  separation of UDF (user defined functions), as all functional code is UDF.
  This frees users to focus on functional development, and lets platform
  provide operability support. The same code runs as is with different
  operability attributes. The data-in-motion architecture of Apex unifies
  stream as well as batch processing in a single platform. Since Apex is a
  native YARN application, it leverages all the components of YARN without
  duplication. Apex was developed with YARN in mind and has no overlapping
  components/functionality with YARN.
 
  The Apex platform is supplemented by project Malhar, which is a library
 of operators that implement common business logic functions needed by
 customers who want to quickly develop applications. These operators provide
 access to HDFS, S3, NFS, FTP, and other file systems;  Kafka, ActiveMQ,
 RabbitMQ, JMS, and other message systems; MySql, Cassandra, MongoDB, Redis,
 HBase, CouchDB and other databases along with JDBC connectors. The Malhar
 library also includes a host of other common business logic patterns that
 help users to significantly reduce the time it takes to go into production.
 Ease of integration with all other big data technologies is one of the
 primary missions of Malhar.
 
  == Proposal ==
  The goal of this proposal is to establish the core engine of DataTorrent
 RTS product as an Apache Software Foundation (ASF) project in order to
 build a vibrant, diverse, and self-governed open source community around
 the technology. DataTorrent will continue to sell management tools,
 application building tools, easy to use big data applications, and custom
 high end business logic operators. This proposal covers the Apex source
 code (written in Java), Apex documentation and other materials currently
 available on https://github.com/DataTorrent/Apex. This proposal also
 covers the Malhar source code (written in Java), Malhar documentation, and
 other materials currently available on
 https://github.com/DataTorrent/Malhar. We have done a trademark check on
 the name Apex, and have concluded that the Apex name is likely to be a
 suitable project name.
 
  == Background ==
  DataTorrent RTS is a mature and robust product developed as a native
 YARN application. RTS 1.0 was launched in summer of 2014; RTS 2.0 was
 launched in Jan 2015. Both were well received by customers. RTS 3.0 was
 launched at end of July 2015. RTS is among the first enterprise grade
 platform that was developed from the ground up as native YARN application.
 DataTorrent RTS is currently maintained by engineers as a closed source
 project. Even though the engineers behind RTS are experienced software
 engineers and are knowledge leaders in data-in-motion platforms, they have
 had little exposure to the open source governance process. Customers are
 currently running applications based on DataTorrent RTS in production.
 
  == Rationale ==
  Big data applications written for non-Hadoop platforms typically require
 major rewrites  to get them to work with Hadoop. This rewriting creates a
 significant bottleneck in terms of resources (expertise) which in turn
 jeopardizes the viability of such an endeavour. It is hard enough to
 acquire big data expertise, demanding additional expertise to do a major
 code conversion makes it a very hard problem for projects to successfully
 migrate to 

Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Alan Gates
+1.

Alan.

 Chris Nauroth cnaur...@hortonworks.com
 August 13, 2015 at 9:59
 +1 (binding)

 I believe the current proposal covers everything required. Thank you
 to Amol for incorporating the community's feedback.

 --Chris Nauroth

 From: P. Taylor Goetz ptgo...@apache.org
 Reply-To: general@incubator.apache.org
 Date: Thursday, August 13, 2015 at 7:48 AM
 To: Incubator general@incubator.apache.org
 Subject: [VOTE] Accept Apex into the Apache Incubator

 Following the discussion thread [1], I would like to call a VOTE for
 Accepting Apex as a new Apache Incubator project.

 The proposal is available on the wiki [2] and is also attached below.

 The VOTE will be open for at least 72 hours.

 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…

 Thanks,

 -Taylor

 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal


 == Abstract ==
 Apex is an enterprise grade native YARN big data-in-motion platform
 that unifies stream processing as well as batch processing. Apex
 processes big data in-motion in a highly scalable, highly performant,
 fault tolerant, stateful, secure, distributed, and an easily operable
 way. It provides a simple API that enables users to write or re-use
 generic Java code, thereby lowering the expertise needed to write big
 data applications.

 Functional and operational specifications are separated. Apex is
 designed in a way to enable users to write their own code (aka user
 defined functions) as is and leave all operability to the platform.
 The API is very simple and is designed to allow users to drop in their
 code as is. The platform mainly deals with operability and treats
 functional code as a black box. Operability includes fault tolerance,
 scalability, security, ease of use, metrics api, webservices, etc. In
 other words there is no separation of UDF (user defined functions), as
 all functional code is UDF. This frees users to focus on functional
 development, and lets platform provide operability support. The same
 code runs as is with different operability attributes. The
 data-in-motion architecture of Apex unifies stream as well as batch
 processing in a single platform. Since Apex is a native YARN
 application, it leverages all the components of YARN without
 duplication. Apex was developed with YARN in mind and has no
 overlapping components/functionality with YARN.

 The Apex platform is supplemented by project Malhar, which is a
 library of operators that implement common business logic functions
 needed by customers who want to quickly develop applications. These
 operators provide access to HDFS, S3, NFS, FTP, and other file
 systems; Kafka, ActiveMQ, RabbitMQ, JMS, and other message systems;
 MySql, Cassandra, MongoDB, Redis, HBase, CouchDB and other databases
 along with JDBC connectors. The Malhar library also includes a host of
 other common business logic patterns that help users to significantly
 reduce the time it takes to go into production. Ease of integration
 with all other big data technologies is one of the primary missions of
 Malhar.

 == Proposal ==
 The goal of this proposal is to establish the core engine of
 DataTorrent RTS product as an Apache Software Foundation (ASF) project
 in order to build a vibrant, diverse, and self-governed open source
 community around the technology. DataTorrent will continue to sell
 management tools, application building tools, easy to use big data
 applications, and custom high end business logic operators. This
 proposal covers the Apex source code (written in Java), Apex
 documentation and other materials currently available on
 https://github.com/DataTorrent/Apex. This proposal also covers the
 Malhar source code (written in Java), Malhar documentation, and other
 materials currently available on
 https://github.com/DataTorrent/Malhar. We have done a trademark check
 on the name Apex, and have concluded that the Apex name is likely to
 be a suitable project name.

 == Background ==
 DataTorrent RTS is a mature and robust product developed as a native
 YARN application. RTS 1.0 was launched in summer of 2014; RTS 2.0 was
 launched in Jan 2015. Both were well received by customers. RTS 3.0
 was launched at end of July 2015. RTS is among the first enterprise
 grade platform that was developed from the ground up as native YARN
 application. DataTorrent RTS is currently maintained by engineers as a
 closed source project. Even though the engineers behind RTS are
 experienced software engineers and are knowledge leaders in
 data-in-motion platforms, they have had little exposure to the open
 source governance process. Customers are currently running
 applications based on DataTorrent RTS in production.

 == Rationale ==
 Big data applications written for non-Hadoop platforms typically
 require major rewrites to get them 

[VOTE] Apache Johnzon 0.9-incubating release

2015-08-13 Thread Hendrik Dev
The Apache Johnzon PPMC has voted to release Apache Johnzon
0.9-incubating based on the second release candidate described below. Now it
is the IPMC's turn to vote.

Git commit for the release is
https://git-wip-us.apache.org/repos/asf?p=incubator-johnzon.git;a=commit;h=ab542d7208c2b60aa6fdab1b11fa01a1e9ad8241

Maven staging repo:
https://repository.apache.org/content/repositories/orgapachejohnzon-1007

Source releases (zip/tar.gz):
https://repository.apache.org/content/repositories/orgapachejohnzon-1007/org/apache/johnzon/apache-johnzon/0.9-incubating/apache-johnzon-0.9-incubating-src.zip
SHA-1:51ab45e8fb5f315c482593923c9d85af6d1bc18b

https://repository.apache.org/content/repositories/orgapachejohnzon-1007/org/apache/johnzon/apache-johnzon/0.9-incubating/apache-johnzon-0.9-incubating-src.tar.gz
SHA-1:f80bfce2717d632e38eb12930e1f6a5f30f0180a

The files are also here
https://dist.apache.org/repos/dist/dev/incubator/johnzon/

PGP release keys (signed using 90910A83):
https://dist.apache.org/repos/dist/release/incubator/johnzon/KEYS
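
For reviewers checking the source archives, the SHA-1 values listed above
can be compared against a locally computed digest. The following is a
minimal Python sketch and is not part of the original announcement. The
helper names sha1_of_file and matches_published_sha1 are hypothetical, and
the sketch assumes the archive has already been downloaded locally.

```python
import hashlib


def sha1_of_file(path, chunk_size=65536):
    """Return the hex SHA-1 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_sha1(path, published):
    """Compare a file's SHA-1 with a published value, ignoring case and surrounding whitespace."""
    return sha1_of_file(path) == published.strip().lower()
```

For example, matches_published_sha1("apache-johnzon-0.9-incubating-src.zip",
"51ab45e8fb5f315c482593923c9d85af6d1bc18b") should return True if the
downloaded zip matches the value listed above. A matching checksum does not
replace verifying the PGP signature against the KEYS file.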

This is mainly a bug-fix release. The mapper also introduces basic map
support for nested converters, and snapshot dependencies have been removed.

The vote will be open for at least 72 hours.

Project vote passes with 3 binding +1 votes and no -1 votes:
http://mail-archives.apache.org/mod_mbox/incubator-johnzon-dev/201508.mbox/%3C8173EB96-FC9B-433A-B659-06A5D83E4A14%40apache.org%3E

[ ] +1  approve
[ ] -1  disapprove (and reason why)

Thanks

-- 
Hendrik Saly (salyh, hendrikdev22)
@hendrikdev22
PGP: 0x22D7F6EC

-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org



[RESULT] Graduate Apache Usergrid from the Incubator

2015-08-13 Thread Dave
Dave +1 (binding)
Jan +1 (binding)
John +1 (binding)
Daniel +1 (binding)
Ted +1 (binding)
Jake +1 (binding)
Konstantin +1 (binding)
Jim +1 (binding)

The vote PASSES. Thanks for voting!

The Incubator PMC has voted to proceed with Usergrid graduation. The next
step is to submit our Top Level Project Resolution (with the typo fix that
Jake pointed out) to the ASF Board of Directors for approval at their next
meeting.

Dave

Next step(s):
http://incubator.apache.org/guides/graduation.html#top-level-board-proposal


On Thu, Aug 13, 2015 at 1:11 PM Jim Jagielski j...@jagunet.com wrote:

 +1!!
  On Aug 10, 2015, at 12:15 PM, Dave snoopd...@gmail.com wrote:
 
  The Usergrid project has made three releases from the Incubator (1.0.0,
  1.0.1 and 1.0.2), has added multiple and diverse committers, and the
  project has completed all required items on the graduation check-list [1].
  Consensus appears to be ([2] and [3]) that the project is ready to graduate
  and so I'm calling this vote and sharing the Usergrid Top Level Project
  Resolution (see below).
 
  The vote will run for 72 hours, ending 3pm EST Thursday Aug 13, 2015.
  Everyone in the Usergrid and Incubator communities is invited and
  encouraged to vote, although only PPMC votes are binding
 
  [ ] +1 Graduate Apache Usergrid from the Incubator.
  [ ] +0 Don't care.
  [ ] -1 Don't graduate Apache Usergrid from the Incubator because ...
 
  Here's my binding vote: +1.
 
  Thanks,
  Dave
 
  [1] http://incubator.apache.org/projects/usergrid.html
  [2] Dev list discussion:
  http://mail-archives.apache.org/mod_mbox/incubator-usergrid-dev/201507.mbox/%3CCAF1aazBvhYD3ZM_nKDDbrwO%3D4y6d%2BR1nH1M-2FWc9GZNuPtAjw%40mail.gmail.com%3E
  [3] Incubator discussion:
  http://mail-archives.apache.org/mod_mbox/incubator-general/201508.mbox/%3CCAF1aazCBCKNNYGT42%2BuGo%3DAdMb9uLkBrEm04rWj2tPe5LG%2BE9A%40mail.gmail.com%3E
 
 
  Apache Usergrid top-level project resolution:
 
WHEREAS, the Board of Directors deems it to be in the best
interests of the Foundation and consistent with the
Foundation's purpose to establish a Project Management
Committee charged with the creation and maintenance of
open-source software related to the Usergrid BaaS software,
for distribution at no charge to the public.
 
NOW, THEREFORE, BE IT RESOLVED, that a Project Management
Committee (PMC), to be known as the Apache Usergrid Project,
be and hereby is established pursuant to Bylaws of the
Foundation; and be it further
 
RESOLVED, that the Apache Usergrid Project be and hereby is
responsible for the creation and maintenance of open-source
software related to the Usergrid BaaS software; and be it further
 
RESOLVED, that the office of Vice President, Usergrid be and
hereby is created, the person holding such office to serve at
the direction of the Board of Directors as the chair of the
Apache Usergrid Project, and to have primary responsibility for
management of the projects within the scope of responsibility
of the Apache Usergrid Project; and be it further
 
RESOLVED, that the persons listed immediately below be and
hereby are appointed to serve as the initial members of the
Apache Usergrid Project:
 
* Tim Anglade       timangl...@apache.org
* Askhat Asanaliev  aasanal...@apache.org
* John D. Ament     johndam...@apache.org
* Ed Anuff          edan...@apache.org
* Furkan Bıçak      fbi...@apache.org
* Ryan Bridges      ry...@apache.org
* Jake Farrell      jfarr...@apache.org
* Scott Ganyo       scottga...@apache.org
* Sungju Jin        sun...@apache.org
* Dave Johnson      snoopd...@apache.org
* Alex Karasulu     akaras...@apache.org
* Salih Kardan      skar...@apache.org
* Jim Jagielski     j...@apache.org
* Shaozhuang Liu    st...@apache.org
* Nate McCall       zzn...@apache.org
* John Mcgibbney    lewi...@apache.org
* Alex Muramoto     amuram...@apache.org
* Todd Nine         toddn...@apache.org
* Luciano Resende   lrese...@apache.org
* Yiğit Şaplı       yig...@apache.org
* Rod Simpson       rockers...@apache.org
* Jeff West         jeffreyaw...@apache.org
 
NOW, THEREFORE, BE IT FURTHER RESOLVED, that Todd Nine
be appointed to the office of Vice President, Usergrid, to serve
in accordance with and subject to the direction of the Board of
Directors and the Bylaws of the Foundation until death,
resignation, retirement, removal or disqualification, or until
a successor is appointed; and be it further
 
RESOLVED, that the initial Apache Usergrid Project be and hereby
is tasked with the migration and rationalization of 

Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Julian Hyde
+1 (binding)

Julian


 On Aug 13, 2015, at 12:40 PM, Gaurav Gupta gau...@datatorrent.com wrote:
 
 +1 (Non-binding)
 
 -Gaurav
 
 On Aug 13, 2015, at 10:22 AM, Pramod Immaneni pra...@datatorrent.com wrote:
 
 +1 (Non-binding)
 
 On Thu, Aug 13, 2015 at 7:48 AM, P. Taylor Goetz ptgo...@apache.org wrote:
 
 Following the discussion thread [1], I would like to call a VOTE for
 Accepting Apex as a new Apache Incubator project.
 
 The proposal is available on the wiki [2] and is also attached below.
 
 The VOTE will be open for at least 72 hours.
 
 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…
 
 Thanks,
 
 -Taylor
 
 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal
 
 
 == Abstract ==
 Apex is an enterprise grade native YARN big data-in-motion platform that
 unifies stream processing as well as batch processing. Apex processes big
 data in-motion in a highly scalable, highly performant, fault tolerant,
 stateful, secure, distributed, and an easily operable way. It provides a
 simple API that enables users to write or re-use generic Java code, thereby
 lowering the expertise needed to write big data applications.
 
 Functional and operational specifications are separated. Apex is designed
 in a way to enable users to write their own code (aka user defined
 functions) as is and leave all operability to the platform. The API is very
 simple and is designed to allow users to drop in their code as is. The
 platform mainly deals with operability and treats functional code as a
 black box. Operability includes fault tolerance, scalability, security,
 ease of use, metrics api, webservices, etc. In other words there is no
 separation of UDF (user defined functions), as all functional code is UDF.
 This frees users to focus on functional development, and lets platform
 provide operability support. The same code runs as is with different
 operability attributes. The data-in-motion architecture of Apex unifies
 stream as well as batch processing in a single platform. Since Apex is a
 native YARN application, it leverages all the components of YARN without
 duplication. Apex was developed with YARN in mind and has no overlapping
 components/functionality with YARN.
 
 The Apex platform is supplemented by project Malhar, which is a library of
 operators that implement common business logic functions needed by
 customers who want to quickly develop applications. These operators provide
 access to HDFS, S3, NFS, FTP, and other file systems;  Kafka, ActiveMQ,
 RabbitMQ, JMS, and other message systems; MySql, Cassandra, MongoDB, Redis,
 HBase, CouchDB and other databases along with JDBC connectors. The Malhar
 library also includes a host of other common business logic patterns that
 help users to significantly reduce the time it takes to go into production.
 Ease of integration with all other big data technologies is one of the
 primary missions of Malhar.
 
 == Proposal ==
 The goal of this proposal is to establish the core engine of DataTorrent
 RTS product as an Apache Software Foundation (ASF) project in order to
 build a vibrant, diverse, and self-governed open source community around
 the technology. DataTorrent will continue to sell management tools,
 application building tools, easy to use big data applications, and custom
 high end business logic operators. This proposal covers the Apex source
 code (written in Java), Apex documentation and other materials currently
 available on https://github.com/DataTorrent/Apex. This proposal also
 covers the Malhar source code (written in Java), Malhar documentation, and
 other materials currently available on
 https://github.com/DataTorrent/Malhar. We have done a trademark check on
 the name Apex, and have concluded that the Apex name is likely to be a
 suitable project name.
 
 == Background ==
 DataTorrent RTS is a mature and robust product developed as a native YARN
 application. RTS 1.0 was launched in summer of 2014; RTS 2.0 was launched
 in Jan 2015. Both were well received by customers. RTS 3.0 was launched at
 end of July 2015. RTS is among the first enterprise grade platform that was
 developed from the ground up as native YARN application. DataTorrent RTS is
 currently maintained by engineers as a closed source project. Even though
 the engineers behind RTS are experienced software engineers and are
 knowledge leaders in data-in-motion platforms, they have had little
 exposure to the open source governance process. Customers are currently
 running applications based on DataTorrent RTS in production.
 
 == Rationale ==
 Big data applications written for non-Hadoop platforms typically require
 major rewrites  to get them to work with Hadoop. This rewriting creates a
 significant bottleneck in terms of resources (expertise) which in turn
 jeopardizes the viability of such an endeavour. It is hard enough to
 acquire big data expertise, demanding additional expertise 

Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Luke Han
+1 (Non-binding)


Best Regards!
-

Luke Han

On Fri, Aug 14, 2015 at 9:52 AM, Amol Kekre a...@datatorrent.com wrote:

 +1 (Non-binding)

 Amol

 On Thu, Aug 13, 2015 at 7:48 AM, P. Taylor Goetz ptgo...@apache.org wrote:

  Following the discussion thread [1], I would like to call a VOTE for
  Accepting Apex as a new Apache Incubator project.
 
  The proposal is available on the wiki [2] and is also attached below.
 
  The VOTE will be open for at least 72 hours.
 
  [ ] +1 Accept Apex into the Incubator
  [ ] ±0 No opinion
  [ ] -1 Do not accept Apex into the Incubator because…
 
  Thanks,
 
  -Taylor
 
  [1] http://s.apache.org/apex_discuss
  [2] https://wiki.apache.org/incubator/ApexProposal
 
 
  == Abstract ==
  Apex is an enterprise grade native YARN big data-in-motion platform that
  unifies stream processing as well as batch processing. Apex processes big
  data in-motion in a highly scalable, highly performant, fault tolerant,
  stateful, secure, distributed, and an easily operable way. It provides a
  simple API that enables users to write or re-use generic Java code, thereby
  lowering the expertise needed to write big data applications.
 
  Functional and operational specifications are separated. Apex is designed
  in a way to enable users to write their own code (aka user defined
  functions) as is and leave all operability to the platform. The API is very
  simple and is designed to allow users to drop in their code as is. The
  platform mainly deals with operability and treats functional code as a
  black box. Operability includes fault tolerance, scalability, security,
  ease of use, metrics api, webservices, etc. In other words there is no
  separation of UDF (user defined functions), as all functional code is UDF.
  This frees users to focus on functional development, and lets platform
  provide operability support. The same code runs as is with different
  operability attributes. The data-in-motion architecture of Apex unifies
  stream as well as batch processing in a single platform. Since Apex is a
  native YARN application, it leverages all the components of YARN without
  duplication. Apex was developed with YARN in mind and has no overlapping
  components/functionality with YARN.
 
  The Apex platform is supplemented by project Malhar, which is a library of
  operators that implement common business logic functions needed by
  customers who want to quickly develop applications. These operators provide
  access to HDFS, S3, NFS, FTP, and other file systems;  Kafka, ActiveMQ,
  RabbitMQ, JMS, and other message systems; MySql, Cassandra, MongoDB, Redis,
  HBase, CouchDB and other databases along with JDBC connectors. The Malhar
  library also includes a host of other common business logic patterns that
  help users to significantly reduce the time it takes to go into production.
  Ease of integration with all other big data technologies is one of the
  primary missions of Malhar.
 
  == Proposal ==
  The goal of this proposal is to establish the core engine of DataTorrent
  RTS product as an Apache Software Foundation (ASF) project in order to
  build a vibrant, diverse, and self-governed open source community around
  the technology. DataTorrent will continue to sell management tools,
  application building tools, easy to use big data applications, and custom
  high end business logic operators. This proposal covers the Apex source
  code (written in Java), Apex documentation and other materials currently
  available on https://github.com/DataTorrent/Apex. This proposal also
  covers the Malhar source code (written in Java), Malhar documentation, and
  other materials currently available on
  https://github.com/DataTorrent/Malhar. We have done a trademark check on
  the name Apex, and have concluded that the Apex name is likely to be a
  suitable project name.
 
  == Background ==
  DataTorrent RTS is a mature and robust product developed as a native YARN
  application. RTS 1.0 was launched in summer of 2014; RTS 2.0 was launched
  in Jan 2015. Both were well received by customers. RTS 3.0 was launched at
  end of July 2015. RTS is among the first enterprise grade platform that was
  developed from the ground up as native YARN application. DataTorrent RTS is
  currently maintained by engineers as a closed source project. Even though
  the engineers behind RTS are experienced software engineers and are
  knowledge leaders in data-in-motion platforms, they have had little
  exposure to the open source governance process. Customers are currently
  running applications based on DataTorrent RTS in production.
 
  == Rationale ==
  Big data applications written for non-Hadoop platforms typically require
  major rewrites  to get them to work with Hadoop. This rewriting creates a
  significant bottleneck in terms of resources (expertise) which in turn
  jeopardizes the viability of such an endeavour. It is hard enough to
  acquire big data expertise, demanding 

Re: apache binary distributions

2015-08-13 Thread Marvin Humphrey
On Thu, Aug 13, 2015 at 8:35 PM, Luke Han luke...@gmail.com wrote:
 There is a discussion in the Kylin community about adding a binary
 package to the release; people would really like to have one:
 http://mail-archives.apache.org/mod_mbox/incubator-kylin-dev/201508.mbox/%3CCAKmQrOZ_MFUyF_y7HXE7iVMCfJHuuOFuU4T8ibsPWfnw0z2Opw%40mail.gmail.com%3E

 For some reason, it is not easy for people (especially in China)
 to build from source, since many libraries are hosted on
 services that can't be accessed directly.

 Beyond that, the first impression of a project is how easily it can be
 set up correctly, so it does not make sense to have everyone
 build from source. And the reality is that many projects already DO
 provide binary packages for convenience.

 After reading such a long mail thread here, I am a little confused :-(
 There are too many messages... should we have some clear
 guide or practices for such binary releases?

Apache produces open source software, and official Apache releases consist of
source code.  Alongside such official releases, projects may offer binary
packages supplied by volunteers which meet certain criteria:

  http://www.apache.org/dev/release#what

  In some cases, binary/bytecode packages are also produced as a convenience
  to users that might not have the appropriate tools to build a compiled
  version of the source. In all such cases, the binary/bytecode package must
  have the same version number as the source release and may only add
  binary/bytecode files that are the result of compiling that version of the
  source code release.

That's not quite what you asked for in the thread on dev@kylin (embedding a
binary inside a source release) but is it good enough?

Embedding executable binary code inside an official source release is not
OK.  Binary files, though they may be derived from open source, are not open
source themselves and cannot be audited by a PMC.

Marvin Humphrey




Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Amol Kekre
+1 (Non-binding)

Amol

On Thu, Aug 13, 2015 at 7:48 AM, P. Taylor Goetz ptgo...@apache.org wrote:

 Following the discussion thread [1], I would like to call a VOTE for
 Accepting Apex as a new Apache Incubator project.

 The proposal is available on the wiki [2] and is also attached below.

 The VOTE will be open for at least 72 hours.

 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…

 Thanks,

 -Taylor

 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal


 == Abstract ==
 Apex is an enterprise grade native YARN big data-in-motion platform that
 unifies stream processing as well as batch processing. Apex processes big
 data in-motion in a highly scalable, highly performant, fault tolerant,
 stateful, secure, distributed, and an easily operable way. It provides a
 simple API that enables users to write or re-use generic Java code, thereby
 lowering the expertise needed to write big data applications.

 Functional and operational specifications are separated. Apex is designed
 in a way to enable users to write their own code (aka user defined
 functions) as is and leave all operability to the platform. The API is very
 simple and is designed to allow users to drop in their code as is. The
 platform mainly deals with operability and treats functional code as a
 black box. Operability includes fault tolerance, scalability, security,
 ease of use, metrics api, webservices, etc. In other words there is no
 separation of UDF (user defined functions), as all functional code is UDF.
 This frees users to focus on functional development, and lets platform
 provide operability support. The same code runs as is with different
 operability attributes. The data-in-motion architecture of Apex unifies
 stream as well as batch processing in a single platform. Since Apex is a
 native YARN application, it leverages all the components of YARN without
 duplication. Apex was developed with YARN in mind and has no overlapping
 components/functionality with YARN.

 The Apex platform is supplemented by project Malhar, which is a library of
 operators that implement common business logic functions needed by
 customers who want to quickly develop applications. These operators provide
 access to HDFS, S3, NFS, FTP, and other file systems;  Kafka, ActiveMQ,
 RabbitMQ, JMS, and other message systems; MySql, Cassandra, MongoDB, Redis,
 HBase, CouchDB and other databases along with JDBC connectors. The Malhar
 library also includes a host of other common business logic patterns that
 help users to significantly reduce the time it takes to go into production.
 Ease of integration with all other big data technologies is one of the
 primary missions of Malhar.

 == Proposal ==
 The goal of this proposal is to establish the core engine of DataTorrent
 RTS product as an Apache Software Foundation (ASF) project in order to
 build a vibrant, diverse, and self-governed open source community around
 the technology. DataTorrent will continue to sell management tools,
 application building tools, easy to use big data applications, and custom
 high end business logic operators. This proposal covers the Apex source
 code (written in Java), Apex documentation and other materials currently
 available on https://github.com/DataTorrent/Apex. This proposal also
 covers the Malhar source code (written in Java), Malhar documentation, and
 other materials currently available on
 https://github.com/DataTorrent/Malhar. We have done a trademark check on
 the name Apex, and have concluded that the Apex name is likely to be a
 suitable project name.

 == Background ==
 DataTorrent RTS is a mature and robust product developed as a native YARN
 application. RTS 1.0 was launched in summer of 2014; RTS 2.0 was launched
 in Jan 2015. Both were well received by customers. RTS 3.0 was launched at
 end of July 2015. RTS is among the first enterprise grade platform that was
 developed from the ground up as native YARN application. DataTorrent RTS is
 currently maintained by engineers as a closed source project. Even though
 the engineers behind RTS are experienced software engineers and are
 knowledge leaders in data-in-motion platforms, they have had little
 exposure to the open source governance process. Customers are currently
 running applications based on DataTorrent RTS in production.

 == Rationale ==
 Big data applications written for non-Hadoop platforms typically require
 major rewrites  to get them to work with Hadoop. This rewriting creates a
 significant bottleneck in terms of resources (expertise) which in turn
 jeopardizes the viability of such an endeavour. It is hard enough to
 acquire big data expertise, demanding additional expertise to do a major
 code conversion makes it a very hard problem for projects to successfully
 migrate to Hadoop. Also, due to the batch processing nature of Hadoop’s
 MapReduce paradigm, users often have to wait tens of 

Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Atri Sharma
+1 (Non Binding)
On 13 Aug 2015 22:18, P. Taylor Goetz ptgo...@apache.org wrote:

 Following the discussion thread [1], I would like to call a VOTE for
 Accepting Apex as a new Apache Incubator project.

 The proposal is available on the wiki [2] and is also attached below.

 The VOTE will be open for at least 72 hours.

 [ ] +1 Accept Apex into the Incubator
 [ ] ±0 No opinion
 [ ] -1 Do not accept Apex into the Incubator because…

 Thanks,

 -Taylor

 [1] http://s.apache.org/apex_discuss
 [2] https://wiki.apache.org/incubator/ApexProposal


 == Abstract ==
 Apex is an enterprise grade native YARN big data-in-motion platform that
 unifies stream processing as well as batch processing. Apex processes big
 data in-motion in a highly scalable, highly performant, fault tolerant,
 stateful, secure, distributed, and easily operable way. It provides a
 simple API that enables users to write or re-use generic Java code, thereby
 lowering the expertise needed to write big data applications.

 Functional and operational specifications are separated. Apex is designed
 in a way to enable users to write their own code (aka user defined
 functions) as is and leave all operability to the platform. The API is very
 simple and is designed to allow users to drop in their code as is. The
 platform mainly deals with operability and treats functional code as a
 black box. Operability includes fault tolerance, scalability, security,
 ease of use, metrics API, web services, etc. In other words, there is no
 separation of UDF (user defined functions), as all functional code is UDF.
 This frees users to focus on functional development, and lets the platform
 provide operability support. The same code runs as is with different
 operability attributes. The data-in-motion architecture of Apex unifies
 stream as well as batch processing in a single platform. Since Apex is a
 native YARN application, it leverages all the components of YARN without
 duplication. Apex was developed with YARN in mind and has no overlapping
 components/functionality with YARN.

 The Apex platform is supplemented by project Malhar, which is a library of
 operators that implement common business logic functions needed by
 customers who want to quickly develop applications. These operators provide
 access to HDFS, S3, NFS, FTP, and other file systems; Kafka, ActiveMQ,
 RabbitMQ, JMS, and other message systems; MySQL, Cassandra, MongoDB, Redis,
 HBase, CouchDB and other databases along with JDBC connectors. The Malhar
 library also includes a host of other common business logic patterns that
 help users to significantly reduce the time it takes to go into production.
 Ease of integration with all other big data technologies is one of the
 primary missions of Malhar.

 == Proposal ==
 The goal of this proposal is to establish the core engine of the DataTorrent
 RTS product as an Apache Software Foundation (ASF) project in order to
 build a vibrant, diverse, and self-governed open source community around
 the technology. DataTorrent will continue to sell management tools,
 application building tools, easy to use big data applications, and custom
 high end business logic operators. This proposal covers the Apex source
 code (written in Java), Apex documentation and other materials currently
 available on https://github.com/DataTorrent/Apex. This proposal also
 covers the Malhar source code (written in Java), Malhar documentation, and
 other materials currently available on
 https://github.com/DataTorrent/Malhar. We have done a trademark check on
 the name Apex, and have concluded that the Apex name is likely to be a
 suitable project name.

 == Background ==
 DataTorrent RTS is a mature and robust product developed as a native YARN
 application. RTS 1.0 was launched in summer of 2014; RTS 2.0 was launched
 in Jan 2015. Both were well received by customers. RTS 3.0 was launched at
 the end of July 2015. RTS is among the first enterprise grade platforms
 developed from the ground up as a native YARN application. DataTorrent RTS is
 currently maintained by engineers as a closed source project. Even though
 the engineers behind RTS are experienced software engineers and are
 knowledge leaders in data-in-motion platforms, they have had little
 exposure to the open source governance process. Customers are currently
 running applications based on DataTorrent RTS in production.

 == Rationale ==
 Big data applications written for non-Hadoop platforms typically require
 major rewrites to get them to work with Hadoop. This rewriting creates a
 significant bottleneck in terms of resources (expertise) which in turn
 jeopardizes the viability of such an endeavour. It is hard enough to
 acquire big data expertise; demanding additional expertise to do a major
 code conversion makes it a very hard problem for projects to successfully
 migrate to Hadoop. Also, due to the batch processing nature of Hadoop’s
 MapReduce paradigm, users often have to wait tens of minutes to see 

Re: apache binary distributions

2015-08-13 Thread Luke Han
There's a discussion in the Kylin community about adding a binary
package to the release; people would really like to have one:
http://mail-archives.apache.org/mod_mbox/incubator-kylin-dev/201508.mbox/%3CCAKmQrOZ_MFUyF_y7HXE7iVMCfJHuuOFuU4T8ibsPWfnw0z2Opw%40mail.gmail.com%3E

For various reasons, it is not easy for people (especially in China)
to build from source, since many libraries are hosted on
services that can't be accessed directly.

Beyond that, the first impression of a project comes from how easily it
can be set up correctly; it does not make sense to have everyone
build from source. And the reality is that many projects already DO provide
binary packages for convenience.

After reading such a long mail thread here, I am a little confused :-(
There are too many messages... Should we have some clear
guidelines or practices for such binary releases?

Thanks.




Best Regards!
-

Luke Han

On Mon, Aug 10, 2015 at 10:50 PM, David Nalley da...@gnsa.us wrote:

 On Sun, Aug 9, 2015 at 9:33 PM, Roman Shaposhnik ro...@shaposhnik.org
 wrote:
  On Fri, Aug 7, 2015 at 12:46 AM, Bertrand Delacretaz
  bdelacre...@apache.org wrote:
  On Fri, Aug 7, 2015 at 2:50 AM, Roman Shaposhnik ro...@shaposhnik.org
 wrote:
  ...is Apache Brand meant to protect *any* possible object/binary
  artifact or only those that PMC actually care about?...
 
  IMO any object/binary created from our source code has to be clearly
  identified as not coming from the ASF.
 
  Well, the real question is: do we aspire to have a monopoly on certain
  binary convenience artifacts? IOW, if a Hadoop PMC blessed an RPM
  as one of those artifacts, does it mean that only that RPM (however
  potentially screwed up it is from the standpoint of Fedora packaging
  guidelines) is the RPM that can be called Hadoop?
 

 That depends.
 And what it largely depends on is the product, the PMC producing it,
 and the user base.
 Some projects have problems with abuse of their marks: people bundling
 additional (occasionally malicious) software with the ASF-produced
 software. Some of those projects enforce (rightly IMO) trademark to
 the benefit of the project and its users.

 Other projects are much more lax with trademarks, yet remain very vibrant.

 Mozilla had similar problems to those that GCC had, which were
 described earlier. Linux distributions were patching the 'official'
 release, and inadvertently causing problems which ended up giving
 Mozilla products an undeserved (at least for those specific issues)
 bad reputation. So they rectified this by enforcing their trademarks,
 and declaring that any patches had to be approved by Mozilla if you
 were to retain the Mozilla brands on the software. Is that overkill
 for most of the products that call the ASF home? Probably. But for
 some projects, it makes sense.

 (This is completely separate from a discussion about a third-party
 using ASF marks for their own gain and confusing folks about the
 origin of the software they are using)

 --David

 -
 To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
 For additional commands, e-mail: general-h...@incubator.apache.org




Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Chris Douglas
+1 (binding) -C


Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Justin Mclean
+1 binding




Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Naresh Agarwal
+1 (non-binding)

Thanks
Naresh

On Fri, Aug 14, 2015 at 11:14 AM, Justin Mclean jus...@classsoftware.com
wrote:

 +1 binding






Re: [VOTE] Accept Apex into the Apache Incubator

2015-08-13 Thread Hitesh Shah
+1 (binding)

— Hitesh
