[RESULT] [VOTE] Accept REEF into the Apache Incubator

2014-08-13 Thread Byung-Gon Chun
Thanks to everyone who voted! The vote has passed with 13 binding +1 votes,
2 non-binding +1 votes, and no +0 or -1 votes.

Binding (+1)
Ross Gardler
Till Westmann
Alan D. Cabrera
Konstantin Boudnik
Bertrand Delacretaz
Jakob Glen Homan
Chris A Mattmann
Andrew Purtell
Owen O'Malley
Jake Farrell
Suresh Srinivas
Roman Shaposhnik
Chris Douglas

Non-binding (+1)
Hitesh Shah
Jan Iversen

We will follow the next steps under the guidance of our mentors.

Thanks!
- Gon

---
Byung-Gon Chun


On Sat, Aug 9, 2014 at 2:40 PM, Byung-Gon Chun bgc...@gmail.com wrote:

 Hi,

 Thanks for participating in the proposal discussion on REEF. The
 discussion has calmed. I would like to call a vote for acceptance of REEF
 into the Apache Incubator.

 The proposal is attached below, and it is also available at
 https://wiki.apache.org/incubator/ReefProposal

 Let's keep this vote open for three business days, closing the voting on
 August 11, 11:59PM (PDT).

 [] +1 Accept REEF into the Incubator
 [] 0 Don't care
 [] -1 Don't accept REEF because...

 Thanks!
 -Gon

 --
 Byung-Gon Chun


 # REEFProposal - Incubator


 # Abstract

 REEF (Retainable Evaluator Execution Framework) is a scale-out
 computing fabric that eases the development of Big Data applications
 on top of resource managers such as Apache YARN and Mesos.


 # Proposal

 REEF is a Big Data system that makes it easy to implement scalable,
 fault-tolerant runtime environments for a range of data processing
 models (e.g., graph processing and machine learning) on top of
 resource managers such as Apache YARN and Mesos. REEF provides
 capabilities to run multiple heterogeneous frameworks, and workflows
 composed of them, efficiently.

 Additionally, REEF contains two libraries that are of independent
 value: Wake is an event-based-programming framework inspired by Rx and
 SEDA.  Tang is a dependency injection framework inspired by Google
 Guice, but designed specifically for configuring distributed systems.


 # Background

 Resource managers such as Apache YARN and Mesos have
 emerged as a critical layer in the new scale-out data processing
 stack; they assume the responsibility of multiplexing a
 cluster of shared-nothing machines across heterogeneous
 applications. They operate behind an interface for leasing containers
 - a slice of a machine’s resources - to computations in an elastic
 fashion. However, building data processing frameworks directly on this
 layer comes at a high cost: each framework must tackle the same
 challenges (e.g., fault-tolerance, task scheduling and coordination)
 and reimplement common mechanisms (e.g., caching, bulk transfers).

 REEF provides a reusable control-plane for scheduling and coordinating
 task-level work on cluster resource managers. The REEF design enables
 sophisticated optimizations, such as container re-use and data
 caching, and facilitates workflows that span multiple
 frameworks. Examples include pipelining data between different
 operators in a relational system, retaining state across iterations in
 iterative or recursive data flow, and passing the result of a
 MapReduce job to a Machine Learning computation.


 # Rationale

 Since REEF is a library that makes it easy to write distributed
 applications on top of Apache YARN or Mesos, the Apache Software
 Foundation is the perfect home for hosting REEF.


 # Current Status

 REEF has been developed mostly by Microsoft, UCLA and the Seoul
 National University.  The REEF codebase is open-sourced under Apache
 License 2.0 and is currently hosted in a public repository at
 github.com.


 # Meritocracy

 We plan to build a strong open community by following the Apache
 meritocracy principles. We will work with those who contribute
 significantly to the project and invite them to be its committers.


 # Community

 REEF is currently being used internally at Microsoft.  Also, SK
 Telecom builds their data analytics infrastructure on top of REEF in
 collaboration with Seoul National University.  We hope to extend our
 contributor base by becoming an Apache incubator project. REEF will
 attract developers who are interested in creating common building
 blocks for simplifying the development of large-scale big data
 applications.


 # Core Developers

 Core developers are engineers from Microsoft, Purestorage, UCB, UCLA,
 UW and Seoul National University.


 # Alignment

 REEF depends on many Apache projects. REEF is built
 on resource managers such as Apache YARN and Apache Mesos. REEF also
 uses HDFS as a distributed storage layer.


 # Known Risks
 ## Orphaned Products

 The risk of REEF being orphaned is small because Microsoft products
 are built on REEF. The core REEF developers continue to work on REEF
 at Microsoft, UCLA, and Seoul National University. The REEF project is
 gaining interest from other institutions to be used as their
 infrastructure.

 ## Inexperience with Open Source

 Several core developers have experience with open source 

Re: [VOTE] Accept REEF into the Apache Incubator

2014-08-12 Thread Owen O'Malley
+1 (binding)


On Mon, Aug 11, 2014 at 6:20 PM, Hitesh Shah hit...@apache.org wrote:

 +1 ( non-binding )

 — Hitesh

 On Aug 8, 2014, at 10:40 PM, Byung-Gon Chun bgc...@gmail.com wrote:

  Hi,
 
  Thanks for participating in the proposal discussion on REEF. The
  discussion has calmed. I would like to call a vote for acceptance of
  REEF into the Apache Incubator.
 
  The proposal is attached below, and it is also available at
  https://wiki.apache.org/incubator/ReefProposal
 
  Let's keep this vote open for three business days, closing the voting on
  August 11, 11:59PM (PDT).
 
  [] +1 Accept REEF into the Incubator
  [] 0 Don't care
  [] -1 Don't accept REEF because...
 
  Thanks!
  -Gon
 
  --
  Byung-Gon Chun
 
 
  # REEFProposal - Incubator
 
 
  # Abstract
 
  REEF (Retainable Evaluator Execution Framework) is a scale-out
  computing fabric that eases the development of Big Data applications
  on top of resource managers such as Apache YARN and Mesos.
 
 
  # Proposal
 
  REEF is a Big Data system that makes it easy to implement scalable,
  fault-tolerant runtime environments for a range of data processing
  models (e.g., graph processing and machine learning) on top of
  resource managers such as Apache YARN and Mesos. REEF provides
  capabilities to run multiple heterogeneous frameworks and workflows of
  those efficiently.
 
  Additionally, REEF contains two libraries that are of independent
  value: Wake is an event-based-programming framework inspired by Rx and
  SEDA.  Tang is a dependency injection framework inspired by Google
  Guice, but designed specifically for configuring distributed systems.
 
 
  # Background
 
  The resource management layer such as Apache YARN and Mesos has
  emerged as a critical layer in the new scale-out data processing
  stack; resource managers assume the responsibility of multiplexing a
  cluster of shared-nothing machines across heterogeneous
  applications. They operate behind an interface for leasing containers
  - a slice of a machine’s resources - to computations in an elastic
  fashion. However, building data processing frameworks directly on this
  layer comes at a high cost: each framework must tackle the same
  challenges (e.g., fault-tolerance, task scheduling and coordination)
  and reimplement common mechanisms (e.g., caching, bulk transfers).
 
  REEF provides a reusable control-plane for scheduling and coordinating
  task-level work on cluster resource managers. The REEF design enables
  sophisticated optimizations, such as container re-use and data
  caching, and facilitates workflows that span multiple
  frameworks. Examples include pipelining data between different
  operators in a relational system, retaining state across iterations in
  iterative or recursive data flow, and passing the result of a
  MapReduce job to a Machine Learning computation.
 
 
  # Rationale
 
  Since REEF is a library that makes it easy to write distributed
  applications on top of Apache YARN or Mesos, the Apache Software
  Foundation is the perfect home for hosting REEF.
 
 
  # Current Status
 
  REEF has been developed mostly by Microsoft, UCLA and the Seoul
  National University.  The REEF codebase is open-sourced under Apache
  License 2.0 and is currently hosted in a public repository at
  github.com.
 
 
  # Meritocracy
 
  We plan to build a strong open community by following the Apache
  meritocracy principles. We will work with those who contribute
  significantly to the project and invite them to be its committers.
 
 
  # Community
 
  REEF is currently being used internally at Microsoft.  Also, SK
  Telecom builds their data analytics infrastructure on top of REEF in
  collaboration with Seoul National University.  We hope to extend our
  contributor base by becoming an Apache incubator project. REEF will
  attract developers who are interested in creating common building
  blocks for simplifying the development of large-scale big data
  applications.
 
 
  # Core Developers
 
  Core developers are engineers from Microsoft, Purestorage, UCB, UCLA,
  UW and Seoul National University.
 
 
  # Alignment
 
  REEF depends on many Apache projects and dependencies. REEF is built
  on resource managers such as Apache YARN and Apache Mesos. REEF also
  uses HDFS as a distributed storage layer.
 
 
  # Known Risks
  ## Orphaned Products
 
  The risk of REEF being orphaned is small because Microsoft products
  are built on REEF. The core REEF developers continue to work on REEF
  at Microsoft, UCLA, and Seoul National University. The REEF project is
  gaining interest from other institutions to be used as their
  infrastructure.
 
  ## Inexperience with Open Source
 
  Several core developers have experience with open source development.
  REEF committers will be guided by the mentors with strong Apache open
  source project backgrounds.
 
  ## Homogeneous Developers
 
  The initial committers include developers from several institutions
  including Microsoft, 

Re: [VOTE] Accept REEF into the Apache Incubator

2014-08-12 Thread Jake Farrell
+1 (binding)

-Jake


On Sat, Aug 9, 2014 at 1:40 AM, Byung-Gon Chun bgc...@gmail.com wrote:

 Hi,

 Thanks for participating in the proposal discussion on REEF. The discussion
 has calmed. I would like to call a vote for acceptance of REEF into the
 Apache Incubator.

 The proposal is attached below, and it is also available at
 https://wiki.apache.org/incubator/ReefProposal

 Let's keep this vote open for three business days, closing the voting on
 August 11, 11:59PM (PDT).

 [] +1 Accept REEF into the Incubator
 [] 0 Don't care
 [] -1 Don't accept REEF because...

 Thanks!
 -Gon

 --
 Byung-Gon Chun


 # REEFProposal - Incubator


 # Abstract

 REEF (Retainable Evaluator Execution Framework) is a scale-out
 computing fabric that eases the development of Big Data applications
 on top of resource managers such as Apache YARN and Mesos.


 # Proposal

 REEF is a Big Data system that makes it easy to implement scalable,
 fault-tolerant runtime environments for a range of data processing
 models (e.g., graph processing and machine learning) on top of
 resource managers such as Apache YARN and Mesos. REEF provides
 capabilities to run multiple heterogeneous frameworks and workflows of
 those efficiently.

 Additionally, REEF contains two libraries that are of independent
 value: Wake is an event-based-programming framework inspired by Rx and
 SEDA.  Tang is a dependency injection framework inspired by Google
 Guice, but designed specifically for configuring distributed systems.


 # Background

 The resource management layer such as Apache YARN and Mesos has
 emerged as a critical layer in the new scale-out data processing
 stack; resource managers assume the responsibility of multiplexing a
 cluster of shared-nothing machines across heterogeneous
 applications. They operate behind an interface for leasing containers
 - a slice of a machine’s resources - to computations in an elastic
 fashion. However, building data processing frameworks directly on this
 layer comes at a high cost: each framework must tackle the same
 challenges (e.g., fault-tolerance, task scheduling and coordination)
 and reimplement common mechanisms (e.g., caching, bulk transfers).

 REEF provides a reusable control-plane for scheduling and coordinating
 task-level work on cluster resource managers. The REEF design enables
 sophisticated optimizations, such as container re-use and data
 caching, and facilitates workflows that span multiple
 frameworks. Examples include pipelining data between different
 operators in a relational system, retaining state across iterations in
 iterative or recursive data flow, and passing the result of a
 MapReduce job to a Machine Learning computation.


 # Rationale

 Since REEF is a library that makes it easy to write distributed
 applications on top of Apache YARN or Mesos, the Apache Software Foundation
 is the perfect home for hosting REEF.


 # Current Status

 REEF has been developed mostly by Microsoft, UCLA and the Seoul
 National University.  The REEF codebase is open-sourced under Apache
 License 2.0 and is currently hosted in a public repository at
 github.com.


 # Meritocracy

 We plan to build a strong open community by following the Apache
 meritocracy principles. We will work with those who contribute
 significantly to the project and invite them to be its committers.


 # Community

 REEF is currently being used internally at Microsoft.  Also, SK
 Telecom builds their data analytics infrastructure on top of REEF in
 collaboration with Seoul National University.  We hope to extend our
 contributor base by becoming an Apache incubator project. REEF will
 attract developers who are interested in creating common building
 blocks for simplifying the development of large-scale big data
 applications.


 # Core Developers

 Core developers are engineers from Microsoft, Purestorage, UCB, UCLA,
 UW and Seoul National University.


 # Alignment

 REEF depends on many Apache projects and dependencies. REEF is built
 on resource managers such as Apache YARN and Apache Mesos. REEF also
 uses HDFS as a distributed storage layer.


 # Known Risks
 ## Orphaned Products

 The risk of REEF being orphaned is small because Microsoft products
 are built on REEF. The core REEF developers continue to work on REEF
 at Microsoft, UCLA, and Seoul National University. The REEF project is
 gaining interest from other institutions to be used as their
 infrastructure.

 ## Inexperience with Open Source

 Several core developers have experience with open source development.
 REEF committers will be guided by the mentors with strong Apache open
 source project backgrounds.

 ## Homogeneous Developers

 The initial committers include developers from several institutions
 including Microsoft, Purestorage, UCB, UCLA, and Seoul National
 University.

 ## Reliance on Salaried Developers

 Developers from Microsoft are paid to work on REEF. Since the work is
 used internally at Microsoft, Microsoft will keep supporting the
 

Re: [VOTE] Accept REEF into the Apache Incubator

2014-08-12 Thread Suresh Srinivas
+1 (binding)


On Fri, Aug 8, 2014 at 10:40 PM, Byung-Gon Chun bgc...@gmail.com wrote:

 Hi,

 Thanks for participating in the proposal discussion on REEF. The discussion
 has calmed. I would like to call a vote for acceptance of REEF into the
 Apache Incubator.

 The proposal is attached below, and it is also available at
 https://wiki.apache.org/incubator/ReefProposal

 Let's keep this vote open for three business days, closing the voting on
 August 11, 11:59PM (PDT).

 [] +1 Accept REEF into the Incubator
 [] 0 Don't care
 [] -1 Don't accept REEF because...

 Thanks!
 -Gon

 --
 Byung-Gon Chun


 # REEFProposal - Incubator


 # Abstract

 REEF (Retainable Evaluator Execution Framework) is a scale-out
 computing fabric that eases the development of Big Data applications
 on top of resource managers such as Apache YARN and Mesos.


 # Proposal

 REEF is a Big Data system that makes it easy to implement scalable,
 fault-tolerant runtime environments for a range of data processing
 models (e.g., graph processing and machine learning) on top of
 resource managers such as Apache YARN and Mesos. REEF provides
 capabilities to run multiple heterogeneous frameworks and workflows of
 those efficiently.

 Additionally, REEF contains two libraries that are of independent
 value: Wake is an event-based-programming framework inspired by Rx and
 SEDA.  Tang is a dependency injection framework inspired by Google
 Guice, but designed specifically for configuring distributed systems.


 # Background

 The resource management layer such as Apache YARN and Mesos has
 emerged as a critical layer in the new scale-out data processing
 stack; resource managers assume the responsibility of multiplexing a
 cluster of shared-nothing machines across heterogeneous
 applications. They operate behind an interface for leasing containers
 - a slice of a machine’s resources - to computations in an elastic
 fashion. However, building data processing frameworks directly on this
 layer comes at a high cost: each framework must tackle the same
 challenges (e.g., fault-tolerance, task scheduling and coordination)
 and reimplement common mechanisms (e.g., caching, bulk transfers).

 REEF provides a reusable control-plane for scheduling and coordinating
 task-level work on cluster resource managers. The REEF design enables
 sophisticated optimizations, such as container re-use and data
 caching, and facilitates workflows that span multiple
 frameworks. Examples include pipelining data between different
 operators in a relational system, retaining state across iterations in
 iterative or recursive data flow, and passing the result of a
 MapReduce job to a Machine Learning computation.


 # Rationale

 Since REEF is a library that makes it easy to write distributed
 applications on top of Apache YARN or Mesos, the Apache Software Foundation
 is the perfect home for hosting REEF.


 # Current Status

 REEF has been developed mostly by Microsoft, UCLA and the Seoul
 National University.  The REEF codebase is open-sourced under Apache
 License 2.0 and is currently hosted in a public repository at
 github.com.


 # Meritocracy

 We plan to build a strong open community by following the Apache
 meritocracy principles. We will work with those who contribute
 significantly to the project and invite them to be its committers.


 # Community

 REEF is currently being used internally at Microsoft.  Also, SK
 Telecom builds their data analytics infrastructure on top of REEF in
 collaboration with Seoul National University.  We hope to extend our
 contributor base by becoming an Apache incubator project. REEF will
 attract developers who are interested in creating common building
 blocks for simplifying the development of large-scale big data
 applications.


 # Core Developers

 Core developers are engineers from Microsoft, Purestorage, UCB, UCLA,
 UW and Seoul National University.


 # Alignment

 REEF depends on many Apache projects and dependencies. REEF is built
 on resource managers such as Apache YARN and Apache Mesos. REEF also
 uses HDFS as a distributed storage layer.


 # Known Risks
 ## Orphaned Products

 The risk of REEF being orphaned is small because Microsoft products
 are built on REEF. The core REEF developers continue to work on REEF
 at Microsoft, UCLA, and Seoul National University. The REEF project is
 gaining interest from other institutions to be used as their
 infrastructure.

 ## Inexperience with Open Source

 Several core developers have experience with open source development.
 REEF committers will be guided by the mentors with strong Apache open
 source project backgrounds.

 ## Homogeneous Developers

 The initial committers include developers from several institutions
 including Microsoft, Purestorage, UCB, UCLA, and Seoul National
 University.

 ## Reliance on Salaried Developers

 Developers from Microsoft are paid to work on REEF. Since the work is
 used internally at Microsoft, Microsoft will keep supporting the
 

Re: [VOTE] Accept REEF into the Apache Incubator

2014-08-12 Thread jan i
On Aug 12, 2014 7:26 PM, Suresh Srinivas sur...@hortonworks.com wrote:

 +1 (binding)
+1



 On Fri, Aug 8, 2014 at 10:40 PM, Byung-Gon Chun bgc...@gmail.com wrote:

  Hi,
 
  Thanks for participating in the proposal discussion on REEF. The
  discussion has calmed. I would like to call a vote for acceptance of
  REEF into the Apache Incubator.
 
  The proposal is attached below, and it is also available at
  https://wiki.apache.org/incubator/ReefProposal
 
  Let's keep this vote open for three business days, closing the voting on
  August 11, 11:59PM (PDT).
 
  [] +1 Accept REEF into the Incubator
  [] 0 Don't care
  [] -1 Don't accept REEF because...
 
  Thanks!
  -Gon
 
  --
  Byung-Gon Chun
 
 
  # REEFProposal - Incubator
 
 
  # Abstract
 
  REEF (Retainable Evaluator Execution Framework) is a scale-out
  computing fabric that eases the development of Big Data applications
  on top of resource managers such as Apache YARN and Mesos.
 
 
  # Proposal
 
  REEF is a Big Data system that makes it easy to implement scalable,
  fault-tolerant runtime environments for a range of data processing
  models (e.g., graph processing and machine learning) on top of
  resource managers such as Apache YARN and Mesos. REEF provides
  capabilities to run multiple heterogeneous frameworks and workflows of
  those efficiently.
 
  Additionally, REEF contains two libraries that are of independent
  value: Wake is an event-based-programming framework inspired by Rx and
  SEDA.  Tang is a dependency injection framework inspired by Google
  Guice, but designed specifically for configuring distributed systems.
 
 
  # Background
 
  The resource management layer such as Apache YARN and Mesos has
  emerged as a critical layer in the new scale-out data processing
  stack; resource managers assume the responsibility of multiplexing a
  cluster of shared-nothing machines across heterogeneous
  applications. They operate behind an interface for leasing containers
  - a slice of a machine’s resources - to computations in an elastic
  fashion. However, building data processing frameworks directly on this
  layer comes at a high cost: each framework must tackle the same
  challenges (e.g., fault-tolerance, task scheduling and coordination)
  and reimplement common mechanisms (e.g., caching, bulk transfers).
 
  REEF provides a reusable control-plane for scheduling and coordinating
  task-level work on cluster resource managers. The REEF design enables
  sophisticated optimizations, such as container re-use and data
  caching, and facilitates workflows that span multiple
  frameworks. Examples include pipelining data between different
  operators in a relational system, retaining state across iterations in
  iterative or recursive data flow, and passing the result of a
  MapReduce job to a Machine Learning computation.
 
 
  # Rationale
 
  Since REEF is a library that makes it easy to write distributed
  applications on top of Apache YARN or Mesos, the Apache Software
  Foundation is the perfect home for hosting REEF.
 
 
  # Current Status
 
  REEF has been developed mostly by Microsoft, UCLA and the Seoul
  National University.  The REEF codebase is open-sourced under Apache
  License 2.0 and is currently hosted in a public repository at
  github.com.
 
 
  # Meritocracy
 
  We plan to build a strong open community by following the Apache
  meritocracy principles. We will work with those who contribute
  significantly to the project and invite them to be its committers.
 
 
  # Community
 
  REEF is currently being used internally at Microsoft.  Also, SK
  Telecom builds their data analytics infrastructure on top of REEF in
  collaboration with Seoul National University.  We hope to extend our
  contributor base by becoming an Apache incubator project. REEF will
  attract developers who are interested in creating common building
  blocks for simplifying the development of large-scale big data
  applications.
 
 
  # Core Developers
 
  Core developers are engineers from Microsoft, Purestorage, UCB, UCLA,
  UW and Seoul National University.
 
 
  # Alignment
 
  REEF depends on many Apache projects and dependencies. REEF is built
  on resource managers such as Apache YARN and Apache Mesos. REEF also
  uses HDFS as a distributed storage layer.
 
 
  # Known Risks
  ## Orphaned Products
 
  The risk of REEF being orphaned is small because Microsoft products
  are built on REEF. The core REEF developers continue to work on REEF
  at Microsoft, UCLA, and Seoul National University. The REEF project is
  gaining interest from other institutions to be used as their
  infrastructure.
 
  ## Inexperience with Open Source
 
  Several core developers have experience with open source development.
  REEF committers will be guided by the mentors with strong Apache open
  source project backgrounds.
 
  ## Homogeneous Developers
 
  The initial committers include developers from several institutions
  including Microsoft, Purestorage, UCB, UCLA, 

Re: [VOTE] Accept REEF into the Apache Incubator

2014-08-12 Thread Roman Shaposhnik
On Fri, Aug 8, 2014 at 10:40 PM, Byung-Gon Chun bgc...@gmail.com wrote:
 Hi,

 Thanks for participating in the proposal discussion on REEF. The discussion
 has calmed. I would like to call a vote for acceptance of REEF into the
 Apache Incubator.

 The proposal is attached below, and it is also available at
 https://wiki.apache.org/incubator/ReefProposal

 Let's keep this vote open for three business days, closing the voting on
 August 11, 11:59PM (PDT).

 [] +1 Accept REEF into the Incubator
 [] 0 Don't care
 [] -1 Don't accept REEF because...

+1 (binding)

Thanks,
Roman.

-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org



Re: [VOTE] Accept REEF into the Apache Incubator

2014-08-12 Thread Chris Douglas
+1 -C

On Fri, Aug 8, 2014 at 10:40 PM, Byung-Gon Chun bgc...@gmail.com wrote:
 Hi,

 Thanks for participating in the proposal discussion on REEF. The discussion
 has calmed. I would like to call a vote for acceptance of REEF into the
 Apache Incubator.

 The proposal is attached below, and it is also available at
 https://wiki.apache.org/incubator/ReefProposal

 Let's keep this vote open for three business days, closing the voting on
 August 11, 11:59PM (PDT).

 [] +1 Accept REEF into the Incubator
 [] 0 Don't care
 [] -1 Don't accept REEF because...

 Thanks!
 -Gon

 --
 Byung-Gon Chun


 # REEFProposal - Incubator


 # Abstract

 REEF (Retainable Evaluator Execution Framework) is a scale-out
 computing fabric that eases the development of Big Data applications
 on top of resource managers such as Apache YARN and Mesos.


 # Proposal

 REEF is a Big Data system that makes it easy to implement scalable,
 fault-tolerant runtime environments for a range of data processing
 models (e.g., graph processing and machine learning) on top of
 resource managers such as Apache YARN and Mesos. REEF provides
 capabilities to run multiple heterogeneous frameworks and workflows of
 those efficiently.

 Additionally, REEF contains two libraries that are of independent
 value: Wake is an event-based-programming framework inspired by Rx and
 SEDA.  Tang is a dependency injection framework inspired by Google
 Guice, but designed specifically for configuring distributed systems.


 # Background

 The resource management layer such as Apache YARN and Mesos has
 emerged as a critical layer in the new scale-out data processing
 stack; resource managers assume the responsibility of multiplexing a
 cluster of shared-nothing machines across heterogeneous
 applications. They operate behind an interface for leasing containers
 - a slice of a machine’s resources - to computations in an elastic
 fashion. However, building data processing frameworks directly on this
 layer comes at a high cost: each framework must tackle the same
 challenges (e.g., fault-tolerance, task scheduling and coordination)
 and reimplement common mechanisms (e.g., caching, bulk transfers).

 REEF provides a reusable control-plane for scheduling and coordinating
 task-level work on cluster resource managers. The REEF design enables
 sophisticated optimizations, such as container re-use and data
 caching, and facilitates workflows that span multiple
 frameworks. Examples include pipelining data between different
 operators in a relational system, retaining state across iterations in
 iterative or recursive data flow, and passing the result of a
 MapReduce job to a Machine Learning computation.


 # Rationale

 Since REEF is a library that makes it easy to write distributed
 applications on top of Apache YARN or Mesos, the Apache Software Foundation
 is the perfect home for hosting REEF.


 # Current Status

 REEF has been developed mostly by Microsoft, UCLA and the Seoul
 National University.  The REEF codebase is open-sourced under Apache
 License 2.0 and is currently hosted in a public repository at
 github.com.


 # Meritocracy

 We plan to build a strong open community by following the Apache
 meritocracy principles. We will work with those who contribute
 significantly to the project and invite them to be its committers.


 # Community

 REEF is currently being used internally at Microsoft.  Also, SK
 Telecom builds their data analytics infrastructure on top of REEF in
 collaboration with Seoul National University.  We hope to extend our
 contributor base by becoming an Apache incubator project. REEF will
 attract developers who are interested in creating common building
 blocks for simplifying the development of large-scale big data
 applications.


 # Core Developers

 Core developers are engineers from Microsoft, Purestorage, UCB, UCLA,
 UW and Seoul National University.


 # Alignment

 REEF depends on many Apache projects and dependencies. REEF is built
 on resource managers such as Apache YARN and Apache Mesos. REEF also
 uses HDFS as a distributed storage layer.


 # Known Risks
 ## Orphaned Products

 The risk of REEF being orphaned is small because Microsoft products
 are built on REEF. The core REEF developers continue to work on REEF
 at Microsoft, UCLA, and Seoul National University. The REEF project is
 gaining interest from other institutions to be used as their
 infrastructure.

 ## Inexperience with Open Source

 Several core developers have experience with open source development.
 REEF committers will be guided by the mentors with strong Apache open
 source project backgrounds.

 ## Homogeneous Developers

 The initial committers include developers from several institutions
 including Microsoft, Purestorage, UCB, UCLA, and Seoul National
 University.

 ## Reliance on Salaried Developers

 Developers from Microsoft are paid to work on REEF. Since the work is
 used internally at Microsoft, Microsoft will keep supporting the
 developers to 

Re: [VOTE] Accept REEF into the Apache Incubator

2014-08-11 Thread Bertrand Delacretaz
On Sat, Aug 9, 2014 at 7:40 AM, Byung-Gon Chun bgc...@gmail.com wrote:
 ...I would like to call a vote for acceptance of REEF into the
 Apache Incubator...

+1

-Bertrand




Re: [VOTE] Accept REEF into the Apache Incubator

2014-08-11 Thread jghoman
+1 (binding)

-Jakob






From: Bertrand Delacretaz
Sent: Monday, August 11, 2014 1:16 AM
To: general@incubator.apache.org





On Sat, Aug 9, 2014 at 7:40 AM, Byung-Gon Chun bgc...@gmail.com wrote:
 ...I would like to call a vote for acceptance of REEF into the
 Apache Incubator...

+1

-Bertrand


Re: [VOTE] Accept REEF into the Apache Incubator

2014-08-11 Thread Mattmann, Chris A (3980)
+1 binding thanks 


 On Aug 8, 2014, at 10:40 PM, Byung-Gon Chun bgc...@gmail.com wrote:
 
 Hi,
 
 Thanks for participating in the proposal discussion on REEF. The discussion
 has calmed. I would like to call a vote for acceptance of REEF into the
 Apache Incubator.
 
 The proposal is attached below, and it is also available at
 https://wiki.apache.org/incubator/ReefProposal
 
 Let's keep this vote open for three business days, closing the voting on
 August 11, 11:59PM (PDT).
 
 [] +1 Accept REEF into the Incubator
 [] 0 Don't care
 [] -1 Don't accept REEF because...
 
 Thanks!
 -Gon
 
 -- 
 Byung-Gon Chun
 
 

Re: [VOTE] Accept REEF into the Apache Incubator

2014-08-11 Thread Andrew Purtell
+1 (binding)


On Fri, Aug 8, 2014 at 10:40 PM, Byung-Gon Chun bgc...@gmail.com wrote:

 Hi,

 Thanks for participating in the proposal discussion on REEF. The discussion
 has calmed. I would like to call a vote for acceptance of REEF into the
 Apache Incubator.

 The proposal is attached below, and it is also available at
 https://wiki.apache.org/incubator/ReefProposal

 Let's keep this vote open for three business days, closing the voting on
 August 11, 11:59PM (PDT).

 [] +1 Accept REEF into the Incubator
 [] 0 Don't care
 [] -1 Don't accept REEF because...

 Thanks!
 -Gon

 --
 Byung-Gon Chun



Re: [VOTE] Accept REEF into the Apache Incubator

2014-08-11 Thread Hitesh Shah
+1 (non-binding)

— Hitesh 

On Aug 8, 2014, at 10:40 PM, Byung-Gon Chun bgc...@gmail.com wrote:

 Hi,
 
 Thanks for participating in the proposal discussion on REEF. The discussion
 has calmed. I would like to call a vote for acceptance of REEF into the
 Apache Incubator.
 
 The proposal is attached below, and it is also available at
 https://wiki.apache.org/incubator/ReefProposal
 
 Let's keep this vote open for three business days, closing the voting on
 August 11, 11:59PM (PDT).
 
 [] +1 Accept REEF into the Incubator
 [] 0 Don't care
 [] -1 Don't accept REEF because...
 
 Thanks!
 -Gon
 
 -- 
 Byung-Gon Chun
 
 

Re: [VOTE] Accept REEF into the Apache Incubator

2014-08-10 Thread Konstantin Boudnik
+1

On Sat, Aug 09, 2014 at 02:40PM, Byung-Gon Chun wrote:
 Hi,
 
 Thanks for participating in the proposal discussion on REEF. The discussion
 has calmed. I would like to call a vote for acceptance of REEF into the
 Apache Incubator.
 
 The proposal is attached below, and it is also available at
 https://wiki.apache.org/incubator/ReefProposal
 
 Let's keep this vote open for three business days, closing the voting on
 August 11, 11:59PM (PDT).
 
 [] +1 Accept REEF into the Incubator
 [] 0 Don't care
 [] -1 Don't accept REEF because...
 
 Thanks!
 -Gon
 
 -- 
 Byung-Gon Chun
 
 

Re: [VOTE] Accept REEF into the Apache Incubator

2014-08-09 Thread Till Westmann
+1

On Fri, Aug 8, 2014 at 10:40 PM, Byung-Gon Chun bgc...@gmail.com wrote:

 Hi,

 Thanks for participating in the proposal discussion on REEF. The discussion
 has calmed. I would like to call a vote for acceptance of REEF into the
 Apache Incubator.

 The proposal is attached below, and it is also available at
 https://wiki.apache.org/incubator/ReefProposal

 Let's keep this vote open for three business days, closing the voting on
 August 11, 11:59PM (PDT).

 [] +1 Accept REEF into the Incubator
 [] 0 Don't care
 [] -1 Don't accept REEF because...

 Thanks!
 -Gon

 --
 Byung-Gon Chun



Re: [VOTE] Accept REEF into the Apache Incubator

2014-08-09 Thread Alan D. Cabrera
+1 binding


Regards,
Alan

On Aug 8, 2014, at 10:40 PM, Byung-Gon Chun bgc...@gmail.com wrote:

 Let's keep this vote open for three business days, closing the voting on
 August 11, 11:59PM (PDT).
 
 [] +1 Accept REEF into the Incubator
 [] 0 Don't care
 [] -1 Don't accept REEF because...



[VOTE] Accept REEF into the Apache Incubator

2014-08-08 Thread Byung-Gon Chun
Hi,

Thanks for participating in the proposal discussion on REEF. The discussion
has calmed. I would like to call a vote for acceptance of REEF into the
Apache Incubator.

The proposal is attached below, and it is also available at
https://wiki.apache.org/incubator/ReefProposal

Let's keep this vote open for three business days, closing the voting on
August 11, 11:59PM (PDT).

[] +1 Accept REEF into the Incubator
[] 0 Don't care
[] -1 Don't accept REEF because...

Thanks!
-Gon

-- 
Byung-Gon Chun


# REEFProposal - Incubator


# Abstract

REEF (Retainable Evaluator Execution Framework) is a scale-out
computing fabric that eases the development of Big Data applications
on top of resource managers such as Apache YARN and Mesos.


# Proposal

REEF is a Big Data system that makes it easy to implement scalable,
fault-tolerant runtime environments for a range of data processing
models (e.g., graph processing and machine learning) on top of
resource managers such as Apache YARN and Mesos. REEF provides
capabilities to efficiently run multiple heterogeneous frameworks and
workflows that span them.

Additionally, REEF contains two libraries that are of independent
value: Wake is an event-based-programming framework inspired by Rx and
SEDA.  Tang is a dependency injection framework inspired by Google
Guice, but designed specifically for configuring distributed systems.


# Background

The resource management layer, as provided by Apache YARN and Mesos,
has emerged as a critical layer in the new scale-out data processing
stack; resource managers assume the responsibility of multiplexing a
cluster of shared-nothing machines across heterogeneous
applications. They operate behind an interface for leasing containers
- a slice of a machine’s resources - to computations in an elastic
fashion. However, building data processing frameworks directly on this
layer comes at a high cost: each framework must tackle the same
challenges (e.g., fault-tolerance, task scheduling and coordination)
and reimplement common mechanisms (e.g., caching, bulk transfers).

REEF provides a reusable control-plane for scheduling and coordinating
task-level work on cluster resource managers. The REEF design enables
sophisticated optimizations, such as container re-use and data
caching, and facilitates workflows that span multiple
frameworks. Examples include pipelining data between different
operators in a relational system, retaining state across iterations in
iterative or recursive data flow, and passing the result of a
MapReduce job to a Machine Learning computation.


# Rationale

Since REEF is a library that makes it easy to write distributed
applications on top of Apache YARN or Mesos, the Apache Software Foundation
is the perfect home for hosting REEF.


# Current Status

REEF has been developed mostly by Microsoft, UCLA, and Seoul
National University.  The REEF codebase is open-sourced under Apache
License 2.0 and is currently hosted in a public repository at
github.com.


# Meritocracy

We plan to build a strong open community by following the Apache
meritocracy principles. We will work with those who contribute
significantly to the project and invite them to be its committers.


# Community

REEF is currently being used internally at Microsoft.  Also, SK
Telecom builds their data analytics infrastructure on top of REEF in
collaboration with Seoul National University.  We hope to extend our
contributor base by becoming an Apache incubator project. REEF will
attract developers who are interested in creating common building
blocks for simplifying the development of large-scale big data
applications.


# Core Developers

Core developers are engineers from Microsoft, Pure Storage, UCB, UCLA,
UW and Seoul National University.


# Alignment

REEF depends on many Apache projects. REEF is built
on resource managers such as Apache YARN and Apache Mesos. REEF also
uses HDFS as a distributed storage layer.


# Known Risks
## Orphaned Products

The risk of REEF being orphaned is small because Microsoft products
are built on REEF. The core REEF developers continue to work on REEF
at Microsoft, UCLA, and Seoul National University. Other institutions
are also showing interest in using REEF as their infrastructure.

## Inexperience with Open Source

Several core developers have experience with open source development.
REEF committers will be guided by mentors with strong Apache open
source project backgrounds.

## Homogeneous Developers

The initial committers include developers from several institutions
including Microsoft, Pure Storage, UCB, UCLA, and Seoul National
University.

## Reliance on Salaried Developers

Developers from Microsoft are paid to work on REEF. Since the work is
used internally at Microsoft, Microsoft will keep supporting the
developers to work on REEF. There are also engineers and graduate
students that contribute to REEF from UCLA, UCB, UW and Seoul National
University.  We plan to attract active developers 

Re: [VOTE] Accept REEF into the Apache Incubator

2014-08-08 Thread Ross Gardler
[x] +1 Accept REEF into the Incubator






On 8 August 2014 22:40, Byung-Gon Chun bgc...@gmail.com wrote:

 Hi,

 Thanks for participating in the proposal discussion on REEF. The discussion
 has calmed. I would like to call a vote for acceptance of REEF into the
 Apache Incubator.

 The proposal is attached below, and it is also available at
 https://wiki.apache.org/incubator/ReefProposal

 Let's keep this vote open for three business days, closing the voting on
 August 11, 11:59PM (PDT).

 [] +1 Accept REEF into the Incubator
 [] 0 Don't care
 [] -1 Don't accept REEF because...

 Thanks!
 -Gon

 --
 Byung-Gon Chun


 # REEFProposal - Incubator


 # Abstract

 REEF (Retainable Evaluator Execution Framework) is a scale-out
 computing fabric that eases the development of Big Data applications
 on top of resource managers such as Apache YARN and Mesos.


 # Proposal

 REEF is a Big Data system that makes it easy to implement scalable,
 fault-tolerant runtime environments for a range of data processing
 models (e.g., graph processing and machine learning) on top of
 resource managers such as Apache YARN and Mesos. REEF provides
 capabilities to run multiple heterogeneous frameworks, and workflows
 composed of them, efficiently.

 Additionally, REEF contains two libraries that are of independent
 value: Wake is an event-based-programming framework inspired by Rx and
 SEDA.  Tang is a dependency injection framework inspired by Google
 Guice, but designed specifically for configuring distributed systems.


 # Background

 The resource management layer (e.g., Apache YARN and Mesos) has
 emerged as a critical layer in the new scale-out data processing
 stack; resource managers assume the responsibility of multiplexing a
 cluster of shared-nothing machines across heterogeneous
 applications. They operate behind an interface for leasing containers
 - a slice of a machine's resources - to computations in an elastic
 fashion. However, building data processing frameworks directly on this
 layer comes at a high cost: each framework must tackle the same
 challenges (e.g., fault-tolerance, task scheduling and coordination)
 and reimplement common mechanisms (e.g., caching, bulk transfers).

 REEF provides a reusable control-plane for scheduling and coordinating
 task-level work on cluster resource managers. The REEF design enables
 sophisticated optimizations, such as container re-use and data
 caching, and facilitates workflows that span multiple
 frameworks. Examples include pipelining data between different
 operators in a relational system, retaining state across iterations in
 iterative or recursive data flow, and passing the result of a
 MapReduce job to a Machine Learning computation.


 # Rationale

 Since REEF is a library that makes it easy to write distributed
 applications on top of Apache YARN or Mesos, the Apache Software Foundation
 is the perfect home for hosting REEF.


 # Current Status

 REEF has been developed mostly by Microsoft, UCLA, and Seoul
 National University.  The REEF codebase is open-sourced under the
 Apache License 2.0 and is currently hosted in a public repository at
 github.com.


 # Meritocracy

 We plan to build a strong open community by following the Apache
 meritocracy principles. We will work with those who contribute
 significantly to the project and invite them to be its committers.


 # Community

 REEF is currently being used internally at Microsoft.  Also, SK
 Telecom builds their data analytics infrastructure on top of REEF in
 collaboration with Seoul National University.  We hope to extend our
 contributor base by becoming an Apache Incubator project. REEF will
 attract developers who are interested in creating common building
 blocks for simplifying the development of large-scale big data
 applications.


 # Core Developers

 Core developers are engineers from Microsoft, Pure Storage, UCB, UCLA,
 UW and Seoul National University.


 # Alignment

 REEF depends on several Apache projects. It is built on resource
 managers such as Apache YARN and Apache Mesos, and it uses HDFS as a
 distributed storage layer.


 # Known Risks
 ## Orphaned Products

 The risk of REEF being orphaned is small because Microsoft products
 are built on REEF. The core REEF developers continue to work on REEF
 at Microsoft, UCLA, and Seoul National University. Other institutions
 are also showing interest in using REEF as their infrastructure.

 ## Inexperience with Open Source

 Several core developers have experience with open source development.
 REEF committers will be guided by mentors with strong Apache open
 source project backgrounds.

 ## Homogeneous Developers

 The initial committers come from several institutions, including
 Microsoft, Pure Storage, UCB, UCLA, and Seoul National University.

 ## Reliance on Salaried Developers

 Developers from Microsoft are paid to work on REEF. Since the work is
 used internally at Microsoft, Microsoft will keep