Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator

2015-02-21 Thread Adam Bordelon
I have updated the proposal to include Luciano as a Mentor, and marked it
as FINAL.
https://wiki.apache.org/incubator/MyriadProposal?action=recall&rev=7
I will open up a new thread for the VOTE.

On Sat, Feb 21, 2015 at 1:20 PM, jan i j...@apache.org wrote:

 On 21 February 2015 at 21:12, Luciano Resende luckbr1...@gmail.com
 wrote:

  Discussion has died down, and we had only positive feedback for the
  proposal. Should we start a formal vote?
 
 please do.

 rgds
 jan i

 
  On Wed, Feb 18, 2015 at 9:27 PM, Ted Dunning ted.dunn...@gmail.com
  wrote:
 
   On Wed, Feb 18, 2015 at 12:24 PM, Adam Bordelon a...@mesosphere.io
   wrote:
  
    I am personally in favor of adding Luciano Resende to the Nominated
    Mentors list (the more the merrier, right?), but I want to get approval
    from the other mentors/committers before nominating him in the proposal.
   
  
   +1
  
   I don't think that you really need to worry about other mentors
   approving the addition of a mentor.  This is a duty well shared by more
   hands.  I haven't seen a bad mentor except ones that go missing and
   having an extra helps with that.
  
 
 
 
  --
  Luciano Resende
  http://people.apache.org/~lresende
  http://twitter.com/lresende1975
  http://lresende.blogspot.com/
 



Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator

2015-02-21 Thread Ted Dunning

Sounds right to me.

Ben?  Would you like to do the honors?

 On Feb 21, 2015, at 15:12, Luciano Resende luckbr1...@gmail.com wrote:
 
 Discussion has died down, and we had only positive feedback for the
 proposal. Should we start a formal vote?
 
 On Wed, Feb 18, 2015 at 9:27 PM, Ted Dunning ted.dunn...@gmail.com wrote:
 
 On Wed, Feb 18, 2015 at 12:24 PM, Adam Bordelon a...@mesosphere.io
 wrote:
 
 I am personally in favor of adding Luciano Resende to the Nominated Mentors
 list (the more the merrier, right?), but I want to get approval from the
 other mentors/committers before nominating him in the proposal.
 
 
 +1
 
 I don't think that you really need to worry about other mentors approving
 the addition of a mentor.  This is a duty well shared by more hands.  I
 haven't seen a bad mentor except ones that go missing and having an extra
 helps with that.
 
 
 
 
 -- 
 Luciano Resende
 http://people.apache.org/~lresende
 http://twitter.com/lresende1975
 http://lresende.blogspot.com/

-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org



Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator

2015-02-21 Thread jan i
On 21 February 2015 at 21:12, Luciano Resende luckbr1...@gmail.com wrote:

 Discussion has died down, and we had only positive feedback for the
 proposal. Should we start a formal vote?

please do.

rgds
jan i


 On Wed, Feb 18, 2015 at 9:27 PM, Ted Dunning ted.dunn...@gmail.com
 wrote:

  On Wed, Feb 18, 2015 at 12:24 PM, Adam Bordelon a...@mesosphere.io
  wrote:
 
   I am personally in favor of adding Luciano Resende to the Nominated
   Mentors list (the more the merrier, right?), but I want to get approval
   from the other mentors/committers before nominating him in the proposal.
  
 
  +1
 
  I don't think that you really need to worry about other mentors approving
  the addition of a mentor.  This is a duty well shared by more hands.  I
  haven't seen a bad mentor except ones that go missing and having an extra
  helps with that.
 



 --
 Luciano Resende
 http://people.apache.org/~lresende
 http://twitter.com/lresende1975
 http://lresende.blogspot.com/



Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator

2015-02-21 Thread Luciano Resende
Discussion has died down, and we had only positive feedback for the
proposal. Should we start a formal vote?

On Wed, Feb 18, 2015 at 9:27 PM, Ted Dunning ted.dunn...@gmail.com wrote:

 On Wed, Feb 18, 2015 at 12:24 PM, Adam Bordelon a...@mesosphere.io
 wrote:

  I am personally in favor of adding Luciano Resende to the Nominated
  Mentors list (the more the merrier, right?), but I want to get approval
  from the other mentors/committers before nominating him in the proposal.
 

 +1

 I don't think that you really need to worry about other mentors approving
 the addition of a mentor.  This is a duty well shared by more hands.  I
 haven't seen a bad mentor except ones that go missing and having an extra
 helps with that.




-- 
Luciano Resende
http://people.apache.org/~lresende
http://twitter.com/lresende1975
http://lresende.blogspot.com/


Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator

2015-02-18 Thread Adam Bordelon
Thanks for your support, everyone. I have updated the wiki proposal to
remove the user@ mailing list and fixed up the formatting.

I am personally in favor of adding Luciano Resende to the Nominated Mentors
list (the more the merrier, right?), but I want to get approval from the
other mentors/committers before nominating him in the proposal. See
http://incubator.apache.org/guides/proposal.html#template-mentors and
http://incubator.apache.org/guides/mentor.html for more details on the
role/responsibilities.
Alternatively, Luciano could act as a (small 'm') mentor, rather than an
official Podling Mentor. Thoughts, opinions?

Any high-level critiques or questions not answered in the proposal? Any
nit-picky grammer/spelling mistakes? [troll]

On Tue, Feb 17, 2015 at 10:19 PM, Naresh Agarwal naresh.agar...@inmobi.com
wrote:

 Looks interesting. Looking forward to this.

 Thanks
 Naresh

 On Wed, Feb 18, 2015 at 11:08 AM, Henry Saputra henry.sapu...@gmail.com
 wrote:

  I love this project and the idea. I tried to hack it a couple of years ago
  but could not make it work.
  
  Looking forward to seeing it in the ASF Incubator for sure.
  
  @Adam and @Ted, as with any new project coming to the Incubator, we always
  check: do you need user@ so early in the process?
  It would probably be better to have all discussion in dev@ early in
  incubation.
 
  - Henry
 

Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator

2015-02-18 Thread Ted Dunning
On Wed, Feb 18, 2015 at 12:24 PM, Adam Bordelon a...@mesosphere.io wrote:

 I am personally in favor of adding Luciano Resende to the Nominated Mentors
 list (the more the merrier, right?), but I want to get approval from the
 other mentors/committers before nominating him in the proposal.


+1

I don't think that you really need to worry about other mentors approving
the addition of a mentor.  This is a duty well shared by more hands.  I
haven't seen a bad mentor except ones that go missing and having an extra
helps with that.


Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator

2015-02-17 Thread Henry Saputra
Oh, it is painless =)

From what I have seen, having just a dev@ list early on helps ramp
up development quickly.

@Adam and @Ted, IMHO once the transition is over and the project has
one release under the ASF, adding a user@ list would be beneficial.

- Henry

On Tue, Feb 17, 2015 at 9:59 PM, Adam Bordelon a...@mesosphere.io wrote:
 Good point. I'm fine with starting with just a dev@ first, and then we can
 add user@ if/when dev becomes too noisy.
 I assume adding a new mailing list is relatively painless.

 On Tue, Feb 17, 2015 at 9:52 PM, Ted Dunning ted.dunn...@gmail.com wrote:

 On Tue, Feb 17, 2015 at 9:38 PM, Henry Saputra henry.sapu...@gmail.com
 wrote:

  @Adam and @Ted, as with any new project coming to the Incubator, we always
  check: do you need user@ so early in the process?
  It would probably be better to have all discussion in dev@ early in
  incubation.
 

 Henry,

 This is a good question to ask (and I have asked it in the past).

 I think that Myriad is in, or nearly in, production here and there already.
 That means that a user@ list might well be useful.





Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator

2015-02-17 Thread Naresh Agarwal
Looks interesting. Looking forward to this.

Thanks
Naresh

On Wed, Feb 18, 2015 at 11:08 AM, Henry Saputra henry.sapu...@gmail.com
wrote:

 I love this project and the idea. I tried to hack it a couple of years ago
 but could not make it work.

 Looking forward to seeing it in the ASF Incubator for sure.

 @Adam and @Ted, as with any new project coming to the Incubator, we always
 check: do you need user@ so early in the process?
 It would probably be better to have all discussion in dev@ early in incubation.

 - Henry


Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator

2015-02-17 Thread Henry Saputra
I love this project and the idea. I tried to hack it a couple of years ago
but could not make it work.

Looking forward to seeing it in the ASF Incubator for sure.

@Adam and @Ted, as with any new project coming to the Incubator, we always
check: do you need user@ so early in the process?
It would probably be better to have all discussion in dev@ early in incubation.

- Henry


Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator

2015-02-17 Thread Ted Dunning
On Tue, Feb 17, 2015 at 9:38 PM, Henry Saputra henry.sapu...@gmail.com
wrote:

 @Adam and @Ted, as with any new project coming to the Incubator, we always
 check: do you need user@ so early in the process?
 It would probably be better to have all discussion in dev@ early in incubation.


Henry,

This is a good question to ask (and I have asked it in the past).

I think that Myriad is in, or nearly in, production here and there already.
That means that a user@ list might well be useful.


Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator

2015-02-17 Thread Adam Bordelon
Good point. I'm fine with starting with just a dev@ first, and then we can
add user@ if/when dev becomes too noisy.
I assume adding a new mailing list is relatively painless.

On Tue, Feb 17, 2015 at 9:52 PM, Ted Dunning ted.dunn...@gmail.com wrote:

 On Tue, Feb 17, 2015 at 9:38 PM, Henry Saputra henry.sapu...@gmail.com
 wrote:

  @Adam and @Ted, as with any new project coming to the Incubator, we always
  check: do you need user@ so early in the process?
  It would probably be better to have all discussion in dev@ early in
  incubation.
 

 Henry,

 This is a good question to ask (and I have asked it in the past).

 I think that Myriad is in, or nearly in, production here and there already.
 That means that a user@ list might well be useful.



Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator

2015-02-16 Thread Ted Dunning
On Fri, Feb 13, 2015 at 5:06 PM, Adam Bordelon a...@mesosphere.io wrote:

 I can add it to the incubator wiki as
 well, if desired.


I added this to the incubator wiki just now.


Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator

2015-02-15 Thread Ted Dunning
In case there is any doubt, +1 from me!




Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator

2015-02-13 Thread Luciano Resende
On Fri, Feb 13, 2015 at 5:06 PM, Adam Bordelon a...@mesosphere.io wrote:

 Hello friends,

 The Myriad team and I would like to propose the Myriad project for
 inclusion in the Apache Incubator.
 Full text of the proposal is below. I can add it to the incubator wiki as
 well, if desired.
 Please review and discuss. If there are no major concerns, I will call for
 a Vote after a week.

 Cheers,
 -Adam-
 me@apache

 ==
 Apache Myriad Proposal

 * Abstract
 Myriad enables Apache Hadoop YARN and Apache Mesos to co-exist on the same
 cluster and allows dynamic resource allocation across both Hadoop and other
 applications running on the same physical data center infrastructure.

 * Proposal
 The vision of Myriad is to provide a comprehensive framework to ensure
 Apache Hadoop YARN and Apache Mesos can interoperate with minimal changes
 on either side and prevent the static fragmentation of data center
 resources.

 * Background
 Project Myriad is the first resource management framework that allows big
 data developers to run YARN-based Hadoop jobs alongside other applications
 and services in production. eBay Inc., MapR, and Mesosphere jointly built
 Myriad (available on GitHub at https://github.com/mesos/myriad) with the
 vision of freeing big data jobs from siloed clusters and consolidating
 infrastructure into a single pool of resources for greater utilization and
 operational efficiency. Several companies including Twitter have expressed
 interest in Myriad and have begun testing it.

 * Rationale
 Many Hadoop users are building larger clusters (data lake/data hub
 architectures) that support multiple workloads - made possible by the
 advent of Apache Hadoop YARN. As the clusters grow in size and importance,
 they become an important application within the broader data center. At the
 same time, Apache Mesos enables efficient resource isolation and sharing
 across distributed applications for the broader data center, for instance
 MPI, Spark, long-running web services, build/test infrastructure,
 traditional Linux applications/scripts, and others (including arbitrary
 Docker images).

 Myriad aims to enable co-existence of Apache Hadoop YARN and Apache Mesos
 on the same physical data center resources, reducing fragmentation of data
 center resources.

 * Project Goals
 ** Initial Goals
 - Run Myriad alongside Apache Hadoop YARN and Apache Mesos to allow
   policy-based allocation of data center resources across Apache Hadoop and
   other distributed applications.
 - Ensure YARN-based execution frameworks work without any changes when
   running alongside Myriad. YARN applications will continue to interact with
   and run on top of YARN and can choose to be unaware of Myriad.
 - Ensure Mesos-based execution frameworks work without any changes when
   running alongside Myriad. Mesos applications will continue to interact with
   and run on Mesos and can choose to be unaware of Myriad.
 - Provide isolation for multi-tenancy.
   - Use Linux cgroups (and optionally Docker-like technologies to ease
     packaging, deployment and broader isolation) so that multiple YARN
     clusters can run in their own space and are isolated from each other.
     YARN’s RM and NMs are dockerized.
 - Myriad should be able to manage the full YARN lifecycle:
   - Bring up YARN (RM, NM)
   - Scale Up/Down YARN
   - Release resources and shut down YARN

 ** Longer Term Goals
 - Allow fine-grained dynamic allocation of resources to Hadoop including
 the ability to scale up and scale down the cluster.
   - Provide different policies to allow downsizing running applications on
 Hadoop when resources are taken away from it.
   - Provide a framework so the downsizing policy is pluggable and users can
 write their own implementations.
 - Allow multiple versions of Apache Hadoop to run on the same physical
 infrastructure
 - Allow workload portability: the ability to migrate YARN workloads
 seamlessly across various cloud infrastructures (e.g. GCE, AWS).
 - Security:
   - Authentication Requirements:
 - Support basic CRAM-MD5 password authentication between Myriad and
 Mesos. Additional authentication mechanisms may be supported in the future.
 - Traditional user authentication with Hadoop’s HTTP web-consoles
 should work as usual.
   - Authorization:
 - Only authorized users are allowed to launch YARN clusters. Mesos allows
 specifying which framework principal is allowed to register as a
 particular role.
   - Encryption on wire:
 - All control traffic to/from Myriad/Mesos
 - Logs
   - Audits (where to store them)
 - Log all major activities/events with audit trail - who, what, when,
 result
 - Launching YARN/RM
 - Launching NMs
 - Downsizing NMs
 - Terminating YARN/RM
   - What to do with old logs?
   - Debuggability/Visibility
 - Hooks to identify different YARN cluster lifecycles (yarn-id?)
 - GUI: Capability to scale-up and scale-down by selecting nodes and
 

Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator

2015-02-13 Thread Ted Dunning
Luciano,

I would expect that you would make an excellent additional mentor.




[DISCUSS] [PROPOSAL] Myriad for Apache Incubator

2015-02-13 Thread Adam Bordelon
Hello friends,

The Myriad team and I would like to propose the Myriad project for
inclusion in the Apache Incubator.
Full text of the proposal is below. I can add it to the incubator wiki as
well, if desired.
Please review and discuss. If there are no major concerns, I will call for
a Vote after a week.

Cheers,
-Adam-
me@apache

==
Apache Myriad Proposal

* Abstract
Myriad enables Apache Hadoop YARN and Apache Mesos to co-exist on the same
cluster and allows dynamic resource allocation across both Hadoop and other
applications running on the same physical data center infrastructure.

* Proposal
The vision of Myriad is to provide a comprehensive framework to ensure
Apache Hadoop YARN and Apache Mesos can interoperate with minimal changes
on either side and prevent the static fragmentation of data center
resources.

* Background
Project Myriad is the first resource management framework that allows big
data developers to run YARN-based Hadoop jobs alongside other applications
and services in production. eBay Inc., MapR, and Mesosphere jointly built
Myriad (available on GitHub at https://github.com/mesos/myriad) with the
vision of freeing big data jobs from siloed clusters and consolidating
infrastructure into a single pool of resources for greater utilization and
operational efficiency. Several companies including Twitter have expressed
interest in Myriad and have begun testing it.

* Rationale
Many Hadoop users are building larger clusters (data lake/data hub
architectures) that support multiple workloads - made possible by the
advent of Apache Hadoop YARN. As the clusters grow in size and importance,
they become an important application within the broader data center. At the
same time, Apache Mesos enables efficient resource isolation and sharing
across distributed applications for the broader data center, for instance
MPI, Spark, long-running web services, build/test infrastructure,
traditional Linux applications/scripts, and others (including arbitrary
Docker images).

Myriad aims to enable co-existence of Apache Hadoop YARN and Apache Mesos
on the same physical data center resources, reducing fragmentation of data
center resources.

* Project Goals
** Initial Goals
- Run Myriad alongside Apache Hadoop YARN and Apache Mesos to allow
policy-based allocation of data center resources across Apache Hadoop and
other distributed applications.
- Ensure YARN-based execution frameworks work without any changes when
running alongside Myriad. YARN applications will continue to interact with
and run on top of YARN and can choose to be unaware of Myriad.
- Ensure Mesos-based execution frameworks work without any changes when
running alongside Myriad. Mesos applications will continue to interact with
and run on Mesos and can choose to be unaware of Myriad.
- Provide isolation for multi-tenancy.
  - Use Linux cgroups (and optionally Docker-like technologies to ease
packaging, deployment and broader isolation) so that multiple YARN clusters
can run in their own space and are isolated from each other. YARN’s
ResourceManager (RM) and NodeManagers (NMs) are dockerized.
- Myriad should be able to manage full YARN lifecycle:
  - Bring up YARN (RM, NM)
  - Scale Up/Down YARN
  - Release resources and shut down YARN
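
To make the lifecycle goal above a little more concrete, here is a minimal
Python sketch. It is not Myriad code: the YarnCluster class, its method
names, and the rule that a running cluster keeps at least one NodeManager
are all assumptions made purely for illustration.

```python
from enum import Enum


class State(Enum):
    STOPPED = "stopped"
    RUNNING = "running"


class YarnCluster:
    """Hypothetical model of one YARN cluster whose lifecycle Myriad manages."""

    def __init__(self):
        self.state = State.STOPPED
        self.node_managers = 0

    def bring_up(self, node_managers):
        # Bring up YARN: one ResourceManager plus the initial NodeManagers.
        if self.state is not State.STOPPED:
            raise RuntimeError("cluster is already running")
        if node_managers < 1:
            raise ValueError("need at least one NodeManager")
        self.state = State.RUNNING
        self.node_managers = node_managers

    def scale(self, delta):
        # Scale up (delta > 0) or down (delta < 0), keeping at least one NM.
        if self.state is not State.RUNNING:
            raise RuntimeError("cluster is not running")
        if self.node_managers + delta < 1:
            raise ValueError("cannot scale below one NodeManager")
        self.node_managers += delta

    def shut_down(self):
        # Release all resources and stop YARN.
        if self.state is not State.RUNNING:
            raise RuntimeError("cluster is not running")
        self.node_managers = 0
        self.state = State.STOPPED
```

A caller would go through bring_up, any number of scale calls, and then
shut_down; calling them out of order raises an error instead of silently
doing nothing.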

** Longer Term Goals
- Allow fine-grained dynamic allocation of resources to Hadoop including
the ability to scale up and scale down the cluster.
  - Provide different policies to allow downsizing running applications on
Hadoop when resources are taken away from it.
  - Provide a framework so the downsizing policy is pluggable and users can
write their own implementations.
- Allow multiple versions of Apache Hadoop to run on the same physical
infrastructure
- Allow workload portability: the ability to migrate YARN workloads
seamlessly across various cloud infrastructures (e.g. GCE, AWS).
- Security:
  - Authentication Requirements:
- Support basic CRAM-MD5 password authentication between Myriad and
Mesos. Additional authentication mechanisms may be supported in the future.
- Traditional user authentication with Hadoop’s HTTP web-consoles
should work as usual.
  - Authorization:
- Only authorized users are allowed to launch YARN clusters. Mesos allows
specifying which framework principal is allowed to register as a
particular role.
  - Encryption on wire:
- All control traffic to/from Myriad/Mesos
- Logs
  - Audits (where to store them)
- Log all major activities/events with audit trail - who, what, when,
result
- Launching YARN/RM
- Launching NMs
- Downsizing NMs
- Terminating YARN/RM
  - What to do with old logs?
  - Debuggability/Visibility
- Hooks to identify different YARN cluster lifecycles (yarn-id?)
- GUI: Capability to scale-up and scale-down by selecting nodes and
providing a scale-up/scale-down factor.
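
As one possible reading of the pluggable downsizing policy goal in the list
above, the Python sketch below shows what a policy interface might look
like. Everything in it is hypothetical: the NodeManager record, the
DownsizePolicy base class, and the FewestContainersFirst example policy are
illustrative names, not part of Myriad, YARN, or Mesos.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeManager:
    """Hypothetical view of a NodeManager that a policy gets to inspect."""
    host: str
    running_containers: int


class DownsizePolicy(ABC):
    """Hypothetical plug-in point: decide which NodeManagers to give up."""

    @abstractmethod
    def select(self, nodes, count):
        """Return up to `count` NodeManagers to release."""


class FewestContainersFirst(DownsizePolicy):
    """Example policy: release the least busy NodeManagers first."""

    def select(self, nodes, count):
        # Sort by load, then by host name so ties are broken deterministically.
        ranked = sorted(nodes, key=lambda n: (n.running_containers, n.host))
        return ranked[:max(0, count)]
```

A user-supplied policy would subclass DownsizePolicy and implement select;
the scheduler would only ever call select, so policies can be swapped
without touching the rest of the system.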

* Architectural Overview
The following diagram illustrates the high level architecture. YARN (with
Myriad) is registered as a