Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator
I have updated the proposal to include Luciano as a Mentor, and marked it as FINAL. https://wiki.apache.org/incubator/MyriadProposal?action=recall&rev=7 I will open up a new thread for the VOTE. On Sat, Feb 21, 2015 at 1:20 PM, jan i j...@apache.org wrote: On 21 February 2015 at 21:12, Luciano Resende luckbr1...@gmail.com wrote: Discussion has died down, and we had only positive feedback for the proposal. Should we start a formal vote ? please do. rgds jan i On Wed, Feb 18, 2015 at 9:27 PM, Ted Dunning ted.dunn...@gmail.com wrote: On Wed, Feb 18, 2015 at 12:24 PM, Adam Bordelon a...@mesosphere.io wrote: I am personally in favor of adding Luciano Resende to the Nominated Mentors list (the more the merrier, right?), but I want to get approval from the other mentors/committers before nominating him in the proposal. +1 I don't think that you really need to worry about other mentors approving the addition of a mentor. This is a duty well shared by more hands. I haven't seen a bad mentor except ones that go missing and having an extra helps with that. -- Luciano Resende http://people.apache.org/~lresende http://twitter.com/lresende1975 http://lresende.blogspot.com/
Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator
Sound right to me. Ben? Would you like to do the honors? On Feb 21, 2015, at 15:12, Luciano Resende luckbr1...@gmail.com wrote: Discussion has died down, and we had only positive feedback for the proposal. Should we start a formal vote ? On Wed, Feb 18, 2015 at 9:27 PM, Ted Dunning ted.dunn...@gmail.com wrote: On Wed, Feb 18, 2015 at 12:24 PM, Adam Bordelon a...@mesosphere.io wrote: I am personally in favor of adding Luciano Resende to the Nominated Mentors list (the more the merrier, right?), but I want to get approval from the other mentors/committers before nominating him in the proposal. +1 I don't think that you really need to worry about other mentors approving the addition of a mentor. This is a duty well shared by more hands. I haven't seen a bad mentor except ones that go missing and having an extra helps with that. -- Luciano Resende http://people.apache.org/~lresende http://twitter.com/lresende1975 http://lresende.blogspot.com/
Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator
On 21 February 2015 at 21:12, Luciano Resende luckbr1...@gmail.com wrote: Discussion has died down, and we had only positive feedback for the proposal. Should we start a formal vote ? please do. rgds jan i On Wed, Feb 18, 2015 at 9:27 PM, Ted Dunning ted.dunn...@gmail.com wrote: On Wed, Feb 18, 2015 at 12:24 PM, Adam Bordelon a...@mesosphere.io wrote: I am personally in favor of adding Luciano Resende to the Nominated Mentors list (the more the merrier, right?), but I want to get approval from the other mentors/committers before nominating him in the proposal. +1 I don't think that you really need to worry about other mentors approving the addition of a mentor. This is a duty well shared by more hands. I haven't seen a bad mentor except ones that go missing and having an extra helps with that. -- Luciano Resende http://people.apache.org/~lresende http://twitter.com/lresende1975 http://lresende.blogspot.com/
Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator
Discussion has died down, and we had only positive feedback for the proposal. Should we start a formal vote ? On Wed, Feb 18, 2015 at 9:27 PM, Ted Dunning ted.dunn...@gmail.com wrote: On Wed, Feb 18, 2015 at 12:24 PM, Adam Bordelon a...@mesosphere.io wrote: I am personally in favor of adding Luciano Resende to the Nominated Mentors list (the more the merrier, right?), but I want to get approval from the other mentors/committers before nominating him in the proposal. +1 I don't think that you really need to worry about other mentors approving the addition of a mentor. This is a duty well shared by more hands. I haven't seen a bad mentor except ones that go missing and having an extra helps with that. -- Luciano Resende http://people.apache.org/~lresende http://twitter.com/lresende1975 http://lresende.blogspot.com/
Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator
Thanks for your support everyone. I have updated the wiki proposal to remove the user@ mailing list, and fixed up the formatting. I am personally in favor of adding Luciano Resende to the Nominated Mentors list (the more the merrier, right?), but I want to get approval from the other mentors/committers before nominating him in the proposal. See http://incubator.apache.org/guides/proposal.html#template-mentors and http://incubator.apache.org/guides/mentor.html for more details on the role/responsibilities. Alternatively, Luciano could act as a (small 'm') mentor, rather than an official Podling Mentor. Thoughts, opinions? Any high-level critiques or questions not answered in the proposal? Any nit-picky grammer/spelling mistakes? [troll] On Tue, Feb 17, 2015 at 10:19 PM, Naresh Agarwal naresh.agar...@inmobi.com wrote: Looks interesting. Looking forward to this. Thanks Naresh On Wed, Feb 18, 2015 at 11:08 AM, Henry Saputra henry.sapu...@gmail.com wrote: I love this project and the idea. Tried to hack it couple years ago could not make it work. Looking forward seeing it in ASF incubator for sure. @Adam and @Ted, like any new incubator projects coming we always check if you need user@ so early in the process? Would probably better to have all discussion in dev@ early in incubation. - Henry On Fri, Feb 13, 2015 at 5:06 PM, Adam Bordelon a...@mesosphere.io wrote: Hello friends, The Myriad team and I would like to propose the Myriad project for inclusion in the Apache Incubator. Full text of the proposal is below. I can add it to the incubator wiki as well, if desired. Please review and discuss. If there are no major concerns, I will call for a Vote after a week. Cheers, -Adam- me@apache == Apache Myriad Proposal * Abstract Myriad enables co-existence of Apache Hadoop YARN and Apache Mesos together on the same cluster and allows dynamic resource allocations across both Hadoop and other applications running on the same physical data center infrastructure. 
* Proposal The vision of Myriad is to provide a comprehensive framework to ensure Apache Hadoop YARN and Apache Mesos can interoperate with minimal changes on either side and prevent the static fragmentation of data center resources. * Background Project Myriad is the first resource management framework that allows big data developers to run YARN-based Hadoop jobs alongside other applications and services in production. ebay Inc., MapR, and Mesosphere jointly built Myriad (available on Github at https://github.com/mesos/myriad) with the vision of freeing big data jobs from siloed clusters and consolidating infrastructure into a single pool of resources for greater utilization and operational efficiency. Several companies including Twitter have expressed interest in Myriad and have begun testing it. * Rationale Many Hadoop users are building larger clusters (data lake/data hub architectures) that support multiple workloads - made possible by the advent of Apache Hadoop YARN. As the clusters grow in size and importance, they become an important application within the broader datacenter. At the same time, Apache Mesos enables efficient resource isolation and sharing across distributed applications for the broader data center, for instance MPI, Spark, long running web services, build/test infrastructure, traditional linux applications/scripts, and others (including arbitrary docker images). Myriad aims to enable co-existence of Apache Hadoop YARN and Apache Mesos on the same physical data center resources, reducing fragmentation of data center resources. * Project Goals ** Initial Goals - Run Myriad alongside Apache Hadoop YARN and Apache Mesos to allow policy based allocation of data center resources across Apache Hadoop and other distributed applications - Ensure YARN based execution frameworks work without any changes when running alongside Myriad. YARN Applications will continue to interact and run on top of YARN and can choose to be unaware of Myriad. 
- Ensure Mesos based execution frameworks work without any changes when running alongside Myriad. Mesos applications will continue to interact and run on Mesos and can choose to be unaware of Myriad. - Provide isolation for multi-tenancy. - Use linux cgroups (and optionally Docker-like technologies to ease packaging, deployment and broader isolation) so that multiple YARN clusters can run in their own space and are isolated from each other. YARN’s RM and NMs are dockerized. - Myriad should be able to manage full YARN lifecycle: - Bring up YARN (RM, NM) - Scale Up/Down YARN - Release resources and shut down YARN ** Longer Term Goals - Allow fine-grained
Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator
On Wed, Feb 18, 2015 at 12:24 PM, Adam Bordelon a...@mesosphere.io wrote: I am personally in favor of adding Luciano Resende to the Nominated Mentors list (the more the merrier, right?), but I want to get approval from the other mentors/committers before nominating him in the proposal. +1 I don't think that you really need to worry about other mentors approving the addition of a mentor. This is a duty well shared by more hands. I haven't seen a bad mentor except ones that go missing and having an extra helps with that.
Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator
Oh it is painless =) From what I have seen, having just dev@ list early would help ramping up dev quickly. @Adam and @Ted, IMHO once the transition is over and the project has one release under ASF adding user@ list would be beneficial. - Henry On Tue, Feb 17, 2015 at 9:59 PM, Adam Bordelon a...@mesosphere.io wrote: Good point. I'm fine with starting with just a dev@ first, and then we can add user@ if/when dev becomes too noisy. I assume adding a new mailing list is relatively painless. On Tue, Feb 17, 2015 at 9:52 PM, Ted Dunning ted.dunn...@gmail.com wrote: On Tue, Feb 17, 2015 at 9:38 PM, Henry Saputra henry.sapu...@gmail.com wrote: @Adam and @Ted, like any new incubator projects coming we always check if you need user@ so early in the process? Would probably better to have all discussion in dev@ early in incubation. Henry, This is a good question to ask (and I have asked it in the past). I think that Myriad is in, or nearly in production here and there already. That means that a user@ list might well be useful.
Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator
Looks interesting. Looking forward to this. Thanks Naresh On Wed, Feb 18, 2015 at 11:08 AM, Henry Saputra henry.sapu...@gmail.com wrote: I love this project and the idea. Tried to hack it couple years ago could not make it work. Looking forward seeing it in ASF incubator for sure. @Adam and @Ted, like any new incubator projects coming we always check if you need user@ so early in the process? Would probably better to have all discussion in dev@ early in incubation. - Henry On Fri, Feb 13, 2015 at 5:06 PM, Adam Bordelon a...@mesosphere.io wrote: Hello friends, The Myriad team and I would like to propose the Myriad project for inclusion in the Apache Incubator. Full text of the proposal is below. I can add it to the incubator wiki as well, if desired. Please review and discuss. If there are no major concerns, I will call for a Vote after a week. Cheers, -Adam- me@apache == Apache Myriad Proposal * Abstract Myriad enables co-existence of Apache Hadoop YARN and Apache Mesos together on the same cluster and allows dynamic resource allocations across both Hadoop and other applications running on the same physical data center infrastructure. * Proposal The vision of Myriad is to provide a comprehensive framework to ensure Apache Hadoop YARN and Apache Mesos can interoperate with minimal changes on either side and prevent the static fragmentation of data center resources. * Background Project Myriad is the first resource management framework that allows big data developers to run YARN-based Hadoop jobs alongside other applications and services in production. ebay Inc., MapR, and Mesosphere jointly built Myriad (available on Github at https://github.com/mesos/myriad) with the vision of freeing big data jobs from siloed clusters and consolidating infrastructure into a single pool of resources for greater utilization and operational efficiency. Several companies including Twitter have expressed interest in Myriad and have begun testing it. 
* Rationale Many Hadoop users are building larger clusters (data lake/data hub architectures) that support multiple workloads - made possible by the advent of Apache Hadoop YARN. As the clusters grow in size and importance, they become an important application within the broader datacenter. At the same time, Apache Mesos enables efficient resource isolation and sharing across distributed applications for the broader data center, for instance MPI, Spark, long running web services, build/test infrastructure, traditional linux applications/scripts, and others (including arbitrary docker images). Myriad aims to enable co-existence of Apache Hadoop YARN and Apache Mesos on the same physical data center resources, reducing fragmentation of data center resources. * Project Goals ** Initial Goals - Run Myriad alongside Apache Hadoop YARN and Apache Mesos to allow policy based allocation of data center resources across Apache Hadoop and other distributed applications - Ensure YARN based execution frameworks work without any changes when running alongside Myriad. YARN Applications will continue to interact and run on top of YARN and can choose to be unaware of Myriad. - Ensure Mesos based execution frameworks work without any changes when running alongside Myriad. Mesos applications will continue to interact and run on Mesos and can choose to be unaware of Myriad. - Provide isolation for multi-tenancy. - Use linux cgroups (and optionally Docker-like technologies to ease packaging, deployment and broader isolation) so that multiple YARN clusters can run in their own space and are isolated from each other. YARN’s RM and NMs are dockerized. - Myriad should be able to manage full YARN lifecycle: - Bring up YARN (RM, NM) - Scale Up/Down YARN - Release resources and shut down YARN ** Longer Term Goals - Allow fine-grained dynamic allocation of resources to Hadoop including the ability to scale up and scale down the cluster. 
- Provide different policies to allow downsizing running applications on Hadoop when resources are taken away from it. - Provide a framework so the downsizing policy is pluggable and users can write their own implementations. - Allow multiple versions of Apache Hadoop to run on the same physical infrastructure - Allow workload portability - ability to migrate YARN workloads across various cloud infrastructures seamlessly (e.g. GCE, AWS, etc) - Security: - Authentication Requirements: - Support basic CRAM-MD5 password authentication between Myriad and Mesos. Additional authentication mechanisms may be supported in the future. - Traditional user authentication with Hadoop’s HTTP web-consoles should work as usual. - Authorization: - Only authorized users are allowed to launch YARN
Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator
I love this project and the idea. Tried to hack it couple years ago could not make it work. Looking forward seeing it in ASF incubator for sure. @Adam and @Ted, like any new incubator projects coming we always check if you need user@ so early in the process? Would probably better to have all discussion in dev@ early in incubation. - Henry On Fri, Feb 13, 2015 at 5:06 PM, Adam Bordelon a...@mesosphere.io wrote: Hello friends, The Myriad team and I would like to propose the Myriad project for inclusion in the Apache Incubator. Full text of the proposal is below. I can add it to the incubator wiki as well, if desired. Please review and discuss. If there are no major concerns, I will call for a Vote after a week. Cheers, -Adam- me@apache == Apache Myriad Proposal * Abstract Myriad enables co-existence of Apache Hadoop YARN and Apache Mesos together on the same cluster and allows dynamic resource allocations across both Hadoop and other applications running on the same physical data center infrastructure. * Proposal The vision of Myriad is to provide a comprehensive framework to ensure Apache Hadoop YARN and Apache Mesos can interoperate with minimal changes on either side and prevent the static fragmentation of data center resources. * Background Project Myriad is the first resource management framework that allows big data developers to run YARN-based Hadoop jobs alongside other applications and services in production. ebay Inc., MapR, and Mesosphere jointly built Myriad (available on Github at https://github.com/mesos/myriad) with the vision of freeing big data jobs from siloed clusters and consolidating infrastructure into a single pool of resources for greater utilization and operational efficiency. Several companies including Twitter have expressed interest in Myriad and have begun testing it. 
* Rationale Many Hadoop users are building larger clusters (data lake/data hub architectures) that support multiple workloads - made possible by the advent of Apache Hadoop YARN. As the clusters grow in size and importance, they become an important application within the broader datacenter. At the same time, Apache Mesos enables efficient resource isolation and sharing across distributed applications for the broader data center, for instance MPI, Spark, long running web services, build/test infrastructure, traditional linux applications/scripts, and others (including arbitrary docker images). Myriad aims to enable co-existence of Apache Hadoop YARN and Apache Mesos on the same physical data center resources, reducing fragmentation of data center resources. * Project Goals ** Initial Goals - Run Myriad alongside Apache Hadoop YARN and Apache Mesos to allow policy based allocation of data center resources across Apache Hadoop and other distributed applications - Ensure YARN based execution frameworks work without any changes when running alongside Myriad. YARN Applications will continue to interact and run on top of YARN and can choose to be unaware of Myriad. - Ensure Mesos based execution frameworks work without any changes when running alongside Myriad. Mesos applications will continue to interact and run on Mesos and can choose to be unaware of Myriad. - Provide isolation for multi-tenancy. - Use linux cgroups (and optionally Docker-like technologies to ease packaging, deployment and broader isolation) so that multiple YARN clusters can run in their own space and are isolated from each other. YARN’s RM and NMs are dockerized. - Myriad should be able to manage full YARN lifecycle: - Bring up YARN (RM, NM) - Scale Up/Down YARN - Release resources and shut down YARN ** Longer Term Goals - Allow fine-grained dynamic allocation of resources to Hadoop including the ability to scale up and scale down the cluster. 
- Provide different policies to allow downsizing running applications on Hadoop when resources are taken away from it. - Provide a framework so the downsizing policy is pluggable and users can write their own implementations. - Allow multiple versions of Apache Hadoop to run on the same physical infrastructure - Allow workload portability - ability to migrate YARN workloads across various cloud infrastructures seamlessly (e.g. GCE, AWS, etc) - Security: - Authentication Requirements: - Support basic CRAM-MD5 password authentication between Myriad and Mesos. Additional authentication mechanisms may be supported in the future. - Traditional user authentication with Hadoop’s HTTP web-consoles should work as usual. - Authorization: - Only authorized users are allowed to launch YARN clusters. Mesos allows to specify which framework principal is allowed to register as a particular role. - Encryption on wire: - All control traffic to/from Myriad/Mesos - Logs - Audits (where to store them) - Log all major
Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator
On Tue, Feb 17, 2015 at 9:38 PM, Henry Saputra henry.sapu...@gmail.com wrote: @Adam and @Ted, like any new incubator projects coming we always check if you need user@ so early in the process? Would probably better to have all discussion in dev@ early in incubation. Henry, This is a good question to ask (and I have asked it in the past). I think that Myriad is in, or nearly in production here and there already. That means that a user@ list might well be useful.
Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator
Good point. I'm fine with starting with just a dev@ first, and then we can add user@ if/when dev becomes too noisy. I assume adding a new mailing list is relatively painless. On Tue, Feb 17, 2015 at 9:52 PM, Ted Dunning ted.dunn...@gmail.com wrote: On Tue, Feb 17, 2015 at 9:38 PM, Henry Saputra henry.sapu...@gmail.com wrote: @Adam and @Ted, like any new incubator projects coming we always check if you need user@ so early in the process? Would probably better to have all discussion in dev@ early in incubation. Henry, This is a good question to ask (and I have asked it in the past). I think that Myriad is in, or nearly in production here and there already. That means that a user@ list might well be useful.
Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator
On Fri, Feb 13, 2015 at 5:06 PM, Adam Bordelon a...@mesosphere.io wrote: I can add it to the incubator wiki as well, if desired. I added this to the incubator wiki just now.
Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator
In case there is any doubt, +1 from me! On Fri, Feb 13, 2015 at 5:15 PM, Luciano Resende luckbr1...@gmail.com wrote: On Fri, Feb 13, 2015 at 5:06 PM, Adam Bordelon a...@mesosphere.io wrote: Hello friends, The Myriad team and I would like to propose the Myriad project for inclusion in the Apache Incubator. Full text of the proposal is below. I can add it to the incubator wiki as well, if desired. Please review and discuss. If there are no major concerns, I will call for a Vote after a week. Cheers, -Adam- me@apache == Apache Myriad Proposal * Abstract Myriad enables co-existence of Apache Hadoop YARN and Apache Mesos together on the same cluster and allows dynamic resource allocations across both Hadoop and other applications running on the same physical data center infrastructure. * Proposal The vision of Myriad is to provide a comprehensive framework to ensure Apache Hadoop YARN and Apache Mesos can interoperate with minimal changes on either side and prevent the static fragmentation of data center resources. * Background Project Myriad is the first resource management framework that allows big data developers to run YARN-based Hadoop jobs alongside other applications and services in production. ebay Inc., MapR, and Mesosphere jointly built Myriad (available on Github at https://github.com/mesos/myriad) with the vision of freeing big data jobs from siloed clusters and consolidating infrastructure into a single pool of resources for greater utilization and operational efficiency. Several companies including Twitter have expressed interest in Myriad and have begun testing it. * Rationale Many Hadoop users are building larger clusters (data lake/data hub architectures) that support multiple workloads - made possible by the advent of Apache Hadoop YARN. As the clusters grow in size and importance, they become an important application within the broader datacenter. 
At the same time, Apache Mesos enables efficient resource isolation and sharing across distributed applications for the broader data center, for instance MPI, Spark, long running web services, build/test infrastructure, traditional linux applications/scripts, and others (including arbitrary docker images). Myriad aims to enable co-existence of Apache Hadoop YARN and Apache Mesos on the same physical data center resources, reducing fragmentation of data center resources. * Project Goals ** Initial Goals - Run Myriad alongside Apache Hadoop YARN and Apache Mesos to allow policy based allocation of data center resources across Apache Hadoop and other distributed applications - Ensure YARN based execution frameworks work without any changes when running alongside Myriad. YARN Applications will continue to interact and run on top of YARN and can choose to be unaware of Myriad. - Ensure Mesos based execution frameworks work without any changes when running alongside Myriad. Mesos applications will continue to interact and run on Mesos and can choose to be unaware of Myriad. - Provide isolation for multi-tenancy. - Use linux cgroups (and optionally Docker-like technologies to ease packaging, deployment and broader isolation) so that multiple YARN clusters can run in their own space and are isolated from each other. YARN’s RM and NMs are dockerized. - Myriad should be able to manage full YARN lifecycle: - Bring up YARN (RM, NM) - Scale Up/Down YARN - Release resources and shut down YARN ** Longer Term Goals - Allow fine-grained dynamic allocation of resources to Hadoop including the ability to scale up and scale down the cluster. - Provide different policies to allow downsizing running applications on Hadoop when resources are taken away from it. - Provide a framework so the downsizing policy is pluggable and users can write their own implementations. 
- Allow multiple versions of Apache Hadoop to run on the same physical infrastructure - Allow workload portability - ability to migrate YARN workloads across various cloud infrastructures seamlessly (e.g. GCE, AWS, etc) - Security: - Authentication Requirements: - Support basic CRAM-MD5 password authentication between Myriad and Mesos. Additional authentication mechanisms may be supported in the future. - Traditional user authentication with Hadoop’s HTTP web-consoles should work as usual. - Authorization: - Only authorized users are allowed to launch YARN clusters. Mesos allows to specify which framework principal is allowed to register as a particular role. - Encryption on wire: - All control traffic to/from Myriad/Mesos - Logs - Audits (where to store them) - Log all major activities/events with audit trail - who, what, when, result - Launching YARN/RM - Launching NM’s -
Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator
On Fri, Feb 13, 2015 at 5:06 PM, Adam Bordelon a...@mesosphere.io wrote:

Hello friends,

The Myriad team and I would like to propose the Myriad project for inclusion in the Apache Incubator. Full text of the proposal is below. I can add it to the incubator wiki as well, if desired. Please review and discuss. If there are no major concerns, I will call for a Vote after a week.

Cheers,
-Adam-
me@apache

== Apache Myriad Proposal

* Abstract

Myriad enables co-existence of Apache Hadoop YARN and Apache Mesos together on the same cluster and allows dynamic resource allocations across both Hadoop and other applications running on the same physical data center infrastructure.

* Proposal

The vision of Myriad is to provide a comprehensive framework to ensure Apache Hadoop YARN and Apache Mesos can interoperate with minimal changes on either side and prevent the static fragmentation of data center resources.

* Background

Project Myriad is the first resource management framework that allows big data developers to run YARN-based Hadoop jobs alongside other applications and services in production. eBay Inc., MapR, and Mesosphere jointly built Myriad (available on GitHub at https://github.com/mesos/myriad) with the vision of freeing big data jobs from siloed clusters and consolidating infrastructure into a single pool of resources for greater utilization and operational efficiency. Several companies including Twitter have expressed interest in Myriad and have begun testing it.

* Rationale

Many Hadoop users are building larger clusters (data lake/data hub architectures) that support multiple workloads - made possible by the advent of Apache Hadoop YARN. As the clusters grow in size and importance, they become an important application within the broader datacenter.

At the same time, Apache Mesos enables efficient resource isolation and sharing across distributed applications for the broader data center, for instance MPI, Spark, long-running web services, build/test infrastructure, traditional Linux applications/scripts, and others (including arbitrary Docker images). Myriad aims to enable co-existence of Apache Hadoop YARN and Apache Mesos on the same physical data center resources, reducing fragmentation of data center resources.

* Project Goals

** Initial Goals

- Run Myriad alongside Apache Hadoop YARN and Apache Mesos to allow policy-based allocation of data center resources across Apache Hadoop and other distributed applications
- Ensure YARN-based execution frameworks work without any changes when running alongside Myriad. YARN Applications will continue to interact and run on top of YARN and can choose to be unaware of Myriad.
- Ensure Mesos-based execution frameworks work without any changes when running alongside Myriad. Mesos applications will continue to interact and run on Mesos and can choose to be unaware of Myriad.
- Provide isolation for multi-tenancy.
- Use Linux cgroups (and optionally Docker-like technologies to ease packaging, deployment and broader isolation) so that multiple YARN clusters can run in their own space and are isolated from each other. YARN’s RM and NMs are dockerized.
- Myriad should be able to manage full YARN lifecycle:
- Bring up YARN (RM, NM)
- Scale Up/Down YARN
- Release resources and shut down YARN

** Longer Term Goals

- Allow fine-grained dynamic allocation of resources to Hadoop including the ability to scale up and scale down the cluster.
- Provide different policies to allow downsizing running applications on Hadoop when resources are taken away from it.
- Provide a framework so the downsizing policy is pluggable and users can write their own implementations.
- Allow multiple versions of Apache Hadoop to run on the same physical infrastructure
- Allow workload portability - ability to migrate YARN workloads across various cloud infrastructures seamlessly (e.g. GCE, AWS, etc.)
- Security:
- Authentication Requirements:
- Support basic CRAM-MD5 password authentication between Myriad and Mesos. Additional authentication mechanisms may be supported in the future.
- Traditional user authentication with Hadoop’s HTTP web-consoles should work as usual.
- Authorization:
- Only authorized users are allowed to launch YARN clusters. Mesos allows specifying which framework principal is allowed to register as a particular role.
- Encryption on wire:
- All control traffic to/from Myriad/Mesos
- Logs
- Audits (where to store them)
- Log all major activities/events with audit trail - who, what, when, result
- Launching YARN/RM
- Launching NMs
- Downsizing NMs
- Terminating YARN/RM
- What to do with old logs?
- Debuggability/Visibility
- Hooks to identify different YARN cluster lifecycles (yarn-id?)
- GUI: Capability to scale-up and scale-down by selecting nodes and
Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator
Luciano, I would expect that you would make an excellent additional mentor. On Fri, Feb 13, 2015 at 5:15 PM, Luciano Resende luckbr1...@gmail.com wrote: On Fri, Feb 13, 2015 at 5:06 PM, Adam Bordelon a...@mesosphere.io wrote: Hello friends, The Myriad team and I would like to propose the Myriad project for inclusion in the Apache Incubator. Full text of the proposal is below. I can add it to the incubator wiki as well, if desired. Please review and discuss. If there are no major concerns, I will call for a Vote after a week. Cheers, -Adam- me@apache == Apache Myriad Proposal * Abstract Myriad enables co-existence of Apache Hadoop YARN and Apache Mesos together on the same cluster and allows dynamic resource allocations across both Hadoop and other applications running on the same physical data center infrastructure. * Proposal The vision of Myriad is to provide a comprehensive framework to ensure Apache Hadoop YARN and Apache Mesos can interoperate with minimal changes on either side and prevent the static fragmentation of data center resources. * Background Project Myriad is the first resource management framework that allows big data developers to run YARN-based Hadoop jobs alongside other applications and services in production. ebay Inc., MapR, and Mesosphere jointly built Myriad (available on Github at https://github.com/mesos/myriad) with the vision of freeing big data jobs from siloed clusters and consolidating infrastructure into a single pool of resources for greater utilization and operational efficiency. Several companies including Twitter have expressed interest in Myriad and have begun testing it. * Rationale Many Hadoop users are building larger clusters (data lake/data hub architectures) that support multiple workloads - made possible by the advent of Apache Hadoop YARN. As the clusters grow in size and importance, they become an important application within the broader datacenter. 
[DISCUSS] [PROPOSAL] Myriad for Apache Incubator
Hello friends,

The Myriad team and I would like to propose the Myriad project for inclusion in the Apache Incubator. Full text of the proposal is below. I can add it to the incubator wiki as well, if desired. Please review and discuss. If there are no major concerns, I will call for a Vote after a week.

Cheers,
-Adam-
me@apache

== Apache Myriad Proposal

* Abstract

Myriad enables co-existence of Apache Hadoop YARN and Apache Mesos together on the same cluster and allows dynamic resource allocations across both Hadoop and other applications running on the same physical data center infrastructure.

* Proposal

The vision of Myriad is to provide a comprehensive framework to ensure Apache Hadoop YARN and Apache Mesos can interoperate with minimal changes on either side and prevent the static fragmentation of data center resources.

* Background

Project Myriad is the first resource management framework that allows big data developers to run YARN-based Hadoop jobs alongside other applications and services in production. eBay Inc., MapR, and Mesosphere jointly built Myriad (available on GitHub at https://github.com/mesos/myriad) with the vision of freeing big data jobs from siloed clusters and consolidating infrastructure into a single pool of resources for greater utilization and operational efficiency. Several companies including Twitter have expressed interest in Myriad and have begun testing it.

* Rationale

Many Hadoop users are building larger clusters (data lake/data hub architectures) that support multiple workloads - made possible by the advent of Apache Hadoop YARN. As the clusters grow in size and importance, they become an important application within the broader datacenter.

At the same time, Apache Mesos enables efficient resource isolation and sharing across distributed applications for the broader data center, for instance MPI, Spark, long running web services, build/test infrastructure, traditional Linux applications/scripts, and others (including arbitrary Docker images). Myriad aims to enable co-existence of Apache Hadoop YARN and Apache Mesos on the same physical data center resources, reducing fragmentation of data center resources.

* Project Goals

** Initial Goals

- Run Myriad alongside Apache Hadoop YARN and Apache Mesos to allow policy based allocation of data center resources across Apache Hadoop and other distributed applications
- Ensure YARN based execution frameworks work without any changes when running alongside Myriad. YARN Applications will continue to interact and run on top of YARN and can choose to be unaware of Myriad.
- Ensure Mesos based execution frameworks work without any changes when running alongside Myriad. Mesos applications will continue to interact and run on Mesos and can choose to be unaware of Myriad.
- Provide isolation for multi-tenancy.
  - Use Linux cgroups (and optionally Docker-like technologies to ease packaging, deployment and broader isolation) so that multiple YARN clusters can run in their own space and are isolated from each other. YARN’s RM and NMs are dockerized.
- Myriad should be able to manage full YARN lifecycle:
  - Bring up YARN (RM, NM)
  - Scale Up/Down YARN
  - Release resources and shut down YARN

** Longer Term Goals

- Allow fine-grained dynamic allocation of resources to Hadoop including the ability to scale up and scale down the cluster.
- Provide different policies to allow downsizing running applications on Hadoop when resources are taken away from it.
- Provide a framework so the downsizing policy is pluggable and users can write their own implementations.
- Allow multiple versions of Apache Hadoop to run on the same physical infrastructure
- Allow workload portability - ability to migrate YARN workloads across various cloud infrastructures seamlessly (e.g. GCE, AWS, etc)
- Security:
  - Authentication Requirements:
    - Support basic CRAM-MD5 password authentication between Myriad and Mesos. Additional authentication mechanisms may be supported in the future.
    - Traditional user authentication with Hadoop’s HTTP web-consoles should work as usual.
  - Authorization:
    - Only authorized users are allowed to launch YARN clusters. Mesos allows specifying which framework principal is allowed to register as a particular role.
  - Encryption on wire:
    - All control traffic to/from Myriad/Mesos
- Logs
  - Audits (where to store them)
  - Log all major activities/events with audit trail - who, what, when, result
    - Launching YARN/RM
    - Launching NMs
    - Downsizing NMs
    - Terminating YARN/RM
  - What to do with old logs?
- Debuggability/Visibility
  - Hooks to identify different YARN cluster lifecycles (yarn-id?)
- GUI: Capability to scale-up and scale-down by selecting nodes and providing a scale-up/scale-down factor.

* Architectural Overview

The following diagram illustrates the high level architecture. YARN (with Myriad) is registered as a