Re: Multiple versions of Apache Hadoop YARN as a Service

2019-09-27 Thread yuliya Feldman
 Hello,
Please see inline.
Thanks, Yuliya
On Thursday, September 26, 2019, 12:54:47 AM PDT, Oscar Fernandez 
 wrote:  
 
 Hi,

*1. *Myriad Scheduler would be responsible to register with Mesos and, on
demand, bring up Yarn clusters (RMs and NMs) and manage its resources.
>>> OK 

*2. *Yes, the idea is that Myriad will control NMs for all YARN clusters
that the user wants to deploy. Obviously the web UI should be updated and
the logic to handle the state of several clusters implemented.
>>> UI is the least of a concern here. I am sure UI and API would be fine with 
>>> multiple clusters.
* a.* NMs should come and go on demand, from the UI and API. In the future,
maybe we can implement some auto-scaling with the available resources in
the Mesos cluster, which is on the roadmap.
* b.* IMO NMs shouldn't be permanent, else, we miss the scaling feature.
>>> Sure
* c*. RMs will be permanent until YARN cluster shutdown, as the RM is
needed for the YARN cluster to run properly. Also, Myriad should keep track
of where the RM for each Yarn cluster is running in order to configure the
NM for that cluster.

*3. *I'm not sure I understand this question. What do you mean by "isn't it
too much.."? This feature should be implemented and defined, as the current
state of Myriad doesn't allow any of this.
>>> What I meant is that your MyriadScheduler may need to deal with a lot of
>>> resources it's scheduling. You will need to make sure it's not a bottleneck
>>> and you don't starve one cluster in favor of the other. It will have to be a
>>> really good 2nd level scheduler - that's why I was saying it's not as
>>> trivial as getting offers from Mesos, keeping track of them and starting NMs
>>> as you please. Even today I believe we need to make sure locality is taken
>>> care of.
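
To make the starvation concern concrete, here is a minimal sketch in Python of one possible policy: each incoming offer goes to the cluster that has received the smallest fraction of what it asked for. Everything in it is an assumption for illustration: `ClusterDemand`, `pick_cluster` and `distribute` are hypothetical names, not Myriad or Mesos APIs, only CPUs are modelled, and locality is ignored entirely.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ClusterDemand:
    """Hypothetical per-cluster bookkeeping; not part of Myriad today."""
    name: str
    wanted_cpus: float          # what the cluster asked for
    granted_cpus: float = 0.0   # what it has received so far

    @property
    def satisfied_fraction(self) -> float:
        if self.wanted_cpus <= 0:
            return 1.0
        return self.granted_cpus / self.wanted_cpus


def pick_cluster(clusters: List[ClusterDemand],
                 offer_cpus: float) -> Optional[ClusterDemand]:
    """Give the offer to the least-satisfied cluster that still wants CPUs."""
    hungry = [c for c in clusters if c.granted_cpus < c.wanted_cpus]
    if not hungry:
        return None  # nobody needs it, so the offer would be declined
    # Ties are broken by name so the result is deterministic.
    chosen = min(hungry, key=lambda c: (c.satisfied_fraction, c.name))
    # Simplification: any part of the offer beyond the remaining demand is dropped.
    chosen.granted_cpus += min(offer_cpus,
                               chosen.wanted_cpus - chosen.granted_cpus)
    return chosen


def distribute(clusters: List[ClusterDemand],
               offers: List[float]) -> Dict[str, float]:
    """Feed a sequence of offers (CPU counts) through pick_cluster."""
    for cpus in offers:
        pick_cluster(clusters, cpus)
    return {c.name: c.granted_cpus for c in clusters}
```

Even this toy version shows why the second-level scheduler is not trivial: it needs per-cluster demand, a tie-breaking rule, and a decision about what to do with the unused part of an offer, and it still says nothing about locality.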

We have enabled comments in the doc, maybe you can help us make this pros
and cons list with the new design.
>>> Thanks

Thank you, all this help is appreciated

On Wed, Sep 25, 2019 at 6:11 PM yuliya Feldman 
wrote:

>  Hello,
> Thank you for the diagrams - it helps. Could you also enable comments in
> your doc?
> A few thoughts:
> 1. Myriad Scheduler is wonderful - but it's yet another scheduler you need
> to deal with - or do you plan to have the current MyriadScheduler that sits
> in the RM and use it instead?
> 2. Is Myriad scheduler going to control NMs for all YARN clusters?
>     a. How will NMs come and go?
>     b. Are they going to be permanent?
>     c. I assume RMs will be permanent until cluster shutdown, right?
> 3. If NMs will not be permanent - isn't it too much for the upper level
> Myriad Scheduler to deal with all of them?
>
> Also could you please list cons - pros are great, but it's better to have
> cons as well.
> Thanks,Yuliya
>
>
>    On Wednesday, September 25, 2019, 12:30:20 AM PDT, Oscar Fernandez <
> oscarf...@apache.org> wrote:
>
>  Hi,
>
> I've made a diagram to represent the new proposed design in order to
> support Yarn as a service with some of the pros:
>
> https://docs.google.com/document/d/15X0-zSu0G0BDpWyndRhbvAJCXLtAbkNA45wQ_xVKOKQ
>
> Thank you for all your comments and help
>
> On Tue, Sep 24, 2019 at 8:57 PM Javi Roman 
> wrote:
>
> > Honestly your opinion is welcome, this kind of discussions are great
> > in this small traffic dev list ;-)
> > --
> > Javi Roman
> >
> > Twitter: @javiromanrh
> > GitHub: github.com/javiroman
> > Linkedin: es.linkedin.com/in/javiroman
> > Big Data Blog: dataintensive.info
> > Apache Id: javiroman
> >
> > On Tue, Sep 24, 2019 at 8:55 PM yuliya Feldman
> >  wrote:
> > >
> > >  I am not saying it's crazy. I was voicing my opinion. Isn't it what
> was
> > the purpose of the discussion?
> > > It's definitely great to have UI that manages all the YARN clusters,
> but
> > it's not like UI/Web service has to be coupled/collocated with any of the
> > Myriad particular YARN version daemons.
> > > It's great if you would provide write up with pros and cons for your
> > approach or any alternative approaches.
> > >
> > >
> > >    On Tuesday, September 24, 2019, 11:38:13 AM PDT, Javi Roman <
> > jroman.espi...@gmail.com> wrote:
> > >
> > >  On Tue, Sep 24, 2019 at 7:58 PM yuliya Feldman
> > >  wrote:
> > > >
> > > >  Hello,
> > > > Again I apologize for the late reply.
> > > > I think I replied to the thread, but will add more direct notes here
> > > > What you are proposing is to have yet another daemon that would start
> > Yarn Clusters on demand within Mesos framework.
> > > > Meaning - it would be another layer of abstraction.  In this case
> that
> > new layer would need to behave as second level scheduler and deal with
> > third level scheduler(s) (RMs) to propagate offers from Mesos and keep
> > track, etc.
> > > > I am sure you can somehow use concept of Capacity and/or FairShare
> > scheduler in your new layer to do the job. I am just not very much
> > convinced that 3 layers of scheduling will be easy to
> > maintain/reconcile/etc.
> > > > Again - if I understand your design correctly.
> > > > Would be 

Re: Multiple versions of Apache Hadoop YARN as a Service

2019-09-26 Thread Oscar Fernandez
Hi,

*1. *Myriad Scheduler would be responsible to register with Mesos and, on
demand, bring up Yarn clusters (RMs and NMs) and manage its resources.

*2. *Yes, the idea is that Myriad will control NMs for all YARN clusters
that the user wants to deploy. Obviously the web UI should be updated and
the logic to handle the state of several clusters implemented.
* a.* NMs should come and go on demand, from the UI and API. In the future,
maybe we can implement some auto-scaling with the available resources in
the Mesos cluster, which is on the roadmap.
* b.* IMO NMs shouldn't be permanent, else, we miss the scaling feature.
* c*. RMs will be permanent until YARN cluster shutdown, as the RM is
needed for the YARN cluster to run properly. Also, Myriad should keep track
of where the RM for each Yarn cluster is running in order to configure the
NM for that cluster.
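
The bookkeeping in 2.c could look roughly like the following Python sketch. It is only an illustration under assumptions: `YarnCluster`, `ClusterRegistry` and their methods are hypothetical names, not existing Myriad classes. The only real identifier is the YARN property `yarn.resourcemanager.hostname`, which an NM needs in order to find its RM.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class YarnCluster:
    """Hypothetical record kept by Myriad for each YARN cluster."""
    name: str
    yarn_version: str
    rm_host: Optional[str] = None              # set once the RM task is running
    nm_hosts: Set[str] = field(default_factory=set)


class ClusterRegistry:
    """Tracks where each cluster's RM runs so that new NMs can be configured."""

    def __init__(self) -> None:
        self._clusters: Dict[str, YarnCluster] = {}

    def create(self, name: str, yarn_version: str) -> YarnCluster:
        if name in self._clusters:
            raise ValueError(f"cluster {name!r} already exists")
        cluster = YarnCluster(name, yarn_version)
        self._clusters[name] = cluster
        return cluster

    def rm_started(self, name: str, host: str) -> None:
        self._clusters[name].rm_host = host

    def nm_config(self, name: str) -> Dict[str, str]:
        """Settings an NM needs to join the cluster; fails until the RM is placed."""
        cluster = self._clusters[name]
        if cluster.rm_host is None:
            raise RuntimeError(f"RM for cluster {name!r} is not running yet")
        return {"yarn.resourcemanager.hostname": cluster.rm_host}

    def nm_started(self, name: str, host: str) -> None:
        self._clusters[name].nm_hosts.add(host)

    def nm_stopped(self, name: str, host: str) -> None:
        # NMs come and go on demand; removing an unknown host is a no-op.
        self._clusters[name].nm_hosts.discard(host)

    def shutdown(self, name: str) -> YarnCluster:
        """The RM lives until cluster shutdown, so the record is dropped here."""
        return self._clusters.pop(name)
```

The point of the sketch is the ordering constraint: an NM cannot be configured before its cluster's RM has been placed, which is why `nm_config` refuses to answer until `rm_started` has been called.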

*3. *I'm not sure I understand this question. What do you mean by "isn't it
too much.."? This feature should be implemented and defined, as the current
state of Myriad doesn't allow any of this.

We have enabled comments in the doc, maybe you can help us make this pros
and cons list with the new design.

Thank you, all this help is appreciated

On Wed, Sep 25, 2019 at 6:11 PM yuliya Feldman 
wrote:

>  Hello,
> Thank you for the diagrams - it helps. Could you also enable comments in
> your doc?
> A few thoughts:
> 1. Myriad Scheduler is wonderful - but it's yet another scheduler you need
> to deal with - or do you plan to have the current MyriadScheduler that sits
> in the RM and use it instead?
> 2. Is Myriad scheduler going to control NMs for all YARN clusters?
>     a. How will NMs come and go?
>     b. Are they going to be permanent?
>     c. I assume RMs will be permanent until cluster shutdown, right?
> 3. If NMs will not be permanent - isn't it too much for the upper level
> Myriad Scheduler to deal with all of them?
>
> Also could you please list cons - pros are great, but it's better to have
> cons as well.
> Thanks,Yuliya
>
>
> On Wednesday, September 25, 2019, 12:30:20 AM PDT, Oscar Fernandez <
> oscarf...@apache.org> wrote:
>
>  Hi,
>
> I've made a diagram to represent the new proposed design in order to
> support Yarn as a service with some of the pros:
>
> https://docs.google.com/document/d/15X0-zSu0G0BDpWyndRhbvAJCXLtAbkNA45wQ_xVKOKQ
>
> Thank you for all your comments and help
>
> On Tue, Sep 24, 2019 at 8:57 PM Javi Roman 
> wrote:
>
> > Honestly your opinion is welcome, this kind of discussions are great
> > in this small traffic dev list ;-)
> > --
> > Javi Roman
> >
> > Twitter: @javiromanrh
> > GitHub: github.com/javiroman
> > Linkedin: es.linkedin.com/in/javiroman
> > Big Data Blog: dataintensive.info
> > Apache Id: javiroman
> >
> > On Tue, Sep 24, 2019 at 8:55 PM yuliya Feldman
> >  wrote:
> > >
> > >  I am not saying it's crazy. I was voicing my opinion. Isn't it what
> was
> > the purpose of the discussion?
> > > It's definitely great to have UI that manages all the YARN clusters,
> but
> > it's not like UI/Web service has to be coupled/collocated with any of the
> > Myriad particular YARN version daemons.
> > > It's great if you would provide write up with pros and cons for your
> > approach or any alternative approaches.
> > >
> > >
> > >On Tuesday, September 24, 2019, 11:38:13 AM PDT, Javi Roman <
> > jroman.espi...@gmail.com> wrote:
> > >
> > >  On Tue, Sep 24, 2019 at 7:58 PM yuliya Feldman
> > >  wrote:
> > > >
> > > >  Hello,
> > > > Again I apologize for the late reply.
> > > > I think I replied to the thread, but will add more direct notes here
> > > > What you are proposing is to have yet another daemon that would start
> > Yarn Clusters on demand within Mesos framework.
> > > > Meaning - it would be another layer of abstraction.  In this case
> that
> > new layer would need to behave as second level scheduler and deal with
> > third level scheduler(s) (RMs) to propagate offers from Mesos and keep
> > track, etc.
> > > > I am sure you can somehow use concept of Capacity and/or FairShare
> > scheduler in your new layer to do the job. I am just not very much
> > convinced that 3 layers of scheduling will be easy to
> > maintain/reconcile/etc.
> > > > Again - if I understand your design correctly.
> > > > Would be great if you do a small write up with the proposal and have
> > some simple diagram of services interactions.
> > > > Just my 2c.
> > > > Thanks,Yuliya
> > >
> > > > Great, I will do a diagram!
> > > >
> > > > Just to clarify:
> > > >
> > > > Myriad is registered as a framework in the Mesos master. The same thread
> > > > starts the API server and the user interface. Through the user
> > > > interface you select the YARN version to run, and the scheduler gets
> > > > resources from the master for running the RM and NMs. So you can manage as
> > > > many YARN schedulers as you want. YARN as a Service.
> > > >
> > > > Maybe I am missing the point, but I don't feel this is something so
> > > > strange, or so crazy!
> > >
> > >
> > > >On Wednesday, September 11, 2019, 

Re: Multiple versions of Apache Hadoop YARN as a Service

2019-09-25 Thread yuliya Feldman
 Hello,
Thank you for the diagrams - it helps. Could you also enable comments in your 
doc?
A few thoughts:
1. Myriad Scheduler is wonderful - but it's yet another scheduler you need to
deal with - or do you plan to have the current MyriadScheduler that sits in
the RM and use it instead?
2. Is Myriad scheduler going to control NMs for all YARN clusters?
    a. How will NMs come and go?
    b. Are they going to be permanent?
    c. I assume RMs will be permanent until cluster shutdown, right?
3. If NMs will not be permanent - isn't it too much for the upper level
Myriad Scheduler to deal with all of them?

Also could you please list cons - pros are great, but it's better to have cons 
as well.
Thanks, Yuliya


On Wednesday, September 25, 2019, 12:30:20 AM PDT, Oscar Fernandez 
 wrote:  
 
 Hi,

I've made a diagram to represent the new proposed design in order to
support Yarn as a service with some of the pros:
https://docs.google.com/document/d/15X0-zSu0G0BDpWyndRhbvAJCXLtAbkNA45wQ_xVKOKQ

Thank you for all your comments and help

On Tue, Sep 24, 2019 at 8:57 PM Javi Roman  wrote:

> Honestly your opinion is welcome, this kind of discussions are great
> in this small traffic dev list ;-)
> --
> Javi Roman
>
> Twitter: @javiromanrh
> GitHub: github.com/javiroman
> Linkedin: es.linkedin.com/in/javiroman
> Big Data Blog: dataintensive.info
> Apache Id: javiroman
>
> On Tue, Sep 24, 2019 at 8:55 PM yuliya Feldman
>  wrote:
> >
> >  I am not saying it's crazy. I was voicing my opinion. Isn't it what was
> the purpose of the discussion?
> > It's definitely great to have UI that manages all the YARN clusters, but
> it's not like UI/Web service has to be coupled/collocated with any of the
> Myriad particular YARN version daemons.
> > It's great if you would provide write up with pros and cons for your
> approach or any alternative approaches.
> >
> >
> >    On Tuesday, September 24, 2019, 11:38:13 AM PDT, Javi Roman <
> jroman.espi...@gmail.com> wrote:
> >
> >  On Tue, Sep 24, 2019 at 7:58 PM yuliya Feldman
> >  wrote:
> > >
> > >  Hello,
> > > Again I apologize for the late reply.
> > > I think I replied to the thread, but will add more direct notes here
> > > What you are proposing is to have yet another daemon that would start
> Yarn Clusters on demand within Mesos framework.
> > > Meaning - it would be another layer of abstraction.  In this case that
> new layer would need to behave as second level scheduler and deal with
> third level scheduler(s) (RMs) to propagate offers from Mesos and keep
> track, etc.
> > > I am sure you can somehow use concept of Capacity and/or FairShare
> scheduler in your new layer to do the job. I am just not very much
> convinced that 3 layers of scheduling will be easy to
> maintain/reconcile/etc.
> > > Again - if I understand your design correctly.
> > > Would be great if you do a small write up with the proposal and have
> some simple diagram of services interactions.
> > > Just my 2c.
> > > Thanks,Yuliya
> >
> > Great, I will do a diagram!
> >
> > Just to clarify:
> >
> > Myriad is registered as a framework in the Mesos master. The same thread
> > starts the API server and the user interface. Through the user
> > interface you select the YARN version to run, and the scheduler gets
> > resources from the master for running the RM and NMs. So you can manage as
> > many YARN schedulers as you want. YARN as a Service.
> >
> > Maybe I am missing the point, but I don't feel this is something so
> > strange, or so crazy!
> >
> >
> > >    On Wednesday, September 11, 2019, 11:55:07 PM PDT, Oscar Fernandez <
> oscarf...@apache.org> wrote:
> > >
> > >  Hi,
> > >
> > > I've started working on
> https://issues.apache.org/jira/browse/MYRIAD-295 -
> > > Multiple versions of Apache Hadoop YARN as a Service.
> > >
> > > In order to implement this, we should avoid starting the Myriad
> framework
> > > from Yarn and instead starting Yarn(s) from Myriad on demand.
> > >
> > > I wanted to ask the Myriad community if this design was intended for a
> > > reason or if you think it's a good idea to decouple the execution of
> Myriad
> > > from the Yarn RM. With the new design, the Myriad Framework would
> register
> > > on Mesos, and then, start on demand the RM and NM that the user wants,
> > > allowing several Yarn clusters to run in the same Mesos, even with
> different
> > > versions.
> > >
> > > Thank you
> > >
> >
>
  

Re: Multiple versions of Apache Hadoop YARN as a Service

2019-09-25 Thread Oscar Fernandez
Hi,

I've made a diagram to represent the new proposed design in order to
support Yarn as a service with some of the pros:
https://docs.google.com/document/d/15X0-zSu0G0BDpWyndRhbvAJCXLtAbkNA45wQ_xVKOKQ

Thank you for all your comments and help

On Tue, Sep 24, 2019 at 8:57 PM Javi Roman  wrote:

> Honestly your opinion is welcome, this kind of discussions are great
> in this small traffic dev list ;-)
> --
> Javi Roman
>
> Twitter: @javiromanrh
> GitHub: github.com/javiroman
> Linkedin: es.linkedin.com/in/javiroman
> Big Data Blog: dataintensive.info
> Apache Id: javiroman
>
> On Tue, Sep 24, 2019 at 8:55 PM yuliya Feldman
>  wrote:
> >
> >  I am not saying it's crazy. I was voicing my opinion. Isn't it what was
> the purpose of the discussion?
> > It's definitely great to have UI that manages all the YARN clusters, but
> it's not like UI/Web service has to be coupled/collocated with any of the
> Myriad particular YARN version daemons.
> > It's great if you would provide write up with pros and cons for your
> approach or any alternative approaches.
> >
> >
> > On Tuesday, September 24, 2019, 11:38:13 AM PDT, Javi Roman <
> jroman.espi...@gmail.com> wrote:
> >
> >  On Tue, Sep 24, 2019 at 7:58 PM yuliya Feldman
> >  wrote:
> > >
> > >  Hello,
> > > Again I apologize for the late reply.
> > > I think I replied to the thread, but will add more direct notes here
> > > What you are proposing is to have yet another daemon that would start
> Yarn Clusters on demand within Mesos framework.
> > > Meaning - it would be another layer of abstraction.  In this case that
> new layer would need to behave as second level scheduler and deal with
> third level scheduler(s) (RMs) to propagate offers from Mesos and keep
> track, etc.
> > > I am sure you can somehow use concept of Capacity and/or FairShare
> scheduler in your new layer to do the job. I am just not very much
> convinced that 3 layers of scheduling will be easy to
> maintain/reconcile/etc.
> > > Again - if I understand your design correctly.
> > > Would be great if you do a small write up with the proposal and have
> some simple diagram of services interactions.
> > > Just my 2c.
> > > Thanks,Yuliya
> >
> > Great, I will do a diagram!
> >
> > Just to clarify:
> >
> > Myriad is registered as a framework in the Mesos master. The same thread
> > starts the API server and the user interface. Through the user
> > interface you select the YARN version to run, and the scheduler gets
> > resources from the master for running the RM and NMs. So you can manage as
> > many YARN schedulers as you want. YARN as a Service.
> >
> > Maybe I am missing the point, but I don't feel this is something so
> > strange, or so crazy!
> >
> >
> > >On Wednesday, September 11, 2019, 11:55:07 PM PDT, Oscar Fernandez <
> oscarf...@apache.org> wrote:
> > >
> > >  Hi,
> > >
> > > I've started working on
> https://issues.apache.org/jira/browse/MYRIAD-295 -
> > > Multiple versions of Apache Hadoop YARN as a Service.
> > >
> > > In order to implement this, we should avoid starting the Myriad
> framework
> > > from Yarn and instead starting Yarn(s) from Myriad on demand.
> > >
> > > I wanted to ask the Myriad community if this design was intended for a
> > > reason or if you think it's a good idea to decouple the execution of
> Myriad
> > > from the Yarn RM. With the new design, the Myriad Framework would
> register
> > > on Mesos, and then, start on demand the RM and NM that the user wants,
> > > allowing several Yarn clusters to run in the same Mesos, even with
> different
> > > versions.
> > >
> > > Thank you
> > >
> >
>


Re: Multiple versions of Apache Hadoop YARN as a Service

2019-09-24 Thread Javi Roman
Honestly your opinion is welcome, this kind of discussions are great
in this small traffic dev list ;-)
--
Javi Roman

Twitter: @javiromanrh
GitHub: github.com/javiroman
Linkedin: es.linkedin.com/in/javiroman
Big Data Blog: dataintensive.info
Apache Id: javiroman

On Tue, Sep 24, 2019 at 8:55 PM yuliya Feldman
 wrote:
>
>  I am not saying it's crazy. I was voicing my opinion. Isn't it what was the 
> purpose of the discussion?
> It's definitely great to have UI that manages all the YARN clusters, but it's 
> not like UI/Web service has to be coupled/collocated with any of the Myriad 
> particular YARN version daemons.
> It's great if you would provide write up with pros and cons for your approach 
> or any alternative approaches.
>
>
> On Tuesday, September 24, 2019, 11:38:13 AM PDT, Javi Roman 
>  wrote:
>
>  On Tue, Sep 24, 2019 at 7:58 PM yuliya Feldman
>  wrote:
> >
> >  Hello,
> > Again I apologize for the late reply.
> > I think I replied to the thread, but will add more direct notes here
> > What you are proposing is to have yet another daemon that would start Yarn 
> > Clusters on demand within Mesos framework.
> > Meaning - it would be another layer of abstraction.  In this case that new 
> > layer would need to behave as second level scheduler and deal with third 
> > level scheduler(s) (RMs) to propagate offers from Mesos and keep track, etc.
> > I am sure you can somehow use concept of Capacity and/or FairShare 
> > scheduler in your new layer to do the job. I am just not very much 
> > convinced that 3 layers of scheduling will be easy to 
> > maintain/reconcile/etc.
> > Again - if I understand your design correctly.
> > Would be great if you do a small write up with the proposal and have some 
> > simple diagram of services interactions.
> > Just my 2c.
> > Thanks,Yuliya
>
> Great, I will do a diagram!
>
> Just to clarify:
>
> Myriad is registered as a framework in the Mesos master. The same thread
> starts the API server and the user interface. Through the user
> interface you select the YARN version to run, and the scheduler gets
> resources from the master for running the RM and NMs. So you can manage as
> many YARN schedulers as you want. YARN as a Service.
>
> Maybe I am missing the point, but I don't feel this is something so
> strange, or so crazy!
>
>
> >On Wednesday, September 11, 2019, 11:55:07 PM PDT, Oscar Fernandez 
> >  wrote:
> >
> >  Hi,
> >
> > I've started working on https://issues.apache.org/jira/browse/MYRIAD-295 -
> > Multiple versions of Apache Hadoop YARN as a Service.
> >
> > In order to implement this, we should avoid starting the Myriad framework
> > from Yarn and instead starting Yarn(s) from Myriad on demand.
> >
> > I wanted to ask the Myriad community if this design was intended for a
> > reason or if you think it's a good idea to decouple the execution of Myriad
> > from the Yarn RM. With the new design, the Myriad Framework would register
> > on Mesos, and then, start on demand the RM and NM that the user wants,
> > allowing several Yarn clusters to run in the same Mesos, even with different
> > versions.
> >
> > Thank you
> >
>


Re: Multiple versions of Apache Hadoop YARN as a Service

2019-09-24 Thread yuliya Feldman
 I am not saying it's crazy. I was voicing my opinion. Isn't it what was the 
purpose of the discussion?
It's definitely great to have UI that manages all the YARN clusters, but it's 
not like UI/Web service has to be coupled/collocated with any of the Myriad 
particular YARN version daemons.
It's great if you would provide write up with pros and cons for your approach 
or any alternative approaches.


On Tuesday, September 24, 2019, 11:38:13 AM PDT, Javi Roman 
 wrote:  
 
 On Tue, Sep 24, 2019 at 7:58 PM yuliya Feldman
 wrote:
>
>  Hello,
> Again I apologize for the late reply.
> I think I replied to the thread, but will add more direct notes here
> What you are proposing is to have yet another daemon that would start Yarn 
> Clusters on demand within Mesos framework.
> Meaning - it would be another layer of abstraction.  In this case that new 
> layer would need to behave as second level scheduler and deal with third 
> level scheduler(s) (RMs) to propagate offers from Mesos and keep track, etc.
> I am sure you can somehow use concept of Capacity and/or FairShare scheduler 
> in your new layer to do the job. I am just not very much convinced that 3 
> layers of scheduling will be easy to maintain/reconcile/etc.
> Again - if I understand your design correctly.
> Would be great if you do a small write up with the proposal and have some 
> simple diagram of services interactions.
> Just my 2c.
> Thanks,Yuliya

Great, I will do a diagram!

Just to clarify:

Myriad is registered as a framework in the Mesos master. The same thread
starts the API server and the user interface. Through the user
interface you select the YARN version to run, and the scheduler gets
resources from the master for running the RM and NMs. So you can manage as
many YARN schedulers as you want. YARN as a Service.

Maybe I am missing the point, but I don't feel this is something so
strange, or so crazy!


>    On Wednesday, September 11, 2019, 11:55:07 PM PDT, Oscar Fernandez 
> wrote:
>
>  Hi,
>
> I've started working on https://issues.apache.org/jira/browse/MYRIAD-295 -
> Multiple versions of Apache Hadoop YARN as a Service.
>
> In order to implement this, we should avoid starting the Myriad framework
> from Yarn and instead starting Yarn(s) from Myriad on demand.
>
> I wanted to ask the Myriad community if this design was intended for a
> reason or if you think it's a good idea to decouple the execution of Myriad
> from the Yarn RM. With the new design, the Myriad Framework would register
> on Mesos, and then, start on demand the RM and NM that the user wants,
> allowing several Yarn clusters to run in the same Mesos, even with different
> versions.
>
> Thank you
>
  

Re: Multiple versions of Apache Hadoop YARN as a Service

2019-09-24 Thread yuliya Feldman
It's true, with one "but": you need yet another scheduler in your service, and
that's not a trivial feat if you want it to fulfill its purpose; otherwise
Marathon is as good as it is. As far as I remember, Marathon has a simple
FIFO one (maybe it has evolved, though).
In any case - it was my opinion :). You guys know better and are closer to it.
Thanks,Yuliya


On Tuesday, September 24, 2019, 11:32:32 AM PDT, Javi Roman 
 wrote:  
 
 On Tue, Sep 24, 2019 at 7:48 PM yuliya Feldman
 wrote:
>
>  >>> I guess we are talking about the concept of multi-service scheduler

Are you proposing to have a multi service scheduler? I thought that's
what Mesos is for? What am I missing here?

Yes, it is common in Mesos (IMHO): Marathon is a multi-service scheduler,
and Apache Aurora too. Please take a look here:

https://mesosphere.github.io/dcos-commons/multi-service/


> On Tue, Sep 24, 2019 at 5:40 PM Yuliya  wrote:
> >
> > Hello there,
> >
> > Sorry for late reply.
> >
> > Frankly speaking I don’t know the original motivation, I would probably say
> > that it started organically as a need.
> >
> > What do you mean by starting yarn from myriad? I believe myriad daemons 
> > encompass some functionality of yarn daemons, namely rm and nm, so it’s not 
> > yarn that starts myriad, but myriad daemons that play role of yarn daemons.
> >  Unless I am missing something here.
> >
> > Are you proposing to have the same myriad daemons starting different 
> > versions of yarn?
> > I would consider a different set of docker containers built with different
> > versions of yarn to be better decoupling. I am open for discussion though.
> >
> > Thanks,
> > Yuliya
> >
> >
> >
> > > On Sep 15, 2019, at 10:35 PM, Javi Roman  wrote:
> > >
> > > Hi Oscar,
> > >
> > > I have to say I don't know the initial motivation of this design. You
> > > are right the way of starting Myriad, strongly coupled to YARN is a
> > > little bit weird.
> > > Because of lack of activity of the initial committers, this is a
> > > question that probably we never get a clear answer.
> > >
> > > By the way, your proposal, according with MYRIAD-295 is, from my
> > > understanding, the right way to go ahead with the project.
> > >
> > > This new design is totally aligned with the further Myriad UI design
> > > (https://issues.apache.org/jira/browse/MYRIAD-279).
> > >
> > > The document design of this new UI here:
> > > https://docs.google.com/document/d/16gA67RXoPK24OIxDMNNhuYS8ioScI1eOBR-XMMPjWQE/edit?usp=sharing
> > > --
> > > Javi Roman
> > >
> > > Twitter: @javiromanrh
> > > GitHub: github.com/javiroman
> > > Linkedin: es.linkedin.com/in/javiroman
> > > Big Data Blog: dataintensive.info
> > > Apache Id: javiroman
> > >
> > >> On Thu, Sep 12, 2019 at 8:55 AM Oscar Fernandez  
> > >> wrote:
> > >>
> > >> Hi,
> > >>
> > >> I've started working on https://issues.apache.org/jira/browse/MYRIAD-295 
> > >> -
> > >> Multiple versions of Apache Hadoop YARN as a Service.
> > >>
> > >> In order to implement this, we should avoid starting the Myriad framework
> > >> from Yarn and instead starting Yarn(s) from Myriad on demand.
> > >>
> > >> I wanted to ask the Myriad community if this design was intended for a
> > >> reason or if you think it's a good idea to decouple the execution of 
> > >> Myriad
> > >> from the Yarn RM. With the new design, the Myriad Framework would 
> > >> register
> > >> on Mesos, and then, start on demand the RM and NM that the user wants,
> > > >> allowing several Yarn clusters to run in the same Mesos, even with
> > >> different
> > >> versions.
> > >>
> > >> Thank you
> >  

Re: Multiple versions of Apache Hadoop YARN as a Service

2019-09-24 Thread Javi Roman
On Tue, Sep 24, 2019 at 7:58 PM yuliya Feldman
 wrote:
>
>  Hello,
> Again I apologize for the late reply.
> I think I replied to the thread, but will add more direct notes here
> What you are proposing is to have yet another daemon that would start Yarn 
> Clusters on demand within Mesos framework.
> Meaning - it would be another layer of abstraction.  In this case that new 
> layer would need to behave as second level scheduler and deal with third 
> level scheduler(s) (RMs) to propagate offers from Mesos and keep track, etc.
> I am sure you can somehow use concept of Capacity and/or FairShare scheduler 
> in your new layer to do the job. I am just not very much convinced that 3 
> layers of scheduling will be easy to maintain/reconcile/etc.
> Again - if I understand your design correctly.
> Would be great if you do a small write up with the proposal and have some 
> simple diagram of services interactions.
> Just my 2c.
> Thanks,Yuliya

Great, I will do a diagram!

Just to clarify:

Myriad is registered as a framework in the Mesos master. The same thread
starts the API server and the user interface. Through the user
interface you select the YARN version to run, and the scheduler gets
resources from the master for running the RM and NMs. So you can manage as
many YARN schedulers as you want. YARN as a Service.

Maybe I am missing the point, but I don't feel this is something so
strange, or so crazy!
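
The flow described above (pick a YARN version, then use resources from the master to run the RM and NMs) can be sketched in Python as follows. This is a sketch under stated assumptions, not Myriad code: `Task`, `plan_cluster` and `on_offer` are hypothetical names, the CPU and memory sizes are made up, and a Mesos offer is reduced to two numbers.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List


@dataclass(frozen=True)
class Task:
    """Hypothetical description of one daemon to launch on Mesos."""
    cluster: str
    role: str           # "RM" or "NM"
    yarn_version: str
    cpus: float
    mem_mb: int


def plan_cluster(cluster: str, yarn_version: str, nm_count: int) -> Deque[Task]:
    """Queue the RM first, then the NMs; the sizes are illustrative only."""
    tasks: Deque[Task] = deque([Task(cluster, "RM", yarn_version, 1.0, 2048)])
    tasks.extend(Task(cluster, "NM", yarn_version, 2.0, 4096)
                 for _ in range(nm_count))
    return tasks


def on_offer(pending: Deque[Task], offer_cpus: float,
             offer_mem_mb: int) -> List[Task]:
    """Launch pending tasks in queue order for as long as they fit the offer."""
    launched: List[Task] = []
    while (pending
           and pending[0].cpus <= offer_cpus
           and pending[0].mem_mb <= offer_mem_mb):
        task = pending.popleft()
        offer_cpus -= task.cpus
        offer_mem_mb -= task.mem_mb
        launched.append(task)
    return launched
```

Strict queue order keeps the RM ahead of its NMs, at the cost of head-of-line blocking: a task that does not fit holds back smaller tasks behind it. With several clusters, each would need its own queue plus a policy for sharing offers between them.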


> On Wednesday, September 11, 2019, 11:55:07 PM PDT, Oscar Fernandez 
>  wrote:
>
>  Hi,
>
> I've started working on https://issues.apache.org/jira/browse/MYRIAD-295 -
> Multiple versions of Apache Hadoop YARN as a Service.
>
> In order to implement this, we should avoid starting the Myriad framework
> from Yarn and instead starting Yarn(s) from Myriad on demand.
>
> I wanted to ask the Myriad community if this design was intended for a
> reason or if you think it's a good idea to decouple the execution of Myriad
> from the Yarn RM. With the new design, the Myriad Framework would register
> on Mesos, and then, start on demand the RM and NM that the user wants,
> allowing several Yarn clusters to run in the same Mesos, even with different
> versions.
>
> Thank you
>


Re: Multiple versions of Apache Hadoop YARN as a Service

2019-09-24 Thread Javi Roman
On Tue, Sep 24, 2019 at 7:48 PM yuliya Feldman
 wrote:
>
>  >>> I guess we are talking about the concept of multi-service scheduler

Are you proposing to have a multi service scheduler? I thought that's
what Mesos is for? What am I missing here?

Yes, it is common in Mesos (IMHO): Marathon is a multi-service scheduler,
and Apache Aurora too. Please take a look here:

https://mesosphere.github.io/dcos-commons/multi-service/


> On Tue, Sep 24, 2019 at 5:40 PM Yuliya  wrote:
> >
> > Hello there,
> >
> > Sorry for late reply.
> >
> > Frankly speaking I don’t know the original motivation, I would probably say
> > that it started organically as a need.
> >
> > What do you mean by starting yarn from myriad? I believe myriad daemons 
> > encompass some functionality of yarn daemons, namely rm and nm, so it’s not 
> > yarn that starts myriad, but myriad daemons that play role of yarn daemons.
> >  Unless I am missing something here.
> >
> > Are you proposing to have the same myriad daemons starting different 
> > versions of yarn?
> > I would consider a different set of docker containers built with different
> > versions of yarn to be better decoupling. I am open for discussion
> > though.
> >
> > Thanks,
> > Yuliya
> >
> >
> >
> > > On Sep 15, 2019, at 10:35 PM, Javi Roman  wrote:
> > >
> > > Hi Oscar,
> > >
> > > I have to say I don't know the initial motivation of this design. You
> > > are right the way of starting Myriad, strongly coupled to YARN is a
> > > little bit weird.
> > > Because of lack of activity of the initial committers, this is a
> > > question that probably we never get a clear answer.
> > >
> > > By the way, your proposal, according with MYRIAD-295 is, from my
> > > understanding, the right way to go ahead with the project.
> > >
> > > This new design is totally aligned with the further Myriad UI design
> > > (https://issues.apache.org/jira/browse/MYRIAD-279).
> > >
> > > The document design of this new UI here:
> > > https://docs.google.com/document/d/16gA67RXoPK24OIxDMNNhuYS8ioScI1eOBR-XMMPjWQE/edit?usp=sharing
> > > --
> > > Javi Roman
> > >
> > > Twitter: @javiromanrh
> > > GitHub: github.com/javiroman
> > > Linkedin: es.linkedin.com/in/javiroman
> > > Big Data Blog: dataintensive.info
> > > Apache Id: javiroman
> > >
> > >> On Thu, Sep 12, 2019 at 8:55 AM Oscar Fernandez  
> > >> wrote:
> > >>
> > >> Hi,
> > >>
> > >> I've started working on https://issues.apache.org/jira/browse/MYRIAD-295 
> > >> -
> > >> Multiple versions of Apache Hadoop YARN as a Service.
> > >>
> > >> In order to implement this, we should avoid starting the Myriad framework
> > >> from Yarn and instead starting Yarn(s) from Myriad on demand.
> > >>
> > >> I wanted to ask the Myriad community if this design was intended for a
> > >> reason or if you think it's a good idea to decouple the execution of 
> > >> Myriad
> > >> from the Yarn RM. With the new design, the Myriad Framework would 
> > >> register
> > >> on Mesos, and then, start on demand the RM and NM that the user wants,
> > >> allowing several Yarn clusters to run in he same Mesos, even with 
> > >> different
> > >> versions.
> > >>
> > >> Thank you
> >


Re: Multiple versions of Apache Hadoop YARN as a Service

2019-09-24 Thread yuliya Feldman
Hello,

Again, I apologize for the late reply.
I think I replied to the thread, but I will add more direct notes here.

What you are proposing is to have yet another daemon that would start YARN
clusters on demand within the Mesos framework. That means another layer of
abstraction. In this case the new layer would need to behave as a
second-level scheduler and deal with the third-level scheduler(s) (the RMs):
propagating offers from Mesos to them, keeping track of those offers, etc.

I am sure you can somehow use the concept of the Capacity and/or Fair
Scheduler in your new layer to do the job. I am just not convinced that
three layers of scheduling will be easy to maintain, reconcile, etc.
Again, that is if I understand your design correctly.

It would be great if you could do a small write-up of the proposal with a
simple diagram of the service interactions.

Just my 2c.

Thanks,
Yuliya
On Wednesday, September 11, 2019, 11:55:07 PM PDT, Oscar Fernandez wrote:

Hi,

I've started working on https://issues.apache.org/jira/browse/MYRIAD-295 -
Multiple versions of Apache Hadoop YARN as a Service.

In order to implement this, we should avoid starting the Myriad framework
from Yarn and instead start Yarn(s) from Myriad on demand.

I wanted to ask the Myriad community if this design was intended for a
reason or if you think it's a good idea to decouple the execution of Myriad
from the Yarn RM. With the new design, the Myriad Framework would register
on Mesos and then start, on demand, the RM and NM that the user wants,
allowing several Yarn clusters to run in the same Mesos, even with different
versions.

Thank you
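
To make the second-level scheduling concern in this message concrete, here is a
minimal sketch in Python. Everything in it is an assumption made for
illustration: `ClusterDemand` and `max_min_fair_share` are hypothetical names,
not Myriad or Mesos APIs, and max-min fairness is only one candidate policy,
not something the thread settles on. It shows how a layer sitting between
Mesos and several RMs might split the CPUs of an offer so that one YARN
cluster is not starved in favor of another.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ClusterDemand:
    """Hypothetical record: how many CPUs one YARN cluster is asking for."""
    name: str
    cpus_wanted: float


def max_min_fair_share(offered_cpus: float,
                       demands: List[ClusterDemand]) -> Dict[str, float]:
    """Split offered CPUs across clusters using max-min fairness.

    Small demands are satisfied in full first; whatever is left is shared
    equally among the clusters whose demand cannot be met.
    """
    allocation = {d.name: 0.0 for d in demands}
    remaining = float(offered_cpus)
    # Process clusters from the smallest demand to the largest.
    unsatisfied = sorted(demands, key=lambda d: d.cpus_wanted)

    while unsatisfied and remaining > 1e-9:
        share = remaining / len(unsatisfied)
        smallest = unsatisfied[0]
        need = smallest.cpus_wanted - allocation[smallest.name]
        if need <= share:
            # The smallest demand fits inside an equal share: grant it fully
            # and redistribute the rest among the other clusters.
            allocation[smallest.name] += need
            remaining -= need
            unsatisfied.pop(0)
        else:
            # No remaining cluster can be fully satisfied: split equally.
            for d in unsatisfied:
                allocation[d.name] += share
            remaining = 0.0
    return allocation


if __name__ == "__main__":
    demands = [ClusterDemand("a", 2), ClusterDemand("b", 8), ClusterDemand("c", 8)]
    print(max_min_fair_share(10, demands))  # {'a': 2.0, 'b': 4.0, 'c': 4.0}
```

A real second-level scheduler would also have to account for memory, locality
and offers that arrive per agent rather than as one pool; the sketch ignores
all of that.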
  

Re: Multiple versions of Apache Hadoop YARN as a Service

2019-09-24 Thread yuliya Feldman
>>> I guess we are talking about the concept of multi-service scheduler

Are you proposing to have a multi-service scheduler? I thought that's what
Mesos is for? What am I missing here?

Thanks,
Yuliya

On Tuesday, September 24, 2019, 10:00:39 AM PDT, Javi Roman wrote:
 
 I guess we are talking about the concept of multi-service scheduler, or one
framework for multiple services, in our case multiple YARN services
(different versions, and so on).

Javi Roman

Twitter: @javiromanrh
GitHub: github.com/javiroman
Linkedin: es.linkedin.com/in/javiroman
Big Data Blog: dataintensive.info
Apache Id: javiroman

Re: Multiple versions of Apache Hadoop YARN as a Service

2019-09-24 Thread Javi Roman
I guess we are talking about the concept of multi-service scheduler, or one
framework for multiple services, in our case multiple YARN services
(different versions, and so on).

Javi Roman

Twitter: @javiromanrh
GitHub: github.com/javiroman
Linkedin: es.linkedin.com/in/javiroman
Big Data Blog: dataintensive.info
Apache Id: javiroman
On Tue, Sep 24, 2019 at 5:40 PM Yuliya  wrote:
>
> Hello there,
>
> Sorry for the late reply.
>
> Frankly speaking, I don’t know the original motivation; I would probably say
> that it started organically, out of a need.
>
> What do you mean by starting YARN from Myriad? I believe the Myriad daemons
> encompass some functionality of the YARN daemons, namely the RM and NM, so
> it’s not YARN that starts Myriad, but Myriad daemons that play the role of
> YARN daemons. Unless I am missing something here.
>
> Are you proposing to have the same Myriad daemons start different versions
> of YARN? I think different sets of Docker containers built with different
> versions of YARN would be a better decoupling. I am open to discussion, though.
>
> Thanks,
> Yuliya


Re: Multiple versions of Apache Hadoop YARN as a Service

2019-09-24 Thread Yuliya
Hello there,

Sorry for the late reply.

Frankly speaking, I don’t know the original motivation; I would probably say
that it started organically, out of a need.

What do you mean by starting YARN from Myriad? I believe the Myriad daemons
encompass some functionality of the YARN daemons, namely the RM and NM, so
it’s not YARN that starts Myriad, but Myriad daemons that play the role of
YARN daemons. Unless I am missing something here.

Are you proposing to have the same Myriad daemons start different versions
of YARN? I think different sets of Docker containers built with different
versions of YARN would be a better decoupling. I am open to discussion, though.

Thanks,
Yuliya 



> On Sep 15, 2019, at 10:35 PM, Javi Roman  wrote:
> 
> Hi Oscar,
> 
> I have to say I don't know the initial motivation for this design. You
> are right: the way Myriad is started, strongly coupled to YARN, is a
> little bit weird.
> Because of the lack of activity from the initial committers, this is a
> question for which we will probably never get a clear answer.
>
> By the way, your proposal, in line with MYRIAD-295, is, from my
> understanding, the right way to go ahead with the project.
>
> This new design is totally aligned with the future Myriad UI design
> (https://issues.apache.org/jira/browse/MYRIAD-279).
>
> The design document for this new UI is here:
> https://docs.google.com/document/d/16gA67RXoPK24OIxDMNNhuYS8ioScI1eOBR-XMMPjWQE/edit?usp=sharing
> --
> Javi Roman
> 
> Twitter: @javiromanrh
> GitHub: github.com/javiroman
> Linkedin: es.linkedin.com/in/javiroman
> Big Data Blog: dataintensive.info
> Apache Id: javiroman



Re: Multiple versions of Apache Hadoop YARN as a Service

2019-09-15 Thread Javi Roman
Hi Oscar,

I have to say I don't know the initial motivation for this design. You
are right: the way Myriad is started, strongly coupled to YARN, is a
little bit weird.
Because of the lack of activity from the initial committers, this is a
question for which we will probably never get a clear answer.

By the way, your proposal, in line with MYRIAD-295, is, from my
understanding, the right way to go ahead with the project.

This new design is totally aligned with the future Myriad UI design
(https://issues.apache.org/jira/browse/MYRIAD-279).

The design document for this new UI is here:
https://docs.google.com/document/d/16gA67RXoPK24OIxDMNNhuYS8ioScI1eOBR-XMMPjWQE/edit?usp=sharing
--
Javi Roman

Twitter: @javiromanrh
GitHub: github.com/javiroman
Linkedin: es.linkedin.com/in/javiroman
Big Data Blog: dataintensive.info
Apache Id: javiroman

On Thu, Sep 12, 2019 at 8:55 AM Oscar Fernandez  wrote:
>
> Hi,
>
> I've started working on https://issues.apache.org/jira/browse/MYRIAD-295 -
> Multiple versions of Apache Hadoop YARN as a Service.
>
> In order to implement this, we should avoid starting the Myriad framework
> from Yarn and instead start Yarn(s) from Myriad on demand.
>
> I wanted to ask the Myriad community if this design was intended for a
> reason or if you think it's a good idea to decouple the execution of Myriad
> from the Yarn RM. With the new design, the Myriad Framework would register
> on Mesos and then start, on demand, the RM and NM that the user wants,
> allowing several Yarn clusters to run in the same Mesos, even with different
> versions.
>
> Thank you
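
The design described in the quoted proposal can be sketched roughly as follows.
This is an illustration in Python under stated assumptions, not Myriad code:
`YarnCluster`, `ClusterRegistry` and all of their methods are hypothetical
names, and the cluster names, hosts and version strings are made-up examples.
The only real identifier is the YARN property `yarn.resourcemanager.hostname`.
The sketch tracks several YARN clusters with different versions side by side
and assumes an NM can only be configured once the location of that cluster's
RM is known.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class YarnCluster:
    """Hypothetical state the framework keeps for one YARN cluster."""
    name: str
    hadoop_version: str
    rm_host: Optional[str] = None            # unknown until the RM task runs
    node_managers: List[str] = field(default_factory=list)


class ClusterRegistry:
    """Hypothetical registry of the YARN clusters started on demand."""

    def __init__(self) -> None:
        self._clusters: Dict[str, YarnCluster] = {}

    def create_cluster(self, name: str, hadoop_version: str) -> YarnCluster:
        if name in self._clusters:
            raise ValueError(f"cluster {name!r} already exists")
        cluster = YarnCluster(name, hadoop_version)
        self._clusters[name] = cluster
        return cluster

    def rm_started(self, name: str, host: str) -> None:
        """Record where the RM of a cluster ended up running."""
        self._clusters[name].rm_host = host

    def add_node_manager(self, name: str, host: str) -> Dict[str, str]:
        """Register an NM and return the settings it needs to find its RM."""
        cluster = self._clusters[name]
        if cluster.rm_host is None:
            raise RuntimeError(f"RM for cluster {name!r} is not running yet")
        cluster.node_managers.append(host)
        return {"yarn.resourcemanager.hostname": cluster.rm_host}

    def versions(self) -> Dict[str, str]:
        return {c.name: c.hadoop_version for c in self._clusters.values()}


if __name__ == "__main__":
    registry = ClusterRegistry()
    registry.create_cluster("analytics", "2.7.7")   # example version strings
    registry.create_cluster("etl", "3.2.1")
    registry.rm_started("etl", "agent-9")
    print(registry.add_node_manager("etl", "agent-1"))
    print(registry.versions())
```

Keeping the RM location in one place is what would let NMs come and go on
demand while each one is still pointed at the right cluster.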