Re: external shuffle service in mesos
Hi Susan,

Yes, I agree with you regarding resource accounting. IMHO, in this case the shuffle service must run on the node no matter what resources are available (in the same way that we don't account for the resources that the "system" takes: the Mesos agent, the OS itself, and any other process running on the same machine).

One additional argument against managing it with Puppet/Chef is that this management becomes a "leaky abstraction": usually we submit Spark frameworks through Mesos and give it any Spark distribution URI, whereas to get this shuffle service running as a daemon on every node I need to install a specific version of the Spark distribution on that node. Then, when upgrading the Spark version, it's not enough to give a new URI to Mesos; I need to create a new shuffle service that uses the new Spark distro (and then port/dir/other conflicts have to be resolved).

--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

To unsubscribe e-mail: user-unsubscr...@spark.apache.org
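The port conflict mentioned in the message above can be made concrete with a small sketch. It assumes that each Spark version gets its own shuffle service listening on its own port, and that a job is pointed at the matching service through the standard `spark.shuffle.service.*` properties. The helper name `shuffle_conf_args`, the version-to-port table, and the version numbers are hypothetical and only for illustration; 7337 is the default value of `spark.shuffle.service.port`.

```python
# Hypothetical mapping: one shuffle service per Spark version, each on its own port.
# 7337 is Spark's default shuffle service port; 7338 is an arbitrary second choice.
PORT_BY_SPARK_VERSION = {"2.2.1": 7337, "2.3.0": 7338}


def shuffle_conf_args(port, dynamic_allocation=True):
    """Return spark-submit '--conf' arguments that point a job at the
    external shuffle service listening on the given port."""
    conf = {
        "spark.shuffle.service.enabled": "true",
        "spark.shuffle.service.port": str(port),
    }
    if dynamic_allocation:
        conf["spark.dynamicAllocation.enabled"] = "true"

    args = []
    # Sort the keys so the generated command line is deterministic.
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Arguments for a job built against the (hypothetical) newer distribution.
    print(" ".join(shuffle_conf_args(PORT_BY_SPARK_VERSION["2.3.0"])))
```

This only covers the port side of the problem; directory and other conflicts between two side-by-side services would still have to be handled separately.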
Re: external shuffle service in mesos
Hi Igor,

You made a good point about the tradeoffs. I think the main thing you would get with Marathon is the accounting for resources (the memory and cpus specified in the config file). That allows Mesos to manage the resources properly. I don't think the other tools mentioned would reserve resources from Mesos.

If you want more information about production ops for Mesos, you might want to ask in the Mesos mailing list. Or, you can check out the https://dcos.io/community/ project.

Susan

On Sat, Jan 20, 2018 at 11:59 PM, igor.berman wrote:

> Hi Susan
>
> In general I can get what I need without Marathon, with configuring external-shuffle-service with puppet/ansible/chef + maybe some alerts for checks.
>
> I mean in companies that don't have strong Devops teams and want to install services as simple as possible just by config - Marathon might be useful, however if company already has strong puppet/ansible/chef whatever infra, the Marathon addition(additional component) and management is less clear
>
> WDYT?

--
Susan X. Huynh
Software engineer, Data Agility
xhu...@mesosphere.com
Re: external shuffle service in mesos
Hi Susan,

In general I can get what I need without Marathon, by configuring the external shuffle service with Puppet/Ansible/Chef, plus maybe some alerts for checks.

I mean, for companies that don't have strong DevOps teams and want to install services as simply as possible, just by config, Marathon might be useful. However, if a company already has strong Puppet/Ansible/Chef (or similar) infrastructure, the benefit of adding and managing Marathon as an additional component is less clear.

WDYT?
Re: external shuffle service in mesos
Hi Igor,

The best way I know of is with Marathon.

* Placement constraint: you could combine constraints in Marathon. Like:

      "constraints": [
        ["hostname", "UNIQUE"],
        ["hostname", "LIKE", "host1|host2|host3"]
      ]

  https://groups.google.com/forum/#!topic/marathon-framework/hfLUw3TIw2I

* You would have to use a workaround to deal with a dynamically sized cluster: set the number of instances to be greater than the expected cluster size.
  https://jira.mesosphere.com/browse/MARATHON-3791?focusedCommentId=79976&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-79976
  As the commenter notes, it's not ideal, it's just a workaround.

Susan

On Sat, Jan 20, 2018 at 8:33 AM, igor.berman <igor.ber...@gmail.com> wrote:

> Hi,
> wanted to get some advice regarding managing external shuffle service in mesos environments
>
> In spark documentation the Marathon is mentioned, however there is very limited documentation.
> I've tried to search for some documentation and it's seems not too difficult to configure it under Marathon(e.g. https://github.com/NBCUAS/dcos-spark-shuffle-service/blob/master/marathon/mesos-shuffle-service.json), however I see few problems:
>
> There is no clear way to deploy some application in mesos on every node see https://jira.mesosphere.com/browse/MARATHON-3791
> * it's not possible to guarantee on which nodes shuffle service application will be placed(it's possible to guarantee with mesos unique constrain that only 1 shuffle service instance will be placed on some node)
> * cluster that has dynamic nodes joining/leaving - the config of shuffle service must be adjusted(specifically number of instances config)
>
> So any production ops advices will be welcome
> Igor
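The two suggestions in the reply above (combine a UNIQUE hostname constraint with an optional LIKE pattern, and set the instance count above the expected cluster size) could be sketched as a small generator for a Marathon app definition. This is only an illustration under stated assumptions: the function name `shuffle_service_app`, the app id, the cpu/memory defaults, and the launch command are hypothetical and are not taken from the thread or from any linked example.

```python
import json


def shuffle_service_app(cmd, expected_nodes, headroom=2,
                        host_pattern=None, cpus=0.5, mem_mb=1024):
    """Build a Marathon app definition (as a dict) that aims for one
    shuffle service instance per node.

    cmd            -- command Marathon should run (assumption: supplied by the caller)
    expected_nodes -- expected number of agents in the cluster
    headroom       -- extra instances requested beyond expected_nodes, i.e. the
                      "more instances than nodes" workaround for clusters that grow
    host_pattern   -- optional regex for a LIKE constraint, e.g. "host1|host2|host3"
    """
    # UNIQUE on hostname: at most one instance per node.
    constraints = [["hostname", "UNIQUE"]]
    if host_pattern is not None:
        # LIKE on hostname: restrict placement to matching nodes.
        constraints.append(["hostname", "LIKE", host_pattern])

    return {
        "id": "/spark-shuffle-service",  # hypothetical app id
        "cmd": cmd,
        "cpus": cpus,    # resources that Mesos will account for
        "mem": mem_mb,
        "instances": expected_nodes + headroom,
        "constraints": constraints,
    }


if __name__ == "__main__":
    app = shuffle_service_app(
        # Assumed launch command; adjust to the Spark distribution actually in use.
        cmd="./bin/spark-class org.apache.spark.deploy.mesos.MesosExternalShuffleService",
        expected_nodes=3,
        host_pattern="host1|host2|host3",
    )
    print(json.dumps(app, indent=2))
```

With the UNIQUE constraint in place, the surplus instances are expected to stay unscheduled until additional nodes join, which is why the workaround is described as not ideal. Note also that a LIKE pattern listing fixed hostnames would not match newly added nodes unless the pattern is broad enough to cover them.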
external shuffle service in mesos
Hi,

I wanted to get some advice regarding managing the external shuffle service in Mesos environments.

The Spark documentation mentions Marathon, but that documentation is very limited. I've searched for more, and it doesn't seem too difficult to configure the service under Marathon (e.g. https://github.com/NBCUAS/dcos-spark-shuffle-service/blob/master/marathon/mesos-shuffle-service.json); however, I see a few problems.

There is no clear way to deploy an application in Mesos on every node, see https://jira.mesosphere.com/browse/MARATHON-3791
* It's not possible to guarantee on which nodes the shuffle service application will be placed (with the Mesos unique constraint it is only possible to guarantee that no more than 1 shuffle service instance will be placed on any given node).
* In a cluster with nodes dynamically joining/leaving, the config of the shuffle service must be adjusted (specifically the number-of-instances setting).

So any production ops advice will be welcome.

Igor