[jira] [Updated] (MESOS-3035) As a Developer I would like a standard way to run a Subprocess in libprocess
[ https://issues.apache.org/jira/browse/MESOS-3035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marco Massenzio updated MESOS-3035:
---
    Assignee:     (was: Marco Massenzio)

> As a Developer I would like a standard way to run a Subprocess in libprocess
>
>
> Key: MESOS-3035
> URL: https://issues.apache.org/jira/browse/MESOS-3035
> Project: Mesos
> Issue Type: Story
> Components: libprocess
> Reporter: Marco Massenzio
> Labels: mesosphere, tech-debt
>
> As part of MESOS-2830 and MESOS-2902 I have been researching the ability to
> run a {{Subprocess}} and capture the {{stdout / stderr}} along with the exit
> status code.
> {{process::subprocess()}} offers much of the functionality, but in a way that
> still requires a lot of handiwork on the developer's part; we would like to
> further abstract away the ability to just pass a string, an optional set of
> command-line arguments and then collect the output of the command (bonus:
> without blocking).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
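To make the request in MESOS-3035 concrete, here is a minimal sketch of the kind of helper being asked for: pass a command string and optional arguments, get back the captured output and the exit status. It is not the libprocess API and does not use {{process::subprocess()}}; it relies only on POSIX popen()/pclose(). The names CommandResult, shellQuote and runCommand are hypothetical, and this version blocks until the command finishes, so it does not cover the "without blocking" bonus.

```cpp
#include <cassert>
#include <cstdio>
#include <stdexcept>
#include <string>
#include <vector>
#include <sys/wait.h>

// Hypothetical result type: captured output plus the exit status.
struct CommandResult {
  std::string output;  // stdout, with stderr merged in via "2>&1"
  int exitCode;        // exit status, or -1 if the command did not exit normally
};

// Wrap an argument in single quotes so the POSIX shell passes it through
// unchanged; embedded single quotes are escaped as '\''.
std::string shellQuote(const std::string& arg) {
  std::string quoted = "'";
  for (char c : arg) {
    if (c == '\'') {
      quoted += "'\\''";
    } else {
      quoted += c;
    }
  }
  quoted += "'";
  return quoted;
}

// Run `command` with `args`, blocking until it exits, and collect its output.
CommandResult runCommand(const std::string& command,
                         const std::vector<std::string>& args = {}) {
  assert(!command.empty());

  std::string line = shellQuote(command);
  for (const std::string& arg : args) {
    line += " " + shellQuote(arg);
  }
  line += " 2>&1";  // popen() only captures stdout, so merge stderr into it

  FILE* pipe = popen(line.c_str(), "r");
  if (pipe == nullptr) {
    throw std::runtime_error("popen failed");
  }

  CommandResult result{"", -1};
  char buffer[4096];
  size_t n;
  while ((n = fread(buffer, 1, sizeof(buffer), pipe)) > 0) {
    result.output.append(buffer, n);
  }

  // pclose() returns the wait status of the shell that ran the command.
  int status = pclose(pipe);
  if (status != -1 && WIFEXITED(status)) {
    result.exitCode = WEXITSTATUS(status);
  }
  return result;
}
```

A caller would write something like `CommandResult r = runCommand("echo", {"hello"});` and then inspect `r.output` and `r.exitCode`. Merging stderr into stdout keeps the sketch short; keeping the two streams separate would need explicit pipes and fork/exec rather than popen().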
[jira] [Commented] (MESOS-3350) Create a protobuf VersionInfo to store mesos version information
[ https://issues.apache.org/jira/browse/MESOS-3350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15072378#comment-15072378 ]

Marco Massenzio commented on MESOS-3350:

[~vinodkone], [~bmahler]: Do you guys think this is still useful?
If yes, happy to implement it - please let me know what you think (if not, I'll just close it).
Thanks.

> Create a protobuf VersionInfo to store mesos version information
>
>
> Key: MESOS-3350
> URL: https://issues.apache.org/jira/browse/MESOS-3350
> Project: Mesos
> Issue Type: Improvement
> Reporter: haosdent
> Assignee: Marco Massenzio
> Labels: tech-debt
>
> Currently we use a string to store the Mesos version in protobuf. In
> [MESOS-1841-reviews|https://reviews.apache.org/r/37024/], [~marco-mesos]
> suggested that it would be better to create a protobuf message named
> VersionInfo, like:
> {code}
> message VersionInfo {
>   optional string git_sha = 1;
>   optional string build_user = 2;
>   // ... (further fields not specified)
> }
> {code}
> We could then use this message everywhere (expose the information through an
> HTTP endpoint, replace the version string in MasterInfo).
[jira] [Commented] (MESOS-2786) Let mesos to be build and run on arm64 servers
[ https://issues.apache.org/jira/browse/MESOS-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15072391#comment-15072391 ]

Yinwen Wang commented on MESOS-2786:

Hi Robin Dong,
I have used your patch on the mesos-0.25.0 release. With the patch the Mesos source code compiles, and I can successfully start the Mesos master/slave and Marathon. But when I use the Mesos native container to start a cmd task, the slave's log shows the error below:

E1226 16:55:12.973712 7920 slave.cpp:3342] Container '738c841a-5d1a-41a2-858d-10f999c32378' for executor 'zxc.6006dda8-abae-11e5-bb25-7a3f6cf980b9' of framework 'd502f28e-9630-4aed-b7c2-ae33bc916ade-0004' failed to start: Failed to fork executor: Failed to clone child process: Failed to clone: Invalid argument

The error message indicates that the failure happened when cloning a process. I have checked the source code, and I could not find anything wrong with the input parameters passed to the Linux kernel API clone(). This problem has confused me for a long time!
Did you encounter similar problems when starting a cmd task on Mesos in an arm64 environment? Or do you have any ideas?

> Let mesos to be build and run on arm64 servers
> --
>
> Key: MESOS-2786
> URL: https://issues.apache.org/jira/browse/MESOS-2786
> Project: Mesos
> Issue Type: Improvement
> Components: general
> Reporter: RobinDong
> Assignee: RobinDong
>
> Mesos uses many third-party software packages such as protobuf and zookeeper;
> because they can't run in an arm64 environment, we can't run Mesos either.
[jira] [Updated] (MESOS-4246) mesos support container application HA
[ https://issues.apache.org/jira/browse/MESOS-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guangya Liu updated MESOS-4246:
---
    Summary: mesos support container application HA  (was: mesos support HA )

> mesos support container application HA
> ---
>
> Key: MESOS-4246
> URL: https://issues.apache.org/jira/browse/MESOS-4246
> Project: Mesos
> Issue Type: Story
> Components: docker
> Affects Versions: 0.25.0
> Environment: we have set up one Mesos cluster with one Master Node and
> several Slave Nodes.
> Reporter: wangqun
> Priority: Critical
> Fix For: 0.25.0
>
> Original Estimate: 12h
> Remaining Estimate: 12h
>
> Right now, we have set up one Mesos cluster with one Master Node and several
> Slave Nodes.
> We found that Mesos does not seem to support rescheduling apps away from a
> failed slave node. For example, a simple use case:
> 1. I have several containers running on one Slave Node.
> 2. The slave node goes down for some reason.
> How can we keep using those containers? Could those containers be rescheduled
> to other Slave Nodes?
> From our test, it seems Mesos does not support this feature, which means users
> cannot use those containers anymore.
> Could any Mesos developers confirm that?
[jira] [Commented] (MESOS-4246) mesos support container application HA
[ https://issues.apache.org/jira/browse/MESOS-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15072452#comment-15072452 ]

Gaojin CAO commented on MESOS-4246:
---
[~Kennan] yes, just as [~gyliu] said. Here is how k8s handles task failure:
1. The k8s scheduler receives the task failure message and updates the task/pod status in the registry: https://github.com/kubernetes/kubernetes/blob/master/contrib%2Fmesos%2Fpkg%2Fscheduler%2Fcomponents%2Fframework%2Fframework.go#L458
2. The replication controller keeps an eye on all tasks/pods and creates a new pod to replace the failed one.

> mesos support container application HA
> ---
>
> Key: MESOS-4246
> URL: https://issues.apache.org/jira/browse/MESOS-4246
> Project: Mesos
> Issue Type: Story
> Components: docker
> Affects Versions: 0.25.0
> Environment: we have set up one Mesos cluster with one Master Node and
> several Slave Nodes.
> Reporter: wangqun
> Priority: Critical
> Fix For: 0.25.0
>
> Original Estimate: 12h
> Remaining Estimate: 12h
>
> Right now, we have set up one Mesos cluster with one Master Node and several
> Slave Nodes.
> We found that Mesos does not seem to support rescheduling apps away from a
> failed slave node. For example, a simple use case:
> 1. I have several containers running on one Slave Node.
> 2. The slave node goes down for some reason.
> How can we keep using those containers? Could those containers be rescheduled
> to other Slave Nodes?
> From our test, it seems Mesos does not support this feature, which means users
> cannot use those containers anymore.
> Could any Mesos developers confirm that?
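The two steps described in the comment above (record the failure, then let a controller restore the desired number of replicas) can be sketched as a small in-memory model. This is not Kubernetes or Marathon code and makes no calls to the Mesos scheduler API; the class ReplicaReconciler, its methods and the node names are all hypothetical, and replacements are simply placed round-robin on whatever nodes the caller reports as healthy.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

enum class TaskState { Running, Failed };

struct Task {
  std::string node;  // slave node the task was placed on
  TaskState state;
};

// Hypothetical framework-side controller that keeps `desired` tasks running.
class ReplicaReconciler {
 public:
  explicit ReplicaReconciler(std::size_t desired) : desired_(desired) {}

  // Start one task on the given node.
  void launch(const std::string& node) {
    assert(!node.empty());
    tasks_[nextId_++] = Task{node, TaskState::Running};
  }

  // Step 1: a node was lost, so record every task on it as failed.
  void markNodeFailed(const std::string& node) {
    for (auto& entry : tasks_) {
      if (entry.second.node == node) {
        entry.second.state = TaskState::Failed;
      }
    }
  }

  // Step 2: drop failed tasks and start replacements on healthy nodes
  // (round-robin) until the desired count is reached again.
  // Returns the number of replacement tasks started.
  std::size_t reconcile(const std::vector<std::string>& healthyNodes) {
    for (auto it = tasks_.begin(); it != tasks_.end();) {
      if (it->second.state == TaskState::Failed) {
        it = tasks_.erase(it);
      } else {
        ++it;
      }
    }
    std::size_t started = 0;
    while (tasks_.size() < desired_ && !healthyNodes.empty()) {
      launch(healthyNodes[started % healthyNodes.size()]);
      ++started;
    }
    return started;
  }

  // Number of running tasks, optionally restricted to one node.
  std::size_t running(const std::string& node = "") const {
    std::size_t count = 0;
    for (const auto& entry : tasks_) {
      if (entry.second.state == TaskState::Running &&
          (node.empty() || entry.second.node == node)) {
        ++count;
      }
    }
    return count;
  }

 private:
  std::size_t desired_;
  int nextId_ = 0;
  std::map<int, Task> tasks_;
};
```

The point of the sketch is that the rescheduling decision lives in the framework: Mesos reports that the tasks were lost, and the framework decides whether and where to start replacements.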
[jira] [Issue Comment Deleted] (MESOS-4246) mesos support container application HA
[ https://issues.apache.org/jira/browse/MESOS-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gaojin CAO updated MESOS-4246:
--
    Comment: was deleted

(was: [~Kennan] yes, just as [~gyliu] said. Here is how k8s handles task failure:
1. The k8s scheduler receives the task failure message and updates the task/pod status in the registry: https://github.com/kubernetes/kubernetes/blob/master/contrib%2Fmesos%2Fpkg%2Fscheduler%2Fcomponents%2Fframework%2Fframework.go#L458
2. The replication controller keeps an eye on all tasks/pods and creates a new pod to replace the failed one.)

> mesos support container application HA
> ---
>
> Key: MESOS-4246
> URL: https://issues.apache.org/jira/browse/MESOS-4246
> Project: Mesos
> Issue Type: Story
> Components: docker
> Affects Versions: 0.25.0
> Environment: we have set up one Mesos cluster with one Master Node and
> several Slave Nodes.
> Reporter: wangqun
> Priority: Critical
> Fix For: 0.25.0
>
> Original Estimate: 12h
> Remaining Estimate: 12h
>
> Right now, we have set up one Mesos cluster with one Master Node and several
> Slave Nodes.
> We found that Mesos does not seem to support rescheduling apps away from a
> failed slave node. For example, a simple use case:
> 1. I have several containers running on one Slave Node.
> 2. The slave node goes down for some reason.
> How can we keep using those containers? Could those containers be rescheduled
> to other Slave Nodes?
> From our test, it seems Mesos does not support this feature, which means users
> cannot use those containers anymore.
> Could any Mesos developers confirm that?
[jira] [Commented] (MESOS-4246) mesos support HA
[ https://issues.apache.org/jira/browse/MESOS-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15072399#comment-15072399 ]

wangqun commented on MESOS-4246:

@Gaojin CAO, @Klaus Ma
Yes, it is the framework's responsibility. But I don't see the framework rescheduling the tasks.
If the framework is working normally, can the containers be rescheduled to another Slave Node?

> mesos support HA
> ---
>
> Key: MESOS-4246
> URL: https://issues.apache.org/jira/browse/MESOS-4246
> Project: Mesos
> Issue Type: Story
> Components: docker
> Affects Versions: 0.25.0
> Environment: we have set up one Mesos cluster with one Master Node and
> several Slave Nodes.
> Reporter: wangqun
> Priority: Critical
> Fix For: 0.25.0
>
> Original Estimate: 12h
> Remaining Estimate: 12h
>
> Right now, we have set up one Mesos cluster with one Master Node and several
> Slave Nodes.
> We found that Mesos does not seem to support rescheduling apps away from a
> failed slave node. For example, a simple use case:
> 1. I have several containers running on one Slave Node.
> 2. The slave node goes down for some reason.
> How can we keep using those containers? Could those containers be rescheduled
> to other Slave Nodes?
> From our test, it seems Mesos does not support this feature, which means users
> cannot use those containers anymore.
> Could any Mesos developers confirm that?
[jira] [Commented] (MESOS-4246) mesos support HA
[ https://issues.apache.org/jira/browse/MESOS-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15072417#comment-15072417 ]

Klaus Ma commented on MESOS-4246:
-
[~HackToday], AFAIK, Marathon, K8S and Swarm handle this case in their latest versions.

> mesos support HA
> ---
>
> Key: MESOS-4246
> URL: https://issues.apache.org/jira/browse/MESOS-4246
> Project: Mesos
> Issue Type: Story
> Components: docker
> Affects Versions: 0.25.0
> Environment: we have set up one Mesos cluster with one Master Node and
> several Slave Nodes.
> Reporter: wangqun
> Priority: Critical
> Fix For: 0.25.0
>
> Original Estimate: 12h
> Remaining Estimate: 12h
>
> Right now, we have set up one Mesos cluster with one Master Node and several
> Slave Nodes.
> We found that Mesos does not seem to support rescheduling apps away from a
> failed slave node. For example, a simple use case:
> 1. I have several containers running on one Slave Node.
> 2. The slave node goes down for some reason.
> How can we keep using those containers? Could those containers be rescheduled
> to other Slave Nodes?
> From our test, it seems Mesos does not support this feature, which means users
> cannot use those containers anymore.
> Could any Mesos developers confirm that?
[jira] [Commented] (MESOS-4246) mesos support HA
[ https://issues.apache.org/jira/browse/MESOS-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15072400#comment-15072400 ]

Kennan commented on MESOS-4246:
---
[~zerobleed] I think Mesos has two layers of HA:
HA for master nodes
App HA for slave nodes. This is the important case: if a slave node goes down, users need those apps to keep working.
Does Marathon or another framework handle that? As you said, the framework does that.

> mesos support HA
> ---
>
> Key: MESOS-4246
> URL: https://issues.apache.org/jira/browse/MESOS-4246
> Project: Mesos
> Issue Type: Story
> Components: docker
> Affects Versions: 0.25.0
> Environment: we have set up one Mesos cluster with one Master Node and
> several Slave Nodes.
> Reporter: wangqun
> Priority: Critical
> Fix For: 0.25.0
>
> Original Estimate: 12h
> Remaining Estimate: 12h
>
> Right now, we have set up one Mesos cluster with one Master Node and several
> Slave Nodes.
> We found that Mesos does not seem to support rescheduling apps away from a
> failed slave node. For example, a simple use case:
> 1. I have several containers running on one Slave Node.
> 2. The slave node goes down for some reason.
> How can we keep using those containers? Could those containers be rescheduled
> to other Slave Nodes?
> From our test, it seems Mesos does not support this feature, which means users
> cannot use those containers anymore.
> Could any Mesos developers confirm that?
[jira] [Created] (MESOS-4253) Provide a minimalist "runtime context" to an Anonymous Module
Marco Massenzio created MESOS-4253:
--
Summary: Provide a minimalist "runtime context" to an Anonymous Module
Key: MESOS-4253
URL: https://issues.apache.org/jira/browse/MESOS-4253
Project: Mesos
Issue Type: Improvement
Components: modules
Reporter: Marco Massenzio
Assignee: Marco Massenzio

Currently, {{Anonymous}} modules only receive at creation a copy of the {{"parameters"}} passed in the JSON configuration file.

However, at runtime, it would be useful to also have a "runtime context" for the module developer to use when implementing the functionality.

I would suggest passing in the {{Flags}} object from the Master/Agent via a {{setRuntimeContext(const Flags&)}}[0] method, called immediately after {{create(const Parameters&)}}[1].

Also, I would suggest adding a {{teardown()}} method, in case the module needs to release resources / conduct cleanup before exiting (there is a TODO in the code to this effect, and adding this in this patch would be close to trivial).

[0] In practice, it won't be this trivial, as Master/Agent {{Flags}} are of a different compile-time type - probably use something like variadic templates (suggestions appreciated!).
[1] In fact, the ideal solution would be to add the {{const Flags&}} to {{create()}}, but that would, alas, break everyone's modules; so that's probably a no-go (ideas welcome here too).
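A minimal sketch of the lifecycle proposed in MESOS-4253 (create, then {{setRuntimeContext()}}, then {{teardown()}} before exit) could look like the following. It does not use the Mesos module headers: Parameters and RuntimeContext are plain string maps standing in for the real types, which is only one hypothetical way around the differing Master/Agent {{Flags}} types noted in [0]. AnonymousModule, LoggingModule, runLifecycle and the "work_dir" key are illustrative names, not existing API.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Stand-ins (assumptions): the real Parameters and Flags types differ.
using Parameters = std::map<std::string, std::string>;
using RuntimeContext = std::map<std::string, std::string>;

// Hypothetical base class. Both new hooks default to no-ops, so modules
// written against the create()-only contract would not need to change.
class AnonymousModule {
 public:
  virtual ~AnonymousModule() = default;
  virtual void setRuntimeContext(const RuntimeContext&) {}
  virtual void teardown() {}
};

// Example module that records the order in which its hooks are called.
class LoggingModule : public AnonymousModule {
 public:
  explicit LoggingModule(const Parameters& parameters)
      : parameters_(parameters) {
    events.push_back("create");
  }

  void setRuntimeContext(const RuntimeContext& context) override {
    auto it = context.find("work_dir");
    workDir = (it != context.end()) ? it->second : "";
    events.push_back("setRuntimeContext");
  }

  void teardown() override { events.push_back("teardown"); }

  std::vector<std::string> events;
  std::string workDir;

 private:
  Parameters parameters_;
};

// Hypothetical host-side driver: create the module, hand it the runtime
// context immediately afterwards, and call teardown() before exiting.
std::vector<std::string> runLifecycle(const Parameters& parameters,
                                      const RuntimeContext& flags) {
  LoggingModule module(parameters);
  module.setRuntimeContext(flags);
  module.teardown();
  assert(module.events.size() == 3);
  return module.events;
}
```

Flattening the flags into strings avoids templating the interface on the Master/Agent flag types, at the cost of type safety; whether that trade-off is acceptable is exactly the open question raised in [0].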
[jira] [Commented] (MESOS-4253) Provide a minimalist "runtime context" to an Anonymous Module
[ https://issues.apache.org/jira/browse/MESOS-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15072372#comment-15072372 ]

Marco Massenzio commented on MESOS-4253:

[~karya] - would you mind terribly shepherding this one, please?

> Provide a minimalist "runtime context" to an Anonymous Module
> -
>
> Key: MESOS-4253
> URL: https://issues.apache.org/jira/browse/MESOS-4253
> Project: Mesos
> Issue Type: Improvement
> Components: modules
> Reporter: Marco Massenzio
> Assignee: Marco Massenzio
>
> Currently, {{Anonymous}} modules only receive at creation a copy of the
> {{"parameters"}} passed in the JSON configuration file.
> However, at runtime, it would be useful to also have a "runtime context" for
> the module developer to use when implementing the functionality.
> I would suggest passing in the {{Flags}} object from the Master/Agent via
> a {{setRuntimeContext(const Flags&)}}[0] method, called immediately
> after {{create(const Parameters&)}}[1].
> Also, I would suggest adding a {{teardown()}} method, in case the module
> needs to release resources / conduct cleanup before exiting (there is a TODO
> in the code to this effect, and adding this in this patch would be close to
> trivial).
> [0] In practice, it won't be this trivial, as Master/Agent {{Flags}} are of a
> different compile-time type - probably use something like variadic templates
> (suggestions appreciated!).
> [1] In fact, the ideal solution would be to add the {{const Flags&}} to
> {{create()}}, but that would, alas, break everyone's modules; so that's
> probably a no-go (ideas welcome here too).
[jira] [Commented] (MESOS-1104) Move linux/fs.hpp out of `mesos` namespace in linux/fs.h
[ https://issues.apache.org/jira/browse/MESOS-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15072487#comment-15072487 ]

Abhishek Dasgupta commented on MESOS-1104:
--
Can you please elaborate on this issue? There is no file called linux/fs.h in Mesos 0.26. Is the issue resolved?

> Move linux/fs.hpp out of `mesos` namespace in linux/fs.h
>
>
> Key: MESOS-1104
> URL: https://issues.apache.org/jira/browse/MESOS-1104
> Project: Mesos
> Issue Type: Improvement
> Reporter: Archana kumari
> Labels: mesosphere, newbie
>