[jira] [Commented] (MESOS-9619) Mesos Master Crashes with Launch Group when using Port Resources

2019-03-03 Thread Meng Zhu (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783058#comment-16783058
 ] 

Meng Zhu commented on MESOS-9619:
-

[~nimi] The crash is triggered by a misconfiguration that Mesos should have 
rejected, so this is a Mesos validation bug.

The accept call 
[here|https://gist.github.com/nemosupremo/3b23c4e1ca0ab241376aa5b975993270] 
specified port 777 in both the executor and the task. This leads to duplicate 
allocations, which caused the allocator to crash. 
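
To make the failure condition concrete, here is a minimal sketch of the kind of check that would catch this request before it reaches the allocator. It is not the Mesos validation code: the function name {{overlapping_ports}} and the tuple representation are hypothetical, and it only assumes that port resources are inclusive (begin, end) ranges.

```python
from typing import Iterable, List, Tuple

# Inclusive (begin, end) port range; a hypothetical stand-in for a Mesos
# "ports" ranges resource.
PortRange = Tuple[int, int]


def overlapping_ports(executor_ports: Iterable[PortRange],
                      task_ports: Iterable[PortRange]) -> List[PortRange]:
    """Return every port range claimed by both the executor and the task."""
    task_ports = list(task_ports)
    overlaps = []
    for e_begin, e_end in executor_ports:
        for t_begin, t_end in task_ports:
            # Two inclusive ranges intersect when the larger begin does not
            # exceed the smaller end.
            begin, end = max(e_begin, t_begin), min(e_end, t_end)
            if begin <= end:
                overlaps.append((begin, end))
    return overlaps


# Port 777 is requested by both the executor and the task, as in the report.
print(overlapping_ports([(777, 777)], [(777, 777), (8080, 8081)]))
# prints [(777, 777)]
```

A non-empty result means the same port is being requested twice, so the call should be rejected as invalid instead of being passed on.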


> Mesos Master Crashes with Launch Group when using Port Resources
> 
>
> Key: MESOS-9619
> URL: https://issues.apache.org/jira/browse/MESOS-9619
> Project: Mesos
>  Issue Type: Bug
>  Components: allocation
>Affects Versions: 1.4.3, 1.7.1
> Environment:  
> Testing in both Mesos 1.4.3 and Mesos 1.7.1
>Reporter: Nimi Wariboko Jr.
>Assignee: Meng Zhu
>Priority: Blocker
>  Labels: allocator, master, mesosphere
> Attachments: mesos-master.log, mesos-master.snippet.log
>
>
> Original Issue: 
> [https://lists.apache.org/thread.html/979c8799d128ad0c436b53f2788568212f97ccf324933524f1b4d189@%3Cuser.mesos.apache.org%3E]
>  When the ports resource is removed, Mesos functions normally (I'm able to 
> launch the task as many times as possible, while it always fails continually).
> Attached is a snippet of the mesos master log from OFFER to crash.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (MESOS-9592) Mesos Websitebot is flaky

2019-03-03 Thread Benjamin Bannier (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-9592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Bannier reassigned MESOS-9592:
---

Assignee: Benjamin Bannier

> Mesos Websitebot is flaky
> -
>
> Key: MESOS-9592
> URL: https://issues.apache.org/jira/browse/MESOS-9592
> Project: Mesos
>  Issue Type: Bug
>  Components: project website
>Reporter: Vinod Kone
>Assignee: Benjamin Bannier
>Priority: Major
>  Labels: ci, integration
>
> The Mesos Websitebot Jenkins job sometimes fails during the endpoint 
> documentation generation phase. It looks like it is timing out on getting a 
> response from the /health endpoint of the master.
> Example failing build: 
> https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Websitebot/1899/
> {code}
> 01:20:30 make[2]: Leaving directory '/mesos/build/src'
> 01:20:30 make[1]: Leaving directory '/mesos/build/src'
> 01:20:30 /mesos
> 01:20:41 Timeout attempting to hit url: http://127.0.0.1:5050/health
> {code}
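
For reference, a minimal sketch of a more tolerant health probe: retry the endpoint a bounded number of times before giving up. This is not the Websitebot script. The function name {{wait_for_health}} and its parameters are hypothetical, and it assumes a healthy master answers /health with HTTP 200.

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_for_health(url, attempts=10, delay=1.0, timeout=2.0):
    """Poll `url` until it answers HTTP 200; return False if it never does."""
    # Bypass any configured HTTP proxy: the endpoint is expected to be local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    for attempt in range(attempts):
        try:
            with opener.open(url, timeout=timeout) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, http.client.HTTPException, OSError):
            pass  # Not reachable yet (refused, timed out, or HTTP error).
        if attempt < attempts - 1:
            time.sleep(delay)  # Give the master time to come up.
    return False
```

A caller would use something like {{wait_for_health("http://127.0.0.1:5050/health", attempts=30)}} and fail the job only when it returns False.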





[jira] [Assigned] (MESOS-9628) Consider running tox as part of test suite, not as part of style checking

2019-03-03 Thread Benjamin Bannier (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-9628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Bannier reassigned MESOS-9628:
---

Assignee: Benjamin Bannier

> Consider running tox as part of test suite, not as part of style checking
> -
>
> Key: MESOS-9628
> URL: https://issues.apache.org/jira/browse/MESOS-9628
> Project: Mesos
>  Issue Type: Improvement
>  Components: cli
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>Priority: Major
>  Labels: newbie++
>
> Currently {{tox}} is being run as part of {{support/mesos-style.py}}. This is 
> unusual, as {{tox}} is a tool for running tests, extracting code coverage 
> metrics, and similar tasks.
> We should consider running it as part of the test suite instead of as part 
> of {{mesos-style.py}}. This might also simplify some of the installation 
> challenges the current style checking setup has.





[jira] [Created] (MESOS-9630) Consider moving linter setup to pre-commit

2019-03-03 Thread Benjamin Bannier (JIRA)
Benjamin Bannier created MESOS-9630:
---

 Summary: Consider moving linter setup to pre-commit
 Key: MESOS-9630
 URL: https://issues.apache.org/jira/browse/MESOS-9630
 Project: Mesos
  Issue Type: Wish
Reporter: Benjamin Bannier
Assignee: Benjamin Bannier


Mesos currently uses a mix of hand-crafted git commit hooks and mesos-style to 
perform linting. While this has served us well, our current approach also has 
some drawbacks, e.g.,
* the linter setup is spread between hooks and {{support/mesos-style.py}}
* adding new linters can be cumbersome
* {{support/mesos-style.py}} creates a single virtualenv to install linters in, 
which is tied to the source tree
* linter dependencies are only cached to an extent, and it is easy to run into a 
situation where one needs to update linter dependencies over the network even 
though one has successfully linted a revision before
* {{support/mesos-style.py}} lacks a number of features, e.g., running over 
only staged files and running linters in parallel for improved throughput; in 
addition, the parameterization of the linters is strongly coupled to the 
implementation of the style checker itself.

The [pre-commit tool|https://pre-commit.com] solves most of these issues and 
using it in Mesos would not only allow us to get rid of tooling which is hard 
to maintain, but also unlock other features. It is licensed under an MIT 
license. We should consider moving our linting setup over to pre-commit.
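
To illustrate two of the features listed above as missing (linting only staged files, and running linters in parallel), here is a rough sketch. It is neither how pre-commit is implemented nor how {{support/mesos-style.py}} works; the helper names {{staged_files}} and {{run_linters}} and the shape of the {{linters}} mapping are hypothetical. The only external command used is {{git diff --cached --name-only --diff-filter=ACM}}.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def staged_files(repo="."):
    """List files staged for commit (added, copied, or modified)."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        cwd=repo, check=True, stdout=subprocess.PIPE, universal_newlines=True)
    return [line for line in result.stdout.splitlines() if line]


def run_linters(linters, files, max_workers=4):
    """Run every linter over `files` concurrently.

    `linters` maps a label to an argv prefix; the file names are appended.
    Returns {label: (exit_code, combined_output)}.
    """
    files = list(files)
    if not files:
        return {}  # Nothing staged, nothing to lint.

    def run(item):
        label, argv = item
        proc = subprocess.run(
            list(argv) + files, stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT, universal_newlines=True)
        return label, (proc.returncode, proc.stdout)

    # Linters are external processes, so threads are enough for parallelism.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(run, linters.items()))
```

A commit hook could call {{run_linters(linters, staged_files())}} and reject the commit if any exit code is non-zero. pre-commit provides this behavior out of the box, which is part of the motivation for adopting it.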





[jira] [Commented] (MESOS-9629) Pylint reports cyclic dependencies in cli_new

2019-03-03 Thread Benjamin Bannier (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782675#comment-16782675
 ] 

Benjamin Bannier commented on MESOS-9629:
-

Reviews:

https://reviews.apache.org/r/70090/
https://reviews.apache.org/r/70092/

> Pylint reports cyclic dependencies in cli_new
> -
>
> Key: MESOS-9629
> URL: https://issues.apache.org/jira/browse/MESOS-9629
> Project: Mesos
>  Issue Type: Bug
>  Components: cli
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>Priority: Minor
>
> When running {{support/mesos-style.py}} over files in {{src/python/cli_new}} 
> cyclic dependencies in {{cli}} are reported.
> {noformat}
> $ ./support/mesos-style.py `find src/python/cli_new -type f |grep -v \.tox -v |grep -v \.virtualenv`
> The "pip-requirements.txt" file has changed.
> Rebuilding virtualenv...
>  * Install prebuilt node (11.10.1) . done.
>  * Appending data to /Users/bbannier/src/mesos/support/.virtualenv/bin/activate
>  * Appending data to /Users/bbannier/src/mesos/support/.virtualenv/bin/activate.fish
> Checking 26 Python files
> * Module cli.plugins.task.main
> lib/cli/plugins/task/main.py:1:0: R0401: Cyclic import (cli -> cli.plugins -> cli.plugins.base) (cyclic-import)
> lib/cli/plugins/task/main.py:1:0: R0401: Cyclic import (cli -> cli.config) (cyclic-import)
> lib/cli/plugins/task/main.py:1:0: R0401: Cyclic import (cli.tests -> cli.tests.task) (cyclic-import)
> lib/cli/plugins/task/main.py:1:0: R0401: Cyclic import (cli.tests -> cli.tests.agent) (cyclic-import)
> lib/cli/plugins/task/main.py:1:0: R0401: Cyclic import (cli.tests -> cli.tests.tests) (cyclic-import)
> {noformat}
> The exact module at which {{pylint}} diagnoses this was not deterministic for me.
> I was not able to trigger this failure when passing only a single file.





[jira] [Assigned] (MESOS-9629) Pylint reports cyclic dependencies in cli_new

2019-03-03 Thread Benjamin Bannier (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Bannier reassigned MESOS-9629:
---

Assignee: Benjamin Bannier

> Pylint reports cyclic dependencies in cli_new
> -
>
> Key: MESOS-9629
> URL: https://issues.apache.org/jira/browse/MESOS-9629
> Project: Mesos
>  Issue Type: Bug
>  Components: cli
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>Priority: Minor
>
> When running {{support/mesos-style.py}} over files in {{src/python/cli_new}} 
> cyclic dependencies in {{cli}} are reported.
> {noformat}
> $ ./support/mesos-style.py `find src/python/cli_new -type f |grep -v \.tox -v |grep -v \.virtualenv`
> The "pip-requirements.txt" file has changed.
> Rebuilding virtualenv...
>  * Install prebuilt node (11.10.1) . done.
>  * Appending data to /Users/bbannier/src/mesos/support/.virtualenv/bin/activate
>  * Appending data to /Users/bbannier/src/mesos/support/.virtualenv/bin/activate.fish
> Checking 26 Python files
> * Module cli.plugins.task.main
> lib/cli/plugins/task/main.py:1:0: R0401: Cyclic import (cli -> cli.plugins -> cli.plugins.base) (cyclic-import)
> lib/cli/plugins/task/main.py:1:0: R0401: Cyclic import (cli -> cli.config) (cyclic-import)
> lib/cli/plugins/task/main.py:1:0: R0401: Cyclic import (cli.tests -> cli.tests.task) (cyclic-import)
> lib/cli/plugins/task/main.py:1:0: R0401: Cyclic import (cli.tests -> cli.tests.agent) (cyclic-import)
> lib/cli/plugins/task/main.py:1:0: R0401: Cyclic import (cli.tests -> cli.tests.tests) (cyclic-import)
> {noformat}
> The exact module at which {{pylint}} diagnoses this was not deterministic for me.
> I was not able to trigger this failure when passing only a single file.





[jira] [Comment Edited] (MESOS-9269) Mesos UCR with Docker only Works on Host

2019-03-03 Thread Nimi Wariboko Jr. (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782629#comment-16782629
 ] 

Nimi Wariboko Jr. edited comment on MESOS-9269 at 3/3/19 9:47 AM:
--

I recently ran into this issue as well - I could only access the mesos 
container image from the host but not remotely. I noticed in my case that the 
issue went away when I didn't have docker installed on the agent. If I 
installed the agent without installing docker (and disabling the docker 
containerizer), everything worked as expected.

I also imagine this is why the DC/OS install worked fine - maybe the DC/OS 
agents do not have docker installed.

Looking into this further, I believe I was hitting something related to:

https://docs.docker.com/v17.09/engine/userguide/networking/default_network/container-communication/#container-communication-between-hosts

Running `sudo iptables -P FORWARD ACCEPT` solves the issue for me.
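
To confirm the diagnosis before changing anything, one can read the chain's default policy first. The sketch below parses the output of `iptables -S FORWARD`, whose first line for a built-in chain has the form `-P FORWARD <policy>`. The helper names are hypothetical, and the sample input assumes the policy was DROP on the affected agent, which I have not verified here.

```python
import subprocess


def forward_policy(rules_text):
    """Extract the default policy from `iptables -S FORWARD` output."""
    for line in rules_text.splitlines():
        parts = line.split()
        # The policy line looks like: -P FORWARD DROP
        if len(parts) >= 3 and parts[:2] == ["-P", "FORWARD"]:
            return parts[2]
    return None


def read_forward_rules():
    """Return the FORWARD chain rules; needs root, so it is not called here."""
    result = subprocess.run(
        ["iptables", "-S", "FORWARD"], check=True,
        stdout=subprocess.PIPE, universal_newlines=True)
    return result.stdout


# Illustrative sample only, not output captured from the affected agent.
sample = "-P FORWARD DROP\n-A FORWARD -j DOCKER-USER\n"
print(forward_policy(sample))  # prints DROP
```

If the policy is DROP, forwarded traffic that no rule explicitly accepts is discarded, which would match the symptom of the container being reachable only from the host.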


was (Author: nimi):
I recently ran into this issue as well - I could only access the mesos 
container image from the host but not remotely. I noticed in my case that the 
issue went away when I didn't have docker installed on the agent. If I 
installed the agent without installing docker (and disabling the docker 
containerizer), everything worked as expected.

I also imagine this is why the DC/OS install worked fine - maybe the DC/OS 
agents do not have docker installed.

> Mesos UCR with Docker only Works on Host
> 
>
> Key: MESOS-9269
> URL: https://issues.apache.org/jira/browse/MESOS-9269
> Project: Mesos
>  Issue Type: Bug
>  Components: agent, docker
>Affects Versions: 1.7.0
> Environment: Ubuntu 16.04
> Mesos 1.7.0
> Marathon 1.7.111
>Reporter: z s
>Priority: Major
>
> I'm having an issue setting up the `mesos-cni-port-mapper` to allow remote 
> connectivity.
> When I `curl :` from the machine I get a response, but from a 
> remote machine the `curl` connection times out. I'm not sure what's wrong with 
> my route settings.
>  
> */var/lib/mesos/cni/config/mesos-bridge.json*
>  
> {code:java}
> {
>   "name": "mesos-bridge",
>   "type": "mesos-cni-port-mapper",
>   "excludeDevices": ["mesos-cni0"],
>   "chain": "MESOS-BRIDGE-PORT-MAPPER",
>   "delegate": {
>     "type": "bridge",
>     "bridge": "mesos-cni0",
>     "isGateway": true,
>     "ipMasq": true,
>     "ipam": {
>       "type": "host-local",
>       "subnet": "10.1.0.0/16",
>       "routes": [
>         { "dst": "0.0.0.0/0" }
>       ]
>     }
>   }
> }
> {code}
>  
> {code:java}
> $ route -n
> Kernel IP routing table
> Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
> 0.0.0.0         172.27.1.1      0.0.0.0         UG    0      0        0 ens3
> 10.1.0.0        0.0.0.0         255.255.0.0     U     0      0        0 mesos-cni0
> 172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
> 172.27.1.0      0.0.0.0         255.255.255.0   U     0      0        0 ens3
> {code}
> Any suggestions?
>  
>  





[jira] [Commented] (MESOS-9269) Mesos UCR with Docker only Works on Host

2019-03-03 Thread Nimi Wariboko Jr. (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782629#comment-16782629
 ] 

Nimi Wariboko Jr. commented on MESOS-9269:
--

I recently ran into this issue as well - I could only access the mesos 
container image from the host but not remotely. I noticed in my case that the 
issue went away when I didn't have docker installed on the agent. If I 
installed the agent without installing docker (and disabling the docker 
containerizer), everything worked as expected.

I also imagine this is why the DC/OS install worked fine - maybe the DC/OS 
agents do not have docker installed.

> Mesos UCR with Docker only Works on Host
> 
>
> Key: MESOS-9269
> URL: https://issues.apache.org/jira/browse/MESOS-9269
> Project: Mesos
>  Issue Type: Bug
>  Components: agent, docker
>Affects Versions: 1.7.0
> Environment: Ubuntu 16.04
> Mesos 1.7.0
> Marathon 1.7.111
>Reporter: z s
>Priority: Major
>
> I'm having an issue setting up the `mesos-cni-port-mapper` to allow remote 
> connectivity.
> When I `curl :` from the machine I get a response, but from a 
> remote machine the `curl` connection times out. I'm not sure what's wrong with 
> my route settings.
>  
> */var/lib/mesos/cni/config/mesos-bridge.json*
>  
> {code:java}
> {
>   "name": "mesos-bridge",
>   "type": "mesos-cni-port-mapper",
>   "excludeDevices": ["mesos-cni0"],
>   "chain": "MESOS-BRIDGE-PORT-MAPPER",
>   "delegate": {
>     "type": "bridge",
>     "bridge": "mesos-cni0",
>     "isGateway": true,
>     "ipMasq": true,
>     "ipam": {
>       "type": "host-local",
>       "subnet": "10.1.0.0/16",
>       "routes": [
>         { "dst": "0.0.0.0/0" }
>       ]
>     }
>   }
> }
> {code}
>  
> {code:java}
> $ route -n
> Kernel IP routing table
> Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
> 0.0.0.0         172.27.1.1      0.0.0.0         UG    0      0        0 ens3
> 10.1.0.0        0.0.0.0         255.255.0.0     U     0      0        0 mesos-cni0
> 172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
> 172.27.1.0      0.0.0.0         255.255.255.0   U     0      0        0 ens3
> {code}
> Any suggestions?
>  
>  


