[jira] [Assigned] (MESOS-3078) Recovered resources are not re-allocated until the next allocation delay.
    [ https://issues.apache.org/jira/browse/MESOS-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guangya Liu reassigned MESOS-3078:
----------------------------------

    Assignee: Guangya Liu

> Recovered resources are not re-allocated until the next allocation delay.
> --------------------------------------------------------------------------
>
>     Key: MESOS-3078
>     URL: https://issues.apache.org/jira/browse/MESOS-3078
>     Project: Mesos
>     Issue Type: Improvement
>     Components: allocation
>     Reporter: Benjamin Mahler
>     Assignee: Guangya Liu
>
> Currently, when resources are recovered, we do not perform an allocation for that slave. Rather, we wait until the next allocation interval.
> For small-task, high-throughput frameworks, this can have a significant impact on overall throughput; see the following thread:
> http://markmail.org/thread/y6mzfwzlurv6nik3
> We should consider immediately performing a re-allocation for the slave upon resource recovery.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (MESOS-3078) Recovered resources are not re-allocated until the next allocation delay.
    [ https://issues.apache.org/jira/browse/MESOS-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15432190#comment-15432190 ]

Guangya Liu commented on MESOS-3078:
------------------------------------

The review posted by [~jjanco] at https://reviews.apache.org/r/51027/ can help with this; we can use similar logic in {{addSlave}} to handle it:

{code}
allocationCandidates.insert(slaveId);

if (!allocationPending) {
  allocationPending = true;
  dispatch(self(), ::allocate);
}
{code}
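The batching idea in that snippet can be sketched as a self-contained model. This is an illustration only, not the actual allocator API: there is no libprocess here, so {{dispatch(self(), ::allocate)}} is modeled as pushing a deferred call onto a queue, and the names {{ToyAllocator}}, {{recoverResources}}, and {{runDispatched}} are hypothetical.

```cpp
#include <functional>
#include <queue>
#include <set>
#include <string>

// Toy model of the event-coalescing pattern: each resource-recovery event
// only records the agent as an allocation candidate and schedules at most
// one allocation pass, so a burst of recoveries collapses into one pass.
class ToyAllocator {
public:
  void recoverResources(const std::string& slaveId) {
    allocationCandidates.insert(slaveId);

    if (!allocationPending) {
      allocationPending = true;
      // Stands in for dispatch(self(), ::allocate) in the real code.
      pending.push([this]() { allocate(); });
    }
  }

  // Drains the "dispatch queue", like the libprocess event loop would.
  void runDispatched() {
    while (!pending.empty()) {
      pending.front()();
      pending.pop();
    }
  }

  int allocationsPerformed = 0;

private:
  void allocate() {
    allocationPending = false;
    allocationsPerformed++;
    allocationCandidates.clear();  // One pass covers all queued candidates.
  }

  std::set<std::string> allocationCandidates;
  bool allocationPending = false;
  std::queue<std::function<void()>> pending;
};
```

The point of the {{allocationPending}} flag is that many recovery events arriving before the next dispatch runs still trigger only a single allocation pass.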
[jira] [Updated] (MESOS-6045) Implement LAUNCH_GROUP operation in master.
    [ https://issues.apache.org/jira/browse/MESOS-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Mahler updated MESOS-6045:
-----------------------------------

    Sprint: Mesosphere Sprint 41

> Implement LAUNCH_GROUP operation in master.
> -------------------------------------------
>
>     Key: MESOS-6045
>     URL: https://issues.apache.org/jira/browse/MESOS-6045
>     Project: Mesos
>     Issue Type: Task
>     Components: master
>     Reporter: Benjamin Mahler
>     Assignee: Benjamin Mahler
>
> The master needs to handle the new {{LAUNCH_GROUP}} operation. This is a bit different from the {{LAUNCH}} operation in that we need to ensure that we do not deliver the task group if any of the tasks fail authorization, are invalid, or are killed while authorization is in progress.
> The entire task group must be delivered in a single message to the agent.
[jira] [Updated] (MESOS-6071) Validate that an explicitly specified DEFAULT executor has disk resources.
    [ https://issues.apache.org/jira/browse/MESOS-6071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Mahler updated MESOS-6071:
-----------------------------------

    Description:
When the framework is explicitly specifying the DEFAULT executor (currently only supported for task groups), we should consider validating that it contains disk resources. Currently, we validate that explicitly specified (DEFAULT or CUSTOM) executors only contain cpus and mem.

We should also consider supporting the omission of DEFAULT executor resources and injecting a default amount of resources. However, the difficulty here is that the framework must know about these amounts since they need to be available in the offer. We could expose these to the framework during framework registration.

  was:
When the framework is explicitly specifying the DEFAULT executor (currently only supported for task groups), we should consider validating that it contains disk resources. Currently, we validate that executors only contain cpus and mem.

We should also consider supporting the omission of DEFAULT executor resources and injecting a default amount of resources. However, the difficulty here is that the framework must know about these amounts since they need to be available in the offer. We could expose these to the framework during framework registration.
[jira] [Updated] (MESOS-6071) Validate that an explicitly specified DEFAULT executor has disk resources.
    [ https://issues.apache.org/jira/browse/MESOS-6071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Mahler updated MESOS-6071:
-----------------------------------

    Summary: Validate that an explicitly specified DEFAULT executor has disk resources.  (was: Validate that an explicitly specified DEFAULT executor has resources.)
[jira] [Updated] (MESOS-6071) Validate that an explicitly specified DEFAULT executor has resources.
    [ https://issues.apache.org/jira/browse/MESOS-6071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Mahler updated MESOS-6071:
-----------------------------------

    Description:
When the framework is explicitly specifying the DEFAULT executor (currently only supported for task groups), we should consider validating that it contains disk resources.

We should also consider supporting the omission of DEFAULT executor resources and injecting a default amount of resources. However, the difficulty here is that the framework must know about these amounts since they need to be available in the offer. We could expose these to the framework during framework registration.

  was:
When the framework is explicitly specifying the DEFAULT executor (currently only supported for task groups), we should consider validating that it contains cpus, mem, and disk resources.

We should also consider supporting the omission of DEFAULT executor resources and injecting a default amount of resources. However, the difficulty here is that the framework must know about these amounts since they need to be available in the offer. We could expose these to the framework during framework registration.
[jira] [Updated] (MESOS-6071) Validate that an explicitly specified DEFAULT executor has disk resources.
    [ https://issues.apache.org/jira/browse/MESOS-6071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Mahler updated MESOS-6071:
-----------------------------------

    Description:
When the framework is explicitly specifying the DEFAULT executor (currently only supported for task groups), we should consider validating that it contains disk resources. Currently, we validate that executors only contain cpus and mem.

We should also consider supporting the omission of DEFAULT executor resources and injecting a default amount of resources. However, the difficulty here is that the framework must know about these amounts since they need to be available in the offer. We could expose these to the framework during framework registration.

  was:
When the framework is explicitly specifying the DEFAULT executor (currently only supported for task groups), we should consider validating that it contains disk resources.

We should also consider supporting the omission of DEFAULT executor resources and injecting a default amount of resources. However, the difficulty here is that the framework must know about these amounts since they need to be available in the offer. We could expose these to the framework during framework registration.
[jira] [Commented] (MESOS-6056) add NOOP Container Logger for mesos
    [ https://issues.apache.org/jira/browse/MESOS-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15432113#comment-15432113 ]

ASF GitHub Bot commented on MESOS-6056:
---------------------------------------

Github user IvanJobs closed the pull request at:

    https://github.com/apache/mesos/pull/159

> add NOOP Container Logger for mesos
> -----------------------------------
>
>     Key: MESOS-6056
>     URL: https://issues.apache.org/jira/browse/MESOS-6056
>     Project: Mesos
>     Issue Type: Improvement
>     Components: containerization, slave
>     Affects Versions: 1.0.0
>     Environment: mesos 1.0.0, docker
>     Reporter: IvanJobs
>     Priority: Trivial
>     Labels: easyfix, features
>     Original Estimate: 96h
>     Remaining Estimate: 96h
>
> Mesos has two container loggers in its source files.
> One is built into mesos-agent: the sandbox container logger. It just redirects stderr/stdout to the sandbox, which can fill up the disk.
> The other is the LogrotateContainerLogger module lib; with it we can keep stdout/stderr in the sandbox at a constant size.
> But there is a common need: don't write stdout/stderr into the sandbox at all. Unfortunately, we don't have any flag for turning it off.
> A workaround is to develop a new ContainerLogger module lib that does nothing (redirects stdout/stderr to /dev/null).
> That's it: we need a NOOP ContainerLogger. BTW, we can also retrieve stderr/stdout from the docker daemon.
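The fd-level effect the proposal describes (redirect stdout/stderr to /dev/null instead of sandbox files) can be sketched as below. This is only an illustration of the redirection itself; a real implementation would go through the ContainerLogger module interface, and {{redirectFdToDevNull}} is a hypothetical helper name.

```cpp
#include <fcntl.h>
#include <unistd.h>

// Point the given descriptor at /dev/null, so anything written through it
// is silently discarded instead of landing in a sandbox file. Returns
// false if /dev/null could not be opened or the dup2() failed.
bool redirectFdToDevNull(int fd) {
  int devnull = open("/dev/null", O_WRONLY);
  if (devnull < 0) {
    return false;
  }
  bool ok = dup2(devnull, fd) >= 0;  // fd now refers to /dev/null
  close(devnull);
  return ok;
}
```

Applied to a container's STDOUT_FILENO/STDERR_FILENO at launch, this is the whole "NOOP" behavior; as noted later in the thread, it also discards executor output, which is the main argument against it.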
[jira] [Commented] (MESOS-6056) add NOOP Container Logger for mesos
    [ https://issues.apache.org/jira/browse/MESOS-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15432112#comment-15432112 ]

ASF GitHub Bot commented on MESOS-6056:
---------------------------------------

Github user IvanJobs commented on the issue:

    https://github.com/apache/mesos/pull/159

    Well, actually, after communicating with Joseph Wu, I think this NOOP container logger is not so common and should not be accepted by the Mesos community. So just forget about it. But if you have a special use case and want to use this, I'm happy about that.
[jira] [Created] (MESOS-6071) Validate that an explicitly specified DEFAULT executor has resources.
Benjamin Mahler created MESOS-6071:
--------------------------------------

    Summary: Validate that an explicitly specified DEFAULT executor has resources.
    Key: MESOS-6071
    URL: https://issues.apache.org/jira/browse/MESOS-6071
    Project: Mesos
    Issue Type: Task
    Components: master
    Reporter: Benjamin Mahler

When the framework is explicitly specifying the DEFAULT executor (currently only supported for task groups), we should consider validating that it contains cpus, mem, and disk resources.

We should also consider supporting the omission of DEFAULT executor resources and injecting a default amount of resources. However, the difficulty here is that the framework must know about these amounts since they need to be available in the offer. We could expose these to the framework during framework registration.
[jira] [Commented] (MESOS-6056) add NOOP Container Logger for mesos
    [ https://issues.apache.org/jira/browse/MESOS-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15432033#comment-15432033 ]

Joseph Wu commented on MESOS-6056:
----------------------------------

By the way, debugging the container logger is a challenge because the logger binary itself does not log anything (when something goes wrong, it most likely does not have the ability to log). The [latest issue we found in the logrotate module|https://issues.apache.org/jira/browse/MESOS-5856] was debugged using a mix of {{strace}} and matching specific syscalls to locations in the logrotate source code.
[jira] [Commented] (MESOS-6056) add NOOP Container Logger for mesos
    [ https://issues.apache.org/jira/browse/MESOS-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15432028#comment-15432028 ]

Joseph Wu commented on MESOS-6056:
----------------------------------

Nowadays, Github PRs are used (almost entirely) to modify the {{contributors.yaml}} file. That sentence could be clearer, I suppose :)
[jira] [Commented] (MESOS-6056) add NOOP Container Logger for mesos
    [ https://issues.apache.org/jira/browse/MESOS-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431967#comment-15431967 ]

IvanJobs commented on MESOS-6056:
---------------------------------

Yep, I missed that part of the logs. The log in the sandbox is not only the log from the docker container; it also includes the log from the executor. The docker daemon just maintains the log from the docker container. If I redirect the sandbox's log to /dev/null, I will lose the log from the executor. Thx for reminding me of that.

As you say, we don't add features in Github PRs. But my understanding of that ref link is not the same. I picked two sentences out below:

"You've fixed a bug or added a feature and want to contribute it. AWESOME! Once your JIRA and Review Board accounts are in place please go ahead and create a review or GitHub pull request with an entry for yourself in contributors.yaml file."

Did I miss something? Thx.
[jira] [Commented] (MESOS-6055) Mesos libs in LD_LIBRARY_PATH cause fetcher to fail and not report errors
    [ https://issues.apache.org/jira/browse/MESOS-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431839#comment-15431839 ]

Charles Allen commented on MESOS-6055:
--------------------------------------

I'll close it as {{can't reproduce}} for now. It may have just been an oddity of the system I was testing on.

> Mesos libs in LD_LIBRARY_PATH cause fetcher to fail and not report errors
> -------------------------------------------------------------------------
>
>     Key: MESOS-6055
>     URL: https://issues.apache.org/jira/browse/MESOS-6055
>     Project: Mesos
>     Issue Type: Bug
>     Components: fetcher
>     Reporter: Charles Allen
>
> In 1.0.0, if the agent is launched such that the mesos libraries can only be found under {{LD_LIBRARY_PATH}}, the fetcher will fail and simply exit with no output. The log will not show linker errors. I'm not sure where they are swallowed. If the task is launched with LD_LIBRARY_PATH set to include where the mesos libs can be found, the fetcher functions as expected.
> The problem is that the errors in the fetcher linking are not obvious, as no logs are produced from the fetcher subprocess.
[jira] [Commented] (MESOS-6055) Mesos libs in LD_LIBRARY_PATH cause fetcher to fail and not report errors
    [ https://issues.apache.org/jira/browse/MESOS-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431840#comment-15431840 ]

Charles Allen commented on MESOS-6055:
--------------------------------------

Thanks for checking it out!
[jira] [Updated] (MESOS-6069) Misspelt TASK_KILLED in mesos slave
    [ https://issues.apache.org/jira/browse/MESOS-6069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Wu updated MESOS-6069:
-----------------------------

    Priority: Trivial  (was: Major)

> Misspelt TASK_KILLED in mesos slave
> -----------------------------------
>
>     Key: MESOS-6069
>     URL: https://issues.apache.org/jira/browse/MESOS-6069
>     Project: Mesos
>     Issue Type: Bug
>     Components: slave
>     Reporter: Cody Maloney
>     Priority: Trivial
>     Labels: newbie
>
> https://github.com/apache/mesos/blob/c3228f3c3d1a1b2c145d1377185cfe22da6079eb/src/slave/slave.cpp#L2127
[jira] [Updated] (MESOS-6069) Misspelt TASK_KILLED in mesos slave
    [ https://issues.apache.org/jira/browse/MESOS-6069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Wu updated MESOS-6069:
-----------------------------

    Labels: newbie  (was: )
[jira] [Commented] (MESOS-6055) Mesos libs in LD_LIBRARY_PATH cause fetcher to fail and not report errors
    [ https://issues.apache.org/jira/browse/MESOS-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431760#comment-15431760 ]

Joseph Wu commented on MESOS-6055:
----------------------------------

Wasn't able to repro from my local build.

Agent launched from a non-libtool'd binary:
{code}
sudo -E GLOG_v=1 LD_RUN_PATH=/mesos/build/src/.libs LD_LIBRARY_PATH=/mesos/build/src/.libs src/.libs/mesos-agent --work_dir=/tmp/agent --master=localhost:5050 --launcher_dir=/mesos/build/src
{code}

Master launched from wherever:
{code}
bin/mesos-master.sh --work_dir=/tmp/master
{code}

See if the fetcher does anything. The URI itself doesn't matter:
{code}
src/balloon-framework --master=localhost:5050 --task_memory=128MB --task_memory_usage_limit=256MB --executor_uri="http://dont/really/care/where/this/is"
{code}

Checked the task's stderr; it clearly showed a fetcher error, but not a linking error.
[jira] [Created] (MESOS-6070) Renamed containerizer::Termination to ContainerTermination.
Jie Yu created MESOS-6070:
-----------------------------

    Summary: Renamed containerizer::Termination to ContainerTermination.
    Key: MESOS-6070
    URL: https://issues.apache.org/jira/browse/MESOS-6070
    Project: Mesos
    Issue Type: Task
    Reporter: Jie Yu
    Assignee: Jie Yu

`containerizer::Termination` is a legacy protobuf for the external containerizer. Since we already removed the external containerizer, we should rename it to `ContainerTermination` and move the definition to `containerizer.proto`. We should also move all definitions in `isolator.proto` to `containerizer.proto` to be more consistent.
[jira] [Assigned] (MESOS-6057) docker isolator does not overwrite Dockerfile ENV
    [ https://issues.apache.org/jira/browse/MESOS-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jie Yu reassigned MESOS-6057:
-----------------------------

    Assignee: Jie Yu

> docker isolator does not overwrite Dockerfile ENV
> -------------------------------------------------
>
>     Key: MESOS-6057
>     URL: https://issues.apache.org/jira/browse/MESOS-6057
>     Project: Mesos
>     Issue Type: Bug
>     Components: containerization
>     Affects Versions: 1.0.0, 1.0.1, 1.1.0
>     Reporter: Stéphane Cottin
>     Assignee: Jie Yu
>     Priority: Critical
>     Labels: mesosphere
>     Fix For: 1.1.0
>
> The docker/runtime isolator does not overwrite env values when a default value is present in the Dockerfile.
> Steps to reproduce:
> {code}
> mesos-execute --master=leader.mesos:5050 --name=test --docker_image=bashell/alpine-bash --env="{\"LC_ALL\": \"fr_FR.UTF-8\", \"LC_TEST\": \"fr_FR.UTF-8\"}" --command="env"
> {code}
> outputs in stdout:
> {code}
> [...]
> LC_ALL=en_US.UTF-8
> LC_TEST=fr_FR.UTF-8
> [...]
> {code}
> {{en_US.UTF-8}} is the default value from the dockerfile, see https://hub.docker.com/r/bashell/alpine-bash/~/dockerfile/
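The expected precedence in that report can be sketched as a merge where task-supplied values overlay the image's Dockerfile ENV defaults. This is an illustration of the intended semantics only, not the actual isolator code; {{mergeEnv}} and its parameters are hypothetical names.

```cpp
#include <map>
#include <string>

// Build the container environment: start from the image's Dockerfile ENV
// defaults, then overlay the task's --env values, so a task value such as
// LC_ALL=fr_FR.UTF-8 replaces the image default. The bug reported above
// is behavior consistent with the opposite order for keys that have a
// Dockerfile default.
std::map<std::string, std::string> mergeEnv(
    const std::map<std::string, std::string>& imageDefaults,
    const std::map<std::string, std::string>& taskEnv) {
  std::map<std::string, std::string> merged = imageDefaults;
  for (const auto& entry : taskEnv) {
    merged[entry.first] = entry.second;  // task-supplied values win
  }
  return merged;
}
```

With this order, the repro above would print {{LC_ALL=fr_FR.UTF-8}} rather than the image's {{en_US.UTF-8}}.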
[jira] [Created] (MESOS-6068) Refactor MesosContainerizer::launch to prepare for nesting support.
Jie Yu created MESOS-6068:
-----------------------------

    Summary: Refactor MesosContainerizer::launch to prepare for nesting support.
    Key: MESOS-6068
    URL: https://issues.apache.org/jira/browse/MESOS-6068
    Project: Mesos
    Issue Type: Task
    Reporter: Jie Yu
    Assignee: Jie Yu

The idea is to have a common launch path for both the top level executor container and nested containers. That means the parameters to the launch method should be container agnostic. Then the original launch can just call this common launch code. When we add nesting support later, the same common launch code will be re-used.
[jira] [Created] (MESOS-6069) Misspelt TASK_KILLED in mesos slave
Cody Maloney created MESOS-6069:
-----------------------------------

    Summary: Misspelt TASK_KILLED in mesos slave
    Key: MESOS-6069
    URL: https://issues.apache.org/jira/browse/MESOS-6069
    Project: Mesos
    Issue Type: Bug
    Components: slave
    Reporter: Cody Maloney

https://github.com/apache/mesos/blob/c3228f3c3d1a1b2c145d1377185cfe22da6079eb/src/slave/slave.cpp#L2127
[jira] [Commented] (MESOS-6066) Operator SUBSCRIBE api should include timestamps
    [ https://issues.apache.org/jira/browse/MESOS-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431702#comment-15431702 ]

Anand Mazumdar commented on MESOS-6066:
---------------------------------------

We intend to expose the {{TaskStatus}} as part of the {{TaskUpdated}} event. That would have the timestamp details.

> Operator SUBSCRIBE api should include timestamps
> ------------------------------------------------
>
>     Key: MESOS-6066
>     URL: https://issues.apache.org/jira/browse/MESOS-6066
>     Project: Mesos
>     Issue Type: Bug
>     Components: HTTP API, json api
>     Affects Versions: 1.0.0
>     Reporter: Steven Schlansker
>
> Events coming from the Mesos master are delivered asynchronously. While usually they are processed in a timely fashion, it really scares me that updates do not have a timestamp:
> {code}
> 301
> {
>   "task_updated": {
>     "agent_id": {
>       "value": "fdbb3ff5-47c2-4b49-a521-b52b9acf74dd-S14"
>     },
>     "framework_id": {
>       "value": "Singularity"
>     },
>     "state": "TASK_KILLED",
>     "task_id": {
>       "value": "pp-demoservice-steven.2016.07.05T17.00.06-1471901722511-1-mesos_slave17_qa_uswest2.qasql.opentable.com-us_west_2b"
>     }
>   },
>   "type": "TASK_UPDATED"
> }
> {code}
> Events should have a timestamp that indicates the time they happened at, otherwise your timestamps include delivery and processing delays.
[jira] [Commented] (MESOS-313) Report executor terminations to framework schedulers.
    [ https://issues.apache.org/jira/browse/MESOS-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431688#comment-15431688 ]

Stephan Erb commented on MESOS-313:
-----------------------------------

Now that this patch has landed, even a clean shutdown of an executor (with status code 0) is reported to the framework via the {{executorLost}} message. Is this a bug or intentional?

Example log output:
{code}
I0616 13:55:16.580080 16915 master.cpp:4891] Executor 'thermos-role-env-job-0-d94972f8-760e-4bb0-beef-654e2df1f5e0' of framework 20151001-085346-58917130-5050-37976- on slave d4218d85-e294-4405-af4c-80fc7a66f1a4-S0 at slave(1)@:5051 (): exited with status 0
I0616 13:55:16.580286 16915 master.cpp:6540] Removing executor 'thermos-role-env-job-0-d94972f8-760e-4bb0-beef-654e2df1f5e0' with resources cpus(*):0.01; mem(*):128 of framework 20151001-085346-58917130-5050-37976- on slave d4218d85-e294-4405-af4c-80fc7a66f1a4-S0 at slave(1)@:5051 ()
{code}

> Report executor terminations to framework schedulers.
> -----------------------------------------------------
>
>     Key: MESOS-313
>     URL: https://issues.apache.org/jira/browse/MESOS-313
>     Project: Mesos
>     Issue Type: Improvement
>     Reporter: Charles Reiss
>     Assignee: Zhitao Li
>     Labels: mesosphere, newbie
>     Fix For: 0.27.0
>
> The Scheduler interface has a callback for executorLost, but currently it is never called.
[jira] [Created] (MESOS-6067) Support provisioner to be nested aware for Mesos Pods.
Gilbert Song created MESOS-6067: --- Summary: Support provisioner to be nested aware for Mesos Pods. Key: MESOS-6067 URL: https://issues.apache.org/jira/browse/MESOS-6067 Project: Mesos Issue Type: Task Components: containerization Reporter: Gilbert Song Assignee: Gilbert Song -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-6066) Operator SUBSCRIBE api should include timestamps
Steven Schlansker created MESOS-6066: Summary: Operator SUBSCRIBE api should include timestamps Key: MESOS-6066 URL: https://issues.apache.org/jira/browse/MESOS-6066 Project: Mesos Issue Type: Bug Components: HTTP API, json api Affects Versions: 1.0.0 Reporter: Steven Schlansker Events coming from the Mesos master are delivered asynchronously. While usually they are processed in a timely fashion, it really scares me that updates do not have a timestamp: {code} 301 { "task_updated": { "agent_id": { "value": "fdbb3ff5-47c2-4b49-a521-b52b9acf74dd-S14" }, "framework_id": { "value": "Singularity" }, "state": "TASK_KILLED", "task_id": { "value": "pp-demoservice-steven.2016.07.05T17.00.06-1471901722511-1-mesos_slave17_qa_uswest2.qasql.opentable.com-us_west_2b" } }, "type": "TASK_UPDATED" } {code} Events should have a timestamp that indicates the time that they happened at, otherwise your timestamps include delivery and processing delays. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-6065) Support provisioning image volumes in an isolator.
Gilbert Song created MESOS-6065: --- Summary: Support provisioning image volumes in an isolator. Key: MESOS-6065 URL: https://issues.apache.org/jira/browse/MESOS-6065 Project: Mesos Issue Type: Improvement Components: containerization, isolation Reporter: Gilbert Song Assignee: Gilbert Song Currently the image volumes are provisioned in the mesos containerizer. This makes the containerizer logic complicated, and makes it hard to make the containerizer launch path nested-aware. We should implement a 'volume/image' isolator to move this part of the logic out of the mesos containerizer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-6062) mesos-agent should autodetect mount-type volume sizes
[ https://issues.apache.org/jira/browse/MESOS-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anindya Sinha reassigned MESOS-6062: Assignee: Anindya Sinha > mesos-agent should autodetect mount-type volume sizes > - > > Key: MESOS-6062 > URL: https://issues.apache.org/jira/browse/MESOS-6062 > Project: Mesos > Issue Type: Improvement > Components: slave >Reporter: Yan Xu >Assignee: Anindya Sinha > > When dealing with a large fleet of machines it could be cumbersome to > construct the resources JSON file that varies from host to host. Mesos > already auto-detects resources such as cpus, mem and "root" disk, it should > extend it to the MOUNT type disk as it's pretty clear that the value should > be the size of entire volume. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
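As a rough illustration of the auto-detection MESOS-6062 proposes, the agent could stat each configured MOUNT-type volume root and use the filesystem's total size as the disk resource value. This is a minimal sketch only, not Mesos code; `mountSizeBytes` is a hypothetical helper built on POSIX `statvfs`:

```cpp
#include <sys/statvfs.h>

#include <cstdint>
#include <string>

// Hypothetical helper: return the total size in bytes of the filesystem
// mounted at `path`, as the agent could do for each MOUNT-type disk root.
// Returns -1 on error (e.g. the path does not exist).
inline int64_t mountSizeBytes(const std::string& path) {
  struct statvfs buf;
  if (::statvfs(path.c_str(), &buf) != 0) {
    return -1;  // The caller decides how to surface the error.
  }
  // Total blocks times the fundamental block size gives the volume size.
  return static_cast<int64_t>(buf.f_blocks) * static_cast<int64_t>(buf.f_frsize);
}
```

The real implementation would presumably hook into the agent's existing resource auto-detection alongside cpus, mem, and "root" disk.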
[jira] [Commented] (MESOS-6016) Expose the unversioned Call and Event Scheduler/Executor Protobufs.
[ https://issues.apache.org/jira/browse/MESOS-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431478#comment-15431478 ] Anand Mazumdar commented on MESOS-6016: --- {noformat} commit e3143e756fafe343a79cadb10b587fad0e5904d5 Author: Anand Mazumdar Date: Mon Aug 22 11:15:26 2016 -0700 Exposed unversioned scheduler/executor protos in Mesos JAR. This change exposes the unversioned scheduler/executor protos in the Mesos JAR. We already used to expose the unversioned Mesos protos. This is useful for migrating schedulers to use the new v1 API via the scheduler shim. Otherwise, they would need to create their own copy of these protobufs even for vetting the new API via the shim. Note that this only partially resolves MESOS-6016 and that we would need to tackle the unversioned protobuf deprecation later eventually. Review: https://reviews.apache.org/r/51130/ {noformat} > Expose the unversioned Call and Event Scheduler/Executor Protobufs. > --- > > Key: MESOS-6016 > URL: https://issues.apache.org/jira/browse/MESOS-6016 > Project: Mesos > Issue Type: Task >Reporter: Anand Mazumdar > Labels: mesos > > Currently, we don't expose the un-versioned (v0) {{Call}}/{{Event}} > scheduler/executor protobufs externally to framework authors. This is a bit > disjoint since we already expose the unversioned Mesos protos. The reasoning > for not doing so earlier was that Mesos would use the v0 protobufs as an > alternative to having separate internal protobufs internally. > However, that is not going to work. Eventually, when we introduce a backward > incompatible change in {{v1}} protobufs, we would create new {{v2}} > protobufs. But, we would need to ensure that {{v2}} protobufs can somehow be > translated to {{v0}} without breaking existing users. That's a pretty hard > thing to do! 
In the interim, to help framework authors migrate their > frameworks (they might be storing old protobufs in ZK/other reliable storage) > , we should expose the v0 scheduler/executor protobufs too and create another > internal translation layer for Mesos. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5788) Consider adding a Java Scheduler Shim/Adapter for the new/old API.
[ https://issues.apache.org/jira/browse/MESOS-5788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431476#comment-15431476 ] Anand Mazumdar commented on MESOS-5788: --- {noformat} commit 04b9498bc5b5e080095786f4275c202625b3142b Author: Anand Mazumdar Date: Mon Aug 22 11:14:51 2016 -0700 Renamed `JNIMesos` to `V1Mesos` for scheduler shim. This change renames `JNIMesos`, v1 implementation for the scheduler shim to `V1Mesos`. `JNIMesos` was non-intuitive for users considering the implementation was already in the native code for `V0Mesos` too. Also, it was a bit confusing that `JNIMesos` referred to using the v1 API under the hood. Review: https://reviews.apache.org/r/51129/ {noformat} > Consider adding a Java Scheduler Shim/Adapter for the new/old API. > -- > > Key: MESOS-5788 > URL: https://issues.apache.org/jira/browse/MESOS-5788 > Project: Mesos > Issue Type: Task >Reporter: Anand Mazumdar >Assignee: Anand Mazumdar > Labels: mesosphere > Fix For: 1.1.0 > > > Currently, for existing JAVA based frameworks, moving to try out the new API > can be cumbersome. This change intends to introduce a shim/adapter interface > that makes this easier by allowing to toggle between the old/new API > (driver/new scheduler library) implementation via an environment variable. > This would allow framework developers to transition their older frameworks to > the new API rather seamlessly. > This would look similar to the work done for the executor shim for C++ > (command/docker executor). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-6055) Mesos libs in LD_LIBRARY_PATH cause fetcher to fail and not report errors
[ https://issues.apache.org/jira/browse/MESOS-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431398#comment-15431398 ] Charles Allen commented on MESOS-6055: -- Install mesos in a way where the main shared library isn't found (e.g. launching a slave should fail by default with errors about not being able to find/bind the mesos library). Change the library path via {{LD_LIBRARY_PATH}} such that the slave succeeds in running. Try to launch something with a URI to be fetched; it will fail in confusing ways. Try to launch something without a URI (like {{echo something}}); it will print out {{something}} as expected. > Mesos libs in LD_LIBRARY_PATH cause fetcher to fail and not report errors > - > > Key: MESOS-6055 > URL: https://issues.apache.org/jira/browse/MESOS-6055 > Project: Mesos > Issue Type: Bug > Components: fetcher >Reporter: Charles Allen > > in 1.0.0, if the agent is launched such that the mesos libraries can only be > found under {{LD_LIBRARY_PATH}}, the fetcher will fail and simply exit with > no output. The log will not show linker errors. I'm not sure where they are > swallowed. If the task is launched with LD_LIBRARY_PATH set to include where > the mesos libs can be found, the fetcher functions as expected. > The problem is that the errors in the fetcher linking are not obvious as no > logs are produced from the fetcher sub process. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-6064) Add version member field to Docker class to avoid validate docker version every time
[ https://issues.apache.org/jira/browse/MESOS-6064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haosdent updated MESOS-6064: Labels: docker (was: ) > Add version member field to Docker class to avoid validate docker version > every time > > > Key: MESOS-6064 > URL: https://issues.apache.org/jira/browse/MESOS-6064 > Project: Mesos > Issue Type: Improvement > Components: docker >Reporter: haosdent >Assignee: haosdent > Labels: docker > > Now the minimum docker version we supported is >=1.0.0. However, we support > some advanced features after docker 1.0.0 as well which require > {{Docker::validateVersion}} before use them. {{Docker::validateVersion}} is a > blocking function which waits for {{docker --version}} return. Call it too > many times bring unnecessary overheads. It would be better that we add a > member field represent current docker version in {{Docker}} class to avoid to > execute {{docker --version}} every time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-6064) Add version member field to Docker class to avoid validate docker version every time
haosdent created MESOS-6064: --- Summary: Add version member field to Docker class to avoid validate docker version every time Key: MESOS-6064 URL: https://issues.apache.org/jira/browse/MESOS-6064 Project: Mesos Issue Type: Improvement Reporter: haosdent Assignee: haosdent Now the minimum docker version we support is >=1.0.0. However, we also support some advanced features added after docker 1.0.0, which require calling {{Docker::validateVersion}} before using them. {{Docker::validateVersion}} is a blocking function which waits for {{docker --version}} to return. Calling it too many times brings unnecessary overhead. It would be better to add a member field representing the current docker version to the {{Docker}} class, to avoid executing {{docker --version}} every time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
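A minimal sketch of the caching MESOS-6064 proposes (hypothetical class and names, not the actual Mesos {{Docker}} class): store the result of the first blocking version query in a member field and reuse it afterwards. The {{fetchCount}} member exists only to illustrate that the blocking call happens once:

```cpp
#include <optional>
#include <string>

// Hypothetical sketch: cache the docker version in a member field so the
// blocking `docker --version` invocation runs at most once.
class DockerClient {
public:
  // Returns the cached version, fetching it lazily on first use.
  const std::string& version() {
    if (!cachedVersion) {
      cachedVersion = fetchVersion();
      ++fetchCount;
    }
    return *cachedVersion;
  }

  int fetchCount = 0;  // For illustration: how often we "shelled out".

private:
  // Stand-in for the blocking `docker --version` call that the real
  // Docker::validateVersion performs.
  std::string fetchVersion() { return "1.12.0"; }

  std::optional<std::string> cachedVersion;
};
```

One design question the real change would need to answer is invalidation: a cached value goes stale if the docker daemon is upgraded while the agent is running.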
[jira] [Commented] (MESOS-6055) Mesos libs in LD_LIBRARY_PATH cause fetcher to fail and not report errors
[ https://issues.apache.org/jira/browse/MESOS-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431325#comment-15431325 ] Joseph Wu commented on MESOS-6055: -- Can you provide more repro steps? I haven't observed any fetcher linking issues... > Mesos libs in LD_LIBRARY_PATH cause fetcher to fail and not report errors > - > > Key: MESOS-6055 > URL: https://issues.apache.org/jira/browse/MESOS-6055 > Project: Mesos > Issue Type: Bug > Components: fetcher >Reporter: Charles Allen > > in 1.0.0, if the agent is launched such that the mesos libraries can only be > found under {{LD_LIBRARY_PATH}}, the fetcher will fail and simply exit with > no output. The log will not show linker errors. I'm not sure where they are > swallowed. If the task is launched with LD_LIBRARY_PATH set to include where > the mesos libs can be found, the fetcher functions as expected. > The problem is that the errors in the fetcher linking are not obvious as no > logs are produced from the fetcher sub process. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-6064) Add version member field to Docker class to avoid validate docker version every time
[ https://issues.apache.org/jira/browse/MESOS-6064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haosdent updated MESOS-6064: Component/s: docker > Add version member field to Docker class to avoid validate docker version > every time > > > Key: MESOS-6064 > URL: https://issues.apache.org/jira/browse/MESOS-6064 > Project: Mesos > Issue Type: Improvement > Components: docker >Reporter: haosdent >Assignee: haosdent > Labels: docker > > Now the minimum docker version we supported is >=1.0.0. However, we support > some advanced features after docker 1.0.0 as well which require > {{Docker::validateVersion}} before use them. {{Docker::validateVersion}} is a > blocking function which waits for {{docker --version}} return. Call it too > many times bring unnecessary overheads. It would be better that we add a > member field represent current docker version in {{Docker}} class to avoid to > execute {{docker --version}} every time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-6035) Add non-recursive version of cgroups::get
[ https://issues.apache.org/jira/browse/MESOS-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431254#comment-15431254 ] haosdent commented on MESOS-6035: - +1 for {{recursive=false}} default. > Add non-recursive version of cgroups::get > - > > Key: MESOS-6035 > URL: https://issues.apache.org/jira/browse/MESOS-6035 > Project: Mesos > Issue Type: Improvement >Reporter: haosdent >Assignee: haosdent >Priority: Minor > > In some cases, we only need to get the top level cgroups instead of to get > all cgroups recursively. Add a non-recursive version could help to avoid > unnecessary paths. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-6035) Add non-recursive version of cgroups::get
[ https://issues.apache.org/jira/browse/MESOS-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15425251#comment-15425251 ] Yan Xu edited comment on MESOS-6035 at 8/22/16 5:35 PM: I commented on the review. This is relevant to MESOS-5879 as well. [~swsnider] brought up the point that if we default to *not* recursively descending into nested cgroups, a lot of problems, including MESOS-5879, go away. I can't think of a reason that we should traverse the cgroups recursively today (except {{cgroups::remove()}} and {{cgroups::destroy()}}, but they are different in that they are within the cgroups util and encapsulate such details). Of course if we do have cases in the future they can set {{recursive=true}} explicitly. [~jieyu] [~idownes] what are your thoughts on this? was (Author: xujyan): I commented on the review. This is relevant to MESOS-5879 as well. [~swsnider] brought up a point if we default to *not* recursively descend into nested cgroups, a lot of problems including MESOS-5879 go aways. I can't think of a reason that we should traverse the cgroups recursively today. Of course if we do have cases in the future they can set {{recursive=true}} explicitly. [~jieyu] [~idownes] what are your thoughts on this? > Add non-recursive version of cgroups::get > - > > Key: MESOS-6035 > URL: https://issues.apache.org/jira/browse/MESOS-6035 > Project: Mesos > Issue Type: Improvement >Reporter: haosdent >Assignee: haosdent >Priority: Minor > > In some cases, we only need to get the top level cgroups instead of to get > all cgroups recursively. Add a non-recursive version could help to avoid > unnecessary paths. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-6035) Add non-recursive version of cgroups::get
[ https://issues.apache.org/jira/browse/MESOS-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15425251#comment-15425251 ] Yan Xu edited comment on MESOS-6035 at 8/22/16 5:36 PM: I commented on the review. This is relevant to MESOS-5879 as well. [~swsnider] brought up the point that if we default to *not* recursively descending into nested cgroups, a lot of problems, including MESOS-5879, go away. I can't think of a reason that we should traverse the cgroups recursively today (except for {{cgroups::remove()}} and {{cgroups::destroy()}}, but they are different in that they are within the cgroups util and encapsulate such details). Of course if we do have cases in the future they can set {{recursive=true}} explicitly. [~jieyu] [~idownes] what are your thoughts on this? was (Author: xujyan): I commented on the review. This is relevant to MESOS-5879 as well. [~swsnider] brought up a point if we default to *not* recursively descend into nested cgroups, a lot of problems including MESOS-5879 go aways. I can't think of a reason that we should traverse the cgroups recursively today (except {{cgroups::remove()}} and {{cgroups::destroy()}} but they are different in that they are within the cgroups util and encapsulate such details). Of course if we do have cases in the future they can set {{recursive=true}} explicitly. [~jieyu] [~idownes] what are your thoughts on this? > Add non-recursive version of cgroups::get > - > > Key: MESOS-6035 > URL: https://issues.apache.org/jira/browse/MESOS-6035 > Project: Mesos > Issue Type: Improvement >Reporter: haosdent >Assignee: haosdent >Priority: Minor > > In some cases, we only need to get the top level cgroups instead of to get > all cgroups recursively. Add a non-recursive version could help to avoid > unnecessary paths. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
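The {{recursive}}-flag design discussed above could look roughly like the following sketch. This is illustrative only: a hypothetical {{getCgroups}} helper over plain directories using {{std::filesystem}}, not the real {{cgroups::get}}, which walks the mounted cgroup hierarchy:

```cpp
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical sketch of a cgroups::get-style listing with a `recursive`
// flag defaulting to false, per the discussion above. Non-recursive mode
// returns only the immediate child "cgroups" of `root`.
std::vector<std::string> getCgroups(
    const std::string& root, bool recursive = false) {
  std::vector<std::string> result;
  if (recursive) {
    for (const auto& entry : fs::recursive_directory_iterator(root)) {
      if (entry.is_directory()) result.push_back(entry.path().string());
    }
  } else {
    for (const auto& entry : fs::directory_iterator(root)) {
      if (entry.is_directory()) result.push_back(entry.path().string());
    }
  }
  return result;
}
```

With {{recursive=false}} as the default, callers that genuinely need the full subtree must opt in explicitly, which matches the direction proposed in the comment.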
[jira] [Created] (MESOS-6063) Track recovered and prepared subsystems for a container
haosdent created MESOS-6063: --- Summary: Track recovered and prepared subsystems for a container Key: MESOS-6063 URL: https://issues.apache.org/jira/browse/MESOS-6063 Project: Mesos Issue Type: Improvement Components: cgroups Reporter: haosdent Assignee: haosdent Currently, when we restart the Mesos Agent with different cgroups subsystems, the existing containers would fail to recover on the newly added subsystems. In this case, we ignore them and continue to perform `usage`, `status` and `cleanup` on them. It would be better to track the recovered and prepared subsystems for each container, and then skip performing `update`, `wait`, `usage` and `status` on the subsystems that were not recovered or prepared. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
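A minimal sketch of the bookkeeping MESOS-6063 suggests (hypothetical names, not Mesos code): keep a per-container set of the subsystems that were successfully recovered or prepared, and consult it before each per-subsystem operation:

```cpp
#include <set>
#include <string>

// Hypothetical sketch: per-container record of which cgroup subsystems
// were successfully recovered/prepared. Operations like `update`, `wait`,
// `usage` and `status` would first check this set and skip subsystems
// that are not present.
struct ContainerInfo {
  std::set<std::string> subsystems;  // e.g. {"cpu", "memory"}

  bool has(const std::string& subsystem) const {
    return subsystems.count(subsystem) > 0;
  }
};
```

In the real isolator this set would be populated during `recover`/`prepare` and persisted as part of the container's checkpointed state.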
[jira] [Created] (MESOS-6062) mesos-agent should autodetect mount-type volume sizes
Yan Xu created MESOS-6062: - Summary: mesos-agent should autodetect mount-type volume sizes Key: MESOS-6062 URL: https://issues.apache.org/jira/browse/MESOS-6062 Project: Mesos Issue Type: Improvement Components: slave Reporter: Yan Xu When dealing with a large fleet of machines it could be cumbersome to construct the resources JSON file that varies from host to host. Mesos already auto-detects resources such as cpus, mem and "root" disk; it should extend this to MOUNT-type disks, as it's pretty clear that the value should be the size of the entire volume. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-6061) Docker registry puller shows decode error "No response decoded".
Sunzhe created MESOS-6061: - Summary: Docker registry puller shows decode error "No response decoded". Key: MESOS-6061 URL: https://issues.apache.org/jira/browse/MESOS-6061 Project: Mesos Issue Type: Bug Components: containerization, docker Affects Versions: 1.0.0 Reporter: Sunzhe The {{mesos-agent}} flags: {code} GLOG_v=1 ./bin/mesos-agent.sh \ --master=zk://${MESOS_MASTER_IP}:2181/mesos \ --ip=10.100.3.3 \ --work_dir=${MESOS_WORK_DIR} \ --isolation=cgroups/devices,gpu/nvidia,disk/du,docker/runtime,filesystem/linux \ --enforce_container_disk_quota \ --containerizers=mesos \ --image_providers=docker \ --executor_environment_variables="{}" {code} And the {{mesos-execute}} flags: {code} ./src/mesos-execute \ --master=${MESOS_MASTER_IP}:5050 \ --name=${INSTANCE_NAME} \ --docker_image=nvidia/cuda \ --framework_capabilities=GPU_RESOURCES \ --resources="cpus:1;mem:128;gpus:1" \ --command="nvidia-smi" {code} But when running {{./src/mesos-execute}}, it errors like below: {code} I0822 18:45:55.423899 8821 scheduler.cpp:172] Version: 1.0.1 I0822 18:45:55.426172 8821 scheduler.cpp:461] New master detected at master@10.103.0.125:5050 Subscribed with ID '34126b61-9d41-48dd-9c85-b61e4f9ad4c9-0001' Submitted task 'test' to agent 'b6c1587d-ab88-4734-9cb3-2cb916a73bf8-S1' Received status update TASK_FAILED for task 'test' message: 'Failed to launch container: Failed to decode HTTP responses: No response decoded HTTP/1.1 200 Connection established HTTP/1.1 401 Unauthorized Content-Type: application/json; charset=utf-8 Docker-Distribution-Api-Version: registry/2.0 Www-Authenticate: Bearer realm="https://auth.docker.io/token",service="registry.docker.io",scope="repository:nvidia/cuda:pull; Date: Mon, 22 Aug 2016 10:46:25 GMT Content-Length: 143 Strict-Transport-Security: max-age=31536000 {"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":[{"Type":"repository","Name":"nvidia/cuda","Action":"pull"}]}]} ; Container destroyed while provisioning images' source: 
SOURCE_AGENT reason: REASON_CONTAINER_LAUNCH_FAILED {code} Docker itself works well; I can use {{docker pull}} to pull the image. And if I set the agent flag {{--docker_registry}} to a local path (e.g. {{/tmp/docker/images}}) in which Docker image archives (the result of {{docker save}}) are stored, mesos-execute works well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)