[jira] [Commented] (MESOS-7622) Agent can crash if a HTTP executor tries to retry subscription in running state.

2019-01-07 Thread Joseph Wu (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16736216#comment-16736216
 ] 

Joseph Wu commented on MESOS-7622:
--

No, the executor changes I was making (MESOS-7564) touch the executor 
subscription code, but shouldn't affect how/when the executor decides to 
register.

Without diving too deeply, these two logs stand out:
{code}
I0605 14:58:25.247808 10718 slave.cpp:3825] Got registration for executor 
'testapp-cc6e64001fee44e3a20d7a15149d8b34' of framework 
b9d7ab7a-f123-4a7c-bfda-07c483ece870-0001 from executor(1)@127.0.1.1:42459
{code}
{code}
I0605 14:58:25.352342 10712 slave.cpp:3609] Received Subscribe request for HTTP 
executor 'testapp-cc6e64001fee44e3a20d7a15149d8b34' of framework 
b9d7ab7a-f123-4a7c-bfda-07c483ece870-0001 at executor(1)@127.0.1.1:42459
{code}
The same executor registers twice, once as a PID executor, and once as an HTTP 
executor.  The timestamps are close enough to suggest both registrations are 
happening at the same time.
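
For context, a minimal sketch (hypothetical states and handler, not the 
actual slave.cpp logic) of how an agent could tolerate a retried 
subscription instead of crashing as described in this ticket:
{code}
#include <iostream>

// Hypothetical executor states, mirroring the agent's view of an executor.
enum class ExecutorState { REGISTERING, RUNNING, TERMINATING };

// Defensive Subscribe handling: a retried subscription from a RUNNING
// executor is treated as a re-subscription rather than a fatal error.
void handleSubscribe(ExecutorState state)
{
  switch (state) {
    case ExecutorState::REGISTERING:
      std::cout << "First subscription: admit the executor" << std::endl;
      break;
    case ExecutorState::RUNNING:
      // Previously this path could crash the agent; instead, refresh the
      // connection and resend any unacknowledged events.
      std::cout << "Retried subscription: refresh connection" << std::endl;
      break;
    case ExecutorState::TERMINATING:
      std::cout << "Ignore subscription from a terminating executor"
                << std::endl;
      break;
  }
}
{code}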

> Agent can crash if a HTTP executor tries to retry subscription in running 
> state.
> 
>
> Key: MESOS-7622
> URL: https://issues.apache.org/jira/browse/MESOS-7622
> Project: Mesos
>  Issue Type: Bug
>  Components: agent, executor
>Affects Versions: 1.2.2
>Reporter: Aaron Wood
>Priority: Critical
>  Labels: foundations
>
> It is possible that a running executor might retry its subscribe request. 
> This can lead to a crash if it previously had any launched tasks. Note that 
> the executor would still be able to subscribe again when the agent process 
> restarts and is recovering.
> {code}
> sudo ./mesos-agent --master=10.0.2.15:5050 --work_dir=/tmp/slave 
> --isolation=cgroups/cpu,cgroups/mem,disk/du,network/cni,filesystem/linux,docker/runtime
>  --image_providers=docker --image_provisioner_backend=overlay 
> --containerizers=mesos --launcher_dir=$(pwd) 
> --executor_environment_variables='{"LD_LIBRARY_PATH": 
> "/home/aaron/Code/src/mesos/build/src/.libs"}'
> WARNING: Logging before InitGoogleLogging() is written to STDERR
> I0605 14:58:23.748180 10710 main.cpp:323] Build: 2017-06-02 17:09:05 UTC by 
> aaron
> I0605 14:58:23.748252 10710 main.cpp:324] Version: 1.4.0
> I0605 14:58:23.755409 10710 systemd.cpp:238] systemd version `232` detected
> I0605 14:58:23.755450 10710 main.cpp:433] Initializing systemd state
> I0605 14:58:23.763049 10710 systemd.cpp:326] Started systemd slice 
> `mesos_executors.slice`
> I0605 14:58:23.763777 10710 resolver.cpp:69] Creating default secret resolver
> I0605 14:58:23.764214 10710 containerizer.cpp:230] Using isolation: 
> cgroups/cpu,cgroups/mem,disk/du,network/cni,filesystem/linux,docker/runtime,volume/image,environment_secret
> I0605 14:58:23.767192 10710 linux_launcher.cpp:150] Using 
> /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
> E0605 14:58:23.770179 10710 shell.hpp:107] Command 'hadoop version 2>&1' 
> failed; this is the output:
> sh: 1: hadoop: not found
> I0605 14:58:23.770217 10710 fetcher.cpp:69] Skipping URI fetcher plugin 
> 'hadoop' as it could not be created: Failed to create HDFS client: Failed to 
> execute 'hadoop version 2>&1'; the command was either not found or exited 
> with a non-zero exit status: 127
> I0605 14:58:23.770643 10710 provisioner.cpp:255] Using default backend 
> 'overlay'
> I0605 14:58:23.785892 10710 slave.cpp:248] Mesos agent started on 
> (1)@127.0.1.1:5051
> I0605 14:58:23.785957 10710 slave.cpp:249] Flags at startup: 
> --appc_simple_discovery_uri_prefix="http://" 
> --appc_store_dir="/tmp/mesos/store/appc" --authenticate_http_readonly="false" 
> --authenticate_http_readwrite="false" --authenticatee="crammd5" 
> --authentication_backoff_factor="1secs" --authorizer="local" 
> --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" 
> --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" 
> --cgroups_root="mesos" --container_disk_watch_interval="15secs" 
> --containerizers="mesos" --default_role="*" --disk_watch_interval="1mins" 
> --docker="docker" --docker_kill_orphans="true" 
> --docker_registry="https://registry-1.docker.io; --docker_remove_delay="6hrs" 
> --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" 
> --docker_store_dir="/tmp/mesos/store/docker" 
> --docker_volume_checkpoint_dir="/var/run/mesos/isolators/docker/volume" 
> --enforce_container_disk_quota="false" 
> --executor_environment_variables="{"LD_LIBRARY_PATH":"\/home\/aaron\/Code\/src\/mesos\/build\/src\/.libs"}"
>  --executor_registration_timeout="1mins" 
> --executor_reregistration_timeout="2secs" 
> --executor_shutdown_grace_period="5secs" 
> --fetcher_cache_dir="/tmp/mesos/fetch" --fetcher_cache_size="2GB" 
> --frameworks_home="" 

[jira] [Assigned] (MESOS-9495) Test `MasterTest.CreateVolumesV1AuthorizationFailure` is flaky.

2019-01-07 Thread Benjamin Mahler (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-9495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler reassigned MESOS-9495:
--

Assignee: Benjamin Mahler

> Test `MasterTest.CreateVolumesV1AuthorizationFailure` is flaky.
> ---
>
> Key: MESOS-9495
> URL: https://issues.apache.org/jira/browse/MESOS-9495
> Project: Mesos
>  Issue Type: Bug
>  Components: test
>Affects Versions: 1.5.0, 1.5.1, 1.6.0, 1.6.1, 1.7.0
>Reporter: Chun-Hung Hsiao
>Assignee: Benjamin Mahler
>Priority: Major
>  Labels: flaky-test, resource-management
> Attachments: 
> mesos-ec2-centos-7-CMake.Mesos.MasterTest.CreateVolumesV1AuthorizationFailure-badrun.txt
>
>
> {noformat}
> I1219 22:45:59.578233 26107 slave.cpp:1884] Will retry registration in 
> 2.10132ms if necessary
> I1219 22:45:59.578615 26107 master.cpp:6125] Received register agent message 
> from slave(463)@172.16.10.13:35739 (ip-172-16-10-13.ec2.internal)
> I1219 22:45:59.578830 26107 master.cpp:3871] Authorizing agent with principal 
> 'test-principal'
> I1219 22:45:59.578975 26107 master.cpp:6183] Authorized registration of agent 
> at slave(463)@172.16.10.13:35739 (ip-172-16-10-13.ec2.internal)
> I1219 22:45:59.579039 26107 master.cpp:6294] Registering agent at 
> slave(463)@172.16.10.13:35739 (ip-172-16-10-13.ec2.internal) with id 
> 85292fcc-b698-4377-9faa-f76b0ccd4ee5-S0
> I1219 22:45:59.579540 26107 registrar.cpp:495] Applied 1 operations in 
> 143852ns; attempting to update the registry
> I1219 22:45:59.580102 26109 registrar.cpp:552] Successfully updated the 
> registry in 510208ns
> I1219 22:45:59.580312 26109 master.cpp:6342] Admitted agent 
> 85292fcc-b698-4377-9faa-f76b0ccd4ee5-S0 at slave(463)@172.16.10.13:35739 
> (ip-172-16-10-13.ec2.internal)
> I1219 22:45:59.580968 26111 slave.cpp:1884] Will retry registration in 
> 23.973874ms if necessary
> I1219 22:45:59.581447 26111 slave.cpp:1486] Registered with master 
> master@172.16.10.13:35739; given agent ID 
> 85292fcc-b698-4377-9faa-f76b0ccd4ee5-S0
> ...
> I1219 22:45:59.580950 26109 master.cpp:6391] Registered agent 
> 85292fcc-b698-4377-9faa-f76b0ccd4ee5-S0 at slave(463)@172.16.10.13:35739 
> (ip-172-16-10-13.ec2.internal) with disk(reservations: 
> [(STATIC,role1)]):1024; cpus:2; mem:6796; ports:[31000-32000]
> I1219 22:45:59.583326 26109 master.cpp:6125] Received register agent message 
> from slave(463)@172.16.10.13:35739 (ip-172-16-10-13.ec2.internal)
> I1219 22:45:59.583524 26109 master.cpp:3871] Authorizing agent with principal 
> 'test-principal'
> ...
> W1219 22:45:59.584242 26109 master.cpp:6175] Refusing registration of agent 
> at slave(463)@172.16.10.13:35739 (ip-172-16-10-13.ec2.internal): 
> Authorization failure: Authorizer failure
> ...
> I1219 22:45:59.586944 26113 http.cpp:1185] HTTP POST for /master/api/v1 from 
> 172.16.10.13:47412
> I1219 22:45:59.587129 26113 http.cpp:682] Processing call CREATE_VOLUMES
> /home/centos/workspace/mesos/Mesos_CI-build/FLAG/CMake/label/mesos-ec2-centos-7/mesos/src/tests/master_tests.cpp:9386:
>  Failure
> Mock function called more times than expected - returning default value.
> Function call: authorized(@0x7f5066524720 48-byte object  50-7F 00-00 00-00 00-00 00-00 00-00 07-00 00-00 00-00 00-00 10-4E 02-48 50-7F 
> 00-00 E0-4C 02-48 50-7F 00-00 06-00 00-00 50-7F 00-00>)
>   Returns: Abandoned
>  Expected: to be called once
>Actual: called twice - over-saturated and active
> I1219 22:45:59.587761 26113 master.cpp:3811] Authorizing principal 
> 'test-principal' to create volumes 
> '[{"disk":{"persistence":{"id":"id1","principal":"test-principal"},"volume":{"container_path":"path1","mode":"RW"}},"name":"disk","reservations":[{"role":"role1","type":"STATIC"}],"scalar":{"value":64.0},"type":"SCALAR"}]'
> ...
> /home/centos/workspace/mesos/Mesos_CI-build/FLAG/CMake/label/mesos-ec2-centos-7/mesos/src/tests/master_tests.cpp:9398:
>  Failure
> Failed to wait 15secs for response{noformat}
> This is because we authorize the retried registration before dropping it.
> Full log: 
> [^mesos-ec2-centos-7-CMake.Mesos.MasterTest.CreateVolumesV1AuthorizationFailure-badrun.txt]
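
The flake arises because the retried registration is authorized before being 
dropped, so the authorizer mock is called twice. A minimal self-contained 
gmock sketch (hypothetical mock, not the actual MasterTest code) of the 
symptom and of one way a test can tolerate a benign retry:
{code}
#include <gmock/gmock.h>
#include <gtest/gtest.h>

using ::testing::_;
using ::testing::AtMost;
using ::testing::Return;

// Hypothetical stand-in for the authorizer used by the master.
struct MockAuthorizer
{
  MOCK_METHOD(bool, authorized, (int request));
};

TEST(RetriedRegistration, DoesNotOverSaturateExpectation)
{
  MockAuthorizer authorizer;

  // EXPECT_CALL(...).WillOnce(...) would over-saturate on the retry;
  // allowing up to two calls absorbs the agent's retried registration.
  EXPECT_CALL(authorizer, authorized(_))
      .Times(AtMost(2))
      .WillRepeatedly(Return(true));

  authorizer.authorized(1);  // initial registration
  authorizer.authorized(1);  // retried registration (the race in this bug)
}
{code}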



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (MESOS-9130) Test `StorageLocalResourceProviderTest.ROOT_ContainerTerminationMetric` is flaky.

2019-01-07 Thread Benjamin Bannier (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-9130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Bannier reassigned MESOS-9130:
---

Assignee: Benjamin Bannier

> Test `StorageLocalResourceProviderTest.ROOT_ContainerTerminationMetric` is 
> flaky.
> -
>
> Key: MESOS-9130
> URL: https://issues.apache.org/jira/browse/MESOS-9130
> Project: Mesos
>  Issue Type: Bug
>  Components: resource provider, storage
>Affects Versions: 1.6.0, 1.7.0
>Reporter: Chun-Hung Hsiao
>Assignee: Benjamin Bannier
>Priority: Major
>  Labels: mesosphere, storage
> Attachments: test.log
>
>
> This test is flaky and can fail with the following error:
> {noformat}
> ../../src/tests/storage_local_resource_provider_tests.cpp:3167
> Failed to wait 15secs for pluginRestarted{noformat}
> The actual error is the following:
> {noformat}
> E0802 22:13:37.265038  8216 provider.cpp:1496] Failed to reconcile resource 
> provider b9379982-d990-4f63-8a5b-10edd4f5a1bb: Collect failed: OS 
> Error{noformat}
> The root cause is that the SLRP calls {{ListVolumes}} and {{GetCapacity}} 
> when starting up, and if the plugin container is killed while these calls 
> are in flight, gRPC returns an {{OS Error}}, which causes the SLRP to fail.
> This flakiness will be fixed once we finish 
> https://issues.apache.org/jira/browse/MESOS-8400.
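
Until then, a minimal retry sketch (hypothetical helper, not the MESOS-8400 
implementation) of how such transient RPC failures could be absorbed with 
backoff instead of failing the resource provider:
{code}
#include <chrono>
#include <functional>
#include <thread>

// Retry a call whose result may be transiently bad (e.g., the plugin
// container died mid-RPC), with exponential backoff between attempts.
template <typename T>
T retryWithBackoff(
    const std::function<T()>& call,
    const std::function<bool(const T&)>& isTransient,
    int maxAttempts = 5)
{
  std::chrono::milliseconds delay(100);
  for (int attempt = 1;; ++attempt) {
    T result = call();
    if (!isTransient(result) || attempt == maxAttempts) {
      return result;
    }
    std::this_thread::sleep_for(delay);
    delay *= 2;  // back off before re-issuing the call
  }
}
{code}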



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (MESOS-9512) Expose standalone containers in the webui

2019-01-07 Thread Benjamin Bannier (JIRA)
Benjamin Bannier created MESOS-9512:
---

 Summary: Expose standalone containers in the webui
 Key: MESOS-9512
 URL: https://issues.apache.org/jira/browse/MESOS-9512
 Project: Mesos
  Issue Type: Improvement
  Components: agent, containerization, webui
Reporter: Benjamin Bannier


We should expose standalone containers in the webui, just like we already 
expose task containers. This would, e.g., allow users to investigate issues 
more easily by exposing standalone container logs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (MESOS-9141) Consider adding restrictions to disk profile names.

2019-01-07 Thread Benjamin Bannier (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16735779#comment-16735779
 ] 

Benjamin Bannier commented on MESOS-9141:
-

Currently, the storage local resource provider (SLRP) just requests profiles 
from the disk profile adaptor and applies them fairly mechanically to 
resources. Profile names are never injected by Mesos itself, but only by 
custom modules (the default module implementation never returns any profiles, 
and the {{UriDiskProfileAdaptor}} reads from a URI). Since the SLRP also 
currently never, e.g., constructs filesystem paths from profile names, I 
wonder why we would even want to restrict profile names at all. It seems this 
would be a concern of the entity managing the profile lifetime, i.e., not a 
Mesos concern.

[~chhsia0], shall we close this issue as {{WONT_FIX}}?

> Consider adding restrictions to disk profile names.
> ---
>
> Key: MESOS-9141
> URL: https://issues.apache.org/jira/browse/MESOS-9141
> Project: Mesos
>  Issue Type: Improvement
>  Components: storage
>Reporter: Chun-Hung Hsiao
>Priority: Critical
>  Labels: mesosphere, storage
>
> We should add some restrictions to profile names. We could consider adding 
> the following rules:
> 1. A profile name must not be empty.
> 2. A profile name must have at most 128 characters.
> 3. A profile name must consist of alphanumeric characters ({{[a-zA-Z0-9]}}), 
> dashes ({{-}}), underscores ({{_}}), or dots ({{.}}). We might want to 
> consider slashes ({{/}}) or percent signs ({{%}}) as well.
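
The rules proposed above, as a validation helper sketch (hypothetical 
function; nothing like this is currently enforced in Mesos):
{code}
#include <cctype>
#include <string>

// Validate a profile name against the proposed rules: non-empty, at most
// 128 characters, and limited to alphanumerics, dashes, underscores, dots.
bool isValidProfileName(const std::string& name)
{
  if (name.empty() || name.size() > 128) {
    return false;
  }

  for (char c : name) {
    if (!std::isalnum(static_cast<unsigned char>(c)) &&
        c != '-' && c != '_' && c != '.') {
      return false;
    }
  }

  return true;
}
{code}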



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (MESOS-6632) ContainerLogger might leak FD if container launch fails.

2019-01-07 Thread Gilbert Song (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-6632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16735745#comment-16735745
 ] 

Gilbert Song commented on MESOS-6632:
-

https://reviews.apache.org/r/69681/

> ContainerLogger might leak FD if container launch fails.
> 
>
> Key: MESOS-6632
> URL: https://issues.apache.org/jira/browse/MESOS-6632
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 0.28.2, 1.0.1, 1.1.0
>Reporter: Jie Yu
>Priority: Critical
>
> In MesosContainerizer, if logger->prepare() succeeds but its continuation 
> fails, the pipe fd allocated in the logger is leaked. We cannot add a 
> destructor in ContainerLogger::SubprocessInfo to close the fd, because 
> subprocess might close the OWNED fd.
> An FD abstraction might help here. In other words, subprocess would no 
> longer be responsible for closing external FDs; instead, the FD destructor 
> would do so.
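
A minimal sketch of such an FD abstraction (hypothetical class, not the 
stout/libprocess implementation): ownership is explicit and move-only, and 
the destructor closes the descriptor, so a failed launch cannot leak the 
logger's pipe fds:
{code}
#include <unistd.h>

// Move-only owner of a file descriptor: exactly one instance is
// responsible for closing it, and destruction closes it automatically.
class OwnedFD
{
public:
  explicit OwnedFD(int fd) : fd_(fd) {}

  OwnedFD(OwnedFD&& other) : fd_(other.fd_) { other.fd_ = -1; }
  OwnedFD(const OwnedFD&) = delete;
  OwnedFD& operator=(const OwnedFD&) = delete;

  ~OwnedFD()
  {
    if (fd_ >= 0) {
      ::close(fd_);  // closes even if the container launch failed early
    }
  }

  int get() const { return fd_; }

private:
  int fd_;
};
{code}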



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (MESOS-9434) Completed framework update streams may retry forever

2019-01-07 Thread Benjamin Bannier (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16735708#comment-16735708
 ] 

Benjamin Bannier commented on MESOS-9434:
-

A similar issue exists around when a {{ShutdownFrameworkMessage}} can be 
sent by a master. If an agent is partitioned from the cluster for a long time 
and does not know that a framework completed (it was partitioned at the time 
of completion), then after a master failover and resubscription of the agent, 
the new master would 1) not know that the framework completed, and would even 
2) learn about the framework from the resubscribed agent.

As we currently do not reliably handle this case either, the first suggestion 
above seems more consistent (i.e., have the master acknowledge operation 
status updates of frameworks it knows are removed). Note that should we, 
e.g., persist completed {{FrameworkID}} values in the future, this solution 
would continue to work naturally.

The second suggestion above, of masters explicitly informing status update 
managers of framework completion, does not work reliably either when status 
update managers are partitioned at the time of completion and the master 
subsequently fails over.
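
A sketch of that first option in master-side pseudologic (hypothetical 
names; the in-memory set is exactly what a failover loses, per the caveat 
in the ticket):
{code}
#include <set>
#include <string>

// The master's in-memory view of completed frameworks. Because it is not
// persisted, a failed-over master may not recognize a completed framework.
std::set<std::string> completedFrameworkIds;

// Acknowledge operation status updates for frameworks the master knows
// are completed, so the agent/RP update stream stops retrying forever.
bool shouldAcknowledgeForCompletedFramework(const std::string& frameworkId)
{
  return completedFrameworkIds.count(frameworkId) > 0;
}
{code}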

> Completed framework update streams may retry forever
> 
>
> Key: MESOS-9434
> URL: https://issues.apache.org/jira/browse/MESOS-9434
> Project: Mesos
>  Issue Type: Bug
>  Components: agent, resource provider
>Affects Versions: 1.7.0
>Reporter: Greg Mann
>Assignee: Benjamin Bannier
>Priority: Major
>  Labels: mesosphere
>
> Since the agent/RP currently does not GC operation status update streams when 
> frameworks are torn down, it's possible that active update streams associated 
> with completed frameworks may remain and continue retrying forever. We should 
> add a mechanism to complete these streams when the framework becomes 
> completed.
> A couple options which have come up during discussion:
> * Have the master acknowledge updates associated with completed frameworks. 
> Note that since completed frameworks are currently only tracked by the master 
> in memory, a master failover could prevent this from working perfectly.
> * Extend the RP API to allow the GC of particular update streams, and have 
> the agent GC streams associated with a framework when it receives a 
> {{ShutdownFrameworkMessage}}. This would also require the addition of a new 
> method to the status update manager.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (MESOS-9511) update for terminal task, destroying container: Container not found

2019-01-07 Thread binmes (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16735574#comment-16735574
 ] 

binmes commented on MESOS-9511:
---

My environment is CentOS 7.

> update for terminal task, destroying container: Container not found
> ---
>
> Key: MESOS-9511
> URL: https://issues.apache.org/jira/browse/MESOS-9511
> Project: Mesos
>  Issue Type: Bug
>  Components: agent
>Affects Versions: 1.7.0
>Reporter: binmes
>Priority: Major
>
> I started one master service, one Marathon service, and one slave service, 
> but when I started the slave service, the following error occurred:
> 5ca13 on status update for terminal task, destroying container: Container not 
> found
> E0107 23:55:26.960202 33227 slave.cpp:5621] Failed to update resources for 
> container 0c77b766-515e-4d05-9ae8-b8f054b8b960 of executor 
> 'basic-113.a5292e7a-1294-11e9-b8bb-0242baf5ca13' running task 
> basic-113.a5292e7a-1294-11e9-b8bb-0242baf5ca13 on status update for terminal 
> task, destroying container: Container not found
> E0107 23:55:46.298625 33225 slave.cpp:5621] Failed to update resources for 
> container eb26ff35-3d39-4466-8564-0236e5619117 of executor 
> 'marathon-docker-demo.b120b04b-1294-11e9-b8bb-0242baf5ca13' running task 
> marathon-docker-demo.b120b04b-1294-11e9-b8bb-0242baf5ca13 on status update 
> for terminal task, destroying container: Container not found
> E0107 23:56:02.795354 33231 slave.cpp:5621] Failed to update resources for 
> container 877c2192-d15e-46a8-b1cc-1b1612939108 of executor 
> 'basic-113.ba2b997c-1294-11e9-b8bb-0242baf5ca13' running task 
> basic-113.ba2b997c-1294-11e9-b8bb-0242baf5ca13 on status update for terminal 
> task, destroying container: Container not found
> E0107 23:56:40.045552 33230 slave.cpp:5621] Failed to update resources for 
> container d3e57aa7-ca91-4473-8c2e-6c1a84f1c6fb of executor 
> 'marathon-docker-demo.c8a29a3d-1294-11e9-b8bb-0242baf5ca13' running task 
> marathon-docker-demo.c8a29a3d-1294-11e9-b8bb-0242baf5ca13 on status update 
> for terminal task, destroying container: Container not found
> E0107 23:56:44.222923 33226 slave.cpp:5621] Failed to update resources for 
> container 37887d3b-5e11-4298-8720-b18c743d9241 of executor 
> 'basic-113.d28768ae-1294-11e9-b8bb-0242baf5ca13' running task 
> basic-113.d28768ae-1294-11e9-b8bb-0242baf5ca13 on status update for terminal 
> task, destroying container: Container not found
> E0107 23:57:25.470654 33227 slave.cpp:5621] Failed to update resources for 
> container b3fd71dd-5b48-4c09-adbe-980e0209d1af of executor 
> 'marathon-docker-demo.ec0fd0ff-1294-11e9-b8bb-0242baf5ca13' running task 
> marathon-docker-demo.ec0fd0ff-1294-11e9-b8bb-0242baf5ca13 on status update 
> for terminal task, destroying container: Container not found
> E0107 23:57:30.780771 33232 slave.cpp:5621] Failed to update resources for 
> container b667b717-b77b-47c4-a605-2d6677a4c458 of executor 
> 'basic-113.ef0d0b70-1294-11e9-b8bb-0242baf5ca13' running task 
> basic-113.ef0d0b70-1294-11e9-b8bb-0242baf5ca13 on status update for terminal 
> task, destroying container: Container not found
> E0107 23:58:17.314245 33226 slave.cpp:5621] Failed to update resources for 
> container 89fab210-5c09-4703-bd77-0fd023a69560 of executor 
> 'marathon-docker-demo.0b08e4c1-1295-11e9-b8bb-0242baf5ca13' running task 
> marathon-docker-demo.0b08e4c1-1295-11e9-b8bb-0242baf5ca13 on status update 
> for terminal task, destroying container: Container not found
> E0107 23:58:42.217229 33232 slave.cpp:5621] Failed to update resources for 
> container f558ba56-a72c-4cad-a9b8-760d6956f4d1 of executor 
> 'basic-113.1103a7c2-1295-11e9-b8bb-0242baf5ca13' running task 
> basic-113.1103a7c2-1295-11e9-b8bb-0242baf5ca13 on status update for terminal 
> task, destroying container: Container not found
> E0107 23:59:16.524421 33227 slave.cpp:5621] Failed to update resources for 
> container f24a9fb4-f172-46a8-a29f-0c0b65d9ff3e of executor 
> 'marathon-docker-demo.2e679c93-1295-11e9-b8bb-0242baf5ca13' running task 
> marathon-docker-demo.2e679c93-1295-11e9-b8bb-0242baf5ca13 on status update 
> for terminal task, destroying container: Container not found
> E0107 23:59:57.159930 33231 slave.cpp:5621] Failed to update resources for 
> container 50b8e03e-3157-4259-beac-aa94e9e26013 of executor 
> 'basic-113.3d3f33e4-1295-11e9-b8bb-0242baf5ca13' running task 
> basic-113.3d3f33e4-1295-11e9-b8bb-0242baf5ca13 on status update for terminal 
> task, destroying container: Container not found
> E0108 00:00:24.637122 33226 slave.cpp:5621] Failed to update resources for 
> container 1a617a56-b5bb-4755-9c00-c333522ca0ff of executor 
> 'marathon-docker-demo.56d753a5-1295-11e9-b8bb-0242baf5ca13' running task 
> marathon-docker-demo.56d753a5-1295-11e9-b8bb-0242baf5ca13 on status update 
> 

[jira] [Created] (MESOS-9511) update for terminal task, destroying container: Container not found

2019-01-07 Thread binmes (JIRA)
binmes created MESOS-9511:
-

 Summary: update for terminal task, destroying container: Container 
not found
 Key: MESOS-9511
 URL: https://issues.apache.org/jira/browse/MESOS-9511
 Project: Mesos
  Issue Type: Bug
  Components: agent
Affects Versions: 1.7.0
Reporter: binmes


I started one master service, one Marathon service, and one slave service, 
but when I started the slave service, the following error occurred:

5ca13 on status update for terminal task, destroying container: Container not 
found
E0107 23:55:26.960202 33227 slave.cpp:5621] Failed to update resources for 
container 0c77b766-515e-4d05-9ae8-b8f054b8b960 of executor 
'basic-113.a5292e7a-1294-11e9-b8bb-0242baf5ca13' running task 
basic-113.a5292e7a-1294-11e9-b8bb-0242baf5ca13 on status update for terminal 
task, destroying container: Container not found
E0107 23:55:46.298625 33225 slave.cpp:5621] Failed to update resources for 
container eb26ff35-3d39-4466-8564-0236e5619117 of executor 
'marathon-docker-demo.b120b04b-1294-11e9-b8bb-0242baf5ca13' running task 
marathon-docker-demo.b120b04b-1294-11e9-b8bb-0242baf5ca13 on status update for 
terminal task, destroying container: Container not found
E0107 23:56:02.795354 33231 slave.cpp:5621] Failed to update resources for 
container 877c2192-d15e-46a8-b1cc-1b1612939108 of executor 
'basic-113.ba2b997c-1294-11e9-b8bb-0242baf5ca13' running task 
basic-113.ba2b997c-1294-11e9-b8bb-0242baf5ca13 on status update for terminal 
task, destroying container: Container not found
E0107 23:56:40.045552 33230 slave.cpp:5621] Failed to update resources for 
container d3e57aa7-ca91-4473-8c2e-6c1a84f1c6fb of executor 
'marathon-docker-demo.c8a29a3d-1294-11e9-b8bb-0242baf5ca13' running task 
marathon-docker-demo.c8a29a3d-1294-11e9-b8bb-0242baf5ca13 on status update for 
terminal task, destroying container: Container not found
E0107 23:56:44.222923 33226 slave.cpp:5621] Failed to update resources for 
container 37887d3b-5e11-4298-8720-b18c743d9241 of executor 
'basic-113.d28768ae-1294-11e9-b8bb-0242baf5ca13' running task 
basic-113.d28768ae-1294-11e9-b8bb-0242baf5ca13 on status update for terminal 
task, destroying container: Container not found
E0107 23:57:25.470654 33227 slave.cpp:5621] Failed to update resources for 
container b3fd71dd-5b48-4c09-adbe-980e0209d1af of executor 
'marathon-docker-demo.ec0fd0ff-1294-11e9-b8bb-0242baf5ca13' running task 
marathon-docker-demo.ec0fd0ff-1294-11e9-b8bb-0242baf5ca13 on status update for 
terminal task, destroying container: Container not found
E0107 23:57:30.780771 33232 slave.cpp:5621] Failed to update resources for 
container b667b717-b77b-47c4-a605-2d6677a4c458 of executor 
'basic-113.ef0d0b70-1294-11e9-b8bb-0242baf5ca13' running task 
basic-113.ef0d0b70-1294-11e9-b8bb-0242baf5ca13 on status update for terminal 
task, destroying container: Container not found
E0107 23:58:17.314245 33226 slave.cpp:5621] Failed to update resources for 
container 89fab210-5c09-4703-bd77-0fd023a69560 of executor 
'marathon-docker-demo.0b08e4c1-1295-11e9-b8bb-0242baf5ca13' running task 
marathon-docker-demo.0b08e4c1-1295-11e9-b8bb-0242baf5ca13 on status update for 
terminal task, destroying container: Container not found
E0107 23:58:42.217229 33232 slave.cpp:5621] Failed to update resources for 
container f558ba56-a72c-4cad-a9b8-760d6956f4d1 of executor 
'basic-113.1103a7c2-1295-11e9-b8bb-0242baf5ca13' running task 
basic-113.1103a7c2-1295-11e9-b8bb-0242baf5ca13 on status update for terminal 
task, destroying container: Container not found
E0107 23:59:16.524421 33227 slave.cpp:5621] Failed to update resources for 
container f24a9fb4-f172-46a8-a29f-0c0b65d9ff3e of executor 
'marathon-docker-demo.2e679c93-1295-11e9-b8bb-0242baf5ca13' running task 
marathon-docker-demo.2e679c93-1295-11e9-b8bb-0242baf5ca13 on status update for 
terminal task, destroying container: Container not found
E0107 23:59:57.159930 33231 slave.cpp:5621] Failed to update resources for 
container 50b8e03e-3157-4259-beac-aa94e9e26013 of executor 
'basic-113.3d3f33e4-1295-11e9-b8bb-0242baf5ca13' running task 
basic-113.3d3f33e4-1295-11e9-b8bb-0242baf5ca13 on status update for terminal 
task, destroying container: Container not found
E0108 00:00:24.637122 33226 slave.cpp:5621] Failed to update resources for 
container 1a617a56-b5bb-4755-9c00-c333522ca0ff of executor 
'marathon-docker-demo.56d753a5-1295-11e9-b8bb-0242baf5ca13' running task 
marathon-docker-demo.56d753a5-1295-11e9-b8bb-0242baf5ca13 on status update for 
terminal task, destroying container: Container not found
E0108 00:01:05.445214 33227 slave.cpp:5621] Failed to update resources for 
container c71bc4eb-89b7-4341-8548-8d999adcefbc of executor 
'basic-113.6f11df36-1295-11e9-b8bb-0242baf5ca13' running task 
basic-113.6f11df36-1295-11e9-b8bb-0242baf5ca13 on status update for terminal 
task, destroying container: Container not found
E0108 00:01:57.342438