[jira] [Updated] (MESOS-7601) Some container launch failures are mistakenly treated as errors.

2017-11-08 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-7601:
---
Sprint: Mesosphere Sprint 59, Mesosphere Sprint 62, Mesosphere Sprint 63, 
Mesosphere Sprint 64, Mesosphere Sprint 66  (was: Mesosphere Sprint 59, 
Mesosphere Sprint 62, Mesosphere Sprint 63, Mesosphere Sprint 64, Mesosphere 
Sprint 66, Mesosphere Sprint 67)

> Some container launch failures are mistakenly treated as errors.
> 
>
> Key: MESOS-7601
> URL: https://issues.apache.org/jira/browse/MESOS-7601
> Project: Mesos
> Issue Type: Bug
> Components: containerization
> Affects Versions: 1.3.0
> Reporter: Alexander Rukletsov
> Assignee: Alexander Rukletsov
> Labels: containerizer, mesosphere, tech-debt
>
> I've observed a case where a scheduler stops (i.e. calls TEARDOWN) while some 
> of its tasks are being launched. While this is valid behaviour, the agent 
> prints an error and increments the container launch error metrics.
> Below are log excerpts for one such framework, 
> {{6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092}}.
> *Master log*
> {noformat}
> [centos@ip-172-31-6-200 ~]$ journalctl _PID=29716 --since "2 hours ago" 
> --no-pager | grep 
> "6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092"
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.226218 29724 master.cpp:6072] Updating 
> info for framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.226356 29728 hierarchical.cpp:274] Added 
> framework 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.226405 29728 hierarchical.cpp:379] 
> Deactivated framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.228570 29728 hierarchical.cpp:343] 
> Activated framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.246068 29721 master.cpp:7105] Sending 1 
> offers to framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.247851 29721 master.cpp:7194] Sending 1 
> inverse offers to framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.912937 29728 master.cpp:4806] Processing 
> DECLINE call for offers: [ 92434aef-27da-4fd1-a5c4-b286d640d5b3-O509464 ] for 
> framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:32:59 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:59.804184 29727 master.cpp:7105] Sending 2 
> offers to framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:32:59 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:59.804411 29727 master.cpp:7194] Sending 2 
> inverse offers to framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:33:01 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:33:01.248924 29721 master.cpp:7105] Sending 2 
> offers to framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:33:01 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:33:01.249289 29721 master.cpp:7194] Sending 2 
> inverse offers to framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:33:01 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:33:01.249724 29721 master.cpp:3851] Processing 
> ACCEPT call for offers: [ 
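
The distinction the issue asks for can be illustrated with a minimal sketch: a launch that fails because its framework is already being torn down is reported separately from a genuine launch error. This is not Mesos code. The names `LaunchOutcome`, `LaunchMetrics`, `classify_launch` and `record` are hypothetical, and the counters do not correspond to real Mesos metric names.

```python
from dataclasses import dataclass
from enum import Enum


class LaunchOutcome(Enum):
    SUCCEEDED = "succeeded"
    ABANDONED = "abandoned"  # launch cut short by an expected framework teardown
    FAILED = "failed"        # genuine launch error


@dataclass
class LaunchMetrics:
    # Hypothetical counters for illustration; not real Mesos metric names.
    launch_errors: int = 0
    launches_abandoned: int = 0


def classify_launch(launch_ok: bool, framework_terminating: bool) -> LaunchOutcome:
    """Decide how a finished launch attempt should be reported."""
    if launch_ok:
        return LaunchOutcome.SUCCEEDED
    if framework_terminating:
        # The scheduler called TEARDOWN mid-launch: expected, not an error.
        return LaunchOutcome.ABANDONED
    return LaunchOutcome.FAILED


def record(outcome: LaunchOutcome, metrics: LaunchMetrics) -> str:
    """Update the counters and return the log severity to use."""
    if outcome is LaunchOutcome.FAILED:
        metrics.launch_errors += 1
        return "ERROR"
    if outcome is LaunchOutcome.ABANDONED:
        metrics.launches_abandoned += 1
    return "INFO"
```

Under these assumptions, only the `FAILED` outcome increments the error counter and is logged at error severity. A launch interrupted by a teardown is counted separately and logged as informational.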

[jira] [Updated] (MESOS-7601) Some container launch failures are mistakenly treated as errors.

2017-10-30 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-7601:
--
Sprint: Mesosphere Sprint 59, Mesosphere Sprint 62, Mesosphere Sprint 63, 
Mesosphere Sprint 64, Mesosphere Sprint 66, Mesosphere Sprint 67  (was: 
Mesosphere Sprint 59, Mesosphere Sprint 62, Mesosphere Sprint 63, Mesosphere 
Sprint 64, Mesosphere Sprint 66)

[jira] [Updated] (MESOS-7601) Some container launch failures are mistakenly treated as errors.

2017-10-12 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-7601:
--
Sprint: Mesosphere Sprint 59, Mesosphere Sprint 62, Mesosphere Sprint 63, 
Mesosphere Sprint 64, Mesosphere Sprint 66  (was: Mesosphere Sprint 59, 
Mesosphere Sprint 62, Mesosphere Sprint 63, Mesosphere Sprint 64)

[jira] [Updated] (MESOS-7601) Some container launch failures are mistakenly treated as errors.

2017-10-12 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-7601:
--
Sprint: Mesosphere Sprint 59, Mesosphere Sprint 62, Mesosphere Sprint 63, 
Mesosphere Sprint 64  (was: Mesosphere Sprint 59, Mesosphere Sprint 62, 
Mesosphere Sprint 63, Mesosphere Sprint 64, Mesosphere Sprint 65)

[jira] [Updated] (MESOS-7601) Some container launch failures are mistakenly treated as errors.

2017-09-28 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-7601:
--
Sprint: Mesosphere Sprint 59, Mesosphere Sprint 62, Mesosphere Sprint 63, 
Mesosphere Sprint 64, Mesosphere Sprint 65  (was: Mesosphere Sprint 59, 
Mesosphere Sprint 62, Mesosphere Sprint 63, Mesosphere Sprint 64)

[jira] [Updated] (MESOS-7601) Some container launch failures are mistakenly treated as errors.

2017-09-19 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-7601:
--
Sprint: Mesosphere Sprint 59, Mesosphere Sprint 62, Mesosphere Sprint 63, 
Mesosphere Sprint 64  (was: Mesosphere Sprint 59, Mesosphere Sprint 62, 
Mesosphere Sprint 63)

> Some container launch failures are mistakenly treated as errors.
> 
>
> Key: MESOS-7601
> URL: https://issues.apache.org/jira/browse/MESOS-7601
> Project: Mesos
> Issue Type: Bug
> Components: containerization
> Affects Versions: 1.3.0
> Reporter: Alexander Rukletsov
> Assignee: Alexander Rukletsov
> Labels: containerizer, mesosphere, tech-debt
>
> I've observed a case where a scheduler stops (i.e. calls TEARDOWN) while some 
> of its tasks are being launched. While this is valid behaviour, the agent 
> prints an error and increments the container launch error metrics.
> Below are log excerpts for one such framework, 
> {{6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092}}.
> *Master log*
> {noformat}
> [centos@ip-172-31-6-200 ~]$ journalctl _PID=29716 --since "2 hours ago" 
> --no-pager | grep 
> "6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092"
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.226218 29724 master.cpp:6072] Updating 
> info for framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.226356 29728 hierarchical.cpp:274] Added 
> framework 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.226405 29728 hierarchical.cpp:379] 
> Deactivated framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.228570 29728 hierarchical.cpp:343] 
> Activated framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.246068 29721 master.cpp:7105] Sending 1 
> offers to framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.247851 29721 master.cpp:7194] Sending 1 
> inverse offers to framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.912937 29728 master.cpp:4806] Processing 
> DECLINE call for offers: [ 92434aef-27da-4fd1-a5c4-b286d640d5b3-O509464 ] for 
> framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:32:59 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:59.804184 29727 master.cpp:7105] Sending 2 
> offers to framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:32:59 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:59.804411 29727 master.cpp:7194] Sending 2 
> inverse offers to framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:33:01 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:33:01.248924 29721 master.cpp:7105] Sending 2 
> offers to framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:33:01 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:33:01.249289 29721 master.cpp:7194] Sending 2 
> inverse offers to framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:33:01 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:33:01.249724 29721 master.cpp:3851] Processing 
> ACCEPT call for offers: [ 92434aef-27da-4fd1-a5c4-b286d640d5b3-O509469 ] on 
> agent 36a25adb-4ea2-49d3-a195-448cff1dc146-S35 at 

[jira] [Updated] (MESOS-7601) Some container launch failures are mistakenly treated as errors.

2017-09-14 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-7601:
-
Shepherd: Greg Mann  (was: Jie Yu)

> ACCEPT call for offers: [ 92434aef-27da-4fd1-a5c4-b286d640d5b3-O509469 ] on 
> agent 36a25adb-4ea2-49d3-a195-448cff1dc146-S35 at slave(1)@172.31.13.122:5051 
> (172.31.13.122) for framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> 

[jira] [Updated] (MESOS-7601) Some container launch failures are mistakenly treated as errors.

2017-09-05 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-7601:
--
Sprint: Mesosphere Sprint 59, Mesosphere Sprint 62, Mesosphere Sprint 63  
(was: Mesosphere Sprint 59, Mesosphere Sprint 62)

[jira] [Updated] (MESOS-7601) Some container launch failures are mistakenly treated as errors.

2017-08-25 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-7601:
---
Sprint: Mesosphere Sprint 59, Mesosphere Sprint 62  (was: Mesosphere Sprint 59)
Story Points: 5  (was: 3)

[jira] [Updated] (MESOS-7601) Some container launch failures are mistakenly treated as errors.

2017-08-01 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-7601:
---
Sprint: Mesosphere Sprint 59  (was: Mesosphere Sprint 59, Mesosphere Sprint 60)

[jira] [Updated] (MESOS-7601) Some container launch failures are mistakenly treated as errors.

2017-07-21 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-7601:
--
Sprint: Mesosphere Sprint 59, Mesosphere Sprint 60  (was: Mesosphere Sprint 59)

[jira] [Updated] (MESOS-7601) Some container launch failures are mistakenly treated as errors.

2017-07-13 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-7601:
---
Sprint: Mesosphere Sprint 59
Labels: containerizer mesosphere tech-debt  (was: containerizer mesosphere)

[jira] [Updated] (MESOS-7601) Some container launch failures are mistakenly treated as errors.

2017-06-01 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-7601:
---
Story Points: 3
  Labels: containerizer mesosphere  (was: )
 Description: 
I've observed a case where a scheduler stops (i.e. calls TEARDOWN) while some of 
its tasks are being launched. Although this is valid behaviour, the agent prints 
an error and increments the container launch error metrics.
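
The distinction being asked for can be illustrated with a minimal Python sketch. This is not Mesos code: the names `LaunchOutcome`, `LaunchMetrics`, `classify_launch` and `record_launch` are hypothetical, and the sketch assumes the agent already knows, at the moment a launch fails, whether the owning framework is being torn down.

```python
from dataclasses import dataclass
from enum import Enum


class LaunchOutcome(Enum):
    LAUNCHED = "launched"
    ABANDONED = "abandoned"  # launch interrupted by an expected teardown
    FAILED = "failed"        # genuine launch error


@dataclass
class LaunchMetrics:
    # Hypothetical counter standing in for the agent's launch error metric.
    container_launch_errors: int = 0


def classify_launch(launch_succeeded: bool, framework_terminating: bool) -> LaunchOutcome:
    """Decide how an attempted container launch should be treated."""
    if launch_succeeded:
        return LaunchOutcome.LAUNCHED
    if framework_terminating:
        # Assumption: a launch that fails because its framework called
        # TEARDOWN is expected behaviour, not an error.
        return LaunchOutcome.ABANDONED
    return LaunchOutcome.FAILED


def record_launch(outcome: LaunchOutcome, metrics: LaunchMetrics) -> str:
    """Update the metrics and return the log severity for this outcome."""
    if outcome is LaunchOutcome.FAILED:
        metrics.container_launch_errors += 1
        return "ERROR"
    return "INFO"
```

In this sketch only a failure that is not explained by a teardown raises the error counter or logs at error severity; a launch abandoned because of TEARDOWN is logged at info level and leaves the metric untouched.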

Below are log excerpts for one such framework, 
{{6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092}}.

*Master log*
{noformat}
[centos@ip-172-31-6-200 ~]$ journalctl _PID=29716 --since "2 hours ago" 
--no-pager | grep 
"6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092"
Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: 
I0601 11:32:58.226218 29724 master.cpp:6072] Updating info for framework 
6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: 
I0601 11:32:58.226356 29728 hierarchical.cpp:274] Added framework 
6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: 
I0601 11:32:58.226405 29728 hierarchical.cpp:379] Deactivated framework 
6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: 
I0601 11:32:58.228570 29728 hierarchical.cpp:343] Activated framework 
6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: 
I0601 11:32:58.246068 29721 master.cpp:7105] Sending 1 offers to framework 
6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
(TeraValidate) at 
scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: 
I0601 11:32:58.247851 29721 master.cpp:7194] Sending 1 inverse offers to 
framework 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
(TeraValidate) at 
scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: 
I0601 11:32:58.912937 29728 master.cpp:4806] Processing DECLINE call for 
offers: [ 92434aef-27da-4fd1-a5c4-b286d640d5b3-O509464 ] for framework 
6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
(TeraValidate) at 
scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
Jun 01 11:32:59 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: 
I0601 11:32:59.804184 29727 master.cpp:7105] Sending 2 offers to framework 
6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
(TeraValidate) at 
scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
Jun 01 11:32:59 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: 
I0601 11:32:59.804411 29727 master.cpp:7194] Sending 2 inverse offers to 
framework 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
(TeraValidate) at 
scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
Jun 01 11:33:01 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: 
I0601 11:33:01.248924 29721 master.cpp:7105] Sending 2 offers to framework 
6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
(TeraValidate) at 
scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
Jun 01 11:33:01 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: 
I0601 11:33:01.249289 29721 master.cpp:7194] Sending 2 inverse offers to 
framework 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
(TeraValidate) at 
scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
Jun 01 11:33:01 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: 
I0601 11:33:01.249724 29721 master.cpp:3851] Processing ACCEPT call for offers: 
[ 92434aef-27da-4fd1-a5c4-b286d640d5b3-O509469 ] on agent 
36a25adb-4ea2-49d3-a195-448cff1dc146-S35 at slave(1)@172.31.13.122:5051 
(172.31.13.122) for framework 
6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
(TeraValidate) at 
scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
Jun 01 11:33:01 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: 
I0601 11:33:01.250141 29721 master.cpp:3851] Processing ACCEPT call for offers: 
[ 92434aef-27da-4fd1-a5c4-b286d640d5b3-O509470 ] on agent 
36a25adb-4ea2-49d3-a195-448cff1dc146-S2 at slave(1)@172.31.7.202:5051 
(172.31.7.202) for framework 
6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
(TeraValidate) at 
scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
Jun 01 11:33:01 ip-172-31-6-200.us-west-2.compute.internal