[ 
https://issues.apache.org/jira/browse/MESOS-9022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone reassigned MESOS-9022:
---------------------------------

       Assignee: Benno Evers
         Labels: events foundations mesos mesosphere race-condition streaming  
(was: events mesos mesosphere race-condition streaming)
    Component/s: HTTP API

Oh great. [~bennoe] can you confirm and resolve?

> Race condition in task updates could cause missing event in streaming
> ---------------------------------------------------------------------
>
>                 Key: MESOS-9022
>                 URL: https://issues.apache.org/jira/browse/MESOS-9022
>             Project: Mesos
>          Issue Type: Bug
>          Components: HTTP API, master
>    Affects Versions: 1.6.0
>            Reporter: Evelyn Liu
>            Assignee: Benno Evers
>            Priority: Blocker
>              Labels: events, foundations, mesos, mesosphere, race-condition, 
> streaming
>
> The master sends a {{TASK_STARTING}} update event when the task's latest 
> state is already {{TASK_FAILED}}. When it then handles the {{TASK_FAILED}} 
> update, {{sendSubscribersUpdate}} is set to {{false}} because of 
> [this|https://github.com/apache/mesos/blob/1.6.x/src/master/master.cpp#L10805], 
> so the subscriber never receives the {{TASK_FAILED}} update event.
> This happened when a task failed very quickly. Is there a race condition in 
> the handling of task updates?
> {{*master log:*}}
> {code:java}
> I0622 13:08:29.189771 84079 master.cpp:8345] Status update TASK_STARTING 
> (Status UUID: eb091093-d303-4e82-b69f-e2ba1011ba76) for task 
> f839055c-7a40-4e6c-9f53-22030f388c8c of framework 
> 4591ea8b-4adb-4acf-bb29-b70817663c4e-0000 from agent 
> d2f1c7c2-668d-46e5-829b-ce614cca79ae-S1587
>  I0622 13:08:29.189801 84079 master.cpp:8402] Forwarding status update 
> TASK_STARTING (Status UUID: eb091093-d303-4e82-b69f-e2ba1011ba76) for task 
> f839055c-7a40-4e6c-9f53-22030f388c8c of framework 
> 4591ea8b-4adb-4acf-bb29-b70817663c4e-0000
>  I0622 13:08:29.190004 84079 master.cpp:10843] Updating the state of task 
> f839055c-7a40-4e6c-9f53-22030f388c8c of framework 
> 4591ea8b-4adb-4acf-bb29-b70817663c4e-0000 (latest state: TASK_STARTING, 
> status update state: TASK_STARTING)
>  I0622 13:08:29.603857 84079 master.cpp:6195] Processing ACKNOWLEDGE call for 
> status eb091093-d303-4e82-b69f-e2ba1011ba76 for task 
> f839055c-7a40-4e6c-9f53-22030f388c8c of framework 
> 4591ea8b-4adb-4acf-bb29-b70817663c4e-0000 (Aurora) on agent 
> d2f1c7c2-668d-46e5-829b-ce614cca79ae-S1587
>  I0622 13:08:29.615643 84079 master.cpp:8345] Status update TASK_STARTING 
> (Status UUID: eb091093-d303-4e82-b69f-e2ba1011ba76) for task 
> f839055c-7a40-4e6c-9f53-22030f388c8c of framework 
> 4591ea8b-4adb-4acf-bb29-b70817663c4e-0000 from agent 
> d2f1c7c2-668d-46e5-829b-ce614cca79ae-S1587
>  I0622 13:08:29.615669 84079 master.cpp:8402] Forwarding status update 
> TASK_STARTING (Status UUID: eb091093-d303-4e82-b69f-e2ba1011ba76) for task 
> f839055c-7a40-4e6c-9f53-22030f388c8c of framework 
> 4591ea8b-4adb-4acf-bb29-b70817663c4e-0000
>  I0622 13:08:29.615783 84079 master.cpp:10843] Updating the state of task 
> f839055c-7a40-4e6c-9f53-22030f388c8c of framework 
> 4591ea8b-4adb-4acf-bb29-b70817663c4e-0000 (latest state: TASK_FAILED, status 
> update state: TASK_STARTING)
>  I0622 13:08:29.620837 84079 master.cpp:8345] Status update TASK_FAILED 
> (Status UUID: ac34f1e9-eaa4-4765-82ac-7398c2e6c835) for task 
> f839055c-7a40-4e6c-9f53-22030f388c8c of framework 
> 4591ea8b-4adb-4acf-bb29-b70817663c4e-0000 from agent 
> d2f1c7c2-668d-46e5-829b-ce614cca79ae-S1587
>  I0622 13:08:29.620853 84079 master.cpp:8402] Forwarding status update 
> TASK_FAILED (Status UUID: ac34f1e9-eaa4-4765-82ac-7398c2e6c835) for task 
> f839055c-7a40-4e6c-9f53-22030f388c8c of framework 
> 4591ea8b-4adb-4acf-bb29-b70817663c4e-0000
>  I0622 13:08:29.620923 84079 master.cpp:10843] Updating the state of task 
> f839055c-7a40-4e6c-9f53-22030f388c8c of framework 
> 4591ea8b-4adb-4acf-bb29-b70817663c4e-0000 (latest state: TASK_FAILED, status 
> update state: TASK_FAILED)
>  I0622 13:08:29.630455 84079 master.cpp:6195] Processing ACKNOWLEDGE call for 
> status eb091093-d303-4e82-b69f-e2ba1011ba76 for task 
> f839055c-7a40-4e6c-9f53-22030f388c8c of framework 
> 4591ea8b-4adb-4acf-bb29-b70817663c4e-0000 (Aurora) on agent 
> d2f1c7c2-668d-46e5-829b-ce614cca79ae-S1587
>  I0622 13:08:29.673051 84095 master.cpp:6195] Processing ACKNOWLEDGE call for 
> status ac34f1e9-eaa4-4765-82ac-7398c2e6c835 for task 
> f839055c-7a40-4e6c-9f53-22030f388c8c of framework 
> 4591ea8b-4adb-4acf-bb29-b70817663c4e-0000 (Aurora) on agent 
> d2f1c7c2-668d-46e5-829b-ce614cca79ae-S1587{code}
>  
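
To make the sequence in the log above easier to follow, here is a minimal, self-contained C++ model of the behavior the report describes. It is not the Mesos source: {{TaskState}}, {{StatusUpdate}}, {{MasterModel}} and {{replayReportedSequence}} are hypothetical names, and the update rule is an assumption reconstructed only from the description and the log (the task state follows the update's latest state, the event carries the update's own state, and updates for an already-terminal task are skipped).

```cpp
#include <vector>

// Hypothetical, reduced set of task states for illustration only.
enum class TaskState { STAGING, STARTING, FAILED };

// In this model only FAILED is terminal.
bool isTerminal(TaskState state) { return state == TaskState::FAILED; }

// Mirrors the two states printed in the log line
// "(latest state: X, status update state: Y)".
struct StatusUpdate {
  TaskState statusState;  // state carried by this particular update
  TaskState latestState;  // most recent state known to the agent
};

// Assumed simplification of the master's bookkeeping; NOT the real code.
struct MasterModel {
  TaskState taskState = TaskState::STAGING;
  std::vector<TaskState> eventsSent;  // events streamed to subscribers

  void updateTask(const StatusUpdate& update) {
    bool sendSubscribersUpdate = false;

    // Once the stored state is terminal, later updates change nothing
    // and no event is streamed.
    if (!isTerminal(taskState)) {
      if (update.latestState != taskState) {
        sendSubscribersUpdate = true;
      }
      taskState = update.latestState;
    }

    if (sendSubscribersUpdate) {
      // The event carries the state of the update being processed,
      // not the latest state.
      eventsSent.push_back(update.statusState);
    }
  }
};

// Replays the three status updates seen in the master log.
std::vector<TaskState> replayReportedSequence() {
  MasterModel master;
  // 13:08:29.190: TASK_STARTING, latest state TASK_STARTING.
  master.updateTask({TaskState::STARTING, TaskState::STARTING});
  // 13:08:29.615: retried TASK_STARTING, latest state already TASK_FAILED.
  master.updateTask({TaskState::STARTING, TaskState::FAILED});
  // 13:08:29.620: TASK_FAILED arrives, but the stored state is terminal.
  master.updateTask({TaskState::FAILED, TaskState::FAILED});
  return master.eventsSent;
}
```

Under these assumptions, replaying the three updates streams two {{TASK_STARTING}} events and no {{TASK_FAILED}} event, which matches the reported symptom. Whether the actual master code follows this rule is for the assignee to confirm.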



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
