[ 
https://issues.apache.org/jira/browse/MESOS-8545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16619417#comment-16619417
 ] 

Alexander Rukletsov edited comment on MESOS-8545 at 9/21/18 1:01 PM:
---------------------------------------------------------------------

*{{master}} aka {{1.8-dev}}*:
{noformat}
commit 5b95bb0f21852058d22703385f2c8e139881bf1a
Author:     Andrei Budnik <abud...@mesosphere.com>
AuthorDate: Tue Sep 18 19:10:14 2018 +0200
Commit:     Alexander Rukletsov <al...@apache.org>
CommitDate: Tue Sep 18 19:10:14 2018 +0200

    Fixed HTTP errors caused by dropped HTTP responses by IOSwitchboard.
    
    Previously, IOSwitchboard process could terminate before all HTTP
    responses had been sent to the agent. In the case of
    `ATTACH_CONTAINER_INPUT` call, we could drop a final HTTP `200 OK`
    response, so the agent got broken HTTP connection for the call.
    This patch introduces an acknowledgment for the received response
    for the `ATTACH_CONTAINER_INPUT` call. This acknowledgment is a new
    type of control messages for the `ATTACH_CONTAINER_INPUT` call. When
    IOSwitchboard receives an acknowledgment, and io redirects are
    finished, it terminates itself. That guarantees that the agent always
    receives a response for the `ATTACH_CONTAINER_INPUT` call.
    
    Review: https://reviews.apache.org/r/65168/
{noformat}
{noformat}
commit bfa2bd24780b5c49467b3c23260855e3d8b4c948
Author:     Andrei Budnik <abud...@mesosphere.com>
AuthorDate: Fri Sep 21 14:51:24 2018 +0200
Commit:     Alexander Rukletsov <al...@apache.org>
CommitDate: Fri Sep 21 14:51:24 2018 +0200

    Fixed disconnection while sending acknowledgment to IOSwitchboard.
    
    Previously, an HTTP connection to the IOSwitchboard could be garbage
    collected before the agent sent an acknowledgment to the IOSwitchboard
    via this connection. This patch fixes the issue by keeping a reference
    count to the connection in a lambda callback until disconnection
    occurs.
    
    Review: https://reviews.apache.org/r/68768/
{noformat}
{noformat}
commit c3c77cbef818d497d8bd5e67fa72e55a7190e27a
Author:     Andrei Budnik <abud...@mesosphere.com>
AuthorDate: Fri Sep 21 14:51:59 2018 +0200
Commit:     Alexander Rukletsov <al...@apache.org>
CommitDate: Fri Sep 21 14:51:59 2018 +0200

    Fixed broken pipe error in IOSwitchboard.
    
    Previous attempt to fix `HTTP 500` "broken pipe" in review /r/62187/
    was not correct: after IOSwitchboard sends a response to the agent for
    the `ATTACH_CONTAINER_INPUT` call, the socket is closed immediately,
    thus causing the error on the agent. This patch adds a delay after
    IO redirects are finished and before IOSwitchboard forcibly send a
    response.
    
    Review: https://reviews.apache.org/r/68784/
{noformat}
*{{1.7.1}}*:
{noformat}
commit 1672941630960cccf66ed81b11811d84e8a4e3f0
commit 600b388e25c49f4fac4d39bc07bcf6ffce42c679
commit 021a8f4de1ad65167946548e3ecfa74d8e41e9c5
commit 38a914398b6f1aaf08db4f62f4e42cdb80127eb5
{noformat}
*{{1.6.2}}*:
{noformat}
commit 2ddd6f07bebbe91e1e0d5165c4a5ae552b836303
commit c1448f36d4c2c2c8345e7e8d1bf1f206dba18dac
commit 55b0e94f0c8a1896ca079361d89527123faf22c6
commit c40c92b7710b5b238b13ce6f1bacd3d75e04283b
{noformat}
*{{1.5.2}}*:
{noformat}
commit 3bf4fe22e0ed828a36d5b2ea652d07c6eef4b578
commit 33a6bec95b44592d626874ae8deaa3e2a3bbc120
commit 7b8195680104c2c5f61073a956f60ac961c37f45
commit 0216002744517a6053fd782b6b4dc3d6cf77dd5e
{noformat}


was (Author: alexr):
*{{master}} aka {{1.8-dev}}*:
{noformat}
commit 5b95bb0f21852058d22703385f2c8e139881bf1a
Author:     Andrei Budnik <abud...@mesosphere.com>
AuthorDate: Tue Sep 18 19:10:14 2018 +0200
Commit:     Alexander Rukletsov <al...@apache.org>
CommitDate: Tue Sep 18 19:10:14 2018 +0200

    Fixed HTTP errors caused by dropped HTTP responses by IOSwitchboard.
    
    Previously, IOSwitchboard process could terminate before all HTTP
    responses had been sent to the agent. In the case of
    `ATTACH_CONTAINER_INPUT` call, we could drop a final HTTP `200 OK`
    response, so the agent got broken HTTP connection for the call.
    This patch introduces an acknowledgment for the received response
    for the `ATTACH_CONTAINER_INPUT` call. This acknowledgment is a new
    type of control messages for the `ATTACH_CONTAINER_INPUT` call. When
    IOSwitchboard receives an acknowledgment, and io redirects are
    finished, it terminates itself. That guarantees that the agent always
    receives a response for the `ATTACH_CONTAINER_INPUT` call.
    
    Review: https://reviews.apache.org/r/65168/
{noformat}
{noformat}
commit bfa2bd24780b5c49467b3c23260855e3d8b4c948
Author:     Andrei Budnik <abud...@mesosphere.com>
AuthorDate: Fri Sep 21 14:51:24 2018 +0200
Commit:     Alexander Rukletsov <al...@apache.org>
CommitDate: Fri Sep 21 14:51:24 2018 +0200

    Fixed disconnection while sending acknowledgment to IOSwitchboard.
    
    Previously, an HTTP connection to the IOSwitchboard could be garbage
    collected before the agent sent an acknowledgment to the IOSwitchboard
    via this connection. This patch fixes the issue by keeping a reference
    count to the connection in a lambda callback until disconnection
    occurs.
    
    Review: https://reviews.apache.org/r/68768/
{noformat}
{noformat}
commit c3c77cbef818d497d8bd5e67fa72e55a7190e27a
Author:     Andrei Budnik <abud...@mesosphere.com>
AuthorDate: Fri Sep 21 14:51:59 2018 +0200
Commit:     Alexander Rukletsov <al...@apache.org>
CommitDate: Fri Sep 21 14:51:59 2018 +0200

    Fixed broken pipe error in IOSwitchboard.
    
    Previous attempt to fix `HTTP 500` "broken pipe" in review /r/62187/
    was not correct: after IOSwitchboard sends a response to the agent for
    the `ATTACH_CONTAINER_INPUT` call, the socket is closed immediately,
    thus causing the error on the agent. This patch adds a delay after
    IO redirects are finished and before IOSwitchboard forcibly send a
    response.
    
    Review: https://reviews.apache.org/r/68784/
{noformat}
*{{1.7.1}}*:
{noformat}
commit 1672941630960cccf66ed81b11811d84e8a4e3f0
commit 600b388e25c49f4fac4d39bc07bcf6ffce42c679
{noformat}
*{{1.6.2}}*:
{noformat}
commit 2ddd6f07bebbe91e1e0d5165c4a5ae552b836303
commit c1448f36d4c2c2c8345e7e8d1bf1f206dba18dac
{noformat}
*{{1.5.2}}*:
{noformat}
commit 3bf4fe22e0ed828a36d5b2ea652d07c6eef4b578
commit 33a6bec95b44592d626874ae8deaa3e2a3bbc120
{noformat}

> AgentAPIStreamingTest.AttachInputToNestedContainerSession is flaky.
> -------------------------------------------------------------------
>
>                 Key: MESOS-8545
>                 URL: https://issues.apache.org/jira/browse/MESOS-8545
>             Project: Mesos
>          Issue Type: Bug
>          Components: agent
>    Affects Versions: 1.5.0, 1.6.1, 1.7.0
>            Reporter: Andrei Budnik
>            Assignee: Andrei Budnik
>            Priority: Major
>              Labels: Mesosphere, flaky-test
>             Fix For: 1.5.2, 1.6.2, 1.7.1, 1.8.0
>
>         Attachments: 
> AgentAPIStreamingTest.AttachInputToNestedContainerSession-badrun.txt, 
> AgentAPIStreamingTest.AttachInputToNestedContainerSession-badrun2.txt
>
>
> {code:java}
> I0205 17:11:01.091872 4898 http_proxy.cpp:132] Returning '500 Internal Server Error' for '/slave(974)/api/v1' (Disconnected)
> /home/centos/workspace/mesos/Mesos_CI-build/FLAG/CMake/label/mesos-ec2-centos-7/mesos/src/tests/api_tests.cpp:6596: Failure
> Value of: (response).get().status
> Actual: "500 Internal Server Error"
> Expected: http::OK().status
> Which is: "200 OK"
> Body: "Disconnected"
> {code}


