[ 
https://issues.apache.org/jira/browse/MESOS-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707454#comment-14707454
 ] 

James DeFelice commented on MESOS-2865:
---------------------------------------

Pretty sure the problem is related to Go's internal net/http library buffering. 
Here's what is happening:
- Mesos POSTs to a URL, with keep-alive
- Go's net/http (net.http) reads the first request
- the mesos-go handler sends a response
- net.http reads the second request (same pipeline)
- the mesos-go handler sends a response
- net.http reads the third request, but doesn't deliver it to the mesos-go 
handler (even though the full request frame was read)
- the mesos-go handler waits forever because there's no timeout on the 
connection
- the undelivered frame is never sent to the message handler
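
The sequence above can be approximated with a small probe. This is a minimal
sketch, not the mesos-go code: it writes several POST requests back to back on
one keep-alive connection to a stock net/http server and counts how many reach
the handler. The function name, the endpoint path, and the 5-second deadline
are assumptions made for illustration. It is not claimed to reproduce the lost
request; it only gives a place to check whether a given Go version delivers
every frame.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync/atomic"
	"time"
)

// deliveredAfterPipelining writes n POST requests back to back on a single
// keep-alive connection and reports how many of them reached the handler.
// The name and the endpoint path are hypothetical.
func deliveredAfterPipelining(n int) (int, error) {
	var delivered int32
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.Copy(io.Discard, r.Body) // consume the whole request frame
		atomic.AddInt32(&delivered, 1)
		w.WriteHeader(http.StatusAccepted)
	}))
	defer srv.Close()

	conn, err := net.Dial("tcp", srv.Listener.Addr().String())
	if err != nil {
		return 0, err
	}
	defer conn.Close()
	// Without a deadline, a request that is never answered would block the
	// reads below forever.
	conn.SetDeadline(time.Now().Add(5 * time.Second))

	body := "message"
	req := fmt.Sprintf("POST /hypothetical HTTP/1.1\r\nHost: demo\r\nContent-Length: %d\r\n\r\n%s",
		len(body), body)
	if _, err := io.WriteString(conn, strings.Repeat(req, n)); err != nil {
		return 0, err
	}

	// Read one response per request that was written.
	br := bufio.NewReader(conn)
	for i := 0; i < n; i++ {
		resp, err := http.ReadResponse(br, nil)
		if err != nil {
			return int(atomic.LoadInt32(&delivered)), err
		}
		io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
	}
	return int(atomic.LoadInt32(&delivered)), nil
}

func main() {
	got, err := deliveredAfterPipelining(3)
	fmt.Println("delivered:", got, "err:", err)
}
```

If a request were held back as described, the deadline would turn the silent
hang into a timeout error on the client side, with the count showing how many
requests the handler saw before that.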

I've verified that the bytes are read from the connection because I added an 
io.Reader spy that logs all read byte blocks to stdout. It's very clear that 
the entire message has been received by net.http but for some reason it's 
buffering/hoarding the 3rd request frame. This happens in go-1.4.2, and in the 
just-released go-1.5.

I've tested this further by writing a special net.http Handler that bootstraps 
from Go's net.http server but hijacks the connection of the initial request 
immediately and assumes total control over the message framing from thereon. 
I'm unable to reproduce the lost-message effect with this mini HTTP server.
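
The hijacking approach can be sketched as follows. This is an illustration
under assumptions, not the handler that was actually tested: the names
hijackingHandler, sendPipelined and deliver, the endpoint path, and the fixed
202 response are all hypothetical. net/http parses only the first request;
the handler then takes the connection via http.Hijacker and reads each later
request itself with http.ReadRequest.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
	"time"
)

const accepted = "HTTP/1.1 202 Accepted\r\nContent-Length: 0\r\n\r\n"

// hijackingHandler lets net/http parse only the first request, then hijacks
// the connection and frames every later request itself.
func hijackingHandler(deliver func(body string)) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		hj, ok := w.(http.Hijacker)
		if !ok {
			http.Error(w, "hijacking not supported", http.StatusInternalServerError)
			return
		}
		// The original request body must not be used after Hijack.
		first, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		conn, rw, err := hj.Hijack()
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		defer conn.Close()

		body := string(first)
		for {
			deliver(body)
			if _, err := rw.WriteString(accepted); err != nil {
				return
			}
			if err := rw.Flush(); err != nil {
				return
			}
			// rw.Reader may already hold bytes of the next request.
			req, err := http.ReadRequest(rw.Reader)
			if err != nil {
				return // io.EOF once the peer closes the connection
			}
			next, err := io.ReadAll(req.Body)
			req.Body.Close()
			if err != nil {
				return
			}
			body = string(next)
		}
	})
}

// sendPipelined writes all bodies as back-to-back POSTs on one connection
// and returns them in the order the handler delivered them.
func sendPipelined(bodies []string) ([]string, error) {
	var mu sync.Mutex
	var got []string
	srv := httptest.NewServer(hijackingHandler(func(b string) {
		mu.Lock()
		got = append(got, b)
		mu.Unlock()
	}))
	defer srv.Close()

	conn, err := net.Dial("tcp", srv.Listener.Addr().String())
	if err != nil {
		return nil, err
	}
	defer conn.Close()
	conn.SetDeadline(time.Now().Add(5 * time.Second))

	var out strings.Builder
	for _, b := range bodies {
		fmt.Fprintf(&out, "POST /hypothetical HTTP/1.1\r\nHost: demo\r\nContent-Length: %d\r\n\r\n%s", len(b), b)
	}
	if _, err := io.WriteString(conn, out.String()); err != nil {
		return nil, err
	}
	br := bufio.NewReader(conn)
	for range bodies {
		resp, err := http.ReadResponse(br, nil)
		if err != nil {
			return nil, err
		}
		resp.Body.Close()
	}
	mu.Lock()
	defer mu.Unlock()
	return append([]string(nil), got...), nil
}

func main() {
	fmt.Println(sendPipelined([]string{"one", "two", "three"}))
}
```

Once the connection is hijacked, net/http no longer sets deadlines or closes
it, so a production version would need its own timeouts and error responses.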

> intermittently the executor is not receiving TASK_KILLED
> --------------------------------------------------------
>
>                 Key: MESOS-2865
>                 URL: https://issues.apache.org/jira/browse/MESOS-2865
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 0.21.1, 0.23.0
>         Environment: {code}
> $ dpkg -l |grep -e mesos
> ii  mesos                               0.21.1-1.1.ubuntu1404            
> amd64        Cluster resource manager with efficient resource isolation
> $ uname -a
> Linux node-1 3.13.0-29-generic #53-Ubuntu SMP Wed Jun 4 21:00:20 UTC 2014 
> x86_64 x86_64 x86_64 GNU/Linux
> {code}
>            Reporter: James DeFelice
>              Labels: mesosphere
>
> for details, log snippets see 
> https://github.com/mesosphere/kubernetes-mesos/issues/328
> The slave logs that it's been asked to kill a pod, but the message is never 
> logged as received by the executor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
