Hey,

I'm using the Mesos HTTP API for the first time. I am currently
encountering an issue where after a successful SUBSCRIBE call and receiving
a SUBSCRIBED and HEARTBEAT event, a subsequent TEARDOWN call fails with
HTTP 400 with a message of "The stream ID included in this request didn't
match the stream ID currently associated with framework ID".

Here is a detailed breakdown of what happens with logs:

A new framework sends an SUBSCRIBE call with the following body:

````
framework_id {
  value: "0dffbee9-a514-4ffa-87e1-2850dd4dcf00"
}
type: SUBSCRIBE
subscribe {
  framework_info {
    user: "user"
    name: "name"
    id {
      value: "0dffbee9-a514-4ffa-87e1-2850dd4dcf00"
    }
  }
}
````

It then receives a 200 OK response with the following headers:
`{content-type=[application/x-protobuf], date=[Sat, 13 Aug 2016 02:42:48
GMT], transfer-encoding=[chunked],
mesos-stream-id=[71a0294f-e9c4-4efe-b237-fb120836aaf8]}`

Over this connection it receives a successful subscribed event:
````
type: SUBSCRIBED
subscribed {
  framework_id {
    value: "0dffbee9-a514-4ffa-87e1-2850dd4dcf00"
  }
  heartbeat_interval_seconds: 15.0
}
````

It also receives a single heart beat event.

Then it tries to send the following request:
````
Sending: framework_id {
  value: "0dffbee9-a514-4ffa-87e1-2850dd4dcf00"
}
type: TEARDOWN
````
with the following headers:
`{accept=[application/x-protobuf], accept-encoding=[gzip],
mesos-stream-id=[71a0294f-e9c4-4efe-b237-fb120836aaf8]}`

The response is a 400 with the body: `The stream ID included in this
request didn't match the stream ID currently associated with framework ID
'0dffbee9-a514-4ffa-87e1-2850dd4dcf00'`.


The master logs contains:
````
I0813 02:42:48.376819 13934 http.cpp:381] HTTP POST for
/master/api/v1/scheduler from 192.168.33.1:60780 with
User-Agent='Google-HTTP-Java-Client/1.20.0 (gzip)'
I0813 02:42:48.376998 13934 master.cpp:2146] Received subscription request
for HTTP framework 'name'
I0813 02:42:48.377104 13934 master.cpp:2244] Subscribing framework 'name'
with checkpointing disabled and capabilities [  ]
I0813 02:42:48.377378 13934 hierarchical.cpp:271] Added framework
0dffbee9-a514-4ffa-87e1-2850dd4dcf00
I0813 02:42:49.475163 13929 http.cpp:381] HTTP POST for
/master/api/v1/scheduler from 192.168.33.1:60782 with
User-Agent='Google-HTTP-Java-Client/1.20.0 (gzip)'
I0813 02:42:51.133513 13930 master.cpp:1284] Framework
0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name) disconnected
I0813 02:42:51.133597 13930 master.cpp:2725] Disconnecting framework
0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name)
I0813 02:42:51.133618 13930 master.cpp:2749] Deactivating framework
0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name)
I0813 02:42:51.133644 13930 master.cpp:1297] Giving framework
0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name) 0ns to failover
I0813 02:42:51.133692 13932 hierarchical.cpp:382] Deactivated framework
0dffbee9-a514-4ffa-87e1-2850dd4dcf00
I0813 02:42:51.137265 13931 master.cpp:5561] Framework failover timeout,
removing framework 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name)
I0813 02:42:51.137339 13931 master.cpp:6296] Removing framework
0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name)
I0813 02:42:51.137464 13931 hierarchical.cpp:333] Removed framework
0dffbee9-a514-4ffa-87e1-2850dd4dcf00
````
Note the immediate disconnection after the second POST is intentional.

This is with Mesos 1.0.0 on Ubuntu Trusty.

What can I do to debug this issue? The logs do not provide a lot of
information to act on. The stream id generated by mesos is not in the logs,
nor anything indicating that an HTTP 400 was sent.

--
Zameer Manji

Reply via email to