Dario,

I do not think the case sensitivity matters here. If the master was
expecting a header that was exactly 'Mesos-Stream-Id' and did not see it, I
would expect to get the error response: `All non-subscribe calls should
include the 'Mesos-Stream-Id' header`. That is the error response that you
get when you do not set the header.
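
To illustrate the reasoning above, here is a minimal Python sketch of the two ways a server could look up the header. This is not the Mesos code, and `find_header` is a hypothetical helper I made up for illustration. If the lookup were exact-case, the lowercase `mesos-stream-id` would look like a missing header, and I would expect the "should include" error rather than a stream ID mismatch.

```python
from typing import Mapping, Optional


def find_header(headers: Mapping[str, str], name: str) -> Optional[str]:
    """Return the value of `name`, matching the header name case-insensitively."""
    wanted = name.lower()
    for key, value in headers.items():
        if key.lower() == wanted:
            return value
    return None


# Headers as they appear in the TEARDOWN request from the original report.
sent = {
    "accept": "application/x-protobuf",
    "accept-encoding": "gzip",
    "mesos-stream-id": "71a0294f-e9c4-4efe-b237-fb120836aaf8",
}

# A case-insensitive lookup finds the lowercase header name.
stream_id = find_header(sent, "Mesos-Stream-Id")

# An exact-case lookup treats the same header as missing.
exact = sent.get("Mesos-Stream-Id")
```

Since the response I get complains about a mismatch, the master appears to have found some value for the header, which is why I doubt the case of the name is the problem.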

Possibly related: I expected to see the stream ID in the Mesos logs. I see this
log message
<https://github.com/apache/mesos/blob/c9b70582e9fccab8f6863b0bd3a812b5969a8c24/src/master/master.cpp#L7473-L7474>
in the code, but I do not see it in the logs.


On Sun, Aug 14, 2016 at 6:12 PM, Dario Rexin <[email protected]> wrote:

> Oh, sorry, I didn't see you actually set the header (wall of text ;) ).
> That's an interesting issue. Do you set the header name with the exact case? I know
> headers shouldn't be case sensitive, but maybe there's a bug in the Mesos
> code. I have not seen this issue before.
>
> On Aug 14, 2016, at 5:58 PM, Zameer Manji <[email protected]> wrote:
>
> Hey,
>
> I'm using the Mesos HTTP API for the first time. I am currently
> encountering an issue where after a successful SUBSCRIBE call and receiving
> a SUBSCRIBED and HEARTBEAT event, a subsequent TEARDOWN call fails with
> HTTP 400 with a message of "The stream ID included in this request didn't
> match the stream ID currently associated with framework ID".
>
> Here is a detailed breakdown of what happens with logs:
>
> A new framework sends a SUBSCRIBE call with the following body:
>
> ````
> framework_id {
>   value: "0dffbee9-a514-4ffa-87e1-2850dd4dcf00"
> }
> type: SUBSCRIBE
> subscribe {
>   framework_info {
>     user: "user"
>     name: "name"
>     id {
>       value: "0dffbee9-a514-4ffa-87e1-2850dd4dcf00"
>     }
>   }
> }
> ````
>
> It then receives a 200 OK response with the following headers:
> `{content-type=[application/x-protobuf], date=[Sat, 13 Aug 2016 02:42:48
> GMT], transfer-encoding=[chunked],
> mesos-stream-id=[71a0294f-e9c4-4efe-b237-fb120836aaf8]}`
>
> Over this connection it receives a successful subscribed event:
> ````
> type: SUBSCRIBED
> subscribed {
>   framework_id {
>     value: "0dffbee9-a514-4ffa-87e1-2850dd4dcf00"
>   }
>   heartbeat_interval_seconds: 15.0
> }
> ````
>
> It also receives a single heartbeat event.
>
> Then it tries to send the following request:
> ````
> Sending: framework_id {
>   value: "0dffbee9-a514-4ffa-87e1-2850dd4dcf00"
> }
> type: TEARDOWN
> ````
> with the following headers:
> `{accept=[application/x-protobuf], accept-encoding=[gzip],
> mesos-stream-id=[71a0294f-e9c4-4efe-b237-fb120836aaf8]}`
>
> The response is a 400 with the body: `The stream ID included in this
> request didn't match the stream ID currently associated with framework ID
> '0dffbee9-a514-4ffa-87e1-2850dd4dcf00'`.
>
>
> The master logs contain:
> ````
> I0813 02:42:48.376819 13934 http.cpp:381] HTTP POST for
> /master/api/v1/scheduler from 192.168.33.1:60780 with
> User-Agent='Google-HTTP-Java-Client/1.20.0 (gzip)'
> I0813 02:42:48.376998 13934 master.cpp:2146] Received subscription request
> for HTTP framework 'name'
> I0813 02:42:48.377104 13934 master.cpp:2244] Subscribing framework 'name'
> with checkpointing disabled and capabilities [  ]
> I0813 02:42:48.377378 13934 hierarchical.cpp:271] Added framework
> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00
> I0813 02:42:49.475163 13929 http.cpp:381] HTTP POST for
> /master/api/v1/scheduler from 192.168.33.1:60782 with
> User-Agent='Google-HTTP-Java-Client/1.20.0 (gzip)'
> I0813 02:42:51.133513 13930 master.cpp:1284] Framework
> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name) disconnected
> I0813 02:42:51.133597 13930 master.cpp:2725] Disconnecting framework
> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name)
> I0813 02:42:51.133618 13930 master.cpp:2749] Deactivating framework
> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name)
> I0813 02:42:51.133644 13930 master.cpp:1297] Giving framework
> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name) 0ns to failover
> I0813 02:42:51.133692 13932 hierarchical.cpp:382] Deactivated framework
> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00
> I0813 02:42:51.137265 13931 master.cpp:5561] Framework failover timeout,
> removing framework 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name)
> I0813 02:42:51.137339 13931 master.cpp:6296] Removing framework
> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name)
> I0813 02:42:51.137464 13931 hierarchical.cpp:333] Removed framework
> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00
> ````
> Note that the immediate disconnection after the second POST is intentional.
>
> This is with Mesos 1.0.0 on Ubuntu Trusty.
>
> What can I do to debug this issue? The logs do not provide a lot of
> information to act on. The stream ID generated by Mesos is not in the logs,
> nor is there anything indicating that an HTTP 400 was sent.
>
> --
> Zameer Manji
>
>
