Dario, The logs show that no disconnections occur until after the second POST request. I would expect a log entry indicating a disconnect between the two POST requests if the stream id changed.
On Sun, Aug 14, 2016 at 8:28 PM, Dario Rexin <[email protected]> wrote: > You’re absolutely right. I just tried the exact same steps and it worked > fine for me. I also don’t see the log message. Do you have any reconnection > logic in place? Is it possible, that your framework reconnected before you > send the call? The Stream Id would change in that case. > > > On Aug 14, 2016, at 6:48 PM, Zameer Manji <[email protected]> wrote: > > Dario, > > I do not think the case sensitivity matters here. If the master was > expecting a header that was exactly 'Mesos-Stream-Id' and did not see it, I > would expect to get the error response: `All non-subscribe calls should > include the 'Mesos-Stream-Id' header`. That is the error response that you > get when you do not set the header. > > Possibly related, I expected to see the stream id in the mesos logs. I see > this > log message > <https://github.com/apache/mesos/blob/c9b70582e9fccab8f6863b0bd3a812b5969a8c24/src/master/master.cpp#L7473-L7474> > in > the code, but I do not see it in the logs. > > > On Sun, Aug 14, 2016 at 6:12 PM, Dario Rexin <[email protected]> wrote: > >> Oh, sorry, I didn't see you actually set the header (wall of text ;) ). >> That's an interesting issue, do you set the header case sensitive? I know >> headers shouldn't be case sensitive, but maybe there's a bug in the Mesos >> code. I have not seen this issue before. >> >> On Aug 14, 2016, at 5:58 PM, Zameer Manji <[email protected]> wrote: >> >> Hey, >> >> I'm using the Mesos HTTP API for the first time. I am currently >> encountering an issue where after a successful SUBSCRIBE call and receiving >> a SUBSCRIBED and HEARTBEAT event, a subsequent TEARDOWN call fails with >> HTTP 400 with a message of "The stream ID included in this request didn't >> match the stream ID currently associated with framework ID". >> >> Here is a detailed breakdown of what happens with logs: >> >> A new framework sends an SUBSCRIBE call with the following body: >> >> ```` >> framework_id { >> value: "0dffbee9-a514-4ffa-87e1-2850dd4dcf00" >> } >> type: SUBSCRIBE >> subscribe { >> framework_info { >> user: "user" >> name: "name" >> id { >> value: "0dffbee9-a514-4ffa-87e1-2850dd4dcf00" >> } >> } >> } >> ```` >> >> It then receives a 200 OK response with the following headers: >> `{content-type=[application/x-protobuf], date=[Sat, 13 Aug 2016 02:42:48 >> GMT], transfer-encoding=[chunked], mesos-stream-id=[71a0294f-e9c4 >> -4efe-b237-fb120836aaf8]}` >> >> Over this connection it receives a successful subscribed event: >> ```` >> type: SUBSCRIBED >> subscribed { >> framework_id { >> value: "0dffbee9-a514-4ffa-87e1-2850dd4dcf00" >> } >> heartbeat_interval_seconds: 15.0 >> } >> ```` >> >> It also receives a single heart beat event. >> >> Then it tries to send the following request: >> ```` >> Sending: framework_id { >> value: "0dffbee9-a514-4ffa-87e1-2850dd4dcf00" >> } >> type: TEARDOWN >> ```` >> with the following headers: >> `{accept=[application/x-protobuf], accept-encoding=[gzip], >> mesos-stream-id=[71a0294f-e9c4-4efe-b237-fb120836aaf8]}` >> >> The response is a 400 with the body: `The stream ID included in this >> request didn't match the stream ID currently associated with framework ID >> '0dffbee9-a514-4ffa-87e1-2850dd4dcf00'`. >> >> >> The master logs contains: >> ```` >> I0813 02:42:48.376819 13934 http.cpp:381] HTTP POST for >> /master/api/v1/scheduler from 192.168.33.1:60780 with >> User-Agent='Google-HTTP-Java-Client/1.20.0 (gzip)' >> I0813 02:42:48.376998 13934 master.cpp:2146] Received subscription >> request for HTTP framework 'name' >> I0813 02:42:48.377104 13934 master.cpp:2244] Subscribing framework 'name' >> with checkpointing disabled and capabilities [ ] >> I0813 02:42:48.377378 13934 hierarchical.cpp:271] Added framework >> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 >> I0813 02:42:49.475163 13929 http.cpp:381] HTTP POST for >> /master/api/v1/scheduler from 192.168.33.1:60782 with >> User-Agent='Google-HTTP-Java-Client/1.20.0 (gzip)' >> I0813 02:42:51.133513 13930 master.cpp:1284] Framework >> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name) disconnected >> I0813 02:42:51.133597 13930 master.cpp:2725] Disconnecting framework >> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name) >> I0813 02:42:51.133618 13930 master.cpp:2749] Deactivating framework >> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name) >> I0813 02:42:51.133644 13930 master.cpp:1297] Giving framework >> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name) 0ns to failover >> I0813 02:42:51.133692 13932 hierarchical.cpp:382] Deactivated framework >> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 >> I0813 02:42:51.137265 13931 master.cpp:5561] Framework failover timeout, >> removing framework 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name) >> I0813 02:42:51.137339 13931 master.cpp:6296] Removing framework >> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name) >> I0813 02:42:51.137464 13931 hierarchical.cpp:333] Removed framework >> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 >> ```` >> Note the immediate disconnection after the second POST is intentional. >> >> This is with Mesos 1.0.0 on Ubuntu Trusty. >> >> What can I do to debug this issue? The logs do not provide a lot of >> information to act on. The stream id generated by mesos is not in the logs, >> nor anything indicating that an HTTP 400 was sent. >> >> -- >> Zameer Manji >> >> > >

