Dario was right. I filed MESOS-6041 <https://issues.apache.org/jira/browse/MESOS-6041> to enhance the error message.
On Sun, Aug 14, 2016 at 10:35 PM, Dario Rexin <[email protected]> wrote: > Zameer, > > the header value is enclosed in []. This is because headers can have > multiple values and the library you use pus them into a list. You have to > take the first item from that list and then it should work. > > On Aug 14, 2016, at 10:19 PM, Zameer Manji <[email protected]> wrote: > > Here is a MWE: https://github.com/zmanji/mesos-mwe > > Follow the instructions in the README to reproduce. > > On Sun, Aug 14, 2016 at 9:04 PM, Dario Rexin <[email protected]> wrote: > >> Can you post the code somewhere? >> >> >> On Aug 14, 2016, at 8:54 PM, Zameer Manji <[email protected]> wrote: >> >> Dario, >> >> The logs show that no disconnections occur until after the second POST >> request. I would expect a log entry indicating a disconnect between the two >> POST requests if the stream id changed. >> >> On Sun, Aug 14, 2016 at 8:28 PM, Dario Rexin <[email protected]> wrote: >> >>> You’re absolutely right. I just tried the exact same steps and it worked >>> fine for me. I also don’t see the log message. Do you have any reconnection >>> logic in place? Is it possible, that your framework reconnected before you >>> send the call? The Stream Id would change in that case. >>> >>> >>> On Aug 14, 2016, at 6:48 PM, Zameer Manji <[email protected]> wrote: >>> >>> Dario, >>> >>> I do not think the case sensitivity matters here. If the master was >>> expecting a header that was exactly 'Mesos-Stream-Id' and did not see it, I >>> would expect to get the error response: `All non-subscribe calls should >>> include the 'Mesos-Stream-Id' header`. That is the error response that you >>> get when you do not set the header. >>> >>> Possibly related, I expected to see the stream id in the mesos logs. I >>> see this log message >>> <https://github.com/apache/mesos/blob/c9b70582e9fccab8f6863b0bd3a812b5969a8c24/src/master/master.cpp#L7473-L7474> >>> in >>> the code, but I do not see it in the logs. >>> >>> >>> On Sun, Aug 14, 2016 at 6:12 PM, Dario Rexin <[email protected]> wrote: >>> >>>> Oh, sorry, I didn't see you actually set the header (wall of text ;) ). >>>> That's an interesting issue, do you set the header case sensitive? I know >>>> headers shouldn't be case sensitive, but maybe there's a bug in the Mesos >>>> code. I have not seen this issue before. >>>> >>>> On Aug 14, 2016, at 5:58 PM, Zameer Manji <[email protected]> wrote: >>>> >>>> Hey, >>>> >>>> I'm using the Mesos HTTP API for the first time. I am currently >>>> encountering an issue where after a successful SUBSCRIBE call and receiving >>>> a SUBSCRIBED and HEARTBEAT event, a subsequent TEARDOWN call fails with >>>> HTTP 400 with a message of "The stream ID included in this request didn't >>>> match the stream ID currently associated with framework ID". >>>> >>>> Here is a detailed breakdown of what happens with logs: >>>> >>>> A new framework sends an SUBSCRIBE call with the following body: >>>> >>>> ```` >>>> framework_id { >>>> value: "0dffbee9-a514-4ffa-87e1-2850dd4dcf00" >>>> } >>>> type: SUBSCRIBE >>>> subscribe { >>>> framework_info { >>>> user: "user" >>>> name: "name" >>>> id { >>>> value: "0dffbee9-a514-4ffa-87e1-2850dd4dcf00" >>>> } >>>> } >>>> } >>>> ```` >>>> >>>> It then receives a 200 OK response with the following headers: >>>> `{content-type=[application/x-protobuf], date=[Sat, 13 Aug 2016 >>>> 02:42:48 GMT], transfer-encoding=[chunked], mesos-stream-id=[71a0294f-e9c4 >>>> -4efe-b237-fb120836aaf8]}` >>>> >>>> Over this connection it receives a successful subscribed event: >>>> ```` >>>> type: SUBSCRIBED >>>> subscribed { >>>> framework_id { >>>> value: "0dffbee9-a514-4ffa-87e1-2850dd4dcf00" >>>> } >>>> heartbeat_interval_seconds: 15.0 >>>> } >>>> ```` >>>> >>>> It also receives a single heart beat event. >>>> >>>> Then it tries to send the following request: >>>> ```` >>>> Sending: framework_id { >>>> value: "0dffbee9-a514-4ffa-87e1-2850dd4dcf00" >>>> } >>>> type: TEARDOWN >>>> ```` >>>> with the following headers: >>>> `{accept=[application/x-protobuf], accept-encoding=[gzip], >>>> mesos-stream-id=[71a0294f-e9c4-4efe-b237-fb120836aaf8]}` >>>> >>>> The response is a 400 with the body: `The stream ID included in this >>>> request didn't match the stream ID currently associated with framework ID >>>> '0dffbee9-a514-4ffa-87e1-2850dd4dcf00'`. >>>> >>>> >>>> The master logs contains: >>>> ```` >>>> I0813 02:42:48.376819 13934 http.cpp:381] HTTP POST for >>>> /master/api/v1/scheduler from 192.168.33.1:60780 with >>>> User-Agent='Google-HTTP-Java-Client/1.20.0 (gzip)' >>>> I0813 02:42:48.376998 13934 master.cpp:2146] Received subscription >>>> request for HTTP framework 'name' >>>> I0813 02:42:48.377104 13934 master.cpp:2244] Subscribing framework >>>> 'name' with checkpointing disabled and capabilities [ ] >>>> I0813 02:42:48.377378 13934 hierarchical.cpp:271] Added framework >>>> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 >>>> I0813 02:42:49.475163 13929 http.cpp:381] HTTP POST for >>>> /master/api/v1/scheduler from 192.168.33.1:60782 with >>>> User-Agent='Google-HTTP-Java-Client/1.20.0 (gzip)' >>>> I0813 02:42:51.133513 13930 master.cpp:1284] Framework >>>> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name) disconnected >>>> I0813 02:42:51.133597 13930 master.cpp:2725] Disconnecting framework >>>> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name) >>>> I0813 02:42:51.133618 13930 master.cpp:2749] Deactivating framework >>>> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name) >>>> I0813 02:42:51.133644 13930 master.cpp:1297] Giving framework >>>> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name) 0ns to failover >>>> I0813 02:42:51.133692 13932 hierarchical.cpp:382] Deactivated framework >>>> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 >>>> I0813 02:42:51.137265 13931 master.cpp:5561] Framework failover >>>> timeout, removing framework 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name) >>>> I0813 02:42:51.137339 13931 master.cpp:6296] Removing framework >>>> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name) >>>> I0813 02:42:51.137464 13931 hierarchical.cpp:333] Removed framework >>>> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 >>>> ```` >>>> Note the immediate disconnection after the second POST is intentional. >>>> >>>> This is with Mesos 1.0.0 on Ubuntu Trusty. >>>> >>>> What can I do to debug this issue? The logs do not provide a lot of >>>> information to act on. The stream id generated by mesos is not in the logs, >>>> nor anything indicating that an HTTP 400 was sent. >>>> >>>> -- >>>> Zameer Manji >>>> >>>> >>> >>> >> >> > >

