RE: more WebSockets
tioning the upper application layers (like JS WS clients), which deal only with a subset of WS messages: Text and Binary, Ping/Pong and Close. Such high-level clients expose only these messages in an "abstract" way and thus hide the opcodes.

But libcurl, in my opinion, is a transport layer below that higher layer, so it should provide the WS opcodes for frames and messages, because the opcode is important "protocol" information (much like the HTTP method), and some clients may need to use reserved opcodes for their proprietary communication.

> Can opcode 3-7 and b-f be used "at will" by implementations to signal
> something without negotiating an extension?

In general, all extensions must be negotiated, but I have seen cases where some WS client/server implementations used reserved control opcodes for their proprietary communication. I guess they used them for some kind of "real-time" out-of-band protocol, as control frames are short and have delivery priority over data messages.

Thanks,
Dmitry Karpov

-----Original Message-----
From: Daniel Stenberg
Sent: Thursday, August 12, 2021 12:13 AM
To: Dmitry Karpov via curl-library
Cc: Dmitry Karpov
Subject: RE: more WebSockets
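To make the flags-versus-opcode distinction concrete, here is a minimal sketch in Python (chosen only for illustration; it is not libcurl code and none of these names exist in libcurl). It splits the first byte of a WebSocket frame, as laid out in RFC 6455 section 5.2, into the FIN flag, the three reserved bits and the 4-bit opcode, and classifies the opcode as data or control. `FrameInfo` and `describe_first_byte` are hypothetical names.

```python
from typing import NamedTuple

# Opcodes that RFC 6455 section 5.2 assigns a meaning to.
# 0x3-0x7 (data) and 0xB-0xF (control) are reserved.
OPCODE_NAMES = {
    0x0: "continuation",
    0x1: "text",
    0x2: "binary",
    0x8: "close",
    0x9: "ping",
    0xA: "pong",
}


class FrameInfo(NamedTuple):
    fin: bool      # final fragment of a message
    rsv: int       # the three reserved bits as a value 0..7
    opcode: int    # raw 4-bit opcode
    kind: str      # "data" or "control"
    name: str      # known opcode name, or "reserved"


def describe_first_byte(byte0: int) -> FrameInfo:
    """Split the first byte of a frame into flags and opcode."""
    fin = bool(byte0 & 0x80)
    rsv = (byte0 >> 4) & 0x07
    opcode = byte0 & 0x0F
    # Control opcodes are the ones with the high bit of the opcode set.
    kind = "control" if opcode & 0x08 else "data"
    name = OPCODE_NAMES.get(opcode, "reserved")
    return FrameInfo(fin, rsv, opcode, kind, name)
```

An application that receives the raw opcode next to the flags can tell a reserved control opcode such as 0xB apart from a regular ping without any extra API surface.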
Re: more WebSockets
> On 12.08.2021 at 09:33, Stefan Eissing via curl-library wrote:
>
> One thing from RFC 6455, section 5.4:
>
> "An intermediary MUST NOT change the fragmentation of a message if
> any reserved bit values are used and the meaning of these values
> is not known to the intermediary."
>
> which I read as: if you want to use libcurl as an intermediary, it needs
> to expose the frames and their bits.
>
> Since libcurl never will do any semantic interpretation of the frames, I
> would always regard it as an "intermediary".

Commenting on myself: if one wants to disregard all this "future proof" and "maybe one day multiplexing" business in the standard, a re-assembly of fragments into "messages" seems useful for an application.

---
Unsubscribe: https://cool.haxx.se/list/listinfo/curl-library
Etiquette: https://curl.se/mail/etiquette.html
Re: more WebSockets
One thing from RFC 6455, section 5.4:

"An intermediary MUST NOT change the fragmentation of a message if any reserved bit values are used and the meaning of these values is not known to the intermediary."

which I read as: if you want to use libcurl as an intermediary, it needs to expose the frames and their bits.

Since libcurl never will do any semantic interpretation of the frames, I would always regard it as an "intermediary".
RE: more WebSockets
On Thu, 12 Aug 2021, Daniel Stenberg via curl-library wrote:

> Should we just impose our own maximum size and have applications raise it
> when needed?

Answering myself. =)

Ok, I'm convinced we should make the API able to provide full messages. I'll adjust accordingly.

--

 / daniel.haxx.se
 | Commercial curl support up to 24x7 is available!
 | Private help, bug fixes, support, ports, new features
 | https://curl.se/support.html
Re: more WebSockets
On Thu, 12 Aug 2021, Weston Schmidt wrote:

> I'd like to add a flag to CURLOPT_WS_OPTIONS that tells curl if it should
> negotiate compression or not for easy & multi.

> I like the automatic response to pings & pongs by default. Perhaps another
> CURLOPT_WS_OPTIONS flag might disable the automatic response behavior in the
> cases where an app doesn't want to respond (or delay the response, etc).

Thanks, very good remarks and I've added some text about it now.
RE: more WebSockets
On Wed, 11 Aug 2021, Dmitry Karpov via curl-library wrote:

Thanks for the feedback, this is very helpful!

> From a brief look at the document, it looks like Curl will provide only
> WebSocket frame level of communication, so the client will have to implement
> full message assembling itself.

If by "assembling" you mean concatenating multiple frames until the FIN frame, then my thinking was yes, so that we wouldn't have to buffer up a potentially large amount of data before passing it on.

How do other client implementations work and how do they handle the unlimited message size? Should we just impose our own maximum size and have applications raise it when needed?

> If my understanding is correct, then it seems like a good initial approach
> to me - it handles the most critical WS steps: WS handshake, frame
> sending/receiving and error reporting, even though it leaves WS message
> layer communication (sending/receiving full message) to the clients.

That's my current thinking, yes. I want the API to be "good enough" to get sufficiently advanced websockets communication going against "most" server-side websockets implementations. When we think it is, we can start working on code to make it real and then see how it actually works with some early test client applications.

The feature will be marked EXPERIMENTAL until we deem it ready anyway, so there will be wiggle room to change things around all the way through until we decide it is fine enough to carve in stone (and ship enabled by default in a release).

> As it was mentioned in some of our WS-related discussions, the WS message
> layer is more complex than the framing layer

Can you elaborate on this? Aren't ws messages just the payload from N frames concatenated and delivered?

I know there can be control frames injected in the middle of a stream of data frames, but the only standard such frames are close, ping and pong, and I imagine libcurl would handle them and thus what is passed on to the client would be an unbroken stream of data frames.

> libcurl should provide a way to the client to handle incoming message data.
> This means that besides Frame-based "send/receive" callbacks, as described
> in the document, there should be message-based callbacks on top of the
> Frame-layer, which would allow clients to work with WS messages, rather than
> with WS frames.

There's no way then for libcurl to avoid having to buffer the entire ws message, right?

> A small note about iflags:
>
> " iflags is a bitmask featuring the following (incoming) flags:
>   CURLWS_TEXT - this is text data
>   CURLWS_BINARY - this is binary data
>   CURLWS_FIN - this is also the final fragment of a message
>   CURLWS_CLOSE - this transfer is now closed"
>
> The "Text" and "Binary" are special WS frame/message opcodes, so it is
> probably better to distinguish frame flags and the frame opcodes instead of
> mixing them together.

This has been mentioned before but I don't understand why. Why does it matter to an application exactly how the information arrived? The application doesn't see the websocket protocol and it doesn't have to know much about it using this API.

> If client is supposed to handle WS frames it gets from libcurl, then it
> needs to know the precise opcode along with the frame flags, so it can
> properly handle cases when "control" and "data" frames from different
> messages are intermixed (i.e. one large "data" message intermixed with many
> "control" messages) and when some "custom" opcodes are used for some
> proprietary WS communications.

To me, that sounds like an argument for providing all the opcodes through to the application. I didn't understand that they are actually used like that, especially not within the same message.

Can opcode 3-7 and b-f be used "at will" by implementations to signal something without negotiating an extension?
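The "payload from N frames concatenated, with control frames injected in between" model discussed above can be sketched as follows. This is an illustration in Python under stated assumptions, not the proposed libcurl API: frames are assumed to arrive already parsed as `(fin, opcode, payload)`, and `MessageAssembler` is a hypothetical name. Note that it buffers the whole message, which is exactly the cost raised in the reply.

```python
from typing import List, Optional, Tuple

CONTINUATION = 0x0  # opcode of every fragment after the first


class MessageAssembler:
    """Concatenate data-frame payloads until FIN; set control frames aside."""

    def __init__(self) -> None:
        self._opcode: Optional[int] = None   # opcode of the message in progress
        self._parts: List[bytes] = []
        self.control_frames: List[Tuple[int, bytes]] = []

    def feed(self, fin: bool, opcode: int, payload: bytes) -> Optional[Tuple[int, bytes]]:
        """Return (opcode, full_payload) when a message completes, else None."""
        if opcode & 0x08:
            # Control frames are never fragmented and may be interleaved
            # with the fragments of a data message.
            self.control_frames.append((opcode, payload))
            return None
        if opcode == CONTINUATION:
            if self._opcode is None:
                raise ValueError("continuation frame without a message in progress")
        else:
            if self._opcode is not None:
                raise ValueError("new data frame before the previous message finished")
            self._opcode = opcode
        self._parts.append(payload)
        if not fin:
            return None
        message = (self._opcode, b"".join(self._parts))
        self._opcode, self._parts = None, []
        return message
```

Because the original opcode is kept with the assembled message, a reserved data opcode (3-7) would pass through unchanged instead of being folded into "text" or "binary".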
Re: more WebSockets
>> Other websockets implementations are doing that then I presume?

I'll only speak to my implementation ... I provided both a streaming interface and a block/message interface. The block/message interface is nice for small stuff if you know the limits but, for all the reasons you point out, streaming through a library is safer, and you then delegate to the app to assemble.

Streaming is also simpler when you need to deal with how non-aligned UTF-8 encoded text is handled. A few extra callbacks with small bits of text can reduce larger allocations, copies, or buffers that have reserved padding at the start for the case where you have to carry over part of a single 4-byte character at the start of a block of text.

I like the proposal, Daniel. The few thoughts I have:

For the easy interface I'm not sure how valuable the curl_ws_poll() call will be. I like the simplicity of just tx and rx. That seems pretty useful for several simple cases. If you're doing more than that you probably are going to want the flexibility of multi.

I'd like to add a flag to CURLOPT_WS_OPTIONS that tells curl if it should negotiate compression or not for easy & multi. This allows users to negotiate their own subprotocols where compression may not be allowed and instruct curl to play nicely. Also, for debugging purposes this would be nice.

I like the automatic response to pings & pongs by default. Perhaps another CURLOPT_WS_OPTIONS flag might disable the automatic response behavior in the cases where an app doesn't want to respond (or delay the response, etc).

Since pings and pongs are allowed to contain application data, it would be useful to send that through the CURL_WS_WRITE callback with a CURLWS_PING or CURLWS_PONG flag so the application gets the payload data.

The ability for a client to initiate a ping with its own arbitrary data is valuable, as that enables bidirectional health checking of a connection at the application layer. Pings can be effectively used for all sorts of interesting out-of-band data while large transfers are happening.

Wes
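The point about UTF-8 text that is not aligned with fragment boundaries can be illustrated with a small Python sketch (an illustration only; `decode_text_fragments` is a hypothetical helper, not part of any library discussed here). It relies on the standard library's incremental decoder, which carries the trailing bytes of a split character over to the next fragment, so no fragment ever has to be copied into a larger buffer.

```python
import codecs
from typing import Iterable, Iterator


def decode_text_fragments(fragments: Iterable[bytes]) -> Iterator[str]:
    """Yield decoded text piece by piece from the fragments of one text message."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    for chunk in fragments:
        # Bytes of an incomplete character at the end of the chunk are
        # kept inside the decoder until the next chunk completes them.
        text = decoder.decode(chunk)
        if text:
            yield text
    # At the end of the message nothing may be left over; a truncated
    # character raises UnicodeDecodeError here.
    decoder.decode(b"", final=True)
```

For example, a 4-byte character split as one byte at the end of a fragment and three bytes at the start of the next is delivered whole with the second piece of text.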
Re: more WebSockets
On Wed, 11 Aug 2021, Felipe Gasper wrote:

>> When a single frame can be 61 bits large?

(Of course I meant 63...)

And thanks for this. As you know I'm a WebSockets rookie so I need and appreciate pointers like this!

> I believe most implementations enforce a maximum message length. Mojolicious
> (Perl), for example, stipulates 256 KiB by default.
> (https://metacpan.org/pod/Mojo::Transaction::WebSocket#max_websocket_size) I
> think Firefox is 2 GiB.

It could of course work to have a maximum message size set, but this makes me curious. Surely a client will run into problems if you use a 256KB max size against a server-side websocket thing that assumes much larger?

Using up to a 2 gigabyte buffer for a single message is still several orders of magnitude larger than I would want libcurl to do.

Other websockets implementations are doing that then I presume?
Re: more WebSockets
> On Aug 11, 2021, at 6:34 PM, Daniel Stenberg wrote:
>
> On Wed, 11 Aug 2021, Felipe Gasper wrote:
>
>> Why frame by frame? JS’s API only does full messages, and I think the RFC
>> actually stipulates that.
>
> When a single frame can be 61 bits large?

I believe most implementations enforce a maximum message length. Mojolicious (Perl), for example, stipulates 256 KiB by default. (https://metacpan.org/pod/Mojo::Transaction::WebSocket#max_websocket_size) I think Firefox is 2 GiB.

WS close code 1009 serves this purpose. This can be enforced without receiving a full frame: parse the frame header, determine the size, add it to the previously-received size, and if it exceeds the limit, fail the connection.

That said, the RFC does, I now see, explicitly allow for streaming-type interfaces that give individual frame contents to the application. Frame boundaries, though, don’t have the same guarantees that message boundaries do; intermediaries/proxies are free to reshuffle those as they wish. (With TLS now being so prevalent that _probably_ doesn’t happen very often, though.)

-F
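The header-only size check described above could look roughly like this. It is a sketch in Python under stated assumptions, not an existing API: `parse_payload_length`, `SizeLimiter` and `MessageTooBig` are hypothetical names, and the header layout (7-bit length, or 126 followed by 16 bits, or 127 followed by 64 bits whose top bit must be zero) follows RFC 6455 section 5.2.

```python
import struct
from typing import Tuple

CLOSE_MESSAGE_TOO_BIG = 1009  # close code from RFC 6455 section 7.4.1


class MessageTooBig(Exception):
    close_code = CLOSE_MESSAGE_TOO_BIG


def parse_payload_length(header: bytes) -> Tuple[int, int]:
    """Return (payload length, header bytes used before any masking key)."""
    if len(header) < 2:
        raise ValueError("need at least 2 header bytes")
    short = header[1] & 0x7F           # the top bit of this byte is the MASK flag
    if short < 126:
        return short, 2
    if short == 126:
        if len(header) < 4:
            raise ValueError("need 4 header bytes for a 16-bit length")
        return struct.unpack("!H", header[2:4])[0], 4
    if len(header) < 10:
        raise ValueError("need 10 header bytes for a 64-bit length")
    length = struct.unpack("!Q", header[2:10])[0]
    if length >> 63:
        raise ValueError("most significant bit of a 64-bit length must be 0")
    return length, 10


class SizeLimiter:
    """Track the size of the message in progress using frame headers only."""

    def __init__(self, max_message_size: int) -> None:
        self.max_message_size = max_message_size
        self.received = 0

    def on_frame_header(self, header: bytes) -> None:
        length, _ = parse_payload_length(header)
        if header[0] & 0x08:
            return                      # control frames are not part of the message
        self.received += length
        if self.received > self.max_message_size:
            raise MessageTooBig("message exceeds %d bytes" % self.max_message_size)
        if header[0] & 0x80:            # FIN: the next data frame starts a new message
            self.received = 0
```

The check runs before any payload is read, so an oversized message can be refused with close code 1009 without allocating a buffer for it.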
RE: more WebSockets
Hi Daniel,

From a brief look at the document, it looks like Curl will provide only WebSocket frame level of communication, so the client will have to implement full message assembling itself.

Summarizing this approach, it seems that libcurl will provide the following:

1. WS handshake handling over HTTP(S). This will include the "upgrade" and handling WS handshake errors (like the WS protocol requested by the client not being selected by the server etc) and probably handling of some well-known WS extensions like compression.

2. Basic WS framing capabilities - sending/receiving WS frames with all necessary information (flags, data) needed for the client to implement full message assembling. This should include compression/decompression, so the client will not have to deal with this low-level stuff.

3. Proper WS closure implementation, so the client will have to specify only the close code and reason for client-initiated closures and get the close code and reason for server-initiated closures (i.e. some WS info options or a close callback).

4. "WS alone" mode. Perform only the basic WS handshake (probably with handling of the compression extension), so the client will be able to send/receive raw WS frames and do some custom processing.

5. Provide automatic Ping/Pong response and timer-based Ping/Pong pinging (with optional client-supplied Ping data).

6. Provide WS-specific error reporting - via proper WS error codes etc.

If my understanding is correct, then it seems like a good initial approach to me - it handles the most critical WS steps: WS handshake, frame sending/receiving and error reporting, even though it leaves WS message layer communication (sending/receiving full message) to the clients.

As it was mentioned in some of our WS-related discussions, the WS message layer is more complex than the framing layer, and potentially fully assembled WS messages can be huge, so libcurl should provide a way to the client to handle incoming message data.

This means that besides Frame-based "send/receive" callbacks, as described in the document, there should be message-based callbacks on top of the Frame-layer, which would allow clients to work with WS messages, rather than with WS frames. But this would require implementation of a WS message layer in libcurl, which can be done in some subsequent extension of WS support in libcurl.

A small note about iflags:

" iflags is a bitmask featuring the following (incoming) flags:
  CURLWS_TEXT - this is text data
  CURLWS_BINARY - this is binary data
  CURLWS_FIN - this is also the final fragment of a message
  CURLWS_CLOSE - this transfer is now closed"

The "Text" and "Binary" are special WS frame/message opcodes, so it is probably better to distinguish frame flags and the frame opcodes instead of mixing them together.

If the client is supposed to handle WS frames it gets from libcurl, then it needs to know the precise opcode along with the frame flags, so it can properly handle cases when "control" and "data" frames from different messages are intermixed (i.e. one large "data" message intermixed with many "control" messages) and when some "custom" opcodes are used for some proprietary WS communications.

Thanks,
Dmitry Karpov

-----Original Message-----
From: curl-library On Behalf Of Daniel Stenberg via curl-library
Sent: Wednesday, August 11, 2021 2:41 PM
To: libcurl hacking
Cc: Daniel Stenberg
Subject: more WebSockets

Hi,

I've refreshed the wiki page a bit using input from the discussion so far. See https://github.com/curl/curl/wiki/WebSockets

A few things I realized and tried to reflect in the page:

A single fragment can be 61 bits large and a message consists of multiple such fragments: we must have an API that provides data piece by piece to the application and signals the FIN when it arrives.

We need to provide a callback-based approach (as well) to allow for many concurrent websocket transfers - especially for applications that want to mix those up with a few "regular protocol" transfers as well. I've tried to describe how it could work. Not sure it is flexible enough.

I added a few questions marked "TBD" in there that I don't think we have answered yet.

I think we can design an API that can work. What are the biggest omissions or mistakes in the current draft?
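For the closure handling in point 3 of the list above (the client supplies or receives only a close code and a reason), the payload of a Close frame is small enough to sketch. The following Python illustration assumes the RFC 6455 layout: a 2-byte big-endian status code followed by a UTF-8 reason, all within the 125-byte control-frame payload limit. `build_close_payload` and `parse_close_payload` are hypothetical names, not part of any proposed libcurl API.

```python
import struct
from typing import Tuple

NO_STATUS_RECEIVED = 1005  # reported locally when a Close frame carries no code


def build_close_payload(code: int, reason: str = "") -> bytes:
    """Encode a close code and reason as a Close frame payload."""
    data = reason.encode("utf-8")
    # Control frame payloads are limited to 125 bytes; 2 are used by the code.
    if len(data) > 123:
        raise ValueError("close reason is limited to 123 bytes of UTF-8")
    return struct.pack("!H", code) + data


def parse_close_payload(payload: bytes) -> Tuple[int, str]:
    """Decode a Close frame payload into (code, reason)."""
    if not payload:
        return NO_STATUS_RECEIVED, ""
    if len(payload) == 1:
        raise ValueError("close payload must be empty or at least 2 bytes")
    (code,) = struct.unpack("!H", payload[:2])
    return code, payload[2:].decode("utf-8")
```

A library that owns this encoding lets the application deal only with the pair (code, reason) in both directions.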
Re: more WebSockets
On Wed, 11 Aug 2021, Felipe Gasper wrote:

> Why frame by frame? JS’s API only does full messages, and I think the RFC
> actually stipulates that.

When a single frame can be 61 bits large?
Re: more WebSockets
> On Aug 11, 2021, at 17:46, Daniel Stenberg via curl-library wrote:
>
> A single fragment can be 61 bits large and a message consists of multiple
> such fragments: we must have an API that provides data piece by piece to the
> application and signals the FIN when it arrives.

Why frame by frame? JS’s API only does full messages, and I think the RFC actually stipulates that.

-F