On 10-Sep-22 07:44, Patrick Monnerat via curl-library wrote:

On 9/13/21 13:01, Daniel Stenberg via curl-library wrote:
Hi team.

We added support for GOPHERS in late 2020. There's a new PR proposing support for the ManageSieve protocol. We had a PR previously suggesting Gemini support and the other day ICAP was brought up in a discussion. WebSockets is another common one discussed.

I don't think it's crazy to imagine that we might add support for more protocols going forward. Sooner or later.



This is not a problem we must solve *right now*, but I would feel better if we have an idea about how to address it when we get there. Because I'm convinced we will reach this point eventually.


One year later, all protocol bits are used!

In the meantime, CURLOPT_PROTOCOLS_STR has been added for callers' use, but it only translates to bits, and the internal problem has not been resolved yet.

IMO, using strings internally is much too expensive in overhead.

Do we now have an idea of how we want to extend this internally?

- Use a packed struct of bools. Requires C99 for initialization. Very clear code for constant protocols but hard to access for a run-time computed protocol number.

- Use an array of 8-bit flags. Also requires C99 for initialization.

- Use a packed array of flags. Almost impossible to initialize statically.

- Use an array of protocol numbers. High run-time overhead.

- Drop support for non-64bit curl_off_t.

- Use a struct with a second set of flags (named CURLPROTO2_*)

- Something else...
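
A minimal sketch of the "array of 8-bit flags" option, assuming C99 is acceptable. The enum values, the struct, and the function name are hypothetical illustrations and are not taken from curl's source:

```c
#include <stddef.h>

/* Hypothetical protocol numbers; not curl's real identifiers. */
enum proto_id {
  PROTO_HTTP,
  PROTO_HTTPS,
  PROTO_FTP,
  PROTO_GOPHERS,
  PROTO_WS,
  PROTO_LAST /* number of protocols, not a protocol itself */
};

/* One byte per protocol: nonzero means "allowed". */
struct proto_set {
  unsigned char allowed[PROTO_LAST];
};

/* C99 designated initializers: entries not named here default to 0. */
static const struct proto_set default_protos = {
  .allowed = {
    [PROTO_HTTP] = 1,
    [PROTO_HTTPS] = 1,
    [PROTO_FTP] = 1
  }
};

/* Works the same for a constant or a run-time computed protocol number.
   Returns 1 if allowed, 0 if not allowed or out of range. */
static int proto_allowed(const struct proto_set *set, int id)
{
  return id >= 0 && id < PROTO_LAST && set->allowed[id];
}
```

The cost is one byte per protocol instead of one bit, in exchange for a plain array index at run time and no limit tied to the width of an integer type.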


Adding another protocol will only be possible after this problem is resolved.

I could look at it for an implementation if I knew in which direction to go.


BTW: the websockets protocols are not (yet) handled by protocol2num().

Patrick


I rather like the array of protocol numbers.  The overhead needn't be particularly high, especially considering the use cases.

For example: at compile time (or even at curl global init), sort the array. That allows a binary search to query support; bsearch() and qsort() are standard.

Further, any application is likely to query protocol support infrequently (typically at initialization), and is likely to be interested in only a few of the protocols. So it could (and could be encouraged to) cache the results in a compact, application-specific way.

For a binary search, the number of probes to find a protocol is roughly log2(N), so about 8 even with 256 protocols. It's also easy to enumerate supported protocols with a linear scan.
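
A minimal sketch of that idea, using only standard qsort() and bsearch(). The protocol numbers and the function names are made up for illustration and do not correspond to anything in curl:

```c
#include <stdlib.h>

/* Hypothetical protocol numbers, deliberately unsorted. */
static int supported[] = { 17, 3, 42, 8, 1 };
#define NSUPPORTED (sizeof(supported) / sizeof(supported[0]))

/* Comparison callback shared by qsort() and bsearch(). */
static int cmp_int(const void *a, const void *b)
{
  int x = *(const int *)a;
  int y = *(const int *)b;
  return (x > y) - (x < y); /* avoids overflow of x - y */
}

/* Sort once, e.g. at global init. */
static void protocols_init(void)
{
  qsort(supported, NSUPPORTED, sizeof(supported[0]), cmp_int);
}

/* Binary search in the sorted array; returns 1 if found, else 0. */
static int protocol_supported(int proto)
{
  return bsearch(&proto, supported, NSUPPORTED,
                 sizeof(supported[0]), cmp_int) != NULL;
}
```

An application would call protocols_init() once, then query protocol_supported() for the few protocols it cares about and cache the answers.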

Many of the same arguments apply to an array of (pointers to) strings; in addition to a simple ordered table with binary search, the hsearch_r() family could be used. But the overhead is higher, and intuitively it is not likely to be worthwhile with short protocol names and a relatively modest number of protocols. O(1) for a hash isn't much different from fewer than 10 probes for an infrequent binary search.

Either seems reasonable; numbers are simpler and more compact.

Neither enumerating nor querying protocol support should be on the critical path. Over-optimization is not worthwhile.

Timothe Litt
ACM Distinguished Engineer
--------------------------
This communication may not represent the ACM or my employer's views,
if any, on the matters discussed.
-- 
Unsubscribe: https://lists.haxx.se/listinfo/curl-library
Etiquette:   https://curl.se/mail/etiquette.html
