Re: Above and beyond 32 protocols

Timothe Litt via curl-library Sat, 10 Sep 2022 05:31:39 -0700


On 10-Sep-22 07:44, Patrick Monnerat via curl-library wrote:

On 9/13/21 13:01, Daniel Stenberg via curl-library wrote:
Hi team.
We added support for GOPHERS in late 2020. There's a new PR proposing support for the ManageSieve protocol. We had a PR previously suggesting Gemini support and the other day ICAP was brought up in a discussion. WebSockets is another common one discussed.
I don't think it's crazy to imagine that we might add support for more protocols going forward. Sooner or later.
This is not a problem we must solve *right now*, but I would feel better if we have an idea about how to address it when we get there. Because I'm convinced we will reach this point eventually.
One year later, all protocol bits are used!
In the meantime, CURLOPT_PROTOCOLS_STR has been added for caller's use, but this only translates to bits and the internal problem has not been resolved yet.
IMO, using strings internally is much too expensive in overhead.

Do we now have an idea of how we want to extend this internally?
- Use a packed struct of bools. Requires C99 for initialization. Very clear code for constant protocols but hard to access for a run-time computed protocol number.
- Use an array of 8-bit flags. Also requires C99 for initialization.
- Use a packed array of flags. Almost impossible to initialize statically.
- Use an array of protocol numbers. High run-time overhead.

- Drop support for non-64bit curl_off_t.

- Use a struct with a second set of flags (named CURLPROTO2_*)

- Something else...
Adding another protocol will only be possible after this problem is resolved.
I could look at it for an implementation if I knew in which direction to go.
BTW: the websockets protocols are not (yet) handled by protocol2num().

Patrick

I rather like the array of protocol numbers. The overhead needn't be particularly high, especially considering the use cases.

For example: at compile time (or even at curl global init), sort the array, which allows a binary search to query support; bsearch() and qsort() are standard. Further, any application is likely to query protocol support infrequently (typically at initialization), and is also likely to be interested in only a few of the protocols. So it could (and could be encouraged to) cache the results in a compact, application-specific way. For a binary search, the number of probes to find a protocol is about log2(N), so roughly 8 even with 256 protocols. It's also easy to enumerate supported protocols with a linear scan.
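
A minimal sketch of the sorted-array idea, using only the standard qsort() and bsearch(). The enum values and the proto_set_* helper names are hypothetical, made up for illustration; they are not libcurl identifiers or curl's internal representation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical protocol numbers, for illustration only. */
enum proto_id {
  PROTO_DICT = 1, PROTO_FILE, PROTO_FTP, PROTO_GOPHER,
  PROTO_HTTP, PROTO_HTTPS, PROTO_IMAP, PROTO_WS
};

/* Three-way comparison of two ints, shared by qsort() and bsearch(). */
static int cmp_proto(const void *a, const void *b)
{
  int x = *(const int *)a;
  int y = *(const int *)b;
  return (x > y) - (x < y);
}

/* Sort the set once, e.g. at global init. */
static void proto_set_sort(int *set, size_t n)
{
  qsort(set, n, sizeof(set[0]), cmp_proto);
}

/* Return nonzero if 'proto' is in the sorted set (binary search). */
static int proto_set_has(const int *set, size_t n, int proto)
{
  assert(set != NULL || n == 0);
  return n != 0 &&
         bsearch(&proto, set, n, sizeof(set[0]), cmp_proto) != NULL;
}
```

A caller would sort the array once and then query it as needed; enumerating the supported protocols is a plain loop over the same array.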

Many of the same arguments apply to an array of (pointers to) strings; in addition to a simple ordered table/binary search, the hsearch_r() family could be used. But the overhead is higher, and intuitively not likely worthwhile with short protocol names and a relatively modest number of protocols. O(1) for a hash isn't much different from O(<10) for a(n infrequent) binary search.
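
The string variant could look roughly like this: an ordered table of names searched with bsearch() and strcmp(). The table contents and the proto_name_supported() helper are hypothetical, and the comparison is case-sensitive to keep the sketch short; the hash-table alternative is not shown.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical table of scheme names; must stay sorted in strcmp() order. */
static const char *const proto_names[] = {
  "ftp", "gopher", "http", "https", "imap", "ws"
};

/* bsearch() passes the key first and a pointer to the array element second. */
static int cmp_name(const void *key, const void *elem)
{
  return strcmp((const char *)key, *(const char *const *)elem);
}

/* Return nonzero if 'name' appears in the table. */
static int proto_name_supported(const char *name)
{
  size_t n = sizeof(proto_names) / sizeof(proto_names[0]);
  return bsearch(name, proto_names, n, sizeof(proto_names[0]),
                 cmp_name) != NULL;
}
```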


Either seems reasonable; an array of numbers is simpler and more compact.

Neither enumerating nor querying protocol support should be critical path items. Over-optimization is not worthwhile.


Timothe Litt
ACM Distinguished Engineer
--------------------------
This communication may not represent the ACM or my employer's views,
if any, on the matters discussed.

-- 
Unsubscribe: https://lists.haxx.se/listinfo/curl-library
Etiquette:   https://curl.se/mail/etiquette.html
