On 10-Sep-22 07:44, Patrick Monnerat via curl-library wrote:
On 9/13/21 13:01, Daniel Stenberg via curl-library wrote:
Hi team.
We added support for GOPHERS in late 2020. There's a new PR proposing
support for the ManageSieve protocol. We had a PR previously
suggesting Gemini support and the other day ICAP was brought up in a
discussion. WebSockets is another common one discussed.
I don't think it's crazy to imagine that we might add support for
more protocols going forward. Sooner or later.
This is not a problem we must solve *right now*, but I would feel
better if we have an idea about how to address it when we get there.
Because I'm convinced we will reach this point eventually.
One year later, all protocol bits are used!
In the meantime, CURLOPT_PROTOCOLS_STR has been added for callers to
use, but it only translates to bits, and the internal problem has not
been resolved yet.
IMO, using strings internally is much too expensive in overhead.
Do we now have an idea of how we want to extend this internally?
- Use a packed struct of bools. Requires C99 for initialization. Very
clear code for constant protocols but hard to access for a run-time
computed protocol number.
- Use an array of 8-bit flags. Also requires C99 for initialization.
- Use a packed array of flags. Almost impossible to initialize
statically.
- Use an array of protocol numbers. High run-time overhead.
- Drop support for non-64bit curl_off_t.
- Use a struct with a second set of flags (named CURLPROTO2_*)
- Something else...
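
To make the "array of 8-bit flags" option concrete, here is a minimal
sketch, not a proposal for the actual internals. The enum names, the
table and the helper are all hypothetical; the only point is that C99
designated initializers keep the constant case readable while a
run-time computed protocol number is just an index.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical protocol numbers, for illustration only. These are not
   curl's real CURLPROTO_* values. */
enum proto_id {
  ID_HTTP,
  ID_HTTPS,
  ID_FTP,
  ID_GOPHERS,
  ID_SIEVE,
  ID_WS,
  ID_LAST /* number of entries, not a protocol */
};

/* One 8-bit flag per protocol. C99 designated initializers set the
   allowed entries; every entry not named is zero-initialized. */
static const unsigned char allowed[ID_LAST] = {
  [ID_HTTP] = 1,
  [ID_HTTPS] = 1,
  [ID_WS] = 1
};

/* Query with a run-time computed protocol number: a bounds check plus
   one array access. */
static int proto_allowed(const unsigned char *set, int id)
{
  assert(set != NULL);
  return id >= 0 && id < ID_LAST && set[id] != 0;
}
```

With this layout, proto_allowed(allowed, ID_HTTP) is non-zero and
proto_allowed(allowed, ID_FTP) is zero. The cost is one byte per
protocol per set, and the table can no longer be combined or compared
with a single bitwise operation as a bitmask can.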
Adding another protocol will only be possible after this problem is
resolved.
I could look at it for an implementation if I knew in which direction
to go.
BTW: the websockets protocols are not (yet) handled by protocol2num().
Patrick
I rather like the array of protocol numbers. The overhead needn't be
particularly high, especially considering the use cases.
For example: at compile time (or even at curl global init), sort the
array, which allows a binary search to query support; bsearch() and
qsort() are standard. Further, any application is likely to query
protocol support infrequently (typically at initialization), and is
also likely to be interested in only a few of the protocols. So it
could (and could be encouraged to) cache the results in a compact,
application-specific way. For a binary search, the number of probes to
find a protocol is about log2(N), so even with 256 protocols, roughly 8.
It's also easy to enumerate supported protocols with a linear scan.
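
A rough sketch of what I mean, assuming protocols are identified by
small integers. The enum values, the table contents and the function
names are made up for illustration and are not existing curl symbols;
only qsort() and bsearch() are real, standard C library calls.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical protocol numbers, not curl's actual identifiers. */
enum proto {
  PROTO_DICT = 1,
  PROTO_FTP,
  PROTO_GOPHER,
  PROTO_HTTP,
  PROTO_HTTPS,
  PROTO_IMAP,
  PROTO_SIEVE,
  PROTO_WS
};

/* Supported protocols, deliberately listed unsorted here. */
static int supported[] = { PROTO_HTTPS, PROTO_FTP, PROTO_WS, PROTO_HTTP };
#define NSUPPORTED (sizeof(supported) / sizeof(supported[0]))

/* Comparison callback shared by qsort() and bsearch(). */
static int cmp_proto(const void *a, const void *b)
{
  int x = *(const int *)a;
  int y = *(const int *)b;
  return (x > y) - (x < y);
}

/* Sort once, e.g. at global init, so lookups can use binary search. */
static void protos_init(void)
{
  size_t i;
  qsort(supported, NSUPPORTED, sizeof(supported[0]), cmp_proto);
  for(i = 1; i < NSUPPORTED; i++)
    assert(supported[i - 1] <= supported[i]);
}

/* Returns non-zero if the protocol number is in the sorted table. */
static int proto_is_supported(int proto)
{
  return bsearch(&proto, supported, NSUPPORTED, sizeof(supported[0]),
                 cmp_proto) != NULL;
}
```

A caller would run protos_init() once and then ask, say,
proto_is_supported(PROTO_HTTP), keeping the answer in whatever form
suits it. Enumerating is a plain loop over supported[].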
Many of the same arguments apply to an array of (pointers to) strings;
in addition to a simple ordered table/binary search, the hsearch_r()
family could be used. But the overhead is higher, and intuitively not
likely worthwhile with short protocol names, and a relatively modest
number of protocols. O(1) for a hash isn't much different from O(<10)
for a(n infrequent) binary search.
Either seems reasonable; numbers are simpler and more compact.
Neither enumerating nor querying protocol support should be critical
path items. Over-optimization is not worthwhile.
Timothe Litt
ACM Distinguished Engineer
--------------------------
This communication may not represent the ACM or my employer's views,
if any, on the matters discussed.