Hi Tim,
On Wed, Sep 23, 2020 at 11:02:47PM +0200, Tim Düsterhus wrote:
> Willy,
>
> please excuse my belated reply. First the weekend got in the way, and
> then I forgot about your email.
Well, you don't need to apologize :-) I'd rather have design discussions
take their time than be rushed anyway.
> Unfortunately no one else wanted
> to add their 2ct so far :-( I believe this is a topic that could benefit
> from administrator opinions.
It could also be indicative of something we already know, which is that
once users have configured their systems, the last thing they want to
do is reconfigure them!
> On 18.09.20 at 15:26, Willy Tarreau wrote:
> >> Then there are quite a few redundant converters. We have `sha1`, then
> >> `sha2` and most recently we have `digest`.
> >
> > From what I'm reading in the commit message (the doc not being clear
> > on this point), they don't seem to overlap as "digest" seems to return
> > an HMAC. In addition, "sha1" doesn't require openssl so it comes for
> > free. But anyway I get your point and generally agree.
>
> No, digest() is a plain hash. hmac() is for HMACs :-) Both were added in
> the same commit, though.
OK, but in any case there's no info in the doc about supported algos
beyond "sha256" that is given as an example, so I'm pretty sure both
are mostly unused at the moment :-/
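
To make the distinction between a plain hash and an HMAC concrete, here is
a minimal sketch in Python rather than in HAProxy configuration. It only
illustrates the two concepts; the helper names plain_digest and keyed_hmac
are made up for the example and say nothing about how the converters are
implemented or which algorithms they accept.

```python
import hashlib
import hmac


def plain_digest(algo: str, data: bytes) -> bytes:
    """Unkeyed hash: anyone can recompute it from the data alone."""
    return hashlib.new(algo, data).digest()


def keyed_hmac(algo: str, key: bytes, data: bytes) -> bytes:
    """Keyed HMAC: recomputing it requires knowing the secret key."""
    return hmac.new(key, data, algo).digest()


if __name__ == "__main__":
    data = b"example.com"
    # Same input, same algorithm, but the two results differ
    print(plain_digest("sha256", data).hex())
    print(keyed_hmac("sha256", b"secret", data).hex())
```

The hash depends only on the input, while the HMAC also changes with the
key, which is why the two cannot stand in for each other.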
> >> But there's also
> >> a difference in naming. Sometimes the namespace is separated by a dot
> >> (`req.hdr`). Sometimes it's an underscore (`ssl_fc`).
> >
> > Yep. The dot was added later to try to group all those working within
> > the same domain (e.g. the request or the response). We also (re)discovered
> > that the automatic binding making them directly available from Lua is less
> > transparent with the dot as that one has to be turned to an underscore
> > there. But it's less common and the dot still appears much more readable
> > to me.
>
> Yes, I agree. Maybe it's because the dot feels natural from the member
> access of a struct in C (or as a property access in many other
> programming languages). It feels like a natural namespace separator to me.
Sure, but my first nasty encounter with underscores, the one that made me
want to get rid of them, was seeing them not displayed on the bottom line
of a screen connected to a KVM in a rack, which made working on a config
a real pain. That's when I realized that they're mostly used by developers
to name symbols in code but rarely in config keywords, where the dash
usually replaces the space and the dot serves as a namespace delimiter,
and that underscores in config keywords are awkward and ought to be
avoided.
> >> s.field() instead of field()
> >> s.word() instead of word()
> >> s.json() instead of json()
> >
> > Maybe, why not for these ones. But then json() alone cannot stay
> > this way and needs to be called json-enc() or something like this
> > since it only encodes and doesn't decode.
>
> Technically even json_enc() is not entirely appropriate, because the
> only thing it does is encode a value so that the result is valid as the
> contents of a JSON string. Specifically bool(1),json() is simply `1`,
> not `true`. And str("foo"),json() is `foo`, not `"foo"`.
I think you already explained that to me, and the fact that I don't
remember it indicates that the current naming isn't the best one.
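
The behaviour Tim describes (escaping a value so it can be placed inside a
JSON string, without adding the quotes and without mapping it to a JSON
type) can be approximated by the Python sketch below. It is not how
HAProxy implements json(); the helper name json_string_contents is
hypothetical, and the exact escaping of non-ASCII characters may differ
from what the converter does.

```python
import json


def json_string_contents(value) -> str:
    """Escape a value for use inside a JSON string, without the quotes."""
    text = str(value)
    # json.dumps() returns a complete quoted JSON string;
    # dropping the first and last character leaves only the contents.
    return json.dumps(text)[1:-1]


if __name__ == "__main__":
    print(json_string_contents(1))
    print(json_string_contents("foo"))
    print(json_string_contents('say "hi"'))
```

The caller still has to add the surrounding quotes to get a valid JSON
value, which is why a name suggesting a full JSON encoder is misleading.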
> >> Make 'http.' contain HTTP related converters.
> >>
> >> http.language() instead of language()
> >
> > This already becomes difficult. http is very wide and touches all
> > areas. For example you'll use http.date() to encode an HTTP date
> > while plenty of users will search for it in "date.something" because
> > it's just one way to represent a date, or "time.something" because a
> > date is first and foremost something related to time.
>
> I guess for that, date() would best be generalized to take the desired
> format as a parameter, allowing e.g. date(iso8601), while initially only
> supporting dates for use in HTTP headers.
Yes, possibly.
> I must admit that I never used the math operators for anything before.
> No idea how commonly they are used in real world configurations.
I've mostly seen them used to compute thresholds based on historic values
extracted from stick tables, and used to compute expiration dates to
place in headers.
> >> Make 'sec.' contain security related converters.
> >>
> >> sec.digest() instead of digest()
> >> deprecate sha1, sha2 in favor of digest()
> >
> > Actually I'd rather have "hash.something" for everything related to
> > hashing, as most of the time it has nothing to do with security, and
> > making people think that "djb2" or "crc32" are secure is a problem for
> > me, and personally I don't see a reason why these ones wouldn't appear
> > next to "sha1" or "digest".
>
> Yes, that makes sense to me.
Another point to address then is the inconsistency between "digest(function)"
and "hash.function".
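
One possible way to reconcile the two spellings is to have both resolve to
the same table of functions. The Python sketch below is purely
illustrative: the HASHES table, the resolve() helper and the set of
algorithms are assumptions made for the example, not a proposal for the
actual keyword list.

```python
import hashlib
import zlib

# Hypothetical registry mixing checksums and cryptographic hashes,
# which is the reason for a neutral "hash." namespace instead of "sec."
HASHES = {
    "crc32": lambda data: zlib.crc32(data).to_bytes(4, "big"),
    "sha1": lambda data: hashlib.sha1(data).digest(),
    "sha256": lambda data: hashlib.sha256(data).digest(),
}


def digest(function: str, data: bytes) -> bytes:
    """Parameterized spelling: digest(<function>)."""
    return HASHES[function](data)


def resolve(keyword: str):
    """Namespaced spelling: 'hash.<function>' maps to the same entry."""
    namespace, _, function = keyword.partition(".")
    if namespace != "hash" or function not in HASHES:
        raise KeyError(keyword)
    return HASHES[function]


if __name__ == "__main__":
    # Both spellings reach the same implementation
    print(resolve("hash.sha1")(b"abc") == digest("sha1", b"abc"))
```

With a single table behind both forms, the two spellings cannot drift
apart, whichever one ends up being the documented one.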
> >> sec.secure_memcmp() instead of secure_memcmp()
> >
> > Maybe "sec.memcmp()" or s.secure_memcmp() since it's about strings ?
>
> Here we are starting bikeshedding. Personally I would be happy with any
> improvement and I just attempted to give a few examples. Someone
> definitely needs to draft up a consistent and complete set to allow
> giving proper feedback. Discussing single converters is not
> helping, I guess.
Probably, but at some point this discussion will be necessary. The task
is tedious and I doubt a single person will handle it and propose
something at once, so whoever starts will need to be prepared to have
discussions (and bikeshed sessions) on certain groups of keywords.
(...)
> > Hashing function
> > long name | short name | purpose
> > -------------------+-------------+---------------------------------------
> > hash.crc32 | crc32 | hashes binary input using crc32 algo
(...)
> Yes, such a categorized list would certainly solve most of the current
> pain points of the converter documentation. I probably would even leave
> out the purpose column and instead rely on a descriptive name + the
> current long form description. That should make maintaining the
> documentation a bit easier. Most of the purposes you listed there are
> painfully tautological. I mean ... it's obvious that hash.crc32 will
> create a CRC-32 hash.
For this one yes, but for some it may help, while for others we'd
probably only need warnings, or info such as "returns text" vs
"returns an integer". So most likely we should instead have a "Notes"
column that will most often remain empty, and the purpose should be
described in the long keyword section instead.
> I believe that the short names of the more uncommon converters should be
> deprecated in the long run, though.
I don't think so, based on what I've seen everywhere people have tried to
deprecate the short version of something, like "wr mem" on Cisco router
configs vs "copy running-config startup-config" that absolutely nobody
uses. In the end what remains natural and short for users can never be
removed, and the gap between developers' expectations and real usage
creates even more confusion. There might be a few that are seldom used
and are easy to rename since it will not make a huge difference, but I
think most of them will remain as-is. This is also why I constantly say
that it's terribly difficult to name a config option, because once
released it cannot be renamed. It simply means that the main long-term
naming scheme should be decided very early so that at least new keywords
can follow it.
> This keeps the list nice and tidy
> and makes the configuration more self-documenting. Especially since the
> current short names are not ideally named as we established in this
> conversation.
Yep!
Regards,
Willy