On 5/23/25 02:48, Willy Tarreau wrote:
> On Fri, May 23, 2025 at 12:46:13AM -0400, Demi Marie Obenour wrote:
>>> ... and blocks on some of them if lacking entropy at boot. It's only
>>> very recently that it finally adopted a timeout (which is still quite
>>> long by the way). We've had this problem already on some products. I'd
>>> rather seed a secure prng and only rely on it later, this also has the
>>> advantage of being portable.
>>
>> That *should* be easy to mitigate by calling getrandom() during HAProxy
>> startup. Once it stops blocking it should never block again.
> 
> That's what we already do, just like calling RAND_bytes() for openssl,
> and a few other things for the same reasons. But those starting a
> headless VM seeing the service not being available and taking ages to
> start are not necessarily happy.

VMs should have a paravirtualized entropy source (in the non-confidential
computing case) or just rely on RDRAND or an equivalent instruction (in the
confidential computing case).

>> there is RDRAND on many x86 CPUs.
> 
> Yes, it may only be used as a complement since it's not everywhere. That's
> exactly why we're currently pre-seeding from many different sources at
> boot.

I think it makes sense to rely on the kernel for that, as all entropy
ultimately comes from there.

>>>>> But we could pretty well replace our implementation with a constant
>>>>> time one if needed. From memory we have two implementations, the
>>>>> normal one and a URL-safe one. But do not hesitate to have a look at
>>>>> its replacement. If you need to import a file from libsodium, please
>>>>> place it in src/ for .c, or in include/import/ for .h, and try to
>>>>> change the least possible files there so that it's possible to update
>>>>> from time to time (like we do for xxhash, slz, trees etc). Otherwise
>>>>> if it's just a matter of replacing a function or two, it's OK to
>>>>> just copy them into base64.c, but then please mention in the comment
>>>>> on top of the function where it comes from (both to help check for
>>>>> updates and for crediting the original author).
>>>>
>>>> That should be doable, though I don't have any immediate plans to do
>>>> this in the near term.
>>>
>>> OK!
>>
>> One change that I would also like to see (but don't have any immediate
>> plans to implement) is a "just use it" high-level encryption/decryption
>> function.  I'm thinking something like this:
>>
>> root-secret /path/to/secret/key/file [auto-populate=true]
>>   Sets the "root" secret used for high-level encryption and decryption
>>   operations.  The file must be at least 32 bytes long and its contents
>>   must actually be secret.  The contents of the file are read before
>>   HAProxy chroots, so the path need not (and should not) be inside the
>>   chroot.  The file must be owned by the user HAProxy is launched as or
>>   by root, and must not be readable by any other user.
> (...)
> 
> Very interesting. I'd be fine with this, but we've observed over time
> that initially expected eternal secrets have to change in a seamless
> way. We could thus imagine to support 3 of them (the previous one, the
> current one and the next one) in order to support smooth rotation on
> all nodes when such secrets are used for stuff like cookies.
> 
> Also if we want to do things cleanly, we have to split them in two at
> two places (either two halves, or better just create a random and store
> the secret xored with that random, and the random elsewhere). This is
> against risks of data leak that may always happen. I don't want to see
> the secret appear verbatim in a header in case of a failed rewrite that
> left data there for example.

That's a good idea, but I suspect that zeroing memory on allocation (and,
where possible, again on freeing) is a more effective countermeasure.
Privilege separation (perform the crypto in a separate process) is even more
effective, and for asymmetric private-key operations the cost of IPC is going
to be much less than the cost of the cryptography.  For symmetric cryptography
the IPC cost is much more severe, though it can be amortized via batching of
operations.

>> min-encryption-version version
>>   The minimum version of cryptography HAProxy should use.  Currently, the
>>   default and only supported version is 1.  This serves as a safeguard
>>   in case there is a flaw in HAProxy's encryption implementation, to allow
>>   invalidation of data encrypted with old versions.
> 
> If there is a "min", there must be a "default" as well, to indicate which
> version to use to be compatible with other nodes during rotating upgrades.

That's a good point.

>> haproxy_encrypt(domain,associated_data)
>>   Encrypts the raw byte input using the root secret, the given
>>   domain-separation string, and the given associated data.  Returns
>>   the encrypted data.
> (...)
>> haproxy_decrypt(domain,associated_data,success_var)
>>   Decrypts the raw byte input using the root secret, the given
>>   domain-separation string, and the given associated data.  On
>>   success, sets success_var to 1
> (...)
> 
> Both of them are interesting as well. Do you have short term use for
> this ?

Nothing in particular.  I saw someone implement OAuth using HAProxy Lua
scripting and decided to see whether I could implement a toy OIDC relying
party myself.
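
For illustration, if the converters proposed above existed, a minimal
configuration using them might look something like this (entirely
hypothetical: neither the converters nor "root-secret" exist in HAProxy
today, and the exact syntax is a guess):

```
# Hypothetical sketch, not valid HAProxy configuration today.
global
    root-secret /etc/haproxy/root.key

frontend fe
    bind :443 ssl crt /etc/haproxy/site.pem
    # Decrypt the session cookie; txn.ok would be set to 1 on success.
    http-request set-var(txn.session) req.cook(tok),haproxy_decrypt(cookie,myapp,txn.ok)
    http-request deny unless { var(txn.ok) -m int 1 }
```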

> All of this looks well thought and stemming from some preliminary
> work, not just randomly dumped like this in an e-mail.

It's the product of seeing people misuse symmetric cryptography way too
many times.  The general rule I finally came up with is that if a nonce
is not long enough for a random nonce to be safe, one should only use
freshly generated keys for encryption, where “fresh” means “since the
process started”.  That allows using stateful nonces without concerns
about nonce reuse, at least so long as one doesn't need to worry about
fork() and friends.  Anything else is too risky.

> Similarly that's something that can be added during the 3.3 cycle. And
> depending on the dependencies we may even imagine backporting that once
> it's considered stable enough, because I think this would be fairly
> isolated. So if you have an imminent use case, we could possibly imagine
> that it ends up in the latest LTS after some time.

I don't have an immediate use-case right now, other than a toy OpenID
Connect relying party that will probably never be used in production.
It's something that I think anyone will need if they want to use HAProxy
for this purpose, though, at least without very ugly workarounds in Lua.

>>>> lua-load-per-thread is
>>>> the only approach that makes sense here, as you certainly don't want to
>>>> be doing asymmetric cryptography while holding a global lock.
>>>
>>> I agree! We'd even like to deprecate lua-load in favor of an explicit
>>> keyword that makes users conscious of the performance impact of the lock.
>>
>> It might be useful to provide some form of sharded counter for rate limiting
>> and other purposes.
> 
> From Lua you can still access all the variables (proc.XXX, txn.XXX etc).
> So you could store your counter as a global proc.XXX variable. It would
> transparently be accessed under a lock, but only for the time it takes to
> access the variable, which is nothing compared to holding the Lua lock
> during a whole time slice.

For statistics, it might make sense to have a sharded counter that makes
reads (rare) more expensive in exchange for very cheap increments
(extremely common).

>>>>> At the very least it can be an option to easily experiment with extra
>>>>> code without having to patch haproxy. Another option for testing is to
>>>>> rely on LD_PRELOAD, but then you need to be super careful to respect
>>>>> the exact internal API and ABI (any build option counts).
>>>> This would be purely a Lua extension, not reliant on any of HAProxy's
>>>> header files.
>>>
>>> In this case that's totally fine.
>> Someone actually implemented Rust bindings for the Lua API at
>> https://github.com/khvzak/haproxy-api-rs, which is a LOT more
>> complex than anything I would come up with :-).
> 
> Ah, it's Aleksandr Orlenko, he presented his amazing work at
> HAProxyConf2022. It was a brotli compression filter written in rust 
> and loaded from Lua. That's a perfect illustration of what is possible,
> but indeed you don't need to go that far!

I do wonder if this changes whether loadable modules are a good idea.
Rust has the advantage that its type system makes it much easier to
write code that won't crash, and since panics are not undefined
behavior, one can (if one chooses) catch the panic and return a 500
error rather than tearing down the process.  Rust does abort on
out-of-memory, but that is very rare, and there is an argument that it
is better to crash and restart than to rely on error-handling code
that is almost certainly *not* going to be tested.  For instance,
OpenSSH once had an MITM vulnerability due to untested out-of-memory
handling code.

> His presentation is here, for those interested:
> 
>     https://www.youtube.com/watch?v=VNidpAXOrpk
> 
> Cheers,
> Willy


-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
