On Fri, Jan 13, 2023, at 22:56, John Mattsson wrote:
> There are a lot of additional tracking and fingerprinting vectors in
> the Client Hello and Server Hello.
>
> - Tracking is also an issue for servers. IoT devices are often servers
> and by tracking the device you can often track the person owning the
> device.

This isn't an item in your list, but an expansion of the scope.

> - Ticket reuse is just one example of psk identifier reuse, all psk
> identifier reuse has the same client and server tracking considerations.
> - Client reuse of key shares can be used to track the client. Server
> reuse of key shares can be used to track the server or to reveal the
> server name.

Just to be clear about your implicit threat model, I take it that you
are assuming a scenario where the same client and server are
establishing a connection, but that addressing and timing don't reveal
any information about their identity. In that context, repetition of
identifiers from a previous connection might allow an observer to
correlate the two sessions.

Tickets and key shares are easy: don't do that. We already say that
though, so nothing to do, except perhaps to point out where
implementations ignore that advice.

> - SNI can be used to track a server and most SNI (except very common
> ones) can be used to track a client.

This is what ECH is intended to address.

> - Non-common values of max_fragment_length, supported_groups,
> signature_algorithms, application_layer_protocol_negotiation, etc. can
> be used to track a client with high probability.
> - The set of extentions in CH or SH might be used to track client or
> server with high probability. The fingerprinting vector does not need
> to be globally unique. An attacker often looks in a specific location,
> in a specific network, and at a specific time. Can also be correlated
> with fingerprints at other layers.

This is a harder question. Some of these are protected by ECH, some are
not. (Threat model question: Are we now protecting against tracking by
the other endpoint?)

FWIW, there seems to be another implicit assumption in this list that
needs to be expanded. The values that a client or server offers here
are governed by some combination of the code it executes, the
configuration of that endpoint, and the inputs it receives.
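
To make the fingerprinting concern concrete, here is a minimal sketch of
how an observer might collapse the visible ClientHello values into one
identifier. This is not any standard scheme: the `fingerprint` helper,
the dictionary field names, and the sample values are all hypothetical,
and real tools differ in which fields they use and how they normalise
them.

```python
import hashlib

def fingerprint(hello: dict) -> str:
    """Hash the observable, ordered ClientHello values into a short ID.

    `hello` is a hypothetical parsed ClientHello; the keys are
    illustrative names, not any library's API.
    """
    parts = [
        ",".join(str(x) for x in hello.get("extensions", [])),
        ",".join(hello.get("supported_groups", [])),
        ",".join(hello.get("signature_algorithms", [])),
        ",".join(hello.get("alpn", [])),
    ]
    digest = hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
    return digest[:16]

# Two clients running the same code with the same configuration...
client_a = {
    "extensions": [0, 10, 13, 16, 43, 51],
    "supported_groups": ["x25519", "secp256r1"],
    "signature_algorithms": ["ecdsa_secp256r1_sha256", "rsa_pss_rsae_sha256"],
    "alpn": ["h2", "http/1.1"],
}
client_b = dict(client_a)

# ...and one whose configuration changes a single offered value.
client_c = dict(client_a, alpn=["http/1.1"])

print(fingerprint(client_a) == fingerprint(client_b))
print(fingerprint(client_a) == fingerprint(client_c))
```

The point of the sketch is only that any difference in code,
configuration, or input that changes an offered value also changes the
identifier an observer can compute.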

Differences in code are something I have personally given up on. There
are reasons that you might prefer to eliminate apparent differences
between implementations, but this is likely somewhere between hard and
impossible. If you consider the fingerprinting risk profile as a
product of the entire networking stack, you need to eliminate
differences across the entire stack, from the hardware up to the
application.

There are a finite number of implementations, so the entropy provided
by implementation variations should be small. Furthermore,
synchronizing fingerprints with another implementation works only for
as long as both implementations remain the same. It only takes a new
feature addition in one implementation to undo this.

That doesn't mean that eliminating gratuitous differences might not
eventually help by reducing the number of discrete fingerprints. That's
specification work though. For instance, we could all agree that
padding is no longer needed, so that we can all remove that extension
from ClientHello. But consider the fingerprinting risk inherent in your
choice of congestion controller. Do you really expect people to agree
on all of those details such that you might eliminate any differences -
and potential competitive advantage - between products?

Configuration and other live inputs tend to produce variation that is
observable. The best choice for configuration is to not allow it.
That's a general trend now, but see above regarding eliminating choice
and control.

Live input is harder. If the input is provided by a peer, then the
attitude we've taken in browsers is to only apply any corresponding
change when communicating with that peer; otherwise, you are exposed to
tracking by that entity. But that leads to leaking this change in peer
identity toward the network.
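
As a rough illustration of that peer-scoping rule, the sketch below
keeps anything learned from a peer in a store keyed by that peer and
falls back to a fixed default for everyone else. The class name, the
`key_share_group` setting, and the default value are assumptions made
for the example; this is not how any particular browser implements it.

```python
from typing import Dict

# Hypothetical baseline that every connection starts from.
DEFAULT_CONFIG: Dict[str, str] = {"key_share_group": "x25519"}

class PeerScopedConfig:
    """Apply peer-provided input only on connections to that same peer."""

    def __init__(self) -> None:
        self._by_peer: Dict[str, Dict[str, str]] = {}

    def learn(self, peer: str, key: str, value: str) -> None:
        # Record input received from `peer`, scoped to `peer` only.
        self._by_peer.setdefault(peer, {})[key] = value

    def for_connection(self, peer: str) -> Dict[str, str]:
        # Start from the shared default, then overlay what this peer taught us.
        config = dict(DEFAULT_CONFIG)
        config.update(self._by_peer.get(peer, {}))
        return config

store = PeerScopedConfig()
store.learn("a.example", "key_share_group", "secp256r1")

print(store.for_connection("a.example"))  # uses the value learned from a.example
print(store.for_connection("b.example"))  # unaffected; uses the default
```

Note that this reproduces the trade-off described above: the peer
cannot use its own input to track the client elsewhere, but an observer
who sees the non-default value learns something about which peer is
being contacted.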

The best reaction I have here is to say that for ClientHello, any
change in configuration/behaviour should be in the part protected by
ECH; in comparison, for ServerHello, don't make the change there; move
it to EncryptedExtensions instead. If this is not possible, reconsider
the feature.

The only things that can't be protected by ECH are those things that
relate to key configuration, which are effectively a lower layer of the
protocol. Those lower-layer functions require more care. For instance,
if we were to move key share configuration to DNS, so that clients can
guess the right key share to provide, then that might work, but we'd
need to be careful to ensure that the configuration is consistent on
the same level as the ECH configuration. Otherwise, it partitions the
anonymity set.

Overall, I think that ECH gives us all we can realistically act on
right now.

_______________________________________________
TLS mailing list
TLS@ietf.org
https://www.ietf.org/mailman/listinfo/tls