Github user squito commented on a diff in the pull request:
https://github.com/apache/spark/pull/16521#discussion_r95624898
--- Diff:
common/network-common/src/main/java/org/apache/spark/network/crypto/README.md
---
@@ -0,0 +1,158 @@
+Spark Auth Protocol and AES Encryption Support
+==============================================
+
+This file describes an auth protocol used by Spark as a more secure alternative to DIGEST-MD5. This
+protocol is built on symmetric key encryption, based on the assumption that the two endpoints being
+authenticated share a common secret, which is how Spark authentication currently works. The protocol
+provides mutual authentication, meaning that after the negotiation both parties know that the remote
+side knows the shared secret. The protocol is influenced by the ISO/IEC 9798 protocol, although it's
+not an implementation of it.
+
+This protocol could be replaced with TLS PSK, except no PSK ciphers are available in the currently
+released JREs.
+
+The protocol aims at solving the following shortcomings in Spark's current usage of DIGEST-MD5:
+
+- MD5 is an aging hash algorithm with known weaknesses, and a more secure alternative is desired.
+- DIGEST-MD5 has a pre-defined set of ciphers for which it can generate keys. The only
+  viable, supported cipher these days is 3DES, and a more modern alternative is desired.
+- Encrypting AES session keys with 3DES doesn't solve the issue, since the weakest link
+  in the negotiation would still be MD5 and 3DES.
+
+The protocol assumes that the shared secret is generated and distributed in a secure manner.
+
+The protocol always negotiates encryption keys. If encryption is not desired, the existing
+SASL-based authentication, or no authentication at all, can be chosen instead.
+
+When messages are described below, it's expected that the implementation should support
+arbitrary sizes for fields that don't have a fixed size.
+
+Client Challenge
+----------------
+
+The auth negotiation is started by the client. The client starts by generating an encryption
+key based on the application's shared secret, and a nonce.
+
+    KEY = KDF(SECRET, SALT, KEY_LENGTH)
+
+Where:
+- KDF(): a key derivation function that takes a secret, a salt, a configurable number of
+  iterations, and a configurable key length.
+- SALT: a byte sequence used to salt the key derivation function.
+- KEY_LENGTH: length of the encryption key to generate.
+
+
+The client generates a message with the following content:
+
+    CLIENT_CHALLENGE = (
+        APP_ID,
+        KDF,
+        ITERATIONS,
+        CIPHER,
+        KEY_LENGTH,
+        ANONCE,
+        ENC(APP_ID || ANONCE || CHALLENGE))
+
+Where:
+
+- APP_ID: the application ID which the server uses to identify the shared secret.
+- KDF: the key derivation function described above.
+- ITERATIONS: number of iterations to run the KDF when generating keys.
+- CIPHER: the cipher used to encrypt data.
+- KEY_LENGTH: length of the encryption keys to generate, in bits.
+- ANONCE: the nonce used as the salt when generating the auth key.
+- ENC(): an encryption function that uses the cipher and the generated key. This function
+  will also be used in the definition of other messages below.
+- CCHALLENGE: a byte sequence used as a challenge to the server.
--- End diff --
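typo: `CCHALLENGE` here vs. `CHALLENGE` in the `CLIENT_CHALLENGE` message definition above; the two names should match.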
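
For readers following along, the key derivation and `ENC()` step from the "Client Challenge" section could be sketched in Java roughly as below. This is only an illustration under stated assumptions, not the code in this PR: the README excerpt does not name a concrete KDF or cipher, so PBKDF2WithHmacSHA1 and AES/CTR/NoPadding are stand-ins, and the iteration count, the random IV, the plain concatenation for `||`, and every class and method name are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch; names and algorithm choices are assumptions, not Spark's API.
public class ClientChallengeSketch {

  // KEY = KDF(SECRET, SALT, KEY_LENGTH), with the iteration count passed explicitly.
  // Assumption: PBKDF2WithHmacSHA1 plays the role of KDF().
  static byte[] deriveKey(char[] secret, byte[] salt, int iterations, int keyLengthBits) {
    try {
      PBEKeySpec spec = new PBEKeySpec(secret, salt, iterations, keyLengthBits);
      return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1")
          .generateSecret(spec)
          .getEncoded();
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  // ENC() and its inverse. Assumption: AES in CTR mode with a caller-supplied IV;
  // the README excerpt does not say how an IV would be chosen or transmitted.
  static byte[] crypt(int mode, byte[] key, byte[] iv, byte[] data) {
    try {
      Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
      cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
      return cipher.doFinal(data);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  // Stands in for "||". A real wire format would need length prefixes so that
  // variable-size fields can be split apart again; this sketch omits them.
  static byte[] concat(byte[]... parts) {
    int length = 0;
    for (byte[] part : parts) {
      length += part.length;
    }
    byte[] out = new byte[length];
    int pos = 0;
    for (byte[] part : parts) {
      System.arraycopy(part, 0, out, pos, part.length);
      pos += part.length;
    }
    return out;
  }

  public static void main(String[] args) {
    SecureRandom random = new SecureRandom();
    byte[] appId = "app-1".getBytes(StandardCharsets.UTF_8); // made-up APP_ID
    byte[] anonce = new byte[16];    // ANONCE, used as the KDF salt
    byte[] challenge = new byte[16]; // the client's challenge bytes
    byte[] iv = new byte[16];        // assumption: random IV
    random.nextBytes(anonce);
    random.nextBytes(challenge);
    random.nextBytes(iv);

    // 1024 iterations and a 128-bit key are illustrative values only.
    byte[] key = deriveKey("shared-secret".toCharArray(), anonce, 1024, 128);
    byte[] encrypted =
        crypt(Cipher.ENCRYPT_MODE, key, iv, concat(appId, anonce, challenge));

    System.out.println("key bytes: " + key.length
        + ", encrypted payload bytes: " + encrypted.length);
  }
}
```

Because the key depends only on the shared secret, the salt, the iteration count and the key length, a server that receives the same parameters in `CLIENT_CHALLENGE` can derive the same key and decrypt the payload, which is what lets it check that the client knows the secret.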