mridulm commented on code in PR #45425:
URL: https://github.com/apache/spark/pull/45425#discussion_r1548815203


##########
common/network-common/src/main/java/org/apache/spark/network/crypto/README.md:
##########
@@ -99,3 +103,13 @@ sessions. It would, however, allow impersonation of future 
sessions.
 In the event of a pre-shared key compromise, messages would still be 
confidential from a passive
 observer. Only active adversaries spoofing a session would be able to recover 
plaintext.
 
+Security Changes & Compatibility
+-------------
+
+The original version of this protocol, retroactively called v1.0, did not 
apply an HKDF to `sharedSecret` and was

Review Comment:
   nit:
   
   ```suggestion
   The original version of this protocol, retroactively called v1.0, did not 
apply an HKDF to `sharedSecret` (to generate the `derivedKey`) and was
   ```



##########
common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java:
##########
@@ -213,6 +213,11 @@ public boolean encryptionEnabled() {
     return conf.getBoolean("spark.network.crypto.enabled", false);
   }
 
+  /**
+   * Version number to be used by the AuthEngine key agreement protocol. Value 
values are 1 or 2.
+   */
+  public int authEngineVersion() { return 
conf.getInt("spark.network.crypto.authEngineVersion", 2); }

Review Comment:
   Make the default 1, to preserve the existing behavior, and allowing backport 
to patch versions.
   We can document to set this explicitly to to `2` to mitigate this issue.
   
   In a later minor version, we can flip this to `2` by default.



##########
common/network-common/src/main/java/org/apache/spark/network/crypto/AuthEngine.java:
##########
@@ -224,7 +236,7 @@ private TransportCipher generateTransportCipher(
 
   private byte[] getTranscript(AuthMessage... encryptedPublicKeys) {
     ByteBuf transcript = Unpooled.buffer(
-        Arrays.stream(encryptedPublicKeys).mapToInt(k -> 
k.encodedLength()).sum());
+        
Arrays.stream(encryptedPublicKeys).mapToInt(AuthMessage::encodedLength).sum());

Review Comment:
   Let us minimize optional changes in this PR (this by itself is fine - but in 
general, we should reduce the scope of change), given we want to backport to 
older versions and want to minimize risk.



##########
docs/security.md:
##########
@@ -149,24 +149,32 @@ secret file agrees with the executors' secret file.
 
 # Network Encryption
 
-Spark supports two mutually exclusive forms of encryption for RPC connections.
+Spark supports two mutually exclusive forms of encryption for RPC connections:
 
-The first is an AES-based encryption which relies on a shared secret, and thus 
requires
-RPC authentication to also be enabled.
+The **preferred method** uses TLS (aka SSL) encryption via Netty's support for 
SSL. Enabling SSL
+requires keys and certificates to be properly configured. SSL is standardized 
and considered more
+secure.
 
-The second is an SSL based encryption mechanism utilizing Netty's support for 
SSL. This requires
-keys and certificates to be properly configured. It can be used with or 
without the authentication
-mechanism discussed earlier.
-
-One may prefer to use the SSL based encryption in scenarios where compliance 
mandates the usage
-of specific protocols; or to leverage the security of a more standard 
encryption library. However,
-the AES based encryption is simpler to configure and may be preferred if the 
only requirement
-is that data be encrypted in transit.
+The legacy method is an AES-based encryption mechanism relying on a shared 
secret. This requires
+RPC authentication to also be enabled. This method uses a bespoke protocol and 
it is recommended
+to use SSL instead.

Review Comment:
   ```suggestion
   RPC authentication to also be enabled. This method uses a bespoke protocol 
with an older v1 and an updated v2 version. <add details about v1 vs v2>
   ```
   
   Also, update the config section below on how to configure 
`spark.network.crypto.authEngineVersion`, and a ref to that here: along with 
incompatibility between `1` and `2` (and all versions prior being `1` 
retroactively - without a way to configure it before `3.5.2` and `3.4.3`)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to