mbutrovich commented on code in PR #4309:
URL: https://github.com/apache/datafusion-comet/pull/4309#discussion_r3305936011
##########
spark/src/main/scala/org/apache/comet/rules/CometScanRule.scala:
##########
@@ -367,12 +367,13 @@ case class CometScanRule(session: SparkSession)
val hadoopDerivedProperties =
CometIcebergNativeScan.hadoopToIcebergS3Properties(hadoopS3Options)
- // Extract vended credentials from FileIO (REST catalog credential vending).
- // FileIO properties take precedence over Hadoop-derived properties because
- // they contain per-table credentials vended by the REST catalog.
+ // Forward the full FileIO property bag (including credentials.uri, OAuth tokens,
Review Comment:
Thanks again, @parthchandra. This sent me back to check Spark's own posture
before adding the note.
Spark RPC is plaintext by default. `TransportContext.java:257` only installs
an `SslHandler` when `spark.ssl.rpc.enabled` is true, and that defaults to
false (`TransportConf.java:273-275`). The AES alternative,
`spark.network.crypto.enabled`, also defaults to false (`Network.scala:30-34`),
and the two are mutually exclusive (`SecurityManager.scala:283`). With both
off, the Netty channel is raw TCP.
The same channel already carries Hadoop delegation tokens,
`CloudCredentialsProvider` JWTs, shuffle blocks, and serialized closures.
Spark's `docs/security.md` covers the custom delegation token SPI (lines
927-944) without recommending RPC encryption for it, and only flags crypto in
two specific contexts: YARN secret distribution (line 64) and
`spark.io.encryption.enabled` (line 285).
The bootstrap property bag here is the same category of data on the same
channel, so a "set `spark.network.crypto.enabled=true`" callout would be more
prescriptive than Spark is for equivalent mechanisms. RPC encryption is a
deployment-wide call, not a per-SPI one.
Added a short "Wire encryption" subsection to the user guide that notes
catalog config rides the Netty RPC channel, names the two opt-in mechanisms
(`spark.network.crypto.enabled` and `spark.ssl.rpc.enabled`), and links Spark's
[security guide](https://spark.apache.org/docs/latest/security.html) rather
than prescribing a specific knob.
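
For reference, a minimal sketch of how the two opt-in mechanisms might look in
`spark-defaults.conf`. The values are illustrative rather than a recommendation.
The `spark.authenticate` line assumes, per Spark's security guide, that AES-based
RPC encryption needs RPC authentication turned on, and the key material settings
that the SSL option requires are omitted here.

```properties
# Option A: AES-based RPC encryption (assumes RPC authentication is also enabled)
spark.authenticate             true
spark.network.crypto.enabled   true

# Option B: TLS on the RPC channel. Mutually exclusive with Option A, and it
# also needs keystore/key settings that are not shown in this sketch.
# spark.ssl.rpc.enabled        true
```

Either choice applies to every RPC in the deployment, not only to the catalog
property bag.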
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]