parthchandra commented on PR #4335:
URL: 
https://github.com/apache/datafusion-comet/pull/4335#issuecomment-4480030753

   Just wanted to add my two cents on credential refreshing.
   
   The credentials providers run on every executor, so all executors will 
request credentials at essentially the same time. At very large scale, this 
has been seen to overwhelm credentials backends, leading to _system-wide_ job 
failures. Caching the credentials on the executors makes sense, but it is 
generally better to refresh centrally and distribute the credentials. 
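
   To make the difference concrete, here is a minimal sketch of central 
refresh with executor-side caching. All names (`CountingBackend`, 
`CentralRefresher`, `ExecutorCache`) are hypothetical and are not Comet or 
Spark APIs; the in-process `receive` call only stands in for whatever 
distribution mechanism the engine would use.

```python
import threading
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Credentials:
    token: str
    expires_at: float  # seconds since the epoch


class CountingBackend:
    """Hypothetical credentials backend that counts how often it is called."""

    def __init__(self, ttl_seconds: float = 3600.0):
        self.ttl_seconds = ttl_seconds
        self.calls = 0
        self._lock = threading.Lock()

    def issue(self) -> Credentials:
        with self._lock:
            self.calls += 1
            return Credentials(f"token-{self.calls}", time.time() + self.ttl_seconds)


class ExecutorCache:
    """Executor-side cache: serves the last credentials it was sent."""

    def __init__(self):
        self._creds = None
        self._lock = threading.Lock()

    def receive(self, creds: Credentials) -> None:
        with self._lock:
            self._creds = creds

    def get(self) -> Credentials:
        with self._lock:
            if self._creds is None or self._creds.expires_at <= time.time():
                raise RuntimeError("no valid credentials distributed yet")
            return self._creds


class CentralRefresher:
    """Driver-side component: the only caller of the backend."""

    def __init__(self, backend: CountingBackend, executors: list):
        self.backend = backend
        self.executors = executors

    def refresh(self) -> Credentials:
        creds = self.backend.issue()     # one backend request per refresh
        for executor in self.executors:  # stand-in for a secure push
            executor.receive(creds)
        return creds


backend = CountingBackend()
executors = [ExecutorCache() for _ in range(1000)]
refresher = CentralRefresher(backend, executors)
refresher.refresh()
tokens = {e.get().token for e in executors}
```

   With 1000 simulated executors the backend sees a single request per 
refresh cycle, whereas per-executor providers would issue one request each.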
   
   It makes sense for the _engine_ to do the refresh. For instance, in Spark, 
Kerberos delegation tokens are managed centrally by Spark in 
[HadoopDelegationTokenManager](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/security/HadoopDelegationTokenManager.scala).
   
   This does raise the question of secure _distribution_ of the credentials. 
Broadcasting over an insecure channel will not do; the distribution channel 
needs TLS. 
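
   As a rough illustration of the TLS requirement, the sketch below builds a 
client-side TLS context an executor could use when fetching credentials from 
the driver. It uses Python's standard `ssl` module purely for illustration; 
the function name `distribution_client_context` and the `ca_file` parameter 
are assumptions, not part of Comet or Spark.

```python
import ssl


def distribution_client_context(ca_file=None):
    """TLS settings for an executor fetching credentials from the driver."""
    # create_default_context turns on certificate and hostname verification.
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    # Refuse protocol versions older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = distribution_client_context()
```

   A connection would then be wrapped with 
`ctx.wrap_socket(sock, server_hostname=...)` so the credentials are never 
sent in the clear.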
    


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

