[GitHub] LucaCanali commented on a change in pull request #23525: [SPARK-26595][core] Allow credential renewal based on kerberos ticket cache.
LucaCanali commented on a change in pull request #23525: [SPARK-26595][core] Allow credential renewal based on kerberos ticket cache.

URL: https://github.com/apache/spark/pull/23525#discussion_r248966327

## File path: core/src/main/scala/org/apache/spark/deploy/security/HadoopDelegationTokenManager.scala

```diff
@@ -97,28 +106,37 @@ private[spark] class HadoopDelegationTokenManager(
       ThreadUtils.newDaemonSingleThreadScheduledExecutor("Credential Renewal Thread")

     val ugi = UserGroupInformation.getCurrentUser()
-    if (ugi.isFromKeytab()) {
+    val tgtRenewalTask = if (ugi.isFromKeytab()) {
       // In Hadoop 2.x, renewal of the keytab-based login seems to be automatic, but in Hadoop 3.x,
       // it is configurable (see hadoop.kerberos.keytab.login.autorenewal.enabled, added in
       // HADOOP-9567). This task will make sure that the user stays logged in regardless of that
       // configuration's value. Note that checkTGTAndReloginFromKeytab() is a no-op if the TGT does
       // not need to be renewed yet.
-      val tgtRenewalTask = new Runnable() {
+      new Runnable() {
         override def run(): Unit = {
           ugi.checkTGTAndReloginFromKeytab()
```

Review comment: I should clarify that the warning messages I reported are for the case where I use the TGT with `--conf spark.kerberos.renewal.credentials=ccache` rather than a keytab; apologies for any confusion this may have generated. Looking now at the code for `UserGroupInformation.reloginFromTicketCache`, I can see that it calls `hasSufficientTimeElapsed`, which is responsible for generating the warning message in question when users try to renew at a rate higher than a certain frequency. As you pointed out, with `hadoop.kerberos.min.seconds.before.relogin` at its default value of 60 we are OK, since it matches the default for `spark.kerberos.relogin.period` (but this requires HADOOP-7930, i.e. Hadoop version >= 2.8).

On a related topic, I can see that `checkTGTAndReloginFromKeytab` has a "silent" way of checking whether the rate of renewal requests is higher than the threshold, so no warnings are generated in that case. Does this make sense, and is it reproducible in your Hadoop 2.7 environment too?

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards, Apache Git Services

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
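The throttling being discussed can be sketched as follows. This is a simplified illustration of the `hasSufficientTimeElapsed` behavior described in the comment, not Hadoop's actual implementation: a relogin attempt only proceeds once `hadoop.kerberos.min.seconds.before.relogin` seconds have passed since the last login, so a renewal thread firing more often than that is throttled (with a warning, in the ticket-cache path).

```java
// Simplified illustration (not Hadoop's actual code) of the rate-limit
// check: relogin is only attempted once the configured minimum number of
// seconds has elapsed since the last (re)login.
public final class ReloginRateLimit {
    private ReloginRateLimit() {}

    public static boolean hasSufficientTimeElapsed(long lastLoginMillis,
                                                   long nowMillis,
                                                   long minSecondsBeforeRelogin) {
        return nowMillis - lastLoginMillis >= minSecondsBeforeRelogin * 1000L;
    }
}
```

With both defaults at 60 seconds, each attempt of a once-per-minute renewal thread is allowed; with a 600-second minimum (the figure in the warning reported below), most attempts are throttled.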
[GitHub] LucaCanali commented on a change in pull request #23525: [SPARK-26595][core] Allow credential renewal based on kerberos ticket cache.
LucaCanali commented on a change in pull request #23525: [SPARK-26595][core] Allow credential renewal based on kerberos ticket cache.

URL: https://github.com/apache/spark/pull/23525#discussion_r248634712

## File path: core/src/main/scala/org/apache/spark/deploy/security/HadoopDelegationTokenManager.scala

```diff
@@ -97,28 +106,37 @@ private[spark] class HadoopDelegationTokenManager(
       ThreadUtils.newDaemonSingleThreadScheduledExecutor("Credential Renewal Thread")

     val ugi = UserGroupInformation.getCurrentUser()
-    if (ugi.isFromKeytab()) {
+    val tgtRenewalTask = if (ugi.isFromKeytab()) {
       // In Hadoop 2.x, renewal of the keytab-based login seems to be automatic, but in Hadoop 3.x,
       // it is configurable (see hadoop.kerberos.keytab.login.autorenewal.enabled, added in
       // HADOOP-9567). This task will make sure that the user stays logged in regardless of that
       // configuration's value. Note that checkTGTAndReloginFromKeytab() is a no-op if the TGT does
       // not need to be renewed yet.
-      val tgtRenewalTask = new Runnable() {
+      new Runnable() {
         override def run(): Unit = {
           ugi.checkTGTAndReloginFromKeytab()
```

Review comment: Thanks @vanzin for the detailed explanations. After some additional investigation I found that if I compile Spark with Hadoop 3.1 the behavior is OK. I can still reproduce the issue I mentioned with the standard 2.7 version in my environment. It appears that `hadoop.kerberos.min.seconds.before.relogin` is not available in Hadoop 2.7 and was only introduced in 2.8?
[GitHub] LucaCanali commented on a change in pull request #23525: [SPARK-26595][core] Allow credential renewal based on kerberos ticket cache.
LucaCanali commented on a change in pull request #23525: [SPARK-26595][core] Allow credential renewal based on kerberos ticket cache.

URL: https://github.com/apache/spark/pull/23525#discussion_r247846894

## File path: docs/security.md

```diff
@@ -776,16 +776,32 @@ The following options provides finer-grained control for this feature:

 Long-running applications may run into issues if their run time exceeds the maximum delegation
 token lifetime configured in services it needs to access.

-Spark supports automatically creating new tokens for these applications when running in YARN mode.
-Kerberos credentials need to be provided to the Spark application via the `spark-submit` command,
-using the `--principal` and `--keytab` parameters.
+This feature is not available everywhere. In particular, it's only implemented
+on YARN and Kubernetes (both client and cluster modes), and on Mesos when using client mode.

-The provided keytab will be copied over to the machine running the Application Master via the Hadoop
-Distributed Cache. For this reason, it's strongly recommended that both YARN and HDFS be secured
-with encryption, at least.
+Spark supports automatically creating new tokens for these applications. There are two ways to
+enable this functionality.

-The Kerberos login will be periodically renewed using the provided credentials, and new delegation
-tokens for supported will be created.
+### Using a Keytab
+
+By providing Spark with a principal and keytab (e.g. using `spark-submit` with `--principal`
+and `--keytab` parameters), the application will maintain a valid Kerberos login that can be
+used to retrieve delegation tokens indefinitely.
+
+Note that when using a keytab in cluster mode, it will be copied over to the machine running the
+Spark driver. In the case of YARN, this means using HDFS as a staging area for the keytab, so it's
+strongly recommended that both YARN and HDFS be secured with encryption, at least.
+
+### Using a ticket cache
```

Review comment: Very nice improvement in this PR. I guess it is worth documenting it also in docs/running-on-yarn.md.
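The documentation change above describes two ways credentials can be supplied for renewal: a principal/keytab pair, or the user's ticket cache. A hypothetical helper makes the distinction concrete; the mode names "keytab" and "ccache" appear in this review thread (`spark.kerberos.renewal.credentials=ccache`), but the helper itself is an illustrative assumption, not Spark's actual code.

```java
// Illustrative sketch only: pick the renewal mode from the supplied
// credentials, mirroring the two options described in the docs change.
public final class RenewalModeSketch {
    private RenewalModeSketch() {}

    public static String renewalMode(String principal, String keytab) {
        if (principal != null && keytab != null) {
            return "keytab"; // long-lived: Spark can re-obtain a TGT indefinitely
        }
        return "ccache";     // bounded by the ticket cache's renewable lifetime
    }
}
```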
[GitHub] LucaCanali commented on a change in pull request #23525: [SPARK-26595][core] Allow credential renewal based on kerberos ticket cache.
LucaCanali commented on a change in pull request #23525: [SPARK-26595][core] Allow credential renewal based on kerberos ticket cache.

URL: https://github.com/apache/spark/pull/23525#discussion_r247840564

## File path: core/src/main/scala/org/apache/spark/deploy/security/HadoopDelegationTokenManager.scala

```diff
@@ -236,11 +257,19 @@ private[spark] class HadoopDelegationTokenManager(
   }

   private def doLogin(): UserGroupInformation = {
-    logInfo(s"Attempting to login to KDC using principal: $principal")
-    require(new File(keytab).isFile(), s"Cannot find keytab at $keytab.")
-    val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
-    logInfo("Successfully logged into KDC.")
-    ugi
+    if (principal != null) {
+      logInfo(s"Attempting to login to KDC using principal: $principal")
+      require(new File(keytab).isFile(), s"Cannot find keytab at $keytab.")
+      val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
+      logInfo("Successfully logged into KDC.")
+      ugi
+    } else {
+      logInfo(s"Attempting to load user's ticket cache.")
+      val ccache = sparkConf.getenv("KRB5CCNAME")
+      val user = Option(sparkConf.getenv("KRB5PRINCIPAL")).getOrElse(
```

Review comment: Would it make sense to also check/use the value of spark.yarn.principal (or an ad-hoc config parameter, if "reusing" this one is not OK) when provided by the user?
[GitHub] LucaCanali commented on a change in pull request #23525: [SPARK-26595][core] Allow credential renewal based on kerberos ticket cache.
LucaCanali commented on a change in pull request #23525: [SPARK-26595][core] Allow credential renewal based on kerberos ticket cache.

URL: https://github.com/apache/spark/pull/23525#discussion_r247844744

## File path: core/src/main/scala/org/apache/spark/deploy/security/HadoopDelegationTokenManager.scala

```diff
@@ -97,28 +106,37 @@ private[spark] class HadoopDelegationTokenManager(
       ThreadUtils.newDaemonSingleThreadScheduledExecutor("Credential Renewal Thread")

     val ugi = UserGroupInformation.getCurrentUser()
-    if (ugi.isFromKeytab()) {
+    val tgtRenewalTask = if (ugi.isFromKeytab()) {
       // In Hadoop 2.x, renewal of the keytab-based login seems to be automatic, but in Hadoop 3.x,
       // it is configurable (see hadoop.kerberos.keytab.login.autorenewal.enabled, added in
       // HADOOP-9567). This task will make sure that the user stays logged in regardless of that
       // configuration's value. Note that checkTGTAndReloginFromKeytab() is a no-op if the TGT does
       // not need to be renewed yet.
-      val tgtRenewalTask = new Runnable() {
+      new Runnable() {
         override def run(): Unit = {
           ugi.checkTGTAndReloginFromKeytab()
```

Review comment: When testing this I get a warning message "WARN UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 600 seconds before.." every minute (I am using the default value of `spark.yarn.kerberos.relogin.period`).
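The scheduling pattern under review (a daemon single-thread executor, cf. `ThreadUtils.newDaemonSingleThreadScheduledExecutor`, firing a renewal task once per relogin period) can be sketched in plain JVM code. This is a self-contained stand-in, not Spark's implementation: the real task would call `ugi.checkTGTAndReloginFromKeytab()`, while here a counter is bumped so the sketch runs without Hadoop on the classpath.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the renewal-thread pattern: a daemon single-thread scheduled
// executor runs the "renewal" task at a fixed period (the real period is
// spark.kerberos.relogin.period, one minute by default).
public class TgtRenewalSketch {
    public static int runRenewalTask(long periodMillis, int runs) throws InterruptedException {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "Credential Renewal Thread");
            t.setDaemon(true); // don't keep the JVM alive just for renewals
            return t;
        });
        AtomicInteger renewals = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(runs);
        pool.scheduleAtFixedRate(() -> {
            renewals.incrementAndGet(); // stand-in for checkTGTAndReloginFromKeytab()
            done.countDown();
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
        done.await();
        pool.shutdownNow();
        return renewals.get();
    }
}
```

Because the Hadoop-side rate limit is applied inside the task, a short scheduling period by itself only determines how often the relogin is *attempted*, which is why a 60-second period against a 600-second minimum produces the repeated warning above.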
[GitHub] LucaCanali commented on a change in pull request #23525: [SPARK-26595][core] Allow credential renewal based on kerberos ticket cache.
LucaCanali commented on a change in pull request #23525: [SPARK-26595][core] Allow credential renewal based on kerberos ticket cache.

URL: https://github.com/apache/spark/pull/23525#discussion_r247840564

## File path: core/src/main/scala/org/apache/spark/deploy/security/HadoopDelegationTokenManager.scala

```diff
@@ -236,11 +257,19 @@ private[spark] class HadoopDelegationTokenManager(
   }

   private def doLogin(): UserGroupInformation = {
-    logInfo(s"Attempting to login to KDC using principal: $principal")
-    require(new File(keytab).isFile(), s"Cannot find keytab at $keytab.")
-    val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
-    logInfo("Successfully logged into KDC.")
-    ugi
+    if (principal != null) {
+      logInfo(s"Attempting to login to KDC using principal: $principal")
+      require(new File(keytab).isFile(), s"Cannot find keytab at $keytab.")
+      val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
+      logInfo("Successfully logged into KDC.")
+      ugi
+    } else {
+      logInfo(s"Attempting to load user's ticket cache.")
+      val ccache = sparkConf.getenv("KRB5CCNAME")
+      val user = Option(sparkConf.getenv("KRB5PRINCIPAL")).getOrElse(
```

Review comment: Would it make sense to also check the value of spark.yarn.principal, if provided by the user?
[GitHub] LucaCanali commented on a change in pull request #23525: [SPARK-26595][core] Allow credential renewal based on kerberos ticket cache.
LucaCanali commented on a change in pull request #23525: [SPARK-26595][core] Allow credential renewal based on kerberos ticket cache.

URL: https://github.com/apache/spark/pull/23525#discussion_r247839666

## File path: core/src/main/scala/org/apache/spark/deploy/security/HadoopDelegationTokenManager.scala

```diff
@@ -236,11 +257,19 @@ private[spark] class HadoopDelegationTokenManager(
   }

   private def doLogin(): UserGroupInformation = {
-    logInfo(s"Attempting to login to KDC using principal: $principal")
-    require(new File(keytab).isFile(), s"Cannot find keytab at $keytab.")
-    val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
-    logInfo("Successfully logged into KDC.")
-    ugi
+    if (principal != null) {
+      logInfo(s"Attempting to login to KDC using principal: $principal")
+      require(new File(keytab).isFile(), s"Cannot find keytab at $keytab.")
+      val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
+      logInfo("Successfully logged into KDC.")
+      ugi
+    } else {
+      logInfo(s"Attempting to load user's ticket cache.")
+      val ccache = sparkConf.getenv("KRB5CCNAME")
```

Review comment: I was wondering whether an additional optional configuration parameter with the path of the KRB5CC file could also be useful? Possibly most useful when using this in cluster mode?
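The fallback chains being proposed in these review comments can be sketched as pure functions. Everything here beyond the `KRB5PRINCIPAL`/`KRB5CCNAME` environment variables (which appear in the diff) is a labeled assumption: the "configured principal" stands in for the suggested reuse of `spark.yarn.principal`, and the "configured ccache path" stands in for the suggested ad-hoc config parameter; neither reflects the PR's actual behavior.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of the lookup order discussed in the review:
// env var -> optional Spark config -> default. Not Spark's actual code.
public final class CcacheLoginSketch {
    private CcacheLoginSketch() {}

    // Principal: KRB5PRINCIPAL env var, else a configured principal
    // (e.g. the suggested spark.yarn.principal), else the current OS user.
    public static String resolvePrincipal(Map<String, String> env,
                                          Optional<String> configuredPrincipal,
                                          String currentUser) {
        String fromEnv = env.get("KRB5PRINCIPAL");
        if (fromEnv != null) {
            return fromEnv;
        }
        return configuredPrincipal.orElse(currentUser);
    }

    // Ccache path: a hypothetical config parameter first (useful in cluster
    // mode, where the driver's environment may not carry KRB5CCNAME), else
    // the KRB5CCNAME env var; null means "use the system default ccache".
    public static String resolveCcache(Map<String, String> env,
                                       Optional<String> configuredPath) {
        return configuredPath.orElse(env.get("KRB5CCNAME"));
    }
}
```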