[
https://issues.apache.org/jira/browse/SPARK-14743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marcelo Vanzin resolved SPARK-14743.
------------------------------------
Resolution: Fixed
Assignee: Saisai Shao
Fix Version/s: 2.1.0
> Improve delegation token handling in secure clusters
> ----------------------------------------------------
>
> Key: SPARK-14743
> URL: https://issues.apache.org/jira/browse/SPARK-14743
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core, YARN
> Affects Versions: 2.0.0
> Reporter: Marcelo Vanzin
> Assignee: Saisai Shao
> Fix For: 2.1.0
>
>
> In a way, I'd consider this a parent bug of SPARK-7252.
> Spark's current support for delegation tokens is a little all over the place:
> - for HDFS, there's support for re-creating tokens if a principal and keytab
> are provided
> - for HBase and Hive, Spark will fetch delegation tokens so that apps can
> work in cluster mode, but will not re-create them, so apps that need those
> will stop working after 7 days
> - for anything else, Spark doesn't do anything. Lots of other services use
> delegation tokens, and supporting them as data sources in Spark becomes more
> complicated because of that. For example, Kafka will (hopefully) soon
> support them.
> It would be nice if Spark had consistent support for handling delegation
> tokens regardless of who needs them. I'd list these as the requirements:
> - Spark to provide a generic interface for fetching delegation tokens. This
> would allow Spark's delegation token support to be extended using some plugin
> architecture (e.g. Java services), meaning Spark itself doesn't need to
> support every possible service out there.
> This would be used to fetch tokens when launching apps in cluster mode, and
> when a principal and a keytab are provided to Spark.
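The generic interface above could be discovered via `java.util.ServiceLoader`, as the "Java services" aside suggests. A minimal sketch of what such a plugin contract might look like (all names here are illustrative, not Spark's actual API; the toy Kafka provider stands in for a real service integration, and a real build would register providers through `META-INF/services` instead of listing them inline):

```java
import java.util.*;

// Hypothetical plugin contract: each service that issues delegation tokens
// implements this and is discovered via ServiceLoader, so Spark itself does
// not need to hard-code HDFS/Hive/HBase/Kafka support.
interface DelegationTokenProvider {
    /** Short name of the service, e.g. "hdfs", "hive", "kafka". */
    String serviceName();
    /** Fetch tokens into creds; return the next renewal time in ms, or -1. */
    long obtainTokens(Map<String, byte[]> creds);
}

// Toy provider standing in for a real Kafka integration.
class KafkaTokenProvider implements DelegationTokenProvider {
    public String serviceName() { return "kafka"; }
    public long obtainTokens(Map<String, byte[]> creds) {
        creds.put("kafka", "opaque-token-bytes".getBytes());
        // Delegation tokens commonly expire after ~7 days.
        return System.currentTimeMillis() + 7L * 24 * 60 * 60 * 1000;
    }
}

public class TokenManagerSketch {
    public static void main(String[] args) {
        // In a real build this would be
        // ServiceLoader.load(DelegationTokenProvider.class); listed inline
        // here because this sketch has no META-INF/services entry.
        List<DelegationTokenProvider> providers = List.of(new KafkaTokenProvider());
        Map<String, byte[]> creds = new HashMap<>();
        for (DelegationTokenProvider p : providers) {
            long nextRenewal = p.obtainTokens(creds);
            System.out.println(p.serviceName() + " renew at " + nextRenewal);
        }
        System.out.println("tokens fetched: " + creds.size());
    }
}
```

The same provider list could then serve both launch-time fetching (cluster mode) and periodic re-creation when a principal/keytab is available.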
> - A way to manually update delegation tokens in Spark. For example, a new
> SparkContext API, or some configuration that tells Spark to monitor a file
> for changes and load tokens from said file.
> This would allow external applications to manage tokens outside of Spark and
> be able to update a running Spark application (think, for example, a job
> server like Oozie, or something like Hive-on-Spark, which manages Spark apps
> running remotely).
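The "monitor a file for changes" option could be as simple as polling the file's modification time and reloading it when it moves. A hedged sketch, assuming an external manager (say, Oozie) writes refreshed tokens to a well-known file; the file format and reload hook are assumptions, not Spark's actual mechanism:

```java
import java.io.IOException;
import java.nio.file.*;
import java.nio.file.attribute.FileTime;

// Sketch: a running app polls a token file written by an external manager
// and reloads it whenever its timestamp advances.
public class TokenFileWatcher {
    private final Path tokenFile;
    private FileTime lastSeen = FileTime.fromMillis(0);

    public TokenFileWatcher(Path tokenFile) { this.tokenFile = tokenFile; }

    /** Returns new token bytes if the file changed since last check, else null. */
    public byte[] pollOnce() throws IOException {
        FileTime mtime = Files.getLastModifiedTime(tokenFile);
        if (mtime.compareTo(lastSeen) <= 0) return null;
        lastSeen = mtime;
        // In a real system these bytes would be deserialized (e.g. into Hadoop
        // Credentials) and merged into the current user's credentials.
        return Files.readAllBytes(tokenFile);
    }

    public static void main(String[] args) throws IOException {
        Path f = Files.createTempFile("tokens", ".bin");
        Files.write(f, "token-v1".getBytes());
        TokenFileWatcher w = new TokenFileWatcher(f);
        System.out.println(new String(w.pollOnce())); // prints "token-v1"
        System.out.println(w.pollOnce() == null);     // unchanged -> prints "true"
    }
}
```

A `SparkContext`-level API would be the push-based alternative to this pull-based polling; either way, the loaded tokens would need to reach all executors.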
> - A way to notify running code that new delegation tokens have been loaded.
> This may not be strictly necessary; it might be possible for code to detect
> that, e.g., by peeking into the UserGroupInformation structure. But an event
> sent to the listener bus would allow applications to react when new tokens
> are available (e.g., the Hive backend could re-create connections to the
> metastore server using the new tokens).
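The listener-bus idea boils down to a "tokens updated" event that interested components subscribe to. A minimal sketch (event and bus names are illustrative, not Spark's real listener classes), showing a backend reacting by re-creating its connection:

```java
import java.util.*;
import java.util.function.Consumer;

// Sketch of the proposed notification: post a "tokens updated" event on a
// bus so components (e.g. a Hive metastore client) can rebuild connections.
public class TokenEventBusSketch {
    // Hypothetical event carrying when the new tokens became available.
    static class TokensUpdated {
        final long updateTimeMs;
        TokensUpdated(long updateTimeMs) { this.updateTimeMs = updateTimeMs; }
    }

    static class ListenerBus {
        private final List<Consumer<TokensUpdated>> listeners = new ArrayList<>();
        void addListener(Consumer<TokensUpdated> l) { listeners.add(l); }
        void post(TokensUpdated event) { listeners.forEach(l -> l.accept(event)); }
    }

    public static void main(String[] args) {
        ListenerBus bus = new ListenerBus();
        // The Hive backend would react here by reconnecting with new tokens.
        bus.addListener(e ->
            System.out.println("reconnecting to metastore, tokens from " + e.updateTimeMs));
        bus.post(new TokensUpdated(System.currentTimeMillis()));
    }
}
```

Compared with peeking into `UserGroupInformation`, an explicit event gives consumers a single, ordered signal to act on rather than each component polling for credential changes.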
> Also, cc'ing [~busbey] and [~steve_l] since you've talked about this on the
> mailing list recently.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)