[ https://issues.apache.org/jira/browse/SPARK-16342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Saisai Shao updated SPARK-16342:
--------------------------------
    Description: 
Current Spark on YARN token management has some problems:

1. The supported services are hard-coded: only HDFS, Hive and HBase are
supported for token fetching. For any other third-party service that needs to
communicate with Spark via Kerberos, currently the only way is to modify Spark
code.
2. The current token renewal and update mechanism is also hard-coded, which
means other third-party services cannot benefit from it and will fail when
their tokens expire.
3. At the code level, the current code for obtaining and updating tokens is
spread across several different places without a clear structure, which makes
it hard to maintain and extend.

So here we propose a new Configurable Token Manager class to solve the issues
mentioned above.

Basically this proposal consists of two changes:

1. Abstract a ServiceTokenProvider for different services. It is configurable
and pluggable: by default there will be HDFS, HBase and Hive providers, and
users can add their own services through configuration. This interface offers
a way to retrieve the tokens and the token renewal interval.

2. Provide a ConfigurableTokenManager to manage all the registered token
providers and to expose APIs for external modules to get and update tokens.
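
To make the two pieces above more concrete, here is a minimal sketch in plain
Java. It is not the API from the design doc: the method names, the
ServiceToken and FixedLifetimeProvider classes, and the comma-separated
"enabled services" string are all assumptions made for illustration, and a real
implementation would presumably work with Hadoop's token and credential types
rather than these stand-ins.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical stand-in for a delegation token (not a Hadoop type).
final class ServiceToken {
    final String service;
    final long expiresAtMillis;

    ServiceToken(String service, long expiresAtMillis) {
        this.service = service;
        this.expiresAtMillis = expiresAtMillis;
    }
}

// Sketch of the pluggable per-service interface; method names are assumed.
interface ServiceTokenProvider {
    String serviceName();

    List<ServiceToken> obtainTokens(long nowMillis);

    // Empty when the service's tokens cannot be renewed.
    OptionalLong renewalIntervalMillis();
}

// Toy provider that issues a single token with a fixed lifetime.
final class FixedLifetimeProvider implements ServiceTokenProvider {
    private final String name;
    private final long lifetimeMillis;

    FixedLifetimeProvider(String name, long lifetimeMillis) {
        this.name = name;
        this.lifetimeMillis = lifetimeMillis;
    }

    @Override public String serviceName() { return name; }

    @Override public List<ServiceToken> obtainTokens(long nowMillis) {
        return List.of(new ServiceToken(name, nowMillis + lifetimeMillis));
    }

    @Override public OptionalLong renewalIntervalMillis() {
        return OptionalLong.of(lifetimeMillis);
    }
}

// Sketch of the manager: keeps only the providers enabled by configuration.
final class ConfigurableTokenManager {
    private final Map<String, ServiceTokenProvider> providers = new LinkedHashMap<>();

    // enabledServices mimics a comma-separated configuration value (assumed).
    ConfigurableTokenManager(String enabledServices, List<ServiceTokenProvider> available) {
        for (String raw : enabledServices.split(",")) {
            String wanted = raw.trim();
            for (ServiceTokenProvider p : available) {
                if (p.serviceName().equals(wanted)) {
                    providers.put(wanted, p);
                }
            }
        }
    }

    // Collects tokens from every enabled provider.
    List<ServiceToken> obtainTokens(long nowMillis) {
        List<ServiceToken> tokens = new ArrayList<>();
        for (ServiceTokenProvider p : providers.values()) {
            tokens.addAll(p.obtainTokens(nowMillis));
        }
        return tokens;
    }

    // Smallest renewal interval among enabled providers, if any is renewable.
    OptionalLong nextRenewalIntervalMillis() {
        return providers.values().stream()
            .map(ServiceTokenProvider::renewalIntervalMillis)
            .filter(OptionalLong::isPresent)
            .mapToLong(OptionalLong::getAsLong)
            .min();
    }
}

final class TokenManagerDemo {
    public static void main(String[] args) {
        List<ServiceTokenProvider> available = List.of(
            new FixedLifetimeProvider("hdfs", 86_400_000L),
            new FixedLifetimeProvider("hive", 43_200_000L),
            new FixedLifetimeProvider("my-service", 3_600_000L));
        ConfigurableTokenManager manager =
            new ConfigurableTokenManager("hdfs, my-service", available);
        System.out.println(manager.obtainTokens(0L).size());
        System.out.println(manager.nextRenewalIntervalMillis());
    }
}
```

The point of the sketch is only the shape: a third-party service plugs in by
implementing the provider interface and being named in configuration, and
callers talk to the manager instead of to service-specific code.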

Details are in the design doc
(https://docs.google.com/document/d/1piUvrQywWXiSwyZM9alN6ilrdlX9ohlNOuP4_Q3A6dc/edit?usp=sharing).
Any suggestions and comments are greatly appreciated.

  was:
Current Spark on YARN token management has some problems:

1. The supported services are hard-coded: only HDFS, Hive and HBase are
supported for token fetching. For any other third-party service that needs to
communicate with Spark via Kerberos, currently the only way is to modify Spark
code.
2. The current token renewal and update mechanism is also hard-coded, which
means other third-party services cannot benefit from it and will fail when
their tokens expire.
3. At the code level, the current code for obtaining and updating tokens is
spread across several different places without a clear structure, which makes
it hard to maintain and extend.

So here we propose a new Configurable Token Manager class to solve the issues
mentioned above. The design doc is linked here
(https://docs.google.com/document/d/1piUvrQywWXiSwyZM9alN6ilrdlX9ohlNOuP4_Q3A6dc/edit?usp=sharing).
Any suggestions and comments are greatly appreciated.


> Add a new Configurable Token Manager for Spark Running on YARN
> --------------------------------------------------------------
>
>                 Key: SPARK-16342
>                 URL: https://issues.apache.org/jira/browse/SPARK-16342
>             Project: Spark
>          Issue Type: New Feature
>          Components: YARN
>            Reporter: Saisai Shao


