[
https://issues.apache.org/jira/browse/HDDS-13209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wei-Chiu Chuang resolved HDDS-13209.
------------------------------------
Resolution: Duplicate
> [Docs] Add Spark under Application Integrations
> -----------------------------------------------
>
> Key: HDDS-13209
> URL: https://issues.apache.org/jira/browse/HDDS-13209
> Project: Apache Ozone
> Issue Type: Task
> Components: documentation
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Priority: Major
>
> Like Hive and Impala, Spark is supported by Ozone. We should add a new user
> doc page for Spark under the Application Integrations section of the user doc.
> The doc will have a Hugo-compatible header, followed by the Apache License
> header text.
> The doc will then have an introduction explaining that Spark on YARN can
> submit jobs that access Ozone.
> If Ozone is not configured as the default file system, the job submission
> command may need to specify `spark.yarn.access.hadoopFileSystems`. For example:
> ```bash
> spark-shell \
> --conf "spark.yarn.access.hadoopFileSystems=ofs://ozone1707264383"
> ```
> The same option applies to the spark-submit command.
> If `spark.yarn.access.hadoopFileSystems` is not specified, the command may
> exit with the following error, because Spark does not request an Ozone
> delegation token:
> ```
> Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
> ```
> The equivalent command for Spark 3 uses `spark.kerberos.access.hadoopFileSystems`:
> ```bash
> spark3-shell \
> --conf "spark.kerberos.access.hadoopFileSystems=ofs://ozone1707264383"
> ```
> If the Spark shell fails with a token renewal error such as:
> ```
> 24/02/08 01:24:30 ERROR repl.Main: Failed to initialize Spark session.
> org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit
> application_1707350431298_0007 to YARN : Failed to renew token: Kind:
> HDFS_DELEGATION_TOKEN, Service: 10.140.99.144:8020, Ident: (token for systest:
> HDFS_DELEGATION_TOKEN [email protected], renewer=yarn,
> realUser=, issueDate=1707355458945, maxDate=1707960258945, sequenceNumber=50,
> masterKeyId=14)
> ```
> add this property:
> ```bash
> --conf "spark.yarn.kerberos.renewal.excludeHadoopFileSystems=ofs://ozone1707264383"
> ```
> For example:
> ```bash
> spark3-shell \
> --conf "spark.kerberos.access.hadoopFileSystems=ofs://ozone1707264383" \
> --conf "spark.yarn.kerberos.renewal.excludeHadoopFileSystems=ofs://ozone1707264383"
> ```
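
The options quoted above could be collected in a small shell helper. This is only a sketch under stated assumptions: the function name `ozone_spark_conf` and its `exclude-renewal` argument are hypothetical, and mapping `spark.yarn.access.hadoopFileSystems` to Spark 2 and `spark.kerberos.access.hadoopFileSystems` to Spark 3 is my reading of the two commands in the description. The renewal property name is copied verbatim from the description.

```bash
# Hypothetical helper (not part of Spark or Ozone): prints the --conf
# arguments, one word per line, for accessing Ozone from Spark on YARN.
#   $1  Spark major version (2 or 3)
#   $2  Ozone file system URI, e.g. ofs://ozone1707264383
#   $3  optional: "exclude-renewal" to also skip token renewal for that URI
ozone_spark_conf() {
  local spark_major="$1" fs_uri="$2" exclude_renewal="${3:-}"
  local access_key

  # Assumption: the property name depends on the Spark major version.
  case "$spark_major" in
    2) access_key="spark.yarn.access.hadoopFileSystems" ;;
    3) access_key="spark.kerberos.access.hadoopFileSystems" ;;
    *) echo "unsupported Spark major version: $spark_major" >&2; return 1 ;;
  esac

  printf '%s\n%s=%s\n' "--conf" "$access_key" "$fs_uri"

  # Property name taken as-is from the issue description.
  if [ "$exclude_renewal" = "exclude-renewal" ]; then
    printf '%s\n%s=%s\n' "--conf" \
      "spark.yarn.kerberos.renewal.excludeHadoopFileSystems" "$fs_uri"
  fi
}

# Possible usage (word splitting is safe here because the values contain
# no whitespace or glob characters):
#   spark3-shell $(ozone_spark_conf 3 ofs://ozone1707264383 exclude-renewal)
```

Printing one word per line keeps the helper usable from plain command substitution; a URI containing whitespace would need a different approach, such as building a bash array.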