[ https://issues.apache.org/jira/browse/SPARK-36493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zikun updated SPARK-36493:
--------------------------
    Description:
Currently, the following logic handles a JDBC keytab provided through the "--files" option:

    if (keytabParam != null && FilenameUtils.getPath(keytabParam).isEmpty) {
      val result = SparkFiles.get(keytabParam)
      logDebug(s"Keytab path not found, assuming --files, file name used on executor: $result")
      result
    } else {
      logDebug("Keytab path found, assuming manual upload")
      keytabParam
    }

Spark has already created a soft link for any file submitted by the "--files" option. Here is an example:

    testusera1.keytab -> /var/opt/hadoop/temp/nm-local-dir/usercache/testusera1/appcache/application_1628584679772_0003/filecache/12/testusera1.keytab

So there is no need to call SparkFiles.get to obtain the absolute path of the keytab file. We can use the variable `keytabParam` directly as the keytab file path.

Moreover, SparkFiles.get returns a wrong keytab path for the driver in cluster mode. In cluster mode, the keytab is distributed to the following location for both the driver and executors:

    /var/opt/hadoop/temp/nm-local-dir/usercache/testusera1/appcache/application_1628584679772_0003/filecache/12/testusera1.keytab

but SparkFiles.get returns the following wrong location for the driver:

    /var/opt/hadoop/temp/nm-local-dir/usercache/testusera1/appcache/application_1628584679772_0003/spark-8fb0f437-c842-4a9f-9612-39de40082e40/userFiles-5075388b-0928-4bc3-a498-7f6c84b27808/testusera1.keytab


  was:
Currently, the following logic handles a JDBC keytab provided through the "--files" option:

    if (keytabParam != null && FilenameUtils.getPath(keytabParam).isEmpty) {
      val result = SparkFiles.get(keytabParam)
      logDebug(s"Keytab path not found, assuming --files, file name used on executor: $result")
      result
    }

Spark has already created a soft link for any file submitted by the "--files" option. Here is an example:

    testusera1.keytab -> /var/opt/hadoop/temp/nm-local-dir/usercache/testusera1/appcache/application_1628584679772_0003/filecache/12/testusera1.keytab

So there is no need to call SparkFiles.get to obtain the absolute path of the keytab file. We can use the variable `keytabParam` directly as the keytab file path.

Moreover, SparkFiles.get returns a wrong keytab path. In a running Spark cluster, the keytab is distributed to the following location:

    /var/opt/hadoop/temp/nm-local-dir/usercache/testusera1/appcache/application_1628584679772_0003/filecache/12/testusera1.keytab

but SparkFiles.get returns the following wrong location:

    /var/opt/hadoop/temp/nm-local-dir/usercache/testusera1/appcache/application_1628584679772_0003/spark-8fb0f437-c842-4a9f-9612-39de40082e40/userFiles-5075388b-0928-4bc3-a498-7f6c84b27808/testusera1.keytab


> SparkFiles.get is not needed for the JDBC keytab provided by the "--files" option
> ---------------------------------------------------------------------------------
>
>                 Key: SPARK-36493
>                 URL: https://issues.apache.org/jira/browse/SPARK-36493
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.1.0, 3.1.2
>            Reporter: Zikun
>            Priority: Major
>             Fix For: 3.1.3
>
>
> Currently, the following logic handles a JDBC keytab provided through the
> "--files" option:
>
>     if (keytabParam != null && FilenameUtils.getPath(keytabParam).isEmpty) {
>       val result = SparkFiles.get(keytabParam)
>       logDebug(s"Keytab path not found, assuming --files, file name used on executor: $result")
>       result
>     } else {
>       logDebug("Keytab path found, assuming manual upload")
>       keytabParam
>     }
>
> Spark has already created a soft link for any file submitted by the
> "--files" option. Here is an example:
>
>     testusera1.keytab -> /var/opt/hadoop/temp/nm-local-dir/usercache/testusera1/appcache/application_1628584679772_0003/filecache/12/testusera1.keytab
>
> So there is no need to call SparkFiles.get to obtain the absolute path of the
> keytab file. We can use the variable `keytabParam` directly as the keytab
> file path.
>
> Moreover, SparkFiles.get returns a wrong keytab path for the driver in
> cluster mode. In cluster mode, the keytab is distributed to the following
> location for both the driver and executors:
>
>     /var/opt/hadoop/temp/nm-local-dir/usercache/testusera1/appcache/application_1628584679772_0003/filecache/12/testusera1.keytab
>
> but SparkFiles.get returns the following wrong location for the driver:
>
>     /var/opt/hadoop/temp/nm-local-dir/usercache/testusera1/appcache/application_1628584679772_0003/spark-8fb0f437-c842-4a9f-9612-39de40082e40/userFiles-5075388b-0928-4bc3-a498-7f6c84b27808/testusera1.keytab
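
For illustration only, the difference between the current keytab resolution and the one proposed in this issue could be sketched as below. This is not the actual Spark patch. `KeytabPathSketch`, `isBareFileName`, `currentKeytab`, `proposedKeytab` and the `sparkFilesGet` parameter are hypothetical names; `sparkFilesGet` stands in for SparkFiles.get so the sketch runs without Spark, and `isBareFileName` only approximates `FilenameUtils.getPath(keytabParam).isEmpty` for simple Unix-style paths.

```scala
import java.nio.file.Paths

// Hypothetical helper object, not part of Spark.
object KeytabPathSketch {

  // Approximation of FilenameUtils.getPath(p).isEmpty:
  // true when the value is a bare file name with no directory part.
  def isBareFileName(p: String): Boolean =
    Paths.get(p).getParent == null

  // Current behaviour as described in the issue: a bare file name is
  // passed through a SparkFiles.get-style lookup (injected here as a
  // function), anything else is returned unchanged.
  def currentKeytab(keytabParam: String, sparkFilesGet: String => String): String =
    if (keytabParam != null && isBareFileName(keytabParam)) sparkFilesGet(keytabParam)
    else keytabParam

  // Proposed behaviour: always use the parameter as given. A bare file
  // name is then a relative path, which relies on the soft link that
  // the description says already exists for files sent with "--files".
  def proposedKeytab(keytabParam: String): String = keytabParam
}
```

With a stand-in lookup such as `(name: String) => "/some/userFiles-dir/" + name`, `currentKeytab("testusera1.keytab", lookup)` yields a path under that directory, while `proposedKeytab("testusera1.keytab")` leaves the name untouched so that it resolves through the soft link.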