Senthil Kumar created SPARK-36327:
-------------------------------------
Summary: Spark sql creates staging dir inside database directory
rather than creating inside table directory
Key: SPARK-36327
URL: https://issues.apache.org/jira/browse/SPARK-36327
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 3.1.2
Reporter: Senthil Kumar
Spark SQL creates the staging directory inside the database directory rather
than inside the table directory.
This happens only when the location uses viewfs://. With an hdfs:// location, it
does not occur.
On further investigation of *SaveAsHiveFile.scala*, I could see that the
directory hierarchy is not handled properly in the viewfs case: the parent path
(the database path) is passed rather than the actual directory (the table
location).
{{
// Mostly copied from Context.java#getExternalTmpPath of Hive 1.2
private def newVersionExternalTempPath(
    path: Path,
    hadoopConf: Configuration,
    stagingDir: String): Path = {
  val extURI: URI = path.toUri
  if (extURI.getScheme == "viewfs") {
    getExtTmpPathRelTo(path.getParent, hadoopConf, stagingDir)
  } else {
    new Path(getExternalScratchDir(extURI, hadoopConf, stagingDir), "-ext-10000")
  }
}
}}
Please refer to the lines below, where path.getParent is passed in the viewfs
branch:
===============================
if (extURI.getScheme == "viewfs") {
  getExtTmpPathRelTo(path.getParent, hadoopConf, stagingDir)
===============================
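To illustrate the effect of passing the parent path, here is a minimal sketch
that uses only java.net.URI, with no Spark or Hadoop dependency. The helpers
parentOf and stagingUnder, the object name, and the example table location are
hypothetical stand-ins, not Spark or Hadoop APIs. The staging directory name
".hive-staging" is assumed for illustration; the real directory name carries
additional suffixes.

```scala
import java.net.URI

object StagingDirSketch {
  // Hypothetical stand-in for Path.getParent: drops the last path segment.
  def parentOf(uri: URI): URI = {
    val p = uri.getPath.stripSuffix("/")
    val parent = p.substring(0, p.lastIndexOf('/'))
    new URI(uri.getScheme, uri.getAuthority,
      if (parent.isEmpty) "/" else parent, null, null)
  }

  // Hypothetical helper: places the staging directory directly under `base`.
  def stagingUnder(base: URI, stagingDir: String): URI =
    new URI(base.getScheme, base.getAuthority,
      base.getPath.stripSuffix("/") + "/" + stagingDir, null, null)

  def main(args: Array[String]): Unit = {
    // Assumed example table location on a viewfs mount.
    val table = new URI("viewfs://cluster/warehouse/db1.db/t1")

    // Passing the parent (database) path, as in the viewfs branch quoted above:
    println(stagingUnder(parentOf(table), ".hive-staging"))
    // viewfs://cluster/warehouse/db1.db/.hive-staging

    // Passing the table location itself:
    println(stagingUnder(table, ".hive-staging"))
    // viewfs://cluster/warehouse/db1.db/t1/.hive-staging
  }
}
```

Under these assumptions, the first result lands in the database directory, which
matches the behaviour reported here, while the second stays inside the table
directory.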