[ https://issues.apache.org/jira/browse/SPARK-44272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mridul Muralidharan resolved SPARK-44272.
-----------------------------------------
    Fix Version/s: 3.5.0
                   4.0.0
       Resolution: Fixed

Issue resolved by pull request 41821
[https://github.com/apache/spark/pull/41821]

> Path Inconsistency when Operating statCache within Yarn Client
> --------------------------------------------------------------
>
>                 Key: SPARK-44272
>                 URL: https://issues.apache.org/jira/browse/SPARK-44272
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Submit
>    Affects Versions: 0.9.1, 2.3.0, 3.4.0, 3.5.0
>            Reporter: SHU WANG
>            Assignee: SHU WANG
>            Priority: Critical
>             Fix For: 3.5.0, 4.0.0
>
>
> The *addResource* method of *ClientDistributedCacheManager* adds a
> *FileStatus* to *statCache* when it is not yet cached. There is also a
> subtle bug in *isPublic*, which is called from the *getVisibility* method:
> *uri.getPath()* does not retain URI information such as the scheme, host,
> etc. So, the *uri* passed to checkPermissionOfOther will differ from the
> original *uri*.
> For example, if uri is "file:/foo.invalid.com:8080/tmp/testing", then
> {code:java}
> uri.getPath -> /foo.invalid.com:8080/tmp/testing
> uri.toString -> file:/foo.invalid.com:8080/tmp/testing{code}
> The consequence of this bug is that we *double the RPC calls* when the
> resources are remote, which is unnecessary. We see nontrivial overhead when
> checking those resources on our HDFS, especially when HDFS is overloaded.
>
> Ref: related code within *ClientDistributedCacheManager*
> {code:java}
> def addResource(...) {
>   val destStatus = statCache.getOrElse(destPath.toUri(),
>     fs.getFileStatus(destPath))
>   val visibility = getVisibility(conf, destPath.toUri(), statCache)
> }
> private[yarn] def getVisibility() {
>   isPublic(conf, uri, statCache)
> }
> private def isPublic(conf: Configuration, uri: URI, statCache: Map[URI,
>     FileStatus]): Boolean = {
>   val current = new Path(uri.getPath()) // Should not use getPath
>   checkPermissionOfOther(fs, uri, FsAction.READ, statCache)
> }
> {code}
>
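The key mismatch described in the issue can be illustrated with a minimal sketch that uses only java.net.URI and a plain HashMap. It is not the Spark code and not the fix from the pull request: the class name StatCacheKeyDemo, the helper names, the placeholder host "namenode.invalid", and the use of a String value in place of FileStatus are all assumptions made for illustration.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

public class StatCacheKeyDemo {

    // Rebuilds a key from the path component only, mimicking a lookup key
    // that was derived from uri.getPath() (scheme, host and port are lost).
    public static URI pathOnlyKey(URI uri) {
        return URI.create(uri.getPath());
    }

    // Returns true when the cache already holds an entry for this exact key.
    public static boolean isCached(Map<URI, String> cache, URI key) {
        return cache.containsKey(key);
    }

    public static void main(String[] args) {
        // Hypothetical remote resource; host and port are placeholders.
        URI original = URI.create("hdfs://namenode.invalid:8020/tmp/testing");

        // Stand-in for statCache; in Spark the value would be a FileStatus.
        Map<URI, String> cache = new HashMap<>();
        cache.put(original, "status fetched by the first RPC");

        URI stripped = pathOnlyKey(original);

        System.out.println("original key: " + original);
        System.out.println("stripped key: " + stripped);
        System.out.println("hit with original key: " + isCached(cache, original));
        System.out.println("hit with stripped key: " + isCached(cache, stripped));
    }
}
```

In this sketch the entry stored under the full URI is found again with the full URI but not with the path-only key, because java.net.URI equality takes the scheme and authority into account. A cache miss of this kind is what would force a second status lookup for a resource that was already fetched.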