Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20933#discussion_r179811255
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
---
@@ -368,8 +368,7 @@ case class FileSourceScanExec(
val bucketed =
selectedPartitions.flatMap { p =>
p.files.map { f =>
- val hosts = getBlockHosts(getBlockLocations(f), 0, f.getLen)
--- End diff --
The reason is better organization to support other changes like this one.
@jose-torres was right to point out that these changes are self-contained
enough to go in a separate PR, and @gengliangwang and I both agreed. Why make
this commit larger than necessary?
---