GitHub user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/22018#discussion_r208779110
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala
---
@@ -297,7 +297,7 @@ object InMemoryFileIndex extends Logging {
val missingFiles = mutable.ArrayBuffer.empty[String]
val filteredLeafStatuses = allLeafStatuses.filterNot(
status => shouldFilterOut(status.getPath.getName))
- val resolvedLeafStatuses = filteredLeafStatuses.flatMap {
+ val resolvedLeafStatuses = filteredLeafStatuses.par.flatMap {
--- End diff --
As this can be called on executors, I think we should use
`ThreadUtils.parmap`. cc @MaxGekk
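
For illustration, the idea can be sketched as follows. This is a minimal, standalone sketch and not Spark's `ThreadUtils.parmap` itself: the helper name `boundedParMap` and its signature are hypothetical, and the assumption is that the concern with `.par` is that it runs on a shared default pool rather than on a dedicated pool with an explicit thread limit.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

object BoundedParMap {
  // Hypothetical helper (not Spark's API): applies `f` to every element of `in`
  // on a dedicated pool of at most `maxThreads` threads and returns the results
  // in input order.
  def boundedParMap[I, O](in: Seq[I], maxThreads: Int)(f: I => O): Seq[O] = {
    val pool = Executors.newFixedThreadPool(maxThreads)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      // One Future per element; the fixed pool caps how many run at once.
      val futures = in.map(x => Future(f(x)))
      Await.result(Future.sequence(futures), Duration.Inf)
    } finally {
      // Release the threads once the work is done (or has failed).
      pool.shutdown()
    }
  }
}
```

With a helper of this shape, the changed line would call it on `filteredLeafStatuses` and flatten the result, instead of calling `.par.flatMap`.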
---