sunchao commented on a change in pull request #29498:
URL: https://github.com/apache/spark/pull/29498#discussion_r474428435
##########
File path: docs/tuning.md
##########
@@ -264,6 +264,13 @@ parent RDD's number of partitions. You can pass the level of parallelism as a se
or set the config property `spark.default.parallelism` to change the default.
In general, we recommend 2-3 tasks per CPU core in your cluster.
+Sometimes you may also need to increase directory listing parallelism when job input has large number of directories,
Review comment:
👍
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.