Github user xuanyuanking commented on the issue:

    https://github.com/apache/spark/pull/17702

```
This approach only works if the first level glob pattern matches a lot of directories.
```
Yep. In our internal usage we leave this problem to the users: they should use the first wildcard to cover most of the files.
```
Maybe we should just fork the Hadoop Globber and improve it to run in parallel.
```
Thanks for your detailed explanation and guidance. I'll reconsider this and open another PR.