GitHub user yhuai commented on the pull request:
https://github.com/apache/spark/pull/6225#issuecomment-103190896
I have merged it to master and branch 1.4. I also tested it manually, and it
did fix the performance issue with calling list status. The WIP in the title
was for the work of using a broadcast Hadoop conf and making sure we do not
have a regression compared with 1.3 (broadcasting a conf for every partition's
Hadoop RDD is pretty expensive). Since that is a separate issue, I am going to
create another PR to address it.