Github user tgravescs commented on the pull request:
https://github.com/apache/spark/pull/11800#issuecomment-212579563
Yeah, there are other JIRAs to improve the startup; I just haven't had time
to get to them yet. Feel free to work on them if you have time. :)

This change just makes it so you are actually reading X files in
parallel, which could increase memory pressure, and I was wondering if you had
looked to see by how much. We have very large files all the time, so if
all threads are reading 10GB files, I was wondering how much that would increase
memory usage vs. only reading one at a time.
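
To frame the question, here is a rough back-of-the-envelope sketch, not a measurement. It assumes the worst case, where every reader thread holds its full working set at the same moment. The function name, the thread count of 4, and the 64 MiB streaming working set are all hypothetical values picked for illustration; the real per-reader footprint would have to be measured.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def peak_read_memory_bytes(num_threads: int, per_reader_bytes: int) -> int:
    """Rough upper bound on memory held by concurrent readers.

    Assumes every thread holds its full per-reader footprint at once,
    which is a worst-case simplification.
    """
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    if per_reader_bytes < 0:
        raise ValueError("per_reader_bytes must be non-negative")
    return num_threads * per_reader_bytes


# Hypothetical worst case: each reader keeps state proportional to a 10 GiB file.
whole_file = peak_read_memory_bytes(4, 10 * GIB)

# Hypothetical streaming case: each reader holds only a 64 MiB working set.
streamed = peak_read_memory_bytes(4, 64 * MIB)

print(f"whole-file bound: {whole_file / GIB:.1f} GiB")
print(f"streaming bound:  {streamed / MIB:.0f} MiB")
```

Under these assumptions the bound grows linearly with the thread count, so the number that matters is how much memory a single reader actually holds while processing a large file.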