Github user squito commented on the pull request:
https://github.com/apache/spark/pull/4187#issuecomment-71323709
Thanks @MickDavies! I appreciate you investigating and also putting the
performance comparison into the JIRA. I think the code looks fine, but I'm not
super-familiar w/ this part of the code, so I'd like to get an OK from somebody
else. (I'm assuming the tests will pass ...)
Out of my own curiosity -- does the dictionary affect the amount of memory
needed when reading & writing Parquet? Was Parquet creating this dictionary in
any case, so we always had it sitting around in memory? Or does this change
mean that now we're storing more in memory than before?