GitHub user sryza commented on the pull request:

    https://github.com/apache/spark/pull/3913#issuecomment-93487382
  
    Again, one of the main uses is estimating the size of variables you're 
considering broadcasting.  Another is experimenting with different 
representations - e.g. how much more efficient is declaring a custom class than 
just using a hash map?  In these situations, putting the data into an RDD to 
estimate the size would be an inconvenience.


