GitHub user kayousterhout commented on the pull request: https://github.com/apache/spark/pull/3003#issuecomment-61183847

Thanks for updating this! I'd still like the error message returned to the user (the one in the abort() call) to include both the size of the too-large result and the configured maximum size. Adding this information costs almost nothing, and it saves a lot of time for a user trying to understand why their job was aborted. It looks like you're running from a Spark shell with the logging level set to INFO, but users in other environments will see only the SparkException, not the log message.

Also, it looks like you didn't address the comment about multiple jobs/stages running at once. Right now, the limit is enforced per stage. This seems like an issue because multiple concurrent stages or jobs that all collect results can together add up to more than the limit. @mateiz do you think this is a non-issue?
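For illustration, a minimal Scala sketch of an abort message that names both sizes, in the spirit of what this comment asks for. This is not the PR's actual code: the function name tooLargeResultMessage and the inline byte formatter are assumptions made for this sketch (Spark itself has a Utils.bytesToString helper for the formatting).

```scala
// Hypothetical sketch: build an abort message that reports both the
// observed result size and the configured limit, so the SparkException
// alone is enough to diagnose the failure without INFO-level logs.
object ResultSizeMessages {
  private def bytesToString(size: Long): String = {
    // Simple human-readable byte formatter for the sketch.
    val kb = 1L << 10; val mb = 1L << 20; val gb = 1L << 30
    if (size >= gb) f"${size.toDouble / gb}%.1f GB"
    else if (size >= mb) f"${size.toDouble / mb}%.1f MB"
    else if (size >= kb) f"${size.toDouble / kb}%.1f KB"
    else s"$size B"
  }

  def tooLargeResultMessage(totalResultSize: Long, maxResultSize: Long): String =
    s"Total size of serialized task results (${bytesToString(totalResultSize)}) " +
      s"is bigger than the configured maximum (${bytesToString(maxResultSize)}); " +
      "aborting the stage."
}

// Example: the string passed to abort() would then read something like
//   "Total size of serialized task results (2.0 GB) is bigger than the
//    configured maximum (1.0 GB); aborting the stage."
```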