Github user mateiz commented on the pull request:
https://github.com/apache/spark/pull/2247#issuecomment-54521058
Sure, I'd be happy to codify that. In general my approach to this has been
that we keep all config settings and defaults that affect correctness (whether
the program gives the right result), but things that affect performance are
fair to change. For example, we've kept spark.serializer set to Java
serialization, since changing the default would break a lot of apps. But we've
tweaked things like the % of memory used for caching and buffer sizes, and
we've also made some settings largely irrelevant (e.g. the Akka frame size used
to matter a lot, but we moved task results first, and then tasks themselves,
off Akka).
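
As a rough illustration of the rule above (this is not Spark code), the policy can be written as a small lookup. The `Impact` enum and `default_may_change` helper are hypothetical names, and the keys `spark.storage.memoryFraction` and `spark.akka.frameSize` are assumed to be the settings meant by "% of memory used for caching" and "Akka frame size"; the tag on each key is only one reading of the comment.

```python
from enum import Enum


class Impact(Enum):
    """What a config setting can affect (hypothetical classification)."""
    CORRECTNESS = "correctness"   # changing the default may change program results
    PERFORMANCE = "performance"   # changing the default only affects speed/memory


# Illustrative tags only; assumptions for this sketch, not an official list.
SETTINGS = {
    "spark.serializer": Impact.CORRECTNESS,
    "spark.storage.memoryFraction": Impact.PERFORMANCE,
    "spark.akka.frameSize": Impact.PERFORMANCE,
}


def default_may_change(key: str) -> bool:
    """Return True if changing this setting's default fits the stated policy."""
    return SETTINGS[key] is Impact.PERFORMANCE


if __name__ == "__main__":
    for key in SETTINGS:
        verdict = "fair to change" if default_may_change(key) else "keep default"
        print(f"{key}: {verdict}")
```

Under these assumed tags, only spark.serializer comes out as "keep default", which matches the serializer example in the comment.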
Anyway, in this case I'm not arguing that this should be removed in 1.X if
there are concerns; I'm just trying to understand whether this is a useful
setting to have longer-term. In particular, should we deprecate it and remove
it in 2.X? So far I haven't seen a strong case that this is a useful setting,
but I might be missing some use case.