Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/2247#issuecomment-54395088
I am not sure I follow... We compress shuffle with the first, while we
compress spills with the second. How would they be the same? The perf
characteristics would be different.
The team has decided to use it, and I don't want to second-guess their
reasons, whatever they might be (that just leads to unnecessary
discussions).
Given that it is a public interface available in 1.0 (IIRC since much
earlier?), we can't remove support for it until the next major release. We
can of course deprecate it and warn against its use, but we will need to
continue supporting it until then.
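To make the deprecate-and-warn option concrete, here is a minimal sketch in plain Python, not Spark's actual implementation. The `get_conf` helper and the `DEPRECATED_KEYS` set are hypothetical, and treating `spark.shuffle.spill.compress` as the deprecated key is an assumption made purely for illustration. The key keeps working; the user only gets a warning when they set it explicitly.

```python
import warnings

# Hypothetical: keys we would warn about but keep honoring until the next
# major release. Which key (if any) gets deprecated is an assumption here.
DEPRECATED_KEYS = {"spark.shuffle.spill.compress"}


def get_conf(conf, key, default=None):
    """Return conf[key], warning if a deprecated key was explicitly set."""
    if key in DEPRECATED_KEYS and key in conf:
        warnings.warn(
            f"{key} is deprecated and may be removed in the next major release",
            DeprecationWarning,
            stacklevel=2,
        )
    # The value is still returned, so existing jobs keep their behavior.
    return conf.get(key, default)


# The two settings held at different values, the case under discussion.
conf = {
    "spark.shuffle.compress": "true",
    "spark.shuffle.spill.compress": "false",
}
```

Calling `get_conf(conf, "spark.shuffle.spill.compress")` still returns the configured value and emits a `DeprecationWarning`, while `get_conf(conf, "spark.shuffle.compress")` returns silently.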
On 04-Sep-2014 2:10 am, "Matei Zaharia" <[email protected]> wrote:
> It's not an interface in the sense that program output doesn't change.
> It's only a config setting for an optimization.
>
> Anyway, I agree with Reynold -- I'd like to see an example where
> spark.shuffle.spill.compress and spark.shuffle.compress need to be at
> different values, and see performance numbers for that. It seems to me that
> you're spilling the same kind of objects in both, so there will be the same
> tradeoff between I/O and compute time.