GitHub user WenboZhao opened a pull request:
https://github.com/apache/spark/pull/21593
[SPARK-24578][Core] Cap sub-region's size of returned nio buffer
## What changes were proposed in this pull request?
This PR tries to fix the performance regression introduced by SPARK-21517.
In our production jobs, we run many computations in parallel, so with high
probability some task is scheduled on a host-2 that needs to read cached block
data from host-1. This large transfer often causes the cluster to suffer
timeout issues (the read is retried 3 times, each with a 120s timeout, and the
block is then recomputed and put into the local MemoryStore).
The root cause is that we no longer call `consolidateIfNeeded` for the many
small chunks, which makes `buf.nioBuffer()` perform badly when we have to call
`copyByteBuf()` many times.
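The idea behind capping the sub-region size can be sketched outside of Netty with plain `java.nio`: instead of asking for one NIO view over the whole (possibly composite) buffer on every copy, each call only materializes a bounded slice. This is a minimal illustration, not the actual Spark patch; the class name, `copyCapped` helper, and the `NIO_BUFFER_LIMIT` constant are hypothetical stand-ins for the capping logic.

```java
import java.nio.ByteBuffer;

public class CapNioBuffer {
    // Hypothetical cap on the size of each sub-region handed to the copy path.
    static final int NIO_BUFFER_LIMIT = 256 * 1024;

    // Drains src into dst in slices of at most NIO_BUFFER_LIMIT bytes,
    // returning how many capped slices were needed.
    static int copyCapped(ByteBuffer src, ByteBuffer dst) {
        int slices = 0;
        while (src.hasRemaining()) {
            int len = Math.min(src.remaining(), NIO_BUFFER_LIMIT);
            // Take a bounded view instead of one view over the whole buffer.
            ByteBuffer slice = src.duplicate();
            slice.limit(slice.position() + len);
            dst.put(slice);
            src.position(src.position() + len);
            slices++;
        }
        return slices;
    }

    public static void main(String[] args) {
        int total = 1 << 20; // 1 MiB source buffer
        ByteBuffer src = ByteBuffer.allocate(total);
        ByteBuffer dst = ByteBuffer.allocate(total);
        int slices = copyCapped(src, dst);
        System.out.println("slices=" + slices + " copied=" + dst.position());
    }
}
```

With a 1 MiB source and a 256 KiB cap, the copy proceeds in 4 bounded slices; each slice stays cheap regardless of how many underlying chunks the buffer is composed of.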
## How was this patch tested?
Existing unit tests, and also tested in production.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/WenboZhao/spark spark-24578
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/21593.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #21593
----
commit a30d4de019ac4380cf5bfd36ff0cf12ef72d78f7
Author: Wenbo Zhao <wzhao@...>
Date: 2018-06-19T20:34:30Z
Cap sub-region's size of returned nio buffer
----