AngersZhuuuu opened a new pull request #28541:
URL: https://github.com/apache/spark/pull/28541
### What changes were proposed in this pull request?
When a `MemoryConsumer` acquires memory and the granted size is smaller than the requested size, the consumer spills some of its data to disk and then re-acquires execution memory.
But sometimes the re-acquisition still returns 0, so the task fails and is recomputed. Such tasks are usually heavy, which makes the recomputation expensive.
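For reference, a minimal sketch of this acquire-spill-reacquire flow; `requestMemory` and `spill` are hypothetical stand-ins for the real `TaskMemoryManager`/`MemoryConsumer` calls, passed in as parameters so the snippet stays self-contained:
```
def acquireWithSpill(required: Long,
                     requestMemory: Long => Long,
                     spill: Long => Unit): Long = {
  var got = requestMemory(required)
  if (got < required) {
    // Free space by spilling part of this consumer's data to disk...
    spill(required - got)
    // ...then re-acquire; under heavy contention this can still return 0.
    got += requestMemory(required - got)
  }
  got
}
```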
Each task's share of execution memory is bounded by `maxExecutionSize / (2 * activeTaskNum) <= size <= maxExecutionSize / activeTaskNum`. When new tasks arrive while the executor is busy, `activeTaskNum` grows and the amount of execution memory each task may use shrinks, as implemented in `ExecutionMemoryPool.acquireMemory()`:
```
val maxPoolSize = computeMaxPoolSize()
val maxMemoryPerTask = maxPoolSize / numActiveTasks
val minMemoryPerTask = poolSize / (2 * numActiveTasks)

// How much we can grant this task; keep its share within 0 <= X <= 1 / numActiveTasks
val maxToGrant = math.min(numBytes, math.max(0, maxMemoryPerTask - curMem))
// Only give it as much memory as is free, which might be none if it reached 1 / numTasks
val toGrant = math.min(maxToGrant, memoryFree)
```
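Plugging hypothetical numbers into the formulas above shows how `toGrant` reaches 0: suppose a task grabbed 400 MB while only 2 tasks were active (per-task limit 500 MB), and then two new tasks arrived, dropping the limit to 250 MB:
```
val maxPoolSize    = 1000L << 20  // 1000 MB execution pool
val numActiveTasks = 4            // grew from 2 to 4 as new tasks arrived
val curMem         = 400L << 20   // memory this task already holds
val numBytes       = 100L << 20   // additional memory it requests
val memoryFree     = 50L << 20    // memory still unallocated in the pool

val maxMemoryPerTask = maxPoolSize / numActiveTasks                         // 250 MB
val maxToGrant = math.min(numBytes, math.max(0, maxMemoryPerTask - curMem)) // 0
val toGrant    = math.min(maxToGrant, memoryFree)                           // 0 -> OOM path
println(s"toGrant = $toGrant")  // prints: toGrant = 0
```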
When `toGrant` is 0, the task throws an OOM and is retried, which is expensive because such tasks are usually heavy. Instead, the task can wait for other, smaller tasks to finish and then try to acquire memory again, as sketched below.
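A minimal sketch of that idea, not the actual patch: `tryAcquire` is a hypothetical stand-in for the real acquire call, and the retry count and wait interval here are purely illustrative.
```
def acquireWithWait(required: Long,
                    tryAcquire: Long => Long,
                    maxRetries: Int = 3,
                    waitMs: Long = 1000L): Long = {
  var got = tryAcquire(required)
  var retries = 0
  while (got == 0 && retries < maxRetries) {
    // Wait for other (smaller) tasks to finish and release execution
    // memory, then retry instead of failing the task with an OOM.
    Thread.sleep(waitMs)
    got = tryAcquire(required)
    retries += 1
  }
  got
}
```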
### Why are the changes needed?
Make tasks more resilient to transient memory pressure, avoiding unnecessary OOM failures and expensive recomputation.
### Does this PR introduce _any_ user-facing change?
NO
### How was this patch tested?
No tests added.