zhli1142015 commented on issue #10932:
URL:
https://github.com/apache/incubator-gluten/issues/10932#issuecomment-3440105177
The following formula defines the amount of memory available to each task; once this
threshold is exceeded, the TaskMemoryManager triggers a spill.
```
Per-task memory = ((spark.executor.memory - 300MB) * spark.memory.fraction)
/ active tasks count
```
By default, spark.memory.fraction is 0.6, meaning that only about 60% of the
executor memory (after subtracting 300 MB for overhead) is available for task
computation.
As a result, the memory usage threshold that triggers a spill is much lower
than the amount of memory actually required to cause an OOM kill (exit code 137).
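To make the gap concrete, here is a minimal sketch of the formula above. The executor size and task count are illustrative assumptions, not values from this issue.

```python
# Sketch of the per-task spill threshold under Spark's unified memory manager,
# following the formula above. All concrete numbers are assumed for illustration.
RESERVED = 300 * 1024 * 1024  # 300 MB reserved by Spark

def per_task_memory(executor_memory_bytes, active_tasks, memory_fraction=0.6):
    """Memory available to each task before the TaskMemoryManager spills."""
    usable = (executor_memory_bytes - RESERVED) * memory_fraction
    return usable / active_tasks

# Example: 8 GiB executor, 4 concurrently running tasks
executor_mem = 8 * 1024**3
threshold = per_task_memory(executor_mem, 4)
print(f"spill threshold per task: {threshold / 1024**2:.0f} MiB")
# → spill threshold per task: 1184 MiB
```

The container would only be OOM-killed near the full 8 GiB limit, so the ~1.2 GiB spill threshold in this example sits far below the point where exit code 137 occurs.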
The DynamicOffHeapSizingMemoryTarget does not change this logic. Its role is
to let the JVM release free memory back to the operating system when Velox
needs more memory. Since Spark does not account for released memory, this
behavior is transparent to the TaskMemoryManager.
Under this architecture, Velox can theoretically utilize the entire memory
allocated to the task, not just the configured off-heap portion.
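A toy model of that point, under stated assumptions (the function, names, and numbers below are illustrative only, not Gluten's actual API):

```python
# Hedged illustration: with dynamic off-heap sizing, the native (Velox) side is
# not capped at the configured off-heap slice. As the JVM releases freed heap
# back to the OS, Velox can grow toward the full per-task budget.
def velox_usable(task_budget, offheap_size, jvm_released, dynamic_sizing):
    if not dynamic_sizing:
        return offheap_size  # capped at the configured off-heap portion
    # Heap released to the OS can be reused natively, up to the task budget.
    return min(offheap_size + jvm_released, task_budget)

GIB = 1024**3
print(velox_usable(4 * GIB, 1 * GIB, jvm_released=2 * GIB,
                   dynamic_sizing=True) / GIB)  # → 3.0
```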
@wForget In our case, we observed that the spill behavior was not affected
by this feature.
Note that the JVM currently does not provide an API for on-demand memory
shrinking; the heap shrinks only as a side effect of a full GC, and this
behavior may vary across JVM versions and garbage collectors.
Could you share your runtime environment, such as the JVM version, GC type,
and memory configuration? Also, in what scenario did you observe that a spill
should have occurred but instead resulted in an OOM?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]