ganler commented on a change in pull request #8285:
URL: https://github.com/apache/tvm/pull/8285#discussion_r654871849
##########
File path: src/runtime/vm/pooled_allocator.h
##########
@@ -57,14 +57,22 @@ class PooledAllocator final : public Allocator {
Buffer buf;
buf.device = device_;
buf.size = size;
- buf.data = DeviceAPI::Get(device_)->AllocDataSpace(device_, size, alignment, type_hint);
+ try {
+   buf.data = DeviceAPI::Get(device_)->AllocDataSpace(device_, size, alignment, type_hint);
+ } catch (InternalError& err) {
+   LOG(WARNING) << "PooledAllocator got InternalError during allocation: " << err.message();
+   LOG(WARNING) << "Trying to release all unused memory and reallocate...";
+   ReleaseAll();
+   buf.data = DeviceAPI::Get(device_)->AllocDataSpace(device_, size, alignment, type_hint);
Review comment:
Thanks for the suggestion, but IMHO it is not robust enough.
Say we have 8 GB of GPU memory, the `PooledAllocator` has cached 4 GB, and we
want to allocate 6 GB.
- With your idea, `ReleaseAll()` reports that only "4 GB" was freed, which is
less than the "6 GB" requested, so the allocation would be treated as a failure.
- Instead, if we release the unused memory and then simply retry the
allocation, the "6 GB" request is very likely to succeed.
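The scenario above can be sketched with a small mock (all names here are illustrative stand-ins, not TVM's actual `PooledAllocator`/`DeviceAPI` types): the pool holds cached-but-unused buffers that still count against the device budget, the first allocation attempt fails, `ReleaseAll()` returns the cached memory, and the retried allocation succeeds.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical device with a fixed memory budget (units are "GB" for clarity).
struct MockDevice {
  size_t total = 8;  // total device memory
  size_t used = 0;   // memory currently allocated on the device

  size_t Alloc(size_t size) {
    if (used + size > total) throw std::runtime_error("out of memory");
    used += size;
    return size;  // stand-in for a device pointer
  }
};

// Hypothetical pooled allocator: freed buffers stay cached on the device.
struct MockPooledAllocator {
  MockDevice* dev;
  size_t cached = 0;  // unused buffers held by the pool (counted in dev->used)

  explicit MockPooledAllocator(MockDevice* d) : dev(d) {}

  // Return all cached (unused) buffers to the device.
  void ReleaseAll() {
    dev->used -= cached;
    cached = 0;
  }

  // The fallback from the diff: try once; on failure, release the pool's
  // unused memory and retry the full allocation.
  size_t Alloc(size_t size) {
    try {
      return dev->Alloc(size);
    } catch (const std::runtime_error&) {
      ReleaseAll();
      return dev->Alloc(size);  // may still throw if memory is truly exhausted
    }
  }
};
```

With 4 GB cached and a 6 GB request against an 8 GB device, the first attempt fails (4 + 6 > 8), `ReleaseAll()` frees the cached 4 GB, and the retry succeeds; a pre-check comparing "released amount" (4 GB) against "request" (6 GB) would instead have reported failure.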
The big picture behind your idea would be practical if we had APIs like
"total_system_memory" and "available_system_memory", but those may require
introducing a series of runtime/driver libraries, e.g., `cudaMemGetInfo` from
the CUDA runtime (user space) or NVML (if the necessary system privileges are
available).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]