ganler commented on a change in pull request #8285:
URL: https://github.com/apache/tvm/pull/8285#discussion_r654871849
##########
File path: src/runtime/vm/pooled_allocator.h
##########
@@ -57,14 +57,22 @@ class PooledAllocator final : public Allocator {
Buffer buf;
buf.device = device_;
buf.size = size;
-  buf.data = DeviceAPI::Get(device_)->AllocDataSpace(device_, size, alignment, type_hint);
+  try {
+    buf.data = DeviceAPI::Get(device_)->AllocDataSpace(device_, size, alignment, type_hint);
+  } catch (InternalError& err) {
+    LOG(WARNING) << "PooledAllocator got InternalError during allocation: " << err.message();
+    LOG(WARNING) << "Trying to release all unused memory and reallocate...";
+    ReleaseAll();
+    buf.data = DeviceAPI::Get(device_)->AllocDataSpace(device_, size, alignment, type_hint);
Review comment:
Thanks for the suggestion, but IMHO it is not robust enough. Say we have 8 GB
of GPU memory, the pool has cached 4 GB of it, and we want to allocate 6 GB:
- With your idea, `ReleaseAll()` returns "4 GB", which is less than the
requested "6 GB", so the allocation is reported as failed.
- Instead, if we release the unused memory and then retry the allocation, the
"6 GB" request is very likely to succeed.
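The scenario above can be sketched with a toy model (all names here are
illustrative, not TVM's actual allocator API; units are GB for simplicity):

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical device with a fixed capacity; throws when over-committed.
struct ToyDevice {
  size_t capacity;
  size_t used = 0;
  void* Alloc(size_t n) {
    if (used + n > capacity) throw std::runtime_error("out of device memory");
    used += n;
    return reinterpret_cast<void*>(used);  // dummy non-null handle
  }
  void Free(size_t n) { used -= n; }
};

// Hypothetical pool holding cached (unused) blocks on the device.
struct ToyPool {
  ToyDevice* dev;
  size_t cached;  // bytes held by the pool but currently unused

  // Release-and-retry strategy from the patch: on allocation failure,
  // return the cached blocks to the device and try once more.
  void* Alloc(size_t n) {
    try {
      return dev->Alloc(n);
    } catch (const std::runtime_error&) {
      ReleaseAll();
      return dev->Alloc(n);  // retry; may still throw if truly exhausted
    }
  }
  void ReleaseAll() {
    dev->Free(cached);
    cached = 0;
  }
};
```

With an 8 GB device whose pool caches 4 GB, a 6 GB request first fails,
`ReleaseAll()` frees the cached 4 GB, and the retry then succeeds, whereas a
"released < requested" check would have given up.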
That said, the big picture behind your idea would be practical if we had APIs
like "total_system_memory" and "available_system_memory". Exposing those may
require introducing a series of system driver libraries, e.g.,
`cudaMemGetInfo` from the CUDA runtime (user space) or `NVML` (which requires
more system privileges).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]