gemini-code-assist[bot] commented on code in PR #289:
URL: https://github.com/apache/tvm-ffi/pull/289#discussion_r2566743863
##########
docs/guides/kernel_library_guide.rst:
##########
@@ -145,6 +145,20 @@ Explicit Update
Once the devices on which the stream contexts reside cannot be inferred from
the tensors, the explicit update on stream context table is necessary. TVM FFI
provides :py:func:`tvm_ffi.use_torch_stream` and
:py:func:`tvm_ffi.use_raw_stream` for manual stream context update. However, it
is **recommended** to use implicit update above, to reduce code complexity.
+Device Guard
+============
+
+When launching kernels, the kernel libraries may require a context on specific
+device. TVM FFI provides :cpp:class:`tvm::ffi::CUDADeviceGuard` to set the
+context device, similar to :cpp:class:`c10::cuda::CUDAGuard`. Once constructing
+a :cpp:class:`tvm::ffi::CUDADeviceGuard` with the device index, it stores the
+original device index from ``cudaGetDevice``, and set the current device to
+given device index, by ``cudaSetDevice``. Finally, when destructing
+:cpp:class:`tvm::ffi::CUDADeviceGuard`, the current device index reset to the
+stored original device index, still via ``cudaSetDevice``. Here is an example:
+
+.. code-block:: c++
+
+   void func(ffi::TensorView input, ...) {
+     // current device index is original device index
+     ffi::CUDADeviceGuard device_guard(input.device().device_id);
+     // current device index is input device index
+   }
+ // after calling func, the current device index reset to original device index
Review Comment:

The explanation of `CUDADeviceGuard` and the accompanying example can be
improved for clarity and correctness.
* The main paragraph has some grammatical awkwardness and could be rephrased
to be more explicit about the RAII pattern being used, which is a key concept.
* The comment `// after calling func...` is outside the `code-block`, which
is confusing as it will be rendered as a normal paragraph starting with `//`.
It's better to phrase it as a descriptive sentence.
Here is a suggested improvement for the whole section:
```suggestion
When launching kernels, kernel libraries may require the current device
context to be set for a specific device. TVM FFI provides the
:cpp:class:`tvm::ffi::CUDADeviceGuard` class to manage this, similar to
:cpp:class:`c10::cuda::CUDAGuard`. When a
:cpp:class:`tvm::ffi::CUDADeviceGuard` object is constructed with a device
index, it saves the original device index (retrieved using ``cudaGetDevice``)
and sets the current device to the given index (using ``cudaSetDevice``). Upon
destruction (e.g., when it goes out of scope), the guard restores the current
device to the original device index, also using ``cudaSetDevice``. This RAII
pattern ensures the device context is handled correctly. Here is an example:

.. code-block:: c++

   void func(ffi::TensorView input, ...) {
     // current device index is original device index
     ffi::CUDADeviceGuard device_guard(input.device().device_id);
     // current device index is input device index
   }

After `func` returns, the `device_guard` is destructed, and the original
device index is restored.
```
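
For context, the save/set/restore behaviour described in the suggested text is the standard RAII device-guard pattern. The sketch below is purely illustrative — the class name `DeviceGuardSketch` is made up and this is not the actual `tvm::ffi::CUDADeviceGuard` implementation — and it assumes only the CUDA runtime calls `cudaGetDevice` and `cudaSetDevice` that the paragraph already mentions:

```cpp
#include <cuda_runtime.h>

// Minimal sketch of an RAII device guard; the real
// tvm::ffi::CUDADeviceGuard may differ in details such as error handling.
class DeviceGuardSketch {
 public:
  explicit DeviceGuardSketch(int new_device) {
    // Save the device that is current on entry.
    cudaGetDevice(&original_device_);
    // Switch to the requested device for the lifetime of the guard.
    cudaSetDevice(new_device);
  }
  ~DeviceGuardSketch() {
    // Restore the original device when the guard goes out of scope.
    cudaSetDevice(original_device_);
  }
  // Non-copyable: a copied guard would restore the device more than once.
  DeviceGuardSketch(const DeviceGuardSketch&) = delete;
  DeviceGuardSketch& operator=(const DeviceGuardSketch&) = delete;

 private:
  int original_device_ = 0;
};
```

Making the guard non-copyable mirrors the usual design choice for such guards and matches the scoped behaviour relied on by the `func` example above: construction switches the current device, and destruction at the closing brace switches it back.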