ryankert01 commented on code in PR #1310:
URL: https://github.com/apache/mahout/pull/1310#discussion_r3332492205


##########
qdp/qdp-core/src/gpu/encodings/basis.rs:
##########
@@ -541,6 +541,29 @@ impl QuantumEncoder for BasisEncoder {
         Ok(batch_state_vector)
     }
 
+    #[cfg(target_os = "linux")]
+    unsafe fn encode_from_gpu_ptr_f32(
+        &self,
+        device: &Arc<CudaDevice>,
+        input_d: *const c_void,
+        input_len: usize,
+        num_qubits: usize,
+        stream: *mut c_void,
+    ) -> Result<GpuStateVector> {
+        // Delegate to the workhorse `_with_stream` fn (see angle.rs for 
rationale).
+        // `input_len` is unused — basis is always one index per sample — but 
kept on the
+        // signature to match the trait shape used by amplitude / angle.
+        let _ = input_len;
+        unsafe {

Review Comment:
   The caller-supplied `input_len` is silently discarded here. A debug 
assertion would catch misuse in test/debug builds at zero release cost:
   ```rust
   debug_assert_eq!(input_len, 1, "basis encoder takes exactly one index per 
sample");
   ```



##########
qdp/qdp-core/src/gpu/encodings/mod.rs:
##########
@@ -135,6 +135,34 @@ pub trait QuantumEncoder: Send + Sync + 'static {
         )))
     }
 
+    /// Encode a single sample from an existing GPU pointer (zero-copy) using 
an f32 input.
+    /// Default: not supported.
+    ///
+    /// This is the f32 counterpart of 
[`encode_from_gpu_ptr`](Self::encode_from_gpu_ptr). The
+    /// sibling batch variant is 
[`encode_batch_from_gpu_ptr_f32`](Self::encode_batch_from_gpu_ptr_f32).
+    /// Adding a new encoder with f32 zero-copy support should override 
**both** this method and
+    /// the batch variant; the previous arrangement (single-sample as a 
standalone `pub unsafe
+    /// fn` on each encoder type, batch on the trait) made the pattern 
accidentally divergent.
+    ///
+    /// # Safety
+    /// Caller must ensure `input_d` points to valid GPU memory with at least 
`input_len`
+    /// `f32` elements on the same device as `device`, and `stream` is either 
null or a valid
+    /// CUDA stream associated with `device`.
+    #[cfg(target_os = "linux")]
+    unsafe fn encode_from_gpu_ptr_f32(
+        &self,

Review Comment:
   The `# Safety` doc should note that this path skips 
`validate_cuda_input_ptr`. The engine entry point calls it before delegating, 
but callers going through the trait directly don't get that check. Worth one 
sentence:
   ```
   /// The `QdpEngine` entry point validates `input_d` against `device` 
automatically;
   /// callers invoking this method directly are responsible for that check.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to