Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-22 Thread via GitHub


ryankert01 commented on PR #881:
URL: https://github.com/apache/mahout/pull/881#issuecomment-3784151742

   opened issues! @CheyuWu @400Ping 
   https://github.com/apache/mahout/issues/912
   https://github.com/apache/mahout/issues/911
   https://github.com/apache/mahout/issues/907


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-21 Thread via GitHub


ryankert01 merged PR #881:
URL: https://github.com/apache/mahout/pull/881





Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-21 Thread via GitHub


rich7420 commented on PR #881:
URL: https://github.com/apache/mahout/pull/881#issuecomment-3782772877

   LGTM, @ryankert01, thanks for the update!





Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-21 Thread via GitHub


rich7420 commented on code in PR #881:
URL: https://github.com/apache/mahout/pull/881#discussion_r2715514875


##
qdp/qdp-core/src/lib.rs:
##
@@ -300,6 +300,269 @@ impl QdpEngine {
 encoding_method,
 )
 }
+
+/// Encode from existing GPU pointer (zero-copy for CUDA tensors)
+///
+/// This method enables zero-copy encoding from PyTorch CUDA tensors by accepting
+/// a raw GPU pointer directly, avoiding the GPU→CPU→GPU copy that would otherwise
+/// be required.
+///
+/// TODO: Refactor to use QuantumEncoder trait (add `encode_from_gpu_ptr` to trait)
+/// to reduce duplication with AmplitudeEncoder::encode(). This would also make it
+/// easier to add GPU pointer support for other encoders (angle, basis) in the future.
+///
+/// # Arguments
+/// * `input_d` - Device pointer to input data (f64 array on GPU)
+/// * `input_len` - Number of f64 elements in the input
+/// * `num_qubits` - Number of qubits for encoding
+/// * `encoding_method` - Strategy (currently only "amplitude" supported)
+///
+/// # Returns
+/// DLPack pointer for zero-copy PyTorch integration
+///
+/// # Safety
+/// The input pointer must:
+/// - Point to valid GPU memory on the same device as the engine
+/// - Contain at least `input_len` f64 elements
+/// - Remain valid for the duration of this call
+#[cfg(target_os = "linux")]
+pub unsafe fn encode_from_gpu_ptr(

Review Comment:
   no problem!






Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-21 Thread via GitHub


ryankert01 commented on code in PR #881:
URL: https://github.com/apache/mahout/pull/881#discussion_r2715401161


##
qdp/qdp-core/src/lib.rs:
##
@@ -300,6 +300,269 @@ impl QdpEngine {
 encoding_method,
 )
 }
+
+/// Encode from existing GPU pointer (zero-copy for CUDA tensors)
+///
+/// This method enables zero-copy encoding from PyTorch CUDA tensors by accepting
+/// a raw GPU pointer directly, avoiding the GPU→CPU→GPU copy that would otherwise
+/// be required.
+///
+/// TODO: Refactor to use QuantumEncoder trait (add `encode_from_gpu_ptr` to trait)
+/// to reduce duplication with AmplitudeEncoder::encode(). This would also make it
+/// easier to add GPU pointer support for other encoders (angle, basis) in the future.
+///
+/// # Arguments
+/// * `input_d` - Device pointer to input data (f64 array on GPU)
+/// * `input_len` - Number of f64 elements in the input
+/// * `num_qubits` - Number of qubits for encoding
+/// * `encoding_method` - Strategy (currently only "amplitude" supported)
+///
+/// # Returns
+/// DLPack pointer for zero-copy PyTorch integration
+///
+/// # Safety
+/// The input pointer must:
+/// - Point to valid GPU memory on the same device as the engine
+/// - Contain at least `input_len` f64 elements
+/// - Remain valid for the duration of this call
+#[cfg(target_os = "linux")]
+pub unsafe fn encode_from_gpu_ptr(

Review Comment:
   This is actually one of my followups!






Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-21 Thread via GitHub


ryankert01 commented on PR #881:
URL: https://github.com/apache/mahout/pull/881#issuecomment-3782616112

   cc @rich7420 @guan404ming 





Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-21 Thread via GitHub


ryankert01 commented on code in PR #881:
URL: https://github.com/apache/mahout/pull/881#discussion_r2715401161


##
qdp/qdp-core/src/lib.rs:
##
@@ -300,6 +300,269 @@ impl QdpEngine {
 encoding_method,
 )
 }
+
+/// Encode from existing GPU pointer (zero-copy for CUDA tensors)
+///
+/// This method enables zero-copy encoding from PyTorch CUDA tensors by accepting
+/// a raw GPU pointer directly, avoiding the GPU→CPU→GPU copy that would otherwise
+/// be required.
+///
+/// TODO: Refactor to use QuantumEncoder trait (add `encode_from_gpu_ptr` to trait)
+/// to reduce duplication with AmplitudeEncoder::encode(). This would also make it
+/// easier to add GPU pointer support for other encoders (angle, basis) in the future.
+///
+/// # Arguments
+/// * `input_d` - Device pointer to input data (f64 array on GPU)
+/// * `input_len` - Number of f64 elements in the input
+/// * `num_qubits` - Number of qubits for encoding
+/// * `encoding_method` - Strategy (currently only "amplitude" supported)
+///
+/// # Returns
+/// DLPack pointer for zero-copy PyTorch integration
+///
+/// # Safety
+/// The input pointer must:
+/// - Point to valid GPU memory on the same device as the engine
+/// - Contain at least `input_len` f64 elements
+/// - Remain valid for the duration of this call
+#[cfg(target_os = "linux")]
+pub unsafe fn encode_from_gpu_ptr(

Review Comment:
   This is actually one of my followups! (see lib.rs L310)






Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-21 Thread via GitHub


ryankert01 commented on code in PR #881:
URL: https://github.com/apache/mahout/pull/881#discussion_r2715390231


##
qdp/qdp-core/src/lib.rs:
##
@@ -300,6 +300,269 @@ impl QdpEngine {
 encoding_method,
 )
 }
+
+/// Encode from existing GPU pointer (zero-copy for CUDA tensors)
+///
+/// This method enables zero-copy encoding from PyTorch CUDA tensors by accepting
+/// a raw GPU pointer directly, avoiding the GPU→CPU→GPU copy that would otherwise
+/// be required.
+///
+/// TODO: Refactor to use QuantumEncoder trait (add `encode_from_gpu_ptr` to trait)
+/// to reduce duplication with AmplitudeEncoder::encode(). This would also make it
+/// easier to add GPU pointer support for other encoders (angle, basis) in the future.
+///
+/// # Arguments
+/// * `input_d` - Device pointer to input data (f64 array on GPU)
+/// * `input_len` - Number of f64 elements in the input
+/// * `num_qubits` - Number of qubits for encoding
+/// * `encoding_method` - Strategy (currently only "amplitude" supported)
+///
+/// # Returns
+/// DLPack pointer for zero-copy PyTorch integration
+///
+/// # Safety
+/// The input pointer must:
+/// - Point to valid GPU memory on the same device as the engine
+/// - Contain at least `input_len` f64 elements
+/// - Remain valid for the duration of this call
+#[cfg(target_os = "linux")]
+pub unsafe fn encode_from_gpu_ptr(
+&self,
+input_d: *const f64,
+input_len: usize,
+num_qubits: usize,
+encoding_method: &str,
+) -> Result<*mut DLManagedTensor> {
+crate::profile_scope!("Mahout::EncodeFromGpuPtr");
+
+if encoding_method != "amplitude" {
+return Err(MahoutError::NotImplemented(format!(
+"GPU pointer encoding currently only supports 'amplitude' 
method, got '{}'",
+encoding_method
+)));
+}
+
+if input_len == 0 {
+return Err(MahoutError::InvalidInput(
+"Input data cannot be empty".into(),
+));
+}
+
+let state_len = 1usize << num_qubits;
+if input_len > state_len {
+return Err(MahoutError::InvalidInput(format!(
+"Input size {} exceeds state vector size {} (2^{} qubits)",
+input_len, state_len, num_qubits
+)));
+}
+
+// Allocate output state vector
+let state_vector = {
+crate::profile_scope!("GPU::Alloc");
+gpu::GpuStateVector::new(&self.device, num_qubits)?
+};
+
+// Compute inverse L2 norm on GPU
+let inv_norm = {
+crate::profile_scope!("GPU::NormFromPtr");
+// SAFETY: input_d validity is guaranteed by the caller's safety contract
+unsafe {
+gpu::AmplitudeEncoder::calculate_inv_norm_gpu(&self.device, input_d, input_len)?
+}
+};
+
+// Get output pointer
+let state_ptr = state_vector.ptr_f64().ok_or_else(|| {
+MahoutError::InvalidInput(
+"State vector precision mismatch (expected float64 
buffer)".to_string(),
+)
+})?;
+
+// Launch encoding kernel
+{
+crate::profile_scope!("GPU::KernelLaunch");
+let ret = unsafe {
+qdp_kernels::launch_amplitude_encode(
+input_d,
+state_ptr as *mut std::ffi::c_void,
+input_len,
+state_len,
+inv_norm,
+std::ptr::null_mut(), // default stream
+)
+};
+
+if ret != 0 {
+return Err(MahoutError::KernelLaunch(format!(

Review Comment:
   done
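
   For context, a minimal caller-side sketch of the safety contract documented in the hunk above; `engine`, `input_d`, `input_len`, and `num_qubits` are hypothetical stand-ins for a QdpEngine and a device-resident f64 buffer, not code from the PR:

```rust
// Hypothetical call site: input_d must point to input_len valid f64
// elements on the engine's device and stay alive across the call.
let dl_tensor = unsafe {
    engine.encode_from_gpu_ptr(input_d, input_len, num_qubits, "amplitude")?
};
```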






Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-21 Thread via GitHub


rich7420 commented on code in PR #881:
URL: https://github.com/apache/mahout/pull/881#discussion_r2715158517


##
qdp/qdp-core/src/lib.rs:
##
@@ -300,6 +300,269 @@ impl QdpEngine {
 encoding_method,
 )
 }
+
+/// Encode from existing GPU pointer (zero-copy for CUDA tensors)
+///
+/// This method enables zero-copy encoding from PyTorch CUDA tensors by accepting
+/// a raw GPU pointer directly, avoiding the GPU→CPU→GPU copy that would otherwise
+/// be required.
+///
+/// TODO: Refactor to use QuantumEncoder trait (add `encode_from_gpu_ptr` to trait)
+/// to reduce duplication with AmplitudeEncoder::encode(). This would also make it
+/// easier to add GPU pointer support for other encoders (angle, basis) in the future.
+///
+/// # Arguments
+/// * `input_d` - Device pointer to input data (f64 array on GPU)
+/// * `input_len` - Number of f64 elements in the input
+/// * `num_qubits` - Number of qubits for encoding
+/// * `encoding_method` - Strategy (currently only "amplitude" supported)
+///
+/// # Returns
+/// DLPack pointer for zero-copy PyTorch integration
+///
+/// # Safety
+/// The input pointer must:
+/// - Point to valid GPU memory on the same device as the engine
+/// - Contain at least `input_len` f64 elements
+/// - Remain valid for the duration of this call
+#[cfg(target_os = "linux")]
+pub unsafe fn encode_from_gpu_ptr(

Review Comment:
   The `encode_from_gpu_ptr` and `encode_batch_from_gpu_ptr` methods in `QdpEngine` have some code duplication with `AmplitudeEncoder::encode()` and `AmplitudeEncoder::encode_batch()`.
   I think maybe we could add `encode_from_gpu_ptr` and `encode_batch_from_gpu_ptr` methods to the `QuantumEncoder` trait (in `qdp/qdp-core/src/gpu/encodings/mod.rs`), and then implement these methods in `AmplitudeEncoder` (move the logic from `QdpEngine`) and simplify `QdpEngine::encode_from_gpu_ptr` to just get the encoder and call its method. WDYT?
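
   A rough sketch of that trait-level idea, written as a separate extension trait so the snippet stays self-contained; the crate's `Result`, `MahoutError`, and `DLManagedTensor` types are assumed, and the default-method shape is illustrative, not the agreed design:

```rust
// Hypothetical extension trait: encoders that can consume a raw device
// pointer override this; the rest inherit the NotImplemented default.
pub trait QuantumEncoderGpuExt {
    /// # Safety
    /// `input_d` must point to at least `input_len` valid f64 elements
    /// on the engine's device for the duration of the call.
    unsafe fn encode_from_gpu_ptr(
        &self,
        input_d: *const f64,
        input_len: usize,
        num_qubits: usize,
    ) -> Result<*mut DLManagedTensor> {
        Err(MahoutError::NotImplemented(
            "GPU pointer encoding not supported by this encoder".into(),
        ))
    }
}
```

   `QdpEngine::encode_from_gpu_ptr` would then just look up the encoder by name and delegate.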



##
qdp/qdp-core/src/lib.rs:
##
@@ -300,6 +300,269 @@ impl QdpEngine {
 encoding_method,
 )
 }
+
+/// Encode from existing GPU pointer (zero-copy for CUDA tensors)
+///
+/// This method enables zero-copy encoding from PyTorch CUDA tensors by accepting
+/// a raw GPU pointer directly, avoiding the GPU→CPU→GPU copy that would otherwise
+/// be required.
+///
+/// TODO: Refactor to use QuantumEncoder trait (add `encode_from_gpu_ptr` to trait)
+/// to reduce duplication with AmplitudeEncoder::encode(). This would also make it
+/// easier to add GPU pointer support for other encoders (angle, basis) in the future.
+///
+/// # Arguments
+/// * `input_d` - Device pointer to input data (f64 array on GPU)
+/// * `input_len` - Number of f64 elements in the input
+/// * `num_qubits` - Number of qubits for encoding
+/// * `encoding_method` - Strategy (currently only "amplitude" supported)
+///
+/// # Returns
+/// DLPack pointer for zero-copy PyTorch integration
+///
+/// # Safety
+/// The input pointer must:
+/// - Point to valid GPU memory on the same device as the engine
+/// - Contain at least `input_len` f64 elements
+/// - Remain valid for the duration of this call
+#[cfg(target_os = "linux")]
+pub unsafe fn encode_from_gpu_ptr(
+&self,
+input_d: *const f64,
+input_len: usize,
+num_qubits: usize,
+encoding_method: &str,
+) -> Result<*mut DLManagedTensor> {
+crate::profile_scope!("Mahout::EncodeFromGpuPtr");
+
+if encoding_method != "amplitude" {
+return Err(MahoutError::NotImplemented(format!(
+"GPU pointer encoding currently only supports 'amplitude' 
method, got '{}'",
+encoding_method
+)));
+}
+
+if input_len == 0 {
+return Err(MahoutError::InvalidInput(
+"Input data cannot be empty".into(),
+));
+}
+
+let state_len = 1usize << num_qubits;
+if input_len > state_len {
+return Err(MahoutError::InvalidInput(format!(
+"Input size {} exceeds state vector size {} (2^{} qubits)",
+input_len, state_len, num_qubits
+)));
+}
+
+// Allocate output state vector
+let state_vector = {
+crate::profile_scope!("GPU::Alloc");
+gpu::GpuStateVector::new(&self.device, num_qubits)?
+};
+
+// Compute inverse L2 norm on GPU
+let inv_norm = {
+crate::profile_scope!("GPU::NormFromPtr");
+// SAFETY: input_d validity is guaranteed by the caller's safety contract
+unsafe {
+gpu::AmplitudeEncoder::calculate_inv_norm_gpu(&self.device, input_d, input_len)?
+}
+};
+
+// Get output pointer

Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-21 Thread via GitHub


ryankert01 commented on PR #881:
URL: https://github.com/apache/mahout/pull/881#issuecomment-3778447960

   fixed, PTAL @viiccwen @shiavm006 @CheyuWu @400Ping 





Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-21 Thread via GitHub


ryankert01 commented on code in PR #881:
URL: https://github.com/apache/mahout/pull/881#discussion_r2712826273


##
qdp/qdp-python/src/lib.rs:
##
@@ -171,6 +171,145 @@ fn validate_tensor(tensor: &Bound<'_, PyAny>) -> PyResult<()> {
 Ok(())
 }
 
+/// Check if a PyTorch tensor is on a CUDA device
+fn is_cuda_tensor(tensor: &Bound<'_, PyAny>) -> PyResult<bool> {
+let device = tensor.getattr("device")?;
+let device_type: String = device.getattr("type")?.extract()?;
+Ok(device_type == "cuda")
+}
+
+/// Get the CUDA device index from a PyTorch tensor
+fn get_tensor_device_id(tensor: &Bound<'_, PyAny>) -> PyResult<i32> {
+let device = tensor.getattr("device")?;
+let device_index: i32 = device.getattr("index")?.extract()?;
+Ok(device_index)
+}
+
+/// Validate a CUDA tensor for direct GPU encoding
+/// Checks: dtype=float64, contiguous, non-empty, device_id matches engine
+fn validate_cuda_tensor_for_encoding(
+tensor: &Bound<'_, PyAny>,
+expected_device_id: usize,
+encoding_method: &str,
+) -> PyResult<()> {
+// Check encoding method support (currently only amplitude is supported for CUDA tensors)
+if encoding_method != "amplitude" {
+return Err(PyRuntimeError::new_err(format!(
+"CUDA tensor encoding currently only supports 'amplitude' method, got '{}'. \
+ Use tensor.cpu() to convert to CPU tensor for other encoding methods.",
+encoding_method
+)));
+}
+
+// Check dtype is float64
+let dtype = tensor.getattr("dtype")?;
+let dtype_str: String = dtype.str()?.extract()?;
+if !dtype_str.contains("float64") {
+return Err(PyRuntimeError::new_err(format!(
+"CUDA tensor must have dtype float64, got {}. Use tensor.to(torch.float64)",
+dtype_str
+)));
+}
+
+// Check contiguous
+let is_contiguous: bool = tensor.call_method0("is_contiguous")?.extract()?;
+if !is_contiguous {
+return Err(PyRuntimeError::new_err(
+"CUDA tensor must be contiguous. Use tensor.contiguous()",
+));
+}
+
+// Check non-empty
+let numel: usize = tensor.call_method0("numel")?.extract()?;
+if numel == 0 {
+return Err(PyRuntimeError::new_err("CUDA tensor cannot be empty"));
+}
+
+// Check device matches engine
+let tensor_device_id = get_tensor_device_id(tensor)?;
+if tensor_device_id as usize != expected_device_id {
+return Err(PyRuntimeError::new_err(format!(
+"Device mismatch: tensor is on cuda:{}, but engine is on cuda:{}. \
+ Move tensor with tensor.to('cuda:{}')",
+tensor_device_id, expected_device_id, expected_device_id
+)));
+}
+
+Ok(())
+}
+
+/// DLPack tensor information extracted from a PyCapsule
+struct DLPackTensorInfo {
+data_ptr: *const f64,
+shape: Vec<i64>,
+/// CUDA device ID from DLPack metadata.
+/// Currently unused but kept for potential future device validation or multi-GPU support.
+#[allow(dead_code)]
+device_id: i32,
+}
+
+/// Extract GPU pointer from PyTorch tensor's __dlpack__() capsule
+///
+/// # Safety
+/// The returned `data_ptr` points to GPU memory owned by the source tensor.
+/// The caller must ensure the source tensor remains alive and unmodified
+/// for the entire duration that `data_ptr` is in use. Python's GIL ensures
+/// the tensor won't be garbage collected during `encode()`, but the caller
+/// must not deallocate or resize the tensor while encoding is in progress.
fn extract_dlpack_tensor(_py: Python<'_>, tensor: &Bound<'_, PyAny>) -> PyResult<DLPackTensorInfo> {

Review Comment:
   Yes, but I'm not sure I've solved it all; I only solved the related ones.
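
   For readers following the thread, a rough sketch of the `__dlpack__()` capsule plumbing under discussion, assuming pyo3 and a C-compatible `DLManagedTensor` definition already in the crate; a production version must also check for a null return:

```rust
use pyo3::prelude::*;
use std::os::raw::c_char;

// DLPack wraps a *mut DLManagedTensor in a PyCapsule named "dltensor";
// pulling the pointer back out looks roughly like this.
fn dlpack_ptr(tensor: &Bound<'_, PyAny>) -> PyResult<*mut DLManagedTensor> {
    let capsule = tensor.call_method0("__dlpack__")?;
    let raw = unsafe {
        pyo3::ffi::PyCapsule_GetPointer(
            capsule.as_ptr(),
            b"dltensor\0".as_ptr() as *const c_char,
        )
    };
    Ok(raw as *mut DLManagedTensor)
}
```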






Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-21 Thread via GitHub


ryankert01 commented on code in PR #881:
URL: https://github.com/apache/mahout/pull/881#discussion_r2712826273


##
qdp/qdp-python/src/lib.rs:
##
@@ -171,6 +171,145 @@ fn validate_tensor(tensor: &Bound<'_, PyAny>) -> PyResult<()> {
 Ok(())
 }
 
+/// Check if a PyTorch tensor is on a CUDA device
+fn is_cuda_tensor(tensor: &Bound<'_, PyAny>) -> PyResult<bool> {
+let device = tensor.getattr("device")?;
+let device_type: String = device.getattr("type")?.extract()?;
+Ok(device_type == "cuda")
+}
+
+/// Get the CUDA device index from a PyTorch tensor
+fn get_tensor_device_id(tensor: &Bound<'_, PyAny>) -> PyResult<i32> {
+let device = tensor.getattr("device")?;
+let device_index: i32 = device.getattr("index")?.extract()?;
+Ok(device_index)
+}
+
+/// Validate a CUDA tensor for direct GPU encoding
+/// Checks: dtype=float64, contiguous, non-empty, device_id matches engine
+fn validate_cuda_tensor_for_encoding(
+tensor: &Bound<'_, PyAny>,
+expected_device_id: usize,
+encoding_method: &str,
+) -> PyResult<()> {
+// Check encoding method support (currently only amplitude is supported for CUDA tensors)
+if encoding_method != "amplitude" {
+return Err(PyRuntimeError::new_err(format!(
+"CUDA tensor encoding currently only supports 'amplitude' method, got '{}'. \
+ Use tensor.cpu() to convert to CPU tensor for other encoding methods.",
+encoding_method
+)));
+}
+
+// Check dtype is float64
+let dtype = tensor.getattr("dtype")?;
+let dtype_str: String = dtype.str()?.extract()?;
+if !dtype_str.contains("float64") {
+return Err(PyRuntimeError::new_err(format!(
+"CUDA tensor must have dtype float64, got {}. Use tensor.to(torch.float64)",
+dtype_str
+)));
+}
+
+// Check contiguous
+let is_contiguous: bool = tensor.call_method0("is_contiguous")?.extract()?;
+if !is_contiguous {
+return Err(PyRuntimeError::new_err(
+"CUDA tensor must be contiguous. Use tensor.contiguous()",
+));
+}
+
+// Check non-empty
+let numel: usize = tensor.call_method0("numel")?.extract()?;
+if numel == 0 {
+return Err(PyRuntimeError::new_err("CUDA tensor cannot be empty"));
+}
+
+// Check device matches engine
+let tensor_device_id = get_tensor_device_id(tensor)?;
+if tensor_device_id as usize != expected_device_id {
+return Err(PyRuntimeError::new_err(format!(
+"Device mismatch: tensor is on cuda:{}, but engine is on cuda:{}. \
+ Move tensor with tensor.to('cuda:{}')",
+tensor_device_id, expected_device_id, expected_device_id
+)));
+}
+
+Ok(())
+}
+
+/// DLPack tensor information extracted from a PyCapsule
+struct DLPackTensorInfo {
+data_ptr: *const f64,
+shape: Vec<i64>,
+/// CUDA device ID from DLPack metadata.
+/// Currently unused but kept for potential future device validation or multi-GPU support.
+#[allow(dead_code)]
+device_id: i32,
+}
+
+/// Extract GPU pointer from PyTorch tensor's __dlpack__() capsule
+///
+/// # Safety
+/// The returned `data_ptr` points to GPU memory owned by the source tensor.
+/// The caller must ensure the source tensor remains alive and unmodified
+/// for the entire duration that `data_ptr` is in use. Python's GIL ensures
+/// the tensor won't be garbage collected during `encode()`, but the caller
+/// must not deallocate or resize the tensor while encoding is in progress.
fn extract_dlpack_tensor(_py: Python<'_>, tensor: &Bound<'_, PyAny>) -> PyResult<DLPackTensorInfo> {

Review Comment:
   Yes, but I probably haven't solved it all, so you might still be able to do it.






Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-20 Thread via GitHub


viiccwen commented on code in PR #881:
URL: https://github.com/apache/mahout/pull/881#discussion_r2708974047


##
qdp/qdp-python/src/lib.rs:
##
@@ -171,6 +171,145 @@ fn validate_tensor(tensor: &Bound<'_, PyAny>) -> PyResult<()> {
 Ok(())
 }
 
+/// Check if a PyTorch tensor is on a CUDA device
+fn is_cuda_tensor(tensor: &Bound<'_, PyAny>) -> PyResult<bool> {
+let device = tensor.getattr("device")?;
+let device_type: String = device.getattr("type")?.extract()?;
+Ok(device_type == "cuda")
+}
+
+/// Get the CUDA device index from a PyTorch tensor
+fn get_tensor_device_id(tensor: &Bound<'_, PyAny>) -> PyResult<i32> {
+let device = tensor.getattr("device")?;
+let device_index: i32 = device.getattr("index")?.extract()?;
+Ok(device_index)
+}
+
+/// Validate a CUDA tensor for direct GPU encoding
+/// Checks: dtype=float64, contiguous, non-empty, device_id matches engine
+fn validate_cuda_tensor_for_encoding(
+tensor: &Bound<'_, PyAny>,
+expected_device_id: usize,
+encoding_method: &str,
+) -> PyResult<()> {
+// Check encoding method support (currently only amplitude is supported for CUDA tensors)
+if encoding_method != "amplitude" {
+return Err(PyRuntimeError::new_err(format!(
+"CUDA tensor encoding currently only supports 'amplitude' method, got '{}'. \
+ Use tensor.cpu() to convert to CPU tensor for other encoding methods.",
+encoding_method
+)));
+}
+
+// Check dtype is float64
+let dtype = tensor.getattr("dtype")?;
+let dtype_str: String = dtype.str()?.extract()?;
+if !dtype_str.contains("float64") {
+return Err(PyRuntimeError::new_err(format!(
+"CUDA tensor must have dtype float64, got {}. Use tensor.to(torch.float64)",
+dtype_str
+)));
+}
+
+// Check contiguous
+let is_contiguous: bool = tensor.call_method0("is_contiguous")?.extract()?;
+if !is_contiguous {
+return Err(PyRuntimeError::new_err(
+"CUDA tensor must be contiguous. Use tensor.contiguous()",
+));
+}
+
+// Check non-empty
+let numel: usize = tensor.call_method0("numel")?.extract()?;
+if numel == 0 {
+return Err(PyRuntimeError::new_err("CUDA tensor cannot be empty"));
+}
+
+// Check device matches engine
+let tensor_device_id = get_tensor_device_id(tensor)?;
+if tensor_device_id as usize != expected_device_id {
+return Err(PyRuntimeError::new_err(format!(
+"Device mismatch: tensor is on cuda:{}, but engine is on cuda:{}. \
+ Move tensor with tensor.to('cuda:{}')",
+tensor_device_id, expected_device_id, expected_device_id
+)));
+}
+
+Ok(())
+}
+
+/// DLPack tensor information extracted from a PyCapsule
+struct DLPackTensorInfo {
+data_ptr: *const f64,
+shape: Vec<i64>,
+/// CUDA device ID from DLPack metadata.
+/// Currently unused but kept for potential future device validation or multi-GPU support.
+#[allow(dead_code)]
+device_id: i32,
+}
+
+/// Extract GPU pointer from PyTorch tensor's __dlpack__() capsule
+///
+/// # Safety
+/// The returned `data_ptr` points to GPU memory owned by the source tensor.
+/// The caller must ensure the source tensor remains alive and unmodified
+/// for the entire duration that `data_ptr` is in use. Python's GIL ensures
+/// the tensor won't be garbage collected during `encode()`, but the caller
+/// must not deallocate or resize the tensor while encoding is in progress.
fn extract_dlpack_tensor(_py: Python<'_>, tensor: &Bound<'_, PyAny>) -> PyResult<DLPackTensorInfo> {

Review Comment:
   I raised issue #888. @ryankert01, would you like to fix it in the current PR?






Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-20 Thread via GitHub


shiavm006 commented on code in PR #881:
URL: https://github.com/apache/mahout/pull/881#discussion_r2708578026


##
qdp/qdp-core/src/gpu/encodings/amplitude.rs:
##
@@ -95,11 +95,15 @@ impl QuantumEncoder for AmplitudeEncoder {
 
 // GPU-accelerated norm for medium+ inputs, CPU fallback for tiny payloads
 let inv_norm = if host_data.len() >= GPU_NORM_THRESHOLD {
-Self::calculate_inv_norm_gpu(
-_device,
-*input_slice.device_ptr() as *const f64,
-host_data.len(),
-)?
+// SAFETY: input_slice was just allocated and copied from host_data,
+// so the pointer is valid and contains host_data.len() elements
+unsafe {

Review Comment:
   We’re calling AmplitudeEncoder::calculate_inv_norm_gpu twice here: once directly and once within the unsafe block. Since the function is now pub unsafe fn, the first call should no longer compile, and even if it did, we’d be doing redundant work. We should keep only the unsafe call with the safety justification comment.



##
qdp/qdp-python/src/lib.rs:
##
@@ -171,6 +171,145 @@ fn validate_tensor(tensor: &Bound<'_, PyAny>) -> PyResult<()> {
 Ok(())
 }
 
+/// Check if a PyTorch tensor is on a CUDA device
+fn is_cuda_tensor(tensor: &Bound<'_, PyAny>) -> PyResult<bool> {
+let device = tensor.getattr("device")?;
+let device_type: String = device.getattr("type")?.extract()?;
+Ok(device_type == "cuda")
+}
+
+/// Get the CUDA device index from a PyTorch tensor
+fn get_tensor_device_id(tensor: &Bound<'_, PyAny>) -> PyResult<i32> {
+let device = tensor.getattr("device")?;
+let device_index: i32 = device.getattr("index")?.extract()?;
+Ok(device_index)
+}
+
+/// Validate a CUDA tensor for direct GPU encoding
+/// Checks: dtype=float64, contiguous, non-empty, device_id matches engine
+fn validate_cuda_tensor_for_encoding(
+tensor: &Bound<'_, PyAny>,
+expected_device_id: usize,
+encoding_method: &str,
+) -> PyResult<()> {
+// Check encoding method support (currently only amplitude is supported for CUDA tensors)
+if encoding_method != "amplitude" {
+return Err(PyRuntimeError::new_err(format!(
+"CUDA tensor encoding currently only supports 'amplitude' method, got '{}'. \
+ Use tensor.cpu() to convert to CPU tensor for other encoding methods.",
+encoding_method
+)));
+}
+
+// Check dtype is float64
+let dtype = tensor.getattr("dtype")?;
+let dtype_str: String = dtype.str()?.extract()?;
+if !dtype_str.contains("float64") {
+return Err(PyRuntimeError::new_err(format!(
+"CUDA tensor must have dtype float64, got {}. Use tensor.to(torch.float64)",
+dtype_str
+)));
+}
+
+// Check contiguous
+let is_contiguous: bool = tensor.call_method0("is_contiguous")?.extract()?;
+if !is_contiguous {
+return Err(PyRuntimeError::new_err(
+"CUDA tensor must be contiguous. Use tensor.contiguous()",
+));
+}
+
+// Check non-empty
+let numel: usize = tensor.call_method0("numel")?.extract()?;
+if numel == 0 {
+return Err(PyRuntimeError::new_err("CUDA tensor cannot be empty"));
+}
+
+// Check device matches engine
+let tensor_device_id = get_tensor_device_id(tensor)?;
+if tensor_device_id as usize != expected_device_id {
+return Err(PyRuntimeError::new_err(format!(
+"Device mismatch: tensor is on cuda:{}, but engine is on cuda:{}. \
+ Move tensor with tensor.to('cuda:{}')",
+tensor_device_id, expected_device_id, expected_device_id
+)));
+}
+
+Ok(())
+}
+
+/// DLPack tensor information extracted from a PyCapsule
+struct DLPackTensorInfo {
+data_ptr: *const f64,
+shape: Vec<i64>,
+/// CUDA device ID from DLPack metadata.
+/// Currently unused but kept for potential future device validation or multi-GPU support.
+#[allow(dead_code)]
+device_id: i32,
+}
+
+/// Extract GPU pointer from PyTorch tensor's __dlpack__() capsule
+///
+/// # Safety
+/// The returned `data_ptr` points to GPU memory owned by the source tensor.
+/// The caller must ensure the source tensor remains alive and unmodified
+/// for the entire duration that `data_ptr` is in use. Python's GIL ensures
+/// the tensor won't be garbage collected during `encode()`, but the caller
+/// must not deallocate or resize the tensor while encoding is in progress.
fn extract_dlpack_tensor(_py: Python<'_>, tensor: &Bound<'_, PyAny>) -> PyResult<DLPackTensorInfo> {

Review Comment:
   use tensor.data_ptr() + tensor.shape directly and avoid DLPack for this direction
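
   A minimal sketch of that alternative, going through PyTorch's Python API via pyo3; `tensor_ptr_and_shape` is a hypothetical helper name, not code from the PR:

```rust
use pyo3::prelude::*;

// torch.Tensor.data_ptr() returns the device address as a Python int,
// and .shape (a torch.Size) extracts as a Vec<i64> - no capsule needed.
fn tensor_ptr_and_shape(tensor: &Bound<'_, PyAny>) -> PyResult<(*const f64, Vec<i64>)> {
    let addr: usize = tensor.call_method0("data_ptr")?.extract()?;
    let shape: Vec<i64> = tensor.getattr("shape")?.extract()?;
    Ok((addr as *const f64, shape))
}
```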



##
qdp/qdp-python/src/lib.rs:
##
@@ -171,6 +171,145 @@ fn validate_tensor(tensor: &Bound<'_, PyAny>) -> PyResult<()> {
 Ok(())
 }
 
+/// Check if a PyTorch tensor is on a CUDA device

Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-20 Thread via GitHub


viiccwen commented on code in PR #881:
URL: https://github.com/apache/mahout/pull/881#discussion_r2708937310


##
qdp/qdp-core/src/gpu/encodings/amplitude.rs:
##
@@ -411,8 +415,20 @@ impl AmplitudeEncoder {
 
 impl AmplitudeEncoder {
 /// Compute inverse L2 norm on GPU using the reduction kernel.
+///
+/// # Arguments
+/// * `device` - CUDA device reference
+/// * `input_ptr` - Device pointer to input data (f64 array on GPU)
+/// * `len` - Number of f64 elements
+///
+/// # Returns
+/// The inverse L2 norm (1/||x||_2) of the input data
+///
+/// # Safety
+/// The caller must ensure `input_ptr` points to valid GPU memory containing
+/// at least `len` f64 elements on the same device as `device`.
 #[cfg(target_os = "linux")]
-fn calculate_inv_norm_gpu(
+pub unsafe fn calculate_inv_norm_gpu(

Review Comment:
   If we promote GPU norm computation later as a public feature (e.g. via a QuantumEncoder extension trait), maybe we can do a follow-up to design a dedicated safe wrapper and keep this function as an internal building block. WDYT?



##
qdp/qdp-core/src/gpu/encodings/amplitude.rs:
##
@@ -411,8 +415,20 @@ impl AmplitudeEncoder {
 
 impl AmplitudeEncoder {
 /// Compute inverse L2 norm on GPU using the reduction kernel.
+///
+/// # Arguments
+/// * `device` - CUDA device reference
+/// * `input_ptr` - Device pointer to input data (f64 array on GPU)
+/// * `len` - Number of f64 elements
+///
+/// # Returns
+/// The inverse L2 norm (1/||x||_2) of the input data
+///
+/// # Safety
+/// The caller must ensure `input_ptr` points to valid GPU memory containing
+/// at least `len` f64 elements on the same device as `device`.
 #[cfg(target_os = "linux")]
-fn calculate_inv_norm_gpu(
+pub unsafe fn calculate_inv_norm_gpu(

Review Comment:
   Given this is an unsafe primitive that’s only used inside the core GPU pipeline, I’d lean towards making it `pub(crate)` for now and exposing only safe encoding APIs at the crate boundary.
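
   A tiny sketch of what such a safe wrapper could look like, assuming a cudarc-style `CudaSlice<f64>` owned buffer and the crate's `Result` alias (the concrete device and buffer types in qdp-core may differ):

```rust
impl AmplitudeEncoder {
    /// Safe wrapper: borrowing a live device slice guarantees the
    /// pointer and length that the unsafe primitive requires.
    pub(crate) fn inv_norm_of_slice(
        device: &CudaDevice,
        input: &CudaSlice<f64>,
    ) -> Result<f64> {
        // SAFETY: `input` is a valid device allocation of exactly
        // input.len() f64 elements for the duration of this call.
        unsafe {
            Self::calculate_inv_norm_gpu(
                device,
                *input.device_ptr() as *const f64,
                input.len(),
            )
        }
    }
}
```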






Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-20 Thread via GitHub


viiccwen commented on PR #881:
URL: https://github.com/apache/mahout/pull/881#issuecomment-3773452490

   I'll dive into refactoring `encode` (`qdp/qdp-python/src/lib.rs`) after this PR is merged.





Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-20 Thread via GitHub


ryankert01 commented on code in PR #881:
URL: https://github.com/apache/mahout/pull/881#discussion_r2708210622


##
qdp/qdp-python/src/lib.rs:
##
@@ -152,7 +152,7 @@ fn is_pytorch_tensor(obj: &Bound<'_, PyAny>) -> PyResult<bool> {
 Ok(module_name == "torch")
 }
 
-/// Helper to validate tensor
+/// Helper to validate CPU tensor
 fn validate_tensor(tensor: &Bound<'_, PyAny>) -> PyResult<()> {

Review Comment:
   Good catch, got it!
   Part of `validate_tensor` is redundant since it rechecks `is_pytorch_tensor(tensor)`, but it can also catch other hardware tensors like MPS, XLA, HPU, etc., if any.
   We can refactor it and remove `is_pytorch_tensor(tensor)` as a follow-up, and the function should probably be renamed to `validate_tensor_cpu` or something.









Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-20 Thread via GitHub


ryankert01 commented on code in PR #881:
URL: https://github.com/apache/mahout/pull/881#discussion_r2708210622


##
qdp/qdp-python/src/lib.rs:
##
@@ -152,7 +152,7 @@ fn is_pytorch_tensor(obj: &Bound<'_, PyAny>) -> PyResult<bool> {
 Ok(module_name == "torch")
 }
 
-/// Helper to validate tensor
+/// Helper to validate CPU tensor
 fn validate_tensor(tensor: &Bound<'_, PyAny>) -> PyResult<()> {

Review Comment:
   Okay, got it. Part of `validate_tensor` is redundant since it rechecks `is_pytorch_tensor(tensor)`, but it can also catch other hardware tensors like MPS, XLA, HPU, etc., if any.
   We can refactor it and remove `is_pytorch_tensor(tensor)` as a follow-up, and the function should probably be renamed to `validate_tensor_cpu` or something.






Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-20 Thread via GitHub


ryankert01 commented on code in PR #881:
URL: https://github.com/apache/mahout/pull/881#discussion_r2708210622


##
qdp/qdp-python/src/lib.rs:
##
@@ -152,7 +152,7 @@ fn is_pytorch_tensor(obj: &Bound<'_, PyAny>) -> PyResult<bool> {
 Ok(module_name == "torch")
 }
 
-/// Helper to validate tensor
+/// Helper to validate CPU tensor
 fn validate_tensor(tensor: &Bound<'_, PyAny>) -> PyResult<()> {

Review Comment:
   Okay, got it. Part of `validate_tensor` is redundant since it rechecks `is_pytorch_tensor(tensor)`, but it can also catch MPS, XLA, HPU, etc., if any.
   We can refactor it and remove `is_pytorch_tensor(tensor)` as a follow-up, and the function should probably be renamed to `validate_tensor_cpu` or something.






Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-20 Thread via GitHub


ryankert01 commented on code in PR #881:
URL: https://github.com/apache/mahout/pull/881#discussion_r2708210622


##
qdp/qdp-python/src/lib.rs:
##
@@ -152,7 +152,7 @@ fn is_pytorch_tensor(obj: &Bound<'_, PyAny>) -> PyResult<bool> {
 Ok(module_name == "torch")
 }
 
-/// Helper to validate tensor
+/// Helper to validate CPU tensor
 fn validate_tensor(tensor: &Bound<'_, PyAny>) -> PyResult<()> {

Review Comment:
   Okay, got it. Part of `validate_tensor` is redundant since it rechecks `is_pytorch_tensor(tensor)`, but it can also catch MPS, XLA, HPU, etc., if any.
   We can refactor it and remove `is_pytorch_tensor(tensor)` as a follow-up.






Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-20 Thread via GitHub


ryankert01 commented on code in PR #881:
URL: https://github.com/apache/mahout/pull/881#discussion_r2708210622


##
qdp/qdp-python/src/lib.rs:
##
@@ -152,7 +152,7 @@ fn is_pytorch_tensor(obj: &Bound<'_, PyAny>) -> PyResult<bool> {
 Ok(module_name == "torch")
 }
 
-/// Helper to validate tensor
+/// Helper to validate CPU tensor
 fn validate_tensor(tensor: &Bound<'_, PyAny>) -> PyResult<()> {

Review Comment:
   Okay, got it. Part of `validate_tensor` is redundant since it rechecks `is_pytorch_tensor(tensor)`, but it can also catch MPS, XLA, HPU, etc., if any.
   We can refactor it and remove `is_pytorch_tensor(tensor)` as a follow-up.






Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-20 Thread via GitHub


CheyuWu commented on code in PR #881:
URL: https://github.com/apache/mahout/pull/881#discussion_r2708146590


##
qdp/qdp-python/src/lib.rs:
##
@@ -152,7 +152,7 @@ fn is_pytorch_tensor(obj: &Bound<'_, PyAny>) -> PyResult<bool> {
 Ok(module_name == "torch")
 }
 
-/// Helper to validate tensor
+/// Helper to validate CPU tensor
 fn validate_tensor(tensor: &Bound<'_, PyAny>) -> PyResult<()> {

Review Comment:
   Sorry, I didn’t explain this very clearly on my side.
   
   Currently, `validate_tensor` is only called at this [location](https://github.com/apache/mahout/pull/881/files#diff-2127e48c371dd5082474a4e975ac908065dc7b4a5736dfaa9f0efbfc95c8981aR535).
   The only thing `validate_tensor` does is call `is_pytorch_tensor`.
   Therefore, I don’t think it’s necessary to keep this function, since `is_pytorch_tensor` has already been called earlier, [here](https://github.com/apache/mahout/pull/881/files#diff-2127e48c371dd5082474a4e975ac908065dc7b4a5736dfaa9f0efbfc95c8981aR462).






Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-20 Thread via GitHub


CheyuWu commented on code in PR #881:
URL: https://github.com/apache/mahout/pull/881#discussion_r2708146590


##
qdp/qdp-python/src/lib.rs:
##
@@ -152,7 +152,7 @@ fn is_pytorch_tensor(obj: &Bound<'_, PyAny>) -> PyResult<bool> {
 Ok(module_name == "torch")
 }
 
-/// Helper to validate tensor
+/// Helper to validate CPU tensor
 fn validate_tensor(tensor: &Bound<'_, PyAny>) -> PyResult<()> {

Review Comment:
   Currently, `validate_tensor` is only called at this [location](https://github.com/apache/mahout/pull/881/files#diff-2127e48c371dd5082474a4e975ac908065dc7b4a5736dfaa9f0efbfc95c8981aR535).
   The only thing `validate_tensor` does is call `is_pytorch_tensor`.
   Therefore, I don’t think it’s necessary to keep this function, since `is_pytorch_tensor` has already been called earlier, [here](https://github.com/apache/mahout/pull/881/files#diff-2127e48c371dd5082474a4e975ac908065dc7b4a5736dfaa9f0efbfc95c8981aR462).






Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-20 Thread via GitHub


ryankert01 commented on code in PR #881:
URL: https://github.com/apache/mahout/pull/881#discussion_r2708079334


##
qdp/qdp-python/src/lib.rs:
##
@@ -152,7 +152,7 @@ fn is_pytorch_tensor(obj: &Bound<'_, PyAny>) -> PyResult<bool> {
 Ok(module_name == "torch")
 }
 
-/// Helper to validate tensor
+/// Helper to validate CPU tensor
 fn validate_tensor(tensor: &Bound<'_, PyAny>) -> PyResult<()> {

Review Comment:
   Do you mean we can identify whether the tensor is on GPU or CPU, and if it's on CPU, call `tensor.to("cuda")` and take the GPU tensor path? That could be another follow-up to refactor and simplify this.
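
   To make the question concrete, a hypothetical sketch of that dispatch (not part of this PR); `to_engine_device` is an invented helper name:

```rust
use pyo3::prelude::*;

// Hypothetical: move a CPU tensor onto the engine's CUDA device so the
// zero-copy GPU path can handle it; torch's .to("cuda:N") does the copy.
fn to_engine_device<'py>(
    tensor: &Bound<'py, PyAny>,
    device_id: usize,
) -> PyResult<Bound<'py, PyAny>> {
    tensor.call_method1("to", (format!("cuda:{}", device_id),))
}
```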






Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-20 Thread via GitHub


ryankert01 commented on code in PR #881:
URL: https://github.com/apache/mahout/pull/881#discussion_r2708079334


##
qdp/qdp-python/src/lib.rs:
##
@@ -152,7 +152,7 @@ fn is_pytorch_tensor(obj: &Bound<'_, PyAny>) -> PyResult<bool> {
 Ok(module_name == "torch")
 }
 
-/// Helper to validate tensor
+/// Helper to validate CPU tensor
 fn validate_tensor(tensor: &Bound<'_, PyAny>) -> PyResult<()> {

Review Comment:
   Do you mean we can identify whether the tensor is on GPU or CPU, and if it's on CPU, call `tensor.to("cuda")` and take the GPU tensor path?






Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-20 Thread via GitHub


ryankert01 commented on code in PR #881:
URL: https://github.com/apache/mahout/pull/881#discussion_r2708079334


##
qdp/qdp-python/src/lib.rs:
##
@@ -152,7 +152,7 @@ fn is_pytorch_tensor(obj: &Bound<'_, PyAny>) -> PyResult<bool> {
 Ok(module_name == "torch")
 }
 
-/// Helper to validate tensor
+/// Helper to validate CPU tensor
 fn validate_tensor(tensor: &Bound<'_, PyAny>) -> PyResult<()> {

Review Comment:
   Do you mean we can identify whether the tensor is on GPU or CPU, and if it's on CPU, call `tensor.to("cuda")`?






Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-20 Thread via GitHub


CheyuWu commented on code in PR #881:
URL: https://github.com/apache/mahout/pull/881#discussion_r2708070502


##
qdp/qdp-python/src/lib.rs:
##
@@ -321,6 +460,78 @@ impl QdpEngine {
 
 // Check if it's a PyTorch tensor
 if is_pytorch_tensor(data)? {
+// Check if it's a CUDA tensor - use zero-copy GPU encoding
+if is_cuda_tensor(data)? {
+// Validate CUDA tensor for direct GPU encoding
+validate_cuda_tensor_for_encoding(
+data,
+self.engine.device().ordinal(),
+encoding_method,
+)?;
+
+// Extract GPU pointer via DLPack
+let dlpack_info = extract_dlpack_tensor(data.py(), data)?;
+
+let ndim: usize = data.call_method0("dim")?.extract()?;
+
+match ndim {
+1 => {
+// 1D CUDA tensor: single sample encoding
+let input_len = dlpack_info.shape[0] as usize;
+// SAFETY: dlpack_info.data_ptr was validated via DLPack protocol from a
+// valid PyTorch CUDA tensor. The tensor remains alive during this call
+// (held by Python's GIL), and we validated dtype/contiguity/device above.
+let ptr = unsafe {
+self.engine
+.encode_from_gpu_ptr(
+dlpack_info.data_ptr,
+input_len,
+num_qubits,
+encoding_method,
+)
+.map_err(|e| {
+PyRuntimeError::new_err(format!("Encoding failed: {}", e))
+})?
+};
+return Ok(QuantumTensor {
+ptr,
+consumed: false,
+});
+}
+2 => {
+// 2D CUDA tensor: batch encoding
+let num_samples = dlpack_info.shape[0] as usize;
+let sample_size = dlpack_info.shape[1] as usize;
+// SAFETY: Same as above - pointer from validated DLPack tensor
+let ptr = unsafe {
+self.engine
+.encode_batch_from_gpu_ptr(
+dlpack_info.data_ptr,
+num_samples,
+sample_size,
+num_qubits,
+encoding_method,
+)
+.map_err(|e| {
+PyRuntimeError::new_err(format!("Encoding failed: {}", e))
+})?
+};
+return Ok(QuantumTensor {
+ptr,
+consumed: false,
+});
+}
+_ => {
+return Err(PyRuntimeError::new_err(format!(
+"Unsupported CUDA tensor shape: {}D. Expected 1D tensor for single \
+ sample encoding or 2D tensor (batch_size, features) for batch encoding.",
+ndim
+)));
+}
+}
+}
+
+// CPU tensor path (existing code)
 validate_tensor(data)?;

Review Comment:
   I think we don't need this anymore






Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-20 Thread via GitHub


CheyuWu commented on code in PR #881:
URL: https://github.com/apache/mahout/pull/881#discussion_r2708007010


##
qdp/qdp-python/src/lib.rs:
##
@@ -152,7 +152,7 @@ fn is_pytorch_tensor(obj: &Bound<'_, PyAny>) -> PyResult<bool> {
 Ok(module_name == "torch")
 }
 
-/// Helper to validate tensor
+/// Helper to validate CPU tensor
 fn validate_tensor(tensor: &Bound<'_, PyAny>) -> PyResult<()> {

Review Comment:
   I think we can remove this function, since we already support GPUs.






Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-20 Thread via GitHub


CheyuWu commented on PR #881:
URL: https://github.com/apache/mahout/pull/881#issuecomment-3772458849

   > Sorry @CheyuWu for not noticing it. Would you perhaps want to take on some of the follow-ups? There are three of them.
   
   That's fine, I can review the code.





Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-20 Thread via GitHub


400Ping commented on PR #881:
URL: https://github.com/apache/mahout/pull/881#issuecomment-3772177255

   I will finish this up along with what's left of https://github.com/apache/mahout/issues/726





Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-20 Thread via GitHub


ryankert01 commented on PR #881:
URL: https://github.com/apache/mahout/pull/881#issuecomment-3772170855

   Sorry @CheyuWu for not noticing it. Would you perhaps want to take on some of the follow-ups? There are three of them.





Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-20 Thread via GitHub


400Ping commented on PR #881:
URL: https://github.com/apache/mahout/pull/881#issuecomment-3772164099

   We should probably add a "have you checked known issues" checkbox when you create an issue.





Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-20 Thread via GitHub


400Ping commented on PR #881:
URL: https://github.com/apache/mahout/pull/881#issuecomment-3772157661

   I left this for @CheyuWu to do, but sure. I will do the follow-up to finish up once this is merged.





Re: [PR] MAHOUT-878 Add CUDA Torch Tensor Support for QDP Python Binding [mahout]

2026-01-20 Thread via GitHub


ryankert01 commented on PR #881:
URL: https://github.com/apache/mahout/pull/881#issuecomment-3772075605

   ```
   CUDA Tensor Encoding Speed Comparison

   Batch=  100, Features=  64 | CPU:  0.30ms | CUDA:  0.12ms | Speedup:  2.45x
   Batch=  100, Features= 256 | CPU:  0.21ms | CUDA:  0.46ms | Speedup:  0.45x
   Batch=  100, Features=1024 | CPU:  0.67ms | CUDA:  0.17ms | Speedup:  3.94x
   Batch= 1000, Features=  64 | CPU:  3.82ms | CUDA:  0.48ms | Speedup:  7.93x
   Batch= 1000, Features= 256 | CPU:  2.90ms | CUDA:  0.26ms | Speedup: 11.18x
   Batch= 1000, Features=1024 | CPU:  6.11ms | CUDA:  0.57ms | Speedup: 10.78x
   Batch=    1, Features=  64 | CPU:  4.26ms | CUDA:  2.03ms | Speedup:  2.10x
   Batch=    1, Features= 256 | CPU: 10.09ms | CUDA:  4.78ms | Speedup:  2.11x
   Batch=    1, Features=1024 | CPU: 27.10ms | CUDA: 10.42ms | Speedup:  2.60x
   ```

