ryankert01 opened a new pull request, #881:
URL: https://github.com/apache/mahout/pull/881

   ### Purpose of PR
   Add zero-copy encoding support for PyTorch CUDA tensors in 
`QdpEngine.encode()`. Previously, CUDA tensors were rejected with "Only CPU 
tensors are currently supported", forcing users to call `.cpu()` first and 
incurring a GPU→CPU→GPU copy overhead.
   
   ### Motivation
   
   When working with PyTorch CUDA tensors, the previous workflow required:
   ```python
   # Before: GPU → CPU → GPU (slow)
   result = engine.encode(cuda_tensor.cpu(), num_qubits, "amplitude")
   ```
   
   With this change:
   ```python
   # After: GPU → GPU (zero-copy)
   result = engine.encode(cuda_tensor, num_qubits, "amplitude")
   ```
   
   This removes the round trip through host memory, which should improve 
performance for GPU-based ML pipelines.
   
   ### Changes
   
   | File | Changes |
   |------|---------|
   | `qdp-core/src/lib.rs` | Added `unsafe` methods `encode_from_gpu_ptr()` and `encode_batch_from_gpu_ptr()` |
   | `qdp-python/src/lib.rs` | Added CUDA tensor detection, validation, DLPack extraction; modified `encode()` to route CUDA tensors |
   | `qdp-core/.../amplitude.rs` | Made `calculate_inv_norm_gpu()` public |
   | `testing/.../test_bindings.py` | Added 14 new CUDA tensor tests |
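
The `calculate_inv_norm_gpu()` entry appears to relate to the "zero or non-finite norm" error listed under Error Messages: amplitude encoding scales the input by the inverse of its L2 norm, which is undefined for all-zero or non-finite data. A minimal CPU-side sketch of that check in plain Python follows. The helper name `inv_l2_norm` is hypothetical, and this is an illustration of the idea, not the GPU kernel from this PR:

```python
import math
from typing import Sequence

NORM_ERROR = (
    "Input data has zero or non-finite norm "
    "(contains NaN, Inf, or all zeros)"
)


def inv_l2_norm(values: Sequence[float]) -> float:
    """Return 1 / ||values||_2, rejecting inputs that cannot be normalized.

    Hypothetical CPU stand-in for illustration only.
    """
    # NaN or Inf anywhere in the input propagates into the norm,
    # so a single finiteness check on the result covers both cases.
    norm = math.sqrt(sum(v * v for v in values))
    if norm == 0.0 or not math.isfinite(norm):
        raise ValueError(NORM_ERROR)
    return 1.0 / norm


if __name__ == "__main__":
    data = [3.0, 4.0]
    scale = inv_l2_norm(data)
    # After scaling, the squared amplitudes sum to 1.
    print([v * scale for v in data])
```

Checking the norm once, rather than scanning every element for NaN or Inf, keeps the validation to a single reduction over the data.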
   
   
   ### Error Messages
   
   | Condition | Message |
   |-----------|---------|
   | Non-float64 dtype | `CUDA tensor must have dtype float64, got {dtype}. Use tensor.to(torch.float64)` |
   | Non-contiguous | `CUDA tensor must be contiguous. Use tensor.contiguous()` |
   | Empty tensor | `CUDA tensor cannot be empty` |
   | Device mismatch | `Device mismatch: tensor is on cuda:{X}, but engine is on cuda:{Y}. Move tensor with tensor.to('cuda:{Y}')` |
   | Zero/NaN/Inf values | `Input data has zero or non-finite norm (contains NaN, Inf, or all zeros)` |
   | Unsupported encoding | `CUDA tensor encoding currently only supports 'amplitude' method, got '{method}'...` |
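
For callers who want to check a tensor before handing it to `encode()`, the metadata conditions in the table can be approximated on the Python side. The sketch below is an assumption-laden illustration, not the validation code in `qdp-python`: the function name `cuda_tensor_problem`, the order of the checks, and the `fake_tensor` stand-in (used so the example runs without PyTorch or a GPU) are all hypothetical. It relies only on attributes that PyTorch tensors expose (`dtype`, `is_contiguous()`, `numel()`, `device.index`).

```python
from types import SimpleNamespace
from typing import Optional


def cuda_tensor_problem(tensor, engine_device_id: int,
                        method: str = "amplitude") -> Optional[str]:
    """Return a message for the first failed check, or None if all pass.

    Hypothetical helper; the check order is an assumption.
    """
    if method != "amplitude":
        # The source message is truncated after this point.
        return ("CUDA tensor encoding currently only supports "
                f"'amplitude' method, got '{method}'")
    # str(torch.float64) is "torch.float64", so no torch import is needed.
    if str(tensor.dtype) != "torch.float64":
        return (f"CUDA tensor must have dtype float64, got {tensor.dtype}. "
                "Use tensor.to(torch.float64)")
    if not tensor.is_contiguous():
        return "CUDA tensor must be contiguous. Use tensor.contiguous()"
    if tensor.numel() == 0:
        return "CUDA tensor cannot be empty"
    if tensor.device.index != engine_device_id:
        return (f"Device mismatch: tensor is on cuda:{tensor.device.index}, "
                f"but engine is on cuda:{engine_device_id}. "
                f"Move tensor with tensor.to('cuda:{engine_device_id}')")
    return None


def fake_tensor(dtype="torch.float64", contiguous=True, numel=4, index=0):
    """Stand-in object exposing the few tensor attributes used above."""
    return SimpleNamespace(
        dtype=dtype,
        is_contiguous=lambda: contiguous,
        numel=lambda: numel,
        device=SimpleNamespace(type="cuda", index=index),
    )


if __name__ == "__main__":
    print(cuda_tensor_problem(fake_tensor(), engine_device_id=0))
    print(cuda_tensor_problem(fake_tensor(dtype="torch.float32"), 0))
    print(cuda_tensor_problem(fake_tensor(index=1), 0))
```

With a real tensor, the same function could be called as `cuda_tensor_problem(cuda_tensor, engine_device_id)` before `engine.encode(...)`. The zero/NaN/Inf condition is not covered here because it depends on the tensor's values rather than its metadata.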
   
   ### Follow-ups
   1. Support float32
   2. Support encodings other than amplitude (e.g. `"angle"`, `"basis"`)
   3. (Maybe) refactor to extend the `QuantumEncoder` trait
   
   ### Testing
   
   ```bash
   pytest testing/qdp/test_bindings.py -v
   ```
   
   
   ### Related Issues or PRs
   Closes #878
   
   ### Changes Made
   - [ ] Bug fix
   - [ ] New feature
   - [ ] Refactoring
   - [ ] Documentation
   - [ ] Test
   - [ ] CI/CD pipeline
   - [ ] Other
   
   ### Breaking Changes
   - [ ] Yes
   - [ ] No
   
   ### Checklist
   
   - [ ] Added or updated unit tests for all changes
   - [ ] Added or updated documentation for all changes
   - [ ] Successfully built and ran all unit tests or manual tests locally
   - [ ] PR title follows "MAHOUT-XXX: Brief Description" format (if related to 
an issue)
   - [ ] Code follows ASF guidelines
   

