[
https://issues.apache.org/jira/browse/MAHOUT-878?focusedWorklogId=1000941&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-1000941
]
ASF GitHub Bot logged work on MAHOUT-878:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 20/Jan/26 10:03
Start Date: 20/Jan/26 10:03
Worklog Time Spent: 10m
Work Description: ryankert01 opened a new pull request, #881:
URL: https://github.com/apache/mahout/pull/881
### Purpose of PR
Add zero-copy encoding support for PyTorch CUDA tensors in
`QdpEngine.encode()`. Previously, CUDA tensors were rejected with "Only CPU
tensors are currently supported", forcing users to call `.cpu()` first and
incurring a GPU→CPU→GPU copy overhead.
### Motivation
When working with PyTorch CUDA tensors, the previous workflow required:
```python
# Before: GPU → CPU → GPU (slow)
result = engine.encode(cuda_tensor.cpu(), num_qubits, "amplitude")
```
With this change:
```python
# After: GPU → GPU (zero-copy)
result = engine.encode(cuda_tensor, num_qubits, "amplitude")
```
This eliminates unnecessary memory transfers and improves performance for
GPU-based ML pipelines.
### Changes
| File | Changes |
|------|---------|
| `qdp-core/src/lib.rs` | Added `unsafe` methods `encode_from_gpu_ptr()` and `encode_batch_from_gpu_ptr()` |
| `qdp-python/src/lib.rs` | Added CUDA tensor detection, validation, DLPack extraction; modified `encode()` to route CUDA tensors |
| `qdp-core/.../amplitude.rs` | Made `calculate_inv_norm_gpu()` public |
| `testing/.../test_bindings.py` | Added 14 new CUDA tensor tests |
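The body of `calculate_inv_norm_gpu()` is not shown in this description. As a rough CPU-side illustration only, assuming the helper computes the inverse L2 norm used to normalize amplitude-encoded input, the computation and its failure case could be sketched as follows. The function name `inv_l2_norm` and the use of `ValueError` are hypothetical; the error text is taken from the Error Messages table.

```python
import math


def inv_l2_norm(values):
    """Return 1 / ||values||_2, rejecting zero or non-finite norms.

    Hypothetical CPU stand-in; the real helper runs on the GPU in Rust/CUDA.
    """
    # Sum of squares; NaN or Inf inputs propagate into the result.
    norm = math.sqrt(sum(v * v for v in values))
    if norm == 0.0 or not math.isfinite(norm):
        raise ValueError(
            "Input data has zero or non-finite norm "
            "(contains NaN, Inf, or all zeros)"
        )
    return 1.0 / norm
```

Scaling each input value by the returned factor yields a unit-norm vector, which is what amplitude encoding requires of a state vector.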
### Error Messages
| Condition | Message |
|-----------|---------|
| Non-float64 dtype | `CUDA tensor must have dtype float64, got {dtype}. Use tensor.to(torch.float64)` |
| Non-contiguous | `CUDA tensor must be contiguous. Use tensor.contiguous()` |
| Empty tensor | `CUDA tensor cannot be empty` |
| Device mismatch | `Device mismatch: tensor is on cuda:{X}, but engine is on cuda:{Y}. Move tensor with tensor.to('cuda:{Y}')` |
| Zero/NaN/Inf values | `Input data has zero or non-finite norm (contains NaN, Inf, or all zeros)` |
| Unsupported encoding | `CUDA tensor encoding currently only supports 'amplitude' method, got '{method}'...` |
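The metadata checks in the table above can be mirrored in plain Python for illustration. This is a sketch under stated assumptions, not the PR's code: the real validation lives in the Rust bindings, and the function name `validate_cuda_tensor`, the check order, the `ValueError` type, and the `make_stub` helper are all hypothetical. The norm check is left out because it requires reading the tensor data.

```python
from types import SimpleNamespace


def validate_cuda_tensor(tensor, engine_device_id, method):
    """Raise ValueError with the documented message if the tensor is rejected.

    Assumption: check order and exception type are illustrative only.
    """
    if method != "amplitude":
        # The trailing "..." is kept as it appears in the table.
        raise ValueError(
            "CUDA tensor encoding currently only supports "
            f"'amplitude' method, got '{method}'..."
        )
    # str(torch.float64) == "torch.float64", so no torch import is needed.
    if str(tensor.dtype) != "torch.float64":
        raise ValueError(
            f"CUDA tensor must have dtype float64, got {tensor.dtype}. "
            "Use tensor.to(torch.float64)"
        )
    if not tensor.is_contiguous():
        raise ValueError(
            "CUDA tensor must be contiguous. Use tensor.contiguous()"
        )
    if tensor.numel() == 0:
        raise ValueError("CUDA tensor cannot be empty")
    tensor_device_id = tensor.device.index
    if tensor_device_id != engine_device_id:
        raise ValueError(
            f"Device mismatch: tensor is on cuda:{tensor_device_id}, "
            f"but engine is on cuda:{engine_device_id}. "
            f"Move tensor with tensor.to('cuda:{engine_device_id}')"
        )


def make_stub(dtype="torch.float64", contiguous=True, numel=4, device_index=0):
    """Hypothetical stand-in exposing the tensor attributes used above."""
    return SimpleNamespace(
        dtype=dtype,
        is_contiguous=lambda: contiguous,
        numel=lambda: numel,
        device=SimpleNamespace(index=device_index),
    )
```

A real `torch.Tensor` exposes the same `dtype`, `is_contiguous()`, `numel()`, and `device.index` members, so the stub can be swapped for an actual CUDA tensor when PyTorch is available.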
### Follow-ups
1. Support float32.
2. Support encodings beyond amplitude (e.g. `"angle"`, `"basis"`).
3. (Maybe) refactor to extend the `QuantumEncoder` trait.
### Testing
```bash
pytest testing/qdp/test_bindings.py -v
```
### Related Issues or PRs
Issue Time Tracking
-------------------
Worklog Id: (was: 1000941)
Remaining Estimate: 0h
Time Spent: 10m
> Provide better examples for the parallel ALS recommender code
> -------------------------------------------------------------
>
> Key: MAHOUT-878
> URL: https://issues.apache.org/jira/browse/MAHOUT-878
> Project: Mahout
> Issue Type: Task
> Affects Versions: 1.0.0
> Reporter: Sebastian Schelter
> Assignee: Sebastian Schelter
> Priority: Major
> Fix For: 0.6
>
> Attachments: MAHOUT-878.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> We should provide examples that show how to apply the parallel ALS
> recommender to the Netflix or KDD2011 datasets.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)