rich7420 opened a new pull request, #1310: URL: https://github.com/apache/mahout/pull/1310
### Related Issues <!-- Closes #123 --> ### Changes - [ ] Bug fix - [ ] New feature - [x] Refactoring - [ ] Documentation - [x] Test - [ ] CI/CD pipeline - [ ] Other ### Why The single-sample f32 zero-copy entry point currently exists as a standalone `pub unsafe fn` on each encoder type (`AmplitudeEncoder` / `AngleEncoder` / `BasisEncoder`), while the batch sibling (`encode_batch_from_gpu_ptr_f32`) lives on the `QuantumEncoder` trait with a default `NotImplemented` body. That asymmetry has a real cost: adding a new encoder with f32 zero-copy support today requires the author to remember three separate touch points (the inherent fn on the encoder, the dispatcher in `engine.rs`, and the engine-level entry method). The batch-side already pays for the trait override discipline; the single-sample side should too. This refactor makes the trait surface symmetric. No new functionality, no behaviour change, no kernel changes. ### How **Trait change** (`qdp-core/src/gpu/encodings/mod.rs`) Add `encode_from_gpu_ptr_f32` to `QuantumEncoder` with a default body returning `MahoutError::NotImplemented`. Signature mirrors the existing `encode_batch_from_gpu_ptr_f32` exactly. Cfg-gated to Linux / CUDA, same as the batch sibling. **Encoder overrides** - `AmplitudeEncoder`, `AngleEncoder`, `BasisEncoder` each gain a trait `impl` of `encode_from_gpu_ptr_f32` that delegates to the existing inherent `_with_stream` workhorse fn. Hot paths can keep calling the inherent fn directly without a vtable. - Encoders that don't have an f32 zero-copy path yet (Phase, IQP, IQP-Z) inherit the default `NotImplemented` body — the same pattern used by `encode_batch_from_gpu_ptr_f32` today. **Amplitude alignment** (`qdp-core/src/gpu/encodings/amplitude.rs`) `AmplitudeEncoder` previously had no standalone `encode_from_gpu_ptr_f32_with_stream` — the L2-norm + kernel-launch sequence was inlined in `QdpEngine::encode_from_gpu_ptr_f32_with_stream` (`lib.rs`). Add the missing inherent fn to mirror the angle / basis pattern, then have `QdpEngine` delegate to it (−62 LoC inline, +0 net behaviour). **Tests** (`qdp-core/tests/gpu_ptr_encoding.rs`) Four new trait-dispatch tests dispatch through `Encoding::encoder()` (the trait object), not the inherent helper: - `test_trait_encode_from_gpu_ptr_f32_amplitude` - `test_trait_encode_from_gpu_ptr_f32_angle` - `test_trait_encode_from_gpu_ptr_f32_basis` - `test_trait_encode_from_gpu_ptr_f32_default_not_implemented_for_phase` The last one verifies the default body returns a `NotImplemented` error with the expected message — the contract used by callers that fall back to a CPU staging path when an encoder doesn't support zero-copy. ### Test results (local, RTX 2080 Ti) - Rust unit + integration: all passing - GPU-required tests (`gpu_ptr_encoding`, `gpu_fidelity`, `gpu_angle_encoding`): all passing (incl. the 4 new ones) - Python pytest (`qdp/qdp-python/tests`): all passing - Clippy / fmt / license-header: clean ### Baseline benchmark (no regression) 16 qubits, batch_size = 64, 200 batches, prefetch = 16, RTX 2080 Ti, sustained throughput: | Encoding | Median (vec/s) | p95 (vec/s) | |-----------|---------------:|------------:| | amplitude | 14,316 | 14,783 | | angle | 204,209 | 207,091 | | basis | 648,118 | 692,364 | Amplitude meets the existing baseline gate (≥ 14,015 vec/s); angle / basis match or slightly improve over the pre-refactor numbers (the refactor only changes call structure, not the kernel path). ### Migration / compatibility No public API removed or renamed. The new trait method adds a new override point that future encoders should implement when they gain f32 zero-copy support; existing encoders without such a path keep their current `NotImplemented` behaviour. Downstream callers that were reaching `AmplitudeEncoder::encode_from_gpu_ptr_f32_with_stream` (there were none in-tree) gain a new public inherent fn matching the angle / basis layout. ### Checklist - [x] Added or updated unit tests for all changes - [x] Added or updated documentation for all changes (doc-comments on trait method + new amplitude fn explain the symmetry intent) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
