alisha-1000 commented on issue #1003: URL: https://github.com/apache/mahout/issues/1003#issuecomment-3939387314
Hi @viiccwen, I'd like to work on adding zero-copy float32 GPU support for angle encoding. From the description, amplitude encoding already provides specialized APIs (`encode_from_gpu_ptr_f32` and the `_with_stream` variant) that avoid host round-trips when given `torch.float32` CUDA tensors. For angle encoding, I propose:

- Adding `encode_from_gpu_ptr_f32` variants for angle encoding in `QdpEngine`
- Implementing a corresponding CUDA kernel that handles float32 input
- Updating the Python bindings to dispatch based on device (CUDA vs. CPU) and dtype (float32 vs. float64)
- Ensuring no fallback to CPU occurs when a `torch.float32` CUDA tensor is provided

Before starting, I'd like to confirm:

1. Should angle encoding stay float32 internally, or upcast to float64 inside the kernel?
2. Are there precision expectations (e.g. fidelity thresholds) we should preserve relative to the current float64 path?

I'll review the amplitude encoding implementation as the architectural reference and keep the API consistent.
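To make the dispatch proposal concrete, here is a minimal sketch of the device/dtype routing logic I have in mind for the Python bindings. The `Tensor` stand-in and the path names other than `encode_from_gpu_ptr_f32` (which the issue mentions for amplitude encoding) are hypothetical placeholders, not Mahout's actual API:

```python
from dataclasses import dataclass

@dataclass
class Tensor:
    """Minimal stand-in for the torch tensor attributes the dispatch inspects."""
    device: str  # "cuda" or "cpu"
    dtype: str   # "float32" or "float64"

def select_angle_encoding_path(t: Tensor) -> str:
    """Pick the zero-copy GPU path when possible; a float32 CUDA tensor
    must never silently fall back to the CPU/host path."""
    if t.device == "cuda":
        if t.dtype == "float32":
            # Zero-copy path: pass the device pointer straight to the kernel.
            return "encode_from_gpu_ptr_f32"
        if t.dtype == "float64":
            # Hypothetical name for the existing f64 GPU path.
            return "encode_from_gpu_ptr"
    # CPU tensors go through host memory as before (hypothetical name).
    return "encode_host"
```

The point of keeping the branch explicit is that the float32-CUDA case is guaranteed to resolve to the zero-copy entry point, matching the "no CPU fallback" requirement above.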
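On the precision question: a quick NumPy experiment suggests the kind of fidelity comparison we could use as an acceptance check between a float32 kernel and the float64 reference. This assumes the common RY-rotation convention for angle encoding (qubit *i* prepared as cos(x_i/2)|0⟩ + sin(x_i/2)|1⟩, full state the tensor product); if the kernel uses a different convention, the sketch would need adjusting:

```python
import numpy as np

def angle_encode(x, dtype):
    """Reference angle encoding under the RY convention:
    qubit i -> cos(x_i/2)|0> + sin(x_i/2)|1>, state = tensor product."""
    x = np.asarray(x, dtype=dtype)
    state = np.array([1.0], dtype=dtype)
    for xi in x:
        qubit = np.array([np.cos(xi / 2), np.sin(xi / 2)], dtype=dtype)
        state = np.kron(state, qubit)
    return state

# Compare a float32 evaluation against the float64 reference.
x = [0.1, 0.7, 1.3, 2.5]
state_f64 = angle_encode(x, np.float64)
state_f32 = angle_encode(x, np.float32).astype(np.float64)
fidelity = abs(np.dot(state_f64, state_f32)) ** 2
```

For small qubit counts the fidelity loss from float32 is tiny (well above 0.999999 here), so a threshold of that order might be a reasonable starting point for the test suite; whether that holds at larger qubit counts is exactly what I'd like maintainer input on.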
