alisha-1000 commented on issue #1003:
URL: https://github.com/apache/mahout/issues/1003#issuecomment-3939387314

   Hi @viiccwen,
   
   I’d like to work on adding zero-copy float32 GPU support for angle encoding.
   
   From the description, it seems amplitude encoding already provides specialized APIs (`encode_from_gpu_ptr_f32` and the `_with_stream` variant) that avoid host round-trips when given `torch.float32` CUDA tensors.
   
   For angle encoding, I propose:
   
   - Adding `encode_from_gpu_ptr_f32` variants for angle encoding in `QdpEngine`
   - Implementing a corresponding CUDA kernel that handles float32 input
   - Updating the Python bindings to dispatch based on:
     - device (CUDA vs. CPU)
     - dtype (float32 vs. float64)
   - Ensuring no fallback to CPU occurs when a `torch.float32` CUDA tensor is provided
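   The dispatch rule in the last two bullets could be sketched roughly as below. This is pure Python with a stand-in tensor type (no torch dependency), and the backend path names other than `encode_from_gpu_ptr_f32` are placeholders, not the actual `QdpEngine` API:

   ```python
   from dataclasses import dataclass

   # Minimal stand-in exposing only the attributes the bindings would
   # inspect; `is_cuda` / `dtype` mirror torch.Tensor fields, and
   # `data_ptr` stands in for torch.Tensor.data_ptr().
   @dataclass
   class FakeTensor:
       is_cuda: bool
       dtype: str          # "float32" or "float64"
       data_ptr: int = 0

   def dispatch_angle_encode(t: FakeTensor) -> str:
       """Return which backend path a tensor would take.

       Encodes the proposed rule: a float32 CUDA tensor must take the
       zero-copy GPU path and never fall back to a host round-trip.
       """
       if t.is_cuda and t.dtype == "float32":
           return "encode_from_gpu_ptr_f32"   # zero-copy fast path
       if t.is_cuda:
           return "encode_gpu_f64"            # placeholder name for the existing f64 GPU path
       return "encode_cpu"                    # placeholder name for the host path

   # A float32 CUDA tensor must hit the zero-copy path:
   assert dispatch_angle_encode(FakeTensor(True, "float32")) == "encode_from_gpu_ptr_f32"
   ```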
   
   Before starting, I’d like to confirm:
   
   1. Should angle encoding maintain float32 internally, or upcast to float64 inside the kernel?
   2. Are there precision expectations (e.g. fidelity thresholds) we should preserve compared to the current float64 path?
   
   I’ll review the amplitude encoding implementation as the architectural 
reference and keep the API consistent.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
