anirudhacharya commented on issue #15560: Add fp16 support for topk
URL: https://github.com/apache/incubator-mxnet/pull/15560#issuecomment-523520676
 
 
   @drivanov yes, `cuda::less_half<half>()` and `cuda::greater_half<half>()` 
are defined similarly and should be looked into. 
   But the failure that matters for fp16 support in `topk` and for this PR is here: 
https://github.com/apache/incubator-mxnet/blob/master/src/operator/tensor/sort_op-inl.cuh#L75.
 The problem appears when we try to run `DeviceRadixSort` on half 
precision values. The documentation at 
https://nvlabs.github.io/cub/structcub_1_1_device_radix_sort.html says that 
`DeviceRadixSort` handles all numeric primitive types, including half 
precision. So either there is an issue with the way it is being called here, or 
there is an issue with the routine itself.
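
   One way to tell those two cases apart would be a standalone check that 
calls `cub::DeviceRadixSort::SortPairs` on `__half` keys outside of MXNet. 
The sketch below is only that, not the code in `sort_op-inl.cuh`. The helper 
name `argsort_half` and the sample data are made up for illustration, and it 
assumes a CUDA toolkit where `__float2half` is callable from host code and a 
CUB version that accepts `__half` keys.

```cuda
// Standalone sketch (not MXNet code): sort __half keys with
// cub::DeviceRadixSort::SortPairs, carrying int indices along as values.
#include <cstdio>
#include <vector>
#include <cuda_fp16.h>
#include <cub/cub.cuh>

// Hypothetical helper: returns the original indices of `input`,
// ordered by ascending key after converting the keys to half precision.
std::vector<int> argsort_half(const std::vector<float>& input) {
  const int n = static_cast<int>(input.size());
  std::vector<__half> h_keys(n);
  std::vector<int> h_idx(n);
  for (int i = 0; i < n; ++i) {
    h_keys[i] = __float2half(input[i]);  // assumes host-callable conversion
    h_idx[i] = i;
  }

  __half *d_keys_in = nullptr, *d_keys_out = nullptr;
  int *d_idx_in = nullptr, *d_idx_out = nullptr;
  cudaMalloc(&d_keys_in, n * sizeof(__half));
  cudaMalloc(&d_keys_out, n * sizeof(__half));
  cudaMalloc(&d_idx_in, n * sizeof(int));
  cudaMalloc(&d_idx_out, n * sizeof(int));
  cudaMemcpy(d_keys_in, h_keys.data(), n * sizeof(__half), cudaMemcpyHostToDevice);
  cudaMemcpy(d_idx_in, h_idx.data(), n * sizeof(int), cudaMemcpyHostToDevice);

  // First call with a null workspace only reports the temp storage size.
  void* d_temp = nullptr;
  size_t temp_bytes = 0;
  cub::DeviceRadixSort::SortPairs(d_temp, temp_bytes, d_keys_in, d_keys_out,
                                  d_idx_in, d_idx_out, n);
  cudaMalloc(&d_temp, temp_bytes);

  // Second call performs the sort.
  cudaError_t err = cub::DeviceRadixSort::SortPairs(
      d_temp, temp_bytes, d_keys_in, d_keys_out, d_idx_in, d_idx_out, n);
  if (err == cudaSuccess) err = cudaDeviceSynchronize();
  if (err != cudaSuccess) {
    std::fprintf(stderr, "radix sort failed: %s\n", cudaGetErrorString(err));
  }

  cudaMemcpy(h_idx.data(), d_idx_out, n * sizeof(int), cudaMemcpyDeviceToHost);
  cudaFree(d_temp);
  cudaFree(d_keys_in);
  cudaFree(d_keys_out);
  cudaFree(d_idx_in);
  cudaFree(d_idx_out);
  return h_idx;
}

int main() {
  // All sample values are exactly representable in half precision.
  std::vector<float> data = {3.5f, -1.0f, 0.25f, 8.0f, -4.0f, 2.0f};
  for (int i : argsort_half(data)) std::printf("%d ", i);
  std::printf("\n");
  return 0;
}
```

   If this compiles and orders the indices correctly, the routine itself 
is probably fine and the call site in `sort_op-inl.cuh` is the more likely 
culprit. If it fails in the same way, that points at the CUB version or 
the toolkit instead.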

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
