access2rohit commented on a change in pull request #19015:
URL: https://github.com/apache/incubator-mxnet/pull/19015#discussion_r478710967
##########
File path: src/operator/numpy/np_nonzero_op-inl.h
##########
@@ -43,6 +43,25 @@ namespace mxnet {
namespace op {
struct NonzeroForwardKernel {
+ // this is for cpu
+ template<int ndim>
+ MSHADOW_XINLINE static void Map(index_t i,
+ int64_t* out,
+ const index_t* idx,
+ const mshadow::Shape<ndim> shape) {
+ index_t prev = (i == 0) ? 0 : idx[i - 1];
+ index_t curr = idx[i];
+ if (prev != curr) {
+ mshadow::Shape<ndim> coord = mxnet_op::unravel<ndim>(i, shape);
+ for (int j = 0; j < ndim; j++) {
+ out[prev * ndim + j] = coord[j];
+ }
+ }
+ }
+};
+
+struct NonzeroForwardKernelGPU {
+ // for gpu implementation because it does not support int 64 indexing
Review comment:
Ah, that makes sense. The CPU and GPU FCompute functions both end up
running the same kernel logic, but the GPU FCompute first calls cub, which
expects an int32_t pointer that it populates and returns.
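
To make the point concrete, here is a minimal host-side sketch of the shared logic, templated on the index type so the same body works with an int32_t or an int64_t prefix-sum buffer. It is not the PR's code: `Unravel` and `NonzeroCoords` are hypothetical names, the plain loop in step 1 only stands in for whatever scan the real CPU and GPU paths run, and the assumption that `idx` holds an inclusive prefix sum of nonzero flags is inferred from the `prev != curr` check in the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: convert a flat row-major index into per-axis coordinates.
std::vector<int64_t> Unravel(int64_t flat, const std::vector<int64_t>& shape) {
  std::vector<int64_t> coord(shape.size());
  for (std::size_t j = shape.size(); j-- > 0;) {
    coord[j] = flat % shape[j];
    flat /= shape[j];
  }
  return coord;
}

// Hypothetical stand-in for the nonzero forward pass. IdxT is the element
// type of the prefix-sum buffer: int32_t mimics the buffer the GPU path is
// described as receiving from cub, int64_t mimics the CPU path.
template <typename IdxT>
std::vector<int64_t> NonzeroCoords(const std::vector<float>& data,
                                   const std::vector<int64_t>& shape) {
  assert(!shape.empty());

  // Step 1 (assumed): inclusive prefix sum of "element is nonzero" flags.
  std::vector<IdxT> idx(data.size());
  IdxT running = 0;
  for (std::size_t i = 0; i < data.size(); ++i) {
    running += (data[i] != 0.0f) ? 1 : 0;
    idx[i] = running;
  }

  // Step 2: per-element logic mirroring Map() in the diff. The count changes
  // exactly at nonzero positions, and prev is the output row to write.
  const std::size_t ndim = shape.size();
  std::vector<int64_t> out(static_cast<std::size_t>(running) * ndim);
  for (std::size_t i = 0; i < data.size(); ++i) {
    const IdxT prev = (i == 0) ? 0 : idx[i - 1];
    const IdxT curr = idx[i];
    if (prev != curr) {
      const std::vector<int64_t> coord = Unravel(static_cast<int64_t>(i), shape);
      for (std::size_t j = 0; j < ndim; ++j) {
        out[static_cast<std::size_t>(prev) * ndim + j] = coord[j];
      }
    }
  }
  return out;
}
```

Instantiating `NonzeroCoords<int32_t>` and `NonzeroCoords<int64_t>` on the same input gives the same coordinates, which is the sense in which only the type of `idx` differs between the two paths in this sketch.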