anirudh2290 commented on a change in pull request #11252: [MXNET-323] Improve
performance of broadcast ops backward pass
URL: https://github.com/apache/incubator-mxnet/pull/11252#discussion_r200453024
##########
File path: src/operator/tensor/broadcast_reduce-inl.cuh
##########
@@ -602,6 +602,11 @@ void Reduce(Stream<gpu> *s, const TBlob& small, const OpReqType req,
   ReduceImpl<Reducer, ndim, DType, OP>(stream, small, req, big, workspace, config);
 }
+template <typename Reducer, int ndim, typename DType, typename OP>
+void ReduceWithExtraMem(Stream<cpu>* s, const TBlob& small, const OpReqType req,
+                        const Tensor<cpu, 1, char>& workspace, const TBlob& big) {};
Review comment:
Depending on whether __CUDACC__ is defined, broadcast_reduce-inl.h either
includes broadcast_reduce-inl.cuh or compiles its own definitions (including
ReduceWithExtraMem):
https://github.com/apache/incubator-mxnet/blob/master/src/operator/tensor/broadcast_reduce-inl.h#L171
As a result, the build fails if ReduceWithExtraMem is omitted from
broadcast_reduce-inl.cuh.
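
For illustration, here is a minimal single-file sketch of the pattern described
above. It is not the MXNet source. DEMO_DEVICE_PASS is a hypothetical macro
standing in for __CUDACC__, and BackwardUseExtraMem and RunCpuBackward are
made-up names standing in for whatever shared code calls ReduceWithExtraMem
with the cpu tag. The sketch assumes such a shared caller is compiled in both
passes, which is why the device-side branch needs a cpu overload at all.

```cpp
#include <string>

// Device tags, standing in for mshadow's cpu and gpu types.
struct cpu {};
struct gpu {};

#ifdef DEMO_DEVICE_PASS  // hypothetical stand-in for __CUDACC__
// This branch plays the role of the .cuh file.
inline std::string ReduceWithExtraMem(gpu) { return "gpu implementation"; }
// Empty cpu overload: it exists only so the shared caller below still
// compiles in the device pass. Deleting it breaks the cpu instantiation.
inline std::string ReduceWithExtraMem(cpu) { return "cpu stub"; }
#else
// This branch plays the role of the host-only code in the .h file.
inline std::string ReduceWithExtraMem(cpu) { return "cpu implementation"; }
#endif

// Shared code (hypothetical name) that is compiled in both passes.
template <typename xpu>
std::string BackwardUseExtraMem() {
  return ReduceWithExtraMem(xpu{});
}

// The cpu instantiation (hypothetical name) is requested in both passes,
// so a cpu overload of ReduceWithExtraMem must be visible in both.
inline std::string RunCpuBackward() { return BackwardUseExtraMem<cpu>(); }
```

Compiled without DEMO_DEVICE_PASS, RunCpuBackward() resolves to the host
implementation. With the macro defined, it resolves to the stub, and removing
the stub makes that pass fail to compile, which is the failure mode described
in this comment.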