mozga-intel commented on a change in pull request #19896:
URL: https://github.com/apache/incubator-mxnet/pull/19896#discussion_r686123337
##########
File path: src/operator/contrib/adaptive_avg_pooling.cc
##########
@@ -197,6 +198,90 @@
num_threads(engine::OpenMP::Get()->GetRecommendedOMPThreadCount())
}
}
+#if MXNET_USE_MKLDNN == 1
+bool SupportMKLDNNAveragePooling(const NDArray &in_data,
+ const NDArray &out_data) {
+ for (int64_t idx = 2; idx < in_data.shape().ndim(); ++idx) {
+ const int s1 = in_data.shape()[idx];
+ const int s2 = out_data.shape()[idx];
+ if (s2 == 0) {
+ return false;
+ }
+ if (s1 % s2 != 0) {
+ return false;
+ }
+ }
+ const int IH = in_data.shape()[2];
+ const int IW = in_data.shape()[3];
+ const int OH = out_data.shape()[2];
+ const int OW = out_data.shape()[3];
+
+ const int strides_H = floor((IH << 1) / OH) - floor(IH / OH);
+ const int strides_W = floor((IW << 1) / OW) - floor(IW / OW);
+ const int kernel_H = ceil((IH << 1) / OH) - floor(IH / OH);
+ const int kernel_W = ceil((IW << 1) / OW) - floor(IW / OW);
+ const int pad_l_top = (strides_H * (OH - 1) + kernel_H - IH) / 2;
+ const int pad_l_left = (strides_W * (OW - 1) + kernel_W - IW) / 2;
+
Review comment:
Good point! However, it could be difficult to determine the lower and
upper bounds precisely. Note that this is a random process: a collection of
"random" variables indexed by many external factors. For example, for small
tensors the time needed to warm up the threads could be significant. Let me
look into it; I'll try to address this issue, but in the next patch. Yes, it
might be faster: I'm invoking this function ten times, and furthermore the
tensors aren't stored in the cache.
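
To make the warm-up concern concrete, here is a minimal sketch of how such a measurement could be structured: a few untimed warm-up calls, then timed calls summarized by the median. This is not code from the PR. `TimeWithWarmup` and `Median` are hypothetical helper names, and the warm-up and repeat counts are assumptions chosen only for illustration.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper (not part of MXNet): calls `fn` `warmup` times without
// timing it, so thread pools and caches can settle, then calls it `repeats`
// times and returns each call's wall-clock duration in microseconds.
std::vector<double> TimeWithWarmup(const std::function<void()>& fn,
                                   std::size_t warmup,
                                   std::size_t repeats) {
  for (std::size_t i = 0; i < warmup; ++i) {
    fn();  // untimed warm-up call
  }
  std::vector<double> samples;
  samples.reserve(repeats);
  for (std::size_t i = 0; i < repeats; ++i) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    samples.push_back(
        std::chrono::duration<double, std::micro>(stop - start).count());
  }
  return samples;
}

// Hypothetical helper: median of the samples, which is less sensitive to a
// few slow outlier runs than the mean. Returns 0.0 for an empty input.
double Median(std::vector<double> samples) {
  if (samples.empty()) {
    return 0.0;
  }
  std::sort(samples.begin(), samples.end());
  const std::size_t mid = samples.size() / 2;
  return samples.size() % 2 == 1 ? samples[mid]
                                 : 0.5 * (samples[mid - 1] + samples[mid]);
}
```

For example, `Median(TimeWithWarmup(run_pooling, 3, 10))` would time ten calls after three warm-up calls, where `run_pooling` stands for whatever callable wraps the operator under test. Even so, the result still depends on tensor size and cache state, so any bound derived this way is only approximate.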