[GitHub] [incubator-mxnet] zhreshold commented on issue #17995: Fix ElemwiseSum for more than 4 inputs
zhreshold commented on issue #17995: Fix ElemwiseSum for more than 4 inputs URL: https://github.com/apache/incubator-mxnet/pull/17995#issuecomment-610766998 confirmed it can fix https://github.com/apache/incubator-mxnet/issues/16708#issuecomment-558876214 as well This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] zixuanweeei commented on issue #17964: [CI] CI Functionality Check for v1.6.x Branch
zixuanweeei commented on issue #17964: [CI] CI Functionality Check for v1.6.x Branch URL: https://github.com/apache/incubator-mxnet/pull/17964#issuecomment-610761957 @mxnet-bot run ci [unix-gpu]
[GitHub] [incubator-mxnet] mxnet-bot commented on issue #17964: [CI] CI Functionality Check for v1.6.x Branch
mxnet-bot commented on issue #17964: [CI] CI Functionality Check for v1.6.x Branch URL: https://github.com/apache/incubator-mxnet/pull/17964#issuecomment-610761989 Jenkins CI successfully triggered : [unix-gpu]
[incubator-mxnet] branch master updated: [mkldnn] optimize for mkldnn batchnorm backward (#17902)
This is an automated email from the ASF dual-hosted git repository.

patriczhao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git

The following commit(s) were added to refs/heads/master by this push:
     new 13841dd  [mkldnn] optimize for mkldnn batchnorm backward (#17902)
13841dd is described below

commit 13841ddd2d6a53f9f0c22f527a0363d818489bd0
Author: rongzha1
AuthorDate: Wed Apr 8 12:57:29 2020 +0800

    [mkldnn] optimize for mkldnn batchnorm backward (#17902)

    * optimize for backward batchnorm

    * use memcpy instead of a 'for' loop

    * remove unnecessary pointer casts and add const for some variables
---
 src/operator/nn/mkldnn/mkldnn_batch_norm-inl.h | 50 ++++++++++++++------------
 1 file changed, 27 insertions(+), 23 deletions(-)

diff --git a/src/operator/nn/mkldnn/mkldnn_batch_norm-inl.h b/src/operator/nn/mkldnn/mkldnn_batch_norm-inl.h
index 4de0bb3..d407d94 100644
--- a/src/operator/nn/mkldnn/mkldnn_batch_norm-inl.h
+++ b/src/operator/nn/mkldnn/mkldnn_batch_norm-inl.h
@@ -180,9 +180,10 @@ void MKLDNNBatchNormForward(const nnvm::NodeAttrs &attrs, const OpContext &ctx,
   CHECK(weight_mem.get_desc().get_size() == channels_ * sizeof(float) * 2);
   float* weight_ptr = gamma.data().dptr<float>();
   float* bias_ptr = beta.data().dptr<float>();
+  const size_t copy_size = sizeof(weight_buf[0]) * channels_;
   if (!param.fix_gamma) {
-    memcpy(weight_buf, weight_ptr, sizeof(weight_buf[0]) * channels_);
-    memcpy(&weight_buf[channels_], bias_ptr, sizeof(weight_buf[0]) * channels_);
+    memcpy(weight_buf, weight_ptr, copy_size);
+    memcpy(&weight_buf[channels_], bias_ptr, copy_size);
   } else if (IsBNWriting(req[batchnorm::kGamma])) {
     for (int i = 0; i < channels_; i++) {
       weight_buf[i] = 1.0f;
@@ -332,17 +333,18 @@ void MKLDNNBatchNormBackward(const nnvm::NodeAttrs &attrs, const OpContext &ctx,
   const NDArray &beta = in_data[batchnorm::kBeta];
   DType *weight_buf = reinterpret_cast<DType *>(bwd.GetWeight().get_data_handle());
   nnvm::dim_t channels_ = data.shape()[1];
-  for (int i = 0; i < channels_; i++) {
-    if (!param.fix_gamma)
-      weight_buf[i] = (gamma.data().dptr<DType>())[i];   // weight
-    else
+  DType *weight_ptr = gamma.data().dptr<DType>();
+  DType *bias_ptr = beta.data().dptr<DType>();
+  const size_t copy_size = sizeof(DType) * channels_;
+  if (!param.fix_gamma) {
+    memcpy(weight_buf, weight_ptr, copy_size);
+    memcpy(&weight_buf[channels_], bias_ptr, copy_size);
+  } else {
+    for (int i = 0; i < channels_; i++) {
       weight_buf[i] = static_cast<DType>(1.0f);
+    }
+    memcpy(&weight_buf[channels_], bias_ptr, copy_size);
   }
-
-  for (int i = 0; i < channels_; i++) {
-    weight_buf[channels_ + i] = (beta.data().dptr<DType>())[i];  // bias
-  }
-
   mkldnn_args_map_t net_args;
   net_args[MKLDNN_ARG_SRC] = *data_mem;
   net_args[MKLDNN_ARG_DIFF_SRC] = *gradi_mem;
@@ -352,10 +354,10 @@ void MKLDNNBatchNormBackward(const nnvm::NodeAttrs &attrs, const OpContext &ctx,
   // training but no input mean and variance
   if (ctx.is_train && !param.use_global_stats) {
-    DType* moving_mean_ptr = reinterpret_cast<DType *>(moving_mean.data().dptr<DType>());
-    DType* moving_var_ptr = reinterpret_cast<DType *>(moving_var.data().dptr<DType>());
-    DType* out_mean_ptr = reinterpret_cast<DType *>(out_mean.data().dptr<DType>());
-    DType* out_var_ptr = reinterpret_cast<DType *>(out_var.data().dptr<DType>());
+    DType* moving_mean_ptr = moving_mean.data().dptr<DType>();
+    DType* moving_var_ptr = moving_var.data().dptr<DType>();
+    DType* out_mean_ptr = out_mean.data().dptr<DType>();
+    DType* out_var_ptr = out_var.data().dptr<DType>();
     mkldnn::memory var_mem(bwd.pd.variance_desc(), CpuEngine::Get()->get_engine());
     DType *tmp_var_ptr = reinterpret_cast<DType *>(var_mem.get_data_handle());
@@ -381,15 +383,17 @@ void MKLDNNBatchNormBackward(const nnvm::NodeAttrs &attrs, const OpContext &ctx,
     // copy data from gradw_mem to in_grad[1] and in_grad[2]
     DType *gw_buf = reinterpret_cast<DType *>(bwd.GetGradw().get_data_handle());
-    for (int i = 0; i < channels_; i++) {
-      if (!param.fix_gamma)
-        (in_grad[1].data().dptr<DType>())[i] = gw_buf[i];
-      else
-        (in_grad[1].data().dptr<DType>())[i] = 0.0f;
-    }
+    DType *w_grad_1 = in_grad[1].data().dptr<DType>();
+    DType *w_grad_2 = in_grad[2].data().dptr<DType>();
-    for (int i = 0; i < channels_; i++) {
-      (in_grad[2].data().dptr<DType>())[i] = gw_buf[i + channels_];
+    if (!param.fix_gamma) {
+      memcpy(w_grad_1, gw_buf, copy_size);
+      memcpy(w_grad_2, &gw_buf[channels_], copy_size);
+    } else {
+      for (int i = 0; i < channels_; i++) {
+        (in_grad[1].data().dptr<DType>())[i] = 0.0f;
+      }
+      memcpy(w_grad_2, &gw_buf[channels_], copy_size);
+    }
   } else {
     LOG(FATAL) << "MKLDNN batch normalization backward: should not reach here ...";
[GitHub] [incubator-mxnet] pengzhao-intel merged pull request #17902: [mkldnn] optimize for mkldnn batchnorm backward
pengzhao-intel merged pull request #17902: [mkldnn] optimize for mkldnn batchnorm backward URL: https://github.com/apache/incubator-mxnet/pull/17902
[GitHub] [incubator-mxnet] rongzha1 commented on issue #17902: [mkldnn] optimize for mkldnn batchnorm backward
rongzha1 commented on issue #17902: [mkldnn] optimize for mkldnn batchnorm backward URL: https://github.com/apache/incubator-mxnet/pull/17902#issuecomment-610749446 Hi @pengzhao-intel @TaoLv @ChaiBapchya, CI passed. Please help review this PR again.
[GitHub] [incubator-mxnet] wuxun-zhang commented on issue #17884: [MKL-DNN] Integrate Conv3d and Pool3d/1d
wuxun-zhang commented on issue #17884: [MKL-DNN] Integrate Conv3d and Pool3d/1d URL: https://github.com/apache/incubator-mxnet/pull/17884#issuecomment-610747504 Now CI has finally passed. @pengzhao-intel @TaoLv please help review again. @ChaiBapchya please also double check if this PR fixes https://github.com/apache/incubator-mxnet/issues/17915.
[GitHub] [incubator-mxnet] JiangZhaoh edited a comment on issue #17759: [numpy] FFI for insert \ delete \ matmul etc.
JiangZhaoh edited a comment on issue #17759: [numpy] FFI for insert \ delete \ matmul etc. URL: https://github.com/apache/incubator-mxnet/pull/17759#issuecomment-610741017 @mxnet-bot run ci [centos-cpu]
[GitHub] [incubator-mxnet] mxnet-bot commented on issue #17759: [numpy] FFI for insert \ delete \ matmul etc.
mxnet-bot commented on issue #17759: [numpy] FFI for insert \ delete \ matmul etc. URL: https://github.com/apache/incubator-mxnet/pull/17759#issuecomment-610741146 Jenkins CI successfully triggered : [centos-cpu]
[GitHub] [incubator-mxnet] JiangZhaoh commented on issue #17759: [numpy] FFI for insert \ delete \ matmul etc.
JiangZhaoh commented on issue #17759: [numpy] FFI for insert \ delete \ matmul etc. URL: https://github.com/apache/incubator-mxnet/pull/17759#issuecomment-610741017 run ci [centos-cpu]
[GitHub] [incubator-mxnet] hzfan commented on issue #17917: fix UnicodeDecodeError: 'utf-8' codec can't decode bytes in position …
hzfan commented on issue #17917: fix UnicodeDecodeError: 'utf-8' codec can't decode bytes in position … URL: https://github.com/apache/incubator-mxnet/pull/17917#issuecomment-610740871 Could you provide your build script? CI has tested the build on Windows, so the error seems a bit strange to me. Here is a successful build for your reference: https://github.com/apache/incubator-mxnet/blob/master/ci/build_windows.py . cc @vexilligera
[GitHub] [incubator-mxnet] djaym7 closed issue #17939: Ways of Freezing part of parameter and not the whole layer.
djaym7 closed issue #17939: Ways of Freezing part of parameter and not the whole layer. URL: https://github.com/apache/incubator-mxnet/issues/17939
[GitHub] [incubator-mxnet] mxnet-bot commented on issue #17851: [Numpy] np.linalg.qr forward implementation
mxnet-bot commented on issue #17851: [Numpy] np.linalg.qr forward implementation URL: https://github.com/apache/incubator-mxnet/pull/17851#issuecomment-610721314 Jenkins CI successfully triggered : [unix-gpu]
[GitHub] [incubator-mxnet] D-Roberts commented on issue #17851: [Numpy] np.linalg.qr forward implementation
D-Roberts commented on issue #17851: [Numpy] np.linalg.qr forward implementation URL: https://github.com/apache/incubator-mxnet/pull/17851#issuecomment-610721293 @mxnet-bot run ci [unix-gpu]
[incubator-mxnet] branch master updated (16ddc6d -> b7f7525)
This is an automated email from the ASF dual-hosted git repository. patriczhao pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from 16ddc6d Custom Operator Random Number Generator Support (#17762) add b7f7525 dnnl v1.2.2 (#17991) No new revisions were added by this update. Summary of changes: 3rdparty/mkldnn | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[GitHub] [incubator-mxnet] pengzhao-intel merged pull request #17991: Get DNNL v1.2.2 back to master branch
pengzhao-intel merged pull request #17991: Get DNNL v1.2.2 back to master branch URL: https://github.com/apache/incubator-mxnet/pull/17991
[GitHub] [incubator-mxnet] TaoLv commented on issue #17991: Get DNNL v1.2.2 back to master branch
TaoLv commented on issue #17991: Get DNNL v1.2.2 back to master branch URL: https://github.com/apache/incubator-mxnet/pull/17991#issuecomment-610708309 @pengzhao-intel @haojin2 If possible, please approve and merge. Thanks.
[GitHub] [incubator-mxnet] pengzhao-intel commented on issue #17980: When compiled with MKL, fully_connected calls DNNL while dot and batch_dot call MKL
pengzhao-intel commented on issue #17980: When compiled with MKL, fully_connected calls DNNL while dot and batch_dot call MKL URL: https://github.com/apache/incubator-mxnet/issues/17980#issuecomment-610706449 Thanks @emfomenk & @TaoLv for identifying the issue. Our team will follow up on it and switch more ops to DNNL soon.
[GitHub] [incubator-mxnet] sxjscience commented on issue #17995: Fix ElemwiseSum for more than 4 inputs
sxjscience commented on issue #17995: Fix ElemwiseSum for more than 4 inputs URL: https://github.com/apache/incubator-mxnet/pull/17995#issuecomment-610704560 @ptrendx I can confirm that this fixes the issue.
[GitHub] [incubator-mxnet] ChaiBapchya edited a comment on issue #4870: Travis-CI gave out multiple FAILED result on R_test, preventing PR
ChaiBapchya edited a comment on issue #4870: Travis-CI gave out multiple FAILED result on R_test, preventing PR URL: https://github.com/apache/incubator-mxnet/issues/4870#issuecomment-610703000

```
First time using roxygen2. Upgrading automatically...
[2020-04-08T00:03:57.271Z] Updating roxygen version in /work/mxnet/R-package/DESCRIPTION
[2020-04-08T00:03:57.528Z] Loading mxnet
[2020-04-08T00:03:58.458Z] [1] "Loading local: inst/libs/libmxnet.so"
[2020-04-08T00:04:00.977Z] [1] "Loading local: src/mxnet.so"
[2020-04-08T00:04:00.977Z] Error in loadModule("mxnet", TRUE) : could not find function "loadModule"
[2020-04-08T00:04:00.977Z] Calls: ... load_code -> -> run_pkg_hook ->
[2020-04-08T00:04:00.977Z] Execution halted
[2020-04-08T00:04:00.977Z] Makefile:688: recipe for target 'rpkg' failed
[2020-04-08T00:04:00.977Z] make: *** [rpkg] Error 1
```

I am facing this issue while trying to fix the R CPU errors on the v1.6.x branch of mxnet here: https://github.com/apache/incubator-mxnet/pull/17993
It is a similar error of the `loadModule` function not being found in R. @thirdwing any idea?
[GitHub] [incubator-mxnet] mxnet-bot commented on issue #17952: 1bit gradient compression
mxnet-bot commented on issue #17952: 1bit gradient compression URL: https://github.com/apache/incubator-mxnet/pull/17952#issuecomment-610703072 Jenkins CI successfully triggered : [centos-gpu]
[GitHub] [incubator-mxnet] shuo-ouyang commented on issue #17952: 1bit gradient compression
shuo-ouyang commented on issue #17952: 1bit gradient compression URL: https://github.com/apache/incubator-mxnet/pull/17952#issuecomment-610703052 @mxnet-bot run ci [centos-gpu]
[GitHub] [incubator-mxnet] ChaiBapchya commented on issue #4870: Travis-CI gave out multiple FAILED result on R_test, preventing PR
ChaiBapchya commented on issue #4870: Travis-CI gave out multiple FAILED result on R_test, preventing PR URL: https://github.com/apache/incubator-mxnet/issues/4870#issuecomment-610703000

```
First time using roxygen2. Upgrading automatically...
[2020-04-08T00:03:57.271Z] Updating roxygen version in /work/mxnet/R-package/DESCRIPTION
[2020-04-08T00:03:57.528Z] Loading mxnet
[2020-04-08T00:03:58.458Z] [1] "Loading local: inst/libs/libmxnet.so"
[2020-04-08T00:04:00.977Z] [1] "Loading local: src/mxnet.so"
[2020-04-08T00:04:00.977Z] Error in loadModule("mxnet", TRUE) : could not find function "loadModule"
[2020-04-08T00:04:00.977Z] Calls: ... load_code -> -> run_pkg_hook ->
[2020-04-08T00:04:00.977Z] Execution halted
[2020-04-08T00:04:00.977Z] Makefile:688: recipe for target 'rpkg' failed
[2020-04-08T00:04:00.977Z] make: *** [rpkg] Error 1
```

A similar error of the `loadModule` function not being found in R. @thirdwing any idea?
[GitHub] [incubator-mxnet] ciyongch commented on issue #16864: [Discussion] 1.7.0 Roadmap
ciyongch commented on issue #16864: [Discussion] 1.7.0 Roadmap URL: https://github.com/apache/incubator-mxnet/issues/16864#issuecomment-610698949 @szha, sure, I will follow up those items.
[GitHub] [incubator-mxnet] mxnet-bot commented on issue #17974: Add instructions on distributed MXNet with Horovod on Kubernetes
mxnet-bot commented on issue #17974: Add instructions on distributed MXNet with Horovod on Kubernetes URL: https://github.com/apache/incubator-mxnet/pull/17974#issuecomment-610698797 Jenkins CI successfully triggered : [clang, centos-cpu, website, windows-gpu, sanity, windows-cpu, edge, miscellaneous, unix-gpu, unix-cpu, centos-gpu]
[GitHub] [incubator-mxnet] terrytangyuan commented on issue #17974: Add instructions on distributed MXNet with Horovod on Kubernetes
terrytangyuan commented on issue #17974: Add instructions on distributed MXNet with Horovod on Kubernetes URL: https://github.com/apache/incubator-mxnet/pull/17974#issuecomment-610698753 @mxnet-bot run ci [all]
[GitHub] [incubator-mxnet] zixuanweeei commented on issue #17959: [MKLDNN] Add LSTMP to v1.6.x
zixuanweeei commented on issue #17959: [MKLDNN] Add LSTMP to v1.6.x URL: https://github.com/apache/incubator-mxnet/pull/17959#issuecomment-610696673 Thanks for your quick response @aaronmarkham @ChaiBapchya. Hope it will work soon. Besides, **[unix-gpu] - Static build GPU 14.04 Python** looks like it is broken by a network issue. It also appeared in #17964.
[GitHub] [incubator-mxnet] shuokay commented on issue #17944: How to split symbol?
shuokay commented on issue #17944: How to split symbol? URL: https://github.com/apache/incubator-mxnet/issues/17944#issuecomment-610693985 I think `mx.sym.slice` is what you need.
[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.
This is an automated email from the ASF dual-hosted git repository.

aaronmarkham pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git

The following commit(s) were added to refs/heads/asf-site by this push:
     new 9029c98  Bump the publish timestamp.
9029c98 is described below

commit 9029c9893fc300a55748b58e5ac0b13b19e0b59f
Author: mxnet-ci
AuthorDate: Wed Apr 8 01:00:11 2020 +

    Bump the publish timestamp.
---
 date.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/date.txt b/date.txt
new file mode 100644
index 000..9dd1e81
--- /dev/null
+++ b/date.txt
@@ -0,0 +1 @@
+Wed Apr 8 01:00:11 UTC 2020
[GitHub] [incubator-mxnet] samskalicky commented on issue #17885: [WIP] MXNet Extensions enhancements
samskalicky commented on issue #17885: [WIP] MXNet Extensions enhancements URL: https://github.com/apache/incubator-mxnet/pull/17885#issuecomment-610693459 @mxnet-bot run ci [sanity]
[GitHub] [incubator-mxnet] mxnet-bot commented on issue #17885: [WIP] MXNet Extensions enhancements
mxnet-bot commented on issue #17885: [WIP] MXNet Extensions enhancements URL: https://github.com/apache/incubator-mxnet/pull/17885#issuecomment-610693482 Jenkins CI successfully triggered : [sanity]
[GitHub] [incubator-mxnet] pengzhao-intel commented on issue #17872: Fix issue of zeros gradients w.r.t. RNN bias when num_layers > 1
pengzhao-intel commented on issue #17872: Fix issue of zeros gradients w.r.t. RNN bias when num_layers > 1 URL: https://github.com/apache/incubator-mxnet/pull/17872#issuecomment-610691835

> @zixuanweeei Thanks for your contribution, could you also cherry-pick the commit to 1.7? The DJL LSTM model depends on this commit. Thanks!

Sure, please add this requirement to the 1.7 roadmap #16864.
[GitHub] [incubator-mxnet] haojin2 commented on a change in pull request #17904: [Numpy] add: numpy op trilindices
haojin2 commented on a change in pull request #17904: [Numpy] add: numpy op trilindices
URL: https://github.com/apache/incubator-mxnet/pull/17904#discussion_r405196267

File path: src/operator/numpy/np_matrix_op-inl.h

@@ -287,6 +287,106 @@ void NumpyVstackBackward(const nnvm::NodeAttrs& attrs,
   });
 }
 
+struct NumpyTrilindicesParam : public dmlc::Parameter<NumpyTrilindicesParam> {
+  int n;
+  int k;
+  int m;
+  DMLC_DECLARE_PARAMETER(NumpyTrilindicesParam) {
+    DMLC_DECLARE_FIELD(n)
+      .describe("The row dimension of the arrays for which"
+                "the returned indices will be valid.");
+    DMLC_DECLARE_FIELD(k)
+      .set_default(0)
+      .describe("Diagonal offset");
+    DMLC_DECLARE_FIELD(m)
+      .describe("The column dimension of the arrays for "
+                "which the returned arrays will be valid."
+                "By default m is taken equal to n.");
+  }
+  void SetAttrDict(std::unordered_map<std::string, std::string>* dict) {
+    std::ostringstream n_s, k_s, m_s;
+    n_s << n;
+    k_s << k;
+    m_s << m;
+    (*dict)["n"] = n_s.str();
+    (*dict)["k"] = k_s.str();
+    (*dict)["m"] = m_s.str();
+  }
+};
+
+template<int req>
+struct TrilindicesOpForwardImpl {
+  template<typename DType>
+  MSHADOW_XINLINE static void Map(int i, DType* out_data0, DType* out_data1,
+                                  int* data, int length) {
+    KERNEL_ASSIGN(out_data0[i], req, data[i]);
+    KERNEL_ASSIGN(out_data1[i], req, data[i + length]);
+  }
+};
+
+template<typename xpu>
+void TrilindicesOpForward(const nnvm::NodeAttrs& attrs,
+    const OpContext& ctx,

Review comment: alignment
[GitHub] [incubator-mxnet] haojin2 commented on a change in pull request #17904: [Numpy] add: numpy op trilindices
haojin2 commented on a change in pull request #17904: [Numpy] add: numpy op trilindices
URL: https://github.com/apache/incubator-mxnet/pull/17904#discussion_r405196087

File path: src/operator/numpy/np_matrix_op.cc

@@ -1115,6 +1115,66 @@ NNVM_REGISTER_OP(_backward_np_dstack)
 .set_attr("TIsBackward", true)
 .set_attr("FCompute", DStackGradCompute);
 
+DMLC_REGISTER_PARAMETER(NumpyTrilindicesParam);
+
+inline bool TrilindicesOpType(const nnvm::NodeAttrs& attrs,
+    std::vector<int> *in_attrs,
+    std::vector<int> *out_attrs) {
+  CHECK_EQ(in_attrs->size(), 0U);
+  CHECK_EQ(out_attrs->size(), 2U);
+
+  TYPE_ASSIGN_CHECK(*out_attrs, 0, mshadow::kInt64);
+  TYPE_ASSIGN_CHECK(*out_attrs, 1, mshadow::kInt64);
+
+  return true;
+}
+
+inline bool TrilindicesOpShape(const nnvm::NodeAttrs& attrs,
+    mxnet::ShapeVector* in_attrs,

Review comment: alignment
[GitHub] [incubator-mxnet] haojin2 commented on a change in pull request #17904: [Numpy] add: numpy op trilindices
haojin2 commented on a change in pull request #17904: [Numpy] add: numpy op trilindices
URL: https://github.com/apache/incubator-mxnet/pull/17904#discussion_r405196183

File path: src/operator/numpy/np_matrix_op.cc

@@ -1115,6 +1115,66 @@ NNVM_REGISTER_OP(_backward_np_dstack)
 .set_attr("TIsBackward", true)
 .set_attr("FCompute", DStackGradCompute);
 
+DMLC_REGISTER_PARAMETER(NumpyTrilindicesParam);
+
+inline bool TrilindicesOpShape(const nnvm::NodeAttrs& attrs,
+    mxnet::ShapeVector* in_attrs,
+    mxnet::ShapeVector* out_attrs) {
+  CHECK_EQ(in_attrs->size(), 0U);
+  CHECK_EQ(out_attrs->size(), 2U);
+
+  const NumpyTrilindicesParam& param =
+    nnvm::get<NumpyTrilindicesParam>(attrs.parsed);

Review comment:

```c++
  const NumpyTrilindicesParam& param = nnvm::get<NumpyTrilindicesParam>(attrs.parsed);
```
[incubator-mxnet] branch master updated (f906a02 -> 16ddc6d)
This is an automated email from the ASF dual-hosted git repository.

lausen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git.

    from f906a02  ffi_atleast_1/2/3d (#17897)
     add 16ddc6d  Custom Operator Random Number Generator Support (#17762)

No new revisions were added by this update.

Summary of changes:
 CMakeLists.txt                                | 17 ++---
 example/extensions/lib_custom_op/relu_lib.cu  | 90 ---
 example/extensions/lib_custom_op/test_relu.py | 43 -
 include/mxnet/lib_api.h                       | 57 +
 include/mxnet/random_generator.h              |  8 +++
 src/c_api/c_api.cc                            | 41 ++++
 src/common/random_generator.cu                |  5 ++
 tests/python/gpu/test_extensions_gpu.py       | 18 +-
 8 files changed, 220 insertions(+), 59 deletions(-)
[GitHub] [incubator-mxnet] leezu merged pull request #17762: Custom Operator Random Number Generator Support
leezu merged pull request #17762: Custom Operator Random Number Generator Support URL: https://github.com/apache/incubator-mxnet/pull/17762
[incubator-mxnet] branch master updated (a960f5a -> f906a02)
This is an automated email from the ASF dual-hosted git repository.

haoj pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git.

    from a960f5a  ffi_array_split, v/h/dsplit (#17873)
     add f906a02  ffi_atleast_1/2/3d (#17897)

No new revisions were added by this update.

Summary of changes:
 benchmark/python/ffi/benchmark_ffi.py |   3 +
 python/mxnet/_numpy_op_doc.py         | 107 --
 python/mxnet/ndarray/numpy/_op.py     | 118 ++
 python/mxnet/numpy/multiarray.py      | 112 
 python/mxnet/symbol/numpy/_symbol.py  |  71 
 src/api/operator/numpy/np_init_op.cc  |  81 +++
 src/operator/numpy/np_init_op.cc      |   2 +-
 src/operator/numpy/np_init_op.cu      |   6 +-
 src/operator/numpy/np_init_op.h       |   5 ++
 9 files changed, 394 insertions(+), 111 deletions(-)
[incubator-mxnet] branch master updated (892f982 -> a960f5a)
This is an automated email from the ASF dual-hosted git repository.

haoj pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git.

 from 892f982 * impl - linalg.lstsq for cpu (#17950)
 add  a960f5a ffi_array_split, v/h/dsplit (#17873)

No new revisions were added by this update.

Summary of changes:
 benchmark/python/ffi/benchmark_ffi.py  |   4 +
 python/mxnet/ndarray/numpy/_op.py      |  44 +++---
 python/mxnet/symbol/numpy/_symbol.py   |  23 +++--
 src/api/operator/numpy/np_matrix_op.cc | 154 +
 src/operator/tensor/matrix_op.cc       |   1 +
 5 files changed, 186 insertions(+), 40 deletions(-)
[incubator-mxnet] branch master updated (79c576b -> 892f982)
This is an automated email from the ASF dual-hosted git repository.

haoj pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git.

 from 79c576b [ONNX export] Fixing spatial export for batchnorm (#17711)
 add  892f982 * impl - linalg.lstsq for cpu (#17950)

No new revisions were added by this update.

Summary of changes:
 python/mxnet/ndarray/numpy/linalg.py               |  81 ++-
 python/mxnet/numpy/fallback_linalg.py              |   2 -
 python/mxnet/numpy/linalg.py                       |  73 ++-
 python/mxnet/numpy_dispatch_protocol.py            |   1 +
 python/mxnet/symbol/numpy/linalg.py                |  67 ++-
 src/operator/c_lapack_api.cc                       |  12 +
 src/operator/c_lapack_api.h                        |  64 ++-
 src/operator/numpy/linalg/np_lstsq-inl.h           | 593 +
 src/operator/numpy/linalg/np_lstsq.cc              |  97 
 .../numpy/{np_memory_op.cu => linalg/np_lstsq.cu}  |  11 +-
 .../python/unittest/test_numpy_interoperability.py |  41 +-
 tests/python/unittest/test_numpy_op.py             |  76 +++
 12 files changed, 1095 insertions(+), 23 deletions(-)
 create mode 100644 src/operator/numpy/linalg/np_lstsq-inl.h
 create mode 100644 src/operator/numpy/linalg/np_lstsq.cc
 copy src/operator/numpy/{np_memory_op.cu => linalg/np_lstsq.cu} (79%)
[GitHub] [incubator-mxnet] haojin2 merged pull request #17897: [Numpy] FFI: atleast_1/2/3d
haojin2 merged pull request #17897: [Numpy] FFI: atleast_1/2/3d URL: https://github.com/apache/incubator-mxnet/pull/17897 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] haojin2 merged pull request #17873: [Numpy] FFI: array_split, v/h/dsplit
haojin2 merged pull request #17873: [Numpy] FFI: array_split, v/h/dsplit URL: https://github.com/apache/incubator-mxnet/pull/17873 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] haojin2 merged pull request #17950: [Numpy] Add op linalg.lstsq
haojin2 merged pull request #17950: [Numpy] Add op linalg.lstsq URL: https://github.com/apache/incubator-mxnet/pull/17950 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] ptrendx commented on issue #17984: Raise toolchain requirements for MXNet 2
ptrendx commented on issue #17984: Raise toolchain requirements for MXNet 2 URL: https://github.com/apache/incubator-mxnet/pull/17984#issuecomment-610675875 Note: With this we will be able to change shared_ptr here: https://github.com/apache/incubator-mxnet/blob/master/src/engine/threaded_engine.h#L440 to `unique_ptr` (as it was supposed to be but C++11 did not allow `std::move` to lambda). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] ptrendx commented on a change in pull request #17977: Relaxing type requirements for broadcast_like
ptrendx commented on a change in pull request #17977: Relaxing type requirements for broadcast_like URL: https://github.com/apache/incubator-mxnet/pull/17977#discussion_r405177613

## File path: src/operator/tensor/broadcast_reduce_op_value.cc ##

```
@@ -138,7 +138,16 @@ NNVM_REGISTER_OP(broadcast_like)
   [](const NodeAttrs& attrs) { return std::vector{"lhs", "rhs"}; })
-.set_attr("FInferType", ElemwiseType<2, 1>)
+.set_attr("FInferType", [](const nnvm::NodeAttrs& attrs,
+                           std::vector *in_attrs,
+                           std::vector *out_attrs) {
+  CHECK_EQ(in_attrs->size(), 2) << " in operator " << attrs.name;
+  std::vector checked_in_attrs = { (*in_attrs)[0] };
+  bool ret = !type_is_none((*in_attrs)[1]) &&
```

Review comment: Yes, it is necessary: what `FInferType` returns is whether it succeeded in inferring all the types (so that if all operators return true, we know that all types are inferred). That is why it is important not to lie, and to return true only if all types are really inferred (even if we do not actually do anything with the other type).

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
With regards, Apache Git Services
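The contract ptrendx describes can be modeled in a few lines of Python (a hedged sketch, not MXNet's actual `FInferType` signature): the function may fill in whatever types it can, but it must report success only once every input and output type is known, even for inputs whose type it never uses.

```python
def infer_type(in_attrs, out_attrs):
    """Toy FInferType for broadcast_like: the output takes lhs's dtype;
    rhs's dtype is never propagated, yet success still requires it to be
    known, so an all-true result really means "all types inferred"."""
    assert len(in_attrs) == 2
    if in_attrs[0] is not None:
        out_attrs[0] = in_attrs[0]  # output dtype comes from lhs only
    return all(t is not None for t in in_attrs) and out_attrs[0] is not None

print(infer_type(["float32", None], [None]))     # False: rhs dtype unknown
print(infer_type(["float32", "int64"], [None]))  # True: everything known
```

Returning `False` in the first case is what lets the graph pass run type inference again once the rhs type becomes available.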
[GitHub] [incubator-mxnet] samskalicky edited a comment on issue #17885: [WIP] MXNet Extensions enhancements
samskalicky edited a comment on issue #17885: [WIP] MXNet Extensions enhancements URL: https://github.com/apache/incubator-mxnet/pull/17885#issuecomment-610669060 @mxnet-bot run ci [sanity] This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] mxnet-bot commented on issue #17885: [WIP] MXNet Extensions enhancements
mxnet-bot commented on issue #17885: [WIP] MXNet Extensions enhancements URL: https://github.com/apache/incubator-mxnet/pull/17885#issuecomment-610669252 Jenkins CI successfully triggered : [sanity] This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] samskalicky commented on issue #17885: [WIP] MXNet Extensions enhancements
samskalicky commented on issue #17885: [WIP] MXNet Extensions enhancements URL: https://github.com/apache/incubator-mxnet/pull/17885#issuecomment-610669060 Jenkins CI successfully triggered : [sanity] This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] leezu commented on issue #17995: Fix ElemwiseSum for more than 4 inputs
leezu commented on issue #17995: Fix ElemwiseSum for more than 4 inputs URL: https://github.com/apache/incubator-mxnet/pull/17995#issuecomment-610666920 Should we add a test case in the style of https://github.com/apache/incubator-mxnet/issues/17989#issuecomment-610591835 to ensure no such bug makes it into the codebase again? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
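A regression test along the lines leezu suggests would sweep the input count across every kernel specialization boundary (the hand-unrolled 2-, 3- and 4-input paths and the generic 5+ loop). A NumPy-only sketch of the shape of such a test (the helper name is illustrative; a real MXNet test would call `mx.nd.add_n`, or run a backward pass with `grad_req='add'`, where the stand-in sum is):

```python
import numpy as np

def check_add_n_addto(n, shape=(4,)):
    """Oracle check: accumulating the sum of n inputs into an existing
    buffer (kAddTo semantics) must equal acc + sum(inputs)."""
    rng = np.random.default_rng(n)
    inputs = [rng.standard_normal(shape) for _ in range(n)]
    acc = rng.standard_normal(shape)
    expected = acc + np.sum(inputs, axis=0)
    # Stand-in for the operator under test.
    got = acc + np.add.reduce(inputs)
    np.testing.assert_allclose(got, expected, rtol=1e-6)
    return got

# Sweep past every specialization boundary, including the 5-7 range
# where the bug manifested.
for n in range(1, 12):
    check_add_n_addto(n)
```

The key point is the sweep itself: a test pinned to a single input count would have kept missing the 5-7 window.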
[GitHub] [incubator-mxnet] mxnet-bot commented on issue #17995: Fix ElemwiseSum for more than 4 inputs
mxnet-bot commented on issue #17995: Fix ElemwiseSum for more than 4 inputs URL: https://github.com/apache/incubator-mxnet/pull/17995#issuecomment-610661204

Hey @ptrendx, thanks for submitting the PR. All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:
- To trigger all jobs: @mxnet-bot run ci [all]
- To trigger specific jobs: @mxnet-bot run ci [job1, job2]

**CI supported jobs**: [windows-cpu, miscellaneous, unix-gpu, edge, windows-gpu, website, sanity, centos-cpu, clang, centos-gpu, unix-cpu]

_Note_: Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin. All CI tests must pass before the PR can be merged.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
With regards, Apache Git Services
[GitHub] [incubator-mxnet] ptrendx opened a new pull request #17995: Fix ElemwiseSum for more than 4 inputs
ptrendx opened a new pull request #17995: Fix ElemwiseSum for more than 4 inputs URL: https://github.com/apache/incubator-mxnet/pull/17995 ## Description ## Fixes #17989 It was caused by a bug in `ElemwiseSum` which for more than 4 inputs and `kAddTo` req was counting gradients multiple times. @sxjscience @zhreshold Please test if this fixes the issues you saw. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] emfomenk commented on issue #17980: When compiled with MKL, fully_connected calls DNNL while dot and batch_dot call MKL
emfomenk commented on issue #17980: When compiled with MKL, fully_connected calls DNNL while dot and batch_dot call MKL URL: https://github.com/apache/incubator-mxnet/issues/17980#issuecomment-610648623

> I tried single-threaded (OMP_NUM_THREADS=1) and still saw a performance drop. There shouldn't be much difference between OMP libraries then right?

Yeah, there should be zero difference in this case :)

> > I would also suggest to try upgrading the library to the most recent version
>
> See my post above "I also tried the latest DNNL". Performance was unchanged.

Didn't see that, thanks! Yeah, it looks like a performance bug indeed.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
With regards, Apache Git Services
[GitHub] [incubator-mxnet] D-Roberts commented on issue #17851: [Numpy] np.linalg.qr forward implementation
D-Roberts commented on issue #17851: [Numpy] np.linalg.qr forward implementation URL: https://github.com/apache/incubator-mxnet/pull/17851#issuecomment-610648177 @mxnet-bot run ci [unix-gpu] This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] mxnet-bot commented on issue #17851: [Numpy] np.linalg.qr forward implementation
mxnet-bot commented on issue #17851: [Numpy] np.linalg.qr forward implementation URL: https://github.com/apache/incubator-mxnet/pull/17851#issuecomment-610648212 Jenkins CI successfully triggered : [unix-gpu] This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] ptrendx commented on issue #17989: [Autograd] Very serious bug of grad_req='add'
ptrendx commented on issue #17989: [Autograd] Very serious bug of grad_req='add' URL: https://github.com/apache/incubator-mxnet/issues/17989#issuecomment-610646412 Yup, setting that env variable (undocumented BTW ;-) ) to a higher value makes the test fail for the higher cases too. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] kpu commented on issue #17980: When compiled with MKL, fully_connected calls DNNL while dot and batch_dot call MKL
kpu commented on issue #17980: When compiled with MKL, fully_connected calls DNNL while dot and batch_dot call MKL URL: https://github.com/apache/incubator-mxnet/issues/17980#issuecomment-610643813

> If Intel MKL is linked

MKL was linked in both cases, and in fact called in both cases from `dot`, just not from `FullyConnected`.

> libiomp5.so will be used instead of libgomp.so

I tried single-threaded (`OMP_NUM_THREADS=1`) and still saw a performance drop. There shouldn't be much difference between OMP libraries then, right?

> I would also suggest to try upgrading the library to the most recent version

See my post above: "I also tried the latest DNNL". Performance was unchanged.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
With regards, Apache Git Services
[GitHub] [incubator-mxnet] sxjscience commented on issue #17989: [Autograd] Very serious bug of grad_req='add'
sxjscience commented on issue #17989: [Autograd] Very serious bug of grad_req='add' URL: https://github.com/apache/incubator-mxnet/issues/17989#issuecomment-610643223 @ptrendx After checking the source code, I think it's due to the `MXNET_EXEC_INPLACE_GRAD_SUM_CAP` setting to 8 by default. The `ElementwiseSum` is used in the GraphExecutor to accumulate the gradient: https://github.com/apache/incubator-mxnet/blob/79c576b8157539d365cc9e0e1e355d4ca12f7374/src/executor/graph_executor.cc#L255-L263 And the inplace_sum_cap is by default 8: https://github.com/apache/incubator-mxnet/blob/79c576b8157539d365cc9e0e1e355d4ca12f7374/src/executor/graph_executor.cc#L228 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
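Because the cap is read from the environment, the failing range can be moved around for experiments (ptrendx confirms elsewhere in this thread that raising it makes higher repeat counts fail too). The variable has to be set before MXNet builds its executor, so in practice before the first `import mxnet` in a fresh process; a minimal sketch:

```python
import os

# Raise the in-place gradient-sum cap (default 8) so that larger fan-ins
# also take the in-place ElementwiseSum path. Must be set before the
# first `import mxnet` in the process to take effect.
os.environ["MXNET_EXEC_INPLACE_GRAD_SUM_CAP"] = "100"

cap = int(os.environ["MXNET_EXEC_INPLACE_GRAD_SUM_CAP"])
print(cap)  # 100
```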
[GitHub] [incubator-mxnet] ptrendx commented on issue #17989: [Autograd] Very serious bug of grad_req='add'
ptrendx commented on issue #17989: [Autograd] Very serious bug of grad_req='add' URL: https://github.com/apache/incubator-mxnet/issues/17989#issuecomment-610639500 @zhreshold And any op that you check introduces an elemwisesum node in the backward pass to aggregate the gradients, which you do not see in the model. As I said, I still do not understand why 8+ is fine; I would expect everything above 4 to fail (the elemwisesum implementation has special cases for 2, 3 and 4 inputs, and then a buggy one for 5+). I will look into it further. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] mxnet-bot commented on issue #17885: [WIP] MXNet Extensions enhancements
mxnet-bot commented on issue #17885: [WIP] MXNet Extensions enhancements URL: https://github.com/apache/incubator-mxnet/pull/17885#issuecomment-610630046 Jenkins CI successfully triggered : [sanity] This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] samskalicky commented on issue #17885: [WIP] MXNet Extensions enhancements
samskalicky commented on issue #17885: [WIP] MXNet Extensions enhancements URL: https://github.com/apache/incubator-mxnet/pull/17885#issuecomment-610629994 @mxnet-bot run ci [sanity] This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] blchu opened a new pull request #17994: Tensor cores used only for fp16 in interleaved multihead attention
blchu opened a new pull request #17994: Tensor cores used only for fp16 in interleaved multihead attention URL: https://github.com/apache/incubator-mxnet/pull/17994

## Description ##
Fixes an issue where fp32 inputs used tensor cores in the interleaved multihead attention operators, resulting in lower-precision calculations and a potential reduction in accuracy.

## Checklist ##
### Essentials ###
Please feel free to remove inapplicable items for your PR.
- [ ] Changes are complete (i.e. I finished coding on this PR)
- [ ] All changes have test coverage:
  - Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  - Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  - Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
- [ ] Code is well-documented:
  - For user-facing API changes, the API doc string has been updated.
  - For new C++ functions in header files, their functionality and arguments are documented.
  - For new examples, a README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
  - Check the API doc at https://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
- [ ] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

### Changes ###
- [ ] Set the interleaved multihead attention GEMMs to not use tensor cores by default, and to use them only when the input data type is fp16
- [ ] No longer check tensor input shapes for divisibility by 8

## Comments ##

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
With regards, Apache Git Services
[GitHub] [incubator-mxnet] mxnet-bot commented on issue #17994: Tensor cores used only for fp16 in interleaved multihead attention
mxnet-bot commented on issue #17994: Tensor cores used only for fp16 in interleaved multihead attention URL: https://github.com/apache/incubator-mxnet/pull/17994#issuecomment-610624507

Hey @blchu, thanks for submitting the PR. All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:
- To trigger all jobs: @mxnet-bot run ci [all]
- To trigger specific jobs: @mxnet-bot run ci [job1, job2]

**CI supported jobs**: [centos-cpu, unix-cpu, edge, centos-gpu, miscellaneous, windows-gpu, clang, sanity, unix-gpu, website, windows-cpu]

_Note_: Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin. All CI tests must pass before the PR can be merged.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
With regards, Apache Git Services
[GitHub] [incubator-mxnet] zhreshold commented on issue #17989: [Autograd] Very serious bug of grad_req='add'
zhreshold commented on issue #17989: [Autograd] Very serious bug of grad_req='add' URL: https://github.com/apache/incubator-mxnet/issues/17989#issuecomment-610623359 I don't think it's related to a particular op implementation; it's something that may not be working at all when autograd is introduced. See the experiments I did: https://github.com/apache/incubator-mxnet/issues/16708#issuecomment-558876214 What I did is replicate the same node N times: with N in (1, 2, 3, 4, 8, 9, 10, ...), the loss and gradients are always GOOD; however, with N in (5, 6, 7), the gradients diverge at the first iteration. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[incubator-mxnet] branch master updated (002d4f1 -> 79c576b)
This is an automated email from the ASF dual-hosted git repository.

skm pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git.

 from 002d4f1 * impl - FFi for linalg op (#17795)
 add  79c576b [ONNX export] Fixing spatial export for batchnorm (#17711)

No new revisions were added by this update.

Summary of changes:
 python/mxnet/contrib/onnx/mx2onnx/_op_translations.py | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)
[GitHub] [incubator-mxnet] sandeep-krishnamurthy merged pull request #17711: [ONNX export] Fixing spatial export for batchnorm
sandeep-krishnamurthy merged pull request #17711: [ONNX export] Fixing spatial export for batchnorm URL: https://github.com/apache/incubator-mxnet/pull/17711 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] ptrendx commented on issue #17989: [Autograd] Very serious bug of grad_req='add'
ptrendx commented on issue #17989: [Autograd] Very serious bug of grad_req='add' URL: https://github.com/apache/incubator-mxnet/issues/17989#issuecomment-610605942 It does not explain why it does the right thing for nrepeat=8 forward. There has to be something else going on there that limits elementwisesum to 7 inputs somehow. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] ptrendx commented on issue #17989: [Autograd] Very serious bug of grad_req='add'
ptrendx commented on issue #17989: [Autograd] Very serious bug of grad_req='add' URL: https://github.com/apache/incubator-mxnet/issues/17989#issuecomment-610605392

I don't see this issue in our (not yet released) container because I changed the ElementwiseSum implementation to be vectorized, and there I do have the proper logic:

```c
for (size_t i = 0; i < inputs.size(); i += num_inputs_per_kernel) {
  if (i == 0) {
    using Kernel = VectorizedElementwiseSumFwd;
    typename Kernel::ParamType params;
    params.num_inputs = std::min(num_inputs_per_kernel, inputs.size() - i);
    for (int j = 0; j < params.num_inputs; ++j) {
      params.inputs[j] = inputs[i + j].dptr();
    }
    params.outputs[0] = outputs[0].dptr();
    VectorizedKernelLauncher(size, s, params);
  } else {
    /* During subsequent launches we need to accumulate into the previous outputs */
    using Kernel = VectorizedElementwiseSumFwd;
    typename Kernel::ParamType params;
    params.num_inputs = std::min(num_inputs_per_kernel, inputs.size() - i);
    for (int j = 0; j < params.num_inputs; ++j) {
      params.inputs[j] = inputs[i + j].dptr();
    }
    params.outputs[0] = outputs[0].dptr();
    VectorizedKernelLauncher(size, s, params);
  }
}
```

where in the second branch I change the sum to use `kAddTo` and do not include the output as one of the parameters.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
With regards, Apache Git Services
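The multi-launch accumulation pattern described in that comment can be modeled in a few lines of NumPy (a sketch under assumed names; `NUM_INPUTS_PER_KERNEL` and the `addto` flag are illustrative, not MXNet's API): the first launch honors the caller's write request, while every later launch must accumulate into the partial result instead of overwriting it.

```python
import numpy as np

NUM_INPUTS_PER_KERNEL = 4  # illustrative per-launch chunk size

def chunked_sum(inputs, out, addto=False):
    """Model of a multi-launch elementwise sum: inputs are consumed in
    chunks, and every launch after the first accumulates (kAddTo)."""
    for i in range(0, len(inputs), NUM_INPUTS_PER_KERNEL):
        partial = np.sum(inputs[i:i + NUM_INPUTS_PER_KERNEL], axis=0)
        if i == 0 and not addto:
            out[:] = partial   # req == kWriteTo on the first launch only
        else:
            out += partial     # kAddTo for all subsequent launches
    return out

inputs = [np.full(3, 1.0) for _ in range(7)]
print(chunked_sum(inputs, np.zeros(3)))                   # [7. 7. 7.]
print(chunked_sum(inputs, np.full(3, 10.0), addto=True))  # [17. 17. 17.]
```

Note that the running output never appears as a summand, which is exactly what the broken 5+-input path got wrong.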
[GitHub] [incubator-mxnet] sxjscience commented on issue #17989: [Autograd] Very serious bug of grad_req='add'
sxjscience commented on issue #17989: [Autograd] Very serious bug of grad_req='add' URL: https://github.com/apache/incubator-mxnet/issues/17989#issuecomment-610605521 @ptrendx Thanks! I think that explains the cause. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] ptrendx commented on issue #17989: [Autograd] Very serious bug of grad_req='add'
ptrendx commented on issue #17989: [Autograd] Very serious bug of grad_req='add' URL: https://github.com/apache/incubator-mxnet/issues/17989#issuecomment-610604273

It is ElementwiseSum, although I'm not sure why after 7 repeats you get the correct result again. The code of ElementwiseSum for more than 4 inputs:

```c
default: {
  DType* in_0_dptr = in_data[0].dptr();
  Kernel::Launch(s, out_size, out_dptr, req[0], in_0_dptr);
  for (size_t i = 1; i < size; ++i) {
    DType* in_dptr = in_data[i].dptr();
    Kernel::Launch(s, out_size, out_dptr, req[0], out_dptr, in_dptr);
  }
  break;
}
```

which is wrong: if `req[0]` is kAddTo (which it will be in this case), after the first kernel you get

```
out = out + in_0
```

but then the subsequent ones, instead of doing

```
out = out + in_i
```

do

```
out += out + in_i
```

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
With regards, Apache Git Services
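The faulty recurrence is easy to reproduce outside MXNet. In this NumPy model of the two paths (function names are illustrative), five all-ones inputs accumulated into a zero buffer yield 2^5 - 1 = 31 per element instead of 5, since each step doubles the running total before adding the next input:

```python
import numpy as np

def elemwise_sum_buggy(inputs, out):
    # Broken 5+-input path with req == kAddTo: every launch after the
    # first re-adds the running output, i.e. out += out + in_i.
    out += inputs[0]
    for x in inputs[1:]:
        out += out + x  # bug: 'out' is both accumulator and summand
    return out

def elemwise_sum_fixed(inputs, out):
    # Correct kAddTo semantics: out += in_i for every input.
    for x in inputs:
        out += x
    return out

inputs = [np.ones(3) for _ in range(5)]
print(elemwise_sum_buggy(inputs, np.zeros(3)))  # [31. 31. 31.]
print(elemwise_sum_fixed(inputs, np.zeros(3)))  # [5. 5. 5.]
```

With a non-zero accumulator the pre-existing gradient gets doubled at every step as well, which matches the divergence reported at the first iteration.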
[GitHub] [incubator-mxnet] sxjscience commented on issue #17989: [Autograd] Very serious bug of grad_req='add'
sxjscience commented on issue #17989: [Autograd] Very serious bug of grad_req='add' URL: https://github.com/apache/incubator-mxnet/issues/17989#issuecomment-610599468 May be it's not related to specific implementation in the GPU side. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] ptrendx commented on issue #17989: [Autograd] Very serious bug of grad_req='add'
ptrendx commented on issue #17989: [Autograd] Very serious bug of grad_req='add' URL: https://github.com/apache/incubator-mxnet/issues/17989#issuecomment-610598084 I first tried on our container (which is based on 1.6.0), since that is the easiest thing for me to try first. When I tried both your numpy and ndarray small examples I see CPU exhibiting the error and GPU not, so looking into what could be different between those implementations.
[GitHub] [incubator-mxnet] sxjscience commented on issue #17989: [Gradient Addto] Very serious bug of grad_req='add'
sxjscience commented on issue #17989: [Gradient Addto] Very serious bug of grad_req='add' URL: https://github.com/apache/incubator-mxnet/issues/17989#issuecomment-610596656 @ptrendx I'm using a compiled version of master. Are you able to reproduce it using the script I attached at the beginning of the issue?
```
wget https://gist.githubusercontent.com/sxjscience/0bd336c921396b3c66331354e1866886/raw/d618ba69cbecf04d3013db77af86c29d62fe0336/grad_req_addto_bug.py -O grad_req_addto_bug.py
python grad_req_addto_bug.py --addto
python grad_req_addto_bug.py
```
[GitHub] [incubator-mxnet] sxjscience commented on issue #17989: [Gradient Addto] Very serious bug of grad_req='add'
sxjscience commented on issue #17989: [Gradient Addto] Very serious bug of grad_req='add' URL: https://github.com/apache/incubator-mxnet/issues/17989#issuecomment-610595164 @ptrendx @zhreshold @szha I tried to run with MXNet==1.0.0 but it gave me another error. The earliest version I can confirm that has this issue is 1.2.0. This is really critical and impacts the very basic functionality of a DL framework, i.e., autograd.
[GitHub] [incubator-mxnet] sxjscience commented on issue #17989: [Gradient Addto] Very serious bug of grad_req='add'
sxjscience commented on issue #17989: [Gradient Addto] Very serious bug of grad_req='add' URL: https://github.com/apache/incubator-mxnet/issues/17989#issuecomment-610591835 @ptrendx @szha @zhreshold I find that the bug also exists in 1.5.0, 1.4.0, 1.3.1, and 1.2.1. In fact, results on both CPU and GPU are wrong in these versions. A reproducible script is given as follows (I used the legacy mx.nd):
```python
import mxnet as mx
import numpy as np

for ctx in [mx.cpu(), mx.gpu()]:
    for nrepeat in range(1, 10):
        stored_grad = dict()
        for grad_req in ['write', 'add']:
            a = mx.nd.array([1], ctx=ctx)
            b = mx.nd.array([2], ctx=ctx)
            if grad_req == 'write':
                a.attach_grad(grad_req='write')
            elif grad_req == 'add':
                a.attach_grad(grad_req='add')
                a.grad[:] = 0
            with mx.autograd.record():
                for _ in range(nrepeat):
                    b = b * a
                b.backward()
            stored_grad[grad_req] = a.grad.asscalar()
        print('ctx={}, nrepeat={}, write={}, add={}'.format(
            ctx, nrepeat, stored_grad['write'], stored_grad['add']))
```
For MXNet 1.5.0, I used `pip install mxnet-cu101==1.5.0`
For MXNet 1.4.0, I used `pip install mxnet-cu92==1.4.0`
For MXNet 1.3.1, I used `pip install mxnet-cu92==1.3.1`
For MXNet 1.2.1, I used `pip install mxnet-cu92==1.2.1`
Output
```
ctx=cpu(0), nrepeat=1, write=2.0, add=2.0
ctx=cpu(0), nrepeat=2, write=4.0, add=4.0
ctx=cpu(0), nrepeat=3, write=6.0, add=6.0
ctx=cpu(0), nrepeat=4, write=8.0, add=8.0
ctx=cpu(0), nrepeat=5, write=10.0, add=62.0
ctx=cpu(0), nrepeat=6, write=12.0, add=126.0
ctx=cpu(0), nrepeat=7, write=14.0, add=254.0
ctx=cpu(0), nrepeat=8, write=16.0, add=16.0
ctx=cpu(0), nrepeat=9, write=18.0, add=18.0
ctx=gpu(0), nrepeat=1, write=2.0, add=2.0
ctx=gpu(0), nrepeat=2, write=4.0, add=4.0
ctx=gpu(0), nrepeat=3, write=6.0, add=6.0
ctx=gpu(0), nrepeat=4, write=8.0, add=8.0
ctx=gpu(0), nrepeat=5, write=10.0, add=62.0
ctx=gpu(0), nrepeat=6, write=12.0, add=126.0
ctx=gpu(0), nrepeat=7, write=14.0, add=254.0
ctx=gpu(0), nrepeat=8, write=16.0, add=16.0
ctx=gpu(0), nrepeat=9, write=18.0, add=18.0
```
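For reference, the `write` column matches the analytic gradient: after nrepeat multiplications, b = b0 * a**n, so d(b0 * a**n)/da = n * b0 * a**(n-1), which is 2n at a=1, b0=2. A quick sanity check in plain Python (independent of MXNet; the helper name is ours):

```python
def expected_grad(nrepeat, a=1.0, b0=2.0):
    # d(b0 * a**n)/da = n * b0 * a**(n - 1); equals 2n at a=1, b0=2
    return nrepeat * b0 * a ** (nrepeat - 1)

for n in range(1, 10):
    print('nrepeat={}, expected grad={}'.format(n, expected_grad(n)))
# 2.0, 4.0, ..., 18.0 -- the values the 'write' column reports
```

Any `add` entry that deviates from this sequence is therefore an accumulation bug, not an artifact of the test setup.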
[GitHub] [incubator-mxnet] ptrendx commented on issue #17989: [Gradient Addto] Very serious bug of grad_req='add'
ptrendx commented on issue #17989: [Gradient Addto] Very serious bug of grad_req='add' URL: https://github.com/apache/incubator-mxnet/issues/17989#issuecomment-610591415 Hmm, I just tried the latest script from @sxjscience and I got exactly the opposite results - GPU working as expected and CPU not working right:
```
ctx=cpu(0), nrepeat=1, write=2.0, add=2.0
ctx=cpu(0), nrepeat=2, write=4.0, add=4.0
ctx=cpu(0), nrepeat=3, write=6.0, add=6.0
ctx=cpu(0), nrepeat=4, write=8.0, add=8.0
ctx=cpu(0), nrepeat=5, write=10.0, add=62.0
ctx=cpu(0), nrepeat=6, write=12.0, add=126.0
ctx=cpu(0), nrepeat=7, write=14.0, add=254.0
ctx=cpu(0), nrepeat=8, write=16.0, add=16.0
ctx=cpu(0), nrepeat=9, write=18.0, add=18.0
ctx=gpu(0), nrepeat=1, write=2.0, add=2.0
ctx=gpu(0), nrepeat=2, write=4.0, add=4.0
ctx=gpu(0), nrepeat=3, write=6.0, add=6.0
ctx=gpu(0), nrepeat=4, write=8.0, add=8.0
ctx=gpu(0), nrepeat=5, write=10.0, add=10.0
ctx=gpu(0), nrepeat=6, write=12.0, add=12.0
ctx=gpu(0), nrepeat=7, write=14.0, add=14.0
ctx=gpu(0), nrepeat=8, write=16.0, add=16.0
ctx=gpu(0), nrepeat=9, write=18.0, add=18.0
```
I ran it ~15 times and always got the same result. What is the version of MXNet you tried it on, @sxjscience?
[GitHub] [incubator-mxnet] mxnet-bot commented on issue #17927: update ruby & jekyll, remove incompatible plugins
mxnet-bot commented on issue #17927: update ruby & jekyll, remove incompatible plugins URL: https://github.com/apache/incubator-mxnet/pull/17927#issuecomment-610560513 Jenkins CI successfully triggered : [unix-gpu]
[GitHub] [incubator-mxnet] aaronmarkham commented on issue #17927: update ruby & jekyll, remove incompatible plugins
aaronmarkham commented on issue #17927: update ruby & jekyll, remove incompatible plugins URL: https://github.com/apache/incubator-mxnet/pull/17927#issuecomment-610560438 @mxnet-bot run ci [unix-gpu]
[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.
This is an automated email from the ASF dual-hosted git repository. aaronmarkham pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git The following commit(s) were added to refs/heads/asf-site by this push: new 0485a5d Bump the publish timestamp. 0485a5d is described below commit 0485a5d695fc9c1a318a898fca32433dea716d0a Author: mxnet-ci AuthorDate: Tue Apr 7 18:45:26 2020 + Bump the publish timestamp. --- date.txt | 1 + 1 file changed, 1 insertion(+) diff --git a/date.txt b/date.txt new file mode 100644 index 000..9f8bc0a --- /dev/null +++ b/date.txt @@ -0,0 +1 @@ +Tue Apr 7 18:45:26 UTC 2020
[GitHub] [incubator-mxnet] D-Roberts commented on issue #17851: [Numpy] np.linalg.qr forward implementation
D-Roberts commented on issue #17851: [Numpy] np.linalg.qr forward implementation URL: https://github.com/apache/incubator-mxnet/pull/17851#issuecomment-610550941 @mxnet-bot run ci [all]
[GitHub] [incubator-mxnet] mxnet-bot commented on issue #17851: [Numpy] np.linalg.qr forward implementation
mxnet-bot commented on issue #17851: [Numpy] np.linalg.qr forward implementation URL: https://github.com/apache/incubator-mxnet/pull/17851#issuecomment-610551042 Jenkins CI successfully triggered : [centos-gpu, sanity, unix-gpu, miscellaneous, edge, centos-cpu, website, unix-cpu, clang, windows-cpu, windows-gpu]
[GitHub] [incubator-mxnet] ChaiBapchya commented on issue #17993: fix R error; backport 1 line from #17228
ChaiBapchya commented on issue #17993: fix R error; backport 1 line from #17228 URL: https://github.com/apache/incubator-mxnet/pull/17993#issuecomment-610542718 @mxnet-bot run ci [unix-cpu]
[GitHub] [incubator-mxnet] ChaiBapchya opened a new pull request #17993: fix R error; backport 1 line from #17228
ChaiBapchya opened a new pull request #17993: fix R error; backport 1 line from #17228 URL: https://github.com/apache/incubator-mxnet/pull/17993 Fix setRefClass not found issue by adding import of library `methods`
[GitHub] [incubator-mxnet] mxnet-bot commented on issue #17993: fix R error; backport 1 line from #17228
mxnet-bot commented on issue #17993: fix R error; backport 1 line from #17228 URL: https://github.com/apache/incubator-mxnet/pull/17993#issuecomment-610541518 Hey @ChaiBapchya , Thanks for submitting the PR All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands: - To trigger all jobs: @mxnet-bot run ci [all] - To trigger specific jobs: @mxnet-bot run ci [job1, job2] *** **CI supported jobs**: [centos-gpu, unix-gpu, website, centos-cpu, sanity, windows-cpu, windows-gpu, unix-cpu, miscellaneous, edge, clang] *** _Note_: Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin. All CI tests must pass before the PR can be merged.
[GitHub] [incubator-mxnet] leezu commented on issue #17937: Fix for handling negative indices in the fusion of slice
leezu commented on issue #17937: Fix for handling negative indices in the fusion of slice URL: https://github.com/apache/incubator-mxnet/pull/17937#issuecomment-61052 @ChaiBapchya the hang occurs after ~10 minutes but the instance will remain available until timeout happens (3 hours). So @ptrendx's suggestion should also work. I do remember that someone attempted this for the same or a similar issue before? I think there are logs in the internal issue tracker?
[GitHub] [incubator-mxnet] leezu commented on issue #17992: MKLDNNConvolutionBackward accesses out of bound elements
leezu commented on issue #17992: MKLDNNConvolutionBackward accesses out of bound elements URL: https://github.com/apache/incubator-mxnet/issues/17992#issuecomment-610524257 Notice that there are also other issues with this test https://github.com/apache/incubator-mxnet/pull/15631 https://github.com/apache/incubator-mxnet/issues/15638
[GitHub] [incubator-mxnet] leezu opened a new issue #17992: MKLDNNConvolutionBackward accesses out of bound elements
leezu opened a new issue #17992: MKLDNNConvolutionBackward accesses out of bound elements URL: https://github.com/apache/incubator-mxnet/issues/17992
## Description
CI with updated toolchain (ie #17984) catches the bug.
### Error Message
`vector::_M_range_check: __n (which is 2) >= this->size() (which is 2)`
## To Reproduce
Build with this simple patch
```diff
diff --git a/src/operator/nn/mkldnn/mkldnn_convolution.cc b/src/operator/nn/mkldnn/mkldnn_convolution.cc
index ada42a22c..95b44fd92 100644
--- a/src/operator/nn/mkldnn/mkldnn_convolution.cc
+++ b/src/operator/nn/mkldnn/mkldnn_convolution.cc
@@ -480,7 +480,7 @@ void MKLDNNConvolutionBackward(const nnvm::NodeAttrs& attrs, const OpContext
                                    {MKLDNN_ARG_DIFF_SRC, *in_grad_mem.second}});
       CommitOutput(in_grad[conv::kData], in_grad_mem);
     }
-    if (req[conv::kWeight] || req[conv::kBias]) {
+    if (req.at(conv::kWeight) || req.at(conv::kBias)) {
       if (convBwd.GetDataPd().diff_dst_desc() != convBwd.GetWeightsPd().diff_dst_desc())
         out_grad_mem = out_grad.GetMKLDNNDataReorder(convBwd.GetWeightsPd().diff_dst_desc());
       auto data_mem = data.GetMKLDNNDataReorder(convBwd.GetWeightsPd().src_desc());
```
OR follow the instructions in https://github.com/apache/incubator-mxnet/issues/17987 to trigger this via glibc assertions in a debug build. Run `test_operator.test_convolution_independent_gradients` to trigger the bug. cc @TaoLv
[GitHub] [incubator-mxnet] leezu commented on issue #17988: Row-sparse constant initializer accesses out of bound elements
leezu commented on issue #17988: Row-sparse constant initializer accesses out of bound elements URL: https://github.com/apache/incubator-mxnet/issues/17988#issuecomment-610520256 Also affects `test_rsp_const_init` in the Perl test suite
[GitHub] [incubator-mxnet] aaronmarkham commented on issue #17959: [MKLDNN] Add LSTMP to v1.6.x
aaronmarkham commented on issue #17959: [MKLDNN] Add LSTMP to v1.6.x URL: https://github.com/apache/incubator-mxnet/pull/17959#issuecomment-610517907
> The issue related to R package test is fixed with importing the function first, using `library(methods)`. I've made the change to v1.6.x branch [testing it locally now].
>
> Referring the doc for Reproducing the tests: https://cwiki.apache.org/confluence/display/MXNET/Reproducing+test+results
>
> 1. Requirements
>    ```
>    pip3 install -r ci/requirements.txt --user
>    ```
> 2. Build
>    ```
>    ci/build.py --docker-registry mxnetci --platform ubuntu_cpu --docker-build-retries 3 --shm-size 500m /work/runtime_functions.sh build_ubuntu_cpu_openblas
>    ```
> 3. Test
>    ```
>    ci/build.py --docker-registry mxnetci --platform ubuntu_cpu --docker-build-retries 3 --shm-size 500m /work/runtime_functions.sh unittest_ubuntu_cpu_R
>    ```

R docs are broken on 1.6.x so maybe this fix can be applied to that pipeline too? https://github.com/apache/incubator-mxnet/issues/17920
[GitHub] [incubator-mxnet] rondogency commented on issue #17762: Custom Operator Random Number Generator Support
rondogency commented on issue #17762: Custom Operator Random Number Generator Support URL: https://github.com/apache/incubator-mxnet/pull/17762#issuecomment-610516730 @mxnet-bot run ci [unix-gpu, windows-gpu]
[GitHub] [incubator-mxnet] mxnet-bot commented on issue #17762: Custom Operator Random Number Generator Support
mxnet-bot commented on issue #17762: Custom Operator Random Number Generator Support URL: https://github.com/apache/incubator-mxnet/pull/17762#issuecomment-610516774 Jenkins CI successfully triggered : [windows-gpu, unix-gpu]
[GitHub] [incubator-mxnet] stu1130 commented on issue #17872: Fix issue of zeros gradients w.r.t. RNN bias when num_layers > 1
stu1130 commented on issue #17872: Fix issue of zeros gradients w.r.t. RNN bias when num_layers > 1 URL: https://github.com/apache/incubator-mxnet/pull/17872#issuecomment-610514553 @zixuanweeei Thanks for your contribution, could you also cherry-pick the commit to 1.7? DJL LSTM model depends on this commit. Thanks!
[GitHub] [incubator-mxnet] mxnet-bot commented on issue #17934: Remove duplicate condition
mxnet-bot commented on issue #17934: Remove duplicate condition URL: https://github.com/apache/incubator-mxnet/pull/17934#issuecomment-610511873 Jenkins CI successfully triggered : [unix-gpu]
[GitHub] [incubator-mxnet] gaurav1086 commented on issue #17934: Remove duplicate condition
gaurav1086 commented on issue #17934: Remove duplicate condition URL: https://github.com/apache/incubator-mxnet/pull/17934#issuecomment-610511798 @mxnet-bot run ci [unix-gpu]
[GitHub] [incubator-mxnet] ChaiBapchya commented on issue #17959: [MKLDNN] Add LSTMP to v1.6.x
ChaiBapchya commented on issue #17959: [MKLDNN] Add LSTMP to v1.6.x URL: https://github.com/apache/incubator-mxnet/pull/17959#issuecomment-610502426 The issue related to R package test is fixed with importing the function first, using `library(methods)`. I've made the change to v1.6.x branch [testing it locally now].

Referring the doc for Reproducing the tests: https://cwiki.apache.org/confluence/display/MXNET/Reproducing+test+results

1. Requirements
```
pip3 install -r ci/requirements.txt --user
```
2. Build
```
ci/build.py --docker-registry mxnetci --platform ubuntu_cpu --docker-build-retries 3 --shm-size 500m /work/runtime_functions.sh build_ubuntu_cpu_openblas
```
3. Test
```
ci/build.py --docker-registry mxnetci --platform ubuntu_cpu --docker-build-retries 3 --shm-size 500m /work/runtime_functions.sh unittest_ubuntu_cpu_R
```