[GitHub] asitstands commented on issue #11268: A binary RBM example

2018-07-27 Thread GitBox
asitstands commented on issue #11268: A binary RBM example
URL: https://github.com/apache/incubator-mxnet/pull/11268#issuecomment-408584999
 
 
   Now the log-likelihoods of the test and training data are reported at the 
completion of each epoch. They are estimated using AIS (annealed importance 
sampling). The README shows some samples generated from the RBM. 




[GitHub] szha commented on issue #11268: A binary RBM example

2018-07-27 Thread GitBox
szha commented on issue #11268: A binary RBM example
URL: https://github.com/apache/incubator-mxnet/pull/11268#issuecomment-408584838
 
 
   @asitstands thanks for updating the PR. @yifeim would you mind taking 
another pass at this PR?




[incubator-mxnet] branch v1.2.0 updated: update 1.2.0. announcement (#11917)

2018-07-27 Thread anirudh2290
This is an automated email from the ASF dual-hosted git repository.

anirudh2290 pushed a commit to branch v1.2.0
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/v1.2.0 by this push:
 new d8a4f5a  update 1.2.0. announcement (#11917)
d8a4f5a is described below

commit d8a4f5aceefe4adf1ff092981fee505678fdac3d
Author: Aaron Markham 
AuthorDate: Fri Jul 27 20:59:53 2018 -0700

update 1.2.0. announcement (#11917)
---
 docs/_static/mxnet-theme/index.html | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/_static/mxnet-theme/index.html 
b/docs/_static/mxnet-theme/index.html
index 005bc88..3647e23 100644
--- a/docs/_static/mxnet-theme/index.html
+++ b/docs/_static/mxnet-theme/index.html
@@ -9,7 +9,7 @@
 Install
 
 
-Learn More
+Learn More
 
 
 
@@ -26,9 +26,9 @@
 http://gluon-crash-course.mxnet.io/">Learn More
   
   
-MXNet 1.2.0.rc0 Released
-We're excited to announce the release of MXNet 1.2.0.rc0! Check out the release notes for latest updates.
-https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+%28incubating%29+1.2.0+Release+Notes">Learn More
+MXNet 1.2.1 Released
+We're excited to announce the release of MXNet 1.2.1! Check out the release notes for latest updates.
+https://github.com/apache/incubator-mxnet/releases/tag/1.2.1">Learn More
   
   
   Introducing the Scala Inference API



[GitHub] anirudh2290 closed pull request #11917: update home page for 1.2.1 announcement

2018-07-27 Thread GitBox
anirudh2290 closed pull request #11917: update home page for 1.2.1 announcement
URL: https://github.com/apache/incubator-mxnet/pull/11917
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/docs/_static/mxnet-theme/index.html 
b/docs/_static/mxnet-theme/index.html
index 005bc88f255..3647e23a736 100644
--- a/docs/_static/mxnet-theme/index.html
+++ b/docs/_static/mxnet-theme/index.html
@@ -9,7 +9,7 @@
 Install
 
 
-Learn More
+Learn More
 
 
 
@@ -26,9 +26,9 @@ A 60-minute Gluon Crash Course
 http://gluon-crash-course.mxnet.io/">Learn More
   
   
-MXNet 1.2.0.rc0 Released
-We're excited to announce the release of MXNet 1.2.0.rc0! Check out the release notes for latest updates.
-https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+%28incubating%29+1.2.0+Release+Notes">Learn More
+MXNet 1.2.1 Released
+We're excited to announce the release of MXNet 1.2.1! Check out the release notes for latest updates.
+https://github.com/apache/incubator-mxnet/releases/tag/1.2.1">Learn More
   
   
   Introducing the Scala Inference API


 




[GitHub] szha commented on issue #9388: Official version of pretrained MobileNet/ShuffleNet/NASNet is available?

2018-07-27 Thread GitBox
szha commented on issue #9388: Official version of pretrained 
MobileNet/ShuffleNet/NASNet is available?
URL: 
https://github.com/apache/incubator-mxnet/issues/9388#issuecomment-408571404
 
 
   @jmnie it's included in the latest versions of mxnet.
   ```
   from mxnet import gluon
   net = gluon.model_zoo.vision.resnet50_v1(pretrained=True)
   ```
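   Since the question was specifically about MobileNet, the Gluon model zoo exposes it under a similar name; a minimal sketch (the `mobilenet1_0` width-multiplier variant is one of several available):
   ```
   from mxnet import gluon

   # MobileNet (width multiplier 1.0) from the Gluon model zoo;
   # pretrained weights are downloaded on first use
   net = gluon.model_zoo.vision.mobilenet1_0(pretrained=True)
   ```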




[GitHub] zheng-da commented on issue #11325: [MXNET-703] TensorRT runtime integration

2018-07-27 Thread GitBox
zheng-da commented on issue #11325: [MXNET-703] TensorRT runtime integration
URL: https://github.com/apache/incubator-mxnet/pull/11325#issuecomment-408570652
 
 
   I understand this is an experimental integration. It changes the way MXNet 
is used (users have to pass parameters via `shared_buffer` when binding the 
executor, and it doesn't support Module or Gluon hybridize). If these 
problems are fixed in follow-up PRs, the PR looks fine to me.
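
   For context, a minimal sketch of the inference-only binding being discussed. The toy symbol and parameter values are hypothetical; the relevant pieces are `grad_req='null'` and `shared_buffer`:
   ```
   import mxnet as mx

   # hypothetical one-layer network standing in for a real inference graph
   data = mx.sym.Variable('data')
   net = mx.sym.FullyConnected(data, num_hidden=10, name='fc')

   # parameters are handed over up front via shared_buffer so the TensorRT
   # engine builder can bake the weights into the engine
   params = {'fc_weight': mx.nd.ones((10, 4)), 'fc_bias': mx.nd.zeros((10,))}

   # with a TensorRT-enabled build one would also set MXNET_USE_TENSORRT=1 and
   # bind on a GPU context; mx.cpu() keeps this sketch runnable anywhere
   exe = net.simple_bind(mx.cpu(), data=(1, 4),
                         grad_req='null',        # inference-only graph
                         shared_buffer=params)
   out = exe.forward(is_train=False, data=mx.nd.ones((1, 4)))[0]
   ```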




[GitHub] anirudh2290 closed pull request #11630: Fix flaky test test_deconvolution

2018-07-27 Thread GitBox
anirudh2290 closed pull request #11630: Fix flaky test test_deconvolution
URL: https://github.com/apache/incubator-mxnet/pull/11630
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/src/operator/linalg_impl.h b/src/operator/linalg_impl.h
index 08d2add28eb..c0ae97ad3a4 100644
--- a/src/operator/linalg_impl.h
+++ b/src/operator/linalg_impl.h
@@ -169,23 +169,52 @@ void linalg_gemm(const Tensor<cpu, 2, DType>& A, const Tensor<cpu, 2, DType>& B,
-#define LINALG_GPU_GEMM(fname, DType) \
-template<> inline \
-void linalg_gemm<gpu, DType>(const Tensor<gpu, 2, DType>& A, const Tensor<gpu, 2, DType>& B, \
-                             const Tensor<gpu, 2, DType>& C, DType alpha, DType beta, \
-                             bool tA, bool tB, Stream<gpu> *s) { \
-  using namespace mxnet; \
-  using mshadow::gpu; \
-  CHECK_NOTNULL(s); \
-  check_gemm(A, B, C, alpha, beta, tA, tB); \
-  CUBLAS_CALL(cublas##fname(Stream<gpu>::GetBlasHandle(s), \
-                            (tB ? CUBLAS_OP_T : CUBLAS_OP_N), \
-                            (tA ? CUBLAS_OP_T : CUBLAS_OP_N), \
-                            C.size(1), C.size(0), (tB ? B.size(1) : B.size(0)), \
-                            &alpha, B.dptr_, B.stride_, A.dptr_, A.stride_, \
-                            &beta, C.dptr_, C.stride_)) \
+#define LINALG_GPU_GEMM(fname, DType) \
+  template <> \
+  inline void linalg_gemm<gpu, DType>( \
+      const Tensor<gpu, 2, DType>& A, const Tensor<gpu, 2, DType>& B, \
+      const Tensor<gpu, 2, DType>& C, DType alpha, DType beta, bool tA, \
+      bool tB, Stream<gpu>* s) { \
+    using namespace mxnet; \
+    using mshadow::gpu; \
+    CHECK_NOTNULL(s); \
+    check_gemm(A, B, C, alpha, beta, tA, tB); \
+    CUBLAS_CALL(cublas##fname( \
+        Stream<gpu>::GetBlasHandle(s), (tB ? CUBLAS_OP_T : CUBLAS_OP_N), \
+        (tA ? CUBLAS_OP_T : CUBLAS_OP_N), C.size(1), C.size(0), \
+        (tB ? B.size(1) : B.size(0)), &alpha, B.dptr_, B.stride_, A.dptr_, \
+        A.stride_, &beta, C.dptr_, C.stride_)) \
+  }
+
+// Use cublasSgemmEx when it is available (CUDA >= 7.5). Resolves precision 
+// issues with cublasSgemm. Please see https://github.com/apache/incubator-mxnet/pull/11630
+#if CUDA_VERSION >= 7050
+template <>
+inline void linalg_gemm<gpu, float>(const Tensor<gpu, 2, float>& A,
+                                    const Tensor<gpu, 2, float>& B,
+                                    const Tensor<gpu, 2, float>& C, float alpha,
+                                    float beta, bool tA, bool tB,
+                                    Stream<gpu>* s) {
+  using namespace mxnet;
+  using mshadow::gpu;
+  CHECK_NOTNULL(s);
+  check_gemm(A, B, C, alpha, beta, tA, tB);
+#if CUDA_VERSION >= 8000
+  cudaDataType_t full_datatype = CUDA_R_32F;
+#else
+  cublasDataType_t full_datatype = CUBLAS_DATA_FULL;
+#endif
+  CUBLAS_CALL(cublasSgemmEx(
+      Stream<gpu>::GetBlasHandle(s), (tB ? CUBLAS_OP_T : CUBLAS_OP_N),
+      (tA ? CUBLAS_OP_T : CUBLAS_OP_N), C.size(1), C.size(0),
+      (tB ? B.size(1) : B.size(0)), &alpha, B.dptr_, full_datatype, B.stride_,
+      A.dptr_, full_datatype, A.stride_, &beta, C.dptr_, full_datatype,
+      C.stride_))
 }
+
+#else
 LINALG_GPU_GEMM(Sgemm, float)
+#endif
 LINALG_GPU_GEMM(Dgemm, double)
 
 // Version where matrix rows are given by first axis.


 




[GitHub] rahul003 edited a comment on issue #11913: Unexpectedly poor copy() performance

2018-07-27 Thread GitBox
rahul003 edited a comment on issue #11913: Unexpectedly poor copy() performance
URL: 
https://github.com/apache/incubator-mxnet/issues/11913#issuecomment-408567946
 
 
   Just going by the script, could you put a waitall before your first time() 
call to ensure we don't factor in time to create the array?
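
   A minimal sketch of the suggested measurement, assuming the shape and the rest of the harness (only `copy()` and the `waitall` placement come from this thread):
   ```
   import time
   import mxnet as mx

   a = mx.nd.random.uniform(shape=(8192, 8192))
   mx.nd.waitall()            # drain the async engine so array creation isn't timed
   start = time.time()
   b = a.copy()
   b.wait_to_read()           # block until the copy itself has completed
   print('copy took %.4f s' % (time.time() - start))
   ```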




[GitHub] rahul003 commented on issue #11919: Accuracy changes with number of GPUs

2018-07-27 Thread GitBox
rahul003 commented on issue #11919: Accuracy changes with number of GPUs
URL: 
https://github.com/apache/incubator-mxnet/issues/11919#issuecomment-408570462
 
 
   You would need to change your learning rate based on the total batch size, 
generally proportional to the batch size (as the number of steps the training 
takes halves in the latter case). Try using lr 0.02 for the latter case. 
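
   This is the usual linear-scaling heuristic; the baseline numbers below are hypothetical and only illustrate the arithmetic behind the 0.02 suggestion:
   ```
   base_lr, base_batch = 0.01, 256   # hypothetical single-node baseline
   new_batch = 2 * base_batch        # 2x the GPUs at the same per-GPU batch size
   new_lr = base_lr * new_batch / base_batch
   print(new_lr)                     # 0.02
   ```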




[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.

2018-07-27 Thread zhasheng
This is an automated email from the ASF dual-hosted git repository.

zhasheng pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 6d5f070  Bump the publish timestamp.
6d5f070 is described below

commit 6d5f070f89b5aa347b057ddae4ab51432f973b86
Author: mxnet-ci 
AuthorDate: Sat Jul 28 00:45:46 2018 +

Bump the publish timestamp.
---
 date.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/date.txt b/date.txt
new file mode 100644
index 000..b304434
--- /dev/null
+++ b/date.txt
@@ -0,0 +1 @@
+Sat Jul 28 00:45:46 UTC 2018



[GitHub] mkolod commented on issue #11325: [MXNET-703] TensorRT runtime integration

2018-07-27 Thread GitBox
mkolod commented on issue #11325: [MXNET-703] TensorRT runtime integration
URL: https://github.com/apache/incubator-mxnet/pull/11325#issuecomment-408569907
 
 
   @piiswrong Sorry, I meant [this 
update](https://github.com/mkolod/incubator-mxnet/commit/2a114665ce9342dbb808d9de63cda99fe209a415).




[GitHub] andrewfayres commented on issue #11885: Fix JNI custom op code from deregistering the operator fixes #10438

2018-07-27 Thread GitBox
andrewfayres commented on issue #11885: Fix JNI custom op code from 
deregistering the operator fixes #10438
URL: https://github.com/apache/incubator-mxnet/pull/11885#issuecomment-408569886
 
 
   It's more a question of exactly what you want to test. We've got tests for 
custom operators already and there's already work going on to verify model 
backward compatibility.




[GitHub] mkolod removed a comment on issue #11325: [MXNET-703] TensorRT runtime integration

2018-07-27 Thread GitBox
mkolod removed a comment on issue #11325: [MXNET-703] TensorRT runtime 
integration
URL: https://github.com/apache/incubator-mxnet/pull/11325#issuecomment-408569730
 
 
   @piiswrong Sorry, I meant [this 
update](https://github.com/mkolod/incubator-mxnet/commit/84015a5be82b9097aaed94bac7b74efb177be26f),
 not the one above.




[GitHub] mkolod commented on issue #11325: [MXNET-703] TensorRT runtime integration

2018-07-27 Thread GitBox
mkolod commented on issue #11325: [MXNET-703] TensorRT runtime integration
URL: https://github.com/apache/incubator-mxnet/pull/11325#issuecomment-408569730
 
 
   @piiswrong Sorry, I meant [this 
update](https://github.com/mkolod/incubator-mxnet/commit/84015a5be82b9097aaed94bac7b74efb177be26f),
 not the one above.




[GitHub] mkolod commented on issue #11325: [MXNET-703] TensorRT runtime integration

2018-07-27 Thread GitBox
mkolod commented on issue #11325: [MXNET-703] TensorRT runtime integration
URL: https://github.com/apache/incubator-mxnet/pull/11325#issuecomment-408569285
 
 
   @piiswrong 
[Here](https://github.com/mkolod/incubator-mxnet/commit/84015a5be82b9097aaed94bac7b74efb177be26f)
 is the update. Now, when `MXNET_USE_TENSORRT=1` and `grad_req != 'null'`, a 
user will get a warning, and execution will proceed without TensorRT. The 
TensorRT pass will only run if both `MXNET_USE_TENSORRT=1` and 
`grad_req='null'`.




[GitHub] rahul003 commented on issue #11913: Unexpectedly poor copy() performance

2018-07-27 Thread GitBox
rahul003 commented on issue #11913: Unexpectedly poor copy() performance
URL: 
https://github.com/apache/incubator-mxnet/issues/11913#issuecomment-408567946
 
 
   Just going by the script, could you put a waitall before your first time() 
call to ensure we don't factor in time to create the array.




[GitHub] kpmurali opened a new pull request #11921: [MXNET-711] Added updated logos to the powered by page

2018-07-27 Thread GitBox
kpmurali opened a new pull request #11921: [MXNET-711] Added updated logos to 
the powered by page
URL: https://github.com/apache/incubator-mxnet/pull/11921
 
 
   ## Description ##
   Updating logos to the powered by page
   
   ## Checklist ##
   ### Changes ###
   - [x] Added updated logos to the powered by page
   




[GitHub] rahul003 commented on issue #11855: Distributed learning with Async update does not work.

2018-07-27 Thread GitBox
rahul003 commented on issue #11855: Distributed learning with Async update does 
not work.
URL: 
https://github.com/apache/incubator-mxnet/issues/11855#issuecomment-408565519
 
 
   What optimizer are you using?




[GitHub] rahul003 commented on a change in pull request #11910: Improving documentation and error messages for Async distributed training with Gluon

2018-07-27 Thread GitBox
rahul003 commented on a change in pull request #11910: Improving documentation 
and error messages for Async distributed training with Gluon
URL: https://github.com/apache/incubator-mxnet/pull/11910#discussion_r205923132
 
 

 ##
 File path: docs/faq/distributed_training.md
 ##
 @@ -73,6 +73,13 @@ These can be passed as arguments to the iterator.
 You can look at 
[example/gluon/image_classification.py](https://github.com/apache/incubator-mxnet/blob/master/example/gluon/image_classification.py)
 to see an example usage.
 
+### Updating weights
+The KVStore server supports two modes: one in which it aggregates gradients and updates the weights using them, and a second in which it only aggregates gradients. In the latter case, when a worker process pulls from the kvstore, it gets the aggregated gradients and uses them to update the weights locally.
+
+When using Gluon, you can choose between these modes by passing the `update_on_kvstore` argument when you create the [Trainer](https://mxnet.incubator.apache.org/versions/master/api/python/gluon/gluon.html#mxnet.gluon.Trainer) object. 
 
 Review comment:
   ok
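
   For readers of this thread, a minimal sketch of the option under discussion; the network and optimizer settings are made up, and `'local'` stands in for the `dist_async` kvstore, which needs a distributed launcher:
   ```
   import mxnet as mx
   from mxnet import gluon

   net = gluon.nn.Dense(10)
   net.initialize()

   kv = mx.kv.create('local')  # would be 'dist_async' in real async training
   trainer = gluon.Trainer(net.collect_params(), 'sgd',
                           {'learning_rate': 0.01},
                           kvstore=kv,
                           update_on_kvstore=False)  # aggregate on server, update weights locally
   ```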




[GitHub] rahul003 commented on a change in pull request #11910: Improving documentation and error messages for Async distributed training with Gluon

2018-07-27 Thread GitBox
rahul003 commented on a change in pull request #11910: Improving documentation 
and error messages for Async distributed training with Gluon
URL: https://github.com/apache/incubator-mxnet/pull/11910#discussion_r205921780
 
 

 ##
 File path: python/mxnet/gluon/trainer.py
 ##
 @@ -187,6 +187,11 @@ def _init_kvstore(self):
 arg_arrays = {param.name: param.data(self._contexts[0]) for param 
in self._params}
 kvstore, update_on_kvstore = _create_kvstore(config['kvstore'], 
len(self._contexts),
  arg_arrays)
+if kvstore and 'async' in kvstore.type and config['update_on_kvstore'] is not None\
 
 Review comment:
   If the user does not set that variable explicitly (the default), then I set 
it to the right value. If the user explicitly sets it to False, then the error 
is raised.
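
   A minimal sketch of that logic, with names assumed from the quoted `trainer.py` excerpt rather than copied from the actual patch:
   ```
   # hypothetical condensation of the check described above
   if kvstore and 'async' in kvstore.type:
       if config['update_on_kvstore'] is None:
           update_on_kvstore = True   # default case: pick the value async mode requires
       elif not config['update_on_kvstore']:
           raise ValueError('Please set update_on_kvstore to true '
                            'when training in async mode.')
   ```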




[GitHub] szha commented on a change in pull request #11920: Remove fixed seed for test_sparse_nd_save_load

2018-07-27 Thread GitBox
szha commented on a change in pull request #11920: Remove fixed seed for 
test_sparse_nd_save_load
URL: https://github.com/apache/incubator-mxnet/pull/11920#discussion_r205921827
 
 

 ##
 File path: tests/python/unittest/test_sparse_ndarray.py
 ##
 @@ -534,7 +534,9 @@ def test_sparse_nd_pickle():
 assert same(a.asnumpy(), b.asnumpy())
 
 
-@with_seed(0)
+# @kalyc: Getting rid of fixed seed as flakiness could not be reproduced
+# tracked at https://github.com/apache/incubator-mxnet/issues/11741
+@with_seed()
 
 Review comment:
   I see




[GitHub] szha commented on issue #11919: Accuracy changes with number of GPUs

2018-07-27 Thread GitBox
szha commented on issue #11919: Accuracy changes with number of GPUs
URL: 
https://github.com/apache/incubator-mxnet/issues/11919#issuecomment-408563667
 
 
   Since the actual batch size differs by 2x it's not surprising that accuracy 
can be different.




[GitHub] kalyc commented on a change in pull request #11920: Remove fixed seed for test_sparse_nd_save_load

2018-07-27 Thread GitBox
kalyc commented on a change in pull request #11920: Remove fixed seed for 
test_sparse_nd_save_load
URL: https://github.com/apache/incubator-mxnet/pull/11920#discussion_r205921622
 
 

 ##
 File path: tests/python/unittest/test_sparse_ndarray.py
 ##
 @@ -534,7 +534,9 @@ def test_sparse_nd_pickle():
 assert same(a.asnumpy(), b.asnumpy())
 
 
-@with_seed(0)
+# @kalyc: Getting rid of fixed seed as flakiness could not be reproduced
+# tracked at https://github.com/apache/incubator-mxnet/issues/11741
+@with_seed()
 
 Review comment:
   See comments by @haojin2 above




[GitHub] szha commented on a change in pull request #11920: Remove fixed seed for test_sparse_nd_save_load

2018-07-27 Thread GitBox
szha commented on a change in pull request #11920: Remove fixed seed for 
test_sparse_nd_save_load
URL: https://github.com/apache/incubator-mxnet/pull/11920#discussion_r205921457
 
 

 ##
 File path: tests/python/unittest/test_sparse_ndarray.py
 ##
 @@ -534,7 +534,9 @@ def test_sparse_nd_pickle():
 assert same(a.asnumpy(), b.asnumpy())
 
 
-@with_seed(0)
+# @kalyc: Getting rid of fixed seed as flakiness could not be reproduced
+# tracked at https://github.com/apache/incubator-mxnet/issues/11741
+@with_seed()
 
 Review comment:
   no need to add comment if not flaky.




[GitHub] apeforest commented on issue #11841: All the tests in tools/coreml package are failing

2018-07-27 Thread GitBox
apeforest commented on issue #11841: All the tests in tools/coreml package are 
failing
URL: 
https://github.com/apache/incubator-mxnet/issues/11841#issuecomment-408562366
 
 
   Added tests to only load the CoreML model without running prediction.




[GitHub] nswamy closed issue #8545: Incorrect results from R 3.4.2 in MNIST

2018-07-27 Thread GitBox
nswamy closed issue #8545: Incorrect results from R 3.4.2 in MNIST
URL: https://github.com/apache/incubator-mxnet/issues/8545
 
 
   




[GitHub] nswamy closed issue #10147: Mxnet R Package Installation Doc Bug

2018-07-27 Thread GitBox
nswamy closed issue #10147: Mxnet R Package Installation Doc Bug
URL: https://github.com/apache/incubator-mxnet/issues/10147
 
 
   




[GitHub] harryprince opened a new issue #10147: Mxnet R Package Installation Doc Bug

2018-07-27 Thread GitBox
harryprince opened a new issue #10147: Mxnet R Package Installation Doc Bug
URL: https://github.com/apache/incubator-mxnet/issues/10147
 
 
   Wrong:
   
   ```
   # current repo doc: https://github.com/apache/incubator-mxnet/tree/master/R-package
   cran <- getOption("repos")
   cran["dmlc"] <- "https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/R/CRAN/"
   options(repos = cran)
   install.packages("mxnet", dependencies = T)
   ```
   
   Correct:
   ```
   cran <- getOption("repos")
   cran["dmlc"] <- "https://s3-us-west-2.amazonaws.com/apache-mxnet/R/CRAN/"
   options(repos = cran)
   install.packages("mxnet", dependencies = T)
   ```
   
   ## Reference
   
   https://stackoverflow.com/questions/43872455/mxnet-package-installation-in-r
   
   




[GitHub] nswamy closed issue #10147: Mxnet R Package Installation Doc Bug

2018-07-27 Thread GitBox
nswamy closed issue #10147: Mxnet R Package Installation Doc Bug
URL: https://github.com/apache/incubator-mxnet/issues/10147
 
 
   




[GitHub] nswamy closed issue #10928: Optimizers memory usage

2018-07-27 Thread GitBox
nswamy closed issue #10928: Optimizers memory usage
URL: https://github.com/apache/incubator-mxnet/issues/10928
 
 
   




[GitHub] nswamy commented on issue #10928: Optimizers memory usage

2018-07-27 Thread GitBox
nswamy commented on issue #10928: Optimizers memory usage
URL: 
https://github.com/apache/incubator-mxnet/issues/10928#issuecomment-408561315
 
 
   Closing this issue as the referenced PR seems to resolve it; feel free to 
open a new issue if you still find problems.




[GitHub] nswamy commented on issue #11822: Install from pre-built binaries failing

2018-07-27 Thread GitBox
nswamy commented on issue #11822: Install from pre-built binaries failing
URL: 
https://github.com/apache/incubator-mxnet/issues/11822#issuecomment-408560973
 
 
   closing this issue as it seems to be resolved.




[GitHub] nswamy closed issue #11822: Install from pre-built binaries failing

2018-07-27 Thread GitBox
nswamy closed issue #11822: Install from pre-built binaries failing
URL: https://github.com/apache/incubator-mxnet/issues/11822
 
 
   




[GitHub] kalyc commented on issue #11741: test_sparse_ndarray.test_sparse_nd_save_load has fixed seed that can mask flakiness

2018-07-27 Thread GitBox
kalyc commented on issue #11741: test_sparse_ndarray.test_sparse_nd_save_load 
has fixed seed that can mask flakiness
URL: 
https://github.com/apache/incubator-mxnet/issues/11741#issuecomment-408560463
 
 
   For reference, the seed was initially set to 0.
   




[GitHub] nswamy closed issue #8792: Different training performance between mxnet v. 0.11.0 and v. 0.12.1

2018-07-27 Thread GitBox
nswamy closed issue #8792: Different training performance between mxnet v. 
0.11.0 and v. 0.12.1
URL: https://github.com/apache/incubator-mxnet/issues/8792
 
 
   




[GitHub] nswamy commented on issue #8792: Different training performance between mxnet v. 0.11.0 and v. 0.12.1

2018-07-27 Thread GitBox
nswamy commented on issue #8792: Different training performance between mxnet 
v. 0.11.0 and v. 0.12.1
URL: 
https://github.com/apache/incubator-mxnet/issues/8792#issuecomment-408560141
 
 
   @VGalata Closing this issue.
   I think @anirudhacharya meant he tried from master (v1.3 is coming out soon). Please create a new issue if you run into problems.




[GitHub] nswamy closed issue #7196: [R] make sure all optimizers work

2018-07-27 Thread GitBox
nswamy closed issue #7196: [R] make sure all optimizers work
URL: https://github.com/apache/incubator-mxnet/issues/7196
 
 
   




[GitHub] anirudhacharya commented on issue #8792: Different training performance between mxnet v. 0.11.0 and v. 0.12.1

2018-07-27 Thread GitBox
anirudhacharya commented on issue #8792: Different training performance between 
mxnet v. 0.11.0 and v. 0.12.1
URL: 
https://github.com/apache/incubator-mxnet/issues/8792#issuecomment-408559047
 
 
   @VGalata I tried with the latest v1.3 (source build) and was not able to 
reproduce this issue with the example you provided. Can you please verify, and 
reopen the issue if the problem persists.
   
   @nswamy please close this issue.




[GitHub] haojin2 commented on issue #11920: Remove fixed seed for test_sparse_nd_save_load

2018-07-27 Thread GitBox
haojin2 commented on issue #11920: Remove fixed seed for 
test_sparse_nd_save_load
URL: https://github.com/apache/incubator-mxnet/pull/11920#issuecomment-408558498
 
 
   For this kind of case please refer to my changes in #11888 and add the link 
to the tracking issue in the code so that we can backtrack if it happens to 
fail again.




[GitHub] anirudhacharya commented on issue #7196: [R] make sure all optimizers work

2018-07-27 Thread GitBox
anirudhacharya commented on issue #7196: [R] make sure all optimizers work
URL: 
https://github.com/apache/incubator-mxnet/issues/7196#issuecomment-408557567
 
 
   fixed in this - https://github.com/apache/incubator-mxnet/pull/11374 
   
   @nswamy please close.




[GitHub] kalyc commented on issue #11741: test_sparse_ndarray.test_sparse_nd_save_load has fixed seed that can mask flakiness

2018-07-27 Thread GitBox
kalyc commented on issue #11741: test_sparse_ndarray.test_sparse_nd_save_load 
has fixed seed that can mask flakiness
URL: 
https://github.com/apache/incubator-mxnet/issues/11741#issuecomment-408557099
 
 
   Unable to reproduce issue with 1 runs, opened PR - 
https://github.com/apache/incubator-mxnet/pull/11920




[GitHub] kalyc opened a new pull request #11920: Remove fixed seed for test_sparse_nd_save_load

2018-07-27 Thread GitBox
kalyc opened a new pull request #11920: Remove fixed seed for 
test_sparse_nd_save_load
URL: https://github.com/apache/incubator-mxnet/pull/11920
 
 
   ## Description ##
   Remove fixed seed for test_sparse_nd_save_load
   Unable to reproduce flakiness of test - ran 1 times
   
   ## Checklist ##
   ### Essentials ###
   Please feel free to remove inapplicable items for your PR.
   - [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to 
the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) 
created (except PRs with tiny changes)
   - [X] Changes are complete (i.e. I finished coding on this PR)
   - [X] All changes have test coverage:
   - Unit tests are added for small changes to verify correctness (e.g. adding 
a new operator)
   - Nightly tests are added for complicated/long-running ones (e.g. changing 
distributed kvstore)
   - Build tests will be added for build configuration changes (e.g. adding a 
new build option with NCCL)
   - [ ] Code is well-documented: 
   - For user-facing API changes, API doc string has been updated. 
   - For new C++ functions in header files, their functionalities and arguments 
are documented. 
   - For new examples, README.md is added to explain what the example does, 
the source of the dataset, expected performance on the test set, and a reference 
to the original paper if applicable
   - Check the API doc at 
http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
   - [X] To my best knowledge, examples are either not affected by this 
change, or have been fixed to be compatible with this change
   
   ### Changes ###
   - [X] Remove fixed seed for test_sparse_ndarray:test_sparse_nd_save_load
   
   ## Comments ##
   - Related issue - https://github.com/apache/incubator-mxnet/issues/11741
   




[GitHub] abhinavs95 opened a new issue #11919: Accuracy changes with number of GPUs

2018-07-27 Thread GitBox
abhinavs95 opened a new issue #11919: Accuracy changes with number of GPUs
URL: https://github.com/apache/incubator-mxnet/issues/11919
 
 
   ## Description
   I trained the same SqueezeNet model with the same hyper-parameters and dataset on 
p3.8xlarge and p3.16xlarge with the same AMI, but got ~3% lower accuracy on 
p3.16xlarge. I used the same batch size per GPU, but the effective batch size is 
2x on p3.16xlarge due to 2x the number of GPUs.
   
   ## Environment info (Required)
   
   p3.8xlarge
   ```
   --Python Info--
   Version  : 3.6.6
   Compiler : GCC 7.2.0
   Build: ('default', 'Jun 28 2018 17:14:51')
   Arch : ('64bit', '')
   Pip Info---
   Version  : 10.0.1
   Directory: 
/home/ubuntu/anaconda3/envs/gln/lib/python3.6/site-packages/pip
   --MXNet Info---
   Version  : 1.3.0
   Directory: 
/home/ubuntu/anaconda3/envs/gln/lib/python3.6/site-packages/mxnet
   Commit Hash   : 65fee984437dcca3516912417e9430cf34ba7313
   --System Info--
   Platform : Linux-4.4.0-1062-aws-x86_64-with-debian-stretch-sid
   system   : Linux
   node : ip-172-31-78-153
   release  : 4.4.0-1062-aws
   version  : #71-Ubuntu SMP Fri Jun 15 10:07:39 UTC 2018
   --Hardware Info--
   machine  : x86_64
   processor: x86_64
   Architecture:  x86_64
   CPU op-mode(s):32-bit, 64-bit
   Byte Order:Little Endian
   CPU(s):32
   On-line CPU(s) list:   0-31
   Thread(s) per core:2
   Core(s) per socket:16
   Socket(s): 1
   NUMA node(s):  1
   Vendor ID: GenuineIntel
   CPU family:6
   Model: 79
   Model name:Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
   Stepping:  1
   CPU MHz:   1972.070
   CPU max MHz:   3000.
   CPU min MHz:   1200.
   BogoMIPS:  4600.11
   Hypervisor vendor: Xen
   Virtualization type:   full
   L1d cache: 32K
   L1i cache: 32K
   L2 cache:  256K
   L3 cache:  46080K
   NUMA node0 CPU(s): 0-31
   Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge 
mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx pdpe1gb rdtscp lm 
constant_tsc rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni 
pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt 
tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 
3dnowprefetch invpcid_single kaiser fsgsbase bmi1 hle avx2 smep bmi2 erms 
invpcid rtm rdseed adx xsaveopt
   --Network Test--
   Setting timeout: 10
   Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0032 
sec, LOAD: 0.3394 sec.
   Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.1960 sec, LOAD: 
0.3322 sec.
   Timing for Gluon Tutorial(cn): https://zh.gluon.ai, DNS: 0.1599 sec, LOAD: 
0.5460 sec.
   Timing for FashionMNIST: 
https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz,
 DNS: 0.0474 sec, LOAD: 0.7632 sec.
   Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0039 sec, LOAD: 
0.1079 sec.
   Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0043 sec, 
LOAD: 0.0531 sec.
   ```
   
   p3.16xlarge
   
   ```
   --Python Info--
   Version  : 3.6.6
   Compiler : GCC 7.2.0
   Build: ('default', 'Jun 28 2018 17:14:51')
   Arch : ('64bit', '')
   Pip Info---
   Version  : 10.0.1
   Directory: 
/home/ubuntu/anaconda3/envs/gluon/lib/python3.6/site-packages/pip
   --MXNet Info---
   Version  : 1.3.0
   Directory: 
/home/ubuntu/anaconda3/envs/gluon/lib/python3.6/site-packages/mxnet
   Commit Hash   : 3051c49e3454df3b5f8909d3d76c6213d13539ad
   --System Info--
   Platform : Linux-4.4.0-1062-aws-x86_64-with-debian-stretch-sid
   system   : Linux
   node : ip-172-31-45-182
   release  : 4.4.0-1062-aws
   version  : #71-Ubuntu SMP Fri Jun 15 10:07:39 UTC 2018
   --Hardware Info--
   machine  : x86_64
   processor: x86_64
   Architecture:  x86_64
   CPU op-mode(s):32-bit, 64-bit
   Byte Order:Little Endian
   CPU(s):64
   On-line CPU(s) list:   0-63
   Thread(s) per core:2
   Core(s) per socket:16
   Socket(s): 2
   NUMA node(s):  2
   Vendor ID: GenuineIntel
   CPU family:6
   Model: 79
   Model name:Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
   Stepping:  1
   CPU MHz:   1581.609
   CPU max MHz:   3000.
   CPU min MHz:   1200.
   BogoMIPS:  4600.07
   Hypervisor vendor: Xen
   Virtualization type:   full
   L1d cache: 32K
   L1i 

[GitHub] haojin2 commented on issue #11867: we need to update the doc of scatter_nd

2018-07-27 Thread GitBox
haojin2 commented on issue #11867: we need to update the doc of scatter_nd
URL: 
https://github.com/apache/incubator-mxnet/issues/11867#issuecomment-408550838
 
 
   @zheng-da Fix is in #11918 




[GitHub] ptrendx commented on issue #11886: Improve error message of cudnn operators

2018-07-27 Thread GitBox
ptrendx commented on issue #11886: Improve error message of cudnn operators
URL: https://github.com/apache/incubator-mxnet/pull/11886#issuecomment-408548357
 
 
   Sounds good.




[GitHub] haojin2 commented on issue #11886: Improve error message of cudnn operators

2018-07-27 Thread GitBox
haojin2 commented on issue #11886: Improve error message of cudnn operators
URL: https://github.com/apache/incubator-mxnet/pull/11886#issuecomment-408545460
 
 
   @ptrendx Okay, so how does:
   ```
   N algorithms with minimum memory requirement M bytes have been tried. 
Workspace size is set to X bytes, please consider reducing the batch/model size 
or increasing workspace size.
   ```
   look to you?




[GitHub] mkolod commented on issue #11325: [MXNET-703] TensorRT runtime integration

2018-07-27 Thread GitBox
mkolod commented on issue #11325: [MXNET-703] TensorRT runtime integration
URL: https://github.com/apache/incubator-mxnet/pull/11325#issuecomment-408543792
 
 
   @piiswrong Sounds good, I'll address this right away.




[incubator-mxnet] branch master updated: make skiptest work (#11889)

2018-07-27 Thread nswamy
This is an automated email from the ASF dual-hosted git repository.

nswamy pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new a8c8737  make skiptest work (#11889)
a8c8737 is described below

commit a8c873742c25a6cd4b78c6a4d8e1026378fda77d
Author: Lanking 
AuthorDate: Fri Jul 27 14:24:15 2018 -0700

make skiptest work (#11889)
---
 Makefile   | 10 +-
 scala-package/core/pom.xml |  6 +++---
 scala-package/examples/pom.xml |  6 +++---
 scala-package/infer/pom.xml|  6 +++---
 scala-package/pom.xml  |  1 +
 5 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/Makefile b/Makefile
index 88f7dd9..18661aa 100644
--- a/Makefile
+++ b/Makefile
@@ -608,7 +608,7 @@ scalaintegrationtest:
 
 scalainstall:
 	(cd $(ROOTDIR)/scala-package; \
-		mvn install -P$(SCALA_PKG_PROFILE),$(SCALA_VERSION_PROFILE) -DskipTests -Dcxx="$(CXX)" \
+		mvn install -P$(SCALA_PKG_PROFILE),$(SCALA_VERSION_PROFILE) -DskipTests=true -Dcxx="$(CXX)" \
 		-Dbuild.platform="$(SCALA_PKG_PROFILE)" \
 		-Dcflags="$(CFLAGS)" -Dldflags="$(LDFLAGS)" \
 		-Dlddeps="$(LIB_DEP) $(ROOTDIR)/lib/libmxnet.a")
@@ -617,23 +617,23 @@ scalarelease-dryrun:
 	(cd $(ROOTDIR)/scala-package; \
 	mvn release:clean release:prepare -DdryRun=true -DautoVersionSubmodules=true \
 	-Papache-release,$(SCALA_PKG_PROFILE),$(SCALA_VERSION_PROFILE) \
-	-Darguments=""-Dbuild\.platform=\""$(SCALA_PKG_PROFILE)\""\ -DskipTests\ -Dcflags=\""$(CFLAGS)\""\ -Dcxx=\""$(CXX)\""\ -Dldflags=\""$(LDFLAGS)\""\ -Dlddeps=\""$(LIB_DEP) $(ROOTDIR)/lib/libmxnet.a\)
+	-Darguments=""-Dbuild\.platform=\""$(SCALA_PKG_PROFILE)\""\ -DskipTests=true\ -Dcflags=\""$(CFLAGS)\""\ -Dcxx=\""$(CXX)\""\ -Dldflags=\""$(LDFLAGS)\""\ -Dlddeps=\""$(LIB_DEP) $(ROOTDIR)/lib/libmxnet.a\)
 
 scalarelease-prepare:
 	(cd $(ROOTDIR)/scala-package; \
 	mvn release:clean release:prepare -DautoVersionSubmodules=true \
 	-Papache-release,$(SCALA_PKG_PROFILE),$(SCALA_VERSION_PROFILE) \
-	-Darguments=""-Dbuild\.platform=\""$(SCALA_PKG_PROFILE)\""\ -DskipTests\ -Dcflags=\""$(CFLAGS)\""\ -Dcxx=\""$(CXX)\""\ -Dldflags=\""$(LDFLAGS)\""\ -Dlddeps=\""$(LIB_DEP) $(ROOTDIR)/lib/libmxnet.a\)
+	-Darguments=""-Dbuild\.platform=\""$(SCALA_PKG_PROFILE)\""\ -DskipTests=true\ -Dcflags=\""$(CFLAGS)\""\ -Dcxx=\""$(CXX)\""\ -Dldflags=\""$(LDFLAGS)\""\ -Dlddeps=\""$(LIB_DEP) $(ROOTDIR)/lib/libmxnet.a\)
 
 scalarelease-perform:
 	(cd $(ROOTDIR)/scala-package; \
 	mvn release:perform -DautoVersionSubmodules=true \
 	-Papache-release,$(SCALA_PKG_PROFILE),$(SCALA_VERSION_PROFILE) \
-	-Darguments=""-Dbuild\.platform=\""$(SCALA_PKG_PROFILE)\""\ -DskipTests\ -Dcflags=\""$(CFLAGS)\""\ -Dcxx=\""$(CXX)\""\ -Dldflags=\""$(LDFLAGS)\""\ -Dlddeps=\""$(LIB_DEP) $(ROOTDIR)/lib/libmxnet.a\)
+	-Darguments=""-Dbuild\.platform=\""$(SCALA_PKG_PROFILE)\""\ -DskipTests=true\ -Dcflags=\""$(CFLAGS)\""\ -Dcxx=\""$(CXX)\""\ -Dldflags=\""$(LDFLAGS)\""\ -Dlddeps=\""$(LIB_DEP) $(ROOTDIR)/lib/libmxnet.a\)
 
 scaladeploy:
 	(cd $(ROOTDIR)/scala-package; \
-		mvn deploy -Papache-release,$(SCALA_PKG_PROFILE),$(SCALA_VERSION_PROFILE) \-DskipTests -Dcxx="$(CXX)" \
+		mvn deploy -Papache-release,$(SCALA_PKG_PROFILE),$(SCALA_VERSION_PROFILE) \-DskipTests=true -Dcxx="$(CXX)" \
 		-Dbuild.platform="$(SCALA_PKG_PROFILE)" \
 		-Dcflags="$(CFLAGS)" -Dldflags="$(LDFLAGS)" \
 		-Dlddeps="$(LIB_DEP) $(ROOTDIR)/lib/libmxnet.a")
diff --git a/scala-package/core/pom.xml b/scala-package/core/pom.xml
index 134e0a5..1606197 100644
--- a/scala-package/core/pom.xml
+++ b/scala-package/core/pom.xml
@@ -17,13 +17,13 @@
     <profile>
       <id>unittest</id>
       <properties>
-        <skiptest>false</skiptest>
+        <skipTests>false</skipTests>
       </properties>
     </profile>
     <profile>
       <id>integrationtest</id>
       <properties>
-        <skiptest>true</skiptest>
+        <skipTests>true</skipTests>
       </properties>
     </profile>
@@ -74,7 +74,7 @@
         <groupId>org.scalatest</groupId>
         <artifactId>scalatest-maven-plugin</artifactId>
         <configuration>
-          <skipTests>${skiptest}</skipTests>
+          <skipTests>${skipTests}</skipTests>
           <argLine>
             -Djava.library.path=${project.parent.basedir}/native/${platform}/target \
             -Dlog4j.configuration=file://${project.basedir}/src/test/resources/log4j.properties
diff --git a/scala-package/examples/pom.xml b/scala-package/examples/pom.xml
index 9a98f74..d24785b 100644
--- a/scala-package/examples/pom.xml
+++ b/scala-package/examples/pom.xml
@@ -17,13 +17,13 @@
     <profile>
       <id>unittest</id>
       <properties>
-        <skiptest>true</skiptest>
+        <skipTests>true</skipTests>
       </properties>
     </profile>
     <profile>
       <id>integrationtest</id>
       <properties>
-        <skiptest>false</skiptest>
+        <skipTests>false</skipTests>
       </properties>
     </profile>
@@ -134,7 +134,7 @@
         <groupId>org.scalatest</groupId>
 

[GitHub] nswamy closed pull request #11889: [MXNET-319] make skiptest work for Scala

2018-07-27 Thread GitBox
nswamy closed pull request #11889: [MXNET-319] make skiptest work for Scala
URL: https://github.com/apache/incubator-mxnet/pull/11889
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/Makefile b/Makefile
index 88f7dd9278c..18661aa6984 100644
--- a/Makefile
+++ b/Makefile
@@ -608,7 +608,7 @@ scalaintegrationtest:
 
 scalainstall:
 	(cd $(ROOTDIR)/scala-package; \
-		mvn install -P$(SCALA_PKG_PROFILE),$(SCALA_VERSION_PROFILE) -DskipTests -Dcxx="$(CXX)" \
+		mvn install -P$(SCALA_PKG_PROFILE),$(SCALA_VERSION_PROFILE) -DskipTests=true -Dcxx="$(CXX)" \
 		-Dbuild.platform="$(SCALA_PKG_PROFILE)" \
 		-Dcflags="$(CFLAGS)" -Dldflags="$(LDFLAGS)" \
 		-Dlddeps="$(LIB_DEP) $(ROOTDIR)/lib/libmxnet.a")
@@ -617,23 +617,23 @@ scalarelease-dryrun:
 	(cd $(ROOTDIR)/scala-package; \
 	mvn release:clean release:prepare -DdryRun=true -DautoVersionSubmodules=true \
 	-Papache-release,$(SCALA_PKG_PROFILE),$(SCALA_VERSION_PROFILE) \
-	-Darguments=""-Dbuild\.platform=\""$(SCALA_PKG_PROFILE)\""\ -DskipTests\ -Dcflags=\""$(CFLAGS)\""\ -Dcxx=\""$(CXX)\""\ -Dldflags=\""$(LDFLAGS)\""\ -Dlddeps=\""$(LIB_DEP) $(ROOTDIR)/lib/libmxnet.a\)
+	-Darguments=""-Dbuild\.platform=\""$(SCALA_PKG_PROFILE)\""\ -DskipTests=true\ -Dcflags=\""$(CFLAGS)\""\ -Dcxx=\""$(CXX)\""\ -Dldflags=\""$(LDFLAGS)\""\ -Dlddeps=\""$(LIB_DEP) $(ROOTDIR)/lib/libmxnet.a\)
 
 scalarelease-prepare:
 	(cd $(ROOTDIR)/scala-package; \
 	mvn release:clean release:prepare -DautoVersionSubmodules=true \
 	-Papache-release,$(SCALA_PKG_PROFILE),$(SCALA_VERSION_PROFILE) \
-	-Darguments=""-Dbuild\.platform=\""$(SCALA_PKG_PROFILE)\""\ -DskipTests\ -Dcflags=\""$(CFLAGS)\""\ -Dcxx=\""$(CXX)\""\ -Dldflags=\""$(LDFLAGS)\""\ -Dlddeps=\""$(LIB_DEP) $(ROOTDIR)/lib/libmxnet.a\)
+	-Darguments=""-Dbuild\.platform=\""$(SCALA_PKG_PROFILE)\""\ -DskipTests=true\ -Dcflags=\""$(CFLAGS)\""\ -Dcxx=\""$(CXX)\""\ -Dldflags=\""$(LDFLAGS)\""\ -Dlddeps=\""$(LIB_DEP) $(ROOTDIR)/lib/libmxnet.a\)
 
 scalarelease-perform:
 	(cd $(ROOTDIR)/scala-package; \
 	mvn release:perform -DautoVersionSubmodules=true \
 	-Papache-release,$(SCALA_PKG_PROFILE),$(SCALA_VERSION_PROFILE) \
-	-Darguments=""-Dbuild\.platform=\""$(SCALA_PKG_PROFILE)\""\ -DskipTests\ -Dcflags=\""$(CFLAGS)\""\ -Dcxx=\""$(CXX)\""\ -Dldflags=\""$(LDFLAGS)\""\ -Dlddeps=\""$(LIB_DEP) $(ROOTDIR)/lib/libmxnet.a\)
+	-Darguments=""-Dbuild\.platform=\""$(SCALA_PKG_PROFILE)\""\ -DskipTests=true\ -Dcflags=\""$(CFLAGS)\""\ -Dcxx=\""$(CXX)\""\ -Dldflags=\""$(LDFLAGS)\""\ -Dlddeps=\""$(LIB_DEP) $(ROOTDIR)/lib/libmxnet.a\)
 
 scaladeploy:
 	(cd $(ROOTDIR)/scala-package; \
-		mvn deploy -Papache-release,$(SCALA_PKG_PROFILE),$(SCALA_VERSION_PROFILE) \-DskipTests -Dcxx="$(CXX)" \
+		mvn deploy -Papache-release,$(SCALA_PKG_PROFILE),$(SCALA_VERSION_PROFILE) \-DskipTests=true -Dcxx="$(CXX)" \
 		-Dbuild.platform="$(SCALA_PKG_PROFILE)" \
 		-Dcflags="$(CFLAGS)" -Dldflags="$(LDFLAGS)" \
 		-Dlddeps="$(LIB_DEP) $(ROOTDIR)/lib/libmxnet.a")
diff --git a/scala-package/core/pom.xml b/scala-package/core/pom.xml
index 134e0a59da1..16061979f7c 100644
--- a/scala-package/core/pom.xml
+++ b/scala-package/core/pom.xml
@@ -17,13 +17,13 @@
     <profile>
       <id>unittest</id>
       <properties>
-        <skiptest>false</skiptest>
+        <skipTests>false</skipTests>
       </properties>
     </profile>
     <profile>
       <id>integrationtest</id>
      <properties>
-        <skiptest>true</skiptest>
+        <skipTests>true</skipTests>
       </properties>
     </profile>
@@ -74,7 +74,7 @@
         <groupId>org.scalatest</groupId>
         <artifactId>scalatest-maven-plugin</artifactId>
         <configuration>
-          <skipTests>${skiptest}</skipTests>
+          <skipTests>${skipTests}</skipTests>
           <argLine>
             -Djava.library.path=${project.parent.basedir}/native/${platform}/target \
             -Dlog4j.configuration=file://${project.basedir}/src/test/resources/log4j.properties
diff --git a/scala-package/examples/pom.xml b/scala-package/examples/pom.xml
index 9a98f74e4e2..d24785b0e87 100644
--- a/scala-package/examples/pom.xml
+++ b/scala-package/examples/pom.xml
@@ -17,13 +17,13 @@
     <profile>
       <id>unittest</id>
       <properties>
-        <skiptest>true</skiptest>
+        <skipTests>true</skipTests>
       </properties>
     </profile>
     <profile>
       <id>integrationtest</id>
       <properties>
-        <skiptest>false</skiptest>
+        <skipTests>false</skipTests>
       </properties>
     </profile>
@@ -134,7 +134,7 @@
         <groupId>org.scalatest</groupId>
         <artifactId>scalatest-maven-plugin</artifactId>
         <configuration>
-          <skipTests>${skiptest}</skipTests>
+          <skipTests>${skipTests}</skipTests>
           <argLine>
             -Djava.library.path=${project.parent.basedir}/native/${platform}/target \
             -Dlog4j.configuration=file://${project.basedir}/src/test/resources/log4j.properties
diff --git 

[GitHub] nswamy commented on issue #11885: Fix JNI custom op code from deregistering the operator fixes #10438

2018-07-27 Thread GitBox
nswamy commented on issue #11885: Fix JNI custom op code from deregistering the 
operator fixes #10438
URL: https://github.com/apache/incubator-mxnet/pull/11885#issuecomment-408542060
 
 
   @andrewfayres is it possible to add some testing to this?




[GitHub] absalama commented on issue #11855: Distributed learning with Async update does not work.

2018-07-27 Thread GitBox
absalama commented on issue #11855: Distributed learning with Async update does 
not work.
URL: 
https://github.com/apache/incubator-mxnet/issues/11855#issuecomment-408541838
 
 
   In trainer.py I changed the default value of **update_on_kvstore** from None 
to True in the __init__ method. The imageclassification.py script initialises the 
trainer object, so I assume that if **update_on_kvstore** is True in the 
__init__ method then it should be set?
   
   We are using a Slurm cluster, and all nodes share the same folder where 
the MXNet source resides, so any change should be seen by all workers. 
   
   Do I need to set something extra in the Slurm configuration?




[GitHub] piiswrong commented on issue #11325: [MXNET-703] TensorRT runtime integration

2018-07-27 Thread GitBox
piiswrong commented on issue #11325: [MXNET-703] TensorRT runtime integration
URL: https://github.com/apache/incubator-mxnet/pull/11325#issuecomment-408537847
 
 
   yes
   




[GitHub] azai91 commented on issue #11896: subgraph TODO

2018-07-27 Thread GitBox
azai91 commented on issue #11896: subgraph TODO
URL: 
https://github.com/apache/incubator-mxnet/issues/11896#issuecomment-408535506
 
 
   for task 2, what is the optimal layout for the weights?




[GitHub] mkolod commented on a change in pull request #11325: [MXNET-703] TensorRT runtime integration

2018-07-27 Thread GitBox
mkolod commented on a change in pull request #11325: [MXNET-703] TensorRT 
runtime integration
URL: https://github.com/apache/incubator-mxnet/pull/11325#discussion_r205894176
 
 

 ##
 File path: src/executor/graph_executor.cc
 ##
 @@ -941,6 +970,114 @@ void GraphExecutor::FinishInitGraph(nnvm::Symbol symbol,
   this->InitOpSegs();
 }
 
+/*!
+ * \brief This function is triggered after each tensorrt subgraph replacement pass.
+ * It resets the arguments of GraphExecutor::Init(...) as some variables (weights and
+ * biases) are absorbed into the TRT engine, and it also reruns attribute inference
+ * according to the new topology.
+ */
+Graph GraphExecutor::ReinitGraph(Graph&& g, const Context &default_ctx,
+                                 const std::map<std::string, Context> &ctx_map,
+                                 std::vector<Context> *in_arg_ctxes,
+                                 std::vector<Context> *arg_grad_ctxes,
+                                 std::vector<Context> *aux_state_ctxes,
+                                 std::vector<OpReqType> *grad_req_types,
+                                 std::unordered_map<std::string, TShape> *arg_shape_map,
+                                 std::unordered_map<std::string, int> *arg_dtype_map,
+                                 std::unordered_map<std::string, int> *arg_stype_map,
+                                 std::unordered_map<std::string, NDArray> *params_map) {
+  std::unordered_set<std::string> to_remove_params;
+  for (auto& el : *params_map) {
+    to_remove_params.insert(el.first);
+  }
+
+  DFSVisit(g.outputs, [&to_remove_params](const nnvm::NodePtr n) {
+    to_remove_params.erase(n->attrs.name);
+  });
+
+  for (auto& el : to_remove_params) {
+    params_map->erase(el);
+    arg_shape_map->erase(el);
+    arg_dtype_map->erase(el);
+    arg_stype_map->erase(el);
+  }
+  const auto &idx = g.indexed_graph();
+  num_forward_inputs_ = idx.input_nodes().size();
+  in_arg_ctxes->resize(num_forward_inputs_ - idx.mutable_input_nodes().size());
 
 Review comment:
   @zheng-da I think it can, but we couldn't get it to work so far, because the 
module's bind() method does not take in the shared_buffer, which the TensorRT 
engine builder needs in order to bake in the weights. Regarding the graph 
rewrite, note that it takes place very early in the bind process. Shape 
inference happens before the rewrite, but no memory allocation etc., so from a 
data parallel perspective it should work, because resource allocation is done 
after the rewrite, not before. Also, after the graph rewrite, shapes are 
inferred again, so the bind process proceeds as if there had been no rewrite.
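   For reference, the inference-only flow under discussion looks roughly like 
this (a sketch following the PR's test pattern; `sym` and `all_params` are 
placeholders for a loaded symbol and its parameter dict):
   
   ```
   import os
   import mxnet as mx
   
   os.environ['MXNET_USE_TENSORRT'] = '1'
   
   # all weights/aux params (e.g. BatchNorm moments) go in via shared_buffer so
   # the TensorRT engine builder can bake them into the engine at bind time
   executor = sym.simple_bind(ctx=mx.gpu(0), data=(32, 3, 224, 224),
                              grad_req='null', shared_buffer=all_params)
   ```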




[GitHub] mkolod commented on a change in pull request #11325: [MXNET-703] TensorRT runtime integration

2018-07-27 Thread GitBox
mkolod commented on a change in pull request #11325: [MXNET-703] TensorRT 
runtime integration
URL: https://github.com/apache/incubator-mxnet/pull/11325#discussion_r205894970
 
 

 ##
 File path: src/executor/graph_executor.cc
 ##
 @@ -1018,20 +1156,49 @@ void GraphExecutor::Init(nnvm::Symbol symbol,
                                 g.GetAttr<StorageTypeVector>("storage_type"));
   }
 
+  if (use_tensorrt_) {
+  #if MXNET_USE_TENSORRT
+    // check that this graph is inference-only
+    if (std::any_of(grad_req_types->begin(), grad_req_types->end(),
+                    [](const OpReqType& op) { return op != kNullOp; })) {
+      LOG(FATAL) << "MXNET_USE_TENSORRT set but graph is not inference-only. "
+                 << "If it is an inference graph, set grad_req to null during simple_bind call. "
+                 << "If it is a training graph, unset the MXNET_USE_TENSORRT env variable";
+    }
+    if (shared_buffer->empty()) {
+      LOG(FATAL) << "MXNET_USE_TENSORRT = 1 but shared_buffer is empty. "
+                 << "Please provide weights and other parameters, such as "
+                 << "BatchNorm moments, via the shared_buffer, during simple bind call.";
+    }
+    auto trt_groups = GetTrtCompatibleSubsets(g, shared_buffer);
+    for (auto trt_group : trt_groups) {
+      if (trt_group.size() > 1) {
+        g = ReplaceSubgraph(std::move(g), trt_group, shared_buffer);
+        g = ReinitGraph(std::move(g), default_ctx, ctx_map, in_arg_ctxes, arg_grad_ctxes,
+                        aux_state_ctxes, grad_req_types, arg_shape_map, arg_dtype_map,
+                        arg_stype_map, shared_buffer);
 
 Review comment:
   @Caenorst could you reply to @zheng-da's question above? Thanks!




[GitHub] mkolod commented on issue #11325: [MXNET-703] TensorRT runtime integration

2018-07-27 Thread GitBox
mkolod commented on issue #11325: [MXNET-703] TensorRT runtime integration
URL: https://github.com/apache/incubator-mxnet/pull/11325#issuecomment-408533758
 
 
   @piiswrong You mean to basically issue a warning and bypass instead of 
throwing an exception in other cases?




[GitHub] mkolod commented on a change in pull request #11325: [MXNET-703] TensorRT runtime integration

2018-07-27 Thread GitBox
mkolod commented on a change in pull request #11325: [MXNET-703] TensorRT 
runtime integration
URL: https://github.com/apache/incubator-mxnet/pull/11325#discussion_r205894176
 
 

 ##
 File path: src/executor/graph_executor.cc
 ##
 @@ -941,6 +970,114 @@ void GraphExecutor::FinishInitGraph(nnvm::Symbol symbol,
   this->InitOpSegs();
 }
 
+/*!
+ * \brief This function is triggered after each tensorrt subgraph replacement pass.
+ * It resets the arguments of GraphExecutor::Init(...) because some variables (weights
+ * and biases) are absorbed into the TRT engine; it also reruns attribute inference
+ * according to the new topology.
+ */
+Graph GraphExecutor::ReinitGraph(Graph&& g, const Context &default_ctx,
+                                 const std::map<std::string, Context> &ctx_map,
+                                 std::vector<Context> *in_arg_ctxes,
+                                 std::vector<Context> *arg_grad_ctxes,
+                                 std::vector<Context> *aux_state_ctxes,
+                                 std::vector<OpReqType> *grad_req_types,
+                                 std::unordered_map<std::string, TShape> *arg_shape_map,
+                                 std::unordered_map<std::string, int> *arg_dtype_map,
+                                 std::unordered_map<std::string, int> *arg_stype_map,
+                                 std::unordered_map<std::string, NDArray> *params_map) {
+  std::unordered_set<std::string> to_remove_params;
+  for (auto& el : *params_map) {
+    to_remove_params.insert(el.first);
+  }
+
+  DFSVisit(g.outputs, [&to_remove_params](const nnvm::NodePtr n) {
+    to_remove_params.erase(n->attrs.name);
+  });
+
+  for (auto& el : to_remove_params) {
+    params_map->erase(el);
+    arg_shape_map->erase(el);
+    arg_dtype_map->erase(el);
+    arg_stype_map->erase(el);
+  }
+  const auto &idx = g.indexed_graph();
+  num_forward_inputs_ = idx.input_nodes().size();
+  in_arg_ctxes->resize(num_forward_inputs_ - idx.mutable_input_nodes().size());
 
 Review comment:
   @zheng-da I think it can, but we couldn't get it to work so far, because the 
module's bind() method does not take in the shared_buffer, which the TensorRT 
engine builder needs in order to bake in the weights. Regarding the graph 
rewrite, note that it takes place very early in the bind process. Shape 
inference happens before the rewrite, but no memory allocation etc., so from a 
data parallel perspective it should work, because resource allocation is done 
after the rewrite, not before.




[GitHub] piiswrong commented on a change in pull request #11325: [MXNET-703] TensorRT runtime integration

2018-07-27 Thread GitBox
piiswrong commented on a change in pull request #11325: [MXNET-703] TensorRT 
runtime integration
URL: https://github.com/apache/incubator-mxnet/pull/11325#discussion_r205892192
 
 

 ##
 File path: include/mxnet/executor.h
 ##
 @@ -152,14 +152,14 @@ class Executor {
   static Executor* SimpleBind(nnvm::Symbol symbol,
 
 Review comment:
   Also, I think it's better to name the function InitTensorRT rather than 
ReinitGraph.




[GitHub] piiswrong commented on a change in pull request #11325: [MXNET-703] TensorRT runtime integration

2018-07-27 Thread GitBox
piiswrong commented on a change in pull request #11325: [MXNET-703] TensorRT 
runtime integration
URL: https://github.com/apache/incubator-mxnet/pull/11325#discussion_r205892000
 
 

 ##
 File path: include/mxnet/executor.h
 ##
 @@ -152,14 +152,14 @@ class Executor {
   static Executor* SimpleBind(nnvm::Symbol symbol,
 
 Review comment:
   why not pass by value?




[GitHub] piiswrong commented on a change in pull request #11325: [MXNET-703] TensorRT runtime integration

2018-07-27 Thread GitBox
piiswrong commented on a change in pull request #11325: [MXNET-703] TensorRT 
runtime integration
URL: https://github.com/apache/incubator-mxnet/pull/11325#discussion_r205891103
 
 

 ##
 File path: include/mxnet/c_api.h
 ##
 @@ -1714,6 +1714,13 @@ MXNET_DLL int MXExecutorReshape(int partial_shaping,
 NDArrayHandle** aux_states,
 ExecutorHandle shared_exec,
 ExecutorHandle *out);
+
+/*!
+ * \brief get optimized graph from graph executor
+ */
+MXNET_DLL int MXExecutorGetOptimizedSymbol(ExecutorHandle handle,
 
 Review comment:
   I think it's better to expose it as a private member of executor




[GitHub] szha commented on issue #11592: Flaky Test Issue of GPU Operator

2018-07-27 Thread GitBox
szha commented on issue #11592: Flaky Test Issue of GPU Operator
URL: 
https://github.com/apache/incubator-mxnet/issues/11592#issuecomment-408525537
 
 
   I just had the exact same error in another PR. 
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-11482/17/pipeline/859#step-1530-log-1018
   
   @zhanghang1989 what did you do to resolve the problem?




[incubator-mxnet] branch master updated: [MXNET-344] Add more operators to onnx import (#11856)

2018-07-27 Thread zhreshold
This is an automated email from the ASF dual-hosted git repository.

zhreshold pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new 4bbf15c  [MXNET-344] Add more operators to onnx import (#11856)
4bbf15c is described below

commit 4bbf15c85d300801f6f880f7abe4628e68ced2f7
Author: Anirudh 
AuthorDate: Fri Jul 27 13:02:44 2018 -0700

[MXNET-344] Add more operators to onnx import (#11856)

* add more ops

* use dict.get

* add list comprehensive

* retrigger CI due to unrelated flaky test failure
---
 .../mxnet/contrib/onnx/onnx2mx/_import_helper.py   | 26 ++--
 .../mxnet/contrib/onnx/onnx2mx/_op_translations.py | 73 +-
 tests/python-pytest/onnx/import/test_cases.py  | 39 +++-
 3 files changed, 116 insertions(+), 22 deletions(-)

diff --git a/python/mxnet/contrib/onnx/onnx2mx/_import_helper.py 
b/python/mxnet/contrib/onnx/onnx2mx/_import_helper.py
index c19f0f2..c44403d 100644
--- a/python/mxnet/contrib/onnx/onnx2mx/_import_helper.py
+++ b/python/mxnet/contrib/onnx/onnx2mx/_import_helper.py
@@ -20,8 +20,9 @@
 """Operator attributes conversion"""
 from ._op_translations import identity, random_uniform, random_normal
 from ._op_translations import add, subtract, multiply, divide, absolute, 
negative, add_n
-from ._op_translations import tanh
-from ._op_translations import ceil, floor
+from ._op_translations import tanh, arccos, arcsin, arctan, _cos, _sin, _tan
+from ._op_translations import softplus, shape, gather, lp_pooling
+from ._op_translations import ceil, floor, hardsigmoid, global_lppooling
 from ._op_translations import concat
 from ._op_translations import leaky_relu, _elu, _prelu, softmax, 
fully_connected
 from ._op_translations import global_avgpooling, global_maxpooling, linalg_gemm
@@ -30,12 +31,13 @@ from ._op_translations import dropout, local_response_norm, 
conv, deconv
 from ._op_translations import reshape, cast, split, _slice, transpose, 
squeeze, flatten
 from ._op_translations import reciprocal, squareroot, power, exponent, _log, 
unsqueeze
 from ._op_translations import reduce_max, reduce_mean, reduce_min, reduce_sum
-from ._op_translations import reduce_prod, avg_pooling, max_pooling
+from ._op_translations import reduce_prod, avg_pooling, max_pooling, 
instance_norm
 from ._op_translations import argmax, argmin, maximum, minimum
 from ._op_translations import clip, reduce_log_sum, reduce_log_sum_exp
-from ._op_translations import reduce_sum_square, reduce_l2, max_roi_pooling, 
instance_norm
+from ._op_translations import reduce_sum_square, reduce_l1, reduce_l2, 
max_roi_pooling
 from ._op_translations import log_softmax, softsign, lesser, greater, equal
 from ._op_translations import logical_and, logical_or, logical_xor, logical_not
+from ._op_translations import mean
 
 # convert_map defines maps of ONNX operator names to converter 
functor(callable)
 # defined in the op_translations module.
@@ -77,6 +79,7 @@ _convert_map = {
 'FC': fully_connected,
 'GlobalAveragePool' : global_avgpooling,
 'GlobalMaxPool' : global_maxpooling,
+'GlobalLpPool'  : global_lppooling,
 'Gemm'  : linalg_gemm,
 'LRN'   : local_response_norm,
 'Dropout'   : dropout,
@@ -113,6 +116,7 @@ _convert_map = {
 'ReduceLogSum'  : reduce_log_sum,
 'ReduceLogSumExp'   : reduce_log_sum_exp,
 'ReduceSumSquare'   : reduce_sum_square,
+'ReduceL1'  : reduce_l1,
 'ReduceL2'  : reduce_l2,
 'MaxRoiPool': max_roi_pooling,
 'InstanceNormalization' : instance_norm,
@@ -124,5 +128,17 @@ _convert_map = {
 'And'   : logical_and,
 'Xor'   : logical_xor,
 'Not'   : logical_not,
-'Or': logical_or
+'Or': logical_or,
+'Mean'  : mean,
+'Acos'  : arccos,
+'Asin'  : arcsin,
+'Atan'  : arctan,
+'Cos'   : _cos,
+'Sin'   : _sin,
+'Softplus'  : softplus,
+'Tan'   : _tan,
+'Shape' : shape,
+'Gather': gather,
+'HardSigmoid'   : hardsigmoid,
+'LpPool': lp_pooling
 }
diff --git a/python/mxnet/contrib/onnx/onnx2mx/_op_translations.py 
b/python/mxnet/contrib/onnx/onnx2mx/_op_translations.py
index aa37856..4d1e956 100644
--- a/python/mxnet/contrib/onnx/onnx2mx/_op_translations.py
+++ b/python/mxnet/contrib/onnx/onnx2mx/_op_translations.py
@@ -80,6 +80,13 @@ def divide(attrs, inputs, proto_obj):
         return op_value, new_attr, inputs
     return 'broadcast_div', new_attr, inputs
 
+def mean(attrs, inputs, proto_obj):
+    """Mean of all the input tensors."""
+    concat_input = [symbol.expand_dims(op_input, axis=0) for op_input in inputs]
+

[GitHub] zhreshold closed pull request #11856: [MXNET-344] Add more operators to onnx import

2018-07-27 Thread GitBox
zhreshold closed pull request #11856: [MXNET-344] Add more operators to onnx 
import
URL: https://github.com/apache/incubator-mxnet/pull/11856
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/python/mxnet/contrib/onnx/onnx2mx/_import_helper.py 
b/python/mxnet/contrib/onnx/onnx2mx/_import_helper.py
index c19f0f2cb24..c44403d4992 100644
--- a/python/mxnet/contrib/onnx/onnx2mx/_import_helper.py
+++ b/python/mxnet/contrib/onnx/onnx2mx/_import_helper.py
@@ -20,8 +20,9 @@
 """Operator attributes conversion"""
 from ._op_translations import identity, random_uniform, random_normal
 from ._op_translations import add, subtract, multiply, divide, absolute, 
negative, add_n
-from ._op_translations import tanh
-from ._op_translations import ceil, floor
+from ._op_translations import tanh, arccos, arcsin, arctan, _cos, _sin, _tan
+from ._op_translations import softplus, shape, gather, lp_pooling
+from ._op_translations import ceil, floor, hardsigmoid, global_lppooling
 from ._op_translations import concat
 from ._op_translations import leaky_relu, _elu, _prelu, softmax, 
fully_connected
 from ._op_translations import global_avgpooling, global_maxpooling, linalg_gemm
@@ -30,12 +31,13 @@
 from ._op_translations import reshape, cast, split, _slice, transpose, 
squeeze, flatten
 from ._op_translations import reciprocal, squareroot, power, exponent, _log, 
unsqueeze
 from ._op_translations import reduce_max, reduce_mean, reduce_min, reduce_sum
-from ._op_translations import reduce_prod, avg_pooling, max_pooling
+from ._op_translations import reduce_prod, avg_pooling, max_pooling, 
instance_norm
 from ._op_translations import argmax, argmin, maximum, minimum
 from ._op_translations import clip, reduce_log_sum, reduce_log_sum_exp
-from ._op_translations import reduce_sum_square, reduce_l2, max_roi_pooling, 
instance_norm
+from ._op_translations import reduce_sum_square, reduce_l1, reduce_l2, 
max_roi_pooling
 from ._op_translations import log_softmax, softsign, lesser, greater, equal
 from ._op_translations import logical_and, logical_or, logical_xor, logical_not
+from ._op_translations import mean
 
 # convert_map defines maps of ONNX operator names to converter 
functor(callable)
 # defined in the op_translations module.
@@ -77,6 +79,7 @@
 'FC': fully_connected,
 'GlobalAveragePool' : global_avgpooling,
 'GlobalMaxPool' : global_maxpooling,
+'GlobalLpPool'  : global_lppooling,
 'Gemm'  : linalg_gemm,
 'LRN'   : local_response_norm,
 'Dropout'   : dropout,
@@ -113,6 +116,7 @@
 'ReduceLogSum'  : reduce_log_sum,
 'ReduceLogSumExp'   : reduce_log_sum_exp,
 'ReduceSumSquare'   : reduce_sum_square,
+'ReduceL1'  : reduce_l1,
 'ReduceL2'  : reduce_l2,
 'MaxRoiPool': max_roi_pooling,
 'InstanceNormalization' : instance_norm,
@@ -124,5 +128,17 @@
 'And'   : logical_and,
 'Xor'   : logical_xor,
 'Not'   : logical_not,
-'Or': logical_or
+'Or': logical_or,
+'Mean'  : mean,
+'Acos'  : arccos,
+'Asin'  : arcsin,
+'Atan'  : arctan,
+'Cos'   : _cos,
+'Sin'   : _sin,
+'Softplus'  : softplus,
+'Tan'   : _tan,
+'Shape' : shape,
+'Gather': gather,
+'HardSigmoid'   : hardsigmoid,
+'LpPool': lp_pooling
 }
diff --git a/python/mxnet/contrib/onnx/onnx2mx/_op_translations.py 
b/python/mxnet/contrib/onnx/onnx2mx/_op_translations.py
index aa37856ffad..4d1e9561230 100644
--- a/python/mxnet/contrib/onnx/onnx2mx/_op_translations.py
+++ b/python/mxnet/contrib/onnx/onnx2mx/_op_translations.py
@@ -80,6 +80,13 @@ def divide(attrs, inputs, proto_obj):
         return op_value, new_attr, inputs
     return 'broadcast_div', new_attr, inputs
 
+def mean(attrs, inputs, proto_obj):
+    """Mean of all the input tensors."""
+    concat_input = [symbol.expand_dims(op_input, axis=0) for op_input in inputs]
+    concat_sym = symbol.concat(*concat_input, dim=0)
+    mean_sym = symbol.mean(concat_sym, axis=0)
+    return mean_sym, attrs, inputs
+
 def logical_and(attrs, inputs, proto_obj):
     """Logical and of two input arrays."""
     return 'broadcast_logical_and', attrs, inputs
 @@ -186,6 +193,10 @@ def sigmoid(attrs, inputs, proto_obj):
     """Computes elementwise sigmoid of the input array"""
     return 'sigmoid', attrs, inputs
 
+def hardsigmoid(attrs, inputs, proto_obj):
+    """Computes elementwise hard sigmoid of the input array"""
+    return 'hard_sigmoid', attrs, inputs
+
 def relu(attrs, inputs, proto_obj):
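
(For illustration, the `mean` converter above builds the same computation as
this small NDArray sketch; the sketch is not part of the diff:)

```
import mxnet as mx

a, b = mx.nd.array([1., 2.]), mx.nd.array([3., 4.])
stacked = mx.nd.concat(a.expand_dims(axis=0), b.expand_dims(axis=0), dim=0)
print(stacked.mean(axis=0))  # [2. 3.] -- elementwise mean of the two inputs
```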
 

[GitHub] ssttevee edited a comment on issue #11914: NDArray.asscalar(): CUDA an illegal memory access was encountered

2018-07-27 Thread GitBox
ssttevee edited a comment on issue #11914: NDArray.asscalar(): CUDA an illegal 
memory access was encountered
URL: 
https://github.com/apache/incubator-mxnet/issues/11914#issuecomment-408487363
 
 
   Sorry, I meant to say that it doesn't crash with a higher data length like 
`--data_length=100 --batch_size=2`. The crashes seem to happen at arbitrary 
data length and batch size values.
   
   It wouldn't make much sense for it to be a memory limit anyway, since none 
of the parameters gets anywhere close to 1 GB of memory, whereas both my GPUs 
have well over 1 GB of memory.




[GitHub] ssttevee edited a comment on issue #11914: NDArray.asscalar(): CUDA an illegal memory access was encountered

2018-07-27 Thread GitBox
ssttevee edited a comment on issue #11914: NDArray.asscalar(): CUDA an illegal 
memory access was encountered
URL: 
https://github.com/apache/incubator-mxnet/issues/11914#issuecomment-408487363
 
 
   Sorry, I meant to say that it doesn't crash with a higher data length like 
`--data_length=100 --batch_size=2`. The crashes seem to happen at arbitrary 
data length and batch size values.
   
   It wouldn't make much sense for it to be a memory limit anyway, since none 
of the parameters gets anywhere close to 1 GB of memory, whereas both my GPUs 
have well over 1 GB of memory.




[GitHub] ssttevee edited a comment on issue #11914: NDArray.asscalar(): CUDA an illegal memory access was encountered

2018-07-27 Thread GitBox
ssttevee edited a comment on issue #11914: NDArray.asscalar(): CUDA an illegal 
memory access was encountered
URL: 
https://github.com/apache/incubator-mxnet/issues/11914#issuecomment-408487363
 
 
   Sorry, I meant to say that it doesn't crash with a higher data length like 
`--data_length=100 --batch_size=2`. The crashes don't seem to have a direct 
correlation with the actual data length.
   
   It wouldn't make much sense for it to be a memory limit anyway, since none 
of the parameters gets anywhere close to 1 GB of memory, whereas both my GPUs 
have well over 1 GB of memory.




[GitHub] szha commented on issue #11834: Fix mxnet ctc_loss bug

2018-07-27 Thread GitBox
szha commented on issue #11834: Fix mxnet ctc_loss bug
URL: https://github.com/apache/incubator-mxnet/pull/11834#issuecomment-408521183
 
 
   @HawkAaron thanks for the fix, and @Jerryzcn thanks for the review




[incubator-mxnet] branch master updated: Fix mxnet ctc_loss bug (#11834)

2018-07-27 Thread zhasheng
This is an automated email from the ASF dual-hosted git repository.

zhasheng pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new 2bddf6f  Fix mxnet ctc_loss bug (#11834)
2bddf6f is described below

commit 2bddf6f039e94506d11a6539b0e921e5440e09eb
Author: Mingkun Huang 
AuthorDate: Sat Jul 28 03:46:35 2018 +0800

Fix mxnet ctc_loss bug (#11834)

* fix ctc_loss GPU bug

* add blank_label parameter for CTCLoss

* Revert "add blank_label parameter for CTCLoss"

This reverts commit aab11f7575580f88f5f27be14466d0deb4b4c456.
---
 src/operator/contrib/ctc_include/detail/gpu_ctc.h | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/src/operator/contrib/ctc_include/detail/gpu_ctc.h 
b/src/operator/contrib/ctc_include/detail/gpu_ctc.h
index 8015b39..2c521b5 100644
--- a/src/operator/contrib/ctc_include/detail/gpu_ctc.h
+++ b/src/operator/contrib/ctc_include/detail/gpu_ctc.h
 @@ -411,12 +411,7 @@ GpuCTC<ProbT>::compute_log_probs(const ProbT* const activations) {
                                  denoms_, out_dim_, num_elements);
 
     // compute denominators for softmax
-    denoms_handle = reduce_with_axis<red::sum, false>(
-        F<mshadow_op::exp>(
-            log_probs_handle -
-            broadcast<0>(reduce_with_axis<red::maximum, false>(log_probs_handle, 1),
-                         log_probs_handle.shape_)),
-        1);
+    denoms_handle = reduce_with_axis<red::sum, false>(F<mshadow_op::exp>(log_probs_handle), 1);
 
     // Kernel launch to calculate probabilities
     compute_log_probs_kernel<<>>
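
(Note on the change above: the new line computes the per-step softmax
denominators as a plain sum of exponentials, dropping the max-subtraction.
A NumPy analogue, assuming the stripped functor is the elementwise
exponential, as the softmax computation implies:)

```
import numpy as np

log_probs = np.random.randn(5, 4)        # (time steps, alphabet size)
denoms = np.exp(log_probs).sum(axis=1)   # softmax denominator per time step
```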



[GitHub] szha closed pull request #11834: Fix mxnet ctc_loss bug

2018-07-27 Thread GitBox
szha closed pull request #11834: Fix mxnet ctc_loss bug
URL: https://github.com/apache/incubator-mxnet/pull/11834
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/src/operator/contrib/ctc_include/detail/gpu_ctc.h 
b/src/operator/contrib/ctc_include/detail/gpu_ctc.h
index 8015b39c437..2c521b5abb5 100644
--- a/src/operator/contrib/ctc_include/detail/gpu_ctc.h
+++ b/src/operator/contrib/ctc_include/detail/gpu_ctc.h
 @@ -411,12 +411,7 @@ GpuCTC<ProbT>::compute_log_probs(const ProbT* const activations) {
                                  denoms_, out_dim_, num_elements);
 
     // compute denominators for softmax
-    denoms_handle = reduce_with_axis<red::sum, false>(
-        F<mshadow_op::exp>(
-            log_probs_handle -
-            broadcast<0>(reduce_with_axis<red::maximum, false>(log_probs_handle, 1),
-                         log_probs_handle.shape_)),
-        1);
+    denoms_handle = reduce_with_axis<red::sum, false>(F<mshadow_op::exp>(log_probs_handle), 1);
 
     // Kernel launch to calculate probabilities
     compute_log_probs_kernel<<>>


 




[GitHub] haojin2 opened a new pull request #11918: Improve scatter_nd doc

2018-07-27 Thread GitBox
haojin2 opened a new pull request #11918: Improve scatter_nd doc
URL: https://github.com/apache/incubator-mxnet/pull/11918
 
 
   ## Description ##
   Address #11867
   
   ## Checklist ##
   ### Essentials ###
   - [x] Changes are complete (i.e. I finished coding on this PR)
   - [x] All changes have test coverage:
   - Unit tests are added for small changes to verify correctness (e.g. adding 
a new operator)
   - Nightly tests are added for complicated/long-running ones (e.g. changing 
distributed kvstore)
   - Build tests will be added for build configuration changes (e.g. adding a 
new build option with NCCL)
   - [x] Code is well-documented: 
   - For user-facing API changes, API doc string has been updated. 
   - For new C++ functions in header files, their functionalities and arguments 
are documented. 
   - For new examples, README.md is added to explain what the example does, 
the source of the dataset, expected performance on test set and reference to 
the original paper if applicable
   - Check the API doc at 
http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
   - [x] To the best of my knowledge, examples are either not affected by this 
change, or have been fixed to be compatible with this change
   
   ### Changes ###
   - [x] Improve doc for scatter_nd (using tf version as reference)
   
   ## Comments ##
   @zheng-da 
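   
   A worked example of the documented behaviour (taken from the operator's 
semantics; shapes chosen for illustration):
   
   ```
   import mxnet as mx
   
   data = mx.nd.array([2, 3, 0])
   indices = mx.nd.array([[1, 1, 0], [0, 1, 0]])
   # data[k] is written to position (indices[0][k], indices[1][k]) of a zero tensor
   out = mx.nd.scatter_nd(data, indices, shape=(2, 2))
   print(out)  # [[0. 0.] [2. 3.]]
   ```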




[GitHub] zheng-da commented on a change in pull request #11325: [MXNET-703] TensorRT runtime integration

2018-07-27 Thread GitBox
zheng-da commented on a change in pull request #11325: [MXNET-703] TensorRT 
runtime integration
URL: https://github.com/apache/incubator-mxnet/pull/11325#discussion_r205874962
 
 

 ##
 File path: src/executor/graph_executor.cc
 ##
 @@ -941,6 +970,114 @@ void GraphExecutor::FinishInitGraph(nnvm::Symbol symbol,
   this->InitOpSegs();
 }
 
+/*!
+ * \brief This function is triggered after each tensorrt subgraph replacement pass.
+ * It resets the arguments of GraphExecutor::Init(...) because some variables (weights
+ * and biases) are absorbed into the TRT engine; it also reruns attribute inference
+ * according to the new topology.
+ */
+Graph GraphExecutor::ReinitGraph(Graph&& g, const Context &default_ctx,
+                                 const std::map<std::string, Context> &ctx_map,
+                                 std::vector<Context> *in_arg_ctxes,
+                                 std::vector<Context> *arg_grad_ctxes,
+                                 std::vector<Context> *aux_state_ctxes,
+                                 std::vector<OpReqType> *grad_req_types,
+                                 std::unordered_map<std::string, TShape> *arg_shape_map,
+                                 std::unordered_map<std::string, int> *arg_dtype_map,
+                                 std::unordered_map<std::string, int> *arg_stype_map,
+                                 std::unordered_map<std::string, NDArray> *params_map) {
+  std::unordered_set<std::string> to_remove_params;
+  for (auto& el : *params_map) {
+    to_remove_params.insert(el.first);
+  }
+
+  DFSVisit(g.outputs, [&to_remove_params](const nnvm::NodePtr n) {
+    to_remove_params.erase(n->attrs.name);
+  });
+
+  for (auto& el : to_remove_params) {
+    params_map->erase(el);
+    arg_shape_map->erase(el);
+    arg_dtype_map->erase(el);
+    arg_stype_map->erase(el);
+  }
+  const auto &idx = g.indexed_graph();
+  num_forward_inputs_ = idx.input_nodes().size();
+  in_arg_ctxes->resize(num_forward_inputs_ - idx.mutable_input_nodes().size());
 
 Review comment:
   If you change the graph like that, does it work with mxnet module? I checked 
your tests. They are all tested with symbols. Can you test with module?




[GitHub] sandeep-krishnamurthy commented on a change in pull request #11910: Improving documentation and error messages for Async distributed training with Gluon

2018-07-27 Thread GitBox
sandeep-krishnamurthy commented on a change in pull request #11910: Improving 
documentation and error messages for Async distributed training with Gluon
URL: https://github.com/apache/incubator-mxnet/pull/11910#discussion_r205874324
 
 

 ##
 File path: docs/faq/distributed_training.md
 ##
 @@ -73,6 +73,13 @@ These can be passed as arguments to the iterator.
 You can look at 
[example/gluon/image_classification.py](https://github.com/apache/incubator-mxnet/blob/master/example/gluon/image_classification.py)
 to see an example usage.
 
+### Updating weights
+The KVStore server supports two modes: one in which it aggregates the gradients and updates the weights using them, and a second in which it only aggregates gradients. In the latter case, when a worker process pulls from the kvstore, it gets the aggregated gradients, and then uses these gradients to update the weights locally.
+
+When using Gluon, you can choose between these modes by passing the `update_on_kvstore` argument when you create the [Trainer](https://mxnet.incubator.apache.org/versions/master/api/python/gluon/gluon.html#mxnet.gluon.Trainer) object.
 
 Review comment:
   An example code snippet would make this very easy for the reader here.
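   
   Something along these lines, perhaps (a minimal sketch; the layer and 
hyperparameters are placeholders):
   
   ```
   import mxnet as mx
   from mxnet import gluon
   
   net = gluon.nn.Dense(10)
   net.initialize()
   # server only aggregates gradients; each worker applies updates locally
   trainer = gluon.Trainer(net.collect_params(), 'sgd',
                           {'learning_rate': 0.1},
                           kvstore='dist_async',
                           update_on_kvstore=False)
   ```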




[GitHub] sandeep-krishnamurthy commented on a change in pull request #11910: Improving documentation and error messages for Async distributed training with Gluon

2018-07-27 Thread GitBox
sandeep-krishnamurthy commented on a change in pull request #11910: Improving 
documentation and error messages for Async distributed training with Gluon
URL: https://github.com/apache/incubator-mxnet/pull/11910#discussion_r205874742
 
 

 ##
 File path: python/mxnet/gluon/trainer.py
 ##
 @@ -187,6 +187,11 @@ def _init_kvstore(self):
         arg_arrays = {param.name: param.data(self._contexts[0]) for param in self._params}
         kvstore, update_on_kvstore = _create_kvstore(config['kvstore'], len(self._contexts),
                                                      arg_arrays)
+        if kvstore and 'async' in kvstore.type and config['update_on_kvstore'] is not None\
 
 Review comment:
   If we are forcing the user to set this param, why don't we set it inside the 
function itself as the default value?




[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.

2018-07-27 Thread zhasheng
This is an automated email from the ASF dual-hosted git repository.

zhasheng pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new b42dda6  Bump the publish timestamp.
b42dda6 is described below

commit b42dda620cb6d8befa972b10ae069d14ce272862
Author: mxnet-ci 
AuthorDate: Fri Jul 27 19:03:13 2018 +

Bump the publish timestamp.
---
 date.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/date.txt b/date.txt
new file mode 100644
index 000..af0375a
--- /dev/null
+++ b/date.txt
@@ -0,0 +1 @@
+Fri Jul 27 19:03:13 UTC 2018



[GitHub] zhreshold commented on issue #11872: "socket.error: [Errno 111] Connection refused" while training with multiple workers

2018-07-27 Thread GitBox
zhreshold commented on issue #11872: "socket.error: [Errno 111] Connection 
refused" while training with multiple workers
URL: 
https://github.com/apache/incubator-mxnet/issues/11872#issuecomment-408510384
 
 
   I have figured out that the pre-fetch strategy for the data loader is too 
aggressive, which might cause the related shared-memory issue.
   The fix is included in https://github.com/apache/incubator-mxnet/pull/11908
   
   
   




[GitHub] haojin2 commented on a change in pull request #11873: [MXNET-582] Fix flaky test test_operator_gpu.test_batchnorm_with_type (follow-up)

2018-07-27 Thread GitBox
haojin2 commented on a change in pull request #11873: [MXNET-582] Fix flaky 
test test_operator_gpu.test_batchnorm_with_type (follow-up)
URL: https://github.com/apache/incubator-mxnet/pull/11873#discussion_r205865680
 
 

 ##
 File path: tests/python/gpu/test_operator_gpu.py
 ##
 @@ -290,12 +289,12 @@ def test_batchnorm_with_type():
   ]
 
   ctx_list_v2_3D = [
-{'ctx': mx.cpu(0), 'norm_data': (4, 2, 3, 5, 5), 'type_dict': 
{'norm_data': np.float16}},
-{'ctx': mx.cpu(0), 'norm_data': (4, 2, 3, 5, 5), 'type_dict': 
{'norm_data': np.float32}},
-{'ctx': mx.cpu(0), 'norm_data': (4, 2, 3, 5, 5), 'type_dict': 
{'norm_data': np.float64}},
-{'ctx': mx.gpu(0), 'norm_data': (4, 2, 3, 5, 5), 'type_dict': 
{'norm_data': np.float16}},
-{'ctx': mx.gpu(0), 'norm_data': (4, 2, 3, 5, 5), 'type_dict': 
{'norm_data': np.float32}},
-{'ctx': mx.gpu(0), 'norm_data': (4, 2, 3, 5, 5), 'type_dict': 
{'norm_data': np.float64}}
+{'ctx': mx.cpu(0), 'norm_data': (3, 2, 3, 2, 3), 'type_dict': 
{'norm_data': np.float16}},
+{'ctx': mx.cpu(0), 'norm_data': (3, 2, 3, 2, 3), 'type_dict': 
{'norm_data': np.float32}},
+{'ctx': mx.cpu(0), 'norm_data': (3, 2, 3, 2, 3), 'type_dict': 
{'norm_data': np.float64}},
+{'ctx': mx.gpu(0), 'norm_data': (3, 2, 3, 2, 3), 'type_dict': 
{'norm_data': np.float16}},
+{'ctx': mx.gpu(0), 'norm_data': (3, 2, 3, 2, 3), 'type_dict': 
{'norm_data': np.float32}},
+{'ctx': mx.gpu(0), 'norm_data': (3, 2, 3, 2, 3), 'type_dict': 
{'norm_data': np.float64}}
 
 Review comment:
   Just got a chance to take a look at your reply, I'll dig into the 2 links 
you provided. Thanks.




[GitHub] mkolod commented on a change in pull request #11325: [MXNET-703] TensorRT runtime integration

2018-07-27 Thread GitBox
mkolod commented on a change in pull request #11325: [MXNET-703] TensorRT 
runtime integration
URL: https://github.com/apache/incubator-mxnet/pull/11325#discussion_r205860313
 
 

 ##
 File path: src/executor/graph_executor.cc
 ##
 @@ -941,6 +970,114 @@ void GraphExecutor::FinishInitGraph(nnvm::Symbol symbol,
   this->InitOpSegs();
 }
 
+/*!
+ * \brief This function is triggered after each tensorrt subgraph replacement pass.
+ * It resets the arguments of GraphExecutor::Init(...) because some variables (weights
+ * and biases) are absorbed into the TRT engine; it also reruns attribute inference
+ * according to the new topology.
+ */
+Graph GraphExecutor::ReinitGraph(Graph&& g, const Context &default_ctx,
+                                 const std::map<std::string, Context> &ctx_map,
+                                 std::vector<Context> *in_arg_ctxes,
+                                 std::vector<Context> *arg_grad_ctxes,
+                                 std::vector<Context> *aux_state_ctxes,
+                                 std::vector<OpReqType> *grad_req_types,
+                                 std::unordered_map<std::string, TShape> *arg_shape_map,
+                                 std::unordered_map<std::string, int> *arg_dtype_map,
+                                 std::unordered_map<std::string, int> *arg_stype_map,
+                                 std::unordered_map<std::string, NDArray> *params_map) {
+  std::unordered_set<std::string> to_remove_params;
+  for (auto& el : *params_map) {
+    to_remove_params.insert(el.first);
+  }
+
+  DFSVisit(g.outputs, [&to_remove_params](const nnvm::NodePtr n) {
+    to_remove_params.erase(n->attrs.name);
+  });
+
+  for (auto& el : to_remove_params) {
+    params_map->erase(el);
+    arg_shape_map->erase(el);
+    arg_dtype_map->erase(el);
+    arg_stype_map->erase(el);
+  }
+  const auto &idx = g.indexed_graph();
+  num_forward_inputs_ = idx.input_nodes().size();
+  in_arg_ctxes->resize(num_forward_inputs_ - idx.mutable_input_nodes().size());
 
 Review comment:
   @zheng-da Consider any network, such as VGG, ResNet, etc. For any subgraph 
that is extracted by the TensorRT pass, the weights need to be provided to 
TensorRT at TensorRT engine construction time. These weights then become "baked 
into" the engine. Once the subgraph is substituted by a TensorRT node, these 
graph inputs become part of the TensorRT engine and are no longer used by the 
NNVM graph explicitly. Hence, they need to be removed, in order not to waste 
memory, and to prevent the confusion where some inputs still exist in the NNVM 
graph, but are not used anymore.




[GitHub] mkolod commented on a change in pull request #11325: [MXNET-703] TensorRT runtime integration

2018-07-27 Thread GitBox
mkolod commented on a change in pull request #11325: [MXNET-703] TensorRT 
runtime integration
URL: https://github.com/apache/incubator-mxnet/pull/11325#discussion_r205860313
 
 

 ##
 File path: src/executor/graph_executor.cc
 ##
 @@ -941,6 +970,114 @@ void GraphExecutor::FinishInitGraph(nnvm::Symbol symbol,
   this->InitOpSegs();
 }
 
+/*!
+ * \brief This function is triggered after each tensorrt subgraph replacement pass.
+ * It resets the arguments of GraphExecutor::Init(...) because some variables (weights
+ * and biases) are absorbed into the TRT engine; it also reruns attribute inference
+ * according to the new topology.
+ */
+Graph GraphExecutor::ReinitGraph(Graph&& g, const Context &default_ctx,
+                                 const std::map<std::string, Context> &ctx_map,
+                                 std::vector<Context> *in_arg_ctxes,
+                                 std::vector<Context> *arg_grad_ctxes,
+                                 std::vector<Context> *aux_state_ctxes,
+                                 std::vector<OpReqType> *grad_req_types,
+                                 std::unordered_map<std::string, TShape> *arg_shape_map,
+                                 std::unordered_map<std::string, int> *arg_dtype_map,
+                                 std::unordered_map<std::string, int> *arg_stype_map,
+                                 std::unordered_map<std::string, NDArray> *params_map) {
+  std::unordered_set<std::string> to_remove_params;
+  for (auto& el : *params_map) {
+    to_remove_params.insert(el.first);
+  }
+
+  DFSVisit(g.outputs, [&to_remove_params](const nnvm::NodePtr n) {
+    to_remove_params.erase(n->attrs.name);
+  });
+
+  for (auto& el : to_remove_params) {
+    params_map->erase(el);
+    arg_shape_map->erase(el);
+    arg_dtype_map->erase(el);
+    arg_stype_map->erase(el);
+  }
+  const auto &idx = g.indexed_graph();
+  num_forward_inputs_ = idx.input_nodes().size();
+  in_arg_ctxes->resize(num_forward_inputs_ - idx.mutable_input_nodes().size());
 
 Review comment:
   @zheng-da Consider any network, such as VGG, ResNet, etc. For any subgraph 
that is extracted by the TensorRT pass, the weights need to be provided to 
TensorRT at TensorRT engine construction time. These weights then become "baked 
into" the engine. Once the subgraph is substituted by a TensorRT node, these 
graph inputs become part of the TensorRT engine and are no longer used by the 
NNVM graph explicitly. Hence, they need to be removed, in order not to waste 
memory, and to prevent the confusion where some inputs still exist in the NNVM 
graph, but are not used anymore.




[GitHub] mkolod commented on a change in pull request #11325: [MXNET-703] TensorRT runtime integration

2018-07-27 Thread GitBox
mkolod commented on a change in pull request #11325: [MXNET-703] TensorRT 
runtime integration
URL: https://github.com/apache/incubator-mxnet/pull/11325#discussion_r205860313
 
 

 ##
 File path: src/executor/graph_executor.cc
 ##
 @@ -941,6 +970,114 @@ void GraphExecutor::FinishInitGraph(nnvm::Symbol symbol,
   this->InitOpSegs();
 }
 
+/*!
+ * \brief This function is triggered after each tensorrt subgraph replacement pass.
+ * It resets the arguments of GraphExecutor::Init(...) because some variables (weights
+ * and biases) are absorbed into the TRT engine; it also reruns attribute inference
+ * according to the new topology.
+ */
+Graph GraphExecutor::ReinitGraph(Graph&& g, const Context &default_ctx,
+                                 const std::map<std::string, Context> &ctx_map,
+                                 std::vector<Context> *in_arg_ctxes,
+                                 std::vector<Context> *arg_grad_ctxes,
+                                 std::vector<Context> *aux_state_ctxes,
+                                 std::vector<OpReqType> *grad_req_types,
+                                 std::unordered_map<std::string, TShape> *arg_shape_map,
+                                 std::unordered_map<std::string, int> *arg_dtype_map,
+                                 std::unordered_map<std::string, int> *arg_stype_map,
+                                 std::unordered_map<std::string, NDArray> *params_map) {
+  std::unordered_set<std::string> to_remove_params;
+  for (auto& el : *params_map) {
+    to_remove_params.insert(el.first);
+  }
+
+  DFSVisit(g.outputs, [&to_remove_params](const nnvm::NodePtr n) {
+    to_remove_params.erase(n->attrs.name);
+  });
+
+  for (auto& el : to_remove_params) {
+    params_map->erase(el);
+    arg_shape_map->erase(el);
+    arg_dtype_map->erase(el);
+    arg_stype_map->erase(el);
+  }
+  const auto &idx = g.indexed_graph();
+  num_forward_inputs_ = idx.input_nodes().size();
+  in_arg_ctxes->resize(num_forward_inputs_ - idx.mutable_input_nodes().size());
 
 Review comment:
   @zheng-da Consider any network, such as VGG, ResNet, etc. For any subgraph 
that is extracted by the TensorRT pass, the weights need to be provided to 
TensorRT at TensorRT engine construction time. These weights then become "baked 
in" the engine. Once the subgraph is substituted by a TensorRT node, these 
graph inputs become part of the TensorRT engine and are no longer used by the 
NNVM graph explicitly. Hence, they need to be removed, in order not to waste 
memory, and to prevent the confusion where some inputs still exists in the NNVM 
graph, but are not used anymore.




[GitHub] kalyc commented on issue #11685: test_executor.test_bind has fixed seed that can mask flakiness

2018-07-27 Thread GitBox
kalyc commented on issue #11685: test_executor.test_bind has fixed seed that 
can mask flakiness
URL: 
https://github.com/apache/incubator-mxnet/issues/11685#issuecomment-408499091
 
 
   Able to reproduce flaky test error - 
   ```
   def test_bind():
   def check_bind(disable_bulk_exec):
   if disable_bulk_exec:
   prev_bulk_inf_val = 
mx.test_utils.set_env_var("MXNET_EXEC_BULK_EXEC_INFERENCE", "0", "1")
   prev_bulk_train_val = 
mx.test_utils.set_env_var("MXNET_EXEC_BULK_EXEC_TRAIN", "0", "1")
   
   nrepeat = 10
   maxdim = 4
   for repeat in range(nrepeat):
   for dim in range(1, maxdim):
   check_bind_with_uniform(lambda x, y: x + y,
   lambda g, x, y: (g, g),
   dim)
   check_bind_with_uniform(lambda x, y: x - y,
   lambda g, x, y: (g, -g),
   dim)
   check_bind_with_uniform(lambda x, y: x * y,
   lambda g, x, y: (y * g, x * g),
   dim)
   check_bind_with_uniform(lambda x, y: x / y,
   lambda g, x, y: (g / y, -x * g/ 
(y**2)),
   dim)
   
    check_bind_with_uniform(lambda x, y: np.maximum(x, y),
                            lambda g, x, y: (g * (x>y), g * (y>x)),
                            dim,
                            sf=mx.symbol.maximum)
    check_bind_with_uniform(lambda x, y: np.minimum(x, y),
                            lambda g, x, y: (g * (x<y), g * (y<x)),
                            dim,
                            sf=mx.symbol.minimum)
    
    >   check_bind(True)
   
   test_executor.py:117: 
   _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
   test_executor.py:108: in check_bind
   sf=mx.symbol.maximum)
   _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
   
    uf = <function <lambda> at 0x1078889b0>, gf = <function <lambda> at 0x10e2852a8>, dim = 3, sf = <function maximum at 0x...>, lshape = (3, 1, 1), rshape = (3, 1, 1)
   
   def check_bind_with_uniform(uf, gf, dim, sf=None, lshape=None, 
rshape=None):
   """check function consistency with uniform random numbers"""
   shape = tuple(np.random.randint(1, int(1000**(1.0/dim)), size=dim))
   lhs = mx.symbol.Variable('lhs')
   rhs = mx.symbol.Variable('rhs')
   if sf is not None:
   ret = sf(lhs, rhs)
   else:
   ret = uf(lhs, rhs)
   
   assert ret.list_arguments() == ['lhs', 'rhs']
   lshape = shape if lshape is None else lshape
   rshape = shape if rshape is None else rshape
   
   lhs_arr = mx.nd.array(np.random.uniform(-1, 1, lshape))
   rhs_arr = mx.nd.array(np.random.uniform(-1, 1, rshape))
   lhs_grad = mx.nd.empty(lshape)
   rhs_grad = mx.nd.empty(rshape)
   executor = ret.bind(mx.Context('cpu'),
   args=[lhs_arr, rhs_arr],
   args_grad=[lhs_grad, rhs_grad])
   
   exec3 = ret.bind(mx.Context('cpu'),
args=[lhs_arr, rhs_arr])
   
   
   exec4 = ret.bind(mx.Context('cpu'),
args={'rhs': rhs_arr, 'lhs': lhs_arr},
args_grad={'lhs': lhs_grad, 'rhs': rhs_grad})
   
   executor.forward()
   exec3.forward()
   exec4.forward()
   out2 = executor.outputs[0].asnumpy()
   out1 = uf(lhs_arr.asnumpy(), rhs_arr.asnumpy())
   out3 = exec3.outputs[0].asnumpy()
   out4 = exec4.outputs[0].asnumpy()
   assert reldiff(out1, out2) < 1e-6
   assert reldiff(out1, out3) < 1e-6
   assert reldiff(out1, out4) < 1e-6
   # test gradient
   out_grad = mx.nd.array(np.ones(out2.shape))
   lhs_grad2, rhs_grad2 = gf(out_grad.asnumpy(),
 lhs_arr.asnumpy(),
 rhs_arr.asnumpy())
   executor.backward([out_grad])
   
   >   assert reldiff(lhs_grad.asnumpy(), lhs_grad2) < 1e-6
   E   assert nan < 1e-06
   E+  where nan = reldiff(array([[[ 0.]],\n\n   [[ 0.]],\n\n   
[[ 0.]]], dtype=float32), array([[[ 0.]],\n\n   [[ 0.]],\n\n   [[ 
0.]]], dtype=float32))
    E+where array([[[ 0.]],\n\n   [[ 0.]],\n\n   [[ 0.]]], 
dtype=float32) = <bound method NDArray.asnumpy of <NDArray 3x1x1 @cpu(0)>>()
   E
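   
   (The nan comes from reldiff normalising by the arrays' magnitudes, so two 
all-zero gradients give 0/0. Sketch of the helper; the exact implementation in 
the test utilities may differ:)
   
   ```
   import numpy as np
   
   def reldiff(a, b):
       diff = np.sum(np.abs(a - b))
       norm = np.sum(np.abs(a)) + np.sum(np.abs(b))
       return diff / norm  # 0 / 0 -> nan when both arrays are all zeros
   
   print(reldiff(np.zeros((3, 1, 1)), np.zeros((3, 1, 1))))  # nan
   ```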

[GitHub] safrooze commented on issue #11906: support 1D and 3D arrays in MKLDNN.

2018-07-27 Thread GitBox
safrooze commented on issue #11906: support 1D and 3D arrays in MKLDNN.
URL: 
https://github.com/apache/incubator-mxnet/issues/11906#issuecomment-408498543
 
 
   3D tensors are common in audio signals, e.g. WaveNet. I specifically ran 
into this issue with WaveNet inference on CPU.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] zheng-da commented on a change in pull request #11325: [MXNET-703] TensorRT runtime integration

2018-07-27 Thread GitBox
zheng-da commented on a change in pull request #11325: [MXNET-703] TensorRT 
runtime integration
URL: https://github.com/apache/incubator-mxnet/pull/11325#discussion_r205852327
 
 

 ##
 File path: src/executor/graph_executor.cc
 ##
 @@ -1018,20 +1156,49 @@ void GraphExecutor::Init(nnvm::Symbol symbol,
                                 g.GetAttr<StorageTypeVector>("storage_type"));
   }
 
+  if (use_tensorrt_) {
+  #if MXNET_USE_TENSORRT
+    // check that this graph is inference-only
+    if (std::any_of(grad_req_types->begin(), grad_req_types->end(),
+                    [](const OpReqType& op) { return op != kNullOp; })) {
+      LOG(FATAL) << "MXNET_USE_TENSORRT set but graph is not inference-only. "
+                 << "If it is an inference graph, set grad_req to null during simple_bind call. "
+                 << "If it is a training graph, unset the MXNET_USE_TENSORRT env variable";
+    }
+    if (shared_buffer->empty()) {
+      LOG(FATAL) << "MXNET_USE_TENSORRT = 1 but shared_buffer is empty. "
+                 << "Please provide weights and other parameters, such as "
+                 << "BatchNorm moments, via the shared_buffer, during simple bind call.";
+    }
+    auto trt_groups = GetTrtCompatibleSubsets(g, shared_buffer);
+    for (auto trt_group : trt_groups) {
+      if (trt_group.size() > 1) {
+        g = ReplaceSubgraph(std::move(g), trt_group, shared_buffer);
+        g = ReinitGraph(std::move(g), default_ctx, ctx_map, in_arg_ctxes, arg_grad_ctxes,
+                        aux_state_ctxes, grad_req_types, arg_shape_map, arg_dtype_map,
+                        arg_stype_map, shared_buffer);
 
 Review comment:
   why do you need to reinit the graph whenever a subgraph is replaced? Can you 
reinit outside the loop?




[GitHub] zheng-da commented on a change in pull request #11325: [MXNET-703] TensorRT runtime integration

2018-07-27 Thread GitBox
zheng-da commented on a change in pull request #11325: [MXNET-703] TensorRT 
runtime integration
URL: https://github.com/apache/incubator-mxnet/pull/11325#discussion_r205851779
 
 

 ##
 File path: src/executor/graph_executor.cc
 ##
 @@ -941,6 +970,114 @@ void GraphExecutor::FinishInitGraph(nnvm::Symbol symbol,
   this->InitOpSegs();
 }
 
+/*!
+ * \brief This function is triggered after each tensorrt subgraph replacement pass.
+ * It resets the arguments of GraphExecutor::Init(...) because some variables (weights
+ * and biases) are absorbed into the TRT engine; it also reruns attribute inference
+ * according to the new topology.
+ */
+Graph GraphExecutor::ReinitGraph(Graph&& g, const Context &default_ctx,
+                                 const std::map<std::string, Context> &ctx_map,
+                                 std::vector<Context> *in_arg_ctxes,
+                                 std::vector<Context> *arg_grad_ctxes,
+                                 std::vector<Context> *aux_state_ctxes,
+                                 std::vector<OpReqType> *grad_req_types,
+                                 std::unordered_map<std::string, TShape> *arg_shape_map,
+                                 std::unordered_map<std::string, int> *arg_dtype_map,
+                                 std::unordered_map<std::string, int> *arg_stype_map,
+                                 std::unordered_map<std::string, NDArray> *params_map) {
+  std::unordered_set<std::string> to_remove_params;
+  for (auto& el : *params_map) {
+    to_remove_params.insert(el.first);
+  }
+
+  DFSVisit(g.outputs, [&to_remove_params](const nnvm::NodePtr n) {
+    to_remove_params.erase(n->attrs.name);
+  });
+
+  for (auto& el : to_remove_params) {
+    params_map->erase(el);
+    arg_shape_map->erase(el);
+    arg_dtype_map->erase(el);
+    arg_stype_map->erase(el);
+  }
+  const auto &idx = g.indexed_graph();
+  num_forward_inputs_ = idx.input_nodes().size();
+  in_arg_ctxes->resize(num_forward_inputs_ - idx.mutable_input_nodes().size());
 
 Review comment:
   Why does the number of inputs to a graph change? When partitioning a graph, 
do you put some inputs inside a subgraph so that they are no longer exposed in 
the main graph?




[GitHub] aaronmarkham opened a new pull request #11917: update home page for 1.2.1 announcement

2018-07-27 Thread GitBox
aaronmarkham opened a new pull request #11917: update home page for 1.2.1 
announcement
URL: https://github.com/apache/incubator-mxnet/pull/11917
 
 
   ## Description ##
   Modifies the 1.2.0 branch to have the updated announcement. Also there's a 
url fix for 'why_mxnet' in there too.




[GitHub] zheng-da commented on issue #11664: Fall back when sparse arrays are passed to MKLDNN-enabled operators

2018-07-27 Thread GitBox
zheng-da commented on issue #11664: Fall back when sparse arrays are passed to 
MKLDNN-enabled operators
URL: https://github.com/apache/incubator-mxnet/pull/11664#issuecomment-408488135
 
 
   Please benchmark the performance with this modification to make sure there 
isn't a performance regression.




[GitHub] zheng-da commented on a change in pull request #11664: Fall back when sparse arrays are passed to MKLDNN-enabled operators

2018-07-27 Thread GitBox
zheng-da commented on a change in pull request #11664: Fall back when sparse 
arrays are passed to MKLDNN-enabled operators
URL: https://github.com/apache/incubator-mxnet/pull/11664#discussion_r205847059
 
 

 ##
 File path: src/operator/nn/activation.cc
 ##
 @@ -128,17 +130,25 @@ inline static bool BackwardActStorageType(const nnvm::NodeAttrs& attrs,
   bool ret = false;
   const ActivationParam& param = nnvm::get<ActivationParam>(attrs.parsed);
 #if (MXNET_USE_CUDNN == 1 || MXNET_USE_MKLDNN == 1)
-  if (param.act_type != activation::kReLU) {
-    CHECK_EQ(in_attrs->size(), 3U);
-    ret = ElemwiseStorageType<3, 1, false, false, false>(attrs, dev_mask,
-                                                         dispatch_mode,
-                                                         in_attrs, out_attrs);
+  bool should_continue = true;
+#if MXNET_USE_MKLDNN == 1
+  if (!(dev_mask == mshadow::cpu::kDevMask && SupportMKLDNNAct(param))) {
+    should_continue = false;
+  }
+#endif
+  if (should_continue) {
+    if (param.act_type != activation::kReLU) {
+      CHECK_EQ(in_attrs->size(), 3U);
+      ret = ElemwiseStorageType<3, 1, false, false, false>(
+          attrs, dev_mask, dispatch_mode, in_attrs, out_attrs);
 
 Review comment:
   I still have the same question here: can dispatch_mode be kFComputeEx? 
ElemwiseStorageType only uses kFComputeEx for sparse storage types.




[GitHub] ssttevee commented on issue #11914: NDArray.asscalar(): CUDA an illegal memory access was encountered

2018-07-27 Thread GitBox
ssttevee commented on issue #11914: NDArray.asscalar(): CUDA an illegal memory 
access was encountered
URL: 
https://github.com/apache/incubator-mxnet/issues/11914#issuecomment-408487363
 
 
   Sorry, I meant to say that it doesn't crash with a higher data length like `--data_length=100 --batch_size=2`. The crashes don't seem to have a direct correlation with the actual data length.
   
   It wouldn't make much sense for it to be a memory limit anyway, since none of the parameters gets anywhere close to 1 GB of memory, whereas both my GPUs have well over 1 GB of memory.
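   
   A hypothetical way to localize this kind of crash (a sketch, not code from this thread; it relies on the documented `MXNET_ENGINE_TYPE` environment variable): MXNet runs GPU operators asynchronously, so an illegal access is often reported by a later call such as `asscalar()` rather than by the operator that caused it. Forcing the synchronous engine makes the failing operator raise directly:
   ```python
   import os
   # Must be set before mxnet is imported; the engine is chosen at load time.
   os.environ['MXNET_ENGINE_TYPE'] = 'NaiveEngine'

   import mxnet as mx

   # Placeholder for the real repro script; with the synchronous engine the
   # traceback points at the operator that actually triggered the bad access.
   x = mx.nd.zeros((1,), ctx=mx.gpu(0))
   mx.nd.waitall()   # flush pending work; pending errors surface here
   print(x.asscalar())
   ```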




[GitHub] zheng-da commented on issue #11906: support 1D and 3D arrays in MKLDNN.

2018-07-27 Thread GitBox
zheng-da commented on issue #11906: support 1D and 3D arrays in MKLDNN.
URL: 
https://github.com/apache/incubator-mxnet/issues/11906#issuecomment-408485145
 
 
   I know mkldnn has 3D arrays, but not all mkldnn operators support 3D arrays. I believe the current implementation of the mkldnn integration only allows 2D and 4D arrays.
   
   For example, mkldnn convolution only supports 2D kernels on 4D arrays. Currently, 1D convolution actually calls the native implementation, but if we add a fake dim to turn a 1D conv into a 2D conv, we can get a substantial speedup. I think this applies to many other operators.
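   
   A minimal sketch of that fake-dim trick (shapes are made up, and this uses the standard `Convolution` operator rather than code from the MKLDNN integration itself):
   ```python
   import mxnet as mx
   import numpy as np

   x1d = mx.nd.random.uniform(shape=(8, 16, 100))   # (N, C, W)
   w1d = mx.nd.random.uniform(shape=(32, 16, 3))    # (num_filter, C, kW)
   out1d = mx.nd.Convolution(data=x1d, weight=w1d, no_bias=True,
                             kernel=(3,), num_filter=32)

   # Insert a dummy height axis so the same computation runs as a 2D conv,
   # which is the case the MKLDNN path handles.
   x2d = x1d.expand_dims(axis=2)                    # (N, C, 1, W)
   w2d = w1d.expand_dims(axis=2)                    # (num_filter, C, 1, kW)
   out2d = mx.nd.Convolution(data=x2d, weight=w2d, no_bias=True,
                             kernel=(1, 3), num_filter=32)

   # Both paths compute the same result up to floating-point noise.
   assert np.allclose(out1d.asnumpy(),
                      out2d.reshape(out1d.shape).asnumpy(), atol=1e-5)
   ```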




[GitHub] ptrendx commented on issue #11886: Improve error message of cudnn operators

2018-07-27 Thread GitBox
ptrendx commented on issue #11886: Improve error message of cudnn operators
URL: https://github.com/apache/incubator-mxnet/pull/11886#issuecomment-408484982
 
 
   The thing is, by default there is a limit on the workspace that convolution may take (see the `workspace` parameter at http://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.Convolution; not sure how it is specified in Gluon), so sometimes you may have enough GPU memory but the workspace limit prevents choosing the algo anyway.
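   
   For anyone following along, a hedged sketch of raising that cap in the symbolic API (the value is in MB; defaults vary across MXNet versions):
   ```python
   import mxnet as mx

   data = mx.sym.Variable('data')
   # `workspace` caps the scratch memory (MB) the convolution may use during
   # algorithm selection; a faster algo needing more than the cap is skipped
   # even when the GPU has free memory.
   conv = mx.sym.Convolution(data=data, kernel=(3, 3), num_filter=64,
                             workspace=2048)  # raise the cap to 2 GB
   ```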




[GitHub] szha commented on issue #11916: [MXNET-371] Sphinx error reduction

2018-07-27 Thread GitBox
szha commented on issue #11916: [MXNET-371] Sphinx error reduction
URL: https://github.com/apache/incubator-mxnet/pull/11916#issuecomment-408484837
 
 
   AWESOME!




[GitHub] mkolod commented on issue #11325: [MXNET-703] TensorRT runtime integration

2018-07-27 Thread GitBox
mkolod commented on issue #11325: [MXNET-703] TensorRT runtime integration
URL: https://github.com/apache/incubator-mxnet/pull/11325#issuecomment-408482195
 
 
   @Roshrini It is, in my opinion, but whether it is according to the committers, I assume, depends on what @piiswrong thinks, right? He asked the following question 2 days ago [here](https://github.com/apache/incubator-mxnet/pull/11325#pullrequestreview-140440732), and I answered it [here](https://github.com/apache/incubator-mxnet/pull/11325#issuecomment-408247485) in English and [here](https://github.com/mkolod/incubator-mxnet/commit/eebd373f0a8c863b96e7211311e50f6aa2ce9f13) in code. Whether this satisfies the MXNet committers is something I cannot answer; please check with @piiswrong.




[GitHub] aaronmarkham commented on issue #11916: [MXNET-371] Sphinx error reduction

2018-07-27 Thread GitBox
aaronmarkham commented on issue #11916: [MXNET-371] Sphinx error reduction
URL: https://github.com/apache/incubator-mxnet/pull/11916#issuecomment-408479831
 
 
   @ThomasDelteil @thomelane @sandeep-krishnamurthy @kevinthesun @piiswrong 
@mli @nswamy @marcoabreu - you all use this, or have used it... please let me 
know if you have any suggestions. 




[GitHub] haojin2 commented on issue #11886: Improve error message of cudnn operators

2018-07-27 Thread GitBox
haojin2 commented on issue #11886: Improve error message of cudnn operators
URL: https://github.com/apache/incubator-mxnet/pull/11886#issuecomment-408479199
 
 
   @ptrendx How does the following message look to you?
   ```
   N algorithms with a minimum memory requirement of M bytes have been tried. There are only X bytes of available workspace on your GPU; please consider reducing the batch size or model size.
   ```




[GitHub] piyushghai edited a comment on issue #11626: [MXNET-651] MXNet Model Backwards Compatibility Checker

2018-07-27 Thread GitBox
piyushghai edited a comment on issue #11626: [MXNET-651] MXNet Model Backwards 
Compatibility Checker
URL: https://github.com/apache/incubator-mxnet/pull/11626#issuecomment-408271045
 
 
   @marcoabreu The Jenkins CI [1] build is giving an error on the import statement for MXNet under the Inference Stage of the JenkinsFileForMBCC. I'm not able to figure out why that's happening.
   Can you have a look at 69843fbe4d6669c135d3ae85aa56df144bc6c076 and give a second opinion?
   
   [1] : 
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/test-backwards-compatibility-checker/detail/test-backwards-compatibility-checker/12/pipeline/4/
   
   
   Edit --> fae44fe22e322d928fa735968476d38cfbf26e62 seems to have fixed the 
issue.  [2]
   @marcoabreu We now need to add the IAM User policy for S3 bucket access. 
   
   [2] : 
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/test-backwards-compatibility-checker/detail/test-backwards-compatibility-checker/13/pipeline/4
 




[GitHub] sandeep-krishnamurthy closed issue #7725: accuracy of cpp example is a constant value when training, no matter how many epochs trained!

2018-07-27 Thread GitBox
sandeep-krishnamurthy closed issue #7725: accuracy of cpp example is a constant 
value when training, no matter how many epochs trained!
URL: https://github.com/apache/incubator-mxnet/issues/7725
 
 
   




[GitHub] sandeep-krishnamurthy commented on issue #7725: accuracy of cpp example is a constant value when training, no matter how many epochs trained!

2018-07-27 Thread GitBox
sandeep-krishnamurthy commented on issue #7725: accuracy of cpp example is a 
constant value when training, no matter how many epochs trained!
URL: 
https://github.com/apache/incubator-mxnet/issues/7725#issuecomment-408477173
 
 
   Resolving in favor of #8551. Please reopen if the issue still persists.




[GitHub] sandeep-krishnamurthy closed issue #11911: Illegal instruction (core dumped)

2018-07-27 Thread GitBox
sandeep-krishnamurthy closed issue #11911: Illegal instruction (core dumped)
URL: https://github.com/apache/incubator-mxnet/issues/11911
 
 
   




[GitHub] sandeep-krishnamurthy commented on issue #11914: NDArray.asscalar(): CUDA an illegal memory access was encountered

2018-07-27 Thread GitBox
sandeep-krishnamurthy commented on issue #11914: NDArray.asscalar(): CUDA an 
illegal memory access was encountered
URL: 
https://github.com/apache/incubator-mxnet/issues/11914#issuecomment-408476538
 
 
   Since it correlates directly with the data length, it looks like the GPU ran out of memory?




[GitHub] sandeep-krishnamurthy closed issue #11915: The behavior of np.zeros_like(x) if x is a NDArray is unexpected.

2018-07-27 Thread GitBox
sandeep-krishnamurthy closed issue #11915: The behavior of np.zeros_like(x) if 
x is a NDArray is unexpected.
URL: https://github.com/apache/incubator-mxnet/issues/11915
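 
   For context, the surprise is that `np.zeros_like` does not understand `mx.nd.NDArray` inputs. A hedged sketch of the usual workarounds (convert to numpy explicitly, or stay in the NDArray world):
   ```python
   import numpy as np
   import mxnet as mx

   x = mx.nd.ones((2, 3))

   z_np = np.zeros_like(x.asnumpy())  # explicit conversion -> numpy result
   z_nd = mx.nd.zeros_like(x)         # stays an NDArray on x's context
   ```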
 
 
   




[GitHub] haojin2 commented on issue #11900: Re-enabling randomized test_l2_normalization

2018-07-27 Thread GitBox
haojin2 commented on issue #11900: Re-enabling randomized test_l2_normalization
URL: https://github.com/apache/incubator-mxnet/pull/11900#issuecomment-408474433
 
 
   @rahul003 It was added before my PR adding fp16 support was merged.



