[incubator-mxnet] branch master updated (66ab27e -> bbc7a22)
This is an automated email from the ASF dual-hosted git repository.

nswamy pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git.

    from 66ab27e  add import_ for SymbolBlock (#11127)
     add bbc7a22  Scala inference memory leak fix (#11204)

No new revisions were added by this update.

Summary of changes:
 .../main/scala/org/apache/mxnet/FeedForward.scala | 25 --
 1 file changed, 18 insertions(+), 7 deletions(-)

--
To stop receiving notification emails like this one, please contact nsw...@apache.org.
[GitHub] nswamy closed pull request #11204: Scala inference memory leak fix
nswamy closed pull request #11204: Scala inference memory leak fix
URL: https://github.com/apache/incubator-mxnet/pull/11204

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

```diff
diff --git a/scala-package/core/src/main/scala/org/apache/mxnet/FeedForward.scala b/scala-package/core/src/main/scala/org/apache/mxnet/FeedForward.scala
index 7289df19712..87c9bc72be0 100644
--- a/scala-package/core/src/main/scala/org/apache/mxnet/FeedForward.scala
+++ b/scala-package/core/src/main/scala/org/apache/mxnet/FeedForward.scala
@@ -224,13 +224,24 @@ class FeedForward private(
     var i = 0
     while (data.hasNext && i != numBatch) {
       val batch = data.next()
-      i += 1
-      ExecutorManager.loadData(batch, dataArrays)
-      predExec.forward(isTrain = false)
-      val padded = batch.pad
-      val realSize = batchSize - padded
-      for ((list, nd) <- outputs zip predExec.outputs) {
-        list += nd.slice(0, realSize).copy()
+      try {
+        i += 1
+        ExecutorManager.loadData(batch, dataArrays)
+        predExec.forward(isTrain = false)
+        val padded = batch.pad
+        val realSize = batchSize - padded
+        for ((list, nd) <- outputs zip predExec.outputs) {
+          // The slice is being written to a value so that dispose can be called after the copy.
+          // The one liner nd.slice().copy() leads to leaking the memory of the slice.
+          val ndSliced = nd.slice(0, realSize)
+          try {
+            list += ndSliced.copy()
+          } finally {
+            ndSliced.dispose()
+          }
+        }
+      } finally {
+        batch.dispose()
       }
     }
     // TODO(Yizhi): we can use Symbol.concat to do the same thing. Can it be more efficient?
```

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services
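The pattern behind this fix generalizes beyond Scala: chaining `nd.slice(0, realSize).copy()` drops the only reference to the intermediate slice, so its native memory can never be explicitly released. Binding the intermediate to a name and disposing of it in a `finally` block guarantees cleanup even if the copy throws. A minimal Python sketch of the same idea — `NativeBuffer`, `slice`, `copy`, and `dispose` are toy stand-ins for illustration, not the MXNet API:

```python
class NativeBuffer:
    """Toy stand-in for a natively allocated handle (e.g. an NDArray slice)."""
    live = 0  # count of un-disposed buffers, to make leaks visible

    def __init__(self, data):
        self.data = list(data)
        NativeBuffer.live += 1

    def slice(self, start, stop):
        # Returns a NEW buffer: the caller is responsible for disposing it.
        return NativeBuffer(self.data[start:stop])

    def copy(self):
        return NativeBuffer(self.data)

    def dispose(self):
        NativeBuffer.live -= 1


def collect(nd, real_size, out):
    # Leaky version would be: out.append(nd.slice(0, real_size).copy()),
    # which drops the only reference to the slice, so it is never disposed.
    sliced = nd.slice(0, real_size)   # keep a name for the intermediate
    try:
        out.append(sliced.copy())     # only the copy outlives this call
    finally:
        sliced.dispose()              # release the intermediate either way


buf = NativeBuffer(range(10))
out = []
collect(buf, 4, out)
buf.dispose()
print(NativeBuffer.live)  # only the copied result remains live -> 1
```

The same shape appears in the merged diff twice: once around the slice, and once around the whole batch via `batch.dispose()`.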
[GitHub] asitstands commented on issue #10951: [MXNET-545] Fix broken cython build
asitstands commented on issue #10951: [MXNET-545] Fix broken cython build
URL: https://github.com/apache/incubator-mxnet/pull/10951#issuecomment-397175348

```bash
make
cd python
make cython
# pip install or export PYTHONPATH=my_installation_dir
python setup.py install --with-cython --prefix=my_installation_dir
cd ../tests/python
# run tests ...
```

I think that this is the typical way to build and run the tests. A cmake build would be similar. Setting cxx/nvcc compiler flags is not needed.
[GitHub] ThomasDelteil commented on issue #11155: [MXNET-521] Add Facebook open-graph tag integration
ThomasDelteil commented on issue #11155: [MXNET-521] Add Facebook open-graph tag integration
URL: https://github.com/apache/incubator-mxnet/pull/11155#issuecomment-397175675

@anirudh2290 @indhub could one of you merge this please?
[GitHub] lihaofd opened a new pull request #11273: disable testcase test_gru_bidirectional temporarily
lihaofd opened a new pull request #11273: disable testcase test_gru_bidirectional temporarily
URL: https://github.com/apache/incubator-mxnet/pull/11273

## Description ##
Disable the test case test_gru_bidirectional temporarily; tracked at https://github.com/apache/incubator-mxnet/issues/11219
[GitHub] ThomasDelteil commented on issue #11210: [MXNET-532] Clarify documentation of save_params(), load_params(), export()
ThomasDelteil commented on issue #11210: [MXNET-532] Clarify documentation of save_params(), load_params(), export()
URL: https://github.com/apache/incubator-mxnet/pull/11210#issuecomment-397175139

@anirudh2290 updated. FYI, how to save an MXNet model when training on an embedded device:

![save](https://user-images.githubusercontent.com/3716307/41393032-c2e07a80-6f58-11e8-8add-e679357f476a.gif)
[GitHub] anirudh2290 commented on a change in pull request #11210: [MXNET-532] Clarify documentation of save_params(), load_params(), export()
anirudh2290 commented on a change in pull request #11210: [MXNET-532] Clarify documentation of save_params(), load_params(), export()
URL: https://github.com/apache/incubator-mxnet/pull/11210#discussion_r195304671

## File path: python/mxnet/gluon/block.py ##

```diff
@@ -309,6 +309,17 @@ def _collect_params_with_prefix(self, prefix=''):
     def save_params(self, filename):
         """Save parameters to file.
+        This function is to be used to save parameters of a Gluon model, note that
+        the saved parameters are not meant to be loaded in a different language binding for now.
+        Saving parameters using `.save_params()` is different than
+        `.collect_params().save()`, which is a deprecated way to save parameters of a model
+        and should be avoided.
```

Review comment: @ThomasDelteil #11127 is merged. Can you make the necessary changes?
[GitHub] ThomasDelteil commented on issue #11219: Flaky test: test_gru_bidirectional
ThomasDelteil commented on issue #11219: Flaky test: test_gru_bidirectional
URL: https://github.com/apache/incubator-mxnet/issues/11219#issuecomment-397173712

@lihaofd could you disable this test while you investigate why it is failing on Windows?
[GitHub] marcoabreu commented on issue #10951: [MXNET-545] Fix broken cython build
marcoabreu commented on issue #10951: [MXNET-545] Fix broken cython build
URL: https://github.com/apache/incubator-mxnet/pull/10951#issuecomment-397172819

Depends, I'm not familiar with cython :) Does it require its own pipeline, or is it just a compile flag we can turn on? Everything of interest to you would be in Jenkinsfile and ci/docker, right.
[GitHub] asitstands commented on issue #10951: [MXNET-545] Fix broken cython build
asitstands commented on issue #10951: [MXNET-545] Fix broken cython build
URL: https://github.com/apache/incubator-mxnet/pull/10951#issuecomment-397172463

@marcoabreu Thank you. First, I'm just not sure what the best way is to add a cython build to CI. There are already many environments. Would you recommend some existing (linux) environments to add the cython build to? Or is it a good idea to add new environments for the cython build? Second, I'm not familiar with Jenkins and docker. It looks like I need to edit `Jenkinsfile` and files under the `docker` dir. Are there any other files that I need to check?
[GitHub] azai91 opened a new pull request #11272: [MXNET-546] Add unit test for MKLDNNSum
azai91 opened a new pull request #11272: [MXNET-546] Add unit test for MKLDNNSum
URL: https://github.com/apache/incubator-mxnet/pull/11272

## Description ##
Add a unit test for the MKLDNNSum helper function.

## Checklist ##
### Essentials ###
Please feel free to remove inapplicable items for your PR.
- [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) created (except PRs with tiny changes)
- [ ] Changes are complete (i.e. I finished coding on this PR)
- [ ] All changes have test coverage:
  - Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  - Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  - Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
- [ ] Code is well-documented:
  - For user-facing API changes, API doc string has been updated.
  - For new C++ functions in header files, their functionalities and arguments are documented.
  - For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on the test set and a reference to the original paper if applicable
  - Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
- [ ] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

### Changes ###
- [ ] Add unit test for MKLDNNSum

## Comments ##
- If this change is a backward incompatible change, why must this change be made?
- Interesting edge cases to note here
[GitHub] piiswrong closed pull request #11127: add import_ for SymbolBlock
piiswrong closed pull request #11127: add import_ for SymbolBlock
URL: https://github.com/apache/incubator-mxnet/pull/11127

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

````diff
diff --git a/docs/tutorials/gluon/hybrid.md b/docs/tutorials/gluon/hybrid.md
index 3554a15fa3b..5c8372a51f4 100644
--- a/docs/tutorials/gluon/hybrid.md
+++ b/docs/tutorials/gluon/hybrid.md
@@ -117,7 +117,7 @@ x = mx.sym.var('data')
 y = net(x)
 print(y)
 y.save('model.json')
-net.save_params('model.params')
+net.save_parameters('model.params')
 ```
 
 If your network outputs more than one value, you can use `mx.sym.Group` to
diff --git a/docs/tutorials/gluon/naming.md b/docs/tutorials/gluon/naming.md
index 37b63fa08a9..3606a03dcbd 100644
--- a/docs/tutorials/gluon/naming.md
+++ b/docs/tutorials/gluon/naming.md
@@ -203,12 +203,12 @@ except Exception as e:
 Parameter 'model1_dense0_weight' is missing in file 'model.params', which contains parameters: 'model0_mydense_weight', 'model0_dense1_bias', 'model0_dense1_weight', 'model0_dense0_weight', 'model0_dense0_bias', 'model0_mydense_bias'. Please make sure source and target networks have the same prefix.
 
-To solve this problem, we use `save_params`/`load_params` instead of `collect_params` and `save`/`load`. `save_params` uses model structure, instead of parameter name, to match parameters.
+To solve this problem, we use `save_parameters`/`load_parameters` instead of `collect_params` and `save`/`load`. `save_parameters` uses model structure, instead of parameter name, to match parameters.
 
 ```python
-model0.save_params('model.params')
-model1.load_params('model.params')
+model0.save_parameters('model.params')
+model1.load_parameters('model.params')
 
 print(mx.nd.load('model.params').keys())
 ```
diff --git a/docs/tutorials/gluon/save_load_params.md b/docs/tutorials/gluon/save_load_params.md
index cd876808a86..f5f48125cc1 100644
--- a/docs/tutorials/gluon/save_load_params.md
+++ b/docs/tutorials/gluon/save_load_params.md
@@ -10,7 +10,7 @@ Parameters of any Gluon model can be saved using the `save_params` and `load_par
 
 **2. Save/load model parameters AND architecture**
 
-The Model architecture of `Hybrid` models stays static and don't change during execution. Therefore both model parameters AND architecture can be saved and loaded using `export`, `load_checkpoint` and `load` methods.
+The Model architecture of `Hybrid` models stays static and don't change during execution. Therefore both model parameters AND architecture can be saved and loaded using `export`, `imports` methods.
 
 Let's look at the above methods in more detail. Let's start by importing the modules we'll need.
@@ -61,7 +61,7 @@ def build_lenet(net):
     net.add(gluon.nn.Dense(512, activation="relu"))
     # Second fully connected layer with as many neurons as the number of classes
     net.add(gluon.nn.Dense(num_outputs))
-    
+
     return net
 
 # Train a given model using MNIST data
@@ -240,18 +240,10 @@ One of the main reasons to serialize model architecture into a JSON file is to l
 
 ### From Python
 
-Serialized Hybrid networks (saved as .JSON and .params file) can be loaded and used inside Python frontend using `mx.model.load_checkpoint` and `gluon.nn.SymbolBlock`. To demonstrate that, let's load the network we serialized above.
+Serialized Hybrid networks (saved as .JSON and .params file) can be loaded and used inside Python frontend using `gluon.nn.SymbolBlock`. To demonstrate that, let's load the network we serialized above.
 
 ```python
-# Load the network architecture and parameters
-sym = mx.sym.load('lenet-symbol.json')
-# Create a Gluon Block using the loaded network architecture.
-# 'inputs' parameter specifies the name of the symbol in the computation graph
-# that should be treated as input. 'data' is the default name used for input when
-# a model architecture is saved to a file.
-deserialized_net = gluon.nn.SymbolBlock(outputs=sym, inputs=mx.sym.var('data'))
-# Load the parameters
-deserialized_net.collect_params().load('lenet-0001.params', ctx=ctx)
+deserialized_net = gluon.nn.SymbolBlock.imports("lenet-symbol.json", ['data'], "lenet-0001.params")
 ```
 
 `deserialized_net` now contains the network we deserialized from files. Let's test the deserialized network to make sure it works.
diff --git a/example/gluon/dcgan.py b/example/gluon/dcgan.py
index 3233f430eea..8ac9c522cf5 100644
--- a/example/gluon/dcgan.py
+++ b/example/gluon/dcgan.py
@@ -229,8 +229,8 @@ def transformer(data, label):
         logging.info('time: %f' % (time.time() - tic))
 
         if check_point:
-            netG.save_params(os.path.join(outf,'generator_epoch_%d.params' %epoch))
-            netD.save_params(os.path.join(outf,'discriminator_epoch_%d.params'
````
[incubator-mxnet] branch master updated: add import_ for SymbolBlock (#11127)
jxie pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git

The following commit(s) were added to refs/heads/master by this push:
     new 66ab27e  add import_ for SymbolBlock (#11127)

66ab27e is described below

commit 66ab27e67a70b1164364a8a52ebbe0def45dc327
Author: Eric Junyuan Xie
AuthorDate: Wed Jun 13 22:01:40 2018 -0700

    add import_ for SymbolBlock (#11127)

    * add import_ for SymbolBlock
    * fix
    * Update block.py
    * add save_parameters
    * fix
    * fix lint
    * fix
    * fix
    * fix
    * fix
    * fix
    * Update save_load_params.md
---
 docs/tutorials/gluon/hybrid.md                    | 2 +-
 docs/tutorials/gluon/naming.md                    | 6 +-
 docs/tutorials/gluon/save_load_params.md          | 16 +---
 example/gluon/dcgan.py                            | 8 +-
 example/gluon/embedding_learning/train.py         | 2 +-
 example/gluon/image_classification.py             | 8 +-
 example/gluon/mnist.py                            | 2 +-
 example/gluon/style_transfer/main.py              | 8 +-
 example/gluon/super_resolution.py                 | 4 +-
 example/gluon/tree_lstm/main.py                   | 2 +-
 example/gluon/word_language_model/train.py        | 4 +-
 python/mxnet/gluon/block.py                       | 90 +--
 python/mxnet/gluon/model_zoo/vision/alexnet.py    | 2 +-
 python/mxnet/gluon/model_zoo/vision/densenet.py   | 2 +-
 python/mxnet/gluon/model_zoo/vision/inception.py  | 2 +-
 python/mxnet/gluon/model_zoo/vision/mobilenet.py  | 4 +-
 python/mxnet/gluon/model_zoo/vision/resnet.py     | 4 +-
 python/mxnet/gluon/model_zoo/vision/squeezenet.py | 2 +-
 python/mxnet/gluon/model_zoo/vision/vgg.py        | 4 +-
 tests/python/unittest/test_gluon.py               | 54 +++---
 20 files changed, 164 insertions(+), 62 deletions(-)

````diff
diff --git a/docs/tutorials/gluon/hybrid.md b/docs/tutorials/gluon/hybrid.md
index 3554a15..5c8372a 100644
--- a/docs/tutorials/gluon/hybrid.md
+++ b/docs/tutorials/gluon/hybrid.md
@@ -117,7 +117,7 @@ x = mx.sym.var('data')
 y = net(x)
 print(y)
 y.save('model.json')
-net.save_params('model.params')
+net.save_parameters('model.params')
 ```
 
 If your network outputs more than one value, you can use `mx.sym.Group` to
diff --git a/docs/tutorials/gluon/naming.md b/docs/tutorials/gluon/naming.md
index 37b63fa..3606a03 100644
--- a/docs/tutorials/gluon/naming.md
+++ b/docs/tutorials/gluon/naming.md
@@ -203,12 +203,12 @@ except Exception as e:
 Parameter 'model1_dense0_weight' is missing in file 'model.params', which contains parameters: 'model0_mydense_weight', 'model0_dense1_bias', 'model0_dense1_weight', 'model0_dense0_weight', 'model0_dense0_bias', 'model0_mydense_bias'. Please make sure source and target networks have the same prefix.
 
-To solve this problem, we use `save_params`/`load_params` instead of `collect_params` and `save`/`load`. `save_params` uses model structure, instead of parameter name, to match parameters.
+To solve this problem, we use `save_parameters`/`load_parameters` instead of `collect_params` and `save`/`load`. `save_parameters` uses model structure, instead of parameter name, to match parameters.
 
 ```python
-model0.save_params('model.params')
-model1.load_params('model.params')
+model0.save_parameters('model.params')
+model1.load_parameters('model.params')
 
 print(mx.nd.load('model.params').keys())
 ```
diff --git a/docs/tutorials/gluon/save_load_params.md b/docs/tutorials/gluon/save_load_params.md
index cd87680..f5f4812 100644
--- a/docs/tutorials/gluon/save_load_params.md
+++ b/docs/tutorials/gluon/save_load_params.md
@@ -10,7 +10,7 @@ Parameters of any Gluon model can be saved using the `save_params` and `load_par
 
 **2. Save/load model parameters AND architecture**
 
-The Model architecture of `Hybrid` models stays static and don't change during execution. Therefore both model parameters AND architecture can be saved and loaded using `export`, `load_checkpoint` and `load` methods.
+The Model architecture of `Hybrid` models stays static and don't change during execution. Therefore both model parameters AND architecture can be saved and loaded using `export`, `imports` methods.
 
 Let's look at the above methods in more detail. Let's start by importing the modules we'll need.
@@ -61,7 +61,7 @@ def build_lenet(net):
     net.add(gluon.nn.Dense(512, activation="relu"))
     # Second fully connected layer with as many neurons as the number of classes
     net.add(gluon.nn.Dense(num_outputs))
-    
+
     return net
 
 # Train a given model using MNIST data
@@ -240,18 +240,10 @@ One of the main reasons to serialize model architecture into a JSON file is to l
 
 ### From Python
 
-Serialized Hybrid networks (saved as .JSON and .params file) can be
````
[GitHub] anirudh2290 commented on a change in pull request #11127: add import_ for SymbolBlock
anirudh2290 commented on a change in pull request #11127: add import_ for SymbolBlock
URL: https://github.com/apache/incubator-mxnet/pull/11127#discussion_r195302933

## File path: docs/tutorials/gluon/save_load_params.md ##

```diff
@@ -61,7 +61,7 @@ def build_lenet(net):
     net.add(gluon.nn.Dense(512, activation="relu"))
```

Review comment: Sorry, didn't realize this was already in.
[GitHub] anirudh2290 commented on a change in pull request #11127: add import_ for SymbolBlock
anirudh2290 commented on a change in pull request #11127: add import_ for SymbolBlock
URL: https://github.com/apache/incubator-mxnet/pull/11127#discussion_r195302812

## File path: docs/tutorials/gluon/save_load_params.md ##

```diff
@@ -61,7 +61,7 @@ def build_lenet(net):
     net.add(gluon.nn.Dense(512, activation="relu"))
```

Review comment: we can make this change as part of another PR to avoid another round of CI.
[GitHub] marcoabreu commented on issue #10951: [MXNET-545] Fix broken cython build
marcoabreu commented on issue #10951: [MXNET-545] Fix broken cython build
URL: https://github.com/apache/incubator-mxnet/pull/10951#issuecomment-397170811

Sure. How can I help you?
[incubator-mxnet] branch master updated: [MXNET-290] MKLDNN support for model quantization (#10433)
marcoabreu pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git

The following commit(s) were added to refs/heads/master by this push:
     new d79e1ad  [MXNET-290] MKLDNN support for model quantization (#10433)

d79e1ad is described below

commit d79e1ad3294837cac653478045023fd312ceed78
Author: wentingj
AuthorDate: Thu Jun 14 12:58:33 2018 +0800

    [MXNET-290] MKLDNN support for model quantization (#10433)

    * mkldnn support for quantization
    * fix output number in graph
    * update licsence
    * modify Jenkinsfile
    * modify Jenkinsfile
    * mkldnn has no int8 fc api, excluded_sym_names includes fc for cpu
    * add mkldnn uint8 pass for quantization graph
    * update ut
    * retrig ic
    * remove no mkldnn quantization test temp
    * seperate mkldnn quantization ut from gpu quantization ut
    * rm dev_id check for cpu
    * add mkl tests dictionary
    * resolve review comments
    * simplify DequantizeStorageType() logic
    * simplify quantize/quantized_conv storage type logic
    * Add mkldnn_OIhw4i16o4i type case (needed by int8)
    * INT8 conv/pooling: share with FP32 convolution/pooling class/function
    * minor indent changes
    * Remove unnecessary mkldnn_quantized_pooling-inl.h
    * Fix minor issue
    * Fix lint
    * delete duplicated data type
    * fix bugs and convert requantize data to NDArray
    * fix lint
    * fix requantize storgetype
    * fix requantize storge type
    * Fix coding style comments
    * Fix compile issue
    * Change to use quantized_dtype option to support uint8/int8 scenarios
    * fix gpu test quantization failure
    * Fix indent
    * fix quantized pooling param parser
    * Fix imagenet_gen_qsym.py option style
    * retrigger jenkins
    * retrigger again
    * trigger jenkins
    * Resolve further comments
    * share test code
    * remove unnecessary test code
    * add test_quantize_model for cpu
    * add comments in quantize_graph_pass.cc
    * jenkins
    * jenkins
    * improve coding style
    * improve coding style
    * Add naive CPU quantization test back and share quantization code between naive-CPU/MKLDNN/GPU
    * rename test_quantization_cpu.py to test_quantization_mkldnn.py
    * code style
    * trigger
    * Adjust variable naming for test quantization
    * add qdtype for quantized op test case to test/bypass all cases explicitly
    * change expressions to be consistent
    * revert unnecessary change
---
 ci/docker/runtime_functions.sh                     | 3 +-
 example/quantization/imagenet_gen_qsym.py          | 44 +-
 example/quantization/imagenet_inference.py         | 10 +-
 include/mxnet/c_api.h                              | 4 +-
 python/mxnet/contrib/quantization.py               | 21 +-
 src/c_api/c_api_symbolic.cc                        | 5 +-
 src/operator/nn/convolution-inl.h                  | 2 +
 src/operator/nn/convolution.cc                     | 2 +-
 src/operator/nn/mkldnn/mkldnn_convolution-inl.h    | 77
 src/operator/nn/mkldnn/mkldnn_convolution.cc       | 109 ++---
 src/operator/nn/mkldnn/mkldnn_pooling-inl.h        | 4 +
 src/operator/nn/pooling-inl.h                      | 2 +
 src/operator/nn/pooling.cc                         | 2 +-
 src/operator/quantization/dequantize.cc            | 24 +
 .../quantization/mkldnn/mkldnn_dequantize-inl.h    | 105 +
 .../quantization/mkldnn/mkldnn_quantize-inl.h      | 112 +
 .../quantization/mkldnn/mkldnn_quantized_conv.cc   | 89
 .../mkldnn/mkldnn_quantized_pooling.cc             | 54 +++
 .../quantization/mkldnn/mkldnn_requantize-inl.h    | 158 +++
 src/operator/quantization/quantize.cc              | 24 +
 src/operator/quantization/quantize_graph_pass.cc   | 20 +-
 src/operator/quantization/quantized_conv.cc        | 27 +-
 src/operator/quantization/quantized_flatten-inl.h  | 23 +-
 src/operator/quantization/quantized_pooling.cc     | 31 +-
 src/operator/quantization/requantize.cc            | 25 +
 tests/python/mkl/test_quantization_mkldnn.py       | 28 ++
 tests/python/quantization/test_quantization.py     | 506 -
 27 files changed, 1185 insertions(+), 326 deletions(-)

```diff
diff --git a/ci/docker/runtime_functions.sh b/ci/docker/runtime_functions.sh
index 36e2387..293ac64 100755
--- a/ci/docker/runtime_functions.sh
+++ b/ci/docker/runtime_functions.sh
@@ -466,13 +466,12 @@ unittest_ubuntu_python3_cpu() {
 unittest_ubuntu_python3_cpu_mkldnn() {
     set -ex
-    export PYTHONPATH=./python/
+    export PYTHONPATH=./python/
     # MXNET_MKLDNN_DEBUG is buggy and produces false positives
     #
```
[GitHub] marcoabreu closed pull request #10433: [MXNET-290] MKLDNN support for model quantization
marcoabreu closed pull request #10433: [MXNET-290] MKLDNN support for model quantization URL: https://github.com/apache/incubator-mxnet/pull/10433 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/ci/docker/runtime_functions.sh b/ci/docker/runtime_functions.sh index 7abe767c869..b2f0d8ddfcb 100755 --- a/ci/docker/runtime_functions.sh +++ b/ci/docker/runtime_functions.sh @@ -381,13 +381,12 @@ unittest_ubuntu_python3_cpu() { unittest_ubuntu_python3_cpu_mkldnn() { set -ex -export PYTHONPATH=./python/ +export PYTHONPATH=./python/ # MXNET_MKLDNN_DEBUG is buggy and produces false positives # https://github.com/apache/incubator-mxnet/issues/10026 #export MXNET_MKLDNN_DEBUG=1 # Ignored if not present export MXNET_STORAGE_FALLBACK_LOG_VERBOSE=0 nosetests-3.4 --verbose tests/python/unittest -nosetests-3.4 --verbose tests/python/quantization nosetests-3.4 --verbose tests/python/mkl } diff --git a/example/quantization/imagenet_gen_qsym.py b/example/quantization/imagenet_gen_qsym.py index 045ce62489a..85474b663fa 100644 --- a/example/quantization/imagenet_gen_qsym.py +++ b/example/quantization/imagenet_gen_qsym.py @@ -53,6 +53,7 @@ def save_params(fname, arg_params, aux_params, logger=None): if __name__ == '__main__': parser = argparse.ArgumentParser(description='Generate a calibrated quantized model from a FP32 model') +parser.add_argument('--ctx', type=str, default='gpu') parser.add_argument('--model', type=str, choices=['imagenet1k-resnet-152', 'imagenet1k-inception-bn'], help='currently only supports imagenet1k-resnet-152 or imagenet1k-inception-bn') parser.add_argument('--batch-size', type=int, default=32) @@ -91,8 +92,18 @@ def save_params(fname, arg_params, aux_params, logger=None): ' thresholds. 
This mode is expected to produce the best inference accuracy of all three' ' kinds of quantized models if the calibration dataset is representative enough of the' ' inference dataset.') +parser.add_argument('--quantized-dtype', type=str, default='int8', +choices=['int8', 'uint8'], +help='quantization destination data type for input data') args = parser.parse_args() +if args.ctx == 'gpu': +ctx = mx.gpu(0) +elif args.ctx == 'cpu': +ctx = mx.cpu(0) +else: +raise ValueError('ctx %s is not supported in this script' % args.ctx) + logging.basicConfig() logger = logging.getLogger('logger') logger.setLevel(logging.INFO) @@ -129,17 +140,26 @@ def save_params(fname, arg_params, aux_params, logger=None): excluded_sym_names = [] if args.model == 'imagenet1k-resnet-152': rgb_mean = '0,0,0' -calib_layer = lambda name: name.endswith('_output') and (name.find('conv') != -1 - or name.find('sc') != -1 - or name.find('fc') != -1) +if args.ctx == 'gpu': +calib_layer = lambda name: name.endswith('_output') and (name.find('conv') != -1 + or name.find('sc') != -1 + or name.find('fc') != -1) +else: +calib_layer = lambda name: name.endswith('_output') and (name.find('conv') != -1 + or name.find('sc') != -1) +excluded_sym_names += ['flatten0', 'fc1'] if exclude_first_conv: -excluded_sym_names = ['conv0'] +excluded_sym_names += ['conv0'] elif args.model == 'imagenet1k-inception-bn': rgb_mean = '123.68,116.779,103.939' -calib_layer = lambda name: name.endswith('_output') and (name.find('conv') != -1 - or name.find('fc') != -1) +if args.ctx == 'gpu': +calib_layer = lambda name: name.endswith('_output') and (name.find('conv') != -1 + or name.find('fc') != -1) +else: +calib_layer = lambda name: name.endswith('_output') and (name.find('conv') != -1) +excluded_sym_names += ['flatten', 'fc1'] if exclude_first_conv: -excluded_sym_names = ['conv_1'] +excluded_sym_names += ['conv_1'] else: raise ValueError('model %s is not supported in this script' % args.model) @@ -156,8 +176,9 @@ def 
save_params(fname, arg_params,
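The ctx-dependent calibration-layer selection added by this diff can be summarized in a standalone sketch. Note `pick_calib_layer` is a hypothetical helper name, and the `'gpu'`/`'cpu'` strings stand in for `mx.gpu(0)`/`mx.cpu(0)`; the real script builds the lambdas inline and also extends `excluded_sym_names` (e.g. with `'flatten0'`/`'fc1'`) on CPU.

```python
# A rough, standalone approximation of the logic in imagenet_gen_qsym.py.

def pick_calib_layer(model, ctx):
    """Return a predicate choosing which layer outputs get calibrated."""
    if model == 'imagenet1k-resnet-152':
        # On CPU (MKL-DNN), fc layers are dropped from calibration.
        keys = ('conv', 'sc', 'fc') if ctx == 'gpu' else ('conv', 'sc')
    elif model == 'imagenet1k-inception-bn':
        keys = ('conv', 'fc') if ctx == 'gpu' else ('conv',)
    else:
        raise ValueError('model %s is not supported in this script' % model)
    return lambda name: name.endswith('_output') and any(k in name for k in keys)

calib = pick_calib_layer('imagenet1k-resnet-152', 'cpu')
print(calib('stage1_conv1_output'))  # True: a conv output
print(calib('fc1_output'))           # False: fc excluded on CPU
```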
[incubator-mxnet] branch master updated (bf26886 -> eb95d7b)
This is an automated email from the ASF dual-hosted git repository. anirudh2290 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from bf26886 gpu mem pool strategy (#11041) add eb95d7b [MXNET-543] disable scalatest on Spark (#11264) No new revisions were added by this update. Summary of changes: scala-package/spark/pom.xml| 18 --- .../org/apache/mxnet/spark/MXNetGeneralSuite.scala | 36 -- 2 files changed, 19 insertions(+), 35 deletions(-) -- To stop receiving notification emails like this one, please contact anirudh2...@apache.org.
[GitHub] leezu opened a new issue #11271: nd.topk regression with nan values
leezu opened a new issue #11271: nd.topk regression with nan values URL: https://github.com/apache/incubator-mxnet/issues/11271 mxnet master contains a regression in the nd.topk operator compared to mxnet v1.2. Likely https://github.com/apache/incubator-mxnet/pull/10997 is at fault (but I have not performed a bisection to prove that). Consider the following: Behavior in mxnet master: ``` In [1]: a = [np.nan] * 4 + list(range(2500)) ...: a = mx.nd.array(a) ...: print(a) ...: for k in range(3,10): ...: print(mx.nd.topk(a, k=k)) ...: ...: [ nan nan nan ... 2497. 2498. 2499.] [2. 0. 1.] [2. 0. 1. 3.] [2. 4. 0. 1. 3.] [5. 4. 2. 0. 1. 3.] [2. 6. 5. 4. 0. 1. 3.] [ 2. 1. 10. 9. 8. 6. 0. 3.] [ 2. 1. 11. 10. 9. 6. 8. 0. 3.] ``` Behavior in mxnet v1.2: ``` In [1]: a = [np.nan] * 4 + list(range(2500)) ...: a = mx.nd.array(a) ...: print(a) ...: for k in range(3,10): ...: print(mx.nd.topk(a, k=k)) ...: [ nan nan nan ... 2497. 2498. 2499.] [0. 1. 2.] [0. 1. 2. 3.] [0.000e+00 1.000e+00 2.000e+00 3.000e+00 2.503e+03] [0.000e+00 1.000e+00 2.000e+00 3.000e+00 2.503e+03 2.502e+03] [0.000e+00 1.000e+00 2.000e+00 3.000e+00 2.503e+03 2.502e+03 2.501e+03] [0.000e+00 1.000e+00 2.000e+00 3.000e+00 2.503e+03 2.502e+03 2.501e+03 2.500e+03] [0.000e+00 1.000e+00 2.000e+00 3.000e+00 2.503e+03 2.502e+03 2.501e+03 2.500e+03 2.499e+03] ``` Notice how the result is correct with mxnet version 1.2 but wrong on the master version. As a side note, even with mxnet version 1.2 the behavior of nd.topk is inconsistent with nd.argmax. The latter ignores 'nan' whereas the former considers it to be the maximum element. ``` In [9]: mx.nd.argmax(mx.nd.array([np.nan, 1]), axis=0) Out[9]: [1.] In [10]: mx.nd.topk(mx.nd.array([np.nan, 1]), axis=0) Out[10]: [0.] ``` Arguably there was never a guarantee that nd.topk works in the presence of nan values, but legacy code may rely on this behavior and can break in unexpected ways. 
@asmushetzel This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
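Since NaN compares false against every value, comparison-based top-k selection has no single well-defined ordering once NaNs are present, which is why the two versions can disagree. A plain-Python sketch (not mxnet's implementation) of the usual workaround — masking NaNs to -inf before selecting — might look like:

```python
import math

# NaN compares false against everything, so comparison-based selection of
# "top" elements has no single correct ordering once NaNs are present --
# the root cause of the version-to-version differences reported above.
nan = float('nan')
print(nan > 1.0, nan < 1.0, nan == nan)  # False False False

def topk_indices(values, k):
    """Indices of the k largest values, mapping NaN to -inf so it can
    never rank first. A plain-Python workaround sketch, not mxnet's topk."""
    keyed = [(-math.inf if math.isnan(v) else v) for v in values]
    return sorted(range(len(values)), key=keyed.__getitem__, reverse=True)[:k]

print(topk_indices([nan, nan, 5.0, 1.0, 3.0], k=2))  # [2, 4]
```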
[GitHub] anirudh2290 closed pull request #11264: [MXNET-543] disable scalatest on Spark
anirudh2290 closed pull request #11264: [MXNET-543] disable scalatest on Spark URL: https://github.com/apache/incubator-mxnet/pull/11264 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/scala-package/spark/pom.xml b/scala-package/spark/pom.xml index 43ff1f78fe1..f2b806094af 100644 --- a/scala-package/spark/pom.xml +++ b/scala-package/spark/pom.xml @@ -36,24 +36,6 @@ - - - -org.scalatest -scalatest-maven-plugin - - - -Djava.library.path=${project.parent.basedir}/native/${platform}/target \ - -Dlog4j.configuration=file://${project.basedir}/src/test/resources/log4j.properties - - - - -org.scalastyle -scalastyle-maven-plugin - - - org.apache.mxnet diff --git a/scala-package/spark/src/test/scala/org/apache/mxnet/spark/MXNetGeneralSuite.scala b/scala-package/spark/src/test/scala/org/apache/mxnet/spark/MXNetGeneralSuite.scala index 74bc1dbb71f..72bbbe0fed0 100644 --- a/scala-package/spark/src/test/scala/org/apache/mxnet/spark/MXNetGeneralSuite.scala +++ b/scala-package/spark/src/test/scala/org/apache/mxnet/spark/MXNetGeneralSuite.scala @@ -46,24 +46,26 @@ class MXNetGeneralSuite extends SharedSparkContext { "/dataset/mxnet-spark-test/train.txt" + " -P " + testDataDir + " -q") ! } - override def beforeAll(): Unit = { -val tempDirFile = Files.createTempDirectory(s"mxnet-spark-test-${System.currentTimeMillis()}"). - toFile -testDataDir = tempDirFile.getPath -tempDirFile.deleteOnExit() -downloadTestData() - } - +// override def beforeAll(): Unit = { +// val tempDirFile = Files.createTempDirectory(s"mxnet-spark-test-${System.currentTimeMillis()}"). 
+// toFile +//testDataDir = tempDirFile.getPath +//tempDirFile.deleteOnExit() +//downloadTestData() +// } - test("run spark with MLP") { -val trainData = parseRawData(sc, s"$testDataDir/train.txt") -val model = buildMlp().fit(trainData) -assert(model != null) - } + test("Dummy test on Spark") { - test("run spark with LeNet") { -val trainData = parseRawData(sc, s"$testDataDir/train.txt") -val model = buildLeNet().fit(trainData) -assert(model != null) } +// test("run spark with MLP") { +//val trainData = parseRawData(sc, s"$testDataDir/train.txt") +//val model = buildMlp().fit(trainData) +//assert(model != null) +// } +// +// test("run spark with LeNet") { +//val trainData = parseRawData(sc, s"$testDataDir/train.txt") +//val model = buildLeNet().fit(trainData) +//assert(model != null) +// } } This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] anirudh2290 commented on a change in pull request #11264: [MXNET-543] disable scalatest on Spark
anirudh2290 commented on a change in pull request #11264: [MXNET-543] disable scalatest on Spark URL: https://github.com/apache/incubator-mxnet/pull/11264#discussion_r195302388 ## File path: scala-package/spark/src/test/scala/org/apache/mxnet/spark/MXNetGeneralSuite.scala ## @@ -46,24 +46,26 @@ class MXNetGeneralSuite extends SharedSparkContext { "/dataset/mxnet-spark-test/train.txt" + " -P " + testDataDir + " -q") ! } - override def beforeAll(): Unit = { -val tempDirFile = Files.createTempDirectory(s"mxnet-spark-test-${System.currentTimeMillis()}"). - toFile -testDataDir = tempDirFile.getPath -tempDirFile.deleteOnExit() -downloadTestData() - } - +// override def beforeAll(): Unit = { +// val tempDirFile = Files.createTempDirectory(s"mxnet-spark-test-${System.currentTimeMillis()}"). +// toFile +//testDataDir = tempDirFile.getPath +//tempDirFile.deleteOnExit() +//downloadTestData() +// } - test("run spark with MLP") { -val trainData = parseRawData(sc, s"$testDataDir/train.txt") -val model = buildMlp().fit(trainData) -assert(model != null) - } + test("Dummy test on Spark") { - test("run spark with LeNet") { -val trainData = parseRawData(sc, s"$testDataDir/train.txt") -val model = buildLeNet().fit(trainData) -assert(model != null) } +// test("run spark with MLP") { Review comment: No need to comment we can just remove, but not blocking it as this will have to go through another round of CI. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] aaronmarkham commented on issue #11219: Flaky test: test_gru_bidirectional
aaronmarkham commented on issue #11219: Flaky test: test_gru_bidirectional URL: https://github.com/apache/incubator-mxnet/issues/11219#issuecomment-397165774 Still a problem... http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-11261/4/pipeline **edit** ...For a bunch of PRs... almost like nothing is getting through because of this! This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] eric-haibin-lin opened a new pull request #11270: Enable CUDNN for conv1D (#11194)
eric-haibin-lin opened a new pull request #11270: Enable CUDNN for conv1D (#11194) URL: https://github.com/apache/incubator-mxnet/pull/11270 * enable cudnn for conv1d * add checks for backward * fix build * fix build * fix lint * Update convolution.cc ## Description ## (Brief description on what this PR is about) ## Checklist ## ### Essentials ### Please feel free to remove inapplicable items for your PR. - [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) created (except PRs with tiny changes) - [ ] Changes are complete (i.e. I finished coding on this PR) - [ ] All changes have test coverage: - Unit tests are added for small changes to verify correctness (e.g. adding a new operator) - Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore) - Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL) - [ ] Code is well-documented: - For user-facing API changes, API doc string has been updated. - For new C++ functions in header files, their functionalities and arguments are documented. - For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable - Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html - [ ] To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change ### Changes ### - [ ] Feature1, tests, (and when applicable, API doc) - [ ] Feature2, tests, (and when applicable, API doc) ## Comments ## - If this change is a backward incompatible change, why must this change be made. - Interesting edge cases to note here This is an automated message from the Apache Git Service. 
To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] FzuLJ commented on issue #4764: Error when apply mx.sym.simple_bind() on gpu
FzuLJ commented on issue #4764: Error when apply mx.sym.simple_bind() on gpu URL: https://github.com/apache/incubator-mxnet/issues/4764#issuecomment-397162313 Hi, I met the same problem. Do you know how to solve it? @back2yes This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] marcoabreu opened a new pull request #11269: Enable shared ccache
marcoabreu opened a new pull request #11269: Enable shared ccache URL: https://github.com/apache/incubator-mxnet/pull/11269 ## Description ## (Brief description on what this PR is about) ## Checklist ## ### Essentials ### Please feel free to remove inapplicable items for your PR. - [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) created (except PRs with tiny changes) - [ ] Changes are complete (i.e. I finished coding on this PR) - [ ] All changes have test coverage: - Unit tests are added for small changes to verify correctness (e.g. adding a new operator) - Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore) - Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL) - [ ] Code is well-documented: - For user-facing API changes, API doc string has been updated. - For new C++ functions in header files, their functionalities and arguments are documented. - For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable - Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html - [ ] To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change ### Changes ### - [ ] Feature1, tests, (and when applicable, API doc) - [ ] Feature2, tests, (and when applicable, API doc) ## Comments ## - If this change is a backward incompatible change, why must this change be made. - Interesting edge cases to note here This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] asitstands commented on issue #10951: [MXNET-545] Fix broken cython build
asitstands commented on issue #10951: [MXNET-545] Fix broken cython build URL: https://github.com/apache/incubator-mxnet/pull/10951#issuecomment-397151382 @piiswrong @szha @marcoabreu Would you review this PR? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.
This is an automated email from the ASF dual-hosted git repository. zhasheng pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git The following commit(s) were added to refs/heads/asf-site by this push: new 4a6b960 Bump the publish timestamp. 4a6b960 is described below commit 4a6b9609df57eae75f3b858d6337c211a5f99fb5 Author: mxnet-ci AuthorDate: Thu Jun 14 01:38:17 2018 + Bump the publish timestamp. --- date.txt | 1 + 1 file changed, 1 insertion(+) diff --git a/date.txt b/date.txt new file mode 100644 index 000..4dad2d7 --- /dev/null +++ b/date.txt @@ -0,0 +1 @@ +Thu Jun 14 01:38:17 UTC 2018 -- To stop receiving notification emails like this one, please contact zhash...@apache.org.
[GitHub] mbaijal commented on a change in pull request #10827: [MXNET-405][WIP] Add 2 new pipelines to the Official CI and run nightly tests.
mbaijal commented on a change in pull request #10827: [MXNET-405][WIP] Add 2 new pipelines to the Official CI and run nightly tests. URL: https://github.com/apache/incubator-mxnet/pull/10827#discussion_r195288927 ## File path: tests/nightly/Jenkinsfile ## @@ -0,0 +1,180 @@ +// -*- mode: groovy -*- +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + + +//This is a Jenkinsfile for nightly tests. 
The format and some functions have been picked up from the top-level Jenkinsfile + +err = null +mx_lib = 'lib/libmxnet.so, lib/libmxnet.a, 3rdparty/dmlc-core/libdmlc.a, 3rdparty/nnvm/lib/libnnvm.a' + +// pack libraries for later use +def pack_lib(name, libs=mx_lib) { + sh """ +echo "Packing ${libs} into ${name}" +echo ${libs} | sed -e 's/,/ /g' | xargs md5sum +""" + stash includes: libs, name: name +} + +// unpack libraries saved before +def unpack_lib(name, libs=mx_lib) { + unstash name + sh """ +echo "Unpacked ${libs} from ${name}" +echo ${libs} | sed -e 's/,/ /g' | xargs md5sum +""" +} + +def init_git() { + deleteDir() + retry(5) { +try { + timeout(time: 15, unit: 'MINUTES') { +checkout scm +sh 'git submodule update --init --recursive' +sh 'git clean -d -f' Review comment: I have kept this the same as the top level Jenkinsfile. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
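The checksum line in `pack_lib`/`unpack_lib` (`echo ${libs} | sed -e 's/,/ /g' | xargs md5sum`) just prints one md5 per comma-separated artifact so the stashed and unstashed copies can be compared across pipeline stages. An equivalent Python sketch (illustrative only; the Jenkinsfile shells out instead, and `md5_of_list` is a hypothetical name):

```python
import hashlib
import os
import tempfile

def md5_of_list(libs):
    """Map each path in a comma-separated artifact list to its md5 digest,
    mirroring `echo ${libs} | sed -e 's/,/ /g' | xargs md5sum` in pack_lib."""
    sums = {}
    for path in libs.split(','):
        path = path.strip()
        with open(path, 'rb') as f:
            sums[path] = hashlib.md5(f.read()).hexdigest()
    return sums

# Demo with a throwaway file standing in for an artifact such as libmxnet.so:
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b'not a real library')
print(md5_of_list(f.name))
os.unlink(f.name)
```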
[GitHub] asitstands opened a new pull request #11268: A binary RBM example
asitstands opened a new pull request #11268: A binary RBM example URL: https://github.com/apache/incubator-mxnet/pull/11268 ## Description ## An example of binary restricted Boltzmann machine learning MNIST. ## Checklist ## ### Essentials ### Please feel free to remove inapplicable items for your PR. - [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) created (except PRs with tiny changes) - [x] Changes are complete (i.e. I finished coding on this PR) - [ ] All changes have test coverage: - Unit tests are added for small changes to verify correctness (e.g. adding a new operator) - Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore) - Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL) - [x] Code is well-documented: - For user-facing API changes, API doc string has been updated. - For new C++ functions in header files, their functionalities and arguments are documented. - For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable - Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html - [x] To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] mbaijal commented on a change in pull request #10827: [MXNET-405][WIP] Add 2 new pipelines to the Official CI and run nightly tests.
mbaijal commented on a change in pull request #10827: [MXNET-405][WIP] Add 2 new pipelines to the Official CI and run nightly tests. URL: https://github.com/apache/incubator-mxnet/pull/10827#discussion_r195288531 ## File path: docs/install/index.md ## @@ -84,7 +84,7 @@ $ wget https://bootstrap.pypa.io/get-pip.py && sudo python get-pip.py **Step 2** Install MXNet with OpenBLAS acceleration. ```bash -$ pip install mxnet +$ sudo pip install mxnet Review comment: I do not have an exact answer to this. The test fails without sudo. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] mbaijal commented on a change in pull request #10827: [MXNET-405][WIP] Add 2 new pipelines to the Official CI and run nightly tests.
mbaijal commented on a change in pull request #10827: [MXNET-405][WIP] Add 2 new pipelines to the Official CI and run nightly tests. URL: https://github.com/apache/incubator-mxnet/pull/10827#discussion_r195288560 ## File path: ci/docker/runtime_functions.sh ## @@ -591,6 +591,65 @@ build_docs() { popd } + +# Functions that run the nightly Tests: + +#Runs Apache RAT Check on MXNet Source for License Headers +nightly_test_rat_check() { +set -ex +#This Test fails without changing permissions Review comment: 0755 works This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] dwSun commented on issue #11243: weird gpu memory usage
dwSun commented on issue #11243: weird gpu memory usage URL: https://github.com/apache/incubator-mxnet/issues/11243#issuecomment-397144592 @ThomasDelteil, thanks for explaining that. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] dwSun closed issue #11243: weird gpu memory usage
dwSun closed issue #11243: weird gpu memory usage URL: https://github.com/apache/incubator-mxnet/issues/11243 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] jinhuang415 commented on issue #10433: [MXNET-290] MKLDNN support for model quantization
jinhuang415 commented on issue #10433: [MXNET-290] MKLDNN support for model quantization URL: https://github.com/apache/incubator-mxnet/pull/10433#issuecomment-397143625 @marcoabreu @zheng-da Thanks for the approval! Regarding how to check the backend context: if we are not in agreement on exposing this as an API to users, how about keeping the current change? The context alone is not enough, since we need to differentiate MKLDNN from native CPU, but both belong to the CPU context. Another approach at the Python level to separate MKLDNN from native CPU is to check `/proc/<pid>/maps` for the relevant mapped library: if MKLDNN is built in, libmkldnn.so will be mapped into the process's address space, and the same goes for cudnn/cuda. But this approach only works on Linux (it relies on the proc file system), so nosetests under macOS or Windows would not be covered. Anyway, if you think the current change is fine as is, how about continuing to merge the PR? Thanks! This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
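The proc-maps probe described in that comment can be sketched in a few lines of Python. `library_mapped` is a hypothetical helper name, and the sketch shares the portability limitation the comment points out: it only works where the proc file system exists.

```python
import os

def library_mapped(fragment, maps_path='/proc/self/maps'):
    """True if a shared object whose path contains `fragment` (e.g.
    'libmkldnn' or 'libcudnn') is mapped into the current process.
    Returns False where /proc is unavailable (macOS, Windows) -- the
    Linux-only limitation raised in the comment above."""
    if not os.path.exists(maps_path):
        return False
    with open(maps_path) as f:
        return any(fragment in line for line in f)

# After `import mxnet`, library_mapped('libmkldnn') would tell a test suite
# whether the MKL-DNN backend is actually loaded into the process.
print(library_mapped('libc'))
```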
[GitHub] marcoabreu commented on issue #11265: Flaky test: Python 3: CPU Win
marcoabreu commented on issue #11265: Flaky test: Python 3: CPU Win URL: https://github.com/apache/incubator-mxnet/issues/11265#issuecomment-397138748 Seems like our slave died.. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[incubator-mxnet] branch master updated: gpu mem pool strategy (#11041)
This is an automated email from the ASF dual-hosted git repository. zhasheng pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git The following commit(s) were added to refs/heads/master by this push: new bf26886 gpu mem pool strategy (#11041) bf26886 is described below commit bf268862f5dd6ba3abb61cd7edd423f535d4b5b7 Author: Sheng Zha AuthorDate: Wed Jun 13 21:08:16 2018 -0400 gpu mem pool strategy (#11041) * use nearest power of 2 for gpu memory pool sizes * add linear * add test --- src/storage/pooled_storage_manager.h| 181 +++- src/storage/storage.cc | 16 ++- tests/cpp/storage/storage_test.cc | 36 - tests/python/gpu/test_forward.py| 2 +- tests/python/gpu/test_gluon_model_zoo_gpu.py| 2 +- tests/python/gpu/test_kvstore_gpu.py| 4 +- tests/python/gpu/test_operator_gpu.py | 2 +- tests/python/unittest/common.py | 8 ++ tests/python/unittest/test_autograd.py | 2 +- tests/python/unittest/test_contrib_autograd.py | 2 +- tests/python/unittest/test_exc_handling.py | 2 +- tests/python/unittest/test_executor.py | 2 +- tests/python/unittest/test_gluon.py | 5 +- tests/python/unittest/test_gluon_contrib.py | 2 +- tests/python/unittest/test_gluon_data.py| 2 +- tests/python/unittest/test_gluon_data_vision.py | 2 +- tests/python/unittest/test_gluon_model_zoo.py | 2 +- tests/python/unittest/test_kvstore.py | 2 +- tests/python/unittest/test_loss.py | 2 +- tests/python/unittest/test_module.py| 2 +- tests/python/unittest/test_ndarray.py | 2 +- tests/python/unittest/test_operator.py | 4 +- tests/python/unittest/test_optimizer.py | 4 +- tests/python/unittest/test_random.py| 2 +- tests/python/unittest/test_recordio.py | 2 +- tests/python/unittest/test_sparse_ndarray.py| 2 +- tests/python/unittest/test_sparse_operator.py | 2 +- 27 files changed, 259 insertions(+), 37 deletions(-) diff --git a/src/storage/pooled_storage_manager.h b/src/storage/pooled_storage_manager.h index 3bf4373..bed9730 100644 --- a/src/storage/pooled_storage_manager.h 
+++ b/src/storage/pooled_storage_manager.h @@ -28,9 +28,11 @@ #if MXNET_USE_CUDA #include #endif // MXNET_USE_CUDA + #include #include #include +#include #include #include #include @@ -43,7 +45,8 @@ namespace storage { #if MXNET_USE_CUDA /*! - * \brief Storage manager with a memory pool on gpu. + * \brief Storage manager with a memory pool on gpu. Memory chunks are reused based on exact size + * match. */ class GPUPooledStorageManager final : public StorageManager { public: @@ -52,6 +55,11 @@ class GPUPooledStorageManager final : public StorageManager { */ GPUPooledStorageManager() { reserve_ = dmlc::GetEnv("MXNET_GPU_MEM_POOL_RESERVE", 5); +page_size_ = dmlc::GetEnv("MXNET_GPU_MEM_POOL_PAGE_SIZE", 4096); +if (page_size_ < NDEV) { + LOG(FATAL) << "MXNET_GPU_MEM_POOL_PAGE_SIZE cannot be set to a value smaller than " << NDEV \ + << ". Got " << page_size_ << "."; +} } /*! * \brief Default destructor. @@ -71,7 +79,7 @@ class GPUPooledStorageManager final : public StorageManager { private: void DirectFreeNoLock(Storage::Handle handle) { cudaError_t err = cudaFree(handle.dptr); -size_t size = handle.size + NDEV; +size_t size = std::max(handle.size, page_size_); // ignore unloading error, as memory has already been recycled if (err != cudaSuccess && err != cudaErrorCudartUnloading) { LOG(FATAL) << "CUDA: " << cudaGetErrorString(err); @@ -83,10 +91,12 @@ class GPUPooledStorageManager final : public StorageManager { void ReleaseAll(); // used memory size_t used_memory_ = 0; + // page size + size_t page_size_; // percentage of reserved memory int reserve_; // number of devices - const int NDEV = 32; + const size_t NDEV = 32; // memory pool std::unordered_map> memory_pool_; DISALLOW_COPY_AND_ASSIGN(GPUPooledStorageManager); @@ -94,7 +104,7 @@ class GPUPooledStorageManager final : public StorageManager { void GPUPooledStorageManager::Alloc(Storage::Handle* handle) { std::lock_guard lock(Storage::Get()->GetMutex(Context::kGPU)); - size_t size = handle->size + NDEV; + size_t 
size = std::max(handle->size, page_size_); auto&& reuse_it = memory_pool_.find(size); if (reuse_it == memory_pool_.end() || reuse_it->second.size() == 0) { size_t free, total; @@ -119,7 +129,7 @@ void GPUPooledStorageManager::Alloc(Storage::Handle* handle) { void GPUPooledStorageManager::Free(Storage::Handle handle) { std::lock_guard lock(Storage::Get()->GetMutex(Context::kGPU)); - size_t size = handle.size + NDEV; + size_t
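The commit changes the pool key from `handle.size + NDEV` to `max(handle.size, page_size_)`, so every small request lands in one page-sized bin and freed chunks get recycled for later requests of the same bin. A toy Python model of this exact-size-match pool (illustrative only; `PooledAllocator` is a stand-in for `GPUPooledStorageManager`, with `cudaMalloc`/`cudaFree` simulated by `bytearray` and a list):

```python
from collections import defaultdict

PAGE_SIZE = 4096  # mirrors the MXNET_GPU_MEM_POOL_PAGE_SIZE default in the diff

class PooledAllocator:
    """Toy exact-size-match pool: freed buffers are binned by rounded size
    and reused for later requests that round to the same bin."""
    def __init__(self):
        self.pool = defaultdict(list)
        self.fresh_allocs = 0

    def _bin(self, size):
        # The key change in this commit: bin by max(size, page_size)
        # instead of size + NDEV, so tiny allocations share one page bin.
        return max(size, PAGE_SIZE)

    def alloc(self, size):
        bin_ = self._bin(size)
        if self.pool[bin_]:
            return self.pool[bin_].pop()  # reuse a recycled chunk
        self.fresh_allocs += 1            # simulated cudaMalloc
        return bytearray(bin_)

    def free(self, buf):
        self.pool[len(buf)].append(buf)   # recycle instead of cudaFree

a = PooledAllocator()
b1 = a.alloc(100)      # rounds up to one 4096-byte page
a.free(b1)
b2 = a.alloc(200)      # same bin -> the freed page is reused
print(a.fresh_allocs)  # 1
```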
[GitHub] rahul003 commented on a change in pull request #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce
rahul003 commented on a change in pull request #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce URL: https://github.com/apache/incubator-mxnet/pull/10696#discussion_r195267542 ## File path: src/kvstore/collectives/include/collectives.h ## @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +/** + * Copyright (c) 2018 by Contributors + */ + +#ifndef MXNET_KVSTORE_COLLECTIVES_INCLUDE_COLLECTIVES_H_ +#define MXNET_KVSTORE_COLLECTIVES_INCLUDE_COLLECTIVES_H_ + +#if MXNET_USE_ALLREDUCE_DIST_KVSTORE + +#include + +#include +#include + +namespace mxnet { +namespace kvstore { + +/*! + * \brief Get node number. Review comment: Better to clarify this description. Looks wrong. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] rahul003 commented on a change in pull request #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce
rahul003 commented on a change in pull request #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce URL: https://github.com/apache/incubator-mxnet/pull/10696#discussion_r195257681 ## File path: src/kvstore/kvstore.cc ## @@ -49,6 +53,19 @@ KVStore* KVStore::Create(const char *type_name) { use_device_comm = true; } +#if MXNET_USE_ALLREDUCE_DIST_KVSTORE + if (has("dist_sync_allreduce")) { +kv = new kvstore::KVStoreDistSyncAllReduce(); +kv->type_ = tname; +return kv; + } +#else + if (has("dist_sync_allreduce")) { +LOG(FATAL) << "compile with USE_ALLREDUCE_DIST_KVSTORE=1 to use " << tname; +return nullptr; + } +#endif + if (has("dist")) { Review comment: make this else if? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] rahul003 commented on a change in pull request #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce
rahul003 commented on a change in pull request #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce URL: https://github.com/apache/incubator-mxnet/pull/10696#discussion_r195272925 ## File path: src/kvstore/collectives/src/mpi_message.proto ## @@ -0,0 +1,67 @@ +syntax = "proto3"; + +package mxnet.kvstore; + +// We would like to just use DataType here, but since this +// is a contrib package, linking directly to MXNet protos seems to be +// impossible. Doing so compiles, but fails with a cryptic error at runtime +// about a pointer that was passed to free() but not created by malloc(). +// +// Since using the mxnet/core protos seems to cause issues, we use our own, +// which also has the benefit of supporting only the data types we want to support. +enum MPIDataType { +MX_MPI_INVALID_TYPE = 0; +MX_MPI_FLOAT32 = 1; +MX_MPI_INT32 = 2; +MX_MPI_INT64 = 3; +}; + +// An MPIRequest is a message sent from a rank greater than zero to the +// coordinator (rank zero), informing the coordinator of an operation that +// the rank wants to do and the tensor that it wants to apply the operation to. +message MPIRequest { + enum RequestType { +ALLREDUCE = 0; +ALLGATHER = 1; +BROADCAST = 2; + } + + // The request rank is necessary to create a consistent ordering of results, + // for example in the allgather where the order of outputs should be sorted + // by rank. + int32 request_rank = 1; + string key_name = 2; + RequestType request_type = 3; + MPIDataType value_type = 4; + int32 root_rank = 5; + + // We use a repeated integer instead of a TensorShapeProto because linking directly Review comment: For my understanding what does this do? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
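To the reviewer's question: the repeated integer field stands in for a tensor shape — one entry per dimension — so the message can carry shape information without linking against MXNet's `TensorShapeProto`. A toy sketch of consuming a shape encoded this way (illustrative only, not the proto-generated API):

```python
def num_elements(shape_dims):
    """Total element count from a shape encoded as a repeated-int field,
    e.g. [2, 3, 4] for a 2x3x4 tensor."""
    n = 1
    for d in shape_dims:
        n *= d
    return n

print(num_elements([2, 3, 4]))  # 24
```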
[GitHub] rahul003 commented on a change in pull request #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce
rahul003 commented on a change in pull request #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce URL: https://github.com/apache/incubator-mxnet/pull/10696#discussion_r195266548 ## File path: src/kvstore/collectives/include/coll_wrapper.h ## @@ -0,0 +1,114 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +/** + * Copyright (c) 2018 by Contributors + */ + +#ifndef MXNET_KVSTORE_COLLECTIVES_INCLUDE_COLL_WRAPPER_H_ +#define MXNET_KVSTORE_COLLECTIVES_INCLUDE_COLL_WRAPPER_H_ + +#if MXNET_USE_ALLREDUCE_DIST_KVSTORE + +#include + +#include "mxnet/ndarray.h" +#include "mxnet/base.h" +#include "mpi_message.pb.h" + +template +MPI_Datatype MPI_Data_Type_Cast(void); + +template<> +MPI_Datatype MPI_Data_Type_Cast(void) { + return MPI_INT; +} + +template<> +MPI_Datatype MPI_Data_Type_Cast(void) { + return MPI_FLOAT; +} + +template<> +MPI_Datatype MPI_Data_Type_Cast(void) { + return MPI_DOUBLE; +} + +template +struct COLL_Wrapper { + static int Broadcast(mxnet::NDArray *input_array, + int root_rank) { +return 0; } + + static int AllReduce(mxnet::NDArray *input_array, + mxnet::NDArray *output_array) { +return 0; } +}; + +// CPU Implementation +template +struct COLL_Wrapper { + static int Broadcast(mxnet::NDArray *input_array, + int root_rank) { +DType *buf = reinterpret_cast(input_array->data().dptr()); +unsigned int count = input_array->data().Size(); +int ret = MPI_Bcast(buf, count, MPI_Data_Type_Cast(), root_rank, MPI_COMM_WORLD); +return ret; + } + + static int AllReduce(mxnet::NDArray *input_array, + mxnet::NDArray *output_array) { +DType *send_buf = reinterpret_cast(input_array->data().dptr()); +DType *recv_buf = reinterpret_cast(output_array->data().dptr()); +unsigned int count = input_array->data().Size(); +int ret; +assert(input_array->data().Size() == output_array->data().Size()); + +if (send_buf != recv_buf) { + ret = MPI_Allreduce(reinterpret_cast(send_buf), + reinterpret_cast(recv_buf), + count, MPI_Data_Type_Cast(), MPI_SUM, MPI_COMM_WORLD); +} else { Review comment: Just for my understanding when can send_buf = recv_buf This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
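On the reviewer's question about `send_buf == recv_buf`: the buffers coincide when the caller passes the same NDArray as both input and output, i.e. an in-place all-reduce. The MPI standard forbids aliasing send and receive buffers directly and instead requires passing `MPI_IN_PLACE` as the send buffer, which is why the wrapper branches on pointer equality. A toy single-process simulation of the in-place semantics (purely illustrative; real code would call `MPI_Allreduce`):

```python
def allreduce_sum(rank_buffers):
    """Toy all-reduce: every rank's buffer ends up holding the
    element-wise sum across all ranks, written back in place."""
    total = [sum(vals) for vals in zip(*rank_buffers)]
    for buf in rank_buffers:
        buf[:] = total  # send and receive buffer are the same object
    return rank_buffers

bufs = [[1, 2], [3, 4], [5, 6]]
allreduce_sum(bufs)
print(bufs)  # [[9, 12], [9, 12], [9, 12]]
```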
[GitHub] rahul003 commented on a change in pull request #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce
rahul003 commented on a change in pull request #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce URL: https://github.com/apache/incubator-mxnet/pull/10696#discussion_r195277270 ## File path: src/kvstore/collectives/src/collectives.cc ## @@ -0,0 +1,855 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +/** + * Copyright (c) 2018 by Contributors + */ + +#if MXNET_USE_ALLREDUCE_DIST_KVSTORE + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "mxnet/base.h" +#include "mxnet/ndarray.h" +#include "mxnet/engine.h" +#include "dmlc/logging.h" +#include "mpi_message.pb.h" +#include "collectives.h" +#include "coll_wrapper.h" +#include "coll_util.h" + +using namespace mxnet::kvstore; + +const char INT_PREFIX[] = "INT"; +const char STR_PREFIX[] = "STR"; +const char IDX_PREFIX[] = "IDX"; +const char OPS_PREFIX[] = "OPS"; +const char OPS_ALLREDUCE[] = "ALLREDUCE"; +const char OPS_BROADCAST[] = "BROADCAST"; +const char DELIMITER[] = ":"; + +namespace { + +struct CollectiveOpRecord { + int rank; + + std::string key; + + MPIDataType dtype; + + mxnet::NDArray *val_in; + + mxnet::NDArray *val_out; + + int root_rank; + + mxnet::engine::CallbackOnComplete callback; +}; + +typedef std::unordered_map NDArrayTable; + +typedef std::unordered_map > MessageTable; + +/* + * Collective_global var maintain a message table and a background thread. + * In rank 0, message table is used to coordinate all reduce order + * of ndarray in different nodes.The background thread is used + * for doing collectives and doing coordination between nodes + * through mpi messages. 
+ */ +struct CollectiveGlobalState { + std::atomic_flag initialized_flag = ATOMIC_FLAG_INIT; + + std::condition_variable cv; + + bool initialization_done = false; + + int init_status; + + std::mutex mu; + + NDArrayTable ndarray_table; + + std::queue message_queue; + + std::thread background_thread; + + bool shut_down = false; + + std::unique_ptr message_table; + + int rank = 0; + + int local_rank = 0; + + int size = 1; + + int device = -1; + + mxnet::Context pinned_ctx; + +~CollectiveGlobalState() { + if (background_thread.joinable()) { +shut_down = true; +background_thread.join(); + } +} +}; + +static CollectiveGlobalState coll_global; + +// static std::unordered_map mpi_comm_buf; + +#define RANK_ZERO 0 + +#define TAG_NOTIFY 1 + +bool IncrementNDArrayCount( + const std::unique_ptr& message_table, + const MPIRequest , int mpi_size) { + auto name = msg.key_name(); + auto table_iter = message_table->find(name); + if (table_iter == message_table->end()) { +message_table->emplace(name, std::vector({msg})); +MXCOLL_DEBUG(coll_global.rank, "Insert new message key [%s] reqeust type [%d] from " +"rank[%d] into message table!\n", name.c_str(), msg.request_type(), +msg.request_rank()); +table_iter = message_table->find(name); + } else { +MXCOLL_DEBUG(coll_global.rank, "Insert existing message key [%s] request type [%d]" +"from rank[%d] into message table!\n", +name.c_str(), msg.request_type(), msg.request_rank()); +table_iter->second.push_back(msg); + } + + int count = table_iter->second.size(); + MXCOLL_DEBUG(coll_global.rank, "Message Key [%s] count [%d]\n", name.c_str(), count); + return count == mpi_size; +} + +int DataTypeToMPIType(int ndarray_dtype, MPIDataType *mpi_dtype) { + if (ndarray_dtype == mshadow::kFloat32) { +*mpi_dtype = MX_MPI_FLOAT32; + } else if (ndarray_dtype == mshadow::kInt32) { +*mpi_dtype = MX_MPI_INT32; + } else if (ndarray_dtype == mshadow::kInt64) { +*mpi_dtype = MX_MPI_INT64; + } else { +return -1; + } + return 0; +} + +MPIResponse 
ConstructMPIResponse(const std::unique_ptr& message_table, + std::string name) { + bool error = false; + auto it = message_table->find(name); + assert(it != message_table->end()); + + std::vector requests = it->second; + assert(requests.size() > 0); + + std::ostringstream error_message_stream; + + auto data_type = requests[0].value_type(); + for (unsigned int i = 1; i
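The coordination logic above boils down to simple bookkeeping on rank 0: each rank's `MPIRequest` for a key is appended to the message table, and the collective for that key is ready once every rank has asked for it. A simplified Python stand-in for `IncrementNDArrayCount` (hypothetical names, not the actual implementation):

```python
def increment_count(message_table, key, rank, mpi_size):
    """Record one rank's request for `key`; return True once all
    `mpi_size` ranks have requested it (i.e. the op can be scheduled)."""
    message_table.setdefault(key, []).append(rank)
    return len(message_table[key]) == mpi_size

table = {}
print(increment_count(table, "w0", 0, 3))  # False
print(increment_count(table, "w0", 1, 3))  # False
print(increment_count(table, "w0", 2, 3))  # True -> all ranks ready
```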
[GitHub] rahul003 commented on a change in pull request #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce
rahul003 commented on a change in pull request #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce URL: https://github.com/apache/incubator-mxnet/pull/10696#discussion_r195260725 ## File path: python/mxnet/gluon/trainer.py ## @@ -125,8 +125,11 @@ def _init_kvstore(self): # optimizer preferably needs to be set before init for multiprecision for i, param in enumerate(self._params): param_arrays = param.list_data() -kvstore.init(i, param_arrays[0]) -kvstore.pull(i, param_arrays, priority=-i) +if 'allreduce' not in kvstore.type: +kvstore.init(i, param_arrays[0]) +kvstore.pull(i, param_arrays, priority=-i) +else: +kvstore.broadcast(i, param_arrays, 0, priority=-i) Review comment: Since push/pull doesn't support an updater on the kvstore, don't you need to check for the case when the user passes update_on_kvstore to the Trainer in Gluon? Have you tested training with Gluon? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
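The diff above splits parameter initialization into two paths: conventional kvstores init each parameter on the store and pull it back, while allreduce-style stores broadcast rank 0's values so every worker starts identical. A sketch of the branching, recording which kvstore calls would be issued (hypothetical helper, not Gluon's `_init_kvstore`):

```python
def init_params(kvstore_type, params):
    """Return the sequence of kvstore operations the diff above would
    perform for each parameter, by kvstore type."""
    steps = []
    for i, _ in enumerate(params):
        if "allreduce" not in kvstore_type:
            steps.append(("init", i))   # seed the store from one device
            steps.append(("pull", i))   # fan the value back out
        else:
            steps.append(("broadcast", i))  # rank 0's value to all ranks
    return steps

print(init_params("dist_sync", ["w", "b"]))
# [('init', 0), ('pull', 0), ('init', 1), ('pull', 1)]
print(init_params("dist_sync_allreduce", ["w", "b"]))
# [('broadcast', 0), ('broadcast', 1)]
```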
[incubator-mxnet] branch v1.2.0-java updated: remove varargs for NDArray operators
This is an automated email from the ASF dual-hosted git repository. liuyizhi pushed a commit to branch v1.2.0-java in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git The following commit(s) were added to refs/heads/v1.2.0-java by this push: new 455150c remove varargs for NDArray operators 455150c is described below commit 455150c8ad3d53916b4e5523c7ea91dc4df0fe9e Author: Yizhi Liu AuthorDate: Wed Jun 13 18:01:09 2018 -0700 remove varargs for NDArray operators --- scala-package/macros/src/main/scala/org/apache/mxnet/NDArrayMacro.scala | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scala-package/macros/src/main/scala/org/apache/mxnet/NDArrayMacro.scala b/scala-package/macros/src/main/scala/org/apache/mxnet/NDArrayMacro.scala index c4d16bc..036b9ec 100644 --- a/scala-package/macros/src/main/scala/org/apache/mxnet/NDArrayMacro.scala +++ b/scala-package/macros/src/main/scala/org/apache/mxnet/NDArrayMacro.scala @@ -70,7 +70,7 @@ private[mxnet] object NDArrayMacro { // def transpose(kwargs: Map[String, Any] = null)(args: Any*) q"def $termName(kwargs: Map[String, Any] = null)(args: Any*) = {genericNDArrayFunctionInvoke($funcName, args, kwargs)}", // def transpose(args: Any*) -q"@scala.annotation.varargs def $termName(args: Any*) = {genericNDArrayFunctionInvoke($funcName, args, null)}" +q"def $termName(args: Any*) = {genericNDArrayFunctionInvoke($funcName, args, null)}" // scalastyle:on ) } -- To stop receiving notification emails like this one, please contact liuyi...@apache.org.
[GitHub] rahul003 commented on a change in pull request #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce
rahul003 commented on a change in pull request #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce URL: https://github.com/apache/incubator-mxnet/pull/10696#discussion_r195262040 ## File path: python/mxnet/model.py ## @@ -98,7 +102,10 @@ def _initialize_kvstore(kvstore, param_arrays, arg_params, param_names, update_o """Initialize kvstore""" for idx, param_on_devs in enumerate(param_arrays): name = param_names[idx] -kvstore.init(name, arg_params[name]) +if 'allreduce' not in kvstore.type: Review comment: This is the sort of check we should have everywhere This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] rahul003 commented on a change in pull request #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce
rahul003 commented on a change in pull request #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce URL: https://github.com/apache/incubator-mxnet/pull/10696#discussion_r195260104 ## File path: make/config.mk ## @@ -152,6 +152,14 @@ USE_F16C = # whether or not to enable multi-machine supporting USE_DIST_KVSTORE = 0 +# whether or not to enable kvstore with type dist_sync_allreduce +USE_ALLREDUCE_DIST_KVSTORE = 0 Review comment: I wonder if MPI_DIST_KVSTORE is a better name since ALL_REDUCE is not necessarily MPI specific. @eric-haibin-lin @piiswrong ? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] rahul003 commented on a change in pull request #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce
rahul003 commented on a change in pull request #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce URL: https://github.com/apache/incubator-mxnet/pull/10696#discussion_r195262215 ## File path: python/mxnet/model.py ## @@ -86,6 +88,8 @@ def _create_kvstore(kvstore, num_device, arg_params): arg_params.values()) if max_size > 1024 * 1024 * 16: update_on_kvstore = False +if kvstore == 'dist_sync_allreduce': Review comment: Merge these two update_on_kvstore to one place This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
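The reviewer's suggestion is to fold the two `update_on_kvstore = False` assignments into a single decision point. A hedged sketch of what the merged logic in `_create_kvstore` could look like (hypothetical helper; thresholds taken from the diff above):

```python
def decide_update_on_kvstore(kvstore, max_param_size):
    """Decide update_on_kvstore in one place: large parameters and
    allreduce kvstores both force updates onto the workers."""
    update_on_kvstore = True
    if max_param_size > 1024 * 1024 * 16 or kvstore == "dist_sync_allreduce":
        update_on_kvstore = False
    return update_on_kvstore

print(decide_update_on_kvstore("dist_sync", 1024))            # True
print(decide_update_on_kvstore("dist_sync_allreduce", 1024))  # False
```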
[GitHub] eric-haibin-lin commented on a change in pull request #11229: [MXNET-379] L1 Normalization
eric-haibin-lin commented on a change in pull request #11229: [MXNET-379] L1 Normalization URL: https://github.com/apache/incubator-mxnet/pull/11229#discussion_r195276646 ## File path: tests/python/unittest/test_operator.py ## @@ -2879,6 +2879,32 @@ def npy_layer_norm(data, gamma, beta, axis=1, eps=1E-5): grad_nodes={'data': req, 'gamma': req, 'beta': req}, numeric_eps=1e-2, rtol=1e-2, atol=1e-2) +@with_seed() +def test_l1_norm(): +ctx = default_context() +data = mx.symbol.Variable('data') +in_data_dim = random_sample([4,5,6], 1)[0] +in_shape = rand_shape_nd(in_data_dim) +for dtype in [np.float16, np.float32, np.float64]: +in_data = np.random.uniform(-1, 1, in_shape).astype(dtype) +for i in range(in_data_dim): +for keep_dims in [True, False]: +norm_sym = mx.symbol.norm(data=data, ord=1, axis=i, keepdims=keep_dims) +npy_out = np.sum(abs(in_data), axis=i, keepdims=keep_dims) +check_symbolic_forward(norm_sym, [in_data], [npy_out], + rtol=1e-2 if dtype is np.float16 else 1e-5, + atol=1e-5, ctx=ctx) +# check gradient +#check_numeric_gradient(norm_sym, [in_data], numeric_eps=1e-3, rtol=1e-2, atol=1e-3) +if i < in_data_dim-1: +norm_sym = mx.symbol.norm(data=data, ord=1, axis=(i, i+1), keepdims=keep_dims) +npy_out = np.sum(abs(in_data), axis=(i, i+1), keepdims=keep_dims) +check_symbolic_forward(norm_sym, [in_data], [npy_out], + rtol=1e-2 if dtype is np.float16 else 1e-5, + atol=1e-5, ctx=ctx) +# check gradient +#check_numeric_gradient(norm_sym, [in_data], numeric_eps=1e-3, rtol=1e-2, atol=1e-3) Review comment: why comment this out? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] eric-haibin-lin commented on a change in pull request #11229: [MXNET-379] L1 Normalization
eric-haibin-lin commented on a change in pull request #11229: [MXNET-379] L1 Normalization URL: https://github.com/apache/incubator-mxnet/pull/11229#discussion_r195276671 ## File path: tests/python/unittest/test_operator.py ## @@ -2879,6 +2879,32 @@ def npy_layer_norm(data, gamma, beta, axis=1, eps=1E-5): grad_nodes={'data': req, 'gamma': req, 'beta': req}, numeric_eps=1e-2, rtol=1e-2, atol=1e-2) +@with_seed() +def test_l1_norm(): +ctx = default_context() +data = mx.symbol.Variable('data') +in_data_dim = random_sample([4,5,6], 1)[0] +in_shape = rand_shape_nd(in_data_dim) +for dtype in [np.float16, np.float32, np.float64]: +in_data = np.random.uniform(-1, 1, in_shape).astype(dtype) +for i in range(in_data_dim): +for keep_dims in [True, False]: +norm_sym = mx.symbol.norm(data=data, ord=1, axis=i, keepdims=keep_dims) +npy_out = np.sum(abs(in_data), axis=i, keepdims=keep_dims) +check_symbolic_forward(norm_sym, [in_data], [npy_out], + rtol=1e-2 if dtype is np.float16 else 1e-5, + atol=1e-5, ctx=ctx) +# check gradient +#check_numeric_gradient(norm_sym, [in_data], numeric_eps=1e-3, rtol=1e-2, atol=1e-3) +if i < in_data_dim-1: +norm_sym = mx.symbol.norm(data=data, ord=1, axis=(i, i+1), keepdims=keep_dims) +npy_out = np.sum(abs(in_data), axis=(i, i+1), keepdims=keep_dims) +check_symbolic_forward(norm_sym, [in_data], [npy_out], + rtol=1e-2 if dtype is np.float16 else 1e-5, + atol=1e-5, ctx=ctx) +# check gradient +#check_numeric_gradient(norm_sym, [in_data], numeric_eps=1e-3, rtol=1e-2, atol=1e-3) Review comment: Also - check_symbolic_backward? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
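On the `check_symbolic_backward` request: the L1 norm is `sum(|x|)`, so its gradient with respect to each element is `sign(x)`, which gives a cheap reference to test the backward pass against. A plain-Python sketch of the expected forward and backward values (illustrative, not the MXNet operator):

```python
def l1_norm(vec):
    """L1 norm of a flat vector: the sum of absolute values."""
    return sum(abs(x) for x in vec)

def l1_norm_grad(vec):
    """Gradient of sum(|x|) w.r.t. each element is sign(x); a symbolic
    backward check would compare the operator's gradient against this."""
    return [(x > 0) - (x < 0) for x in vec]

v = [1.5, -2.0, 0.5, -0.25]
print(l1_norm(v))       # 4.25
print(l1_norm_grad(v))  # [1, -1, 1, -1]
```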
[GitHub] haojin2 commented on a change in pull request #11229: [MXNET-379] L1 Normalization
haojin2 commented on a change in pull request #11229: [MXNET-379] L1 Normalization URL: https://github.com/apache/incubator-mxnet/pull/11229#discussion_r195276978 ## File path: src/operator/tensor/broadcast_reduce_op.h ## @@ -880,27 +880,29 @@ inline bool L2NormStorageType(const nnvm::NodeAttrs& attrs, int& out_stype = out_attrs->at(0); const NormParam& param = nnvm::get(attrs.parsed); bool dispatched = false; - // l2 norm on a particular axis only supports cpu - const bool invalid_ctx = dev_mask != mshadow::cpu::kDevMask; - const auto dispatch_ex = + if (param.ord == 2) { +// l2 norm on a particular axis only supports cpu +const bool invalid_ctx = dev_mask != mshadow::cpu::kDevMask; +const auto dispatch_ex = invalid_ctx ? DispatchMode::kFComputeFallback : DispatchMode::kFComputeEx; - if (!dispatched && in_stype == kDefaultStorage) { -// dns -> dns -dispatched = storage_type_assign(_stype, kDefaultStorage, dispatch_mode, - DispatchMode::kFCompute); - } - const TShape axis = param.axis.has_value() ? param.axis.value() : TShape(); - if (!dispatched && (in_stype == kRowSparseStorage || in_stype == kCSRStorage) && - axis.ndim() == 0 && param.ord == 2) { -// l2 norm: rsp/csr, axis = () -> dns -dispatched = storage_type_assign(_stype, kDefaultStorage, dispatch_mode, - DispatchMode::kFComputeEx); - } - if (!dispatched && in_stype == kCSRStorage && axis.ndim() == 1 && !param.keepdims && - (axis[0] == 0 || axis[0] == 1) && param.ord == 2) { -// l2 norm: csr, axis = 0/1 -> dns -dispatched = storage_type_assign(_stype, kDefaultStorage, dispatch_mode, - dispatch_ex); +if (!dispatched && in_stype == kDefaultStorage) { + // dns -> dns + dispatched = storage_type_assign(_stype, kDefaultStorage, dispatch_mode, + DispatchMode::kFCompute); +} Review comment: Please take the default case out of the if branch: ```c++ if (!dispatched && in_stype == kDefaultStorage) { // ... 
}
if (param.ord == 2) {
  // the rest cases
}
if (!dispatched) {
  // fallback
}
``` This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
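The reviewer's suggested structure is a classic fallthrough dispatch: handle the dense case first regardless of `ord`, then the `ord == 2` sparse cases, then fall back. A Python stand-in for the shape of that control flow (hypothetical names, not the actual `broadcast_reduce_op.h` code):

```python
def assign_storage(in_stype, ord_):
    """Pick a dispatch mode by trying cases in order and falling through
    to a generic fallback, mirroring the suggested restructuring."""
    dispatched = None
    if dispatched is None and in_stype == "default":
        dispatched = "fcompute"          # dns -> dns, any ord
    if dispatched is None and ord_ == 2:
        if in_stype in ("row_sparse", "csr"):
            dispatched = "fcompute_ex"   # sparse l2-norm kernels
    if dispatched is None:
        dispatched = "fallback"          # everything else
    return dispatched

print(assign_storage("default", 1))  # fcompute
print(assign_storage("csr", 2))      # fcompute_ex
print(assign_storage("csr", 1))      # fallback
```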
[GitHub] aaronmarkham commented on issue #11238: UX for ONNX Documentation is broken
aaronmarkham commented on issue #11238: UX for ONNX Documentation is broken URL: https://github.com/apache/incubator-mxnet/issues/11238#issuecomment-397090798 Upon further investigation I think this commit triggered what we're seeing today. https://github.com/apache/incubator-mxnet/pull/9534/files @szha @astonzhang - Can you guys clarify what's going on with the name changes and what you think should be done to fix these errors? Why don't we use index.html anymore for each folder? Why no /api/index.html? This js file is looking for index.html for some URL tracker... what's this all for anyway? https://github.com/apache/incubator-mxnet/blob/5dde19ffbef016d2bf7c67347f860f0330361a35/docs/_static/js/page.js This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] anirudh2290 commented on issue #11265: Flaky test: Python 3: CPU Win
anirudh2290 commented on issue #11265: Flaky test: Python 3: CPU Win URL: https://github.com/apache/incubator-mxnet/issues/11265#issuecomment-397128472 Is there something wrong with the Windows builds? I came across similar issues here: http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-11127/12/pipeline/ and http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-11264/4/pipeline @lanking520 @marcoabreu This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] pengzhao-intel commented on issue #11129: [MXNET-497]Test kAddTo request for mkldnn operators
pengzhao-intel commented on issue #11129: [MXNET-497]Test kAddTo request for mkldnn operators URL: https://github.com/apache/incubator-mxnet/pull/11129#issuecomment-397127682 @zheng-da agree. It's nice to keep consistency between the title/description and the real changes. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] hcho3 commented on issue #11209: [MXNET-536] implement var/std operators
hcho3 commented on issue #11209: [MXNET-536] implement var/std operators URL: https://github.com/apache/incubator-mxnet/pull/11209#issuecomment-397126291 @piiswrong I ended up changing the variance implementation slightly to improve precision. As for testing backward results, I used both numerical and symbolic tests. Numerical gradients have low precision, so I set a high threshold for numerical tests. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] anirudh2290 opened a new pull request #11267: [WIP] Add NEWS and README
anirudh2290 opened a new pull request #11267: [WIP] Add NEWS and README URL: https://github.com/apache/incubator-mxnet/pull/11267 ## Description ## Added NEWs and README. @ThomasDelteil @piiswrong @zheng-da @eric-haibin-lin @nswamy @lebeg @lanking520 @andrewfayres ## Checklist ## ### Essentials ### Please feel free to remove inapplicable items for your PR. - [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) created (except PRs with tiny changes) - [ ] Changes are complete (i.e. I finished coding on this PR) - [ ] All changes have test coverage: - Unit tests are added for small changes to verify correctness (e.g. adding a new operator) - Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore) - Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL) - [ ] Code is well-documented: - For user-facing API changes, API doc string has been updated. - For new C++ functions in header files, their functionalities and arguments are documented. - For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable - Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html - [ ] To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change ### Changes ### - [ ] Feature1, tests, (and when applicable, API doc) - [ ] Feature2, tests, (and when applicable, API doc) ## Comments ## - If this change is a backward incompatible change, why must this change be made. - Interesting edge cases to note here This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] lupesko commented on issue #10867: Scala Module API resize is leaking memory on the native side.
lupesko commented on issue #10867: Scala Module API resize is leaking memory on the native side. URL: https://github.com/apache/incubator-mxnet/issues/10867#issuecomment-397124685 Hey @jessebrizzi - thanks for reporting the issue! I noticed that your example code is using the old namespace... can you reproduce on MXNet latest (1.2.0) and with the new package that is available on Maven? Adding @nswamy, who contributes to the Scala package. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] eric-haibin-lin opened a new pull request #11266: Add clip_global_norm(row_sparse_grad). Fix row_sparse_param.save()
eric-haibin-lin opened a new pull request #11266: Add clip_global_norm(row_sparse_grad). Fix row_sparse_param.save() URL: https://github.com/apache/incubator-mxnet/pull/11266 ## Description ## As title. Test cases are updated accordingly. ## Checklist ## ### Essentials ### Please feel free to remove inapplicable items for your PR. - [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) created (except PRs with tiny changes) - [ ] Changes are complete (i.e. I finished coding on this PR) - [ ] All changes have test coverage: - Unit tests are added for small changes to verify correctness (e.g. adding a new operator) - Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore) - Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL) - [ ] Code is well-documented: - For user-facing API changes, API doc string has been updated. - For new C++ functions in header files, their functionalities and arguments are documented. - For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable - Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html - [ ] To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change ### Changes ### - [ ] Feature1, tests, (and when applicable, API doc) - [ ] Feature2, tests, (and when applicable, API doc) ## Comments ## - If this change is a backward incompatible change, why must this change be made. - Interesting edge cases to note here This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] Roshrini commented on a change in pull request #11213: [MXNET-533] MXNet-ONNX export
Roshrini commented on a change in pull request #11213: [MXNET-533] MXNet-ONNX export URL: https://github.com/apache/incubator-mxnet/pull/11213#discussion_r195260307 ## File path: python/mxnet/contrib/onnx/_export/op_translations.py ## @@ -0,0 +1,1667 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. +# +# Based on +# https://github.com/NVIDIA/mxnet_to_onnx/blob/master/mx2onnx_converter/ +# mx2onnx_converter_functions.py +# Copyright (c) 2017, NVIDIA CORPORATION. All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# * Redistributions of source code must retain the above copyright +#notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +#notice, this list of conditions and the following disclaimer in the +#documentation and/or other materials provided with the distribution. +# * Neither the name of NVIDIA CORPORATION nor the names of its +#contributors may be used to endorse or promote products derived +#from this software without specific prior written permission. 
+# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY +# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR +# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, +# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, +# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY +# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +# coding: utf-8 +# pylint: disable=too-many-locals,no-else-return,too-many-lines +# pylint: disable=anomalous-backslash-in-string,eval-used +""" +Conversion Functions for common layers. +Add new functions here with a decorator. +""" +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function +from __future__ import unicode_literals +import re +import numpy as np + +from onnx import helper, numpy_helper, mapping +from .export_onnx import MXNetGraph as mx_op + + +@mx_op.register("null") +def convert_weights_and_inputs(node, **kwargs): +"""Helper function to convert weights and inputs. 
+""" +name = node["name"] + +if kwargs["is_input"] is False: +weights = kwargs["weights"] +initializer = kwargs["initializer"] +np_arr = weights[name] +data_type = mapping.NP_TYPE_TO_TENSOR_TYPE[np_arr.dtype] +dims = np.shape(np_arr) + +tensor_node = helper.make_tensor_value_info(name, data_type, dims) + +initializer.append( +helper.make_tensor( +name=name, +data_type=data_type, +dims=dims, +vals=np_arr.flatten().tolist(), +raw=False, +) +) + +return [tensor_node] +else: +tval_node = helper.make_tensor_value_info(name, kwargs["in_type"], kwargs["in_shape"]) +return [tval_node] + + +def parse_helper(attrs, attrs_name, alt_value=None): +"""Helper function to parse operator attributes in required format.""" +tuple_re = re.compile('\([0-9L|,| ]+\)') +if attrs is None: +return alt_value +attrs_str = str(attrs.get(attrs_name)) Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] piiswrong commented on a change in pull request #11199: [MXNET-538] Add XUnit to python tests
piiswrong commented on a change in pull request #11199: [MXNET-538] Add XUnit to python tests URL: https://github.com/apache/incubator-mxnet/pull/11199#discussion_r195259435 ## File path: ci/docker/runtime_functions.sh ##
@@ -448,9 +448,9 @@ unittest_ubuntu_python2_cpu() {
 # https://github.com/apache/incubator-mxnet/issues/10026
 #export MXNET_MKLDNN_DEBUG=1 # Ignored if not present
 export MXNET_STORAGE_FALLBACK_LOG_VERBOSE=0
-nosetests-2.7 --verbose tests/python/unittest
-nosetests-2.7 --verbose tests/python/train
-nosetests-2.7 --verbose tests/python/quantization
+nosetests-2.7 --with-xunit --xunit-file nosetests1.xml --verbose tests/python/unittest
Review comment: use more meaningful names instead of nosetests1/2/3?
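The review suggestion above — replacing `nosetests1/2/3.xml` with names derived from the suite being run — can be sketched as follows. `xunit_filename` is a hypothetical helper illustrating the naming scheme, not part of the CI scripts:

```python
import os

def xunit_filename(suite_path):
    """Map a test-suite path to a descriptive xunit report name,
    e.g. tests/python/unittest -> nosetests_unittest.xml.
    (Hypothetical helper illustrating the review suggestion.)"""
    return "nosetests_%s.xml" % os.path.basename(suite_path.rstrip("/"))

# Emit one nosetests invocation per suite, each writing its own report.
for suite in ("tests/python/unittest", "tests/python/train", "tests/python/quantization"):
    print("nosetests-2.7 --with-xunit --xunit-file %s --verbose %s"
          % (xunit_filename(suite), suite))
```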
[GitHub] kalyc commented on issue #11265: Flaky test: Python 3: CPU Win
kalyc commented on issue #11265: Flaky test: Python 3: CPU Win URL: https://github.com/apache/incubator-mxnet/issues/11265#issuecomment-397109914 This appears to be a permissions issue `remote file operation failed: C:/jenkins_slave/workspace/ut-python-cpu at hudson.remoting.Channel@232e68f3:JNLP4-connect connection from ip-172-31-2-173.us-west-2.compute.internal/172.31.2.173:49714: java.nio.file.AccessDeniedException: C:\jenkins_slave\workspace\ut-python-cpu`
[GitHub] kalyc commented on issue #11265: Flaky test: Python 3: CPU Win
kalyc commented on issue #11265: Flaky test: Python 3: CPU Win URL: https://github.com/apache/incubator-mxnet/issues/11265#issuecomment-397109698 @aaronmarkham thanks for submitting the issue. @sandeep-krishnamurthy could you please add the "CI" and "Python" labels to this?
[GitHub] sandeep-krishnamurthy closed issue #11028: Pre-trained Shufflenet model fails during inference on mxnet-mkl==1.2.0
sandeep-krishnamurthy closed issue #11028: Pre-trained Shufflenet model fails during inference on mxnet-mkl==1.2.0 URL: https://github.com/apache/incubator-mxnet/issues/11028
[GitHub] lanking520 commented on issue #11264: [MXNET-543] disable scalatest on Spark
lanking520 commented on issue #11264: [MXNET-543] disable scalatest on Spark URL: https://github.com/apache/incubator-mxnet/pull/11264#issuecomment-397098926 ~~Please don't merge at this time, still WIP~~ Finished fixing... fingers crossed that it passes the tests
[GitHub] lanking520 commented on issue #11264: [MXNET-543][DO NOT MERGE] disable scalatest on Spark
lanking520 commented on issue #11264: [MXNET-543][DO NOT MERGE] disable scalatest on Spark URL: https://github.com/apache/incubator-mxnet/pull/11264#issuecomment-397098926 Please Don't merge at this time, still WIP
[GitHub] mbaijal commented on a change in pull request #10827: [MXNET-405][WIP] Add 2 new pipelines to the Official CI and run nightly tests.
mbaijal commented on a change in pull request #10827: [MXNET-405][WIP] Add 2 new pipelines to the Official CI and run nightly tests. URL: https://github.com/apache/incubator-mxnet/pull/10827#discussion_r195244742 ## File path: tests/nightly/broken_link_checker_test/broken_link_checker.sh ##
@@ -0,0 +1,42 @@
+#!/usr/bin/env bash
+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+#Author: Amol Lele
+
+#software-properties-common, curl are installed in the docker container 'ubuntu_cpu'
+# install git-core
+
+#rm -rf incubator-mxnet-site && rm -rf website-link-checker
+
+git config --global user.email \"$apache_usern...@gl.com\" && git config --global user.name \"$APACHE_USERNAME\"
+
+echo "clone the repo and checkout the correct branch"
+git clone https://$APACHE_USERNAME:$apache_passw...@github.com/leleamol/incubator-mxnet-site.git
Review comment: Yes, I will remove these files from this PR. It will come as part of a separate PR from Amol.
[GitHub] mbaijal commented on a change in pull request #10827: [MXNET-405][WIP] Add 2 new pipelines to the Official CI and run nightly tests.
mbaijal commented on a change in pull request #10827: [MXNET-405][WIP] Add 2 new pipelines to the Official CI and run nightly tests. URL: https://github.com/apache/incubator-mxnet/pull/10827#discussion_r195244575 ## File path: tests/nightly/broken_link_checker_test/JenkinsfileForBLC ##
@@ -0,0 +1,72 @@
+// -*- mode: groovy -*-
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+
+//This is a Jenkinsfile for the broken link checker test.
+
+err = null
+
+def init_git() {
Review comment: This is a future enhancement, right? Can we track this somewhere?
[GitHub] mbaijal commented on a change in pull request #10827: [MXNET-405][WIP] Add 2 new pipelines to the Official CI and run nightly tests.
mbaijal commented on a change in pull request #10827: [MXNET-405][WIP] Add 2 new pipelines to the Official CI and run nightly tests. URL: https://github.com/apache/incubator-mxnet/pull/10827#discussion_r195244298 ## File path: tests/nightly/Jenkinsfile ## @@ -0,0 +1,180 @@ +// -*- mode: groovy -*- +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + + +//This is a Jenkinsfile for nightly tests. 
The format and some functions have been picked up from the top-level Jenkinsfile + +err = null +mx_lib = 'lib/libmxnet.so, lib/libmxnet.a, 3rdparty/dmlc-core/libdmlc.a, 3rdparty/nnvm/lib/libnnvm.a' + +// pack libraries for later use +def pack_lib(name, libs=mx_lib) { + sh """ +echo "Packing ${libs} into ${name}" +echo ${libs} | sed -e 's/,/ /g' | xargs md5sum +""" + stash includes: libs, name: name +} + +// unpack libraries saved before +def unpack_lib(name, libs=mx_lib) { + unstash name + sh """ +echo "Unpacked ${libs} from ${name}" +echo ${libs} | sed -e 's/,/ /g' | xargs md5sum +""" +} + +def init_git() { + deleteDir() + retry(5) { +try { + timeout(time: 15, unit: 'MINUTES') { +checkout scm +sh 'git submodule update --init --recursive' +sh 'git clean -d -f' + } +} catch (exc) { + deleteDir() + error "Failed to fetch source codes with ${exc}" + sleep 2 +} + } +} + + +try { + stage('NightlyTests'){ +parallel 'RATCheck: CPU': { + node('mxnetlinux-cpu') { +ws('workspace/nt-RATTest') { + init_git() + sh "ci/build.py --platform ubuntu_nightly_cpu /work/runtime_functions.sh nightly_test_rat_check" +} + } +}, +'CompilationWarnings: GPU': { + node('mxnetlinux-gpu') { +ws('workspace/nt-compilationTest') { + init_git() + sh "ci/build.py --platform ubuntu_nightly_gpu /work/runtime_functions.sh nightly_test_compilation_warning" +} + } +}, +'InstallationGuide: CPU': { + node('mxnetlinux-gpu') { +ws('workspace/nt-Installation-cpu') { + init_git() + sh "ci/build.py --platform ubuntu_base_cpu /work/runtime_functions.sh nightly_test_installation ubuntu_python_cpu_virtualenv" + sh "ci/build.py --platform ubuntu_base_cpu /work/runtime_functions.sh nightly_test_installation ubuntu_python_cpu_pip" + //Docker installation test is commented out since this would lead to docker within a docker + //sh "ci/build.py --platform ubuntu_base_cpu /work/runtime_functions.sh nightly_test_installation ubuntu_python_cpu_docker" + sh "ci/build.py --platform ubuntu_base_gpu /work/runtime_functions.sh 
nightly_test_installation ubuntu_python_cpu_source" +} + } +}, +'InstallationGuide: GPU': { + node('mxnetlinux-gpu') { +ws('workspace/nt-Installation-gpu') { + init_git() + sh "ci/build.py --platform ubuntu_base_gpu /work/runtime_functions.sh nightly_test_installation ubuntu_python_gpu_virtualenv" + sh "ci/build.py --platform ubuntu_base_gpu /work/runtime_functions.sh nightly_test_installation ubuntu_python_gpu_pip" + //Docker installation test is commented out since this would lead to docker within a docker + //sh "ci/build.py --platform ubuntu_base_gpu /work/runtime_functions.sh nightly_test_installation ubuntu_python_gpu_docker" + sh "ci/build.py --platform ubuntu_base_gpu /work/runtime_functions.sh nightly_test_installation ubuntu_python_gpu_source" +} + } +}, +'PipTest: GPU': { + node('mxnetlinux-gpu') { +ws('workspace/nt-pipTest') { + init_git() + //sh "ci/build.py --platform ubuntu_nightly_gpu /work/runtime_functions.sh nightly_test_pip_test" +} + } +}, +'Amalgamation-atlas: CPU': { + node('mxnetlinux-cpu') { +ws('workspace/nt-amalgamation1') { + init_git() + sh "ci/build.py --platform ubuntu_nightly_cpu /work/runtime_functions.sh nightly_test_amalgamation USE_BLAS=atlas" +} + } +}, +'Amalgamation-atlas-min: CPU': { + node('mxnetlinux-cpu') { +ws('workspace/nt-amalgamation2') { +
[GitHub] mbaijal commented on a change in pull request #10827: [MXNET-405][WIP] Add 2 new pipelines to the Official CI and run nightly tests.
mbaijal commented on a change in pull request #10827: [MXNET-405][WIP] Add 2 new pipelines to the Official CI and run nightly tests. URL: https://github.com/apache/incubator-mxnet/pull/10827#discussion_r195244120 ## File path: tests/nightly/Jenkinsfile ##
[GitHub] mbaijal commented on a change in pull request #10827: [MXNET-405][WIP] Add 2 new pipelines to the Official CI and run nightly tests.
mbaijal commented on a change in pull request #10827: [MXNET-405][WIP] Add 2 new pipelines to the Official CI and run nightly tests. URL: https://github.com/apache/incubator-mxnet/pull/10827#discussion_r195244019 ## File path: tests/nightly/Jenkinsfile ##
+'PipTest: GPU': {
+  node('mxnetlinux-gpu') {
+    ws('workspace/nt-pipTest') {
+      init_git()
+      //sh "ci/build.py --platform ubuntu_nightly_gpu /work/runtime_functions.sh nightly_test_pip_test"
Review comment: Yes. This test is deprecated, but somehow deleting this task from the Jenkinsfile messes up the whole pipeline naming in Blue Ocean. I believe there is some formatting of the Jenkinsfile I need to fix. @marcoabreu, ideas?
[GitHub] mbaijal commented on a change in pull request #10827: [MXNET-405][WIP] Add 2 new pipelines to the Official CI and run nightly tests.
mbaijal commented on a change in pull request #10827: [MXNET-405][WIP] Add 2 new pipelines to the Official CI and run nightly tests. URL: https://github.com/apache/incubator-mxnet/pull/10827#discussion_r195243501 ## File path: tests/jenkins/run_test_installation_docs.sh ##
@@ -262,29 +262,47 @@ LINUX_PYTHON_CPU_END_LINENO=$(grep -n "END - Linux Python CPU Installation Instr
 set_instruction_set ${LINUX_PYTHON_CPU_START_LINENO} ${LINUX_PYTHON_CPU_END_LINENO}
-echo
-echo "### Testing Virtualenv ###"
-echo "${virtualenv_commands}"
-echo
-docker run --rm ubuntu:14.04 bash -c "${virtualenv_commands}"
+ubuntu_python_cpu_virtualenv()
+{
+    echo
+    echo "### Testing Virtualenv ###"
+    echo "${virtualenv_commands}" #> "$filewithcommands"
+    echo
+    eval ${virtualenv_commands}
+    echo "ubuntu_python_cpu_virtualenv: MXNet Installed Successfully"
+}
-echo
-echo "### Testing Pip ###"
-echo "${pip_commands}"
-echo
-docker run --rm ubuntu:14.04 bash -c "${pip_commands}"
+ubuntu_python_cpu_pip()
+{
+    echo
+    echo "### Testing Pip ###"
+    echo "${pip_commands}"
+    echo
+    eval ${pip_commands}
+    echo "ubuntu_python_cpu_pip: MXNet Installed Successfully"
Review comment: Good tip, thanks!
[incubator-mxnet] 01/12: Add Windows MKLDNN Building Instruction (#10613)
This is an automated email from the ASF dual-hosted git repository. anirudh2290 pushed a commit to branch v1.2.0 in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git commit bc9a2d2e9f00a89fac738366a3347c9ca4788a7e Author: XinyuChen AuthorDate: Fri May 18 20:59:50 2018 +0800 Add Windows MKLDNN Building Instruction (#10613) * add windows mkldnn instruction * update readme * typo full mkl to mkldnn * update blas * update mxnet url * update mkl build * intel mkl liscence * retrigger --- MKL_README.md | 96 ++- docs/install/windows_setup.md | 4 +- 2 files changed, 79 insertions(+), 21 deletions(-) diff --git a/MKL_README.md b/MKL_README.md index 5374adb..a5c63b0 100644 --- a/MKL_README.md +++ b/MKL_README.md @@ -1,19 +1,77 @@ -# Full MKL Installation - -## Build/Install MXNet with a full MKL installation: -Installing and enabling the full MKL installation enables MKL support for all operators under the linalg namespace. - - 1. Download and install the latest full MKL version following instructions on the [intel website.](https://software.intel.com/en-us/articles/intel-mkl-111-install-guide) - - 2. Set USE_BLAS=mkl in make/config.mk - -1.1 Set ADD_LDFLAGS=-L (ex. ADD_LDFLAGS=-L/opt/intel/compilers_and_libraries_2018.0.128/linux/mkl/lib) - -1.1 Set ADD_CFLAGS=-I (ex. ADD_CFLAGS=-L/opt/intel/compilers_and_libraries_2018.0.128/linux/mkl/include) - - 3. Run 'make -j ${nproc}' - - 4. Navigate into the python directory - - 5. Run 'sudo python setup.py install' - +## Build/Install MXNet with a full MKL installation: + +To make it convenient for customers, Intel introduced a new license called [Intel® Simplified license](https://software.intel.com/en-us/license/intel-simplified-software-license) that allows to redistribute not only dynamic libraries but also headers, examples and static libraries. + +Installing and enabling the full MKL installation enables MKL support for all operators under the linalg namespace. + + 1. 
Download and install the latest full MKL version following instructions on the [intel website.](https://software.intel.com/en-us/mkl) + + 2. Run 'make -j ${nproc} USE_BLAS=mkl' + + 3. Navigate into the python directory + + 4. Run 'sudo python setup.py install' + + +## Build/Install MXNet with MKLDNN on Windows: + +To build and install MXNet yourself, you need the following dependencies. Install the required dependencies: + +1. If [Microsoft Visual Studio 2015](https://www.visualstudio.com/vs/older-downloads/) is not already installed, download and install it. You can download and install the free community edition. +2. Download and Install [CMake](https://cmake.org/) if it is not already installed. +3. Download and install [OpenCV](http://sourceforge.net/projects/opencvlibrary/files/opencv-win/3.0.0/opencv-3.0.0.exe/download). +4. Unzip the OpenCV package. +5. Set the environment variable ```OpenCV_DIR``` to point to the ```OpenCV build directory``` (```C:\opencv\build\x64\vc14``` for example). Also, you need to add the OpenCV bin directory (```C:\opencv\build\x64\vc14\bin``` for example) to the ``PATH`` variable. +6. If you have Intel Math Kernel Library (MKL) installed, set ```MKL_ROOT``` to point to ```MKL``` directory that contains the ```include``` and ```lib```. If you want to use MKL blas, you should set ```-DUSE_BLAS=mkl``` when cmake. Typically, you can find the directory in +```C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2018\windows\mkl```. +7. If you don't have the Intel Math Kernel Library (MKL) installed, download and install [OpenBLAS](http://sourceforge.net/projects/openblas/files/v0.2.14/). Note that you should also download ```mingw64.dll.zip`` along with openBLAS and add them to PATH. +8. Set the environment variable ```OpenBLAS_HOME``` to point to the ```OpenBLAS``` directory that contains the ```include``` and ```lib``` directories. Typically, you can find the directory in ```C:\Program files (x86)\OpenBLAS\```. 
+ +After you have installed all of the required dependencies, build the MXNet source code: + +1. Download the MXNet source code from [GitHub](https://github.com/apache/incubator-mxnet). Don't forget to pull the submodules: +``` +git clone https://github.com/apache/incubator-mxnet.git --recursive +``` + +2. Copy file `3rdparty/mkldnn/config_template.vcxproj` to incubator-mxnet root. + +3. Start a Visual Studio command prompt. + +4. Use [CMake](https://cmake.org/) to create a Visual Studio solution in ```./build``` or some other directory. Make sure to specify the architecture in the +[CMake](https://cmake.org/) command: +``` +mkdir build +cd build +cmake -G "Visual Studio 14 Win64" .. -DUSE_CUDA=0 -DUSE_CUDNN=0 -DUSE_NVRTC=0 -DUSE_OPENCV=1 -DUSE_OPENMP=1 -DUSE_PROFILER=1 -DUSE_BLAS=open -DUSE_LAPACK=1 -DUSE_DIST_KVSTORE=0
[incubator-mxnet] 02/12: [MXNET-33] SSD example not working with mkl-dnn (#10021)
This is an automated email from the ASF dual-hosted git repository. anirudh2290 pushed a commit to branch v1.2.0 in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git commit db24cc0bed085187bcdbecb144807d756898f51a Author: Ashok Emani AuthorDate: Tue Apr 24 10:48:01 2018 -0700 [MXNET-33] SSD example not working with mkl-dnn (#10021) * use mkl-dnn for 'valid' pooling_convention only * pooling convention full not supported by current mkl-dnn impl * disable unreachable code * add sample model test for mkldnn * fix review feedback * add jira link to comment * fix lint issue * rename python test for mkl * enable python tests for mkldnn in CI * use vgg16 with convention full * fix unittest --- Jenkinsfile| 8 +- ci/docker/runtime_functions.sh | 12 + src/operator/nn/mkldnn/mkldnn_pooling-inl.h| 6 + .../data/test_mkldnn_test_mkldnn_model_model1.json | 770 + tests/python/mkl/test_mkldnn.py| 94 +++ 5 files changed, 889 insertions(+), 1 deletion(-) diff --git a/Jenkinsfile b/Jenkinsfile index 8686012..7167f14 100644 --- a/Jenkinsfile +++ b/Jenkinsfile @@ -107,6 +107,12 @@ def python3_ut(docker_container_name) { } } +def python3_ut_mkldnn(docker_container_name) { + timeout(time: max_time, unit: 'MINUTES') { +sh "ci/build.py --build --platform ${docker_container_name} /work/runtime_functions.sh unittest_ubuntu_python3_cpu_mkldnn" + } +} + // GPU test has two parts. 
1) run unittest on GPU, 2) compare the results on // both CPU and GPU // Python 2 @@ -438,7 +444,7 @@ try { ws('workspace/ut-python3-mkldnn-cpu') { init_git() unpack_lib('mkldnn_cpu', mx_mkldnn_lib) - python3_ut('ubuntu_cpu') + python3_ut_mkldnn('ubuntu_cpu') } } }, diff --git a/ci/docker/runtime_functions.sh b/ci/docker/runtime_functions.sh index 77ffad1..4d0f846 100755 --- a/ci/docker/runtime_functions.sh +++ b/ci/docker/runtime_functions.sh @@ -375,6 +375,18 @@ unittest_ubuntu_python3_cpu() { nosetests-3.4 --verbose tests/python/quantization } +unittest_ubuntu_python3_cpu_mkldnn() { +set -ex +export PYTHONPATH=./python/ +# MXNET_MKLDNN_DEBUG is buggy and produces false positives +# https://github.com/apache/incubator-mxnet/issues/10026 +#export MXNET_MKLDNN_DEBUG=1 # Ignored if not present +export MXNET_STORAGE_FALLBACK_LOG_VERBOSE=0 +nosetests-3.4 --verbose tests/python/unittest +nosetests-3.4 --verbose tests/python/quantization +nosetests-3.4 --verbose tests/python/mkl +} + unittest_ubuntu_python2_gpu() { set -ex export PYTHONPATH=./python/ diff --git a/src/operator/nn/mkldnn/mkldnn_pooling-inl.h b/src/operator/nn/mkldnn/mkldnn_pooling-inl.h index 2097d57..4b6235e 100644 --- a/src/operator/nn/mkldnn/mkldnn_pooling-inl.h +++ b/src/operator/nn/mkldnn/mkldnn_pooling-inl.h @@ -92,12 +92,18 @@ inline bool SupportMKLDNNPooling(const PoolingParam , if (param.pooling_convention == pool_enum::kValid) return true; + else +return false; +// need to support pooling convention full +// https://issues.apache.org/jira/browse/MXNET-33 +#if 0 if (((dshape[2] + 2 * param.pad[0] - param.kernel[0]) % param.stride[0] == 0) && ((dshape[3] + 2 * param.pad[1] - param.kernel[1]) % param.stride[1] == 0)) return true; else return false; +#endif } inline bool MKLDNNRequireWorkspace(const PoolingParam ) { diff --git a/tests/python/mkl/data/test_mkldnn_test_mkldnn_model_model1.json b/tests/python/mkl/data/test_mkldnn_test_mkldnn_model_model1.json new file mode 100644 index 000..ba822f5 --- 
/dev/null +++ b/tests/python/mkl/data/test_mkldnn_test_mkldnn_model_model1.json @@ -0,0 +1,770 @@ +{ + "nodes": [ +{ + "op": "null", + "name": "data", + "inputs": [] +}, +{ + "op": "null", + "name": "conv1_1_weight", + "attrs": { +"kernel": "(3, 3)", +"num_filter": "64", +"pad": "(1, 1)" + }, + "inputs": [] +}, +{ + "op": "null", + "name": "conv1_1_bias", + "attrs": { +"kernel": "(3, 3)", +"num_filter": "64", +"pad": "(1, 1)" + }, + "inputs": [] +}, +{ + "op": "Convolution", + "name": "conv1_1", + "attrs": { +"kernel": "(3, 3)", +"num_filter": "64", +"pad": "(1, 1)" + }, + "inputs": [[0, 0, 0], [1, 0, 0], [2, 0, 0]] +}, +{ + "op": "Activation", + "name": "relu1_1", + "attrs": {"act_type": "relu"}, + "inputs": [[3, 0, 0]] +}, +{ + "op": "null", + "name": "conv1_2_weight", + "attrs": { +"kernel": "(3, 3)", +"num_filter": "64", +"pad": "(1, 1)" + }, + "inputs": [] +
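The pooling change above makes `SupportMKLDNNPooling` fall back to the default implementation whenever the convention is not 'valid', and the disabled code path checks whether stride divides the padded span. The difference between the two conventions is just floor-vs-ceil output sizing, which this small sketch illustrates (the helper name is hypothetical, not an MXNet API):

```python
import math

def pool_out_size(in_size, kernel, pad, stride, convention):
    # One spatial dim of pooling output, MXNet-style conventions:
    # 'valid' floors away a partial window, 'full' keeps it (ceil).
    span = in_size + 2 * pad - kernel
    if convention == "valid":
        return span // stride + 1
    if convention == "full":
        return math.ceil(span / stride) + 1
    raise ValueError(convention)

# The two conventions only differ when stride does not divide the span,
# which is exactly the (span % stride == 0) check in the disabled branch:
print(pool_out_size(8, 3, 0, 2, "valid"))  # 3
print(pool_out_size(8, 3, 0, 2, "full"))   # 4
```

When stride divides the span (e.g. `in_size=7, kernel=3, stride=2`), both conventions agree, which is why the `#if 0` code could have returned true in that case.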
[incubator-mxnet] 09/12: handle NDArray slice properly for mkldnn layout
This is an automated email from the ASF dual-hosted git repository. anirudh2290 pushed a commit to branch v1.2.0 in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git commit 22966a0e6c9759cb83b723bbd05eb01946a00ce7 Author: Ashok Emani AuthorDate: Thu Apr 19 14:07:46 2018 -0700 handle NDArray slice properly for mkldnn layout --- src/ndarray/ndarray.cc | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/src/ndarray/ndarray.cc b/src/ndarray/ndarray.cc index 4b45969..0175c5c 100644 --- a/src/ndarray/ndarray.cc +++ b/src/ndarray/ndarray.cc @@ -542,11 +542,18 @@ NDArray NDArray::Reorder2Default() const { if (format == ptr_->mkl_mem_->GetFormat()) return *this; - NDArray ret(shape(), ctx(), false, dtype()); + // create new ndarray from mkldnn layout + mkldnn::memory::desc from_desc = ptr_->mkl_mem_->GetPrimitiveDesc().desc(); + TShape tshape(from_desc.data.ndims); + for (int i = 0; i < from_desc.data.ndims; i++) tshape[i] = from_desc.data.dims[i]; + NDArray ret(tshape, ctx(), false, dtype()); mkldnn::memory::primitive_desc def_pd = ptr_->mkl_mem_->GetPrimitiveDesc(format); CHECK(ret.ptr_->shandle.size >= def_pd.get_size()); mkldnn::memory def_mem(def_pd, ret.ptr_->shandle.dptr); ptr_->mkl_mem_->ReorderTo(_mem); + // reshape as needed + ret.shape_ = shape_; + ret.byte_offset_ = byte_offset_; return ret; } -- To stop receiving notification emails like this one, please contact anirudh2...@apache.org.
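The fix above sizes the destination array from the MKL-DNN memory's own dims (the whole chunk) and only afterwards restores the view's logical `shape_` and `byte_offset_`. A toy flat-buffer model of that order of operations (hypothetical names, not the NDArray API):

```python
def reorder_view_to_default(chunk_values, view_shape, view_offset):
    # Step 1: copy the *entire* chunk out of the blocked layout; the
    # destination must be allocated at chunk size, not view size.
    dense = list(chunk_values)
    # Step 2: reattach the view metadata (shape_ / byte_offset_ in the
    # diff) so the caller still sees only its slice of the reordered data.
    n = 1
    for d in view_shape:
        n *= d
    return dense[view_offset:view_offset + n]

# A (1, 4) view at offset 4 into a 12-element chunk:
print(reorder_view_to_default(list(range(12)), (1, 4), 4))  # [4, 5, 6, 7]
```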
[incubator-mxnet] 12/12: Fix bugs in MKLDNN. (#10979)
This is an automated email from the ASF dual-hosted git repository. anirudh2290 pushed a commit to branch v1.2.0 in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git commit 62a47a7ecf203138b0b0b953bcc7c6fceb1ba0e1 Author: Da Zheng AuthorDate: Fri May 25 10:11:45 2018 -0700 Fix bugs in MKLDNN. (#10979) * Fix bugs in MKLDNN. * add more test cases. * Fix CopyFrom when it's the view of an NDArray. * add test. * check same shape correctly. * add unit test for CopyFrom. * Fix warning. * Add test sum. * fix sum. * Fix fallback. * Fix fallback of sum. * add tests. * Update mkldnn.cc --- src/ndarray/ndarray.cc | 111 +--- src/operator/nn/mkldnn/mkldnn_base.cc | 5 +- src/operator/nn/mkldnn/mkldnn_sum.cc| 22 +++- src/operator/tensor/elemwise_binary_op_basic.cc | 12 +- tests/cpp/operator/mkldnn.cc| 165 +--- 5 files changed, 235 insertions(+), 80 deletions(-) diff --git a/src/ndarray/ndarray.cc b/src/ndarray/ndarray.cc index 6a8bc9d..fc01c75 100644 --- a/src/ndarray/ndarray.cc +++ b/src/ndarray/ndarray.cc @@ -200,6 +200,7 @@ NDArray NDArray::MKLDNNDataReshape(const TShape ) const { ret.ptr_->delay_alloc = false; ret.ptr_->static_data = true; ret.byte_offset_ = byte_offset_; +ret.reuse_ = false; return ret; } } @@ -217,6 +218,7 @@ NDArray NDArray::Reshape(const TShape ) const { // Otherwise, reshape only works on the default layout. 
CHECK_EQ(storage_type(), kDefaultStorage); ret.shape_ = shape; + ret.reuse_ = false; return ret; } @@ -249,6 +251,7 @@ NDArray NDArray::Slice(index_t begin, index_t end) const { MSHADOW_TYPE_SWITCH(ret.dtype(), DType, { ret.byte_offset_ += begin * length * sizeof(DType); }); + ret.reuse_ = false; ret.shape_[0] = end - begin; return ret; } @@ -554,6 +557,7 @@ NDArray NDArray::Reorder2Default() const { // reshape as needed ret.shape_ = shape_; ret.byte_offset_ = byte_offset_; + ret.reuse_ = false; return ret; } @@ -583,39 +587,39 @@ void NDArray::MKLDNNDataReorderAsync(const mkldnn::memory::primitive_desc ) const mkldnn::memory *NDArray::GetMKLDNNData() const { CHECK(storage_type() == kDefaultStorage); + bool is_view = IsView(); if (IsMKLDNNData()) { // If this array uses MKLDNN layout, we have to make sure it's not a view. // Otherwise, we'll have to change the layout inside the array. -CHECK(!IsView()); +CHECK(!is_view); MKLDNNStream::Get()->RegisterMem(ptr_->mkl_mem_->GetMem()); // If this array uses MKLDNN format, we should return now. Otherwise, // SetMKLMem may mess up mkl_mem_. return ptr_->mkl_mem_->GetRaw(); - } - ptr_->SetMKLMem(IsView() ? ptr_->storage_shape : shape_, dtype_); - MKLDNNStream::Get()->RegisterMem(ptr_->mkl_mem_->GetMem()); - if (IsView()) { -mkldnn::memory::primitive_desc pd = ptr_->mkl_mem_->GetPrimitiveDesc(); -// Sliced array must use the default layout. -CHECK_EQ(GetDefaultFormat(pd.desc()), pd.desc().data.format); -void *off_addr = static_cast(ptr_->mkl_mem_->GetDataHandle()) -+ byte_offset_; - + } else if (is_view) { +// If this is a view, we can't create a MKLDNN memory for the chunk +// because we don't have the complete data type and shape information for +// the chunk. +void *off_addr = static_cast(ptr_->shandle.dptr) + byte_offset_; // Create the primitive desc for the new mkldnn memory. 
mkldnn::memory::dims dims(shape().ndim()); for (size_t i = 0; i < dims.size(); i++) dims[i] = shape()[i]; mkldnn::memory::format cpp_format = static_cast( GetDefaultFormat(shape().ndim())); -mkldnn::memory::data_type cpp_type = static_cast( -pd.desc().data.data_type); +mkldnn::memory::data_type cpp_type = get_mkldnn_type(dtype_); mkldnn::memory::desc data_md(dims, cpp_type, cpp_format); -mkldnn::memory::primitive_desc new_pd(data_md, pd.get_engine()); +mkldnn::memory::primitive_desc new_pd(data_md, + CpuEngine::Get()->get_engine()); std::shared_ptr ret(new mkldnn::memory(new_pd, off_addr)); MKLDNNStream::Get()->RegisterMem(ret); return ret.get(); } else { +// If this isn't a view, we can create a MKLDNN memory and store it in the +// chunk. +ptr_->SetMKLMem(shape_, dtype_); +MKLDNNStream::Get()->RegisterMem(ptr_->mkl_mem_->GetMem()); return ptr_->mkl_mem_->GetRaw(); } } @@ -630,20 +634,23 @@ void NDArray::CopyFrom(const mkldnn::memory ) { MKLDNNStream *stream = MKLDNNStream::Get(); // If this array uses MKLDNN layout, we have to make sure it's not a view. // Otherwise, we'll have to change the layout inside the array. - if (IsMKLDNNData()) -CHECK(!IsView()); - ptr_->SetMKLMem(IsView() ?
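The rewritten `GetMKLDNNData()` above becomes a three-way branch: MKL-DNN-layout arrays must not be views, views get a fresh default-layout memory at `base + byte_offset`, and plain arrays cache a default-layout memory in the chunk. A toy decision tree with hypothetical attribute names (not the real NDArray class):

```python
from types import SimpleNamespace

def get_mkldnn_data(arr):
    if arr.is_mkldnn_data:
        # An MKL-DNN-layout array must not be a view; otherwise the
        # layout would have to change inside the array.
        assert not arr.is_view
        return arr.mkl_mem
    if arr.is_view:
        # Views get a fresh default-layout memory at base + byte_offset;
        # the chunk's cached memory is left untouched.
        return ("view_mem", arr.byte_offset)
    # Plain default array: create and cache a default-layout memory.
    if arr.mkl_mem is None:
        arr.mkl_mem = "chunk_mem"
    return arr.mkl_mem

view = SimpleNamespace(is_mkldnn_data=False, is_view=True,
                       byte_offset=16, mkl_mem=None)
print(get_mkldnn_data(view))  # ('view_mem', 16)
```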
[incubator-mxnet] 04/12: [MXNET-362] ensure same mkldnn engine is used for consistency (#10616)
This is an automated email from the ASF dual-hosted git repository. anirudh2290 pushed a commit to branch v1.2.0 in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git commit 4dcdb17685d8141a3fc571595679877cd975c4f9 Author: Ashok Emani AuthorDate: Sat Apr 28 00:43:14 2018 -0700 [MXNET-362] ensure same mkldnn engine is used for consistency (#10616) * ensure same mkldnn engine is used for consistency * add unittest for mkldnn engine thread testing * add comments for thread context switching * fix lint issue * use dummy data --- src/operator/nn/mkldnn/mkldnn_base-inl.h | 3 ++- tests/python/mkl/test_mkldnn.py | 36 +++- 2 files changed, 37 insertions(+), 2 deletions(-) diff --git a/src/operator/nn/mkldnn/mkldnn_base-inl.h b/src/operator/nn/mkldnn/mkldnn_base-inl.h index 489351e..16e5605 100644 --- a/src/operator/nn/mkldnn/mkldnn_base-inl.h +++ b/src/operator/nn/mkldnn/mkldnn_base-inl.h @@ -67,7 +67,8 @@ class CpuEngine { public: static CpuEngine *Get() { // I's thread-safe in C++11. -static thread_local CpuEngine myInstance; +// ensure same mkldnn engine is used across threads +static CpuEngine myInstance; return } CpuEngine(CpuEngine const &) = delete; // Copy construct diff --git a/tests/python/mkl/test_mkldnn.py b/tests/python/mkl/test_mkldnn.py index 5a621b0..4501c3b 100644 --- a/tests/python/mkl/test_mkldnn.py +++ b/tests/python/mkl/test_mkldnn.py @@ -91,6 +91,40 @@ def test_mkldnn_model(): except: # pylint: disable=bare-except assert 0, "test_mkldnn_model exception in bind and execution" +def test_mkldnn_engine_threading(): +""" +This test will trigger mkldnn engine on different thread of execution. +The test will first kickoff simple model calculation, and then uses a +gluon data iterator to trigger different thread context, and executes +the model on this new thread. 
+""" + +import mxnet as mx +from mxnet import gluon, nd + +net = gluon.nn.HybridSequential() +with net.name_scope(): +net.add(gluon.nn.Conv2D(channels=32, kernel_size=3, activation=None)) +net.collect_params().initialize(ctx=mx.cpu()) +class Dummy(gluon.data.Dataset): +def __len__(self): +return 2 +def __getitem__(self, key): +return key, np.ones((3, 224, 224)), np.ones((10, )) + +loader = gluon.data.DataLoader(Dummy(), batch_size=2, num_workers=1) + +X = (32, 3, 32, 32) +# trigger mkldnn execution thread +y = net(nd.array(np.ones(X))).asnumpy() + +# Use Gluon dataloader to trigger different thread. +# below line triggers different execution thread +for _ in loader: +y = net(nd.array(np.ones(X))).asnumpy() +# output should have 0.3376348 +assert_almost_equal(y[0, 0, 0, 0], 0.3376348) +break def test_mkldnn_ndarray_slice(): """ @@ -108,7 +142,7 @@ def test_mkldnn_ndarray_slice(): y = net(x) # trigger computation on ndarray slice -assert_almost_equal(y[0].asnumpy()[0,0,0], 0.3376348) +assert_almost_equal(y[0].asnumpy()[0, 0, 0], 0.3376348) if __name__ == '__main__': test_mkldnn_install() -- To stop receiving notification emails like this one, please contact anirudh2...@apache.org.
[incubator-mxnet] 08/12: fix a bug in cudnn softmax activation. (#10918)
This is an automated email from the ASF dual-hosted git repository. anirudh2290 pushed a commit to branch v1.2.0 in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git commit be7f358e4f64bc27b2ebe93866bcf3040953e5fe Author: Da Zheng AuthorDate: Sat May 12 22:48:34 2018 -0700 fix a bug in cudnn softmax activation. (#10918) --- .../nn/cudnn/cudnn_softmax_activation-inl.h | 13 ++--- tests/python/gpu/test_operator_gpu.py | 21 + 2 files changed, 31 insertions(+), 3 deletions(-) diff --git a/src/operator/nn/cudnn/cudnn_softmax_activation-inl.h b/src/operator/nn/cudnn/cudnn_softmax_activation-inl.h index 239da02..0845eb7 100644 --- a/src/operator/nn/cudnn/cudnn_softmax_activation-inl.h +++ b/src/operator/nn/cudnn/cudnn_softmax_activation-inl.h @@ -48,7 +48,7 @@ class CuDNNSoftmaxActivationOp { } void Forward(const OpContext , const TBlob _data, - const OpReqType , const TBlob _data) { + const OpReqType , const TBlob _data) { using namespace mshadow; using namespace mshadow::expr; Stream *s = ctx.get_stream(); @@ -102,14 +102,14 @@ class CuDNNSoftmaxActivationOp { } void Backward(const OpContext , const TBlob _grad, - const TBlob _data, const OpReqType , const TBlob _grad) { +const TBlob _data, const OpReqType , +const TBlob _grad) { using namespace mshadow; using namespace mshadow::expr; float alpha = 1.0f; float beta = 0.0f; Stream *s = ctx.get_stream(); Tensor grad; -Tensor data; Tensor output_data; Tensor input_grad; cudnnSoftmaxMode_t softmax_mode; @@ -141,6 +141,13 @@ class CuDNNSoftmaxActivationOp { softmax_mode = CUDNN_SOFTMAX_MODE_CHANNEL; } CHECK_EQ(s->dnn_handle_ownership_, mshadow::Stream::OwnHandle); +CUDNN_CALL(cudnnSetTensor4dDescriptor(shape_desc_, + CUDNN_TENSOR_NCHW, + dtype_, + input_grad.shape_[0], + input_grad.shape_[1], + input_grad.shape_[2], + input_grad.shape_[3])); CUDNN_CALL(cudnnSoftmaxBackward(s->dnn_handle_, CUDNN_SOFTMAX_ACCURATE, softmax_mode, diff --git a/tests/python/gpu/test_operator_gpu.py 
b/tests/python/gpu/test_operator_gpu.py index 83dfc42..7c18027 100644 --- a/tests/python/gpu/test_operator_gpu.py +++ b/tests/python/gpu/test_operator_gpu.py @@ -1837,6 +1837,27 @@ def test_batchnorm_backwards_notrain(): loss=y.square().sum() loss.backward(train_mode=False) + +@with_seed() +def test_softmax_activation(): +gpu_a = mx.nd.array([[3., 0.5, -0.5, 2., 7.], +[2., -.4, 7., 3., 0.2]], ctx=mx.gpu(0)) +cpu_a = mx.nd.array([[3., 0.5, -0.5, 2., 7.], +[2., -.4, 7., 3., 0.2]], ctx=mx.cpu()) + +cpu_a.attach_grad() +gpu_a.attach_grad() +with mx.autograd.record(): +gpu_y = mx.nd.SoftmaxActivation(data = gpu_a) +cpu_y = mx.nd.SoftmaxActivation(data = cpu_a) +assert_almost_equal(cpu_y.asnumpy(), gpu_y.asnumpy(), atol = 1e-3, rtol = 1e-3) + +gpu_y.backward() +cpu_y.backward() +assert_almost_equal(cpu_a.grad.asnumpy(), gpu_a.grad.asnumpy(), +atol = 1e-3, rtol = 1e-3) + + if __name__ == '__main__': import nose nose.runmodule() -- To stop receiving notification emails like this one, please contact anirudh2...@apache.org.
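The GPU test added above checks that CPU and GPU `SoftmaxActivation` agree on both outputs and gradients. For reference, the math both backends should compute for one row (plain Python; the softmax Jacobian-vector product is the standard quantity a softmax backward pass produces):

```python
import math

def softmax(xs):
    m = max(xs)                      # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def softmax_backward(y, dy):
    # dL/dx_i = y_i * (dy_i - sum_j dy_j * y_j), computed from the
    # forward output y and the incoming gradient dy.
    dot = sum(g * v for g, v in zip(dy, y))
    return [v * (g - dot) for v, g in zip(y, dy)]

y = softmax([3.0, 0.5, -0.5, 2.0, 7.0])   # same row as the unit test
g = softmax_backward(y, [1.0, 0.0, 0.0, 0.0, 0.0])
assert abs(sum(y) - 1.0) < 1e-12
assert abs(sum(g)) < 1e-12                 # softmax gradients sum to zero
```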
[incubator-mxnet] 06/12: [MXNET-365] handle inplace in mkldnn FallBackCompute (#10591)
This is an automated email from the ASF dual-hosted git repository. anirudh2290 pushed a commit to branch v1.2.0 in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git commit 18d239cc6776ab8bdc2aae7cf575a8cd85f7caf3 Author: Ashok Emani AuthorDate: Fri May 11 02:31:28 2018 -0700 [MXNET-365] handle inplace in mkldnn FallBackCompute (#10591) * handle inplace in mkldnn FallBackCompute * add comments * handle kAddTo in mkldnn FallBackCompute * add PR feedback * add unittest for mkldnn inplace sum with cpu data * add back mkldnn engine threading unittest * separate mkldnn install test and fix pylint issue * remove --build from mkldnn jenkins test * update mkldnn unittests * update comments for mkldnn test * remove python test doc string so unittest name is used --- Jenkinsfile | 2 +- src/operator/nn/mkldnn/mkldnn_base.cc | 13 +++- tests/python/mkl/test_mkldnn.py | 132 tests/python/mkl/test_mkldnn_install.py | 56 ++ 4 files changed, 113 insertions(+), 90 deletions(-) diff --git a/Jenkinsfile b/Jenkinsfile index 7167f14..eb2160f 100644 --- a/Jenkinsfile +++ b/Jenkinsfile @@ -109,7 +109,7 @@ def python3_ut(docker_container_name) { def python3_ut_mkldnn(docker_container_name) { timeout(time: max_time, unit: 'MINUTES') { -sh "ci/build.py --build --platform ${docker_container_name} /work/runtime_functions.sh unittest_ubuntu_python3_cpu_mkldnn" +sh "ci/build.py --platform ${docker_container_name} /work/runtime_functions.sh unittest_ubuntu_python3_cpu_mkldnn" } } diff --git a/src/operator/nn/mkldnn/mkldnn_base.cc b/src/operator/nn/mkldnn/mkldnn_base.cc index 684abd2..8792cbc 100644 --- a/src/operator/nn/mkldnn/mkldnn_base.cc +++ b/src/operator/nn/mkldnn/mkldnn_base.cc @@ -293,10 +293,15 @@ void FallBackCompute(FCompute fn, const nnvm::NodeAttrs , std::vector out_blobs(outputs.size()); for (size_t i = 0; i < out_blobs.size(); i++) { -if (req[i] == kWriteTo) - const_cast(outputs[i]).InvalidateMKLDNNData(); -CHECK(outputs[i].IsDefaultData()); -out_blobs[i] = 
outputs[i].data(); +NDArray output = outputs[i]; +// ensure output does not use mkldnn mem. +// for inplace, we already converted & copied input above. +if ((req[i] == kWriteTo) || (req[i] == kWriteInplace)) + const_cast(output).InvalidateMKLDNNData(); +else if (req[i] == kAddTo) + output = outputs[i].Reorder2Default(); +CHECK(output.IsDefaultData()); +out_blobs[i] = output.data(); } fn(attrs, ctx, in_blobs, req, out_blobs); } diff --git a/tests/python/mkl/test_mkldnn.py b/tests/python/mkl/test_mkldnn.py index dc9e914..2caf7af 100644 --- a/tests/python/mkl/test_mkldnn.py +++ b/tests/python/mkl/test_mkldnn.py @@ -18,57 +18,19 @@ """ MKL-DNN related test cases """ - -import mxnet as mx +import sys +import os import numpy as np -import sys,os,logging +import mxnet as mx +from mxnet.test_utils import assert_almost_equal from mxnet import gluon from mxnet.gluon import nn curr_path = os.path.dirname(os.path.abspath(os.path.expanduser(__file__))) sys.path.append(os.path.join(curr_path, '../unittest/')) -from common import setup_module, with_seed -from nose.tools import raises -from mxnet.test_utils import assert_almost_equal - - -def test_mkldnn_install(): -""" -This test will verify that MXNet is built/installed correctly when -compiled with Intel MKL-DNN library. The method will try to import -the mxnet module and see if the mkldnn library is mapped to this -process's address space. -""" -logging.basicConfig(level=logging.INFO) - -if not sys.platform.startswith('linux'): -logging.info("Bypass mkldnn install test for non-Linux OS") -return - -try: -#pylint: disable=unused-variable -import mxnet as mx -except (ImportError, OSError) as e: -assert 0, "Import mxnet error: %s. 
Please double check your build/" \ -"install steps or environment variable settings" % str(e) - -pid = os.getpid() -rc = os.system("cat /proc/" + str(pid) + - "/maps | grep libmkldnn > /dev/null") - -if rc == 0: -logging.info("MXNet is built/installed correctly with MKL-DNN") -else: -assert 0, "MXNet is built/installed incorrectly with MKL-DNN, please " \ -"double check your build/install steps or environment " \ -"variable settings" +from common import with_seed def test_mkldnn_model(): -""" -This test will run a sample model for couple of iterations. -""" - -import mxnet as mx model = os.path.join(os.path.dirname(os.path.realpath(__file__)), "data", "test_mkldnn_test_mkldnn_model_model1.json") shape = (32, 3, 300, 300) @@ -96,17 +58,19 @@ def test_mkldnn_model():
[incubator-mxnet] 11/12: handle fallback correctly for write inplace when the array is MKLDNN. (#10651)
This is an automated email from the ASF dual-hosted git repository. anirudh2290 pushed a commit to branch v1.2.0 in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git commit dfc847f59197a8c4ce1a65806ec3226183cee4f8 Author: Da Zheng AuthorDate: Tue May 15 15:44:23 2018 -0700 handle fallback correctly for write inplace when the array is MKLDNN. (#10651) * handle writeinplace correctly for mkldnn arrays. * Add unit tests. * Fix a bug in mkldnn copy. * Fix a bug in ndarray copy. * Verify results. --- src/common/exec_utils.h | 15 +-- src/executor/attach_op_execs_pass.cc | 7 +- src/imperative/imperative_utils.h | 10 +- src/ndarray/ndarray.cc| 5 +- src/operator/nn/mkldnn/mkldnn_copy.cc | 8 +- tests/cpp/operator/mkldnn.cc | 192 +- 6 files changed, 217 insertions(+), 20 deletions(-) diff --git a/src/common/exec_utils.h b/src/common/exec_utils.h index 29537d3..b07f7d8 100644 --- a/src/common/exec_utils.h +++ b/src/common/exec_utils.h @@ -76,8 +76,8 @@ inline bool SetupDefaultBlobsIn(const std::vector& src, } inline bool SetupDefaultBlobsOut(const std::vector& src, - const std::vector , const std::vector *bufs, + std::vector *req, std::vector *blobs, std::vector *temp_src, std::vector *temp_dst) { @@ -86,9 +86,12 @@ inline bool SetupDefaultBlobsOut(const std::vector& src, auto& nd = src[i]; bool is_default = nd.storage_type() == kDefaultStorage; #if MXNET_USE_MKLDNN == 1 -// If it's writeTo, we don't need to worry whether it contains valid data. -if (req[i] == kWriteTo && is_default) - const_cast(nd).InvalidateMKLDNNData(); +if (req->at(i) == kWriteInplace && nd.IsMKLDNNData()) + // If it's write inplace and the output array doesn't use the default + // layout, we'll generate a temporary output array below, which means + // the input array and the output array are no longer the same array. + // we should change the request type. + req->at(i) = kWriteTo; // We have to make sure it's default storage and default layout. 
is_default = nd.IsDefaultData(); #endif @@ -118,9 +121,9 @@ inline bool SetupDefaultBlobsOut(const std::vector& src, */ inline void SetupDefaultBlobsInOut(const std::vector , const std::vector , - const std::vector , const std::vector *in_bufs, const std::vector *out_bufs, + std::vector *req, std::vector *input_blobs, std::vector *output_blobs, std::vector *pre_temp_src, @@ -133,7 +136,7 @@ inline void SetupDefaultBlobsInOut(const std::vector , SetupDefaultBlobsIn(ndinputs, in_bufs, input_blobs, pre_temp_src, pre_temp_dst, in_temp_idx_map); // populate output blobs - SetupDefaultBlobsOut(ndoutputs, req, out_bufs, output_blobs, post_temp_dst, + SetupDefaultBlobsOut(ndoutputs, out_bufs, req, output_blobs, post_temp_dst, post_temp_src); // add mutable inputs to post temp list for (const auto idx : mutate_idx) { diff --git a/src/executor/attach_op_execs_pass.cc b/src/executor/attach_op_execs_pass.cc index 3c8fb83..1709965 100644 --- a/src/executor/attach_op_execs_pass.cc +++ b/src/executor/attach_op_execs_pass.cc @@ -78,7 +78,8 @@ class StorageFallbackOpExecutor : public OpExecutor { pre_temp_src_.clear(); pre_temp_dst_.clear(); post_temp_src_.clear(); post_temp_dst_.clear(); in_temp_idx_map_.clear(); -SetupDefaultBlobsInOut(in_array, out_array, req, _temp_buf_, _temp_buf_, +tmp_req = req; +SetupDefaultBlobsInOut(in_array, out_array, _temp_buf_, _temp_buf_, , _data_, _data_, _temp_src_, _temp_dst_, _temp_src_, _temp_dst_, @@ -89,8 +90,12 @@ class StorageFallbackOpExecutor : public OpExecutor { // storage fallback after fcompute is completed void PostFCompute(bool is_gpu) { common::CastNonDefaultStorage(post_temp_src_, post_temp_dst_, op_ctx, is_gpu); +req = tmp_req; } + // output requirement on each output array. + // This temporarily saves the original output requirements. + std::vector tmp_req; // default storage tensor blobs for fcompute std::vector in_data_, out_data_; // These are NDArray buffers for cast storage. 
diff --git a/src/imperative/imperative_utils.h b/src/imperative/imperative_utils.h index 86683f9..0956deb 100644 --- a/src/imperative/imperative_utils.h +++
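The key behavioral change in `SetupDefaultBlobsOut` above is that `req` becomes mutable: when a write-inplace output holds an MKL-DNN layout, the fallback path writes into a temporary default-layout buffer, so input and output are no longer the same array and the request must be demoted to a plain write. A minimal sketch (hypothetical names):

```python
from types import SimpleNamespace

def setup_default_blobs_out(outputs, req):
    # Mirrors the kWriteInplace -> kWriteTo rewrite in the hunk above.
    for i, out in enumerate(outputs):
        if req[i] == "write_inplace" and out.is_mkldnn:
            req[i] = "write_to"
    return req

req = ["write_inplace", "write_inplace"]
outs = [SimpleNamespace(is_mkldnn=True), SimpleNamespace(is_mkldnn=False)]
print(setup_default_blobs_out(outs, req))  # ['write_to', 'write_inplace']
```

This is also why the executor saves the original requests in `tmp_req` before `PreFCompute` and restores them in `PostFCompute`.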
[incubator-mxnet] branch v1.2.0 updated (546a233 -> 62a47a7)
This is an automated email from the ASF dual-hosted git repository. anirudh2290 pushed a change to branch v1.2.0 in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git.

 from 546a233 [MXNET-491] Use depthwise convolution by cuDNNv7 if available, updated version (#11076) (#11233)
  new bc9a2d2 Add Windows MKLDNN Building Instruction (#10613)
  new db24cc0 [MXNET-33] SSD example not working with mkl-dnn (#10021)
  new 162cc78 add unittest for gluon mkldnn ndarray slice computation
  new 4dcdb17 [MXNET-362] ensure same mkldnn engine is used for consistency (#10616)
  new 1647b70 Add more gluon computation on MKLDNN with memory operation(slice, reshape etc.) (#10764)
  new 18d239c [MXNET-365] handle inplace in mkldnn FallBackCompute (#10591)
  new 0111a36 Fix a bug in getting MKLDNN memory (#10731)
  new be7f358 fix a bug in cudnn softmax activation. (#10918)
  new 22966a0 handle NDArray slice properly for mkldnn layout
  new 9e97a96 invalidate MKLDNN memory for reused NDArrays. (#10706)
  new dfc847f handle fallback correctly for write inplace when the array is MKLDNN. (#10651)
  new 62a47a7 Fix bugs in MKLDNN. (#10979)

The 12 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference.
Summary of changes:
 CMakeLists.txt                                     |   8 +-
 Jenkinsfile                                        |  21 +-
 MKL_README.md                                      |  96 ++-
 ci/docker/runtime_functions.sh                     |  15 +
 docs/install/windows_setup.md                      |   4 +-
 src/common/exec_utils.h                            |  15 +-
 src/executor/attach_op_execs_pass.cc               |  16 +-
 src/imperative/imperative_utils.h                  |  23 +-
 src/ndarray/ndarray.cc                             | 171 +++--
 .../nn/cudnn/cudnn_softmax_activation-inl.h        |  13 +-
 src/operator/nn/mkldnn/mkldnn_base-inl.h           |  39 +-
 src/operator/nn/mkldnn/mkldnn_base.cc              |  30 +-
 src/operator/nn/mkldnn/mkldnn_copy.cc              |   8 +-
 src/operator/nn/mkldnn/mkldnn_pooling-inl.h        |   6 +
 src/operator/nn/mkldnn/mkldnn_sum.cc               |  22 +-
 src/operator/tensor/elemwise_binary_op_basic.cc    |  12 +-
 tests/cpp/include/test_core_op.h                   |  10 +-
 tests/cpp/operator/mkldnn.cc                       | 565 +++
 tests/python/gpu/test_gluon_model_zoo_gpu.py       |  19 +-
 tests/python/gpu/test_operator_gpu.py              |  21 +
 .../data/test_mkldnn_test_mkldnn_model_model1.json | 770 +
 tests/python/mkl/test_mkldnn.py                    | 217 ++
 tests/python/mkl/test_mkldnn_install.py            |  56 ++
 23 files changed, 2005 insertions(+), 152 deletions(-)
 create mode 100644 tests/python/mkl/data/test_mkldnn_test_mkldnn_model_model1.json
 create mode 100644 tests/python/mkl/test_mkldnn.py
 create mode 100644 tests/python/mkl/test_mkldnn_install.py
--
To stop receiving notification emails like this one, please contact anirudh2...@apache.org.
[incubator-mxnet] 10/12: invalidate MKLDNN memory for reused NDArrays. (#10706)
This is an automated email from the ASF dual-hosted git repository. anirudh2290 pushed a commit to branch v1.2.0 in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git commit 9e97a96fab3bc6c6174c9541a69bf1accd95f731 Author: Da Zheng AuthorDate: Fri Apr 27 10:35:12 2018 -0700 invalidate MKLDNN memory for reused NDArrays. (#10706) * Revert "Revert "invalidate outputs for imperative."" This reverts commit b428937968adf177e0260361c972e502e839edb5. * invalidate mkldnn memory. * enable test. --- src/executor/attach_op_execs_pass.cc | 9 + src/imperative/imperative_utils.h| 13 + 2 files changed, 22 insertions(+) diff --git a/src/executor/attach_op_execs_pass.cc b/src/executor/attach_op_execs_pass.cc index e4d4955..3c8fb83 100644 --- a/src/executor/attach_op_execs_pass.cc +++ b/src/executor/attach_op_execs_pass.cc @@ -113,6 +113,9 @@ class StatefulComputeExecutor : public StorageFallbackOpExecutor { public: void Run(RunContext rctx, bool is_gpu) override { op_ctx.run_ctx = rctx; +#if MXNET_USE_MKLDNN == 1 +InvalidateOutputs(out_array, req); +#endif PreFCompute(is_gpu); fcompute_(state_, op_ctx, in_data_, req, out_data_); PostFCompute(is_gpu); @@ -146,6 +149,9 @@ class StatefulComputeExExecutor : public OpExecutor { public: void Run(RunContext rctx, bool is_gpu) override { op_ctx.run_ctx = rctx; +#if MXNET_USE_MKLDNN == 1 +InvalidateOutputs(out_array, req); +#endif fcompute_(state_, op_ctx, in_array, req, out_array); } @@ -178,6 +184,9 @@ class FComputeExecutor : public StorageFallbackOpExecutor { void Run(RunContext rctx, bool is_gpu) override { using namespace common; op_ctx.run_ctx = rctx; +#if MXNET_USE_MKLDNN == 1 +InvalidateOutputs(out_array, req); +#endif PreFCompute(is_gpu); fcompute_(attrs_, op_ctx, in_data_, req, out_data_); PostFCompute(is_gpu); diff --git a/src/imperative/imperative_utils.h b/src/imperative/imperative_utils.h index 0d6525d..86683f9 100644 --- a/src/imperative/imperative_utils.h +++ b/src/imperative/imperative_utils.h @@ -29,6 +29,7 
@@ #include "../c_api/c_api_common.h" #include "../common/utils.h" #include "../common/exec_utils.h" +#include "../operator/nn/mkldnn/mkldnn_base-inl.h" #ifndef MXNET_IMPERATIVE_IMPERATIVE_UTILS_H_ #define MXNET_IMPERATIVE_IMPERATIVE_UTILS_H_ @@ -365,6 +366,9 @@ inline void PushFCompute(const FCompute& fn, std::vector pre_temp_src, pre_temp_dst, post_temp_dst, post_temp_src; // mapping from index in input_blobs to index in pre_temp_dst std::unordered_map in_temp_idx_map; +#if MXNET_USE_MKLDNN == 1 + InvalidateOutputs(outputs, req); +#endif // setup blobs SetupDefaultBlobsInOut(inputs, outputs, req, nullptr, nullptr, _blobs, _blobs, _temp_src, _temp_dst, @@ -402,6 +406,9 @@ inline void PushFComputeEx(const FComputeEx& fn, DerefInputOutput(p_inputs, p_outputs, , ); const auto& run = [=](RunContext rctx) { OpContext opctx{is_train, rctx, engine::CallbackOnComplete(), requested}; +#if MXNET_USE_MKLDNN == 1 + InvalidateOutputs(outputs, req); +#endif fn(attrs, opctx, inputs, req, outputs); if (ctx.dev_mask() == gpu::kDevMask && exec_type == ExecType::kSync) { rctx.get_stream()->Wait(); @@ -445,6 +452,9 @@ inline void PushOperator(const OpStatePtr& state, const auto& run = [=](RunContext rctx, engine::CallbackOnComplete on_complete) { OpContext opctx{is_train, rctx, on_complete, requested}; +#if MXNET_USE_MKLDNN == 1 + InvalidateOutputs(outputs, req); +#endif fcompute_ex(state, opctx, inputs, req, outputs); if (ctx.dev_mask() == gpu::kDevMask && exec_type == ExecType::kSync) { rctx.get_stream()->Wait(); @@ -475,6 +485,9 @@ inline void PushOperator(const OpStatePtr& state, std::vector pre_temp_src, pre_temp_dst, post_temp_dst, post_temp_src; // mapping from index in input_blobs to index in pre_temp_dst std::unordered_map in_temp_idx_map; +#if MXNET_USE_MKLDNN == 1 +InvalidateOutputs(outputs, req); +#endif // populate input blobs and output blobs SetupDefaultBlobsInOut(inputs, outputs, req, nullptr, nullptr, _blobs, _blobs, _temp_src, _temp_dst, -- To stop receiving 
notification emails like this one, please contact anirudh2...@apache.org.
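The `InvalidateOutputs` calls inserted above all serve one purpose: in imperative mode an output NDArray may be reused from a previous operator and can still cache a stale MKL-DNN copy of its data, which must be dropped before the new result is written. A toy model of the hazard and the fix (names hypothetical, not the MXNet API):

```python
class CachedArray:
    # Toy NDArray whose chunk may cache an MKL-DNN copy of its data.
    def __init__(self):
        self.data = [0.0]
        self.mkl_cache = None

    def write(self, value):
        self.data = [value]        # default-layout data is updated...
        # ...but without invalidation, mkl_cache would keep the old copy.

def invalidate_outputs(outputs, req):
    # Sketch of InvalidateOutputs: drop cached MKL-DNN memory for any
    # output that is about to be overwritten.
    for out, r in zip(outputs, req):
        if r in ("write_to", "write_inplace"):
            out.mkl_cache = None

out = CachedArray()
out.mkl_cache = [41.0]             # stale copy left by a previous op
invalidate_outputs([out], ["write_to"])
out.write(42.0)
assert out.mkl_cache is None and out.data == [42.0]
```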
[incubator-mxnet] 07/12: Fix a bug in getting MKLDNN memory (#10731)
This is an automated email from the ASF dual-hosted git repository. anirudh2290 pushed a commit to branch v1.2.0 in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git commit 0111a36793df919ae490299a0edaaddedc9f1287 Author: Da Zheng AuthorDate: Thu May 3 10:27:01 2018 -0700 Fix a bug in getting MKLDNN memory (#10731) * test inference multiple times. * Fix a bug in GetMKLDNNData(). * Update comments. * Handle all cases for GetMKLDNNDataReorder * avoid unnecessary message. * Add C++ unit test for NDArray. * Fix a minor bug. * Unit tests on GetMKLDNNDataReorder. * Fix lint error. * Add more test cases. * add comments for the test code. * Reorganize test code. * Fix cpp tests. * test. * Add a new Jenkins compile task. * Update jenkins. * update jenkins. * Fix a Jenkins. * Fix jenkins. * Fix jenkins. * Fix CMake for MKLDNN. * Fix jenkins. * update jenkins. * update CMake. * Fix cmake. * update CI. * add comment. * add comments. * cmake builds mkldnn with -mtune=generic by default. * adjust comments. remove unnecessary tests. --- CMakeLists.txt | 8 +- Jenkinsfile | 13 +- ci/docker/runtime_functions.sh | 3 + src/ndarray/ndarray.cc | 48 -- src/operator/nn/mkldnn/mkldnn_base-inl.h | 36 +++- src/operator/nn/mkldnn/mkldnn_base.cc| 12 +- tests/cpp/include/test_core_op.h | 10 +- tests/cpp/operator/mkldnn.cc | 248 +++ tests/python/gpu/test_gluon_model_zoo_gpu.py | 19 +- 9 files changed, 355 insertions(+), 42 deletions(-) diff --git a/CMakeLists.txt b/CMakeLists.txt index 05d8021..ed96a6c 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -187,8 +187,12 @@ endif() if(USE_MKL_IF_AVAILABLE) if(USE_MKLDNN) +# We need to use generic archtecture. Otherwise, MKLDNN compiled in one +# CPU architecture (e.g., C5) can't run on another architecture (e.g., g3). 
+set(ARCH_OPT_FLAGS "-mtune=generic") add_subdirectory(3rdparty/mkldnn) include_directories(3rdparty/mkldnn/include) +add_definitions(-DMXNET_USE_MKLDNN=1) list(APPEND mxnet_LINKER_LIBS mkldnn) endif() find_package(MKL) @@ -197,10 +201,6 @@ if(USE_MKL_IF_AVAILABLE) include_directories(${MKL_INCLUDE_DIR}) include_directories(${CMAKE_CURRENT_SOURCE_DIR}/src/operator/mkl) -if(USE_MKLDNN) - add_definitions(-DMXNET_USE_MKLDNN=1) -endif() - add_definitions(-DUSE_MKL=1) add_definitions(-DCUB_MKL=1) list(APPEND mxnet_LINKER_LIBS ${MKL_LIBRARIES}) diff --git a/Jenkinsfile b/Jenkinsfile index eb2160f..84116e4 100644 --- a/Jenkinsfile +++ b/Jenkinsfile @@ -26,7 +26,7 @@ mx_lib = 'lib/libmxnet.so, lib/libmxnet.a, 3rdparty/dmlc-core/libdmlc.a, 3rdpart mx_dist_lib = 'lib/libmxnet.so, lib/libmxnet.a, 3rdparty/dmlc-core/libdmlc.a, 3rdparty/nnvm/lib/libnnvm.a, 3rdparty/ps-lite/build/libps.a, deps/lib/libprotobuf-lite.a, deps/lib/libzmq.a' // mxnet cmake libraries, in cmake builds we do not produce a libnvvm static library by default. 
mx_cmake_lib = 'build/libmxnet.so, build/libmxnet.a, build/3rdparty/dmlc-core/libdmlc.a, build/tests/mxnet_unit_tests, build/3rdparty/openmp/runtime/src/libomp.so' -mx_cmake_mkldnn_lib = 'build/libmxnet.so, build/libmxnet.a, build/3rdparty/dmlc-core/libdmlc.a, build/tests/mxnet_unit_tests, build/3rdparty/openmp/runtime/src/libomp.so, build/3rdparty/mkldnn/src/libmkldnn.so, build/3rdparty/mkldnn/src/libmkldnn.so.0' +mx_cmake_mkldnn_lib = 'build/libmxnet.so, build/libmxnet.a, build/3rdparty/dmlc-core/libdmlc.a, build/tests/mxnet_unit_tests, build/3rdparty/openmp/runtime/src/libomp.so, build/3rdparty/mkldnn/src/libmkldnn.so.0' mx_mkldnn_lib = 'lib/libmxnet.so, lib/libmxnet.a, lib/libiomp5.so, lib/libmkldnn.so.0, lib/libmklml_intel.so, 3rdparty/dmlc-core/libdmlc.a, 3rdparty/nnvm/lib/libnnvm.a' // command to start a docker container docker_run = 'tests/ci_build/ci_build.sh' @@ -534,6 +534,17 @@ try { } } }, +'Cpp: MKLDNN+GPU': { + node('mxnetlinux-gpu') { +ws('workspace/ut-cpp-mkldnn-gpu') { + timeout(time: max_time, unit: 'MINUTES') { +init_git() +unpack_lib('cmake_mkldnn_gpu', mx_cmake_mkldnn_lib) +sh "ci/build.py --nvidiadocker --platform ubuntu_gpu /work/runtime_functions.sh unittest_ubuntu_gpu_cpp" + } +} + } +}, 'R: CPU': { node('mxnetlinux-cpu') { ws('workspace/ut-r-cpu') { diff --git a/ci/docker/runtime_functions.sh b/ci/docker/runtime_functions.sh index 4d0f846..8ba6fa3 100755 --- a/ci/docker/runtime_functions.sh +++
[incubator-mxnet] 03/12: add unittest for gluon mkldnn ndarray slice computation
This is an automated email from the ASF dual-hosted git repository. anirudh2290 pushed a commit to branch v1.2.0 in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git commit 162cc788849773798cafb8614541bfe36f293d62 Author: Ashok Emani AuthorDate: Wed Apr 25 13:47:37 2018 -0700 add unittest for gluon mkldnn ndarray slice computation --- tests/python/mkl/test_mkldnn.py | 20 1 file changed, 20 insertions(+) diff --git a/tests/python/mkl/test_mkldnn.py b/tests/python/mkl/test_mkldnn.py index a4c9c45..5a621b0 100644 --- a/tests/python/mkl/test_mkldnn.py +++ b/tests/python/mkl/test_mkldnn.py @@ -22,6 +22,8 @@ MKL-DNN related test cases import logging import os from sys import platform +import numpy as np +from mxnet.test_utils import assert_almost_equal def test_mkldnn_install(): @@ -90,5 +92,23 @@ def test_mkldnn_model(): assert 0, "test_mkldnn_model exception in bind and execution" +def test_mkldnn_ndarray_slice(): +""" +This test will trigger gluon computation on mkldnn with ndarray slice +""" + +import mxnet as mx +from mxnet import gluon +ctx = mx.cpu() +net = gluon.nn.HybridSequential() +with net.name_scope(): +net.add(gluon.nn.Conv2D(channels=32, kernel_size=3, activation=None)) +net.collect_params().initialize(ctx=ctx) +x = mx.nd.array(np.ones([32, 3, 224, 224]), ctx) +y = net(x) + +# trigger computation on ndarray slice +assert_almost_equal(y[0].asnumpy()[0,0,0], 0.3376348) + if __name__ == '__main__': test_mkldnn_install() -- To stop receiving notification emails like this one, please contact anirudh2...@apache.org.
[incubator-mxnet] 05/12: Add more gluon computation on MKLDNN with memory operation(slice, reshape etc.) (#10764)
This is an automated email from the ASF dual-hosted git repository. anirudh2290 pushed a commit to branch v1.2.0 in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git commit 1647b7013611260493cd145aee262e5833de63ac Author: Shufan <33112206+juliusshu...@users.noreply.github.com> AuthorDate: Wed May 2 01:49:41 2018 +0800 Add more gluon computation on MKLDNN with memory operation(slice, reshape etc.) (#10764) --- tests/python/mkl/test_mkldnn.py | 115 ++-- 1 file changed, 111 insertions(+), 4 deletions(-) diff --git a/tests/python/mkl/test_mkldnn.py b/tests/python/mkl/test_mkldnn.py index 4501c3b..dc9e914 100644 --- a/tests/python/mkl/test_mkldnn.py +++ b/tests/python/mkl/test_mkldnn.py @@ -19,10 +19,15 @@ MKL-DNN related test cases """ -import logging -import os -from sys import platform +import mxnet as mx import numpy as np +import sys,os,logging +from mxnet import gluon +from mxnet.gluon import nn +curr_path = os.path.dirname(os.path.abspath(os.path.expanduser(__file__))) +sys.path.append(os.path.join(curr_path, '../unittest/')) +from common import setup_module, with_seed +from nose.tools import raises from mxnet.test_utils import assert_almost_equal @@ -35,7 +40,7 @@ def test_mkldnn_install(): """ logging.basicConfig(level=logging.INFO) -if not platform.startswith('linux'): +if not sys.platform.startswith('linux'): logging.info("Bypass mkldnn install test for non-Linux OS") return @@ -144,5 +149,107 @@ def test_mkldnn_ndarray_slice(): # trigger computation on ndarray slice assert_almost_equal(y[0].asnumpy()[0, 0, 0], 0.3376348) +@with_seed() +def test_reshape_before_conv(): +""" +This test will test gluon Conv2d computation on mkldnn with ndarray reshape +""" +class Net(gluon.HybridBlock): +def __init__(self, **kwargs): +super(Net, self).__init__(**kwargs) +with self.name_scope(): +self.conv0 = nn.Conv2D(10, (3, 3)) +self.conv1 = nn.Conv2D(5, (3, 3)) + +def hybrid_forward(self, F, x): +x_reshape = x.reshape((0, 0, 20, 5)) +y = self.conv0(x_reshape) 
+y_reshape = y.reshape((0, 0, 9, 6)) +out = self.conv1(y_reshape) +return out +x = mx.nd.random.uniform(shape=(2, 4, 10, 10)) +x.attach_grad() +net = Net() +net.collect_params().initialize() +with mx.autograd.record(): +out1 = net(x) +out1.backward() +dx1 = x.grad +net.hybridize() +with mx.autograd.record(): +out2 = net(x) +out2.backward() +mx.test_utils.assert_almost_equal(dx1.asnumpy(), x.grad.asnumpy(), rtol=1e-5, atol=1e-6) +mx.test_utils.assert_almost_equal(out1.asnumpy(), out2.asnumpy(), rtol=1e-5, atol=1e-6) + + +@with_seed() +def test_slice_before_conv(): +""" +This test will test gluon Conv2d computation on mkldnn with ndarray slice +""" +class Net(gluon.HybridBlock): +def __init__(self, **kwargs): +super(Net, self).__init__(**kwargs) +with self.name_scope(): +self.conv0 = nn.Conv2D(4, (3, 3)) +self.conv1 = nn.Conv2D(4, (3, 3)) + +def hybrid_forward(self, F, x): +x_slice = x.slice(begin=(0, 0, 0, 0), end=(2, 4, 10, 10)) +y = self.conv0(x_slice) +y_slice = y.slice(begin=(1, 0, 2, 2), end=(2, 1, 7, 7)) +out = self.conv1(y_slice) +return out +x = mx.nd.random.uniform(shape=(2, 10, 10, 10)) +x.attach_grad() +net = Net() +net.collect_params().initialize() +with mx.autograd.record(): +out1 = net(x) +out1.backward() +dx1 = x.grad +net.hybridize() +with mx.autograd.record(): +out2 = net(x) +out2.backward() +mx.test_utils.assert_almost_equal(dx1.asnumpy(), x.grad.asnumpy(), rtol=1e-5, atol=1e-6) +mx.test_utils.assert_almost_equal(out1.asnumpy(), out2.asnumpy(), rtol=1e-5, atol=1e-6) + + +@with_seed() +def test_slice_reshape_before_conv(): +""" +This test will test gluon Conv2d computation on mkldnn with ndarray reshape and slice +""" +class Net(gluon.HybridBlock): +def __init__(self, **kwargs): +super(Net, self).__init__(**kwargs) +with self.name_scope(): +self.conv0 = nn.Conv2D(4, (3, 3)) +self.conv1 = nn.Conv2D(4, (3, 3)) + +def hybrid_forward(self, F, x): +x_slice = x.slice(begin=(0, 0, 0, 0), end=(2, 4, 8, 9)) +y = self.conv0(x_slice) +y_reshape = y.reshape((0, 
0, 14, 3)) +out = self.conv1(y_reshape) +return out +x = mx.nd.random.uniform(shape=(2, 10, 10, 10)) +x.attach_grad() +net = Net() +net.collect_params().initialize() +with mx.autograd.record(): +out1 = net(x) +out1.backward() +dx1 = x.grad +net.hybridize() +
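These tests all follow the same shape: run the network imperatively, call `hybridize()`, run again, and assert that the outputs (and gradients) of the two code paths agree. A generic sketch of that pattern, with a toy model standing in for the Gluon network (the model and its "hybridized" path below are invented for illustration):

```python
def assert_almost_equal(a, b, tol=1e-6):
    # Minimal stand-in for mx.test_utils.assert_almost_equal on flat lists.
    assert len(a) == len(b)
    for x, y in zip(a, b):
        assert abs(x - y) <= tol, (x, y)


class ToyNet:
    """Stand-in for a HybridBlock: the same math via two code paths."""
    def __init__(self):
        self._hybridized = False

    def hybridize(self):
        self._hybridized = True

    def __call__(self, xs):
        if self._hybridized:
            # "compiled" path: algebraically identical, written differently
            return [2.0 * x + 1.0 for x in xs]
        out = []
        for x in xs:
            out.append(x + x + 1.0)
        return out


x = [0.5, -1.5, 3.0]
net = ToyNet()
out1 = net(x)      # imperative pass
net.hybridize()
out2 = net(x)      # hybridized pass
assert_almost_equal(out1, out2)
```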
[GitHub] anirudh2290 closed pull request #11212: cherry-pick bug fixes in MKLDNN for v1.2.0
anirudh2290 closed pull request #11212: cherry-pick bug fixes in MKLDNN for v1.2.0 URL: https://github.com/apache/incubator-mxnet/pull/11212 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/CMakeLists.txt b/CMakeLists.txt index 05d8021c367..ed96a6c8371 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -187,8 +187,12 @@ endif() if(USE_MKL_IF_AVAILABLE) if(USE_MKLDNN) +# We need to use generic archtecture. Otherwise, MKLDNN compiled in one +# CPU architecture (e.g., C5) can't run on another architecture (e.g., g3). +set(ARCH_OPT_FLAGS "-mtune=generic") add_subdirectory(3rdparty/mkldnn) include_directories(3rdparty/mkldnn/include) +add_definitions(-DMXNET_USE_MKLDNN=1) list(APPEND mxnet_LINKER_LIBS mkldnn) endif() find_package(MKL) @@ -197,10 +201,6 @@ if(USE_MKL_IF_AVAILABLE) include_directories(${MKL_INCLUDE_DIR}) include_directories(${CMAKE_CURRENT_SOURCE_DIR}/src/operator/mkl) -if(USE_MKLDNN) - add_definitions(-DMXNET_USE_MKLDNN=1) -endif() - add_definitions(-DUSE_MKL=1) add_definitions(-DCUB_MKL=1) list(APPEND mxnet_LINKER_LIBS ${MKL_LIBRARIES}) diff --git a/Jenkinsfile b/Jenkinsfile index 8686012164d..84116e4d85b 100644 --- a/Jenkinsfile +++ b/Jenkinsfile @@ -26,7 +26,7 @@ mx_lib = 'lib/libmxnet.so, lib/libmxnet.a, 3rdparty/dmlc-core/libdmlc.a, 3rdpart mx_dist_lib = 'lib/libmxnet.so, lib/libmxnet.a, 3rdparty/dmlc-core/libdmlc.a, 3rdparty/nnvm/lib/libnnvm.a, 3rdparty/ps-lite/build/libps.a, deps/lib/libprotobuf-lite.a, deps/lib/libzmq.a' // mxnet cmake libraries, in cmake builds we do not produce a libnvvm static library by default. 
mx_cmake_lib = 'build/libmxnet.so, build/libmxnet.a, build/3rdparty/dmlc-core/libdmlc.a, build/tests/mxnet_unit_tests, build/3rdparty/openmp/runtime/src/libomp.so' -mx_cmake_mkldnn_lib = 'build/libmxnet.so, build/libmxnet.a, build/3rdparty/dmlc-core/libdmlc.a, build/tests/mxnet_unit_tests, build/3rdparty/openmp/runtime/src/libomp.so, build/3rdparty/mkldnn/src/libmkldnn.so, build/3rdparty/mkldnn/src/libmkldnn.so.0' +mx_cmake_mkldnn_lib = 'build/libmxnet.so, build/libmxnet.a, build/3rdparty/dmlc-core/libdmlc.a, build/tests/mxnet_unit_tests, build/3rdparty/openmp/runtime/src/libomp.so, build/3rdparty/mkldnn/src/libmkldnn.so.0' mx_mkldnn_lib = 'lib/libmxnet.so, lib/libmxnet.a, lib/libiomp5.so, lib/libmkldnn.so.0, lib/libmklml_intel.so, 3rdparty/dmlc-core/libdmlc.a, 3rdparty/nnvm/lib/libnnvm.a' // command to start a docker container docker_run = 'tests/ci_build/ci_build.sh' @@ -107,6 +107,12 @@ def python3_ut(docker_container_name) { } } +def python3_ut_mkldnn(docker_container_name) { + timeout(time: max_time, unit: 'MINUTES') { +sh "ci/build.py --platform ${docker_container_name} /work/runtime_functions.sh unittest_ubuntu_python3_cpu_mkldnn" + } +} + // GPU test has two parts. 
1) run unittest on GPU, 2) compare the results on // both CPU and GPU // Python 2 @@ -438,7 +444,7 @@ try { ws('workspace/ut-python3-mkldnn-cpu') { init_git() unpack_lib('mkldnn_cpu', mx_mkldnn_lib) - python3_ut('ubuntu_cpu') + python3_ut_mkldnn('ubuntu_cpu') } } }, @@ -528,6 +534,17 @@ try { } } }, +'Cpp: MKLDNN+GPU': { + node('mxnetlinux-gpu') { +ws('workspace/ut-cpp-mkldnn-gpu') { + timeout(time: max_time, unit: 'MINUTES') { +init_git() +unpack_lib('cmake_mkldnn_gpu', mx_cmake_mkldnn_lib) +sh "ci/build.py --nvidiadocker --platform ubuntu_gpu /work/runtime_functions.sh unittest_ubuntu_gpu_cpp" + } +} + } +}, 'R: CPU': { node('mxnetlinux-cpu') { ws('workspace/ut-r-cpu') { diff --git a/MKL_README.md b/MKL_README.md index 5374adb8e42..a5c63b097c5 100644 --- a/MKL_README.md +++ b/MKL_README.md @@ -1,19 +1,77 @@ -# Full MKL Installation - -## Build/Install MXNet with a full MKL installation: -Installing and enabling the full MKL installation enables MKL support for all operators under the linalg namespace. - - 1. Download and install the latest full MKL version following instructions on the [intel website.](https://software.intel.com/en-us/articles/intel-mkl-111-install-guide) - - 2. Set USE_BLAS=mkl in make/config.mk - -1.1 Set ADD_LDFLAGS=-L (ex. ADD_LDFLAGS=-L/opt/intel/compilers_and_libraries_2018.0.128/linux/mkl/lib) - -1.1 Set ADD_CFLAGS=-I (ex. ADD_CFLAGS=-L/opt/intel/compilers_and_libraries_2018.0.128/linux/mkl/include) - - 3. Run 'make -j ${nproc}' - - 4. Navigate into the python directory - - 5. Run 'sudo python setup.py install' - +## Build/Install MXNet with a full MKL installation: + +To make it convenient for
[GitHub] Roshrini commented on a change in pull request #11213: [MXNET-533] MXNet-ONNX export
Roshrini commented on a change in pull request #11213: [MXNET-533] MXNet-ONNX export URL: https://github.com/apache/incubator-mxnet/pull/11213#discussion_r195242388 ## File path: python/mxnet/contrib/onnx/_export/export_onnx.py ## @@ -0,0 +1,270 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. +# +# Based on +# https://github.com/NVIDIA/mxnet_to_onnx/blob/master/mx2onnx_converter/mx2onnx_converter.py# +# Copyright (c) 2017, NVIDIA CORPORATION. All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# * Redistributions of source code must retain the above copyright +#notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +#notice, this list of conditions and the following disclaimer in the +#documentation and/or other materials provided with the distribution. +# * Neither the name of NVIDIA CORPORATION nor the names of its +#contributors may be used to endorse or promote products derived +#from this software without specific prior written permission. 
+# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY +# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR +# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, +# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, +# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY +# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +# coding: utf-8 +# pylint: disable=invalid-name,too-many-locals,no-self-use,too-many-arguments, +# pylint: disable=maybe-no-member,too-many-nested-blocks +"""MXNet to ONNX graph converter functions""" +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function +from __future__ import unicode_literals + +import json +import numpy as np + +from import context +from import ndarray as nd +from import io +from import module as mod + + +class MXNetGraph(object): +"""Class to convert MXNet to ONNX graph""" +registry_ = {} +input_output_maps_ = {} + +def __init__(self): +# topologically sorted nodes +self.nodes = [] +self.input_tensors = [] +self.output_tensors = [] + +@staticmethod +def register(op_name): +"""Register operator""" +def wrapper(func): +"""Helper function to map functions""" +MXNetGraph.registry_[op_name] = func +return func + +return wrapper + +@staticmethod +def convert_layer(node, **kwargs): +"""Convert MXNet layer to ONNX""" +op = str(node["op"]) +if op not in MXNetGraph.registry_: +raise AttributeError("No conversion function registered for op type %s yet." 
% op) +convert_fun = MXNetGraph.registry_[op] +return convert_fun(node, **kwargs) + +@staticmethod +def forward_pass(inputs, sym, arg_params, aux_params): +""" Do a forward pass based on the sym and params""" +data_names = [graph_input for graph_input in sym.list_inputs() + if graph_input not in arg_params and graph_input not in aux_params + and graph_input != 'softmax_label'] + +data_shapes = [] +# Adding extra dimension of batch_size 1 if the batch_size is different for multiple inputs. +for idx, input_name in enumerate(data_names): +data_shapes.append((input_name, inputs[idx].shape)) + +# create module, passing cpu context +ctx = context.cpu() +test_mod = mod.Module(symbol=sym, data_names=data_names, context=ctx, label_names=None) +
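The `register`/`convert_layer` pair in the quoted file implements a simple per-op converter registry: each conversion function registers itself under an op name via a decorator, and dispatch looks up the node's `"op"` field. A standalone sketch of that pattern (the `"Convolution"` converter body below is made up for the example, not the real MXNet-to-ONNX converter):

```python
# Minimal converter registry in the style of MXNetGraph.register /
# MXNetGraph.convert_layer.
registry = {}


def register(op_name):
    """Decorator that maps an op name to its conversion function."""
    def wrapper(func):
        registry[op_name] = func
        return func
    return wrapper


def convert_layer(node, **kwargs):
    """Dispatch a graph node to its registered converter."""
    op = str(node["op"])
    if op not in registry:
        raise AttributeError(
            "No conversion function registered for op type %s yet." % op)
    return registry[op](node, **kwargs)


@register("Convolution")
def convert_convolution(node, **kwargs):
    # Made-up converter body: return a tiny ONNX-like node description.
    return {"op_type": "Conv", "name": node.get("name", "")}
```

Unknown ops fail fast with the same `AttributeError` the quoted code raises, which keeps unsupported-operator errors explicit during export.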
[GitHub] Roshrini commented on a change in pull request #11213: [MXNET-533] MXNet-ONNX export
Roshrini commented on a change in pull request #11213: [MXNET-533] MXNet-ONNX export URL: https://github.com/apache/incubator-mxnet/pull/11213#discussion_r195242313 ## File path: python/mxnet/contrib/onnx/_export/export_onnx.py ## @@ -0,0 +1,270 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. +# +# Based on +# https://github.com/NVIDIA/mxnet_to_onnx/blob/master/mx2onnx_converter/mx2onnx_converter.py# +# Copyright (c) 2017, NVIDIA CORPORATION. All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# * Redistributions of source code must retain the above copyright +#notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +#notice, this list of conditions and the following disclaimer in the +#documentation and/or other materials provided with the distribution. +# * Neither the name of NVIDIA CORPORATION nor the names of its +#contributors may be used to endorse or promote products derived +#from this software without specific prior written permission. 
+# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY +# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR +# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, +# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, +# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY +# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +# coding: utf-8 +# pylint: disable=invalid-name,too-many-locals,no-self-use,too-many-arguments, +# pylint: disable=maybe-no-member,too-many-nested-blocks +"""MXNet to ONNX graph converter functions""" +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function +from __future__ import unicode_literals + +import json +import numpy as np + +from import context +from import ndarray as nd +from import io +from import module as mod + + +class MXNetGraph(object): +"""Class to convert MXNet to ONNX graph""" +registry_ = {} +input_output_maps_ = {} + +def __init__(self): +# topologically sorted nodes +self.nodes = [] +self.input_tensors = [] +self.output_tensors = [] + +@staticmethod +def register(op_name): +"""Register operator""" +def wrapper(func): +"""Helper function to map functions""" +MXNetGraph.registry_[op_name] = func +return func + +return wrapper + +@staticmethod +def convert_layer(node, **kwargs): +"""Convert MXNet layer to ONNX""" +op = str(node["op"]) +if op not in MXNetGraph.registry_: +raise AttributeError("No conversion function registered for op type %s yet." 
% op) +convert_fun = MXNetGraph.registry_[op] +return convert_fun(node, **kwargs) + +@staticmethod +def forward_pass(inputs, sym, arg_params, aux_params): +""" Do a forward pass based on the sym and params""" +data_names = [graph_input for graph_input in sym.list_inputs() + if graph_input not in arg_params and graph_input not in aux_params + and graph_input != 'softmax_label'] Review comment: Fixed it This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] aaronmarkham commented on issue #11238: UX for ONNX Documentation is broken
aaronmarkham commented on issue #11238: UX for ONNX Documentation is broken URL: https://github.com/apache/incubator-mxnet/issues/11238#issuecomment-397090798 Upon further investigation, I think this commit triggered what we're seeing today: https://github.com/apache/incubator-mxnet/pull/9534/files @szha @astonzhang - Can you guys clarify what's going on with the name changes and what you think should be done to fix these errors? Why don't we use index.html anymore for each folder? Why no /api/index.html?
[GitHub] lanking520 commented on issue #11249: Flaky Scala test: IllegalArgumentException: requirement failed: Failed to start ps scheduler
lanking520 commented on issue #11249: Flaky Scala test: IllegalArgumentException: requirement failed: Failed to start ps scheduler URL: https://github.com/apache/incubator-mxnet/issues/11249#issuecomment-397089350 @yzhliu
[GitHub] aaronmarkham commented on issue #11238: UX for ONNX Documentation is broken
aaronmarkham commented on issue #11238: UX for ONNX Documentation is broken URL: https://github.com/apache/incubator-mxnet/issues/11238#issuecomment-397089097 So this is happening elsewhere too. It seems like a lot of the docs moved one folder deeper, and how they are referenced changed as well. Certain things, like jquery, look for an index.html page in each subfolder. Rather than have a contrib/index.html, we seem to have contrib/contrib.html. Also, all of the script references like the following, which are used to generate the API Reference section, are off by one folder: ``` ``` This should be `../../../` or just reference the full link: https://mxnet.incubator.apache.org/_static/js/auto_module_index.js I can easily fix this, but I'm concerned about why/how it happened in the first place, and that updating all of the references to this script would just be treating a tiny symptom of a larger problem. FYI, @kpmurali @thomelane @ThomasDelteil
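The off-by-one prefix described above can be checked mechanically: from a page's directory, the number of `../` segments needed to reach `_static/js/auto_module_index.js` is exactly the page's depth below the site root. A quick sketch (the directory names are assumptions for the demo, not the site's actual layout):

```python
import posixpath


def script_prefix(page_dir):
    """Relative src a page living in page_dir needs for the shared JS helper."""
    rel_to_root = posixpath.relpath(".", page_dir)
    return posixpath.join(rel_to_root, "_static/js/auto_module_index.js")


# Moving a page one folder deeper adds one "../" to the required prefix.
print(script_prefix("api/python"))          # ../../_static/js/auto_module_index.js
print(script_prefix("api/python/contrib"))  # ../../../_static/js/auto_module_index.js
```

This is why docs that moved one folder deeper while keeping a hard-coded `../../` prefix now resolve the script one level too shallow.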
[GitHub] lanking520 commented on issue #11249: Flaky Scala test: IllegalArgumentException: requirement failed: Failed to start ps scheduler
lanking520 commented on issue #11249: Flaky Scala test: IllegalArgumentException: requirement failed: Failed to start ps scheduler URL: https://github.com/apache/incubator-mxnet/issues/11249#issuecomment-397089026 Currently, all the test failures I can see from the CI build come from the CPU tests. There is currently no difference in configuration between running the Spark test on CPU and on GPU, so I think the issue might come from the difference between the CPU and GPU builds. Please correct me if this assumption is wrong.
[GitHub] anirudh2290 commented on issue #11054: [MXNET-244] Fixed armv7 wheel (1.2.0 release)
anirudh2290 commented on issue #11054: [MXNET-244] Fixed armv7 wheel (1.2.0 release) URL: https://github.com/apache/incubator-mxnet/pull/11054#issuecomment-396686426 ping @lebeg. Is this ready? Also, has this been merged to master?
[GitHub] aaronmarkham commented on issue #11180: [MXNET-503] Website landing page for MMS, PR II
aaronmarkham commented on issue #11180: [MXNET-503] Website landing page for MMS, PR II URL: https://github.com/apache/incubator-mxnet/pull/11180#issuecomment-397086503 Rather than make a special folder for mms in /examples, I created a model-server folder, so any other model serving examples can be added there.
[GitHub] aaronmarkham opened a new issue #11265: Flaky test: Python 3: CPU Win
aaronmarkham opened a new issue #11265: Flaky test: Python 3: CPU Win URL: https://github.com/apache/incubator-mxnet/issues/11265 ``` remote file operation failed: C:/jenkins_slave/workspace/ut-python-cpu at hudson.remoting.Channel@232e68f3:JNLP4-connect connection from ip-172-31-2-173.us-west-2.compute.internal/172.31.2.173:49714: java.nio.file.AccessDeniedException: C:\jenkins_slave\workspace\ut-python-cpu ``` http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-11261/3/pipeline/
[GitHub] nswamy commented on issue #11249: Flaky Scala test: IllegalArgumentException: requirement failed: Failed to start ps scheduler
nswamy commented on issue #11249: Flaky Scala test: IllegalArgumentException: requirement failed: Failed to start ps scheduler URL: https://github.com/apache/incubator-mxnet/issues/11249#issuecomment-397084974 Maybe; I haven't looked at how long it takes to run these tests. We need to have some training tests in the regular pipeline as well; maybe we can run them on GPUs and keep MNIST-only on the CPU tests.