[GitHub] zheng-da commented on a change in pull request #11498: Fix InferStorage for sparse fallback in FullyConnected
zheng-da commented on a change in pull request #11498: Fix InferStorage for sparse fallback in FullyConnected URL: https://github.com/apache/incubator-mxnet/pull/11498#discussion_r199313642

## File path: src/operator/nn/fully_connected.cc ##

```diff
@@ -210,17 +210,17 @@ inline static bool BackwardFCStorageType(const nnvm::NodeAttrs& attrs,
   CHECK_EQ(in_attrs->size(), 3U);
   CHECK_EQ(out_attrs->size(), out_expected);
-  DispatchMode wanted_mode;
-#if 0
+  bool dispatched = false;
+  // TODO(zhengda) let's disable MKLDNN for FullyConnected for now.
```

Review comment: Ideally we should enable it; I forgot the case in which it fails. I don't think MKLDNN FC backward is much faster than our native implementation, so it's not very urgent. @pengzhao-intel @TaoLv did you check whether MKLDNN FC backward works in all cases?

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
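For readers unfamiliar with storage-type inference, the shape the patch moves to (try each specialized dispatch in turn with a `dispatched` flag, then fall back to the dense default) can be sketched as a toy Python model. This is a hypothetical illustration, not the actual C++ in `fully_connected.cc`; the mode names and helper are invented for the sketch:

```python
def infer_backward_storage(in_types, mkldnn_enabled):
    """Toy model of dispatch-mode selection: try specialized paths, else fall back."""
    dispatched = False
    mode = None
    # MKLDNN path: only when the build enables it and every input is dense
    if not dispatched and mkldnn_enabled and all(t == 'default' for t in in_types):
        mode, dispatched = 'fcompute_ex', True
    # sparse path: a row_sparse input can be handled by a specialized kernel
    if not dispatched and any(t == 'row_sparse' for t in in_types):
        mode, dispatched = 'fcompute_ex', True
    # nothing matched: densify the inputs and run the default dense kernel
    if not dispatched:
        mode = 'fcompute_fallback'
    return mode
```

Disabling MKLDNN in this model simply makes the first branch never fire, so dense inputs fall through to the fallback path, which is the effect the TODO in the patch describes.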
[GitHub] yukang2017 opened a new issue #11508: 【Question】Are there any examples for gradients clipping in gluon?
yukang2017 opened a new issue #11508: 【Question】Are there any examples for gradients clipping in gluon? URL: https://github.com/apache/incubator-mxnet/issues/11508

This is what I guess. Is it right?

```python
trainer.allreduce_grads()
with autograd.record():
    logits = model(input)
    loss = criterion(logits, target)
loss.backward()
grads = [i.grad(ctx) for i in model.params.values()]
gluon.utils.clip_global_norm(grads, args.grad_clip)
trainer.update(args.batch_size)
```

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
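For reference, the clipping step in that guess can be sketched in pure NumPy. This is a toy re-implementation of what `gluon.utils.clip_global_norm` does conceptually (scale all gradients so their joint L2 norm does not exceed a threshold), not the actual gluon code:

```python
import numpy as np

def clip_global_norm_sketch(arrays, max_norm):
    """Scale every array in place so their joint L2 norm is at most max_norm."""
    # global L2 norm across all arrays, as if they were one flat vector
    total_norm = np.sqrt(sum(float((a * a).sum()) for a in arrays))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        for a in arrays:
            a *= scale
    return total_norm

grads = [np.array([3.0]), np.array([4.0])]   # global norm = 5.0
norm = clip_global_norm_sketch(grads, 1.0)   # each array is scaled by 1/5
```

Note the clipping is applied to all gradients jointly, not per-array, which is why the question collects `grads` across all parameters before calling `clip_global_norm`.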
[GitHub] szha commented on a change in pull request #11134: [MXNET-507] Set dtype=int32 for ret_indices in ordering ops
szha commented on a change in pull request #11134: [MXNET-507] Set dtype=int32 for ret_indices in ordering ops URL: https://github.com/apache/incubator-mxnet/pull/11134#discussion_r199310908

## File path: src/operator/tensor/ordering_op-inl.h ##

```diff
@@ -79,6 +80,16 @@ struct TopKParam : public dmlc::Parameter {
   DMLC_DECLARE_FIELD(is_ascend).set_default(false)
   .describe("Whether to choose k largest or k smallest elements."
             " Top K largest elements will be chosen if set to false.");
+  DMLC_DECLARE_FIELD(dtype)
+  .add_enum("uint8", mshadow::kUint8)
+  .add_enum("int32", mshadow::kInt32)
+  .add_enum("float16", mshadow::kFloat16)
```

Review comment: Should we remove this option, given that it won't be supported?

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
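The concern behind returning indices as `float16` is easy to demonstrate with NumPy: float16 has an 11-bit significand, so integers above 2048 are no longer exactly representable, and large indices would silently collide. A quick check:

```python
import numpy as np

# every integer up to 2**11 = 2048 fits exactly in float16 ...
assert int(np.float16(2048)) == 2048
# ... but 2049 rounds to the nearest representable value, which is 2048
assert int(np.float16(2049)) == 2048
# so indices returned as float16 are ambiguous for arrays longer than 2048 elements
```

This is the usual argument for an integer return dtype (like the `int32` this PR defaults to) for ordering-op indices.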
[incubator-mxnet] branch master updated: [MXNET-595] Install page not loading properly in Internet Explorer (#11404)
This is an automated email from the ASF dual-hosted git repository. zhasheng pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git

The following commit(s) were added to refs/heads/master by this push:
     new 9a0099f  [MXNET-595] Install page not loading properly in Internet Explorer (#11404)

9a0099f is described below

commit 9a0099f6cebbdd2c1a226cdbf6b863e8a248e9f0
Author: kpmurali <37911926+kpmur...@users.noreply.github.com>
AuthorDate: Fri Jun 29 19:45:43 2018 -0700

    [MXNET-595] Install page not loading properly in Internet Explorer (#11404)

    * Adding in the fixes for IE in the install page by replacing urlSearchParams for a custom method
    * Further fixes for IE install page
---
 docs/_static/js/navbar.js  |  4 ++--
 docs/_static/js/options.js | 24 +++++++++++++++++-------
 2 files changed, 19 insertions(+), 9 deletions(-)

```diff
diff --git a/docs/_static/js/navbar.js b/docs/_static/js/navbar.js
index 2a27c50..0384194 100644
--- a/docs/_static/js/navbar.js
+++ b/docs/_static/js/navbar.js
@@ -4,8 +4,8 @@ var DOC_TITLE = ['/faq/', '/tutorials/', '/architecture/', '/model_zoo/'];
 var APISubmenu, versionSubmenu, docSubmenu, communitySubmenu;
 $("#burgerMenu").children().each(function () {
     if($(this).children().first().html() == 'API') APISubmenu = $(this).clone();
-    if($(this).children().first().html().startsWith('Versions')) versionSubmenu = $(this).clone();
-    if($(this).children().first().html().startsWith('Community')) communitySubmenu = $(this).clone();
+    if($(this).children().first().html().indexOf('Versions') == 0) versionSubmenu = $(this).clone();
+    if($(this).children().first().html().indexOf('Community') == 0) communitySubmenu = $(this).clone();
     if($(this).children().first().html() == 'Docs') docSubmenu = $(this).clone();
 });
diff --git a/docs/_static/js/options.js b/docs/_static/js/options.js
index a1e40fe..eb82b11 100644
--- a/docs/_static/js/options.js
+++ b/docs/_static/js/options.js
@@ -8,9 +8,19 @@ $(document).ready(function () {
     function label(lbl) {
         return lbl.replace(/[ .]/g, '-').toLowerCase();
     }
+
+    function urlSearchParams(searchString) {
+        let urlDict = new Map();
+        let searchParams = searchString.substring(1).split("&");
+        searchParams.forEach(function(element) {
+            kvPair = element.split("=");
+            urlDict.set(kvPair[0], kvPair[1]);
+        });
+        return urlDict;
+    }

     function setSelects(){
-        let urlParams = new URLSearchParams(window.location.search);
+        let urlParams = urlSearchParams(window.location.search);
         if (urlParams.get('version'))
             versionSelect = urlParams.get('version');
         $('li a:contains(' + versionSelect + ')').parent().siblings().removeClass('active');
@@ -33,8 +43,8 @@ $(document).ready(function () {
         $('button:contains(' + environSelect + ')').siblings().removeClass('active');
         $('button:contains(' + environSelect + ')').addClass('active');
         showContent();
-        if (window.location.href.includes("/install/index.html")) {
-            if (versionSelect.includes(defaultVersion)) {
+        if (window.location.href.indexOf("/install/index.html") >= 0) {
+            if (versionSelect.indexOf(defaultVersion) >= 0) {
                 history.pushState(null, null, '/install/index.html?platform=' + platformSelect + '&language=' + languageSelect + '&processor=' + processorSelect);
             } else {
                 history.pushState(null, null, '/install/index.html?version=' + versionSelect + '&platform=' + platformSelect + '&language=' + languageSelect + '&processor=' + processorSelect);
@@ -56,18 +66,18 @@ $(document).ready(function () {
     setSelects();
     function setContent() {
         var el = $(this);
-        let urlParams = new URLSearchParams(window.location.search);
+        let urlParams = urlSearchParams(window.location.search);
         el.siblings().removeClass('active');
         el.addClass('active');
         if ($(this).hasClass("versions")) {
             $('.current-version').html( $(this).text() + ' ' );
-            if (!$(this).text().includes(defaultVersion)) {
-                if (!window.location.search.includes("version")) {
+            if ($(this).text().indexOf(defaultVersion) < 0) {
+                if (window.location.search.indexOf("version") < 0) {
                     history.pushState(null, null, '/install/index.html' + window.location.search.concat( '&version=' + $(this).text() ));
                 } else {
                     history.pushState(null, null, '/install/index.html' + window.location.search.replace( urlParams.get('version'), $(this).text() ));
                 }
-            } else if (window.location.search.includes("version")) {
+            } else if (window.location.search.indexOf("version") >= 0) {
                 history.pushState(null, null, '/install/index.html' + window.location.search.replace( 'version', 'prev' ));
```
[GitHub] szha closed pull request #11404: [MXNET-595] Install page not loading properly in Internet Explorer
szha closed pull request #11404: [MXNET-595] Install page not loading properly in Internet Explorer URL: https://github.com/apache/incubator-mxnet/pull/11404

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

```diff
diff --git a/docs/_static/js/navbar.js b/docs/_static/js/navbar.js
index 2a27c50bbc0..0384194fa2d 100644
--- a/docs/_static/js/navbar.js
+++ b/docs/_static/js/navbar.js
@@ -4,8 +4,8 @@ var DOC_TITLE = ['/faq/', '/tutorials/', '/architecture/', '/model_zoo/'];
 var APISubmenu, versionSubmenu, docSubmenu, communitySubmenu;
 $("#burgerMenu").children().each(function () {
     if($(this).children().first().html() == 'API') APISubmenu = $(this).clone();
-    if($(this).children().first().html().startsWith('Versions')) versionSubmenu = $(this).clone();
-    if($(this).children().first().html().startsWith('Community')) communitySubmenu = $(this).clone();
+    if($(this).children().first().html().indexOf('Versions') == 0) versionSubmenu = $(this).clone();
+    if($(this).children().first().html().indexOf('Community') == 0) communitySubmenu = $(this).clone();
     if($(this).children().first().html() == 'Docs') docSubmenu = $(this).clone();
 });
diff --git a/docs/_static/js/options.js b/docs/_static/js/options.js
index a1e40fe6c30..eb82b113c63 100644
--- a/docs/_static/js/options.js
+++ b/docs/_static/js/options.js
@@ -8,9 +8,19 @@ $(document).ready(function () {
     function label(lbl) {
         return lbl.replace(/[ .]/g, '-').toLowerCase();
     }
+
+    function urlSearchParams(searchString) {
+        let urlDict = new Map();
+        let searchParams = searchString.substring(1).split("&");
+        searchParams.forEach(function(element) {
+            kvPair = element.split("=");
+            urlDict.set(kvPair[0], kvPair[1]);
+        });
+        return urlDict;
+    }

     function setSelects(){
-        let urlParams = new URLSearchParams(window.location.search);
+        let urlParams = urlSearchParams(window.location.search);
         if (urlParams.get('version'))
             versionSelect = urlParams.get('version');
         $('li a:contains(' + versionSelect + ')').parent().siblings().removeClass('active');
@@ -33,8 +43,8 @@ $(document).ready(function () {
         $('button:contains(' + environSelect + ')').siblings().removeClass('active');
         $('button:contains(' + environSelect + ')').addClass('active');
         showContent();
-        if (window.location.href.includes("/install/index.html")) {
-            if (versionSelect.includes(defaultVersion)) {
+        if (window.location.href.indexOf("/install/index.html") >= 0) {
+            if (versionSelect.indexOf(defaultVersion) >= 0) {
                 history.pushState(null, null, '/install/index.html?platform=' + platformSelect + '&language=' + languageSelect + '&processor=' + processorSelect);
             } else {
                 history.pushState(null, null, '/install/index.html?version=' + versionSelect + '&platform=' + platformSelect + '&language=' + languageSelect + '&processor=' + processorSelect);
@@ -56,18 +66,18 @@ $(document).ready(function () {
     setSelects();
     function setContent() {
         var el = $(this);
-        let urlParams = new URLSearchParams(window.location.search);
+        let urlParams = urlSearchParams(window.location.search);
         el.siblings().removeClass('active');
         el.addClass('active');
         if ($(this).hasClass("versions")) {
             $('.current-version').html( $(this).text() + ' ' );
-            if (!$(this).text().includes(defaultVersion)) {
-                if (!window.location.search.includes("version")) {
+            if ($(this).text().indexOf(defaultVersion) < 0) {
+                if (window.location.search.indexOf("version") < 0) {
                     history.pushState(null, null, '/install/index.html' + window.location.search.concat( '&version=' + $(this).text() ));
                 } else {
                     history.pushState(null, null, '/install/index.html' + window.location.search.replace( urlParams.get('version'), $(this).text() ));
                 }
-            } else if (window.location.search.includes("version")) {
+            } else if (window.location.search.indexOf("version") >= 0) {
                 history.pushState(null, null, '/install/index.html' + window.location.search.replace( 'version', 'prev' ));
             }
         }
```

This is an
automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] szha closed pull request #11405: Update large word language model example
szha closed pull request #11405: Update large word language model example URL: https://github.com/apache/incubator-mxnet/pull/11405

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

```diff
diff --git a/example/rnn/large_word_lm/LogUniformGenerator.cc b/example/rnn/large_word_lm/LogUniformGenerator.cc
new file mode 100644
index 000..ae40659437d
--- /dev/null
+++ b/example/rnn/large_word_lm/LogUniformGenerator.cc
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+/*!
+ * Copyright (c) 2018 by Contributors
+ * \file LogUniformGenerator.cc
+ * \brief log uniform distribution generator
+*/
+
+#include
+#include
+#include
+#include
+#include
+
+#include "LogUniformGenerator.h"
+
+LogUniformGenerator::LogUniformGenerator(const int range_max)
+  : range_max_(range_max), log_range_max_(log(range_max)),
+    generator_(), distribution_(0.0, 1.0) {}
+
+std::unordered_set LogUniformGenerator::draw(const size_t size, int* num_tries) {
+  std::unordered_set result;
+  int tries = 0;
+  while (result.size() != size) {
+    tries += 1;
+    double x = distribution_(generator_);
+    long value = lround(exp(x * log_range_max_)) - 1;
+    // sampling without replacement
+    if (result.find(value) == result.end()) {
+      result.emplace(value);
+    }
+  }
+  *num_tries = tries;
+  return result;
+}
diff --git a/example/rnn/large_word_lm/LogUniformGenerator.h b/example/rnn/large_word_lm/LogUniformGenerator.h
new file mode 100644
index 000..b6c4f93e515
--- /dev/null
+++ b/example/rnn/large_word_lm/LogUniformGenerator.h
@@ -0,0 +1,45 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+/*!
+ * Copyright (c) 2018 by Contributors
+ * \file LogUniformGenerator.h
+ * \brief log uniform distribution generator
+*/
+
+#ifndef _LOG_UNIFORM_GENERATOR_H
+#define _LOG_UNIFORM_GENERATOR_H
+
+#include
+#include
+#include
+
+class LogUniformGenerator {
+private:
+  const int range_max_;
+  const double log_range_max_;
+  std::default_random_engine generator_;
+  std::uniform_real_distribution distribution_;
+public:
+  LogUniformGenerator(const int);
+  std::unordered_set draw(const size_t, int*);
+};
+
+#endif  // _LOG_UNIFORM_GENERATOR_H
+
diff --git a/example/rnn/large_word_lm/Makefile b/example/rnn/large_word_lm/Makefile
new file mode 100644
index 000..116f7bb1514
--- /dev/null
+++ b/example/rnn/large_word_lm/Makefile
@@ -0,0 +1,25 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+all: clean
+
```
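The `LogUniformGenerator::draw` routine in the diff above samples distinct ids with a log-uniform distribution (frequent/low ids are drawn more often), rejecting duplicates. A rough Python transcription of the same algorithm, for readers who want to experiment with it (the function name and the fixed seed are illustrative, not part of the PR):

```python
import math
import random

def log_uniform_draw(size, range_max, rng=None):
    """Draw `size` distinct values in [0, range_max), log-uniformly distributed,
    by inverse-transform sampling with rejection of duplicates."""
    rng = rng or random.Random(42)  # fixed seed only so runs are reproducible here
    log_range_max = math.log(range_max)
    result, tries = set(), 0
    while len(result) != size:
        tries += 1
        x = rng.random()                               # uniform in [0, 1)
        value = round(math.exp(x * log_range_max)) - 1  # maps [0,1) -> [0, range_max)
        if value not in result:                        # sampling without replacement
            result.add(value)
    return result, tries
```

Returning `tries` alongside the sample mirrors the `num_tries` out-parameter in the C++ version, which downstream sampled-softmax code typically needs to correct the sampling probabilities.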
[incubator-mxnet] branch master updated: Update large word language model example (#11405)
This is an automated email from the ASF dual-hosted git repository. zhasheng pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git

The following commit(s) were added to refs/heads/master by this push:
     new 806b41b  Update large word language model example (#11405)

806b41b is described below

commit 806b41bfed33d496a35f0af00997774b662990f5
Author: Haibin Lin
AuthorDate: Fri Jun 29 19:43:57 2018 -0700

    Update large word language model example (#11405)

    * add cython sampler
    * remove unused files
    * use eval batch size = 1
    * update read me
    * update read me
    * update license
---
 example/rnn/large_word_lm/LogUniformGenerator.cc | 52 ++
 example/rnn/large_word_lm/LogUniformGenerator.h  | 45 +++
 example/rnn/large_word_lm/Makefile               | 25 +++
 example/rnn/large_word_lm/custom_module.py       |  3 +-
 example/rnn/large_word_lm/log_uniform.pyx        | 38
 example/rnn/large_word_lm/model.py               | 21 -
 example/rnn/large_word_lm/readme.md              | 16 +++
 example/rnn/large_word_lm/run_utils.py           | 11 +++-
 example/rnn/large_word_lm/sampler.py             | 55
 example/rnn/large_word_lm/setup.py               | 28
 example/rnn/large_word_lm/train.py               | 32 +-
 11 files changed, 292 insertions(+), 34 deletions(-)

```diff
diff --git a/example/rnn/large_word_lm/LogUniformGenerator.cc b/example/rnn/large_word_lm/LogUniformGenerator.cc
new file mode 100644
index 000..ae40659
--- /dev/null
+++ b/example/rnn/large_word_lm/LogUniformGenerator.cc
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+/*!
+ * Copyright (c) 2018 by Contributors
+ * \file LogUniformGenerator.cc
+ * \brief log uniform distribution generator
+*/
+
+#include
+#include
+#include
+#include
+#include
+
+#include "LogUniformGenerator.h"
+
+LogUniformGenerator::LogUniformGenerator(const int range_max)
+  : range_max_(range_max), log_range_max_(log(range_max)),
+    generator_(), distribution_(0.0, 1.0) {}
+
+std::unordered_set LogUniformGenerator::draw(const size_t size, int* num_tries) {
+  std::unordered_set result;
+  int tries = 0;
+  while (result.size() != size) {
+    tries += 1;
+    double x = distribution_(generator_);
+    long value = lround(exp(x * log_range_max_)) - 1;
+    // sampling without replacement
+    if (result.find(value) == result.end()) {
+      result.emplace(value);
+    }
+  }
+  *num_tries = tries;
+  return result;
+}
diff --git a/example/rnn/large_word_lm/LogUniformGenerator.h b/example/rnn/large_word_lm/LogUniformGenerator.h
new file mode 100644
index 000..b6c4f93
--- /dev/null
+++ b/example/rnn/large_word_lm/LogUniformGenerator.h
@@ -0,0 +1,45 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+/*!
+ * Copyright (c) 2018 by Contributors
+ * \file LogUniformGenerator.h
+ * \brief log uniform distribution generator
+*/
+
+#ifndef _LOG_UNIFORM_GENERATOR_H
+#define _LOG_UNIFORM_GENERATOR_H
+
+#include
+#include
+#include
+
+class LogUniformGenerator {
+private:
+  const int range_max_;
+  const double log_range_max_;
+  std::default_random_engine generator_;
+  std::uniform_real_distribution distribution_;
+public:
+  LogUniformGenerator(const int);
+  std::unordered_set draw(const size_t, int*);
```
[GitHub] marcoabreu opened a new pull request #11507: Disable flaky test test_conv in gluon
marcoabreu opened a new pull request #11507: Disable flaky test test_conv in gluon URL: https://github.com/apache/incubator-mxnet/pull/11507

## Description ##
Seems like test_conv is actually broken as well... https://github.com/apache/incubator-mxnet/issues/11506

## Checklist ##
### Essentials ###
Please feel free to remove inapplicable items for your PR.
- [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) created (except PRs with tiny changes)
- [x] Changes are complete (i.e. I finished coding on this PR)
- [ ] All changes have test coverage:
  - Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  - Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  - Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
- [x] Code is well-documented:
  - For user-facing API changes, the API doc string has been updated.
  - For new C++ functions in header files, their functionalities and arguments are documented.
  - For new examples, a README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
  - Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
- [ ] To my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

### Changes ###
- [ ] Feature1, tests, (and when applicable, API doc)
- [ ] Feature2, tests, (and when applicable, API doc)

## Comments ##
- If this change is a backward incompatible change, why must this change be made.
- Interesting edge cases to note here

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] marcoabreu opened a new issue #11506: Flaky test_operator_gpu.test_conv causes other tests to fail on Windows
marcoabreu opened a new issue #11506: Flaky test_operator_gpu.test_conv causes other tests to fail on Windows URL: https://github.com/apache/incubator-mxnet/issues/11506

http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/master/1106/pipeline

```
test_operator_gpu.test_ndarray_crop ... ok
test_operator_gpu.test_cell_fill_shape ... ok
test_operator_gpu.test_conv ... [23:56:08] c:\jenkins_slave\workspace\build-gpu@2\src\operator\nn\cudnn\./cudnn_algoreg-inl.h:107: Running performance tests to find the best convolution algorithm, this can take a while... (setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)
[23:56:08] C:/jenkins_slave/workspace/build-gpu@2/src/operator/nn/convolution.cu:148: This convolution is not supported by cudnn, MXNET convolution is applied.
[23:56:08] C:/jenkins_slave/workspace/build-gpu@2/src/operator/nn/convolution.cu:227: This convolution is not supported by cudnn, MXNET convolution is applied.
[23:56:08] C:/jenkins_slave/workspace/build-gpu@2/src/operator/nn/convolution.cu:148: This convolution is not supported by cudnn, MXNET convolution is applied.
[23:56:08] C:/jenkins_slave/workspace/build-gpu@2/src/operator/nn/convolution.cu:227: This convolution is not supported by cudnn, MXNET convolution is applied.
[INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=270901180 to reproduce.
[INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1177804468 to reproduce.
ERROR
test_operator_gpu.test_layer_fill_shape ... ERROR
test_operator_gpu.test_ndarray_concatenate ... [INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=820959189 to reproduce.
ERROR
test_operator_gpu.test_normal_generator ... [INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1440948114 to reproduce.
ERROR
test_operator_gpu.test_sparse_nd_transpose ... [INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=479442568 to reproduce.
ERROR
test_operator_gpu.test_slice_channel ... [INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=952262734 to reproduce.
ERROR
test_operator_gpu.test_sparse_nd_storage_fallback ... [23:56:09] c:\jenkins_slave\workspace\build-gpu@2\src\operator\../common/utils.h:447: 
Storage type fallback detected:
operator = broadcast_add
input storage types = [default, default, ]
output storage types = [csr, ]
params = {}
context.dev_mask = gpu
The operator with default storage type will be dispatched for execution. You're seeing this warning message because the operator above is unable to process the given ndarrays with specified storage types, context and parameter. Temporary dense ndarrays are generated in order to execute the operator. This does not affect the correctness of the programme. You can set environment variable MXNET_STORAGE_FALLBACK_LOG_VERBOSE to 0 to suppress this warning.
[INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=2072600123 to reproduce.
ERROR
test_operator_gpu.test_clip ... [INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1670877169 to reproduce.
ERROR
test_operator_gpu.test_convolution_with_type ... [INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1234 to reproduce.
ERROR
test_operator_gpu.test_deconv ... [INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=77155738 to reproduce.
[INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=542068512 to reproduce.
ERROR
test_operator_gpu.test_dot ... [INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1533114250 to reproduce.
ERROR
test_operator_gpu.test_uniform_generator ... [INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1203830416 to reproduce.
ERROR
test sparse random operator on cpu ... ok
test_operator_gpu.test_nag ... [INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=0 to reproduce.
ERROR
test_operator_gpu.test_gamma_generator ... [INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=579518551 to reproduce.
ERROR
test regression operator ... [INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1630735113 to reproduce.
ERROR
test_operator_gpu.test_sparse_nd_astype ... ok
test_operator_gpu.test_exponential_generator ... [INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=493650054 to reproduce.
ERROR
test_operator_gpu.test_sparse_nd_astype_copy ... ok
test_operator_gpu.test_convolution_options ... SKIP: test fails intermittently. temporarily disabled till it gets fixed. tracked at
```
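The `MXNET_TEST_SEED=...` lines in the log above exist so a failing run can be replayed deterministically. Loosely, the harness's seed handling can be sketched like this (a simplified illustration, not the actual test-framework code; `get_test_seed` and the dict argument are invented for the sketch):

```python
import random

def get_test_seed(env):
    """Honor MXNET_TEST_SEED from the environment if set; otherwise draw a
    fresh seed (which the harness then logs so the run can be reproduced)."""
    if 'MXNET_TEST_SEED' in env:
        return int(env['MXNET_TEST_SEED'])
    return random.randint(0, 2**31 - 1)
```

So re-running a failed test with, e.g., `MXNET_TEST_SEED=270901180` set in the environment pins the np/mx/python random state to the failing run's state.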
[GitHub] marcoabreu commented on a change in pull request #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce
marcoabreu commented on a change in pull request #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce URL: https://github.com/apache/incubator-mxnet/pull/10696#discussion_r199309383

## File path: tests/nightly/test_all.sh ##

```diff
@@ -44,6 +44,7 @@
 USE_CUDA=1
 USE_CUDA_PATH=/usr/local/cuda
 USE_CUDNN=1
 USE_DIST_KVSTORE=1
+USE_ALLREDUCE_DIST_KVSTORE=1
```

Review comment: I'm quite certain that we are not using this file. @mbaijal could you please confirm? We should be deleting this file otherwise.

Sure, feel free to add it. Does it need to be a separate build, or could we just extend the current one? Please consider doing that in the nightly Jenkinsfile at tests/nightly/JenkinsfileForBinary (or something like that).

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] marcoabreu edited a comment on issue #11475: Amending ONNX importer/exporter #11213
marcoabreu edited a comment on issue #11475: Amending ONNX importer/exporter #11213 URL: https://github.com/apache/incubator-mxnet/issues/11475#issuecomment-401508772

We currently have the following structure:

```
tests/python-pytest
tests/python
tests/cpp
```

The current structure is basically ``tests/{LANGUAGE}`` and I think it's quite user-friendly to keep it that way (we have some exceptions already). ``tests/python`` is nosetests, and basically every python test is nose by default if not specified otherwise; the same goes for ``tests/cpp`` being for gtest. Having a folder called python-pytest on the same level as the regular python tests covered by nose makes it easily distinguishable that there is an exception from our standard. I prefer explicit naming over implicit naming (like python being implied by pytest). From my experience, this "verbose" style makes it more structured and easier for newcomers to understand.

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] marcoabreu commented on issue #11475: Amending ONNX importer/exporter #11213
marcoabreu commented on issue #11475: Amending ONNX importer/exporter #11213 URL: https://github.com/apache/incubator-mxnet/issues/11475#issuecomment-401508772

We currently have the following structure:

```
tests/python-pytest
tests/python
tests/cpp
```

The current structure is basically ``tests/{LANGUAGE}`` and I think it's quite user-friendly to keep it that way (we have some exceptions already). ``tests/python`` is nosetests and basically every python test is nose by default if not specified otherwise. Having a folder called python-pytest on the same level as the regular python tests covered by nose makes it easily distinguishable. I rather prefer to have explicit naming opposed to having implicit ones (like python being implied by pytest). From my experience, this "verbose" style makes it more structured and easier for newcomers to understand.

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] aaronmarkham edited a comment on issue #11503: updating installation info to have latest packages and more clarity
aaronmarkham edited a comment on issue #11503: updating installation info to have latest packages and more clarity URL: https://github.com/apache/incubator-mxnet/pull/11503#issuecomment-401507051 The table is rendering incorrectly, so I'll need to fix that first. It is supposed to look like this: ![2018-06-29_18-06-38](https://user-images.githubusercontent.com/5974205/42119928-362ad5ba-7bc7-11e8-97de-dba8fd099c90.png) I now have it rendering on github, but it doesn't make it to the website. I suppose I can't use those github icons, so I'll try font awesome instead. Or just make an image...
[GitHub] vrakesh opened a new pull request #11505: [MXNET-615] Fix for flaky test_svmoutput_with_type
vrakesh opened a new pull request #11505: [MXNET-615] Fix for flaky test_svmoutput_with_type URL: https://github.com/apache/incubator-mxnet/pull/11505
## Description ##
Fixes flaky test #8288
## Checklist ##
### Essentials ###
Please feel free to remove inapplicable items for your PR.
- [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) created (except PRs with tiny changes)
- [ ] Changes are complete (i.e. I finished coding on this PR)
- [ ] All changes have test coverage:
  - Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  - Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  - Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
- [ ] Code is well-documented:
  - For user-facing API changes, API doc string has been updated.
  - For new C++ functions in header files, their functionalities and arguments are documented.
  - For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
  - Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
- [ ] To my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
### Changes ###
- [ ] Feature1, tests, (and when applicable, API doc)
- [ ] Feature2, tests, (and when applicable, API doc)
## Comments ##
- If this change is a backward incompatible change, why must this change be made.
- Interesting edge cases to note here
[GitHub] eric-haibin-lin closed issue #10920: Flaky test test_operator_gpu.test_sparse_dot
eric-haibin-lin closed issue #10920: Flaky test test_operator_gpu.test_sparse_dot URL: https://github.com/apache/incubator-mxnet/issues/10920
[GitHub] aaronmarkham commented on issue #11503: updating installation info to have latest packages and more clarity
aaronmarkham commented on issue #11503: updating installation info to have latest packages and more clarity URL: https://github.com/apache/incubator-mxnet/pull/11503#issuecomment-401507051 The table is rendering incorrectly, so I'll need to fix that first. It is supposed to look like this: ![2018-06-29_18-06-38](https://user-images.githubusercontent.com/5974205/42119928-362ad5ba-7bc7-11e8-97de-dba8fd099c90.png)
[GitHub] aaronmarkham removed a comment on issue #11503: updating installation info to have latest packages and more clarity
aaronmarkham removed a comment on issue #11503: updating installation info to have latest packages and more clarity URL: https://github.com/apache/incubator-mxnet/pull/11503#issuecomment-401503263 @nswamy @marcoabreu @szha - Can you please review?
[GitHub] eric-haibin-lin closed pull request #10889: [MXNET-382] Shape and Size Operator
eric-haibin-lin closed pull request #10889: [MXNET-382] Shape and Size Operator URL: https://github.com/apache/incubator-mxnet/pull/10889 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/docs/api/python/ndarray/ndarray.md b/docs/api/python/ndarray/ndarray.md index 323344d69c0..dda534151a1 100644 --- a/docs/api/python/ndarray/ndarray.md +++ b/docs/api/python/ndarray/ndarray.md @@ -124,6 +124,8 @@ The `ndarray` package provides several classes: :nosignatures: NDArray.T +NDArray.shape_array +NDArray.size_array NDArray.reshape NDArray.reshape_like NDArray.flatten @@ -375,6 +377,8 @@ The `ndarray` package provides several classes: :nosignatures: cast +shape_array +size_array reshape reshape_like flatten diff --git a/docs/api/python/symbol/symbol.md b/docs/api/python/symbol/symbol.md index cc63e13e6ec..304b17803ed 100644 --- a/docs/api/python/symbol/symbol.md +++ b/docs/api/python/symbol/symbol.md @@ -191,6 +191,8 @@ Composite multiple symbols into a new one by an operator. :nosignatures: Symbol.astype +Symbol.shape_array +Symbol.size_array Symbol.reshape Symbol.reshape_like Symbol.flatten @@ -373,6 +375,8 @@ Composite multiple symbols into a new one by an operator. :nosignatures: cast +shape_array +size_array reshape reshape_like flatten diff --git a/python/mxnet/ndarray/ndarray.py b/python/mxnet/ndarray/ndarray.py index 002ce3ebbc2..09395e2ec82 100644 --- a/python/mxnet/ndarray/ndarray.py +++ b/python/mxnet/ndarray/ndarray.py @@ -1254,6 +1254,22 @@ def flatten(self, *args, **kwargs): """ return op.flatten(self, *args, **kwargs) +def shape_array(self, *args, **kwargs): +"""Convenience fluent method for :py:func:`shape_array`. + +The arguments are the same as for :py:func:`shape_array`, with +this array as data. 
+""" +return op.shape_array(self, *args, **kwargs) + +def size_array(self, *args, **kwargs): +"""Convenience fluent method for :py:func:`size_array`. + +The arguments are the same as for :py:func:`size_array`, with +this array as data. +""" +return op.size_array(self, *args, **kwargs) + def expand_dims(self, *args, **kwargs): """Convenience fluent method for :py:func:`expand_dims`. diff --git a/python/mxnet/symbol/symbol.py b/python/mxnet/symbol/symbol.py index c5e2f5cb77d..b041f4ef646 100644 --- a/python/mxnet/symbol/symbol.py +++ b/python/mxnet/symbol/symbol.py @@ -1982,6 +1982,22 @@ def flatten(self, *args, **kwargs): """ return op.flatten(self, *args, **kwargs) +def shape_array(self, *args, **kwargs): +"""Convenience fluent method for :py:func:`shape_array`. + +The arguments are the same as for :py:func:`shape_op`, with +this array as data. +""" +return op.shape_array(self, *args, **kwargs) + +def size_array(self, *args, **kwargs): +"""Convenience fluent method for :py:func:`size_array`. + +The arguments are the same as for :py:func:`size_array`, with +this array as data. +""" +return op.size_array(self, *args, **kwargs) + def expand_dims(self, *args, **kwargs): """Convenience fluent method for :py:func:`expand_dims`. 
diff --git a/python/mxnet/test_utils.py b/python/mxnet/test_utils.py index 19fe0749598..ae5a473d228 100644 --- a/python/mxnet/test_utils.py +++ b/python/mxnet/test_utils.py @@ -1251,13 +1251,15 @@ def check_consistency(sym, ctx_list, scale=1.0, grad_req='write', np.dtype(np.float32): 1e-3, np.dtype(np.float64): 1e-5, np.dtype(np.uint8): 0, - np.dtype(np.int32): 0} + np.dtype(np.int32): 0, + np.dtype(np.int64): 0} elif isinstance(tol, numbers.Number): tol = {np.dtype(np.float16): tol, np.dtype(np.float32): tol, np.dtype(np.float64): tol, np.dtype(np.uint8): tol, - np.dtype(np.int32): tol} + np.dtype(np.int32): tol, + np.dtype(np.int64): tol} assert len(ctx_list) > 1 if isinstance(sym, Symbol): diff --git a/src/operator/tensor/elemwise_unary_op_basic.cc b/src/operator/tensor/elemwise_unary_op_basic.cc index 46f62651c75..5b89d49f430 100644 --- a/src/operator/tensor/elemwise_unary_op_basic.cc +++ b/src/operator/tensor/elemwise_unary_op_basic.cc @@ -398,6 +398,98 @@ NNVM_REGISTER_OP(reshape_like) .add_argument("lhs", "NDArray-or-Symbol", "First input.") .add_argument("rhs", "NDArray-or-Symbol", "Second input."); +void ShapeComputeCPU(const nnvm::NodeAttrs& attrs, +
[incubator-mxnet] branch master updated: [MXNET-382] Shape and Size Operator (#10889)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git The following commit(s) were added to refs/heads/master by this push: new 33022f8 [MXNET-382] Shape and Size Operator (#10889) 33022f8 is described below commit 33022f82a3088e95b61ed8ef3659e8c55433f556 Author: Anirudh AuthorDate: Fri Jun 29 18:05:38 2018 -0700 [MXNET-382] Shape and Size Operator (#10889) * Shape Operator * cuda * size op * lint issues * docs example * add docs, change op name to avoid conflict, add convenience confluent method * change name to _nd * fix test cases, add new kernel * test name fix. * solve gpu memory problem for size and shape * get rid of FIgnoreInputs attr of shape_nd * op name change * fix * retrigger CI * retrigger CI * retrigger CI * trigger CI * fix comments * cpplint * nit * trigger CI --- docs/api/python/ndarray/ndarray.md | 4 ++ docs/api/python/symbol/symbol.md | 4 ++ python/mxnet/ndarray/ndarray.py| 16 + python/mxnet/symbol/symbol.py | 16 + python/mxnet/test_utils.py | 6 +- src/operator/tensor/elemwise_unary_op_basic.cc | 92 ++ src/operator/tensor/elemwise_unary_op_basic.cu | 46 + tests/python/unittest/test_ndarray.py | 4 +- tests/python/unittest/test_operator.py | 18 + tests/python/unittest/test_symbol.py | 4 +- 10 files changed, 204 insertions(+), 6 deletions(-) diff --git a/docs/api/python/ndarray/ndarray.md b/docs/api/python/ndarray/ndarray.md index 323344d..dda5341 100644 --- a/docs/api/python/ndarray/ndarray.md +++ b/docs/api/python/ndarray/ndarray.md @@ -124,6 +124,8 @@ The `ndarray` package provides several classes: :nosignatures: NDArray.T +NDArray.shape_array +NDArray.size_array NDArray.reshape NDArray.reshape_like NDArray.flatten @@ -375,6 +377,8 @@ The `ndarray` package provides several classes: :nosignatures: cast +shape_array +size_array reshape reshape_like flatten diff --git a/docs/api/python/symbol/symbol.md 
b/docs/api/python/symbol/symbol.md index cc63e13..304b178 100644 --- a/docs/api/python/symbol/symbol.md +++ b/docs/api/python/symbol/symbol.md @@ -191,6 +191,8 @@ Composite multiple symbols into a new one by an operator. :nosignatures: Symbol.astype +Symbol.shape_array +Symbol.size_array Symbol.reshape Symbol.reshape_like Symbol.flatten @@ -373,6 +375,8 @@ Composite multiple symbols into a new one by an operator. :nosignatures: cast +shape_array +size_array reshape reshape_like flatten diff --git a/python/mxnet/ndarray/ndarray.py b/python/mxnet/ndarray/ndarray.py index 002ce3e..09395e2 100644 --- a/python/mxnet/ndarray/ndarray.py +++ b/python/mxnet/ndarray/ndarray.py @@ -1254,6 +1254,22 @@ fixed-size items. """ return op.flatten(self, *args, **kwargs) +def shape_array(self, *args, **kwargs): +"""Convenience fluent method for :py:func:`shape_array`. + +The arguments are the same as for :py:func:`shape_array`, with +this array as data. +""" +return op.shape_array(self, *args, **kwargs) + +def size_array(self, *args, **kwargs): +"""Convenience fluent method for :py:func:`size_array`. + +The arguments are the same as for :py:func:`size_array`, with +this array as data. +""" +return op.size_array(self, *args, **kwargs) + def expand_dims(self, *args, **kwargs): """Convenience fluent method for :py:func:`expand_dims`. diff --git a/python/mxnet/symbol/symbol.py b/python/mxnet/symbol/symbol.py index c5e2f5c..b041f4e 100644 --- a/python/mxnet/symbol/symbol.py +++ b/python/mxnet/symbol/symbol.py @@ -1982,6 +1982,22 @@ class Symbol(SymbolBase): """ return op.flatten(self, *args, **kwargs) +def shape_array(self, *args, **kwargs): +"""Convenience fluent method for :py:func:`shape_array`. + +The arguments are the same as for :py:func:`shape_op`, with +this array as data. +""" +return op.shape_array(self, *args, **kwargs) + +def size_array(self, *args, **kwargs): +"""Convenience fluent method for :py:func:`size_array`. 
+ +The arguments are the same as for :py:func:`size_array`, with +this array as data. +""" +return op.size_array(self, *args, **kwargs) + def expand_dims(self, *args, **kwargs): """Convenience fluent method for :py:func:`expand_dims`. diff --git a/python/mxnet/test_utils.py b/python/mxnet/test_utils.py index
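As a quick illustration of the semantics the merged operators add, here is a NumPy analogue (a sketch only, assuming the int64 output dtype implied by the `test_utils.py` tolerance change; the real calls are the fluent methods `NDArray.shape_array()` / `NDArray.size_array()` registered above):

```python
import numpy as np

def shape_array(x):
    # the shape itself as a 1-D int64 array, so it can flow through the
    # computation graph like any other tensor
    return np.array(x.shape, dtype=np.int64)

def size_array(x):
    # the total element count as a 1-D int64 array
    return np.array([x.size], dtype=np.int64)

x = np.zeros((3, 4, 5))
print(shape_array(x))  # [3 4 5]
print(size_array(x))   # [60]
```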
[GitHub] rahul003 commented on a change in pull request #11406: [MXNET-599] Partial shape infer for Slice
rahul003 commented on a change in pull request #11406: [MXNET-599] Partial shape infer for Slice URL: https://github.com/apache/incubator-mxnet/pull/11406#discussion_r199307912
## File path: src/operator/tensor/matrix_op-inl.h ##
@@ -674,13 +675,23 @@ inline void SetSliceOpOutputDimSize(const index_t i, const int b,
                                    const int e, const int s, TShape* oshape) {
   if (s > 0) {
-    CHECK_LT(b, e) << "slicing with begin=[" << i << "]=" << b << ", end[" << i << "]="
+    CHECK_LE(b, e) << "slicing with begin=[" << i << "]=" << b << ", end[" << i << "]="
Review comment: Yes, now when the shape of a dim is not known, `b == e == 0`
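To see why the relaxed `CHECK_LE` matters, here is a minimal Python sketch of the sliced-dimension size computation (a hypothetical simplification of `SetSliceOpOutputDimSize`, not the actual implementation): with an unknown input dimension, begin and end both default to 0, which the old `CHECK_LT` rejected.

```python
def slice_out_dim_size(b, e, s):
    """Output size of one sliced dimension (simplified sketch)."""
    assert s != 0, "step cannot be zero"
    if s > 0:
        # CHECK_LE: b == e == 0 is now allowed for an unknown input dim,
        # yielding output size 0, i.e. "unknown" in partial shape inference
        assert b <= e, "begin must not exceed end for a positive step"
        return (e - b + s - 1) // s
    else:
        assert e <= b, "end must not exceed begin for a negative step"
        return (b - e - s - 1) // (-s)

print(slice_out_dim_size(0, 0, 1))   # unknown dim -> 0
print(slice_out_dim_size(0, 5, 2))   # ceil(5 / 2) -> 3
print(slice_out_dim_size(5, 0, -2))  # negative step -> 3
```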
[GitHub] aaronmarkham commented on issue #11152: updating Scala IntelliJ tutorial & installation instructions
aaronmarkham commented on issue #11152: updating Scala IntelliJ tutorial & installation instructions URL: https://github.com/apache/incubator-mxnet/pull/11152#issuecomment-401506350 @nswamy I've addressed all of your concerns. Please confirm/merge. Thanks!
[GitHub] roywei commented on issue #8044: Fix the Test test_operator_gpu.test_batchnorm_training
roywei commented on issue #8044: Fix the Test test_operator_gpu.test_batchnorm_training URL: https://github.com/apache/incubator-mxnet/issues/8044#issuecomment-401506153 It may be due to the cuDNN implementation in BatchNorm. Changed
```
numeric_eps=1e-3
atol=1e-3
```
and running the test
```
MXNET_TEST_SEED=612977881 nosetests -s --verbose test_operator_gpu.py:test_batchnorm_training
```
gives this error:
```
==
FAIL: test_operator_gpu.test_batchnorm_training
--
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/usr/local/lib/python3.5/dist-packages/nose/util.py", line 620, in newfunc
    return func(*arg, **kw)
  File "/home/ubuntu/incubator-mxnet/tests/python/gpu/../unittest/common.py", line 157, in test_new
    orig_test(*args, **kwargs)
  File "/home/ubuntu/incubator-mxnet/tests/python/gpu/../unittest/test_operator.py", line 1513, in test_batchnorm_training
    check_batchnorm_training(stype)
  File "/home/ubuntu/incubator-mxnet/tests/python/gpu/../unittest/test_operator.py", line 1465, in check_batchnorm_training
    check_numeric_gradient(test, in_location, mean_std, numeric_eps=1e-3, rtol=0.16, atol=1e-3)
  File "/usr/local/lib/python3.5/dist-packages/mxnet/test_utils.py", line 914, in check_numeric_gradient
    ("NUMERICAL_%s"%name, "BACKWARD_%s"%name))
  File "/usr/local/lib/python3.5/dist-packages/mxnet/test_utils.py", line 493, in assert_almost_equal
    raise AssertionError(msg)
AssertionError: Items are not equal:
Error 1.417775 exceeds tolerance rtol=0.16, atol=0.001000. Location of maximum error:(0,), a=-0.001431, b=-0.10
NUMERICAL_batchnorm22_gamma: array([-0.00143051, 0.8763224 , -0.47454238], dtype=float32)
BACKWARD_batchnorm22_gamma: array([-0.1038, 0.87640995, -0.47491556], dtype=float32)
>> begin captured logging <<
common: INFO: Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=797368411 to reproduce.
common: WARNING: *** test-level seed set: all "@with_seed()" tests run deterministically ***
common: INFO: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=612977881 to reproduce.
- >> end captured logging << -
```
This test is disabled for both cpu and gpu; we should enable the cpu case at least. #11396 didn't have this issue because [check consistency](https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/test_utils.py#L1207) is not using training during forward execution, while check_numeric_gradient uses `is_train=True`
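For reference, `check_numeric_gradient` estimates gradients by central finite differences with step `numeric_eps`, then compares against the backward pass under `rtol`/`atol`. A minimal self-contained sketch of that idea (not MXNet's actual implementation) is:

```python
import numpy as np

def numeric_grad(f, x, eps=1e-3):
    """Central-difference estimate of df/dx, element by element."""
    x = x.astype(np.float64)  # work on a copy in double precision
    g = np.zeros_like(x)
    it = np.nditer(x, flags=['multi_index'])
    while not it.finished:
        i = it.multi_index
        orig = x[i]
        x[i] = orig + eps
        f_plus = f(x)
        x[i] = orig - eps
        f_minus = f(x)
        x[i] = orig
        g[i] = (f_plus - f_minus) / (2.0 * eps)
        it.iternext()
    return g

# The estimate has O(eps^2) truncation error, so loosening numeric_eps or
# atol trades truncation error against floating-point cancellation noise.
x = np.array([1.0, -2.0, 3.0])
print(numeric_grad(lambda v: (v ** 2).sum(), x))  # ~ 2 * x
```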
[GitHub] rahul003 commented on a change in pull request #11406: [MXNET-599] Partial shape infer for Slice
rahul003 commented on a change in pull request #11406: [MXNET-599] Partial shape infer for Slice URL: https://github.com/apache/incubator-mxnet/pull/11406#discussion_r199307425
## File path: src/operator/tensor/matrix_op-inl.h ##
@@ -690,9 +701,10 @@ inline bool SliceOpShape(const nnvm::NodeAttrs& attrs,
   CHECK_EQ(in_attrs->size(), 1U);
   CHECK_EQ(out_attrs->size(), 1U);
   const TShape& dshape = (*in_attrs)[0];
-  if (dshape.ndim() == 0 || dshape.Size() == 0) return false;
Review comment: Added
[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.
This is an automated email from the ASF dual-hosted git repository. zhasheng pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git The following commit(s) were added to refs/heads/asf-site by this push: new e00cc55 Bump the publish timestamp. e00cc55 is described below commit e00cc5574348e2d4c4cb8126b62123fe7732d3df Author: mxnet-ci AuthorDate: Sat Jun 30 00:35:08 2018 + Bump the publish timestamp. --- date.txt | 1 + 1 file changed, 1 insertion(+) diff --git a/date.txt b/date.txt new file mode 100644 index 000..6f6d13d --- /dev/null +++ b/date.txt @@ -0,0 +1 @@ +Sat Jun 30 00:35:08 UTC 2018
[GitHub] anirudhacharya commented on issue #10756: L1 Normalization operator not present in MXNet
anirudhacharya commented on issue #10756: L1 Normalization operator not present in MXNet URL: https://github.com/apache/incubator-mxnet/issues/10756#issuecomment-401504206 https://github.com/apache/incubator-mxnet/pull/11229
[GitHub] anirudhacharya closed issue #10756: L1 Normalization operator not present in MXNet
anirudhacharya closed issue #10756: L1 Normalization operator not present in MXNet URL: https://github.com/apache/incubator-mxnet/issues/10756
[GitHub] anirudhacharya commented on issue #10889: [MXNET-382] Shape and Size Operator
anirudhacharya commented on issue #10889: [MXNET-382] Shape and Size Operator URL: https://github.com/apache/incubator-mxnet/pull/10889#issuecomment-401504000 @piiswrong @reminisce please merge this.
[GitHub] vrakesh commented on issue #8288: test_svmoutput_with_type fails in CI builds
vrakesh commented on issue #8288: test_svmoutput_with_type fails in CI builds URL: https://github.com/apache/incubator-mxnet/issues/8288#issuecomment-401503518 The above script from @anirudh2290 was expanded with a test for gpu as well:
```python
import mxnet as mx
import numpy as np

def try_svm(dtype, x):
    exe = sym.simple_bind(grad_req='write', **{'ctx': mx.cpu(0), 'svmoutput_data': (4, 2), 'type_dict': {'svmoutput_data': dtype}})
    exe.arg_arrays[0][:] = x.astype(dtype)
    exe.forward(is_train=True)
    exe.backward(exe.outputs)
    print(exe.outputs[0].asnumpy())
    print(exe.arg_arrays[0].asnumpy())
    print(exe.grad_arrays[0].asnumpy())

def try_svm_gpu(dtype, x):
    exe = sym.simple_bind(grad_req='write', **{'ctx': mx.gpu(0), 'svmoutput_data': (4, 2), 'type_dict': {'svmoutput_data': dtype}})
    exe.arg_arrays[0][:] = x.astype(dtype)
    exe.forward(is_train=True)
    exe.backward(exe.outputs)
    print(exe.outputs[0].asnumpy())
    print(exe.arg_arrays[0].asnumpy())
    print(exe.grad_arrays[0].asnumpy())

print("CPU run\n")
np.random.seed(1640822401)
x = np.random.normal(size=(4, 2), scale=1.0)
sym = mx.sym.SVMOutput(name='svmoutput', use_linear=True)
try_svm(np.float32, x)
try_svm(np.float16, x)

print("GPU run\n")
np.random.seed(643549208)
try_svm_gpu(np.float32, x)
try_svm_gpu(np.float16, x)
```
The output of the test is
```bash
CPU run

# outputs, args float32
[[ 0.23923199 -0.5441972 ]
 [ 0.30816466 -0.34832394]
 [ 0.5518 1.06915462]
 [-0.37221318 -0.06465939]]
[[ 0.23923199 -0.5441972 ]
 [ 0.30816466 -0.34832394]
 [ 0.5518 1.06915462]
 [-0.37221318 -0.06465939]]
# grads float32
[[-1. 1.]
 [-1. 1.]
 [-1. 1.]
 [-1. 1.]]
# outputs, args float16
[[ 0.23925781 -0.54443359]
 [ 0.30810547 -0.34838867]
 [ 1. 1.06933594]
 [-0.37231445 -0.06463623]]
[[ 0.23925781 -0.54443359]
 [ 0.30810547 -0.34838867]
 [ 1. 1.06933594]
 [-0.37231445 -0.06463623]]
# grads float16
[[-1. 1.]
 [-1. 1.]
 [-0. 1.]
 [-1. 1.]]

GPU run

# outputs, args float32
[[ 0.23923199 -0.5441972 ]
 [ 0.30816466 -0.34832394]
 [ 0.5518 1.06915462]
 [-0.37221318 -0.06465939]]
[[ 0.23923199 -0.5441972 ]
 [ 0.30816466 -0.34832394]
 [ 0.5518 1.06915462]
 [-0.37221318 -0.06465939]]
# grads float32
[[-1. 1.]
 [-1. 1.]
 [-1. 1.]
 [-1. 1.]]
# outputs, args float16
[[ 0.23925781 -0.54443359]
 [ 0.30810547 -0.34838867]
 [ 1. 1.06933594]
 [-0.37231445 -0.06463623]]
[[ 0.23925781 -0.54443359]
 [ 0.30810547 -0.34838867]
 [ 1. 1.06933594]
 [-0.37231445 -0.06463623]]
# grads float16
[[-1. 1.]
 [-1. 1.]
 [-0. 1.]
 [-1. 1.]]
```
As seen above for float16, an output that is close to 0.999 rounds to 1.0, and our default margin for the SVMOutput layer is 1.0. In the L1_SVM function of the operator we compare `margin > src` (where src is the outputs). For float32 this resolves to 1.0 > 0.99532; for float16 it resolves to 1.0 > 1.0, showing wrong results in grad (type-cast from bool to float), which is 1.0 and 0.0, thus resulting in the error. To fix this, the input provided to the test should not be normally distributed, as values close to the margin break the condition due to rounding at the float16 level. One solution is to provide the test with a uniform distribution of inputs that never gets close to the margin. Creating a PR for this.
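The rounding effect described above is easy to reproduce with plain NumPy (a sketch; 0.9998 is an illustrative value near the margin, not one taken from the test run):

```python
import numpy as np

margin = 1.0
out32 = np.float32(0.9998)   # an SVM output just below the margin
out16 = np.float16(out32)    # float16 spacing just below 1.0 is 2**-11,
                             # so 0.9998 rounds up to exactly 1.0

assert margin > out32            # L1-SVM condition `margin > src` holds...
assert out16 == np.float16(1.0)
assert not (margin > out16)      # ...but flips after the float16 round-trip,
                                 # zeroing the corresponding gradient entry
```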
[GitHub] aaronmarkham commented on issue #11503: updating installation info to have latest packages and more clarity
aaronmarkham commented on issue #11503: updating installation info to have latest packages and more clarity URL: https://github.com/apache/incubator-mxnet/pull/11503#issuecomment-401503263 @nswamy @marcoabreu @szha - Can you please review?
[GitHub] aaronmarkham opened a new pull request #11503: updating installation info to have latest packages and more clarity
aaronmarkham opened a new pull request #11503: updating installation info to have latest packages and more clarity URL: https://github.com/apache/incubator-mxnet/pull/11503
## Description ##
There hasn't been an update on installation detail instructions in a while. Many new packages are out and not referenced, so I'm trying to fix that.
- Fixed a lot of old info in the Ubuntu instructions.
- Added a table of different pip packages and what versions they support.
- Added recommended installations.
- Added scripts for quick installation.
- Reorganized sections for clarity.
- Untangled some R instructions - these were sort of inter-mingled with the standard instructions.
## Comments ##
- I didn't mark mkl as experimental. LMK if I should change that.
- I made an assumption that I don't want to show CUDA 9.1 packages... since NVIDIA isn't even offering it...
- I assume that CUDA 9.2 and cuDNN 7.1.4 is what is desired.
- I assume that `pip install mxnet-cu92mkl` is recommended for inference.
- I assume that `pip install mxnet-cu92` is recommended for training.
[GitHub] eric-haibin-lin closed pull request #11229: [MXNET-379] L1 Normalization
eric-haibin-lin closed pull request #11229: [MXNET-379] L1 Normalization URL: https://github.com/apache/incubator-mxnet/pull/11229 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/src/operator/tensor/broadcast_reduce_op.h b/src/operator/tensor/broadcast_reduce_op.h index e50071bdab7..ac7199a9482 100644 --- a/src/operator/tensor/broadcast_reduce_op.h +++ b/src/operator/tensor/broadcast_reduce_op.h @@ -70,7 +70,7 @@ struct NormParam : public dmlc::Parameter { bool keepdims; DMLC_DECLARE_PARAMETER(NormParam) { DMLC_DECLARE_FIELD(ord).set_default(2) - .describe("Order of the norm. Currently ord=2 is supported."); + .describe("Order of the norm. Currently ord=1 and ord=2 is supported."); DMLC_DECLARE_FIELD(axis).set_default(dmlc::optional()) .describe(R"code(The axis or axes along which to perform the reduction. The default, `axis=()`, will compute over all elements into a @@ -869,7 +869,7 @@ struct ReduceGrad { } }; -inline bool L2NormStorageType(const nnvm::NodeAttrs& attrs, +inline bool LpNormStorageType(const nnvm::NodeAttrs& attrs, const int dev_mask, DispatchMode* dispatch_mode, std::vector* in_attrs, @@ -889,18 +889,20 @@ inline bool L2NormStorageType(const nnvm::NodeAttrs& attrs, dispatched = storage_type_assign(_stype, kDefaultStorage, dispatch_mode, DispatchMode::kFCompute); } - const TShape axis = param.axis.has_value() ? 
param.axis.value() : TShape(); - if (!dispatched && (in_stype == kRowSparseStorage || in_stype == kCSRStorage) && - axis.ndim() == 0 && param.ord == 2) { -// l2 norm: rsp/csr, axis = () -> dns -dispatched = storage_type_assign(_stype, kDefaultStorage, dispatch_mode, - DispatchMode::kFComputeEx); - } - if (!dispatched && in_stype == kCSRStorage && axis.ndim() == 1 && !param.keepdims && - (axis[0] == 0 || axis[0] == 1) && param.ord == 2) { -// l2 norm: csr, axis = 0/1 -> dns -dispatched = storage_type_assign(_stype, kDefaultStorage, dispatch_mode, - dispatch_ex); + if (param.ord == 2) { +const TShape axis = param.axis.has_value() ? param.axis.value() : TShape(); +if (!dispatched && (in_stype == kRowSparseStorage || in_stype == kCSRStorage) && +axis.ndim() == 0 && param.ord == 2) { + // l2 norm: rsp/csr, axis = () -> dns + dispatched = storage_type_assign(_stype, kDefaultStorage, dispatch_mode, + DispatchMode::kFComputeEx); +} +if (!dispatched && in_stype == kCSRStorage && axis.ndim() == 1 && !param.keepdims && +(axis[0] == 0 || axis[0] == 1) && param.ord == 2) { + // l2 norm: csr, axis = 0/1 -> dns + dispatched = storage_type_assign(_stype, kDefaultStorage, dispatch_mode, + dispatch_ex); +} } if (!dispatched) { dispatched = dispatch_fallback(out_attrs, dispatch_mode); @@ -984,13 +986,13 @@ void SqRootForL2(const OpContext& ctx, OpReqType req, const TBlob ) { } template -void L2NormCompute(const nnvm::NodeAttrs& attrs, +void LpNormCompute(const nnvm::NodeAttrs& attrs, const OpContext& ctx, const std::vector& inputs, const std::vector& req, const std::vector& outputs) { const NormParam& param = nnvm::get(attrs.parsed); - CHECK_EQ(param.ord, 2) << "norm only support ord=2"; + CHECK(param.ord == 1 || param.ord == 2) << "norm only supports ord=1 and ord=2"; if (req[0] == kNullOp) return; TShape small; @@ -999,13 +1001,18 @@ void L2NormCompute(const nnvm::NodeAttrs& attrs, } else { small = ReduceAxesShapeImpl(inputs[0].shape_, param.axis, true, false); } - 
ReduceAxesComputeImpl( - ctx, inputs, req, outputs, small); - SqRootForL2(ctx, req[0], outputs[0]); + if (param.ord == 1) { +ReduceAxesComputeImpl( + ctx, inputs, req, outputs, small); + } else if (param.ord == 2) { +ReduceAxesComputeImpl( +ctx, inputs, req, outputs, small); +SqRootForL2(ctx, req[0], outputs[0]); + } } template -void L2NormGradCompute(const nnvm::NodeAttrs& attrs, +void LpNormGradCompute(const nnvm::NodeAttrs& attrs, const OpContext& ctx, const std::vector& inputs, const std::vector& req, @@ -1021,8 +1028,36 @@ void L2NormGradCompute(const nnvm::NodeAttrs& attrs, } else { small = ReduceAxesShapeImpl(outputs[0].shape_, param.axis, true, false); } - ReduceAxesBackwardUseInOutImpl(ctx, small, inputs, -
[incubator-mxnet] branch master updated: [MXNET-379] L1 Normalization (#11229)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git The following commit(s) were added to refs/heads/master by this push: new f783a66 [MXNET-379] L1 Normalization (#11229) f783a66 is described below commit f783a66b1c9f141738ab4f4c0b6f525f61a95d6c Author: Anirudh AuthorDate: Fri Jun 29 17:06:41 2018 -0700 [MXNET-379] L1 Normalization (#11229) * l1 norm --- src/operator/tensor/broadcast_reduce_op.h| 79 +--- src/operator/tensor/broadcast_reduce_op_value.cc | 32 ++ src/operator/tensor/broadcast_reduce_op_value.cu | 4 +- tests/python/unittest/test_ndarray.py| 47 +++--- tests/python/unittest/test_operator.py | 44 + 5 files changed, 143 insertions(+), 63 deletions(-) diff --git a/src/operator/tensor/broadcast_reduce_op.h b/src/operator/tensor/broadcast_reduce_op.h index e50071b..ac7199a 100644 --- a/src/operator/tensor/broadcast_reduce_op.h +++ b/src/operator/tensor/broadcast_reduce_op.h @@ -70,7 +70,7 @@ struct NormParam : public dmlc::Parameter { bool keepdims; DMLC_DECLARE_PARAMETER(NormParam) { DMLC_DECLARE_FIELD(ord).set_default(2) - .describe("Order of the norm. Currently ord=2 is supported."); + .describe("Order of the norm. Currently ord=1 and ord=2 is supported."); DMLC_DECLARE_FIELD(axis).set_default(dmlc::optional()) .describe(R"code(The axis or axes along which to perform the reduction. The default, `axis=()`, will compute over all elements into a @@ -869,7 +869,7 @@ struct ReduceGrad { } }; -inline bool L2NormStorageType(const nnvm::NodeAttrs& attrs, +inline bool LpNormStorageType(const nnvm::NodeAttrs& attrs, const int dev_mask, DispatchMode* dispatch_mode, std::vector* in_attrs, @@ -889,18 +889,20 @@ inline bool L2NormStorageType(const nnvm::NodeAttrs& attrs, dispatched = storage_type_assign(_stype, kDefaultStorage, dispatch_mode, DispatchMode::kFCompute); } - const TShape axis = param.axis.has_value() ? 
param.axis.value() : TShape(); - if (!dispatched && (in_stype == kRowSparseStorage || in_stype == kCSRStorage) && - axis.ndim() == 0 && param.ord == 2) { -// l2 norm: rsp/csr, axis = () -> dns -dispatched = storage_type_assign(_stype, kDefaultStorage, dispatch_mode, - DispatchMode::kFComputeEx); - } - if (!dispatched && in_stype == kCSRStorage && axis.ndim() == 1 && !param.keepdims && - (axis[0] == 0 || axis[0] == 1) && param.ord == 2) { -// l2 norm: csr, axis = 0/1 -> dns -dispatched = storage_type_assign(_stype, kDefaultStorage, dispatch_mode, - dispatch_ex); + if (param.ord == 2) { +const TShape axis = param.axis.has_value() ? param.axis.value() : TShape(); +if (!dispatched && (in_stype == kRowSparseStorage || in_stype == kCSRStorage) && +axis.ndim() == 0 && param.ord == 2) { + // l2 norm: rsp/csr, axis = () -> dns + dispatched = storage_type_assign(_stype, kDefaultStorage, dispatch_mode, + DispatchMode::kFComputeEx); +} +if (!dispatched && in_stype == kCSRStorage && axis.ndim() == 1 && !param.keepdims && +(axis[0] == 0 || axis[0] == 1) && param.ord == 2) { + // l2 norm: csr, axis = 0/1 -> dns + dispatched = storage_type_assign(_stype, kDefaultStorage, dispatch_mode, + dispatch_ex); +} } if (!dispatched) { dispatched = dispatch_fallback(out_attrs, dispatch_mode); @@ -984,13 +986,13 @@ void SqRootForL2(const OpContext& ctx, OpReqType req, const TBlob ) { } template -void L2NormCompute(const nnvm::NodeAttrs& attrs, +void LpNormCompute(const nnvm::NodeAttrs& attrs, const OpContext& ctx, const std::vector& inputs, const std::vector& req, const std::vector& outputs) { const NormParam& param = nnvm::get(attrs.parsed); - CHECK_EQ(param.ord, 2) << "norm only support ord=2"; + CHECK(param.ord == 1 || param.ord == 2) << "norm only supports ord=1 and ord=2"; if (req[0] == kNullOp) return; TShape small; @@ -999,13 +1001,18 @@ void L2NormCompute(const nnvm::NodeAttrs& attrs, } else { small = ReduceAxesShapeImpl(inputs[0].shape_, param.axis, true, false); } - 
ReduceAxesComputeImpl( - ctx, inputs, req, outputs, small); - SqRootForL2(ctx, req[0], outputs[0]); + if (param.ord == 1) { +ReduceAxesComputeImpl( + ctx, inputs, req, outputs, small); + } else if (param.ord == 2) { +ReduceAxesComputeImpl( +ctx, inputs, req, outputs, small); +SqRootForL2(ctx, req[0], outputs[0]); + } } template -void
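The commit above routes ord=1 through a sum of absolute values and ord=2 through a sum of squares followed by `SqRootForL2`. A minimal NumPy sketch of the two reductions the operator computes (an illustration of the math only, not the MXNet kernel; the helper name `lp_norm` is made up here):

```python
import numpy as np

def lp_norm(x, ord=2, axis=None, keepdims=False):
    # ord=1: sum of absolute values (the newly added L1 path)
    if ord == 1:
        return np.abs(x).sum(axis=axis, keepdims=keepdims)
    # ord=2: square root of the sum of squares (the existing L2 path)
    if ord == 2:
        return np.sqrt((x * x).sum(axis=axis, keepdims=keepdims))
    raise ValueError("only ord=1 and ord=2 are supported")

x = np.array([[1.0, -2.0], [3.0, -4.0]])
print(lp_norm(x, ord=1))          # -> 10.0
print(lp_norm(x, ord=2))          # sqrt(30), about 5.477
print(lp_norm(x, ord=1, axis=0))  # -> [4. 6.]
```

As in the diff, the sparse `FComputeEx` dispatch remains restricted to ord=2; other orders fall back to the dense path.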
[GitHub] szha commented on issue #11482: make gluon rnn layers hybrid blocks
szha commented on issue #11482: make gluon rnn layers hybrid blocks URL: https://github.com/apache/incubator-mxnet/pull/11482#issuecomment-401501912 I can override forward but it would be pretty much equivalent. The reason I cannot do this in hybrid_forward is that when given partial shape, hybrid_forward would only be invoked with symbols as part of the infer_shape pass. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] anirudhacharya commented on issue #11229: [MXNET-379] L1 Normalization
anirudhacharya commented on issue #11229: [MXNET-379] L1 Normalization URL: https://github.com/apache/incubator-mxnet/pull/11229#issuecomment-401501298 @eric-haibin-lin can you please merge this.
[GitHub] ctcyang commented on a change in pull request #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce
ctcyang commented on a change in pull request #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce URL: https://github.com/apache/incubator-mxnet/pull/10696#discussion_r199300072 ## File path: src/kvstore/collectives/src/collectives.cc ## @@ -0,0 +1,779 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +/** + * Copyright (c) 2018 by Contributors + */ + +#if MXNET_USE_ALLREDUCE_DIST_KVSTORE + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "mxnet/base.h" +#include "mxnet/ndarray.h" +#include "mxnet/engine.h" +#include "dmlc/logging.h" +#include "mpi_message.pb.h" +#include "collectives.h" +#include "coll_wrapper.h" +#include "coll_util.h" + +using namespace mxnet::kvstore; + +const char INT_PREFIX[] = "INT"; +const char STR_PREFIX[] = "STR"; +const char IDX_PREFIX[] = "IDX"; +const char OPS_PREFIX[] = "OPS"; +const char OPS_ALLREDUCE[] = "ALLREDUCE"; +const char OPS_BROADCAST[] = "BROADCAST"; +const char DELIMITER[] = ":"; + +namespace { + +struct CollectiveOpRecord { + int rank; + + std::string key; + + MPIDataType dtype; + + mxnet::NDArray *val_in; + + mxnet::NDArray *val_out; + + int root_rank; + + mxnet::engine::CallbackOnComplete callback; +}; + +typedef std::unordered_map NDArrayTable; + +typedef std::unordered_map > MessageTable; + +/* + * Collective_global var maintain a message table and a background thread. + * In rank 0, message table is used to coordinate all reduce order + * of ndarray in different nodes.The background thread is used + * for doing collectives and doing coordination between nodes + * through mpi messages. 
+ */ +struct CollectiveGlobalState { + std::atomic_flag initialized_flag = ATOMIC_FLAG_INIT; + + std::condition_variable cv; + + bool initialization_done = false; + + int init_status; + + std::mutex mu; + + NDArrayTable ndarray_table; + + std::queue message_queue; + + std::thread background_thread; + + bool shut_down = false; + + std::unique_ptr message_table; + + int rank = 0; + + int local_rank = 0; + + int size = 1; + + int device = -1; + + mxnet::Context pinned_ctx; + + Comm *local_comm = NULL; + +~CollectiveGlobalState() { + if (background_thread.joinable()) { +shut_down = true; +background_thread.join(); + } +} +}; + +static CollectiveGlobalState coll_global; + +// static std::unordered_map mpi_comm_buf; + +#define RANK_ZERO 0 + +#define TAG_NOTIFY 1 + +bool IncrementNDArrayCount( + const std::unique_ptr& message_table, + const MPIRequest , int mpi_size) { + auto name = msg.key_name(); + auto table_iter = message_table->find(name); + if (table_iter == message_table->end()) { +message_table->emplace(name, std::vector({msg})); +MXCOLL_DEBUG(coll_global.rank, "Insert new message key [%s] reqeust type [%d] from " +"rank[%d] into message table!\n", name.c_str(), msg.request_type(), +msg.request_rank()); +table_iter = message_table->find(name); + } else { +MXCOLL_DEBUG(coll_global.rank, "Insert existing message key [%s] request type [%d]" +"from rank[%d] into message table!\n", +name.c_str(), msg.request_type(), msg.request_rank()); +table_iter->second.push_back(msg); + } + + int count = table_iter->second.size(); + MXCOLL_DEBUG(coll_global.rank, "Message Key [%s] count [%d]\n", name.c_str(), count); + return count == mpi_size; +} + +int DataTypeToMPIType(int ndarray_dtype, MPIDataType *mpi_dtype) { + if (ndarray_dtype == mshadow::kFloat32) { Review comment: float16 is a very important datatype to GPU training, so it would be great if that were added here. This is an automated message from the Apache Git Service. 
[GitHub] zhanghang1989 opened a new pull request #11502: [MXNET-614] Adding Synchronized Batch Normalization
zhanghang1989 opened a new pull request #11502: [MXNET-614] Adding Synchronized Batch Normalization URL: https://github.com/apache/incubator-mxnet/pull/11502 ## Description ## Adding Synchronized Batch Normalization ## Checklist ## ### Essentials ### Please feel free to remove inapplicable items for your PR. - [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) created (except PRs with tiny changes) - [ ] Changes are complete (i.e. I finished coding on this PR) - [ ] All changes have test coverage: - Unit tests are added for small changes to verify correctness (e.g. adding a new operator) - Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore) - Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL) - [ ] Code is well-documented: - For user-facing API changes, API doc string has been updated. - For new C++ functions in header files, their functionalities and arguments are documented. - For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable - Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html - [ ] To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change ### Changes ### - [ ] Feature1, tests, (and when applicable, API doc) - [ ] Feature2, tests, (and when applicable, API doc) ## Comments ## - If this change is a backward incompatible change, why must this change be made. - Interesting edge cases to note here This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.
zhasheng pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git The following commit(s) were added to refs/heads/asf-site by this push: new a937c7ce Bump the publish timestamp. a937c7ce is described below commit a937c7ce2027942f14ec770293d434456de14c49 Author: mxnet-ci AuthorDate: Fri Jun 29 23:31:07 2018 + Bump the publish timestamp. --- date.txt | 1 + 1 file changed, 1 insertion(+) diff --git a/date.txt b/date.txt new file mode 100644 index 000..062659f --- /dev/null +++ b/date.txt @@ -0,0 +1 @@ +Fri Jun 29 23:31:07 UTC 2018
[GitHub] piiswrong commented on issue #11482: make gluon rnn layers hybrid blocks
piiswrong commented on issue #11482: make gluon rnn layers hybrid blocks URL: https://github.com/apache/incubator-mxnet/pull/11482#issuecomment-401497076 I see. Then could you do it without overriding `__call__`?
[GitHub] szha closed pull request #11287: [MXNET-548] fixed path for auto_module_index.js
szha closed pull request #11287: [MXNET-548] fixed path for auto_module_index.js URL: https://github.com/apache/incubator-mxnet/pull/11287 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/docs/api/python/autograd/autograd.md b/docs/api/python/autograd/autograd.md index 1e699d26d55..f20799a4435 100644 --- a/docs/api/python/autograd/autograd.md +++ b/docs/api/python/autograd/autograd.md @@ -13,7 +13,7 @@ In machine learning applications, of loss functions with respect to parameters. -### Record vs Pause +## Record vs Pause `autograd` records computation history on the fly to calculate gradients later. This is only enabled inside a `with autograd.record():` block. @@ -63,7 +63,7 @@ Detailed tutorials are available in Part 1 of - + ## Autograd @@ -86,7 +86,7 @@ Detailed tutorials are available in Part 1 of ## API Reference - + ```eval_rst .. automodule:: mxnet.autograd diff --git a/docs/api/python/callback/callback.md b/docs/api/python/callback/callback.md index c83816c352a..714966678a1 100644 --- a/docs/api/python/callback/callback.md +++ b/docs/api/python/callback/callback.md @@ -17,7 +17,7 @@ This document lists the routines of the callback package ## API Reference - + ```eval_rst .. automodule:: mxnet.callback diff --git a/docs/api/python/contrib/onnx.md b/docs/api/python/contrib/onnx.md index 8cd619809c1..8ec491e8f10 100644 --- a/docs/api/python/contrib/onnx.md +++ b/docs/api/python/contrib/onnx.md @@ -40,7 +40,7 @@ This document describes all the ONNX-MXNet APIs. 
## API Reference - + ```eval_rst diff --git a/docs/api/python/contrib/text.md b/docs/api/python/contrib/text.md index 8bd67d2b508..e5f92bc7ef0 100644 --- a/docs/api/python/contrib/text.md +++ b/docs/api/python/contrib/text.md @@ -375,7 +375,7 @@ The following functions provide utilities for text data processing. ## API Reference - + ```eval_rst diff --git a/docs/api/python/executor/executor.md b/docs/api/python/executor/executor.md index 65245a41308..a37bfa91d2b 100644 --- a/docs/api/python/executor/executor.md +++ b/docs/api/python/executor/executor.md @@ -30,7 +30,7 @@ graph execution. This document is only intended for reference for advanced users ## API Reference - + ```eval_rst .. automodule:: mxnet.executor diff --git a/docs/api/python/gluon/gluon.md b/docs/api/python/gluon/gluon.md index 9bf866d21a1..07c98981a16 100644 --- a/docs/api/python/gluon/gluon.md +++ b/docs/api/python/gluon/gluon.md @@ -5,7 +5,7 @@ .. currentmodule:: mxnet.gluon ``` - + ## Overview @@ -152,7 +152,7 @@ net.hybridize() ## API Reference - + ```eval_rst .. automodule:: mxnet.gluon diff --git a/docs/api/python/gluon/loss.md b/docs/api/python/gluon/loss.md index 2bb7576e203..1aeb340a3db 100644 --- a/docs/api/python/gluon/loss.md +++ b/docs/api/python/gluon/loss.md @@ -30,7 +30,7 @@ This package includes several commonly used loss functions in neural networks. ## API Reference - + ```eval_rst .. automodule:: mxnet.gluon.loss diff --git a/docs/api/python/gluon/nn.md b/docs/api/python/gluon/nn.md index 1791faf86f0..25c82f06668 100644 --- a/docs/api/python/gluon/nn.md +++ b/docs/api/python/gluon/nn.md @@ -85,7 +85,7 @@ This document lists the neural network blocks in Gluon: ## API Reference - + ```eval_rst .. automodule:: mxnet.gluon.nn diff --git a/docs/api/python/gluon/rnn.md b/docs/api/python/gluon/rnn.md index 7a40c451bca..b558b8479e4 100644 --- a/docs/api/python/gluon/rnn.md +++ b/docs/api/python/gluon/rnn.md @@ -71,7 +71,7 @@ for i in range(5): ## API Reference - + ```eval_rst .. 
automodule:: mxnet.gluon.rnn diff --git a/docs/api/python/image/image.md b/docs/api/python/image/image.md index 82af4aa9b5c..11fff4f4340 100644 --- a/docs/api/python/image/image.md +++ b/docs/api/python/image/image.md @@ -156,7 +156,7 @@ and a list of augmenters specific for `Object detection` is provided ## API Reference - + ```eval_rst .. automodule:: mxnet.image diff --git a/docs/api/python/io/io.md b/docs/api/python/io/io.md index ecf3e75ac0d..6980834835b 100644 --- a/docs/api/python/io/io.md +++ b/docs/api/python/io/io.md @@ -149,7 +149,7 @@ The backend engine will recognize the index of `N` in the `layout` as the axis f ## API Reference - + ```eval_rst .. automodule:: mxnet.io diff --git a/docs/api/python/kvstore/kvstore.md b/docs/api/python/kvstore/kvstore.md index efd34bc724b..9f4c0649966 100644 --- a/docs/api/python/kvstore/kvstore.md +++ b/docs/api/python/kvstore/kvstore.md @@ -119,7 +119,7 @@ update on key: 9 ## API Reference - + ```eval_rst .. automodule:: mxnet.kvstore diff --git a/docs/api/python/metric/metric.md b/docs/api/python/metric/metric.md index 50a4a9be455..319647a6949 100644 ---
[GitHub] szha closed issue #11238: UX for ONNX Documentation is broken
szha closed issue #11238: UX for ONNX Documentation is broken URL: https://github.com/apache/incubator-mxnet/issues/11238
[incubator-mxnet] branch master updated: [MXNET-548] fixed path for auto_module_index.js (#11287)
This is an automated email from the ASF dual-hosted git repository. zhasheng pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git The following commit(s) were added to refs/heads/master by this push: new 355de66 [MXNET-548] fixed path for auto_module_index.js (#11287) 355de66 is described below commit 355de662ec6c13e5512419ccdde752c2b07a03ae Author: Aaron Markham AuthorDate: Fri Jun 29 16:02:21 2018 -0700 [MXNET-548] fixed path for auto_module_index.js (#11287) * fixed path for auto_module_index.js * nudge flaky ci test --- docs/api/python/autograd/autograd.md | 6 +++--- docs/api/python/callback/callback.md | 2 +- docs/api/python/contrib/onnx.md | 2 +- docs/api/python/contrib/text.md | 2 +- docs/api/python/executor/executor.md | 2 +- docs/api/python/gluon/gluon.md | 4 ++-- docs/api/python/gluon/loss.md| 2 +- docs/api/python/gluon/nn.md | 2 +- docs/api/python/gluon/rnn.md | 2 +- docs/api/python/image/image.md | 2 +- docs/api/python/io/io.md | 2 +- docs/api/python/kvstore/kvstore.md | 2 +- docs/api/python/metric/metric.md | 2 +- docs/api/python/model.md | 2 +- docs/api/python/module/module.md | 2 +- docs/api/python/optimization/optimization.md | 2 +- docs/api/python/profiler/profiler.md | 2 +- docs/api/python/rtc/rtc.md | 2 +- docs/api/python/symbol/rnn.md| 2 +- 19 files changed, 22 insertions(+), 22 deletions(-) diff --git a/docs/api/python/autograd/autograd.md b/docs/api/python/autograd/autograd.md index 1e699d2..f20799a 100644 --- a/docs/api/python/autograd/autograd.md +++ b/docs/api/python/autograd/autograd.md @@ -13,7 +13,7 @@ In machine learning applications, of loss functions with respect to parameters. -### Record vs Pause +## Record vs Pause `autograd` records computation history on the fly to calculate gradients later. This is only enabled inside a `with autograd.record():` block. 
@@ -63,7 +63,7 @@ Detailed tutorials are available in Part 1 of - + ## Autograd @@ -86,7 +86,7 @@ Detailed tutorials are available in Part 1 of ## API Reference - + ```eval_rst .. automodule:: mxnet.autograd diff --git a/docs/api/python/callback/callback.md b/docs/api/python/callback/callback.md index c83816c..7149666 100644 --- a/docs/api/python/callback/callback.md +++ b/docs/api/python/callback/callback.md @@ -17,7 +17,7 @@ This document lists the routines of the callback package ## API Reference - + ```eval_rst .. automodule:: mxnet.callback diff --git a/docs/api/python/contrib/onnx.md b/docs/api/python/contrib/onnx.md index 8cd6198..8ec491e 100644 --- a/docs/api/python/contrib/onnx.md +++ b/docs/api/python/contrib/onnx.md @@ -40,7 +40,7 @@ This document describes all the ONNX-MXNet APIs. ## API Reference - + ```eval_rst diff --git a/docs/api/python/contrib/text.md b/docs/api/python/contrib/text.md index 8bd67d2..e5f92bc 100644 --- a/docs/api/python/contrib/text.md +++ b/docs/api/python/contrib/text.md @@ -375,7 +375,7 @@ The following functions provide utilities for text data processing. ## API Reference - + ```eval_rst diff --git a/docs/api/python/executor/executor.md b/docs/api/python/executor/executor.md index 65245a4..a37bfa9 100644 --- a/docs/api/python/executor/executor.md +++ b/docs/api/python/executor/executor.md @@ -30,7 +30,7 @@ graph execution. This document is only intended for reference for advanced users ## API Reference - + ```eval_rst .. automodule:: mxnet.executor diff --git a/docs/api/python/gluon/gluon.md b/docs/api/python/gluon/gluon.md index 9bf866d..07c9898 100644 --- a/docs/api/python/gluon/gluon.md +++ b/docs/api/python/gluon/gluon.md @@ -5,7 +5,7 @@ .. currentmodule:: mxnet.gluon ``` - + ## Overview @@ -152,7 +152,7 @@ net.hybridize() ## API Reference - + ```eval_rst .. 
automodule:: mxnet.gluon diff --git a/docs/api/python/gluon/loss.md b/docs/api/python/gluon/loss.md index 2bb7576..1aeb340 100644 --- a/docs/api/python/gluon/loss.md +++ b/docs/api/python/gluon/loss.md @@ -30,7 +30,7 @@ This package includes several commonly used loss functions in neural networks. ## API Reference - + ```eval_rst .. automodule:: mxnet.gluon.loss diff --git a/docs/api/python/gluon/nn.md b/docs/api/python/gluon/nn.md index 1791faf..25c82f0 100644 --- a/docs/api/python/gluon/nn.md +++ b/docs/api/python/gluon/nn.md @@ -85,7 +85,7 @@ This document lists the neural network blocks in Gluon: ## API Reference - + ```eval_rst .. automodule:: mxnet.gluon.nn diff --git a/docs/api/python/gluon/rnn.md b/docs/api/python/gluon/rnn.md index 7a40c45..b558b84 100644 --- a/docs/api/python/gluon/rnn.md +++ b/docs/api/python/gluon/rnn.md @@ -71,7 +71,7 @@ for i in range(5): ##
[GitHub] szha commented on issue #11004: Only allocate cudnn-rnn dropout memory if dropout p > 0 and acquire descriptors during initialization
szha commented on issue #11004: Only allocate cudnn-rnn dropout memory if dropout p > 0 and acquire descriptors during initialization URL: https://github.com/apache/incubator-mxnet/pull/11004#issuecomment-401494044 @sxjscience see above. This seems like a good justification for reusing the temp memory for dropout.
[GitHub] eric-haibin-lin commented on issue #11004: Only allocate cudnn-rnn dropout memory if dropout p > 0 and acquire descriptors during initialization
eric-haibin-lin commented on issue #11004: Only allocate cudnn-rnn dropout memory if dropout p > 0 and acquire descriptors during initialization URL: https://github.com/apache/incubator-mxnet/pull/11004#issuecomment-401493149 According to @safrooze this PR makes the RNN cells in his model about 40% faster. We should mention this in the release note :P
[GitHub] anirudh2290 commented on issue #11478: 1.2.1 release notes
anirudh2290 commented on issue #11478: 1.2.1 release notes URL: https://github.com/apache/incubator-mxnet/pull/11478#issuecomment-401492746 @srochel your changes have been incorporated.
[GitHub] apeforest commented on a change in pull request #11466: [MXNET-560] Add temperature parameter in Softmax operator
apeforest commented on a change in pull request #11466: [MXNET-560] Add temperature parameter in Softmax operator URL: https://github.com/apache/incubator-mxnet/pull/11466#discussion_r199296955 ## File path: src/operator/nn/softmax-inl.h ## @@ -53,7 +53,7 @@ struct log_softmax_fwd { template inline void Softmax(Stream *s, DType *in, DType *out, -Shape shape, int axis) { +Shape shape, int axis, const float temperature) { Review comment: Thanks for the suggestion. I have changed the data type from float to generic DType
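The temperature parameter under review divides the logits before normalization. A small NumPy sketch of softmax-with-temperature under the usual definition (an illustration only, not MXNet's kernel; subtracting the row maximum is the standard numerical-stability trick):

```python
import numpy as np

def softmax(x, axis=-1, temperature=1.0):
    # T > 1 flattens the distribution, T < 1 sharpens it, T = 1 is standard softmax.
    z = x / temperature
    z = z - z.max(axis=axis, keepdims=True)  # stabilize before exponentiating
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

logits = np.array([2.0, 1.0, 0.0])
print(softmax(logits))                   # standard softmax
print(softmax(logits, temperature=5.0))  # closer to uniform
```

A quick sanity check is that the output always sums to 1 along the reduced axis, and that raising the temperature lowers the probability of the largest logit.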
[GitHub] roywei commented on issue #8044: Fix the Test test_operator_gpu.test_batchnorm_training
roywei commented on issue #8044: Fix the Test test_operator_gpu.test_batchnorm_training URL: https://github.com/apache/incubator-mxnet/issues/8044#issuecomment-401490912 issue causing the ctx error is tracked at #11448 working on flakiness: tried changing numeric_eps=1e-3 using following will reproduce the error ``` MXNET_TEST_SEED=335962305 nosetests -s --verbose test_operator_gpu.py:test_batchnorm_training ``` error ``` == FAIL: test_operator_gpu.test_batchnorm_training -- Traceback (most recent call last): File "/usr/local/lib/python3.5/dist-packages/nose/case.py", line 198, in runTest self.test(*self.arg) File "/usr/local/lib/python3.5/dist-packages/nose/util.py", line 620, in newfunc return func(*arg, **kw) File "/home/ubuntu/incubator-mxnet/tests/python/gpu/../unittest/common.py", line 157, in test_new orig_test(*args, **kwargs) File "/home/ubuntu/incubator-mxnet/tests/python/gpu/../unittest/test_operator.py", line 1510, in test_batchnorm_training check_batchnorm_training(stype) File "/home/ubuntu/incubator-mxnet/tests/python/gpu/../unittest/test_operator.py", line 1462, in check_batchnorm_training check_numeric_gradient(test, in_location, mean_std, numeric_eps=1e-3, rtol=0.16, atol=1e-4) File "/usr/local/lib/python3.5/dist-packages/mxnet/test_utils.py", line 914, in check_numeric_gradient ("NUMERICAL_%s"%name, "BACKWARD_%s"%name)) File "/usr/local/lib/python3.5/dist-packages/mxnet/test_utils.py", line 493, in assert_almost_equal raise AssertionError(msg) AssertionError: Items are not equal: Error 1.203853 exceeds tolerance rtol=0.16, atol=0.000100. Location of maximum error:(1, 2, 1, 1), a=-0.006936, b=-0.008740 NUMERICAL_data: array(-3.2685995 , 4.8455296 ], [ 2.1361113 , -0.71915984]], ... BACKWARD_data: array(-3.2687469 , 4.845755 ], [ 2.135889 , -0.71962786]], ... >> begin captured logging << common: INFO: Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=262813092 to reproduce. 
common: WARNING: *** test-level seed set: all "@with_seed()" tests run deterministically *** common: INFO: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=335962305 to reproduce. - >> end captured logging << - -- Ran 1 test in 2.915s FAILED (failures=1) ```
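`check_numeric_gradient` compares the analytic backward pass against a finite-difference estimate, so results are sensitive to `numeric_eps`: a smaller step reduces truncation error but amplifies floating-point round-off, which is one source of flakiness like the failure above. A self-contained sketch of a central-difference gradient check (an illustration of the idea, not the MXNet utility):

```python
import numpy as np

def numeric_grad(f, x, eps=1e-3):
    # Central-difference estimate of df/dx for a scalar-valued f.
    g = np.zeros_like(x)
    for i in range(x.size):
        old = x.flat[i]
        x.flat[i] = old + eps
        fp = f(x)
        x.flat[i] = old - eps
        fm = f(x)
        x.flat[i] = old  # restore the perturbed entry
        g.flat[i] = (fp - fm) / (2 * eps)
    return g

f = lambda v: (v ** 2).sum()
x = np.array([1.0, -2.0, 3.0])
print(numeric_grad(f, x))  # close to the analytic gradient 2*x
```

For a quadratic the central difference is exact up to round-off; for operators like BatchNorm the estimate degrades near sharp curvature, which is why the test's rtol/atol and eps interact.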
[GitHub] gigasquid commented on a change in pull request #11205: Clojure Contrib Package
gigasquid commented on a change in pull request #11205: Clojure Contrib Package URL: https://github.com/apache/incubator-mxnet/pull/11205#discussion_r199295055 ## File path: contrib/clojure-package/src/org/apache/clojure_mxnet/gen/ndarray.clj ## @@ -0,0 +1,2312 @@ +(ns org.apache.clojure-mxnet.ndarray + (:refer-clojure :exclude [* - + > >= < <= / cast concat flatten identity load max +min repeat reverse set sort take to-array empty shuffle]) + (:import (org.apache.mxnet NDArray Shape))) + +;; Do not edit - this is auto-generated + +;; Licensed to the Apache Software Foundation (ASF) under one or more +;; contributor license agreements. See the NOTICE file distributed with +;; this work for additional information regarding copyright ownership. +;; The ASF licenses this file to You under the Apache License, Version 2.0 +;; (the "License"); you may not use this file except in compliance with +;; the License. You may obtain a copy of the License at +;; +;;http://www.apache.org/licenses/LICENSE-2.0 +;; +;; Unless required by applicable law or agreed to in writing, software +;; distributed under the License is distributed on an "AS IS" BASIS, +;; WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +;; See the License for the specific language governing permissions and +;; limitations under the License. +;; + + + + Review comment: That's a good idea. I added the detail of this to the Clojure Package Contribution Needs - thanks :)
[GitHub] zhreshold closed pull request #11145: [MXNET-517] add sample ratio for ROI Align
zhreshold closed pull request #11145: [MXNET-517] add sample ratio for ROI Align URL: https://github.com/apache/incubator-mxnet/pull/11145 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/src/operator/contrib/bilinear_resize-inl.h b/src/operator/contrib/bilinear_resize-inl.h index b73ead9eba5..c096f014975 100644 --- a/src/operator/contrib/bilinear_resize-inl.h +++ b/src/operator/contrib/bilinear_resize-inl.h @@ -51,9 +51,9 @@ struct BilinearSampleParam : public dmlc::Parameter { int height; int width; DMLC_DECLARE_PARAMETER(BilinearSampleParam) { -DMLC_DECLARE_FIELD(height).set_range(1, 1000) +DMLC_DECLARE_FIELD(height).set_range(1, 1) .describe("output height (required)"); -DMLC_DECLARE_FIELD(width).set_range(1, 1000) +DMLC_DECLARE_FIELD(width).set_range(1, 1) .describe("output width (required)"); } }; diff --git a/src/operator/contrib/roi_align-inl.h b/src/operator/contrib/roi_align-inl.h index 5ac420cc3d4..263f72a6abc 100644 --- a/src/operator/contrib/roi_align-inl.h +++ b/src/operator/contrib/roi_align-inl.h @@ -47,6 +47,7 @@ enum ROIAlignOpOutputs {kOut}; struct ROIAlignParam : public dmlc::Parameter { TShape pooled_size; float spatial_scale; + int sample_ratio; DMLC_DECLARE_PARAMETER(ROIAlignParam) { DMLC_DECLARE_FIELD(pooled_size) .set_expect_ndim(2).enforce_nonzero() @@ -54,6 +55,8 @@ struct ROIAlignParam : public dmlc::Parameter { DMLC_DECLARE_FIELD(spatial_scale).set_range(0.0, 1.0) .describe("Ratio of input feature map height (or w) to raw image height (or w). 
" "Equals the reciprocal of total stride in convolutional layers"); +DMLC_DECLARE_FIELD(sample_ratio).set_default(-1) +.describe("Optional sampling ratio of ROI align, using adaptive size by default."); } }; diff --git a/src/operator/contrib/roi_align.cc b/src/operator/contrib/roi_align.cc index c2cb929966a..22611273cf5 100644 --- a/src/operator/contrib/roi_align.cc +++ b/src/operator/contrib/roi_align.cc @@ -440,8 +440,8 @@ void ROIAlignForwardCompute(const nnvm::NodeAttrs& attrs, DType *top_data = out_data[roialign::kOut].dptr(); ROIAlignForward(count, bottom_data, param.spatial_scale, channels, - height, width, pooled_height, pooled_width, -1, bottom_rois, - rois_cols, top_data); + height, width, pooled_height, pooled_width, param.sample_ratio, + bottom_rois, rois_cols, top_data); }) } @@ -490,7 +490,7 @@ void ROIAlignBackwardCompute(const nnvm::NodeAttrs& attrs, } ROIAlignBackward(count, top_diff, num_rois, param.spatial_scale, channels, height, width, pooled_height, pooled_width, - -1, grad_in, bottom_rois, rois_cols); + param.sample_ratio, grad_in, bottom_rois, rois_cols); } if (kWriteTo == req[roialign::kBox]) { Fill(s, outputs[1], kWriteTo, static_cast(0)); diff --git a/src/operator/contrib/roi_align.cu b/src/operator/contrib/roi_align.cu index 21066ea15fa..d3db70b73b1 100644 --- a/src/operator/contrib/roi_align.cu +++ b/src/operator/contrib/roi_align.cu @@ -231,13 +231,6 @@ __device__ void bilinear_interpolate_gradient( T lx = x - *x_low; T hy = 1. - ly, hx = 1. 
- lx; - // reference in forward - // T v1 = bottom_data[*y_low * width + *x_low]; - // T v2 = bottom_data[*y_low * width + *x_high]; - // T v3 = bottom_data[*y_high * width + *x_low]; - // T v4 = bottom_data[*y_high * width + *x_high]; - // T val = (w1 * v1 + *w2 * v2 + *w3 * v3 + *w4 * v4); - *w1 = hy * hx, *w2 = hy * lx, *w3 = ly * hx, *w4 = ly * lx; return; @@ -341,16 +334,6 @@ __global__ void RoIAlignBackwardKernel( offset_bottom_diff + y_high * width + x_low, static_cast(g3)); atomicAdd( offset_bottom_diff + y_high * width + x_high, static_cast(g4)); - /* - gpu_atomic_add( - static_cast(g1), offset_bottom_diff + y_low * width + x_low); - gpu_atomic_add( - static_cast(g2), offset_bottom_diff + y_low * width + x_high); - gpu_atomic_add( - static_cast(g3), offset_bottom_diff + y_high * width + x_low); - gpu_atomic_add( - static_cast(g4), offset_bottom_diff + y_high * width + x_high); - */ } // if } // ix } // iy @@ -399,7 +382,7 @@ void ROIAlignForwardCompute(const nnvm::NodeAttrs& attrs, width, pooled_height, pooled_width, - -1, + param.sample_ratio, bottom_rois, top_data); }) @@ -467,7 +450,7 @@ void ROIAlignBackwardCompute(const nnvm::NodeAttrs&
[incubator-mxnet] branch master updated: [MXNET-517] add sample ratio for ROI Align (#11145)
This is an automated email from the ASF dual-hosted git repository. zhreshold pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git The following commit(s) were added to refs/heads/master by this push: new e892301 [MXNET-517] add sample ratio for ROI Align (#11145) e892301 is described below commit e8923011523f900b1f7e9f180feecb89c1a1d6e1 Author: Hang Zhang <8041160+zhanghang1...@users.noreply.github.com> AuthorDate: Fri Jun 29 16:23:33 2018 -0600 [MXNET-517] add sample ratio for ROI Align (#11145) * add sample ratio * pylint * increase size limit for bilinearup * add test case * fix typo * rm comments and cpu back --- src/operator/contrib/bilinear_resize-inl.h | 4 ++-- src/operator/contrib/roi_align-inl.h | 3 +++ src/operator/contrib/roi_align.cc | 6 +++--- src/operator/contrib/roi_align.cu | 21 ++--- tests/python/unittest/test_operator.py | 16 +--- 5 files changed, 19 insertions(+), 31 deletions(-) diff --git a/src/operator/contrib/bilinear_resize-inl.h b/src/operator/contrib/bilinear_resize-inl.h index b73ead9..c096f01 100644 --- a/src/operator/contrib/bilinear_resize-inl.h +++ b/src/operator/contrib/bilinear_resize-inl.h @@ -51,9 +51,9 @@ struct BilinearSampleParam : public dmlc::Parameter { int height; int width; DMLC_DECLARE_PARAMETER(BilinearSampleParam) { -DMLC_DECLARE_FIELD(height).set_range(1, 1000) +DMLC_DECLARE_FIELD(height).set_range(1, 1) .describe("output height (required)"); -DMLC_DECLARE_FIELD(width).set_range(1, 1000) +DMLC_DECLARE_FIELD(width).set_range(1, 1) .describe("output width (required)"); } }; diff --git a/src/operator/contrib/roi_align-inl.h b/src/operator/contrib/roi_align-inl.h index 5ac420c..263f72a 100644 --- a/src/operator/contrib/roi_align-inl.h +++ b/src/operator/contrib/roi_align-inl.h @@ -47,6 +47,7 @@ enum ROIAlignOpOutputs {kOut}; struct ROIAlignParam : public dmlc::Parameter { TShape pooled_size; float spatial_scale; + int sample_ratio; DMLC_DECLARE_PARAMETER(ROIAlignParam) { 
DMLC_DECLARE_FIELD(pooled_size) .set_expect_ndim(2).enforce_nonzero() @@ -54,6 +55,8 @@ struct ROIAlignParam : public dmlc::Parameter { DMLC_DECLARE_FIELD(spatial_scale).set_range(0.0, 1.0) .describe("Ratio of input feature map height (or w) to raw image height (or w). " "Equals the reciprocal of total stride in convolutional layers"); +DMLC_DECLARE_FIELD(sample_ratio).set_default(-1) +.describe("Optional sampling ratio of ROI align, using adaptive size by default."); } }; diff --git a/src/operator/contrib/roi_align.cc b/src/operator/contrib/roi_align.cc index c2cb929..2261127 100644 --- a/src/operator/contrib/roi_align.cc +++ b/src/operator/contrib/roi_align.cc @@ -440,8 +440,8 @@ void ROIAlignForwardCompute(const nnvm::NodeAttrs& attrs, DType *top_data = out_data[roialign::kOut].dptr(); ROIAlignForward(count, bottom_data, param.spatial_scale, channels, - height, width, pooled_height, pooled_width, -1, bottom_rois, - rois_cols, top_data); + height, width, pooled_height, pooled_width, param.sample_ratio, + bottom_rois, rois_cols, top_data); }) } @@ -490,7 +490,7 @@ void ROIAlignBackwardCompute(const nnvm::NodeAttrs& attrs, } ROIAlignBackward(count, top_diff, num_rois, param.spatial_scale, channels, height, width, pooled_height, pooled_width, - -1, grad_in, bottom_rois, rois_cols); + param.sample_ratio, grad_in, bottom_rois, rois_cols); } if (kWriteTo == req[roialign::kBox]) { Fill(s, outputs[1], kWriteTo, static_cast(0)); diff --git a/src/operator/contrib/roi_align.cu b/src/operator/contrib/roi_align.cu index 21066ea..d3db70b 100644 --- a/src/operator/contrib/roi_align.cu +++ b/src/operator/contrib/roi_align.cu @@ -231,13 +231,6 @@ __device__ void bilinear_interpolate_gradient( T lx = x - *x_low; T hy = 1. - ly, hx = 1. 
- lx; - // reference in forward - // T v1 = bottom_data[*y_low * width + *x_low]; - // T v2 = bottom_data[*y_low * width + *x_high]; - // T v3 = bottom_data[*y_high * width + *x_low]; - // T v4 = bottom_data[*y_high * width + *x_high]; - // T val = (w1 * v1 + *w2 * v2 + *w3 * v3 + *w4 * v4); - *w1 = hy * hx, *w2 = hy * lx, *w3 = ly * hx, *w4 = ly * lx; return; @@ -341,16 +334,6 @@ __global__ void RoIAlignBackwardKernel( offset_bottom_diff + y_high * width + x_low, static_cast(g3)); atomicAdd( offset_bottom_diff + y_high * width + x_high, static_cast(g4)); - /* - gpu_atomic_add( - static_cast(g1), offset_bottom_diff + y_low * width + x_low); - gpu_atomic_add( -
[GitHub] zhreshold closed issue #11077: The sampling_ratio in roi-align can be specified(not adaptive size)!
zhreshold closed issue #11077: The sampling_ratio in roi-align can be specified(not adaptive size)! URL: https://github.com/apache/incubator-mxnet/issues/11077 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
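For readers of the merged change above: `sample_ratio` controls how many bilinear sampling points ROI Align takes per pooled bin, and `-1` (the new default) preserves the previous adaptive behavior. A minimal sketch of that selection rule, assuming the semantics from the ROI Align paper (the function name `sampling_grid` is illustrative, not part of the operator):

```python
import math

def sampling_grid(roi_size, pooled_size, sample_ratio=-1):
    """Bilinear sample points per output bin along one axis (ROI Align).

    A positive sample_ratio fixes the count explicitly; -1 (the default
    added by this PR) falls back to an adaptive count based on bin size.
    """
    if sample_ratio > 0:
        return sample_ratio
    # adaptive: roughly one sample per unit of bin size, at least 1
    return max(1, math.ceil(roi_size / pooled_size))

# e.g. a 14-unit ROI edge pooled to 7 bins samples 2 points per bin
# adaptively, while sample_ratio=4 would force 4.
```

With this change the parameter should be reachable on the operator itself, e.g. `mx.nd.contrib.ROIAlign(data, rois, pooled_size=(7, 7), spatial_scale=0.0625, sample_ratio=2)`.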
[GitHub] frankfliu commented on issue #11496: Selu Operator not present in MXNet
frankfliu commented on issue #11496: Selu Operator not present in MXNet URL: https://github.com/apache/incubator-mxnet/issues/11496#issuecomment-401485460 Hi @anirudhacharya, thanks for submitting the issue. @sandeep-krishnamurthy requesting this be labeled.
[GitHub] frankfliu commented on issue #11501: Tracking issue for tests with fixed random seed
frankfliu commented on issue #11501: Tracking issue for tests with fixed random seed URL: https://github.com/apache/incubator-mxnet/issues/11501#issuecomment-401485304 Thank you for submitting the issue! @sandeep-krishnamurthy requesting this be labeled under feature request.
[GitHub] vdantu commented on issue #9864: Flaky hanging test_operator.test_laop_3
vdantu commented on issue #9864: Flaky hanging test_operator.test_laop_3 URL: https://github.com/apache/incubator-mxnet/issues/9864#issuecomment-401485090 @marcoabreu @KellenSunderland : Please suggest how we could go about narrowing this down and resolving it.
[GitHub] frankfliu commented on issue #11495: Bug for matrices of multiple dimension, with one dimension much larger
frankfliu commented on issue #11495: Bug for matrices of multiple dimension, with one dimension much larger URL: https://github.com/apache/incubator-mxnet/issues/11495#issuecomment-401485161 Hi @altosaar, thanks for submitting the issue. @sandeep-krishnamurthy requesting this be labeled.
[GitHub] szha commented on issue #11482: make gluon rnn layers hybrid blocks
szha commented on issue #11482: make gluon rnn layers hybrid blocks URL: https://github.com/apache/incubator-mxnet/pull/11482#issuecomment-401484871 @piiswrong turns out I cannot do that only in infer_shape. The reason is that sometimes the block is used as a child block of other blocks, in which case infer_shape is called from the parent, bypassing the code path in the RNN infer_shape.
[GitHub] frankfliu commented on issue #11497: Feature request: where operator should support scalar operand
frankfliu commented on issue #11497: Feature request: where operator should support scalar operand URL: https://github.com/apache/incubator-mxnet/issues/11497#issuecomment-401484809 Thank you for submitting the request! We are labeling it so MXNet community members can help resolve it.
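Until the operator supports a scalar operand natively, the usual workaround is to broadcast the scalar to a full array before the element-wise select. A framework-agnostic sketch of the requested semantics (plain Python, not MXNet code; `where_with_scalar` is a hypothetical name):

```python
def where_with_scalar(cond, x, y):
    # broadcast any scalar operand to the length of `cond`, mimicking
    # what a scalar-aware `where` would do internally
    x = x if isinstance(x, list) else [x] * len(cond)
    y = y if isinstance(y, list) else [y] * len(cond)
    return [xi if c else yi for c, xi, yi in zip(cond, x, y)]

# where_with_scalar([True, False, True], [1, 2, 3], 0) returns [1, 0, 3]
```

In MXNet today the same effect is obtained by materializing the scalar with something like `mx.nd.full(cond.shape, value)` before calling `where`.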
[GitHub] haojin2 commented on issue #10920: Flaky test test_operator_gpu.test_sparse_dot
haojin2 commented on issue #10920: Flaky test test_operator_gpu.test_sparse_dot URL: https://github.com/apache/incubator-mxnet/issues/10920#issuecomment-401483849 @eric-haibin-lin should be fixed at this time.
[GitHub] ankkhedia edited a comment on issue #9116: unit test test_ndarray.test_ndarray_indexing fails
ankkhedia edited a comment on issue #9116: unit test test_ndarray.test_ndarray_indexing fails URL: https://github.com/apache/incubator-mxnet/issues/9116#issuecomment-399574168 Couldn't reproduce the issue for 1 runs on the setup mentioned in the issue. @zheng-da @sandeep-krishnamurthy Should we close this issue, as I have not seen the test failing in recent CI runs, or would you suggest something else to try?
[GitHub] lanking520 commented on a change in pull request #11205: Clojure Contrib Package
lanking520 commented on a change in pull request #11205: Clojure Contrib Package URL: https://github.com/apache/incubator-mxnet/pull/11205#discussion_r199287183 ## File path: contrib/clojure-package/src/org/apache/clojure_mxnet/gen/ndarray.clj ## @@ -0,0 +1,2312 @@ +(ns org.apache.clojure-mxnet.ndarray + (:refer-clojure :exclude [* - + > >= < <= / cast concat flatten identity load max +min repeat reverse set sort take to-array empty shuffle]) + (:import (org.apache.mxnet NDArray Shape))) + +;; Do not edit - this is auto-generated + +;; Licensed to the Apache Software Foundation (ASF) under one or more +;; contributor license agreements. See the NOTICE file distributed with +;; this work for additional information regarding copyright ownership. +;; The ASF licenses this file to You under the Apache License, Version 2.0 +;; (the "License"); you may not use this file except in compliance with +;; the License. You may obtain a copy of the License at +;; +;;http://www.apache.org/licenses/LICENSE-2.0 +;; +;; Unless required by applicable law or agreed to in writing, software +;; distributed under the License is distributed on an "AS IS" BASIS, +;; WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +;; See the License for the specific language governing permissions and +;; limitations under the License. +;; Review comment: Currently on the Scala side, @nswamy proposed a solution: generate the signatures and documentation rather than the entire implementation. Maybe Clojure can follow a similar approach. We can treat this as a future feature.
[GitHub] azai91 commented on issue #11466: [MXNET-560] Add temperature parameter in Softmax operator
azai91 commented on issue #11466: [MXNET-560] Add temperature parameter in Softmax operator URL: https://github.com/apache/incubator-mxnet/pull/11466#issuecomment-401480335 @pengzhao-intel @zheng-da do you know if MKLDNN supports the temperature parameter? If not, we will just have to fall back to FCompute.
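For context on what the temperature parameter does: the operator computes `softmax(x / T)`, so a fallback FCompute path only needs to scale the logits before the usual exponentiation. A minimal reference sketch (plain Python, not the MXNet or MKL-DNN implementation):

```python
import math

def softmax(logits, temperature=1.0):
    # scale by 1/T, then apply the numerically stable softmax
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max to avoid overflow in exp
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# temperature > 1 flattens the distribution; temperature < 1 sharpens it
```

Whether MKL-DNN's softmax primitive accepts such a scale directly is exactly the open question in the comment above; if it does not, scaling the input tensor first is equivalent.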
[GitHub] anooprh closed issue #11500: Unable to build docker image locally
anooprh closed issue #11500: Unable to build docker image locally URL: https://github.com/apache/incubator-mxnet/issues/11500
[GitHub] anooprh commented on issue #11500: Unable to build docker image locally
anooprh commented on issue #11500: Unable to build docker image locally URL: https://github.com/apache/incubator-mxnet/issues/11500#issuecomment-401476755 My bad. This issue got fixed after I allocated more memory to Docker (4 GB) and more swap (2 GB). Closing the issue.
[GitHub] szha commented on a change in pull request #11482: make gluon rnn layers hybrid blocks
szha commented on a change in pull request #11482: make gluon rnn layers hybrid blocks URL: https://github.com/apache/incubator-mxnet/pull/11482#discussion_r199282336 ## File path: python/mxnet/gluon/rnn/rnn_layer.py ## @@ -173,67 +177,63 @@ def begin_state(self, batch_size=0, func=ndarray.zeros, **kwargs): states.append(func(name='%sh0_%d'%(self.prefix, i), **info)) return states -def forward(self, inputs, states=None): -batch_size = inputs.shape[self._layout.find('N')] +def hybrid_forward(self, F, inputs, states=None, **kwargs): +if F is ndarray: +batch_size = inputs.shape[self._layout.find('N')] +if self._input_size == 0: Review comment: or did you mean overriding block's infer shape? I'm taking a look
[GitHub] szha commented on a change in pull request #11482: make gluon rnn layers hybrid blocks
szha commented on a change in pull request #11482: make gluon rnn layers hybrid blocks URL: https://github.com/apache/incubator-mxnet/pull/11482#discussion_r199281972 ## File path: python/mxnet/gluon/rnn/rnn_layer.py ## @@ -173,67 +177,63 @@ def begin_state(self, batch_size=0, func=ndarray.zeros, **kwargs): states.append(func(name='%sh0_%d'%(self.prefix, i), **info)) return states -def forward(self, inputs, states=None): -batch_size = inputs.shape[self._layout.find('N')] +def hybrid_forward(self, F, inputs, states=None, **kwargs): +if F is ndarray: +batch_size = inputs.shape[self._layout.find('N')] +if self._input_size == 0: +for i in range(self._dir): +self.i2h_weight[i].shape = (self._gates*self._hidden_size, inputs.shape[2]) +self.i2h_weight[i]._finish_deferred_init() skip_states = states is None if skip_states: -states = self.begin_state(batch_size, ctx=inputs.context) -if isinstance(states, ndarray.NDArray): +if F is ndarray: +states = self.begin_state(batch_size, ctx=inputs.context) +else: +states = self.begin_state(0, func=symbol.zeros) +if isinstance(states, (ndarray.NDArray, symbol.Symbol)): states = [states] -for state, info in zip(states, self.state_info(batch_size)): -if state.shape != info['shape']: -raise ValueError( -"Invalid recurrent state shape. Expecting %s, got %s."%( -str(info['shape']), str(state.shape))) -if self._input_size == 0: -for i in range(self._dir): -self.i2h_weight[i].shape = (self._gates*self._hidden_size, inputs.shape[2]) -self.i2h_weight[i]._finish_deferred_init() -if inputs.context.device_type == 'gpu' or \ - self._mode in ['lstm', 'gru'] and not self._dropout: -out = self._forward_kernel(inputs, states) -else: -out = self._forward(inputs, states) +if F is ndarray: +for state, info in zip(states, self.state_info(batch_size)): +if state.shape != info['shape']: +raise ValueError( +"Invalid recurrent state shape. 
Expecting %s, got %s."%( +str(info['shape']), str(state.shape))) +out = self._forward_kernel(F, inputs, states, **kwargs) # out is (output, state) return out[0] if skip_states else out -def _forward(self, inputs, states): -"""forward using gluon cell""" -ns = len(states) -axis = self._layout.find('T') -states = sum(zip(*((j for j in i) for i in states)), ()) -outputs, states = self._unfused.unroll( -inputs.shape[axis], inputs, states, -layout=self._layout, merge_outputs=True) -new_states = [] -for i in range(ns): -state = ndarray.concat(*(j.reshape((1,)+j.shape) for j in states[i::ns]), dim=0) -new_states.append(state) - -return outputs, new_states - -def _forward_kernel(self, inputs, states): +def __call__(self, inputs, *states): Review comment: this is not possible due to the inverse shape inference in concat.
[GitHub] szha commented on a change in pull request #11482: make gluon rnn layers hybrid blocks
szha commented on a change in pull request #11482: make gluon rnn layers hybrid blocks URL: https://github.com/apache/incubator-mxnet/pull/11482#discussion_r199281929 ## File path: python/mxnet/gluon/rnn/rnn_layer.py ## @@ -173,67 +177,63 @@ def begin_state(self, batch_size=0, func=ndarray.zeros, **kwargs): states.append(func(name='%sh0_%d'%(self.prefix, i), **info)) return states -def forward(self, inputs, states=None): -batch_size = inputs.shape[self._layout.find('N')] +def hybrid_forward(self, F, inputs, states=None, **kwargs): +if F is ndarray: +batch_size = inputs.shape[self._layout.find('N')] +if self._input_size == 0: Review comment: this is not possible due to the inverse shape inference in concat.
[GitHub] aaronmarkham commented on issue #11287: [MXNET-548] fixed path for auto_module_index.js
aaronmarkham commented on issue #11287: [MXNET-548] fixed path for auto_module_index.js URL: https://github.com/apache/incubator-mxnet/pull/11287#issuecomment-401475808 @szha rebased and passed!
[GitHub] zhreshold closed pull request #11434: add ignore_reinit to initialize to skip warnings
zhreshold closed pull request #11434: add ignore_reinit to initialize to skip warnings URL: https://github.com/apache/incubator-mxnet/pull/11434 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/python/mxnet/gluon/block.py b/python/mxnet/gluon/block.py index 0ef28496c20..776592de6d7 100644 --- a/python/mxnet/gluon/block.py +++ b/python/mxnet/gluon/block.py @@ -478,7 +478,7 @@ def apply(self, fn): return self def initialize(self, init=initializer.Uniform(), ctx=None, verbose=False, - force_reinit=False): + force_reinit=False, ignore_reinit=False): """Initializes :py:class:`Parameter` s of this :py:class:`Block` and its children. Equivalent to ``block.collect_params().initialize(...)`` @@ -493,8 +493,10 @@ def initialize(self, init=initializer.Uniform(), ctx=None, verbose=False, Whether to verbosely print out details on initialization. force_reinit : bool, default False Whether to force re-initialization if parameter is already initialized. +ignore_reinit : bool, default False +Whether to ignore re-initialization warning if `force_reinit` is not True. """ -self.collect_params().initialize(init, ctx, verbose, force_reinit) +self.collect_params().initialize(init, ctx, verbose, force_reinit, ignore_reinit) def hybridize(self, active=True, **kwargs): """Activates or deactivates :py:class:`HybridBlock` s recursively. 
Has no effect on diff --git a/python/mxnet/gluon/parameter.py b/python/mxnet/gluon/parameter.py index 0c6aae92135..4edd0377d5b 100644 --- a/python/mxnet/gluon/parameter.py +++ b/python/mxnet/gluon/parameter.py @@ -323,7 +323,7 @@ def _reduce(self): return data def initialize(self, init=None, ctx=None, default_init=initializer.Uniform(), - force_reinit=False): + force_reinit=False, ignore_reinit=False): """Initializes parameter and gradient arrays. Only used for :py:class:`NDArray` API. Parameters @@ -344,6 +344,8 @@ def initialize(self, init=None, ctx=None, default_init=initializer.Uniform(), and :py:meth:`Parameter.init` are ``None``. force_reinit : bool, default False Whether to force re-initialization if parameter is already initialized. +ignore_reinit : bool, default False +Whether to ignore re-initialization warning if `force_reinit` is not True. Examples @@ -368,9 +370,10 @@ def initialize(self, init=None, ctx=None, default_init=initializer.Uniform(), """ if self._data is not None and not force_reinit: -warnings.warn("Parameter '%s' is already initialized, ignoring. " \ - "Set force_reinit=True to re-initialize."%self.name, - stacklevel=2) +if not ignore_reinit: +warnings.warn("Parameter '%s' is already initialized, ignoring. " \ + "Set force_reinit=True to re-initialize."%self.name, + stacklevel=2) return self._data = self._grad = None @@ -789,7 +792,7 @@ def update(self, other): self._params[k] = v def initialize(self, init=initializer.Uniform(), ctx=None, verbose=False, - force_reinit=False): + force_reinit=False, ignore_reinit=False): """Initializes all Parameters managed by this dictionary to be used for :py:class:`NDArray` API. It has no effect when using :py:class:`Symbol` API. @@ -804,11 +807,13 @@ def initialize(self, init=initializer.Uniform(), ctx=None, verbose=False, Whether to verbosely print out details on initialization. force_reinit : bool, default False Whether to force re-initialization if parameter is already initialized. 
+ignore_reinit : bool, default False +Whether to ignore re-initialization warning if `force_reinit` is not True. """ if verbose: init.set_verbosity(verbose=verbose) for _, v in self.items(): -v.initialize(None, ctx, init, force_reinit=force_reinit) +v.initialize(None, ctx, init, force_reinit=force_reinit, ignore_reinit=ignore_reinit) def zero_grad(self): """Sets all Parameters' gradient buffer to 0.""" diff --git a/tests/python/unittest/test_gluon.py b/tests/python/unittest/test_gluon.py index cd3cc685bdd..1f844691eaf 100644 --- a/tests/python/unittest/test_gluon.py +++ b/tests/python/unittest/test_gluon.py @@ -1360,6 +1360,29 @@ def test_hybrid_static_memory_recording():
[GitHub] szha closed pull request #11499: Test for new int64 type in CSVIter
szha closed pull request #11499: Test for new int64 type in CSVIter URL: https://github.com/apache/incubator-mxnet/pull/11499 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/tests/python/unittest/test_io.py b/tests/python/unittest/test_io.py index dbd327d429f..4dfa69cc105 100644 --- a/tests/python/unittest/test_io.py +++ b/tests/python/unittest/test_io.py @@ -317,6 +317,8 @@ def check_CSVIter_synthetic(dtype='float32'): entry_str = '1' if dtype is 'int32': entry_str = '20001' +if dtype is 'int64': +entry_str = '2147483648' with open(data_path, 'w') as fout: for i in range(1000): fout.write(','.join([entry_str for _ in range(8*8)]) + '\n') @@ -332,7 +334,7 @@ def check_CSVIter_synthetic(dtype='float32'): assert_almost_equal(data_batch.asnumpy(), expected.asnumpy()) assert data_batch.asnumpy().dtype == expected.asnumpy().dtype -for dtype in ['int32', 'float32']: +for dtype in ['int32', 'int64', 'float32']: check_CSVIter_synthetic(dtype=dtype) @unittest.skip("Flaky test: https://github.com/apache/incubator-mxnet/issues/11359")
[GitHub] lupesko commented on issue #11463: Removing FlakyTest: memory_test.cc
lupesko commented on issue #11463: Removing FlakyTest: memory_test.cc URL: https://github.com/apache/incubator-mxnet/pull/11463#issuecomment-401473248 Adding @cjolivier01, who is the author of these tests.
[GitHub] haojin2 opened a new issue #11501: Tracking issue for tests with fixed random seed
haojin2 opened a new issue #11501: Tracking issue for tests with fixed random seed URL: https://github.com/apache/incubator-mxnet/issues/11501 @eric-haibin-lin and @azai91 discovered that some unit tests have been using a fixed seed to mask flakiness; opening this issue to keep track of those tests. Ideally, no test should be flaky with any seed. Tests with fixed seed:
- test_convolution_with_type
- test_bilinear_sampler_with_type
- test_spatial_transformer_with_type
- test_pooling_with_type
- test_psroipooling_with_type
- test_deformable_psroipooling_with_type
- test_deformable_convolution_with_type
- test_not_ok_with_random_data
- test_ce_loss
- test_bce_loss
- test_kl_loss
- test_l2_loss
- test_l1_loss
- test_ctc_loss
- test_ctc_loss_train
- test_sample_weight_loss
- test_saveload
- test_huber_loss
- test_hinge_loss
- test_squared_hinge_loss
- test_triplet_loss
- test_dot
- test_roipooling
- test_l2_normalization
- test_bilinear_sampler
- test_dropout
- test_nadam

Please refer to this issue if you're providing a fix to any of these tests.
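To illustrate why fixed seeds mask flakiness, here is a self-contained sketch (plain Python, not MXNet test code): with a fixed seed, a fragile check sees the same random draw on every CI run, so it either always passes or always fails, and a too-tight tolerance is never surfaced.

```python
import random

def fragile_check(seed=None):
    # stand-in for a numerical test whose tolerance only holds for
    # about half of all random draws
    rng = random.Random(seed)
    return rng.random() < 0.5

# fixed seed: identical outcome on every run, so flakiness stays hidden;
# fresh seeds: the same check passes or fails intermittently
assert fragile_check(seed=0) == fragile_check(seed=0)
```

This is why the issue asks for tests to pass under arbitrary seeds rather than pinning one that happens to work.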
[GitHub] anirudh2290 commented on issue #11470: Fix build issue with USE_CUDNN=0
anirudh2290 commented on issue #11470: Fix build issue with USE_CUDNN=0 URL: https://github.com/apache/incubator-mxnet/pull/11470#issuecomment-401468703 @szha Yep, planning to fix that.
[GitHub] haojin2 commented on a change in pull request #10889: [MXNET-382] Shape and Size Operator
haojin2 commented on a change in pull request #10889: [MXNET-382] Shape and Size Operator URL: https://github.com/apache/incubator-mxnet/pull/10889#discussion_r199275013
## File path: src/operator/tensor/elemwise_unary_op_basic.cc ##
@@ -398,6 +398,98 @@ NNVM_REGISTER_OP(reshape_like)
 .add_argument("lhs", "NDArray-or-Symbol", "First input.")
 .add_argument("rhs", "NDArray-or-Symbol", "Second input.");
 
+void ShapeComputeCPU(const nnvm::NodeAttrs& attrs,
+                     const OpContext& ctx,

Review comment: nit: fix the alignment of all ComputeXPU functions
[GitHub] zhreshold commented on issue #11434: add ignore_reinit to initialize to skip warnings
zhreshold commented on issue #11434: add ignore_reinit to initialize to skip warnings URL: https://github.com/apache/incubator-mxnet/pull/11434#issuecomment-401468140 Yes, I agree that it's not good practice to add an option that disables a warning. In this particular case, however, if users acknowledge that they understand the risk of re-initializing already-initialized parameters, there is no need to keep generating warnings; for a deep network this propagates hundreds of warnings and suddenly floods the entire screen.
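The flood of duplicate warnings described above can be sketched with Python's `warnings` module. Here `initialize_param` and the `ignore_reinit` flag are hypothetical stand-ins for the proposed Gluon behavior, not the actual API:

```python
import warnings

def initialize_param(name, force_reinit=False, ignore_reinit=False):
    # Hypothetical: warn once per already-initialized parameter unless suppressed.
    if not force_reinit and not ignore_reinit:
        warnings.warn("Parameter '%s' is already initialized, ignoring." % name,
                      UserWarning, stacklevel=2)

# Without suppression: one warning per parameter of a deep network.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    for i in range(100):
        initialize_param("dense%d_weight" % i)
print(len(caught))  # 100 warnings flood the screen

# With the opt-out flag: silence, because the user accepted the risk.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    for i in range(100):
        initialize_param("dense%d_weight" % i, ignore_reinit=True)
print(len(caught))  # 0
```

Note that Python's default warning filter would already deduplicate identical messages per location, but each parameter name produces a distinct message, which is why hundreds still appear without an explicit opt-out.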
[GitHub] piiswrong commented on a change in pull request #11482: make gluon rnn layers hybrid blocks
piiswrong commented on a change in pull request #11482: make gluon rnn layers hybrid blocks URL: https://github.com/apache/incubator-mxnet/pull/11482#discussion_r199273106
## File path: python/mxnet/gluon/rnn/rnn_layer.py ##
@@ -173,67 +177,63 @@ def begin_state(self, batch_size=0, func=ndarray.zeros, **kwargs):
             states.append(func(name='%sh0_%d'%(self.prefix, i), **info))
         return states
 
-    def forward(self, inputs, states=None):
-        batch_size = inputs.shape[self._layout.find('N')]
+    def hybrid_forward(self, F, inputs, states=None, **kwargs):
+        if F is ndarray:
+            batch_size = inputs.shape[self._layout.find('N')]
+            if self._input_size == 0:

Review comment: implement infershape instead?
[GitHub] piiswrong commented on a change in pull request #11482: make gluon rnn layers hybrid blocks
piiswrong commented on a change in pull request #11482: make gluon rnn layers hybrid blocks URL: https://github.com/apache/incubator-mxnet/pull/11482#discussion_r199273220
## File path: python/mxnet/gluon/rnn/rnn_layer.py ##
@@ -173,67 +177,63 @@ def begin_state(self, batch_size=0, func=ndarray.zeros, **kwargs):
             states.append(func(name='%sh0_%d'%(self.prefix, i), **info))
         return states
 
-    def forward(self, inputs, states=None):
-        batch_size = inputs.shape[self._layout.find('N')]
+    def hybrid_forward(self, F, inputs, states=None, **kwargs):
+        if F is ndarray:
+            batch_size = inputs.shape[self._layout.find('N')]
+            if self._input_size == 0:
+                for i in range(self._dir):
+                    self.i2h_weight[i].shape = (self._gates*self._hidden_size, inputs.shape[2])
+                    self.i2h_weight[i]._finish_deferred_init()
         skip_states = states is None
         if skip_states:
-            states = self.begin_state(batch_size, ctx=inputs.context)
-        if isinstance(states, ndarray.NDArray):
+            if F is ndarray:
+                states = self.begin_state(batch_size, ctx=inputs.context)
+            else:
+                states = self.begin_state(0, func=symbol.zeros)
+        if isinstance(states, (ndarray.NDArray, symbol.Symbol)):

Review comment: (ndarray.NDArray, symbol.Symbol) -> tensor_types
[GitHub] anirudhacharya edited a comment on issue #10889: [MXNET-382] Shape and Size Operator
anirudhacharya edited a comment on issue #10889: [MXNET-382] Shape and Size Operator URL: https://github.com/apache/incubator-mxnet/pull/10889#issuecomment-401413402 @haojin2 @piiswrong can this be merged?
[GitHub] haojin2 commented on a change in pull request #11466: [MXNET-560] Add temperature parameter in Softmax operator
haojin2 commented on a change in pull request #11466: [MXNET-560] Add temperature parameter in Softmax operator URL: https://github.com/apache/incubator-mxnet/pull/11466#discussion_r199268467
## File path: src/operator/nn/softmax-inl.h ##
@@ -53,7 +53,7 @@ struct log_softmax_fwd {
 template inline void Softmax(Stream *s, DType *in, DType *out,
-    Shape shape, int axis) {
+    Shape shape, int axis, const float temperature) {

Review comment: This is a parameter, so it should have a specific type. To ensure the highest precision, I think we should take this parameter in as a double and cast it to DType at runtime.
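The effect of the temperature parameter under discussion can be checked numerically. A minimal numpy sketch (an illustration of the math, not the C++ kernel under review):

```python
import numpy as np

def softmax(x, axis=-1, temperature=1.0):
    # Divide the logits by the temperature before the usual exp/normalize.
    z = x / temperature
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

logits = np.array([1.0, 2.0, 3.0])
print(softmax(logits))                    # sharper distribution at T=1
print(softmax(logits, temperature=10.0))  # flatter, closer to uniform at high T
```

Higher temperature flattens the distribution toward uniform; temperatures below 1 sharpen it toward argmax, which is why precision of the division matters.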
[GitHub] anooprh opened a new issue #11500: Unable to build docker image locally
anooprh opened a new issue #11500: Unable to build docker image locally URL: https://github.com/apache/incubator-mxnet/issues/11500 Note: Providing complete information in the most concise form is the best way to get help. This issue template serves as the checklist for essential information for most technical issues and bug reports. For non-technical issues and feature requests, feel free to present the information in what you believe is the best form. For Q & A and discussion, please start a discussion thread at https://discuss.mxnet.io

## Description
I'm trying to set up a development environment and am unable to build the docker image locally. I'm trying to learn (i.e. trying to set myself up for contributions) and AFAIK building the python CPU image is a quick sanity check. Please correct me if there is any other way to check that my repo is in a state to start contributing.

## Environment info (Required)
```
What to do:
1. Clone the `apache/incubator-mxnet` repository
2. `cd docker`
3. `./tool.sh build python cpu`
```
Package used (Python/R/Scala/Julia): (I'm trying to use the Python CPU package)

## Build info (Required if built from source)
Compiler (gcc/clang/mingw/visual studio): Docker image
MXNet commit hash: 988458f0a7203121ba30f6b50e1acb58d227c219
Build config:
```
% ./tool.sh build python cpu
http://batman.gyptis.org/zerobin/?484ec556dcaf2c5e#4bViUUjHZzxQ4hOh5pBuSKc2FB5NXHnptvYEcWdQd5E=
```

## Error Message:
(Paste the complete error message, including stack trace.)

## Minimum reproducible example
(If you are using your own code, please provide a short script that reproduces the error. Otherwise, please provide link to the existing example.)

## What have you tried to solve it?
1. Upgraded the docker version to `Docker version 18.03.1-ce, build 9ee9f40`
[GitHub] szha commented on issue #11499: Test for new int64 type in CSVIter
szha commented on issue #11499: Test for new int64 type in CSVIter URL: https://github.com/apache/incubator-mxnet/pull/11499#issuecomment-401452682 @haojin2 thanks for adding the test for #11446
[GitHub] haojin2 commented on issue #11499: Test for new int64 type in CSVIter
haojin2 commented on issue #11499: Test for new int64 type in CSVIter URL: https://github.com/apache/incubator-mxnet/pull/11499#issuecomment-401448212 @szha
[GitHub] haojin2 opened a new pull request #11499: Test for new int64 type in CSVIter
haojin2 opened a new pull request #11499: Test for new int64 type in CSVIter URL: https://github.com/apache/incubator-mxnet/pull/11499

## Description ##
The previous PR adding the int64 data type to CSVIter (#11446) was merged without a dedicated test for it.

## Checklist ##
### Essentials ###
- [x] Changes are complete (i.e. I finished coding on this PR)
- [x] All changes have test coverage:
- Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
- Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
- Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
- [x] Code is well-documented:
- For user-facing API changes, API doc string has been updated.
- For new C++ functions in header files, their functionalities and arguments are documented.
- For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
- Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
- [x] To my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

### Changes ###
- [x] Unit test for int64 data type support for CSVIter

## Comments ##
[GitHub] zhanghang1989 commented on issue #11145: [MXNET-517] add sample ratio for ROI Align
zhanghang1989 commented on issue #11145: [MXNET-517] add sample ratio for ROI Align URL: https://github.com/apache/incubator-mxnet/pull/11145#issuecomment-401447551 Any updates or comments, @piiswrong?
[incubator-mxnet] branch master updated: support int64 data type in CSVIter (#11446)
This is an automated email from the ASF dual-hosted git repository. zhasheng pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git

The following commit(s) were added to refs/heads/master by this push:
     new 3a62200  support int64 data type in CSVIter (#11446)

3a62200 is described below

commit 3a62200799d2d75039eb38e186d94361c17d060c
Author: Hao Jin
AuthorDate: Fri Jun 29 15:02:25 2018 -0400

    support int64 data type in CSVIter (#11446)
---
 3rdparty/dmlc-core         |  2 +-
 src/io/image_iter_common.h |  1 +
 src/io/iter_csv.cc         | 16 +++-
 3 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/3rdparty/dmlc-core b/3rdparty/dmlc-core
index dadcd97..649be18 160000
--- a/3rdparty/dmlc-core
+++ b/3rdparty/dmlc-core
@@ -1 +1 @@
-Subproject commit dadcd97fdceb5f395e963b2a637f6ed377f59fc4
+Subproject commit 649be18a8c55c48517861d67158a45dec54992ee

diff --git a/src/io/image_iter_common.h b/src/io/image_iter_common.h
index 5682288..8580ff8 100644
--- a/src/io/image_iter_common.h
+++ b/src/io/image_iter_common.h
@@ -348,6 +348,7 @@ struct PrefetcherParam : public dmlc::Parameter<PrefetcherParam> {
     .add_enum("float32", mshadow::kFloat32)
     .add_enum("float64", mshadow::kFloat64)
     .add_enum("float16", mshadow::kFloat16)
+    .add_enum("int64", mshadow::kInt64)
     .add_enum("int32", mshadow::kInt32)
     .add_enum("uint8", mshadow::kUint8)
     .set_default(dmlc::optional<int>())

diff --git a/src/io/iter_csv.cc b/src/io/iter_csv.cc
index ca3f042..5fd1495 100644
--- a/src/io/iter_csv.cc
+++ b/src/io/iter_csv.cc
@@ -174,15 +174,21 @@ class CSVIter: public IIterator<DataInst> {
     for (const auto& arg : kwargs) {
       if (arg.first == "dtype") {
         dtype_has_value = true;
-        if (arg.second == "int32" || arg.second == "float32") {
-          target_dtype = (arg.second == "int32") ? mshadow::kInt32 : mshadow::kFloat32;
+        if (arg.second == "int32") {
+          target_dtype = mshadow::kInt32;
+        } else if (arg.second == "int64") {
+          target_dtype = mshadow::kInt64;
+        } else if (arg.second == "float32") {
+          target_dtype = mshadow::kFloat32;
         } else {
           CHECK(false) << arg.second << " is not supported for CSVIter";
         }
       }
     }
     if (dtype_has_value && target_dtype == mshadow::kInt32) {
-      iterator_.reset(reinterpret_cast<IIterator<DataInst>*>(new CSVIterTyped<int32_t>()));
+      iterator_.reset(reinterpret_cast<IIterator<DataInst>*>(new CSVIterTyped<int32_t>()));
+    } else if (dtype_has_value && target_dtype == mshadow::kInt64) {
+      iterator_.reset(reinterpret_cast<IIterator<DataInst>*>(new CSVIterTyped<int64_t>()));
     } else if (!dtype_has_value || target_dtype == mshadow::kFloat32) {
       iterator_.reset(reinterpret_cast<IIterator<DataInst>*>(new CSVIterTyped<float>()));
     }
@@ -229,8 +235,8 @@ If ``data_csv = 'data/'`` is set, then all the files in this directory will be r
 
 ``reset()`` is expected to be called only after a complete pass of data.
 
 By default, the CSVIter parses all entries in the data file as float32 data type,
-if `dtype` argument is set to be 'int32' then CSVIter will parse all entries in the file
-as int32 data type.
+if `dtype` argument is set to be 'int32' or 'int64' then CSVIter will parse all entries in the file
+as int32 or int64 data type accordingly.
 
 Examples::
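One motivation for the int64 dtype is worth spelling out: large integer values in a CSV cannot round-trip through float32, the previous default. A small numpy sketch (illustrative, separate from the C++ change above):

```python
import numpy as np

# A 64-bit integer larger than 2**24 cannot be represented exactly in
# float32, whose mantissa carries only 24 bits of precision.
big_id = np.int64(123456789012345)

as_float32 = np.float32(big_id)  # lossy narrowing conversion
recovered = np.int64(as_float32)

print(recovered == big_id)       # False: float32 rounded the value away
print(np.int64(big_id) == big_id)  # True: int64 keeps it exact
```

This is why ID-like columns need `dtype='int64'` rather than the float32 default.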
[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.
This is an automated email from the ASF dual-hosted git repository. zhasheng pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git

The following commit(s) were added to refs/heads/asf-site by this push:
     new a6d2a6c  Bump the publish timestamp.

a6d2a6c is described below

commit a6d2a6cd52462ffdd64c9c065d22759c27c459a0
Author: mxnet-ci
AuthorDate: Fri Jun 29 18:38:31 2018 +0000

    Bump the publish timestamp.
---
 date.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/date.txt b/date.txt
new file mode 100644
index 000..5806464
--- /dev/null
+++ b/date.txt
@@ -0,0 +1 @@
+Fri Jun 29 18:38:31 UTC 2018
[incubator-mxnet] branch master updated: update docs: deps using CI scripts and other clarifications (#11431)
This is an automated email from the ASF dual-hosted git repository. zhasheng pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git

The following commit(s) were added to refs/heads/master by this push:
     new 275cd8e  update docs: deps using CI scripts and other clarifications (#11431)

275cd8e is described below

commit 275cd8e10a7f3141d70b589081909159aeba5e6d
Author: Aaron Markham
AuthorDate: Fri Jun 29 11:32:46 2018 -0700

    update docs: deps using CI scripts and other clarifications (#11431)

    * update dependencies using CI scripts; clarifications

    * nudging flaky ci test
---
 docs/README.md                   | 16 +--
 docs/build_version_doc/README.md | 93 ++--
 2 files changed, 34 insertions(+), 75 deletions(-)

diff --git a/docs/README.md b/docs/README.md
index 42e068c..db64bf0 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -1,4 +1,4 @@
-# MXNet Documentation
+# Building and Updating MXNet Documentation
 
 The website is hosted at http://mxnet.incubator.apache.org/. http://mxnet.io redirects to this site and advised to use links with http://mxnet.incubator.apache.org/ instead of http://mxnet.io/.
@@ -8,9 +8,11 @@ MXNet Documentation Website is built with [Sphinx](http://www.sphinx-doc.org) an
 
 ## How to Build the MXNet Website for Development and QA
 
-* [Dependencies](https://github.com/apache/incubator-mxnet/tree/master/docs/build_version_doc#dependencies)
-* [Developer Build Instructions](https://github.com/apache/incubator-mxnet/tree/master/docs/build_version_doc#developer-instructions)
-* [Full Site Build Instructions](https://github.com/apache/incubator-mxnet/tree/master/docs/build_version_doc#full-website-build)
+Using `make docs` from the MXNet root is the quickest way to generate the MXNet API docs and the website. This method automatically generates each API, [except the Perl and R APIs](#other-build-processes).
+
+* [Dependencies](https://github.com/apache/incubator-mxnet/tree/master/docs/build_version_doc#dependencies) - required before you do any building of the docs
+* [Developer Build Instructions](https://github.com/apache/incubator-mxnet/tree/master/docs/build_version_doc#developer-instructions) - build your local branch
+* [Full Site Build Instructions](https://github.com/apache/incubator-mxnet/tree/master/docs/build_version_doc#full-website-build) - build the latest commits to the official branches
 
 ## File Structure
@@ -51,6 +53,12 @@ The host repo is hooked with [Apache gitbox](https://gitbox.apache.org/repos/asf
 
 **IMPORTANT**: Refer to [Full Site Build Instructions](https://github.com/apache/incubator-mxnet/tree/master/docs/build_version_doc#full-website-build) for a working site build with the versions dropdown in the UI.
 
+## Other Build Processes
+
+* Perl API docs are maintained separately at [metacpan](https://metacpan.org/release/AI-MXNet).
+* R API docs building must be triggered manually. The function for generating these automatically was disabled in the nightly builds. You may run the R docs build process in a local docs build by uncommenting the [function call in mxdoc.py](https://github.com/apache/incubator-mxnet/blob/master/docs/mxdoc.py#L378).
+
 ## Troubleshooting
 
 - If C++ code has been changed, remove the previous results to trigger the rebuild for all pages. To do this, run `make clean_docs`.

diff --git a/docs/build_version_doc/README.md b/docs/build_version_doc/README.md
index 4fd2c10..d25d163 100644
--- a/docs/build_version_doc/README.md
+++ b/docs/build_version_doc/README.md
@@ -2,7 +2,8 @@
 This folder contains a variety of scripts to generate the MXNet.io website as well as the docs for different versions of MXNet.
 
-## Contents
+## Contents of the build_version_doc Folder
+
 * [AddPackageLink.py](AddPackageLink.py) - MXNet.io site data massaging; injects pip version numbers in the different versions' install pages
 * [AddVersion.py](AddVersion.py) - MXNet.io site data massaging; injects the versions dropdown menu in the navigation bar
 * [build_site_tag.sh](build_site_tag.sh) - takes version tags as input and generates static html; calls `build_all_version.sh` and `update_all_version.sh`
@@ -13,7 +14,9 @@ This folder contains a variety of scripts to generate the MXNet.io website as we
 
 ## Setting Up a Docs Dev Server
 
-For these instructions, you will use an Ubuntu machine. This flow has been tested on a [Deep Learning Base AMI](https://aws.amazon.com/marketplace/pp/B077GCZ4GR), although you may use the full Deep Learning Base AMI or any other Ubuntu 16.04 system with some minor adjustments.
+Running docs builds locally on a Mac is not recommended. For these instructions, you will use an Ubuntu machine.
+
+This flow has been tested on a vanilla Ubuntu 16.04 cloud instance on AWS.
 
 **Step 1:** Spin up your Ubuntu server and SSH in.
@@ -30,29 +33,37 @@ source mxnet_docs/bin/activate
[GitHub] gigasquid commented on a change in pull request #11205: Clojure Contrib Package
gigasquid commented on a change in pull request #11205: Clojure Contrib Package URL: https://github.com/apache/incubator-mxnet/pull/11205#discussion_r199245552
## File path: contrib/clojure-package/src/org/apache/clojure_mxnet/gen/ndarray.clj ##
@@ -0,0 +1,2312 @@
+(ns org.apache.clojure-mxnet.ndarray
+  (:refer-clojure :exclude [* - + > >= < <= / cast concat flatten identity load max
+                            min repeat reverse set sort take to-array empty shuffle])
+  (:import (org.apache.mxnet NDArray Shape)))
+
+;; Do not edit - this is auto-generated
+
+;; Licensed to the Apache Software Foundation (ASF) under one or more
+;; contributor license agreements.  See the NOTICE file distributed with
+;; this work for additional information regarding copyright ownership.
+;; The ASF licenses this file to You under the Apache License, Version 2.0
+;; (the "License"); you may not use this file except in compliance with
+;; the License.  You may obtain a copy of the License at
+;;
+;;    http://www.apache.org/licenses/LICENSE-2.0
+;;
+;; Unless required by applicable law or agreed to in writing, software
+;; distributed under the License is distributed on an "AS IS" BASIS,
+;; WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+;; See the License for the specific language governing permissions and
+;; limitations under the License.
+;;

Review comment: Yes, that is definitely a possibility to investigate. There are pros and cons to doing it manually vs. at compile time. For example, doing it as a manual step gives you a way to control the flow of changes into the project and a git diff, whereas doing it at compile time gives the advantage of not having to worry about a manual step. I'm definitely open to changing this aspect, especially if the manual `generate-code` step becomes a burden. I added it to https://cwiki.apache.org/confluence/display/MXNET/Clojure+Package+Contribution+Needs as a feature to investigate further. Thanks for the feedback :)
[GitHub] haojin2 commented on issue #10988: Flaky test: test_operator_gpu.test_countsketch
haojin2 commented on issue #10988: Flaky test: test_operator_gpu.test_countsketch URL: https://github.com/apache/incubator-mxnet/issues/10988#issuecomment-401435901 From the reproduced error we can see that only part of the grad ndarray is filled:
```
======================================================================
FAIL: test_operator_gpu.test_countsketch
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/ubuntu/5-mxnet/tests/python/gpu/../unittest/common.py", line 157, in test_new
    orig_test(*args, **kwargs)
  File "/home/ubuntu/5-mxnet/tests/python/gpu/test_operator_gpu.py", line 103, in test_countsketch
    check_countsketch(in_dim, out_dim, n)
  File "/home/ubuntu/5-mxnet/tests/python/gpu/test_operator_gpu.py", line 88, in check_countsketch
    assert_almost_equal(a, arr_grad[0].asnumpy(), rtol=1e-3, atol=1e-5)
  File "/home/ubuntu/6-mxnet/python/mxnet/test_utils.py", line 493, in assert_almost_equal
    raise AssertionError(msg)
AssertionError:
Items are not equal:
Error 159853.341227 exceeds tolerance rtol=0.001000, atol=0.000010.  Location of maximum error: (5, 9), a=6.016212, b=-0.027810
 a: array([[-0.08690866,  0.        , -0.        , ...,  0.        ,  0.        , -0.        ],
       [ 0.        ,  0.        ,  0.        , ..., -0.        ,...
 b: array([[-0.08690866, -3.90360618, -1.36067092, ..., -0.4085128 , -2.49076152,  0.51365918],
       [ 9.16543007, -3.62473965, -6.45960188, ..., -1.51162243,...
-------------------- >> begin captured logging << --------------------
common: INFO: Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=954783568 to reproduce.
common: INFO: 1 of 1: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1220294681 to reproduce.
common: INFO: 2 of 1: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1220294681 to reproduce.
common: INFO: 3 of 1: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1220294681 to reproduce.
--------------------- >> end captured logging << ---------------------
----------------------------------------------------------------------
Ran 1 test in 5.961s
```
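The failure above is a combined relative/absolute tolerance check. A simplified numpy sketch of the criterion behind `assert_almost_equal` (approximate, not the exact `test_utils.py` code) shows how the reported maximum-error location is found:

```python
import numpy as np

def find_max_violation(a, b, rtol=1e-3, atol=1e-5):
    # Normalized error: |a-b| relative to the combined tolerance atol + rtol*|b|.
    # A value above 1.0 means the pair (a[i], b[i]) fails the tolerance check.
    err = np.abs(a - b) / (atol + rtol * np.abs(b))
    idx = np.unravel_index(np.argmax(err), err.shape)
    return err[idx], idx

a = np.array([[1.0, 2.0], [3.0, 0.0]])
b = np.array([[1.0, 2.0], [3.0, 4.0]])  # one element disagrees, as in the bug
max_err, loc = find_max_violation(a, b)
print(max_err > 1.0, loc)  # the mismatched (1, 1) element exceeds tolerance
```

With `a[1, 1] = 0.0` standing in for the unfilled part of the grad ndarray, the normalized error blows up exactly the way the huge `Error 159853` value does in the log.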
[GitHub] ctcyang edited a comment on issue #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce
ctcyang edited a comment on issue #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce URL: https://github.com/apache/incubator-mxnet/pull/10696#issuecomment-401432358 Thanks for your help @threeleafzerg, I was able to build it with `USE_ALLREDUCE_DIST_KVSTORE = 1`. On AWS EC2 instances using the Deep Learning AMI, you need these additional steps:
```
wget https://github.com/google/protobuf/releases/download/v3.5.1/protobuf-cpp-3.5.1.tar.gz && tar --no-same-owner -zxf protobuf-cpp-3.5.1.tar.gz
cd protobuf-3.5.1 && export CFLAGS=-fPIC && export CXXFLAGS=-fPIC && ./configure -prefix=/usr && sudo make -j16 && sudo make -j16 install
conda remove protobuf
conda remove libprotobuf
rm -rf ~/anaconda3/bin/proto* && rm -rf ~/anaconda3/lib/libproto*
sudo apt remove libprotobuf-dev
sudo apt remove libprotobuf-lite9v5
sudo apt remove libprotobuf9v5
sudo apt remove libprotoc9v5
sudo ldconfig
```
When finished, run `ldconfig -p` and verify that no occurrences of `libprotoc.so.9` or `libprotobuf.so.9` remain in the output. This is needed because of ABI incompatibility between the different protobuf versions involved: the version preinstalled with the Deep Learning AMI (Ubuntu apt), the version preinstalled with Anaconda, and the version auto-installed by the Makefile. You need to uninstall the two versions that come with apt and Anaconda.
[GitHub] ctcyang edited a comment on issue #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce
ctcyang edited a comment on issue #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce URL: https://github.com/apache/incubator-mxnet/pull/10696#issuecomment-401432358 Thanks for your help @threeleafzerg I was able to build it with USE_ALLREDUCE_DIST_KVSTORE = 1. On AWS EC2 instances using the Deep Learning AMI, you need to do these additional steps: ``` wget https://github.com/google/protobuf/releases/download/v3.5.1/protobuf-cpp-3.5.1.tar.gz && tar --no-same-owner -zxf protobuf-cpp-3.5.1.tar.gz cd protobuf-3.5.1 && export CFLAGS=-fPIC && export CXXFLAGS=-fPIC && ./configure -prefix=/usr && sudo make -j16 && sudo make -j16 install conda remove protobuf conda remove libprotobuf rm -rf ~/anaconda3/bin/proto* && rm -rf ~/anaconda3/lib/libproto* sudo apt remove libprotobuf-dev sudo apt remove libprotobuf-lite9v5 sudo apt remove libprotobuf9v5 sudo apt remove libprotoc9v5 sudo ldconfig ``` When finished, do `ldconfig -p` and verify no more occurrences of `libprotoc.so.9` and `libprotobuf.so.9` occur in the output. Due to ABI incompatibility between different protobuf versions--preinstalled version that comes with Deep Learning AMI (Ubuntu apt), preinstalled version that comes with Anaconda and version that gets auto-installed by the Makefile--you need to uninstall the 2 versions that come with apt and Anaconda. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] ctcyang edited a comment on issue #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce
ctcyang edited a comment on issue #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce URL: https://github.com/apache/incubator-mxnet/pull/10696#issuecomment-401432358 Thanks for your help @threeleafzerg I was able to build it with USE_ALLREDUCE_DIST_KVSTORE = 1. On AWS EC2 instances using the Deep Learning AMI, you need to do these additional steps: ``` wget https://github.com/google/protobuf/releases/download/v3.5.1/protobuf-cpp-3.5.1.tar.gz && tar --no-same-owner -zxf protobuf-cpp-3.5.1.tar.gz cd protobuf-3.5.1 && export CFLAGS=-fPIC && export CXXFLAGS=-fPIC && ./configure -prefix=/usr && sudo make -j16 && sudo make -j16 install conda remove protobuf conda remove libprotobuf rm -rf ~/anaconda3/bin/proto* && rm -rf ~/anaconda3/lib/libproto* sudo apt remove libprotobuf-dev sudo apt remove libprotobuf-lite9v5 sudo apt remove libprotobuf9v5 sudo apt remove libprotoc9v5 sudo ldconfig ``` When finished, do `ldconfig -p` and verify no more occurrences of `libprotoc.so.9` and `libprotobuf.so.9` occur in the output. Due to ABI incompatibility between different protobuf versions--preinstalled version that comes with Deep Learning AMI (Ubuntu apt), preinstalled version that comes with Anaconda and version that gets auto-installed by the Makefile--you need to uninstall the 2 versions that come with both apt and Anaconda. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] eric-haibin-lin commented on a change in pull request #11498: Fix InferStorage for sparse fallback in FullyConnected
eric-haibin-lin commented on a change in pull request #11498: Fix InferStorage for sparse fallback in FullyConnected
URL: https://github.com/apache/incubator-mxnet/pull/11498#discussion_r199239656

File path: src/operator/nn/fully_connected.cc

```diff
@@ -210,17 +210,17 @@ inline static bool BackwardFCStorageType(const nnvm::NodeAttrs& attrs,
   CHECK_EQ(in_attrs->size(), 3U);
   CHECK_EQ(out_attrs->size(), out_expected);
-  DispatchMode wanted_mode;
-#if 0
+  bool dispatched = false;
   // TODO(zhengda) let's disable MKLDNN for FullyConnected for now.
```

Review comment: @zheng-da @azai91 is there a plan to enable FC with MKLDNN, or at least test it out? If this test reveals a bug, fixing it will also help stabilize mxnet-MKLDNN.
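For readers following the diff: `bool dispatched = false;` introduces the try-each-combination pattern that MXNet storage-inference functions use, where each supported storage layout is attempted in turn and the operator falls back to dense compute if none matched. A simplified Python sketch of that control flow follows; the helper and the fallback rule are illustrative stand-ins, not MXNet's actual `storage_type_assign`.

```python
# Illustrative sketch of the dispatched-flag pattern from BackwardFCStorageType.
DEFAULT, ROW_SPARSE = "default", "row_sparse"
FCOMPUTE, FALLBACK = "fcompute", "fcompute_fallback"

def try_assign(in_storage, wanted, mode):
    """Stand-in helper: succeed only if every input has the wanted storage type."""
    return mode if all(s == wanted for s in in_storage) else None

def backward_fc_storage_type(in_storage):
    dispatched = False
    mode = None
    if not dispatched:
        # First preference: all-dense inputs use the regular backward kernel.
        mode = try_assign(in_storage, DEFAULT, FCOMPUTE)
        dispatched = mode is not None
    if not dispatched:
        # Otherwise fall back: sparse inputs are densified before compute.
        mode = FALLBACK
    return mode

print(backward_fc_storage_type([DEFAULT, DEFAULT, DEFAULT]))     # fcompute
print(backward_fc_storage_type([ROW_SPARSE, DEFAULT, DEFAULT]))  # fcompute_fallback
```

The flag makes it easy to chain more `if (!dispatched)` branches later, e.g. an MKLDNN path, without restructuring the function.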
[GitHub] yzhliu commented on a change in pull request #11205: Clojure Contrib Package
yzhliu commented on a change in pull request #11205: Clojure Contrib Package
URL: https://github.com/apache/incubator-mxnet/pull/11205#discussion_r199239591

File path: contrib/clojure-package/src/org/apache/clojure_mxnet/gen/ndarray.clj

```diff
@@ -0,0 +1,2312 @@
+(ns org.apache.clojure-mxnet.ndarray
+  (:refer-clojure :exclude [* - + > >= < <= / cast concat flatten identity load max
+min repeat reverse set sort take to-array empty shuffle])
+  (:import (org.apache.mxnet NDArray Shape)))
+
+;; Do not edit - this is auto-generated
+
+;; Licensed to the Apache Software Foundation (ASF) under one or more
+;; contributor license agreements. See the NOTICE file distributed with
+;; this work for additional information regarding copyright ownership.
+;; The ASF licenses this file to You under the Apache License, Version 2.0
+;; (the "License"); you may not use this file except in compliance with
+;; the License. You may obtain a copy of the License at
+;;
+;;    http://www.apache.org/licenses/LICENSE-2.0
+;;
+;; Unless required by applicable law or agreed to in writing, software
+;; distributed under the License is distributed on an "AS IS" BASIS,
+;; WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+;; See the License for the specific language governing permissions and
+;; limitations under the License.
+;;
```

Review comment: Can we generate the source file during compilation, instead of checking it in to the repo?
[GitHub] eric-haibin-lin commented on a change in pull request #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce
eric-haibin-lin commented on a change in pull request #10696: [MXNET-366]Extend MXNet Distributed Training by AllReduce
URL: https://github.com/apache/incubator-mxnet/pull/10696#discussion_r199239103

File path: tests/nightly/test_all.sh

```diff
@@ -44,6 +44,7 @@
 USE_CUDA=1
 USE_CUDA_PATH=/usr/local/cuda
 USE_CUDNN=1
 USE_DIST_KVSTORE=1
+USE_ALLREDUCE_DIST_KVSTORE=1
```

Review comment: I'm not sure this script is used for the nightly tests at all. @marcoabreu could you verify? Also, do we need to add a build stage with USE_ALLREDUCE_DIST_KVSTORE=1 to Jenkins?
[GitHub] szha commented on issue #11429: Improve sparse pull performance for gluon trainer
szha commented on issue #11429: Improve sparse pull performance for gluon trainer
URL: https://github.com/apache/incubator-mxnet/pull/11429#issuecomment-401430706

How does the performance look? @eric-haibin-lin @leezu
[GitHub] szha commented on a change in pull request #11429: Improve sparse pull performance for gluon trainer
szha commented on a change in pull request #11429: Improve sparse pull performance for gluon trainer
URL: https://github.com/apache/incubator-mxnet/pull/11429#discussion_r199237863

File path: python/mxnet/gluon/trainer.py

```diff
@@ -169,9 +190,9 @@ def _init_kvstore(self):
         if kvstore:
             if self._compression_params:
                 kvstore.set_gradient_compression(self._compression_params)
-            # kv.pull(row_sparse_grad) is not supported
-            if 'dist' in kvstore.type and not self._contains_sparse:
-                update_on_kvstore = False
+            if 'dist' in kvstore.type:
+                # kv.pull(row_sparse_grad) is not supported for dist kvstore
+                update_on_kvstore = self._contains_sparse_weight or self._contains_sparse_grad
```

Review comment: from the comment I'm guessing you meant `not self._contains_sparse_weight and not self._contains_sparse_grad`
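The suspicion in this review can be checked mechanically: the expression in the diff (`w or g`) and the one the inline comment implies (`not w and not g`) are exact complements, so as written the diff would enable `update_on_kvstore` in precisely the cases where it should be disabled. A small truth-table check (flag names shortened; this is an illustration, not code from trainer.py):

```python
from itertools import product

# w / g stand in for _contains_sparse_weight / _contains_sparse_grad.
for w, g in product([False, True], repeat=2):
    as_written = w or g              # the condition in the diff
    implied = not w and not g        # the condition the comment implies
    assert implied == (not as_written)  # complements for all four combinations
print("the two conditions are exact complements")
```

Because the two expressions disagree on every input, a single test case with any sparse flag set would catch the inversion.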