[GitHub] [incubator-mxnet] ciyongch commented on issue #18641: Backporting recent mx.np changes to 1.7 branch
ciyongch commented on issue #18641: URL: https://github.com/apache/incubator-mxnet/issues/18641#issuecomment-652205754 Wow... could you help evaluate how many PRs are needed to enable this feature? Then we can decide how much time is needed and whether to include them. Thanks! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] ciyongch commented on issue #18641: Backporting recent mx.np changes to 1.7 branch
ciyongch commented on issue #18641: URL: https://github.com/apache/incubator-mxnet/issues/18641#issuecomment-652204662 Thanks @sxjscience for your prompt help on this. I've already pinged the author of https://github.com/apache/incubator-mxnet/pull/18523. Do you need any help with this PR?
[GitHub] [incubator-mxnet] xidulu commented on a change in pull request #18403: Gluon.probability
xidulu commented on a change in pull request #18403:
URL: https://github.com/apache/incubator-mxnet/pull/18403#discussion_r448127519

## File path: python/mxnet/gluon/probability/block/stochastic_block.py ##
@@ -0,0 +1,127 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# coding: utf-8
+# pylint: disable=abstract-method
+"""Stochastic block class."""
+__all__ = ['StochasticBlock', 'StochasticSequential']
+
+from functools import wraps
+from ...block import HybridBlock
+from ...utils import _indent
+
+
+class StochasticBlock(HybridBlock):
+    """`StochasticBlock` extends `HybridBlock` to support accumulating loss
+    in the forward phase, which is extremely useful in building Bayesian Neural Network,
+    where the loss function is composed of a classification loss and a KL loss.
+
+    """
+
+    def __init__(self, **kwargs):
+        super(StochasticBlock, self).__init__(**kwargs)
+        self._losses = []
+        self._losscache = []
+
+    def add_loss(self, loss):
+        self._losscache.append(loss)
+
+    @staticmethod
+    def collectLoss(func):
+        """To accumulate loss during the forward phase, one could first decorate
+        hybrid_forward with `StochasticBlock.collectLoss,
+        and then collect the loss tensor `x` by calling self.add_loss(x).
+        For example, in the following forward function,
+        we generate samples from a Gaussian parameterized by `loc` and `scale` and
+        accumulate the KL-divergence between it and its prior into the block's loss storage.:
+        @StochasticBlock.collectLoss
+        def hybrid_forward(self, F, loc, scale):
+            qz = mgp.Normal(loc, scale)
+            # prior
+            pz = mgp.Normal(F.np.zeros_like(loc), F.np.ones_like(scale))
+            self.add_loss(mgp.kl_divergence(qz, pz))
+            return qz.sample()
+        """
+        @wraps(func)
+        def inner(self, *args, **kwargs):
+            # Loss from hybrid_forward
+            func_out = func(self, *args, **kwargs)
+            collected_loss = self._losscache
+            self._losscache = []
+            return (func_out, collected_loss)
+
+        return inner
+
+    def __call__(self, *args, **kwargs):
+        # pylint: disable=arguments-differ
+        out = super().__call__(*args, **kwargs)
+        self._losses.extend(out[1])
+        return out[0]

Review comment:
@leezu Update: I made further changes here to avoid confusion. Users are now forced to use the collectLoss decorator in all cases; otherwise an exception is raised.
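The decorator pattern discussed in this review can be tried without MXNet. The following is only a minimal sketch in plain Python, not the code from the PR: `LossCollectingBlock`, `collect_loss`, the `_collects_loss` marker, and the `Shrink` example are all hypothetical names, and the way the real `StochasticBlock` detects a missing decorator is not shown in the diff, so the check in `__call__` is an assumption.

```python
from functools import wraps


class LossCollectingBlock:
    """Hypothetical stand-in for a Gluon block; not the MXNet class."""

    def __init__(self):
        self.losses = []        # losses accumulated across calls
        self._loss_cache = []   # losses added during the current forward pass

    def add_loss(self, loss):
        self._loss_cache.append(loss)

    @staticmethod
    def collect_loss(func):
        """Wrap forward so it returns (output, losses added during this call)."""
        @wraps(func)
        def inner(self, *args, **kwargs):
            out = func(self, *args, **kwargs)
            collected, self._loss_cache = self._loss_cache, []
            return out, collected
        inner._collects_loss = True  # assumed marker, checked in __call__
        return inner

    def forward(self, *args, **kwargs):
        raise NotImplementedError

    def __call__(self, *args, **kwargs):
        # Assumption: refuse to run a forward that was not decorated.
        if not getattr(type(self).forward, "_collects_loss", False):
            raise ValueError("forward must be decorated with collect_loss")
        out, collected = self.forward(*args, **kwargs)
        self.losses.extend(collected)
        return out


class Shrink(LossCollectingBlock):
    @LossCollectingBlock.collect_loss
    def forward(self, x):
        self.add_loss(0.5 * x * x)  # toy penalty standing in for a KL term
        return 0.9 * x


block = Shrink()
y = block(2.0)
```

After the call, `y` holds the forward output and `block.losses` holds the penalty recorded during that call, which mirrors how the reviewed `__call__` splits `out[0]` from `out[1]`.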
[GitHub] [incubator-mxnet] TaoLv commented on pull request #18645: [WIP] typedef for MKL_INT to point to int64_t when building with Large Tensor
TaoLv commented on pull request #18645: URL: https://github.com/apache/incubator-mxnet/pull/18645#issuecomment-652198544 Please refer to https://software.intel.com/content/www/us/en/develop/documentation/mkl-macos-developer-guide/top/linking-your-application-with-the-intel-math-kernel-library/linking-in-detail/linking-with-interface-libraries/using-the-ilp64-interface-vs-lp64-interface.html The cmake flag `MKL_USE_ILP64` needs to be set to enable the MKL ILP64 interface.
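As a rough illustration of why the integer width matters for Large Tensor builds: the LP64 interface uses 32-bit integers for sizes and indices, while ILP64 uses 64-bit ones. The sketch below is plain Python arithmetic, not MXNet or MKL code; `fits_lp64` and the example shape are hypothetical.

```python
import ctypes

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def fits_lp64(shape):
    """Return True if the element count fits in a signed 32-bit integer."""
    count = 1
    for dim in shape:
        count *= dim
    return count <= INT32_MAX


shape = (50000, 50000)               # hypothetical large tensor
element_count = shape[0] * shape[1]  # 2,500,000,000 elements

# ctypes does no overflow checking, so this shows the value a 32-bit
# integer would actually hold after wrapping around.
wrapped = ctypes.c_int32(element_count).value
```

Here `fits_lp64(shape)` is false and `wrapped` is negative, which is the kind of silent overflow a 64-bit `MKL_INT` avoids.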
[GitHub] [incubator-mxnet] szha commented on pull request #18602: Fix softmax, logsoftmax failed on empty ndarray
szha commented on pull request #18602: URL: https://github.com/apache/incubator-mxnet/pull/18602#issuecomment-652197512 @TaoLv no it's not enforced at the moment. feel free to merge when ready
[GitHub] [incubator-mxnet] TaoLv commented on pull request #18602: Fix softmax, logsoftmax failed on empty ndarray
TaoLv commented on pull request #18602: URL: https://github.com/apache/incubator-mxnet/pull/18602#issuecomment-652196505 @szha @leezu do we need to fix the codecov status before merging?
[GitHub] [incubator-mxnet] TaoLv commented on pull request #18504: [Improvement] Invoke mkldnn and cudnn BatchNorm when axis != 1
TaoLv commented on pull request #18504: URL: https://github.com/apache/incubator-mxnet/pull/18504#issuecomment-652195820 @wkcn, do you have any performance numbers?
[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18648: [v1.7.x] [Backport]add zero grad for npi_unique (#18080)
mxnet-bot commented on pull request #18648:
URL: https://github.com/apache/incubator-mxnet/pull/18648#issuecomment-652193277

Hey @sxjscience , Thanks for submitting the PR
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:
- To trigger all jobs: @mxnet-bot run ci [all]
- To trigger specific jobs: @mxnet-bot run ci [job1, job2]
***
**CI supported jobs**: [windows-gpu, centos-cpu, edge, windows-cpu, unix-gpu, miscellaneous, unix-cpu, website, clang, sanity, centos-gpu]
***
_Note_: Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.
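The command syntax in the bot message is simple enough to check mechanically. The sketch below is a hypothetical parser written for illustration only; it is not the bot's actual implementation, and `parse_ci_command` and `SUPPORTED_JOBS` are assumed names. The job list is copied from the message above.

```python
import re

# Job names copied from the "CI supported jobs" list in the bot message.
SUPPORTED_JOBS = {
    "windows-gpu", "centos-cpu", "edge", "windows-cpu", "unix-gpu",
    "miscellaneous", "unix-cpu", "website", "clang", "sanity", "centos-gpu",
}

_COMMAND = re.compile(r"@mxnet-bot\s+run\s+ci\s+\[([^\]]*)\]")


def parse_ci_command(comment):
    """Return the requested job names, or None if the comment has no command."""
    match = _COMMAND.search(comment)
    if match is None:
        return None
    requested = [name.strip() for name in match.group(1).split(",") if name.strip()]
    if "all" in requested:
        return sorted(SUPPORTED_JOBS)
    unknown = [name for name in requested if name not in SUPPORTED_JOBS]
    if unknown:
        raise ValueError(f"unsupported CI jobs: {unknown}")
    return requested


jobs = parse_ci_command("@mxnet-bot run ci [centos-cpu, unix-cpu]")
```

With the comment shown, `jobs` contains the two requested job names in the order they were written.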
[GitHub] [incubator-mxnet] sxjscience opened a new pull request #18648: [v1.7.x] [Backport]add zero grad for npi_unique (#18080)
sxjscience opened a new pull request #18648: URL: https://github.com/apache/incubator-mxnet/pull/18648
[GitHub] [incubator-mxnet] access2rohit commented on issue #18166: Flaky test_numpy_op.py::test_np_mixedType_unary_funcs
access2rohit commented on issue #18166: URL: https://github.com/apache/incubator-mxnet/issues/18166#issuecomment-652183845 Still causing failures : http://jenkins.mxnet-ci.amazon-ml.com/blue/rest/organizations/jenkins/pipelines/mxnet-validation/pipelines/unix-cpu/branches/PR-18625/runs/7/nodes/358/steps/484/log/?start=0
[GitHub] [incubator-mxnet] DongfeiJi commented on issue #18643: ndarray.contrib.boolean_mask can not be hybridize
DongfeiJi commented on issue #18643:
URL: https://github.com/apache/incubator-mxnet/issues/18643#issuecomment-652163234

> Hi @DongfeiJi ,
> It works for me on MXNet 2.0.
> Note that `boolean_mask` doesn't work when mask are all `zero/false`, since the traditional operator doesn't support zero-size array.
>
> ```python
> import mxnet as mx
> from mxnet import gluon
> from mxnet.gluon.loss import Loss, _apply_weighting
>
> class NewTripletLoss(Loss):
>     def __init__(self, batch_size_per_gpu, margin=1, weight=None, batch_axis=0, **kwargs):
>         super(NewTripletLoss, self).__init__(weight, batch_axis, **kwargs)
>         self.batch_size_per_gpu = batch_size_per_gpu
>         self.margin = margin
>     def hybrid_forward(self, F, embeddings, labels, sample_weight=None):
>         N = self.batch_size_per_gpu
>         # get distance
>         xx = F.power(embeddings, 2).sum(1, keepdims=True).tile((1, self.batch_size_per_gpu))
>         dist = F.broadcast_add(xx, xx.transpose())
>         dist = F.broadcast_sub(dist, 2 * F.dot(embeddings, embeddings.transpose()))
>         dist = F.clip(dist, 1e-12, 1e12)
>         # get mask
>         labels = F.cast(labels, dtype='float32')
>         labels = labels.expand_dims(1).tile((1, self.batch_size_per_gpu))
>         is_pos = F.broadcast_equal(labels, labels.transpose())
>         is_neg = F.broadcast_not_equal(labels, labels.transpose())
>         # hard example mining
>         dist_mat = dist.reshape((self.batch_size_per_gpu * self.batch_size_per_gpu,))
>         pos_mask = is_pos.reshape((self.batch_size_per_gpu * self.batch_size_per_gpu,))
>         dist_ap = F.contrib.boolean_mask(dist_mat, pos_mask).reshape((self.batch_size_per_gpu, -1))
>         #dist_ap = F.broadcast_mul(dist_mat, pos_mask).reshape((self.batch_size_per_gpu, -1))
>         dist_ap = F.max(dist_ap, axis=1)
>         neg_mask = is_neg.reshape((self.batch_size_per_gpu * self.batch_size_per_gpu,))
>         dist_an = F.contrib.boolean_mask(dist_mat, neg_mask).reshape((self.batch_size_per_gpu, -1))
>         #dist_an = F.broadcast_mul(dist_mat, neg_mask).reshape((self.batch_size_per_gpu, -1))
>         dist_an = F.min(dist_an, axis=1)
>         # add margin
>         margin = F.full(shape=(self.batch_size_per_gpu, 1), val=self.margin)
>         loss = F.broadcast_add(F.broadcast_sub(dist_ap, dist_an), margin)
>         loss = F.maximum(loss, F.zeros_like(loss))
>         # apply weight
>         loss = _apply_weighting(F, loss, self._weight, sample_weight)
>         return F.mean(loss, axis=self._batch_axis, exclude=True)
>
> block = NewTripletLoss(2)
> block.hybridize()
> embeddings = mx.nd.array([[1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]).reshape((2,3))
> embeddings.attach_grad()
> labels = mx.nd.array([0, 1]).reshape((2, ))
> with mx.autograd.record():
>     out = block(embeddings, labels)
>     out.sum().backward()
> print(out)
> mx.nd.waitall()
> ```

You can review this; I have uploaded the example code: https://github.com/DongfeiJi/chineseocr_lite/blob/master/jdf.py
[GitHub] [incubator-mxnet] DongfeiJi commented on issue #18643: ndarray.contrib.boolean_mask can not be hybridize
DongfeiJi commented on issue #18643:
URL: https://github.com/apache/incubator-mxnet/issues/18643#issuecomment-652161754

> Hi @DongfeiJi ,
> It works for me on MXNet 2.0.
> Note that `boolean_mask` doesn't work when mask are all `zero/false`, since the traditional operator doesn't support zero-size array.

Thank you again for your reply. Hybridizing directly works. However, if the initialization of the model is deferred, there are problems. You can run my code. I get an error because I use Gluon's Trainer, and the MXNet version is 1.5.
PS: This is an example of my code; for simplicity, I do not use gluon.Trainer. With nd it works, but after hybridize it does not.
    import mxnet
    from mxnet import nd
    from mxnet.gluon import nn
    from mxnet.gluon.loss import Loss, _apply_weighting

    class MyBlock(nn.HybridBlock):
        def __init__(self, **kwargs):
            super(MyBlock, self).__init__(**kwargs)
            self.conv = nn.Conv2D(channels=2048, kernel_size=1, strides=1, padding=0, use_bias=False)
            self.pool = nn.GlobalAvgPool2D()
            self.flatten = nn.Flatten()
        def hybrid_forward(self, F, x):
            x = self.conv(x)
            x = self.pool(x)
            x = self.flatten(x)
            return x

    class NewTripletLoss(Loss):
        def __init__(self, batch_size_per_gpu, margin=1, weight=None, batch_axis=0, **kwargs):
            super(NewTripletLoss, self).__init__(weight, batch_axis, **kwargs)
            self.batch_size_per_gpu = batch_size_per_gpu
            self.margin = margin
        def hybrid_forward(self, F, embeddings, labels, sample_weight=None):
            N = self.batch_size_per_gpu
            # get distance
            xx = F.power(embeddings, 2).sum(1, keepdims=True).tile((1, self.batch_size_per_gpu))
            dist = F.broadcast_add(xx, xx.transpose())
            dist = F.broadcast_sub(dist, 2 * F.dot(embeddings, embeddings.transpose()))
            dist = F.clip(dist, 1e-12, 1e12).sqrt()
            print(dist)
            # get mask
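The hard example mining step in this thread relies on `boolean_mask`, whose output shape depends on the data. One shape-preserving alternative is to replace excluded entries with a sentinel instead of removing them. The sketch below shows that idea in plain Python with no MXNet calls; it is an illustration under stated assumptions, not a confirmed fix for this issue. `pairwise_sq_dist` and `batch_hard_triplet_loss` are hypothetical names, and squared distances are used as in the quoted reply.

```python
import math


def pairwise_sq_dist(embeddings):
    """Squared Euclidean distance between every pair of rows."""
    return [[sum((a - b) ** 2 for a, b in zip(u, v)) for v in embeddings]
            for u in embeddings]


def batch_hard_triplet_loss(embeddings, labels, margin=1.0):
    """Per-anchor loss using the hardest positive and hardest negative."""
    dist = pairwise_sq_dist(embeddings)
    n = len(labels)
    losses = []
    for i in range(n):
        # Excluded entries become a sentinel, so every row keeps length n.
        pos = [dist[i][j] if labels[i] == labels[j] else -math.inf for j in range(n)]
        neg = [dist[i][j] if labels[i] != labels[j] else math.inf for j in range(n)]
        hardest_pos = max(pos)  # farthest sample with the same label
        hardest_neg = min(neg)  # closest sample with a different label
        losses.append(max(hardest_pos - hardest_neg + margin, 0.0))
    return losses


embeddings = [[0.0, 0.0], [0.0, 1.0], [3.0, 0.0]]
labels = [0, 0, 1]
losses = batch_hard_triplet_loss(embeddings, labels, margin=10.0)
```

Note that simply multiplying by the mask, as in the commented-out `broadcast_mul` lines, would turn excluded entries into zeros, which distorts the `min` over negatives; the infinite sentinels avoid that.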
[GitHub] [incubator-mxnet] yuantangliang opened a new pull request #18647: add visualization support for qualization operator
yuantangliang opened a new pull request #18647: URL: https://github.com/apache/incubator-mxnet/pull/18647 ## Description ## Add visualization support for quantization operators. ### Changes ### The plot_network function crashes when plotting a quantized symbol. This PR fixes the bug.
[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18647: add visualization support for qualization operator
mxnet-bot commented on pull request #18647:
URL: https://github.com/apache/incubator-mxnet/pull/18647#issuecomment-652160331

Hey @yuantangliang , Thanks for submitting the PR
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:
- To trigger all jobs: @mxnet-bot run ci [all]
- To trigger specific jobs: @mxnet-bot run ci [job1, job2]
***
**CI supported jobs**: [website, unix-gpu, windows-cpu, miscellaneous, clang, centos-cpu, sanity, unix-cpu, edge, windows-gpu, centos-gpu]
***
_Note_: Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.
[GitHub] [incubator-mxnet] ciyongch commented on pull request #18080: Add zero grad for npi_unique
ciyongch commented on pull request #18080: URL: https://github.com/apache/incubator-mxnet/pull/18080#issuecomment-652139046 Hi @haojin2 @sxjscience , could you please help to backport this PR to v1.7.x? thanks!
[GitHub] [incubator-mxnet] ciyongch commented on pull request #18523: [numpy] unify impl of mixed type binary op between linux and windows
ciyongch commented on pull request #18523: URL: https://github.com/apache/incubator-mxnet/pull/18523#issuecomment-652138648 Hi @BenjaminCHEN2016 , could you please help to backport this PR to the v1.7.x branch, as suggested by @sxjscience and @szha in https://github.com/apache/incubator-mxnet/issues/18641? Thanks!
[GitHub] [incubator-mxnet] ciyongch commented on issue #18641: Backporting recent mx.np changes to 1.7 branch
ciyongch commented on issue #18641: URL: https://github.com/apache/incubator-mxnet/issues/18641#issuecomment-652137950 Hi @sxjscience, may I know if you're going to backport the two PRs above into 1.7 as @szha suggested? We're waiting for them to be merged before tagging rc0. Thanks!
[incubator-mxnet] branch v1.6.x updated (31ec0f0 -> fb3fea4)
This is an automated email from the ASF dual-hosted git repository.

skm pushed a change to branch v1.6.x
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git.

    from 31ec0f0 Increase staggered build timeout to 180 min (#18568) (#18586)
     add fb3fea4 [CI][v1.6.x] Fix failing CI pipelines (#18597)

No new revisions were added by this update.

Summary of changes:
 ci/docker/Dockerfile.build.jetson |  5 +
 ci/docker/install/requirements    |  1 +
 ci/jenkins/Jenkins_steps.groovy   | 13 -
 ci/jenkins/Jenkinsfile_unix_gpu   |  5 +
 4 files changed, 7 insertions(+), 17 deletions(-)
[GitHub] [incubator-mxnet] ciyongch commented on pull request #18597: [CI][v1.6.x] Fix failing CI pipelines
ciyongch commented on pull request #18597: URL: https://github.com/apache/incubator-mxnet/pull/18597#issuecomment-652136352 Thank you @ChaiBapchya for the prompt fix, I will rebase my PR https://github.com/apache/incubator-mxnet/pull/18632.
[incubator-mxnet] branch v1.6.x updated: [CI][v1.6.x] Fix failing CI pipelines (#18597)
This is an automated email from the ASF dual-hosted git repository.

skm pushed a commit to branch v1.6.x
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git

The following commit(s) were added to refs/heads/v1.6.x by this push:
     new fb3fea4 [CI][v1.6.x] Fix failing CI pipelines (#18597)
fb3fea4 is described below

commit fb3fea441f24a874219549e33a2ac0b0da3e8015
Author: Chaitanya Prakash Bapat
AuthorDate: Tue Jun 30 18:37:22 2020 -0700

    [CI][v1.6.x] Fix failing CI pipelines (#18597)

    * add the missing build_ubuntu_gpu_cuda101_cudnn7_mkldnn_cpp_test in runtime_functions.sh
    * Revert "add the missing build_ubuntu_gpu_cuda101_cudnn7_mkldnn_cpp_test in runtime_functions.sh"
    This reverts commit de173b05a393c2b21075b02f276b6fb6e5312530.
    * Revert "[CI][1.6.x] fix centos 7 url to unblock centos-cpu & gpu pipeline (#18560)"
    This reverts commit d2713482f9a6a45f1274df87bd34d784a94756ed.
    * fix centos 7 url to unblock centos-cpu & gpu pipeline
    * skip quantized conv flaky case (#16866)
    * Fix quantized concat when inputs are mixed int8 and uint8
    Change-Id: I4da04bf4502425134a466823fb5f73da2d7a419b
    * skip flaky test
    * trigger ci
    * Trigger empty commit
    * [v1.7.x] update jetson dockerfile to support CUDA 10.0 (#18339)
    * update dockerfile for jetson
    * add toolchain files
    * update build_jetson function
    * update ubuntu_julia.sh
    * update FindCUDAToolkit.cmake
    * Update centos7_python.sh
    * revert changes on ubuntu_julia.sh
    * disable TVM for gpu build
    * Disable TVM_OP on GPU builds
    Co-authored-by: Wei Chu
    Co-authored-by: Leonard Lausen
    * add setuptools to ci/docker/install/requirements
    * add missing build_ubuntu_gpu_cuda101_cudnn7_mkldnn_cpp_test
    * add setuptool to docker & cpp-test build syntax error
    * remove erroneously added cpp tests in 1.6.x
    * py3 to p2

    Co-authored-by: Xinyu Chen
    Co-authored-by: waytrue17 <52505574+waytru...@users.noreply.github.com>
    Co-authored-by: Wei Chu
    Co-authored-by: Leonard Lausen
---
 ci/docker/Dockerfile.build.jetson |  5 +
 ci/docker/install/requirements    |  1 +
 ci/jenkins/Jenkins_steps.groovy   | 13 -
 ci/jenkins/Jenkinsfile_unix_gpu   |  5 +
 4 files changed, 7 insertions(+), 17 deletions(-)

diff --git a/ci/docker/Dockerfile.build.jetson b/ci/docker/Dockerfile.build.jetson
index 93fe5e0..45a0572 100644
--- a/ci/docker/Dockerfile.build.jetson
+++ b/ci/docker/Dockerfile.build.jetson
@@ -37,6 +37,8 @@ RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y \
     unzip \
     python3 \
     python3-pip \
+    python \
+    python-pip \
     awscli \
     crossbuild-essential-arm64 \
  && rm -rf /var/lib/apt/lists/*
@@ -78,5 +80,8 @@ ARG GROUP_ID=0
 COPY install/ubuntu_adduser.sh /work/
 RUN /work/ubuntu_adduser.sh
 
+COPY install/requirements /work/
+RUN python -m pip install -r /work/requirements
+
 COPY runtime_functions.sh /work/
 WORKDIR /work/mxnet
diff --git a/ci/docker/install/requirements b/ci/docker/install/requirements
index 5f9f28c..ada25d2 100644
--- a/ci/docker/install/requirements
+++ b/ci/docker/install/requirements
@@ -32,4 +32,5 @@ pylint==2.3.1; python_version >= '3.0'
 astroid==2.3.3; python_version >= '3.0'
 requests<2.19.0,>=2.18.4
 scipy==1.2.1
+setuptools
 six==1.11.0
diff --git a/ci/jenkins/Jenkins_steps.groovy b/ci/jenkins/Jenkins_steps.groovy
index 5345c78..c5e3fab 100644
--- a/ci/jenkins/Jenkins_steps.groovy
+++ b/ci/jenkins/Jenkins_steps.groovy
@@ -261,19 +261,6 @@ def compile_unix_full_gpu() {
     }]
 }
 
-def compile_unix_full_gpu_mkldnn_cpp_test() {
-    return ['GPU: CUDA10.1+cuDNN7+MKLDNN+CPPTEST': {
-      node(NODE_LINUX_CPU) {
-        ws('workspace/build-gpu-mkldnn-cpp') {
-          timeout(time: max_time, unit: 'MINUTES') {
-            utils.init_git()
-            utils.docker_run('ubuntu_build_cuda', 'build_ubuntu_gpu_cuda101_cudnn7_mkldnn_cpp_test', false)
-            utils.pack_lib('gpu_mkldnn_cpp_test', mx_lib_cpp_capi)
-          }
-        }
-      }
-    }]
-}
 
 def compile_unix_cmake_mkldnn_gpu() {
     return ['GPU: CMake MKLDNN': {
diff --git a/ci/jenkins/Jenkinsfile_unix_gpu b/ci/jenkins/Jenkinsfile_unix_gpu
index e3ff319..bc4a74e 100644
--- a/ci/jenkins/Jenkinsfile_unix_gpu
+++ b/ci/jenkins/Jenkinsfile_unix_gpu
@@ -41,8 +41,7 @@ core_logic: {
     custom_steps.compile_unix_cmake_gpu(),
     custom_steps.compile_unix_tensorrt_gpu(),
     custom_steps.compile_unix_int64_gpu(),
-    custom_steps.compile_unix_cmake_gpu_no_rtc(),
-    custom_steps.compile_unix_full_gpu_mkldnn_cpp_test()
+    custom_steps.compile_unix_cmake_gpu_no_rtc()
   ])
 
   utils.parallel_stage('Tests', [
@@ -63,8 +62,6 @@ core_logic: {
     custom_steps.test_unix_scala_gpu(),
[GitHub] [incubator-mxnet] sandeep-krishnamurthy merged pull request #18597: [CI][v1.6.x] Fix failing CI pipelines
sandeep-krishnamurthy merged pull request #18597: URL: https://github.com/apache/incubator-mxnet/pull/18597
[GitHub] [incubator-mxnet] sandeep-krishnamurthy commented on pull request #18597: [CI][v1.6.x] Fix failing CI pipelines
sandeep-krishnamurthy commented on pull request #18597: URL: https://github.com/apache/incubator-mxnet/pull/18597#issuecomment-652135618 Thank you @ChaiBapchya
[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18642: [numpy] Fix less/greater bug with scalar input
mxnet-bot commented on pull request #18642: URL: https://github.com/apache/incubator-mxnet/pull/18642#issuecomment-652126161 Jenkins CI successfully triggered : [unix-cpu, centos-cpu]
[GitHub] [incubator-mxnet] Yiyan66 commented on pull request #18642: [numpy] Fix less/greater bug with scalar input
Yiyan66 commented on pull request #18642: URL: https://github.com/apache/incubator-mxnet/pull/18642#issuecomment-652126133 @mxnet-bot run ci [centos-cpu, unix-cpu]
[GitHub] [incubator-mxnet] Kh4L commented on a change in pull request #18490: MXNet-TRT: Add PrePartition param caching - move init_tensorrt_params logic
Kh4L commented on a change in pull request #18490:
URL: https://github.com/apache/incubator-mxnet/pull/18490#discussion_r448057212

## File path: src/operator/subgraph/tensorrt/tensorrt-inl.h ##
@@ -267,6 +267,24 @@ class TensorrtProperty : public SubgraphProperty {
     return std::make_shared();
   }
 
+  void PrePartition(const nnvm::Graph& g,
+    const std::vector>& options_map) override {
+    auto& in_arg_names = g.GetAttr>("in_arg_names");
+    auto& in_aux_names = g.GetAttr>("in_aux_names");
+    NDArray **in_args_ptr = g.GetAttr("in_args");
+    NDArray **in_aux_ptr = g.GetAttr("in_aux");
+    // should we check if not empty?

Review comment:
Removed the comment
[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.
This is an automated email from the ASF dual-hosted git repository.

aaronmarkham pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git

The following commit(s) were added to refs/heads/asf-site by this push:
     new 63d72b4 Bump the publish timestamp.
63d72b4 is described below

commit 63d72b432a5076dc70c7dac96506775671ba4cdb
Author: mxnet-ci
AuthorDate: Wed Jul 1 00:41:08 2020 +

    Bump the publish timestamp.
---
 date.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/date.txt b/date.txt
new file mode 100644
index 000..f6b8612
--- /dev/null
+++ b/date.txt
@@ -0,0 +1 @@
+Wed Jul 1 00:41:08 UTC 2020
[incubator-mxnet-site] branch asf-site updated: Publish triggered by CI
This is an automated email from the ASF dual-hosted git repository.

aaronmarkham pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git

The following commit(s) were added to refs/heads/asf-site by this push:
     new 800590a Publish triggered by CI
800590a is described below

commit 800590aa9db7dbe6e6280b6ae66274b1eaec722a
Author: mxnet-ci
AuthorDate: Wed Jul 1 00:40:54 2020 +

    Publish triggered by CI
---
 date.txt | 1 -
 feed.xml | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/date.txt b/date.txt
deleted file mode 100644
index 0bf7c63..000
--- a/date.txt
+++ /dev/null
@@ -1 +0,0 @@
-Tue Jun 30 18:41:30 UTC 2020
diff --git a/feed.xml b/feed.xml
index bc2cea4..b2d13bd 100644
--- a/feed.xml
+++ b/feed.xml
@@ -1 +1 @@
-http://www.w3.org/2005/Atom; >https://jekyllrb.com/; version="4.0.0">Jekyllhttps://mxnet.apache.org/feed.xml; rel="self" type="application/atom+xml" />https://mxnet.apache.org/; rel="alternate" type="text/html" />2020-06-30T18:30:39+00:00https://mxnet.apache.org/feed.xmlApache MXNetA flexible and efficient library for deep [...]
\ No newline at end of file
+http://www.w3.org/2005/Atom; >https://jekyllrb.com/; version="4.0.0">Jekyllhttps://mxnet.apache.org/feed.xml; rel="self" type="application/atom+xml" />https://mxnet.apache.org/; rel="alternate" type="text/html" />2020-07-01T00:30:23+00:00https://mxnet.apache.org/feed.xmlApache MXNetA flexible and efficient library for deep [...]
\ No newline at end of file
[GitHub] [incubator-mxnet] szha commented on pull request #18504: [Improvement] Invoke mkldnn and cudnn BatchNorm when axis != 1
szha commented on pull request #18504: URL: https://github.com/apache/incubator-mxnet/pull/18504#issuecomment-652113639 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] wkcn commented on pull request #18504: [Improvement] Invoke mkldnn and cudnn BatchNorm when axis != 1
wkcn commented on pull request #18504: URL: https://github.com/apache/incubator-mxnet/pull/18504#issuecomment-652112636 Hi @szha , could the PR be merged before replacing mkldnn_off and cudnn_off attributes with environment variables? I can remove the attribute mkldnn_off. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] stu1130 commented on issue #18646: BatchNorm with axis=-1 is much slower than axis=1
stu1130 commented on issue #18646: URL: https://github.com/apache/incubator-mxnet/issues/18646#issuecomment-652111953 @wkcn Thanks for your detailed explanation. So I think there are two phases. 1. enable cuDNN when axis is not 1 2. use `cudnnBatchNormalizationForwardTrainingEx` for the NHWC case (I checked the source code; we currently use `cudnnBatchNormalizationForwardTraining` everywhere) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] wkcn edited a comment on issue #18646: BatchNorm with axis=-1 is much slower than axis=1
wkcn edited a comment on issue #18646: URL: https://github.com/apache/incubator-mxnet/issues/18646#issuecomment-652109101 The reason is that MKLDNN and CuDNN are only applied when axis = 1. The open PR https://github.com/apache/incubator-mxnet/pull/18504 fixes it. However, we will replace mkldnn_off and cudnn_off attributes with environment variables, so the PR is blocked. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
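Editorial sketch of what the `axis` argument in the comment above selects. This is plain Python, not MXNet code, and every function name in it is hypothetical. It only illustrates that per-channel statistics are the same whether the channel dimension sits at axis=-1 or is moved to axis=1; it says nothing about how MKLDNN or cuDNN implement BatchNorm.

```python
from statistics import fmean, pvariance


def stats_channels_last(x):
    """Per-channel (mean, variance) for x shaped (N, L, C), the axis=-1 case."""
    num_channels = len(x[0][0])
    stats = []
    for c in range(num_channels):
        # Gather every value of channel c across samples and positions.
        values = [row[c] for sample in x for row in sample]
        stats.append((fmean(values), pvariance(values)))
    return stats


def stats_channels_first(x):
    """Per-channel (mean, variance) for x shaped (N, C, L), the axis=1 case."""
    num_channels = len(x[0])
    stats = []
    for c in range(num_channels):
        values = [v for sample in x for v in sample[c]]
        stats.append((fmean(values), pvariance(values)))
    return stats


def to_channels_first(x):
    """Transpose each sample from (L, C) to (C, L)."""
    return [[list(column) for column in zip(*sample)] for sample in x]


# Two samples, three positions, two channels: shape (2, 3, 2).
x = [
    [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]],
    [[4.0, 40.0], [5.0, 50.0], [6.0, 60.0]],
]

# Same statistics either way; only the memory layout differs.
assert stats_channels_last(x) == stats_channels_first(to_channels_first(x))
```

The layouts differ only in where the channel dimension lives, which is why the discussion is about which kernels get dispatched rather than about the numerical result.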
[GitHub] [incubator-mxnet] wkcn commented on issue #18646: BatchNorm with axis=-1 is much slower than axis=1
wkcn commented on issue #18646: URL: https://github.com/apache/incubator-mxnet/issues/18646#issuecomment-652109101 The reason is that MKLDNN and CuDNN are only applied when axis = 1. The open PR https://github.com/apache/incubator-mxnet/pull/18504 fixes it. However, we will add an environment variable to control whether to use MKLDNN and CuDNN, so the PR is blocked. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18639: User Feedback Widget Part 1
mxnet-bot commented on pull request #18639: URL: https://github.com/apache/incubator-mxnet/pull/18639#issuecomment-652107031 Jenkins CI successfully triggered : [unix-cpu] This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] ys2843 edited a comment on pull request #18639: User Feedback Widget Part 1
ys2843 edited a comment on pull request #18639: URL: https://github.com/apache/incubator-mxnet/pull/18639#issuecomment-651529674 @mxnet-bot run ci [unix-cpu ] This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] ys2843 edited a comment on pull request #18605: Clipboard refactor
ys2843 edited a comment on pull request #18605: URL: https://github.com/apache/incubator-mxnet/pull/18605#issuecomment-649792399 @mxnet-bot run ci [unix-cpu ] This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18605: Clipboard refactor
mxnet-bot commented on pull request #18605: URL: https://github.com/apache/incubator-mxnet/pull/18605#issuecomment-652105367 Jenkins CI successfully triggered : [unix-cpu] This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] access2rohit closed pull request #18612: [WIP]B axis improv cpu
access2rohit closed pull request #18612: URL: https://github.com/apache/incubator-mxnet/pull/18612 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] ChaiBapchya commented on a change in pull request #18597: [CI][v1.6.x] Fix failing CI pipelines
ChaiBapchya commented on a change in pull request #18597: URL: https://github.com/apache/incubator-mxnet/pull/18597#discussion_r448021132 ## File path: ci/docker/Dockerfile.build.jetson ## @@ -37,6 +37,8 @@ RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y \ unzip \ python3 \ python3-pip \ +python \ +python-pip \ Review comment: pip3 existed beforehand. I need python2 since build_wheel uses python2 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] ChaiBapchya commented on a change in pull request #18597: [CI][v1.6.x] Fix failing CI pipelines
ChaiBapchya commented on a change in pull request #18597: URL: https://github.com/apache/incubator-mxnet/pull/18597#discussion_r448020644 ## File path: ci/docker/Dockerfile.build.jetson ## @@ -78,5 +80,8 @@ ARG GROUP_ID=0 COPY install/ubuntu_adduser.sh /work/ RUN /work/ubuntu_adduser.sh +COPY install/requirements /work/ +RUN python -m pip install -r /work/requirements Review comment: Because build_wheel uses python2: https://github.com/ChaiBapchya/incubator-mxnet/blob/5936afbb57b5ab899135c3802f2e8d92f311ba63/ci/docker/runtime_functions.sh#L114 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] mseth10 commented on a change in pull request #18597: [CI][v1.6.x] Fix failing CI pipelines
mseth10 commented on a change in pull request #18597: URL: https://github.com/apache/incubator-mxnet/pull/18597#discussion_r448019514 ## File path: ci/docker/Dockerfile.build.jetson ## @@ -37,6 +37,8 @@ RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y \ unzip \ python3 \ python3-pip \ +python \ +python-pip \ Review comment: Do we need both pip and pip3? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] mseth10 commented on a change in pull request #18597: [CI][v1.6.x] Fix failing CI pipelines
mseth10 commented on a change in pull request #18597: URL: https://github.com/apache/incubator-mxnet/pull/18597#discussion_r448019369 ## File path: ci/docker/Dockerfile.build.jetson ## @@ -78,5 +80,8 @@ ARG GROUP_ID=0 COPY install/ubuntu_adduser.sh /work/ RUN /work/ubuntu_adduser.sh +COPY install/requirements /work/ +RUN python -m pip install -r /work/requirements Review comment: Why do we not use pip3 to install requirements? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] ChaiBapchya commented on pull request #18597: [CI][v1.6.x] Fix failing CI pipelines
ChaiBapchya commented on pull request #18597: URL: https://github.com/apache/incubator-mxnet/pull/18597#issuecomment-652077634 @mxnet-label-bot add [pr-awaiting-review] This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] ChaiBapchya commented on pull request #18597: [CI][v1.6.x] Fix failing CI pipelines
ChaiBapchya commented on pull request #18597: URL: https://github.com/apache/incubator-mxnet/pull/18597#issuecomment-652077425 @leezu @sandeep-krishnamurthy @PatricZhao @szha Please review. This unblocks https://github.com/apache/incubator-mxnet/pull/18632 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18597: [CI][v1.6.x] Fix failing CI pipelines
mxnet-bot commented on pull request #18597: URL: https://github.com/apache/incubator-mxnet/pull/18597#issuecomment-65207 Jenkins CI successfully triggered : [unix-gpu] This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] ChaiBapchya commented on pull request #18597: [CI][v1.6.x] Fix failing CI pipelines
ChaiBapchya commented on pull request #18597: URL: https://github.com/apache/incubator-mxnet/pull/18597#issuecomment-652074410 @mxnet-bot run ci [unix-gpu] This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] ChaiBapchya commented on issue #14994: Flaky test: test_lstm_clip
ChaiBapchya commented on issue #14994: URL: https://github.com/apache/incubator-mxnet/issues/14994#issuecomment-652074321 PR #18597 http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-gpu/detail/PR-18597/9/pipeline This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] stu1130 opened a new issue #18646: BatchNorm with axis=-1 is much slower than axis=1
stu1130 opened a new issue #18646: URL: https://github.com/apache/incubator-mxnet/issues/18646

## Description

```
import mxnet as mx
from mxnet import autograd, np, npx, gluon, init
from mxnet.gluon import nn
import time

npx.set_np()
data = mx.np.random.uniform(size=(32, 100, 100), ctx=mx.gpu())
label = mx.np.ones((32, 100, 100), ctx=mx.gpu())
net = nn.Sequential()
net.add(nn.BatchNorm(axis=-1))
net.initialize(init.Xavier(), ctx=mx.gpu())
loss = gluon.loss.L2Loss()
t = time.time()
for _ in range(5000):
    with autograd.record():
        l = loss(net(data), label)
    l.backward()
mx.nd.waitall()
print('spent: {}s'.format(time.time() - t))
```

I got around 5 sec with axis=1 and 30 sec with axis=-1.

## Solution

Thanks to @ptrendx for pointing it out: cudnn 7.4 (https://docs.nvidia.com/deeplearning/sdk/cudnn-release-notes/rel_7xx.html#rel_741) added a new cudnnBatchNormalization*Ex API that gives much better speed for axis = -1

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] ChaiBapchya commented on pull request #18597: [CI][v1.6.x] Fix failing CI pipelines
ChaiBapchya commented on pull request #18597: URL: https://github.com/apache/incubator-mxnet/pull/18597#issuecomment-651996528 Found the root cause of this issue:

| Function | Dockerfile | Base Docker Image | Python |
|---|---|---|---|
| build_jetson | ci/docker/Dockerfile.build.jetson | FROM nvidia/cuda:10.0-cudnn7-devel-ubuntu18.04 | ✖︎ |
| build_armv6 | ci/docker/Dockerfile.build.armv6 | FROM dockcross/linux-armv6 | 2.7 |
| build_armv7 | ci/docker/Dockerfile.build.armv7 | FROM dockcross/linux-armv7 | 2.7 |
| build_armv8 | ci/docker/Dockerfile.build.armv8 | FROM dockcross/linux-armv64 | 2.7 |

All `dockcross/linux-arm*` images use python2 as the default python and come with setuptools installed. However, the nvidia/cuda docker image doesn't include python and, as a result, doesn't have setuptools. Hence we need to explicitly install python2 and setuptools.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
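Editorial sketch related to the root cause above: a small check that could be run with whichever interpreter an image provides, to see whether the modules a build step needs are importable. The helper name `missing_modules` and the assumption that `setuptools` is the only required module are illustrative; this script is not part of the MXNet CI.

```python
import importlib
import sys


def missing_modules(names):
    """Return the names from `names` that the running interpreter cannot import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumption: the wheel build only needs setuptools.
    required = ["setuptools"]
    print("python " + sys.version.split()[0])
    absent = missing_modules(required)
    if absent:
        print("missing: " + ", ".join(absent))
    else:
        print("all required modules found")
```

The helper uses only `importlib.import_module` and `ImportError`, so it should behave the same under Python 2.7 and Python 3.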
[incubator-mxnet-site] branch asf-site updated: Publish triggered by CI
This is an automated email from the ASF dual-hosted git repository. aaronmarkham pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git The following commit(s) were added to refs/heads/asf-site by this push: new 6e48c7a Publish triggered by CI 6e48c7a is described below commit 6e48c7a15dfe80735eea36e174c3d0428fa37c5b Author: mxnet-ci AuthorDate: Tue Jun 30 18:41:19 2020 + Publish triggered by CI --- date.txt | 1 - feed.xml | 2 +- 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/date.txt b/date.txt deleted file mode 100644 index a6d3fd5..000 --- a/date.txt +++ /dev/null @@ -1 +0,0 @@ -Tue Jun 30 12:40:54 UTC 2020 diff --git a/feed.xml b/feed.xml index ab5dbbe5..bc2cea4 100644 --- a/feed.xml +++ b/feed.xml @@ -1 +1 @@ -http://www.w3.org/2005/Atom; >https://jekyllrb.com/; version="4.0.0">Jekyllhttps://mxnet.apache.org/feed.xml; rel="self" type="application/atom+xml" />https://mxnet.apache.org/; rel="alternate" type="text/html" />2020-06-30T12:30:10+00:00https://mxnet.apache.org/feed.xmlApache MXNetA flexible and efficient library for deep [...] \ No newline at end of file +http://www.w3.org/2005/Atom; >https://jekyllrb.com/; version="4.0.0">Jekyllhttps://mxnet.apache.org/feed.xml; rel="self" type="application/atom+xml" />https://mxnet.apache.org/; rel="alternate" type="text/html" />2020-06-30T18:30:39+00:00https://mxnet.apache.org/feed.xmlApache MXNetA flexible and efficient library for deep [...] \ No newline at end of file
[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.
This is an automated email from the ASF dual-hosted git repository. aaronmarkham pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git The following commit(s) were added to refs/heads/asf-site by this push: new 71979c0 Bump the publish timestamp. 71979c0 is described below commit 71979c0ba3e391c59f5b9b5a6687f63013e056b9 Author: mxnet-ci AuthorDate: Tue Jun 30 18:41:30 2020 + Bump the publish timestamp. --- date.txt | 1 + 1 file changed, 1 insertion(+) diff --git a/date.txt b/date.txt new file mode 100644 index 000..0bf7c63 --- /dev/null +++ b/date.txt @@ -0,0 +1 @@ +Tue Jun 30 18:41:30 UTC 2020
[GitHub] [incubator-mxnet] xidulu commented on issue #18638: mx.np.broadcast_to has undocumented features
xidulu commented on issue #18638: URL: https://github.com/apache/incubator-mxnet/issues/18638#issuecomment-651856471 I believe the reason behind this is that we would like to keep the documentation and features consistent with the original NumPy, so this magic API is not supposed to be known by users. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] wkcn commented on a change in pull request #18644: Fix BatchNorm backward synchronization
wkcn commented on a change in pull request #18644: URL: https://github.com/apache/incubator-mxnet/pull/18644#discussion_r447749228 ## File path: tests/python/unittest/test_gluon.py ## @@ -665,6 +665,34 @@ def transpose(shape): assert (layer(x).shape==ceil_out_shape) +@with_seed() +@pytest.mark.parametrize('cudnn_off', [True, False]) +@pytest.mark.parametrize('variable', ['running_var', 'running_mean']) +def test_batchnorm_backward_synchronization(cudnn_off, variable): +""" +Tests if synchronization of BatchNorm running variables is done correctly. +If not, the test sometimes fails - depending on the timing. +""" +ctx = mx.cpu() if cudnn_off else mx.gpu() Review comment: It should be `ctx = mx.test_utils.default_context()`. In CI, there are tests on CPU and GPU. We don't need to specify the context. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] wkcn commented on a change in pull request #18644: Fix BatchNorm backward synchronization
wkcn commented on a change in pull request #18644: URL: https://github.com/apache/incubator-mxnet/pull/18644#discussion_r447749228 ## File path: tests/python/unittest/test_gluon.py ## @@ -665,6 +665,34 @@ def transpose(shape): assert (layer(x).shape==ceil_out_shape) +@with_seed() +@pytest.mark.parametrize('cudnn_off', [True, False]) +@pytest.mark.parametrize('variable', ['running_var', 'running_mean']) +def test_batchnorm_backward_synchronization(cudnn_off, variable): +""" +Tests if synchronization of BatchNorm running variables is done correctly. +If not, the test sometimes fails - depending on the timing. +""" +ctx = mx.cpu() if cudnn_off else mx.gpu() Review comment: It should be `ctx = default_context()`. In CI, there are tests on CPU and GPU. We don't need to specify the context. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] wkcn commented on a change in pull request #18644: Fix BatchNorm backward synchronization
wkcn commented on a change in pull request #18644: URL: https://github.com/apache/incubator-mxnet/pull/18644#discussion_r447749228 ## File path: tests/python/unittest/test_gluon.py ## @@ -665,6 +665,34 @@ def transpose(shape): assert (layer(x).shape==ceil_out_shape) +@with_seed() +@pytest.mark.parametrize('cudnn_off', [True, False]) +@pytest.mark.parametrize('variable', ['running_var', 'running_mean']) +def test_batchnorm_backward_synchronization(cudnn_off, variable): +""" +Tests if synchronization of BatchNorm running variables is done correctly. +If not, the test sometimes fails - depending on the timing. +""" +ctx = mx.cpu() if cudnn_off else mx.gpu() Review comment: It should be `ctx=default_context()`. In CI, there are tests on CPU and GPU. We don't need to specify the context. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] wkcn commented on a change in pull request #18644: Fix BatchNorm backward synchronization
wkcn commented on a change in pull request #18644: URL: https://github.com/apache/incubator-mxnet/pull/18644#discussion_r447742251

## File path: tests/python/unittest/test_gluon.py ##
@@ -665,6 +665,34 @@ def transpose(shape):
     assert (layer(x).shape==ceil_out_shape)
+@with_seed()
+@pytest.mark.parametrize('cudnn_off', [True, False])
+@pytest.mark.parametrize('variable', ['running_var', 'running_mean'])
+def test_batchnorm_backward_synchronization(cudnn_off, variable):
+    """
+    Tests if synchronization of BatchNorm running variables is done correctly.
+    If not, the test sometimes fails - depending on the timing.
+    """
+    ctx = mx.cpu() if cudnn_off else mx.gpu()
+    read_op = 'layer.' + variable + '.data().asnumpy()'
+
+    for _ in range(20):
+        layer = nn.BatchNorm()
+        layer.initialize(ctx=ctx)
+        for _ in range(3):
+            data = mx.nd.random.normal(loc=10, scale=2, shape=(1, 3, 10, 10), ctx=ctx)
+            with mx.autograd.record():
+                out = layer(data)
+            out.backward()
+
+        # check if each read give the same value
+        var1 = eval(read_op)

Review comment: Thank you for the contribution! For safety, I suggest using `var1 = getattr(layer, variable).data().asnumpy()`.

## File path: tests/python/unittest/test_gluon.py ##
@@ -665,6 +665,34 @@ def transpose(shape):
     assert (layer(x).shape==ceil_out_shape)
+@with_seed()
+@pytest.mark.parametrize('cudnn_off', [True, False])
+@pytest.mark.parametrize('variable', ['running_var', 'running_mean'])
+def test_batchnorm_backward_synchronization(cudnn_off, variable):
+    """
+    Tests if synchronization of BatchNorm running variables is done correctly.
+    If not, the test sometimes fails - depending on the timing.
+    """
+    ctx = mx.cpu() if cudnn_off else mx.gpu()
+    read_op = 'layer.' + variable + '.data().asnumpy()'
+
+    for _ in range(20):
+        layer = nn.BatchNorm()
+        layer.initialize(ctx=ctx)
+        for _ in range(3):
+            data = mx.nd.random.normal(loc=10, scale=2, shape=(1, 3, 10, 10), ctx=ctx)
+            with mx.autograd.record():
+                out = layer(data)
+            out.backward()
+
+        # check if each read give the same value
+        var1 = eval(read_op)
+        for _ in range(10):
+            var2 = eval(read_op)

Review comment: `var2 = getattr(layer, variable).data().asnumpy()`.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
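Editorial sketch of the review suggestion above, comparing `eval` on a constructed string with `getattr`. The `Param` and `Layer` classes are hypothetical stand-ins that only mimic the `.data()` call; they are not Gluon classes.

```python
class Param:
    """Hypothetical stand-in for a Gluon parameter; only mimics .data()."""

    def __init__(self, value):
        self._value = value

    def data(self):
        return self._value


class Layer:
    """Hypothetical stand-in for a BatchNorm layer with two running statistics."""

    def __init__(self):
        self.running_mean = Param([0.0, 0.0])
        self.running_var = Param([1.0, 1.0])


def read_with_eval(layer, variable):
    # Builds source text and evaluates it; whatever `variable` contains is executed.
    return eval('layer.' + variable + '.data()')


def read_with_getattr(layer, variable):
    # Plain attribute lookup; an unknown name raises AttributeError and runs no code.
    return getattr(layer, variable).data()


layer = Layer()
for name in ('running_var', 'running_mean'):
    assert read_with_eval(layer, name) == read_with_getattr(layer, name)
```

Both forms read the same attribute for the two parametrized names; `getattr` simply avoids evaluating a string as code.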
[GitHub] [incubator-mxnet] wkcn commented on issue #18643: ndarray.contrib.boolean_mask can not be hybridize
wkcn commented on issue #18643: URL: https://github.com/apache/incubator-mxnet/issues/18643#issuecomment-651837537 Hi @DongfeiJi, it works for me on MXNet 2.0. Note that `boolean_mask` doesn't work when the mask is all `zero/false`, since the traditional operator doesn't support zero-size arrays.

```python
import mxnet as mx
from mxnet import gluon
from mxnet.gluon.loss import Loss, _apply_weighting


class NewTripletLoss(Loss):
    def __init__(self, batch_size_per_gpu, margin=1, weight=None, batch_axis=0, **kwargs):
        super(NewTripletLoss, self).__init__(weight, batch_axis, **kwargs)
        self.batch_size_per_gpu = batch_size_per_gpu
        self.margin = margin

    def hybrid_forward(self, F, embeddings, labels, sample_weight=None):
        N = self.batch_size_per_gpu
        # get distance
        xx = F.power(embeddings, 2).sum(1, keepdims=True).tile((1, self.batch_size_per_gpu))
        dist = F.broadcast_add(xx, xx.transpose())
        dist = F.broadcast_sub(dist, 2 * F.dot(embeddings, embeddings.transpose()))
        dist = F.clip(dist, 1e-12, 1e12)
        # get mask
        labels = F.cast(labels, dtype='float32')
        labels = labels.expand_dims(1).tile((1, self.batch_size_per_gpu))
        is_pos = F.broadcast_equal(labels, labels.transpose())
        is_neg = F.broadcast_not_equal(labels, labels.transpose())
        # hard example mining
        dist_mat = dist.reshape((self.batch_size_per_gpu * self.batch_size_per_gpu,))
        pos_mask = is_pos.reshape((self.batch_size_per_gpu * self.batch_size_per_gpu,))
        dist_ap = F.contrib.boolean_mask(dist_mat, pos_mask).reshape((self.batch_size_per_gpu, -1))
        #dist_ap = F.broadcast_mul(dist_mat, pos_mask).reshape((self.batch_size_per_gpu, -1))
        dist_ap = F.max(dist_ap, axis=1)
        neg_mask = is_neg.reshape((self.batch_size_per_gpu * self.batch_size_per_gpu,))
        dist_an = F.contrib.boolean_mask(dist_mat, neg_mask).reshape((self.batch_size_per_gpu, -1))
        #dist_an = F.broadcast_mul(dist_mat, neg_mask).reshape((self.batch_size_per_gpu, -1))
        dist_an = F.min(dist_an, axis=1)
        # add margin
        margin = F.full(shape=(self.batch_size_per_gpu, 1), val=self.margin)
        loss = F.broadcast_add(F.broadcast_sub(dist_ap, dist_an), margin)
        loss = F.maximum(loss, F.zeros_like(loss))
        # apply weight
        loss = _apply_weighting(F, loss, self._weight, sample_weight)
        return F.mean(loss, axis=self._batch_axis, exclude=True)


block = NewTripletLoss(2)
block.hybridize()
embeddings = mx.nd.array([[1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]).reshape((2,3))
embeddings.attach_grad()
labels = mx.nd.array([0, 1]).reshape((2, ))
with mx.autograd.record():
    out = block(embeddings, labels)
out.sum().backward()
print(out)
mx.nd.waitall()
```

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
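Editorial sketch of the all-false caveat mentioned in the comment above, in plain Python. The `boolean_mask` function here is a hypothetical list-based stand-in, not `F.contrib.boolean_mask`, and the distance values are made up. It only shows that the output length of a boolean mask depends on the mask contents, and that an all-false mask leaves nothing for a later reduction to work on.

```python
def boolean_mask(data, mask):
    """Keep the entries of `data` whose mask value is truthy (list-based stand-in)."""
    return [value for value, keep in zip(data, mask) if keep]


dist = [0.0, 2.0, 2.0, 0.0]   # flattened 2x2 distance matrix (illustrative values)
pos_mask = [1, 0, 0, 1]       # pairs with equal labels
neg_mask = [0, 1, 1, 0]       # pairs with different labels
all_false = [0, 0, 0, 0]      # e.g. no negative pairs when every label is the same

positives = boolean_mask(dist, pos_mask)   # length depends on the mask contents
negatives = boolean_mask(dist, neg_mask)
empty = boolean_mask(dist, all_false)      # zero-size result

try:
    min(empty)
    reduction_failed = False
except ValueError:
    # Python's min() rejects an empty sequence; the comment above reports an
    # analogous limitation for zero-size arrays in the traditional operator.
    reduction_failed = True
```

In a triplet loss this case can come up whenever a batch contains a single label, so every entry of the negative mask is false.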
[GitHub] [incubator-mxnet] Yiyan66 commented on pull request #18642: [numpy] Fix less/greater bug with scalar input
Yiyan66 commented on pull request #18642: URL: https://github.com/apache/incubator-mxnet/pull/18642#issuecomment-651801224 @mxnet-bot run ci [centos-cpu, unix-cpu] This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18642: [numpy] Fix less/greater bug with scalar input
mxnet-bot commented on pull request #18642: URL: https://github.com/apache/incubator-mxnet/pull/18642#issuecomment-651801318 Jenkins CI successfully triggered : [centos-cpu, unix-cpu] This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[incubator-mxnet-site] branch asf-site updated: Publish triggered by CI
This is an automated email from the ASF dual-hosted git repository. aaronmarkham pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git The following commit(s) were added to refs/heads/asf-site by this push: new 3ef65df Publish triggered by CI 3ef65df is described below commit 3ef65df85f06fc473e2059dfea18b9142465c9f7 Author: mxnet-ci AuthorDate: Tue Jun 30 12:40:44 2020 + Publish triggered by CI --- date.txt | 1 - feed.xml | 2 +- 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/date.txt b/date.txt deleted file mode 100644 index 2591340..000 --- a/date.txt +++ /dev/null @@ -1 +0,0 @@ -Tue Jun 30 06:47:10 UTC 2020 diff --git a/feed.xml b/feed.xml index d2cb44f..ab5dbbe5 100644 --- a/feed.xml +++ b/feed.xml @@ -1 +1 @@ -http://www.w3.org/2005/Atom; >https://jekyllrb.com/; version="4.0.0">Jekyllhttps://mxnet.apache.org/feed.xml; rel="self" type="application/atom+xml" />https://mxnet.apache.org/; rel="alternate" type="text/html" />2020-06-30T06:36:18+00:00https://mxnet.apache.org/feed.xmlApache MXNetA flexible and efficient library for deep [...] \ No newline at end of file +http://www.w3.org/2005/Atom; >https://jekyllrb.com/; version="4.0.0">Jekyllhttps://mxnet.apache.org/feed.xml; rel="self" type="application/atom+xml" />https://mxnet.apache.org/; rel="alternate" type="text/html" />2020-06-30T12:30:10+00:00https://mxnet.apache.org/feed.xmlApache MXNetA flexible and efficient library for deep [...] \ No newline at end of file
[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.
This is an automated email from the ASF dual-hosted git repository. aaronmarkham pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git The following commit(s) were added to refs/heads/asf-site by this push: new 25e1fa1 Bump the publish timestamp. 25e1fa1 is described below commit 25e1fa1438eebd3847ef1efb62d147e00cf30964 Author: mxnet-ci AuthorDate: Tue Jun 30 12:40:54 2020 + Bump the publish timestamp. --- date.txt | 1 + 1 file changed, 1 insertion(+) diff --git a/date.txt b/date.txt new file mode 100644 index 000..a6d3fd5 --- /dev/null +++ b/date.txt @@ -0,0 +1 @@ +Tue Jun 30 12:40:54 UTC 2020
[GitHub] [incubator-mxnet] anko-intel opened a new pull request #18644: Fix BatchNorm backward synchronization
anko-intel opened a new pull request #18644: URL: https://github.com/apache/incubator-mxnet/pull/18644

## Description ##
Fix the issue #18610 - synchronization problem with running variables for backward pass of BatchNorm

## Checklist ##
### Essentials ###
- [ ] Changes are complete (i.e. I finished coding on this PR)
- [ ] All changes have test coverage:
- Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
- [ ] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18644: Fix BatchNorm backward synchronization
mxnet-bot commented on pull request #18644: URL: https://github.com/apache/incubator-mxnet/pull/18644#issuecomment-651723999 Hey @anko-intel , Thanks for submitting the PR All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands: - To trigger all jobs: @mxnet-bot run ci [all] - To trigger specific jobs: @mxnet-bot run ci [job1, job2] *** **CI supported jobs**: [clang, centos-gpu, sanity, centos-cpu, miscellaneous, windows-cpu, unix-cpu, windows-gpu, edge, website, unix-gpu] *** _Note_: Only following 3 categories can trigger CI :PR Author, MXNet Committer, Jenkins Admin. All CI tests must pass before the PR can be merged. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] DongfeiJi commented on issue #18643: ndarray.contrib.boolean_mask can not be hybridize
DongfeiJi commented on issue #18643: URL: https://github.com/apache/incubator-mxnet/issues/18643#issuecomment-651723106 This problem will occur when this operation is used in loss This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] DongfeiJi commented on issue #18643: ndarray.contrib.boolean_mask can not be hybridize
DongfeiJi commented on issue #18643:
URL: https://github.com/apache/incubator-mxnet/issues/18643#issuecomment-651722037

> provide a minimal reproduce examp
[GitHub] [incubator-mxnet] DongfeiJi closed issue #18643: ndarray.contrib.boolean_mask can not be hybridize
DongfeiJi closed issue #18643:
URL: https://github.com/apache/incubator-mxnet/issues/18643
[GitHub] [incubator-mxnet] DongfeiJi removed a comment on issue #18643: ndarray.contrib.boolean_mask can not be hybridize
DongfeiJi removed a comment on issue #18643:
URL: https://github.com/apache/incubator-mxnet/issues/18643#issuecomment-651722037

> provide a minimal reproduce examp
[GitHub] [incubator-mxnet] DongfeiJi commented on issue #18643: ndarray.contrib.boolean_mask can not be hybridize
DongfeiJi commented on issue #18643:
URL: https://github.com/apache/incubator-mxnet/issues/18643#issuecomment-651721010

> Could you provide a minimal reproduce example?
>
> In MXNet unittest, The test of hybridized `boolean_mask` works.
> https://github.com/apache/incubator-mxnet/blob/master/tests/python/unittest/test_dynamic_shape.py#L39

---
Thank you for your reply. Yes, I read that code before. But in my scenario I use the operator in the loss function: with broadcast_mul the block can be hybridized, but with boolean_mask it does not work.

    class NewTripletLoss(Loss):
        def __init__(self, batch_size_per_gpu, margin=1, weight=None, batch_axis=0, **kwargs):
            super(NewTripletLoss, self).__init__(weight, batch_axis, **kwargs)
            self.batch_size_per_gpu = batch_size_per_gpu
            self.margin = margin

        def hybrid_forward(self, F, embeddings, labels, sample_weight=None):
            N = self.batch_size_per_gpu
            # get distance
            xx = F.power(embeddings, 2).sum(1, keepdims=True).tile((1, self.batch_size_per_gpu))
            dist = F.broadcast_add(xx, xx.transpose())
            dist = F.broadcast_sub(dist, 2 * F.dot(embeddings, embeddings.transpose()))
            dist = F.clip(dist, 1e-12, 1e12)
            # get mask
            labels = F.cast(labels, dtype='float32')
            labels = labels.expand_dims(1).tile((1, self.batch_size_per_gpu))
            is_pos = F.broadcast_equal(labels, labels.transpose())
            is_neg = F.broadcast_not_equal(labels, labels.transpose())
            # hard example mining
            dist_mat = dist.reshape((self.batch_size_per_gpu * self.batch_size_per_gpu,))
            pos_mask = is_pos.reshape((self.batch_size_per_gpu * self.batch_size_per_gpu,))
            dist_ap = F.contrib.boolean_mask(dist_mat, pos_mask).reshape((self.batch_size_per_gpu, -1))
            # dist_ap = F.broadcast_mul(dist_mat, pos_mask).reshape((self.batch_size_per_gpu, -1))
            dist_ap = F.max(dist_ap, axis=1)
            neg_mask = is_neg.reshape((self.batch_size_per_gpu * self.batch_size_per_gpu,))
            dist_an = F.contrib.boolean_mask(dist_mat, neg_mask).reshape((self.batch_size_per_gpu, -1))
            # dist_an = F.broadcast_mul(dist_mat, neg_mask).reshape((self.batch_size_per_gpu, -1))
            dist_an = F.min(dist_an, axis=1)
            # add margin
            margin = F.full(shape=(self.batch_size_per_gpu, 1), val=self.margin)
            loss = F.broadcast_add(F.broadcast_sub(dist_ap, dist_an), margin)
            loss = F.maximum(loss, F.zeros_like(loss))
            # apply weight
            loss = _apply_weighting(F, loss, self._weight, sample_weight)
            return F.mean(loss, axis=self._batch_axis, exclude=True)
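The contrast reported in this comment (broadcast_mul hybridizes, boolean_mask does not) can be pictured without MXNet: boolean masking returns as many elements as the mask has true entries, so its output shape depends on the data, whereas multiplying by a mask keeps the input shape. The following is a minimal pure-Python sketch of that difference, not MXNet code. The helper names `boolean_mask`, `masked_max` and `masked_min` are hypothetical, and the fixed-shape max/min shown here is only one common way to do hard-example mining, not necessarily what the issue author ended up using.

```python
import math


def boolean_mask(values, mask):
    """Keep entries whose mask is true; the output length depends on the mask."""
    return [v for v, m in zip(values, mask) if m]


def masked_max(row, mask):
    """Maximum over masked-in entries, computed on a fixed-length row."""
    return max(v if m else -math.inf for v, m in zip(row, mask))


def masked_min(row, mask):
    """Minimum over masked-in entries, computed on a fixed-length row."""
    return min(v if m else math.inf for v, m in zip(row, mask))


# Toy pairwise distance matrix for three samples with labels 0, 0, 1.
dist = [[0.0, 1.0, 4.0],
        [1.0, 0.0, 9.0],
        [4.0, 9.0, 0.0]]
labels = [0, 0, 1]
is_pos = [[a == b for b in labels] for a in labels]
is_neg = [[a != b for b in labels] for a in labels]

# Boolean masking: the number of kept entries differs from row to row.
kept_per_row = [len(boolean_mask(r, m)) for r, m in zip(dist, is_pos)]

# Fixed-shape alternative: every row keeps length 3, masked-out entries
# are replaced by -inf (for max) or +inf (for min) before reducing.
dist_ap = [masked_max(r, m) for r, m in zip(dist, is_pos)]
dist_an = [masked_min(r, m) for r, m in zip(dist, is_neg)]
```

In this toy batch the rows keep 2, 2 and 1 positives, which also shows why reshaping the masked result to `(batch_size, -1)` only works when every label occurs equally often.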
[GitHub] [incubator-mxnet] wkcn commented on issue #18643: ndarray.contrib.boolean_mask can not be hybridize
wkcn commented on issue #18643:
URL: https://github.com/apache/incubator-mxnet/issues/18643#issuecomment-651715706

Could you provide a minimal reproduce example?

In MXNet unittest, The test of hybridized `boolean_mask` works.
https://github.com/apache/incubator-mxnet/blob/master/tests/python/unittest/test_dynamic_shape.py#L39
[GitHub] [incubator-mxnet] DongfeiJi opened a new issue #18643: ndarray.contrib.boolean_mask can not be hybridize
DongfeiJi opened a new issue #18643:
URL: https://github.com/apache/incubator-mxnet/issues/18643

## Description
`ndarray.contrib.boolean_mask` cannot be hybridized. Gluon does not support shape inference for `ndarray.contrib.boolean_mask`.
[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18642: [numpy] Fix less/greater bug with scalar input
mxnet-bot commented on pull request #18642:
URL: https://github.com/apache/incubator-mxnet/pull/18642#issuecomment-651677650

Hey @Yiyan66, thanks for submitting the PR.
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:
- To trigger all jobs: @mxnet-bot run ci [all]
- To trigger specific jobs: @mxnet-bot run ci [job1, job2]
***
**CI supported jobs**: [unix-cpu, miscellaneous, unix-gpu, centos-gpu, windows-gpu, edge, windows-cpu, clang, centos-cpu, sanity, website]
***
_Note_: Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.
[GitHub] [incubator-mxnet] Yiyan66 opened a new pull request #18642: [numpy] Fix less/greater bug with scalar input
Yiyan66 opened a new pull request #18642:
URL: https://github.com/apache/incubator-mxnet/pull/18642

## Description ##
1. FFI greater
2. Fix a bug in ufunc op: less/greater/less_equal/greater_equal described in #18594

## Checklist ##
### Essentials ###
Please feel free to remove inapplicable items for your PR.
- [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) created (except PRs with tiny changes)
- [ ] Changes are complete (i.e. I finished coding on this PR)
- [ ] All changes have test coverage:
  - Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  - Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  - Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
- [ ] Code is well-documented:
  - For user-facing API changes, API doc string has been updated.
  - For new C++ functions in header files, their functionalities and arguments are documented.
  - For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  - Check the API doc at https://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
- [ ] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

### Changes ###
- [ ] Feature1, tests, (and when applicable, API doc)
- [ ] Feature2, tests, (and when applicable, API doc)

## Comments ##
- If this change is a backward incompatible change, why must this change be made.
- Interesting edge cases to note here
[GitHub] [incubator-mxnet] ciyongch commented on issue #18600: Multiple numpy tests fail with numpy 1.19
ciyongch commented on issue #18600:
URL: https://github.com/apache/incubator-mxnet/issues/18600#issuecomment-651638346

As discussed [here](https://github.com/apache/incubator-mxnet/issues/18641#issuecomment-651487181), we'll mark the numpy operators as experimental in the v1.7 release and will probably move forward by treating this as a known issue (some cases are broken with the latest numpy version, 1.19.0). Please let me know if you have any further concerns or suggestions!
[GitHub] [incubator-mxnet] xidulu commented on a change in pull request #18403: Gluon.probability
xidulu commented on a change in pull request #18403:
URL: https://github.com/apache/incubator-mxnet/pull/18403#discussion_r447495416

## File path: python/mxnet/gluon/probability/block/stochastic_block.py ##
@@ -0,0 +1,127 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# coding: utf-8
+# pylint: disable=abstract-method
+"""Stochastic block class."""
+__all__ = ['StochasticBlock', 'StochasticSequential']
+
+from functools import wraps
+from ...block import HybridBlock
+from ...utils import _indent
+
+
+class StochasticBlock(HybridBlock):
+    """`StochasticBlock` extends `HybridBlock` to support accumulating loss
+    in the forward phase, which is extremely useful in building Bayesian Neural Network,
+    where the loss function is composed of a classification loss and a KL loss.
+
+    """
+
+    def __init__(self, **kwargs):
+        super(StochasticBlock, self).__init__(**kwargs)
+        self._losses = []
+        self._losscache = []
+
+    def add_loss(self, loss):
+        self._losscache.append(loss)
+
+    @staticmethod
+    def collectLoss(func):
+        """To accumulate loss during the forward phase, one could first decorate
+        hybrid_forward with `StochasticBlock.collectLoss,
+        and then collect the loss tensor `x` by calling self.add_loss(x).
+        For example, in the following forward function,
+        we generate samples from a Gaussian parameterized by `loc` and `scale` and
+        accumulate the KL-divergence between it and its prior into the block's loss storage.:
+        @StochasticBlock.collectLoss
+        def hybrid_forward(self, F, loc, scale):
+            qz = mgp.Normal(loc, scale)
+            # prior
+            pz = mgp.Normal(F.np.zeros_like(loc), F.np.ones_like(scale))
+            self.add_loss(mgp.kl_divergence(qz, pz))
+            return qz.sample()
+        """
+        @wraps(func)
+        def inner(self, *args, **kwargs):
+            # Loss from hybrid_forward
+            func_out = func(self, *args, **kwargs)
+            collected_loss = self._losscache
+            self._losscache = []
+            return (func_out, collected_loss)
+
+        return inner
+
+    def __call__(self, *args, **kwargs):
+        # pylint: disable=arguments-differ
+        out = super().__call__(*args, **kwargs)
+        self._losses.extend(out[1])
+        return out[0]

Review comment:
@leezu I added two checks here:
https://github.com/apache/incubator-mxnet/pull/18403/files#diff-85458cf5116b137da8148bf5b38bcfaeR74
https://github.com/apache/incubator-mxnet/pull/18403/files#diff-85458cf5116b137da8148bf5b38bcfaeR78

To make it clearer, here are the possible situations:
1. Users call add_loss inside a function decorated by CollectLoss: add_loss appends losses to _losscache, _losscache is then cleared in CollectLoss, and len(_losscache) is 0 when __call__ is invoked.
2. Users call add_loss without using CollectLoss: add_loss appends losses to _losscache, _losscache still contains values when entering __call__, and in this case an exception is raised.
3. Users use CollectLoss without calling add_loss: self._losses = out[1] = [].
4. Users use StochasticBlock without calling CollectLoss or add_loss: len(out) == 1, so out[1] is not accessed.
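The four situations listed in this review comment can be exercised with a small stand-alone sketch. It does not use MXNet: `Block` is a stand-in for `HybridBlock`, `StochasticBlockSketch` and `collect_loss` are hypothetical names, and the `_decorated_ran` flag is an assumption of this sketch (the comment itself describes checking `len(out)`), chosen so that a forward function that happens to return a tuple is not misread.

```python
from functools import wraps


class Block:
    """Stand-in for HybridBlock: __call__ simply dispatches to forward."""

    def __call__(self, *args, **kwargs):
        return self.forward(*args, **kwargs)


class StochasticBlockSketch(Block):
    def __init__(self):
        self._losses = []
        self._losscache = []
        self._decorated_ran = False  # assumption: marks that collect_loss wrapped forward

    def add_loss(self, loss):
        self._losscache.append(loss)

    @staticmethod
    def collect_loss(func):
        @wraps(func)
        def inner(self, *args, **kwargs):
            func_out = func(self, *args, **kwargs)
            collected = self._losscache
            self._losscache = []  # situation 1: the cache is emptied here
            self._decorated_ran = True
            return func_out, collected
        return inner

    def __call__(self, *args, **kwargs):
        self._decorated_ran = False
        out = super().__call__(*args, **kwargs)
        if self._losscache:
            # Situation 2: add_loss was used but forward was not decorated.
            self._losscache = []
            raise ValueError("add_loss requires forward to be decorated with collect_loss")
        if not self._decorated_ran:
            return out  # situation 4: nothing to unpack
        self._losses.extend(out[1])  # situation 3 extends by an empty list
        return out[0]


class Decorated(StochasticBlockSketch):
    @StochasticBlockSketch.collect_loss
    def forward(self, x):
        self.add_loss(x * 0.5)
        return x + 1


class Undecorated(StochasticBlockSketch):
    def forward(self, x):
        self.add_loss(x)
        return x


class DecoratedNoLoss(StochasticBlockSketch):
    @StochasticBlockSketch.collect_loss
    def forward(self, x):
        return x


class Plain(StochasticBlockSketch):
    def forward(self, x):
        return x * 2
```

Calling `Decorated()(1)` returns the forward output and stores the loss in `_losses`, while `Undecorated()(1)` raises because the cache is still populated when control returns to `__call__`.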
[GitHub] [incubator-mxnet] ma-hei commented on pull request #18445: updating ubuntu_cpu base image to 20.04 to observe failing tests regarding Python 3.8
ma-hei commented on pull request #18445:
URL: https://github.com/apache/incubator-mxnet/pull/18445#issuecomment-651616085

Here's what's going on with onnx 1.7: https://github.com/onnx/onnx/issues/2865
We just need to use the newer way of instantiating a Pad node.
[GitHub] [incubator-mxnet] bgawrych removed a comment on pull request #18602: Fix softmax, logsoftmax failed on empty ndarray
bgawrych removed a comment on pull request #18602:
URL: https://github.com/apache/incubator-mxnet/pull/18602#issuecomment-651597245

@mxnet-bot run ci [codecov/project]
[GitHub] [incubator-mxnet] bgawrych commented on pull request #18602: Fix softmax, logsoftmax failed on empty ndarray
bgawrych commented on pull request #18602:
URL: https://github.com/apache/incubator-mxnet/pull/18602#issuecomment-651597245

@mxnet-bot run ci [codecov/project]
[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18602: Fix softmax, logsoftmax failed on empty ndarray
mxnet-bot commented on pull request #18602:
URL: https://github.com/apache/incubator-mxnet/pull/18602#issuecomment-651597278

None of the jobs entered are supported.
Jobs entered by user: [codecov/project]
CI supported Jobs: [website, unix-cpu, centos-gpu, miscellaneous, edge, windows-gpu, centos-cpu, sanity, clang, unix-gpu, windows-cpu]
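The exchange above (a `run ci` request naming `codecov/project`, answered with "None of the jobs entered are supported") suggests the bot parses the bracketed job list and checks it against a fixed set. The sketch below is only an illustration of that kind of check, not the bot's actual implementation: the function names `parse_ci_command` and `split_jobs` are hypothetical, and the job set is copied from the bot's reply.

```python
import re

# Job names as listed in the bot's reply above.
SUPPORTED_JOBS = {
    "website", "unix-cpu", "centos-gpu", "miscellaneous", "edge", "windows-gpu",
    "centos-cpu", "sanity", "clang", "unix-gpu", "windows-cpu",
}


def parse_ci_command(comment):
    """Return the job names in '@mxnet-bot run ci [...]', or None if absent."""
    match = re.search(r"@mxnet-bot\s+run\s+ci\s+\[([^\]]*)\]", comment)
    if match is None:
        return None
    return [job.strip() for job in match.group(1).split(",") if job.strip()]


def split_jobs(jobs):
    """Separate requested jobs into (supported, unsupported), keeping order."""
    if jobs == ["all"]:
        return sorted(SUPPORTED_JOBS), []
    supported = [job for job in jobs if job in SUPPORTED_JOBS]
    unsupported = [job for job in jobs if job not in SUPPORTED_JOBS]
    return supported, unsupported


requested = parse_ci_command("@mxnet-bot run ci [codecov/project]")
supported, unsupported = split_jobs(requested)
```

With the comment from this thread, `supported` is empty and `unsupported` holds `codecov/project`, which matches the reply the bot posted.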
[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.
This is an automated email from the ASF dual-hosted git repository. aaronmarkham pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git The following commit(s) were added to refs/heads/asf-site by this push: new 2113198 Bump the publish timestamp. 2113198 is described below commit 21131986623e7a113577a088f85e6c5d8f3a2ce5 Author: mxnet-ci AuthorDate: Tue Jun 30 06:47:10 2020 + Bump the publish timestamp. --- date.txt | 1 + 1 file changed, 1 insertion(+) diff --git a/date.txt b/date.txt new file mode 100644 index 000..2591340 --- /dev/null +++ b/date.txt @@ -0,0 +1 @@ +Tue Jun 30 06:47:10 UTC 2020
[incubator-mxnet-site] branch asf-site updated: Publish triggered by CI
This is an automated email from the ASF dual-hosted git repository. aaronmarkham pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git The following commit(s) were added to refs/heads/asf-site by this push: new 3e45743 Publish triggered by CI 3e45743 is described below commit 3e457430c03f3c729ab4ef7e4c3c9c37b5860aa5 Author: mxnet-ci AuthorDate: Tue Jun 30 06:47:05 2020 + Publish triggered by CI --- date.txt | 1 - feed.xml | 2 +- 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/date.txt b/date.txt deleted file mode 100644 index efcc280..000 --- a/date.txt +++ /dev/null @@ -1 +0,0 @@ -Tue Jun 30 00:42:01 UTC 2020 diff --git a/feed.xml b/feed.xml index a18cbb6..d2cb44f 100644 --- a/feed.xml +++ b/feed.xml @@ -1 +1 @@ -http://www.w3.org/2005/Atom; >https://jekyllrb.com/; version="4.0.0">Jekyllhttps://mxnet.apache.org/feed.xml; rel="self" type="application/atom+xml" />https://mxnet.apache.org/; rel="alternate" type="text/html" />2020-06-30T00:31:08+00:00https://mxnet.apache.org/feed.xmlApache MXNetA flexible and efficient library for deep [...] \ No newline at end of file +http://www.w3.org/2005/Atom; >https://jekyllrb.com/; version="4.0.0">Jekyllhttps://mxnet.apache.org/feed.xml; rel="self" type="application/atom+xml" />https://mxnet.apache.org/; rel="alternate" type="text/html" />2020-06-30T06:36:18+00:00https://mxnet.apache.org/feed.xmlApache MXNetA flexible and efficient library for deep [...] \ No newline at end of file
[GitHub] [incubator-mxnet] samskalicky commented on a change in pull request #18555: Remove check for subgraph with cycles
samskalicky commented on a change in pull request #18555:
URL: https://github.com/apache/incubator-mxnet/pull/18555#discussion_r447447347

## File path: src/operator/subgraph/build_subgraph.cc ##
@@ -306,14 +300,19 @@ void PreSelectSubgraphNodes(const nnvm::Graph& g, SubgraphSelectorV2Ptr subgraph
 const std::vector& simple_nodes,
 std::vector* subgraph_nodes) {
 std::unordered_set excluded_nodes;
+ size_t n_excluded_nodes = 0;
 const size_t max_num_retry = simple_nodes.size() * simple_nodes.size();
 size_t count = 0;
 bool success = false;
 while (!success && count < max_num_retry) {
 success = LabelSubgraph(g, subgraph_selector, label, snid, simple_nodes, subgraph_nodes, _nodes);
 if (!success) {
- CHECK(!excluded_nodes.empty());
+ if (excluded_nodes.size() == n_excluded_nodes) {

Review comment:
Do nodes have to be excluded in order to have a possible subgraph?
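The visible part of this diff replaces a hard `CHECK` on the excluded set with a comparison against the size recorded earlier. The diff is cut off right after the new condition, so what the branch actually does is not shown here. One plausible reading is a no-progress guard on the retry loop; the Python sketch below illustrates that generic pattern only. The names `label_with_retries` and `try_label` are hypothetical, and the `break` is an assumption of this sketch rather than the PR's behavior.

```python
def label_with_retries(try_label, num_nodes):
    """Retry try_label until it succeeds, the budget runs out, or no progress is made.

    try_label(excluded) returns True on success; on failure it may add
    nodes to the `excluded` set so that the next attempt differs.
    """
    excluded = set()
    n_excluded_before = 0
    max_num_retry = num_nodes * num_nodes
    count = 0
    success = False
    while not success and count < max_num_retry:
        success = try_label(excluded)
        if not success:
            if len(excluded) == n_excluded_before:
                # Assumption: with no newly excluded node, retrying would
                # repeat the same attempt, so stop instead of aborting.
                break
            n_excluded_before = len(excluded)
        count += 1
    return success, excluded


def flaky(excluded):
    """Fails twice, excluding one more node each time, then succeeds."""
    if len(excluded) < 2:
        excluded.add(len(excluded))
        return False
    return True


def stuck(excluded):
    """Always fails and never excludes anything."""
    return False
```

Here `label_with_retries(flaky, 3)` succeeds after two exclusions, while `label_with_retries(stuck, 3)` stops after the first attempt instead of using up the whole retry budget.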
[GitHub] [incubator-mxnet] ys2843 edited a comment on pull request #18487: Consolidate installation instructions on website and add disclaimer for non-ASF ressources
ys2843 edited a comment on pull request #18487:
URL: https://github.com/apache/incubator-mxnet/pull/18487#issuecomment-651574234

> The notice needs to be backported to previous versions too as people who select older versions may see installation guide that are different from the one on master.
>
> cc @ys2843 @sandeep-krishnamurthy

If the master install guide is the correct one, I suggest we redirect all previous versions' installation pages to the master install page, because the previous version websites are static artifacts and are not in good shape for this many changes. In fact, I believe it would be easier to maintain if the current versioned website were replaced by one single master website with versioned docs & tutorials.
[GitHub] [incubator-mxnet] ys2843 commented on pull request #18487: Consolidate installation instructions on website and add disclaimer for non-ASF ressources
ys2843 commented on pull request #18487:
URL: https://github.com/apache/incubator-mxnet/pull/18487#issuecomment-651574234

> The notice needs to be backported to previous versions too as people who select older versions may see installation guide that are different from the one on master.
>
> cc @ys2843 @sandeep-krishnamurthy

If the master install guide is the correct one, I suggest we redirect all previous versions' installation pages to master, because the previous version websites are static artifacts and are not in good shape for this many changes. In fact, I believe it would be easier to maintain if the current versioned website were replaced by one single master website with versioned docs & tutorials.
[incubator-mxnet] branch master updated (638622f -> 2158106)
This is an automated email from the ASF dual-hosted git repository.

liuyizhi pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git.

from 638622f Improve performance of broadcast_axis on CPU (#17882)
add 2158106 [Numpy] FFI: tril_indices (#18546)

No new revisions were added by this update.

Summary of changes:
python/mxnet/ndarray/numpy/_op.py | 2 +-
python/mxnet/numpy/multiarray.py | 2 +-
src/api/operator/numpy/np_matrix_op.cc | 24
tests/python/unittest/test_numpy_op.py | 2 +-
4 files changed, 27 insertions(+), 3 deletions(-)
[GitHub] [incubator-mxnet] yzhliu commented on pull request #18546: [Numpy] FFI: tril_indices
yzhliu commented on pull request #18546:
URL: https://github.com/apache/incubator-mxnet/pull/18546#issuecomment-651566467

Thanks @XIAO-XIA @hzfan
[GitHub] [incubator-mxnet] yzhliu merged pull request #18546: [Numpy] FFI: tril_indices
yzhliu merged pull request #18546:
URL: https://github.com/apache/incubator-mxnet/pull/18546
[GitHub] [incubator-mxnet] ciyongch commented on issue #18641: Backporting recent mx.np changes to 1.7 branch
ciyongch commented on issue #18641:
URL: https://github.com/apache/incubator-mxnet/issues/18641#issuecomment-651564244

Thanks @sandeep-krishnamurthy @szha, then let's treat it as an experimental feature in the v1.7 release. @sxjscience could you please help to backport these two PRs and tag me on the new PR? Then we'll move forward with the rc0 tag and the rest of the release process once they are merged.