[GitHub] [incubator-mxnet] ciyongch commented on pull request #18572: [v1.7.x]Add KEY for Ciyong Chen
ciyongch commented on pull request #18572: URL: https://github.com/apache/incubator-mxnet/pull/18572#issuecomment-645166878 @TaoLv @pengzhao-intel please help to merge. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-mxnet] Neutron3529 commented on pull request #18423: fix misbehave of KLDivLoss
Neutron3529 commented on pull request #18423: URL: https://github.com/apache/incubator-mxnet/pull/18423#issuecomment-645160030 @mxnet-bot run ci [unix-cpu]
[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18423: fix misbehave of KLDivLoss
mxnet-bot commented on pull request #18423: URL: https://github.com/apache/incubator-mxnet/pull/18423#issuecomment-645160078 Jenkins CI successfully triggered : [unix-cpu]
[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18546: [Numpy] FFI: tril_indices
mxnet-bot commented on pull request #18546: URL: https://github.com/apache/incubator-mxnet/pull/18546#issuecomment-645154562 Jenkins CI successfully triggered : [unix-cpu, unix-gpu]
[GitHub] [incubator-mxnet] XIAO-XIA commented on pull request #18546: [Numpy] FFI: tril_indices
XIAO-XIA commented on pull request #18546: URL: https://github.com/apache/incubator-mxnet/pull/18546#issuecomment-645154526 @mxnet-bot run ci [unix-cpu, unix-gpu]
[GitHub] [incubator-mxnet] ChaiBapchya commented on issue #18493: 3D Upsampling
ChaiBapchya commented on issue #18493: URL: https://github.com/apache/incubator-mxnet/issues/18493#issuecomment-645140735 Great to have this use-case. @andevellicus Thanks for bringing it up. While your experience in Julia will be handy, work would still be needed on the MXNet backend [C/C++], because the few lines that @leezu mentioned would go somewhere here. For Upsampling Forward & Backward: https://github.com/apache/incubator-mxnet/blob/3b23c2de950fb0e4d44560f4c7ea933a520c526c/src/operator/nn/upsampling-inl.h#L100 CPU-specific implementation: for example, the shape check currently expects 4D input [2D image]: https://github.com/apache/incubator-mxnet/blob/eceb5f2c1c494094c1a697286f2c4560b7ca472e/src/operator/nn/upsampling.cc#L44-L45 https://github.com/apache/incubator-mxnet/blob/eceb5f2c1c494094c1a697286f2c4560b7ca472e/src/operator/nn/upsampling.cc#L61-L62 There don't seem to be GPU-specific implementations at the moment, so we are good on that. You can take a stab at updating the forward & backward implementations. Additionally, we could add a test for this 3D image use-case, similar to https://github.com/apache/incubator-mxnet/blob/f1f3f44166e2e47afad6c65025fb48dd47efeb65/tests/python/gpu/test_operator_gpu.py#L1481-L1489 I can help review your PR. Feel free to ping me if you need any help or have specific questions about contributing code to the MXNet backend.
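To make the 4D versus 5D distinction in the comment above concrete, here is a minimal NumPy sketch of nearest-neighbor upsampling that accepts both layouts. It is not the MXNet backend code: the function name `upsample_nearest` and the assumed layouts (N, C, H, W) and (N, C, D, H, W) are illustrative assumptions only.

```python
import numpy as np


def upsample_nearest(x, scale):
    """Nearest-neighbor upsampling over all spatial axes (hypothetical helper).

    Assumes layout (N, C, H, W) for 2D images or (N, C, D, H, W) for 3D volumes.
    """
    if x.ndim not in (4, 5):
        raise ValueError("expected 4D or 5D input, got %dD" % x.ndim)
    if scale < 1:
        raise ValueError("scale must be a positive integer")
    out = x
    # Axes 0 and 1 are batch and channel; every later axis is spatial.
    for axis in range(2, x.ndim):
        out = np.repeat(out, scale, axis=axis)
    return out


volume = np.arange(8, dtype=np.float32).reshape(1, 1, 2, 2, 2)
print(upsample_nearest(volume, 2).shape)
```

A backend change would additionally need the matching backward pass, which for nearest-neighbor upsampling sums the gradient over each repeated block.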
[GitHub] [incubator-mxnet] ChaiBapchya commented on issue #18551: MXnet with cuda wont install on windows 10!
ChaiBapchya commented on issue #18551: URL: https://github.com/apache/incubator-mxnet/issues/18551#issuecomment-645136694 @mxnet-label-bot add [windows]
[GitHub] [incubator-mxnet] ChaiBapchya commented on pull request #18560: [CI][1.6.x] fix centos 7 url to unblock centos-cpu & gpu pipeline
ChaiBapchya commented on pull request #18560: URL: https://github.com/apache/incubator-mxnet/pull/18560#issuecomment-645135685 Makes sense. Can someone then help create this branch protection ticket to apache-infra? @leezu @marcoabreu I'm not sure if I as a contributor can do that. Thanks everyone for the advice & clarification.
[GitHub] [incubator-mxnet] ChaiBapchya commented on pull request #18573: [v1.x] Cherry-pick Fix mxnet-native and Docker CD pipelines (#17784)
ChaiBapchya commented on pull request #18573: URL: https://github.com/apache/incubator-mxnet/pull/18573#issuecomment-645133527 So does this mean the PR is incomplete, i.e. are there more additions to be made? I can see a few red/yellow [failed] jobs in the past 2 days.
[GitHub] [incubator-mxnet] ChaiBapchya commented on pull request #18559: add cd mxnet_lib/static and python/pypi stages to ci
ChaiBapchya commented on pull request #18559: URL: https://github.com/apache/incubator-mxnet/pull/18559#issuecomment-645131019 Thanks for pointing it out.
[GitHub] [incubator-mxnet] pengzhao-intel commented on issue #18569: [Numpy] softmax, logsoftmax failed on empty ndarray
pengzhao-intel commented on issue #18569: URL: https://github.com/apache/incubator-mxnet/issues/18569#issuecomment-645106526 Our team will take a look to see whether this is related to the MKL integration. Thanks for reporting the issue, @stu1130.
[GitHub] [incubator-mxnet] ciyongch commented on pull request #18572: [v1.7.x]Add KEY for Ciyong Chen
ciyongch commented on pull request #18572: URL: https://github.com/apache/incubator-mxnet/pull/18572#issuecomment-645104330 @mxnet-bot run ci [unix-cpu]
[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18572: [v1.7.x]Add KEY for Ciyong Chen
mxnet-bot commented on pull request #18572: URL: https://github.com/apache/incubator-mxnet/pull/18572#issuecomment-645104371 Jenkins CI successfully triggered : [unix-cpu]
[GitHub] [incubator-mxnet] leezu commented on pull request #18559: add cd mxnet_lib/static and python/pypi stages to ci
leezu commented on pull request #18559: URL: https://github.com/apache/incubator-mxnet/pull/18559#issuecomment-645087169 > For instance for building binaries for AWS-MXNet we do it on linux 14.04 [and soon migrating to 16 or 18.04] That would be a bad idea. You want to be compatible with https://www.python.org/dev/peps/pep-0599/ and for that you MUST build on CentOS7 or a system with equivalent glibc version.
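The glibc constraint in the comment above can be checked with a few lines of Python. This is only a sketch and not part of the MXNet CI: it assumes the manylinux2014 baseline of glibc 2.17 described in PEP 599, and the helper names `parse_version` and `fits_manylinux2014` are hypothetical.

```python
import platform

# manylinux2014 (PEP 599) is defined against CentOS 7, which ships glibc 2.17.
MANYLINUX2014_GLIBC = (2, 17)


def parse_version(text):
    """Turn a version string such as '2.17' into a tuple of ints: (2, 17)."""
    return tuple(int(part) for part in text.split(".")[:2])


def fits_manylinux2014(build_glibc):
    """Return True if the build host's glibc is no newer than the baseline.

    Hypothetical helper: binaries linked against a newer glibc may require
    symbols that a glibc 2.17 system does not provide.
    """
    return parse_version(build_glibc) <= MANYLINUX2014_GLIBC


lib, version = platform.libc_ver()  # ('glibc', '<version>') on glibc-based Linux
if lib == "glibc" and version:
    print(version, "ok" if fits_manylinux2014(version) else "too new")
else:
    print("not a glibc system; check skipped")
```

Building on a host whose glibc is newer than the baseline is the situation the comment warns against, since the resulting binaries may not load on older distributions.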
[GitHub] [incubator-mxnet] ChaiBapchya commented on pull request #18559: add cd mxnet_lib/static and python/pypi stages to ci
ChaiBapchya commented on pull request #18559: URL: https://github.com/apache/incubator-mxnet/pull/18559#issuecomment-645085628 Curious why static builds are on CentOS [and not on linux]. For instance, for building binaries for AWS-MXNet we do it on linux 14.04 [and will soon migrate to 16 or 18.04].
[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.
This is an automated email from the ASF dual-hosted git repository. aaronmarkham pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git The following commit(s) were added to refs/heads/asf-site by this push: new 5724c8b Bump the publish timestamp. 5724c8b is described below commit 5724c8b3501223f3b4baa515a694fbdef99e7035 Author: mxnet-ci AuthorDate: Wed Jun 17 00:43:16 2020 + Bump the publish timestamp. --- date.txt | 1 + 1 file changed, 1 insertion(+) diff --git a/date.txt b/date.txt new file mode 100644 index 000..0cfbad2 --- /dev/null +++ b/date.txt @@ -0,0 +1 @@ +Wed Jun 17 00:43:16 UTC 2020
[incubator-mxnet] branch master updated (8039377 -> 103d839)
lausen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git.

from 8039377 add op npx.index_update (#18545)
add 103d839 Test CD mxnet_lib/static and python/pypi stages on CI (#18559)

No new revisions were added by this update.

Summary of changes:
ci/docker/runtime_functions.sh    | 35 ++--
ci/jenkins/Jenkins_steps.groovy   | 86 +++
ci/jenkins/Jenkinsfile_centos_cpu | 6 ++-
ci/jenkins/Jenkinsfile_centos_gpu | 7 +++-
4 files changed, 98 insertions(+), 36 deletions(-)
[GitHub] [incubator-mxnet] leezu merged pull request #18559: add cd mxnet_lib/static and python/pypi stages to ci
leezu merged pull request #18559: URL: https://github.com/apache/incubator-mxnet/pull/18559
[GitHub] [incubator-mxnet] mseth10 commented on pull request #18559: add cd mxnet_lib/static and python/pypi stages to ci
mseth10 commented on pull request #18559: URL: https://github.com/apache/incubator-mxnet/pull/18559#issuecomment-645062149 @leezu please help merge the pr. thanks!
[GitHub] [incubator-mxnet] eric-haibin-lin opened a new issue #18575: while loop fails with hybridization
eric-haibin-lin opened a new issue #18575: URL: https://github.com/apache/incubator-mxnet/issues/18575

```
import mxnet as mx
from mxnet.base import _as_list

class MyBlock(mx.gluon.HybridBlock):
    def __init__(self):
        super().__init__()

    def hybrid_forward(self, F, free_nds, loop_nds):
        n_steps = 5
        max_iterations = 5

        def step(loop, free):
            (s, ), (a, b) = loop, free
            return (s, s)

        cond = lambda loop_vars, _: (loop_vars[0] < 1e35).prod()
        func=lambda *_loop_vars: func(_loop_vars, free_nds)
        outputs, final_loop_nds = F.contrib.while_loop(
            cond=lambda *_loop_vars: cond(_loop_vars, free_nds),
            func=lambda *_loop_vars: step(_loop_vars, free_nds),
            loop_vars=loop_nds,
            max_iterations=max_iterations,
        )
        outputs = _as_list(outputs)
        final_loop_nds = _as_list(final_loop_nds)
        if n_steps == 0:
            outputs = []
        else:
            outputs = [x.slice_axis(axis=0, begin=0, end=n_steps) for x in outputs]
        loop_result_sym = [x * 2 for x in outputs] + [x * 3 for x in final_loop_nds]
        return loop_result_sym

net = MyBlock()
net.initialize()
net.hybridize()

free_var_shapes=[(1, ),(1, )]
loop_var_shapes=[(1, )]
free_nds = [mx.nd.ones(s) for s in free_var_shapes]
loop_nds = [mx.nd.ones(s) for s in loop_var_shapes]

for n in free_nds + loop_nds:
    n.attach_grad()
with mx.autograd.record():
    result = net(free_nds, loop_nds)
print(result)
mx.nd.waitall()
```

python3.7 test.py

```
Traceback (most recent call last):
  File "test.py", line 51, in <module>
    result = net(free_nds, loop_nds)
  File "/home/ec2-user/cached_executor/python/mxnet/gluon/block.py", line 1324, in __call__
    return super().__call__(x, *args)
  File "/home/ec2-user/cached_executor/python/mxnet/gluon/block.py", line 705, in __call__
    out = self.forward(*args)
  File "/home/ec2-user/cached_executor/python/mxnet/gluon/block.py", line 1369, in forward
    return self._call_cached_op(x, *args)
  File "/home/ec2-user/cached_executor/python/mxnet/gluon/block.py", line 1090, in _call_cached_op
    out = self._cached_op(*cargs)
  File "mxnet/cython/ndarray.pyx", line 177, in mxnet._cy3.ndarray.CachedOp.__call__
  File "mxnet/cython/./base.pyi", line 41, in mxnet._cy3.ndarray.CALL
mxnet.base.MXNetError: Traceback (most recent call last):
  File "../src/imperative/imperative.cc", line 217
MXNetError: Check failed: AGInfo::IsNone(*output): Assigning to NDArrays that are already in a computational graph will cause undefined behavior when evaluating gradients. Please call backward first to clear the graph or do this out side of a record section. Also note that you cannot use inplace operations like +=, *=, relu(x, out=x), y[idx]=x, etc inside a record section._cachedop
```
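For readers unfamiliar with the operator in the report above, the following pure-Python sketch shows the loop semantics the reproducer relies on, as far as they can be read from the script: `cond` is evaluated on the loop variables, `func` returns a pair of step output and new loop variables, and iteration stops after `max_iterations`. The name `while_loop_reference` is hypothetical, and the sketch deliberately ignores MXNet details such as stacking or padding of the outputs.

```python
def while_loop_reference(cond, func, loop_vars, max_iterations):
    """Plain-Python model of a bounded while loop (not the MXNet operator)."""
    outputs = []
    steps = 0
    while steps < max_iterations and cond(*loop_vars):
        # func returns (step_output, new_loop_vars), as `step` does in the report.
        step_output, loop_vars = func(*loop_vars)
        outputs.append(step_output)
        steps += 1
    return outputs, loop_vars


# Mirrors the reproducer: the condition stays true and the state is unchanged,
# so the loop runs until max_iterations is reached.
outs, final = while_loop_reference(
    cond=lambda s: s < 1e35,
    func=lambda s: (s, [s]),
    loop_vars=[1.0],
    max_iterations=5,
)
print(len(outs), final)
```

In this model the step output and the new loop variable are the same object, which matches `return (s, s)` in the reproducer.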
[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18574: Update the onnx-tensorrt submodule
mxnet-bot commented on pull request #18574: URL: https://github.com/apache/incubator-mxnet/pull/18574#issuecomment-645036167

Hey @Kh4L , Thanks for submitting the PR
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:
- To trigger all jobs: @mxnet-bot run ci [all]
- To trigger specific jobs: @mxnet-bot run ci [job1, job2]
***
**CI supported jobs**: [centos-gpu, unix-gpu, windows-cpu, website, clang, miscellaneous, windows-gpu, sanity, centos-cpu, unix-cpu, edge]
***
_Note_: Only following 3 categories can trigger CI :PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.
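The command syntax described in the bot message above can be illustrated with a small parser. This is a sketch only and not the bot's actual implementation: the regular expression, the function name `parse_ci_command`, and the error handling are assumptions, while the job names are copied from the message.

```python
import re

# Job names copied from the "CI supported jobs" list in the bot message.
SUPPORTED_JOBS = {
    "centos-gpu", "unix-gpu", "windows-cpu", "website", "clang",
    "miscellaneous", "windows-gpu", "sanity", "centos-cpu", "unix-cpu", "edge",
}

# Matches "@mxnet-bot run ci [job1, job2]" anywhere in a comment.
COMMAND = re.compile(r"@mxnet-bot\s+run\s+ci\s+\[([^\]]*)\]")


def parse_ci_command(comment):
    """Return the requested job names, or None if the comment has no command."""
    match = COMMAND.search(comment)
    if match is None:
        return None
    jobs = [name.strip() for name in match.group(1).split(",") if name.strip()]
    if jobs == ["all"]:
        return sorted(SUPPORTED_JOBS)
    unknown = [name for name in jobs if name not in SUPPORTED_JOBS]
    if unknown:
        raise ValueError("unsupported CI jobs: " + ", ".join(unknown))
    return jobs


print(parse_ci_command("@mxnet-bot run ci [unix-cpu, unix-gpu]"))
```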
[GitHub] [incubator-mxnet] Kh4L opened a new pull request #18574: Update the onnx-tensorrt submodule
Kh4L opened a new pull request #18574: URL: https://github.com/apache/incubator-mxnet/pull/18574 ## Description ## This PR updates the onnx_tensorrt submodule to the latest commit.
[GitHub] [incubator-mxnet] anko-intel commented on issue #14357: [Bug] Batchnorm running_var behaves differently when using gpu vs. cpu
anko-intel commented on issue #14357: URL: https://github.com/apache/incubator-mxnet/issues/14357#issuecomment-645038422

Hi @ThomasDelteil, starting from the training script in https://github.com/apache/incubator-mxnet/issues/14357#issuecomment-497508722, I didn’t manage to change the script to produce the same result in each run, so instead I ran the test procedure 999 times and took the mean value:

```
import mxnet as mx
from mxnet import nd, autograd, gluon
import numpy as np

def transform(data, label):
    return nd.transpose(data.astype(np.float32), (2,0,1))/255, label.astype(np.float32)

trainset = gluon.data.vision.FashionMNIST(train=True)
trainset= trainset.transform(transform)
train_data = gluon.data.DataLoader(dataset=trainset, batch_size=50, shuffle=True)
SCE = gluon.loss.SoftmaxCrossEntropyLoss()

under_res = {}
under_sum = {}
outside_res = {}
outside_sum = {}
for ctx in [mx.gpu(), mx.cpu()]:
    under_sum[ctx] = 0.0
    outside_sum[ctx] = 0.0

for t in range(1,1000):
    for ctx in [mx.cpu(), mx.gpu()]:
        net = gluon.model_zoo.vision.get_model('resnet18_v1', pretrained=False, classes=10)
        # Parameter initialization
        net.initialize(mx.init.Xavier(magnitude=2.24), ctx=ctx, force_reinit=True)
        trainer = gluon.Trainer(params=net.collect_params(), optimizer='sgd',
                                optimizer_params={'learning_rate': .01, 'wd': 0.0001, 'momentum': 0.9})
        # Training
        for i, (data, label) in enumerate(train_data):
            data = data.as_in_context(ctx)
            label = label.as_in_context(ctx)
            with autograd.record():
                output = net(data)
                loss = SCE(output, label)
            loss.backward()
            trainer.step(data.shape[0])
            if i == 20:
                break
        # Training accuracy under autograd
        accuracy = mx.gluon.metric.Accuracy()
        for i, (data, label) in enumerate(train_data):
            with autograd.record():
                output = net(data.as_in_context(ctx))
            accuracy.update(label, output)
            if i == 5:
                break
        under_res[ctx] = accuracy.get()[1]
        under_sum[ctx] += under_res[ctx]
        # Training accuracy outside autograd
        accuracy = mx.gluon.metric.Accuracy()
        for i, (data, label) in enumerate(train_data):
            output = net(data.as_in_context(ctx))
            accuracy.update(label, output)
            if i == 5:
                break
        outside_res[ctx] = accuracy.get()[1]
        outside_sum[ctx] += outside_res[ctx]
    for ctx in [mx.cpu(), mx.gpu()]:
        print("Test {:3} Accuracy for {}: under autograd: {:.6f} mean: {:.6f}, outside autograd: {:.6f} mean: {:.6f}".format(
            t, ctx, under_res[ctx], under_sum[ctx] / t, outside_res[ctx], outside_sum[ctx] / t))
```

It shows that, statistically, GPU and CPU give similar results:

```
Test 999 Accuracy for cpu(0): under autograd: 0.58 mean: 0.588709, outside autograd: 0.17 mean: 0.192819
Test 999 Accuracy for gpu(0): under autograd: 0.67 mean: 0.589486, outside autograd: 0.17 mean: 0.191892
```

Please see the log for full data: [test_03_master_fixed_sync.txt](https://github.com/apache/incubator-mxnet/files/4789319/test_03_master_fixed_sync.txt)
[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18559: add cd mxnet_lib/static and python/pypi stages to ci
mxnet-bot commented on pull request #18559: URL: https://github.com/apache/incubator-mxnet/pull/18559#issuecomment-645012723 Jenkins CI successfully triggered : [windows-cpu]
[GitHub] [incubator-mxnet] mseth10 commented on pull request #18559: add cd mxnet_lib/static and python/pypi stages to ci
mseth10 commented on pull request #18559: URL: https://github.com/apache/incubator-mxnet/pull/18559#issuecomment-645012655 @mxnet-bot run ci [windows-cpu]
[GitHub] [incubator-mxnet] mseth10 commented on pull request #18573: [v1.x] Cherry-pick Fix mxnet-native and Docker CD pipelines (#17784)
mseth10 commented on pull request #18573: URL: https://github.com/apache/incubator-mxnet/pull/18573#issuecomment-644995374 We still need to check for this issue: https://github.com/apache/incubator-mxnet/pull/18465#issuecomment-638364850
[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18573: [v1.x] Cherry-pick Fix mxnet-native and Docker CD pipelines (#17784)
mxnet-bot commented on pull request #18573: URL: https://github.com/apache/incubator-mxnet/pull/18573#issuecomment-644994141

Hey @mseth10 , Thanks for submitting the PR
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:
- To trigger all jobs: @mxnet-bot run ci [all]
- To trigger specific jobs: @mxnet-bot run ci [job1, job2]
***
**CI supported jobs**: [centos-gpu, website, unix-cpu, sanity, windows-cpu, clang, unix-gpu, windows-gpu, edge, centos-cpu, miscellaneous]
***
_Note_: Only following 3 categories can trigger CI :PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.
[GitHub] [incubator-mxnet] mseth10 opened a new pull request #18573: [v1.x] Cherry-pick Fix mxnet-native and Docker CD pipelines (#17784)
mseth10 opened a new pull request #18573: URL: https://github.com/apache/incubator-mxnet/pull/18573

* Fix Jenkinsfile CD pipeline for mxnet-native
* Fix cd/python/docker/python_images.sh

This fixes CD Python docker images pipeline for v1.x branch http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/restricted-mxnet-cd%2Fmxnet-cd-release-job-1.x/detail/mxnet-cd-release-job-1.x/309/pipeline
[GitHub] [incubator-mxnet] szha commented on pull request #18560: [CI][1.6.x] fix centos 7 url to unblock centos-cpu & gpu pipeline
szha commented on pull request #18560: URL: https://github.com/apache/incubator-mxnet/pull/18560#issuecomment-644972460 > We opted to not turn that in for feature branches to not bother infra too much and also allow the release manager to make certain calls without hitting limits. I don't think that's the case. We didn't have explicit discussion on this, nor do I think it's the right approach. It doesn't make sense to allow force push to release branches while protecting the development branch. Also, enabling branch protection for release branches doesn't necessarily "bother" apache infra either, as the setup is likely one-time. Branch protection can be turned on for branches that match a pattern, and we do have an explicit pattern for release branches according to the release process.
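The idea of protecting every branch that matches a pattern can be sketched with Python's `fnmatch` module. The pattern `v*.x` below is a hypothetical stand-in chosen to match branch names that appear in this thread (`v1.7.x`, `v1.x`); the real pattern is whatever the release process defines, and the helper `is_protected` is illustrative only.

```python
from fnmatch import fnmatchcase

# Hypothetical release-branch pattern, assumed for illustration.
RELEASE_BRANCH_PATTERN = "v*.x"


def is_protected(branch, patterns=("master", RELEASE_BRANCH_PATTERN)):
    """Return True if the branch name matches any protection pattern."""
    return any(fnmatchcase(branch, pattern) for pattern in patterns)


for name in ("master", "v1.7.x", "v1.x", "feature/3d-upsampling"):
    print(name, is_protected(name))
```

With a single pattern rule, new release branches are covered as soon as they are created, which is the one-time setup the comment refers to.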
[GitHub] [incubator-mxnet] ys2843 commented on pull request #18571: fix contribute page anchor position shifted
ys2843 commented on pull request #18571: URL: https://github.com/apache/incubator-mxnet/pull/18571#issuecomment-644962814 @mxnet-label-bot add [website, pr-awaiting-review]
[GitHub] [incubator-mxnet] mseth10 commented on a change in pull request #18559: add cd mxnet_lib/static and python/pypi stages to ci
mseth10 commented on a change in pull request #18559: URL: https://github.com/apache/incubator-mxnet/pull/18559#discussion_r441081183

## File path: ci/jenkins/Jenkinsfile_centos_cpu ##
@@ -37,14 +37,16 @@ core_logic: {
custom_steps.compile_centos7_cpu('centos7_cpu'),
custom_steps.compile_centos7_cpu_make('centos7_cpu_make'),
custom_steps.compile_centos7_cpu_mkldnn(),
+custom_steps.compile_static_cd_cpu('centos7_cpu_cd'),

Review comment: Done! Also removed custom_steps.compile_static_python_gpu_cmake() that was running cu92 in favor of new stage running cu102.
[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.
aaronmarkham pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git The following commit(s) were added to refs/heads/asf-site by this push: new 10c0af3 Bump the publish timestamp. 10c0af3 is described below commit 10c0af3f9638304bf5e7713d4117797780e1b7fd Author: mxnet-ci AuthorDate: Tue Jun 16 18:48:03 2020 + Bump the publish timestamp. --- date.txt | 1 + 1 file changed, 1 insertion(+) diff --git a/date.txt b/date.txt new file mode 100644 index 000..2177a10 --- /dev/null +++ b/date.txt @@ -0,0 +1 @@ +Tue Jun 16 18:48:03 UTC 2020
[GitHub] [incubator-mxnet] ys2843 edited a comment on pull request #18571: fix contribute page anchor position shifted
ys2843 edited a comment on pull request #18571: URL: https://github.com/apache/incubator-mxnet/pull/18571#issuecomment-644933217 > @ys2843 Thanks for prioritizing on this & explaining what the issue was. Curious, are there other places in the website where anchor tag header position is "fixed"? I can't find any more, because this problem only occurs on the main information site. There aren't too many anchors (links that point to places on the same page) here. Thanks for reviewing and reporting the bug!
[GitHub] [incubator-mxnet] ys2843 commented on pull request #18571: fix contribute page anchor position shifted
ys2843 commented on pull request #18571: URL: https://github.com/apache/incubator-mxnet/pull/18571#issuecomment-644933217 > @ys2843 Thanks for prioritizing on this & explaining what the issue was. Curious, are there other places in the website where anchor tag header position is "fixed"? I can't find any more, because this problem only occurs on the main information site. There aren't too many anchors (links that point to places on the same page) here.
[GitHub] [incubator-mxnet] leezu commented on a change in pull request #18559: add cd mxnet_lib/static stages to ci
leezu commented on a change in pull request #18559: URL: https://github.com/apache/incubator-mxnet/pull/18559#discussion_r441049583

## File path: ci/jenkins/Jenkinsfile_centos_cpu ##
@@ -37,14 +37,16 @@ core_logic: {
custom_steps.compile_centos7_cpu('centos7_cpu'),
custom_steps.compile_centos7_cpu_make('centos7_cpu_make'),
custom_steps.compile_centos7_cpu_mkldnn(),
+custom_steps.compile_static_cd_cpu('centos7_cpu_cd'),

Review comment: The new test is a more elaborate version of `custom_steps.compile_static_python_cpu_cmake()`, isn't it? If so, let's remove the `custom_steps.compile_static_python_cpu_cmake()` in favor of the new approach.
[incubator-mxnet] branch master updated: add op npx.index_update (#18545)
This is an automated email from the ASF dual-hosted git repository. sxjscience pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git The following commit(s) were added to refs/heads/master by this push: new 8039377 add op npx.index_update (#18545) 8039377 is described below commit 8039377e6630bcb00c5a95abdaf0851803686bc6 Author: JiangZhaoh <54654391+jiangzh...@users.noreply.github.com> AuthorDate: Wed Jun 17 01:45:30 2020 +0800 add op npx.index_update (#18545) * add op npx.index_update * remove debug comment * change eps * fix stupid error * add blank line in docs * gpu temporary space request alignment * fix test error Co-authored-by: Ubuntu --- python/mxnet/_numpy_op_doc.py | 72 ++ src/operator/tensor/index_add-inl.h| 2 +- src/operator/tensor/index_add_backward.cc | 18 +- .../tensor/{index_add-inl.h => index_update-inl.h} | 175 -- src/operator/tensor/index_update.cc| 261 + src/operator/tensor/index_update.cu| 204 tests/python/unittest/test_numpy_op.py | 162 + 7 files changed, 813 insertions(+), 81 deletions(-) diff --git a/python/mxnet/_numpy_op_doc.py b/python/mxnet/_numpy_op_doc.py index fecd0e6..b8f4a49 100644 --- a/python/mxnet/_numpy_op_doc.py +++ b/python/mxnet/_numpy_op_doc.py @@ -630,6 +630,7 @@ def _npx_index_add(a, ind, val): """ Add values to input according to given indexes. If exists repeate positions to be updated, the update value will be accumulated. + Parameters -- a : ndarray @@ -643,10 +644,12 @@ def _npx_index_add(a, ind, val): - ind.dtype should be 'int32' or 'int64' val : ndarray Input data. The array to update the input 'a'. + Returns --- out : ndarray The output array. + Examples >>> a = np.zeros((2, 3, 4)) @@ -699,6 +702,75 @@ def _npx_index_add(a, ind, val): pass +def _npx_index_update(a, ind, val): +""" +Update values to input according to given indexes. 
+If multiple indices refer to the same location it is undefined which update is chosen; it may choose +the order of updates arbitrarily and nondeterministically (e.g., due to concurrent updates on some +hardware platforms). Recommend not to use repeate positions. + +Parameters +-- +a : ndarray +Input data. The array to be updated. +Support dtype: 'float32', 'float64', 'int32', 'int64'. +ind : ndarray +Indexes for indicating update positions. +For example, array([[0, 1], [2, 3], [4, 5]] indicates here are two positions to +be updated, which is (0, 2, 4) and (1, 3, 5). +Note: - 'ind' cannot be empty array '[]', for that case, please use operator 'add' instead. + - 0 <= ind.ndim <= 2. + - ind.dtype should be 'int32' or 'int64' +val : ndarray +Input data. The array to update the input 'a'. +Support dtype: 'float32', 'float64', 'int32', 'int64'. + +Returns +--- +out : ndarray +The output array. + +Examples + +>>> a = np.zeros((2, 3, 4)) +>>> ind = np.array([[0, 0], [0, 0], [0, 1]], dtype='int32') +>>> val = np.arange(2).reshape(2) + 1 +>>> b = npx.index_update(a, ind, val) +>>> b +array([[[1., 2., 0., 0.], +[0., 0., 0., 0.], +[0., 0., 0., 0.]], + + [[0., 0., 0., 0.], +[0., 0., 0., 0.], +[0., 0., 0., 0.]]]) + +>>> ind=np.array([[0, 0], [0, 1]], dtype='int32') +>>> val = np.arange(8).reshape(2, 4) +>>> b = npx.index_update(a, ind, val) +>>> b +array([[[0., 1., 2., 3.], +[4., 5., 6., 7.], +[0., 0., 0., 0.]], + + [[0., 0., 0., 0.], +[0., 0., 0., 0.], +[0., 0., 0., 0.]]]) + +>>> val = np.arange(4).reshape(4) # brocast 'val' +>>> b = npx.index_update(a, ind, val) +>>> b +array([[[0., 1., 2., 3.], +[0., 1., 2., 3.], +[0., 0., 0., 0.]], + +[[0., 0., 0., 0.], +[0., 0., 0., 0.], +[0., 0., 0., 0.]]]) +""" +pass + + def _np_diag(array, k=0): """ Extracts a diagonal or constructs a diagonal array. 
diff --git a/src/operator/tensor/index_add-inl.h b/src/operator/tensor/index_add-inl.h index 83463da..122aa01 100644 --- a/src/operator/tensor/index_add-inl.h +++ b/src/operator/tensor/index_add-inl.h @@ -52,7 +52,7 @@ inline bool IndexModifyOpType(const nnvm::NodeAttrs& attrs, CHECK_NE((*in_attrs)[1], -1); CHECK_NE((*in_attrs)[2], -1); CHECK_EQ((*in_attrs)[0], (*in_attrs)[2]) -<< "index_add(a, ind, val) only support a.dtype == val.dtype"; +<< "index_add/index_update(a, ind,
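The `_npx_index_update` docstring in the commit above describes scatter-style assignment: each column of `ind` names one position in `a`, and the matching entry of `val` overwrites it. A minimal sketch of those semantics in plain Python follows, using nested lists in place of ndarrays. The helper names `index_update_sketch` and `zeros` are hypothetical, nothing here calls MXNet, and the sketch leaves out the broadcasting of `val` that the docstring's last example relies on.

```python
import copy


def index_update_sketch(a, ind, val):
    """Return a copy of nested list `a` with the positions named by `ind` overwritten.

    `ind` has one row per indexed axis and one column per position, as in the
    docstring example. Hypothetical helper for illustration only; this is not
    the MXNet implementation.
    """
    out = copy.deepcopy(a)  # the operator returns a new array; `a` is untouched
    num_positions = len(ind[0])
    for p in range(num_positions):
        target = out
        # Walk down every indexed axis except the last one.
        for axis_indices in ind[:-1]:
            target = target[axis_indices[p]]
        # Overwrite (rather than accumulate, as index_add does) the selected
        # element or sub-list.
        target[ind[-1][p]] = copy.deepcopy(val[p])
    return out


def zeros(d0, d1, d2):
    """Build a d0 x d1 x d2 nested list of zeros (stand-in for np.zeros)."""
    return [[[0.0] * d2 for _ in range(d1)] for _ in range(d0)]


a = zeros(2, 3, 4)

# Three index rows: two fully specified positions, (0, 0, 0) and (0, 0, 1).
b = index_update_sketch(a, [[0, 0], [0, 0], [0, 1]], [1.0, 2.0])
print(b[0][0])

# Two index rows: positions (0, 0) and (0, 1) each receive a whole row of `val`.
c = index_update_sketch(a, [[0, 0], [0, 1]],
                        [[0.0, 1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0]])
print(c[0][0], c[0][1])
```

As the docstring warns, repeated positions are best avoided: this sequential sketch would simply keep the last write, whereas the operator makes no ordering guarantee.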
[GitHub] [incubator-mxnet] sxjscience merged pull request #18545: add op npx.index_update
sxjscience merged pull request #18545: URL: https://github.com/apache/incubator-mxnet/pull/18545
[GitHub] [incubator-mxnet] ChaiBapchya commented on pull request #18571: fix contribute page anchor position shifted
ChaiBapchya commented on pull request #18571: URL: https://github.com/apache/incubator-mxnet/pull/18571#issuecomment-644911772 @ys2843 Thanks for prioritizing this and explaining what the issue was. Out of curiosity, are there other places on the website where the anchor tag header position is "fixed"?
[incubator-mxnet] branch v1.x updated: Increase staggered build timeout to 180 min (#18568)
This is an automated email from the ASF dual-hosted git repository. marcoabreu pushed a commit to branch v1.x in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git The following commit(s) were added to refs/heads/v1.x by this push: new f91b989 Increase staggered build timeout to 180 min (#18568) f91b989 is described below commit f91b98932b0a0846782905a68942f0870242246d Author: Joe Evans AuthorDate: Tue Jun 16 10:25:01 2020 -0700 Increase staggered build timeout to 180 min (#18568) * Increase staggered build timeout to 180 min, since sanity build has 180 min timeout. * Decrease timeout so everyone is happy. Co-authored-by: Joe Evans --- ci/jenkins/Jenkinsfile_full | 2 +- ci/jenkins/Jenkinsfile_sanity | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/ci/jenkins/Jenkinsfile_full b/ci/jenkins/Jenkinsfile_full index 33d57d2..415bd7b 100644 --- a/ci/jenkins/Jenkinsfile_full +++ b/ci/jenkins/Jenkinsfile_full @@ -21,7 +21,7 @@ // See documents at https://jenkins.io/doc/book/pipeline/jenkinsfile/ // timeout in minutes -def max_time = 30 +def max_time = 60 def buildJobs = [ 'centos-cpu', diff --git a/ci/jenkins/Jenkinsfile_sanity b/ci/jenkins/Jenkinsfile_sanity index ed4d16e..065202c 100644 --- a/ci/jenkins/Jenkinsfile_sanity +++ b/ci/jenkins/Jenkinsfile_sanity @@ -21,7 +21,7 @@ // See documents at https://jenkins.io/doc/book/pipeline/jenkinsfile/ // timeout in minutes -max_time = 180 +max_time = 60 node('utility') { // Loading the utilities requires a node context unfortunately
[GitHub] [incubator-mxnet] marcoabreu merged pull request #18568: Increase staggered build timeout to 180 min
marcoabreu merged pull request #18568: URL: https://github.com/apache/incubator-mxnet/pull/18568
[GitHub] [incubator-mxnet] anko-intel commented on issue #14357: [Bug] Batchnorm running_var behaves differently when using gpu vs. cpu
anko-intel commented on issue #14357:
URL: https://github.com/apache/incubator-mxnet/issues/14357#issuecomment-644887485

Hi @ThomasDelteil,

Regarding the training script from https://github.com/apache/incubator-mxnet/issues/14357#issuecomment-497487102: as I mentioned in my previous comment, on the master branch (at 8174771) the running variables in BatchNorm are calculated only during the backward pass. Still, there are some differences in the results between the CPU and GPU backends. One of the reasons is that the input tensors differ between GPU and CPU, because mx.nd.random.normal() produces different results on the two backends. According to the documentation at https://mxnet.apache.org/api/python/docs/api/mxnet/random/index.html this is expected behavior:

> Random number generators in MXNet are device specific. mx.random.seed(seed_state) sets the state of each generator using seed_state and the device id. Therefore, random numbers generated from different devices can be different even if they are seeded using the same seed.
> To produce identical random number sequences independent of the device id, set optional ctx argument. This produces the same sequence of random numbers independent of the device id, but the sequence can be different on different kind of devices as MXNet’s random number generators for CPU and GPU use different algorithms.

So, for comparison purposes, I moved the tensor generation to NumPy.

The second issue I observe is a synchronization problem for the running vars. For now I have added a workaround to read the final result from the backward pass (I am not sure whether it is an issue for a real network).

So the script could look as follows:

```
import mxnet as mx
from mxnet import gluon
from mxnet import autograd
import numpy as np

seed = np.random.randint(np.iinfo(np.int32).max)
#seed = 0
print("seed:", seed)
shape = (1,3,224,224)
layers = 100
dataNumpy = {}
np.random.seed(seed)
for i in range(layers):
    dataNumpy[i] = np.random.normal(loc=10, scale=2, size=shape)

for ctx in [mx.cpu(), mx.gpu()]:
    layer2 = gluon.nn.BatchNorm()
    layer2.initialize(ctx=ctx)
    for i in range(layers):
        data2 = mx.nd.array(dataNumpy[i], ctx=ctx)
        with autograd.record():
            out = layer2(data2)
        out.backward()
    # workaround for synchronization issue
    var1 = layer2.running_var.data().asnumpy()
    for t in range(1, 10):
        var2 = layer2.running_var.data().asnumpy()
        if (var1 != var2).any():
            print(ctx, "- DIFF in running_var reads:\n 0 :", var1, "\n ", t,":", var2 )
            break
    print(ctx, layer2.running_var.data().asnumpy(), layer2.running_mean.data().asnumpy() )
```

For the test above I get almost the same values on both backends:

```
seed: 791821049
cpu(0) [3.9977632 3.999108 4.0007195] [10.000481 10.000663 9.999296]
gpu(0) [3.997764 3.9991088 4.0007195] [10.000478 10.000664 9.999295]
```

The difference is so small that I guess it can be neglected (as a difference in rounding between the two backends).
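The claim that the CPU/GPU gap is only rounding can be checked directly against the numbers printed above. The following is a small sketch in plain Python, not part of the original comment; the `1e-6` relative tolerance is an assumed threshold (a few float32 rounding steps), and the helper names are made up for illustration.

```python
import math

# Values copied from the CPU and GPU runs reported above.
cpu_var = [3.9977632, 3.999108, 4.0007195]
gpu_var = [3.997764, 3.9991088, 4.0007195]
cpu_mean = [10.000481, 10.000663, 9.999296]
gpu_mean = [10.000478, 10.000664, 9.999295]

REL_TOL = 1e-6  # assumed threshold, on the order of float32 rounding error


def all_close(xs, ys, rel_tol):
    """True when every pair of values agrees within the relative tolerance."""
    return all(math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(xs, ys))


def max_rel_diff(xs, ys):
    """Largest relative difference between paired values."""
    return max(abs(x - y) / max(abs(x), abs(y)) for x, y in zip(xs, ys))


print("running_var close: ", all_close(cpu_var, gpu_var, REL_TOL))
print("running_mean close:", all_close(cpu_mean, gpu_mean, REL_TOL))
print("largest relative difference:",
      max(max_rel_diff(cpu_var, gpu_var), max_rel_diff(cpu_mean, gpu_mean)))
```

A tolerance-based comparison like this is usually a better fit for cross-backend checks than exact equality, since the two backends need not perform floating-point reductions in the same order.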
[GitHub] [incubator-mxnet] anko-intel edited a comment on issue #14357: [Bug] Batchnorm running_var behaves differently when using gpu vs. cpu
anko-intel edited a comment on issue #14357: URL: https://github.com/apache/incubator-mxnet/issues/14357#issuecomment-644775609

Hi @adrianloy, your issue still exists on the 1.6 branch. On the master branch (at 81747710c), "running_var" is calculated only in the backward pass on both the CPU and GPU backends, so your test gives the same results in both contexts:

```
Batchnorm running var values [1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
```
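Why the values stay at exactly 1 can be sketched with the moving-average rule from the MXNet BatchNorm operator documentation, `moving_var = moving_var * momentum + data_var * (1 - momentum)`. The sketch below is plain Python and rests on stated assumptions: the default momentum of 0.9, an initial running variance of 1, a hypothetical batch variance of 4.0, and a test that never triggers the update. The function name is made up for illustration.

```python
def update_running_var(running_var, batch_var, momentum=0.9):
    """One exponential-moving-average step, following the documented BatchNorm rule."""
    return running_var * momentum + batch_var * (1.0 - momentum)


INITIAL_RUNNING_VAR = 1.0   # assumed initial value of running_var
BATCH_VAR = 4.0             # hypothetical per-batch variance

# If the update only happens in the backward pass and the test never runs one,
# the running variance is never touched.
forward_only = INITIAL_RUNNING_VAR

# With 100 updates the running variance converges toward the batch variance.
with_backward = INITIAL_RUNNING_VAR
for _ in range(100):
    with_backward = update_running_var(with_backward, BATCH_VAR)

print("forward only: ", forward_only)
print("with backward:", with_backward)
```

Under these assumptions the forward-only value is the untouched initial 1.0, which is consistent with the all-ones output quoted in the comment.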
[GitHub] [incubator-mxnet] anko-intel edited a comment on issue #14357: [Bug] Batchnorm running_var behaves differently when using gpu vs. cpu
anko-intel edited a comment on issue #14357: URL: https://github.com/apache/incubator-mxnet/issues/14357#issuecomment-644775609

Hi @adrianloy, your issue still exists on the 1.6 branch. On the master branch, "running_var" is calculated only in the backward pass on both the CPU and GPU backends, so your test gives the same results in both contexts:

```
Batchnorm running var values [1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
```
[GitHub] [incubator-mxnet] anko-intel commented on issue #14357: [Bug] Batchnorm running_var behaves differently when using gpu vs. cpu
anko-intel commented on issue #14357: URL: https://github.com/apache/incubator-mxnet/issues/14357#issuecomment-644775609

Hi @adrianloy, your issue still exists on the 1.6 branch. On the master branch, "running_var" is calculated only in the backward pass on both the CPU and GPU backends, so your test gives the same results in both contexts:

Batchnorm running var values [1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18572: [v1.7.x]Add KEY for Ciyong Chen
mxnet-bot commented on pull request #18572: URL: https://github.com/apache/incubator-mxnet/pull/18572#issuecomment-644772671

Hey @ciyongch, thanks for submitting the PR.
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:
- To trigger all jobs: @mxnet-bot run ci [all]
- To trigger specific jobs: @mxnet-bot run ci [job1, job2]
***
**CI supported jobs**: [website, clang, centos-gpu, centos-cpu, miscellaneous, unix-cpu, edge, sanity, windows-cpu, windows-gpu, unix-gpu]
***
_Note_: Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.
[GitHub] [incubator-mxnet] ciyongch opened a new pull request #18572: [v1.7.x]Add KEY for Ciyong Chen
ciyongch opened a new pull request #18572: URL: https://github.com/apache/incubator-mxnet/pull/18572 ## Description ## update keys file for Ciyong Chen. @TaoLv @pengzhao-intel
[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.
This is an automated email from the ASF dual-hosted git repository. aaronmarkham pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git The following commit(s) were added to refs/heads/asf-site by this push: new 9d0447b Bump the publish timestamp. 9d0447b is described below commit 9d0447bf49bd487b62b8f42d1b3b2080cbe48f42 Author: mxnet-ci AuthorDate: Tue Jun 16 12:48:09 2020 + Bump the publish timestamp. --- date.txt | 1 + 1 file changed, 1 insertion(+) diff --git a/date.txt b/date.txt new file mode 100644 index 000..324e8c1 --- /dev/null +++ b/date.txt @@ -0,0 +1 @@ +Tue Jun 16 12:48:09 UTC 2020
[GitHub] [incubator-mxnet] MoritzMaxeiner commented on pull request #18535: [Numpy] Bugfix of slice operator export (MXNet to ONNX) v2
MoritzMaxeiner commented on pull request #18535: URL: https://github.com/apache/incubator-mxnet/pull/18535#issuecomment-644731564

> Is there some procedure to get your Pull Request accepted faster, that I am missing?

If there is, I'm not aware of it; I've also got [one](https://github.com/apache/incubator-mxnet/pull/16251) that's been waiting for a while. That's not really unusual in my experience, though.
[GitHub] [incubator-mxnet] RuRo edited a comment on pull request #18535: [Numpy] Bugfix of slice operator export (MXNet to ONNX) v2
RuRo edited a comment on pull request #18535: URL: https://github.com/apache/incubator-mxnet/pull/18535#issuecomment-644657795 @szha can you PTAL or tag another reviewer? Thanks. P.S. I was wondering: is there some procedure for getting a Pull Request accepted faster that I am missing? So far, I've submitted 2 PRs that were accepted and participated in some other Pull Requests, and in all cases there was a really long delay after the PR was "done", where we were just waiting for the reviewers. If there is no such procedure, maybe we need some way for PR owners to triage their Pull Requests based on their status?
[GitHub] [incubator-mxnet] RuRo commented on pull request #18535: [Numpy] Bugfix of slice operator export (MXNet to ONNX) v2
RuRo commented on pull request #18535: URL: https://github.com/apache/incubator-mxnet/pull/18535#issuecomment-644657795 @szha can you PTAL or tag another reviewer? Thanks.
[GitHub] [incubator-mxnet] JiangZhaoh commented on pull request #18545: add op npx.index_update
JiangZhaoh commented on pull request #18545: URL: https://github.com/apache/incubator-mxnet/pull/18545#issuecomment-644645699 @mxnet-bot run ci [unix-cpu]
[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18545: add op npx.index_update
mxnet-bot commented on pull request #18545: URL: https://github.com/apache/incubator-mxnet/pull/18545#issuecomment-644645760 Jenkins CI successfully triggered : [unix-cpu]
[GitHub] [incubator-mxnet] Neutron3529 commented on issue #18551: MXnet with cuda wont install on windows 10!
Neutron3529 commented on issue #18551:
URL: https://github.com/apache/incubator-mxnet/issues/18551#issuecomment-644622784

> > My MXNet with win10 is OK(although quite slow compared to Linux)
> > have you ever tried calling `nvidia-smi`?
>
> Yes, i call it every day -l 1

If nvidia-smi is callable, CUDA should be installed properly (you can check the "CUDA Version" field to confirm it).

```
neutron@Neutron:/me$ nvidia-smi
Tue Jun 16 16:29:00 2020
+-+
| NVIDIA-SMI 440.82 Driver Version: 440.82 CUDA Version: 10.2 |
|---+--+--+
| GPU NamePersistence-M| Bus-IdDisp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===+==+==|
| 0 GeForce GTX 1060On | :01:00.0 Off | N/A |
| N/A 80CP278W / N/A | 2270MiB / 6078MiB | 99% Default |
+---+--+--+
+-+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=|
|0 1003 G /usr/lib/Xorg 38MiB |
|0 96256 C python 2227MiB |
+-+
```

If CUDA is installed properly, I don't know what would cause MXNet to fail to load. Maybe `pdb` could help tell you what happened.

```
import pdb
pdb.set_trace()
import mxnet
```

Save the script above as `test.py` and run `python test.py`. Python will enter pdb mode; then use `s` (step), `n` (next) or `r` (execute until return) to control the `pdb` session. It may help you find which `dll` is missing, and you may manually edit the MXNet `.py` script to ensure a successful load.
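As a lighter-weight complement to stepping through `pdb`, the import can be wrapped so that the full traceback is captured and printed. This is a hedged sketch using only the standard library; the helper name `try_import` is hypothetical, and which exception a missing DLL raises on a given Windows setup is not something this sketch assumes, so it catches any `Exception`.

```python
import importlib
import traceback


def try_import(module_name):
    """Try to import a module; return (module, None) or (None, traceback text)."""
    try:
        return importlib.import_module(module_name), None
    except Exception:  # e.g. ImportError, or OSError from a library that fails to load
        return None, traceback.format_exc()


module, error = try_import("mxnet")
if module is None:
    # The traceback usually names the file and call where loading failed.
    print("import failed:\n" + error)
else:
    print("import succeeded:", getattr(module, "__version__", "unknown version"))
```

The last frames of the printed traceback show where in the package the load failed, which is the same information the `pdb` session is meant to surface.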
[GitHub] [incubator-mxnet] XIAO-XIA commented on pull request #18546: [Numpy] FFI: tril_indices
XIAO-XIA commented on pull request #18546: URL: https://github.com/apache/incubator-mxnet/pull/18546#issuecomment-644589045

> @XIAO-XIA CI fails in a test related to the Op you're modifying. It suggests there's a bug in your PR

Thank you very much! I'm trying to fix the bug.
[GitHub] [incubator-mxnet] sxjscience commented on a change in pull request #18319: [numpy] symbolic advanced indexing
sxjscience commented on a change in pull request #18319: URL: https://github.com/apache/incubator-mxnet/pull/18319#discussion_r440637596 ## File path: src/operator/numpy/np_indexing_op.cc ## @@ -0,0 +1,544 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +/*! + * Copyright (c) 2018 by Contributors + * \file np_indexing_op.cc +*/ + +#include "./np_indexing_op.h" + +namespace mxnet { +namespace op { + +struct AdvancedIndexingTakeCPU { + // assume that idx have been flattened to a 1-D tensor (N,) + // assume that out_data and in_data have been flattened to 2-D tensors, (N, M) and (K, M) + // M is the number of columns of in_data and out_data + // K is the number of rows of in_data + // i is the index of out_data + template + MSHADOW_XINLINE static void Map(index_t i, DType* out_data, const DType* in_data, + const IType* idx, const size_t M, const int64_t K) { +int64_t j = static_cast(idx[i]); +j = j % K; +j += (j < 0) ? 
K : 0; +#pragma GCC diagnostic push +#if __GNUC__ >= 8 +#pragma GCC diagnostic ignored "-Wclass-memaccess" +#endif +std::memcpy(out_data + i * M, in_data + j * M, M * sizeof(DType)); +#pragma GCC diagnostic pop + } +}; + +struct AdvancedIndexingTakeMultiDimensionCPU { + // assume that idx have been flattened to a 1-D tensor (N,) + // assume that out_data and in_data have been flattened to 2-D tensors, (N, M) and (K, M) + // M is the number of columns of in_data and out_data + // K is the number of rows of in_data + // i is the index of out_data + template + MSHADOW_XINLINE static void Map(index_t i, DType* out_data, const DType* in_data, + const IType* idx, const size_t M, const int64_t K) { +int64_t j = static_cast(idx[i]); +j = j % K; +j += (j < 0) ? K : 0; +#pragma GCC diagnostic push +#if __GNUC__ >= 8 +#pragma GCC diagnostic ignored "-Wclass-memaccess" +#endif +std::memcpy(out_data + i * M, in_data + (i * K + j) * M, M * sizeof(DType)); +#pragma GCC diagnostic pop + } +}; + +struct AdvancedIndexingBooleanMaskBackwardCPUWriteKernel { + template + static void Map(int i, + DType* igrad, + const OpReqType /*req*/, + const DType* ograd, + const int32_t* idx, + const size_t col_size) { +// i is row id already +int32_t prev = (i == 0) ? 
0 : idx[i - 1]; +int32_t curr = idx[i]; +#pragma GCC diagnostic push +#if __GNUC__ >= 8 +#pragma GCC diagnostic ignored "-Wclass-memaccess" +#endif +if (prev != curr) { + std::memcpy(igrad + i * col_size, ograd + prev * col_size, col_size * sizeof(DType)); +} else { + std::memset(igrad + i * col_size, 0, col_size * sizeof(DType)); +} +#pragma GCC diagnostic pop + } +}; + +template +bool CheckIndexOutOfBound(const DType* data_ptr, size_t data_size, + const DType min, const DType max) { + bool is_valid = true; + for (size_t i = 0; i < data_size; i++) { +if (data_ptr[i] > max || data_ptr[i] < min) { + is_valid = false; + break; +} + } + return is_valid; +} + +inline bool AdvancedIndexingOpType(const nnvm::NodeAttrs& attrs, + std::vector *in_attrs, + std::vector *out_attrs) { + CHECK_EQ(in_attrs->size(), 2U); + CHECK_EQ(out_attrs->size(), 1U); + CHECK_NE((*in_attrs)[1], -1) << "Index type must be set for take operator"; + + TYPE_ASSIGN_CHECK(*out_attrs, 0, (*in_attrs)[0]); + TYPE_ASSIGN_CHECK(*in_attrs, 0, (*out_attrs)[0]); + return (*in_attrs)[0] != -1; +} + +bool AdvancedIndexingOpStorageType(const nnvm::NodeAttrs& attrs, +const int dev_mask, +DispatchMode* dispatch_mode, +std::vector *in_attrs, +std::vector *out_attrs) { + CHECK_EQ(in_attrs->size(), 2); + CHECK_EQ(out_attrs->size(), 1); + for (int : *in_attrs) { +CHECK_EQ(attr, kDefaultStorage) << "Only default storage is supported"; + } + for (int : *out_attrs) { +attr = kDefaultStorage; + } + *dispatch_mode = DispatchMode::kFComputeEx; + return true; +} + +bool
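Both `Map` kernels quoted in this diff normalize an index with `j = j % K; j += (j < 0) ? K : 0;`. In C++ the result of `%` takes the sign of the dividend, so the second statement is what brings negative indices back into range. The sketch below restates that logic in plain Python for a list of rows; the function names are hypothetical, and `c_remainder` exists only because Python's own `%` already returns a non-negative result for a positive divisor.

```python
def c_remainder(a, b):
    """Remainder with C++ semantics: the result takes the sign of `a`."""
    r = abs(a) % abs(b)
    return -r if a < 0 else r


def wrap_index(j, K):
    """Mirror of `j = j % K; j += (j < 0) ? K : 0;` from the kernel."""
    j = c_remainder(j, K)
    if j < 0:
        j += K
    return j


def take_rows(in_data, idx):
    """Copy the row selected by each (wrapped) index, like the memcpy per output row."""
    K = len(in_data)
    return [list(in_data[wrap_index(j, K)]) for j in idx]


rows = [[0, 1], [10, 11], [20, 21]]
print(take_rows(rows, [0, -1, 4, -3]))
```

Note that with this normalization an index such as 4 on a 3-row input wraps around to row 1 instead of raising an error.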
[GitHub] [incubator-mxnet] Mauhing edited a comment on issue #13484: flaky test test_gluon_data.test_recordimage_dataset_with_data_loader_multiworker
Mauhing edited a comment on issue #13484: URL: https://github.com/apache/incubator-mxnet/issues/13484#issuecomment-644560102 It may be a shared memory problem. Check the shm usage by running `df -h` and finding the shm entry. If it is at 100%, your multiprocess workers will just stall. If you use Docker, launch the container with a larger shared memory size, e.g. `docker run --shm-size 1024m`. Why? gluon.data.DataLoader uses Python multiprocessing, and multiprocessing needs shared memory. The default shared memory size in a Docker container is 64m.
[GitHub] [incubator-mxnet] Mauhing edited a comment on issue #13484: flaky test test_gluon_data.test_recordimage_dataset_with_data_loader_multiworker
Mauhing edited a comment on issue #13484: URL: https://github.com/apache/incubator-mxnet/issues/13484#issuecomment-644560102 It may be a shared memory problem. Check the shm usage by running `df -h` and finding the shm entry. If it is at 100%, your multiprocess workers will just stall. If you use Docker, the default is 64m, which is not enough for the DataLoader. Launch the container with a larger shared memory size, e.g. `docker run --shm-size 1024m`. Why? gluon.data.DataLoader uses Python multiprocessing, and multiprocessing needs shared memory.
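The `df -h` check described above can also be done from Python before starting a multi-worker DataLoader. This is a minimal sketch using only the standard library; `/dev/shm` is assumed to be where shared memory is mounted (the usual location on Linux and in Docker containers), and the helper name `usage_percent` is hypothetical.

```python
import os
import shutil


def usage_percent(path):
    """Return how full the filesystem containing `path` is, in percent."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


SHM_PATH = "/dev/shm"  # assumed mount point for shared memory

if os.path.isdir(SHM_PATH):
    total_mib = shutil.disk_usage(SHM_PATH).total / (1024 * 1024)
    print(f"{SHM_PATH}: {usage_percent(SHM_PATH):.1f}% of {total_mib:.0f} MiB used")
else:
    print(f"{SHM_PATH} not found on this system")
```

If the reported total is close to 64 MiB inside a container, the container is most likely running with Docker's default shared memory size.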
[GitHub] [incubator-mxnet] sxjscience commented on a change in pull request #18319: [numpy] symbolic advanced indexing
sxjscience commented on a change in pull request #18319: URL: https://github.com/apache/incubator-mxnet/pull/18319#discussion_r440636183 ## File path: src/operator/numpy/np_indexing_op.cu ## @@ -0,0 +1,574 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +/*! + * Copyright (c) 2018 by Contributors + * \file np_indexing_op.cu +*/ + +#include "./np_indexing_op.h" +#include + +namespace mxnet { +namespace op { + +/*! \brief If there are out-of-bound indices, out will be assigned to 1. 
+ */ +struct is_valid_check { + template + MSHADOW_XINLINE static void Map(int i, char* out, const DType* data, + const DType min, const DType max) { +if (data[i] < min || data[i] > max) *out = 1; + } +}; + +template +bool CheckIndexOutOfBound(mshadow::Stream *s, const DType* data_ptr, size_t data_size, +const DType min, const DType max, char* is_valid_ptr) { +using namespace mxnet_op; +int32_t is_valid = 0; +Kernel::Launch(s, 1, is_valid_ptr); +Kernel::Launch(s, data_size, is_valid_ptr, data_ptr, min, max); +CUDA_CALL(cudaMemcpyAsync(_valid, is_valid_ptr, sizeof(char), +cudaMemcpyDeviceToHost, mshadow::Stream::GetStream(s))); +CUDA_CALL(cudaStreamSynchronize(mshadow::Stream::GetStream(s))); +return is_valid == 0; +} + +struct AdvancedIndexingTakeGPU { +// assume that idx have been flattened to a 1-D tensor (N,) +// assume that out_data and in_data have been flattened to 2-D tensors, (N, M) and (K, M) +// M is the number of columns of in_data and out_data +// K is the number of rows of in_data +// i is the index of out_data +template +MSHADOW_XINLINE static void Map(int i, DType* out_data, const DType* in_data, +const IType* idx, const int64_t M, const int64_t K) { + int64_t j = static_cast(idx[i]); + j = j % K; + j += (j < 0) ? K : 0; + + for (int64_t k = 0; k < M; k++){ +out_data[i * M + k] = in_data[j * M + k]; + } +} +}; + +struct AdvancedIndexingTakeMultiDimensionGPU { +// assume that idx have been flattened to a 1-D tensor (N,) +// assume that out_data and in_data have been flattened to 2-D tensors, (N, M) and (K, M) +// M is the number of columns of in_data and out_data +// K is the number of rows of in_data +// i is the index of out_data +template +MSHADOW_XINLINE static void Map(int i, DType* out_data, const DType* in_data, +const IType* idx, const int64_t M, const int64_t K) { + int64_t j = static_cast(idx[i]); + j = j % K; + j += (j < 0) ? 
K : 0; + + for (int64_t k = 0; k < M; k++){ +out_data[i * M + k] = in_data[(i * k + j) * M + k]; + } +} +}; + +template<> +inline void AdvancedIndexingOpForward(const nnvm::NodeAttrs& attrs, +const OpContext , +const std::vector , +const std::vector , +const std::vector ) { + using namespace mshadow; + CHECK_EQ(inputs.size(), 2U); + CHECK_EQ(outputs.size(), 1U); + + if (inputs[np_indexing_::kIdx].dtype() == mshadow::kBool) { +CHECK(req[0] == kWriteTo || req[0] == kWriteInplace); +const int axis = 0; +const NDArray = inputs[0]; +const NDArray = inputs[1]; +const NDArray = outputs[0]; +CHECK_EQ(axis, 0) << "Not supported yet"; +CHECK_EQ(data.shape()[axis], idx.shape()[0]); +CHECK_EQ(idx.shape().ndim(), 1U); +Stream* s = ctx.get_stream(); +cudaStream_t stream = Stream::GetStream(s); +// count the number of 1s in `idx`, so that we could know the output dimension +size_t idx_size = idx.shape()[0]; +int32_t valid_num = 0; +int32_t* prefix_sum = nullptr; +void* d_temp_storage = nullptr; +size_t temp_storage_bytes = 0; +// Calculate total temporary memory size +cub::DeviceScan::InclusiveSum(d_temp_storage, +temp_storage_bytes, +prefix_sum, +
[GitHub] [incubator-mxnet] sxjscience commented on a change in pull request #18319: [numpy] symbolic advanced indexing
sxjscience commented on a change in pull request #18319:
URL: https://github.com/apache/incubator-mxnet/pull/18319#discussion_r440634530

## File path: src/operator/numpy/np_indexing_op.cu ##
@@ -0,0 +1,574 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+/*!
+ * Copyright (c) 2018 by Contributors
+ * \file np_indexing_op.cu
+*/
+
+#include "./np_indexing_op.h"
+#include
+
+namespace mxnet {
+namespace op {
+
+/*! \brief If there are out-of-bound indices, out will be assigned to 1.
+ */
+struct is_valid_check {
+  template
+  MSHADOW_XINLINE static void Map(int i, char* out, const DType* data,
+                                  const DType min, const DType max) {
+    if (data[i] < min || data[i] > max) *out = 1;
+  }
+};
+
+template
+bool CheckIndexOutOfBound(mshadow::Stream *s, const DType* data_ptr, size_t data_size,
+                          const DType min, const DType max, char* is_valid_ptr) {
+  using namespace mxnet_op;
+  int32_t is_valid = 0;
+  Kernel::Launch(s, 1, is_valid_ptr);
+  Kernel::Launch(s, data_size, is_valid_ptr, data_ptr, min, max);
+  CUDA_CALL(cudaMemcpyAsync(&is_valid, is_valid_ptr, sizeof(char),

Review comment:
Here, `is_valid` has dtype `int32_t`, but `is_valid_ptr` has dtype `char`. Consider changing the dtype of `is_valid` to `char` so that both sides of the copy have the same width.
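The width mismatch raised in this review comment can be illustrated without a GPU. The sketch below is host-only and uses `std::memcpy` as a stand-in for `cudaMemcpyAsync`; the function names `CopyFlagIntoInt32` and `CopyFlagIntoChar` are hypothetical. It shows that a one-byte copy into a four-byte variable gives a correct "== 0" check only because the destination was zero-initialized first.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Mismatched widths: copy sizeof(char) bytes into a 4-byte host variable.
// Only one byte of host_flag is overwritten; the other three keep whatever
// value `initial` had.
inline int32_t CopyFlagIntoInt32(char device_flag, int32_t initial) {
  assert(device_flag == 0 || device_flag == 1);  // the kernel writes only 0 or 1
  int32_t host_flag = initial;
  std::memcpy(&host_flag, &device_flag, sizeof(char));
  return host_flag;
}

// Matched widths, as the reviewer suggests: the copy overwrites the whole
// host variable, so the result never depends on its previous contents.
inline char CopyFlagIntoChar(char device_flag) {
  assert(device_flag == 0 || device_flag == 1);
  char host_flag = 0;
  std::memcpy(&host_flag, &device_flag, sizeof(host_flag));
  return host_flag;
}
```

If the destination held stale non-zero bytes, `CopyFlagIntoInt32(0, -1)` would report a non-zero flag even though the device flag is 0, whereas the `char` version reflects the device flag exactly.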
[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.
This is an automated email from the ASF dual-hosted git repository.

aaronmarkham pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git

The following commit(s) were added to refs/heads/asf-site by this push:
     new 72d6ec7 Bump the publish timestamp.
72d6ec7 is described below

commit 72d6ec79990ac53dadef49cc5461b3ac5b22d719
Author: mxnet-ci
AuthorDate: Tue Jun 16 06:47:39 2020 +

    Bump the publish timestamp.
---
 date.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/date.txt b/date.txt
new file mode 100644
index 000..4d0d978
--- /dev/null
+++ b/date.txt
@@ -0,0 +1 @@
+Tue Jun 16 06:47:39 UTC 2020
[GitHub] [incubator-mxnet] Mauhing commented on issue #13484: flaky test test_gluon_data.test_recordimage_dataset_with_data_loader_multiworker
Mauhing commented on issue #13484: URL: https://github.com/apache/incubator-mxnet/issues/13484#issuecomment-644560102 It may be a shared memory problem. Run `df -h` and look for the `shm` entry to check shared memory usage. If it is at 100%, the multiprocess workers will simply stall. If you use Docker, launch the container with a larger shared memory size, e.g. `docker run --shm-size 1024m`. Why? `gluon.data.DataLoader` uses Python multiprocessing, which needs shared memory, and the default shared memory size in a Docker container is 64m.
[GitHub] [incubator-mxnet] Mauhing commented on issue #18224: DataLoader timed out
Mauhing commented on issue #18224: URL: https://github.com/apache/incubator-mxnet/issues/18224#issuecomment-644557207 I solved it. I had the same problem in d2l.ai, Ch. 7.7. Launch Docker with `--shm-size 1024m`, i.e. `docker run --shm-size 1024m`. *Why?* `gluon.data.DataLoader` uses Python multiprocessing, which needs shared memory. The default shared memory size in a Docker container is 64m. You can check the shm usage by running `df -h` and looking for `shm`.
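The `df -h` check described above can also be done programmatically. The following is a minimal sketch, assuming a POSIX system where shared memory is mounted at `/dev/shm` (an assumption, not something stated in the thread); the helper name `UsedFraction` is hypothetical. It uses the POSIX `statvfs` call and reports total minus free blocks, so the value can differ slightly from the Use% column that `df` prints.

```cpp
#include <cassert>
#include <sys/statvfs.h>

// Returns the used fraction (0.0 to 1.0) of the filesystem containing `path`,
// or a negative value if the path cannot be queried or reports no blocks.
inline double UsedFraction(const char* path) {
  assert(path != nullptr);
  struct statvfs st;
  if (statvfs(path, &st) != 0 || st.f_blocks == 0) {
    return -1.0;
  }
  const double total = static_cast<double>(st.f_blocks);
  const double free_blocks = static_cast<double>(st.f_bfree);
  return (total - free_blocks) / total;
}
```

Calling `UsedFraction("/dev/shm")` inside the container before starting a multi-worker `DataLoader` and seeing a value close to 1.0 would point to the exhaustion described in the comment.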
[GitHub] [incubator-mxnet] JiangZhaoh commented on a change in pull request #18545: add op npx.index_update
JiangZhaoh commented on a change in pull request #18545:
URL: https://github.com/apache/incubator-mxnet/pull/18545#discussion_r440610528

## File path: src/operator/tensor/index_update.cu ##
@@ -0,0 +1,202 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+/*!
+ * \file index_update.cu
+ * \brief GPU implementation of index_update operator
+ */
+
+#include
+#include "./index_update-inl.h"
+#include "../tensor/util/tensor_util-inl.cuh"
+#include "../tensor/util/tensor_util-inl.h"
+
+
+namespace mxnet {
+namespace op {
+
+template
+struct IndexUpdateForwardGPUKernel {
+  MSHADOW_XINLINE static void Map(size_t i, DType* out,
+                                  const DType* val,
+                                  const mshadow::Shape a_tail_shape,
+                                  const mshadow::Shape a_pre_stride,
+                                  const mshadow::Shape val_stride,
+                                  const mshadow::Shape val_shape,
+                                  const int a_tail_size, const int ind_num,
+                                  const int ind_ndim, const int* ind,
+                                  const int a_ndim, const int seg) {
+    index_t id = 0;
+    for (int dim = 0; dim < ind_ndim; ++dim) {
+      id += a_pre_stride[seg + dim] * ind[dim * ind_num + i];
+    }
+    id *= a_tail_size;
+    for (int _i = 0; _i < a_tail_size; ++_i) {
+      mshadow::Shape a_tail_id = mxnet_op::unravel(_i, a_tail_shape);
+      mshadow::Shape val_id;
+      for (int _j = 0; _j < seg; ++_j) {
+        val_id[_j] = 0;
+      }
+      for (int _j = seg; _j < seg + a_ndim; ++_j) {
+        val_id[_j] = (val_shape[_j] == 1) ? 0 : a_tail_id[_j];
+      }
+      val_id[seg + ind_ndim - 1] = (val_shape[seg + ind_ndim - 1] == 1) ? 0 : i;
+      index_t val_dest = mxnet_op::dot(val_id, val_stride);
+      out[id + _i] = val[val_dest];
+    }
+  }
+};
+
+template
+void IndexUpdateForwardCalc(mshadow::Stream *s,
+                            const int ind_num, DType* out,
+                            const DType* val,
+                            const mshadow::Shape a_tail_shape,
+                            const mshadow::Shape a_pre_stride,
+                            const mshadow::Shape val_stride,
+                            const mshadow::Shape val_shape,
+                            const mshadow::Shape a_shape,
+                            const int a_tail_size,
+                            const int ind_ndim, const int* ind,
+                            const int a_ndim) {
+  using namespace mxnet_op;
+  using namespace mshadow;
+  int seg = MXNET_SPECIAL_MAX_NDIM - a_ndim;
+  Kernel, xpu>::Launch(
+    s, ind_num, out, val, a_tail_shape, a_pre_stride,
+    val_stride, val_shape, a_tail_size, ind_num,
+    ind_ndim, ind, a_ndim, seg);
+}
+
+
+struct IndexUpdateBackwardValGPUKernel {
+  template
+  MSHADOW_XINLINE static void Map(size_t i, DType* grad_val,
+                                  const DType* ograd, const int* ind_vec,
+                                  const mshadow::Shape ograd_tail_shape,
+                                  const mshadow::Shape ograd_pre_stride,
+                                  const mshadow::Shape val_stride,
+                                  const mshadow::Shape val_shape,
+                                  const int ograd_tail_size, const int ind_num,
+                                  const int ind_ndim, const int out_ndim, const int seg) {
+    index_t id = 0;
+    for (int dim = 0; dim < ind_ndim; ++dim) {
+      id += ograd_pre_stride[seg + dim] * ind_vec[dim * ind_num + i];
+    }
+    id *= ograd_tail_size;
+    for (int _i = 0; _i < ograd_tail_size; ++_i) {
+      mshadow::Shape ograd_tail_id =
+        mxnet_op::unravel(_i, ograd_tail_shape);
+      mshadow::Shape val_id;
+      for (int _j = 0; _j < seg; ++_j) {
+        val_id[_j] = 0;
+      }
+      for (int _j = seg; _j < seg + out_ndim; ++_j) {
+        val_id[_j] = (val_shape[_j] == 1) ? 0 : ograd_tail_id[_j];
+      }
+      val_id[seg + ind_ndim - 1] = (val_shape[seg + ind_ndim - 1] == 1) ? 0 : i;
+      index_t val_dest = mxnet_op::dot(val_id, val_stride);
+      atomicAdd(&grad_val[val_dest],
[GitHub] [incubator-mxnet] mseth10 edited a comment on pull request #18559: add cd mxnet_lib/static stages to ci
mseth10 edited a comment on pull request #18559: URL: https://github.com/apache/incubator-mxnet/pull/18559#issuecomment-644552613 @leezu @szha please help review and merge. Thanks!
[GitHub] [incubator-mxnet] mseth10 commented on pull request #18559: add cd mxnet_lib/static stages to ci
mseth10 commented on pull request #18559: URL: https://github.com/apache/incubator-mxnet/pull/18559#issuecomment-644552613 @leezu please help review and merge. Thanks!
[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18571: fix contribute page anchor position shifted
mxnet-bot commented on pull request #18571: URL: https://github.com/apache/incubator-mxnet/pull/18571#issuecomment-644550209 Jenkins CI successfully triggered : [unix-cpu]
[GitHub] [incubator-mxnet] ys2843 edited a comment on pull request #18571: fix contribute page anchor position shifted
ys2843 edited a comment on pull request #18571: URL: https://github.com/apache/incubator-mxnet/pull/18571#issuecomment-644550054 @mxnet-bot run ci [unix-cpu]
[GitHub] [incubator-mxnet] ys2843 commented on pull request #18571: fix contribute page anchor position shifted
ys2843 commented on pull request #18571: URL: https://github.com/apache/incubator-mxnet/pull/18571#issuecomment-644550054 @mxnet-bot run [unix-cpu]
[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18571: fix contribute page anchor position shifted
mxnet-bot commented on pull request #18571:
URL: https://github.com/apache/incubator-mxnet/pull/18571#issuecomment-644550082

Undefined action detected.
Permissible actions are : run ci [all], run ci [job1, job2]
Example : @mxnet-bot run ci [all]
Example : @mxnet-bot run ci [centos-cpu, clang]