date:20200616

[GitHub] [incubator-mxnet] ciyongch commented on pull request #18572: [v1.7.x]Add KEY for Ciyong Chen

2020-06-16 Thread GitBox



ciyongch commented on pull request #18572:
URL: https://github.com/apache/incubator-mxnet/pull/18572#issuecomment-645166878


   @TaoLv @pengzhao-intel please help to merge.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [incubator-mxnet] Neutron3529 commented on pull request #18423: fix misbehave of KLDivLoss

2020-06-16 Thread GitBox



Neutron3529 commented on pull request #18423:
URL: https://github.com/apache/incubator-mxnet/pull/18423#issuecomment-645160030


   @mxnet-bot run ci [unix-cpu]




[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18423: fix misbehave of KLDivLoss

2020-06-16 Thread GitBox



mxnet-bot commented on pull request #18423:
URL: https://github.com/apache/incubator-mxnet/pull/18423#issuecomment-645160078


   Jenkins CI successfully triggered : [unix-cpu]




[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18546: [Numpy] FFI: tril_indices

2020-06-16 Thread GitBox



mxnet-bot commented on pull request #18546:
URL: https://github.com/apache/incubator-mxnet/pull/18546#issuecomment-645154562


   Jenkins CI successfully triggered : [unix-cpu, unix-gpu]




[GitHub] [incubator-mxnet] XIAO-XIA commented on pull request #18546: [Numpy] FFI: tril_indices

2020-06-16 Thread GitBox



XIAO-XIA commented on pull request #18546:
URL: https://github.com/apache/incubator-mxnet/pull/18546#issuecomment-645154526


   @mxnet-bot run ci [unix-cpu, unix-gpu]




[GitHub] [incubator-mxnet] ChaiBapchya commented on issue #18493: 3D Upsampling

2020-06-16 Thread GitBox

ChaiBapchya commented on issue #18493:
URL:
https://github.com/apache/incubator-mxnet/issues/18493#issuecomment-645140735

Great to have this use-case. @andevellicus Thanks for bringing it up. While
your experience in Julia will be handy, work would still be needed on the
MXNet Backend [C/C++], because the few lines that @leezu mentioned would go
somewhere here:

For Upsampling Forward & Backward

https://github.com/apache/incubator-mxnet/blob/3b23c2de950fb0e4d44560f4c7ea933a520c526c/src/operator/nn/upsampling-inl.h#L100

CPU-specific implementation
For example, shape inference currently checks for 4D input [2D image]:

https://github.com/apache/incubator-mxnet/blob/eceb5f2c1c494094c1a697286f2c4560b7ca472e/src/operator/nn/upsampling.cc#L44-L45

https://github.com/apache/incubator-mxnet/blob/eceb5f2c1c494094c1a697286f2c4560b7ca472e/src/operator/nn/upsampling.cc#L61-L62

There don't seem to be GPU-specific implementations at the moment. So we are
good on that.

You can take a stab at updating the forward & backward implementations.
Additionally, we could add a test for this 3D image use-case,
similar to

https://github.com/apache/incubator-mxnet/blob/f1f3f44166e2e47afad6c65025fb48dd47efeb65/tests/python/gpu/test_operator_gpu.py#L1481-L1489

I can help review your PR. Feel free to ping me if you need any help / have
specific doubts related to contributing code to the MXNet Backend.
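
To make the shape change concrete, here is a minimal Python sketch of what a relaxed shape check could look like. It is not the MXNet C++ code: the helper name `upsampling_output_shape` is hypothetical, and it assumes a channels-first layout (NCHW for images, NCDHW for volumes) with one integer scale applied to every spatial axis.

```python
def upsampling_output_shape(in_shape, scale):
    """Output shape of nearest-neighbour upsampling for 4D or 5D input.

    Hypothetical helper: the first two axes are assumed to be batch and
    channel, and every remaining axis is treated as spatial.
    """
    if len(in_shape) not in (4, 5):
        raise ValueError("expected 4D or 5D input, got %dD" % len(in_shape))
    if scale < 1:
        raise ValueError("scale must be a positive integer")
    batch, channels = in_shape[:2]
    spatial = tuple(dim * scale for dim in in_shape[2:])
    return (batch, channels) + spatial


shape_2d = upsampling_output_shape((1, 3, 4, 4), 2)     # 2D image
shape_3d = upsampling_output_shape((1, 3, 2, 4, 4), 2)  # 3D volume
print(shape_2d, shape_3d)
```

A new unit test could compare the operator's output shape against a helper like this for both the 4D and the 5D case.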


[GitHub] [incubator-mxnet] ChaiBapchya commented on issue #18551: MXnet with cuda wont install on windows 10!

2020-06-16 Thread GitBox



ChaiBapchya commented on issue #18551:
URL: 
https://github.com/apache/incubator-mxnet/issues/18551#issuecomment-645136694


   @mxnet-label-bot add [windows]




[GitHub] [incubator-mxnet] ChaiBapchya commented on pull request #18560: [CI][1.6.x] fix centos 7 url to unblock centos-cpu & gpu pipeline

2020-06-16 Thread GitBox



ChaiBapchya commented on pull request #18560:
URL: https://github.com/apache/incubator-mxnet/pull/18560#issuecomment-645135685


   Makes sense.
   Can someone then help file this branch protection ticket with apache-infra? 
@leezu @marcoabreu 
   I'm not sure whether I can do that as a contributor. Thanks, everyone, for the 
advice & clarification.




[GitHub] [incubator-mxnet] ChaiBapchya commented on pull request #18573: [v1.x] Cherry-pick Fix mxnet-native and Docker CD pipelines (#17784)

2020-06-16 Thread GitBox



ChaiBapchya commented on pull request #18573:
URL: https://github.com/apache/incubator-mxnet/pull/18573#issuecomment-645133527


   So does this mean the PR is incomplete? Are there more additions to be 
made?
   I can see a few jobs failing [red/yellow] over the past 2 days.




[GitHub] [incubator-mxnet] ChaiBapchya commented on pull request #18559: add cd mxnet_lib/static and python/pypi stages to ci

2020-06-16 Thread GitBox



ChaiBapchya commented on pull request #18559:
URL: https://github.com/apache/incubator-mxnet/pull/18559#issuecomment-645131019


   Thanks for pointing it out. 




[GitHub] [incubator-mxnet] pengzhao-intel commented on issue #18569: [Numpy] softmax, logsoftmax failed on empty ndarray

2020-06-16 Thread GitBox



pengzhao-intel commented on issue #18569:
URL: 
https://github.com/apache/incubator-mxnet/issues/18569#issuecomment-645106526


   Our team will take a look to see whether this is related to the MKL 
integration. Thanks for reporting the issue, @stu1130.




[GitHub] [incubator-mxnet] ciyongch commented on pull request #18572: [v1.7.x]Add KEY for Ciyong Chen

2020-06-16 Thread GitBox



ciyongch commented on pull request #18572:
URL: https://github.com/apache/incubator-mxnet/pull/18572#issuecomment-645104330


   @mxnet-bot run ci [unix-cpu]




[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18572: [v1.7.x]Add KEY for Ciyong Chen

2020-06-16 Thread GitBox



mxnet-bot commented on pull request #18572:
URL: https://github.com/apache/incubator-mxnet/pull/18572#issuecomment-645104371


   Jenkins CI successfully triggered : [unix-cpu]




[GitHub] [incubator-mxnet] leezu commented on pull request #18559: add cd mxnet_lib/static and python/pypi stages to ci

2020-06-16 Thread GitBox



leezu commented on pull request #18559:
URL: https://github.com/apache/incubator-mxnet/pull/18559#issuecomment-645087169


   >  For instance for building binaries for AWS-MXNet we do it on linux 14.04 
[and soon migrating to 16 or 18.04]
   
   That would be a bad idea. You want to be compatible with 
https://www.python.org/dev/peps/pep-0599/ and for that you MUST build on 
CentOS7 or a system with equivalent glibc version
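
As a rough illustration of the glibc constraint, the following Python sketch checks whether a build host's glibc is no newer than the manylinux2014 baseline. PEP 599 ties that baseline to CentOS 7, which ships glibc 2.17. The helper names are hypothetical, and a real wheel audit would inspect the built binary's symbol versions rather than the host alone.

```python
import platform

# glibc baseline of manylinux2014 (CentOS 7), per PEP 599.
MANYLINUX2014_GLIBC = (2, 17)


def parse_glibc_version(text):
    """Turn a version string such as '2.17' into the tuple (2, 17)."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def host_ok_for_manylinux2014(glibc_version):
    """True if a host with this glibc cannot pull in symbols newer than the baseline."""
    return parse_glibc_version(glibc_version) <= MANYLINUX2014_GLIBC


# platform.libc_ver() returns ('glibc', '<version>') on glibc systems
# and empty strings when it cannot tell.
lib, version = platform.libc_ver()
if lib == "glibc" and version:
    print("glibc", version, "ok for manylinux2014:", host_ok_for_manylinux2014(version))
else:
    print("not a glibc system, or version unknown")
```

Under this check, a host reporting glibc 2.19 or newer would not qualify as a manylinux2014 build machine.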




[GitHub] [incubator-mxnet] ChaiBapchya commented on pull request #18559: add cd mxnet_lib/static and python/pypi stages to ci

2020-06-16 Thread GitBox



ChaiBapchya commented on pull request #18559:
URL: https://github.com/apache/incubator-mxnet/pull/18559#issuecomment-645085628


   Curious why static builds are on CentOS [and not on linux].
   For instance, for building binaries for AWS-MXNet we do it on linux 14.04 
[and soon migrating to 16 or 18.04].




[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.

2020-06-16 Thread aaronmarkham

This is an automated email from the ASF dual-hosted git repository.

aaronmarkham pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 5724c8b  Bump the publish timestamp.
5724c8b is described below

commit 5724c8b3501223f3b4baa515a694fbdef99e7035
Author: mxnet-ci 
AuthorDate: Wed Jun 17 00:43:16 2020 +

Bump the publish timestamp.
---
 date.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/date.txt b/date.txt
new file mode 100644
index 000..0cfbad2
--- /dev/null
+++ b/date.txt
@@ -0,0 +1 @@
+Wed Jun 17 00:43:16 UTC 2020

[incubator-mxnet] branch master updated (8039377 -> 103d839)

2020-06-16 Thread lausen

This is an automated email from the ASF dual-hosted git repository.

lausen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git.


from 8039377  add op npx.index_update (#18545)
 add 103d839  Test CD mxnet_lib/static and python/pypi stages on CI (#18559)

No new revisions were added by this update.

Summary of changes:
 ci/docker/runtime_functions.sh| 35 ++--
 ci/jenkins/Jenkins_steps.groovy   | 86 +++
 ci/jenkins/Jenkinsfile_centos_cpu |  6 ++-
 ci/jenkins/Jenkinsfile_centos_gpu |  7 +++-
 4 files changed, 98 insertions(+), 36 deletions(-)

[GitHub] [incubator-mxnet] leezu merged pull request #18559: add cd mxnet_lib/static and python/pypi stages to ci

2020-06-16 Thread GitBox



leezu merged pull request #18559:
URL: https://github.com/apache/incubator-mxnet/pull/18559


   




[GitHub] [incubator-mxnet] mseth10 commented on pull request #18559: add cd mxnet_lib/static and python/pypi stages to ci

2020-06-16 Thread GitBox



mseth10 commented on pull request #18559:
URL: https://github.com/apache/incubator-mxnet/pull/18559#issuecomment-645062149


   @leezu please help merge the pr. thanks!




[GitHub] [incubator-mxnet] eric-haibin-lin opened a new issue #18575: while loop fails with hybridization

2020-06-16 Thread GitBox



eric-haibin-lin opened a new issue #18575:
URL: https://github.com/apache/incubator-mxnet/issues/18575


   ```
   import mxnet as mx
   from mxnet.base import _as_list
   
   class MyBlock(mx.gluon.HybridBlock):
       def __init__(self):
           super().__init__()
   
       def hybrid_forward(self, F, free_nds, loop_nds):
           n_steps = 5
           max_iterations = 5
   
           def step(loop, free):
               (s, ), (a, b) = loop, free
               return (s, s)
   
           cond = lambda loop_vars, _: (loop_vars[0] < 1e35).prod()
           func=lambda *_loop_vars: func(_loop_vars, free_nds)
   
           outputs, final_loop_nds = F.contrib.while_loop(
               cond=lambda *_loop_vars: cond(_loop_vars, free_nds),
               func=lambda *_loop_vars: step(_loop_vars, free_nds),
               loop_vars=loop_nds,
               max_iterations=max_iterations,
           )
   
           outputs = _as_list(outputs)
           final_loop_nds = _as_list(final_loop_nds)
   
           if n_steps == 0:
               outputs = []
           else:
               outputs = [x.slice_axis(axis=0, begin=0, end=n_steps) for x in outputs]
           loop_result_sym = [x * 2 for x in outputs] + [x * 3 for x in final_loop_nds]
           return loop_result_sym
   
   net = MyBlock()
   net.initialize()
   net.hybridize()
   
   free_var_shapes=[(1, ),(1, )]
   loop_var_shapes=[(1, )]
   
   free_nds = [mx.nd.ones(s) for s in free_var_shapes]
   loop_nds = [mx.nd.ones(s) for s in loop_var_shapes]
   
   for n in free_nds + loop_nds:
       n.attach_grad()
   
   with mx.autograd.record():
       result = net(free_nds, loop_nds)
   
   print(result)
   mx.nd.waitall()
   ```
   python3.7 test.py
   ```
   Traceback (most recent call last):
     File "test.py", line 51, in <module>
       result = net(free_nds, loop_nds)
     File "/home/ec2-user/cached_executor/python/mxnet/gluon/block.py", line 1324, in __call__
       return super().__call__(x, *args)
     File "/home/ec2-user/cached_executor/python/mxnet/gluon/block.py", line 705, in __call__
       out = self.forward(*args)
     File "/home/ec2-user/cached_executor/python/mxnet/gluon/block.py", line 1369, in forward
       return self._call_cached_op(x, *args)
     File "/home/ec2-user/cached_executor/python/mxnet/gluon/block.py", line 1090, in _call_cached_op
       out = self._cached_op(*cargs)
     File "mxnet/cython/ndarray.pyx", line 177, in mxnet._cy3.ndarray.CachedOp.__call__
     File "mxnet/cython/./base.pyi", line 41, in mxnet._cy3.ndarray.CALL
   mxnet.base.MXNetError: Traceback (most recent call last):
     File "../src/imperative/imperative.cc", line 217
   MXNetError: Check failed: AGInfo::IsNone(*output): Assigning to NDArrays that are already in a computational graph will cause undefined behavior when evaluating gradients. Please call backward first to clear the graph or do this out side of a record section. Also note that you cannot use inplace operations like +=, *=, relu(x, out=x), y[idx]=x, etc inside a record section._cachedop
   ```
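
For readers unfamiliar with the operator, the result the script expects from `while_loop` can be modelled in plain Python. This is only a sketch of the contract as the reproduction uses it (`func` returns a step output together with the next loop variables, and iteration stops when `cond` fails or `max_iterations` is reached). It is not MXNet's implementation, `while_loop_reference` is a hypothetical helper, and it does not reproduce the failure.

```python
def while_loop_reference(cond, func, loop_vars, max_iterations):
    """Plain-Python model of a bounded while loop over loop variables.

    Returns the list of per-step outputs and the final loop variables.
    """
    loop_vars = list(loop_vars)
    outputs = []
    steps = 0
    while steps < max_iterations and cond(*loop_vars):
        step_output, new_vars = func(*loop_vars)
        # Accept a single value as well as a list/tuple of loop variables.
        if not isinstance(new_vars, (list, tuple)):
            new_vars = [new_vars]
        loop_vars = list(new_vars)
        outputs.append(step_output)
        steps += 1
    return outputs, loop_vars


# Same structure as `step` in the report: the output is s, the next state is s.
outputs, final_vars = while_loop_reference(
    cond=lambda s: s < 1e35,
    func=lambda s: (s, s),
    loop_vars=[1.0],
    max_iterations=5,
)
print(len(outputs), final_vars)
```

With a loop variable that never changes, the condition stays true and the loop runs for exactly `max_iterations` steps, which is why the script slices the outputs to `n_steps` entries.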
   
   




[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18574: Update the onnx-tensorrt submodule

2020-06-16 Thread GitBox



mxnet-bot commented on pull request #18574:
URL: https://github.com/apache/incubator-mxnet/pull/18574#issuecomment-645036167


   Hey @Kh4L , Thanks for submitting the PR 
   All tests are already queued to run once. If tests fail, you can trigger one 
or more tests again with the following commands: 
   - To trigger all jobs: @mxnet-bot run ci [all] 
   - To trigger specific jobs: @mxnet-bot run ci [job1, job2] 
   *** 
   **CI supported jobs**: [centos-gpu, unix-gpu, windows-cpu, website, clang, 
miscellaneous, windows-gpu, sanity, centos-cpu, unix-cpu, edge]
   *** 
   _Note_: 
Only following 3 categories can trigger CI :PR Author, MXNet Committer, 
Jenkins Admin. 
   All CI tests must pass before the PR can be merged. 
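
The command format described above is simple enough to parse with a regular expression. The sketch below is not the bot's actual implementation; `parse_ci_command` is a hypothetical helper, and the job list is copied from the message above.

```python
import re

# Job names copied from the "CI supported jobs" list in the bot message.
SUPPORTED_JOBS = {
    "centos-gpu", "unix-gpu", "windows-cpu", "website", "clang",
    "miscellaneous", "windows-gpu", "sanity", "centos-cpu", "unix-cpu", "edge",
}

COMMAND_RE = re.compile(r"@mxnet-bot\s+run\s+ci\s+\[([^\]]*)\]")


def parse_ci_command(comment):
    """Return the list of jobs requested in a comment, or None if no command is present."""
    match = COMMAND_RE.search(comment)
    if match is None:
        return None
    requested = [job.strip() for job in match.group(1).split(",") if job.strip()]
    if requested == ["all"]:
        return sorted(SUPPORTED_JOBS)
    unknown = [job for job in requested if job not in SUPPORTED_JOBS]
    if unknown:
        raise ValueError("unsupported CI jobs: %s" % ", ".join(unknown))
    return requested


print(parse_ci_command("@mxnet-bot run ci [unix-cpu, unix-gpu]"))
```

A real bot would also have to check that the commenter is the PR author, a committer, or a Jenkins admin, which this sketch leaves out.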
   




[GitHub] [incubator-mxnet] Kh4L opened a new pull request #18574: Update the onnx-tensorrt submodule

2020-06-16 Thread GitBox



Kh4L opened a new pull request #18574:
URL: https://github.com/apache/incubator-mxnet/pull/18574


   ## Description ##
   This PR updates the onnx_tensorrt submodule to the latest commit.
   




[GitHub] [incubator-mxnet] anko-intel commented on issue #14357: [Bug] Batchnorm running_var behaves differently when using gpu vs. cpu

2020-06-16 Thread GitBox



anko-intel commented on issue #14357:
URL: 
https://github.com/apache/incubator-mxnet/issues/14357#issuecomment-645038422


   Hi @ThomasDelteil ,
   I based this on the training script from 
https://github.com/apache/incubator-mxnet/issues/14357#issuecomment-497508722.
   I didn't manage to change the script to produce the same result in each run, 
so instead I ran the test procedure 999 times and took the mean value:
   ```
   import mxnet as mx
   from mxnet import nd, autograd, gluon
   import numpy as np
   
   def transform(data, label):
       return nd.transpose(data.astype(np.float32), (2,0,1))/255, label.astype(np.float32)
   trainset = gluon.data.vision.FashionMNIST(train=True)
   trainset= trainset.transform(transform)
   train_data = gluon.data.DataLoader(dataset=trainset, batch_size=50, shuffle=True)
   SCE = gluon.loss.SoftmaxCrossEntropyLoss()
   
   under_res = {}
   under_sum = {}
   outside_res = {}
   outside_sum = {}
   for ctx in [mx.gpu(), mx.cpu()]:
       under_sum[ctx] = 0.0
       outside_sum[ctx] = 0.0
   
   for t in range(1,1000):
       for ctx in [mx.cpu(), mx.gpu()]:
           net = gluon.model_zoo.vision.get_model('resnet18_v1', pretrained=False, classes=10)
           # Parameter initialization
           net.initialize(mx.init.Xavier(magnitude=2.24), ctx=ctx, force_reinit=True)
           trainer = gluon.Trainer(params=net.collect_params(), optimizer='sgd',
                                   optimizer_params={'learning_rate': .01, 'wd': 0.0001, 'momentum': 0.9})
   
           # Training
           for i, (data, label) in enumerate(train_data):
               data = data.as_in_context(ctx)
               label = label.as_in_context(ctx)
               with autograd.record():
                   output = net(data)
                   loss = SCE(output, label)
               loss.backward()
               trainer.step(data.shape[0])
               if i == 20:
                   break
   
           # Training accuracy under autograd
           accuracy = mx.gluon.metric.Accuracy()
           for i, (data, label) in enumerate(train_data):
               with autograd.record():
                   output = net(data.as_in_context(ctx))
               accuracy.update(label, output)
               if i == 5:
                   break
           under_res[ctx] =  accuracy.get()[1]
           under_sum[ctx] += under_res[ctx]
   
           # Training accuracy outside autograd
           accuracy = mx.gluon.metric.Accuracy()
           for i, (data, label) in enumerate(train_data):
               output = net(data.as_in_context(ctx))
               accuracy.update(label, output)
               if i == 5:
                   break
           outside_res[ctx] =  accuracy.get()[1]
           outside_sum[ctx] += outside_res[ctx]
   
       for ctx in [mx.cpu(), mx.gpu()]:
           print("Test {:3} Accuracy for {}: under autograd: {:.6f} mean: {:.6f},  outside autograd: {:.6f} mean: {:.6f}".format(
               t, ctx,
               under_res[ctx],
               under_sum[ctx] / t,
               outside_res[ctx],
               outside_sum[ctx] / t))
   ```
   It shows that, statistically, GPU and CPU give similar results:
   
   ```
   Test 999 Accuracy for cpu(0): under autograd: 0.58 mean: 0.588709,  outside autograd: 0.17 mean: 0.192819
   Test 999 Accuracy for gpu(0): under autograd: 0.67 mean: 0.589486,  outside autograd: 0.17 mean: 0.191892
   ```
   Please see the log for full data: 
[test_03_master_fixed_sync.txt](https://github.com/apache/incubator-mxnet/files/4789319/test_03_master_fixed_sync.txt)
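
One way to make "statistically similar" more precise is to compare the two means against their combined standard error. The sketch below is an assumption about how such a check could be written, not part of the original experiment. The helper names are hypothetical and the sample values are made up purely to exercise the code.

```python
import math


def mean_and_stderr(samples):
    """Sample mean and standard error of the mean."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return mean, math.sqrt(variance / n)


def means_agree(a, b, z=2.0):
    """True if the two means differ by at most z combined standard errors."""
    mean_a, se_a = mean_and_stderr(a)
    mean_b, se_b = mean_and_stderr(b)
    return abs(mean_a - mean_b) <= z * math.sqrt(se_a ** 2 + se_b ** 2)


# Made-up per-run accuracies, NOT taken from the log.
cpu_runs = [0.58, 0.60, 0.57, 0.61, 0.59]
gpu_runs = [0.67, 0.55, 0.59, 0.58, 0.56]
print(means_agree(cpu_runs, gpu_runs))
```

Feeding the per-run accuracies for each device into `means_agree` would turn the visual comparison of the two means into an explicit check.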
   
   




[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18559: add cd mxnet_lib/static and python/pypi stages to ci

2020-06-16 Thread GitBox



mxnet-bot commented on pull request #18559:
URL: https://github.com/apache/incubator-mxnet/pull/18559#issuecomment-645012723


   Jenkins CI successfully triggered : [windows-cpu]




[GitHub] [incubator-mxnet] mseth10 commented on pull request #18559: add cd mxnet_lib/static and python/pypi stages to ci

2020-06-16 Thread GitBox



mseth10 commented on pull request #18559:
URL: https://github.com/apache/incubator-mxnet/pull/18559#issuecomment-645012655


   @mxnet-bot run ci [windows-cpu]




[GitHub] [incubator-mxnet] mseth10 commented on pull request #18573: [v1.x] Cherry-pick Fix mxnet-native and Docker CD pipelines (#17784)

2020-06-16 Thread GitBox



mseth10 commented on pull request #18573:
URL: https://github.com/apache/incubator-mxnet/pull/18573#issuecomment-644995374


   We still need to check for this issue:
   https://github.com/apache/incubator-mxnet/pull/18465#issuecomment-638364850




[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18573: [v1.x] Cherry-pick Fix mxnet-native and Docker CD pipelines (#17784)

2020-06-16 Thread GitBox



mxnet-bot commented on pull request #18573:
URL: https://github.com/apache/incubator-mxnet/pull/18573#issuecomment-644994141


   Hey @mseth10 , Thanks for submitting the PR 
   All tests are already queued to run once. If tests fail, you can trigger one 
or more tests again with the following commands: 
   - To trigger all jobs: @mxnet-bot run ci [all] 
   - To trigger specific jobs: @mxnet-bot run ci [job1, job2] 
   *** 
   **CI supported jobs**: [centos-gpu, website, unix-cpu, sanity, windows-cpu, 
clang, unix-gpu, windows-gpu, edge, centos-cpu, miscellaneous]
   *** 
   _Note_: 
Only following 3 categories can trigger CI :PR Author, MXNet Committer, 
Jenkins Admin. 
   All CI tests must pass before the PR can be merged. 
   




[GitHub] [incubator-mxnet] mseth10 opened a new pull request #18573: [v1.x] Cherry-pick Fix mxnet-native and Docker CD pipelines (#17784)

2020-06-16 Thread GitBox



mseth10 opened a new pull request #18573:
URL: https://github.com/apache/incubator-mxnet/pull/18573


   * Fix Jenkinsfile CD pipeline for mxnet-native
   * Fix cd/python/docker/python_images.sh
   
   This fixes CD Python docker images pipeline for v1.x branch
   
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/restricted-mxnet-cd%2Fmxnet-cd-release-job-1.x/detail/mxnet-cd-release-job-1.x/309/pipeline




[GitHub] [incubator-mxnet] szha commented on pull request #18560: [CI][1.6.x] fix centos 7 url to unblock centos-cpu & gpu pipeline

2020-06-16 Thread GitBox



szha commented on pull request #18560:
URL: https://github.com/apache/incubator-mxnet/pull/18560#issuecomment-644972460


   > We opted to not turn that in for feature branches to not bother infra too 
much and also allow the release manager to make certain calls without hitting 
limits.
   
   I don't think that's the case. We didn't have an explicit discussion on this, 
nor do I think it's the right approach. It doesn't make sense to allow force 
pushes to release branches while protecting the development branch. Also, 
enabling branch protection for release branches doesn't necessarily "bother" 
Apache infra either, as the setup is likely one-time. Branch protection can be 
turned on for branches that match a pattern, and we do have an explicit pattern 
for release branches according to the release process.
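
To illustrate pattern-based protection, here is a small Python sketch that matches branch names against a glob. The pattern `v*.x` is a hypothetical example chosen to fit branch names of the form seen in this thread's PR titles. The project's documented release pattern and GitHub's own matching rules may differ in detail.

```python
from fnmatch import fnmatchcase

# Hypothetical glob for release branches such as v1.x, v1.6.x, v1.7.x.
RELEASE_BRANCH_PATTERN = "v*.x"


def is_release_branch(name, pattern=RELEASE_BRANCH_PATTERN):
    """True if the branch name matches the release-branch glob (case-sensitive)."""
    return fnmatchcase(name, pattern)


for branch in ["master", "v1.x", "v1.6.x", "v1.7.x"]:
    print(branch, is_release_branch(branch))
```

A single rule of this kind would cover future release branches without another infra request for each one.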




[GitHub] [incubator-mxnet] ys2843 commented on pull request #18571: fix contribute page anchor position shifted

2020-06-16 Thread GitBox



ys2843 commented on pull request #18571:
URL: https://github.com/apache/incubator-mxnet/pull/18571#issuecomment-644962814


   @mxnet-label-bot add [website, pr-awaiting-review]




[GitHub] [incubator-mxnet] mseth10 commented on a change in pull request #18559: add cd mxnet_lib/static and python/pypi stages to ci

2020-06-16 Thread GitBox



mseth10 commented on a change in pull request #18559:
URL: https://github.com/apache/incubator-mxnet/pull/18559#discussion_r441081183



##
File path: ci/jenkins/Jenkinsfile_centos_cpu
##
@@ -37,14 +37,16 @@ core_logic: {
 custom_steps.compile_centos7_cpu('centos7_cpu'),
 custom_steps.compile_centos7_cpu_make('centos7_cpu_make'),
 custom_steps.compile_centos7_cpu_mkldnn(),
+custom_steps.compile_static_cd_cpu('centos7_cpu_cd'),

Review comment:
   Done! Also removed custom_steps.compile_static_python_gpu_cmake() that 
was running cu92 in favor of new stage running cu102.






[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.

2020-06-16 Thread aaronmarkham

This is an automated email from the ASF dual-hosted git repository.

aaronmarkham pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 10c0af3  Bump the publish timestamp.
10c0af3 is described below

commit 10c0af3f9638304bf5e7713d4117797780e1b7fd
Author: mxnet-ci 
AuthorDate: Tue Jun 16 18:48:03 2020 +

Bump the publish timestamp.
---
 date.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/date.txt b/date.txt
new file mode 100644
index 000..2177a10
--- /dev/null
+++ b/date.txt
@@ -0,0 +1 @@
+Tue Jun 16 18:48:03 UTC 2020

[GitHub] [incubator-mxnet] ys2843 edited a comment on pull request #18571: fix contribute page anchor position shifted

2020-06-16 Thread GitBox



ys2843 edited a comment on pull request #18571:
URL: https://github.com/apache/incubator-mxnet/pull/18571#issuecomment-644933217


   > @ys2843 Thanks for prioritizing on this & explaining what the issue was. 
Curious, are there other places in the website where anchor tag header position 
is "fixed"?
   
   I can't find any more, because this problem only occurs on the main information 
site. There aren't many anchors (links that point to places on the same page) 
here.
   Thanks for reviewing and reporting the bug!




[GitHub] [incubator-mxnet] ys2843 commented on pull request #18571: fix contribute page anchor position shifted

2020-06-16 Thread GitBox



ys2843 commented on pull request #18571:
URL: https://github.com/apache/incubator-mxnet/pull/18571#issuecomment-644933217


   > @ys2843 Thanks for prioritizing on this & explaining what the issue was. 
Curious, are there other places in the website where anchor tag header position 
is "fixed"?
   
   I can't find any more, because this problem only occurs on the main information 
site. There aren't many anchors (links that point to places on the same page) 
here.




[GitHub] [incubator-mxnet] leezu commented on a change in pull request #18559: add cd mxnet_lib/static stages to ci

2020-06-16 Thread GitBox



leezu commented on a change in pull request #18559:
URL: https://github.com/apache/incubator-mxnet/pull/18559#discussion_r441049583



##
File path: ci/jenkins/Jenkinsfile_centos_cpu
##
@@ -37,14 +37,16 @@ core_logic: {
 custom_steps.compile_centos7_cpu('centos7_cpu'),
 custom_steps.compile_centos7_cpu_make('centos7_cpu_make'),
 custom_steps.compile_centos7_cpu_mkldnn(),
+custom_steps.compile_static_cd_cpu('centos7_cpu_cd'),

Review comment:
   The new test is a more elaborate version of 
`custom_steps.compile_static_python_cpu_cmake()`, isn't it? If so, let's remove 
`custom_steps.compile_static_python_cpu_cmake()` in favor of the new 
approach.






[incubator-mxnet] branch master updated: add op npx.index_update (#18545)

2020-06-16 Thread sxjscience

This is an automated email from the ASF dual-hosted git repository.

sxjscience pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new 8039377  add op npx.index_update (#18545)
8039377 is described below

commit 8039377e6630bcb00c5a95abdaf0851803686bc6
Author: JiangZhaoh <54654391+jiangzh...@users.noreply.github.com>
AuthorDate: Wed Jun 17 01:45:30 2020 +0800

add op npx.index_update (#18545)

* add op npx.index_update

* remove debug comment

* change eps

* fix stupid error

* add blank line in docs

* gpu temporary space request alignment

* fix test error

Co-authored-by: Ubuntu 
---
 python/mxnet/_numpy_op_doc.py  |  72 ++
 src/operator/tensor/index_add-inl.h|   2 +-
 src/operator/tensor/index_add_backward.cc  |  18 +-
 .../tensor/{index_add-inl.h => index_update-inl.h} | 175 --
 src/operator/tensor/index_update.cc| 261 +
 src/operator/tensor/index_update.cu| 204 
 tests/python/unittest/test_numpy_op.py | 162 +
 7 files changed, 813 insertions(+), 81 deletions(-)

diff --git a/python/mxnet/_numpy_op_doc.py b/python/mxnet/_numpy_op_doc.py
index fecd0e6..b8f4a49 100644
--- a/python/mxnet/_numpy_op_doc.py
+++ b/python/mxnet/_numpy_op_doc.py
@@ -630,6 +630,7 @@ def _npx_index_add(a, ind, val):
 """
 Add values to input according to given indexes.
 If exists repeate positions to be updated, the update value will be 
accumulated.
+
 Parameters
 --
 a : ndarray
@@ -643,10 +644,12 @@ def _npx_index_add(a, ind, val):
   - ind.dtype should be 'int32' or 'int64'
 val : ndarray
 Input data. The array to update the input 'a'.
+
 Returns
 ---
 out : ndarray
 The output array.
+
 Examples
 
 >>> a = np.zeros((2, 3, 4))
@@ -699,6 +702,75 @@ def _npx_index_add(a, ind, val):
 pass
 
 
+def _npx_index_update(a, ind, val):
+"""
+Update values to input according to given indexes.
+If multiple indices refer to the same location it is undefined which 
update is chosen; it may choose
+the order of updates arbitrarily and nondeterministically (e.g., due to 
concurrent updates on some
+hardware platforms). It is recommended not to use repeated positions.
+
+Parameters
+--
+a : ndarray
+Input data. The array to be updated.
+Support dtype: 'float32', 'float64', 'int32', 'int64'.
+ind : ndarray
+Indexes for indicating update positions.
+For example, array([[0, 1], [2, 3], [4, 5]]) indicates there are two 
positions to
+be updated, which are (0, 2, 4) and (1, 3, 5).
+Note: - 'ind' cannot be empty array '[]', for that case, please use 
operator 'add' instead.
+  - 0 <= ind.ndim <= 2.
+  - ind.dtype should be 'int32' or 'int64'
+val : ndarray
+Input data. The array to update the input 'a'.
+Support dtype: 'float32', 'float64', 'int32', 'int64'.
+
+Returns
+---
+out : ndarray
+The output array.
+
+Examples
+
+>>> a = np.zeros((2, 3, 4))
+>>> ind = np.array([[0, 0], [0, 0], [0, 1]], dtype='int32')
+>>> val = np.arange(2).reshape(2) + 1
+>>> b = npx.index_update(a, ind, val)
+>>> b
+array([[[1., 2., 0., 0.],
+[0., 0., 0., 0.],
+[0., 0., 0., 0.]],
+
+   [[0., 0., 0., 0.],
+[0., 0., 0., 0.],
+[0., 0., 0., 0.]]])
+
+>>> ind=np.array([[0, 0], [0, 1]], dtype='int32') 
+>>> val = np.arange(8).reshape(2, 4) 
+>>> b = npx.index_update(a, ind, val)
+>>> b
+array([[[0., 1., 2., 3.],
+[4., 5., 6., 7.],
+[0., 0., 0., 0.]],
+
+   [[0., 0., 0., 0.],
+[0., 0., 0., 0.],
+[0., 0., 0., 0.]]])
+
+>>> val = np.arange(4).reshape(4)  # broadcast 'val'
+>>> b = npx.index_update(a, ind, val)
+>>> b
+array([[[0., 1., 2., 3.],
+[0., 1., 2., 3.],
+[0., 0., 0., 0.]],
+
+[[0., 0., 0., 0.],
+[0., 0., 0., 0.],
+[0., 0., 0., 0.]]])
+"""
+pass
+
+
 def _np_diag(array, k=0):
 """
 Extracts a diagonal or constructs a diagonal array.
diff --git a/src/operator/tensor/index_add-inl.h 
b/src/operator/tensor/index_add-inl.h
index 83463da..122aa01 100644
--- a/src/operator/tensor/index_add-inl.h
+++ b/src/operator/tensor/index_add-inl.h
@@ -52,7 +52,7 @@ inline bool IndexModifyOpType(const nnvm::NodeAttrs& attrs,
   CHECK_NE((*in_attrs)[1], -1);
   CHECK_NE((*in_attrs)[2], -1);
   CHECK_EQ((*in_attrs)[0], (*in_attrs)[2])
-<< "index_add(a, ind, val) only support a.dtype == val.dtype";
+<< "index_add/index_update(a, ind,

[GitHub] [incubator-mxnet] sxjscience merged pull request #18545: add op npx.index_update

2020-06-16 Thread GitBox



sxjscience merged pull request #18545:
URL: https://github.com/apache/incubator-mxnet/pull/18545
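
The docstring examples for `npx.index_update` in the commit above can be mimicked with plain NumPy advanced indexing. The following is only a sketch under assumptions: `index_update_reference` is a hypothetical helper name, it handles only a 2-D `ind` whose rows are per-axis coordinates, and it is not the MXNet kernel (in particular it says nothing about how repeated positions are resolved on GPU).

```python
import numpy as np


def index_update_reference(a, ind, val):
    """NumPy-only stand-in for the documented behaviour (hypothetical helper)."""
    out = np.array(a, copy=True)  # the input array is left untouched
    # Each row of `ind` holds the coordinates along one leading axis of `a`,
    # so the rows become the per-axis index arrays of an advanced-indexing
    # assignment.
    index = tuple(np.asarray(ind))
    out[index] = val  # NumPy broadcasts `val` if its shape allows it
    return out


a = np.zeros((2, 3, 4))

# Two scalar positions: (0, 0, 0) and (0, 0, 1).
b1 = index_update_reference(a, [[0, 0], [0, 0], [0, 1]], np.arange(2) + 1)

# Two row positions: a[0, 0, :] and a[0, 1, :].
b2 = index_update_reference(a, [[0, 0], [0, 1]], np.arange(8).reshape(2, 4))

# Same positions, with a length-4 `val` broadcast to both rows.
b3 = index_update_reference(a, [[0, 0], [0, 1]], np.arange(4))
```

Passing the rows of `ind` as a tuple is what makes NumPy treat them as coordinates per axis rather than as a single index array.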


   




[GitHub] [incubator-mxnet] ChaiBapchya commented on pull request #18571: fix contribute page anchor position shifted

2020-06-16 Thread GitBox



ChaiBapchya commented on pull request #18571:
URL: https://github.com/apache/incubator-mxnet/pull/18571#issuecomment-644911772


   @ys2843 Thanks for prioritizing on this & explaining what the issue was. 
Curious, are there other places in the website where anchor tag header position 
is "fixed"? 




[incubator-mxnet] branch v1.x updated: Increase staggered build timeout to 180 min (#18568)

2020-06-16 Thread marcoabreu

This is an automated email from the ASF dual-hosted git repository.

marcoabreu pushed a commit to branch v1.x
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/v1.x by this push:
 new f91b989  Increase staggered build timeout to 180 min (#18568)
f91b989 is described below

commit f91b98932b0a0846782905a68942f0870242246d
Author: Joe Evans 
AuthorDate: Tue Jun 16 10:25:01 2020 -0700

Increase staggered build timeout to 180 min (#18568)

* Increase staggered build timeout to 180 min, since sanity build has 180 
min timeout.

* Decrease timeout so everyone is happy.

Co-authored-by: Joe Evans 
---
 ci/jenkins/Jenkinsfile_full   | 2 +-
 ci/jenkins/Jenkinsfile_sanity | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/ci/jenkins/Jenkinsfile_full b/ci/jenkins/Jenkinsfile_full
index 33d57d2..415bd7b 100644
--- a/ci/jenkins/Jenkinsfile_full
+++ b/ci/jenkins/Jenkinsfile_full
@@ -21,7 +21,7 @@
 // See documents at https://jenkins.io/doc/book/pipeline/jenkinsfile/
 
 // timeout in minutes
-def max_time = 30
+def max_time = 60
 
 def buildJobs = [
 'centos-cpu',
diff --git a/ci/jenkins/Jenkinsfile_sanity b/ci/jenkins/Jenkinsfile_sanity
index ed4d16e..065202c 100644
--- a/ci/jenkins/Jenkinsfile_sanity
+++ b/ci/jenkins/Jenkinsfile_sanity
@@ -21,7 +21,7 @@
 // See documents at https://jenkins.io/doc/book/pipeline/jenkinsfile/
 
 // timeout in minutes
-max_time = 180
+max_time = 60
 
 node('utility') {
   // Loading the utilities requires a node context unfortunately

[GitHub] [incubator-mxnet] marcoabreu merged pull request #18568: Increase staggered build timeout to 180 min

2020-06-16 Thread GitBox



marcoabreu merged pull request #18568:
URL: https://github.com/apache/incubator-mxnet/pull/18568


   




[GitHub] [incubator-mxnet] anko-intel commented on issue #14357: [Bug] Batchnorm running_var behaves differently when using gpu vs. cpu

2020-06-16 Thread GitBox



anko-intel commented on issue #14357:
URL: 
https://github.com/apache/incubator-mxnet/issues/14357#issuecomment-644887485


   Hi @ThomasDelteil,
   Regarding the training script from 
https://github.com/apache/incubator-mxnet/issues/14357#issuecomment-497487102:
   As I mentioned in my previous comment, on the master branch (at 8174771) the 
running variables in BatchNorm are calculated only during the backward pass. 
   Still, there are some differences in the results between the CPU and GPU 
backends. One reason is that the input tensors differ between GPU and CPU, 
because mx.nd.random.normal() produces different results on the two backends. 
According to the documentation at 
https://mxnet.apache.org/api/python/docs/api/mxnet/random/index.html, this is 
expected behavior:
   
   > Random number generators in MXNet are device specific. 
mx.random.seed(seed_state) sets the state of each generator using seed_state 
and the device id. Therefore, random numbers generated from different devices 
can be different even if they are seeded using the same seed.
   > To produce identical random number sequences independent of the device id, 
set optional ctx argument. This produces the same sequence of random numbers 
independent of the device id, but the sequence can be different on different 
kind of devices as MXNet’s random number generators for CPU and GPU use 
different algorithms.
   
   So, for comparison purposes, I moved tensor generation to NumPy.
   The second issue I observe is a synchronization problem for the running 
variables. For now I added a workaround to read the final result of the 
backward pass (I am not sure whether this is an issue for a real network). The 
script could look as follows:
   
   ```
   import mxnet as mx
   from mxnet import gluon
   from mxnet import autograd
   import numpy as np
   
   seed = np.random.randint(np.iinfo(np.int32).max)
   #seed = 0
   print("seed:", seed)
   shape = (1, 3, 224, 224)
   layers = 100
   
   # generate the inputs once with NumPy so both backends see the same data
   dataNumpy = {}
   np.random.seed(seed)
   for i in range(layers):
       dataNumpy[i] = np.random.normal(loc=10, scale=2, size=shape)
   
   for ctx in [mx.cpu(), mx.gpu()]:
       layer2 = gluon.nn.BatchNorm()
       layer2.initialize(ctx=ctx)
   
       for i in range(layers):
           data2 = mx.nd.array(dataNumpy[i], ctx=ctx)
           with autograd.record():
               out = layer2(data2)
           out.backward()
   
       # workaround for synchronization issue
       var1 = layer2.running_var.data().asnumpy()
       for t in range(1, 10):
           var2 = layer2.running_var.data().asnumpy()
           if (var1 != var2).any():
               print(ctx, "- DIFF in running_var reads:\n   0 :", var1,
                     "\n  ", t, ":", var2)
               break
   
       print(ctx, layer2.running_var.data().asnumpy(),
             layer2.running_mean.data().asnumpy())
   ```
   For the test above I receive almost the same values for both backends:
   ```
   seed: 791821049
   cpu(0) [3.9977632 3.999108  4.0007195] [10.000481 10.000663  9.999296]
   gpu(0) [3.997764  3.9991088 4.0007195] [10.000478 10.000664  9.999295]
   ```
   The difference is small enough that I think it can be neglected (it looks 
like a rounding difference between the two backends).
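
One way to make "small enough to neglect" concrete is to compare the posted values with a relative tolerance. This is a sketch only: the numbers are copied from the output above, while the tolerance `RTOL` and the helper `max_rel_diff` are assumptions chosen for illustration, not something MXNet prescribes.

```python
import numpy as np

# Values copied from the CPU/GPU output quoted above.
cpu_var = np.array([3.9977632, 3.999108, 4.0007195], dtype=np.float32)
gpu_var = np.array([3.997764, 3.9991088, 4.0007195], dtype=np.float32)
cpu_mean = np.array([10.000481, 10.000663, 9.999296], dtype=np.float32)
gpu_mean = np.array([10.000478, 10.000664, 9.999295], dtype=np.float32)

RTOL = 1e-5  # assumed tolerance, loose relative to float32 precision


def max_rel_diff(x, y):
    """Largest element-wise relative difference between two arrays."""
    return float(np.max(np.abs(x - y) / np.abs(y)))


var_close = np.allclose(cpu_var, gpu_var, rtol=RTOL, atol=0.0)
mean_close = np.allclose(cpu_mean, gpu_mean, rtol=RTOL, atol=0.0)

print("running_var close: ", var_close, max_rel_diff(cpu_var, gpu_var))
print("running_mean close:", mean_close, max_rel_diff(cpu_mean, gpu_mean))
```

A relative check is used rather than an absolute one because the mean and variance sit at different magnitudes.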




[GitHub] [incubator-mxnet] anko-intel edited a comment on issue #14357: [Bug] Batchnorm running_var behaves differently when using gpu vs. cpu

2020-06-16 Thread GitBox



anko-intel edited a comment on issue #14357:
URL: 
https://github.com/apache/incubator-mxnet/issues/14357#issuecomment-644775609


   Hi @adrianloy,
   Your issue still exists on the 1.6 branch. On the master branch (at 
81747710c) "running_var" is calculated only in the backward pass, on both the 
CPU and GPU backends, so your test gives the same results on both contexts:
   ```
   Batchnorm running var values [1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 
1. 1. 1. 1. 1. 1. 1. 1. 1.
1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
   ```
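
An output of all ones is what one would expect if the running variance is still at its initial value because no backward pass has updated it. The sketch below illustrates that reading in plain NumPy; it assumes an exponential-moving-average update with momentum 0.9 and an all-ones initial value, and `update_running_stat` is a hypothetical helper rather than MXNet code.

```python
import numpy as np


def update_running_stat(running, batch_stat, momentum=0.9):
    """Exponential moving average update (assumed form of the update rule)."""
    return momentum * running + (1.0 - momentum) * batch_stat


rng = np.random.default_rng(0)

running_var = np.ones(3)      # assumed initial value, one entry per channel
initial = running_var.copy()  # what is read back if no update ever runs

for _ in range(100):
    # Same distribution as in the thread, on a smaller tensor to keep it quick.
    batch = rng.normal(loc=10.0, scale=2.0, size=(1, 3, 64, 64))
    batch_var = batch.var(axis=(0, 2, 3))  # per-channel variance
    running_var = update_running_stat(running_var, batch_var)

print("before any update:", initial)
print("after 100 updates:", running_var)
```

With `scale=2` the per-channel variance is about 4, so after enough updates the running value moves from its initial ones toward that figure.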
   




[GitHub] [incubator-mxnet] anko-intel edited a comment on issue #14357: [Bug] Batchnorm running_var behaves differently when using gpu vs. cpu

2020-06-16 Thread GitBox



anko-intel edited a comment on issue #14357:
URL: 
https://github.com/apache/incubator-mxnet/issues/14357#issuecomment-644775609


   Hi @adrianloy,
   Your issue still exists on the 1.6 branch. On the master branch 
"running_var" is calculated only in the backward pass, on both the CPU and GPU 
backends, so your test gives the same results on both contexts:
   ```
   Batchnorm running var values [1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 
1. 1. 1. 1. 1. 1. 1. 1. 1.
1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
   ```
   




[GitHub] [incubator-mxnet] anko-intel commented on issue #14357: [Bug] Batchnorm running_var behaves differently when using gpu vs. cpu

2020-06-16 Thread GitBox



anko-intel commented on issue #14357:
URL: 
https://github.com/apache/incubator-mxnet/issues/14357#issuecomment-644775609


   Hi @adrianloy,
   Your issue still exists on the 1.6 branch. On the master branch 
"running_var" is calculated only in the backward pass, on both the CPU and GPU 
backends, so your test gives the same results on both contexts:
   Batchnorm running var values [1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 
1. 1. 1. 1. 1. 1. 1. 1. 1.
1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
   




[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18572: [v1.7.x]Add KEY for Ciyong Chen

2020-06-16 Thread GitBox



mxnet-bot commented on pull request #18572:
URL: https://github.com/apache/incubator-mxnet/pull/18572#issuecomment-644772671


   Hey @ciyongch , Thanks for submitting the PR 
   All tests are already queued to run once. If tests fail, you can trigger one 
or more tests again with the following commands: 
   - To trigger all jobs: @mxnet-bot run ci [all] 
   - To trigger specific jobs: @mxnet-bot run ci [job1, job2] 
   *** 
   **CI supported jobs**: [website, clang, centos-gpu, centos-cpu, 
miscellaneous, unix-cpu, edge, sanity, windows-cpu, windows-gpu, unix-gpu]
   *** 
   _Note_: 
   Only the following 3 categories can trigger CI: PR Author, MXNet Committer, 
Jenkins Admin. 
   All CI tests must pass before the PR can be merged. 
   




[GitHub] [incubator-mxnet] ciyongch opened a new pull request #18572: [v1.7.x]Add KEY for Ciyong Chen

2020-06-16 Thread GitBox



ciyongch opened a new pull request #18572:
URL: https://github.com/apache/incubator-mxnet/pull/18572


   ## Description ##
   update keys file for Ciyong Chen.
   
   @TaoLv @pengzhao-intel 




[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.

2020-06-16 Thread aaronmarkham

This is an automated email from the ASF dual-hosted git repository.

aaronmarkham pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 9d0447b  Bump the publish timestamp.
9d0447b is described below

commit 9d0447bf49bd487b62b8f42d1b3b2080cbe48f42
Author: mxnet-ci 
AuthorDate: Tue Jun 16 12:48:09 2020 +

Bump the publish timestamp.
---
 date.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/date.txt b/date.txt
new file mode 100644
index 000..324e8c1
--- /dev/null
+++ b/date.txt
@@ -0,0 +1 @@
+Tue Jun 16 12:48:09 UTC 2020

[GitHub] [incubator-mxnet] MoritzMaxeiner commented on pull request #18535: [Numpy] Bugfix of slice operator export (MXNet to ONNX) v2

2020-06-16 Thread GitBox



MoritzMaxeiner commented on pull request #18535:
URL: https://github.com/apache/incubator-mxnet/pull/18535#issuecomment-644731564


   > Is there some procedure to get your Pull Request accepted faster, that I 
am missing?
   
   If there is, I'm not aware of it; I've also got 
[one](https://github.com/apache/incubator-mxnet/pull/16251) that's been waiting 
for a while. That's not really unusual in my experience, though.




[GitHub] [incubator-mxnet] RuRo edited a comment on pull request #18535: [Numpy] Bugfix of slice operator export (MXNet to ONNX) v2

2020-06-16 Thread GitBox



RuRo edited a comment on pull request #18535:
URL: https://github.com/apache/incubator-mxnet/pull/18535#issuecomment-644657795


   @szha can you PTAL or tag another reviewer? Thanks.
   
   P.S. I was wondering. Is there some procedure to get your Pull Request 
accepted faster, that I am missing?
   
   So far, I've submitted 2 PRs that were successfully accepted and 
participated in some other Pull Requests. And in all cases there is a really 
long delay after the PR is "done", where we are just waiting for the reviewers.
   
   If there is no such procedure to get your PRs accepted faster, maybe we need 
some way for the PR owners to triage their Pull Requests based on their status?




[GitHub] [incubator-mxnet] RuRo commented on pull request #18535: [Numpy] Bugfix of slice operator export (MXNet to ONNX) v2

2020-06-16 Thread GitBox



RuRo commented on pull request #18535:
URL: https://github.com/apache/incubator-mxnet/pull/18535#issuecomment-644657795


   @szha can you PTAL or tag another reviewer? Thanks.




[GitHub] [incubator-mxnet] JiangZhaoh commented on pull request #18545: add op npx.index_update

2020-06-16 Thread GitBox



JiangZhaoh commented on pull request #18545:
URL: https://github.com/apache/incubator-mxnet/pull/18545#issuecomment-644645699


   @mxnet-bot run ci [unix-cpu]




[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18545: add op npx.index_update

2020-06-16 Thread GitBox



mxnet-bot commented on pull request #18545:
URL: https://github.com/apache/incubator-mxnet/pull/18545#issuecomment-644645760


   Jenkins CI successfully triggered : [unix-cpu]




[GitHub] [incubator-mxnet] Neutron3529 commented on issue #18551: MXnet with cuda wont install on windows 10!

2020-06-16 Thread GitBox



Neutron3529 commented on issue #18551:
URL: 
https://github.com/apache/incubator-mxnet/issues/18551#issuecomment-644622784


   > > My MXNet with win10 is OK(although quite slow compared to Linux)
   > > have you ever tried calling `nvidia-smi`?
   > 
   > Yes, i call it every day -l 1
   
   If `nvidia-smi` is callable, CUDA should be installed properly
   (you can check the "CUDA Version" field to confirm it).
   ```
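   neutron@Neutron:/me$ nvidia-smi
   Tue Jun 16 16:29:00 2020
   +-----------------------------------------------------------------------------+
   | NVIDIA-SMI 440.82       Driver Version: 440.82       CUDA Version: 10.2     |
   |-------------------------------+----------------------+----------------------+
   | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
   | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
   |===============================+======================+======================|
   |   0  GeForce GTX 1060    On   |         :01:00.0 Off |                  N/A |
   | N/A   80C    P2    78W /  N/A |   2270MiB /  6078MiB |     99%      Default |
   +-------------------------------+----------------------+----------------------+
   
   +-----------------------------------------------------------------------------+
   | Processes:                                                       GPU Memory |
   |  GPU       PID   Type   Process name                             Usage      |
   |=============================================================================|
   |    0      1003      G   /usr/lib/Xorg                                 38MiB |
   |    0     96256      C   python                                      2227MiB |
   +-----------------------------------------------------------------------------+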
   ```
   If CUDA is installed properly, I don't know what would cause MXNet to fail 
to load.
   Maybe `pdb` can help show what is happening.
   ```
   import pdb
   pdb.set_trace()
   import mxnet
   ```
   Save the script above as `test.py`, then execute `python test.py`. Python 
will enter pdb mode; use `s` (step), `n` (next) or `r` (run until return) to 
control the `pdb` session.
   It may help you find which `dll` is missing, and you could then manually 
edit MXNet's `.py` scripts to ensure a successful load.
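
As a lighter first step before stepping through with `pdb`, the import can be wrapped so that the full traceback is captured and printed. This is a sketch using only the standard library; `try_import` is a hypothetical helper name, and whether the traceback actually names the missing `dll` depends on how the load fails.

```python
import importlib
import traceback


def try_import(module_name):
    """Return None on success, or the formatted traceback if the import fails."""
    try:
        importlib.import_module(module_name)
    except Exception:  # ImportError, OSError from a failed library load, etc.
        return traceback.format_exc()
    return None


if __name__ == "__main__":
    error = try_import("mxnet")
    print(error or "mxnet imported successfully")
```

If the traceback ends inside a library-loading call, that is a reasonable place to set a breakpoint for the `pdb` session described above.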




[GitHub] [incubator-mxnet] XIAO-XIA commented on pull request #18546: [Numpy] FFI: tril_indices

2020-06-16 Thread GitBox



XIAO-XIA commented on pull request #18546:
URL: https://github.com/apache/incubator-mxnet/pull/18546#issuecomment-644589045


   > @XIAO-XIA CI fails in a test related to the Op you're modifying. It 
suggests there's a bug in your PR
   
   Thank you very much! I'm trying to fix the bug.




[GitHub] [incubator-mxnet] sxjscience commented on a change in pull request #18319: [numpy] symbolic advanced indexing

2020-06-16 Thread GitBox



sxjscience commented on a change in pull request #18319:
URL: https://github.com/apache/incubator-mxnet/pull/18319#discussion_r440637596



##
File path: src/operator/numpy/np_indexing_op.cc
##
@@ -0,0 +1,544 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+/*!
+ * Copyright (c) 2018 by Contributors
+ * \file np_indexing_op.cc
+*/
+
+#include "./np_indexing_op.h"
+
+namespace mxnet {
+namespace op {
+
+struct AdvancedIndexingTakeCPU {
+  // assume that idx have been flattened to a 1-D tensor (N,)
+  // assume that out_data and in_data have been flattened to 2-D tensors, (N, 
M) and (K, M)
+  // M is the number of columns of in_data and out_data
+  // K is the number of rows of in_data
+  // i is the index of out_data
+  template<typename DType, typename IType>
+  MSHADOW_XINLINE static void Map(index_t i, DType* out_data, const DType* 
in_data,
+  const IType* idx, const size_t M, const 
int64_t K) {
+int64_t j = static_cast<int64_t>(idx[i]);
+j = j % K;
+j += (j < 0) ? K : 0;
+#pragma GCC diagnostic push
+#if __GNUC__ >= 8
+#pragma GCC diagnostic ignored "-Wclass-memaccess"
+#endif
+std::memcpy(out_data + i * M, in_data + j * M, M * sizeof(DType));
+#pragma GCC diagnostic pop
+  }
+};
+
+struct AdvancedIndexingTakeMultiDimensionCPU {
+  // assume that idx have been flattened to a 1-D tensor (N,)
+  // assume that out_data and in_data have been flattened to 2-D tensors, (N, 
M) and (K, M)
+  // M is the number of columns of in_data and out_data
+  // K is the number of rows of in_data
+  // i is the index of out_data
+  template<typename DType, typename IType>
+  MSHADOW_XINLINE static void Map(index_t i, DType* out_data, const DType* 
in_data,
+  const IType* idx, const size_t M, const 
int64_t K) {
+int64_t j = static_cast<int64_t>(idx[i]);
+j = j % K;
+j += (j < 0) ? K : 0;
+#pragma GCC diagnostic push
+#if __GNUC__ >= 8
+#pragma GCC diagnostic ignored "-Wclass-memaccess"
+#endif
+std::memcpy(out_data + i * M, in_data + (i * K + j) * M, M * 
sizeof(DType));
+#pragma GCC diagnostic pop
+  }
+};
+
+struct AdvancedIndexingBooleanMaskBackwardCPUWriteKernel {
+  template<typename DType>
+  static void Map(int i,
+  DType* igrad,
+  const OpReqType /*req*/,
+  const DType* ograd,
+  const int32_t* idx,
+  const size_t col_size) {
+// i is row id already
+int32_t prev = (i == 0) ? 0 : idx[i - 1];
+int32_t curr = idx[i];
+#pragma GCC diagnostic push
+#if __GNUC__ >= 8
+#pragma GCC diagnostic ignored "-Wclass-memaccess"
+#endif
+if (prev != curr) {
+  std::memcpy(igrad + i * col_size, ograd + prev * col_size, col_size * 
sizeof(DType));
+} else {
+  std::memset(igrad + i * col_size, 0, col_size * sizeof(DType));
+}
+#pragma GCC diagnostic pop
+  }
+};
+
+template<typename DType>
+bool CheckIndexOutOfBound(const DType* data_ptr, size_t data_size,
+  const DType min, const DType max) {
+  bool is_valid = true;
+  for (size_t i = 0; i < data_size; i++) {
+if (data_ptr[i] > max || data_ptr[i] < min) {
+  is_valid = false;
+  break;
+}
+  }
+  return is_valid;
+}
+
+inline bool AdvancedIndexingOpType(const nnvm::NodeAttrs& attrs,
+   std::vector<int> *in_attrs,
+   std::vector<int> *out_attrs) {
+  CHECK_EQ(in_attrs->size(), 2U);
+  CHECK_EQ(out_attrs->size(), 1U);
+  CHECK_NE((*in_attrs)[1], -1) << "Index type must be set for take operator";
+
+  TYPE_ASSIGN_CHECK(*out_attrs, 0, (*in_attrs)[0]);
+  TYPE_ASSIGN_CHECK(*in_attrs, 0, (*out_attrs)[0]);
+  return (*in_attrs)[0] != -1;
+}
+
+bool AdvancedIndexingOpStorageType(const nnvm::NodeAttrs& attrs,
+const int dev_mask,
+DispatchMode* dispatch_mode,
+std::vector<int> *in_attrs,
+std::vector<int> *out_attrs) {
+  CHECK_EQ(in_attrs->size(), 2);
+  CHECK_EQ(out_attrs->size(), 1);
+  for (int &attr : *in_attrs) {
+CHECK_EQ(attr, kDefaultStorage) << "Only default storage is supported";
+  }
+  for (int &attr : *out_attrs) {
+attr = kDefaultStorage;
+  }
+  *dispatch_mode = DispatchMode::kFComputeEx;
+  return true;
+}
+
+bool
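
The index handling in `AdvancedIndexingTakeCPU` above (`j = j % K; j += (j < 0) ? K : 0;`) wraps out-of-range and negative row indices back into `[0, K)`. A NumPy sketch of the same idea follows; `take_rows_wrap` is a hypothetical name, `np.fmod` is used to imitate the C++ remainder whose sign follows the dividend, and the comparison against `np.take(..., mode="wrap")` is only a cross-check, not a claim about what MXNet calls internally.

```python
import numpy as np


def take_rows_wrap(in_data, idx):
    """Copy rows of a (K, M) array selected by idx, wrapping indices into [0, K)."""
    K, M = in_data.shape
    out = np.empty((len(idx), M), dtype=in_data.dtype)
    for i, raw in enumerate(idx):
        j = int(np.fmod(raw, K))  # C-style remainder: may be negative
        if j < 0:
            j += K                # shift negative remainders into range
        out[i] = in_data[j]       # row copy, the memcpy in the kernel
    return out


in_data = np.arange(12).reshape(3, 4)  # K = 3 rows, M = 4 columns
idx = np.array([0, 4, -1, -4, 7])

wrapped = take_rows_wrap(in_data, idx)
reference = np.take(in_data, idx, axis=0, mode="wrap")
```

The explicit `if j < 0` branch matters in C++ because `%` there can return a negative value, unlike Python's `%` operator.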

[GitHub] [incubator-mxnet] Mauhing edited a comment on issue #13484: flaky test test_gluon_data.test_recordimage_dataset_with_data_loader_multiworker

2020-06-16 Thread GitBox



Mauhing edited a comment on issue #13484:
URL: 
https://github.com/apache/incubator-mxnet/issues/13484#issuecomment-644560102


   It may be a shared memory problem. Check shm usage with `df -h` and look for 
the shm entry. If it is at 100%, your multiprocess workers will just stall.
   
   If you use Docker:
   Launch the container with a larger shared memory size using `--shm-size`, 
e.g. `docker run --shm-size 1024m`
   
   Why?
   gluon.data.DataLoader uses Python multiprocessing, and multiprocessing needs 
shared memory. The default shared memory size in a Docker container is 64m. You 
can check the shm usage with `df -h` and look for the shm entry.
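
The `df -h` check can also be done from Python, for example at the start of a training script. This is a sketch with the standard library only; `usage_percent` is a hypothetical helper, and `/dev/shm` is assumed to be where shared memory is mounted, which is typical on Linux but not guaranteed.

```python
import os
import shutil


def usage_percent(path="/dev/shm"):
    """Percentage of the filesystem at `path` that is currently in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


if __name__ == "__main__":
    shm_path = "/dev/shm"  # assumed mount point for shared memory
    if os.path.exists(shm_path):
        print(f"{shm_path} usage: {usage_percent(shm_path):.1f}%")
    else:
        print(f"{shm_path} not found on this system")
```

A value close to 100% while DataLoader workers are running would point to the stall described above.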




[GitHub] [incubator-mxnet] Mauhing edited a comment on issue #13484: flaky test test_gluon_data.test_recordimage_dataset_with_data_loader_multiworker

2020-06-16 Thread GitBox



Mauhing edited a comment on issue #13484:
URL: 
https://github.com/apache/incubator-mxnet/issues/13484#issuecomment-644560102


   It may be a shared memory problem. Check shm usage with `df -h` and look for 
the shm entry. If it is at 100%, your multiprocess workers will just stall.
   
   If you use Docker:
   The default is 64m, which is not enough for the DataLoader. Launch the 
container with a larger shared memory size using `--shm-size`, e.g. 
`docker run --shm-size 1024m`
   
   Why?
   gluon.data.DataLoader uses Python multiprocessing, and multiprocessing needs 
shared memory. The default shared memory size in a Docker container is 64m. You 
can check the shm usage with `df -h` and look for the shm entry.




[GitHub] [incubator-mxnet] sxjscience commented on a change in pull request #18319: [numpy] symbolic advanced indexing

2020-06-16 Thread GitBox



sxjscience commented on a change in pull request #18319:
URL: https://github.com/apache/incubator-mxnet/pull/18319#discussion_r440636183



##
File path: src/operator/numpy/np_indexing_op.cu
##
@@ -0,0 +1,574 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+/*!
+ * Copyright (c) 2018 by Contributors
+ * \file np_indexing_op.cu
+*/
+
+#include "./np_indexing_op.h"
+#include 
+
+namespace mxnet {
+namespace op {
+
+/*! \brief If there are out-of-bound indices, out will be assigned to 1.
+ */
+struct is_valid_check {
+  template<typename DType>
+  MSHADOW_XINLINE static void Map(int i, char* out, const DType* data,
+                                  const DType min, const DType max) {
+    if (data[i] < min || data[i] > max) *out = 1;
+  }
+};
+
+template<typename DType>
+bool CheckIndexOutOfBound(mshadow::Stream<gpu> *s, const DType* data_ptr, size_t data_size,
+                          const DType min, const DType max, char* is_valid_ptr) {
+    using namespace mxnet_op;
+    int32_t is_valid = 0;
+    Kernel<set_zero, gpu>::Launch(s, 1, is_valid_ptr);
+    Kernel<is_valid_check, gpu>::Launch(s, data_size, is_valid_ptr, data_ptr, min, max);
+    CUDA_CALL(cudaMemcpyAsync(&is_valid, is_valid_ptr, sizeof(char),
+                              cudaMemcpyDeviceToHost, mshadow::Stream<gpu>::GetStream(s)));
+    CUDA_CALL(cudaStreamSynchronize(mshadow::Stream<gpu>::GetStream(s)));
+    return is_valid == 0;
+}
+
+struct AdvancedIndexingTakeGPU {
+    // assume that idx have been flattened to a 1-D tensor (N,)
+    // assume that out_data and in_data have been flattened to 2-D tensors, (N, M) and (K, M)
+    // M is the number of columns of in_data and out_data
+    // K is the number of rows of in_data
+    // i is the index of out_data
+    template<typename DType, typename IType>
+    MSHADOW_XINLINE static void Map(int i, DType* out_data, const DType* in_data,
+                                    const IType* idx, const int64_t M, const int64_t K) {
+      int64_t j = static_cast<int64_t>(idx[i]);
+      j = j % K;
+      j += (j < 0) ? K : 0;
+
+      for (int64_t k = 0; k < M; k++){
+        out_data[i * M + k] = in_data[j * M + k];
+      }
+    }
+};
+
+struct AdvancedIndexingTakeMultiDimensionGPU {
+    // assume that idx have been flattened to a 1-D tensor (N,)
+    // assume that out_data and in_data have been flattened to 2-D tensors, (N, M) and (K, M)
+    // M is the number of columns of in_data and out_data
+    // K is the number of rows of in_data
+    // i is the index of out_data
+    template<typename DType, typename IType>
+    MSHADOW_XINLINE static void Map(int i, DType* out_data, const DType* in_data,
+                                    const IType* idx, const int64_t M, const int64_t K) {
+      int64_t j = static_cast<int64_t>(idx[i]);
+      j = j % K;
+      j += (j < 0) ? K : 0;
+
+      for (int64_t k = 0; k < M; k++){
+        out_data[i * M + k] = in_data[(i * k + j) * M + k];
+      }
+    }
+};
+
+template<>
+inline void AdvancedIndexingOpForward<gpu>(const nnvm::NodeAttrs& attrs,
+                                           const OpContext &ctx,
+                                           const std::vector<NDArray> &inputs,
+                                           const std::vector<OpReqType> &req,
+                                           const std::vector<NDArray> &outputs) {
+  using namespace mshadow;
+  CHECK_EQ(inputs.size(), 2U);
+  CHECK_EQ(outputs.size(), 1U);
+
+  if (inputs[np_indexing_::kIdx].dtype() == mshadow::kBool) {
+    CHECK(req[0] == kWriteTo || req[0] == kWriteInplace);
+    const int axis = 0;
+    const NDArray &data = inputs[0];
+    const NDArray &idx = inputs[1];
+    const NDArray &out = outputs[0];
+    CHECK_EQ(axis, 0) << "Not supported yet";
+    CHECK_EQ(data.shape()[axis], idx.shape()[0]);
+    CHECK_EQ(idx.shape().ndim(), 1U);
+    Stream<gpu>* s = ctx.get_stream<gpu>();
+    cudaStream_t stream = Stream<gpu>::GetStream(s);
+    // count the number of 1s in `idx`, so that we could know the output dimension
+    size_t idx_size = idx.shape()[0];
+    int32_t valid_num = 0;
+    int32_t* prefix_sum = nullptr;
+    void* d_temp_storage = nullptr;
+    size_t temp_storage_bytes = 0;
+    // Calculate total temporary memory size
+    cub::DeviceScan::InclusiveSum(d_temp_storage,
+                                  temp_storage_bytes,
+                                  prefix_sum,
+
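
As an illustration of the index handling in `AdvancedIndexingTakeGPU` above, here is a host-only C++ sketch that is not part of the PR. `WrapIndex` and `TakeRows` are hypothetical names, and plain `std::vector<float>` stands in for the device buffers. It reproduces the two-step normalisation (`j % K`, then adding `K` when the remainder is negative) and the row gather:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Map an index in any range onto [0, K) with the same two steps as the kernel.
// C++ `%` keeps the sign of the dividend, so a negative remainder is shifted by K.
inline int64_t WrapIndex(int64_t j, int64_t K) {
  assert(K > 0);
  j = j % K;
  j += (j < 0) ? K : 0;
  return j;
}

// Gather rows of a row-major (K, M) matrix into an (N, M) result, N = idx.size().
inline std::vector<float> TakeRows(const std::vector<float>& in_data,
                                   const std::vector<int64_t>& idx,
                                   int64_t M, int64_t K) {
  assert(M >= 0 && static_cast<int64_t>(in_data.size()) == K * M);
  const std::size_t cols = static_cast<std::size_t>(M);
  std::vector<float> out_data(idx.size() * cols);
  for (std::size_t i = 0; i < idx.size(); ++i) {
    const std::size_t j = static_cast<std::size_t>(WrapIndex(idx[i], K));
    for (std::size_t k = 0; k < cols; ++k) {
      out_data[i * cols + k] = in_data[j * cols + k];
    }
  }
  return out_data;
}
```

Note that this normalisation wraps every index, including ones outside `[-K, K)`, rather than rejecting them, so any bounds checking has to happen separately.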

[GitHub] [incubator-mxnet] sxjscience commented on a change in pull request #18319: [numpy] symbolic advanced indexing

2020-06-16 Thread GitBox



sxjscience commented on a change in pull request #18319:
URL: https://github.com/apache/incubator-mxnet/pull/18319#discussion_r440634530



##
File path: src/operator/numpy/np_indexing_op.cu
##
@@ -0,0 +1,574 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+/*!
+ * Copyright (c) 2018 by Contributors
+ * \file np_indexing_op.cu
+*/
+
+#include "./np_indexing_op.h"
+#include 
+
+namespace mxnet {
+namespace op {
+
+/*! \brief If there are out-of-bound indices, out will be assigned to 1.
+ */
+struct is_valid_check {
+  template<typename DType>
+  MSHADOW_XINLINE static void Map(int i, char* out, const DType* data,
+                                  const DType min, const DType max) {
+    if (data[i] < min || data[i] > max) *out = 1;
+  }
+};
+
+template<typename DType>
+bool CheckIndexOutOfBound(mshadow::Stream<gpu> *s, const DType* data_ptr, size_t data_size,
+                          const DType min, const DType max, char* is_valid_ptr) {
+    using namespace mxnet_op;
+    int32_t is_valid = 0;
+    Kernel<set_zero, gpu>::Launch(s, 1, is_valid_ptr);
+    Kernel<is_valid_check, gpu>::Launch(s, data_size, is_valid_ptr, data_ptr, min, max);
+    CUDA_CALL(cudaMemcpyAsync(&is_valid, is_valid_ptr, sizeof(char),

Review comment:
   Here, `is_valid` has dtype `int32_t`, but `is_valid_ptr` points to `char`. 
Consider changing the dtype of `is_valid` to `char`.
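
To make the mismatch concrete, here is a host-only C++ sketch, not taken from the PR. `std::memcpy` stands in for `cudaMemcpyAsync`, and both function names are hypothetical. Copying `sizeof(char)` bytes into an `int32_t` overwrites only one of its four bytes, so the result depends on what the other three held. With the destination initialised to 0, as in the diff, the `== 0` test still gives the right answer, but only because of that initialisation.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Mirrors the reviewed pattern: a one-byte flag is copied into a 4-byte integer.
// Only one byte of `dst` is overwritten; the remaining three keep `initial`.
inline int32_t CopyFlagIntoInt32(char flag, int32_t initial) {
  int32_t dst = initial;
  std::memcpy(&dst, &flag, sizeof(char));
  return dst;
}

// Shape of the suggested fix: the destination has the same type as the source,
// so the copy writes the whole object and the old value cannot leak through.
inline char CopyFlagIntoChar(char flag, char initial) {
  char dst = initial;
  std::memcpy(&dst, &flag, sizeof(char));
  return dst;
}
```

For example, `CopyFlagIntoInt32(0, -1)` is non-zero even though the flag is 0, whereas `CopyFlagIntoChar(0, 1)` is 0.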






[GitHub] [incubator-mxnet] sxjscience commented on a change in pull request #18319: [numpy] symbolic advanced indexing

2020-06-16 Thread GitBox



sxjscience commented on a change in pull request #18319:
URL: https://github.com/apache/incubator-mxnet/pull/18319#discussion_r440633446



##
File path: src/operator/numpy/np_indexing_op.cu
##
@@ -0,0 +1,574 @@

[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.

2020-06-16 Thread aaronmarkham

This is an automated email from the ASF dual-hosted git repository.

aaronmarkham pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 72d6ec7  Bump the publish timestamp.
72d6ec7 is described below

commit 72d6ec79990ac53dadef49cc5461b3ac5b22d719
Author: mxnet-ci 
AuthorDate: Tue Jun 16 06:47:39 2020 +

Bump the publish timestamp.
---
 date.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/date.txt b/date.txt
new file mode 100644
index 000..4d0d978
--- /dev/null
+++ b/date.txt
@@ -0,0 +1 @@
+Tue Jun 16 06:47:39 UTC 2020

[GitHub] [incubator-mxnet] Mauhing commented on issue #13484: flaky test test_gluon_data.test_recordimage_dataset_with_data_loader_multiworker

2020-06-16 Thread GitBox



Mauhing commented on issue #13484:
URL: 
https://github.com/apache/incubator-mxnet/issues/13484#issuecomment-644560102


   It may be a shared memory problem. Check shared memory usage by running 
`df -h` and looking for the `shm` entry. If it is at 100%, the multiprocessing 
workers will simply stall.
   
   If you use Docker, launch the container with `--shm-size=1024m`, i.e. 
`docker run --shm-size=1024m ...`
   
   Why?
   `gluon.data.DataLoader` uses Python multiprocessing, which needs shared 
memory. The default shared memory size in a Docker container is 64 MB. You can 
check shm usage by running `df -h` and looking for `shm`.




[GitHub] [incubator-mxnet] Mauhing commented on issue #18224: DataLoader timed out

2020-06-16 Thread GitBox



Mauhing commented on issue #18224:
URL: 
https://github.com/apache/incubator-mxnet/issues/18224#issuecomment-644557207


   I solved it. I had the same problem in d2l.ai, Ch. 7.7.
   Launch the container with `--shm-size=1024m`, i.e. 
`docker run --shm-size=1024m ...`
   
   *Why?*
   `gluon.data.DataLoader` uses Python multiprocessing, which needs shared 
memory. The default shared memory size in a Docker container is 64 MB. You can 
check shm usage by running `df -h` and looking for `shm`.
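
The `df -h` check can also be done programmatically. The following is a minimal sketch, assuming a POSIX system where shared memory is mounted at `/dev/shm`; `UsedFraction` is a hypothetical helper name. It computes used blocks over total blocks, which is close to, but not exactly, the percentage `df` prints.

```cpp
#include <sys/statvfs.h>

#include <cassert>
#include <string>

// Returns the fraction of the filesystem mounted at `path` that is in use
// (0.0 to 1.0), or -1.0 if the path cannot be queried.
inline double UsedFraction(const std::string& path) {
  struct statvfs st;
  if (statvfs(path.c_str(), &st) != 0) return -1.0;
  if (st.f_blocks == 0) return 0.0;  // empty or pseudo filesystem
  const double total = static_cast<double>(st.f_blocks);
  const double free_blocks = static_cast<double>(st.f_bfree);
  return (total - free_blocks) / total;
}
```

Calling `UsedFraction("/dev/shm")` inside the container before starting the `DataLoader`, and again while workers are running, would show whether the shared memory mount is filling up.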




[GitHub] [incubator-mxnet] JiangZhaoh commented on a change in pull request #18545: add op npx.index_update

2020-06-16 Thread GitBox



JiangZhaoh commented on a change in pull request #18545:
URL: https://github.com/apache/incubator-mxnet/pull/18545#discussion_r440610528



##
File path: src/operator/tensor/index_update.cu
##
@@ -0,0 +1,202 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+/*!
+ * \file index_update.cu
+ * \brief GPU implementation of index_update operator
+ */
+
+#include 
+#include "./index_update-inl.h"
+#include "../tensor/util/tensor_util-inl.cuh"
+#include "../tensor/util/tensor_util-inl.h"
+
+
+namespace mxnet {
+namespace op {
+
+template<typename DType>
+struct IndexUpdateForwardGPUKernel {
+  MSHADOW_XINLINE static void Map(size_t i, DType* out,
+                                  const DType* val,
+                                  const mshadow::Shape<MXNET_SPECIAL_MAX_NDIM> a_tail_shape,
+                                  const mshadow::Shape<MXNET_SPECIAL_MAX_NDIM> a_pre_stride,
+                                  const mshadow::Shape<MXNET_SPECIAL_MAX_NDIM> val_stride,
+                                  const mshadow::Shape<MXNET_SPECIAL_MAX_NDIM> val_shape,
+                                  const int a_tail_size, const int ind_num,
+                                  const int ind_ndim, const int* ind,
+                                  const int a_ndim, const int seg) {
+    index_t id = 0;
+    for (int dim = 0; dim < ind_ndim; ++dim) {
+      id += a_pre_stride[seg + dim] * ind[dim * ind_num + i];
+    }
+    id *= a_tail_size;
+    for (int _i = 0; _i < a_tail_size; ++_i) {
+      mshadow::Shape<MXNET_SPECIAL_MAX_NDIM> a_tail_id = mxnet_op::unravel(_i, a_tail_shape);
+      mshadow::Shape<MXNET_SPECIAL_MAX_NDIM> val_id;
+      for (int _j = 0; _j < seg; ++_j) {
+        val_id[_j] = 0;
+      }
+      for (int _j = seg; _j < seg + a_ndim; ++_j) {
+        val_id[_j] = (val_shape[_j] == 1) ? 0 : a_tail_id[_j];
+      }
+      val_id[seg + ind_ndim - 1] = (val_shape[seg + ind_ndim - 1] == 1) ? 0 : i;
+      index_t val_dest = mxnet_op::dot(val_id, val_stride);
+      out[id + _i] = val[val_dest];
+    }
+  }
+};
+
+template<typename xpu, typename DType>
+void IndexUpdateForwardCalc(mshadow::Stream<xpu> *s,
+                            const int ind_num, DType* out,
+                            const DType* val,
+                            const mshadow::Shape<MXNET_SPECIAL_MAX_NDIM> a_tail_shape,
+                            const mshadow::Shape<MXNET_SPECIAL_MAX_NDIM> a_pre_stride,
+                            const mshadow::Shape<MXNET_SPECIAL_MAX_NDIM> val_stride,
+                            const mshadow::Shape<MXNET_SPECIAL_MAX_NDIM> val_shape,
+                            const mshadow::Shape<MXNET_SPECIAL_MAX_NDIM> a_shape,
+                            const int a_tail_size,
+                            const int ind_ndim, const int* ind,
+                            const int a_ndim) {
+  using namespace mxnet_op;
+  using namespace mshadow;
+  int seg = MXNET_SPECIAL_MAX_NDIM - a_ndim;
+  Kernel<IndexUpdateForwardGPUKernel<DType>, xpu>::Launch(
+    s, ind_num, out, val, a_tail_shape, a_pre_stride,
+    val_stride, val_shape, a_tail_size, ind_num,
+    ind_ndim, ind, a_ndim, seg);
+}
+
+
+struct IndexUpdateBackwardValGPUKernel {
+  template<typename DType>
+  MSHADOW_XINLINE static void Map(size_t i, DType* grad_val,
+                                  const DType* ograd, const int* ind_vec,
+                                  const mshadow::Shape<MXNET_SPECIAL_MAX_NDIM> ograd_tail_shape,
+                                  const mshadow::Shape<MXNET_SPECIAL_MAX_NDIM> ograd_pre_stride,
+                                  const mshadow::Shape<MXNET_SPECIAL_MAX_NDIM> val_stride,
+                                  const mshadow::Shape<MXNET_SPECIAL_MAX_NDIM> val_shape,
+                                  const int ograd_tail_size, const int ind_num,
+                                  const int ind_ndim, const int out_ndim, const int seg) {
+    index_t id = 0;
+    for (int dim = 0; dim < ind_ndim; ++dim) {
+      id += ograd_pre_stride[seg + dim] * ind_vec[dim * ind_num + i];
+    }
+    id *= ograd_tail_size;
+    for (int _i = 0; _i < ograd_tail_size; ++_i) {
+      mshadow::Shape<MXNET_SPECIAL_MAX_NDIM> ograd_tail_id =
+        mxnet_op::unravel(_i, ograd_tail_shape);
+      mshadow::Shape<MXNET_SPECIAL_MAX_NDIM> val_id;
+      for (int _j = 0; _j < seg; ++_j) {
+        val_id[_j] = 0;
+      }
+      for (int _j = seg; _j < seg + out_ndim; ++_j) {
+        val_id[_j] = (val_shape[_j] == 1) ? 0 : ograd_tail_id[_j];
+      }
+      val_id[seg + ind_ndim - 1] = (val_shape[seg + ind_ndim - 1] == 1) ? 0 : i;
+      index_t val_dest = mxnet_op::dot(val_id, val_stride);
+      atomicAdd(&grad_val[val_dest],

[GitHub] [incubator-mxnet] JiangZhaoh commented on a change in pull request #18545: add op npx.index_update

2020-06-16 Thread GitBox



JiangZhaoh commented on a change in pull request #18545:
URL: https://github.com/apache/incubator-mxnet/pull/18545#discussion_r440609104



##
File path: src/operator/tensor/index_update.cu
##
@@ -0,0 +1,202 @@

[GitHub] [incubator-mxnet] mseth10 edited a comment on pull request #18559: add cd mxnet_lib/static stages to ci

2020-06-16 Thread GitBox



mseth10 edited a comment on pull request #18559:
URL: https://github.com/apache/incubator-mxnet/pull/18559#issuecomment-644552613


   @leezu @szha please help review and merge. Thanks!




[GitHub] [incubator-mxnet] mseth10 commented on pull request #18559: add cd mxnet_lib/static stages to ci

2020-06-16 Thread GitBox



mseth10 commented on pull request #18559:
URL: https://github.com/apache/incubator-mxnet/pull/18559#issuecomment-644552613


   @leezu please help review and merge. Thanks!




[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18571: fix contribute page anchor position shifted

2020-06-16 Thread GitBox



mxnet-bot commented on pull request #18571:
URL: https://github.com/apache/incubator-mxnet/pull/18571#issuecomment-644550209


   Jenkins CI successfully triggered : [unix-cpu]




[GitHub] [incubator-mxnet] ys2843 edited a comment on pull request #18571: fix contribute page anchor position shifted

2020-06-16 Thread GitBox



ys2843 edited a comment on pull request #18571:
URL: https://github.com/apache/incubator-mxnet/pull/18571#issuecomment-644550054


   @mxnet-bot run ci [unix-cpu]




[GitHub] [incubator-mxnet] ys2843 commented on pull request #18571: fix contribute page anchor position shifted

2020-06-16 Thread GitBox



ys2843 commented on pull request #18571:
URL: https://github.com/apache/incubator-mxnet/pull/18571#issuecomment-644550054


   @mxnet-bot run [unix-cpu]




[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18571: fix contribute page anchor position shifted

2020-06-16 Thread GitBox



mxnet-bot commented on pull request #18571:
URL: https://github.com/apache/incubator-mxnet/pull/18571#issuecomment-644550082


   Undefined action detected. 
   Permissible actions are : run ci [all], run ci [job1, job2] 
   Example : @mxnet-bot run ci [all] 
   Example : @mxnet-bot run ci [centos-cpu, clang]



