date:20180517

[GitHub] zheng-da commented on issue #10994: MKLDNN fails in the backward computation when forward runs with is_train=False

2018-05-17 Thread GitBox

zheng-da commented on issue #10994: MKLDNN fails in the backward computation 
when forward runs with is_train=False
URL: 
https://github.com/apache/incubator-mxnet/issues/10994#issuecomment-390102644
 
 
   This mode exists in MXNet.
   I'm not sure whether the MKLDNN backward pass can deal with this case. 
Potentially, the backward computation can just fall back.
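
One way to picture the fallback suggested above is the minimal sketch below. It is not MXNet or MKLDNN code. It assumes (the thread does not confirm this) that an accelerated max-pooling forward only saves its argmax workspace when `is_train=True`, so backward has nothing to read after an inference-mode forward. The names `pool_forward`, `pool_backward`, and `workspace` are hypothetical.

```python
# Toy 1-D max pooling (non-overlapping windows) illustrating a fallback path.
# Hypothetical names and an assumed failure cause; not the MKLDNN implementation.

def pool_forward(x, k, is_train):
    """Return (outputs, workspace). The argmax workspace is kept only when training."""
    out, argmax = [], []
    for start in range(0, len(x) - k + 1, k):
        window = x[start:start + k]
        best = max(range(k), key=lambda i: window[i])
        out.append(window[best])
        argmax.append(start + best)
    return out, (argmax if is_train else None)


def pool_backward(x, k, out_grad, workspace):
    """Route each output gradient to the position of its window maximum."""
    if workspace is None:
        # Fallback: the fast path has no saved indices, so recompute them
        # from the input instead of reading memory that was never written.
        _, workspace = pool_forward(x, k, is_train=True)
    in_grad = [0.0] * len(x)
    for g, pos in zip(out_grad, workspace):
        in_grad[pos] += g
    return in_grad


x = [1.0, 3.0, 2.0, 5.0, 4.0, 0.0]
y, ws = pool_forward(x, k=2, is_train=False)   # inference-mode forward
grad = pool_backward(x, 2, [1.0, 1.0, 1.0], ws)
print(y, grad)
```

The fallback costs an extra pass over the input, but it keeps backward correct regardless of the mode the forward pass ran in.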


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] pengzhao-intel commented on issue #10994: MKLDNN fails in the backward computation when forward runs with is_train=False

2018-05-17 Thread GitBox

pengzhao-intel commented on issue #10994: MKLDNN fails in the backward 
computation when forward runs with is_train=False
URL: 
https://github.com/apache/incubator-mxnet/issues/10994#issuecomment-390101245
 
 
   When we run only the forward pass, how does it get into backward?
   If we are running training, `is_train` should be `True`.
   
   I will look into this case. 
   
   



[GitHub] zheng-da opened a new issue #10994: MKLDNN fails in the backward computation when forward runs with is_train=False

2018-05-17 Thread GitBox

zheng-da opened a new issue #10994: MKLDNN fails in the backward computation 
when forward runs with is_train=False
URL: https://github.com/apache/incubator-mxnet/issues/10994
 
 
   This is a pretty special case. When we run forward with is_train=False and 
MKLDNN is enabled, backward fails with a memory error. @ashokei @pengzhao-intel 
@TaoLv 
   
   ```python
   import mxnet as mx
   from mxnet import gluon
   from mxnet.test_utils import assert_almost_equal


   def test_hybrid_static_memory():
       x = mx.nd.random.uniform(shape=(2, 3, 32, 32))
       x.attach_grad()

       net1 = gluon.model_zoo.vision.get_resnet(
           1, 18, pretrained=True, prefix='net_',
           ctx=mx.context.current_context())
       net2 = gluon.model_zoo.vision.get_resnet(
           1, 18, pretrained=True, prefix='net_',
           ctx=mx.context.current_context())
       net1(x)
       net2(x)

       net1.save_params('test.params')
       net2.load_params('test.params')

       def test(net, x):
           # Record the graph, but run the forward pass with is_train=False.
           with mx.autograd.record(False):
               y = net(x) + net(x)
           y.backward()

           grads = {k: v.grad() for k, v in net.collect_params().items()
                    if v.grad_req != 'null'}

           return y, grads

       y1, grads1 = test(net1, x)
       y2, grads2 = test(net2, x)

       assert_almost_equal(y1.asnumpy(), y2.asnumpy(), rtol=1e-3, atol=1e-5)
       for key in grads1:
           print(key)
           try:
               assert_almost_equal(grads1[key].asnumpy(),
                                   grads2[key].asnumpy(), rtol=1e-3, atol=1e-5)
           except Exception as e:
               print(e)
   ```
   
   The memory error is something like this:
   ```
   *** Error in `/usr/bin/python': corrupted double-linked list: 
0x7f426ee97880 ***
   === Backtrace: =
   /lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f4314aa77e5]
   /lib/x86_64-linux-gnu/libc.so.6(+0x80baf)[0x7f4314ab0baf]
   /lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f4314ab453c]
   
/home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmkldnn.so.0(mkldnn_primitive_desc_destroy+0xf)[0x7f4308f39bcf]
   
/home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(_ZNSt19_Sp_counted_deleterIP21mkldnn_primitive_descPF15mkldnn_status_tS1_ESaIvELN9__gnu_cxx12_Lock_policyE2EE10_M_disposeEv+0x2c)[0x7f42db233e4c]
   
/home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(_ZNSt16_Sp_counted_baseILN9__gnu_cxx12_Lock_policyE2EE10_M_releaseEv+0x42)[0x7f42db22c9a2]
   
/home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(_ZNSt14__shared_countILN9__gnu_cxx12_Lock_policyE2EED1Ev+0x27)[0x7f42db22a8ad]
   
/home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(_ZNSt12__shared_ptrI21mkldnn_primitive_descLN9__gnu_cxx12_Lock_policyE2EED1Ev+0x1c)[0x7f42db2241ba]
   
/home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(_ZNSt10shared_ptrI21mkldnn_primitive_descED1Ev+0x18)[0x7f42db2241d6]
   
/home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(_ZN6mkldnn6handleIP21mkldnn_primitive_descNS_13handle_traitsIS2_EEED1Ev+0x18)[0x7f42db2241f2]
   
/home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(_ZN6mkldnn16pooling_backward14primitive_descD1Ev+0x18)[0x7f42db266438]
   
/home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet2op24MKLDNNPoolingGradComputeERKNS_9OpContextERKNS0_12PoolingParamERKNS_7NDArrayES9_PS8_NS_9OpReqTypeES9_+0xa07)[0x7f42db2643f2]
   
/home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet2op23PoolingGradComputeExCPUERKN4nnvm9NodeAttrsERKNS_9OpContextERKSt6vectorINS_7NDArrayESaIS9_EERKS8_INS_9OpReqTypeESaISE_EESD_+0x401)[0x7f42dd262e2d]
   
/home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(_ZNSt17_Function_handlerIFvRKN4nnvm9NodeAttrsERKN5mxnet9OpContextERKSt6vectorINS4_7NDArrayESaIS9_EERKS8_INS4_9OpReqTypeESaISE_EESD_EPSJ_E9_M_invokeERKSt9_Any_dataS3_S7_SD_SI_SD_+0x91)[0x7f42db3727e4]
   
/home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(_ZNKSt8functionIFvRKN4nnvm9NodeAttrsERKN5mxnet9OpContextERKSt6vectorINS4_7NDArrayESaIS9_EERKS8_INS4_9OpReqTypeESaISE_EESD_EEclES3_S7_SD_SI_SD_+0xa6)[0x7f42dd5a5940]
   
/home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(_ZZN5mxnet10imperative14PushFComputeExERKSt8functionIFvRKN4nnvm9NodeAttrsERKNS_9OpContextERKSt6vectorINS_7NDArrayESaISA_EERKS9_INS_9OpReqTypeESaISF_EESE_EEPKNS2_2OpES5_RKNS_7ContextERKS9_IPNS_6engine3VarESaISW_EES10_RKS9_INS_8ResourceESaIS11_EERKS9_IPSA_SaIS16_EES1A_SJ_ENKUlNS_10RunContextEE_clES1B_+0xf7)[0x7f42dd59f493]
   
/home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(_ZNSt17_Function_handlerIFvN5mxnet10RunContextEEZNS0_10imperative14PushFComputeExERKSt8functionIFvRKN4nnvm9NodeAttrsERKNS0_9OpContextERKSt6vectorINS0_7NDArrayESaISD_EERKSC_INS0_9OpReqTypeESaISI_EESH_EEPKNS5_2OpES8_RKNS0_7ContextERKSC_IPNS0_6engine3VarESaISZ_EES13_RKSC_INS0_8ResourceESaIS14_EERKSC_IPSD_SaIS19_EES1D_SM_EUlS1_E_E9_M_invokeERKSt9_Any_dataOS1_+0x44)[0x7f42dd5aa81f]

[GitHub] zhanghang1989 commented on issue #10852: [MXNET-411] Add ROI Align

2018-05-17 Thread GitBox

zhanghang1989 commented on issue #10852: [MXNET-411] Add ROI Align
URL: https://github.com/apache/incubator-mxnet/pull/10852#issuecomment-390098221
 
 
   Remove OMP in backward, since there is no atomic add on CPU.
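
As a rough illustration of why the backward pass is harder to parallelize than the forward pass: backward scatters gradients, and several outputs can add into the same input position, so concurrent writers need either an atomic add or private buffers that are reduced afterwards. The sketch below shows the private-buffer alternative in plain Python. The function names and the two-worker split are assumptions for illustration, not code from the PR, and the "workers" are simulated sequentially.

```python
# Backward of a gather is a scatter-add: many outputs may hit one input slot.
# Hypothetical helper names; workers are simulated in a single thread.

def scatter_add_serial(n_inputs, positions, grads):
    """Reference result: accumulate one contribution at a time."""
    out = [0.0] * n_inputs
    for pos, g in zip(positions, grads):
        out[pos] += g
    return out


def scatter_add_private_buffers(n_inputs, positions, grads, n_workers=2):
    """Each worker accumulates into its own buffer; buffers are summed at the end.

    No two workers ever write to the same memory, so no atomic add is needed,
    at the cost of n_workers times the memory.
    """
    buffers = [[0.0] * n_inputs for _ in range(n_workers)]
    for i, (pos, g) in enumerate(zip(positions, grads)):
        buffers[i % n_workers][pos] += g   # worker i % n_workers owns this item
    return [sum(col) for col in zip(*buffers)]


positions = [0, 2, 2, 1, 2]          # slot 2 receives three contributions
grads = [1.0, 0.5, 0.25, 2.0, 0.25]
a = scatter_add_serial(4, positions, grads)
b = scatter_add_private_buffers(4, positions, grads)
print(a, b)
```

Dropping the parallel loop, as the comment describes, is the simplest option; private buffers trade memory for parallelism.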



[GitHub] xinyu-intel commented on issue #10933: remove unnecessary checks on convolution parameters

2018-05-17 Thread GitBox

xinyu-intel commented on issue #10933: remove unnecessary checks on convolution 
parameters
URL: https://github.com/apache/incubator-mxnet/pull/10933#issuecomment-390092279
 
 
   OK, I'll give it a try.



[GitHub] zheng-da commented on issue #10613: Add Windows MKLDNN Building Instruction

2018-05-17 Thread GitBox

zheng-da commented on issue #10613: Add Windows MKLDNN Building Instruction
URL: https://github.com/apache/incubator-mxnet/pull/10613#issuecomment-390091878
 
 
   It looks good to me. @marcoabreu do you have more comments for this PR? If 
not, could you please merge it?



[GitHub] zheng-da commented on issue #10933: remove unnecessary checks on convolution parameters

2018-05-17 Thread GitBox

zheng-da commented on issue #10933: remove unnecessary checks on convolution 
parameters
URL: https://github.com/apache/incubator-mxnet/pull/10933#issuecomment-390091429
 
 
   Does the convolution have the same performance with and without dilation?
   Could you please also do the same for deconvolution?



[GitHub] indhub closed pull request #10959: [MXNET-423] Gluon Model Zoo Pre Trained Model tutorial

2018-05-17 Thread GitBox

indhub closed pull request #10959: [MXNET-423] Gluon Model Zoo Pre Trained 
Model tutorial
URL: https://github.com/apache/incubator-mxnet/pull/10959
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/docs/tutorials/gluon/pretrained_models.md 
b/docs/tutorials/gluon/pretrained_models.md
new file mode 100644
index 000..0de5fdd0b44
--- /dev/null
+++ b/docs/tutorials/gluon/pretrained_models.md
@@ -0,0 +1,375 @@
+
+# Using pre-trained models in MXNet
+
+
+In this tutorial we will see how to use multiple pre-trained models with 
Apache MXNet. First, let's download three image classification models from the 
Apache MXNet [Gluon model 
zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html).
+* **DenseNet-121** ([research paper](https://arxiv.org/abs/1608.06993)), 
improved state of the art on [ImageNet 
dataset](http://image-net.org/challenges/LSVRC) in 2016.
+* **MobileNet** ([research paper](https://arxiv.org/abs/1704.04861)), 
MobileNets are based on a streamlined architecture that uses depth-wise 
separable convolutions to build light weight deep neural networks, suitable for 
mobile applications.
+* **ResNet-18** ([research paper](https://arxiv.org/abs/1512.03385v1)), the 
-152 version is the 2015 winner in multiple categories.
+
+Why would you want to try multiple models? Why not just pick the one with the 
best accuracy? As we will see later in the tutorial, even though these models 
have been trained on the same dataset and optimized for maximum accuracy, they 
do behave slightly differently on specific images. In addition, prediction 
speed and memory footprints can vary, and that is an important factor for many 
applications. By trying a few pretrained models, you have an opportunity to 
find a model that can be a good fit for solving your business problem.
+
+
+```python
+import json
+
+import matplotlib.pyplot as plt
+import mxnet as mx
+from mxnet import gluon, nd
+from mxnet.gluon.model_zoo import vision
+import numpy as np
+%matplotlib inline
+```
+
+## Loading the model
+
+The [Gluon Model 
Zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html) 
provides a collection of off-the-shelf models. You can get the ImageNet 
pre-trained model by using `pretrained=True`. 
+If you want to train on your own classification problem from scratch, you can 
get an untrained network with a specific number of classes using the `classes` 
parameter: for example `net = vision.resnet18_v1(classes=10)`. However note 
that you cannot use the `pretrained` and `classes` parameter at the same time. 
If you want to use pre-trained weights as initialization of your network except 
for the last layer, have a look at the last section of this tutorial.
+
+We can specify the *context* where we want to run the model: the default 
behavior is to use a CPU context. There are two reasons for this:
+* First, this will allow you to test the notebook even if your machine is not 
equipped with a GPU :)
+* Second, we're going to predict a single image and we don't have any specific 
performance requirements. For production applications where you'd want to 
predict large batches of images with the best possible throughput, a GPU could 
definitely be the way to go.
+* If you want to use a GPU, make sure you have pip installed the right version 
of mxnet, or you will get an error when using the `mx.gpu()` context. Refer to 
the [install instructions](http://mxnet.incubator.apache.org/install/index.html)
+
+
+```python
+# We set the context to CPU; you can switch to GPU if you have one and have installed a compatible version of MXNet
+ctx = mx.cpu() 
+```
+
+
+```python
+# We can load the three models
+densenet121 = vision.densenet121(pretrained=True, ctx=ctx)
+mobileNet = vision.mobilenet0_5(pretrained=True, ctx=ctx)
+resnet18 = vision.resnet18_v1(pretrained=True, ctx=ctx)
+```
+
+We can look at the description of the MobileNet network for example, which has 
a relatively simple yet deep architecture
+
+
+```python
+print(mobileNet)
+```
+
+MobileNet(
+  (features): HybridSequential(
+(0): Conv2D(3 -> 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 
1), bias=False)
+(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=16)
+(2): Activation(relu)
+(3): Conv2D(1 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 
1), groups=16, bias=False)
+(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=16)
+(5): Activation(relu)
+(6): Conv2D(16 -> 32, kernel_size=(1, 1), stride=(1, 1), bias=False)
+(7): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False,

[incubator-mxnet] branch master updated: [MXNET-423] Gluon Model Zoo Pre Trained Model tutorial (#10959)

2018-05-17 Thread indhub

This is an automated email from the ASF dual-hosted git repository.

indhub pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new 6abd654  [MXNET-423] Gluon Model Zoo Pre Trained Model tutorial 
(#10959)
6abd654 is described below

commit 6abd6540af2d5b20ddfe84d086e9b5ff02bb12a2
Author: Thomas Delteil 
AuthorDate: Thu May 17 21:25:56 2018 -0700

[MXNET-423] Gluon Model Zoo Pre Trained Model tutorial (#10959)

* adding the pre-trained model tutorial

* adding pretrained model tutorial

* updating the tutorial

* Update pretrained_models.md

* Update test_tutorials.py

* Update pretrained_models.md

* Update pretrained_models.md

* updates following indhu's review

* Update pretrained_models.md

* Trigger build

* Trigger build

* Trigger build

* implement sina feedback

* Update pretrained_models.md

* Trigger build
---
 docs/tutorials/gluon/pretrained_models.md | 375 ++
 docs/tutorials/index.md   |   2 +-
 tests/tutorials/test_tutorials.py |   3 +
 3 files changed, 379 insertions(+), 1 deletion(-)


[GitHub] asitstands commented on a change in pull request #10970: [MXNET-424] dtype option for multinomial

2018-05-17 Thread GitBox

asitstands commented on a change in pull request #10970: [MXNET-424] dtype 
option for multinomial
URL: https://github.com/apache/incubator-mxnet/pull/10970#discussion_r189154689
 
 

 ##
 File path: src/operator/random/sample_multinomial_op.h
 ##
 @@ -155,9 +158,11 @@ void SampleMultinomialForward(const nnvm::NodeAttrs& 
attrs,
 Tensor uniform =
   ctx.requested[1].get_space_typed(Shape1(N*M), s);
 prnd->SampleUniform(, 0, 1);
-Kernel::Launch(
-  s, N, K, M, inputs[0].dptr(), uniform.dptr_, 
outputs[0].dptr(),
-  param.get_prob ? outputs[1].dptr() : nullptr);
+MSHADOW_TYPE_SWITCH(outputs[0].type_flag_, IType, {
 
 Review comment:
   Sometimes the multinomial samples need further processing in floating point 
arithmetic, so the samples need to be copied into a new array of floating point 
type. The copy slows down the training. For example, in an RBM, the samples are 
passed to `linalg.gemm`, which supports only floating point arrays.
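
To make the motivation concrete, here is a small sketch of inverse-CDF multinomial sampling that writes its samples directly in the requested element type, which avoids the conversion copy described above. It uses only the Python standard library. `sample_multinomial` and its `typecode` parameter are hypothetical stand-ins for the operator and its `dtype` option, not the MXNet implementation.

```python
import random
from array import array


def sample_multinomial(probs, n_samples, typecode="i", rng=random):
    """Draw category indices from `probs`, stored as array(typecode).

    typecode "i" gives integer indices; "f" gives the same indices as floats,
    so float-only routines can consume them without a conversion copy.
    """
    out = array(typecode)
    for _ in range(n_samples):
        u = rng.random()              # uniform draw in [0, 1)
        acc, index = 0.0, len(probs) - 1
        for k, p in enumerate(probs):
            acc += p
            if u < acc:               # first bucket whose cumulative mass exceeds u
                index = k
                break
        out.append(index)
    return out


rng = random.Random(0)
as_int = sample_multinomial([0.2, 0.5, 0.3], 5, "i", rng)
rng = random.Random(0)
as_float = sample_multinomial([0.2, 0.5, 0.3], 5, "f", rng)
print(as_int, as_float)
```

With the same seed both calls pick the same categories; only the storage type of the output differs.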



[GitHub] asitstands commented on a change in pull request #10970: [MXNET-424] dtype option for multinomial

2018-05-17 Thread GitBox

asitstands commented on a change in pull request #10970: [MXNET-424] dtype 
option for multinomial
URL: https://github.com/apache/incubator-mxnet/pull/10970#discussion_r189154689
 
 

 ##
 File path: src/operator/random/sample_multinomial_op.h
 ##
 @@ -155,9 +158,11 @@ void SampleMultinomialForward(const nnvm::NodeAttrs& 
attrs,
 Tensor uniform =
   ctx.requested[1].get_space_typed(Shape1(N*M), s);
 prnd->SampleUniform(, 0, 1);
-Kernel::Launch(
-  s, N, K, M, inputs[0].dptr(), uniform.dptr_, 
outputs[0].dptr(),
-  param.get_prob ? outputs[1].dptr() : nullptr);
+MSHADOW_TYPE_SWITCH(outputs[0].type_flag_, IType, {
 
 Review comment:
   Sometimes the multinomial samples need further processing in floating point 
arithmetic, so the samples need to be copied into a new array of floating point 
type, which slows down the training. For example, in an RBM, the samples are 
passed to `linalg.gemm`, which supports only floating point arrays.



[GitHub] liuzx32 commented on issue #9157: YARN support data locality

2018-05-17 Thread GitBox

liuzx32 commented on issue #9157: YARN support data locality
URL: 
https://github.com/apache/incubator-mxnet/issues/9157#issuecomment-390081995
 
 
   anyone?



[GitHub] wushilian opened a new issue #10993: How to train the model by c++

2018-05-17 Thread GitBox

wushilian opened a new issue #10993: How to train the model by c++
URL: https://github.com/apache/incubator-mxnet/issues/10993
 
 
   If I want to use C++ instead of Python to train the model, how should I do it?



[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.

2018-05-17 Thread zhasheng

This is an automated email from the ASF dual-hosted git repository.

zhasheng pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 1578138  Bump the publish timestamp.
1578138 is described below

commit 15781381db6e046b60a160617572afbf8f6bf043
Author: mxnet-ci 
AuthorDate: Fri May 18 01:58:14 2018 +

Bump the publish timestamp.
---
 date.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/date.txt b/date.txt
new file mode 100644
index 000..896bc49
--- /dev/null
+++ b/date.txt
@@ -0,0 +1 @@
+Fri May 18 01:58:14 UTC 2018

-- 
To stop receiving notification emails like this one, please contact
zhash...@apache.org.

[GitHub] szha commented on issue #10989: [WIP] add gluon model summary

2018-05-17 Thread GitBox

szha commented on issue #10989: [WIP] add gluon model summary
URL: https://github.com/apache/incubator-mxnet/pull/10989#issuecomment-390059672
 
 
   Thanks. I'm still working on a general model summary method, and this PR 
will need another round of review on that.



[GitHub] zhanghang1989 commented on a change in pull request #10852: [MXNET-411] Add ROI Align

2018-05-17 Thread GitBox

zhanghang1989 commented on a change in pull request #10852: [MXNET-411] Add ROI 
Align
URL: https://github.com/apache/incubator-mxnet/pull/10852#discussion_r189138583
 
 

 ##
 File path: src/operator/contrib/roi_align.cc
 ##
 @@ -0,0 +1,586 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+/*!
+ * Copyright (c) 2018 by Contributors
+ * \file roi_align.cc
+ * \brief roi align operator
+ * \author Hang Zhang
+ * Adapted from Caffe2
+*/
+#include "./roi_align-inl.h"
+
+
+namespace mxnet {
+namespace op {
+
+template <typename T>
+struct PreCalc {
+  int pos1;
+  int pos2;
+  int pos3;
+  int pos4;
+  T w1;
+  T w2;
+  T w3;
+  T w4;
+};
+
+template <typename T>
+void pre_calc_for_bilinear_interpolate(
+const int height,
+const int width,
+const int pooled_height,
+const int pooled_width,
+const int iy_upper,
+const int ix_upper,
+T roi_start_h,
+T roi_start_w,
+T bin_size_h,
+T bin_size_w,
+int roi_bin_grid_h,
+int roi_bin_grid_w,
+std::vector<PreCalc<T>>* pre_calc) {
+  int pre_calc_index = 0;
+  for (int ph = 0; ph < pooled_height; ph++) {
+for (int pw = 0; pw < pooled_width; pw++) {
+  for (int iy = 0; iy < iy_upper; iy++) {
+const T yy = roi_start_h + ph * bin_size_h +
+static_cast<T>(iy + .5f) * bin_size_h /
+static_cast<T>(roi_bin_grid_h);  // e.g., 0.5, 1.5
+for (int ix = 0; ix < ix_upper; ix++) {
+  const T xx = roi_start_w + pw * bin_size_w +
+  static_cast<T>(ix + .5f) * bin_size_w /
+  static_cast<T>(roi_bin_grid_w);
+
+  T x = xx;
+  T y = yy;
+  // deal with: inverse elements are out of feature map boundary
+  if (y < -1.0 || y > height || x < -1.0 || x > width) {
+// empty
+PreCalc<T> pc;
+pc.pos1 = 0;
+pc.pos2 = 0;
+pc.pos3 = 0;
+pc.pos4 = 0;
+pc.w1 = 0;
+pc.w2 = 0;
+pc.w3 = 0;
+pc.w4 = 0;
+pre_calc->at(pre_calc_index) = pc;
+pre_calc_index += 1;
+continue;
+  }
+
+  if (y <= 0) {
+y = 0;
+  }
+  if (x <= 0) {
+x = 0;
+  }
+
+  int y_low = static_cast<int>(y);
+  int x_low = static_cast<int>(x);
+  int y_high;
+  int x_high;
+
+  if (y_low >= height - 1) {
+y_high = y_low = height - 1;
+y = (T)y_low;
+  } else {
+y_high = y_low + 1;
+  }
+
+  if (x_low >= width - 1) {
+x_high = x_low = width - 1;
+x = (T)x_low;
+  } else {
+x_high = x_low + 1;
+  }
+
+  T ly = y - y_low;
+  T lx = x - x_low;
+  T hy = 1. - ly, hx = 1. - lx;
+  T w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx;
+
+  // save weights and indeces
+  PreCalc<T> pc;
+  pc.pos1 = y_low * width + x_low;
+  pc.pos2 = y_low * width + x_high;
+  pc.pos3 = y_high * width + x_low;
+  pc.pos4 = y_high * width + x_high;
+  pc.w1 = w1;
+  pc.w2 = w2;
+  pc.w3 = w3;
+  pc.w4 = w4;
+  pre_calc->at(pre_calc_index) = pc;
+
+  pre_calc_index += 1;
+}
+  }
+}
+  }
+}
+
+template <typename T>
+void ROIAlignForward(
+const int nthreads,
+const T* bottom_data,
+const T& spatial_scale,
+const int channels,
+const int height,
+const int width,
+const int pooled_height,
+const int pooled_width,
+const int sampling_ratio,
+const T* bottom_rois,
+int roi_cols,
+T* top_data) {
+  DCHECK(roi_cols == 4 || roi_cols == 5);
+
+  int n_rois = nthreads / channels / pooled_width / pooled_height;
+  // (n, c, ph, pw) is an element in the pooled output
+  // can be parallelized using omp
+  int n;
+#pragma omp parallel for private(n) \
 
 Review comment:
   Are you suggesting removing this omp?



[GitHub] zhanghang1989 commented on a change in pull request #10852: [MXNET-411] Add ROI Align

2018-05-17 Thread GitBox

zhanghang1989 commented on a change in pull request #10852: [MXNET-411] Add ROI 
Align
URL: https://github.com/apache/incubator-mxnet/pull/10852#discussion_r189138585
 
 

 ##
 File path: src/operator/contrib/roi_align.cc
 ##
 @@ -0,0 +1,586 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+/*!
+ * Copyright (c) 2018 by Contributors
+ * \file roi_align.cc
+ * \brief roi align operator
+ * \author Hang Zhang
+ * Adapted from Caffe2
+*/
+#include "./roi_align-inl.h"
+
+
+namespace mxnet {
+namespace op {
+
+template <typename T>
+struct PreCalc {
+  int pos1;
+  int pos2;
+  int pos3;
+  int pos4;
+  T w1;
+  T w2;
+  T w3;
+  T w4;
+};
+
+template <typename T>
+void pre_calc_for_bilinear_interpolate(
+const int height,
+const int width,
+const int pooled_height,
+const int pooled_width,
+const int iy_upper,
+const int ix_upper,
+T roi_start_h,
+T roi_start_w,
+T bin_size_h,
+T bin_size_w,
+int roi_bin_grid_h,
+int roi_bin_grid_w,
+std::vector<PreCalc<T>>* pre_calc) {
+  int pre_calc_index = 0;
+  for (int ph = 0; ph < pooled_height; ph++) {
+for (int pw = 0; pw < pooled_width; pw++) {
+  for (int iy = 0; iy < iy_upper; iy++) {
+const T yy = roi_start_h + ph * bin_size_h +
+static_cast<T>(iy + .5f) * bin_size_h /
+static_cast<T>(roi_bin_grid_h);  // e.g., 0.5, 1.5
+for (int ix = 0; ix < ix_upper; ix++) {
+  const T xx = roi_start_w + pw * bin_size_w +
+  static_cast<T>(ix + .5f) * bin_size_w /
+  static_cast<T>(roi_bin_grid_w);
+
+  T x = xx;
+  T y = yy;
+  // deal with: inverse elements are out of feature map boundary
+  if (y < -1.0 || y > height || x < -1.0 || x > width) {
+// empty
+PreCalc<T> pc;
+pc.pos1 = 0;
+pc.pos2 = 0;
+pc.pos3 = 0;
+pc.pos4 = 0;
+pc.w1 = 0;
+pc.w2 = 0;
+pc.w3 = 0;
+pc.w4 = 0;
+pre_calc->at(pre_calc_index) = pc;
+pre_calc_index += 1;
+continue;
+  }
+
+  if (y <= 0) {
+y = 0;
+  }
+  if (x <= 0) {
+x = 0;
+  }
+
+  int y_low = static_cast<int>(y);
+  int x_low = static_cast<int>(x);
+  int y_high;
+  int x_high;
+
+  if (y_low >= height - 1) {
+y_high = y_low = height - 1;
+y = (T)y_low;
+  } else {
+y_high = y_low + 1;
+  }
+
+  if (x_low >= width - 1) {
+x_high = x_low = width - 1;
+x = (T)x_low;
+  } else {
+x_high = x_low + 1;
+  }
+
+  T ly = y - y_low;
+  T lx = x - x_low;
+  T hy = 1. - ly, hx = 1. - lx;
+  T w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx;
+
+  // save weights and indeces
+  PreCalc<T> pc;
+  pc.pos1 = y_low * width + x_low;
+  pc.pos2 = y_low * width + x_high;
+  pc.pos3 = y_high * width + x_low;
+  pc.pos4 = y_high * width + x_high;
+  pc.w1 = w1;
+  pc.w2 = w2;
+  pc.w3 = w3;
+  pc.w4 = w4;
+  pre_calc->at(pre_calc_index) = pc;
+
+  pre_calc_index += 1;
+}
+  }
+}
+  }
+}
+
+template <typename T>
+void ROIAlignForward(
+const int nthreads,
+const T* bottom_data,
+const T& spatial_scale,
+const int channels,
+const int height,
+const int width,
+const int pooled_height,
+const int pooled_width,
+const int sampling_ratio,
+const T* bottom_rois,
+int roi_cols,
+T* top_data) {
+  DCHECK(roi_cols == 4 || roi_cols == 5);
+
+  int n_rois = nthreads / channels / pooled_width / pooled_height;
+  // (n, c, ph, pw) is an element in the pooled output
+  // can be parallelized using omp
+  int n;
+#pragma omp parallel for private(n) \
+num_threads(engine::OpenMP::Get()->GetRecommendedOMPThreadCount())
+  for (n = 0; n < n_rois; n++) {
+int index_n = n * channels * pooled_width * pooled_height;
+
+// roi could have 4 or 5 columns
+const T* offset_bottom_rois = bottom_rois + n * roi_cols;
+int roi_batch_ind

[GitHub] piiswrong opened a new pull request #10992: unlink memory shared file immediately on linux

2018-05-17 Thread GitBox

piiswrong opened a new pull request #10992: unlink memory shared file 
immediately on linux
URL: https://github.com/apache/incubator-mxnet/pull/10992
 
 
   ## Description ##
   (Brief description on what this PR is about)
   
   ## Checklist ##
   ### Essentials ###
   Please feel free to remove inapplicable items for your PR.
   - [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to 
the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) 
created (except PRs with tiny changes)
   - [ ] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage:
   - Unit tests are added for small changes to verify correctness (e.g. adding 
a new operator)
   - Nightly tests are added for complicated/long-running ones (e.g. changing 
distributed kvstore)
   - Build tests will be added for build configuration changes (e.g. adding a 
new build option with NCCL)
   - [ ] Code is well-documented: 
   - For user-facing API changes, API doc string has been updated. 
   - For new C++ functions in header files, their functionalities and arguments 
are documented. 
   - For new examples, README.md is added to explain what the example does, 
the source of the dataset, expected performance on test set and reference to 
the original paper if applicable
   - Check the API doc at 
http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
   - [ ] To the best of my knowledge, examples are either not affected by this 
change, or have been fixed to be compatible with this change
   
   ### Changes ###
   - [ ] Feature1, tests, (and when applicable, API doc)
   - [ ] Feature2, tests, (and when applicable, API doc)
   
   ## Comments ##
   - If this change is a backward incompatible change, why must this change be 
made.
   - Interesting edge cases to note here
   



[GitHub] indhub commented on a change in pull request #10955: [MXNET-422] Distributed training tutorial

2018-05-17 Thread GitBox

indhub commented on a change in pull request #10955: [MXNET-422] Distributed
training tutorial
URL: https://github.com/apache/incubator-mxnet/pull/10955#discussion_r189136422

##
File path: example/distributed_training/README.md
##
@@ -0,0 +1,231 @@
+# Distributed Training using Gluon
+
+Deep learning models are usually trained using GPUs because GPUs can do a lot
more computations in parallel than CPUs. But even with modern GPUs, it
could take several days to train big models. Training can be done faster by
using multiple GPUs as described in
[this](https://gluon.mxnet.io/chapter07_distributed-learning/multiple-gpus-gluon.html)
tutorial. However, only a certain number of GPUs can be attached to one host
(typically 8 or 16). To make the training even faster, we can use multiple GPUs
attached to multiple hosts.
+
+In this tutorial, we will show how to train a model faster using multi-host
distributed training.
+
+![Multiple GPUs connected to multiple
hosts](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/example/distributed_training/distributed_training.png)
+
+We will use data parallelism to distribute the training, which involves
splitting the training data across GPUs attached to multiple hosts. Since the
hosts work on different subsets of the training data in parallel, the
training completes a lot faster.
+
+In this tutorial, we will train a LeNet network on the MNIST dataset using two
hosts, each having four GPUs.

Review comment:
Valid point. I'll switch to CIFAR.
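
To make the data-parallel split described in the quoted tutorial text concrete, here is a minimal sketch in plain Python. It is not taken from the PR: the helper name `shard_indices` and its parameters `num_samples`, `num_workers` and `rank` are assumptions made for illustration. Each worker takes an interleaved slice of the sample indices, so the shards are disjoint and together cover the whole dataset.

```python
def shard_indices(num_samples, num_workers, rank):
    """Return the sample indices assigned to worker `rank` (0-based).

    Hypothetical helper for illustration only: worker r takes samples
    r, r + num_workers, r + 2 * num_workers, ...
    """
    if not 0 <= rank < num_workers:
        raise ValueError("rank must be in [0, num_workers)")
    return list(range(rank, num_samples, num_workers))


# Two hosts sharing ten samples: each host trains on a disjoint half.
shards = [shard_indices(10, 2, rank) for rank in range(2)]
print(shards)
```

Interleaving is only one possible choice; contiguous blocks per worker would serve equally well as long as every sample lands in exactly one shard.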


[GitHub] xinyu-intel commented on a change in pull request #10852: [MXNET-411] Add ROI Align

2018-05-17 Thread GitBox

xinyu-intel commented on a change in pull request #10852: [MXNET-411] Add ROI 
Align
URL: https://github.com/apache/incubator-mxnet/pull/10852#discussion_r189134301
 
 

 ##
 File path: src/operator/contrib/roi_align.cc
 ##
 @@ -0,0 +1,586 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+/*!
+ * Copyright (c) 2018 by Contributors
+ * \file roi_align.cc
+ * \brief roi align operator
+ * \author Hang Zhang
+ * Adapted from Caffe2
+*/
+#include "./roi_align-inl.h"
+
+
+namespace mxnet {
+namespace op {
+
+template <typename T>
+struct PreCalc {
+  int pos1;
+  int pos2;
+  int pos3;
+  int pos4;
+  T w1;
+  T w2;
+  T w3;
+  T w4;
+};
+
+template <typename T>
+void pre_calc_for_bilinear_interpolate(
+const int height,
+const int width,
+const int pooled_height,
+const int pooled_width,
+const int iy_upper,
+const int ix_upper,
+T roi_start_h,
+T roi_start_w,
+T bin_size_h,
+T bin_size_w,
+int roi_bin_grid_h,
+int roi_bin_grid_w,
+std::vector<PreCalc<T>>* pre_calc) {
+  int pre_calc_index = 0;
+  for (int ph = 0; ph < pooled_height; ph++) {
+for (int pw = 0; pw < pooled_width; pw++) {
+  for (int iy = 0; iy < iy_upper; iy++) {
+const T yy = roi_start_h + ph * bin_size_h +
+static_cast<T>(iy + .5f) * bin_size_h /
+static_cast<T>(roi_bin_grid_h);  // e.g., 0.5, 1.5
+for (int ix = 0; ix < ix_upper; ix++) {
+  const T xx = roi_start_w + pw * bin_size_w +
+  static_cast<T>(ix + .5f) * bin_size_w /
+  static_cast<T>(roi_bin_grid_w);
+
+  T x = xx;
+  T y = yy;
+  // deal with: inverse elements are out of feature map boundary
+  if (y < -1.0 || y > height || x < -1.0 || x > width) {
+// empty
+PreCalc<T> pc;
+pc.pos1 = 0;
+pc.pos2 = 0;
+pc.pos3 = 0;
+pc.pos4 = 0;
+pc.w1 = 0;
+pc.w2 = 0;
+pc.w3 = 0;
+pc.w4 = 0;
+pre_calc->at(pre_calc_index) = pc;
+pre_calc_index += 1;
+continue;
+  }
+
+  if (y <= 0) {
+y = 0;
+  }
+  if (x <= 0) {
+x = 0;
+  }
+
+  int y_low = static_cast<int>(y);
+  int x_low = static_cast<int>(x);
+  int y_high;
+  int x_high;
+
+  if (y_low >= height - 1) {
+y_high = y_low = height - 1;
+y = (T)y_low;
+  } else {
+y_high = y_low + 1;
+  }
+
+  if (x_low >= width - 1) {
+x_high = x_low = width - 1;
+x = (T)x_low;
+  } else {
+x_high = x_low + 1;
+  }
+
+  T ly = y - y_low;
+  T lx = x - x_low;
+  T hy = 1. - ly, hx = 1. - lx;
+  T w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx;
+
+  // save weights and indices
+  PreCalc<T> pc;
+  pc.pos1 = y_low * width + x_low;
+  pc.pos2 = y_low * width + x_high;
+  pc.pos3 = y_high * width + x_low;
+  pc.pos4 = y_high * width + x_high;
+  pc.w1 = w1;
+  pc.w2 = w2;
+  pc.w3 = w3;
+  pc.w4 = w4;
+  pre_calc->at(pre_calc_index) = pc;
+
+  pre_calc_index += 1;
+}
+  }
+}
+  }
+}
+
+template <typename T>
+void ROIAlignForward(
+const int nthreads,
+const T* bottom_data,
+const T& spatial_scale,
+const int channels,
+const int height,
+const int width,
+const int pooled_height,
+const int pooled_width,
+const int sampling_ratio,
+const T* bottom_rois,
+int roi_cols,
+T* top_data) {
+  DCHECK(roi_cols == 4 || roi_cols == 5);
+
+  int n_rois = nthreads / channels / pooled_width / pooled_height;
+  // (n, c, ph, pw) is an element in the pooled output
+  // can be parallelized using omp
+  int n;
+#pragma omp parallel for private(n) \
 
 Review comment:
   Thanks for adding omp. Regarding `NUM_OF_ROIS` and `CHANNELS`, I found that 
the latter is usually much larger than the former in ROIPooling, so I applied omp 
only on channels to achieve better application performance. Can you help benchmark 
the performance based on the usually
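
For reference, the per-sample weight computation in the quoted `pre_calc_for_bilinear_interpolate` can be sketched in plain Python as follows. This is an illustrative translation, not code from the PR: the function name `bilinear_weights` and its return layout are assumptions, while the boundary handling follows the quoted C++ step by step.

```python
def bilinear_weights(y, x, height, width):
    """Return (positions, weights) for one sampling point (y, x).

    Illustrative translation of the quoted C++ logic: points well outside
    the feature map get zero weights; otherwise the four neighbouring
    pixels receive bilinear weights that sum to 1.
    """
    if y < -1.0 or y > height or x < -1.0 or x > width:
        return (0, 0, 0, 0), (0.0, 0.0, 0.0, 0.0)

    # Clamp slightly negative coordinates onto the map.
    y = max(y, 0.0)
    x = max(x, 0.0)
    y_low, x_low = int(y), int(x)

    # On the last row/column, both neighbours collapse onto the border.
    if y_low >= height - 1:
        y_high = y_low = height - 1
        y = float(y_low)
    else:
        y_high = y_low + 1
    if x_low >= width - 1:
        x_high = x_low = width - 1
        x = float(x_low)
    else:
        x_high = x_low + 1

    ly, lx = y - y_low, x - x_low
    hy, hx = 1.0 - ly, 1.0 - lx
    positions = (y_low * width + x_low, y_low * width + x_high,
                 y_high * width + x_low, y_high * width + x_high)
    weights = (hy * hx, hy * lx, ly * hx, ly * lx)
    return positions, weights


positions, weights = bilinear_weights(0.5, 1.5, height=4, width=4)
print(positions, weights)
```

Because these positions and weights depend only on the ROI geometry and not on the channel, they can be computed once per ROI and reused for every channel, which is relevant to the question of whether to parallelize over ROIs or over channels.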

[GitHub] xinyu-intel commented on a change in pull request #10852: [MXNET-411] Add ROI Align

2018-05-17 Thread GitBox

xinyu-intel commented on a change in pull request #10852: [MXNET-411] Add ROI 
Align
URL: https://github.com/apache/incubator-mxnet/pull/10852#discussion_r189134358
 
 

 ##
 File path: src/operator/contrib/roi_align.cc
 ##
 @@ -0,0 +1,586 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+/*!
+ * Copyright (c) 2018 by Contributors
+ * \file roi_align.cc
+ * \brief roi align operator
+ * \author Hang Zhang
+ * Adapted from Caffe2
+*/
+#include "./roi_align-inl.h"
+
+
+namespace mxnet {
+namespace op {
+
+template <typename T>
+struct PreCalc {
+  int pos1;
+  int pos2;
+  int pos3;
+  int pos4;
+  T w1;
+  T w2;
+  T w3;
+  T w4;
+};
+
+template <typename T>
+void pre_calc_for_bilinear_interpolate(
+const int height,
+const int width,
+const int pooled_height,
+const int pooled_width,
+const int iy_upper,
+const int ix_upper,
+T roi_start_h,
+T roi_start_w,
+T bin_size_h,
+T bin_size_w,
+int roi_bin_grid_h,
+int roi_bin_grid_w,
+std::vector<PreCalc<T>>* pre_calc) {
+  int pre_calc_index = 0;
+  for (int ph = 0; ph < pooled_height; ph++) {
+for (int pw = 0; pw < pooled_width; pw++) {
+  for (int iy = 0; iy < iy_upper; iy++) {
+const T yy = roi_start_h + ph * bin_size_h +
+static_cast<T>(iy + .5f) * bin_size_h /
+static_cast<T>(roi_bin_grid_h);  // e.g., 0.5, 1.5
+for (int ix = 0; ix < ix_upper; ix++) {
+  const T xx = roi_start_w + pw * bin_size_w +
+  static_cast<T>(ix + .5f) * bin_size_w /
+  static_cast<T>(roi_bin_grid_w);
+
+  T x = xx;
+  T y = yy;
+  // deal with: inverse elements are out of feature map boundary
+  if (y < -1.0 || y > height || x < -1.0 || x > width) {
+// empty
+PreCalc<T> pc;
+pc.pos1 = 0;
+pc.pos2 = 0;
+pc.pos3 = 0;
+pc.pos4 = 0;
+pc.w1 = 0;
+pc.w2 = 0;
+pc.w3 = 0;
+pc.w4 = 0;
+pre_calc->at(pre_calc_index) = pc;
+pre_calc_index += 1;
+continue;
+  }
+
+  if (y <= 0) {
+y = 0;
+  }
+  if (x <= 0) {
+x = 0;
+  }
+
+  int y_low = static_cast<int>(y);
+  int x_low = static_cast<int>(x);
+  int y_high;
+  int x_high;
+
+  if (y_low >= height - 1) {
+y_high = y_low = height - 1;
+y = (T)y_low;
+  } else {
+y_high = y_low + 1;
+  }
+
+  if (x_low >= width - 1) {
+x_high = x_low = width - 1;
+x = (T)x_low;
+  } else {
+x_high = x_low + 1;
+  }
+
+  T ly = y - y_low;
+  T lx = x - x_low;
+  T hy = 1. - ly, hx = 1. - lx;
+  T w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx;
+
+  // save weights and indices
+  PreCalc<T> pc;
+  pc.pos1 = y_low * width + x_low;
+  pc.pos2 = y_low * width + x_high;
+  pc.pos3 = y_high * width + x_low;
+  pc.pos4 = y_high * width + x_high;
+  pc.w1 = w1;
+  pc.w2 = w2;
+  pc.w3 = w3;
+  pc.w4 = w4;
+  pre_calc->at(pre_calc_index) = pc;
+
+  pre_calc_index += 1;
+}
+  }
+}
+  }
+}
+
+template <typename T>
+void ROIAlignForward(
+const int nthreads,
+const T* bottom_data,
+const T& spatial_scale,
+const int channels,
+const int height,
+const int width,
+const int pooled_height,
+const int pooled_width,
+const int sampling_ratio,
+const T* bottom_rois,
+int roi_cols,
+T* top_data) {
+  DCHECK(roi_cols == 4 || roi_cols == 5);
+
+  int n_rois = nthreads / channels / pooled_width / pooled_height;
+  // (n, c, ph, pw) is an element in the pooled output
+  // can be parallelized using omp
+  int n;
+#pragma omp parallel for private(n) \
+num_threads(engine::OpenMP::Get()->GetRecommendedOMPThreadCount())
+  for (n = 0; n < n_rois; n++) {
+int index_n = n * channels * pooled_width * pooled_height;
+
+// roi could have 4 or 5 columns
+const T* offset_bottom_rois = bottom_rois + n * roi_cols;
+int roi_batch_ind =

[GitHub] indhub commented on a change in pull request #10955: [MXNET-422] Distributed training tutorial

2018-05-17 Thread GitBox

indhub commented on a change in pull request #10955: [MXNET-422] Distributed 
training tutorial
URL: https://github.com/apache/incubator-mxnet/pull/10955#discussion_r189132417
 
 

 ##
 File path: docs/tutorials/index.md
 ##
 @@ -38,6 +38,7 @@ Select API:
 * [Visual Question 
Answering](http://gluon.mxnet.io/chapter08_computer-vision/visual-question-answer.html)
 <img src="https://upload.wikimedia.org/wikipedia/commons/6/6a/External_link_font_awesome.svg"
 alt="External link" height="15px" style="margin: 0px 0px 3px 3px;"/>
 * Practitioner Guides
 * [Multi-GPU 
training](http://gluon.mxnet.io/chapter07_distributed-learning/multiple-gpus-gluon.html)
 <img src="https://upload.wikimedia.org/wikipedia/commons/6/6a/External_link_font_awesome.svg"
 alt="External link" height="15px" style="margin: 0px 0px 3px 3px;"/>
+* [Distributed 
Training](https://github.com/apache/incubator-mxnet/tree/master/example/distributed_training)
 
 Review comment:
   I don't think the CI system can currently test distributed training. I think 
you are trying to say that part of the code can be tested. Note that it would only 
test very trivial parts, and those parts would lose syntax highlighting, which 
would spoil the user experience.



[GitHub] indhub commented on a change in pull request #10959: [MXNET-423] Gluon Model Zoo Pre Trained Model tutorial

2018-05-17 Thread GitBox

indhub commented on a change in pull request #10959: [MXNET-423] Gluon Model 
Zoo Pre Trained Model tutorial
URL: https://github.com/apache/incubator-mxnet/pull/10959#discussion_r189128940
 
 

 ##
 File path: docs/tutorials/gluon/pretrained_models.md
 ##
 @@ -0,0 +1,374 @@
+
+# Using pre-trained models in MXNet
+
+In this tutorial we will see how to use multiple pre-trained models with 
Apache MXNet. First, let's download three image classification models from the 
Apache MXNet [Gluon model 
zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html).
+* **DenseNet-121** ([research paper](https://arxiv.org/abs/1608.06993)), 
improved state of the art on [ImageNet 
dataset](http://image-net.org/challenges/LSVRC) in 2016.
+* **MobileNet** ([research paper](https://arxiv.org/abs/1704.04861)), 
MobileNets are based on a streamlined architecture that uses depth-wise 
separable convolutions to build light weight deep neural networks, suitable for 
mobile applications.
+* **ResNet-18** ([research paper](https://arxiv.org/abs/1512.03385v1)), the 
-152 version is the 2015 winner in multiple categories.
+
+Why would you want to try multiple models? Why not just pick the one with the 
best accuracy? As we will see later in the tutorial, even though these models 
have been trained on the same dataset and optimized for maximum accuracy, they 
do behave slightly differently on specific images. In addition, prediction 
speed and memory footprints can vary, and that is an important factor for many 
applications. By trying a few pretrained models, you have an opportunity to 
find a model that can be a good fit for solving your business problem.
+
+
+```python
+import json
+
+import matplotlib.pyplot as plt
+import mxnet as mx
+from mxnet import gluon, nd
+from mxnet.gluon.model_zoo import vision
+import numpy as np
+%matplotlib inline
+```
+
+## Loading the model
+
+The [Gluon Model 
Zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html) 
provides a collection of off-the-shelf models. You can get the ImageNet 
pre-trained model by using `pretrained=True`. 
+If you want to train on your own classification problem from scratch, you can 
get an untrained network with a specific number of classes using the `classes` 
parameter: for example `net = vision.resnet18_v1(classes=10)`. However, note 
that you cannot use the `pretrained` and `classes` parameters at the same time. 
If you want to use pre-trained weights as initialization of your network except 
for the last layer, have a look at the last section of this tutorial.
+
+We can specify the *context* where we want to run the model: the default 
behavior is to use a CPU context. There are two reasons for this:
+* First, this will allow you to test the notebook even if your machine is not 
equipped with a GPU :)
+* Second, we're going to predict a single image and we don't have any specific 
performance requirements. For production applications where you'd want to 
predict large batches of images with the best possible throughput, a GPU could 
definitely be the way to go.
+* If you want to use a GPU, make sure you have pip installed the right version 
of mxnet, or you will get an error when using the `mx.gpu()` context. Refer to 
the [install instructions](http://mxnet.incubator.apache.org/install/index.html)
+
+
+```python
+# We set the context to CPU; you can switch to GPU if you have one and have installed a compatible version of MXNet
+ctx = mx.cpu() 
+```
+
+
+```python
+# We can load the three models
+densenet121 = vision.densenet121(pretrained=True, ctx=ctx)
+mobileNet = vision.mobilenet0_5(pretrained=True, ctx=ctx)
+resnet18 = vision.resnet18_v1(pretrained=True, ctx=ctx)
+```
+
+We can look at the description of the MobileNet network for example, which has 
a relatively simple yet deep architecture
+
+
+```python
+print(mobileNet)
+```
+
+MobileNet(
+  (features): HybridSequential(
+(0): Conv2D(3 -> 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 
1), bias=False)
+(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=16)
+(2): Activation(relu)
+(3): Conv2D(1 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 
1), groups=16, bias=False)
+(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=16)
+(5): Activation(relu)
+(6): Conv2D(16 -> 32, kernel_size=(1, 1), stride=(1, 1), bias=False)
+(7): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=32)
+(8): Activation(relu)
+(9): Conv2D(1 -> 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 
1), groups=32, bias=False)
+(10): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=32)
+(11): Activation(relu)
+(12): Conv2D(32 -> 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
+

[GitHub] yzhliu commented on a change in pull request #10787: [MXNET-357] New Scala API Design (NDArray)

2018-05-17 Thread GitBox

yzhliu commented on a change in pull request #10787: [MXNET-357] New Scala API 
Design (NDArray)
URL: https://github.com/apache/incubator-mxnet/pull/10787#discussion_r189128659
 
 

 ##
 File path: 
scala-package/macros/src/main/scala/org/apache/mxnet/NDArrayMacro.scala
 ##
 @@ -102,20 +178,80 @@ private[mxnet] object NDArrayMacro {
 result
   }
 
+
+  // Convert C++ Types to Scala Types
+  private def typeConversion(in : String, argType : String = "") : String = {
 
 Review comment:
   Can we share these functions with those in SymbolMacro?



[GitHub] piiswrong commented on issue #10989: [WIP] add gluon model summary

2018-05-17 Thread GitBox

piiswrong commented on issue #10989: [WIP] add gluon model summary
URL: https://github.com/apache/incubator-mxnet/pull/10989#issuecomment-390040412
 
 
   Need warning for hybridize. otherwise LGTM



[GitHub] indhub commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-17 Thread GitBox

indhub commented on a change in pull request #10900: [MXNET-414] Tutorial on 
visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r189123063
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@
+# Visualizing Decisions of Convolutional Neural Networks
+
+Convolutional Neural Networks have made a lot of progress in Computer Vision. 
Their accuracy is as good as humans in some tasks. However it remains hard to 
explain the predictions of convolutional neural networks.
+
+It is often helpful to be able to explain why a model made the prediction it 
made. For example when a model misclassifies an image, it is hard to say why 
without visualizing the network's decision.
+
+<img src="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/example/cnn_visualization/volcano_barn_spider.png"
 alt="Explaining the misclassification of volcano as spider" width=500px/>
+
+Visualizations also help build confidence about the predictions of a model. 
For example, even if a model correctly predicts birds as birds, we would want 
to confirm that the model bases its decision on the features of bird and not on 
the features of some other object that might occur together with birds in the 
dataset (like leaves).
+
+In this tutorial, we show how to visualize the predictions made by 
convolutional neural networks using Gradient-weighted Class Activation Mapping. 
Unlike many other visualization methods, Grad-CAM can be used on a wide variety 
of CNN model families - CNNs with fully connected layers, CNNs used for 
structured outputs (e.g. captioning), CNNs used in tasks with multi-modal input 
(e.g. VQA) or reinforcement learning without architectural changes or 
re-training.
+
+In the rest of this notebook, we will explain how to visualize predictions 
made by [VGG-16](https://arxiv.org/abs/1409.1556). We begin by importing the 
required dependencies. The `gradcam` module contains the implementation of the 
visualization techniques used in this notebook.
+
+```python
+from __future__ import print_function
+
+import mxnet as mx
+from mxnet import gluon
+
+from matplotlib import pyplot as plt
+import numpy as np
+import cv2
+
+gradcam_file = "gradcam.py" 
+base_url = "https://raw.githubusercontent.com/indhub/mxnet/cnnviz/example/cnn_visualization/{}?raw=true"
+mx.test_utils.download(base_url.format(gradcam_file), fname=gradcam_file)
+import gradcam
+```
+
+## Building the network to visualize
+
+Next, we build the network we want to visualize. For this example, we will use 
the [VGG-16](https://arxiv.org/abs/1409.1556) network. This code was taken from 
the Gluon [model 
zoo](https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/gluon/model_zoo/vision/alexnet.py)
 and refactored to make it easy to switch between `gradcam`'s and Gluon's 
implementation of ReLU and Conv2D. Same code can be used for both training and 
visualization with a minor (one line) change.
+
+Notice that we import ReLU and Conv2D from `gradcam` module instead of 
mxnet.gluon.nn.
+- We use a modified ReLU because we use guided backpropagation for 
visualization and guided backprop requires ReLU layer to block the backward 
flow of negative gradients corresponding to the neurons which decrease the 
activation of the higher layer unit we aim to visualize. Check 
[this](https://arxiv.org/abs/1412.6806) paper to learn more about guided 
backprop.
+- We use a modified Conv2D (a wrapper on top of Gluon's Conv2D) because we 
want to capture the output of a given convolutional layer and its gradients. 
This is needed to implement Grad-CAM. Check 
[this](https://arxiv.org/abs/1610.02391) paper to learn more about Grad-CAM.
+
+When you train the network, you could just import `Activation` and `Conv2D` 
from `gluon.nn` instead. No other part of the code needs any change to switch 
between training and visualization.
+
+```python
+import os
+from mxnet.gluon.model_zoo import model_store
+
+from mxnet.initializer import Xavier
+from mxnet.gluon.nn import MaxPool2D, Flatten, Dense, Dropout, BatchNorm
+from gradcam import Activation, Conv2D
+
+class VGG(mx.gluon.HybridBlock):
+    def __init__(self, layers, filters, classes=1000, batch_norm=False, **kwargs):
+        super(VGG, self).__init__(**kwargs)
+        assert len(layers) == len(filters)
+        with self.name_scope():
+            self.features = self._make_features(layers, filters, batch_norm)
+            self.features.add(Dense(4096, activation='relu',
+                                    weight_initializer='normal',
+                                    bias_initializer='zeros'))
+            self.features.add(Dropout(rate=0.5))
+            self.features.add(Dense(4096, activation='relu',
+                                    weight_initializer='normal',
+                                    bias_initializer='zeros'))
+            self.features.add(Dropout(rate=0.5))
+            self.output

[GitHub] indhub commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-17 Thread GitBox

indhub commented on a change in pull request #10900: [MXNET-414] Tutorial on 
visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r189122590
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@
+# Visualizing Decisions of Convolutional Neural Networks
+
+Convolutional Neural Networks have made a lot of progress in Computer Vision. 
Their accuracy matches that of humans on some tasks. However, it remains hard to 
explain the predictions of convolutional neural networks.
+
+It is often helpful to be able to explain why a model made the prediction it 
made. For example, when a model misclassifies an image, it is hard to say why 
without visualizing the network's decision.
+
+<img src="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/example/cnn_visualization/volcano_barn_spider.png" alt="Explaining the misclassification of volcano as spider" width=500px/>
+
+Visualizations also help build confidence about the predictions of a model. 
For example, even if a model correctly predicts birds as birds, we would want 
to confirm that the model bases its decision on the features of the bird and not on 
the features of some other object that might occur together with birds in the 
dataset (like leaves).
+
+In this tutorial, we show how to visualize the predictions made by 
convolutional neural networks using Gradient-weighted Class Activation Mapping 
(Grad-CAM). Unlike many other visualization methods, Grad-CAM can be used on a wide 
variety of CNN model families - CNNs with fully connected layers, CNNs used for 
structured outputs (e.g. captioning), CNNs used in tasks with multi-modal input 
(e.g. VQA) or reinforcement learning - without architectural changes or 
re-training.
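
To make the idea concrete, here is a minimal sketch of the Grad-CAM computation in plain Python. It is an illustration only, not the code in the `gradcam` module: the function name `grad_cam` and the toy inputs are assumptions, and nested lists stand in for NDArrays. Each channel weight is the spatial mean of the gradient of the class score with respect to that channel, and the heatmap is the ReLU of the weighted sum of the channel activations.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap from K feature maps of size H x W.

    activations[k][i][j]: output of the chosen conv layer (channel k).
    gradients[k][i][j]:   gradient of the class score w.r.t. that output.
    """
    height = len(activations[0])
    width = len(activations[0][0])

    # Channel weight = global average of the gradient over spatial positions.
    weights = [sum(sum(row) for row in grad) / float(height * width)
               for grad in gradients]

    # Weighted sum over channels, followed by ReLU to keep positive evidence.
    cam = [[0.0] * width for _ in range(height)]
    for w_k, act in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                cam[i][j] += w_k * act[i][j]
    return [[max(0.0, v) for v in row] for row in cam]


# Toy example (hypothetical values): two 2x2 channels.
acts = [[[1.0, 2.0], [3.0, 4.0]],
        [[1.0, 1.0], [1.0, 1.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]],        # mean gradient  1.0
         [[-2.0, -2.0], [-2.0, -2.0]]]    # mean gradient -2.0
heatmap = grad_cam(acts, grads)
```

In practice the heatmap would then be resized to the input image size and overlaid on the image.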
+
+In the rest of this notebook, we will explain how to visualize predictions 
made by [VGG-16](https://arxiv.org/abs/1409.1556). We begin by importing the 
required dependencies. The `gradcam` module contains the implementation of 
the visualization techniques used in this notebook.
+
+```python
+from __future__ import print_function
+
+import mxnet as mx
+from mxnet import gluon
+
+from matplotlib import pyplot as plt
+import numpy as np
+import cv2
+
+gradcam_file = "gradcam.py" 
+base_url = "https://raw.githubusercontent.com/indhub/mxnet/cnnviz/example/cnn_visualization/{}?raw=true"
+mx.test_utils.download(base_url.format(gradcam_file), fname=gradcam_file)
+import gradcam
+```
+
+## Building the network to visualize
+
+Next, we build the network we want to visualize. For this example, we will use 
the [VGG-16](https://arxiv.org/abs/1409.1556) network. This code was taken from 
the Gluon [model 
zoo](https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/gluon/model_zoo/vision/alexnet.py)
 and refactored to make it easy to switch between `gradcam`'s and Gluon's 
implementations of ReLU and Conv2D. The same code can be used for both training and 
visualization with a minor (one-line) change.
+
+Notice that we import `Activation` and `Conv2D` from the `gradcam` module instead of 
`mxnet.gluon.nn`.
+- We use a modified ReLU because we use guided backpropagation for 
visualization, and guided backprop requires the ReLU layer to block the backward 
flow of negative gradients, which correspond to the neurons that decrease the 
activation of the higher-layer unit we aim to visualize. See 
[this](https://arxiv.org/abs/1412.6806) paper to learn more about guided 
backprop.
+- We use a modified Conv2D (a wrapper on top of Gluon's Conv2D) because we 
want to capture the output of a given convolutional layer and its gradients. 
This is needed to implement Grad-CAM. See 
[this](https://arxiv.org/abs/1610.02391) paper to learn more about Grad-CAM.
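
The guided backprop rule described in the first bullet can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the `gradcam` module's implementation: the function name `guided_relu_backward` and the sample values are hypothetical, and flat lists stand in for NDArrays.

```python
def guided_relu_backward(inputs, upstream_grads):
    """Backward pass of ReLU under guided backpropagation.

    A gradient is passed through only where the forward input was positive
    (the ordinary ReLU rule) AND the incoming gradient is positive (the
    additional guided-backprop rule).
    """
    return [g if (x > 0 and g > 0) else 0.0
            for x, g in zip(inputs, upstream_grads)]


forward_in = [1.5, -0.5, 2.0, 3.0]   # inputs seen by ReLU in the forward pass
grad_out = [0.7, 0.9, -0.4, 0.2]     # gradients arriving from the layer above
grad_in = guided_relu_backward(forward_in, grad_out)
```

The second entry is blocked because the forward input was negative; the third is blocked because the incoming gradient is negative.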
+
+When you train the network, you could just import `Activation` and `Conv2D` 
from `gluon.nn` instead. No other part of the code needs any change to switch 
between training and visualization.
+
+```python
+import os
+from mxnet.gluon.model_zoo import model_store
+
+from mxnet.initializer import Xavier
+from mxnet.gluon.nn import MaxPool2D, Flatten, Dense, Dropout, BatchNorm
+from gradcam import Activation, Conv2D
+
+class VGG(mx.gluon.HybridBlock):
+    def __init__(self, layers, filters, classes=1000, batch_norm=False, **kwargs):
+        super(VGG, self).__init__(**kwargs)
+        assert len(layers) == len(filters)
+        with self.name_scope():
+            self.features = self._make_features(layers, filters, batch_norm)
+            self.features.add(Dense(4096, activation='relu',
+                                    weight_initializer='normal',
+                                    bias_initializer='zeros'))
+            self.features.add(Dropout(rate=0.5))
+            self.features.add(Dense(4096, activation='relu',
+                                    weight_initializer='normal',
+                                    bias_initializer='zeros'))
+            self.features.add(Dropout(rate=0.5))
+            self.output

[GitHub] indhub commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-17 Thread GitBox

indhub commented on a change in pull request #10900: [MXNET-414] Tutorial on 
visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r189122599
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@

[GitHub] indhub commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-17 Thread GitBox

indhub commented on a change in pull request #10900: [MXNET-414] Tutorial on 
visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r189122530
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@

[GitHub] indhub commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-17 Thread GitBox

indhub commented on a change in pull request #10900: [MXNET-414] Tutorial on
visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r189122496

##
File path: docs/tutorials/vision/cnn_visualization.md
##
@@ -0,0 +1,250 @@
+In the rest of this notebook, we will explain how to visualize predictions
made by [VGG-16](https://arxiv.org/abs/1409.1556). We begin by importing the
required dependencies. The `gradcam` module contains the implementation of
the visualization techniques used in this notebook.

Review comment:
VGG is used for a couple of reasons:
1. ResNet is too large to include in a notebook without hurting readability.
2. VGG-16 is the network used in the original paper.

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services

[GitHub] indhub commented on a change in pull request #10900: [MXNET-414] Tutorial on visualizing CNN decisions using Grad-CAM

2018-05-17 Thread GitBox

indhub commented on a change in pull request #10900: [MXNET-414] Tutorial on 
visualizing CNN decisions using Grad-CAM
URL: https://github.com/apache/incubator-mxnet/pull/10900#discussion_r189122505
 
 

 ##
 File path: docs/tutorials/vision/cnn_visualization.md
 ##
 @@ -0,0 +1,250 @@
+```python
+from __future__ import print_function
+
+import mxnet as mx
+from mxnet import gluon
+
+from matplotlib import pyplot as plt
+import numpy as np
+import cv2
+
+gradcam_file = "gradcam.py" 
+base_url = "https://raw.githubusercontent.com/indhub/mxnet/cnnviz/example/cnn_visualization/{}?raw=true"
 
 Review comment:
   Will remember to switch.



[GitHub] zhanghang1989 commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-17 Thread GitBox

zhanghang1989 commented on a change in pull request #10536: [MXNET-317] Add 
Data Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189101927
 
 

 ##
 File path: docs/api/python/gluon/contrib.md
 ##
 @@ -80,6 +80,20 @@ In the rest of this document, we list routines provided by 
the `gluon.contrib` p
 WikiText103
 ```
 
+ Parallel
+
+```eval_rst
+.. currentmodule:: mxnet.gluon.parallel
+
+.. autosummary::
+:nosignatures:
+
+DataParallelModel
+DataParallelCriterion
 
 Review comment:
   This makes it easier for users, because the situation is complicated for 
networks with multiple outputs. We handle that case internally.
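
   As a rough illustration of what "handling it internally" could look like (a hypothetical sketch, not the code in this PR): the helper `apply_criterion` and the loss `toy_loss` below are made-up names, and plain floats stand in for NDArrays. The point is only that a single-output network and a multi-output network can share one code path per device.

```python
def apply_criterion(loss_fn, outputs_per_device, targets_per_device):
    """Apply loss_fn on each device, whether the network returns one output or several."""
    losses = []
    for outputs, target in zip(outputs_per_device, targets_per_device):
        # Normalize a single output to a 1-tuple so both cases share one path.
        if not isinstance(outputs, (list, tuple)):
            outputs = (outputs,)
        losses.append(loss_fn(*(tuple(outputs) + (target,))))
    return losses


def toy_loss(*args):
    """Hypothetical loss: sum of absolute errors of every prediction against the target."""
    preds, target = args[:-1], args[-1]
    return sum(abs(p - target) for p in preds)


# One output per device vs. two outputs per device (two devices in both cases).
single = apply_criterion(toy_loss, [1.0, 4.0], [2.0, 2.0])
multi = apply_criterion(toy_loss, [(1.0, 3.0), (4.0, 2.0)], [2.0, 2.0])
```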



[GitHub] zhanghang1989 commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-17 Thread GitBox

zhanghang1989 commented on a change in pull request #10536: [MXNET-317] Add 
Data Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189112846
 
 

 ##
 File path: python/mxnet/gluon/contrib/parallel.py
 ##
 @@ -0,0 +1,343 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=broad-except, redefined-builtin
+"""Synchronized DataParallel"""
+import threading
+from ... import autograd
+from ...ndarray import NDArray
+from ..utils import split_and_load
+
+__all__ = ['DataParallelModel', 'DataParallelCriterion', 'Barrier']
+
+
+class Barrier(object):
+    """Shared NDArray for cross device operation.
+
+    A cross device operation that allows synchronized push and pull. It can be used in
+    Cross-gpu Synchronized Batch Normalization and Sparse Blocks.
+
+    Parameters
+    ----------
+    counter : int
+        Number of devices.
+    operation : callable
+        The cross device operation to apply (e.g. AllReduce).
+    """
+    def __init__(self, counter, operation):
+        self.mutex = threading.Lock()
+        self.all_tasks_done = threading.Condition(self.mutex)
+        self.counter = counter
+        self.op = operation
+        self._clear()
+
+    def push(self, x):
+        """Push an NDArray from one of the devices.
+        Input:
+            x (NDArray)
+
+        Output:
+            idx (int), the output index
+        """
+        with self.mutex:
+            if self.push_tasks == 0:
+                self._clear()
+            self.list.append(x)
+            idx = len(self.list) - 1
+            self.push_tasks -= 1
+
+        with self.all_tasks_done:
+            if self.push_tasks == 0:
+                self.all_tasks_done.notify_all()
+            while self.push_tasks:
 
 Review comment:
   This avoids executing `self.op` before all the data is ready; see the 
`_sync_op()` function.
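
   The waiting pattern can be shown with a simplified, standalone sketch. This is not the `Barrier` class from the PR: `ToyBarrier`, `push_pull`, and the round counter are assumptions made for illustration, and push and pull are merged into one call. The last thread to arrive runs the operation; every other thread waits on the condition variable until that round is finished, so the operation never runs on partial input.

```python
import threading


class ToyBarrier(object):
    """Toy cross-thread reduce: each thread contributes a value and gets the result."""

    def __init__(self, counter, operation):
        self.counter = counter          # number of participating threads
        self.op = operation             # called once per round on all values
        self.cond = threading.Condition()
        self.values = []
        self.result = None
        self.round = 0

    def push_pull(self, x):
        with self.cond:
            my_round = self.round
            self.values.append(x)
            if len(self.values) == self.counter:
                # Last arrival: all inputs are present, so it is safe to reduce.
                self.result = self.op(self.values)
                self.values = []
                self.round += 1
                self.cond.notify_all()
            else:
                # Earlier arrivals wait; the loop guards against spurious wakeups.
                while self.round == my_round:
                    self.cond.wait()
            return self.result


barrier = ToyBarrier(counter=4, operation=sum)
results = [None] * 4


def worker(i):
    results[i] = barrier.push_pull(i + 1)


threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

   Every worker ends up with the same reduced value, regardless of the order in which the threads arrive.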



[GitHub] altosaar commented on issue #10990: Gluon hybridize fails to detect an input

2018-05-17 Thread GitBox

altosaar commented on issue #10990: Gluon hybridize fails to detect an input
URL: 
https://github.com/apache/incubator-mxnet/issues/10990#issuecomment-390024658
 
 
   Yup, this is with the latest nightly build.



[GitHub] eric-haibin-lin commented on issue #10990: Gluon hybridize fails to detect an input

2018-05-17 Thread GitBox

eric-haibin-lin commented on issue #10990: Gluon hybridize fails to detect an 
input
URL: 
https://github.com/apache/incubator-mxnet/issues/10990#issuecomment-390018500
 
 
   What's your mxnet version? Is it still happening with the pre-release nightly 
build?



[GitHub] eric-haibin-lin commented on issue #10990: Gluon hybridize fails to detect an input

2018-05-17 Thread GitBox

eric-haibin-lin commented on issue #10990: Gluon hybridize fails to detect an 
input
URL: 
https://github.com/apache/incubator-mxnet/issues/10990#issuecomment-390018500
 
 
   I saw a similar error a long time ago. What's your mxnet version? 
Is it still happening with the pre-release nightly build?



[GitHub] zhanghang1989 commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-17 Thread GitBox

zhanghang1989 commented on a change in pull request #10536: [MXNET-317] Add 
Data Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189106867
 
 

 ##
 File path: python/mxnet/gluon/contrib/parallel.py
 ##
 @@ -0,0 +1,343 @@
+def push(self, x):
 
 Review comment:
   There was a case in the 'Sparse' application that pulls only once, from the 
master thread. We can change it to a single function.



[GitHub] zhanghang1989 commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-17 Thread GitBox

zhanghang1989 commented on a change in pull request #10536: [MXNET-317] Add 
Data Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189106577
 
 

 ##
 File path: python/mxnet/gluon/contrib/parallel.py
 ##
 @@ -0,0 +1,343 @@
+    def push(self, x):
+        """Push an NDArray from one of the devices.
+        Input:
+            x (NDArray)
+
+        Output:
+            idx (int), the output index
+        """
+        with self.mutex:
+            if self.push_tasks == 0:
+                self._clear()
+            self.list.append(x)
+            idx = len(self.list) - 1
+            self.push_tasks -= 1
+
+        with self.all_tasks_done:
+            if self.push_tasks == 0:
+                self.all_tasks_done.notify_all()
+            while self.push_tasks:
+                self.all_tasks_done.wait()
+
+        self._sync_op()
+        return idx
+
+    def pull(self, idx):
+        """Pull the output to each device
+        Input:
+            idx (int)
+
+        Output:
+            out (NDArray)
+        """
+        return self.out[idx]
+
+    def _sync_op(self):
+        with self.mutex:
+            if self.reduce_tasks == 1:
+                assert(len(self.list) == self.counter)
+                self.out = self.op(*self.list)
+                if isinstance(self.out, (list, tuple)):
+                    for xi in self.out:
+                        xi.wait_to_read()
+                else:
+                    self.out.wait_to_read()
+                self.reduce_tasks -= 1
+            else:
+                self.reduce_tasks -= 1
+
+        with self.all_tasks_done:
+            if self.reduce_tasks == 0:
+                self.all_tasks_done.notify_all()
+            while self.reduce_tasks:
+                self.all_tasks_done.wait()
+
+    def _clear(self):
+        self.list = []
+        self.push_tasks = self.counter
+        self.reduce_tasks = self.counter
+
+    def __len__(self):
+        return len(self.list)
+
+    def __repr__(self):
+        return 'ParallelState'
+
+
+class DataParallelModel(object):
+    """Data parallelism
+
+    Hide the difference between single and multiple GPUs from the user.
+    Inputs and outputs are both lists of NDArrays in different contexts.
+    In the forward pass, the module is replicated on each device,
+    and each replica handles a portion of the input. During the backward
+    pass, gradients from each replica are summed into the original module.
+
+    Parameters
+    ----------
+    module : object
+        Network to be parallelized.
+    ctx_list : list
+        A list of contexts
+    sync : bool
+        enable synchronization (default: False).
+
+
+    Inputs:
+        - **inputs**: list of input (NDArrays)
+
+    Outputs:
+        - **outputs**: list of output (NDArrays)
+
+    Example::
+        >>> ctx = [mx.gpu(0), mx.gpu(1)]
+        >>> net = DataParallelModel(model, ctx_list=ctx)
+        >>> y = net(x)
+    """
+    def __init__(self, module, ctx_list=None, sync=False):
+        module.collect_params().reset_ctx(ctx=ctx_list)
+        self.ctx_list = ctx_list
+        self.module = module
+        self.sync = sync
+
+    def __call__(self, *inputs, **kwargs):
+        if not self.ctx_list:
+

[GitHub] zhanghang1989 commented on a change in pull request #10852: [MXNET-411] Add ROI Align

2018-05-17 Thread GitBox

zhanghang1989 commented on a change in pull request #10852: [MXNET-411] Add ROI 
Align
URL: https://github.com/apache/incubator-mxnet/pull/10852#discussion_r189103423
 
 

 ##
 File path: src/operator/contrib/roi_align-inl.h
 ##
 @@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+/*!
+ * Copyright (c) 2018 by Contributors
+ * \file roi_align-inl.h
+ * \brief roi align operator and symbol
+ * \author Hang Zhang
+ * modified from Caffe2
+*/
+#ifndef MXNET_OPERATOR_CONTRIB_ROI_ALIGN_INL_H_
+#define MXNET_OPERATOR_CONTRIB_ROI_ALIGN_INL_H_
+
+#include 
+#include 
+#include "../mshadow_op.h"
+#include "../tensor/init_op.h"
+
+
+namespace mxnet {
+namespace op {
+
+
+// Declare enumeration of input order to make code more intuitive.
+// These enums are only visible within this header
+namespace roialign {
+enum ROIAlignOpInputs {kData, kBox};
+enum ROIAlignOpOutputs {kOut};
+}  // roialign
+
+
+struct ROIAlignParam : public dmlc::Parameter {
+  TShape pooled_size;
+  float spatial_scale;
+  DMLC_DECLARE_PARAMETER(ROIAlignParam) {
+DMLC_DECLARE_FIELD(pooled_size)
+.set_expect_ndim(2).enforce_nonzero()
+.describe("fix pooled size: (h, w)");
 
 Review comment:
   The output roi feature sizes. Name is compatible with ROIPooling


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] zhanghang1989 commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-17 Thread GitBox

zhanghang1989 commented on a change in pull request #10536: [MXNET-317] Add 
Data Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189102855
 
 

 ##
 File path: python/mxnet/gluon/contrib/parallel.py
 ##
 @@ -0,0 +1,343 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=broad-except, redefined-builtin
+"""Synchronized DataParallel"""
+import threading
+from ... import autograd
+from ...ndarray import NDArray
+from ..utils import split_and_load
+
+__all__ = ['DataParallelModel', 'DataParallelCriterion', 'Barrier']
+
+
+class Barrier(object):
+"""Shared NDArray for cross device operation.
+
+A cross device operation that allows synchronized push and pull. It can be used in
+Cross-gpu Synchronized Batch Normalization and Sparse Blocks.
+
+Parameters
+----------
+counter : int
+Number of devices.
+operation : callable
+The cross device operation to apply (e.g. AllReduce).
+"""
+def __init__(self, counter, operation):
+self.mutex = threading.Lock()
+self.all_tasks_done = threading.Condition(self.mutex)
+self.counter = counter
+self.op = operation
+self._clear()
+
+def push(self, x):
+"""Push a NDArray from one of the device.
+Input:
+x (NDArray)
+
+Output:
+idx (int), the output index
+"""
+with self.mutex:
+if self.push_tasks == 0:
+self._clear()
+self.list.append(x)
+idx = len(self.list) - 1
+self.push_tasks -= 1
+
+with self.all_tasks_done:
+if self.push_tasks == 0:
+self.all_tasks_done.notify_all()
+while self.push_tasks:
+self.all_tasks_done.wait()
+
+self._sync_op()
+return idx
+
+def pull(self, idx):
+"""Pull the output to each device
+Input:
+idx (int)
+
+Output:
+out (NDArray)
+"""
+return self.out[idx]
+
+def _sync_op(self):
+with self.mutex:
+if self.reduce_tasks == 1:
+assert(len(self.list) == self.counter)
+self.out = self.op(*self.list)
+if isinstance(self.out, (list, tuple)):
+for xi in self.out:
+xi.wait_to_read()
 
 Review comment:
   If we don't wait, the training of BN fails. 
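
For readers following the synchronization logic in the quoted diff, here is a minimal sketch of the same rendezvous pattern in plain Python, without MXNet. `SumBarrier` and `run_workers` are hypothetical names, and the built-in `sum` stands in for the cross device operation. The sketch does not reproduce `wait_to_read()`, which only applies to MXNet's asynchronous NDArrays.

```python
import threading


class SumBarrier:
    """Hypothetical, simplified stand-in for the Barrier in the diff above."""

    def __init__(self, counter):
        self.counter = counter              # number of participating workers
        self.cond = threading.Condition()
        self.values = []                    # values pushed in the current round
        self.results = {}                   # round number -> reduced value
        self.round = 0

    def push(self, x):
        """Block until every worker has pushed, then return the reduced value."""
        with self.cond:
            my_round = self.round
            self.values.append(x)
            if len(self.values) == self.counter:
                # The last arrival reduces, starts a new round and wakes the rest.
                self.results[my_round] = sum(self.values)
                self.values = []
                self.round += 1
                self.cond.notify_all()
            else:
                # The loop guards against spurious wake-ups.
                while self.round == my_round:
                    self.cond.wait()
            return self.results[my_round]


def run_workers(n):
    """Start n threads that each push i + 1 and record what they get back."""
    barrier = SumBarrier(n)
    outputs = [None] * n

    def worker(i):
        outputs[i] = barrier.push(i + 1)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return outputs
```

In this sketch each call to `push` returns only after all `n` workers have contributed, so every worker sees the same reduced value.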



[GitHub] zhanghang1989 commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-17 Thread GitBox

zhanghang1989 commented on a change in pull request #10536: [MXNET-317] Add 
Data Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189102539
 
 

 ##
 File path: python/mxnet/gluon/contrib/parallel.py
 ##
 @@ -0,0 +1,343 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=broad-except, redefined-builtin
+"""Synchronized DataParallel"""
+import threading
+from ... import autograd
+from ...ndarray import NDArray
+from ..utils import split_and_load
+
+__all__ = ['DataParallelModel', 'DataParallelCriterion', 'Barrier']
+
+
+class Barrier(object):
+"""Shared NDArray for cross device operation.
+
+A cross device operation that allows synchronized push and pull. It can be used in
+Cross-gpu Synchronized Batch Normalization and Sparse Blocks.
+
+Parameters
+----------
+counter : int
+Number of devices.
+operation : callable
+The cross device operation to apply (e.g. AllReduce).
+"""
+def __init__(self, counter, operation):
+self.mutex = threading.Lock()
+self.all_tasks_done = threading.Condition(self.mutex)
+self.counter = counter
+self.op = operation
+self._clear()
+
+def push(self, x):
+"""Push a NDArray from one of the device.
+Input:
+x (NDArray)
+
+Output:
+idx (int), the output index
+"""
+with self.mutex:
+if self.push_tasks == 0:
+self._clear()
+self.list.append(x)
+idx = len(self.list) - 1
+self.push_tasks -= 1
+
+with self.all_tasks_done:
+if self.push_tasks == 0:
+self.all_tasks_done.notify_all()
+while self.push_tasks:
+self.all_tasks_done.wait()
+
+self._sync_op()
+return idx
+
+def pull(self, idx):
+"""Pull the output to each device
+Input:
+idx (int)
+
+Output:
+out (NDArray)
+"""
+return self.out[idx]
+
+def _sync_op(self):
+with self.mutex:
+if self.reduce_tasks == 1:
+assert(len(self.list) == self.counter)
+self.out = self.op(*self.list)
+if isinstance(self.out, (list, tuple)):
+for xi in self.out:
+xi.wait_to_read()
+else:
+self.out.wait_to_read()
+self.reduce_tasks -= 1
+else:
+self.reduce_tasks -= 1
+
+with self.all_tasks_done:
+if self.reduce_tasks == 0:
+self.all_tasks_done.notify_all()
+while self.reduce_tasks:
+self.all_tasks_done.wait()
+
+def _clear(self):
+self.list = []
+self.push_tasks = self.counter
+self.reduce_tasks = self.counter
+
+def __len__(self):
 
 Review comment:
   We may want to get the number of devices from outside the class. For example, 
to calculate the global mean we divide the global sum by the global number of 
elements (local number of elements * N devices).
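
The arithmetic described in this comment can be sketched as follows. This is only an illustration: `global_mean` is a hypothetical helper, plain floats stand in for NDArrays, and `len(local_sums)` plays the role that `len(barrier)` would play for the number of devices.

```python
def global_mean(local_sums, local_count):
    """Combine per-device partial sums into one mean.

    local_sums  -- one partial sum per device
    local_count -- number of elements each device reduced (assumed equal
                   on every device)
    """
    num_devices = len(local_sums)
    global_sum = sum(local_sums)
    # Global number of elements = local number of elements * N devices.
    return global_sum / (local_count * num_devices)
```

For two devices that each summed four elements, `global_mean([6.0, 10.0], 4)` divides 16.0 by 8.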



[GitHub] zhanghang1989 commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-17 Thread GitBox

zhanghang1989 commented on a change in pull request #10536: [MXNET-317] Add 
Data Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189101927
 
 

 ##
 File path: docs/api/python/gluon/contrib.md
 ##
 @@ -80,6 +80,20 @@ In the rest of this document, we list routines provided by the `gluon.contrib` p
 WikiText103
 ```
 
+ Parallel
+
+```eval_rst
+.. currentmodule:: mxnet.gluon.parallel
+
+.. autosummary::
+:nosignatures:
+
+DataParallelModel
+DataParallelCriterion
 
 Review comment:
   This makes it easier for users, because handling networks with multiple 
outputs is complicated. We just handle that situation internally.
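
As a rough illustration of what handling multiple outputs internally could look like, here is a sketch in plain Python. It is not the implementation from the pull request: `apply_criterion` is a hypothetical helper, and the assumption is that a network returns either a single value or a tuple of values per device.

```python
def apply_criterion(criterion, outputs, targets):
    """Apply `criterion` to each device's output and its matching target.

    A per-device output that is a list or tuple is unpacked, so the
    criterion receives every network output as a separate argument,
    followed by the target.
    """
    losses = []
    for out, target in zip(outputs, targets):
        if isinstance(out, (list, tuple)):
            losses.append(criterion(*out, target))
        else:
            losses.append(criterion(out, target))
    return losses
```

The caller passes one output and one target per device and does not need to know whether the network has one output or several.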



[incubator-mxnet] branch master updated: Fix optimizer pickle (#10983)

2018-05-17 Thread haibin

This is an automated email from the ASF dual-hosted git repository.

haibin pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new 8937488  Fix optimizer pickle (#10983)
8937488 is described below

commit 8937488b5bdec9111d06498f5e3aeb1224003c94
Author: Haibin Lin 
AuthorDate: Thu May 17 14:07:56 2018 -0700

Fix optimizer pickle (#10983)
---
 python/mxnet/gluon/trainer.py   |  2 ++
 python/mxnet/optimizer.py   | 11 +++
 tests/python/unittest/test_gluon.py | 20 ++--
 3 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/python/mxnet/gluon/trainer.py b/python/mxnet/gluon/trainer.py
index 39c4a1f..f285b91 100644
--- a/python/mxnet/gluon/trainer.py
+++ b/python/mxnet/gluon/trainer.py
@@ -317,6 +317,8 @@ class Trainer(object):
         if self._update_on_kvstore:
             self._kvstore.load_optimizer_states(fname)
             self._optimizer = self._kvstore._updater.optimizer
+            param_dict = {i: param for i, param in enumerate(self._params)}
+            self._optimizer.param_dict = param_dict
         else:
             with open(fname, 'rb') as f:
                 states = f.read()
diff --git a/python/mxnet/optimizer.py b/python/mxnet/optimizer.py
index 1d2fd2e..0c3fc90 100644
--- a/python/mxnet/optimizer.py
+++ b/python/mxnet/optimizer.py
@@ -426,6 +426,17 @@ class Optimizer(object):
             wd *= self.wd_mult.get(self.idx2name[index], 1.0)
         return wd
 
+    def __getstate__(self):
+        ret = self.__dict__.copy()
+        # do not include param_dict in the state
+        del ret['param_dict']
+        return ret
+
+    def __setstate__(self, state):
+        self.__dict__ = state
+        # param_dict needs to be explicitly set by the trainer
+        self.param_dict = {}
+
 # convenience wrapper for Optimizer.Register
 register = Optimizer.register   # pylint: disable=invalid-name
 
diff --git a/tests/python/unittest/test_gluon.py b/tests/python/unittest/test_gluon.py
index e302674..035c713 100644
--- a/tests/python/unittest/test_gluon.py
+++ b/tests/python/unittest/test_gluon.py
@@ -524,10 +524,10 @@ def test_trainer():
 
     assert (x.data(mx.cpu(1)).asnumpy() == -4).all()
 
-    trainer.save_states('test.states')
+    trainer.save_states('test_trainer.states')
     states = deepcopy(trainer._kvstore._updater.states) if trainer._update_on_kvstore \
              else deepcopy(trainer._updaters[0].states)
-    trainer.load_states('test.states')
+    trainer.load_states('test_trainer.states')
     if trainer._update_on_kvstore:
         dict_equ(trainer._kvstore._updater.states, states)
         assert trainer._optimizer == trainer._kvstore._updater.optimizer
@@ -553,6 +553,22 @@ def test_trainer():
 
     assert (x.data(mx.cpu(1)).asnumpy() == -1).all(), x.data(mx.cpu(1)).asnumpy()
 
+@with_seed()
+def test_trainer_save_load():
+    x = gluon.Parameter('x', shape=(10,), lr_mult=1.0)
+    x.initialize(ctx=[mx.cpu(0), mx.cpu(1)], init='zeros')
+    trainer = gluon.Trainer([x], 'sgd', {'learning_rate': 0.1})
+    with mx.autograd.record():
+        for w in x.list_data():
+            y = w + 1
+            y.backward()
+    trainer.step(1)
+    assert trainer._kvstore._updater.optimizer._get_lr(0) == 0.1
+    trainer.save_states('test_trainer_save_load.states')
+    trainer.load_states('test_trainer_save_load.states')
+    x.lr_mult = 2.0
+    # check if parameter dict is correctly associated with optimizer after load_state
+    assert trainer._kvstore._updater.optimizer._get_lr(0) == 0.2
 
 @with_seed()
 def test_block_attr_hidden():

-- 
To stop receiving notification emails like this one, please contact
hai...@apache.org.
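
The pickling pattern in the commit above can be tried in isolation with the sketch below. It uses only the standard library; `Param`, `TinyOptimizer` and `reload_optimizer` are hypothetical stand-ins, not MXNet classes. The assumption illustrated is that the optimizer should look up learning-rate multipliers on the live parameters, so the parameter references are left out of the pickle and re-attached by the owner after loading.

```python
import pickle


class Param:
    """Hypothetical stand-in for a Gluon Parameter."""

    def __init__(self, lr_mult=1.0):
        self.lr_mult = lr_mult


class TinyOptimizer:
    """Hypothetical optimizer that mirrors the pickling hooks in the diff."""

    def __init__(self, learning_rate, param_dict=None):
        self.lr = learning_rate
        self.param_dict = param_dict if param_dict else {}

    def get_lr(self, index):
        # Scale the base rate by the live parameter's multiplier, if known.
        if index in self.param_dict:
            return self.lr * self.param_dict[index].lr_mult
        return self.lr

    def __getstate__(self):
        state = self.__dict__.copy()
        del state['param_dict']      # do not serialize parameter references
        return state

    def __setstate__(self, state):
        self.__dict__ = state
        self.param_dict = {}         # the owner must re-attach live parameters


def reload_optimizer(optimizer, params):
    """Round-trip through pickle, then re-attach the live parameters."""
    restored = pickle.loads(pickle.dumps(optimizer))
    restored.param_dict = {i: p for i, p in enumerate(params)}
    return restored
```

After `reload_optimizer`, changing `lr_mult` on the original `Param` object is visible to the restored optimizer, which is the behaviour the new unit test checks for `Trainer.load_states`.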

[GitHub] eric-haibin-lin closed pull request #10983: [MXNET-427] Fix trainer.load_state by removing param_dict from optimizer state pickle

2018-05-17 Thread GitBox

eric-haibin-lin closed pull request #10983: [MXNET-427] Fix trainer.load_state 
by removing param_dict from optimizer state pickle
URL: https://github.com/apache/incubator-mxnet/pull/10983
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/python/mxnet/gluon/trainer.py b/python/mxnet/gluon/trainer.py
index da67fc0b1d9..e25aa7e68b9 100644
--- a/python/mxnet/gluon/trainer.py
+++ b/python/mxnet/gluon/trainer.py
@@ -303,6 +303,8 @@ def load_states(self, fname):
         if self._update_on_kvstore:
             self._kvstore.load_optimizer_states(fname)
             self._optimizer = self._kvstore._updater.optimizer
+            param_dict = {i: param for i, param in enumerate(self._params)}
+            self._optimizer.param_dict = param_dict
         else:
             with open(fname, 'rb') as f:
                 states = f.read()
diff --git a/python/mxnet/optimizer.py b/python/mxnet/optimizer.py
index 1d2fd2e73df..0c3fc904fb1 100644
--- a/python/mxnet/optimizer.py
+++ b/python/mxnet/optimizer.py
@@ -426,6 +426,17 @@ def _get_wd(self, index):
             wd *= self.wd_mult.get(self.idx2name[index], 1.0)
         return wd
 
+    def __getstate__(self):
+        ret = self.__dict__.copy()
+        # do not include param_dict in the state
+        del ret['param_dict']
+        return ret
+
+    def __setstate__(self, state):
+        self.__dict__ = state
+        # param_dict needs to be explicitly set by the trainer
+        self.param_dict = {}
+
 # convenience wrapper for Optimizer.Register
 register = Optimizer.register   # pylint: disable=invalid-name
 
diff --git a/tests/python/unittest/test_gluon.py b/tests/python/unittest/test_gluon.py
index fb73e53bc05..a2688282650 100644
--- a/tests/python/unittest/test_gluon.py
+++ b/tests/python/unittest/test_gluon.py
@@ -509,10 +509,10 @@ def dict_equ(a, b):
 
     assert (x.data(mx.cpu(1)).asnumpy() == -4).all()
 
-    trainer.save_states('test.states')
+    trainer.save_states('test_trainer.states')
     states = deepcopy(trainer._kvstore._updater.states) if trainer._update_on_kvstore \
              else deepcopy(trainer._updaters[0].states)
-    trainer.load_states('test.states')
+    trainer.load_states('test_trainer.states')
     if trainer._update_on_kvstore:
         dict_equ(trainer._kvstore._updater.states, states)
         assert trainer._optimizer == trainer._kvstore._updater.optimizer
@@ -538,6 +538,22 @@ def dict_equ(a, b):
 
     assert (x.data(mx.cpu(1)).asnumpy() == -1).all(), x.data(mx.cpu(1)).asnumpy()
 
+@with_seed()
+def test_trainer_save_load():
+    x = gluon.Parameter('x', shape=(10,), lr_mult=1.0)
+    x.initialize(ctx=[mx.cpu(0), mx.cpu(1)], init='zeros')
+    trainer = gluon.Trainer([x], 'sgd', {'learning_rate': 0.1})
+    with mx.autograd.record():
+        for w in x.list_data():
+            y = w + 1
+            y.backward()
+    trainer.step(1)
+    assert trainer._kvstore._updater.optimizer._get_lr(0) == 0.1
+    trainer.save_states('test_trainer_save_load.states')
+    trainer.load_states('test_trainer_save_load.states')
+    x.lr_mult = 2.0
+    # check if parameter dict is correctly associated with optimizer after load_state
+    assert trainer._kvstore._updater.optimizer._get_lr(0) == 0.2
 
 @with_seed()
 def test_block_attr_hidden():


 



[GitHub] lanking520 commented on issue #10546: [MXNET-319] [WIP] Add Autocomplete Macros in Scala

2018-05-17 Thread GitBox

lanking520 commented on issue #10546: [MXNET-319] [WIP] Add Autocomplete Macros 
in Scala
URL: https://github.com/apache/incubator-mxnet/pull/10546#issuecomment-389998569
 
 
   Closing this PR since it is outdated and duplicated by the new one: 
https://github.com/apache/incubator-mxnet/pull/10991



[GitHub] lanking520 closed pull request #10546: [MXNET-319] [WIP] Add Autocomplete Macros in Scala

2018-05-17 Thread GitBox

lanking520 closed pull request #10546: [MXNET-319] [WIP] Add Autocomplete 
Macros in Scala
URL: https://github.com/apache/incubator-mxnet/pull/10546
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/scala-package/core/src/main/scala/org/apache/mxnet/LibInfo.scala b/scala-package/core/src/main/scala/org/apache/mxnet/LibInfo.scala
index 0a5683aa7ab..fd8269b6ac8 100644
--- a/scala-package/core/src/main/scala/org/apache/mxnet/LibInfo.scala
+++ b/scala-package/core/src/main/scala/org/apache/mxnet/LibInfo.scala
@@ -194,7 +194,8 @@ private[mxnet] class LibInfo {
   argNames: ListBuffer[String],
   argTypes: ListBuffer[String],
   argDescs: ListBuffer[String],
-  keyVarNumArgs: RefString): Int
+  keyVarNumArgs: RefString,
+  returnType: RefString): Int
   @native def mxSymbolCreateAtomicSymbol(handle: SymbolHandle,
  paramKeys: Array[String],
  paramVals: Array[String],
diff --git a/scala-package/core/src/main/scala/org/apache/mxnet/NDArray.scala b/scala-package/core/src/main/scala/org/apache/mxnet/NDArray.scala
index 416f2d74e82..561ebc96b55 100644
--- a/scala-package/core/src/main/scala/org/apache/mxnet/NDArray.scala
+++ b/scala-package/core/src/main/scala/org/apache/mxnet/NDArray.scala
@@ -162,13 +162,14 @@ object NDArray {
 val name = new RefString
 val desc = new RefString
 val keyVarNumArgs = new RefString
+val returnType = new RefString
 val numArgs = new RefInt
 val argNames = ListBuffer.empty[String]
 val argTypes = ListBuffer.empty[String]
 val argDescs = ListBuffer.empty[String]
 
 checkCall(_LIB.mxSymbolGetAtomicSymbolInfo(
-  handle, name, desc, numArgs, argNames, argTypes, argDescs, keyVarNumArgs))
+  handle, name, desc, numArgs, argNames, argTypes, argDescs, keyVarNumArgs, returnType))
 val arguments = (argTypes zip argNames).filter { case (dtype, _) =>
   !(dtype.startsWith("NDArray") || dtype.startsWith("Symbol")
 || dtype.startsWith("NDArray-or-Symbol"))
diff --git a/scala-package/core/src/main/scala/org/apache/mxnet/Symbol.scala b/scala-package/core/src/main/scala/org/apache/mxnet/Symbol.scala
index 13f85a731dc..05abcce18a5 100644
--- a/scala-package/core/src/main/scala/org/apache/mxnet/Symbol.scala
+++ b/scala-package/core/src/main/scala/org/apache/mxnet/Symbol.scala
@@ -29,7 +29,8 @@ import scala.collection.mutable.{ArrayBuffer, ListBuffer}
  * WARNING: it is your responsibility to clear this object through dispose().
  * 
  */
-class Symbol private(private[mxnet] val handle: SymbolHandle) extends WarnIfNotDisposed {
+class Symbol private(private[mxnet] val handle: SymbolHandle)
+  extends WarnIfNotDisposed {
   private val logger: Logger = LoggerFactory.getLogger(classOf[Symbol])
   private var disposed = false
   protected def isDisposed = disposed
@@ -822,9 +823,8 @@ class Symbol private(private[mxnet] val handle: SymbolHandle) extends WarnIfNotD
 jsonStr.value
   }
 }
-
 @AddSymbolFunctions(false)
-object Symbol {
+object Symbol extends SymbolBase {
   private type SymbolCreateNamedFunc = Map[String, Any] => Symbol
   private val logger = LoggerFactory.getLogger(classOf[Symbol])
   private val functions: Map[String, SymbolFunction] = initSymbolModule()
@@ -1026,13 +1026,14 @@ object Symbol {
 val name = new RefString
 val desc = new RefString
 val keyVarNumArgs = new RefString
+val returnType = new RefString
 val numArgs = new RefInt
 val argNames = ListBuffer.empty[String]
 val argTypes = ListBuffer.empty[String]
 val argDescs = ListBuffer.empty[String]
 
 checkCall(_LIB.mxSymbolGetAtomicSymbolInfo(
-  handle, name, desc, numArgs, argNames, argTypes, argDescs, keyVarNumArgs))
+  handle, name, desc, numArgs, argNames, argTypes, argDescs, keyVarNumArgs, returnType))
 (aliasName, new SymbolFunction(handle, keyVarNumArgs.value))
   }
 
diff --git a/scala-package/core/src/main/scala/org/apache/mxnet/SymbolBase.scala b/scala-package/core/src/main/scala/org/apache/mxnet/SymbolBase.scala
new file mode 100644
index 000..30a6a4c0d3f
--- /dev/null
+++ b/scala-package/core/src/main/scala/org/apache/mxnet/SymbolBase.scala
@@ -0,0 +1,20 @@
+/*
+* Licensed to the Apache Software Foundation (ASF) under one or more
+* contributor license agreements.  See the NOTICE file distributed with
+* this work for additional information regarding copyright ownership.
+* The ASF licenses this file to You under the Apache License, Version

[GitHub] lanking520 opened a new pull request #10991: [MXNET-386][WIP] API Docs Generation

2018-05-17 Thread GitBox

lanking520 opened a new pull request #10991: [MXNET-386][WIP] API Docs 
Generation
URL: https://github.com/apache/incubator-mxnet/pull/10991
 
 
   ## Description ##
   API docs generation (write file). Create a file (an abstract class) that the 
API classes can inherit from. 
   @nswamy @yzhliu 
   ## Checklist ##
   ### Essentials ###
   - [ ] API file generator
   - [ ] Maven build integration
   - [ ] Current API integration
   - [ ] Operational and test checking
   - [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to 
the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) 
created (except PRs with tiny changes)
   - [ ] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage:
   - Unit tests are added for small changes to verify correctness (e.g. adding 
a new operator)
   - Nightly tests are added for complicated/long-running ones (e.g. changing 
distributed kvstore)
   - Build tests will be added for build configuration changes (e.g. adding a 
new build option with NCCL)
   - [ ] Code is well-documented: 
   - For user-facing API changes, API doc string has been updated. 
   - For new C++ functions in header files, their functionalities and arguments 
are documented. 
   - For new examples, README.md is added to explain what the example does, 
the source of the dataset, expected performance on test set and reference to 
the original paper if applicable
   - Check the API doc at 
http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
   - [ ] To the best of my knowledge, examples are either not affected by this 
change, or have been fixed to be compatible with this change
   
   ### Changes ###
   - [ ] Feature1, tests, (and when applicable, API doc)
   - [ ] Feature2, tests, (and when applicable, API doc)
   
   ## Comments ##
   - If this change is a backward incompatible change, why must this change be 
made.
   - Interesting edge cases to note here
   



[GitHub] altosaar opened a new issue #10990: Gluon hybridize fails to detect an input

2018-05-17 Thread GitBox

altosaar opened a new issue #10990: Gluon hybridize fails to detect an input
URL: https://github.com/apache/incubator-mxnet/issues/10990
 
 
   Here is a minimal example of hybridize failing to detect an input: 
https://gist.github.com/altosaar/6c29d8ac505ae1cea03ca65f193e7832
   
   Without hybridize, this code runs fine. 
   
   However, hybridize thinks one of the inputs is not used. Stack trace:
   
   ```
   /usr/local/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: 
FutureWarning: Conversion of the second argument of issubdtype from `float` to 
`np.floating` is deprecated. In future, it will be treated as `np.float64 == 
np.dtype(float).type`.
 from ._conv import register_converters as _register_converters
   /usr/local/anaconda3/lib/python3.6/site-packages/mxnet/gluon/block.py:414: 
UserWarning: The 2-th input to HybridBlock is not used by any computation. Is 
this intended?
 return self.forward(*args)
   Traceback (most recent call last):
 File "gluon-hybridize-error.py", line 70, in 
   model(users, items, item_counts, set_sizes)
 File 
"/usr/local/anaconda3/lib/python3.6/site-packages/mxnet/gluon/block.py", line 
414, in __call__
   return self.forward(*args)
 File 
"/usr/local/anaconda3/lib/python3.6/site-packages/mxnet/gluon/block.py", line 
620, in forward
   return self._call_cached_op(x, *args)
 File 
"/usr/local/anaconda3/lib/python3.6/site-packages/mxnet/gluon/block.py", line 
525, in _call_cached_op
   self._build_cache(*args)
 File 
"/usr/local/anaconda3/lib/python3.6/site-packages/mxnet/gluon/block.py", line 
513, in _build_cache
   self._cached_op = ndarray.CachedOp(out, self._flags, input_names, 
param_dict)
 File 
"/usr/local/anaconda3/lib/python3.6/site-packages/mxnet/_ctypes/ndarray.py", 
line 130, in __init__
   ctypes.byref(self.handle)))
 File "/usr/local/anaconda3/lib/python3.6/site-packages/mxnet/base.py", 
line 210, in check_call
   raise MXNetError(py_str(_LIB.MXGetLastError()))
   mxnet.base.MXNetError: [16:11:21] src/imperative/cached_op.cc:91: Check 
failed: arg_name_to_id.size() == arg_names.size() (3 vs. 4) CachedOp expects 3 
inputs, given 4
   
   Stack trace returned 6 entries:
   [bt] (0) 0   libmxnet.so 0x000111e58bb4 
libmxnet.so + 19380
   [bt] (1) 1   libmxnet.so 0x000111e5896f 
libmxnet.so + 18799
   [bt] (2) 2   libmxnet.so 0x000112f0906f 
MXNDListFree + 353135
   [bt] (3) 3   libmxnet.so 0x000112e8cead 
MXCreateCachedOpEx + 845
   [bt] (4) 4   libffi.6.dylib  0x00011063b884 
ffi_call_unix64 + 76
   [bt] (5) 5   ??? 0x7ffee0120790 0x0 + 
140732657698704
   ```



[GitHub] marcoabreu commented on issue #10921: Test cases improvement for MKLDNN on Gluon

2018-05-17 Thread GitBox

marcoabreu commented on issue #10921: Test cases improvement for MKLDNN on Gluon
URL: https://github.com/apache/incubator-mxnet/pull/10921#issuecomment-389990597
 
 
   Thanks a lot for adding so many tests, it's very appreciated!!!



[GitHub] safrooze commented on a change in pull request #10959: [MXNET-423] Gluon Model Zoo Pre Trained Model tutorial

2018-05-17 Thread GitBox

safrooze commented on a change in pull request #10959: [MXNET-423] Gluon Model 
Zoo Pre Trained Model tutorial
URL: https://github.com/apache/incubator-mxnet/pull/10959#discussion_r189070370
 
 

 ##
 File path: docs/tutorials/gluon/pretrained_models.md
 ##
 @@ -0,0 +1,373 @@
+
+# Using pre-trained models in MXNet
+
+In this tutorial we will see how to use multiple pre-trained models with Apache MXNet. First, let's download three image classification models from the Apache MXNet [Gluon model zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html).
+* **DenseNet-121** ([research paper](https://arxiv.org/abs/1608.06993)), which improved the state of the art on the [ImageNet dataset](http://image-net.org/challenges/LSVRC) in 2016.
+* **MobileNet** ([research paper](https://arxiv.org/abs/1704.04861)), based on a streamlined architecture that uses depth-wise separable convolutions to build lightweight deep neural networks suitable for mobile applications.
+* **ResNet-18** ([research paper](https://arxiv.org/abs/1512.03385v1)), whose -152 version was the 2015 winner in multiple categories.
+
+Why would you want to try multiple models? Why not just pick the one with the 
best accuracy? As we will see later in the tutorial, even though these models 
have been trained on the same dataset and optimized for maximum accuracy, they 
do behave slightly differently on specific images. In addition, prediction 
speed and memory footprints can vary, and that's an important factor for many 
applications. By trying a few pretrained models, you have an opportunity to 
find a model that can be a good fit for solving your business problem.
+
+
+```python
+import mxnet as mx
+from mxnet import gluon, nd
+from mxnet.gluon.model_zoo import vision
+import matplotlib.pyplot as plt
+import numpy as np
+import json
+%matplotlib inline
+```
+
+## Loading the model
+
+The [Gluon Model 
Zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html) 
provides a collection of off-the-shelf models. You can get the ImageNet 
pre-trained model by using `pretrained=True`. 
+If you want to train on your own classification problem from scratch, you can 
get an untrained network with a specific number of classes using the `classes` 
parameter: for example `classes=10`.
+
+We can specify the *context* where we want to run the model: the default 
behavior is to use a CPU context. There are two reasons for this:
+* First, this will allow you to test the notebook even if your machine is not 
equipped with a GPU :)
+* Second, we're going to predict a single image and we don't have any specific 
performance requirements. For production applications where you'd want to 
predict large batches of images with the best possible throughput, a GPU could 
definitely be the way to go.
+* If you want to use a GPU, make sure you have pip installed the right version 
of mxnet, or you will get an error when using the `mx.gpu()` context. Refer to 
the [install instructions](http://mxnet.incubator.apache.org/install/index.html)
+
+
+```python
+# We set the context to CPU, you can switch to GPU if you have one and 
installed a compatible version of MXNet 
+ctx = mx.cpu() 
+```
+
+
+```python
+# We can load three the three models
+densenet121 = vision.densenet121(pretrained=True, ctx=ctx)
+mobileNet = vision.mobilenet0_5(pretrained=True, ctx=ctx)
+resnet18 = vision.resnet18_v1(pretrained=True, ctx=ctx)
+```
+
+We can look at the description of the MobileNet network for example, which has 
a relatively simple though deep architecture
+
+
+```python
+print(mobileNet)
+```
+
+MobileNet(
+  (features): HybridSequential(
+(0): Conv2D(3 -> 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 
1), bias=False)
+(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=16)
+(2): Activation(relu)
+(3): Conv2D(1 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 
1), groups=16, bias=False)
+(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=16)
+(5): Activation(relu)
+(6): Conv2D(16 -> 32, kernel_size=(1, 1), stride=(1, 1), bias=False)
+(7): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=32)
+(8): Activation(relu)
+(9): Conv2D(1 -> 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 
1), groups=32, bias=False)
+(10): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=32)
+(11): Activation(relu)
+(12): Conv2D(32 -> 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
+(13): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=64)
+(14): Activation(relu)
+(15): Conv2D(1 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 
1), groups=64, bias=False)
+(16):

[GitHub] safrooze commented on a change in pull request #10959: [MXNET-423] Gluon Model Zoo Pre Trained Model tutorial

2018-05-17 Thread GitBox

safrooze commented on a change in pull request #10959: [MXNET-423] Gluon Model 
Zoo Pre Trained Model tutorial
URL: https://github.com/apache/incubator-mxnet/pull/10959#discussion_r189033996
 
 

 ##
 File path: docs/tutorials/gluon/pretrained_models.md
 ##
 @@ -0,0 +1,373 @@
+
+# Using pre-trained models in MXNet
+
+In this tutorial we will see how to use multiple pre-trained models with 
Apache MXNet. First, let's download three images classification models from the 
Apache MXNet [Gluon model 
zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html).
 
 Review comment:
   Typo: images->image
   Also, since you're not really downloading any models in this section, maybe 
reword to: *In this tutorial we will use the following three different image 
classification models:*


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] safrooze commented on a change in pull request #10959: [MXNET-423] Gluon Model Zoo Pre Trained Model tutorial

2018-05-17 Thread GitBox

safrooze commented on a change in pull request #10959: [MXNET-423] Gluon Model 
Zoo Pre Trained Model tutorial
URL: https://github.com/apache/incubator-mxnet/pull/10959#discussion_r189034610
 
 

 ##
 File path: docs/tutorials/gluon/pretrained_models.md
 ##
 @@ -0,0 +1,373 @@
+
+# Using pre-trained models in MXNet
+
+In this tutorial we will see how to use multiple pre-trained models with 
Apache MXNet. First, let's download three images classification models from the 
Apache MXNet [Gluon model 
zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html).
+* **DenseNet-121** ([research paper](https://arxiv.org/abs/1608.06993)), 
improved state of the art on [ImageNet 
dataset](http://image-net.org/challenges/LSVRC) in 2016.
+* **MobileNet** ([research paper](https://arxiv.org/abs/1704.04861)), 
MobileNets are based on a streamlined architecture that uses depth-wise 
separable convolutions to build light weight deep neural networks, suitable for 
mobile applications.
+* **ResNet-18** ([research paper](https://arxiv.org/abs/1512.03385v1)), the 
-152 version is the 2015 winner in multiple categories.
+
+Why would you want to try multiple models? Why not just pick the one with the 
best accuracy? As we will see later in the tutorial, even though these models 
have been trained on the same dataset and optimized for maximum accuracy, they 
do behave slightly differently on specific images. In addition, prediction 
speed and memory footprints can vary, and that's an important factor for many 
applications. By trying a few pretrained models, you have an opportunity to 
find a model that can be a good fit for solving your business problem.
+
+
+```python
+import mxnet as mx
+from mxnet import gluon, nd
+from mxnet.gluon.model_zoo import vision
+import matplotlib.pyplot as plt
+import numpy as np
+import json
 
 Review comment:
   PEP8: standard-library imports (json) go before other packages, followed by 
a blank line.



[GitHub] safrooze commented on a change in pull request #10959: [MXNET-423] Gluon Model Zoo Pre Trained Model tutorial

2018-05-17 Thread GitBox

safrooze commented on a change in pull request #10959: [MXNET-423] Gluon Model 
Zoo Pre Trained Model tutorial
URL: https://github.com/apache/incubator-mxnet/pull/10959#discussion_r189073842
 
 

 ##
 File path: docs/tutorials/gluon/pretrained_models.md
 ##
 @@ -0,0 +1,373 @@
+
+# Using pre-trained models in MXNet
+
+In this tutorial we will see how to use multiple pre-trained models with 
Apache MXNet. First, let's download three images classification models from the 
Apache MXNet [Gluon model 
zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html).
+* **DenseNet-121** ([research paper](https://arxiv.org/abs/1608.06993)), 
improved state of the art on [ImageNet 
dataset](http://image-net.org/challenges/LSVRC) in 2016.
+* **MobileNet** ([research paper](https://arxiv.org/abs/1704.04861)), 
MobileNets are based on a streamlined architecture that uses depth-wise 
separable convolutions to build light weight deep neural networks, suitable for 
mobile applications.
+* **ResNet-18** ([research paper](https://arxiv.org/abs/1512.03385v1)), the 
-152 version is the 2015 winner in multiple categories.
+
+Why would you want to try multiple models? Why not just pick the one with the 
best accuracy? As we will see later in the tutorial, even though these models 
have been trained on the same dataset and optimized for maximum accuracy, they 
do behave slightly differently on specific images. In addition, prediction 
speed and memory footprints can vary, and that's an important factor for many 
applications. By trying a few pretrained models, you have an opportunity to 
find a model that can be a good fit for solving your business problem.
+
+
+```python
+import mxnet as mx
+from mxnet import gluon, nd
+from mxnet.gluon.model_zoo import vision
+import matplotlib.pyplot as plt
+import numpy as np
+import json
+%matplotlib inline
+```
+
+## Loading the model
+
+The [Gluon Model 
Zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html) 
provides a collection of off-the-shelf models. You can get the ImageNet 
pre-trained model by using `pretrained=True`. 
+If you want to train on your own classification problem from scratch, you can 
get an untrained network with a specific number of classes using the `classes` 
parameter: for example `classes=10`.
+
+We can specify the *context* where we want to run the model: the default 
behavior is to use a CPU context. There are two reasons for this:
+* First, this will allow you to test the notebook even if your machine is not 
equipped with a GPU :)
+* Second, we're going to predict a single image and we don't have any specific 
performance requirements. For production applications where you'd want to 
predict large batches of images with the best possible throughput, a GPU could 
definitely be the way to go.
+* If you want to use a GPU, make sure you have pip installed the right version 
of mxnet, or you will get an error when using the `mx.gpu()` context. Refer to 
the [install instructions](http://mxnet.incubator.apache.org/install/index.html)
+
+
+```python
+# We set the context to CPU, you can switch to GPU if you have one and 
installed a compatible version of MXNet 
+ctx = mx.cpu() 
+```
+
+
+```python
+# We can load three the three models
+densenet121 = vision.densenet121(pretrained=True, ctx=ctx)
+mobileNet = vision.mobilenet0_5(pretrained=True, ctx=ctx)
+resnet18 = vision.resnet18_v1(pretrained=True, ctx=ctx)
+```
+
+We can look at the description of the MobileNet network for example, which has 
a relatively simple though deep architecture
+
+
+```python
+print(mobileNet)
+```
+
+MobileNet(
+  (features): HybridSequential(
+(0): Conv2D(3 -> 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 
1), bias=False)
+(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=16)
+(2): Activation(relu)
+(3): Conv2D(1 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 
1), groups=16, bias=False)
+(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=16)
+(5): Activation(relu)
+(6): Conv2D(16 -> 32, kernel_size=(1, 1), stride=(1, 1), bias=False)
+(7): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=32)
+(8): Activation(relu)
+(9): Conv2D(1 -> 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 
1), groups=32, bias=False)
+(10): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=32)
+(11): Activation(relu)
+(12): Conv2D(32 -> 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
+(13): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=64)
+(14): Activation(relu)
+(15): Conv2D(1 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 
1), groups=64, bias=False)
+(16):

[GitHub] safrooze commented on a change in pull request #10959: [MXNET-423] Gluon Model Zoo Pre Trained Model tutorial

2018-05-17 Thread GitBox

safrooze commented on a change in pull request #10959: [MXNET-423] Gluon Model 
Zoo Pre Trained Model tutorial
URL: https://github.com/apache/incubator-mxnet/pull/10959#discussion_r189071450
 
 

 ##
 File path: docs/tutorials/gluon/pretrained_models.md
 ##
 @@ -0,0 +1,373 @@
+
+# Using pre-trained models in MXNet
+
+In this tutorial we will see how to use multiple pre-trained models with 
Apache MXNet. First, let's download three images classification models from the 
Apache MXNet [Gluon model 
zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html).
+* **DenseNet-121** ([research paper](https://arxiv.org/abs/1608.06993)), 
improved state of the art on [ImageNet 
dataset](http://image-net.org/challenges/LSVRC) in 2016.
+* **MobileNet** ([research paper](https://arxiv.org/abs/1704.04861)), 
MobileNets are based on a streamlined architecture that uses depth-wise 
separable convolutions to build light weight deep neural networks, suitable for 
mobile applications.
+* **ResNet-18** ([research paper](https://arxiv.org/abs/1512.03385v1)), the 
-152 version is the 2015 winner in multiple categories.
+
+Why would you want to try multiple models? Why not just pick the one with the 
best accuracy? As we will see later in the tutorial, even though these models 
have been trained on the same dataset and optimized for maximum accuracy, they 
do behave slightly differently on specific images. In addition, prediction 
speed and memory footprints can vary, and that's an important factor for many 
applications. By trying a few pretrained models, you have an opportunity to 
find a model that can be a good fit for solving your business problem.
+
+
+```python
+import mxnet as mx
+from mxnet import gluon, nd
+from mxnet.gluon.model_zoo import vision
+import matplotlib.pyplot as plt
+import numpy as np
+import json
+%matplotlib inline
+```
+
+## Loading the model
+
+The [Gluon Model 
Zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html) 
provides a collection of off-the-shelf models. You can get the ImageNet 
pre-trained model by using `pretrained=True`. 
+If you want to train on your own classification problem from scratch, you can 
get an untrained network with a specific number of classes using the `classes` 
parameter: for example `classes=10`.
+
+We can specify the *context* where we want to run the model: the default 
behavior is to use a CPU context. There are two reasons for this:
+* First, this will allow you to test the notebook even if your machine is not 
equipped with a GPU :)
+* Second, we're going to predict a single image and we don't have any specific 
performance requirements. For production applications where you'd want to 
predict large batches of images with the best possible throughput, a GPU could 
definitely be the way to go.
+* If you want to use a GPU, make sure you have pip installed the right version 
of mxnet, or you will get an error when using the `mx.gpu()` context. Refer to 
the [install instructions](http://mxnet.incubator.apache.org/install/index.html)
+
+
+```python
+# We set the context to CPU, you can switch to GPU if you have one and 
installed a compatible version of MXNet 
+ctx = mx.cpu() 
+```
+
+
+```python
+# We can load three the three models
+densenet121 = vision.densenet121(pretrained=True, ctx=ctx)
+mobileNet = vision.mobilenet0_5(pretrained=True, ctx=ctx)
+resnet18 = vision.resnet18_v1(pretrained=True, ctx=ctx)
+```
+
+We can look at the description of the MobileNet network for example, which has 
a relatively simple though deep architecture
+
+
+```python
+print(mobileNet)
+```
+
+MobileNet(
+  (features): HybridSequential(
+(0): Conv2D(3 -> 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 
1), bias=False)
+(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=16)
+(2): Activation(relu)
+(3): Conv2D(1 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 
1), groups=16, bias=False)
+(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=16)
+(5): Activation(relu)
+(6): Conv2D(16 -> 32, kernel_size=(1, 1), stride=(1, 1), bias=False)
+(7): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=32)
+(8): Activation(relu)
+(9): Conv2D(1 -> 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 
1), groups=32, bias=False)
+(10): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=32)
+(11): Activation(relu)
+(12): Conv2D(32 -> 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
+(13): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=64)
+(14): Activation(relu)
+(15): Conv2D(1 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 
1), groups=64, bias=False)
+(16):

[GitHub] safrooze commented on a change in pull request #10959: [MXNET-423] Gluon Model Zoo Pre Trained Model tutorial

2018-05-17 Thread GitBox

safrooze commented on a change in pull request #10959: [MXNET-423] Gluon Model 
Zoo Pre Trained Model tutorial
URL: https://github.com/apache/incubator-mxnet/pull/10959#discussion_r189079007
 
 

 ##
 File path: docs/tutorials/gluon/pretrained_models.md
 ##
 @@ -0,0 +1,373 @@
+
+# Using pre-trained models in MXNet
+
+In this tutorial we will see how to use multiple pre-trained models with 
Apache MXNet. First, let's download three images classification models from the 
Apache MXNet [Gluon model 
zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html).
+* **DenseNet-121** ([research paper](https://arxiv.org/abs/1608.06993)), 
improved state of the art on [ImageNet 
dataset](http://image-net.org/challenges/LSVRC) in 2016.
+* **MobileNet** ([research paper](https://arxiv.org/abs/1704.04861)), 
MobileNets are based on a streamlined architecture that uses depth-wise 
separable convolutions to build light weight deep neural networks, suitable for 
mobile applications.
+* **ResNet-18** ([research paper](https://arxiv.org/abs/1512.03385v1)), the 
-152 version is the 2015 winner in multiple categories.
+
+Why would you want to try multiple models? Why not just pick the one with the 
best accuracy? As we will see later in the tutorial, even though these models 
have been trained on the same dataset and optimized for maximum accuracy, they 
do behave slightly differently on specific images. In addition, prediction 
speed and memory footprints can vary, and that's an important factor for many 
applications. By trying a few pretrained models, you have an opportunity to 
find a model that can be a good fit for solving your business problem.
+
+
+```python
+import mxnet as mx
+from mxnet import gluon, nd
+from mxnet.gluon.model_zoo import vision
+import matplotlib.pyplot as plt
+import numpy as np
+import json
+%matplotlib inline
+```
+
+## Loading the model
+
+The [Gluon Model 
Zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html) 
provides a collection of off-the-shelf models. You can get the ImageNet 
pre-trained model by using `pretrained=True`. 
+If you want to train on your own classification problem from scratch, you can 
get an untrained network with a specific number of classes using the `classes` 
parameter: for example `classes=10`.
 
 Review comment:
   Maybe add a line of code to show how to get the untrained model with a 
different number of classes? Also perhaps emphasise that one cannot change the 
number of classes and use `pretrained` at the same time.



[GitHub] safrooze commented on a change in pull request #10959: [MXNET-423] Gluon Model Zoo Pre Trained Model tutorial

2018-05-17 Thread GitBox

safrooze commented on a change in pull request #10959: [MXNET-423] Gluon Model 
Zoo Pre Trained Model tutorial
URL: https://github.com/apache/incubator-mxnet/pull/10959#discussion_r189070034
 
 

 ##
 File path: docs/tutorials/gluon/pretrained_models.md
 ##
 @@ -0,0 +1,373 @@
+
+# Using pre-trained models in MXNet
+
+In this tutorial we will see how to use multiple pre-trained models with 
Apache MXNet. First, let's download three images classification models from the 
Apache MXNet [Gluon model 
zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html).
+* **DenseNet-121** ([research paper](https://arxiv.org/abs/1608.06993)), 
improved state of the art on [ImageNet 
dataset](http://image-net.org/challenges/LSVRC) in 2016.
+* **MobileNet** ([research paper](https://arxiv.org/abs/1704.04861)), 
MobileNets are based on a streamlined architecture that uses depth-wise 
separable convolutions to build light weight deep neural networks, suitable for 
mobile applications.
+* **ResNet-18** ([research paper](https://arxiv.org/abs/1512.03385v1)), the 
-152 version is the 2015 winner in multiple categories.
+
+Why would you want to try multiple models? Why not just pick the one with the 
best accuracy? As we will see later in the tutorial, even though these models 
have been trained on the same dataset and optimized for maximum accuracy, they 
do behave slightly differently on specific images. In addition, prediction 
speed and memory footprints can vary, and that's an important factor for many 
applications. By trying a few pretrained models, you have an opportunity to 
find a model that can be a good fit for solving your business problem.
+
+
+```python
+import mxnet as mx
+from mxnet import gluon, nd
+from mxnet.gluon.model_zoo import vision
+import matplotlib.pyplot as plt
+import numpy as np
+import json
+%matplotlib inline
+```
+
+## Loading the model
+
+The [Gluon Model 
Zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html) 
provides a collection of off-the-shelf models. You can get the ImageNet 
pre-trained model by using `pretrained=True`. 
+If you want to train on your own classification problem from scratch, you can 
get an untrained network with a specific number of classes using the `classes` 
parameter: for example `classes=10`.
+
+We can specify the *context* where we want to run the model: the default 
behavior is to use a CPU context. There are two reasons for this:
+* First, this will allow you to test the notebook even if your machine is not 
equipped with a GPU :)
+* Second, we're going to predict a single image and we don't have any specific 
performance requirements. For production applications where you'd want to 
predict large batches of images with the best possible throughput, a GPU could 
definitely be the way to go.
+* If you want to use a GPU, make sure you have pip installed the right version 
of mxnet, or you will get an error when using the `mx.gpu()` context. Refer to 
the [install instructions](http://mxnet.incubator.apache.org/install/index.html)
+
+
+```python
+# We set the context to CPU, you can switch to GPU if you have one and 
installed a compatible version of MXNet 
+ctx = mx.cpu() 
+```
+
+
+```python
+# We can load three the three models
+densenet121 = vision.densenet121(pretrained=True, ctx=ctx)
+mobileNet = vision.mobilenet0_5(pretrained=True, ctx=ctx)
+resnet18 = vision.resnet18_v1(pretrained=True, ctx=ctx)
+```
+
+We can look at the description of the MobileNet network for example, which has 
a relatively simple though deep architecture
+
+
+```python
+print(mobileNet)
+```
+
+MobileNet(
+  (features): HybridSequential(
+(0): Conv2D(3 -> 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 
1), bias=False)
+(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=16)
+(2): Activation(relu)
+(3): Conv2D(1 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 
1), groups=16, bias=False)
+(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=16)
+(5): Activation(relu)
+(6): Conv2D(16 -> 32, kernel_size=(1, 1), stride=(1, 1), bias=False)
+(7): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=32)
+(8): Activation(relu)
+(9): Conv2D(1 -> 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 
1), groups=32, bias=False)
+(10): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=32)
+(11): Activation(relu)
+(12): Conv2D(32 -> 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
+(13): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, 
use_global_stats=False, in_channels=64)
+(14): Activation(relu)
+(15): Conv2D(1 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 
1), groups=64, bias=False)
+(16):

[GitHub] safrooze commented on a change in pull request #10959: [MXNET-423] Gluon Model Zoo Pre Trained Model tutorial

2018-05-17 Thread GitBox

safrooze commented on a change in pull request #10959: [MXNET-423] Gluon Model 
Zoo Pre Trained Model tutorial
URL: https://github.com/apache/incubator-mxnet/pull/10959#discussion_r189041711
 
 

 ##
 File path: docs/tutorials/gluon/pretrained_models.md
 ##
 @@ -0,0 +1,373 @@
+
+# Using pre-trained models in MXNet
+
+In this tutorial we will see how to use multiple pre-trained models with 
Apache MXNet. First, let's download three images classification models from the 
Apache MXNet [Gluon model 
zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html).
+* **DenseNet-121** ([research paper](https://arxiv.org/abs/1608.06993)), 
improved state of the art on [ImageNet 
dataset](http://image-net.org/challenges/LSVRC) in 2016.
+* **MobileNet** ([research paper](https://arxiv.org/abs/1704.04861)), 
MobileNets are based on a streamlined architecture that uses depth-wise 
separable convolutions to build light weight deep neural networks, suitable for 
mobile applications.
+* **ResNet-18** ([research paper](https://arxiv.org/abs/1512.03385v1)), the 
-152 version is the 2015 winner in multiple categories.
+
+Why would you want to try multiple models? Why not just pick the one with the 
best accuracy? As we will see later in the tutorial, even though these models 
have been trained on the same dataset and optimized for maximum accuracy, they 
do behave slightly differently on specific images. In addition, prediction 
speed and memory footprints can vary, and that's an important factor for many 
applications. By trying a few pretrained models, you have an opportunity to 
find a model that can be a good fit for solving your business problem.
+
+
+```python
+import mxnet as mx
+from mxnet import gluon, nd
+from mxnet.gluon.model_zoo import vision
+import matplotlib.pyplot as plt
+import numpy as np
+import json
+%matplotlib inline
+```
+
+## Loading the model
+
+The [Gluon Model 
Zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html) 
provides a collection of off-the-shelf models. You can get the ImageNet 
pre-trained model by using `pretrained=True`. 
+If you want to train on your own classification problem from scratch, you can 
get an untrained network with a specific number of classes using the `classes` 
parameter: for example `classes=10`.
+
+We can specify the *context* where we want to run the model: the default 
behavior is to use a CPU context. There are two reasons for this:
+* First, this will allow you to test the notebook even if your machine is not 
equipped with a GPU :)
+* Second, we're going to predict a single image and we don't have any specific 
performance requirements. For production applications where you'd want to 
predict large batches of images with the best possible throughput, a GPU could 
definitely be the way to go.
+* If you want to use a GPU, make sure you have pip installed the right version 
of mxnet, or you will get an error when using the `mx.gpu()` context. Refer to 
the [install instructions](http://mxnet.incubator.apache.org/install/index.html)
+
+
+```python
+# We set the context to CPU, you can switch to GPU if you have one and 
installed a compatible version of MXNet 
+ctx = mx.cpu() 
+```
+
+
+```python
+# We can load three the three models
+densenet121 = vision.densenet121(pretrained=True, ctx=ctx)
+mobileNet = vision.mobilenet0_5(pretrained=True, ctx=ctx)
+resnet18 = vision.resnet18_v1(pretrained=True, ctx=ctx)
+```
+
+We can look at the description of the MobileNet network for example, which has 
a relatively simple though deep architecture
 
 Review comment:
   "simple yet deep", or just "simple, deep"?



[GitHub] safrooze commented on a change in pull request #10959: [MXNET-423] Gluon Model Zoo Pre Trained Model tutorial

2018-05-17 Thread GitBox

safrooze commented on a change in pull request #10959: [MXNET-423] Gluon Model 
Zoo Pre Trained Model tutorial
URL: https://github.com/apache/incubator-mxnet/pull/10959#discussion_r189034379
 
 

 ##
 File path: docs/tutorials/gluon/pretrained_models.md
 ##
 @@ -0,0 +1,373 @@
+
+# Using pre-trained models in MXNet
+
+In this tutorial we will see how to use multiple pre-trained models with 
Apache MXNet. First, let's download three images classification models from the 
Apache MXNet [Gluon model 
zoo](https://mxnet.incubator.apache.org/api/python/gluon/model_zoo.html).
+* **DenseNet-121** ([research paper](https://arxiv.org/abs/1608.06993)), 
improved state of the art on [ImageNet 
dataset](http://image-net.org/challenges/LSVRC) in 2016.
+* **MobileNet** ([research paper](https://arxiv.org/abs/1704.04861)), 
MobileNets are based on a streamlined architecture that uses depth-wise 
separable convolutions to build light weight deep neural networks, suitable for 
mobile applications.
+* **ResNet-18** ([research paper](https://arxiv.org/abs/1512.03385v1)), the 
-152 version is the 2015 winner in multiple categories.
+
+Why would you want to try multiple models? Why not just pick the one with the 
best accuracy? As we will see later in the tutorial, even though these models 
have been trained on the same dataset and optimized for maximum accuracy, they 
do behave slightly differently on specific images. In addition, prediction 
speed and memory footprints can vary, and that's an important factor for many 
applications. By trying a few pretrained models, you have an opportunity to 
find a model that can be a good fit for solving your business problem.
+
+
+```python
+import mxnet as mx
+from mxnet import gluon, nd
+from mxnet.gluon.model_zoo import vision
+import matplotlib.pyplot as plt
 
 Review comment:
   PEP8: matplotlib < mxnet



[GitHub] marcoabreu closed pull request #10975: [MXNET-343] avoid importing docker_cache if feature is not used

2018-05-17 Thread GitBox

marcoabreu closed pull request #10975: [MXNET-343] avoid importing docker_cache 
if feature is not used
URL: https://github.com/apache/incubator-mxnet/pull/10975
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/Jenkinsfile b/Jenkinsfile
index 4d8dfb713eb..e45bea7f456 100644
--- a/Jenkinsfile
+++ b/Jenkinsfile
@@ -449,7 +449,7 @@ try {
 }
   }
 },
-'Raspberry / ARMv6l':{
+'Raspberry / ARMv6':{
   node('mxnetlinux-cpu') {
 ws('workspace/build-raspberry-armv6') {
   timeout(time: max_time, unit: 'MINUTES') {
diff --git a/ci/build.py b/ci/build.py
index 6b1d23e0391..deae1d733a8 100755
--- a/ci/build.py
+++ b/ci/build.py
@@ -33,7 +33,6 @@
 import shutil
 import subprocess
 import sys
-import docker_cache
 from copy import deepcopy
 from itertools import chain
 from subprocess import call, check_call
@@ -232,6 +231,7 @@ def script_name() -> str:
 platform = args.platform
 tag = get_docker_tag(platform)
 if args.download_docker_cache:
+    import docker_cache
     logging.info('Docker cache download is enabled')
     docker_cache.load_docker_cache(bucket_name=args.docker_cache_bucket, docker_tag=tag)
 build_docker(platform, docker_binary)
@@ -256,6 +256,7 @@ def script_name() -> str:
 logging.info("Artifacts will be produced in the build/ directory.")
 for platform in platforms:
     if args.download_docker_cache:
+        import docker_cache
         tag = get_docker_tag(platform)
         logging.info('Docker cache download is enabled')
         docker_cache.load_docker_cache(bucket_name=args.docker_cache_bucket, docker_tag=tag)
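
The diff above moves `import docker_cache` from module level into the branches that actually use it, so the module is only loaded when cache download is requested. A minimal sketch of that deferred-import pattern follows. It is an illustration only: `load_cache` is a hypothetical helper, and the module names are stand-ins resolved through `importlib`, not the real `docker_cache` module.

```python
import importlib


def load_cache(enabled, module_name="json"):
    """Import `module_name` only when the feature is enabled.

    `load_cache` and `module_name` are hypothetical stand-ins for the
    `args.download_docker_cache` branch and the `docker_cache` module.
    """
    if not enabled:
        # Feature off: the optional module is never imported, so it does
        # not even have to be installed.
        return None
    # Feature on: the import happens at the point of use.
    module = importlib.import_module(module_name)
    return module.__name__


# A missing optional dependency is harmless while the feature is off.
print(load_cache(False, module_name="not_installed_optional_dep"))
# With the feature on, the module is imported and used.
print(load_cache(True))
```

The trade-off is that an import error surfaces only when the feature is first used rather than at start-up.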


 



[incubator-mxnet] branch master updated: [MXNET-343] Avoid importing docker_cache if feature is not used. (#10975)

2018-05-17 Thread marcoabreu

This is an automated email from the ASF dual-hosted git repository.

marcoabreu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new 3677b6f  [MXNET-343] Avoid importing docker_cache if feature is not 
used. (#10975)
3677b6f is described below

commit 3677b6fecdac201062d3448eab1a33ba0fd1108b
Author: Pedro Larroy <928489+lar...@users.noreply.github.com>
AuthorDate: Thu May 17 21:02:07 2018 +0200

[MXNET-343] Avoid importing docker_cache if feature is not used. (#10975)

So boto3 and joblib are not needed
---
 Jenkinsfile | 2 +-
 ci/build.py | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/Jenkinsfile b/Jenkinsfile
index 4d8dfb7..e45bea7 100644
--- a/Jenkinsfile
+++ b/Jenkinsfile
@@ -449,7 +449,7 @@ try {
 }
   }
 },
-'Raspberry / ARMv6l':{
+'Raspberry / ARMv6':{
   node('mxnetlinux-cpu') {
 ws('workspace/build-raspberry-armv6') {
   timeout(time: max_time, unit: 'MINUTES') {
diff --git a/ci/build.py b/ci/build.py
index 6b1d23e..deae1d7 100755
--- a/ci/build.py
+++ b/ci/build.py
@@ -33,7 +33,6 @@ import re
 import shutil
 import subprocess
 import sys
-import docker_cache
 from copy import deepcopy
 from itertools import chain
 from subprocess import call, check_call
@@ -232,6 +231,7 @@ def main() -> int:
 platform = args.platform
 tag = get_docker_tag(platform)
 if args.download_docker_cache:
+    import docker_cache
     logging.info('Docker cache download is enabled')
     docker_cache.load_docker_cache(bucket_name=args.docker_cache_bucket, docker_tag=tag)
 build_docker(platform, docker_binary)
@@ -256,6 +256,7 @@ def main() -> int:
 logging.info("Artifacts will be produced in the build/ directory.")
 for platform in platforms:
     if args.download_docker_cache:
+        import docker_cache
         tag = get_docker_tag(platform)
         logging.info('Docker cache download is enabled')
         docker_cache.load_docker_cache(bucket_name=args.docker_cache_bucket, docker_tag=tag)

-- 
To stop receiving notification emails like this one, please contact
marcoab...@apache.org.

[GitHub] piiswrong commented on a change in pull request #10852: [MXNET-411] Add ROI Align

2018-05-17 Thread GitBox

piiswrong commented on a change in pull request #10852: [MXNET-411] Add ROI 
Align
URL: https://github.com/apache/incubator-mxnet/pull/10852#discussion_r189058175
 
 

 ##
 File path: src/operator/contrib/adaptive_avg_pooling.cc
 ##
 @@ -26,8 +26,8 @@
 // #include "elemwise_op_common.h"
 #include "../elemwise_op_common.h"
 
-#define START_IND(a, b, c) static_cast<int>(floor(static_cast<float>(a * c) / b))
-#define END_IND(a, b, c) static_cast<int>(ceil(static_cast<float>((a + 1) * c) / b))
+#define START_IND(a, b, c) static_cast<int>(floor(static_cast<float>(a * c) / b))
 
 Review comment:
   ?
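
For readers following the macros under review: `START_IND` and `END_IND` compute a floor/ceil index range. A minimal Python sketch of the arithmetic is below. It assumes (the diff does not say so) that `a` is the output index, `b` the output size and `c` the input size, as is usual in adaptive pooling. The `bins` helper is hypothetical.

```python
import math


def start_ind(a, b, c):
    # Mirrors START_IND: floor(a * c / b)
    return math.floor(a * c / b)


def end_ind(a, b, c):
    # Mirrors END_IND: ceil((a + 1) * c / b)
    return math.ceil((a + 1) * c / b)


def bins(output_size, input_size):
    """Return the half-open [start, end) input range for each output index."""
    return [
        (start_ind(i, output_size, input_size), end_ind(i, output_size, input_size))
        for i in range(output_size)
    ]


# Pooling 5 inputs into 2 outputs gives two overlapping ranges.
print(bins(2, 5))
```

Because the start is rounded down and the end is rounded up, neighbouring ranges may overlap when the input size is not a multiple of the output size.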



[GitHub] piiswrong commented on a change in pull request #10852: [MXNET-411] Add ROI Align

2018-05-17 Thread GitBox

piiswrong commented on a change in pull request #10852: [MXNET-411] Add ROI 
Align
URL: https://github.com/apache/incubator-mxnet/pull/10852#discussion_r189060442
 
 

 ##
 File path: src/operator/contrib/roi_align.cc
 ##
 @@ -0,0 +1,593 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+/*!
+ * Copyright (c) 2018 by Contributors
+ * \file roi_align.cc
+ * \brief roi align operator
+ * \author Hang Zhang
+ * Adapted from Caffe2
+*/
+#include "./roi_align-inl.h"
+
+
+namespace mxnet {
+namespace op {
+
+template <typename T>
+struct PreCalc {
+  int pos1;
+  int pos2;
+  int pos3;
+  int pos4;
+  T w1;
+  T w2;
+  T w3;
+  T w4;
+};
+
+template <typename T>
+void pre_calc_for_bilinear_interpolate(
+const int height,
+const int width,
+const int pooled_height,
+const int pooled_width,
+const int iy_upper,
+const int ix_upper,
+T roi_start_h,
+T roi_start_w,
+T bin_size_h,
+T bin_size_w,
+int roi_bin_grid_h,
+int roi_bin_grid_w,
+std::vector<PreCalc<T>>* pre_calc) {
+  int pre_calc_index = 0;
+  for (int ph = 0; ph < pooled_height; ph++) {
+for (int pw = 0; pw < pooled_width; pw++) {
+  for (int iy = 0; iy < iy_upper; iy++) {
+const T yy = roi_start_h + ph * bin_size_h +
+static_cast<T>(iy + .5f) * bin_size_h /
+static_cast<T>(roi_bin_grid_h);  // e.g., 0.5, 1.5
+for (int ix = 0; ix < ix_upper; ix++) {
+  const T xx = roi_start_w + pw * bin_size_w +
+  static_cast<T>(ix + .5f) * bin_size_w /
+  static_cast<T>(roi_bin_grid_w);
+
+  T x = xx;
+  T y = yy;
+  // deal with: inverse elements are out of feature map boundary
+  if (y < -1.0 || y > height || x < -1.0 || x > width) {
+// empty
+PreCalc<T> pc;
+pc.pos1 = 0;
+pc.pos2 = 0;
+pc.pos3 = 0;
+pc.pos4 = 0;
+pc.w1 = 0;
+pc.w2 = 0;
+pc.w3 = 0;
+pc.w4 = 0;
+pre_calc->at(pre_calc_index) = pc;
+pre_calc_index += 1;
+continue;
+  }
+
+  if (y <= 0) {
+y = 0;
+  }
+  if (x <= 0) {
+x = 0;
+  }
+
+  int y_low = static_cast<int>(y);
+  int x_low = static_cast<int>(x);
+  int y_high;
+  int x_high;
+
+  if (y_low >= height - 1) {
+y_high = y_low = height - 1;
+y = (T)y_low;
+  } else {
+y_high = y_low + 1;
+  }
+
+  if (x_low >= width - 1) {
+x_high = x_low = width - 1;
+x = (T)x_low;
+  } else {
+x_high = x_low + 1;
+  }
+
+  T ly = y - y_low;
+  T lx = x - x_low;
+  T hy = 1. - ly, hx = 1. - lx;
+  T w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx;
+
+  // save weights and indices
+  PreCalc<T> pc;
+  pc.pos1 = y_low * width + x_low;
+  pc.pos2 = y_low * width + x_high;
+  pc.pos3 = y_high * width + x_low;
+  pc.pos4 = y_high * width + x_high;
+  pc.w1 = w1;
+  pc.w2 = w2;
+  pc.w3 = w3;
+  pc.w4 = w4;
+  pre_calc->at(pre_calc_index) = pc;
+
+  pre_calc_index += 1;
+}
+  }
+}
+  }
+}
+
+template <typename T>
+void ROIAlignForward(
+const int nthreads,
+const T* bottom_data,
+const T& spatial_scale,
+const int channels,
+const int height,
+const int width,
+const int pooled_height,
+const int pooled_width,
+const int sampling_ratio,
+const T* bottom_rois,
+int roi_cols,
+T* top_data) {
+  DCHECK(roi_cols == 4 || roi_cols == 5);
+
+  int n_rois = nthreads / channels / pooled_width / pooled_height;
+  // (n, c, ph, pw) is an element in the pooled output
+  // can be parallelized using omp
+  // #pragma omp parallel for num_threads(32)
+  for (int n = 0; n < n_rois; n++) {
+int index_n = n * channels * pooled_width * pooled_height;
+
+// roi could have 4 or 5 columns
+const T* offset_bottom_rois = bottom_rois + n * roi_cols;
+int roi_batch_ind = 0;
+if (roi_cols == 5) {
+  roi_batch_ind =

[GitHub] piiswrong commented on a change in pull request #10852: [MXNET-411] Add ROI Align

2018-05-17 Thread GitBox

piiswrong commented on a change in pull request #10852: [MXNET-411] Add ROI 
Align
URL: https://github.com/apache/incubator-mxnet/pull/10852#discussion_r189062271
 
 

 ##
 File path: src/operator/contrib/roi_align-inl.h
 ##
 @@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+/*!
+ * Copyright (c) 2018 by Contributors
+ * \file roi_align-inl.h
+ * \brief roi align operator and symbol
+ * \author Hang Zhang
+ * modified from Caffe2
+*/
+#ifndef MXNET_OPERATOR_CONTRIB_ROI_ALIGN_INL_H_
+#define MXNET_OPERATOR_CONTRIB_ROI_ALIGN_INL_H_
+
+#include 
+#include 
+#include "../mshadow_op.h"
+#include "../tensor/init_op.h"
+
+
+namespace mxnet {
+namespace op {
+
+
+// Declare enumeration of input order to make code more intuitive.
+// These enums are only visible within this header
+namespace roialign {
+enum ROIAlignOpInputs {kData, kBox};
+enum ROIAlignOpOutputs {kOut};
+}  // roialign
+
+
+struct ROIAlignParam : public dmlc::Parameter<ROIAlignParam> {
+  TShape pooled_size;
+  float spatial_scale;
+  DMLC_DECLARE_PARAMETER(ROIAlignParam) {
+DMLC_DECLARE_FIELD(pooled_size)
+.set_expect_ndim(2).enforce_nonzero()
+.describe("fix pooled size: (h, w)");
 
 Review comment:
   what's pooled_size?



[GitHub] piiswrong commented on a change in pull request #10852: [MXNET-411] Add ROI Align

2018-05-17 Thread GitBox

piiswrong commented on a change in pull request #10852: [MXNET-411] Add ROI 
Align
URL: https://github.com/apache/incubator-mxnet/pull/10852#discussion_r189061899
 
 

 ##
 File path: src/operator/contrib/roi_align.cu
 ##
 @@ -0,0 +1,485 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+/*!
+ * Copyright (c) 2018 by Contributors
+ * \file roi_align.cu
+ * \brief roi align operator
+ * \author Hang Zhang
+ * Adapted from Caffe2
+*/
+#include "./roi_align-inl.h"
+
+
+namespace mxnet {
+namespace op {
+
+#define CUDA_1D_KERNEL_LOOP(i, n) \
+  for (size_t i = blockIdx.x * blockDim.x + threadIdx.x; i < (n); \
+   i += blockDim.x * gridDim.x)
+
+using namespace mshadow::cuda;
+
+// The maximum number of blocks to use in the default kernel call.
+constexpr int ROI_MAXIMUM_NUM_BLOCKS = 4096;
+
+/**
+ * @brief Compute the number of blocks needed to run N threads.
+ */
+inline int ROI_GET_BLOCKS(const int N) {
+  return std::max(
+  std::min(
+  (N + kMaxThreadsPerBlock - 1) / kMaxThreadsPerBlock,
+  ROI_MAXIMUM_NUM_BLOCKS),
+  // Use at least 1 block, since CUDA does not allow empty block
+  1);
+}
+
+
+template <typename T>
+__device__ T bilinear_interpolate(
+const T* bottom_data,
+const int height,
+const int width,
+T y,
+T x,
+const int index /* index for debug only*/) {
+  // deal with cases that inverse elements are out of feature map boundary
+  if (y < -1.0 || y > height || x < -1.0 || x > width) {
+// empty
+return 0;
+  }
+
+  if (y <= 0) {
+y = 0;
+  }
+  if (x <= 0) {
+x = 0;
+  }
+
+  int y_low = static_cast<int>(y);
+  int x_low = static_cast<int>(x);
+  int y_high;
+  int x_high;
+
+  if (y_low >= height - 1) {
+y_high = y_low = height - 1;
+y = (T)y_low;
+  } else {
+y_high = y_low + 1;
+  }
+
+  if (x_low >= width - 1) {
+x_high = x_low = width - 1;
+x = (T)x_low;
+  } else {
+x_high = x_low + 1;
+  }
+
+  T ly = y - y_low;
+  T lx = x - x_low;
+  T hy = 1. - ly, hx = 1. - lx;
+  // do bilinear interpolation
+  T v1 = bottom_data[y_low * width + x_low];
+  T v2 = bottom_data[y_low * width + x_high];
+  T v3 = bottom_data[y_high * width + x_low];
+  T v4 = bottom_data[y_high * width + x_high];
+  T w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx;
+
+  T val = (w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4);
+
+  return val;
+}
+
+template <typename T>
+__global__ void RoIAlignForwardKernel(
+const int nthreads,
+const T* bottom_data,
+const T spatial_scale,
+const int channels,
+const int height,
+const int width,
+const int pooled_height,
+const int pooled_width,
+const int sampling_ratio,
+const T* bottom_rois,
+T* top_data) {
+  CUDA_1D_KERNEL_LOOP(index, nthreads) {
+// (n, c, ph, pw) is an element in the pooled output
+int pw = index % pooled_width;
+int ph = (index / pooled_width) % pooled_height;
+int c = (index / pooled_width / pooled_height) % channels;
+int n = index / pooled_width / pooled_height / channels;
+
+const T* offset_bottom_rois = bottom_rois + n * 5;
+int roi_batch_ind = offset_bottom_rois[0];
+
+// Do not use rounding; this implementation detail is critical
+T roi_start_w = offset_bottom_rois[1] * spatial_scale;
+T roi_start_h = offset_bottom_rois[2] * spatial_scale;
+T roi_end_w = offset_bottom_rois[3] * spatial_scale;
+T roi_end_h = offset_bottom_rois[4] * spatial_scale;
+// T roi_start_w = round(offset_bottom_rois[1] * spatial_scale);
+// T roi_start_h = round(offset_bottom_rois[2] * spatial_scale);
+// T roi_end_w = round(offset_bottom_rois[3] * spatial_scale);
+// T roi_end_h = round(offset_bottom_rois[4] * spatial_scale);
+
+// Force malformed ROIs to be 1x1
+T roi_width = max(roi_end_w - roi_start_w, (T)1.);
+T roi_height = max(roi_end_h - roi_start_h, (T)1.);
+T bin_size_h = static_cast<T>(roi_height) / static_cast<T>(pooled_height);
+T bin_size_w = static_cast<T>(roi_width) / static_cast<T>(pooled_width);
+
+const T* offset_bottom_data =
+bottom_data + (roi_batch_ind * channels + c) * height * width;
+
+// We use roi_bin_grid to sample the grid and mimic integral

[GitHub] piiswrong commented on a change in pull request #10852: [MXNET-411] Add ROI Align

2018-05-17 Thread GitBox

piiswrong commented on a change in pull request #10852: [MXNET-411] Add ROI 
Align
URL: https://github.com/apache/incubator-mxnet/pull/10852#discussion_r189060029
 
 

 ##
 File path: src/operator/contrib/roi_align.cc
 ##
 @@ -0,0 +1,593 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+/*!
+ * Copyright (c) 2018 by Contributors
+ * \file roi_align.cc
+ * \brief roi align operator
+ * \author Hang Zhang
+ * Adapted from Caffe2
+*/
+#include "./roi_align-inl.h"
+
+
+namespace mxnet {
+namespace op {
+
+template <typename T>
+struct PreCalc {
+  int pos1;
+  int pos2;
+  int pos3;
+  int pos4;
+  T w1;
+  T w2;
+  T w3;
+  T w4;
+};
+
+template <typename T>
+void pre_calc_for_bilinear_interpolate(
+const int height,
+const int width,
+const int pooled_height,
+const int pooled_width,
+const int iy_upper,
+const int ix_upper,
+T roi_start_h,
+T roi_start_w,
+T bin_size_h,
+T bin_size_w,
+int roi_bin_grid_h,
+int roi_bin_grid_w,
+std::vector<PreCalc<T>>* pre_calc) {
+  int pre_calc_index = 0;
+  for (int ph = 0; ph < pooled_height; ph++) {
+for (int pw = 0; pw < pooled_width; pw++) {
+  for (int iy = 0; iy < iy_upper; iy++) {
+const T yy = roi_start_h + ph * bin_size_h +
+static_cast<T>(iy + .5f) * bin_size_h /
+static_cast<T>(roi_bin_grid_h);  // e.g., 0.5, 1.5
+for (int ix = 0; ix < ix_upper; ix++) {
+  const T xx = roi_start_w + pw * bin_size_w +
+  static_cast<T>(ix + .5f) * bin_size_w /
+  static_cast<T>(roi_bin_grid_w);
+
+  T x = xx;
+  T y = yy;
+  // deal with: inverse elements are out of feature map boundary
+  if (y < -1.0 || y > height || x < -1.0 || x > width) {
+// empty
+PreCalc<T> pc;
+pc.pos1 = 0;
+pc.pos2 = 0;
+pc.pos3 = 0;
+pc.pos4 = 0;
+pc.w1 = 0;
+pc.w2 = 0;
+pc.w3 = 0;
+pc.w4 = 0;
+pre_calc->at(pre_calc_index) = pc;
+pre_calc_index += 1;
+continue;
+  }
+
+  if (y <= 0) {
+y = 0;
+  }
+  if (x <= 0) {
+x = 0;
+  }
+
+  int y_low = static_cast<int>(y);
+  int x_low = static_cast<int>(x);
+  int y_high;
+  int x_high;
+
+  if (y_low >= height - 1) {
+y_high = y_low = height - 1;
+y = (T)y_low;
+  } else {
+y_high = y_low + 1;
+  }
+
+  if (x_low >= width - 1) {
+x_high = x_low = width - 1;
+x = (T)x_low;
+  } else {
+x_high = x_low + 1;
+  }
+
+  T ly = y - y_low;
+  T lx = x - x_low;
+  T hy = 1. - ly, hx = 1. - lx;
+  T w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx;
+
+  // save weights and indices
+  PreCalc<T> pc;
+  pc.pos1 = y_low * width + x_low;
+  pc.pos2 = y_low * width + x_high;
+  pc.pos3 = y_high * width + x_low;
+  pc.pos4 = y_high * width + x_high;
+  pc.w1 = w1;
+  pc.w2 = w2;
+  pc.w3 = w3;
+  pc.w4 = w4;
+  pre_calc->at(pre_calc_index) = pc;
+
+  pre_calc_index += 1;
+}
+  }
+}
+  }
+}
+
+template <typename T>
+void ROIAlignForward(
+const int nthreads,
+const T* bottom_data,
+const T& spatial_scale,
+const int channels,
+const int height,
+const int width,
+const int pooled_height,
+const int pooled_width,
+const int sampling_ratio,
+const T* bottom_rois,
+int roi_cols,
+T* top_data) {
+  DCHECK(roi_cols == 4 || roi_cols == 5);
+
+  int n_rois = nthreads / channels / pooled_width / pooled_height;
+  // (n, c, ph, pw) is an element in the pooled output
+  // can be parallelized using omp
+  // #pragma omp parallel for num_threads(32)
+  for (int n = 0; n < n_rois; n++) {
+int index_n = n * channels * pooled_width * pooled_height;
+
+// roi could have 4 or 5 columns
+const T* offset_bottom_rois = bottom_rois + n * roi_cols;
+int roi_batch_ind = 0;
+if (roi_cols == 5) {
+  roi_batch_ind =

[GitHub] piiswrong commented on a change in pull request #10852: [MXNET-411] Add ROI Align

2018-05-17 Thread GitBox

piiswrong commented on a change in pull request #10852: [MXNET-411] Add ROI 
Align
URL: https://github.com/apache/incubator-mxnet/pull/10852#discussion_r189060521
 
 

 ##
 File path: src/operator/contrib/roi_align.cc
 ##
 @@ -0,0 +1,593 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+/*!
+ * Copyright (c) 2018 by Contributors
+ * \file roi_align.cc
+ * \brief roi align operator
+ * \author Hang Zhang
+ * Adapted from Caffe2
+*/
+#include "./roi_align-inl.h"
+
+
+namespace mxnet {
+namespace op {
+
+template <typename T>
+struct PreCalc {
+  int pos1;
+  int pos2;
+  int pos3;
+  int pos4;
+  T w1;
+  T w2;
+  T w3;
+  T w4;
+};
+
+template <typename T>
+void pre_calc_for_bilinear_interpolate(
+const int height,
+const int width,
+const int pooled_height,
+const int pooled_width,
+const int iy_upper,
+const int ix_upper,
+T roi_start_h,
+T roi_start_w,
+T bin_size_h,
+T bin_size_w,
+int roi_bin_grid_h,
+int roi_bin_grid_w,
+std::vector<PreCalc<T>>* pre_calc) {
+  int pre_calc_index = 0;
+  for (int ph = 0; ph < pooled_height; ph++) {
+for (int pw = 0; pw < pooled_width; pw++) {
+  for (int iy = 0; iy < iy_upper; iy++) {
+const T yy = roi_start_h + ph * bin_size_h +
+static_cast<T>(iy + .5f) * bin_size_h /
+static_cast<T>(roi_bin_grid_h);  // e.g., 0.5, 1.5
+for (int ix = 0; ix < ix_upper; ix++) {
+  const T xx = roi_start_w + pw * bin_size_w +
+  static_cast<T>(ix + .5f) * bin_size_w /
+  static_cast<T>(roi_bin_grid_w);
+
+  T x = xx;
+  T y = yy;
+  // deal with: inverse elements are out of feature map boundary
+  if (y < -1.0 || y > height || x < -1.0 || x > width) {
+// empty
+PreCalc<T> pc;
+pc.pos1 = 0;
+pc.pos2 = 0;
+pc.pos3 = 0;
+pc.pos4 = 0;
+pc.w1 = 0;
+pc.w2 = 0;
+pc.w3 = 0;
+pc.w4 = 0;
+pre_calc->at(pre_calc_index) = pc;
+pre_calc_index += 1;
+continue;
+  }
+
+  if (y <= 0) {
+y = 0;
+  }
+  if (x <= 0) {
+x = 0;
+  }
+
+  int y_low = static_cast<int>(y);
+  int x_low = static_cast<int>(x);
+  int y_high;
+  int x_high;
+
+  if (y_low >= height - 1) {
+y_high = y_low = height - 1;
+y = (T)y_low;
+  } else {
+y_high = y_low + 1;
+  }
+
+  if (x_low >= width - 1) {
+x_high = x_low = width - 1;
+x = (T)x_low;
+  } else {
+x_high = x_low + 1;
+  }
+
+  T ly = y - y_low;
+  T lx = x - x_low;
+  T hy = 1. - ly, hx = 1. - lx;
+  T w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx;
+
+  // save weights and indices
+  PreCalc<T> pc;
+  pc.pos1 = y_low * width + x_low;
+  pc.pos2 = y_low * width + x_high;
+  pc.pos3 = y_high * width + x_low;
+  pc.pos4 = y_high * width + x_high;
+  pc.w1 = w1;
+  pc.w2 = w2;
+  pc.w3 = w3;
+  pc.w4 = w4;
+  pre_calc->at(pre_calc_index) = pc;
+
+  pre_calc_index += 1;
+}
+  }
+}
+  }
+}
+
+template <typename T>
+void ROIAlignForward(
+const int nthreads,
+const T* bottom_data,
+const T& spatial_scale,
+const int channels,
+const int height,
+const int width,
+const int pooled_height,
+const int pooled_width,
+const int sampling_ratio,
+const T* bottom_rois,
+int roi_cols,
+T* top_data) {
+  DCHECK(roi_cols == 4 || roi_cols == 5);
+
+  int n_rois = nthreads / channels / pooled_width / pooled_height;
+  // (n, c, ph, pw) is an element in the pooled output
+  // can be parallelized using omp
+  // #pragma omp parallel for num_threads(32)
+  for (int n = 0; n < n_rois; n++) {
 
 Review comment:
   openmp?



[GitHub] piiswrong commented on a change in pull request #10852: [MXNET-411] Add ROI Align

2018-05-17 Thread GitBox

piiswrong commented on a change in pull request #10852: [MXNET-411] Add ROI 
Align
URL: https://github.com/apache/incubator-mxnet/pull/10852#discussion_r189062628
 
 

 ##
 File path: src/operator/contrib/roi_align-inl.h
 ##
 @@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+/*!
+ * Copyright (c) 2018 by Contributors
+ * \file roi_align-inl.h
+ * \brief roi align operator and symbol
+ * \author Hang Zhang
+ * modified from Caffe2
+*/
+#ifndef MXNET_OPERATOR_CONTRIB_ROI_ALIGN_INL_H_
+#define MXNET_OPERATOR_CONTRIB_ROI_ALIGN_INL_H_
+
+#include 
+#include 
+#include "../mshadow_op.h"
+#include "../tensor/init_op.h"
+
+
+namespace mxnet {
+namespace op {
+
+
+// Declare enumeration of input order to make code more intuitive.
+// These enums are only visible within this header
+namespace roialign {
+enum ROIAlignOpInputs {kData, kBox};
+enum ROIAlignOpOutputs {kOut};
+}  // roialign
+
+
+struct ROIAlignParam : public dmlc::Parameter<ROIAlignParam> {
+  TShape pooled_size;
+  float spatial_scale;
+  DMLC_DECLARE_PARAMETER(ROIAlignParam) {
+DMLC_DECLARE_FIELD(pooled_size)
+.set_expect_ndim(2).enforce_nonzero()
+.describe("fix pooled size: (h, w)");
+DMLC_DECLARE_FIELD(spatial_scale).set_range(0.0, 1.0)
+.describe("Ratio of input feature map height (or w) to raw image height (or w). "
+"Equals the reciprocal of total stride in convolutional layers");
+  }
+};
+
+
+struct ROIAlignGrad {
 
 Review comment:
   No need for this struct. Use lambda directly at set_attr
   



[GitHub] zhanghang1989 commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-17 Thread GitBox

zhanghang1989 commented on a change in pull request #10536: [MXNET-317] Add 
Data Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189059049
 
 

 ##
 File path: python/mxnet/gluon/contrib/parallel.py
 ##
 @@ -0,0 +1,343 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=broad-except, redefined-builtin
+"""Synchronized DataParallel"""
+import threading
+from ... import autograd
+from ...ndarray import NDArray
+from ..utils import split_and_load
+
+__all__ = ['DataParallelModel', 'DataParallelCriterion', 'Barrier']
+
+
+class Barrier(object):
+"""Shared NDArray for cross device operation.
+
+A cross device operation that allows synchronized push and pull. It can be used in
+Cross-gpu Synchronized Batch Normalization and Sparse Blocks.
+
+Parameters
+--
+counter : int
+Number of devices.
+operation : callable
+The cross device operation is applying (e.g. AllReduce).
+"""
+def __init__(self, counter, operation):
+self.mutex = threading.Lock()
+self.all_tasks_done = threading.Condition(self.mutex)
+self.counter = counter
+self.op = operation
+self._clear()
+
+def push(self, x):
+"""Push a NDArray from one of the device.
+Input:
+x (NDArray)
+
+Output:
+idx (int), the output index
+"""
+with self.mutex:
+if self.push_tasks == 0:
+self._clear()
+self.list.append(x)
+idx = len(self.list) - 1
+self.push_tasks -= 1
+
+with self.all_tasks_done:
+if self.push_tasks == 0:
+self.all_tasks_done.notify_all()
+while self.push_tasks:
+self.all_tasks_done.wait()
+
+self._sync_op()
+return idx
+
+def pull(self, idx):
+"""Pull the output to each device
+Input:
+idx (int)
+
+Output:
+out (NDArray)
+"""
+return self.out[idx]
+
+def _sync_op(self):
+with self.mutex:
+if self.reduce_tasks == 1:
+assert(len(self.list) == self.counter)
+self.out = self.op(*self.list)
+if isinstance(self.out, (list, tuple)):
+for xi in self.out:
+xi.wait_to_read()
+else:
+self.out.wait_to_read()
+self.reduce_tasks -= 1
+else:
+self.reduce_tasks -= 1
+
+with self.all_tasks_done:
+if self.reduce_tasks == 0:
+self.all_tasks_done.notify_all()
+while self.reduce_tasks:
+self.all_tasks_done.wait()
+
+def _clear(self):
+self.list = []
+self.push_tasks = self.counter
+self.reduce_tasks = self.counter
+
+def __len__(self):
+return len(self.list)
+
+def __repr__(self):
+return 'ParallelState'
+
+
+class DataParallelModel(object):
+"""Data parallelism
+
+Hide the difference of single/multiple GPUs to the user.
+Inputs and outputs are both list of NDArrays in different contexts.
+In the forward pass, the module is replicated on each device,
+and each replica handles a portion of the input. During the backwards
+pass, gradients from each replica are summed into the original module.
+
+Parameters
+--
+module : object
+Network to be parallelized.
+ctx_list : list
+A list of contexts
+sync : bool
+enable synchronization (default: False).
+
+
+Inputs:
+- **inputs**: list of input (NDArrays)
+
+Outputs:
+- **outputs**: list of output (NDArrays)
+
+Example::
+>>> ctx = [mx.gpu(0), mx.gpu(1)]
+>>> net = DataParallelModel(model, ctx_list=ctx)
+>>> y = net(x)
+"""
+def __init__(self, module, ctx_list=None, sync=False):
+module.collect_params().reset_ctx(ctx=ctx_list)
+self.ctx_list = ctx_list
+self.module = module
+self.sync = sync
+
+def __call__(self, *inputs, **kwargs):
+if not self.ctx_list:
+
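
The `Barrier` class quoted above collects one value per device, applies a cross-device operation once, and hands the result back to every caller. A minimal sketch of that push/reduce/pull idea is below, under loud assumptions: it uses the standard library's `threading.Barrier` with an `action` callback instead of the PR's `Condition`-based bookkeeping, plain numbers instead of NDArrays, and the hypothetical names `ReduceBarrier`, `push_and_pull` and `run_demo`.

```python
import threading


class ReduceBarrier:
    """Each of `parties` threads pushes a value; all get the reduced result."""

    def __init__(self, parties, operation):
        self._op = operation
        self._lock = threading.Lock()
        self._items = []
        self._out = None
        # `action` runs exactly once, in one thread, after every party has
        # arrived and before any of them is released.
        self._barrier = threading.Barrier(parties, action=self._reduce)

    def _reduce(self):
        self._out = self._op(self._items)
        self._items = []  # ready for the next round

    def push_and_pull(self, x):
        with self._lock:
            self._items.append(x)
        self._barrier.wait()  # blocks until all parties have pushed
        return self._out


def run_demo(n=4):
    """Start n worker threads that each push i + 1 and record the result."""
    barrier = ReduceBarrier(n, sum)
    results = [None] * n

    def worker(i):
        results[i] = barrier.push_and_pull(i + 1)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Every worker sees the same reduced value, 1 + 2 + 3 + 4.
print(run_demo(4))
```

Running the reduction in the barrier's `action` keeps it to a single execution per round, which is the role `_sync_op` plays in the quoted code.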

[GitHub] haojin2 commented on a change in pull request #10780: [MXNET-375] Lp Pooling and Global Lp Pooling

2018-05-17 Thread GitBox

haojin2 commented on a change in pull request #10780: [MXNET-375] Lp Pooling 
and Global Lp Pooling
URL: https://github.com/apache/incubator-mxnet/pull/10780#discussion_r189058395
 
 

 ##
 File path: tests/python/gpu/test_operator_gpu.py
 ##
 @@ -769,140 +769,161 @@ def test_pooling_versions_helper(pool_op_list, data, kernel, pool_type, pad, str
 ctx_list.append({'ctx': mx.cpu(0), 'pool_data': data, 'type_dict': {'pool_data': np.float32}})
 if not global_pool:
 sym_list.append(mx.sym.Pooling(kernel=kernel, pad=pad, stride=stride, pool_type=pool_type,
-   pooling_convention=pooling_convention, name='pool'))
+   pooling_convention=pooling_convention, name='pool', p_value=p_value))
 else:
-sym_list.append(mx.sym.Pooling(kernel=kernel, pool_type=pool_type, global_pool=True, name='pool'))
+sym_list.append(mx.sym.Pooling(kernel=kernel, pool_type=pool_type, global_pool=True, name='pool', p_value=p_value))
 # Pooling gpu
 if 'pool_gpu' in pool_op_list:
 ctx_list.append({'ctx': mx.gpu(0), 'pool_data': data, 'type_dict': {'pool_data': np.float32}})
 if not global_pool:
 sym_list.append(mx.sym.Pooling(kernel=kernel, pad=pad, stride=stride, pool_type=pool_type,
-   pooling_convention=pooling_convention, cudnn_off=True, name='pool'))
+   pooling_convention=pooling_convention, cudnn_off=True, name='pool', p_value=p_value))
 else:
 sym_list.append(mx.sym.Pooling(kernel=kernel, pool_type=pool_type, global_pool=True, cudnn_off=True,
-   name='pool'))
+   name='pool', p_value=p_value))
 # CuDNNPooling
 if 'pool_cudnn' in pool_op_list:
 ctx_list.append({'ctx': mx.gpu(0), 'pool_data': data, 'type_dict': {'pool_data': np.float32}})
 if not global_pool:
 sym_list.append(mx.sym.Pooling(kernel=kernel, pad=pad, stride=stride, pool_type=pool_type,
-   pooling_convention=pooling_convention, cudnn_off=False, name='pool'))
+   pooling_convention=pooling_convention, p_value=p_value, cudnn_off=False, name='pool'))
 else:
-sym_list.append(mx.sym.Pooling(kernel=kernel, pool_type=pool_type, global_pool=True, cudnn_off=False,
-   name='pool'))
+sym_list.append(mx.sym.Pooling(kernel=kernel, pool_type=pool_type, global_pool=True, p_value=p_value,
+   cudnn_off=False, name='pool'))
 check_consistency(sym_list, ctx_list)
 
-def test_1d_pooling(pool_type):
+def test_1d_pooling(pool_type, p_value=2):
 data = (2, 3, 20)
 kernel = (4,)
 pad = (0,)
 stride = (1,)
 test_pooling_versions_helper(pool_op_list=['pool_cpu', 'pool_gpu'],
  data=data, kernel=kernel, pad=pad, stride=stride, pool_type=pool_type,
- pooling_convention='valid', global_pool=False)
+ pooling_convention='valid', global_pool=False, p_value=p_value)
 
 pad = (2,)
 stride = (2,)
 test_pooling_versions_helper(pool_op_list=['pool_cpu', 'pool_gpu'],
  data=data, kernel=kernel, pad=pad, stride=stride, pool_type=pool_type,
- pooling_convention='valid', global_pool=False)
+ pooling_convention='valid', global_pool=False, p_value=p_value)
 
 pad = (0,)
 stride = (1,)
 test_pooling_versions_helper(pool_op_list=['pool_cpu', 'pool_gpu'],
  data=data, kernel=kernel, pad=pad, 
stride=stride, pool_type=pool_type,
- pooling_convention='full', 
global_pool=False)
+ pooling_convention='full', 
global_pool=False, p_value=p_value)
 
 pad = (2,)
 stride = (2,)
 test_pooling_versions_helper(pool_op_list=['pool_cpu', 'pool_gpu'],
  data=data, kernel=kernel, pad=pad, 
stride=stride, pool_type=pool_type,
- pooling_convention='full', 
global_pool=False)
+ pooling_convention='full', 
global_pool=False, p_value=p_value)
 
 test_pooling_versions_helper(pool_op_list=['pool_cpu', 'pool_gpu'],
  data=data, kernel=kernel, pad=pad, 
stride=stride, pool_type=pool_type,
-

[GitHub] piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-17 Thread GitBox

piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data 
Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189056068
 
 

 ##
 File path: python/mxnet/gluon/contrib/parallel.py
 ##
 @@ -0,0 +1,343 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=broad-except, redefined-builtin
+"""Synchronized DataParallel"""
+import threading
+from ... import autograd
+from ...ndarray import NDArray
+from ..utils import split_and_load
+
+__all__ = ['DataParallelModel', 'DataParallelCriterion', 'Barrier']
+
+
+class Barrier(object):
+"""Shared NDArray for cross device operation.
+
+A cross-device operation that allows synchronized push and pull. It can be used in
+cross-GPU Synchronized Batch Normalization and Sparse Blocks.
+
+Parameters
+--
+counter : int
+Number of devices.
+operation : callable
+The cross-device operation to apply (e.g. AllReduce).
+"""
+def __init__(self, counter, operation):
+self.mutex = threading.Lock()
+self.all_tasks_done = threading.Condition(self.mutex)
+self.counter = counter
+self.op = operation
+self._clear()
+
+def push(self, x):
+"""Push an NDArray from one of the devices.
+Input:
+x (NDArray)
+
+Output:
+idx (int), the output index
+"""
+with self.mutex:
+if self.push_tasks == 0:
+self._clear()
+self.list.append(x)
+idx = len(self.list) - 1
+self.push_tasks -= 1
+
+with self.all_tasks_done:
+if self.push_tasks == 0:
+self.all_tasks_done.notify_all()
+while self.push_tasks:
+self.all_tasks_done.wait()
+
+self._sync_op()
+return idx
+
+def pull(self, idx):
+"""Pull the output to each device
+Input:
+idx (int)
+
+Output:
+out (NDArray)
+"""
+return self.out[idx]
+
+def _sync_op(self):
+with self.mutex:
+if self.reduce_tasks == 1:
+assert(len(self.list) == self.counter)
+self.out = self.op(*self.list)
+if isinstance(self.out, (list, tuple)):
+for xi in self.out:
+xi.wait_to_read()
+else:
+self.out.wait_to_read()
+self.reduce_tasks -= 1
+else:
+self.reduce_tasks -= 1
+
+with self.all_tasks_done:
+if self.reduce_tasks == 0:
+self.all_tasks_done.notify_all()
+while self.reduce_tasks:
+self.all_tasks_done.wait()
+
+def _clear(self):
+self.list = []
+self.push_tasks = self.counter
+self.reduce_tasks = self.counter
+
+def __len__(self):
+return len(self.list)
+
+def __repr__(self):
+return 'ParallelState'
+
+
+class DataParallelModel(object):
+"""Data parallelism
+
+Hide the difference between single and multiple GPUs from the user.
+Inputs and outputs are both lists of NDArrays in different contexts.
+In the forward pass, the module is replicated on each device,
+and each replica handles a portion of the input. During the backwards
+pass, gradients from each replica are summed into the original module.
+
+Parameters
+--
+module : object
+Network to be parallelized.
+ctx_list : list
+A list of contexts
+sync : bool
+enable synchronization (default: False).
+
+
+Inputs:
+- **inputs**: list of input (NDArrays)
+
+Outputs:
+- **outputs**: list of output (NDArrays)
+
+Example::
+>>> ctx = [mx.gpu(0), mx.gpu(1)]
+>>> net = DataParallelModel(model, ctx_list=ctx)
+>>> y = net(x)
+"""
+def __init__(self, module, ctx_list=None, sync=False):
+module.collect_params().reset_ctx(ctx=ctx_list)
+self.ctx_list = ctx_list
+self.module = module
+self.sync = sync
+
+def __call__(self, *inputs, **kwargs):
+if not self.ctx_list:
+

[GitHub] piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-17 Thread GitBox

piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data 
Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189051703
 
 

 ##
 File path: python/mxnet/gluon/contrib/parallel.py
 ##
 @@ -0,0 +1,343 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=broad-except, redefined-builtin
+"""Synchronized DataParallel"""
+import threading
+from ... import autograd
+from ...ndarray import NDArray
+from ..utils import split_and_load
+
+__all__ = ['DataParallelModel', 'DataParallelCriterion', 'Barrier']
+
+
+class Barrier(object):
+"""Shared NDArray for cross device operation.
+
+A cross-device operation that allows synchronized push and pull. It can be used in
+cross-GPU Synchronized Batch Normalization and Sparse Blocks.
+
+Parameters
+--
+counter : int
+Number of devices.
+operation : callable
+The cross-device operation to apply (e.g. AllReduce).
+"""
+def __init__(self, counter, operation):
+self.mutex = threading.Lock()
+self.all_tasks_done = threading.Condition(self.mutex)
+self.counter = counter
+self.op = operation
+self._clear()
+
+def push(self, x):
+"""Push an NDArray from one of the devices.
+Input:
+x (NDArray)
+
+Output:
+idx (int), the output index
+"""
+with self.mutex:
+if self.push_tasks == 0:
+self._clear()
+self.list.append(x)
+idx = len(self.list) - 1
+self.push_tasks -= 1
+
+with self.all_tasks_done:
+if self.push_tasks == 0:
+self.all_tasks_done.notify_all()
+while self.push_tasks:
+self.all_tasks_done.wait()
+
+self._sync_op()
+return idx
+
+def pull(self, idx):
+"""Pull the output to each device
+Input:
+idx (int)
+
+Output:
+out (NDArray)
+"""
+return self.out[idx]
+
+def _sync_op(self):
+with self.mutex:
+if self.reduce_tasks == 1:
+assert(len(self.list) == self.counter)
+self.out = self.op(*self.list)
+if isinstance(self.out, (list, tuple)):
+for xi in self.out:
+xi.wait_to_read()
 
 Review comment:
   why do you need to wait?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-17 Thread GitBox

piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data 
Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189054696
 
 

 ##
 File path: python/mxnet/gluon/contrib/parallel.py
 ##
 @@ -0,0 +1,343 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=broad-except, redefined-builtin
+"""Synchronized DataParallel"""
+import threading
+from ... import autograd
+from ...ndarray import NDArray
+from ..utils import split_and_load
+
+__all__ = ['DataParallelModel', 'DataParallelCriterion', 'Barrier']
+
+
+class Barrier(object):
+"""Shared NDArray for cross device operation.
+
+A cross-device operation that allows synchronized push and pull. It can be used in
+cross-GPU Synchronized Batch Normalization and Sparse Blocks.
+
+Parameters
+--
+counter : int
+Number of devices.
+operation : callable
+The cross-device operation to apply (e.g. AllReduce).
+"""
+def __init__(self, counter, operation):
+self.mutex = threading.Lock()
+self.all_tasks_done = threading.Condition(self.mutex)
+self.counter = counter
+self.op = operation
+self._clear()
+
+def push(self, x):
+"""Push an NDArray from one of the devices.
+Input:
+x (NDArray)
+
+Output:
+idx (int), the output index
+"""
+with self.mutex:
+if self.push_tasks == 0:
+self._clear()
+self.list.append(x)
+idx = len(self.list) - 1
+self.push_tasks -= 1
+
+with self.all_tasks_done:
+if self.push_tasks == 0:
+self.all_tasks_done.notify_all()
+while self.push_tasks:
+self.all_tasks_done.wait()
+
+self._sync_op()
+return idx
+
+def pull(self, idx):
+"""Pull the output to each device
+Input:
+idx (int)
+
+Output:
+out (NDArray)
+"""
+return self.out[idx]
+
+def _sync_op(self):
+with self.mutex:
+if self.reduce_tasks == 1:
 
 Review comment:
   This counter seems redundant.
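
One possible way to drop the second counter, sketched under assumptions: on Python 3.2+ `threading.Barrier` accepts an `action` callable that runs in exactly one thread once all parties have arrived, so the reduction can live there and the barrier does the counting. `ReduceBarrier`, its explicit `idx` argument, and `demo` are hypothetical names for illustration and differ from the PR's `push`/`pull` API. If Python 2 support is required, `threading.Barrier` is not available.

```python
import threading


class ReduceBarrier(object):
    """Hypothetical variant: threading.Barrier keeps the only counter."""

    def __init__(self, parties, operation):
        self._op = operation
        self._inputs = [None] * parties
        self._out = None
        # The action runs in one thread, after the last party arrives
        # and before any waiting thread is released.
        self._barrier = threading.Barrier(parties, action=self._reduce)

    def _reduce(self):
        self._out = self._op(*self._inputs)

    def push(self, idx, x):
        """Store x in slot idx, wait for all parties, return the result."""
        self._inputs[idx] = x
        self._barrier.wait()
        return self._out


def demo(parties=4):
    """Each worker i pushes i + 1; every worker receives the same sum."""
    barrier = ReduceBarrier(parties, lambda *xs: sum(xs))
    results = [None] * parties

    def worker(i):
        results[i] = barrier.push(i, i + 1)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(parties)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because the next reduction can only run after every thread has called `push` again, reading `self._out` right after `wait()` returns is safe in this sketch.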



[GitHub] piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-17 Thread GitBox

piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data 
Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189055932
 
 

 ##
 File path: python/mxnet/gluon/contrib/parallel.py
 ##
 @@ -0,0 +1,343 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=broad-except, redefined-builtin
+"""Synchronized DataParallel"""
+import threading
+from ... import autograd
+from ...ndarray import NDArray
+from ..utils import split_and_load
+
+__all__ = ['DataParallelModel', 'DataParallelCriterion', 'Barrier']
+
+
+class Barrier(object):
+"""Shared NDArray for cross device operation.
+
+A cross-device operation that allows synchronized push and pull. It can be used in
+cross-GPU Synchronized Batch Normalization and Sparse Blocks.
+
+Parameters
+--
+counter : int
+Number of devices.
+operation : callable
+The cross-device operation to apply (e.g. AllReduce).
+"""
+def __init__(self, counter, operation):
+self.mutex = threading.Lock()
+self.all_tasks_done = threading.Condition(self.mutex)
+self.counter = counter
+self.op = operation
+self._clear()
+
+def push(self, x):
+"""Push an NDArray from one of the devices.
+Input:
+x (NDArray)
+
+Output:
+idx (int), the output index
+"""
+with self.mutex:
+if self.push_tasks == 0:
+self._clear()
+self.list.append(x)
+idx = len(self.list) - 1
+self.push_tasks -= 1
+
+with self.all_tasks_done:
+if self.push_tasks == 0:
+self.all_tasks_done.notify_all()
+while self.push_tasks:
+self.all_tasks_done.wait()
+
+self._sync_op()
+return idx
+
+def pull(self, idx):
+"""Pull the output to each device
+Input:
+idx (int)
+
+Output:
+out (NDArray)
+"""
+return self.out[idx]
+
+def _sync_op(self):
+with self.mutex:
+if self.reduce_tasks == 1:
+assert(len(self.list) == self.counter)
+self.out = self.op(*self.list)
+if isinstance(self.out, (list, tuple)):
+for xi in self.out:
+xi.wait_to_read()
+else:
+self.out.wait_to_read()
+self.reduce_tasks -= 1
+else:
+self.reduce_tasks -= 1
+
+with self.all_tasks_done:
+if self.reduce_tasks == 0:
+self.all_tasks_done.notify_all()
+while self.reduce_tasks:
+self.all_tasks_done.wait()
+
+def _clear(self):
+self.list = []
+self.push_tasks = self.counter
+self.reduce_tasks = self.counter
+
+def __len__(self):
+return len(self.list)
+
+def __repr__(self):
+return 'ParallelState'
+
+
+class DataParallelModel(object):
+"""Data parallelism
+
+Hide the difference between single and multiple GPUs from the user.
+Inputs and outputs are both lists of NDArrays in different contexts.
+In the forward pass, the module is replicated on each device,
+and each replica handles a portion of the input. During the backwards
+pass, gradients from each replica are summed into the original module.
+
+Parameters
+--
+module : object
+Network to be parallelized.
+ctx_list : list
+A list of contexts
+sync : bool
+enable synchronization (default: False).
+
+
+Inputs:
+- **inputs**: list of input (NDArrays)
+
+Outputs:
+- **outputs**: list of output (NDArrays)
+
+Example::
+>>> ctx = [mx.gpu(0), mx.gpu(1)]
+>>> net = DataParallelModel(model, ctx_list=ctx)
+>>> y = net(x)
+"""
+def __init__(self, module, ctx_list=None, sync=False):
+module.collect_params().reset_ctx(ctx=ctx_list)
+self.ctx_list = ctx_list
+self.module = module
+self.sync = sync
+
+def __call__(self, *inputs, **kwargs):
+if not self.ctx_list:
+

[GitHub] piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-17 Thread GitBox

piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data 
Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189053366
 
 

 ##
 File path: python/mxnet/gluon/contrib/parallel.py
 ##
 @@ -0,0 +1,343 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=broad-except, redefined-builtin
+"""Synchronized DataParallel"""
+import threading
+from ... import autograd
+from ...ndarray import NDArray
+from ..utils import split_and_load
+
+__all__ = ['DataParallelModel', 'DataParallelCriterion', 'Barrier']
+
+
+class Barrier(object):
+"""Shared NDArray for cross device operation.
+
+A cross-device operation that allows synchronized push and pull. It can be used in
+cross-GPU Synchronized Batch Normalization and Sparse Blocks.
+
+Parameters
+--
+counter : int
+Number of devices.
+operation : callable
+The cross-device operation to apply (e.g. AllReduce).
+"""
+def __init__(self, counter, operation):
+self.mutex = threading.Lock()
+self.all_tasks_done = threading.Condition(self.mutex)
+self.counter = counter
+self.op = operation
+self._clear()
+
+def push(self, x):
+"""Push an NDArray from one of the devices.
+Input:
+x (NDArray)
+
+Output:
+idx (int), the output index
+"""
+with self.mutex:
+if self.push_tasks == 0:
+self._clear()
+self.list.append(x)
+idx = len(self.list) - 1
+self.push_tasks -= 1
+
+with self.all_tasks_done:
+if self.push_tasks == 0:
+self.all_tasks_done.notify_all()
+while self.push_tasks:
 
 Review comment:
   why busy wait here?
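
For context on the loop in question: `threading.Condition.wait()` releases the underlying lock and blocks until another thread calls `notify()` or `notify_all()`, so `while ...: wait()` is the usual guarded-wait idiom and does not spin. A minimal standalone sketch of that idiom follows; `Latch` and `run_latch` are hypothetical names used only for illustration and are not part of the PR.

```python
import threading


class Latch(object):
    """Hypothetical one-shot latch: blocks callers until `count` have arrived."""

    def __init__(self, count):
        self._count = count
        self._cond = threading.Condition()

    def arrive_and_wait(self):
        with self._cond:
            self._count -= 1
            if self._count == 0:
                # The last arrival wakes every sleeping thread.
                self._cond.notify_all()
            # wait() releases the lock and sleeps until notified, so this
            # loop uses no CPU while blocked. The predicate is re-checked
            # after each wake-up because it may not hold by then.
            while self._count:
                self._cond.wait()


def run_latch(parties=3):
    """Start `parties` threads and return how many got past the latch."""
    latch = Latch(parties)
    passed = []

    def worker():
        latch.arrive_and_wait()
        passed.append(1)

    threads = [threading.Thread(target=worker) for _ in range(parties)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(passed)
```

On Python 3.2+, `self._cond.wait_for(lambda: self._count == 0)` expresses the same loop in a single call.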



[GitHub] piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-17 Thread GitBox

piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data 
Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189051148
 
 

 ##
 File path: python/mxnet/gluon/contrib/parallel.py
 ##
 @@ -0,0 +1,343 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=broad-except, redefined-builtin
+"""Synchronized DataParallel"""
+import threading
+from ... import autograd
+from ...ndarray import NDArray
+from ..utils import split_and_load
+
+__all__ = ['DataParallelModel', 'DataParallelCriterion', 'Barrier']
+
+
+class Barrier(object):
+"""Shared NDArray for cross device operation.
+
+A cross-device operation that allows synchronized push and pull. It can be used in
+cross-GPU Synchronized Batch Normalization and Sparse Blocks.
+
+Parameters
+--
+counter : int
+Number of devices.
+operation : callable
+The cross-device operation to apply (e.g. AllReduce).
+"""
+def __init__(self, counter, operation):
+self.mutex = threading.Lock()
 
 Review comment:
   internal variables should start with _
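
A minimal sketch of the suggested renaming, assuming the attributes are not meant to be read from outside the class. Only the constructor is shown; the underscore-prefixed names are illustrative and the PR's `_clear()` call is omitted here.

```python
import threading


class Barrier(object):
    """Constructor only; the rest of the class is omitted in this sketch."""

    def __init__(self, counter, operation):
        # A leading underscore marks these as implementation details.
        self._mutex = threading.Lock()
        self._all_tasks_done = threading.Condition(self._mutex)
        self._counter = counter
        self._op = operation
```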



[GitHub] piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-17 Thread GitBox

piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data 
Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189051417
 
 

 ##
 File path: python/mxnet/gluon/contrib/parallel.py
 ##
 @@ -0,0 +1,343 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=broad-except, redefined-builtin
+"""Synchronized DataParallel"""
+import threading
+from ... import autograd
+from ...ndarray import NDArray
+from ..utils import split_and_load
+
+__all__ = ['DataParallelModel', 'DataParallelCriterion', 'Barrier']
+
+
+class Barrier(object):
+"""Shared NDArray for cross device operation.
+
+A cross-device operation that allows synchronized push and pull. It can be used in
+cross-GPU Synchronized Batch Normalization and Sparse Blocks.
+
+Parameters
+--
+counter : int
+Number of devices.
+operation : callable
+The cross-device operation to apply (e.g. AllReduce).
+"""
+def __init__(self, counter, operation):
+self.mutex = threading.Lock()
+self.all_tasks_done = threading.Condition(self.mutex)
+self.counter = counter
+self.op = operation
+self._clear()
+
+def push(self, x):
+"""Push an NDArray from one of the devices.
+Input:
+x (NDArray)
+
+Output:
+idx (int), the output index
+"""
+with self.mutex:
+if self.push_tasks == 0:
+self._clear()
+self.list.append(x)
+idx = len(self.list) - 1
+self.push_tasks -= 1
+
+with self.all_tasks_done:
+if self.push_tasks == 0:
+self.all_tasks_done.notify_all()
+while self.push_tasks:
+self.all_tasks_done.wait()
+
+self._sync_op()
+return idx
+
+def pull(self, idx):
+"""Pull the output to each device
+Input:
+idx (int)
+
+Output:
+out (NDArray)
+"""
+return self.out[idx]
+
+def _sync_op(self):
+with self.mutex:
+if self.reduce_tasks == 1:
+assert(len(self.list) == self.counter)
+self.out = self.op(*self.list)
+if isinstance(self.out, (list, tuple)):
+for xi in self.out:
+xi.wait_to_read()
+else:
+self.out.wait_to_read()
+self.reduce_tasks -= 1
+else:
+self.reduce_tasks -= 1
+
+with self.all_tasks_done:
+if self.reduce_tasks == 0:
+self.all_tasks_done.notify_all()
+while self.reduce_tasks:
+self.all_tasks_done.wait()
+
+def _clear(self):
+self.list = []
+self.push_tasks = self.counter
+self.reduce_tasks = self.counter
+
+def __len__(self):
 
 Review comment:
   why is this needed?



[GitHub] piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-17 Thread GitBox

piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data 
Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189050743
 
 

 ##
 File path: docs/api/python/gluon/contrib.md
 ##
 @@ -80,6 +80,20 @@ In the rest of this document, we list routines provided by 
the `gluon.contrib` p
 WikiText103
 ```
 
+ Parallel
+
+```eval_rst
+.. currentmodule:: mxnet.gluon.parallel
+
+.. autosummary::
+:nosignatures:
+
+DataParallelModel
+DataParallelCriterion
 
 Review comment:
   
   I think we only need a DataParallel.



[GitHub] piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-17 Thread GitBox

piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data 
Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189051356
 
 

 ##
 File path: python/mxnet/gluon/contrib/parallel.py
 ##
 @@ -0,0 +1,343 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=broad-except, redefined-builtin
+"""Synchronized DataParallel"""
+import threading
+from ... import autograd
+from ...ndarray import NDArray
+from ..utils import split_and_load
+
+__all__ = ['DataParallelModel', 'DataParallelCriterion', 'Barrier']
+
+
+class Barrier(object):
+"""Shared NDArray for cross device operation.
+
+A cross-device operation that allows synchronized push and pull. It can be used in
+cross-GPU Synchronized Batch Normalization and Sparse Blocks.
+
+Parameters
+--
+counter : int
+Number of devices.
+operation : callable
+The cross-device operation to apply (e.g. AllReduce).
+"""
+def __init__(self, counter, operation):
+self.mutex = threading.Lock()
+self.all_tasks_done = threading.Condition(self.mutex)
+self.counter = counter
+self.op = operation
+self._clear()
+
+def push(self, x):
+"""Push an NDArray from one of the devices.
+Input:
+x (NDArray)
+
+Output:
+idx (int), the output index
+"""
+with self.mutex:
+if self.push_tasks == 0:
+self._clear()
+self.list.append(x)
+idx = len(self.list) - 1
+self.push_tasks -= 1
+
+with self.all_tasks_done:
+if self.push_tasks == 0:
+self.all_tasks_done.notify_all()
+while self.push_tasks:
+self.all_tasks_done.wait()
+
+self._sync_op()
+return idx
+
+def pull(self, idx):
+"""Pull the output to each device
+Input:
+idx (int)
+
+Output:
+out (NDArray)
+"""
+return self.out[idx]
+
+def _sync_op(self):
+with self.mutex:
+if self.reduce_tasks == 1:
+assert(len(self.list) == self.counter)
+self.out = self.op(*self.list)
+if isinstance(self.out, (list, tuple)):
+for xi in self.out:
+xi.wait_to_read()
+else:
+self.out.wait_to_read()
+self.reduce_tasks -= 1
+else:
+self.reduce_tasks -= 1
+
+with self.all_tasks_done:
+if self.reduce_tasks == 0:
+self.all_tasks_done.notify_all()
+while self.reduce_tasks:
+self.all_tasks_done.wait()
+
+def _clear(self):
+self.list = []
+self.push_tasks = self.counter
+self.reduce_tasks = self.counter
+
+def __len__(self):
+return len(self.list)
+
+def __repr__(self):
+return 'ParallelState'
 
 Review comment:
   Barrier
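
Reading the comment as a request that `__repr__` return `'Barrier'` instead of `'ParallelState'`, a minimal sketch could look like the following. Deriving the string from the class is an assumption on my part; a literal `'Barrier'` would satisfy the comment equally well.

```python
class Barrier(object):
    """Only __repr__ is shown; other methods are omitted in this sketch."""

    def __repr__(self):
        # Derive the name from the class so a future rename stays in sync.
        return type(self).__name__
```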



[GitHub] piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-17 Thread GitBox

piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data 
Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189055795
 
 

 ##
 File path: python/mxnet/gluon/contrib/parallel.py
 ##
 @@ -0,0 +1,343 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=broad-except, redefined-builtin
+"""Synchronized DataParallel"""
+import threading
+from ... import autograd
+from ...ndarray import NDArray
+from ..utils import split_and_load
+
+__all__ = ['DataParallelModel', 'DataParallelCriterion', 'Barrier']
+
+
+class Barrier(object):
+"""Shared NDArray for cross device operation.
+
+A cross-device operation that allows synchronized push and pull. It can be used in
+cross-GPU Synchronized Batch Normalization and Sparse Blocks.
+
+Parameters
+--
+counter : int
+Number of devices.
+operation : callable
+The cross-device operation to apply (e.g. AllReduce).
+"""
+def __init__(self, counter, operation):
+self.mutex = threading.Lock()
+self.all_tasks_done = threading.Condition(self.mutex)
+self.counter = counter
+self.op = operation
+self._clear()
+
+def push(self, x):
+"""Push a NDArray from one of the device.
+Input:
+x (NDArray)
+
+Output:
+idx (int), the output index
+"""
+with self.mutex:
+if self.push_tasks == 0:
+self._clear()
+self.list.append(x)
+idx = len(self.list) - 1
+self.push_tasks -= 1
+
+with self.all_tasks_done:
+if self.push_tasks == 0:
+self.all_tasks_done.notify_all()
+while self.push_tasks:
+self.all_tasks_done.wait()
+
+self._sync_op()
+return idx
+
+def pull(self, idx):
+"""Pull the output to each device
+Input:
+idx (int)
+
+Output:
+out (NDArray)
+"""
+return self.out[idx]
+
+def _sync_op(self):
+with self.mutex:
+if self.reduce_tasks == 1:
+assert(len(self.list) == self.counter)
+self.out = self.op(*self.list)
+if isinstance(self.out, (list, tuple)):
+for xi in self.out:
+xi.wait_to_read()
+else:
+self.out.wait_to_read()
+self.reduce_tasks -= 1
+else:
+self.reduce_tasks -= 1
+
+with self.all_tasks_done:
+if self.reduce_tasks == 0:
+self.all_tasks_done.notify_all()
+while self.reduce_tasks:
+self.all_tasks_done.wait()
+
+def _clear(self):
+self.list = []
+self.push_tasks = self.counter
+self.reduce_tasks = self.counter
+
+def __len__(self):
+return len(self.list)
+
+def __repr__(self):
+return 'ParallelState'
+
+
+class DataParallelModel(object):
+"""Data parallelism
+
+Hide the difference of single/multiple GPUs to the user.
+Inputs and outputs are both list of NDArrays in different contexts.
+In the forward pass, the module is replicated on each device,
+and each replica handles a portion of the input. During the backwards
+pass, gradients from each replica are summed into the original module.
+
+Parameters
+--
+module : object
+Network to be parallelized.
+ctx_list : list
+A list of contexts
+sync : bool
+enable synchronization (default: False).
+
+
+Inputs:
+- **inputs**: list of input (NDArrays)
+
+Outputs:
+- **outputs**: list of output (NDArrays)
+
+Example::
+>>> ctx = [mx.gpu(0), mx.gpu(1)]
+>>> net = DataParallelModel(model, ctx_list=ctx)
+>>> y = net(x)
+"""
+def __init__(self, module, ctx_list=None, sync=False):
+module.collect_params().reset_ctx(ctx=ctx_list)
+self.ctx_list = ctx_list
+self.module = module
+self.sync = sync
+
+def __call__(self, *inputs, **kwargs):
+if not self.ctx_list:
+

[GitHub] piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-17 Thread GitBox

piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data 
Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189056835
 
 

 ##
 File path: python/mxnet/gluon/contrib/parallel.py
 ##
 @@ -0,0 +1,343 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=broad-except, redefined-builtin
+"""Synchronized DataParallel"""
+import threading
+from ... import autograd
+from ...ndarray import NDArray
+from ..utils import split_and_load
+
+__all__ = ['DataParallelModel', 'DataParallelCriterion', 'Barrier']
+
+
+class Barrier(object):
+"""Shared NDArray for cross device operation.
+
+A cross device operation that allows synchronized push and pull. It can be 
used in
+Cross-gpu Sycnhronized Batch Normalization and Sparse Blocks.
+
+Parameters
+--
+counter : int
+Number of deivces.
+operation : callable
+The cross device operation is applying (e.g. AllReduce).
+"""
+def __init__(self, counter, operation):
+self.mutex = threading.Lock()
+self.all_tasks_done = threading.Condition(self.mutex)
+self.counter = counter
+self.op = operation
+self._clear()
+
+def push(self, x):
 
 Review comment:
   How about having a single wait function instead of push/pull?
   Are push and pull ever called separately?
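
One way to read this suggestion is a barrier with a single blocking call that both contributes a value and returns the reduced result. The sketch below only illustrates that reading. `SimpleBarrier`, its `wait` method, and `run_demo` are hypothetical names that do not appear in the PR, and plain Python numbers stand in for NDArrays.

```python
import threading


class SimpleBarrier(object):
    """Hypothetical barrier with one blocking call instead of push/pull."""

    def __init__(self, counter, operation):
        self.counter = counter        # number of participating threads
        self.op = operation           # callable applied to all contributed values
        self.cond = threading.Condition()
        self.values = []
        self.out = None
        self.generation = 0           # incremented once per completed round

    def wait(self, x):
        """Contribute x, block until all threads arrive, return op's result."""
        with self.cond:
            gen = self.generation
            self.values.append(x)
            if len(self.values) == self.counter:
                # The last arrival runs the operation once and wakes the others.
                self.out = self.op(*self.values)
                self.values = []
                self.generation += 1
                self.cond.notify_all()
            else:
                # Loop guards against spurious wakeups.
                while gen == self.generation:
                    self.cond.wait()
            return self.out


def run_demo(n=4):
    """Each of n threads contributes i + 1 and receives the shared sum."""
    barrier = SimpleBarrier(n, lambda *xs: sum(xs))
    results = [None] * n

    def worker(i):
        results[i] = barrier.wait(i + 1)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Unlike the `push`/`pull` pair in the diff, this sketch hands every caller the whole result of the operation, so no per-device index has to be passed around. Whether that fits the cross-device use case is left open here.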



[GitHub] piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-17 Thread GitBox

piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data 
Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189054371
 
 

 ##
 File path: python/mxnet/gluon/contrib/parallel.py
 ##
 @@ -0,0 +1,343 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=broad-except, redefined-builtin
+"""Synchronized DataParallel"""
+import threading
+from ... import autograd
+from ...ndarray import NDArray
+from ..utils import split_and_load
+
+__all__ = ['DataParallelModel', 'DataParallelCriterion', 'Barrier']
+
+
+class Barrier(object):
+"""Shared NDArray for cross device operation.
+
+A cross device operation that allows synchronized push and pull. It can be 
used in
+Cross-gpu Sycnhronized Batch Normalization and Sparse Blocks.
+
+Parameters
+--
+counter : int
+Number of deivces.
+operation : callable
+The cross device operation is applying (e.g. AllReduce).
+"""
+def __init__(self, counter, operation):
+self.mutex = threading.Lock()
+self.all_tasks_done = threading.Condition(self.mutex)
+self.counter = counter
+self.op = operation
+self._clear()
+
+def push(self, x):
+"""Push a NDArray from one of the device.
+Input:
 
 Review comment:
   The docstring format is wrong. It should use Parameters and Returns sections.
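
For illustration, a docstring in the Parameters/Returns layout might look like the sketch below. The class name `BarrierDocSketch` is hypothetical, the method body is omitted, and the descriptions are assumptions derived from the original Input/Output text.

```python
class BarrierDocSketch(object):
    """Hypothetical stand-in used only to show the docstring layout."""

    def push(self, x):
        """Push an NDArray from one of the devices.

        Parameters
        ----------
        x : NDArray
            The array contributed by the calling device.

        Returns
        -------
        idx : int
            The output index assigned to the calling device.
        """
        raise NotImplementedError("body omitted in this sketch")
```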



[GitHub] piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-17 Thread GitBox

piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data 
Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189054787
 
 

 ##
 File path: python/mxnet/gluon/contrib/parallel.py
 ##
 @@ -0,0 +1,343 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=broad-except, redefined-builtin
+"""Synchronized DataParallel"""
+import threading
+from ... import autograd
+from ...ndarray import NDArray
+from ..utils import split_and_load
+
+__all__ = ['DataParallelModel', 'DataParallelCriterion', 'Barrier']
+
+
+class Barrier(object):
+"""Shared NDArray for cross device operation.
+
+A cross device operation that allows synchronized push and pull. It can be 
used in
+Cross-gpu Sycnhronized Batch Normalization and Sparse Blocks.
+
+Parameters
+--
+counter : int
+Number of deivces.
+operation : callable
+The cross device operation is applying (e.g. AllReduce).
+"""
+def __init__(self, counter, operation):
+self.mutex = threading.Lock()
+self.all_tasks_done = threading.Condition(self.mutex)
+self.counter = counter
+self.op = operation
+self._clear()
+
+def push(self, x):
+"""Push a NDArray from one of the device.
+Input:
+x (NDArray)
+
+Output:
+idx (int), the output index
+"""
+with self.mutex:
+if self.push_tasks == 0:
+self._clear()
+self.list.append(x)
+idx = len(self.list) - 1
+self.push_tasks -= 1
+
+with self.all_tasks_done:
+if self.push_tasks == 0:
+self.all_tasks_done.notify_all()
+while self.push_tasks:
+self.all_tasks_done.wait()
+
+self._sync_op()
+return idx
+
+def pull(self, idx):
+"""Pull the output to each device
+Input:
+idx (int)
+
+Output:
+out (NDArray)
+"""
+return self.out[idx]
+
+def _sync_op(self):
+with self.mutex:
+if self.reduce_tasks == 1:
+assert(len(self.list) == self.counter)
+self.out = self.op(*self.list)
+if isinstance(self.out, (list, tuple)):
+for xi in self.out:
+xi.wait_to_read()
+else:
+self.out.wait_to_read()
+self.reduce_tasks -= 1
+else:
+self.reduce_tasks -= 1
+
+with self.all_tasks_done:
+if self.reduce_tasks == 0:
+self.all_tasks_done.notify_all()
+while self.reduce_tasks:
+self.all_tasks_done.wait()
+
+def _clear(self):
+self.list = []
+self.push_tasks = self.counter
+self.reduce_tasks = self.counter
+
+def __len__(self):
+return len(self.list)
+
+def __repr__(self):
+return 'ParallelState'
+
+
+class DataParallelModel(object):
+"""Data parallelism
+
+Hide the difference of single/multiple GPUs to the user.
+Inputs and outputs are both list of NDArrays in different contexts.
+In the forward pass, the module is replicated on each device,
+and each replica handles a portion of the input. During the backwards
+pass, gradients from each replica are summed into the original module.
+
+Parameters
+--
+module : object
 
 Review comment:
   Block?



[GitHub] piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-17 Thread GitBox

piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data 
Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189054240
 
 

 ##
 File path: python/mxnet/gluon/contrib/parallel.py
 ##
 @@ -0,0 +1,343 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=broad-except, redefined-builtin
+"""Synchronized DataParallel"""
+import threading
+from ... import autograd
+from ...ndarray import NDArray
+from ..utils import split_and_load
+
+__all__ = ['DataParallelModel', 'DataParallelCriterion', 'Barrier']
+
+
+class Barrier(object):
+"""Shared NDArray for cross device operation.
+
+A cross device operation that allows synchronized push and pull. It can be 
used in
+Cross-gpu Sycnhronized Batch Normalization and Sparse Blocks.
+
+Parameters
+--
+counter : int
+Number of deivces.
+operation : callable
+The cross device operation is applying (e.g. AllReduce).
+"""
+def __init__(self, counter, operation):
+self.mutex = threading.Lock()
+self.all_tasks_done = threading.Condition(self.mutex)
+self.counter = counter
+self.op = operation
+self._clear()
+
+def push(self, x):
+"""Push a NDArray from one of the device.
+Input:
+x (NDArray)
+
+Output:
+idx (int), the output index
+"""
+with self.mutex:
+if self.push_tasks == 0:
+self._clear()
+self.list.append(x)
+idx = len(self.list) - 1
+self.push_tasks -= 1
+
+with self.all_tasks_done:
+if self.push_tasks == 0:
+self.all_tasks_done.notify_all()
+while self.push_tasks:
+self.all_tasks_done.wait()
+
+self._sync_op()
+return idx
+
+def pull(self, idx):
 
 Review comment:
   Where does `idx` come from?



[GitHub] ThomasDelteil commented on issue #10953: Flaky test: test_tutorials.test_gluon_gluon

2018-05-17 Thread GitBox

ThomasDelteil commented on issue #10953: Flaky test: 
test_tutorials.test_gluon_gluon
URL: 
https://github.com/apache/incubator-mxnet/issues/10953#issuecomment-389958509
 
 
   @eric-haibin-lin a fix has been put in that adds a delay between restarts of 
the Jupyter kernel, which should help with that bug. I haven't been able to 
reproduce it locally after hundreds of runs. Consider closing for now.



[GitHub] szha closed issue #10846: flaky tests in tutorial CI

2018-05-17 Thread GitBox

szha closed issue #10846: flaky tests in tutorial CI
URL: https://github.com/apache/incubator-mxnet/issues/10846
 
 
   



[GitHub] szha closed issue #10953: Flaky test: test_tutorials.test_gluon_gluon

2018-05-17 Thread GitBox

szha closed issue #10953: Flaky test: test_tutorials.test_gluon_gluon
URL: https://github.com/apache/incubator-mxnet/issues/10953
 
 
   



[GitHub] ThomasDelteil commented on issue #10947: Flaky test: test_tutorials.test_onnx_inference_on_onnx_model

2018-05-17 Thread GitBox

ThomasDelteil commented on issue #10947: Flaky test: 
test_tutorials.test_onnx_inference_on_onnx_model
URL: 
https://github.com/apache/incubator-mxnet/issues/10947#issuecomment-389959040
 
 
   @KellenSunderland just wanted to check: what is the status of the retry logic 
in mx.test_utils.download()? That would be particularly helpful with that 
specific bug. 
   
   @eric-haibin-lin I have replaced all models that were 500MB+ with <50MB models, 
so the likelihood of download errors will decrease.
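
The kind of retry logic being asked about could look roughly like the sketch below. It is an assumption about the general shape only: `download_with_retry`, its parameters, and the `fetch` callable are hypothetical and do not describe how `mx.test_utils.download()` is actually implemented.

```python
import time


def download_with_retry(fetch, url, retries=3, delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying on any exception for up to `retries` attempts.

    `fetch` is any callable that downloads `url` and returns a result.
    `sleep` is injectable so the waiting can be skipped in tests.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return fetch(url)
        except Exception as err:  # broad on purpose: network errors vary
            last_error = err
            if attempt < retries:
                # Wait a little longer after each failed attempt.
                sleep(delay * attempt)
    raise last_error
```

Passing the download function in as `fetch` keeps the sketch independent of any particular HTTP library. A real version would probably catch a narrower set of exceptions.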



[GitHub] ThomasDelteil commented on issue #10846: flaky tests in tutorial CI

2018-05-17 Thread GitBox

ThomasDelteil commented on issue #10846: flaky tests in tutorial CI
URL: 
https://github.com/apache/incubator-mxnet/issues/10846#issuecomment-389959374
 
 
   @zheng-da a fix has been put in to reduce the size of the models from 500MB 
to <50MB for the tutorials. That should prevent this bug from happening in the 
future. Please consider closing for now.



[GitHub] ThomasDelteil commented on issue #10947: Flaky test: test_tutorials.test_onnx_inference_on_onnx_model

2018-05-17 Thread GitBox

ThomasDelteil commented on issue #10947: Flaky test: 
test_tutorials.test_onnx_inference_on_onnx_model
URL: 
https://github.com/apache/incubator-mxnet/issues/10947#issuecomment-389959040
 
 
   @KellenSunderland just wanted to check: what is the status of the retry logic 
in mx.test_utils.download()? That would help with that particular bug. 
   
   @eric-haibin-lin I have replaced all models that were 500MB+ with <50MB models, 
so the likelihood of download errors will decrease.



[GitHub] haojin2 commented on a change in pull request #10780: [MXNET-375] Lp Pooling and Global Lp Pooling

2018-05-17 Thread GitBox

haojin2 commented on a change in pull request #10780: [MXNET-375] Lp Pooling 
and Global Lp Pooling
URL: https://github.com/apache/incubator-mxnet/pull/10780#discussion_r189051853
 
 

 ##
 File path: src/operator/nn/pooling.cc
 ##
 @@ -92,6 +92,9 @@ static bool PoolingShape(const nnvm::NodeAttrs &attrs,
  std::vector<TShape> *out_shape) {
   const PoolingParam &param = nnvm::get<PoolingParam>(attrs.parsed);
   CHECK_EQ(in_shape->size(), 1U);
+  if (param.pool_type == pool_enum::kLpPooling) {
+CHECK(param.p_value.has_value());
 
 Review comment:
   We'd better not do this, as it may affect Jun's work. If we make p_value 
default to 1 instead of being optional, the generated JSON file for this symbol 
would contain an extra field.



[GitHub] piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-17 Thread GitBox

piiswrong commented on a change in pull request #10536: [MXNET-317] Add Data 
Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189050649
 
 

 ##
 File path: docs/api/python/gluon/contrib.md
 ##
 @@ -80,6 +80,20 @@ In the rest of this document, we list routines provided by 
the `gluon.contrib` p
 WikiText103
 ```
 
+ Parallel
+
+```eval_rst
+.. currentmodule:: mxnet.gluon.parallel
+
+.. autosummary::
+:nosignatures:
+
+DataParallelModel
+DataParallelCriterion
 
 Review comment:
   I think we just need a DataParallel.



[GitHub] piiswrong commented on a change in pull request #10989: [WIP] add gluon model summary

2018-05-17 Thread GitBox

piiswrong commented on a change in pull request #10989: [WIP] add gluon model 
summary
URL: https://github.com/apache/incubator-mxnet/pull/10989#discussion_r189050120
 
 

 ##
 File path: python/mxnet/gluon/block.py
 ##
 @@ -355,14 +357,68 @@ def load_params(self, filename, ctx=None, 
allow_missing=False,
 name, filename, 
_brief_print_list(self._params.keys(
 params[name]._load_init(loaded[name], ctx)
 
-
 def register_child(self, block, name=None):
 """Registers block as a child of self. :py:class:`Block` s assigned to 
self as
 attributes will be registered automatically."""
 if name is None:
 name = str(len(self._children))
 self._children[name] = block
 
+def register_forward_pre_hook(self, hook):
+r"""Registers a forward pre-hook on the block.
+
+The hook function is called immediately before :func:`forward`.
+It should not modify the input or output.
+
+Parameters
+--
+hook : callable
+The forward hook function of form `hook(block, input) -> None`.
+
+Returns
+---
+:class:`mxnet.gluon.utils.RemovableHandle`
+"""
+handle = RemovableHandle(self._forward_pre_hooks)
 
 Review comment:
   How about the following design:
   ```
   handle = RemovableHandle()
   handle.register(self._forward_pre_hooks, hook)
   return handle
   
   
   handle.remove()
   ```
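
A minimal sketch of the handle design proposed above, assuming the hooks live in a dict keyed by a unique id. Only the `register(hooks, hook)` and `remove()` call shapes come from the comment. The class body, the id counter, and the no-op behaviour on repeated removal are hypothetical.

```python
import itertools


class RemovableHandle(object):
    """Hypothetical handle that registers a hook and can later remove it."""

    _next_id = itertools.count()  # shared counter so ids never collide

    def __init__(self):
        self._hooks_dict = None
        self._id = None

    def register(self, hooks_dict, hook):
        # Store the hook under a fresh id and remember where it lives.
        self._id = next(RemovableHandle._next_id)
        self._hooks_dict = hooks_dict
        hooks_dict[self._id] = hook

    def remove(self):
        # Removing twice, or before registering, is a no-op.
        if self._hooks_dict is not None:
            self._hooks_dict.pop(self._id, None)
            self._hooks_dict = None
```

With this shape the block only needs to own the hooks dict. The handle carries everything required to undo the registration.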



[GitHub] piiswrong commented on a change in pull request #10970: [MXNET-424] dtype option for multinomial

2018-05-17 Thread GitBox

piiswrong commented on a change in pull request #10970: [MXNET-424] dtype 
option for multinomial
URL: https://github.com/apache/incubator-mxnet/pull/10970#discussion_r189047791
 
 

 ##
 File path: src/operator/random/sample_multinomial_op.h
 ##
 @@ -155,9 +158,11 @@ void SampleMultinomialForward(const nnvm::NodeAttrs& 
attrs,
 Tensor uniform =
   ctx.requested[1].get_space_typed(Shape1(N*M), s);
 prnd->SampleUniform(, 0, 1);
-Kernel::Launch(
-  s, N, K, M, inputs[0].dptr(), uniform.dptr_, 
outputs[0].dptr(),
-  param.get_prob ? outputs[1].dptr() : nullptr);
+MSHADOW_TYPE_SWITCH(outputs[0].type_flag_, IType, {
 
 Review comment:
   This kind of two-layer switch is very slow to compile. Why do we need type 
support for the output?



[GitHub] piiswrong closed pull request #10978: removed script tags that are being interpreted as html

2018-05-17 Thread GitBox

piiswrong closed pull request #10978: removed script tags that are being 
interpreted as html
URL: https://github.com/apache/incubator-mxnet/pull/10978
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/docs/tutorials/scala/mxnet_scala_on_intellij.md 
b/docs/tutorials/scala/mxnet_scala_on_intellij.md
index c93d99c4504..eff8cc88294 100644
--- a/docs/tutorials/scala/mxnet_scala_on_intellij.md
+++ b/docs/tutorials/scala/mxnet_scala_on_intellij.md
@@ -132,7 +132,7 @@ After clicking Finish, you will be presented with the 
project's first view.
 The project's `pom.xml` will be open for editing.
 
 **Step 3.** Setup project properties:
-  - Specify project properties in `pom.xml` by pasting the following content 
in the `` tag. You will be overwriting the `` tag in 
the process, upgrading from `2.11.5` to `2.11.8`.
+  - Specify project properties in `pom.xml` by pasting the following content 
in the `properties` tag. You will be overwriting the `scala.version` tag in the 
process, upgrading from `2.11.5` to `2.11.8`.
 
 ```xml
 
@@ -143,7 +143,7 @@ The project's `pom.xml` will be open for editing.
 
 **Step 4.** Setup project profiles and platforms:
 
-  - Specify project profiles and platforms in `pom.xml` by pasting the 
following content below the `` tag:
+  - Specify project profiles and platforms in `pom.xml` by pasting the 
following content below the closing `properties` tag:
 
 ```xml
 
@@ -170,7 +170,7 @@ The project's `pom.xml` will be open for editing.
 
 **Step 5.** Setup project dependencies:
 
-  - Specify project dependencies in `pom.xml` adding the dependencies listed 
below. Place them inside the `` tag:
+  - Specify project dependencies in `pom.xml` adding the dependencies listed 
below. Place them inside the `dependencies` tag:
 
 ```xml
 
@@ -233,7 +233,7 @@ The project's `pom.xml` will be open for editing.
 
 ![project 
2](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/scala/intellij-project-2.png)
 
-Note the `` tag and update it to match the file path to the jar 
file that was created when you built the MXNet-Scala package. It can be found 
in the `mxnet-incubator/scala-package/assembly/{platform}/target` directory, 
and is named with the pattern 
`mxnet-full_${scala.binary.version}-${platform}-{version-SNAPSHOT}.jar`.
+Note the `systemPath` tag and update it to match the file path to the jar file 
that was created when you built the MXNet-Scala package. It can be found in the 
`mxnet-incubator/scala-package/assembly/{platform}/target` directory, and is 
named with the pattern 
`mxnet-full_${scala.binary.version}-${platform}-{version-SNAPSHOT}.jar`.
 
 **Step 6.** Import dependencies with Maven:
 
@@ -246,7 +246,7 @@ Click "Import Changes" in this prompt.
 **Step 7.** Build the project:
 - To build the project, from the menu choose Build, and then choose Build 
Project.
 
-**Note**: During the build you may experience `[ERROR] scalac error: bad 
option: '-make:transitive'`. You can fix this by deleting or commenting this 
out in your `pom.xml`. This line in question is: `-make:transitive`.
+**Note**: During the build you may experience `[ERROR] scalac error: bad 
option: '-make:transitive'`. You can fix this by deleting or commenting this 
out in your `pom.xml`. This line in question is the `arg` tag, and it should 
contain: `-make:transitive`.
 
 **Step 8.** Run the Hello World App:
 


 



[GitHub] szha opened a new pull request #10989: [WIP] add gluon model summary

2018-05-17 Thread GitBox

szha opened a new pull request #10989: [WIP] add gluon model summary
URL: https://github.com/apache/incubator-mxnet/pull/10989
 
 
   ## Description ##
   add hook/pre-hook to gluon block and model summary
   
   ## Checklist ##
   ### Essentials ###
   Please feel free to remove inapplicable items for your PR.
   - [ ] Changes are complete (i.e. I finished coding on this PR)
   - [x] All changes have test coverage:
   - Unit tests are added for small changes to verify correctness (e.g. adding 
a new operator)
   - [x] Code is well-documented: 
   - For user-facing API changes, API doc string has been updated. 
   - Check the API doc at 
http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
   - [x] To the best of my knowledge, examples are either not affected by this 
change, or have been fixed to be compatible with this change
   
   ### Changes ###
   - [x] add `register_forward_hook`, `register_forward_pre_hook`, `apply`
   - [ ] add model summary



[incubator-mxnet] branch master updated: Add Apachev2 License for contrib (#10938)

2018-05-17 Thread jxie

This is an automated email from the ASF dual-hosted git repository.

jxie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new 034f24f  Add Apachev2 License for contrib (#10938)
034f24f is described below

commit 034f24f7b3a03deba063ff09ac8a5a2000a8e410
Author: Anirudh Subramanian 
AuthorDate: Thu May 17 10:52:16 2018 -0700

Add Apachev2 License for contrib (#10938)

* Add Apachev2 License for contrib

* Add copyright back
---
 src/operator/contrib/ctc_include/detail/cpu_ctc.h| 19 +++
 src/operator/contrib/ctc_include/detail/ctc_helper.h | 19 +++
 src/operator/contrib/ctc_include/detail/gpu_ctc.h| 19 +++
 .../contrib/ctc_include/detail/gpu_ctc_kernels.h | 19 +++
 src/operator/contrib/ctc_include/detail/hostdevice.h | 20 
 src/operator/contrib/psroi_pooling-inl.h | 20 +++-
 tests/nightly/apache_rat_license_check/.rat-excludes |  1 -
 7 files changed, 115 insertions(+), 2 deletions(-)

diff --git a/src/operator/contrib/ctc_include/detail/cpu_ctc.h 
b/src/operator/contrib/ctc_include/detail/cpu_ctc.h
index ba8bbc5..005b956 100644
--- a/src/operator/contrib/ctc_include/detail/cpu_ctc.h
+++ b/src/operator/contrib/ctc_include/detail/cpu_ctc.h
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 #pragma once
 
 #include 
diff --git a/src/operator/contrib/ctc_include/detail/ctc_helper.h 
b/src/operator/contrib/ctc_include/detail/ctc_helper.h
index 35b7a96..250188c 100644
--- a/src/operator/contrib/ctc_include/detail/ctc_helper.h
+++ b/src/operator/contrib/ctc_include/detail/ctc_helper.h
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 #pragma once
 
 #include 
diff --git a/src/operator/contrib/ctc_include/detail/gpu_ctc.h b/src/operator/contrib/ctc_include/detail/gpu_ctc.h
index c249046..8015b39 100644
--- a/src/operator/contrib/ctc_include/detail/gpu_ctc.h
+++ b/src/operator/contrib/ctc_include/detail/gpu_ctc.h
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 #pragma once
 
 
diff --git a/src/operator/contrib/ctc_include/detail/gpu_ctc_kernels.h b/src/operator/contrib/ctc_include/detail/gpu_ctc_kernels.h
index 7f53232..c9bc202 100644
--- a/src/operator/contrib/ctc_include/detail/gpu_ctc_kernels.h
+++ b/src/operator/contrib/ctc_include/detail/gpu_ctc_kernels.h
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license

[GitHub] piiswrong closed pull request #10938: Add Apachev2 License for contrib

2018-05-17 Thread GitBox

piiswrong closed pull request #10938: Add Apachev2 License for contrib
URL: https://github.com/apache/incubator-mxnet/pull/10938
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/src/operator/contrib/ctc_include/detail/cpu_ctc.h b/src/operator/contrib/ctc_include/detail/cpu_ctc.h
index ba8bbc558f0..005b956343d 100644
--- a/src/operator/contrib/ctc_include/detail/cpu_ctc.h
+++ b/src/operator/contrib/ctc_include/detail/cpu_ctc.h
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 #pragma once
 
 #include 
diff --git a/src/operator/contrib/ctc_include/detail/ctc_helper.h b/src/operator/contrib/ctc_include/detail/ctc_helper.h
index 35b7a960149..250188c697c 100644
--- a/src/operator/contrib/ctc_include/detail/ctc_helper.h
+++ b/src/operator/contrib/ctc_include/detail/ctc_helper.h
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 #pragma once
 
 #include 
diff --git a/src/operator/contrib/ctc_include/detail/gpu_ctc.h b/src/operator/contrib/ctc_include/detail/gpu_ctc.h
index c249046424e..8015b39c437 100644
--- a/src/operator/contrib/ctc_include/detail/gpu_ctc.h
+++ b/src/operator/contrib/ctc_include/detail/gpu_ctc.h
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 #pragma once
 
 
diff --git a/src/operator/contrib/ctc_include/detail/gpu_ctc_kernels.h b/src/operator/contrib/ctc_include/detail/gpu_ctc_kernels.h
index 7f53232f871..c9bc2026efb 100644
--- a/src/operator/contrib/ctc_include/detail/gpu_ctc_kernels.h
+++ b/src/operator/contrib/ctc_include/detail/gpu_ctc_kernels.h
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 #pragma once
 
 #include

[incubator-mxnet] branch master updated: Minor doc fixes (#10963)

2018-05-17 Thread jxie

This is an automated email from the ASF dual-hosted git repository.

jxie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new ed846c9  Minor doc fixes (#10963)
ed846c9 is described below

commit ed846c963e17b3dd9b6ac58011235e0f89772232
Author: Haibin Lin 
AuthorDate: Thu May 17 10:48:48 2018 -0700

Minor doc fixes (#10963)

* Update dot.cc

* Update cast_storage.cc

* Update dot.cc

* Update init_op.cc

* Update dot.cc
---
 src/operator/tensor/cast_storage.cc | 2 +-
 src/operator/tensor/dot.cc          | 3 ++-
 src/operator/tensor/init_op.cc      | 2 +-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/operator/tensor/cast_storage.cc b/src/operator/tensor/cast_storage.cc
index f77a50a..afea9b8 100644
--- a/src/operator/tensor/cast_storage.cc
+++ b/src/operator/tensor/cast_storage.cc
@@ -31,7 +31,7 @@ namespace op {
 
 DMLC_REGISTER_PARAMETER(CastStorageParam);
 NNVM_REGISTER_OP(cast_storage)
-.add_alias("_sparse_cast_storage")
+MXNET_ADD_SPARSE_OP_ALIAS(cast_storage)
 .describe(R"code(Casts tensor storage type to the new type.
 
 When an NDArray with default storage type is cast to csr or row_sparse storage,
diff --git a/src/operator/tensor/dot.cc b/src/operator/tensor/dot.cc
index 11a8b27..2f44f53 100644
--- a/src/operator/tensor/dot.cc
+++ b/src/operator/tensor/dot.cc
@@ -29,7 +29,7 @@ namespace op {
 DMLC_REGISTER_PARAMETER(DotParam);
 
 NNVM_REGISTER_OP(dot)
-.add_alias("_sparse_dot")  // alias for op registration under mxnet.ndarray.sparse
+MXNET_ADD_SPARSE_OP_ALIAS(dot)
 .describe(R"doc(Dot product of two arrays.
 
 ``dot``'s behavior depends on the input array dimensions:
@@ -57,6 +57,7 @@ forward_stype option for output storage type. Implemented sparse operations incl
 - dot(default, default, transpose_a=True/False, transpose_b=True/False) = default
 - dot(csr, default, transpose_a=True) = default
 - dot(csr, default, transpose_a=True) = row_sparse
+- dot(csr, default) = default
 - dot(csr, row_sparse) = default
 - dot(default, csr) = csr (CPU only)
 - dot(default, csr, forward_stype='default') = default (GPU only)
diff --git a/src/operator/tensor/init_op.cc b/src/operator/tensor/init_op.cc
index 52cb9f2..bb23f5d 100644
--- a/src/operator/tensor/init_op.cc
+++ b/src/operator/tensor/init_op.cc
@@ -87,7 +87,7 @@ NNVM_REGISTER_OP(_arange)
 .add_arguments(RangeParam::__FIELDS__());
 
 NNVM_REGISTER_OP(zeros_like)
-.add_alias("_sparse_zeros_like")
+MXNET_ADD_SPARSE_OP_ALIAS(zeros_like)
 .describe(R"code(Return an array of zeros with the same shape, type and storage type
 as the input array.
 

-- 
To stop receiving notification emails like this one, please contact
j...@apache.org.
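
The three hunks above replace a hand-written `.add_alias("_sparse_...")` call with `MXNET_ADD_SPARSE_OP_ALIAS(...)`. The diff does not show the macro's definition, so what follows is only a sketch of how a macro of this kind could work, assuming it relies on preprocessor stringification to build the `_sparse_`-prefixed alias. `ToyOp`, `TOY_ADD_SPARSE_OP_ALIAS`, and `make_dot_op` are hypothetical names made up for illustration; they are not MXNet or NNVM APIs.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an operator registry entry (not the NNVM Op class).
struct ToyOp {
  std::string name;
  std::vector<std::string> aliases;

  explicit ToyOp(std::string n) : name(std::move(n)) {}

  // Returns *this so calls can be chained, mirroring the builder style in the diff.
  ToyOp& add_alias(const std::string& alias) {
    aliases.push_back(alias);
    return *this;
  }
};

// Hypothetical macro: expands to a chained .add_alias("_sparse_<name>") call.
// #op_name stringifies the argument, and the two adjacent string literals
// are concatenated by the compiler.
#define TOY_ADD_SPARSE_OP_ALIAS(op_name) .add_alias("_sparse_" #op_name)

// Builds a toy "dot" entry and registers its sparse alias through the macro.
ToyOp make_dot_op() {
  ToyOp op("dot");
  op TOY_ADD_SPARSE_OP_ALIAS(dot);  // expands to: op .add_alias("_sparse_" "dot");
  return op;
}
```

Under these assumptions, `make_dot_op()` returns an entry whose only alias is `_sparse_dot`, the same string the removed line spelled out by hand. Deriving the alias from the operator name keeps the prefix in one place and avoids typos in repeated literals.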
