[GitHub] liuzx32 opened a new issue #11020: mxnet on yarn throw the OSError, how can I solve it.

2018-05-21 Thread GitBox
liuzx32 opened a new issue #11020: mxnet on yarn throw the OSError, how can I 
solve it.
URL: https://github.com/apache/incubator-mxnet/issues/11020
 
 
 File "/usr/lib64/python2.7/ctypes/__init__.py", line 357, in __init__
   self._handle = _dlopen(self._name, mode)
   OSError: libopenblas.so.0: cannot open shared object file: No such file or 
directory
   The problem shows that there is no OpenBLAS on the EC2 instances. Does that mean I need to install OpenBLAS on every EC2 node in the YARN cluster? Is there any other solution?
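
   One workaround (a sketch, not an official fix) is to ship an OpenBLAS copy with the YARN job and pre-load it before importing mxnet, so that libmxnet.so can resolve its libopenblas.so.0 dependency without a system-wide install; the fallback path below is an assumption for illustration:
   ```
   import ctypes

   # Try the system OpenBLAS first; fall back to a copy shipped with the job.
   # Pre-loading with RTLD_GLOBAL lets libmxnet.so reuse the already-loaded
   # library instead of searching the system paths again.
   try:
       ctypes.CDLL("libopenblas.so.0", mode=ctypes.RTLD_GLOBAL)
   except OSError:
       ctypes.CDLL("./openblas/libopenblas.so.0", mode=ctypes.RTLD_GLOBAL)

   import mxnet as mx  # import only after the dependency is resolvable
   ```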




[GitHub] anirudh2290 commented on issue #10895: [MXNET-413] Fixing the broken links - Week of 5/7

2018-05-21 Thread GitBox
anirudh2290 commented on issue #10895: [MXNET-413] Fixing the broken links - 
Week of 5/7
URL: https://github.com/apache/incubator-mxnet/pull/10895#issuecomment-390870258
 
 
   @kpmurali since we have SHA-512, I think it's okay to remove the MD5 column. And you can move 1.1.0 to the archive link.




[GitHub] nttstar opened a new issue #11019: Gluon SoftmaxCELoss does not converge, for large number of classes.

2018-05-21 Thread GitBox
nttstar opened a new issue #11019: Gluon SoftmaxCELoss does not converge, for 
large number of classes.
URL: https://github.com/apache/incubator-mxnet/issues/11019
 
 
   
   ## Description
   Training with 85K classes fails when I use the Gluon trainer with SoftmaxCELoss, but it works if I define the same network in Gluon and train it through the symbolic Module interface (sym.SoftmaxOutput).
   
   ## Error Message:
   Training accuracy climbs from 0.0 to about 0.001, but then drops back to 0.0 after about 1K iterations.
   
   ## Steps to reproduce
   
   
   1. Check out the latest insightface repo (https://github.com/deepinsight/insightface).
   2. Download the ms1m dataset from the repo and unzip it to ./faces_ms1m.
   3. Run the training script ``insightface/gluon/train.py``; you can see the training accuracy change at about 1.5K iterations. The validation process starts every 2K iterations, depending on the --verbose param.
   
   The below command works fine:
   ```
   CUDA_VISIBLE_DEVICES='0,1,2,3' python -u train.py --data-dir ./faces_ms1m 
--network r18 --prefix ./model-r18-test --per-batch-size 128 --lr-steps 
'1,2,3000' --lr 0.1 --ckpt 0 --verbose 2000 --wd 0.0005 --margin-a 0.0 
--eval lfw --mode symbol
   ```
   
   The below command does not converge:
   ```
   CUDA_VISIBLE_DEVICES='0,1,2,3' python -u train.py --data-dir ./faces_ms1m 
--network r18 --prefix ./model-r18-test --per-batch-size 128 --lr-steps 
'1,2,3000' --lr 0.1 --ckpt 0 --verbose 2000 --wd 0.0005 --margin-a 0.0 
--eval lfw --mode gluon
   ```
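
   For reference, a minimal, self-contained sketch of the Gluon path described above (SoftmaxCELoss + gluon.Trainer), shrunk to toy sizes; the real setup uses an r18 network and 85K classes:
   ```
   import mxnet as mx
   from mxnet import gluon, autograd

   num_classes, batch_size = 100, 8
   net = gluon.nn.Dense(num_classes)
   net.initialize()
   loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
   trainer = gluon.Trainer(net.collect_params(), 'sgd',
                           {'learning_rate': 0.1, 'wd': 0.0005})

   data = mx.nd.random.normal(shape=(batch_size, 512))
   label = mx.nd.array([i % num_classes for i in range(batch_size)])
   with autograd.record():
       loss = loss_fn(net(data), label)  # the path that fails to converge at 85K classes
   loss.backward()
   trainer.step(batch_size)
   ```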
   
   
   
   ## Environment info (Required)
   
   --Python Info--
   ('Version  :', '2.7.5')
   ('Compiler :', 'GCC 4.8.5 20150623 (Red Hat 4.8.5-16)')
   ('Build:', ('default', 'Aug  4 2017 00:39:18'))
   ('Arch :', ('64bit', 'ELF'))
   --Pip Info---
   ('Version  :', '9.0.2')
   ('Directory:', '/usr/lib/python2.7/site-packages/pip')
   --MXNet Info---
   ('Version  :', '1.2.0')
   ('Directory:', '/usr/lib/python2.7/site-packages/mxnet')
   ('Commit Hash   :', 'f0be910ae5e3fa01e0a9aaf98dbd4616c35be76b')
   --System Info--
   ('Platform :', 
'Linux-3.10.0-327.el7.x86_64-x86_64-with-centos-7.4.1708-Core')
   ('system   :', 'Linux')
   ('node :', 'cdsl-gpu-a04')
   ('release  :', '3.10.0-327.el7.x86_64')
   ('version  :', '#1 SMP Thu Nov 19 22:10:57 UTC 2015')
   --Hardware Info--
   ('machine  :', 'x86_64')
   ('processor:', 'x86_64')
   Architecture:  x86_64
   CPU op-mode(s):32-bit, 64-bit
   Byte Order:Little Endian
   CPU(s):48
   On-line CPU(s) list:   0-47
   Thread(s) per core:2
   Core(s) per socket:12
   Socket(s): 2
   NUMA node(s):  2
   Vendor ID: GenuineIntel
   CPU family:6
   Model: 79
   Model name:Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
   Stepping:  1
   CPU MHz:   2199.914
   CPU max MHz:   2900.
   CPU min MHz:   1200.
   BogoMIPS:  4400.12
   Virtualization:VT-x
   L1d cache: 32K
   L1i cache: 32K
   L2 cache:  256K
   L3 cache:  30720K
   NUMA node0 CPU(s): 0-11,24-35
   NUMA node1 CPU(s): 12-23,36-47
   Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge 
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx 
pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology 
nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est 
tm2 ssse3 fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt 
tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch ida arat 
epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust 
bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdseed adx smap xsaveopt cqm_llc 
cqm_occup_llc
   --Network Test--
   Setting timeout: 10
   Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0067 
sec, LOAD: 2.4557 sec.
   Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0062 sec, LOAD: 
1.8693 sec.
   Timing for FashionMNIST: 
https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz,
 DNS: 0.2143 sec, LOAD: 2.3253 sec.
   Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0067 sec, 
LOAD: 1.1611 sec.
   Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.5429 sec, LOAD: 
2.8609 sec.
   Timing for Gluon Tutorial(cn): https://zh.gluon.ai, DNS: 0.0070 sec, LOAD: 
1.6273 sec.
   
   



[GitHub] szha commented on a change in pull request #10957: Make inner transform activation configurable for LSTMCell

2018-05-21 Thread GitBox
szha commented on a change in pull request #10957: Make inner transform 
activation configurable for LSTMCell
URL: https://github.com/apache/incubator-mxnet/pull/10957#discussion_r189778302
 
 

 ##
 File path: python/mxnet/gluon/rnn/rnn_cell.py
 ##
 @@ -255,8 +256,7 @@ def _get_activation(self, F, inputs, activation, **kwargs):
 return F.Activation(inputs, act_type=activation, **kwargs)
 
 Review comment:
   For string-typed activations, map the string to the most efficient operator. For example, if the string is 'tanh', instead of doing `F.Activation(act_type='tanh')`, use `F.tanh`, which doesn't require parsing the string on each call.
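
   A minimal sketch of that dispatch (illustrative only, not the PR's final code; the set of fast ops is an assumption):
   ```
   def _get_activation(self, F, inputs, activation, **kwargs):
       """Apply an activation, dispatching known names to dedicated operators."""
       if isinstance(activation, str):
           # Dedicated operators avoid re-parsing the act_type string per call.
           fast_ops = {'tanh': F.tanh, 'sigmoid': F.sigmoid, 'relu': F.relu}
           if activation in fast_ops:
               return fast_ops[activation](inputs, **kwargs)
           return F.Activation(inputs, act_type=activation, **kwargs)
       return activation(inputs, **kwargs)
   ```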




[GitHub] szha commented on issue #10951: Fix broken build with cython 0.28

2018-05-21 Thread GitBox
szha commented on issue #10951: Fix broken build with cython 0.28
URL: https://github.com/apache/incubator-mxnet/pull/10951#issuecomment-390858066
 
 
   Thanks. @marcoabreu do you have a recommendation on how to integrate the tests for cython (e.g. whether to integrate into existing environments or to add a new environment)?




[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.

2018-05-21 Thread zhasheng
This is an automated email from the ASF dual-hosted git repository.

zhasheng pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 9c54d0f  Bump the publish timestamp.
9c54d0f is described below

commit 9c54d0f283c93a43134795ab19326fec0c9deb2a
Author: mxnet-ci 
AuthorDate: Tue May 22 02:01:30 2018 +0000

Bump the publish timestamp.
---
 date.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/date.txt b/date.txt
new file mode 100644
index 0000000..ada68ea
--- /dev/null
+++ b/date.txt
@@ -0,0 +1 @@
+Tue May 22 02:01:30 UTC 2018



[incubator-mxnet-site] branch asf-site updated: Nightly build

2018-05-21 Thread zhasheng
This is an automated email from the ASF dual-hosted git repository.

zhasheng pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new e40e755  Nightly build
e40e755 is described below

commit e40e755705b514f79fb10caedb1479b1d57d7842
Author: mxnet-ci 
AuthorDate: Tue May 22 02:01:24 2018 +0000

Nightly build
---
 date.txt | 1 -
 1 file changed, 1 deletion(-)

diff --git a/date.txt b/date.txt
deleted file mode 100644
index f90d706..0000000
--- a/date.txt
+++ /dev/null
@@ -1 +0,0 @@
-Mon May 21 23:26:55 UTC 2018



[GitHub] juliusshufan commented on issue #10922: Fix Python error on --model-prefix parameter used and make the param can be periodically saved

2018-05-21 Thread GitBox
juliusshufan commented on issue #10922: Fix Python error on --model-prefix 
parameter used and make the param can be periodically saved
URL: https://github.com/apache/incubator-mxnet/pull/10922#issuecomment-390838822
 
 
   @zhreshold thanks for the review, I modified it accordingly.




[GitHub] lihaofd commented on a change in pull request #10311: [MXNET-107]Fused GRU implementation for CPU

2018-05-21 Thread GitBox
lihaofd commented on a change in pull request #10311: [MXNET-107]Fused GRU 
implementation for CPU
URL: https://github.com/apache/incubator-mxnet/pull/10311#discussion_r189752108
 
 

 ##
 File path: src/operator/rnn_impl.h
 ##
 @@ -40,6 +40,9 @@
 #include "./mshadow_op.h"
 #include "./linalg.h"
 
+#define UNIDIRECT 1
 
 Review comment:
   Removed the #define, thanks!




[GitHub] zheng-da commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
zheng-da commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189735932
 
 

 ##
 File path: src/operator/nn/control_flow.cc
 ##
 @@ -0,0 +1,532 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include <mxnet/ndarray.h>
+#include <mxnet/operator.h>
+#include <nnvm/graph.h>
+#include <vector>
+#include <string>
+#include <utility>
+#include <unordered_map>
+#include "../operator_common.h"
+#include "../elemwise_op_common.h"
+#include "../../imperative/imperative_utils.h"
+#include "./subgraph_op_common.h"
+
+namespace mxnet {
+namespace op {
+
+struct ForeachParam : public dmlc::Parameter<ForeachParam> {
+  int num_args;
+  int dim;
+  int num_outputs;
+  int num_out_data;
+  nnvm::Tuple<dim_t> in_state_locs;
+  nnvm::Tuple<dim_t> in_data_locs;
+  DMLC_DECLARE_PARAMETER(ForeachParam) {
+    DMLC_DECLARE_FIELD(num_args).set_lower_bound(1)
+    .describe("Number of inputs.");
+    DMLC_DECLARE_FIELD(dim).set_default(1)
+    .describe("the dimension of the input array to iterate.");
+    DMLC_DECLARE_FIELD(num_outputs)
+    .describe("The number of outputs of the subgraph.");
+    DMLC_DECLARE_FIELD(num_out_data)
+    .describe("The number of output data of the subgraph.");
+    DMLC_DECLARE_FIELD(in_state_locs)
+    .describe("The locations of loop states among the inputs.");
+    DMLC_DECLARE_FIELD(in_data_locs)
+    .describe("The locations of input data among the inputs.");
+  }
+};  // struct ForeachParam
+
+DMLC_REGISTER_PARAMETER(ForeachParam);
+
+class ForeachState {
+  // These are output arrays from all iterations.
+  // They also contain the Op state for each CachedOp.
+  std::vector<std::vector<NDArray> > all_outputs;
+  std::vector<std::vector<NDArray> > all_inputs;
+  std::vector<std::vector<NDArray> > all_gradients;
+  std::vector<CachedOpPtr> iter_ops;
+
+ public:
+  Symbol subgraph_sym;
+  nnvm::Graph subgraph;
+  ForeachParam params;
+
+  ForeachState(const Symbol &g, const ForeachParam &params) {
+    this->subgraph_sym = g;
+    this->subgraph.outputs = g.outputs;
+    this->params = params;
+  }
+
+  void Forward(std::vector<NDArray> cinputs,
+               const std::vector<OpReqType>& req,
+               std::vector<NDArray> coutputs, bool is_recording);
+  void Backward(int iter_no, std::vector<NDArray> ograds,
+                const std::vector<OpReqType> &req,
+                std::vector<NDArray> igrads);
+  void Cleanup() {
+    all_outputs.clear();
+    all_inputs.clear();
+    all_gradients.clear();
+    iter_ops.clear();
+  }
+};
+
+void ForeachState::Forward(std::vector<NDArray> cinputs,
+                           const std::vector<OpReqType>& req,
+                           std::vector<NDArray> coutputs, bool is_recording) {
+  using namespace nnvm;
+  using namespace imperative;
+
+  bool orig_is_record;
+  if (is_recording)
+    orig_is_record = Imperative::Get()->set_is_recording(true);
+  else
+    orig_is_record = Imperative::Get()->is_recording();
+
+  std::vector<NDArray*> inputs(cinputs.size());
+  std::vector<NDArray*> outputs(coutputs.size());
+  for (size_t i = 0; i < inputs.size(); i++)
+    inputs[i] = &cinputs[i];
+  for (size_t i = 0; i < outputs.size(); i++)
+    outputs[i] = &coutputs[i];
+
+  if (is_recording) {
+    all_inputs.push_back(cinputs);
+    std::vector<NDArray> gradients(cinputs.size());
+    std::vector<NDArray*> input_ptrs(cinputs.size());
+    std::vector<NDArray*> gradient_ptrs(cinputs.size());
+    std::vector<mx_uint> grad_reqs(cinputs.size());
+    for (size_t i = 0; i < gradients.size(); i++) {
+      gradients[i] = NDArray(cinputs[i].shape(), cinputs[i].ctx(),
+                             true, cinputs[i].dtype());
+      input_ptrs[i] = &cinputs[i];
+      gradient_ptrs[i] = &gradients[i];
+      grad_reqs[i] = kWriteTo;
+    }
+    Imperative::Get()->MarkVariables(input_ptrs, grad_reqs, gradient_ptrs);
+  }
+
+  std::vector<std::pair<std::string, std::string> > kwargs;
+  kwargs.push_back(std::pair<std::string, std::string>("inline_limit", "0"));
+  // Get input names.
+  const auto& idx = subgraph.indexed_graph();
+  std::vector<std::string> arg_names(idx.input_nodes().size());
+  for (size_t i = 0; i < idx.input_nodes().size(); ++i)
+    arg_names[i] = idx[idx.input_nodes()[i]].source->attrs.name;
+  // We don't have parameters for the cached op.
+  std::unordered_map<std::string, std::vector<NDArray> > params;
+  CachedOpPtr op = std::make_shared<CachedOp>(subgraph_sym, kwargs,

[GitHub] marcoabreu opened a new pull request #11018: [MXNET-454][WIP] Move distributed Docker cache from S3 to Docker Hub

2018-05-21 Thread GitBox
marcoabreu opened a new pull request #11018: [MXNET-454][WIP] Move distributed 
Docker cache from S3 to Docker Hub
URL: https://github.com/apache/incubator-mxnet/pull/11018
 
 
   ## Description ##
   At the moment, we download the entire cache from S3, amounting to up to 10GB 
of data. The problem here is that the download and the load parts have to run 
in sequence and especially the latter is quite slow (up to 4 minutes). This PR 
allows partial cache retrieval and parallelized loading.
   
   I have added a test file to verify the functionality of full and partial 
caching. I will not add it to CI as it messes with the Docker Daemon and could 
have negative side effects. In general, just run it locally if you'd like to 
make a modification.
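
   A rough sketch of the registry-based retrieval (not this PR's actual script; the repository name and Dockerfile path are placeholders):
   ```
   import subprocess

   REGISTRY_REPO = "example/mxnet-ci-cache"  # placeholder repo name

   def build_with_registry_cache(platform):
       """Pull only the cache for one platform, then build against it."""
       cache_tag = "%s:%s" % (REGISTRY_REPO, platform)
       # A cold cache makes the pull fail; the build then simply starts fresh.
       subprocess.run(["docker", "pull", cache_tag], check=False)
       subprocess.run(["docker", "build", "--cache-from", cache_tag,
                       "-t", cache_tag,
                       "-f", "docker/Dockerfile.build.%s" % platform,
                       "docker/"], check=True)

   build_with_registry_cache("ubuntu_cpu")
   ```
   Because each platform's cache is an independent image, several of these pulls can run in parallel instead of loading one 10GB tarball sequentially.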
   
   ## Checklist ##
   ### Essentials ###
   Please feel free to remove inapplicable items for your PR.
   - [x] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to 
the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) 
created (except PRs with tiny changes)
   - [ ] Changes are complete (i.e. I finished coding on this PR)
   - [x] All changes have test coverage:
   - Unit tests are added for small changes to verify correctness (e.g. adding 
a new operator)
   - Nightly tests are added for complicated/long-running ones (e.g. changing 
distributed kvstore)
   - Build tests will be added for build configuration changes (e.g. adding a 
new build option with NCCL)
   - [x] Code is well-documented: 
   - For user-facing API changes, API doc string has been updated. 
   - For new C++ functions in header files, their functionalities and arguments 
are documented. 
   - For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
   - Check the API doc at 
http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
   - [x] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
   
   ### Changes ###
   - [x] Move from S3 to Docker Hub
   
   ## Comments ##
   
   




[GitHub] szha commented on issue #8278: [Feature Request][Gluon] Add text processing utilities for Gluon like PyTorch/Text

2018-05-21 Thread GitBox
szha commented on issue #8278: [Feature Request][Gluon] Add text processing 
utilities for Gluon like PyTorch/Text
URL: 
https://github.com/apache/incubator-mxnet/issues/8278#issuecomment-390821131
 
 
   Check out https://github.com/dmlc/gluon-nlp




[GitHub] szha closed issue #8278: [Feature Request][Gluon] Add text processing utilities for Gluon like PyTorch/Text

2018-05-21 Thread GitBox
szha closed issue #8278: [Feature Request][Gluon] Add text processing utilities 
for Gluon like PyTorch/Text
URL: https://github.com/apache/incubator-mxnet/issues/8278
 
 
   




[GitHub] szha commented on issue #8932: mobileNet pretrained model official version from MXNET

2018-05-21 Thread GitBox
szha commented on issue #8932: mobileNet pretrained model official version from 
MXNET
URL: 
https://github.com/apache/incubator-mxnet/issues/8932#issuecomment-390820871
 
 
   The MobileNet pre-trained model is now provided, thanks to @hetong007.




[GitHub] szha closed issue #8859: Is there a pretrained mobilenet weights in model zoo?

2018-05-21 Thread GitBox
szha closed issue #8859: Is there a pretrained mobilenet weights in model zoo?
URL: https://github.com/apache/incubator-mxnet/issues/8859
 
 
   




[GitHub] szha closed issue #8932: mobileNet pretrained model official version from MXNET

2018-05-21 Thread GitBox
szha closed issue #8932: mobileNet pretrained model official version from MXNET
URL: https://github.com/apache/incubator-mxnet/issues/8932
 
 
   




[GitHub] szha commented on issue #8859: Is there a pretrained mobilenet weights in model zoo?

2018-05-21 Thread GitBox
szha commented on issue #8859: Is there a pretrained mobilenet weights in model 
zoo?
URL: 
https://github.com/apache/incubator-mxnet/issues/8859#issuecomment-390820940
 
 
   The MobileNet pre-trained model is now provided, thanks to @hetong007.




[GitHub] zhreshold commented on issue #9808: [MXNET-267] Ssd camera demo

2018-05-21 Thread GitBox
zhreshold commented on issue #9808: [MXNET-267] Ssd camera demo
URL: https://github.com/apache/incubator-mxnet/pull/9808#issuecomment-390819991
 
 
   Waiting for a README or small tutorial.




[GitHub] aaronmarkham commented on issue #10895: [MXNET-413] Fixing the broken links - Week of 5/7

2018-05-21 Thread GitBox
aaronmarkham commented on issue #10895: [MXNET-413] Fixing the broken links - 
Week of 5/7
URL: https://github.com/apache/incubator-mxnet/pull/10895#issuecomment-390819797
 
 
   @kpmurali Yes, download.html.




[GitHub] zhanghang1989 commented on issue #9648: BatchNorm Evaluation Mode Backward Fails with cudnn Enabled

2018-05-21 Thread GitBox
zhanghang1989 commented on issue #9648: BatchNorm Evaluation Mode Backward 
Fails with cudnn Enabled
URL: 
https://github.com/apache/incubator-mxnet/issues/9648#issuecomment-390818172
 
 
   Reopening this because cuDNN is handled incorrectly @piiswrong 
   Related PR: https://github.com/apache/incubator-mxnet/pull/10470




[GitHub] zhanghang1989 opened a new issue #9648: BatchNorm Evaluation Mode Backward Fails with cudnn Enabled

2018-05-21 Thread GitBox
zhanghang1989 opened a new issue #9648: BatchNorm Evaluation Mode Backward 
Fails with cudnn Enabled
URL: https://github.com/apache/incubator-mxnet/issues/9648
 
 
   When MXNet is installed with cuDNN enabled (``USE_CUDNN=1``), the backward pass of the BatchNorm operator fails.
   
   ## Reproducing the error
   ```
   import mxnet as mx
   import mxnet.ndarray as F
   from mxnet import autograd
   
   B,C,H,W = 4,3,2,2
   x = mx.nd.random.poisson(1,shape=(B,C,H,W)).as_in_context(mx.gpu(0))
   gamma = mx.nd.random.normal(shape=(C)).as_in_context(mx.gpu(0))
   beta = mx.nd.random.normal(shape=(C)).as_in_context(mx.gpu(0))
   mean = mx.nd.random.normal(shape=(C)).as_in_context(mx.gpu(0))
   std = mx.nd.random.normal(shape=(C)).as_in_context(mx.gpu(0))
   x.attach_grad()
   
   with autograd.record(False):
       y = F.BatchNorm(x, gamma, beta, mean, std.square(), fix_gamma=False)
       loss = y.square().sum()
   loss.backward(train_mode=False)
   ```
   got the error:
   ```bash
   terminate called after throwing an instance of 'dmlc::Error'
 what():  [19:23:23] src/engine/./threaded_engine.h:359: [19:23:23] 
src/operator/nn/./cudnn/cudnn_batch_norm-inl.h:193: Check failed: ctx.is_train 
&& !param_.use_global_stats use global statistics is not yet supported in 
CuDNNBatchNorm
   ```
   




[GitHub] anirudh2290 commented on issue #10895: [MXNET-413] Fixing the broken links - Week of 5/7

2018-05-21 Thread GitBox
anirudh2290 commented on issue #10895: [MXNET-413] Fixing the broken links - 
Week of 5/7
URL: https://github.com/apache/incubator-mxnet/pull/10895#issuecomment-390817178
 
 
   @szha good point. @kpmurali can you please add the source files from here: 
https://github.com/apache/incubator-mxnet/releases/tag/1.2.0. Note that we 
don't have md5 hashes anymore, since they are deprecated for apache releases.




[GitHub] eric-haibin-lin commented on a change in pull request #11001: [MXNET-374] handle row_sparse weight in parameter and trainer

2018-05-21 Thread GitBox
eric-haibin-lin commented on a change in pull request #11001: [MXNET-374] 
handle row_sparse weight in parameter and trainer
URL: https://github.com/apache/incubator-mxnet/pull/11001#discussion_r189741864
 
 

 ##
 File path: python/mxnet/gluon/parameter.py
 ##
 @@ -162,6 +169,15 @@ def shape(self, new_shape):
 
         self._shape = new_shape
 
+    def _set_trainer(self, trainer):
+        """ Set the trainer this parameter is associated with. """
+        if self._trainer and self._trainer is not trainer:
+            raise RuntimeError(
+                "Failed to set the trainer for Parameter '%s' to %s because it was set to %s. " \
 
 Review comment:
   Updated. Users can just call `_set_trainer(None)`. I don't think this will 
be used by common users, hence it remains private 
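
   For illustration, a hypothetical use of that private hook (assuming this PR's `_set_trainer` is present; it is not documented API):
   ```
   from mxnet import gluon

   net = gluon.nn.Dense(1)
   net.initialize()
   trainer = gluon.Trainer(net.collect_params(), 'sgd')

   # Detach every parameter from its current trainer so a new Trainer can be
   # created for the same parameters; a sketch of the escape hatch discussed
   # above, not a supported pattern.
   for param in net.collect_params().values():
       param._set_trainer(None)
   ```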




[GitHub] eric-haibin-lin commented on a change in pull request #11001: [MXNET-374] handle row_sparse weight in parameter and trainer

2018-05-21 Thread GitBox
eric-haibin-lin commented on a change in pull request #11001: [MXNET-374] 
handle row_sparse weight in parameter and trainer
URL: https://github.com/apache/incubator-mxnet/pull/11001#discussion_r189741522
 
 

 ##
 File path: python/mxnet/gluon/parameter.py
 ##
 @@ -116,10 +119,14 @@ def __init__(self, name, grad_req='write', shape=None, dtype=mx_real_t,
         self.wd_mult = wd_mult
         self.grad_req = grad_req
         self.init = init
-        assert grad_stype in ['default', 'row_sparse', 'csr'], \
-            "grad_stype for Parameter '%s' must be one of 'default', 'row_sparse', or 'csr'," \
-            " but got '%s'" % (name, grad_stype)
+        # sparse related storage type information
+        valid_stypes = ['default', 'row_sparse', 'csr']
 
 Review comment:
   It only has 3 elements; I don't think this makes any real difference.





[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.

2018-05-21 Thread zhasheng
This is an automated email from the ASF dual-hosted git repository.

zhasheng pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new cccf616  Bump the publish timestamp.
cccf616 is described below

commit cccf616a32d5d459e9893b343e7e1073f7ff1872
Author: mxnet-ci 
AuthorDate: Mon May 21 23:26:55 2018 +0000

Bump the publish timestamp.
---
 date.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/date.txt b/date.txt
new file mode 100644
index 0000000..f90d706
--- /dev/null
+++ b/date.txt
@@ -0,0 +1 @@
+Mon May 21 23:26:55 UTC 2018



[GitHub] zheng-da commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
zheng-da commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189738041
 
 

 ##
 File path: src/operator/nn/subgraph_op_common.h
 ##
 @@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#ifndef MXNET_OPERATOR_NN_SUBGRAPH_OP_COMMON_H_
 
 Review comment:
   You are right. I probably should move the control flow op out as well.




[GitHub] zheng-da commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
zheng-da commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189735639
 
 

 ##
 File path: src/operator/nn/control_flow.cc
 ##

[GitHub] zhreshold commented on issue #10605: [MXNET-310] [ONNX-MXNet] API to import ONNX models into Gluon.

2018-05-21 Thread GitBox
zhreshold commented on issue #10605: [MXNET-310] [ONNX-MXNet] API to import 
ONNX models into Gluon.
URL: https://github.com/apache/incubator-mxnet/pull/10605#issuecomment-390806446
 
 
   Since we recompose the network using symbols, why specifically target Gluon?
   It makes more sense to me as an API to import ONNX models into MXNet.
   It is always simple to convert a symbol to a SymbolBlock for use with Gluon.
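
   A minimal sketch of that conversion (any symbol, e.g. one produced by an ONNX importer, wrapped for Gluon use; the tiny FullyConnected graph below just stands in for an imported model):
   ```
   import mxnet as mx
   from mxnet import gluon

   data = mx.sym.var('data')
   sym = mx.sym.FullyConnected(data, num_hidden=10, name='fc')

   net = gluon.SymbolBlock(outputs=sym, inputs=data)
   net.initialize()  # in the ONNX case, imported params would be set instead
   out = net(mx.nd.ones((2, 5)))
   ```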




[GitHub] zheng-da commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
zheng-da commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189735354
 
 

 ##
 File path: src/imperative/imperative_utils.h
 ##
 @@ -505,12 +515,16 @@ inline void PushOperator(const OpStatePtr& state,
 fcompute(state, opctx, input_blobs, tmp_req, output_blobs);
 // post-fcompute fallback, cast to original storage type, if necessary
 CastNonDefaultStorage(post_temp_src, post_temp_dst, opctx, is_gpu);
-if (is_gpu && exec_type == ExecType::kSync) {
+if (is_gpu && exec_type == ExecType::kSync
 
 Review comment:
   Because subgraph operators don't run in the threaded engine and don't have a GPU stream or CPU stream.




[GitHub] zheng-da commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
zheng-da commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189735162
 
 

 ##
 File path: src/imperative/imperative_utils.h
 ##
 @@ -456,17 +458,24 @@ inline void PushOperator(const OpStatePtr& state,
   if (fcompute_ex != nullptr && dispatch_mode == DispatchMode::kFComputeEx) {
 const auto& run = [=](RunContext rctx,
   engine::CallbackOnComplete on_complete) {
-  OpContext opctx{is_train, rctx, on_complete, requested};
+  bool need_grad = Imperative::Get()->is_recording();
+  OpContext opctx{need_grad, is_train, rctx, on_complete, requested};
 #if MXNET_USE_MKLDNN == 1
   InvalidateOutputs(outputs, req);
 #endif
   fcompute_ex(state, opctx, inputs, req, outputs);
-  if (ctx.dev_mask() == gpu::kDevMask && exec_type == ExecType::kSync) {
 
 Review comment:
   why?




[GitHub] zheng-da commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
zheng-da commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189734937
 
 

 ##
 File path: src/executor/graph_executor.cc
 ##
 @@ -1537,6 +1555,9 @@ GraphExecutor::CachedSegOpr GraphExecutor::CreateCachedSegOpr(size_t topo_start,
     OpNode& op_node = op_nodes_[nid];
     if (op_node.skip_exec_node) continue;
     if (inode.source->is_variable()) continue;
+    // We shouldn't add control flow operators to a segment.
+    // We can't execute these operators in the engine.
+    if (op_node.exec->HasSubgraph()) continue;
 
 Review comment:
   using `return ret` means breaking the graph into two pieces?




[GitHub] zheng-da commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
zheng-da commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189734691
 
 

 ##
 File path: python/mxnet/symbol/contrib.py
 ##
 @@ -91,3 +98,176 @@ def rand_zipfian(true_classes, num_sampled, range_max):
     expected_prob_sampled = ((sampled_cls_fp64 + 2.0) / (sampled_cls_fp64 + 1.0)).log() / log_range
     expected_count_sampled = expected_prob_sampled * num_sampled
     return sampled_classes, expected_count_true, expected_count_sampled
+
+def _get_graph_inputs(subg):
+    num_handles = ctypes.c_int(1000)
+    handles = c_array(SymbolHandle, [SymbolHandle(0) for i in range(1000)])
+    check_call(_LIB.MXSymbolGetInputSymbols(subg.handle, handles, ctypes.byref(num_handles)))
+
+    syms = []
+    for i in range(num_handles.value):
+        s = Symbol(handles[i])
+        syms.append(s)
+    return syms
+
+def foreach(func, data, init_states, name="foreach"):
+    """Run a for loop with user-defined computation over NDArrays on dimension 0.
+
+    This operator simulates a for loop and func has the computation for an iteration
+    of the for loop. It runs the computation in func on each slice from the input
+    NDArrays.
+
+    func takes two arguments as input and outputs a tuple of two elements,
+    as illustrated below:
+
+    out, states = func(data1, states)
+
+    data1 can be either a symbol or a list of symbols. If data is a symbol,
+    data1 is a symbol. Otherwise, data1 is a list of symbols and has the same
+    size as data. states is a list of symbols and have the same size as init_states.
+    Similarly, out can be either a symbol or a list of symbols, which are concatenated
+    as the first output of foreach; states from the last execution of func
+    are the second output of foreach.
+
+    The computation done by this operator is equivalent to the pseudo code below
+    when the input data is NDArray:
+
+    states = init_states
+    outs = []
+    for i in data.shape[0]:
+        s = data[i]
+        out, states = func(s, states)
+        outs.append(out)
+    outs = stack(*outs)
+
+
+    Parameters
+    ----------
+    func : a Python function.
+        Define computation in an iteration.
+    data: a symbol or a list of symbols.
+        The input data.
+    init_states: a symbol or a list of symbols.
+        The initial values of the loop states.
+    name: string.
+        The name of the operator.
+
+    Returns
+    -------
+    outputs: a Symbol or a list of Symbols.
+        The output data concatenated from the output of all iterations.
+    states: a list of Symbols.
+        The loop states in the last iteration.
+
+    Examples
+    --------
+    >>> step = lambda data, states: (data + states[0], [states[0] * 2])
+    >>> data = mx.sym.var('data')
+    >>> states = [mx.sym.var('state')]
+    >>> outs, states = mx.sym.contrib.foreach(step, data, states)
+    """
+
+    def check_data(inputs, in_type, msg):
+        is_NDArray_or_list = True
+        if isinstance(inputs, list):
+            for i in inputs:
+                if not isinstance(i, in_type):
+                    is_NDArray_or_list = False
+                    break
+        else:
+            is_NDArray_or_list = isinstance(inputs, in_type)
+        assert is_NDArray_or_list, msg
+
+    check_data(data, symbol.Symbol, "data should be a Symbol or a list of Symbols")
+    check_data(init_states, symbol.Symbol,
+               "init_states should be a Symbol or a list of Symbols")
+    not_state_list = isinstance(init_states, symbol.Symbol)
+
+    # TODO(zhengda) If the input python function references to the symbols outside
+    # the python function, we need to prune the computation graph constructed from
+    # the function. One way of doing it is to mark the nodes in the computation graph
+    # with AttrScope and prune the nodes without the special attribute.
+    with AttrScope(subgraph_name=name):
 
 Review comment:
   alternative of what?




[GitHub] zheng-da commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
zheng-da commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189734655
 
 

 ##
 File path: python/mxnet/symbol/contrib.py
 ##
 @@ -19,13 +19,20 @@
 # pylint: disable=wildcard-import, unused-wildcard-import
 """Contrib Symbol API of MXNet."""
 import math
+import ctypes
+
 from .random import uniform
 from .symbol import Symbol
 try:
     from .gen_contrib import *
 except ImportError:
     pass
 
+from . import symbol
+from ..base import _LIB, c_array, check_call
+from ..base import SymbolHandle, _as_list
+from ..attribute import AttrScope
+
 __all__ = ["rand_zipfian"]
 
 Review comment:
   what is this for?




[GitHub] zheng-da commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
zheng-da commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189734486
 
 

 ##
 File path: python/mxnet/ndarray/contrib.py
 ##
 @@ -96,3 +98,101 @@ def rand_zipfian(true_classes, num_sampled, range_max, ctx=None):
     expected_count_sampled = expected_prob_sampled * num_sampled
     return sampled_classes, expected_count_true, expected_count_sampled
 # pylint: enable=line-too-long
+
+def foreach(func, data, init_states):
+    """Run a for loop with user-defined computation over NDArrays on dimension 0.
+
+    This operator simulates a for loop and func has the computation for an iteration
+    of the for loop. It runs the computation in func on each slice from the input
+    NDArrays.
+
+    func takes two arguments as input and outputs a tuple of two elements,
+    as illustrated below:
+
+    out, states = func(data1, states)
+
+    data1 can be either an NDArray or a list of NDArrays. If data is an NDArray,
+    data1 is an NDArray. Otherwise, data1 is a list of NDArrays and has the same
+    size as data. states is a list of NDArrays and have the same size as init_states.
+    Similarly, out can be either an NDArray or a list of NDArrays, which are concatenated
+    as the first output of foreach; states from the last execution of func
+    are the second output of foreach.
+
+    The computation done by this operator is equivalent to the pseudo code below
+    when the input data is NDArray:
+
+    states = init_states
+    outs = []
+    for i in data.shape[0]:
+        s = data[i]
+        out, states = func(s, states)
+        outs.append(out)
+    outs = stack(*outs)
+
+
+    Parameters
+    ----------
+    func : a Python function.
 
 Review comment:
   This is to follow the interface of TensorFlow's tf.while_loop:
   https://www.tensorflow.org/api_docs/python/tf/while_loop
   Using a class does make the API more well-defined, but it requires users to write more code. I don't know what the best way is.
   @piiswrong what's your opinion?
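
   For what it's worth, a concrete run of the docstring's example (assuming a build where this PR's `mx.nd.contrib.foreach` is available):
   ```
   import mxnet as mx

   step = lambda data, states: (data + states[0], [states[0] * 2])
   data = mx.nd.array([[1.0], [2.0], [3.0]])  # iterated along dimension 0
   states = [mx.nd.array([1.0])]
   outs, final_states = mx.nd.contrib.foreach(step, data, states)
   print(outs.asnumpy().ravel())     # [2. 4. 7.]
   print(final_states[0].asnumpy())  # [8.]
   ```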




[GitHub] szha commented on issue #9933: [MXNET-23] Adding support to profile kvstore server during distributed training

2018-05-21 Thread GitBox
szha commented on issue #9933: [MXNET-23] Adding support to profile kvstore 
server during distributed training
URL: https://github.com/apache/incubator-mxnet/pull/9933#issuecomment-390805091
 
 
   ping




[GitHub] szha commented on issue #9808: [MXNET-267] Ssd camera demo

2018-05-21 Thread GitBox
szha commented on issue #9808: [MXNET-267] Ssd camera demo
URL: https://github.com/apache/incubator-mxnet/pull/9808#issuecomment-390804816
 
 
   ping




[GitHub] eric-haibin-lin commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
eric-haibin-lin commented on a change in pull request #10451: [MXNET-432] Add 
Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189717728
 
 

 ##
 File path: python/mxnet/symbol/contrib.py
 ##
 @@ -19,13 +19,20 @@
 # pylint: disable=wildcard-import, unused-wildcard-import
 """Contrib Symbol API of MXNet."""
 import math
+import ctypes
+
 from .random import uniform
 from .symbol import Symbol
 try:
 from .gen_contrib import *
 except ImportError:
 pass
 
+from . import symbol
+from ..base import _LIB, c_array, check_call
+from ..base import SymbolHandle, _as_list
+from ..attribute import AttrScope
+
 __all__ = ["rand_zipfian"]
 
 Review comment:
   Add foreach to \_\_all\_\_ ? 
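   i.e., presumably something like:
   
   ```
   __all__ = ["rand_zipfian", "foreach"]
   ```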


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] eric-haibin-lin commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
eric-haibin-lin commented on a change in pull request #10451: [MXNET-432] Add 
Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189733374
 
 

 ##
 File path: src/operator/nn/subgraph_op_common.h
 ##
 @@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#ifndef MXNET_OPERATOR_NN_SUBGRAPH_OP_COMMON_H_
 
 Review comment:
   Why is subgraph op inside `nn/` folder? Isn't it more general?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] eric-haibin-lin commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
eric-haibin-lin commented on a change in pull request #10451: [MXNET-432] Add 
Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189715798
 
 

 ##
 File path: python/mxnet/ndarray/contrib.py
 ##
 @@ -96,3 +98,101 @@ def rand_zipfian(true_classes, num_sampled, range_max, 
ctx=None):
 expected_count_sampled = expected_prob_sampled * num_sampled
 return sampled_classes, expected_count_true, expected_count_sampled
 # pylint: enable=line-too-long
+
+def foreach(func, data, init_states):
+    """Run a for loop with user-defined computation over NDArrays on dimension 0.
+
+    This operator simulates a for loop, and func has the computation for an
+    iteration of the for loop. It runs the computation in func on each slice
+    from the input NDArrays.
+
+    func takes two arguments as input and outputs a tuple of two elements,
+    as illustrated below:
+
+        out, states = func(data1, states)
+
+    data1 can be either an NDArray or a list of NDArrays. If data is an NDArray,
+    data1 is an NDArray. Otherwise, data1 is a list of NDArrays and has the same
+    size as data. states is a list of NDArrays and has the same size as
+    init_states. Similarly, out can be either an NDArray or a list of NDArrays,
+    which are concatenated as the first output of foreach; states from the last
+    execution of func are the second output of foreach.
+
+    The computation done by this operator is equivalent to the pseudo code
+    below when the input data is an NDArray:
+
+        states = init_states
+        outs = []
+        for i in range(data.shape[0]):
+            s = data[i]
+            out, states = func(s, states)
+            outs.append(out)
+        outs = stack(*outs)
+
+
+    Parameters
+    ----------
+    func : a Python function.
 
 Review comment:
   A generic Python function as an argument seems too broad. Since the
interface for func is well defined, do we want to restrict it to a
well-defined Python class? For example,
   
   ```
   class ForeachBody(object):
       def forward(self, data, states):
           raise NotImplementedError()
   
       def __call__(self, data, states):
           """
           data: NDArray or list of NDArrays
           states: NDArray or list of NDArrays
           """
           check_input(data, states)
           return self.forward(data, states)
   
   def foreach(body, data, state):
       """
       Parameters
       ----------
       func : a ForeachBody.
       """
   ```
   Then you don't have to do check_input inside contrib.foreach
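   A user-defined body under that proposal might then look like (a
hypothetical example, assuming the `ForeachBody` base class sketched above):
   
   ```
   class CumSumBody(ForeachBody):
       def forward(self, data, states):
           out = data + states[0]
           return out, [out]
   ```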


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] eric-haibin-lin commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
eric-haibin-lin commented on a change in pull request #10451: [MXNET-432] Add 
Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189723680
 
 

 ##
 File path: src/operator/nn/control_flow.cc
 ##
 @@ -0,0 +1,532 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "../operator_common.h"
+#include "../elemwise_op_common.h"
+#include "../../imperative/imperative_utils.h"
+#include "./subgraph_op_common.h"
+
+namespace mxnet {
+namespace op {
+
+struct ForeachParam : public dmlc::Parameter<ForeachParam> {
+  int num_args;
+  int dim;
+  int num_outputs;
+  int num_out_data;
+  nnvm::Tuple<dmlc::index_t> in_state_locs;
+  nnvm::Tuple<dmlc::index_t> in_data_locs;
+  DMLC_DECLARE_PARAMETER(ForeachParam) {
+    DMLC_DECLARE_FIELD(num_args).set_lower_bound(1)
+    .describe("Number of inputs.");
+    DMLC_DECLARE_FIELD(dim).set_default(1)
+    .describe("the dimension of the input array to iterate.");
+    DMLC_DECLARE_FIELD(num_outputs)
+    .describe("The number of outputs of the subgraph.");
+    DMLC_DECLARE_FIELD(num_out_data)
+    .describe("The number of output data of the subgraph.");
+    DMLC_DECLARE_FIELD(in_state_locs)
+    .describe("The locations of loop states among the inputs.");
+    DMLC_DECLARE_FIELD(in_data_locs)
+    .describe("The locations of input data among the inputs.");
+  }
+};  // struct ForeachParam
+
+DMLC_REGISTER_PARAMETER(ForeachParam);
+
+class ForeachState {
+  // These are output arrays from all iterations.
+  // They also contain the Op state for each CachedOp.
+  std::vector<std::vector<NDArray> > all_outputs;
+  std::vector<std::vector<NDArray> > all_inputs;
+  std::vector<std::vector<NDArray> > all_gradients;
+  std::vector<CachedOpPtr> iter_ops;
+
+ public:
+  Symbol subgraph_sym;
+  nnvm::Graph subgraph;
+  ForeachParam params;
+
+  ForeachState(const Symbol &g, const ForeachParam &params) {
+    this->subgraph_sym = g;
+    this->subgraph.outputs = g.outputs;
+    this->params = params;
+  }
+
+  void Forward(std::vector<NDArray> cinputs,
+               const std::vector<OpReqType>& req,
+               std::vector<NDArray> coutputs, bool is_recording);
+  void Backward(int iter_no, std::vector<NDArray> ograds,
+                const std::vector<OpReqType> &req,
+                std::vector<NDArray> igrads);
+  void Cleanup() {
+    all_outputs.clear();
+    all_inputs.clear();
+    all_gradients.clear();
+    iter_ops.clear();
+  }
+};
+
+void ForeachState::Forward(std::vector<NDArray> cinputs,
+                           const std::vector<OpReqType>& req,
+                           std::vector<NDArray> coutputs, bool is_recording) {
+  using namespace nnvm;
+  using namespace imperative;
+
+  bool orig_is_record;
+  if (is_recording)
+    orig_is_record = Imperative::Get()->set_is_recording(true);
+  else
+    orig_is_record = Imperative::Get()->is_recording();
+
+  std::vector<NDArray*> inputs(cinputs.size());
+  std::vector<NDArray*> outputs(coutputs.size());
+  for (size_t i = 0; i < inputs.size(); i++)
+    inputs[i] = &cinputs[i];
+  for (size_t i = 0; i < outputs.size(); i++)
+    outputs[i] = &coutputs[i];
+
+  if (is_recording) {
+    all_inputs.push_back(cinputs);
+    std::vector<NDArray> gradients(cinputs.size());
+    std::vector<NDArray*> input_ptrs(cinputs.size());
+    std::vector<NDArray*> gradient_ptrs(cinputs.size());
+    std::vector<mx_uint> grad_reqs(cinputs.size());
+    for (size_t i = 0; i < gradients.size(); i++) {
+      gradients[i] = NDArray(cinputs[i].shape(), cinputs[i].ctx(),
+                             true, cinputs[i].dtype());
+      input_ptrs[i] = &cinputs[i];
+      gradient_ptrs[i] = &gradients[i];
+      grad_reqs[i] = kWriteTo;
+    }
+    Imperative::Get()->MarkVariables(input_ptrs, grad_reqs, gradient_ptrs);
+  }
+
+  std::vector<std::pair<std::string, std::string> > kwargs;
+  kwargs.push_back(std::pair<std::string, std::string>("inline_limit", "0"));
+  // Get input names.
+  const auto& idx = subgraph.indexed_graph();
+  std::vector<std::string> arg_names(idx.input_nodes().size());
+  for (size_t i = 0; i < idx.input_nodes().size(); ++i)
+    arg_names[i] = idx[idx.input_nodes()[i]].source->attrs.name;
+  // We don't have parameters for the cached op.
+  std::unordered_map<std::string, std::vector<NDArray> > params;
+  CachedOpPtr op = std::make_shared<CachedOp>(subgraph_sym, kwargs,
+   

[GitHub] eric-haibin-lin commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
eric-haibin-lin commented on a change in pull request #10451: [MXNET-432] Add 
Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189719090
 
 

 ##
 File path: python/mxnet/symbol/contrib.py
 ##
 @@ -91,3 +98,176 @@ def rand_zipfian(true_classes, num_sampled, range_max):
 expected_prob_sampled = ((sampled_cls_fp64 + 2.0) / (sampled_cls_fp64 + 
1.0)).log() / log_range
 expected_count_sampled = expected_prob_sampled * num_sampled
 return sampled_classes, expected_count_true, expected_count_sampled
+
+def _get_graph_inputs(subg):
+    num_handles = ctypes.c_int(1000)
+    handles = c_array(SymbolHandle, [SymbolHandle(0) for i in range(1000)])
+    check_call(_LIB.MXSymbolGetInputSymbols(subg.handle, handles,
+                                            ctypes.byref(num_handles)))
+
+    syms = []
+    for i in range(num_handles.value):
+        s = Symbol(handles[i])
+        syms.append(s)
+    return syms
+
+def foreach(func, data, init_states, name="foreach"):
+    """Run a for loop with user-defined computation over Symbols on dimension 0.
+
+    This operator simulates a for loop, and func has the computation for an
+    iteration of the for loop. It runs the computation in func on each slice
+    from the input NDArrays.
+
+    func takes two arguments as input and outputs a tuple of two elements,
+    as illustrated below:
+
+        out, states = func(data1, states)
+
+    data1 can be either a symbol or a list of symbols. If data is a symbol,
+    data1 is a symbol. Otherwise, data1 is a list of symbols and has the same
+    size as data. states is a list of symbols and has the same size as
+    init_states. Similarly, out can be either a symbol or a list of symbols,
+    which are concatenated as the first output of foreach; states from the last
+    execution of func are the second output of foreach.
+
+    The computation done by this operator is equivalent to the pseudo code
+    below when the input data is an NDArray:
+
+        states = init_states
+        outs = []
+        for i in range(data.shape[0]):
+            s = data[i]
+            out, states = func(s, states)
+            outs.append(out)
+        outs = stack(*outs)
+
+
+    Parameters
+    ----------
+    func : a Python function.
+        Define computation in an iteration.
+    data: a symbol or a list of symbols.
+        The input data.
+    init_states: a symbol or a list of symbols.
+        The initial values of the loop states.
+    name: string.
+        The name of the operator.
+
+    Returns
+    -------
+    outputs: a Symbol or a list of Symbols.
+        The output data concatenated from the output of all iterations.
+    states: a list of Symbols.
+        The loop states in the last iteration.
+
+    Examples
+    --------
+    >>> step = lambda data, states: (data + states[0], [states[0] * 2])
+    >>> data = mx.sym.var('data')
+    >>> states = [mx.sym.var('state')]
+    >>> outs, states = mx.sym.contrib.foreach(step, data, states)
+    """
+
+    def check_data(inputs, in_type, msg):
+        is_NDArray_or_list = True
+        if isinstance(inputs, list):
+            for i in inputs:
+                if not isinstance(i, in_type):
+                    is_NDArray_or_list = False
+                    break
+        else:
+            is_NDArray_or_list = isinstance(inputs, in_type)
+        assert is_NDArray_or_list, msg
+
+    check_data(data, symbol.Symbol, "data should be a symbol or a list of symbols")
+    check_data(init_states, symbol.Symbol,
+               "init_states should be a symbol or a list of symbols")
+    not_state_list = isinstance(init_states, symbol.Symbol)
+
+    # TODO(zhengda) If the input python function references symbols outside
+    # the python function, we need to prune the computation graph constructed
+    # from the function. One way of doing it is to mark the nodes in the
+    # computation graph with AttrScope and prune the nodes without the special
+    # attribute.
+    with AttrScope(subgraph_name=name):
 
 Review comment:
   What's the alternative?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] eric-haibin-lin commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
eric-haibin-lin commented on a change in pull request #10451: [MXNET-432] Add 
Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189721418
 
 

 ##
 File path: src/executor/graph_executor.cc
 ##
 @@ -1537,6 +1555,9 @@ GraphExecutor::CachedSegOpr 
GraphExecutor::CreateCachedSegOpr(size_t topo_start,
 OpNode& op_node = op_nodes_[nid];
 if (op_node.skip_exec_node) continue;
 if (inode.source->is_variable()) continue;
+// We shouldn't add control flow operators to a segment.
+// We can't execute these operators in the engine.
+if (op_node.exec->HasSubgraph()) continue;
 
 Review comment:
   Why not `return ret` instead of `continue`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] eric-haibin-lin commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
eric-haibin-lin commented on a change in pull request #10451: [MXNET-432] Add 
Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189722567
 
 

 ##
 File path: src/imperative/imperative_utils.h
 ##
 @@ -456,17 +458,24 @@ inline void PushOperator(const OpStatePtr& state,
   if (fcompute_ex != nullptr && dispatch_mode == DispatchMode::kFComputeEx) {
 const auto& run = [=](RunContext rctx,
   engine::CallbackOnComplete on_complete) {
-  OpContext opctx{is_train, rctx, on_complete, requested};
+  bool need_grad = Imperative::Get()->is_recording();
+  OpContext opctx{need_grad, is_train, rctx, on_complete, requested};
 #if MXNET_USE_MKLDNN == 1
   InvalidateOutputs(outputs, req);
 #endif
   fcompute_ex(state, opctx, inputs, req, outputs);
-  if (ctx.dev_mask() == gpu::kDevMask && exec_type == ExecType::kSync) {
 
 Review comment:
   Good catch. I think we need to check 
   `bool is_gpu = rctx.get_ctx().dev_mask() == gpu::kDevMask;` 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] eric-haibin-lin commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
eric-haibin-lin commented on a change in pull request #10451: [MXNET-432] Add 
Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189732800
 
 

 ##
 File path: src/operator/nn/control_flow.cc
 ##
 @@ -0,0 +1,532 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "../operator_common.h"
+#include "../elemwise_op_common.h"
+#include "../../imperative/imperative_utils.h"
+#include "./subgraph_op_common.h"
+
+namespace mxnet {
+namespace op {
+
+struct ForeachParam : public dmlc::Parameter<ForeachParam> {
+  int num_args;
+  int dim;
+  int num_outputs;
+  int num_out_data;
+  nnvm::Tuple<dmlc::index_t> in_state_locs;
+  nnvm::Tuple<dmlc::index_t> in_data_locs;
+  DMLC_DECLARE_PARAMETER(ForeachParam) {
+    DMLC_DECLARE_FIELD(num_args).set_lower_bound(1)
+    .describe("Number of inputs.");
+    DMLC_DECLARE_FIELD(dim).set_default(1)
+    .describe("the dimension of the input array to iterate.");
+    DMLC_DECLARE_FIELD(num_outputs)
+    .describe("The number of outputs of the subgraph.");
+    DMLC_DECLARE_FIELD(num_out_data)
+    .describe("The number of output data of the subgraph.");
+    DMLC_DECLARE_FIELD(in_state_locs)
+    .describe("The locations of loop states among the inputs.");
+    DMLC_DECLARE_FIELD(in_data_locs)
+    .describe("The locations of input data among the inputs.");
+  }
+};  // struct ForeachParam
+
+DMLC_REGISTER_PARAMETER(ForeachParam);
+
+class ForeachState {
+  // These are output arrays from all iterations.
+  // They also contain the Op state for each CachedOp.
+  std::vector<std::vector<NDArray> > all_outputs;
+  std::vector<std::vector<NDArray> > all_inputs;
+  std::vector<std::vector<NDArray> > all_gradients;
+  std::vector<CachedOpPtr> iter_ops;
+
+ public:
+  Symbol subgraph_sym;
+  nnvm::Graph subgraph;
+  ForeachParam params;
+
+  ForeachState(const Symbol &g, const ForeachParam &params) {
+    this->subgraph_sym = g;
+    this->subgraph.outputs = g.outputs;
+    this->params = params;
+  }
+
+  void Forward(std::vector<NDArray> cinputs,
+               const std::vector<OpReqType>& req,
+               std::vector<NDArray> coutputs, bool is_recording);
+  void Backward(int iter_no, std::vector<NDArray> ograds,
+                const std::vector<OpReqType> &req,
+                std::vector<NDArray> igrads);
+  void Cleanup() {
+    all_outputs.clear();
+    all_inputs.clear();
+    all_gradients.clear();
+    iter_ops.clear();
+  }
+};
+
+void ForeachState::Forward(std::vector<NDArray> cinputs,
+                           const std::vector<OpReqType>& req,
+                           std::vector<NDArray> coutputs, bool is_recording) {
+  using namespace nnvm;
+  using namespace imperative;
+
+  bool orig_is_record;
+  if (is_recording)
+    orig_is_record = Imperative::Get()->set_is_recording(true);
+  else
+    orig_is_record = Imperative::Get()->is_recording();
+
+  std::vector<NDArray*> inputs(cinputs.size());
+  std::vector<NDArray*> outputs(coutputs.size());
+  for (size_t i = 0; i < inputs.size(); i++)
+    inputs[i] = &cinputs[i];
+  for (size_t i = 0; i < outputs.size(); i++)
+    outputs[i] = &coutputs[i];
+
+  if (is_recording) {
+    all_inputs.push_back(cinputs);
+    std::vector<NDArray> gradients(cinputs.size());
+    std::vector<NDArray*> input_ptrs(cinputs.size());
+    std::vector<NDArray*> gradient_ptrs(cinputs.size());
+    std::vector<mx_uint> grad_reqs(cinputs.size());
+    for (size_t i = 0; i < gradients.size(); i++) {
+      gradients[i] = NDArray(cinputs[i].shape(), cinputs[i].ctx(),
+                             true, cinputs[i].dtype());
+      input_ptrs[i] = &cinputs[i];
+      gradient_ptrs[i] = &gradients[i];
+      grad_reqs[i] = kWriteTo;
+    }
+    Imperative::Get()->MarkVariables(input_ptrs, grad_reqs, gradient_ptrs);
+  }
+
+  std::vector<std::pair<std::string, std::string> > kwargs;
+  kwargs.push_back(std::pair<std::string, std::string>("inline_limit", "0"));
+  // Get input names.
+  const auto& idx = subgraph.indexed_graph();
+  std::vector<std::string> arg_names(idx.input_nodes().size());
+  for (size_t i = 0; i < idx.input_nodes().size(); ++i)
+    arg_names[i] = idx[idx.input_nodes()[i]].source->attrs.name;
+  // We don't have parameters for the cached op.
+  std::unordered_map<std::string, std::vector<NDArray> > params;
+  CachedOpPtr op = std::make_shared<CachedOp>(subgraph_sym, kwargs,
+   

[GitHub] eric-haibin-lin commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
eric-haibin-lin commented on a change in pull request #10451: [MXNET-432] Add 
Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189722611
 
 

 ##
 File path: src/imperative/imperative_utils.h
 ##
 @@ -505,12 +515,16 @@ inline void PushOperator(const OpStatePtr& state,
 fcompute(state, opctx, input_blobs, tmp_req, output_blobs);
 // post-fcompute fallback, cast to original storage type, if necessary
 CastNonDefaultStorage(post_temp_src, post_temp_dst, opctx, is_gpu);
-if (is_gpu && exec_type == ExecType::kSync) {
+if (is_gpu && exec_type == ExecType::kSync
 
 Review comment:
   Why is `&& rctx.get_stream<gpu>()` required?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on issue #9940: [MXNET-46]remove set CMAKE_GENERATOR_TOOLSET

2018-05-21 Thread GitBox
szha commented on issue #9940: [MXNET-46]remove set CMAKE_GENERATOR_TOOLSET
URL: https://github.com/apache/incubator-mxnet/pull/9940#issuecomment-390803685
 
 
   ping


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] zhreshold commented on a change in pull request #10922: Fix Python error on --model-prefix parameter used and make the param can be periodically saved

2018-05-21 Thread GitBox
zhreshold commented on a change in pull request #10922: Fix Python error on 
--model-prefix parameter used and make the param can be periodically saved
URL: https://github.com/apache/incubator-mxnet/pull/10922#discussion_r189732937
 
 

 ##
 File path: example/image-classification/common/fit.py
 ##
 @@ -111,6 +108,7 @@ def add_fit_args(parser):
help='show progress for every n batches')
 train.add_argument('--model-prefix', type=str,
help='model prefix')
+train.add_argument('--save-period', type=int, default=0, help='params 
saving period')
 
 Review comment:
   default should be 1 to comply with current behavior


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on issue #10297: [MXNET-244][WIP Don't merge] Fixes for cross compilation in ARM

2018-05-21 Thread GitBox
szha commented on issue #10297: [MXNET-244][WIP Don't merge] Fixes for cross 
compilation in ARM
URL: https://github.com/apache/incubator-mxnet/pull/10297#issuecomment-390803408
 
 
   ping


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] zheng-da commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
zheng-da commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189732286
 
 

 ##
 File path: src/operator/nn/control_flow.cc
 ##
 @@ -0,0 +1,532 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "../operator_common.h"
+#include "../elemwise_op_common.h"
+#include "../../imperative/imperative_utils.h"
+#include "./subgraph_op_common.h"
+
+namespace mxnet {
+namespace op {
+
+struct ForeachParam : public dmlc::Parameter<ForeachParam> {
+  int num_args;
+  int dim;
+  int num_outputs;
+  int num_out_data;
+  nnvm::Tuple<dmlc::index_t> in_state_locs;
+  nnvm::Tuple<dmlc::index_t> in_data_locs;
+  DMLC_DECLARE_PARAMETER(ForeachParam) {
+    DMLC_DECLARE_FIELD(num_args).set_lower_bound(1)
+    .describe("Number of inputs.");
+    DMLC_DECLARE_FIELD(dim).set_default(1)
+    .describe("the dimension of the input array to iterate.");
+    DMLC_DECLARE_FIELD(num_outputs)
+    .describe("The number of outputs of the subgraph.");
+    DMLC_DECLARE_FIELD(num_out_data)
+    .describe("The number of output data of the subgraph.");
+    DMLC_DECLARE_FIELD(in_state_locs)
+    .describe("The locations of loop states among the inputs.");
+    DMLC_DECLARE_FIELD(in_data_locs)
+    .describe("The locations of input data among the inputs.");
+  }
+};  // struct ForeachParam
+
+DMLC_REGISTER_PARAMETER(ForeachParam);
+
+class ForeachState {
+  // These are output arrays from all iterations.
+  // They also contain the Op state for each CachedOp.
+  std::vector<std::vector<NDArray> > all_outputs;
+  std::vector<std::vector<NDArray> > all_inputs;
+  std::vector<std::vector<NDArray> > all_gradients;
+  std::vector<CachedOpPtr> iter_ops;
+
+ public:
+  Symbol subgraph_sym;
+  nnvm::Graph subgraph;
+  ForeachParam params;
+
+  ForeachState(const Symbol &g, const ForeachParam &params) {
+    this->subgraph_sym = g;
+    this->subgraph.outputs = g.outputs;
+    this->params = params;
+  }
+
+  void Forward(std::vector<NDArray> cinputs,
+               const std::vector<OpReqType>& req,
+               std::vector<NDArray> coutputs, bool is_recording);
+  void Backward(int iter_no, std::vector<NDArray> ograds,
+                const std::vector<OpReqType> &req,
+                std::vector<NDArray> igrads);
+  void Cleanup() {
+    all_outputs.clear();
+    all_inputs.clear();
+    all_gradients.clear();
+    iter_ops.clear();
+  }
+};
+
+void ForeachState::Forward(const std::vector<NDArray> &cinputs,
+                           const std::vector<OpReqType>& req,
+                           const std::vector<NDArray> &coutputs,
+                           bool is_recording) {
+  using namespace nnvm;
+  using namespace imperative;
+
+  bool orig_is_record;
+  if (is_recording)
+    orig_is_record = Imperative::Get()->set_is_recording(true);
+  else
+    orig_is_record = Imperative::Get()->is_recording();
+
+  std::vector<NDArray*> inputs(cinputs.size());
+  std::vector<NDArray*> outputs(coutputs.size());
+  for (size_t i = 0; i < inputs.size(); i++)
+    inputs[i] = &cinputs[i];
+  for (size_t i = 0; i < outputs.size(); i++)
+    outputs[i] = &coutputs[i];
+
+  if (is_recording) {
+    all_inputs.push_back(cinputs);
+    std::vector<NDArray> gradients(cinputs.size());
+    std::vector<NDArray*> input_ptrs(cinputs.size());
+    std::vector<NDArray*> gradient_ptrs(cinputs.size());
+    std::vector<mx_uint> grad_reqs(cinputs.size());
+    for (size_t i = 0; i < gradients.size(); i++) {
+      gradients[i] = NDArray(cinputs[i].shape(), cinputs[i].ctx(),
+                             true, cinputs[i].dtype());
+      input_ptrs[i] = &cinputs[i];
+      gradient_ptrs[i] = &gradients[i];
+      grad_reqs[i] = kWriteTo;
+    }
+    Imperative::Get()->MarkVariables(input_ptrs, grad_reqs, gradient_ptrs);
+  }
+
+  std::vector<std::pair<std::string, std::string> > kwargs;
+  kwargs.push_back(std::pair<std::string, std::string>("inline_limit", "0"));
+  // Get input names.
+  const auto& idx = subgraph.indexed_graph();
+  std::vector<std::string> arg_names(idx.input_nodes().size());
+  for (size_t i = 0; i < idx.input_nodes().size(); ++i)
+    arg_names[i] = idx[idx.input_nodes()[i]].source->attrs.name;
+  // We don't have parameters for the cached op.
+  std::unordered_map<std::string, std::vector<NDArray> > params;
+  CachedOpPtr op = std::make_shared<CachedOp>(subgraph_sym, kwargs,
+ 

[GitHub] anirudh2290 opened a new pull request #11017: NEWS and README update to master

2018-05-21 Thread GitBox
anirudh2290 opened a new pull request #11017: NEWS and README update to master
URL: https://github.com/apache/incubator-mxnet/pull/11017
 
 
   ## Description ##
   Cherry-picked the NEWS and README changes and added a few more changes.
   
   
   ## Checklist ##
   ### Essentials ###
   Please feel free to remove inapplicable items for your PR.
   - [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to 
the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) 
created (except PRs with tiny changes)
   - [ ] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage:
   - Unit tests are added for small changes to verify correctness (e.g. adding 
a new operator)
   - Nightly tests are added for complicated/long-running ones (e.g. changing 
distributed kvstore)
   - Build tests will be added for build configuration changes (e.g. adding a 
new build option with NCCL)
   - [ ] Code is well-documented: 
   - For user-facing API changes, API doc string has been updated. 
   - For new C++ functions in header files, their functionalities and arguments 
are documented. 
   - For new examples, README.md is added to explain what the example does, 
the source of the dataset, expected performance on test set and reference to 
the original paper if applicable
   - Check the API doc at 
http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
   - [ ] To the best of my knowledge, examples are either not affected by this 
change, or have been fixed to be compatible with this change
   
   ### Changes ###
   - [ ] Feature1, tests, (and when applicable, API doc)
   - [ ] Feature2, tests, (and when applicable, API doc)
   
   ## Comments ##
   - If this change is a backward incompatible change, why must this change be 
made.
   - Interesting edge cases to note here
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on issue #10594: [example/sparse/factorization_machine/]add auc and fix the typo in re…

2018-05-21 Thread GitBox
szha commented on issue #10594: [example/sparse/factorization_machine/]add auc 
and fix the typo in re…
URL: https://github.com/apache/incubator-mxnet/pull/10594#issuecomment-390802678
 
 
   ping


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] zheng-da commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
zheng-da commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189731937
 
 

 ##
 File path: src/imperative/imperative_utils.h
 ##
 @@ -456,17 +458,24 @@ inline void PushOperator(const OpStatePtr& state,
   if (fcompute_ex != nullptr && dispatch_mode == DispatchMode::kFComputeEx) {
 const auto& run = [=](RunContext rctx,
   engine::CallbackOnComplete on_complete) {
-  OpContext opctx{is_train, rctx, on_complete, requested};
+  bool need_grad = Imperative::Get()->is_recording();
+  OpContext opctx{need_grad, is_train, rctx, on_complete, requested};
 #if MXNET_USE_MKLDNN == 1
   InvalidateOutputs(outputs, req);
 #endif
   fcompute_ex(state, opctx, inputs, req, outputs);
-      if (ctx.dev_mask() == gpu::kDevMask && exec_type == ExecType::kSync) {
+      if (ctx.dev_mask() == gpu::kDevMask && exec_type == ExecType::kSync
+          && rctx.get_stream<gpu>()) {
         rctx.get_stream<gpu>()->Wait();
       }
 };
 
-if (exec_type == ExecType::kSync) {
+// For operators with subgraphs, we need to invoke them in the main thread
+// instead of the threaded engine.
+if (!attrs.subgraphs.empty()) {
 
 Review comment:
   I think it can happen. For example, if we hybridize a block with control 
flow operators, the execution of these operators will happen here.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] asmushetzel commented on a change in pull request #10864: Support for axis parameter in linalg.gemm

2018-05-21 Thread GitBox
asmushetzel commented on a change in pull request #10864: Support for axis 
parameter in linalg.gemm
URL: https://github.com/apache/incubator-mxnet/pull/10864#discussion_r189729421
 
 

 ##
 File path: src/operator/tensor/la_op.h
 ##
 @@ -53,13 +54,17 @@ struct LaMatrixMacParam : public dmlc::Parameter<LaMatrixMacParam> {
 DMLC_DECLARE_FIELD(beta)
   .set_default(1.0)
   .describe("Scalar factor multiplied with C.");
+DMLC_DECLARE_FIELD(axis)
+  .set_default(-2)
+  .describe("Axis corresponding to the matrix rows.");
 
 Review comment:
   It's not similar to the theano's tensordot


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] slitsey opened a new issue #11016: [Feature request] temperature parameter in Softmax and SoftmaxOutput

2018-05-21 Thread GitBox
slitsey opened a new issue #11016: [Feature request] temperature parameter in 
Softmax and SoftmaxOutput
URL: https://github.com/apache/incubator-mxnet/issues/11016
 
 
   MXNet does not appear to have a native temperature parameter in its softmax 
functions. I would like this to be added, as it has many useful applications 
when learning a categorical probability distribution, especially in 
reinforcement learning settings. It should default to 1 to reproduce the 
current behavior.
   
   https://en.wikipedia.org/wiki/Softmax_function#Reinforcement_learning
   
   @eric-haibin-lin 
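   In the meantime, temperature scaling can be emulated by dividing the logits
before the softmax; a minimal sketch (`temperature` here is the proposed
parameter, not an existing MXNet argument):
   
   ```
   import mxnet as mx
   
   def softmax_with_temperature(logits, temperature=1.0):
       # softmax(logits / T): T > 1 flattens the distribution, T < 1 sharpens it
       return mx.nd.softmax(logits / temperature)
   
   print(softmax_with_temperature(mx.nd.array([1.0, 2.0, 3.0]), temperature=0.5))
   ```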


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] asmushetzel commented on a change in pull request #10864: Support for axis parameter in linalg.gemm

2018-05-21 Thread GitBox
asmushetzel commented on a change in pull request #10864: Support for axis 
parameter in linalg.gemm
URL: https://github.com/apache/incubator-mxnet/pull/10864#discussion_r189726042
 
 

 ##
 File path: src/operator/tensor/la_op.h
 ##
 @@ -53,13 +54,17 @@ struct LaMatrixMacParam : public dmlc::Parameter<LaMatrixMacParam> {
 DMLC_DECLARE_FIELD(beta)
   .set_default(1.0)
   .describe("Scalar factor multiplied with C.");
+DMLC_DECLARE_FIELD(axis)
+  .set_default(-2)
+  .describe("Axis corresponding to the matrix rows.");
 
 Review comment:
   I changed the operator description and added an example. Let me know if this 
clarifies things.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on issue #10719: Fix compile error in Windows-MSVC 2015

2018-05-21 Thread GitBox
szha commented on issue #10719: Fix compile error in Windows-MSVC 2015
URL: https://github.com/apache/incubator-mxnet/pull/10719#issuecomment-390795522
 
 
   Seems that this is no longer relevant. @dongeliu if this is still an issue, 
feel free to ping me to reopen, or open another pull request.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha closed pull request #10719: Fix compile error in Windows-MSVC 2015

2018-05-21 Thread GitBox
szha closed pull request #10719: Fix compile error in Windows-MSVC 2015
URL: https://github.com/apache/incubator-mxnet/pull/10719
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/CMakeLists.txt b/CMakeLists.txt
index ffa4d6549d6..70ead436fb7 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -187,6 +187,13 @@ if(USE_MKL_IF_AVAILABLE)
   endif()
 endif()
 
+if(FORCE_USE_MKLDNN AND MSVC)
+  include_directories(3rdparty/mkldnn/install/include)
+  list(APPEND mxnet_LINKER_LIBS 
${CMAKE_CURRENT_SOURCE_DIR}/3rdparty/mkldnn/install/lib/mkldnn.lib)
+  add_definitions(-DMXNET_USE_MKLDNN=1)
+  message('Force the use of MKL-DNN even without MKL')
+endif()
+
 # Allow Cuda compiles outside of src tree to find things in 'src' and 'include'
 include_directories(${CMAKE_CURRENT_SOURCE_DIR}/include)
 include_directories(${CMAKE_CURRENT_SOURCE_DIR}/src)
diff --git a/install.sh b/install.sh
new file mode 100644
index 000..c31f78204c0
--- /dev/null
+++ b/install.sh
@@ -0,0 +1,5 @@
+cp -r ./3rdparty/dmlc-core/include/dmlc /usr/include/
+cp -r ./3rdparty/nnvm/include/nnvm /usr/include/
+cp -r ./include/mxnet /usr/include/
+cp -r ./cpp-package/include/mxnet-cpp /usr/include/
+cp ./lib/* /usr/lib/
diff --git a/src/operator/nn/mkldnn/mkldnn_base.cc 
b/src/operator/nn/mkldnn/mkldnn_base.cc
index c0e1ee6aaa6..5febe9c6c84 100644
--- a/src/operator/nn/mkldnn/mkldnn_base.cc
+++ b/src/operator/nn/mkldnn/mkldnn_base.cc
@@ -349,7 +349,11 @@ static bool SimilarArray(const mxnet::NDArray &arr1, const mxnet::NDArray &arr2,
   arr2.IsMKLDNNData() ? buf2.data().dptr_ : arr2.data().dptr_);
   std::atomic<bool> success(true);
 #pragma omp parallel for
+#ifdef _MSC_VER
+  for (long long i = 0; i < (long long)arr1.shape().Size(); i++) {
+#else
   for (size_t i = 0; i < arr1.shape().Size(); i++) {
+#endif
 if (std::abs(data1[i] - data2[i]) > atol + rtol * std::abs(data2[i]))
   success.store(false);
   }
diff --git a/src/operator/tensor/elemwise_sum.h 
b/src/operator/tensor/elemwise_sum.h
index acf73e722b4..51dd08c32bb 100644
--- a/src/operator/tensor/elemwise_sum.h
+++ b/src/operator/tensor/elemwise_sum.h
@@ -37,7 +37,7 @@
 namespace mxnet {
 namespace op {
 
-struct Sum {
+struct StructSum {
   template<typename DType>
   MSHADOW_XINLINE static DType sum(int i, const DType* a) {
     return a[i];
@@ -70,14 +70,14 @@ void ElementWiseSumCompute_(const nnvm::NodeAttrs& attrs,
     case 2: {
       DType* in_0_dptr = in_data[0].dptr<DType>();
       DType* in_1_dptr = in_data[1].dptr<DType>();
-      Kernel<Sum, xpu>::Launch(s, out_size, out_dptr, req[0], in_0_dptr, in_1_dptr);
+      Kernel<StructSum, xpu>::Launch(s, out_size, out_dptr, req[0], in_0_dptr, in_1_dptr);
       break;
     }
     case 3: {
       DType* in_0_dptr = in_data[0].dptr<DType>();
       DType* in_1_dptr = in_data[1].dptr<DType>();
       DType* in_2_dptr = in_data[2].dptr<DType>();
-      Kernel<Sum, xpu>::Launch(s, out_size, out_dptr, req[0], in_0_dptr, in_1_dptr, in_2_dptr);
+      Kernel<StructSum, xpu>::Launch(s, out_size, out_dptr, req[0], in_0_dptr, in_1_dptr, in_2_dptr);
       break;
     }
     case 4: {
@@ -85,16 +85,16 @@ void ElementWiseSumCompute_(const nnvm::NodeAttrs& attrs,
       DType* in_1_dptr = in_data[1].dptr<DType>();
       DType* in_2_dptr = in_data[2].dptr<DType>();
       DType* in_3_dptr = in_data[3].dptr<DType>();
-      Kernel<Sum, xpu>::Launch(s, out_size, out_dptr, req[0], in_0_dptr, in_1_dptr, in_2_dptr,
-                               in_3_dptr);
+      Kernel<StructSum, xpu>::Launch(s, out_size, out_dptr, req[0], in_0_dptr, in_1_dptr, in_2_dptr,
+                                     in_3_dptr);
       break;
     }
     default: {
       DType* in_0_dptr = in_data[0].dptr<DType>();
-      Kernel<Sum, xpu>::Launch(s, out_size, out_dptr, req[0], in_0_dptr);
+      Kernel<StructSum, xpu>::Launch(s, out_size, out_dptr, req[0], in_0_dptr);
       for (size_t i = 1; i < size; ++i) {
         DType* in_dptr = in_data[i].dptr<DType>();
-        Kernel<Sum, xpu>::Launch(s, out_size, out_dptr, req[0], out_dptr, in_dptr);
+        Kernel<StructSum, xpu>::Launch(s, out_size, out_dptr, req[0], out_dptr, in_dptr);
       }
       break;
     }


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on a change in pull request #10727: cuda archs for TX1 and TX2

2018-05-21 Thread GitBox
szha commented on a change in pull request #10727: cuda archs for TX1 and TX2
URL: https://github.com/apache/incubator-mxnet/pull/10727#discussion_r189725219
 
 

 ##
 File path: Makefile
 ##
 @@ -287,7 +287,7 @@ endif
 # be JIT-compiled by the updated driver from the included PTX.
 ifeq ($(USE_CUDA), 1)
 ifeq ($(CUDA_ARCH),)
-   KNOWN_CUDA_ARCHS := 30 35 50 52 60 61 70
+   KNOWN_CUDA_ARCHS := 30 35 50 52 53 60 61 62 70
 
 Review comment:
   could you make KNOWN_CUDA_ARCHS configurable too? i.e., guard it with
`ifeq ($(KNOWN_CUDA_ARCHS),)` and only then set the default value.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kpmurali commented on issue #10895: [MXNET-413] Fixing the broken links - Week of 5/7

2018-05-21 Thread GitBox
kpmurali commented on issue #10895: [MXNET-413] Fixing the broken links - Week 
of 5/7
URL: https://github.com/apache/incubator-mxnet/pull/10895#issuecomment-390795092
 
 
   @aaronmarkham  Just to confirm, you mean the docs/install/download.html file 
right?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.

2018-05-21 Thread zhasheng
This is an automated email from the ASF dual-hosted git repository.

zhasheng pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new d78416c  Bump the publish timestamp.
d78416c is described below

commit d78416c6160b16d52d7079a50a0c7944f1ff203a
Author: mxnet-ci 
AuthorDate: Mon May 21 21:53:43 2018 +

Bump the publish timestamp.
---
 date.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/date.txt b/date.txt
new file mode 100644
index 000..50dcfb7
--- /dev/null
+++ b/date.txt
@@ -0,0 +1 @@
+Mon May 21 21:53:43 UTC 2018

-- 
To stop receiving notification emails like this one, please contact
zhash...@apache.org.


[GitHub] aaronmarkham commented on issue #10895: [MXNET-413] Fixing the broken links - Week of 5/7

2018-05-21 Thread GitBox
aaronmarkham commented on issue #10895: [MXNET-413] Fixing the broken links - 
Week of 5/7
URL: https://github.com/apache/incubator-mxnet/pull/10895#issuecomment-390794528
 
 
   Yes, may as well add it now. @kpmurali can you add 1.2.0 to the downloads 
page and resubmit?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on issue #10827: [MXNET-405][WIP] Add 2 new pipelines to the Official CI and run nightly tests.

2018-05-21 Thread GitBox
szha commented on issue #10827: [MXNET-405][WIP] Add 2 new pipelines to the 
Official CI and run nightly tests. 
URL: https://github.com/apache/incubator-mxnet/pull/10827#issuecomment-390793258
 
 
   what's the status?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on issue #10882: move exec.reshape to backend

2018-05-21 Thread GitBox
szha commented on issue #10882: move exec.reshape to backend
URL: https://github.com/apache/incubator-mxnet/pull/10882#issuecomment-390793106
 
 
   ping @piiswrong 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on issue #10889: [MXNET-382] Shape and Size Operator

2018-05-21 Thread GitBox
szha commented on issue #10889: [MXNET-382] Shape and Size Operator
URL: https://github.com/apache/incubator-mxnet/pull/10889#issuecomment-390792981
 
 
   Ping for another round of reviews.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on issue #10893: benchmark with float16

2018-05-21 Thread GitBox
szha commented on issue #10893: benchmark with float16
URL: https://github.com/apache/incubator-mxnet/pull/10893#issuecomment-390792842
 
 
   ping


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on issue #10895: [MXNET-413] Fixing the broken links - Week of 5/7

2018-05-21 Thread GitBox
szha commented on issue #10895: [MXNET-413] Fixing the broken links - Week of 
5/7
URL: https://github.com/apache/incubator-mxnet/pull/10895#issuecomment-390792675
 
 
   Triggered build again. @anirudh2290 @aaronmarkham should 1.2.0 be listed too?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on issue #10951: Fix broken build with cython 0.28

2018-05-21 Thread GitBox
szha commented on issue #10951: Fix broken build with cython 0.28
URL: https://github.com/apache/incubator-mxnet/pull/10951#issuecomment-390791538
 
 
   How is this tested? 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] rahul003 commented on a change in pull request #9994: [MXNET-59] Tensorboard: Add histogram callback

2018-05-21 Thread GitBox
rahul003 commented on a change in pull request #9994: [MXNET-59] Tensorboard: 
Add histogram callback
URL: https://github.com/apache/incubator-mxnet/pull/9994#discussion_r189720574
 
 

 ##
 File path: python/mxnet/contrib/tensorboard.py
 ##
 @@ -71,3 +72,43 @@ def __call__(self, param):
         if self.prefix is not None:
             name = '%s-%s' % (self.prefix, name)
         self.summary_writer.add_scalar(name, value)
+
+    def node_histogram_visualization(self, prefix=None, node_names=None, bins="auto"):
+        """Node histogram visualization in TensorBoard.
+        This callback works almost the same as `callback.module_checkpoint`,
+        but writes a TensorBoard event file for visualization.
+        For more usage, please refer to https://github.com/dmlc/tensorboard
+
+        Parameters
+        ----------
+        prefix : str
+            Prefix for a metric name of `histograms` and `distributions` value.
+        node_names : list of str, optional
+            Names of the nodes you want to visualize.
+            If set to 'None', this callback visualizes all nodes' histograms
+            and distributions. Default node_names = None.
+        bins : str
+            One of {'tensorflow', 'auto', 'fd', ...}; this determines how the
+            bins are made. You can find other options in:
+            https://docs.scipy.org/doc/numpy/reference/generated/numpy.histogram.html
+            Default bins = 'auto'
+        """
+        self.histogram_prefix = prefix
+        self.node_names = node_names
+        self.bins = bins
+
+        # pylint: disable=unused-argument
+        def _callback(iter_no, sym=None, arg=None, aux=None):
+            """Callback to log node histogram visualization in TensorBoard."""
+            for k, v in arg.items():
+                if self.node_names is None or k in self.node_names:
+                    if self.histogram_prefix is not None:
+                        name = '%s-%s' % (self.histogram_prefix, k)
+                    self.summary_writer.add_histogram(name, v, global_step=iter_no,
+                                                      bins=self.bins)
+            for k, v in aux.items():
+                if self.node_names is None or k in self.node_names:
+                    if self.histogram_prefix is not None:
+                        name = '%s-%s' % (self.histogram_prefix, k)
+                    self.summary_writer.add_histogram(name, v, global_step=iter_no,
+                                                      bins=self.bins)
 
 Review comment:
   This PR is still helpful for using symbolic mode with mxboard. I'm trying
to use it; however, `name` here is used without being initialized.
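   A minimal fix would be to initialize `name` before the prefix check, e.g.
(a sketch, not part of this PR):
   
   ```
   for k, v in arg.items():
       if self.node_names is None or k in self.node_names:
           name = k  # default to the bare node name
           if self.histogram_prefix is not None:
               name = '%s-%s' % (self.histogram_prefix, k)
           self.summary_writer.add_histogram(name, v, global_step=iter_no,
                                             bins=self.bins)
   ```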


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] zheng-da commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
zheng-da commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189711919
 
 

 ##
 File path: python/mxnet/symbol/contrib.py
 ##
 @@ -91,3 +98,154 @@ def rand_zipfian(true_classes, num_sampled, range_max):
 expected_prob_sampled = ((sampled_cls_fp64 + 2.0) / (sampled_cls_fp64 + 
1.0)).log() / log_range
 expected_count_sampled = expected_prob_sampled * num_sampled
 return sampled_classes, expected_count_true, expected_count_sampled
+
+def _get_graph_inputs(subg):
+    num_handles = ctypes.c_int(1000)
 
 Review comment:
   It is the maximal size of the buffer. I need to fix this.
   Ideally, the C API would allocate a memory buffer and return it to Python,
but I'm not sure how to do that with Python ctypes, so I allocate a memory
buffer in Python and pass it to the C API.
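   For reference, the usual ctypes pattern is to let the C side own the
allocation and hand back a pointer plus a count. A sketch with a hypothetical
C function `GetInputSymbols` (this is not the signature of the current
`MXSymbolGetInputSymbols`):
   
   ```
   import ctypes
   
   # hypothetical C signature:
   #   int GetInputSymbols(SymbolHandle sym, SymbolHandle **out, int *count);
   out = ctypes.POINTER(SymbolHandle)()
   count = ctypes.c_int(0)
   check_call(_LIB.GetInputSymbols(subg.handle,
                                   ctypes.byref(out),
                                   ctypes.byref(count)))
   syms = [Symbol(SymbolHandle(out[i])) for i in range(count.value)]
   ```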


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] eric-haibin-lin commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-21 Thread GitBox
eric-haibin-lin commented on a change in pull request #10536: [MXNET-317] Add 
Data Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189707688
 
 

 ##
 File path: python/mxnet/gluon/contrib/parallel.py
 ##
 @@ -0,0 +1,362 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=broad-except, redefined-builtin
+"""Synchronized DataParallel"""
+import threading
+from ... import autograd
+from ...ndarray import NDArray
+from ..utils import split_and_load
+
+__all__ = ['DataParallelModel', 'DataParallelCriterion', 'Barrier']
+
+
+class Barrier(object):
+    """Shared NDArray for cross device operation.
+
+    A cross device operation that allows synchronized push and pull. It can be
+    used in cross-GPU Synchronized Batch Normalization and sparse blocks.
+
+    Parameters
+    ----------
+    counter : int
+        Number of devices.
+    operation : callable
+        The cross device operation to apply (e.g. AllReduce).
+    """
+    def __init__(self, counter, operation):
+        self._mutex = threading.Lock()
+        self.all_tasks_done = threading.Condition(self._mutex)
+        self.counter = counter
+        self.op = operation
+        self._clear()
+
+    def push(self, x):
+        """Push an NDArray from one of the devices.
+        Input:
+            x (NDArray)
+
+        Output:
+            idx (int), the output index
+        """
+        with self._mutex:
+            if self.push_tasks == 0:
+                self._clear()
+            self.list.append(x)
+            idx = len(self.list) - 1
+            self.push_tasks -= 1
+
+        with self.all_tasks_done:
+            if self.push_tasks == 0:
+                self.all_tasks_done.notify_all()
+            while self.push_tasks:
+                self.all_tasks_done.wait()
+
+        self._sync_op()
+        return idx
+
+    def pull(self, idx):
+        """Pull the output to each device
+        Input:
+            idx (int)
+
+        Output:
+            out (NDArray)
+        """
+        return self.out[idx]
+
+    def _sync_op(self):
+        with self._mutex:
+            if self.reduce_tasks == 1:
+                assert(len(self.list) == self.counter)
+                self.out = self.op(*self.list)
+                if isinstance(self.out, (list, tuple)):
+                    for xi in self.out:
+                        xi.wait_to_read()
+                else:
+                    self.out.wait_to_read()
+                self.reduce_tasks -= 1
+            else:
+                self.reduce_tasks -= 1
+
+        with self.all_tasks_done:
+            if self.reduce_tasks == 0:
+                self.all_tasks_done.notify_all()
+            while self.reduce_tasks:
+                self.all_tasks_done.wait()
+
+    def _clear(self):
+        self.list = []
+        self.push_tasks = self.counter
+        self.reduce_tasks = self.counter
+
+    def __len__(self):
+        return len(self.list)
+
+    def __repr__(self):
+        return 'Barrier'
+
+
+class DataParallelModel(object):
+    """Data Parallelism
+
+    Hides the difference between single and multiple GPUs from the user.
+    This container parallelizes the application of the given module by
+    splitting the input across the specified devices by chunking in the
+    batch dimension.
+    In the forward pass, the module is replicated on each device,
+    and each replica handles a portion of the input. During the backwards pass,
+    gradients from each replica are summed into the original module.
+    Note that the outputs are not gathered, please use compatible
+    :class:`mxnet.gluon.contrib.DataParallelCriterion`.
+
+    The batch size should be larger than the number of GPUs used. It should
+    also be an integer multiple of the number of GPUs so that each chunk is
+    the same size (so that each GPU processes the same number of samples).
+
+    Parameters
+    ----------
+    module : object
+        Network to be parallelized.
+    ctx_list : list
+        A list of contexts
+    sync : bool
+        enable synchronization (default: False).
+
+
+    Inputs:
+        - **inputs**: list of input (NDArrays)
+
+    Outputs:
+        - **outputs**: list of output 

[GitHub] eric-haibin-lin commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-21 Thread GitBox
eric-haibin-lin commented on a change in pull request #10536: [MXNET-317] Add 
Data Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189704701
 
 

 ##
 File path: tests/python/unittest/test_contrib_parallel.py
 ##
 @@ -0,0 +1,100 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import mxnet as mx
+from mxnet import nd, autograd, gluon
+from mxnet.gluon import nn, Block
+from mxnet.gluon.contrib.parallel import *
+from numpy.testing import assert_allclose, assert_array_equal
+
+def test_data_parallel():
+    # test gluon.contrib.parallel.DataParallelModel
+    net = nn.HybridSequential()
+    with net.name_scope():
+        net.add(nn.Conv2D(in_channels=1, channels=20, kernel_size=5))
+        net.add(nn.Activation('relu'))
+        net.add(nn.MaxPool2D(pool_size=2, strides=2))
+        net.add(nn.Conv2D(in_channels=20, channels=50, kernel_size=5))
+        net.add(nn.Activation('relu'))
+        net.add(nn.MaxPool2D(pool_size=2, strides=2))
+        # The Flatten layer collapses all axes, except the first one, into one axis.
+        net.add(nn.Flatten())
+        net.add(nn.Dense(512, in_units=800))
+        net.add(nn.Activation('relu'))
+        net.add(nn.Dense(10, in_units=512))
+
+    net.collect_params().initialize()
+    criterion = gluon.loss.SoftmaxCELoss(axis=1)
+
+    def test_net_sync(net, criterion, sync, nDevices):
 
 Review comment:
   nDevices -> num_devices




[GitHub] eric-haibin-lin commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-21 Thread GitBox
eric-haibin-lin commented on a change in pull request #10536: [MXNET-317] Add 
Data Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189710149
 
 

 ##
 File path: python/mxnet/gluon/contrib/parallel.py
 ##
 @@ -0,0 +1,343 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=broad-except, redefined-builtin
+"""Synchronized DataParallel"""
+import threading
+from ... import autograd
+from ...ndarray import NDArray
+from ..utils import split_and_load
+
+__all__ = ['DataParallelModel', 'DataParallelCriterion', 'Barrier']
+
+
+class Barrier(object):
+    """Shared NDArray for cross device operations.
+
+    A cross device operation that allows synchronized push and pull. It can be
+    used in Cross-gpu Synchronized Batch Normalization and Sparse Blocks.
+
+    Parameters
+    ----------
+    counter : int
+        Number of devices.
+    operation : callable
+        The cross device operation to apply (e.g. AllReduce).
+    """
+    def __init__(self, counter, operation):
+        self.mutex = threading.Lock()
+        self.all_tasks_done = threading.Condition(self.mutex)
+        self.counter = counter
+        self.op = operation
+        self._clear()
+
+    def push(self, x):
+        """Push an NDArray from one of the devices.
+        Input:
+            x (NDArray)
+
+        Output:
+            idx (int), the output index
+        """
+        with self.mutex:
+            if self.push_tasks == 0:
+                self._clear()
+            self.list.append(x)
+            idx = len(self.list) - 1
+            self.push_tasks -= 1
+
+        with self.all_tasks_done:
+            if self.push_tasks == 0:
+                self.all_tasks_done.notify_all()
+            while self.push_tasks:
+                self.all_tasks_done.wait()
+
+        self._sync_op()
+        return idx
+
+    def pull(self, idx):
+        """Pull the output to each device
+        Input:
+            idx (int)
+
+        Output:
+            out (NDArray)
+        """
+        return self.out[idx]
+
+    def _sync_op(self):
+        with self.mutex:
+            if self.reduce_tasks == 1:
+                assert(len(self.list) == self.counter)
+                self.out = self.op(*self.list)
+                if isinstance(self.out, (list, tuple)):
+                    for xi in self.out:
+                        xi.wait_to_read()
 
 Review comment:
   Is the async execution engine not able to handle this?




[GitHub] eric-haibin-lin commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-21 Thread GitBox
eric-haibin-lin commented on a change in pull request #10536: [MXNET-317] Add 
Data Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189708470
 
 

 ##
 File path: tests/python/unittest/test_contrib_parallel.py
 ##
 @@ -0,0 +1,100 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import mxnet as mx
+from mxnet import nd, autograd, gluon
+from mxnet.gluon import nn, Block
+from mxnet.gluon.contrib.parallel import *
+from numpy.testing import assert_allclose, assert_array_equal
+
+def test_data_parallel():
+    # test gluon.contrib.parallel.DataParallelModel
+    net = nn.HybridSequential()
+    with net.name_scope():
+        net.add(nn.Conv2D(in_channels=1, channels=20, kernel_size=5))
+        net.add(nn.Activation('relu'))
+        net.add(nn.MaxPool2D(pool_size=2, strides=2))
+        net.add(nn.Conv2D(in_channels=20, channels=50, kernel_size=5))
+        net.add(nn.Activation('relu'))
+        net.add(nn.MaxPool2D(pool_size=2, strides=2))
+        # The Flatten layer collapses all axes, except the first one, into one axis.
+        net.add(nn.Flatten())
+        net.add(nn.Dense(512, in_units=800))
+        net.add(nn.Activation('relu'))
+        net.add(nn.Dense(10, in_units=512))
+
+    net.collect_params().initialize()
+    criterion = gluon.loss.SoftmaxCELoss(axis=1)
+
+    def test_net_sync(net, criterion, sync, nDevices):
+        ctx_list = [mx.cpu(0) for i in range(nDevices)]
+        net = DataParallelModel(net, ctx_list, sync=sync)
+        criterion = DataParallelCriterion(criterion, ctx_list, sync=sync)
+        iters = 100
+        # train mode
+        for i in range(iters):
+            x = mx.random.uniform(shape=(8, 1, 28, 28))
+            t = nd.ones(shape=(8))
+            with autograd.record():
+                y = net(x)
+                loss = criterion(y, t)
+                autograd.backward(loss)
+        # evaluation mode
+        for i in range(iters):
+            x = mx.random.uniform(shape=(8, 1, 28, 28))
+            y = net(x)
+
+    test_net_sync(net, criterion, True, 1)
+    test_net_sync(net, criterion, True, 2)
+    test_net_sync(net, criterion, False, 1)
+    test_net_sync(net, criterion, False, 2)
+
+
+def test_parallel_barrier():
+    def my_callable(*inputs):
+        return inputs
+
+    class MyLayer(Block):
+        def __init__(self, nGPU):
+            super(MyLayer, self).__init__()
+            self.barrier = Barrier(nGPU, my_callable)
+
+        def forward(self, x):
+            idx = self.barrier.push(x)
+            y = self.barrier.pull(idx)
+            assert_allclose(y.asnumpy(), x.asnumpy(), rtol=1e-2, atol=1e-4)
+            return y
+
+    nDevices = 2
+    ctx_list = [mx.cpu(0) for i in range(nDevices)]
+    net = MyLayer(nDevices)
+    net = DataParallelModel(net, ctx_list, sync=True)
+    iters = 100
+    # train mode
+    for i in range(iters):
+        x = mx.random.uniform(shape=(8, 1, 28, 28))
+        with autograd.record():
+            y = net(x)
+    # evaluation mode
+    for i in range(iters):
+        x = mx.random.uniform(shape=(8, 1, 28, 28))
+        y = net(x)
 
 Review comment:
   what specifically are you checking here?
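
One way to make the loop assert something concrete is to compare the parallel outputs against the unwrapped network (a sketch, assuming DataParallelModel splits a single batch NDArray across the contexts and returns one output per device, as its docstring describes):

import mxnet as mx
from mxnet.gluon import nn
from mxnet.gluon.contrib.parallel import DataParallelModel
from numpy.testing import assert_allclose

net = nn.Dense(4, in_units=4)
net.initialize()
x = mx.nd.uniform(shape=(8, 4))
expected = net(x)                      # reference: single-context forward

pnet = DataParallelModel(net, [mx.cpu(0), mx.cpu(0)], sync=True)
outs = pnet(x)                         # list of per-device outputs, chunked on axis 0
got = mx.nd.concat(*outs, dim=0)
assert_allclose(got.asnumpy(), expected.asnumpy(), rtol=1e-5, atol=1e-7)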




[GitHub] eric-haibin-lin commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-21 Thread GitBox
eric-haibin-lin commented on a change in pull request #10536: [MXNET-317] Add 
Data Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189707012
 
 

 ##
 File path: python/mxnet/gluon/contrib/parallel.py
 ##
 @@ -0,0 +1,362 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: disable=broad-except, redefined-builtin
+"""Synchronized DataParallel"""
+import threading
+from ... import autograd
+from ...ndarray import NDArray
+from ..utils import split_and_load
+
+__all__ = ['DataParallelModel', 'DataParallelCriterion', 'Barrier']
+
+
+class Barrier(object):
+    """Shared NDArray for cross device operations.
+
+    A cross device operation that allows synchronized push and pull. It can be
+    used in Cross-gpu Synchronized Batch Normalization and Sparse Blocks.
+
+    Parameters
+    ----------
+    counter : int
+        Number of devices.
+    operation : callable
+        The cross device operation to apply (e.g. AllReduce).
+    """
+    def __init__(self, counter, operation):
+        self._mutex = threading.Lock()
+        self.all_tasks_done = threading.Condition(self._mutex)
+        self.counter = counter
+        self.op = operation
+        self._clear()
+
+    def push(self, x):
+        """Push an NDArray from one of the devices.
+        Input:
+            x (NDArray)
+
+        Output:
+            idx (int), the output index
+        """
+        with self._mutex:
+            if self.push_tasks == 0:
+                self._clear()
+            self.list.append(x)
+            idx = len(self.list) - 1
+            self.push_tasks -= 1
+
+        with self.all_tasks_done:
+            if self.push_tasks == 0:
+                self.all_tasks_done.notify_all()
+            while self.push_tasks:
+                self.all_tasks_done.wait()
+
+        self._sync_op()
+        return idx
+
+    def pull(self, idx):
+        """Pull the output to each device
+        Input:
+            idx (int)
+
+        Output:
+            out (NDArray)
+        """
+        return self.out[idx]
+
+    def _sync_op(self):
+        with self._mutex:
+            if self.reduce_tasks == 1:
+                assert(len(self.list) == self.counter)
+                self.out = self.op(*self.list)
+                if isinstance(self.out, (list, tuple)):
+                    for xi in self.out:
+                        xi.wait_to_read()
+                else:
+                    self.out.wait_to_read()
+                self.reduce_tasks -= 1
+            else:
+                self.reduce_tasks -= 1
+
+        with self.all_tasks_done:
+            if self.reduce_tasks == 0:
+                self.all_tasks_done.notify_all()
+            while self.reduce_tasks:
+                self.all_tasks_done.wait()
+
+    def _clear(self):
+        self.list = []
+        self.push_tasks = self.counter
+        self.reduce_tasks = self.counter
+
+    def __len__(self):
+        return len(self.list)
+
+    def __repr__(self):
+        return 'Barrier'
+
+
+class DataParallelModel(object):
+    """Data Parallelism
+
+    Hide the difference of single/multiple GPUs to the user.
 
 Review comment:
   Hide .. from .. 
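
The chunking that this docstring describes is the same splitting gluon's split_and_load helper performs (parallel.py already imports it); for example:

import mxnet as mx
from mxnet.gluon.utils import split_and_load

ctx_list = [mx.cpu(0), mx.cpu(0)]
batch = mx.nd.uniform(shape=(8, 1, 28, 28))
shards = split_and_load(batch, ctx_list)   # two (4, 1, 28, 28) chunks, one per context
print([s.shape for s in shards])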




[GitHub] eric-haibin-lin commented on a change in pull request #10536: [MXNET-317] Add Data Parallel

2018-05-21 Thread GitBox
eric-haibin-lin commented on a change in pull request #10536: [MXNET-317] Add 
Data Parallel
URL: https://github.com/apache/incubator-mxnet/pull/10536#discussion_r189705455
 
 

 ##
 File path: tests/python/unittest/test_contrib_parallel.py
 ##
 @@ -0,0 +1,100 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import mxnet as mx
+from mxnet import nd, autograd, gluon
+from mxnet.gluon import nn, Block
+from mxnet.gluon.contrib.parallel import *
+from numpy.testing import assert_allclose, assert_array_equal
+
+def test_data_parallel():
+    # test gluon.contrib.parallel.DataParallelModel
+    net = nn.HybridSequential()
+    with net.name_scope():
+        net.add(nn.Conv2D(in_channels=1, channels=20, kernel_size=5))
+        net.add(nn.Activation('relu'))
+        net.add(nn.MaxPool2D(pool_size=2, strides=2))
+        net.add(nn.Conv2D(in_channels=20, channels=50, kernel_size=5))
+        net.add(nn.Activation('relu'))
+        net.add(nn.MaxPool2D(pool_size=2, strides=2))
+        # The Flatten layer collapses all axes, except the first one, into one axis.
+        net.add(nn.Flatten())
+        net.add(nn.Dense(512, in_units=800))
+        net.add(nn.Activation('relu'))
+        net.add(nn.Dense(10, in_units=512))
+
+    net.collect_params().initialize()
+    criterion = gluon.loss.SoftmaxCELoss(axis=1)
+
+    def test_net_sync(net, criterion, sync, nDevices):
+        ctx_list = [mx.cpu(0) for i in range(nDevices)]
+        net = DataParallelModel(net, ctx_list, sync=sync)
+        criterion = DataParallelCriterion(criterion, ctx_list, sync=sync)
+        iters = 100
+        # train mode
+        for i in range(iters):
+            x = mx.random.uniform(shape=(8, 1, 28, 28))
+            t = nd.ones(shape=(8))
+            with autograd.record():
+                y = net(x)
+                loss = criterion(y, t)
+                autograd.backward(loss)
+        # evaluation mode
+        for i in range(iters):
+            x = mx.random.uniform(shape=(8, 1, 28, 28))
+            y = net(x)
+
+    test_net_sync(net, criterion, True, 1)
+    test_net_sync(net, criterion, True, 2)
+    test_net_sync(net, criterion, False, 1)
+    test_net_sync(net, criterion, False, 2)
+
+
+def test_parallel_barrier():
+    def my_callable(*inputs):
+        return inputs
+
+    class MyLayer(Block):
+        def __init__(self, nGPU):
+            super(MyLayer, self).__init__()
+            self.barrier = Barrier(nGPU, my_callable)
+
+        def forward(self, x):
+            idx = self.barrier.push(x)
+            y = self.barrier.pull(idx)
+            assert_allclose(y.asnumpy(), x.asnumpy(), rtol=1e-2, atol=1e-4)
 
 Review comment:
   should x and y be exactly the same? The tolerance seems large




[GitHub] zheng-da commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
zheng-da commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189709942
 
 

 ##
 File path: include/mxnet/c_api.h
 ##
 @@ -1056,6 +1056,16 @@ MXNET_DLL int MXSymbolListAtomicSymbolCreators(mx_uint 
*out_size,
  */
 MXNET_DLL int MXSymbolGetAtomicSymbolName(AtomicSymbolCreator creator,
   const char **name);
+
+/*!
+ * \brief Get the input symbols of the graph.
+ * \param sym The graph.
+ * \param outs The input symbols of the graph.
+ * \param out_size the number of input symbols returned.
+ */
+MXNET_DLL int MXSymbolGetInputSymbols(SymbolHandle sym, SymbolHandle **outs,
 
 Review comment:
   here i need to get a list of input symbols instead of names. do you suggest 
merging these two APIs?




[GitHub] piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189501415
 
 

 ##
 File path: python/mxnet/ndarray/contrib.py
 ##
 @@ -96,3 +98,78 @@ def rand_zipfian(true_classes, num_sampled, range_max, 
ctx=None):
 expected_count_sampled = expected_prob_sampled * num_sampled
 return sampled_classes, expected_count_true, expected_count_sampled
 # pylint: enable=line-too-long
+
+def foreach(func, data, init_states):
+    """Run a for loop with user-defined computation over NDArrays on dimension 0.
+
+    This operator simulates a for loop and func has the computation for an iteration
+    of the for loop. It runs the computation in func on each slice from the input
+    NDArrays.
+
+    func takes two arguments as input and outputs a tuple of two elements,
+    as illustrated below:
+
+    out, states = func(data1, states)
+
+    data1 can be either a symbol or a list of symbols. If data is a symbol,
+    data1 is a symbol. Otherwise, data1 is a list of symbols and has the same
+    size as data. states is a list of symbols and has the same size as init_states.
+    Similarly, out can be either a symbol or a list of symbols, which are concatenated
+    as the first output of foreach; states from the last execution of func
+    are the second output of foreach.
+
+    The computation done by this operator is equivalent to the pseudo code below
+    when the input data is NDArray:
+
+    states = init_states
+    outs = []
+    for i in data.shape[0]:
+        s = data[i]
+        out, states = func(s, states)
+        outs.append(out)
+    outs = stack(*outs)
+
+
+    Parameters
+    ----------
+    func : a Python function.
+        Define computation in an iteration.
+    data: a symbol or a list of symbols.
+        The input data.
+    init_states: a list of symbols.
+        The initial values of the loop states.
+    name: string.
+        The name of the operator.
+
+    Returns
+    -------
+    outputs: a Symbol or a list of Symbols.
+        The output data concatenated from the output of all iterations.
+    states: a list of Symbols.
+        The loop states in the last iteration.
+
+    Examples
+    --------
+    >>> step = lambda data, states: (data + states[0], [states[0] * 2])
+    >>> data = mx.nd.random.uniform(shape=(2, 10))
+    >>> states = [mx.nd.random.uniform(shape=(10))]
+    >>> outs, states = mx.nd.contrib.foreach(step, data, states)
+    """
+
+    assert isinstance(init_states, list), "init_states should be a list"
+    states = init_states
+    outputs = []
+    for i in range(data.shape[0]):
+        ele = data[i]
+        outs, states = func(ele, states)
+        outs = _as_list(outs)
+        if i == 0:
+            # outputs is a list of lists
+            for out in outs:
+                outputs.append([out])
+        else:
+            for j, out in enumerate(outs):
+                outputs[j].append(out)
+    for out in outputs:
+        out = stack(*out)
+    return (outputs, states)
 
 Review comment:
   Return value is always a list?
   If func returns a single value, shouldn't outputs also be a single value?
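
For reference, the documented semantics on a concrete input (a sketch that just follows the pseudo code in the docstring above; a running sum carried through a single loop state):

import mxnet as mx

def step(data, states):
    out = data + states[0]     # output of this iteration
    return out, [out]          # the running sum is also the next state

data = mx.nd.array([[1, 2], [3, 4], [5, 6]])   # iterate over axis 0
init = [mx.nd.zeros((2,))]
outs, final_states = mx.nd.contrib.foreach(step, data, init)
# outs stacks the per-iteration outputs: [[1, 2], [4, 6], [9, 12]]
# final_states holds the last state:     [[9, 12]]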




[GitHub] piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189500807
 
 

 ##
 File path: include/mxnet/c_api.h
 ##
 @@ -1056,6 +1056,16 @@ MXNET_DLL int MXSymbolListAtomicSymbolCreators(mx_uint 
*out_size,
  */
 MXNET_DLL int MXSymbolGetAtomicSymbolName(AtomicSymbolCreator creator,
   const char **name);
+
+/*!
+ * \brief Get the input symbols of the graph.
+ * \param sym The graph.
+ * \param outs The input symbols of the graph.
+ * \param out_size the number of input symbols returned.
+ */
+MXNET_DLL int MXSymbolGetInputSymbols(SymbolHandle sym, SymbolHandle **outs,
 
 Review comment:
   We already have an ListInput api right?
   should be **inputs?




[GitHub] piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189702893
 
 

 ##
 File path: src/c_api/c_api_symbolic.cc
 ##
 @@ -38,10 +38,11 @@ void RegisterLegacyOpProp();
 void RegisterLegacyNDFunc();
 }
 const std::vector<std::string> kHiddenKeys = {
-  "ctx_group", "lr_mult", "wd_mult", "force_mirroring", "mirror_stage"
+  "ctx_group", "lr_mult", "wd_mult", "force_mirroring", "mirror_stage", "subgraph_name"
 
 Review comment:
   This was a legacy code handling mechanism.
   You should use __subgraph_name__ directly




[GitHub] piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189500871
 
 

 ##
 File path: include/mxnet/ndarray.h
 ##
 @@ -693,6 +693,10 @@ class NDArray {
   NDArray MKLDNNDataReshape(const TShape &shape) const;
 #endif
 
+  const nnvm::NodeEntry () const {
 
 Review comment:
   use entry() for simple accessor




[GitHub] piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r182570388
 
 

 ##
 File path: python/mxnet/ndarray/contrib.py
 ##
 @@ -96,3 +96,18 @@ def rand_zipfian(true_classes, num_sampled, range_max, 
ctx=None):
 expected_count_sampled = expected_prob_sampled * num_sampled
 return sampled_classes, expected_count_true, expected_count_sampled
 # pylint: enable=line-too-long
+
+def foreach(func, input, init_states, back_prop=False, name="foreach"):
 
 Review comment:
   back_prop -> Imperative::Get()->is_recording()
   add OpContext::is_record at backend




[GitHub] piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189500972
 
 

 ##
 File path: python/mxnet/ndarray/contrib.py
 ##
 @@ -96,3 +98,78 @@ def rand_zipfian(true_classes, num_sampled, range_max, 
ctx=None):
 expected_count_sampled = expected_prob_sampled * num_sampled
 return sampled_classes, expected_count_true, expected_count_sampled
 # pylint: enable=line-too-long
+
+def foreach(func, data, init_states):
 
 Review comment:
   func -> body?
   init_states -> begin_states?




[GitHub] piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189704344
 
 

 ##
 File path: src/operator/nn/control_flow.cc
 ##
 @@ -0,0 +1,532 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "../operator_common.h"
+#include "../elemwise_op_common.h"
+#include "../../imperative/imperative_utils.h"
+#include "./subgraph_op_common.h"
+
+namespace mxnet {
+namespace op {
+
+struct ForeachParam : public dmlc::Parameter<ForeachParam> {
+  int num_args;
+  int dim;
+  int num_outputs;
+  int num_out_data;
+  nnvm::Tuple<dim_t> in_state_locs;
+  nnvm::Tuple<dim_t> in_data_locs;
+  DMLC_DECLARE_PARAMETER(ForeachParam) {
+    DMLC_DECLARE_FIELD(num_args).set_lower_bound(1)
+    .describe("Number of inputs.");
+    DMLC_DECLARE_FIELD(dim).set_default(1)
+    .describe("the dimension of the input array to iterate.");
+    DMLC_DECLARE_FIELD(num_outputs)
+    .describe("The number of outputs of the subgraph.");
+    DMLC_DECLARE_FIELD(num_out_data)
+    .describe("The number of output data of the subgraph.");
+    DMLC_DECLARE_FIELD(in_state_locs)
+    .describe("The locations of loop states among the inputs.");
+    DMLC_DECLARE_FIELD(in_data_locs)
+    .describe("The locations of input data among the inputs.");
+  }
+};  // struct ForeachParam
+
+DMLC_REGISTER_PARAMETER(ForeachParam);
+
+class ForeachState {
+  // These are output arrays from all iterations.
+  // They also contain the Op state for each CachedOp.
+  std::vector<std::vector<NDArray> > all_outputs;
+  std::vector<std::vector<NDArray> > all_inputs;
+  std::vector<std::vector<NDArray> > all_gradients;
+  std::vector<CachedOpPtr> iter_ops;
+
+ public:
+  Symbol subgraph_sym;
+  nnvm::Graph subgraph;
+  ForeachParam params;
+
+  ForeachState(const Symbol &g, const ForeachParam &params) {
+    this->subgraph_sym = g;
+    this->subgraph.outputs = g.outputs;
+    this->params = params;
+  }
+
+  void Forward(std::vector<NDArray> cinputs,
+               const std::vector<OpReqType>& req,
+               std::vector<NDArray> coutputs, bool is_recording);
+  void Backward(int iter_no, std::vector<NDArray> ograds,
 
 Review comment:
   line break between args




[GitHub] piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r182576711
 
 

 ##
 File path: src/operator/nn/control_flow.cc
 ##
 @@ -0,0 +1,443 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "../operator_common.h"
+#include "../../imperative/imperative_utils.h"
+
+namespace mxnet {
+namespace op {
+
+void RunGraph(const nnvm::IndexedGraph& idx,
+              const std::vector<NDArray*> arrays,
+              size_t node_start, size_t node_end,
+              std::vector<OpReqType>&& array_reqs,
+              std::vector<uint32_t>&& ref_count,
+              std::vector<OpStatePtr> *p_states,
+              const DispatchModeVector &dispatch_modes) {
+  using namespace nnvm;
+  using namespace imperative;
+  static auto& createop = nnvm::Op::GetAttr<FCreateOpState>("FCreateOpState");
+  static auto& is_layer_backward = Op::GetAttr<bool>("TIsLayerOpBackward");
+
+  std::vector<OpStatePtr>& states = *p_states;
+  std::vector<NDArray*> ndinputs, ndoutputs;
+  ShapeVector arg_shapes;
+  DTypeVector arg_dtypes;
+  std::vector<OpReqType> req;
+
+  for (size_t i = node_start; i < node_end; ++i) {
+    const nnvm::IndexedGraph::Node& node = idx[i];
+    if (node.source->op() == nullptr) continue;
+    auto num_outputs = node.source->num_outputs();
+    ndinputs.clear();
+    ndinputs.reserve(node.inputs.size());
+    for (const auto& j : node.inputs) {
+      ndinputs.emplace_back(arrays[idx.entry_id(j)]);
+      CHECK(!ndinputs.back()->is_none()) << idx[j.node_id].source->attrs.name
+                                         << " " << j.index;
+    }
+    ndoutputs.clear();
+    ndoutputs.reserve(num_outputs);
+    req.clear();
+    req.reserve(num_outputs);
+    for (size_t j = 0; j < num_outputs; ++j) {
+      size_t eid = idx.entry_id(i, j);
+      ndoutputs.emplace_back(arrays[eid]);
+      req.push_back(array_reqs[eid]);
+      CHECK(!ndoutputs.back()->is_none());
+    }
+    const Context& ctx = ndoutputs[0]->ctx();
+    const DispatchMode dispatch_mode = dispatch_modes[i];
+    if (createop.count(node.source->op())) {
+      arg_shapes.clear();
+      arg_dtypes.clear();
+      arg_shapes.reserve(ndinputs.size());
+      arg_dtypes.reserve(ndinputs.size());
+      for (size_t i = 0; i < ndinputs.size(); ++i) {
+        arg_shapes.emplace_back(ndinputs[i]->shape());
+        arg_dtypes.emplace_back(ndinputs[i]->dtype());
+      }
+      states[i] = createop[node.source->op()](
+          node.source->attrs, ctx, arg_shapes, arg_dtypes);
+      Imperative::InvokeOp(ctx, node.source->attrs, ndinputs, ndoutputs, req,
+                           dispatch_mode, states[i]);
+    } else if (is_layer_backward.get(node.source->op(), false)) {
+      nnvm::Node* fwd_node = node.source->control_deps[0].get();
+      auto fwd_node_id = idx.node_id(fwd_node);
+      Imperative::InvokeOp(ctx, node.source->attrs, ndinputs, ndoutputs,
+                           req, dispatch_mode, states[fwd_node_id]);
+    } else {
+      Imperative::InvokeOp(ctx, node.source->attrs, ndinputs, ndoutputs,
+                           req, dispatch_mode);
+    }
+  }
+}
+
+static void ExecSubgraph(nnvm::Graph &g, const OpContext& ctx,
+                         const std::vector<NDArray>& cinputs,
+                         const std::vector<OpReqType>& req,
+                         const std::vector<NDArray>& coutputs) {
+  using namespace nnvm;
+  using namespace imperative;
+  const auto& idx = g.indexed_graph();
+  size_t num_inputs = idx.input_nodes().size();
+
+  CHECK_EQ(num_inputs, cinputs.size())
+      << "The subgraph requires " << num_inputs << " but got " << cinputs.size();
+
+  Context default_ctx = cinputs[0].ctx();
+  for (size_t i = 0; i < cinputs.size(); ++i) {
+    CHECK_EQ(cinputs[i].ctx(), default_ctx)
+        << "The subgraph requires all inputs to live on the same context. But "
+        << idx[idx.input_nodes()[0]].source->attrs.name << " is on " << default_ctx
+        << " while " << idx[idx.input_nodes()[i]].source->attrs.name << " is on "
+        << cinputs[i].ctx();
+  }
+
+  // TODO(zhengda) we might want to buffer them.
+  std::vector<NDArray> buff;
+  std::vector<OpStatePtr> states;
+  std::vector<NDArray> inputs = cinputs;
+  std::vector<NDArray> outputs = coutputs;
+
+  // Allocate
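
The loop in RunGraph above is essentially a small graph interpreter: walk nodes in topological order, gather input and output buffers by entry id, and invoke each op. A toy Python version of the same control flow (an illustrative slot scheme, not the real data structures):

def run_graph(nodes, arrays):
    # nodes: (op, input_slots, output_slot) triples in topological order;
    # arrays: the flat buffer indexed by entry id
    for op, in_slots, out_slot in nodes:
        ins = [arrays[s] for s in in_slots]
        arrays[out_slot] = op(*ins)
    return arrays

# compute (a + b) * 2 with slots 0=a, 1=b, 2=a+b, 3=result
arrays = [3, 4, None, None]
nodes = [(lambda x, y: x + y, (0, 1), 2),
         (lambda x: x * 2, (2,), 3)]
print(run_graph(nodes, arrays)[3])   # 14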

[GitHub] piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189704747
 
 

 ##
 File path: src/operator/nn/control_flow.cc
 ##
 @@ -0,0 +1,532 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "../operator_common.h"
+#include "../elemwise_op_common.h"
+#include "../../imperative/imperative_utils.h"
+#include "./subgraph_op_common.h"
+
+namespace mxnet {
+namespace op {
+
+struct ForeachParam : public dmlc::Parameter<ForeachParam> {
+  int num_args;
+  int dim;
+  int num_outputs;
+  int num_out_data;
+  nnvm::Tuple<dim_t> in_state_locs;
+  nnvm::Tuple<dim_t> in_data_locs;
+  DMLC_DECLARE_PARAMETER(ForeachParam) {
+    DMLC_DECLARE_FIELD(num_args).set_lower_bound(1)
+    .describe("Number of inputs.");
+    DMLC_DECLARE_FIELD(dim).set_default(1)
+    .describe("the dimension of the input array to iterate.");
+    DMLC_DECLARE_FIELD(num_outputs)
+    .describe("The number of outputs of the subgraph.");
+    DMLC_DECLARE_FIELD(num_out_data)
+    .describe("The number of output data of the subgraph.");
+    DMLC_DECLARE_FIELD(in_state_locs)
+    .describe("The locations of loop states among the inputs.");
+    DMLC_DECLARE_FIELD(in_data_locs)
+    .describe("The locations of input data among the inputs.");
+  }
+};  // struct ForeachParam
+
+DMLC_REGISTER_PARAMETER(ForeachParam);
+
+class ForeachState {
+  // These are output arrays from all iterations.
+  // They also contain the Op state for each CachedOp.
+  std::vector<std::vector<NDArray> > all_outputs;
+  std::vector<std::vector<NDArray> > all_inputs;
+  std::vector<std::vector<NDArray> > all_gradients;
+  std::vector<CachedOpPtr> iter_ops;
+
+ public:
+  Symbol subgraph_sym;
+  nnvm::Graph subgraph;
+  ForeachParam params;
+
+  ForeachState(const Symbol &g, const ForeachParam &params) {
+    this->subgraph_sym = g;
+    this->subgraph.outputs = g.outputs;
+    this->params = params;
+  }
+
+  void Forward(std::vector<NDArray> cinputs,
+               const std::vector<OpReqType>& req,
+               std::vector<NDArray> coutputs, bool is_recording);
+  void Backward(int iter_no, std::vector<NDArray> ograds,
+                const std::vector<OpReqType> &req,
+                std::vector<NDArray> igrads);
+  void Cleanup() {
+    all_outputs.clear();
+    all_inputs.clear();
+    all_gradients.clear();
+    iter_ops.clear();
+  }
+};
+
+void ForeachState::Forward(const std::vector<NDArray> &cinputs,
+                           const std::vector<OpReqType>& req,
+                           const std::vector<NDArray> &coutputs, bool is_recording) {
+  using namespace nnvm;
+  using namespace imperative;
+
+  bool orig_is_record;
+  if (is_recording)
+    orig_is_record = Imperative::Get()->set_is_recording(true);
+  else
+    orig_is_record = Imperative::Get()->is_recording();
+
+  std::vector<NDArray*> inputs(cinputs.size());
+  std::vector<NDArray*> outputs(coutputs.size());
+  for (size_t i = 0; i < inputs.size(); i++)
+    inputs[i] = &cinputs[i];
+  for (size_t i = 0; i < outputs.size(); i++)
+    outputs[i] = &coutputs[i];
+
+  if (is_recording) {
+    all_inputs.push_back(cinputs);
+    std::vector<NDArray> gradients(cinputs.size());
+    std::vector<NDArray*> input_ptrs(cinputs.size());
+    std::vector<NDArray*> gradient_ptrs(cinputs.size());
+    std::vector<mx_uint> grad_reqs(cinputs.size());
+    for (size_t i = 0; i < gradients.size(); i++) {
+      gradients[i] = NDArray(cinputs[i].shape(), cinputs[i].ctx(),
+                             true, cinputs[i].dtype());
+      input_ptrs[i] = &cinputs[i];
+      gradient_ptrs[i] = &gradients[i];
+      grad_reqs[i] = kWriteTo;
+    }
+    Imperative::Get()->MarkVariables(input_ptrs, grad_reqs, gradient_ptrs);
+  }
+
+  std::vector<std::pair<std::string, std::string> > kwargs;
+  kwargs.push_back(std::pair<std::string, std::string>("inline_limit", "0"));
+  // Get input names.
+  const auto& idx = subgraph.indexed_graph();
+  std::vector<std::string> arg_names(idx.input_nodes().size());
+  for (size_t i = 0; i < idx.input_nodes().size(); ++i)
+    arg_names[i] = idx[idx.input_nodes()[i]].source->attrs.name;
+  // We don't have parameters for the cached op.
+  std::unordered_map<std::string, std::vector<NDArray> > params;
+  CachedOpPtr op = std::make_shared<CachedOp>(subgraph_sym, kwargs,
+
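
The MarkVariables call above is the backend counterpart of attach_grad in the Python frontend; a minimal sketch of the same record-then-backward pattern from Python:

import mxnet as mx
from mxnet import autograd

x = mx.nd.uniform(shape=(2, 3))
x.attach_grad()                 # marks x as a variable with a kWriteTo grad buffer
with autograd.record():
    y = (x * 2).sum()
y.backward()
print(x.grad)                   # filled with 2s because x was marked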

[GitHub] piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189702263
 
 

 ##
 File path: python/mxnet/symbol/contrib.py
 ##
 @@ -91,3 +98,154 @@ def rand_zipfian(true_classes, num_sampled, range_max):
 expected_prob_sampled = ((sampled_cls_fp64 + 2.0) / (sampled_cls_fp64 + 
1.0)).log() / log_range
 expected_count_sampled = expected_prob_sampled * num_sampled
 return sampled_classes, expected_count_true, expected_count_sampled
+
+def _get_graph_inputs(subg):
+    num_handles = ctypes.c_int(1000)
+    handles = c_array(SymbolHandle, [SymbolHandle(0) for i in range(1000)])
+    check_call(_LIB.MXSymbolGetInputSymbols(subg.handle, handles, ctypes.byref(num_handles)))
+
+    syms = []
+    for i in range(num_handles.value):
+        s = Symbol(handles[i])
+        syms.append(s)
+    return syms
+
+def foreach(func, data, init_states, name="foreach"):
+    """Run a for loop with user-defined computation over NDArrays on dimension 0.
+
+    This operator simulates a for loop and func has the computation for an iteration
+    of the for loop. It runs the computation in func on each slice from the input
+    NDArrays.
+
+    func takes two arguments as input and outputs a tuple of two elements,
+    as illustrated below:
+
+    out, states = func(data1, states)
+
+    data1 can be either a symbol or a list of symbols. If data is a symbol,
+    data1 is a symbol. Otherwise, data1 is a list of symbols and has the same
+    size as data. states is a list of symbols and has the same size as init_states.
+    Similarly, out can be either a symbol or a list of symbols, which are concatenated
+    as the first output of foreach; states from the last execution of func
+    are the second output of foreach.
+
+    The computation done by this operator is equivalent to the pseudo code below
+    when the input data is NDArray:
+
+    states = init_states
+    outs = []
+    for i in data.shape[0]:
+        s = data[i]
+        out, states = func(s, states)
+        outs.append(out)
+    outs = stack(*outs)
+
+
+    Parameters
+    ----------
+    func : a Python function.
+        Define computation in an iteration.
+    data: a symbol or a list of symbols.
+        The input data.
+    init_states: a list of symbols.
+        The initial values of the loop states.
+    name: string.
+        The name of the operator.
+
+    Returns
+    -------
+    outputs: a Symbol or a list of Symbols.
+        The output data concatenated from the output of all iterations.
+    states: a list of Symbols.
+        The loop states in the last iteration.
+
+    Examples
+    --------
+    >>> step = lambda data, states: (data + states[0], [states[0] * 2])
+    >>> data = mx.sym.var('data')
+    >>> states = [mx.sym.var('state')]
+    >>> outs, states = mx.sym.contrib.foreach(step, data, states)
+    """
+
+    assert isinstance(init_states, list), "init_states should be a list"
+    states = []
+
+    # TODO(zhengda) If the input python function references to the symbols outside
+    # the python function, we need to prune the computation graph constructed from
+    # the function. One way of doing it is to mark the nodes in the computation graph
+    # with AttrScope and prune the nodes without the special attribute.
+    with AttrScope(subgraph_name=name):
+        if isinstance(data, list):
+            in_eles = [symbol.var(sym.name) for sym in data]
+        else:
+            in_eles = symbol.var(data.name)
+        for s in init_states:
+            states.append(symbol.var(s.name))
+
+        sym_out = func(in_eles, states)
+        # The function should return a tuple. The first element goes to
+        # the output of the function. The second element is a list.
+        assert isinstance(sym_out, tuple), "func should return a tuple (out, states)"
+        assert isinstance(sym_out[1], list), \
+            "the second element in the returned tuple should be a list"
+        assert len(sym_out[1]) == len(init_states), \
+            "the number of output states (%d) should be the same as input states (%d)" \
+            % (len(sym_out[1]), len(init_states))
+
+        if isinstance(sym_out[0], list):
 
 Review comment:
   what if output or states are nested?
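
A sketch of driving the symbolic foreach end to end, assuming it mirrors the imperative semantics documented above (binding and execution as in stock MXNet):

import mxnet as mx

step = lambda data, states: (data + states[0], [states[0] * 2])
data = mx.sym.var('data')
state = mx.sym.var('state')
outs, states = mx.sym.contrib.foreach(step, data, [state])

ex = outs.bind(mx.cpu(), {'data': mx.nd.ones((2, 3)), 'state': mx.nd.ones((3,))})
print(ex.forward()[0])   # row 0: 1 + 1 = 2; row 1: 1 + 2 = 3 (state doubled once)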




[GitHub] piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r182573208
 
 

 ##
 File path: python/mxnet/ndarray/contrib.py
 ##
 @@ -96,3 +96,18 @@ def rand_zipfian(true_classes, num_sampled, range_max, 
ctx=None):
 expected_count_sampled = expected_prob_sampled * num_sampled
 return sampled_classes, expected_count_true, expected_count_sampled
 # pylint: enable=line-too-long
+
+def foreach(func, input, init_states, back_prop=False, name="foreach"):
+assert isinstance(init_states, list), "init_states should be a list"
+states = init_states
+outputs = []
+for i in range(input.shape[0]):
+ele = input[i]
+outs, states = func(ele, states)
+outs = _as_list(outs)
+if (i == 0):
 
 Review comment:
   outputs.append(outs)
   ...
   
   outputs = zip(*outputs)
   
   [(a, b, c), (a2, b2, c2), ...] -> [(a, a, a, ...), (b, b, b, ...), ...]
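   
   Spelled out, the suggested restructuring is a plain transpose of per-iteration tuples into per-output sequences:

def body(x, state):
    return (x, x * 2), state          # two outputs per iteration

state = 0
per_iter = []
for x in [1, 2, 3]:
    outs, state = body(x, state)
    per_iter.append(outs)             # [(1, 2), (2, 4), (3, 6)]

a, b = zip(*per_iter)                 # a == (1, 2, 3), b == (2, 4, 6)
print(list(a), list(b))

Each transposed sequence can then be stacked with one call per output instead of growing lists element by element.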




[GitHub] piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189704093
 
 

 ##
 File path: src/imperative/imperative_utils.h
 ##
 @@ -456,17 +458,24 @@ inline void PushOperator(const OpStatePtr& state,
   if (fcompute_ex != nullptr && dispatch_mode == DispatchMode::kFComputeEx) {
 const auto& run = [=](RunContext rctx,
   engine::CallbackOnComplete on_complete) {
-  OpContext opctx{is_train, rctx, on_complete, requested};
+  bool need_grad = Imperative::Get()->is_recording();
+  OpContext opctx{need_grad, is_train, rctx, on_complete, requested};
 #if MXNET_USE_MKLDNN == 1
   InvalidateOutputs(outputs, req);
 #endif
   fcompute_ex(state, opctx, inputs, req, outputs);
-  if (ctx.dev_mask() == gpu::kDevMask && exec_type == ExecType::kSync) {
+  if (ctx.dev_mask() == gpu::kDevMask && exec_type == ExecType::kSync
+  && rctx.get_stream()) {
 rctx.get_stream()->Wait();
   }
 };
 
-if (exec_type == ExecType::kSync) {
+// For operators with subgraphs, we need to invoke them in the main thread
+// instead of the threaded engine.
+if (!attrs.subgraphs.empty()) {
 
 Review comment:
   You wouldn't imperatively call an op with subgraphs right?




[GitHub] piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189703051
 
 

 ##
 File path: src/executor/attach_op_execs_pass.cc
 ##
 @@ -134,15 +138,16 @@ class StatefulComputeExecutor : public 
StorageFallbackOpExecutor {
 return state_.get_var();
   }
 
-  explicit StatefulComputeExecutor(const OpStatePtr& state,
+  explicit StatefulComputeExecutor(const NodeAttrs& attrs, const OpStatePtr& 
state,
 
 Review comment:
   line break




[GitHub] piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189501469
 
 

 ##
 File path: python/mxnet/symbol/contrib.py
 ##
 @@ -91,3 +98,154 @@ def rand_zipfian(true_classes, num_sampled, range_max):
 expected_prob_sampled = ((sampled_cls_fp64 + 2.0) / (sampled_cls_fp64 + 
1.0)).log() / log_range
 expected_count_sampled = expected_prob_sampled * num_sampled
 return sampled_classes, expected_count_true, expected_count_sampled
+
+def _get_graph_inputs(subg):
+num_handles = ctypes.c_int(1000)
 
 Review comment:
   what's 1000?
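
At minimum the bound could be named (a sketch; _MAX_SUBGRAPH_INPUTS is an illustrative name, and MXSymbolGetInputSymbols is the C API added by this PR):

import ctypes
from mxnet.base import _LIB, check_call, c_array, SymbolHandle
from mxnet.symbol import Symbol

_MAX_SUBGRAPH_INPUTS = 1000   # illustrative name for the magic bound

def _get_graph_inputs(subg):
    num_handles = ctypes.c_int(_MAX_SUBGRAPH_INPUTS)
    handles = c_array(SymbolHandle, [SymbolHandle(0) for _ in range(_MAX_SUBGRAPH_INPUTS)])
    check_call(_LIB.MXSymbolGetInputSymbols(subg.handle, handles,
                                            ctypes.byref(num_handles)))
    return [Symbol(handles[i]) for i in range(num_handles.value)]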




[GitHub] piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189703732
 
 

 ##
 File path: src/imperative/imperative_utils.h
 ##
 @@ -379,7 +379,8 @@ inline void PushFCompute(const FCompute& fn,
                          &input_blobs, &output_blobs, &pre_temp_src, &pre_temp_dst,
                          &post_temp_src, &post_temp_dst, &in_temp_idx_map, mutate_idx);
   // setup context
-  OpContext opctx{is_train, rctx, engine::CallbackOnComplete(), requested};
+  bool need_grad = Imperative::Get()->is_recording();
 
 Review comment:
   need_grad shouldn't be get from the worker thread. It should be set outside 
similar to is_train




[GitHub] piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r182570078
 
 

 ##
 File path: include/mxnet/imperative.h
 ##
 @@ -177,18 +177,18 @@ class Imperative {
 std::vector<bool>* p_save_inputs = nullptr,
 std::vector<bool>* p_save_outputs = nullptr);
   /*! \brief */
-  OpStatePtr Invoke(const Context& default_ctx,
-                    const nnvm::NodeAttrs& attrs,
-                    const std::vector<NDArray*>& inputs,
-                    const std::vector<NDArray*>& outputs);
+  static OpStatePtr Invoke(const Context& default_ctx,
 
 Review comment:
   use Imperative::Get()




[GitHub] piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach

2018-05-21 Thread GitBox
piiswrong commented on a change in pull request #10451: [MXNET-432] Add Foreach
URL: https://github.com/apache/incubator-mxnet/pull/10451#discussion_r189701989
 
 

 ##
 File path: python/mxnet/symbol/contrib.py
 ##
 @@ -91,3 +98,154 @@ def rand_zipfian(true_classes, num_sampled, range_max):
 expected_prob_sampled = ((sampled_cls_fp64 + 2.0) / (sampled_cls_fp64 + 
1.0)).log() / log_range
 expected_count_sampled = expected_prob_sampled * num_sampled
 return sampled_classes, expected_count_true, expected_count_sampled
+
+def _get_graph_inputs(subg):
+    num_handles = ctypes.c_int(1000)
+    handles = c_array(SymbolHandle, [SymbolHandle(0) for i in range(1000)])
+    check_call(_LIB.MXSymbolGetInputSymbols(subg.handle, handles, ctypes.byref(num_handles)))
+
+    syms = []
+    for i in range(num_handles.value):
+        s = Symbol(handles[i])
+        syms.append(s)
+    return syms
+
+def foreach(func, data, init_states, name="foreach"):
+    """Run a for loop with user-defined computation over NDArrays on dimension 0.
+
+    This operator simulates a for loop and func has the computation for an iteration
+    of the for loop. It runs the computation in func on each slice from the input
+    NDArrays.
+
+    func takes two arguments as input and outputs a tuple of two elements,
+    as illustrated below:
+
+    out, states = func(data1, states)
+
+    data1 can be either a symbol or a list of symbols. If data is a symbol,
+    data1 is a symbol. Otherwise, data1 is a list of symbols and has the same
+    size as data. states is a list of symbols and has the same size as init_states.
+    Similarly, out can be either a symbol or a list of symbols, which are concatenated
+    as the first output of foreach; states from the last execution of func
+    are the second output of foreach.
+
+    The computation done by this operator is equivalent to the pseudo code below
+    when the input data is NDArray:
+
+    states = init_states
+    outs = []
+    for i in data.shape[0]:
+        s = data[i]
+        out, states = func(s, states)
+        outs.append(out)
+    outs = stack(*outs)
+
+
+    Parameters
+    ----------
+    func : a Python function.
+        Define computation in an iteration.
+    data: a symbol or a list of symbols.
+        The input data.
+    init_states: a list of symbols.
+        The initial values of the loop states.
+    name: string.
+        The name of the operator.
+
+    Returns
+    -------
+    outputs: a Symbol or a list of Symbols.
+        The output data concatenated from the output of all iterations.
+    states: a list of Symbols.
+        The loop states in the last iteration.
+
+    Examples
+    --------
+    >>> step = lambda data, states: (data + states[0], [states[0] * 2])
+    >>> data = mx.sym.var('data')
+    >>> states = [mx.sym.var('state')]
+    >>> outs, states = mx.sym.contrib.foreach(step, data, states)
+    """
+
+    assert isinstance(init_states, list), "init_states should be a list"
+    states = []
+
+    # TODO(zhengda) If the input python function references to the symbols outside
+    # the python function, we need to prune the computation graph constructed from
+    # the function. One way of doing it is to mark the nodes in the computation graph
+    # with AttrScope and prune the nodes without the special attribute.
+    with AttrScope(subgraph_name=name):
+        if isinstance(data, list):
+            in_eles = [symbol.var(sym.name) for sym in data]
+        else:
+            in_eles = symbol.var(data.name)
+        for s in init_states:
+            states.append(symbol.var(s.name))
+
+        sym_out = func(in_eles, states)
+        # The function should return a tuple. The first element goes to
+        # the output of the function. The second element is a list.
+        assert isinstance(sym_out, tuple), "func should return a tuple (out, states)"
+        assert isinstance(sym_out[1], list), \
+            "the second element in the returned tuple should be a list"
 
 Review comment:
   second element -> returned states (second element in the returned tuple)




[GitHub] wellner commented on issue #10436: FeedForward.scala and NDArrayIter.scala leak memory by not disposing of NDArrays

2018-05-21 Thread GitBox
wellner commented on issue #10436: FeedForward.scala and NDArrayIter.scala leak 
memory by not disposing of NDArrays
URL: 
https://github.com/apache/incubator-mxnet/issues/10436#issuecomment-390772887
 
 
   Any chance of having this issue bumped up?  This bug renders training via 
Scala impossible in many cases and could be problematic for large batch 
decoding/inference. 




[GitHub] eric-haibin-lin commented on a change in pull request #10894: [MXNET-399] Elemwise_mul between dense and csr on CPU & GPU

2018-05-21 Thread GitBox
eric-haibin-lin commented on a change in pull request #10894: [MXNET-399] 
Elemwise_mul between dense and csr on CPU & GPU
URL: https://github.com/apache/incubator-mxnet/pull/10894#discussion_r189126584
 
 

 ##
 File path: src/operator/tensor/elemwise_binary_op-inl.h
 ##
 @@ -495,6 +495,93 @@ void ElemwiseBinaryOp::DnsCsrDnsOp(mshadow::Stream<xpu> *s,
   });
 }
 
+/*!
+ * \brief Kernel for performing elemwise op between dense and csr matrix
+ * \param i            global thread id
+ * \param req          type of request
+ * \param out          output array
+ * \param dns_data     data array of dense input
+ * \param csr_data     data array of csr input
+ * \param csr_indices  indices array of csr input
+ * \param csr_indptr   indptr array of csr input
+ * \param num_rows     number of rows of both inputs
+ * \param num_cols     number of columns of both inputs
+ */
+template<int req, typename OP, bool reverse>
+struct ElemwiseDnsCsrCsrKernel {
+  template<typename DType, typename IType, typename CType>
+  MSHADOW_XINLINE static void Map(int i, DType* out, DType* dns_data,
+                                  const DType* csr_data, const IType* csr_indices,
+                                  const CType* csr_indptr, const nnvm::dim_t num_rows,
+                                  const nnvm::dim_t num_cols) {
+    if (i < num_rows) {
+      for (int j = csr_indptr[i]; j < csr_indptr[i+1]; ++j) {
+        KERNEL_ASSIGN(out[j], req, reverse ?
+                                   OP::Map(dns_data[i * num_cols + csr_indices[j]], csr_data[j]) :
+                                   OP::Map(csr_data[j], dns_data[i * num_cols + csr_indices[j]]));
+      }
+    }
+  }
+};
+
+/*! \brief DNS -op- CSR binary operator for non-canonical NDArray */
+template<typename xpu, typename OP>
+void ElemwiseBinaryOp::DnsCsrCsrOp(const nnvm::NodeAttrs &attrs,
+                                   const OpContext &ctx,
+                                   const NDArray &dns,
+                                   const NDArray &csr,
+                                   const OpReqType req,
+                                   const NDArray &output,
+                                   const bool reverse) {
+  using namespace mshadow;
+  using namespace mxnet_op;
+  using namespace csr;
+  CHECK_EQ(dns.storage_type(), kDefaultStorage);
+  CHECK_EQ(csr.storage_type(), kCSRStorage);
+  CHECK_EQ(req, kWriteTo) << "elemwise(dns, csr) = csr only supports kWriteTo";
+  CHECK(req != kNullOp);
+  const bool supported_op = std::is_same<OP, mshadow_op::mul>::value ||
+                            std::is_same<OP, mshadow_op::div>::value;
 
 Review comment:
   Remove div
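
For reference, the kernel's row loop has these semantics (a NumPy rendering of the code above; the output shares the csr input's indices and indptr):

import numpy as np

def dns_op_csr(dns, csr_data, csr_indices, csr_indptr, op=np.multiply):
    # touch only the stored csr entries, row by row, like the kernel's Map
    out = np.zeros_like(csr_data)
    num_rows = len(csr_indptr) - 1
    for i in range(num_rows):
        for j in range(csr_indptr[i], csr_indptr[i + 1]):
            out[j] = op(csr_data[j], dns[i, csr_indices[j]])
    return out

dns = np.arange(12.0).reshape(3, 4)
data, indices, indptr = np.array([1.0, 2.0]), np.array([1, 3]), np.array([0, 1, 1, 2])
print(dns_op_csr(dns, data, indices, indptr))   # [1*1, 2*11] = [1. 22.]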




[GitHub] eric-haibin-lin commented on a change in pull request #10894: [MXNET-399] Elemwise_mul between dense and csr on CPU & GPU

2018-05-21 Thread GitBox
eric-haibin-lin commented on a change in pull request #10894: [MXNET-399] 
Elemwise_mul between dense and csr on CPU & GPU
URL: https://github.com/apache/incubator-mxnet/pull/10894#discussion_r189703198
 
 

 ##
 File path: src/operator/tensor/elemwise_binary_op-inl.h
 ##
 @@ -495,6 +495,93 @@ void ElemwiseBinaryOp::DnsCsrDnsOp(mshadow::Stream<xpu> *s,
   });
 }
 
+/*!
+ * \brief Kernel for performing elemwise op between dense and csr matrix
+ * \param i            global thread id
+ * \param req          type of request
+ * \param out          output array
+ * \param dns_data     data array of dense input
+ * \param csr_data     data array of csr input
+ * \param csr_indices  indices array of csr input
+ * \param csr_indptr   indptr array of csr input
+ * \param num_rows     number of rows of both inputs
+ * \param num_cols     number of columns of both inputs
+ */
+template<int req, typename OP, bool reverse>
+struct ElemwiseDnsCsrCsrKernel {
+  template<typename DType, typename IType, typename CType>
+  MSHADOW_XINLINE static void Map(int i, DType* out, DType* dns_data,
+                                  const DType* csr_data, const IType* csr_indices,
+                                  const CType* csr_indptr, const nnvm::dim_t num_rows,
+                                  const nnvm::dim_t num_cols) {
+    if (i < num_rows) {
+      for (int j = csr_indptr[i]; j < csr_indptr[i+1]; ++j) {
+        KERNEL_ASSIGN(out[j], req, reverse ?
+                                   OP::Map(dns_data[i * num_cols + csr_indices[j]], csr_data[j]) :
+                                   OP::Map(csr_data[j], dns_data[i * num_cols + csr_indices[j]]));
+      }
+    }
+  }
+};
+
+/*! \brief DNS -op- CSR binary operator for non-canonical NDArray */
+template<typename xpu, typename OP>
+void ElemwiseBinaryOp::DnsCsrCsrOp(const nnvm::NodeAttrs &attrs,
+                                   const OpContext &ctx,
+                                   const NDArray &dns,
+                                   const NDArray &csr,
+                                   const OpReqType req,
+                                   const NDArray &output,
+                                   const bool reverse) {
+  using namespace mshadow;
+  using namespace mxnet_op;
+  using namespace csr;
+  CHECK_EQ(dns.storage_type(), kDefaultStorage);
+  CHECK_EQ(csr.storage_type(), kCSRStorage);
+  CHECK_EQ(req, kWriteTo) << "elemwise(dns, csr) = csr only supports kWriteTo";
+  CHECK(req != kNullOp);
 
 Review comment:
   if (req == kNullOp) return



