[GitHub] ZiyueHuang commented on issue #8338: master branch cannot build on centos 7 with cuda-8.0

2017-10-21 Thread GitBox
ZiyueHuang commented on issue #8338: master branch cannot build on centos 7 
with cuda-8.0
URL: 
https://github.com/apache/incubator-mxnet/issues/8338#issuecomment-338387761
 
 
   I tried the latest master but it fails too.
   
   I replaced smooth_l1_* with code that returns DType(0). The warnings disappear, but there are still error messages:
   ```
   /home/hanfeng/zyh/mxnet/mshadow/mshadow/././././cuda/tensor_gpu-inl.cuh(75): error: expression preceding parentheses of apparent call must have (pointer-to-) function type
   
   /home/hanfeng/zyh/mxnet/mshadow/mshadow/././././cuda/tensor_gpu-inl.cuh(75): error: expression preceding parentheses of apparent call must have (pointer-to-) function type
   
   2 errors detected in the compilation of "/tmp/tmpxft_964a_-24_activation.compute_30.cpp2.i".
   make: *** [build/src/operator/activation_gpu.o] Error 2
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] 0x6a62 opened a new pull request #8376: Fix Typo (classification)

2017-10-21 Thread GitBox
0x6a62 opened a new pull request #8376: Fix Typo (classification)
URL: https://github.com/apache/incubator-mxnet/pull/8376
 
 
   ## Description ##
   Fix a typo in the example readme.
   
   ## Checklist ##
   ### Essentials ###
   - [NA ] Passed code style checking (`make lint`)
   - [X] Changes are complete (i.e. I finished coding on this PR)
   - [NA] All changes have test coverage
   - [NA] For user-facing API changes, API doc string has been updated.
   - [X] To my best knowledge, examples are either not affected by this change, 
or have been fixed to be compatible with this change
   




[GitHub] piiswrong closed pull request #8345: Misc fixes for sparse distributed training

2017-10-21 Thread GitBox
piiswrong closed pull request #8345: Misc fixes for sparse distributed training
URL: https://github.com/apache/incubator-mxnet/pull/8345
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git a/example/sparse/linear_classification.py 
b/example/sparse/linear_classification.py
index b173d04139..70f896386c 100644
--- a/example/sparse/linear_classification.py
+++ b/example/sparse/linear_classification.py
@@ -96,6 +96,7 @@
 # get the sparse weight parameter
 weight_index = mod._exec_group.param_names.index('weight')
 weight_param = mod._exec_group.param_arrays[weight_index]
+all_row_ids = mx.nd.arange(0, num_features, dtype='int64')
 speedometer = mx.callback.Speedometer(batch_size, 100)
 
 logging.info('Training started ...')
@@ -118,9 +119,15 @@
 speedometer_param = mx.model.BatchEndParam(epoch=epoch, 
nbatch=nbatch,
eval_metric=metric, 
locals=locals())
 speedometer(speedometer_param)
+# pull all rows before making a checkpoint
+if kv:
+kv.row_sparse_pull('weight', weight_param, row_ids=[all_row_ids],
+   priority=-weight_index)
 # evaluate metric on validation dataset
 score = mod.score(eval_data, ['nll_loss'])
 logging.info('epoch %d, eval nll = %s ' % (epoch, score[0][1]))
+save_optimizer_states = 'dist' not in kv.type
+mod.save_checkpoint("checkpoint", epoch, save_optimizer_states=False)
 # reset the iterator for next pass of data
 data_iter.reset()
 logging.info('Training completed.')
diff --git a/src/kvstore/kvstore_dist.h b/src/kvstore/kvstore_dist.h
index 2d5e52fc3a..5e62be8c4c 100644
--- a/src/kvstore/kvstore_dist.h
+++ b/src/kvstore/kvstore_dist.h
@@ -42,10 +42,6 @@ namespace kvstore {
 /**
  * \brief distributed kvstore
  *
- * for a worker node, it always guarantees that all push and pull issued from
- * this worker on the same key are serialized. namely push(3) and then pull(3),
- * then the data pulled is always containing the modification from the push(3).
- *
  * it's the server node's job to control the data consistency among all
  * workers. see details on \ref ServerHandle::Start
  */
@@ -248,7 +244,7 @@ class KVStoreDist : public KVStoreLocal {
 LOG(FATAL) << "RowSparsePull with multiple values is not implemented 
yet";
   } else {
 auto& indices = target_val_rowids[0].second;
-PullRowSparse_(key, _buf, indices, priority);
+PullRowSparse_(key, recv_buf, indices, priority);
 comm_->BroadcastRowSparse(key, recv_buf, grouped_val_rowid, num_vals 
== 1, priority);
   }
 }
@@ -322,24 +318,24 @@ class KVStoreDist : public KVStoreLocal {
   }
 
   // pull row sparse weight into `recv_buf` based on indices given by `indices`
-  void PullRowSparse_(const int key, NDArray *recv_buf, const NDArray& 
indices, int priority) {
+  void PullRowSparse_(const int key, const NDArray& recv_buf,
+  const NDArray& indices, int priority) {
 using namespace rowsparse;
 auto pull_from_servers = [this, key, recv_buf, indices]
  (RunContext rctx, Engine::CallbackOnComplete cb) {
   // allocate memory for the buffer
   size_t num_rows = indices.shape().Size();
-  recv_buf->CheckAndAlloc({mshadow::Shape1(num_rows)});
+  recv_buf.CheckAndAlloc({mshadow::Shape1(num_rows)});
 #if MKL_EXPERIMENTAL == 1
-  mkl_set_tblob_eager_mode(recv_buf->data());
+  mkl_set_tblob_eager_mode(recv_buf.data());
 #endif
-  real_t* data = recv_buf->data().dptr();
-  auto indices_data = indices.data();
-  const auto offsets = indices_data.dptr();
-  const auto unit_len = recv_buf->shape().ProdShape(1, 
recv_buf->shape().ndim());
+  real_t* data = recv_buf.data().dptr();
+  const auto offsets = indices.data().dptr();
+  const auto unit_len = recv_buf.shape().ProdShape(1, 
recv_buf.shape().ndim());
   const int64_t size = num_rows * unit_len;
// convert to ps keys in row sparse format
   PSKV& pskv = EncodeRowSparseKey(key, size, num_rows, offsets,
-  unit_len, recv_buf->shape()[0]);
+  unit_len, recv_buf.shape()[0]);
   if (this->log_verbose_) {
 LOG(INFO) << "worker " << get_rank() << " pull lens: " << pskv.lens << 
" keys: "
   << pskv.keys << " size: " << size;
@@ -348,8 +344,8 @@ class KVStoreDist : public KVStoreLocal {
   // copy indices to recv_buf. this needs to be done before ZPull
   // because after pull is done, the callback function returns and locks 
are released.
   // at this point, later functions 

[GitHub] piiswrong commented on a change in pull request #8364: Fix typo

2017-10-21 Thread GitBox
piiswrong commented on a change in pull request #8364: Fix typo
URL: https://github.com/apache/incubator-mxnet/pull/8364#discussion_r146099197
 
 

 ##
 File path: example/README.md
 ##
 @@ -62,7 +62,7 @@ If you want to contribute to this list and the examples, 
please open a new pull
 * [Deep Q-learning in MXNet](https://github.com/zmonoid/DQN-MXNet) by 
[zmonoid](https://github.com/zmonoid)
 * [Face Detection with End-to-End Integration of a ConvNet and a 3D Model 
(ECCV16)](https://github.com/tfwu/FaceDetection-ConvNet-3D) by 
[tfwu](https://github.com/tfwu), source code for paper Yunzhu Li, Benyuan Sun, 
Tianfu Wu and Yizhou Wang, "Face Detection with End-to-End Integration of a 
ConvNet and a 3D Model", ECCV 2016 
 * [End-to-End Chinese plate recognition base on 
MXNet](https://github.com/szad670401/end-to-end-for-chinese-plate-recognition) 
by [szad670401](https://github.com/szad670401)
-* [Reproduce ResNet-v2 (Identity Mappings in Deep Residual Networks) using 
MXNet](https://github.com/tornadomeet/ResNet) by 
[tornadomeet](https://github.com/tornadomeet)
+* [Reproduce ResNet-v2v0.11.0 (Identity Mappings in Deep Residual Networks) 
using MXNet](https://github.com/tornadomeet/ResNet) by 
[tornadomeet](https://github.com/tornadomeet)
 
 Review comment:
   ?




[GitHub] piiswrong commented on issue #8373: distribute training in fp16

2017-10-21 Thread GitBox
piiswrong commented on issue #8373: distribute training in fp16
URL: https://github.com/apache/incubator-mxnet/pull/8373#issuecomment-338371778
 
 
   @eric-haibin-lin @rahul003 I think this should be merged with the 2bit PR to 
make a more general n-bit gradient compression feature.




[GitHub] piiswrong commented on a change in pull request #8364: Fix typo

2017-10-21 Thread GitBox
piiswrong commented on a change in pull request #8364: Fix typo
URL: https://github.com/apache/incubator-mxnet/pull/8364#discussion_r146099200
 
 

 ##
 File path: example/README.md
 ##
 @@ -86,7 +86,7 @@ If you want to contribute to this list and the examples, 
please open a new pull
 * [simple 
bind](https://github.com/dmlc/mxnet-notebooks/blob/master/python/moved-from-mxnet/simple_bind.ipynb)
 - A demo of low level training API.
 * [Multi task 
tutorial](https://github.com/haria/mxnet-multi-task-example/blob/master/multi-task.ipynb)
 - A demo of how to train and predict multi-task network on both MNIST and your 
own dataset.
 * [class active 
maps](https://github.com/dmlc/mxnet-notebooks/blob/master/python/moved-from-mxnet/class_active_maps.ipynb)
 - A demo of how to localize the discriminative regions in an image using 
global average pooling (GAP) in CNNs.
-* [DMLC MXNet Notebooks](https://github.com/dmlc/mxnet-notebooks) DMLC's repo 
for various notebooks ranging from basic usages of MXNet to state-of-the-art 
deep learning applications.
+* [DMLC MXNet Notebooks](https://github.com/dmlc/mxnet-notebooks) DMLC's repo 
for various notebooks ranging from basic usages of Mv0.11.0XNet to 
state-of-the-art deep learning applications.
 
 Review comment:
   ?




[GitHub] reminisce opened a new pull request #8371: Add note in the doc for using naive engine in multithreading environment

2017-10-21 Thread GitBox
reminisce opened a new pull request #8371: Add note in the doc for using naive 
engine in multithreading environment
URL: https://github.com/apache/incubator-mxnet/pull/8371
 
 
   ## Description ##
   Add note for using naive engine in multithreading environment per the 
request of @bhavinthaker.
   




[GitHub] jiarenyf commented on issue #8347: CTC Example Problem

2017-10-21 Thread GitBox
jiarenyf commented on issue #8347: CTC Example Problem
URL: 
https://github.com/apache/incubator-mxnet/issues/8347#issuecomment-338108475
 
 
   ??




[GitHub] wisdomdeng opened a new issue #8372: PyPi wheel installation failed on Centos 7.5

2017-10-21 Thread GitBox
wisdomdeng opened a new issue #8372: PyPi wheel installation failed on Centos 
7.5
URL: https://github.com/apache/incubator-mxnet/issues/8372
 
 
   System information
   ```
   (mxnet) [ruizhid@cedar5 mxnet]$ lsb_release -a
   LSB Version: n/a
   Distributor ID:  CentOS
   Description: CentOS Linux release 7.3.1611 (Core) 
   Release: 7.3.1611
   Codename:Core
   ```
   
   Compiler version
   ```
   (mxnet) [ruizhid@cedar5 mxnet]$ gcc --version
   gcc (GCC) 5.4.0
   Copyright (C) 2015 Free Software Foundation, Inc.
   This is free software; see the source for copying conditions.  There is NO
   warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
   
   (mxnet) [ruizhid@cedar5 mxnet]$ g++ --version
   g++ (GCC) 5.4.0
   Copyright (C) 2015 Free Software Foundation, Inc.
   This is free software; see the source for copying conditions.  There is NO
   warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
   ```
   
   Python version
   ```
   (mxnet) [ruizhid@cedar5 mxnet]$ python
   Python 2.7.14 |Anaconda, Inc.| (default, Oct 16 2017, 17:29:19) 
   [GCC 7.2.0] on linux2
   Type "help", "copyright", "credits" or "license" for more information.
   >>> 
   KeyboardInterrupt
   >>> 
   (mxnet) [ruizhid@cedar5 mxnet]$ python --version
   Python 2.7.14 :: Anaconda, Inc.
   (mxnet) [ruizhid@cedar5 mxnet]$ pip --version
   pip 9.0.1 from 
/home/ruizhid/miniconda/envs/mxnet/lib/python2.7/site-packages (python 2.7)
   ```
   
   Installation
   ```
   (mxnet) [ruizhid@cedar5 mxnet]$ pip search mxnet
   mxnet-to-coreml (0.1.2)  - Tool to convert MXNet models into Apple 
CoreML model format.
   mxnet-cu75 (0.12.0b20171020) - MXNet is an ultra-scalable deep learning 
framework. This version uses CUDA-7.5.
   mxnet-cu75mkl (0.12.0b20171020)  - MXNet is an ultra-scalable deep learning 
framework. This version uses CUDA-7.5 and MKL-ML.
   mxnet-cu80 (0.12.0b20171020) - MXNet is an ultra-scalable deep learning 
framework. This version uses CUDA-8.0.
   mxnet-cu80mkl (0.12.0b20171019)  - MXNet is an ultra-scalable deep learning 
framework. This version uses CUDA-8.0 and MKL-ML.
   mxnet-cu90 (0.11.1b20171009) - MXNet is an ultra-scalable deep learning 
framework. This version uses .
   mxnet-cu90mkl (0.11.1b20171009)  - MXNet is an ultra-scalable deep learning 
framework. This version uses MKL-ML.
   keras-mxnet (1.2.2)  - MXNet backend for Keras 1.2.2
   mxnet-mkl (0.12.0b20171020)  - MXNet is an ultra-scalable deep learning 
framework. This version uses MKL-ML.
   mxnet-model-server (0.3.5)   - MXNet Model Server
   mxbox (0.0.22)   - Image and video datasets and models for 
mxnet deep learning
   mxnet (0.12.0b20171020)  - MXNet is an ultra-scalable deep learning 
framework. This version uses openblas.
   (mxnet) [ruizhid@cedar5 mxnet]$ pip install mxnet
   Collecting mxnet
 Could not find a version that satisfies the requirement mxnet (from 
versions: )
   No matching distribution found for mxnet
   (mxnet) [ruizhid@cedar5 mxnet]$ pip install mxnet-cu80
   Collecting mxnet-cu80
 Could not find a version that satisfies the requirement mxnet-cu80 (from 
versions: )
   No matching distribution found for mxnet-cu80
   ```




[GitHub] piiswrong commented on issue #8368: A hybrid model with sync and async in kvstore

2017-10-21 Thread GitBox
piiswrong commented on issue #8368: A hybrid model with sync and async in 
kvstore
URL: https://github.com/apache/incubator-mxnet/pull/8368#issuecomment-338371317
 
 
   This is too hacky




[GitHub] eric-haibin-lin commented on issue #5326: Train-accuracy=0.000000

2017-10-21 Thread GitBox
eric-haibin-lin commented on issue #5326: Train-accuracy=0.000000
URL: 
https://github.com/apache/incubator-mxnet/issues/5326#issuecomment-338371172
 
 
   FYI, there's an example of layer-wise pretraining for an auto-encoder:
   
https://github.com/apache/incubator-mxnet/blob/master/example/autoencoder/autoencoder.py




[GitHub] jiarenyf commented on issue #8347: CTC Example Problem

2017-10-21 Thread GitBox
jiarenyf commented on issue #8347: CTC Example Problem
URL: 
https://github.com/apache/incubator-mxnet/issues/8347#issuecomment-337843827
 
 
   @pluskid @thinxer 




[GitHub] piiswrong commented on a change in pull request #8342: [WIP] 2bit gradient compression

2017-10-21 Thread GitBox
piiswrong commented on a change in pull request #8342: [WIP] 2bit gradient 
compression
URL: https://github.com/apache/incubator-mxnet/pull/8342#discussion_r146099236
 
 

 ##
 File path: src/operator/contrib/two_bit_quantize.cc
 ##
 @@ -0,0 +1,122 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+/*!
+ * \file two_bit_quantize.cc
+ * \brief registers quantize_2bit, dequantize_2bit
+ * and create_2bit operators with nnvm
+ */
+#include "./two_bit_quantize-inl.h"
+
+namespace mxnet {
+namespace op {
+
+DMLC_REGISTER_PARAMETER(TwoBitParam);
+
+NNVM_REGISTER_OP(_contrib_quantize_2bit)
+.describe(R"code(Quantize an input tensor into using 2bits for each value using
+user-specified thresholds, while storing quantization error in residual array.
+
+The quantize_2bit operator takes 5 arguments and is called as follows:
+`quantize_2bit(array, residual, out, neg_threshold, pos_threshold)`.
+The operator modifies `residual` and `out` arrays.
+The `out`variable will be the quantized array. Note that, `out` array can be 
generated by
+invoking `create_2bit(array)`, avoiding calculation of size of quantized array.
+This `out` array has first three elements as negative threshold, positive 
threshold,
+and size of the original uncompressed array. Any elements after these three 
elements
+represent quantized data.
+The operation sums up array and residual, and then
+applies the thresholds to quantize the data into one of three states
+represented by 2bits. 16 such quantized floats in the original array
+are packed together into one float in the `out` array.
+The quantization error is stored in residual array.
+
+For example, assume the input array (gradient) is [5.0, -1.0, -5.0, -4.0], and 
the
+residual is [0.0, -2.0, 0, 1.0]. Let the negative and positive thresholds be
+-4.0 and +4.0, respectively. In this method, the elements whose
+(gradient + residual) >= pos_threshold will be quantized into 2-bits '01',
+and the elements whose (gradient + residual) <= neg_threshold will be
+quantized into 2-bits '10'. The other elements will be quantized
+as '00'. Every 16 floats in the original array will be packed
+into one float variable in the output array.
+
+In this example, 'out' has 4 elements. The first element stores the
+neg_threshold (-4.0), the second element stores the pos_threshold (+4.0), the
+third element stores the original size of the uncompressed array, and the
+original array will be quantized into a single element in the last element.
+The residual is also updated to [1.0, -3.0, -1.0, -3.0].
+)code" ADD_FILELINE)
+.set_num_inputs(3)
+.set_num_outputs(0)
+.set_attr_parser(ParamParser)
+.set_attr("FInferShape", Quantize2BitShape)
+.set_attr("FInferType", Quantize2BitType)
+.set_attr("FCompute", Quantize2BitCompute)
+.set_attr("FGradient", ElemwiseGradUseNone{"_quantize_2bit"})
+.set_attr("FMutateInputs",
+[](const nnvm::NodeAttrs& attrs) {
+return std::vector{1, 2};
+})
+.add_argument("gradient_array", "NDArray-or-Symbol", "A ndarray/symbol of type 
`float32`")
+.add_argument("residual_array", "NDArray-or-Symbol", "A ndarray/symbol of type 
`float32`")
+.add_argument("quantized_array", "NDArray-or-Symbol", "A ndarray/symbol of 
type `float32`")
+.add_arguments(TwoBitParam::__FIELDS__());
+
+NNVM_REGISTER_OP(_contrib_create_2bit)
 
 Review comment:
   Don't like this. The name is inconsistent. Plus, this shouldn't be exposed as an operator.
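
The worked example in the quoted docstring can be checked with a minimal pure-Python sketch. This is not the PR's operator: the function name `quantize_2bit_sketch` and its return values are hypothetical, chosen only to mirror the description (sum gradient and residual, apply the two thresholds, keep the quantization error as the new residual).

```python
def quantize_2bit_sketch(gradient, residual, neg_threshold, pos_threshold):
    """Illustrative only: return (codes, dequantized, new_residual).

    Codes follow the docstring: 0b01 for values at or above the positive
    threshold, 0b10 for values at or below the negative threshold, 0b00 otherwise.
    """
    codes, dequantized, new_residual = [], [], []
    for g, r in zip(gradient, residual):
        total = g + r  # the operation sums up array and residual first
        if total >= pos_threshold:
            code, value = 0b01, pos_threshold
        elif total <= neg_threshold:
            code, value = 0b10, neg_threshold
        else:
            code, value = 0b00, 0.0
        codes.append(code)
        dequantized.append(value)
        # the quantization error is carried over to the next iteration
        new_residual.append(total - value)
    return codes, dequantized, new_residual


codes, dequantized, new_residual = quantize_2bit_sketch(
    [5.0, -1.0, -5.0, -4.0], [0.0, -2.0, 0.0, 1.0], -4.0, 4.0)
print(codes)         # [1, 0, 2, 0]
print(new_residual)  # [1.0, -3.0, -1.0, -3.0]
```

With the docstring's inputs, the sketch reproduces the updated residual stated there, [1.0, -3.0, -1.0, -3.0].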




[GitHub] piiswrong commented on a change in pull request #8342: [WIP] 2bit gradient compression

2017-10-21 Thread GitBox
piiswrong commented on a change in pull request #8342: [WIP] 2bit gradient 
compression
URL: https://github.com/apache/incubator-mxnet/pull/8342#discussion_r146099245
 
 

 ##
 File path: python/mxnet/gluon/trainer.py
 ##
 @@ -65,7 +74,10 @@ def __init__(self, params, optimizer, 
optimizer_params=None, kvstore='device'):
 "First argument must be a list or dict of Parameters, " \
 "got list of %s."%(type(param)))
 self._params.append(param)
-
+if compress_params:
+if not isinstance(compress_params, dict):
+raise ValueError("compress_params needs to be a dictionary")
 
 Review comment:
   why does it have to be a dict?




[GitHub] ZiyueHuang opened a new pull request #8374: fix condition when CSRNDArray is used in NDArrayIter with shuffle=True

2017-10-21 Thread GitBox
ZiyueHuang opened a new pull request #8374: fix condition when CSRNDArray is 
used in NDArrayIter with shuffle=True
URL: https://github.com/apache/incubator-mxnet/pull/8374
 
 
   ## Description ##
   Currently, `NDArrayIter({'data':csr}, ..., shuffle=True)` does not raise an `AssertionError`.
   
   @eric-haibin-lin 
   
   ## Checklist ##
   ### Essentials ###
   - [x] Passed code style checking (`make lint`)
   - [x] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage
   - [ ] For user-facing API changes, API doc string has been updated.
   - [x] To my best knowledge, examples are either not affected by this change, 
or have been fixed to be compatible with this change
   
   ### Changes ###
   - [ ] Feature1, tests, (and when applicable, API doc)
   - [ ] Feature2, tests, (and when applicable, API doc)
   
   ## Comments ##
   - If this change is a backward incompatible change, why must this change be 
made.
   - Interesting edge cases to note here
   




[GitHub] eric-haibin-lin commented on issue #8189: Feed forward pass memory leaks (using htop)

2017-10-21 Thread GitBox
eric-haibin-lin commented on issue #8189: Feed forward pass memory leaks (using 
htop)
URL: 
https://github.com/apache/incubator-mxnet/issues/8189#issuecomment-338371727
 
 
   forward/backward/update are all asynchronous operations. Each call just pushes its work to the backend engine and returns immediately. 
https://mxnet.incubator.apache.org/versions/master/tutorials/basic/ndarray.html#lazy-evaluation-and-automatic-parallelization
 
   You can use `mx.nd.waitall()` to make sure all operations are complete.
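
The effect described above can be illustrated with a standard-library analogy. This is only a sketch of the push-then-wait pattern, not MXNet's engine: `push`, `waitall`, and `slow_forward` are hypothetical names, and a single-worker thread pool stands in for the backend.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait

engine = ThreadPoolExecutor(max_workers=1)  # stand-in for the backend engine
pending = []


def push(op, *args):
    """Queue an operation and return immediately, like an asynchronous call."""
    future = engine.submit(op, *args)
    pending.append(future)
    return future


def waitall():
    """Block until every queued operation has finished."""
    wait(pending)
    pending.clear()


def slow_forward(x):
    time.sleep(0.2)  # pretend this is an expensive forward pass
    return x * 2


start = time.perf_counter()
result = push(slow_forward, 21)
issue_time = time.perf_counter() - start  # the call only queued the work
waitall()
total_time = time.perf_counter() - start  # now the computation is included
print(result.result())
```

Anything measured right after the call returns (time, or memory as seen in htop) reflects work that has only been queued, so measurements taken before the wait can be misleading.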




[GitHub] piiswrong commented on a change in pull request #8342: [WIP] 2bit gradient compression

2017-10-21 Thread GitBox
piiswrong commented on a change in pull request #8342: [WIP] 2bit gradient 
compression
URL: https://github.com/apache/incubator-mxnet/pull/8342#discussion_r146112817
 
 

 ##
 File path: python/mxnet/gluon/trainer.py
 ##
 @@ -44,14 +44,23 @@ class Trainer(object):
 kvstore : str or KVStore
 kvstore type for multi-gpu and distributed training. See help on
 :any:`mxnet.kvstore.create` for more information.
+compress_params : dict
 
 Review comment:
   compression_params




[GitHub] piiswrong commented on issue #8182: Make gluon.Block cooperative in multiple inheritance setting

2017-10-21 Thread GitBox
piiswrong commented on issue #8182: Make gluon.Block cooperative in multiple 
inheritance setting
URL: https://github.com/apache/incubator-mxnet/pull/8182#issuecomment-338425611
 
 
   Let's put this on hold to see if there is wide demand for this feature.




[GitHub] cjolivier01 commented on issue #8361: Simplified unary/binary math operators

2017-10-21 Thread GitBox
cjolivier01 commented on issue #8361: Simplified unary/binary math operators
URL: https://github.com/apache/incubator-mxnet/pull/8361#issuecomment-338426767
 
 
   We see it in the nightly benchmark runs, but I was unable to reproduce it 
locally after spending quite a bit of time on it. However, I should probably try 
again now that I've heard from a second source (you being the first) that 
c4.8xlarge machines behave strangely from a performance perspective. I'll try 
this on Monday or this weekend if I get a chance.




[GitHub] cjolivier01 commented on issue #8361: Simplified unary/binary math operators

2017-10-21 Thread GitBox
cjolivier01 commented on issue #8361: Simplified unary/binary math operators
URL: https://github.com/apache/incubator-mxnet/pull/8361#issuecomment-338426981
 
 
   For example, I'm told that a unit test that takes 3 seconds on my machine 
takes 15-20 minutes on a c4.8xlarge.




[GitHub] piiswrong commented on a change in pull request #8342: [WIP] 2bit gradient compression

2017-10-21 Thread GitBox
piiswrong commented on a change in pull request #8342: [WIP] 2bit gradient 
compression
URL: https://github.com/apache/incubator-mxnet/pull/8342#discussion_r146112744
 
 

 ##
 File path: python/mxnet/kvstore.py
 ##
 @@ -349,6 +349,101 @@ def row_sparse_pull(self, key, out=None, priority=0, 
row_ids=None):
 check_call(_LIB.MXKVStorePullRowSparse(
 self.handle, mx_uint(len(ckeys)), ckeys, cvals, crow_ids, 
ctypes.c_int(priority)))
 
+def set_compress(self, compress_params=None):
+""" Specifies type of low-bit quantization for gradient compression if 
any,
+ and additional arguments depending on the type of compression being 
used.
+
+Parameters
+--
+compress_params : dict
+`compress_params` is a dictionary specifying the type and 
parameters
+for gradient compression. The key `compress` in this dictionary is 
a required argument
+and specifies the type of gradient compression. Other keys in this
+dictionary are optional and specific to the type of gradient 
compression.
+
+2bit Gradient Compression
+-
+2bit gradient compression takes two thresholds, one for positive 
values and
+other for negative thresholds. This works by limiting positive values in the
+gradient to the positive threshold, and limiting negative values to the
+negative threshold. Values which don't meet the thresholds are set to 0.
+By doing so, each value in the gradient is in one of three states. 2bits are
+used to represent these states, and every 16 float values in the original
+gradient can be represented using one float. This compressed representation
+can reduce communication costs. The difference between these values and
+original values is stored at the sender's end as residual and added to the
+gradient in the next iteration.
+
+When kvstore is 'local', gradient compression is used to reduce communication
+between multiple devices (gpus). Gradient is quantized on each GPU which
+computed the gradients, then sent to the GPU which merges the gradients. This
+receiving GPU dequantizes the gradients and merges them. Note that this
+increases memory usage on each GPU because of the residual array stored.
+
+When kvstore is 'dist', gradient compression is used to reduce communication
+from worker to server. Gradient is quantized on each worker which
+computed the gradients, then sent to the server which dequantizes
+this data and merges the gradients from each worker. Note that this
+increases CPU memory usage on each worker because of the residual array stored.
+Only worker to server communication is compressed in this setting.
+If each machine has multiple GPUs, currently this GPU to GPU communication is
+not compressed. Server to worker communication (in the case of pull) is also not
+compressed.
+
+To use 2bit compression, we need to specify `compress` as `2bit`.
+Only specifying `compress` would use default values
+for the other arguments of thresholds.
+To completely specify the arguments for 2bit compression, we would need to pass
+a dictionary which includes `positive_threshold` and `negative_threshold` like:
+{'compress':'2bit', 'positive_threshold':0.5, 'negative_threshold':-0.5}
 
 Review comment:
   is it positive_threshold or pos_threshold?
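
   For readers following the docstring quoted above, here is a minimal sketch of the thresholding-with-residual idea it describes. It works on plain Python lists and does no bit packing. The function name `quantize_2bit_values`, the default thresholds, and the use of `>=` / `<=` at the boundaries are assumptions made for illustration, not taken from the PR.

```python
def quantize_2bit_values(grad, residual, neg_threshold=-0.5, pos_threshold=0.5):
    """Return (quantized, new_residual) for one step.

    Hypothetical helper: each value ends up in one of three states
    (pos_threshold, neg_threshold, or 0), and the quantization error is
    kept in the residual so it can be added back on the next call.
    """
    quantized, new_residual = [], []
    for g, r in zip(grad, residual):
        acc = g + r  # gradient plus the error carried over from the last step
        if acc >= pos_threshold:
            q = pos_threshold
        elif acc <= neg_threshold:
            q = neg_threshold
        else:
            q = 0.0
        quantized.append(q)
        new_residual.append(acc - q)  # what the receiver did not get this time
    return quantized, new_residual


grad = [0.7, 0.2, -0.9, 0.3]
residual = [0.0] * len(grad)

q1, residual = quantize_2bit_values(grad, residual)  # q1 -> [0.5, 0.0, -0.5, 0.0]
q2, residual = quantize_2bit_values(grad, residual)  # q2 -> [0.5, 0.0, -0.5, 0.5]
```

   In the second call the last value crosses the positive threshold only because the residual from the first call was added back, which is the point of keeping the residual on the sender's side.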


[GitHub] piiswrong commented on a change in pull request #8342: [WIP] 2bit gradient compression

2017-10-21 Thread GitBox
piiswrong commented on a change in pull request #8342: [WIP] 2bit gradient 
compression
URL: https://github.com/apache/incubator-mxnet/pull/8342#discussion_r146112750
 
 

 ##
 File path: python/mxnet/kvstore.py
 ##
 @@ -349,6 +349,101 @@ def row_sparse_pull(self, key, out=None, priority=0, 
row_ids=None):
 check_call(_LIB.MXKVStorePullRowSparse(
 self.handle, mx_uint(len(ckeys)), ckeys, cvals, crow_ids, 
ctypes.c_int(priority)))
 
+def set_compress(self, compress_params=None):
+""" Specifies type of low-bit quantization for gradient compression if 
any,
+ and additional arguments depending on the type of compression being 
used.
+
+Parameters
+--
+compress_params : dict
+`compress_params` is a dictionary specifying the type and 
parameters
+for gradient compression. The key `compress` in this dictionary is 
a required argument
+and specifies the type of gradient compression. Other keys in this
+dictionary are optional and specific to the type of gradient 
compression.
+
+2bit Gradient Compression
 
 Review comment:
   Is there a paper for reference?


[GitHub] piiswrong commented on a change in pull request #8342: [WIP] 2bit gradient compression

2017-10-21 Thread GitBox
piiswrong commented on a change in pull request #8342: [WIP] 2bit gradient 
compression
URL: https://github.com/apache/incubator-mxnet/pull/8342#discussion_r146112886
 
 

 ##
 File path: src/ndarray/ndarray.cc
 ##
 @@ -558,6 +558,101 @@ void CopyFromTo(const NDArray& from, const NDArray& to, 
int priority) {
   }
 }
 
+void Quantize(const NDArray &from, NDArray *to, NDArray *residual, const std::string& compress,
+  const float neg_threshold, const float pos_threshold,
+  int priority) {
+  CHECK(from.shape().ndim() != 0)
+  << "source operands have zero dimension shape";
+  // important: callback must always capture by value
+  NDArray ret = *to;
+  NDArray res = *residual;
+  int a = from.ctx().dev_mask();
+  int b = to->ctx().dev_mask();
+  if (a == cpu::kDevMask && b == cpu::kDevMask) {
+if (compress == "2bit") {
+  Engine::Get()->PushSync([from, res, ret, neg_threshold, 
pos_threshold](RunContext ctx) {
+  std::vector<TBlob> inputs(3);
+  inputs[0] = from.data();
+  inputs[1] = res.data();
+  inputs[2] = ret.data();
+  mxnet::ndarray::Quantize2BitDispatch(ctx.get_stream(), 
inputs,
+neg_threshold, 
pos_threshold);
+}, from.ctx(), {from.var()}, {ret.var(), res.var()},
+FnProperty::kNormal, priority, PROFILER_MESSAGE("QuantizeCPU"));
+} else {
+  LOG(FATAL) << "Unsupported Quantization";
+}
+  } else {
+#if MXNET_USE_CUDA
+if (a == gpu::kDevMask && b == gpu::kDevMask) {
+  if (compress == "2bit") {
+Engine::Get()->PushSync([from, res, ret, neg_threshold, 
pos_threshold](RunContext ctx) {
+std::vector<TBlob> inputs(3);
+inputs[0] = from.data();
+inputs[1] = res.data();
+inputs[2] = ret.data();
+mxnet::ndarray::Quantize2BitDispatch(ctx.get_stream(), 
inputs,
+  neg_threshold, 
pos_threshold);
+// Wait GPU kernel to complete
+ctx.get_stream()->Wait();
+  }, from.ctx(), {from.var()}, {ret.var(), res.var()},
+  FnProperty::kNormal, priority, PROFILER_MESSAGE("QuantizeGPU"));
+} else {
+  LOG(FATAL) << "Unsupported Quantization";
+}
+} else {
+  LOG(FATAL) << "unknown device mask";
+}
+#else
+LOG(FATAL) << MXNET_GPU_NOT_ENABLED_ERROR;
+#endif
+  }
+}
+
+void Dequantize(const NDArray &from, NDArray *to, const std::string& compress, int priority) {
+  CHECK(from.shape().ndim() != 0)
+<< "source operands have zero dimension shape";
+  // important: callback must always capture by value
+  NDArray ret = *to;
+  int a = from.ctx().dev_mask();
+  int b = to->ctx().dev_mask();
+  if (a == cpu::kDevMask && b == cpu::kDevMask) {
+if (compress == "2bit") {
+  Engine::Get()->PushSync([from, ret](RunContext ctx) {
+std::vector<TBlob> inputs(2);
+inputs[0] = from.data();
+inputs[1] = ret.data();
+mxnet::ndarray::Dequantize2BitDispatch(ctx.get_stream(), 
inputs);
+  }, from.ctx(), {from.var()}, {ret.var()},
+  FnProperty::kNormal, priority, PROFILER_MESSAGE("DequantizeCPU"));
+} else {
+  LOG(FATAL) << "Unsupported dequantization " << compress << std::endl;
+}
+  } else {
+#if MXNET_USE_CUDA
+if (a == gpu::kDevMask && b == gpu::kDevMask) {
+  if (compress == "2bit") {
 
 Review comment:
   agreed. Backend shouldn't use string constants for compression method


[GitHub] mseeger commented on issue #8361: Simplified unary/binary math operators

2017-10-21 Thread GitBox
mseeger commented on issue #8361: Simplified unary/binary math operators
URL: https://github.com/apache/incubator-mxnet/pull/8361#issuecomment-338430529
 
 
   Which functions in mshadow_op.h would be used in an LSTM? sigmoid, or relu? 
We could easily change them back to the old code and compare run times.


[GitHub] piiswrong commented on a change in pull request #8342: [WIP] 2bit gradient compression

2017-10-21 Thread GitBox
piiswrong commented on a change in pull request #8342: [WIP] 2bit gradient 
compression
URL: https://github.com/apache/incubator-mxnet/pull/8342#discussion_r146112917
 
 

 ##
 File path: src/ndarray/ndarray.cc
 ##
 @@ -558,6 +558,101 @@ void CopyFromTo(const NDArray& from, const NDArray& to, 
int priority) {
   }
 }
 
+void Quantize(const NDArray &from, NDArray *to, NDArray *residual, const std::string& compress,
+  const float neg_threshold, const float pos_threshold,
+  int priority) {
+  CHECK(from.shape().ndim() != 0)
+  << "source operands have zero dimension shape";
+  // important: callback must always capture by value
+  NDArray ret = *to;
+  NDArray res = *residual;
+  int a = from.ctx().dev_mask();
+  int b = to->ctx().dev_mask();
+  if (a == cpu::kDevMask && b == cpu::kDevMask) {
+if (compress == "2bit") {
+  Engine::Get()->PushSync([from, res, ret, neg_threshold, 
pos_threshold](RunContext ctx) {
+  std::vector<TBlob> inputs(3);
+  inputs[0] = from.data();
+  inputs[1] = res.data();
+  inputs[2] = ret.data();
+  mxnet::ndarray::Quantize2BitDispatch(ctx.get_stream(), 
inputs,
+neg_threshold, 
pos_threshold);
+}, from.ctx(), {from.var()}, {ret.var(), res.var()},
+FnProperty::kNormal, priority, PROFILER_MESSAGE("QuantizeCPU"));
+} else {
+  LOG(FATAL) << "Unsupported Quantization";
+}
+  } else {
+#if MXNET_USE_CUDA
+if (a == gpu::kDevMask && b == gpu::kDevMask) {
+  if (compress == "2bit") {
+Engine::Get()->PushSync([from, res, ret, neg_threshold, 
pos_threshold](RunContext ctx) {
+std::vector<TBlob> inputs(3);
+inputs[0] = from.data();
+inputs[1] = res.data();
+inputs[2] = ret.data();
+mxnet::ndarray::Quantize2BitDispatch(ctx.get_stream(), 
inputs,
+  neg_threshold, 
pos_threshold);
+// Wait GPU kernel to complete
+ctx.get_stream()->Wait();
+  }, from.ctx(), {from.var()}, {ret.var(), res.var()},
+  FnProperty::kNormal, priority, PROFILER_MESSAGE("QuantizeGPU"));
+} else {
+  LOG(FATAL) << "Unsupported Quantization";
+}
+} else {
+  LOG(FATAL) << "unknown device mask";
+}
+#else
+LOG(FATAL) << MXNET_GPU_NOT_ENABLED_ERROR;
+#endif
+  }
+}
+
+void Dequantize(const NDArray &from, NDArray *to, const std::string& compress, int priority) {
+  CHECK(from.shape().ndim() != 0)
+<< "source operands have zero dimension shape";
+  // important: callback must always capture by value
+  NDArray ret = *to;
+  int a = from.ctx().dev_mask();
+  int b = to->ctx().dev_mask();
+  if (a == cpu::kDevMask && b == cpu::kDevMask) {
+if (compress == "2bit") {
+  Engine::Get()->PushSync([from, ret](RunContext ctx) {
+std::vector<TBlob> inputs(2);
+inputs[0] = from.data();
+inputs[1] = ret.data();
+mxnet::ndarray::Dequantize2BitDispatch(ctx.get_stream(), 
inputs);
+  }, from.ctx(), {from.var()}, {ret.var()},
+  FnProperty::kNormal, priority, PROFILER_MESSAGE("DequantizeCPU"));
+} else {
+  LOG(FATAL) << "Unsupported dequantization " << compress << std::endl;
+}
+  } else {
+#if MXNET_USE_CUDA
+if (a == gpu::kDevMask && b == gpu::kDevMask) {
+  if (compress == "2bit") {
 
 Review comment:
   Also, I doubt this should be an NDArray method.
   It would be cleaner to abstract everything related to gradient compression into a
separate class.


[GitHub] piiswrong commented on a change in pull request #8342: [WIP] 2bit gradient compression

2017-10-21 Thread GitBox
piiswrong commented on a change in pull request #8342: [WIP] 2bit gradient 
compression
URL: https://github.com/apache/incubator-mxnet/pull/8342#discussion_r146112984
 
 

 ##
 File path: src/operator/contrib/two_bit_quantize.cc
 ##
 @@ -0,0 +1,122 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+/*!
+ * \file two_bit_quantize.cc
+ * \brief registers quantize_2bit, dequantize_2bit
+ * and create_2bit operators with nnvm
+ */
+#include "./two_bit_quantize-inl.h"
+
+namespace mxnet {
+namespace op {
+
+DMLC_REGISTER_PARAMETER(TwoBitParam);
+
+NNVM_REGISTER_OP(_contrib_quantize_2bit)
+.describe(R"code(Quantize an input tensor into using 2bits for each value using
+user-specified thresholds, while storing quantization error in residual array.
+
+The quantize_2bit operator takes 5 arguments and is called as follows:
+`quantize_2bit(array, residual, out, neg_threshold, pos_threshold)`.
 
 Review comment:
   Also I don't see why these need to be exposed as operators. It looks like 
for now kvstore handles everything internally.


[GitHub] piiswrong commented on a change in pull request #8342: [WIP] 2bit gradient compression

2017-10-21 Thread GitBox
piiswrong commented on a change in pull request #8342: [WIP] 2bit gradient 
compression
URL: https://github.com/apache/incubator-mxnet/pull/8342#discussion_r146112556
 
 

 ##
 File path: src/operator/contrib/two_bit_quantize.cc
 ##
 @@ -0,0 +1,122 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+/*!
+ * \file two_bit_quantize.cc
+ * \brief registers quantize_2bit, dequantize_2bit
+ * and create_2bit operators with nnvm
+ */
+#include "./two_bit_quantize-inl.h"
+
+namespace mxnet {
+namespace op {
+
+DMLC_REGISTER_PARAMETER(TwoBitParam);
+
+NNVM_REGISTER_OP(_contrib_quantize_2bit)
+.describe(R"code(Quantize an input tensor into using 2bits for each value using
+user-specified thresholds, while storing quantization error in residual array.
+
+The quantize_2bit operator takes 5 arguments and is called as follows:
+`quantize_2bit(array, residual, out, neg_threshold, pos_threshold)`.
 
 Review comment:
   why not out, residual = quantize_2bit(weight, residual)?
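
   The operator description quoted above stores each value "using 2bits". As a rough sketch of what that layout could look like, the snippet below packs 16 two-bit codes into one 32-bit word and unpacks them again. The code assignment (0 = zero, 1 = positive threshold, 2 = negative threshold), the bit order, and the names `pack_2bit` / `unpack_2bit` are assumptions for illustration only; the PR may use a different encoding.

```python
CODES_PER_WORD = 16  # 16 codes * 2 bits = 32 bits, the size of one float32


def pack_2bit(codes):
    """Pack a list of 2-bit codes (0, 1 or 2) into 32-bit integers."""
    words = []
    for start in range(0, len(codes), CODES_PER_WORD):
        word = 0
        for offset, code in enumerate(codes[start:start + CODES_PER_WORD]):
            word |= (code & 0b11) << (2 * offset)  # code i occupies bits 2i and 2i+1
        words.append(word)
    return words


def unpack_2bit(words, count):
    """Inverse of pack_2bit; count is the number of original values."""
    codes = []
    for word in words:
        for offset in range(CODES_PER_WORD):
            codes.append((word >> (2 * offset)) & 0b11)
    return codes[:count]  # drop the padding in the last word


codes = [0, 1, 2] * 6            # 18 values, so two words are needed
packed = pack_2bit(codes)
restored = unpack_2bit(packed, len(codes))
```

   A function written this way returns its output instead of filling a preallocated `out` argument, which is close in spirit to the `out, residual = quantize_2bit(weight, residual)` form suggested in the comment.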


[GitHub] piiswrong closed pull request #8369: Fix the Readme

2017-10-21 Thread GitBox
piiswrong closed pull request #8369: Fix the Readme
URL: https://github.com/apache/incubator-mxnet/pull/8369
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/README.md b/README.md
index fc252a7a72..8a65b4060c 100644
--- a/README.md
+++ b/README.md
@@ -22,7 +22,6 @@ deep learning systems, and interesting insights of DL systems 
for hackers.
 
 What's New
 --
-* [Version 0.12.0 
Release](https://github.com/apache/incubator-mxnet/releases/tag/0.12.0) - MXNet 
0.12.0 Release.
 * [Version 0.11.0 
Release](https://github.com/apache/incubator-mxnet/releases/tag/0.11.0) - MXNet 
0.11.0 Release.
 * [Apache Incubator](http://incubator.apache.org/projects/mxnet.html) - We are 
now an Apache Incubator project.
 * [Version 0.10.0 Release](https://github.com/dmlc/mxnet/releases/tag/v0.10.0) 
- MXNet 0.10.0 Release.


 


[incubator-mxnet] branch master updated: Update cudnn_algoreg-inl.h (#7988)

2017-10-21 Thread jxie
This is an automated email from the ASF dual-hosted git repository.

jxie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new ad20d91  Update cudnn_algoreg-inl.h (#7988)
ad20d91 is described below

commit ad20d91ebd4286e1001ce4a964e2c929e02247dc
Author: solin319 
AuthorDate: Sun Oct 22 03:13:02 2017 +0800

Update cudnn_algoreg-inl.h (#7988)
---
 src/operator/cudnn_algoreg-inl.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/operator/cudnn_algoreg-inl.h b/src/operator/cudnn_algoreg-inl.h
index ccc5140..c10593f 100644
--- a/src/operator/cudnn_algoreg-inl.h
+++ b/src/operator/cudnn_algoreg-inl.h
@@ -102,7 +102,7 @@ class CuDNNAlgoReg {
 ParamKey key{param, in_shape[0], in_shape[1], out_shape[0], 
cudnn_data_type,
  cudnn_forward_compute_type, cudnn_backward_compute_type, 
sm_arch};
 std::lock_guard<std::mutex> guard(lock_);
-if (reg_.size() % 50 == 0) {
+if (param.cudnn_tune.value() && reg_.size() % 50 == 0) {
   LOG(INFO) << "Running performance tests to find the best convolution "
"algorithm, "
"this can take a while... (setting env variable "

-- 
To stop receiving notification emails like this one, please contact
['"comm...@mxnet.apache.org" '].


[GitHub] piiswrong closed pull request #7988: solve problem in print "cudnn autotune"

2017-10-21 Thread GitBox
piiswrong closed pull request #7988: solve problem in print "cudnn autotune" 
URL: https://github.com/apache/incubator-mxnet/pull/7988
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/src/operator/cudnn_algoreg-inl.h b/src/operator/cudnn_algoreg-inl.h
index b27d2be297..d90b6609d2 100644
--- a/src/operator/cudnn_algoreg-inl.h
+++ b/src/operator/cudnn_algoreg-inl.h
@@ -102,7 +102,7 @@ class CuDNNAlgoReg {
 ParamKey key{param, in_shape[0], in_shape[1], out_shape[0], 
cudnn_data_type,
  cudnn_forward_compute_type, cudnn_backward_compute_type, 
sm_arch};
 std::lock_guard<std::mutex> guard(lock_);
-if (reg_.size() % 50 == 0) {
+if (param.cudnn_tune.value() && reg_.size() % 50 == 0) {
   LOG(INFO) << "Running performance tests to find the best convolution "
"algorithm, "
"this can take a while... (setting env variable "


 


[GitHub] szha closed pull request #8171: add profile option for frontend profiling to image script

2017-10-21 Thread GitBox
szha closed pull request #8171: add profile option for frontend profiling to 
image script
URL: https://github.com/apache/incubator-mxnet/pull/8171
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/example/gluon/image_classification.py 
b/example/gluon/image_classification.py
index 8481afb50c..a67da35341 100644
--- a/example/gluon/image_classification.py
+++ b/example/gluon/image_classification.py
@@ -64,6 +64,9 @@
 parser.add_argument('--kvstore', type=str, default='device',
 help='kvstore to use for trainer/module.')
 parser.add_argument('--log-interval', type=int, default=50, help='Number of 
batches to wait before logging.')
+parser.add_argument('--profile', action='store_true',
+help='Option to turn on memory profiling for front-end, '\
+ 'and prints out the memory usage by python function 
at the end.')
 opt = parser.parse_args()
 
 logging.info(opt)
@@ -166,7 +169,7 @@ def train(epochs, ctx):
 
 net.save_params('image-classifier-%s-%d.params'%(opt.model, epochs))
 
-if __name__ == '__main__':
+def main():
 if opt.mode == 'symbolic':
 data = mx.sym.var('data')
 out = net(data)
@@ -186,3 +189,16 @@ def train(epochs, ctx):
 if opt.mode == 'hybrid':
 net.hybridize()
 train(opt.epochs, context)
+
+if __name__ == '__main__':
+if opt.profile:
+import hotshot, hotshot.stats
+prof = hotshot.Profile('image-classifier-%s-%s.prof'%(opt.model, 
opt.mode))
+prof.runcall(main)
+prof.close()
+stats = hotshot.stats.load('image-classifier-%s-%s.prof'%(opt.model, 
opt.mode))
+stats.strip_dirs()
+stats.sort_stats('cumtime', 'calls')
+stats.print_stats()
+else:
+main()
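
Note on the diff above: `hotshot` exists only in Python 2 and was removed in Python 3. It also measures call counts and time rather than memory, despite the wording of the `--profile` help text. The snippet below is a minimal sketch of the same run-then-report flow using the standard `cProfile` and `pstats` modules. The `main` function here is a stand-in workload and `run_profiled` is a hypothetical helper name; neither comes from the merged script.

```python
import cProfile
import io
import os
import pstats
import tempfile


def main():
    # Stand-in workload; in the example script this would be the training entry point.
    return sum(i * i for i in range(10000))


def run_profiled(func, prof_path):
    """Run func under cProfile, save raw stats to prof_path, return (result, report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    profiler.dump_stats(prof_path)

    # Load the saved stats and format them the same way the diff does.
    buffer = io.StringIO()
    stats = pstats.Stats(prof_path, stream=buffer)
    stats.strip_dirs().sort_stats('cumtime', 'calls').print_stats()
    return result, buffer.getvalue()


prof_path = os.path.join(tempfile.mkdtemp(), 'image-classifier-demo.prof')
result, report = run_profiled(main, prof_path)
print(report)
```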


 


[incubator-mxnet] branch master updated: add profile option for frontend profiling to image script (#8171)

2017-10-21 Thread zhasheng
This is an automated email from the ASF dual-hosted git repository.

zhasheng pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new 590d7b0  add profile option for frontend profiling to image script 
(#8171)
590d7b0 is described below

commit 590d7b0fd96afdd1205fc0df15a0dc31db862f1f
Author: Sheng Zha 
AuthorDate: Sat Oct 21 14:32:36 2017 -0700

add profile option for frontend profiling to image script (#8171)

* add profile option for frontend profiling to image script

* Update image_classification.py

* Update image_classification.py
---
 example/gluon/image_classification.py | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/example/gluon/image_classification.py 
b/example/gluon/image_classification.py
index 8481afb..a67da35 100644
--- a/example/gluon/image_classification.py
+++ b/example/gluon/image_classification.py
@@ -64,6 +64,9 @@ parser.add_argument('--use-pretrained', action='store_true',
 parser.add_argument('--kvstore', type=str, default='device',
 help='kvstore to use for trainer/module.')
 parser.add_argument('--log-interval', type=int, default=50, help='Number of 
batches to wait before logging.')
+parser.add_argument('--profile', action='store_true',
+help='Option to turn on memory profiling for front-end, '\
+ 'and prints out the memory usage by python function 
at the end.')
 opt = parser.parse_args()
 
 logging.info(opt)
@@ -166,7 +169,7 @@ def train(epochs, ctx):
 
 net.save_params('image-classifier-%s-%d.params'%(opt.model, epochs))
 
-if __name__ == '__main__':
+def main():
 if opt.mode == 'symbolic':
 data = mx.sym.var('data')
 out = net(data)
@@ -186,3 +189,16 @@ if __name__ == '__main__':
 if opt.mode == 'hybrid':
 net.hybridize()
 train(opt.epochs, context)
+
+if __name__ == '__main__':
+if opt.profile:
+import hotshot, hotshot.stats
+prof = hotshot.Profile('image-classifier-%s-%s.prof'%(opt.model, 
opt.mode))
+prof.runcall(main)
+prof.close()
+stats = hotshot.stats.load('image-classifier-%s-%s.prof'%(opt.model, 
opt.mode))
+stats.strip_dirs()
+stats.sort_stats('cumtime', 'calls')
+stats.print_stats()
+else:
+main()



[GitHub] reminisce commented on a change in pull request #8371: Add note in the doc for using naive engine in multithreading environment

2017-10-21 Thread GitBox
reminisce commented on a change in pull request #8371: Add note in the doc for 
using naive engine in multithreading environment
URL: https://github.com/apache/incubator-mxnet/pull/8371#discussion_r146114557
 
 

 ##
 File path: docs/faq/env_var.md
 ##
 @@ -56,6 +56,9 @@ export MXNET_GPU_WORKER_NTHREADS=3
 - NaiveEngine: A very simple engine that uses the master thread to do the 
computation synchronously. Setting this engine disables multi-threading. You 
can use this type for debugging in case of any error. Backtrace will give you 
the series of calls that lead to the error. Remember to set MXNET_ENGINE_TYPE 
back to empty after debugging.
 - ThreadedEngine: A threaded engine that uses a global thread pool to 
schedule jobs.
 - ThreadedEnginePerDevice: A threaded engine that allocates thread per GPU 
and executes jobs asynchronously.
+  - Note: ThreadedEngine and ThreadedEnginePerDevice are not thread-safe. 
Switch to using NaiveEngine
+  if you want to have multiple threads interacting with a single MXNet 
model at the same time.
 
 Review comment:
   Good suggestion and example. But I'm not following the reasoning of the last 
sentence `Since the fork of a process replicates the complete process address 
space including threads, keeping MXNet single-threaded via the use of 
NaiveEngine makes it safe.`
   
   The ThreadedEngine of MXNet schedules operations based upon the availability 
of the NDArrays in their corresponding operation functions. The root cause of the 
problem we saw is that multiple threads push operations on NDArrays to the 
MXNet ThreadedEngine simultaneously. This can cause
   1. a data race, e.g. thread1 and thread2 writing to the same NDArray memory 
space.
   2. a deadlock, e.g. thread1 trying to copy from NDArray1 to NDArray2 while 
thread2 is trying to copy from NDArray2 to NDArray1, with both waiting for 
the source arrays to become ready for reading.
   
   If multiple threads in a single process are already interacting with MXNet, 
does it matter whether the process is forked or not?
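
   To make the data-race point concrete, here is a generic sketch of serializing access to a shared object with a single lock, so that only one thread at a time touches it. This is a different technique from switching to NaiveEngine and is shown only as an illustration: `SharedModel` is a hypothetical stand-in class, not an MXNet API, and whether a lock like this is sufficient for a real MXNet model is not something this thread establishes.

```python
import threading


class SharedModel:
    """Hypothetical stand-in for an object that is not safe to call concurrently."""

    def __init__(self):
        self.calls = 0

    def predict(self, x):
        self.calls += 1  # unsynchronized read-modify-write; needs external locking
        return x * 2


model = SharedModel()
model_lock = threading.Lock()
results = {}


def worker(i):
    # Every thread takes the same lock, so calls into the model never overlap.
    with model_lock:
        results[i] = model.predict(i)


threads = [threading.Thread(target=worker, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

   Because there is only one lock, the lock-ordering deadlock described in point 2 cannot occur between these workers; the cost is that the calls no longer run in parallel.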


[GitHub] mseeger commented on issue #8361: Simplified unary/binary math operators

2017-10-21 Thread GitBox
mseeger commented on issue #8361: Simplified unary/binary math operators
URL: https://github.com/apache/incubator-mxnet/pull/8361#issuecomment-338430381
 
 
   Due to the unary/binary math functions?
   
   In principle, at least for the majority of all expressions, we should go 
through the same computations for DType != double, and I suppose LSTMs probably 
use float?
   
   With the current PR, I have now also changed the gradient expressions to be 
computed in float, when before they were computed in DType. But I'd be 
surprised if this would have a large effect.
   
   But we are also seeing these weird compilation errors for Linux CentOS ...


[GitHub] piiswrong commented on a change in pull request #8342: [WIP] 2bit gradient compression

2017-10-21 Thread GitBox
piiswrong commented on a change in pull request #8342: [WIP] 2bit gradient 
compression
URL: https://github.com/apache/incubator-mxnet/pull/8342#discussion_r146112785
 
 

 ##
 File path: python/mxnet/kvstore.py
 ##
 @@ -349,6 +349,101 @@ def row_sparse_pull(self, key, out=None, priority=0, 
row_ids=None):
 check_call(_LIB.MXKVStorePullRowSparse(
 self.handle, mx_uint(len(ckeys)), ckeys, cvals, crow_ids, 
ctypes.c_int(priority)))
 
+def set_compress(self, compress_params=None):
 
 Review comment:
   set_gradient_compression(self, method, **kwargs)
   same for c++ API
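
   A rough sketch of what the suggested `set_gradient_compression(self, method, **kwargs)` signature could look like on the Python side. `KVStoreSketch` is a hypothetical stand-in class, and the supported method table, the keyword names, and the default thresholds are all assumptions for illustration; none of this is the MXNet implementation.

```python
class KVStoreSketch:
    """Hypothetical stand-in; not the MXNet KVStore class."""

    # Assumed table of supported methods and their default parameters.
    _SUPPORTED = {'2bit': {'pos_threshold': 0.5, 'neg_threshold': -0.5}}

    def __init__(self):
        self.compression = None

    def set_gradient_compression(self, method, **kwargs):
        if method not in self._SUPPORTED:
            raise ValueError('unsupported compression method: %s' % method)
        params = dict(self._SUPPORTED[method])  # start from the defaults
        unknown = set(kwargs) - set(params)
        if unknown:
            raise ValueError('unknown arguments: %s' % sorted(unknown))
        params.update(kwargs)  # caller overrides only what it passes
        self.compression = (method, params)


kv = KVStoreSketch()
kv.set_gradient_compression('2bit', pos_threshold=0.25)
```

   Taking the method as a positional argument and the thresholds as keywords lets unknown methods and misspelled parameter names be rejected up front, instead of being silently ignored inside a dictionary.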


[GitHub] piiswrong commented on issue #8361: Simplified unary/binary math operators

2017-10-21 Thread GitBox
piiswrong commented on issue #8361: Simplified unary/binary math operators
URL: https://github.com/apache/incubator-mxnet/pull/8361#issuecomment-338425272
 
 
   Chris observed 4x performance regression for LSTMs that might be related to 
the previous PR.
   
   @cjolivier01  have you found the cause?


[incubator-mxnet] branch master updated: Allow test to converge (#8351)

2017-10-21 Thread jxie
This is an automated email from the ASF dual-hosted git repository.

jxie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new 72b0051  Allow test to converge (#8351)
72b0051 is described below

commit 72b00516a79cb6ac32f0fdd77af73a01994c2956
Author: Chris Olivier 
AuthorDate: Sat Oct 21 12:06:21 2017 -0700

Allow test to converge (#8351)

* Allow test to converge

* Trigger build

* Trigger build

* Trigger build
---
 tests/python/train/test_dtype.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/python/train/test_dtype.py b/tests/python/train/test_dtype.py
index b0a5248..96912c0 100644
--- a/tests/python/train/test_dtype.py
+++ b/tests/python/train/test_dtype.py
@@ -99,7 +99,7 @@ def run_cifar10(train, val, use_module):
 devs = [mx.cpu(0)]
 net = get_net()
 mod = mx.mod.Module(net, context=devs)
-optim_args = {'learning_rate': 0.05, 'wd': 0.1, 'momentum': 0.9}
+optim_args = {'learning_rate': 0.001, 'wd': 0.1, 'momentum': 0.9}
 eval_metrics = ['accuracy']
 if use_module:
 executor = mx.mod.Module(net, context=devs)



[GitHub] piiswrong closed pull request #8351: Allow test to converge

2017-10-21 Thread GitBox
piiswrong closed pull request #8351: Allow test to converge
URL: https://github.com/apache/incubator-mxnet/pull/8351
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/tests/python/train/test_dtype.py b/tests/python/train/test_dtype.py
index b0a524815c..96912c09db 100644
--- a/tests/python/train/test_dtype.py
+++ b/tests/python/train/test_dtype.py
@@ -99,7 +99,7 @@ def run_cifar10(train, val, use_module):
 devs = [mx.cpu(0)]
 net = get_net()
 mod = mx.mod.Module(net, context=devs)
-optim_args = {'learning_rate': 0.05, 'wd': 0.1, 'momentum': 0.9}
+optim_args = {'learning_rate': 0.001, 'wd': 0.1, 'momentum': 0.9}
 eval_metrics = ['accuracy']
 if use_module:
 executor = mx.mod.Module(net, context=devs)


 


[GitHub] piiswrong closed pull request #8192: [Perl] emulate Python zip() for Perl

2017-10-21 Thread GitBox
piiswrong closed pull request #8192: [Perl] emulate Python zip() for Perl
URL: https://github.com/apache/incubator-mxnet/pull/8192
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/perl-package/AI-MXNet/lib/AI/MXNet/AutoGrad.pm 
b/perl-package/AI-MXNet/lib/AI/MXNet/AutoGrad.pm
index b49c0b69c5..221840e300 100644
--- a/perl-package/AI-MXNet/lib/AI/MXNet/AutoGrad.pm
+++ b/perl-package/AI-MXNet/lib/AI/MXNet/AutoGrad.pm
@@ -333,10 +333,10 @@ method grad(
 );
 
 my @ret;
-zip(sub {
-my ($handle, $stype) = @_;
+for(zip($grad_vars, $grad_stypes)) {
+my ($handle, $stype) = @$_;
 push @ret, AI::MXNet::NDArray->new(handle => $handle, stype => $stype);
-}, $grad_vars, $grad_stypes);
+}
 if(blessed $variables)
 {
 return $ret[0];
@@ -474,4 +474,4 @@ func _parse_head($heads, $head_grads)
 return (\@head_handles, \@hgrad_handles);
 }
 
-1;
\ No newline at end of file
+1;
diff --git a/perl-package/AI-MXNet/lib/AI/MXNet/Base.pm 
b/perl-package/AI-MXNet/lib/AI/MXNet/Base.pm
index a8da8470f5..f748ecbe1f 100644
--- a/perl-package/AI-MXNet/lib/AI/MXNet/Base.pm
+++ b/perl-package/AI-MXNet/lib/AI/MXNet/Base.pm
@@ -120,12 +120,17 @@ use constant GRAD_REQ_MAP => {
 
 sub zip
 {
-my ($sub, @arrays) = @_;
-my $len = @{ $arrays[0] };
-for (my $i = 0; $i < $len; $i++)
+if('CODE' eq ref $_[0])
 {
-$sub->(map { $_->[$i] } @arrays);
+# continue supporting the callback style
+my $code = shift;
+$code->(@$_) for AI::MXNetCAPI::py_zip(map { \@$_ } @_);
+return;
 }
+# the map() here may seem like a no-op, but triggers overloading or
+# whatever else is needed to make array-ish things actually arrays
+# before entering the low level list builder.
+return AI::MXNetCAPI::py_zip(map { \@$_ } @_);
 }
 
 =head2 enumerate
@@ -270,16 +275,14 @@ sub build_param_doc
 $remove_dup //= 1;
 my %param_keys;
 my @param_str;
-zip(sub {
-my ($key, $type_info, $desc) = @_;
-return if exists $param_keys{$key} and $remove_dup;
+for(zip($arg_names, $arg_types, $arg_descs)) {
+my ($key, $type_info, $desc) = @$_;
+next if exists $param_keys{$key} and $remove_dup;
 $param_keys{$key} = 1;
 my $ret = sprintf("%s : %s", $key, $type_info);
 $ret .= "\n".$desc if length($desc);
 push @param_str,  $ret;
-},
-$arg_names, $arg_types, $arg_descs
-);
+}
 return sprintf("Parameters\n--\n%s\n", join("\n", @param_str));
 }
 
diff --git a/perl-package/AI-MXNet/lib/AI/MXNet/Executor/Group.pm 
b/perl-package/AI-MXNet/lib/AI/MXNet/Executor/Group.pm
index 7ac054333c..acacffde1e 100644
--- a/perl-package/AI-MXNet/lib/AI/MXNet/Executor/Group.pm
+++ b/perl-package/AI-MXNet/lib/AI/MXNet/Executor/Group.pm
@@ -57,18 +57,18 @@ func _split_input_slice($batch_size, $work_load_list)
 # Load a array ref of arrays into a array ref of arrays specified by slices
 func _load_general($data, $targets, $major_axis)
 {
-zip(sub {
-my ($d_src, $d_targets, $axis) = @_;
+for(zip($data, $targets, $major_axis)) {
+my ($d_src, $d_targets, $axis) = @$_;
 if(blessed($d_targets) and $d_targets->isa('AI::MXNet::NDarray'))
 {
 $d_src->copyto($d_targets);
 }
 elsif(ref $d_targets eq 'ARRAY' and blessed $d_targets->[0])
 {
-zip(sub {
-my ($src, $dst) = @_;
+for(zip($d_src, $d_targets)) {
+my ($src, $dst) = @$_;
 $src->copyto($dst);
-}, $d_src, $d_targets);
+}
 }
 else
 {
@@ -124,7 +124,7 @@ func _load_general($data, $targets, $major_axis)
 }
 }
 }
-}, $data, $targets, $major_axis);
+}
 }
 
 # Load data into sliced arrays
@@ -144,8 +144,8 @@ func _load_label($batch, $targets, $major_axis)
 func _merge_multi_context($outputs, $major_axis)
 {
 my @rets;
-zip(sub {
-my ($tensors, $axis) = @_;
+for(zip($outputs, $major_axis)) {
+my ($tensors, $axis) = @$_;
 if($axis >= 0)
 {
 if(@$tensors == 1)
@@ -165,7 +165,7 @@ func _merge_multi_context($outputs, $major_axis)
 # first one, without checking they are actually the same
 push @rets, $tensors->[0];
 }
-}, $outputs, $major_axis);
+}
 return \@rets;
 }
 
@@ -353,9 +353,9 @@ method decide_slices(ArrayRef[AI::MXNet::DataDesc] 
$data_shapes)
 {
 confess("empty data_shapes array") unless @{ $data_shapes } > 0;
 my $major_axis = [map { AI::MXNet::DataDesc->get_batch_axis($_->layout) } 
@{ $data_shapes 
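
For readers more familiar with Python, the refactoring in this diff can be sketched as below. This is an illustrative Python analogue, not code from the PR: `zip_callback` is a hypothetical helper that mimics the old callback form of the Perl `zip`, and the function names and sample data are made up. The point to notice is that an early `return` inside the callback becomes `continue` (Perl `next`) in the loop form, as in the `build_param_doc` hunk.

```python
def zip_callback(func, *arrays):
    """Hypothetical stand-in for the old callback style: call func once per index."""
    for i in range(len(arrays[0])):
        func(*(a[i] for a in arrays))


def build_param_doc_callback(names, types, descs, remove_dup=True):
    """Callback form: the loop body lives in a nested function."""
    seen, lines = set(), []

    def visit(key, type_info, desc):
        if key in seen and remove_dup:
            return  # leaves only this one callback invocation
        seen.add(key)
        line = "%s : %s" % (key, type_info)
        if desc:
            line += "\n" + desc
        lines.append(line)

    zip_callback(visit, names, types, descs)
    return lines


def build_param_doc_loop(names, types, descs, remove_dup=True):
    """Loop form: iterate directly over the zipped tuples."""
    seen, lines = set(), []
    for key, type_info, desc in zip(names, types, descs):
        if key in seen and remove_dup:
            continue  # same effect as the early return above
        seen.add(key)
        line = "%s : %s" % (key, type_info)
        if desc:
            line += "\n" + desc
        lines.append(line)
    return lines
```

Both functions produce the same list for the same inputs; the loop form simply avoids the nested function and the trailing argument list.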

[incubator-mxnet] branch master updated: [Perl] emulate Python zip() for Perl (#8192)

2017-10-21 Thread jxie
This is an automated email from the ASF dual-hosted git repository.

jxie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new d6062f7  [Perl] emulate Python zip() for Perl (#8192)
d6062f7 is described below

commit d6062f7c450c2cde5aa37e1a9a883ffe90622941
Author: Robert Stone 
AuthorDate: Sat Oct 21 12:15:08 2017 -0700

[Perl] emulate Python zip() for Perl (#8192)

* [Perl] emulate Python zip() for Perl

* [Perl] retool zip() uses away from the callback form
---
 perl-package/AI-MXNet/lib/AI/MXNet/AutoGrad.pm |   8 +-
 perl-package/AI-MXNet/lib/AI/MXNet/Base.pm |  23 +++--
 .../AI-MXNet/lib/AI/MXNet/Executor/Group.pm|  74 +++---
 perl-package/AI-MXNet/lib/AI/MXNet/Gluon/Block.pm  |  24 ++---
 .../AI-MXNet/lib/AI/MXNet/Gluon/Parameter.pm   |   8 +-
 .../AI-MXNet/lib/AI/MXNet/Gluon/RNN/Cell.pm|  14 +--
 .../AI-MXNet/lib/AI/MXNet/Gluon/RNN/Layer.pm   |   6 +-
 .../AI-MXNet/lib/AI/MXNet/Gluon/Trainer.pm |   8 +-
 perl-package/AI-MXNet/lib/AI/MXNet/Gluon/Utils.pm  |   8 +-
 perl-package/AI-MXNet/lib/AI/MXNet/KVStore.pm  |   6 +-
 perl-package/AI-MXNet/lib/AI/MXNet/Metric.pm   |  66 ++---
 perl-package/AI-MXNet/lib/AI/MXNet/Module.pm   |  12 +--
 perl-package/AI-MXNet/lib/AI/MXNet/Monitor.pm  |  12 +--
 perl-package/AI-MXNet/lib/AI/MXNet/NDArray.pm  |  12 +--
 .../AI-MXNet/lib/AI/MXNet/NDArray/Slice.pm |  20 ++--
 perl-package/AI-MXNet/lib/AI/MXNet/RNN/Cell.pm |  18 ++--
 perl-package/AI-MXNet/lib/AI/MXNet/Symbol.pm   |   6 +-
 perl-package/AI-MXNet/t/test_autograd.t|   6 +-
 perl-package/AI-MXNet/t/test_base.t| 107 +
 perl-package/AI-MXNet/t/test_model_parallel.t  |   6 +-
 perl-package/AI-MXNet/t/test_module.t  |   8 +-
 perl-package/AI-MXNet/t/test_multi_device_exec.t   |   6 +-
 perl-package/AI-MXNetCAPI/mxnet.i  |  37 +++
 23 files changed, 318 insertions(+), 177 deletions(-)

diff --git a/perl-package/AI-MXNet/lib/AI/MXNet/AutoGrad.pm 
b/perl-package/AI-MXNet/lib/AI/MXNet/AutoGrad.pm
index b49c0b6..221840e 100644
--- a/perl-package/AI-MXNet/lib/AI/MXNet/AutoGrad.pm
+++ b/perl-package/AI-MXNet/lib/AI/MXNet/AutoGrad.pm
@@ -333,10 +333,10 @@ method grad(
 );
 
 my @ret;
-zip(sub {
-my ($handle, $stype) = @_;
+for(zip($grad_vars, $grad_stypes)) {
+my ($handle, $stype) = @$_;
 push @ret, AI::MXNet::NDArray->new(handle => $handle, stype => $stype);
-}, $grad_vars, $grad_stypes);
+}
 if(blessed $variables)
 {
 return $ret[0];
@@ -474,4 +474,4 @@ func _parse_head($heads, $head_grads)
 return (\@head_handles, \@hgrad_handles);
 }
 
-1;
\ No newline at end of file
+1;
diff --git a/perl-package/AI-MXNet/lib/AI/MXNet/Base.pm 
b/perl-package/AI-MXNet/lib/AI/MXNet/Base.pm
index a8da847..f748ecb 100644
--- a/perl-package/AI-MXNet/lib/AI/MXNet/Base.pm
+++ b/perl-package/AI-MXNet/lib/AI/MXNet/Base.pm
@@ -120,12 +120,17 @@ use constant GRAD_REQ_MAP => {
 
 sub zip
 {
-my ($sub, @arrays) = @_;
-my $len = @{ $arrays[0] };
-for (my $i = 0; $i < $len; $i++)
+if('CODE' eq ref $_[0])
 {
-$sub->(map { $_->[$i] } @arrays);
+# continue supporting the callback style
+my $code = shift;
+$code->(@$_) for AI::MXNetCAPI::py_zip(map { \@$_ } @_);
+return;
 }
+# the map() here may seem like a no-op, but triggers overloading or
+# whatever else is needed to make array-ish things actually arrays
+# before entering the low level list builder.
+return AI::MXNetCAPI::py_zip(map { \@$_ } @_);
 }
 
 =head2 enumerate
@@ -270,16 +275,14 @@ sub build_param_doc
 $remove_dup //= 1;
 my %param_keys;
 my @param_str;
-zip(sub {
-my ($key, $type_info, $desc) = @_;
-return if exists $param_keys{$key} and $remove_dup;
+for(zip($arg_names, $arg_types, $arg_descs)) {
+my ($key, $type_info, $desc) = @$_;
+next if exists $param_keys{$key} and $remove_dup;
 $param_keys{$key} = 1;
 my $ret = sprintf("%s : %s", $key, $type_info);
 $ret .= "\n".$desc if length($desc);
 push @param_str,  $ret;
-},
-$arg_names, $arg_types, $arg_descs
-);
+}
 return sprintf("Parameters\n--\n%s\n", join("\n", @param_str));
 }
 
diff --git a/perl-package/AI-MXNet/lib/AI/MXNet/Executor/Group.pm 
b/perl-package/AI-MXNet/lib/AI/MXNet/Executor/Group.pm
index 7ac0543..acacffd 100644
--- a/perl-package/AI-MXNet/lib/AI/MXNet/Executor/Group.pm
+++ b/perl-package/AI-MXNet/lib/AI/MXNet/Executor/Group.pm
@@ -57,18 +57,18 @@ func _split_input_slice($batch_size, $work_load_list)
 # Load a array ref of arrays into a array ref of arrays specified by 

[GitHub] yajiedesign commented on issue #4746: ImportError for mxnet: cannot import name libinfo

2017-10-21 Thread GitBox
yajiedesign commented on issue #4746: ImportError for mxnet: cannot import name 
libinfo
URL: 
https://github.com/apache/incubator-mxnet/issues/4746#issuecomment-338445615
 
 
   @Bumblebee1964 did you install CUDA 8.0?




[GitHub] reminisce commented on a change in pull request #8371: Add note in the doc for using naive engine in multithreading environment

2017-10-21 Thread GitBox
reminisce commented on a change in pull request #8371: Add note in the doc for 
using naive engine in multithreading environment
URL: https://github.com/apache/incubator-mxnet/pull/8371#discussion_r146114557
 
 

 ##
 File path: docs/faq/env_var.md
 ##
 @@ -56,6 +56,9 @@ export MXNET_GPU_WORKER_NTHREADS=3
 - NaiveEngine: A very simple engine that uses the master thread to do the 
computation synchronously. Setting this engine disables multi-threading. You 
can use this type for debugging in case of any error. Backtrace will give you 
the series of calls that lead to the error. Remember to set MXNET_ENGINE_TYPE 
back to empty after debugging.
 - ThreadedEngine: A threaded engine that uses a global thread pool to 
schedule jobs.
 - ThreadedEnginePerDevice: A threaded engine that allocates thread per GPU 
and executes jobs asynchronously.
+  - Note: ThreadedEngine and ThreadedEnginePerDevice are not thread-safe. 
Switch to using NaiveEngine
+  if you want to have multiple threads interacting with a single MXNet 
model at the same time.
 
 Review comment:
   Good suggestion and example. But I'm not following the reasoning of the last 
sentence `Since the fork of a process replicates the complete process address 
space including threads, keeping MXNet single-threaded via the use of 
NaiveEngine makes it safe.`
   
   The ThreadedEngine of MXNet schedules operations based upon the availability 
of NDArrays in their corresponding operation functions. The root cause of the 
problem we saw is that multiple threads push operations on NDArrays to the 
MXNet ThreadedEngine simultaneously. This can cause
   1. a data race, e.g. thread1 and thread2 writing to the same NDArray memory 
space.
   2. a deadlock, e.g. thread1 trying to copy from NDArray1 to NDArray2 while 
thread2 is trying to copy from NDArray2 to NDArray1, with both waiting for 
the read on their source arrays to be ready.
   
   If multiple threads in a single process are already interacting with MXNet, 
does it matter whether the process is forked or not?
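
The first failure mode listed above can be illustrated without MXNet. The sketch below is an analogue built only on Python's `threading` module: `ToyEngine` is a hypothetical stand-in for an engine whose `push` is not thread-safe, and the single lock shows one conventional way to serialize pushes from several threads. It does not reflect how ThreadedEngine or NaiveEngine are implemented.

```python
import threading


class ToyEngine:
    """Hypothetical engine whose push() is an unsynchronized read-modify-write."""

    def __init__(self):
        self.pending = []  # operations queued so far
        self.counter = 0   # number of pushes recorded

    def push(self, op):
        current = self.counter      # read
        self.pending.append(op)     # modify shared state
        self.counter = current + 1  # write back


def run_workers(engine, n_threads, n_ops, lock):
    """Push n_ops operations from each of n_threads threads, one at a time."""

    def worker(tid):
        for i in range(n_ops):
            with lock:  # only one thread may be inside push() at any moment
                engine.push((tid, i))

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return engine.counter
```

Because every push goes through the same lock, no update is lost, and with only one lock there is no lock ordering that could deadlock. The cost is that the pushes themselves no longer overlap.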




[GitHub] szha opened a new pull request #8377: re-enable tests

2017-10-21 Thread GitBox
szha opened a new pull request #8377: re-enable tests
URL: https://github.com/apache/incubator-mxnet/pull/8377
 
 
   ## Description ##
   Re-enable tests that were disabled in #7648, #8265, #7829, #8045. The tests 
are marked as flaky and are allowed a maximum of three runs.
   
   ## Checklist ##
   ### Essentials ###
   - [x] Changes are complete (i.e. I finished coding on this PR)
   - [x] All changes have test coverage
   - [x] To my best knowledge, examples are either not affected by this change, 
or have been fixed to be compatible with this change
   
   ### Changes ###
   - [x] re-enable tests and mark as flaky with three runs.
   
   ## Comments ##
   - The tests are still flaky and need fixing eventually.
   - Adding flaky tests for new features should be discouraged.




[GitHub] szha opened a new pull request #8378: upgrade MKL

2017-10-21 Thread GitBox
szha opened a new pull request #8378: upgrade MKL
URL: https://github.com/apache/incubator-mxnet/pull/8378
 
 
   ## Description ##
   upgrade mklml dependency
   
   ## Checklist ##
   ### Essentials ###
   - [x] Changes are complete (i.e. I finished coding on this PR)
   - [x] All changes have test coverage
   - [x] To my best knowledge, examples are either not affected by this change, 
or have been fixed to be compatible with this change
   
   ### Changes ###
   - [x] upgrade MKL version from 20170720 to 20170908




[GitHub] piiswrong closed pull request #8376: Fix Typo (classification)

2017-10-21 Thread GitBox
piiswrong closed pull request #8376: Fix Typo (classification)
URL: https://github.com/apache/incubator-mxnet/pull/8376
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/example/README.md b/example/README.md
index 12ada4d0ce..507b144ad6 100644
--- a/example/README.md
+++ b/example/README.md
@@ -53,7 +53,7 @@ If you want to contribute to this list and the examples, 
please open a new pull
 * [Fast R-CNN](https://github.com/precedenceguo/mx-rcnn) by [Jian 
Guo](https://github.com/precedenceguo)
 * "End2End Captcha Recognition (OCR)" by 
[xlvector](https://github.com/xlvector) [github 
link](https://github.com/xlvector/learning-dl/tree/master/mxnet/ocr) [Blog in 
Chinese](http://blog.xlvector.net/2016-05/mxnet-ocr-cnn/)
 * "Prediction step of xlvector's lstm ocr" by 
[melody-rain](https://github.com/melody-rain) [github 
link](https://github.com/melody-rain/mxnet/commit/46002e31fc34c746c01bcaa7ade999187068ad3c)
 [Blog in Chinese](https://zhuanlan.zhihu.com/p/22698511)
-* "Solving classificiation + regression with MXnet in Multi Input + Multi Obj" 
by [xlvector](https://github.com/xlvector) [github 
link](https://gist.github.com/xlvector/c304d74f9dd6a3b68a3387985482baac) [Blog 
in 
Chinese](http://blog.xlvector.net/2016-05/mxnet-regression-classification-for-concret-continuous-features/)
+* "Solving classification + regression with MXnet in Multi Input + Multi Obj" 
by [xlvector](https://github.com/xlvector) [github 
link](https://gist.github.com/xlvector/c304d74f9dd6a3b68a3387985482baac) [Blog 
in 
Chinese](http://blog.xlvector.net/2016-05/mxnet-regression-classification-for-concret-continuous-features/)
 * "Learn to sort by LSTM" by [xlvector](https://github.com/xlvector) [github 
link](https://github.com/xlvector/learning-dl/tree/master/mxnet/lstm_sort) 
[Blog in Chinese](http://blog.xlvector.net/2016-05/mxnet-lstm-example/)
 * [Neural Art using extremely lightweight (<500K) neural 
network](https://github.com/pavelgonchar/neural-art-mini) Lightweight version 
of mxnet neural art implementation by [Pavel 
Gonchar](https://github.com/pavelgonchar)
 * [Neural Art with generative networks](https://github.com/zhaw/neural_style) 
by [zhaw](https://github.com/zhaw)


 




[incubator-mxnet] branch master updated: Fix Typo (classification) (#8376)

2017-10-21 Thread jxie
This is an automated email from the ASF dual-hosted git repository.

jxie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new 725a542  Fix Typo (classification) (#8376)
725a542 is described below

commit 725a5425d49e5e52455ce19055260df7ffaaadb9
Author: jb 
AuthorDate: Sat Oct 21 19:38:16 2017 -0400

Fix Typo (classification) (#8376)

Fix a typo in the example readme.
---
 example/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/example/README.md b/example/README.md
index 12ada4d..507b144 100644
--- a/example/README.md
+++ b/example/README.md
@@ -53,7 +53,7 @@ If you want to contribute to this list and the examples, 
please open a new pull
 * [Fast R-CNN](https://github.com/precedenceguo/mx-rcnn) by [Jian 
Guo](https://github.com/precedenceguo)
 * "End2End Captcha Recognition (OCR)" by 
[xlvector](https://github.com/xlvector) [github 
link](https://github.com/xlvector/learning-dl/tree/master/mxnet/ocr) [Blog in 
Chinese](http://blog.xlvector.net/2016-05/mxnet-ocr-cnn/)
 * "Prediction step of xlvector's lstm ocr" by 
[melody-rain](https://github.com/melody-rain) [github 
link](https://github.com/melody-rain/mxnet/commit/46002e31fc34c746c01bcaa7ade999187068ad3c)
 [Blog in Chinese](https://zhuanlan.zhihu.com/p/22698511)
-* "Solving classificiation + regression with MXnet in Multi Input + Multi Obj" 
by [xlvector](https://github.com/xlvector) [github 
link](https://gist.github.com/xlvector/c304d74f9dd6a3b68a3387985482baac) [Blog 
in 
Chinese](http://blog.xlvector.net/2016-05/mxnet-regression-classification-for-concret-continuous-features/)
+* "Solving classification + regression with MXnet in Multi Input + Multi Obj" 
by [xlvector](https://github.com/xlvector) [github 
link](https://gist.github.com/xlvector/c304d74f9dd6a3b68a3387985482baac) [Blog 
in 
Chinese](http://blog.xlvector.net/2016-05/mxnet-regression-classification-for-concret-continuous-features/)
 * "Learn to sort by LSTM" by [xlvector](https://github.com/xlvector) [github 
link](https://github.com/xlvector/learning-dl/tree/master/mxnet/lstm_sort) 
[Blog in Chinese](http://blog.xlvector.net/2016-05/mxnet-lstm-example/)
 * [Neural Art using extremely lightweight (<500K) neural 
network](https://github.com/pavelgonchar/neural-art-mini) Lightweight version 
of mxnet neural art implementation by [Pavel 
Gonchar](https://github.com/pavelgonchar)
 * [Neural Art with generative networks](https://github.com/zhaw/neural_style) 
by [zhaw](https://github.com/zhaw)

-- 
To stop receiving notification emails like this one, please contact
comm...@mxnet.apache.org.


[GitHub] szha commented on issue #8340: Fill optimizations

2017-10-21 Thread GitBox
szha commented on issue #8340: Fill optimizations
URL: https://github.com/apache/incubator-mxnet/pull/8340#issuecomment-338442944
 
 
   The symbol side can be updated to use the new op as well. The frontend 
definition that needs updating is here: 
https://github.com/apache/incubator-mxnet/blob/725a5425d49e5e52455ce19055260df7ffaaadb9/python/mxnet/symbol/symbol.py#L2762




[GitHub] cjolivier01 commented on a change in pull request #8340: Fill optimizations

2017-10-21 Thread GitBox
cjolivier01 commented on a change in pull request #8340: Fill optimizations
URL: https://github.com/apache/incubator-mxnet/pull/8340#discussion_r146119887
 
 

 ##
 File path: src/operator/tensor/init_op.h
 ##
 @@ -164,19 +164,38 @@ inline bool InitStorageType(const nnvm::NodeAttrs& attrs,
   return true;
 }
 
+/*! \brief Fill output with a scalar integer value */
 template
 void FillCompute(const nnvm::NodeAttrs& attrs,
  const OpContext& ctx,
  const std::vector& inputs,
  const std::vector& req,
  const std::vector& outputs) {
-  using namespace mshadow;
-  using namespace mshadow::expr;
-  Stream *s = ctx.get_stream();
-  MSHADOW_TYPE_SWITCH(outputs[0].type_flag_, DType, {
-Tensor out = outputs[0].FlatTo1D(s);
-ASSIGN_DISPATCH(out, req[0], scalar(value));
-  });
+  if (req[0] != kNullOp) {
+mshadow::Stream *s = ctx.get_stream();
+MSHADOW_TYPE_SWITCH(outputs[0].type_flag_, DType, {
+  mxnet_op::Kernel::Launch(s,
 
 Review comment:
   op_with_req<> wrapper will handle this along with the other changes




[GitHub] cjolivier01 commented on issue #8340: Fill optimizations

2017-10-21 Thread GitBox
cjolivier01 commented on issue #8340: Fill optimizations
URL: https://github.com/apache/incubator-mxnet/pull/8340#issuecomment-338444522
 
 
   done




[GitHub] cjolivier01 opened a new pull request #8379: Use omp_get_max_threads() when OMP_NUM_THREADS environment set

2017-10-21 Thread GitBox
cjolivier01 opened a new pull request #8379: Use omp_get_max_threads() when 
OMP_NUM_THREADS environment set
URL: https://github.com/apache/incubator-mxnet/pull/8379
 
 
   The wrong API call is used here. This is only relevant if the 
OMP_NUM_THREADS environment variable is set.
   
   ## Description ##
   (Brief description on what this PR is about)
   
   ## Checklist ##
   ### Essentials ###
   - [ ] Passed code style checking (`make lint`)
   - [ ] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage
   - [ ] For user-facing API changes, API doc string has been updated.
   - [ ] To my best knowledge, examples are either not affected by this change, 
or have been fixed to be compatible with this change
   
   ### Changes ###
   - [ ] Feature1, tests, (and when applicable, API doc)
   - [ ] Feature2, tests, (and when applicable, API doc)
   
   ## Comments ##
   - If this change is a backward incompatible change, why must this change be 
made.
   - Interesting edge cases to note here
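
As background, in OpenMP `omp_get_max_threads()` reflects the `OMP_NUM_THREADS` setting, whereas `omp_get_num_procs()` reports the number of available processors regardless of it. The diff for this PR is not included in this message, so the following is only a Python analogue of that distinction. `effective_thread_count` is a hypothetical helper, not an MXNet or OpenMP function.

```python
import os


def effective_thread_count(environ=None):
    """Honor OMP_NUM_THREADS when it is set to a positive integer.

    Hypothetical helper: falls back to the processor count otherwise.
    Only the first entry of a comma-separated (nested-level) value is used.
    """
    env = os.environ if environ is None else environ
    raw = env.get("OMP_NUM_THREADS", "")
    first = raw.split(",")[0].strip()
    if first.isdecimal() and int(first) > 0:
        return int(first)
    return os.cpu_count() or 1
```

For example, `effective_thread_count({"OMP_NUM_THREADS": "4"})` returns 4 even on a machine with more processors, which is the behavior a user who sets the variable would expect.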
   




[GitHub] ZhichengHuang commented on issue #7593: why gluon is slower than PyTorch?

2017-10-21 Thread GitBox
ZhichengHuang commented on issue #7593: why gluon is slower than PyTorch?
URL: 
https://github.com/apache/incubator-mxnet/issues/7593#issuecomment-338448789
 
 
   @piiswrong This issue hasn't been solved. Can you give me some suggestions 
on how to make the transform faster, or on how to solve this issue? Thank you.




[GitHub] cjolivier01 opened a new pull request #8380: Memset/memcpy/omp profiling tester

2017-10-21 Thread GitBox
cjolivier01 opened a new pull request #8380: Memset/memcpy/omp profiling tester
URL: https://github.com/apache/incubator-mxnet/pull/8380
 
 
   ## Description ##
   This test doesn't assert anything; it measures the threshold at which memset 
or memcpy becomes slower than the equivalent OMP implementation.
   
   ## Checklist ##
   ### Essentials ###
   - [ ] Passed code style checking (`make lint`)
   - [ ] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage
   - [ ] For user-facing API changes, API doc string has been updated.
   - [ ] To my best knowledge, examples are either not affected by this change, 
or have been fixed to be compatible with this change
   
   ### Changes ###
   - [ ] Feature1, tests, (and when applicable, API doc)
   - [ ] Feature2, tests, (and when applicable, API doc)
   
   ## Comments ##
   - If this change is a backward incompatible change, why must this change be 
made.
   - Interesting edge cases to note here
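
A threshold measurement of this kind can be sketched as a sweep over buffer sizes that reports the first size at which one implementation is slower than another. The Python sketch below is a generic illustration under that assumption: `time_once`, `find_crossover` and the two copy functions are hypothetical, and it does not use memset, memcpy or OpenMP.

```python
import time


def time_once(func, size, repeats=5):
    """Best-of-N wall-clock time of func(size), in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(size)
        best = min(best, time.perf_counter() - start)
    return best


def find_crossover(sizes, baseline, candidate, measure=time_once):
    """Return the first size at which baseline is slower than candidate, else None."""
    for size in sizes:
        if measure(baseline, size) > measure(candidate, size):
            return size
    return None


def copy_whole(size):
    """Copy a zero-filled buffer in a single operation."""
    src = bytearray(size)
    return bytes(src)


def copy_chunked(size, chunk=4096):
    """Copy the same buffer in fixed-size chunks."""
    src = bytearray(size)
    return b"".join(bytes(src[i:i + chunk]) for i in range(0, size, chunk))
```

A call such as `find_crossover([2 ** k for k in range(10, 24)], copy_whole, copy_chunked)` gives a machine-dependent answer, which is why a profiling test like this reports numbers instead of asserting on them. Passing a custom `measure` makes the sweep logic itself checkable with synthetic costs.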
   




[GitHub] szha commented on issue #8152: gluon improvement

2017-10-21 Thread GitBox
szha commented on issue #8152: gluon improvement
URL: https://github.com/apache/incubator-mxnet/pull/8152#issuecomment-338452966
 
 
   added shape completion after loading parameters or finishing deferred init.

