[GitHub] benqua commented on issue #8129: [scala] Module api: label (1,1,868,868) and prediction (1,868,868)should have the same length

2017-10-03 Thread git
benqua commented on issue #8129: [scala] Module api: label (1,1,868,868) and 
prediction (1,868,868)should have the same length
URL: 
https://github.com/apache/incubator-mxnet/issues/8129#issuecomment-334054179
 
 
   Yes, I am conducting a segmentation task.
   My label has only one channel, with each value being either 1 or 0 depending on 
the class (providedLabel: softmax_label -> (1,1,868,868)).
   
   The output of outputShapes on the module is 
   OutShapes (module): ArrayBuffer((softmaxoutput0_output,(1,2,868,868)))
   
   You mean I shouldn't use the multi_output parameter?
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] eric-haibin-lin opened a new pull request #8141: update sparse LR example

2017-10-03 Thread git
eric-haibin-lin opened a new pull request #8141: update sparse LR example
URL: https://github.com/apache/incubator-mxnet/pull/8141
 
 
   - fix wrong metric name `log_loss` -> `nll_loss`
   - fix wrong README instructions for distributed training
   - add weighted loss for the output layer
   - add a list of sparse optimizers as argument choices
   
   @szha 
 



[GitHub] Ldpe2G commented on issue #8129: [scala] Module api: label (1,1,868,868) and prediction (1,868,868)should have the same length

2017-10-03 Thread git
Ldpe2G commented on issue #8129: [scala] Module api: label (1,1,868,868) and 
prediction (1,868,868)should have the same length
URL: 
https://github.com/apache/incubator-mxnet/issues/8129#issuecomment-334057398
 
 
   No, you should. And I suggest you try the intermediate API:
   
https://github.com/apache/incubator-mxnet/blob/8a4221bca2ddd4fa05840f7951a7216775021237/scala-package/examples/src/main/scala/ml/dmlc/mxnetexamples/module/MnistMlp.scala#L43
 
   
   Since the output shape is correct, it is strange to get such an error message. 
 





[GitHub] CodingCat commented on issue #8128: Adding code owners

2017-10-03 Thread git
CodingCat commented on issue #8128: Adding code owners
URL: https://github.com/apache/incubator-mxnet/pull/8128#issuecomment-333957516
 
 
   @gautamkmr I am totally fine with and supportive of this.
   
   As long as it is not something like "files a, b, c have to be signed off by Mr. 
xyz before merging the changes", it follows the Apache way, to my understanding.
   
   I am just bringing this up for everyone's awareness, since we do not want to 
fall into arguments too frequently.
 



[incubator-mxnet] branch szha-patch-1 created (now 585b0b8)

2017-10-03 Thread zhasheng
This is an automated email from the ASF dual-hosted git repository.

zhasheng pushed a change to branch szha-patch-1
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git.


  at 585b0b8  Update nn.md

No new revisions were added by this update.

-- 
To stop receiving notification emails like this one, please contact
['"comm...@mxnet.apache.org" '].


[GitHub] szha opened a new pull request #8134: Update nn.md

2017-10-03 Thread git
szha opened a new pull request #8134: Update nn.md
URL: https://github.com/apache/incubator-mxnet/pull/8134
 
 
   
 



[incubator-mxnet] branch master updated: Updating code owners (#8128)

2017-10-03 Thread jxie
This is an automated email from the ASF dual-hosted git repository.

jxie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new 236d1c2  Updating code owners (#8128)
236d1c2 is described below

commit 236d1c2dac8ab13f9d99c2af7c4febc663e3940c
Author: Gautam Kumar 
AuthorDate: Tue Oct 3 13:38:35 2017 -0700

Updating code owners (#8128)

In order to make the master branch protected all
the committers should be part of code owner.
---
 CODEOWNERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/CODEOWNERS b/CODEOWNERS
index 26fcf35..57b4ec3 100644
--- a/CODEOWNERS
+++ b/CODEOWNERS
@@ -1,7 +1,7 @@
 # Owners of Apache MXNet
 
 # Global owners
-*  @piiswrong @mli
+*  @apache/mxnet-committers
 
 # Owners of language bindings
 R-package/*@thirdwing



[GitHub] mli commented on issue #7995: Problem in acc metric

2017-10-03 Thread git
mli commented on issue #7995: Problem in acc metric
URL: https://github.com/apache/incubator-mxnet/pull/7995#issuecomment-333984572
 
 
   Can we just use ndarray to compute this instead of converting to numpy? If 
there is a large number of classes, say 10K, the conversion could be problematic.
   
   Sample code:
   
   ```
   def accuracy(output, label):
       # both output and label are ndarrays
       return (output.argmax(axis=1) == label).sum().asscalar()
   ```
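   The same idea can be checked with a small numpy analogue (numpy stands in for 
`mx.nd` here so the snippet is self-contained; the shapes are illustrative): 
argmax over the class axis compares directly against integer labels, so no dense 
per-class expansion is needed even with 10K classes.
   ```python
   import numpy as np

   # output: (batch, num_classes) scores; label: (batch,) integer class ids
   output = np.array([[0.1, 0.9],
                      [0.8, 0.2],
                      [0.3, 0.7]])
   label = np.array([1, 0, 0])

   correct = (output.argmax(axis=1) == label).sum()  # rows whose top class matches
   acc = correct / label.size
   ```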
 



[GitHub] aseyboldt commented on issue #8133: Infer_shape_partial for rank 0 arrays

2017-10-03 Thread git
aseyboldt commented on issue #8133: Infer_shape_partial for rank 0 arrays
URL: 
https://github.com/apache/incubator-mxnet/issues/8133#issuecomment-333958354
 
 
   That explains a lot :-)
   
   However `ndarray` does allow scalars:
   ```python
   >>> a = mx.nd.ones(())
   >>> a.shape
   ()
   >>> a.size
   1
   >>> a.asnumpy()
   array(1.0, dtype=float32)
   >>> mx.nd.ones((1,)).asnumpy()
   array([ 1.], dtype=float32)
   ```
   This seems to lead to the invalid memory access in the example above (which 
by the way I think is a serious bug).
   
   I feel bad about criticising someone else's project after using it for only 
a couple of days, but I have to admit that I'm kind of surprised that you would 
design a tensor library without the concept of a scalar. :sweat_smile: 
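   For comparison, numpy (whose semantics the quoted `mx.nd` session mirrors) 
treats the empty shape `()` as a genuine rank-0 scalar rather than an unknown 
shape:
   ```python
   import numpy as np

   a = np.ones(())             # rank-0 array: the empty tuple is a valid shape
   print(a.shape, a.size)      # () 1
   print(np.ones((1,)).shape)  # (1,) -- distinct from the rank-0 case
   ```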
 



[GitHub] anirudh2290 commented on a change in pull request #8020: Get bz2 data fix

2017-10-03 Thread git
anirudh2290 commented on a change in pull request #8020: Get bz2 data fix
URL: https://github.com/apache/incubator-mxnet/pull/8020#discussion_r142521336
 
 

 ##
 File path: python/mxnet/test_utils.py
 ##
 @@ -1411,8 +1411,29 @@ def read_data(label_url, image_url):
 'test_data':test_img, 'test_label':test_lbl}
 
 def get_bz2_data(data_dir, data_name, url, data_origin_name):
-    """Download and extract bz2 data."""
+    """Download and extract bz2 data.
+
+    Parameters
+    ----------
+
+    data_dir : str
+        Absolute or relative path of the directory in which to store bz2 files
+    data_name : str
+        Name of the output file into which the bz2 contents will be extracted
+    url : str
+        URL to download the data from
+    data_origin_name : str
+        Name of the downloaded bz2 file
+
+    Examples
+    --------
+    >>> get_bz2_data("data_dir", "kdda.t",
+    ...              "https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/kdda.t.bz2",
+    ...              "kdda.t.bz2")
+    """
+
     download(url, dirname=data_dir, overwrite=False)
+    cwd = os.path.abspath(os.getcwd())
     os.chdir(data_dir)
 
 Review comment:
   Removed chdir and used paths.
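   The reviewer's change (dropping `os.chdir` in favor of explicit paths) can be 
sketched roughly as below; `extract_bz2` is a hypothetical helper for 
illustration, not the actual `test_utils` code:
   ```python
   import bz2
   import os

   def extract_bz2(data_dir, data_name, data_origin_name):
       # Build absolute paths instead of chdir-ing into data_dir,
       # so the caller's working directory is never mutated.
       archive_path = os.path.join(data_dir, data_origin_name)
       output_path = os.path.join(data_dir, data_name)
       with bz2.BZ2File(archive_path) as src, open(output_path, "wb") as dst:
           dst.write(src.read())
       return output_path
   ```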
 



[GitHub] piiswrong opened a new pull request #8136: stable sum

2017-10-03 Thread git
piiswrong opened a new pull request #8136: stable sum
URL: https://github.com/apache/incubator-mxnet/pull/8136
 
 
   
 





[GitHub] szha commented on issue #8136: stable sum

2017-10-03 Thread git
szha commented on issue #8136: stable sum
URL: https://github.com/apache/incubator-mxnet/pull/8136#issuecomment-333983441
 
 
   Do you plan on using Kahan's summation on regular sum too? Our mean is using 
that as the reduce function.
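   For readers following along, Kahan (compensated) summation keeps a running 
error term that recovers the low-order bits plain floating-point accumulation 
discards; a minimal Python sketch of the idea (not the mshadow implementation):
   ```python
   def kahan_sum(values):
       total = 0.0
       comp = 0.0                  # running compensation for lost low-order bits
       for x in values:
           y = x - comp            # re-inject what was lost on the previous step
           t = total + y           # big + small: low-order bits of y may vanish
           comp = (t - total) - y  # algebraically 0; numerically the lost part
           total = t
       return total

   # Adding many tiny terms to 1.0: naive accumulation never moves off 1.0,
   # while the compensated sum retains their contribution.
   vals = [1.0] + [1e-16] * 10
   ```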
 



[GitHub] piiswrong closed pull request #8128: Adding code owners

2017-10-03 Thread git
piiswrong closed pull request #8128: Adding code owners
URL: https://github.com/apache/incubator-mxnet/pull/8128
 
 
   
 



[GitHub] piiswrong opened a new pull request #8135: Stable sum

2017-10-03 Thread git
piiswrong opened a new pull request #8135: Stable sum
URL: https://github.com/apache/incubator-mxnet/pull/8135
 
 
   
 



[GitHub] piiswrong closed pull request #8135: Stable sum

2017-10-03 Thread git
piiswrong closed pull request #8135: Stable sum
URL: https://github.com/apache/incubator-mxnet/pull/8135
 
 
   
 



[incubator-mxnet] branch master updated: Update loss.md (#8131)

2017-10-03 Thread zhasheng
This is an automated email from the ASF dual-hosted git repository.

zhasheng pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new 57d59ac  Update loss.md (#8131)
57d59ac is described below

commit 57d59ac008598bd1e7a13f4a4e7ae7a7b3da41f7
Author: Sheng Zha 
AuthorDate: Tue Oct 3 11:23:41 2017 -0700

Update loss.md (#8131)
---
 docs/api/python/gluon/loss.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/api/python/gluon/loss.md b/docs/api/python/gluon/loss.md
index 5c27ab3..347eb49 100644
--- a/docs/api/python/gluon/loss.md
+++ b/docs/api/python/gluon/loss.md
@@ -17,6 +17,7 @@ This package includes several commonly used loss functions in 
neural networks.
 L2Loss
 L1Loss
 SoftmaxCrossEntropyLoss
+SigmoidBinaryCrossEntropyLoss
 KLDivLoss
 CTCLoss
 ```



[incubator-mxnet] branch szha-patch-1 deleted (was c7550b0)

2017-10-03 Thread zhasheng
This is an automated email from the ASF dual-hosted git repository.

zhasheng pushed a change to branch szha-patch-1
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git.


 was c7550b0  Update loss.md

The revisions that were on this branch are still contained in
other references; therefore, this change does not discard any commits
from the repository.



[GitHub] szha closed pull request #8131: Update loss.md

2017-10-03 Thread git
szha closed pull request #8131: Update loss.md
URL: https://github.com/apache/incubator-mxnet/pull/8131
 
 
   
 



[GitHub] szha commented on issue #8133: Infer_shape_partial for rank 0 arrays

2017-10-03 Thread git
szha commented on issue #8133: Infer_shape_partial for rank 0 arrays
URL: 
https://github.com/apache/incubator-mxnet/issues/8133#issuecomment-333935130
 
 
   I don't think we have the concept of scalar in ndarray or symbol. Shape of 
`()` doesn't mean it's a scalar, it means its shape is unknown and needs to be 
inferred.
 



[GitHub] Ldpe2G commented on issue #8129: [scala] Module api: label (1,1,868,868) and prediction (1,868,868)should have the same length

2017-10-03 Thread git
Ldpe2G commented on issue #8129: [scala] Module api: label (1,1,868,868) and 
prediction (1,868,868)should have the same length
URL: 
https://github.com/apache/incubator-mxnet/issues/8129#issuecomment-334036151
 
 
   @benqua are you conducting a segmentation task? If so, the label should have 
only one channel, just like for the normal softmax. The multi_output parameter 
means the softmax is calculated along the channel axis.
   
   And I suggest you first print the output shapes by calling the 
`outputShapes` method of the Module just after you call the `bind` method.
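   "Softmax along the channel axis" can be illustrated with plain numpy (an 
illustration of the semantics, not the MXNet kernel): for an output of shape 
(batch, channels, H, W), the normalization runs over axis 1, so each pixel's 
class probabilities sum to 1.
   ```python
   import numpy as np

   def channel_softmax(x):
       # x: (batch, channels, H, W); normalize over the channel axis (axis=1)
       e = np.exp(x - x.max(axis=1, keepdims=True))  # subtract max for stability
       return e / e.sum(axis=1, keepdims=True)

   probs = channel_softmax(np.random.randn(1, 2, 4, 4))
   # probs.shape is still (1, 2, 4, 4); probs.sum(axis=1) is all ones
   ```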
 



[GitHub] gautamkmr commented on issue #8128: Adding code owners

2017-10-03 Thread git
gautamkmr commented on issue #8128: Adding code owners
URL: https://github.com/apache/incubator-mxnet/pull/8128#issuecomment-333992063
 
 
   @CodingCat  sure :) 
   @piiswrong Thanks!
 



[GitHub] benqua commented on issue #8129: [scala] Module api: label (1,1,868,868) and prediction (1,868,868)should have the same length

2017-10-03 Thread git
benqua commented on issue #8129: [scala] Module api: label (1,1,868,868) and 
prediction (1,868,868)should have the same length
URL: 
https://github.com/apache/incubator-mxnet/issues/8129#issuecomment-333992969
 
 
   ok, I checked and logged the shapes as suggested and realized that it is not 
the batch size dimension that is lost but the channel one (I had 1 for both, so I 
didn't notice at first).
   
   The network is a u-net, very similar to the one described in the original 
u-net paper.
   Each pixel can be in one of two classes, as in the original paper.
   So, the last layers are:
   ```scala
   // output
   val conv10 = Symbol.Convolution()()(Map("data" -> conv9, "num_filter" -> 
2, "kernel" -> "(1,1)"))
   val label  = Symbol.Variable("softmax_label")
   val so = Symbol.SoftmaxOutput()()(Map("data" -> conv10, "label" -> 
label, "multi_output" -> true))
   ```
   Now, when I run the code to train the network (posted above) with more 
logging, I get the following:
   ```
   2017-10-03 23:40:37,808 [run-main-0] [UNet] [INFO] - so - Shape: 
Vector((1,2,868,868))
   2017-10-03 23:40:37,897 [run-main-0] [TrainModuleUNet] [INFO] - symbol 
shape: Vector((1,2,868,868))
   2017-10-03 23:40:37,899 [run-main-0] [TrainModuleUNet] [INFO] - 
providedData: data -> (1,1,1052,1052)
   2017-10-03 23:40:37,900 [run-main-0] [TrainModuleUNet] [INFO] - 
providedLabel: softmax_label -> (1,1,868,868)
   2017-10-03 23:40:38,038 [run-main-0] [TrainModuleUNet] [INFO] - bound!
   2017-10-03 23:40:38,088 [run-main-0] [TrainModuleUNet] [INFO] - initialized!
   2017-10-03 23:40:38,089 [run-main-0] [ml.dmlc.mxnet.module.Module] [WARN] - 
Already binded, ignoring bind()
   MKL Build:20170720
   [error] (run-main-0) java.lang.IllegalArgumentException: requirement failed: 
label (1,1,868,868) and prediction (1,868,868)should have the same length.
   java.lang.IllegalArgumentException: requirement failed: label (1,1,868,868) 
and prediction (1,868,868)should have the same length.
at scala.Predef$.require(Predef.scala:224)
at ml.dmlc.mxnet.Accuracy$$anonfun$update$4.apply(EvalMetric.scala:111)
   (...)
   ```
   The output of my network has a shape of (1, 2, 868, 868). However, the error 
message says that the prediction shape is (1, 868, 868). How can this be?
   
   I also see that my label is likely not in the right shape (one channel with 
either 0 or 1, instead of two channels with the probabilities of the two 
classes). However, the bind step seems ok, which makes me think that there is 
possibly an implicit conversion done somewhere.
   
   Another very strange thing is that the program doesn't really stop after 
this exception. Memory and CPU usage continue to grow until I kill sbt. 
Despite the failed require, the C++ backend continues to work...
   
   Any hint about how to correctly use SoftmaxOutput with multi_output would be 
greatly appreciated. :)
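   One plausible reading of the mismatch, sketched in numpy (an illustration of 
the shapes involved, not the Scala `Accuracy` code): the metric argmaxes the 
(1, 2, 868, 868) prediction over the channel axis, which collapses it to 
(1, 868, 868), so the label would be expected with its channel dimension 
already squeezed.
   ```python
   import numpy as np

   pred = np.random.randn(1, 2, 8, 8)  # small stand-in for the (1, 2, 868, 868) output
   label = np.zeros((1, 1, 8, 8))      # label as provided, with a channel axis

   pred_classes = pred.argmax(axis=1)  # channel axis collapses -> (1, 8, 8)
   # label.squeeze(axis=1) has shape (1, 8, 8), matching pred_classes;
   # the unsqueezed (1, 1, 8, 8) label is what trips the length check.
   ```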
   
   
 



[GitHub] zhreshold closed pull request #8137: [Gluon] Object detection preview

2017-10-03 Thread git
zhreshold closed pull request #8137: [Gluon] Object detection preview
URL: https://github.com/apache/incubator-mxnet/pull/8137
 
 
   
 



[GitHub] zhreshold opened a new pull request #8137: [Gluon] Object detection preview

2017-10-03 Thread git
zhreshold opened a new pull request #8137: [Gluon] Object detection preview
URL: https://github.com/apache/incubator-mxnet/pull/8137
 
 
   
 



[incubator-mxnet] branch szha-patch-1 deleted (was 585b0b8)

2017-10-03 Thread zhasheng
This is an automated email from the ASF dual-hosted git repository.

zhasheng pushed a change to branch szha-patch-1
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git.


 was 585b0b8  Update nn.md

The revisions that were on this branch are still contained in
other references; therefore, this change does not discard any commits
from the repository.



[GitHub] szha closed pull request #8134: Update nn.md

2017-10-03 Thread git
szha closed pull request #8134: Update nn.md
URL: https://github.com/apache/incubator-mxnet/pull/8134
 
 
   
 



[incubator-mxnet] branch master updated: Update nn.md (#8134)

2017-10-03 Thread zhasheng
This is an automated email from the ASF dual-hosted git repository.

zhasheng pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new 8a4221b  Update nn.md (#8134)
8a4221b is described below

commit 8a4221bca2ddd4fa05840f7951a7216775021237
Author: Sheng Zha 
AuthorDate: Tue Oct 3 16:07:42 2017 -0700

Update nn.md (#8134)
---
 docs/api/python/gluon/nn.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/api/python/gluon/nn.md b/docs/api/python/gluon/nn.md
index d230860..5e2dbe0 100644
--- a/docs/api/python/gluon/nn.md
+++ b/docs/api/python/gluon/nn.md
@@ -22,6 +22,7 @@ This document lists the neural network blocks in Gluon:
 BatchNorm
 LeakyReLU
 Embedding
+Flatten
 ```
 
 



[GitHub] szha commented on issue #8105: Proposal: PR Template

2017-10-03 Thread git
szha commented on issue #8105: Proposal: PR Template
URL: 
https://github.com/apache/incubator-mxnet/issues/8105#issuecomment-334005341
 
 
   Thanks. @apache/mxnet-committers just to make sure everyone is aware.
 



[GitHub] szha commented on issue #8136: stable sum

2017-10-03 Thread git
szha commented on issue #8136: stable sum
URL: https://github.com/apache/incubator-mxnet/pull/8136#issuecomment-333994412
 
 
   I didn't realize that it was already added to mshadow. Never mind: 
https://github.com/dmlc/mshadow/blame/master/mshadow/base.h#L672-L685
 



[GitHub] piiswrong commented on issue #8136: stable sum

2017-10-03 Thread git
piiswrong commented on issue #8136: stable sum
URL: https://github.com/apache/incubator-mxnet/pull/8136#issuecomment-333991002
 
 
   Which file/function are you talking about?
 



[GitHub] szha commented on issue #8136: stable sum

2017-10-03 Thread git
szha commented on issue #8136: stable sum
URL: https://github.com/apache/incubator-mxnet/pull/8136#issuecomment-333993695
 
 
   
https://github.com/apache/incubator-mxnet/blob/master/src/operator/tensor/broadcast_reduce_op_value.cc#L81-L88
 



[GitHub] yxchng commented on issue #8132: How to disable MXNET_CUDNN_AUTOTUNE_DEFAULT and bucketing log message without turning off MXNET_CUDNN_AUTOTUNE_DEFAULT?

2017-10-03 Thread git
yxchng commented on issue #8132: How to disable MXNET_CUDNN_AUTOTUNE_DEFAULT 
and bucketing log message without turning off MXNET_CUDNN_AUTOTUNE_DEFAULT?
URL: 
https://github.com/apache/incubator-mxnet/issues/8132#issuecomment-333791801
 
 
   I tried exporting the variable and using os.environ, but neither works. The 
code I am using is https://github.com/pangyupo/mxnet_mtcnn_face_detection
 



[GitHub] benqua commented on issue #8129: [scala] Module api: label (1,1,868,868) and prediction (1,868,868)should have the same length

2017-10-03 Thread git
benqua commented on issue #8129: [scala] Module api: label (1,1,868,868) and 
prediction (1,868,868)should have the same length
URL: 
https://github.com/apache/incubator-mxnet/issues/8129#issuecomment-333796027
 
 
   If that helps, the last layer of my network is:
   ```scala
   val out= Symbol.SoftmaxOutput()()(Map("data" -> conv10, "label" -> 
label, "multi_output" -> true))
   ```
   @javelinjs , @Ldpe2G , any idea?
 



[GitHub] yxchng opened a new issue #8132: How to disable MXNET_CUDNN_AUTOTUNE_DEFAULT log message without turning off MXNET_CUDNN_AUTOTUNE_DEFAULT?

2017-10-03 Thread git
yxchng opened a new issue #8132: How to disable MXNET_CUDNN_AUTOTUNE_DEFAULT 
log message without turning off MXNET_CUDNN_AUTOTUNE_DEFAULT?
URL: https://github.com/apache/incubator-mxnet/issues/8132
 
 
   
   
 



[GitHub] theSparta commented on issue #8126: Not able to train a neural network [XOR added]

2017-10-03 Thread git
theSparta commented on issue #8126: Not able to train a neural network [XOR 
added]
URL: 
https://github.com/apache/incubator-mxnet/issues/8126#issuecomment-333847990
 
 
   I am adding a very simple example here in which I tried to fit a neural 
network on the XOR function but unable to do so as well.
   ```C++
   #include <iostream>
   #include <map>
   #include <string>
   #include <vector>
   #include "mxnet-cpp/MxNetCpp.h"
   // Allow IDE to parse the types
   #include "../include/mxnet-cpp/op.h"

   using namespace std;
   using namespace mxnet::cpp;


   Symbol mlp(const vector<int> &layers, const vector<Symbol> &weights,
              const std::vector<Symbol> &biases, const string &inp_name)
   {
     auto x = Symbol::Variable(inp_name);

     vector<Symbol> outputs(layers.size());

     for (size_t i = 0; i < layers.size(); ++i)
     {
       string istr = to_string(i);
       Symbol fc = FullyConnected(
         i == 0 ? x : outputs[i-1],  // data
         weights[i],
         biases[i],
         layers[i]);
       outputs[i] = i == layers.size()-1 ? fc :
         Activation(string("act") + istr, fc, ActivationActType::kTanh);
     }

     return outputs.back();
   }

   int main(int argc, char** argv)
   {
     const int feature_size = 2;
     const vector<int> layers{8, 4, 1};
     const int batch_size = 4;
     const int max_epoch = 10;
     const float learning_rate = 0.001;
     const float weight_decay = 1e-2;

     auto ctx = Context::cpu();  // use Context::gpu(0) to train on GPU
     auto ctx_cpu = Context::cpu();

     vector<Symbol> weights(layers.size());
     vector<Symbol> biases(layers.size());

     for (size_t i = 0; i < layers.size(); ++i)
     {
       string istr = to_string(i);
       weights[i] = Symbol::Variable("w" + istr);
       biases[i] = Symbol::Variable("b" + istr);
     }

     auto Net = mlp(layers, weights, biases, "X");
     auto sym_label = Symbol::Variable("label");
     auto output = LogisticRegressionOutput(string("sigmoid"), Net, sym_label);

     map<string, NDArray> args_map;
     args_map["X"] = NDArray(Shape(batch_size, feature_size), ctx);
     args_map["label"] = NDArray(Shape(batch_size, 1), ctx);

     auto *exec = output.SimpleBind(ctx, args_map);
     output.InferArgsMap(ctx, &args_map, args_map);
     auto arg_names = output.ListArguments();

     Xavier xavier = Xavier(Xavier::gaussian, Xavier::avg);
     for (auto &arg : args_map)
     {
       xavier(arg.first, &arg.second);
     }

     Optimizer* opt = OptimizerRegistry::Find("adam");
     opt->SetParam("rescale_grad", 1.0 / batch_size)
        ->SetParam("lr", learning_rate)
        ->SetParam("wd", weight_decay);

     // XOR function
     mx_float* aptr_x = new mx_float[batch_size * feature_size];
     mx_float* aptr_y = new mx_float[batch_size];

     aptr_x[0] = 0.; aptr_x[1] = 0.; aptr_y[0] = 0;
     aptr_x[2] = 0.; aptr_x[3] = 1.; aptr_y[1] = 1;
     aptr_x[4] = 1.; aptr_x[5] = 0.; aptr_y[2] = 1;
     aptr_x[6] = 1.; aptr_x[7] = 1.; aptr_y[3] = 0;

     NDArray train_data = NDArray(Shape(batch_size, 2), ctx_cpu, false);
     NDArray train_label = NDArray(Shape(batch_size), ctx_cpu, false);
     train_data.SyncCopyFromCPU(aptr_x, batch_size * 2);
     train_label.SyncCopyFromCPU(aptr_y, batch_size);
     train_data.WaitToRead();
     train_label.WaitToRead();

     Accuracy acu_train;
     for (int ITER = 0; ITER < max_epoch; ++ITER)
     {
       acu_train.Reset();
       args_map["X"] = train_data.Copy(ctx);
       args_map["label"] = train_label.Copy(ctx);
       NDArray::WaitAll();

       exec->Forward(true);
       acu_train.Update(args_map["label"], exec->outputs[0]);

       if (ITER % 5000 == 0) {
         auto out = (exec->outputs[0]).Copy(ctx_cpu);
         auto labels = args_map["label"].Copy(ctx_cpu);
         NDArray::WaitAll();
         const mx_float *outs = out.GetData();
         const mx_float *lbs = labels.GetData();
         for (int i = 0; i < batch_size; i++)
           cout << lbs[i] << ":" << outs[i] << " ";
         cout << endl;
         LG << "ITER: " << ITER << " Train Accuracy: " << acu_train.Get();
       }
       exec->Backward();
       // Update parameters (skip the data and label arrays)
       for (size_t i = 0; i < arg_names.size(); ++i)
       {
         if (arg_names[i] == "X" || arg_names[i] == "label") continue;
         opt->Update(i, exec->arg_arrays[i], exec->grad_arrays[i]);
       }
     }

     delete exec;
     delete [] aptr_x;
     delete [] aptr_y;
     MXNotifyShutdown();
     return 0;
   }
   ```
   The output is (True_label : Predicted_label):
   ```
   0:0.95178 1:0.880215 1:0.944654 0:0.86154 
   [19:14:56] xor.cpp:114: ITER: 0 Train Accuracy: 0.5
   0:0.786497 1:1 1:0.799124 0:3.35246e-13 
   [19:14:57] xor.cpp:114: ITER: 5000 Train Accuracy: 0.5
   0:0.786137 1:1 1:0.800972 0:1.01632e-21 
   ```

[GitHub] bhavinthaker commented on issue #8105: Proposal: PR Template

2017-10-03 Thread git
bhavinthaker commented on issue #8105: Proposal: PR Template
URL: 
https://github.com/apache/incubator-mxnet/issues/8105#issuecomment-333874115
 
 
   Looks good to me. Thanks for this suggestion.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Ldpe2G commented on issue #8129: [scala] Module api: label (1,1,868,868) and prediction (1,868,868)should have the same length

2017-10-03 Thread git
Ldpe2G commented on issue #8129: [scala] Module api: label (1,1,868,868) and 
prediction (1,868,868)should have the same length
URL: 
https://github.com/apache/incubator-mxnet/issues/8129#issuecomment-333860436
 
 
   @benqua It seems like the error occurs when calling the `update` method of 
the Accuracy metric. Maybe you should first check the output shape of your 
network by calling the `inferShape` method of the symbol.
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] benqua commented on issue #8129: [scala] Module api: label (1,1,868,868) and prediction (1,868,868)should have the same length

2017-10-03 Thread git
benqua commented on issue #8129: [scala] Module api: label (1,1,868,868) and 
prediction (1,868,868)should have the same length
URL: 
https://github.com/apache/incubator-mxnet/issues/8129#issuecomment-333862739
 
 
   @Ldpe2G all right, I am going to check that (pretty sure it was 
(1,1,868,868), but I will cross-check once I arrive home).
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] altosaar commented on issue #8130: autograd.backward() segfaults: how to get gradients with respect to a subset of variables in mxnet?

2017-10-03 Thread git
altosaar commented on issue #8130: autograd.backward() segfaults: how to get 
gradients with respect to a subset of variables in mxnet? 
URL: 
https://github.com/apache/incubator-mxnet/issues/8130#issuecomment-333903906
 
 
   Thanks @piiswrong !
   
   I installed the newest mxnet (`pip install --pre mxnet`).
   
   Here is a reproducible example based on what you suggested:
   
   ```
   In [49]: from mxnet import nd, autograd
   
   In [50]: x = nd.array([1.])
   
   In [51]: z = nd.array([1.])
   
   In [52]: x.attach_grad()
   
   In [53]: z.attach_grad()
   
   In [54]: with autograd.record():
   ...: first = nd.square(x)
   ...: second = nd.square(z)
   ...: y = first + second
   ...: autograd.grad(y, [x], retain_graph=True)
   ...: autograd.grad(y, [z])
   ...:
   
    In [56]: x.grad
    Out[56]:
    
    [ 0.]
    <NDArray 1 @cpu(0)>
    
    In [57]: z.grad
    Out[57]:
    
    [ 0.]
    <NDArray 1 @cpu(0)>
   ```
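   For reference, the gradients this example should produce can be checked
   without mxnet at all; here is a minimal finite-difference sketch in plain
   Python (the `grad_fd` helper is illustrative, not part of any API):

```python
def grad_fd(f, x, eps=1e-6):
    # Central finite difference: (f(x+eps) - f(x-eps)) / (2*eps)
    return (f(x + eps) - f(x - eps)) / (2 * eps)

# y = x**2 + z**2, so dy/dx = 2x and dy/dz = 2z
x, z = 1.0, 1.0
dy_dx = grad_fd(lambda v: v**2 + z**2, x)
dy_dz = grad_fd(lambda v: x**2 + v**2, z)
print(dy_dx, dy_dz)  # both should be close to 2.0, not the 0.0 reported above
```

   So the `[ 0.]` gradients in the transcript do look like a bug rather than
   the mathematically expected result.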
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] aseyboldt opened a new issue #8133: Infer_shape_partial for rank 0 arrays

2017-10-03 Thread git
aseyboldt opened a new issue #8133: Infer_shape_partial for rank 0 arrays
URL: https://github.com/apache/incubator-mxnet/issues/8133
 
 
   In the python interface of mxnet it seems to be impossible to distinguish 
between an array of unknown shape and an array with known shape of `()`:
   ```python
   a = mx.sym.var('a')
   b = mx.sym.var('b')
   
   (a + b).infer_shape_partial(b=())
   ([(), ()], [()], [])
   ```
   In this case `b` is known to be a scalar, while `a`'s shape is unknown.
   Shouldn't it return something like `([None, ()], [None], [])`?
   
   If `a` is set to be a scalar, it returns an invalid result:
   ```python
   a = mx.sym.var('a')
   b = mx.sym.var('b')
   
   (a + b).infer_shape_partial(a=(), b=(1, 2))
   ([(1, 2), (1, 2)], [(1, 2)], [])
   ```
   It identifies the shape of `a` as `(1, 2)`, even though we set it to a 
scalar.
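   
   The distinction being asked for can be modeled with a `None` sentinel for
   "unknown shape", keeping `()` free to mean a known scalar. A hedged
   plain-Python sketch of the desired semantics (illustrative names only, not
   mxnet's actual shape-inference implementation):

```python
UNKNOWN = None  # sentinel for "shape not known"; () means a known scalar

def broadcast_shapes(a, b):
    # Standard right-aligned broadcasting of two fully known shapes.
    out = []
    for i in range(max(len(a), len(b))):
        da = a[-1 - i] if i < len(a) else 1
        db = b[-1 - i] if i < len(b) else 1
        if da != db and 1 not in (da, db):
            raise ValueError("incompatible shapes")
        out.append(max(da, db))
    return tuple(reversed(out))

def infer_add_shapes(a, b):
    """Partial shape inference for a + b, distinguishing UNKNOWN from ()."""
    if a is UNKNOWN or b is UNKNOWN:
        # A known shape on one side (even a scalar) says nothing definite
        # about the unknown side, nor about the broadcast output.
        return a, b, UNKNOWN
    return a, b, broadcast_shapes(a, b)

print(infer_add_shapes(UNKNOWN, ()))  # (None, (), None)
print(infer_add_shapes((), (1, 2)))   # ((), (1, 2), (1, 2))
```

   Under this convention the two cases in the report would come back as
   `([None, ()], [None], [])` and `([(), (1, 2)], [(1, 2)], [])` respectively.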
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on issue #8136: stable sum

2017-10-03 Thread git
szha commented on issue #8136: stable sum
URL: https://github.com/apache/incubator-mxnet/pull/8136#issuecomment-334013658
 
 
   Looks like square sum needs updating too:
   
   
https://github.com/apache/incubator-mxnet/blob/master/src/operator/tensor/square_sum-inl.h#L127-L130
   
   
https://github.com/apache/incubator-mxnet/blob/master/src/operator/tensor/square_sum-inl.h#L148-L151
   
   
https://github.com/apache/incubator-mxnet/blob/master/src/operator/tensor/square_sum-inl.h#L171-L174
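   
   For background: one standard technique for a numerically stable sum
   (whether or not it is the exact one this PR uses) is compensated (Kahan)
   summation, which carries a running error term so tiny addends are not lost
   against a large accumulator. A minimal plain-Python sketch, purely
   illustrative since the kernels above are C++:

```python
def kahan_sum(values):
    total = 0.0
    comp = 0.0  # running compensation for lost low-order bits
    for v in values:
        y = v - comp
        t = total + y
        comp = (t - total) - y
        total = t
    return total

# 1.0 followed by a million tiny values: a naive left-fold sum drops
# every tiny addend, while the compensated sum keeps their contribution.
vals = [1.0] + [1e-16] * 10**6
naive = sum(vals)        # stays exactly 1.0
stable = kahan_sum(vals) # close to 1.0 + 1e-10
```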
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] eric-haibin-lin commented on issue #7893: Add barriers in kvstore init

2017-10-03 Thread git
eric-haibin-lin commented on issue #7893: Add barriers in kvstore init
URL: https://github.com/apache/incubator-mxnet/pull/7893#issuecomment-334014237
 
 
   The test failure seems irrelevant. Do you mind syncing up with master again 
to see if it passes? 
   Regarding fp16, yes, I think we need that fix. The current static_casting 
approach is a bug. Why is an extra copy required? If you train with fp16, are 
both the weight and the gradient in fp16? Or are you just trying to minimize 
the network traffic? BTW @rahul003 is working on gradient compression in 
parallel.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] eric-haibin-lin opened a new pull request #8138: add storage type logging to graph executor

2017-10-03 Thread git
eric-haibin-lin opened a new pull request #8138: add storage type logging to 
graph executor
URL: https://github.com/apache/incubator-mxnet/pull/8138
 
 
   This is mentioned in the tutorial PR #7921 and should be merged before that. 
   The log messages will be printed if the env var 
MXNET_INFER_STORAGE_TYPE_VERBOSE_LOGGING is set to 1:
   
   ```
   >>> os.environ['MXNET_INFER_STORAGE_TYPE_VERBOSE_LOGGING'] = "1"
   >>> print(os.environ['MXNET_INFER_STORAGE_TYPE_VERBOSE_LOGGING'])
   1
   >>> # Data in csr format
   ... data = mx.sym.var('data', stype='csr', shape=(32, 1))
   >>> # Weight in row_sparse format
   ... weight = mx.sym.var('weight', stype='row_sparse', shape=(1, 2))
   >>> bias = mx.symbol.Variable("bias", shape=(2,))
   >>> dot = mx.symbol.sparse.dot(data, weight)
   >>> pred = mx.symbol.broadcast_add(dot, bias)
   >>> y = mx.symbol.Variable("label")
   >>> output = mx.symbol.SoftmaxOutput(data=pred, label=y, name="output")
   >>> executor = output.simple_bind(ctx=mx.cpu())
   [00:42:45] src/executor/../common/utils.h:130: node 0 var
   [00:42:45] src/executor/../common/utils.h:130: node 1 var
   [00:42:45] src/executor/../common/utils.h:132: node 2 dot: fcompute_ex
   [00:42:45] src/executor/../common/utils.h:136:  input 0: csr
   [00:42:45] src/executor/../common/utils.h:136:  input 1: row_sparse
   [00:42:45] src/executor/../common/utils.h:141:  output 2: default
   [00:42:45] src/executor/../common/utils.h:130: node 3 var
   [00:42:45] src/executor/../common/utils.h:132: node 4 broadcast_add: fcompute
   [00:42:45] src/executor/../common/utils.h:136:  input 2: default
   [00:42:45] src/executor/../common/utils.h:136:  input 3: default
   [00:42:45] src/executor/../common/utils.h:141:  output 4: default
   [00:42:45] src/executor/../common/utils.h:130: node 5 var
   [00:42:45] src/executor/../common/utils.h:132: node 6 SoftmaxOutput: fcompute
   [00:42:45] src/executor/../common/utils.h:136:  input 4: default
   [00:42:45] src/executor/../common/utils.h:136:  input 5: default
   [00:42:45] src/executor/../common/utils.h:141:  output 6: default
   [00:42:45] src/executor/../common/utils.h:132: node 7 
_backward_SoftmaxOutput: fcompute
   [00:42:45] src/executor/../common/utils.h:136:  input 5: default
   [00:42:45] src/executor/../common/utils.h:136:  input 6: default
   [00:42:45] src/executor/../common/utils.h:141:  output 7: default
   [00:42:45] src/executor/../common/utils.h:141:  output 8: default
   [00:42:45] src/executor/../common/utils.h:132: node 8 
_backward_broadcast_add: fcompute
   [00:42:45] src/executor/../common/utils.h:136:  input 7: default
   [00:42:45] src/executor/../common/utils.h:141:  output 9: default
   [00:42:45] src/executor/../common/utils.h:141:  output 10: default
   [00:42:45] src/executor/../common/utils.h:132: node 9 _backward_dot: 
fcompute_ex
   [00:42:45] src/executor/../common/utils.h:136:  input 9: default
   [00:42:45] src/executor/../common/utils.h:136:  input 0: csr
   [00:42:45] src/executor/../common/utils.h:136:  input 1: row_sparse
   [00:42:45] src/executor/../common/utils.h:141:  output 11: default
   [00:42:45] src/executor/../common/utils.h:141:  output 12: row_sparse
   ```
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ZiyueHuang commented on issue #8130: autograd.backward() segfaults: how to get gradients with respect to a subset of variables in mxnet?

2017-10-03 Thread git
ZiyueHuang commented on issue #8130: autograd.backward() segfaults: how to get 
gradients with respect to a subset of variables in mxnet? 
URL: 
https://github.com/apache/incubator-mxnet/issues/8130#issuecomment-334040825
 
 
   Instead of using two `autograd.grad` calls, please try
   ```
   print autograd.grad(y, [x, z])
   ```
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] caiqi opened a new issue #8139: mxnet ssd training speed slow down after some batches

2017-10-03 Thread git
caiqi opened a new issue #8139: mxnet ssd training speed slow down after some 
batches
URL: https://github.com/apache/incubator-mxnet/issues/8139
 
 
   ## Environment info
   Operating System:
   Windows
   
   Package used (Python/R/Scala/Julia):
   Python
   
   MXNet version:
   0.11.0
   
   Or if installed from source:
   install with pip
   
   When training the SSD detection model using multiple GPUs, the training 
speed slows down after a long time. Is there any way to solve the problem? 
Thanks.
   
   > INFO:root:Epoch[0] Batch [20]   Speed: 74.29 samples/sec
   > INFO:root:Epoch[0] Batch [40]   Speed: 76.35 samples/sec
   > INFO:root:Epoch[0] Batch [60]   Speed: 75.31 samples/sec
   > INFO:root:Epoch[0] Batch [80]   Speed: 74.59 samples/sec
   > INFO:root:Epoch[0] Batch [100]  Speed: 75.76 samples/sec
   > INFO:root:Epoch[0] Batch [120]  Speed: 77.23 samples/sec
   > INFO:root:Epoch[0] Batch [140]  Speed: 74.44 samples/sec
   > INFO:root:Epoch[0] Batch [160]  Speed: 74.67 samples/sec
   > INFO:root:Epoch[0] Batch [180]  Speed: 75.40 samples/sec
   > INFO:root:Epoch[0] Batch [200]  Speed: 76.74 samples/sec
   > INFO:root:Epoch[0] Batch [220]  Speed: 75.12 samples/sec
   > INFO:root:Epoch[0] Batch [240]  Speed: 76.70 samples/sec
   > INFO:root:Epoch[0] Batch [260]  Speed: 74.27 samples/sec
   > INFO:root:Epoch[0] Batch [280]  Speed: 75.89 samples/sec
   > INFO:root:Epoch[0] Batch [300]  Speed: 75.57 samples/sec
   > INFO:root:Epoch[0] Batch [320]  Speed: 76.34 samples/sec
   > INFO:root:Epoch[0] Batch [340]  Speed: 75.85 samples/sec
   > INFO:root:Epoch[0] Batch [360]  Speed: 76.27 samples/sec
   > INFO:root:Epoch[0] Batch [380]  Speed: 76.11 samples/sec
   > INFO:root:Epoch[0] Batch [400]  Speed: 76.88 samples/sec
   > INFO:root:Epoch[0] Batch [420]  Speed: 75.87 samples/sec
   > INFO:root:Epoch[0] Batch [440]  Speed: 75.08 samples/sec
   > INFO:root:Epoch[0] Batch [460]  Speed: 76.34 samples/sec
   > INFO:root:Epoch[0] Batch [480]  Speed: 76.06 samples/sec
   > INFO:root:Epoch[0] Batch [500]  Speed: 73.84 samples/sec
   > INFO:root:Epoch[0] Batch [520]  Speed: 69.82 samples/sec
   > INFO:root:Epoch[0] Batch [540]  Speed: 65.33 samples/sec
   > INFO:root:Epoch[0] Batch [560]  Speed: 63.28 samples/sec
   > INFO:root:Epoch[0] Batch [580]  Speed: 59.28 samples/sec
   > INFO:root:Epoch[0] Batch [600]  Speed: 54.57 samples/sec
   > INFO:root:Epoch[0] Batch [620]  Speed: 52.37 samples/sec
   > INFO:root:Epoch[0] Batch [640]  Speed: 51.08 samples/sec
   > INFO:root:Epoch[0] Batch [660]  Speed: 50.30 samples/sec
   > INFO:root:Epoch[0] Batch [680]  Speed: 49.22 samples/sec
   > INFO:root:Epoch[0] Batch [700]  Speed: 49.70 samples/sec
   > INFO:root:Epoch[0] Batch [720]  Speed: 50.45 samples/sec
   > INFO:root:Epoch[0] Batch [740]  Speed: 52.21 samples/sec
   > INFO:root:Epoch[0] Batch [760]  Speed: 54.90 samples/sec
   > INFO:root:Epoch[0] Batch [780]  Speed: 58.65 samples/sec
   > INFO:root:Epoch[0] Batch [800]  Speed: 60.69 samples/sec
   > INFO:root:Epoch[0] Batch [820]  Speed: 66.90 samples/sec
   > INFO:root:Epoch[0] Batch [840]  Speed: 68.57 samples/sec
   > INFO:root:Epoch[0] Batch [860]  Speed: 70.10 samples/sec
   > INFO:root:Epoch[0] Batch [880]  Speed: 70.06 samples/sec
   > INFO:root:Epoch[0] Batch [900]  Speed: 71.81 samples/sec
   > INFO:root:Epoch[0] Batch [920]  Speed: 73.46 samples/sec
   > INFO:root:Epoch[0] Batch [940]  Speed: 72.55 samples/sec
   > INFO:root:Epoch[0] Batch [960]  Speed: 71.95 samples/sec
   > INFO:root:Epoch[0] Batch [980]  Speed: 72.64 samples/sec
   > INFO:root:Epoch[0] Batch [1000] Speed: 72.28 samples/sec
   > INFO:root:Epoch[0] Batch [1020] Speed: 72.63 samples/sec
   > INFO:root:Epoch[0] Batch [1040] Speed: 73.61 samples/sec
   > INFO:root:Epoch[0] Batch [1060] Speed: 74.30 samples/sec
   > INFO:root:Epoch[0] Batch [1080] Speed: 73.47 samples/sec
   > INFO:root:Epoch[0] Batch [1100] Speed: 73.09 samples/sec
   > INFO:root:Epoch[0] Batch [1120] Speed: 72.78 samples/sec
   > INFO:root:Epoch[0] Batch [1140] Speed: 73.37 samples/sec
   > INFO:root:Epoch[0] Batch [1160] Speed: 73.40 samples/sec
   > INFO:root:Epoch[0] Batch [1180] Speed: 73.53 samples/sec
   > INFO:root:Epoch[0] Batch [1200] Speed: 73.48 samples/sec
   > INFO:root:Epoch[0] Batch [1220] Speed: 71.79 samples/sec
   > INFO:root:Epoch[0] Batch [1240] Speed: 72.09 samples/sec
   > INFO:root:Epoch[0] Batch [1260] Speed: 69.93 samples/sec
   > INFO:root:Epoch[0] Batch [1280] Speed: 64.27 samples/sec
   > INFO:root:Epoch[0] Batch [1300] Speed: 59.91 samples/sec
   > INFO:root:Epoch[0] Batch [1320] Speed: 54.84 samples/sec
   > INFO:root:Epoch[0] Batch [1340] Speed: 51.23 samples/sec
   > INFO:root:Epoch[0] Batch [1360] Speed: 48.56 samples/sec
   > INFO:root:Epoch[0] Batch [1380] Speed: 44.69 samples/sec
   > INFO:root:Epoch[0] Batch [1400] Speed: 41.22 samples/sec
   > INFO:root:Epoch[0] Batch [1420] Speed: 38.33 samples/sec
   > INFO:root:Epoch[0] Batch [1440] Speed: 35.85 samples/sec
   > INFO:root:Epoch[0] 
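   
   The dip-and-recovery pattern in a log like the one above can be spotted
   programmatically. A minimal sketch (the `throughput_dips` helper is
   hypothetical, and the log format is assumed from this excerpt, not from any
   standard tool):

```python
import re

SPEED_RE = re.compile(r"Speed:\s*([\d.]+)\s*samples/sec")

def throughput_dips(lines, drop_frac=0.2):
    """Return (line_index, speed) pairs where throughput falls more than
    drop_frac below the running peak seen so far."""
    peak = 0.0
    dips = []
    for i, line in enumerate(lines):
        m = SPEED_RE.search(line)
        if not m:
            continue
        speed = float(m.group(1))
        peak = max(peak, speed)
        if speed < (1.0 - drop_frac) * peak:
            dips.append((i, speed))
    return dips

log = [
    "INFO:root:Epoch[0] Batch [500]  Speed: 73.84 samples/sec",
    "INFO:root:Epoch[0] Batch [600]  Speed: 54.57 samples/sec",
    "INFO:root:Epoch[0] Batch [700]  Speed: 49.70 samples/sec",
]
print(throughput_dips(log))  # [(1, 54.57), (2, 49.7)]
```

   Plotting or flagging these dips against batch indices makes it easier to
   tell a periodic (e.g. thermal) slowdown from a monotonic one.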

[GitHub] solin319 commented on issue #7893: Add barriers in kvstore init

2017-10-03 Thread git
solin319 commented on issue #7893: Add barriers in kvstore init
URL: https://github.com/apache/incubator-mxnet/pull/7893#issuecomment-334041868
 
 
   Sorry, I am on holiday these days. I didn't bring my computer back to my 
hometown. When I go back, I will sync the code and add a test file as soon as 
possible.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] eric-haibin-lin commented on issue #7893: Add barriers in kvstore init

2017-10-03 Thread git
eric-haibin-lin commented on issue #7893: Add barriers in kvstore init
URL: https://github.com/apache/incubator-mxnet/pull/7893#issuecomment-334042834
 
 
   @solin319 no worries. Happy mid-autumn festival! 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha opened a new pull request #8140: use fma in fully connected.

2017-10-03 Thread git
szha opened a new pull request #8140: use fma in fully connected.
URL: https://github.com/apache/incubator-mxnet/pull/8140
 
 
   When using bias, broadcast the bias to the output first and then use kAddTo, 
which in turn keeps beta in GEMM equal to 1.
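   
   In GEMM terms (`C = alpha*A*B + beta*C`), writing the broadcast bias into
   the output buffer first and then accumulating the matrix product on top is
   what keeps beta at 1. A plain-Python sketch of that order of operations
   (illustrative only, not the actual C++ kernel):

```python
def fc_forward(x, w, b):
    """y = x @ w.T + b, computed as: broadcast b into y, then accumulate
    the matrix product into y (i.e. GEMM with beta = 1, kAddTo semantics)."""
    n, k = len(x), len(x[0])
    m = len(w)
    # Step 1: broadcast the bias row into every row of the output buffer.
    y = [[b[j] for j in range(m)] for _ in range(n)]
    # Step 2: accumulate x @ w.T on top of the pre-filled buffer.
    for i in range(n):
        for j in range(m):
            for t in range(k):
                y[i][j] += x[i][t] * w[j][t]
    return y

out = fc_forward([[1.0, 2.0]], [[3.0, 4.0], [5.0, 6.0]], [0.5, -0.5])
# [1*3 + 2*4 + 0.5, 1*5 + 2*6 - 0.5] = [11.5, 16.5]
```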
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] zhreshold commented on issue #8139: mxnet ssd training speed slow down after some batches

2017-10-03 Thread git
zhreshold commented on issue #8139: mxnet ssd training speed slow down after 
some batches
URL: 
https://github.com/apache/incubator-mxnet/issues/8139#issuecomment-334045504
 
 
   Seems like thermal throttling. Are you using a workstation with multiple 
GPUs?
   If so, you need to address the heating problem.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] caiqi commented on issue #8139: mxnet ssd training speed slow down after some batches

2017-10-03 Thread git
caiqi commented on issue #8139: mxnet ssd training speed slow down after some 
batches
URL: 
https://github.com/apache/incubator-mxnet/issues/8139#issuecomment-334047247
 
 
   I'm using multiple GPUs on a server and it works well for other programs. 
When I train SSD on a single GPU, there is no problem; the speed is constant at 
around 39 samples/sec. So I think it may not be due to the heating problem.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] caiqi commented on issue #8139: mxnet ssd training speed slow down after some batches

2017-10-03 Thread git
caiqi commented on issue #8139: mxnet ssd training speed slow down after some 
batches
URL: 
https://github.com/apache/incubator-mxnet/issues/8139#issuecomment-334047544
 
 
   I have read FAQ 2 on this page: 
https://github.com/msracver/Flow-Guided-Feature-Aggregation. I originally 
thought the problem was specific to that repo. Is the problem common to MXNet 
on Windows? Thanks.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] eric-haibin-lin commented on issue #8062: CSVIter and LibSVMIter not returning correct number of batches per epoch

2017-10-03 Thread git
eric-haibin-lin commented on issue #8062: CSVIter and LibSVMIter not returning 
correct number of batches per epoch
URL: 
https://github.com/apache/incubator-mxnet/issues/8062#issuecomment-334051905
 
 
   The number of batches will be correct if reset() is moved to the end of the 
epoch:
   ```
    for epoch in range(10):
        nbatch = 0
        for batch in iter(data_train):
            nbatch += 1
        assert(nbatch == 100), nbatch
        data_train.reset()
   ```
   I've updated the documentation for this in #8111
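   
   The recommended pattern generalizes: consume the iterator fully, then call
   reset() at the end of the epoch. A plain-Python analogue of the loop (the
   `ToyIter` class is a hypothetical stand-in, not an mxnet class):

```python
class ToyIter:
    """Minimal stand-in for a data iterator with an explicit reset()."""
    def __init__(self, num_batches):
        self.num_batches = num_batches
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= self.num_batches:
            raise StopIteration
        self._pos += 1
        return self._pos - 1  # a stand-in "batch"

    def reset(self):
        self._pos = 0

data_train = ToyIter(100)
counts = []
for epoch in range(3):
    nbatch = sum(1 for _ in data_train)  # consume the whole epoch
    counts.append(nbatch)
    data_train.reset()  # reset at the END of the epoch
print(counts)  # [100, 100, 100]
```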
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] eric-haibin-lin closed issue #8062: CSVIter and LibSVMIter not returning correct number of batches per epoch

2017-10-03 Thread git
eric-haibin-lin closed issue #8062: CSVIter and LibSVMIter not returning 
correct number of batches per epoch
URL: https://github.com/apache/incubator-mxnet/issues/8062
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services