[GitHub] benqua commented on issue #8129: [scala] Module api: label (1,1,868,868) and prediction (1,868,868) should have the same length
benqua commented on issue #8129: [scala] Module api: label (1,1,868,868) and prediction (1,868,868) should have the same length URL: https://github.com/apache/incubator-mxnet/issues/8129#issuecomment-334054179 Yes, I am conducting a segmentation task. My label has only one channel, with values of either 1 or 0 depending on the class (providedLabel: softmax_label -> (1,1,868,868)). The output of outputShapes on the module is OutShapes (module): ArrayBuffer((softmaxoutput0_output,(1,2,868,868))). Do you mean I shouldn't use the multi_output parameter? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] eric-haibin-lin opened a new pull request #8141: update sparse LR example
eric-haibin-lin opened a new pull request #8141: update sparse LR example URL: https://github.com/apache/incubator-mxnet/pull/8141
- fix wrong metric name `log_loss` -> `nll_loss`
- fix wrong README instructions for distributed training
- add weighted loss for the output layer
- add a list of sparse optimizers as argument choices
@szha
[GitHub] Ldpe2G commented on issue #8129: [scala] Module api: label (1,1,868,868) and prediction (1,868,868) should have the same length
Ldpe2G commented on issue #8129: [scala] Module api: label (1,1,868,868) and prediction (1,868,868) should have the same length URL: https://github.com/apache/incubator-mxnet/issues/8129#issuecomment-334057398 No, you should use it. And I suggest you try the intermediate-level API: https://github.com/apache/incubator-mxnet/blob/8a4221bca2ddd4fa05840f7951a7216775021237/scala-package/examples/src/main/scala/ml/dmlc/mxnetexamples/module/MnistMlp.scala#L43 Since the output shape is correct, it is strange to get such an error message.
[GitHub] CodingCat commented on issue #8128: Adding code owners
CodingCat commented on issue #8128: Adding code owners URL: https://github.com/apache/incubator-mxnet/pull/8128#issuecomment-333957516 @gautamkmr I am totally fine with and supportive of this, as long as it is not something like "files a, b, c have to be signed off by Mr. xyz before merging the changes"; it follows the Apache way, to my understanding.
[GitHub] aseyboldt commented on issue #8133: Infer_shape_partial for rank 0 arrays
aseyboldt commented on issue #8133: Infer_shape_partial for rank 0 arrays URL: https://github.com/apache/incubator-mxnet/issues/8133#issuecomment-333958354 `ndarray` does allow scalars:

```python
>>> a = mx.nd.ones(())
>>> a.shape
()
>>> a.size
1
>>> a.asnumpy()
array(1.0, dtype=float32)
>>> mx.nd.ones((1,)).asnumpy()
array([ 1.], dtype=float32)
```

This seems to lead to the invalid memory access in the example above (which, by the way, I think is a serious bug). I feel bad about criticising someone else's project after using it for only a couple of days, but I have to admit it is pretty much beyond me why you would design a tensor library without the concept of a scalar. :sweat_smile:
[GitHub] CodingCat commented on issue #8128: Adding code owners
CodingCat commented on issue #8128: Adding code owners URL: https://github.com/apache/incubator-mxnet/pull/8128#issuecomment-333957516 @gautamkmr I am totally fine with and supportive of this, as long as it is not something like "files a, b, c have to be signed off by Mr. xyz before merging the changes"; it follows the Apache way, to my understanding. I am just bringing this up to everyone for awareness, since we do not want to fall into arguments too frequently.
[incubator-mxnet] branch szha-patch-1 created (now 585b0b8)
This is an automated email from the ASF dual-hosted git repository. zhasheng pushed a change to branch szha-patch-1 in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. at 585b0b8 Update nn.md No new revisions were added by this update. -- To stop receiving notification emails like this one, please contact ['"comm...@mxnet.apache.org"'].
[GitHub] szha opened a new pull request #8134: Update nn.md
szha opened a new pull request #8134: Update nn.md URL: https://github.com/apache/incubator-mxnet/pull/8134
[incubator-mxnet] branch master updated: Updating code owners (#8128)
This is an automated email from the ASF dual-hosted git repository. jxie pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git

The following commit(s) were added to refs/heads/master by this push:
     new 236d1c2  Updating code owners (#8128)
236d1c2 is described below

commit 236d1c2dac8ab13f9d99c2af7c4febc663e3940c
Author: Gautam Kumar
AuthorDate: Tue Oct 3 13:38:35 2017 -0700

    Updating code owners (#8128)

    In order to make the master branch protected, all the committers should be part of code owners.
---
 CODEOWNERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/CODEOWNERS b/CODEOWNERS
index 26fcf35..57b4ec3 100644
--- a/CODEOWNERS
+++ b/CODEOWNERS
@@ -1,7 +1,7 @@
 # Owners of Apache MXNet
 # Global owners
-* @piiswrong @mli
+* @apache/mxnet-committers
 # Owners of language bindings
 R-package/* @thirdwing
[GitHub] mli commented on issue #7995: Problem in acc metric
mli commented on issue #7995: Problem in acc metric URL: https://github.com/apache/incubator-mxnet/pull/7995#issuecomment-333984572 Can we just use ndarray to compute it instead of converting to numpy? If there is a large number of classes, say 10K, the conversion could be problematic. A code sample:

```python
def accuracy(output, label):
    # both output and label are ndarrays
    return (output.argmax(axis=1) == label).sum().asscalar()
```
[GitHub] aseyboldt commented on issue #8133: Infer_shape_partial for rank 0 arrays
aseyboldt commented on issue #8133: Infer_shape_partial for rank 0 arrays URL: https://github.com/apache/incubator-mxnet/issues/8133#issuecomment-333958354 That explains a lot :-) However, `ndarray` does allow scalars:

```python
>>> a = mx.nd.ones(())
>>> a.shape
()
>>> a.size
1
>>> a.asnumpy()
array(1.0, dtype=float32)
>>> mx.nd.ones((1,)).asnumpy()
array([ 1.], dtype=float32)
```

This seems to lead to the invalid memory access in the example above (which, by the way, I think is a serious bug). I feel bad about criticising someone else's project after using it for only a couple of days, but I have to admit that I'm kind of surprised that you would design a tensor library without the concept of a scalar. :sweat_smile:
[GitHub] anirudh2290 commented on a change in pull request #8020: Get bz2 data fix
anirudh2290 commented on a change in pull request #8020: Get bz2 data fix URL: https://github.com/apache/incubator-mxnet/pull/8020#discussion_r142521336 File path: python/mxnet/test_utils.py, @@ -1411,8 +1411,29 @@

```python
def get_bz2_data(data_dir, data_name, url, data_origin_name):
    """Download and extract bz2 data.

    Parameters
    ----------
    data_dir : str
        Absolute or relative path of the directory in which to store the bz2 files
    data_name : str
        Name of the output file into which the bz2 contents will be extracted
    url : str
        URL to download the data from
    data_origin_name : str
        Name of the downloaded bz2 file

    Examples
    --------
    >>> get_bz2_data("data_dir", "kdda.t",
    ...              "https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/kdda.t.bz2",
    ...              "kdda.t.bz2")
    """
    download(url, dirname=data_dir, overwrite=False)
    cwd = os.path.abspath(os.getcwd())
    os.chdir(data_dir)
```

Review comment: Removed chdir and used paths.
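As a rough illustration of the path-based approach the review settled on, here is a minimal sketch (the helper name `extract_bz2` is hypothetical, not the actual test_utils code, and the real function also performs the download step first):

```python
import bz2
import os
import shutil

def extract_bz2(data_dir, data_name, data_origin_name):
    """Decompress data_dir/data_origin_name into data_dir/data_name
    using explicit paths, so no os.chdir is needed."""
    archive_path = os.path.join(data_dir, data_origin_name)
    out_path = os.path.join(data_dir, data_name)
    with bz2.BZ2File(archive_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)  # stream decompressed bytes to disk
    return out_path
```

Working with absolute joins instead of chdir keeps the function safe to call concurrently and leaves the caller's working directory untouched.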
[GitHub] piiswrong opened a new pull request #8136: stable sum
piiswrong opened a new pull request #8136: stable sum URL: https://github.com/apache/incubator-mxnet/pull/8136
[GitHub] szha commented on issue #8136: stable sum
szha commented on issue #8136: stable sum URL: https://github.com/apache/incubator-mxnet/pull/8136#issuecomment-333983441 Do you plan on using Kahan's summation for regular sum too? Our mean uses that as the reduce function.
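For context, the compensated summation being discussed (Kahan's algorithm, which the "stable sum" work draws on) can be sketched in a few lines of plain Python:

```python
def kahan_sum(values):
    """Compensated (Kahan) summation: keeps a running correction term
    so low-order bits lost in each addition are fed back on the next step."""
    total = 0.0
    comp = 0.0  # compensation for lost low-order bits
    for v in values:
        y = v - comp            # apply the previous step's correction
        t = total + y
        comp = (t - total) - y  # what was rounded away in this addition
        total = t
    return total

# Summing many small values whose naive float sum drifts:
result = kahan_sum([0.1] * 10)  # stays much closer to 1.0 than a naive loop
```

This keeps the accumulated rounding error bounded by a small constant times machine epsilon, instead of growing with the number of terms.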
[GitHub] piiswrong closed pull request #8128: Adding code owners
piiswrong closed pull request #8128: Adding code owners URL: https://github.com/apache/incubator-mxnet/pull/8128
[GitHub] piiswrong opened a new pull request #8135: Stable sum
piiswrong opened a new pull request #8135: Stable sum URL: https://github.com/apache/incubator-mxnet/pull/8135
[GitHub] piiswrong closed pull request #8135: Stable sum
piiswrong closed pull request #8135: Stable sum URL: https://github.com/apache/incubator-mxnet/pull/8135
[incubator-mxnet] branch master updated: Update loss.md (#8131)
This is an automated email from the ASF dual-hosted git repository. zhasheng pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git

The following commit(s) were added to refs/heads/master by this push:
     new 57d59ac  Update loss.md (#8131)
57d59ac is described below

commit 57d59ac008598bd1e7a13f4a4e7ae7a7b3da41f7
Author: Sheng Zha
AuthorDate: Tue Oct 3 11:23:41 2017 -0700

    Update loss.md (#8131)
---
 docs/api/python/gluon/loss.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/api/python/gluon/loss.md b/docs/api/python/gluon/loss.md
index 5c27ab3..347eb49 100644
--- a/docs/api/python/gluon/loss.md
+++ b/docs/api/python/gluon/loss.md
@@ -17,6 +17,7 @@ This package includes several commonly used loss functions in neural networks. L2Loss L1Loss SoftmaxCrossEntropyLoss +SigmoidBinaryCrossEntropyLoss KLDivLoss CTCLoss ```
[incubator-mxnet] branch szha-patch-1 deleted (was c7550b0)
This is an automated email from the ASF dual-hosted git repository. zhasheng pushed a change to branch szha-patch-1 in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. was c7550b0 Update loss.md The revisions that were on this branch are still contained in other references; therefore, this change does not discard any commits from the repository.
[GitHub] szha closed pull request #8131: Update loss.md
szha closed pull request #8131: Update loss.md URL: https://github.com/apache/incubator-mxnet/pull/8131
[GitHub] szha commented on issue #8133: Infer_shape_partial for rank 0 arrays
szha commented on issue #8133: Infer_shape_partial for rank 0 arrays URL: https://github.com/apache/incubator-mxnet/issues/8133#issuecomment-333935130 I don't think we have the concept of scalar in ndarray or symbol. Shape of `()` doesn't mean it's a scalar; it means its shape is unknown and needs to be inferred.
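For contrast with the shape-inference convention described above, NumPy treats shape `()` as a genuine rank-0 scalar array rather than an "unknown shape" marker (a plain-NumPy sketch, independent of MXNet):

```python
import numpy as np

a = np.ones(())        # rank-0: NumPy's notion of a scalar array
assert a.shape == ()
assert a.size == 1
assert float(a) == 1.0

b = np.ones((1,))      # rank-1 with a single element: a different object
assert b.shape == (1,)
```

The ambiguity in the issue arises because the same `()` spelling carries these two different meanings in the two systems.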
[GitHub] Ldpe2G commented on issue #8129: [scala] Module api: label (1,1,868,868) and prediction (1,868,868) should have the same length
Ldpe2G commented on issue #8129: [scala] Module api: label (1,1,868,868) and prediction (1,868,868) should have the same length URL: https://github.com/apache/incubator-mxnet/issues/8129#issuecomment-334036151 @benqua are you conducting a segmentation task? If so, the label should have only one channel, just like with the normal softmax. The multi_output parameter means the softmax is calculated along the channel axis. And I suggest you first print the output shapes by calling the `outputShapes` method of Module just after you call the `bind` method.
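To illustrate what "softmax along the channel axis" means here, a plain-NumPy sketch (not the Scala API) of the per-pixel normalization that multi_output requests:

```python
import numpy as np

def channel_softmax(x):
    """Softmax over axis 1 (channels) of an (N, C, H, W) array, so each
    pixel gets its own C-way probability distribution."""
    shifted = x - x.max(axis=1, keepdims=True)  # subtract max for stability
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

scores = np.random.randn(1, 2, 4, 4)   # toy logits shaped (N, C, H, W)
probs = channel_softmax(scores)
assert probs.shape == (1, 2, 4, 4)
assert np.allclose(probs.sum(axis=1), 1.0)  # per-pixel probabilities sum to 1
```

The label then needs only one channel of class indices, because the C-way distribution lives in the prediction, not the label.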
[GitHub] gautamkmr commented on issue #8128: Adding code owners
gautamkmr commented on issue #8128: Adding code owners URL: https://github.com/apache/incubator-mxnet/pull/8128#issuecomment-333992063 @CodingCat sure :) @piiswrong Thanks
[GitHub] benqua commented on issue #8129: [scala] Module api: label (1,1,868,868) and prediction (1,868,868) should have the same length
benqua commented on issue #8129: [scala] Module api: label (1,1,868,868) and prediction (1,868,868) should have the same length URL: https://github.com/apache/incubator-mxnet/issues/8129#issuecomment-333992969 OK, I checked and logged the shapes as suggested, and realized that it is not the batch-size dimension that is lost but the channel one (I had 1 for both, so I didn't notice at first). The network is a u-net very similar to the one described in the original u-net paper. Each pixel can be in one of two classes, as in the original paper. So, the last layers are:

```scala
// output
val conv10 = Symbol.Convolution()()(Map("data" -> conv9, "num_filter" -> 2, "kernel" -> "(1,1)"))
val label = Symbol.Variable("softmax_label")
val so = Symbol.SoftmaxOutput()()(Map("data" -> conv10, "label" -> label, "multi_output" -> true))
```

Now, when I run the code to train the network (posted above) with more logging, I get the following:

```
2017-10-03 23:40:37,808 [run-main-0] [UNet] [INFO] - so - Shape: Vector((1,2,868,868))
2017-10-03 23:40:37,897 [run-main-0] [TrainModuleUNet] [INFO] - symbol shape: Vector((1,2,868,868))
2017-10-03 23:40:37,899 [run-main-0] [TrainModuleUNet] [INFO] - providedData: data -> (1,1,1052,1052)
2017-10-03 23:40:37,900 [run-main-0] [TrainModuleUNet] [INFO] - providedLabel: softmax_label -> (1,1,868,868)
2017-10-03 23:40:38,038 [run-main-0] [TrainModuleUNet] [INFO] - bound!
2017-10-03 23:40:38,088 [run-main-0] [TrainModuleUNet] [INFO] - initialized!
2017-10-03 23:40:38,089 [run-main-0] [ml.dmlc.mxnet.module.Module] [WARN] - Already binded, ignoring bind()
MKL Build:20170720
[error] (run-main-0) java.lang.IllegalArgumentException: requirement failed: label (1,1,868,868) and prediction (1,868,868)should have the same length.
java.lang.IllegalArgumentException: requirement failed: label (1,1,868,868) and prediction (1,868,868)should have the same length.
    at scala.Predef$.require(Predef.scala:224)
    at ml.dmlc.mxnet.Accuracy$$anonfun$update$4.apply(EvalMetric.scala:111)
    (...)
```

The output of my network has a shape of (1, 2, 868, 868). However, the error message says that the prediction shape is (1, 868, 868). How can this be? I also see that my label is likely not in the right shape (one channel with either 0 or 1, instead of two channels with the probabilities of 0 and 1). However, the bind step seems OK, which makes me think there is possibly an implicit conversion done somewhere. Another very strange thing is that the program doesn't really stop after this exception: memory and CPU usage continue to grow until I kill sbt. Despite the failed require, the C++ backend continues to work... Any hint about how to correctly use SoftmaxOutput with multi_output would be greatly appreciated. :)
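One hedged guess at where the reported shapes come from, sketched in plain NumPy with toy 8x8 shapes in place of 868x868: an accuracy metric typically argmaxes the (N, C, H, W) prediction over the channel axis, yielding (N, H, W), which no longer matches a label that keeps an explicit singleton channel unless that axis is squeezed:

```python
import numpy as np

probs = np.random.rand(1, 2, 8, 8)   # prediction: (N, C, H, W), C = 2 classes
label = np.zeros((1, 1, 8, 8))       # label stored with a singleton channel axis

pred = probs.argmax(axis=1)          # per-pixel class index -> shape (1, 8, 8)
assert pred.shape == (1, 8, 8)
assert label.shape != pred.shape     # the reported mismatch, in miniature

# Dropping the label's channel axis restores agreement.
assert label.squeeze(axis=1).shape == pred.shape
```

Under this reading, feeding the label as (1, 868, 868) rather than (1, 1, 868, 868) would satisfy the metric's shape check; whether that is the intended fix for the Scala Module API here is an assumption, not confirmed by the thread.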
[GitHub] zhreshold closed pull request #8137: [Gluon] Object detection preview
zhreshold closed pull request #8137: [Gluon] Object detection preview URL: https://github.com/apache/incubator-mxnet/pull/8137
[GitHub] zhreshold opened a new pull request #8137: [Gluon] Object detection preview
zhreshold opened a new pull request #8137: [Gluon] Object detection preview URL: https://github.com/apache/incubator-mxnet/pull/8137
[incubator-mxnet] branch szha-patch-1 deleted (was 585b0b8)
This is an automated email from the ASF dual-hosted git repository. zhasheng pushed a change to branch szha-patch-1 in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. was 585b0b8 Update nn.md The revisions that were on this branch are still contained in other references; therefore, this change does not discard any commits from the repository.
[GitHub] szha closed pull request #8134: Update nn.md
szha closed pull request #8134: Update nn.md URL: https://github.com/apache/incubator-mxnet/pull/8134
[incubator-mxnet] branch master updated: Update nn.md (#8134)
This is an automated email from the ASF dual-hosted git repository. zhasheng pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git

The following commit(s) were added to refs/heads/master by this push:
     new 8a4221b  Update nn.md (#8134)
8a4221b is described below

commit 8a4221bca2ddd4fa05840f7951a7216775021237
Author: Sheng Zha
AuthorDate: Tue Oct 3 16:07:42 2017 -0700

    Update nn.md (#8134)
---
 docs/api/python/gluon/nn.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/api/python/gluon/nn.md b/docs/api/python/gluon/nn.md
index d230860..5e2dbe0 100644
--- a/docs/api/python/gluon/nn.md
+++ b/docs/api/python/gluon/nn.md
@@ -22,6 +22,7 @@ This document lists the neural network blocks in Gluon: BatchNorm LeakyReLU Embedding +Flatten ```
[GitHub] szha commented on issue #8105: Proposal: PR Template
szha commented on issue #8105: Proposal: PR Template URL: https://github.com/apache/incubator-mxnet/issues/8105#issuecomment-334005341 Thanks. @apache/mxnet-committers just to make sure everyone is aware.
[GitHub] szha commented on issue #8136: stable sum
szha commented on issue #8136: stable sum URL: https://github.com/apache/incubator-mxnet/pull/8136#issuecomment-333994412 I didn't realize that it was already added to mshadow. Never mind: https://github.com/dmlc/mshadow/blame/master/mshadow/base.h#L672-L685
[GitHub] piiswrong commented on issue #8136: stable sum
piiswrong commented on issue #8136: stable sum URL: https://github.com/apache/incubator-mxnet/pull/8136#issuecomment-333991002 Which file/function are you talking about?
[GitHub] szha commented on issue #8136: stable sum
szha commented on issue #8136: stable sum URL: https://github.com/apache/incubator-mxnet/pull/8136#issuecomment-333993695 https://github.com/apache/incubator-mxnet/blob/master/src/operator/tensor/broadcast_reduce_op_value.cc#L81-L88
[GitHub] yxchng commented on issue #8132: How to disable MXNET_CUDNN_AUTOTUNE_DEFAULT and bucketing log message without turning off MXNET_CUDNN_AUTOTUNE_DEFAULT?
yxchng commented on issue #8132: How to disable MXNET_CUDNN_AUTOTUNE_DEFAULT and bucketing log message without turning off MXNET_CUDNN_AUTOTUNE_DEFAULT? URL: https://github.com/apache/incubator-mxnet/issues/8132#issuecomment-333791801 I tried exporting the variable and setting it with os.environ, but neither works. The code I am using is https://github.com/pangyupo/mxnet_mtcnn_face_detection
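One hedged guess at a common pitfall here (an assumption, not a confirmed diagnosis of this report): the setting must be in the process environment before MXNet first reads it, so it belongs at the very top of the entry script, before anything imports mxnet:

```python
import os

# Set the flag before mxnet is imported anywhere in the process; a module
# imported earlier (e.g. by third-party detection code) may already have
# triggered the backend's environment lookup.
os.environ["MXNET_CUDNN_AUTOTUNE_DEFAULT"] = "0"

# import mxnet as mx   # only import mxnet (directly or indirectly) after this line
```

Note that setting the variable to 0 disables autotuning itself, which is exactly what the issue title asks to avoid; silencing only the log message would require a different mechanism.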
[GitHub] benqua commented on issue #8129: [scala] Module api: label (1,1,868,868) and prediction (1,868,868) should have the same length
benqua commented on issue #8129: [scala] Module api: label (1,1,868,868) and prediction (1,868,868) should have the same length URL: https://github.com/apache/incubator-mxnet/issues/8129#issuecomment-333796027 If that helps, the last layer of my network is:

```scala
val out = Symbol.SoftmaxOutput()()(Map("data" -> conv10, "label" -> label, "multi_output" -> true))
```

@javelinjs, @Ldpe2G, any idea?
[GitHub] yxchng opened a new issue #8132: How to disable MXNET_CUDNN_AUTOTUNE_DEFAULT log message without turning off MXNET_CUDNN_AUTOTUNE_DEFAULT?
yxchng opened a new issue #8132: How to disable MXNET_CUDNN_AUTOTUNE_DEFAULT log message without turning off MXNET_CUDNN_AUTOTUNE_DEFAULT? URL: https://github.com/apache/incubator-mxnet/issues/8132
[GitHub] theSparta commented on issue #8126: Not able to train a neural network using MXNET with C++ API
theSparta commented on issue #8126: Not able to train a neural network using MXNET with C++ API URL: https://github.com/apache/incubator-mxnet/issues/8126#issuecomment-333847990 I am adding a very simple example here in which I tried to fit a neural network to the XOR function, but was unable to do so as well.

```C++
#include <iostream>
#include <map>
#include <string>
#include <vector>
#include "mxnet-cpp/MxNetCpp.h"
// Allow IDE to parse the types
#include "../include/mxnet-cpp/op.h"

using namespace std;
using namespace mxnet::cpp;

Symbol mlp(const vector<int> &layers, const vector<Symbol> &weights,
           const std::vector<Symbol> &biases, const string &inp_name)
{
    auto x = Symbol::Variable(inp_name);
    vector<Symbol> outputs(layers.size());
    for (size_t i = 0; i < layers.size(); ++i)
    {
        string istr = to_string(i);
        Symbol fc = FullyConnected(i == 0 ? x : outputs[i-1],  // data
                                   weights[i], biases[i], layers[i]);
        outputs[i] = i == layers.size()-1
                     ? fc
                     : Activation(string("act") + istr, fc, ActivationActType::kTanh);
    }
    return outputs.back();
}

int main(int argc, char** argv)
{
    const int feature_size = 2;
    const vector<int> layers{8, 4, 1};
    const int batch_size = 4;
    const int max_epoch = 10;
    const float learning_rate = 0.001;
    const float weight_decay = 1e-2;

    auto ctx = Context::cpu();      // training context
    auto ctx_cpu = Context::cpu();

    vector<Symbol> weights(layers.size());
    vector<Symbol> biases(layers.size());
    for (size_t i = 0; i < layers.size(); ++i)
    {
        string istr = to_string(i);
        weights[i] = Symbol::Variable("w" + istr);
        biases[i] = Symbol::Variable("b" + istr);
    }

    auto Net = mlp(layers, weights, biases, "X");
    auto sym_label = Symbol::Variable("label");
    auto output = LogisticRegressionOutput(string("sigmoid"), Net, sym_label);

    map<string, NDArray> args_map;
    args_map["X"] = NDArray(Shape(batch_size, feature_size), ctx);
    args_map["label"] = NDArray(Shape(batch_size, 1), ctx);

    auto *exec = output.SimpleBind(ctx, args_map);
    output.InferArgsMap(ctx, &args_map, args_map);
    auto arg_names = output.ListArguments();

    Xavier xavier = Xavier(Xavier::gaussian, Xavier::avg);
    for (auto &arg : args_map)
    {
        xavier(arg.first, &arg.second);
    }

    Optimizer* opt = OptimizerRegistry::Find("adam");
    opt->SetParam("rescale_grad", 1.0 / batch_size)
       ->SetParam("lr", learning_rate)
       ->SetParam("wd", weight_decay);

    // XOR function
    mx_float* aptr_x = new mx_float[batch_size * feature_size];
    mx_float* aptr_y = new mx_float[batch_size];
    aptr_x[0] = 0.; aptr_x[1] = 0.; aptr_y[0] = 0;
    aptr_x[2] = 0.; aptr_x[3] = 1.; aptr_y[1] = 1;
    aptr_x[4] = 1.; aptr_x[5] = 0.; aptr_y[2] = 1;
    aptr_x[6] = 1.; aptr_x[7] = 1.; aptr_y[3] = 0;

    NDArray train_data = NDArray(Shape(batch_size, 2), ctx_cpu, false);
    NDArray train_label = NDArray(Shape(batch_size), ctx_cpu, false);
    train_data.SyncCopyFromCPU(aptr_x, batch_size * 2);
    train_label.SyncCopyFromCPU(aptr_y, batch_size);
    train_data.WaitToRead();
    train_label.WaitToRead();

    Accuracy acu_train;
    for (int ITER = 0; ITER < max_epoch; ++ITER)
    {
        acu_train.Reset();
        args_map["X"] = train_data.Copy(ctx);
        args_map["label"] = train_label.Copy(ctx);
        NDArray::WaitAll();
        exec->Forward(true);
        acu_train.Update(args_map["label"], exec->outputs[0]);
        if (ITER % 5000 == 0)
        {
            auto out = (exec->outputs[0]).Copy(ctx_cpu);
            auto labels = args_map["label"].Copy(ctx_cpu);
            NDArray::WaitAll();
            const mx_float *outs = out.GetData();
            auto lbs = labels.GetData();
            for (int i = 0; i < batch_size; i++)
                cout << lbs[i] << ":" << outs[i] << " ";
            cout << endl;
            LG << "ITER: " << ITER << " Train Accuracy: " << acu_train.Get();
        }
        exec->Backward();
        // Update parameters
        for (size_t i = 0; i < arg_names.size(); ++i)
        {
            if (arg_names[i] == "X" || arg_names[i] == "label") continue;
            opt->Update(i, exec->arg_arrays[i], exec->grad_arrays[i]);
        }
    }

    delete exec;
    delete [] aptr_x;
    delete [] aptr_y;
    MXNotifyShutdown();
    return 0;
}
```

The output is (True_label : Predicted_label):

```
0:0.95178 1:0.880215 1:0.944654 0:0.86154 [19:14:56] xor.cpp:114: ITER: 0 Train Accuracy: 0.5
0:0.786497 1:1 1:0.799124 0:3.35246e-13 [19:14:57] xor.cpp:114: ITER: 5000 Train Accuracy: 0.5
0:0.786137 1:1 1:0.800972
```
[GitHub] bhavinthaker commented on issue #8105: Proposal: PR Template
bhavinthaker commented on issue #8105: Proposal: PR Template URL: https://github.com/apache/incubator-mxnet/issues/8105#issuecomment-333874115 Looks good to me. Thanks for this suggestion. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] Ldpe2G commented on issue #8129: [scala] Module api: label (1,1,868,868) and prediction (1,868,868)should have the same length
Ldpe2G commented on issue #8129: [scala] Module api: label (1,1,868,868) and prediction (1,868,868)should have the same length URL: https://github.com/apache/incubator-mxnet/issues/8129#issuecomment-333860436 @benqua It seems like the error occurs when calling the `update` method of the Accuracy metric. Maybe you should first check the output shape of your network by calling the `inferShape` method of the symbol.
[GitHub] benqua commented on issue #8129: [scala] Module api: label (1,1,868,868) and prediction (1,868,868)should have the same length
benqua commented on issue #8129: [scala] Module api: label (1,1,868,868) and prediction (1,868,868)should have the same length URL: https://github.com/apache/incubator-mxnet/issues/8129#issuecomment-333862739 @Ldpe2G all right, I am going to check that (pretty sure it was (1,1,868,868), but I will cross-check once I arrive home).
[GitHub] altosaar commented on issue #8130: autograd.backward() segfaults: how to get gradients with respect to a subset of variables in mxnet?
altosaar commented on issue #8130: autograd.backward() segfaults: how to get gradients with respect to a subset of variables in mxnet? URL: https://github.com/apache/incubator-mxnet/issues/8130#issuecomment-333903906 Thanks @piiswrong ! I installed the newest mxnet (`pip install --pre mxnet`). Here is a reproducible example based on what you suggested:
```
In [49]: from mxnet import nd, autograd

In [50]: x = nd.array([1.])

In [51]: z = nd.array([1.])

In [52]: x.attach_grad()

In [53]: z.attach_grad()

In [54]: with autograd.record():
    ...:     first = nd.square(x)
    ...:     second = nd.square(z)
    ...:     y = first + second
    ...:     autograd.grad(y, [x], retain_graph=True)
    ...:     autograd.grad(y, [z])
    ...:

In [56]: x.grad
Out[56]: [ 0.]

In [57]: z.grad
Out[57]: [ 0.]
```
[GitHub] aseyboldt opened a new issue #8133: Infer_shape_partial for rank 0 arrays
aseyboldt opened a new issue #8133: Infer_shape_partial for rank 0 arrays URL: https://github.com/apache/incubator-mxnet/issues/8133 In the python interface of mxnet it seems to be impossible to distinguish between an array of unknown shape and an array with a known shape of `()`:
```python
a = mx.sym.var('a')
b = mx.sym.var('b')
(a + b).infer_shape_partial(b=())
([(), ()], [()], [])
```
In this case `b` is known to be a scalar, while `a`'s shape is unknown. Shouldn't it return something like `([None, ()], [None], [])`? If `a` is set to be a scalar, it returns an invalid result:
```python
a = mx.sym.var('a')
b = mx.sym.var('b')
(a + b).infer_shape_partial(a=(), b=(1, 2))
([(1, 2), (1, 2)], [(1, 2)], [])
```
It identifies the shape of `a` as `(1, 2)`, even though we set it to a scalar.
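The distinction the issue asks for can be modeled by representing an unknown shape as `None` and a scalar as the empty tuple `()`. This toy sketch (plain Python, not MXNet code; the function name is mine) shows how partial inference for an elementwise add can keep the two cases apart:

```python
def broadcast_infer(a, b):
    """Partial shape inference for elementwise a + b.

    a, b: a tuple of ints for a known shape, () for a scalar,
    or None for an unknown shape. Returns (a, b, out).
    """
    if a is None or b is None:
        # At least one operand unknown: the output stays unknown,
        # but a known scalar operand is NOT silently overwritten.
        return a, b, None
    # Both known: numpy-style broadcasting, padding with leading 1s.
    pa = (1,) * (len(b) - len(a)) + a
    pb = (1,) * (len(a) - len(b)) + b
    out = []
    for x, y in zip(pa, pb):
        if x != 1 and y != 1 and x != y:
            raise ValueError("incompatible shapes")
        out.append(max(x, y))
    return a, b, tuple(out)
```

With this representation, `broadcast_infer(None, ())` yields `(None, (), None)`, matching the `([None, ()], [None], [])` result the issue expects, and `broadcast_infer((), (1, 2))` infers the output as `(1, 2)` while leaving `a` a scalar.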
[GitHub] szha commented on issue #8136: stable sum
szha commented on issue #8136: stable sum URL: https://github.com/apache/incubator-mxnet/pull/8136#issuecomment-334013658 Looks like square sum needs updating too:
https://github.com/apache/incubator-mxnet/blob/master/src/operator/tensor/square_sum-inl.h#L127-L130
https://github.com/apache/incubator-mxnet/blob/master/src/operator/tensor/square_sum-inl.h#L148-L151
https://github.com/apache/incubator-mxnet/blob/master/src/operator/tensor/square_sum-inl.h#L171-L174
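This thread does not spell out which stabilization #8136 adopts, but compensated (Kahan) summation is the classic form of the idea being applied to `sum` and requested for `square_sum`. A plain-Python sketch of the technique, not the actual MXNet kernel code:

```python
def kahan_sum(xs):
    """Compensated (Kahan) summation: carries a running error term so
    low-order bits lost when adding a small value to a large partial
    sum are fed back into later additions."""
    total = 0.0
    comp = 0.0  # compensation for lost low-order bits
    for x in xs:
        y = x - comp          # subtract previously lost bits
        t = total + y         # big + small: low bits of y may be lost
        comp = (t - total) - y  # recover what was actually lost
        total = t
    return total
```

Each reduction site linked above would accumulate `x * x` through the same compensated pattern instead of a bare `sum += x * x`.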
[GitHub] eric-haibin-lin commented on issue #7893: Add barriers in kvstore init
eric-haibin-lin commented on issue #7893: Add barriers in kvstore init URL: https://github.com/apache/incubator-mxnet/pull/7893#issuecomment-334014237 The test failure seems irrelevant. Do you mind syncing up with master again to see if it passes? Regarding fp16, yes, I think we need that fix. The current static_cast approach is a bug. Why is an extra copy required? If you train with fp16, are both the weight and the gradient in fp16? Or are you just trying to minimize network traffic? BTW @rahul003 is working on gradient compression in parallel
[GitHub] eric-haibin-lin opened a new pull request #8138: add storage type logging to graph executor
eric-haibin-lin opened a new pull request #8138: add storage type logging to graph executor URL: https://github.com/apache/incubator-mxnet/pull/8138 This is mentioned in the tutorial PR #7921 and should be merged before that. The log message will be printed if the env var MXNET_INFER_STORAGE_TYPE_VERBOSE_LOGGING = 1
```
>>> os.environ['MXNET_INFER_STORAGE_TYPE_VERBOSE_LOGGING'] = "1"
>>> print(os.environ['MXNET_INFER_STORAGE_TYPE_VERBOSE_LOGGING'])
1
>>> # Data in csr format
... data = mx.sym.var('data', stype='csr', shape=(32, 1))
>>> # Weight in row_sparse format
... weight = mx.sym.var('weight', stype='row_sparse', shape=(1, 2))
>>> bias = mx.symbol.Variable("bias", shape=(2,))
>>> dot = mx.symbol.sparse.dot(data, weight)
>>> pred = mx.symbol.broadcast_add(dot, bias)
>>> y = mx.symbol.Variable("label")
>>> output = mx.symbol.SoftmaxOutput(data=pred, label=y, name="output")
>>> executor = output.simple_bind(ctx=mx.cpu())
[00:42:45] src/executor/../common/utils.h:130: node 0 var
[00:42:45] src/executor/../common/utils.h:130: node 1 var
[00:42:45] src/executor/../common/utils.h:132: node 2 dot: fcompute_ex
[00:42:45] src/executor/../common/utils.h:136: input 0: csr
[00:42:45] src/executor/../common/utils.h:136: input 1: row_sparse
[00:42:45] src/executor/../common/utils.h:141: output 2: default
[00:42:45] src/executor/../common/utils.h:130: node 3 var
[00:42:45] src/executor/../common/utils.h:132: node 4 broadcast_add: fcompute
[00:42:45] src/executor/../common/utils.h:136: input 2: default
[00:42:45] src/executor/../common/utils.h:136: input 3: default
[00:42:45] src/executor/../common/utils.h:141: output 4: default
[00:42:45] src/executor/../common/utils.h:130: node 5 var
[00:42:45] src/executor/../common/utils.h:132: node 6 SoftmaxOutput: fcompute
[00:42:45] src/executor/../common/utils.h:136: input 4: default
[00:42:45] src/executor/../common/utils.h:136: input 5: default
[00:42:45] src/executor/../common/utils.h:141: output 6: default
[00:42:45] src/executor/../common/utils.h:132: node 7 _backward_SoftmaxOutput: fcompute
[00:42:45] src/executor/../common/utils.h:136: input 5: default
[00:42:45] src/executor/../common/utils.h:136: input 6: default
[00:42:45] src/executor/../common/utils.h:141: output 7: default
[00:42:45] src/executor/../common/utils.h:141: output 8: default
[00:42:45] src/executor/../common/utils.h:132: node 8 _backward_broadcast_add: fcompute
[00:42:45] src/executor/../common/utils.h:136: input 7: default
[00:42:45] src/executor/../common/utils.h:141: output 9: default
[00:42:45] src/executor/../common/utils.h:141: output 10: default
[00:42:45] src/executor/../common/utils.h:132: node 9 _backward_dot: fcompute_ex
[00:42:45] src/executor/../common/utils.h:136: input 9: default
[00:42:45] src/executor/../common/utils.h:136: input 0: csr
[00:42:45] src/executor/../common/utils.h:136: input 1: row_sparse
[00:42:45] src/executor/../common/utils.h:141: output 11: default
[00:42:45] src/executor/../common/utils.h:141: output 12: row_sparse
```
[GitHub] ZiyueHuang commented on issue #8130: autograd.backward() segfaults: how to get gradients with respect to a subset of variables in mxnet?
ZiyueHuang commented on issue #8130: autograd.backward() segfaults: how to get gradients with respect to a subset of variables in mxnet? URL: https://github.com/apache/incubator-mxnet/issues/8130#issuecomment-334040825 Instead of using two `autograd.grad` calls, please try
```
print autograd.grad(y, [x, z])
```
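A toy reverse-mode sketch (plain Python with hypothetical helper names, not MXNet internals) of why a single grad call over `[x, z]` suffices: one backward pass over the recorded graph fills in the gradient of every requested variable at once, here for the thread's example y = x² + z²:

```python
class Var:
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # list of (parent_var, local_gradient)
        self.grad = 0.0

def square(v):
    return Var(v.value * v.value, [(v, 2.0 * v.value)])

def add(a, b):
    return Var(a.value + b.value, [(a, 1.0), (b, 1.0)])

def grad(y, wrt):
    """Backpropagate from y once; return dy/dv for each v in wrt."""
    y.grad = 1.0
    stack = [y]
    while stack:
        node = stack.pop()
        for parent, local in node.parents:
            parent.grad += local * node.grad  # chain rule
            stack.append(parent)
    return [v.grad for v in wrt]
```

For x = z = 1, `grad(add(square(x), square(z)), [x, z])` yields the expected [2.0, 2.0], which is what the zeroed `x.grad`/`z.grad` in the report above should have contained.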
[GitHub] caiqi opened a new issue #8139: mxnet ssd training speed slow down after some batches
caiqi opened a new issue #8139: mxnet ssd training speed slow down after some batches URL: https://github.com/apache/incubator-mxnet/issues/8139

## Environment info

Operating System: Windows
Package used (Python/R/Scala/Julia): Python
MXNet version: 0.11.0
Or if installed from source: installed with pip

When training an SSD detection model on multiple GPUs, the training speed slows down after a long time. Is there any way to solve the problem? Thanks.

> INFO:root:Epoch[0] Batch [20] Speed: 74.29 samples/sec
> INFO:root:Epoch[0] Batch [40] Speed: 76.35 samples/sec
> INFO:root:Epoch[0] Batch [60] Speed: 75.31 samples/sec
> INFO:root:Epoch[0] Batch [80] Speed: 74.59 samples/sec
> INFO:root:Epoch[0] Batch [100] Speed: 75.76 samples/sec
> INFO:root:Epoch[0] Batch [120] Speed: 77.23 samples/sec
> INFO:root:Epoch[0] Batch [140] Speed: 74.44 samples/sec
> INFO:root:Epoch[0] Batch [160] Speed: 74.67 samples/sec
> INFO:root:Epoch[0] Batch [180] Speed: 75.40 samples/sec
> INFO:root:Epoch[0] Batch [200] Speed: 76.74 samples/sec
> INFO:root:Epoch[0] Batch [220] Speed: 75.12 samples/sec
> INFO:root:Epoch[0] Batch [240] Speed: 76.70 samples/sec
> INFO:root:Epoch[0] Batch [260] Speed: 74.27 samples/sec
> INFO:root:Epoch[0] Batch [280] Speed: 75.89 samples/sec
> INFO:root:Epoch[0] Batch [300] Speed: 75.57 samples/sec
> INFO:root:Epoch[0] Batch [320] Speed: 76.34 samples/sec
> INFO:root:Epoch[0] Batch [340] Speed: 75.85 samples/sec
> INFO:root:Epoch[0] Batch [360] Speed: 76.27 samples/sec
> INFO:root:Epoch[0] Batch [380] Speed: 76.11 samples/sec
> INFO:root:Epoch[0] Batch [400] Speed: 76.88 samples/sec
> INFO:root:Epoch[0] Batch [420] Speed: 75.87 samples/sec
> INFO:root:Epoch[0] Batch [440] Speed: 75.08 samples/sec
> INFO:root:Epoch[0] Batch [460] Speed: 76.34 samples/sec
> INFO:root:Epoch[0] Batch [480] Speed: 76.06 samples/sec
> INFO:root:Epoch[0] Batch [500] Speed: 73.84 samples/sec
> INFO:root:Epoch[0] Batch [520] Speed: 69.82 samples/sec
> INFO:root:Epoch[0] Batch [540] Speed: 65.33 samples/sec
> INFO:root:Epoch[0] Batch [560] Speed: 63.28 samples/sec
> INFO:root:Epoch[0] Batch [580] Speed: 59.28 samples/sec
> INFO:root:Epoch[0] Batch [600] Speed: 54.57 samples/sec
> INFO:root:Epoch[0] Batch [620] Speed: 52.37 samples/sec
> INFO:root:Epoch[0] Batch [640] Speed: 51.08 samples/sec
> INFO:root:Epoch[0] Batch [660] Speed: 50.30 samples/sec
> INFO:root:Epoch[0] Batch [680] Speed: 49.22 samples/sec
> INFO:root:Epoch[0] Batch [700] Speed: 49.70 samples/sec
> INFO:root:Epoch[0] Batch [720] Speed: 50.45 samples/sec
> INFO:root:Epoch[0] Batch [740] Speed: 52.21 samples/sec
> INFO:root:Epoch[0] Batch [760] Speed: 54.90 samples/sec
> INFO:root:Epoch[0] Batch [780] Speed: 58.65 samples/sec
> INFO:root:Epoch[0] Batch [800] Speed: 60.69 samples/sec
> INFO:root:Epoch[0] Batch [820] Speed: 66.90 samples/sec
> INFO:root:Epoch[0] Batch [840] Speed: 68.57 samples/sec
> INFO:root:Epoch[0] Batch [860] Speed: 70.10 samples/sec
> INFO:root:Epoch[0] Batch [880] Speed: 70.06 samples/sec
> INFO:root:Epoch[0] Batch [900] Speed: 71.81 samples/sec
> INFO:root:Epoch[0] Batch [920] Speed: 73.46 samples/sec
> INFO:root:Epoch[0] Batch [940] Speed: 72.55 samples/sec
> INFO:root:Epoch[0] Batch [960] Speed: 71.95 samples/sec
> INFO:root:Epoch[0] Batch [980] Speed: 72.64 samples/sec
> INFO:root:Epoch[0] Batch [1000] Speed: 72.28 samples/sec
> INFO:root:Epoch[0] Batch [1020] Speed: 72.63 samples/sec
> INFO:root:Epoch[0] Batch [1040] Speed: 73.61 samples/sec
> INFO:root:Epoch[0] Batch [1060] Speed: 74.30 samples/sec
> INFO:root:Epoch[0] Batch [1080] Speed: 73.47 samples/sec
> INFO:root:Epoch[0] Batch [1100] Speed: 73.09 samples/sec
> INFO:root:Epoch[0] Batch [1120] Speed: 72.78 samples/sec
> INFO:root:Epoch[0] Batch [1140] Speed: 73.37 samples/sec
> INFO:root:Epoch[0] Batch [1160] Speed: 73.40 samples/sec
> INFO:root:Epoch[0] Batch [1180] Speed: 73.53 samples/sec
> INFO:root:Epoch[0] Batch [1200] Speed: 73.48 samples/sec
> INFO:root:Epoch[0] Batch [1220] Speed: 71.79 samples/sec
> INFO:root:Epoch[0] Batch [1240] Speed: 72.09 samples/sec
> INFO:root:Epoch[0] Batch [1260] Speed: 69.93 samples/sec
> INFO:root:Epoch[0] Batch [1280] Speed: 64.27 samples/sec
> INFO:root:Epoch[0] Batch [1300] Speed: 59.91 samples/sec
> INFO:root:Epoch[0] Batch [1320] Speed: 54.84 samples/sec
> INFO:root:Epoch[0] Batch [1340] Speed: 51.23 samples/sec
> INFO:root:Epoch[0] Batch [1360] Speed: 48.56 samples/sec
> INFO:root:Epoch[0] Batch [1380] Speed: 44.69 samples/sec
> INFO:root:Epoch[0] Batch [1400] Speed: 41.22 samples/sec
> INFO:root:Epoch[0] Batch [1420] Speed: 38.33 samples/sec
> INFO:root:Epoch[0] Batch [1440] Speed: 35.85 samples/sec
> INFO:root:Epoch[0]
[GitHub] solin319 commented on issue #7893: Add barriers in kvstore init
solin319 commented on issue #7893: Add barriers in kvstore init URL: https://github.com/apache/incubator-mxnet/pull/7893#issuecomment-334041868 Sorry, I am on holiday these days and didn't bring my computer back to my hometown. When I go back, I will sync the code and add a test file as soon as possible.
[GitHub] eric-haibin-lin commented on issue #7893: Add barriers in kvstore init
eric-haibin-lin commented on issue #7893: Add barriers in kvstore init URL: https://github.com/apache/incubator-mxnet/pull/7893#issuecomment-334042834 @solin319 no worries. Happy mid-autumn festival!
[GitHub] szha opened a new pull request #8140: use fma in fully connected.
szha opened a new pull request #8140: use fma in fully connected. URL: https://github.com/apache/incubator-mxnet/pull/8140 When using bias, broadcast the bias to the output first and then use kAddTo, which in turn keeps beta in the gemm at 1.
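The trick the PR describes can be sketched in NumPy (a schematic of the idea only, not the actual mshadow kernel): write the broadcast bias into the output buffer first, then accumulate the matrix product into it. Accumulating into an already-initialized output is what kAddTo expresses, i.e. a gemm with beta = 1, which lets the add fuse into the multiply (fma) instead of running a separate beta-scaled pass.

```python
import numpy as np

def fully_connected(x, w, b):
    """y = x @ w.T + b, computed bias-first.

    x: (batch, in_features), w: (out_features, in_features), b: (out_features,)
    """
    # Step 1: output buffer starts as the broadcast bias.
    out = np.broadcast_to(b, (x.shape[0], w.shape[0])).copy()
    # Step 2: gemm accumulated into out (the kAddTo / beta = 1 step).
    out += x @ w.T
    return out
```

The result is identical to `x @ w.T + b`; only the order of operations changes so the bias add rides along with the accumulation.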
[GitHub] zhreshold commented on issue #8139: mxnet ssd training speed slow down after some batches
zhreshold commented on issue #8139: mxnet ssd training speed slow down after some batches URL: https://github.com/apache/incubator-mxnet/issues/8139#issuecomment-334045504 Seems like thermal throttling. Are you using a workstation with multiple GPUs? If so, you need to address the heating problem.
[GitHub] caiqi commented on issue #8139: mxnet ssd training speed slow down after some batches
caiqi commented on issue #8139: mxnet ssd training speed slow down after some batches URL: https://github.com/apache/incubator-mxnet/issues/8139#issuecomment-334047247 I'm using multiple GPUs on a server and it works well for other programs. When I train ssd on a single GPU, there is no problem; the speed stays constant at around 39 samples/sec. So I think it may not be due to a heating problem.
[GitHub] caiqi commented on issue #8139: mxnet ssd training speed slow down after some batches
caiqi commented on issue #8139: mxnet ssd training speed slow down after some batches URL: https://github.com/apache/incubator-mxnet/issues/8139#issuecomment-334047544 I have read FAQ 2 on this page: https://github.com/msracver/Flow-Guided-Feature-Aggregation. I originally thought the problem was specific to that repo. Is the problem common to mxnet on windows? Thanks.
[GitHub] eric-haibin-lin commented on issue #8062: CSVIter and LibSVMIter not returning correct number of batches per epoch
eric-haibin-lin commented on issue #8062: CSVIter and LibSVMIter not returning correct number of batches per epoch URL: https://github.com/apache/incubator-mxnet/issues/8062#issuecomment-334051905 The number of batches will be correct if reset() is moved to the end of the epoch:
```
for epoch in range(10):
    nbatch = 0
    for batch in iter(data_train):
        nbatch += 1
    assert(nbatch == 100), nbatch
    data_train.reset()
```
I've updated the documentation for this in #8111
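Why the placement of reset() matters can be seen with a toy model (plain Python, not the real CSVIter) of an iterator whose read cursor survives across passes: `iter()` does not rewind it, so unless reset() runs at the end of every epoch, the next epoch resumes from an exhausted cursor and sees zero batches.

```python
class CursorIter:
    """Stateful iterator: iter() returns self without rewinding."""

    def __init__(self, num_batches):
        self.num_batches = num_batches
        self.cursor = 0

    def __iter__(self):
        return self  # note: does NOT rewind the cursor

    def __next__(self):
        if self.cursor >= self.num_batches:
            raise StopIteration
        self.cursor += 1
        return self.cursor - 1

    def reset(self):
        self.cursor = 0
```

With `reset()` at the end of each epoch, every epoch counts the full `num_batches`; drop the reset and the second pass over `iter(it)` immediately raises StopIteration.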
[GitHub] eric-haibin-lin closed issue #8062: CSVIter and LibSVMIter not returning correct number of batches per epoch
eric-haibin-lin closed issue #8062: CSVIter and LibSVMIter not returning correct number of batches per epoch URL: https://github.com/apache/incubator-mxnet/issues/8062