[GitHub] [incubator-mxnet] JFChi opened a new issue #15411: Need register_backward_hook() function in mxnet
JFChi opened a new issue #15411: Need register_backward_hook() function in mxnet URL: https://github.com/apache/incubator-mxnet/issues/15411 The current version of MXNet provides only the register_forward_hook() function. However, a register_backward_hook() function would also be useful (e.g., for logging information about the gradient w.r.t. the block, or for overriding the block's backward function). Can we add this feature in the next version of MXNet? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
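To illustrate the kind of API the issue is asking for, here is a minimal, framework-agnostic sketch of the forward/backward hook pattern in plain Python. This is not MXNet's actual Block implementation; the class, method names, and the toy computation below only mirror the requested interface.

```python
# Minimal sketch of the forward/backward hook pattern (NOT MXNet's real
# Block API): hooks are callables invoked after the forward/backward pass.
class Block:
    def __init__(self):
        self._forward_hooks = []
        self._backward_hooks = []

    def register_forward_hook(self, hook):
        self._forward_hooks.append(hook)

    def register_backward_hook(self, hook):
        self._backward_hooks.append(hook)

    def forward(self, x):
        out = 2 * x  # stand-in computation: y = 2x
        for hook in self._forward_hooks:
            hook(self, x, out)
        return out

    def backward(self, grad_out):
        grad_in = 2 * grad_out  # dy/dx = 2, so grad_in = 2 * grad_out
        for hook in self._backward_hooks:
            hook(self, grad_out, grad_in)  # e.g. log gradient information
        return grad_in

grads = []
blk = Block()
blk.register_backward_hook(lambda b, g_out, g_in: grads.append(g_in))
blk.forward(3.0)
blk.backward(1.0)
print(grads)  # the backward hook observed the input gradient
```

A backward hook like this covers the use case from the issue: observing (or logging) gradients as they flow through a block, without modifying the block's own code.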
[GitHub] [incubator-mxnet] JFChi closed issue #15411: Need register_backward_hook() function in mxnet
JFChi closed issue #15411: Need register_backward_hook() function in mxnet URL: https://github.com/apache/incubator-mxnet/issues/15411
[GitHub] [incubator-mxnet] szha commented on issue #15410: data.dmlc.ml redirect (security breach?)
szha commented on issue #15410: data.dmlc.ml redirect (security breach?) URL: https://github.com/apache/incubator-mxnet/issues/15410#issuecomment-506916684 @mli told me that he doesn't own the domain name. Also, the domain name has probably expired. All links to data.dmlc.ml should be replaced.
[GitHub] [incubator-mxnet] sandeep-krishnamurthy commented on a change in pull request #15403: Updating profiler tutorial to include new custom operator profiling
sandeep-krishnamurthy commented on a change in pull request #15403: Updating profiler tutorial to include new custom operator profiling URL: https://github.com/apache/incubator-mxnet/pull/15403#discussion_r298782963 ## File path: docs/tutorials/python/profiler.md ## @@ -206,6 +206,15 @@ Let's zoom in to check the time taken by operators The above picture visualizes the sequence in which the operators were executed and the time taken by each operator. +### Profiling Custom Operators +Should the existing NDArray operators fail to meet all your model's needs, MXNet supports [Custom Operators](https://mxnet.incubator.apache.org/versions/master/tutorials/gluon/customop.html) that you can define in python. In forward() and backward() of a custom operator, there are two kinds of code: `pure python` code (Numpy operators inclued) and `sub-operators` (NDArray operators called within foward() and backward()). With that said, MXNet can profile the execution time of both kinds without additional setup. More specifically, the MXNet profiler will break a single custom operator call into a `pure python` event and several `sub-operator` events if there is any. Furthermore, all those events will have a prefix in their names, which is conviniently the name of the custom operator you called. + +![Custom Operator Profiling Screenshot](https://cwiki.apache.org/confluence/download/attachments/118172065/image2019-6-14_15-23-42.png?version=1=1560551022000=v2) + +As shown by the sreenshot, in the `Custom Operator` domain where all the custom-operator-related events fall into, you can easily visualize the execution time of each segment of your custom operator. For example, we know that "CustomAddTwo::sqrt" is a `sub-operator` of custom operator "CustomAddTwo", and we also know when it is exectued accurately. Review comment: Spell check. Sreenshot - Screenshot. Even better wording would be - As shown in the below image.
[GitHub] [incubator-mxnet] sandeep-krishnamurthy commented on a change in pull request #15403: Updating profiler tutorial to include new custom operator profiling
sandeep-krishnamurthy commented on a change in pull request #15403: Updating profiler tutorial to include new custom operator profiling URL: https://github.com/apache/incubator-mxnet/pull/15403#discussion_r298783010 ## File path: docs/tutorials/python/profiler.md ## @@ -206,6 +206,15 @@ Let's zoom in to check the time taken by operators The above picture visualizes the sequence in which the operators were executed and the time taken by each operator. +### Profiling Custom Operators +Should the existing NDArray operators fail to meet all your model's needs, MXNet supports [Custom Operators](https://mxnet.incubator.apache.org/versions/master/tutorials/gluon/customop.html) that you can define in Python. In `forward()` and `backward()` of a custom operator, there are two kinds of code: "pure Python" code (NumPy operators included) and "sub-operators" (NDArray operators called within `forward()` and `backward()`). With that said, MXNet can profile the execution time of both kinds without additional setup. Specifically, the MXNet profiler will break a single custom operator call into a pure Python event and several sub-operator events if there are any. Furthermore, all of those events will have a prefix in their names, which is, conveniently, the name of the custom operator you called. + +![Custom Operator Profiling Screenshot](https://cwiki.apache.org/confluence/download/attachments/118172065/image2019-6-14_15-23-42.png?version=1=1560551022000=v2) Review comment: All MXNet images are served from dmlc/web-data repository. Let us create a PR in that repo to upload this image.
[GitHub] [incubator-mxnet] sandeep-krishnamurthy commented on a change in pull request #15403: Updating profiler tutorial to include new custom operator profiling
sandeep-krishnamurthy commented on a change in pull request #15403: Updating profiler tutorial to include new custom operator profiling URL: https://github.com/apache/incubator-mxnet/pull/15403#discussion_r298783062 ## File path: docs/tutorials/python/profiler.md ## @@ -206,6 +206,15 @@ Let's zoom in to check the time taken by operators The above picture visualizes the sequence in which the operators were executed and the time taken by each operator. +### Profiling Custom Operators +Should the existing NDArray operators fail to meet all your model's needs, MXNet supports [Custom Operators](https://mxnet.incubator.apache.org/versions/master/tutorials/gluon/customop.html) that you can define in Python. In `forward()` and `backward()` of a custom operator, there are two kinds of code: "pure Python" code (NumPy operators included) and "sub-operators" (NDArray operators called within `forward()` and `backward()`). With that said, MXNet can profile the execution time of both kinds without additional setup. Specifically, the MXNet profiler will break a single custom operator call into a pure Python event and several sub-operator events if there are any. Furthermore, all of those events will have a prefix in their names, which is, conveniently, the name of the custom operator you called. + +![Custom Operator Profiling Screenshot](https://cwiki.apache.org/confluence/download/attachments/118172065/image2019-6-14_15-23-42.png?version=1=1560551022000=v2) + +As shown by the screenshot, in the **Custom Operator** domain where all the custom operator-related events fall into, you can easily visualize the execution time of each segment of your custom operator. For example, we know that `CustomAddTwo::sqrt` is a sub-operator of custom operator `CustomAddTwo`, and we also know when it is executed accurately. 
+ +Please note that: to be able to see the previously described information, you need to set `profile_imperative` to `True` even when you are using custom operators in [symbolic mode](https://mxnet.incubator.apache.org/versions/master/tutorials/basic/symbol.html). The reason is that within custom operators, pure python code and sub-operators are still called imperatively. Review comment: Please add a code example as well. Makes it very clear for readers on how to use.
[GitHub] [incubator-mxnet] larroy commented on issue #14779: Fully connected, higher order grad
larroy commented on issue #14779: Fully connected, higher order grad URL: https://github.com/apache/incubator-mxnet/pull/14779#issuecomment-506915508 Thanks a lot for your help @sxjscience. I will refine this PR in light of your feedback.
[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.
This is an automated email from the ASF dual-hosted git repository. marcoabreu pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git

The following commit(s) were added to refs/heads/asf-site by this push:
     new 5472817  Bump the publish timestamp.
5472817 is described below

commit 5472817a3f19d5681cbca87a993d61ce34c44fd3
Author: mxnet-ci
AuthorDate: Sat Jun 29 01:16:43 2019 +

    Bump the publish timestamp.
---
 date.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/date.txt b/date.txt
new file mode 100644
index 000..5221d12
--- /dev/null
+++ b/date.txt
@@ -0,0 +1 @@
+Sat Jun 29 01:16:43 UTC 2019
[GitHub] [incubator-mxnet] mxnet-label-bot commented on issue #15410: data.dmlc.ml redirect (security breach?)
mxnet-label-bot commented on issue #15410: data.dmlc.ml redirect (security breach?) URL: https://github.com/apache/incubator-mxnet/issues/15410#issuecomment-506915205 Hey, this is the MXNet Label Bot. Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it. Here are my recommended labels: Bug
[GitHub] [incubator-mxnet] aaronmarkham commented on issue #15410: data.dmlc.ml redirect (security breach?)
aaronmarkham commented on issue #15410: data.dmlc.ml redirect (security breach?) URL: https://github.com/apache/incubator-mxnet/issues/15410#issuecomment-506915224 @mli - you might want to check this out...
[GitHub] [incubator-mxnet] aaronmarkham opened a new issue #15410: data.dmlc.ml redirect (security breach?)
aaronmarkham opened a new issue #15410: data.dmlc.ml redirect (security breach?) URL: https://github.com/apache/incubator-mxnet/issues/15410 While checking the broken links, I found this really weird redirect for anything going to data.dmlc.ml. It goes to iyfnzgb.com... whois on this is hidden, but other results indicate mal/adware.
URL - https://mxnet.incubator.apache.org/model_zoo/index.html
Broken Links
─ http://data.dmlc.ml/mxnet/models/imagenet/caffenet/caffenet-symbol.json (HTTP_400)
─ http://data.dmlc.ml/models/imagenet/nin/nin-symbol.json (HTTP_400)
─ http://data.dmlc.ml/models/imagenet/squeezenet/squeezenet_v1.1-symbol.json (HTTP_400)
─ http://data.dmlc.ml/models/imagenet/vgg/vgg16-symbol.json (HTTP_400)
─ http://data.dmlc.ml/models/imagenet/vgg/vgg19-symbol.json (HTTP_400)
─ http://data.dmlc.ml/models/imagenet/inception-bn/Inception-BN-symbol.json (HTTP_400)
─ http://data.dmlc.ml/models/imagenet/resnet/152-layers/resnet-152-symbol.json (HTTP_400)
─ http://data.dmlc.ml/models/imagenet/resnext/101-layers/resnext-101-64x4d-symbol.json (HTTP_400)
[GitHub] [incubator-mxnet] mxnet-label-bot commented on issue #15409: float16 tutorial broken link
mxnet-label-bot commented on issue #15409: float16 tutorial broken link URL: https://github.com/apache/incubator-mxnet/issues/15409#issuecomment-506914875 Hey, this is the MXNet Label Bot. Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it. Here are my recommended labels: Doc
[GitHub] [incubator-mxnet] aaronmarkham opened a new issue #15409: float16 tutorial broken link
aaronmarkham opened a new issue #15409: float16 tutorial broken link URL: https://github.com/apache/incubator-mxnet/issues/15409
URL - https://mxnet.incubator.apache.org/versions/master/faq/float16.html
Broken Links
─ https://github.com/apache/incubator-mxnet/tree/master/example/image-classificatiIfon/train_imagenet.py (HTTP_404)
URL should be: https://github.com/apache/incubator-mxnet/blob/master/example/image-classification/train_imagenet.py
[GitHub] [incubator-mxnet] mxnet-label-bot commented on issue #15408: tensorrt tutorial missing images
mxnet-label-bot commented on issue #15408: tensorrt tutorial missing images URL: https://github.com/apache/incubator-mxnet/issues/15408#issuecomment-506914639 Hey, this is the MXNet Label Bot. Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it. Here are my recommended labels: Doc
[GitHub] [incubator-mxnet] aaronmarkham opened a new issue #15408: tensorrt tutorial missing images
aaronmarkham opened a new issue #15408: tensorrt tutorial missing images URL: https://github.com/apache/incubator-mxnet/issues/15408
URL - https://mxnet.incubator.apache.org/tutorials/tensorrt/inference_with_trt.html
Broken Links
─ https://mxnet.incubator.apache.org/versions/master/tutorials/tensorrt/_static/tutorials/tensorrt/wavenet_unoptimized.png (HTTP_404)
─ https://mxnet.incubator.apache.org/versions/master/tutorials/tensorrt/_static/tutorials/tensorrt/wavenet_optimized.png (HTTP_404)
[GitHub] [incubator-mxnet] mxnet-label-bot commented on issue #15407: scala hello world broken link
mxnet-label-bot commented on issue #15407: scala hello world broken link URL: https://github.com/apache/incubator-mxnet/issues/15407#issuecomment-506914557 Hey, this is the MXNet Label Bot. Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it. Here are my recommended labels: Scala, Doc
[GitHub] [incubator-mxnet] aaronmarkham opened a new issue #15407: scala hello world broken link
aaronmarkham opened a new issue #15407: scala hello world broken link URL: https://github.com/apache/incubator-mxnet/issues/15407
URL - https://mxnet.incubator.apache.org/tutorials/java/mxnet_java_on_intellij.html
Broken Links
─ https://github.com/apache/incubator-mxnet/blob/java-api/scala-package/mxnet-demo/java-demo/src/main/java/mxnet/HelloWorld.java (HTTP_404)
─ https://github.com/apache/incubator-mxnet/tree/java-api/scala-package/mxnet-demo/java-demo (HTTP_404)
─ https://github.com/apache/incubator-mxnet/blob/java-api/scala-package/mxnet-demo/java-demo/README.md (HTTP_404)
[GitHub] [incubator-mxnet] vdantu commented on issue #14459: Cannot release memory sometimes
vdantu commented on issue #14459: Cannot release memory sometimes URL: https://github.com/apache/incubator-mxnet/issues/14459#issuecomment-506914516 @ShownX : Did you try the suggestion given by @chinakook ? Are you still facing this issue? Some other suggestions from online forums are: 1. Kill all python processes on the system. ``` killall python ``` 2. List all processes on that GPU ``` lsof /dev/nvidia* ``` and kill all processes running on that GPU. [Refer to this](https://github.com/neighthan/gpu-utils/blob/c6329f21fa3780ee89985b7974194cb808643f5c/gpu_utils/utils.py#L114)
[GitHub] [incubator-mxnet] larroy commented on issue #15405: Fix memory leak in NaiveEngine
larroy commented on issue #15405: Fix memory leak in NaiveEngine URL: https://github.com/apache/incubator-mxnet/pull/15405#issuecomment-506914439 @mxnet-label-bot add [Backend]
[GitHub] [incubator-mxnet] larroy commented on issue #15405: Fix memory leak in NaiveEngine
larroy commented on issue #15405: Fix memory leak in NaiveEngine URL: https://github.com/apache/incubator-mxnet/pull/15405#issuecomment-506914418 @mxnet-label-bot add [Engine, Bug, Memory]
[GitHub] [incubator-mxnet] larroy commented on issue #15405: Fix memory leak in NaiveEngine
larroy commented on issue #15405: Fix memory leak in NaiveEngine URL: https://github.com/apache/incubator-mxnet/pull/15405#issuecomment-506914377 Added valgrind run which shows the memory being leaked. ![naive_engine_leak](https://user-images.githubusercontent.com/928489/60377855-2fc17b80-99cf-11e9-9340-00c5892c8227.png)
[GitHub] [incubator-mxnet] larroy commented on issue #14836: Refactor AGInfo and Imperative
larroy commented on issue #14836: Refactor AGInfo and Imperative URL: https://github.com/apache/incubator-mxnet/pull/14836#issuecomment-506913823 @szha should I fix this PR to get it merged or abandon it?
[GitHub] [incubator-mxnet] larroy commented on issue #15117: nano instructions
larroy commented on issue #15117: nano instructions URL: https://github.com/apache/incubator-mxnet/pull/15117#issuecomment-506913716 Let's look at this next week?
[GitHub] [incubator-mxnet] larroy commented on issue #15393: Unable to build mxnet with OpenCV4 on Raspberry Pi 3B
larroy commented on issue #15393: Unable to build mxnet with OpenCV4 on Raspberry Pi 3B URL: https://github.com/apache/incubator-mxnet/issues/15393#issuecomment-506913502 You can also have a look at our Docker files for cross compiling. I would suggest contributing to them if you want; cross compiling is also much faster than building on the Pi. Check ci/build.py: you can run ci/build.py -p arm[xxx] to cross compile for the Pi.
[GitHub] [incubator-mxnet] larroy commented on issue #15393: Unable to build mxnet with OpenCV4 on Raspberry Pi 3B
larroy commented on issue #15393: Unable to build mxnet with OpenCV4 on Raspberry Pi 3B URL: https://github.com/apache/incubator-mxnet/issues/15393#issuecomment-506913429 When I have this issue, I usually try to locate the library that contains the symbol. You can use something like "nm" (piped through c++filt), or elfdump or similar, to list a library's symbols. Once you have located which library has the symbol, make sure you are linking against it. I assume that you are using the OpenCV headers of your new version and not older ones... You can copy-paste the linking command that fails and add the required -l libraries until it links, then fix the build file.
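As a concrete sketch of that workflow: `c++filt` (shipped with binutils) can demangle the symbol from the linker error so you can see which C++ function it refers to, and `nm` can then search candidate libraries for a definition. The library path below is an assumption; point it at wherever your OpenCV is actually installed.

```shell
# Demangle the undefined symbol reported by the linker to see the
# C++ function it refers to:
echo '_ZN2cv5errorEiRKSsPKcS3_i' | c++filt

# Search a candidate library for a definition of that symbol.
# The path is an assumption -- adjust to your actual OpenCV install:
# nm -D --defined-only /usr/local/lib/libopencv_core.so | c++filt | grep 'cv::error'
```

If `nm` finds the symbol defined in a library you are not linking against, adding the corresponding library to the link line (as suggested above) should resolve the error.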
[GitHub] [incubator-mxnet] mxnet-label-bot commented on issue #15406: [Test failure]: test_custom_operator_profiling_naive_engine
mxnet-label-bot commented on issue #15406: [Test failure]: test_custom_operator_profiling_naive_engine URL: https://github.com/apache/incubator-mxnet/issues/15406#issuecomment-506913129 Hey, this is the MXNet Label Bot. Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it. Here are my recommended labels: Test
[GitHub] [incubator-mxnet] aaronmarkham opened a new issue #15406: [Test failure]: test_custom_operator_profiling_naive_engine
aaronmarkham opened a new issue #15406: [Test failure]: test_custom_operator_profiling_naive_engine URL: https://github.com/apache/incubator-mxnet/issues/15406 ## Description Docs update PR failed on a test. http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-cpu/detail/PR-15117/4/pipeline
```
==
FAIL: test_profiler.test_custom_operator_profiling_naive_engine
--
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/work/mxnet/tests/python/unittest/test_profiler.py", line 445, in test_custom_operator_profiling_naive_engine
    'test_custom_operator_profiling_multiple_custom_ops_imperative_naive.json')
  File "/work/mxnet/tests/python/unittest/common.py", line 313, in run_in_spawned_process
    assert p.exitcode == 0, "Non-zero exit code %d from %s()." % (p.exitcode, func.__name__)
AssertionError: Non-zero exit code 255 from test_custom_operator_profiling_multiple_custom_ops_imperative().
```
[GitHub] [incubator-mxnet] larroy opened a new pull request #15405: Fix memory leak in NaiveEngine
larroy opened a new pull request #15405: Fix memory leak in NaiveEngine URL: https://github.com/apache/incubator-mxnet/pull/15405 Fixes #15375 ## Description ## Fixes a memory leak in NaiveEngine when using operator profiling.
[GitHub] [incubator-mxnet] larroy commented on issue #15375: Memory leak in Naive engine when profiling
larroy commented on issue #15375: Memory leak in Naive engine when profiling URL: https://github.com/apache/incubator-mxnet/issues/15375#issuecomment-506912862 [valgrind.txt](https://github.com/apache/incubator-mxnet/files/3341240/valgrind.txt)
[GitHub] [incubator-mxnet] larroy commented on issue #15375: Memory leak in Naive engine when profiling
larroy commented on issue #15375: Memory leak in Naive engine when profiling URL: https://github.com/apache/incubator-mxnet/issues/15375#issuecomment-506912535 Attaching valgrind run which shows the leaked memory. ![naive_engine_leak](https://user-images.githubusercontent.com/928489/60377426-c12eee80-99cb-11e9-81fc-a15ffb71a799.png)
[GitHub] [incubator-mxnet] anirudh2290 commented on issue #15375: Memory leak in Naive engine when profiling
anirudh2290 commented on issue #15375: Memory leak in Naive engine when profiling URL: https://github.com/apache/incubator-mxnet/issues/15375#issuecomment-506907195 This thread has been derailed a lot from the original issue, and the whole discussion has been about the threaded engine, where there is no issue. I will keep this closed; feel free to open a new issue and keep the discussion focused on the naive engine.
[GitHub] [incubator-mxnet] sxjscience opened a new issue #15404: [OP] Fix SequenceMask to support different datatypes for data / sequence_length
sxjscience opened a new issue #15404: [OP] Fix SequenceMask to support different datatypes for data / sequence_length URL: https://github.com/apache/incubator-mxnet/issues/15404 Currently, the SequenceMask operator does not support different dtypes for data and sequence_length. Usually, we could set data.dtype = np.float32 and sequence_length.dtype = np.int32. However, it's not allowed due to this line: https://github.com/apache/incubator-mxnet/blob/master/src/operator/sequence_mask-inl.h#L220 We should fix it.
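To make the request concrete, here is a small NumPy sketch of the masking semantics SequenceMask implements, using float32 data with int32 lengths: the natural dtype pairing the issue asks MXNet to accept. The shapes follow the (max_len, batch) convention; the variable names are illustrative only.

```python
import numpy as np

# data: (max_len, batch) float32; sequence_length: (batch,) int32.
data = np.ones((4, 2), dtype=np.float32)
seq_len = np.array([2, 3], dtype=np.int32)  # int32 lengths, float32 data

# Keep time steps t < seq_len[b]; zero out the rest.
steps = np.arange(data.shape[0]).reshape(-1, 1)   # (max_len, 1)
mask = steps < seq_len.reshape(1, -1)             # broadcast to (max_len, batch)
masked = np.where(mask, data, np.float32(0.0))

print(masked[:, 0])  # first 2 steps kept, rest zeroed
print(masked[:, 1])  # first 3 steps kept, rest zeroed
```

As the sketch shows, nothing in the masking math requires the lengths to share the data's dtype, which is why forcing both to the same dtype (as the linked line does) is an unnecessary restriction.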
[GitHub] [incubator-mxnet] mxnet-label-bot commented on issue #15404: [OP] Fix SequenceMask to support different datatypes for data / sequence_length
mxnet-label-bot commented on issue #15404: [OP] Fix SequenceMask to support different datatypes for data / sequence_length URL: https://github.com/apache/incubator-mxnet/issues/15404#issuecomment-506906185 Hey, this is the MXNet Label Bot. Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it. Here are my recommended labels: Feature
[GitHub] [incubator-mxnet] cyrusbehr commented on issue #15393: Unable to build mxnet with OpenCV4 on Raspberry Pi 3B
cyrusbehr commented on issue #15393: Unable to build mxnet with OpenCV4 on Raspberry Pi 3B URL: https://github.com/apache/incubator-mxnet/issues/15393#issuecomment-506906010 with the verbose flag enabled:
```
[14/42] cd /home/pi/Cyrus/mxnet-static/incubator-mxnet/cpp-package/scripts && echo Running:\ OpWrapperGenerator.py && python OpWrapperGenerator.py /home/pi/Cyrus/mxnet-static/incubator-mxnet/build/libmxnet.so
FAILED: cpp-package/CMakeFiles/cpp_package_op_h ../cpp-package/include/mxnet-cpp/op.h cpp-package/MAIN_DEPENDENCY cpp-package/mxnet
cd /home/pi/Cyrus/mxnet-static/incubator-mxnet/cpp-package/scripts && echo Running:\ OpWrapperGenerator.py && python OpWrapperGenerator.py /home/pi/Cyrus/mxnet-static/incubator-mxnet/build/libmxnet.so
Running: OpWrapperGenerator.py
Traceback (most recent call last):
  File "OpWrapperGenerator.py", line 432, in
    raise(e)
OSError: /home/pi/Cyrus/mxnet-static/incubator-mxnet/build/libmxnet.so: undefined symbol: _ZN2cv5errorEiRKSsPKcS3_i
[15/42] : && /usr/bin/c++ -Wall -Wno-unknown-pragmas -Wno-sign-compare -O3 -std=c++11 -fopenmp -std=c++0x -O3 -DNDEBUG CMakeFiles/im2rec.dir/tools/im2rec.cc.o -o im2rec -rdynamic -Wl,--whole-archive libmxnet.a -Wl,--no-whole-archive -lopenblas -lrt /usr/local/lib/libopencv_core.a /usr/local/lib/libopencv_highgui.a /usr/local/lib/libopencv_imgproc.a /usr/local/lib/libopencv_imgcodecs.a -llapack /usr/local/lib/libopencv_core.a /usr/local/lib/libopencv_highgui.a /usr/local/lib/libopencv_imgproc.a /usr/local/lib/libopencv_imgcodecs.a 3rdparty/dmlc-core/libdmlc.a /usr/local/lib/libopencv_videoio.a /usr/local/lib/libopencv_imgcodecs.a /usr/local/lib/libopencv_imgproc.a /usr/local/lib/libopencv_core.a /usr/local/lib/opencv4/3rdparty/liblibwebp.a -ljpeg -lpng -ltiff -ljasper -ljpeg -lpng -ltiff -ljasper /usr/local/lib/opencv4/3rdparty/libIlmImf.a -lz -ldl -lm -lpthread /usr/local/lib/opencv4/3rdparty/libtegra_hal.a -lavcodec -lavformat -lavutil -lswscale -lgtk-3 -lgdk-3 -lpangocairo-1.0 -lpango-1.0 -latk-1.0 -lcairo-gobject -lcairo -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0 -lgthread-2.0 -lrt && :
FAILED: im2rec
: && /usr/bin/c++ -Wall -Wno-unknown-pragmas -Wno-sign-compare -O3 -std=c++11 -fopenmp -std=c++0x -O3 -DNDEBUG CMakeFiles/im2rec.dir/tools/im2rec.cc.o -o im2rec -rdynamic -Wl,--whole-archive libmxnet.a -Wl,--no-whole-archive -lopenblas -lrt /usr/local/lib/libopencv_core.a /usr/local/lib/libopencv_highgui.a /usr/local/lib/libopencv_imgproc.a /usr/local/lib/libopencv_imgcodecs.a -llapack /usr/local/lib/libopencv_core.a /usr/local/lib/libopencv_highgui.a /usr/local/lib/libopencv_imgproc.a /usr/local/lib/libopencv_imgcodecs.a 3rdparty/dmlc-core/libdmlc.a /usr/local/lib/libopencv_videoio.a /usr/local/lib/libopencv_imgcodecs.a /usr/local/lib/libopencv_imgproc.a /usr/local/lib/libopencv_core.a /usr/local/lib/opencv4/3rdparty/liblibwebp.a -ljpeg -lpng -ltiff -ljasper -ljpeg -lpng -ltiff -ljasper /usr/local/lib/opencv4/3rdparty/libIlmImf.a -lz -ldl -lm -lpthread /usr/local/lib/opencv4/3rdparty/libtegra_hal.a -lavcodec -lavformat -lavutil -lswscale -lgtk-3 -lgdk-3 -lpangocairo-1.0 -lpango-1.0 -latk-1.0 -lcairo-gobject -lcairo -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0 -lgthread-2.0 -lrt && :
CMakeFiles/im2rec.dir/tools/im2rec.cc.o: In function `main':
im2rec.cc:(.text.startup+0x1c0c): undefined reference to `cv::imencode(std::string const&, cv::_InputArray const&, std::vector >&, std::vector > const&)'
libmxnet.a(image_io.cc.o): In function `cv::Mat::Mat(int, int, int, void*, unsigned int) [clone .constprop.758]':
image_io.cc:(.text+0x104): undefined reference to `cv::error(int, std::string const&, char const*, char const*, int)'
libmxnet.a(iter_image_det_recordio.cc.o): In function `mxnet::io::ImageDetRecordIOParser::ParseNext(std::vector, std::allocator > >*)::{lambda()#1}::operator()() const':
iter_image_det_recordio.cc:(.text._ZZN5mxnet2io22ImageDetRecordIOParserIfE9ParseNextEPSt6vectorINS0_10InstVectorIfEESaIS5_EEENKUlvE_clEv[_ZZN5mxnet2io22ImageDetRecordIOParserIfE9ParseNextEPSt6vectorINS0_10InstVectorIfEESaIS5_EEENKUlvE_clEv]+0x1254): undefined reference to `cv::error(int, std::string const&, char const*, char const*, int)'
libmxnet.a(iter_image_recordio.cc.o): In function `mxnet::io::ImageRecordIOParser::ParseNext(std::vector, std::allocator > >*) [clone ._omp_fn.23]':
iter_image_recordio.cc:(.text+0x4b24): undefined reference to `cv::error(int, std::string const&, char const*, char const*, int)'
libmxnet.a(iter_image_recordio.cc.o): In function `mxnet::io::ImageRecordIOParser::ParseNext(std::vector, std::allocator > >*) [clone ._omp_fn.20]':
iter_image_recordio.cc:(.text+0x5928): undefined reference to `cv::error(int, std::string const&, char const*, char const*, int)'
libmxnet.a(iter_image_recordio_2.cc.o): In function
```
[GitHub] [incubator-mxnet] larroy edited a comment on issue #15375: Memory leak in Naive engine when profiling
larroy edited a comment on issue #15375: Memory leak in Naive engine when profiling URL: https://github.com/apache/incubator-mxnet/issues/15375#issuecomment-506904136 Did you read the title of the issue? Why are you closing this? it says the leak is in the NaiveEngine when profiling. I said the leak is in Naive Engine since the beginning. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] larroy commented on issue #15375: Memory leak in Naive engine when profiling
larroy commented on issue #15375: Memory leak in Naive engine when profiling URL: https://github.com/apache/incubator-mxnet/issues/15375#issuecomment-506904136 Did you read the title of the issue? Why are you closing this? it says the leak is in the NaiveEngine when profiling.
[GitHub] [incubator-mxnet] cyrusbehr commented on issue #15393: Unable to build mxnet with OpenCV4 on Raspberry Pi 3B
cyrusbehr commented on issue #15393: Unable to build mxnet with OpenCV4 on Raspberry Pi 3B URL: https://github.com/apache/incubator-mxnet/issues/15393#issuecomment-506903490 > Try to locate cv::error symbol in one of the opencv libraries and see if you are linking with it. Can you please elaborate more on how to do this?
[GitHub] [incubator-mxnet] cyrusbehr commented on issue #15393: Unable to build mxnet with OpenCV4 on Raspberry Pi 3B
cyrusbehr commented on issue #15393: Unable to build mxnet with OpenCV4 on Raspberry Pi 3B URL: https://github.com/apache/incubator-mxnet/issues/15393#issuecomment-506903264 Hi @larroy, yes I am compiling on the PI. Here is the result of pkg-config --libs opencv
```
/usr/lib/arm-linux-gnueabihf/libopencv_calib3d.so -lopencv_calib3d
/usr/lib/arm-linux-gnueabihf/libopencv_contrib.so -lopencv_contrib
/usr/lib/arm-linux-gnueabihf/libopencv_core.so -lopencv_core
/usr/lib/arm-linux-gnueabihf/libopencv_features2d.so -lopencv_features2d
/usr/lib/arm-linux-gnueabihf/libopencv_flann.so -lopencv_flann
/usr/lib/arm-linux-gnueabihf/libopencv_gpu.so -lopencv_gpu
/usr/lib/arm-linux-gnueabihf/libopencv_highgui.so -lopencv_highgui
/usr/lib/arm-linux-gnueabihf/libopencv_imgproc.so -lopencv_imgproc
/usr/lib/arm-linux-gnueabihf/libopencv_legacy.so -lopencv_legacy
/usr/lib/arm-linux-gnueabihf/libopencv_ml.so -lopencv_ml
/usr/lib/arm-linux-gnueabihf/libopencv_objdetect.so -lopencv_objdetect
/usr/lib/arm-linux-gnueabihf/libopencv_ocl.so -lopencv_ocl
/usr/lib/arm-linux-gnueabihf/libopencv_photo.so -lopencv_photo
/usr/lib/arm-linux-gnueabihf/libopencv_stitching.so -lopencv_stitching
/usr/lib/arm-linux-gnueabihf/libopencv_superres.so -lopencv_superres
/usr/lib/arm-linux-gnueabihf/libopencv_ts.so -lopencv_ts
/usr/lib/arm-linux-gnueabihf/libopencv_video.so -lopencv_video
/usr/lib/arm-linux-gnueabihf/libopencv_videostab.so -lopencv_videostab
```
[GitHub] [incubator-mxnet] larroy commented on issue #15393: Unable to build mxnet with OpenCV4 on Raspberry Pi 3B
larroy commented on issue #15393: Unable to build mxnet with OpenCV4 on Raspberry Pi 3B URL: https://github.com/apache/incubator-mxnet/issues/15393#issuecomment-506902723 Hi @cyrusbehr some suggestions: Can you show the output for pkg-config --libs opencv ? are you compiling in the PI? Please build with ninja -v to see linking flags. Try to locate cv::error symbol in one of the opencv libraries and see if you are linking with it.
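As a concrete answer to the "how to do this?" follow-up above, here is a minimal sketch of locating the `cv::error` symbol with standard binutils tools. The library paths are assumptions taken from the linker command earlier in the thread; adjust them to wherever your OpenCV archives actually live.

```shell
# Demangle the symbol the linker reported as undefined; note the old-ABI
# std::string (`Ss`) in the mangled name:
echo '_ZN2cv5errorEiRKSsPKcS3_i' | c++filt
# -> cv::error(int, std::string const&, char const*, char const*, int)

# Search each OpenCV archive for a *definition* of cv::error ('T' = defined
# in the text section):
for lib in /usr/local/lib/libopencv_*.a; do
    [ -e "$lib" ] || continue            # skip if the glob matched nothing
    hits=$(nm -C "$lib" 2>/dev/null | grep ' T .*cv::error' || true)
    [ -n "$hits" ] && echo "$lib defines cv::error"
done
```

If no library on the link line defines the `std::string` overload, the OpenCV being linked was built against a different C++ ABI than the one the MXNet objects expect (note that `pkg-config --libs opencv` above is picking up an older OpenCV under `/usr/lib`), which would explain the undefined references.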
[GitHub] [incubator-mxnet] anirudh2290 closed issue #15375: Memory leak in Naive engine when profiling
anirudh2290 closed issue #15375: Memory leak in Naive engine when profiling URL: https://github.com/apache/incubator-mxnet/issues/15375
[GitHub] [incubator-mxnet] anirudh2290 commented on issue #15375: Memory leak in Naive engine when profiling
anirudh2290 commented on issue #15375: Memory leak in Naive engine when profiling URL: https://github.com/apache/incubator-mxnet/issues/15375#issuecomment-506902499 Okay, I will say it again: the leak is in naive engine and not the threaded engine. Also, let me say again for the threaded engine, there is no problem here. I would request you to spend time fixing some real issues that mxnet has. I am going to close this issue, please feel free to open a new one for the naive engine memory leak.
[GitHub] [incubator-mxnet] Zha0q1 commented on issue #15403: Updating profiler tutorial to include new custom operator profiling
Zha0q1 commented on issue #15403: Updating profiler tutorial to include new custom operator profiling URL: https://github.com/apache/incubator-mxnet/pull/15403#issuecomment-506902327 > Some spelling and rephrasing... Thanks! I have committed all the suggested changes.
[GitHub] [incubator-mxnet] aaronmarkham commented on a change in pull request #15403: Updating profiler tutorial to include new custom operator profiling
aaronmarkham commented on a change in pull request #15403: Updating profiler tutorial to include new custom operator profiling URL: https://github.com/apache/incubator-mxnet/pull/15403#discussion_r298771180 ## File path: docs/tutorials/python/profiler.md ## @@ -206,6 +206,15 @@ Let's zoom in to check the time taken by operators The above picture visualizes the sequence in which the operators were executed and the time taken by each operator. +### Profiling Custom Operators +Should the existing NDArray operators fail to meet all your model's needs, MXNet supports [Custom Operators](https://mxnet.incubator.apache.org/versions/master/tutorials/gluon/customop.html) that you can define in python. In forward() and backward() of a custom operator, there are two kinds of code: `pure python` code (Numpy operators inclued) and `sub-operators` (NDArray operators called within foward() and backward()). With that said, MXNet can profile the execution time of both kinds without additional setup. More specifically, the MXNet profiler will break a single custom operator call into a `pure python` event and several `sub-operator` events if there is any. Furthermore, all those events will have a prefix in their names, which is conviniently the name of the custom operator you called. Review comment: ```suggestion Should the existing NDArray operators fail to meet all your model's needs, MXNet supports [Custom Operators](https://mxnet.incubator.apache.org/versions/master/tutorials/gluon/customop.html) that you can define in Python. In `forward()` and `backward()` of a custom operator, there are two kinds of code: "pure Python" code (NumPy operators included) and "sub-operators" (NDArray operators called within `forward()` and `backward()`). With that said, MXNet can profile the execution time of both kinds without additional setup. Specifically, the MXNet profiler will break a single custom operator call into a pure Python event and several sub-operator events if there are any. 
Furthermore, all of those events will have a prefix in their names, which is, conveniently, the name of the custom operator you called. ```
[GitHub] [incubator-mxnet] aaronmarkham commented on a change in pull request #15403: Updating profiler tutorial to include new custom operator profiling
aaronmarkham commented on a change in pull request #15403: Updating profiler tutorial to include new custom operator profiling URL: https://github.com/apache/incubator-mxnet/pull/15403#discussion_r298771254 ## File path: docs/tutorials/python/profiler.md ## @@ -206,6 +206,15 @@ Let's zoom in to check the time taken by operators The above picture visualizes the sequence in which the operators were executed and the time taken by each operator. +### Profiling Custom Operators +Should the existing NDArray operators fail to meet all your model's needs, MXNet supports [Custom Operators](https://mxnet.incubator.apache.org/versions/master/tutorials/gluon/customop.html) that you can define in python. In forward() and backward() of a custom operator, there are two kinds of code: `pure python` code (Numpy operators inclued) and `sub-operators` (NDArray operators called within foward() and backward()). With that said, MXNet can profile the execution time of both kinds without additional setup. More specifically, the MXNet profiler will break a single custom operator call into a `pure python` event and several `sub-operator` events if there is any. Furthermore, all those events will have a prefix in their names, which is conviniently the name of the custom operator you called. + +![Custom Operator Profiling Screenshot](https://cwiki.apache.org/confluence/download/attachments/118172065/image2019-6-14_15-23-42.png?version=1=1560551022000=v2) + +As shown by the sreenshot, in the `Custom Operator` domain where all the custom-operator-related events fall into, you can easily visualize the execution time of each segment of your custom operator. For example, we know that "CustomAddTwo::sqrt" is a `sub-operator` of custom operator "CustomAddTwo", and we also know when it is exectued accurately. 
Review comment: ```suggestion As shown by the screenshot, in the **Custom Operator** domain where all the custom operator-related events fall into, you can easily visualize the execution time of each segment of your custom operator. For example, we know that `CustomAddTwo::sqrt` is a sub-operator of custom operator `CustomAddTwo`, and we also know when it is executed accurately. ```
[GitHub] [incubator-mxnet] aaronmarkham commented on a change in pull request #15403: Updating profiler tutorial to include new custom operator profiling
aaronmarkham commented on a change in pull request #15403: Updating profiler tutorial to include new custom operator profiling URL: https://github.com/apache/incubator-mxnet/pull/15403#discussion_r298771371 ## File path: docs/tutorials/python/profiler.md ## @@ -206,6 +206,15 @@ Let's zoom in to check the time taken by operators The above picture visualizes the sequence in which the operators were executed and the time taken by each operator. +### Profiling Custom Operators +Should the existing NDArray operators fail to meet all your model's needs, MXNet supports [Custom Operators](https://mxnet.incubator.apache.org/versions/master/tutorials/gluon/customop.html) that you can define in python. In forward() and backward() of a custom operator, there are two kinds of code: `pure python` code (Numpy operators inclued) and `sub-operators` (NDArray operators called within foward() and backward()). With that said, MXNet can profile the execution time of both kinds without additional setup. More specifically, the MXNet profiler will break a single custom operator call into a `pure python` event and several `sub-operator` events if there is any. Furthermore, all those events will have a prefix in their names, which is conviniently the name of the custom operator you called. + +![Custom Operator Profiling Screenshot](https://cwiki.apache.org/confluence/download/attachments/118172065/image2019-6-14_15-23-42.png?version=1=1560551022000=v2) + +As shown by the sreenshot, in the `Custom Operator` domain where all the custom-operator-related events fall into, you can easily visualize the execution time of each segment of your custom operator. For example, we know that "CustomAddTwo::sqrt" is a `sub-operator` of custom operator "CustomAddTwo", and we also know when it is exectued accurately. 
+ +Please note that: to be able to see the above-dscribed information, you need to set `profile_imperative` to `True` even when you are using custom operators in [symbolic mode](https://mxnet.incubator.apache.org/versions/master/tutorials/basic/symbol.html). The reason is that within custom operators, `pure python code` and `sub-operators` are still called imperatively. Review comment: ```suggestion Please note that: to be able to see the previously described information, you need to set `profile_imperative` to `True` even when you are using custom operators in [symbolic mode](https://mxnet.incubator.apache.org/versions/master/tutorials/basic/symbol.html). The reason is that within custom operators, pure python code and sub-operators are still called imperatively. ```
[GitHub] [incubator-mxnet] larroy commented on issue #15375: Memory leak in Naive engine when profiling
larroy commented on issue #15375: Memory leak in Naive engine when profiling URL: https://github.com/apache/incubator-mxnet/issues/15375#issuecomment-506900904 > I have spent a lot of time here, having this discussion, on something that hardly adds any value. I have said all I have to say. Hi @anirudh2290 I don't think it is very constructive to conclude your reply with that, if you are asking questions. I was trying to understand your point and explain my proposal. If you don't have time to follow up, no need to start a discussion in the first place. I just made a suggestion on this ticket to prevent similar kinds of leaks in other places using this code. I respect that you have your own opinion, but for me having unmanaged pointers in many places with a complex delete pattern creates an environment where reasoning about the code is hard and prone to bugs. Saying that something hardly adds any value is not very constructive if you don't try to understand the problem first. Cheers.
[GitHub] [incubator-mxnet] tklein23 commented on issue #15361: [WIP] Exclude external dependencies from MXNet JAR.
tklein23 commented on issue #15361: [WIP] Exclude external dependencies from MXNet JAR. URL: https://github.com/apache/incubator-mxnet/pull/15361#issuecomment-506897708 Since I cannot reproduce the failing build locally, doing a bit of trial-and-error now: Adding the dependencies to clojure, since this is what's failing.
[GitHub] [incubator-mxnet] access2rohit edited a comment on issue #15360: Revert default return type for indices in argsort() and topk() to fp32
access2rohit edited a comment on issue #15360: Revert default return type for indices in argsort() and topk() to fp32 URL: https://github.com/apache/incubator-mxnet/pull/15360#issuecomment-506884725 @apeforest I don't need to do that. These tests are correct. I just updated my PR with correct macro after rerunning the nightly tests for sort, argsort and topk. The only problem was due to incorrect macro being used earlier that I have now fixed.
[GitHub] [incubator-mxnet] anirudh2290 commented on issue #15375: Memory leak in Naive engine when profiling
anirudh2290 commented on issue #15375: Memory leak in Naive engine when profiling URL: https://github.com/apache/incubator-mxnet/issues/15375#issuecomment-506886676 First, when you talk about the leak, you talk about naive, which does things differently from threaded. Coming to threaded: what will be the scope and lifetime of the unique_ptr that you will be creating, and how will the dependency engine know when it has to be deleted? It has an individual view of operators, not the full graph view. I have spent a lot of time here, having this discussion, on something that hardly adds any value. I have said all I have to say.
[GitHub] [incubator-mxnet] larroy commented on a change in pull request #14779: Fully connected, higher order grad
larroy commented on a change in pull request #14779: Fully connected, higher order grad URL: https://github.com/apache/incubator-mxnet/pull/14779#discussion_r298756303 ## File path: tests/python/unittest/test_higher_order_grad.py ##
```diff
@@ -129,6 +135,83 @@ def check_second_order_unary(x, op, grad_grad_op):
     # Validate the gradients.
     assert_almost_equal(expected_grad_grad, x.grad.asnumpy())

+class RandomShapes(object):
+    def __init__(self, dim, startdim=1):
+        self.dim = dim
+        self.curdim = startdim
+
+    def __iter__(self):
+        return self
+
+    def next(self):
+        return self.__next__()
+
+    def __next__(self):
+        if self.curdim > self.dim:
+            raise StopIteration
+        shape = rand_shape_nd(self.curdim)
+        x = nd.random.normal(shape=shape)
+        self.curdim += 1
+        return x
+
+
+@with_seed()
+def test_dense_backward():
+    for x in RandomShapes(4,2):
+        net = gluon.nn.Sequential()
+        with net.name_scope():
+            net.add(gluon.nn.Dense(1))
+
+        net.initialize(mxnet.initializer.Constant(.5))
+        x.attach_grad()
+        with ag.record():
+            y = net.forward(x)
+            x_grad = ag.grad(heads=y, variables=x, create_graph=True, retain_graph=True)[0]
+        x_grad.backward()
+        same(x.grad, nd.zeros(4))
+
+        with ag.record():
+            y = net.forward(x)
+            x_grad = ag.grad(heads=y, variables=x, create_graph=True, retain_graph=True)[0]
+            random_multiplier = nd.random.uniform_like(x_grad)
+            z = (random_multiplier * x_grad).sum()
+        z.backward()
+        same(x.grad, nd.zeros(4))
+
+        with ag.record():
```
Review comment: numerical gradient, changing w and verifying that the first gradient changes but not the second one.
[GitHub] [incubator-mxnet] access2rohit commented on issue #15360: Revert default return type for indices in argsort() and topk() to fp32
access2rohit commented on issue #15360: Revert default return type for indices in argsort() and topk() to fp32 URL: https://github.com/apache/incubator-mxnet/pull/15360#issuecomment-506884725 I don't need to do that. These tests are correct. I just updated my PR with correct macro after rerunning the nightly tests for sort, argsort and topk. The only problem was due to incorrect macro
[GitHub] [incubator-mxnet] sxjscience edited a comment on issue #14779: Fully connected, higher order grad
sxjscience edited a comment on issue #14779: Fully connected, higher order grad URL: https://github.com/apache/incubator-mxnet/pull/14779#issuecomment-506883467 @larroy You may also refer to Line-7 of Algorithm 1 in https://arxiv.org/pdf/1704.00028.pdf . When implementing the gradient penalty, we should compute something like: $\nabla_w g\left(\nabla_x l(x, w)\right)$ Here, `g` and `l` are both functions that return a scalar.
[GitHub] [incubator-mxnet] sxjscience edited a comment on issue #14779: Fully connected, higher order grad
sxjscience edited a comment on issue #14779: Fully connected, higher order grad URL: https://github.com/apache/incubator-mxnet/pull/14779#issuecomment-506883467 @larroy You may also refer to Line-7 of Algorithm 1 in https://arxiv.org/pdf/1704.00028.pdf . When implementing the gradient penalty, we should compute something like: $\nabla_w g\left(\nabla_x l(x, w)\right)$
[GitHub] [incubator-mxnet] sxjscience commented on issue #14779: Fully connected, higher order grad
sxjscience commented on issue #14779: Fully connected, higher order grad URL: https://github.com/apache/incubator-mxnet/pull/14779#issuecomment-506883467 @larroy You may also refer to Line-7 of Algorithm 1 in https://arxiv.org/pdf/1704.00028.pdf . When implementing the gradient penalty, we should compute something like: $\nabla_w g\left(\nabla_x l(x, w)\right)$
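For context, the paper cited in the comment above (arXiv:1704.00028, "Improved Training of Wasserstein GANs") defines the gradient-penalty critic loss below; updating the critic parameters $w$ on this loss is exactly the $\nabla_w g(\nabla_x l(x, w))$ pattern, since the penalty is a scalar function $g$ of the inner gradient:

```latex
% Critic loss with gradient penalty (Algorithm 1 of arXiv:1704.00028):
L \;=\; \mathbb{E}_{\tilde{x}\sim\mathbb{P}_g}\!\left[D(\tilde{x})\right]
      \;-\; \mathbb{E}_{x\sim\mathbb{P}_r}\!\left[D(x)\right]
      \;+\; \lambda\,\mathbb{E}_{\hat{x}\sim\mathbb{P}_{\hat{x}}}
            \!\left[\bigl(\lVert\nabla_{\hat{x}} D(\hat{x})\rVert_2 - 1\bigr)^2\right]
% Differentiating the penalty term with respect to the critic weights w gives
% \nabla_w\, g\bigl(\nabla_{\hat{x}} D(\hat{x}; w)\bigr) with
% g(v) = \lambda\,(\lVert v\rVert_2 - 1)^2, i.e. a gradient of a gradient,
% which is why FullyConnected needs higher-order grad support.
```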
[GitHub] [incubator-mxnet] larroy commented on issue #15375: Memory leak in Naive engine when profiling
larroy commented on issue #15375: Memory leak in Naive engine when profiling URL: https://github.com/apache/incubator-mxnet/issues/15375#issuecomment-506881339 Maybe you know the code better and I'm wrong; my proposal was to wrap the operator in a unique_ptr with a custom deleter with a virtual apply operator, so delete doesn't need to be called here: https://github.com/apache/incubator-mxnet/blob/master/src/engine/threaded_engine.cc#L472 But it would still use the object pool; this wouldn't be changed. Did I understand correctly that you think this is not possible, or that it is better to explicitly delete the operator in this case? The leak would not exist in the first place if you always passed around a smart pointer for the operator; that was my point. Maybe I'm missing something which really needs the raw pointer around, but couldn't you pass around a unique_ptr that gets freed to the pool when needed? Having the pointer deleted manually is not exactly exception safe.
[GitHub] [incubator-mxnet] larroy edited a comment on issue #15375: Memory leak in Naive engine when profiling
larroy edited a comment on issue #15375: Memory leak in Naive engine when profiling URL: https://github.com/apache/incubator-mxnet/issues/15375#issuecomment-506881339 Maybe you know the code better and what I propose doesn't make sense; my proposal was to wrap the operator in a unique_ptr with a custom deleter with a virtual apply operator, so delete doesn't need to be called here: https://github.com/apache/incubator-mxnet/blob/master/src/engine/threaded_engine.cc#L472 But it would still use the object pool; this wouldn't be changed. Did I understand correctly that you think this is not possible, or that it is better to explicitly delete the operator in this case? The leak would not exist in the first place if you always passed around a smart pointer for the operator; that was my point. Maybe I'm missing something which really needs the raw pointer around, but couldn't you pass around a unique_ptr that gets freed to the pool when needed? Having the pointer deleted manually is not exactly exception safe.
[incubator-mxnet] branch master updated (11e6d45 -> ca565a0)
This is an automated email from the ASF dual-hosted git repository. ptrendx pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from 11e6d45 [AMP] Move topk from FP16_FP32_FUNCS to FP32_FUNCS (#15342) add ca565a0 Conversion from FP32 model to Mixed Precision model (#15118) No new revisions were added by this update. Summary of changes:
```
 docs/tutorials/amp/amp_tutorial.md            |  42 ++
 .../automatic-mixed-precision}/README.md      |  21 +-
 .../amp_model_conversion.py                   | 119 ++
 .../common                                    |   0
 include/mxnet/c_api.h                         |  49 +++
 python/mxnet/contrib/amp/amp.py               | 353 -
 python/mxnet/gluon/parameter.py               |  44 ++-
 python/mxnet/module/executor_group.py         |   8 +
 python/mxnet/test_utils.py                    |  83
 src/c_api/c_api_symbolic.cc                   | 204 ++
 src/nnvm/amp_infer_unknown.cc                 | 148 +++
 src/nnvm/low_precision_pass.cc                | 261 +
 tests/python/gpu/test_contrib_amp.py          | 428 +
 tests/python/tensorrt/test_tensorrt_lenet5.py |   2 +-
 tests/python/unittest/test_contrib_amp.py     |  85
 15 files changed, 1739 insertions(+), 108 deletions(-)
 copy {python => example/automatic-mixed-precision}/README.md (51%)
 create mode 100644 example/automatic-mixed-precision/amp_model_conversion.py
 copy example/{quantization => automatic-mixed-precision}/common (100%)
 create mode 100644 src/nnvm/amp_infer_unknown.cc
 create mode 100644 src/nnvm/low_precision_pass.cc
 create mode 100644 tests/python/gpu/test_contrib_amp.py
 delete mode 100644 tests/python/unittest/test_contrib_amp.py
```
[incubator-mxnet] branch master updated (8aaacde -> 11e6d45)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from 8aaacde Update sparse_retain Documentation (#15394) add 11e6d45 [AMP] Move topk from FP16_FP32_FUNCS to FP32_FUNCS (#15342) No new revisions were added by this update. Summary of changes: python/mxnet/contrib/amp/lists/symbol.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[GitHub] [incubator-mxnet] ptrendx merged pull request #15118: Conversion from FP32 model to Mixed Precision model
ptrendx merged pull request #15118: Conversion from FP32 model to Mixed Precision model URL: https://github.com/apache/incubator-mxnet/pull/15118 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] ptrendx closed issue #14978: Flaky Test: test_tensorrt_lenet5.test_tensorrt_inference
ptrendx closed issue #14978: Flaky Test: test_tensorrt_lenet5.test_tensorrt_inference URL: https://github.com/apache/incubator-mxnet/issues/14978
[GitHub] [incubator-mxnet] ptrendx closed issue #14584: Conversion from FP32 to Mixed Precision Models
ptrendx closed issue #14584: Conversion from FP32 to Mixed Precision Models URL: https://github.com/apache/incubator-mxnet/issues/14584
[GitHub] [incubator-mxnet] eric-haibin-lin merged pull request #15342: [AMP] Move topk from FP16_FP32_FUNCS to FP32_FUNCS
eric-haibin-lin merged pull request #15342: [AMP] Move topk from FP16_FP32_FUNCS to FP32_FUNCS URL: https://github.com/apache/incubator-mxnet/pull/15342
[GitHub] [incubator-mxnet] sandeep-krishnamurthy commented on issue #15403: Updating profiler tutorial to include new custom operator profiling
sandeep-krishnamurthy commented on issue #15403: Updating profiler tutorial to include new custom operator profiling URL: https://github.com/apache/incubator-mxnet/pull/15403#issuecomment-506877413 @aaronmarkham - Can you please help review? Thanks.
[GitHub] [incubator-mxnet] apeforest commented on a change in pull request #14779: Fully connected, higher order grad
apeforest commented on a change in pull request #14779: Fully connected, higher order grad URL: https://github.com/apache/incubator-mxnet/pull/14779#discussion_r298748450

## File path: tests/python/unittest/test_higher_order_grad.py ##

@@ -129,6 +131,44 @@ def check_second_order_unary(x, op, grad_grad_op):
     # Validate the gradients.
     assert_almost_equal(expected_grad_grad, x.grad.asnumpy())
 
+class RandomShapes(object):
+    def __init__(self, dim):
+        self.dim = dim
+        self.curdim = 1
+
+    def __iter__(self):
+        return self
+
+    def next(self):
+        return self.__next__()
+
+    def __next__(self):
+        if self.curdim > self.dim:
+            raise StopIteration
+        shape = rand_shape_nd(self.curdim)
+        print(shape)
+        x = nd.random.normal(shape=shape)
+        self.curdim += 1
+        return x
+
+
+@with_seed()
+def test_dense_backward():

Review comment: The reason I was suggesting to test the FullyConnected operator itself is that you could then reuse the `check_second_order_unary` utility method as all the other operators do. Having to write an ad hoc test for each operator is not scalable and is error-prone.
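The reuse apeforest is pointing at can be illustrated outside MXNet. Below is a NumPy-only sketch of the pattern (the helper name mirrors `check_second_order_unary`, but the body here is a hypothetical finite-difference stand-in, not the actual autograd-based helper in test_higher_order_grad.py): one generic routine compares an analytic second derivative against a numerical estimate, so each operator's test reduces to supplying `op` and `grad_grad_op`.

```python
import numpy as np

def check_second_order_unary(x, op, grad_grad_op, h=1e-3):
    # Generic second-order check: compare the analytic second derivative
    # (grad_grad_op) against a central finite-difference estimate of
    # d^2 op / dx^2. `h` and the tolerance are illustrative choices.
    x = np.asarray(x, dtype=np.float64)
    numeric = (op(x + h) - 2.0 * op(x) + op(x - h)) / h**2
    analytic = grad_grad_op(x)
    assert np.allclose(numeric, analytic, atol=1e-4)

# Per-operator tests then become one-liners, e.g. (sin)'' = -sin:
check_second_order_unary([0.1, 0.5, 1.0], np.sin, lambda x: -np.sin(x))
```

Each new operator only contributes its derivative pair, which is the scalability argument being made in the review.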
[GitHub] [incubator-mxnet] anirudh2290 commented on issue #15375: Memory leak in Naive engine when profiling
anirudh2290 commented on issue #15375: Memory leak in Naive engine when profiling URL: https://github.com/apache/incubator-mxnet/issues/15375#issuecomment-506877531

> it makes sense to you because you know the code

So who are you targeting, someone unfamiliar with the code? Everyone needs to spend some time with the code to understand it, but after they spend that time it makes sense. I know the benefits of RAII, but it doesn't apply to this use case of the threaded engine. I am saying there is no need to change the threaded engine code here, which you are proposing to do. RAII won't benefit here, because the engine itself is only responsible for indicating to the object pool that the object has to be deleted. The rest is taken care of by the object pool itself. As I said, if you really want to add RAII, do it in the graph executor (although I don't think it is really needed there either).
[GitHub] [incubator-mxnet] apeforest commented on a change in pull request #14779: Fully connected, higher order grad
apeforest commented on a change in pull request #14779: Fully connected, higher order grad URL: https://github.com/apache/incubator-mxnet/pull/14779#discussion_r298748009

## File path: tests/python/unittest/test_higher_order_grad.py ##

@@ -129,6 +135,83 @@ def check_second_order_unary(x, op, grad_grad_op):
     # Validate the gradients.
     assert_almost_equal(expected_grad_grad, x.grad.asnumpy())
 
+class RandomShapes(object):
+    def __init__(self, dim, startdim=1):
+        self.dim = dim
+        self.curdim = startdim
+
+    def __iter__(self):
+        return self
+
+    def next(self):
+        return self.__next__()
+
+    def __next__(self):
+        if self.curdim > self.dim:
+            raise StopIteration
+        shape = rand_shape_nd(self.curdim)
+        x = nd.random.normal(shape=shape)
+        self.curdim += 1
+        return x
+
+
+@with_seed()
+def test_dense_backward():
+    for x in RandomShapes(4, 2):
+        net = gluon.nn.Sequential()
+        with net.name_scope():
+            net.add(gluon.nn.Dense(1))
+
+        net.initialize(mxnet.initializer.Constant(.5))
+        x.attach_grad()
+        with ag.record():
+            y = net.forward(x)
+            x_grad = ag.grad(heads=y, variables=x, create_graph=True, retain_graph=True)[0]
+        x_grad.backward()
+        same(x.grad, nd.zeros(4))
+
+        with ag.record():
+            y = net.forward(x)
+            x_grad = ag.grad(heads=y, variables=x, create_graph=True, retain_graph=True)[0]
+            random_multiplier = nd.random.uniform_like(x_grad)
+            z = (random_multiplier * x_grad).sum()
+        z.backward()
+        same(x.grad, nd.zeros(4))
+
+        with ag.record():
+            y = net.forward(x)
+            x_grad_0 = ag.grad(heads=y, variables=x, create_graph=True, retain_graph=True)[0]
+        x_grad_grad_0 = x.grad
+
+        w_0 = list(net.collect_params().values())[0].data()
+        h_w = nd.ones_like(w_0) * 0.01
+        net.initialize(mxnet.initializer.Constant(w_0 + h_w), force_reinit=True)
+        w_1 = list(net.collect_params().values())[0].data()
+        with ag.record():
+            y = net.forward(x)
+            x_grad_1 = ag.grad(heads=y, variables=x, create_graph=True, retain_graph=True)[0]
+        x_grad_1.backward()
+        x_grad_grad_1 = x.grad
+        ok_(not np.array_equal(x_grad_0, x_grad_1))
+        ok_(np.array_equal(x_grad_grad_0, x_grad_grad_1))
+
+        w = list(net.collect_params().values())[0].data()
+        with ag.record():
+            y = net.forward(x)
+            w_grad_0 = ag.grad(heads=y, variables=w, create_graph=True, retain_graph=True)[0]
+        w_grad_0.backward()
+        w_grad_grad_0 = w.grad
+
+        x = x + nd.ones_like(x) * 0.01
+        with ag.record():

Review comment: What are you trying to test here? Maybe break the `with` blocks in this method into multiple methods so we know the purpose of each test?
[GitHub] [incubator-mxnet] apeforest commented on a change in pull request #14779: Fully connected, higher order grad
apeforest commented on a change in pull request #14779: Fully connected, higher order grad URL: https://github.com/apache/incubator-mxnet/pull/14779#discussion_r298747764

## File path: tests/python/unittest/test_higher_order_grad.py ##

@@ -129,6 +135,83 @@ def check_second_order_unary(x, op, grad_grad_op):
     # Validate the gradients.
     assert_almost_equal(expected_grad_grad, x.grad.asnumpy())
 
+class RandomShapes(object):
+    def __init__(self, dim, startdim=1):
+        self.dim = dim
+        self.curdim = startdim
+
+    def __iter__(self):
+        return self
+
+    def next(self):
+        return self.__next__()
+
+    def __next__(self):
+        if self.curdim > self.dim:
+            raise StopIteration
+        shape = rand_shape_nd(self.curdim)
+        x = nd.random.normal(shape=shape)
+        self.curdim += 1
+        return x
+
+
+@with_seed()
+def test_dense_backward():
+    for x in RandomShapes(4, 2):
+        net = gluon.nn.Sequential()
+        with net.name_scope():
+            net.add(gluon.nn.Dense(1))
+
+        net.initialize(mxnet.initializer.Constant(.5))
+        x.attach_grad()
+        with ag.record():
+            y = net.forward(x)
+            x_grad = ag.grad(heads=y, variables=x, create_graph=True, retain_graph=True)[0]
+        x_grad.backward()
+        same(x.grad, nd.zeros(4))
+
+        with ag.record():
+            y = net.forward(x)
+            x_grad = ag.grad(heads=y, variables=x, create_graph=True, retain_graph=True)[0]
+            random_multiplier = nd.random.uniform_like(x_grad)
+            z = (random_multiplier * x_grad).sum()
+        z.backward()
+        same(x.grad, nd.zeros(4))
+
+        with ag.record():

Review comment: What are you trying to test here?
[GitHub] [incubator-mxnet] Zha0q1 opened a new pull request #15403: Updating profiler tutorial to include new custom operator profiling
Zha0q1 opened a new pull request #15403: Updating profiler tutorial to include new custom operator profiling URL: https://github.com/apache/incubator-mxnet/pull/15403

## Description ##
This PR updates the profiler tutorial to include a new section on profiling custom operators. This is a reflection of my work here: https://github.com/apache/incubator-mxnet/pull/15210

## Checklist ##
### Essentials ###
Please feel free to remove inapplicable items for your PR.
- [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) created (except PRs with tiny changes)
- [ ] Changes are complete (i.e. I finished coding on this PR)
- [ ] All changes have test coverage:
  - Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  - Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  - Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
- [ ] Code is well-documented:
  - For user-facing API changes, the API doc string has been updated.
  - For new C++ functions in header files, their functionality and arguments are documented.
  - For new examples, a README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
  - Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
- [ ] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

### Changes ###
- [ ] Feature1, tests, (and when applicable, API doc)
- [ ] Feature2, tests, (and when applicable, API doc)

## Comments ##
- If this change is a backward incompatible change, why must this change be made?
- Interesting edge cases to note here
[GitHub] [incubator-mxnet] apeforest commented on a change in pull request #14779: Fully connected, higher order grad
apeforest commented on a change in pull request #14779: Fully connected, higher order grad URL: https://github.com/apache/incubator-mxnet/pull/14779#discussion_r298747388

## File path: tests/python/unittest/test_higher_order_grad.py ##

@@ -18,8 +18,14 @@
 import math
 from mxnet import nd, autograd
-from mxnet.test_utils import assert_almost_equal, random_arrays, rand_shape_nd
+from mxnet.test_utils import assert_almost_equal, random_arrays, rand_shape_nd, same
 from common import with_seed
+import mxnet.autograd as ag
+import mxnet.ndarray as nd
+from mxnet import gluon
+import mxnet
+from nose.tools import ok_

Review comment: this module should be imported before the mxnet modules.
[GitHub] [incubator-mxnet] apeforest commented on a change in pull request #14779: Fully connected, higher order grad
apeforest commented on a change in pull request #14779: Fully connected, higher order grad URL: https://github.com/apache/incubator-mxnet/pull/14779#discussion_r298747266

## File path: tests/python/unittest/test_higher_order_grad.py ##

@@ -18,8 +18,14 @@
 import math
 from mxnet import nd, autograd
-from mxnet.test_utils import assert_almost_equal, random_arrays, rand_shape_nd
+from mxnet.test_utils import assert_almost_equal, random_arrays, rand_shape_nd, same
 from common import with_seed
+import mxnet.autograd as ag
+import mxnet.ndarray as nd

Review comment: many of these imports are duplicated. Could you please keep only the required ones?
[GitHub] [incubator-mxnet] larroy commented on issue #15375: Memory leak in Naive engine when profiling
larroy commented on issue #15375: Memory leak in Naive engine when profiling URL: https://github.com/apache/incubator-mxnet/issues/15375#issuecomment-506873322 @anirudh2290 it makes sense to you because you know the code; I would encourage you to read about the benefits of RAII. This is orthogonal to having an object pool for allocation, which I was not proposing to change.
[GitHub] [incubator-mxnet] larroy commented on issue #15402: Recent 1.5 pip wheel not using all the cores
larroy commented on issue #15402: Recent 1.5 pip wheel not using all the cores URL: https://github.com/apache/incubator-mxnet/issues/15402#issuecomment-506872767 @mxnet-label-bot add [Question]
[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.
This is an automated email from the ASF dual-hosted git repository. marcoabreu pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git

The following commit(s) were added to refs/heads/asf-site by this push:
     new a985b70  Bump the publish timestamp.

a985b70 is described below

commit a985b707b69f1c3fb02c75a2cae82c59789231bf
Author: mxnet-ci
AuthorDate: Fri Jun 28 20:47:34 2019 +

    Bump the publish timestamp.
---
 date.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/date.txt b/date.txt
new file mode 100644
index 000..009ea82
--- /dev/null
+++ b/date.txt
@@ -0,0 +1 @@
+Fri Jun 28 20:47:34 UTC 2019
[GitHub] [incubator-mxnet] mxnet-label-bot commented on issue #15402: Recent 1.5 pip wheel not using all the cores
mxnet-label-bot commented on issue #15402: Recent 1.5 pip wheel not using all the cores URL: https://github.com/apache/incubator-mxnet/issues/15402#issuecomment-506871598 Hey, this is the MXNet Label Bot. Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it.
[GitHub] [incubator-mxnet] larroy opened a new issue #15402: Recent 1.5 pip wheel not using all the cores
larroy opened a new issue #15402: Recent 1.5 pip wheel not using all the cores URL: https://github.com/apache/incubator-mxnet/issues/15402 Using the pip package yields subpar core utilization, and the wheel is linked with libgomp. See a more detailed description on the mailing list: https://lists.apache.org/thread.html/dfc334cc6481a0d3632f63d8bbcebb6c6b5fa699d3dad005e5e85f00@%3Cdev.mxnet.apache.org%3E
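One quick, framework-independent way to sanity-check core utilization like the one reported here is to time a BLAS-heavy workload with the OpenMP thread count pinned versus unpinned. This sketch only assumes NumPy; `timed_matmul` is a hypothetical helper, and `OMP_NUM_THREADS`/`MKL_NUM_THREADS` are the standard OpenMP/MKL knobs (they must be set before the interpreter starts for most builds):

```python
import time
import numpy as np

def timed_matmul(n=1000, repeats=3):
    """Time repeated n x n matrix multiplies. Run once launched with
    OMP_NUM_THREADS=1 and once with it unset; if the two times are
    similar, the BLAS/OpenMP build is likely not using all cores."""
    a = np.random.rand(n, n).astype(np.float32)
    t0 = time.time()
    for _ in range(repeats):
        a @ a  # BLAS sgemm, the part that should parallelize
    return time.time() - t0

print('elapsed: %.3fs' % timed_matmul())
```

Comparing the two elapsed times gives a rough scaling factor without touching MXNet itself.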
[GitHub] [incubator-mxnet] Zha0q1 closed pull request #15401: Custom op profiling
Zha0q1 closed pull request #15401: Custom op profiling URL: https://github.com/apache/incubator-mxnet/pull/15401
[GitHub] [incubator-mxnet] Zha0q1 opened a new pull request #15401: Custom op profiling
Zha0q1 opened a new pull request #15401: Custom op profiling URL: https://github.com/apache/incubator-mxnet/pull/15401

## Description ##
This PR updates the profiler tutorial to include how to profile custom operators. This is a reflection of my work here: https://github.com/apache/incubator-mxnet/pull/15210.

## Checklist ##
### Essentials ###
Please feel free to remove inapplicable items for your PR.
- [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) created (except PRs with tiny changes)
- [ ] Changes are complete (i.e. I finished coding on this PR)
- [ ] All changes have test coverage:
  - Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  - Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  - Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
- [ ] Code is well-documented:
  - For user-facing API changes, the API doc string has been updated.
  - For new C++ functions in header files, their functionality and arguments are documented.
  - For new examples, a README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
  - Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
- [ ] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

### Changes ###
- [ ] Feature1, tests, (and when applicable, API doc)
- [ ] Feature2, tests, (and when applicable, API doc)

## Comments ##
- If this change is a backward incompatible change, why must this change be made?
- Interesting edge cases to note here
[GitHub] [incubator-mxnet] eric-haibin-lin commented on a change in pull request #15400: Add a new arange_like operator to contrib
eric-haibin-lin commented on a change in pull request #15400: Add a new arange_like operator to contrib URL: https://github.com/apache/incubator-mxnet/pull/15400#discussion_r298735194

## File path: tests/python/unittest/test_operator.py ##

@@ -4041,11 +4041,22 @@ def test_arange_inferstop():
     exe.forward()
     assert_almost_equal(exe.outputs[0].asnumpy(), np.array([0,1,2,3,4]))
 
+def test_arange_like():
+    shape_list = [(10,), (10, 20), (10, 20, 30), (10, 20, 30, 40)]
+    axis = 0

Review comment: Can we also test axis=None, axis=-1?
[GitHub] [incubator-mxnet] eric-haibin-lin commented on a change in pull request #15400: Add a new arange_like operator to contrib
eric-haibin-lin commented on a change in pull request #15400: Add a new arange_like operator to contrib URL: https://github.com/apache/incubator-mxnet/pull/15400#discussion_r298735717

## File path: src/operator/tensor/init_op.h ##

@@ -588,6 +622,28 @@ inline bool LinspaceShape(const nnvm::NodeAttrs& attrs,
   return true;
 }
 
+inline bool RangeLikeShape(const nnvm::NodeAttrs& attrs,
+                           mxnet::ShapeVector *in_attrs,
+                           mxnet::ShapeVector *out_attrs) {
+  const RangeLikeParam& param = nnvm::get<RangeLikeParam>(attrs.parsed);

Review comment: Could you add a check so that the inferred shape based on start, step and repeat is consistent with the shape inferred by axis?
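The consistency check being requested is easiest to state as a reference implementation. The following NumPy sketch encodes one plausible reading of the proposed `contrib.arange_like` semantics (an assumption based on this thread, not the merged operator): with `axis=None` the output covers every element of the input shape, otherwise its length is the size of the chosen axis, and `start`/`step`/`repeat` determine the values exactly as in `nd.arange`:

```python
import numpy as np

def arange_like_reference(shape, start=0.0, step=1.0, repeat=1, axis=None):
    # Assumed semantics (hypothetical helper, for illustration only):
    # the expected element count falls out of the axis choice, and the
    # values out of start/step/repeat, which is the cross-check the
    # review asks RangeLikeShape to enforce.
    size = int(np.prod(shape)) if axis is None else shape[axis]
    num = -(-size // repeat)              # ceil: distinct values needed
    vals = np.repeat(start + step * np.arange(num), repeat)[:size]
    return vals.reshape(shape) if axis is None else vals

# axis=-1 on a (2, 3) input yields 3 values; axis=None fills all 6.
print(arange_like_reference((2, 3), axis=-1))  # [0. 1. 2.]
```

A shape-inference check along these lines would reject parameter combinations where the start/step/repeat arithmetic cannot tile the axis-derived size.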
[GitHub] [incubator-mxnet] eric-haibin-lin commented on a change in pull request #15400: Add a new arange_like operator to contrib
eric-haibin-lin commented on a change in pull request #15400: Add a new arange_like operator to contrib URL: https://github.com/apache/incubator-mxnet/pull/15400#discussion_r298735388

## File path: src/operator/tensor/init_op.h ##

@@ -250,12 +284,12 @@ inline bool InitShape(const nnvm::NodeAttrs& attrs,
   return shape_is_known(out_attrs->at(0));
 }
 
-template
+template

Review comment: nit: changing `inum` to `num_in` is more readable
[GitHub] [incubator-mxnet] eric-haibin-lin commented on a change in pull request #15400: Add a new arange_like operator to contrib
eric-haibin-lin commented on a change in pull request #15400: Add a new arange_like operator to contrib URL: https://github.com/apache/incubator-mxnet/pull/15400#discussion_r298734450

## File path: src/operator/tensor/init_op.h ##

@@ -174,6 +174,40 @@ struct RangeParam : public dmlc::Parameter<RangeParam> {
   }
 };
 
+struct RangeLikeParam : public dmlc::Parameter<RangeLikeParam> {
+  double start;
+  double step;
+  int repeat;
+  std::string ctx;
+  int dtype;
+  dmlc::optional<int> axis;
+
+  DMLC_DECLARE_PARAMETER(RangeLikeParam) {
+    DMLC_DECLARE_FIELD(start)
+    .set_default(0)
+    .describe("Start of interval. The interval includes this value. The default start value is 0.");
+    DMLC_DECLARE_FIELD(step)
+    .set_default(1)
+    .describe("Spacing between values.");
+    DMLC_DECLARE_FIELD(repeat)
+    .set_default(1)
+    .describe("The repeating time of all elements."
+              " E.g repeat=3, the element a will be repeated three times --> a, a, a.");

Review comment: Unfortunately `nd.arange` has a different interface compared to numpy.arange. I think it's reasonable to let `contrib.arange_like` have an interface similar to `nd.arange`. If we want to remove the repeat argument, it's best done when we add the arange op to the `mx.np` namespace in the numpy project. @haojin2 @reminisce FYI, the nd.arange operator interface is inconsistent with numpy.
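For readers outside the thread: the `repeat` field follows `mx.nd.arange`, which (per the `describe` string quoted above) emits each value `repeat` times in place, something `numpy.arange` has no parameter for. A plain-NumPy equivalent (the helper name is ours, for illustration):

```python
import numpy as np

def nd_arange_equivalent(start, stop, step=1.0, repeat=1):
    # mx.nd.arange-style repeat: each element appears `repeat` times
    # in a row (a, a, a, b, b, b, ...), which numpy.arange alone
    # cannot express; np.repeat supplies the missing piece.
    return np.repeat(np.arange(start, stop, step), repeat)

print(nd_arange_equivalent(0, 3, repeat=2))  # [0. 0. 1. 1. 2. 2.]
```

This is the interface divergence the comment flags: keeping `repeat` makes `contrib.arange_like` match `nd.arange`, while dropping it would move toward numpy compatibility.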
[GitHub] [incubator-mxnet] larroy commented on issue #15375: Memory leak in Naive engine when profiling
larroy commented on issue #15375: Memory leak in Naive engine when profiling URL: https://github.com/apache/incubator-mxnet/issues/15375#issuecomment-506863130 I will make a small fix to the leak in Naive engine, and a separate PR with a refactor, so we can have a more focused and productive conversation then. Thanks for your feedback.
[GitHub] [incubator-mxnet] vdantu commented on issue #14283: undefined reference to `cv::_InputArray
vdantu commented on issue #14283: undefined reference to `cv::_InputArray URL: https://github.com/apache/incubator-mxnet/issues/14283#issuecomment-506858767 @HaichaoZhu: Thanks for sharing this config file. I was able to build the latest MXNet with this config file.

```
ldd libmxnet.so
...
libopencv_highgui.so.2.4 => /usr/lib/x86_64-linux-gnu/libopencv_highgui.so.2.4 (0x7f8a1286)
libopencv_imgproc.so.2.4 => /usr/lib/x86_64-linux-gnu/libopencv_imgproc.so.2.4 (0x7f8a123d5000)
libopencv_core.so.2.4 => /usr/lib/x86_64-linux-gnu/libopencv_core.so.2.4 (0x7f8a11fab000)
...
```

I am wondering how we can reproduce this further.
[GitHub] [incubator-mxnet] MyraBaba commented on issue #13782: Segmentation Fault on the Raspberry pi 3
MyraBaba commented on issue #13782: Segmentation Fault on the Raspberry pi 3 URL: https://github.com/apache/incubator-mxnet/issues/13782#issuecomment-506854965 @vdantu Small models can fit in memory; with Python, MobileNet is OK. Other, bigger models can't fit in memory, which causes the error.
[GitHub] [incubator-mxnet] vdantu commented on issue #13782: Segmentation Fault on the Raspberry pi 3
vdantu commented on issue #13782: Segmentation Fault on the Raspberry pi 3 URL: https://github.com/apache/incubator-mxnet/issues/13782#issuecomment-506852477 @MyraBaba: Were you able to get an answer to the failure? Wondering how we could be of any help.
[GitHub] [incubator-mxnet] anirudh2290 commented on issue #15375: Memory leak in Naive engine when profiling
anirudh2290 commented on issue #15375: Memory leak in Naive engine when profiling URL: https://github.com/apache/incubator-mxnet/issues/15375#issuecomment-506851797 @larroy even less encouraging is having you make a PR and then blocking it. I was trying to save you some time. Ownership of the pointer rests with the module that pushes it, so the imperative module or graph executor should handle it. If you are really keen on doing this, do it in the graph executor. There is no need to touch the engine code here.

> Having a pointer released in a call hierarchy 3 levels down leads to bugs and memory leaks, is difficult to reason about it.

This is the design: the engine relies on the object pool for allocation, and the modules depend on the engine to handle this; the modules just tell it when they want an operator created or destroyed. I think the code is very logical and it makes sense to me. Also, the memory leak you found is nowhere related to the ThreadedEngine code. I considered the benefits of your approach ("readability/understandability") in the first comment.
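The two ownership styles being debated in this thread can be contrasted in a few lines. This Python sketch (context managers standing in for C++ RAII; all names are illustrative, not MXNet code) shows a pool where release is either an explicit call by the coordinating module, as in the current engine design, or tied to scope exit, as larroy proposes:

```python
import contextlib

class ObjectPool:
    """Toy pool: the pool recycles objects; callers only signal release."""
    def __init__(self):
        self.free = []
    def acquire(self):
        return self.free.pop() if self.free else object()
    def release(self, obj):
        # Explicit style: the coordinating module must remember to call this.
        self.free.append(obj)

@contextlib.contextmanager
def pooled(pool):
    # Scope-bound release: the RAII-style alternative. The object goes
    # back to the pool when the `with` block exits, even on exceptions.
    obj = pool.acquire()
    try:
        yield obj
    finally:
        pool.release(obj)

pool = ObjectPool()
with pooled(pool) as op:
    pass  # use `op` here
assert len(pool.free) == 1  # recycled automatically on scope exit
```

The disagreement is over where the `release` responsibility should live (engine vs. graph executor), not over whether pooling itself is useful.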
[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.
This is an automated email from the ASF dual-hosted git repository. marcoabreu pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git

The following commit(s) were added to refs/heads/asf-site by this push:
     new 15645b6  Bump the publish timestamp.

15645b6 is described below

commit 15645b6aca73ed332bcdf81e11bad44620e77d3d
Author: mxnet-ci
AuthorDate: Fri Jun 28 19:21:23 2019 +

    Bump the publish timestamp.
---
 date.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/date.txt b/date.txt
new file mode 100644
index 000..cb61def
--- /dev/null
+++ b/date.txt
@@ -0,0 +1 @@
+Fri Jun 28 19:21:23 UTC 2019
[GitHub] [incubator-mxnet] aaronmarkham commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray
aaronmarkham commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray URL: https://github.com/apache/incubator-mxnet/pull/15396#discussion_r298718366 ## File path: docs/tutorials/sparse/train_gluon.md

# Sparse NDArrays with Gluon

When working on machine learning problems, you may encounter situations where the input data is sparse (i.e. the majority of values are zero). One example of this is in recommendation systems. You could have millions of user and product features, but only a few of these features are present for each sample. Without special treatment, the sheer magnitude of the feature space can lead to out-of-memory situations and cause significant slowdowns when training and making predictions.

MXNet supports a number of sparse storage types (often called 'stype' for short) for these situations. In this tutorial, we'll start by generating some sparse data, writing it to disk in the LibSVM format, and then reading it back using the [`LibSVMIter`](https://mxnet.incubator.apache.org/api/python/io/io.html) for training. We use the Gluon API to train the model and leverage sparse storage types such as [`CSRNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=csrndarray#mxnet.ndarray.sparse.CSRNDArray) and [`RowSparseNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=rowsparsendarray#mxnet.ndarray.sparse.RowSparseNDArray) to maximise performance and memory efficiency.

```python
import mxnet as mx
import numpy as np
import time
```

### Generating Sparse Data

You will most likely have a sparse dataset in mind already if you're reading this tutorial, but let's create a dummy dataset to use in the examples that follow. Using `rand_ndarray`, we will generate 1000 samples, each with 1,000,000 features, of which 99.999% of values will be zero (i.e. 10 non-zero features for each sample). We take this as our input data for training and calculate a label based on an arbitrary rule: whether the feature sum is higher than average.

```python
num_samples = 1000
num_features = 1000000
# density=0.00001 leaves ~10 non-zero features per sample (99.999% zeros).
data = mx.test_utils.rand_ndarray((num_samples, num_features), stype='csr', density=0.00001)
# generate label: 1 if row sum above average, 0 otherwise.
label = data.sum(axis=1) > data.sum(axis=1).mean()
```

```python
print(type(data))
print(data[:10].asnumpy())
print('{:,.0f} elements'.format(np.product(data.shape)))
print('{:,.0f} non-zero elements'.format(data.data.size))
```

[[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]]
1,000,000,000 elements
10,000 non-zero elements

Our storage type is CSR (Compressed Sparse Row), which is the ideal type for sparse data along multiple axes. See [this in-depth tutorial](https://mxnet.incubator.apache.org/versions/master/tutorials/sparse/csr.html) for more information. Just to confirm the generation process ran correctly, we can see that the vast majority of values are indeed zero. One of the first questions to ask is how much memory is saved by storing this data in a [`CSRNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=csrndarray#mxnet.ndarray.sparse.CSRNDArray) versus a standard [`NDArray`](https://mxnet.incubator.apache.org/versions/master/api/python/ndarray/sparse.html?highlight=ndarray#module-mxnet.ndarray). Since sparse arrays are constructed from several components (e.g. `data`, `indices` and `indptr`), we define a function called `get_nbytes` to calculate the number of bytes taken in memory to store an array. We compare the same data stored in a standard [`NDArray`](https://mxnet.incubator.apache.org/versions/master/api/python/ndarray/sparse.html?highlight=ndarray#module-mxnet.ndarray) (with `data.tostype('default')`) to the [`CSRNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=csrndarray#mxnet.ndarray.sparse.CSRNDArray).

```python
def get_nbytes(array):
    # bytes = number of elements * bytes per element
    fn = lambda a: a.size * np.dtype(a.dtype).itemsize
    if isinstance(array, mx.ndarray.sparse.CSRNDArray):
        return fn(array.data) + fn(array.indices) + fn(array.indptr)
    elif isinstance(array, mx.ndarray.sparse.RowSparseNDArray):
        return fn(array.data) + fn(array.indices)
    elif isinstance(array, mx.ndarray.NDArray):
        return fn(array)
    else:
        raise TypeError('{} not supported'.format(type(array)))
```

```python
print('NDArray:', get_nbytes(data.tostype('default'))/1000000, 'MBs')
print('CSRNDArray:', get_nbytes(data)/1000000, 'MBs')
```

NDArray: 4000.0 MBs
CSRNDArray: 0.128008 MBs

Given the extremely high sparsity of the data, we observe a huge memory saving here! 0.13 MBs versus 4 GBs: a ~30,000-fold reduction.
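As a sanity check, the figures printed above can be reproduced with simple arithmetic: a CSR array stores one value per non-zero element (`data`), one column index per non-zero element (`indices`), and one row pointer per row plus one (`indptr`). The sketch below assumes float32 values (4 bytes) and int64 indices (8 bytes); the helper names `csr_nbytes` and `dense_nbytes` are our own, not part of MXNet.

```python
# Back-of-the-envelope check of the memory comparison above.
# Assumes float32 values and int64 indices/indptr, matching the
# 0.128008 MB figure that get_nbytes reports for the CSRNDArray.

def csr_nbytes(num_rows, nnz, value_bytes=4, index_bytes=8):
    """Bytes used by the data, indices and indptr components of a CSR array."""
    data = nnz * value_bytes               # one value per non-zero element
    indices = nnz * index_bytes            # one column index per non-zero element
    indptr = (num_rows + 1) * index_bytes  # one row pointer per row, plus one
    return data + indices + indptr

def dense_nbytes(num_rows, num_cols, value_bytes=4):
    """Bytes used by the equivalent dense float32 array."""
    return num_rows * num_cols * value_bytes

sparse = csr_nbytes(num_rows=1000, nnz=10000)           # 128008 bytes
dense = dense_nbytes(num_rows=1000, num_cols=1000000)   # 4000000000 bytes
print('CSR: {:.6f} MBs'.format(sparse / 1e6))           # CSR: 0.128008 MBs
print('dense/CSR ratio: {:.0f}x'.format(dense / sparse))  # roughly 31,000x
```

Note that the `indptr` term grows with the number of rows rather than the number of non-zeros, which is why CSR stays compact even for very wide matrices like this one.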
[GitHub] [incubator-mxnet] aaronmarkham commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray
aaronmarkham commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray URL: https://github.com/apache/incubator-mxnet/pull/15396#discussion_r298717013 ## File path: docs/tutorials/sparse/train_gluon.md (quotes the same tutorial excerpt as the first comment)
[GitHub] [incubator-mxnet] aaronmarkham commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray
aaronmarkham commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray URL: https://github.com/apache/incubator-mxnet/pull/15396#discussion_r298718419 ## File path: docs/tutorials/sparse/train_gluon.md (quotes the same tutorial excerpt as the first comment)
[GitHub] [incubator-mxnet] aaronmarkham commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray
aaronmarkham commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray URL: https://github.com/apache/incubator-mxnet/pull/15396#discussion_r298714786 ## File path: docs/tutorials/sparse/train_gluon.md (quotes the same tutorial excerpt as the first comment)
[GitHub] [incubator-mxnet] aaronmarkham commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray
aaronmarkham commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray URL: https://github.com/apache/incubator-mxnet/pull/15396#discussion_r298715361 ## File path: docs/tutorials/sparse/train_gluon.md (quotes the same tutorial excerpt as the first comment)
[GitHub] [incubator-mxnet] aaronmarkham commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray
aaronmarkham commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray URL: https://github.com/apache/incubator-mxnet/pull/15396#discussion_r298718015 ## File path: docs/tutorials/sparse/train_gluon.md (quotes the same tutorial excerpt as the first comment)
[GitHub] [incubator-mxnet] aaronmarkham commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray
aaronmarkham commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray URL: https://github.com/apache/incubator-mxnet/pull/15396#discussion_r298718278 ## File path: docs/tutorials/sparse/train_gluon.md (quotes the same tutorial excerpt as the first comment)
[GitHub] [incubator-mxnet] aaronmarkham commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray
aaronmarkham commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray URL: https://github.com/apache/incubator-mxnet/pull/15396#discussion_r298716161

## File path: docs/tutorials/sparse/train_gluon.md

## @@ -0,0 +1,469 @@

# Sparse NDArrays with Gluon

When working on machine learning problems, you may encounter situations where the input data is sparse (i.e. the majority of values are zero). One example of this is in recommendation systems: you could have millions of user and product features, but only a few of these features are present for each sample. Without special treatment, the sheer magnitude of the feature space can lead to out-of-memory situations and cause significant slowdowns when training and making predictions.

MXNet supports a number of sparse storage types (often called 'stype' for short) for these situations. In this tutorial, we'll start by generating some sparse data, write it to disk in the LibSVM format and then read it back using the [`LibSVMIter`](https://mxnet.incubator.apache.org/api/python/io/io.html) for training. We use the Gluon API to train the model and leverage sparse storage types such as [`CSRNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=csrndarray#mxnet.ndarray.sparse.CSRNDArray) and [`RowSparseNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=rowsparsendarray#mxnet.ndarray.sparse.RowSparseNDArray) to maximise performance and memory efficiency.

```python
import mxnet as mx
import numpy as np
import time
```

### Generating Sparse Data

You will most likely have a sparse dataset in mind already if you're reading this tutorial, but let's create a dummy dataset to use in the examples that follow. Using `rand_ndarray` we will generate 1000 samples, each with 1,000,000 features of which 99.999% of values will be zero (i.e. 10 non-zero features for each sample).
We take this as our input data for training and calculate a label based on an arbitrary rule: whether the feature sum is higher than average.

```python
num_samples = 1000
num_features = 1000000
data = mx.test_utils.rand_ndarray((num_samples, num_features), stype='csr', density=0.00001)
# generate label: 1 if row sum above average, 0 otherwise.
label = data.sum(axis=1) > data.sum(axis=1).mean()
```

```python
print(type(data))
print(data[:10].asnumpy())
print('{:,.0f} elements'.format(np.prod(data.shape)))
print('{:,.0f} non-zero elements'.format(data.data.size))
```

[[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]]
1,000,000,000 elements
10,000 non-zero elements

Our storage type is CSR (Compressed Sparse Row), which is the ideal type for data that is sparse along multiple axes. See [this in-depth tutorial](https://mxnet.incubator.apache.org/versions/master/tutorials/sparse/csr.html) for more information. Just to confirm the generation process ran correctly, we can see that the vast majority of values are indeed zero. One of the first questions to ask is how much memory is saved by storing this data in a [`CSRNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=csrndarray#mxnet.ndarray.sparse.CSRNDArray) versus a standard [`NDArray`](https://mxnet.incubator.apache.org/versions/master/api/python/ndarray/sparse.html?highlight=ndarray#module-mxnet.ndarray). Since sparse arrays are constructed from several components (e.g. `data`, `indices` and `indptr`), we define a function called `get_nbytes` to calculate the number of bytes taken in memory to store an array.
We compare the same data stored in a standard [`NDArray`](https://mxnet.incubator.apache.org/versions/master/api/python/ndarray/sparse.html?highlight=ndarray#module-mxnet.ndarray) (with `data.tostype('default')`) to the [`CSRNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=csrndarray#mxnet.ndarray.sparse.CSRNDArray).

```python
def get_nbytes(array):
    fn = lambda a: a.size * np.dtype(a.dtype).itemsize
    if isinstance(array, mx.ndarray.sparse.CSRNDArray):
        return fn(array.data) + fn(array.indices) + fn(array.indptr)
    elif isinstance(array, mx.ndarray.sparse.RowSparseNDArray):
        return fn(array.data) + fn(array.indices)
    elif isinstance(array, mx.ndarray.NDArray):
        return fn(array)
    else:
        raise TypeError('{} not supported'.format(type(array)))
```

```python
print('NDArray:', get_nbytes(data.tostype('default'))/1000000, 'MBs')
print('CSRNDArray:', get_nbytes(data)/1000000, 'MBs')
```

NDArray: 4000.0 MBs
CSRNDArray: 0.128008 MBs

Given the extremely high sparsity of the data, we observe a huge memory saving here! 0.13 MBs versus 4 GBs: a roughly 30,000-fold reduction.
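The figure printed above can be sanity-checked with plain arithmetic, without MXNet at all. The sketch below assumes float32 values with 64-bit `indices`/`indptr` arrays (an assumption, but one that reproduces the 0.128008 MB figure above); the variable names are illustrative, not part of any API.

```python
import numpy as np

num_samples, num_features, nnz = 1000, 1000000, 10000  # 10 non-zeros per row

val_bytes = np.dtype(np.float32).itemsize  # 4 bytes per stored value
idx_bytes = np.dtype(np.int64).itemsize    # 8 bytes per index

# CSR stores three arrays: data (nnz values), indices (nnz column ids),
# and indptr (one row offset per row, plus one terminator).
csr_bytes = nnz * val_bytes + nnz * idx_bytes + (num_samples + 1) * idx_bytes
dense_bytes = num_samples * num_features * val_bytes

print(csr_bytes)                       # 128008 bytes, i.e. ~0.128 MB
print(dense_bytes)                     # 4000000000 bytes, i.e. 4 GB
print(round(dense_bytes / csr_bytes))  # 31248
```

The exact ratio (4 GB / 128,008 bytes ≈ 31,248) is where the "~30,000 times" saving comes from.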
[GitHub] [incubator-mxnet] Caenorst commented on issue #15399: Add unit tests for TensorRT integration and fix some bugs
Caenorst commented on issue #15399: Add unit tests for TensorRT integration and fix some bugs URL: https://github.com/apache/incubator-mxnet/pull/15399#issuecomment-506847449 @KellenSunderland @haohuanw This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] zachgk commented on a change in pull request #15378: Add Sparse NDArray support for Scala
zachgk commented on a change in pull request #15378: Add Sparse NDArray support for Scala URL: https://github.com/apache/incubator-mxnet/pull/15378#discussion_r298717511

## File path: scala-package/core/src/main/scala/org/apache/mxnet/SparseNDArray.scala

## @@ -0,0 +1,201 @@

/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.mxnet

import org.apache.mxnet.Base.{NDArrayHandle, NDArrayHandleRef, checkCall, _LIB}
import org.apache.mxnet.DType.DType
import org.apache.mxnet.SparseFormat.SparseFormat

object SparseNDArray {
  /**
    * Create a Compressed Sparse Row Storage (CSR) Format Matrix
    * @param data the data to feed
    * @param indices The indices array stores the column index for each non-zero element in data
    * @param indptr The indptr array is what will help identify the rows where the data appears
    * @param shape the shape of CSR NDArray to be created
    * @param ctx the context of this NDArray
    * @return SparseNDArray
    */
  def csrMatrix(data: Array[Float], indices: Array[Float],
                indptr: Array[Float], shape: Shape, ctx: Context): SparseNDArray = {
    val fmt = SparseFormat.CSR
    val dataND = NDArray.array(data, Shape(data.length), ctx)
    val indicesND = NDArray.array(indices, Shape(indices.length), ctx).asType(DType.Int64)
    val indptrND = NDArray.array(indptr, Shape(indptr.length), ctx).asType(DType.Int64)
    val dTypes = Array(indptrND.dtype, indicesND.dtype)
    val shapes = Array(indptrND.shape, indicesND.shape)
    val handle =
      newAllocHandle(fmt, shape, ctx, false, DType.Float32, dTypes, shapes)
    checkCall(_LIB.mxNDArraySyncCopyFromNDArray(handle, dataND.handle, -1))
    checkCall(_LIB.mxNDArraySyncCopyFromNDArray(handle, indptrND.handle, 0))
    checkCall(_LIB.mxNDArraySyncCopyFromNDArray(handle, indicesND.handle, 1))
    new SparseNDArray(handle)
  }

  /**
    * RowSparseNDArray stores the matrix in row sparse format,
    * which is designed for arrays of which most row slices are all zeros
    * @param data Any Array(Array(... Array(Float)))
    * @param indices the indices to store the data
    * @param shape shape of the NDArray
    * @param ctx Context
    * @return SparseNDArray
    */
  def rowSparseArray(data: Array[_], indices: Array[Float],
                     shape: Shape, ctx: Context): SparseNDArray = {
    val dataND = NDArray.toNDArray(data)
    val indicesND = NDArray.array(indices, Shape(indices.length), ctx).asType(DType.Int64)
    rowSparseArray(dataND, indicesND, shape, ctx)
  }

  /**
    * RowSparseNDArray stores the matrix in row sparse format,
    * which is designed for arrays of which most row slices are all zeros
    * @param data NDArray input
    * @param indices in NDArray. Only DType.Int64 supported
    * @param shape shape of the NDArray
    * @param ctx Context
    * @return
    */
  def rowSparseArray(data: NDArray, indices: NDArray,
                     shape: Shape, ctx: Context): SparseNDArray = {
    val fmt = SparseFormat.ROW_SPARSE
    val handle = newAllocHandle(fmt, shape, ctx, false,
      DType.Float32, Array(indices.dtype), Array(indices.shape))
    checkCall(_LIB.mxNDArraySyncCopyFromNDArray(handle, data.handle, -1))
    checkCall(_LIB.mxNDArraySyncCopyFromNDArray(handle, indices.handle, 0))
    new SparseNDArray(handle)
  }

  def retain(sparseNDArray: SparseNDArray, indices: Array[Float]): SparseNDArray = {
    if (sparseNDArray.sparseFormat == SparseFormat.CSR) {
      throw new IllegalArgumentException("CSR not supported")
    }
    NDArray.genericNDArrayFunctionInvoke("_sparse_retain",
      Seq(sparseNDArray, NDArray.toNDArray(indices))).head.toSparse()
  }

  private def newAllocHandle(stype : SparseFormat,
                             shape: Shape,
                             ctx: Context,
                             delayAlloc: Boolean,
                             dtype: DType = DType.Float32,
                             auxDTypes: Array[DType],
                             auxShapes: Array[Shape]) : NDArrayHandle = {
    val hdl = new NDArrayHandleRef
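The `csrMatrix` factory above takes the same `(data, indices, indptr)` triple that CSR implementations use generally. As a rough cross-language illustration of how those three arrays encode a matrix (using `scipy.sparse` as a stand-in so the sketch runs without MXNet; the example values are my own, not from the PR):

```python
import numpy as np
import scipy.sparse as sp

# Encode the matrix:
# [[1. 0. 2.]
#  [0. 0. 3.]
#  [4. 5. 6.]]
data = np.array([1, 2, 3, 4, 5, 6], dtype=np.float32)  # non-zero values, row by row
indices = np.array([0, 2, 2, 0, 1, 2])  # column index of each non-zero value
indptr = np.array([0, 2, 3, 6])         # data[indptr[i]:indptr[i+1]] belongs to row i
csr = sp.csr_matrix((data, indices, indptr), shape=(3, 3))
print(csr.toarray())
```

The Scala `csrMatrix` performs the analogous construction, copying the three component NDArrays into the sparse handle one by one.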
[GitHub] [incubator-mxnet] larroy commented on issue #15375: Memory leak in Naive engine when profiling
larroy commented on issue #15375: Memory leak in Naive engine when profiling URL: https://github.com/apache/incubator-mxnet/issues/15375#issuecomment-506846181 @cjolivier01 I didn't understand a word of what you are writing. I'm not proposing to change the object pool, just to make NewOp return a smart pointer.
[GitHub] [incubator-mxnet] larroy edited a comment on issue #15375: Memory leak in Naive engine when profiling
larroy edited a comment on issue #15375: Memory leak in Naive engine when profiling URL: https://github.com/apache/incubator-mxnet/issues/15375#issuecomment-506843795 @anirudh2290 I proposed refactoring around a unique_ptr; I don't think it is going to change the way memory is released, so maybe you got a different impression of what I was suggesting. I don't know why you are opposing a proposed change right away without considering the benefits. Having a pointer released three levels down a call hierarchy leads to bugs and memory leaks, and is difficult to reason about. Best is to use RAII and return a unique_ptr or similar which holds ownership. I don't think the way resources are released or performance will be affected in any way, but if the proposal is blocked from the start it makes no sense to even propose a PR, which is not very encouraging.