[GitHub] piiswrong commented on a change in pull request #7660: fix ctc on softmax grad and req option

2017-08-29 Thread git
piiswrong commented on a change in pull request #7660: fix ctc on softmax grad 
and req option
URL: https://github.com/apache/incubator-mxnet/pull/7660#discussion_r135974341
 
 

 ##
 File path: tests/python/gpu/test_operator_gpu.py
 ##
 @@ -1357,6 +1357,28 @@ def test_autograd_save_memory():
     x.wait_to_read()
     x.backward()
 
+def test_gluon_ctc_consistency():
+    loss = mx.gluon.loss.CTCLoss(padding_mask=0)
+    data = mx.nd.arange(0, 4, repeat=40, ctx=mx.gpu(0)).reshape((2,20,4)).flip(axis=0)
+    cpu_label = mx.nd.array([[2,1,0,0],[3,2,2,0]], ctx=mx.cpu(0))
+    gpu_label = mx.nd.array([[2,1,0,0],[3,2,2,0]], ctx=mx.gpu(0))
+
+    cpu_data = data.copy().as_in_context(mx.cpu(0))
+    cpu_data.attach_grad()
+    with mx.autograd.record():
+        l_cpu = loss(cpu_data, cpu_label)
+        l_cpu.backward()
+    cpu_data.detach()
+
+    gpu_data = data.copyto(mx.gpu(0))
+    gpu_data.attach_grad()
+    with mx.autograd.record():
+        l_gpu = loss(gpu_data, gpu_label)
+        l_gpu.backward()
+    gpu_data.detach()
+
+    assert_almost_equal(cpu_data.grad.asnumpy(), gpu_data.grad.asnumpy(), atol=1e-3, rtol=1e-3)
 
 Review comment:
   `gpu_data.detach()` doesn't do anything to the original array; only the 
returned array is detached. So this detach has no effect.
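   
   For reference, a minimal sketch of the point (plain NDArray autograd API; names are illustrative):
   
   ```python
   import mxnet as mx
   
   x = mx.nd.ones((2, 3))
   x.attach_grad()
   with mx.autograd.record():
       y = (x * 2).sum()
   y.backward()
   
   x.detach()            # no-op here: detach() is not in-place and the result is discarded
   x_det = x.detach()    # x_det shares data with x but carries no autograd history
   print(x.grad.asnumpy())   # gradients on x are unaffected either way
   ```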
 


[GitHub] piiswrong commented on a change in pull request #7660: fix ctc on softmax grad and req option

2017-08-29 Thread git
piiswrong commented on a change in pull request #7660: fix ctc on softmax grad 
and req option
URL: https://github.com/apache/incubator-mxnet/pull/7660#discussion_r135973923
 
 

 ##
 File path: src/operator/contrib/ctc_loss-inl.h
 ##
 @@ -402,19 +395,18 @@ class CTCLossOp : public Operator {
                                 label_lengths->data(),
                                 data_lengths->data(),
                                 costs.dptr_,
-                                grad_desc_,
-                                grad.dptr_,
+                                ctx.is_train?grad_desc_:NULL,
+                                ctx.is_train?grad.dptr_:NULL,
                                 ctc_algo,
                                 ctc_desc_,
                                 work_space.dptr_,
                                 workspace_bytes));
-  }
-  inline virtual void cudnn_backward_extra(mshadow::Stream<gpu>* s,
-                                           mshadow::Tensor<gpu, 3, real_t> data_grad,
-                                           mshadow::Tensor<gpu, 3, real_t> output_grad,
-                                           mshadow::Tensor<gpu, 3, real_t> data_grad_computed) {
-    mxnet_op::SoftmaxGrad(s,
-        output_grad.dptr_, data_grad_computed.dptr_, data_grad.dptr_, data_grad.shape_, 2);
+
+    if (ctx.is_train) {
 
 Review comment:
   We now allow calculating the gradient even when is_train=False.
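   
   In user-facing terms, a sketch of what this enables (the `train_mode` arguments on `record`/`backward` are an assumption about the autograd API; the data/label setup mirrors the test above):
   
   ```python
   import mxnet as mx
   
   loss = mx.gluon.loss.CTCLoss(padding_mask=0)
   data = mx.nd.arange(0, 4, repeat=40).reshape((2, 20, 4)).flip(axis=0)
   label = mx.nd.array([[2, 1, 0, 0], [3, 2, 2, 0]])
   data.attach_grad()
   
   # Run the operator in inference mode (is_train=False) while still
   # recording the graph, then request gradients anyway.
   with mx.autograd.record(train_mode=False):
       l = loss(data, label)
   l.backward(train_mode=False)
   print(data.grad.shape)
   ```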
 



[GitHub] szha commented on a change in pull request #7660: fix ctc on softmax grad and req option

2017-08-29 Thread git
szha commented on a change in pull request #7660: fix ctc on softmax grad and 
req option
URL: https://github.com/apache/incubator-mxnet/pull/7660#discussion_r135972464
 
 

 ##
 File path: tests/python/gpu/test_operator_gpu.py
 ##
 @@ -1357,6 +1357,28 @@ def test_autograd_save_memory():
     x.wait_to_read()
     x.backward()
 
+def test_gluon_ctc_consistency():
+    loss = mx.gluon.loss.CTCLoss(padding_mask=0)
+    data = mx.nd.arange(0, 4, repeat=40, ctx=mx.gpu(0)).reshape((2,20,4)).flip(axis=0)
+    cpu_label = mx.nd.array([[2,1,0,0],[3,2,2,0]], ctx=mx.cpu(0))
+    gpu_label = mx.nd.array([[2,1,0,0],[3,2,2,0]], ctx=mx.gpu(0))
+
+    cpu_data = data.copy().as_in_context(mx.cpu(0))
+    cpu_data.attach_grad()
+    with mx.autograd.record():
+        l_cpu = loss(cpu_data, cpu_label)
+        l_cpu.backward()
+    cpu_data.detach()
+
+    gpu_data = data.copyto(mx.gpu(0))
+    gpu_data.attach_grad()
+    with mx.autograd.record():
+        l_gpu = loss(gpu_data, gpu_label)
+        l_gpu.backward()
+    gpu_data.detach()
+
+    assert_almost_equal(cpu_data.grad.asnumpy(), gpu_data.grad.asnumpy(), atol=1e-3, rtol=1e-3)
 
 Review comment:
   It's still there if backward is done first.
   ```python
   In [1]: from __future__ import print_function
      ...: from mxnet import gluon
      ...: import mxnet as mx
      ...: loss = gluon.loss.CTCLoss(padding_mask=0)
      ...: data = mx.nd.arange(0, 4, repeat=40).reshape((2,20,4)).flip(axis=0)
      ...: label = mx.nd.array([[2,1,0,0],[3,2,2,0]])
      ...: data.attach_grad()
      ...: with mx.autograd.record():
      ...:     l = loss(data, label)
      ...:     print(l)
      ...:     l.backward()
      ...: data.detach()
      ...: print(data.grad)
      ...:

   [ 18.82820702  16.50581741]
   <NDArray 2 @cpu(0)>

   [[[-0.56818217  0.25        0.06818183  0.25      ]
     [-0.43571669  0.24740258 -0.06168866  0.25      ]
     [-0.34275508  0.24015717 -0.14740321  0.25      ]
     [-0.28055429  0.22675993 -0.19620776  0.25      ]
     [-0.24145621  0.20625414 -0.21479931  0.25      ]
     [-0.21889889  0.17822932 -0.20933169  0.25      ]
     [-0.20741701  0.14282268 -0.18540773  0.25      ]
     [-0.2026315   0.10071747 -0.1480875   0.25      ]
     [-0.20126608  0.05314353 -0.101881    0.25      ]
     [-0.20112836  0.00187904 -0.05075252  0.25      ]
     [-0.20112836 -0.05075252  0.00187904  0.25      ]
     [-0.20126608 -0.101881    0.05314353  0.25      ]
     [-0.2026315  -0.1480875   0.10071747  0.25      ]
     [-0.20741701 -0.18540773  0.14282268  0.25      ]
     [-0.21890068 -0.20933169  0.17822932  0.25      ]
     [-0.24145621 -0.21479931  0.20625414  0.25      ]
     [-0.28055429 -0.19620776  0.22675993  0.25      ]
     [-0.34275508 -0.14740321  0.24015717  0.25      ]
     [-0.43571669 -0.06168866  0.24740258  0.25      ]
     [-0.56818217  0.06818183  0.25        0.25      ]]

    [[-0.47727478  0.25        0.25       -0.02272752]
     [-0.32142842  0.25        0.23701297 -0.16558546]
     [-0.238722    0.25        0.20625421 -0.21753165]
     [-0.1993053   0.25        0.15863517 -0.2093308 ]
     [-0.18393114  0.25        0.0986056  -0.1646733 ]
     [-0.18109006  0.25        0.03234358 -0.1012527 ]
     [-0.1845623   0.25       -0.03370428 -0.0317339 ]
     [-0.19141069  0.25       -0.09393495  0.03534578]
     [-0.2004036   0.25       -0.14435497  0.09475859]
     [-0.21085775  0.25       -0.18299362  0.14385229]
     [-0.22191447  0.25       -0.20997432  0.18188837]
     [-0.23224759  0.25       -0.22722232  0.20947087]
     [-0.24019703  0.25       -0.23785028  0.22804667]
     [-0.24432448  0.25       -0.24516809  0.23949245]
     [-0.24441877  0.25       -0.2513606   0.2457782 ]
     [-0.24290282  0.25       -0.25580966  0.24871336]
     [-0.24669671  0.25       -0.25307524  0.24977216]
     [-0.26948035  0.25       -0.23051959  0.25      ]
     [-0.33441514  0.25       -0.16558546  0.25      ]
     [-0.47727478  0.25       -0.02272752  0.25      ]]]
   <NDArray 2x20x4 @cpu(0)>
   ```
 



[GitHub] szha commented on a change in pull request #7660: fix ctc on softmax grad and req option

2017-08-29 Thread git
szha commented on a change in pull request #7660: fix ctc on softmax grad and 
req option
URL: https://github.com/apache/incubator-mxnet/pull/7660#discussion_r135972545
 
 

 ##
 File path: src/operator/contrib/ctc_loss-inl.h
 ##
 @@ -402,19 +395,18 @@ class CTCLossOp : public Operator {
                                 label_lengths->data(),
                                 data_lengths->data(),
                                 costs.dptr_,
-                                grad_desc_,
-                                grad.dptr_,
+                                ctx.is_train?grad_desc_:NULL,
+                                ctx.is_train?grad.dptr_:NULL,
                                 ctc_algo,
                                 ctc_desc_,
                                 work_space.dptr_,
                                 workspace_bytes));
-  }
-  inline virtual void cudnn_backward_extra(mshadow::Stream<gpu>* s,
-                                           mshadow::Tensor<gpu, 3, real_t> data_grad,
-                                           mshadow::Tensor<gpu, 3, real_t> output_grad,
-                                           mshadow::Tensor<gpu, 3, real_t> data_grad_computed) {
-    mxnet_op::SoftmaxGrad(s,
-        output_grad.dptr_, data_grad_computed.dptr_, data_grad.dptr_, data_grad.shape_, 2);
+
+    if (ctx.is_train) {
 
 Review comment:
   This is to see whether gradient calculation is needed, similar to the warp-ctc 
case, where only inference is performed if `ctx.is_train` is false.
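   
   As a toy illustration of the guard (a sketch, not the operator's actual code): the cost is always produced, while gradient work happens only in training, which is what passing NULL gradient pointers (`ctx.is_train?grad.dptr_:NULL`) achieves on the cuDNN path:
   
   ```python
   import numpy as np
   
   def toy_ctc_like_loss(activations, is_train):
       # Costs are always computed, for training and inference alike.
       costs = -np.log(activations.max(axis=-1)).sum(axis=-1)  # placeholder cost
       grad = None
       if is_train:
           # Gradient buffers are allocated and filled only when training.
           grad = np.full_like(activations, -1.0 / activations.shape[-1])  # placeholder gradient
       return costs, grad
   
   costs, grad = toy_ctc_like_loss(np.random.rand(2, 20, 4), is_train=False)
   assert grad is None  # inference: no gradient work was done
   ```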
 



[GitHub] hen opened a new issue #22: Logo is not hosted on Apache site

2017-08-29 Thread git
hen opened a new issue #22: Logo is not hosted on Apache site
URL: https://github.com/apache/incubator-mxnet-site/issues/22
 
 
   The logo doesn't display for me (because I block cross-site calls). This is 
because it's coming from the old mxnet.io website. We should move that file 
over to this GitHub repository.
 



[GitHub] piiswrong commented on issue #7613: 1x1 convolution acceleration

2017-08-29 Thread git
piiswrong commented on issue #7613: 1x1 convolution acceleration
URL: https://github.com/apache/incubator-mxnet/pull/7613#issuecomment-325884590
 
 
   Use
   ```
   git fetch upstream
   git rebase upstream/master
   ```
   to rebase onto master.
   
 



[GitHub] hen opened a new issue #21: README.html is weird

2017-08-29 Thread git
hen opened a new issue #21: README.html is weird
URL: https://github.com/apache/incubator-mxnet-site/issues/21
 
 
   For GitHub, the README.html renders oddly when GitHub tries to display it. 
Converting it to a README.md seems like a good idea, so it can be read 
directly on GitHub.
 



[GitHub] hen opened a new issue #20: DOAP file

2017-08-29 Thread git
hen opened a new issue #20: DOAP file
URL: https://github.com/apache/incubator-mxnet-site/issues/20
 
 
   Per https://www.apache.org/foundation/marks/pmcs#metadata we should create a 
DOAP file.
 



[GitHub] hen opened a new issue #19: TM in logo

2017-08-29 Thread git
hen opened a new issue #19: TM in logo
URL: https://github.com/apache/incubator-mxnet-site/issues/19
 
 
   Per https://www.apache.org/foundation/marks/pmcs#graphics the MXNet logo 
should contain a TM.
 



[GitHub] hen opened a new issue #18: Links needed

2017-08-29 Thread git
hen opened a new issue #18: Links needed
URL: https://github.com/apache/incubator-mxnet-site/issues/18
 
 
   Per https://www.apache.org/foundation/marks/pmcs#navigation the MXNet site 
should link to the following Foundation pages:
   
   - License
   - Sponsorship
   - Thanks
   - Security (let's do the default for now; if we get lots of security issues 
we can set up a security list)
   - Main ASF Homepage
 
   
 



[GitHub] hen opened a new issue #17: Footer needed

2017-08-29 Thread git
hen opened a new issue #17: Footer needed
URL: https://github.com/apache/incubator-mxnet-site/issues/17
 
 
   Under the Incubator footer we should say:
   
   "Copyright ? 2017, The Apache Software Foundation
   Apache MXNet, MXNet, Apache, the Apache feather, and the Apache MXNet 
project logo are trademarks of the Apache Software Foundation. "
   
   See https://www.apache.org/foundation/marks/pmcs for more info.
 



[GitHub] piiswrong commented on a change in pull request #7660: fix ctc on softmax grad and req option

2017-08-29 Thread git
piiswrong commented on a change in pull request #7660: fix ctc on softmax grad 
and req option
URL: https://github.com/apache/incubator-mxnet/pull/7660#discussion_r135971545
 
 

 ##
 File path: src/operator/contrib/ctc_loss-inl.h
 ##
 @@ -402,19 +395,18 @@ class CTCLossOp : public Operator {
                                 label_lengths->data(),
                                 data_lengths->data(),
                                 costs.dptr_,
-                                grad_desc_,
-                                grad.dptr_,
+                                ctx.is_train?grad_desc_:NULL,
+                                ctx.is_train?grad.dptr_:NULL,
                                 ctc_algo,
                                 ctc_desc_,
                                 work_space.dptr_,
                                 workspace_bytes));
-  }
-  inline virtual void cudnn_backward_extra(mshadow::Stream<gpu>* s,
-                                           mshadow::Tensor<gpu, 3, real_t> data_grad,
-                                           mshadow::Tensor<gpu, 3, real_t> output_grad,
-                                           mshadow::Tensor<gpu, 3, real_t> data_grad_computed) {
-    mxnet_op::SoftmaxGrad(s,
-        output_grad.dptr_, data_grad_computed.dptr_, data_grad.dptr_, data_grad.shape_, 2);
+
+    if (ctx.is_train) {
 
 Review comment:
   why test for is_train?
 



[GitHub] piiswrong commented on issue #7661: Use a big ndarray in gluon/data/vision

2017-08-29 Thread git
piiswrong commented on issue #7661: Use a big ndarray in gluon/data/vision
URL: https://github.com/apache/incubator-mxnet/pull/7661#issuecomment-325883267
 
 
   Is it actually faster? This way, slicing happens at runtime.
 



[GitHub] piiswrong closed pull request #7638: CSRNDArray from/to scipy csr_matrix; fix rand_shape_nd

2017-08-29 Thread git
piiswrong closed pull request #7638: CSRNDArray from/to scipy csr_matrix; fix 
rand_shape_nd
URL: https://github.com/apache/incubator-mxnet/pull/7638
 
 
   
 



[incubator-mxnet] branch master updated: CSRNDArray from/to scipy csr_matrix; fix rand_shape_nd (#7638)

2017-08-29 Thread jxie
This is an automated email from the ASF dual-hosted git repository.

jxie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new ec7cd6e  CSRNDArray from/to scipy csr_matrix; fix rand_shape_nd (#7638)
ec7cd6e is described below

commit ec7cd6eeb5f95f86b9d73250ba1f61616dd42800
Author: Haibin Lin 
AuthorDate: Tue Aug 29 22:12:05 2017 -0700

CSRNDArray from/to scipy csr_matrix; fix rand_shape_nd (#7638)

* support creation from sp.csr

* enhance doc

* edit repr for sparse ndarray

* update doc for nd.empty

* preprocess noncanonical csr

* add asscipy to csr

* minor changes

* return tuple for rand_shape_nd

* fix lint

* throw exception on setters

* remove asscipy

* global import scipy in sparse.py

* update rand_shape_nd;

* add missing line

* better err msg. fix scipy import in utils.py

* fix lint
---
 python/mxnet/ndarray/sparse.py   | 95 +---
 python/mxnet/ndarray/utils.py| 24 ---
 python/mxnet/test_utils.py   |  8 +--
 tests/python/unittest/test_io.py |  1 -
 tests/python/unittest/test_module.py |  7 +-
 tests/python/unittest/test_sparse_ndarray.py | 40 ++--
 6 files changed, 142 insertions(+), 33 deletions(-)
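
A minimal usage sketch of what this commit enables (the `mx.nd.sparse.array` entry point and `__repr__` behavior are taken from the diff below and the CSR tutorial quoted later in this digest):

```python
import numpy as np
import scipy.sparse as spsp
import mxnet as mx

# A 3x4 SciPy CSR matrix: (data, indices, indptr).
sp = spsp.csr_matrix((np.array([7, 8, 9], dtype=np.float32),
                      np.array([0, 2, 1]),
                      np.array([0, 2, 2, 3])), shape=(3, 4))

nd = mx.nd.sparse.array(sp)   # new: create a CSRNDArray from a scipy csr_matrix
print(nd)                     # new __repr__, e.g. <CSRNDArray 3x4 @cpu(0)>
print(nd.asnumpy())           # dense view for inspection
```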

diff --git a/python/mxnet/ndarray/sparse.py b/python/mxnet/ndarray/sparse.py
index 806398e..fa2761d 100644
--- a/python/mxnet/ndarray/sparse.py
+++ b/python/mxnet/ndarray/sparse.py
@@ -51,7 +51,6 @@ from .ndarray import zeros as _zeros_ndarray
 from .ndarray import array as _array
 from . import op
 
-# Use different verison of SymbolBase
 # When possible, use cython to speedup part of computation.
 # pylint: disable=unused-import
 try:
@@ -67,6 +66,10 @@ except ImportError:
     from .._ctypes.ndarray import _set_ndarray_class
 # pylint: enable=unused-import
 
+try:
+    import scipy.sparse as spsp
+except ImportError:
+    spsp = None
 
 _STORAGE_AUX_TYPES = {
     'row_sparse': [np.int64],
@@ -112,6 +115,13 @@ class BaseSparseNDArray(NDArray):
     See CSRNDArray and RowSparseNDArray for more details.
     """
 
+    def __repr__(self):
+        """Returns a string representation of the sparse array."""
+        shape_info = 'x'.join(['%d' % x for x in self.shape])
+        # The data content is not displayed since the array usually has big shape
+        return '\n<%s %s @%s>' % (self.__class__.__name__,
+                                  shape_info, self.context)
+
     def __iadd__(self, other):
         raise NotImplementedError()
 
@@ -417,6 +427,19 @@ class CSRNDArray(BaseSparseNDArray):
         """
         return self._data()
 
+    @indices.setter
+    def indices(self, indices):
+        raise NotImplementedError()
+
+    @indptr.setter
+    def indptr(self, indptr):
+        raise NotImplementedError()
+
+    @data.setter
+    def data(self, data):
+        raise NotImplementedError()
+
+
     def tostype(self, stype):
         """Return a copy of the array with chosen storage type.
 
@@ -461,7 +484,6 @@ class CSRNDArray(BaseSparseNDArray):
         else:
             raise TypeError('copyto does not support type ' + str(type(other)))
 
-
 # pylint: disable=abstract-method
 class RowSparseNDArray(BaseSparseNDArray):
     """A sparse representation of a set of NDArray row slices at given indices.
@@ -630,6 +652,14 @@ class RowSparseNDArray(BaseSparseNDArray):
         """
         return self._data()
 
+    @indices.setter
+    def indices(self, indices):
+        raise NotImplementedError()
+
+    @data.setter
+    def data(self, data):
+        raise NotImplementedError()
+
     def tostype(self, stype):
         """Return a copy of the array with chosen storage type.
 
@@ -908,16 +938,61 @@ def empty(stype, shape, ctx=None, dtype=None, aux_types=None):
 
 def array(source_array, ctx=None, dtype=None, aux_types=None):
     """Creates a sparse array from any object exposing the array interface.
+
+    Parameters
+    ----------
+    source_array : RowSparseNDArray, CSRNDArray or scipy.sparse.csr.csr_matrix
+        The source sparse array
+    ctx : Context, optional
+        Device context (default is the current default context).
+    dtype : str or numpy.dtype, optional
+        The data type of the output array. The default dtype is ``source_array.dtype``
+        if `source_array` is an `NDArray`, `float32` otherwise.
+    aux_types: list of numpy.dtype, optional
+        An optional list of types of the aux data for RowSparseNDArray or CSRNDArray.
+        The default value for CSRNDArray is [`int64`, `int64`] for `indptr` and `indices`.
+        The default value for RowSparseNDArray is [`int64`] for `indices`.
+
+    Returns
+    -------
+    RowSparseNDArray or CSRNDArray

[incubator-mxnet] branch master updated: Remove python function negative for rendering ndarray api in doc (#7657)

2017-08-29 Thread jxie
This is an automated email from the ASF dual-hosted git repository.

jxie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new 583722d  Remove python function negative for rendering ndarray api in 
doc (#7657)
583722d is described below

commit 583722dc1a8d5d442c6202e4ddaae27fcd47e58a
Author: reminisce 
AuthorDate: Tue Aug 29 22:11:35 2017 -0700

Remove python function negative for rendering ndarray api in doc (#7657)
---
 python/mxnet/ndarray/ndarray.py | 29 ++---
 1 file changed, 2 insertions(+), 27 deletions(-)
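
Operator-level negation (`-x`, via `NDArray.__neg__`) is unaffected; only the module-level `negative` wrapper that rendered oddly in the generated docs is removed, as the deleted docstring's own example shows:

```python
import mxnet as mx

x = mx.nd.ones((2, 3))
print((-x).asnumpy())
# array([[-1., -1., -1.],
#        [-1., -1., -1.]], dtype=float32)
```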

diff --git a/python/mxnet/ndarray/ndarray.py b/python/mxnet/ndarray/ndarray.py
index b0500d3..e497ea6 100644
--- a/python/mxnet/ndarray/ndarray.py
+++ b/python/mxnet/ndarray/ndarray.py
@@ -43,8 +43,8 @@ from .op import NDArrayBase
 __all__ = ["NDArray", "concatenate", "_DTYPE_NP_TO_MX", "_DTYPE_MX_TO_NP", "_GRAD_REQ_MAP",
            "ones", "add", "arange", "divide", "equal", "full", "greater", "greater_equal",
            "imdecode", "lesser", "lesser_equal", "maximum", "minimum", "moveaxis", "modulo",
-           "multiply", "negative", "not_equal", "onehot_encode", "power", "subtract",
-           "true_divide", "waitall", "_new_empty_handle"]
+           "multiply", "not_equal", "onehot_encode", "power", "subtract", "true_divide",
+           "waitall", "_new_empty_handle"]
 
 _STORAGE_TYPE_UNDEFINED = -1
 _STORAGE_TYPE_DEFAULT = 0
@@ -2572,31 +2572,6 @@ def true_divide(lhs, rhs):
 return divide(lhs, rhs)
 
 
-def negative(arr):
-    """Numerical negative, element-wise.
-
-    Equals ``-arr``
-
-    Parameters
-    ----------
-    arr : NDArray
-        The input array
-
-    Returns
-    -------
-    NDArray
-        ``-arr``
-
-    Examples
-    --------
-    >>> x = mx.nd.ones((2,3))
-    >>> (-x).asnumpy()
-    array([[-1., -1., -1.],
-           [-1., -1., -1.]], dtype=float32)
-    """
-    return multiply(arr, -1.0)
-
-
 def concatenate(arrays, axis=0, always_copy=True):
 """DEPRECATED, use ``concat`` instead
 



[GitHub] piiswrong closed pull request #7657: Remove python function `negative` for rendering ndarray api in doc

2017-08-29 Thread git
piiswrong closed pull request #7657: Remove python function `negative` for 
rendering ndarray api in doc
URL: https://github.com/apache/incubator-mxnet/pull/7657
 
 
   
 



[GitHub] reminisce commented on issue #7613: 1x1 convolution acceleration

2017-08-29 Thread git
reminisce commented on issue #7613: 1x1 convolution acceleration
URL: https://github.com/apache/incubator-mxnet/pull/7613#issuecomment-325880446
 
 
   It seems the branch is not rebased. The **Files changed** tab should only contain your 
changes.
   Suppose `upstream` is the alias for https://github.com/apache/incubator-mxnet 
in your local git. Then type
   `git pull --rebase upstream master` to sync with the latest code base and 
put your commits on top of the git history. Type `git log`; you should see your 
commits for this PR sitting on top of the latest upstream commit history.
 



[GitHub] CodingCat commented on issue #7571: [scala-package][spark] Resources running PS (role = server) should be explicit to Spark

2017-08-29 Thread git
CodingCat commented on issue #7571: [scala-package][spark] Resources running PS 
(role = server) should be explicit to Spark
URL: https://github.com/apache/incubator-mxnet/pull/7571#issuecomment-325873716
 
 
   thanks
 



[incubator-mxnet] branch master updated: [scala-package][spark] Resources running PS (role = server) should be explicit to Spark (#7571)

2017-08-29 Thread liuyizhi
This is an automated email from the ASF dual-hosted git repository.

liuyizhi pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new 9ed171e  [scala-package][spark] Resources running PS (role = server) 
should be explicit to Spark (#7571)
9ed171e is described below

commit 9ed171e35bf994b250e4a3ba9db5d792ee71c418
Author: Nan Zhu 
AuthorDate: Tue Aug 29 20:29:52 2017 -0700

[scala-package][spark] Resources running PS (role = server) should be 
explicit to Spark (#7571)

* temp

* resources running PS (role = server) should be explicit to Spark

* address the comments
---
 .../main/scala/ml/dmlc/mxnet/KVStoreServer.scala   |  4 -
 .../src/main/scala/ml/dmlc/mxnet/spark/MXNet.scala | 91 ++
 .../ml/dmlc/mxnet/spark/ParameterServer.scala  | 69 
 3 files changed, 91 insertions(+), 73 deletions(-)

diff --git a/scala-package/core/src/main/scala/ml/dmlc/mxnet/KVStoreServer.scala b/scala-package/core/src/main/scala/ml/dmlc/mxnet/KVStoreServer.scala
index 22f9269..d3c8691 100644
--- a/scala-package/core/src/main/scala/ml/dmlc/mxnet/KVStoreServer.scala
+++ b/scala-package/core/src/main/scala/ml/dmlc/mxnet/KVStoreServer.scala
@@ -20,10 +20,6 @@ package ml.dmlc.mxnet
 import ml.dmlc.mxnet.Base._
 import org.slf4j.{Logger, LoggerFactory}
 
-/**
- * Server node for the key value store
- * @author Yizhi Liu
- */
 private[mxnet] class KVStoreServer(private val kvStore: KVStore) {
   private val logger: Logger = LoggerFactory.getLogger(classOf[KVStoreServer])
   private val handle: KVStoreHandle = kvStore.handle
diff --git a/scala-package/spark/src/main/scala/ml/dmlc/mxnet/spark/MXNet.scala b/scala-package/spark/src/main/scala/ml/dmlc/mxnet/spark/MXNet.scala
index 27dd99f..cc77342 100644
--- a/scala-package/spark/src/main/scala/ml/dmlc/mxnet/spark/MXNet.scala
+++ b/scala-package/spark/src/main/scala/ml/dmlc/mxnet/spark/MXNet.scala
@@ -27,14 +27,24 @@ import org.apache.spark.mllib.regression.LabeledPoint
 import org.apache.spark.rdd.RDD
 import org.apache.spark.SparkContext
 
-/**
- * MXNet Training On Spark
- * @author Yizhi Liu
- */
 class MXNet extends Serializable {
+
+  class MXNetControllingThread(
+  schedulerIP: String,
+  schedulerPort: Int,
+  sparkContext: SparkContext,
+  triggerOfComponent: (String, Int, SparkContext) => Unit) extends Thread {
+override def run() {
+  triggerOfComponent(schedulerIP, schedulerPort, sparkContext)
+}
+  }
+
   private val logger: Logger = LoggerFactory.getLogger(classOf[MXNet])
   private val params: MXNetParams = new MXNetParams
 
+  @transient private var psServerThread: MXNetControllingThread = _
+  @transient private var psSchedulerThread: MXNetControllingThread = _
+
   def setBatchSize(batchSize: Int): this.type = {
 params.batchSize = batchSize
 this
@@ -105,30 +115,51 @@ class MXNet extends Serializable {
 this
   }
 
-  private def startParameterServers(
+  private def startPSServers(
   schedulerIP: String,
   schedulerPort: Int,
-  sc: SparkContext): ParameterServer = {
-// TODO: check ip & port available
-logger.info("Starting scheduler on {}:{}", schedulerIP, schedulerPort)
-val scheduler = new ParameterServer(params.runtimeClasspath, role = "scheduler",
-  rootUri = schedulerIP, rootPort = schedulerPort,
-  numServer = params.numServer, numWorker = params.numWorker,
-  timeout = params.timeout, java = params.javabin)
-require(scheduler.startProcess(), "Failed to start ps scheduler process")
-
-sc.parallelize(1 to params.numServer, params.numServer).foreachPartition { p =>
-  logger.info("Starting server ...")
-  val server = new ParameterServer(params.runtimeClasspath,
-role = "server",
+  sc: SparkContext) = {
+def startPSServersInner(
+schedulerIP: String,
+schedulerPort: Int,
+sc: SparkContext): Unit = {
+  sc.parallelize(1 to params.numServer, params.numServer).foreachPartition { p =>
+  logger.info("Starting server ...")
+  val server = new ParameterServer(params.runtimeClasspath,
+role = "server",
+rootUri = schedulerIP, rootPort = schedulerPort,
+numServer = params.numServer,
+numWorker = params.numWorker,
+timeout = params.timeout,
+java = params.javabin)
+  val exitCode = server.startProcess()
+  require(exitCode == 0, s"ps server process quit with exit code $exitCode")
+}
+}
+psServerThread = new MXNetControllingThread(schedulerIP, schedulerPort, sc, startPSServersInner)
+psServerThread.start()
+  }
+
+  private def startPSScheduler(
+  schedulerIP: String,
+  schedulerPort: Int,
+  sc: SparkContext) = {
+def startPSSchedulerInner(
+

[GitHub] eric-haibin-lin commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
eric-haibin-lin commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135961277
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,258 @@
+# CSRNDArray - NDArray in Compressed Sparse Row Storage Format
+
+Many real-world datasets deal with high-dimensional sparse feature vectors. For instance,
+in a recommendation system, the number of categories and users is on the order of millions,
+while most users only made a few purchases, leading to feature vectors with high sparsity
+(i.e. most of the elements are zeros).
+
+Storing and manipulating such large sparse matrices in the default dense structure results
+in wasted memory and processing on the zeros.
+To take advantage of the sparse structure of the matrix, the ``CSRNDArray`` in MXNet
+stores the matrix in [compressed sparse row (CSR)](https://en.wikipedia.org/wiki/Sparse_matrix#Compressed_sparse_row_.28CSR.2C_CRS_or_Yale_format.29) format
+and uses specialized algorithms in operators.
+The format is designed for 2D matrices with a large number of columns,
+where each row is sparse (i.e. with only a few nonzeros).
+For matrices of high sparsity (e.g. ~1% non-zeros), the advantage of ``CSRNDArray`` over
+the existing ``NDArray`` is that
+
+- memory consumption is reduced significantly
+- certain operations (e.g. matrix-vector multiplication) are much faster
+
+Meanwhile, ``CSRNDArray`` inherits competitive features from ``NDArray`` such as
+lazy evaluation and automatic parallelization, which are not available in the
+scientific computing Python package [SciPy](https://www.scipy.org/).
+
+Apart from often-queried attributes such as **ndarray.shape**, **ndarray.dtype** and **ndarray.context**,
+you'll also want to query **ndarray.stype**: the storage type of the NDArray. For a usual dense NDArray,
+the value of stype is **"default"**. For a CSRNDArray, the value of stype is **"csr"**.
+
+## Prerequisites
+
+To complete this tutorial, we need:
+
+- MXNet. See the instructions for your operating system in [Setup and 
Installation](http://mxnet.io/get_started/install.html)
+- [Jupyter](http://jupyter.org/)
+```
+pip install jupyter
+```
+- SciPy - A section of this tutorial uses the SciPy package in Python. If you don't have SciPy,
+the example in that section will be ignored.
+- GPUs - A section of this tutorial uses GPUs. If you don't have GPUs on your
+machine, simply set the variable gpu_device (set in the GPUs section of this
+tutorial) to mx.cpu().
+
+## Compressed Sparse Row Format
+
+A CSRNDArray represents a 2D matrix as three separate 1D arrays: **data**,
+**indptr** and **indices**, where the column indices for
+row ``i`` are stored in ``indices[indptr[i]:indptr[i+1]]`` in ascending order,
+and their corresponding values are stored in ``data[indptr[i]:indptr[i+1]]``.
+
+For example, the CSR representation for matrix
+```
+[[7, 0, 8, 0]
+ [0, 0, 0, 0]
+ [0, 9, 0, 0]]
+```
+is:
+```
+[7, 8, 9]  # data
+[0, 2, 1]  # indices
+[0, 2, 2, 3]   # indptr
+```
+
+Note that in MXNet, the column indices for a given row are always sorted in 
ascending order,
+and duplicated column entries for the same row are not allowed.
+
+## Array Creation
+
+There are a few different ways to create a `CSRNDArray`.
+
+* We can create a CSRNDArray with data, indices and indptr by using the 
`csr_matrix` function:
+
+```python
+import mxnet as mx
+import numpy as np
+# create a CSRNDArray with python lists
+shape = (3, 4)
+data_list = [7, 8, 9]
+indptr_list = [0, 2, 2, 3]
+indices_list = [0, 2, 1]
+a = mx.nd.sparse.csr_matrix(data_list, indptr_list, indices_list, shape)
+# create a CSRNDArray with numpy arrays
+data_np = np.array([7, 8, 9])
+indptr_np = np.array([0, 2, 2, 3])
+indices_np = np.array([0, 2, 1])
+b = mx.nd.sparse.csr_matrix(data_np, indptr_np, indices_np, shape)
+{'a':a, 'b':b}
+```
+
+* We can also create an MXNet CSRNDArray from a `scipy.sparse.csr.csr_matrix` 
object by using the `array` function:
+
+```python
+try:
+    import scipy.sparse as spsp
+    # generate a csr matrix in scipy
+    c = spsp.csr.csr_matrix((data_np, indices_np, indptr_np), shape=shape)
+    # create a CSRNDArray from a scipy csr object
+    d = mx.nd.sparse.array(c)
 
 Review comment:
   This feature is included in #7638. Probably it's still too early to review 
this since #7638 is not merged in ... Sorry about that
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] eric-haibin-lin commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
eric-haibin-lin commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135960903
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,258 @@
+# CSRNDArray - NDArray in Compressed Sparse Row Storage Format
+
+Many real world datasets deal with high dimensional sparse feature vectors. 
For instance,
+in a recommendation system, the number of categories and users is in the order 
of millions,
+while most users only made a few purchases, leading to feature vectors with 
high sparsity
+(i.e. most of the elements are zeros).
+
+Storing and manipulating such large sparse matrices in the default dense 
structure results
+in wasted memory and processing on the zeros.
+To take advantage of the sparse structure of the matrix, the ``CSRNDArray`` in 
MXNet
+stores the matrix in [compressed sparse 
row(CSR)](https://en.wikipedia.org/wiki/Sparse_matrix#Compressed_sparse_row_.28CSR.2C_CRS_or_Yale_format.29)
 format
+and uses specialized algorithms in operators.
+The format is designed for 2D matrices with a large number of columns,
+and each row is sparse(i.e. with only a few nonzeros).
+For matrices of high sparsity (e.g. ~1% non-zeros), the advantage of 
``CSRNDArray`` over
+the existing ``NDArray`` is that
+
+- memory consumption is reduced significantly
+- certain operations (e.g. matrix-vector multiplication) are much faster
+
+Meanwhile, ``CSRNDArray`` inherits competitive features from ``NDArray`` such 
as
+lazy evaluation and automatic parallelization, which are not available in the
+scientific computing python package [SciPy](https://www.scipy.org/).
+
+Apart from often queried attributes such as **ndarray.shape**, 
**ndarray.dtype** and **ndarray.context**,
+you?ll also want to query **ndarray.stype**: the storage type of the NDArray. 
For a usual dense NDArray,
+the value of stype is **"default"**. For an CSRNDArray, the value of stype is 
**"csr"**.
+
+## Prerequisites
+
+To complete this tutorial, we need:
+
+- MXNet. See the instructions for your operating system in [Setup and 
Installation](http://mxnet.io/get_started/install.html)
+- [Jupyter](http://jupyter.org/)
+```
+pip install jupyter
+```
+- Scipy - A section of this tutorial uses Scipy package in python. If you 
don't have Scipy,
+the example in that section will be ignored.
+- GPUs - A section of this tutorial uses GPUs. If you don't have GPUs on your
+machine, simply set the variable gpu_device (set in the GPUs section of this
+tutorial) to mx.cpu().
+
+## Compressed Sparse Row Format
+
+A CSRNDArray represents a 2D matrix as three separate 1D arrays: **data**,
+**indptr** and **indices**, where the column indices for
+row ``i`` are stored in ``indices[indptr[i]:indptr[i+1]]`` in ascending order,
+and their corresponding values are stored in ``data[indptr[i]:indptr[i+1]]``.
+
+For example, the CSR representation for matrix
+```
+[[7, 0, 8, 0]
+ [0, 0, 0, 0]
+ [0, 9, 0, 0]]
+```
+is:
+```
+[7, 8, 9]  # data
+[0, 2, 1]  # indices
+[0, 2, 2, 3]   # indptr
+```
+
+Note that in MXNet, the column indices for a given row are always sorted in 
ascending order,
+and duplicated column entries for the same row are not allowed.
+
+## Array Creation
+
+There are a few different ways to create a `CSRNDArray`.
+
+* We can create a CSRNDArray with data, indices and indptr by using the 
`csr_matrix` function:
+
+```python
+import mxnet as mx
+import numpy as np
+# create a CSRNDArray with python lists
+shape = (3, 4)
+data_list = [7, 8, 9]
+indptr_list = [0, 2, 2, 3]
+indices_list = [0, 2, 1]
+a = mx.nd.sparse.csr_matrix(data_list, indptr_list, indices_list, shape)
+# create a CSRNDArray with numpy arrays
+data_np = np.array([7, 8, 9])
+indptr_np = np.array([0, 2, 2, 3])
+indices_np = np.array([0, 2, 1])
+b = mx.nd.sparse.csr_matrix(data_np, indptr_np, indices_np, shape)
+{'a':a, 'b':b}
+```
+
+* We can also create an MXNet CSRNDArray from a `scipy.sparse.csr.csr_matrix` 
object by using the `array` function:
+
+```python
+try:
+import scipy.sparse as spsp
+# generate a csr matrix in scipy
+c = spsp.csr.csr_matrix((data_np, indices_np, indptr_np), shape=shape)
+# create a CSRNDArray from a scipy csr object
+d = mx.nd.sparse.array(c)
+{'d':d}
+except ImportError:
+print("scipy package is required")
+```
+
+We can specify the element data type with the option `dtype`, which accepts a 
numpy
+type. By default, `float32` is used.
+
+```python
+# float32 is used in default
+e = mx.nd.sparse.array(a)
+# create a 16-bit float array
+f = mx.nd.array(a, dtype=np.float16)
+(e.dtype, f.dtype)
+```
+
+## Inspecting Arrays
+
+* We can inspect the contents of an `CSRNDArray` by filling
+its contents into a dense `numpy.ndarray` using the `asnumpy` function.
+
+```python
+a.asnumpy()
+```
+
+* We can also inspect the internal storage of a CSRNDArray by accessing 
attributes such as `indptr`, `indices` and `data`:

[GitHub] eric-haibin-lin commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
eric-haibin-lin commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135960822
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,258 @@
+# CSRNDArray - NDArray in Compressed Sparse Row Storage Format
+
+Many real world datasets deal with high dimensional sparse feature vectors. 
For instance,
+in a recommendation system, the number of categories and users is in the order 
of millions,
+while most users only made a few purchases, leading to feature vectors with 
high sparsity
+(i.e. most of the elements are zeros).
+
+Storing and manipulating such large sparse matrices in the default dense 
structure results
+in wasted memory and processing on the zeros.
+To take advantage of the sparse structure of the matrix, the ``CSRNDArray`` in 
MXNet
+stores the matrix in [compressed sparse 
row(CSR)](https://en.wikipedia.org/wiki/Sparse_matrix#Compressed_sparse_row_.28CSR.2C_CRS_or_Yale_format.29)
 format
+and uses specialized algorithms in operators.
+The format is designed for 2D matrices with a large number of columns,
+and each row is sparse(i.e. with only a few nonzeros).
+For matrices of high sparsity (e.g. ~1% non-zeros), the advantage of 
``CSRNDArray`` over
+the existing ``NDArray`` is that
+
+- memory consumption is reduced significantly
+- certain operations (e.g. matrix-vector multiplication) are much faster
+
+Meanwhile, ``CSRNDArray`` inherits competitive features from ``NDArray`` such 
as
+lazy evaluation and automatic parallelization, which are not available in the
+scientific computing python package [SciPy](https://www.scipy.org/).
+
+Apart from often queried attributes such as **ndarray.shape**, 
**ndarray.dtype** and **ndarray.context**,
+you?ll also want to query **ndarray.stype**: the storage type of the NDArray. 
For a usual dense NDArray,
+the value of stype is **"default"**. For an CSRNDArray, the value of stype is 
**"csr"**.
+
+## Prerequisites
+
+To complete this tutorial, we need:
+
+- MXNet. See the instructions for your operating system in [Setup and 
Installation](http://mxnet.io/get_started/install.html)
+- [Jupyter](http://jupyter.org/)
+```
+pip install jupyter
+```
+- Scipy - A section of this tutorial uses Scipy package in python. If you 
don't have Scipy,
+the example in that section will be ignored.
+- GPUs - A section of this tutorial uses GPUs. If you don't have GPUs on your
+machine, simply set the variable gpu_device (set in the GPUs section of this
+tutorial) to mx.cpu().
+
+## Compressed Sparse Row Format
+
+A CSRNDArray represents a 2D matrix as three separate 1D arrays: **data**,
+**indptr** and **indices**, where the column indices for
+row ``i`` are stored in ``indices[indptr[i]:indptr[i+1]]`` in ascending order,
+and their corresponding values are stored in ``data[indptr[i]:indptr[i+1]]``.
+
+For example, the CSR representation for matrix
+```
+[[7, 0, 8, 0]
+ [0, 0, 0, 0]
+ [0, 9, 0, 0]]
+```
+is:
+```
+[7, 8, 9]  # data
+[0, 2, 1]  # indices
+[0, 2, 2, 3]   # indptr
+```
+
+Note that in MXNet, the column indices for a given row are always sorted in 
ascending order,
+and duplicated column entries for the same row are not allowed.
+
+## Array Creation
+
+There are a few different ways to create a `CSRNDArray`.
+
+* We can create a CSRNDArray with data, indices and indptr by using the 
`csr_matrix` function:
+
+```python
+import mxnet as mx
+import numpy as np
+# create a CSRNDArray with python lists
+shape = (3, 4)
+data_list = [7, 8, 9]
+indptr_list = [0, 2, 2, 3]
+indices_list = [0, 2, 1]
+a = mx.nd.sparse.csr_matrix(data_list, indptr_list, indices_list, shape)
+# create a CSRNDArray with numpy arrays
+data_np = np.array([7, 8, 9])
+indptr_np = np.array([0, 2, 2, 3])
+indices_np = np.array([0, 2, 1])
+b = mx.nd.sparse.csr_matrix(data_np, indptr_np, indices_np, shape)
+{'a':a, 'b':b}
+```
+
+* We can also create an MXNet CSRNDArray from a `scipy.sparse.csr.csr_matrix` 
object by using the `array` function:
+
+```python
+try:
+import scipy.sparse as spsp
+# generate a csr matrix in scipy
+c = spsp.csr.csr_matrix((data_np, indices_np, indptr_np), shape=shape)
+# create a CSRNDArray from a scipy csr object
+d = mx.nd.sparse.array(c)
+{'d':d}
+except ImportError:
+print("scipy package is required")
+```
+
+We can specify the element data type with the option `dtype`, which accepts a 
numpy
+type. By default, `float32` is used.
+
+```python
+# float32 is used in default
+e = mx.nd.sparse.array(a)
+# create a 16-bit float array
+f = mx.nd.array(a, dtype=np.float16)
+(e.dtype, f.dtype)
+```
+
+## Inspecting Arrays
+
+* We can inspect the contents of an `CSRNDArray` by filling
+its contents into a dense `numpy.ndarray` using the `asnumpy` function.
+
+```python
+a.asnumpy()
+```
+
+* We can also inspect the internal storage of a CSRNDArray by accessing 
attributes such as `indptr`, `indices` and `data`:
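+
+```python
+# a sketch, assuming each accessor returns an NDArray copy of the internal array
+print(a.data.asnumpy())     # the stored values: [7. 8. 9.]
+print(a.indices.asnumpy())  # their column indices: [0 2 1]
+print(a.indptr.asnumpy())   # row index pointers: [0 2 2 3]
+```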

[GitHub] eric-haibin-lin commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
eric-haibin-lin commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135960702
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,258 @@
+# CSRNDArray - NDArray in Compressed Sparse Row Storage Format
+
+Many real-world datasets deal with high-dimensional sparse feature vectors. For instance,
+in a recommendation system the number of categories and users is on the order of millions,
+while most users make only a few purchases, leading to feature vectors with high sparsity
+(i.e. most of the elements are zeros).
+
+Storing and manipulating such large sparse matrices in the default dense structure results
+in wasted memory and processing on the zeros.
+To take advantage of the sparse structure of the matrix, the ``CSRNDArray`` in MXNet
+stores the matrix in [compressed sparse row (CSR)](https://en.wikipedia.org/wiki/Sparse_matrix#Compressed_sparse_row_.28CSR.2C_CRS_or_Yale_format.29) format
+and uses specialized algorithms in operators.
+The format is designed for 2D matrices with a large number of columns,
+where each row is sparse (i.e. with only a few nonzeros).
+For matrices of high sparsity (e.g. ~1% non-zeros), the advantage of ``CSRNDArray`` over
+the existing ``NDArray`` is that
+
+- memory consumption is reduced significantly.
+- certain operations (e.g. matrix-vector multiplication) are much faster.
+
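+As a rough back-of-the-envelope check of the memory claim (a sketch only; it assumes
+float32 values with 64-bit `indices` and `indptr`, and ignores per-object overhead):
+
+```python
+rows, cols, nnz = 10**6, 10**6, 10**7      # ~0.001% non-zeros
+dense_bytes = rows * cols * 4              # float32 dense storage
+csr_bytes = nnz * (4 + 8) + (rows + 1) * 8 # data + indices + indptr
+print(dense_bytes // csr_bytes)            # roughly 30000x smaller
+```
+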
+Meanwhile, ``CSRNDArray`` inherits convenient features from ``NDArray``, such as
+lazy evaluation and automatic parallelization, which are not available in the
+scientific computing Python package [SciPy](https://www.scipy.org/).
+
+Apart from often-queried attributes such as **ndarray.shape**, **ndarray.dtype** and **ndarray.context**,
+you'll also want to query **ndarray.stype**: the storage type of the NDArray. For a usual dense NDArray,
+the value of stype is **"default"**. For a CSRNDArray, the value of stype is **"csr"**.
+
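+For example (a minimal sketch, assuming the `tostype` conversion helper is available):
+
+```python
+import mxnet as mx
+dense = mx.nd.zeros((2, 2))
+sparse = dense.tostype('csr')      # convert the dense array to CSR storage
+print(dense.stype, sparse.stype)   # default csr
+```
+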
+## Prerequisites
+
+To complete this tutorial, we need:
+
+- MXNet. See the instructions for your operating system in [Setup and 
Installation](http://mxnet.io/get_started/install.html)
+- [Jupyter](http://jupyter.org/)
+```
+pip install jupyter
+```
+- SciPy - A section of this tutorial uses the SciPy package in Python. If you don't have SciPy,
+the example in that section will be skipped.
+- GPUs - A section of this tutorial uses GPUs. If you don't have GPUs on your
+machine, simply set the variable `gpu_device` (set in the GPUs section of this
+tutorial) to `mx.cpu()`.
+
+## Compressed Sparse Row Format
+
+A CSRNDArray represents a 2D matrix as three separate 1D arrays: **data**,
+**indptr** and **indices**, where the column indices for
+row ``i`` are stored in ``indices[indptr[i]:indptr[i+1]]`` in ascending order,
+and their corresponding values are stored in ``data[indptr[i]:indptr[i+1]]``.
+
+For example, the CSR representation for matrix
+```
+[[7, 0, 8, 0]
+ [0, 0, 0, 0]
+ [0, 9, 0, 0]]
+```
+is:
+```
+[7, 8, 9]  # data
+[0, 2, 1]  # indices
+[0, 2, 2, 3]   # indptr
+```
+
+Note that in MXNet, the column indices for a given row are always sorted in 
ascending order,
+and duplicated column entries for the same row are not allowed.
+
+## Array Creation
+
+There are a few different ways to create a `CSRNDArray`.
+
+* We can create a CSRNDArray with data, indices and indptr by using the 
`csr_matrix` function:
+
+```python
+import mxnet as mx
+import numpy as np
+# create a CSRNDArray with python lists
+shape = (3, 4)
+data_list = [7, 8, 9]
+indptr_list = [0, 2, 2, 3]
+indices_list = [0, 2, 1]
+a = mx.nd.sparse.csr_matrix(data_list, indptr_list, indices_list, shape)
+# create a CSRNDArray with numpy arrays
+data_np = np.array([7, 8, 9])
+indptr_np = np.array([0, 2, 2, 3])
+indices_np = np.array([0, 2, 1])
+b = mx.nd.sparse.csr_matrix(data_np, indptr_np, indices_np, shape)
+{'a':a, 'b':b}
+```
+
+* We can also create an MXNet CSRNDArray from a `scipy.sparse.csr.csr_matrix` 
object by using the `array` function:
+
+```python
+try:
+import scipy.sparse as spsp
+# generate a csr matrix in scipy
+c = spsp.csr.csr_matrix((data_np, indices_np, indptr_np), shape=shape)
+# create a CSRNDArray from a scipy csr object
+d = mx.nd.sparse.array(c)
+{'d':d}
+except ImportError:
+print("scipy package is required")
+```
+
+We can specify the element data type with the option `dtype`, which accepts a 
numpy
+type. By default, `float32` is used.
+
+```python
+# float32 is used by default
+e = mx.nd.sparse.array(a)
+# create a 16-bit float array
+f = mx.nd.sparse.array(a, dtype=np.float16)
+(e.dtype, f.dtype)
+```
+
+## Inspecting Arrays
+
+* We can inspect the contents of a `CSRNDArray` by converting
+its contents into a dense `numpy.ndarray` using the `asnumpy` function.
+
+```python
+a.asnumpy()
+```
+
+* We can also inspect the internal storage of a CSRNDArray by accessing 
attributes such as `indptr`, `indices` and `data`:

[GitHub] eric-haibin-lin commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
eric-haibin-lin commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135960614
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,258 @@

[GitHub] eric-haibin-lin commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
eric-haibin-lin commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135960358
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,258 @@

[GitHub] LeonJWH opened a new issue #7664: index out of bound error when update eval metric

2017-08-29 Thread git
LeonJWH opened a new issue #7664: index out of bound error when update eval 
metric
URL: https://github.com/apache/incubator-mxnet/issues/7664
 
 
   Hi, I am training a binary classification model with my own dataset. I use the
**mx.image.ImageIter** API to load raw images according to the **.lst** file I
generated myself (using **img2rec.py**).
   I set up the data iter as below:
   ```
   train_iter = mx.image.ImageIter(
   batch_size   = batch_size,
   data_shape   = data_shape,
   path_imglist = '/database/liveness/data_prepare/liveness_train.lst',
   path_root= '/',
   data_name= 'data',
   label_name   = 'softmax_label',
   mean = np.array([123.68, 116.78, 103.94]),
   resize   = 224,
   rand_mirror  = True,
   shuffle  = False,
   inter_method = 1)
   ```
   And my **.lst** file is:
   ```
   16  1.0  /database/liveness/lf_face/real/5905c125337f3131b4f0856a_image_0.jpg
   17  1.0  /database/liveness/lf_face/real/58dd0a71337f311c56026c9e_image_0.jpg
   18  1.0  /database/liveness/lf_face/real/58fb11d6de6c741b3501fd52_image_0.jpg
   19  1.0  /database/liveness/lf_face/real/59060ef0337f317a0f74a42e_image_0.jpg
   ```
   Then I start training. The first epoch went well, but an error was reported
at the second epoch (epoch 1), as follows:
   ```
   Traceback (most recent call last):
     File "train.py", line 113, in <module>
       mod.update_metric(metric, batch.label)
     File "/dlproject/incubator-mxnet/python/mxnet/module/module.py", line 735, in update_metric
       self._exec_group.update_metric(eval_metric, labels)
     File "/dlproject/incubator-mxnet/python/mxnet/module/executor_group.py", line 582, in update_metric
       eval_metric.update_dict(labels_, preds)
     File "/dlproject/incubator-mxnet/python/mxnet/metric.py", line 280, in update_dict
       metric.update_dict(labels, preds)
     File "/dlproject/incubator-mxnet/python/mxnet/metric.py", line 108, in update_dict
       self.update(label, pred)
     File "/dlproject/incubator-mxnet/python/mxnet/metric.py", line 916, in update
       prob = pred[numpy.arange(label.shape[0]), numpy.int64(label)]
   IndexError: index -275 is out of bounds for axis 1 with size 2
   ```
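   The indexing in metric.py fails whenever a label value falls outside the
prediction's class axis; here is a minimal numpy sketch of the same failure,
assuming two output classes:
   ```
   import numpy as np
   pred = np.random.rand(4, 2)        # batch of 4 samples, 2 output classes
   label = np.array([0, 1, 1, -275])  # one corrupted / out-of-range label value
   # same indexing as mxnet/python/mxnet/metric.py, line 916 above
   prob = pred[np.arange(label.shape[0]), np.int64(label)]
   # IndexError: index -275 is out of bounds for axis 1 with size 2
   ```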
   I set the eval metric in my training script with two parts:
   ```
   eval_metric = mx.metric.CompositeEvalMetric()
   eval_metric.add(mx.metric.CrossEntropy())
   eval_metric.add(mx.metric.Accuracy())
   ```
   So what is wrong in my usage?
   Thx for your answer!
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135957878
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,258 @@

[GitHub] bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135952698
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,258 @@
+For matrices of high sparsity (e.g. ~1% non-zeros), the advantage of 
``CSRNDArray`` over
+the existing ``NDArray`` is that
+
+- memory consumption is reduced significantly
+- certain operations (e.g. matrix-vector multiplication) are much faster
 
 Review comment:
   Add a period after faster.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135955870
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,258 @@
+We can specify the element data type with the option `dtype`, which accepts a 
numpy
+type. By default, `float32` is used.
+
+```python
+# float32 is used in default
 
 Review comment:
   change "in" to "by" -- by default.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135955792
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,258 @@
+* We can also create an MXNet CSRNDArray from a `scipy.sparse.csr.csr_matrix` 
object by using the `array` function:
+
+```python
+try:
+import scipy.sparse as spsp
+# generate a csr matrix in scipy
+c = spsp.csr.csr_matrix((data_np, indices_np, indptr_np), shape=shape)
+# create a CSRNDArray from a scipy csr object
+d = mx.nd.sparse.array(c)
 
 Review comment:
   I get the following error:
   >>> import scipy.sparse as spsp
   >>> c = spsp.csr.csr_matrix((data_np, indices_np, indptr_np), shape=shape)
   >>> d = mx.nd.sparse.array(c)
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "/Users/thakerb/anaconda3/lib/python3.5/site-packages/mxnet-0.11.1-py3.5.egg/mxnet/ndarray/sparse.py", line 919, in array
       raise NotImplementedError('creating BaseSparseNDArray from ' \
   NotImplementedError: creating BaseSparseNDArray from a non-NDArray object is not implemented.
 

This is an automated message from the Apache Git Service.
To 

[GitHub] bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135958319
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,258 @@

[GitHub] bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135957072
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,258 @@

[GitHub] bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135954787
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,258 @@
+# CSRNDArray - NDArray in Compressed Sparse Row Storage Format
+
+Many real world datasets deal with high dimensional sparse feature vectors. 
For instance,
+in a recommendation system, the number of categories and users is in the order 
of millions,
+while most users only made a few purchases, leading to feature vectors with 
high sparsity
+(i.e. most of the elements are zeros).
+
+Storing and manipulating such large sparse matrices in the default dense 
structure results
+in wasted memory and processing on the zeros.
+To take advantage of the sparse structure of the matrix, the ``CSRNDArray`` in 
MXNet
+stores the matrix in [compressed sparse 
row(CSR)](https://en.wikipedia.org/wiki/Sparse_matrix#Compressed_sparse_row_.28CSR.2C_CRS_or_Yale_format.29)
 format
+and uses specialized algorithms in operators.
+The format is designed for 2D matrices with a large number of columns,
+and each row is sparse(i.e. with only a few nonzeros).
+For matrices of high sparsity (e.g. ~1% non-zeros), the advantage of 
``CSRNDArray`` over
+the existing ``NDArray`` is that
+
+- memory consumption is reduced significantly
+- certain operations (e.g. matrix-vector multiplication) are much faster
+
+Meanwhile, ``CSRNDArray`` inherits competitive features from ``NDArray`` such 
as
+lazy evaluation and automatic parallelization, which are not available in the
+scientific computing python package [SciPy](https://www.scipy.org/).
+
+Apart from often queried attributes such as **ndarray.shape**, 
**ndarray.dtype** and **ndarray.context**,
+you'll also want to query **ndarray.stype**: the storage type of the NDArray. 
For a usual dense NDArray,
+the value of stype is **"default"**. For a CSRNDArray, the value of stype is 
**"csr"**.
+
+## Prerequisites
+
+To complete this tutorial, we need:
+
+- MXNet. See the instructions for your operating system in [Setup and 
Installation](http://mxnet.io/get_started/install.html)
+- [Jupyter](http://jupyter.org/)
+```
+pip install jupyter
+```
+- SciPy - A section of this tutorial uses the SciPy package in Python. If you 
don't have SciPy,
+the example in that section will be ignored.
+- GPUs - A section of this tutorial uses GPUs. If you don't have GPUs on your
+machine, simply set the variable gpu_device (set in the GPUs section of this
+tutorial) to mx.cpu().
+
+## Compressed Sparse Row Format
+
+A CSRNDArray represents a 2D matrix as three separate 1D arrays: **data**,
+**indptr** and **indices**, where the column indices for
+row ``i`` are stored in ``indices[indptr[i]:indptr[i+1]]`` in ascending order,
+and their corresponding values are stored in ``data[indptr[i]:indptr[i+1]]``.
+
+For example, the CSR representation for matrix
+```
+[[7, 0, 8, 0]
+ [0, 0, 0, 0]
+ [0, 9, 0, 0]]
+```
+is:
+```
+[7, 8, 9]  # data
+[0, 2, 1]  # indices
+[0, 2, 2, 3]   # indptr
+```
+
 
 Review comment:
  Suggestion: I think the suggested text below may help newbies understand the 
various numbers better. Try it with a newbie if you like (correct the text 
spacing appropriately).
  
  [7, 8, 9]    # data: flattened representation of the dense matrix in row-major format, with all zeros removed.
  [0, 2, 1]    # indices: column indices of the non-zero elements in the dense matrix.
  [0, 2, 2, 3] # indptr: index pointers into data[] that mark where each row of the dense matrix starts.
               # i.e. Row 0 starts at index pointer 0, pointing to element 7 in data[].
               # i.e. Row 1 starts at index pointer 2; since the next pointer is also 2, Row 1 has no nonzeros.
               # i.e. Row 2 starts at index pointer 2, pointing to element 9 in data[].
               # i.e. the last element of indptr always equals the length of data[], marking the end of data[].
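  To make this concrete, here is a small standalone sketch (plain NumPy; the array names mirror the example above, and the loop is illustrative rather than part of the tutorial) that rebuilds the dense matrix from the three CSR arrays:

```python
import numpy as np

data = np.array([7, 8, 9])
indices = np.array([0, 2, 1])
indptr = np.array([0, 2, 2, 3])

dense = np.zeros((3, 4))
for i in range(3):
    # the nonzeros of row i occupy positions indptr[i]:indptr[i+1]
    for j in range(indptr[i], indptr[i + 1]):
        dense[i, indices[j]] = data[j]
print(dense)
# [[7. 0. 8. 0.]
#  [0. 0. 0. 0.]
#  [0. 9. 0. 0.]]
```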
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135957134
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,258 @@
+# CSRNDArray - NDArray in Compressed Sparse Row Storage Format
+
+Many real world datasets deal with high dimensional sparse feature vectors. 
For instance,
+in a recommendation system, the number of categories and users is in the order 
of millions,
+while most users only made a few purchases, leading to feature vectors with 
high sparsity
+(i.e. most of the elements are zeros).
+
+Storing and manipulating such large sparse matrices in the default dense 
structure results
+in wasted memory and processing on the zeros.
+To take advantage of the sparse structure of the matrix, the ``CSRNDArray`` in 
MXNet
+stores the matrix in [compressed sparse 
row(CSR)](https://en.wikipedia.org/wiki/Sparse_matrix#Compressed_sparse_row_.28CSR.2C_CRS_or_Yale_format.29)
 format
+and uses specialized algorithms in operators.
+The format is designed for 2D matrices with a large number of columns,
+and each row is sparse(i.e. with only a few nonzeros).
+For matrices of high sparsity (e.g. ~1% non-zeros), the advantage of 
``CSRNDArray`` over
+the existing ``NDArray`` is that
+
+- memory consumption is reduced significantly
+- certain operations (e.g. matrix-vector multiplication) are much faster
+
+Meanwhile, ``CSRNDArray`` inherits competitive features from ``NDArray`` such 
as
+lazy evaluation and automatic parallelization, which are not available in the
+scientific computing python package [SciPy](https://www.scipy.org/).
+
+Apart from often queried attributes such as **ndarray.shape**, 
**ndarray.dtype** and **ndarray.context**,
+you'll also want to query **ndarray.stype**: the storage type of the NDArray. 
For a usual dense NDArray,
+the value of stype is **"default"**. For a CSRNDArray, the value of stype is 
**"csr"**.
+
+## Prerequisites
+
+To complete this tutorial, we need:
+
+- MXNet. See the instructions for your operating system in [Setup and 
Installation](http://mxnet.io/get_started/install.html)
+- [Jupyter](http://jupyter.org/)
+```
+pip install jupyter
+```
+- SciPy - A section of this tutorial uses the SciPy package in Python. If you 
don't have SciPy,
+the example in that section will be ignored.
+- GPUs - A section of this tutorial uses GPUs. If you don't have GPUs on your
+machine, simply set the variable gpu_device (set in the GPUs section of this
+tutorial) to mx.cpu().
+
+## Compressed Sparse Row Format
+
+A CSRNDArray represents a 2D matrix as three separate 1D arrays: **data**,
+**indptr** and **indices**, where the column indices for
+row ``i`` are stored in ``indices[indptr[i]:indptr[i+1]]`` in ascending order,
+and their corresponding values are stored in ``data[indptr[i]:indptr[i+1]]``.
+
+For example, the CSR representation for matrix
+```
+[[7, 0, 8, 0]
+ [0, 0, 0, 0]
+ [0, 9, 0, 0]]
+```
+is:
+```
+[7, 8, 9]  # data
+[0, 2, 1]  # indices
+[0, 2, 2, 3]   # indptr
+```
+
+Note that in MXNet, the column indices for a given row are always sorted in 
ascending order,
+and duplicated column entries for the same row are not allowed.
+
+## Array Creation
+
+There are a few different ways to create a `CSRNDArray`.
+
+* We can create a CSRNDArray with data, indices and indptr by using the 
`csr_matrix` function:
+
+```python
+import mxnet as mx
+import numpy as np
+# create a CSRNDArray with python lists
+shape = (3, 4)
+data_list = [7, 8, 9]
+indptr_list = [0, 2, 2, 3]
+indices_list = [0, 2, 1]
+a = mx.nd.sparse.csr_matrix(data_list, indptr_list, indices_list, shape)
+# create a CSRNDArray with numpy arrays
+data_np = np.array([7, 8, 9])
+indptr_np = np.array([0, 2, 2, 3])
+indices_np = np.array([0, 2, 1])
+b = mx.nd.sparse.csr_matrix(data_np, indptr_np, indices_np, shape)
+{'a':a, 'b':b}
+```
+
+* We can also create an MXNet CSRNDArray from a `scipy.sparse.csr.csr_matrix` 
object by using the `array` function:
+
+```python
+try:
+import scipy.sparse as spsp
+# generate a csr matrix in scipy
+c = spsp.csr.csr_matrix((data_np, indices_np, indptr_np), shape=shape)
+# create a CSRNDArray from a scipy csr object
+d = mx.nd.sparse.array(c)
+{'d':d}
+except ImportError:
+print("scipy package is required")
+```
+
+We can specify the element data type with the option `dtype`, which accepts a 
numpy
+type. By default, `float32` is used.
+
+```python
+# float32 is used by default
+e = mx.nd.sparse.array(a)
+# create a 16-bit float array
+f = mx.nd.array(a, dtype=np.float16)
+(e.dtype, f.dtype)
+```
+
+## Inspecting Arrays
+
+* We can inspect the contents of a `CSRNDArray` by converting it
+to a dense `numpy.ndarray` using the `asnumpy` function.
+
+```python
+a.asnumpy()
+```
+
+* We can also inspect the internal storage of a CSRNDArray by accessing 
attributes such as `indptr`, `indices` and `data`:
+

[GitHub] bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135958077
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,258 @@
+# CSRNDArray - NDArray in Compressed Sparse Row Storage Format
+
+Many real world datasets deal with high dimensional sparse feature vectors. 
For instance,
+in a recommendation system, the number of categories and users is in the order 
of millions,
+while most users only made a few purchases, leading to feature vectors with 
high sparsity
+(i.e. most of the elements are zeros).
+
+Storing and manipulating such large sparse matrices in the default dense 
structure results
+in wasted memory and processing on the zeros.
+To take advantage of the sparse structure of the matrix, the ``CSRNDArray`` in 
MXNet
+stores the matrix in [compressed sparse 
row(CSR)](https://en.wikipedia.org/wiki/Sparse_matrix#Compressed_sparse_row_.28CSR.2C_CRS_or_Yale_format.29)
 format
+and uses specialized algorithms in operators.
+The format is designed for 2D matrices with a large number of columns,
+and each row is sparse(i.e. with only a few nonzeros).
+For matrices of high sparsity (e.g. ~1% non-zeros), the advantage of 
``CSRNDArray`` over
+the existing ``NDArray`` is that
+
+- memory consumption is reduced significantly
+- certain operations (e.g. matrix-vector multiplication) are much faster
+
+Meanwhile, ``CSRNDArray`` inherits competitive features from ``NDArray`` such 
as
+lazy evaluation and automatic parallelization, which are not available in the
+scientific computing python package [SciPy](https://www.scipy.org/).
+
+Apart from often queried attributes such as **ndarray.shape**, 
**ndarray.dtype** and **ndarray.context**,
+you'll also want to query **ndarray.stype**: the storage type of the NDArray. 
For a usual dense NDArray,
+the value of stype is **"default"**. For a CSRNDArray, the value of stype is 
**"csr"**.
+
+## Prerequisites
+
+To complete this tutorial, we need:
+
+- MXNet. See the instructions for your operating system in [Setup and 
Installation](http://mxnet.io/get_started/install.html)
+- [Jupyter](http://jupyter.org/)
+```
+pip install jupyter
+```
+- SciPy - A section of this tutorial uses the SciPy package in Python. If you 
don't have SciPy,
+the example in that section will be ignored.
+- GPUs - A section of this tutorial uses GPUs. If you don't have GPUs on your
+machine, simply set the variable gpu_device (set in the GPUs section of this
+tutorial) to mx.cpu().
+
+## Compressed Sparse Row Format
+
+A CSRNDArray represents a 2D matrix as three separate 1D arrays: **data**,
+**indptr** and **indices**, where the column indices for
+row ``i`` are stored in ``indices[indptr[i]:indptr[i+1]]`` in ascending order,
+and their corresponding values are stored in ``data[indptr[i]:indptr[i+1]]``.
+
+For example, the CSR representation for matrix
+```
+[[7, 0, 8, 0]
+ [0, 0, 0, 0]
+ [0, 9, 0, 0]]
+```
+is:
+```
+[7, 8, 9]  # data
+[0, 2, 1]  # indices
+[0, 2, 2, 3]   # indptr
+```
+
+Note that in MXNet, the column indices for a given row are always sorted in 
ascending order,
+and duplicated column entries for the same row are not allowed.
+
+## Array Creation
+
+There are a few different ways to create a `CSRNDArray`.
+
+* We can create a CSRNDArray with data, indices and indptr by using the 
`csr_matrix` function:
+
+```python
+import mxnet as mx
+import numpy as np
+# create a CSRNDArray with python lists
+shape = (3, 4)
+data_list = [7, 8, 9]
+indptr_list = [0, 2, 2, 3]
+indices_list = [0, 2, 1]
+a = mx.nd.sparse.csr_matrix(data_list, indptr_list, indices_list, shape)
+# create a CSRNDArray with numpy arrays
+data_np = np.array([7, 8, 9])
+indptr_np = np.array([0, 2, 2, 3])
+indices_np = np.array([0, 2, 1])
+b = mx.nd.sparse.csr_matrix(data_np, indptr_np, indices_np, shape)
+{'a':a, 'b':b}
+```
+
+* We can also create an MXNet CSRNDArray from a `scipy.sparse.csr.csr_matrix` 
object by using the `array` function:
+
+```python
+try:
+import scipy.sparse as spsp
+# generate a csr matrix in scipy
+c = spsp.csr.csr_matrix((data_np, indices_np, indptr_np), shape=shape)
+# create a CSRNDArray from a scipy csr object
+d = mx.nd.sparse.array(c)
+{'d':d}
+except ImportError:
+print("scipy package is required")
+```
+
+We can specify the element data type with the option `dtype`, which accepts a 
numpy
+type. By default, `float32` is used.
+
+```python
+# float32 is used by default
+e = mx.nd.sparse.array(a)
+# create a 16-bit float array
+f = mx.nd.array(a, dtype=np.float16)
+(e.dtype, f.dtype)
+```
+
+## Inspecting Arrays
+
+* We can inspect the contents of a `CSRNDArray` by converting it
+to a dense `numpy.ndarray` using the `asnumpy` function.
+
+```python
+a.asnumpy()
+```
+
+* We can also inspect the internal storage of a CSRNDArray by accessing 
attributes such as `indptr`, `indices` and `data`:
+

[GitHub] bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135957219
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,258 @@
+# CSRNDArray - NDArray in Compressed Sparse Row Storage Format
+
+Many real world datasets deal with high dimensional sparse feature vectors. 
For instance,
+in a recommendation system, the number of categories and users is in the order 
of millions,
+while most users only made a few purchases, leading to feature vectors with 
high sparsity
+(i.e. most of the elements are zeros).
+
+Storing and manipulating such large sparse matrices in the default dense 
structure results
+in wasted memory and processing on the zeros.
+To take advantage of the sparse structure of the matrix, the ``CSRNDArray`` in 
MXNet
+stores the matrix in [compressed sparse 
row(CSR)](https://en.wikipedia.org/wiki/Sparse_matrix#Compressed_sparse_row_.28CSR.2C_CRS_or_Yale_format.29)
 format
+and uses specialized algorithms in operators.
+The format is designed for 2D matrices with a large number of columns,
+and each row is sparse(i.e. with only a few nonzeros).
+For matrices of high sparsity (e.g. ~1% non-zeros), the advantage of 
``CSRNDArray`` over
+the existing ``NDArray`` is that
+
+- memory consumption is reduced significantly
+- certain operations (e.g. matrix-vector multiplication) are much faster
+
+Meanwhile, ``CSRNDArray`` inherits competitive features from ``NDArray`` such 
as
+lazy evaluation and automatic parallelization, which are not available in the
+scientific computing python package [SciPy](https://www.scipy.org/).
+
+Apart from often queried attributes such as **ndarray.shape**, 
**ndarray.dtype** and **ndarray.context**,
+you'll also want to query **ndarray.stype**: the storage type of the NDArray. 
For a usual dense NDArray,
+the value of stype is **"default"**. For a CSRNDArray, the value of stype is 
**"csr"**.
+
+## Prerequisites
+
+To complete this tutorial, we need:
+
+- MXNet. See the instructions for your operating system in [Setup and 
Installation](http://mxnet.io/get_started/install.html)
+- [Jupyter](http://jupyter.org/)
+```
+pip install jupyter
+```
+- SciPy - A section of this tutorial uses the SciPy package in Python. If you 
don't have SciPy,
+the example in that section will be ignored.
+- GPUs - A section of this tutorial uses GPUs. If you don't have GPUs on your
+machine, simply set the variable gpu_device (set in the GPUs section of this
+tutorial) to mx.cpu().
+
+## Compressed Sparse Row Format
+
+A CSRNDArray represents a 2D matrix as three separate 1D arrays: **data**,
+**indptr** and **indices**, where the column indices for
+row ``i`` are stored in ``indices[indptr[i]:indptr[i+1]]`` in ascending order,
+and their corresponding values are stored in ``data[indptr[i]:indptr[i+1]]``.
+
+For example, the CSR representation for matrix
+```
+[[7, 0, 8, 0]
+ [0, 0, 0, 0]
+ [0, 9, 0, 0]]
+```
+is:
+```
+[7, 8, 9]  # data
+[0, 2, 1]  # indices
+[0, 2, 2, 3]   # indptr
+```
+
+Note that in MXNet, the column indices for a given row are always sorted in 
ascending order,
+and duplicated column entries for the same row are not allowed.
+
+## Array Creation
+
+There are a few different ways to create a `CSRNDArray`.
+
+* We can create a CSRNDArray with data, indices and indptr by using the 
`csr_matrix` function:
+
+```python
+import mxnet as mx
+import numpy as np
+# create a CSRNDArray with python lists
+shape = (3, 4)
+data_list = [7, 8, 9]
+indptr_list = [0, 2, 2, 3]
+indices_list = [0, 2, 1]
+a = mx.nd.sparse.csr_matrix(data_list, indptr_list, indices_list, shape)
+# create a CSRNDArray with numpy arrays
+data_np = np.array([7, 8, 9])
+indptr_np = np.array([0, 2, 2, 3])
+indices_np = np.array([0, 2, 1])
+b = mx.nd.sparse.csr_matrix(data_np, indptr_np, indices_np, shape)
+{'a':a, 'b':b}
+```
+
+* We can also create an MXNet CSRNDArray from a `scipy.sparse.csr.csr_matrix` 
object by using the `array` function:
+
+```python
+try:
+import scipy.sparse as spsp
+# generate a csr matrix in scipy
+c = spsp.csr.csr_matrix((data_np, indices_np, indptr_np), shape=shape)
+# create a CSRNDArray from a scipy csr object
+d = mx.nd.sparse.array(c)
+{'d':d}
+except ImportError:
+print("scipy package is required")
+```
+
+We can specify the element data type with the option `dtype`, which accepts a 
numpy
+type. By default, `float32` is used.
+
+```python
+# float32 is used by default
+e = mx.nd.sparse.array(a)
+# create a 16-bit float array
+f = mx.nd.array(a, dtype=np.float16)
+(e.dtype, f.dtype)
+```
+
+## Inspecting Arrays
+
+* We can inspect the contents of a `CSRNDArray` by converting it
+to a dense `numpy.ndarray` using the `asnumpy` function.
+
+```python
+a.asnumpy()
+```
+
+* We can also inspect the internal storage of a CSRNDArray by accessing 
attributes such as `indptr`, `indices` and `data`:
+

[GitHub] bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135953019
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,258 @@
+# CSRNDArray - NDArray in Compressed Sparse Row Storage Format
+
+Many real world datasets deal with high dimensional sparse feature vectors. 
For instance,
+in a recommendation system, the number of categories and users is in the order 
of millions,
+while most users only made a few purchases, leading to feature vectors with 
high sparsity
+(i.e. most of the elements are zeros).
+
+Storing and manipulating such large sparse matrices in the default dense 
structure results
+in wasted memory and processing on the zeros.
+To take advantage of the sparse structure of the matrix, the ``CSRNDArray`` in 
MXNet
+stores the matrix in [compressed sparse 
row(CSR)](https://en.wikipedia.org/wiki/Sparse_matrix#Compressed_sparse_row_.28CSR.2C_CRS_or_Yale_format.29)
 format
+and uses specialized algorithms in operators.
+The format is designed for 2D matrices with a large number of columns,
+and each row is sparse(i.e. with only a few nonzeros).
+For matrices of high sparsity (e.g. ~1% non-zeros), the advantage of 
``CSRNDArray`` over
+the existing ``NDArray`` is that
+
+- memory consumption is reduced significantly
+- certain operations (e.g. matrix-vector multiplication) are much faster
+
+Meanwhile, ``CSRNDArray`` inherits competitive features from ``NDArray`` such 
as
+lazy evaluation and automatic parallelization, which are not available in the
+scientific computing python package [SciPy](https://www.scipy.org/).
+
+Apart from often queried attributes such as **ndarray.shape**, 
**ndarray.dtype** and **ndarray.context**,
+you'll also want to query **ndarray.stype**: the storage type of the NDArray. 
For a usual dense NDArray,
+the value of stype is **"default"**. For a CSRNDArray, the value of stype is 
**"csr"**.
+
+## Prerequisites
+
+To complete this tutorial, we need:
+
+- MXNet. See the instructions for your operating system in [Setup and 
Installation](http://mxnet.io/get_started/install.html)
+- [Jupyter](http://jupyter.org/)
+```
+pip install jupyter
+```
+- SciPy - A section of this tutorial uses the SciPy package in Python. If you 
don't have SciPy,
+the example in that section will be ignored.
 
 Review comment:
   Suggestion: change "will" to "can". Oh, I see that you use try-exception and 
so "will" is correct. Ignore my suggestion.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135952393
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,258 @@
+# CSRNDArray - NDArray in Compressed Sparse Row Storage Format
+
+Many real world datasets deal with high dimensional sparse feature vectors. 
For instance,
+in a recommendation system, the number of categories and users is in the order 
of millions,
+while most users only made a few purchases, leading to feature vectors with 
high sparsity
+(i.e. most of the elements are zeros).
+
+Storing and manipulating such large sparse matrices in the default dense 
structure results
+in wasted memory and processing on the zeros.
+To take advantage of the sparse structure of the matrix, the ``CSRNDArray`` in 
MXNet
+stores the matrix in [compressed sparse 
row(CSR)](https://en.wikipedia.org/wiki/Sparse_matrix#Compressed_sparse_row_.28CSR.2C_CRS_or_Yale_format.29)
 format
 
 Review comment:
   Suggestion: Add a space before the opening parenthesis throughout the 
document. Please check other occurrences in the doc and fix them as well.
   FYI: 
https://english.stackexchange.com/questions/5987/is-there-any-rule-for-the-placement-of-space-after-and-before-parenthesis
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135958153
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,258 @@
+# CSRNDArray - NDArray in Compressed Sparse Row Storage Format
+
+Many real world datasets deal with high dimensional sparse feature vectors. 
For instance,
+in a recommendation system, the number of categories and users is in the order 
of millions,
+while most users only made a few purchases, leading to feature vectors with 
high sparsity
+(i.e. most of the elements are zeros).
+
+Storing and manipulating such large sparse matrices in the default dense 
structure results
+in wasted memory and processing on the zeros.
+To take advantage of the sparse structure of the matrix, the ``CSRNDArray`` in 
MXNet
+stores the matrix in [compressed sparse 
row(CSR)](https://en.wikipedia.org/wiki/Sparse_matrix#Compressed_sparse_row_.28CSR.2C_CRS_or_Yale_format.29)
 format
+and uses specialized algorithms in operators.
+The format is designed for 2D matrices with a large number of columns,
+and each row is sparse(i.e. with only a few nonzeros).
+For matrices of high sparsity (e.g. ~1% non-zeros), the advantage of 
``CSRNDArray`` over
+the existing ``NDArray`` is that
+
+- memory consumption is reduced significantly
+- certain operations (e.g. matrix-vector multiplication) are much faster
+
+Meanwhile, ``CSRNDArray`` inherits competitive features from ``NDArray`` such 
as
+lazy evaluation and automatic parallelization, which are not available in the
+scientific computing python package [SciPy](https://www.scipy.org/).
+
+Apart from often queried attributes such as **ndarray.shape**, 
**ndarray.dtype** and **ndarray.context**,
+you'll also want to query **ndarray.stype**: the storage type of the NDArray. 
For a usual dense NDArray,
+the value of stype is **"default"**. For a CSRNDArray, the value of stype is 
**"csr"**.
+
+## Prerequisites
+
+To complete this tutorial, we need:
+
+- MXNet. See the instructions for your operating system in [Setup and 
Installation](http://mxnet.io/get_started/install.html)
+- [Jupyter](http://jupyter.org/)
+```
+pip install jupyter
+```
+- SciPy - A section of this tutorial uses the SciPy package in Python. If you 
don't have SciPy,
+the example in that section will be ignored.
+- GPUs - A section of this tutorial uses GPUs. If you don't have GPUs on your
+machine, simply set the variable gpu_device (set in the GPUs section of this
+tutorial) to mx.cpu().
+
+## Compressed Sparse Row Format
+
+A CSRNDArray represents a 2D matrix as three separate 1D arrays: **data**,
+**indptr** and **indices**, where the column indices for
+row ``i`` are stored in ``indices[indptr[i]:indptr[i+1]]`` in ascending order,
+and their corresponding values are stored in ``data[indptr[i]:indptr[i+1]]``.
+
+For example, the CSR representation for matrix
+```
+[[7, 0, 8, 0]
+ [0, 0, 0, 0]
+ [0, 9, 0, 0]]
+```
+is:
+```
+[7, 8, 9]  # data
+[0, 2, 1]  # indices
+[0, 2, 2, 3]   # indptr
+```
+
+Note that in MXNet, the column indices for a given row are always sorted in 
ascending order,
+and duplicated column entries for the same row are not allowed.
+
+## Array Creation
+
+There are a few different ways to create a `CSRNDArray`.
+
+* We can create a CSRNDArray with data, indices and indptr by using the 
`csr_matrix` function:
+
+```python
+import mxnet as mx
+import numpy as np
+# create a CSRNDArray with python lists
+shape = (3, 4)
+data_list = [7, 8, 9]
+indptr_list = [0, 2, 2, 3]
+indices_list = [0, 2, 1]
+a = mx.nd.sparse.csr_matrix(data_list, indptr_list, indices_list, shape)
+# create a CSRNDArray with numpy arrays
+data_np = np.array([7, 8, 9])
+indptr_np = np.array([0, 2, 2, 3])
+indices_np = np.array([0, 2, 1])
+b = mx.nd.sparse.csr_matrix(data_np, indptr_np, indices_np, shape)
+{'a':a, 'b':b}
+```
+
+* We can also create an MXNet CSRNDArray from a `scipy.sparse.csr.csr_matrix` 
object by using the `array` function:
+
+```python
+try:
+import scipy.sparse as spsp
+# generate a csr matrix in scipy
+c = spsp.csr.csr_matrix((data_np, indices_np, indptr_np), shape=shape)
+# create a CSRNDArray from a scipy csr object
+d = mx.nd.sparse.array(c)
+{'d':d}
+except ImportError:
+print("scipy package is required")
+```
+
+We can specify the element data type with the option `dtype`, which accepts a 
numpy
+type. By default, `float32` is used.
+
+```python
+# float32 is used by default
+e = mx.nd.sparse.array(a)
+# create a 16-bit float array
+f = mx.nd.array(a, dtype=np.float16)
+(e.dtype, f.dtype)
+```
+
+## Inspecting Arrays
+
+* We can inspect the contents of a `CSRNDArray` by converting it
+to a dense `numpy.ndarray` using the `asnumpy` function.
+
+```python
+a.asnumpy()
+```
+
+* We can also inspect the internal storage of a CSRNDArray by accessing 
attributes such as `indptr`, `indices` and `data`:
+

[GitHub] bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135952133
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,258 @@
+# CSRNDArray - NDArray in Compressed Sparse Row Storage Format
+
+Many real world datasets deal with high dimensional sparse feature vectors. 
For instance,
+in a recommendation system, the number of categories and users is in the order 
of millions,
+while most users only made a few purchases, leading to feature vectors with 
high sparsity
 
 Review comment:
   Make all sentences have a common tense -- which is the present tense here. 
   Suggestion: while most users typically make a few purchases only, which 
leads to ...
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135955435
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,258 @@
+# CSRNDArray - NDArray in Compressed Sparse Row Storage Format
+
+Many real world datasets deal with high dimensional sparse feature vectors. 
For instance,
+in a recommendation system, the number of categories and users is in the order 
of millions,
+while most users only made a few purchases, leading to feature vectors with 
high sparsity
+(i.e. most of the elements are zeros).
+
+Storing and manipulating such large sparse matrices in the default dense 
structure results
+in wasted memory and processing on the zeros.
+To take advantage of the sparse structure of the matrix, the ``CSRNDArray`` in 
MXNet
+stores the matrix in [compressed sparse 
row(CSR)](https://en.wikipedia.org/wiki/Sparse_matrix#Compressed_sparse_row_.28CSR.2C_CRS_or_Yale_format.29)
 format
+and uses specialized algorithms in operators.
+The format is designed for 2D matrices with a large number of columns,
+and each row is sparse(i.e. with only a few nonzeros).
+For matrices of high sparsity (e.g. ~1% non-zeros), the advantage of 
``CSRNDArray`` over
+the existing ``NDArray`` is that
+
+- memory consumption is reduced significantly
+- certain operations (e.g. matrix-vector multiplication) are much faster
+
+Meanwhile, ``CSRNDArray`` inherits competitive features from ``NDArray`` such 
as
+lazy evaluation and automatic parallelization, which are not available in the
+scientific computing python package [SciPy](https://www.scipy.org/).
+
+Apart from often queried attributes such as **ndarray.shape**, 
**ndarray.dtype** and **ndarray.context**,
+you'll also want to query **ndarray.stype**: the storage type of the NDArray. 
For a usual dense NDArray,
+the value of stype is **"default"**. For a CSRNDArray, the value of stype is 
**"csr"**.
+
+## Prerequisites
+
+To complete this tutorial, we need:
+
+- MXNet. See the instructions for your operating system in [Setup and 
Installation](http://mxnet.io/get_started/install.html)
+- [Jupyter](http://jupyter.org/)
+```
+pip install jupyter
+```
+- SciPy - A section of this tutorial uses the SciPy package in Python. If you 
don't have SciPy,
+the example in that section will be ignored.
+- GPUs - A section of this tutorial uses GPUs. If you don't have GPUs on your
+machine, simply set the variable gpu_device (set in the GPUs section of this
+tutorial) to mx.cpu().
+
+## Compressed Sparse Row Format
+
+A CSRNDArray represents a 2D matrix as three separate 1D arrays: **data**,
+**indptr** and **indices**, where the column indices for
+row ``i`` are stored in ``indices[indptr[i]:indptr[i+1]]`` in ascending order,
+and their corresponding values are stored in ``data[indptr[i]:indptr[i+1]]``.
+
+For example, the CSR representation for matrix
+```
+[[7, 0, 8, 0]
+ [0, 0, 0, 0]
+ [0, 9, 0, 0]]
+```
+is:
+```
+[7, 8, 9]  # data
+[0, 2, 1]  # indices
+[0, 2, 2, 3]   # indptr
+```
+
+Note that in MXNet, the column indices for a given row are always sorted in 
ascending order,
+and duplicated column entries for the same row are not allowed.
+
+## Array Creation
+
+There are a few different ways to create a `CSRNDArray`.
+
+* We can create a CSRNDArray with data, indices and indptr by using the 
`csr_matrix` function:
+
+```python
+import mxnet as mx
+import numpy as np
+# create a CSRNDArray with python lists
+shape = (3, 4)
+data_list = [7, 8, 9]
+indptr_list = [0, 2, 2, 3]
+indices_list = [0, 2, 1]
+a = mx.nd.sparse.csr_matrix(data_list, indptr_list, indices_list, shape)
+# create a CSRNDArray with numpy arrays
+data_np = np.array([7, 8, 9])
+indptr_np = np.array([0, 2, 2, 3])
+indices_np = np.array([0, 2, 1])
+b = mx.nd.sparse.csr_matrix(data_np, indptr_np, indices_np, shape)
+{'a':a, 'b':b}
+```
+
+* We can also create an MXNet CSRNDArray from a `scipy.sparse.csr.csr_matrix` 
object by using the `array` function:
+
+```python
+try:
+import scipy.sparse as spsp
+# generate a csr matrix in scipy
+c = spsp.csr.csr_matrix((data_np, indices_np, indptr_np), shape=shape)
+# create a CSRNDArray from a scipy csr object
+d = mx.nd.sparse.array(c)
+{'d':d}
+except ImportError:
 
 Review comment:
   Somehow in the rendered text, there is a newline between try and except and 
that causes invalid syntax when I cut-paste the text. Please check.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
bhavinthaker commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135958450
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,258 @@
+# CSRNDArray - NDArray in Compressed Sparse Row Storage Format
+
+Many real world datasets deal with high dimensional sparse feature vectors. 
For instance,
+in a recommendation system, the number of categories and users is in the order 
of millions,
+while most users only made a few purchases, leading to feature vectors with 
high sparsity
+(i.e. most of the elements are zeros).
+
+Storing and manipulating such large sparse matrices in the default dense 
structure results
+in wasted memory and processing on the zeros.
+To take advantage of the sparse structure of the matrix, the ``CSRNDArray`` in 
MXNet
+stores the matrix in [compressed sparse 
row(CSR)](https://en.wikipedia.org/wiki/Sparse_matrix#Compressed_sparse_row_.28CSR.2C_CRS_or_Yale_format.29)
 format
+and uses specialized algorithms in operators.
+The format is designed for 2D matrices with a large number of columns,
+and each row is sparse(i.e. with only a few nonzeros).
+For matrices of high sparsity (e.g. ~1% non-zeros), the advantage of 
``CSRNDArray`` over
+the existing ``NDArray`` is that
+
+- memory consumption is reduced significantly
+- certain operations (e.g. matrix-vector multiplication) are much faster
+
+Meanwhile, ``CSRNDArray`` inherits competitive features from ``NDArray`` such 
as
+lazy evaluation and automatic parallelization, which are not available in the
+scientific computing python package [SciPy](https://www.scipy.org/).
+
+Apart from often queried attributes such as **ndarray.shape**, 
**ndarray.dtype** and **ndarray.context**,
+you'll also want to query **ndarray.stype**: the storage type of the NDArray. 
For a usual dense NDArray,
+the value of stype is **"default"**. For a CSRNDArray, the value of stype is 
**"csr"**.
+
+## Prerequisites
+
+To complete this tutorial, we need:
+
+- MXNet. See the instructions for your operating system in [Setup and 
Installation](http://mxnet.io/get_started/install.html)
+- [Jupyter](http://jupyter.org/)
+```
+pip install jupyter
+```
+- SciPy - A section of this tutorial uses the SciPy package in Python. If you 
don't have SciPy,
+the example in that section will be ignored.
+- GPUs - A section of this tutorial uses GPUs. If you don't have GPUs on your
+machine, simply set the variable gpu_device (set in the GPUs section of this
+tutorial) to mx.cpu().
+
+## Compressed Sparse Row Format
+
+A CSRNDArray represents a 2D matrix as three separate 1D arrays: **data**,
+**indptr** and **indices**, where the column indices for
+row ``i`` are stored in ``indices[indptr[i]:indptr[i+1]]`` in ascending order,
+and their corresponding values are stored in ``data[indptr[i]:indptr[i+1]]``.
+
+For example, the CSR representation for matrix
+```
+[[7, 0, 8, 0]
+ [0, 0, 0, 0]
+ [0, 9, 0, 0]]
+```
+is:
+```
+[7, 8, 9]  # data
+[0, 2, 1]  # indices
+[0, 2, 2, 3]   # indptr
+```
+
+Note that in MXNet, the column indices for a given row are always sorted in 
ascending order,
+and duplicated column entries for the same row are not allowed.
+
+## Array Creation
+
+There are a few different ways to create a `CSRNDArray`.
+
+* We can create a CSRNDArray with data, indices and indptr by using the 
`csr_matrix` function:
+
+```python
+import mxnet as mx
+import numpy as np
+# create a CSRNDArray with python lists
+shape = (3, 4)
+data_list = [7, 8, 9]
+indptr_list = [0, 2, 2, 3]
+indices_list = [0, 2, 1]
+a = mx.nd.sparse.csr_matrix(data_list, indptr_list, indices_list, shape)
+# create a CSRNDArray with numpy arrays
+data_np = np.array([7, 8, 9])
+indptr_np = np.array([0, 2, 2, 3])
+indices_np = np.array([0, 2, 1])
+b = mx.nd.sparse.csr_matrix(data_np, indptr_np, indices_np, shape)
+{'a':a, 'b':b}
+```
+
+* We can also create an MXNet CSRNDArray from a `scipy.sparse.csr.csr_matrix` 
object by using the `array` function:
+
+```python
+try:
+import scipy.sparse as spsp
+# generate a csr matrix in scipy
+c = spsp.csr.csr_matrix((data_np, indices_np, indptr_np), shape=shape)
+# create a CSRNDArray from a scipy csr object
+d = mx.nd.sparse.array(c)
+{'d':d}
+except ImportError:
+print("scipy package is required")
+```
+
+We can specify the element data type with the option `dtype`, which accepts a 
numpy
+type. By default, `float32` is used.
+
+```python
+# float32 is used by default
+e = mx.nd.sparse.array(a)
+# create a 16-bit float array
+f = mx.nd.array(a, dtype=np.float16)
+(e.dtype, f.dtype)
+```
+
+## Inspecting Arrays
+
+* We can inspect the contents of a `CSRNDArray` by converting it
+to a dense `numpy.ndarray` using the `asnumpy` function.
+
+```python
+a.asnumpy()
+```
+
+* We can also inspect the internal storage of a CSRNDArray by accessing 
attributes such as `indptr`, `indices` and `data`:
+

[GitHub] Godricly commented on issue #7606: Slicing mx.nd.array with positive start and neg end along same axis

2017-08-29 Thread git
Godricly commented on issue #7606: Slicing mx.nd.array with positive start and 
neg end along same axis
URL: 
https://github.com/apache/incubator-mxnet/issues/7606#issuecomment-325861301
 
 
   fixed in #7609.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Godricly closed issue #7606: Slicing mx.nd.array with positive start and neg end along same axis

2017-08-29 Thread git
Godricly closed issue #7606: Slicing mx.nd.array with positive start and neg end 
along same axis
URL: https://github.com/apache/incubator-mxnet/issues/7606
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Godricly commented on a change in pull request #7609: Update ndarray.py

2017-08-29 Thread git
Godricly commented on a change in pull request #7609: Update ndarray.py
URL: https://github.com/apache/incubator-mxnet/pull/7609#discussion_r135957328
 
 

 ##
 File path: python/mxnet/ndarray/ndarray.py
 ##
 @@ -587,8 +587,25 @@ def _slice(self, start, stop):
 array([], shape=(0, 2), dtype=float32)
 """
 handle = NDArrayHandle()
-start = mx_uint(start) if start else mx_uint(0)
-stop = mx_uint(stop) if stop else mx_uint(self.shape[0])
+if start is None:
+start = mx_uint(0)
+elif start < 0:
+length = self.shape[0]
+start += length
+assert start >= 0, "Slicing end %d exceeds limit of %d" % (start-length, length)
 
 Review comment:
   Slicing end -> slicing start
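  
  For illustration, a standalone sketch of the normalization this diff performs; `normalize_start` is a hypothetical helper (not in the patch), and the assertion message says "start" as suggested above:

```python
def normalize_start(start, length):
    # None means "from the beginning"; a negative value counts back from the end
    if start is None:
        return 0
    if start < 0:
        start += length
        assert start >= 0, "Slicing start %d exceeds limit of %d" % (start - length, length)
    return start

print(normalize_start(None, 5))  # 0
print(normalize_start(-2, 5))    # 3
```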
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] bfgray3 opened a new pull request #7663: simplifying R package for efficiency and robustness

2017-08-29 Thread git
bfgray3 opened a new pull request #7663: simplifying R package for efficiency 
and robustness
URL: https://github.com/apache/incubator-mxnet/pull/7663
 
 
   * avoid `for` loops whenever possible
   * do not attach `methods`; instead prefix functions from this package with 
`methods::`
   * prefer `seq_len` and `seq_along` over `1:n` or `1:length(x)` 
   * other small changes
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] leonid-pishchulin commented on issue #7393: add depthwise convolution's gpu version optimization

2017-08-29 Thread git
leonid-pishchulin commented on issue #7393: add depthwise convolution's gpu 
version optimization
URL: https://github.com/apache/incubator-mxnet/pull/7393#issuecomment-325860182
 
 
   I double-checked: DepthwiseConvolutionOp is called when num_group=num_filter 
for both mobilenet and resnet. Have you ever measured the speed-ups when 
running resnet with depth-separable convolutions? I get no speed-up; the 
performance is even a bit slower when setting num_group=num_filter.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] 7oud commented on issue #7393: add depthwise convolution's gpu version optimization

2017-08-29 Thread git
7oud commented on issue #7393: add depthwise convolution's gpu version 
optimization
URL: https://github.com/apache/incubator-mxnet/pull/7393#issuecomment-325856944
 
 
   This PR (#7393) is just the optimized conv without cudnn. 
   The MobileNet paper gives a computation cost of 569 million Multi-Adds; could 
you give the approximate number when using num_group=1?
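   
   For reference, the usual back-of-the-envelope Multi-Adds formulas can be sketched as follows; the layer sizes below are illustrative and not taken from the paper or this PR:

```python
def standard_conv_madds(h, w, c_in, c_out, k):
    # k x k standard convolution over an h x w output feature map
    return h * w * c_in * c_out * k * k

def depthwise_separable_madds(h, w, c_in, c_out, k):
    # k x k depthwise convolution plus a 1x1 pointwise convolution
    return h * w * c_in * k * k + h * w * c_in * c_out

h = w = 14; c_in = c_out = 512; k = 3
print(standard_conv_madds(h, w, c_in, c_out, k) / 1e6)        # ~462.4 million
print(depthwise_separable_madds(h, w, c_in, c_out, k) / 1e6)  # ~52.3 million
```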
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] chinakook commented on issue #7613: 1x1 convolution acceleration

2017-08-29 Thread git
chinakook commented on issue #7613: 1x1 convolution acceleration
URL: https://github.com/apache/incubator-mxnet/pull/7613#issuecomment-325856965
 
 
   Is this rebased? I find it a little confusing.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] leonid-pishchulin commented on issue #7393: add depthwise convolution's gpu version optimization

2017-08-29 Thread git
leonid-pishchulin commented on issue #7393: add depthwise convolution's gpu 
version optimization
URL: https://github.com/apache/incubator-mxnet/pull/7393#issuecomment-325855010
 
 
   What is the best way of measuring GFLOPS? 
   How do I call the optimized conv without cudnn?
   BTW, I'm testing with cudnn 6 and cuda 8.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] 7oud commented on issue #7393: add depthwise convolution's gpu version optimization

2017-08-29 Thread git
7oud commented on issue #7393: add depthwise convolution's gpu version 
optimization
URL: https://github.com/apache/incubator-mxnet/pull/7393#issuecomment-325853945
 
 
   @leonid-pishchulin Thanks for your experimental data! There are now three 
implementations of depthwise conv:
   - cudnn v6: 7.5 ms/fr, batchsize=20
   - optimized conv without cudnn: 2 ms/fr, batchsize=20
   - cudnn v7 (grouped conv): ?
   It seems that your test speeds, 7.5 ms and 10.9 ms, are slower; is that related 
to your image size of 640x480?
   BTW, could you give a comparison of the computational costs (GFLOPS) between 
the 7.5 ms/fr and 10.9 ms/fr runs in your test? 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] CodingCat commented on issue #7571: [scala-package][spark] Resources running PS (role = server) should be explicit to Spark

2017-08-29 Thread git
CodingCat commented on issue #7571: [scala-package][spark] Resources running PS 
(role = server) should be explicit to Spark
URL: https://github.com/apache/incubator-mxnet/pull/7571#issuecomment-325852929
 
 
   @javelinjs  ping
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] cjolivier01 commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
cjolivier01 commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135951071
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -166,7 +175,7 @@ a.copyto(d)
 {'b is a': b is a, 'b.asnumpy()':b.asnumpy(), 'c.asnumpy()':c.asnumpy(), 
'd.asnumpy()':d.asnumpy()}
 ```
 
-If the storage types of source array and destination array doesn't match,
+* If the storage types of source array and destination array doesn't match,
 
 Review comment:
   Type...doesn't match
   Or
   Types...don't match
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] anirudh2290 opened a new pull request #7662: WIP: Isolated benchmarking support

2017-08-29 Thread git
anirudh2290 opened a new pull request #7662: WIP: Isolated benchmarking support
URL: https://github.com/apache/incubator-mxnet/pull/7662
 
 
   Allows for benchmarking only IO or compute costs using the --io-only and 
--compute-only flags, and benchmarking communication costs using 
--communication-only.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] astonzhang commented on a change in pull request #7659: Gluon trainer updates: add learning_rate and lr_scheduler properties and add setter for learning rate

2017-08-29 Thread git
astonzhang commented on a change in pull request #7659: Gluon trainer updates: 
add learning_rate and lr_scheduler properties and add setter for learning rate
URL: https://github.com/apache/incubator-mxnet/pull/7659#discussion_r135949182
 
 

 ##
 File path: python/mxnet/optimizer.py
 ##
 @@ -191,6 +204,24 @@ def update(self, index, weight, grad, state):
 """
 raise NotImplementedError()
 
+def set_learning_rate(self, lr):
 
 Review comment:
   resolved
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] astonzhang commented on a change in pull request #7659: Gluon trainer updates: add learning_rate and lr_scheduler properties and add setter for learning rate

2017-08-29 Thread git
astonzhang commented on a change in pull request #7659: Gluon trainer updates: 
add learning_rate and lr_scheduler properties and add setter for learning rate
URL: https://github.com/apache/incubator-mxnet/pull/7659#discussion_r135949173
 
 

 ##
 File path: python/mxnet/gluon/trainer.py
 ##
 @@ -113,6 +119,34 @@ def _init_kvstore(self):
 
 self._kv_initialized = True
 
+
+@property
+def learning_rate(self):
+if not isinstance(self._optimizer, opt.Optimizer):
+raise UserWarning("Optimizer has to be defined before its learning"
+  "rate can be accessed.")
+else:
+return self._optimizer.learning_rate
+
+
+def set_learning_rate(self, lr):
+"""Mutate the learning rate.
+
+Mutate the learning rate of the optimizer only if the LRScheduler of
+the optimizer is undefined.
+
+Parameters
+--
+lr : float
+The new learning rate of the optimizer.
+"""
+if not isinstance(self._optimizer, opt.Optimizer):
+raise UserWarning("Optimizer has to be defined before its learning"
+  "rate is mutated.")
+else:
+self._optimizer.set_learning_rate(lr)
 
 Review comment:
   resolved
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] astonzhang commented on a change in pull request #7659: Gluon trainer updates: add learning_rate and lr_scheduler properties and add setter for learning rate

2017-08-29 Thread git
astonzhang commented on a change in pull request #7659: Gluon trainer updates: 
add learning_rate and lr_scheduler properties and add setter for learning rate
URL: https://github.com/apache/incubator-mxnet/pull/7659#discussion_r135949138
 
 

 ##
 File path: python/mxnet/optimizer.py
 ##
 @@ -191,6 +204,24 @@ def update(self, index, weight, grad, state):
 """
 raise NotImplementedError()
 
+def set_learning_rate(self, lr):
 
 Review comment:
   resolved
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on a change in pull request #7659: Gluon trainer updates: add learning_rate and lr_scheduler properties and add setter for learning rate

2017-08-29 Thread git
szha commented on a change in pull request #7659: Gluon trainer updates: add 
learning_rate and lr_scheduler properties and add setter for learning rate
URL: https://github.com/apache/incubator-mxnet/pull/7659#discussion_r135947929
 
 

 ##
 File path: python/mxnet/gluon/trainer.py
 ##
 @@ -113,6 +119,34 @@ def _init_kvstore(self):
 
 self._kv_initialized = True
 
+
+@property
+def learning_rate(self):
+if not isinstance(self._optimizer, opt.Optimizer):
+raise UserWarning("Optimizer has to be defined before its learning"
+  "rate can be accessed.")
+else:
+return self._optimizer.learning_rate
+
+
+def set_learning_rate(self, lr):
+"""Mutate the learning rate.
+
+Mutate the learning rate of the optimizer only if the LRScheduler of
+the optimizer is undefined.
+
+Parameters
+--
+lr : float
+The new learning rate of the optimizer.
+"""
+if not isinstance(self._optimizer, opt.Optimizer):
+raise UserWarning("Optimizer has to be defined before its learning"
+  "rate is mutated.")
+else:
+self._optimizer.set_learning_rate(lr)
 
 Review comment:
   Update according to the comment below
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on a change in pull request #7659: Gluon trainer updates: add learning_rate and lr_scheduler properties and add setter for learning rate

2017-08-29 Thread git
szha commented on a change in pull request #7659: Gluon trainer updates: add 
learning_rate and lr_scheduler properties and add setter for learning rate
URL: https://github.com/apache/incubator-mxnet/pull/7659#discussion_r135947837
 
 

 ##
 File path: python/mxnet/optimizer.py
 ##
 @@ -191,6 +204,24 @@ def update(self, index, weight, grad, state):
 """
 raise NotImplementedError()
 
+def set_learning_rate(self, lr):
 
 Review comment:
   ```
   @learning_rate.setter
   def set_learning_rate
   ```
   and then you can do
   ```
   optimizer.learning_rate = 0.5
   ```
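   
   A self-contained sketch of the suggested pattern (class and attribute names are illustrative); note that for `optimizer.learning_rate = 0.5` to dispatch to the setter, the decorated function must reuse the property's name:

```python
class Optimizer(object):
    def __init__(self, learning_rate=0.01):
        self._lr = learning_rate

    @property
    def learning_rate(self):
        return self._lr

    @learning_rate.setter
    def learning_rate(self, lr):
        # plain attribute assignment now routes through this setter
        self._lr = lr

optimizer = Optimizer()
optimizer.learning_rate = 0.5
print(optimizer.learning_rate)  # 0.5
```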
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] astonzhang commented on a change in pull request #7659: Gluon trainer updates: add learning_rate and lr_scheduler properties and add setter for learning rate

2017-08-29 Thread git
astonzhang commented on a change in pull request #7659: Gluon trainer updates: 
add learning_rate and lr_scheduler properties and add setter for learning rate
URL: https://github.com/apache/incubator-mxnet/pull/7659#discussion_r135946133
 
 

 ##
 File path: python/mxnet/gluon/trainer.py
 ##
 @@ -113,6 +113,36 @@ def _init_kvstore(self):
 
 self._kv_initialized = True
 
+
+@property
+def learning_rate(self):
+return self._optimizer.lr
 
 Review comment:
   resolved
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] astonzhang commented on a change in pull request #7659: Gluon trainer updates: add learning_rate and lr_scheduler properties and add setter for learning rate

2017-08-29 Thread git
astonzhang commented on a change in pull request #7659: Gluon trainer updates: 
add learning_rate and lr_scheduler properties and add setter for learning rate
URL: https://github.com/apache/incubator-mxnet/pull/7659#discussion_r135946140
 
 

 ##
 File path: python/mxnet/gluon/trainer.py
 ##
 @@ -113,6 +113,36 @@ def _init_kvstore(self):
 
 self._kv_initialized = True
 
+
+@property
+def learning_rate(self):
+return self._optimizer.lr
+
+
+@property
+def lr_scheduler(self):
+return self._optimizer.lr_scheduler
+
+
+def set_learning_rate(self, lr):
+"""Mutate the learning rate.
+
+Parameters
+--
+lr : float
+The new learning rate.
+"""
+if not isinstance(self._optimizer, opt.Optimizer):
+raise UserWarning("Optimizer has to be defined before its learning"
+  "rate is mutated.")
+elif self._optimizer.lr_scheduler is not None:
+raise UserWarning("set_learning_rate mutates the value of the"
+  "learning rate only when the LRScheduler of"
+  "the optimizer is undefined.")
+else:
+self._optimizer.lr = lr
 
 Review comment:
   resolved
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] astonzhang commented on a change in pull request #7659: Gluon trainer updates: add learning_rate and lr_scheduler properties and add setter for learning rate

2017-08-29 Thread git
astonzhang commented on a change in pull request #7659: Gluon trainer updates: 
add learning_rate and lr_scheduler properties and add setter for learning rate
URL: https://github.com/apache/incubator-mxnet/pull/7659#discussion_r135946081
 
 

 ##
 File path: python/mxnet/gluon/trainer.py
 ##
 @@ -113,6 +113,36 @@ def _init_kvstore(self):
 
 self._kv_initialized = True
 
+
+@property
+def learning_rate(self):
 
 Review comment:
   resolved
 



[GitHub] astonzhang commented on issue #7659: Gluon trainer updates: add learning_rate and lr_scheduler properties and add setter for learning rate

2017-08-29 Thread git
astonzhang commented on issue #7659: Gluon trainer updates: add learning_rate 
and lr_scheduler properties and add setter for learning rate
URL: https://github.com/apache/incubator-mxnet/pull/7659#issuecomment-325842915
 
 
   Dudes, thank you for your comments. Code is updated. Let me know if you spot 
any further issues.
 



[incubator-mxnet] branch master updated: update mklml and mkl mac support (#7587)

2017-08-29 Thread muli
This is an automated email from the ASF dual-hosted git repository.

muli pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new 546b917  update mklml and mkl mac support (#7587)
546b917 is described below

commit 546b917e3fd2d1d387fcb130f63ada0b46231ce5
Author: Sheng Zha 
AuthorDate: Tue Aug 29 17:14:01 2017 -0700

update mklml and mkl mac support (#7587)
---
 Makefile| 46 +++--
 dmlc-core   |  2 +-
 mshadow |  2 +-
 prepare_mkl.sh  | 26 -
 tests/ci_build/Dockerfile.mklml_gpu |  2 +-
 5 files changed, 52 insertions(+), 26 deletions(-)

diff --git a/Makefile b/Makefile
index 300f901..b42621e 100644
--- a/Makefile
+++ b/Makefile
@@ -1,5 +1,11 @@
 ROOTDIR = $(CURDIR)
 
+ifeq ($(OS),Windows_NT)
+   UNAME_S := Windows
+else
+   UNAME_S := $(shell uname -s)
+endif
+
 ifndef config
 ifdef CXXNET_CONFIG
config = $(CXXNET_CONFIG)
@@ -74,7 +80,7 @@ endif
 
 # Caffe Plugin
 ifdef CAFFE_PATH
-  CFLAGS += -DMXNET_USE_CAFFE=1
+   CFLAGS += -DMXNET_USE_CAFFE=1
 endif
 
 ifndef LINT_LANG
@@ -91,7 +97,9 @@ else
 endif
 
 ifeq ($(USE_OPENMP), 1)
-   CFLAGS += -fopenmp
+   ifneq ($(UNAME_S), Darwin)
+   CFLAGS += -fopenmp
+   endif
 endif
 
 ifeq ($(USE_NNPACK), 1)
@@ -105,11 +113,17 @@ ifeq ($(USE_MKL2017), 1)
CFLAGS += -I$(ROOTDIR)/src/operator/mkl/
CFLAGS += -I$(MKLML_ROOT)/include
LDFLAGS += -L$(MKLML_ROOT)/lib
-ifeq ($(USE_MKL2017_EXPERIMENTAL), 1)
-   CFLAGS += -DMKL_EXPERIMENTAL=1
-else
-   CFLAGS += -DMKL_EXPERIMENTAL=0
-endif
+   ifeq ($(USE_MKL2017_EXPERIMENTAL), 1)
+   CFLAGS += -DMKL_EXPERIMENTAL=1
+   else
+   CFLAGS += -DMKL_EXPERIMENTAL=0
+   endif
+   ifeq ($(UNAME_S), Darwin)
+   LDFLAGS += -lmklml
+   else
+   LDFLAGS += -Wl,--as-needed -lmklml_intel -lmklml_gnu
+   endif
+   LDFLAGS +=  -liomp5
 endif
 
 # verify existence of separate lapack library when using blas/openblas/atlas
@@ -180,8 +194,8 @@ ifeq ($(CUDA_ARCH),)
# Run nvcc on a zero-length file to check architecture-level support.
# Create args to include SASS in the fat binary for supported levels.
CUDA_ARCH := $(foreach arch,$(KNOWN_CUDA_ARCHS), \
-  $(shell $(NVCC) -arch=sm_$(arch) -E --x cu /dev/null 
>/dev/null 2>&1 && \
-  echo -gencode arch=compute_$(arch),code=sm_$(arch)))
+   $(shell $(NVCC) -arch=sm_$(arch) -E --x cu 
/dev/null >/dev/null 2>&1 && \
+   echo -gencode 
arch=compute_$(arch),code=sm_$(arch)))
# Convert a trailing "code=sm_NN" to "code=[sm_NN,compute_NN]" to also
# include the PTX of the most recent arch in the fat-binaries for
# forward compatibility with newer GPUs.
@@ -189,7 +203,7 @@ ifeq ($(CUDA_ARCH),)
# Add fat binary compression if supported by nvcc.
COMPRESS := --fatbin-options -compress-all
CUDA_ARCH += $(shell $(NVCC) -cuda $(COMPRESS) --x cu /dev/null -o 
/dev/null >/dev/null 2>&1 && \
-echo $(COMPRESS))
+echo $(COMPRESS))
 endif
 endif
 
@@ -231,20 +245,18 @@ PLUGIN_OBJ =
 PLUGIN_CUOBJ =
 include $(MXNET_PLUGINS)
 
-# scala package profile
-ifeq ($(OS),Windows_NT)
+ifeq ($(UNAME_S), Windows)
# TODO(yizhi) currently scala package does not support windows
SCALA_PKG_PROFILE := windows
 else
-   UNAME_S := $(shell uname -s)
ifeq ($(UNAME_S), Darwin)
WHOLE_ARCH= -all_load
NO_WHOLE_ARCH= -noall_load
SCALA_PKG_PROFILE := osx-x86_64
else
-   SCALA_PKG_PROFILE := linux-x86_64
WHOLE_ARCH= --whole-archive
NO_WHOLE_ARCH= --no-whole-archive
+   SCALA_PKG_PROFILE := linux-x86_64
endif
 endif
 
@@ -307,9 +319,9 @@ lib/libmxnet.a: $(ALLX_DEP)
ar crv $@ $(filter %.o, $?)
 
 lib/libmxnet.so: $(ALLX_DEP)
-@mkdir -p $(@D)
-$(CXX) $(CFLAGS) -shared -o $@ $(filter-out %libnnvm.a, $(filter %.o 
%.a, $^)) $(LDFLAGS) \
--Wl,${WHOLE_ARCH} $(filter %libnnvm.a, $^) -Wl,${NO_WHOLE_ARCH}
+   @mkdir -p $(@D)
+   $(CXX) $(CFLAGS) -shared -o $@ $(filter-out %libnnvm.a, $(filter %.o 
%.a, $^)) $(LDFLAGS) \
+   -Wl,${WHOLE_ARCH} $(filter %libnnvm.a, $^) -Wl,${NO_WHOLE_ARCH}
 
 $(PS_PATH)/build/libps.a: PSLITE
 
diff --git a/dmlc-core b/dmlc-core
index e880afe..a527100 160000
--- a/dmlc-core
+++ b/dmlc-core
@@ -1 +1 @@
-Subproject commit e880afeb932d746e55eb92e8c6eb3ff1b3697c48
+Subproject commit a527100d7d5001efc4954848a2fc6027e48c05f4
diff 

[GitHub] mli closed pull request #7587: update mklml and mkl mac support

2017-08-29 Thread git
mli closed pull request #7587: update mklml and mkl mac support
URL: https://github.com/apache/incubator-mxnet/pull/7587
 
 
   
 



[GitHub] mli opened a new pull request #7661: Use a big ndarray in gluon/data/vision

2017-08-29 Thread git
mli opened a new pull request #7661: Use a big ndarray in gluon/data/vision
URL: https://github.com/apache/incubator-mxnet/pull/7661
 
 
   use `nd.array(data)` instead of a list of ndarrays to speed up data loading
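   For illustration, a sketch of the difference (hypothetical shapes, not the PR's actual code):
   ```
   import numpy as np
   import mxnet as mx

   raw = np.random.uniform(size=(1000, 28, 28)).astype('float32')

   # slow: one small NDArray per record means many tiny copies
   as_list = [mx.nd.array(x) for x in raw]

   # fast: a single big NDArray, sliced per record when needed
   as_big = mx.nd.array(raw)
   record = as_big[0]
   ```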
 



[GitHub] piiswrong commented on a change in pull request #7659: Gluon trainer updates: add learning_rate and lr_scheduler properties and add setter for learning rate

2017-08-29 Thread git
piiswrong commented on a change in pull request #7659: Gluon trainer updates: 
add learning_rate and lr_scheduler properties and add setter for learning rate
URL: https://github.com/apache/incubator-mxnet/pull/7659#discussion_r135928492
 
 

 ##
 File path: python/mxnet/gluon/trainer.py
 ##
 @@ -113,6 +113,36 @@ def _init_kvstore(self):
 
 self._kv_initialized = True
 
+
+@property
+def learning_rate(self):
 
 Review comment:
   document this as 
   ```
   Properties
   ----------
   ```
   
   in init doc
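   A hedged sketch of what that could look like in the `__init__` docstring (hypothetical wording):
   ```
   class Trainer(object):
       def __init__(self, params, optimizer):
           """Applies an Optimizer on a set of Parameters.

           Parameters
           ----------
           optimizer : str or Optimizer
               The optimizer to use.

           Properties
           ----------
           learning_rate : float
               The current learning rate of the underlying optimizer.
           """
           self._optimizer = optimizer
   ```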
 



[GitHub] szha commented on a change in pull request #7659: Gluon trainer updates: add learning_rate and lr_scheduler properties and add setter for learning rate

2017-08-29 Thread git
szha commented on a change in pull request #7659: Gluon trainer updates: add 
learning_rate and lr_scheduler properties and add setter for learning rate
URL: https://github.com/apache/incubator-mxnet/pull/7659#discussion_r135936970
 
 

 ##
 File path: python/mxnet/gluon/trainer.py
 ##
 @@ -113,6 +113,36 @@ def _init_kvstore(self):
 
 self._kv_initialized = True
 
+
+@property
+def learning_rate(self):
+return self._optimizer.lr
+
+
+@property
+def lr_scheduler(self):
+return self._optimizer.lr_scheduler
+
+
+def set_learning_rate(self, lr):
+"""Mutate the learning rate.
+
+Parameters
+--
+lr : float
+The new learning rate.
+"""
+if not isinstance(self._optimizer, opt.Optimizer):
+raise UserWarning("Optimizer has to be defined before its learning"
+  "rate is mutated.")
+elif self._optimizer.lr_scheduler is not None:
+raise UserWarning("set_learning_rate mutates the value of the"
+  "learning rate only when the LRScheduler of"
+  "the optimizer is undefined.")
+else:
+self._optimizer.lr = lr
 
 Review comment:
   Same as the accessor comment. Try to make setting the learning rate the concern 
of the optimizer.
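   One possible reading of that suggestion, as a minimal hypothetical sketch (not the PR's final code):
   ```
   class Optimizer(object):
       def __init__(self, learning_rate=0.01, lr_scheduler=None):
           self.lr = learning_rate
           self.lr_scheduler = lr_scheduler

       def set_learning_rate(self, lr):
           # validation lives with the optimizer, which owns the policy
           if self.lr_scheduler is not None:
               raise UserWarning("learning rate cannot be set directly while "
                                 "an LRScheduler is in use")
           self.lr = lr

   class Trainer(object):
       def __init__(self, optimizer):
           self._optimizer = optimizer

       def set_learning_rate(self, lr):
           # the trainer stays thin and simply delegates
           self._optimizer.set_learning_rate(lr)
   ```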
 



[GitHub] szha opened a new pull request #7660: fix ctc on softmax grad and req option

2017-08-29 Thread git
szha opened a new pull request #7660: fix ctc on softmax grad and req option
URL: https://github.com/apache/incubator-mxnet/pull/7660
 
 
   
 



[GitHub] eric-haibin-lin commented on issue #7638: CSRNDArray from/to scipy csr_matrix; fix rand_shape_nd

2017-08-29 Thread git
eric-haibin-lin commented on issue #7638: CSRNDArray from/to scipy csr_matrix; 
fix rand_shape_nd
URL: https://github.com/apache/incubator-mxnet/pull/7638#issuecomment-325827738
 
 
   Added a better error message for mx.nd.sparse.array. 
 



[GitHub] leonid-pishchulin commented on issue #7393: add depthwise convolution's gpu version optimization

2017-08-29 Thread git
leonid-pishchulin commented on issue #7393: add depthwise convolution's gpu 
version optimization
URL: https://github.com/apache/incubator-mxnet/pull/7393#issuecomment-325824871
 
 
   I ran mobilenet.py with num_group=num_filter and with num_group=1 on a GTX 
1080ti on 1024 640x480 images with batch size 8 (128 batches in total) and computed 
the average run-time. Using num_group=num_filter achieves 7.5 ms/fr vs. 10.9 ms/fr 
for num_group=1. Great!
   However, a version of ResNet-18 with grouped convolutions runs at 5.3 ms/fr 
vs. 3.7 ms/fr for the original convolutions with num_group=1. Why is that?
   I used the implementation of ResNet-18 from resnet.py as a baseline and modified 
it in the following ways:
   1) I added 1x1 projections at the beginning of each of the residual blocks 
2a, 3a, 4a and 5a, prior to the first 3x3 convolutional layer, to make the number 
of input channels and filters equal within each 3x3 block.
   2) I set num_group = num_filter for each 3x3 convolutional layer starting 
from 2a.
   To evaluate the effect of the additional 1x1 projections on the total run-time, 
I measured the run-time of the version with the projections but with num_group=1; 
the differences are negligible compared to using no 1x1 projections.
   Any ideas?
   
   Is there something specific about MobileNet that allows for a speed-up when using 
depth-wise factorized convolutions, which does not hold for ResNet-18? 
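   For reference, a minimal sketch of the kind of depthwise (grouped) convolution being compared here, with hypothetical shapes:
   ```
   import mxnet as mx

   data = mx.sym.Variable('data')  # e.g. shape (batch, 64, 56, 56)

   # regular 3x3 convolution: every filter sees all 64 input channels
   dense_conv = mx.sym.Convolution(data=data, num_filter=64,
                                   kernel=(3, 3), pad=(1, 1), num_group=1)

   # depthwise 3x3 convolution: num_group == num_filter, so each filter
   # sees exactly one input channel, cutting the FLOPs per output channel
   depthwise_conv = mx.sym.Convolution(data=data, num_filter=64,
                                       kernel=(3, 3), pad=(1, 1), num_group=64)
   ```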
 



[GitHub] piiswrong closed pull request #7627: Change namespace and make logging functionality changes

2017-08-29 Thread git
piiswrong closed pull request #7627: Change namespace and make logging 
functionality changes
URL: https://github.com/apache/incubator-mxnet/pull/7627
 
 
   
 



[incubator-mxnet] branch master updated: Change namespace and make logging functionality changes (#7627)

2017-08-29 Thread jxie
This is an automated email from the ASF dual-hosted git repository.

jxie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new cb432a7  Change namespace and make logging functionality changes 
(#7627)
cb432a7 is described below

commit cb432a706146e6a2f8d84ca6de280ef74123
Author: Anirudh Subramanian 
AuthorDate: Tue Aug 29 15:36:13 2017 -0700

Change namespace and make logging functionality changes (#7627)

* Change namespace and make logging functionality changes

* Help comment changes
---
 benchmark/python/sparse/dot.py|  4 ++--
 benchmark/python/sparse/sparse_end2end.py | 17 -
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/benchmark/python/sparse/dot.py b/benchmark/python/sparse/dot.py
index aab34db..145d05d 100644
--- a/benchmark/python/sparse/dot.py
+++ b/benchmark/python/sparse/dot.py
@@ -172,7 +172,7 @@ def _compare_sparse_dense(data_dir, file_name, 
mini_file_name, feature_dim,
 for _ in train_iter:
 csr_data = train_iter.getdata()
 dns_data = csr_data.tostype('default')
-cost_sparse = measure_cost(num_repeat, False, False, mx.nd.dot, 
csr_data, weight, transpose_a=transpose)
+cost_sparse = measure_cost(num_repeat, False, False, 
mx.nd.sparse.dot, csr_data, weight, transpose_a=transpose)
 cost_dense = measure_cost(num_repeat, False, False, mx.nd.dot, 
dns_data, weight, transpose_a=transpose)
 total_cost["sparse"] += cost_sparse
 total_cost["dense"] += cost_dense
@@ -270,7 +270,7 @@ def test_dot_synthetic(data_dict):
 set_default_context(ctx)
 assert fw == "mxnet" or fw == "scipy"
 # Set funcs
-dot_func_sparse = mx.nd.dot if fw == "mxnet" else sp.spmatrix.dot
+dot_func_sparse = mx.nd.sparse.dot if fw == "mxnet" else 
sp.spmatrix.dot
 dot_func_dense = mx.nd.dot if fw == "mxnet" else np.dot
 # Create matrix instances
 lhs_nd = rand_ndarray(lhs_shape, lhs_stype, density=lhs_den, 
distribution=distribution)
diff --git a/benchmark/python/sparse/sparse_end2end.py 
b/benchmark/python/sparse/sparse_end2end.py
index e9d8bf8..717857a 100644
--- a/benchmark/python/sparse/sparse_end2end.py
+++ b/benchmark/python/sparse/sparse_end2end.py
@@ -35,8 +35,8 @@ parser.add_argument('--dummy-iter', type=int, default=0,
 help='whether to use dummy iterator to exclude io cost')
 parser.add_argument('--kvstore', type=str, default='local',
 help='what kvstore to use [local, dist_sync, etc]')
-parser.add_argument('--log-level', type=str, default='debug',
-help='logging level [debug, info, error]')
+parser.add_argument('--sparse-log-level', type=str, default='INFO',
+help='logging level [DEBUG, INFO, ERROR]')
 parser.add_argument('--dataset', type=str, default='avazu',
 help='what test dataset to use')
 parser.add_argument('--num-gpu', type=int, default=0,
@@ -46,6 +46,8 @@ parser.add_argument('--output-dim', type=int, default=4,
 help='number of columns of the forward output')
 parser.add_argument('--dummy-metric', type=int, default=0,
 help='whether to call update_metric')
+parser.add_argument('--enable-logging-for', default="0",
+help="Enable logging for the specified list of workers")
 
 
 def get_libsvm_data(data_dir, data_name, url, data_origin_name):
@@ -101,7 +103,7 @@ def get_sym(feature_dim):
  x = mx.symbol.Variable("data", stype='csr')
  norm_init = mx.initializer.Normal(sigma=0.01)
  w = mx.symbol.Variable("w", shape=(feature_dim, args.output_dim), 
init=norm_init, stype='row_sparse')
- embed = mx.symbol.dot(x, w)
+ embed = mx.symbol.sparse.dot(x, w)
  y = mx.symbol.Variable("softmax_label")
  model = mx.symbol.SoftmaxOutput(data=embed, label=y, name="out")
  return model
@@ -137,7 +139,7 @@ if __name__ == '__main__':
 batch_size = args.batch_size if args.num_gpu == 0 else args.num_gpu * 
args.batch_size
 dummy_iter = args.dummy_iter
 dataset = args.dataset
-log_level = args.log_level
+log_level = args.sparse_log_level
 contexts = mx.context.cpu(0) if args.num_gpu < 1\
 else [mx.context.gpu(i) for i in range(args.num_gpu)]
 
@@ -148,12 +150,17 @@ if __name__ == '__main__':
 
 # only print log for rank 0 worker
 import logging
-if rank != 0:
+if log_level == 'ERROR':
 log_level = logging.ERROR
 elif log_level == 'DEBUG':
 log_level = logging.DEBUG
 else:
 log_level = logging.INFO
+
+# Only log if it is in the list of workers to be logged
+logging_workers_list = [int(i) for i in args.enable_logging_for.split(",")]
+log_level = log_level if rank in 

[GitHub] eric-haibin-lin commented on a change in pull request #7638: CSRNDArray from/to scipy csr_matrix; fix rand_shape_nd

2017-08-29 Thread git
eric-haibin-lin commented on a change in pull request #7638: CSRNDArray from/to 
scipy csr_matrix; fix rand_shape_nd
URL: https://github.com/apache/incubator-mxnet/pull/7638#discussion_r135928901
 
 

 ##
 File path: python/mxnet/ndarray/sparse.py
 ##
 @@ -908,16 +938,58 @@ def empty(stype, shape, ctx=None, dtype=None, 
aux_types=None):
 
 def array(source_array, ctx=None, dtype=None, aux_types=None):
 """Creates a sparse array from any object exposing the array interface.
+
+Parameters
+--
+source_array : RowSparseNDArray, CSRNDArray or scipy.sparse.csr.csr_matrix
+The source sparse array
+ctx : Context, optional
+Device context (default is the current default context).
+dtype : str or numpy.dtype, optional
+The data type of the output array. The default dtype is 
``source_array.dtype``
+if `source_array` is an `NDArray`, `float32` otherwise.
+aux_types: list of numpy.dtype, optional
+An optional list of types of the aux data for RowSparseNDArray or 
CSRNDArray.
+The default value for CSRNDArray is [`int64`, `int64`] for `indptr` 
and `indices`.
+The default value for RowSparseNDArray is [`int64`] for `indices`.
+
+Returns
+---
+RowSparseNDArray or CSRNDArray
+An array with the same contents as the `source_array`.
+
+Examples
+
+>>> import scipy.sparse as sp
+>>> csr = sp.csr_matrix((2, 100))
+>>> mx.nd.sparse.array(csr)
+
+>>> mx.nd.sparse.array(mx.nd.zeros((3, 2), stype='csr'))
+
+>>> mx.nd.sparse.array(mx.nd.zeros((3, 2), stype='row_sparse'))
+
 """
 if isinstance(source_array, NDArray):
-assert(source_array.stype != 'default'),\
-"Please use `cast_storage` to create BaseSparseNDArray from an 
NDArray"
+assert(source_array.stype != 'default'), \
+   "Please use `cast_storage` to create RowSparseNDArray or 
CSRNDArray from an NDArray"
 dtype = source_array.dtype if dtype is None else dtype
 aux_types = source_array._aux_types if aux_types is None else aux_types
+arr = empty(source_array.stype, source_array.shape, ctx, dtype, 
aux_types)
+arr[:] = source_array
+return arr
+if spsp is not None and isinstance(source_array, spsp.csr.csr_matrix):
+# TODO(haibin) implement `_sync_copy_from` with scipy csr object to 
reduce a copy
+indptr_type = None
+indices_type = None
+if aux_types is not None:
+assert(len(aux_types) == 2), "Expected types for both indices and 
indptr"
+indptr_type = aux_types[0]
+indices_type = aux_types[1]
+# preprocess scipy csr to canonical form
+csr = source_array.sorted_indices()
+csr.sum_duplicates()
+arr = csr_matrix(csr.data, csr.indptr, csr.indices, csr.shape, 
dtype=dtype,
+ indptr_type=indptr_type, indices_type=indices_type)
+return arr
 else:
-# TODO(haibin/anisub) support creation from scipy object when 
`_sync_copy_from` is ready
-raise NotImplementedError('creating BaseSparseNDArray from '
-  ' a non-NDArray object is not implemented.')
-arr = empty(source_array.stype, source_array.shape, ctx, dtype, aux_types)
 
 Review comment:
   sparse.array will only return a CSRNDArray or a RowSparseNDArray
 



[GitHub] eric-haibin-lin commented on a change in pull request #7638: CSRNDArray from/to scipy csr_matrix; fix rand_shape_nd

2017-08-29 Thread git
eric-haibin-lin commented on a change in pull request #7638: CSRNDArray from/to 
scipy csr_matrix; fix rand_shape_nd
URL: https://github.com/apache/incubator-mxnet/pull/7638#discussion_r135928863
 
 

 ##
 File path: python/mxnet/ndarray/sparse.py
 ##
 @@ -908,16 +938,58 @@ def empty(stype, shape, ctx=None, dtype=None, 
aux_types=None):
 
 def array(source_array, ctx=None, dtype=None, aux_types=None):
 """Creates a sparse array from any object exposing the array interface.
+
+Parameters
+--
+source_array : RowSparseNDArray, CSRNDArray or scipy.sparse.csr.csr_matrix
+The source sparse array
+ctx : Context, optional
+Device context (default is the current default context).
+dtype : str or numpy.dtype, optional
+The data type of the output array. The default dtype is 
``source_array.dtype``
+if `source_array` is an `NDArray`, `float32` otherwise.
+aux_types: list of numpy.dtype, optional
+An optional list of types of the aux data for RowSparseNDArray or 
CSRNDArray.
+The default value for CSRNDArray is [`int64`, `int64`] for `indptr` 
and `indices`.
+The default value for RowSparseNDArray is [`int64`] for `indices`.
+
+Returns
+---
+RowSparseNDArray or CSRNDArray
+An array with the same contents as the `source_array`.
+
+Examples
+
+>>> import scipy.sparse as sp
+>>> csr = sp.csr_matrix((2, 100))
+>>> mx.nd.sparse.array(csr)
+
+>>> mx.nd.sparse.array(mx.nd.zeros((3, 2), stype='csr'))
+
+>>> mx.nd.sparse.array(mx.nd.zeros((3, 2), stype='row_sparse'))
+
 """
 if isinstance(source_array, NDArray):
-assert(source_array.stype != 'default'),\
-"Please use `cast_storage` to create BaseSparseNDArray from an 
NDArray"
+assert(source_array.stype != 'default'), \
+   "Please use `cast_storage` to create RowSparseNDArray or 
CSRNDArray from an NDArray"
 dtype = source_array.dtype if dtype is None else dtype
 aux_types = source_array._aux_types if aux_types is None else aux_types
+arr = empty(source_array.stype, source_array.shape, ctx, dtype, 
aux_types)
+arr[:] = source_array
+return arr
+if spsp is not None and isinstance(source_array, spsp.csr.csr_matrix):
+# TODO(haibin) implement `_sync_copy_from` with scipy csr object to 
reduce a copy
+indptr_type = None
+indices_type = None
+if aux_types is not None:
+assert(len(aux_types) == 2), "Expected types for both indices and 
indptr"
+indptr_type = aux_types[0]
+indices_type = aux_types[1]
+# preprocess scipy csr to canonical form
+csr = source_array.sorted_indices()
+csr.sum_duplicates()
+arr = csr_matrix(csr.data, csr.indptr, csr.indices, csr.shape, 
dtype=dtype,
+ indptr_type=indptr_type, indices_type=indices_type)
+return arr
 else:
-# TODO(haibin/anisub) support creation from scipy object when 
`_sync_copy_from` is ready
-raise NotImplementedError('creating BaseSparseNDArray from '
-  ' a non-NDArray object is not implemented.')
-arr = empty(source_array.stype, source_array.shape, ctx, dtype, aux_types)
 
 Review comment:
   mx.nd.array will call either ndarray.array or sparse.array.
   
   `__all__` doesn't include sparse.array, so it won't be imported by the * import.
   
   I can extend the error message so that if the input is a numpy array it says 
"please use `mx.nd.array` to create a dense NDArray".
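   For illustration, a short sketch of the resulting division of labor (based on the docstring examples above):
   ```
   import numpy as np
   import scipy.sparse as sp
   import mxnet as mx

   # dense sources go through mx.nd.array ...
   dense = mx.nd.array(np.zeros((3, 2)))

   # ... while sparse sources (scipy CSR matrices or sparse NDArrays) go
   # through mx.nd.sparse.array, which returns a CSRNDArray or RowSparseNDArray
   csr = mx.nd.sparse.array(sp.csr_matrix((3, 2)))
   rsp = mx.nd.sparse.array(mx.nd.zeros((3, 2), stype='row_sparse'))
   ```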
 



[GitHub] piiswrong commented on a change in pull request #7659: Gluon trainer updates: add learning_rate and lr_scheduler properties and add setter for learning rate

2017-08-29 Thread git
piiswrong commented on a change in pull request #7659: Gluon trainer updates: 
add learning_rate and lr_scheduler properties and add setter for learning rate
URL: https://github.com/apache/incubator-mxnet/pull/7659#discussion_r135928727
 
 

 ##
 File path: python/mxnet/gluon/trainer.py
 ##
 @@ -113,6 +113,36 @@ def _init_kvstore(self):
 
 self._kv_initialized = True
 
+
+@property
+def learning_rate(self):
 
 Review comment:
   Also report learning_rate when using lr_scheduler
 



[GitHub] piiswrong commented on a change in pull request #7659: Gluon trainer updates: add learning_rate and lr_scheduler properties and add setter for learning rate

2017-08-29 Thread git
piiswrong commented on a change in pull request #7659: Gluon trainer updates: 
add learning_rate and lr_scheduler properties and add setter for learning rate
URL: https://github.com/apache/incubator-mxnet/pull/7659#discussion_r135928517
 
 

 ##
 File path: python/mxnet/gluon/trainer.py
 ##
 @@ -113,6 +113,36 @@ def _init_kvstore(self):
 
 self._kv_initialized = True
 
+
+@property
+def learning_rate(self):
+return self._optimizer.lr
+
+
+@property
+def lr_scheduler(self):
 
 Review comment:
   don't expose this for now
 



[GitHub] piiswrong commented on a change in pull request #7659: Gluon trainer updates: add learning_rate and lr_scheduler properties and add setter for learning rate

2017-08-29 Thread git
piiswrong commented on a change in pull request #7659: Gluon trainer updates: 
add learning_rate and lr_scheduler properties and add setter for learning rate
URL: https://github.com/apache/incubator-mxnet/pull/7659#discussion_r135928492
 
 

 ##
 File path: python/mxnet/gluon/trainer.py
 ##
 @@ -113,6 +113,36 @@ def _init_kvstore(self):
 
 self._kv_initialized = True
 
+
+@property
+def learning_rate(self):
 
 Review comment:
   document this as 
   ```
   Properties
   ----------
   ```
   
   in init doc
 



[GitHub] rahul003 commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
rahul003 commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135917269
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,299 @@
+# CSRNDArray - NDArray in Compressed Sparse Row Storage Format
+
+Many real world datasets deal with high dimensional sparse feature vectors. 
For instance,
+in a recommendation system, the number of categories and users is in the order 
of millions,
+while most users only made a few purchases, leading to feature vectors with 
high sparsity
+(i.e. most of the elements are zeros).
+
+Storing and manipulating such large sparse matrices in the default dense 
structure results
+in wated memory and processing on the zeros.
+To take advantage of the sparse structure of the matrix, the ``CSRNDArray`` in 
MXNet
+stores the matrix in [compressed sparse 
row(CSR)](https://en.wikipedia.org/wiki/Sparse_matrix#Compressed_sparse_row_.28CSR.2C_CRS_or_Yale_format.29)
 format
+and uses specialized algorithms in operators.
+The format is designed for 2D matrices with a large number of columns,
+and each row is sparse(i.e. with only a few nonzeros).
+For matrices of high sparsity (e.g. ~1% non-zeros), the advantage of 
``CSRNDArray`` over
+the existing ``NDArray`` is that
+
+- memory consumption is reduced significantly
+- certain operations (e.g. matrix-vector multiplication) are much faster
+
+Meanwhile, ``CSRNDArray`` inherits competitve features from ``NDArray`` such as
 
 Review comment:
   typo at competitive
 



[GitHub] rahul003 commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
rahul003 commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135917903
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,299 @@
+# CSRNDArray - NDArray in Compressed Sparse Row Storage Format
+
+Many real world datasets deal with high dimensional sparse feature vectors. 
For instance,
+in a recommendation system, the number of categories and users is in the order 
of millions,
+while most users only made a few purchases, leading to feature vectors with 
high sparsity
+(i.e. most of the elements are zeros).
+
+Storing and manipulating such large sparse matrices in the default dense 
structure results
+in wated memory and processing on the zeros.
+To take advantage of the sparse structure of the matrix, the ``CSRNDArray`` in 
MXNet
+stores the matrix in [compressed sparse 
row(CSR)](https://en.wikipedia.org/wiki/Sparse_matrix#Compressed_sparse_row_.28CSR.2C_CRS_or_Yale_format.29)
 format
+and uses specialized algorithms in operators.
+The format is designed for 2D matrices with a large number of columns,
+and each row is sparse(i.e. with only a few nonzeros).
+For matrices of high sparsity (e.g. ~1% non-zeros), the advantage of 
``CSRNDArray`` over
+the existing ``NDArray`` is that
+
+- memory consumption is reduced significantly
+- certain operations (e.g. matrix-vector multiplication) are much faster
+
+Meanwhile, ``CSRNDArray`` inherits competitve features from ``NDArray`` such as
+lazy evaluation and automatic parallelization, which is not available in the
 
 Review comment:
   is -> are
 



[GitHub] piiswrong commented on a change in pull request #7638: CSRNDArray from/to scipy csr_matrix; fix rand_shape_nd

2017-08-29 Thread git
piiswrong commented on a change in pull request #7638: CSRNDArray from/to scipy 
csr_matrix; fix rand_shape_nd
URL: https://github.com/apache/incubator-mxnet/pull/7638#discussion_r135928021
 
 

 ##
 File path: python/mxnet/ndarray/sparse.py
 ##
 @@ -908,16 +938,58 @@ def empty(stype, shape, ctx=None, dtype=None, 
aux_types=None):
 
 def array(source_array, ctx=None, dtype=None, aux_types=None):
 """Creates a sparse array from any object exposing the array interface.
+
+Parameters
+--
+source_array : RowSparseNDArray, CSRNDArray or scipy.sparse.csr.csr_matrix
+The source sparse array
+ctx : Context, optional
+Device context (default is the current default context).
+dtype : str or numpy.dtype, optional
+The data type of the output array. The default dtype is 
``source_array.dtype``
+if `source_array` is an `NDArray`, `float32` otherwise.
+aux_types: list of numpy.dtype, optional
+An optional list of types of the aux data for RowSparseNDArray or 
CSRNDArray.
+The default value for CSRNDArray is [`int64`, `int64`] for `indptr` 
and `indices`.
+The default value for RowSparseNDArray is [`int64`] for `indices`.
+
+Returns
+---
+RowSparseNDArray or CSRNDArray
+An array with the same contents as the `source_array`.
+
+Examples
+
+>>> import scipy.sparse as sp
+>>> csr = sp.csr_matrix((2, 100))
+>>> mx.nd.sparse.array(csr)
+
+>>> mx.nd.sparse.array(mx.nd.zeros((3, 2), stype='csr'))
+
+>>> mx.nd.sparse.array(mx.nd.zeros((3, 2), stype='row_sparse'))
+
 """
 if isinstance(source_array, NDArray):
-assert(source_array.stype != 'default'),\
-"Please use `cast_storage` to create BaseSparseNDArray from an 
NDArray"
+assert(source_array.stype != 'default'), \
+   "Please use `cast_storage` to create RowSparseNDArray or 
CSRNDArray from an NDArray"
 dtype = source_array.dtype if dtype is None else dtype
 aux_types = source_array._aux_types if aux_types is None else aux_types
+arr = empty(source_array.stype, source_array.shape, ctx, dtype, 
aux_types)
+arr[:] = source_array
+return arr
+if spsp is not None and isinstance(source_array, spsp.csr.csr_matrix):
+# TODO(haibin) implement `_sync_copy_from` with scipy csr object to 
reduce a copy
+indptr_type = None
+indices_type = None
+if aux_types is not None:
+assert(len(aux_types) == 2), "Expected types for both indices and 
indptr"
+indptr_type = aux_types[0]
+indices_type = aux_types[1]
+# preprocess scipy csr to canonical form
+csr = source_array.sorted_indices()
+csr.sum_duplicates()
+arr = csr_matrix(csr.data, csr.indptr, csr.indices, csr.shape, 
dtype=dtype,
+ indptr_type=indptr_type, indices_type=indices_type)
+return arr
 else:
-# TODO(haibin/anisub) support creation from scipy object when 
`_sync_copy_from` is ready
-raise NotImplementedError('creating BaseSparseNDArray from '
-  ' a non-NDArray object is not implemented.')
-arr = empty(source_array.stype, source_array.shape, ctx, dtype, aux_types)
 
 Review comment:
   So sparse.array is different from nd.array?
   
   Should we explain this better in the error message?
   Is this going to be imported into mx.nd through * import?
 



[GitHub] astonzhang opened a new pull request #7659: Gluon trainer updates: add learning_rate and lr_scheduler properties and add setter for learning rate

2017-08-29 Thread git
astonzhang opened a new pull request #7659: Gluon trainer updates: add 
learning_rate and lr_scheduler properties and add setter for learning rate
URL: https://github.com/apache/incubator-mxnet/pull/7659
 
 
   
 



[GitHub] piiswrong commented on a change in pull request #7657: Remove python function `negative` for rendering ndarray api in doc

2017-08-29 Thread git
piiswrong commented on a change in pull request #7657: Remove python function 
`negative` for rendering ndarray api in doc
URL: https://github.com/apache/incubator-mxnet/pull/7657#discussion_r135927038
 
 

 ##
 File path: python/mxnet/ndarray/ndarray.py
 ##
 @@ -43,8 +43,8 @@
 __all__ = ["NDArray", "concatenate", "_DTYPE_NP_TO_MX", "_DTYPE_MX_TO_NP", 
"_GRAD_REQ_MAP",
"ones", "add", "arange", "divide", "equal", "full", "greater", 
"greater_equal",
"imdecode", "lesser", "lesser_equal", "maximum", "minimum", 
"moveaxis", "modulo",
-   "multiply", "negative", "not_equal", "onehot_encode", "power", 
"subtract",
-   "true_divide", "waitall", "_new_empty_handle"]
 
 Review comment:
   why remove true_divide?
 




[GitHub] rahul003 commented on issue #7658: Parallelize Python 2 and 3 unit test cases in Jenkins CI.

2017-08-29 Thread git
rahul003 commented on issue #7658: Parallelize Python 2 and 3 unit test cases 
in Jenkins CI.
URL: https://github.com/apache/incubator-mxnet/pull/7658#issuecomment-325818765
 
 
   Ok, new changes look fine as well.
 



[incubator-mxnet] branch master updated: Parallelize Python 2 and 3 unit test cases in Jenkins CI. (#7658)

2017-08-29 Thread skm
This is an automated email from the ASF dual-hosted git repository.

skm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new 62e6d2f  Parallelize Python 2 and 3 unit test cases in Jenkins CI. 
(#7658)
62e6d2f is described below

commit 62e6d2f99df23e601508142004124cb1ddc5ffaf
Author: Sandeep Krishnamurthy 
AuthorDate: Tue Aug 29 15:07:08 2017 -0700

Parallelize Python 2 and 3 unit test cases in Jenkins CI. (#7658)

* Parallelize Python 2 and 3 unit test cases.

* Parallelize python 2 and 3 unit tests cases in jenkins

* Parallelize python 2 and 3 unit tests cases in jenkins
---
 Jenkinsfile | 82 -
 1 file changed, 70 insertions(+), 12 deletions(-)

diff --git a/Jenkinsfile b/Jenkinsfile
index fe0151a..bf514bc 100644
--- a/Jenkinsfile
+++ b/Jenkinsfile
@@ -76,11 +76,18 @@ echo ${libs} | sed -e 's/,/ /g' | xargs md5sum
 }
 
 // Python unittest for CPU
-def python_ut(docker_type) {
+// Python 2
+def python2_ut(docker_type) {
   timeout(time: max_time, unit: 'MINUTES') {
 sh "${docker_run} ${docker_type} find . -name '*.pyc' -type f -delete"
 sh "${docker_run} ${docker_type} PYTHONPATH=./python/ nosetests 
--with-timer --verbose tests/python/unittest"
 sh "${docker_run} ${docker_type} PYTHONPATH=./python/ nosetests 
--with-timer --verbose tests/python/train"
+  }
+}
+
+// Python 3
+def python3_ut(docker_type) {
+  timeout(time: max_time, unit: 'MINUTES') {
 sh "${docker_run} ${docker_type} find . -name '*.pyc' -type f -delete"
 sh "${docker_run} ${docker_type} PYTHONPATH=./python/ nosetests-3.4 
--with-timer --verbose tests/python/unittest"
   }
@@ -88,10 +95,17 @@ def python_ut(docker_type) {
 
 // GPU test has two parts. 1) run unittest on GPU, 2) compare the results on
 // both CPU and GPU
-def python_gpu_ut(docker_type) {
+// Python 2
+def python2_gpu_ut(docker_type) {
   timeout(time: max_time, unit: 'MINUTES') {
 sh "${docker_run} ${docker_type} find . -name '*.pyc' -type f -delete"
 sh "${docker_run} ${docker_type} PYTHONPATH=./python/ nosetests 
--with-timer --verbose tests/python/gpu"
+  }
+}
+
+// Python 3
+def python3_gpu_ut(docker_type) {
+  timeout(time: max_time, unit: 'MINUTES') {
 sh "${docker_run} ${docker_type} find . -name '*.pyc' -type f -delete"
 sh "${docker_run} ${docker_type} PYTHONPATH=./python/ nosetests-3.4 
--with-timer --verbose tests/python/gpu"
   }
@@ -250,31 +264,75 @@ try {
 }
 
 stage('Unit Test') {
-  parallel 'Python2/3: CPU': {
+  parallel 'Python2: CPU': {
 node('mxnetlinux') {
-  ws('workspace/ut-python-cpu') {
+  ws('workspace/ut-python2-cpu') {
+init_git()
+unpack_lib('cpu')
+python2_ut('cpu')
+  }
+}
+  },
+  'Python3: CPU': {
+node('mxnetlinux') {
+  ws('workspace/ut-python3-cpu') {
 init_git()
 unpack_lib('cpu')
-python_ut('cpu')
+python3_ut('cpu')
   }
 }
   },
-  'Python2/3: GPU': {
+  'Python2: GPU': {
 node('mxnetlinux') {
-  ws('workspace/ut-python-gpu') {
+  ws('workspace/ut-python2-gpu') {
 init_git()
 unpack_lib('gpu', mx_lib)
-python_gpu_ut('gpu')
+python2_gpu_ut('gpu')
+  }
+}
+  },
+  'Python3: GPU': {
+node('mxnetlinux') {
+  ws('workspace/ut-python3-gpu') {
+init_git()
+unpack_lib('gpu', mx_lib)
+python3_gpu_ut('gpu')
+  }
+}
+  },
+  'Python2: MKLML-CPU': {
+node('mxnetlinux') {
+  ws('workspace/ut-python2-mklml-cpu') {
+init_git()
+unpack_lib('mklml')
+python2_ut('mklml_gpu')
+  }
+}
+  },
+  'Python2: MKLML-GPU': {
+node('mxnetlinux') {
+  ws('workspace/ut-python2-mklml-gpu') {
+init_git()
+unpack_lib('mklml')
+python2_gpu_ut('mklml_gpu')
+  }
+}
+  },
+  'Python3: MKLML-CPU': {
+node('mxnetlinux') {
+  ws('workspace/ut-python3-mklml-cpu') {
+init_git()
+unpack_lib('mklml')
+python3_ut('mklml_gpu')
   }
 }
   },
-  'Python2/3: MKLML': {
+  'Python3: MKLML-GPU': {
 node('mxnetlinux') {
-  ws('workspace/ut-python-mklml') {
+  ws('workspace/ut-python3-mklml-gpu') {
 init_git()
 unpack_lib('mklml')
-python_ut('mklml_gpu')
-python_gpu_ut('mklml_gpu')
+python3_gpu_ut('mklml_gpu')
   }
 }
   },


[GitHub] sandeep-krishnamurthy closed pull request #7658: Parallelize Python 2 and 3 unit test cases in Jenkins CI.

2017-08-29 Thread git
sandeep-krishnamurthy closed pull request #7658: Parallelize Python 2 and 3 
unit test cases in Jenkins CI.
URL: https://github.com/apache/incubator-mxnet/pull/7658
 
 
   
 



[GitHub] anirudh2290 commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
anirudh2290 commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135922966
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,299 @@
+# CSRNDArray - NDArray in Compressed Sparse Row Storage Format
+
+Many real world datasets deal with high dimensional sparse feature vectors. 
For instance,
+in a recommendation system, the number of categories and users is in the order 
of millions,
+while most users only made a few purchases, leading to feature vectors with 
high sparsity
+(i.e. most of the elements are zeros).
+
+Storing and manipulating such large sparse matrices in the default dense 
structure results
+in wated memory and processing on the zeros.
+To take advantage of the sparse structure of the matrix, the ``CSRNDArray`` in 
MXNet
+stores the matrix in [compressed sparse 
row(CSR)](https://en.wikipedia.org/wiki/Sparse_matrix#Compressed_sparse_row_.28CSR.2C_CRS_or_Yale_format.29)
 format
+and uses specialized algorithms in operators.
+The format is designed for 2D matrices with a large number of columns,
+and each row is sparse(i.e. with only a few nonzeros).
+For matrices of high sparsity (e.g. ~1% non-zeros), the advantage of 
``CSRNDArray`` over
+the existing ``NDArray`` is that
+
+- memory consumption is reduced significantly
+- certain operations (e.g. matrix-vector multiplication) are much faster
+
+Meanwhile, ``CSRNDArray`` inherits competitve features from ``NDArray`` such as
+lazy evaluation and automatic parallelization, which is not available in the
+scientific computing python package [SciPy](https://www.scipy.org/).
+
+Apart from often queried attributes such as **ndarray.shape**, 
**ndarray.dtype** and **ndarray.context**,
you'll also want to query **ndarray.stype**: the storage type of the NDArray. 
For a usual dense NDArray,
+the value of stype is **"default"**. For an CSRNDArray, the value of stype is 
**"csr"**.
+
+## Prerequisites
+
+To complete this tutorial, we need:
+
+- MXNet. See the instructions for your operating system in [Setup and 
Installation](http://mxnet.io/get_started/install.html)
+- [Jupyter](http://jupyter.org/)
+```
+pip install jupyter
+```
+
+## Compressed Sparse Row Format
+
+A CSRNDArray represents a 2D matrix as three separate 1D arrays: **data**,
+**indptr** and **indices**, where the column indices for
+row ``i`` are stored in ``indices[indptr[i]:indptr[i+1]]`` in ascending order,
+and their corresponding values are stored in ``data[indptr[i]:indptr[i+1]]``.
+
+For example, the CSR representation for matrix
+```
+[[7, 0, 8, 0]
+ [0, 0, 0, 0]
+ [0, 9, 0, 0]]
+```
+is:
+```
+[7, 8, 9]  # data
+[0, 2, 1]  # indices
+[0, 2, 2, 3]   # indptr
+```
+
+Note that in MXNet, the column indices for a given row are always sorted in 
ascending order,
+and duplicated column entries for the same row are not allowed.
+
+## Array Creation
+
+There are a few different ways to create a `CSRNDArray`.
+
+* We can create a CSRNDArray with data, indices and indptr by using the 
`csr_matrix` function:
+
+```python
+import mxnet as mx
+import numpy as np
+# create a CSRNDArray with python lists
+shape = (3, 4)
+data_list = [7, 8, 9]
+indptr_list = [0, 2, 2, 3]
+indices_list = [0, 2, 1]
+a = mx.nd.sparse.csr_matrix(data_list, indptr_list, indices_list, shape)
+# create a CSRNDArray with numpy arrays
+data_np = np.array([7, 8, 9])
 
 Review comment:
   We can just use the above lists, data_list, indptr_list, indices_list
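   For illustration, the suggested simplification (reusing the lists already defined in the tutorial snippet):
   ```
   import mxnet as mx

   shape = (3, 4)
   data_list = [7, 8, 9]
   indptr_list = [0, 2, 2, 3]
   indices_list = [0, 2, 1]

   # the same Python lists work directly; no separate numpy copies needed
   a = mx.nd.sparse.csr_matrix(data_list, indptr_list, indices_list, shape)
   ```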
 



[GitHub] anirudh2290 commented on a change in pull request #7656: CSRNDArray Tutorial

2017-08-29 Thread git
anirudh2290 commented on a change in pull request #7656: CSRNDArray Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/7656#discussion_r135921813
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,299 @@
+# CSRNDArray - NDArray in Compressed Sparse Row Storage Format
+
+Many real world datasets deal with high dimensional sparse feature vectors. 
For instance,
+in a recommendation system, the number of categories and users is in the order 
of millions,
+while most users only made a few purchases, leading to feature vectors with 
high sparsity
+(i.e. most of the elements are zeros).
+
+Storing and manipulating such large sparse matrices in the default dense 
structure results
+in wated memory and processing on the zeros.
 
 Review comment:
   wated -> wasted
 



[GitHub] sandeep-krishnamurthy commented on a change in pull request #7658: Parallelize Python 2 and 3 unit test cases in Jenkins CI.

2017-08-29 Thread git
sandeep-krishnamurthy commented on a change in pull request #7658: Parallelize 
Python 2 and 3 unit test cases in Jenkins CI.
URL: https://github.com/apache/incubator-mxnet/pull/7658#discussion_r135923707
 
 

 ##
 File path: Jenkinsfile
 ##
 @@ -250,31 +264,59 @@ try {
 }
 
 stage('Unit Test') {
-  parallel 'Python2/3: CPU': {
+  parallel 'Python2: CPU': {
 node('mxnetlinux') {
-  ws('workspace/ut-python-cpu') {
+  ws('workspace/ut-python2-cpu') {
 init_git()
 unpack_lib('cpu')
-python_ut('cpu')
+python2_ut('cpu')
   }
 }
   },
-  'Python2/3: GPU': {
+  'Python3: CPU': {
 node('mxnetlinux') {
-  ws('workspace/ut-python-gpu') {
+  ws('workspace/ut-python3-cpu') {
+init_git()
+unpack_lib('cpu')
+python3_ut('cpu')
+  }
+}
+  },
+  'Python2: GPU': {
+node('mxnetlinux') {
+  ws('workspace/ut-python2-gpu') {
+init_git()
+unpack_lib('gpu', mx_lib)
+python2_gpu_ut('gpu')
+  }
+}
+  },
+  'Python3: GPU': {
+node('mxnetlinux') {
+  ws('workspace/ut-python3-gpu') {
 init_git()
 unpack_lib('gpu', mx_lib)
-python_gpu_ut('gpu')
+python3_gpu_ut('gpu')
+  }
+}
+  },
+  'Python2: MKLML': {
+node('mxnetlinux') {
+  ws('workspace/ut-python2-mklml') {
+init_git()
+unpack_lib('mklml')
+python2_ut('mklml_gpu')
+python2_gpu_ut('mklml_gpu')
   }
 }
   },
-  'Python2/3: MKLML': {
+  'Python3: MKLML': {
 node('mxnetlinux') {
   ws('workspace/ut-python-mklml') {
 init_git()
 unpack_lib('mklml')
-python_ut('mklml_gpu')
-python_gpu_ut('mklml_gpu')
+python3_ut('mklml_gpu')
+python3_gpu_ut('mklml_gpu')
 
 Review comment:
   Thanks. Updated.
 



[GitHub] rahul003 commented on a change in pull request #7658: Parallelize Python 2 and 3 unit test cases in Jenkins CI.

2017-08-29 Thread git
rahul003 commented on a change in pull request #7658: Parallelize Python 2 and 
3 unit test cases in Jenkins CI.
URL: https://github.com/apache/incubator-mxnet/pull/7658#discussion_r135923115
 
 

 ##
 File path: Jenkinsfile
 ##
 @@ -250,31 +264,59 @@ try {
 }
 
 stage('Unit Test') {
-  parallel 'Python2/3: CPU': {
+  parallel 'Python2: CPU': {
 node('mxnetlinux') {
-  ws('workspace/ut-python-cpu') {
+  ws('workspace/ut-python2-cpu') {
 init_git()
 unpack_lib('cpu')
-python_ut('cpu')
+python2_ut('cpu')
   }
 }
   },
-  'Python2/3: GPU': {
+  'Python3: CPU': {
 node('mxnetlinux') {
-  ws('workspace/ut-python-gpu') {
+  ws('workspace/ut-python3-cpu') {
+init_git()
+unpack_lib('cpu')
+python3_ut('cpu')
+  }
+}
+  },
+  'Python2: GPU': {
+node('mxnetlinux') {
+  ws('workspace/ut-python2-gpu') {
+init_git()
+unpack_lib('gpu', mx_lib)
+python2_gpu_ut('gpu')
+  }
+}
+  },
+  'Python3: GPU': {
+node('mxnetlinux') {
+  ws('workspace/ut-python3-gpu') {
 init_git()
 unpack_lib('gpu', mx_lib)
-python_gpu_ut('gpu')
+python3_gpu_ut('gpu')
+  }
+}
+  },
+  'Python2: MKLML': {
+node('mxnetlinux') {
+  ws('workspace/ut-python2-mklml') {
+init_git()
+unpack_lib('mklml')
+python2_ut('mklml_gpu')
+python2_gpu_ut('mklml_gpu')
   }
 }
   },
-  'Python2/3: MKLML': {
+  'Python3: MKLML': {
 node('mxnetlinux') {
   ws('workspace/ut-python-mklml') {
 init_git()
 unpack_lib('mklml')
-python_ut('mklml_gpu')
-python_gpu_ut('mklml_gpu')
+python3_ut('mklml_gpu')
+python3_gpu_ut('mklml_gpu')
 
 Review comment:
   You could also parallelize these two?
 



[GitHub] sandeep-krishnamurthy opened a new pull request #7658: Parallelize Python 2 and 3 unit test cases.

2017-08-29 Thread git
sandeep-krishnamurthy opened a new pull request #7658: Parallelize Python 2 and 
3 unit test cases.
URL: https://github.com/apache/incubator-mxnet/pull/7658
 
 
   Run the Python 2 and 3 unit tests in parallel.
   This will save around 45 minutes of build time in Jenkins.
   
   @rahul003 
 


