[GitHub] bhavinthaker commented on a change in pull request #7921: Add three sparse tutorials

2017-09-24 Thread git
bhavinthaker commented on a change in pull request #7921: Add three sparse 
tutorials
URL: https://github.com/apache/incubator-mxnet/pull/7921#discussion_r140676020
 
 

 ##
 File path: docs/tutorials/sparse/rowsparse.md
 ##
 @@ -0,0 +1,383 @@
+
+# RowSparseNDArray - NDArray for Sparse Gradient Updates
+
+## Motivation
+
+Many real-world datasets consist of high-dimensional sparse feature vectors. When learning
+the weights of a model from such sparse data, the derived gradients of the weights can themselves be sparse.
+For example, let's say we learn a linear model ``Y = XW + b``, where ``X`` holds sparse feature vectors:
+
+
+```python
+import mxnet as mx
+shape = (3, 10)
+# `X` only contains 4 non-zeros
+data = [6, 7, 8, 9]
+indptr = [0, 2, 3, 4]
+indices = [0, 4, 0, 0]
+X = mx.nd.sparse.csr_matrix(data, indptr, indices, shape)
+# the content of `X`
+X.asnumpy()
+```
+
+Some columns in `X` contain no non-zero values at all, so the gradient of the weight `W` will have many row slices of all zeros, corresponding to those all-zero columns in `X`.
+
+
+```python
+W = mx.nd.random_uniform(shape=(10, 2))
+b = mx.nd.zeros((3, 1))
+# attach a gradient placeholder for W
+W.attach_grad(stype='row_sparse')
+with mx.autograd.record():
+    Y = mx.nd.dot(X, W) + b
+
+Y.backward()
+# the content of the gradients of `W`
+{'W.grad': W.grad, 'W.grad.asnumpy()': W.grad.asnumpy()}
+```
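+
+Since we attached the gradient with `stype='row_sparse'`, the computed gradient is itself a RowSparseNDArray; as a quick check, for illustration:
+
+```python
+# the gradient uses the row sparse storage type
+W.grad.stype
+```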
+
+Storing and manipulating such sparse matrices, with many row slices of all zeros, in the default dense structure results in wasted memory and processing on the zeros. More importantly, many gradient-based optimization methods such as SGD, [AdaGrad](https://stanford.edu/~jduchi/projects/DuchiHaSi10_colt.pdf) and [Adam](https://arxiv.org/pdf/1412.6980.pdf)
+take advantage of sparse gradients and prove to be efficient and effective.
+**In MXNet, ``RowSparseNDArray`` stores the matrix in ``row sparse`` format, which is designed for arrays whose row slices are mostly all zeros.**
+In this tutorial, we will describe what the row sparse format is and how to use RowSparseNDArray for sparse gradient updates in MXNet.
+
+## Prerequisites
+
+To complete this tutorial, we need:
+
+- MXNet. See the instructions for your operating system in [Setup and 
Installation](http://mxnet.io/get_started/install.html)
+- [Jupyter](http://jupyter.org/)
+```
+pip install jupyter
+```
+- Basic knowledge of NDArray in MXNet. See the detailed tutorial for NDArray 
in [NDArray - Imperative tensor operations on 
CPU/GPU](https://mxnet.incubator.apache.org/tutorials/basic/ndarray.html)
+- Understanding of [automatic differentiation with 
autograd](http://gluon.mxnet.io/chapter01_crashcourse/autograd.html)
+- GPUs - A section of this tutorial uses GPUs. If you don't have GPUs on your
+machine, simply set the variable `gpu_device` (set in the GPUs section of this
+tutorial) to `mx.cpu()`
+
+## Row Sparse Format
+
+A RowSparseNDArray represents a multidimensional NDArray using two separate arrays:
+`data` and `indices`.
+
+- data: an NDArray of any dtype with shape `[D0, D1, ..., Dn]`.
+- indices: a 1D int64 NDArray with shape `[D0]` with values sorted in 
ascending order.
+
+The ``indices`` array stores the indices of the row slices with non-zero values,
+while those values are stored in the ``data`` array. The corresponding dense NDArray `dense` represented by RowSparseNDArray `rsp` satisfies
+
+``dense[rsp.indices[i], :, :, :, ...] = rsp.data[i, :, :, :, ...]``
+
+A RowSparseNDArray is typically used to represent non-zero row slices of a large NDArray of shape `[LARGE0, D1, ..., Dn]`, where `LARGE0 >> D0` and most row slices are zeros.
+
+Given this two-dimensional matrix:
+
+
+```python
+[[ 1, 2, 3],
+ [ 0, 0, 0],
+ [ 4, 0, 5],
+ [ 0, 0, 0],
+ [ 0, 0, 0]]
+```
+
+The row sparse representation would be:
+- `data` array holds all the non-zero row slices of the array.
+- `indices` array stores the row index for each row slice with non-zero 
elements.
+
+
+
+```python
+data = [[1, 2, 3], [4, 0, 5]]
+indices = [0, 2]
+```
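+
+As a minimal sketch, we can build this array with the `row_sparse_array` function (used later in this tutorial's Array Creation section) and convert it back to dense to verify:
+
+```python
+import mxnet as mx
+shape = (5, 3)
+data = [[1, 2, 3], [4, 0, 5]]
+indices = [0, 2]
+rsp = mx.nd.sparse.row_sparse_array(data, indices, shape)
+# converting back to dense recovers the original 5x3 matrix
+rsp.asnumpy()
+```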
+
+`RowSparseNDArray` supports multidimensional arrays. Given this 3D tensor:
+
+
+```python
+[[[1, 0],
+  [0, 2],
+  [3, 4]],
+
+ [[5, 0],
+  [6, 0],
+  [0, 0]],
+
+ [[0, 0],
+  [0, 0],
+  [0, 0]]]
+```
+
+The row sparse representation would be (with `data` and `indices` defined the 
same as above):
+
+
+```python
+data = [[[1, 0], [0, 2], [3, 4]], [[5, 0], [6, 0], [0, 0]]]
+indices = [0, 1]
+```
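+
+The same construction applies in the multidimensional case; a sketch, assuming `row_sparse_array` accepts multidimensional `data` as the format description implies:
+
+```python
+shape = (3, 3, 2)
+data = [[[1, 0], [0, 2], [3, 4]], [[5, 0], [6, 0], [0, 0]]]
+indices = [0, 1]
+rsp_3d = mx.nd.sparse.row_sparse_array(data, indices, shape)
+rsp_3d.asnumpy()
+```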
+
+``RowSparseNDArray`` is a subclass of ``NDArray``. If you query **stype** of a 
RowSparseNDArray,
+the value will be **"row_sparse"**.
 
 Review comment:
   Since this is called "row_sparse", wouldn't it be nice if we had called 
CSR's stype as "compressed_sparse" for ease in understanding, i.e. 
"compressed_sparse" for CSR and "row_sparse" for RSP? :-)
 


[GitHub] bhavinthaker commented on a change in pull request #7921: Add three sparse tutorials

2017-09-24 Thread git
bhavinthaker commented on a change in pull request #7921: Add three sparse 
tutorials
URL: https://github.com/apache/incubator-mxnet/pull/7921#discussion_r140672403
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,338 @@
+
+# CSRNDArray - NDArray in Compressed Sparse Row Storage Format
+
+Many real-world datasets deal with high-dimensional sparse feature vectors. Take for instance a recommendation system where the number of categories and users is on the order of millions. The purchase data for each category by user would show that most users only make a few purchases, leading to a dataset with high sparsity (i.e. most of the elements are zeros).
+
+Storing and manipulating such large sparse matrices in the default dense 
structure results in wasted memory and processing on the zeros. To take 
advantage of the sparse structure of the matrix, the `CSRNDArray` in MXNet 
stores the matrix in [compressed sparse row 
(CSR)](https://en.wikipedia.org/wiki/Sparse_matrix#Compressed_sparse_row_.28CSR.2C_CRS_or_Yale_format.29)
 format and uses specialized algorithms in operators.
+**The format is designed for 2D matrices with a large number of columns,
+where each row is sparse (i.e. has only a few nonzeros).**
+
+## Advantages of Compressed Sparse Row NDArray (CSRNDArray)
+For matrices of high sparsity (e.g. ~1% non-zeros), there are two primary 
advantages of `CSRNDArray` over the existing `NDArray`:
+
+- memory consumption is reduced significantly
+- certain operations are much faster (e.g. matrix-vector multiplication)
+
+You may be familiar with the CSR storage format in [SciPy](https://www.scipy.org/) and will note the similarities in MXNet's implementation. However, `CSRNDArray` inherits additional features from `NDArray`, such as lazy evaluation and automatic parallelization, that are not available in SciPy's flavor of CSR.
+
+The introduction of `CSRNDArray` also brings a new attribute, `stype`, to `NDArray` as a holder for storage type information. You can now query **ndarray.stype** in addition to the oft-queried attributes such as **ndarray.shape**, **ndarray.dtype**, and **ndarray.context**. For a typical dense NDArray, the value of `stype` is **"default"**. For a `CSRNDArray`, the value of `stype` is **"csr"**.
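+
+For illustration, a minimal sketch of querying `stype`, using the `tostype` conversion that appears later in these tutorials:
+
+```python
+import mxnet as mx
+dense = mx.nd.ones((3, 4))
+sparse = dense.tostype('csr')
+# 'default' for the dense array, 'csr' for the converted one
+(dense.stype, sparse.stype)
+```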
+
+## Prerequisites
+
+To complete this tutorial, you will need:
+
+- MXNet. See the instructions for your operating system in [Setup and 
Installation](http://mxnet.io/get_started/install.html)
+- [Jupyter](http://jupyter.org/)
+```
+pip install jupyter
+```
+- Basic knowledge of NDArray in MXNet. See the detailed tutorial for NDArray 
in [NDArray - Imperative tensor operations on 
CPU/GPU](https://mxnet.incubator.apache.org/tutorials/basic/ndarray.html).
+- SciPy - A section of this tutorial uses the SciPy package in Python. If you don't have SciPy, the example in that section will be ignored.
+- GPUs - A section of this tutorial uses GPUs. If you don't have GPUs on your 
machine, simply set the variable `gpu_device` (set in the GPUs section of this 
tutorial) to `mx.cpu()`.
+
+## Compressed Sparse Row Matrix
+
+A CSRNDArray represents a 2D matrix as three separate 1D arrays: **data**, 
**indptr** and **indices**, where the column indices for row `i` are stored in 
`indices[indptr[i]:indptr[i+1]]` in ascending order, and their corresponding 
values are stored in `data[indptr[i]:indptr[i+1]]`.
+
+- **data**: CSR format data array of the matrix
+- **indices**: CSR format index array of the matrix
+- **indptr**: CSR format index pointer array of the matrix
+
+### Example Matrix Compression
+
+For example, given the matrix:
+```
+[[7, 0, 8, 0]
+ [0, 0, 0, 0]
+ [0, 9, 0, 0]]
+```
+
+We can compress this matrix using CSR, and to do so we need to calculate 
`data`, `indices`, and `indptr`.
+
+The `data` array holds all the non-zero entries of the matrix in row-major 
order. Put another way, you create a data array that has all of the zeros 
removed from the matrix, row by row, storing the numbers in that order. Your 
result:
+
+data = [7, 8, 9]
+
+The `indices` array stores the column index for each non-zero element in 
`data`. As you cycle through the data array, starting with 7, you can see it is 
in column 0. Then looking at 8, you can see it is in column 2. Lastly 9 is in 
column 1. Your result:
+
+indices = [0, 2, 1]
+
+The `indptr` array is what helps identify the rows where the data appears. It stores the offset into `data` of the first non-zero element of each row of the matrix. This array always starts with 0 (reasons can be explored later), so indptr[0] is 0. Each subsequent value in the array is the aggregate number of non-zero elements up to and including that row. Looking at the first row of the matrix you can see two non-zero values, so indptr[1] is 2. The next row contains all zeros, so the aggregate is still 2, and indptr[2] is 2. Finally, the last row contains one non-zero element, bringing the aggregate to 3, so indptr[3] is 3. Your result:
+
+indptr = [0, 2, 2, 3]
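+
+Putting the three arrays together, here is a minimal sketch that builds the example matrix as a CSRNDArray with the `csr_matrix` function and converts it back to dense to verify:
+
+```python
+import mxnet as mx
+data = [7, 8, 9]
+indices = [0, 2, 1]
+indptr = [0, 2, 2, 3]
+A = mx.nd.sparse.csr_matrix(data, indptr, indices, (3, 4))
+# converting back to dense recovers the original 3x4 matrix
+A.asnumpy()
+```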


[GitHub] bhavinthaker commented on a change in pull request #7921: Add three sparse tutorials

2017-09-24 Thread git
bhavinthaker commented on a change in pull request #7921: Add three sparse 
tutorials
URL: https://github.com/apache/incubator-mxnet/pull/7921#discussion_r140676677
 
 

 ##
 File path: docs/tutorials/sparse/train.md
 ##
 @@ -0,0 +1,256 @@
+
+# Train a Linear Regression Model with Sparse Symbols
+In previous tutorials, we introduced `CSRNDArray` and `RowSparseNDArray`,
+the basic data structures for manipulating sparse data.
+MXNet also provides the `Sparse Symbol` API, which enables symbolic expressions that handle sparse arrays.
+In this tutorial, we first focus on how to compose a symbolic graph with 
sparse operators,
+then train a linear regression model using sparse symbols with the Module API.
+
+## Prerequisites
+
+To complete this tutorial, we need:
+
+- MXNet. See the instructions for your operating system in [Setup and 
Installation](http://mxnet.io/get_started/install.html).  
+
+- [Jupyter Notebook](http://jupyter.org/index.html) and [Python 
Requests](http://docs.python-requests.org/en/master/) packages.
+```
+pip install jupyter requests
+```
+
+- Basic knowledge of Symbol in MXNet. See the detailed tutorial for Symbol in 
[Symbol - Neural network graphs and 
auto-differentiation](https://mxnet.incubator.apache.org/tutorials/basic/symbol.html).
+
+- Basic knowledge of CSRNDArray in MXNet. See the detailed tutorial for 
CSRNDArray in [TODO(haibin) Add Link 
Here](http://ec2-54-187-32-207.us-west-2.compute.amazonaws.com/tutorials/sparse/csr.html).
+
+- Basic knowledge of RowSparseNDArray in MXNet. See the detailed tutorial for 
RowSparseNDArray in [TODO(haibin) Add Link 
Here](http://ec2-54-187-32-207.us-west-2.compute.amazonaws.com/tutorials/sparse/rowsparse.html).
+
+## Variables
+
+Variables are placeholder for arrays. We can use them to hold sparse arrays, 
too.
 
 Review comment:
   Remove the comma between arrays and too.
 



[GitHub] bhavinthaker commented on a change in pull request #7921: Add three sparse tutorials

2017-09-24 Thread git
bhavinthaker commented on a change in pull request #7921: Add three sparse 
tutorials
URL: https://github.com/apache/incubator-mxnet/pull/7921#discussion_r140680349
 
 

 ##
 File path: docs/tutorials/sparse/train.md
 ##
 @@ -0,0 +1,256 @@
+
+# Train a Linear Regression Model with Sparse Symbols
+In previous tutorials, we introduced `CSRNDArray` and `RowSparseNDArray`,
+the basic data structures for manipulating sparse data.
+MXNet also provides `Sparse Symbol` API, which enables symbolic expressions 
that handle sparse arrays.
+In this tutorial, we first focus on how to compose a symbolic graph with 
sparse operators,
+then train a linear regression model using sparse symbols with the Module API.
+
+## Prerequisites
+
+To complete this tutorial, we need:
+
+- MXNet. See the instructions for your operating system in [Setup and 
Installation](http://mxnet.io/get_started/install.html).  
+
+- [Jupyter Notebook](http://jupyter.org/index.html) and [Python 
Requests](http://docs.python-requests.org/en/master/) packages.
+```
+pip install jupyter requests
+```
+
+- Basic knowledge of Symbol in MXNet. See the detailed tutorial for Symbol in 
[Symbol - Neural network graphs and 
auto-differentiation](https://mxnet.incubator.apache.org/tutorials/basic/symbol.html).
+
+- Basic knowledge of CSRNDArray in MXNet. See the detailed tutorial for 
CSRNDArray in [TODO(haibin) Add Link 
Here](http://ec2-54-187-32-207.us-west-2.compute.amazonaws.com/tutorials/sparse/csr.html).
+
+- Basic knowledge of RowSparseNDArray in MXNet. See the detailed tutorial for 
RowSparseNDArray in [TODO(haibin) Add Link 
Here](http://ec2-54-187-32-207.us-west-2.compute.amazonaws.com/tutorials/sparse/rowsparse.html).
+
+## Variables
+
+Variables are placeholders for arrays. We can use them to hold sparse arrays too.
+
+### Variable Storage Types
+
+The `stype` attribute of a variable indicates the storage type of the array it holds.
+By default, the `stype` of a variable is "default", which indicates the default dense storage format.
+We can set the `stype` of a variable to "csr" or "row_sparse" to hold sparse arrays.
+
+
+```python
+import mxnet as mx
+# Create a variable to hold an NDArray
+a = mx.sym.Variable('a')
+# Create a variable to hold a CSRNDArray
+b = mx.sym.Variable('b', stype='csr')
+# Create a variable to hold a RowSparseNDArray
+c = mx.sym.Variable('c', stype='row_sparse')
+(a, b, c)
+```
+
+### Bind with Sparse Arrays
+
+The sparse symbols constructed above declare the storage types of the arrays they will hold.
+To evaluate them, we need to feed the free variables with sparse data.
+
+You can instantiate an executor from a sparse symbol by using the `simple_bind` method,
+which allocates zeros to all free variables according to their storage types.
+The executor provides a `forward` method for evaluation and an `outputs` attribute for retrieving all the results. Later, we will show the use of the `backward` method and other methods for computing the gradients and updating parameters. A simple example first:
+
+
+```python
+shape = (2,2)
+# Instantiate an executor from sparse symbols
+b_exec = b.simple_bind(ctx=mx.cpu(), b=shape)
+c_exec = c.simple_bind(ctx=mx.cpu(), c=shape)
+b_exec.forward()
+c_exec.forward()
+# Sparse arrays of zeros are bound to b and c
+print(b_exec.outputs, c_exec.outputs)
+```
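+
+For illustration, a quick check that the outputs carry the declared storage types (a sketch, assuming the executor outputs preserve the variables' stypes):
+
+```python
+# expect ('csr', 'row_sparse')
+(b_exec.outputs[0].stype, c_exec.outputs[0].stype)
+```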
+
+You can update the array held by a variable by accessing the executor's `arg_dict` and assigning new values.
+
+
+```python
+# update the value of `b` through the executor bound above
+b_exec.arg_dict['b'][:] = mx.nd.ones(shape).tostype('csr')
+b_exec.forward()
+# the array held by `b` is now all ones
+eval_b = b_exec.outputs[0]
+{'eval_b': eval_b, 'eval_b.asnumpy()': eval_b.asnumpy()}
+```
+
+## Symbol Composition and Storage Type Inference
+
+### Basic Symbol Composition
+
+The following example builds a simple element-wise addition expression with 
different storage types.
+The sparse symbols are available in the `mx.sym.sparse` package.
+
+
+```python
+# Element-wise addition of variables with "default" stype
+d = mx.sym.elemwise_add(a, a)
+# Element-wise addition of variables with "csr" stype
+e = mx.sym.sparse.elemwise_add(b, b)
+# Element-wise addition of variables with "row_sparse" stype
+f = mx.sym.sparse.elemwise_add(c, c)
+{'d':d, 'e':e, 'f':f}
+```
+
+### Storage Type Inference
+
+What will be the output storage types of sparse symbols? In MXNet, for any sparse symbol, the result storage types are inferred from the storage types of the inputs.
+You can read the [Sparse Symbol API](https://mxnet.io/api/python/symbol/sparse.html) documentation to find out what the output storage types are. In the example below we will try out the storage types introduced in the Row Sparse and Compressed Sparse Row tutorials: `default` (dense), `csr`, and `row_sparse`.
+
+
+```python
+add_exec = mx.sym.Group([d, e, f]).simple_bind(ctx=mx.cpu(), a=shape, b=shape, 
c=shape)
+add_exec.forward()
+dense_add = add_exec.outputs[0]
+# The output storage type of elemwise_add(csr, csr) will be inferred as "csr"
+csr_add = add_exec.outputs[1]
+```

[GitHub] bhavinthaker commented on a change in pull request #7921: Add three sparse tutorials

2017-09-24 Thread git
bhavinthaker commented on a change in pull request #7921: Add three sparse 
tutorials
URL: https://github.com/apache/incubator-mxnet/pull/7921#discussion_r140676905
 
 

 ##
 File path: docs/tutorials/sparse/train.md
 ##
 @@ -0,0 +1,256 @@
+
+You can update the array held by the variable by accessing executor's 
`arg_dict` and assigning new values.
+
+
+```python
+var_exec.arg_dict['b'][:] = mx.nd.ones(shape).tostype('csr')
 
 Review comment:
   FYI: NameError: name 'var_exec' is not defined
   
 




[GitHub] bhavinthaker commented on a change in pull request #7921: Add three sparse tutorials

2017-09-24 Thread git
bhavinthaker commented on a change in pull request #7921: Add three sparse 
tutorials
URL: https://github.com/apache/incubator-mxnet/pull/7921#discussion_r140676545
 
 

 ##
 File path: docs/tutorials/sparse/rowsparse.md
 ##
 @@ -0,0 +1,383 @@
+
+
+## Array Creation
+
+You can create a `RowSparseNDArray` with data and indices by using the 
`row_sparse_array` function:
+
+
+```python
+import mxnet as mx
+import numpy as np
+# Create a RowSparseNDArray with python lists
+shape = (6, 2)
+data_list = [[1, 2], [3, 4]]
+indices_list = [1, 4]
+a = mx.nd.sparse.row_sparse_array(data_list, indices_list, shape)
+# Create a RowSparseNDArray with numpy arrays
+data_np = np.array(data_list)
+indices_np = np.array(indices_list)
+b = mx.nd.sparse.row_sparse_array(data_np, indices_np, shape)
+{'a': a, 'b': b}
+```
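+
+For illustration, both arrays report the row sparse storage type:
+
+```python
+# both report 'row_sparse'
+(a.stype, b.stype)
+```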

[GitHub] bhavinthaker commented on a change in pull request #7921: Add three sparse tutorials

2017-09-24 Thread git
bhavinthaker commented on a change in pull request #7921: Add three sparse 
tutorials
URL: https://github.com/apache/incubator-mxnet/pull/7921#discussion_r140672144
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,338 @@
+
+# CSRNDArray - NDArray in Compressed Sparse Row Storage Format
+
+Many real world datasets deal with high dimensional sparse feature vectors. 
Take for instance a recommendation system where the number of categories and 
users is on the order of millions. The purchase data for each category by user 
would show that most users only make a few purchases, leading to a dataset with 
high sparsity (i.e. most of the elements are zeros).
+
+Storing and manipulating such large sparse matrices in the default dense 
structure results in wasted memory and processing on the zeros. To take 
advantage of the sparse structure of the matrix, the `CSRNDArray` in MXNet 
stores the matrix in [compressed sparse row 
(CSR)](https://en.wikipedia.org/wiki/Sparse_matrix#Compressed_sparse_row_.28CSR.2C_CRS_or_Yale_format.29)
 format and uses specialized algorithms in operators.
+**The format is designed for 2D matrices with a large number of columns,
+and each row is sparse (i.e. with only a few nonzeros).**
+
+## Advantages of Compressed Sparse Row NDArray (CSRNDArray)
+For matrices of high sparsity (e.g. ~1% non-zeros), there are two primary 
advantages of `CSRNDArray` over the existing `NDArray`:
+
+- memory consumption is reduced significantly
+- certain operations are much faster (e.g. matrix-vector multiplication)
+
+You may be familiar with the CSR storage format in 
[SciPy](https://www.scipy.org/) and will note the similarities in MXNet's 
implementation. However there are some additional competitive features in 
`CSRNDArray` inherited from `NDArray`, such as lazy evaluation and automatic 
parallelization that are not available in SciPy's flavor of CSR.
+
+The introduction of `CSRNDArray` also brings a new attribute, `stype` as a 
holder for storage type info, to `NDArray`. You can query **ndarray.stype** now 
in addition to the oft-queried attributes such as **ndarray.shape**, 
**ndarray.dtype**, and **ndarray.context**. For a typical dense NDArray, the 
value of `stype` is **"default"**. For a `CSRNDArray`, the value of stype is 
**"csr"**.
+
+## Prerequisites
+
+To complete this tutorial, you will need:
+
+- MXNet. See the instructions for your operating system in [Setup and 
Installation](http://mxnet.io/get_started/install.html)
+- [Jupyter](http://jupyter.org/)
+```
+pip install jupyter
+```
+- Basic knowledge of NDArray in MXNet. See the detailed tutorial for NDArray 
in [NDArray - Imperative tensor operations on 
CPU/GPU](https://mxnet.incubator.apache.org/tutorials/basic/ndarray.html).
+- SciPy - A section of this tutorial uses SciPy package in Python. If you 
don't have SciPy, the example in that section will be ignored.
+- GPUs - A section of this tutorial uses GPUs. If you don't have GPUs on your 
machine, simply set the variable `gpu_device` (set in the GPUs section of this 
tutorial) to `mx.cpu()`.
+
+## Compressed Sparse Row Matrix
+
+A CSRNDArray represents a 2D matrix as three separate 1D arrays: **data**, 
**indptr** and **indices**, where the column indices for row `i` are stored in 
`indices[indptr[i]:indptr[i+1]]` in ascending order, and their corresponding 
values are stored in `data[indptr[i]:indptr[i+1]]`.
+
+- **data**: CSR format data array of the matrix
+- **indices**: CSR format index array of the matrix
+- **indptr**: CSR format index pointer array of the matrix
+
+### Example Matrix Compression
+
+For example, given the matrix:
+```
+[[7, 0, 8, 0]
+ [0, 0, 0, 0]
+ [0, 9, 0, 0]]
+```
+
+We can compress this matrix using CSR, and to do so we need to calculate 
`data`, `indices`, and `indptr`.
+
+The `data` array holds all the non-zero entries of the matrix in row-major 
order. Put another way, you create a data array that has all of the zeros 
removed from the matrix, row by row, storing the numbers in that order. Your 
result:
+
+data = [7, 8, 9]
+
+The `indices` array stores the column index for each non-zero element in 
`data`. As you cycle through the data array, starting with 7, you can see it is 
in column 0. Then looking at 8, you can see it is in column 2. Lastly 9 is in 
column 1. Your result:
+
+indices = [0, 2, 1]
+
+The `indptr` array identifies the rows in which the data appears. It stores the offset into `data` of the first non-zero element of each row of the matrix. This array always starts with 0 (the reasons for this can be explored later), so indptr[0] is 0. Each subsequent value in the array is the cumulative number of non-zero elements up through that row. Looking at the first row of the matrix you can see two non-zero values, so indptr[1] is 2. The next row contains all zeros, so the cumulative count stays at 2 and indptr[2] is 2. Finally, the last row contains one non-zero element, bringing the cumulative count to 3, so indptr[3] is 3. Your result:
+
+indptr = [0, 2, 2, 3]
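+
+As a quick check, here is a minimal sketch that rebuilds the example matrix from these three arrays, mirroring the `csr_matrix` construction used in these tutorials:
+
+```python
+import mxnet as mx
+# reconstruct the 3x4 example matrix from its CSR components
+data = [7, 8, 9]
+indices = [0, 2, 1]
+indptr = [0, 2, 2, 3]
+a = mx.nd.sparse.csr_matrix(data, indptr, indices, (3, 4))
+a.asnumpy()  # recovers the dense matrix shown above
+```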

[GitHub] bhavinthaker commented on a change in pull request #7921: Add three sparse tutorials

2017-09-24 Thread git
bhavinthaker commented on a change in pull request #7921: Add three sparse 
tutorials
URL: https://github.com/apache/incubator-mxnet/pull/7921#discussion_r140674775
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,338 @@
+
+# CSRNDArray - NDArray in Compressed Sparse Row Storage Format
+
+Many real-world datasets deal with high-dimensional sparse feature vectors. Take for instance a recommendation system where the number of categories and users is on the order of millions. The purchase data for each category by user would show that most users make only a few purchases, leading to a dataset with high sparsity (i.e. most of the elements are zeros).
+
+Storing and manipulating such large sparse matrices in the default dense 
structure results in wasted memory and processing on the zeros. To take 
advantage of the sparse structure of the matrix, the `CSRNDArray` in MXNet 
stores the matrix in [compressed sparse row 
(CSR)](https://en.wikipedia.org/wiki/Sparse_matrix#Compressed_sparse_row_.28CSR.2C_CRS_or_Yale_format.29)
 format and uses specialized algorithms in operators.
+**The format is designed for 2D matrices with a large number of columns,
+and each row is sparse (i.e. with only a few nonzeros).**
+

[GitHub] bhavinthaker commented on a change in pull request #7921: Add three sparse tutorials

2017-09-24 Thread git
bhavinthaker commented on a change in pull request #7921: Add three sparse 
tutorials
URL: https://github.com/apache/incubator-mxnet/pull/7921#discussion_r140671190
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,338 @@
+## Advantages of Compressed Sparse Row NDArray (CSRNDArray)
+For matrices of high sparsity (e.g. ~1% non-zeros), there are two primary 
advantages of `CSRNDArray` over the existing `NDArray`:
 
 Review comment:
   You may want to introduce both terms, sparsity and density. Say, "For matrices of high sparsity, also known as low density (e.g. ~1% non-zeros = ~1% density), there are ..."
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] bhavinthaker commented on a change in pull request #7921: Add three sparse tutorials

2017-09-24 Thread git
bhavinthaker commented on a change in pull request #7921: Add three sparse 
tutorials
URL: https://github.com/apache/incubator-mxnet/pull/7921#discussion_r140672649
 
 

 ##
 File path: docs/tutorials/sparse/csr.md
 ##
 @@ -0,0 +1,338 @@

[GitHub] bhavinthaker commented on a change in pull request #7921: Add three sparse tutorials

2017-09-24 Thread git
bhavinthaker commented on a change in pull request #7921: Add three sparse 
tutorials
URL: https://github.com/apache/incubator-mxnet/pull/7921#discussion_r140679971
 
 

 ##
 File path: docs/tutorials/sparse/train.md
 ##
 @@ -0,0 +1,256 @@
+
+# Train a Linear Regression Model with Sparse Symbols
+In previous tutorials, we introduced `CSRNDArray` and `RowSparseNDArray`,
+the basic data structures for manipulating sparse data.
+MXNet also provides the `Sparse Symbol` API, which enables symbolic expressions that handle sparse arrays.
+In this tutorial, we first focus on how to compose a symbolic graph with 
sparse operators,
+then train a linear regression model using sparse symbols with the Module API.
+
+## Prerequisites
+
+To complete this tutorial, we need:
+
+- MXNet. See the instructions for your operating system in [Setup and 
Installation](http://mxnet.io/get_started/install.html).  
+
+- [Jupyter Notebook](http://jupyter.org/index.html) and [Python 
Requests](http://docs.python-requests.org/en/master/) packages.
+```
+pip install jupyter requests
+```
+
+- Basic knowledge of Symbol in MXNet. See the detailed tutorial for Symbol in 
[Symbol - Neural network graphs and 
auto-differentiation](https://mxnet.incubator.apache.org/tutorials/basic/symbol.html).
+
+- Basic knowledge of CSRNDArray in MXNet. See the detailed tutorial for 
CSRNDArray in [TODO(haibin) Add Link 
Here](http://ec2-54-187-32-207.us-west-2.compute.amazonaws.com/tutorials/sparse/csr.html).
+
+- Basic knowledge of RowSparseNDArray in MXNet. See the detailed tutorial for 
RowSparseNDArray in [TODO(haibin) Add Link 
Here](http://ec2-54-187-32-207.us-west-2.compute.amazonaws.com/tutorials/sparse/rowsparse.html).
+
+## Variables
+
+Variables are placeholders for arrays. We can use them to hold sparse arrays, too.
+
+### Variable Storage Types
+
+The `stype` attribute of a variable is used to indicate the storage type of 
the array.
+By default, the `stype` of a variable is "default" which indicates the default 
dense storage format.
+We can specify the `stype` of a variable as "csr" or "row_sparse" to hold 
sparse arrays.
+
+
+```python
+import mxnet as mx
+# Create a variable to hold an NDArray
+a = mx.sym.Variable('a')
+# Create a variable to hold a CSRNDArray
+b = mx.sym.Variable('b', stype='csr')
+# Create a variable to hold a RowSparseNDArray
+c = mx.sym.Variable('c', stype='row_sparse')
+(a, b, c)
+```
+
+### Bind with Sparse Arrays
+
+The sparse symbols constructed above declare the storage types of the arrays they will hold.
+To evaluate them, we need to feed the free variables with sparse data.
+
+You can instantiate an executor from a sparse symbol by using the `simple_bind` method, which allocates zeros to all free variables according to their storage types.
+The executor provides a `forward` method for evaluation and an `outputs` attribute for fetching all the results. Later, we will show the use of the `backward` method and other methods for computing gradients and updating parameters. A simple example first:
+
+
+```python
+shape = (2,2)
+# Instantiate an executor from sparse symbols
+b_exec = b.simple_bind(ctx=mx.cpu(), b=shape)
+c_exec = c.simple_bind(ctx=mx.cpu(), c=shape)
+b_exec.forward()
+c_exec.forward()
+# Sparse arrays of zeros are bound to b and c
+print(b_exec.outputs, c_exec.outputs)
+```
+
+You can update the array held by a variable by accessing the executor's `arg_dict` and assigning new values.
+
+
+```python
+b_exec.arg_dict['b'][:] = mx.nd.ones(shape).tostype('csr')
+b_exec.forward()
+# The array held by `b` is now updated to all ones
+eval_b = b_exec.outputs[0]
+{'eval_b': eval_b, 'eval_b.asnumpy()': eval_b.asnumpy()}
+```
+
+## Symbol Composition and Storage Type Inference
+
+### Basic Symbol Composition
+
+The following example builds simple element-wise expressions with different storage types.
+The sparse symbols are available in the `mx.sym.sparse` package.
+
+
+```python
+# Element-wise addition of variables with "default" stype
+d = mx.sym.elemwise_add(a, a)
+# Element-wise negation of a variable with "csr" stype
+e = mx.sym.sparse.negative(b)
+# Element-wise addition of variables with "row_sparse" stype
+f = mx.sym.sparse.elemwise_add(c, c)
+{'d':d, 'e':e, 'f':f}
+```
+
+### Storage Type Inference
+
+What will the output storage types of sparse symbols be? In MXNet, the result storage types of any sparse symbol are inferred from the storage types of its inputs.
+You can read the [Sparse Symbol API](mxnet.io/api/python/symbol/sparse.html) documentation to find out what the output storage types are. In the example below we will try out the storage types introduced in the Row Sparse and Compressed Sparse Row tutorials: `default` (dense), `csr`, and `row_sparse`.
 
 Review comment:
   Add http:// to the URL.
 


[GitHub] bhavinthaker commented on a change in pull request #7921: Add three sparse tutorials

2017-09-24 Thread git
bhavinthaker commented on a change in pull request #7921: Add three sparse 
tutorials
URL: https://github.com/apache/incubator-mxnet/pull/7921#discussion_r140676413
 
 

 ##
 File path: docs/tutorials/sparse/rowsparse.md
 ##
 @@ -0,0 +1,383 @@
+
+A RowSparseNDArray is typically used to represent non-zero row slices of a large NDArray of shape `[LARGE0, D1, ..., Dn]`, where `LARGE0 >> D0` and most row slices are zeros.
+
+Given this two-dimensional matrix:
+
+
+```python
+[[ 1, 2, 3],
+ [ 0, 0, 0],
+ [ 4, 0, 5],
+ [ 0, 0, 0],
+ [ 0, 0, 0]]
+```
+
+The row sparse representation would be:
+- the `data` array holds all the non-zero row slices of the array.
+- the `indices` array stores the row index of each row slice with non-zero elements.
+
+
+
+```python
+data = [[1, 2, 3], [4, 0, 5]]
+indices = [0, 2]
+```
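+
+As a quick sanity check, this representation can be rebuilt with `row_sparse_array` (a minimal sketch; array creation is covered in more detail below):
+
+```python
+import mxnet as mx
+data = [[1, 2, 3], [4, 0, 5]]
+indices = [0, 2]
+rsp = mx.nd.sparse.row_sparse_array(data, indices, (5, 3))
+print(rsp.stype)  # "row_sparse"
+rsp.asnumpy()     # recovers the dense 5x3 matrix shown above
+```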
+
+`RowSparseNDArray` supports multidimensional arrays. Given this 3D tensor:
+
+
+```python
+[[[1, 0],
+  [0, 2],
+  [3, 4]],
+
+ [[5, 0],
+  [6, 0],
+  [0, 0]],
+
+ [[0, 0],
+  [0, 0],
+  [0, 0]]]
+```
+
+The row sparse representation would be (with `data` and `indices` playing the same roles as above):
+
+
+```python
+data = [[[1, 0], [0, 2], [3, 4]], [[5, 0], [6, 0], [0, 0]]]
+indices = [0, 1]
+```
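+
+The same round trip works for the 3D tensor, assuming `row_sparse_array` accepts multidimensional data in the same way (a sketch, not canonical usage):
+
+```python
+import mxnet as mx
+data = [[[1, 0], [0, 2], [3, 4]], [[5, 0], [6, 0], [0, 0]]]
+indices = [0, 1]
+# a stored row slice keeps its internal zeros; only whole all-zero
+# row slices (here, the slice at index 2) are compressed away
+rsp3d = mx.nd.sparse.row_sparse_array(data, indices, (3, 3, 2))
+rsp3d.asnumpy()
+```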
+
+``RowSparseNDArray`` is a subclass of ``NDArray``. If you query the **stype** of a RowSparseNDArray, the value will be **"row_sparse"**.
+
+## Array Creation
+
+You can create a `RowSparseNDArray` with data and indices by using the 
`row_sparse_array` function:
+
+
+```python
+import mxnet as mx
+import numpy as np
+# Create a RowSparseNDArray with python lists
+shape = (6, 2)
+data_list = [[1, 2], [3, 4]]
+indices_list = [1, 4]
+a = mx.nd.sparse.row_sparse_array(data_list, indices_list, shape)
+# Create a RowSparseNDArray with numpy arrays