bhavinthaker commented on a change in pull request #7921: Add three sparse 
tutorials
URL: https://github.com/apache/incubator-mxnet/pull/7921#discussion_r140676243
 
 

 ##########
 File path: docs/tutorials/sparse/rowsparse.md
 ##########
 @@ -0,0 +1,383 @@
+
+# RowSparseNDArray - NDArray for Sparse Gradient Updates
+
+## Motivation
+
+Many real-world datasets deal with high-dimensional sparse feature vectors. When learning
+the weights of models with sparse datasets, the derived gradients of the weights could be sparse.
+For example, let's say we learn a linear model ``Y = XW + b``, where the rows of ``X`` are sparse feature vectors:
+
+
+```python
+import mxnet as mx
+shape = (3, 10)
+# `X` only contains 4 non-zeros
+data = [6, 7, 8, 9]
+indptr = [0, 2, 3, 4]
+indices = [0, 4, 0, 0]
+X = mx.nd.sparse.csr_matrix(data, indptr, indices, shape)
+# the content of `X`
+X.asnumpy()
+```
+
+Some columns in `X` do not contain any non-zero values; therefore, the gradient of the weight `W` will have many row slices of all zeros, corresponding to the all-zero columns in `X`.
+
+
+```python
+W = mx.nd.random_uniform(shape=(10, 2))
+b = mx.nd.zeros((3, 1))
+# attach a gradient placeholder for W
+W.attach_grad(stype='row_sparse')
+with mx.autograd.record():
+    Y = mx.nd.dot(X, W) + b
+
+Y.backward()
+# the content of the gradients of `W`
+{'W.grad': W.grad, 'W.grad.asnumpy()': W.grad.asnumpy()}
+```
+
+Storing and manipulating such sparse matrices with many row slices of all zeros in the default dense structure results in wasted memory and processing on the zeros. More importantly, many gradient-based optimization methods such as SGD, [AdaGrad](https://stanford.edu/~jduchi/projects/DuchiHaSi10_colt.pdf) and [Adam](https://arxiv.org/pdf/1412.6980.pdf)
+can take advantage of sparse gradients and have proven to be efficient and effective.
+**In MXNet, the ``RowSparseNDArray`` stores the matrix in ``row sparse`` format, which is designed for arrays in which most row slices are all zeros.**
+In this tutorial, we will describe what the row sparse format is and how to use RowSparseNDArray for sparse gradient updates in MXNet.
+
+## Prerequisites
+
+To complete this tutorial, we need:
+
+- MXNet. See the instructions for your operating system in [Setup and 
Installation](http://mxnet.io/get_started/install.html)
+- [Jupyter](http://jupyter.org/)
+    ```
+    pip install jupyter
+    ```
+- Basic knowledge of NDArray in MXNet. See the detailed tutorial for NDArray 
in [NDArray - Imperative tensor operations on 
CPU/GPU](https://mxnet.incubator.apache.org/tutorials/basic/ndarray.html)
+- Understanding of [automatic differentiation with 
autograd](http://gluon.mxnet.io/chapter01_crashcourse/autograd.html)
+- GPUs - A section of this tutorial uses GPUs. If you don't have GPUs on your
+machine, simply set the variable `gpu_device` (set in the GPUs section of this
+tutorial) to `mx.cpu()`
+
+## Row Sparse Format
+
+A RowSparseNDArray represents a multidimensional NDArray using two separate arrays:
+`data` and `indices`.
+
+- data: an NDArray of any dtype with shape `[D0, D1, ..., Dn]`.
+- indices: a 1D int64 NDArray with shape `[D0]` with values sorted in 
ascending order.
+
+The ``indices`` array stores the indices of the row slices with non-zeros,
+while the values are stored in the ``data`` array. The corresponding dense NDArray ``dense`` represented by a RowSparseNDArray ``rsp`` satisfies
+
+``dense[rsp.indices[i], :, :, :, ...] = rsp.data[i, :, :, :, ...]``
+
+A RowSparseNDArray is typically used to represent non-zero row slices of a large NDArray of shape `[LARGE0, D1, ..., Dn]`, where `LARGE0 >> D0` and most row slices are zeros.
+
+Given this two-dimensional matrix:
+
+
+```python
+[[ 1, 2, 3],
+ [ 0, 0, 0],
+ [ 4, 0, 5],
+ [ 0, 0, 0],
+ [ 0, 0, 0]]
+```
+
+The row sparse representation would be:
+- `data` array holds all the non-zero row slices of the array.
+- `indices` array stores the row index for each row slice with non-zero 
elements.
+
+
+
+```python
+data = [[1, 2, 3], [4, 0, 5]]
+indices = [0, 2]
+```
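+
+As a quick check (a minimal sketch reusing the `row_sparse_array` function that is introduced in the Array Creation section below), constructing a `RowSparseNDArray` from this `data` and `indices` and converting it back to a dense array reproduces the original matrix, since `dense[indices[i], :] = data[i, :]`:
+
+```python
+import mxnet as mx
+data = [[1, 2, 3], [4, 0, 5]]
+indices = [0, 2]
+# construct the array with its full shape (5, 3)
+rsp = mx.nd.sparse.row_sparse_array(data, indices, (5, 3))
+# converting back to dense reproduces the original 5x3 matrix
+rsp.asnumpy()
+```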
+
+`RowSparseNDArray` supports multidimensional arrays. Given this 3D tensor:
+
+
+```python
+[[[1, 0],
+  [0, 2],
+  [3, 4]],
+
+ [[5, 0],
+  [6, 0],
+  [0, 0]],
+
+ [[0, 0],
+  [0, 0],
+  [0, 0]]]
+```
+
+The row sparse representation would be (with `data` and `indices` defined the 
same as above):
+
+
+```python
+data = [[[1, 0], [0, 2], [3, 4]], [[5, 0], [6, 0], [0, 0]]]
+indices = [0, 1]
+```
+
+``RowSparseNDArray`` is a subclass of ``NDArray``. If you query the **stype** of a RowSparseNDArray, the value will be **"row_sparse"**.
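+
+For example, a quick sketch that checks both facts on an all-zero row sparse array:
+
+```python
+rsp = mx.nd.sparse.zeros('row_sparse', (3, 2))
+# RowSparseNDArray is a subclass of NDArray, and its storage type is 'row_sparse'
+{'is NDArray': isinstance(rsp, mx.nd.NDArray), 'rsp.stype': rsp.stype}
+```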
+
+## Array Creation
+
+You can create a `RowSparseNDArray` with data and indices by using the 
`row_sparse_array` function:
+
+
+```python
+import mxnet as mx
+import numpy as np
+# Create a RowSparseNDArray with python lists
+shape = (6, 2)
+data_list = [[1, 2], [3, 4]]
+indices_list = [1, 4]
+a = mx.nd.sparse.row_sparse_array(data_list, indices_list, shape)
+# Create a RowSparseNDArray with numpy arrays
+data_np = np.array([[1, 2], [3, 4]])
+indices_np = np.array([1, 4])
+b = mx.nd.sparse.row_sparse_array(data_np, indices_np, shape)
+{'a':a, 'b':b}
+```
+
+## Function Overview
+
+Similar to `CSRNDArray`, there are several functions you can use with `RowSparseNDArray` that behave the same way. In the code blocks below you can try out these common functions:
+
+- **.dtype** - to get the data type
+- **.asnumpy** - to cast as a numpy array for inspecting it
+- **.data** - to access the data array
+- **.indices** - to access the indices array
+- **.tostype** - to convert the storage type
+- **.cast_storage** - to convert the storage type
+- **.copy** - to copy the array
+- **.copyto** - to deep copy to an existing array
+
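+For instance, here is a quick look at a few of these on the array `a` created in the Array Creation section above (a small illustrative sketch; the sections below cover each function in more detail):
+
+```python
+# peek at a few attributes of the RowSparseNDArray `a` created above
+{'a.stype': a.stype, 'a.dtype': a.dtype, 'a.indices': a.indices, 'a.data': a.data}
+```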
+
+## Setting Type
+
+You can create a `RowSparseNDArray` from another one by specifying the element data type with the option `dtype`, which accepts a numpy type. By default, `float32` is used.
+
+
+```python
+# Float32 is used by default
+c = mx.nd.sparse.array(a)
+# Create a 16-bit float array
+d = mx.nd.array(a, dtype=np.float16)
+(c.dtype, d.dtype)
+```
+
+## Inspecting Arrays
+
+As with `CSRNDArray`, you can inspect the contents of a `RowSparseNDArray` by converting it to a dense `numpy.ndarray` using the `asnumpy` function.
+
+
+```python
+a.asnumpy()
+```
+
+You can inspect the internal storage of a RowSparseNDArray by accessing 
attributes such as `indices` and `data`:
+
+
+```python
+# Access data array
+data = a.data
+# Access indices array
+indices = a.indices
+{'a.stype': a.stype, 'data':data, 'indices':indices}
+```
+
+## Storage Type Conversion
+
+You can convert an NDArray to a RowSparseNDArray and vice versa by using the 
`tostype` function:
+
+
+```python
+# Create a dense NDArray
+ones = mx.nd.ones((2,2))
+# Cast the storage type from `default` to `row_sparse`
+rsp = ones.tostype('row_sparse')
+# Cast the storage type from `row_sparse` to `default`
+dense = rsp.tostype('default')
+{'rsp':rsp, 'dense':dense}
+```
+
+You can also convert the storage type by using the `cast_storage` operator:
+
+
+```python
+# Create a dense NDArray
+ones = mx.nd.ones((2,2))
+# Cast the storage type to `row_sparse`
+rsp = mx.nd.sparse.cast_storage(ones, 'row_sparse')
+# Cast the storage type to `default`
+dense = mx.nd.sparse.cast_storage(rsp, 'default')
+{'rsp':rsp, 'dense':dense}
+```
+
+## Copies
+
+You can use the `copy` method, which makes a deep copy of the array and its data and returns a new array.
+You can also use the `copyto` method or the slice operator `[]` to deep copy into an existing array.
+
+
+```python
+a = mx.nd.ones((2,2)).tostype('row_sparse')
+b = a.copy()
+c = mx.nd.sparse.zeros('row_sparse', (2,2))
+c[:] = a
+d = mx.nd.sparse.zeros('row_sparse', (2,2))
+a.copyto(d)
+{'b is a': b is a, 'b.asnumpy()':b.asnumpy(), 'c.asnumpy()':c.asnumpy(), 
'd.asnumpy()':d.asnumpy()}
+```
+
+If the storage types of the source array and the destination array do not match,
+the storage type of the destination array will not change when copying with `copyto` or the slice operator `[]`.
+
+
+```python
+e = mx.nd.sparse.zeros('row_sparse', (2,2))
+f = mx.nd.sparse.zeros('row_sparse', (2,2))
+g = mx.nd.ones(e.shape)
+e[:] = g
+g.copyto(f)
+{'e.stype':e.stype, 'f.stype':f.stype}
+```
+
+## Retain Row Slices
+
+You can retain a subset of row slices from a RowSparseNDArray specified by 
their row indices.
+
+
+```python
+data = [[1, 2], [3, 4], [5, 6]]
+indices = [0, 2, 3]
+rsp = mx.nd.sparse.row_sparse_array(data, indices, (5, 2))
+# Retain row 0 and row 1
+rsp_retained = mx.nd.sparse.retain(rsp, mx.nd.array([0, 1]))
+{'rsp.asnumpy()': rsp.asnumpy(), 'rsp_retained': rsp_retained, 
'rsp_retained.asnumpy()': rsp_retained.asnumpy()}
+```
+
+## Sparse Operators and Storage Type Inference
+
+Operators that have specialized implementations for sparse arrays can be accessed in ``mx.nd.sparse``. You can read the [mxnet.ndarray.sparse API documentation](mxnet.io/api/python/ndarray.html) to find what sparse operators are available.
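+
+A brief illustrative sketch of storage type inference (my own example, not part of the excerpt above): the storage type of an operator's output is inferred from its inputs, so an operation that preserves sparsity is expected to keep the `row_sparse` storage type, while one that produces a dense result falls back to the `default` storage type:
+
+```python
+a = mx.nd.sparse.zeros('row_sparse', (3, 4))
+# multiplying by a scalar keeps zeros as zeros, so the result is expected to stay row_sparse
+b = a * 2
+# adding a dense array of ones produces a dense result, so the output falls back to default storage
+c = a + mx.nd.ones((3, 4))
+{'b.stype': b.stype, 'c.stype': c.stype}
+```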
 
 Review comment:
   Add http:// to the URL.
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
