Repository: incubator-systemml
Updated Branches:
  refs/heads/gh-pages 9e715abcb -> 9e457e5bf


[SYSTEMML-540] Initial implementation of conv2d/pooling builtin function


Project: http://git-wip-us.apache.org/repos/asf/incubator-systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-systemml/commit/686bf815
Tree: http://git-wip-us.apache.org/repos/asf/incubator-systemml/tree/686bf815
Diff: http://git-wip-us.apache.org/repos/asf/incubator-systemml/diff/686bf815

Branch: refs/heads/gh-pages
Commit: 686bf815e5e459511bba33909124b21ced965299
Parents: 9e715ab
Author: Niketan Pansare <npan...@us.ibm.com>
Authored: Mon May 16 16:43:39 2016 -0800
Committer: Niketan Pansare <npan...@us.ibm.com>
Committed: Mon May 16 17:45:39 2016 -0700

----------------------------------------------------------------------
 devdocs/deep-learning.md | 122 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 122 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-systemml/blob/686bf815/devdocs/deep-learning.md
----------------------------------------------------------------------
diff --git a/devdocs/deep-learning.md b/devdocs/deep-learning.md
new file mode 100644
index 0000000..46f2502
--- /dev/null
+++ b/devdocs/deep-learning.md
@@ -0,0 +1,122 @@
+# Initial prototype for Deep Learning
+
+## Representing tensors and images in SystemML
+
+In this prototype, we represent a tensor as a matrix stored in row-major format,
+where the first dimension of the tensor and the matrix are exactly the same. For example,
+a tensor (with all zeros) of shape [3, 2, 4, 5] can be instantiated by the following DML statement:
+``` python
+A = matrix(0, rows=3, cols=2*4*5)  # tensor of shape [3, 2, 4, 5]
+```
+### Tensor functions:
+
+#### Element-wise arithmetic operators:
+The following operators work out-of-the-box when both tensors X and Y have the same shape:
+
+* Element-wise exponentiation: `X ^ Y`
+* Element-wise unary minus: `-X`
+* Element-wise integer division: `X %/% Y`
+* Element-wise modulus operation: `X %% Y`
+* Element-wise multiplication: `X * Y`
+* Element-wise division: `X / Y`
+* Element-wise addition: `X + Y`
+* Element-wise subtraction: `X - Y`
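+
+As a minimal sketch (the shape [2, 3, 4, 5] below is chosen arbitrarily for illustration), these operators apply cell-wise on the underlying matrix representation:
+``` python
+# two tensors of shape [2, 3, 4, 5], each stored as a 2 x 60 matrix
+X = matrix(1, rows=2, cols=3*4*5)
+Y = matrix(2, rows=2, cols=3*4*5)
+Z = (X + Y) * X - Y / X   # element-wise operations act cell by cell
+```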
+
+SystemML does not support implicit broadcasting for the above tensor operations; however, one can write a DML-bodied function to do so.
+For example, to perform the above operations with broadcasting on the second dimension, one can use the `rep(Z, C)` function below:
+``` python
+# Replicates Z C times along the columns (i.e., along the second tensor dimension)
+rep = function(matrix[double] Z, int C) return (matrix[double] ret) {
+       ret = Z
+       for(i in 2:C) {
+               ret = cbind(ret, Z)
+       }
+}
+```
+Using the above `rep(Z, C)` function, we can realize element-wise arithmetic operations with broadcasting. Here are some examples:
+* X of shape [N, C, H, W] and Y of shape [1, C, H, W]: `X + Y` (Note: SystemML does implicit broadcasting in this case because of the way it represents the tensor)
+* X of shape [1, C, H, W] and Y of shape [N, C, H, W]: `X + Y` (Note: SystemML does implicit broadcasting in this case because of the way it represents the tensor)
+* X of shape [N, C, H, W] and Y of shape [N, 1, H, W]: `X + rep(Y, C)`
+* X of shape [N, C, H, W] and Y of shape [1, 1, H, W]: `X + rep(Y, C)`
+* X of shape [N, 1, H, W] and Y of shape [N, C, H, W]: `rep(X, C) + Y`
+* X of shape [1, 1, H, W] and Y of shape [N, C, H, W]: `rep(X, C) + Y`
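+
+A minimal sketch of the `X + rep(Y, C)` case (the concrete values of N, C, H, W below are chosen purely for illustration):
+``` python
+N = 2
+C = 3
+H = 4
+W = 5
+X = matrix(1, rows=N, cols=C*H*W)   # tensor of shape [N, C, H, W]
+Y = matrix(2, rows=N, cols=1*H*W)   # tensor of shape [N, 1, H, W]
+Z = X + rep(Y, C)                   # broadcast Y along the channel dimension
+```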
+
+TODO: Map the NumPy tensor calls to DML expressions.
+
+## Representing images in SystemML
+
+The images are assumed to be stored in NCHW format, where N = batch size, C = number of channels, H = height of the image and W = width of the image.
+Hence, the images are internally represented as a matrix with dimension (N, C * H * W).
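+
+For example, a batch of 64 RGB images of size 32 x 32 (initialized to all zeros here just for illustration) would be represented as:
+``` python
+images = matrix(0, rows=64, cols=3*32*32)   # N=64, C=3, H=32, W=32
+```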
+
+## Convolution and Pooling built-in functions
+
+This prototype also contains an initial implementation of the forward/backward functions for 2D convolution and pooling:
+* `conv2d(x, w, ...)`
+* `conv2d_backward_filter(x, dout, ...)` and `conv2d_backward_data(w, dout, ...)`
+* `max_pool(x, ...)` and `max_pool_backward(x, dout, ...)`
+
+The required arguments for all of the above functions are:
+* stride=[stride_h, stride_w]
+* padding=[pad_h, pad_w]
+* input_shape=[numImages, numChannels, height_image, width_image]
+
+The additional required argument for the conv2d/conv2d_backward_filter/conv2d_backward_data functions is:
+* filter_shape=[numFilters, numChannels, height_filter, width_filter]
+
+The additional required argument for the max_pool/max_pool_backward functions is:
+* pool_size=[height_pool, width_pool]
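+
+A minimal sketch of these calls (the batch size, image size and number of filters below are hypothetical, chosen only to make the shapes concrete):
+``` python
+X = matrix(0, rows=16, cols=1*28*28)    # 16 single-channel 28 x 28 images
+W = matrix(0, rows=32, cols=1*5*5)      # 32 filters of size 5 x 5
+out = conv2d(X, W, stride=[1, 1], padding=[0, 0],
+             input_shape=[16, 1, 28, 28], filter_shape=[32, 1, 5, 5])
+# out has shape [16, 32, 24, 24], i.e. a 16 x (32*24*24) matrix
+pooled = max_pool(out, stride=[2, 2], padding=[0, 0],
+                  input_shape=[16, 32, 24, 24], pool_size=[2, 2])
+# pooled has shape [16, 32, 12, 12]
+```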
+
+The results of these functions are consistent with NVIDIA's cuDNN library.
+
+### Border mode:
+* To perform valid padding, use `padding = (input_shape-filter_shape)*(stride-1)/2`. (Hint: for stride length of 1, `padding = [0, 0]` performs valid padding).
+
+* To perform full padding, use `padding = ((stride-1)*input_shape + (stride+1)*filter_shape - 2*stride) / 2`. (Hint: for stride length of 1, `padding = [filter_h-1, filter_w-1]` performs full padding).
+
+* To perform same padding, use `padding = (input_shape*(stride-1) + filter_shape - stride)/2`. (Hint: for stride length of 1, `padding = [(filter_h-1)/2, (filter_w-1)/2]` performs same padding).
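+
+As a quick sanity check of the same-padding rule, reusing the hypothetical 5 x 5 filter from the sketch above, `padding = [(5-1)/2, (5-1)/2] = [2, 2]` keeps the spatial size unchanged:
+``` python
+out = conv2d(X, W, stride=[1, 1], padding=[2, 2],
+             input_shape=[16, 1, 28, 28], filter_shape=[32, 1, 5, 5])
+# out has shape [16, 32, 28, 28], since (28 + 2*2 - 5)/1 + 1 = 28
+```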
+
+### Explanation of backward functions for conv2d
+
+Consider a one-channel 3 X 3 image:
+
+| x1 | x2 | x3 |
+|----|----|----|
+| x4 | x5 | x6 |
+| x7 | x8 | x9 |
+
+and one 2 X 2 filter:
+
+| w1 | w2 |
+|----|----|
+| w3 | w4 |
+
+Then, `conv2d(x, w, stride=[1, 1], padding=[0, 0], input_shape=[1, 1, 3, 3], filter_shape=[1, 1, 2, 2])` produces the following tensor
+of shape `[1, 1, 2, 2]`, which is represented as a `1 X 4` matrix in NCHW format:
+
+| `w1*x1 + w2*x2 + w3*x4 + w4*x5` | `w1*x2 + w2*x3 + w3*x5 + w4*x6` | `w1*x4 + w2*x5 + w3*x7 + w4*x8` | `w1*x5 + w2*x6 + w3*x8 + w4*x9` |
+|---------------------------------|---------------------------------|---------------------------------|---------------------------------|
+
+
+Let the error propagated from the layer above be
+
+| y1 | y2 | y3 | y4 |
+|----|----|----|----|
+
+Then `conv2d_backward_filter(x, y, stride=[1, 1], padding=[0, 0], input_shape=[1, 1, 3, 3], filter_shape=[1, 1, 2, 2])` produces the following
+updates for the filter:
+
+| `y1*x1 + y2*x2 + y3*x4 + y4*x5` | `y1*x2 + y2*x3 + y3*x5 + y4*x6` |
+|---------------------------------|---------------------------------|
+| `y1*x4 + y2*x5 + y3*x7 + y4*x8` | `y1*x5 + y2*x6 + y3*x8 + y4*x9` |
+
+Note: since the above update is a tensor of shape [1, 1, 2, 2], it will be represented as a matrix of dimension [1, 4].
+
+Similarly, `conv2d_backward_data(w, y, stride=[1, 1], padding=[0, 0], input_shape=[1, 1, 3, 3], filter_shape=[1, 1, 2, 2])` produces the following
+updates for the image:
+
+
+| `w1*y1`         | `w2*y1 + w1*y2`                 | `w2*y2`         |
+|-----------------|---------------------------------|-----------------|
+| `w3*y1 + w1*y3` | `w4*y1 + w3*y2 + w2*y3 + w1*y4` | `w4*y2 + w2*y4` |
+| `w3*y3`         | `w4*y3 + w3*y4`                 | `w4*y4`         |
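+
+The tables above can be reproduced with a minimal DML sketch (the concrete values below for the image, the filter and the propagated error are placeholders chosen purely for illustration):
+``` python
+x = matrix("1 2 3 4 5 6 7 8 9", rows=1, cols=9)   # the 3 X 3 image x1..x9 as a 1 x 9 matrix
+w = matrix("1 2 3 4", rows=1, cols=4)             # the 2 X 2 filter w1..w4 as a 1 x 4 matrix
+y = matrix("1 1 1 1", rows=1, cols=4)             # the propagated error y1..y4 as a 1 x 4 matrix
+out = conv2d(x, w, stride=[1, 1], padding=[0, 0],
+             input_shape=[1, 1, 3, 3], filter_shape=[1, 1, 2, 2])                 # 1 x 4, forward output
+dw = conv2d_backward_filter(x, y, stride=[1, 1], padding=[0, 0],
+                            input_shape=[1, 1, 3, 3], filter_shape=[1, 1, 2, 2])  # 1 x 4, filter updates
+dx = conv2d_backward_data(w, y, stride=[1, 1], padding=[0, 0],
+                          input_shape=[1, 1, 3, 3], filter_shape=[1, 1, 2, 2])    # 1 x 9, image updates
+print("sum(out) = " + sum(out))
+print("sum(dw)  = " + sum(dw))
+print("sum(dx)  = " + sum(dx))
+```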
