[jira] [Comment Edited] (SYSTEMML-1000) Allow users to pass non-1 bias filler in conv_builtin.dml

Mike Dusenberry (JIRA) Sat, 01 Oct 2016 13:56:35 -0700

    [ 
https://issues.apache.org/jira/browse/SYSTEMML-1000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15539139#comment-15539139
 ]


Mike Dusenberry edited comment on SYSTEMML-1000 at 10/1/16 8:55 PM:
--------------------------------------------------------------------

So just to be clear, I don't have the bias hard-coded to 1 in the conv_builtin 
layer, or any of the layers.  The user just passes in an arbitrary weight 
matrix {{W}} and bias matrix {{b}} with the correct dimensions, and any 
arbitrary values ([line 25 | 
https://github.com/apache/incubator-systemml/blob/master/scripts/staging/SystemML-NN/nn/layers/conv_builtin.dml#L25],
 [line 37 | 
https://github.com/apache/incubator-systemml/blob/master/scripts/staging/SystemML-NN/nn/layers/conv_builtin.dml#L37]).
  The {{ones}} matrix on line 72 is just used to replicate the bias vector from 
(F, 1) shape to (F, Hout*Wout) so that it can then be flattened and added to 
{{out}}.  So, the user is free to initialize a weight matrix {{W}} and bias 
vector {{b}} with any of the initialization plans that you mentioned and pass 
them into the layer. :)
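
To illustrate the replication step, here is a rough plain-Python sketch (not 
the DML from the layer itself).  The helper name {{add_bias}} and the 
filter-major layout of each row of {{out}} are assumptions made for this 
example only.

```python
def add_bias(out, b, hout, wout):
    """Add a per-filter bias to flattened conv outputs.

    out: N rows, each of length F*hout*wout (filter-major layout, assumed).
    b:   F rows of one value each, i.e. shape (F, 1).
    """
    ones = [1.0] * (hout * wout)  # shape (1, Hout*Wout)
    # Outer product of b and ones: shape (F, Hout*Wout), row f is b[f] repeated.
    b_rep = [[row[0] * o for o in ones] for row in b]
    # Flatten row-major into a single row of length F*Hout*Wout.
    b_flat = [v for row in b_rep for v in row]
    # Add the flattened bias to every one of the N examples.
    return [[x + bv for x, bv in zip(example, b_flat)] for example in out]


out = [[0.0] * 8, [1.0] * 8]  # N=2, F=2, Hout=Wout=2
b = [[0.5], [-3.0]]           # arbitrary bias values, not fixed to 1
result = add_bias(out, b, 2, 2)
```

The {{ones}} vector only controls how many times each bias value is repeated; 
the values themselves come entirely from whatever {{b}} the caller passes in.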

That being said, I do provide {{init}} *convenience* initialization functions 
for the layers with parameters that return new {{W}} and {{b}} matrices -- 
[line 120 | 
https://github.com/apache/incubator-systemml/blob/master/scripts/staging/SystemML-NN/nn/layers/conv_builtin.dml#L120].
  Basically, these are just *helper functions* that generate the parameters for 
that layer, but the user is *not required* to use them.  Currently, for those 
{{init}} functions, I am just using sane defaults, which right now are the 
initialization from He et al. [http://arxiv.org/abs/1502.01852] for the weight 
matrix, and 0 for the bias vector.  The plan is to extend this to add the 
common initialization schemes, such as the ones you mentioned, and those on 
https://keras.io/initializations/ (I use the "he_normal" initialization on this 
list).
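
As a rough picture of what such a helper does, here is a plain-Python sketch 
of a He-normal style initializer with a zero bias.  The function name 
{{he_normal_init}}, its argument names, and the (F, C*Hf*Wf) weight shape are 
assumptions for this example, not the signature of the {{init}} function in 
the layer.

```python
import math
import random


def he_normal_init(f, c, hf, wf, seed=None):
    """Return (W, b) for a conv layer with f filters over c input channels.

    W has shape (f, c*hf*wf) with entries drawn from N(0, 2/fan_in),
    where fan_in = c*hf*wf (He et al.).  b has shape (f, 1), all zeros.
    """
    rng = random.Random(seed)
    fan_in = c * hf * wf
    std = math.sqrt(2.0 / fan_in)
    W = [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(f)]
    b = [[0.0] for _ in range(f)]
    return W, b


W, b = he_normal_init(f=64, c=3, hf=3, wf=3, seed=42)
```

Because the helper only returns plain matrices, a caller can keep {{W}} and 
swap in any other bias, for example a constant-filled (F, 1) matrix, before 
passing both to the layer.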

In terms of integrating with Caffe now, you should be able to use the filler 
functions you've written to create the {{W}} and {{b}} matrices, and then 
simply pass them in. :)


> Allow users to pass non-1 bias filler in conv_builtin.dml
> ---------------------------------------------------------
>
>                 Key: SYSTEMML-1000
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1000
>             Project: SystemML
>          Issue Type: Wish
>            Reporter: Niketan Pansare
>
> Useful pre-step to incorporate nn functions with caffe
> https://github.com/apache/incubator-systemml/blob/master/scripts/staging/SystemML-NN/nn/layers/conv_builtin.dml#L72
> https://github.com/apache/incubator-systemml/pull/158
> [[email protected]]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
