[
https://issues.apache.org/jira/browse/SYSTEMML-1000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15539139#comment-15539139
]
Mike Dusenberry edited comment on SYSTEMML-1000 at 10/1/16 8:54 PM:
--------------------------------------------------------------------
So just to be clear, I don't have the bias hard-coded to 1 in the conv_builtin
layer, or any of the layers. The user just passes in an arbitrary weight
matrix {{W}} and bias matrix {{b}} with the correct dimensions, and any
arbitrary values ([line 25 |
https://github.com/apache/incubator-systemml/blob/master/scripts/staging/SystemML-NN/nn/layers/conv_builtin.dml#L25],
[line 37 |
https://github.com/apache/incubator-systemml/blob/master/scripts/staging/SystemML-NN/nn/layers/conv_builtin.dml#L37]).
The {{ones}} matrix on line 72 is just used to replicate the bias vector from
(F, 1) shape to (F, Hout*Wout) so that it can then be flattened and added to
{{out}}. So, the user is free to initialize a weight matrix {{W}} and bias
vector {{b}} with any of the initialization plans that you mentioned and pass
them into the layer. :)
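For illustration, here is a rough plain-Python sketch of the replication described above. It is not the DML code from conv_builtin.dml; the function name {{add_bias}} and the list-of-lists layout are assumptions made only for this example.

```python
def add_bias(out, b, hout, wout):
    """Add a per-filter bias to flattened convolution outputs.

    out: N rows, each of length F*hout*wout (one flattened example per row)
    b:   F rows, each a one-element list, i.e. shape (F, 1)
    Hypothetical helper: mirrors the description in the comment, not the DML source.
    """
    ones = [1.0] * (hout * wout)  # shape (1, Hout*Wout)
    # Outer product of b and ones: shape (F, Hout*Wout), row f is b[f] repeated.
    replicated = [[row[0] * o for o in ones] for row in b]
    # Flatten row-major to a single row of length F*Hout*Wout.
    flat = [v for row in replicated for v in row]
    # Add the same flattened bias row to every example.
    return [[x + y for x, y in zip(row, flat)] for row in out]
```

Because the values in {{b}} are only ever multiplied by the {{ones}} row, whatever the user passes in is what ends up added to {{out}}.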
That being said, I do provide {{init}} *convenience* initialization functions
for the layers with parameters that return new {{W}} and {{b}} matrices --
https://github.com/apache/incubator-systemml/blob/master/scripts/staging/SystemML-NN/nn/layers/conv_builtin.dml#L120.
Basically, these are just *helper functions* that generate the parameters for
that layer, but the user is *not required* to use them. Currently, those
{{init}} functions just use sane defaults, which right now are the
initialization from He et al. [http://arxiv.org/abs/1502.01852] for the weight
matrix, and 0 for the bias vector. The plan is to extend this to add the
common initialization schemes, such as the ones you mentioned, and those on
https://keras.io/initializations/ (I use the "he_normal" initialization on this
list).
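As a rough sketch of what such a default looks like, the following plain-Python snippet draws the weights from a normal distribution with standard deviation sqrt(2 / fan_in), as in He et al., and sets the bias to zero. The function name {{he_normal_init}}, the argument names, and the assumed weight shape (F, C*Hf*Wf) are illustrative only and are not taken from the DML source.

```python
import math
import random


def he_normal_init(F, C, Hf, Wf, seed=None):
    """Return (W, b) for a conv layer with F filters over C channels.

    Assumption for this sketch: W has shape (F, C*Hf*Wf) and b has shape (F, 1).
    """
    rng = random.Random(seed)
    fan_in = C * Hf * Wf
    std = math.sqrt(2.0 / fan_in)  # He et al. scaling for ReLU-style layers
    W = [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(F)]
    b = [[0.0] for _ in range(F)]  # bias defaults to zero
    return W, b
```

Since this is only a helper, the returned {{W}} and {{b}} can be replaced by any other matrices of the same shapes.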
In terms of integrating with Caffe now, you should be able to use the filler
functions you've written to create the {{W}} and {{b}} matrices, and then
simply pass them in. :)
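For example, a constant-style bias filler might look roughly like the sketch below. The name {{constant_filler}} and its signature are hypothetical, chosen only for illustration; they are not the filler functions referred to above.

```python
def constant_filler(rows, cols, value):
    """Build a rows x cols matrix (list of lists) filled with one value.

    Hypothetical stand-in for a Caffe-style "constant" filler.
    """
    return [[float(value)] * cols for _ in range(rows)]


F = 4                           # number of filters (example value)
b = constant_filler(F, 1, 0.2)  # bias of shape (F, 1) with a non-1 value
```

The resulting {{b}} has the (F, 1) shape the layer expects, so it could be passed in directly in place of the default zero bias.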
was (Author: [email protected]):
So just to be clear, I don't have the bias hard-coded to 1 in the conv_builtin
layer, or any of the layers. The user just passes in an arbitrary weight
matrix {{W}} and bias matrix {{b}} with the correct dimensions, and any
arbitrary values
[https://github.com/apache/incubator-systemml/blob/master/scripts/staging/SystemML-NN/nn/layers/conv_builtin.dml#L25].
The {{ones}} matrix on line 72 is just used to replicate the bias vector from
(F, 1) shape to (F, Hout*Wout) so that it can then be flattened and added to
{{out}}. So, the user is free to initialize a weight matrix {{W}} and bias
vector {{b}} with any of the initialization plans that you mentioned and pass
them into the layer. :)
That being said, I do provide {{init}} *convenience* initialization functions
for the layers with parameters that return new {{W}} and {{b}} matrices --
https://github.com/apache/incubator-systemml/blob/master/scripts/staging/SystemML-NN/nn/layers/conv_builtin.dml#L120.
Basically, these are just *helper functions* that generate the parameters for
that layer, but the user is *not required* to use them. Currently, those
{{init}} functions just use sane defaults, which right now are the
initialization from He et al. [http://arxiv.org/abs/1502.01852] for the weight
matrix, and 0 for the bias vector. The plan is to extend this to add the
common initialization schemes, such as the ones you mentioned, and those on
https://keras.io/initializations/ (I use the "he_normal" initialization on this
list).
In terms of integrating with Caffe now, you should be able to use the filler
functions you've written to create the {{W}} and {{b}} matrices, and then
simply pass them in. :)
> Allow users to pass non-1 bias filler in conv_builtin.dml
> ---------------------------------------------------------
>
> Key: SYSTEMML-1000
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1000
> Project: SystemML
> Issue Type: Wish
> Reporter: Niketan Pansare
>
> Useful pre-step to incorporate nn functions with caffe
> https://github.com/apache/incubator-systemml/blob/master/scripts/staging/SystemML-NN/nn/layers/conv_builtin.dml#L72
> https://github.com/apache/incubator-systemml/pull/158
> [[email protected]]