[GitHub] szha commented on a change in pull request #9662: Gluon PReLU, ELU, SELU, Swish

2018-02-05 Thread GitBox
szha commented on a change in pull request #9662: Gluon PReLU, ELU, SELU, Swish
URL: https://github.com/apache/incubator-mxnet/pull/9662#discussion_r166087641
 
 

 ##
 File path: src/operator/leaky_relu-inl.h
 ##
 @@ -225,7 +242,11 @@ class LeakyReLUProp : public OperatorProperty {
     const TShape &dshape = in_shape->at(leakyrelu::kData);
     if (dshape.ndim() == 0) return false;
     if (param_.act_type == leakyrelu::kPReLU) {
-      in_shape->at(leakyrelu::kGamma) = TShape(Shape1(dshape[1]));
+      const TShape &gshape = in_shape->at(leakyrelu::kGamma);
+      if (gshape.Size() != 1)
 
 Review comment:
   So, I should check for both ndim and shape_[0] then. How do I check whether it's undefined?
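
For intuition about what this inference step produces, here is a quick Python check against a stock mxnet build (the printed shapes in the comment are illustrative, not captured from this PR):

```python
import mxnet as mx

# PReLU via the symbolic LeakyReLU operator; the gamma argument is
# auto-created and its shape is filled in by the InferShape code above.
x = mx.sym.Variable('data')
y = mx.sym.LeakyReLU(data=x, act_type='prelu', name='act')
arg_shapes, _, _ = y.infer_shape(data=(2, 3, 4, 4))
print(dict(zip(y.list_arguments(), arg_shapes)))
# e.g. {'data': (2, 3, 4, 4), 'act_gamma': (3,)}: one slope per channel by default
```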


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on a change in pull request #9662: Gluon PReLU, ELU, SELU, Swish

2018-02-05 Thread GitBox
szha commented on a change in pull request #9662: Gluon PReLU, ELU, SELU, Swish
URL: https://github.com/apache/incubator-mxnet/pull/9662#discussion_r166086915
 
 

 ##
 File path: python/mxnet/gluon/nn/activations.py
 ##
 @@ -0,0 +1,215 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# coding: utf-8
+# pylint: disable= arguments-differ
+"""Basic neural network layers."""
+__all__ = ['Activation', 'LeakyReLU', 'PReLU', 'ELU', 'SELU', 'Swish']
+
+from ... import initializer
+from ..block import HybridBlock
+
+
+class Activation(HybridBlock):
+r"""Applies an activation function to input.
+
+Parameters
+--
+activation : str
+Name of activation function to use.
+See :func:`~mxnet.ndarray.Activation` for available choices.
+
+
+Inputs:
+- **data**: input tensor with arbitrary shape.
+
+Outputs:
+- **out**: output tensor with the same shape as `data`.
+"""
+def __init__(self, activation, **kwargs):
+self._act_type = activation
+super(Activation, self).__init__(**kwargs)
+
+def _alias(self):
+return self._act_type
+
+def hybrid_forward(self, F, x):
+return F.Activation(x, act_type=self._act_type, name='fwd')
+
+def __repr__(self):
+s = '{name}({_act_type})'
+return s.format(name=self.__class__.__name__,
+**self.__dict__)
+
+
+class LeakyReLU(HybridBlock):
+r"""Leaky version of a Rectified Linear Unit.
+
+It allows a small gradient when the unit is not active
+
+.. math::
+
+f\left(x\right) = \left\{
+\begin{array}{lr}
+   \alpha x & : x \lt 0 \\
+  x & : x \geq 0 \\
+\end{array}
+\right.\\
+
+Parameters
+--
+alpha : float
+slope coefficient for the negative half axis. Must be >= 0.
+
+
+Inputs:
+- **data**: input tensor with arbitrary shape.
+
+Outputs:
+- **out**: output tensor with the same shape as `data`.
+"""
+def __init__(self, alpha, **kwargs):
+assert alpha >= 0, "Slope coefficient for LeakyReLU must be no less 
than 0."
+super(LeakyReLU, self).__init__(**kwargs)
+self._alpha = alpha
+
+def hybrid_forward(self, F, x):
+return F.LeakyReLU(x, act_type='leaky', slope=self._alpha, name='fwd')
+
+def __repr__(self):
+s = '{name}({alpha})'
+return s.format(name=self.__class__.__name__,
+alpha=self._alpha)
+
+
+class PReLU(HybridBlock):
+    r"""Parametric leaky version of a Rectified Linear Unit, from the
+    `Delving Deep into Rectifiers <https://arxiv.org/abs/1502.01852>`_ paper.
+
+    It learns a gradient when the unit is not active
+
+    .. math::
+
+        f\left(x\right) = \left\{
+            \begin{array}{lr}
+               \alpha x & : x \lt 0 \\
+                      x & : x \geq 0 \\
+            \end{array}
+        \right.\\
+
+    where alpha is a learned parameter.
+
+    Parameters
+    ----------
+    alpha_initializer : Initializer
+        Initializer for the `alpha` parameter.
+
+
+    Inputs:
+        - **data**: input tensor with arbitrary shape.
+
+    Outputs:
+        - **out**: output tensor with the same shape as `data`.
+    """
+    def __init__(self, alpha_initializer=initializer.Constant(0.25), *args):
+        super(PReLU, self).__init__(*args)
+        with self.name_scope():
+            self.alpha = self.params.get('alpha', shape=(1,), init=alpha_initializer)
+
+    def hybrid_forward(self, F, x, alpha):
+        return F.LeakyReLU(x, gamma=alpha, act_type='prelu', name='fwd')
+
+    def __repr__(self):
+        s = '{name}'
 
 Review comment:
   Yes
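
For reference, a minimal usage sketch of the blocks quoted above (this assumes a build that already contains this PR's Gluon layers and the scalar-gamma operator support):

```python
import mxnet as mx
from mxnet.gluon import nn

prelu = nn.PReLU()                 # learnable slope, Constant(0.25) by default
net = nn.HybridSequential()
with net.name_scope():
    net.add(nn.Dense(16))
    net.add(prelu)
    net.add(nn.Dense(1))
net.initialize()
net.hybridize()

out = net(mx.nd.random.uniform(shape=(4, 8)))
print(prelu.alpha.data())          # shape (1,): a single slope shared by all elements
```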


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on a change in pull request #9662: Gluon PReLU, ELU, SELU, Swish

2018-02-02 Thread GitBox
szha commented on a change in pull request #9662: Gluon PReLU, ELU, SELU, Swish
URL: https://github.com/apache/incubator-mxnet/pull/9662#discussion_r165807578
 
 

 ##
 File path: src/operator/leaky_relu-inl.h
 ##
 @@ -177,9 +182,20 @@ class LeakyReLUOp : public Operator {
       case leakyrelu::kPReLU: {
         weight = in_data[leakyrelu::kGamma].get<xpu, 1, real_t>(s);
         grad_weight = in_grad[leakyrelu::kGamma].get<xpu, 1, real_t>(s);
-        grad_weight = sumall_except_dim<1>(F<prelu_grad>(data) * grad);
-        gdata = F<mshadow_op::xelu_grad>(data, mshadow::expr::broadcast<1>(weight, data.shape_))
-                * grad;
+        if (weight.shape_[0] == 1) {
 
 Review comment:
   There are two options: either writing it in Gluon by defining hybrid_forward in Python, or extending the leaky relu operator in C++ for better performance.
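
To make the first option concrete, a minimal sketch of an activation written purely in Gluon by overriding hybrid_forward (the class name and formula below are placeholders, not the activation discussed in this thread):

```python
import mxnet as mx
from mxnet.gluon import HybridBlock

class MyActivation(HybridBlock):
    """Placeholder activation: x * sigmoid(beta * x), built from existing ops."""
    def __init__(self, beta=1.0, **kwargs):
        super(MyActivation, self).__init__(**kwargs)
        self._beta = beta

    def hybrid_forward(self, F, x):
        # any composition of existing F.* operators hybridizes automatically
        return x * F.sigmoid(self._beta * x)

act = MyActivation(beta=1.5)
act.hybridize()
print(act(mx.nd.array([-2.0, 0.0, 2.0])))
```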


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on a change in pull request #9662: Gluon PReLU, ELU, SELU, Swish

2018-02-02 Thread GitBox
szha commented on a change in pull request #9662: Gluon PReLU, ELU, SELU, Swish
URL: https://github.com/apache/incubator-mxnet/pull/9662#discussion_r165807547
 
 

 ##
 File path: src/operator/leaky_relu-inl.h
 ##
 @@ -177,9 +182,20 @@ class LeakyReLUOp : public Operator {
       case leakyrelu::kPReLU: {
         weight = in_data[leakyrelu::kGamma].get<xpu, 1, real_t>(s);
         grad_weight = in_grad[leakyrelu::kGamma].get<xpu, 1, real_t>(s);
-        grad_weight = sumall_except_dim<1>(F<prelu_grad>(data) * grad);
-        gdata = F<mshadow_op::xelu_grad>(data, mshadow::expr::broadcast<1>(weight, data.shape_))
-                * grad;
+        if (weight.shape_[0] == 1) {
 
 Review comment:
   @bradcar sorry that I missed your comment earlier, and thanks for sharing 
your work. In this PR I'd like to first focus on wrapping up the previous two 
PRs for activations. Since you wrote the paper, would you like to implement 
that in mxnet?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on a change in pull request #9662: Gluon PReLU, ELU, SELU, Swish

2018-02-01 Thread GitBox
szha commented on a change in pull request #9662: Gluon PReLU, ELU, SELU, Swish
URL: https://github.com/apache/incubator-mxnet/pull/9662#discussion_r165499599
 
 

 ##
 File path: src/operator/leaky_relu-inl.h
 ##
 @@ -177,9 +182,20 @@ class LeakyReLUOp : public Operator {
       case leakyrelu::kPReLU: {
         weight = in_data[leakyrelu::kGamma].get<xpu, 1, real_t>(s);
         grad_weight = in_grad[leakyrelu::kGamma].get<xpu, 1, real_t>(s);
-        grad_weight = sumall_except_dim<1>(F<prelu_grad>(data) * grad);
-        gdata = F<mshadow_op::xelu_grad>(data, mshadow::expr::broadcast<1>(weight, data.shape_))
-                * grad;
+        if (weight.shape_[0] == 1) {
 
 Review comment:
   If the weight parameter is shared across all axes, then only one scalar value is shared everywhere, in which case the weight shape should be (1,).
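
As a plain NDArray sketch of the intended semantics rather than the operator's actual code path (the shapes and values are made up for illustration):

```python
import mxnet as mx

x = mx.nd.array([[[-1.0, 2.0], [-3.0, 4.0]],
                 [[-5.0, 6.0], [-7.0, 8.0]]])          # (batch=2, channel=2, width=2)

# per-channel slopes: one alpha per channel, broadcast over the other axes
alpha_c = mx.nd.array([0.1, 0.2]).reshape((1, 2, 1))
per_channel = mx.nd.where(x < 0, mx.nd.broadcast_mul(alpha_c, x), x)

# scalar sharing: a single (1,)-shaped alpha applied to every element,
# which is what the `weight.shape_[0] == 1` branch handles
alpha_s = 0.25
shared = mx.nd.where(x < 0, alpha_s * x, x)
```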


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on a change in pull request #9662: Gluon PReLU, ELU, SELU, Swish

2018-02-01 Thread GitBox
szha commented on a change in pull request #9662: Gluon PReLU, ELU, SELU, Swish
URL: https://github.com/apache/incubator-mxnet/pull/9662#discussion_r165470447
 
 

 ##
 File path: src/operator/leaky_relu-inl.h
 ##
 @@ -177,9 +182,20 @@ class LeakyReLUOp : public Operator {
       case leakyrelu::kPReLU: {
         weight = in_data[leakyrelu::kGamma].get<xpu, 1, real_t>(s);
         grad_weight = in_grad[leakyrelu::kGamma].get<xpu, 1, real_t>(s);
-        grad_weight = sumall_except_dim<1>(F<prelu_grad>(data) * grad);
-        gdata = F<mshadow_op::xelu_grad>(data, mshadow::expr::broadcast<1>(weight, data.shape_))
-                * grad;
+        if (weight.shape_[0] == 1) {
 
 Review comment:
   It's equivalent because weight is `Tensor<xpu, 1>`. I think `Size()` is a safer choice in case `weight` changes definition.
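
A quick Python analogue of why the total element count is the safer check if the parameter's definition ever changes:

```python
import mxnet as mx

w = mx.nd.ones((1,))
assert w.size == 1 and w.shape[0] == 1     # for a 1-D weight the two tests agree

w2 = mx.nd.ones((1, 3))
assert w2.shape[0] == 1 and w2.size == 3   # the first-dim test still "passes",
                                           # but the weight is no longer a scalar
```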


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on a change in pull request #9662: Gluon PReLU, ELU, SELU, Swish

2018-02-01 Thread GitBox
szha commented on a change in pull request #9662: Gluon PReLU, ELU, SELU, Swish
URL: https://github.com/apache/incubator-mxnet/pull/9662#discussion_r165469850
 
 

 ##
 File path: python/mxnet/gluon/nn/activations.py
 ##
 @@ -0,0 +1,214 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# coding: utf-8
+# pylint: disable= arguments-differ
+"""Basic neural network layers."""
+__all__ = ['Activation', 'LeakyReLU', 'PReLU', 'ELU', 'SELU', 'Swish']
+
+from ..block import HybridBlock
+
+
+class Activation(HybridBlock):
+r"""Applies an activation function to input.
+
+Parameters
+--
+activation : str
+Name of activation function to use.
+See :func:`~mxnet.ndarray.Activation` for available choices.
+
+
+Inputs:
+- **data**: input tensor with arbitrary shape.
+
+Outputs:
+- **out**: output tensor with the same shape as `data`.
+"""
+def __init__(self, activation, **kwargs):
+self._act_type = activation
+super(Activation, self).__init__(**kwargs)
+
+def _alias(self):
+return self._act_type
+
+def hybrid_forward(self, F, x):
+return F.Activation(x, act_type=self._act_type, name='fwd')
+
+def __repr__(self):
+s = '{name}({_act_type})'
+return s.format(name=self.__class__.__name__,
+**self.__dict__)
+
+
+class LeakyReLU(HybridBlock):
+r"""Leaky version of a Rectified Linear Unit.
+
+It allows a small gradient when the unit is not active
+
+.. math::
+
+f\left(x\right) = \left\{
+\begin{array}{lr}
+   \alpha x & : x \lt 0 \\
+  x & : x \geq 0 \\
+\end{array}
+\right.\\
+
+Parameters
+--
+alpha : float
+slope coefficient for the negative half axis. Must be >= 0.
+
+
+Inputs:
+- **data**: input tensor with arbitrary shape.
+
+Outputs:
+- **out**: output tensor with the same shape as `data`.
+"""
+def __init__(self, alpha, **kwargs):
+assert alpha >= 0, "Slope coefficient for LeakyReLU must be no less 
than 0."
+super(LeakyReLU, self).__init__(**kwargs)
+self._alpha = alpha
+
+def hybrid_forward(self, F, x):
+return F.LeakyReLU(x, act_type='leaky', slope=self._alpha, name='fwd')
+
+def __repr__(self):
+s = '{name}({alpha})'
+return s.format(name=self.__class__.__name__,
+alpha=self._alpha)
+
+
+class PReLU(HybridBlock):
+    r"""Parametric leaky version of a Rectified Linear Unit, from the
+    `Delving Deep into Rectifiers <https://arxiv.org/abs/1502.01852>`_ paper.
+
+    It learns a gradient when the unit is not active
+
+    .. math::
+
+        f\left(x\right) = \left\{
+            \begin{array}{lr}
+               \alpha x & : x \lt 0 \\
+                      x & : x \geq 0 \\
+            \end{array}
+        \right.\\
+
+    where alpha is a learned parameter.
+
+    Parameters
+    ----------
+    alpha_initializer : Initializer
+        Initializer for the `alpha` parameter.
+
+
+    Inputs:
+        - **data**: input tensor with arbitrary shape.
+
+    Outputs:
+        - **out**: output tensor with the same shape as `data`.
+    """
+    def __init__(self, alpha_initializer='zeros', *args):
 
 Review comment:
   [TensorFlow/Keras uses zeros](https://www.tensorflow.org/api_docs/python/tf/keras/layers/PReLU#__init__), [PyTorch uses 0.25](http://pytorch.org/docs/master/nn.html?highlight=prelu#torch.nn.PReLU)

   Do we have a constant initializer to achieve the latter?
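
For reference, mxnet does have a constant initializer, and the later revision of this file uses it; a small standalone sketch:

```python
import mxnet as mx
from mxnet import initializer

# reproduce PyTorch's 0.25 default with mxnet's Constant initializer
alpha = mx.gluon.Parameter('alpha', shape=(1,), init=initializer.Constant(0.25))
alpha.initialize()
print(alpha.data())   # [0.25]
```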


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on a change in pull request #9662: Gluon PReLU, ELU, SELU, Swish

2018-02-01 Thread GitBox
szha commented on a change in pull request #9662: Gluon PReLU, ELU, SELU, Swish
URL: https://github.com/apache/incubator-mxnet/pull/9662#discussion_r165437542
 
 

 ##
 File path: src/operator/leaky_relu-inl.h
 ##
 @@ -177,9 +182,20 @@ class LeakyReLUOp : public Operator {
       case leakyrelu::kPReLU: {
         weight = in_data[leakyrelu::kGamma].get<xpu, 1, real_t>(s);
         grad_weight = in_grad[leakyrelu::kGamma].get<xpu, 1, real_t>(s);
-        grad_weight = sumall_except_dim<1>(F<prelu_grad>(data) * grad);
-        gdata = F<mshadow_op::xelu_grad>(data, mshadow::expr::broadcast<1>(weight, data.shape_))
-                * grad;
+        if (weight.shape_[0] == 1) {
 
 Review comment:
   Existing PReLU had two problems:
   1. gamma parameter was never documented and can't be passed in using kwargs.
   2. it doesn't support scalar broadcast as was attempted in #8912 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services