xzqjack opened a new issue #7444: define a new Parametrized symbol layer and how to use (bind, init, set learning rate ) it? URL: https://github.com/apache/incubator-mxnet/issues/7444

I have been learning to define a parametrized layer whose parameters will be learned during training. I started by defining an FC layer with reference to:

1. https://github.com/apache/incubator-mxnet/blob/master/docs/tutorials/gluon/customop.md
2. https://github.com/apache/incubator-mxnet/blob/251ae71a20d8318ab20f4c19520f74b881fdf3ff/example/rcnn/rcnn/symbol/proposal.py
3. https://github.com/apache/incubator-mxnet/blob/251ae71a20d8318ab20f4c19520f74b881fdf3ff/example/reinforcement-learning/dqn/operators.py

The new parametrized layer definition seems OK, but I don't know how to use it, e.g. how to bind the symbol and how to set the learning rate of the myFc parameters.
Here is my code:

```
import logging

import mxnet as mx
from mxnet.io import DataDesc

from data_iter import get_mnist_iter  # user-defined MNIST iterator

logging.basicConfig(level=logging.INFO)


class MyFc(mx.operator.CustomOp):  # the op itself subclasses CustomOp, not CustomOpProp
    def __init__(self):
        super(MyFc, self).__init__()

    def forward(self, is_train, req, in_data, out_data, aux):
        x = in_data[0]
        w = in_data[1]
        b = in_data[2]
        y = mx.nd.dot(x, w.T) + b  # plain dot for 2-D inputs (batch_dot expects 3-D)
        self.assign(out_data[0], req[0], y)

    def backward(self, req, out_grad, in_data, out_data, in_grad, aux):
        dy = out_grad[0]
        x = in_data[0]
        w = in_data[1]
        self.assign(in_grad[0], req[0], mx.nd.dot(dy, w))       # dx
        self.assign(in_grad[1], req[1], mx.nd.dot(dy.T, x))     # dw
        self.assign(in_grad[2], req[2], mx.nd.sum(dy, axis=0))  # db: sum over the batch


@mx.operator.register("myFc")
class MyFcProp(mx.operator.CustomOpProp):
    def __init__(self):
        super(MyFcProp, self).__init__(need_top_grad=True)

    def list_arguments(self):
        return ['data', 'w', 'b']

    def list_outputs(self):
        return ['output']

    def infer_shape(self, in_shape):
        data_shape = in_shape[0]
        w_shape = in_shape[1]
        b_shape = in_shape[2]
        output_shape = (data_shape[0], 10)
        return [data_shape, w_shape, b_shape], [output_shape], []

    def create_operator(self, ctx, shapes, dtypes):
        return MyFc()


def get_symbol(num_classes=10, add_stn=False, perturbation_flag=False, **kwargs):
    data = mx.symbol.Variable('data')
    weight_2 = mx.sym.Variable('custom0_w', shape=(10, 200))
    bias_2 = mx.symbol.Variable('custom0_b', shape=(10,))
    if add_stn:
        data = mx.sym.SpatialTransformer(data=data, loc=get_loc(data),
                                         target_shape=(28, 28),
                                         transform_type="affine",
                                         sampler_type="bilinear")
    # first conv
    # conv1 = mx.symbol.Convolution(data=data, kernel=(5,5), num_filter=20)
    conv1 = conv_pertub(data=data, kernel=(5, 5), num_filter=20,
                        perturbation_flag=False)  # conv_pertub / fc_pertub / get_loc defined elsewhere
    tanh1 = mx.symbol.Activation(data=conv1, act_type="tanh")
    pool1 = mx.symbol.Pooling(data=tanh1, pool_type="max", kernel=(2, 2), stride=(2, 2))
    # second conv
    conv2 = mx.symbol.Convolution(data=pool1, kernel=(5, 5), num_filter=50)
    tanh2 = mx.symbol.Activation(data=conv2, act_type="tanh")
    pool2 = mx.symbol.Pooling(data=tanh2, pool_type="max", kernel=(2, 2), stride=(2, 2))
    # first fullc
    flatten = mx.symbol.Flatten(data=pool2)
    # fc1 = mx.symbol.Custom(data=flatten, )
    fc1 = fc_pertub(data=flatten, num_hidden=500, perturbation_flag=False)
    tanh3 = mx.symbol.Activation(data=fc1, act_type="tanh")
    # second fullc: the custom parametrized layer
    fc2 = mx.symbol.Custom(data=tanh3, op_type="myFc")
    # loss
    lenet = mx.symbol.SoftmaxOutput(data=fc2, name='softmax')
    return lenet


num_classes = 10
perturbation_flag = True
batch_size = 5
net = get_symbol(num_classes, add_stn=False, perturbation_flag=perturbation_flag)
train, val = get_mnist_iter(batch_size=batch_size)
print train.provide_label
mod = mx.mod.Module(net, data_names=['data', 'custom0_w', 'custom0_b'])
mod.bind(data_shapes=[DataDesc('data', (batch_size, 1, 28, 28)),
                      DataDesc('custom0_w', (10, 200)),
                      DataDesc('custom0_b', (10,))],
         label_shapes=train.provide_label)
```

## What have you tried to solve it?

1. `a, b, c = net.infer_shape(data=(5,1,28,28), custom0_w=(10, 200), custom0_b=(10,))` — the infer_shape operation works.

Can anyone show me how to use (bind, init, set learning rate) a new parametrized layer? Thanks.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
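For what it's worth, the forward/backward math that a custom FC layer like `MyFc` must implement can be sanity-checked outside MXNet with plain NumPy (a sketch assuming `y = x·wᵀ + b`, matching the shapes above; the function names `fc_forward`/`fc_backward` are illustrative, not MXNet API):

```python
import numpy as np

def fc_forward(x, w, b):
    # y = x . w^T + b, matching the custom op's forward pass
    return x.dot(w.T) + b

def fc_backward(dy, x, w):
    # Gradients of the FC layer w.r.t. its three inputs, given upstream gradient dy
    dx = dy.dot(w)          # shape (batch, in_dim)
    dw = dy.T.dot(x)        # shape (out_dim, in_dim)
    db = dy.sum(axis=0)     # shape (out_dim,): bias gradient sums over the batch
    return dx, dw, db

# Numerically check one entry of dw against a finite difference
rng = np.random.RandomState(0)
x = rng.randn(5, 200)       # (batch, in_dim) as in the issue
w = rng.randn(10, 200)      # (out_dim, in_dim)
b = rng.randn(10)
dy = rng.randn(5, 10)

dx, dw, db = fc_backward(dy, x, w)

eps = 1e-6
i, j = 3, 7
w_pert = w.copy()
w_pert[i, j] += eps
num = ((fc_forward(x, w_pert, b) - fc_forward(x, w, b)) * dy).sum() / eps
assert abs(num - dw[i, j]) < 1e-3
```

A check like this makes it easy to spot, for instance, that the bias gradient must be a sum of `dy` over the batch rather than a tensor of ones.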
With regards, Apache Git Services