joddiy edited a comment on issue #696:
URL: https://github.com/apache/singa/issues/696#issuecomment-628573921
> ```python
> class Module:
>     def compile(self, inputs, is_train, use_graph, graph_alg):
>         # set train, graph, etc. config
>         # ===turn on graph===
>         # if inputs are not filled, print warnings and fill inputs
>         # according to the data type
>         self.forward(*inputs)
>         # ===turn off graph===
>
>     def load(self, ckp_path, include_state=False):
>         # load the onnx model and copy the params to each layer;
>         # generate warnings for mismatched layers/params;
>         # restore the states and return them as a dict
>         pass
>
>     def save(self, ckp_path, state={}):
>         # save the model in onnx format
>         # save the states
>         pass
>
>     def forward(self, x):  # turn on graph if necessary
>         pass
>
>     def train_one_batch(self, x, y):  # turn on graph if necessary
>         pass
>
>     @deprecated
>     def loss(self):
>         pass
>
>     @deprecated
>     def optim(self):
>         pass
>
>
> class Layer:
>     def __init__(self, name=None):
>         self.init = False
>
>     def do_init(self, x):
>         # ===turn off graph===
>         # init layer states; as the graph is turned off, the
>         # initialization operations are executed immediately
>         # ===restore the state of the graph===
>         pass
>
>     def forward(self, x):
>         # do the forward propagation
>         pass
>
>     def __call__(self, x):
>         if not self.init:
>             self.do_init(x)
>             self.init = True
>         return self.forward(x)
>
>
> class MyLayer(Layer):
>     def __init__(self):
>         super().__init__()
>         self.layer1 = layer.Conv2d(nb_kernels=32, kernel=3, stride=1,
>                                    padding=0, kernel_init='he_uniform')
>         self.layer2 = layer.MaxPool2d(kernel=3, stride=2)
>
>     def forward(self, x):
>         return self.layer2(self.layer1(x))
>
>
> class MyModule(Module):
>     def __init__(self):
>         self.blk1 = MyLayer()
>         self.blk2 = MyLayer()
>         self.optim = SGD()
>         self.loss = CrossEntropyLoss()
>
>     def forward(self, x):
>         return self.blk2(self.blk1(x))
>
>     def train_one_batch(self, x, y):
>         y_ = self.forward(x)
>         l = self.loss(y_, y)
>         self.optim.backward_and_update(l)
>         return l
>
>
> x = Placeholder((2, 3), device=gpu, dtype=singa.float)  # alias of Tensor
> # === no need to fill x with values ===
> m = MyModule()
>
> # compatible with existing code, which does not have the following two statements
> m.compile([x], is_train=True, use_graph=True, graph_alg='sequence')
> for pname, ptensor in m.get_params():
>     ptensor.uniform(-1, 1)  # not necessary if each layer's param init methods are configured
>
> y = Placeholder((2,), device=gpu)
> for npx, npy in data:
>     x.copy_from(npx)
>     y.copy_from(npy)
>     m.train_one_batch(x, y)  # builds the graph in the first iter; with the old code, the params are initialized here
>
> m.save('mymodel', state={'epoch': data.size(), 'sgd': m.optim})
> ```
>
> How about this proposal?
Thanks for your comments. I think adding a compile function before training is a good idea. Based on Ruling's code, if we don't want to run the actual computation during the init phase, we can add a function that computes the output shape:
```py
class Module:
    def compile(self, inputs, is_train, use_graph, graph_alg):
        # set train, graph, etc. config
        # turn off graph
        # if inputs are not filled, print warnings and fill inputs
        # according to the data type
        self.forward(*inputs)

    def load(self, ckp_path, include_state=False):
        # load the onnx model and copy the params to each layer;
        # generate warnings for mismatched layers/params;
        # restore the states and return them as a dict
        pass

    def save(self, ckp_path, state={}):
        # save the model in onnx format
        # save the states
        pass

    def forward(self, x):  # turn on graph if necessary
        pass

    def train_one_batch(self, x, y):  # turn on graph if necessary
        pass

    @deprecated
    def loss(self):
        pass

    @deprecated
    def optim(self):
        pass


class Layer:
    def __init__(self, name=None):
        self.init = False

    def do_init(self, x):
        # compute the output shape
        output_shape = self.infer_shape(x)
        # init the weights according to the shape, then return a new
        # Placeholder (alias of Tensor) to the next operation
        return Placeholder(output_shape, device=gpu, dtype=singa.float)

    def forward(self, x):
        # do the forward propagation
        pass

    def __call__(self, x):
        if not self.init:
            self.init = True
            return self.do_init(x)  # shape inference only, no computation
        return self.forward(x)

    def infer_shape(self, x):
        # infer the output shape from the input shape
        pass


class MyLayer(Layer):
    def __init__(self):
        super().__init__()
        self.layer1 = layer.Conv2d(nb_kernels=32, kernel=3, stride=1,
                                   padding=0, kernel_init='he_uniform')
        self.layer2 = layer.MaxPool2d(kernel=3, stride=2)

    def forward(self, x):
        return self.layer2(self.layer1(x))


class MyModule(Module):
    def __init__(self):
        self.blk1 = MyLayer()
        self.blk2 = MyLayer()
        self.optim = SGD()
        self.loss = CrossEntropyLoss()

    def forward(self, x):
        return self.blk2(self.blk1(x))

    def train_one_batch(self, x, y):
        y_ = self.forward(x)
        l = self.loss(y_, y)
        self.optim.backward_and_update(l)
        return l


x = Placeholder((2, 3), device=gpu, dtype=singa.float)  # alias of Tensor
# === no need to fill x with values ===
m = MyModule()

# compatible with existing code, which does not have the following two statements
m.compile([x], is_train=True, use_graph=True, graph_alg='sequence')
for pname, ptensor in m.get_params():
    ptensor.uniform(-1, 1)  # not necessary if each layer's param init methods are configured

y = Placeholder((2,), device=gpu)
for npx, npy in data:
    x.copy_from(npx)
    y.copy_from(npy)
    m.train_one_batch(x, y)  # builds the graph in the first iter; with the old code, the params are initialized here

m.save('mymodel', state={'epoch': data.size(), 'sgd': m.optim})
```
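
For concreteness, here is a minimal, self-contained sketch of the deferred-initialization idea above. It uses plain numpy arrays as a stand-in for SINGA tensors and placeholders, and the `Dense` layer with its `init_params` and `infer_shape` methods is an illustrative assumption, not the actual SINGA API:

```py
import numpy as np


class Layer:
    def __init__(self, name=None):
        self.name = name
        self.initialized = False

    def __call__(self, x):
        if not self.initialized:
            # first call: infer shapes and allocate params only,
            # without running the computation
            self.initialized = True
            out_shape = self.infer_shape(x.shape)
            self.init_params(x.shape)
            return np.empty(out_shape, dtype=x.dtype)  # placeholder output
        return self.forward(x)


class Dense(Layer):
    """Hypothetical fully-connected layer, for illustration only."""

    def __init__(self, num_output):
        super().__init__()
        self.num_output = num_output

    def infer_shape(self, in_shape):
        return (in_shape[0], self.num_output)

    def init_params(self, in_shape):
        # the param shapes depend on the input shape, which is only
        # known at the first call
        self.W = np.random.uniform(-1, 1, (in_shape[1], self.num_output))
        self.b = np.zeros(self.num_output)

    def forward(self, x):
        return x @ self.W + self.b


lyr = Dense(4)
x = np.empty((2, 3), dtype=np.float32)      # a "placeholder": shape only
y = lyr(x)                                  # first call: init only
print(y.shape)                              # (2, 4)
print(lyr(np.ones((2, 3), np.float32)))     # second call: real forward pass
```

With this pattern, the first pass through the network only sizes and allocates the parameters, so shape mismatches surface at compile time rather than in the middle of training, and no real computation happens until the second pass.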