RuRo opened a new issue #15103: [Feature Request] Gluon SymbolBlock structured 
imports.
URL: https://github.com/apache/incubator-mxnet/issues/15103
 
 
   As far as I am aware, gluon provides only two ways to save/load models. Please correct me if I am wrong.
   
   1) using `save/load_parameters`
   
   The problem with this option is that it doesn't save the model architecture, so you have to construct the model before loading the parameters. You have to know **exactly** what the model was, which is really inconvenient.
   
   For example, if you are trying to do transfer learning, you have to know the 
exact number of classes the model was pretrained on and all the hyperparameters 
used during pretraining, which is really inelegant and annoying.
   
   Furthermore, when releasing to production, this means that you need to 
actually create the whole model and then load parameters, instead of just doing 
an `imports`. This essentially rules out this option unless you can ship the 
whole training code to production.
   
   2) using `export/imports`
   
   This is a much saner option for exporting your model for inference. But it 
is (as far as I can tell) completely useless for experiments/training/transfer 
learning. That is because `export/imports` doesn't actually preserve the Block 
hierarchy and instead squashes the whole model into one block.
   
   For example:
   
   ```python
   from mxnet import gluon, ndarray
   M = gluon.nn.HybridSequential()
   with M.name_scope():
       M.add(gluon.nn.Dense(10))
       M.add(gluon.nn.Dense(1))
   M.initialize()
   M.hybridize()
   M(ndarray.arange(10))
   ```
   
   At this point, `M` is a hybridized block with the structure
   ```
   HybridSequential(
     (0): Dense(1 -> 10, linear)
     (1): Dense(10 -> 1, linear)
   )
   ```
   
   And I can access the various children by indexing (or by names if it's not a 
sequential block).
   However after I do
   
   ```python
   M.export('model')
   M = gluon.nn.SymbolBlock.imports('model-symbol.json', ['data'], 
'model-0000.params')
   ```
   
   `M` is a SymbolBlock without any internal structure, so if I wanted to access only the first dense layer, I couldn't.
   ```
   SymbolBlock(
   
   )
   ```
   
   ---
   
   In the end, I am left with two options:
   
   1) rebuild the model from scratch every time I want to load it (a complete nightmare when the architecture may still change between experiments, and it requires shipping training code to production)
   2) give up transfer learning
   
   Please correct me if I am wrong. Otherwise, this is a feature request to somehow preserve the HybridBlock structure when using `export/imports`.
   
   ---
   
   Temporary Workarounds:
   
   - use both (1) and (2): load from (1) during development, ship (2) to production (I'm currently doing this)
   - use (2) and write your own code to extract the weights from `M.params` for transfer learning (this sounds like a huge pain)
