tornadomeet commented on issue #9338: why share the same value of class member when using multigpu of HybridBlock, bug? URL: https://github.com/apache/incubator-mxnet/issues/9338#issuecomment-355876605

hello, @szha? Is `self.cnt` a parameter instance variable or a regular instance variable? There are many cases where a block needs to use its own class-member value independently on each device, for example:

```python
import mxnet as mx
from mxnet import gluon


class ToyBlock(gluon.HybridBlock):
    """A toy block illustrating a class member that should be initialized per device."""

    def __init__(self, N):
        super(ToyBlock, self).__init__()
        self._N = N
        self._initialized = False

    def initialize(self, features, F=mx.nd):
        self._initialized = True
        if F == mx.nd:
            idx_vec = F.arange(0, stop=self._N, ctx=features.context)
        elif F == mx.sym:
            idx_vec = F.arange(0, stop=self._N)
        # identity-like mask, built once and reused in hybrid_forward
        self._eye_matrix = F.stop_gradient(
            F.broadcast_equal(F.expand_dims(idx_vec, 0), F.expand_dims(idx_vec, 1)))

    def hybrid_forward(self, F, features):
        if not self._initialized:
            self.initialize(features, F)
        loss = F.mean(
            F.sqrt(F.maximum(1e-10, 2.0 - 2.0 * F.dot(features, features, transpose_b=True)))
            - self._eye_matrix)
        return loss
```

Because we want to initialize `self._eye_matrix` only once, when `F == mx.nd` we need to know the context in advance (of course we could re-initialize it on every `forward()` instead). But when training with multiple GPUs, after `forward()` finishes on gpu0 we have `self._initialized = True`, so gpu1 never initializes its own `self._eye_matrix`. The line

```python
loss = F.mean(F.sqrt(F.maximum(1e-10, 2.0 - 2.0 * F.dot(features, features, transpose_b=True))) - self._eye_matrix)
```

then crashes because the operands live on different contexts.
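A possible workaround, only a sketch and not verified (the names `ToyBlockPerCtx`, `_eye_cache`, and `_eye_for` are made up for illustration): in imperative mode, cache one matrix per context so each GPU builds its own copy instead of reusing the array created on another device.

```python
import mxnet as mx
from mxnet import gluon


class ToyBlockPerCtx(gluon.HybridBlock):
    """Sketch: keep one eye-like matrix per context so multi-GPU forward
    never reuses an NDArray created on a different device."""

    def __init__(self, N):
        super(ToyBlockPerCtx, self).__init__()
        self._N = N
        self._eye_cache = {}  # hypothetical per-context cache, keyed by ctx

    def _eye_for(self, F, features):
        if F == mx.nd:
            ctx = features.context
            if ctx not in self._eye_cache:
                idx_vec = F.arange(0, stop=self._N, ctx=ctx)
                self._eye_cache[ctx] = F.stop_gradient(
                    F.broadcast_equal(F.expand_dims(idx_vec, 0),
                                      F.expand_dims(idx_vec, 1)))
            return self._eye_cache[ctx]
        # symbolic mode: the context is resolved at bind time, no cache needed
        idx_vec = F.arange(0, stop=self._N)
        return F.stop_gradient(
            F.broadcast_equal(F.expand_dims(idx_vec, 0), F.expand_dims(idx_vec, 1)))

    def hybrid_forward(self, F, features):
        eye = self._eye_for(F, features)
        return F.mean(
            F.sqrt(F.maximum(1e-10, 2.0 - 2.0 * F.dot(features, features, transpose_b=True)))
            - eye)
```

This avoids the shared `self._initialized` flag entirely, at the cost of storing one small matrix per device.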
