tornadomeet commented on issue #9338: why share the same value of class member when using multigpu of HybridBlock, bug? URL: https://github.com/apache/incubator-mxnet/issues/9338#issuecomment-355876605

hello, @szha? Is `self.cnt` a parameter instance variable or a regular instance variable? There are many cases where a block needs to use its own class-member value independently on each device, for example:

```python
import mxnet as mx
from mxnet import gluon


class ToyBlock(gluon.HybridBlock):
    """A toy block illustrating a class member that should be initialized per device."""

    def __init__(self, N):
        super(ToyBlock, self).__init__()
        self._N = N
        self._initialized = False

    def initialize(self, features, F=mx.nd):
        self._initialized = True
        if F == mx.nd:
            idx_vec = F.arange(0, stop=self._N, ctx=features.context)
        elif F == mx.sym:
            idx_vec = F.arange(0, stop=self._N)
        # identity-like mask, built once and reused in hybrid_forward
        self._eye_matrix = F.stop_gradient(
            F.broadcast_equal(F.expand_dims(idx_vec, 0), F.expand_dims(idx_vec, 1)))

    def hybrid_forward(self, F, features):
        if not self._initialized:
            self.initialize(features, F)
        loss = F.mean(
            F.sqrt(F.maximum(1e-10, 2.0 - 2.0 * F.dot(features, features, transpose_b=True)))
            - self._eye_matrix)
        return loss
```

Because we want to initialize `self._eye_matrix` only once, when `F == mx.nd` we need to know the context in advance (of course we could re-initialize it on every `forward()` instead). But when training with multiple GPUs, after `forward()` finishes on gpu0 we have `self._initialized = True`, so gpu1 never initializes its own `self._eye_matrix`. The line

```python
loss = F.mean(F.sqrt(F.maximum(1e-10, 2.0 - 2.0 * F.dot(features, features, transpose_b=True))) - self._eye_matrix)
```

then crashes because the operands live on different contexts.
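A possible workaround, only a sketch and not verified (the names `ToyBlockPerCtx`, `_eye_cache`, and `_eye_for` are made up for illustration): in imperative mode, cache one matrix per context so each GPU builds its own copy instead of reusing the array created on another device.

```python
import mxnet as mx
from mxnet import gluon


class ToyBlockPerCtx(gluon.HybridBlock):
    """Sketch: keep one eye-like matrix per context so multi-GPU forward
    never reuses an NDArray created on a different device."""

    def __init__(self, N):
        super(ToyBlockPerCtx, self).__init__()
        self._N = N
        self._eye_cache = {}  # hypothetical per-context cache, keyed by ctx

    def _eye_for(self, F, features):
        if F == mx.nd:
            ctx = features.context
            if ctx not in self._eye_cache:
                idx_vec = F.arange(0, stop=self._N, ctx=ctx)
                self._eye_cache[ctx] = F.stop_gradient(
                    F.broadcast_equal(F.expand_dims(idx_vec, 0),
                                      F.expand_dims(idx_vec, 1)))
            return self._eye_cache[ctx]
        # symbolic mode: the context is resolved at bind time, no cache needed
        idx_vec = F.arange(0, stop=self._N)
        return F.stop_gradient(
            F.broadcast_equal(F.expand_dims(idx_vec, 0), F.expand_dims(idx_vec, 1)))

    def hybrid_forward(self, F, features):
        eye = self._eye_for(F, features)
        return F.mean(
            F.sqrt(F.maximum(1e-10, 2.0 - 2.0 * F.dot(features, features, transpose_b=True)))
            - eye)
```

This avoids the shared `self._initialized` flag entirely, at the cost of storing one small matrix per device.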
