chandana1332 commented on issue #17237: Data imbalance handling in MXNet Gluon
URL: https://github.com/apache/incubator-mxnet/issues/17237#issuecomment-571719098

That only works when the number of samples being sampled is less than batch_size, but I'm talking about a case where the number of batches being sampled is less than the number of GPUs. Hence, the scenario I'm describing is outside of the data loader. Also, we don't have an issue handling data imbalance; I'm trying to understand the internals of how MXNet does it. Today, we sample batches, and if the number of sampled batches is less than the number of GPUs, we simply process the batches on the GPUs that received one and call trainer.step(), which reduces the gradients correctly and updates the parameters. I would like to understand how MXNet handles this internally in the parameter server (PS) architecture.
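
To make the scenario concrete, here is a minimal, framework-free sketch of the arithmetic described above: fewer sampled batches than devices, gradients summed over the devices that actually received a batch, and one update rescaled by the total number of samples. It is not MXNet code and says nothing about how the parameter server is implemented. The names `assign_batches`, `grad_sum`, and `step`, the toy scalar model, and the learning rate are all hypothetical and chosen only for illustration.

```python
# Hypothetical, framework-free sketch. This is NOT MXNet's implementation.

def assign_batches(batches, num_devices):
    """Pair each sampled batch with a device index; surplus devices stay idle."""
    return list(enumerate(batches[:num_devices]))


def grad_sum(batch, w):
    """Sum of per-sample gradients of 0.5 * (w * x - y) ** 2 with respect to w."""
    return sum((w * x - y) * x for x, y in batch)


def step(w, batches, num_devices, lr=0.1):
    """One update: sum gradients from active devices, rescale by sample count."""
    active = assign_batches(batches, num_devices)
    total_grad = sum(grad_sum(batch, w) for _, batch in active)
    total_samples = sum(len(batch) for _, batch in active)
    return w - lr * total_grad / total_samples


# Two sampled batches but four devices: devices 2 and 3 receive nothing.
batches = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]
w_multi = step(0.0, batches, num_devices=4)

# Same samples processed as one batch on a single device, for comparison.
w_single = step(0.0, [batches[0] + batches[1]], num_devices=1)
```

In this toy setting the update from the partially filled set of devices matches the single-device update on the same samples, because idle devices contribute nothing to either the gradient sum or the sample count. Whether MXNet's kvstore takes the same route internally is exactly the open question in this comment.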
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services
