Neutron3529 edited a comment on issue #15655: Performance regression for gluon dataloader with large batch size
URL: https://github.com/apache/incubator-mxnet/issues/15655#issuecomment-515744051

@zhreshold I find the bottleneck:
```
>>> import mxnet as mx
>>> def data_xform(data):
...     """Move channel axis to the beginning, cast to float32, and normalize to [0, 1]."""
...     return mx.nd.moveaxis(data, 2, 0).astype('float32') / 255
...
>>> train_data = mx.gluon.data.vision.MNIST(train=True).transform_first(data_xform)
>>> train_data_NO_TRANSFORM = mx.gluon.data.vision.MNIST(train=True)
>>>
>>> batch_size = 10000
>>> train_loader = mx.gluon.data.DataLoader(train_data, shuffle=True, batch_size=batch_size)
>>> train_loader_NO_TRANSFORM = mx.gluon.data.DataLoader(train_data_NO_TRANSFORM, shuffle=True, batch_size=batch_size)
>>>
>>> from time import time
>>> t = time()
>>> for data, label in train_loader:
...     pass
...
>>> print(time() - t)
26.350508451461792
>>> t = time()
>>> for data, label in train_loader_NO_TRANSFORM:
...     pass
...
>>> print(time() - t)
3.0209174156188965
>>> batch_size = 100
>>> train_loader = mx.gluon.data.DataLoader(train_data, shuffle=True, batch_size=batch_size)
>>> t = time()
>>> for data, label in train_loader_NO_TRANSFORM:
...     data = data_xform(data).asnumpy()  # to ensure the function is executed
...
>>> print(time() - t)
3.950432062149048
>>> batch_size = 10
>>> train_loader = mx.gluon.data.DataLoader(train_data, shuffle=True, batch_size=batch_size)
>>> t = time()
>>> for data, label in train_loader_NO_TRANSFORM:
...     data = data_xform(data).asnumpy()
...
>>> print(time() - t)
4.091055631637573
>>> batch_size = 1
>>> train_loader = mx.gluon.data.DataLoader(train_data, shuffle=True, batch_size=batch_size)
>>> t = time()
>>> for data, label in train_loader_NO_TRANSFORM:
...     data = data_xform(data).asnumpy()
...
>>> print(time() - t)
4.060138940811157
>>> batch_size = 100
>>> t = time()
>>> train_data_lazy = mx.gluon.data.vision.MNIST(train=True).transform_first(data_xform, lazy=False)
>>> print(time() - t)
24.288618326187134
>>> train_loader_lazy = mx.gluon.data.DataLoader(train_data_lazy, shuffle=True, batch_size=batch_size)
>>> print(time() - t)
24.294118642807007
>>> t = time()
>>> for data, label in train_loader_lazy:
...     pass
...
>>> print(time() - t)
1.6100592613220215
```
`.transform_first` takes far more time than expected. Changing when `data_xform` is executed (e.g., applying it once per batch rather than once per sample) could improve performance.
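The timings above suggest the cost comes from running the transform per sample inside `transform_first` rather than from the transform itself. A minimal NumPy sketch of the per-batch idea (illustrative only; it mirrors `data_xform` with `np.moveaxis` using negative axes so one function handles both a single HWC image and an NHWC batch, and the shapes here are assumptions):

```python
import numpy as np

def data_xform(data):
    """Move the channel axis in front of the spatial axes, cast to float32,
    and normalize to [0, 1]. Works for HWC images and NHWC batches alike."""
    return np.moveaxis(data, -1, -3).astype('float32') / 255

# A fake uint8 "MNIST" batch in NHWC layout (batch of 32 single-channel 28x28 images).
batch = np.random.randint(0, 256, size=(32, 28, 28, 1), dtype=np.uint8)

# Per-sample: the transform runs once per image, as transform_first does.
per_sample = np.stack([data_xform(img) for img in batch])

# Per-batch: one vectorized call over the whole batch.
per_batch = data_xform(batch)

# Both paths produce identical NCHW output; the per-batch path avoids
# the per-sample Python-level overhead that dominates at large batch sizes.
assert per_sample.shape == per_batch.shape == (32, 1, 28, 28)
assert np.allclose(per_sample, per_batch)
```

The same reshuffling is what the loop over `train_loader_NO_TRANSFORM` above does manually: load raw batches, then apply the transform once per batch.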
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
