ThomasDelteil commented on issue #13447: Rewrite dataloader, improves responsiveness and reliability
URL: https://github.com/apache/incubator-mxnet/pull/13447#issuecomment-442946640

@leezu, the problem with a stream is that any shuffling has to be done ahead of time, so that sequential access to the stream still yields samples in random order. One solution that works with the current Dataset and DataLoader API, and the one I prefer, is to use a continuous batch sampler and iterate over batches rather than relying on the concept of an epoch: you iterate over your dataloader and break out of the loop after N iterations. You can design the continuous batch sampler so that its single iterator chains together `epochs` sequences of indices, each drawn without replacement.
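
A minimal sketch of what such a continuous batch sampler could look like, in plain Python. The class name `ContinuousBatchSampler` and its parameters (`length`, `batch_size`, `epochs`, `seed`) are assumptions made for illustration and are not part of MXNet. Letting batches straddle epoch boundaries and keeping the final partial batch are also choices made here, not something the comment specifies.

```python
import itertools
import math
import random


class ContinuousBatchSampler:
    """Hypothetical sampler: batches drawn from `epochs` back-to-back permutations."""

    def __init__(self, length, batch_size, epochs, seed=None):
        self._length = length          # number of samples in the dataset
        self._batch_size = batch_size
        self._epochs = epochs          # how many permutations to chain together
        self._rng = random.Random(seed)

    def _index_stream(self):
        # Each pass yields every index exactly once (sampling without
        # replacement), then a fresh permutation starts.
        for _ in range(self._epochs):
            indices = list(range(self._length))
            self._rng.shuffle(indices)
            yield from indices

    def __iter__(self):
        stream = self._index_stream()
        while True:
            # Batches may span the boundary between two permutations.
            batch = list(itertools.islice(stream, self._batch_size))
            if not batch:
                return
            yield batch

    def __len__(self):
        # Total number of batches, counting a final partial batch.
        return math.ceil(self._length * self._epochs / self._batch_size)


# Iterate by batch count rather than by epoch: stop after N iterations.
sampler = ContinuousBatchSampler(length=10, batch_size=4, epochs=3, seed=0)
max_iterations = 5
seen = []
for i, batch in enumerate(sampler):
    if i >= max_iterations:
        break
    seen.append(batch)
```

Because the class defines `__iter__` and `__len__`, an instance should be usable as the `batch_sampler` argument of `gluon.data.DataLoader`, in which case the same `enumerate`/`break` loop runs over the dataloader instead of the sampler.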
