zhreshold opened a new issue #17263: [mxnet 2.0][item 4.8][RFC] Gluon Data API Extension and Fixes (Part 1)
URL: https://github.com/apache/incubator-mxnet/issues/17263

## Description

This is part 1 of the Gluon Data API extension and fixes, which mainly focuses on cleaning up the diverging usage of the mxnet module and gluon APIs.

Over a long period of evolution, two streams of data loading conventions have come to coexist in mxnet:

- Iterator: [mxnet.io.DataIter](https://github.com/apache/incubator-mxnet/blob/ac88f1e87aa7da2c33f0cb89524d444ddc3a2ae8/python/mxnet/io/io.py#L180)
- Dataset + DataLoader: gluon.data.Dataset + gluon.data.DataLoader

To eliminate this confusion and reduce maintenance effort, the plan is to drop all old iterators and provide a similar Dataset + DataLoader experience in the gluon data API.

## Things to be removed

### iterators

- Base [mxnet.io.DataIter](https://github.com/apache/incubator-mxnet/blob/ac88f1e87aa7da2c33f0cb89524d444ddc3a2ae8/python/mxnet/io/io.py#L180)
- [mxnet.io.ResizeIter](https://github.com/apache/incubator-mxnet/blob/ac88f1e87aa7da2c33f0cb89524d444ddc3a2ae8/python/mxnet/io/io.py#L282)
- [mxnet.io.PrefetchIter](https://github.com/apache/incubator-mxnet/blob/ac88f1e87aa7da2c33f0cb89524d444ddc3a2ae8/python/mxnet/io/io.py#L347)
- [mxnet.io.NDArrayIter](https://github.com/apache/incubator-mxnet/blob/ac88f1e87aa7da2c33f0cb89524d444ddc3a2ae8/python/mxnet/io/io.py#L491)
- [mxnet.io.MXDataIter](https://github.com/apache/incubator-mxnet/blob/ac88f1e87aa7da2c33f0cb89524d444ddc3a2ae8/python/mxnet/io/io.py#L800)
- [mxnet.image.ImageIter](https://github.com/apache/incubator-mxnet/blob/ac88f1e87aa7da2c33f0cb89524d444ddc3a2ae8/python/mxnet/image/detection.py#L626)
- [mxnet.image.ImageDetIter](https://github.com/apache/incubator-mxnet/blob/ac88f1e87aa7da2c33f0cb89524d444ddc3a2ae8/python/mxnet/image/detection.py#L626)

### Augmenters from the mxnet.image and mxnet.image.detection modules

Random augmenters, e.g. (https://github.com/apache/incubator-mxnet/blob/ac88f1e87aa7da2c33f0cb89524d444ddc3a2ae8/python/mxnet/image/image.py#L615), will be removed.

### `transform=` args in gluon.data.Datasets

`transform=` is no longer supported and can be replaced with `dataset.transform` or `dataset.transform_first`.

## Things to be added

### Gluon Data Datasets

Dataset + Transform combos that simulate the removed iterators. For example, NDArrayIter can be reimplemented as NDArrayDataset + an empty transform function.

### Gluon Data Augmenters/Transforms

Data augmenters implemented as mxnet.gluon.Block.

Candidates are TBD; useful candidates can be drawn from [GluonCV](https://github.com/dmlc/gluon-cv/tree/master/gluoncv/data/transforms) and [GluonNLP](https://github.com/dmlc/gluon-nlp/blob/v0.8.x/src/gluonnlp/data/transforms.py).

### mxnet.image

Image processing functions will be absorbed from [GluonCV](https://github.com/dmlc/gluon-cv/blob/master/gluoncv/data/transforms/image.py).
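To make the "Dataset + Transform combo" idea more concrete, here is a rough, dependency-free sketch of the pattern in plain Python. It is not the MXNet implementation: `SimpleDataset`, `_LazyTransformDataset`, and `iterate_batches` are hypothetical stand-ins chosen for illustration. Only the method names `transform` and `transform_first` mirror the ones mentioned in this RFC, and their behavior here (lazy application, first element only) is an assumption modeled on how the existing gluon datasets behave.

```python
# Dependency-free sketch of the Dataset + transform pattern.
# All class and function names are hypothetical stand-ins, not MXNet APIs.


class SimpleDataset:
    """Minimal random-access dataset over one or more parallel sequences."""

    def __init__(self, *arrays):
        if not arrays:
            raise ValueError("need at least one array")
        if len({len(a) for a in arrays}) != 1:
            raise ValueError("all arrays must have the same length")
        self._arrays = arrays

    def __len__(self):
        return len(self._arrays[0])

    def __getitem__(self, idx):
        if len(self._arrays) == 1:
            return self._arrays[0][idx]
        return tuple(a[idx] for a in self._arrays)

    def transform(self, fn):
        """Return a dataset that applies fn to each sample on access."""
        return _LazyTransformDataset(self, fn)

    def transform_first(self, fn):
        """Return a dataset that applies fn to the first element only."""
        def first_only(x, *rest):
            return (fn(x),) + rest if rest else fn(x)
        return self.transform(first_only)


class _LazyTransformDataset(SimpleDataset):
    """Wraps another dataset and transforms samples when they are read."""

    def __init__(self, data, fn):
        self._data = data
        self._fn = fn

    def __len__(self):
        return len(self._data)

    def __getitem__(self, idx):
        item = self._data[idx]
        # Tuple samples are unpacked so fn receives (data, label, ...).
        if isinstance(item, tuple):
            return self._fn(*item)
        return self._fn(item)


def iterate_batches(dataset, batch_size):
    """Yield lists of samples, roughly what an iterator used to hand out."""
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        yield [dataset[i] for i in range(start, stop)]


data = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
labels = [0, 1, 0]
dataset = SimpleDataset(data, labels)

# "Empty" transform: samples pass through unchanged.
identity = dataset.transform(lambda *sample: sample)

# Replacement for a constructor-level transform= argument:
# halve the data, leave the label untouched.
scaled = dataset.transform_first(lambda x: [v / 2 for v in x])

batches = list(iterate_batches(scaled, batch_size=2))
```

In this sketch the transform is attached after the dataset is constructed rather than passed as a constructor argument, which is the shape of the `transform=` replacement proposed above. The batching helper only stands in for the role a DataLoader would play.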
