kjchalup opened a new issue #11883: ImageIter last batch / batch padding behavior
URL: https://github.com/apache/incubator-mxnet/issues/11883

## Description
`mxnet.image.ImageIter` lacks a way to choose the behavior on the last batch (when the batch size doesn't divide the size of the dataset). In contrast, [mxnet.io.NDArrayIter](https://mxnet.incubator.apache.org/api/python/io/io.html#mxnet.io.NDArrayIter) has a `last_batch_handle` keyword arg which allows the user to choose what to do.

This can lead to unexpected behavior:
```
# Create an iterator over an image dataset containing a total of 8 images.
diter = mxnet.image.ImageIter(batch_size=5, ...)
batch1 = diter.next()
batch2 = diter.next()
print(batch2.label)
```
output:
```
[
[3.0000000e+00 1.0000000e+00 4.0000000e+00 2.5249697e-29 2.8025969e-45]
<NDArray 5 @cpu(0)>]
```

In this case, the last batch has only 3 legitimate labels, and the last two labels are garbage. I think the user is expected to check the padding manually:
```
print('batch1.pad = {}, batch2.pad = {}'.format(batch1.pad, batch2.pad))
```
output:
```
batch1.pad = 5, batch2.pad=2
```

But this is not documented in the [mxnet.image API reference](https://mxnet.incubator.apache.org/api/python/image/image.html). In addition, `batch2.data` is *not* filled with garbage -- it contains 5 legitimate images, adding to the confusion.
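
Until this is configurable, a manual workaround might look like the sketch below. It assumes that `pad` counts the trailing filler entries of a batch, which matches the `batch2.pad` value of 2 reported above; the reported `batch1.pad` value does not fit that reading, so this should be checked against the MXNet version in use. `trim_padded` is a hypothetical helper, not part of MXNet, and the example runs on a plain Python list standing in for the label array.

```python
def trim_padded(values, pad):
    """Return `values` without its last `pad` entries.

    Assumption (not confirmed by the MXNet docs): `pad` is the number of
    trailing filler entries in the batch.
    """
    if pad < 0 or pad > len(values):
        raise ValueError("pad must be between 0 and len(values)")
    return values[:len(values) - pad]


# Plain-list stand-in for the labels printed from the second batch above.
labels = [3.0, 1.0, 4.0, 2.5249697e-29, 2.8025969e-45]
valid_labels = trim_padded(labels, pad=2)
print(valid_labels)
```

The same slice could be applied to any sequence that supports `len()` and slicing along its first axis; whether the data array needs the same treatment depends on how the iterator fills the last batch.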
