Neutron3529 opened a new issue #20164:
URL: https://github.com/apache/incubator-mxnet/issues/20164
## Description
I found that the existing image decoding method (OpenCV) is much slower than
libjpeg-turbo.
With my libjpeg-turbo-based dataset, it takes 2:31 (timed with tqdm) to read
the whole validation set (val.rec, ~6 GiB, 50000 JPEGs, batch_size=100,
~3.3 it/s with num_workers=1). The results may not be accurate, since my
existing training processes are already using a lot of CPU.
With the default ImageRecordDataset, the speed drops to ~3 it/s.
What's more, fetching data directly from my `ImageIdx` class, without a
DataLoader, gives about 400~500 images/s.
I cannot figure out what makes libjpeg-turbo so slow with the DataLoader, but
the existing results show that libjpeg-turbo is fast enough.
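
To compare the DataLoader figures with the direct-access figure, the iteration rates above can be converted to images per second. The following is a minimal sketch of that arithmetic, using only the numbers reported in this issue; the variable names are illustrative.

```python
# Numbers reported above for reading val.rec through a DataLoader.
num_images = 50000
batch_size = 100
elapsed_s = 2 * 60 + 31                    # 2:31 as shown by tqdm

iterations = num_images // batch_size      # number of batches
its_per_s = iterations / elapsed_s         # iterations per second
images_per_s = its_per_s * batch_size      # images per second

# Default ImageRecordDataset, reported at roughly 3 it/s.
default_images_per_s = 3 * batch_size

print(f"libjpeg-turbo via DataLoader: {its_per_s:.2f} it/s, {images_per_s:.0f} images/s")
print(f"default ImageRecordDataset:   about {default_images_per_s} images/s")
```

Both DataLoader figures land at roughly 300 to 330 images/s, which is below the 400~500 images/s measured when indexing `ImageIdx` directly.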
## Code
This section shows the code I am using now; it may be a good replacement
for the existing `unpack_img`/`ImageRecordDataset` method.
This dataset is faster than the default dataset, with only a small
difference in the output (plus or minus 1 in some pixels).
Although jpeg4py does not support JPEGs with CMYK color, we could use a
try-except to avoid this error.
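
The try-except fallback could be sketched as below. This is a minimal illustration, not the code used in this issue: `decode_with_fallback`, `fast_decode`, and `slow_decode` are hypothetical names, and the two stand-in decoders only mimic a CMYK failure so that the sketch runs without jpeg4py or MXNet.

```python
def decode_with_fallback(data, fast_decode, slow_decode, errors=(Exception,)):
    """Try the fast decoder first; use the slow one if the fast one raises."""
    try:
        return fast_decode(data)
    except errors:
        return slow_decode(data)


# Hypothetical stand-ins for a libjpeg-turbo decoder and an OpenCV decoder.
def fast_decode(data):
    if data.startswith(b"CMYK"):
        raise ValueError("unsupported color space")
    return ("fast", len(data))


def slow_decode(data):
    return ("slow", len(data))


rgb_result = decode_with_fallback(b"RGB-bytes", fast_decode, slow_decode, (ValueError,))
cmyk_result = decode_with_fallback(b"CMYK-bytes", fast_decode, slow_decode, (ValueError,))
print(rgb_result, cmyk_result)
```

In real use, the `errors` tuple would be narrowed to whatever the fast decoder actually raises, so that unrelated failures are not silently routed to the slow path.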
```
import os
import jpeg4py._cffi as jpeg
from jpeg4py._cffi import TJPF_RGB
from jpeg4py import JPEG
from numpy import frombuffer, uint8, zeros
from mxnet.ndarray import array
from mxnet.gluon.data.dataset import Dataset
from mxnet.image import imdecode
from mxnet.recordio import unpack, MXIndexedRecordIO


class Decode(JPEG):
    def __init__(self):
        # Call the base-class constructor of JPEG, skipping JPEG.__init__,
        # because no source is available yet.
        super(JPEG, self).__init__(None)
        self.decompressor = None
        self.width = 0
        self.height = 0
        self.subsampling = 1
        # Reusable output buffer. Should be thread-local;
        # not compatible with threading.
        self.dst = zeros((512*512*3,), dtype=uint8)

    def decode(self, source, pixfmt=TJPF_RGB):
        self.source = frombuffer(source, dtype=uint8)  # new line
        #bpp = jpeg.tjPixelSize[pixfmt]
        # since pixfmt is fixed
        #if dst is None:
        #    if self.width is None:
        #        self.parse_header()
        self.parse_header()
        #    sh = [self.height, self.width]
        data_len = self.height*self.width*3  # new line, for reuse purposes.
        #    if bpp > 1:
        #        sh.append(bpp)
        #    dst = numpy.zeros(sh, dtype=numpy.uint8)
        #elif not hasattr(dst, "__array_interface__"):
        #    raise ValueError("dst should be numpy array or None")
        #if len(dst.shape) < 2:
        #    raise ValueError("dst shape length should 2 or 3")
        if self.dst.nbytes < data_len:
            #    raise ValueError(
            #        "dst is too small to hold the requested pixel format")
            self.dst = zeros(data_len, dtype=uint8)  # increase dst if needed.
        #self._get_decompressor()
        # previous line already done in self.parse_header
        n = self.lib_.tjDecompress2(
            self.decompressor.handle_,
            jpeg.ffi.cast("unsigned char*",
                          self.source.__array_interface__["data"][0]),
            self.source.nbytes,
            jpeg.ffi.cast("unsigned char*",
                          self.dst.__array_interface__["data"][0]),
            #dst.shape[1], dst.strides[0], dst.shape[0], pixfmt, 0)
            # The pitch argument is bytes per row: width * 3 for RGB
            # (the flat buffer's strides[0] would be 1, which is not a row stride).
            self.width, self.width*3, self.height, pixfmt, 0)
        if n:
            # libjpeg-turbo failed; fall back to the default (OpenCV) decoder.
            return imdecode(self.source, 1)
        else:
            return array(self.dst[:data_len].reshape([self.height, self.width, 3]),
                         dtype='uint8')


class ImageIdx(Dataset):
    """ImageNet with a RecordIO (.rec) file.

    Each sample is a tuple of the decoded image, its label, and its index.

    Parameters
    ----------
    filename : str
        Path to rec file.
    """
    def __init__(self, filename):
        self.idx_file = os.path.splitext(filename)[0] + '.idx'
        self.filename = filename
        self._record = MXIndexedRecordIO(self.idx_file, self.filename, 'r')
        self._decode = Decode()
        self.decode = self._decode.decode

    def __getitem__(self, idx):
        header, img = unpack(self._record.read_idx(self._record.keys[idx]))
        return self.decode(img), header.label, idx

    def __len__(self):
        return len(self._record.keys)


# test
a = ImageIdx('/imagenet/rec/val.rec')
sum(sum(sum(a[2][0])))
```
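
The comment on `self.dst` notes that the shared output buffer is not safe to use from several threads. One possible way around that is `threading.local`, sketched here under the assumption that one scratch buffer per thread is acceptable. The names `ScratchBuffer` and `get` are hypothetical, and a `bytearray` stands in for the NumPy array used above so that the sketch has no dependencies.

```python
import threading


class ScratchBuffer:
    """Per-thread reusable byte buffer that grows on demand."""

    def __init__(self, initial_size=512 * 512 * 3):
        self._initial_size = initial_size
        self._local = threading.local()

    def get(self, nbytes):
        """Return this thread's buffer, replacing it if it is smaller than nbytes."""
        buf = getattr(self._local, "buf", None)
        if buf is None or len(buf) < nbytes:
            buf = bytearray(max(nbytes, self._initial_size))
            self._local.buf = buf
        return buf


scratch = ScratchBuffer(initial_size=16)
main_buf = scratch.get(8)       # first call allocates the main thread's buffer
same_buf = scratch.get(12)      # request fits, so the same object is reused
bigger_buf = scratch.get(64)    # request does not fit, so a larger buffer replaces it

# A second thread gets its own buffer instead of sharing the main thread's.
worker_ids = []
worker = threading.Thread(target=lambda: worker_ids.append(id(scratch.get(8))))
worker.start()
worker.join()
```

A `Decode` instance could hold such an object in place of `self.dst` and call `get(data_len)` in `decode`, keeping the grow-if-needed behavior while giving each thread its own memory.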
## References
- https://www.libjpeg-turbo.org/
- jpeg4py