[GitHub] [incubator-mxnet] Neutron3529 opened a new issue #20164: RecordIO is too slow to decode, why not using libjpeg-turbo instead?

GitBox Wed, 14 Apr 2021 15:06:44 -0700


Neutron3529 opened a new issue #20164:
URL: https://github.com/apache/incubator-mxnet/issues/20164



   ## Description
   I found that the existing image decoding method (OpenCV) is much slower than libjpeg-turbo.
   With my libjpeg-turbo based dataset, it takes 2:31 (timed with tqdm) to read all the data in the validation set (val.rec, ~6 GiB, 50000 JPEGs, batch_size=100, ~3.3 it/s with num_workers=1). The results may not be accurate, since my existing training processes are already using a lot of CPU.
   
   With the default ImageRecordDataset, the speed drops to ~3 it/s.
   
   What's more, fetching data directly from my `ImageIdx` class (without the DataLoader) runs at about 400~500 images/s.
   
   I cannot figure out what makes libjpeg-turbo so slow with the DataLoader, but the existing results show that libjpeg-turbo is fast enough.
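
As a rough cross-check of the figures above, and as one way such numbers could be measured, here is a minimal sketch. `measure_throughput` and `expected_seconds` are hypothetical helpers, not part of MXNet; the first works on any iterable of batches (a `DataLoader`, for example), and plain `time.perf_counter` stands in for tqdm.

```python
import time


def measure_throughput(batches, batch_size):
    """Iterate over `batches` once and return (iterations/s, images/s).

    Hypothetical helper: `batches` can be any iterable, e.g. a DataLoader.
    Assumes every batch holds `batch_size` images.
    """
    start = time.perf_counter()
    n_iter = 0
    for _ in batches:
        n_iter += 1
    # Guard against a zero elapsed time on very short runs.
    elapsed = max(time.perf_counter() - start, 1e-9)
    its_per_s = n_iter / elapsed
    return its_per_s, its_per_s * batch_size


def expected_seconds(num_images, batch_size, its_per_s):
    """Time needed to read `num_images` at a given iteration rate."""
    return (num_images / batch_size) / its_per_s


# Cross-check with the numbers reported above:
# 50000 images / 100 per batch = 500 iterations; 500 / 3.3 it/s is about 151.5 s.
seconds = expected_seconds(50000, 100, 3.3)
minutes, rest = divmod(int(seconds), 60)
print(f"{minutes}:{rest:02d}")  # prints 2:31
```

At ~3.3 it/s with 100 images per batch, the DataLoader path delivers roughly 330 images/s, which is below the 400~500 images/s seen when indexing the dataset directly.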
   
   
   ## Code
   
   This section shows the code I am using now. It may be a good replacement for the existing `unpack_img`/`ImageRecordDataset` method.
   
   This dataset is faster than the default dataset, with only a small difference in output (plus or minus 1 in some pixel values).
   
   Although jpeg4py does not support CMYK JPEGs, we could use a try-except to avoid this error.
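
The try-except idea mentioned above could look roughly like the following sketch. All names here are hypothetical: in the context of this issue, `fast_decode` would be the libjpeg-turbo based decoder and `safe_decode` would be something like `mxnet.image.imdecode`. The issue does not say which exception jpeg4py raises for CMYK input, so the sketch catches `Exception` by default; a narrower type would be better once it is known.

```python
def decode_with_fallback(buf, fast_decode, safe_decode, errors=(Exception,)):
    """Try `fast_decode(buf)`; on one of `errors`, fall back to `safe_decode(buf)`."""
    try:
        return fast_decode(buf)
    except errors:
        return safe_decode(buf)


# Stand-in decoders, used only to show the control flow.
def fake_turbo_decode(buf):
    # Pretend that buffers starting with b"CMYK" are unsupported.
    if buf.startswith(b"CMYK"):
        raise ValueError("unsupported color space")
    return ("turbo", len(buf))


def fake_opencv_decode(buf):
    return ("opencv", len(buf))


print(decode_with_fallback(b"RGB-data", fake_turbo_decode, fake_opencv_decode))
print(decode_with_fallback(b"CMYK-data", fake_turbo_decode, fake_opencv_decode))
```

The fast path is used whenever it succeeds, so the slower decoder only costs time on the rare images the fast one rejects.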
   
   ```
   import os
   import jpeg4py._cffi as jpeg
   from jpeg4py._cffi import TJPF_RGB
   from jpeg4py import JPEG
   from numpy import frombuffer,uint8,zeros
   from mxnet.ndarray import array
   from mxnet.gluon.data.dataset import Dataset
   from mxnet.image import imdecode
   from mxnet.recordio import unpack,MXIndexedRecordIO
   class Decode(JPEG):
       def __init__(self):
           super(JPEG, self).__init__(None)
           self.decompressor = None
           self.width = 0
           self.height = 0
           self.subsampling = 1
           self.dst = zeros((512*512*3,),dtype=uint8) # should be thread-local; not compatible with threading.
       def decode(self, source, pixfmt=TJPF_RGB):
           self.source=frombuffer(source,dtype=uint8) # new line
           #bpp = jpeg.tjPixelSize[pixfmt]
               # since pixfmt is fixed
           #if dst is None:
           #    if self.width is None:
           #        self.parse_header()
           self.parse_header()
           #    sh = [self.height, self.width]
           data_len = self.height*self.width*3 # new line, so the dst buffer can be reused.
           #    if bpp > 1:
           #        sh.append(bpp)
           #    dst = numpy.zeros(sh, dtype=numpy.uint8)
           #elif not hasattr(dst, "__array_interface__"):
           #    raise ValueError("dst should be numpy array or None")
           #if len(dst.shape) < 2:
           #    raise ValueError("dst shape length should 2 or 3")
           if self.dst.nbytes < data_len:
           #    raise ValueError(
           #        "dst is too small to hold the requested pixel format")
               self.dst = zeros(data_len, dtype=uint8) # increase dst if needed.
           #self._get_decompressor()
               # previous line already done in self.parse_header
           n = self.lib_.tjDecompress2(
               self.decompressor.handle_,
               jpeg.ffi.cast("unsigned char*",
                             self.source.__array_interface__["data"][0]),
               self.source.nbytes,
               jpeg.ffi.cast("unsigned char*",
                             self.dst.__array_interface__["data"][0]),
               #dst.shape[1], dst.strides[0], dst.shape[0], pixfmt, 0)
               self.width   ,              0,  self.height, pixfmt, 0) # pitch=0: tightly packed rows (width * pixel size)
           if n: # non-zero return code: libjpeg-turbo failed, fall back to MXNet's decoder
               return imdecode(self.source, 1)
           else:
               return array(self.dst[:data_len].reshape([self.height, self.width, 3]), dtype='uint8')
   
   
   class ImageIdx(Dataset):
       """ImageNet with a RecordIO (.rec) file.
   
       Each sample is a tuple of (decoded image, label, index).
   
       Parameters
       ----------
       filename : str
           Path to rec file.
       """
       def __init__(self, filename):
           self.idx_file = os.path.splitext(filename)[0] + '.idx'
           self.filename = filename
           self._record = MXIndexedRecordIO(self.idx_file, self.filename, 'r')
           self._decode = Decode()
           self.decode = self._decode.decode
       def __getitem__(self, idx):
           header,img=unpack(self._record.read_idx(self._record.keys[idx]))
           return self.decode(img), header.label, idx
       def __len__(self):
           return len(self._record.keys)
   
   #test
   a=ImageIdx('/imagenet/rec/val.rec')
   sum(sum(sum(a[2][0])))
   ```
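
The `Decode` class above keeps a single shared `dst` buffer, which its own comment flags as a problem with threads. One possible way around that, sketched here with the standard `threading.local`, is to give each thread its own reusable buffer. `get_scratch_buffer` is a hypothetical name, and a `bytearray` stands in for the NumPy array to keep the example dependency-free.

```python
import threading

# Each thread sees its own attributes on this object.
_local = threading.local()


def get_scratch_buffer(nbytes, initial=512 * 512 * 3):
    """Return the calling thread's reusable buffer, grown to at least `nbytes`."""
    buf = getattr(_local, "buf", None)
    if buf is None or len(buf) < nbytes:
        # Allocate on first use in this thread, or when the image is too large.
        buf = bytearray(max(nbytes, initial))
        _local.buf = buf
    return buf


# Usage: the same thread gets the same buffer back, other threads get their own.
main_buf = get_scratch_buffer(100 * 100 * 3)
other = []
worker = threading.Thread(target=lambda: other.append(get_scratch_buffer(100 * 100 * 3)))
worker.start()
worker.join()
print(get_scratch_buffer(10) is main_buf, other[0] is main_buf)
```

This keeps the allocation-free fast path of the original design while avoiding two threads writing decoded pixels into the same memory.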
   
   ## References
   - https://www.libjpeg-turbo.org/
   - jpeg4py
   




