[GitHub] [hudi] vburenin opened a new pull request #2440: Fixed suboptimal implementation of a magic sequence search that may take days

GitBox Wed, 13 Jan 2021 09:53:33 -0800


vburenin opened a new pull request #2440:
URL: https://github.com/apache/hudi/pull/2440



   ## What is the purpose of the pull request
   Fixes a suboptimal implementation of the magic sequence search, which could 
take days on files of only a few megabytes.
   Instead of a 6-byte buffer, the search now uses a much larger buffer, which 
speeds up the process by roughly 170,000 times in some cases. The inefficiency 
is especially noticeable when GCS or S3 storage is being used.
   
   ## Brief change log
   
   Rewrote the scanForNextAvailableBlockOffset function to use a large buffer.
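
For illustration only, the general idea of scanning with a large buffer might look like the sketch below. It is not the code from this pull request. The class and method names (`MagicScanSketch`, `findNextMagic`, `indexOf`), the marker value `#HUDI#`, and the use of a plain `InputStream` are all assumptions made for the example. Each read pulls in a large chunk, and the last `magic.length - 1` bytes are carried over to the next chunk so that a marker straddling two reads is still found.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class MagicScanSketch {
    // Assumed 6-byte marker, used here only for illustration.
    static final byte[] MAGIC = "#HUDI#".getBytes(StandardCharsets.US_ASCII);

    /**
     * Returns the stream offset of the first occurrence of magic,
     * or -1 if the stream ends without a match.
     */
    static long findNextMagic(InputStream in, byte[] magic, int bufferSize)
            throws IOException {
        if (magic.length == 0 || bufferSize < magic.length) {
            throw new IllegalArgumentException("bufferSize must be >= magic.length > 0");
        }
        byte[] buf = new byte[bufferSize];
        int carried = 0;    // bytes kept from the previous chunk, stored at buf[0..carried)
        long bufStart = 0;  // stream offset that buf[0] corresponds to
        while (true) {
            // One large read instead of many tiny ones.
            int n = in.read(buf, carried, buf.length - carried);
            if (n < 0) {
                return -1;
            }
            int filled = carried + n;
            int idx = indexOf(buf, filled, magic);
            if (idx >= 0) {
                return bufStart + idx;
            }
            // Keep the tail so a marker split across two reads is not missed.
            int keep = Math.min(magic.length - 1, filled);
            System.arraycopy(buf, filled - keep, buf, 0, keep);
            bufStart += filled - keep;
            carried = keep;
        }
    }

    /** Naive search for pat within the first len bytes of buf. */
    static int indexOf(byte[] buf, int len, byte[] pat) {
        for (int i = 0; i + pat.length <= len; i++) {
            int j = 0;
            while (j < pat.length && buf[i + j] == pat[j]) {
                j++;
            }
            if (j == pat.length) {
                return i;
            }
        }
        return -1;
    }
}
```

With a buffer of, say, 1 MiB, a scan over a few megabytes needs only a handful of reads. On object stores such as GCS or S3, where every read can turn into a remote request, the number of reads tends to dominate the cost.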
   
   ## Verify this pull request
   
   This pull request is already covered by existing tests.
          


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
