Public bug reported:

Binary package hint: ubiquity

Ubiquity's new MD5 checking implementation uses shutil.copyfileobj to
copy the file (which reads and writes in 16 KB chunks), then seeks back
to the start of the source file, reads the whole thing into memory in
one go, and computes its MD5. This is inefficient in two ways: first,
it reads the source file twice; second, it needs memory proportional to
the size of the largest file being copied.

hashlib.md5 allows you to create a hash object with no data, and feed
data to it block by block. Therefore, I would suggest expanding out the
(trivial) code for shutil.copyfileobj, and feeding each block to
md5obj.update as you go along. Then you just have to close and reopen
the target file and compute its MD5; I would suggest also doing that
block-by-block to keep memory requirements down.
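
A rough sketch of what that could look like is below. It is not taken
from the ubiquity source: the function names (copy_with_md5,
md5_of_file, copy_and_verify) and the BLOCK_SIZE constant are made up
for illustration, and the 16 KB block size is only assumed to match
what shutil.copyfileobj does today.

```python
import hashlib

# Assumed block size: 16 KB, matching the chunk size mentioned above.
BLOCK_SIZE = 16 * 1024


def copy_with_md5(source_path, target_path, block_size=BLOCK_SIZE):
    """Copy source to target block by block (hypothetical helper).

    Each block is fed to the hash object as it is copied, so the source
    is read only once. Returns the MD5 hex digest of the source data.
    """
    md5obj = hashlib.md5()
    with open(source_path, "rb") as src, open(target_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            md5obj.update(block)
            dst.write(block)
    return md5obj.hexdigest()


def md5_of_file(path, block_size=BLOCK_SIZE):
    """Compute a file's MD5 hex digest one block at a time (hypothetical helper)."""
    md5obj = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            md5obj.update(block)
    return md5obj.hexdigest()


def copy_and_verify(source_path, target_path):
    """Copy the file, then reopen the target and compare digests."""
    source_md5 = copy_with_md5(source_path, target_path)
    # The target was closed by the with-block above, so this reads it afresh.
    return source_md5 == md5_of_file(target_path)
```

With this shape, memory use is bounded by the block size rather than by
the size of the largest file, and the source file is read once instead
of twice.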

** Affects: ubiquity (Ubuntu)
     Importance: Low
     Assignee: Evan Dandrea (evand)
         Status: New

** Changed in: ubiquity (Ubuntu)
   Importance: Undecided => Low
     Assignee: (unassigned) => Evan Dandrea (evand)

-- 
MD5 check implementation is inefficient
https://bugs.launchpad.net/bugs/198019
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
