Alejandro, I don't think the process described by Mark Waddingham will produce the same result as getting the digest of the whole file, no matter what chunk size you use. But I'm sure he's right that the chance of a collision will still be very small.
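
To illustrate the difference, here is a rough sketch in Python rather than Revolution (my choice only because its hashlib module lets you feed a single digest piece by piece). The function names and the sample data are made up for the example. Feeding chunks into one md5 object should give the true whole-file digest without loading the file, while taking the md5 of the concatenated per-chunk digests gives a different value that also depends on the chunk size:

```python
import hashlib
import io

def whole_md5(stream, chunk_size=256 * 1024):
    """True MD5 of the stream, fed to ONE hash object chunk by chunk."""
    h = hashlib.md5()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:          # empty read means end of stream
            break
        h.update(chunk)
    return h.hexdigest()

def quasi_md5(stream, chunk_size=256 * 1024):
    """MD5 of the concatenated per-chunk MD5s (the digest-of-digests idea)."""
    digests = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digests += hashlib.md5(chunk).digest()
    return hashlib.md5(digests).hexdigest()

# 1 MiB of sample bytes standing in for a large file (hypothetical data)
data = bytes(range(256)) * 4096

print(whole_md5(io.BytesIO(data)) == hashlib.md5(data).hexdigest())
print(quasi_md5(io.BytesIO(data)) == hashlib.md5(data).hexdigest())
```

For a real file you would pass open(path, "rb") instead of the BytesIO object. Only the first function is comparable with what md5 tools report.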

If your computer has openssl installed, you can use a shell call:

get shell("openssl md5" && quote & pathToFile & quote)

Best,

Mark


On 21 Feb 2008, at 19:12, capellan wrote:


Hi Mark,

This has been discussed before ;-)

On 25 September 2005, Mark Waddingham wrote,
in the message "put url some url into someVar...":

Mark Schonewille was trying to do something very similar a while ago and filed an enhancement request about being able to do md5 digests on large
files *without* them needing to be loaded. For interest see here:

  <http://support.runrev.com/bugdatabase/show_bug.cgi?id=2410>

As suggested (by me) in the bug-report, you needn't take the md5digest
of the entire file at once and can do something like this instead:

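function quasiMD5 pFile
  local tMD5s, tAtEOF
  open file pFile for binary read
  repeat
    read from file pFile for CHUNK_SIZE chars
    -- the final read can return a last partial chunk and set the result
    -- to EOF at the same time, so digest the data before leaving the loop
    put (the result is EOF) into tAtEOF
    if it is not empty then
      put the md5digest of it after tMD5s
    end if
    if tAtEOF then
      exit repeat
    end if
  end repeat
  close file pFile
  return the md5Digest of tMD5s
end quasiMD5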

Where you can make CHUNK_SIZE some suitable size (perhaps 256k/512k).

[ My intuitive analysis of the impact of doing this on the integrity
(i.e. potential for collision) of the digest is that it will be minimal
- but perhaps someone more knowledgeable in this area could comment. ]

Hope this helps,

Mark Waddingham.


Well, I have not tried this code myself, but I am sure that no md5 command-line
tool loads a whole Linux installer CD image into memory to verify the file.
Surely they use an approach similar to the one suggested by Mark Waddingham.
But there is one thing we need to know for sure: what exact size are the
chunks to process?

alejandro



masmit wrote:

I need to get the md5 of potentially very large
files on disc. I'm hoping to avoid having to load them into memory in
order to get the digest.

Mark


--
View this message in context: http://www.nabble.com/md5- tp15536091p15618684.html
Sent from the Revolution - User mailing list archive at Nabble.com.

_______________________________________________
use-revolution mailing list
[email protected]
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-revolution
