crypto:md5 vs erlang:md5
------------------------
Key: COUCHDB-757
URL: https://issues.apache.org/jira/browse/COUCHDB-757
Project: CouchDB
Issue Type: Improvement
Environment: GNU/Linux
Reporter: Filipe Manana
Attachments: crypto_md5.patch
Just noticed that crypto:md5 is several times faster than erlang:md5 when
hashing just 4 KB or 8 KB of data (roughly 3.5x to 6x in the single-run
timings below).
Basically we use md5 hashing when writing and reading documents and attachments
through couch_file and couch_stream.
Eshell V5.8 (abort with ^G)
1> crypto:start().
ok
2> Bin1 = crypto:rand_bytes(4 * 1024).
<<92,239,233,29,1,237,96,193,188,97,4,72,51,90,96,91,187,
112,112,198,7,173,105,99,205,65,105,94,144,...>>
3>
3> {T1, _} = timer:tc(erlang, md5, [Bin1]).
{211,
<<20,235,111,74,212,254,194,144,49,70,205,105,124,106,
131,230>>}
4>
4> {T2, _} = timer:tc(crypto, md5, [Bin1]).
{60,
<<20,235,111,74,212,254,194,144,49,70,205,105,124,106,
131,230>>}
5>
5> Bin2 = crypto:rand_bytes(8 * 1024).
<<246,66,158,227,62,127,62,239,202,232,133,244,191,9,136,
6,164,179,109,166,253,41,144,185,177,39,177,88,142,...>>
6>
6> {T3, _} = timer:tc(erlang, md5, [Bin2]).
{446,
<<7,55,252,42,249,30,58,22,245,12,111,82,131,58,199,51>>}
7>
7> {T4, _} = timer:tc(crypto, md5, [Bin2]).
{77,
<<7,55,252,42,249,30,58,22,245,12,111,82,131,58,199,51>>}
8>
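The timings above are single timer:tc runs, so they are sensitive to noise
(scheduling, first-call code loading). A minimal sketch of how the comparison
could be averaged over many iterations follows. The module name md5_bench and
its functions are hypothetical and not part of the attached patch. It also
assumes crypto:hash/2 is available; on the release used in this ticket the
equivalent call is crypto:md5/1, which later OTP releases removed.

```erlang
-module(md5_bench).
-export([compare/2, avg_micros/3, repeat/3]).

%% Hypothetical helper, not from crypto_md5.patch.
%% Returns {ErlangAvg, CryptoAvg}: the average time in microseconds that
%% each implementation needs to hash a Size-byte binary, over N runs.
compare(Size, N) ->
    %% Harmless if crypto is already started; result deliberately ignored.
    _ = application:start(crypto),
    %% MD5 cost depends on input length, not content, so zeros are enough.
    Bin = <<0:(Size * 8)>>,
    ErlangAvg = avg_micros(fun erlang:md5/1, Bin, N),
    CryptoAvg = avg_micros(fun crypto_md5/1, Bin, N),
    {ErlangAvg, CryptoAvg}.

%% Assumption: crypto:hash/2 exists. On older releases such as the one in
%% this ticket, replace the body with crypto:md5(Bin).
crypto_md5(Bin) ->
    crypto:hash(md5, Bin).

%% Time N calls of Fun(Bin) as one unit and divide by N.
avg_micros(Fun, Bin, N) when N > 0 ->
    {Total, ok} = timer:tc(?MODULE, repeat, [Fun, Bin, N]),
    Total / N.

%% Exported only so that timer:tc/3 can call it.
repeat(_Fun, _Bin, 0) ->
    ok;
repeat(Fun, Bin, N) ->
    _ = Fun(Bin),
    repeat(Fun, Bin, N - 1).
```

For example, md5_bench:compare(4 * 1024, 10000) returns a pair of averages
for the 4 KB case. The absolute numbers will vary by machine and OTP release.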
I know there's a ticket that aims to make it possible to remove the
dependency on the crypto module, but for environments where crypto is
available this change would be a plus.
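One way both goals could coexist is to prefer crypto when it works and fall
back to erlang:md5 otherwise. The following is only a sketch of that idea: the
module name md5_select is hypothetical, it is not what the attached patch
does, and it assumes crypto:hash/2 (on the release used here the call would be
crypto:md5/1).

```erlang
-module(md5_select).
-export([md5/1]).

%% Hypothetical wrapper, not from crypto_md5.patch.
%% Hash Data with crypto when it is usable. If the crypto module is missing
%% or its native library cannot be loaded, the call raises an error and we
%% fall back to the built-in erlang:md5/1. Both return a 16-byte binary.
md5(Data) ->
    try
        crypto:hash(md5, Data)
    catch
        error:_ ->
            erlang:md5(Data)
    end.
```

Catching the error on every call is simple but wasteful when crypto is
absent, so a real implementation would more likely make the choice once at
startup and remember it.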
I ran a test that wrote 400 attachments of about 60 KB each and saw an average
response time of 0.16 s with crypto:md5 versus 0.18 s with erlang:md5.