[
https://issues.apache.org/jira/browse/CASSANDRA-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16183505#comment-16183505
]
Jason Brown commented on CASSANDRA-13291:
-----------------------------------------
A [slightly fairer|https://github.com/jasobrown/cassandra/commit/e57bc21903687dfed573ff427ad4eeededac41a9]
comparison, wherein I call {{MessageDigest#clone()}} on a prototype instance
each time, instead of creating a fresh instance via
{{MessageDigest.getInstance("MD5")}}.
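For readers unfamiliar with the prototype approach, here is a minimal JDK-only sketch of the two ways of obtaining a digest. It is not the benchmark code from the linked commit; the class and method names ({{DigestCloneSketch}}, {{hashWithFreshInstance}}, {{hashWithClonedPrototype}}) are hypothetical, and it assumes the JDK's MD5 provider supports {{clone()}}, falling back to a fresh instance if it does not.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical illustration only; not the code from the linked commit.
class DigestCloneSketch {
    // Created once and never updated, so every clone starts from a clean state.
    private static final MessageDigest PROTOTYPE = newMd5();

    // Provider lookup on every call: the cost the prototype approach avoids.
    static MessageDigest newMd5() {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException("MD5 unavailable", e);
        }
    }

    static byte[] hashWithFreshInstance(byte[] input) {
        return newMd5().digest(input);
    }

    static byte[] hashWithClonedPrototype(byte[] input) {
        try {
            MessageDigest md = (MessageDigest) PROTOTYPE.clone();
            return md.digest(input);
        } catch (CloneNotSupportedException e) {
            // Assumption: fall back to a fresh instance if the provider cannot clone.
            return newMd5().digest(input);
        }
    }

    public static void main(String[] args) {
        byte[] input = "cassandra".getBytes(StandardCharsets.UTF_8);
        boolean same = Arrays.equals(hashWithFreshInstance(input),
                                     hashWithClonedPrototype(input));
        System.out.println("digests match: " + same);
    }
}
```

Both paths should produce identical 16-byte digests; only the cost of obtaining the {{MessageDigest}} differs.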
Updated results:
{noformat}
[java] Benchmark                             (bufferSize)  Mode  Cnt     Score     Error  Units
[java] HashingBench.benchHasherMD5                     31  avgt    5   340.186 ±  54.611  ns/op
[java] HashingBench.benchHasherMD5                    131  avgt    5   708.117 ±  42.826  ns/op
[java] HashingBench.benchHasherMD5                    517  avgt    5  1801.402 ±  47.358  ns/op
[java] HashingBench.benchHasherMD5                   2041  avgt    5  6294.723 ± 518.325  ns/op
[java] HashingBench.benchHasherMurmur3_128             31  avgt    5   286.312 ±  65.617  ns/op
[java] HashingBench.benchHasherMurmur3_128            131  avgt    5   429.138 ±  36.589  ns/op
[java] HashingBench.benchHasherMurmur3_128            517  avgt    5   908.452 ±  27.860  ns/op
[java] HashingBench.benchHasherMurmur3_128           2041  avgt    5  2830.657 ± 225.470  ns/op
[java] HashingBench.benchMessageDigestMD5              31  avgt    5   484.350 ± 474.141  ns/op
[java] HashingBench.benchMessageDigestMD5             131  avgt    5  1059.691 ±  53.677  ns/op
[java] HashingBench.benchMessageDigestMD5             517  avgt    5  2557.586 ± 319.597  ns/op
[java] HashingBench.benchMessageDigestMD5            2041  avgt    5  8585.662 ± 135.474  ns/op
{noformat}
Either way, the Guava hasher is faster.
In other news, the Guava MD5 implementation uses {{MessageDigest}} under the
covers, so I think the hash results from the Guava MD5 hasher and from
{{MessageDigest}} should be the same. [~mkjellman] can you confirm?
> Replace usages of MessageDigest with Guava's Hasher
> ---------------------------------------------------
>
> Key: CASSANDRA-13291
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13291
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Michael Kjellman
> Assignee: Michael Kjellman
> Attachments: CASSANDRA-13291-trunk.diff
>
>
> During my profiling of C* I frequently see lots of aggregate time across
> threads being spent inside the MD5 MessageDigest implementation. Given that
> there are tons of modern alternative hashing functions better than MD5
> available -- both in terms of providing better collision resistance and
> actual computational speed -- I wanted to switch out our usage of MD5 for
> alternatives (like adler128 or murmur3_128) and test for performance
> improvements.
> Unfortunately, I found that because we use MessageDigest directly
> everywhere, switching the hashing function to something like adler128 or
> murmur3_128 (neither of which ships with the JDK) wasn't straightforward.
> The goal of this ticket is to propose switching out usages of MessageDigest
> directly in favor of Hasher from Guava. This means going forward we can
> change a single line of code to switch the hashing algorithm being used
> (assuming there is an implementation in Guava).