[ https://issues.apache.org/jira/browse/CASSANDRA-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073310#comment-13073310 ]
Brian Lindauer commented on CASSANDRA-2975:
-------------------------------------------
Summary:
{code}
Mean fp_ratio for version 2:
LongBloomFilterTest: 0.997967059178744
LongLegacyBloomFilterTest: 0.997908061594203
Mean fp_ratio for version 3:
LongBloomFilterTest: 0.998045621980676
LongLegacyBloomFilterTest: 0.998863888888889
{code}
Details:
{code}
Version 2:
[echo] running long tests
[junit] WARNING: multiple versions of ant detected in path for junit
[junit] jar:file:/usr/share/ant/lib/ant.jar!/org/apache/tools/ant/Project.class
[junit] and jar:file:/Users/jbl/git/cassandra/build/lib/jars/ant-1.6.5.jar!/org/apache/tools/ant/Project.class
[junit] Testsuite: org.apache.cassandra.utils.LongBloomFilterTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 106.213 sec
[junit]
[junit] ------------- Standard Error -----------------
[junit] fp_ratio = 0.9973043478260869
[junit] fp_ratio = 0.9965793478260869
[junit] fp_ratio = 0.9996123188405797
[junit] fp_ratio = 1.0004746376811595
[junit] fp_ratio = 0.998409420289855
[junit] fp_ratio = 0.9920978260869565
[junit] fp_ratio = 0.9979420289855072
[junit] fp_ratio = 0.9940797101449276
[junit] fp_ratio = 0.9983913043478261
[junit] fp_ratio = 1.0006159420289855
[junit] fp_ratio = 1.0000362318840579
[junit] fp_ratio = 1.0000615942028985
[junit] ------------- ---------------- ---------------
mean = 0.997967059178744
[junit] Testsuite: org.apache.cassandra.utils.LongLegacyBloomFilterTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 61.721 sec
[junit]
[junit] ------------- Standard Error -----------------
[junit] fp_ratio = 0.998095652173913
[junit] fp_ratio = 0.9982576086956522
[junit] fp_ratio = 0.999159420289855
[junit] fp_ratio = 1.0001340579710145
[junit] fp_ratio = 1.0011557971014493
[junit] fp_ratio = 0.9967717391304348
[junit] fp_ratio = 0.9955978260869566
[junit] fp_ratio = 0.9989673913043479
[junit] fp_ratio = 0.9966231884057971
[junit] fp_ratio = 0.9973514492753623
[junit] fp_ratio = 0.9969855072463768
[junit] fp_ratio = 0.9957971014492754
[junit] ------------- ---------------- ---------------
mean = 0.997908061594203
Version 3:
[echo] running long tests
[junit] WARNING: multiple versions of ant detected in path for junit
[junit] jar:file:/usr/share/ant/lib/ant.jar!/org/apache/tools/ant/Project.class
[junit] and jar:file:/Users/jbl/git/cassandra/build/lib/jars/ant-1.6.5.jar!/org/apache/tools/ant/Project.class
[junit] Testsuite: org.apache.cassandra.utils.LongBloomFilterTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 75.994 sec
[junit]
[junit] ------------- Standard Error -----------------
[junit] fp_ratio = 0.9986532608695652
[junit] fp_ratio = 0.997158695652174
[junit] fp_ratio = 0.9995797101449275
[junit] fp_ratio = 0.9995
[junit] fp_ratio = 0.9984565217391305
[junit] fp_ratio = 0.9987101449275362
[junit] fp_ratio = 0.9979528985507247
[junit] fp_ratio = 0.9998224637681159
[junit] fp_ratio = 0.9938876811594203
[junit] fp_ratio = 0.9993623188405797
[junit] fp_ratio = 0.9953369565217391
[junit] fp_ratio = 0.9981268115942029
[junit] ------------- ---------------- ---------------
mean = 0.998045621980676
[junit] Testsuite: org.apache.cassandra.utils.LongLegacyBloomFilterTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 60.999 sec
[junit]
[junit] ------------- Standard Error -----------------
[junit] fp_ratio = 0.998095652173913
[junit] fp_ratio = 0.9983760869565217
[junit] fp_ratio = 0.9993043478260869
[junit] fp_ratio = 0.9996159420289855
[junit] fp_ratio = 0.9980217391304348
[junit] fp_ratio = 1.0016920289855074
[junit] fp_ratio = 0.9953623188405797
[junit] fp_ratio = 0.9968188405797102
[junit] fp_ratio = 0.9947173913043478
[junit] fp_ratio = 1.000695652173913
[junit] fp_ratio = 1.0030760869565218
[junit] fp_ratio = 1.0005905797101449
[junit] ------------- ---------------- ---------------
mean = 0.998863888888889
{code}
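For anyone who wants to re-check the "mean =" lines, here is a minimal sketch of how they can be recomputed from the junit output. The class name FpRatioMean and its methods are hypothetical helpers written for this note, not part of the Cassandra tree; the only assumption about the log is that each sample appears on a line containing "fp_ratio = <number>", as in the output above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Cassandra: averages fp_ratio samples from junit output.
public class FpRatioMean {
    // Matches lines such as "[junit] fp_ratio = 0.9973043478260869".
    private static final Pattern FP_RATIO =
        Pattern.compile("fp_ratio\\s*=\\s*([0-9]+(?:\\.[0-9]+)?(?:[eE][-+]?[0-9]+)?)");

    /** Extracts every fp_ratio value found in the given log lines, in order. */
    static List<Double> parse(List<String> logLines) {
        List<Double> ratios = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = FP_RATIO.matcher(line);
            if (m.find()) {
                ratios.add(Double.parseDouble(m.group(1)));
            }
        }
        return ratios;
    }

    /** Arithmetic mean of the samples; rejects an empty list. */
    static double mean(List<Double> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("no fp_ratio values found");
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.size();
    }

    public static void main(String[] args) {
        // First three fp_ratio lines of the version 2 LongBloomFilterTest run.
        List<String> sample = List.of(
            "[junit] fp_ratio = 0.9973043478260869",
            "[junit] fp_ratio = 0.9965793478260869",
            "[junit] fp_ratio = 0.9996123188405797");
        System.out.println("mean = " + mean(parse(sample)));
    }
}
```

Feeding all twelve fp_ratio lines of one test suite to parse() and then mean() gives the per-suite figure reported in the summary.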
> Upgrade MurmurHash to version 3
> -------------------------------
>
> Key: CASSANDRA-2975
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2975
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.8.3
> Reporter: Brian Lindauer
> Priority: Trivial
> Labels: lhf
>
> MurmurHash version 3 was finalized on June 3. It provides an enormous speedup
> and increased robustness over version 2, which is implemented in Cassandra.
> Information here:
> http://code.google.com/p/smhasher/
> The reference implementation is here:
> http://code.google.com/p/smhasher/source/browse/trunk/MurmurHash3.cpp?spec=svn136&r=136
> I have already done the work to port the (public domain) reference
> implementation to Java in the MurmurHash class and updated the BloomFilter
> class to use the new implementation:
> https://github.com/lindauer/cassandra/commit/cea6068a4a3e5d7d9509335394f9ef3350d37e93
> Apart from hashing faster, the new version requires only one call to
> hash() rather than two, since it returns 128 bits of hash instead of 64.
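To illustrate why one 128-bit result can replace two 64-bit calls, here is a rough sketch of the usual double-hashing scheme for Bloom filters, where bucket i is derived as (h1 + i * h2) mod bucketCount. This is an assumption about the general technique, not a copy of the BloomFilter code in the linked commit. placeholderHash128 is a made-up stand-in so the example runs on its own; it is NOT MurmurHash3, and all names below are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: derive several Bloom filter buckets from one 128-bit hash.
public class DoubleHashingSketch {
    /**
     * Placeholder 128-bit hash returning two 64-bit halves.
     * NOT MurmurHash3; a simple stand-in so the example is self-contained.
     */
    static long[] placeholderHash128(byte[] key) {
        long h1 = 0x9E3779B97F4A7C15L;
        long h2 = 0xC2B2AE3D27D4EB4FL;
        for (byte b : key) {
            h1 = mix(h1 ^ (b & 0xFF));
            h2 = mix(h2 + (b & 0xFF) + h1);
        }
        return new long[] { h1, h2 };
    }

    /** Generic 64-bit mixing step built from xor-shifts and multiplications. */
    static long mix(long k) {
        k ^= k >>> 33;
        k *= 0xff51afd7ed558ccdL;
        k ^= k >>> 33;
        k *= 0xc4ceb9fe1a85ec53L;
        k ^= k >>> 33;
        return k;
    }

    /** Derives hashCount bucket indexes in [0, bucketCount) from the two halves. */
    static long[] bucketIndexes(long[] hash128, int hashCount, long bucketCount) {
        long h1 = hash128[0];
        long h2 = hash128[1];
        long[] indexes = new long[hashCount];
        for (int i = 0; i < hashCount; i++) {
            // h1 + i*h2 may overflow and wrap; floorMod keeps the result non-negative.
            indexes[i] = Math.floorMod(h1 + i * h2, bucketCount);
        }
        return indexes;
    }

    public static void main(String[] args) {
        byte[] key = "example-key".getBytes(StandardCharsets.UTF_8);
        long[] hash = placeholderHash128(key);   // one hash call per key
        System.out.println(Arrays.toString(bucketIndexes(hash, 5, 1_000_000L)));
    }
}
```

With a 64-bit hash, h1 and h2 would each need their own call (for example with different seeds); with 128 bits of output, both halves come from a single pass over the key.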