[ 
https://issues.apache.org/jira/browse/CASSANDRA-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073832#comment-13073832
 ] 

Brian Lindauer commented on CASSANDRA-2975:
-------------------------------------------

You weren't kidding about compatibility with old data files not being simple. 
It actually turned out to be fairly major surgery. The original changes just to 
support Murmur3 are here:

https://github.com/lindauer/cassandra/commit/cea6068a4a3e5d7d9509335394f9ef3350d37e93

The additional proposed changes to support backward compatibility are at:

https://github.com/lindauer/cassandra/commit/9d7479675752a07732f434b307be6642d8b3e85f

I can't say I'm completely satisfied with these changes. It feels like we 
should unify with LegacyBloomFilter now that there are 3 versions. It also 
feels like all of the places where a serializer is selected based on a 
Descriptor version/flag could be moved under one roof, where callers just pass 
the Descriptor and it returns the correct serializer instance. But, not being 
too familiar with Cassandra, I was trying to be minimally invasive for fear of 
breaking something.
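
To make the "one roof" idea concrete, here is a minimal sketch of what I have 
in mind. Every name in it (FilterSerializers, FilterFormat, Descriptor and its 
two flags) is hypothetical and stands in for the real classes; it is not code 
from either commit, just an illustration of callers passing a Descriptor and 
getting back the matching format.

```java
public class FilterSerializers {
    /** Hypothetical tags for the three on-disk bloom filter formats. */
    public enum FilterFormat { LEGACY, MURMUR2, MURMUR3 }

    /** Hypothetical stand-in for the version flags carried by a Descriptor. */
    public static final class Descriptor {
        final boolean usesLegacyBloomFilter; // oldest format
        final boolean usesMurmur3;           // newest format

        public Descriptor(boolean usesLegacyBloomFilter, boolean usesMurmur3) {
            this.usesLegacyBloomFilter = usesLegacyBloomFilter;
            this.usesMurmur3 = usesMurmur3;
        }
    }

    /** Single place where the format is chosen; callers only pass the Descriptor. */
    public static FilterFormat forDescriptor(Descriptor d) {
        if (d.usesLegacyBloomFilter) {
            return FilterFormat.LEGACY;
        }
        return d.usesMurmur3 ? FilterFormat.MURMUR3 : FilterFormat.MURMUR2;
    }
}
```

With something like this, the version checks that are currently scattered 
across call sites would collapse into forDescriptor(), and adding a fourth 
format would touch one method.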

All of the existing tests pass, but I haven't added any new ones, for example 
a test that checks that old data files can still be read. Like I said, I'm not 
very familiar with Cassandra, so you should review these changes carefully. 
(I'm sure you would anyway.)


> Upgrade MurmurHash to version 3
> -------------------------------
>
>                 Key: CASSANDRA-2975
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2975
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.8.3
>            Reporter: Brian Lindauer
>            Priority: Trivial
>              Labels: lhf
>
> MurmurHash version 3 was finalized on June 3. It provides an enormous speedup 
> and increased robustness over version 2, which is implemented in Cassandra. 
> Information here:
> http://code.google.com/p/smhasher/
> The reference implementation is here:
> http://code.google.com/p/smhasher/source/browse/trunk/MurmurHash3.cpp?spec=svn136&r=136
> I have already done the work to port the (public domain) reference 
> implementation to Java in the MurmurHash class and updated the BloomFilter 
> class to use the new implementation:
> https://github.com/lindauer/cassandra/commit/cea6068a4a3e5d7d9509335394f9ef3350d37e93
> Apart from the faster hash time, the new version only requires one call to 
> hash() rather than 2, since it returns 128 bits of hash instead of 64.
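
To illustrate why one 128-bit hash is enough, here is a rough sketch of how a 
bloom filter can derive all of its bucket indexes from two 64-bit halves. It 
assumes the common double-hashing construction (index_i = h1 + i * h2 mod 
numBits); the class and method names are hypothetical, the two halves are 
passed in directly rather than computed by MurmurHash, and the patch may 
combine them differently.

```java
public class DoubleHashing {
    /**
     * Derives k bucket indexes from the two 64-bit halves of one 128-bit hash.
     * Assumed scheme: index_i = (h1 + i * h2) mod numBits.
     */
    public static long[] indexes(long h1, long h2, int k, long numBits) {
        long[] result = new long[k];
        for (int i = 0; i < k; i++) {
            // floorMod keeps the index non-negative even if the sum overflows
            // or the hash halves are negative.
            result[i] = Math.floorMod(h1 + (long) i * h2, numBits);
        }
        return result;
    }
}
```

For example, indexes(10, 3, 4, 7) yields 3, 6, 2, 5.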

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
