[
https://issues.apache.org/jira/browse/COLLECTIONS-728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000951#comment-17000951
]
Gilles Sadowski commented on COLLECTIONS-728:
---------------------------------------------
{quote}properties are now tracked in the HashFunctionIdentity.
{quote}
Perhaps {{HashFunctionProperties}} or {{HashFunctionSpecification}} would be a
better name (?).
{quote}An equals method on the HashFunction would require the HashFunction
object be in some sense Serializable.
{quote}
IIUC (?), a hash function can be implemented by a third party (not
abiding by any of the APIs to be defined here).
Then to be usable within the {{BloomFilter}} framework defined here, an
application developer will need
* to wrap the function's "properties" in a class (say,
{{MyHashFunctionIdentity}}) that implements {{HashFunctionIdentity}}
* to define a serialization scheme for {{MyHashFunctionIdentity}} (if the
application is distributed).
If so, why not have a {{MyHashFunction}} class that wraps the full third-party
hash function rather than just its "properties"?
We could have, in [Collections]
{code:java}
/**
 * Interface for implementations used within this framework.
 */
public interface HashFunction {
    /**
     * Returns {@code true} when from a given input, this instance
     * computes the same output as the {@code other} instance.
     */
    boolean isCompatible(HashFunction other);

    /**
     * Computes the hash value.
     *
     * @param input Input.
     * @param seed Seed.
     * @return the hash.
     */
    long compute(byte[] input, int seed);
}
{code}
As simple as can be (?).
And, the application developer would be responsible for implementing the notion of
"compatibility" (in the same way that he is responsible for computing the hash
value, including issues arising from using a buggy function):
{code:java}
import java.io.Serializable;
import com.thirdparty.hash.NiceFunction;
import org.apache.commons.collections.bloomfilter.HashFunction;

public class MyHash implements HashFunction, Serializable {
    // Fixed probe input and seed used to compare two functions.
    private static final byte[] COMP_TEST_A = new byte[] {-19, 45, -34, 65, 1, 22, 17, 74};
    private static final int COMP_TEST_B = -1395561;
    private static final long serialVersionUID = 123456789L;

    private NiceFunction f; // Assuming that "NiceFunction" is "Serializable".

    public MyHash(NiceFunction f) {
        this.f = f;
    }

    @Override
    public boolean isCompatible(HashFunction other) {
        if (other instanceof MyHash) {
            return true;
        } else {
            // Compatible if both functions agree on the probe.
            return Long.compare(compute(COMP_TEST_A, COMP_TEST_B),
                                other.compute(COMP_TEST_A, COMP_TEST_B)) == 0;
        }
    }

    @Override
    public long compute(byte[] input, int seed) {
        // ...
    }
}
{code}
Then, [Collections] could provide wrappers for the functions implemented in
[Codec]. Casual users will get new functions at release time, while not
preventing power users from defining their own wrappers around experimental and
non-standard functions.
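A minimal sketch of what such wrappers might look like, assuming the {{HashFunction}} interface proposed above (repeated here so that the example compiles on its own). To stay self-contained, the JDK checksums {{java.util.zip.CRC32}} and {{java.util.zip.Adler32}} stand in for the [Codec] functions; the class names and the way the seed is mixed in (prepended to the input) are purely illustrative assumptions, not a proposed design.

```java
import java.io.Serializable;
import java.nio.ByteBuffer;
import java.util.zip.Adler32;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

class WrapperSketch {
    /** The interface proposed above, repeated so that the sketch is self-contained. */
    interface HashFunction {
        boolean isCompatible(HashFunction other);
        long compute(byte[] input, int seed);
    }

    /** Hypothetical base wrapper; a JDK checksum stands in for a [Codec] function. */
    abstract static class ChecksumHash implements HashFunction, Serializable {
        private static final long serialVersionUID = 1L;
        // Fixed probe used to compare two functions (same values as in the example above).
        private static final byte[] PROBE_INPUT = {-19, 45, -34, 65, 1, 22, 17, 74};
        private static final int PROBE_SEED = -1395561;

        /** Returns a fresh checksum for each call, so that wrappers hold no mutable state. */
        abstract Checksum newChecksum();

        @Override
        public long compute(byte[] input, int seed) {
            Checksum c = newChecksum();
            // Assumption: the seed is mixed in by feeding its 4 bytes before the input.
            byte[] seedBytes = ByteBuffer.allocate(Integer.BYTES).putInt(seed).array();
            c.update(seedBytes, 0, seedBytes.length);
            c.update(input, 0, input.length);
            return c.getValue();
        }

        @Override
        public boolean isCompatible(HashFunction other) {
            // Heuristic: agreeing on one probe does not prove that the functions are identical.
            return compute(PROBE_INPUT, PROBE_SEED) == other.compute(PROBE_INPUT, PROBE_SEED);
        }
    }

    static final class Crc32Hash extends ChecksumHash {
        private static final long serialVersionUID = 1L;

        @Override
        Checksum newChecksum() {
            return new CRC32();
        }
    }

    static final class Adler32Hash extends ChecksumHash {
        private static final long serialVersionUID = 1L;

        @Override
        Checksum newChecksum() {
            return new Adler32();
        }
    }

    public static void main(String[] args) {
        HashFunction crc = new Crc32Hash();
        HashFunction adler = new Adler32Hash();
        System.out.println("CRC32 vs CRC32:   " + crc.isCompatible(new Crc32Hash()));
        System.out.println("CRC32 vs Adler32: " + crc.isCompatible(adler));
    }
}
```

With this layout, a power user would only need to supply another small subclass (or a separate {{HashFunction}} implementation) for an experimental function, and the probe-based check would still let two independently written wrappers recognize each other.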
Or am I completely off base?
bq. I would not expect the actual code for the function to be sent. This
would probably require that the listener be a Java based application so that it
could have the HashFunction implementation.
I'm confused. Where is the hash computation used?
> BloomFilter contribution
> ------------------------
>
> Key: COLLECTIONS-728
> URL: https://issues.apache.org/jira/browse/COLLECTIONS-728
> Project: Commons Collections
> Issue Type: Task
> Reporter: Claude Warren
> Priority: Minor
> Attachments: BF_Func.md, BloomFilter.java, BloomFilterI2.java,
> Usage.md
>
>
> Contribution of BloomFilter library comprising base implementation and gated
> collections.