[ 
https://bro-tracker.atlassian.net/browse/BIT-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13316#comment-13316
 ] 

Robin Sommer commented on BIT-1039:
-----------------------------------

Merged, thanks.

However, we still need to address the hash problem to support merging across 
Bro instances, so I'm leaving the ticket open for that. Here's a proposal for 
what to do (after talking to Bernhard):

1. Change {{CompositeHash}} to optionally use a custom {{H3}} instance.

2. Extend the {{Hasher}} to take both a name and an optional seed value 
(probably another string). Internally, it combines the two into the seed for 
the {{Hasher}}'s internal H3, i.e., same name+seed means same hash functions. 
If the optional seed value is not given, take it from a global script level 
variable {{GLOBAL_INSTALLATION_SEED}} (or so :).

3. Change {{BloomFilterVal}} to pass a custom {{H3}} to {{CompositeHash}}. I 
believe this could be the same instance that {{Hasher}} is using internally, so 
that we get the same consistency guarantees. Indeed, hashing of {{Val}} should 
then probably move into {{Hasher}}.

4. Along the same lines as (2), extend the Bloom filter interface to take both 
a name and the optional seed. Same name+seed means filters can be merged.

5. Change BroControl to redefine {{GLOBAL_INSTALLATION_SEED}} to a 
non-predictable value that will remain consistent across {{install}} runs.
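
To make the name+seed idea above concrete, here is a minimal sketch in Python. It is not Bro code: SHA-256 stands in for H3, and {{derive_seed}}, the {{Hasher}} constructor signature, and the value of {{GLOBAL_INSTALLATION_SEED}} are assumptions made purely for illustration.

```python
import hashlib

# Hypothetical stand-in for the script-level variable from the proposal.
GLOBAL_INSTALLATION_SEED = "example-installation-seed"


def derive_seed(name, seed=None):
    """Combine a name and an optional seed string into a 64-bit integer."""
    if seed is None:
        # No explicit seed given: fall back to the installation-wide value.
        seed = GLOBAL_INSTALLATION_SEED
    # Length-prefix the name so ("ab", "c") and ("a", "bc") differ.
    material = f"{len(name)}:{name}{seed}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(material).digest()[:8], "big")


class Hasher:
    """Hashes a value k times; same name+seed means same hash functions."""

    def __init__(self, k, name, seed=None):
        self.k = k
        self.seed = derive_seed(name, seed)

    def hashes(self, data):
        """Return k 64-bit hash values for the given bytes."""
        key = self.seed.to_bytes(8, "big")
        out = []
        for i in range(self.k):
            # SHA-256 is used here only as a placeholder for H3.
            digest = hashlib.sha256(key + i.to_bytes(4, "big") + data).digest()
            out.append(int.from_bytes(digest[:8], "big"))
        return out
```

Two hashers built with the same name and no explicit seed agree on every input as long as they see the same {{GLOBAL_INSTALLATION_SEED}}, which is the property the cluster case relies on.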

I believe that with this we can support two use cases: (1) in a cluster, all 
Bloom filters created with the same name but without any further seed value 
will be compatible (because they'll use {{GLOBAL_INSTALLATION_SEED}}); and (2) 
externally provided Bloom filters can specify their own seed so that any Bro 
installation can pull them in. 
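
The two use cases can be illustrated with a toy Bloom filter in Python. This is a sketch under assumptions, not the {{BloomFilterVal}} implementation: the class, its {{merge}} and {{compatible}} methods, the SHA-256-based hashing, and the seed value are all hypothetical.

```python
import hashlib

# Hypothetical stand-in for the script-level variable from the proposal.
GLOBAL_INSTALLATION_SEED = "example-installation-seed"


class BloomFilter:
    """Toy basic Bloom filter whose hash functions depend on name+seed."""

    def __init__(self, num_bits, k, name, seed=None):
        self.num_bits = num_bits
        self.k = k
        self.name = name
        # Without an explicit seed, use the installation-wide one.
        self.seed = GLOBAL_INSTALLATION_SEED if seed is None else seed
        self.bits = 0  # bit vector stored in a Python int

    def _positions(self, item):
        prefix = f"{len(self.name)}:{self.name}{self.seed}".encode("utf-8")
        for i in range(self.k):
            d = hashlib.sha256(prefix + i.to_bytes(4, "big") + item).digest()
            yield int.from_bytes(d[:8], "big") % self.num_bits

    def add(self, item):
        for p in self._positions(item):
            self.bits |= 1 << p

    def __contains__(self, item):
        return all((self.bits >> p) & 1 for p in self._positions(item))

    def compatible(self, other):
        """Filters are mergeable only if they share hash functions and size."""
        return (self.name, self.seed, self.num_bits, self.k) == (
            other.name, other.seed, other.num_bits, other.k)

    def merge(self, other):
        if not self.compatible(other):
            raise ValueError("filters use different name/seed/parameters")
        merged = BloomFilter(self.num_bits, self.k, self.name, self.seed)
        merged.bits = self.bits | other.bits  # union of both bit vectors
        return merged


# Use case (1): same name, no explicit seed, so both use the global seed.
worker1 = BloomFilter(1024, 3, "seen-hosts")
worker2 = BloomFilter(1024, 3, "seen-hosts")
worker1.add(b"10.0.0.1")
worker2.add(b"10.0.0.2")
combined = worker1.merge(worker2)

# Use case (2): an externally provided filter names its own seed.
external = BloomFilter(1024, 3, "seen-hosts", seed="shared-feed-seed")
```

Here {{combined}} contains both addresses, while merging {{worker1}} with {{external}} raises an error because the seeds differ; an installation that wants to pull in the external filter would create its local filter with the same explicit seed.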

Does this make sense?
                
> Merge request for Bloom filters
> -------------------------------
>
>                 Key: BIT-1039
>                 URL: https://bro-tracker.atlassian.net/browse/BIT-1039
>             Project: Bro Issue Tracker
>          Issue Type: New Feature
>          Components: Bro
>            Reporter: Matthias Vallentin
>            Priority: Medium
>             Fix For: 2.2
>
>
> The Bloom filter implementation in `topic/matthias/bloom-filter` is ready to 
> merge into master. Have a look at the very end of `bro.bif` for the 
> script-land interface.
> Internally, we have a new `BloomFilterVal`, which is serializable and 
> mergeable and thus ready for cluster use. This `Val` contains a polymorphic 
> Bloom filter instance, which hides the concrete Bloom filter type (currently 
> only basic and counting). Moreover, this branch introduces the notion of 
> ''hashers'', which are parameterizable (i.e., seedable) structures for 
> hashing values ''k'' times. I recall that Bernhard is waiting for this feature. 
> See `Hasher.h` for the documented interface.
> In the future, we need to rethink how to construct hash functions which only 
> depend on a seed given at script land. This will be important when sharing 
> Bloom filters across organizational boundaries. At this point, the 
> implementation relies on `CompHash` (at least for composite values, such as 
> records) which itself depends on the initial Bro seed generated at startup 
> time or when the user specifies the environment variable `$BRO_SEED`.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://bro-tracker.atlassian.net/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
_______________________________________________
bro-dev mailing list
[email protected]
http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev
