[
https://bro-tracker.atlassian.net/browse/BIT-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13316#comment-13316
]
Robin Sommer commented on BIT-1039:
-----------------------------------
Merged, thanks.
However, we need to address the hash problem to support merging across Bro
instances, so I'm leaving the ticket open for that. Here's a proposal for what
to do (after talking to Bernhard):
1. Change {{CompositeHash}} to optionally use a custom {{H3}} instance.
2. Extend the {{Hasher}} to take both a name and an optional seed value
(probably another string). Internally, it combines the two into the seed for
the {{Hasher}}'s internal {{H3}}, i.e., same name+seed means same hash functions.
If the optional seed value is not given, take it from a global script level
variable {{GLOBAL_INSTALLATION_SEED}} (or so :).
3. Change {{BloomFilterVal}} to pass to {{CompositeHash}} a custom {{H3}}. I
believe this could be the same instance that {{Hasher}} is using internally, so
that we get the same consistency guarantees. Indeed, hashing of {{Val}} should
then probably move into {{Hasher}}.
4. Along the same lines as (2), extend the Bloom filter interface to take both
a name and the optional seed. Same name+seed means filters can be merged.
5. Change BroControl to redefine {{GLOBAL_INSTALLATION_SEED}} to a
non-predictable value that will remain consistent across runs of {{install}}.
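To make the name+seed idea in (2) concrete, here is a rough sketch of how the two strings could be folded into one seed value. Everything in it is an assumption for illustration: {{derive_hasher_seed}} and the {{global_installation_seed}} variable are hypothetical names, and FNV-1a is only a deterministic placeholder, not H3 and not what {{Hasher}} does today.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the script-level GLOBAL_INSTALLATION_SEED.
static std::string global_installation_seed = "example-installation-seed";

// FNV-1a over a byte string, continuing from state h.
// Placeholder only: any deterministic hash would do for this sketch.
inline uint64_t fnv1a(const std::string& data,
                      uint64_t h = 14695981039346656037ULL)
{
    for (unsigned char c : data) {
        h ^= c;
        h *= 1099511628211ULL;
    }
    return h;
}

// Combine name and explicit seed into one value that would seed the
// hasher's internal hash functions. A NUL separator keeps ("ab", "c")
// and ("a", "bc") from mapping to the same byte sequence.
inline uint64_t derive_hasher_seed(const std::string& name,
                                   const std::string& seed)
{
    uint64_t h = fnv1a(name);
    h = fnv1a(std::string(1, '\0'), h);
    return fnv1a(seed, h);
}

// No explicit seed given: fall back to the installation-wide seed.
inline uint64_t derive_hasher_seed(const std::string& name)
{
    return derive_hasher_seed(name, global_installation_seed);
}
```

With something like this, two instances that agree on the name and on the seed (explicit or installation-wide) end up with identical hash functions, which is the property the merge check in (4) would rely on.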
I believe that with this we can support two use cases: (1) in a cluster, all
Bloom filters created with the same name but without any further seed value
will be compatible (because they'll use {{GLOBAL_INSTALLATION_SEED}}); and (2)
externally provided Bloom filters can specify their own seed so that any Bro
installation can pull them in.
Does this make sense?
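As a toy illustration of the "same name+seed means mergeable" rule, the sketch below uses a made-up {{ToyBloomFilter}} type. It is not {{BloomFilterVal}}; the class, its members, the choice of three hash functions, and the FNV-1a placeholder hash are all assumptions made only to show the compatibility check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Toy filter: NOT Bro's BloomFilterVal, just enough to show the merge rule.
struct ToyBloomFilter {
    std::string name;
    std::string seed;
    std::vector<bool> bits;

    ToyBloomFilter(const std::string& n, const std::string& s, std::size_t cells)
        : name(n), seed(s), bits(cells, false) {}

    // i-th hash of key, parameterized by name and seed (FNV-1a placeholder).
    std::size_t cell(unsigned i, const std::string& key) const
    {
        std::string input = name + '\0' + seed + '\0' + char('0' + i) + key;
        uint64_t h = 14695981039346656037ULL;
        for (unsigned char c : input) {
            h ^= c;
            h *= 1099511628211ULL;
        }
        return static_cast<std::size_t>(h % bits.size());
    }

    void add(const std::string& key)
    {
        for (unsigned i = 0; i < 3; ++i)
            bits[cell(i, key)] = true;
    }

    bool contains(const std::string& key) const
    {
        for (unsigned i = 0; i < 3; ++i)
            if (!bits[cell(i, key)])
                return false;
        return true;
    }

    // Filters are compatible only if name, seed, and size all match.
    bool can_merge(const ToyBloomFilter& o) const
    {
        return name == o.name && seed == o.seed && bits.size() == o.bits.size();
    }

    // Bitwise OR of the two filters; refuses incompatible filters.
    bool merge(const ToyBloomFilter& o)
    {
        if (!can_merge(o))
            return false;
        for (std::size_t i = 0; i < bits.size(); ++i)
            bits[i] = bits[i] || o.bits[i];
        return true;
    }
};
```

In use case (1) both filters would carry the installation-wide seed implicitly; in use case (2) the externally provided filter would carry its own seed, and the importing side would have to construct its filter with that same seed before merging.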
> Merge request for Bloom filters
> -------------------------------
>
> Key: BIT-1039
> URL: https://bro-tracker.atlassian.net/browse/BIT-1039
> Project: Bro Issue Tracker
> Issue Type: New Feature
> Components: Bro
> Reporter: Matthias Vallentin
> Priority: Medium
> Fix For: 2.2
>
>
> The Bloom filter implementation in `topic/matthias/bloom-filter` is ready to
> merge into master. Have a look at the very end of `bro.bif` for the
> script-land interface.
> Internally, we have a new `BloomFilterVal`, which is serializable and
> mergeable and thus ready for cluster use. This `Val` contains a polymorphic
> Bloom filter instance, which hides the concrete Bloom filter type (currently
> only basic and counting). Moreover, this branch introduces the notion of
> ''hashers'', which are parameterizable (i.e., seedable) structures for
> hashing values ''k'' times. I recall that Bernhard is waiting for this feature.
> See `Hasher.h` for the documented interface.
> In the future, we need to rethink how to construct hash functions which only
> depend on a seed given at script land. This will be important when sharing
> Bloom filters across organizational boundaries. At this point, the
> implementation relies on `CompHash` (at least for composite values, such as
> records) which itself depends on the initial Bro seed generated at startup
> time or when the user specifies the environment variable `$BRO_SEED`.