The other day when merging Johanna's code to clusterize the configuration framework, I noticed this code in there:

    # [Send id=val to everyone else]
    Broker::publish(change_topic, Config::cluster_set_option, ID, val, location);

    if ( Cluster::local_node_type() != Cluster::MANAGER )
        Broker::relay(change_topic, change_topic, Config::cluster_set_option, ID, val, location);

It took me a bit to understand that ... The goal here is that a change in a configuration value gets propagated out to all nodes in the cluster. Broker::publish() sends it to a node's immediate neighbors, but not further. That means that for workers it goes (only) to their manager; for the manager, it goes to all workers. If we're not a manager, we then separately (through Broker::relay()) ask our neighbors (that's the manager) to forward the change to *their* neighbors (that's the other workers), without re-raising it locally.

I remember we have discussed this API before, but I wanted to bring it up again as I keep finding it confusing. I believe the code above could be simplified by using the newer Broker::publish_and_relay(), which was added to combine the two operations. Still, I'm realizing now that I don't like thinking about this in terms of separate publishing and relaying operations. None of this will get easier once we add multi-hop routing to the mix (which is in the works). And on top of all that, we also have Cluster::publish_rr, Cluster::publish_hrw, Cluster::relay_rr, and Cluster::relay_hrw -- another set of separate publishing & relay options.

I'm wondering if we should give it another try to simplify this API while we still can (i.e., before 2.6 goes out).

To me, the most intuitive publish operation is "send to topic T and propagate to everybody subscribed to that topic". I'd structure the API around that and simply make it the main publish function:

    Broker::publish(topic, args);

That would send to all neighbors, which then process locally and relay to their neighbors.
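
To make the difference concrete, here is a rough toy model in Python (not Bro script, and not how Broker is implemented). It assumes a hypothetical star topology in which workers peer only with the manager, and it assumes a relaying node does not send the event back to the node it came from. The names PEERS, publish(), relay(), and publish_propagating() are made up for illustration; each function just returns the set of nodes that would end up raising the event.

```python
from collections import deque

# Hypothetical star topology: every worker peers only with the manager.
PEERS = {
    "manager": ["worker-1", "worker-2", "worker-3"],
    "worker-1": ["manager"],
    "worker-2": ["manager"],
    "worker-3": ["manager"],
}


def publish(sender):
    """Today's one-hop publish: only the sender's immediate neighbors raise the event."""
    return set(PEERS[sender])


def relay(sender):
    """Today's relay: neighbors forward to *their* neighbors without raising it themselves.

    Assumption: the event is not sent back to the original sender.
    """
    receivers = set()
    for hop in PEERS[sender]:
        receivers.update(peer for peer in PEERS[hop] if peer != sender)
    return receivers


def publish_propagating(sender):
    """Proposed default: every receiver raises the event and forwards it on.

    Modeled as a breadth-first flood with a 'seen' set so no node handles
    the event twice.
    """
    seen = {sender}
    queue = deque([sender])
    while queue:
        node = queue.popleft()
        for peer in PEERS[node]:
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return seen - {sender}


if __name__ == "__main__":
    # A worker needs both calls today to reach the whole cluster ...
    print(sorted(publish("worker-1") | relay("worker-1")))
    # ... whereas a propagating publish would get there in one call.
    print(sorted(publish_propagating("worker-1")))
```

In this topology both approaches reach the same set of nodes; the point is only that the second one needs no special-casing of the sender's node type.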

Right now, such a publish would propagate across just one hop, but once we have multi-hop it'd start being broadcast out broadly.

To support the other use cases, we can then add modifiers & functions to tweak this default, e.g.:

- Give publish() another argument "relay: bool &default=T" to prevent it from going beyond the immediate receiver. Or maybe instead: "relay_hops: int &default=-1" to specify the max number of hops to relay across, with -1 meaning no limit. (I recall concerns about loops being too easy to create; we could set the default here to F/0 to default to no forwarding, although conceptually I don't really like that :-)

- Give publish() another argument "relay_topic: string &default=""" to change the topic when relaying on the 1st hop.

- Give publish() another argument "process_on_relays: bool &default=T" to change whether a relaying hop also sees the event locally.

- Add a second function publish_pool() that has all the same options, but receives a pool type instead of a topic (just an enum: RR, HRW).

What I'm not quite sure about is whether some of these modifiers are better left for the receiver to specify (e.g., whether to raise events received on a given topic locally, or just forward them). I think I can see it either way.

Robin

-- 
Robin Sommer * Corelight, Inc. * ro...@corelight.com * www.corelight.com

_______________________________________________
bro-dev mailing list
bro-dev@bro.org
http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev