Re: [Bro-Dev] Broker::publish API
On Wed, Aug 08, 2018 at 12:36 -0500, Jonathan Siwek wrote: > * publish() API simplifications/compressions (pending decision on > exactly what those should be) Yeah, with an eye on the semantics for forwarding (now and later), and whether to raise published events locally as well if the host is subscribed itself. And maybe the 2nd eye on: can define these semantics so that we can get rid of some of the "what node type am I?" checks? I'm not sure how that would look like, but generally it would be nice if one could just publish stuff liberally without worrying too much and the subscriptions and forwarding semantics do the right thing (not always, but often)). > * enable message forwarding by default (meaning re-implement the one > or two subscription patterns that might create a cycle) Haven't quite made up my mind on this one. In principlel yes, but right now a host needs to be subscribed to a topic to forward it if I remember than right. That may limit how we use topics, not sure (e.g., if a worker wanted to talk to other workers, with "real" forwarding/routing they'd just publish to the worker topic and that message would get routed there, but not be processed at the intermediary hops as well. With our current forwarding, the hops would need to subscribe to the worker topic as well and hence the event got raised there, too.) > * see if any script-specific topics can instead use a pre-existing > "cluster" topic Yep. > difficult due to having to hunt down things in various scripts and > whether a more centralized config could be something to do? Yeah, that sounds useful for the cluster case: it could be part of the cluster framework to define all the relevant node types with their characeristics. That would also make later changes easier & centralized to how topics and connections are set up. For other use cases, it should still be possible to configure things independently, too, though (say, for talking to external Broker applications). Robin -- Robin Sommer * Corelight, Inc. * ro...@corelight.com * www.corelight.com ___ bro-dev mailing list bro-dev@bro.org http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev
Re: [Bro-Dev] Broker::publish API
On Wed, Aug 8, 2018 at 10:55 AM Robin Sommer wrote: > That's actually something I realized yesterday: we don't have direct > worker-to-worker communication right now, correct? A worker cannot > just publish to "bro/cluster/workers". Right, here's a crude graphic of the cluster layout from the docs: https://github.com/bro/bro/blob/master/doc/frameworks/broker/cluster-layout.png - Jon ___ bro-dev mailing list bro-dev@bro.org http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev
Re: [Bro-Dev] Broker::publish API
On Wed, Aug 8, 2018 at 10:53 AM Robin Sommer wrote: > > Yeah, I realize that. A direct port of the old logic was of course the > goal so far, with the drawbacks of that approach accepted & > understood. That's what's in place now; that's great and exactly as > planned. We can get 2.6 out this way, and it'll be fine. I'm earnestly probing to try to get a better decomposition of the issues that make it hard to understand cluster communication patterns. There's the exercise of trying to answer "what *is* this script doing?" and then there's also trying to answer "what *should* it be doing?". I seldom felt like I had definitive answers for the later, but I can see how it would be beneficial to do that and also broader script/framework makeovers, possibly before 2.6, because it would help inform whether new APIs are catering to "good" use-cases. Though my thinking is it's not critical to get a 100% API/use-case match off the bat and that there's some actionable stuff to take away from this thread that is at least going to have us heading in a better direction sooner rather than later... > My point is that now also seems like a good time to take stock of what > we got that way. That direct porting is finally getting us some sense > of where things aren't an ideal match between API and use cases yet. > And if there's something easy we can do about that before people start > relying on the new API, it seems that would be beneficial to do. But > we can see. Yeah, agreed. What I've taken away from your earlier points is that these smaller changes are seeming like they'd be beneficial to do before 2.6: * publish() API simplifications/compressions (pending decision on exactly what those should be) * enable message forwarding by default (meaning re-implement the one or two subscription patterns that might create a cycle) * see if any script-specific topics can instead use a pre-existing "cluster" topic What do you think? A separate question/idea I just had was whether how much of the process of auditing the subscriptions and communication patterns was difficult due to having to hunt down things in various scripts and whether a more centralized config could be something to do? e.g. I don't know how the details would work out, but I'm imagining a workflow where one edits a centralized config file with subscription/node info in it and that auto-generates the code to set them up. Sort of like working backward from the info in the PDF you shared. - Jon ___ bro-dev mailing list bro-dev@bro.org http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev
Re: [Bro-Dev] Broker::publish API
On Wed, Aug 08, 2018 at 14:20 +, Justin Azoff wrote: > There's also a bunch of places that I think were written standalone first and > then updated to work on a cluster in > place resulting in some awkwardness.. Yeah, indeed, that's another other source of complexity with these scripts. > But if this was written in a more 'cluster by default' way, it would just > look like: Nice example. That's the kind of thing I hope we can do during the next cycle: streamline the scripts to unify these kinds of logic. > Broker::publish could possibly be optimized for standalone to raise the event > directly if not being ran in a cluster. Or we generally raise published events locally as well if the node is subscribed to the destination topic. There are pros and cons for that I think. Robin -- Robin Sommer * Corelight, Inc. * ro...@corelight.com * www.corelight.com ___ bro-dev mailing list bro-dev@bro.org http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev
Re: [Bro-Dev] Broker::publish API
Yeah, I realize that. A direct port of the old logic was of course the goal so far, with the drawbacks of that approach accepted & understood. That's what's in place now; that's great and exactly as planned. We can get 2.6 out this way, and it'll be fine. My point is that now also seems like a good time to take stock of what we got that way. That direct porting is finally getting us some sense of where things aren't an ideal match between API and use cases yet. And if there's something easy we can do about that before people start relying on the new API, it seems that would be beneficial to do. But we can see. Robin On Tue, Aug 07, 2018 at 13:39 -0500, Jonathan Siwek wrote: > How much is due to new API usage and how much is due to things mainly > being a direct port of old communication patterns (which I guess are > written by various people over extended lengths of time and so there's > inconsistencies to be expected) ? Or due to being a mishmash of both > old and new? -- Robin Sommer * Corelight, Inc. * ro...@corelight.com * www.corelight.com ___ bro-dev mailing list bro-dev@bro.org http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev
Re: [Bro-Dev] Broker::publish API
On Tue, Aug 07, 2018 at 12:05 +0200, Jan Grashöfer wrote: > What I can recall, it's about simplifying the API in the light of > multi-hop routing, which is not fully functional yet. To level up a bit, what I'm hoping for is that we can find some easy ways to simplify the API a bit more now, with an eye towards dynamic multi-hop coming later. I don't know if it'll work out before 2.6 still, but changing the API later is more painful. We don't need to (or even can) solve multi-hop topologies right now, I think nobody really has the use cases clear in their heads yet. But if we could simplify the API a bit more for our current use cases in a way that may extend to multihop naturally later, that would probably save us some headaches at that point. > How does forwarding work if I add another node type? That's actually something I realized yesterday: we don't have direct worker-to-worker communication right now, correct? A worker cannot just publish to "bro/cluster/workers". > Do we assume a certain cluster structure here? If yes: Is that a valid > assumption? I think it's safe to assume we have the cluster structure under our own control; it's whatever we configure it to be. That's something that's easier to change later than the API itself. Said differently: we can always adjust the connections and topics that we set up by default; it's much harder to change how the publish() function works. > From my understanding this would mean going back to the old > communication patterns. What's the point of having topics if we don't > use them? Let me try to phrase it differently: If there's already a topic for a use case, it's better to use it. That's easier and less error-prone. So if, e.g., I want to send my script's data to all workers, publishing to bro/cluster/worker will do the job. And that will even automatically adapt if things get more complex later. For example, I can see having multiple otherwise independent cluster sharing a communication channel. In that case, we could internally change the topic to "bro/cluster//workers", and everybody using the predefined worker topic would still reach "their" workers without any further changes. > That's something I would have expected. I don't think this is > necessarily an indicator of bad design. Maybe it's a *necessary* design, but that doesn't make it nice. ;-) It makes it very hard to follow the logic; when reading through the scripts I got lost multiple times because some "@if I-am-a-manager" was somewhere half a page earlier, disabling the code I was currently looking at for most nodes. We probably can't totally avoid that, but the less the better. Robin -- Robin Sommer * Corelight, Inc. * ro...@corelight.com * www.corelight.com ___ bro-dev mailing list bro-dev@bro.org http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev
Re: [Bro-Dev] Broker::publish API
> On Aug 6, 2018, at 3:50 PM, Robin Sommer wrote: > >- Relaying is hardly used. > > >- There's a lot of checks in publishing code of the type "if I am > (not) of node type X". I think these 2 are somewhat related. Since there weren't higher level things like relaying, in order to relay a message from one worker to all other workers you had to jump through hoops with worker2manger and manager2worker events and often lots of @if stuff. There's also a bunch of places that I think were written standalone first and then updated to work on a cluster in place resulting in some awkwardness.. like notice/main.bro: function NOTICE(n: Notice::Info) { if ( Notice::is_being_suppressed(n) ) return; @if ( Cluster::is_enabled() ) if ( Cluster::local_node_type() == Cluster::MANAGER ) Notice::internal_NOTICE(n); else { n$peer_name = n$peer_descr = Cluster::node; Broker::publish(Cluster::manager_topic, Notice::cluster_notice, n); } @else Notice::internal_NOTICE(n); @endif } event Notice::cluster_notice(n: Notice::Info) { NOTICE(n); } So on a worker, calling NOTICE publishes a cluster_notice event that then re-calls NOTICE on the manager, which then does the right thing. You end up with a single small function with nested @if/if that works 3 different ways. But if this was written in a more 'cluster by default' way, it would just look like: function NOTICE(n: Notice::Info) { if ( Notice::is_being_suppressed(n) ) return; n$peer_name = n$peer_descr = Cluster::node; Broker::publish(Cluster::manager_topic, Notice::cluster_notice, n); } event Notice::cluster_notice(n: Notice::Info) { if ( Notice::is_being_suppressed(n) ) return; Notice::internal_NOTICE(n); } Which other than the suppression check, has no branching at all. Broker::publish could possibly be optimized for standalone to raise the event directly if not being ran in a cluster. The only small downside is on a standalone you'd call is_being_suppressed twice, could always add a @if if you really wanted, but is_being_suppressed is just a set lookup. Then this stuff would be a good use for efficient relaying/broadcasting instead of making the manager do all the work: Broker::auto_publish(Cluster::worker_topic, Notice::begin_suppression); Broker::auto_publish(Cluster::proxy_topic, Notice::begin_suppression); — Justin Azoff ___ bro-dev mailing list bro-dev@bro.org http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev