Re: [Zeek-Dev] Hi + LL Analyzer
On Thu, Feb 28, 2019 at 11:35 +0100, Jan Grashöfer wrote:

> The question here would be whether LL-analyzers have to be linked
> dynamically.

Well, the point of the plugin API is being able to add new functionality externally through an independently compiled shared library. Excluding link-layer analyzers from that would feel like a gap to me. That said, we definitely need to benchmark performance to make sure it's feasible. My hunch is that a lookup table should be good enough, but we'll see.

Robin

--
Robin Sommer * Corelight, Inc. * ro...@corelight.com * www.corelight.com
___
zeek-dev mailing list
zeek-dev@zeek.org
http://mailman.icsi.berkeley.edu/mailman/listinfo/zeek-dev
Re: [Zeek-Dev] Hi + LL Analyzer
On 27/02/2019 20:40, Robin Sommer wrote:

>> One question here would be whether it makes sense to assume that the set of
>> LL-analyzers that should be available is known at compile-time?
>
> The built-in ones can be known, but any added through dynamic plugins
> can't really. We'll know only at runtime what the final set is. But we
> could precompute a lookup table in advance at startup that maps link
> types to analyzers.

The question here would be whether LL-analyzers have to be linked dynamically. Another option would be to require users to build Zeek if they need additional LL-analyzers. The analyzers would still be modular, but with some metaprogramming one might be able to generate efficient dispatching code at compile-time. If the focus is on performance, we could benchmark both approaches and decide based on the results.

Jan
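A minimal sketch of what the compile-time option could look like, assuming each analyzer is a type that exposes a static link type and a static header-length function. All names here (`Ethernet`, `NullLoopback`, `StaticDispatcher`, `BuiltinDispatcher`) are hypothetical and not part of Zeek; the link type values 1 and 0 are libpcap's DLT_EN10MB and DLT_NULL.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical analyzers: each one names its link type and knows how many
// header bytes to skip. Nothing here is taken from Zeek's code base.
struct Ethernet {
    static constexpr std::uint32_t link_type = 1;  // DLT_EN10MB
    static std::size_t HeaderLength(const std::uint8_t*, std::size_t len) {
        return len >= 14 ? 14 : 0;  // untagged Ethernet II header
    }
};

struct NullLoopback {
    static constexpr std::uint32_t link_type = 0;  // DLT_NULL
    static std::size_t HeaderLength(const std::uint8_t*, std::size_t len) {
        return len >= 4 ? 4 : 0;  // 4-byte protocol family field
    }
};

// The set of analyzers is fixed at compile time, so the compiler sees a plain
// chain of comparisons and can inline every HeaderLength() call.
template <typename... Analyzers>
struct StaticDispatcher {
    // Returns true and sets header_len if some analyzer handles link_type;
    // leaves header_len untouched otherwise.
    static bool Dispatch(std::uint32_t link_type, const std::uint8_t* data,
                         std::size_t len, std::size_t& header_len) {
        assert(data != nullptr);
        return ((link_type == Analyzers::link_type
                     ? (header_len = Analyzers::HeaderLength(data, len), true)
                     : false) || ...);
    }
};

// Adding an analyzer means editing this list and rebuilding.
using BuiltinDispatcher = StaticDispatcher<Ethernet, NullLoopback>;
```

The trade-off is the one raised in the message: there is no virtual call or table lookup on the per-packet path, but the analyzer list cannot be extended by a shared library loaded at runtime. The fold expression requires C++17.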
Re: [Zeek-Dev] Hi + LL Analyzer
On Wed, Feb 27, 2019 at 16:07 +0100, Jan Grashöfer wrote:

> At first glance it looks like IP-layer multiplexing is done in
> NetSessions::{NextPacket, DoNextPacket} and the Transport-layer is tackled
> in Manager::BuildInitialAnalyzerTree in context of initializing a
> connection.

Well, there, too. :) That's indeed doing the packet dispatching, while DoNextPacket() sets up state management. It's all not quite clear cut, which is part of the problem.

> That is the central point. So a first step would be to rely on TCP/IP in the
> "middle" of the stack but allow pluggable Link-layer protocols. Those might
> feed their data to the TCP/IP pipeline or handle them on their own. The next
> step would be the IP-layer.

Yeah, that sounds good to me.

> One question here would be whether it makes sense to assume that the set of
> LL-analyzers that should be available is known at compile-time?

The built-in ones can be known, but any added through dynamic plugins can't really. We'll know only at runtime what the final set is. But we could precompute a lookup table in advance at startup that maps link types to analyzers.

> I think this would be part of the larger effort to re-think Zeek's notion of
> connections. This could be addressed together with implementing a flexible
> mechanism to make meta data like LL-addresses available in context of a
> connection.

Yep.

> In case we allow plugging in new transport protocols, they might need
> their own PIA to support the analysis of known protocols like HTTP
> etc.

Yeah, or a more generic PIA that provides its own hook for plugins. The main difference between TCP/UDP PIAs is packet vs stream semantics, iirc. That might generalize sufficiently, but not sure.

Robin
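The precomputed lookup table could be sketched roughly as follows. This is only an illustration under assumptions: `LinkLayerAnalyzer`, `EthernetAnalyzer`, `LinkLayerRegistry` and the table size of 512 are made up for the example and do not reflect Zeek's actual plugin API. Plugins register while loading, the table is built once at startup, and the per-packet path is a bounds check plus an array index.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical analyzer interface, reduced to "how many bytes to skip".
class LinkLayerAnalyzer {
public:
    virtual ~LinkLayerAnalyzer() = default;
    virtual std::size_t HeaderLength(const std::uint8_t* data,
                                     std::size_t len) const = 0;
};

class EthernetAnalyzer : public LinkLayerAnalyzer {
public:
    std::size_t HeaderLength(const std::uint8_t*, std::size_t len) const override {
        return len >= 14 ? 14 : 0;  // untagged Ethernet II header
    }
};

class LinkLayerRegistry {
public:
    // Assumed upper bound on link type values, chosen for the example only.
    static constexpr std::size_t kTableSize = 512;

    // Called by built-in and dynamically loaded plugins during startup.
    void Register(std::uint32_t link_type, const LinkLayerAnalyzer* analyzer) {
        assert(analyzer != nullptr);
        pending_.emplace_back(link_type, analyzer);
    }

    // Called once after all plugins are loaded; later registrations for the
    // same link type override earlier ones.
    void BuildTable() {
        table_.fill(nullptr);
        for (const auto& [type, analyzer] : pending_)
            if (type < kTableSize)
                table_[type] = analyzer;
    }

    // Per-packet path: returns nullptr if no analyzer handles link_type.
    const LinkLayerAnalyzer* Lookup(std::uint32_t link_type) const {
        return link_type < kTableSize ? table_[link_type] : nullptr;
    }

private:
    std::vector<std::pair<std::uint32_t, const LinkLayerAnalyzer*>> pending_;
    std::array<const LinkLayerAnalyzer*, kTableSize> table_{};
};
```

Compared with a fully static dispatch, this keeps one indirect (virtual) call per layer, which is the cost the benchmarks mentioned in the thread would need to quantify.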
Re: [Zeek-Dev] Hi + LL Analyzer
On 26/02/2019 02:36, Robin Sommer wrote:

> I see three pieces here overall that I think can be tackled
> independently:
>
> (1) Link-layer: Currently hardcoded in Packet::ProcessLayer2()
>
> (2) IP-Layer: Currently hardcoded in NetSessions::NextPacket()
>
> (3) Transport-layer: Currently hardcoded in NetSessions::DoNextPacket().

At first glance it looks like IP-layer multiplexing is done in NetSessions::{NextPacket, DoNextPacket} and the Transport-layer is tackled in Manager::BuildInitialAnalyzerTree in context of initializing a connection.

> Case (1) is all about skipping the header to get to IP. There's some
> redundancy across cases, though, and MPLS makes it all more messy.

One thing that comes to my mind here is whether it might be possible to pass information such as VLAN tags, MPLS labels or link layer addresses to upper layers in a generic way without hardcoding. However, that might be out of scope for now.

> With (2), a plugin would be able to add support for non-IP protocols.
> However, due to Bro generally assuming that it is analyzing IP, the
> plugin would either need to take care of such packets completely (like
> ARP does), or eventually get to an IP packet that it can then feed
> back for further analysis (like if it is some kind of a tunnel).

The non-IP packet might also contain a Transport-layer PDU. I guess it should be possible to pass these on as well.

> There's also a more general version of (2) and (3) where we'd remove
> Bro's assumption of analyzing TCP/IP protocols. But that's a separate,
> large effort by itself.

That is the central point. So a first step would be to rely on TCP/IP in the "middle" of the stack but allow pluggable Link-layer protocols. Those might feed their data to the TCP/IP pipeline or handle them on their own. The next step would be the IP-layer.

> On a technical level, plugging in such low-level analyzers needs to be
> very efficient, in particular if we move the currently hardcoded cases
> into the plugins as well (as I think we should; similar to how
> application-layer analyzers have all moved into internal plugins).
> Then the lookup-the-analyzer-and-dispatch operation will happen
> multiple times for every packet.

One question here would be whether it makes sense to assume that the set of LL-analyzers that should be available is known at compile-time?

>> - What about the concept of connections? For some LL protocols the
>> concept might be counterintuitive.
>
> Couple cases there:
>
> - If there's really no sense of a connection, then the plugin will
>   need to take complete care of the packets, as the rest of Bro
>   assumes connection-semantics.

Maybe there is another general abstraction that is worth supporting as well. I was thinking of request-reply pairs that can be correlated. However, I haven't put much thought into this yet.

> - If it's just the definition of what defines a connection that is
>   different, then I think we could make that more flexible. I've been
>   hoping for a while that we can make Bro's notion of connection IDs
>   dynamic, so that it's not necessarily just the 5-tuple. There are
>   use cases outside of new protocols for this, too. For example, one
>   could include the VLAN ID to deal with overlapping IP ranges in
>   independent VLANs.

I think this would be part of the larger effort to re-think Zeek's notion of connections. This could be addressed together with implementing a flexible mechanism to make meta data like LL-addresses available in context of a connection.

>> - The interface should support passing payload to other analyzers. Does
>> it make sense to come up with a generalized DPD-mechanism?
>
> Not quite sure what you're thinking here, but I believe that fully
> solving this would require addressing Bro's overall assumption of
> analyzing TCP/IP. For now, maybe the best way would be just having the
> analyzer call back into entry points corresponding to the various
> layers where analysis would then proceed as normal. I.e., some
> variation of: ProcessLinkLayer(...), ProcessIP(...),
> ProcessTransport(data), ProcessAppLayer(...). The caller would be
> responsible for providing all the right (meta-)data, like IP headers.
> Were you thinking something different / more general?

While I haven't looked into it, I noticed that there are distinct PIA implementations for TCP and UDP. In case we allow plugging in new transport protocols, they might need their own PIA to support the analysis of known protocols like HTTP etc. However, if we keep a focus on TCP/IP as suggested, that would be out of scope for now.

Jan
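To make the quoted entry-point idea more concrete, here is a rough sketch of a plugin calling back into a per-layer entry point. The `Pipeline` class, the `PacketView` struct and the `FixedHeaderTunnel` analyzer with its 8-byte header are all invented for this example; only the entry-point names `ProcessIP` and `ProcessTransport` come from the thread, and their signatures here are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Non-owning view of the bytes handed from one layer to the next.
struct PacketView {
    const std::uint8_t* data;
    std::size_t len;
};

// Stand-in for the core: each entry point would continue normal analysis at
// that layer. Here it only records what it was given.
class Pipeline {
public:
    void ProcessIP(PacketView p) { trace_.push_back("ip:" + std::to_string(p.len)); }
    void ProcessTransport(PacketView p) {
        trace_.push_back("transport:" + std::to_string(p.len));
    }
    const std::vector<std::string>& Trace() const { return trace_; }

private:
    std::vector<std::string> trace_;
};

// Hypothetical tunnel analyzer: strips a fixed-size outer header and feeds
// the inner IP packet back into the pipeline.
class FixedHeaderTunnel {
public:
    static constexpr std::size_t kHeaderLen = 8;

    // Returns false if the packet is too short to contain the outer header.
    bool Analyze(PacketView p, Pipeline& pipeline) const {
        assert(p.data != nullptr);
        if (p.len < kHeaderLen)
            return false;
        pipeline.ProcessIP({p.data + kHeaderLen, p.len - kHeaderLen});
        return true;
    }
};
```

A plugin that ends up at a transport-layer PDU instead of an IP packet would call `ProcessTransport` in the same way, with the caller supplying whatever metadata that layer expects.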
Re: [Zeek-Dev] Hi + LL Analyzer
(I realized this slipped through the cracks, sorry for the late feedback, hope it still helps)

On Thu, Feb 07, 2019 at 11:32 +0100, Jan Grashöfer wrote:

> - What would be the lowest layer to build upon or should everything be
> pluggable down to the packet source?

I see three pieces here overall that I think can be tackled independently:

    (1) Link-layer: Currently hardcoded in Packet::ProcessLayer2()

    (2) IP-Layer: Currently hardcoded in NetSessions::NextPacket()

    (3) Transport-layer: Currently hardcoded in NetSessions::DoNextPacket().

Case (1) is all about skipping the header to get to IP. There's some redundancy across cases, though, and MPLS makes it all more messy.

With (2), a plugin would be able to add support for non-IP protocols. However, due to Bro generally assuming that it is analyzing IP, the plugin would either need to take care of such packets completely (like ARP does), or eventually get to an IP packet that it can then feed back for further analysis (like if it is some kind of a tunnel).

Similar for (3): A plugin would be able to add support for further transport layer protocols, but it'd be mostly about stripping additional headers to eventually get to TCP/UDP/ICMP.

There's also a more general version of (2) and (3) where we'd remove Bro's assumption of analyzing TCP/IP protocols. But that's a separate, large effort by itself.

On a technical level, plugging in such low-level analyzers needs to be very efficient, in particular if we move the currently hardcoded cases into the plugins as well (as I think we should; similar to how application-layer analyzers have all moved into internal plugins). Then the lookup-the-analyzer-and-dispatch operation will happen multiple times for every packet.

> - What about the concept of connections? For some LL protocols the
> concept might be counterintuitive.

Couple cases there:

    - If there's really no sense of a connection, then the plugin will
      need to take complete care of the packets, as the rest of Bro
      assumes connection-semantics.

    - If it's just the definition of what defines a connection that is
      different, then I think we could make that more flexible. I've been
      hoping for a while that we can make Bro's notion of connection IDs
      dynamic, so that it's not necessarily just the 5-tuple. There are
      use cases outside of new protocols for this, too. For example, one
      could include the VLAN ID to deal with overlapping IP ranges in
      independent VLANs.

> - The interface should support passing payload to other analyzers. Does
> it make sense to come up with a generalized DPD-mechanism?

Not quite sure what you're thinking here, but I believe that fully solving this would require addressing Bro's overall assumption of analyzing TCP/IP. For now, maybe the best way would be just having the analyzer call back into entry points corresponding to the various layers where analysis would then proceed as normal. I.e., some variation of: ProcessLinkLayer(...), ProcessIP(...), ProcessTransport(data), ProcessAppLayer(...). The caller would be responsible for providing all the right (meta-)data, like IP headers. Were you thinking something different / more general?

Robin
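The VLAN example can be illustrated with a small sketch of a connection key that extends the 5-tuple by an optional VLAN ID. This is an assumption-laden illustration, not how Zeek represents connection IDs: `ConnKey`, `ConnKeyHash`, `ConnTable` and `RecordConnection` are hypothetical names, and the key is IPv4-only for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// 5-tuple plus one optional extra component. A truly dynamic scheme would
// let plugins contribute arbitrary components; one field keeps this short.
struct ConnKey {
    std::uint32_t src_ip;
    std::uint32_t dst_ip;
    std::uint16_t src_port;
    std::uint16_t dst_port;
    std::uint8_t proto;
    std::optional<std::uint16_t> vlan;  // unset means "untagged / not used"

    bool operator==(const ConnKey& o) const {
        return src_ip == o.src_ip && dst_ip == o.dst_ip &&
               src_port == o.src_port && dst_port == o.dst_port &&
               proto == o.proto && vlan == o.vlan;
    }
};

struct ConnKeyHash {
    std::size_t operator()(const ConnKey& k) const {
        std::size_t h = 0;
        // Simple hash combiner; any reasonable mixing function would do.
        auto mix = [&h](std::size_t v) {
            h ^= v + 0x9e3779b9u + (h << 6) + (h >> 2);
        };
        mix(k.src_ip);
        mix(k.dst_ip);
        mix(k.src_port);
        mix(k.dst_port);
        mix(k.proto);
        mix(k.vlan ? static_cast<std::size_t>(*k.vlan) + 1 : 0);
        return h;
    }
};

using ConnTable = std::unordered_map<ConnKey, std::string, ConnKeyHash>;

// Stores a label for the connection identified by key.
void RecordConnection(ConnTable& table, const ConnKey& key, std::string label) {
    assert(!key.vlan || *key.vlan < 4096);  // VLAN IDs are 12 bits wide
    table[key] = std::move(label);
}
```

With such a key, two flows that share the same addresses and ports but arrive on different VLANs end up as separate table entries instead of being merged into one connection.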
Re: [Zeek-Dev] Hi + LL Analyzer
To add a bit more context: The idea is to implement a plugin interface for low-level analyzers (see https://github.com/zeek/zeek/issues/248) and collect requirements on the list. Some first thoughts and questions:

- What would be the lowest layer to build upon or should everything be
  pluggable down to the packet source?

- What about the concept of connections? For some LL protocols the
  concept might be counterintuitive.

- The interface should support passing payload to other analyzers. Does
  it make sense to come up with a generalized DPD-mechanism?

Jan