Re: [Zeek-Dev] support for event handlers using a subset of parameters
On Fri, Feb 1, 2019 at 12:59 PM Vern Paxson wrote:

> I don't see how it helps with
> deprecating existing parameters (which seems would be better served with
> some sort of attribute),

Support for &deprecated in parameters is part of the changes. But if we don't allow the user to immediately remove the field, they are then stuck doing two changes:

Step 1: we mark a field &deprecated
Step 2: the user sees that, so they remove uses of that parameter from their handler body
Step 3: we actually remove the field
Step 4: if the user was forced to still have the param in their handler's param list, they now have to make a second change to remove it, instead of just removing it right away

With the proposed patch, we get rid of the need for Step 4 and decrease the burden on users.

> and I don't see how it helps with
> changing the semantics of existing event parameters.

Step 1: we mark the old field &deprecated and introduce a new parameter
Step 2: the user sees that... etc., same as above.

> It actually makes sense to me to support overloading for events. Then for
> example you could have two event signatures depending on what information
> the handler was going to leverage, which would allow the event engine to
> offload work if there isn't a handler for a signature that requires extra
> computation.

I think the same kind of offloading is possible with the "parameter subset" approach. We know exactly which parameters are being consumed, so we might have optimizations that don't produce a parameter if no one consumes it. And if no one consumes any parameters, we also don't generate the call.

If you have two different event signatures, we just get the same type of optimization we currently do, which only optimizes out the entire call if there are no handlers, but doesn't know whether individual parameters are being consumed or not.
e.g.:

    http_request(a, b, c)
    http_request(d, e, f)

If someone only consumes 'a' and 'e', you still have to produce both function calls in their entirety (and also the other unused params), but with:

    http_request(a, b, c, d, e, f)

you can potentially not do any work generating the unused parameters and only have to do the one function call with 'a' and 'e'.

Technically, we can still require a matching signature and do such an optimization by walking the AST and finding local parameter usages. I guess you have to do that ultimately, but it's an easy head start at implementing such optimizations as a test/idea if we can simply see someone isn't using a parameter because it's not in their handler param list.

- Jon
___
zeek-dev mailing list
zeek-dev@zeek.org
http://mailman.icsi.berkeley.edu/mailman/listinfo/zeek-dev
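The optimization described in this message can be illustrated with a small sketch. It is written in Python rather than Zeek script, and it is not how the event engine is implemented: the `Event` class, its `handler` decorator, and the per-parameter `producers` callables are all hypothetical names invented for illustration. The only idea taken from the message is that the set of consumed parameters can be read straight off the handlers' parameter lists, so values nobody asks for need not be produced.

```python
import inspect


class Event:
    """Hypothetical event whose handlers are matched to parameters by name."""

    def __init__(self, name, param_names):
        self.name = name
        self.param_names = tuple(param_names)
        self.handlers = []  # list of (function, parameter names it lists)

    def handler(self, func):
        # A handler may list any subset of the declared parameters.
        wanted = tuple(inspect.signature(func).parameters)
        unknown = [p for p in wanted if p not in self.param_names]
        if unknown:
            raise TypeError(f"{self.name}: unknown parameter(s) {unknown}")
        self.handlers.append((func, wanted))
        return func

    def consumed(self):
        """Union of the parameter names listed by at least one handler."""
        return {p for _, wanted in self.handlers for p in wanted}

    def generate(self, producers):
        """producers maps each parameter name to a zero-argument callable."""
        if not self.handlers:
            return  # no handlers at all: skip the whole event
        needed = self.consumed()
        # Only run the producers for parameters somebody actually lists.
        values = {p: producers[p]() for p in self.param_names if p in needed}
        for func, wanted in self.handlers:
            func(**{p: values[p] for p in wanted})


# One combined event, as in the http_request(a, b, c, d, e, f) example.
http_request = Event("http_request", ["a", "b", "c", "d", "e", "f"])

produced = []  # records which parameters were actually computed
seen = []      # records what each handler received


def make_producer(name):
    def produce():
        produced.append(name)
        return name.upper()
    return produce


@http_request.handler
def only_a(a):
    seen.append(("only_a", a))


@http_request.handler
def only_e(e):
    seen.append(("only_e", e))


http_request.generate({p: make_producer(p) for p in http_request.param_names})
```

In this sketch only the producers for 'a' and 'e' run, and each handler receives just the value it listed. A real implementation would presumably still need the AST walk mentioned above to catch parameters that are listed but never used.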
Re: [Zeek-Dev] support for event handlers using a subset of parameters
> The compelling use-case I'd say is the ability to change/deprecate
> event parameters without suddenly breaking people's code since that
> has come up many times already.

I see how it allows adding new parameters. I don't see how it helps with deprecating existing parameters (which seems would be better served with some sort of attribute), and I don't see how it helps with changing the semantics of existing event parameters.

> Also this change only affects events and hooks, not functions. The
> semantics are different enough that maybe we would only want
> overloading for functions anyway.

It actually makes sense to me to support overloading for events. Then for example you could have two event signatures depending on what information the handler was going to leverage, which would allow the event engine to offload work if there isn't a handler for a signature that requires extra computation.

> Hooks and events have multiple implementations/bodies that are defined
> by the *user*. The *author* is generally the one that generates
> (calls) the event/hook.

The big exception being the event engine (if I follow what you mean by user/author).

> So if the event/hook name were overloaded, it's a bit confusing -- the
> user now has to decide between different signatures to handle, each
> containing different data sets and maybe neither contains the set they
> want (so now they handle two events of the same name instead of one).

Not really seeing this. I'm picturing that a common idiom will be a lightweight version of an event and a heavyweight version, or maybe a spectrum from light to heavy.

> Really, an event is a unique name with some amount of data
> (parameters) associated with it and may always be generated with the
> full data set -- the user then chooses which data they are interested
> in by defining that explicitly in the handler's parameter list.

Yeah, I agree that this too would allow (most of) the sort of offloading I sketch above.
Vern
Re: [Zeek-Dev] support for event handlers using a subset of parameters
On 1 Feb 2019, at 11:24, Robin Sommer wrote:

> It's a nice idea to relax parameter passing to work by name, and
> allow subsets. However, I can't quite get myself to really like it in
> this form, because it *looks* like an error to not have matching
> argument lists. Is there some syntax that would make it more clear
> what's going on?

I think the change to using names does make things a bit more confusing for users, but it opens the door for us to greatly improve the reliability of scripts in the long term, and generally it feels like a nice way for analyzer authors to deprecate functionality without needing to create all-new events. In my opinion, even though there are hairy side effects to this, it's a net positive. It would be great to get case-sensitive versions of the dns events and the http header event. That has been a very long-standing deficit.

I guess if there is some more obvious way to do it, that could make sense, but I haven't been able to come up with anything after thinking about this for quite a while.

  .Seth

--
Seth Hall * Corelight, Inc * www.corelight.com
Re: [Zeek-Dev] support for event handlers using a subset of parameters
On Fri, Feb 1, 2019 at 10:24 AM Robin Sommer wrote:

> On Thu, Jan 31, 2019 at 16:29 -0800, Vern Paxson wrote:
>
> > > global my_event: event(a: count, b: string);
> > > event my_event(b: string)
> > >     { print "my_event", b; }
>
> it *looks* like an error to not have matching
> argument lists. Is there some syntax that would make it more clear
> what's going on?

Not sure. If the syntax were different, that introduces a "one more thing to remember" issue, so I might prefer consistency with other function-like constructs. Is there any other language we know of with multi-body functions that we can reference for ideas?

Did it look like an error in the sense of the user making a mistake, or in the sense of the traditional way functions in other languages like C/C++ require matching signatures?

In the former, I think the semantics/intentions are actually clearer than before: the user didn't list a parameter because they don't care about it, so why make them? I know what event they want because they use unique names, and the parameters they listed do map in a valid way.

On the traditional side of things, overloading seems like maybe a legit reason for requiring matching signatures, but I also explained why I think overloading wouldn't make sense in the context of events.

- Jon
Re: [Zeek-Dev] support for event handlers using a subset of parameters
On Thu, Jan 31, 2019 at 16:29 -0800, Vern Paxson wrote:

> > global my_event: event(a: count, b: string);
> > event my_event(b: string)
> >     { print "my_event", b; }

> Is there a compelling use-case that's motivating this change?

I'm sure the main use case is changing an existing event's parameters without breaking existing scripts -- something we've been increasingly running into as a major challenge.

It's a nice idea to relax parameter passing to work by name, and allow subsets. However, I can't quite get myself to really like it in this form, because it *looks* like an error to not have matching argument lists. Is there some syntax that would make it more clear what's going on?

Robin

--
Robin Sommer * Corelight, Inc. * ro...@corelight.com * www.corelight.com
Re: [Zeek-Dev] support for event handlers using a subset of parameters
On Thu, Jan 31, 2019 at 6:29 PM Vern Paxson wrote:

> > * user doesn't care about parameter 'a', so they shouldn't have to list it
> > * it makes it easier to deprecate/change event parameters
>
> This seems like a pretty niche pair of benefits. Is there a compelling
> use-case that's motivating this change?

The compelling use-case I'd say is the ability to change/deprecate event parameters without suddenly breaking people's code, since that has come up many times already.

I briefly skimmed NEWS for just the last 2.6 release and counted 5 times we broke an event signature where this patch would have helped. I think there are also some other higher-profile changes to event args we haven't moved forward with because we didn't want to break user code that this would help with. Old example from an unresolved ticket:

https://bro-tracker.atlassian.net/browse/BIT-1431

> One thing I initially wondered was whether this was going to tie our hands
> in the future if we want to introduce C++-style overloading. However, it
> looks like you've implemented this based on matching the names in the
> declaration rather than the types, so that should be okay.

Also this change only affects events and hooks, not functions. The semantics are different enough that maybe we would only want overloading for functions anyway.

That is, functions have a single, fixed implementation/body that is defined by the *author*, so you may want to re-use the same name for something implemented in different ways. The *user* is the one that calls the function.

Hooks and events have multiple implementations/bodies that are defined by the *user*. The *author* is generally the one that generates (calls) the event/hook.

So if the event/hook name were overloaded, it's a bit confusing -- the user now has to decide between different signatures to handle, each containing different data sets, and maybe neither contains the set they want (so now they handle two events of the same name instead of one).
Really, an event is a unique name with some amount of data (parameters) associated with it and may always be generated with the full data set -- the user then chooses which data they are interested in by defining that explicitly in the handler's parameter list.

- Jon
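To make the by-name matching and the deprecation argument concrete, here is a rough sketch in Python (not Zeek script, and not the patch itself). The `bind_handler` helper and its `declared`/`deprecated` arguments are hypothetical, made up for this illustration; the `my_event(a, b)` shape follows the example quoted earlier in the thread. It shows a handler that lists only 'b' continuing to work unchanged while 'a' is first marked deprecated and then removed.

```python
import inspect
import warnings


def bind_handler(declared, deprecated, func):
    """Match a handler's parameter list to an event declaration by name.

    declared   -- parameter names in the event declaration (hypothetical)
    deprecated -- subset of declared names marked as deprecated
    func       -- handler listing any subset of the declared names
    """
    wanted = tuple(inspect.signature(func).parameters)
    for p in wanted:
        if p not in declared:
            raise TypeError(f"{func.__name__}: no event parameter {p!r}")
        if p in deprecated:
            # Only handlers that still list the parameter get warned.
            warnings.warn(f"event parameter {p!r} is deprecated",
                          DeprecationWarning)

    def call(**full_args):
        # The event is generated with the full data set; the handler
        # receives only the parameters it listed.
        return func(**{p: full_args[p] for p in wanted})

    return call


def my_event_handler(b):
    return f"my_event {b}"


# Release N: my_event(a, b), with 'a' marked deprecated.
bound_old = bind_handler(("a", "b"), {"a"}, my_event_handler)
result_old = bound_old(a=1, b="hello")

# Release N+1: 'a' has been removed; the same handler still binds.
bound_new = bind_handler(("b",), set(), my_event_handler)
result_new = bound_new(b="hello")
```

A handler that never listed 'a' needs no edit in either release, while one that still lists 'a' would get a warning in release N and a binding error in release N+1.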