Re: [Zeek-Dev] Log archival (Re: Zeek Supervisor: designing client and log archival) behavior
On Wed, Jul 1, 2020 at 1:59 AM Robin Sommer wrote:
>
> > Log::default_rotation_dir
>
> Seems we should then set this to "." by default, and have the cluster
> framework override it.

Yes, exactly.

> Once moved, I suppose we would continue to optionally run a
> post-processor, right? For a supervised cluster, we wouldn't use that
> and suggest that people go with "zeek-archive" instead; but with
> ZeekControl we'd keep the current gzipping behavior so that we don't
> break any setups.

Yes, with the proposed changes, custom postprocessors still work the
same as before and everything is backwards compatible / equivalent in
non-supervised-mode. Supervised-mode just picks some different default
settings from non-supervised-mode:

* don't use a postprocessing script (archive-log)
* rotate into a `Log::default_rotation_dir` of "log-queue" instead of "."

> Not sure it's worth retaining the information about the post-processor
> function, and it could potentially lead to trouble if the function
> changed somehow in between (or disappeared). We could instead just run
> the leftovers through whatever the restarted config says to do with
> files.

* Disappeared: it's easy to notice that the function no longer exists
  and fall back to the default post-processor
* Changed: running the leftovers through a function of the same name
  that happened to change between restarts is probably still going to
  be closer to what the user expects than running them through the
  default post-processor, which is completely different?

> Do we even need any other meta data at all in the new scheme? I'm
> wondering if we could simplify this all to: "If at open() time, X.log
> exists, first rotate it away through the currently configured
> postprocessor function".

What if an open() rarely or never happens again for a given log? I'm
thinking the rotation of leftover logs needs to happen once at startup
rather than lazily.

> Hmm, actually, there's a piece of meta that we'll need: the opening
> timestamp, so that one can incorporate that into the name of the
> rotated file (assuming we want to retain that capability). Unless we
> parsed that out of the X.log itself ...

I don't think we'd have the opening timestamp to parse from the log when
LogAscii::use_json=T. So I still think it's necessary to obtain the
open-time meta from a `.shadow.X.log`: either it's explicitly in there,
or we use the file's modified time (essentially its creation time). The
close-time of X.log is just taken as the last-modified time of X.log.

- Jon
___
Zeek-Dev mailing list
Zeek-Dev@zeek.org
http://mailman.icsi.berkeley.edu/mailman/listinfo/zeek-dev
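
A rough sketch of the startup-time leftover rotation discussed above may
help make the idea concrete. It is written in Python purely for
illustration and is not how Zeek implements it. Assumptions: the shadow
file's modification time stands in for the open time, the log's
modification time is the close time, and the function names and the
rotated-file name format are hypothetical.

```python
import os
import time
from pathlib import Path
from typing import List


def rotated_name(stem: str, open_ts: float, close_ts: float) -> str:
    """Build a rotated file name from open/close times (hypothetical format)."""
    fmt = "%Y-%m-%d-%H-%M-%S"
    opened = time.strftime(fmt, time.gmtime(open_ts))
    closed = time.strftime(fmt, time.gmtime(close_ts))
    return f"{stem}__{opened}__{closed}.log"


def rotate_leftovers(work_dir: Path, rotation_dir: Path) -> List[Path]:
    """Run once at startup: rotate every X.log that still has a .shadow.X.log."""
    rotation_dir.mkdir(parents=True, exist_ok=True)
    rotated = []
    for shadow in sorted(work_dir.glob(".shadow.*.log")):
        log = work_dir / shadow.name[len(".shadow."):]
        if not log.exists():
            shadow.unlink()  # stale shadow file, nothing left to rotate
            continue
        open_ts = shadow.stat().st_mtime  # assumption: shadow written once at open
        close_ts = log.stat().st_mtime    # last write to the log
        target = rotation_dir / rotated_name(log.stem, open_ts, close_ts)
        os.replace(log, target)
        shadow.unlink()
        rotated.append(target)
    return rotated
```

Because this scans for shadow files rather than waiting for an open(),
a log that is never opened again after the restart still gets rotated.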
Re: [Zeek-Dev] Zeek Supervisor Command-Line Client
On Tue, Jun 30, 2020 at 14:29 -0700, Jon Siwek wrote:

> Maybe the important observation is that the logic can be performed
> anywhere that has access to the Zeek-Supervisor process.

Agree.

> So where we put the logic at this point may not be important. If we
> can find a single-best-place for the logic to live, that's great

I believe that's what Seth is arguing for: have a Zeek-side script be
the single point of that logic, rather than implement it multiple times
and/or outside of Zeek.

I can see doing that in Zeek, but I think there's a trade-off here: if
we want to do the single-place approach with a multi-system setup, we'd
need an authoritative place to run this logic and hence depend on *that*
Zeek supervisor being up and running for performing the operation. That
may be a reasonable assumption (say if we dedicated the supervisor
running the manager to also be the cluster coordinator), but it's
different from a world where the client can execute higher-level
operations on its own.

Robin

--
Robin Sommer * Corelight, Inc. * ro...@corelight.com * www.corelight.com
[Zeek-Dev] Supervisor client (Re: Zeek Supervisor: designing client and log archival behavior)
> * https://github.com/zeek/zeek/wiki/Zeek-Supervisor-Client

Some thoughts on the commands:

> $ zeekc status [all | ]

> Do we need to include any other metrics in the returned status?

That information is mostly static; it would be nice to get some dynamic
information in there as well, like uptime and CPU/memory/traffic stats.
No need to have that right away, but worth keeping in mind.

> # Do we need more categories to filter by (e.g. node type) ?

I'd skip for now.

> # If there's downed nodes at this point, what do we expect users to do?
> # Check the standard services logs for stderr/stdout info? Check
> # reporter.log ?

Yeah, it would be cool if zeekc had access to the stderr/stdout from the
nodes through their supervisors. The supervisors could buffer that for a
while and return it on request. More generally, the supervisor could get
a "diagnostics buffer" that, over time, we could use for more stuff,
like storing backtraces, etc.

"reporter.log" is out I'd say, that will go through the normal log
rotation & archival, and be accessible that way.

> # A `zeekc diag` command could help gather information, like ask Zeek
> # supervisor to find core dumps and extract stack trace. Would it do
> # more than that, like show last N lines of downed nodes' stderr, or
> # last N lines of reporter.log?

> $ zeekc check

I'm wondering which supervisor that would be talking to in a
multi-system setup? All?

> $ zeekc terminate
> ...
> # Normally wouldn't terminate the supervisor if a service-manager is handling
> # the Zeek supervisor process itself and will just restart it, but `terminate`
> # would be helpful for anyone running a supervised Zeek cluster
> # "manually".

Another use case: If for some reason one wants to restart the supervisor
itself, "terminate" would kill it and the service manager would then
restart it.

Robin

--
Robin Sommer * Corelight, Inc. * ro...@corelight.com * www.corelight.com
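
To illustrate the "diagnostics buffer" idea from the message above, here
is a minimal Python sketch of a bounded per-node buffer from which a
client could request the last N lines. The class and method names are
hypothetical and only show the data structure; nothing here reflects an
existing supervisor or zeekc API.

```python
from collections import defaultdict, deque


class DiagnosticsBuffer:
    """Keep only the most recent output lines per node (hypothetical helper)."""

    def __init__(self, max_lines: int = 1000):
        # One bounded deque per node; the oldest lines drop off automatically.
        self._lines = defaultdict(lambda: deque(maxlen=max_lines))

    def append(self, node: str, line: str) -> None:
        """Record one line of a node's stderr/stdout."""
        self._lines[node].append(line.rstrip("\n"))

    def tail(self, node: str, n: int) -> list:
        """Return up to the last n buffered lines for a node."""
        if n <= 0 or node not in self._lines:
            return []
        return list(self._lines[node])[-n:]
```

Bounding the buffer keeps the supervisor's memory use predictable even
if a node writes to stderr in a tight loop.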
[Zeek-Dev] Log archival (Re: Zeek Supervisor: designing client and log archival) behavior
On Tue, Jun 30, 2020 at 01:39 -0700, Jon Siwek wrote:

> * https://github.com/zeek/zeek/wiki/Zeek-Supervisor-Log-Handling

This overall sounds good to me. Some notes & questions:

> Log Rotation

> To help bridge/replace Step (4) and (5), suggest adding a new option:
> Log::default_rotation_dir. The Log::rotation_format_func() will use
> this as part of its default return value.

Seems we should then set this to "." by default, and have the cluster
framework override it.

> The log_mgr will attempt to create necessary dirs just-in-time,
> failing to do so emits an error, but otherwise continues with rotation
> using working directory instead.

I'd extend this to any error case: if moving from the current location
to Log::default_rotation_dir fails (e.g., because the latter is on a
different file system), continue with the new name inside the current
working directory (and report the error).

Once moved, I suppose we would continue to optionally run a
post-processor, right? For a supervised cluster, we wouldn't use that
and suggest that people go with "zeek-archive" instead; but with
ZeekControl we'd keep the current gzipping behavior so that we don't
break any setups. We can implement that distinction through the
post-processor function: the new default function would just do the
rename according to the new scheme, and a separate legacy function for
ZeekControl spawns the "archive-log" script.

> zeek-archiver

I like making this a standard tool, but it seems like something we could
postpone for now and prioritize getting the Zeek-side infrastructure in
place.

> We can potentially have the Zeek Supervisor process configurable to
> auto-start and keep a zeek-archiver child alive.

I'd say that's a job for systemd (or whatever service manager). I know
Seth disagrees. :-)

> Leftover Log Rotation

> The rotation for such a leftover log file uses the metadata in the
> shadowfile to help try to go through the exact rotation that it should
> have occurred, including running the postprocessor function.

Not sure it's worth retaining the information about the post-processor
function, and it could potentially lead to trouble if the function
changed somehow in between (or disappeared). We could instead just run
the leftovers through whatever the restarted config says to do with
files.

Do we even need any other meta data at all in the new scheme? I'm
wondering if we could simplify this all to: "If at open() time, X.log
exists, first rotate it away through the currently configured
postprocessor function". If we did that, we should probably have a
global boolean that allows choosing between that and just overwriting
existing files. The latter would be the default to retain current
command-line behavior, and the cluster framework would enable leftover
recovery.

Hmm, actually, there's a piece of meta that we'll need: the opening
timestamp, so that one can incorporate that into the name of the
rotated file (assuming we want to retain that capability). Unless we
parsed that out of the X.log itself ...

Robin

--
Robin Sommer * Corelight, Inc. * ro...@corelight.com * www.corelight.com
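
The "any error case" fallback suggested in the message above can be
sketched as follows. This is a Python illustration only, not Zeek's
log_mgr code. The function name is hypothetical, and the sketch assumes
a plain rename is used for the move, which is exactly the operation that
fails when the rotation directory is on a different file system.

```python
import os
import sys
from pathlib import Path


def rotate_with_fallback(log: Path, new_name: str, rotation_dir: Path) -> Path:
    """Move log into rotation_dir as new_name; on any error, keep it in its
    current directory under new_name and report the problem."""
    try:
        # Create the rotation directory just-in-time.
        rotation_dir.mkdir(parents=True, exist_ok=True)
        target = rotation_dir / new_name
        os.rename(log, target)  # raises OSError across file systems
        return target
    except OSError as err:
        print(f"rotation into {rotation_dir} failed: {err}", file=sys.stderr)
        fallback = log.parent / new_name
        os.rename(log, fallback)
        return fallback
```

Either way the active log name is freed for the next writer, and the
rotated file ends up somewhere a later archival step can still find it.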