Hi Mark,

I haven't had a chance to test your script yet, but I generally write one-liners to process text like yours. The question is how you want to take input: do you want to take it off the command line?

1. In the first example linked below, the file is `slurp`ed in, then split into records using the appropriate lookahead/lookbehind.

2. In the second example linked below, a "sliding window" approach is used: lines accumulate in an array. You can modify this code to empty the array when the next <log-record-herald> (header) line arrives.

https://unix.stackexchange.com/questions/703741/how-to-retrieve-data-from-a-logfile-where-timestamp-can-be-followed-by-multiline/704242#704242

https://unix.stackexchange.com/questions/29906/delete-range-of-lines-above-pattern-with-sed-or-awk/774599#774599

Finally, if you're looking for a (possibly difficult) "negated"-regex answer, you could look at the StackOverflow link below:

https://stackoverflow.com/questions/47396166/how-to-negate-subtract-regexes-not-only-character-classes-in-perl-6?rq=1

HTH, Bill.

> On Oct 25, 2025, at 10:16, Mark Devine <[email protected]> wrote:
>
> RE Gurus,
>
> I have a “match anything that is not this thing” pattern that I haven’t
> worked out yet.
>
> Colorized below is the (log) data to parse, sometimes multi-line, sometimes
> single line, in a repeated pattern. Here’s my test script:
>
> #!/usr/bin/env raku
>
> use Data::Dump::Tree;
> use Grammar::Debugger;
>
> my $data = q:to/END/;
> 3_1 2025-08-30T03:06:44-04:00 info Advanced Intrusion
> Detection Environment (AIDE) detected potential changes to software on this
> system. The changes are listed in /var/log/aide/aide.log and also at the end
> of this alert message.
> Summary : :
> Total number of entries :
> 54096
> Added entries : 1
> Removed entries : 0
> Changed entries : 0
> 1_1 2025-08-14T07:18:41-04:00 critical After initial accelerated
> space reclamation, file system / is 80% full, which is equal to or above the
> 80% threshold. Accelerated space reclamation will continue.
> This alert will be
> cleared when file system / becomes less than 75% full.
> Top three directories
> ordered by total space usage are as follows:
> /opt : 2.69G
> /root : 2.15G
> /usr : 1.76G
> 1_2 2025-08-14T17:36:40-04:00 clear File system / is 58%
> full, which is below the 75% threshold. Normal space reclamation will resume.
> END
>
> my grammar EXADATALOG-grammar {
>     token TOP                   { <log-record>+ }
>     token log-record            { <log-record-herald> \s+ <message> }
>     token log-record-herald     { ^ \s+ <name-field> \s+ <datetime-field> \s+ <status-field> }
>     token name-field            { \d+ '_' \d+ }
>     token datetime-field        { \d\d\d\d '-' \d\d '-' \d\d 'T' \d\d ':' \d\d ':' \d\d '-' \d\d ':' \d\d }
>     token status-field          { \w+ }
>     token not-log-record-herald { <!log-record-herald> }
>     token message               { <not-log-record-herald>+ }
> }
>
> ddt EXADATALOG-grammar.parse($data);
>
> =finish
>
> My strategy is to characterize the start of each record with
> <log-record-herald> as the anchor for the logic. Match a <log-record-herald>
> and match a potentially multi-line <message>, with <message> being anything
> that IS NOT a <log-record-herald>.
>
> Is this a viable approach? Anyone know what I’m missing here?
>
> Thanks,
>
> Mark
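
As a minimal sketch of the second approach described above (accumulating lines until the next herald line arrives), the Raku below builds one string per record. The `herald` token is adapted from the token definitions in the quoted script, the sample data is an abbreviated excerpt of the quoted log, and the names `@window` and `@records` are placeholders chosen for illustration. Treat it as a starting point under those assumptions rather than a tested solution for the real log file.

```raku
#!/usr/bin/env raku

# Abbreviated excerpt of the quoted log data (assumption: one herald per record).
my $data = q:to/END/;
    3_1 2025-08-30T03:06:44-04:00 info Advanced Intrusion
    Detection Environment (AIDE) detected potential changes to software on this
    system.
    Added entries : 1
    1_1 2025-08-14T07:18:41-04:00 critical After initial accelerated
    space reclamation, file system / is 80% full.
    1_2 2025-08-14T17:36:40-04:00 clear File system / is 58%
    full, which is below the 75% threshold.
    END

# Hypothetical herald pattern: name field, timestamp, status word.
# It is matched against one line at a time, so ^ means "start of that line".
my token herald {
    ^ \s*
    \d+ '_' \d+ \s+
    \d ** 4 '-' \d\d '-' \d\d 'T' \d\d ':' \d\d ':' \d\d '-' \d\d ':' \d\d \s+
    \w+
}

my @records;   # finished records, one string each
my @window;    # lines of the record currently being accumulated

for $data.lines -> $line {
    # A new herald closes the record collected so far and empties the array.
    if @window && $line ~~ / <herald> / {
        @records.push: @window.join("\n");
        @window = ();
    }
    @window.push: $line;
}

# Flush the final record, which has no following herald to close it.
@records.push: @window.join("\n") if @window;

for @records.kv -> $i, $record {
    say "--- record $i ---";
    say $record;
}
```

With the abbreviated sample this builds three records. To process a real file, one option is to replace `$data.lines` with `lines()`, which reads from the files named on the command line (or from standard input).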
