The logic of the idea seems to be sound, but I see here that <not-log-record-herald> matches "" (blank). I was hoping that <not-log-record-herald> would consider the string a match and pack it in the Match object. I'd use an action to accumulate if necessary. But since it discards the string, I think I might be out of luck entirely. Thoughts?
TOP
> | log-record
> | | log-record-herald
> | | | name-field
> | | | * MATCH "3_1"
> | | | datetime-field
> | | | * MATCH "2025-08-30T03:06:44-04:00"
> | | | status-field
> | | | * MATCH "info"
> | | * MATCH " 3_1 2025-08-30T03:06:44-04:00 info"
> | | message
> | | | not-log-record-herald
> | | | | log-record-herald
> | | | | * FAIL
> | | | * MATCH ""

Thanks,

Mark Devine
(202) 878-1500

From: Mark Devine <[email protected]>
Sent: Saturday, October 25, 2025 1:16 PM
To: [email protected]
Subject: Grammar: "match anything that is not this thing" RE

Gurus,

I have a "match anything that is not this thing" pattern that I haven't worked out yet. Colorized below is the (log) data to parse, sometimes multi-line, sometimes single line, in a repeated pattern.

Here's my test script:

#!/usr/bin/env raku

use Data::Dump::Tree;
use Grammar::Debugger;

my $data = q:to/END/;
    3_1 2025-08-30T03:06:44-04:00 info Advanced Intrusion Detection Environment (AIDE) detected potential changes to software on this system. The changes are listed in /var/log/aide/aide.log and also at the end of this alert message.
    Summary : :
    Total number of entries : 54096
    Added entries : 1
    Removed entries : 0
    Changed entries : 0
    1_1 2025-08-14T07:18:41-04:00 critical After initial accelerated space reclamation, file system / is 80% full, which is equal to or above the 80% threshold. Accelerated space reclamation will continue. This alert will be cleared when file system / becomes less than 75% full. Top three directories ordered by total space usage are as follows:
    /opt : 2.69G
    /root : 2.15G
    /usr : 1.76G
    1_2 2025-08-14T17:36:40-04:00 clear File system / is 58% full, which is below the 75% threshold. Normal space reclamation will resume.
END

my grammar EXADATALOG-grammar {
    token TOP                   { <log-record>+ }
    token log-record            { <log-record-herald> \s+ <message> }
    token log-record-herald     { ^ \s+ <name-field> \s+ <datetime-field> \s+ <status-field> }
    token name-field            { \d+ '_' \d+ }
    token datetime-field        { \d\d\d\d '-' \d\d '-' \d\d 'T' \d\d ':' \d\d ':' \d\d '-' \d\d ':' \d\d }
    token status-field          { \w+ }
    token not-log-record-herald { <!log-record-herald> }
    token message               { <not-log-record-herald>+ }
}

ddt EXADATALOG-grammar.parse($data);

=finish

My strategy is to characterize the start of each record with <log-record-herald> as the anchor for the logic. Match a <log-record-herald> and match a potentially multi-line <message>, with <message> being anything that IS NOT a <log-record-herald>.

Is this a viable approach? Anyone know what I'm missing here?

Thanks,

Mark
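
A minimal sketch of one way the "anything that is not a herald" strategy could be made to work, offered as an illustration rather than a definitive fix. `<!log-record-herald>` is a zero-width negative lookahead: it succeeds without consuming any input, which is consistent with the `MATCH ""` in the trace. Pairing the lookahead with `.` consumes one character each time the lookahead succeeds. The sketch also assumes the herald should be anchored at the start of a line (`^^`) rather than at the start of the string (`^`), so that records after the first can be recognized. The grammar name `LogSketch` and the shortened sample data are hypothetical.

```raku
#!/usr/bin/env raku

# Hypothetical, shortened sample in the same shape as the log data.
my $data = q:to/END/;
3_1 2025-08-30T03:06:44-04:00 info
First line of a multi-line message.
Second line of the same message.
1_2 2025-08-14T17:36:40-04:00 clear Single-line message.
END

grammar LogSketch {
    token TOP               { <log-record>+ }
    token log-record        { <log-record-herald> \s+ <message> }

    # Assumption: a herald always begins a line, so anchor with ^^ (start of line).
    token log-record-herald { ^^ \h* <name-field> \h+ <datetime-field> \h+ <status-field> }
    token name-field        { \d+ '_' \d+ }
    token datetime-field    { \d\d\d\d '-' \d\d '-' \d\d 'T' \d\d ':' \d\d ':' \d\d '-' \d\d ':' \d\d }
    token status-field      { \w+ }

    # The lookahead only checks; the dot consumes one character per iteration.
    token message           { [ <!log-record-herald> . ]+ }
}

my $match = LogSketch.parse($data);
die 'parse failed' unless $match;

for $match<log-record>.list -> $record {
    my $name    = ~$record<log-record-herald><name-field>;
    my $message = $record<message>.Str.trim;
    say "$name has {$message.lines.elems} message line(s)";
}
```

With this shape the message text is captured in `$record<message>`, so an action class could trim or accumulate it if needed.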
