On 10/07/16 09:32, j...@7lan.net wrote:
> Hi list,
>
> I want to parse the dspam system.log to add to a database. The "space"
> separated format makes it difficult to parse. Does anyone have some kind
> of grep / awk / regex to parse the system.log fields?
>
> Thanks!

It would be trivial in Perl.  Looking at a log line, you have nine fields:

    A Unix timestamp, 10 digits
    A single character status code
    A sender address ending with the character '>'
    A token which consists of a numeric UID and a 32-hex-digit hash
        separated by a comma
    The subject line as an unquoted string
    The spam score, as a single floating-point number
    The addressee's username, a single word
    The action, a single string containing no whitespace
    The message-ID enclosed in <>

You can trivially and unambiguously strip off the first four and the
last four fields, because you know enough about what their contents must
be to uniquely delimit each token.  Whatever remains must be the subject
line.

-- 
  Phil Stracchino
  Babylon Communications
  ph...@caerllewys.net
  p...@co.ordinate.org
  Landline: 603.293.8485

_______________________________________________
Dspam-user mailing list
Dspam-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspam-user
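
For anyone who wants something concrete, here is a rough sketch of the strip-both-ends approach described above. It is written in Python rather than Perl, purely as a matter of convenience. The regular expression is built only from the nine field descriptions in the post, so the exact character classes are assumptions and not taken from dspam's source. The name parse_line and the sample log line are made up for illustration. The first four fields are anchored to the start of the line and the last four to the end; whatever sits between them is taken as the subject.

```python
import re
from typing import Optional

# Pattern derived from the field list in the post. The character classes
# are assumptions, not something checked against dspam itself.
LOG_LINE = re.compile(
    r"""
    ^(?P<timestamp>\d{10})\s+            # Unix timestamp, 10 digits
    (?P<status>\S)\s+                    # single-character status code
    (?P<sender>.*?>)\s+                  # sender address, ends with '>'
    (?P<token>\d+,[0-9A-Fa-f]{32})\s+    # numeric UID, comma, 32 hex digits
    (?:(?P<subject>.*)\s+)?              # whatever remains: the subject
    (?P<score>-?\d+(?:\.\d+)?)\s+        # spam score, one float
    (?P<user>\S+)\s+                     # addressee's username, one word
    (?P<action>\S+)\s+                   # action, no whitespace
    (?P<message_id><[^<>\s]*>)\s*$       # message-ID in angle brackets
    """,
    re.VERBOSE,
)


def parse_line(line: str) -> Optional[dict]:
    """Return the nine fields as a dict, or None if the line does not match."""
    m = LOG_LINE.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    rec["timestamp"] = int(rec["timestamp"])
    rec["score"] = float(rec["score"])
    # The subject group is absent when the subject is empty.
    rec["subject"] = (rec["subject"] or "").strip()
    return rec


if __name__ == "__main__":
    # Made-up sample line, shaped after the field descriptions above.
    sample = (
        "1468143120 D <alice@example.com> "
        "1,0123456789abcdef0123456789abcdef "
        "Re: lunch at 12.30 tomorrow "
        "0.9876 bob Delivered <abc123@mail.example.com>"
    )
    print(parse_line(sample))
```

Because the last three fields contain no whitespace and the pattern is anchored at the end of the line, a subject that itself contains numbers or angle brackets should still come out whole. The \s+ separators accept tabs as well as spaces. Lines that do not fit the pattern return None, so they can be logged and inspected rather than silently inserted into the database.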