On 10/07/16 09:32, j...@7lan.net wrote:
> Hi list, 
> 
> I want to parse the dspam system.log to add to a database. The "space"
> separated format makes it difficult to parse. Does anyone have some kind
> of grep / awk / regex to parse the system.log fields?
> 
> Thanks!


It would be trivial in Perl.  Looking at a log line, you have nine fields:

A Unix timestamp, 10 digits
A single character status code
A sender address ending with the character '>'
A token which consists of a numeric UID and a 32-hex-digit hash
separated by a comma

The subject line as an unquoted string

The spam score, as a single floating-point number
The addressee's username, a single word
The action, a single string containing no whitespace
The message-ID enclosed in <>

You can trivially and unambiguously strip off the first four and the
last four fields, because you know enough about what their contents must
be to uniquely delimit each token.  Whatever remains must be the subject
line.
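
As a rough illustration of that approach, here is a minimal sketch, written in Python rather than Perl (the regex should carry over with little change). It assumes the nine fields look exactly as described above and are separated by whitespace. The names LINE_RE, parse_line and SAMPLE are made up for this example, and the sample line is invented, not taken from a real system.log, so check the pattern against your own log before relying on it.

```python
import re

# Anchor the four leading and four trailing fields; whatever is left
# in the middle is treated as the subject.
LINE_RE = re.compile(r"""
    ^(?P<timestamp>\d{10})\s+                 # Unix timestamp, 10 digits
    (?P<status>\S)\s+                         # single-character status code
    (?P<sender>[^>]*>)\s+                     # sender, up to the first '>'
    (?P<signature>\d+,[0-9a-fA-F]{32})\s+     # numeric UID, comma, 32 hex digits
    (?:(?P<subject>.*?)\s+)?                  # subject: free text, may be missing
    (?P<score>-?\d+(?:\.\d+)?)\s+             # spam score
    (?P<user>\S+)\s+                          # addressee's username
    (?P<action>\S+)\s+                        # action, no whitespace
    (?P<message_id><[^>\s]*>)\s*$             # message-ID in angle brackets
""", re.VERBOSE)


def parse_line(line):
    """Return a dict of the nine fields, or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    rec["timestamp"] = int(rec["timestamp"])
    rec["score"] = float(rec["score"])
    rec["subject"] = rec["subject"] or ""     # missing subject becomes ""
    return rec


# Invented sample line, for illustration only.
SAMPLE = ("1468143120 I <alice@example.com> "
          "1001,0123456789abcdef0123456789abcdef "
          "Re: lunch at 12.5 today? 0.0012 bob Delivered "
          "<abc123@mail.example.com>")

if __name__ == "__main__":
    print(parse_line(SAMPLE))
```

Because the last four fields contain no whitespace and the pattern is anchored at the end of the line, a number inside the subject (the "12.5" above) cannot be mistaken for the spam score. Lines that do not fit the pattern return None, so they can be logged and inspected rather than inserted into the database.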


-- 
  Phil Stracchino
  Babylon Communications
  ph...@caerllewys.net
  p...@co.ordinate.org
  Landline: 603.293.8485

_______________________________________________
Dspam-user mailing list
Dspam-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspam-user