On Oct 20, Rose, Jeff said:

Basically what I am trying to do here is parse through an email file,
grab the basics (from/to/subject), and put those in a small tab-separated
text database in the format of

File NumRecipients From FromIP Subject Spam-Status

and then pass the contents along to the spamassassin pm to check the status. But the email file contains these lines, which mess with spamassassin's filtering and which I have to remove in order to get an accurate spam score (using the pm, not the daemon; don't ask me why :-)):

P I 19-10-2005 21:35:00 0000 ____ ____ < [EMAIL PROTECTED] >
O T
A domain.com [123.12.123.1]
S SMTP [IP ADDRESS]
R W 19-10-2005 21:35:00 0000 ____ _FY_ [EMAIL PROTECTED]
R W 19-10-2005 21:35:00 0000 ____ _FY_ [EMAIL PROTECTED]
R W 19-10-2005 21:35:00 0000 ____ _FY_ [EMAIL PROTECTED]
R W 19-10-2005 21:35:00 0000 ____ _FY_ [EMAIL PROTECTED]

You're on the right track, but I think you're doing far too much with the regex when you should really just split each line on whitespace and deal with it like that.

  my @records;

  while (<FILE>) {
    my @fields = split;    # split $_ on whitespace
    next unless @fields;   # skip blank lines

    # sender
    if ($fields[0] eq 'P') {
      # start a new record (a hash ref); $fields[-1] is '>'
      push @records, { SENDER => $fields[-2] };
    }

    # recipient
    elsif ($fields[0] eq 'R') {
      push @{ $records[-1]{RECIPIENT} }, $fields[-1];
    }

    # SMTP
    elsif ($fields[0] eq 'S') {
      $records[-1]{SMTP} = $fields[-1];
    }

    # etc.
  }

Now you have an array, @records, whose elements are hash references, one per email. Here's what it looks like:

  @records = (
    # email 1
    {
      SENDER => '...',
      RECIPIENT => [ '...', '...' ],
      SMTP => '...',
      # whatever other fields you want
    },

    # email 2
    { ... },
  );
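
Once @records is filled in, writing out the tab-separated line you described could look roughly like this. It's only a sketch: the sample addresses and IP are made up, record_line is a name I'm inventing for illustration, and it covers only the NumRecipients, From and FromIP columns, since those are the fields parsed above.

```perl
use strict;
use warnings;

# Hypothetical sample data in the shape shown above.
my @records = (
  {
    SENDER    => 'sender@example.com',
    RECIPIENT => [ 'a@example.com', 'b@example.com' ],
    SMTP      => '[192.0.2.1]',
  },
);

# Build one tab-separated line: NumRecipients, From, FromIP.
sub record_line {
  my ($rec) = @_;

  # An array assigned to a scalar yields its element count;
  # fall back to an empty list if no recipients were seen.
  my $count = @{ $rec->{RECIPIENT} || [] };

  return join "\t", $count, $rec->{SENDER}, $rec->{SMTP};
}

print record_line($_), "\n" for @records;
```

The remaining columns (File, Subject, Spam-Status) would be added to the join in the same way once you store them in each hash.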

--
Jeff "japhy" Pinyan        %  How can we ever be the sold short or
RPI Acacia Brother #734    %  the cheated, we who for every service
http://www.perlmonks.org/  %  have long ago been overpaid?
http://princeton.pm.org/   %    -- Meister Eckhart
