First off, thanks to all of you who scratched your heads over this
puzzle. Everyone had the right idea to one extent or another.

Based in part on the replies, and my own work, here's the final result:



    cat $FOLDER \
      |grep -A 5 "^Received:" \
      |egrep "^(Received:|      )" \
      |sed -E \
        -e "s/(^Received:|by|from)[[:space:]]+//g" \
        -e "s/\([HELO]{4}[[:space:]]+($NAME_RE)\)/\1/" \
        -e "s/\(($NAME_RE)[[:space:]]+\[($ADDY_RE)\]\)/\1 \2/g" \
        -e "s/(\(\[?|\[)($ADDY_RE)(\]|\]?\))/\2/g" \
        -e "s/[[:space:]]*(\(|id|via|with|E?SMTP|;).*//" \
        -e "s/(\(envelope-|for|Sun|Mon|Tue|Wed|Thu|Fri|Sat).*//" \
        -e "s/[][(){}<>]//g"

Note that the whitespace in the second pipe is one tab character.
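
For anyone retyping the pipeline, one way to avoid pasting a literal tab
is to build it with printf. This is only a sketch: the TAB variable, the
isolate_received function, and the sample message are made up for
illustration and are not part of the script above, and it uses grep -E
in place of egrep.

```shell
# TAB holds one literal tab character (hypothetical helper variable).
TAB=$(printf '\t')

# Keep each Received: line plus the tab-indented continuation lines
# found within the five lines that follow it.
isolate_received() {
  grep -A 5 "^Received:" | grep -E "^(Received:|${TAB})"
}

# Made-up sample message: one folded Received: header, then other lines.
printf 'Received: from a.example.com\n\tby b.example.com; Mon\nSubject: hi\n\n    body\n' \
  | isolate_received
```

Only the Received: line and its tab-indented continuation should come
out; the Subject: line and the space-indented body line are dropped.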

The first two pipes isolate the multi-line headers. The first sed
command strips "keywords" and any following whitespace. The second
sed command returns the name in a parenthetical HELO or EHLO. The
third sed command returns the name and address in a "(... [...])".
The fourth sed command - the one I inquired about - returns the
address in any of "(...)", "([...])", or "[...]". The fifth sed
command strips possible whitespace, "keywords" or an opening
parenthesis (now that it's of no consequence), and anything after
them. The sixth sed command strips more "keywords" and anything
after them (it might be merged into the fifth; what it strips is
often on another line). Finally, the last sed command strips any
errant delimiters; strictly speaking, it's redundant, but when I
ran a spam file (~12.3 MB) through this, some delimiters did leak
through.
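
To see the fourth sed command on its own, here is a small sketch. The
ADDY_RE value is an assumption (a loose, group-free IPv4 pattern); the
real definition isn't shown in this post. It is deliberately free of
parentheses, since an extra group inside $NAME_RE or $ADDY_RE would
shift the \1 and \2 back-references in the third command. The
unwrap_addr name and the sample line are likewise made up.

```shell
# Assumed stand-in for the real ADDY_RE: loose dotted-quad match with
# no capture groups of its own.
ADDY_RE='[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+'

# Fourth sed command from the pipeline, isolated: unwrap an address
# enclosed in "(...)", "([...])", or "[...]".
unwrap_addr() {
  sed -E -e "s/(\(\[?|\[)($ADDY_RE)(\]|\]?\))/\2/g"
}

# Made-up sample line covering all three delimiter styles.
printf '%s\n' 'a (192.0.2.1) b ([192.0.2.2]) c [192.0.2.3]' | unwrap_addr
```

With a POSIX (leftmost-longest) sed, this should print
"a 192.0.2.1 b 192.0.2.2 c 192.0.2.3".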

Just thought those that replied to my plea might like to see this,
and perhaps somebody else will find it useful. No, I'm not telling
what it's for.  ;-,


  ______________________                         ______________________
  \__________________   \    D. J. HAWKEY JR.   /   __________________/
     \________________/\     [EMAIL PROTECTED]    /\________________/
