Marielle Lange wrote:


I wrote the following awk script. Basically, it looks for a line starting with "From: ..." that is right after a line starting with "From ..." (if you download an archive file you will see that this systematically and unambiguously corresponds to the start of a post).

It's actually easier than that. The archives are in mbox format, so the start of a message is unambiguously marked by a line that begins with "From " (the word From followed by a space). Any line within a message body that starts with these 5 characters must be modified, usually by prepending ">", before the message can be put into an mbox file.
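
To make the quoting rule concrete, here is a minimal sketch in Python (not Rev) of what a mailer might do to a body line before appending the message to an mbox file. The function name escape_body_line is made up for illustration, and real mailers may apply variations of this rule.

```python
def escape_body_line(line: str) -> str:
    """Prepend ">" to a body line that would otherwise look like a message separator."""
    # Only lines beginning with the five characters "From " are affected;
    # a header-style line such as "From: someone" is left alone.
    if line.startswith("From "):
        return ">" + line
    return line
```

With this, a body line such as "From here on it gets harder" would be stored as ">From here on it gets harder".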

There is no guarantee that the From: line will immediately follow it; although it seems that it does in these archives, other versions of mbox format will put the From: lines later in the header section. But it doesn't matter - the "From " line is in itself unambiguous, and carries all the info we need (except for the synonyms for changed names).
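
As a rough illustration of pulling the sender out of the "From " line alone, here is a small Python sketch. It assumes the archive writes the address in the obfuscated form "user at host", as these archives appear to; the function name sender_of and the sample addresses are hypothetical.

```python
def sender_of(line: str):
    """Return the sender address from an mbox "From " separator line, or None."""
    if not line.startswith("From "):
        return None  # not a separator line (this also rejects "From: ..." headers)
    words = line.split()
    # Assumed archive form: From user at host <date...>
    if len(words) >= 4 and words[2] == "at":
        return words[1] + "@" + words[3]
    # Plain mbox form: From user@host <date...>
    return words[1] if len(words) > 1 else None
```

Called on "From alex at example.net  Tue Apr 11 10:00:00 2006", this would give "alex@example.net"; the From: header is never consulted.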

I'd handle the synonyms by having an array (for example)
 put "[EMAIL PROTECTED]" into  tMainAlias["[EMAIL PROTECTED]"]
(NB only need to do this for those which have synonyms).
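
The same alias idea, sketched in Python for comparison. The addresses below are invented placeholders, and main_alias and canonical are illustrative names, not anything taken from the archives.

```python
# Hypothetical synonym table: changed address -> main address.
# Only people who have posted under more than one address need an entry.
main_alias = {
    "old.name@example.org": "new.name@example.org",
}

def canonical(address: str) -> str:
    """Return the main address for a known synonym, else the address unchanged."""
    return main_alias.get(address, address)
```

Addresses without an entry pass through untouched, which is why the table only needs the synonyms.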

So then we have

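on mouseUp
    -- assumption: tFiles (archive file paths, one per line) and tMainAlias
    -- (filled in as described above) must be set up here before the loop
    set the caseSensitive to true
    repeat for each line tFile in tFiles
        repeat for each line L in URL ("file:" & tFile)
            -- an mbox message starts with a line beginning "From "
            if char 1 to 5 of L = "From " then
                -- words 2 to 4 hold the address in the form: user at host
                put word 2 to 4 of L into t
                replace " at " with "@" in t
                -- fold known synonyms into the main address
                if tMainAlias[t] is not empty then put tMainAlias[t] into t
                add 1 to tArray[t]
            end if
        end repeat
    end repeat
    -- build "address count" lines and sort them, most frequent poster first
    put empty into tSubmitters
    repeat for each line L in the keys of tArray
        put L && tArray[L] & cr after tSubmitters
    end repeat
    sort lines of tSubmitters descending numeric by word 2 of each
    put tSubmitters after msg
    -- draw a simple text bar chart, one "|" per post
    repeat for each line L in tSubmitters
        put word 1 of L && TAB && bars(word 2 of L) & CR after field "Field 1"
    end repeat
end mouseUp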

-- returns a string of pN "|" characters
function bars pN
    put empty into t
    repeat pN times
        put "|" after t
    end repeat
    return t
end bars

There - one simple solution in Rev rather than using awk and Excel :-)

--
Alex Tweedly       http://www.tweedly.net


