Hmmm....it looks like I'm not finding all of the invalid addresses.  I'll 
keep working on it.
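
If ASSP logs other rejection messages as well, one option might be to give
findstr more than one /C: literal, since it matches lines that contain any
of the given strings.  A minimal sketch; the second search string below is
only a hypothetical placeholder, not an actual ASSP log message:

REM Hypothetical: collect lines matching either literal string.
REM "some other rejection message" is a placeholder, not a real ASSP log entry.
findstr /C:"invalid address rejected: " /C:"some other rejection message" *maillog > invalid.txt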

    Dave

Dave Emory wrote:
> Marrco wrote:
>>> Thanks!  I'll be working on some scripting to maintain a good
>>> spamtrap list using yours as a starting point.  The list I use now
>>> was tediously created
>>
>>
>>
>> Don't hesitate to ask, should any need arise. But don't forget to
>> share your findings and your improved scripts with the list (or use
>> the wiki).
> OK, here's what I've got so far...I decided to use native Windows
> commands with the exception of one executable program to "sort
> unique".  Here's the link to the original file I downloaded:
> http://golden-triangle.com/UNIQUE.COM
>
> Here's the batch file.  I run it as part of my rebuilddb.bat.  Edit
> the file names and/or directories to suit your own ASSP setup.  Note
> that "*maillog" will match all of my log files:
>
> ------------X---------------
> @echo off
> if exist tmp.txt del tmp.txt
> if exist tmpaddr.txt del tmpaddr.txt
> echo Collecting invalid email addresses from the ASSP logs...
> findstr /C:"invalid address rejected: " *maillog > invalid.txt
>
> REM Get interesting data only
>
> echo.
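> REM The two FOR loops below assume the rejected address is the 9th
> REM whitespace-separated field of the matched log line; the second loop
> REM then keeps only the local part (everything before the "@").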
> FOR /F "tokens=9 delims= " %%i in (invalid.txt) do @echo %%i >> tmpaddr.txt
> FOR /F "tokens=1 delims=@" %%i in (tmpaddr.txt) do @echo %%i >> tmp.txt
>
> echo Data collected and parsed...
> echo.
> REM Sort list
> echo Sorting list...
> echo.
> sort < tmp.txt > sorted.txt
> del tmp.txt
> del tmpaddr.txt
>
> REM Keep unique lines
> echo Removing duplicate email names...
> type sorted.txt | unique > penaltytrapaddresses.txt
> del sorted.txt
>
> echo.
> echo Finished!
> exit
>
> ------------X---------------
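>
> As a usage sketch (invalidaddr.bat is only an assumed name for the batch
> file above; the original post doesn't name it), the script could be run
> from rebuilddb.bat with something like:
>
> REM Inside rebuilddb.bat: rebuild the spamtrap list first.
> REM invalidaddr.bat is a hypothetical name for the batch file above.
> call invalidaddr.bat
>
> Note that the plain "exit" at the end of the script would close the whole
> cmd session when called this way; changing it to "exit /b" would return
> control to rebuilddb.bat instead.
>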
> Here's a Perl script that will "sort unique" as well:
>
> #!/usr/bin/perl -w
> use strict;
>
> sub ltrim($);
> # Set to filename of CSV file
> my $infile = 'names.txt';
>
> # Set to filename of de-duped file (new file)
> my $newfile = 'trapaddresses.txt';
>
>
> ### Shouldn't need to change stuff below here ###
>
> open (IN, "<$infile")  or die "Couldn't open input file: $!";
> open (OUT, ">$newfile") or die "Couldn't open output file: $!";
>
>
> # Slurp in & sort everything else
> my @data = sort <IN>;
> my $n = 0;
> # Now go through the data line by line, writing it to output unless it's
> # identical to the previous line (in which case it's a dupe)
> my $lastline = '';
> foreach my $currentline (@data) {
>
>  next if $currentline eq $lastline;
>  print OUT ltrim($currentline);
>  $lastline = $currentline;
>  $n++;
> }
>
> close IN; close OUT;
>
> print "Processing complete. In = " . scalar @data . " records, Out =
> $n records\n";
>
> # Left trim function to remove leading whitespace
> sub ltrim($)
> {
>     my $string = shift;
>     $string =~ s/^\s+//;
>     return $string;
> }
>
>
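> If the Perl script were used in place of UNIQUE.COM, the "unique" step in
> the batch file could be replaced with something along these lines
> (dedupe.pl is only an assumed file name, and $infile/$newfile in the
> script would have to be set to sorted.txt and penaltytrapaddresses.txt):
>
> REM Hypothetical replacement for the UNIQUE.COM step.
> REM Assumes the Perl script is saved as dedupe.pl with
> REM $infile = 'sorted.txt' and $newfile = 'penaltytrapaddresses.txt'.
> perl dedupe.pl
> del sorted.txt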

