* On 09 Dec 2014, John Long wrote: 
> The messages seem to all have message-ids in the form
> 
> bunchofch...@m.something.com

You'll need to be much more specific if you want help writing a matching
regex.  Is "something" a semantic variable or literal?  What does
"bunchofchars" look like?

From all I can gather it sounds like they're generating totally legit
and normalized message-ids.  Any pattern that someone out here gives
you for them will match legitimate mail too (false positives).  That's
why we need specific examples to help.

> They also have email ids in the form
> 
> Idiot Spammer <id...@m.something.com>

By "email id" do you mean address?  Again, that looks completely normal.
Matching it will require examples.

> > However if your spammer's message-ids are actually showing an RFC822
> > address format, you might try:
> > 
> > ~i '\S+\s+<\S+@\S+>'
> > 
> > I'm assuming your regex library supports \s, \S. PCRE does. Otherwise
> > you could try
> > 
> > ~i '[^ ]+ +<[^ ]+@[^ ]+>'
> 
> You lost me on these two regexps. What's going on here?

That matches the following:

        [text][whitespace]<[text]@[text]>

This is what an email address should look like, but a message ID should
have only the <text@text> part.  (It should not have leading text +
whitespace.)
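
If it helps to see the difference, here is a rough sketch in Python that
runs both patterns quoted above against two made-up header values.  The
sample strings and the helper name looks_like_address are hypothetical,
and I'm assuming your regex library treats \S and \s the way Python's re
module does and searches anywhere in the header rather than anchoring.

```python
import re

# The two patterns quoted above, written as Python regexes.
PCRE_STYLE = re.compile(r'\S+\s+<\S+@\S+>')
PLAIN_STYLE = re.compile(r'[^ ]+ +<[^ ]+@[^ ]+>')

def looks_like_address(value, pattern=PCRE_STYLE):
    """True if value contains [text][whitespace]<[text]@[text]>."""
    return pattern.search(value) is not None

# Hypothetical sample values, not taken from real mail.
normal_message_id = "<abc123@m.example.com>"
address_style_id = "Some Name <abc123@m.example.com>"

for pattern in (PCRE_STYLE, PLAIN_STYLE):
    print(looks_like_address(normal_message_id, pattern))  # False
    print(looks_like_address(address_style_id, pattern))   # True
```

A normal message-id has no leading text and whitespace, so neither
pattern matches it; only the address-shaped value matches.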

-- 
David Champion • d...@bikeshed.us
