--- [EMAIL PROTECTED] wrote:
> Hello All,
>
> I have made an e-mail forwarder that I want to filter out e-mail addresses.
> It is all working except the REGEX.
>
> I have NOT used REGEX before. Can someone tell me what I am doing wrong?
>
> This script loads the e-mail as a file from the local e-mail folder on the
> server.
>
> $regex =
> "/^[-_a-z0-9\'+*$^&%=~!?{}]++(?:\.[-_a-z0-9\'+*$^&%=~!?{}]+)*+@(?:(?![-.])[-a-z0-9.]+(?<![-.])\.[a-z]{2,6}|\d{1,3}(?:\.\d{1,3}){3})(?::\d++)?$/iD";
> $message = preg_replace( $regex, '[e-mail prorected]', $message);
Wow. That's one complex RegEx. I wonder how many of the characters in your
square brackets are even legal for emails according to the RFC. Uppercase
characters are legal too, but the i modifier at the end of your pattern
already makes the match case-insensitive, so those are covered.
As you know, an email address usually follows the form:
[EMAIL PROTECTED]
The server names can vary, especially when subdomains or country codes are
involved.
To keep this fairly simple at first and then build upon it, let's say that the
user name can contain letters, numbers, the dash, the underscore, and the dot
for our friends with old CompuServe addresses. We can get the upper and lower
case letters, the numbers, and the underscore with \w:
[0-9A-Za-z_] = \w
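If you want to check for yourself what \w covers, here is a rough sketch that tests a few single characters with preg_match. The sample characters and the $results variable are just my own choices for illustration. Note that \w accepts the underscore as well as letters and digits, while the dash and the dot have to be listed separately.

```php
<?php
// Check single characters against \w (PCRE, default non-Unicode mode).
$results = [];
foreach (['a', 'Z', '7', '_', '-', '.'] as $ch) {
    // preg_match returns 1 when the pattern matches, 0 when it does not.
    $results[$ch] = preg_match('/^\w$/', $ch) === 1;
}

foreach ($results as $ch => $isWord) {
    echo $ch, ' => ', $isWord ? 'word character' : 'not a word character', "\n";
}
```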
We need to have at least one character before the @ symbol. In a PHP regex the
@ symbol is not a special character, unlike in Perl, so we don't need to escape it.
Similarly, the server name will have at least one character after the @ symbol.
If we allow the dot and so forth then we may not care about the separation
between the server domain name and the top-level domain (e.g. com, net, org).
However, domain names don't have underscores, I believe, so I won't include
them here (YMMV):
/[EMAIL PROTECTED]/
The + is a quantifier which says "one or more of whatever is immediately to its left".
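To make that concrete, here is a minimal sketch of a simple pattern along the lines described: letters, numbers, dot, underscore, and dash before the @, and the same set minus the underscore after it. The exact character classes, the mask_emails function name, and the sample message are my assumptions for illustration, not a tested production rule. Note that the pattern has no ^ or $ anchors. With those anchors (and without the m modifier) a pattern can only match when the whole message is nothing but one address, which is one likely reason a replace inside a full e-mail body finds nothing.

```php
<?php
// Hypothetical helper: the name and the exact pattern are illustrative assumptions.
function mask_emails(string $message): string
{
    // User part:   one or more letters, digits, dots, underscores, or dashes.
    // Server part: one or more letters, digits, dots, or dashes (no underscore).
    // No ^ or $ anchors, so addresses are found anywhere in the message.
    $regex = '/[A-Za-z0-9._-]+@[A-Za-z0-9.-]+/';

    // preg_replace returns null on error; fall back to the untouched message.
    return preg_replace($regex, '[e-mail protected]', $message) ?? $message;
}

echo mask_emails("Contact Jane.Doe@example.com or bob_smith@mail.example.org today."), "\n";
// prints: Contact [e-mail protected] or [e-mail protected] today.
```

One known limitation of something this loose: a full stop directly after an address (as at the end of a sentence) is swallowed into the match, because the dot is allowed in the server part.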
The trouble with really complex RegEx rules is they can be hard to proofread.
One swapped or errant character and it fails. You'll have to decide if it
needs to be more complex than this for some reason.
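One thing that can help with proofreading is the x modifier, which lets you spread a pattern over several lines and comment each piece. The sketch below uses an illustrative pattern and a made-up sample string, both assumptions on my part. Whitespace outside the square brackets is ignored under x, and everything from # to the end of the line is a comment.

```php
<?php
// An illustrative pattern (an assumption, not a full RFC check),
// written with the x modifier so each part can carry its own comment.
$regex = '/
    [A-Za-z0-9._-]+   # user part: letters, digits, dot, underscore, dash
    @                 # literal at-sign, not special in PCRE
    [A-Za-z0-9.-]+    # server part: letters, digits, dot, dash
/x';

$masked = preg_replace($regex, '[e-mail protected]', 'Reply to james@example.net please');
echo $masked, "\n";
// prints: Reply to [e-mail protected] please
```

Keep the delimiter character (here /) out of the comments, since it would end the pattern early.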
James