[Declude.JunkMail] phone regex/pcre help
I'm looking to replace these lines with a pcre but it doesn't seem to be working. Any suggestions?

BODY 175 CONTAINS 206 888-2083
BODY 175 CONTAINS 206.8882083
BODY 175 CONTAINS 2068882083
BODY 175 CONTAINS 206-8882083
BODY 175 CONTAINS 206 8882083

BODY 175 PCRE (?i:[\(\{]?2[0o]6[\)\}]?{\-\_\.\s}?888{\-\_\.\s}?2[0o]83)

Scott Fisher
Dir of IT
Farm Progress Companies
191 S Gary Ave
Carol Stream, IL 60188
Tel: 630-462-2323

This email message, including any attachments, is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message. Although Farm Progress Companies has taken reasonable precautions to ensure no viruses are present in this email, the company cannot accept responsibility for any loss or damage arising from the use of this email or attachments.

---
This E-mail came from the Declude.JunkMail mailing list. To unsubscribe, just send an E-mail to [EMAIL PROTECTED], and type unsubscribe Declude.JunkMail. The archives can be found at http://www.mail-archive.com.
RE: [Declude.JunkMail] phone regex/pcre help
Scott, my eyes water when I look at a long regexp. So without trying to work out that specific PCRE syntax, I'll suggest two things:

1) Make a generic detection that finds zero or more junk characters between the pieces of text you're looking for. The longer the parent string is, the less likely you are to have a false positive. For example, finding filler between a and b:

BAD: a.*b
This is bad because it is too greedy: it matches the longest stretch of a line that has an a, then any number of characters (up to the buffer size), and then a b.

LESS BAD: a.{0,2}b
This is less bad because we're restricting the wildcard to 0 through 2 characters between the a and the b, but it's still bad because the string is so short. Even if this were gibberish, you would likely hit it eventually as a false positive, for instance in the MIME encoding of a binary file.

AWESOME: Take a long string like a phone number and drop
.{0,2}
between each of the bits of text you think the bad guy will try to stuff with junk, including whitespace. Replace the 2 with however many characters you think are sensible. I think Declude wants the brace characters escaped, e.g.
.\{0,2\}
is the syntax to use in a PCRE.

2) A while back I had to fix some ugly regexp that plain old didn't work, and I used a Windows shareware app called The Regex Coach and it worked for me.
http://weitz.de/regex-coach/

Andrew.

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Scott Fisher
Sent: Tuesday, July 03, 2007 12:34 PM
To: declude.junkmail@declude.com
Subject: [Declude.JunkMail] phone regex/pcre help

I'm looking to replace these lines with a pcre but it doesn't seem to be working. Any suggestions?
BODY 175 CONTAINS 206 888-2083
BODY 175 CONTAINS 206.8882083
BODY 175 CONTAINS 2068882083
BODY 175 CONTAINS 206-8882083
BODY 175 CONTAINS 206 8882083

BODY 175 PCRE (?i:[\(\{]?2[0o]6[\)\}]?{\-\_\.\s}?888{\-\_\.\s}?2[0o]83)
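Andrew's three cases (unbounded filler, bounded filler on a short string, bounded filler on a long string) can be tried outside Declude. The following is only a sketch in Python, on the assumption that its re module behaves like Declude's PCRE for these simple constructs; it has not been run against Declude. The braces are left unescaped because Python does not want them escaped, and the two noise strings are made up for illustration.

```python
import re

# Made-up text that starts with an "a" and ends, much later, with a "b".
noise = "a quick note on the weather, nothing to do with b"

greedy = re.compile(r"a.*b")       # BAD: unbounded filler
bounded = re.compile(r"a.{0,2}b")  # LESS BAD: at most two filler characters

# The digit groups from the original post, joined by bounded filler.
phone = re.compile(r"206.{0,2}888.{0,2}2083")

# The five strings from the CONTAINS rules in the original post.
samples = [
    "206 888-2083",
    "206.8882083",
    "2068882083",
    "206-8882083",
    "206 8882083",
]

greedy_hit = greedy.search(noise) is not None
bounded_hit = bounded.search(noise) is not None
phone_hits = [phone.search(s) is not None for s in samples]

# Made-up text containing the same digit groups, but far apart.
phone_noise_hit = phone.search("order 206, qty 888, ref 2083") is not None

print("a.*b on noise:     ", greedy_hit)
print("a.{0,2}b on noise: ", bounded_hit)
print("phone on samples:  ", phone_hits)
print("phone on noise:    ", phone_noise_hit)
```

The unbounded version fires on the unrelated sentence, while the bounded phone pattern hits all five samples and ignores the digits that are merely near each other.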
Re: [Declude.JunkMail] phone regex/pcre help
Scott,

The following should do the same. Note that I do not know if Declude requires the whole match to be placed in parentheses.

2[0Oo]6[\s\r\n\-\.]*888[\s\r\n\-\.]*2[0Oo]83

Matt
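For comparison, here is a rough Python check of both expressions against the five CONTAINS strings. This is a sketch under the assumption that Python's re module is a fair stand-in for Declude's PCRE here; Declude itself may behave differently. One thing it makes visible: in the original expression the separators are wrapped in curly braces rather than square brackets, and in PCRE (as in Python) an opening brace that does not form a valid quantifier is taken as a literal character, so that expression asks for the literal text {-_. followed by whitespace.

```python
import re

# The five strings from the CONTAINS rules in the original post.
samples = [
    "206 888-2083",
    "206.8882083",
    "2068882083",
    "206-8882083",
    "206 8882083",
]

# Scott's original expression, copied unchanged from the first post.
original = re.compile(
    r"(?i:[\(\{]?2[0o]6[\)\}]?{\-\_\.\s}?888{\-\_\.\s}?2[0o]83)"
)

# Matt's suggestion, which uses a bracketed character class for separators.
suggested = re.compile(r"2[0Oo]6[\s\r\n\-\.]*888[\s\r\n\-\.]*2[0Oo]83")

original_hits = [bool(original.search(s)) for s in samples]
suggested_hits = [bool(suggested.search(s)) for s in samples]

# A contrived string with the brace contents spelled out literally.
literal_hit = bool(original.search("206{-_. 888{-_. 2083"))

print("original: ", original_hits)
print("suggested:", suggested_hits)
print("original on literal '{-_. ' text:", literal_hit)
```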
Re: [Declude.JunkMail] phone regex/pcre help
This would match on all you have provided. The . means any character, including a space, and {0,1} means a minimum of 0 and a maximum of 1:

(206.{0,1}888.{0,1}2083)

If you also wanted to detect the letter O used in place of the digit 0, use [o0], and you could add ?i: to make the match case insensitive:

(?i:2[o0]6.{0,1}888.{0,1}2[o0]83)

David B

From: Matt [EMAIL PROTECTED]
Sent: Tuesday, July 03, 2007 4:08 PM
To: declude.junkmail@declude.com
Subject: Re: [Declude.JunkMail] phone regex/pcre help

Scott,

The following should do the same. Note that I do not know if Declude requires the whole match to be placed in parentheses.

2[0Oo]6[\s\r\n\-\.]*888[\s\r\n\-\.]*2[0Oo]83

Matt
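A quick way to see what David's two expressions do and do not catch is sketched below in Python, assuming its re module treats . , {0,1} and (?i:...) the same way Declude's PCRE does (not verified against Declude). The obfuscated strings are made-up examples, not taken from real spam.

```python
import re

plain = re.compile(r"(206.{0,1}888.{0,1}2083)")
loose = re.compile(r"(?i:2[o0]6.{0,1}888.{0,1}2[o0]83)")

# The five strings from the CONTAINS rules in the original post.
samples = [
    "206 888-2083",
    "206.8882083",
    "2068882083",
    "206-8882083",
    "206 8882083",
]

# Made-up variants with the letter O standing in for the digit 0.
obfuscated = ["2O6-888-2O83", "2o6 888 2o83"]

plain_hits = [bool(plain.search(s)) for s in samples]
plain_on_obfuscated = [bool(plain.search(s)) for s in obfuscated]
loose_on_obfuscated = [bool(loose.search(s)) for s in obfuscated]

# The dot is a wildcard, so filler that is not a separator matches too.
wildcard_hit = bool(loose.search("206x888y2083"))

print("plain on samples:   ", plain_hits)
print("plain on obfuscated:", plain_on_obfuscated)
print("loose on obfuscated:", loose_on_obfuscated)
print("loose on 206x888y2083:", wildcard_hit)
```

The last line is the trade-off against a bracketed separator class: the dot also accepts letters as filler, which catches more obfuscation but leaves a little more room for false positives.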
Re: [Declude.JunkMail] phone regex/pcre help
Dave,

{0,1} = ?
{0,} = *
{1,} = +

Also note that beginning a sub-match with (?: improves PCRE's performance because it tells the engine not to track the sub-match. The engine likely has a hard limit on tracked sub-matches, to prevent an expression from becoming overly complicated with sub-matches that don't need to be tracked, and hitting that limit can result in missed matches. So never start a sub-match with just a parenthesis; always use (?: or another more specific construct.

A good thing to remember when dealing with regex and E-mail is that there can be code breaks such as <CODE>888</CODE>, line breaks, and also quoted-printable encoding. For instance, between every two characters that display immediately together and that you are attempting to match without normalizing, you would need to test for:

(?:=\r\n|(?:<[^>]+>)+)

It gets a lot worse when you start trying to allow for spaces, because of all the ways they can appear.

If Declude wants to get serious about applying regular expressions to the bodies of E-mail, you would need to normalize the data; otherwise you would end up with too many permutations. When I do this programmatically, I produce a range of variables: for instance, one that is the full original source, one that strips out all line breaks, one that removes quoted-printable encoding, one that removes HTML, and combinations thereof. If you are going to try to use regular expressions for finding phrases, this is the only way to do it without leaving a huge gaping hole, because even standard E-mail clients will produce source that would be missed. If you are going after E-mail format and not the content, then what you have is perfect.

Matt

David Barker wrote:

This would match on all you have provided, the . meaning any character including a space, {0,1} means min of 0 max of 1

(206.{0,1}888.{0,1}2083)

If you wanted to detect O as well as the 0, use [o0]; also you could use the ?i: meaning case insensitive:

(?i:2[o0]6.{0,1}888.{0,1}2[o0]83)

David B
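Matt's normalization idea can be sketched roughly as follows. This is not his code and not anything Declude provides: it is a minimal Python illustration, and the helper name variants, the sample body, and the crude tag-stripping regex are all assumptions made for the example. It builds a few normalized views of a body and runs one simple phone expression against each.

```python
import quopri
import re

# A simple separator-tolerant expression for the number in this thread.
PHONE = re.compile(r"2[0Oo]6[\s\-.]*888[\s\-.]*2[0Oo]83")


def variants(raw):
    """Return several normalized views of a message body (hypothetical helper)."""
    # Decode quoted-printable; this also removes "=\r\n" soft line breaks.
    decoded = quopri.decodestring(raw.encode("latin-1")).decode("latin-1")
    # Crude tag stripper, good enough for a sketch; not a real HTML parser.
    no_html = re.sub(r"<[^>]+>", "", decoded)
    # Drop any remaining hard line breaks.
    no_breaks = re.sub(r"[\r\n]+", "", no_html)
    return {
        "original": raw,
        "decoded": decoded,
        "no_html": no_html,
        "no_breaks": no_breaks,
    }


# Made-up body: a soft line break splits "888" and tags split "2083".
body = "Call 206-8=\r\n88-<b>20</b>83 today"

hits = {name: bool(PHONE.search(text)) for name, text in variants(body).items()}

for name, hit in hits.items():
    print(f"{name:10s} {hit}")
```

In this made-up case the expression misses the raw source and the decoded-only view, and only matches once the tags are removed as well, which is the gap Matt describes when matching against unnormalized bodies.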