On Tue, Oct 28, 2014 at 11:47 AM, francis picabia <fpica...@gmail.com>
wrote:

>
>
> On Mon, Oct 27, 2014 at 4:55 PM, John Hardin <jhar...@impsec.org> wrote:
>
>> On Mon, 27 Oct 2014, francis picabia wrote:
>>
>>    uri  URI_EXAMPLE_EXTRA  m;^https?://(?:www\.)?example\.com[^/?];i
>>>>>>
>>>>>
>>> However another spoofed message was received today and the rule
>>> did not capture it.
>>>
>>> If I want to detect something in the form of:
>>> random_server.example.com.junk
>>> I need to wildcard the first bit.  Would that be:
>>>
>>> uri  URI_EXAMPLE_EXTRA  m;^https?://(?:.*\.)?example\.com[^/?];i
>>>
>>> I don't understand what the question mark and colon does inside the ( )
>>> I thought it followed an optional char or expression.  Should it be
>>> like this?
>>>
>>> uri  URI_EXAMPLE_EXTRA  m;^https?://(.*\.)?example\.com[^/?];i
>>>
>>
>> (?:) means "group, don't remember the match". () remembers what's matched
>> for future use in the RE (e.g. to check for repeated strings like
>> "abcabcabcabc").
>>
>> Try this:
>>
>>   uri  URI_EXAMPLE_EXTRA  m;^https?://(?:[^./]+\.)*example\.com[^/?];i
>>
>>
> Once again, thanks for the RE coding.
>
> I found a false positive it captured with my attempt at this :
>
>  <a href="
> http://www.newslettersite.com/redirectnewsletter_login.asp?URL=http://www.secondsite.com/PYB/contact_us.asp&loginemail=u...@example.com&logincode=123456&utm_source=Articles_Air_01112014&utm_medium=email&utm_campaign=newsletter&utm_content=contactus
> "
>
> I've tested your rule with that and it does not tag for the above.
> Great.  Hopefully useful to others facing domain spoofs in phishing.
>
I thought this was a representative test case, but apparently
there is something triggering a false positive when the
email is a newsletter which embeds a user's email within URLs.

In the sample I've seen, there are 34 such possible links which may have
triggered the issue, but I don't know which.

I ran the quarantined sample through spamassassin -D and it shows:

Oct 28 16:24:01.391 [28945] dbg: rules: ran uri rule URI_MYDOMAIN_PHISH
======> got hit: "http://example.com&"
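
To check the reported hit string against the rule outside of SpamAssassin,
here is a minimal Python sketch. It assumes URI_MYDOMAIN_PHISH uses the same
pattern as the URI_EXAMPLE_EXTRA rule suggested earlier in the thread (the
actual rule text is not shown here), and that Python's re module treats these
constructs the same way Perl does. Apart from the first entry, the test URIs
are illustrative only.

```python
import re

# Python translation of the pattern suggested earlier in the thread.
# The m;...;i delimiters become a plain pattern plus re.IGNORECASE.
# Assumption: URI_MYDOMAIN_PHISH uses this same pattern.
RULE = re.compile(r"^https?://(?:[^./]+\.)*example\.com[^/?]", re.IGNORECASE)

# Only the first entry comes from the debug trace; the rest are
# hypothetical test cases.
CANDIDATES = [
    "http://example.com&",
    "http://random_server.example.com.junk/path",
    "http://www.example.com/",
    "http://www.newslettersite.com/redirectnewsletter_login.asp"
    "?URL=http://www.secondsite.com/PYB/contact_us.asp",
]


def rule_hits(uri: str) -> bool:
    """Return True when the rule pattern matches the given URI string."""
    return RULE.search(uri) is not None


for uri in CANDIDATES:
    print("HIT " if rule_hits(uri) else "miss", uri)
```

The trace string matches because "&" satisfies the trailing [^/?]. Since the
hit starts directly at the domain, it is probably not one of the full
newsletter links. One unconfirmed possibility is that the URI was derived from
text such as "...@example.com&logincode=" inside a link's query string.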

On prior lines in the trace I see other uri rules getting hits, but they
seem to involve different URLs.  The entire body of the email is base64
encoded.  Extracting that part and running base64 -d, I cannot find
the string reported in the SA trace.

This is my method:

zcat spam-jUVZBDml0wS5.gz | grep 'http://example.com'

So the URL is not in the non-base64 part.

zcat spam-jUVZBDml0wS5.gz > /tmp/spamfull
cp /tmp/spamfull /tmp/spam64
vi /tmp/spam64  (to remove headers)
base64  -d /tmp/spam64  | grep 'http://example.com'

(no matches)
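
As an alternative to stripping the headers by hand, the decoding step can be
sketched with Python's standard email package. This is only a sketch: the
function names are made up for illustration, and it searches for the bare
domain rather than the full "http://example.com" string, on the assumption
(not confirmed by the trace) that the scheme may not appear literally next to
the domain in the message.

```python
import email
import gzip
from email import policy


def decoded_text_parts(raw: bytes):
    """Yield (content_type, text) for each text/* part, transfer-decoded."""
    msg = email.message_from_bytes(raw, policy=policy.default)
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        # decode=True undoes base64 or quoted-printable transfer encoding
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        yield part.get_content_type(), payload.decode(charset, errors="replace")


def lines_mentioning(raw: bytes, needle: str = "example.com"):
    """Return (content_type, line) pairs whose decoded text contains needle."""
    hits = []
    for ctype, text in decoded_text_parts(raw):
        for line in text.splitlines():
            if needle.lower() in line.lower():
                hits.append((ctype, line.strip()))
    return hits


def scan_gzipped(path: str, needle: str = "example.com"):
    """Read a gzip-compressed message file (hypothetical path) and scan it."""
    with gzip.open(path, "rb") as fh:
        return lines_mentioning(fh.read(), needle)
```

Calling scan_gzipped("spam-jUVZBDml0wS5.gz") would list every decoded line
that mentions the domain, including ones where it only appears inside an
embedded email address.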

Double checked with:

spamassassin -D -lint < /tmp/spamfull 2>&1 | grep http://example.com

Nothing is output except the line above with URI_MYDOMAIN_PHISH.

Is there any suggestion on how to nail down where the match is happening?
