On Sat, 3 Sep 2016, John Hardin wrote:
>I've tweaked the FP avoidance a bit, maybe that will be enough
>to get the S/O up high enough to publish it.

John, do you have any detailed info about the Ham hits?

I just datamined my three best corpora, from the beginning of
2014 thru this weekend, and found zero FPs, except for two hits
on that "img" test.  My data does NOT prove it's impossible for
anybody else, but it does seem odd, so I'm wondering if the
SA MassCheck mechanism has some means for the contributor to
pull out the corpses of specific hits.
If it doesn't, that would be a cool feature to add. :)


On Wed, 31 Aug 2016, Axb wrote:
>IMG src="data  can FP a lot.

AXB,
You are correct.
A few months ago, I had moved that rule in with my other "data"
rules, apparently because they had the token "data" in common.

I dug thru my notes, and the image rule was originally added to
combat a semi-subtle snowshoe campaign sent via Linode (as hosts,
they're much better than the other big-cheap-VPSs, so I've been
resisting scoring their IP blocks, which means that snow sent
thru them is sometimes harder to catch).

When I checked all data for 2014 to now in my three best corpora
(about 840 K-spam), I found that all the image spam hits were in
snow, and were NOT overtly dangerous, whereas all the non-image
"data" stuff has been in well-crafted Phish (UBER-dangerous).

There were exactly two Ham hits, and both were :grind-teeth:
ostensibly legitimate, albeit non-urgent.

Perhaps ironically or merely sadly, one was an 800 KB monstrosity
of HTML badness (yes, all in one single Part), with several
images and :cringe: fonts inlined via "data" statements.  When I
tried to view it as an HTML page in my raw corpse viewer (using
an old-ish open source HTML rendering engine), it ground away
for a while then died. :(
Who was the Sender?
Norton.
Yes, THAT Norton.
... and the Subject header was:
"ClubNorton Newsletter: Avoiding Social Engineering Tricks on Social Networks"

I've been scoring my data img rule at about 2.3 so it's well
below Poison Pill, and would not have caused either of those two
Hams to die.  Though I would not have lost sleep over a
Mercy Killing of the "ClubNorton" monstrosity. ;)

Bottom line:
I strongly recommend a high-scoring non-img "data" rule, and
gently recommend a modestly scored img "data" rule.
Everyone's mileage will vary, as always. :)
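
For anyone who wants to try the img / non-img split on their own corpus before
writing real rules, here is a rough Python sketch. It is NOT a SpamAssassin
rule and not what I actually run: the names (DataUriScanner, score_html) are
made up for illustration, the 2.3 is the img score mentioned above, and the
4.0 is just a placeholder for "high". It only looks at tag attributes, so
"data" URIs buried inside CSS (e.g. inlined fonts) would be missed.

```python
from html.parser import HTMLParser

IMG_DATA_SCORE = 2.3      # "about 2.3", as mentioned in this post
NON_IMG_DATA_SCORE = 4.0  # hypothetical placeholder for a "high" score


class DataUriScanner(HTMLParser):
    """Counts data: URIs, split into <img src=...> and everything else."""

    def __init__(self):
        super().__init__()
        self.img_hits = 0
        self.non_img_hits = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        for name, value in attrs:
            if value is None:
                continue
            if not value.strip().lower().startswith("data:"):
                continue
            if tag == "img" and name == "src":
                self.img_hits += 1
            else:
                self.non_img_hits += 1


def score_html(html_text):
    """Return (score, img_hits, non_img_hits) for one HTML body part."""
    scanner = DataUriScanner()
    scanner.feed(html_text)
    scanner.close()
    score = 0.0
    if scanner.img_hits:
        score += IMG_DATA_SCORE
    if scanner.non_img_hits:
        score += NON_IMG_DATA_SCORE
    return score, scanner.img_hits, scanner.non_img_hits
```

Feed it the decoded HTML part of each message and tally the two counters
separately for Ham and Spam; that should show fairly quickly whether the
img variant FPs in your own mail the way it did (twice) in mine.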
        - "Chip"

P.S. JavaScript... I agree 100% with John, while respecting AXB's
right to disagree and choose his own poison. ;)
I'll describe what I'm doing later, in a separate thread.
It's flexible enough to provide good protection, while letting
in all but the self-injurious Ham (e.g. someone at Amazon drank
some of the ClubNorton koolaid).

