On 06/25/2014 11:35 PM, Philip Prindeville wrote:
On Jun 25, 2014, at 3:00 PM, Axb <[email protected]> wrote:
On 06/25/2014 10:37 PM, Philip Prindeville wrote:
On Jun 25, 2014, at 3:09 AM, Axb <[email protected]> wrote:
On 06/25/2014 03:07 AM, Philip Prindeville wrote:
Anyone have rules to catch these they could point me at? Or any empirical
evidence about how successful they’ve been with such?
Wouldn't use this for a rule unless you meta it with lots of other traits.
The rawbody /href\=\"#\"/ match plus other traits could be combined.
Can you pastebin a sample?
Sure:
http://pastebin.com/4QFUZ6vd
the href template bork + the Base8 hashes are giveaways.
meta those rawbody traits together and you're rocking (for a while)
Sorry, which base8 hashes?
F1B9215E, etc
Also, I’m noticing the tracking info following the href…
F1B9215E-B1D0-40BC-92D1-F13D501596B7;F1B9215E-B1D0-40BC-92D1-F13D501596B7;F1B9215E-B1D0-40BC-92D1-F13D501596B7;F1B9215E-B1D0-40BC-92D1-F13D501596B7;F1B9215E-B1D0-40BC-92D1-F13D501596B7;F1B9215E-B1D0-40BC-92D1-F13D501596B7
Including 6 distinct UUIDs would seem to be useful. Including the same UUID 6
times seems broken.
Perhaps a pattern like:
body /((;[A-F0-9]{8}-[A-F0-9]{4}-[A-F0-9]{4}-[A-F0-9]{4}-[A-F0-9]{12})){4,}/
would be… no, wait… we’d need to save the first one, and then check for 3 or
more recurrences of the exact same literal string.
rawbody L_REPEATING_UUIDS /<a href="\#" .*(;[A-F0-9]{8}-[A-F0-9]{4}-[A-F0-9]{4}-[A-F0-9]{4}-[A-F0-9]{12}){4,}>/i
describe L_REPEATING_UUIDS Seeing the same tracking info repeated
score L_REPEATING_UUIDS 0.1
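For reference, the "save the first one, then check for recurrences of the exact same string" idea can be sketched with a Perl backreference. This is only a sketch under a few assumptions: the rule name L_SAME_UUID_REPEATED is hypothetical, the repeated UUIDs are assumed to sit on a single line of the raw body separated by semicolons (as in the sample above), and the threshold of 4 occurrences is arbitrary. It has not been run against a corpus.

```
# Hypothetical rule: capture one UUID, then require the SAME literal
# UUID to follow at least 3 more times (\1 is the first capture).
rawbody  L_SAME_UUID_REPEATED  /([A-F0-9]{8}-[A-F0-9]{4}-[A-F0-9]{4}-[A-F0-9]{4}-[A-F0-9]{12})(?:;\1){3,}/i
describe L_SAME_UUID_REPEATED  Same UUID repeated 4 or more times in a row
score    L_SAME_UUID_REPEATED  0.1
```

Unlike a plain {4,} repetition, this should not fire on a run of distinct UUIDs, because every repeat must equal the captured one.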
I'd do a less specific:
rawbody HREF_TP_BORK_HASH /<a\s+href="\#"/i
score HREF_TP_BORK_HASH 1.5
body BASE812C_DASHS /\;?\-?[A-F0-9]{12}\-?\;?/
score BASE812C_DASHS 1.5
meta META_DASHES_URIHASH (BASE812C_DASHS && HREF_TP_BORK_HASH)
score META_DASHES_URIHASH 3.5
tflags META_DASHES_URIHASH autolearn_force