Re: How to create a URIBL

2016-10-19 Thread Kris Deugau
Alex wrote:
> Hi,
> 
> I've collected a bunch of URIs that I'd like to incorporate into my
> rulebase. I know how to create a DNSBL, but I don't specifically know
> how to create a URIBL. Can I use rbldnsd for this? Or would I have to
> extract the IP or hostname from the URL, then also use a bunch of uri
> rules? If so, is there a way of automating this, given a list of URIs?
> 
> For example, I have URIs like:
> 
> http://109.73.134.241/dgq01px
> http://51steel1.org/s4b5ztgcx
> http://amessofblues1.com/m0dqfx

Do you want to use the full URI (including the /dgq01px or /s4b5ztgcx
parts), or just the domain names?

If you want the full URI, I think you're pretty much stuck collecting
them up in a huge list of uri rules, unless you want to write a custom
plugin to do a custom DNS lookup.  (Not sure some of the new DNS lookup
widgets will go quite far enough to support something like this directly.)

If you only want the domain name, you can feed those into a local DNSBL.
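
For the "is there a way of automating this" part of the question, here is a minimal Python sketch that pulls just the host part out of a list of URIs. It was not posted in this thread; the function name hosts_from_uris is made up for illustration, and it assumes every input line is a complete URI with a scheme.

```python
from urllib.parse import urlsplit


def hosts_from_uris(uris):
    """Return the unique, lower-cased host part of each URI, in input order."""
    seen = set()
    hosts = []
    for uri in uris:
        host = urlsplit(uri.strip()).hostname  # None when the URI has no host part
        if host and host not in seen:
            seen.add(host)
            hosts.append(host)
    return hosts


uris = [
    "http://109.73.134.241/dgq01px",
    "http://51steel1.org/s4b5ztgcx",
    "http://amessofblues1.com/m0dqfx",
]
print(hosts_from_uris(uris))
```

The path components (/dgq01px and so on) are dropped on purpose, since a DNS-based list can only key on the host.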

> I'm also then not sure which of uri* rule definition should be used.
> I've used urirhsbl before for a local host blocklist, but now after
> reading the man page again for the first time in a while, I'm not even
> sure that's correct.

"uri" rules are standard SA regular expression rules that only look at
things that SA has extracted from the message as a URI.

The others are DNSBL lookup rules, with a lot of variations on how the
lookup should be done, and the results broken down.  The
Mail::SpamAssassin::Plugin::URIDNSBL man page has all the details, but
my experience has been that for local use, you generally only need
uridnsbl and/or uridnssub.

> I'm also unclear about rbldnsd config for dnset, where hostnames would
> be used. Here is my current command-line:

Other responses have gone into more detail on this, which I probably
tested for myself at one point when I set up local DNS blacklists.

I also wrote some basic tools to feed both relay IP and URI domain data
into these local lists;  I've published them at
https://secure.deepnet.cx/trac/dnsbl.  Note that these are mainly
data-entry/export utilities, and they're a little rough around the
edges, but these are substantially what I've been using in production
for quite a few years now.

-kgd


Re: How to create a URIBL

2016-10-19 Thread Rob McEwen

On 10/18/2016 9:09 PM, Alex wrote:

> How do you then enter ranges? For example, one of the rbldnsd zone
> examples I've seen has entries such as:
> 1.168.160.0-255
> That does not look to be in reverse order, as the host octet is still last.


while there may be a more complicated and unusual answer for this.. the 
short answer is... you don't, and you shouldn't have to.


(1) IPs at the base of clickable links inside the body of the message in 
spams... is still a little rare... comprising roughly 2% of all such 
listings.


(2) This means that (a) those IPs aren't taking up a lot of space in the 
dnset files, when compared to the domains and host names there, and (b) 
of that ~2% of IPs, extremely few of those are even in the same /24 
block - so you don't get much mileage out of trying to list ranges


having said that... sending-IP lists that use ip4set DO have the 
functionality that you desire. ip4set actually has quite a number of 
acceptable formats to list blocks or ranges of IPs.


ip4tset... not so much. ip4tset is built for EXTRA speed and EXTRA 
low-memory usage, but isn't as flexible and generally requires one 
single IP per line.


Based on your question, it could be that you're trying to merge your 
sending IP blacklist, with your URI/domain blacklists... all into one 
single dnset rbldnsd file? if so, that is NOT recommended. It causes 
problems and removes some of rbldnsd's best features/strengths.



> Your service is great, btw.


Thanks. Please send me a note off-list as to how/why you think that. 
I'm not looking for praise... just curious if you're one of my clients 
(such as at your dayjob?) or if we've crossed paths somewhere and I 
forgot about it?... or if you have ever tested invaluement? etc (though 
I know you're a frequent SA discussion participant)



--
Rob McEwen
http://www.invaluement.com
+1 (478) 475-9032




Re: How to create a URIBL

2016-10-19 Thread Rob McEwen

On 10/19/2016 3:51 AM, Matus UHLAR - fantomas wrote:

> are you REALLY sure the IP has to be reversed?
> rbldns parses IP and reverses them by itself, if used in ip4* dataset.
> When used in dnset, it should not be reversed.


Your most valid points do not apply to "dnset". they apply to ip4tset 
and ip4set for sending-IP blacklists.


Let me explain... but before I explain, let me say that I'm not arguing 
for any of this. These standards were put in place long before my time 
(and are followed by SURBL and URIBL, too). Or, at least I didn't set 
these standards. I MIGHT have been involved in some of the discussions 
about this circa 2004, in internal discussions at SURBL - and in SA 
discussions - but I think this was all set just a little before my time 
in those forums.


So basically, if you look at the anatomy of a domain name... from left 
to right, you get into a higher hierarchy.


So in "foo.example.com"

"foo" is drilling into detail. while "example.com" is the bigger 
picture. And then ".com" is an even bigger picture! In a domain, as you 
get FURTHER to the right, you go to a HIGHER hierarchy or level.


But IPs are the opposite. For an IPv4 IP, the leftmost number is the 
highest in the hierarchy, and you drill down into more detail as you 
move to the right.


For this reason, it was decided a long time ago... that for URI DNSBL 
blacklists that use "dnset", the IP should be reversed in the source file.


Therefore, in the data file, the test point IP:

127.0.0.1

shows up as

1.0.0.127

And then when the client queries that IP, the query is formatted as follows:

1.0.0.127.example.com

(where example.com is the URI blacklist's host name)

And, likewise, ALL of the major anti-spam software (such as 
SpamAssassin) automatically reverses the IP when that (forward-ordered) 
IP is extracted from the base of a URL found in the body of a spam, and 
then this is prepended to the URI blacklist's hostname, 
for checking against URI blacklists (such as SURBL, URIBL, or my own 
ivmURI list)
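
As a rough illustration of the client-side step described above, here is a Python sketch of how the query name could be put together. It is not SpamAssassin's code; the function name uribl_query_name is made up for this example, and only IPv4 literals are handled.

```python
import ipaddress


def uribl_query_name(host, zone):
    """Build the DNS name a client would look up for `host` in the list `zone`."""
    host = host.strip().rstrip(".").lower()
    try:
        ip = ipaddress.IPv4Address(host)
    except ipaddress.AddressValueError:
        label = host  # domains and host names are used as-is
    else:
        label = ".".join(reversed(str(ip).split(".")))  # IPv4 literals get reversed
    return f"{label}.{zone}"


print(uribl_query_name("127.0.0.1", "example.com"))        # 1.0.0.127.example.com
print(uribl_query_name("foo.example.net", "example.com"))  # foo.example.net.example.com
```

Because the client sends the reversed form, a dnset data file (which is matched as plain names) has to carry the reversed form too.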


This decision to do it this way PROBABLY had something to do with trying 
to get rbldnsd engine to NOT have to internally treat IPs and 
domains/host-names differently. otherwise, it would have had to "know" 
to reverse IPs, but yet know to NOT reverse domains or host names. (and 
who knows what TLDs could be coming up in the future?)


In contrast, IPs found in sending IP data files (for ip4tset and ip4set) 
don't have this inconsistency problem. So it makes sense to just leave 
them in forward-order, for EASY readability... and then just allow 
rbldnsd to reverse order them on-the-fly. (thank God - I'd go nuts if my 
ip4tset and ip4set were all in reverse order! meanwhile, IPs in URIBL 
data files are usually a TINY percentage of the listings!)


--

Having said all of that, for regular sending-IP blacklists, (just as you 
said) the IP is NOT in reverse order in the file. But rbldnsd "knows" to 
reverse order it in memory, before it is compared to the reverse-ordered 
query that comes in from the client.


So you're correct when you say, "rbldns parses IP and reverses them by 
itself" ... but that only applies to sending-IP blacklists, set up with 
ip4tset and ip4set in rbldnsd.


As shown, dnset operates differently for IP addresses found in URIBL 
blacklists.


--

This was a trip down memory lane for me.

--
Rob McEwen
invaluement


Re: How to create a URIBL

2016-10-19 Thread Axb

On 10/19/2016 09:51 AM, Matus UHLAR - fantomas wrote:

> On 18.10.16 20:03, Rob McEwen wrote:
>> So your three examples:
>>
>> 109 .73 .134 .241
>>
>> would look like this:
>>
>> .241 .134 .73 .109
>>
>> NOTICE 2 things:
>>
>> (2) the fact that the IP is in reverse order. The great part about
>> rbldnsd is that a lookup on either
>
> are you REALLY sure the IP has to be reversed?
> rbldns parses IP and reverses them by itself, if used in ip4* dataset.
> When used in dnset, it should not be reversed.



in the rbldnsd zone the IP does NOT have to be reversed
the query reverses the IP



Re: How to create a URIBL

2016-10-19 Thread Matus UHLAR - fantomas

On 18.10.16 20:03, Rob McEwen wrote:

> So your three examples:
>
> 109 .73 .134 .241
>
> would look like this:
>
> .241 .134 .73 .109
>
> NOTICE 2 things:
>
> (2) the fact that the IP is in reverse order. The great part about
> rbldnsd is that a lookup on either


are you REALLY sure the IP has to be reversed?
rbldns parses IP and reverses them by itself, if used in ip4* dataset.
When used in dnset, it should not be reversed.

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Eagles may soar, but weasels don't get sucked into jet engines. 


Re: How to create a URIBL

2016-10-18 Thread Alex
Hi,

> (2) the fact that the IP is in reverse order.

How do you then enter ranges? For example, one of the rbldnsd zone
examples I've seen has entries such as:

1.168.160.0-255

That does not look to be in reverse order, as the host octet is still last.

> foo.example.com:127.0.0.2:Blocked System
>
> in my experience, I haven't been able to get this to work unless I put a
> space just before the first colon, as follows
>
> foo.example.com :127.0.0.2:Blocked System

That was my exact problem that caused me to write this post. It was
frustrating that ip4set worked fine, but dnset always failed because
of that.

> But sometimes you don't need that and can simply use just the domain or IP
> on each line, since much of that can be accomplished with a single line
> at/near the top of the file, such as this one that I use for the invaluement
> URI list:
>
> :127.0.0.2:Blocked by ivmURI - see http://www.invaluement.com/lookup/?item=$

Yes, this is what I've settled on for now.

> of course, the most difficult part is not collecting spammy IPs and
> domains... that part is easy. The most difficult part is knowing when NOT to
> blacklist a domain--which would be a decoy domain found in a spam, that
> wasn't the actual "payload" for the spam and is instead an innocent
> bystander's domain -- and/or generally keeping FPs super low. THAT is the
> hard part.

Yeah, absolutely. That's a large part of what's been delaying my
progress with my honeypots. It's still in progress, but one thing I've
been doing is checking my entries against existing whitelists, and
other ways such as seeing how long they've been around, etc.
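
The whitelist check mentioned here could look roughly like the Python sketch below. It is only an illustration, not what is actually in use; the function name split_by_whitelist is made up, and it matches exact names only, so a host underneath a whitelisted domain is still kept as a candidate.

```python
def split_by_whitelist(candidates, whitelist):
    """Split candidate hosts into (keep, skipped) using exact whitelist matches."""
    allowed = {w.strip().rstrip(".").lower() for w in whitelist}
    keep, skipped = [], []
    for host in candidates:
        name = host.strip().rstrip(".").lower()
        if name in allowed:
            skipped.append(name)  # would be a false positive if listed
        else:
            keep.append(name)
    return keep, skipped


keep, skipped = split_by_whitelist(
    ["Blogspot.com", "somehorrificspammerfromhell.blogspot.com"],
    ["blogspot.com"],
)
print(keep)
print(skipped)
```

Domain age and similar signals would need extra data sources and are left out here.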

> But try this and blacklist:
>
> .blogspot.com
>
> ...and trigger massive FPs... when you should have listed:
>
> .somehorrificspammerfromhell.blogspot.com

Yes, exactly. I've just been doing specific hostnames.

I appreciate that this is slightly off-topic, but it's an extension of
SA. Thanks so much for your help. Your service is great, btw.


Re: How to create a URIBL

2016-10-18 Thread Rob McEwen

Alex,

here are some suggestions:

In your rbldnsd-formatted file, put a dot at the beginning, which serves 
as a wildcard.


So your three examples:

109 .73 .134 .241
51steel1 .org
amessofblues1 .com

(I added spaces here to evade spam filtering, but those spaces shouldn't 
actually be there)


would look like this:

.241 .134 .73 .109
.51steel1 .org
.amessofblues1 .com

(again, the extra spaces shouldn't be there)

NOTICE 2 things:

(1) The extra dot at the beginning
-and-
(2) the fact that the IP is in reverse order. The great part about 
rbldnsd is that a lookup on either


example.com
OR
www.example.com
OR
foo.bar.foo.example.com

ALL of those will get a "hit" when the rbldnsd file has

.example.com
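
To make the leading-dot behaviour concrete, here is a small Python model of the matching as described above. It is only a model of the observable behaviour, not rbldnsd's actual implementation, and the function name is_listed is made up for illustration.

```python
def is_listed(query, entries):
    """Return True if `query` is covered by any entry.

    An entry with a leading dot covers the name itself and everything below it;
    an entry without one covers only the exact name.
    """
    query = query.strip().rstrip(".").lower()
    for entry in entries:
        entry = entry.strip().lower()
        if entry.startswith("."):
            base = entry[1:]
            if query == base or query.endswith("." + base):
                return True
        elif query == entry:
            return True
    return False


entries = [".example.com"]
for name in ("example.com", "www.example.com", "foo.bar.foo.example.com"):
    print(name, is_listed(name, entries))
```

Note that the check is on whole labels, so a name such as notexample.com is not caught by .example.com.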



When it comes to formatting the rbldnsd-formatted file, in addition to 
my suggestions above, it comes down to a choice... make it a simple list 
of the domains and (reverse-ordered) IPs? Or provide more information 
for each individual IP, such as a custom text response, as you did here:


foo.example.com:127.0.0.2:Blocked System

in my experience, I haven't been able to get this to work unless I put a 
space just before the first colon, as follows


foo.example.com :127.0.0.2:Blocked System

But sometimes you don't need that and can simply use just the domain or 
IP on each line, since much of that can be accomplished with a single 
line at/near the top of the file, such as this one that I use for the 
invaluement URI list:


:127.0.0.2:Blocked by ivmURI - see http://www.invaluement.com/lookup/?item=$

...which then causes all following lines of just domains and IPs... to 
use this line above as if it were on every single line. - and the "$" 
causes the actual listed item to show up in the SMTP text message. That 
"$" feature can be very informative and helpful!

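Putting those pieces together, a dnset data file could be generated along the lines of the Python sketch below. It is an illustration only, not a tool from this thread: the names dnset_entry and build_dnset are made up, the default line is the example.com one from Alex's file, and the $NS/$SOA lines are left out.

```python
import ipaddress

# Default value line; the "$" is replaced by the listed item in the TXT answer.
HEADER = ":127.0.0.2:Blocked System: http://example.com/bl?$"


def dnset_entry(host):
    """Return the wildcard (leading-dot) entry for a host, reversing IPv4 literals."""
    host = host.strip().rstrip(".").lower()
    try:
        ip = ipaddress.IPv4Address(host)
    except ipaddress.AddressValueError:
        return "." + host
    return "." + ".".join(reversed(str(ip).split(".")))


def build_dnset(hosts, header=HEADER):
    """Return the text of a dnset file: one default line, then sorted unique entries."""
    lines = [header]
    lines += sorted({dnset_entry(h) for h in hosts})
    return "\n".join(lines) + "\n"


print(build_dnset(["109.73.134.241", "51steel1.org", "amessofblues1.com"]), end="")
```

With the default line at the top, the entries themselves need no per-line value, which also sidesteps the space-before-the-colon problem.
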

of course, the most difficult part is not collecting spammy IPs and 
domains... that part is easy. The most difficult part is knowing when 
NOT to blacklist a domain--which would be a decoy domain found in a 
spam, that wasn't the actual "payload" for the spam and is instead an 
innocent bystander's domain -- and/or generally keeping FPs super low. 
THAT is the hard part.


There are other issues as to WHERE to divide the domain.

For example, if you listed

.foo.bar.foo.bar.foo.bar.foo.bar.example.com

... but foo.bar.foo.bar.foo.bar.foo.bar. was just decoy material added 
by the spammer... then...


foo.bar.example.com comes in and guess what? your lookup fails to find 
it. Yet all such variations would be listed if you had simply blacklisted:


.example.com
(again, with the dot in front)

But try this and blacklist:

.blogspot.com

...and trigger massive FPs... when you should have listed:

.somehorrificspammerfromhell.blogspot.com

so that either

www.somehorrificspammerfromhell.blogspot.com
OR
somehorrificspammerfromhell.blogspot.com
OR
foo.bar.foo.bar.somehorrificspammerfromhell.blogspot.com

would ALL return listing, but

blogspot.com

...wouldn't.

So it also takes some work determining those boundaries. Some of those 
are simple domains... while others like blogspot.com or wordpress.com, 
are more "artificial" (but still critically important).
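
One way to encode that boundary decision is sketched below in Python. This is a hypothetical helper, not how any real list does it: the name listing_boundary and the two-item SHARED_PLATFORMS set are made up for illustration, and for names outside that set the host is returned unchanged rather than guessing at the registered domain.

```python
# Illustrative only; a real list of shared hosting platforms would be much longer.
SHARED_PLATFORMS = {"blogspot.com", "wordpress.com"}


def listing_boundary(host, shared=SHARED_PLATFORMS):
    """Return the wildcard entry to list for `host`, or None if it is a whole platform."""
    labels = host.strip().rstrip(".").lower().split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in shared:
            if i == 0:
                return None  # the platform itself: listing it would cause massive FPs
            return "." + ".".join(labels[i - 1:])  # platform plus one customer label
    return "." + ".".join(labels)  # not a known platform: boundary left to the caller


print(listing_boundary("www.somehorrificspammerfromhell.blogspot.com"))
print(listing_boundary("blogspot.com"))
```

Deciding where to cut an ordinary domain (for example dropping decoy subdomains) still needs human judgment or a public-suffix style data source.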



--
Rob McEwen
invaluement.com



Re: How to create a URIBL

2016-10-18 Thread Joe Quinn

On 10/18/2016 6:21 PM, Alex wrote:

> Hi,
>
> I've collected a bunch of URIs that I'd like to incorporate into my
> rulebase. I know how to create a DNSBL, but I don't specifically know
> how to create a URIBL. Can I use rbldnsd for this? Or would I have to
> extract the IP or hostname from the URL, then also use a bunch of uri
> rules? If so, is there a way of automating this, given a list of URIs?
>
> For example, I have URIs like:
>
> http://109.73.134.241/dgq01px
> http://51steel1.org/s4b5ztgcx
> http://amessofblues1.com/m0dqfx
>
> I'm also then not sure which of uri* rule definition should be used.
> I've used urirhsbl before for a local host blocklist, but now after
> reading the man page again for the first time in a while, I'm not even
> sure that's correct.
>
> I'm also unclear about rbldnsd config for dnset, where hostnames would
> be used. Here is my current command-line:
>
> /usr/sbin/rbldnsd -n -srbldnsd.stats -r/var/lib/rbldnsd -f -n -b
> 66.123.123.106/53 uri.example.com:dnset:urilist
>
> My urilist file looks like this:
>
> :127.0.0.2:Blocked System: http://example.com/bl?$
> $NS 1w uri.example.com
> $SOA 1w uri.example.com admin.uri.example.com 0 2h 2h 1w 1h
> @ A 66.123.123.106
> @ MX 10 uri.example.com
> @ TXT "example hostname blocklist"
> 25z5g623wpqpdwis.onion1.to:127.0.0.2:Blocked System, Last-Attack: 1476825181
> 27lelchgcvs2wpm7.3lhjyx1.top:127.0.0.2:Blocked System, Last-Attack: 1476825181
> 27lelchgcvs2wpm7.7jiff71.top:127.0.0.2:Blocked System, Last-Attack: 1476825181
>
> Using the following (and variations, including dig +short) fail with NXDOMAIN
> # host 25z5g623wpqpdwis.onion1.to.uri.example.com 66.123.123.106
>
> Can someone show me an example zone file using the dnset option?
>
> I'm guessing my first attempt at this message being received by the
> list was due to the domain samples I've included, so they've been
> modified.
>
> Any ideas greatly appreciated.
> Thanks,
> Alex


rbldnsd is still suitable for this, as the DNS lookups are fundamentally 
just mapping strings to IPs. Getting too deep into it is outside SA's 
scope, but the only real difference between an IP rbl and a domain rbl 
is that IP rbls tend to reverse the IP so the most significant octet is 
the most significant subdomain.


On the rules side of things there are multiple ways to write uri 
rules that match against a DNS lookup. Some of them are looking for 
NXDOMAIN vs anything else, some of them can look for particular IPs, 
etc. Just look for the existing RBL that's most similar to what you are 
looking to create.