Doc Bug: Trusted_networks versus internal

2018-03-15 Thread Dan Mahoney (Gushi)

Hey there,

I'm seeing conflicting information about what 
trusted_networks/internal_networks mean.


One of $dayjob's emails tripped our internal SpamAssassin, which was 
scanning outbound mail as well.  Apparently we used a URL in our mail 
(talking about a security issue) and sent URIBL into a frenzy, so the 
message still got flagged as spam:


spamd: result: Y 9 - 
ALL_TRUSTED,T_RP_MATCHES_RCVD,URIBL_ABUSE_SURBL,URIBL_BLACK,URIBL_DBL_SPAM,URIBL_PH_SURBL,URIBL_SBL,URIBL_SBL_A


We're fixing this so that we no longer scan our outbound mail, since we work 
in security and occasionally need to send mail that looks spammy.
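
As a stopgap while that change lands (and purely as an illustration, not our 
actual config), my understanding is that if the Shortcircuit plugin is 
enabled in a .pre file, local.cf can skip the rest of the scan for mail from 
trusted relays -- though of course that only helps once trusted_networks 
itself is right:

    # in v310.pre (or another .pre file): enable the plugin
    loadplugin Mail::SpamAssassin::Plugin::Shortcircuit

    # in local.cf: stop evaluating further rules once ALL_TRUSTED fires
    shortcircuit ALL_TRUSTED on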


Here's my problem with the somewhat unclear docs:

One Apache page 
(https://wiki.apache.org/spamassassin/Rules/ALL_TRUSTED?action=show=ALL_TRUSTED) 
says:

"Trusted" does not mean "trusted to not send spam." It means "trusted to 
not forge Received: headers."


And another page (https://wiki.apache.org/spamassassin/TrustPath) says:

"Note that it doesn't matter if the server relays spam to you from other 
hosts; that still means you trust the server not to originate spam, which 
is what 'trusted_networks' specifies."
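
To be concrete about which settings are in play, this is the sort of thing 
both pages are describing (addresses below are documentation placeholders, 
not ours):

    # local.cf
    # Relays whose Received: headers we believe are not forged
    trusted_networks   192.0.2.0/24 198.51.100.25

    # The subset of trusted hosts that are our own MXes and internal relays;
    # must fall within trusted_networks, and defaults to it if unset
    internal_networks  192.0.2.0/24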


Could someone who really understands the internals fix one of these? 
The two pages directly contradict each other.


--

Dan Mahoney
Techie,  Sysadmin,  WebGeek
Gushi on efnet/undernet IRC
FB:  fb.com/DanielMahoneyIV
LI:   linkedin.com/in/gushi
Site:  http://www.gushi.org
---



Re: The "goo.gl" shortner is OUT OF CONTROL (+ invaluement's response)

2018-03-15 Thread Rob McEwen

On 3/15/2018 11:13 AM, sha...@shanew.net wrote:

You might take a look at
https://developers.google.com/url-shortener/v1/getting_started

1 million requests per day is the default limit.



Excellent! Thanks for the suggestion. This should help me MUCH!

But, unfortunately, it still leaves a lot to be desired for fixing the 
problems I described... when it comes to widely distributed automated 
plugins, or spam filters doing this automatically in a "set it and forget 
it" mode.


Still, there is a good argument that if someone has a high enough volume 
to trigger rate limiting... then they ought to have a large enough staff 
to go the extra mile and set up API access, etc. But I wish it could be 
simpler. MANY are going to struggle with this... and NOT easily know all 
the details discussed in this thread!


But again, this should help me (and others) a lot... and it is good to 
know that there is a proper way to do this at higher volume that meets 
Google's approval.


--
Rob McEwen
https://www.invaluement.com



Re: The "goo.gl" shortner is OUT OF CONTROL (+ invaluement's response)

2018-03-15 Thread shanew

You might take a look at
https://developers.google.com/url-shortener/v1/getting_started

1 million requests per day is the default limit.
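
For what it's worth, a keyed expansion lookup against that API amounts to
roughly the following (an untested sketch based on that page; the API key
and the short link are placeholders):

    import requests  # assumes the third-party 'requests' library is installed

    API = "https://www.googleapis.com/urlshortener/v1/url"
    KEY = "YOUR_API_KEY"  # placeholder; issued via the Google API console

    def expand(short_url):
        """Ask the URL Shortener API where a goo.gl link points,
        without ever visiting the destination."""
        resp = requests.get(API, params={"shortUrl": short_url, "key": KEY},
                            timeout=10)
        resp.raise_for_status()
        data = resp.json()
        # 'status' may also indicate Google has flagged or removed the link
        return data.get("longUrl"), data.get("status")

    print(expand("https://goo.gl/example"))  # placeholder short link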

On Wed, 14 Mar 2018, Rob McEwen wrote:


On 2/20/2018 9:42 PM, Rob McEwen wrote:
  Google might easily start putting captchas in the way or
  otherwise consider such lookups to be abusive and/or mistake
  them for malicious bots...

This prediction turned out to be 100% true. Even though others have
mentioned that they have been able to do high-volume lookups with no
problems... And granted I wasn't implementing a multi-server or multi-IP
lookup strategy... But I don't think I was doing nearly as many lookups as
others have claimed that they were able to do. I took a batch of 55,000
spams that I had collected from the past 4 weeks where those spams were
maliciously using the Google shortener as a way to get their spam delivered
by hiding their spammy domain names from spam filters. I started checking
those by looking up the redirect from Google's redirector, but without
actually visiting the site that the redirector was pointing to. Please note
that I was doing the lookups one-at-a-time, not starting the next lookup
until the last lookup had completed. After about ONLY 1,400 lookups, ALL of
my following lookups started hitting captchas. See attached screenshot.
Also, aside from not sending from multiple IPs, I was doing everything I
could to make my script look and act like a regular browser.
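
For anyone who wants to reproduce this, the core of that kind of lookup is
roughly the following -- an illustrative Python sketch, not my actual script:

    import requests  # assumes the third-party 'requests' library

    def lookup_redirect(short_url):
        """Fetch only the shortener's own response; do NOT follow the redirect,
        so the (possibly malicious) destination is never visited."""
        resp = requests.get(short_url, allow_redirects=False, timeout=10,
                            headers={"User-Agent": "Mozilla/5.0"})  # browser-like UA
        # A normal shortener reply is a 301/302 with the target in Location
        if resp.status_code in (301, 302):
            return resp.headers.get("Location")
        return None  # anything else (captcha page, error) needs separate handling

    print(lookup_redirect("https://goo.gl/example"))  # placeholder short link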

I'll try spreading it out across multiple IPs to try to avoid the rate
limits... However... this is still cause for concern about high-volume
lookups in large production systems... those may have to be implemented a
little more carefully if they're going to do these kinds of lookups!

Just because small or medium production systems are able to do this... or
just because somebody went out of their way to build something more
sophisticated to make it work... doesn't mean it's going to work in large
production systems that are trying to use "canned" software or plugins. This
is a particular challenge for anti-spam blacklists because they typically
process a very high volume of spams. Hopefully, the ones I process as they
come in will arrive randomly enough, and be spread out enough, to avoid the
rate limiting?
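
If not, one simple fallback is pacing the lookups and backing off whenever a
reply stops looking like a plain redirect -- again only a sketch, building on
the lookup function above:

    import random
    import time

    def paced_lookups(short_urls, base_delay=2.0, max_delay=300.0):
        """One-at-a-time lookups with jittered spacing, backing off sharply
        whenever a response no longer looks like a plain redirect."""
        results = {}
        delay = base_delay
        for url in short_urls:
            target = lookup_redirect(url)  # from the sketch above
            if target is None:
                delay = min(delay * 2, max_delay)  # possible captcha: slow down
            else:
                results[url] = target
                delay = base_delay             # healthy reply: resume normal pace
            time.sleep(delay + random.uniform(0, 1))  # jitter the spacing
        return results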

It was my hope to start processing these live with my own DNSBL engine, so
that I could start blacklisting the domains they redirect to, in those cases
where they were not already blacklisted... Now I'm going to have to
constantly watch that I'm not hitting this captcha, and implement some other
strategies to try to prevent that.

But this brings up a whole other issue... one that is more of a policy or
legal question... is Google basically making a statement that automated
lookups are not welcome, or are considered abusive?

(btw, I could have collected orders of magnitude more than 55,000 of THESE
types of spams, but this was merely what was left over in an after-the-fact
search of my archives, after a lot of otherwise redundant spams had already
been purged from my system.)

PS - Once I gather this information, I will submit more details about the
results of this testing. But what is shocking right now is that less than
four tenths of 1% of these redirect URLs have been terminated, even though
they are, on average, two weeks old, with some almost a month old.




--
Public key #7BBC68D9 at| Shane Williams
http://pgp.mit.edu/|  System Admin - UT CompSci
=--+---
All syllogisms contain three lines |  sha...@shanew.net
Therefore this is not a syllogism  | www.ischool.utexas.edu/~shanew