Re: [dns-operations] rate-limiting state

2014-02-22 Thread Paul Vixie
sorry for the delay in getting back to this thread. i know damian raised
some important points.

Damian Menscher wrote:
> On Thu, Feb 6, 2014 at 4:46 PM, Paul Vixie wrote:
>
> Damian Menscher wrote:
> > ...
> > My recommendation (which Vixie and Vernon disagree with) is to use RRL
> > with slip=1 -- return TC=1 responses to all queries over the limit.
>
> my disagreement is explained in detail here:
> 
> http://www.circleid.com/posts/20130913_on_the_time_value_of_security_features_in_dns/
>
>
> Since I haven't explained my objections before, I'll pick apart your
> arguments:
>
> 1) "RRL must be attenuative in packets per second, not just in bits
> per second".  The attacker is using DNS amplification specifically to
> increase bits/second.

no. there are two classes of attackers: those who understand and
innovate, vs. those who just follow well trodden paths. the attackers we
mostly see are of the second variety, but our defenses must take account
of both varieties.

>  If they wanted to amplify packets/second they could just spoof syn
> packets to webservers.

and they will, when we force them to, which is my goal in the DNS RRL
work. forcing the attacker to adopt a more complex technique is not a
pure win but i'll take it.

>  Returning to a 1:1 ratio should be our goal, and slip=1 achieves that.

my goal is as stated, attenuation of both packets and bits, for the
reasons i've stated. if the attacker is willing to accept 1:1 then they
can forge packets directly to the victim. i want to encourage that, by
being a worse alternative for them than reflecting through my server. i
won't try to talk you out of your chosen goal, so long as you clearly
state it when making recommendations in keeping with it.
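
To make the packets-versus-bits distinction concrete, here is a rough sketch of how the over-limit response ratio changes with the slip setting. It assumes slip=n means one truncated (TC=1) response per n over-limit queries, with the rest dropped. The byte sizes are illustrative assumptions rather than measurements, and `slip_ratios` is a hypothetical helper, not part of any name server.

```python
from fractions import Fraction


def slip_ratios(slip, query_bytes=64, tc_bytes=64):
    """Return (packet_ratio, byte_ratio) of responses to over-limit queries.

    Assumes slip=n sends one TC=1 response per n over-limit queries and
    drops the others; slip=0 drops everything. The default sizes are
    made-up placeholders (a TC=1 reply is assumed to be query-sized).
    """
    if slip == 0:
        return Fraction(0), Fraction(0)
    packet_ratio = Fraction(1, slip)
    byte_ratio = packet_ratio * Fraction(tc_bytes, query_bytes)
    return packet_ratio, byte_ratio


for slip in (1, 2, 5):
    packets, octets = slip_ratios(slip)
    print(f"slip={slip}: packets {float(packets):.2f}:1, bytes {float(octets):.2f}:1")
```

Under these assumptions slip=1 reflects at 1:1 in both packets and bytes, while any larger slip attenuates both, which is the difference between the two goals being argued here.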

>
> 2) "A pure TCP fallback strategy would be less reliable due to the
> fragility of TCP/DNS".  You go on to argue that the 3-way handshake
> adds latency and server load, which I agree with.  But keep in mind
> only the legitimate queries will need to use TCP, so the actual load
> is low.

no. actual query load as witnessed on dns servers i have operated even
15+ years ago was not sustainable via tcp due to state load.

>  And these are queries which would otherwise have had to retry over
> UDP after a timeout (and even then only have a 50% success rate), so
> the amortized latency hit isn't particularly significant either.

anyone on the internet can exhaust the tcp listener quota of any dns
server they target, thus ensuring that tcp fallback temporarily fails
for other victims whom they are simultaneously trying to starve via an
RRL flow overrun. that's what i mean by "tcp fragility". any design that
calls for tcp fallback of dns is by definition too fragile to be used in
production. (that's why i criticized nominum's answer to the kaminsky
attacks back in 2008, too.)

>
> 3) [Addressing the increased poisoning risk], "requires many hours of
> uninterrupted 100 Mbit/sec blasting from the attacker to the victim in
> order to have a chance at success".  I don't worry about 100Mbps attacks,

you work at google. in my world, a 100Mbit/sec attack is noticeable. in
your world, it doesn't even raise an eyebrow.

> but in the age of 10Gbps (unamplified) attacks, I think this does
> introduce a non-negligible (and unnecessary!) risk for high-value
> domains.  Keep in mind a single poison packet can inject a high TTL to
> cause a long outage, and potentially use that time to steal
> unencrypted data (SMTP, for example).  Why take that risk just to
> reduce the amplification factor *below* 1:1?

there may be a marked difference in our perspective. i went along with
bernstein-style source port randomization as a temporary work around to
the kaminsky bug back in 2008, because we had to have something, and it
was something. the real fix, as i said then, is dnssec. other real
fixes, like eastlake-style cookies, or several proposals i'm aware of
which haven't been published as yet, might also come. but in no case did
i sign up for, nor will i accept an indefinite future where, cache
poisoning remains feasible using sustained flows of 100Mbit/sec for five
to fifteen minutes.

that means the risk which you claim is non-negligible and unnecessary,
i claim is both negligible and a sunk cost. so, i won't spend new manna
on it. especially if to get additional traction on it i would have to
accept a defense strategy for reflection that made me no less attractive
than sending spoofed packets directly to the victims.

i think my article covers pretty well the topic of why reflection is a
separate boon for attackers, over and above amplification.

see also the followup acm queue article at
.

>
> > This ensures your legitimate users can get through with a TCP request,
> > rather than having to attempt multiple retries before learning to
> > retry over TCP.  Does slip=1 address your concerns?
> >
> > Of c

Re: [dns-operations] rate-limiting state

2014-02-07 Thread Paul Vixie


Colm MacCárthaigh wrote:
>
> On Fri, Feb 7, 2014 at 9:35 AM, Paul Vixie wrote:
>
> Colm MacCárthaigh wrote:
>
> > Now if I have a botnet or client that can generate 1M PPS (this is
> > small, but adjust to any number), I can try to spoof 66,666 popular
> > resolvers (this is a knowable set) at 5 QPS each to 3 auth servers, I
> > can use RRL to degrade service in a more widespread way.
> >
> > Now, let's say you have the capacity to answer these queries (which is
> > realistic for some) which behavior is better for your users? Just
> > answering the responses? Or rate-limiting the responses?
>
> Rate limiting is always better, given that recursive servers will retry,
> will act on TC=1, and will stop asking once they cache the result.
>
>
> Just to be clear; you're saying it's better, for your legitimate
> users, only to answer their queries probabilistically? I agree that
> they'll retry, and with 2 retries they'll even get a TC=1 87.5% of the
> time. But I have to consider that some kind of degradation of service.
> For the 12% who got no answer, do you consider that some kind of
> degradation?

since you've asked for clarity, let me provide it as follows.

for the case of an attack against the name server itself, not a
reflection victim via ddos but a pool of response-starvation victims via
a logic attack on RRL itself, it is theoretically worse for the
response-starvation victims to have RRL deployed. i say "theoretically"
because the impact would be (a) exceedingly brief, (b) exceedingly
narrow, (c) not user-visible, and must be (d) exquisitely and
expensively well targeted. in this one corner case, my statement "always
better" is wrong.

for the case of an attack against a reflection victim via name server
reflected DDoS, the impact on response-starvation victims due to the RRL
logic, will be no worse than the impact of not having RRL, and if the
attack is large, it will be better with RRL than without. therefore my
statement "always better" should have been written "never worse" and i
apologize.

> ... If I use RRL, my user queries can be degraded. And that is user
> visible, including to stubs, even with caching. If you cause a caching
> resolver to delay or timeout lookups that does hold up and impact stubs.

if any delay at all is to you "user visible" even though once cached the
same delay won't reoccur within a DNS TTL interval, then so be it. DNS
RRL is a DNS-specific rate limiter which relies on retries, TC=1
behaviour, and caching -- by design, mind you -- for its success. i'm
not merely splitting hairs here -- in my own testing, the only way i
could cause a stub lookup failure, noting that a stub tries multiple
recursive servers and will retry to each, was to send enough attack
traffic toward all of that stub's recursives to also cause failures on
unrelated names. in that latter case, it made no difference whether RRL
was off or on, because the attack was on the recursive name server's
resources, not the RRL logic. so, if you know a way to reliably cause
targeted stub query failures using an attack that only works with RRL
turned on, i'd like to see your demonstration.
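
The three behaviours RRL leans on (retry after a dropped response, retry over TCP on TC=1, and caching) can be sketched with a toy resolver. This is a simplified model under stated assumptions, not the behaviour of any real resolver: `TinyResolver` and its outcome strings are hypothetical, the TCP retry is assumed to always succeed, and time is passed in explicitly.

```python
class TinyResolver:
    """Toy recursive resolver: retries on drop, uses TCP on TC=1, caches."""

    def __init__(self, max_attempts=3):
        self.max_attempts = max_attempts
        self.cache = {}  # qname -> expiry time of the cached answer

    def lookup(self, qname, udp_outcomes, now, ttl=300):
        """udp_outcomes lists what the server does with each UDP attempt:
        "answer" (full reply), "tc" (truncated reply) or "drop" (nothing).
        Returns (how the answer was obtained, UDP attempts used)."""
        expiry = self.cache.get(qname)
        if expiry is not None and expiry > now:
            return "cache", 0  # no query leaves the resolver at all
        attempts = list(udp_outcomes)[: self.max_attempts]
        for attempt, outcome in enumerate(attempts, start=1):
            if outcome == "drop":
                continue  # timed out; try again
            # A full answer arrives over UDP; TC=1 triggers a TCP retry,
            # which this sketch assumes succeeds.
            self.cache[qname] = now + ttl
            return ("udp" if outcome == "answer" else "tcp"), attempt
        return "fail", len(attempts)


resolver = TinyResolver()
print(resolver.lookup("www.example", ["drop", "tc"], now=0))
print(resolver.lookup("www.example", ["drop", "drop", "drop"], now=10))
```

In this model the second lookup never reaches the rate-limited server, which is the caching effect described above: the delay is paid at most once per TTL interval.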

vixie

___
dns-operations mailing list
dns-operations@lists.dns-oarc.net
https://lists.dns-oarc.net/mailman/listinfo/dns-operations
dns-jobs mailing list
https://lists.dns-oarc.net/mailman/listinfo/dns-jobs

Re: [dns-operations] rate-limiting state

2014-02-07 Thread Vernon Schryver
> From: Colm MacCárthaigh


> Both can cause collateral damage, but to different targets. RRL does reduce
> the collateral damage to a reflection target. But it also increases the
> collateral damage to your legitimate users. If an attacker spoofs a popular
> resolver, then for just 5-10 queries per second they cause a degradation in
> service to the real legitimate users of that resolver. With the default
> settings, 12.5% of queries from that resolver may not get any answer at
> all, even with three attempts, and the lookup time is increased by about
> 1.3 RTTs on average.  With the resolver trying 3 authoritative nameservers,
> the availability hit diminishes to about 0.2% (which brings us to two
> nines), but the RTT hit gets worse.

Great!  Colm has finally read a little.  He still hasn't bothered
to understand.

> Now if I have a botnet or client that can generate 1M PPS (this is small,
> but adjust to any number), I can try to spoof 66,666 popular resolvers
> (this is a knowable set) at 5 QPS each to 3 auth servers, I can use RRL to
> degrade service in a more widespread way.

nonsense.

  - there are a lot more than 66K open resolvers, almost all of
 which should be closed, but there are not 66K popular resolvers
 in the sense that Google, OpenDNS, Comcast, &co are popular.

  - My (or someone else's) 12.5% number copied above assumes only
 a few retries.  Colm's latest doomsday scenario would cause
 more than only a few retries, which would drive that 12.5% way
 down.  The neat thing about the probability of failure in
 schemes like this is that it goes as (a/b)**r where a [...]

> My overall point is that with RRL there is some trade-off between
> protecting innocent reflection victims and opening yourself to an attack
> that degrade service to your real users in some way. 

That is always true about any defense.  The resources including
bandwidth, CPU cycles, and human sweat devoted to RRL necessarily
reduce what can be spent for the service itself.  RRL has a remarkably
light footprint as such things go, but while Colm's specific
complaints are trolling nonsense, RRL's footprint is non-zero.  RRL
should not be used in networks that do not have DNS reflection issues.


>  Were RRL to be widely
> deployed, attacks could shift to table-exhaustion and popular-resolver
> spoofing and be effective in different ways.

RRL is long since widely deployed, the bad guys still haven't switched 
in that way.

That bit about "table exhaustion" is trolling based on the false
assumption that RRL is a naive firewall rate limit that uses a naive
firewall ACL (naive in this context but not necessarily other
firewall contexts).

There is a serious, ultimately fatal problem with RRL that has been
discussed repeatedly here and elsewhere, but it's unrelated to Colm's
"collateral damage".


Vernon Schryver    v...@rhyolite.com


Re: [dns-operations] rate-limiting state

2014-02-07 Thread Colm MacCárthaigh
On Fri, Feb 7, 2014 at 9:35 AM, Paul Vixie  wrote:

> Colm MacCárthaigh wrote:
> > ... But [RRL] also increases the collateral damage to your legitimate
> > users. If an attacker spoofs a popular resolver, then for just 5-10
> > queries per second they cause a degradation in service to the real
> > legitimate users of that resolver. With the default settings, 12.5% of
> > queries from that resolver may not get any answer at all, even with
> > three attempts, and the lookup time is increased by about 1.3 RTTs on
> > average.  With the resolver trying 3 authoritative nameservers, the
> > availability hit diminishes to about 0.2% (which brings us to two
> > nines), but the RTT hit gets worse.
>
> You're calling it damage for some reason, even though it's not
> stub-visible. RRL does of course rely on retries and TC=1 and other
> known recursive-dns behaviour. That's because RRL is a protocol-specific
> method. A non-DNS protocol would probably call for a different method.
> If we take your 30% average RTT impact to heart, we're moving the needle
> for a stub transaction from 20ms to 25ms. I'm unwilling to call that
> damage, collateral or otherwise.
>

I got the RTT impact wrong by forgetting that TCP would itself take extra
round-trips. It's closer to 2.5x if I account for that. If you have a 20ms
average RTT as a base-case, you're probably using anycast at a bunch of
datacenters and should deploy more sophisticated techniques. I'm more
worried about RRL as the default for the small implementors who are in a
relatively small number of locations.

> > Now if I have a botnet or client that can generate 1M PPS (this is
> > small, but adjust to any number), I can try to spoof 66,666 popular
> > resolvers (this is a knowable set) at 5 QPS each to 3 auth servers, I
> > can use RRL to degrade service in a more widespread way.
> >
> > Now, let's say you have the capacity to answer these queries (which is
> > realistic for some) which behavior is better for your users? Just
> > answering the responses? Or rate-limiting the responses?
>
> Rate limiting is always better, given that recursive servers will retry,
> will act on TC=1, and will stop asking once they cache the result.
>

Just to be clear; you're saying it's better, for your legitimate users,
only to answer their queries probabilistically? I agree that they'll retry,
and with 2 retries they'll even get a TC=1 87.5% of the time. But I have to
consider that some kind of degradation of service. For the 12% who got no
answer, do you consider that some kind of degradation?


> > My overall point is that with RRL there is some trade-off between
> > protecting innocent reflection victims and opening yourself to an
> > attack that degrade service to your real users in some way. Were RRL
> > to be widely deployed, attacks could shift to table-exhaustion and
> > popular-resolver spoofing and be effective in different ways.
>
> There is no operable trade-off of the kind you're proposing. RRL makes
> everyone's life better except the attackers, in all cases. The "degrade"
> you're describing is far better than the non-RRL case, and is in any
> case not user-visible.


If I answer all of the responses, that is 100% non user-visible. All of my
user-facing queries get answers. If I use RRL, my user queries can be
degraded. And that is user visible, including to stubs, even with caching.
If you cause a caching resolver to delay or timeout lookups that does hold
up and impact stubs.



> Are you criticizing RRL for using the known
> behaviour of recursive servers (retrying, respecting TC=1, ceasing to
> ask once an answer is obtained) deliberately to increase resiliency?
>

No, I think that's smart. Responding with TC=1 all of the time would make
me a little more comfortable with the impact, though like you I would then
question its efficacy for reflections.

-- 
Colm

Re: [dns-operations] rate-limiting state

2014-02-07 Thread Paul Vixie


Colm MacCárthaigh wrote:
> ... But [RRL] also increases the collateral damage to your legitimate
> users. If an attacker spoofs a popular resolver, then for just 5-10
> queries per second they cause a degradation in service to the real
> legitimate users of that resolver. With the default settings, 12.5% of
> queries from that resolver may not get any answer at all, even with
> three attempts, and the lookup time is increased by about 1.3 RTTs on
> average.  With the resolver trying 3 authoritative nameservers, the
> availability hit diminishes to about 0.2% (which brings us to two
> nines), but the RTT hit gets worse.

You're calling it damage for some reason, even though it's not
stub-visible. RRL does of course rely on retries and TC=1 and other
known recursive-dns behaviour. That's because RRL is a protocol-specific
method. A non-DNS protocol would probably call for a different method.
If we take your 30% average RTT impact to heart, we're moving the needle
for a stub transaction from 20ms to 25ms. I'm unwilling to call that
damage, collateral or otherwise.

> Now if I have a botnet or client that can generate 1M PPS (this is
> small, but adjust to any number), I can try to spoof 66,666 popular
> resolvers (this is a knowable set) at 5 QPS each to 3 auth servers, I
> can use RRL to degrade service in a more widespread way. 
>
> Now, let's say you have the capacity to answer these queries (which is
> realistic for some) which behavior is better for your users? Just
> answering the responses? Or rate-limiting the responses?

Rate limiting is always better, given that recursive servers will retry,
will act on TC=1, and will stop asking once they cache the result.

> My overall point is that with RRL there is some trade-off between
> protecting innocent reflection victims and opening yourself to an
> attack that degrade service to your real users in some way. Were RRL
> to be widely deployed, attacks could shift to table-exhaustion and
> popular-resolver spoofing and be effective in different ways.

There is no operable trade-off of the kind you're proposing. RRL makes
everyone's life better except the attackers, in all cases. The "degrade"
you're describing is far better than the non-RRL case, and is in any
case not user-visible. Are you criticizing RRL for using the known
behaviour of recursive servers (retrying, respecting TC=1, ceasing to
ask once an answer is obtained) deliberately to increase resiliency?

Separately, I dispute your implication that there's a table-exhaustion
condition that can be hit. The design of RRL takes table size into
account. I am, as before, ready to evaluate your experimental results if
you can show otherwise.

Vixie

Re: [dns-operations] rate-limiting state(Internet mail)

2014-02-07 Thread Paul Vixie


samwu(吴洪声) wrote:
> in DNSPod, we respond to the user with a random cname like
> afda7896.dnspod.com to prevent DNS query floods and avoid the TCP issue.

this approach changes the meaning of the dns result, such that the qname
is now an alias. some cname-aware protocols like smtp and http will
behave differently when you insert a cname chain like this. that's a
cost i consider to be too high, even for ddos mitigation.

vixie

Re: [dns-operations] rate-limiting state(Internet mail)

2014-02-07 Thread 吴洪声
in DNSPod, we respond to the user with a random cname like afda7896.dnspod.com
to prevent DNS query floods and avoid the TCP issue.

Kind regards
--
DNSPod - make your domain intelligent

Sam Wu
Founder & CEO

E-Mail: sa...@dnspod.com
http://www.dnspod.cn
地址: 山东省烟台市开发区长江路28号华新国际大厦1210室 264006
Addr: #1210 HuaXin Intl. Bldg., #28 ChangJiang Rd., Development Dist., YanTai 
City, ShanDong Prov., China

From: Colm MacCárthaigh <c...@stdlib.net>
Date: February 8, 2014 at 1:03:03 AM
To: Tony Finch <d...@dotat.at>
Subject: Re: [dns-operations] rate-limiting state(Internet mail)
On Fri, Feb 7, 2014 at 6:16 AM, Tony Finch <d...@dotat.at> wrote:
What not just the victim? In the absence of RRL the DDoS attack is likely
to cause collateral damage, yes. In the presence of RRL non-victims are
unaffected as long as the attack isn't overwhelming the name server.

Both can cause collateral damage, but to different targets. RRL does reduce the 
collateral damage to a reflection target. But it also increases the collateral 
damage to your legitimate users. If an attacker spoofs a popular resolver, then 
for just 5-10 queries per second they cause a degradation in service to the 
real legitimate users of that resolver. With the default settings, 12.5% of 
queries from that resolver may not get any answer at all, even with three 
attempts, and the lookup time is increased by about 1.3 RTTs on average.  With 
the resolver trying 3 authoritative nameservers, the availability hit 
diminishes to about 0.2% (which brings us to two nines), but the RTT hit gets 
worse.

Now if I have a botnet or client that can generate 1M PPS (this is small, but 
adjust to any number), I can try to spoof 66,666 popular resolvers (this is a 
knowable set) at 5 QPS each to 3 auth servers, I can use RRL to degrade service 
in a more widespread way.

Now, let's say you have the capacity to answer these queries (which is 
realistic for some) which behavior is better for your users? Just answering the 
responses? Or rate-limiting the responses?

My overall point is that with RRL there is some trade-off between protecting 
innocent reflection victims and opening yourself to an attack that degrade 
service to your real users in some way. Were RRL to be widely deployed, attacks 
could shift to table-exhaustion and popular-resolver spoofing and be effective 
in different ways.

--
Colm

Re: [dns-operations] rate-limiting state

2014-02-07 Thread Colm MacCárthaigh
On Fri, Feb 7, 2014 at 6:16 AM, Tony Finch  wrote:

> What not just the victim? In the absence of RRL the DDoS attack is likely
> to cause collateral damage, yes. In the presence of RRL non-victims are
> unaffected as long as the attack isn't overwhelming the name server.
>

Both can cause collateral damage, but to different targets. RRL does reduce
the collateral damage to a reflection target. But it also increases the
collateral damage to your legitimate users. If an attacker spoofs a popular
resolver, then for just 5-10 queries per second they cause a degradation in
service to the real legitimate users of that resolver. With the default
settings, 12.5% of queries from that resolver may not get any answer at
all, even with three attempts, and the lookup time is increased by about
1.3 RTTs on average.  With the resolver trying 3 authoritative nameservers,
the availability hit diminishes to about 0.2% (which brings us to two
nines), but the RTT hit gets worse.
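
The 12.5% and 0.2% figures are consistent with a simple model in which each over-limit response is dropped independently with probability 0.5. That drop rate and the independence of attempts are assumptions made for this sketch, and `no_answer_probability` is a hypothetical helper.

```python
def no_answer_probability(drop_rate, attempts, servers=1):
    """Chance that every attempt to every server goes unanswered.

    Assumes each response is dropped independently with the same
    probability, which is a simplification of real RRL behaviour.
    """
    return (drop_rate ** attempts) ** servers


one_server = no_answer_probability(0.5, attempts=3)
three_servers = no_answer_probability(0.5, attempts=3, servers=3)
print(f"one server, three attempts: {one_server:.1%}")        # 12.5%
print(f"three servers, three attempts each: {three_servers:.2%}")  # 0.20%
```

Each extra attempt multiplies the failure probability by the drop rate again, so the no-answer rate falls off geometrically with retries.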

Now if I have a botnet or client that can generate 1M PPS (this is small,
but adjust to any number), I can try to spoof 66,666 popular resolvers
(this is a knowable set) at 5 QPS each to 3 auth servers, I can use RRL to
degrade service in a more widespread way.
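
The 66,666 figure is just the attacker's packet budget divided across resolvers and servers. A minimal check of that arithmetic, with `spoofable_resolvers` as a hypothetical helper name:

```python
def spoofable_resolvers(attacker_pps, qps_per_resolver, auth_servers):
    """How many resolver addresses can be spoofed at the given per-server
    query rate, spending the whole packet budget evenly."""
    return attacker_pps // (qps_per_resolver * auth_servers)


print(spoofable_resolvers(1_000_000, qps_per_resolver=5, auth_servers=3))  # 66666
```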

Now, let's say you have the capacity to answer these queries (which is
realistic for some) which behavior is better for your users? Just answering
the responses? Or rate-limiting the responses?

My overall point is that with RRL there is some trade-off between
protecting innocent reflection victims and opening yourself to an attack
that degrade service to your real users in some way. Were RRL to be widely
deployed, attacks could shift to table-exhaustion and popular-resolver
spoofing and be effective in different ways.

-- 
Colm

Re: [dns-operations] rate-limiting state

2014-02-07 Thread David C Lawrence
Tony Finch writes:
> At that point the name server itself is the victim, and there isn't
> anything it can do about the attack - DDoS mitigation has to happen well
> upstream of the victim.

Well, it's *a* victim, if not the intended target.  As someone who
runs servers behind a small pipe (and recently had the pipe collapse
thanks to an NTP reflection targeted at someone else) I definitely
agree with you.

As a supporter of RRL, I'll point out that even with overwhelming
inbound attack traffic RRL will still help so "isn't anything it can
do about the attack" is too bleak.


Re: [dns-operations] rate-limiting state

2014-02-07 Thread Patrick W. Gilmore
On Feb 7, 2014, at 9:56, Tony Finch  wrote:
> David C Lawrence  wrote:
>> 
>> Maybe Patrick glossed over the mere "1000 qps", which for many (most?
>> hand-waving) operators doesn't even blip as an attack.  At the
>> attack-level traffic to which he is accustomed, the inbound requests
>> can easily surpass the server's ability to generate responses even if
>> it ends up not sending most of them.
> 
> At that point the name server itself is the victim, and there isn't
> anything it can do about the attack - DDoS mitigation has to happen well
> upstream of the victim.
> 
> I picked 1000pps because it is enough to trigger RRL without killing the
> server.

Yeah, I missed the 1K number. Was thinking 10M which was discussed before.

I agree with David, 1K qps, while enough to trigger RRL, really wouldn't hurt 
anyone or anything else, so hardly worth talking about.

Sorry for my confusion and resulting noise on the list.

-- 
TTFN,
patrick



Re: [dns-operations] rate-limiting state

2014-02-07 Thread Tony Finch
David C Lawrence  wrote:
>
> Maybe Patrick glossed over the mere "1000 qps", which for many (most?
> hand-waving) operators doesn't even blip as an attack.  At the
> attack-level traffic to which he is accustomed, the inbound requests
> can easily surpass the server's ability to generate responses even if
> it ends up not sending most of them.

At that point the name server itself is the victim, and there isn't
anything it can do about the attack - DDoS mitigation has to happen well
upstream of the victim.

I picked 1000pps because it is enough to trigger RRL without killing the
server.

Tony.
-- 
f.anthony.n.finch    http://dotat.at/
Forties, Cromarty: East, veering southeast, 4 or 5, occasionally 6 at first.
Rough, becoming slight or moderate. Showers, rain at first. Moderate or good,
occasionally poor at first.


Re: [dns-operations] rate-limiting state

2014-02-07 Thread David C Lawrence
Tony Finch writes:
> Patrick W. Gilmore  wrote:
> > On Feb 07, 2014, at 07:09 , Tony Finch  wrote:
> > > If my busy name server is getting 1000 qps of real traffic from all over
> > > the net, and 1000 qps of attack traffic "from" some victim, then RRL will
> > > attenuate responses to the victim without affecting other users.
> > >
> > > In the absence of RRL, the victim will be denied service by overwhelming
> > > traffic. In the presence of RRL the victim might have slightly slower DNS
> > > resolution.
> >
> > Not just the victim.
> 
> What not just the victim? In the absence of RRL the DDoS attack is likely
> to cause collateral damage, yes. In the presence of RRL non-victims are
> unaffected as long as the attack isn't overwhelming the name server.

Maybe Patrick glossed over the mere "1000 qps", which for many (most?
hand-waving) operators doesn't even blip as an attack.  At the
attack-level traffic to which he is accustomed, the inbound requests
can easily surpass the server's ability to generate responses even if
it ends up not sending most of them.



Re: [dns-operations] rate-limiting state

2014-02-07 Thread Patrick W. Gilmore
On Feb 7, 2014, at 9:16, Tony Finch  wrote:
> Patrick W. Gilmore  wrote:
>>> On Feb 07, 2014, at 07:09 , Tony Finch  wrote:
>>> 
>>> If my busy name server is getting 1000 qps of real traffic from all over
>>> the net, and 1000 qps of attack traffic "from" some victim, then RRL will
>>> attenuate responses to the victim without affecting other users.
>>> 
>>> In the absence of RRL, the victim will be denied service by overwhelming
>>> traffic. In the presence of RRL the victim might have slightly slower DNS
>>> resolution.
>> 
>> Not just the victim.
> 
> What not just the victim? In the absence of RRL the DDoS attack is likely
> to cause collateral damage, yes. In the presence of RRL non-victims are
> unaffected as long as the attack isn't overwhelming the name server.

You said: "In the absence of RRL, the victim will be denied service by 
overwhelming traffic."

I was saying more than the victim would be hurt in the absence of RRL. The 
other users of the amp server very likely would be affected through resource 
exhaustion. Users between the amp & victim as the amp attack makes its way 
through the Internet. Etc., etc.

My guess is you agree with those statements. Sorry if this wasn't clear 
originally.

-- 
TTFN,
patrick



Re: [dns-operations] rate-limiting state

2014-02-07 Thread Tony Finch
Patrick W. Gilmore  wrote:
> On Feb 07, 2014, at 07:09 , Tony Finch  wrote:
> >
> > If my busy name server is getting 1000 qps of real traffic from all over
> > the net, and 1000 qps of attack traffic "from" some victim, then RRL will
> > attenuate responses to the victim without affecting other users.
> >
> > In the absence of RRL, the victim will be denied service by overwhelming
> > traffic. In the presence of RRL the victim might have slightly slower DNS
> > resolution.
>
> Not just the victim.

What not just the victim? In the absence of RRL the DDoS attack is likely
to cause collateral damage, yes. In the presence of RRL non-victims are
unaffected as long as the attack isn't overwhelming the name server.
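
The point that RRL is not a blanket limit can be illustrated with a per-client sketch. This is a simplified model, not the accounting any real name server uses: it counts responses per (client /24, response) pair in one-second windows, handles IPv4 only, and the class name, parameters and default values are all assumptions chosen for illustration.

```python
import ipaddress
from collections import defaultdict


class ResponseRateLimiter:
    """Toy per-client response rate limiter with slip (fixed 1 s windows)."""

    def __init__(self, responses_per_second=5, slip=2, prefix_len=24):
        self.rate = responses_per_second
        self.slip = slip
        self.prefix_len = prefix_len
        self.window = None
        self.counts = defaultdict(int)

    def decide(self, client_ip, response_key, now):
        """Return "answer", "tc" (truncated reply) or "drop"."""
        second = int(now)
        if second != self.window:  # new window: forget old counts
            self.window = second
            self.counts.clear()
        # Group clients by network so one spoofed block shares a counter.
        net = ipaddress.ip_network(f"{client_ip}/{self.prefix_len}", strict=False)
        key = (net, response_key)
        self.counts[key] += 1
        over = self.counts[key] - self.rate
        if over <= 0:
            return "answer"
        if self.slip and over % self.slip == 0:
            return "tc"  # every slip-th over-limit query gets TC=1
        return "drop"


limiter = ResponseRateLimiter()
victim = [limiter.decide("203.0.113.7", "example.com/A", now=0.0) for _ in range(9)]
other = limiter.decide("198.51.100.9", "example.com/A", now=0.5)
print(victim)
print(other)
```

Only the counter for the spoofed network goes over the limit; a query from an unrelated network in the same second is still answered normally.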

Tony.
-- 
f.anthony.n.finch    http://dotat.at/
Forties, Cromarty: East, veering southeast, 4 or 5, occasionally 6 at first.
Rough, becoming slight or moderate. Showers, rain at first. Moderate or good,
occasionally poor at first.


Re: [dns-operations] rate-limiting state

2014-02-07 Thread Patrick W. Gilmore
On Feb 07, 2014, at 07:09 , Tony Finch  wrote:
> Colm MacCárthaigh  wrote:

>> I don't see anyone disputing my example, and I'm not calling out RRLs
>> ability to dampen a reflection attack. I'm saying that RRL can be used to
>> counter-attack your users.  Let's say a busy website gets 1,000 QPS of
>> "real" user queries. If I want those queries to survive say with 2 retries,
>> then I need to let through 40% of traffic to have a 95p confidence of them
>> getting an answer. Yes, I'll have mitigated the reflection to 4Gbit/sec,
>> but meanwhile users will be seeing increased resolution times and timeouts.
> 
> You seem to be assuming that RRL is a blanket rate limit. It is not.
> 
> If my busy name server is getting 1000 qps of real traffic from all over
> the net, and 1000 qps of attack traffic "from" some victim, then RRL will
> attenuate responses to the victim without affecting other users.
> 
> In the absence of RRL, the victim will be denied service by overwhelming
> traffic. In the presence of RRL the victim might have slightly slower DNS
> resolution.

Not just the victim.

Let's all agree Colm is a bit confused on both how RRL works and the failure 
modes we are discussing. Then we can go back to arguing about other useless 
stuff instead of arguing about this useless stuff. :)

-- 
TTFN,
patrick



Re: [dns-operations] rate-limiting state

2014-02-07 Thread Tony Finch
Colm MacCárthaigh  wrote:
>
> I don't see anyone disputing my example, and I'm not calling out RRLs
> ability to dampen a reflection attack. I'm saying that RRL can be used to
> counter-attack your users.  Let's say a busy website gets 1,000 QPS of
> "real" user queries. If I want those queries to survive say with 2 retries,
> then I need to let through 40% of traffic to have a 95p confidence of them
> getting an answer. Yes, I'll have mitigated the reflection to 4Gbit/sec,
> but meanwhile users will be seeing increased resolution times and timeouts.

You seem to be assuming that RRL is a blanket rate limit. It is not.

If my busy name server is getting 1000 qps of real traffic from all over
the net, and 1000 qps of attack traffic "from" some victim, then RRL will
attenuate responses to the victim without affecting other users.

In the absence of RRL, the victim will be denied service by overwhelming
traffic. In the presence of RRL the victim might have slightly slower DNS
resolution.
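
A minimal sketch of the scenario above, assuming a fixed per-second cap per /24 netblock for one identical response. The function names, the cap of 10, and the example addresses are illustrative assumptions, not any implementation's actual behaviour:

```python
from collections import Counter
from ipaddress import ip_network


def netblock(addr: str, prefix: int = 24) -> str:
    """Map a source address to its enclosing netblock (a /24 by default)."""
    return str(ip_network(f"{addr}/{prefix}", strict=False))


def answered_per_block(sources, limit):
    """Count full answers per netblock in one second, capped at `limit`.

    `sources` is the list of (possibly forged) source addresses that asked
    the same question during that second.
    """
    seen = Counter(netblock(src) for src in sources)
    return {block: min(count, limit) for block, count in seen.items()}


# 1000 real clients spread over 1000 different /24s, one query each,
# plus 1000 forged queries that all claim to come from one victim address.
legit = [f"10.{i // 256}.{i % 256}.1" for i in range(1000)]
forged = ["192.0.2.53"] * 1000

result = answered_per_block(legit + forged, limit=10)
print("victim block answers:", result[netblock("192.0.2.53")])
print("real clients answered:", sum(result[netblock(a)] for a in legit))
```

Only the netblock that appears to send the flood runs into the cap; every other netblock stays well under it.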

Tony.
-- 
f.anthony.n.finch  http://dotat.at/
Forties, Cromarty: East, veering southeast, 4 or 5, occasionally 6 at first.
Rough, becoming slight or moderate. Showers, rain at first. Moderate or good,
occasionally poor at first.

Re: [dns-operations] rate-limiting state

2014-02-06 Thread Dobbins, Roland

On Feb 7, 2014, at 8:58 AM, Damian Menscher  wrote:

> You go on to argue that the 3-way handshake adds latency and server load, 
> which I agree with.  But keep in mind only the legitimate queries will need 
> to use TCP, so the actual load is low.  And these are queries which would 
> otherwise have had to retry over UDP after a timeout (and even then only have 
> a 50% success rate), so the amortized latency hit isn't particularly 
> significant either.

This is my experience with forcing TC=1 for the initial query from a given 
source (after re-issuance via TC=1, said source is 'authenticated' for some 
configurable period of time) - the latency effects and the server overhead are 
minimal.  
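
The scheme described above can be sketched roughly as follows. This is an illustration only: the class name, the method names, and the 300-second default are assumptions, and a real server would key on more than the bare source address:

```python
import time


class TcpAuthCache:
    """Force TC=1 for unknown UDP sources; trust a source for `ttl_seconds`
    once it has come back over TCP (which a spoofed source cannot do)."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so the logic can be tested
        self.expiry = {}            # source address -> time trust runs out

    def on_udp_query(self, src):
        """Return 'answer' for an authenticated source, else 'truncate'."""
        if self.expiry.get(src, float("-inf")) > self.clock():
            return "answer"
        return "truncate"           # reply with TC=1, client retries via TCP

    def on_tcp_query(self, src):
        """A completed TCP query proves the source address is not forged."""
        self.expiry[src] = self.clock() + self.ttl
        return "answer"


now = [0.0]
cache = TcpAuthCache(ttl_seconds=60, clock=lambda: now[0])
print(cache.on_udp_query("198.51.100.7"))   # unknown source
cache.on_tcp_query("198.51.100.7")
print(cache.on_udp_query("198.51.100.7"))   # within the trust period
now[0] = 61.0
print(cache.on_udp_query("198.51.100.7"))   # trust period has run out
```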

There are two nontrivial problems with forcing TC=1 these days, neither of 
which is related to actual DNS server performance:

1.  Some large-scale DNS operators have incorrectly disabled TCP/53 for 
their authoritative DNS farms due to a combination of the old misinformation 
about TCP/53 being a 'security' risk with regards to AXFR, as well as the 
continuing misperception that TCP/53 overhead is crippling, based upon 
early-1990s server performance specs vs. specs of modern servers.

2.  Incorrect filtering of TCP/53 on endpoint networks (and in some 
intermediary networks) due to the aforementioned AXFR myth.

Tangentially, Geoff Huston gave a preso at NZNOG last week which analyzes the 
crypto-related server overhead of DNSSEC.  It's quite interesting to compare the 
crypto overhead to perceived TCP overhead . . .


---
Roland Dobbins  // 

  Luck is the residue of opportunity and design.

   -- John Milton



Re: [dns-operations] rate-limiting state

2014-02-06 Thread Vernon Schryver
> From: =?ISO-8859-1?Q?Colm_MacC=E1rthaigh?= 

> I chose a fairly typical number, which is actually below average. Arbor's
> data on DDOS puts 10M somewhere between the 40th and 50th percentile.  I'd
> be really surprised if OpenDNS's pipes fill up with that kind of small
> volume.

That seems to assume that the infamous Gbit/sec DNS reflection attacks
involve one or at most a handful of mirrors.  That assumption is wrong.


> > so, third, let's look squarely at "large enough UDP flow to activate RRL".
>
> 10M requests/sec for www.example.com, type=A. Would that be large enough?

10 Mqps is about 1,000,000 times higher than necessary to trigger DNS
RRL.  I think 5 or 10 qps is an appropriate DNS response rate limit
(although many operators like 50 or even 100).  5, 10, or even 500 qps
is a bad limit if your DNS rate limiting is naive firewall counting
that pays attention only to source addresses. 



>   but I don't think that the numbers work out. If
> you're getting an attack of 10M PPS, which is very realistic, you'll end up
> denying service to real users.

In most cases (i.e. not OpenDNS, Google, Comcast, etc.), if you're
getting 10 Mqps, then your DNS server is denying service to real
users regardless of any response rate limiting, because 10M DNS
queries/second is perhaps 4 Gbit/sec as well as a healthy CPU load.
What is the queryperf number of your DNS system over localhost?
(queryperf is a common tool for measuring how many queries your DNS
system can answer.)
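
The arithmetic behind those two figures can be checked in a few lines. The variable names are illustrative, and the per-query size is only what the "perhaps 4 Gbit/sec" estimate implies, not a measured value:

```python
ATTACK_QPS = 10_000_000          # the 10 Mqps example from the thread
QUOTED_BITS_PER_SECOND = 4e9     # "perhaps 4 Gbit/sec"

# Average query size on the wire implied by the two figures above.
implied_bytes_per_query = QUOTED_BITS_PER_SECOND / (ATTACK_QPS * 8)
print(f"implied size: {implied_bytes_per_query:.0f} bytes per query")

# How far the example rate sits above the response limits mentioned above.
ratios = {limit: ATTACK_QPS / limit for limit in (5, 10, 50, 100)}
for limit, ratio in ratios.items():
    print(f"limit {limit:>3} qps: attack rate is {ratio:,.0f}x the limit")
```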


Vernon Schryver    v...@rhyolite.com


Re: [dns-operations] rate-limiting state

2014-02-06 Thread Paul Vixie


Damian Menscher wrote:
> ...
> My recommendation (which Vixie and Vernon disagree with) is to use RRL
> with slip=1 -- return TC=1 responses to all queries over the limit.

my disagreement is explained in detail here:

http://www.circleid.com/posts/20130913_on_the_time_value_of_security_features_in_dns/

> This ensures your legitimate users can get through with a TCP request,
> rather than having to attempt multiple retries before learning to
> retry over TCP.  Does slip=1 address your concerns?
>
> Of course TCP isn't perfect -- it has higher latency and
> per-connection costs -- but at least it ensures your legitimate users
> can't be affected by the RRL.

it does not. see [ibid].

vixie


Re: [dns-operations] rate-limiting state

2014-02-06 Thread Paul Vixie


Colm MacCárthaigh wrote:
> On Thu, Feb 6, 2014 at 3:28 PM, Paul Vixie wrote:
>
> second, RRL does not see SYNs. the kernel probably has SYN flood
> protection, which like a stateful firewall might penalize a host
> or netblock's real SYNs, but that has nothing to do with RRL's logic.
>
>
> So the reflection attack is completed? If you're going to respond to
> all of the SYNs, you may as well respond to the UDP queries. It'll be
> the same reflected PPS. The byte count will be a little higher, but
> most networks are bottlenecked on PPS at DNS payload sizes.

i'm willing to discuss RRL's impact on third parties, since that's our
topic here. if you are also worried about logic outside of RRL, please
start a new thread.

> so, third, let's look squarely at "large enough UDP flow to
> activate RRL". 
>
>
> 10M requests/sec for www.example.com, type=A.
> Would that be large enough?

it has to be more than five or ten per second to trigger RRL, and less
than a full pipe to avoid being a non-specific DDoS. i can't say whether
10M qualifies or not. let's say that there's 20MPPS in headroom, so that
your 10M example will not cause general congestion/exhaustion.

> in that steady state situation, opendns's legitimate queries whose
> response matches an RRL flow are mixed with an avalanche of forged
> questions soliciting the same answer. opendns will retry three
> times over ten to 90 seconds. if opendns ever gets an answer, it
> will fill its cache and stop asking that question. the possibility
> of opendns receiving a TC=1 and retrying with TCP, or receiving
> one of our periodic normal answers, and either way filling its
> cache is high, on the order of unity. of course, opendns might ask
> other authorities in between retries to any one authority, so
> you'll need to spoof all of the potential authorities who could
> help with the terminal cache-fill operation that ends the race.
>
>
> I agree with all of this, but I don't think that the numbers work out.
> If you're getting an attack of 10M PPS, which is very realistic,
> you'll end up denying service to real users.

are you attempting to shift the onus of proof here? in my testing i have
not been able to create an attack against friendly traffic by triggering
the RRL logic on some victim's behalf. you assert that it's possible,
so, i'd like to see your demonstration.

> Important to consider here, is that if you did nothing, and let the
> responses go answered (if you can), there's no impact on the real users.

that was not my experience. the unattenuated output flows toward forgery
victims were causing both network congestion and resource exhaustion
that affected untargeted third party "real users". that was the
motivation for many authority server operators to deploy RRL -- by which
i mean, they were happy to be stopping pain for victims, but even
happier to make their own pain stop. i think you may be speaking from
ignorance here.

> The reflection target does get hit of course though.  So in effect, at
> realistic DDOS scales, RRL can be used to deny service to your real
> users to protect victims of reflection attacks.

that is a complete (end to end, top to bottom) nonsequitur.

> That's a form of asymmetric altruism. I'm not against it, doing the
> internet a favour is worthwhile, we all benefit ; but it's worth
> calling out.

no. you've said approximately nothing worth calling out here.

vixie

Re: [dns-operations] rate-limiting state

2014-02-06 Thread Colm MacCárthaigh
On Thu, Feb 6, 2014 at 4:06 PM, Patrick W. Gilmore wrote:

> >> second, RRL does not see SYNs. the kernel probably has SYN flood
> >> protection, which like a stateful firewall might penalize a host or
> >> netblock's real SYNs, but that has nothing to do with RRL's logic.
> >
> > So the reflection attack is completed? If you're going to respond to all
> > of the SYNs, you may as well respond to the UDP queries. It'll be the same
> > reflected PPS. The byte count will be a little higher, but most networks
> > are bottlenecked on PPS at DNS payload sizes.
>
> Paul was responding at RRL. Whether the OS on a name server responds to
> SYNs is outside RRL's scope.
>

I don't think it is, if the point is to reduce the impact of reflection
attacks then we need to consider every source of reflection. Per Paul's
article he calls out TCP and ICMP as reflection friendly.

> Plus who is going to hit a name server with SYNs when pretty much every
> server on the Internet will ACK any SYN sent to it? Pick your favorite NN
> million servers and send them all 10 SYNs / sec, will fall under anyone's
> rate limiting and you're done.
>

Amplification is one reason that people use DNS servers as reflectors, but
another common one is that they are hard to filter. If www.example.com is a
well-known site, then the target may need to permit traffic inbound from
their nameservers. You as the reflection victim can't simply filter it;
there'll be impact. I think that's why even non-amplifying reflection
attacks happen that way.

> > I agree with all of this, but I don't think that the numbers work out. If
> > you're getting an attack of 10M PPS, which is very realistic, you'll end up
> > denying service to real users.
>
> You don't think the numbers work out? That's your response. Lots of people
> have RRL installed & have survived attacks with it. Have you any data other
> than "I don't think that the numbers work out" to show otherwise?
>

I don't see anyone disputing my example, and I'm not calling out RRLs
ability to dampen a reflection attack. I'm saying that RRL can be used to
counter-attack your users.  Let's say a busy website gets 1,000 QPS of
"real" user queries. If I want those queries to survive say with 2 retries,
then I need to let through 40% of traffic to have a 95p confidence of them
getting an answer. Yes, I'll have mitigated the reflection to 4Gbit/sec,
but meanwhile users will be seeing increased resolution times and timeouts.
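
One way to sanity-check retry arithmetic of this kind is to assume that every attempt independently gets an answer with the same probability. That independence, and the function names below, are assumptions made for illustration; the model ignores TC=1 fallback to TCP and alternate authorities, and it need not reproduce the exact figures quoted above, which may rest on different assumptions:

```python
def p_answer(pass_fraction, attempts):
    """Chance that at least one of `attempts` independent tries is answered."""
    return 1.0 - (1.0 - pass_fraction) ** attempts


def required_pass_fraction(target, attempts):
    """Per-attempt pass fraction needed to reach `target` overall."""
    return 1.0 - (1.0 - target) ** (1.0 / attempts)


ATTEMPTS = 3  # one initial query plus 2 retries

print(f"40% passed, {ATTEMPTS} attempts: {p_answer(0.40, ATTEMPTS):.3f}")
print(f"needed for 95%: {required_pass_fraction(0.95, ATTEMPTS):.3f}")
```

Changing ATTEMPTS or the target shows how sensitive the conclusion is to the assumed resolver retry behaviour.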

> > Important to consider here, is that if you did nothing, and let the
> > responses go answered (if you can), there's no impact on the real users.
> > The reflection target does get hit of course though.  So in effect, at
> > realistic DDOS scales, RRL can be used to deny service to your real users
> > to protect victims of reflection attacks. That's a form of asymmetric
> > altruism. I'm not against it, doing the internet a favour is worthwhile, we
> > all benefit ; but it's worth calling out.
>
> I think you'll find there is huge impact on the real users, since the
> target is down and the target is a real user. Plus all the people between
> the reflector and the target.
>

Not necessarily; large targets often come with a lot of defenses and don't
always need help from the reflectors. There's a large class of targets that
are unprepared and may be impacted, but it's a complicated trade-off. I
think with RRL you have to consider whether you value defending the third
party target (who you may or may not have any relationship with) or
value your users?  There are ways to do both, with more sophisticated
spoofing and reflection attack detection.

-- 
Colm

Re: [dns-operations] rate-limiting state

2014-02-06 Thread Vernon Schryver
> From: =?ISO-8859-1?Q?Colm_MacC=E1rthaigh?= 
> To: Paul Vixie 
> Cc: DNS Operations List 

> > For example, if the authoritative provider www.example.com were to
> > implement RRL as you describe, then an attacker could spoof traffic
> > purporting to be from Google Public DNS, OpenDNS, Comcast ... etc, and
> > cause www.example.com to be un-resolvable by users of those resolvers.
> >
> > no. it just does not work that way.
>
> O.k., so say I spoof 10M UDP queries per second and 10M TCP SYNs per second
> purporting to be from OpenDNS's IP address. Does RRL  a)  Let the queries
> and SYNs go answered. Or b) Rate limit the responses?
>
> If it's (a) RRL doesn't prevent the reflection. If it's (b) then you
> complete a denial of service attack against the OpenDNS users.
>
> Which is it? or what's option (c)?

I think one option (c) (there might be others) is related to what
Paul Vixie meant when he wrote:

]  The more common case will be like DNS RRL, where deep knowledge
]  of the protocol is necessary for a correctly engineered rate-limiting
]  solution applicable to the protocol

in http://queue.acm.org/detail.cfm?id=2578510

I've written too many times here and elsewhere that DNS RRL is not a
naive firewall rate limit.  Simplistic firewall rate limiting against
DNS reflections is little better than blocking all ICMP on "security"
grounds.  That is why DNS RRL is in the DNS code instead of firewalls.
That's also why there are two R's in RRL.

There are plenty of words in the documentation, technical reports, and
analyses of the various RRL implementations about RRL false positives.
There is disagreement about the best values for the parameters that
minimize RRL false positives, but we who have the least interest in
the topic agree that neither option (a) nor option (b) fit.


Vernon Schryver    v...@rhyolite.com


Re: [dns-operations] rate-limiting state

2014-02-06 Thread Patrick W. Gilmore
On Feb 06, 2014, at 18:59 , Colm MacCárthaigh  wrote:
> On Thu, Feb 6, 2014 at 3:28 PM, Paul Vixie  wrote:

>> second, RRL does not see SYNs. the kernel probably has SYN flood protection, 
>> which like a stateful firewall might penalize a host or netblock's real 
>> SYNs, but that has nothing to do with RRL's logic.
> 
> So the reflection attack is completed? If you're going to respond to all of 
> the SYNs, you may as well respond to the UDP queries. It'll be the same 
> reflected PPS. The byte count will be a little higher, but most networks are 
> bottlenecked on PPS at DNS payload sizes. 

Paul was responding at RRL. Whether the OS on a name server responds to SYNs is 
outside RRL's scope.

Plus who is going to hit a name server with SYNs when pretty much every server 
on the Internet will ACK any SYN sent to it? Pick your favorite NN million 
servers and send them all 10 SYNs / sec, will fall under anyone's rate limiting 
and you're done.



>> so, third, let's look squarely at "large enough UDP flow to activate RRL".
> 
> 10M requests/sec for www.example.com, type=A. Would that be large enough?
> 
>> in that steady state situation, opendns's legitimate queries whose response 
>> matches an RRL flow are mixed with an avalanche of forged questions 
>> soliciting the same answer. opendns will retry three times over ten to 90 
>> seconds. if opendns ever gets an answer, it will fill its cache and stop 
>> asking that question. the possibility of opendns receiving a TC=1 and 
>> retrying with TCP, or receiving one of our periodic normal answers, and 
>> either way filling its cache is high, on the order of unity. of course, 
>> opendns might ask other authorities in between retries to any one authority, 
>> so you'll need to spoof all of the potential authorities who could help with 
>> the terminal cache-fill operation that ends the race.
> 
> I agree with all of this, but I don't think that the numbers work out. If 
> you're getting an attack of 10M PPS, which is very realistic, you'll end up 
> denying service to real users. 

You don't think the numbers work out? That's your response. Lots of people have 
RRL installed & have survived attacks with it. Have you any data other than "I 
don't think that the numbers work out" to show otherwise?


> Important to consider here, is that if you did nothing, and let the responses 
> go answered (if you can), there's no impact on the real users. The reflection 
> target does get hit of course though.  So in effect, at realistic DDOS 
> scales, RRL can be used to deny service to your real users to protect victims 
> of reflection attacks. That's a form of asymmetric altruism. I'm not against 
> it, doing the internet a favour is worthwhile, we all benefit ; but it's 
> worth calling out. 

I think you'll find there is huge impact on the real users, since the target is 
down and the target is a real user. Plus all the people between the reflector 
and the target.

Until you can do better than "I don't think the numbers work out", I think I'll 
go with this being better than nothing despite your reservations.

-- 
TTFN,
patrick




Re: [dns-operations] rate-limiting state

2014-02-06 Thread Colm MacCárthaigh
On Thu, Feb 6, 2014 at 3:28 PM, Paul Vixie  wrote:
>
> first, i think we need smaller numbers, since at those volumes, opendns's
> pipes are full, and nobody would get any answers to any questions. so let's
> pick some number of requests and SYNs per second that is enough to use all
> the head room opendns had, but without affecting response flows to queries
> not related to your attack.
>

I chose a fairly typical number, which is actually below average. Arbor's
data on DDOS puts 10M somewhere between the 40th and 50th percentile.  I'd
be really surprised if OpenDNS's pipes fill up with that kind of small
volume.

> second, RRL does not see SYNs. the kernel probably has SYN flood
> protection, which like a stateful firewall might penalize a host or
> netblock's real SYNs, but that has nothing to do with RRL's logic.
>

So the reflection attack is completed? If you're going to respond to all of
the SYNs, you may as well respond to the UDP queries. It'll be the same
reflected PPS. The byte count will be a little higher, but most networks
are bottlenecked on PPS at DNS payload sizes.


> so, third, let's look squarely at "large enough UDP flow to activate RRL".
>

10M requests/sec for www.example.com, type=A. Would that be large enough?

> in that steady state situation, opendns's legitimate queries whose response
> matches an RRL flow are mixed with an avalanche of forged questions
> soliciting the same answer. opendns will retry three times over ten to 90
> seconds. if opendns ever gets an answer, it will fill its cache and stop
> asking that question. the possibility of opendns receiving a TC=1 and
> retrying with TCP, or receiving one of our periodic normal answers, and
> either way filling its cache is high, on the order of unity. of course,
> opendns might ask other authorities in between retries to any one
> authority, so you'll need to spoof all of the potential authorities who
> could help with the terminal cache-fill operation that ends the race.
>

I agree with all of this, but I don't think that the numbers work out. If
you're getting an attack of 10M PPS, which is very realistic, you'll end up
denying service to real users.

Important to consider here, is that if you did nothing, and let the
responses go answered (if you can), there's no impact on the real users.
The reflection target does get hit of course though.  So in effect, at
realistic DDOS scales, RRL can be used to deny service to your real users
to protect victims of reflection attacks. That's a form of asymmetric
altruism. I'm not against it, doing the internet a favour is worthwhile, we
all benefit ; but it's worth calling out.

-- 
Colm

Re: [dns-operations] rate-limiting state

2014-02-06 Thread Paul Vixie


Paul Vixie wrote:
> ...
> my advice is, don't take my word for it. build a VM farm inside your
> laptop, set up a test environment, perform the experiment, and show us
> all the recipe that results in maximum retries for a DNS stub. show
> the effort required to deliberately create a service-affecting outage
> using RRL. i think you'll find this an unsatisfying exercise but i am
> ready to study your results.

see also:

http://www.circleid.com/posts/20130913_on_the_time_value_of_security_features_in_dns/

vixie

Re: [dns-operations] rate-limiting state

2014-02-06 Thread Paul Vixie


Colm MacCárthaigh wrote:
>
> On Thu, Feb 6, 2014 at 2:37 PM, Paul Vixie wrote:
>
>> For example, if the authoritative provider www.example.com were to
>> implement RRL as you describe, then an attacker could spoof traffic
>> purporting to be from Google Public DNS, OpenDNS, Comcast ... etc, and
>> cause www.example.com to be un-resolvable by users of those resolvers.
>
> no. it just does not work that way.
>
>
> O.k., so say I spoof 10M UDP queries per second and 10M TCP SYNs per
> second purporting to be from OpenDNS's IP address. Does RRL  a)  Let
> the queries and SYNs go answered. Or b) Rate limit the responses? 
>
> If it's (a) RRL doesn't prevent the reflection. If it's (b) then you
> complete a denial of service attack against the OpenDNS users. 
>
> Which is it? or what's option (c)?

first, i think we need smaller numbers, since at those volumes,
opendns's pipes are full, and nobody would get any answers to any
questions. so let's pick some number of requests and SYNs per second
that is enough to use all the head room opendns had, but without
affecting response flows to queries not related to your attack.

second, RRL does not see SYNs. the kernel probably has SYN flood
protection, which like a stateful firewall might penalize a host or
netblock's real SYNs, but that has nothing to do with RRL's logic.
furthermore, RRL is not invoked for TCP-received queries. whatever TCP
sessions are able to start up are presumed to not have a forged
other-end. what you really want here is to use non-spoofed TCP SYN so
that you can camp onto all available connection control blocks. if
opendns follows RFC 1035, it won't close TCP sessions until something
like two minutes of idle time. you'd

so, third, let's look squarely at "large enough UDP flow to activate
RRL". when RRL is active for a flow, it means the server is controlling
its responses to UDP questions which are from a certain netblock and
which produce a given response. by "control" i mean "don't just answer,
think about it first, consider your alternatives". one alternative is to
drop the question unanswered. another alternative is to answer with a
truncated UDP having TC=1. by default, netblocks are /24's and every
third prospective response is sent as a TC=1, and we will of course send
normal answers from time to time also. RRL's goal is attenuation, not
silence.
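
The per-flow decision described above can be sketched roughly as follows. This is a simplified illustration, not the logic of any shipping RRL implementation: it resets its counters every second instead of keeping credit over a window, and the class name, parameter names, and the limit of 5 are assumptions. The /24 netblock and the every-third TC=1 response follow the description above:

```python
from collections import defaultdict
from ipaddress import ip_network


class RrlSketch:
    """Per-second response limit keyed on (client netblock, response)."""

    def __init__(self, responses_per_second=5, slip=3, prefix=24):
        self.limit = responses_per_second
        self.slip = slip            # every slip-th excess response gets TC=1
        self.prefix = prefix
        self.current_second = None
        self.counts = defaultdict(int)

    def decide(self, src, response_key, now_second):
        """Return 'answer', 'truncate' (TC=1) or 'drop' for one UDP query."""
        if now_second != self.current_second:   # new second: start over
            self.current_second = now_second
            self.counts.clear()
        block = str(ip_network(f"{src}/{self.prefix}", strict=False))
        key = (block, response_key)
        self.counts[key] += 1
        excess = self.counts[key] - self.limit
        if excess <= 0:
            return "answer"                     # under the limit
        if self.slip and excess % self.slip == 0:
            return "truncate"                   # lets a real client use TCP
        return "drop"


rrl = RrlSketch()
flow = ("www.example.com", "A")
decisions = [rrl.decide("192.0.2.53", flow, 0) for _ in range(11)]
print(decisions)
print(rrl.decide("198.51.100.9", flow, 0))      # a different netblock
```

Queries that arrive over TCP would bypass this function entirely, in keeping with the point above that RRL is not invoked for TCP-received queries.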

in your example, the forger is using a netblock that contains opendns's
upstream address, and it is soliciting responses similar to the ones
that opendns is trying to fetch. opendns only fetches from authorities
when it has a cache miss, so you have to find something to spoof that
opendns does not yet have in cache (or which has a very low TTL).

in that steady state situation, opendns's legitimate queries whose
response matches an RRL flow are mixed with an avalanche of forged
questions soliciting the same answer. opendns will retry three times
over ten to 90 seconds. if opendns ever gets an answer, it will fill its
cache and stop asking that question. the possibility of opendns
receiving a TC=1 and retrying with TCP, or receiving one of our periodic
normal answers, and either way filling its cache is high, on the order
of unity. of course, opendns might ask other authorities in between
retries to any one authority, so you'll need to spoof all of the
potential authorities who could help with the terminal cache-fill
operation that ends the race.

my advice is, don't take my word for it. build a VM farm inside your
laptop, set up a test environment, perform the experiment, and show us
all the recipe that results in maximum retries for a DNS stub. show the
effort required to deliberately create a service-affecting outage using
RRL. i think you'll find this an unsatisfying exercise but i am ready to
study your results.

so, (c).

vixie

Re: [dns-operations] rate-limiting state

2014-02-06 Thread Colm MacCárthaigh
On Thu, Feb 6, 2014 at 2:37 PM, Paul Vixie  wrote:

> For example, if the authoritative provider www.example.com were to
> implement RRL as you describe, then an attacker could spoof traffic
> purporting to be from Google Public DNS, OpenDNS, Comcast ... etc, and
> cause www.example.com to be un-resolvable by users of those resolvers.
>
>
> no. it just does not work that way.
>

O.k., so say I spoof 10M UDP queries per second and 10M TCP SYNs per second
purporting to be from OpenDNS's IP address. Does RRL  a)  Let the queries
and SYNs go answered. Or b) Rate limit the responses?

If it's (a) RRL doesn't prevent the reflection. If it's (b) then you
complete a denial of service attack against the OpenDNS users.

Which is it? or what's option (c)?

-- 
Colm

Re: [dns-operations] rate-limiting state

2014-02-06 Thread Paul Vixie


Colm MacCárthaigh wrote:
>
> Your article mentions RRL and asymmetric threats,  but does not
> mention that RRL opens the implementor up to a new asymmetric threat.
> With RRL, an attacker can spoof legitimate clients and cause the RRL
> implementation to deny them service. 

no.

>
> For example, if the authoritative provider www.example.com were to
> implement RRL as you describe, then an attacker could spoof traffic
> purporting to be from Google Public DNS, OpenDNS, Comcast ... etc, and
> cause www.example.com to be un-resolvable by users of those resolvers.

no. it just does not work that way.

>
> The more widely RRL is applied to more protocols and schemes, the more
> they are vulnerable to this same simple counter-attack. It seems like
> setting the internet up with a brittle component that may ultimately
> make spoofing-based denial of service easier, not harder. This
> creates additional risk on the implementor at very little benefit to
> themselves, which still seems asymmetric.

dns rrl is a protocol-specific approach to rate limiting, for dns, based
on responses.

as i said in the ACM Queue article, every protocol we want to rate limit
is going to need a protocol-specific, protocol-aware method of rate
limiting. we must not create new vulnerabilities as a side effect of
closing old ones.

vixie

Re: [dns-operations] rate-limiting state

2014-02-06 Thread Colm MacCárthaigh
Your article mentions RRL and asymmetric threats,  but does not mention
that RRL opens the implementor up to a new asymmetric threat. With RRL, an
attacker can spoof legitimate clients and cause the RRL implementation to
deny them service.

For example, if the authoritative provider www.example.com were to
implement RRL as you describe, then an attacker could spoof traffic
purporting to be from Google Public DNS, OpenDNS, Comcast ... etc, and
cause www.example.com to be un-resolvable by users of those resolvers.

The more widely RRL is applied to more protocols and schemes, the more they
are vulnerable to this same simple counter-attack. It seems like setting
the internet up with a brittle component that may ultimately make
spoofing-based denial of service easier, not harder. This creates
additional risk on the implementor at very little benefit to themselves,
which still seems asymmetric.



On Thu, Feb 6, 2014 at 9:53 AM, Paul Vixie  wrote:

>  my latest bcp38 related effort was published in ACM Queue today:
>
> http://queue.acm.org/detail.cfm?id=2578510
>
> vixie



-- 
Colm

[dns-operations] rate-limiting state

2014-02-06 Thread Paul Vixie
my latest bcp38 related effort was published in ACM Queue today:

http://queue.acm.org/detail.cfm?id=2578510

vixie