Re: [dns-privacy] Use of separate caches for plain and secure transports

2018-12-19 Thread Giovane Moura
Hi folks,

> Basically, one of the reasons the DNS protocol has been so robust is
> because of the caching behavior.  It greatly reduces traffic, greatly
> speeds up lookups.

Just want to provide some numbers on lookups RTT.

On experiment 1800 (tab 1 at
https://www.isi.edu/~johnh/PAPERS/Moura18a.pdf, also
https://atlas.ripe.net/measurements/10507676/), we had:
  * ~10,000 Atlas probes querying their resolvers for a unique domain name
  * two auth servers hosted at EC2 in Frankfurt
  * Median RTT values:
    * cache miss queries: 61.51ms
    * cache hit queries:   2.94ms

I know this is not representative of all scenarios. It only covers two
auth servers at the same location (unicast), and if we had used anycast,
cache miss medians would likely have decreased significantly.
But at least we have a concrete number that works for some scenarios.
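
To put the two medians side by side, here is a rough back-of-the-envelope
sketch. The hit rates are made-up illustrations, not measurements from the
experiment, and weighting two medians by a hit rate is only a crude
estimate of typical latency.

```python
# Median RTTs reported above, in milliseconds.
MISS_MS = 61.51
HIT_MS = 2.94


def speedup(miss_ms: float = MISS_MS, hit_ms: float = HIT_MS) -> float:
    """How many times faster a cache hit is than a cache miss."""
    return miss_ms / hit_ms


def blended_rtt(hit_rate: float,
                miss_ms: float = MISS_MS,
                hit_ms: float = HIT_MS) -> float:
    """Crude estimate: weight the two medians by an assumed hit rate."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_ms + (1.0 - hit_rate) * miss_ms


if __name__ == "__main__":
    print(f"a hit is ~{speedup():.1f}x faster than a miss")
    for rate in (0.0, 0.5, 0.8):  # hypothetical hit rates
        print(f"hit rate {rate:.0%}: ~{blended_rtt(rate):.2f} ms")
```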

/giovane
___
dns-privacy mailing list
dns-privacy@ietf.org
https://www.ietf.org/mailman/listinfo/dns-privacy


Re: [dns-privacy] Use of separate caches for plain and secure transports

2018-12-17 Thread Christopher Wood
On Mon, Dec 17, 2018 at 1:33 PM Warren Kumari  wrote:
>
>
>
> On Fri, Dec 14, 2018 at 4:05 PM Christopher Wood wrote:
>>
>> On Dec 14, 2018, 12:29 PM -0800, Daniel Kahn Gillmor wrote:
>>
>> On Fri 2018-12-14 11:47:58 -0800, Christopher Wood wrote:
>>
>> On Dec 14, 2018, 10:47 AM -0800, Wes Hardaker wrote:
>>
>> [And, no, we shouldn't go down the road of "privacy requires you disable
>> the cache"]
>>
>>
>> Would you mind elaborating on this comment? As you observe, caches are
>> harmful to privacy. Refusal to disable the cache in any (?)
>> circumstance therefore seems dismissive of user privacy. Perhaps you
>> mean turning it off for every query is not a viable path forward?
>>
>>
>> I hope Wes will answer this question on his own, but i wanted to note
>> that privacy is not only harmed by caches. it can also be helped by
>> caches.
>>
>> A query for any name will typically radiate *less* information into the
>> world if it's answered from a cache, simply because the resolver in
>> question doesn't create additional traffic.
>>
>> In particular, if the cache is already well-populated, and queries are
>> padded appropriately, and the name is relatively likely to be in-cache,
>> then the only parties that know what was looked up are the client and
>> the resolver itself. No authoritative servers or network observers have
>> any additional information to distinguish the query from any other
>> cache-resolved query handled by the resolver.
>>
>> So i don't think caching itself offers a clear benefit or harm for
>> privacy. One advantage of a resolver is that it effectively acts as a
>> mixing/semi-anonymizing agent on behalf of its users. Assuming that the
>> resolver itself is not compromised, it can buffer its users from the
>> authoritative servers.
>>
>>
>> Yes, of course, thanks for clarifying the other piece of this puzzle! This 
>> is indeed a benefit. However, I am not convinced this yields a greater net 
>> benefit than disabling caching. (I am not aware of any such study or 
>> analysis on this problem.) That said, all of this depends entirely upon the 
>> threat model, which can vary greatly.
>
>
> If you disable the cache, and can see that there is an (encrypted) input
> query and then immediately an (encrypted) output query to 208.80.154.238
> (ns0.wikimedia.org), you know with very high likelihood what the input
> query was for.

Agreed, though I think leaking the origin through the address is an
issue regardless of whether the cache is shared or not.

> If you have a shared cache, there is a much higher likelihood that the input 
> query gets answered from cache (especially for higher popularity names) and 
> so there is no output query to correlate with. Techniques which refresh the 
> cache before the TTL has expired (à la HAMMER) further thwart correlation 
> attacks.

I agree in principle, yet it seems TTL-based stub cache refresh
mechanisms could be implemented regardless of whether or not there's a
shared resolver cache. (Please correct me if I misunderstood your
point!)

In my opinion, tradeoffs made between enabling or disabling caching
are not well studied. (Thanks to Wes for sharing a pointer to his
paper which scratches the surface of this interesting problem.) We
need more work before we understand these tradeoffs and choose the
"right" answer.

Best,
Chris



Re: [dns-privacy] Use of separate caches for plain and secure transports

2018-12-17 Thread Warren Kumari
On Fri, Dec 14, 2018 at 4:05 PM Christopher Wood <
christopherwoo...@gmail.com> wrote:

> On Dec 14, 2018, 12:29 PM -0800, Daniel Kahn Gillmor <
> d...@fifthhorseman.net>, wrote:
>
> On Fri 2018-12-14 11:47:58 -0800, Christopher Wood wrote:
>
> On Dec 14, 2018, 10:47 AM -0800, Wes Hardaker , wrote:
>
> [And, no, we shouldn't go down the road of "privacy requires you disable
> the cache"]
>
>
> Would you mind elaborating on this comment? As you observe, caches are
> harmful to privacy. Refusal to disable the cache in any (?)
> circumstance therefore seems dismissive of user privacy. Perhaps you
> mean turning it off for every query is not a viable path forward?
>
>
> I hope Wes will answer this question on his own, but i wanted to note
> that privacy is not only harmed by caches. it can also be helped by
> caches.
>
> A query for any name will typically radiate *less* information into the
> world if it's answered from a cache, simply because the resolver in
> question doesn't create additional traffic.
>
> In particular, if the cache is already well-populated, and queries are
> padded appropriately, and the name is relatively likely to be in-cache,
> then the only parties that know what was looked up are the client and
> the resolver itself. No authoritative servers or network observers have
> any additional information to distinguish the query from any other
> cache-resolved query handled by the resolver.
>
> So i don't think caching itself offers a clear benefit or harm for
> privacy. One advantage of a resolver is that it effectively acts as a
> mixing/semi-anonymizing agent on behalf of its users. Assuming that the
> resolver itself is not compromised, it can buffer its users from the
> authoritative servers.
>
>
> Yes, of course, thanks for clarifying the other piece of this puzzle! This
> is indeed a benefit. However, I am not convinced this yields a greater net
> benefit than disabling caching. (I am not aware of any such study or
> analysis on this problem.) That said, all of this depends entirely upon the
> threat model, which can vary greatly.
>

If you disable the cache, and can see that there is an (encrypted) input
query and then immediately an (encrypted) output query to 208.80.154.238
(ns0.wikimedia.org), you know with very high likelihood what the input
query was for.

If you have a shared cache, there is a much higher likelihood that the
input query gets answered from cache (especially for higher-popularity
names), and so there is no output query to correlate with. Techniques which
refresh the cache before the TTL has expired (à la HAMMER) further thwart
correlation attacks.
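
To make the correlation concrete, here is a minimal sketch of what a
passive observer could do with nothing but timestamps and destination
addresses. The event tuples, the 50 ms window and the function name are
assumptions for illustration; only the address comes from the example
above.

```python
from typing import List, Optional, Tuple

# (timestamp in seconds, destination IP) of encrypted queries leaving the resolver.
Outbound = Tuple[float, str]


def correlate(inbound_times: List[float],
              outbound: List[Outbound],
              window_s: float = 0.050) -> List[Optional[str]]:
    """For each inbound query, return the destination of the first outbound
    query seen within window_s after it, or None if there was none
    (for example because the answer came from the cache)."""
    guesses: List[Optional[str]] = []
    ordered = sorted(outbound)
    for t_in in inbound_times:
        match: Optional[str] = None
        for t_out, dest in ordered:
            if 0.0 <= t_out - t_in <= window_s:
                match = dest
                break
        guesses.append(match)
    return guesses


if __name__ == "__main__":
    inbound = [10.000, 12.000]
    # Hypothetical trace: only the first inbound query missed the cache.
    outbound = [(10.003, "208.80.154.238")]
    print(correlate(inbound, outbound))
```

With a warm shared cache the second list is mostly empty, which is
exactly what leaves the observer with nothing to match against.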

W






-- 
I don't think the execution is relevant when it was obviously a bad idea in
the first place.
This is like putting rabid weasels in your pants, and later expressing
regret at having chosen those particular rabid weasels and that pair of
pants.
   ---maf


Re: [dns-privacy] Use of separate caches for plain and secure transports

2018-12-17 Thread Wes Hardaker
Daniel Kahn Gillmor  writes:

> I hope Wes will answer this question on his own

Basically, one of the reasons the DNS protocol has been so robust is
because of the caching behavior.  It greatly reduces traffic, greatly
speeds up lookups.  Turning off caching would disable much of this
critical infrastructure that the DNS was designed with.  Recent work has
shown that longer TTLs enable zones to survive DDoS attacks because of
caching (https://www.isi.edu/~johnh/PAPERS/Moura18a.pdf).

Instead, we could maybe cache the delay and do something like:
"if privacy mode is enabled for the first query that misses the cache
for name X, then store [X, delay] for the resolution time.  For all
future requests up until the first non-privacy-protected query for X,
force a delayed response but respond from the cache".  That's kinda
messy, but at least it may balance the need to keep the cache with
privacy.
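
A minimal sketch of that idea, under stated assumptions: the class name
DelayCache, the resolve_upstream callback and the injectable sleep/clock
are all hypothetical, and "up until the first non-privacy-protected
query" is read here as "that query is still delayed, and the delay is
dropped afterwards". Other readings are possible.

```python
import time
from typing import Callable, Dict, Tuple


class DelayCache:
    """Cache that replays the original miss latency for privately learned names."""

    def __init__(self,
                 resolve_upstream: Callable[[str], str],
                 sleep: Callable[[float], None] = time.sleep,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._resolve = resolve_upstream
        self._sleep = sleep
        self._clock = clock
        # name -> (answer, delay in seconds to replay; 0.0 means no delay)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def query(self, name: str, private: bool) -> str:
        entry = self._entries.get(name)
        if entry is None:
            start = self._clock()
            answer = self._resolve(name)
            elapsed = self._clock() - start
            # Store [X, delay] only when the first miss was a private query.
            self._entries[name] = (answer, elapsed if private else 0.0)
            return answer

        answer, delay = entry
        if delay > 0.0:
            # Answer from the cache, but make it look like a miss.
            self._sleep(delay)
            if not private:
                # Assumption: after the first non-private query for X,
                # the name may be served at full cache speed.
                self._entries[name] = (answer, 0.0)
        return answer
```

The messy part is visible even here: the resolver gives up the latency
benefit of the cache for every privately learned name until cleartext
traffic happens to ask for it.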

> , but i wanted to note that privacy is not only harmed by caches.  it
> can also be helped by caches.

Yep.  I did some experiments around this at the beginning of 2018 for
the NDSS DNS privacy workshop.

Paper: 
http://www.isi.edu/~hardaker/papers/2018-02-ndss-analyzing-root-privacy.pdf

YouTube 1: https://youtu.be/bSKBRMNQ7s0
YouTube 2: https://youtu.be/9YYH8JFH_bY?t=21m0s

-- 
Wes Hardaker 
My Pictures:   http://capturedonearth.com/
My Thoughts:   http://blog.capturedonearth.com/



Re: [dns-privacy] Use of separate caches for plain and secure transports

2018-12-14 Thread Christopher Wood
On Dec 14, 2018, 12:29 PM -0800, Daniel Kahn Gillmor wrote:
> On Fri 2018-12-14 11:47:58 -0800, Christopher Wood wrote:
> > On Dec 14, 2018, 10:47 AM -0800, Wes Hardaker , wrote:
> > > [And, no, we shouldn't go down the road of "privacy requires you disable
> > > the cache"]
> >
> > Would you mind elaborating on this comment? As you observe, caches are
> > harmful to privacy. Refusal to disable the cache in any (?)
> > circumstance therefore seems dismissive of user privacy. Perhaps you
> > mean turning it off for every query is not a viable path forward?
>
> I hope Wes will answer this question on his own, but i wanted to note
> that privacy is not only harmed by caches. it can also be helped by
> caches.
>
> A query for any name will typically radiate *less* information into the
> world if it's answered from a cache, simply because the resolver in
> question doesn't create additional traffic.
>
> In particular, if the cache is already well-populated, and queries are
> padded appropriately, and the name is relatively likely to be in-cache,
> then the only parties that know what was looked up are the client and
> the resolver itself. No authoritative servers or network observers have
> any additional information to distinguish the query from any other
> cache-resolved query handled by the resolver.
>
> So i don't think caching itself offers a clear benefit or harm for
> privacy. One advantage of a resolver is that it effectively acts as a
> mixing/semi-anonymizing agent on behalf of its users. Assuming that the
> resolver itself is not compromised, it can buffer its users from the
> authoritative servers.

Yes, of course, thanks for clarifying the other piece of this puzzle! This is 
indeed a benefit. However, I am not convinced this yields a greater net benefit 
than disabling caching. (I am not aware of any such study or analysis on this 
problem.) That said, all of this depends entirely upon the threat model, which 
can vary greatly.

Best,
Chris

>
> --dkg


Re: [dns-privacy] Use of separate caches for plain and secure transports

2018-12-14 Thread Daniel Kahn Gillmor
On Fri 2018-12-14 11:47:58 -0800, Christopher Wood wrote:
> On Dec 14, 2018, 10:47 AM -0800, Wes Hardaker , wrote:
>> [And, no, we shouldn't go down the road of "privacy requires you disable
>> the cache"]
>
> Would you mind elaborating on this comment? As you observe, caches are
> harmful to privacy. Refusal to disable the cache in any (?)
> circumstance therefore seems dismissive of user privacy.  Perhaps you
> mean turning it off for every query is not a viable path forward?

I hope Wes will answer this question on his own, but i wanted to note
that privacy is not only harmed by caches.  it can also be helped by
caches.

A query for any name will typically radiate *less* information into the
world if it's answered from a cache, simply because the resolver in
question doesn't create additional traffic.

In particular, if the cache is already well-populated, and queries are
padded appropriately, and the name is relatively likely to be in-cache,
then the only parties that know what was looked up are the client and
the resolver itself.  No authoritative servers or network observers have
any additional information to distinguish the query from any other
cache-resolved query handled by the resolver.

So i don't think caching itself offers a clear benefit or harm for
privacy.  One advantage of a resolver is that it effectively acts as a
mixing/semi-anonymizing agent on behalf of its users.  Assuming that the
resolver itself is not compromised, it can buffer its users from the
authoritative servers.

  --dkg




Re: [dns-privacy] Use of separate caches for plain and secure transports

2018-12-14 Thread Christopher Wood
On Dec 14, 2018, 10:47 AM -0800, Wes Hardaker , wrote:
> Daniel Kahn Gillmor  writes:
>
> > I have *not* done any analysis of the larger, less-corner-y cases to
> > see whether there's a strong argument for or against treating data
> > that came in under confidential cover differently once it's in the
> > cache.
>
> Technically, it's near impossible to completely protect privacy unless
> you don't use a cache. Imagine the case where someone goes to a coffee
> shop first thing every morning that supports a TLS based resolver.
> A second "customer" 5 minutes later can then perform queries to the
> resolver (regardless of TLS or not) for a slew of names to find which
> were "in cache" and responding quickly. You now know that person #1
> went to sites A, X and Y since they returned in < 5ms, but not the rest
> of the alphabet which returned in > 5ms.
>
> However, I don't think this changes the nature of whether or not the
> caches should be separate. If anything, it may argue for a shared cache
> so that normal traffic from non-privacy protected lookups will mean
> someone snooping caches for private-protected lookups won't know it came
> from a TLS-based user.
>
> [And, no, we shouldn't go down the road of "privacy requires you disable
> the cache"]

Would you mind elaborating on this comment? As you observe, caches are harmful
to privacy. Refusal to disable the cache in any (?) circumstance therefore
seems dismissive of user privacy. Perhaps you mean turning it off for every
query is not a viable path forward?

Relatedly, would per-query cache rules be an appropriate trade-off? For
example, sensitive queries could carry a “do not cache” flag, requiring the
resolver to not cache any (or only some) of the answers used to produce a
response.
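
A minimal sketch of what such a per-query rule might look like inside a
resolver. The do_not_cache parameter is purely hypothetical (this thread
does not define any such flag), as are the class and callback names. The
sketch also assumes a flagged query may still be answered from existing
cache entries, since reading them adds no new state.

```python
from typing import Callable, Dict


class FlagAwareResolver:
    """Resolver cache that honors a hypothetical per-query do-not-cache flag."""

    def __init__(self, resolve_upstream: Callable[[str], str]) -> None:
        self._resolve = resolve_upstream
        self._cache: Dict[str, str] = {}

    def query(self, name: str, do_not_cache: bool = False) -> str:
        if name in self._cache:
            return self._cache[name]
        answer = self._resolve(name)
        if not do_not_cache:
            # Only unflagged queries leave a trace in the cache.
            self._cache[name] = answer
        return answer

    def is_cached(self, name: str) -> bool:
        return name in self._cache
```

Note the cost: every flagged query for an uncached name goes upstream,
so it is exposed to exactly the kind of traffic observation that a cache
hit would have avoided.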

Best,
Chris



Re: [dns-privacy] Use of separate caches for plain and secure transports

2018-12-14 Thread Wes Hardaker
Daniel Kahn Gillmor  writes:

> I have *not* done any analysis of the larger, less-corner-y cases to
> see whether there's a strong argument for or against treating data
> that came in under confidential cover differently once it's in the
> cache.

Technically, it's near impossible to completely protect privacy unless
you don't use a cache.  Imagine the case where someone goes first thing
every morning to a coffee shop that offers a TLS-based resolver.
A second "customer" 5 minutes later can then send queries to the
resolver (regardless of TLS or not) for a slew of names to find which
were "in cache" and so answered quickly.  You now know that person #1
went to sites A, X and Y since they returned in < 5ms, but not to the rest
of the alphabet, which returned in > 5ms.
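
The snooping step amounts to a threshold test on lookup times. A minimal
sketch, with invented timings and hypothetical names (no real
measurements are implied):

```python
from typing import Dict, List

THRESHOLD_MS = 5.0  # the cut-off used in the example above


def likely_cached(rtts_ms: Dict[str, float],
                  threshold_ms: float = THRESHOLD_MS) -> List[str]:
    """Return the names whose lookup time suggests a cache hit."""
    return sorted(name for name, rtt in rtts_ms.items() if rtt < threshold_ms)


if __name__ == "__main__":
    # Hypothetical timings the second "customer" might measure.
    observed = {
        "a.example": 2.1,
        "b.example": 48.0,
        "x.example": 3.3,
        "y.example": 1.9,
        "z.example": 71.5,
    }
    print(likely_cached(observed))
```

Each ordinary recursive probe also populates the cache, so the snooper
gets one clean measurement per name per TTL.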

However, I don't think this changes the nature of whether or not the
caches should be separate.  If anything, it may argue for a shared cache,
so that normal traffic from non-privacy-protected lookups will mean
someone snooping caches for privacy-protected lookups won't know whether
an entry came from a TLS-based user.

[And, no, we shouldn't go down the road of "privacy requires you disable
the cache"]

-- 
Wes Hardaker 
My Pictures:   http://capturedonearth.com/
My Thoughts:   http://blog.capturedonearth.com/



Re: [dns-privacy] Use of separate caches for plain and secure transports

2018-12-13 Thread Mukund Sivaraman
Hi Daniel

On Thu, Dec 13, 2018 at 02:32:41PM -0500, Daniel Kahn Gillmor wrote:
> The degenerate scenario i'd painted on the call was:
> 
>  * consider a DPRIVE-capable DNS resolver; for whatever reason, only a
>    single request has been made to it since it booted.
> 
>  * a new cleartext (non-private) request comes in for foo.example, and
>    it does the lookups it needs to do, all in the clear. (private
>    queries to authoritatives would have worked, but they weren't tried
>    since the initial request was in the clear anyway).
> 
>  * a subsequent private request comes in to the resolver, and the
>    resolver responds without doing any upstream lookup.
> 
> in this scenario, a passive observer of the resolver's traffic can infer
> that the private query was likely also for foo.example (or at least, for
> one of the names that needed resolution in order to get an answer for
> foo.example, like NS records).

Ah.. I follow it now, and why you think it is a leak. :)

A resolver can respond to many queries without performing any upstream
queries. As an example, take the special-use names in RFC 6761. Nothing
can be inferred about a query simply because it didn't result in an
upstream lookup.
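
A minimal sketch of that point, covering only two of the RFC 6761 names
("localhost" and "invalid"). The function names and the tuple return
value are hypothetical, and a real resolver handles far more cases than
this.

```python
from typing import Callable, Tuple


def _is_under(name: str, suffix: str) -> bool:
    """True if name equals suffix or is a subdomain of it."""
    normalized = name.rstrip(".").lower()
    return normalized == suffix or normalized.endswith("." + suffix)


def answer(name: str,
           resolve_upstream: Callable[[str], str]) -> Tuple[str, bool]:
    """Return (answer, went_upstream) for a query name."""
    if _is_under(name, "localhost"):
        return "127.0.0.1", False   # answered locally, no upstream query
    if _is_under(name, "invalid"):
        return "NXDOMAIN", False    # answered locally, no upstream query
    return resolve_upstream(name), True


if __name__ == "__main__":
    upstream = lambda n: "192.0.2.1"  # stand-in for real resolution
    for qname in ("foo.localhost.", "nope.invalid", "www.example.org"):
        print(qname, answer(qname, upstream))
```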

> 
> So this is a privacy leak, which could be mitigated by treating the
> cache of RRs-accessed-in-the-clear as invalid for retrieval of the
> private query unless private authoritative DNS is confirmed to be
> unavailable.
> 
> There might be other effective mitigation besides a split cache,
> though.  for example, preferring private queries upstream in the first
> place for every query might offer some mitigation.
> 
> what do you think?
> 
>   --dkg

Mukund



Re: [dns-privacy] Use of separate caches for plain and secure transports

2018-12-13 Thread Daniel Kahn Gillmor
Hi Mukund--

On Tue 2018-12-11 11:13:39 +0530, Mukund Sivaraman wrote:
> During last night's meeting, there was talk about use of a split-cache -
> one with answers learned from plain transports and another with answers
> learned via secure transports.

I think i was the one that mentioned that there *could* be a security or
privacy issue there.  fwiw, i really want the answer here to be "don't
worry about it, use a single cache", because that makes implementations
significantly easier.

In the long run, there might even be privacy or security tradeoffs here,
and we might decide that they're prices worth paying for the additional
implementation simplicity -- i don't know.

I just want to try to ensure that we've at least thought about some
potential downsides and mapped them out.

The degenerate scenario i'd painted on the call was:

 * consider a DPRIVE-capable DNS resolver; for whatever reason, only a
   single request has been made to it since it booted.

 * a new cleartext (non-private) request comes in for foo.example, and
   it does the lookups it needs to do, all in the clear. (private
   queries to authoritatives would have worked, but they weren't tried
   since the initial request was in the clear anyway).

 * a subsequent private request comes in to the resolver, and the
   resolver responds without doing any upstream lookup.

in this scenario, a passive observer of the resolver's traffic can infer
that the private query was likely also for foo.example (or at least, for
one of the names that needed resolution in order to get an answer for
foo.example, like NS records).

So this is a privacy leak, which could be mitigated by treating the
cache of RRs-accessed-in-the-clear as invalid for retrieval of the
private query unless private authoritative DNS is confirmed to be
unavailable.
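
A minimal sketch of that mitigation, to make the lookup rule concrete.
Everything named here is an assumption: the class, the two fetch
callbacks, and the simplification of tracking "private authoritative DNS
is unavailable" per name rather than per authoritative server. Letting
cleartext clients read privately learned answers is also an assumption
and has not been analyzed.

```python
from typing import Callable, Dict, Optional, Set


class SplitCacheResolver:
    """Keeps cleartext-learned and privately learned answers apart."""

    def __init__(self,
                 fetch_private: Callable[[str], Optional[str]],
                 fetch_clear: Callable[[str], str]) -> None:
        # fetch_private returns None when no private transport is available.
        self._fetch_private = fetch_private
        self._fetch_clear = fetch_clear
        self._clear_cache: Dict[str, str] = {}
        self._private_cache: Dict[str, str] = {}
        self._no_private_upstream: Set[str] = set()

    def _clear_lookup(self, name: str) -> str:
        if name not in self._clear_cache:
            self._clear_cache[name] = self._fetch_clear(name)
        return self._clear_cache[name]

    def query(self, name: str, private: bool) -> str:
        if name in self._private_cache:
            return self._private_cache[name]
        if not private:
            return self._clear_lookup(name)
        # Private query: cleartext-learned data is usable only once private
        # authoritative DNS is confirmed unavailable for this name.
        if name in self._no_private_upstream:
            return self._clear_lookup(name)
        answer = self._fetch_private(name)
        if answer is not None:
            self._private_cache[name] = answer
            return answer
        self._no_private_upstream.add(name)
        return self._clear_lookup(name)
```

In the degenerate scenario above, the private request for foo.example
now triggers its own (private) upstream lookup instead of being answered
silently from the cleartext cache, at the cost of a second cache and an
extra round trip.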

There might be other effective mitigation besides a split cache,
though.  for example, preferring private queries upstream in the first
place for every query might offer some mitigation.

what do you think?

  --dkg

