On Mon, Dec 17, 2018 at 1:33 PM Warren Kumari <[email protected]> wrote:
>
>
>
> On Fri, Dec 14, 2018 at 4:05 PM Christopher Wood 
> <[email protected]> wrote:
>>
>> On Dec 14, 2018, 12:29 PM -0800, Daniel Kahn Gillmor 
>> <[email protected]>, wrote:
>>
>> On Fri 2018-12-14 11:47:58 -0800, Christopher Wood wrote:
>>
>> On Dec 14, 2018, 10:47 AM -0800, Wes Hardaker <[email protected]>, wrote:
>>
>> [And, no, we shouldn't go down the road of "privacy requires you disable
>> the cache"]
>>
>>
>> Would you mind elaborating on this comment? As you observe, caches are
>> harmful to privacy. Refusal to disable the cache in any (?)
>> circumstance therefore seems dismissive of user privacy. Perhaps you
>> mean turning it off for every query is not a viable path forward?
>>
>>
>> I hope Wes will answer this question on his own, but i wanted to note
>> that privacy is not only harmed by caches. it can also be helped by
>> caches.
>>
>> A query for any name will typically radiate *less* information into the
>> world if it's answered from a cache, simply because the resolver in
>> question doesn't create additional traffic.
>>
>> In particular, if the cache is already well-populated, and queries are
>> padded appropriately, and the name is relatively likely to be in-cache,
>> then the only parties that know what was looked up are the client and
>> the resolver itself. No authoritative servers or network observers have
>> any additional information to distinguish the query from any other
>> cache-resolved query handled by the resolver.
>>
>> So i don't think caching itself offers a clear benefit or harm for
>> privacy. One advantage of a resolver is that it effectively acts as a
>> mixing/semi-anonymizing agent on behalf of its users. Assuming that the
>> resolver itself is not compromised, it can buffer its users from the
>> authoritative servers.
>>
>>
>> Yes, of course, thanks for clarifying the other piece of this puzzle! This 
>> is indeed a benefit. However, I am not convinced this yields a greater net 
>> benefit than disabling caching. (I am not aware of any such study or 
>> analysis on this problem.) That said, all of this depends entirely upon the 
>> threat model, which can vary greatly.
>
>
> If you disable the cache, and can see that there is an (encrypted) input
> query and then immediately an (encrypted) output query to 208.80.154.238
> (ns0.wikimedia.org), you know with very high likelihood what the input query
> was for.

Agreed, though I think leaking the origin through the address is an
issue regardless of whether the cache is shared or not.
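
To make the timing correlation concrete, here is a minimal sketch of what a
passive observer could do with nothing but packet timestamps and addresses.
The `correlate` function, the event format, the 50 ms window, and the client
address are all assumptions made up for illustration; only the
ns0.wikimedia.org address comes from the thread.

```python
def correlate(events, window=0.05):
    """Pair each inbound client query with the first outbound query seen
    within `window` seconds of it.

    `events` is a time-sorted list of (timestamp, direction, peer) tuples as
    a passive observer might record them. Payloads are encrypted and unused.
    """
    guesses = []
    for i, (t_in, direction, client) in enumerate(events):
        if direction != "in":
            continue
        for t_out, d, server in events[i + 1:]:
            if t_out - t_in > window:
                break  # nothing left the resolver soon enough to link
            if d == "out":
                guesses.append((client, server))
                break
    return guesses


# Hypothetical trace: the client address is invented for this example.
trace = [
    (0.000, "in", "192.0.2.10"),
    (0.002, "out", "208.80.154.238"),  # cache miss: outbound follows at once
    (1.000, "in", "192.0.2.10"),       # cache hit: no outbound query follows
]
print(correlate(trace))
```

The first client query is linked to the authoritative server's address purely
by timing; the second, answered from cache, leaves the observer nothing to
pair it with.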

> If you have a shared cache, there is a much higher likelihood that the input 
> query gets answered from cache (especially for higher popularity names) and 
> so there is no output query to correlate with. Techniques which refresh the 
> cache before the TTL has expired (à la HAMMER) further thwart correlation 
> attacks.

I agree in principle, yet it seems TTL-based stub cache refresh
mechanisms could be implemented regardless of whether or not there's a
shared resolver cache. (Please correct me if I misunderstood your
point!)
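
As a rough illustration of refreshing before TTL expiry, the sketch below
counts upstream queries for a single name with and without a timer-driven
refresh. It is a toy model under stated assumptions, not a description of
HAMMER's actual algorithm: `simulate`, `prefetch_margin`, the 300 s TTL, and
the rule that the timer keeps refreshing indefinitely are all hypothetical.
Nothing in it depends on whether the cache sits in a stub or in a shared
resolver.

```python
def simulate(query_times, ttl=300.0, prefetch_margin=None):
    """Return (miss_fetches, background_refreshes) for one cached name.

    miss_fetches: upstream queries sent in direct response to a client query.
    background_refreshes: upstream queries sent by a timer shortly before
    expiry, independent of when client queries arrive.
    """
    expires = None
    misses = 0
    refreshes = 0
    for t in sorted(query_times):
        if expires is not None and prefetch_margin is not None:
            # Replay the timer-driven refreshes that fired up to time t.
            while expires - prefetch_margin <= t:
                refreshes += 1
                expires = (expires - prefetch_margin) + ttl
        if expires is None or t >= expires:
            # Cache miss: the upstream query follows the client query.
            misses += 1
            expires = t + ttl
    return misses, refreshes


# Hypothetical workload: one client query every 10 s for about 16 minutes.
times = [i * 10.0 for i in range(100)]
print(simulate(times))                        # cache only
print(simulate(times, prefetch_margin=30.0))  # cache plus timed refresh
```

In this model the refresh does not reduce the total number of upstream
queries; it moves them off the client's timing, so only the very first lookup
is sent in direct response to a client query.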

In my opinion, the tradeoffs between enabling and disabling caching
are not well studied. (Thanks to Wes for sharing a pointer to his
paper, which scratches the surface of this interesting problem.) We
need more work before we can understand these tradeoffs and choose the
"right" answer.

Best,
Chris

_______________________________________________
dns-privacy mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/dns-privacy
