On Tue, 6 Aug 2019 at 09:10, <mark.m...@gmail.com> wrote:
> Thanks Devon!

You're welcome!

> So just to clarify our request flow is:
> Client > CDN > Go Reverse Proxy > Origin
> Our Go Reverse Proxy has historically been responsible for adding caching 
> headers (e.g. Cache-Control and Surrogate-Control) when the origins have 
> failed to do so (as a way to ensure things are cached appropriately).
>> It's unclear to me why you should be setting an etag header if you're a 
>> proxy.
> That's why when it came to looking at setting serve stale defaults for our 
> origins (e.g. stale-while-revalidate and stale-if-error) I realized that 
> somewhere along the chain an appropriate ETag/Last-Modified should be set and 
> that's why I started wondering if our proxy should be responsible for setting 
> them.
> Even then I felt like setting Last-Modified was way outside the 
> responsibility of our proxy, but that maybe setting of ETag would have 
> sufficed.

Ah, I see. So you're still the content owner; you're just further
offloading work from between your origin and the CDN. Assuming you're
not still multi-tenant behind your proxy (i.e. your proxy only serves
_your_ assets), then I think it's probably reasonable for you to make
that determination at your proxy. And from that perspective, I agree
that you'd be more interested in ETag/INM than LM/IMS on your proxy.

>> Unless you're serving from the filesystem handler (which does
>> implement IMS/INM), you'll need to implement these yourself.
> I think your other related answers might explain to me why the go reverse 
> proxy doesn't support conditional requests, in that it's NOT a 'caching 
> proxy' and so being able to handle that revalidation logic wouldn't make 
> sense.

Right -- it boils down to whether a proxy is transparent or not. A
transparent proxy observes traffic and makes no changes to the
protocol or the discussion over it. The only impact it can really have
is if it stops servicing requests. A transparent proxy assumes that
both sides of the connection are speaking the same protocol, and so it
doesn't really have to know about protocol semantics.

A caching proxy isn't transparent. It can look transparent because it
has very good knowledge of the protocol it's proxying, but not every
request is passed through unmodified, so it's by definition opaque.

>> Note that you _could_ simply proxy this to the origin and let it
>> handle the validation. This is often overkill for what people actually
>> need, but it is guaranteed to work.
> OK, so as we are indeed just proxying the request pretty much 'as is' to the 
> origin, i.e. the CDN is making the revalidation conditional request when our 
> stale-while-revalidate TTL expires, I'm guessing (I appreciate this is the 
> 'basics' of how a proxy works, but I want to talk it through in case I'm 
> mistaken in any way!) the go proxy will transparently keep that information 
> for the origin to respond with the appropriate ETag/Last-Modified, and the go 
> proxy again will transparently pass back their response through to the CDN to 
> then update its cache if it indeed got a `200 OK` from origin or to continue 
> serving stale if the origin returned a `304 Not Modified` (and in either case 
> I expect the origin should send ETag/Last-Modified headers regardless of 
> 200/304 status).

It's been a while since I was working at a CDN (Fastly) so I may be a
bit fuzzy here; what you've written sounds like a correct
understanding. Again, as the proxy is transparent, its knowledge of
the protocol is really not meaningful; as long as your CDN and origin
both implement the protocol correctly, your transparent proxy will
also be by definition correct. (Though I'd note that one could argue
that X-Forwarded-For makes most HTTP proxies not strictly transparent;
it's also not super meaningful anyway except for logging to make sure
you understand your topology when things go wrong.)

As you've already got a CDN in the picture, it seems to me (especially
if you're using origin shielding) that it won't be super helpful for
you to implement LM/IMS or ETag/INM in your proxy. Lack of explicit
support for this in Go is therefore hopefully not an issue for you
because a CDN supporting stale-while-revalidate and stale-if-error
will already be shielding your origins from heaps of revalidation
requests.

However, it sounds like your origin isn't setting ETag or other cache
control headers everywhere it could. Adding strong ETags at your proxy
should be reasonably cheap since the CDN is shielding you from
revalidation storms, and it can also save you on your CDN's bandwidth
bill as it will allow your CDN to respond with 304s rather than 200s
for more objects.

>> A hash function over the body of the response would constitute strong
>> validation. I'm not sure why you'd need to mix in the path; there's
>> nothing wrong with serving the exact same content between two
>> endpoints, and the ETag is tied to a response object.
> Ah ok, so I was thinking along these lines, but was getting confused between 
> content that is cached vs content that is rendered at 'runtime' (e.g. I was 
> getting confused with the response containing a <script> tag that might 
> dynamically change the adverts on the page depending on the client and 
> wondering if that meant it wasn't "strong" validation just hashing the server 
> response body, but I guess it's redundant thinking like that because the 
> actual cached content is what's compared as far as the hash is concerned and 
> not what the client-side scripting is modifying).

Yeah, this is more to do with counters, timestamps, and ad links that
are inserted at page construction (e.g. an iframe that might contain
different URLs for an ad) than it is to do with scripts that modify
the DOM to pick a different ad service or a "proxy URL" that is
capable of serving ads from different vendors. It's really more to do
with whether the semantics of the ETag are purely based on content, or
whether they're based on something more abstract like a version. (And
then why you'd pick one over another has to do with those previously
mentioned points.)
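
A small sketch of the two flavours, assuming a SHA-256 body hash for
the content-based case and a made-up "version plus content-encoding"
scheme for the abstract case (strongETag, weakETag and the tag layout
are illustrative, not any standard):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// strongETag derives the validator purely from the response bytes, so any
// byte-level difference (a counter, a timestamp) yields a different tag.
func strongETag(body []byte) string {
	sum := sha256.Sum256(body)
	return `"` + hex.EncodeToString(sum[:]) + `"`
}

// weakETag derives the validator from a content version and the
// content-encoding. The W/ prefix marks it as weak: the tag only changes
// when the version is bumped, not when incidental bytes change.
func weakETag(version int, encoding string) string {
	return fmt.Sprintf(`W/"v%d-%s"`, version, encoding)
}

func main() {
	// Two renderings of the same page that differ only in a hit counter.
	a := []byte("<p>hits: 1</p>")
	b := []byte("<p>hits: 2</p>")

	fmt.Println(strongETag(a) == strongETag(b)) // false
	fmt.Println(weakETag(7, "gzip"))            // W/"v7-gzip"
}
```

With the content-based tag every counter tick invalidates the cached
object; with the version-based tag it stays valid until you decide the
content has changed materially.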


> On Tuesday, 6 August 2019 15:48:49 UTC+1, Devon H. O'Dell wrote:
>> Hi Mark,
>> Whether or not your proxy is caching, you may find RFC7234[1] relevant
>> in addressing some of your questions (as well as many you may later
>> encounter). I think you may find section 5.2 to be of particular
>> interest, though any proxy author should be familiar with the full
>> text.
>> On Tue, 6 Aug 2019 at 05:14, <mark...@gmail.com> wrote:
>> >
>> > Hello,
>> >
>> > I'm using Go's standard library reverse proxy and I'm trying to figure out 
>> > if the standard library HTTP web server (e.g. http.ListenAndServe) 
>> > implements the relevant conditional request handling logic for 
>> > ETag/Last-Modified headers.
>> >
>> > I did some Googling and noticed the HTTP file system request handler 
>> > (https://golang.org/src/net/http/fs.go) does implement that logic, but I 
>> > couldn't find the same for the HTTP web server.
>> >
>> > I also couldn't find any examples of setting ETags/Last-Modified (other 
>> > than this basic implementation for setting ETags: 
>> > https://github.com/go-http-utils/etag/blob/master/etag.go).
>> >
>> > What's confusing me there is the concept of "strong" and "weak" validation 
>> > and how certain scenarios might influence whether an ETag is marked as 
>> > either strong or weak (depending on the implementation -- see 
>> > https://developer.mozilla.org/en-US/docs/Web/HTTP/Conditional_requests#Validators).
>> >
>> > So to recap, my questions are (and I appreciate some of these are outside 
>> > the scope of just Go -- so apologies if that's not allowed in this forum):
>> I think this is a fine question for this list, which isn't constrained
>> to questions about Go itself but also covers how to achieve things
>> while using Go. Lines get blurred since many technologies touch each
>> other. I don't think any apologies are necessary :).
>> > 1. Should I set ETag/Last-Modified in a proxy? Last-Modified feels like 
>> > it's not the responsibility of the proxy but the origin, whereas an ETag
>> > is something I feel is "ok" to do in the proxy as a 'fallback' (as we 
>> > already set 'serve stale' caching headers on behalf of our origins if they 
>> > neglect to include them).
>> ETag and Last-Modified should be sent by the origin to any proxy to
>> let the proxy know when the content is stale (assuming the proxy is
>> caching). The only case in which a proxy might set these things is if
>> there are configurations provided by the content owner that allow the
>> proxy to determine what the lifetime of the response object is outside
>> of response headers. This is most useful in cases where the content is
>> synthetically generated by the proxy as a result of the content
>> owner's configuration. If you don't have such a system in place, your
>> proxy should never be generating these response headers, and you
>> should be working with your customers / users to help them understand
>> when to set cache control headers.
>> > 2. Do I need to implement `If-None-Match` and `If-Modified-Since` 
>> > behaviours myself (i.e. is it not provided by the Go standard library's 
>> > HTTP web server)?
>> Unless you're serving from the filesystem handler (which does
>> implement IMS/INM), you'll need to implement these yourself.
>> Note that you _could_ simply proxy this to the origin and let it
>> handle the validation. This is often overkill for what people actually
>> need, but it is guaranteed to work.
>> One trick that many CDN providers leverage is to offer their customers
>> the option to serve the stale object while revalidating it. If that
>> option is set, an asynchronous revalidation request is spawned -- new
>> requests are blocked on the completion of that request -- and the
>> potentially stale content is served to the original requester without
>> blocking that request on revalidation.
>> > 3. I was planning on setting an ETag header on the response from within 
>> > httputil.ReverseProxy#ModifyResponse but wasn't sure if that would be the 
>> > correct place to set it.
>> It's unclear to me why you should be setting an etag header if you're a 
>> proxy.
>> > 4. What constitutes a strong/weak validator (e.g. would a simple hash 
>> > function generating a digest of the URL path + response body suffice)?
>> A hash function over the body of the response would constitute strong
>> validation. I'm not sure why you'd need to mix in the path; there's
>> nothing wrong with serving the exact same content between two
>> endpoints, and the ETag is tied to a response object.
>> Weak validation is signified by an additional "W/" in the etag
>> identifier. In practice, this means that you mustn't use weak
>> identifiers for serving byte-range requests. Weak identifiers may be
>> more useful for dynamically generated content where you might for
>> example have a date added in, or an ad server link that is rotated
>> each time the page is served, or a counter, or something like this. An
>> example of weak validation would be something that is version and
>> encoding based -- each time the content changes materially, you'd
>> increment the version, and some identifier for the content-encoding
>> would also be mixed in.
>> > Thanks for any help/insights/opinions y'all can share with me.
>> >
>> > Kind regards,
>> > Mark
>> >
>> Hope that helps!
>> Kind regards,
>> --dho
>> [1]: https://tools.ietf.org/html/rfc7234
> --
> You received this message because you are subscribed to the Google Groups 
> "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to golang-nuts+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/golang-nuts/a8da59d3-c905-4e13-8d15-3792a36c2f61%40googlegroups.com.
