danielcweeks commented on PR #15428:
URL: https://github.com/apache/iceberg/pull/15428#issuecomment-4014002209

   > > The only two requests that should be cached are HEAD and GET.
   > 
   > But this rule isn't _enforced_. Nothing in the specs prevents a server 
from sending a `Cache-Control: private` header for other methods. And by doing 
so, it would break the client. I'm not saying it makes sense to do so, I'm 
saying that it's not fair for a server to break the client so easily. If the 
client's cache is designed to only handle these 2 methods and nothing else, I 
think the client should make sure to filter out other methods. It seems a bit 
pointless to me to require from servers to send the `Cache-Control` header when 
the client already knows what requests it can and cannot cache.
   
   I believe the default is that the client doesn't cache unless told to do so, 
which makes caching a server responsibility.  While it might make sense to 
limit to just the two methods we expect, it shouldn't be the client's 
responsibility to fix a bad server implementation.  Yes, the client would 
break, but it's really the server that needs to be fixed.
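
   To make the two positions concrete, here is a minimal sketch, not the code in this PR. `SignCachePolicy`, `serverAllowsCaching`, and `shouldCache` are hypothetical names, and the sketch assumes the only opt-in signal is a `private` directive in the response's `Cache-Control` header. The first method is the server-driven default described above; the second adds the optional client-side method filter.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Iceberg.
final class SignCachePolicy {
  // The two methods the discussion expects to be cacheable.
  private static final Set<String> CACHEABLE_METHODS = Set.of("GET", "HEAD");

  private SignCachePolicy() {}

  // Server-driven default: cache only when the response opts in.
  // Assumption: opting in means Cache-Control carries a bare "private" directive.
  static boolean serverAllowsCaching(String cacheControl) {
    if (cacheControl == null) {
      return false; // no header, so the client does not cache
    }
    return Arrays.stream(cacheControl.split(","))
        .map(directive -> directive.trim().toLowerCase(Locale.ROOT))
        .anyMatch("private"::equals);
  }

  // Optional client-side guard: also require the method to be GET or HEAD.
  static boolean shouldCache(String method, String cacheControl) {
    return method != null
        && CACHEABLE_METHODS.contains(method.toUpperCase(Locale.ROOT))
        && serverAllowsCaching(cacheControl);
  }
}
```

   With only `serverAllowsCaching`, a server that sends `Cache-Control: private` on a `PUT` response gets that response cached; with `shouldCache`, the client ignores the header for anything other than `GET` and `HEAD`.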
    
   > Yes but in fact, the most problematic scenario for me is a `GET` request 
with a `range` header. If a server decides to sign the `range` header (which is 
imho totally valid), the client would break. The prevailing philosophy is that 
"the server decides what to sign," but in reality, the server's control 
appears limited due to potential client-side cache issues. Again, it appears to 
me that, if the client already knows that it would break if the server signs 
some header, it's best for the client to proactively remove that header from 
the request to sign.
   
   I think that's putting too much control in the client's hands and limits what 
functionality the server has in deciding what to sign for.  If a client "hides" 
the range header, a server would only have the option to sign for everything or 
nothing.  While in practice I don't know of any implementation that protects 
ranges of files, it is entirely feasible, and since it's the server's 
responsibility to protect the data, it should have the final say on what it 
allows to be read.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

