On Wed, Feb 19, 2014 at 1:02 PM, Alex Rousskov <rouss...@measurement-factory.com> wrote:
> On 02/19/2014 12:42 PM, Rajiv Desai wrote:
>> On Wed, Feb 19, 2014 at 11:09 AM, Alex Rousskov
>> <rouss...@measurement-factory.com> wrote:
>>> On 02/19/2014 03:11 AM, Rajiv Desai wrote:
>>>> I am interested in adding functionality to squid to optionally add
>>>> objects from PUT requests to cache. Has there been any related work
>>>> done in the past or is being pursued currently that I can use as
>>>> reference?
>>>
>>> Just to make sure we are all on the same page, do you want Squid to take
>>> the body of a PUT request and store it in the cache so that subsequent
>>> GET requests for the same URI will result in a cache hit?
>>
>> Yes.
>>
>>> If yes, what response headers do you want Squid to use when caching that
>>> PUT body?
>>>
>>
>> The GET response only requires Content-Length to be accurate. Other
>> time values can use Date from the PUT request header.
>> The expiry time does not matter but can be set to a very large value
>> (never expires).
>
> I believe the cached PUT entity should get entity headers from the PUT
> request, reusing them as response entity headers. RFC 2616 Section 9.6
> seems to suggest that.
>
>
>>> Will the PUT body contain response headers?
>>>
>> The PUT body does not contain response headers. It simply contains the
>> object.
>> PUT header has the following :
>>
>> PUT /mag-1363987602-cmbogo/c9e935e0-10812585 HTTP/1.1
>> Host: s3-us-west-1.amazonaws.com
>> Accept: */*
>> Content-MD5: o8VChHm6LUVSQNSFg57DSA==
>> Content-Type: application/octet-stream
>> Date: Wed, 19 Feb 2014 19:30:19 GMT
>> Content-Length: 10256
>> Expect: 100-continue
>>
>>
>>> What is your use case? That is, why do you want this feature?
>>>
>>
>> I currently use squid as a caching gateway (forward proxy) for
>> uploads(PUTs) and downloads(GETs) to/from an object store (eg: AWS
>> S3).
>> In a branch office when one client uploads content, other clients (or
>> even the same client) should be able to fetch content from the squid
>> cache to accelerate downloads.
>> These objects are typically 64KB in size and are immutable so no
>> freshness/expiry checks are required. So, if a PUT request is accepted
>> by the server, the object uploaded should be cached by squid and
>> subsequent GETs for these objects should be HITs.
>
>
>
> Thank you for detailing your use case.
>
> I believe this can be supported, but it will not be easy. You probably
> should add write-to-store support to the Squid HTTP server (the code
> currently residing in client_side*cc and related files) but all of the
> examples doing so live in Squid HTTP clients (the code currently
> residing in Server.cc and http.cc). Yes, I know this sounds backwards.
> It will take some effort to extract reusable code (if any) into a class
> and use that class in servers and clients, but it is possible.
>
> The alternative approach is to add write-request-body-to-store to Squid
> client code that already deals with writing response to store. However,
> I believe that doing so will be even more confusing and, technically,
> wrong because the next hop may not even be HTTP in some use cases. You
> should store the body as Squid gets it from the HTTP client, not when
> Squid forwards it to the next hop.
>
> I am not aware of any existing code in that direction, but you should
> double check by searching old postings and Squid2 change logs. I know
> this question has been asked several times before (and some recent
> answers contradict my suggestions in this email :-).
>
>
> If you decide to code this feature, you may want to start by looking at
> ServerStateData::setFinalReply() and ServerStateData::storeReplyBody().
> Those two methods and ALL,9 cache.log analysis when caching a simple
> response may help you find most of the necessary Store APIs. Again,
> handling all corner cases correctly is not going to be easy.
>
>
> HTH,
>
> Alex.
>
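
For reference, the behaviour discussed above can be modelled outside Squid
with a small toy proxy. This is only a sketch under stated assumptions, not
Squid code: PutCachingProxy, CachedObject, handle_put, handle_get and the
forward_put callback are hypothetical names made up for illustration, and
the in-memory dict stands in for the real Store. It captures the body as it
arrives from the client, caches it only if the next hop accepts the PUT
with a 2xx status, and reuses the PUT request's entity headers (plus Date)
as the cached response headers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# Entity header fields listed in RFC 2616 section 7.1 (lowercased).
ENTITY_HEADERS = {
    "allow", "content-encoding", "content-language", "content-length",
    "content-location", "content-md5", "content-range", "content-type",
    "expires", "last-modified",
}


@dataclass
class CachedObject:
    headers: Dict[str, str]  # header names stored lowercased
    body: bytes


class PutCachingProxy:
    """Hypothetical toy model; none of these names exist in Squid."""

    def __init__(self, forward_put: Callable[[str, Dict[str, str], bytes], int]):
        # forward_put sends the request to the next hop and returns its status.
        self._forward_put = forward_put
        self._store: Dict[str, CachedObject] = {}

    def handle_put(self, uri: str, headers: Dict[str, str], body: bytes) -> int:
        # The body used for caching is the one received from the client,
        # independent of how (or over which protocol) it is forwarded.
        status = self._forward_put(uri, headers, body)
        if 200 <= status < 300:
            lowered = {name.lower(): value for name, value in headers.items()}
            entity = {n: v for n, v in lowered.items() if n in ENTITY_HEADERS}
            # Content-Length must be accurate for later GET responses.
            entity["content-length"] = str(len(body))
            # Reuse the PUT request's Date for time values, if present.
            if "date" in lowered:
                entity["date"] = lowered["date"]
            self._store[uri] = CachedObject(entity, body)
        return status

    def handle_get(self, uri: str) -> Tuple[str, Optional[CachedObject]]:
        # Objects are assumed immutable, so no freshness check is modelled.
        obj = self._store.get(uri)
        return ("HIT", obj) if obj is not None else ("MISS", None)
```

With a forward_put callback that returns 200, a handle_put() followed by
handle_get() on the same URI yields a HIT carrying the uploaded bytes; with
a callback that returns an error status, nothing is stored and the GET is a
MISS. Request-only headers such as Expect and Host are not copied into the
cached entry.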
Thanks a lot for the pointers!

-Rajiv