On 21 Oct 2012, at 17:28, Urs Holzer wrote:
> I'd like to implement Transparent Content Negotiation as described in 
> RFC 2295. Is this a good idea?

I'm no expert on HTTP, but it seems to me that you would first have to
implement normal HTTP 1.1 caching of negotiated responses.[1] (See my
analysis of polipo's current behaviour below.)

Out of curiosity: What is the status / adoption of RFC 2295? Apache
calls it an "experimental protocol", and "supports 'transparent' content
negotiation, [but] does not support [RFC 2295's] 'feature
negotiation'."[2]

[1] http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.6
[2] http://httpd.apache.org/docs/2.4/content-negotiation.html#about
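
To make the caching requirement from [1] a bit more concrete, here is a
minimal sketch of the matching rule as I read RFC 2616 section 13.6: a
stored variant may only be reused if every request header named in its
Vary header has the same value in the new request as in the original
one, and "Vary: *" never matches. None of this is polipo code; struct
header, find_header and vary_matches are names I made up for the
example, and comparing values byte for byte is a conservative
simplification (the RFC permits some whitespace normalisation).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical header representation; not polipo's own data structures. */
struct header {
    const char *name;
    const char *value;
};

/* Case-insensitively compare a NUL-terminated header name against the
   first len bytes of field. */
static int
name_equals(const char *name, const char *field, size_t len)
{
    size_t i;
    if(strlen(name) != len)
        return 0;
    for(i = 0; i < len; i++) {
        if(tolower((unsigned char)name[i]) !=
           tolower((unsigned char)field[i]))
            return 0;
    }
    return 1;
}

/* Return the value of the named header, or NULL if it is absent. */
static const char *
find_header(const struct header *h, size_t n, const char *field, size_t len)
{
    size_t i;
    for(i = 0; i < n; i++) {
        if(name_equals(h[i].name, field, len))
            return h[i].value;
    }
    return NULL;
}

/* Return 1 if a response stored with the given Vary value, for a request
   carrying the headers in stored, may be reused for a request carrying
   the headers in fresh; return 0 otherwise. */
int
vary_matches(const char *vary,
             const struct header *stored, size_t nstored,
             const struct header *fresh, size_t nfresh)
{
    const char *p = vary;
    assert(vary != NULL);
    while(*p) {
        const char *start, *end, *a, *b;
        /* Skip separators and leading whitespace. */
        while(*p == ',' || *p == ' ' || *p == '\t')
            p++;
        if(!*p)
            break;
        start = p;
        while(*p && *p != ',')
            p++;
        end = p;
        /* Trim trailing whitespace from the field name. */
        while(end > start && (end[-1] == ' ' || end[-1] == '\t'))
            end--;
        /* "Vary: *" always fails to match. */
        if(end - start == 1 && *start == '*')
            return 0;
        a = find_header(stored, nstored, start, (size_t)(end - start));
        b = find_header(fresh, nfresh, start, (size_t)(end - start));
        if(a == NULL && b == NULL)
            continue;           /* absent in both requests: matches */
        if(a == NULL || b == NULL || strcmp(a, b) != 0)
            return 0;
    }
    return 1;
}
```

A cache that keeps several variants per URL would run a check like this
against each stored variant and only go to the origin server when none
of them matches.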


> I have not yet worked a lot with C, so it 
> would take some time for me to try to implement it.

The main concern would be not to break existing polipo functionality,
and the best way to be sure of that is an automated test suite. Before
making any code changes, I would start with figuring out a way to run
curl's test suite[3] against polipo. It shouldn't be too difficult;
I've managed to run a couple of curl's tests against polipo before.[4]

[3] https://github.com/bagder/curl/tree/master/tests
[4] http://permalink.gmane.org/gmane.comp.web.polipo.user/2899


> As I understand it, Polipo caches only one variant for a negotiated 
> resource and serves this one based on the Vary, Content-* and Accept-* 
> headers. Is this correct?

I take it that you have read what the polipo manual has to say about the
Vary header.

Grepping through the sources of polipo reveals the following (all line
numbers are for revision ec4d6385 on the master branch):

* The HTTP parser sets cache_control.flags |= CACHE_VARY when the
  response contains a Vary header (http_parse.c:1206). I couldn't see
  polipo checking the _value_ of the Vary header anywhere, so it
  apparently doesn't cache different variants of the response to fulfil
  subsequent requests based on the new request's Accept-* headers.

* objectMustRevalidate (object.c) returns true if CACHE_VARY is set,
  unless you configure mindlesslyCacheVary=true.

* If the response had CACHE_VARY and an ETag, polipo can serve from
  cache but first validates with a "conditional" GET
  (client.c:1244,1252). A "conditional" GET is a GET request with
  If-Modified-Since / If-None-Match headers (server.c:1661).

* If the response had CACHE_VARY but no ETag, polipo sets CACHE_MISMATCH
  (http.c:1053). CACHE_MISMATCH forces a full GET request, as opposed to
  a conditional GET (client.c:1247).

* There are additional CACHE_VARY checks in validateEntry (diskcache.c)
  and httpTweakCacheability (http.c) that I haven't read through.

(Superficially at least, it seems there is some redundancy in the above
checks. But I didn't really study the source code in any depth.)
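
Putting the points above together, the decision polipo seems to make
for a cached response can be sketched like this. This is my summary of
the behaviour, not code taken from polipo: SKETCH_CACHE_VARY,
enum fetch_action and decide_fetch are invented for the illustration,
the flag value is arbitrary, and the sketch ignores ordinary freshness
checks entirely.

```c
/* Illustrative flag value only; polipo's real CACHE_VARY definition
   may differ. */
#define SKETCH_CACHE_VARY 0x1

/* What the proxy does with a cached response (hypothetical enum). */
enum fetch_action {
    SERVE_FROM_CACHE,   /* Vary imposes no revalidation */
    CONDITIONAL_GET,    /* revalidate with If-None-Match etc. */
    FULL_GET            /* refetch the whole response */
};

/* Decide how to satisfy a request for a cached response, looking only
   at the Vary-related conditions described in the list above. */
enum fetch_action
decide_fetch(int cache_flags, int has_etag, int mindlessly_cache_vary)
{
    /* No Vary header was seen: nothing Vary-specific to do. */
    if(!(cache_flags & SKETCH_CACHE_VARY))
        return SERVE_FROM_CACHE;
    /* mindlesslyCacheVary=true switches off the forced revalidation. */
    if(mindlessly_cache_vary)
        return SERVE_FROM_CACHE;
    /* With a validator, a conditional GET is enough. */
    if(has_etag)
        return CONDITIONAL_GET;
    /* Without one, the response has to be fetched again in full. */
    return FULL_GET;
}
```

Transparent negotiation would presumably replace the first two branches
with a lookup among several stored variants, which is why the plain
HTTP 1.1 Vary handling looks like the natural first step.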

To better understand the current behaviour, you could also try enabling
logging levels L_VARY, D_SERVER_REQ and D_CLIENT_REQ (the last two
require modifying LOGGING_MAX in log.h and recompiling) and running some
manual tests of your own.


Hope this helps in some way,
Dave.


_______________________________________________
Polipo-users mailing list
Polipo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/polipo-users