Hey - We've been using StaticURLParser to serve up our static files (under Pylons) but we've run into a few issues.
One of the biggest is the ETag header. The problem is that we're deploying across a bunch of machines in a cluster, and static files get pulled out of SVN, which timestamps each file with the time it was checked out. This means that each machine has a different ETag for the same file, because each copy has a different date.

Now one option is to fix our deployment to go re-touch the files after they've been checked out, to match their timestamps in SVN. But another option is to do MD5-based etagging... so the ETag really reflects the contents of the file, not the date that happens to be on disk.

This brings up a few problems with StaticURLParser/fileapp:

- Headers are not really configurable. Frankly a last-ditch effort would be to stop including the ETag header altogether, but Paste makes this extraordinarily hard - I can monkeypatch it but it's really not pretty. I can write middleware, but it seems like overkill to write middleware just to remove a header.
- Really, if the ETag was the MD5 of the file, then the ETag would be consistent across the cluster. This technique is described here:
  http://dev.aol.com/implementing-atom-publishing-protocol-python-wsgi

The trick with doing MD5 is how/when do you calculate the MD5 hash to compare it to If-None-Match? Clearly MD5 hashing is more expensive than just stat()ing a file. I can think of a few possibilities:

- an in-memory cache mapping resources -> hashes: calculate the MD5 hash when you serve the file for the first time, and remember it after that
- hash the file unconditionally: if you assume that your request is ultimately bound more by network traffic than by the cost of reading the file off disk, then it's still cheaper to pass the whole file through RAM and return a 304 Not Modified than it would be to serve the whole file over the network.. but if you actually end up serving it (i.e. the ETag doesn't match) then it's going to be hard not to read the file a second time in order to serve it up.
- store the MD5 hash persistently somewhere. Perhaps just by appending .md5 to the filename - if that file exists, assume it contains the right MD5 hash

Anyway, obviously this is tricky, but I'm curious if anyone else has tackled this issue, or if anyone would consider adding some kind of MD5 support like those listed above to Paste/StaticURLParser/fileapp - i.e. what if you could make StaticURLParser just look for a .md5 file on disk, and use it if it found one, and otherwise fall back to the current mechanism?

Alec
_______________________________________________
Paste-users mailing list
[email protected]
http://webwareforpython.org/cgi-bin/mailman/listinfo/paste-users
