> -----Original Message-----
> From: Nathan Eisenberg
> Sent: Thursday, November 26, 2009 11:19 AM
> To: '[email protected]'
> Subject: RE: [tahoe-dev] 'Client' caching?
>
> > From: [email protected] [mailto:tahoe-dev-[email protected]] On Behalf Of Brian Warner
> > Sent: Thursday, November 26, 2009 11:09 AM
> > To: [email protected]
> > Subject: Re: [tahoe-dev] 'Client' caching?
> >
> > Welcome! Yes, this is the perfect place for questions like this.
> >
> > It's useful to note the distinction between mutable and immutable
> > files here. Caching immutable files is perfectly safe: it's a simple
> > tradeoff between local storage consumption and performance, assuming
> > a given locality-of-reference / reader behavior. Caching *mutable*
> > files (or directories), on the other hand, is not safe: the tradeoff
> > includes a correctness aspect, since someone else might change the
> > contents and you might use your (now stale) cached copy. In general,
> > we try to avoid putting any sorts of heuristics about correctness
> > into tahoe itself, so any caching layer that requires a decision on
> > a correctness-vs-performance policy would need to be placed above
> > the tahoe node.
> >
> > Hm, I'm surprised that didn't work. Do you know what caused Squid to
> > believe the file was changing each time? Maybe a quick peek at the
> > returned headers would be informative.
> >
> > Basically, any URL that starts with /uri/URI:CHK: should be immutable
> > and an ideal choice for caching. We'd planned (although I can't check
> > right now to see whether we got around to implementing it or not) to
> > add an ETag: header with the file's UEB header, which is basically a
> > hash of the contents, and thus an ideal etag. And I know we aren't
> > intentionally adding any Cache-Control or date headers that might
> > make the file look uncacheable.
> >
> > (for mutable files, there is a similar value called the "roothash"
> > which covers the file contents, and allows If-ETag-Differs: -type
> > queries to do the right thing, but I don't know if we've actually
> > implemented that either).
> >
> > hope that helps,
> > -Brian
>
> Hello Brian,
>
> Yep, I'm currently only interested in immutable files. I might be
> missing out on functionality by doing so, but I've been trying to get
> up to speed on Tahoe rapidly, which means picking things to hold off on
> attempting to wrap my brain around. I can see why caching mutable
> files would be bad, though!
>
> I'm not very familiar with debugging Squid, but here's what I saw in
> the access logs:
>
>
> and the store logs:
>
> 1259221695.460 RELEASE 00 00002F1D 687E491AEFFD986129E9BC7FFF6EF9D2 200 1259221693 -1 -1 text/plain 620888/620888 GET http://x.x.10.44/uri/(URI)
> 1259221695.477 RELEASE 00 00002F1E 5593E0D36660EBADEE217CE11710366E 200 1259221693 -1 -1 text/plain 620888/620888 GET http://x.x.10.44/uri/(URI)
> 1259221695.517 RELEASE 00 00002F1F EABE7B59FF7FCC4197BB11E7135F136C 200 1259221693 -1 -1 text/plain 620888/620888 GET http://x.x.10.44/uri/(URI)
> 1259221695.532 SWAPOUT 00 00002F20 72B9FDA4BB757341E001D528A8E6DE56 200 1259221693 -1 -1 text/plain 620888/620888 GET http://x.x.10.44/uri/(URI)

Buggery Outlook shortcuts, sorry for the double post... Whatever I hit sent my message before I was done typing!

Here's what was in the access logs:

x.x.10.13 - - [26/Nov/2009:07:48:15 +0000] "GET http://x.x.10.44/uri/(URI) HTTP/1.0" 200 621198 "-" "ApacheBench/2 " TCP_REFRESH_MISS:FIRST_UP_PARENT
x.x.10.13 - - [26/Nov/2009:07:48:15 +0000] "GET http://x.x.10.44/uri/(URI) HTTP/1.0" 200 621198 "-" "ApacheBench/2 " TCP_REFRESH_MISS:FIRST_UP_PARENT
x.x.10.13 - - [26/Nov/2009:07:48:15 +0000] "GET http://x.x.10.44/uri/(URI) HTTP/1.0" 200 621198 "-" "ApacheBench/2 " TCP_REFRESH_MISS:FIRST_UP_PARENT
x.x.10.13 - - [26/Nov/2009:07:48:15 +0000] "GET http://x.x.10.44/uri/(URI) HTTP/1.0" 200 621198 "-" "ApacheBench/2 " TCP_REFRESH_MISS:FIRST_UP_PARENT

And the store logs, again:

1259221695.460 RELEASE 00 00002F1D 687E491AEFFD986129E9BC7FFF6EF9D2 200 1259221693 -1 -1 text/plain 620888/620888 GET http://x.x.10.44/uri/(URI)
1259221695.477 RELEASE 00 00002F1E 5593E0D36660EBADEE217CE11710366E 200 1259221693 -1 -1 text/plain 620888/620888 GET http://x.x.10.44/uri/(URI)
1259221695.517 RELEASE 00 00002F1F EABE7B59FF7FCC4197BB11E7135F136C 200 1259221693 -1 -1 text/plain 620888/620888 GET http://x.x.10.44/uri/(URI)
1259221695.532 SWAPOUT 00 00002F20 72B9FDA4BB757341E001D528A8E6DE56 200 1259221693 -1 -1 text/plain 620888/620888 GET http://x.x.10.44/uri/(URI)

I presume that the 5th column is some sort of version hash that identifies when a file was modified. More than likely, there's a way of telling Squid to 'shut up and cache the file for x seconds', but I couldn't determine what it was.

In any event, I'm pretty happy with the performance of Apache with mod_proxy and mod_cache. It also lets me filter out the 'root' Tahoe interface, so that I can present things to the end user differently (I'm thinking of building a very basic user interface, since the existing one is really too technical for Joe customer.)

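As a rough way to check the presumption about the 5th column, here is a minimal Python sketch that splits one of the store log lines above into named fields. It assumes the commonly documented Squid store.log layout (time, action, dir number, file number, hash, status, Date, Last-Modified, Expires, type, sizes, method, URL); the field names and the helpers parse_store_line and has_freshness_info are illustrative only and not part of Squid. If that layout applies here, the 5th column would be a hash of the cache key rather than a file version, and the two -1 values would mean that no Last-Modified or Expires header was recorded, which might explain the repeated TCP_REFRESH_MISS.

```python
from typing import NamedTuple


class StoreEntry(NamedTuple):
    """One store.log line, using an ASSUMED 13-column layout."""
    timestamp: float
    action: str
    dir_number: str
    file_number: str
    key_hash: str        # assumed: hash of the cache key, not of the content
    status: int
    date: int            # assumed: Date header as epoch seconds, -1 if absent
    last_modified: int   # assumed: Last-Modified header, -1 if absent
    expires: int         # assumed: Expires header, -1 if absent
    content_type: str
    sizes: str
    method: str
    url: str


def parse_store_line(line: str) -> StoreEntry:
    """Split a whitespace-separated store.log line into named fields."""
    parts = line.split()
    if len(parts) != 13:
        raise ValueError(f"expected 13 fields, got {len(parts)}")
    return StoreEntry(
        float(parts[0]), parts[1], parts[2], parts[3], parts[4],
        int(parts[5]), int(parts[6]), int(parts[7]), int(parts[8]),
        parts[9], parts[10], parts[11], parts[12],
    )


def has_freshness_info(entry: StoreEntry) -> bool:
    """True if either Last-Modified or Expires was recorded (not -1)."""
    return entry.last_modified != -1 or entry.expires != -1


# First store log line from the message above, copied verbatim.
SAMPLE = (
    "1259221695.460 RELEASE 00 00002F1D 687E491AEFFD986129E9BC7FFF6EF9D2 "
    "200 1259221693 -1 -1 text/plain 620888/620888 GET "
    "http://x.x.10.44/uri/(URI)"
)

if __name__ == "__main__":
    entry = parse_store_line(SAMPLE)
    print(entry.action, entry.key_hash, has_freshness_info(entry))
```

Under these assumptions, the four log lines differ only in timestamp, file number and hash, while the Last-Modified and Expires columns stay at -1 throughout, so comparing the actual response headers against those two columns would be a reasonable next step.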
Best Regards,
Nathan Eisenberg

_______________________________________________
tahoe-dev mailing list
[email protected]
http://allmydata.org/cgi-bin/mailman/listinfo/tahoe-dev
