[ https://issues.apache.org/jira/browse/SOLR-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12562803#action_12562803 ]
Hoss Man commented on SOLR-127:
-------------------------------

bq. Ad 2.: Whatever we choose: Two things must be linked: changed index and/or changed config must change the Etag and the Last-Modified

I'm not sure that this is strictly true ... if something changes the ETag, then the Last-Modified should also change, but if the Last-Modified changes the ETag doesn't necessarily have to change. consider use cases where solrconfig.xml never changes: we can use openTime for Last-Modified (in case we have to roll back to an older index), and indexVersion for the ETag - bouncing the server will change the Last-Mod because a new searcher is opened, but the ETag won't change because the index hasn't changed.

here's what i'm thinking...

* two new options (we can probably think of better names for these)...
*# lastModFrom="openTime|dirLastMod" ... default is dirLastMod
*# cacheHeaderSeed="[some date format]" ... default is epoch
* headers are computed as...
** Last-Modified = max(lastModFrom, cacheHeaderSeed) ... where lastModFrom is computed using the specified value
** ETag is a hashcode of the indexVersion and cacheHeaderSeed
* resulting behavior...
** Users who aren't picky get the default, where slaves with identical snapshots will have identical ETags and Last-Mod headers.
** Changing configs by default won't immediately change the ETag or Last-Mod header ... if you've got an index that changes semi-regularly you can just touch the index to get new headers, or you can add the cacheHeaderSeed option with a timestamp value to force new headers on startup.
** if you are super paranoid about making sure your headers are always a perfect reflection of reality (even if you roll back your index to an older copy) use lastModFrom="openTime" and update the cacheHeaderSeed option every time you change your config ... downside being that in multi-slave setups every machine will generate a different Last-Mod (but the ETags should be the same)

...thoughts?

bq.
One comment only: change must-revalidate="" to must-revalidate="true/false" . For no-store/no-cache as well.

yeah, that's what i was thinking originally, except i wanted to leave out any special knowledge about what the attributes were (ie: no hardcoded list of directive names) ... any XML attribute in the config would automatically become a directive in the header value; if it had a value in the config, it would have a directive value in the header...

{code}
<cacheControl max-age="23" no-cache="" no-store="" must-revalidate="" private="Foo" asdf="" qwert="666" />
...becomes...
Cache-Control: max-age="23", no-cache, no-store, must-revalidate, private="Foo", asdf, qwert="666"
{code}

...that way we don't have to worry about any HTTP extensions, people can put anything they freaking want in their Cache-Control header.

What i forgot until today though is that the numeric directives in the Cache-Control header aren't supposed to be quoted (ie: max-age=23 ... not max-age="23") ... so that won't work very easily either.

So then i started thinking maybe we use the named list syntax, and let the data type tell us whether the value should be quoted (<str>) or not (<int>) ... but that seems awfully verbose for something this simple ... so now i'm wondering if maybe we should just make it one big string and use a regex to look for max-age so we can set the Expires header as well.

I'm liking the simple string + regex approach personally.
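
For illustration only, here is a minimal Java sketch of the two ideas in this comment: computing Last-Modified and ETag from lastModFrom, indexVersion and cacheHeaderSeed, and pulling max-age out of a single Cache-Control string with a regex to derive Expires. The class name, method names, hash formula and ETag format are all assumptions made for this example; none of them come from Solr or from the attached patches.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of Solr or the HTTPCaching patches. */
final class CacheHeaderSketch {

    // Finds an unquoted max-age directive (e.g. "max-age=23") anywhere in the string.
    // The digit count is capped so Long.parseLong cannot overflow.
    private static final Pattern MAX_AGE = Pattern.compile("\\bmax-age=(\\d{1,18})\\b");

    // Fixed-width HTTP date layout, always rendered in GMT.
    private static final DateTimeFormatter HTTP_DATE = DateTimeFormatter
            .ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US)
            .withZone(ZoneOffset.UTC);

    private CacheHeaderSketch() {}

    /** Last-Modified = max(lastModFrom, cacheHeaderSeed), both in epoch milliseconds. */
    static long lastModified(long lastModFromMillis, long cacheHeaderSeedMillis) {
        return Math.max(lastModFromMillis, cacheHeaderSeedMillis);
    }

    /** ETag from a hash of indexVersion and cacheHeaderSeed (the formula is an assumption). */
    static String etag(long indexVersion, long cacheHeaderSeedMillis) {
        int hash = 31 * Long.hashCode(indexVersion) + Long.hashCode(cacheHeaderSeedMillis);
        return "\"" + Integer.toHexString(hash) + "\"";
    }

    /** Returns max-age in seconds if the configured Cache-Control string contains one. */
    static Optional<Long> maxAgeSeconds(String cacheControl) {
        Matcher m = MAX_AGE.matcher(cacheControl);
        return m.find() ? Optional.of(Long.parseLong(m.group(1))) : Optional.empty();
    }

    /** Expires = now + max-age, formatted as an HTTP date; empty if there is no max-age. */
    static Optional<String> expires(String cacheControl, Instant now) {
        return maxAgeSeconds(cacheControl)
                .map(seconds -> HTTP_DATE.format(now.plusSeconds(seconds)));
    }
}
```

With this shape the configured string is sent as the Cache-Control value untouched, so unknown or extension directives pass straight through, and only max-age gets special treatment. Two slaves with the same indexVersion and cacheHeaderSeed would produce the same ETag under this sketch, which matches the default behavior described above.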
> Make Solr more friendly to external HTTP caches
> -----------------------------------------------
>
>          Key: SOLR-127
>          URL: https://issues.apache.org/jira/browse/SOLR-127
>      Project: Solr
>   Issue Type: Wish
>     Reporter: Hoss Man
>     Assignee: Hoss Man
>      Fix For: 1.3
>
>  Attachments: CacheUnitTest.patch, HTTPCaching.patch (listed repeatedly)
>
>
> an offhand comment I saw recently reminded me of something that really bugged
> me about the search solution i used *before* Solr -- it didn't play nicely
> with HTTP caches that might be sitting in front of it.
> at the moment, Solr doesn't put particularly useful info in the HTTP
> Response headers to aid in caching (ie: Last-Modified), responds to all HEAD
> requests with a 400, and doesn't do anything special with If-Modified-Since.
> At the very least, we can set a Last-Modified based on when the current
> IndexReader was opened (if not the Date on the IndexReader) and use the same
> info to determine how to respond to If-Modified-Since requests.
> (for the record, i think the reason this hasn't occurred to me in the 2+ years
> i've been using Solr, is because with the internal caching, i've yet to need
> to put a proxy cache in front of Solr)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.