[ 
https://issues.apache.org/jira/browse/SOLR-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12562803#action_12562803
 ] 

Hoss Man commented on SOLR-127:
-------------------------------

bq. Ad 2.: Whatever we choose: Two things must be linked: changed index and/or 
changed config must change the Etag and the Last-Modified 

I'm not sure that this is strictly true ... if something changes the Etag, then 
the Last-Modified should also change, but if the Last-Modified changes the Etag 
doesn't necessarily have to change.  consider use cases where solrconfig.xml 
never changes: we can use openTime for Last-Modified (in case we have to 
rollback to an older index), and indexVersion for the ETag - bouncing the 
server will change the Last-Mod because a new searcher is opened, but the Etag 
won't change because the index hasn't changed.

here's what i'm thinking...
* two new options (we can probably think of better names for these)...
*# lastModFrom="openTime|dirLastMod" ... default is dirLastMod
*# cacheHeaderSeed="[some date format]" ... default is epoch
* headers are computed as...
** Last-Modified = the max(lastModFrom, cacheHeaderSeed) ... where lastModFrom 
is computed using the specified value
** ETag is a hashcode of the indexVersion and cacheHeaderSeed
* resulting behavior...
** Users who aren't picky get the default where slaves with identical snapshots 
will have identical Etags and Last-Mod headers.  
** Changing configs by default won't immediately change the Etag or Last-Mod 
header ... if you've got an index that changes semi regularly you can just 
touch the index to get new headers, or you can add the cacheHeaderSeed option 
with a timestamp value to force new headers on startup.
** if you are super paranoid about making sure your headers are always a 
perfect reflection of reality (even if you rollback your index to an older 
copy) use lastModFrom="openTime" and update the cacheHeaderSeed option every 
time you change your config ... downside being that in multi-slave setups every 
machine will generate a different Last-Mod (but the ETags should be the same)

...thoughts?
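
To make the proposal above concrete, here is a rough Java sketch of how the two headers could be derived. It is only an illustration: the class and method names (CacheHeaderSketch, lastModifiedMillis, etag, httpDate) are made up for this example and are not from any patch on this issue, and it assumes lastModFrom has already been resolved to a timestamp in epoch millis and that cacheHeaderSeed has been parsed to epoch millis (0 for the default).

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
import java.util.Objects;

// Hypothetical helper, not part of Solr: illustrates the proposed header math.
final class CacheHeaderSketch {

    // HTTP-date layout with a two-digit day, always rendered in GMT.
    private static final DateTimeFormatter HTTP_DATE =
        DateTimeFormatter.ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US)
                         .withZone(ZoneOffset.UTC);

    // Last-Modified = max(lastModFrom, cacheHeaderSeed), both in epoch millis.
    static long lastModifiedMillis(long lastModFromMillis, long seedMillis) {
        return Math.max(lastModFromMillis, seedMillis);
    }

    // ETag = hash of indexVersion and cacheHeaderSeed, quoted as HTTP expects.
    static String etag(long indexVersion, long seedMillis) {
        return "\"" + Integer.toHexString(Objects.hash(indexVersion, seedMillis)) + "\"";
    }

    // Render epoch millis as an HTTP-date string for the Last-Modified header.
    static String httpDate(long millis) {
        return HTTP_DATE.format(Instant.ofEpochMilli(millis));
    }
}
```

With lastModFrom="openTime", two slaves that opened the same index at different times would get different lastModifiedMillis values, but etag(indexVersion, seed) would match on both as long as the index version and seed are identical.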

bq. One comment only: change must-revalidate="" to must-revalidate="true/false" 
. For no-store/no-cache as well.

yeah, that's what i was thinking originally, except i wanted to leave out any 
special knowledge about what the attributes were (ie: no hardcoded list of 
directive names) .. any XML attribute in the config would automatically become 
a directive in the header value, if it had a value in the config, it would have 
a directive value in the header...

{code}
<cacheControl max-age="23" no-cache="" no-store="" must-revalidate="" 
private="Foo" asdf="" qwert="666" />
...becomes...
Cache-Control: max-age="23", no-cache, no-store, must-revalidate, 
private="Foo", asdf, qwert="666"
{code}

...that way we don't have to worry about any HTTP extensions, people can put 
anything they freaking want in their Cache-Control header. What i forgot until 
today though is that the numeric directives in the Cache-Control header aren't 
supposed to be quoted (ie: max-age=23 ... not max-age="23")  ... so that won't 
work very easily either.
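
A quick Java sketch of that "every attribute becomes a directive" idea shows where it goes wrong. This is just an illustration under assumptions: NaiveCacheControlSketch and toHeaderValue are hypothetical names, and the attributes are assumed to arrive as an ordered name-to-value map rather than through any real Solr config API.

```java
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Solr: maps config attributes to directives
// with no knowledge of which directive names exist.
final class NaiveCacheControlSketch {

    // Empty value -> bare directive; any other value -> name="value".
    static String toHeaderValue(Map<String, String> attrs) {
        StringJoiner out = new StringJoiner(", ");
        for (Map.Entry<String, String> e : attrs.entrySet()) {
            String value = e.getValue();
            if (value == null || value.isEmpty()) {
                out.add(e.getKey());
            } else {
                // Always quoting is the weak spot: numeric directives such as
                // max-age are not supposed to be quoted.
                out.add(e.getKey() + "=\"" + value + "\"");
            }
        }
        return out.toString();
    }
}
```

Without knowing which directives are numeric, this code has no way to tell that max-age should come out as max-age=23 while private should keep its quotes.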

So then i started thinking maybe we use the named list syntax, and let the data 
type tell us whether or not the value should be quoted (<str>) or not (<int>) 
... but that seems awfully verbose for something this simple ... so now i'm 
wondering if maybe we should just make it be one big string and use a regex to 
look for max-age so we can set the Expires header as well.

I'm liking the simple string + regex approach personally.
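
For what it's worth, the string + regex approach could look roughly like the Java sketch below. It assumes the config hands us the whole Cache-Control value as one string; MaxAgeSketch, maxAgeSeconds and expiresMillis are hypothetical names made up for this example, and the pattern is deliberately simple rather than a full header parser.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr: pulls max-age out of a raw
// Cache-Control string so an Expires time can be derived from it.
final class MaxAgeSketch {

    // Matches an unquoted numeric max-age directive, e.g. "max-age=23".
    private static final Pattern MAX_AGE =
        Pattern.compile("\\bmax-age\\s*=\\s*(\\d+)", Pattern.CASE_INSENSITIVE);

    // Returns the max-age in seconds, or empty if absent or unparseable.
    static OptionalLong maxAgeSeconds(String cacheControl) {
        Matcher m = MAX_AGE.matcher(cacheControl);
        if (!m.find()) {
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(m.group(1)));
        } catch (NumberFormatException tooLarge) {
            return OptionalLong.empty();
        }
    }

    // Expires = now + max-age, in epoch millis; empty when there is no max-age.
    static OptionalLong expiresMillis(String cacheControl, long nowMillis) {
        OptionalLong maxAge = maxAgeSeconds(cacheControl);
        return maxAge.isPresent()
            ? OptionalLong.of(nowMillis + maxAge.getAsLong() * 1000L)
            : OptionalLong.empty();
    }
}
```

The configured string would be sent as the Cache-Control value untouched, so any extension directives pass straight through and only max-age gets special treatment.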



> Make Solr more friendly to external HTTP caches
> -----------------------------------------------
>
>                 Key: SOLR-127
>                 URL: https://issues.apache.org/jira/browse/SOLR-127
>             Project: Solr
>          Issue Type: Wish
>            Reporter: Hoss Man
>            Assignee: Hoss Man
>             Fix For: 1.3
>
>         Attachments: CacheUnitTest.patch, HTTPCaching.patch, 
> HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, 
> HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, 
> HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, 
> HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, 
> HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, 
> HTTPCaching.patch
>
>
> an offhand comment I saw recently reminded me of something that really bugged 
> me about the search solution i used *before* Solr -- it didn't play nicely 
> with HTTP caches that might be sitting in front of it.
> at the moment, Solr doesn't put in particularly useful info in the HTTP 
> Response headers to aid in caching (ie: Last-Modified), responds to all HEAD 
> requests with a 400, and doesn't do anything special with If-Modified-Since.
> at the very least, we can set a Last-Modified based on when the current 
> IndexReader was open (if not the Date on the IndexReader) and use the same 
> info to determine how to respond to If-Modified-Since requests.
> (for the record, i think the reason this hasn't occurred to me in the 2+ years 
> i've been using Solr, is because with the internal caching, i've yet to need 
> to put a proxy cache in front of Solr)

