Re: [fossil-users] HTTP caching, again

2018-06-11 Thread Florian Balmer
Though I'm aware that this is not something you may consider useful
for Fossil, I'm posting a minor update, for the sake of correctness:

(0) Wrong Statement

Me:

> I think that "Vary: Cookie" is intended to work with unconditional
> HTTP requests: the browser is directed to stick to the expiry date
> and use the cached page, unless the cookies have changed.

I'm sorry I was wrong here.

"Vary: Cookie" directs clients to include cookies as part of their
"cache key", and it also works with conditional HTTP requests.

But if clients mark resources as stale due to cookie updates, they
will still "revalidate" them with conditional If-None-Match requests,
using their last-received ETag. And if the server does not consult all
the information from the cookies to generate its ETag (here, the user
login time), it won't produce a new ETag and expire the page, but
instead reply with "304 Not Modified".
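
As a rough illustration of the point above, here is a minimal C sketch
(not Fossil's code). The helper names make_etag() and revalidate() are
made up for this example, and FNV-1a is only a stand-in hash; Fossil's
own etag.c is not reproduced here. It shows that a revalidation only
fails (and the page is only resent) if the login time is one of the
ETag ingredients:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* FNV-1a, used here only as a stand-in hash for illustration. */
static uint64_t fnv1a(uint64_t h, const char *z){
  while( *z ){
    h ^= (unsigned char)*z++;
    h *= 1099511628211ULL;
  }
  return h;
}

/* Build a quoted ETag from the content hash and, optionally, the login
** expiry time.  Passing useLogin=0 gives a "login-time-agnostic" tag.
** zOut should have room for at least 19 bytes. */
static void make_etag(char *zOut, size_t nOut, const char *zContentHash,
                      long long cexpire, int useLogin){
  char zBuf[32];
  uint64_t h = fnv1a(14695981039346656037ULL, zContentHash);
  if( useLogin ){
    snprintf(zBuf, sizeof(zBuf), "/%lld", cexpire);
    h = fnv1a(h, zBuf);
  }
  snprintf(zOut, nOut, "\"%016llx\"", (unsigned long long)h);
}

/* Return 304 if the client's If-None-Match value equals the current
** ETag, else 200 (send a fresh page). */
static int revalidate(const char *zIfNoneMatch, const char *zCurrent){
  return strcmp(zIfNoneMatch, zCurrent)==0 ? 304 : 200;
}
```

With useLogin=0, the tags generated before and after a login are
identical, so revalidate() yields 304 even though the cookie changed.
With useLogin=1 and a changed cexpire value, the tags differ and the
result is 200.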

(1) Introduced two Bugs

With the previous version of the patch, accessing the /uv page with no
file name specified caused an assertion failure, as doc_page()
repeatedly tries various index documents (index.html, index.wiki,
index.md, and 404.md). The cache checks are now only performed if
there's a valid hash (i.e. if "SELECT hash FROM unversioned" returns
non-empty data). Moreover, the Last-Modified response header was not
sent if an ETag cache hit had already been handled (though at least my
Apache web server seems to purge the Last-Modified header from "304
Not Modified" responses anyway).

(2) Updated If-Modified-Since checks

The updated patch also includes the modifications to the
If-Modified-Since checks, as suggested in a separate post.

--Florian

= Patch for Fossil [a7056e64] ==

Index: src/cgi.c
==
--- src/cgi.c
+++ src/cgi.c
@@ -249,11 +249,11 @@
 iReplyStatus = 200;
 zReplyStatus = "OK";
   }

   if( g.fullHttpReply ){
-fprintf(g.httpOut, "HTTP/1.0 %d %s\r\n", iReplyStatus, zReplyStatus);
+fprintf(g.httpOut, "HTTP/1.1 %d %s\r\n", iReplyStatus, zReplyStatus);
 fprintf(g.httpOut, "Date: %s\r\n", cgi_rfc822_datestamp(time(0)));
 fprintf(g.httpOut, "Connection: close\r\n");
 fprintf(g.httpOut, "X-UA-Compatible: IE=edge\r\n");
   }else{
 fprintf(g.httpOut, "Status: %d %s\r\n", iReplyStatus, zReplyStatus);
@@ -269,10 +269,11 @@
 fprintf(g.httpOut, "Cache-control: no-cache\r\n");
   }
   if( etag_mtime()>0 ){
 fprintf(g.httpOut, "Last-Modified: %s\r\n",
 cgi_rfc822_datestamp(etag_mtime()));
+fprintf(g.httpOut, "Vary: Cookie\r\n"); /* HTTP/1.0 (no ETags) */
   }

   if( blob_size()>0 ){
 fprintf(g.httpOut, "%s", blob_buffer());
   }

Index: src/doc.c
==
--- src/doc.c
+++ src/doc.c
@@ -641,17 +641,29 @@
   }
 }
 if( isUV ){
   if( db_table_exists("repository","unversioned") ){
 Stmt q;
+char *zHash=0;
+sqlite3_int64 mtime=0;
 db_prepare(&q, "SELECT hash, mtime FROM unversioned"
" WHERE name=%Q", zName);
 if( db_step(&q)==SQLITE_ROW ){
-  etag_check(ETAG_HASH, db_column_text(&q,0));
-  etag_last_modified(db_column_int64(&q,1));
+  zHash = fossil_strdup(db_column_text(&q,0));
+  mtime = db_column_int64(&q,1);
 }
 db_finalize(&q);
+if( zHash ){
+  /* Only call etag_check() if the unversioned file was found
+  ** and has a valid hash, as doc_page() is called repeatedly
+  ** to search for index documents (index.html, index.wiki,
+  ** index.md, and 404.md), causing an assertion failure in
+  ** etag_check(), due to zETag already initialized. */
+  if( zHash[0] )
+etag_check(ETAG_HASH|ETAG_CEXP, zHash, mtime);
+  free(zHash);
+}
 if( unversioned_content(zName, )==0 ){
   rid = 1;
   zDfltTitle = zName;
 }
   }
@@ -847,11 +859,11 @@
 */
 void logo_page(void){
   Blob logo;
   char *zMime;

-  etag_check(ETAG_CONFIG, 0);
+  etag_check(ETAG_CONFIG, 0, 0);
   zMime = db_get("logo-mimetype", "image/gif");
   blob_zero(&logo);
   db_blob(&logo, "SELECT value FROM config WHERE name='logo-image'");
   if( blob_size(&logo)==0 ){
 blob_init(&logo, (char*)aLogo, sizeof(aLogo));
@@ -881,11 +893,11 @@
 */
 void background_page(void){
   Blob bgimg;
   char *zMime;

-  etag_check(ETAG_CONFIG, 0);
+  etag_check(ETAG_CONFIG, 0, 0);
   zMime = db_get("background-mimetype", "image/gif");
   blob_zero(&bgimg);
   db_blob(&bgimg, "SELECT value FROM config WHERE name='background-image'");
   if( blob_size(&bgimg)==0 ){
 blob_init(&bgimg, (char*)aBackground, sizeof(aBackground));

Index: src/etag.c
==
--- src/etag.c
+++ src/etag.c
@@ -24,10 +24,11 @@
 **   (1)  The mtime on the Fossil executable
 **   (2)  The last change to the CONFIG table
 **   (3)  The last change 

Re: [fossil-users] HTTP caching, again

2018-05-21 Thread Florian Balmer
Now I see why the clever and elegant solution to use "Vary: Cookie", as
suggested by Joerg, does not fix /uv page expiration after login and
logout, and I can also explain the strange differences between the local
Fossil built-in web server, and my remote web server.

The local Fossil built-in web server uses the HTTP/1.0 protocol. On my
remote web server, Apache automatically upgrades the CGI responses
generated by Fossil to HTTP/1.1.

HTTP/1.0 does not yet support ETags, but only Last-Modified stamps, and
thus web browsers do not include If-None-Match with their requests, but
just stick to If-Modified-Since [0]. Interestingly, "Vary: Cookie" works
(i.e. /uv pages are expired after login and logout) with Chrome, and looks
like it's working with IE and Edge (but in fact they are not caching pages
with "Vary: Cookie" at all, mimicking a correct refresh triggered by a
cookie update; it may be my browser settings).

[0] https://stackoverflow.com/a/28033770

With HTTP/1.1, browsers include If-None-Match with their requests, but
"Vary: Cookie" no longer has any effect (i.e. /uv pages are not expired
after login and logout).

I think that "Vary: Cookie" is intended to work with unconditional HTTP
requests: the browser is directed to stick to the expiry date and use the
cached page, unless the cookies have changed.

But caching works differently with conditional HTTP requests
(If-None-Match, If-Modified-Since): pages are always revalidated with the
server, whether or not the cookies have changed, and "Vary: Cookie" has no
additional effect here.

I've attached a simple PHP script to test this:

The script generates two web pages, one to expire after 10 seconds (through
a "Cache-Control: max-age=10" HTTP response header), the other to handle a
conditional request (through a "Cache-Control: must-revalidate, private"
header) that always returns "304 Not Modified" (to simulate the current
Fossil "Last-Modified always wins" caching behavior).

Repeatedly clicking the "Reload" links causes the first page to refresh
every 10 seconds (watch the "Date" and "Cookie" entries). The second page
remains unchanged (unless reloaded with Ctrl+F5).

Clicking "Update cookie" modifies the test cookie (by JavaScript). Now the
first page is refreshed immediately - the effect of "Vary: Cookie". The
second page still remains unchanged.
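
To make the behavior of the first test page more concrete, the following
C sketch models the decision a cache makes before it sends any request at
all. The CacheEntry struct and cache_is_usable() are hypothetical names
for this example only, and real browser caches are far more involved. The
point is that "Vary: Cookie" makes the Cookie request header part of the
cache key, so a changed cookie forces a fetch even within max-age:

```c
#include <string.h>
#include <time.h>

/* A hypothetical, much simplified cache entry. */
typedef struct CacheEntry {
  const char *zUrl;      /* Request URL */
  const char *zCookie;   /* Cookie header sent with the original request */
  time_t tStored;        /* When the response was stored */
  int maxAge;            /* Cache-Control: max-age, in seconds */
  int varyCookie;        /* True if the response carried "Vary: Cookie" */
} CacheEntry;

/* Return 1 if the entry may be reused without contacting the server. */
static int cache_is_usable(const CacheEntry *p, const char *zUrl,
                           const char *zCookie, time_t now){
  if( strcmp(p->zUrl, zUrl)!=0 ) return 0;
  /* With "Vary: Cookie", a different cookie means a different key. */
  if( p->varyCookie && strcmp(p->zCookie, zCookie)!=0 ) return 0;
  /* Otherwise the entry is fresh while its age is below max-age. */
  return difftime(now, p->tStored) < p->maxAge;
}
```

For an entry stored with max-age=10, this returns 1 for five seconds
after storing and 0 after eleven seconds, and it returns 0 immediately
once the cookie differs, provided varyCookie is set. The second test page
never reaches this shortcut, because it is always revalidated with the
server.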

That's why /uv pages are not expired after login and logout, even if the
login cookie has changed. The browser always revalidates the page, and
whether or not Fossil detects an ETag mismatch after login and logout
(currently, it doesn't, as the ETag is not "login-time-sensitive"), it is
immediately undermined by a Last-Modified match. "Vary: Cookie" can't fix
this.

For a few days now, I have been using Fossil with the patch that expires
/uv web pages whenever a new user logs in.

With the repository index page set to a /uv page [1], I have a very smooth
user experience for login and logout actions:

After login, I can immediately see the Admin menu entry and the user name
display in the top right corner, without the need to do a "hard" reload
(Ctrl+F5) -- exactly the way it was before Fossil supported HTTP caching.

My index page has a direct logout link, and when clicked, the Admin menu
entry and user name display are again updated immediately:

   [/login?out | Logout]

If the user login state does not change, Fossil sends a "304 Not Modified"
response, and the web browser shows the cached page, with the Admin menu
entry and the user name display in sync.

If there are future plans to use HTTP caching not only for /uv, but also
for /doc and /wiki pages, more people may run into the issue that they
need to do "hard" reloads after login and logout.

I have refactored the patch (attached) to have one single function handle
both kinds of conditional request, and to hide the logic that ensures ETag
mismatches won't be undermined by Last-Modified matches, so it's easier to
reuse it for page generators other than /uv at a later stage. I'm not sure
if it's safe to change HTTP/1.0 to HTTP/1.1 for the local Fossil built-in
web server (I think it is, as it's likely that it is silently upgraded by
most web servers).

"Vary: Cookie" was left in place, just in case, as a possible fallback for
local HTTP/1.0 servers. Like this, it's easy to test the impact of "Vary:
Cookie", as removing the ETAG_CEXP flag from the call to etag_check()
changes ETag generation to be "login-time-agnostic", again.

I would really like to encourage you to try the patch, and see how this
changes the user experience for login and logout actions related to /uv web
pages, on local and remote Fossil web servers.

Thank you very much
--Florian

[1] A /uv repository index page can be updated by scripting and replaced
for local clones, and unlike with /doc or /wiki, changes never show up in
the Fossil timeline and/or file hierarchy. I'm keeping the (identical)
index pages for my repositories in a separate meta-repository, so no need
to archive 

Re: [fossil-users] HTTP caching, again

2018-05-18 Thread Florian Balmer
Joerg:

> Such a proxy would be pretty broken. ... Again, such a client is
> pretty much broken already under the caching model. ...

I agree. Writing an HTTP server in a perfect world may be easy. But isn't
a lot of programming work an effort to make broken clients (or
"components") work?

My tests for the "Vary: Cookie" header were on Windows, with the Fossil
built-in (`fossil ui') local web server, and the "localauth" setting
enabled, so that login and logout were possible. This worked fine.

But then I noticed something strange: most major web browsers (not tested
with Firefox) do not seem to send If-None-Match request headers for
localhost connections, but only use If-Modified-Since.

So I rebuilt Fossil with the same modification to send a "Vary: Cookie"
header along with ETag and Last-Modified headers, and tested it on my
FreeBSD / Apache remote web server -- and unfortunately it doesn't work
(again tested with most major web browsers, not tested with Firefox).

Apache seems to do minimal HTTP header rearrangements compared to the
Fossil local web server, such as:

Vary: Accept-Encoding
Vary: Cookie

to:

Vary: Cookie,Accept-Encoding

and:

Connection: close

to:

Connection: Keep-Alive
Keep-Alive: timeout=2, max=99

but I think this should not affect HTTP caching.
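
The two Vary forms above carry the same information: a list-valued
header may be sent as several lines or as one comma-separated line. As a
small sketch of why a recipient sees no difference, here is one way to
test a combined value for a field name. The function name
header_list_has() is an assumption for this example; it is not part of
Fossil or Apache:

```c
#include <ctype.h>
#include <string.h>

/* Return 1 if the comma-separated header value zList contains the
** token zName, compared case-insensitively and ignoring blanks. */
static int header_list_has(const char *zList, const char *zName){
  size_t n = strlen(zName);
  while( *zList ){
    size_t i;
    /* Skip separators in front of the next token. */
    while( *zList==' ' || *zList=='\t' || *zList==',' ) zList++;
    for(i=0; i<n && zList[i]; i++){
      if( tolower((unsigned char)zList[i])
           != tolower((unsigned char)zName[i]) ) break;
    }
    if( i==n ){
      /* The token must end here, not merely start with zName. */
      const char *zEnd = zList + n;
      while( *zEnd==' ' || *zEnd=='\t' ) zEnd++;
      if( *zEnd==0 || *zEnd==',' ) return 1;
    }
    /* Advance to the next comma (or the end of the string). */
    while( *zList && *zList!=',' ) zList++;
  }
  return 0;
}
```

Called with "Cookie,Accept-Encoding" and "cookie" this returns 1, the
same answer a recipient would get from the separate "Vary: Cookie" line.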

So it looks like "Vary: Cookie" only works for clients relying on the
If-Modified-Since cache control mechanism?

The only way I found to get both the Fossil built-in local web server and
the FreeBSD / Apache remote web server to work as expected, i.e. that /uv
web pages are correctly expired after a new user is logged in, was the
following combination:

* Add "user.cexpire" to the ETag ingredients.
* Skip If-Modified-Since handling after If-None-Match has already detected
an ETag cache miss.
* Send a "Vary: Cookie" header with ETag and Last-Modified headers.
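
The second item in the list can be sketched as a single decision
function. This is only an illustration under simplifying assumptions:
conditional_status() is a hypothetical name, ETags are compared as exact
strings (no "*", no lists, no weak validators), and times are plain
integers. It follows the rule from HTTP/1.1 (RFC 7232, section 3.3) that
a recipient ignores If-Modified-Since when the request contains
If-None-Match:

```c
#include <string.h>

/* Decide the reply status for a conditional GET.
**   zIfNoneMatch: value of the If-None-Match header, or NULL if absent.
**   ifModSince:   parsed If-Modified-Since time, or 0 if absent.
**   zETag, mtime: current validators of the resource.
** Returns 304 (Not Modified) or 200 (send the full page). */
static int conditional_status(const char *zIfNoneMatch, long long ifModSince,
                              const char *zETag, long long mtime){
  if( zIfNoneMatch ){
    /* An ETag was sent: its verdict is final.  If-Modified-Since is
    ** not consulted, so it cannot turn a cache miss into a hit. */
    return strcmp(zIfNoneMatch, zETag)==0 ? 304 : 200;
  }
  /* No ETag in the request: fall back to the time stamp check. */
  if( ifModSince>0 && mtime<=ifModSince ) return 304;
  return 200;
}
```

With a stale ETag and a still-matching time stamp, this returns 200,
which is the behavior needed for /uv pages after login and logout. A
client that sends only If-Modified-Since still gets 304 for an unchanged
file.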

--Florian

= Patch for Fossil [04190488] ==

Index: src/cgi.c
==
--- src/cgi.c
+++ src/cgi.c
@@ -242,10 +242,11 @@

  /*
  ** Do a normal HTTP reply
  */
  void cgi_reply(void){
+  int vary_cookie = 0;
int total_size;
if( iReplyStatus<=0 ){
  iReplyStatus = 200;
  zReplyStatus = "OK";
}
@@ -262,19 +263,24 @@
  /* isConst means that the reply is guaranteed to be invariant, even
  ** after configuration changes and/or Fossil binary recompiles. */
  fprintf(g.httpOut, "Cache-Control: max-age=31536000\r\n");
}else if( etag_tag()!=0 ){
  fprintf(g.httpOut, "ETag: %s\r\n", etag_tag());
+vary_cookie = 1;
  fprintf(g.httpOut, "Cache-Control: max-age=%d\r\n", etag_maxage());
}else{
  fprintf(g.httpOut, "Cache-control: no-cache\r\n");
}
if( etag_mtime()>0 ){
  fprintf(g.httpOut, "Last-Modified: %s\r\n",
  cgi_rfc822_datestamp(etag_mtime()));
+vary_cookie = 1;
}

+  if( vary_cookie )
+fprintf(g.httpOut, "Vary: Cookie\r\n");
+
if( blob_size()>0 ){
  fprintf(g.httpOut, "%s", blob_buffer());
}

/* Add headers to turn on useful security options in browsers. */

Index: src/doc.c
==
--- src/doc.c
+++ src/doc.c
@@ -641,17 +641,26 @@
}
  }
  if( isUV ){
if( db_table_exists("repository","unversioned") ){
  Stmt q;
+char *zHash=0;
+sqlite3_int64 mtime=0;
  db_prepare(&q, "SELECT hash, mtime FROM unversioned"
 " WHERE name=%Q", zName);
  if( db_step(&q)==SQLITE_ROW ){
-  etag_check(ETAG_HASH, db_column_text(&q,0));
-  etag_last_modified(db_column_int64(&q,1));
+  zHash = fossil_strdup(db_column_text(&q,0));
+  mtime = db_column_int64(&q,1);
  }
  db_finalize(&q);
+if( zHash==0 || etag_check(ETAG_HASH, zHash)==0 ){
+  /* Prevent the If-Modified-Since cache handler to
+  ** undermine cache misses already cleared by the
+  ** If-None-Match cache handler. */
+  if ( mtime ) etag_last_modified(mtime);
+}
+if( zHash ) free(zHash);
  if( unversioned_content(zName, )==0 ){
rid = 1;
zDfltTitle = zName;
  }
}

Index: src/etag.c
==
--- src/etag.c
+++ src/etag.c
@@ -20,14 +20,15 @@
  ** An ETag is a hash that encodes attributes which must be the same for
  ** the page to continue to be valid.  Attributes that might be contained
  ** in the ETag include:
  **
  **   (1)  The mtime on the Fossil executable
-**   (2)  The last change to the CONFIG table
-**   (3)  The last change to the EVENT table
-**   (4)  The value of the display cookie
-**   (5)  A hash value supplied by the page generator
+**   (2)  The "user.cexpire" field for logged-in users
+**   (3)  The last change to the CONFIG table
+**   (4)  The last change to the EVENT table
+**   (5) 

Re: [fossil-users] HTTP caching, again

2018-05-18 Thread Joerg Sonnenberger
On Fri, May 18, 2018 at 08:39:15AM +0200, Florian Balmer wrote:
> Also, with "Vary: Cookie", there may be issues with caching proxies,
> depending on whether they receive and evaluate all the cookies, but this
> may not be a problem for Fossil.

Such a proxy would be pretty broken. It has to parse the request to find
the URL already and the header will tell it the client cookie. Varying
on cookies is also one of the most common instances.

> For clients that do not understand or support "Vary: Cookie", I would still
> suggest to perform the Last-Modified checks only if no ETag was included
> with the request (so that ETag misses can not be outdone by Last-Modified
> hits).

Again, such a client is pretty much broken already under the caching
model. But it would likely not care about the login details in that
case.

Joerg
___
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users


Re: [fossil-users] HTTP caching, again

2018-05-18 Thread Florian Balmer
Joerg:

> I don't think you need to reset it, just sending the vary header
> should be enough?

I was able to try this, and it works fine!

Adding the following line:

   fprintf(g.httpOut, "Vary: Cookie\r\n");

right after printing the ETag header in src/cgi.c (and also after printing
the Last-Modified header, if not already printed after the ETag header)
results in correct web page expiration after login and logout.

Using "user.cexpire" to calculate the ETag may give more fine-grained
control, as for example a /uv page would not need a refresh if an unrelated
cookie (for example, to set /timeline display options) were changed, but
overall, the "Vary: Cookie" method may work well enough.

Also, with "Vary: Cookie", there may be issues with caching proxies,
depending on whether they receive and evaluate all the cookies, but this
may not be a problem for Fossil.

For clients that do not understand or support "Vary: Cookie", I would still
suggest to perform the Last-Modified checks only if no ETag was included
with the request (so that ETag misses can not be outdone by Last-Modified
hits).

--Florian


Re: [fossil-users] HTTP caching, again

2018-05-17 Thread Joerg Sonnenberger
On Thu, May 17, 2018 at 04:08:18PM -0400, Richard Hipp wrote:
> On 5/17/18, Joerg Sonnenberger  wrote:
> > On Thu, May 17, 2018 at 07:02:17PM +0200, Florian Balmer wrote:
> >> So I tried to generate a "login-time-sensitive" ETag. This worked well
> >> with the "cexpire" field from the "user" table (which is actually the
> >> login
> >> time, shifted to the future by one unit of the "cookie-expire" setting).
> >
> > Would a vary-on-cookie solve this already?
> >
> 
> Fossil uses two cookies:  One for the login and a separate "display
> cookie" to remember display preferences.  The ETag values already
> reset based on changes to the display cookie.  I suppose they could
> change again based on the login cookie.  The question is, would this
> solve Florent's problem?

I don't think you need to reset it, just sending the vary header should
be enough?

Joerg


Re: [fossil-users] HTTP caching, again

2018-05-17 Thread Florian Balmer
D. Richard Hipp:

> The ETag values already reset based on changes to the display
> cookie. I suppose they could change again based on the login
> cookie. The question is, would this solve Florent's problem?

Yes, adding "user.cexpire" to the ETag ingredients [0] would solve part of
the problem: /uv pages (and hence the navigation menu entries) are not
expired after a new user is logged-in.

[0] http://fossil-scm.org/index.html/artifact?=9ef915be04=24-28

The other part is that whenever the client has included an ETag header with
the request, the Last-Modified header should no longer be evaluated, but an
ETag mismatch should already result in a cache miss, without giving the
Last-Modified pathway a chance to "validate" a page previously classified
as expired by the ETag handler.

To test this, you can view the (unversioned) Download page on the Fossil
website, then click login and logout from there, and check if you can
always see the correct (i.e. the current, or no) username near the
login/logout link, and the Admin menu entry (if enabled by the skin), or if
you need to press Ctrl+F5.

Joerg:

> Would a vary-on-cookie solve this already?

Aha, thanks for the input, today's bedtime reading ... I live almost in the
UTC time zone, handy when using Fossil :)

--Florian


Re: [fossil-users] HTTP caching, again

2018-05-17 Thread Richard Hipp
On 5/17/18, Joerg Sonnenberger  wrote:
> On Thu, May 17, 2018 at 07:02:17PM +0200, Florian Balmer wrote:
>> So I tried to generate a "login-time-sensitive" ETag. This worked well
>> with the "cexpire" field from the "user" table (which is actually the
>> login
>> time, shifted to the future by one unit of the "cookie-expire" setting).
>
> Would a vary-on-cookie solve this already?
>

Fossil uses two cookies:  One for the login and a separate "display
cookie" to remember display preferences.  The ETag values already
reset based on changes to the display cookie.  I suppose they could
change again based on the login cookie.  The question is, would this
solve Florent's problem?

-- 
D. Richard Hipp
d...@sqlite.org


Re: [fossil-users] HTTP caching, again

2018-05-17 Thread Joerg Sonnenberger
On Thu, May 17, 2018 at 07:02:17PM +0200, Florian Balmer wrote:
> So I tried to generate a "login-time-sensitive" ETag. This worked well
> with the "cexpire" field from the "user" table (which is actually the login
> time, shifted to the future by one unit of the "cookie-expire" setting).

Would a vary-on-cookie solve this already?

Joerg


Re: [fossil-users] HTTP caching, again

2018-05-17 Thread Florian Balmer
> One possible solution may be to include the "cexpire" field in
> ETag calculation, drop the If-Modified-Since handler, but still
> return a Last-Modified date.

Well, it may be possible to support both caching mechanisms. But then an
ETag mismatch should result in a cache miss, and the If-Modified-Since
route should no longer be taken thereafter.

As Fossil supplies both an ETag and a Last-Modified header for the initial
(non-cached) page view, web browsers will also send back both headers, and
then If-Modified-Since can undermine If-None-Match cache misses.

--Florian


[fossil-users] HTTP caching, again

2018-05-17 Thread Florian Balmer
Sorry for my perseveration on the topic.

I'm using a /uv index page for my repositories. After login, the index page
is not expired, and I cannot see the Admin entry in the top navigation
bar until after a "hard" reload with Ctrl+F5.

So I tried to generate a "login-time-sensitive" ETag. This worked well
with the "cexpire" field from the "user" table (which is actually the login
time, shifted to the future by one unit of the "cookie-expire" setting).

But as there are serial cache expiration checks, an ETag cache miss is
immediately caught by a Last-Modified cache hit (no matter whether or not
the Fossil executable mtime is used to limit the age of the Last-Modified
date).

I'm not sure if this can be solved to work well for both web UI pages (ETag
preferred) and "static" files downloaded via scripts (Last-Modified
preferred).

One possible solution may be to include the "cexpire" field in ETag
calculation, drop the If-Modified-Since handler, but still return a
Last-Modified date.

Like this, wget and simple scripts can still do their own If-Modified-Since
cache checks based on HTTP HEAD requests, and use the returned
Last-Modified date to adjust file time stamps. Given that the time stamps
of unversioned files can be changed arbitrarily, and may not be 100%
accurate anyway, this could make a good compromise.
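
For the script side of this, a client would typically turn the time
stamp of its local file into an HTTP-date and send it as
If-Modified-Since. A minimal sketch in standard C, assuming the default
"C" locale (so that day and month names are English); http_date() is a
name chosen for this example:

```c
#include <time.h>

/* Format t (seconds since the Unix epoch) as an HTTP-date such as
** "Sun, 06 Nov 1994 08:49:37 GMT".  zOut needs room for 30 bytes.
** Returns 0 on success, non-zero on failure. */
static int http_date(char *zOut, size_t nOut, time_t t){
  struct tm *pTm = gmtime(&t);
  if( pTm==0 ) return 1;
  return strftime(zOut, nOut, "%a, %d %b %Y %H:%M:%S GMT", pTm)==0;
}
```

The same format is used by the server for the Last-Modified response
header, so a script can compare or store the two values consistently.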

--Florian