Re: [patch] httpd static gzip compression

2022-02-28 Thread Steffen Nurpmeso
j...@bitminer.ca wrote in
 <07cf90c2d8adab83ffd36ef69ebcd...@bitminer.ca>:
 |On 2022-02-27 19:28, Solène wrote:
 |> If I remember correctly, lighttpd will compress server-side upon
 |> request if a compressed version of that file doesn't exist.
 |> 
 |> This is entirely different from how httpd behaves, because it doesn't
 |> compress files at all.

The separate path to avoid ambiguities would still address one
concern de Raadt had a couple of months back, if i recall
correctly.  And, for example, doing a dirname and adding ."CVS"
before the basename.gz would only slightly complicate, aka enlarge,
the patch.
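
To make the separate-path idea a bit more concrete, here is a minimal
sketch.  The ".gz" subdirectory name and the helper gz_cache_path() are
assumptions made up for illustration; neither is part of the patch.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build "<dir>/.gz/<base>.gz" from "<dir>/<base>".
 * Returns 0 on success, -1 if the result would not fit in dst.
 * The ".gz" directory name is an assumption made for this sketch.
 */
int
gz_cache_path(char *dst, size_t dstsize, const char *path)
{
	const char *slash = strrchr(path, '/');
	int n;

	if (slash == NULL)	/* no directory component */
		n = snprintf(dst, dstsize, ".gz/%s.gz", path);
	else
		n = snprintf(dst, dstsize, "%.*s/.gz/%s.gz",
		    (int)(slash - path), path, slash + 1);

	if (n < 0 || (size_t)n >= dstsize)
		return -1;
	return 0;
}
```

With this, a request for /htdocs/man/ls.html would look at
/htdocs/man/.gz/ls.html.gz, so a stray ls.html.gz next to the original
is never picked up by accident.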

My main use case for www.openbsd.org (having a quick look at
manuals) would not be covered by that though, and ML archives are
not hosted either.  HTML compresses a lot, i find it desirable.

 |I would describe the automatic selection of available .gz files as
 |a hand-crafted compression cache.
 |
 |This is not user-friendly to either system admins or webmasters
 |(assuming they are different people).
 |
 |My proposal was to automate the cache-invalidate logic (so-to-speak),
 |until/unless the file owner (sysadmin or webmaster) updates the gz
 |file.  This reduces a possible error path.
 |
 |There are a few use-cases for the gz feature:
 |
 |1) large files built exactly once; the gz file is therefore built
 |once, and hand-crafted compression is selected in config and by
 |creating the gz file.  The file owner(s) know what they are doing.
 |
 |2) large files updated a few times; perhaps the owner(s) will
 |remember to update gz files.  This is where the proposal fits in.
 |Even though they know what they are doing, helping invalidate the
 |cache is a good thing until the cache is updated.
 |
 |3) a set of large files which need consistent treatment: outside
 |the scope of such a program as httpd or its users.
 |
 |Please consider the above as reasoning for somewhat more logic
 |than was first suggested.
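
The cache-invalidation rule proposed above boils down to one timestamp
comparison.  A minimal sketch, assuming both files have already been
stat(2)ed; the helper name gz_is_current() is hypothetical and only whole
seconds are compared:

```c
#include <sys/stat.h>

/*
 * Return 1 if the compressed copy is at least as new as the original,
 * 0 otherwise.  Only whole seconds are compared.
 */
int
gz_is_current(const struct stat *orig, const struct stat *gz)
{
	return gz->st_mtime >= orig->st_mtime;
}
```

A server would fall back to the uncompressed file whenever this returns
0, which is what keeps a stale .gz from being served.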

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)



Re: [patch] httpd static gzip compression

2022-02-27 Thread Steffen Nurpmeso
Steffen Nurpmeso wrote in
 <20220227225619.pzahj%stef...@sdaoden.eu>:
 |Solène wrote in
 | <3311e74f-1ad8-49b3-96d7-3f3c7f2af...@perso.pw>:
 ||27 févr. 2022 19:37:20 j...@bitminer.ca:
 ||
 ||> Would it be too much to ask to defend the (poor) web master against
 ||> own-goal errors?
 ||> 
 ||> That is, approximately:
 ||> 
 ||> if ((access(gzpath, R_OK) == 0) &&
 ||>     (stat(gzpath, &gzst) == 0) &&
 ||>     (gzst.st_mtim.tv_sec >=
 ||>     st->st_mtim.tv_sec)) {   /* new test */
 ||>         path = gzpath;
 ||>         st = &gzst;
 ||>         kv_add(&resp->http_headers,
 ||>             "Content-Encoding", "gzip");
 ||> }
 ||> 
 ||> (apologies for formatting errors)
 ||> 
 ||> In english: the gz file must be the same age as or newer than the
 ||> original.
 ||> 
 ||> My assumption being that "static" files are not always static.
 ||> And correctly updating .gz files requires a bit of a
 ||> delete-update-recreate dance.
 |
 ||I'd prefer not to have much logic for this, so it's easier for
 ||admins to understand. It feels wrong to serve one file or another
 ||depending on their timestamps.
 |
 |Sorry for stepping into this again, but lighttpd compress uses
 |a special folder for this, so such ambiguities cannot happen.

(And a simple cron job can remove things which have been unused
for a long time.)

--steffen



Re: [patch] httpd static gzip compression

2022-02-27 Thread Steffen Nurpmeso
Solène wrote in
 <3311e74f-1ad8-49b3-96d7-3f3c7f2af...@perso.pw>:
 |27 févr. 2022 19:37:20 j...@bitminer.ca:
 |
 |> Would it be too much to ask to defend the (poor) web master against
 |> own-goal errors?
 |> 
 |> That is, approximately:
 |> 
 |> if ((access(gzpath, R_OK) == 0) &&
 |>     (stat(gzpath, &gzst) == 0) &&
 |>     (gzst.st_mtim.tv_sec >=
 |>     st->st_mtim.tv_sec)) {   /* new test */
 |>         path = gzpath;
 |>         st = &gzst;
 |>         kv_add(&resp->http_headers,
 |>             "Content-Encoding", "gzip");
 |> }
 |> 
 |> (apologies for formatting errors)
 |> 
 |> In english: the gz file must be the same age as or newer than the
 |> original.
 |> 
 |> My assumption being that "static" files are not always static.
 |> And correctly updating .gz files requires a bit of a
 |> delete-update-recreate dance.

 |I'd prefer not to have much logic for this, so it's easier for
 |admins to understand. It feels wrong to serve one file or another
 |depending on their timestamps.

Sorry for stepping into this again, but lighttpd compress uses
a special folder for this, so such ambiguities cannot happen.

--steffen



Re: [patch] httpd static gzip compression

2022-02-26 Thread Tracey Emery
On Sat, Feb 26, 2022 at 10:55:59AM +0100, prx wrote:
> First, thank you for your interest!
> 
> > > Shouldn't we check for truncation on strlcpy and strlcat and goto fail
> > > in that event?
> > 
> > With goto abort we get a 500 internal server error.
> > 
> 
> Moreover, if the strlcpy and strlcat failed, then the file requested 
> (gzpath) is obviously not found, and httpd switches back to the original path.
> 
> But to avoid unexpected behaviour, maybe something like this can be enough?
> 

I'd prefer the hard failure. I'm sure someone else will chime in if they
think otherwise. Thanks! :)
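
For reference, a minimal sketch of what the hard failure could look like.
It uses snprintf(3) rather than the strlcpy/strlcat pair from the patch,
only because those are not available in every libc; the helper name
build_gzpath() is hypothetical.

```c
#include <stdio.h>

/*
 * Write "<path>.gz" into dst.  Returns 0 on success and -1 when the
 * result was truncated, so the caller can fail hard instead of
 * silently probing a shortened path.
 */
int
build_gzpath(char *dst, size_t dstsize, const char *path)
{
	int n = snprintf(dst, dstsize, "%s.gz", path);

	if (n < 0 || (size_t)n >= dstsize)
		return -1;
	return 0;
}
```

In server_file_request() a -1 here would translate into a goto abort,
which ends in the 500 response mentioned earlier in the thread.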

> ```
> int gztoolong = 0;
> 
> /* check Accept-Encoding header */
> key.kv_key = "Accept-Encoding";
> r = kv_find(&req->http_headers, &key);
> 
> if (r != NULL) {
> 	if (strstr(r->kv_value, "gzip") != NULL) {
> 		/* append ".gz" to path and check existence */
> 		if (strlcpy(gzpath, path, sizeof(gzpath)) >= sizeof(gzpath))
> 			gztoolong = 1;
> 		if (strlcat(gzpath, ".gz", sizeof(gzpath)) >= sizeof(gzpath))
> 			gztoolong = 1;
> 
> 		if ((gztoolong == 0) &&
> 		    (access(gzpath, R_OK) == 0) &&
> 		    (stat(gzpath, &gzst) == 0)) {
> 			path = gzpath;
> 			st = &gzst;
> 			kv_add(&resp->http_headers,
> 			    "Content-Encoding", "gzip");
> 		}
> 	}
> }
> ```
> 

-- 

Tracey Emery



Re: [patch] httpd static gzip compression

2022-02-26 Thread Tracey Emery
On Sat, Feb 26, 2022 at 02:52:03AM +0100, Alexander Bluhm wrote:
> > Shouldn't we check for truncation on strlcpy and strlcat and goto fail
> > in that event?
> 
> With goto abort we get a 500 internal server error.

ok

> 
> Index: httpd.conf.5
> ===
> RCS file: /data/mirror/openbsd/cvs/src/usr.sbin/httpd/httpd.conf.5,v
> retrieving revision 1.119
> diff -u -p -r1.119 httpd.conf.5
> --- httpd.conf.5  24 Oct 2021 16:01:04 -  1.119
> +++ httpd.conf.5  25 Feb 2022 18:41:42 -
> @@ -425,6 +425,12 @@ A variable that is set to a comma separa
>  features in use
>  .Pq omitted when TLS client verification is not in use .
>  .El
> +.It Ic gzip-static
> +Enable static gzip compression to save bandwidth.
> +.Pp
> +If gzip encoding is accepted and if the requested file exists with
> +an additional .gz suffix, use the compressed file instead and deliver
> +it with content encoding gzip.
>  .It Ic hsts Oo Ar option Oc
>  Enable HTTP Strict Transport Security.
>  Valid options are:
> Index: httpd.h
> ===
> RCS file: /data/mirror/openbsd/cvs/src/usr.sbin/httpd/httpd.h,v
> retrieving revision 1.158
> diff -u -p -r1.158 httpd.h
> --- httpd.h   24 Oct 2021 16:01:04 -  1.158
> +++ httpd.h   25 Feb 2022 18:40:58 -
> @@ -396,6 +396,7 @@ SPLAY_HEAD(client_tree, client);
>  #define SRVFLAG_DEFAULT_TYPE 0x0080
>  #define SRVFLAG_PATH_REWRITE 0x0100
>  #define SRVFLAG_NO_PATH_REWRITE  0x0200
> +#define SRVFLAG_GZIP_STATIC  0x0400
>  #define SRVFLAG_LOCATION_FOUND   0x4000
>  #define SRVFLAG_LOCATION_NOT_FOUND 0x8000
>  
> Index: parse.y
> ===
> RCS file: /data/mirror/openbsd/cvs/src/usr.sbin/httpd/parse.y,v
> retrieving revision 1.127
> diff -u -p -r1.127 parse.y
> --- parse.y   24 Oct 2021 16:01:04 -  1.127
> +++ parse.y   25 Feb 2022 18:24:30 -
> @@ -141,7 +141,7 @@ typedef struct {
>  %token   TIMEOUT TLS TYPE TYPES HSTS MAXAGE SUBDOMAINS DEFAULT PRELOAD 
> REQUEST
>  %token   ERROR INCLUDE AUTHENTICATE WITH BLOCK DROP RETURN PASS REWRITE
>  %token   CA CLIENT CRL OPTIONAL PARAM FORWARDED FOUND NOT
> -%token   ERRDOCS
> +%token   ERRDOCS GZIPSTATIC
>  %token STRING
>  %token NUMBER
>  %typeport
> @@ -553,6 +553,7 @@ serveroptsl   : LISTEN ON STRING opttls po
>   | logformat
>   | fastcgi
>   | authenticate
> + | gzip_static
>   | filter
>   | LOCATION optfound optmatch STRING {
>   struct server   *s;
> @@ -1217,6 +1218,14 @@ fcgiport   : NUMBER{
>   }
>   ;
>  
> +gzip_static  : NO GZIPSTATIC {
> + srv->srv_conf.flags &= ~SRVFLAG_GZIP_STATIC;
> + }
> + | GZIPSTATIC{
> + srv->srv_conf.flags |= SRVFLAG_GZIP_STATIC;
> + }
> + ;
> +
>  tcpip: TCP '{' optnl tcpflags_l '}'
>   | TCP tcpflags
>   ;
> @@ -1441,6 +1450,7 @@ lookup(char *s)
>   { "fastcgi",FCGI },
>   { "forwarded",  FORWARDED },
>   { "found",  FOUND },
> + { "gzip-static",GZIPSTATIC },
>   { "hsts",   HSTS },
>   { "include",INCLUDE },
>   { "index",  INDEX },
> Index: server_file.c
> ===
> RCS file: /data/mirror/openbsd/cvs/src/usr.sbin/httpd/server_file.c,v
> retrieving revision 1.70
> diff -u -p -r1.70 server_file.c
> --- server_file.c 29 Apr 2021 18:23:07 -  1.70
> +++ server_file.c 26 Feb 2022 01:43:17 -
> @@ -223,26 +223,56 @@ server_file_request(struct httpd *env, s
>   const char  *errstr = NULL;
>   int  fd = -1, ret, code = 500;
>   size_t   bufsiz;
> + struct stat  gzst;
> + char gzpath[PATH_MAX];
>  
>   if ((ret = server_file_method(clt)) != 0) {
>   code = ret;
>   goto abort;
>   }
>  
> + media = media_find_config(env, srv_conf, path);
> +
>   if ((ret = server_file_modified_since(clt->clt_descreq, st)) != -1) {
>   /* send the header without a body */
> - media = media_find_config(env, srv_conf, path);
>   if ((ret = server_response_http(clt, ret, media, -1,
>   MINIMUM(time(NULL), st->st_mtim.tv_sec))) == -1)
>   goto fail;
>   goto done;
>   }
>  
> + /* change path to path.gz if necessary. */
> + if (srv_conf->flags & SRVFLAG_GZIP_STATIC) {
> + struct 

Re: [patch] httpd static gzip compression

2022-02-26 Thread prx
First, thank you for your interest!

> > Shouldn't we check for truncation on strlcpy and strlcat and goto fail
> > in that event?
> 
> With goto abort we get a 500 internal server error.
> 

Moreover, if the strlcpy and strlcat failed, then the file requested 
(gzpath) is obviously not found, and httpd switches back to the original path.

But to avoid unexpected behaviour, maybe something like this can be enough?

```
int gztoolong = 0;

/* check Accept-Encoding header */
key.kv_key = "Accept-Encoding";
r = kv_find(&req->http_headers, &key);

if (r != NULL) {
	if (strstr(r->kv_value, "gzip") != NULL) {
		/* append ".gz" to path and check existence */
		if (strlcpy(gzpath, path, sizeof(gzpath)) >= sizeof(gzpath))
			gztoolong = 1;
		if (strlcat(gzpath, ".gz", sizeof(gzpath)) >= sizeof(gzpath))
			gztoolong = 1;

		if ((gztoolong == 0) &&
		    (access(gzpath, R_OK) == 0) &&
		    (stat(gzpath, &gzst) == 0)) {
			path = gzpath;
			st = &gzst;
			kv_add(&resp->http_headers,
			    "Content-Encoding", "gzip");
		}
	}
}
```



Re: [patch] httpd static gzip compression

2022-02-25 Thread Alexander Bluhm
> Shouldn't we check for truncation on strlcpy and strlcat and goto fail
> in that event?

With goto abort we get a 500 internal server error.

Index: httpd.conf.5
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/usr.sbin/httpd/httpd.conf.5,v
retrieving revision 1.119
diff -u -p -r1.119 httpd.conf.5
--- httpd.conf.5	24 Oct 2021 16:01:04 -0000	1.119
+++ httpd.conf.5	25 Feb 2022 18:41:42 -0000
@@ -425,6 +425,12 @@ A variable that is set to a comma separa
 features in use
 .Pq omitted when TLS client verification is not in use .
 .El
+.It Ic gzip-static
+Enable static gzip compression to save bandwidth.
+.Pp
+If gzip encoding is accepted and if the requested file exists with
+an additional .gz suffix, use the compressed file instead and deliver
+it with content encoding gzip.
 .It Ic hsts Oo Ar option Oc
 Enable HTTP Strict Transport Security.
 Valid options are:
Index: httpd.h
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/usr.sbin/httpd/httpd.h,v
retrieving revision 1.158
diff -u -p -r1.158 httpd.h
--- httpd.h	24 Oct 2021 16:01:04 -0000	1.158
+++ httpd.h	25 Feb 2022 18:40:58 -0000
@@ -396,6 +396,7 @@ SPLAY_HEAD(client_tree, client);
 #define SRVFLAG_DEFAULT_TYPE	0x00800000
 #define SRVFLAG_PATH_REWRITE	0x01000000
 #define SRVFLAG_NO_PATH_REWRITE	0x02000000
+#define SRVFLAG_GZIP_STATIC	0x04000000
 #define SRVFLAG_LOCATION_FOUND	0x40000000
 #define SRVFLAG_LOCATION_NOT_FOUND 0x80000000
 
Index: parse.y
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/usr.sbin/httpd/parse.y,v
retrieving revision 1.127
diff -u -p -r1.127 parse.y
--- parse.y	24 Oct 2021 16:01:04 -0000	1.127
+++ parse.y	25 Feb 2022 18:24:30 -0000
@@ -141,7 +141,7 @@ typedef struct {
 %token TIMEOUT TLS TYPE TYPES HSTS MAXAGE SUBDOMAINS DEFAULT PRELOAD REQUEST
 %token ERROR INCLUDE AUTHENTICATE WITH BLOCK DROP RETURN PASS REWRITE
 %token CA CLIENT CRL OPTIONAL PARAM FORWARDED FOUND NOT
-%token ERRDOCS
+%token ERRDOCS GZIPSTATIC
 %token   STRING
 %token   NUMBER
 %type  port
@@ -553,6 +553,7 @@ serveroptsl : LISTEN ON STRING opttls po
| logformat
| fastcgi
| authenticate
+   | gzip_static
| filter
| LOCATION optfound optmatch STRING {
struct server   *s;
@@ -1217,6 +1218,14 @@ fcgiport : NUMBER{
}
;
 
+gzip_static: NO GZIPSTATIC {
+   srv->srv_conf.flags &= ~SRVFLAG_GZIP_STATIC;
+   }
+   | GZIPSTATIC{
+   srv->srv_conf.flags |= SRVFLAG_GZIP_STATIC;
+   }
+   ;
+
 tcpip  : TCP '{' optnl tcpflags_l '}'
| TCP tcpflags
;
@@ -1441,6 +1450,7 @@ lookup(char *s)
{ "fastcgi",FCGI },
{ "forwarded",  FORWARDED },
{ "found",  FOUND },
+   { "gzip-static",GZIPSTATIC },
{ "hsts",   HSTS },
{ "include",INCLUDE },
{ "index",  INDEX },
Index: server_file.c
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/usr.sbin/httpd/server_file.c,v
retrieving revision 1.70
diff -u -p -r1.70 server_file.c
--- server_file.c	29 Apr 2021 18:23:07 -0000	1.70
+++ server_file.c	26 Feb 2022 01:43:17 -0000
@@ -223,26 +223,56 @@ server_file_request(struct httpd *env, s
const char  *errstr = NULL;
int  fd = -1, ret, code = 500;
size_t   bufsiz;
+   struct stat  gzst;
+   char gzpath[PATH_MAX];
 
if ((ret = server_file_method(clt)) != 0) {
code = ret;
goto abort;
}
 
+   media = media_find_config(env, srv_conf, path);
+
if ((ret = server_file_modified_since(clt->clt_descreq, st)) != -1) {
/* send the header without a body */
-   media = media_find_config(env, srv_conf, path);
if ((ret = server_response_http(clt, ret, media, -1,
MINIMUM(time(NULL), st->st_mtim.tv_sec))) == -1)
goto fail;
goto done;
}
 
+   /* change path to path.gz if necessary. */
+   if (srv_conf->flags & SRVFLAG_GZIP_STATIC) {
+   struct http_descriptor  *req = clt->clt_descreq;
+   struct http_descriptor  *resp = clt->clt_descresp;
+   struct kv   *r, key;
+
+   /* check Accept-Encoding header */
+   key.kv_key = "Accept-Encoding";
+

Re: [patch] httpd static gzip compression

2022-02-25 Thread Tracey Emery
On Fri, Feb 25, 2022 at 07:47:38PM +0100, Alexander Bluhm wrote:
> On Fri, Feb 25, 2022 at 11:00:22AM +0100, prx wrote:
> > After a few months, I reupload the patch to enable httpd static 
> > compression using "location {}" instructions.
> > 
> > I use it without any issue on my own website and to serve 
> > https://webzine.pufy.cafe.
> > Anyone else tried it?
> 
> I just added it to bluhm.genua.de.  There I deliver large html
> tables.  For http://bluhm.genua.de/regress/results/regress.html it
> reduces bandwidth by a factor of 20.
> 
> > +.It Ic gzip_static
> > +Enable static gzip compression.
> > +.Pp
> > +When a file is requested, serves the file with .gz added to its path if it exists.
> 
> Mention for what this is useful.
> 
> > +#define SERVER_DEFAULT_GZIP_STATIC 0
> 
> A define for a default 0 is overkill.
> 
> > +   int  gzip_static;
> 
> Better use a flag than an int for a boolean.
> 
> > +   { "gzip_static",GZIPSTATIC },
> 
> We do not have keywords with _ but one with -
> 
> > +	/* change path to path.gz if necessary. */
> > +	if (srv_conf->gzip_static) {
> > +		struct http_descriptor	*req = clt->clt_descreq;
> > +		struct http_descriptor	*resp = clt->clt_descresp;
> > +		struct stat		 gzst;
> > +		struct kv		*r, key;
> > +		char			 gzpath[PATH_MAX];
> > +
> > +		/* check Accept-Encoding header */
> > +		key.kv_key = "Accept-Encoding";
> > +		r = kv_find(&req->http_headers, &key);
> > +
> > +		if (r != NULL) {
> > +			if (strstr(r->kv_value, "gzip") != NULL) {
> > +				/* append ".gz" to path and check existence */
> > +				strlcpy(gzpath, path, sizeof(gzpath));
> > +				strlcat(gzpath, ".gz", sizeof(gzpath));
> > +
> > +				if ((access(gzpath, R_OK) == 0) &&
> > +				    (stat(gzpath, &gzst) == 0)) {
> > +					path = gzpath;
> > +					st = &gzst;
> 
> Outside of a block you must not use pointers to variables that are
> defined inside that block.
> 
> > +					kv_add(&resp->http_headers,
> > +					    "Content-Encoding", "gzip");
> > +				}
> > +			}
> > +		}
> > +	}
> > +
> > /* Now open the file, should be readable or we have another problem */
> > if ((fd = open(path, O_RDONLY)) == -1)
> > goto abort;
> 
> With all that fixed, diff looks like this and works fine in my setup.
> 
> ok?
> 
> bluhm
> 
> Index: httpd.conf.5
> ===
> RCS file: /data/mirror/openbsd/cvs/src/usr.sbin/httpd/httpd.conf.5,v
> retrieving revision 1.119
> diff -u -p -r1.119 httpd.conf.5
> --- httpd.conf.5  24 Oct 2021 16:01:04 -  1.119
> +++ httpd.conf.5  25 Feb 2022 18:41:42 -
> @@ -425,6 +425,12 @@ A variable that is set to a comma separa
>  features in use
>  .Pq omitted when TLS client verification is not in use .
>  .El
> +.It Ic gzip-static
> +Enable static gzip compression to save bandwidth.
> +.Pp
> +If gzip encoding is accepted and if the requested file exists with
> +an additional .gz suffix, use the compressed file instead and deliver
> +it with content encoding gzip.
>  .It Ic hsts Oo Ar option Oc
>  Enable HTTP Strict Transport Security.
>  Valid options are:
> Index: httpd.h
> ===
> RCS file: /data/mirror/openbsd/cvs/src/usr.sbin/httpd/httpd.h,v
> retrieving revision 1.158
> diff -u -p -r1.158 httpd.h
> --- httpd.h   24 Oct 2021 16:01:04 -  1.158
> +++ httpd.h   25 Feb 2022 18:40:58 -
> @@ -396,6 +396,7 @@ SPLAY_HEAD(client_tree, client);
>  #define SRVFLAG_DEFAULT_TYPE 0x0080
>  #define SRVFLAG_PATH_REWRITE 0x0100
>  #define SRVFLAG_NO_PATH_REWRITE  0x0200
> +#define SRVFLAG_GZIP_STATIC  0x0400
>  #define SRVFLAG_LOCATION_FOUND   0x4000
>  #define SRVFLAG_LOCATION_NOT_FOUND 0x8000
>  
> Index: parse.y
> ===
> RCS file: /data/mirror/openbsd/cvs/src/usr.sbin/httpd/parse.y,v
> retrieving revision 1.127
> diff -u -p -r1.127 parse.y
> --- parse.y   24 Oct 2021 16:01:04 -  1.127
> +++ parse.y   25 Feb 2022 18:24:30 -
> @@ -141,7 +141,7 @@ typedef struct {
>  %token   TIMEOUT TLS TYPE TYPES HSTS MAXAGE SUBDOMAINS DEFAULT PRELOAD 
> REQUEST
>  %token   ERROR INCLUDE AUTHENTICATE WITH BLOCK DROP RETURN PASS REWRITE
>  %token   CA CLIENT CRL OPTIONAL PARAM FORWARDED FOUND NOT
> -%token   ERRDOCS
> +%token   ERRDOCS GZIPSTATIC
>  %token STRING
>  %token NUMBER
>  %typeport
> @@ -553,6 +553,7 @@ serveroptsl   : LISTEN ON STRING opttls po
> 

Re: [patch] httpd static gzip compression

2022-02-25 Thread Alexander Bluhm
On Fri, Feb 25, 2022 at 11:00:22AM +0100, prx wrote:
> After a few months, I reupload the patch to enable httpd static 
> compression using "location {}" instructions.
> 
> I use it without any issue on my own website and to serve 
> https://webzine.pufy.cafe.
> Anyone else tried it?

I just added it to bluhm.genua.de.  There I deliver large html
tables.  For http://bluhm.genua.de/regress/results/regress.html it
reduces bandwidth by a factor of 20.

> +.It Ic gzip_static
> +Enable static gzip compression.
> +.Pp
> +When a file is requested, serves the file with .gz added to its path if it exists.

Mention for what this is useful.

> +#define SERVER_DEFAULT_GZIP_STATIC 0

A define for a default 0 is overkill.

> + int  gzip_static;

Better use a flag than an int for a boolean.
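
A minimal sketch of the flag approach, mirroring the set and clear
actions a "gzip-static" / "no gzip-static" grammar needs.  The flag
values, struct conf and the helper names are hypothetical and chosen for
illustration; they are not the ones from httpd.h.

```c
#include <stdint.h>

/* Hypothetical flag values for this sketch; not the ones in httpd.h. */
#define FLAG_AUTO_INDEX		0x01
#define FLAG_GZIP_STATIC	0x02

struct conf {
	uint32_t flags;
};

/* "gzip-static" in the config sets the bit. */
void
conf_enable_gzip(struct conf *c)
{
	c->flags |= FLAG_GZIP_STATIC;
}

/* "no gzip-static" clears it again, leaving other bits alone. */
void
conf_disable_gzip(struct conf *c)
{
	c->flags &= ~(uint32_t)FLAG_GZIP_STATIC;
}

/* Test the bit without touching the rest of the word. */
int
conf_gzip_enabled(const struct conf *c)
{
	return (c->flags & FLAG_GZIP_STATIC) != 0;
}
```

Compared with a separate int member, this costs no extra storage and
needs no default define, since an unset bit already means "off".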

> + { "gzip_static",GZIPSTATIC },

We do not have keywords with _ but one with -

> +	/* change path to path.gz if necessary. */
> +	if (srv_conf->gzip_static) {
> +		struct http_descriptor	*req = clt->clt_descreq;
> +		struct http_descriptor	*resp = clt->clt_descresp;
> +		struct stat		 gzst;
> +		struct kv		*r, key;
> +		char			 gzpath[PATH_MAX];
> +
> +		/* check Accept-Encoding header */
> +		key.kv_key = "Accept-Encoding";
> +		r = kv_find(&req->http_headers, &key);
> +
> +		if (r != NULL) {
> +			if (strstr(r->kv_value, "gzip") != NULL) {
> +				/* append ".gz" to path and check existence */
> +				strlcpy(gzpath, path, sizeof(gzpath));
> +				strlcat(gzpath, ".gz", sizeof(gzpath));
> +
> +				if ((access(gzpath, R_OK) == 0) &&
> +				    (stat(gzpath, &gzst) == 0)) {
> +					path = gzpath;
> +					st = &gzst;

Outside of a block you must not use pointers to variables that are
defined inside that block.
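
A small sketch of the safe pattern: the buffer is declared at function
scope, so a pointer taken inside the inner block is still valid after
the block is left.  The function pick_size() and its parameters are
hypothetical and exist only to show the scoping.

```c
#include <string.h>
#include <sys/stat.h>

/*
 * Pick the stat buffer to use after the block ends.  gzst lives at
 * function scope, so the pointer stored in st stays valid once the
 * inner block is left.  Declaring gzst inside the if-block instead
 * would leave st dangling after the closing brace.
 */
long long
pick_size(const struct stat *orig, int use_gz, long long gz_size)
{
	struct stat		 gzst;		/* function scope */
	const struct stat	*st = orig;

	memset(&gzst, 0, sizeof(gzst));

	if (use_gz) {
		gzst.st_size = gz_size;	/* stands in for stat(gzpath, &gzst) */
		st = &gzst;
	}

	/* st is dereferenced outside the block */
	return (long long)st->st_size;
}
```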

> +					kv_add(&resp->http_headers,
> +					    "Content-Encoding", "gzip");
> +				}
> +			}
> +		}
> +	}
> +
>   /* Now open the file, should be readable or we have another problem */
>   if ((fd = open(path, O_RDONLY)) == -1)
>   goto abort;

With all that fixed, diff looks like this and works fine in my setup.

ok?

bluhm

Index: httpd.conf.5
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/usr.sbin/httpd/httpd.conf.5,v
retrieving revision 1.119
diff -u -p -r1.119 httpd.conf.5
--- httpd.conf.5	24 Oct 2021 16:01:04 -0000	1.119
+++ httpd.conf.5	25 Feb 2022 18:41:42 -0000
@@ -425,6 +425,12 @@ A variable that is set to a comma separa
 features in use
 .Pq omitted when TLS client verification is not in use .
 .El
+.It Ic gzip-static
+Enable static gzip compression to save bandwidth.
+.Pp
+If gzip encoding is accepted and if the requested file exists with
+an additional .gz suffix, use the compressed file instead and deliver
+it with content encoding gzip.
 .It Ic hsts Oo Ar option Oc
 Enable HTTP Strict Transport Security.
 Valid options are:
Index: httpd.h
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/usr.sbin/httpd/httpd.h,v
retrieving revision 1.158
diff -u -p -r1.158 httpd.h
--- httpd.h	24 Oct 2021 16:01:04 -0000	1.158
+++ httpd.h	25 Feb 2022 18:40:58 -0000
@@ -396,6 +396,7 @@ SPLAY_HEAD(client_tree, client);
 #define SRVFLAG_DEFAULT_TYPE	0x00800000
 #define SRVFLAG_PATH_REWRITE	0x01000000
 #define SRVFLAG_NO_PATH_REWRITE	0x02000000
+#define SRVFLAG_GZIP_STATIC	0x04000000
 #define SRVFLAG_LOCATION_FOUND	0x40000000
 #define SRVFLAG_LOCATION_NOT_FOUND 0x80000000
 
Index: parse.y
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/usr.sbin/httpd/parse.y,v
retrieving revision 1.127
diff -u -p -r1.127 parse.y
--- parse.y	24 Oct 2021 16:01:04 -0000	1.127
+++ parse.y	25 Feb 2022 18:24:30 -0000
@@ -141,7 +141,7 @@ typedef struct {
 %token TIMEOUT TLS TYPE TYPES HSTS MAXAGE SUBDOMAINS DEFAULT PRELOAD REQUEST
 %token ERROR INCLUDE AUTHENTICATE WITH BLOCK DROP RETURN PASS REWRITE
 %token CA CLIENT CRL OPTIONAL PARAM FORWARDED FOUND NOT
-%token ERRDOCS
+%token ERRDOCS GZIPSTATIC
 %token   STRING
 %token   NUMBER
 %type  port
@@ -553,6 +553,7 @@ serveroptsl : LISTEN ON STRING opttls po
| logformat
| fastcgi
| authenticate
+   | gzip_static
| filter
| LOCATION optfound optmatch STRING {
struct server   *s;
@@ -1217,6 +1218,14 @@ fcgiport : NUMBER{

Re: [patch] httpd static gzip compression

2022-02-25 Thread prx
Hello,

After a few months, I reupload the patch to enable httpd static 
compression using "location {}" instructions.

I use it without any issue on my own website and to serve 
https://webzine.pufy.cafe.
Anyone else tried it?

I emphasize that it is the admin's responsibility to enable this
feature or not, and the webmaster's to deliver gzipped files.

Regards.

prx


* Ingo Schwarze wrote on [05-11-2021 13:37:15 +0000]:
> Hi Theo,
> 
> Theo de Raadt wrote on Thu, Nov 04, 2021 at 08:27:47AM -0600:
> > prx  wrote:
> >> On 2021/11/04 14:21, prx wrote:
> 
> >>> The attached patch adds support for static gzip compression.
> >>> 
> >>> In other words, if a client supports gzip compression, when "file" is
> >>> requested, httpd will check if "file.gz" is available to serve.
> 
> >> This diff doesn't compress "on the fly".
> >> It's up to the webmaster to compress files **before** serving them.
> 
> > Does any other program work this way?
> 
> Yes.  The man(1) program does.  At least on the vast majority of
> Linux systems (at least those using the man-db implementation
> of man(1)), on FreeBSD, and on DragonFly BSD.
> 
> Those systems store most manual pages as gzipped man(7) and mdoc(7)
> files, and man(1) decompresses them every time a user wants to look
> at one of them.  You say "man ls", and what you get is actually
> /usr/share/man/man1/ls.1.gz or something like that.
> 
> For man(1), that is next to useless because du -sh /usr/share/man =
> 42.6M uncompressed.  But it has repeatedly caused bugs in the past.
> I would love to remove the feature from mandoc, but even though it is
> rarely used in OpenBSD (some ports installed gzipped manuals in the
> past, but i think the ports tree has been clean now for quite some
> time; you might still need the feature when installing software
> or unpacking foreign manual page packages without using ports)
> it would be a bad idea to remove it because it is too widely used
> elsewhere.  Note that even the old BSD man(1) supported it.
> 
> > Where you request one filename, and it gives you another?
> 
> You did not ask what web servers do, but we are discussing a patch to
> a webserver.  For this reason, let me note in passing that even some
> otherwise extremely useful sites get it very wrong the other way round:
> 
>  $ ftp https://manpages.debian.org/bullseye/coreutils/ls.1.en.gz
> Trying 130.89.148.77...
> Requesting https://manpages.debian.org/bullseye/coreutils/ls.1.en.gz
> 100% |**|  8050   00:00   
>  
> 8050 bytes received in 0.00 seconds (11.74 MB/s)
>  $ file ls.1.en.gz
> ls.1.en.gz: troff or preprocessor input text
>  $ grep -A 1 '^.SH NAME' ls.1.en.gz  
> .SH NAME
> ls \- list directory contents
>  $ gunzip ls.1.en.gz
> gunzip: ls.1.en.gz: unrecognized file format
> 
> > I have a difficult time understanding why gzip has to sneak its way
> > into everything.
> > 
> > I always prefer software that does precisely what I expect it to do.
> 
> Certainly.
> 
> I have no strong opinion whether a webserver qualifies as "everything",
> though, nor did i look at the diff.  While all manpages are small in the
> real world, some web servers may have to store huge amounts of data that
> clients might request, so disk space might occasionally be an issue on
> a web server even in 2021.  Also, some websites deliver huge amounts of
> data to the client even when the user merely asked for some text (not sure
> such sites would consider running OpenBSD httpd(8), but whatever :) - when
> browsing the web, bandwidth is still occasionally an issue even in 2021,
> even though that is a rather absurd fact.
> 
> Yours,
>   Ingo
Index: httpd.conf.5
===================================================================
RCS file: /cvs/src/usr.sbin/httpd/httpd.conf.5,v
retrieving revision 1.119
diff -u -p -r1.119 httpd.conf.5
--- httpd.conf.5	24 Oct 2021 16:01:04 -0000	1.119
+++ httpd.conf.5	5 Nov 2021 14:04:22 -0000
@@ -425,6 +425,10 @@ A variable that is set to a comma separa
 features in use
 .Pq omitted when TLS client verification is not in use .
 .El
+.It Ic gzip_static
+Enable static gzip compression.
+.Pp
+When a file is requested, serves the file with .gz added to its path if it exists.
 .It Ic hsts Oo Ar option Oc
 Enable HTTP Strict Transport Security.
 Valid options are:
Index: httpd.h
===================================================================
RCS file: /cvs/src/usr.sbin/httpd/httpd.h,v
retrieving revision 1.158
diff -u -p -r1.158 httpd.h
--- httpd.h	24 Oct 2021 16:01:04 -0000	1.158
+++ httpd.h	5 Nov 2021 14:04:22 -0000
@@ -87,6 +87,7 @@
 #define SERVER_DEF_TLS_LIFETIME(2 * 3600)
 #define SERVER_MIN_TLS_LIFETIME(60)
 #define SERVER_MAX_TLS_LIFETIME(24 * 3600)
+#define SERVER_DEFAULT_GZIP_STATIC 0
 
 #define MEDIATYPE_NAMEMAX  128 /* file name extension */
 #define MEDIATYPE_TYPEMAX  64  /* length of 

Re: [patch] httpd static gzip compression

2021-11-05 Thread Steffen Nurpmeso
  ...
 |You would be asking for
 |
 |https://exmaple.com/whatever/ls.1
 |
 |and with Accept-Encoding: gzip in the http header, and the
 |webserver would then look if it has a file
 |
 |/whatever/ls.1.gz
 |
 |(instead of without .gz) in its document tree and send you that, with 
 |"Content-Encoding: gzip" http header.
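
The negotiation step described above hinges on reading the
Accept-Encoding value.  The patch in this thread uses strstr(), which
also matches values such as "x-gzip".  A slightly stricter sketch that
compares whole tokens follows; the function name accepts_gzip() is
hypothetical, and quality values are deliberately ignored.

```c
#include <ctype.h>
#include <string.h>
#include <strings.h>

/*
 * Return 1 if the Accept-Encoding value lists "gzip" as a coding,
 * 0 otherwise.  Quality values (";q=...") are ignored here, so
 * "gzip;q=0" still counts; a full parser would reject it.
 */
int
accepts_gzip(const char *value)
{
	const char *p = value;

	while (*p != '\0') {
		size_t len;

		/* skip separators and leading whitespace */
		while (*p == ',' || isspace((unsigned char)*p))
			p++;
		/* the coding name runs up to ',', ';' or whitespace */
		len = strcspn(p, ",; \t");
		if (len == 4 && strncasecmp(p, "gzip", 4) == 0)
			return 1;
		p += len;
		/* skip any parameters up to the next comma */
		p += strcspn(p, ",");
	}
	return 0;
}
```

Only when this returns 1 and the .gz file exists would the server send
the compressed file together with "Content-Encoding: gzip".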

As an outsider i find this thread very amusing.
As you all know, the normal approach is to have a cache directory
where some "compress" module stores the compressed version on the
fly if the client asks for some file, accepts compression, and the
compressed version does not yet exist.  Funnily, i once did not even
get an answer when i asked for static compression, since on-the-fly
compression was already available ("so what"); i have forgotten
which webserver that was.  Cleanup via cron anyhow.

Now something for Theo, from the webserver i use.

  #if defined HAVE_ZLIB_H && defined HAVE_LIBZ
  # define USE_ZLIB
  # include <zlib.h>
  #endif
  #ifndef Z_DEFAULT_COMPRESSION
  #define Z_DEFAULT_COMPRESSION -1
  #endif
  #ifndef MAX_WBITS
  #define MAX_WBITS 15
  #endif
  
  #if defined HAVE_BZLIB_H && defined HAVE_LIBBZ2
  # define USE_BZ2LIB
  /* we don't need stdio interface */
  # define BZ_NO_STDIO
  # include <bzlib.h>
  #endif
  
  #if defined HAVE_BROTLI_ENCODE_H && defined HAVE_BROTLI
  # define USE_BROTLI
  # include <brotli/encode.h>
  #endif
  
  #if defined HAVE_ZSTD_H && defined HAVE_ZSTD
  # define USE_ZSTD
  # include <zstd.h>
  #endif
  
  #if defined HAVE_SYS_MMAN_H && defined HAVE_MMAP && defined ENABLE_MMAP
  #define USE_MMAP

--steffen



Re: [patch] httpd static gzip compression

2021-11-05 Thread Crystal Kolipe
On Fri, Nov 05, 2021 at 08:24:21AM -0600, Theo de Raadt wrote:
> prx  wrote:
> 
> > I think this remark should be placed into perspective.
> > 
> > When a file is requested, its gzipped version is sent if:
> > * The client asks for it with the appropriate header.
> > * The server admin configured httpd to do so **and** compressed files.
> 
> and if the gzip'd file does not contain the same contents as the
> ungzip'd file, then two different clients may get different results.

Just as big-endian and little-endian architectures can be fed different files
from the same ISO-9660 filesystem.

The issue you describe certainly exists, but it's not without precedent.

It could even be considered a feature: if the client supports compression,
then send more verbose content.



Re: [patch] httpd static gzip compression

2021-11-05 Thread Sebastian Benoit
Theo de Raadt(dera...@openbsd.org) on 2021.11.05 08:24:21 -0600:
> prx  wrote:
> 
> > I think this remark should be placed into perspective.
> > 
> > When a file is requested, its gzipped version is sent if:
> > * The client asks for it with the appropriate header.
> > * The server admin configured httpd to do so **and** compressed files.
> 
> and if the gzip'd file does not contain the same contents as the
> ungzip'd file, then two different clients may get different results.

Yes, but it's the responsibility of whoever puts the content there to do the
correct thing.

If i put a html file into a web directory and call it foo.jpg i would not
expect a sensible result either.




Re: [patch] httpd static gzip compression

2021-11-05 Thread Sebastian Benoit
Ingo Schwarze(schwa...@usta.de) on 2021.11.05 14:37:15 +0100:
> Hi Theo,
> 
> Theo de Raadt wrote on Thu, Nov 04, 2021 at 08:27:47AM -0600:
> > prx  wrote:
> >> On 2021/11/04 14:21, prx wrote:
> 
> >>> The attached patch adds support for static gzip compression.
> >>> 
> >>> In other words, if a client supports gzip compression, when "file" is
> >>> requested, httpd will check if "file.gz" is available to serve.
> 
> >> This diff doesn't compress "on the fly".
> >> It's up to the webmaster to compress files **before** serving them.
> 
> > Does any other program work this way?
> 
> Yes.  The man(1) program does.  At least on the vast majority of
> Linux systems (at least those using the man-db implementation
> of man(1)), on FreeBSD, and on DragonFly BSD.
> 
> Those systems store most manual pages as gzipped man(7) and mdoc(7)
> files, and man(1) decompresses them every time a user wants to look
> at one of them.  You say "man ls", and what you get is actually
> /usr/share/man/man1/ls.1.gz or something like that.
> 
> For man(1), that is next to useless because du -sh /usr/share/man =
> 42.6M uncompressed.  But it has repeatedly caused bugs in the past.
> I would love to remove the feature from mandoc, but even though it is
> rarely used in OpenBSD (some ports installed gzipped manuals in the
> past, but i think the ports tree has been clean now for quite some
> time; you might still need the feature when installing software
> or unpacking foreign manual page packages without using ports)
> it would be a bad idea to remove it because it is too widely used
> elsewhere.  Note that even the old BSD man(1) supported it.
> 
> > Where you request one filename, and it gives you another?
> 
> You did not ask what web servers do, but we are discussing a patch to
> a webserver.  For this reason, let me note in passing that even some
> otherwise extremely useful sites get it very wrong the other way round:
> 
>  $ ftp https://manpages.debian.org/bullseye/coreutils/ls.1.en.gz
> Trying 130.89.148.77...
> Requesting https://manpages.debian.org/bullseye/coreutils/ls.1.en.gz
> 100% |**|  8050   00:00   
>  
> 8050 bytes received in 0.00 seconds (11.74 MB/s)
>  $ file ls.1.en.gz
> ls.1.en.gz: troff or preprocessor input text
>  $ grep -A 1 '^.SH NAME' ls.1.en.gz  
> .SH NAME
> ls \- list directory contents
>  $ gunzip ls.1.en.gz
> gunzip: ls.1.en.gz: unrecognized file format

But with this patch, you are not asking the webserver for 
https://manpages.debian.org/bullseye/coreutils/ls.1.en.gz

You would be asking for

https://example.com/whatever/ls.1

with Accept-Encoding: gzip in the HTTP request headers, and the
webserver would then check whether it has a file

/whatever/ls.1.gz

(instead of the one without .gz) in its document tree and send you that, with
a "Content-Encoding: gzip" HTTP header.
Because of that header, your client knows that the data is gzipped
and will decompress it before writing the file to the output.

I.e. there is no such problem (unless the patch has a bug).
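
The lookup described above can be sketched in C as follows. This is only an
illustration and is not taken from the posted diff: the names accepts_gzip()
and gz_path() are hypothetical, and the header parsing is deliberately
simplified (parameters such as ";q=0.8" are skipped rather than interpreted,
so "gzip;q=0" would still count as acceptable here).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Compare the n bytes at s, case-insensitively, with the lowercase string lit. */
static int
token_equals(const char *s, size_t n, const char *lit)
{
	size_t i;

	if (strlen(lit) != n)
		return 0;
	for (i = 0; i < n; i++)
		if (tolower((unsigned char)s[i]) != lit[i])
			return 0;
	return 1;
}

/*
 * Hypothetical helper: return 1 if an Accept-Encoding header value
 * lists the "gzip" coding, 0 otherwise.  Parameters after ';' are
 * ignored (assumption: q-values are not evaluated in this sketch).
 */
int
accepts_gzip(const char *value)
{
	const char *p = value;

	if (value == NULL)
		return 0;
	while (*p != '\0') {
		const char *end, *semi;
		size_t len;

		/* skip separators and leading whitespace */
		while (*p == ',' || *p == ' ' || *p == '\t')
			p++;
		if (*p == '\0')
			break;
		/* one list element runs up to the next comma */
		end = strchr(p, ',');
		if (end == NULL)
			end = p + strlen(p);
		/* the coding name stops at ';' if parameters follow */
		semi = memchr(p, ';', (size_t)(end - p));
		len = (size_t)((semi != NULL ? semi : end) - p);
		/* trim trailing whitespace */
		while (len > 0 && (p[len - 1] == ' ' || p[len - 1] == '\t'))
			len--;
		if (token_equals(p, len, "gzip"))
			return 1;
		p = end;
	}
	return 0;
}

/* Write "<path>.gz" into buf; return 0 on success, -1 if it does not fit. */
int
gz_path(const char *path, char *buf, size_t buflen)
{
	int n;

	assert(path != NULL && buf != NULL);
	n = snprintf(buf, buflen, "%s.gz", path);
	return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

A server following this scheme would call accepts_gzip() on the request
header, build the candidate name with gz_path(), and add
"Content-Encoding: gzip" to the response only when that file is actually
served.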

/Benno

> 
> > I have a difficult time understanding why gzip has to sneak it's way
> > into everything.
> > 
> > I always prefer software that does precisely what I expect it to do.
> 
> Certainly.
> 
> I have no strong opinion whether a webserver qualifies as "everything",
> though, nor did i look at the diff.  While all manpages are small in the
> real world, some web servers may have to store huge amounts of data that
> clients might request, so disk space might occasionally be an issue on
> a web server even in 2021.  Also, some websites deliver huge amounts of
> data to the client even when the user merely asked for some text (not sure
> such sites would consider running OpenBSD httpd(8), but whatever :) - when
> browsing the web, bandwidth is still occasionally an issue even in 2021,
> even though that is a rather absurd fact.
> 
> Yours,
>   Ingo
> 



Re: [patch] httpd static gzip compression

2021-11-05 Thread Theo de Raadt
prx  wrote:

> I think this remark should be placed into perspective.
> 
> When a file is requested, its gzipped version is send if : 
> * The client ask for it with appropriate header.
> * The server admin configured httpd to do so **and** compressed files.

and if the gzip'd file does not contain the same contents as the ungzip'd
file, then two different clients may get different results.





Re: [patch] httpd static gzip compression

2021-11-05 Thread prx
* Theo de Raadt  le [04-11-2021 08:27:47 -0600]:
> prx  wrote:
> 
> > * Stuart Henderson  le [04-11-2021 14:09:39 +]:
> > > On 2021/11/04 14:21, prx wrote:
> > > > Hello,
> > > > The attached patch add support for static gzip compression.
> > > > 
> > > > In other words, if a client support gzip compression, when "file" is
> > > > requested, httpd will check if "file.gz" is avaiable to serve.
> > > > 
> > > > Regards.
> > > > 
> > > > prx
> > > 
> > > btw this was rejected before,
> > > 
> > > https://github.com/reyk/httpd/issues/21
> > > 
> > 
> > This diff doesn't compress "on the fly".
> > It's up to the webmaster to compress files **before** serving them.
> 
> Does any other program work this way?
> 
> Where you request one filename, and it gives you another?
> 
> I have a difficult time understanding why gzip has to sneak it's way
> into everything.
> 
> I always prefer software that does precisely what I expect it to do.
> 

I think this remark should be placed into perspective.

When a file is requested, its gzipped version is sent if:
* The client asks for it with the appropriate header.
* The server admin configured httpd to do so **and** compressed the files.

In this situation, the client does get the expected file.

Of course, the admin has the responsibility to give the same content in "file" 
and "file.gz".

The cost for the server is negligible, and it reduces bandwidth usage (for
both client and server).

Following the previous comments, find below a modified patch that enables
static gzip compression on a location match, i.e.:

server "foo" {
#[... snip ... ]
location "/*.html" { gzip_static }
location "/*.css" { gzip_static }
}

Regards.


Index: httpd.conf.5
===
RCS file: /cvs/src/usr.sbin/httpd/httpd.conf.5,v
retrieving revision 1.119
diff -u -p -r1.119 httpd.conf.5
--- httpd.conf.5 24 Oct 2021 16:01:04 -  1.119
+++ httpd.conf.5 5 Nov 2021 14:04:22 -
@@ -425,6 +425,10 @@ A variable that is set to a comma separa
 features in use
 .Pq omitted when TLS client verification is not in use .
 .El
+.It Ic gzip_static
+Enable static gzip compression.
+.Pp
+When a file is requested, serves the file with .gz added to its path if it exists.
 .It Ic hsts Oo Ar option Oc
 Enable HTTP Strict Transport Security.
 Valid options are:
Index: httpd.h
===
RCS file: /cvs/src/usr.sbin/httpd/httpd.h,v
retrieving revision 1.158
diff -u -p -r1.158 httpd.h
--- httpd.h 24 Oct 2021 16:01:04 -  1.158
+++ httpd.h 5 Nov 2021 14:04:22 -
@@ -87,6 +87,7 @@
 #define SERVER_DEF_TLS_LIFETIME (2 * 3600)
 #define SERVER_MIN_TLS_LIFETIME (60)
 #define SERVER_MAX_TLS_LIFETIME (24 * 3600)
+#define SERVER_DEFAULT_GZIP_STATIC 0
 
 #define MEDIATYPE_NAMEMAX  128 /* file name extension */
 #define MEDIATYPE_TYPEMAX  64  /* length of type/subtype */
@@ -546,6 +547,7 @@ struct server_config {
struct server_fcgiparams fcgiparams;
int  fcgistrip;
char errdocroot[HTTPD_ERRDOCROOT_MAX];
+   int  gzip_static;
 
TAILQ_ENTRY(server_config) entry;
 };
Index: parse.y
===
RCS file: /cvs/src/usr.sbin/httpd/parse.y,v
retrieving revision 1.127
diff -u -p -r1.127 parse.y
--- parse.y 24 Oct 2021 16:01:04 -  1.127
+++ parse.y 5 Nov 2021 14:04:22 -
@@ -141,7 +141,7 @@ typedef struct {
 %token TIMEOUT TLS TYPE TYPES HSTS MAXAGE SUBDOMAINS DEFAULT PRELOAD REQUEST
 %token ERROR INCLUDE AUTHENTICATE WITH BLOCK DROP RETURN PASS REWRITE
 %token CA CLIENT CRL OPTIONAL PARAM FORWARDED FOUND NOT
-%token ERRDOCS
+%token ERRDOCS GZIPSTATIC
 %token   STRING
 %token   NUMBER
 %type  port
@@ -553,6 +553,7 @@ serveroptsl : LISTEN ON STRING opttls po
| logformat
| fastcgi
| authenticate
+   | gzip_static
| filter
| LOCATION optfound optmatch STRING {
struct server   *s;
@@ -1217,6 +1218,14 @@ fcgiport : NUMBER{
}
;
 
+gzip_static: NO GZIPSTATIC {
+   srv->srv_conf.gzip_static = SERVER_DEFAULT_GZIP_STATIC;
+   }
+   | GZIPSTATIC {
+   srv->srv_conf.gzip_static = 1;
+   }
+   ;
+
 tcpip  : TCP '{' optnl tcpflags_l '}'
| TCP tcpflags
;
@@ -1441,6 +1450,7 @@ lookup(char *s)
{ "fastcgi",FCGI },
{ "forwarded",  FORWARDED },
{ "found",  FOUND },
+   { "gzip_static",GZIPSTATIC },
{ "hsts",   HSTS },
{ "include",

Re: [patch] httpd static gzip compression

2021-11-05 Thread Ingo Schwarze
Hi Theo,

Theo de Raadt wrote on Thu, Nov 04, 2021 at 08:27:47AM -0600:
> prx  wrote:
>> On 2021/11/04 14:21, prx wrote:

>>> The attached patch add support for static gzip compression.
>>> 
>>> In other words, if a client support gzip compression, when "file" is
>>> requested, httpd will check if "file.gz" is avaiable to serve.

>> This diff doesn't compress "on the fly".
>> It's up to the webmaster to compress files **before** serving them.

> Does any other program work this way?

Yes.  The man(1) program does.  At least on the vast majority of
Linux systems (at least those using the man-db implementation
of man(1)), on FreeBSD, and on DragonFly BSD.

Those systems store most manual pages as gzipped man(7) and mdoc(7)
files, and man(1) decompresses them every time a user wants to look
at one of them.  You say "man ls", and what you get is actually
/usr/share/man/man1/ls.1.gz or something like that.

For man(1), that is next to useless because du -sh /usr/share/man =
42.6M uncompressed.  But it has repeatedly caused bugs in the past.
I would love to remove the feature from mandoc, but even though it is
rarely used in OpenBSD (some ports installed gzipped manuals in the
past, but i think the ports tree has been clean now for quite some
time; you might still need the feature when installing software
or unpacking foreign manual page packages without using ports)
it would be a bad idea to remove it because it is too widely used
elsewhere.  Note that even the old BSD man(1) supported it.

> Where you request one filename, and it gives you another?

You did not ask what web servers do, but we are discussing a patch to
a webserver.  For this reason, let me note in passing that even some
otherwise extremely useful sites get it very wrong the other way round:

 $ ftp https://manpages.debian.org/bullseye/coreutils/ls.1.en.gz
Trying 130.89.148.77...
Requesting https://manpages.debian.org/bullseye/coreutils/ls.1.en.gz
100% |**|  8050   00:00
8050 bytes received in 0.00 seconds (11.74 MB/s)
 $ file ls.1.en.gz
ls.1.en.gz: troff or preprocessor input text
 $ grep -A 1 '^.SH NAME' ls.1.en.gz  
.SH NAME
ls \- list directory contents
 $ gunzip ls.1.en.gz
gunzip: ls.1.en.gz: unrecognized file format
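
The mismatch in the transcript above (a name ending in .gz, content that is
plain troff) can be detected by looking at the first two bytes: RFC 1952
specifies that a gzip member starts with 0x1f 0x8b. A minimal sketch, with
looks_like_gzip() as a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return 1 if the buffer starts with the gzip magic bytes
 * (ID1 = 0x1f, ID2 = 0x8b per RFC 1952), 0 otherwise.
 * This checks the header only; it does not validate the stream.
 */
int
looks_like_gzip(const unsigned char *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	return len >= 2 && buf[0] == 0x1f && buf[1] == 0x8b;
}
```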

> I have a difficult time understanding why gzip has to sneak it's way
> into everything.
> 
> I always prefer software that does precisely what I expect it to do.

Certainly.

I have no strong opinion whether a webserver qualifies as "everything",
though, nor did i look at the diff.  While all manpages are small in the
real world, some web servers may have to store huge amounts of data that
clients might request, so disk space might occasionally be an issue on
a web server even in 2021.  Also, some websites deliver huge amounts of
data to the client even when the user merely asked for some text (not sure
such sites would consider running OpenBSD httpd(8), but whatever :) - when
browsing the web, bandwidth is still occasionally an issue even in 2021,
even though that is a rather absurd fact.

Yours,
  Ingo



Re: [patch] httpd static gzip compression

2021-11-04 Thread Maxim
Stuart Henderson, 2021-11-04 14:44:44 +:
> On 2021/11/04 08:27, Theo de Raadt wrote:
> 
> > Does any other program work this way?
> > 
> > Where you request one filename, and it gives you another?
> 
> Some of the webservers do, for language selection etc. Sometimes it's
> useful. Fortunately there are various options for more fully featured
> web servers if people need that, if I understand correctly the whole
> point of httpd was that it doesn't have many features.

For reference: it's called content negotiation. Apache httpd does it
(when configured) via request headers Accept, Accept-Language,
Accept-Charset, and Accept-Encoding. It appears to me, it does it in a
"static" manner too: it picks a file from a disk that fits a request
best. (See "Note on hyperlinks and naming conventions" at [1] for
examples.)

In my view, the proposed patch is a selective implementation of content
negotiation limited to the Accept-Encoding header only. Perhaps, the
conversation should be on whether httpd wants to have content
negotiation, to what extent, and in what form.

[1] http://httpd.apache.org/docs/2.4/content-negotiation.html



Re: [patch] httpd static gzip compression

2021-11-04 Thread Sebastian Benoit
Theo de Raadt(dera...@openbsd.org) on 2021.11.04 08:53:13 -0600:
> Stuart Henderson  wrote:
> 
> > In some ways it would be better if it *did* compress on the fly, as then
> > you don't have so much to consider with the effect on block/match rules,
> > whether a request is passed to a fastcgi handler, etc. (But of course
> > then you have CPU use issues).
> 
> I don't want my webservers to perform unexpected compute.  Sending extra
> packets is cheaper than doing the compute. 

Exactly.

Other webservers can do this too. In apache you do it with rewrite rules, in
nginx there is the gzip_static option.

As to your question:

> Where you request one filename, and it gives you another?

With most web applications it is common that paths are rewritten. You don't
get the file at the path you request.
 
> > Not sure if it's still actually needed, but most web servers with gzip
> > support usually have a way to disable it per user-agent due to problems
> > that have occurred.

This was needed for old Internet Explorer versions, and when i was still in
this business, we stopped using such configurations about 6 years ago. I
don't think we have to care about that anymore.

All currently used browsers support compression _when they ask for it_

> I was not talking about other webservers.  I was talking about any other
> program going, "OH i see you have a .gz file, I cannot actually confirm it
> is a gzip of the non-gzip file, but here you go, here is the thing you
> didn't ask for".

Changing the content of what is served by a webserver depending not only on
the path but also on other headers (such as Accept-Language etc) has been a
feature of HTTP for ages.

It's the job of the administrator setting things up to make sure that the
content served is correct. I don't see a problem with that: the admin needs
to make sure that the correct files are in a directory, independent of what
type of file they are.

However, i think the feature needs to be optional, on a per directory basis.

If the patch is extended to be settable per directory, i'm willing to review
it further.



Re: [patch] httpd static gzip compression

2021-11-04 Thread Chris Cappuccio
Solene Rapenne [sol...@perso.pw] wrote:
> On jeudi 4 novembre 2021 15:09:39 CET, Stuart Henderson wrote:
> > 
> > btw this was rejected before,
> > 
> > https://github.com/reyk/httpd/issues/21
> 
> It's not clear if "static" compression is rejected. Sure, on-the-fly
> compilation is complicated and bring issues, static compression
> is easy to implement and predictible IMO.

In my opinion, this feature makes sense if it can be activated by the
'location' ...

It requires explicit preparation by the site operator, so it should
only be activated on demand, per-directory

It makes sense to me in this context

If someone explicitly requests the .gz version, they get it, regardless
of this setting

If someone requests the non-gz version, their browser should only
get the gz if it agrees to transparently handle the compression

And the gz swap only gets activated if the site operator tells
httpd this is the desired behavior for a particular directory tree through
the location keyword...
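
The three rules above can be written down as a single decision function.
This is a sketch of the described behaviour, not code from the patch; the
enum and the function choose_action() are hypothetical, and the three flags
are assumed to have been computed elsewhere (location setting, request
header check, existence of the .gz file).

```c
#include <assert.h>
#include <string.h>

enum action { SERVE_AS_REQUESTED, SERVE_GZ_ENCODED };

/* Return 1 if the requested path itself ends in ".gz". */
static int
has_gz_suffix(const char *path)
{
	size_t n = strlen(path);

	return n >= 3 && strcmp(path + n - 3, ".gz") == 0;
}

enum action
choose_action(const char *path, int location_gzip_static,
    int client_accepts_gzip, int gz_exists)
{
	assert(path != NULL);
	/* an explicit request for the .gz file is served as-is */
	if (has_gz_suffix(path))
		return SERVE_AS_REQUESTED;
	/* swap only if the operator enabled it and the client agreed */
	if (location_gzip_static && client_accepts_gzip && gz_exists)
		return SERVE_GZ_ENCODED;
	return SERVE_AS_REQUESTED;
}
```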

Chris



Re: [patch] httpd static gzip compression

2021-11-04 Thread Florian Obser
On 2021-11-04 08:53 -06, "Theo de Raadt"  wrote:
> Oh, httpd was written for only that reason?  That's incorrect.

The other reason is need to know.

-- 
I'm not entirely sure you are real.



Re: [patch] httpd static gzip compression

2021-11-04 Thread Theo de Raadt
Stuart Henderson  wrote:

> In some ways it would be better if it *did* compress on the fly, as then
> you don't have so much to consider with the effect on block/match rules,
> whether a request is passed to a fastcgi handler, etc. (But of course
> then you have CPU use issues).

I don't want my webservers to perform unexpected compute.  Sending extra
packets is cheaper than doing the compute. 

> Not sure if it's still actually needed, but most web servers with gzip
> support usually have a way to disable it per user-agent due to problems
> that have occurred.

I was not talking about other webservers.  I was talking about any other
program going, "OH i see you have a .gz file, I cannot actually confirm it
is a gzip of the non-gzip file, but here you go, here is the thing you
didn't ask for".

Nothing ever went wrong with that approach, right?


> Some of the webservers do, for language selection etc. Sometimes it's
> useful. Fortunately there are various options for more fully featured
> web servers if people need that, if I understand correctly the whole
> point of httpd was that it doesn't have many features.

Oh, httpd was written for only that reason?  That's incorrect.



Re: [patch] httpd static gzip compression

2021-11-04 Thread Stuart Henderson
On 2021/11/04 08:27, Theo de Raadt wrote:
> prx  wrote:
> 
> > * Stuart Henderson  le [04-11-2021 14:09:39 +]:
> > > On 2021/11/04 14:21, prx wrote:
> > > > Hello,
> > > > The attached patch add support for static gzip compression.
> > > > 
> > > > In other words, if a client support gzip compression, when "file" is
> > > > requested, httpd will check if "file.gz" is avaiable to serve.
> > > > 
> > > > Regards.
> > > > 
> > > > prx
> > > 
> > > btw this was rejected before,
> > > 
> > > https://github.com/reyk/httpd/issues/21
> > > 
> > 
> > This diff doesn't compress "on the fly".
> > It's up to the webmaster to compress files **before** serving them.

In some ways it would be better if it *did* compress on the fly, as then
you don't have so much to consider with the effect on block/match rules,
whether a request is passed to a fastcgi handler, etc. (But of course
then you have CPU use issues).

Not sure if it's still actually needed, but most web servers with gzip
support usually have a way to disable it per user-agent due to problems
that have occurred.

> Does any other program work this way?
> 
> Where you request one filename, and it gives you another?

Some of the webservers do, for language selection etc. Sometimes it's
useful. Fortunately there are various options for more fully featured
web servers if people need that, if I understand correctly the whole
point of httpd was that it doesn't have many features.

> I have a difficult time understanding why gzip has to sneak it's way
> into everything.
> 
> I always prefer software that does precisely what I expect it to do.
> 



Re: [patch] httpd static gzip compression

2021-11-04 Thread Theo de Raadt
prx  wrote:

> * Stuart Henderson  le [04-11-2021 14:09:39 +]:
> > On 2021/11/04 14:21, prx wrote:
> > > Hello,
> > > The attached patch add support for static gzip compression.
> > > 
> > > In other words, if a client support gzip compression, when "file" is
> > > requested, httpd will check if "file.gz" is avaiable to serve.
> > > 
> > > Regards.
> > > 
> > > prx
> > 
> > btw this was rejected before,
> > 
> > https://github.com/reyk/httpd/issues/21
> > 
> 
> This diff doesn't compress "on the fly".
> It's up to the webmaster to compress files **before** serving them.

Does any other program work this way?

Where you request one filename, and it gives you another?

I have a difficult time understanding why gzip has to sneak its way
into everything.

I always prefer software that does precisely what I expect it to do.



Re: [patch] httpd static gzip compression

2021-11-04 Thread prx
* Stuart Henderson  le [04-11-2021 14:09:39 +]:
> On 2021/11/04 14:21, prx wrote:
> > Hello,
> > The attached patch add support for static gzip compression.
> > 
> > In other words, if a client support gzip compression, when "file" is
> > requested, httpd will check if "file.gz" is avaiable to serve.
> > 
> > Regards.
> > 
> > prx
> 
> btw this was rejected before,
> 
> https://github.com/reyk/httpd/issues/21
> 

This diff doesn't compress "on the fly".
It's up to the webmaster to compress files **before** serving them.



Re: [patch] httpd static gzip compression

2021-11-04 Thread Stuart Henderson
On 2021/11/04 14:21, prx wrote:
> Hello,
> The attached patch add support for static gzip compression.
> 
> In other words, if a client support gzip compression, when "file" is
> requested, httpd will check if "file.gz" is avaiable to serve.
> 
> Regards.
> 
> prx

btw this was rejected before,

https://github.com/reyk/httpd/issues/21



[patch] httpd static gzip compression

2021-11-04 Thread prx
Hello,
The attached patch adds support for static gzip compression.

In other words, if a client supports gzip compression, when "file" is
requested, httpd will check if "file.gz" is available to serve.

Regards.

prx
Index: httpd.conf.5
===
RCS file: /cvs/src/usr.sbin/httpd/httpd.conf.5,v
retrieving revision 1.119
diff -u -p -r1.119 httpd.conf.5
--- httpd.conf.5 24 Oct 2021 16:01:04 -  1.119
+++ httpd.conf.5 4 Nov 2021 13:13:58 -
@@ -425,6 +425,10 @@ A variable that is set to a comma separa
 features in use
 .Pq omitted when TLS client verification is not in use .
 .El
+.It Ic gzip_static
+Enable static gzip compression.
+.Pp
+When a file is requested, serves the file with .gz added to its path if it exists.
 .It Ic hsts Oo Ar option Oc
 Enable HTTP Strict Transport Security.
 Valid options are:
Index: httpd.h
===
RCS file: /cvs/src/usr.sbin/httpd/httpd.h,v
retrieving revision 1.158
diff -u -p -r1.158 httpd.h
--- httpd.h 24 Oct 2021 16:01:04 -  1.158
+++ httpd.h 4 Nov 2021 13:13:58 -
@@ -87,6 +87,7 @@
 #define SERVER_DEF_TLS_LIFETIME (2 * 3600)
 #define SERVER_MIN_TLS_LIFETIME (60)
 #define SERVER_MAX_TLS_LIFETIME (24 * 3600)
+#define SERVER_DEFAULT_GZIP_STATIC 0
 
 #define MEDIATYPE_NAMEMAX  128 /* file name extension */
 #define MEDIATYPE_TYPEMAX  64  /* length of type/subtype */
@@ -546,6 +547,7 @@ struct server_config {
struct server_fcgiparams fcgiparams;
int  fcgistrip;
char errdocroot[HTTPD_ERRDOCROOT_MAX];
+   int  gzip_static;
 
TAILQ_ENTRY(server_config) entry;
 };
Index: parse.y
===
RCS file: /cvs/src/usr.sbin/httpd/parse.y,v
retrieving revision 1.127
diff -u -p -r1.127 parse.y
--- parse.y 24 Oct 2021 16:01:04 -  1.127
+++ parse.y 4 Nov 2021 13:13:58 -
@@ -141,7 +141,7 @@ typedef struct {
 %token TIMEOUT TLS TYPE TYPES HSTS MAXAGE SUBDOMAINS DEFAULT PRELOAD REQUEST
 %token ERROR INCLUDE AUTHENTICATE WITH BLOCK DROP RETURN PASS REWRITE
 %token CA CLIENT CRL OPTIONAL PARAM FORWARDED FOUND NOT
-%token ERRDOCS
+%token ERRDOCS GZIPSTATIC
 %token   STRING
 %token   NUMBER
 %type  port
@@ -306,6 +306,8 @@ server  : SERVER optmatch STRING{
if (conf->sc_custom_errdocs)
s->srv_conf.flags |= SRVFLAG_ERRDOCS;
 
+   s->srv_conf.gzip_static = SERVER_DEFAULT_GZIP_STATIC;
+
if (last_server_id == INT_MAX) {
yyerror("too many servers defined");
free(s);
@@ -1180,6 +1182,9 @@ filter: block RETURN NUMBER optstring 
srv_conf->flags &= ~SRVFLAG_BLOCK;
srv_conf->flags |= SRVFLAG_NO_BLOCK;
}
+   | GZIPSTATIC{
+   srv_conf->gzip_static = 1;
+   }
;
 
 block  : BLOCK {
@@ -1441,6 +1446,7 @@ lookup(char *s)
{ "fastcgi",FCGI },
{ "forwarded",  FORWARDED },
{ "found",  FOUND },
+   { "gzip_static",GZIPSTATIC },
{ "hsts",   HSTS },
{ "include",INCLUDE },
{ "index",  INDEX },
Index: server_file.c
===
RCS file: /cvs/src/usr.sbin/httpd/server_file.c,v
retrieving revision 1.70
diff -u -p -r1.70 server_file.c
--- server_file.c   29 Apr 2021 18:23:07 -  1.70
+++ server_file.c   4 Nov 2021 13:13:58 -
@@ -229,20 +229,49 @@ server_file_request(struct httpd *env, s
goto abort;
}
 
+   media = media_find_config(env, srv_conf, path);
+
if ((ret = server_file_modified_since(clt->clt_descreq, st)) != -1) {
/* send the header without a body */
-   media = media_find_config(env, srv_conf, path);
if ((ret = server_response_http(clt, ret, media, -1,
MINIMUM(time(NULL), st->st_mtim.tv_sec))) == -1)
goto fail;
goto done;
}
 
+   /* change path to path.gz if necessary. */
+   if (srv_conf->gzip_static) {
+   struct http_descriptor  *req = clt->clt_descreq;
+   struct http_descriptor  *resp = clt->clt_descresp;
+   struct stat gzst;
+   struct kv   *r, key;
+   char gzpath[PATH_MAX];
+
+   /* check Accept-Encoding header */
+