On Wed, Aug 10, 2022 at 09:29:43PM +0200, Thomas Oltmann wrote: > I think we can all agree that the current web archive over at > lists.suckless.org isn't all that great; > Author names get mangled, the navigation is terrible, some messages > are duplicated, some missing.
I've noticed the missing mails too. > Is there currently any interest in such a project here? If it'd be an improvement over the current system then I don't see why not. > So far, I've gone ahead and implemented a sort of proof-of-concept (at > https://www.github.com/tomolt/mailarchiver). Hmm, interesting source code. A couple observations: 0. `.POSIX` needs to be first non-comment line in the Makefile 1. L277: pointer arithmetic is only valid as long as the result is within the array or just 1 past it. 2. L36: `mail` should be declared `static` as it's not used outside of the TU. Usage of memcpy for string copying is good to see. I think more C programmers should start thinking of strings as buffers and tracking their length as necessary. Which can both improve efficiency and reduce chances of buffer mishandling. But in the case of `encode_html()`, stpcpy is probably the proper function to use. Anyways, I've attached patches for all the above. The stpcpy change is opinionated, so feel free to reject that. And one more thing: /* TODO we should probably handle EINTR and partial reads */ Best thing to do here is not using `read()` to begin with. Instead use `mmap()` to map the file into a private buffer. Otherwise I think using `fread` is also an (inferior) option, don't think you need to worry about EINTR with fread. - NRK
>From 2e16b5e88e2a9394c1b331d81994186151474391 Mon Sep 17 00:00:00 2001 From: NRK <[email protected]> Date: Thu, 11 Aug 2022 14:13:55 +0600 Subject: [PATCH 1/4] .POSIX must be the first non-comment line MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit from `make 1p make`: > The application shall ensure that this special target is > specified without prerequisites or commands. If it appears > as the first non-comment line in the makefile, make shall > process the makefile as specified by this section; other‐ > wise, the behavior of make is unspecified. --- Makefile | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Makefile b/Makefile index bd4c260..851d368 100644 --- a/Makefile +++ b/Makefile @@ -1,10 +1,10 @@ +.POSIX: + include config.mk SRC = mailarchiver.c OBJ = ${SRC:.c=.o} -.POSIX: - .PHONY: all clean install uninstall all: mailarchiver -- 2.35.1
>From 128a4d8ad26ffbd46f320747f408ad5a1edd0bbc Mon Sep 17 00:00:00 2001 From: NRK <[email protected]> Date: Thu, 11 Aug 2022 14:19:33 +0600 Subject: [PATCH 2/4] fix pointer arithmetic pointer arithmetic are only allowed to go 1 past the end of the array. going any further than that is UB. > If both the pointer operand and the result point to elements of the > same array object, or one past the last element of the array object, > the evaluation shall not produce an overflow; otherwise, the behavior > is undefined. ref: https://port70.net/~nsz/c/c11/n1570.html#6.5.6p8 --- mailarchiver.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mailarchiver.c b/mailarchiver.c index e2a37f5..9f6fbad 100644 --- a/mailarchiver.c +++ b/mailarchiver.c @@ -274,7 +274,7 @@ encode_html(int fd, char *str) case '\t': memcpy(w, " ", 24); w += 24; break; default: *w++ = *r; } - if (w + 24 > buf + sizeof buf) { + if (buf + sizeof buf - w <= 24) { write(fd, buf, w - buf); w = buf; } -- 2.35.1
>From 4e0bebc289d053c3fad8ea0f0fc401c0b18d3aef Mon Sep 17 00:00:00 2001 From: NRK <[email protected]> Date: Thu, 11 Aug 2022 14:21:22 +0600 Subject: [PATCH 3/4] use stpcpy instead of memcpy it's nice to see usage of memcpy but in this case stpcpy is the proper function to use. and optimizing compilers are able to optimize the function away [^0] due to the src being available at compile time, so there's no performance benefit towards using memcpy here either. [^0]: https://godbolt.org/z/r89q6oGre --- mailarchiver.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/mailarchiver.c b/mailarchiver.c index 9f6fbad..b05c7df 100644 --- a/mailarchiver.c +++ b/mailarchiver.c @@ -266,12 +266,12 @@ encode_html(int fd, char *str) for (r = str, w = buf; *r; r++) { switch (*r) { - case '<': memcpy(w, "<", 4); w += 4; break; - case '>': memcpy(w, ">", 4); w += 4; break; - case '&': memcpy(w, "&", 5); w += 5; break; - case '"': memcpy(w, """, 6); w += 6; break; - case '\n': memcpy(w, "\n<br/>\n", 7); w += 7; break; - case '\t': memcpy(w, " ", 24); w += 24; break; + case '<': w = stpcpy(w, "<"); break; + case '>': w = stpcpy(w, ">"); break; + case '&': w = stpcpy(w, "&"); break; + case '"': w = stpcpy(w, """); break; + case '\n': w = stpcpy(w, "\n<br/>\n"); break; + case '\t': w = stpcpy(w, " "); break; default: *w++ = *r; } if (buf + sizeof buf - w <= 24) { -- 2.35.1
>From d7076c67797309ebb9849d7267753371518c1756 Mon Sep 17 00:00:00 2001 From: NRK <[email protected]> Date: Thu, 11 Aug 2022 14:50:09 +0600 Subject: [PATCH 4/4] make `mail` static it's not used outside of the translation unit --- mailarchiver.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mailarchiver.c b/mailarchiver.c index b05c7df..529cb34 100644 --- a/mailarchiver.c +++ b/mailarchiver.c @@ -33,7 +33,7 @@ struct mail { char *content; }; -struct mail mail; +static struct mail mail; void die(const char *format, ...) -- 2.35.1
