Bad news everyone,
gyes from GNU coreutils in ports seems to top out at ~185 MiB/s when
piped through pv to /dev/null on my otherwise idle laptop (measured
with sysutils/pv from ports):
$ doas nice -n -20 gyes | pv > /dev/null
921MiB 0:00:05 [ 185MiB/s] [ <=> ]
1.80GiB 0:00:10 [ 185MiB/s] [ <=> ]
3.07GiB 0:00:17 [ 186MiB/s] [ <=> ]
4.34GiB 0:00:24 [ 185MiB/s] [ <=> ]
5.06GiB 0:00:28 [ 185MiB/s] [ <=> ]
Under the same conditions, our yes(1) tops out at a mere ~20 MiB/s:
$ doas nice -n -20 yes | pv > /dev/null
206MiB 0:00:10 [20.7MiB/s] [ <=> ]
414MiB 0:00:20 [20.7MiB/s] [ <=> ]
641MiB 0:00:31 [20.7MiB/s] [ <=> ]
828MiB 0:00:40 [20.7MiB/s] [ <=> ]
1014MiB 0:00:49 [20.7MiB/s] [ <=> ]
Not great. Not great at all.
Attached is a patch to improve our yes(1) throughput and perhaps
restore glory to src/usr.bin. Basically we bypass stdio and write(2)
up to a page of the expletive all at once. Or, if the expletive is
too long to fit in a page at least twice, we just write(2) it directly.
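
For illustration, here is a minimal standalone sketch of the two ideas
described above: tiling a fixed-size buffer with whole copies of a line,
and pushing a buffer out with write(2) while coping with short writes.
The helper names tile_pattern() and write_all() are hypothetical and do
not appear in the patch, which does the same work inline in main().

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the patch: fill buf (capacity cap)
 * with as many whole copies of the patlen-byte pattern as will fit.
 * Returns the number of bytes filled (a multiple of patlen), or 0 if
 * the pattern is empty or longer than the buffer.
 */
static size_t
tile_pattern(char *buf, size_t cap, const char *pat, size_t patlen)
{
	size_t filled, off;

	if (patlen == 0 || patlen > cap)
		return 0;
	filled = cap / patlen * patlen;	/* round down to whole copies */
	for (off = 0; off < filled; off += patlen)
		memcpy(buf + off, pat, patlen);
	return filled;
}

/*
 * Hypothetical helper, not part of the patch: write all len bytes of
 * buf to fd, resuming after short writes.  Returns 0 on success and
 * -1 if write(2) fails or makes no progress.
 */
static int
write_all(int fd, const char *buf, size_t len)
{
	size_t off;
	ssize_t nw;

	for (off = 0; off < len; off += (size_t)nw) {
		nw = write(fd, buf + off, len - off);
		if (nw == 0 || nw == -1)
			return -1;
	}
	return 0;
}
```

A caller would tile a page-sized buffer once, e.g.
tile_pattern(buf, pagesize, "y\n", 2), and then call
write_all(STDOUT_FILENO, buf, filled) in an endless loop.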
With the enclosed patch, OpenBSD yes(1) now tops out at ~211 MiB/s
under the aforementioned conditions:
$ doas nice -n -20 yes | pv > /dev/null
1.04GiB 0:00:05 [ 211MiB/s] [ <=> ]
2.07GiB 0:00:10 [ 212MiB/s] [ <=> ]
3.10GiB 0:00:15 [ 211MiB/s] [ <=> ]
4.14GiB 0:00:20 [ 211MiB/s] [ <=> ]
5.18GiB 0:00:25 [ 211MiB/s] [ <=> ]
It's possible sysutils/pv itself is a bottleneck here, but I think a
roughly tenfold throughput improvement (20.7 -> 211 MiB/s) is probably
pretty robust. Also, there may be a better buffer size, but this seems
like a good enough place to start.
--
No, no, I'm not being serious. Sorry. :)
The throughput improvement with such a small code change is
interesting though.
Index: yes.c
===================================================================
RCS file: /cvs/src/usr.bin/yes/yes.c,v
retrieving revision 1.9
diff -u -p -r1.9 yes.c
--- yes.c 13 Oct 2015 07:03:26 -0000 1.9
+++ yes.c 18 Jun 2021 06:23:02 -0000
@@ -32,18 +32,55 @@
 
 #include <err.h>
 #include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
 #include <unistd.h>
 
 int
 main(int argc, char *argv[])
 {
+	char *buf, *expletive, *tmp;
+	size_t buflen, exlen, off, pagesize;
+	ssize_t nw;
+
 	if (pledge("stdio", NULL) == -1)
 		err(1, "pledge");
 
-	if (argc > 1)
-		for (;;)
-			puts(argv[1]);
-	else
-		for (;;)
-			puts("y");
+	if (argc == 1) {
+		expletive = "y\n";
+		exlen = 2;
+	} else {
+		expletive = argv[1];
+		exlen = strlen(expletive);
+		expletive[exlen] = '\n';	/* overwrite NUL with NL */
+		exlen += 1;
+	}
+
+	/*
+	 * If possible, pack a page-sized buffer with as many copies of
+	 * the expletive as we can fit.  Batching multiple lines into
+	 * each write(2) improves throughput.
+	 */
+	pagesize = getpagesize();
+	if (exlen <= pagesize / 2) {
+		buflen = pagesize / exlen * exlen;
+		buf = malloc(buflen);
+		if (buf == NULL)
+			err(1, NULL);
+		for (tmp = buf; tmp < buf + buflen; tmp += exlen)
+			memcpy(tmp, expletive, exlen);
+	} else {
+		buf = expletive;
+		buflen = exlen;
+	}
+
+	for (;;) {
+		for (off = 0; off < buflen; off += nw) {
+			nw = write(STDOUT_FILENO, buf + off, buflen - off);
+			if (nw == 0 || nw == -1)
+				err(1, "write");
+		}
+	}
+
+	return 1;
 }