Hi there! TL;DR: a simple test program appears to show that Go's (*os.File).Write is 10x slower than C's fputs (on macOS).

While doing some benchmarking for my Lua implementation in Go [1], I found very large differences between C Lua and golua on benchmarks that write a lot of output to stdout. Using pprof, I found that my implementation spends much of its time in syscall. I couldn't see an obvious reason why, so I decided to make a minimal example. It is a program that writes the string "Hello There" ten million times to stdout:

-------- test.go --------
package main

import "os"

func main() {
	hello := []byte("Hello There\n") // To make it fairer
	for i := 0; i < 10000000; i++ {
		os.Stdout.Write(hello)
	}
}
-------- /test.go --------

For comparison, here is what I think is the equivalent in C:

-------- test.c --------
#include <stdio.h>

int main() {
    for (int i = 0; i < 10000000; i++) {
        fputs("Hello There\n", stdout);
    }
    return 0;
}
-------- /test.c --------

I compared them with multitime [2], using both Go 1.15.6 and the beta1 release of Go 1.16, following the steps below (I am using gvm to select different Go versions).

- Compile the Go version using Go 1.15 and Go 1.16, and the C version using clang.

    $ gvm use go1.16beta1
    Now using version go1.16beta1
    $ go version && go build -o test-go1.16 test.go
    go version go1.16beta1 darwin/amd64
    $ gvm use go1.15.6
    Now using version go1.15.6
    $ go version && go build -o test-go1.15 test.go
    go version go1.15.6 darwin/amd64
    $ clang -o test-c test.c

- Check that the C version and the Go version output the same amount of data to stdout (10,000,000 lines of 12 bytes each):

    $ ./test-c | wc -c
    120000000
    $ ./test-go1.15 | wc -c
    120000000

- Run each executable 5 times:

    $ cat >cmds <<EOF
    > -q ./test-c
    > -q ./test-go1.15
    > -q ./test-go1.16
    > EOF
    $ multitime -b cmds -n 5
    ===> multitime results
    1: -q ./test-c
                Mean        Std.Dev.    Min         Median      Max
    real        0.524       0.070       0.476       0.492       0.662
    user        0.475       0.011       0.465       0.472       0.495
    sys         0.011       0.002       0.009       0.011       0.014

    2: -q ./test-go1.15
                Mean        Std.Dev.    Min         Median      Max
    real        5.986       0.125       5.861       5.947       6.186
    user        3.717       0.040       3.677       3.715       3.788
    sys         2.262       0.034       2.221       2.260       2.314

    3: -q ./test-go1.16
                Mean        Std.Dev.    Min         Median      Max
    real        5.958       0.160       5.781       5.941       6.213
    user        3.706       0.094       3.624       3.638       3.855
    sys         2.258       0.069       2.200       2.215       2.373

There is no significant difference between 1.15 and 1.16, but both are more than 10 times slower than the C version (about 6.0 s versus 0.5 s mean real time). Why is that? Is there something I can do to overcome this performance penalty? Any insights would be appreciated.

FWIW, I am running these on macOS Catalina:

    $ uname -v
    Darwin Kernel Version 19.6.0: Thu Oct 29 22:56:45 PDT 2020; root:xnu-6153.141.2.2~1/RELEASE_X86_64

(Sorry, I haven't got easy access to a Linux box to run this on.)

-- 
Arnaud Delobelle

[1] https://github.com/arnodel/golua
[2] https://tratt.net/laurie/src/multitime/multitime.1.html